Detect Speech In Noise Patents (Class 704/233)
  • Patent number: 11289109
    Abstract: Embodiments of the disclosure provide systems and methods for audio signal processing. An exemplary system may include a communication interface configured to receive a first audio signal acquired from an audio source through a first channel, and a second audio signal acquired from the same audio source through a second channel. The system may also include at least one processor coupled to the communication interface. The at least one processor may be configured to determine channel features based on the first audio signal and the second audio signal individually and determine a cross-channel feature based on the first audio signal and the second audio signal collectively. The at least one processor may further be configured to concatenate the channel features and the cross-channel feature and estimate spectral-spatial masks for the first channel and the second channel using the concatenated channel features and the cross-channel feature.
    Type: Grant
    Filed: April 24, 2020
    Date of Patent: March 29, 2022
    Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Chengyun Deng, Hui Song, Yi Zhang, Yongtao Sha
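
As a rough illustration of the two-channel pipeline described in US 11289109 above, the sketch below computes per-channel log-magnitude features, a cross-channel phase-difference feature, concatenates them, and maps the result to per-channel masks. The random projection stands in for the trained estimator, which the abstract does not specify; all function names, window sizes, and weights here are hypothetical.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Naive STFT returning a (frames x bins) complex array."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

def estimate_masks(x1, x2):
    """Concatenate per-channel and cross-channel features, then map them to
    per-channel spectral-spatial masks with a placeholder (untrained) projection."""
    S1, S2 = stft(x1), stft(x2)
    ch1 = np.log1p(np.abs(S1))            # channel feature, first channel
    ch2 = np.log1p(np.abs(S2))            # channel feature, second channel
    cross = np.angle(S1 * np.conj(S2))    # cross-channel feature: inter-channel phase difference
    feats = np.concatenate([ch1, ch2, cross], axis=1)

    n_bins = ch1.shape[1]
    rng = np.random.default_rng(0)
    W = 0.01 * rng.standard_normal((feats.shape[1], 2 * n_bins))  # stand-in for the trained estimator
    masks = 1.0 / (1.0 + np.exp(-feats @ W))                      # sigmoid keeps masks in (0, 1)
    return masks[:, :n_bins], masks[:, n_bins:]                   # masks for channels 1 and 2

fs = 16000
t = np.arange(fs) / fs
x1 = np.sin(2 * np.pi * 440 * t) + 0.1 * np.random.randn(fs)
x2 = np.roll(x1, 3)                       # crude stand-in for a second microphone
m1, m2 = estimate_masks(x1, x2)
print(m1.shape, m2.shape)
```
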
  • Patent number: 11282528
    Abstract: One embodiment provides a method, including: receiving, at an information handling device, user input comprising a potential wake word; determining, using a processor, whether the potential wake word is associated with a stored wake word; and responsive to determining that the potential wake word is associated with the stored wake word, activating, based on the potential wake word, a digital assistant associated with the information handling device. Other aspects are described and claimed.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: March 22, 2022
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Ryan Charles Knudson, Russell Speight VanBlon, Roderick Echols, Jonathan Gaither Knox
  • Patent number: 11270696
    Abstract: An audio device with at least one microphone adapted to receive sound from a sound field and create an output, and a processing system that is responsive to the output of the microphone. The processing system is configured to use a signal processing algorithm to detect a wakeup word, and modify the signal processing algorithm that is used to detect the wakeup word if the sound field changes.
    Type: Grant
    Filed: July 1, 2019
    Date of Patent: March 8, 2022
    Assignee: Bose Corporation
    Inventors: Ricardo Carreras, Alaganandan Ganeshkumar
  • Patent number: 11257497
    Abstract: The present disclosure provides a voice wake-up processing method, an apparatus and a storage medium. After acquiring voice wake-up signals collected by audio input devices in at least two audio zones, an electronic device may correct a to-be-woken-up audio zone identified by a voice engine, based on to-be-woken-up audio zones obtained from the amplitudes of the voice wake-up signals collected by the audio input devices in the at least two audio zones. This avoids waking up every audio zone whose audio input device collected a voice wake-up signal produced by the same user, and thereby improves the accuracy of the voice wake-up result obtained by the electronic device. The present disclosure can therefore solve the technical problem that a vehicle-mounted terminal has low voice wake-up accuracy due to an insufficient degree of sound isolation between its audio zones.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: February 22, 2022
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Hanying Peng, Nengjun Ouyang
  • Patent number: 11257512
    Abstract: Systems and methods include a first voice activity detector operable to detect speech in a frame of a multichannel audio input signal and output a speech determination, a constrained minimum variance adaptive filter operable to receive the multichannel audio input signal and the speech determination and minimize a signal variance at the output of the filter, thereby producing an equalized target speech signal, a mask estimator operable to receive the equalized target speech signal and the speech determination and generate a spectral-temporal mask to discriminate target speech from noise and interference speech, and a second voice activity detector operable to detect voice in a frame of the speech-discriminated signal. The systems and methods also include an audio input sensor array including a plurality of microphones, each microphone generating a channel of the multichannel audio input signal, and a sub-band analysis module operable to decompose each of the channels into a plurality of frequency sub-bands.
    Type: Grant
    Filed: January 6, 2020
    Date of Patent: February 22, 2022
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Francesco Nesta, Alireza Masnadi-Shirazi
  • Patent number: 11257485
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
    Type: Grant
    Filed: December 10, 2019
    Date of Patent: February 22, 2022
    Assignee: Google LLC
    Inventors: Bo Li, Ron J. Weiss, Michiel A. U. Bacchiani, Tara N. Sainath, Kevin William Wilson
  • Patent number: 11257487
    Abstract: Techniques are described herein for enabling the use of "dynamic" or "context-specific" hot words to invoke an automated assistant. In various implementations, an automated assistant may be executed in a default listening state at least in part on a user's computing device(s). While in the default listening state, audio data captured by microphone(s) may be monitored for default hot words. Detection of the default hot word(s) transitions the automated assistant into a speech recognition state. Sensor signal(s) generated by hardware sensor(s) integral with the computing device(s) may be detected and analyzed to determine an attribute of the user. Based on the analysis, the automated assistant may transition into an enhanced listening state in which the audio data may be monitored for enhanced hot word(s). Detection of enhanced hot word(s) triggers the automated assistant to perform a responsive action without requiring detection of default hot word(s).
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: February 22, 2022
    Assignee: GOOGLE LLC
    Inventor: Diego Melendo Casado
  • Patent number: 11250877
    Abstract: A method for generating a health indicator for at least one person of a group of people, the method comprising: receiving, at a processor, captured sound, where the captured sound is sound captured from the group of people; comparing the captured sound to a plurality of sound models to detect at least one non-speech sound event in the captured sound, each of the plurality of sound models associated with a respective health-related sound type; determining metadata associated with the at least one non-speech sound event; assigning the at least one non-speech sound event and the metadata to at least one person of the group of people; and outputting a message identifying the at least one non-speech event and the metadata to a health indicator generator module to generate a health indicator for the at least one person to whom the at least one non-speech sound event is assigned.
    Type: Grant
    Filed: July 25, 2019
    Date of Patent: February 15, 2022
    Assignee: AUDIO ANALYTIC LTD
    Inventors: Christopher Mitchell, Joe Patrick Lynas, Sacha Krstulovic, Arnoldas Jasonas, Julian Harris
  • Patent number: 11250038
    Abstract: An interactive question and answer (Q&A) service provides pairs of questions and corresponding answers related to the content of a web page. The service includes pre-configured Q&A pairs derived from a deep learning framework that includes a series of neural networks trained through joint and transfer learning to generate questions for a given text passage. In addition, pre-configured Q&A pairs are generated from historical web access patterns and sources related to the content of the web page.
    Type: Grant
    Filed: August 13, 2018
    Date of Patent: February 15, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.
    Inventors: Payal Bajaj, Gearard Boland, Anshul Gupta, Matthew Glenn Jin, Eduardo Enrique Noriega De Armas, Jason Shaver, Neelakantan Sundaresan, Roshanak Zilouchian Moghaddam
  • Patent number: 11240609
    Abstract: An audio device that includes a music classifier that determines when music is present in an audio signal is disclosed. The audio device is configured to receive audio, process the received audio, and output the processed audio to a user. The processing may be adjusted based on the output of the music classifier. The music classifier utilizes a plurality of decision making units, each operating on the received audio independently. The decision making units are simplified to reduce the processing, and therefore the power, necessary for operation. Accordingly, each decision making unit may be insufficient to detect music on its own, but in combination the units may accurately detect music while consuming power at a rate suitable for a mobile device, such as a hearing aid.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: February 1, 2022
    Assignee: SEMICONDUCTOR COMPONENTS INDUSTRIES, LLC
    Inventors: Pejman Dehghani, Robert L. Brennan
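
The combination of deliberately weak decision making units described in US 11240609 can be sketched as a majority vote over a few cheap per-frame indicators. The specific features (spectral flux, energy stability, spectral flatness) and the thresholds below are illustrative choices of mine, not the patent's.

```python
import numpy as np

def frame_spectra(x, n_fft=512, hop=256):
    win = np.hanning(n_fft)
    return np.array([np.abs(np.fft.rfft(x[i:i + n_fft] * win))
                     for i in range(0, len(x) - n_fft, hop)])

# Three deliberately simple decision making units; each alone is a weak music
# indicator, and the thresholds are purely illustrative.
def unit_low_flux(spec):
    flux = np.mean(np.abs(np.diff(spec, axis=0)))            # music evolves smoothly frame to frame
    return flux < 0.5

def unit_stable_energy(spec):
    energy = spec.sum(axis=1)
    return np.std(energy) / (np.mean(energy) + 1e-9) < 0.6   # steadier level than speech

def unit_tonality(spec):
    flatness = np.exp(np.mean(np.log(spec + 1e-9), axis=1)) / (np.mean(spec, axis=1) + 1e-9)
    return np.median(flatness) < 0.3                          # tonal content gives low spectral flatness

def is_music(x):
    spec = frame_spectra(x)
    votes = [unit_low_flux(spec), unit_stable_energy(spec), unit_tonality(spec)]
    return sum(votes) >= 2                                    # majority vote over the weak units

fs = 16000
t = np.arange(2 * fs) / fs
chord = sum(np.sin(2 * np.pi * f * t) for f in (262, 330, 392))  # a sustained triad
print(is_music(chord))
```
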
  • Patent number: 11238883
    Abstract: A method and a system for dialogue enhancement of an audio signal, comprising receiving (step S1) the audio signal and a text content associated with dialogue occurring in the audio signal, generating (step S2) parameterized synthesized speech from the text content, and applying (step S3) dialogue enhancement to the audio signal based on the parameterized synthesized speech. With the invention, text captions, subtitles, or other forms of text content included in an audio stream can be used to significantly improve dialogue enhancement on the playback side.
    Type: Grant
    Filed: May 23, 2019
    Date of Patent: February 1, 2022
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Timothy Alan Port, Winston Chi Wai Ng, Mark William Gerrard
  • Patent number: 11222625
    Abstract: Systems and methods for training a control panel to recognize user defined and preprogrammed sound patterns are provided. Such systems and methods can include the control panel operating in a learning mode, receiving initial ambient audio from a region, and saving the initial ambient audio as an audio pattern in a memory device of the control panel. Such systems and methods can also include the control panel operating in an active mode, receiving subsequent ambient audio from the region, using an audio classification model to make an initial determination as to whether the subsequent ambient audio matches or is otherwise consistent with the audio pattern, determining whether the initial determination is correct, and when the control panel determines that the initial determination is incorrect, modifying or updating the audio classification model for improving the accuracy in detecting future consistency with the audio pattern.
    Type: Grant
    Filed: April 15, 2019
    Date of Patent: January 11, 2022
    Assignee: Ademco Inc.
    Inventors: Pradyumna Sampath, Ramprasad Yelchuru, Purnaprajna R. Mangsuli
  • Patent number: 11222624
    Abstract: A server may provide a voice recognition service. The server may include a memory configured for storing a plurality of voice recognition models, a communication device configured for communicating a plurality of voice recognition devices, and an artificial intelligence device configured for providing a voice recognition service to the plurality of voice recognition devices, acquiring use-related information regarding a first voice recognition device (from among the plurality of voice recognition devices), and changing a voice recognition model corresponding to the first voice recognition device from a first voice recognition model to a second voice recognition model based on the use-related information.
    Type: Grant
    Filed: August 20, 2019
    Date of Patent: January 11, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Jaehong Kim, Hangil Jeong
  • Patent number: 11222654
    Abstract: A method for voice detection, the method may include (a) generating an in-ear signal that represents a signal sensed by an in-ear microphone and fed to a feedback active noise cancellation (ANC) circuit; (b) generating at least one additional signal, based on at least one out of a playback signal and a pickup signal sensed by a voice pickup microphone; and (c) generating a voice indicator based on the in-ear signal and the at least one additional signal.
    Type: Grant
    Filed: January 13, 2020
    Date of Patent: January 11, 2022
    Assignee: DSP GROUP LTD.
    Inventors: Assaf Ganor, Ori Elyada
  • Patent number: 11205411
    Abstract: A method for processing audio signals includes: audio signals emitted respectively from at least two sound sources are acquired through at least two microphones to obtain respective original noisy signals of the at least two microphones; sound source separation is performed on the respective original noisy signals of the at least two microphones to obtain respective time-frequency estimated signals of the at least two sound sources; a mask value of the time-frequency estimated signal of each sound source in the original noisy signal of each microphone is determined based on the respective time-frequency estimated signals; the respective time-frequency estimated signals of the at least two sound sources are updated based on the respective original noisy signals of the at least two microphones and the mask values; and the audio signals emitted respectively from the at least two sound sources are determined.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: December 21, 2021
    Assignee: Beijing Xiaomi Intelligent Technology Co., Ltd.
    Inventor: Haining Hou
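
A minimal sketch of the mask-update step in US 11205411: given initial time-frequency estimates for each source, form ratio masks per time-frequency bin and reapply them to the raw microphone signals. The array shapes and the ratio-mask formulation are assumptions for illustration, not the patent's exact formulas.

```python
import numpy as np

def refine_sources(noisy_specs, est_specs, eps=1e-9):
    """noisy_specs: (n_mics, frames, bins) STFTs of the original noisy signals.
    est_specs:   (n_srcs, frames, bins) initial separated source estimates.
    Returns per-source/per-mic masks and updated source estimates."""
    power = np.abs(est_specs) ** 2
    ratio = power / (power.sum(axis=0, keepdims=True) + eps)       # each source's share per T-F bin
    n_mics = noisy_specs.shape[0]
    masks = np.broadcast_to(ratio[:, None], (ratio.shape[0], n_mics) + ratio.shape[1:])
    updated = masks * noisy_specs[None]                            # re-filter the raw mic signals
    return masks, updated

rng = np.random.default_rng(1)
noisy = rng.standard_normal((2, 100, 257)) + 1j * rng.standard_normal((2, 100, 257))
est = rng.standard_normal((2, 100, 257)) + 1j * rng.standard_normal((2, 100, 257))
masks, updated = refine_sources(noisy, est)
print(masks.shape, updated.shape)   # (2, 2, 100, 257) each
```
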
  • Patent number: 11204736
    Abstract: The systems and methods described relate to the concept that smart devices can be used to 1) sense various types of phenomena like sound, blue light exposure, RF and microwave radiation, and 2) in real-time analyze, report and/or control outputs (e.g., displays or speakers). The systems are configurable and use standard computing devices, such as wearable electronics, tablet computers, and mobile phones to measure various frequency bands across multiple points, allowing a single user to visualize and/or adjust environmental conditions.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: December 21, 2021
    Assignee: ZOPHONOS INC.
    Inventor: Levaughn Denton
  • Patent number: 11189303
    Abstract: A multi-microphone algorithm for detecting and differentiating interference sources from desired talker speech in advanced audio processing for smart home applications is described. The approach is based on characterizing a persistent interference source when sounds repeatedly occur from a fixed spatial location relative to the device, which is itself also fixed. Some examples of such interference sources include TV, music system, air-conditioner, washing machine, and dishwasher. Real human talkers, in contrast, are not expected to remain stationary and speak continuously from the same position for a long time. The persistency of an acoustic source is established based on identifying historically-recurring inter-microphone frequency-dependent phase profiles in multiple time periods of the audio data. The detection algorithm can be used with a beamforming processor to suppress the interference and for achieving voice quality and automatic speech recognition rate improvements in smart home applications.
    Type: Grant
    Filed: September 25, 2017
    Date of Patent: November 30, 2021
    Assignee: Cirrus Logic, Inc.
    Inventors: Narayan Kovvali, Seth Suppappola
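
US 11189303 establishes persistency from historically recurring inter-microphone, frequency-dependent phase profiles. A simplified two-microphone sketch, using cosine similarity of unit-circle phase embeddings against a single averaged template rather than the patent's multi-period history, might look like this; the thresholds are illustrative assumptions.

```python
import numpy as np

def persistent_interference(S1, S2, sim_thresh=0.9, min_recurrence=0.8):
    """Flag a persistent (fixed-position) source: the inter-microphone phase
    profile keeps recurring across frames.  Profiles are embedded on the unit
    circle (cos/sin) to avoid phase wrap-around when comparing them."""
    prof = np.angle(S1 * np.conj(S2))                       # frequency-dependent phase profile per frame
    emb = np.concatenate([np.cos(prof), np.sin(prof)], axis=1)
    emb /= np.linalg.norm(emb, axis=1, keepdims=True) + 1e-9
    template = emb.mean(axis=0)
    template /= np.linalg.norm(template) + 1e-9
    similarity = emb @ template                             # cosine similarity to the averaged template
    return np.mean(similarity > sim_thresh) > min_recurrence

rng = np.random.default_rng(2)
frames, bins = 200, 257
fixed_phase = np.linspace(0, 4 * np.pi, bins)               # a source at one fixed position
S1 = rng.standard_normal((frames, bins)) + 1j * rng.standard_normal((frames, bins))
S2 = S1 * np.exp(-1j * fixed_phase)                          # second mic: same signal, fixed phase shift
print(persistent_interference(S1, S2))                       # True: the profile recurs in every frame
```
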
  • Patent number: 11188718
    Abstract: A collective emotional engagement detection arrangement is provided for determining emotions of users in group conversations. A computer-implemented method includes determining a first conversation velocity of communications through conversation channels over a first time period for a group discussion between user computers; determining that a conversation velocity of the communications has increased to a second conversation velocity of communications which exceeds a predetermined threshold, and has remained above the predetermined threshold for at least a second time period; determining aggregated emotions of the users during the second time period; and providing an output to a moderator of the group discussion indicating that the second conversation velocity of the communications has exceeded the predetermined threshold for at least the second time period, and indicating the aggregated emotions of the users during the second time period.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: November 30, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ilse M. Breedvelt-Schouten, John A. Lyons, Jana H. Jenkins, Jeffrey A. Kusnitz
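
The conversation-velocity logic in US 11188718 reduces to counting messages in a sliding window and checking that the rate stays above a threshold for a minimum duration. A toy sketch follows; the window length, threshold, and hold time are arbitrary illustrative values.

```python
import numpy as np

def conversation_velocity(timestamps, window_s=60.0):
    """Messages per minute inside a sliding window ending at each message time (seconds)."""
    t = np.asarray(timestamps, dtype=float)
    counts = np.array([np.sum((t > ti - window_s) & (t <= ti)) for ti in t])
    return counts * (60.0 / window_s)

def sustained_spike(timestamps, threshold=20.0, hold_s=120.0):
    """True once the velocity exceeds `threshold` and stays above it for `hold_s`."""
    t = np.asarray(timestamps, dtype=float)
    above = conversation_velocity(t) > threshold
    start = None
    for ti, hot in zip(t, above):
        if hot and start is None:
            start = ti
        elif not hot:
            start = None
        if start is not None and ti - start >= hold_s:
            return True
    return False

slow = np.arange(0, 600, 10.0)            # one message every 10 s
fast = 600 + np.arange(0, 300, 2.0)       # then one message every 2 s for 5 minutes
print(sustained_spike(np.concatenate([slow, fast])))   # True -> alert the moderator
```
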
  • Patent number: 11184244
    Abstract: The current document is directed to methods and systems that employ network metrics collected by distributed-computer-system metrics-collection services to determine a service-call-based topology for distributed service-oriented applications. In a described implementation, network metrics are collected over a number of network-metric monitoring periods. Independent component analysis is used to extract, from the collected network metrics, signals corresponding to sequences of service calls initiated by calls to the application-programming interface of a distributed service-oriented application. The signals, in combination with call traces obtained from a distributed-services call-tracing utility or service, are then used to construct representations of distributed-service-oriented-application topologies. The distributed-service-oriented-application topologies provide a basis for any additional types of distributed-computer-system functionalities, utilities, and facilities.
    Type: Grant
    Filed: February 19, 2020
    Date of Patent: November 23, 2021
    Assignee: VMware, Inc.
    Inventors: Susobhit Panigrahi, Reghuram Vasanthakumari, Arihant Jain
  • Patent number: 11164591
    Abstract: A speech enhancement method includes determining a first spectral subtraction parameter based on a power spectrum of a speech signal containing noise and a power spectrum of a noise signal, determining a second spectral subtraction parameter based on the first spectral subtraction parameter and a reference power spectrum, and performing, based on the power spectrum of the noise signal and the second spectral subtraction parameter, spectral subtraction on the speech signal containing noise, where the reference power spectrum includes a predicted user speech power spectrum and/or predicted environmental noise power. Regularity of a power spectrum feature of a user speech of a terminal device and/or regularity of a power spectrum feature of noise in an environment in which a user is located are considered.
    Type: Grant
    Filed: January 18, 2018
    Date of Patent: November 2, 2021
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Weixiang Hu, Lei Miao
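
The two-parameter spectral subtraction in US 11164591 can be approximated as: derive a first over-subtraction factor from the noisy and noise power spectra, then adjust it with a reference (predicted user-speech and/or environmental-noise) power spectrum before subtracting. The specific mappings below are illustrative assumptions, not the patent's formulas.

```python
import numpy as np

def spectral_subtract(noisy_power, noise_power, ref_power, base_alpha=2.0, floor=0.01):
    """Two-stage spectral subtraction sketch:
    1) a first subtraction parameter from the noisy- and noise-power spectra;
    2) a second parameter that refines it with a reference power spectrum."""
    snr_db = 10 * np.log10(np.maximum(noisy_power, 1e-12) / np.maximum(noise_power, 1e-12))
    alpha1 = np.clip(base_alpha - 0.05 * snr_db, 1.0, 5.0)     # first parameter: over-subtract at low SNR
    weight = ref_power / (ref_power + noise_power + 1e-12)      # trust the reference where it is strong
    alpha2 = alpha1 * (1.0 - 0.5 * weight)                      # second, reference-adjusted parameter
    clean = noisy_power - alpha2 * noise_power
    return np.maximum(clean, floor * noisy_power)               # spectral floor against musical noise

rng = np.random.default_rng(3)
noise = np.full(257, 0.5)
speech = 2.0 * np.abs(rng.standard_normal(257))
print(spectral_subtract(speech + noise, noise, ref_power=speech).mean())
```
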
  • Patent number: 11150869
    Abstract: Aspects of the present disclosure relate to voice command filtering. One or more directions of background noise for a location of a voice command device are determined. The one or more directions of background noise are stored as one or more blocked directions. A voice input is received at the location of the voice command device. A direction the voice input is being received from is determined and compared to the one or more blocked directions. The voice input is ignored in response to the direction of the voice input being received from corresponding to a direction of the one or more blocked directions, unless the received voice input is in a recognized voice.
    Type: Grant
    Filed: February 14, 2018
    Date of Patent: October 19, 2021
    Assignee: International Business Machines Corporation
    Inventors: Eunjin Lee, Daniel Cunnington, John J. Wood, Giacomo G. Chiarella
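
The blocked-direction check in US 11150869 is essentially an angular comparison with an override for recognized voices. A minimal sketch, with a hypothetical angular tolerance:

```python
def should_ignore(doa_deg, blocked_deg, voice_recognized, tol_deg=15.0):
    """Ignore a voice input arriving from a blocked direction, unless the
    voice itself is recognized as belonging to a known user."""
    if voice_recognized:
        return False
    for b in blocked_deg:
        diff = abs((doa_deg - b + 180.0) % 360.0 - 180.0)   # circular angle difference
        if diff <= tol_deg:
            return True
    return False

blocked = [90.0, 270.0]   # e.g. learned directions of a TV and a radio
print(should_ignore(92.0, blocked, voice_recognized=False))   # True  -> ignored as background noise
print(should_ignore(92.0, blocked, voice_recognized=True))    # False -> recognized voice overrides
print(should_ignore(10.0, blocked, voice_recognized=False))   # False -> not a blocked direction
```
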
  • Patent number: 11146298
    Abstract: A signal generator device includes a digital signal waveform generator to produce a digital signal waveform, a first frequency band signal path having a first frequency band filter to receive the digital signal waveform and to pass first frequency band components of the digital signal waveform, and a first digital-to-analog converter to receive the first frequency band components of the digital signal waveform and to produce a first frequency band analog signal, a second frequency band signal path having a second frequency band filter to receive the digital signal waveform and to pass second frequency band components of the digital signal waveform, a second digital-to-analog converter to receive the second frequency band components of the digital signal waveform and to produce a second frequency band analog signal, and a combining element to combine the first frequency band analog signal and the second frequency band analog signal to produce a wideband analog signal.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: October 12, 2021
    Assignee: Tektronix, Inc.
    Inventor: Gregory A. Martin
  • Patent number: 11146907
    Abstract: A system for identifying the contribution of a given sound source to a composite audio track, the system comprising an audio input unit operable to receive an input composite audio track comprising two or more sound sources, including the given sound source, an audio generation unit operable to generate, using a model of a sound source, an approximation of the contribution of the given sound source to the composite audio track, an audio comparison unit operable to compare the generated audio to at least a portion of the composite audio track to determine whether the generated audio provides an approximation of the composite audio track that meets a threshold degree of similarity, and an audio identification unit operable to identify, when the threshold is met, the generated audio as a suitable representation of the contribution of the sound source to the composite audio track.
    Type: Grant
    Filed: April 3, 2020
    Date of Patent: October 12, 2021
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Fabio Cappello, Oliver Hume
  • Patent number: 11138992
    Abstract: This application discloses a voice activity detection method. The method includes receiving speech data, the speech data including a multi-frame speech signal; determining energy and spectral entropy of a frame of speech signal; calculating a square root of the energy of the speech signal and/or calculating a square root of the spectral entropy of the frame of the speech signal; determining a spectral entropy-energy square root of the frame of the speech signal based on at least one of the square root of the energy and the square root of the spectral entropy; and determining that the frame of the speech signal is an unvoiced frame if the spectral entropy-energy square root of the speech signal is less than a first threshold, or that it is a voiced frame if the spectral entropy-energy square root of the speech signal is greater than or equal to the first threshold.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: October 5, 2021
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Jizhong Liu
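
A per-frame sketch of the "spectral entropy-energy square root" decision in US 11138992, here interpreted as the product of the energy square root and the spectral-entropy square root (the abstract allows either or both); the threshold value is illustrative.

```python
import numpy as np

def frame_decision(frame, threshold, n_fft=512):
    """Voiced/unvoiced decision from the spectral entropy-energy square root of one frame."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n_fft)) ** 2
    energy = np.sum(frame ** 2)
    p = spec / (np.sum(spec) + 1e-12)                  # normalized power spectrum
    entropy = -np.sum(p * np.log(p + 1e-12))           # spectral entropy
    score = np.sqrt(energy) * np.sqrt(entropy)         # spectral entropy-energy square root
    return "voiced" if score >= threshold else "unvoiced"

fs, n = 16000, 512
t = np.arange(n) / fs
speech_like = 0.5 * np.sin(2 * np.pi * 200 * t) * (1 + 0.5 * np.sin(2 * np.pi * 5 * t))
silence = 0.001 * np.random.randn(n)
print(frame_decision(speech_like, threshold=5.0))   # voiced
print(frame_decision(silence, threshold=5.0))       # unvoiced
```
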
  • Patent number: 11132998
    Abstract: A voice recognition device includes: a first feature vector calculating unit (2) for calculating a first feature vector from voice data input; an acoustic likelihood calculating unit (4) for calculating an acoustic likelihood of the first feature vector by using an acoustic model used for calculating an acoustic likelihood of a feature vector; a second feature vector calculating unit (3) for calculating a second feature vector from the voice data; a noise degree calculating unit (6) for calculating a noise degree of the second feature vector by using a discriminant model used for calculating a noise degree indicating whether a feature vector is noise or voice; a noise likelihood recalculating unit (8) for recalculating an acoustic likelihood of noise on the basis of the acoustic likelihood of the first feature vector and the noise degree of the second feature vector; and a collation unit (9) for performing collation with a pattern of a vocabulary word to be recognized, by using the calculated acoustic likelihood of the first feature vector and the recalculated acoustic likelihood of noise.
    Type: Grant
    Filed: March 24, 2017
    Date of Patent: September 28, 2021
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Toshiyuki Hanazawa, Tomohiro Narita
  • Patent number: 11122367
    Abstract: In a method for controlling an audio system of a vehicle, an intention to communicate and/or a voice of at least one of a specific occupant of the vehicle and/or of an occupant on a specific seat of the vehicle is sensed, and at least one audio signal of the vehicle is changed as a function of the sensed intention to communicate and/or the voice of the occupant.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: September 14, 2021
    Assignee: Bayerische Motoren Werke Aktiengesellschaft
    Inventor: Alexander Augst
  • Patent number: 11114108
    Abstract: A method includes extracting, from multiple microphone input, a hyperset of features of acoustic sources, using the extracted features to identify separable clusters associated with acoustic scenarios, and classifying subsequent input as one of the acoustic scenarios using the hyperset of features. The acoustic scenarios include a desired spatially moving/non-moving talker, and an undesired spatially moving/non-moving acoustic source. The hyperset of features includes both spatial and voice biometric features. The classified acoustic scenario may be used in a robotics application or voice assistant device desired speech enhancement or interference signal cancellation. Specifically, the classification of the acoustic scenarios can be used to adapt a beamformer, e.g., step size adjustment. The hyperset of features may also include visual biometric features extracted from one or more cameras viewing the acoustic sources.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: September 7, 2021
    Assignee: Cirrus Logic, Inc.
    Inventors: Ghassan Maalouli, Samuel P. Ebenezer
  • Patent number: 11114089
    Abstract: A method, system, and computer program product for applying a profile to an assistive device based on a multitude of cues includes: gathering audio inputs surrounding an assistive device; analyzing, by the assistive device, the audio inputs; determining, based on the analyzing, scenario cues; classifying a current environment surrounding the assistive device from the scenario cues; comparing the current environment to device profiles of the assistive device; determining, based on the comparing, a matching profile; and, in response to determining the matching profile, executing the matching profile on the assistive device.
    Type: Grant
    Filed: November 19, 2018
    Date of Patent: September 7, 2021
    Assignee: International Business Machines Corporation
    Inventors: Matthew Chapman, Chengxuan Xing, Andrew J. Daniel, Ashley Harrison
  • Patent number: 11114093
    Abstract: An intelligent voice recognition method, voice recognition apparatus and intelligent computing device are disclosed. An intelligent voice recognition method according to an embodiment of the present invention obtains a microphone detection signal, recognizes a voice of a user from the microphone detection signal and outputs a response related to the voice on the basis of a result of recognition of the voice, wherein the microphone detection signal includes noise, and recognition is performed on a signal containing only the voice, obtained by removing the noise from the microphone detection signal. Accordingly, only the voice of a user can be effectively separated from a microphone detection signal detected through a microphone of the voice recognition apparatus.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: September 7, 2021
    Assignee: LG ELECTRONICS INC.
    Inventor: Wonchul Kim
  • Patent number: 11094336
    Abstract: A sound analysis apparatus includes a sound acquirer configured to acquire a sound signal, a measurer configured to output time-series data of numerical values representing volumes based on the sound signal, and a calculator configured to perform calculation for analyzing the time-series data output from the measurer, wherein the calculator performs the calculation in a case of a first state in which a measured value that is the numerical value output from the measurer is included within an analysis target range that is a range in which the measured value is determined to be an analysis target, and wherein the calculator does not perform the calculation in a case of a second state in which the measured value is not included within the analysis target range.
    Type: Grant
    Filed: August 15, 2019
    Date of Patent: August 17, 2021
    Assignee: Yokogawa Electric Corporation
    Inventors: Yuko Ito, Hiroki Yoshino
  • Patent number: 11089404
    Abstract: A sound processing apparatus includes n number of microphones that are disposed correspondingly to n number of persons and that mainly collect sound signals uttered by respective relevant persons, a filter that suppresses crosstalk components included in a talker sound signal collected by a microphone corresponding to at least one talker using the sound signals collected by the n number of microphones, a parameter updater that updates a parameter of the filter for suppressing the crosstalk components and stores an update result in the memory in a case where a predetermined condition including time at which at least one talker talks is satisfied, and a sound output controller that outputs the sound signals, acquired by subtracting the crosstalk components by the filter from the talker sound signals based on the update result, from a speaker.
    Type: Grant
    Filed: January 24, 2020
    Date of Patent: August 10, 2021
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Masanari Miyamoto, Hiromasa Ohashi, Naoya Tanaka
  • Patent number: 11074927
    Abstract: A computer-implemented method, computer system and computer program product are provided for acoustic event detection in polyphonic acoustic data. According to the method, polyphonic acoustic data is input by one or more processing units into a trained neural network trained with labeled monophonic acoustic data, a first output from a hidden layer of the trained neural network is obtained by one or more processing units, and at least one acoustic classification of the polyphonic acoustic data is determined by one or more processing units based on the first output and a feature dictionary learned from the trained neural network.
    Type: Grant
    Filed: October 31, 2017
    Date of Patent: July 27, 2021
    Assignee: International Business Machines Corporation
    Inventors: Xiao Xing Liang, Ning Zhang, Yu Ling Zheng, Yu Chen Zhou
  • Patent number: 11069352
    Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.
    Type: Grant
    Filed: February 18, 2019
    Date of Patent: July 20, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Qingming Tang, Ming Sun, Chieh-Chi Kao, Chao Wang, Viktor Rozgic
  • Patent number: 11069353
    Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
    Type: Grant
    Filed: May 6, 2019
    Date of Patent: July 20, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
  • Patent number: 11062708
    Abstract: A method and an apparatus for dialoguing based on a mood of a user, where the method includes: collecting first audio data from the user, determining the mood of the user according to a feature of the first audio data, and dialoguing with the user using second audio data corresponding to the mood of the user. The method and the apparatus for dialoguing based on the mood of the user provided by the present disclosure may make different responses according to the mood of the user when dialoguing with the user. This further enriches the responses that the electronic device can make according to the user's voice data, and improves the user experience when dialoguing with the electronic device.
    Type: Grant
    Filed: July 12, 2019
    Date of Patent: July 13, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Li Xu, Yingchao Li, Xiaoxin Ma
  • Patent number: 11036305
    Abstract: There is provided a signal processing apparatus that includes a control unit that executes, on the basis of a waveform signal generated in accordance with a motion of an attachment portion of a sensor attached to a tool or a body, effect processing for the waveform signal or another waveform signal, the waveform signal being output from the sensor.
    Type: Grant
    Filed: June 4, 2020
    Date of Patent: June 15, 2021
    Assignee: SONY CORPORATION
    Inventors: Heesoon Kim, Masaharu Yoshino, Tatsushi Nashida, Masahiko Inami, Kouta Minamizawa, Yuta Sugiura, Yusuke Mizushina
  • Patent number: 11031008
    Abstract: A terminal device is provided. The terminal device includes a communication interface, and a processor configured to receive performance information of one or more other terminal devices from each of the one or more other terminal devices, identify an edge device to perform voice recognition based on the performance information received from each of the one or more other terminal devices, based on the terminal device being identified as the edge device, receive information associated with reception quality from one or more other terminal devices which receive a sound wave including a triggering word, determine a terminal device to acquire the sound wave for voice recognition from based on the received information associated with the reception quality, and transmit, to the determined terminal device, a command to transmit the sound wave acquired for voice recognition to an external voice recognition device.
    Type: Grant
    Filed: April 10, 2019
    Date of Patent: June 8, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Minseok Kim
  • Patent number: 11024331
    Abstract: Systems and methods for optimizing voice detection via a network microphone device are disclosed herein. In one example, individual microphones of a network microphone device detect sound. The sound data is captured in a first buffer and analyzed to detect a trigger event. Metadata associated with the sound data is captured in a second buffer and provided to at least one network device to determine at least one characteristic of the detected sound based on the metadata. The network device provides a response that includes an instruction, based on the determined characteristic, to modify at least one performance parameter of the NMD. The NMD then modifies the at least one performance parameter based on the instruction.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: June 1, 2021
    Assignee: Sonos, Inc.
    Inventors: Connor Kristopher Smith, Kurt Thomas Soto, Charles Conor Sleith
  • Patent number: 11024274
    Abstract: Systems, devices, and methods for segmenting musical compositions are described. Discrete, musically-coherent segments (such as intro, verse, chorus, bridge, solo, and the like) of a musical composition are identified. Distance measures are used to evaluate whether each bar of a musical composition is more like the bars that directly precede it or more like the bars that directly succeed it, and each respective series of musically similar bars is assigned to the same respective segment. Large changes in the distance measure(s) between adjacent bars may be used to identify boundaries between abutting musical segments. Computer systems and computer program products for implementing segmentation are also described. The results of segmentation may advantageously be applied in computer-based composition of music and musical variations, as well as in other applications involving labelling, characterizing, or otherwise processing music.
    Type: Grant
    Filed: January 28, 2020
    Date of Patent: June 1, 2021
    Assignee: Obeebo Labs Ltd.
    Inventor: Colin P. Williams
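
The bar-allegiance idea in US 11024274 (is a bar more like the bars before it or the bars after it?) can be sketched with simple distance comparisons over per-bar feature vectors, starting a new segment where the allegiance flips. The context size and the toy chroma-like features are assumptions for illustration.

```python
import numpy as np

def segment_bars(bar_features, context=2):
    """Label each bar with a segment index.  A new segment starts at the first bar
    that is more similar to the bars that follow it than to the bars before it."""
    F = np.asarray(bar_features, dtype=float)
    n = len(F)
    closer_to_next = []
    for i in range(n):
        prev = F[max(0, i - context):i]
        nxt = F[i + 1:i + 1 + context]
        if len(prev) == 0:                 # very first bar: trivially belongs to what follows
            closer_to_next.append(True)
            continue
        if len(nxt) == 0:                  # very last bar: belongs to what precedes it
            closer_to_next.append(False)
            continue
        d_prev = np.linalg.norm(F[i] - prev.mean(axis=0))
        d_next = np.linalg.norm(F[i] - nxt.mean(axis=0))
        closer_to_next.append(d_next < d_prev)
    labels = [0]
    for i in range(1, n):
        boundary = closer_to_next[i] and not closer_to_next[i - 1]
        labels.append(labels[-1] + 1 if boundary else labels[-1])
    return labels

# Toy bars: 4 "verse" bars then 4 "chorus" bars (feature = a chroma-like vector per bar).
verse = [np.array([1.0, 0.2, 0.1])] * 4
chorus = [np.array([0.1, 1.0, 0.9])] * 4
print(segment_bars(verse + chorus))   # [0, 0, 0, 0, 1, 1, 1, 1]
```
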
  • Patent number: 10997967
    Abstract: A method for initializing a device for performing acoustic speech recognition (ASR) using an ASR model, by a computer system including at least one processor and a system memory element. The method includes obtaining a plurality of voice data articulations of predetermined phrases, by the at least one processor via a user interface. The plurality of voice data articulations includes a first quantity of audio samples of actual articulated voice data, and each of the plurality of voice data articulations includes one of the audio samples including acoustic frequency components. The method further includes performing a plurality of augmentations to the plurality of voice data articulations of predetermined phrases, to generate a corpus audio data set that includes the first quantity of audio samples and a second quantity of audio samples including augmented versions of the first quantity of audio samples.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: May 4, 2021
    Assignee: HONEYWELL INTERNATIONAL INC.
    Inventors: Luning Wang, Wei Yang, Zhiyong Dai
  • Patent number: 10972834
    Abstract: This disclosure describes techniques for detecting voice commands from a user of an ear-based device. The ear-based device may include an in-ear facing microphone to capture sound emitted in an ear of the user, and an exterior facing microphone to capture sound emitted in an exterior environment of the user. The in-ear microphone may generate an inner audio signal representing the sound emitted in the ear, and the exterior microphone may generate an outer audio signal representing sound from the exterior environment. The ear-based device may compute a ratio of a power of the inner audio signal to the outer audio signal and may compare this ratio to a threshold. If the ratio is larger than the threshold, the ear-based device may detect the voice of the user. Further, the ear-based device may set a value of the threshold based on a level of acoustic seal of the ear-based device.
    Type: Grant
    Filed: February 11, 2020
    Date of Patent: April 6, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Kuan-Chieh Yen, Daniel Wayne Harris, Carlo Murgia, Taro Kimura
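
The own-voice test in US 10972834 is a power ratio between the in-ear and exterior microphones compared against a threshold that depends on the acoustic seal. A minimal sketch follows; the seal-to-threshold mapping and frame sizes are assumed for illustration.

```python
import numpy as np

def own_voice_detected(inner_frame, outer_frame, seal_level):
    """The wearer's own voice shows up much more strongly at the in-ear microphone
    than at the exterior microphone.  `seal_level` is in [0, 1]; a better acoustic
    seal raises the decision threshold (illustrative mapping)."""
    p_inner = np.mean(inner_frame ** 2)
    p_outer = np.mean(outer_frame ** 2) + 1e-12
    threshold = 2.0 + 6.0 * seal_level
    return (p_inner / p_outer) > threshold

rng = np.random.default_rng(4)
ambient = 0.05 * rng.standard_normal(1024)
own_speech_in_ear = 0.6 * rng.standard_normal(1024)   # occlusion-boosted own voice
print(own_voice_detected(own_speech_in_ear, ambient, seal_level=0.8))   # True
print(own_voice_detected(ambient, ambient, seal_level=0.8))             # False
```
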
  • Patent number: 10956734
    Abstract: An electronic device and a method of operating the electronic device are provided. The electronic device includes a proximity detector; an iris recognition module; a memory; and a processor electrically connected to the proximity detector, the iris recognition module, and the memory, wherein the processor is configured to execute an iris recognition operation based on the iris recognition module; determine proximity of an object based on the proximity detector while the iris recognition operation is performed; and, if the proximity of the object falls within a set reference range, stop the iris recognition operation.
    Type: Grant
    Filed: July 10, 2017
    Date of Patent: March 23, 2021
    Inventors: Hyung-Woo Shin, Hyemi Lee, Hyung Min Lee
  • Patent number: 10958468
    Abstract: A portable acoustic unit is adapted for insertion into an electrical receptacle. The portable acoustic unit has an integrated microphone and a wireless network interface to an automation controller. The portable acoustic unit detects spoken voice commands from users in the vicinity of the electrical receptacle. The portable acoustic unit merely plugs into a conventional electrical outlet to provide an extremely simple means of voice control through a home or business.
    Type: Grant
    Filed: August 29, 2018
    Date of Patent: March 23, 2021
    Assignee: AT&T INTELLECTUAL PROPERTY I, L. P.
    Inventors: Nafiz Haider, Ross Newman, Kristin Patterson, Thomas Risley, Curtis Stephenson, David Vaught
  • Patent number: 10950243
    Abstract: A system and method for improving T-matrix training for speaker recognition are provided. The method includes receiving an audio input, divisible into a plurality of audio frames, wherein at least a first audio frame includes an audio sample of a human speaker, the sample having a length above a first threshold; generating a feature vector for each audio frame; generating, for a first plurality of feature vectors, centered statistics of at least zeroth and first order; generating a first i-vector, the first i-vector representing the human speaker; generating an optimized T-matrix training sequence computation, based on the first i-vector, an initialized T-matrix, the centered statistics, and a Gaussian mixture model (GMM) of a trained universal background model (UBM).
    Type: Grant
    Filed: March 1, 2019
    Date of Patent: March 16, 2021
    Assignee: ILLUMA Labs Inc.
    Inventor: Milind Borkar
  • Patent number: 10951996
    Abstract: A binaural hearing system includes a first hearing device and a second hearing device, each of which comprising: an input transducer; a transducer audio signal processor configured to provide a processed input transducer audio signal; an ear canal microphone; an ear canal audio signal processor configured to provide a processed ear canal audio signal; a first signal combiner configured to combine the processed input transducer audio signal with the processed ear canal audio signal to obtain an output transducer audio signal; a signal level detector configured to determine a signal level of (1) the output transducer audio signal or (2) an audio signal included in formation of the output transducer audio signal; and an output transducer; wherein the binaural hearing system further comprises a binaural excessive level detector connected to the first hearing device's signal level detector and the second hearing device's signal level detector.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: March 16, 2021
    Assignee: GN Hearing A/S
    Inventors: Søren Christian Voigt Pedersen, Jonathan Boley, James Robert Anderson
  • Patent number: 10943598
    Abstract: Methods and systems for determining periods of excessive noise for smart speaker voice commands. An electronic timeline of volume levels of currently playing content is made available to a smart speaker. From this timeline, periods of high content volume are determined, and the smart speaker alerts users during periods of high volume, requesting that they wait until the high-volume period has passed before issuing voice commands. In this manner, the smart speaker helps prevent voice commands that may not be detected, or may be detected inaccurately, due to the noise of the content currently being played.
    Type: Grant
    Filed: March 18, 2019
    Date of Patent: March 9, 2021
    Assignee: ROVI GUIDES, INC.
    Inventors: Gyanveer Singh, Sukanya Agarwal, Vikram Makam Gupta
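
US 10943598 keys off a timeline of content volume levels; deciding whether to ask the user to wait is a lookup of the current time against the high-volume spans. A toy sketch with an assumed dB threshold and timeline format:

```python
def high_volume_periods(volume_timeline, threshold_db=70.0):
    """volume_timeline: list of (start_s, end_s, volume_db) spans for the playing content."""
    return [(s, e) for s, e, v in volume_timeline if v >= threshold_db]

def should_defer_command(now_s, volume_timeline, threshold_db=70.0):
    """Ask the user to wait if the current moment falls inside a high-volume span."""
    return any(s <= now_s < e for s, e in high_volume_periods(volume_timeline, threshold_db))

timeline = [(0, 30, 55.0), (30, 45, 78.0), (45, 90, 60.0)]   # loud action scene from 30 s to 45 s
print(should_defer_command(32.0, timeline))   # True  -> warn the user to wait
print(should_defer_command(50.0, timeline))   # False -> safe to take voice commands
```
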
  • Patent number: 10937448
    Abstract: A voice activity detection method and an apparatus are provided by embodiments of the present application. The method includes: performing framing processing on a voice to be detected to obtain a plurality of audio frames to be detected; obtaining an acoustic feature of each of the audio frames to be detected, and sequentially inputting the acoustic feature of each of the audio frames to be detected to a VAD model, wherein the VAD model is configured to classify the first N voice frames in the voice to be detected as noise frames, classify frames from an (N+1)-th voice frame to a last voice frame as voice frames, and classify M noise frames after the last voice frame as voice frames, where N and M are integers; and determining a voice activity detection result according to a classification result output by the VAD model.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: March 2, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Chao Li, Weixin Zhu
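
The frame-relabeling rule described in US 10937448 (drop the first N raw voice frames, keep everything through the last voice frame, and extend M frames of hangover) can be sketched as a post-processing pass over raw per-frame decisions. Both this reading of the abstract and the parameter values are my assumptions.

```python
def smooth_vad(raw_is_voice, n_skip=2, m_hangover=3):
    """Relabel raw per-frame decisions: the first N raw voice frames are treated as
    noise, everything from the (N+1)-th through the last raw voice frame is voice,
    and M frames after the last voice frame are also kept as voice (hangover)."""
    voice_idx = [i for i, v in enumerate(raw_is_voice) if v]
    out = [False] * len(raw_is_voice)
    if len(voice_idx) <= n_skip:
        return out
    start, last = voice_idx[n_skip], voice_idx[-1]
    for i in range(start, min(last + 1 + m_hangover, len(out))):
        out[i] = True
    return out

raw = [False, True, True, True, False, True, True] + [False] * 6
print(smooth_vad(raw))   # first two voice frames dropped, three hangover frames appended
```
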
  • Patent number: 10917717
    Abstract: Gain mismatch and related problems can be solved by a system and method that applies an automatic microphone signal gain equalization without any direct absolute reference or calibration phase. The system and method perform the steps of receiving, by a computing device, a speech signal from a speaking person via a plurality of microphones, determining a speech signal component in the time-frequency domain for each microphone of the plurality of microphones, calculating an instantaneous cross-talk coupling matrix based on the speech signal components across the microphones, estimating gain factors based on calculated cross-talk couplings and a given expected cross-talk attenuation, limiting the gain factors to appropriate maximum and minimum values, and applying the gain factors to the speech signal used in the control path to control further speech enhancement algorithms or used in the signal path for direct influence on the speech-enhanced audio output signal.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: February 9, 2021
    Assignee: Nuance Communications, Inc.
    Inventors: Timo Matheja, Markus Buck
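
A much-simplified, single-snapshot sketch of the gain-equalization idea in US 10917717: take the dominant microphone as reference, measure the instantaneous cross-talk coupling at the other microphones, and derive clipped gain factors that restore an expected attenuation. The patent works per time-frequency bin with a full coupling matrix; the values and mapping here are illustrative assumptions.

```python
import numpy as np

def estimate_gain_factors(mic_powers, expected_attenuation=0.25, g_min=0.5, g_max=2.0):
    """mic_powers: average speech power at each microphone while one dedicated talker
    is active.  The loudest microphone is taken as the talker's own microphone; the
    others should only see cross-talk at `expected_attenuation` of that power, and
    gain factors are derived (and clipped) to restore that relationship."""
    p = np.asarray(mic_powers, dtype=float)
    ref = int(np.argmax(p))
    coupling = p / (p[ref] + 1e-12)                     # instantaneous cross-talk coupling
    gains = np.ones_like(p)
    others = np.arange(len(p)) != ref
    gains[others] = expected_attenuation / (coupling[others] + 1e-12)
    return np.clip(gains, g_min, g_max)

# Mic 1 is too sensitive: its cross-talk appears at 0.5x instead of the expected 0.25x.
print(estimate_gain_factors([1.0, 0.5, 0.25]))   # -> [1.0, 0.5, 1.0]: attenuate mic 1
```
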
  • Patent number: 10903863
    Abstract: A first set of signal data is received. Generative machine learning models are trained based on the first set of signal data. The generative machine learning models include at least a first model trained to identify a first signal component and a second model trained to identify a second signal component. An incoming mixed signal data stream is dynamically separated into a clean signal component and a noise signal component by running the generative machine learning models.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: January 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Francois Pierre Luus, Etienne Eben Vos, Komminist Weldemariam
  • Patent number: 10885902
    Abstract: Techniques are described for using steganography to protect sensitive information within conversational audio data by generating a pseudo-language representation of conversational audio data. In some implementations, audio data corresponding to an utterance is received. The audio data is classified as likely sensitive audio data. A particular set of sentiments associated with the audio data is determined. Data indicating the particular set of sentiments associated with the audio data is provided to a model. The model is trained to output, for each of different sets of sentiments, desensitized, pseudo-language audio data that exhibits the set of sentiments, and is not classified as likely sensitive audio data. A particular desensitized, pseudo-language audio data is received from the model. The audio data is replaced with the particular desensitized, pseudo-language audio data and stored within an audio data repository.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: January 5, 2021
    Assignee: X Development LLC
    Inventors: Antonio Raymond Papania-Davis, Bin Ni, Shelby Lin