Detect Speech In Noise Patents (Class 704/233)

Systems and methods for audio signal processing using spectral-spatial mask estimation

Patent number: 11289109

Abstract: Embodiments of the disclosure provide systems and methods for audio signal processing. An exemplary system may include a communication interface configured to receive a first audio signal acquired from an audio source through a first channel, and a second audio signal acquired from the same audio source through a second channel. The system may also include at least one processor coupled to the communication interface. The at least one processor may be configured to determine channel features based on the first audio signal and the second audio signal individually and determine a cross-channel feature based on the first audio signal and the second audio signal collectively. The at least one processor may further be configured to concatenate the channel features and the cross-channel feature and estimate spectral-spatial masks for the first channel and the second channel using the concatenated channel features and the cross-channel feature.

Type: Grant

Filed: April 24, 2020

Date of Patent: March 29, 2022

Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.

Inventors: Chengyun Deng, Hui Song, Yi Zhang, Yongtao Sha
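
The feature construction described in the abstract of patent 11289109 can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which features are used, so log-magnitude (per channel) and inter-channel phase difference (cross-channel) are stand-ins, the function names are hypothetical, and the mask estimator itself is left out.

```python
import cmath
import math


def channel_feature(spectrum):
    """Per-channel feature computed from one channel alone.

    ASSUMPTION: log-magnitude per frequency bin; the abstract does not name the feature.
    """
    return [math.log(abs(c) + 1e-8) for c in spectrum]


def cross_channel_feature(spec_a, spec_b):
    """Cross-channel feature computed from both channels together.

    ASSUMPTION: inter-channel phase difference per frequency bin.
    """
    return [cmath.phase(a * b.conjugate()) for a, b in zip(spec_a, spec_b)]


def mask_estimator_input(spec_a, spec_b):
    """Concatenate both channel features with the cross-channel feature."""
    return (
        channel_feature(spec_a)
        + channel_feature(spec_b)
        + cross_channel_feature(spec_a, spec_b)
    )


# Two-bin complex spectra for one frame of each channel (made-up values).
features = mask_estimator_input([1 + 0j, 1j], [1 + 0j, 1 + 0j])
```

The concatenated vector would then be passed to whatever model estimates the spectral-spatial masks for the two channels.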
Digital assistant activation based on wake word association

Patent number: 11282528

Abstract: One embodiment provides a method, including: receiving, at an information handling device, user input comprising a potential wake word; determining, using a processor, whether the potential wake word is associated with a stored wake word; and responsive to determining that the potential wake word is associated with the stored wake word, activating, based on the potential wake word, a digital assistant associated with the information handling device. Other aspects are described and claimed.

Type: Grant

Filed: August 14, 2017

Date of Patent: March 22, 2022

Assignee: Lenovo (Singapore) Pte. Ltd.

Inventors: Ryan Charles Knudson, Russell Speight VanBlon, Roderick Echols, Jonathan Gaither Knox
Audio device with wakeup word detection

Patent number: 11270696

Abstract: An audio device with at least one microphone adapted to receive sound from a sound field and create an output, and a processing system that is responsive to the output of the microphone. The processing system is configured to use a signal processing algorithm to detect a wakeup word, and modify the signal processing algorithm that is used to detect the wakeup word if the sound field changes.

Type: Grant

Filed: July 1, 2019

Date of Patent: March 8, 2022

Assignee: Bose Corporation

Inventors: Ricardo Carreras, Alaganandan Ganeshkumar
Voice wake-up processing method, apparatus and storage medium

Patent number: 11257497

Abstract: The present disclosure provides a voice wake-up processing method, an apparatus and a storage medium. After acquiring voice wake-up signals collected by audio input devices in at least two audio zones, an electronic device may correct a to-be-woken-up audio zone identified using a voice engine, based on to-be-woken-up audio zones obtained from amplitudes of the voice wake-up signals collected by the audio input devices in the at least two audio zones. This avoids waking up every audio zone in which audio input devices collecting voice wake-up signals produced by the same user are located, which can improve the accuracy of the voice wake-up result obtained by the electronic device. The present disclosure can therefore solve the technical problem that a vehicle-mounted terminal has low voice wake-up accuracy due to an insufficient degree of sound isolation between audio zones of the vehicle-mounted terminal.

Type: Grant

Filed: December 23, 2019

Date of Patent: February 22, 2022

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Hanying Peng, Nengjun Ouyang
Adaptive spatial VAD and time-frequency mask estimation for highly non-stationary noise sources

Patent number: 11257512

Abstract: Systems and methods include a first voice activity detector operable to detect speech in a frame of a multichannel audio input signal and output a speech determination, a constrained minimum variance adaptive filter operable to receive the multichannel audio input signal and the speech determination and minimize a signal variance at the output of the filter, thereby producing an equalized target speech signal, a mask estimator operable to receive the equalized target speech signal and the speech determination and generate a spectral-temporal mask to discriminate a target speech from noise and interference speech, and a second voice activity detector operable to detect voice in a frame of the speech discriminated signal. An audio input sensor array includes a plurality of microphones, each microphone generating a channel of the multichannel audio input signal. A sub-band analysis module is operable to decompose each of the channels into a plurality of frequency sub-bands.

Type: Grant

Filed: January 6, 2020

Date of Patent: February 22, 2022

Assignee: SYNAPTICS INCORPORATED

Inventors: Francesco Nesta, Alireza Masnadi-Shirazi
Adaptive audio enhancement for multichannel speech recognition

Patent number: 11257485

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.

Type: Grant

Filed: December 10, 2019

Date of Patent: February 22, 2022

Assignee: Google LLC

Inventors: Bo Li, Ron J. Weiss, Michiel A. U. Bacchiani, Tara N. Sainath, Kevin William Wilson
Dynamic and/or context-specific hot words to invoke automated assistant

Patent number: 11257487

Abstract: Techniques are described herein for enabling the use of “dynamic” or “context-specific” hot words to invoke an automated assistant. In various implementations, an automated assistant may be executed in a default listening state at least in part on a user's computing device(s). While in the default listening state, audio data captured by microphone(s) may be monitored for default hot words. Detection of the default hot word(s) transitions the automated assistant into a speech recognition state. Sensor signal(s) generated by hardware sensor(s) integral with the computing device(s) may be detected and analyzed to determine an attribute of the user. Based on the analysis, the automated assistant may transition into an enhanced listening state in which the audio data may be monitored for enhanced hot word(s). Detection of enhanced hot word(s) triggers the automated assistant to perform a responsive action without requiring detection of default hot word(s).

Type: Grant

Filed: August 21, 2018

Date of Patent: February 22, 2022

Assignee: GOOGLE LLC

Inventor: Diego Melendo Casado
Sound detection

Patent number: 11250877

Abstract: A method for generating a health indicator for at least one person of a group of people, the method comprising: receiving, at a processor, captured sound, where the captured sound is sound captured from the group of people; comparing the captured sound to a plurality of sound models to detect at least one non-speech sound event in the captured sound, each of the plurality of sound models associated with a respective health-related sound type; determining metadata associated with the at least one non-speech sound event; assigning the at least one non-speech sound event and the metadata to at least one person of the group of people; and outputting a message identifying the at least one non-speech event and the metadata to a health indicator generator module to generate a health indicator for the at least one person to whom the at least one non-speech sound event is assigned.

Type: Grant

Filed: July 25, 2019

Date of Patent: February 15, 2022

Assignee: AUDIO ANALYTIC LTD

Inventors: Christopher Mitchell, Joe Patrick Lynas, Sacha Krstulovic, Amoldas Jasonas, Julian Harris
Question and answer pair generation using machine learning

Patent number: 11250038

Abstract: An interactive question and answer (Q&A) service provides pairs of questions and corresponding answers related to the content of a web page. The service includes pre-configured Q&A pairs derived from a deep learning framework that includes a series of neural networks trained through joint and transfer learning to generate questions for a given text passage. In addition, pre-configured Q&A pairs are generated from historical web access patterns and sources related to the content of the web page.

Type: Grant

Filed: August 13, 2018

Date of Patent: February 15, 2022

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Payal Bajaj, Gearard Boland, Anshul Gupta, Matthew Glenn Jin, Eduardo Enrique Noriega De Armas, Jason Shaver, Neelakantan Sundaresan, Roshanak Zilouchian Moghaddam
Music classifier and related methods

Patent number: 11240609

Abstract: An audio device that includes a music classifier that determines when music is present in an audio signal is disclosed. The audio device is configured to receive audio, process the received audio, and to output the processed audio to a user. The processing may be adjusted based on the output of the music classifier. The music classifier utilizes a plurality of decision making units, each operating on the received audio independently. The decision making units are simplified to reduce the processing, and therefore the power, necessary for operation. Accordingly each decision making unit may be insufficient to determine music alone but in combination may accurately detect music while consuming power at a rate that is suitable for a mobile device, such as a hearing aid.

Type: Grant

Filed: June 3, 2019

Date of Patent: February 1, 2022

Assignee: SEMICONDUCTOR COMPONENTS INDUSTRIES, LLC

Inventors: Pejman Dehghani, Robert L. Brennan
Dialogue enhancement based on synthesized speech

Patent number: 11238883

Abstract: A method and a system for dialogue enhancement of an audio signal, comprising receiving (step S1) the audio signal and a text content associated with dialogue occurring in the audio signal, generating (step S2) parameterized synthesized speech from the text content, and applying (step S3) dialogue enhancement to the audio signal based on the parameterized synthesized speech. With the invention, text captions, subtitles, or other forms of text content included in an audio stream can be used to significantly improve dialogue enhancement on the playback side.

Type: Grant

Filed: May 23, 2019

Date of Patent: February 1, 2022

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Timothy Alan Port, Winston Chi Wai Ng, Mark William Gerrard
Systems and methods for training devices to recognize sound patterns

Patent number: 11222625

Abstract: Systems and methods for training a control panel to recognize user defined and preprogrammed sound patterns are provided. Such systems and methods can include the control panel operating in a learning mode, receiving initial ambient audio from a region, and saving the initial ambient audio as an audio pattern in a memory device of the control panel. Such systems and methods can also include the control panel operating in an active mode, receiving subsequent ambient audio from the region, using an audio classification model to make an initial determination as to whether the subsequent ambient audio matches or is otherwise consistent with the audio pattern, determining whether the initial determination is correct, and when the control panel determines that the initial determination is incorrect, modifying or updating the audio classification model for improving the accuracy in detecting future consistency with the audio pattern.

Type: Grant

Filed: April 15, 2019

Date of Patent: January 11, 2022

Assignee: Ademco Inc.

Inventors: Pradyumna Sampath, Ramprasad Yelchuru, Purnaprajna R. Mangsuli
Server for providing voice recognition service

Patent number: 11222624

Abstract: A server may provide a voice recognition service. The server may include a memory configured for storing a plurality of voice recognition models, a communication device configured for communicating with a plurality of voice recognition devices, and an artificial intelligence device configured for providing a voice recognition service to the plurality of voice recognition devices, acquiring use-related information regarding a first voice recognition device (from among the plurality of voice recognition devices), and changing a voice recognition model corresponding to the first voice recognition device from a first voice recognition model to a second voice recognition model based on the use-related information.

Type: Grant

Filed: August 20, 2019

Date of Patent: January 11, 2022

Assignee: LG ELECTRONICS INC.

Inventors: Jaehong Kim, Hangil Jeong
Voice detection

Patent number: 11222654

Abstract: A method for voice detection, the method may include (a) generating an in-ear signal that represents a signal sensed by an in-ear microphone and fed to a feedback active noise cancellation (ANC) circuit; (b) generating at least one additional signal, based on at least one out of a playback signal and a pickup signal sensed by a voice pickup microphone; and (c) generating a voice indicator based on the in-ear signal and the at least one additional signal.

Type: Grant

Filed: January 13, 2020

Date of Patent: January 11, 2022

Assignee: DSP GROUP LTD.

Inventors: Assaf Ganor, Ori Elyada
Audio signal processing method and device, terminal and storage medium

Patent number: 11205411

Abstract: A method for processing an audio signal includes the following: audio signals emitted respectively from at least two sound sources are acquired through at least two microphones to obtain respective original noisy signals of the at least two microphones; sound source separation is performed on the respective original noisy signals of the at least two microphones to obtain respective time-frequency estimated signals of the at least two sound sources; a mask value of the time-frequency estimated signal of each sound source in the original noisy signal of each microphone is determined based on the respective time-frequency estimated signals; the respective time-frequency estimated signals of the at least two sound sources are updated based on the respective original noisy signals of the at least two microphones and the mask values; and the audio signals emitted respectively from the at least two sound sources are determined.

Type: Grant

Filed: May 29, 2020

Date of Patent: December 21, 2021

Assignee: Beijing Xiaomi Intelligent Technology Co., Ltd.

Inventor: Haining Hou
Multi-frequency sensing method and apparatus using mobile-clusters

Patent number: 11204736

Abstract: The systems and methods described relate to the concept that smart devices can be used to 1) sense various types of phenomena like sound, blue light exposure, RF and microwave radiation, and 2) in real-time analyze, report and/or control outputs (e.g., displays or speakers). The systems are configurable and use standard computing devices, such as wearable electronics, tablet computers, and mobile phones to measure various frequency bands across multiple points, allowing a single user to visualize and/or adjust environmental conditions.

Type: Grant

Filed: October 17, 2019

Date of Patent: December 21, 2021

Assignee: ZOPHONOS INC.

Inventor: Levaughn Denton
Persistent interference detection

Patent number: 11189303

Abstract: A multi-microphone algorithm for detecting and differentiating interference sources from desired talker speech in advanced audio processing for smart home applications is described. The approach is based on characterizing a persistent interference source when sounds repeatedly occur from a fixed spatial location relative to the device, which is also fixed. Some examples of such interference sources include TV, music system, air-conditioner, washing machine, and dishwasher. Real human talkers, in contrast, are not expected to remain stationary and speak continuously from the same position for a long time. The persistency of an acoustic source is established based on identifying historically-recurring inter-microphone frequency-dependent phase profiles in multiple time periods of the audio data. The detection algorithm can be used with a beamforming processor to suppress the interference and for achieving voice quality and automatic speech recognition rate improvements in smart home applications.

Type: Grant

Filed: September 25, 2017

Date of Patent: November 30, 2021

Assignee: Cirrus Logic, Inc.

Inventors: Narayan Kovvali, Seth Suppappola
Collective emotional engagement detection in group conversations

Patent number: 11188718

Abstract: A collective emotional engagement detection arrangement is provided for determining emotions of users in group conversations. A computer-implemented method includes determining a first conversation velocity of communications through conversation channels over a first time period for a group discussion between user computers; determining that a conversation velocity of the communications has increased to a second conversation velocity of communications which exceeds a predetermined threshold, and has remained above the predetermined threshold for at least a second time period; determining aggregated emotions of the users during the second time period; and providing an output to a moderator of the group discussion indicating that the second conversation velocity of the communications has exceeded the predetermined threshold for at least the second time period, and indicating the aggregated emotions of the users during the second time period.

Type: Grant

Filed: September 27, 2019

Date of Patent: November 30, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ilse M. Breedvelt-Schouten, John A. Lyons, Jana H. Jenkins, Jeffrey A. Kusnitz
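
The sustained-threshold check in the abstract of patent 11188718 can be sketched as below. This is a minimal illustration, assuming conversation velocity is already available as one number per fixed-length interval (for example, messages per minute); the function name and the sample values are hypothetical, and the emotion aggregation step is not shown.

```python
def sustained_above(velocities, threshold, min_intervals):
    """Return the index where a run of at least `min_intervals` consecutive
    values above `threshold` starts, or None if no such run exists.

    ASSUMPTION: each entry of `velocities` covers one fixed-length interval,
    so `min_intervals` stands in for the "second time period".
    """
    run = 0
    for i, v in enumerate(velocities):
        run = run + 1 if v > threshold else 0
        if run >= min_intervals:
            return i - min_intervals + 1
    return None


# Messages per minute over six minutes (made-up values).
start = sustained_above([2, 3, 9, 10, 11, 4], threshold=8, min_intervals=3)
```

A moderator notification would only be produced when the function returns an index, that is, when the velocity stayed above the threshold for the whole required period rather than spiking once.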
Method and system that determines application topology using network metrics

Patent number: 11184244

Abstract: The current document is directed to methods and systems that employ network metrics collected by distributed-computer-system metrics-collection services to determine a service-call-based topology for distributed service-oriented applications. In a described implementation, network metrics are collected over a number of network-metric monitoring periods. Independent component analysis is used to extract, from the collected network metrics, signals corresponding to sequences of service calls initiated by calls to the application-programming interface of a distributed service-oriented application. The signals, in combination with call traces obtained from a distributed-services call-tracing utility or service, are then used to construct representations of distributed-service-oriented-application topologies. The distributed-service-oriented-application topologies provide a basis for any additional types of distributed-computer-system functionalities, utilities, and facilities.

Type: Grant

Filed: February 19, 2020

Date of Patent: November 23, 2021

Assignee: VMware, Inc.

Inventors: Susobhit Panigrahi, Reghuram Vasanthakumari, Arihant Jain
Speech enhancement method and apparatus

Patent number: 11164591

Abstract: A speech enhancement method includes determining a first spectral subtraction parameter based on a power spectrum of a speech signal containing noise and a power spectrum of a noise signal, determining a second spectral subtraction parameter based on the first spectral subtraction parameter and a reference power spectrum, and performing, based on the power spectrum of the noise signal and the second spectral subtraction parameter, spectral subtraction on the speech signal containing noise, where the reference power spectrum includes a predicted user speech power spectrum and/or predicted environmental noise power. Regularity of a power spectrum feature of a user speech of a terminal device and/or regularity of a power spectrum feature of noise in an environment in which a user is located are considered.

Type: Grant

Filed: January 18, 2018

Date of Patent: November 2, 2021

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Weixiang Hu, Lei Miao
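
The two-stage parameter scheme in the abstract of patent 11164591 can be sketched as follows. The abstract does not give formulas, so everything numeric here is an assumption: the first parameter uses a common over-subtraction heuristic that shrinks as the signal-to-noise ratio rises, and the second parameter simply reduces subtraction in bins where the predicted user speech power is large. All function names are hypothetical.

```python
import math


def first_parameter(noisy_power, noise_power):
    """First spectral subtraction parameter from the noisy and noise power spectra.

    ASSUMPTION: over-subtraction factor 4 - 0.15 * SNR_dB, with SNR clamped to [-5, 20] dB.
    """
    noisy_total = sum(noisy_power)
    noise_total = sum(noise_power)
    if noisy_total <= 0.0 or noise_total <= 0.0:
        return 1.0
    snr_db = 10.0 * math.log10(noisy_total / noise_total)
    snr_db = max(-5.0, min(20.0, snr_db))
    return 4.0 - 0.15 * snr_db


def second_parameter(alpha, noisy_power, reference_speech_power):
    """Per-bin second parameter from the first parameter and a reference spectrum.

    ASSUMPTION: subtract up to 50% less where predicted user speech dominates the bin.
    """
    params = []
    for noisy, ref in zip(noisy_power, reference_speech_power):
        ratio = min(1.0, ref / noisy) if noisy > 0.0 else 0.0
        params.append(alpha * (1.0 - 0.5 * ratio))
    return params


def spectral_subtract(noisy_power, noise_power, params, floor=0.01):
    """Subtract scaled noise power per bin, keeping a small spectral floor."""
    return [
        max(y - a * n, floor * y)
        for y, n, a in zip(noisy_power, noise_power, params)
    ]


noisy = [10.0, 10.0]
noise = [1.0, 1.0]
predicted_speech = [10.0, 0.0]  # hypothetical predicted user speech power spectrum
alpha = first_parameter(noisy, noise)
params = second_parameter(alpha, noisy, predicted_speech)
enhanced = spectral_subtract(noisy, noise, params)
```

In this toy case the bin where user speech is predicted keeps more of its power than the bin where none is predicted, which is the kind of behavior the reference power spectrum is meant to enable.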
Voice command filtering

Patent number: 11150869

Abstract: Aspects of the present disclosure relate to voice command filtering. One or more directions of background noise for a location of a voice command device are determined. The one or more directions of background noise are stored as one or more blocked directions. A voice input is received at the location of the voice command device. A direction the voice input is being received from is determined and compared to the one or more blocked directions. The voice input is ignored in response to the direction of the voice input being received from corresponding to a direction of the one or more blocked directions, unless the received voice input is in a recognized voice.

Type: Grant

Filed: February 14, 2018

Date of Patent: October 19, 2021

Assignee: International Business Machines Corporation

Inventors: Eunjin Lee, Daniel Cunnington, John J. Wood, Giacomo G. Chiarella
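
The filtering rule in the abstract of patent 11150869 can be sketched as below. This is a minimal illustration, assuming directions are expressed as azimuth angles in degrees and that "corresponding to" a blocked direction means falling within a fixed angular tolerance; the tolerance value and function names are hypothetical, and direction estimation and voice recognition are treated as given inputs.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two azimuth angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def should_ignore(direction_deg, blocked_directions, is_recognized_voice,
                  tolerance_deg=15.0):
    """Ignore a voice input that arrives from a blocked direction,
    unless it is in a recognized voice.

    ASSUMPTION: a direction matches a blocked one when within `tolerance_deg`.
    """
    if is_recognized_voice:
        return False
    return any(
        angular_distance(direction_deg, blocked) <= tolerance_deg
        for blocked in blocked_directions
    )


# A television was previously detected at roughly 90 degrees (made-up value).
blocked = [90.0]
ignore_tv = should_ignore(92.0, blocked, is_recognized_voice=False)
```

The wrap-around handling in `angular_distance` matters for sources near 0/360 degrees, where a naive subtraction would miss a match.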
Split frequency band signal paths for signal sources

Patent number: 11146298

Abstract: A signal generator device includes a digital signal waveform generator to produce a digital signal waveform, a first frequency band signal path having a first frequency band filter to receive the digital signal waveform and to pass first frequency band components of the digital signal waveform, and a first digital-to-analog converter to receive the first frequency band components of the digital signal waveform and to produce a first frequency band analog signal, a second frequency band signal path having a second frequency band filter to receive the digital signal waveform and to pass second frequency band components of the digital signal waveform, a second digital-to-analog converter to receive the second frequency band components of the digital signal waveform and to produce a second frequency band analog signal, and a combining element to combine the first frequency band analog signal and the second frequency band analog signal to produce a wideband analog signal.

Type: Grant

Filed: September 30, 2019

Date of Patent: October 12, 2021

Assignee: Tektronix, Inc.

Inventor: Gregory A. Martin
Audio contribution identification system and method

Patent number: 11146907

Abstract: A system for identifying the contribution of a given sound source to a composite audio track, the system comprising an audio input unit operable to receive an input composite audio track comprising two or more sound sources, including the given sound source, an audio generation unit operable to generate, using a model of a sound source, an approximation of the contribution of the given sound source to the composite audio track, an audio comparison unit operable to compare the generated audio to at least a portion of the composite audio track to determine whether the generated audio provides an approximation of the composite audio track that meets a threshold degree of similarity, and an audio identification unit operable to identify, when the threshold is met, the generated audio as a suitable representation of the contribution of the sound source to the composite audio track.

Type: Grant

Filed: April 3, 2020

Date of Patent: October 12, 2021

Assignee: Sony Interactive Entertainment Inc.

Inventors: Fabio Cappello, Oliver Hume
Voice activity detection based on entropy-energy feature

Patent number: 11138992

Abstract: This application discloses a voice activity detection method. The method includes receiving speech data, the speech data including a multi-frame speech signal; determining energy and spectral entropy of a frame of speech signal; calculating a square root of the energy of the speech signal and/or calculating a square root of the spectral entropy of the frame of the speech signal; determining a spectral entropy-energy square root of the frame of the speech signal based on at least one of the square root of the energy and the square root of the spectral entropy; and determining that the frame of the speech signal is an unvoiced frame if the spectral entropy-energy square root of the speech signal is less than a first threshold, or that it is a voiced frame if the spectral entropy-energy square root of the speech signal is greater than or equal to the first threshold.

Type: Grant

Filed: October 28, 2019

Date of Patent: October 5, 2021

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventor: Jizhong Liu
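
The frame decision in the abstract of patent 11138992 can be sketched as follows. The abstract does not state how the two square roots are combined, so the product used here is an assumption, as are the function names and the threshold value. A naive DFT is used to keep the sketch dependency-free; a real implementation would use an FFT.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of each non-negative frequency bin, via a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def frame_energy(frame):
    return sum(x * x for x in frame)


def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum."""
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total == 0.0:
        return 0.0
    entropy = -sum((p / total) * math.log(p / total) for p in spectrum if p > 0.0)
    return max(0.0, entropy)


def entropy_energy_sqrt(frame):
    """ASSUMPTION: combine the two square roots by multiplying them."""
    return math.sqrt(frame_energy(frame)) * math.sqrt(spectral_entropy(frame))


def is_voiced(frame, threshold):
    """Voiced when the combined value is at or above the threshold, else unvoiced."""
    return entropy_energy_sqrt(frame) >= threshold


# A 32-sample frame made of three tones (made-up test signal).
frame = [sum(math.sin(2 * math.pi * k * t / 32) for k in (2, 5, 9))
         for t in range(32)]
loud_is_voiced = is_voiced(frame, threshold=1.0)
```

Scaling the same frame down by a factor of 100 leaves its spectral entropy unchanged but drops the energy term, so the combined value falls below the threshold and the frame is classified as unvoiced.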
Voice recognition device and voice recognition method

Patent number: 11132998

Abstract: A voice recognition device includes: a first feature vector calculating unit (2) for calculating a first feature vector from voice data input; an acoustic likelihood calculating unit (4) for calculating an acoustic likelihood of the first feature vector by using an acoustic model used for calculating an acoustic likelihood of a feature vector; a second feature vector calculating unit (3) for calculating a second feature vector from the voice data; a noise degree calculating unit (6) for calculating a noise degree of the second feature vector by using a discriminant model used for calculating a noise degree indicating whether a feature vector is noise or voice; a noise likelihood recalculating unit (8) for recalculating an acoustic likelihood of noise on the basis of the acoustic likelihood of the first feature vector and the noise degree of the second feature vector; and a collation unit (9) for performing collation with a pattern of a vocabulary word to be recognized, by using the acoustic likelihood calcula

Type: Grant

Filed: March 24, 2017

Date of Patent: September 28, 2021

Assignee: MITSUBISHI ELECTRIC CORPORATION

Inventors: Toshiyuki Hanazawa, Tomohiro Narita
Method, device, mobile user apparatus and computer program for controlling an audio system of a vehicle

Patent number: 11122367

Abstract: In a method for controlling an audio system of a vehicle, an intention to communicate and/or a voice of at least one of a specific occupant of the vehicle and/or of an occupant on a specific seat of the vehicle, are/is sensed, and at least one audio signal of the vehicle is changed as a function of the sensed intention to communicate and/or the voice of the occupant.

Type: Grant

Filed: October 2, 2019

Date of Patent: September 14, 2021

Assignee: Bayerische Motoren Werke Aktiengesellschaft

Inventor: Alexander Augst
Acoustic source classification using hyperset of fused voice biometric and spatial features

Patent number: 11114108

Abstract: A method includes extracting, from multiple microphone input, a hyperset of features of acoustic sources, using the extracted features to identify separable clusters associated with acoustic scenarios, and classifying subsequent input as one of the acoustic scenarios using the hyperset of features. The acoustic scenarios include a desired spatially moving/non-moving talker, and an undesired spatially moving/non-moving acoustic source. The hyperset of features includes both spatial and voice biometric features. The classified acoustic scenario may be used in a robotics application or voice assistant device desired speech enhancement or interference signal cancellation. Specifically, the classification of the acoustic scenarios can be used to adapt a beamformer, e.g., step size adjustment. The hyperset of features may also include visual biometric features extracted from one or more cameras viewing the acoustic sources.

Type: Grant

Filed: May 11, 2020

Date of Patent: September 7, 2021

Assignee: Cirrus Logic, Inc.

Inventors: Ghassan Maalouli, Samuel P. Ebenezer
Customizing a voice-based interface using surrounding factors

Patent number: 11114089

Abstract: A method, system, and computer program product for applying a profile to an assistive device based on a multitude of cues includes: gathering audio inputs surrounding an assistive device; analyzing, by the assistive device, the audio inputs; determining, based on the analyzing, scenario cues; classifying a current environment surrounding the assistive device from the scenario cues; comparing the current environment to device profiles of the assistive device; determining, based on the comparing, a matching profile; and, in response to determining the matching profile, executing the matching profile on the assistive device.

Type: Grant

Filed: November 19, 2018

Date of Patent: September 7, 2021

Assignee: International Business Machines Corporation

Inventors: Matthew Chapman, Chengxuan Xing, Andrew J. Daniel, Ashley Harrison
Intelligent voice recognizing method, apparatus, and intelligent computing device

Patent number: 11114093

Abstract: An intelligent voice recognition method, voice recognition apparatus and intelligent computing device are disclosed. An intelligent voice recognition method according to an embodiment of the present invention obtains a microphone detection signal, recognizes a voice of a user from the microphone detection signal and outputs a response related to the voice on the basis of a result of recognition of the voice, wherein the microphone detection signal includes noise, and a microphone detection signal including only the voice obtained by removing the noise from the microphone detection signal is recognized. Accordingly, only a voice of a user can be effectively separated from a microphone detection signal detected through a microphone of the voice recognition apparatus.

Type: Grant

Filed: August 29, 2019

Date of Patent: September 7, 2021

Assignee: LG ELECTRONICS INC.

Inventor: Wonchul Kim
Sound analysis apparatus, sound analysis method, and non-transitory computer readable storage medium

Patent number: 11094336

Abstract: A sound analysis apparatus includes a sound acquirer configured to acquire a sound signal, a measurer configured to output time-series data of numerical values representing volumes based on the sound signal, and a calculator configured to perform calculation for analyzing the time-series data output from the measurer, wherein the calculator performs the calculation in a case of a first state in which a measured value that is the numerical value output from the measurer is included within an analysis target range that is a range in which the measured value is determined to be an analysis target, and wherein the calculator does not perform the calculation in a case of a second state in which the measured value is not included within the analysis target range.

Type: Grant

Filed: August 15, 2019

Date of Patent: August 17, 2021

Assignee: Yokogawa Electric Corporation

Inventors: Yuko Ito, Hiroki Yoshino
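
The range-gated calculation in the abstract of patent 11094336 can be sketched as below. This is a minimal illustration, assuming the analysis is a running mean of volume values and that the analysis target range is a closed interval; the class name, the choice of statistic, and the sample values are hypothetical.

```python
class GatedAnalyzer:
    """Update a running mean only while measured values lie in the target range."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.count = 0
        self.total = 0.0

    def update(self, measured_value):
        """Process one measured value and return the current mean (or None)."""
        if self.low <= measured_value <= self.high:
            # First state: the value is an analysis target, so calculate.
            self.count += 1
            self.total += measured_value
        # Second state: the value is outside the range, so nothing is calculated.
        return self.mean()

    def mean(self):
        return self.total / self.count if self.count else None


# Volume readings in dB (made-up values); 40 and 95 fall outside the 50-90 range.
analyzer = GatedAnalyzer(low=50.0, high=90.0)
for level in [40.0, 55.0, 60.0, 95.0, 65.0]:
    result = analyzer.update(level)
```

Out-of-range readings leave the state untouched, so a burst of very quiet or very loud samples does not skew the statistic.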
Sound processing apparatus and sound processing method

Patent number: 11089404

Abstract: A sound processing apparatus includes n number of microphones that are disposed correspondingly to n number of persons and that mainly collect sound signals uttered by respective relevant persons, a filter that suppresses crosstalk components included in a talker sound signal collected by a microphone corresponding to at least one talker using the sound signals collected by the n number of microphones, a parameter updater that updates a parameter of the filter for suppressing the crosstalk components and stores an update result in the memory in a case where a predetermined condition including time at which at least one talker talks is satisfied, and a sound output controller that outputs the sound signals, acquired by subtracting the crosstalk components by the filter from the talker sound signals based on the update result, from a speaker.

Type: Grant

Filed: January 24, 2020

Date of Patent: August 10, 2021

Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.

Inventors: Masanari Miyamoto, Hiromasa Ohashi, Naoya Tanaka
Acoustic event detection in polyphonic acoustic data

Patent number: 11074927

Abstract: A computer implemented method, computer system and computer program product are provided for acoustic event detection in polyphonic acoustic data. According to the method, polyphonic acoustic data is inputted by one or more processing units into a trained neural network trained by labeled monophonic acoustic data, a first output from a hidden layer of the trained neural network is obtained by one or more processing units, and at least one acoustic classification of the polyphonic acoustic data is determined by one or more processing units based on the first output and a feature dictionary learnt from the trained neural network.

Type: Grant

Filed: October 31, 2017

Date of Patent: July 27, 2021

Assignee: International Business Machines Corporation

Inventors: Xiao Xing Liang, Ning Zhang, Yu Ling Zheng, Yu Chen Zhou
Media presence detection

Patent number: 11069352

Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.

Type: Grant

Filed: February 18, 2019

Date of Patent: July 20, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Qingming Tang, Ming Sun, Chieh-Chi Kao, Chao Wang, Viktor Rozgic
Multilingual wakeword detection

Patent number: 11069353

Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.

Type: Grant

Filed: May 6, 2019

Date of Patent: July 20, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
Method and apparatus for dialoguing based on a mood of a user

Patent number: 11062708

Abstract: A method and an apparatus for dialoguing based on a mood of a user, where the method includes: collecting first audio data from the user, determining the mood of the user according to a feature of the first audio data, and dialoguing with the user using second audio data corresponding to the mood of the user. The method and the apparatus for dialoguing based on the mood of the user provided by the present disclosure may make different responses according to the mood of the user when dialoguing with the user. Therefore, it further enriches the responses that the electronic device may make according to voice data of the user, and further improves the user experience during dialoguing with the electronic device.

Type: Grant

Filed: July 12, 2019

Date of Patent: July 13, 2021

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Li Xu, Yingchao Li, Xiaoxin Ma
Signal processing apparatus and signal processing method

Patent number: 11036305

Abstract: There is provided a signal processing apparatus that includes a control unit that executes, on a basis of a waveform signal generated in accordance with a motion of an attachment portion of a sensor attached to a tool or a body, effect processing for the waveform signal or another waveform signal, the waveform signal being output from the sensor.

Type: Grant

Filed: June 4, 2020

Date of Patent: June 15, 2021

Assignee: SONY CORPORATION

Inventors: Heesoon Kim, Masaharu Yoshino, Tatsushi Nashida, Masahiko Inami, Kouta Minamizawa, Yuta Sugiura, Yusuke Mizushina
Terminal device and method for controlling thereof

Patent number: 11031008

Abstract: A terminal device is provided. The terminal device includes a communication interface, and a processor configured to receive performance information of one or more other terminal devices from each of the one or more other terminal devices, identify an edge device to perform voice recognition based on the performance information received from each of the one or more other terminal devices, based on the terminal device being identified as the edge device, receive information associated with reception quality from one or more other terminal devices which receive a sound wave including a triggering word, determine a terminal device to acquire the sound wave for voice recognition from based on the received information associated with the reception quality, and transmit, to the determined terminal device, a command to transmit the sound wave acquired for voice recognition to an external voice recognition device.

Type: Grant

Filed: April 10, 2019

Date of Patent: June 8, 2021

Assignee: Samsung Electronics Co., Ltd.

Inventor: Minseok Kim
Voice detection optimization using sound metadata

Patent number: 11024331

Abstract: Systems and methods for optimizing voice detection via a network microphone device are disclosed herein. In one example, individual microphones of a network microphone device detect sound. The sound data is captured in a first buffer and analyzed to detect a trigger event. Metadata associated with the sound data is captured in a second buffer and provided to at least one network device to determine at least one characteristic of the detected sound based on the metadata. The network device provides a response that includes an instruction, based on the determined characteristic, to modify at least one performance parameter of the NMD. The NMD then modifies the at least one performance parameter based on the instruction.

Type: Grant

Filed: September 21, 2018

Date of Patent: June 1, 2021

Assignee: Sonos, Inc.

Inventors: Connor Kristopher Smith, Kurt Thomas Soto, Charles Conor Sleith
Systems, devices, and methods for segmenting a musical composition into musical segments

Patent number: 11024274

Abstract: Systems, devices, and methods for segmenting musical compositions are described. Discrete, musically-coherent segments (such as intro, verse, chorus, bridge, solo, and the like) of a musical composition are identified. Distance measures are used to evaluate whether each bar of a musical composition is more like the bars that directly precede it or more like the bars that directly succeed it, and each respective series of musically similar bars is assigned to the same respective segment. Large changes in the distance measure(s) between adjacent bars may be used to identify boundaries between abutting musical segments. Computer systems and computer program products for implementing segmentation are also described. The results of segmentation may advantageously be applied in computer-based composition of music and musical variations, as well as in other applications involving labelling, characterizing, or otherwise processing music.

Type: Grant

Filed: January 28, 2020

Date of Patent: June 1, 2021

Assignee: Obeebo Labs Ltd.

Inventor: Colin P. Williams
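
The boundary rule in the abstract of patent 11024274 (a large change in the distance measure between adjacent bars marks a segment boundary) can be sketched roughly as below. The per-bar feature vectors, the Euclidean distance, and the fixed threshold are all assumptions for illustration; the abstract does not say which distance measures or features are used.

```python
import math


def segment_bars(bars, threshold):
    """Group consecutive bar indices into segments.

    `bars` is a list of equal-length numeric feature vectors, one per bar
    (a hypothetical representation). A new segment starts whenever the
    Euclidean distance between adjacent bars exceeds `threshold`.
    """
    if not bars:
        return []
    segments = [[0]]
    for i in range(1, len(bars)):
        if math.dist(bars[i - 1], bars[i]) > threshold:
            segments.append([i])      # large jump: boundary between segments
        else:
            segments[-1].append(i)    # similar to previous bar: same segment
    return segments


# Toy features: three similar bars, two similar bars, then one outlier bar.
bars = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (0.0, 0.0)]
print(segment_bars(bars, threshold=1.0))  # [[0, 1, 2], [3, 4], [5]]
```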
Methods and systems for cockpit speech recognition acoustic model training with multi-level corpus data augmentation

Patent number: 10997967

Abstract: A method for initializing a device for performing acoustic speech recognition (ASR) using an ASR model, by a computer system including at least one processor and a system memory element. The method includes obtaining a plurality of voice data articulations of predetermined phrases, by the at least one processor via a user interface. The plurality of voice data articulations includes a first quantity of audio samples of actual articulated voice data, and each of the plurality of voice data articulations includes one of the audio samples including acoustic frequency components. The method further includes performing a plurality of augmentations to the plurality of voice data articulations of predetermined phrases, to generate a corpus audio data set that includes the first quantity of audio samples and a second quantity of audio samples including augmented versions of the first quantity of audio samples.

Type: Grant

Filed: April 18, 2019

Date of Patent: May 4, 2021

Assignee: HONEYWELL INTERNATIONAL INC.

Inventors: Luning Wang, Wei Yang, Zhiyong Dai
Voice detection using ear-based devices

Patent number: 10972834

Abstract: This disclosure describes techniques for detecting voice commands from a user of an ear-based device. The ear-based device may include an in-ear facing microphone to capture sound emitted in an ear of the user, and an exterior facing microphone to capture sound emitted in an exterior environment of the user. The in-ear microphone may generate an inner audio signal representing the sound emitted in the ear, and the exterior microphone may generate an outer audio signal representing sound from the exterior environment. The ear-based device may compute a ratio of a power of the inner audio signal to the outer audio signal and may compare this ratio to a threshold. If the ratio is larger than the threshold, the ear-based device may detect the voice of the user. Further, the ear-based device may set a value of the threshold based on a level of acoustic seal of the ear-based device.

Type: Grant

Filed: February 11, 2020

Date of Patent: April 6, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Kuan-Chieh Yen, Daniel Wayne Harris, Carlo Murgia, Taro Kimura
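
The ratio test in the abstract of patent 10972834 can be sketched as follows. This is a minimal illustration, not the patented implementation: the mean-square power estimate, the linear `seal_threshold` mapping, and the direction of that mapping (a better seal giving a higher threshold) are assumptions, since the abstract only says the threshold is set based on the level of acoustic seal.

```python
def mean_power(samples):
    """Mean-square power of a block of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def seal_threshold(seal_level):
    """Hypothetical mapping from seal quality in [0, 1] to a ratio threshold."""
    return 1.0 + 3.0 * seal_level


def user_is_speaking(inner, outer, seal_level, eps=1e-12):
    """Return True when inner/outer power ratio exceeds the seal-based threshold.

    `inner` comes from the in-ear microphone, `outer` from the exterior
    microphone. `eps` avoids division by zero for a silent outer signal.
    """
    ratio = mean_power(inner) / (mean_power(outer) + eps)
    return ratio > seal_threshold(seal_level)


inner = [0.5, -0.5, 0.5, -0.5]   # strong signal inside the ear canal
outer = [0.1, -0.1, 0.1, -0.1]   # weak signal outside
print(user_is_speaking(inner, outer, seal_level=0.5))  # True
print(user_is_speaking(outer, inner, seal_level=0.5))  # False
```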
Electronic device providing iris recognition based on proximity and operating method thereof

Patent number: 10956734

Abstract: An electronic device and a method of operating the electronic device are provided. The electronic device includes a proximity detector; an iris recognition module; a memory; and a processor electrically connected to the proximity detector, the iris recognition module, and the memory, wherein the processor is configured to execute an iris recognition operation based on the iris recognition module; determine proximity of an object based on the proximity detector while the iris recognition operation is performed; and, if the proximity of the object is included within a set reference range, stop the iris recognition operation.

Type: Grant

Filed: July 10, 2017

Date of Patent: March 23, 2021

Inventors: Hyung-Woo Shin, Hyemi Lee, Hyung Min Lee
Portable acoustical unit

Patent number: 10958468

Abstract: A portable acoustic unit is adapted for insertion into an electrical receptacle. The portable acoustic unit has an integrated microphone and a wireless network interface to an automation controller. The portable acoustic unit detects spoken voice commands from users in the vicinity of the electrical receptacle. The portable acoustic unit merely plugs into a conventional electrical outlet to provide an extremely simple means of voice control through a home or business.

Type: Grant

Filed: August 29, 2018

Date of Patent: March 23, 2021

Assignee: AT&T INTELLECTUAL PROPERTY I, L. P.

Inventors: Nafiz Haider, Ross Newman, Kristin Patterson, Thomas Risley, Curtis Stephenson, David Vaught
Method for reduced computation of t-matrix training for speaker recognition

Patent number: 10950243

Abstract: A system and method for improving T-matrix training for speaker recognition are provided. The method includes receiving an audio input, divisible into a plurality of audio frames, wherein at least a first audio frame includes an audio sample of a human speaker, the sample having a length above a first threshold; generating for each audio frame a feature vector; generating for a first plurality of feature vectors centered statistics of at least a zero order and a first order; generating a first i-vector, the first i-vector representing the human speaker; generating an optimized T-matrix training sequence computation, based on the first i-vector, an initialized T-matrix, the centered statistics, and a Gaussian mixture model (GMM) of a trained universal background model (UBM).

Type: Grant

Filed: March 1, 2019

Date of Patent: March 16, 2021

Assignee: ILLUMA Labs Inc.

Inventor: Milind Borkar
Binaural hearing device system with binaural active occlusion cancellation

Patent number: 10951996

Abstract: A binaural hearing system includes a first hearing device and a second hearing device, each of which comprising: an input transducer; a transducer audio signal processor configured to provide a processed input transducer audio signal; an ear canal microphone; an ear canal audio signal processor configured to provide a processed ear canal audio signal; a first signal combiner configured to combine the processed input transducer audio signal with the processed ear canal audio signal to obtain an output transducer audio signal; a signal level detector configured to determine a signal level of (1) the output transducer audio signal or (2) an audio signal included in formation of the output transducer audio signal; and an output transducer; wherein the binaural hearing system further comprises a binaural excessive level detector connected to the first hearing device's signal level detector and the second hearing device's signal level detector.

Type: Grant

Filed: June 17, 2019

Date of Patent: March 16, 2021

Assignee: GN Hearing A/S

Inventors: Søren Christian Voigt Pedersen, Jonathan Boley, James Robert Anderson
Method and apparatus for determining periods of excessive noise for receiving smart speaker voice commands

Patent number: 10943598

Abstract: Methods and systems for determining periods of excessive noise for smart speaker voice commands. An electronic timeline of volume levels of currently playing content is made available to a smart speaker. From this timeline, periods of high content volume are determined, and the smart speaker alerts users during periods of high volume, requesting that they wait until the high-volume period has passed before issuing voice commands. In this manner, the smart speaker helps prevent voice commands that may not be detected, or may be detected inaccurately, due to the noise of the content currently being played.

Type: Grant

Filed: March 18, 2019

Date of Patent: March 9, 2021

Assignee: ROVI GUIDES, INC.

Inventors: Gyanveer Singh, Sukanya Agarwal, Vikram Makam Gupta
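
The timeline scan in the abstract of patent 10943598 can be illustrated with a short sketch. The tuple layout `(start_s, end_s, volume)`, the volume limit, and both function names are hypothetical; the abstract only states that periods of high content volume are determined from a timeline of volume levels.

```python
def loud_periods(timeline, limit):
    """Return merged (start, end) intervals whose volume exceeds `limit`.

    `timeline` is a list of (start_s, end_s, volume) tuples sorted by time.
    Back-to-back loud entries are merged into one period.
    """
    periods = []
    for start, end, volume in timeline:
        if volume <= limit:
            continue
        if periods and periods[-1][1] == start:
            periods[-1] = (periods[-1][0], end)   # extend the previous period
        else:
            periods.append((start, end))
    return periods


def should_wait(periods, now):
    """True if `now` (seconds) falls inside any high-volume period."""
    return any(start <= now < end for start, end in periods)


timeline = [(0, 10, 40), (10, 20, 85), (20, 30, 90), (30, 40, 50), (40, 50, 88)]
periods = loud_periods(timeline, limit=80)
print(periods)                   # [(10, 30), (40, 50)]
print(should_wait(periods, 25))  # True
print(should_wait(periods, 35))  # False
```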
Voice activity detection method and apparatus

Patent number: 10937448

Abstract: A voice activity detection method and an apparatus are provided by embodiments of the present application. The method includes: performing framing processing on a voice to be detected to obtain a plurality of audio frames to be detected; obtaining an acoustic feature of each of the audio frames to be detected, and sequentially inputting the acoustic feature of each of the audio frames to be detected to a VAD model, wherein the VAD model is configured to classify the first N voice frames in the voice to be detected as noise frames, classify frames from an (N+1)-th voice frame to a last voice frame as voice frames, and classify M noise frames after the last voice frame as voice frames, where N and M are integers; and determining, according to a classification result output by the VAD model.

Type: Grant

Filed: December 27, 2018

Date of Patent: March 2, 2021

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Chao Li, Weixin Zhu
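
The classification scheme in the abstract of patent 10937448 amounts to shifting frame labels: the first N voice frames count as noise, and M noise frames after the last voice frame count as voice. A minimal sketch of that relabeling is given below. It assumes 0/1 frame labels and a single voice segment per utterance, neither of which is stated in the abstract, and it shows only the label transformation, not the VAD model itself.

```python
def shift_labels(labels, n, m):
    """Relabel frames per the scheme above.

    `labels` is a list of 0 (noise) / 1 (voice) frame labels.
    The first `n` voice frames become noise, and up to `m` frames
    following the last voice frame become voice.
    """
    out = list(labels)
    voiced = [i for i, v in enumerate(labels) if v == 1]
    if not voiced:
        return out
    for i in voiced[:n]:
        out[i] = 0                      # delay the detected start by n frames
    last = voiced[-1]
    for i in range(last + 1, min(last + 1 + m, len(out))):
        out[i] = 1                      # extend the detected end by m frames
    return out


frames = [0, 0, 1, 1, 1, 1, 0, 0, 0]
print(shift_labels(frames, n=2, m=1))  # [0, 0, 0, 0, 1, 1, 1, 0, 0]
```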
Multi-channel microphone signal gain equalization based on evaluation of cross talk components

Patent number: 10917717

Abstract: Gain mismatch and related problems can be solved by a system and method that applies an automatic microphone signal gain equalization without any direct absolute reference or calibration phase. The system and method performs the steps of receiving, by a computing device, a speech signal from a speaking person via a plurality of microphones, determining a speech signal component in the time-frequency domain for each microphone of the plurality of microphones, calculating an instantaneous cross-talk coupling matrix based on the speech signal components across the microphones, estimating gain factors based on calculated cross-talk couplings and a given expected cross-talk attenuation, limiting the gain factors to appropriate maximum and minimum values, and applying the gain factors to the speech signal used in the control path to control further speech enhancement algorithms or used in the signal path for direct influence on the speech enhanced audio output signal.

Type: Grant

Filed: May 30, 2019

Date of Patent: February 9, 2021

Assignee: Nuance Communications, Inc.

Inventors: Timo Matheja, Markus Buck
Separating two additive signal sources

Patent number: 10903863

Abstract: A first set of signal data is received. Generative machine learning models are trained based on the first set of signal data. The generative machine learning models include at least a first model trained to identify a first signal component and a second model trained to identify a second signal component. An incoming mixed signal data stream is dynamically separated into a clean signal component and a noise signal component by running the generative machine learning models.

Type: Grant

Filed: December 11, 2019

Date of Patent: January 26, 2021

Assignee: International Business Machines Corporation

Inventors: Francois Pierre Luus, Etienne Eben Vos, Komminist Weldemariam
Non-semantic audio stenography

Patent number: 10885902

Abstract: Techniques are described for using stenography to protect sensitive information within conversational audio data by generating a pseudo-language representation of conversational audio data. In some implementations, audio data corresponding to an utterance is received. The audio data is classified as likely sensitive audio data. A particular set of sentiments associated with the audio data is determined. Data indicating the particular set of sentiments associated with the audio data is provided to a model. The model is trained to output, for each of different sets of sentiments, desensitized, pseudo-language audio data that exhibits the set of sentiments, and is not classified as likely sensitive audio data. A particular desensitized, pseudo-language audio data is received from the model. The audio data is replaced with the particular desensitized, pseudo-language audio data and stored within an audio data repository.

Type: Grant

Filed: November 21, 2018

Date of Patent: January 5, 2021

Assignee: X Development LLC

Inventors: Antonio Raymond Papania-Davis, Bin Ni, Shelby Lin
