Speech Recognition Techniques For Robustness In Adverse Environments, E.g., In Noise, Of Stress Induced Speech, Etc. (epo) Patents (Class 704/E15.039)
  • Patent number: 11363367
    Abstract: A dual-microphone arrangement (300) provides improved voice performance in a wireless headset (12). A vibration sensor (1130) is used for voice pickup and will add low-frequency voice audio content in windy conditions. An equalizer (810) is used to restore low-frequency voice audio content in wind-free conditions. Depending on the measured wind power, the output will derive more signal from the equalizer (810) or more signal from the vibration sensor (1130).
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: June 14, 2022
    Assignee: Dopple IP B.V.
    Inventors: Jacobus Cornelis Haartsen, Aalbert Stek
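The wind-dependent blending described in this abstract can be illustrated with a minimal sketch. It assumes a linear crossfade between the two signal paths; the function name `mix_by_wind` and the thresholds `low` and `high` are hypothetical and do not come from the patent.

```python
def mix_by_wind(equalized, vibration, wind_power, low=0.1, high=1.0):
    """Crossfade two equal-length sample lists according to wind power.

    `low` and `high` are hypothetical thresholds: at or below `low` the
    output is entirely the equalizer signal, at or above `high` it is
    entirely the vibration-sensor signal, and it is linear in between.
    """
    if len(equalized) != len(vibration):
        raise ValueError("signals must have the same length")
    # Map the measured wind power to a blend weight clamped to [0, 1].
    w = (wind_power - low) / (high - low)
    w = min(1.0, max(0.0, w))
    # More wind means more weight on the vibration sensor.
    return [(1.0 - w) * e + w * v for e, v in zip(equalized, vibration)]
```

With no wind the output follows the equalizer path, and in strong wind it follows the vibration sensor.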
  • Publication number: 20150149167
    Abstract: Aspects of this disclosure are directed to accurately transforming speech data into one or more word strings that represent the speech data. A speech recognition device may receive the speech data from a user device and an indication of the user device. The speech recognition device may execute a speech recognition algorithm using one or more user and acoustic condition specific transforms that are specific to the user device and an acoustic condition of the speech data. The execution of the speech recognition algorithm may transform the speech data into one or more word strings that represent the speech data. The speech recognition device may estimate which one of the one or more word strings more accurately represents the received speech data.
    Type: Application
    Filed: September 30, 2011
    Publication date: May 28, 2015
    Applicant: GOOGLE INC.
    Inventors: Françoise Beaufays, Johan Schalkwyk, Vincent Olivier Vanhoucke, Petar Stanisa Aleksic
  • Patent number: 8953812
    Abstract: Improvements in voice signals transmitted within communication systems are obtained by use of adaptive filters, front and rear microphones, noise cancelling systems and other means and methods. Disclosed embodiments include the use of directional microphones, primary inputs, secondary inputs, adaptive weight generators, canceller outputs to improve signal to noise ratios and other communication attributes.
    Type: Grant
    Filed: July 20, 2013
    Date of Patent: February 10, 2015
    Inventor: Alon Konchitsky
  • Publication number: 20140074464
    Abstract: Some embodiments of the inventive subject matter may include a method for detecting speech loss and supplying appropriate recollection data to the user. The method can include detecting a speech stream from a user. The method can include converting the speech stream to text. The method can include storing the text. The method can include detecting an interruption to the speech stream, wherein the interruption to the speech stream indicates speech loss by the user. The method can include searching a catalog using the text as a search parameter to find relevant catalog data. The method can include presenting the relevant catalog data to remind the user about the speech stream.
    Type: Application
    Filed: September 12, 2012
    Publication date: March 13, 2014
    Applicant: International Business Machines Corporation
    Inventor: Scott H. Berens
  • Publication number: 20140067387
    Abstract: Scalar operations for model adaptation or feature enhancement may be utilized for recognizing an utterance during automatic speech recognition in a noisy environment. An utterance including distorted speech generated from a transmission source for delivery to a receiver, may be received by a computer. The distorted speech may be caused by the noisy environment and channel distortion. Computations using scalar operations in the form of an algorithm may then be performed for recognizing the utterance. As a result of performing all of the computations with scalar operations, computational complexity is very small in comparison to matrix and vector operations. Vector Taylor Series with diagonal Jacobian approximation may also be utilized as a distortion-model-based noise robust algorithm with scalar operations.
    Type: Application
    Filed: September 5, 2012
    Publication date: March 6, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Jinyu Li, Michael Lewis Seltzer, Yifan Gong
  • Publication number: 20140012573
    Abstract: A signal processing apparatus includes a speech recognition system and a voice activity detection unit. The voice activity detection unit is coupled to the speech recognition system, and arranged for detecting whether an audio signal is a voice signal and accordingly generating a voice activity detection result to the speech recognition system to control whether the speech recognition system should perform speech recognition upon the audio signal.
    Type: Application
    Filed: September 13, 2012
    Publication date: January 9, 2014
    Inventors: Chia-Yu Hung, Tsung-Li Yeh, Yi-Chang Tu
  • Publication number: 20130311176
    Abstract: A wireless headset capable of receiving audio signals transmitted wirelessly and compatible for use in an MRI scanner is disclosed. The headset includes a first wireless module connected to the first earphone and a second wireless module connected to the second earphone. Each wireless module is electrically connected to a speaker in the respective earphone. The first wireless module receives the audio signal from a remote source and coordinates transmission of the audio signal to each of the speakers. The compact nature of each earphone minimizes the length of wire runs. In addition, the headset is made of materials having low magnetic susceptibility such that they will not be affected by the magnetic field from the MRI scanner.
    Type: Application
    Filed: June 8, 2012
    Publication date: November 21, 2013
    Inventors: Brian Brown, Manuel J. Ferrer Herrera, Richard J. Smaglick
  • Publication number: 20130304463
    Abstract: An embodiment of the invention provides a noise cancellation method for an electronic device. The method comprises: receiving an audio signal; applying a Fast Fourier Transform operation on the audio signal to generate a sound spectrum; acquiring a first spectrum corresponding to a noise and a second spectrum corresponding to a human voice signal from the sound spectrum; estimating a center frequency according to the first spectrum and the second spectrum; and applying a high pass filtering operation to the sound spectrum according to the center frequency.
    Type: Application
    Filed: May 14, 2012
    Publication date: November 14, 2013
    Inventors: Lei Chen, Yu-Chieh Lai, Chun-Ren Hu, Hann-Shi Tong
  • Publication number: 20130297305
    Abstract: A non-spatial speech detection system includes a plurality of microphones whose output is supplied to a fixed beamformer. An adaptive beamformer is used for receiving the output of the plurality of microphones and one or more processors are used for processing an output from the fixed beamformer and identifying speech from noise through the use of an algorithm utilizing a covariance matrix.
    Type: Application
    Filed: May 2, 2012
    Publication date: November 7, 2013
    Applicant: GENTEX CORPORATION
    Inventors: Robert R. Turnbull, Michael A. Bryson
  • Publication number: 20130297306
    Abstract: An adaptive equalization system that adjusts the spectral shape of a speech signal based on an intelligibility measurement of the speech signal may improve the intelligibility of the output speech signal. Such an adaptive equalization system may include a speech intelligibility measurement module, a spectral shape adjustment module, and an adaptive equalization module. The speech intelligibility measurement module is configured to calculate a speech intelligibility measurement of a speech signal. The spectral shape adjustment module is configured to generate a weighted long-term speech curve based on a first predetermined long-term average speech curve, a second predetermined long-term average speech curve, and the speech intelligibility measurement. The adaptive equalization module is configured to adapt equalization coefficients for the speech signal based on the weighted long-term speech curve.
    Type: Application
    Filed: May 4, 2012
    Publication date: November 7, 2013
    Applicant: QNX Software Systems Limited
    Inventors: Phillip Alan Hetherington, Xueman Li
  • Publication number: 20130246062
    Abstract: Method and system for tracking fundamental frequencies of pseudo-periodic signals in the presence of noise that include receiving a time-frequency representation of signals measured in a predefined environment; estimating and tracking a fundamental frequency of a respective pseudo-periodic signal at each time frame of the time-frequency representation by tracking detections of harmonious frequencies in the time-frequency representation over time; and outputting each respective estimated fundamental frequency associated with the pseudo-periodic signal of each respective time frame.
    Type: Application
    Filed: March 19, 2012
    Publication date: September 19, 2013
    Applicant: VOCALZOOM SYSTEMS LTD.
    Inventors: Yekutiel Avargel, Tal Bakish
  • Publication number: 20130226581
    Abstract: A communication method includes: capturing analog sound signals output by the audio output unit and analyzing the captured analog sound signals to obtain corresponding digital audio information; comparing the obtained digital audio information with digital feature information stored in a storage unit to determine whether the obtained digital audio information includes the stored digital feature information; and playing reply information stored in the storage unit if the obtained digital audio information includes the stored digital feature information.
    Type: Application
    Filed: September 26, 2012
    Publication date: August 29, 2013
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., HONG FU JIN PRECISION INDUSTRY (Shenzhen) CO., LTD.
    Inventors: HONG FU JIN PRECISION INDUSTRY (Shenzhen) CO., LTD., HON HAI PRECISION INDUSTRY CO., LTD.
  • Publication number: 20130211832
    Abstract: A method of speech recognition in a vehicle. Audio including noise and a speech signal representative of an utterance from a user is received via a microphone, and a signal-to-noise ratio (SNR) for the received audio is calculated using a processor. It is determined whether the calculated SNR is greater than a predetermined SNR. If so, then a noise distribution is identified for addition to the received audio, and noise corresponding to the identified noise distribution is injected into the received audio to produce noise-injected audio including the speech signal.
    Type: Application
    Filed: February 9, 2012
    Publication date: August 15, 2013
    Applicant: GENERAL MOTORS LLC
    Inventors: Gaurav Talwar, Robert D. Sims
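The decision logic in this abstract (inject noise only when the measured SNR exceeds a predetermined value) can be sketched as follows. This is an illustration under stated assumptions: the abstract does not say which noise distribution is identified, so zero-mean Gaussian noise is used here, and `threshold_db`, `noise_std`, and `seed` are hypothetical parameters.

```python
import math
import random


def snr_db(speech_power, noise_power):
    """Signal-to-noise ratio in decibels from two positive power values."""
    return 10.0 * math.log10(speech_power / noise_power)


def inject_noise_if_clean(audio, measured_snr_db, threshold_db=20.0,
                          noise_std=0.01, seed=0):
    """Return the audio with added noise only when the SNR is above threshold.

    Gaussian noise is an assumption made for this sketch; the abstract only
    states that a noise distribution is identified and then injected.
    """
    if measured_snr_db <= threshold_db:
        # SNR is not greater than the predetermined value: leave audio as is.
        return list(audio)
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_std) for x in audio]
```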
  • Publication number: 20130191117
    Abstract: In speech processing systems, compensation is made for sudden changes in the background noise in the average signal-to-noise ratio (SNR) calculation. SNR outlier filtering may be used, alone or in conjunction with weighting the average SNR. Adaptive weights may be applied on the SNRs per band before computing the average SNR. The weighting function can be a function of noise level, noise type, and/or instantaneous SNR value. Another weighting mechanism applies a null filtering or outlier filtering which sets the weight in a particular band to be zero. This particular band may be characterized as the one that exhibits an SNR that is several times higher than the SNRs in other bands.
    Type: Application
    Filed: November 6, 2012
    Publication date: July 25, 2013
    Applicant: Qualcomm Incorporated
    Inventor: Qualcomm Incorporated
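The null (outlier) filtering described in this abstract can be sketched as a weighted average in which a band receives zero weight when its SNR is several times higher than that of the other bands. The comparison against the mean of the remaining bands and the `outlier_factor` value are assumptions made for illustration; SNR values are taken to be positive and on a linear scale.

```python
def average_snr(band_snrs, outlier_factor=3.0):
    """Average per-band SNRs, zero-weighting bands that look like outliers.

    A band is treated as an outlier when its SNR exceeds `outlier_factor`
    times the mean SNR of all other bands (a hypothetical criterion).
    """
    weights = []
    for i, snr in enumerate(band_snrs):
        others = band_snrs[:i] + band_snrs[i + 1:]
        mean_others = sum(others) / len(others) if others else snr
        # Null filtering: weight 0 for an outlier band, 1 otherwise.
        weights.append(0.0 if snr > outlier_factor * mean_others else 1.0)
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * s for w, s in zip(weights, band_snrs)) / total
```

A sudden burst confined to one band then no longer dominates the average.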
  • Patent number: 8494174
    Abstract: A clear, high quality voice signal with a high signal-to-noise ratio is achieved by use of an adaptive noise reduction scheme with two microphones in close proximity. The method includes the use of two omnidirectional microphones in a highly directional mode, and then applying an adaptive noise cancellation algorithm to reduce the noise.
    Type: Grant
    Filed: June 14, 2010
    Date of Patent: July 23, 2013
    Inventor: Alon Konchitsky
  • Publication number: 20130185066
    Abstract: Sound related vehicle information representing one or more sounds may be received in a processor associated with a vehicle. The sound related vehicle information may or may not include an audio signal. An audio signal output to a passenger may be modified based on the sound related vehicle information.
    Type: Application
    Filed: January 17, 2012
    Publication date: July 18, 2013
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Eli TZIRKEL-HANCOCK, Omer Tsimhoni
  • Publication number: 20130185065
    Abstract: An audio signal may be received, in a processor associated with a vehicle. Sound related vehicle information representing one or more sounds may be received by the processor. The sound related vehicle information may or may not include an audio signal. A speech recognition process or system may be modified based on the sound related vehicle information.
    Type: Application
    Filed: January 17, 2012
    Publication date: July 18, 2013
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Eli TZIRKEL-HANCOCK, Omer Tsimhoni
  • Publication number: 20130179163
    Abstract: An In-Car Communication (ICC) system supports the communication paths within a car by receiving the speech signals of a speaking passenger and playing them back for one or more listening passengers. Signal processing tasks are split into a microphone related part and into a loudspeaker related part. A sound processing system suitable for use in a vehicle having multiple acoustic zones includes a plurality of microphone In-Car Communication (Mic-ICC) instances coupled and a plurality of loudspeaker In-Car Communication (Ls-ICC) instances. The system further includes a dynamic audio routing matrix with a controller and coupled to the Mic-ICC instances, a mixer coupled to the plurality of Mic-ICC instances and a distributor coupled to the Ls-ICC instances.
    Type: Application
    Filed: January 10, 2012
    Publication date: July 11, 2013
    Inventors: Tobias Herbig, Markus Buck, Meik Pfeffinger
  • Publication number: 20130144618
    Abstract: A disclosed embodiment provides a speech recognition method to be performed by an electronic device. The method includes: collecting user-specific information that is specific to a user through the user's usage of the electronic device; recording an utterance made by the user; letting a remote server generate a remote speech recognition result for the recorded utterance; generating rescoring information for the recorded utterance based on the collected user-specific information; and letting the remote speech recognition result be rescored based on the rescoring information.
    Type: Application
    Filed: March 12, 2012
    Publication date: June 6, 2013
    Inventors: Liang-Che Sun, Yiou-Wen Cheng, Chao-Ling Hsu, Jyh-Horng Lin
  • Publication number: 20130138437
    Abstract: A speech recognition apparatus includes a reliability estimating unit configured to estimate reliability of a time-frequency segment from an input voice signal; and a reliability reflecting unit configured to reflect the reliability of the time-frequency segment to a normalized cepstrum feature vector extracted from the input speech signal and a cepstrum average vector included for each state of an HMM in decoding. Further, the speech recognition apparatus includes a cepstrum transforming unit configured to transform the cepstrum feature vector and the average vector through a discrete cosine transformation matrix and calculate a transformed cepstrum vector. Furthermore, the speech recognition apparatus includes an output probability calculating unit configured to calculate an output probability value of time-frequency segments of the input speech signal by applying the transformed cepstrum vector to the cepstrum feature vector and the average vector.
    Type: Application
    Filed: July 25, 2012
    Publication date: May 30, 2013
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Hoon-Young Cho, Youngik Kim, Sanghun Kim
  • Publication number: 20130132077
    Abstract: Systems and methods for semi-supervised source separation using non-negative techniques are described. In some embodiments, various techniques disclosed herein may enable the separation of signals present within a mixture, where one or more of the signals may be emitted by one or more different sources. In audio-related applications, for instance, a signal mixture may include speech (e.g., from a human speaker) and noise (e.g., background noise). In some cases, speech may be separated from noise using a speech model developed from training data. A noise model may be created, for example, during the separation process (e.g., “on-the-fly”) and in the absence of corresponding training data.
    Type: Application
    Filed: May 27, 2011
    Publication date: May 23, 2013
    Inventors: Gautham J. Mysore, Paris Smaragdis
  • Publication number: 20130103397
    Abstract: Exemplary embodiments provide systems, devices and methods that allow creation and management of lists of items in an integrated manner on an interactive graphical user interface. A user may speak a plurality of list items in a natural unbroken manner to provide an audio input stream into an audio input device. Exemplary embodiments may automatically process the audio input stream to convert the stream into a text output, and may process the text output into one or more n-grams that may be used as list items to populate a list on a user interface.
    Type: Application
    Filed: October 21, 2011
    Publication date: April 25, 2013
    Applicant: WAL-MART STORES, INC.
    Inventors: Dion Almaer, Bernard Paul Cousineau, Ben Galbraith
  • Publication number: 20130096915
    Abstract: A speech processing method and arrangement are described. A dynamic noise adaptation (DNA) model characterizes a speech input reflecting effects of background noise. A null noise DNA model characterizes the speech input based on reflecting a null noise mismatch condition. A DNA interaction model performs Bayesian model selection and re-weighting of the DNA model and the null noise DNA model to realize a modified DNA model characterizing the speech input for automatic speech recognition and compensating for noise to a varying degree depending on relative probabilities of the DNA model and the null noise DNA model.
    Type: Application
    Filed: October 17, 2011
    Publication date: April 18, 2013
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Steven J. Rennie, Pierre Dognin, Petr Fousek
  • Publication number: 20130085753
    Abstract: A computing device is able to use an embedded speech recognizer and a network speech recognizer for speech recognition. In response to detecting speech in the captured audio, the computing device may forward the captured audio to its embedded speech recognizer and to a speech client for the network speech recognizer. The embedded speech recognizer provides an embedded-recognizer result for the captured audio. If a network-recognition criterion is met, the speech client forwards the captured audio to the network speech recognizer and receives a network-recognizer result for the captured audio from the network speech recognizer. A speech recognition result for the captured audio is forwarded to at least one application, wherein the speech recognition result is based on at least one of the embedded-recognizer result and the network-recognizer result.
    Type: Application
    Filed: August 15, 2012
    Publication date: April 4, 2013
    Applicant: GOOGLE INC.
    Inventors: Bjorn Erik Bringert, Johan Schalkwyk, Michael J. LeBeau, Richard Zarek Cohen, Luca Zanolin, Simon Tickner
  • Publication number: 20130060567
    Abstract: VoIP phones according to the present invention include a microphone, which may be internal or external, and allow the user to communicate unobtrusively, check voice mail and conduct other activities in an environment which can be noisy in general and extremely noisy sometimes. Speech recognition functionality may also be used to generate and send touch tone or DTMF tones such as in response to call trees or voice recognition functionality used by airlines, credit card companies, voice mail systems, and other applications. A system and method of audio processing which provides enhanced speech recognition is provided. Audio input is received at the microphone which is processed by adaptive noise cancellation to generate an enhanced audio signal. The operation of the speech recognition engine and the adaptive noise canceller may be advantageously controlled based on Voice Activity Detection (VAD).
    Type: Application
    Filed: October 31, 2012
    Publication date: March 7, 2013
    Inventor: Alon Konchitsky
  • Publication number: 20130054236
    Abstract: A method for the detection of noise and speech segments in a digital audio input signal, the input signal being divided into a plurality of frames including a first stage in which a first classification of a frame as noise is performed if the mean energy value for this frame and the previous N frames is not greater than a first energy threshold, N>1, a second stage in which for each frame that has not been classified as noise in the first stage it is decided if the frame is classified as noise or as speech based on combining at least a first criterion of spectral similarity of the frame with acoustic noise and speech models, a second criterion of analysis of the energy of the frame and a third criterion of duration, and of using a state machine for detecting the beginning of a segment as an accumulation of a determined number of consecutive frames with acoustic similarity greater than a first threshold and for detecting the end of the segment; a third stage in which the classification as speech or as noise
    Type: Application
    Filed: October 7, 2010
    Publication date: February 28, 2013
    Applicant: TELEFONICA, S.A.
    Inventors: Carlos Garcia Martinez, Helenca Duxans Barrobés, Mauricio Sendra Vicens, David Cadenas Sanchez
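The first stage described in this abstract (classify a frame as noise when the mean energy of that frame and the previous N frames does not exceed an energy threshold) can be sketched as below. Only that first stage is covered. The function name is hypothetical, and averaging over the frames that are available at the start of the signal is an assumption.

```python
def first_stage_noise_flags(frame_energies, n_prev, threshold):
    """Flag each frame as noise using a trailing mean-energy test.

    A frame is flagged True (noise) when the mean energy over the frame
    and up to `n_prev` preceding frames is not greater than `threshold`.
    Frames near the start use however many earlier frames exist.
    """
    flags = []
    for i in range(len(frame_energies)):
        window = frame_energies[max(0, i - n_prev): i + 1]
        flags.append(sum(window) / len(window) <= threshold)
    return flags
```

Frames not flagged here would go on to the later spectral-similarity, energy, and duration criteria that the abstract mentions.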
  • Publication number: 20130046536
    Abstract: Methods and apparatuses for performing song detection on an audio signal are described. Clips of the audio signal are classified into classes comprising music. Class boundaries of music clips are detected as candidate boundaries of a first type. Combinations including non-overlapped sections are derived. Each section meets the following conditions: 1) including at least one music segment longer than a predetermined minimum song duration, 2) shorter than a predetermined maximum song duration, 3) both starting and ending with a music clip, and 4) a proportion of the music clips in each of the sections is greater than a predetermined minimum proportion. In this way, various possible song partitions in the audio signal can be obtained for investigation.
    Type: Application
    Filed: July 26, 2012
    Publication date: February 21, 2013
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Lie Lu, Claus Bauer
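The four section conditions listed in this abstract can be checked with a short sketch. It assumes each clip is a `(label, duration_seconds)` pair, that a "music segment" is a contiguous run of music clips, and that the music proportion is counted by number of clips; these interpretations and all names are hypothetical.

```python
def is_candidate_section(clips, min_song, max_song, min_music_ratio):
    """Check whether a list of classified clips forms a candidate song section.

    clips: list of (label, duration_seconds) in time order, where the
    label "music" marks a music clip (an assumed representation).
    """
    if not clips:
        return False
    total = sum(duration for _, duration in clips)
    # Longest contiguous run of music clips, in seconds.
    longest = run = 0.0
    for label, duration in clips:
        run = run + duration if label == "music" else 0.0
        longest = max(longest, run)
    music_count = sum(1 for label, _ in clips if label == "music")
    return (
        longest > min_song                            # condition 1
        and total < max_song                          # condition 2
        and clips[0][0] == "music"                    # condition 3 (start)
        and clips[-1][0] == "music"                   # condition 3 (end)
        and music_count / len(clips) > min_music_ratio  # condition 4
    )
```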
  • Publication number: 20130035935
    Abstract: The present invention allows a man to recognize a location of a sound source in a three-dimensional space using two ears and applies a method of separating a sound source in a certain orientation to improve the performance of an application technology using a speech in a noisy environment. The present invention acquires a speech signal using two sensors and determines an orientation angle of a sound source in a zero-crossing point step with respect to a frequency separated signal with a band pass filter bank. An object of the present invention is to obtain excellent sound source orientation detection and division performance which is difficult to be obtained in an existing cross-correlation method calculated in units of time frames in a noisy environment with a plurality of sound sources.
    Type: Application
    Filed: May 1, 2012
    Publication date: February 7, 2013
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Young Ik KIM, Hoon Young Cho, Sang Hun Kim
  • Patent number: 8359020
    Abstract: In one implementation, a computer-implemented method includes detecting a current context associated with a mobile computing device and determining, based on the current context, whether to switch the mobile computing device from a current mode of operation to a second mode of operation during which the mobile computing device monitors ambient sounds for voice input that indicates a request to perform an operation. The method can further include, in response to determining whether to switch to the second mode of operation, activating one or more microphones and a speech analysis subsystem associated with the mobile computing device so that the mobile computing device receives a stream of audio data. The method can also include providing output on the mobile computing device that is responsive to voice input that is detected in the stream of audio data and that indicates a request to perform an operation.
    Type: Grant
    Filed: August 6, 2010
    Date of Patent: January 22, 2013
    Assignee: Google Inc.
    Inventors: Michael J. Lebeau, John Nicholas Jitkoff, Dave Burke
  • Publication number: 20130006624
    Abstract: An apparatus and a method that achieve physical separation of sound sources by directly pointing a beam of coherent electromagnetic waves (i.e., a laser). Analyzing the physical properties of a beam reflected from the vibration-generating sound source enables the reconstruction of the sound signal generated by the sound source, eliminating the noise component added to the original sound signal. In addition, the use of multiple electromagnetic wave beams or a beam that rapidly skips from one sound source to another allows the physical separation of these sound sources. Aiming each beam at a different sound source ensures the independence of the sound signal sources and therefore provides full source separation.
    Type: Application
    Filed: September 12, 2012
    Publication date: January 3, 2013
    Applicant: AUDIOZOOM LTD
    Inventor: Tal Bakish
  • Publication number: 20120330656
    Abstract: Discrimination between two classes comprises receiving a set of frames including an input signal and determining at least two different feature vectors for each of the frames. Discrimination between two classes further comprises classifying the two different feature vectors using sets of preclassifiers trained for at least two classes of events and from that classification, and determining values for at least one weighting factor. Discrimination between two classes still further comprises calculating a combined feature vector for each of the received frames by applying the weighting factor to the feature vectors and classifying the combined feature vector for each of the frames by using a set of classifiers trained for at least two classes of events.
    Type: Application
    Filed: September 4, 2012
    Publication date: December 27, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Zica Valsan
  • Publication number: 20120330651
    Abstract: A voice data transferring device intermediates between an in-vehicle terminal and a voice recognition server. In order to check a change in voice recognition performance of the voice recognition server, the voice data transferring device performs a noise suppression processing on a voice data for evaluation in a noise suppression module; transmits the voice data for evaluation to the voice recognition server; and receives a recognition result thereof. The voice data transferring device sets a value of a noise suppression parameter used for a noise suppression processing or a value of a result integration parameter used for a processing of integrating a plurality of recognition results acquired from the voice recognition server, at an optimum value, based on the recognition result of the voice recognition server. This makes it possible to set a suitable parameter even if the voice recognition performance of the voice recognition server changes.
    Type: Application
    Filed: June 22, 2012
    Publication date: December 27, 2012
    Inventors: Yasunari Obuchi, Takeshi Homma
  • Publication number: 20120330655
    Abstract: A voice recognition device includes a voice recognition dictionary in which a word which is recognized as a result of voice recognition on an inputted voice is registered, a reply voice data storage unit for storing recorded voice data about words registered in the voice recognition dictionary, a dialog control unit for, when a word registered in the voice recognition dictionary is recognized, acquiring recorded voice data corresponding to the word from the reply voice data storage unit, a reproduction noise reduction unit for carrying out a process of reducing noise included in the recorded voice data, an amplitude adjusting unit for adjusting an amplitude of the recorded voice data in which the noise has been reduced to a predetermined amplitude level, and a voice reproduction unit for reproducing a voice from the amplitude-adjusted recorded voice data.
    Type: Application
    Filed: June 28, 2010
    Publication date: December 27, 2012
    Inventors: Masanobu Osawa, Kazuyuki Nogi
  • Publication number: 20120330657
    Abstract: A speech feature extraction apparatus, speech feature extraction method, and speech feature extraction program. A speech feature extraction apparatus includes: first difference calculation module to: (i) receive, as an input, a spectrum of a speech signal segmented into frames for each frequency bin; and (ii) calculate a delta spectrum for each of the frames, where the delta spectrum is a difference of the spectrum within continuous frames for the frequency bin; and first normalization module to normalize the delta spectrum of the frame for the frequency bin by dividing the delta spectrum by a function of an average spectrum; where the average spectrum is an average of spectra through all frames that are overall speech for the frequency bin; and where an output of the first normalization module is defined as a first delta feature.
    Type: Application
    Filed: September 6, 2012
    Publication date: December 27, 2012
    Applicant: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
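The normalized delta feature in this abstract can be sketched as follows. The sketch assumes the "function of an average spectrum" is the average itself, that the delta is the difference between the next and previous frames with clamping at the edges, and that every per-bin average is nonzero; the abstract does not specify these details.

```python
def normalized_delta(spectra):
    """Compute per-bin delta spectra divided by the per-bin average spectrum.

    spectra: list of frames, each a list of per-bin spectral values.
    Returns one list of normalized deltas per frame.
    """
    n_frames = len(spectra)
    n_bins = len(spectra[0])
    # Average spectrum over all frames, computed separately per bin.
    avg = [sum(frame[b] for frame in spectra) / n_frames for b in range(n_bins)]
    out = []
    for t in range(n_frames):
        prev_frame = spectra[max(t - 1, 0)]
        next_frame = spectra[min(t + 1, n_frames - 1)]
        # Difference across neighbouring frames, normalized by the average.
        out.append([(next_frame[b] - prev_frame[b]) / avg[b]
                    for b in range(n_bins)])
    return out
```

Dividing by the average spectrum makes the delta relative to the typical level in each bin instead of absolute.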
  • Publication number: 20120316872
    Abstract: Embodiments of the present invention provide an adaptive noise canceling system. The adaptive noise canceling system may be used in a handset to cancel background noise by generating an anti-noise signal. The adaptive noise canceling system may include a first input to receive a first signal from a feedforward microphone; a second input to receive a second signal from an error microphone; a controller coupled to the inputs, the controller configured to adaptively generate an anti-noise signal according to the received signals, wherein the controller derives a profile of the anti-noise signal from the first signal and derives a magnitude of the anti-noise signal from both the first and second signals; and an output to transmit the anti-noise signal to a speaker.
    Type: Application
    Filed: June 7, 2011
    Publication date: December 13, 2012
    Applicant: ANALOG DEVICES, INC.
    Inventors: Thomas Stoltz, Kim Spetzler Berthelsen, Robert Adams
  • Publication number: 20120310641
    Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.
    Type: Application
    Filed: August 13, 2012
    Publication date: December 6, 2012
    Inventors: Riitta Elina Niemistö, Päivi Marianna Valve
  • Publication number: 20120310640
    Abstract: A personal audio device, such as a wireless telephone, includes a noise canceling circuit that adaptively generates an anti-noise signal from a reference microphone signal and injects the anti-noise signal into the speaker or other transducer output to cause cancellation of ambient audio sounds. An error microphone may also be provided proximate the speaker to estimate an electro-acoustical path from the noise canceling circuit through the transducer. A processing circuit uses the reference and/or error microphone, optionally along with a microphone provided for capturing near-end speech, to determine whether one of the reference or error microphones is obstructed by comparing their received signal content and takes action to avoid generation of erroneous anti-noise.
    Type: Application
    Filed: September 30, 2011
    Publication date: December 6, 2012
    Inventors: Nitin Kwatra, Jeffrey Alderson, Jon D. Hendrix
  • Patent number: 8326328
    Abstract: In one implementation, a computer-implemented method includes detecting a current context associated with a mobile computing device and determining, based on the current context, whether to switch the mobile computing device from a current mode of operation to a second mode of operation during which the mobile computing device monitors ambient sounds for voice input that indicates a request to perform an operation. The method can further include, in response to determining whether to switch to the second mode of operation, activating one or more microphones and a speech analysis subsystem associated with the mobile computing device so that the mobile computing device receives a stream of audio data. The method can also include providing output on the mobile computing device that is responsive to voice input that is detected in the stream of audio data and that indicates a request to perform an operation.
    Type: Grant
    Filed: September 29, 2011
    Date of Patent: December 4, 2012
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, John Nicholas Jitkoff, Dave Burke
  • Publication number: 20120303366
    Abstract: A system detects a speech segment that may include unvoiced, fully voiced, or mixed voice content. The system includes a window function that passes signals within a programmed aural frequency range while substantially blocking signals above and below the programmed aural frequency range. A frequency converter converts the signals passing within the programmed aural frequency range into a plurality of frequency bins. A background voice detector estimates the strength of a background speech segment relative to the noise of selected portions of the aural spectrum. A noise estimator estimates a maximum distribution of noise to an average of an acoustic noise power of some of the plurality of frequency bins. A voice detector compares the strength of a desired speech segment to a maximum of an output of the background voice detector and an output of the noise estimator.
    Type: Application
    Filed: August 3, 2012
    Publication date: November 29, 2012
    Inventors: Phillip Alan Hetherington, Mark Ryan Fallat
  • Publication number: 20120303367
    Abstract: An enhancement system improves the estimate of noise from a received signal. The system includes a spectrum monitor that divides a portion of the signal at more than one frequency resolution. Adaptation logic derives a noise adaptation factor of the received signal. A plurality of devices tracks the characteristics of an estimated noise in the received signal and modifies multiple noise adaptation rates. Weighting logic applies the modified noise adaptation rates derived from the signal divided at a first frequency resolution to the signal divided at a second frequency resolution.
    Type: Application
    Filed: August 13, 2012
    Publication date: November 29, 2012
    Applicant: QNX Software Systems Limited
    Inventor: Phillip A. Hetherington
  • Publication number: 20120290297
    Abstract: A signal representative of an unpredictable audio stimulus is provided to a putative live speaker within a putative live recording environment. A second signal purportedly emanating from the putative live speaker and/or the environment is received. This second signal is examined for influence of the unpredictable audio stimulus on the putative live speaker and/or the putative live recording environment. The examining includes at least one of audio feedback analysis, Lombard analysis, and evoked otoacoustic response analysis. Based on the examining, a determination is made as to whether the putative live speaker is an actual live speaker and/or whether the putative live recording environment is an actual live recording environment.
    Type: Application
    Filed: May 11, 2011
    Publication date: November 15, 2012
    Applicant: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Jason W. Pelecanos
  • Publication number: 20120284023
    Abstract: The method comprises the steps of: digitizing sound signals picked up simultaneously by two microphones (N, M); executing a short-term Fourier transform on the signals (xn(t), xm(t)) picked up on the two channels so as to produce a succession of frames in a series of frequency bands; applying an algorithm for calculating a speech-presence confidence index on each channel, in particular a probability that speech is present; selecting one of the two microphones by applying a decision rule to the successive frames of each of the channels, which rule is a function both of a channel selection criterion and of a speech-presence confidence index; and implementing speech processing on the sound signal picked up by the one microphone that is selected.
    Type: Application
    Filed: May 7, 2010
    Publication date: November 8, 2012
    Applicant: PARROT
    Inventors: Guillaume Vitte, Alexandre Briot, Guillaume Pinto
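The microphone-selection step in the abstract above can be illustrated with a minimal sketch. It assumes the per-frame channel selection criterion and speech-presence confidence have already been computed for both channels; the function name, the `min_confidence` threshold, and the `margin` parameter are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple


def select_microphone(
    criteria: Sequence[Tuple[float, float]],
    confidences: Sequence[Tuple[float, float]],
    current: int = 0,
    min_confidence: float = 0.5,
    margin: float = 0.0,
) -> List[int]:
    """Return the selected channel index (0 or 1) for each frame.

    criteria[i]    -- hypothetical selection criterion per channel (higher is better)
    confidences[i] -- speech-presence probability per channel, in [0, 1]
    """
    selected = current
    choices = []
    for crit, conf in zip(criteria, confidences):
        other = 1 - selected
        # Assumed decision rule: switch only when speech is likely present on
        # the other channel AND that channel scores better by at least `margin`.
        if conf[other] >= min_confidence and crit[other] > crit[selected] + margin:
            selected = other
        choices.append(selected)
    return choices
```

Gating the switch on the confidence index keeps the selection from flipping on frames that contain only noise, which is one plausible reading of why the rule depends on both quantities.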
  • Publication number: 20120265526
    Abstract: An input signal is received. A plurality of electrical characteristics is obtained from the input signal. A plurality of acoustic features is determined from the obtained electrical characteristics, with each of the acoustic features being different from the others. At least some of the acoustic features are compared to a plurality of predetermined criteria. Based upon the comparing of the acoustic features to the plurality of predetermined criteria, it is determined whether the signal is a voice signal or a noise signal.
    Type: Application
    Filed: April 13, 2011
    Publication date: October 18, 2012
    Applicant: CONTINENTAL AUTOMOTIVE SYSTEMS, INC.
    Inventors: Suat Yeldener, David Barron
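As a rough illustration of comparing several distinct acoustic features against predetermined criteria, the sketch below uses two features (frame energy and zero-crossing rate) and a simple match count. The choice of features, the numeric ranges in `CRITERIA`, and the `min_matches` rule are all assumptions made for the example; the abstract does not specify them.

```python
from typing import Dict, Sequence, Tuple


def extract_features(samples: Sequence[float]) -> Dict[str, float]:
    """Compute two illustrative features from a frame of at least 2 samples."""
    n = len(samples)
    energy = sum(x * x for x in samples) / n
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return {"energy": energy, "zero_crossing_rate": crossings / (n - 1)}


# Hypothetical criteria: an inclusive (low, high) range per feature.
CRITERIA: Dict[str, Tuple[float, float]] = {
    "energy": (0.01, float("inf")),
    "zero_crossing_rate": (0.0, 0.25),
}


def classify(samples: Sequence[float],
             criteria: Dict[str, Tuple[float, float]] = CRITERIA,
             min_matches: int = 2) -> str:
    """Label the frame 'voice' when enough features satisfy their criteria."""
    features = extract_features(samples)
    matches = sum(1 for name, (low, high) in criteria.items()
                  if low <= features[name] <= high)
    return "voice" if matches >= min_matches else "noise"
```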
  • Publication number: 20120259629
    Abstract: A noise reduction transmitter is provided that can secure the clarity of sounds collected in very noisy environments and maintain sound quality without requiring a specially devised noise insulation cover. A transmission microphone 7 is arranged inside a noise insulation cover 2 that is worn over and covers at least the mouth of a user 1. A noise detection microphone 9 which detects external noises is arranged outside the noise insulation cover, and a noise component cancellation circuit 11 is provided which generates a noise component cancellation signal based on an output signal from the noise detection microphone. An electroacoustic transducer 8 is arranged in the noise insulation cover to reproduce a noise component cancellation sound based on an output signal from the noise component cancellation circuit 11.
    Type: Application
    Filed: April 6, 2012
    Publication date: October 11, 2012
    Applicant: KABUSHIKI KAISHA AUDIO-TECHNICA
    Inventor: Hiroshi AKINO
  • Publication number: 20120259628
    Abstract: A telecommunication device is disclosed, comprising: a microphone array comprising a plurality of microphones, wherein each microphone receives an analogue acoustic signal; a position sensing device for determining how the telecommunication device is positioned in three-dimensions with respect to a user's mouth; at least one analogue/digital converter for converting each analogue acoustic signal into a digital signal; a digital signal processor for performing signal processing on the received digital signals comprising a controller, a plurality of delay circuits for delaying each received signal based on an input from the controller and a plurality of preamplifiers for adjusting the gain of each received signal based on a gain input from the controller, wherein the controller selects the appropriate delay and gain values applied to each received signal to remove noise from the received signals based on the determined position of the telecommunication device.
    Type: Application
    Filed: May 4, 2011
    Publication date: October 11, 2012
    Applicant: SONY ERICSSON MOBILE COMMUNICATIONS AB
    Inventor: Georg SIOTIS
  • Publication number: 20120259631
    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
    Type: Application
    Filed: June 22, 2012
    Publication date: October 11, 2012
    Applicant: GOOGLE INC.
    Inventors: Matthew I. Lloyd, Trausti T. Kristjansson
  • Publication number: 20120245933
    Abstract: A device for suppressing ambient sounds from speech received by a microphone array is provided. One embodiment of the device comprises a microphone array, a processor, an analog-to-digital converter, and memory comprising instructions stored therein that are executable by the processor.
    Type: Application
    Filed: June 8, 2012
    Publication date: September 27, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Jason Flaks, Ivan Tashev, Duncan McKay, Xudong Ni, Robert Heitkamp, Wei Guo, John Tardif, Leo Shing, Michael Baseflug
  • Publication number: 20120239394
    Abstract: An erroneous detection determination device includes: a signal acquisition unit configured to acquire, from each of microphones, a plurality of audio signals relating to ambient sound including sound from a sound source in a certain direction; a result acquisition unit configured to acquire a recognition result including voice activity information indicating the inclusion of a voice activity relating to at least one of the audio signals; a calculation unit configured to calculate, for each of the audio signals on the basis of the signals in respective unit times and the certain direction, a speech arrival rate representing the proportion of the sound from the certain direction to the ambient sound in each of the unit times; and an error detection unit configured to determine, on the basis of the recognition result and the speech arrival rate, whether or not the voice activity information is the result of erroneous detection.
    Type: Application
    Filed: February 28, 2012
    Publication date: September 20, 2012
    Applicant: FUJITSU LIMITED
    Inventor: Chikako MATSUMOTO
  • Publication number: 20120232896
    Abstract: A voice activity detection apparatus (1) comprising: a signal condition analyzing unit (3) which analyses at least one signal parameter of an input signal to detect a signal condition SC of said input signal; at least two voice activity detection units (4-i) comprising different voice detection characteristics, wherein each voice activity detection unit (4-i) performs separately a voice activity detection of said input signal to provide a voice activity detection decision VADD; and a decision combination unit (5) which combines the voice activity detection decisions VADDs provided by said voice activity detection units (4-i) depending on the detected signal condition SC to provide a combined voice activity detection decision cVADD.
    Type: Application
    Filed: May 21, 2012
    Publication date: September 13, 2012
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Anisse TALEB, Zhe WANG, Jianfeng XU, Lei MIAO
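The idea of combining several voice activity decisions according to a detected signal condition can be sketched as follows. This is only an illustration under stated assumptions: the two detectors are simple energy thresholds with different sensitivities, the signal condition is reduced to a "clean" or "noisy" label derived from a hypothetical noise-energy estimate, and the combination rules (OR when clean, AND when noisy) are choices made for the example.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]
Detector = Callable[[Frame], bool]


def energy(frame: Frame) -> float:
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def make_energy_vad(threshold: float) -> Detector:
    """Build a detector whose characteristic is set by its energy threshold."""
    def vad(frame: Frame) -> bool:
        return energy(frame) > threshold
    return vad


def analyze_condition(noise_energy: float, noisy_above: float = 0.01) -> str:
    """Map a hypothetical noise-energy estimate to a signal condition label."""
    return "noisy" if noise_energy > noisy_above else "clean"


def combined_vad(frame: Frame, detectors: List[Detector], condition: str) -> bool:
    """Combine the individual decisions depending on the signal condition."""
    decisions = [detect(frame) for detect in detectors]
    # Assumed rule: require agreement in noise, accept any detection otherwise.
    return all(decisions) if condition == "noisy" else any(decisions)
```

For example, with a sensitive detector (`make_energy_vad(0.001)`) and a conservative one (`make_energy_vad(0.1)`), a moderately loud frame is accepted as speech under the "clean" condition but rejected under the "noisy" condition.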
  • Publication number: 20120224715
    Abstract: The subject disclosure is directed towards a noise adaptive beamformer that dynamically selects between microphone array channels, based upon noise energy floor levels that are measured when no actual signal (e.g., no speech) is present. When speech (or a similar desired signal) is detected, the beamformer selects which microphone signal to use in signal processing, e.g., corresponding to the lowest noise channel. Multiple channels may be selected, with their signals combined. The beamformer transitions back to the noise measurement phase when the actual signal is no longer detected, so that the beamformer dynamically adapts as noise levels change, including on a per-microphone basis, to account for microphone hardware differences, changing noise sources, and individual microphone deterioration.
    Type: Application
    Filed: March 3, 2011
    Publication date: September 6, 2012
    Applicant: Microsoft Corporation
    Inventor: Harshavardhana N. Kikkeri
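The two-phase behavior described in the abstract above (measure per-channel noise floors while no speech is present, then use the lowest-noise channel when speech is detected) can be sketched as below. The class name, the exponential smoothing of the noise floor, and the externally supplied `speech_present` flag are assumptions for illustration; the sketch selects a single channel and does not cover combining multiple channels.

```python
from typing import List, Optional, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


class NoiseAdaptiveSelector:
    """Tracks a noise floor per channel and picks the quietest channel."""

    def __init__(self, num_channels: int, smoothing: float = 0.9) -> None:
        self.smoothing = smoothing  # hypothetical smoothing factor
        self.noise_floor: List[Optional[float]] = [None] * num_channels

    def update_noise(self, frames: Sequence[Sequence[float]]) -> None:
        """Noise measurement phase: update each channel's floor estimate."""
        for ch, frame in enumerate(frames):
            e = frame_energy(frame)
            prev = self.noise_floor[ch]
            self.noise_floor[ch] = e if prev is None else (
                self.smoothing * prev + (1.0 - self.smoothing) * e)

    def select(self) -> int:
        """Index of the channel with the lowest measured noise floor."""
        floors = [f if f is not None else float("inf") for f in self.noise_floor]
        return min(range(len(floors)), key=floors.__getitem__)

    def process(self, frames: Sequence[Sequence[float]],
                speech_present: bool) -> Optional[Sequence[float]]:
        """Adapt during silence; return the selected channel during speech."""
        if not speech_present:
            self.update_noise(frames)
            return None
        return frames[self.select()]
```

Because the floors are updated every time speech is absent, the selection follows changes in noise sources or in individual microphones over time.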