Detect Speech In Noise Patents (Class 704/233)
  • Patent number: 8321217
    Abstract: The present invention relates to a voice activity detector (VAD) comprising at least a first primary voice detector. The voice activity detector is configured to output a speech decision ‘vad_flag’ indicative of the presence of speech in an input signal based on at least a primary speech decision ‘vad_prim_A’ produced by said first primary voice detector. The voice activity detector further comprises a short term activity detector and the voice activity detector is further configured to produce a music decision ‘vad_music’ indicative of the presence of music in the input signal based on a short term primary activity signal ‘vad_act_prim_A’ produced by said short term activity detector based on the primary speech decision ‘vad_prim_A’ produced by the first voice detector. The short term primary activity signal ‘vad_act_prim_A’ is proportional to the presence of music in the input signal. The invention also relates to a node, e.g. a terminal, in a communication system comprising such a VAD.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: November 27, 2012
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventor: Martin Sehlstedt
  • Patent number: 8315412
    Abstract: A user interface device provides secure access to equipment. An audio reception device is adapted to receive an audio input comprising background noise and to receive an audio user code from a user. A user input device receives a manual input comprising a manual user code from the user. A control unit stores one or more target user codes and receives an audio input and determines a noise level of the background noise. An instruction message is generated to inform the user to enter the manual user code in response to the noise level of the background noise exceeding a predetermined threshold level. An output device provides the instruction message to the user. A control unit provides secure access to the equipment in response to at least one of the audio user code and the manual user code.
    Type: Grant
    Filed: October 17, 2007
    Date of Patent: November 20, 2012
    Assignee: The Chamberlain Group, Inc.
    Inventors: Larry Strait, Steve Coates
  • Patent number: 8315865
    Abstract: A conversation detector and detection method is based on voice band energy detection. The detector is formed of a signal preconditioner, a comparator and an analysis unit. The comparator generates signal pulses reduced in resolution and sample rate (e.g., single bit data) and indicative of energy level and/or duration of activity detected in subject audio signals. The analysis unit determines from the generated signal pulses whether a conversation exists in the subject audio signal. The detector is also able to adapt to environmental noise change, automatically calibrate and operate in low power consumption mode.
    Type: Grant
    Filed: May 4, 2004
    Date of Patent: November 20, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Benjamin Kuris
  • Publication number: 20120290297
    Abstract: A signal representative of an unpredictable audio stimulus is provided to a putative live speaker within a putative live recording environment. A second signal purportedly emanating from the putative live speaker and/or the environment is received. This second signal is examined for influence of the unpredictable audio stimulus on the putative live speaker and/or the putative live recording environment. The examining includes at least one of audio feedback analysis, Lombard analysis, and evoked otoacoustic response analysis. Based on the examining, a determination is made as to whether the putative live speaker is an actual live speaker and/or whether the putative live recording environment is an actual live recording environment.
    Type: Application
    Filed: May 11, 2011
    Publication date: November 15, 2012
    Applicant: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Jason W. Pelecanos
  • Patent number: 8311819
    Abstract: A system detects a speech segment that may include unvoiced, fully voiced, or mixed voice content. The system includes a digital converter that converts a time-varying input signal into a digital-domain signal. A window function passes signals within a programmed aural frequency range while substantially blocking signals above and below the programmed aural frequency range when multiplied by an output of the digital converter. A frequency converter converts the signals passing within the programmed aural frequency range into a plurality of frequency bins. A background voice detector estimates the strength of a background speech segment relative to the noise of selected portions of the aural spectrum. A noise estimator estimates a maximum distribution of noise to an average of an acoustic noise power of some of the plurality of frequency bins.
    Type: Grant
    Filed: March 26, 2008
    Date of Patent: November 13, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Phillip A. Hetherington, Mark Fallat
  • Patent number: 8311881
    Abstract: A method for maintaining a website by providing rewards for indirect unseen activities is disclosed. Individual reviewers are rewarded for reviewing and assigning rating scores to articles of media as part of a panel of reviewers consisting of more than one individual reviewer. The score from each individual reviewer does not directly determine whether or not the article of media is accepted and can be viewed by others; rather, it is the score from the collective panel. Further, reviewers are rewarded for reviewing even if the media clip is rejected and cannot be viewed by others. Thus, reviewers are rewarded for their indirect unseen activities.
    Type: Grant
    Filed: February 26, 2008
    Date of Patent: November 13, 2012
    Assignee: Wonderlandaward.com Ltd
    Inventor: Robert Silman
  • Patent number: 8311820
    Abstract: Presented is a method and system for speech recognition. The method includes determining a noise level in an environment, comparing the determined noise level with a predetermined noise level threshold value, using a first set of grammar for speech recognition, if the determined noise level is below the predetermined noise level threshold value, and using a second set of grammar for speech recognition, if the determined noise level is above the predetermined noise level threshold value.
    Type: Grant
    Filed: April 6, 2010
    Date of Patent: November 13, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Amit Ranjan
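
Illustrative sketch (not taken from the patent): the abstract above describes switching between two grammar sets depending on whether a measured noise level falls below or above a threshold. A minimal Python version of that decision might look like the following. The function name, the decibel units, and the example word lists are all assumptions made for illustration, and the abstract does not say what happens when the level equals the threshold; here that case falls to the second grammar.

```python
def select_grammar(noise_level_db, threshold_db, first_grammar, second_grammar):
    """Pick a grammar set from a measured noise level (hypothetical helper)."""
    # Below the threshold: use the first grammar set.
    if noise_level_db < threshold_db:
        return first_grammar
    # At or above the threshold: use the second grammar set.
    # (The "equal" case is an assumption; the abstract leaves it open.)
    return second_grammar


# Hypothetical grammars: a larger one for quiet rooms, a smaller one for noise.
full_grammar = ["call", "dial", "play", "navigate", "search"]
reduced_grammar = ["call", "stop"]

active = select_grammar(72.0, 60.0, full_grammar, reduced_grammar)
```

Using a smaller second grammar is one plausible design choice, since fewer candidate phrases tend to be easier to tell apart in noise, but the abstract itself only states that the two sets differ.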
  • Patent number: 8311817
    Abstract: Provided are methods and systems for enhancing the quality of voice communications. The method and corresponding system may involve classifying an audio signal into speech, and speech and noise and creating speech-noise classification data. The method may further involve sharing the speech-noise classification data with a speech encoder via a shared memory or by a Least Significant Bit (LSB) of a Pulse Code Modulation (PCM) stream. The method and corresponding system may also involve sharing acoustic cues with the speech encoder to improve the speech noise classification and, in certain embodiments, sharing scaling transition factors with the speech encoder to enable the speech encoder to gradually change data rate in the transitions between the encoding modes.
    Type: Grant
    Filed: November 3, 2011
    Date of Patent: November 13, 2012
    Assignee: Audience, Inc.
    Inventors: Carlo Murgia, Scott Isabelle
  • Publication number: 20120284023
    Abstract: The method comprises the steps of: digitizing sound signals picked up simultaneously by two microphones (N, M); executing a short-term Fourier transform on the signals (xn(t), xm(t)) picked up on the two channels so as to produce a succession of frames in a series of frequency bands; applying an algorithm for calculating a speech-presence confidence index on each channel, in particular a probability that speech is present; selecting one of the two microphones by applying a decision rule to the successive frames of each of the channels, which rule is a function both of a channel selection criterion and of a speech-presence confidence index; and implementing speech processing on the sound signal picked up by the one microphone that is selected.
    Type: Application
    Filed: May 7, 2010
    Publication date: November 8, 2012
    Applicant: PARROT
    Inventors: Guillaume Vitte, Alexandre Briot, Guillaume Pinto
  • Patent number: 8306815
    Abstract: A speech dialog system interfaces a user to a computer. The system includes a signal pre-processor that processes a speech input to generate an enhanced signal and an analysis signal. A speech recognition unit may generate a recognition result based on the enhanced signal. A control unit may manage an output unit or an external device based on the information within the analysis signal.
    Type: Grant
    Filed: December 6, 2007
    Date of Patent: November 6, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Lars König, Gerhard Uwe Schmidt, Andreas Löw
  • Patent number: 8306823
    Abstract: A speech receiving unit receives a user ID, a speech obtained at a terminal, and an utterance duration, from the terminal. A proximity determining unit calculates a correlation value expressing a correlation between speeches received from plural terminals, compares the correlation value with a first threshold value, and determines that the plural terminals that receive the speeches whose correlation value is calculated are close to each other, when the correlation value is larger than the first threshold value. A dialog detecting unit determines whether a relationship between the utterance durations received from the plural terminals that are determined to be close to each other within an arbitrary target period during which a dialog is to be detected fits a rule. When the relationship is determined to fit the rule, the dialog detecting unit detects dialog information containing the target period and the user ID.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: November 6, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masayuki Okamoto, Naoki Iketani, Hideo Umeki, Sogo Tsuboi, Kenta Cho, Keisuke Nishimura, Masanori Hattori
  • Patent number: 8300801
    Abstract: A system and method for enhancing communications through a phone. A voice communication is received from a user of the phone. A secondary signal is received from an environment in proximity to the phone. The secondary signal is processed to determine an inverse signal in response to receiving the secondary signal. The inverse signal is combined with the voice communication and the secondary signal to destructively interfere with the secondary signal for allowing a receiving party to more effectively communicate with the user.
    Type: Grant
    Filed: June 26, 2008
    Date of Patent: October 30, 2012
    Assignee: CenturyLink Intellectual Property LLC
    Inventors: Jeffrey Michael Sweeney, Kelsyn Donel Seven Rooks, Sr., Michael Clayton Robinson
  • Publication number: 20120269332
    Abstract: A method is provided for encoding multiple microphone signals into a composite source-separable audio (SSA) signal, conducive for transmission over a voice network. The embodiments enable the processing of source separation of the target voice signal from its ambient sound to be performed at any point in the voice communication network, including the internet cloud. A multiplicity of processing is possible over the SSA signal, based on the intended voice application. The level of processing is adapted with the availability of the processing power at the chosen processing node in the network in one embodiment. An apparatus for separating out the target source voice from its ambient sound is also provided. The apparatus includes a directed source separation (DSS) unit, which processes the two virtual microphone signals in the SSA representation, to generate a new SSA signal including the enhanced target voice and the enhanced ambient noise.
    Type: Application
    Filed: April 20, 2012
    Publication date: October 25, 2012
    Inventor: Shridhar K. Mukund
  • Patent number: 8296135
    Abstract: A noise cancellation apparatus includes a noise estimation module for receiving a noise-containing input speech, and estimating a noise therefrom to output the estimated noise; a first Wiener filter module for receiving the input speech, and applying a first Wiener filter thereto to output a first estimation of clean speech; a database for storing data of a Gaussian mixture model for modeling clean speech; and an MMSE estimation module for receiving the first estimation of clean speech and the data of the Gaussian mixture model to output a second estimation of clean speech. The apparatus further includes a final clean speech estimation module for receiving the second estimation of clean speech from the MMSE estimation module and the estimated noise from the noise estimation module, and obtaining a final Wiener filter gain therefrom to output a final estimation of clean speech by applying the final Wiener filter gain.
    Type: Grant
    Filed: November 13, 2008
    Date of Patent: October 23, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Byung Ok Kang, Ho-Young Jung, Sung Joo Lee, Yunkeun Lee, Jeon Gue Park, Jeom Ja Kang, Hoon Chung, Euisok Chung, Ji Hyun Wang, Hyung-Bae Jeon
  • Patent number: 8296147
    Abstract: A method for facilitating project management includes identifying a user, identifying a project management access level for the user, dynamically generating a voice dialog based on the identified project management access level, dynamically generating grammars associated with the voice dialog based on the identified project management access level, and serving the voice dialog to the user. The method further includes receiving a voice request from the user corresponding to a generated grammar; retrieving project management information associated with the received voice request; dynamically generating a responsive voice dialog including the retrieved project management information; dynamically generating responsive grammars associated with the responsive voice dialog; and serving the responsive voice dialog to the user.
    Type: Grant
    Filed: August 7, 2006
    Date of Patent: October 23, 2012
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: Rajesh Sharma
  • Patent number: 8296133
    Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term sliding mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term sliding mean of the frequency domain parameter in the history background noise frame; and judging whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the judgment criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.
    Type: Grant
    Filed: November 30, 2011
    Date of Patent: October 23, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Zhe Wang
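
Illustrative sketch (not taken from the patent): the abstract above compares each frame's time-domain and frequency-domain parameters against long-term sliding means kept over past background-noise frames. The Python below shows one simple way such a comparison could be arranged. The absolute-difference distance, the fixed limits, the exponential form of the sliding mean, and all names are assumptions; the patent's actual decision inequalities are not given in the abstract.

```python
def classify_frame(time_param, freq_param, noise_mean_time, noise_mean_freq,
                   time_limit=1.0, freq_limit=1.0):
    """Return (is_voice, d_time, d_freq) for one audio frame (hypothetical)."""
    # First distance: time-domain parameter vs. its long-term noise mean.
    d_time = abs(time_param - noise_mean_time)
    # Second distance: frequency-domain parameter vs. its long-term noise mean.
    d_freq = abs(freq_param - noise_mean_freq)
    # Assumed decision inequalities: a frame far from the noise statistics
    # in either domain is treated as foreground voice.
    is_voice = d_time > time_limit or d_freq > freq_limit
    return is_voice, d_time, d_freq


def update_mean(mean, value, alpha=0.95):
    """Exponential sliding mean, meant to be updated on noise frames only."""
    return alpha * mean + (1.0 - alpha) * value


# Usage: track the noise means and classify a short run of frames.
mean_t, mean_f = 1.0, 0.1
labels = []
for t, f in [(1.1, 0.1), (5.0, 0.2), (0.9, 0.15)]:
    voice, _, _ = classify_frame(t, f, mean_t, mean_f)
    labels.append(voice)
    if not voice:
        # Only background-noise frames feed the long-term means.
        mean_t = update_mean(mean_t, t)
        mean_f = update_mean(mean_f, f)
```

Updating the means only on frames judged to be noise is what would let the reference statistics follow a changing background, which is one reading of the adaptive adjustment the abstract mentions.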
  • Publication number: 20120264091
    Abstract: One example embodiment of the present disclosure includes method and system to improve communication comprising an assembly fixedly positioned during use in close proximity to a user's ear. The assembly includes an accelerometer to detect the initiation and duration of the user's speech and an output presentation system. The output presentation system comprises a non-occlusive ear fitting that presents unintelligible noise that is unrelated to the sound-frequency or intonation of the user's current speech. The unintelligible noise is presented to the patient at a level less than 85 dB. The system further comprises a control arrangement to maintain presentation of the noise substantially throughout the detected duration of the user's speech, but substantially not at other times.
    Type: Application
    Filed: February 16, 2012
    Publication date: October 18, 2012
    Applicant: Purdue Research Foundation
    Inventors: Jessica E. Huber, Scott Kepner, Derek Tully, Barbara S. Tully, James Thomas Jones, Kirk Solon Foster
  • Publication number: 20120265526
    Abstract: An input signal is received. A plurality of electrical characteristics from the input signal is obtained. A plurality of acoustic features is determined from the obtained electrical characteristics and each of the acoustic features being different from the others. At least some of the acoustic features are compared to a plurality of predetermined criteria. Based upon the comparing of the acoustic features to the plurality of predetermined criteria, it is determined when the signal is a voice signal or a noise signal.
    Type: Application
    Filed: April 13, 2011
    Publication date: October 18, 2012
    Applicant: CONTINENTAL AUTOMOTIVE SYSTEMS, INC.
    Inventors: Suat Yeldener, David Barron
  • Publication number: 20120259629
    Abstract: To provide a noise reduction transmitter which can secure clarity of sounds collected in very noisy environments and maintain a quality of sounds without devising a noise insulation cover particularly. A transmission microphone 7 is arranged inside a noise insulation cover 2 worn on and covering at least a user's 1 mouth. A noise detection microphone 9 which detects external noises is arranged outside the noise insulation cover, and a noise component cancellation circuit 11 is provided which generates a noise component cancellation signal based on an output signal from the noise detection microphone. An electroacoustic transducer 8 is arranged in the noise insulation cover to reproduce a noise component cancellation sound based on an output signal from the noise component cancellation circuit 11.
    Type: Application
    Filed: April 6, 2012
    Publication date: October 11, 2012
    Applicant: KABUSHIKI KAISHA AUDIO-TECHNICA
    Inventor: Hiroshi AKINO
  • Publication number: 20120259631
    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
    Type: Application
    Filed: June 22, 2012
    Publication date: October 11, 2012
    Applicant: GOOGLE INC.
    Inventors: Matthew I. Lloyd, Trausti T. Kristjansson
  • Publication number: 20120259630
    Abstract: The voice conversion method of a display apparatus includes: in response to the receipt of a first video frame, detecting one or more entities from the first video frame; in response to the selection of one of the detected entities, storing the selected entity; in response to the selection of one of a plurality of previously-stored voice samples, storing the selected voice sample in connection with the selected entity; and in response to the receipt of a second video frame including the selected entity, changing a voice of the selected entity based on the selected voice sample and outputting the changed voice.
    Type: Application
    Filed: April 11, 2012
    Publication date: October 11, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Aditi GARG, Kasthuri Jayachand YADLAPALLI
  • Publication number: 20120259628
    Abstract: A telecommunication device is disclosed, comprising: a microphone array comprising a plurality of microphones, wherein each microphone receives an analogue acoustic signal; a position sensing device for determining how the telecommunication device is positioned in three-dimensions with respect to a user's mouth; at least one analogue/digital converter for converting each analogue acoustic signal into a digital signal; a digital signal processor for performing signal processing on the received digital signals comprising a controller, a plurality of delay circuits for delaying each received signal based on an input from the controller and a plurality of preamplifiers for adjusting the gain of each received signal based on a gain input from the controller, wherein the controller selects the appropriate delay and gain values applied to each received signal to remove noise from the received signals based on the determined position of the telecommunication device.
    Type: Application
    Filed: May 4, 2011
    Publication date: October 11, 2012
    Applicant: SONY ERICSSON MOBILE COMMUNICATIONS AB
    Inventor: Georg SIOTIS
  • Patent number: 8285545
    Abstract: A voice command acquisition method and system for motor vehicles is improved in that noise source information is obtained directly from the vehicle system bus. Upon receiving an input signal with a voice command, the system bus is queried for one or more possible sources of a noise component in the input signal. In addition to vehicle-internal information (e.g., window status, fan blower speed, vehicle speed), the system may acquire external information (e.g., weather status) in order to better classify the noise component in the input signal. If the noise source is found to be a window, for example, the driver may be prompted to close the window. In addition, if the fan blower is at a high speed level, it may be slowed down automatically.
    Type: Grant
    Filed: October 3, 2008
    Date of Patent: October 9, 2012
    Assignee: Volkswagen AG
    Inventors: Chu Hee Lee, Johnathan Lee, Daniel Rosario, Edward Kim, Thomas Chan
  • Patent number: 8280731
    Abstract: A speech enhancement method operative for devices having limited available memory is described. The method is appropriate for very noisy environments and is capable of estimating the relative strengths of speech and noise components during both the presence as well as the absence of speech.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: October 2, 2012
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Rongshan Yu
  • Publication number: 20120245933
    Abstract: A device for suppressing ambient sounds from speech received by a microphone array is provided. One embodiment of the device comprises a microphone array, a processor, an analog-to-digital converter, and memory comprising instructions stored therein that are executable by the processor.
    Type: Application
    Filed: June 8, 2012
    Publication date: September 27, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Jason Flaks, Ivan Tashev, Duncan McKay, Xudong Ni, Robert Heitkamp, Wei Guo, John Tardif, Leo Shing, Michael Baseflug
  • Patent number: 8275610
    Abstract: A plural-channel audio signal (e.g., a stereo audio) is processed to modify a gain (e.g., a volume or loudness) of a speech component signal (e.g., dialogue spoken by actors in a movie) relative to an ambient component signal (e.g., reflected or reverberated sound) or other component signals. In one aspect, the speech component signal is identified and modified. In one aspect, the speech component signal is identified by assuming that the speech source (e.g., the actor currently speaking) is in the center of a stereo sound image of the plural-channel audio signal and by considering the spectral content of the speech component signal.
    Type: Grant
    Filed: September 14, 2007
    Date of Patent: September 25, 2012
    Assignee: LG Electronics Inc.
    Inventors: Christof Faller, Hyen-O Oh, Yang-Won Jung
  • Patent number: 8275150
    Abstract: An apparatus for processing an audio signal and method thereof are disclosed, by which a local dynamic range of an audio signal can be adaptively normalized as well as a maximum dynamic range of the audio signal. The present invention includes receiving a signal, by an audio processing apparatus; computing a long-term power and a short-term power by estimating power of the signal; generating a slow gain based on the long-term power; generating a fast gain based on the short-term power; obtaining a final gain by combining the slow gain and the fast gain; and, modifying the signal using the final gain.
    Type: Grant
    Filed: July 29, 2009
    Date of Patent: September 25, 2012
    Assignee: LG Electronics Inc.
    Inventors: Jong Ha Moon, Hyen O Oh, Joon Il Lee, Myung Hoon Lee, Yang Won Jung, Alexis Favrot, Christof Faller
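
Illustrative sketch (not taken from the patent): the abstract above derives a slow gain from a long-term power estimate and a fast gain from a short-term power estimate, then combines them into a final gain. The Python below is one possible arrangement under stated assumptions: power is tracked by exponential smoothing, the slow gain steers the long-term level toward a target RMS, the fast gain only attenuates local peaks, and the two are combined by multiplication. The function names, smoothing constants, and target level are all hypothetical.

```python
import math


def smooth_power(prev_power, sample, alpha):
    """One step of an exponentially smoothed power estimate."""
    return alpha * prev_power + (1.0 - alpha) * sample * sample


def normalize(samples, target_rms=0.1, alpha_long=0.999, alpha_short=0.9,
              eps=1e-12):
    """Apply a combined slow/fast gain to a list of float samples."""
    long_p = short_p = target_rms ** 2  # start both estimates at the target
    out = []
    for x in samples:
        long_p = smooth_power(long_p, x, alpha_long)
        short_p = smooth_power(short_p, x, alpha_short)
        # Slow gain: move the long-term level toward the target RMS.
        slow_gain = target_rms / math.sqrt(long_p + eps)
        # Fast gain: pull short-term bursts back toward the long-term level.
        fast_gain = math.sqrt((long_p + eps) / (short_p + eps))
        fast_gain = min(fast_gain, 1.0)  # assumption: never boost locally
        # Final gain: product of the two (an assumed combination rule).
        out.append(x * slow_gain * fast_gain)
    return out


loud = normalize([1.0] * 50)
silent = normalize([0.0] * 3)
```

The long smoothing constant makes the slow gain respond to overall level, while the short one lets the fast gain react to sudden peaks; the abstract does not specify either constant or the combination rule.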
  • Patent number: 8275148
    Abstract: An audio processing apparatus is provided, comprising: a main microphone for receiving sounds from a source and noises from non-source sources and generating a main input; a reference microphone for receiving the sounds and the noises and generating a reference input; a short-time Fourier transformation (STFT) unit for applying short time Fourier transformation to convert the main input of a time domain signals into a main signal of a frequency domain and convert the reference input of the time domain signals into a reference signal of the frequency domain; a sensitivity calibrating unit for performing sensitivity calibration on the main signal and the reference signal and generating a main calibrated signal and a reference calibrated signal; and a voice active detector (VAD) for generating a voice active signal according to the main calibrated signal, the reference calibrated signal and a direction of arrival (DOA) signal.
    Type: Grant
    Filed: July 28, 2009
    Date of Patent: September 25, 2012
    Assignee: Fortemedia, Inc.
    Inventors: Xi-Lin Li, Sheng Liu
  • Patent number: 8275611
    Abstract: An apparatus for adaptively suppressing noise in an input signal frequency spectrum derived from overlapping input frames is provided. The system includes a psychoacoustic power computation module configured to compute a noisy signal power in psychoacoustic bands, a voice activity scoring module configured to compute a probabilistic score for a presence of a speech, and a noise estimation module configured to estimate a noise power in the psychoacoustic bands based on information of past frames, the probabilistic score, and the computed noisy signal power. The system also includes a gain computation module configured to compute a gain for each frequency, based on a probabilistic heuristic, the probabilistic score and the information on the past frames, and a gain post-processing module configured to perform a gain time smoothing, a gain frequency smoothing, and a gain regulation for the computed gain.
    Type: Grant
    Filed: January 18, 2008
    Date of Patent: September 25, 2012
    Assignee: STMicroelectronics Asia Pacific Pte., Ltd.
    Inventors: Wenbo Zong, Yuan Wu, Sapna George
  • Patent number: 8275154
    Abstract: An apparatus for processing an audio signal and method thereof are disclosed, by which a local dynamic range of an audio signal can be adaptively normalized as well as a maximum dynamic range of the audio signal. The present invention includes receiving, by an audio processing apparatus, a signal, and feedback information estimated based on a normalizing gain; generating a noise estimation based on the signal; computing a gain filter for noise canceling, based on the noise estimation and the signal; and, obtaining a restricted gain filter by applying the feedback information to the gain filter.
    Type: Grant
    Filed: July 29, 2009
    Date of Patent: September 25, 2012
    Assignee: LG Electronics Inc.
    Inventors: Jong Ha Moon, Hyen O Oh, Joon Il Lee, Myung Hoon Lee, Yang Won Jung, Alexis Favrot, Christof Faller
  • Publication number: 20120239394
    Abstract: An erroneous detection determination device includes: a signal acquisition unit configured to acquire, from each of microphones, a plurality of audio signals relating to ambient sound including sound from a sound source in a certain direction; a result acquisition unit configured to acquire a recognition result including voice activity information indicating the inclusion of a voice activity relating to at least one of the audio signals; a calculation unit configured to calculate, for each of audio signals on the basis of the signals in respective unit times and the certain direction, a speech arrival rate representing the proportion of the sound from the certain direction to the ambient sound in each of the unit times; and an error detection unit configured to determine, on the basis of the recognition result and the speech arrival rate, whether or not the voice activity information is the result of erroneous detection.
    Type: Application
    Filed: February 28, 2012
    Publication date: September 20, 2012
    Applicant: FUJITSU LIMITED
    Inventor: Chikako MATSUMOTO
  • Patent number: 8271279
    Abstract: A speech enhancement system improves the perceptual quality of a processed voice signal. The system improves the perceptual quality of a voice signal by removing unwanted noise components from a voice signal. The system removes undesirable signals that may result in the loss of information. The system receives and analyzes signals to determine whether an undesired random or persistent signal corresponds to one or more modeled noises. When one or more noise components are detected, the noise components are substantially removed or dampened from the signal to provide a less noisy voice signal.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: September 18, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe
  • Patent number: 8271277
    Abstract: A model application unit calculates linear prediction coefficients of a multi-step linear prediction model by using discrete acoustic signals. Then, a late reverberation predictor calculates linear prediction values obtained by substituting the linear prediction coefficients and the discrete acoustic signals into linear prediction term of the multi-step linear prediction model, as predicted late reverberations. Next, a frequency domain converter converts the discrete acoustic signals to discrete acoustic signals in the frequency domain and also converts the predicted late reverberations to predicted late reverberations in the frequency domain. A late reverberation eliminator calculates relative values between the amplitude spectra of the discrete acoustic signals expressed in the frequency domain and the amplitude spectra of the predicted late reverberations expressed in the frequency domain, and provides the relative values as predicted amplitude spectra of a dereverberation signal.
    Type: Grant
    Filed: March 5, 2007
    Date of Patent: September 18, 2012
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi
  • Publication number: 20120232895
    Abstract: According to one embodiment, an apparatus for discriminating speech/non-speech of a first acoustic signal includes a weight assignment unit, a feature extraction unit, and a speech/non-speech discrimination unit. The weight assignment unit is configured to assign a weight to each frequency band, based on a frequency spectrum of the first acoustic signal including a user's speech and a frequency spectrum of a second acoustic signal including a disturbance sound. The feature extraction unit is configured to extract a feature from the frequency spectrum of the first acoustic signal, based on the weight of each frequency band. The speech/non-speech discrimination unit is configured to discriminate speech/non-speech of the first acoustic signal, based on the feature.
    Type: Application
    Filed: September 14, 2011
    Publication date: September 13, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kaoru Suzuki, Masaru Sakai, Yusuke Kida
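The weighting idea in publication 20120232895 can be illustrated with a minimal sketch. The weighting rule (share of each band's power attributable to the first signal) and the feature (a weighted mean of log band energies) are assumptions chosen for illustration; the abstract does not specify either, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def band_weights(first_spec: Sequence[float],
                 disturbance_spec: Sequence[float],
                 eps: float = 1e-12) -> List[float]:
    """Assumed rule: weight each band by first / (first + disturbance) power."""
    return [s / (s + d + eps) for s, d in zip(first_spec, disturbance_spec)]


def weighted_log_energy(first_spec: Sequence[float],
                        weights: Sequence[float],
                        eps: float = 1e-12) -> float:
    """Assumed feature: weighted mean of log band energies of the first signal."""
    total = sum(weights)
    if total <= 0:
        # No band is trusted, so fall back to the smallest representable log energy.
        return math.log(eps)
    return sum(w * math.log(s + eps) for w, s in zip(weights, first_spec)) / total
```

Bands dominated by the disturbance sound receive weights near zero, so they contribute little to the feature that a downstream speech/non-speech discriminator would see.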
  • Publication number: 20120232896
    Abstract: A voice activity detection apparatus (1) comprising: a signal condition analyzing unit (3) which analyses at least one signal parameter of an input signal to detect a signal condition SC of said input signal; at least two voice activity detection units (4-i) comprising different voice detection characteristics, wherein each voice activity detection unit (4-i) performs separately a voice activity detection of said input signal to provide a voice activity detection decision VADD; and a decision combination unit (5) which combines the voice activity detection decisions VADDs provided by said voice activity detection units (4-i) depending on the detected signal condition SC to provide a combined voice activity detection decision cVADD.
    Type: Application
    Filed: May 21, 2012
    Publication date: September 13, 2012
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Anisse TALEB, Zhe WANG, Jianfeng XU, Lei MIAO
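The decision-combination step in publication 20120232896 can be sketched as a weighted vote whose weights depend on the detected signal condition. The condition labels, the weight table, and the threshold below are hypothetical values picked for illustration, not taken from the application.

```python
from typing import Dict, List

# Hypothetical per-condition weights for two detectors (assumed values).
WEIGHTS: Dict[str, List[float]] = {
    "clean": [0.7, 0.3],   # trust detector 0 more in clean conditions
    "noisy": [0.2, 0.8],   # trust detector 1 more in noisy conditions
}


def combine_vad(decisions: List[bool], condition: str,
                threshold: float = 0.5) -> bool:
    """Combine per-detector decisions using condition-dependent weights."""
    weights = WEIGHTS[condition]
    if len(weights) != len(decisions):
        raise ValueError("one weight per detector is required")
    # Sum the weights of the detectors that voted "speech".
    score = sum(w for w, d in zip(weights, decisions) if d)
    return score >= threshold
```

With these assumed weights, the same pair of detector outputs can yield different combined decisions: `[True, False]` counts as speech under `"clean"` but not under `"noisy"`.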
  • Publication number: 20120226498
    Abstract: Motion-based voice activity detection may be provided. A data stream may be received and a determination may be made whether at least one non-audio element associated with the data stream indicates that the data stream comprises speech. In response to determining that the at least one non-audio element associated with the data stream indicates that the data stream comprises speech, a speech to text conversion may be performed on at least one audio element associated with the data stream.
    Type: Application
    Filed: March 2, 2011
    Publication date: September 6, 2012
    Applicant: MICROSOFT CORPORATION
    Inventor: Remi Ken-Sho Kwan
  • Publication number: 20120224715
    Abstract: The subject disclosure is directed towards a noise adaptive beamformer that dynamically selects between microphone array channels, based upon noise energy floor levels that are measured when no actual signal (e.g., no speech) is present. When speech (or a similar desired signal) is detected, the beamformer selects which microphone signal to use in signal processing, e.g., corresponding to the lowest noise channel. Multiple channels may be selected, with their signals combined. The beamformer transitions back to the noise measurement phase when the actual signal is no longer detected, so that the beamformer dynamically adapts as noise levels change, including on a per-microphone basis, to account for microphone hardware differences, changing noise sources, and individual microphone deterioration.
    Type: Application
    Filed: March 3, 2011
    Publication date: September 6, 2012
    Applicant: Microsoft Corporation
    Inventor: Harshavardhana N. Kikkeri
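The two-phase behavior described in publication 20120224715 (measure noise floors while no speech is present, then pick the quietest channel once speech is detected) can be sketched as follows. The class name, the exponential smoothing, and the `alpha` value are assumptions for illustration; the abstract does not say how the noise floor is tracked.

```python
from typing import List, Sequence


class NoiseAdaptiveSelector:
    """Tracks a per-microphone noise floor and selects the quietest channel."""

    def __init__(self, num_channels: int, alpha: float = 0.9) -> None:
        self.alpha = alpha  # smoothing factor (assumed value)
        self.noise_floor: List[float] = [0.0] * num_channels
        self._initialized = False

    def update_noise(self, frame_energies: Sequence[float]) -> None:
        """Call while no speech is detected: smooth each channel's noise floor."""
        if not self._initialized:
            self.noise_floor = list(frame_energies)
            self._initialized = True
            return
        self.noise_floor = [
            self.alpha * n + (1.0 - self.alpha) * e
            for n, e in zip(self.noise_floor, frame_energies)
        ]

    def select_channel(self) -> int:
        """Call when speech is detected: index of the lowest-noise channel."""
        return min(range(len(self.noise_floor)),
                   key=self.noise_floor.__getitem__)
```

Because the floor is only updated during non-speech frames, the selection adapts when a noise source moves closer to one microphone without being biased by the speech itself.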
  • Patent number: 8255209
    Abstract: A noise elimination method and apparatus. The method eliminates noise from an input signal containing a voice signal mixed with a noise signal. The method includes detecting a noise section, in which the noise signal is present, from the input signal; obtaining a weight to be used for the input signal from signals of the noise section; and filtering the input signal using the obtained weight. The method and apparatus enable a mobile robot to eliminate noise in real time and effectively detect and recognize voice.
    Type: Grant
    Filed: July 26, 2005
    Date of Patent: August 28, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Donggeon Kong, Changkyu Choi, Kiyoung Park
  • Publication number: 20120215519
    Abstract: Spatially selective augmentation of a multichannel audio signal is described.
    Type: Application
    Filed: February 21, 2012
    Publication date: August 23, 2012
    Applicant: QUALCOMM Incorporated
    Inventors: Hyun Jin Park, Kwokleung Chan, Ren Li
  • Patent number: 8249867
    Abstract: A microphone-array-based speech recognition system using blind source separation (BSS) and a target speech extraction method in the system are provided. The speech recognition system performs an independent component analysis (ICA) to separate mixed signals input through a plurality of microphones into sound-source signals, extracts one target speech spoken for speech recognition from the separated sound-source signals by using a Gaussian mixture model (GMM) or a hidden Markov model (HMM), and automatically recognizes a desired speech from the extracted target speech. Accordingly, it is possible to obtain a high speech recognition rate even in a noisy environment.
    Type: Grant
    Filed: September 30, 2008
    Date of Patent: August 21, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hoon Young Cho, Yun Keun Lee, Jeom Ja Kang, Byung Ok Kang, Kap Kee Kim, Sung Joo Lee, Ho Young Jung, Hoon Chung, Jeon Gue Park, Hyung Bae Jeon
  • Patent number: 8249883
    Abstract: A multi-channel audio decoder reconstructs multi-channel audio of more than two physical channels from a reduced set of coded channels based on correlation parameters that specify a full power cross-correlation matrix of the physical channels, or merely preserve a partial correlation matrix (such as power of the physical channels, and some subset of cross-correlations between the physical channels, or cross-correlations of the physical channels with coded or virtual channels).
    Type: Grant
    Filed: October 26, 2007
    Date of Patent: August 21, 2012
    Assignee: Microsoft Corporation
    Inventors: Sanjeev Mehrotra, Kishore Kotteri
  • Patent number: 8249868
    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: August 21, 2012
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8249270
    Abstract: A sound signal correcting apparatus converts an acquired sound signal into a phase spectrum and an amplitude spectrum by an FFT process, compares the amplitude spectrum of the obtained sound signal with a noise model so that a correction coefficient used for correcting the amplitude spectrum of the sound signal is derived, smooths the waveform of the amplitude spectrum of the sound signal using the derived correction coefficient, and converts the sound signal into a sound signal where the amplitude spectrum is corrected by performing an inverse FFT process on the phase spectrum and the smoothed amplitude spectrum.
    Type: Grant
    Filed: January 26, 2007
    Date of Patent: August 21, 2012
    Assignee: Fujitsu Limited
    Inventor: Naoshi Matsuo
  • Publication number: 20120209603
    Abstract: Techniques for acoustic voice activity detection (AVAD) are described, including detecting a signal associated with a subband from a microphone, performing an operation on data associated with the signal, the operation generating a value associated with the subband, and determining whether the value distinguishes the signal from noise by using the value to determine a signal-to-noise ratio and comparing the value to a threshold.
    Type: Application
    Filed: January 9, 2012
    Publication date: August 16, 2012
    Inventor: Zhinian Jing
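A minimal sketch of a subband decision of the kind publication 20120209603 describes: compute a value for the subband, derive a signal-to-noise ratio from it, and compare against a threshold. Using mean-square energy as the value, a decibel SNR, and a 6 dB threshold are all assumptions; the abstract names none of them.

```python
import math
from typing import Sequence


def subband_energy(samples: Sequence[float]) -> float:
    """Mean-square energy of one subband frame (the assumed 'value')."""
    return sum(s * s for s in samples) / len(samples)


def is_speech(samples: Sequence[float], noise_energy: float,
              threshold_db: float = 6.0) -> bool:
    """True if the subband SNR in dB exceeds the (assumed) threshold."""
    if noise_energy <= 0:
        raise ValueError("noise_energy must be positive")
    energy = subband_energy(samples)
    if energy <= 0:
        return False  # a silent frame cannot be speech
    snr_db = 10.0 * math.log10(energy / noise_energy)
    return snr_db > threshold_db
```

The `noise_energy` argument stands in for whatever noise estimate the system maintains for that subband; how it is obtained is outside this sketch.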
  • Publication number: 20120209604
    Abstract: The present invention relates to a method and a background estimator in a voice activity detector for updating a background noise estimate for an input signal. The input signal for a current frame is received and it is determined whether the current frame of the input signal comprises non-noise. Further, an additional determination is performed whether the current frame of the non-noise input comprises noise by analyzing characteristics at least related to correlation and energy level of the input signal, and the background noise estimate is updated if it is determined that the current frame comprises noise.
    Type: Application
    Filed: October 18, 2010
    Publication date: August 16, 2012
    Inventor: Martin Sehlstedt
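The two-stage update in publication 20120209604 can be sketched as below: a frame first classified as non-noise gets a second look based on correlation and energy, and the background estimate is updated only if that second check says the frame is noise after all. The specific test (low correlation and energy close to the current estimate), the parameter names, and every default value are assumptions for illustration.

```python
def update_background(noise_est: float, frame_energy: float,
                      primary_is_noise: bool, correlation: float,
                      max_corr: float = 0.3, max_ratio: float = 2.0,
                      alpha: float = 0.95) -> float:
    """Return the new background noise estimate for the current frame."""
    # Second-stage check (assumed form): weakly correlated frames whose
    # energy stays near the current estimate are treated as noise.
    secondary_is_noise = (correlation < max_corr
                          and frame_energy < max_ratio * noise_est)
    if primary_is_noise or secondary_is_noise:
        # Exponential smoothing toward the frame energy (assumed update rule).
        return alpha * noise_est + (1.0 - alpha) * frame_energy
    return noise_est  # leave the estimate untouched for speech-like frames
```

The second check lets the estimate follow a slowly rising noise floor even when the primary detector keeps flagging frames as non-noise.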
  • Patent number: 8244526
    Abstract: In one embodiment, a highband burst suppressor includes a first burst detector configured to detect bursts in a lowband speech signal, and a second burst detector configured to detect bursts in a corresponding highband speech signal. The lowband and highband speech signals may be different (possibly overlapping) frequency regions of a wideband speech signal. The highband burst suppressor also includes an attenuation control signal calculator configured to calculate an attenuation control signal according to a difference between outputs of the first and second burst detectors. A gain control element is configured to apply the attenuation control signal to the highband speech signal. In one example, the attenuation control signal indicates an attenuation when a burst is found in the highband speech signal but is absent from a corresponding region in time of the lowband speech signal.
    Type: Grant
    Filed: April 3, 2006
    Date of Patent: August 14, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Koen Bernard Vos, Ananthapadmanabhan Arasanipalai Kandhadai
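The attenuation logic in patent 8244526 (attenuate the highband when a burst appears there but not in the matching lowband region) can be sketched with a toy burst detector. The neighbor-ratio detector, the `ratio` of 4, and the fixed attenuation gain of 0.25 are hypothetical choices; the abstract only states that the control signal depends on the difference between the two detector outputs.

```python
from typing import List, Sequence


def burst_indicator(energies: Sequence[float], ratio: float = 4.0) -> List[float]:
    """1.0 where a frame's energy exceeds `ratio` times the mean of its
    two neighbours, else 0.0 (edge frames are never flagged)."""
    flags = [0.0] * len(energies)
    for i in range(1, len(energies) - 1):
        neighbour_mean = (energies[i - 1] + energies[i + 1]) / 2.0
        if energies[i] > ratio * neighbour_mean:
            flags[i] = 1.0
    return flags


def attenuation_gains(low_energies: Sequence[float],
                      high_energies: Sequence[float],
                      attenuation: float = 0.25) -> List[float]:
    """Per-frame highband gain: attenuate only where the highband detector
    fires and the lowband detector does not."""
    low_flags = burst_indicator(low_energies)
    high_flags = burst_indicator(high_energies)
    return [attenuation if h - l > 0 else 1.0
            for l, h in zip(low_flags, high_flags)]
```

A burst present in both bands is left alone, since it is more likely a genuine speech onset than a highband-only artifact.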
  • Patent number: 8244529
    Abstract: A method is provided for multi-pass echo residue detection. The method includes detecting audio data, and determining whether the audio data is recognized as speech. Additionally, the method categorizes the audio data recognized as speech as including an acceptable level of residual echo, and categorizes unrecognizable audio data as including an unacceptable level of residual echo. Furthermore, the method determines whether the unrecognizable audio data contains a user input, and also determines whether a duration of the user input is at least a predetermined duration, and when the user input is at least the predetermined duration, the method extracts the predetermined duration of the user input from a total duration of the user input.
    Type: Grant
    Filed: September 20, 2011
    Date of Patent: August 14, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Ngai Chiu Wong
  • Patent number: 8244528
    Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: August 14, 2012
    Assignee: Nokia Corporation
    Inventors: Riitta Elina Niemistö, Päivi Marianna Valve
  • Publication number: 20120203550
    Abstract: An interior rearview mirror system suitable for use in a vehicle includes an interior rearview mirror assembly having a mirror head and a reflective element. The mirror head includes a first microphone operable to generate a first analog signal and a second microphone operable to generate a second analog signal. The first analog signal is converted to a first digital signal by at least one analog to digital converter and the second analog signal is converted to a second digital signal by the at least one analog to digital converter. A digital sound processor is operable to process the first and second digital signals. Responsive to the processing of the first and second digital signals, the digital sound processor generates a digital output, and the digital output, at least in part, distinguishes a human voice present in the vehicle from noise present in the vehicle.
    Type: Application
    Filed: April 20, 2012
    Publication date: August 9, 2012
    Applicant: DONNELLY CORPORATION
    Inventors: Timothy G. Skiver, Joseph P. McCaw, John T. Uken, Jonathan E. DeLine, Niall R. Lynam
  • Publication number: 20120203549
    Abstract: A speech-segment determination process is performed to determine whether audio data is a speech segment. A result of the speech-segment determination process is memorized. A noise rejection process is performed to reject a noise component of the audio data while performing an adaptive process to change filter coefficients for adaptive filtration if a result of the determination process indicates that the audio data is not the speech segment. The noise component is rejected with no adaptive process if the result of the determination process indicates that the audio data is the speech segment. The determination process is performed again on the audio data having the noise component rejected and the rejection process is performed again on the audio data if a result of the determination process performed again is different from the memorized result of the determination process.
    Type: Application
    Filed: February 6, 2012
    Publication date: August 9, 2012
    Applicant: JVC KENWOOD Corporation, a corporation of Japan
    Inventor: Joji NAITO