Detect Speech In Noise Patents (Class 704/233)
  • Patent number: 8818806
    Abstract: A signal portion is extracted per frame having a specific duration from an input signal, thus generating a per-frame input signal. The per-frame input signal in the time domain is converted into a per-frame input signal in the frequency domain, thereby generating a spectral pattern of spectra. Peak spectra having peaks are detected in the spectral pattern. A harmonic spectrum is determined, in the peak spectra, having a harmonic structure showing a relationship between a fundamental pitch and a harmonic overtone.
    Type: Grant
    Filed: November 28, 2011
    Date of Patent: August 26, 2014
    Assignee: JVC KENWOOD Corporation
    Inventor: Takaaki Yamabe
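    Illustrative sketch for the abstract above: one way to pick peak spectra and test them for a harmonic structure is to look for peaks that sit at integer multiples of a candidate fundamental. The function names, the bin-based representation, and the `tolerance` and `min_harmonics` parameters are assumptions made for illustration and are not taken from the patent.

```python
def find_peaks(spectrum):
    """Return indices of strict local maxima in a magnitude spectrum."""
    return [i for i in range(1, len(spectrum) - 1)
            if spectrum[i - 1] < spectrum[i] > spectrum[i + 1]]


def harmonic_peaks(peak_bins, tolerance=0, min_harmonics=3):
    """Return the largest set of peaks lying near integer multiples of one peak.

    Each peak is tried as the fundamental; a peak p belongs to its series when
    p is within `tolerance` bins of the nearest multiple of the fundamental.
    An empty list means no series reached `min_harmonics` members.
    """
    best = []
    for f0 in peak_bins:
        series = [p for p in peak_bins
                  if p >= f0 and abs(p - round(p / f0) * f0) <= tolerance]
        if len(series) > len(best):
            best = series
    return best if len(best) >= min_harmonics else []


# Toy spectrum: peaks at bins 4, 8 and 12 form a series, bin 10 does not fit.
spectrum = [0.0] * 16
for b in (4, 8, 10, 12):
    spectrum[b] = 1.0
peaks = find_peaks(spectrum)
harmonics = harmonic_peaks(peaks)
```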
  • Publication number: 20140236594
    Abstract: A device for converting an audio signal into a visual representation, the device comprising at least one receiver for receiving the audio signal; a signal processing unit for processing the received audio signal; a converter for converting the processed audio signal into a visual representation; and projecting means for projecting the visual representation onto a display, wherein the display comprises an embedded grating structure.
    Type: Application
    Filed: October 2, 2012
    Publication date: August 21, 2014
    Inventor: Rahul Govind Kanegaonkar
  • Patent number: 8812312
    Abstract: The present invention relates to a system, method and program for speech recognition. In an embodiment of the invention a method for processing a speech signal consists of receiving a power spectrum of a speech signal and generating a log power spectrum signal of the power spectrum. The method further consists of performing discrete cosine transformation on the log power spectrum signal and cutting off cepstrum upper and lower terms of the discrete cosine transformed signal. The method further consists of performing inverse discrete cosine transformation on the signal from which the cepstrum upper and lower terms are cut off. The method further consists of converting the inverse discrete cosine transformed signal so as to bring the signal back to a power spectrum domain and filtering the power spectrum of the speech signal by using, as a filter, the signal which is brought back to the power spectrum domain.
    Type: Grant
    Filed: August 28, 2008
    Date of Patent: August 19, 2014
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
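    Illustrative sketch of the processing chain in the abstract above (log power spectrum, discrete cosine transform, cutting lower and upper cepstrum terms, inverse transform, return to the power spectrum domain, filtering). The unnormalized DCT-II pair, the `low_cut`/`high_cut` index convention, and the choice to apply the filter by per-bin multiplication are assumptions; the abstract does not specify them.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a real sequence."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def idct2(c):
    """Exact inverse of dct2 (a scaled DCT-III)."""
    n = len(c)
    return [(c[0] / 2 + sum(c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                            for k in range(1, n))) * 2 / n
            for i in range(n)]


def cepstral_filter(power_spectrum, low_cut, high_cut, floor=1e-12):
    """Filter a power spectrum with a version of itself smoothed in the cepstrum.

    Cepstral terms with index k outside low_cut <= k < high_cut are zeroed.
    """
    log_spec = [math.log(max(p, floor)) for p in power_spectrum]
    cepstrum = dct2(log_spec)
    kept = [c if low_cut <= k < high_cut else 0.0
            for k, c in enumerate(cepstrum)]
    # Back to the power spectrum domain: inverse DCT, then exponentiate.
    filt = [math.exp(v) for v in idct2(kept)]
    # Assumption: the filter is applied as a per-bin gain.
    return [p * f for p, f in zip(power_spectrum, filt)]
```

    With every cepstral term removed the filter is flat (all ones) and the spectrum passes through unchanged, which gives a quick sanity check of the transform pair.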
  • Patent number: 8812014
    Abstract: A method of determining a position of a mobile device in a wireless communication network includes: accessing mobile device audio information from the mobile device; analyzing the mobile device audio information to determine an environmental characteristic of a present environment of the mobile device; and using the environmental characteristic to affect a determination of the position of the mobile device.
    Type: Grant
    Filed: August 30, 2010
    Date of Patent: August 19, 2014
    Assignee: QUALCOMM Incorporated
    Inventor: Ju-Yong Do
  • Patent number: 8812313
    Abstract: Judgment result deriving means 74 makes a judgment between active voice and non-active voice every unit time for a time series of voice data in which the number of active voice segments and the number of non-active voice segments are already known as a number of the labeled active voice segment and a number of the labeled non-active voice segment and shapes active voice segments and non-active voice segments as the result of the judgment by comparing the length of each segment during which the voice data is consecutively judged to correspond to active voice by the judgment or the length of each segment during which the voice data is consecutively judged to correspond to non-active voice by the judgment with a duration threshold. Segments number calculating means 75 calculates the number of active voice segments and the number of non-active voice segments.
    Type: Grant
    Filed: December 7, 2009
    Date of Patent: August 19, 2014
    Assignee: NEC Corporation
    Inventors: Takayuki Arakawa, Masanori Tsujikawa
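    Illustrative sketch of shaping per-unit-time voice judgments with duration thresholds and then counting segments, as described in the abstract above. The two-pass order (fill short non-active gaps first, then drop short active bursts) and all function names are assumptions, not the patented procedure.

```python
from itertools import groupby


def runs(flags):
    """Collapse a sequence of labels into (label, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(flags)]


def flip_short_runs(flags, target, min_len):
    """Relabel every run of `target` shorter than `min_len` as the opposite label."""
    out = []
    for value, length in runs(flags):
        if value == target and length < min_len:
            value = not target
        out.extend([value] * length)
    return out


def shape_segments(flags, min_active, min_inactive):
    """Shape raw active/non-active judgments using two duration thresholds."""
    flags = [bool(f) for f in flags]
    filled = flip_short_runs(flags, False, min_inactive)  # fill short gaps
    return flip_short_runs(filled, True, min_active)      # drop short bursts


def count_segments(flags):
    """Return (number of active segments, number of non-active segments)."""
    r = runs(flags)
    return (sum(1 for v, _ in r if v), sum(1 for v, _ in r if not v))


raw = [0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
shaped = shape_segments(raw, min_active=2, min_inactive=2)
counts = count_segments(shaped)
```

    Comparing `counts` with the known labeled segment counts is one way the duration thresholds could be tuned, though the abstract does not say how the comparison is used.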
  • Patent number: 8805560
    Abstract: Systems and methods for noise based interest point density pruning are disclosed herein. The systems include determining an amount of noise in an audio sample and adjusting the amount of interest points within an audio sample fingerprint based on the amount of noise. Samples containing high amounts of noise correspondingly generate fingerprints with more interest points. The disclosed systems and methods allow reference fingerprints to be reduced in size while increasing the size of sample fingerprints. The benefits in scalability do not compromise the accuracy of an audio matching system using noise based interest point density pruning.
    Type: Grant
    Filed: October 18, 2011
    Date of Patent: August 12, 2014
    Assignee: Google Inc.
    Inventors: George Tzanetakis, Dominik Roblek, Matthew Sharifi
  • Patent number: 8804974
    Abstract: Ambient audio event detection in a personal audio device headset provides for directive response to external audible events. Depending on the type of event, an alert may be issued, speech may be communicated to another device, program material may be interrupted and/or resumed with or without repositioning, and program material may be modified or selected for compatibility with, or to overcome, the ambient environment indicated by the detected event.
    Type: Grant
    Filed: February 7, 2011
    Date of Patent: August 12, 2014
    Assignee: Cirrus Logic, Inc.
    Inventor: John L. Melanson
  • Patent number: 8798993
    Abstract: A method for detecting speech using a first microphone adapted to produce a first signal (x), and a second microphone adapted to produce a second signal (x2), the method comprising the steps of: (i) applying gain to the second signal to produce a normalised second signal, which signal is normalised relative to the first signal; (ii) constructing one or more signal components from the first signal and the normalised second signal; (iii) constructing an adaptive differential microphone (ADM) having a constructed microphone response constructed from the one or more signal components which response has at least one directional null; (iv) producing one or more ADM outputs (yf, yb) from the constructed microphone response in response to detected sound; (v) computing a ratio of a parameter of either a first signal component or a constructed microphone response to a parameter of an output of the ADM; (vi) comparing the ratio to an adaptive threshold value; (vii) detecting speech if the ratio is greater than or equ
    Type: Grant
    Filed: November 19, 2010
    Date of Patent: August 5, 2014
    Assignee: NXP, B.V.
    Inventors: Patrick Kechichian, Cornelis Pieter Janse, Rene Martinus Maria Derkx, Wouter Joos Tirry
  • Patent number: 8798992
    Abstract: A signal processing apparatus, system and software product for audio modification/substitution of a background noise generated during an event including, but not limited to, substituting or partially substituting a noise signal from one or more microphones by a pre-recorded noise, and/or selecting one or more noise signals from a plurality of microphones for further processing in real-time or near real-time broadcasting.
    Type: Grant
    Filed: May 18, 2011
    Date of Patent: August 5, 2014
    Assignee: Disney Enterprises, Inc.
    Inventors: Michael Gay, Jed Drake, Anthony Bailey
  • Patent number: 8798985
    Abstract: A method for interpreting a dialogue between two terminals includes establishing a communication channel between interpretation terminals of two parties in response to an interpretation request; specifying a language of an initiating party and a language of the other party in each of the interpretation terminals of the two parties by exchanging information about the language of the initiating party used in the interpretation terminal of the initiating party and the language of the other party used in the interpretation terminal of the other party via the communication channel; recognizing speech uttered from the interpretation terminal of the initiating party; translating the speech recognized by the interpretation terminal of the initiating party into the language of the other party; and transmitting a sentence translated into the language of the other party to the interpretation terminal of the other party.
    Type: Grant
    Filed: June 2, 2011
    Date of Patent: August 5, 2014
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Seung Yun, Sanghun Kim
  • Publication number: 20140214418
    Abstract: A sound processing device includes a first noise suppression unit configured to suppress a noise component included in an input sound signal using a first suppression amount, a second noise suppression unit configured to suppress the noise component included in the input sound signal using a second suppression amount greater than the first suppression amount, a speech section detection unit configured to detect whether the sound signal whose noise component has been suppressed by the second noise suppression unit includes a speech section having a speech for every predetermined time, and a speech recognition unit configured to perform a speech recognizing process on a section, which is detected to be a speech section by the speech section detection unit, in the sound signal whose noise component has been suppressed by the first noise suppression unit.
    Type: Application
    Filed: January 15, 2014
    Publication date: July 31, 2014
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro NAKADAI, Keisuke NAKAMURA, Tatsuya HIGUCHI
  • Patent number: 8793128
    Abstract: A speech signal processing system that includes a speech input unit for inputting a speech signal; input speech storage unit for storing an input speech signal that is the speech signal inputted through the speech input unit; characteristic estimation unit for referring to the input speech signal stored in the input speech storage unit, and estimating characteristics of an input speech indicated by the input speech signal, the characteristics including an environmental sound included in the input speech signal; reference speech output unit for causing a predetermined speech signal that becomes a reference speech, to output; and characteristic adding unit for adding the characteristics of the input speech estimated by the characteristic estimation unit, in a reference speech signal that is the speech signal caused to output by the reference speech output unit.
    Type: Grant
    Filed: February 3, 2012
    Date of Patent: July 29, 2014
    Assignee: NEC Corporation
    Inventor: Kiyokazu Miki
  • Publication number: 20140207447
    Abstract: Embodiments of the present invention provide a voice identification method, which includes: obtaining voice data; obtaining a confidence value according to the voice data; obtaining a noise scenario according to the voice data; obtaining a confidence threshold corresponding to the noise scenario; and if the confidence value is greater than or equal to the confidence threshold, processing the voice data. An apparatus is also provided. The method and apparatus that flexibly adjust the confidence threshold according to the noise scenario greatly improve a voice identification rate under a noise environment.
    Type: Application
    Filed: December 9, 2013
    Publication date: July 24, 2014
    Applicant: Huawei Device Co., Ltd.
    Inventors: Hongrui JIANG, Xiyong WANG, Junbin LIANG, Weijun ZHENG, Junyang ZHOU
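    Illustrative sketch of the decision rule in the abstract above: the confidence threshold is looked up per noise scenario and the voice data is processed only when the confidence value meets it. The scenario names, the threshold values, and the fallback default are hypothetical placeholders.

```python
# Hypothetical scenario-to-threshold table; the values are placeholders.
THRESHOLDS = {"quiet": 0.40, "office": 0.50, "street": 0.60}


def should_process(confidence, noise_scenario, thresholds=THRESHOLDS, default=0.50):
    """Return True when the confidence value is at least the scenario's threshold."""
    return confidence >= thresholds.get(noise_scenario, default)
```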
  • Publication number: 20140207446
    Abstract: Embodiments are disclosed that relate to the use of speech inputs including indefinite quantitative terms as computing device inputs. For example, one disclosed embodiment provides a method of operating a computing device, the method including receiving a speech input comprising an indefinite quantitative term, determining a definite quantity corresponding to the indefinite quantitative term, and applying the definite quantity to an action performed via the computing device in response to the speech input.
    Type: Application
    Filed: January 24, 2013
    Publication date: July 24, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Christian Klein, Gregg Wygonik
  • Publication number: 20140200887
    Abstract: A sound processing device includes a noise suppression unit configured to suppress a noise component included in an input sound signal, an auxiliary noise addition unit configured to add auxiliary noise to the input sound signal, whose noise component has been suppressed by the noise suppression unit, to generate an auxiliary noise-added signal, a distortion calculation unit configured to calculate a degree of distortion of the auxiliary noise-added signal, and a control unit configured to control an addition amount by which the auxiliary noise addition unit adds the auxiliary noise based on the degree of distortion calculated by the distortion calculation unit.
    Type: Application
    Filed: January 7, 2014
    Publication date: July 17, 2014
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro NAKADAI, Keisuke NAKAMURA, Daisuke KIMOTO
  • Patent number: 8781826
    Abstract: A method for operating a speech recognition system is described in which a speech signal (S1) of a user is detected and analyzed so as to recognize speech information contained in the speech signal (S1). The speech recognition system determines a reception quality value (SQ) or a noise value which represents a current reception quality. The speech recognition system is switched over to a mode of operation which is less sensitive to noise and/or outputs an alert signal (SW) to the user when the reception quality value (SQ) drops below a given reception quality threshold or when the noise value exceeds a noise threshold. An appropriate speech recognition system is also described.
    Type: Grant
    Filed: October 24, 2003
    Date of Patent: July 15, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Albert Kooiman
  • Patent number: 8781821
    Abstract: A method is disclosed for controlling a voice-activated device by interpreting a spoken command as a series of voiced and non-voiced intervals. A responsive action is then performed according to the number of voiced intervals in the command. The method is well-suited to applications having a small number of specific voice-activated response functions. Applications using the inventive method offer numerous advantages over traditional speech recognition systems including speaker universality, language independence, no training or calibration needed, implementation with simple microcontrollers, and extremely low cost. For time-critical applications such as pulsers and measurement devices, where fast reaction is crucial to catch a transient event, the method provides near-instantaneous command response, yet versatile voice control.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: July 15, 2014
    Assignee: Zanavox
    Inventor: David Edward Newman
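    Illustrative sketch of mapping a spoken command to an action by counting voiced intervals, as in the abstract above. Frame energies stand in for whatever voicing measure a real device would use; the `threshold`, `min_frames`, and the action table are assumptions.

```python
def count_voiced_intervals(frame_energies, threshold, min_frames=2):
    """Count runs of at least `min_frames` consecutive frames at or above `threshold`."""
    count = 0
    run = 0
    for energy in frame_energies:
        if energy >= threshold:
            run += 1
            if run == min_frames:  # count each interval once, when it qualifies
                count += 1
        else:
            run = 0
    return count


# Hypothetical mapping from number of voiced intervals to a device action.
ACTIONS = {1: "start", 2: "stop", 3: "reset"}


def command_for(frame_energies, threshold):
    """Return the action for the command, or None if the count is unmapped."""
    return ACTIONS.get(count_voiced_intervals(frame_energies, threshold))


energies = [0, 5, 6, 0, 0, 7, 8, 9, 0, 4, 0]
action = command_for(energies, threshold=3)
```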
  • Publication number: 20140195228
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Application
    Filed: March 10, 2014
    Publication date: July 10, 2014
    Applicant: AT&T INTELLECTUAL PROPERTY II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8775168
    Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.
    Type: Grant
    Filed: August 3, 2007
    Date of Patent: July 8, 2014
    Assignee: STMicroelectronics Asia Pacific PTE, Ltd.
    Inventors: Karthik Muralidhar, Anoop Kumar Krishna
  • Patent number: 8775173
    Abstract: An erroneous detection determination device includes: a signal acquisition unit configured to acquire, from each of microphones, a plurality of audio signals relating to ambient sound including sound from a sound source in a certain direction; a result acquisition unit configured to acquire a recognition result including voice activity information indicating the inclusion of a voice activity relating to at least one of the audio signals; a calculation unit configured to calculate, for each of audio signals on the basis of the signals in respective unit times and the certain direction, a speech arrival rate representing the proportion of the sound from the certain direction to the ambient sound in each of the unit times; and an error detection unit configured to determine, on the basis of the recognition result and the speech arrival rate, whether or not the voice activity information is the result of erroneous detection.
    Type: Grant
    Filed: February 28, 2012
    Date of Patent: July 8, 2014
    Assignee: Fujitsu Limited
    Inventor: Chikako Matsumoto
  • Patent number: 8775182
    Abstract: Machine-readable media, methods, apparatus and system for speech segmentation are described. In some embodiments, a fuzzy rule may be determined to discriminate a speech segment from a non-speech segment. An antecedent of the fuzzy rule may include an input variable and an input variable membership. A consequent of the fuzzy rule may include an output variable and an output variable membership. An instance of the input variable may be extracted from a segment. An input variable membership function associated with the input variable membership and an output variable membership function associated with the output variable membership may be trained. The instance of the input variable, the input variable membership function, the output variable, and the output variable membership function may be operated, to determine whether the segment is the speech segment or the non-speech segment.
    Type: Grant
    Filed: April 12, 2013
    Date of Patent: July 8, 2014
    Assignee: Intel Corporation
    Inventors: Robert Du, Ye Tao, Daren Zu
  • Patent number: 8775166
    Abstract: An encoding method includes: extracting core layer characteristic parameters and enhancement layer characteristic parameters of a background noise signal, encoding the core layer characteristic parameters and enhancement layer characteristic parameters to obtain a core layer codestream and an enhancement layer codestream. The disclosure also provides an encoding device, a decoding device and method, an encapsulating method, a reconstructing method, an encoding-decoding system and an encoding-decoding method. By describing the background noise signal with the enhancement layer characteristic parameters, the background noise signal can be processed by using more accurate encoding and decoding method, so as to improve the quality of encoding and decoding the background noise signal.
    Type: Grant
    Filed: August 14, 2009
    Date of Patent: July 8, 2014
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Hualin Wan, Libin Zhang
  • Patent number: 8775171
    Abstract: A method and computing system for suppressing noise in an audio signal, comprising: receiving the audio signal at signal processing means; determining that another signal is input to the signal processing means, the input signal resulting from an activity which generates noise in the audio signal; and selectively suppressing noise in the audio signal in dependence on the determination that the input signal is input to the signal processing means to thereby suppress the generated noise in the audio signal.
    Type: Grant
    Filed: June 23, 2010
    Date of Patent: July 8, 2014
    Assignee: Skype
    Inventors: Karsten Vandborg Sorensen, Jon Bergenheim, Koen Vos
  • Publication number: 20140188467
    Abstract: A voice activity detector (VAD) combines the use of an acoustic VAD and a vibration sensor VAD as appropriate to the conditions under which a host device is operated. The VAD includes a first detector receiving a first signal and a second detector receiving a second signal. The VAD includes a first VAD component coupled to the first and second detectors. The first VAD component determines that the first signal corresponds to voiced speech when energy resulting from at least one operation on the first signal exceeds a first threshold. The VAD includes a second VAD component coupled to the second detector. The second VAD component determines that the second signal corresponds to voiced speech when a ratio of a second parameter corresponding to the second signal and a first parameter corresponding to the first signal exceeds a second threshold.
    Type: Application
    Filed: August 5, 2013
    Publication date: July 3, 2014
    Applicant: AliphCom
    Inventors: Zhinian Jing, Nicolas Jean Petit, Gregory C. Burnett
  • Patent number: 8768692
    Abstract: A speech recognition apparatus predicts, based on the occurrence cycle and duration time of impulse noise that occurs periodically, a segment in which impulse noise occurs, and executes speech recognition processing based on the feature components of the remaining frames excluding a feature component of a frame corresponding to the predicted segment, or the feature components extracted from frames created from sound data excluding a part corresponding to the predicted segment.
    Type: Grant
    Filed: May 3, 2007
    Date of Patent: July 1, 2014
    Assignee: Fujitsu Limited
    Inventor: Shoji Hayakawa
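    Illustrative sketch of the frame-exclusion idea in the abstract above: from an occurrence cycle and duration, predict where periodic impulse noise falls, then keep only the frames that do not overlap a predicted segment. Times are integer milliseconds; the `first_onset` parameter and the half-open interval convention are assumptions.

```python
def predicted_noise_segments(first_onset, cycle, duration, total_length):
    """Return [start, end) intervals where periodic impulse noise is expected."""
    segments = []
    start = first_onset
    while start < total_length:
        segments.append((start, min(start + duration, total_length)))
        start += cycle
    return segments


def frames_to_keep(num_frames, frame_len, hop, segments):
    """Return indices of frames that do not overlap any predicted noise segment."""
    keep = []
    for i in range(num_frames):
        a, b = i * hop, i * hop + frame_len
        if not any(a < end and start < b for start, end in segments):
            keep.append(i)
    return keep


segments = predicted_noise_segments(first_onset=500, cycle=1000,
                                    duration=100, total_length=3000)
clean_frames = frames_to_keep(num_frames=12, frame_len=250, hop=250,
                              segments=segments)
```

    Feature extraction for recognition would then run only on `clean_frames`.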
  • Patent number: 8768406
    Abstract: A system for background sound removal, the system may include: a noise reduction circuit arranged to apply a background sound reduction process on multiple samples of background sound and speech to provide first signals that comprise residual background sound; a background cancellation circuit arranged to remove the residual background sound from the first signals to provide second signals; and an output circuit arranged to output a mixture of the second signals and at least zero selected signals to a sound generating circuit that is arranged to output audio signals representative of the mixture.
    Type: Grant
    Filed: October 11, 2011
    Date of Patent: July 1, 2014
    Assignee: Bone Tone Communications Ltd.
    Inventors: Arie Heiman, Uri Yehuday
  • Publication number: 20140180685
    Abstract: According to an embodiment, a signal processing device includes a background calculator, a signal generator, an extractor, a similarity calculator, and a mixer. The background calculator is configured to calculate a first background signal in which a speech signal is removed, based on the acoustic signals. The signal generator is configured to generate a reference signal from at least one of the acoustic signals. The extractor is configured to extract a second background signal by removing a speech signal from the reference signal. The similarity calculator is configured to calculate a similarity between feature data of the background signals. The mixer is configured to calculate a weighted sum of the background signals in such a way that a greater weight is given to the first background signal as the similarity is higher and a greater weight is given to the second background signal as the similarity is lower.
    Type: Application
    Filed: December 20, 2013
    Publication date: June 26, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Toshiyuki ONO, Makoto HIROHATA, Masashi NISHIYAMA, Toru TANIGUCHI
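    Illustrative sketch of the mixer in the abstract above: the two background signals are combined as a weighted sum in which higher similarity favors the first background signal. Cosine similarity as the similarity measure and a weight equal to the clamped similarity are assumptions; the abstract only fixes the direction of the weighting.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mix_backgrounds(first_bg, second_bg, similarity):
    """Weighted sum: weight w on the first signal, 1 - w on the second."""
    w = min(max(similarity, 0.0), 1.0)  # clamp the weight to [0, 1]
    return [w * a + (1.0 - w) * b for a, b in zip(first_bg, second_bg)]
```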
  • Patent number: 8762143
    Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded time and speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.
    Type: Grant
    Filed: May 29, 2007
    Date of Patent: June 24, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Mazin Gilbert
  • Patent number: 8761410
    Abstract: The present technology provides robust, high quality dereverberation of an acoustic signal which can overcome or substantially alleviate the problems associated with the diverse and dynamic nature of the surrounding acoustic environment. The present technology utilizes acoustic signals received from a plurality of microphones to carry out a multi-faceted analysis which accurately identifies reverberation based on the correlation between the acoustic signals. Due to the spatial distance between the microphones and the variation in reflection paths present in the surrounding acoustic environment, the correlation between the acoustic signals can be used to accurately determine whether portions of one or more of the acoustic signals contain desired speech or undesired reverberation. These correlation characteristics are then used to generate signal modifications applied to one or more of the received acoustic signals to preserve speech and reduce reverberation.
    Type: Grant
    Filed: December 8, 2010
    Date of Patent: June 24, 2014
    Assignee: Audience, Inc.
    Inventors: Carlos Avendano, Carlo Murgia
  • Patent number: 8762144
    Abstract: A method and apparatus for detecting voice activity are disclosed. The method of detecting voice activity includes: extracting a feature parameter from a frame signal; determining whether the frame signal is a voice signal or a noise signal by comparing the feature parameter with model parameters of a plurality of comparison signals, respectively; and outputting the frame signal when the frame signal is determined to be a voice signal. The apparatus includes a classifier module which extracts a feature parameter from a frame signal and generates labeling information with respect to the frame signal by comparing the feature parameter with model parameters of a plurality of comparison signals; and a voice detection unit which determines whether the frame signal is a noise signal or a voice signal with reference to the labeling information and outputs the frame signal when the frame signal is determined to be a voice signal.
    Type: Grant
    Filed: May 3, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nam-gook Cho, Eun-kyoung Kim
  • Patent number: 8762147
    Abstract: A signal portion is extracted from an input signal for each frame having a specific duration to generate a per-frame input signal. The per-frame input signal in a time domain is converted into a per-frame input signal in a frequency domain, thereby generating a spectral pattern. Subband average energy is derived in each of subbands adjacent one another in the spectral pattern. The subband average energy is compared in at least one subband pair of a first subband and a second subband that is a higher frequency band than the first subband, the first and second subbands being consecutive subbands in the spectral pattern. It is determined that the per-frame input signal includes a consonant segment if the subband average energy of the second subband is higher than the subband average energy of the first subband.
    Type: Grant
    Filed: February 1, 2012
    Date of Patent: June 24, 2014
    Assignee: JVC KENWOOD Corporation
    Inventors: Akiko Akechi, Takaaki Yamabe
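    Illustrative sketch of the subband comparison in the abstract above: average energy is computed per subband, and a frame is flagged as containing a consonant segment when a higher subband has more average energy than the adjacent lower one. Equal-width subbands and the `min_rising_pairs` parameter are assumptions for illustration.

```python
def subband_average_energies(power_spectrum, band_width):
    """Average energy of consecutive subbands, each `band_width` bins wide."""
    chunks = (power_spectrum[i:i + band_width]
              for i in range(0, len(power_spectrum), band_width))
    return [sum(chunk) / len(chunk) for chunk in chunks]


def looks_like_consonant(power_spectrum, band_width, min_rising_pairs=1):
    """True when enough adjacent subband pairs have more energy in the higher band."""
    energies = subband_average_energies(power_spectrum, band_width)
    rising = sum(1 for low, high in zip(energies, energies[1:]) if high > low)
    return rising >= min_rising_pairs
```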
  • Patent number: 8762145
    Abstract: According to one embodiment, a voice recognition apparatus includes a determination unit, an estimating unit, and a voice recognition unit. The determination unit determines whether a component with a frequency of not less than 1000 Hz and with a level not lower than a predetermined level is included in a sound input from a plurality of microphones. The estimating unit estimates a sound source direction of the sound when the determination unit determines that the component is included in the sound. The voice recognition unit recognizes whether the sound obtained in the sound source direction coincides with a voice model registered beforehand.
    Type: Grant
    Filed: March 26, 2012
    Date of Patent: June 24, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kazushige Ouchi, Toshiyuki Koga, Daisuke Yamamoto, Miwako Doi
  • Patent number: 8762138
    Abstract: The present invention relates to a method as well as to a computing device (20) for editing a noise-database (13) containing noise information, said noise information being derived from noise signals within an audio stream (19). In order to enhance possibilities to create and utilize context information which emerge from tracking noise signals from an audio stream, for example a telephone call, the above method is characterized by the following steps: A) in a localizing step (14), determining geographical data of the location the noise signals originate from; B) in an analyzing step (15), analyzing the noise signals with reference to the noise content; C) in a linking step, linking the analyzed noise signals to said geographical data to create noise information; D) in a storing step, storing said noise information within said noise-database (13).
    Type: Grant
    Filed: August 30, 2010
    Date of Patent: June 24, 2014
    Assignee: Vodafone Holding GmbH
    Inventors: Stefan Holtel, Jad Noueihed
  • Publication number: 20140172424
    Abstract: Techniques are disclosed for using the hardware and/or software of the mobile device to obscure speech in the audio data before a context determination is made by a context awareness application using the audio data. In particular, a subset of a continuous audio stream is captured such that speech (words, phrases and sentences) cannot be reliably reconstructed from the gathered audio. The subset is analyzed for audio characteristics, and a determination can be made regarding the ambient environment.
    Type: Application
    Filed: February 21, 2014
    Publication date: June 19, 2014
    Applicant: QUALCOMM Incorporated
    Inventors: Leonard Henry Grokop, Vidya Narayanan, James W. Dolter, Sanjiv Nanda
  • Publication number: 20140163979
    Abstract: A voice processing device includes: a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute, receiving a first signal including a plurality of voice segments; controlling such that a non-voice segment with a length equal to or greater than a predetermined first threshold value exists between at least one of the plurality of voice segments; and outputting a second signal including the plurality of voice segments and the controlled non-voice segment.
    Type: Application
    Filed: November 7, 2013
    Publication date: June 12, 2014
    Applicant: FUJITSU LIMITED
    Inventors: Masanao SUZUKI, Takeshi OTANI, Taro TOGAWA
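    A rough sketch of the pause control described in this abstract, assuming the input has already been split into labeled voice and pause segments and that added pause samples are plain silence. The segment representation and the name `enforce_min_pause` are hypothetical, and for simplicity the sketch pads every short pause rather than only those between voice segments.

```python
from typing import List, Tuple

# A segment is a (kind, samples) pair, where kind is "voice" or "pause".
Segment = Tuple[str, List[float]]


def enforce_min_pause(segments: List[Segment], min_pause: int) -> List[Segment]:
    """Return segments in which every pause has at least `min_pause` samples.

    Voice segments pass through unchanged. A silent pause is inserted
    where two voice segments are directly adjacent.
    """
    out: List[Segment] = []
    for kind, samples in segments:
        if kind == "voice" and out and out[-1][0] == "voice":
            # Two voice segments back to back: insert a silent pause.
            out.append(("pause", [0.0] * min_pause))
        if kind == "pause" and len(samples) < min_pause:
            # Extend a short pause with silence up to the threshold.
            samples = samples + [0.0] * (min_pause - len(samples))
        out.append((kind, samples))
    return out


if __name__ == "__main__":
    first = [("voice", [1.0, 1.0]), ("pause", [0.0]), ("voice", [2.0]), ("voice", [3.0])]
    for kind, samples in enforce_min_pause(first, min_pause=3):
        print(kind, len(samples))
```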
  • Publication number: 20140163978
    Abstract: Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternately, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
    Type: Application
    Filed: December 11, 2012
    Publication date: June 12, 2014
    Applicant: AMAZON TECHNOLOGIES, INC.
    Inventor: Amazon Technologies, Inc.
  • Patent number: 8751227
    Abstract: Parameters of a first variation model, a second variation model and an environment-independent acoustic model are estimated in such a way that an integrated degree of fitness obtained by integrating a degree of fitness of the first variation model to the sample speech data, a degree of fitness of the second variation model to the sample speech data, and a degree of fitness of the environment-independent acoustic model to the sample speech data becomes the maximum. Therefore, when constructing an acoustic model by using sample speech data affected by a plurality of acoustic environments, the effect on a speech which is caused by each of the acoustic environments can be extracted with high accuracy.
    Type: Grant
    Filed: February 10, 2009
    Date of Patent: June 10, 2014
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Patent number: 8751224
    Abstract: The headset comprises: a physiological sensor suitable for being coupled to the cheek or the temple of the wearer of the headset and for picking up non-acoustic voice vibration transmitted by internal bone conduction; lowpass filter means for filtering the signal as picked up; a set of microphones picking up acoustic voice vibration transmitted by air from the mouth of the wearer of the headset; highpass filter means and noise-reduction means for acting on the signals picked up by the microphones; and mixer means for combining the filtered signals to output a signal representative of the speech uttered by the wearer of the headset. The signal of the physiological sensor is also used by means for calculating the cutoff frequency of the lowpass and highpass filters and by means for calculating the probability that speech is absent.
    Type: Grant
    Filed: April 18, 2012
    Date of Patent: June 10, 2014
    Assignee: Parrot
    Inventors: Michael Herve, Guillaume Vitte
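    The lowpass/highpass split and mixing in this abstract can be sketched as a simple crossover. This is an illustration under stated assumptions, not the patented design: a one-pole filter stands in for the unspecified filters, the highpass is formed as the complement of the lowpass, the coefficient `alpha` is fixed rather than computed from the sensor signal, and noise reduction is omitted. All function names are hypothetical.

```python
from typing import List, Sequence


def one_pole_lowpass(x: Sequence[float], alpha: float) -> List[float]:
    """First-order lowpass: state moves a fraction `alpha` toward each sample."""
    y: List[float] = []
    state = 0.0
    for s in x:
        state += alpha * (s - state)
        y.append(state)
    return y


def mix_headset(bone: Sequence[float], mic: Sequence[float], alpha: float) -> List[float]:
    """Combine the low band of the bone-conduction sensor with the high band of the mic.

    The mic high band is computed as mic minus its own lowpass, so the two
    bands are complementary and sum back to the input when both sources agree.
    """
    bone_low = one_pole_lowpass(bone, alpha)
    mic_low = one_pole_lowpass(mic, alpha)
    return [bl + (m - ml) for bl, m, ml in zip(bone_low, mic, mic_low)]


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.0, -1.0] * 5
    mixed = mix_headset(signal, signal, alpha=0.3)
    print(max(abs(a - b) for a, b in zip(mixed, signal)))
```

    Feeding the same signal to both inputs reproduces it up to rounding, which is a quick sanity check that the two bands are complementary.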
  • Patent number: 8744839
    Abstract: Target word recognition includes: obtaining a candidate word set and corresponding characteristic computation data, the candidate word set comprising text data, and characteristic computation data being associated with the candidate word set; performing segmentation of the characteristic computation data to generate a plurality of text segments; combining the plurality of text segments to form a text data combination set; determining an intersection of the candidate word set and the text data combination set, the intersection comprising a plurality of text data combinations; determining a plurality of designated characteristic values for the plurality of text data combinations; based at least in part on the plurality of designated characteristic values and according to at least a criterion, recognizing among the plurality of text data combinations target words whose characteristic values fulfill the criterion.
    Type: Grant
    Filed: September 22, 2011
    Date of Patent: June 3, 2014
    Assignee: Alibaba Group Holding Limited
    Inventors: Haibo Sun, Yang Yang, Yining Chen
  • Patent number: 8744846
    Abstract: Provided are a noise state determination method and an apparatus and a computer readable recording medium therefor. A noisy speech signal processing method according to the present invention includes calculating a transformed spectrum by transforming an input noisy speech signal to a frequency domain; calculating a smoothed magnitude spectrum by reducing magnitude differences of the transformed spectrum between neighboring frames; calculating a search spectrum which represents an estimated noise component of the smoothed magnitude spectrum; and calculating an identification ratio which represents a ratio of a noise component included in the input noisy speech signal, by using the smoothed magnitude spectrum and the search spectrum. Since a small amount of calculation is required and a large-capacity memory is not required, the present invention may be easily implemented as hardware or software.
    Type: Grant
    Filed: November 27, 2008
    Date of Patent: June 3, 2014
    Assignee: Transono Inc.
    Inventors: Sung Il Jung, Dong Gyung Ha
  • Patent number: 8744845
    Abstract: A noise estimation method for a noisy speech signal according to an embodiment of the present invention includes the steps of approximating a transformation spectrum by transforming an input noisy speech signal to a frequency domain, calculating a smoothed magnitude spectrum having a decreased difference in a magnitude of the transformation spectrum between neighboring frames, calculating a search spectrum to represent an estimated noise component of the smoothed magnitude spectrum, and estimating a noise spectrum by using a recursive average method using an adaptive forgetting factor defined by using the search spectrum. According to an embodiment of the present invention, the amount of calculation for noise estimation is small, and large-capacity memory is not required. Accordingly, the present invention can be easily implemented in hardware or software. Further, the accuracy of noise estimation can be increased because an adaptive procedure can be performed on each frequency sub-band.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: June 3, 2014
    Assignee: Transono Inc.
    Inventors: Sung Il Jung, Dong Gyung Ha
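    The pipeline in this abstract (smoothed magnitude spectrum, search spectrum, recursive average with an adaptive forgetting factor) can be sketched as follows. The update rules and constants here are assumptions chosen for illustration, since the abstract gives no formulas: the search spectrum drops immediately with the smoothed spectrum and rises slowly, and the update rate scales with the ratio of the search spectrum to the smoothed spectrum. The name `estimate_noise` and its parameters are hypothetical.

```python
from typing import List, Sequence


def estimate_noise(
    frames: Sequence[Sequence[float]],
    smooth: float = 0.7,
    rise: float = 0.95,
    max_rate: float = 0.2,
) -> List[float]:
    """Estimate a per-bin noise magnitude spectrum from magnitude-spectrum frames."""
    smoothed = list(frames[0])
    search = list(frames[0])
    noise = list(frames[0])
    for frame in frames[1:]:
        for k, mag in enumerate(frame):
            # Smoothed magnitude: reduce frame-to-frame differences.
            smoothed[k] = smooth * smoothed[k] + (1.0 - smooth) * mag
            # Search spectrum: follows decreases at once, increases slowly.
            slow = rise * search[k] + (1.0 - rise) * smoothed[k]
            search[k] = min(smoothed[k], slow)
            # Adaptive rate: close to max_rate when the bin looks like noise.
            ratio = search[k] / smoothed[k] if smoothed[k] > 0.0 else 1.0
            rate = max_rate * ratio
            # Recursive average of the smoothed spectrum.
            noise[k] = (1.0 - rate) * noise[k] + rate * smoothed[k]
    return noise


if __name__ == "__main__":
    quiet = [[1.0, 1.0] for _ in range(20)]
    burst = [[10.0, 1.0] for _ in range(5)]  # speech-like burst in bin 0 only
    print(estimate_noise(quiet + burst))
```

    In this sketch a short burst in one bin lowers that bin's ratio, so its noise estimate rises far more slowly than the burst itself, while the other bin is unaffected because each bin adapts independently.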
  • Patent number: 8744849
    Abstract: A microphone-array-based speech recognition system combines a noise cancelling technique for cancelling noise of input speech signals from an array of microphones, according to at least an inputted threshold. The system receives noise-cancelled speech signals outputted by a noise masking module through at least a speech model and at least a filler model, then computes a confidence measure score with the at least a speech model and the at least a filler model for each threshold and each noise-cancelled speech signal, and adjusts the threshold to continue the noise cancelling for achieving a maximum confidence measure score, thereby outputting a speech recognition result related to the maximum confidence measure score.
    Type: Grant
    Filed: October 12, 2011
    Date of Patent: June 3, 2014
    Assignee: Industrial Technology Research Institute
    Inventor: Hsien-Cheng Liao
  • Patent number: 8744847
    Abstract: Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify phones or speech sounds spoken by the key child, independent of content. The number and type of phones is analyzed to automatically assess the key child's expressive language development. The assessment can result in a standard score, an estimated developmental age, or an estimated mean length of utterance.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: June 3, 2014
    Assignee: LENA Foundation
    Inventors: Terrance Paul, Dongxin Xu, Jeffrey A. Richards
  • Patent number: 8737640
    Abstract: Disclosed herein are a system, method and apparatus with environmental noise cancellation. The instant disclosure is particularly adapted to a receiver module having at least two inputs. The two inputs respectively receive a main audio portion and the audio with a majority of environmental noise. The system first calibrates the audio signals to reduce the error caused by the difference between the two inputs. An adaptive beamforming technology and a speech extractor are respectively used to extract the environmental noise portion with less main audio and the main audio portion with less noise. After a process of time-to-frequency domain transformation, a non-linear noise suppression technology is introduced into estimating the environmental noise and acquiring a gain. After noise suppression processed with the gain, a sequence of audio signals is output after a frequency-to-time domain transformation.
    Type: Grant
    Filed: January 7, 2011
    Date of Patent: May 27, 2014
    Assignee: C-Media Electronics Inc.
    Inventors: Yuepeng Li, Fenghai Qiu, Hua Gao
  • Patent number: 8738382
    Abstract: Audio feedback time shifted filtering systems and methods are presented. The systems and methods facilitate separation of program audio feedback from received environmental audio (e.g., audio sensed by a microphone.) The separation of the program audio feedback reduces interference from program content audio feedback on performance of voice recognition operations. In one embodiment of a personal video recorder audio filter method, environmental audio patterns are received, an audio feedback time shift filter process is executed for separating out program content from the environmental audio patterns, and voice recognition is performed on the filtered environment audio patterns (without interference from program audio content feedback). The time shift or deterministic delay provides a closer correlation between program audio content and program audio content feedback received at the microphone and permits input timing compensation to compensate for feedback loop delays.
    Type: Grant
    Filed: December 16, 2005
    Date of Patent: May 27, 2014
    Assignee: Nvidia Corporation
    Inventor: William Samuel Herz
  • Patent number: 8738376
    Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: May 27, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
  • Patent number: 8738367
    Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.
    Type: Grant
    Filed: February 18, 2010
    Date of Patent: May 27, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Patent number: 8731207
    Abstract: An embodiment of an apparatus for computing control information for a suppression filter for filtering a second audio signal to suppress an echo based on a first audio signal includes a computer having a value determiner for determining at least one energy-related value for a band-pass signal of at least two temporally successive data blocks of at least one signal of a group of signals. The computer further includes a mean value determiner for determining at least one mean value of the at least one determined energy-related value for the band-pass signal. The computer further includes a modifier for modifying the at least one energy-related value for the band-pass signal on the basis of the determined mean value for the band-pass signal. The computer further includes a control information computer for computing the control information for the suppression filter on the basis of the at least one modified energy-related value.
    Type: Grant
    Filed: January 12, 2009
    Date of Patent: May 20, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Fabian Kuech, Markus Kallinger, Christof Faller, Alexis Favrot
  • Patent number: 8731949
    Abstract: The present invention relates to a method and system for audio encoding and decoding and a method for estimating a noise level. The method for estimating a noise level in the present invention comprises: estimating a power spectrum of an audio signal to be encoded according to a frequency domain coefficient of the audio signal to be encoded; and estimating a noise level of a zero bit encoding subband audio signal according to the power spectrum obtained by calculating, this noise level being used for controlling an energy proportion of noise filling to spectral band replication during decoding; wherein a zero bit encoding subband refers to an encoding subband whose allocated bit number is zero. The present invention can reconstruct the uncoded frequency domain coefficients well.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: May 20, 2014
    Assignee: ZTE Corporation
    Inventors: Dongping Jiang, Hao Yuan, Ke Peng, Guoming Chen, Jiali Li
  • Publication number: 20140136194
    Abstract: The methods, apparatus, and systems described herein are designed to identify fraudulent callers. A voice print of a call is created and compared to known voice prints to determine if it matches one or more of the known voice prints. The methods include a pre-processing step to separate speech from non-speech, selecting a number of elements that affect the voice print the most, and/or computing an adjustment factor based on the scores of each received voice print against known voice prints.
    Type: Application
    Filed: November 9, 2012
    Publication date: May 15, 2014
    Applicant: Mattersight Corporation
    Inventors: Roger Warford, Douglas Brown, Christopher Danson, David Gustafson