Normalizing Patents (Class 704/234)

Interpretation terminals and method for interpretation through communication between interpretation terminals

Patent number: 8798985

Abstract: A method for interpreting a dialogue between two terminals includes establishing a communication channel between interpretation terminals of two parties in response to an interpretation request; specifying a language of an initiating party and a language of the other party in each of the interpretation terminals of the two parties by exchanging information about the language of the initiating party used in the interpretation terminal of the initiating party and the language of the other party used in the interpretation terminal of the other party via the communication channel; recognizing speech uttered from the interpretation terminal of the initiating party; translating the speech recognized by the interpretation terminal of the initiating party into the language of the other party; and transmitting a sentence translated into the language of the other party to the interpretation terminal of the other party.

Type: Grant

Filed: June 2, 2011

Date of Patent: August 5, 2014

Assignee: Electronics and Telecommunications Research Institute

Inventors: Seung Yun, Sanghun Kim
ADAPTIVE ONLINE FEATURE NORMALIZATION FOR SPEECH RECOGNITION

Publication number: 20140207448

Abstract: A speech recognition system adaptively estimates a warping factor used to reduce speaker variability. The warping factor is estimated using a small window (e.g. 100 ms) of speech. The warping factor is adaptively adjusted as more speech is obtained until the warping factor converges or a pre-defined maximum number of adaptations is reached. The speaker may be placed into a group selected from two or more groups based on characteristics that are associated with the speaker's window of speech. Different step sizes may be used within the different groups when estimating the warping factor. VTLN is applied to the speech input using the estimated warping factor. A linear transformation, including a bias term, may also be computed to assist in normalizing the speech along with the application of the VTLN.

Type: Application

Filed: January 23, 2013

Publication date: July 24, 2014

Applicant: Microsoft Corporation

Inventors: Shizhen Wang, Yifan Gong, Fileno Alleva
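
The adaptive estimation loop described in the abstract above can be sketched as a simple hill climb over the warping factor. This is a minimal sketch under stated assumptions, not the method claimed in the application: the `score` callback, the default step size, the tolerance, and the starting value of 1.0 are all hypothetical and introduced only for illustration.

```python
from typing import Callable, Iterable, Sequence


def adapt_warping_factor(
    windows: Iterable[Sequence[float]],
    score: Callable[[Sequence[float], float], float],
    alpha: float = 1.0,
    step: float = 0.02,
    max_adaptations: int = 20,
    tolerance: float = 1e-6,
) -> float:
    """Refine a VTLN warping factor one speech window at a time.

    `score(window, alpha)` is a hypothetical callback that returns how well
    the window matches the acoustic model when warped by `alpha`
    (higher is better).
    """
    for count, window in enumerate(windows):
        if count >= max_adaptations:
            break  # pre-defined maximum number of adaptations reached
        # Try one step down, no change, and one step up; keep the best.
        candidates = (alpha - step, alpha, alpha + step)
        best = max(candidates, key=lambda a: score(window, a))
        if abs(best - alpha) < tolerance:
            break  # the factor stopped moving: treat it as converged
        alpha = best
    return alpha
```

A group-dependent step size, as the abstract mentions, could be modeled by choosing `step` from the speaker's group before calling the function.
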
Reducing false positives in speech recognition systems

Patent number: 8781825

Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.

Type: Grant

Filed: August 24, 2011

Date of Patent: July 15, 2014

Assignee: Sensory, Incorporated

Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
Method and apparatus for voice-enabling an application

Patent number: 8768711

Abstract: A method of voice-enabling an application for command and control and content navigation can include the application dynamically generating a markup language fragment specifying a command and control and content navigation grammar for the application, instantiating an interpreter from a voice library, and providing the markup language fragment to the interpreter. The method also can include the interpreter processing a speech input using the command and control and content navigation grammar specified by the markup language fragment and providing an event to the application indicating an instruction representative of the speech input.

Type: Grant

Filed: June 17, 2004

Date of Patent: July 1, 2014

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
Channel normalization using recognition feedback

Patent number: 8768695

Abstract: A computer-implemented arrangement is described for performing cepstral mean normalization (CMN) in automatic speech recognition. A current CMN function is stored in a computer memory as a previous CMN function. The current CMN function is updated based on a current audio input to produce an updated CMN function. The updated CMN function is used to process the current audio input to produce a processed audio input. Automatic speech recognition of the processed audio input is performed to determine representative text. If the audio input is not recognized as representative text, the updated CMN function is replaced with the previous CMN function.

Type: Grant

Filed: June 13, 2012

Date of Patent: July 1, 2014

Assignee: Nuance Communications, Inc.

Inventors: Yun Tang, Venkatesh Nagesha
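
The store, update, recognize, and roll back sequence in the abstract above can be sketched as follows. This is an illustration only: the exponential running mean, the `rate` parameter, and the `recognize` callback (which returns `None` when nothing is recognized) are assumptions, not details taken from the patent.

```python
from typing import Callable, List, Optional, Sequence


class RollbackCMN:
    """Running cepstral mean with rollback when recognition fails."""

    def __init__(self, dim: int, rate: float = 0.1) -> None:
        self.mean: List[float] = [0.0] * dim
        self.rate = rate  # assumed smoothing rate for the running mean

    def process(
        self,
        frames: Sequence[Sequence[float]],
        recognize: Callable[[List[List[float]]], Optional[str]],
    ) -> Optional[str]:
        previous = list(self.mean)  # keep the current state as "previous"
        # Update the mean from the current audio input.
        for frame in frames:
            self.mean = [
                (1.0 - self.rate) * m + self.rate * x
                for m, x in zip(self.mean, frame)
            ]
        # Apply the updated mean to the same input.
        normalized = [
            [x - m for x, m in zip(frame, self.mean)] for frame in frames
        ]
        text = recognize(normalized)
        if text is None:
            self.mean = previous  # not recognized: restore the previous state
        return text
```

The rollback keeps non-speech input, such as a door slam, from dragging the channel estimate away from the speaker's actual channel.
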
Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition

Patent number: 8762143

Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded time and speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.

Type: Grant

Filed: May 29, 2007

Date of Patent: June 24, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventor: Mazin Gilbert
Method for processing noisy speech signal, apparatus for same and computer-readable recording medium

Patent number: 8744845

Abstract: A noise estimation method for a noisy speech signal according to an embodiment of the present invention includes the steps of approximating a transformation spectrum by transforming an input noisy speech signal to a frequency domain, calculating a smoothed magnitude spectrum having a decreased difference in a magnitude of the transformation spectrum between neighboring frames, calculating a search spectrum to represent an estimated noise component of the smoothed magnitude spectrum, and estimating a noise spectrum by using a recursive average method using an adaptive forgetting factor defined by using the search spectrum. According to an embodiment of the present invention, the amount of calculation for noise estimation is small, and large-capacity memory is not required. Accordingly, the present invention can be easily implemented in hardware or software. Further, the accuracy of noise estimation can be increased because an adaptive procedure can be performed on each frequency sub-band.

Type: Grant

Filed: March 31, 2009

Date of Patent: June 3, 2014

Assignee: Transono Inc.

Inventors: Sung Il Jung, Dong Gyung Ha
Sparse data compression

Patent number: 8711015

Abstract: The invention relates to compressing sparse data sets that contain sequences of data values and position information therefor. The position information may be in the form of position indices defining active positions of the data values in a sparse vector of length N. The position information is encoded into the data values by adjusting one or more of the data values within a pre-defined tolerance range, so that a pre-defined mapping function of the data values and their positions is close to a target value. In one embodiment, the mapping function is defined using a sub-set of N filler values whose elements are used to fill empty positions in the input sparse data vector. At the decoder, the correct data positions are identified by searching through possible sub-sets of filler values.

Type: Grant

Filed: August 24, 2011

Date of Patent: April 29, 2014

Assignee: Her Majesty the Queen in Right of Canada as represented by the Minister of Industry, through the Communications Research Centre Canada

Inventors: Frederic Mustiere, Hossein Najaf-Zadeh, Ramin Pishehvar, Hassan Lahdili, Louis Thibault, Martin Bouchard
Method and apparatus for generating synthetic speech with contrastive stress

Patent number: 8682671

Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.

Type: Grant

Filed: April 17, 2013

Date of Patent: March 25, 2014

Assignee: Nuance Communications, Inc.

Inventors: Darren C. Meyer, Stephen R. Springer
Prioritizing selection criteria by automated assistant

Patent number: 8660849

Abstract: Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A user request is received, the user request including at least a speech input received from a user. The user request including the speech input is processed to obtain a representation of user intent for identifying items of a selection domain based on at least one selection criterion. A prompt is provided to the user, the prompt presenting two or more properties relevant to items of the selection domain and requesting the user to specify relative importance between the two or more properties. A listing of search results is provided to the user, where the listing of search results has been obtained based on the at least one selection criterion and the relative importance provided by the user.

Type: Grant

Filed: December 21, 2012

Date of Patent: February 25, 2014

Assignee: Apple Inc.

Inventors: Thomas Robert Gruber, Adam John Cheyer, Didier Rene Guzzoni, Christopher Dean Brigham, Harry Joseph Saddler
Personalized text-to-speech synthesis and personalized speech feature extraction

Patent number: 8655659

Abstract: A personalized text-to-speech synthesizing device includes: a personalized speech feature library creator, configured to recognize personalized speech features of a specific speaker by comparing a random speech fragment of the specific speaker with preset keywords, thereby to create a personalized speech feature library associated with the specific speaker, and store the personalized speech feature library in association with the specific speaker; and a text-to-speech synthesizer, configured to perform a speech synthesis of a text message from the specific speaker, based on the personalized speech feature library associated with the specific speaker and created by the personalized speech feature library creator, thereby to generate and output a speech fragment having pronunciation characteristics of the specific speaker.

Type: Grant

Filed: August 12, 2010

Date of Patent: February 18, 2014

Assignees: Sony Corporation, Sony Mobile Communications AB

Inventors: Qingfang Wang, Shouchun He
User-specific confidence thresholds for speech recognition

Patent number: 8639508

Abstract: A method of automatic speech recognition includes receiving an utterance from a user via a microphone that converts the utterance into a speech signal, pre-processing the speech signal using a processor to extract acoustic data from the received speech signal, and identifying at least one user-specific characteristic in response to the extracted acoustic data. The method also includes determining a user-specific confidence threshold responsive to the at least one user-specific characteristic, and using the user-specific confidence threshold to recognize the utterance received from the user and/or to assess confusability of the utterance with stored vocabulary.

Type: Grant

Filed: February 14, 2011

Date of Patent: January 28, 2014

Assignee: General Motors LLC

Inventors: Xufang Zhao, Gaurav Talwar
Method of using microphone characteristics to optimize speech recognition performance

Patent number: 8600741

Abstract: A system and method for tuning a speech recognition engine to an individual microphone using a database containing acoustical models for a plurality of microphones. Microphone performance characteristics are obtained from a microphone at a speech recognition engine, the database is searched for an acoustical model that matches the characteristics, and the speech recognition engine is then modified based on the matching acoustical model.

Type: Grant

Filed: August 20, 2008

Date of Patent: December 3, 2013

Assignee: General Motors LLC

Inventors: Gaurav Talwar, Rathinavelu Chengalvarayan, Jesse T. Gratke, Subhash B. Gullapalli, Dana B. Fecher
System and method for improving robustness of speech recognition using vocal tract length normalization codebooks

Patent number: 8600744

Abstract: Disclosed are systems, methods, and computer readable media for performing speech recognition. The method embodiment comprises (1) selecting a codebook from a plurality of codebooks with a minimal acoustic distance to a received speech sample, the plurality of codebooks generated by a process of (a) computing a vocal tract length for each of a plurality of speakers, (b) for each of the plurality of speakers, clustering speech vectors, and (c) creating a codebook for each speaker, the codebook containing entries for the respective speaker's vocal tract length, speech vectors, and an optional vector weight for each speech vector, (2) applying the respective vocal tract length associated with the selected codebook to normalize the received speech sample for use in speech recognition, and (3) recognizing the received speech sample based on the respective vocal tract length associated with the selected codebook.

Type: Grant

Filed: April 13, 2012

Date of Patent: December 3, 2013

Assignee: AT&T Intellectual Property II, L.P.

Inventor: Mazin Gilbert
Speech recognition system and speech recognizing method

Patent number: 8577678

Abstract: A speech recognition system according to the present invention includes a sound source separating section which separates mixed speeches from multiple sound sources from one another; a mask generating section which generates a soft mask which can take continuous values between 0 and 1 for each frequency spectral component of a separated speech signal using distributions of speech signal and noise against separation reliability of the separated speech signal; and a speech recognizing section which recognizes speeches separated by the sound source separating section using soft masks generated by the mask generating section.

Type: Grant

Filed: March 10, 2011

Date of Patent: November 5, 2013

Assignee: Honda Motor Co., Ltd.

Inventors: Kazuhiro Nakadai, Toru Takahashi, Hiroshi Okuno
Method and apparatus for generating synthetic speech with contrastive stress

Patent number: 8571870

Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.

Type: Grant

Filed: August 9, 2010

Date of Patent: October 29, 2013

Assignee: Nuance Communications, Inc.

Inventors: Darren C. Meyer, Stephen R. Springer
Mobile terminal and menu control method thereof

Patent number: 8560324

Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal, a memory configured to store information related to operations performed on the mobile terminal, and a controller configured to activate the voice recognition function upon receiving the input to activate the voice recognition function, to determine a meaning of an input voice instruction based on at least one prior operation performed on the mobile terminal and a language included in the voice instruction, and to provide operations related to the determined meaning of the input voice instruction based on the at least one prior operation performed on the mobile terminal and the language included in the voice instruction and based on a probability that the determined meaning of the input voice instruction matches the information related to the operations of the mobile terminal.

Type: Grant

Filed: January 31, 2012

Date of Patent: October 15, 2013

Assignee: LG Electronics Inc.

Inventors: Jong-Ho Shin, Jae-Do Kwak, Jong-Keun Youn
Confidence levels for speaker recognition

Patent number: 8560316

Abstract: The present invention relates to a system and method of making a verification decision within a speaker recognition system. A speech sample is gathered from a speaker over a period of time, and a verification score is then produced for said sample over the period. Once the verification score is determined, a confidence measure is produced based on frame score observations from said sample over the period and a confidence measure calculated based on the standard Gaussian distribution. If the confidence measure indicates with a set level of confidence that the verification score is below the verification threshold, the speaker is rejected and the gathering process is terminated.

Type: Grant

Filed: December 19, 2007

Date of Patent: October 15, 2013

Inventors: Robert Vogt, Michael Mason, Sridaran Subramanian
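
The early-rejection test in the abstract above can be sketched with a one-sided Gaussian confidence bound on the mean frame score. This is a minimal sketch, assuming the frame scores are independent (real frames are correlated, so the bound is optimistic) and a one-sided 95% level (`z = 1.645`); neither choice comes from the patent.

```python
import math
from typing import Sequence


def early_reject(
    frame_scores: Sequence[float],
    threshold: float,
    z: float = 1.645,
) -> bool:
    """Return True when the speaker can be rejected without more audio.

    The mean frame score is treated as approximately Gaussian, so its upper
    confidence bound is mean + z * standard_error.
    """
    n = len(frame_scores)
    if n < 2:
        return False  # too few frames to estimate a variance
    mean = sum(frame_scores) / n
    variance = sum((s - mean) ** 2 for s in frame_scores) / (n - 1)
    upper_bound = mean + z * math.sqrt(variance / n)
    # Reject only if even the optimistic bound stays below the threshold.
    return upper_bound < threshold
```

Calling the function after each new batch of frames lets the system stop gathering speech as soon as the bound falls below the threshold.
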
Training and applying prosody models

Patent number: 8554566

Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.

Type: Grant

Filed: November 29, 2012

Date of Patent: October 8, 2013

Assignee: Morphism LLC

Inventor: James H. Stephens, Jr.
Method and apparatus for predicting word accuracy in automatic speech recognition systems

Patent number: 8538752

Abstract: The invention comprises a method and apparatus for predicting word accuracy. Specifically, the method comprises obtaining an utterance in speech data where the utterance comprises an actual word string, processing the utterance for generating an interpretation of the actual word string, processing the utterance to identify at least one utterance frame, and predicting a word accuracy associated with the interpretation according to at least one stationary signal-to-noise ratio and at least one non-stationary signal to noise ratio, wherein the at least one stationary signal-to-noise ratio and the at least one non-stationary signal to noise ratio are determined according to a frame energy associated with each of the at least one utterance frame.

Type: Grant

Filed: May 7, 2012

Date of Patent: September 17, 2013

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin Gilbert, Hong Kook Kim
Speech recognition system and speech recognizing method

Patent number: 8538751

Abstract: A speech recognition system and a speech recognizing method for high-accuracy speech recognition in the environment with ego noise are provided. A speech recognition system according to the present invention includes a sound source separating and speech enhancing section; an ego noise predicting section; and a missing feature mask generating section for generating missing feature masks using outputs of the sound source separating and speech enhancing section and the ego noise predicting section; an acoustic feature extracting section for extracting an acoustic feature of each sound source using an output for said each sound source of the sound source separating and speech enhancing section; and a speech recognizing section for performing speech recognition using outputs of the acoustic feature extracting section and the missing feature masks.

Type: Grant

Filed: June 10, 2011

Date of Patent: September 17, 2013

Assignee: Honda Motor Co., Ltd.

Inventors: Kazuhiro Nakadai, Gokhan Ince
Mobile voice self service device and method thereof

Patent number: 8521536

Abstract: A Mobile Voice Self Service (MVSS) mobile device and method thereof. A VoiceXML browser that is implemented directly on the MVSS mobile device may request a VoiceXML application from a VoiceXML application server and process it. A call data manager may also be implemented on the MVSS mobile device and may provide call data that, in conjunction with data from the VoiceXML application server, may authorize access to advanced Media Resource Control Protocol (MRCP) services, such as Automatic Speech Recognition (ASR) or Text-To-Speech (TTS). A media resource gateway may then provide the advanced MRCP services to the VoiceXML application processed by the VoiceXML application browser. Hotkey navigations and bookmarked application points to VoiceXML applications may be created and applied through application analysis and state tracking. Therein, VoiceXML document transitions and user input are stored to maintain application state changes until the user requests creation of an application bookmark.

Type: Grant

Filed: October 22, 2012

Date of Patent: August 27, 2013

Assignee: West Corporation

Inventor: Chad Daniel Fox
Spectrum harmonic/noise sharpness control

Patent number: 8515747

Abstract: Transmitted data that includes audio data and a transmitted spectral sharpness parameter representing a spectral harmonic/noise sharpness of a plurality of subbands is received. A measured spectral sharpness parameter is estimated from received audio data. The transmitted spectral sharpness parameter is compared with the measured spectral sharpness parameter. A main sharpness control parameter is formed for each of the decoded subbands. The main sharpness control parameter for each of the decoded subbands is analyzed. Ones of the decoded subbands are sharpened if the corresponding main sharpness control indicates that a corresponding subband is not sharp enough, wherein sharpened subbands are formed. Likewise, ones of the decoded subbands are flattened if the corresponding main sharpness control indicates that a corresponding subband is not flat enough, wherein flattened subbands are formed.

Type: Grant

Filed: September 4, 2009

Date of Patent: August 20, 2013

Assignee: Huawei Technologies Co., Ltd.

Inventor: Yang Gao
Weighting factor learning system and audio recognition system

Patent number: 8494847

Abstract: A weighting factor learning system includes an audio recognition section that recognizes learning audio data and outputs the recognition result; a weighting factor updating section that updates a weighting factor applied to a score obtained from an acoustic model and a language model so that the difference between a correct-answer score calculated with the use of a correct-answer text of the learning audio data and a score of the recognition result becomes large; a convergence determination section that determines, with the use of the score after updating, whether to return to the weighting factor updating section to update the weighting factor again; and a weighting factor convergence determination section that determines, with the use of the score after updating, whether to return to the audio recognition section to perform the process again and update the weighting factor using the weighting factor updating section.

Type: Grant

Filed: February 19, 2008

Date of Patent: July 23, 2013

Assignee: NEC Corporation

Inventors: Tadashi Emori, Yoshifumi Onishi
Noise reduction with integrated tonal noise reduction

Patent number: 8489396

Abstract: The system provides a technique for suppressing or eliminating tonal noise in an input signal. The system operates on the input signal at a plurality of frequency bins and uses information generated at a prior bin to assist in calculating values at subsequent bins. The system first identifies peaks in a signal and then determines if the peaks are from tonal effects. This can be done by comparing the estimated background noise of a current bin to the smoothed background noise of the same bin. The smoothed background noise can be calculated using an asymmetric IIR filter. When the ratio of the current background noise estimate to the currently calculated smoothed background noise is far greater than 1, tonal noise is assumed. When tonal noise is found, a number of suppression techniques can be applied to reduce the tonal noise, including gain suppression with fixed floor factor, an adaptive floor factor gain suppression technique, and a random phase technique.

Type: Grant

Filed: December 20, 2007

Date of Patent: July 16, 2013

Assignee: QNX Software Systems Limited

Inventors: Phil A. Hetherington, Xueman Li
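
The detection step in the abstract above, comparing each bin's noise estimate to an asymmetrically smoothed version, can be sketched as follows. The coefficients `rise` and `fall`, the `ratio_threshold`, and the choice to smooth slowly upward and quickly downward are assumptions made for illustration; the patent abstract does not give values.

```python
from typing import List, Sequence


def find_tonal_bins(
    noise_estimate: Sequence[float],
    rise: float = 0.05,
    fall: float = 0.5,
    ratio_threshold: float = 4.0,
) -> List[int]:
    """Return indices of bins whose noise estimate towers over the smoothed floor."""
    tonal: List[int] = []
    if not noise_estimate:
        return tonal
    smoothed = noise_estimate[0]
    for index, value in enumerate(noise_estimate):
        # Asymmetric one-pole IIR: follow rises slowly and falls quickly, so a
        # narrow peak barely lifts the smoothed floor carried from prior bins.
        coeff = rise if value > smoothed else fall
        smoothed += coeff * (value - smoothed)
        if smoothed > 0.0 and value / smoothed > ratio_threshold:
            tonal.append(index)
    return tonal
```

Because the smoothed value is seeded from the first bin, a tonal peak in bin 0 would go undetected in this sketch.
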
VEHICLE VOICE INTERFACE SYSTEM CALIBRATION METHOD

Publication number: 20130179164

Abstract: A vehicle voice interface system calibration method comprising electronically convolving voice command data with voice impulse response data, electronically convolving audio system output data with feedback impulse response data, and calibrating the vehicle voice interface system. The voice command data is electronically convolved with voice impulse response data representing a voice acoustic signal path between an artificial mouth simulator and a first microphone, to simulate a voice acoustic transfer function pertaining to the passenger compartment. The audio system output data is convolved with feedback impulse response data representing a feedback acoustic signal path between a vehicle audio system output and a second microphone, to simulate a feedback acoustic transfer function pertaining to the passenger compartment. The voice interface system is calibrated to recognize voice commands represented by the voice command data based on the simulated voice and feedback acoustic transfer functions.

Type: Application

Filed: January 6, 2012

Publication date: July 11, 2013

Applicant: Nissan North America, Inc.

Inventor: Patrick Dennis
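
The convolution step in the abstract above can be illustrated with a direct-form discrete convolution. This is a sketch with toy data: the sample values are made up, and a real calibration would use measured impulse responses and most likely an FFT-based routine for speed.

```python
from typing import List, Sequence


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Full discrete convolution of a signal with an impulse response."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h  # each input sample excites a scaled copy of the response
    return out


# Toy example: a three-sample "voice command" through a two-tap acoustic path.
simulated = convolve([1.0, 2.0, 3.0], [1.0, 1.0])
```

The same function would be applied a second time to the audio system output and the feedback impulse response.
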
Weight normalization in hardware without a division operator

Patent number: 8484267

Abstract: Weight normalization in hardware or software without a division operator is described, using only right bit shift, addition and subtraction operations. A right bit shift is performed on an expected sum to effectively divide the expected sum by two to provide a first updated value for the expected sum. An iteration is performed which includes: incrementing with a first adder a first variable by the first updated value of the expected sum to provide an updated value for the first variable; subtracting with a first subtractor a second weight from a first weight to provide a first updated value for the first weight; and performing a left bit shift on the second weight to effectively multiply the second weight by two to provide a first updated value for the second weight.

Type: Grant

Filed: November 19, 2009

Date of Patent: July 9, 2013

Assignee: Xilinx, Inc.

Inventor: Gabor Szedo
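
To show that normalization is possible with only shifts, additions, subtractions, and comparisons, here is a sketch that uses textbook restoring (shift-and-subtract) division. It is not the iteration claimed in the patent, whose abstract describes a different sequence of updates; the function names and the `frac_bits` parameter are hypothetical.

```python
from typing import List, Sequence


def shift_divide(numerator: int, denominator: int, frac_bits: int) -> int:
    """Return floor(numerator * 2**frac_bits / denominator) without `/` or `*`."""
    if numerator < 0 or denominator <= 0:
        raise ValueError("expects numerator >= 0 and denominator > 0")
    scaled = numerator << frac_bits
    quotient = 0
    remainder = 0
    # Walk the scaled numerator from its most significant bit downward.
    for bit in range(scaled.bit_length() - 1, -1, -1):
        remainder = (remainder << 1) | ((scaled >> bit) & 1)
        quotient <<= 1
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1
    return quotient


def normalize_weights(weights: Sequence[int], frac_bits: int = 8) -> List[int]:
    """Scale non-negative integer weights so they sum to about 2**frac_bits."""
    total = 0
    for w in weights:
        total += w
    return [shift_divide(w, total, frac_bits) for w in weights]
```

Because each quotient is rounded down, the normalized weights can sum to slightly less than `2**frac_bits`.
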
Voice analysis device, voice analysis method, voice analysis program, and system integration circuit

Patent number: 8478587

Abstract: A sound analysis device comprises: a sound parameter calculation unit operable to acquire an audio signal and calculate a sound parameter for each of partial audio signals, the partial audio signals each being the acquired audio signal in a unit of time; a category determination unit operable to determine, from among a plurality of environmental sound categories, which environmental sound category each of the partial audio signals belongs to, based on a corresponding one of the calculated sound parameters; a section setting unit operable to sequentially set judgment target sections on a time axis as time elapses, each of the judgment target sections including two or more of the units of time, the two or more of the units of time being consecutive; and an environment judgment unit operable to judge, based on a number of partial audio signals in each environmental sound category determined in at least a most recent judgment target section, an environment that surrounds the sound analysis device in at least the

Type: Grant

Filed: March 13, 2008

Date of Patent: July 2, 2013

Assignee: Panasonic Corporation

Inventors: Takashi Kawamura, Ryouichi Kawanishi
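
The counting step in the abstract above, judging the environment from how many partial audio signals fall into each category within the most recent judgment target section, can be sketched as a sliding majority vote. The section length of three units and the one-unit slide are assumptions; the category labels are made up.

```python
from collections import Counter, deque
from typing import Deque, Iterable, Iterator


def judge_environment(
    categories: Iterable[str], section_length: int = 3
) -> Iterator[str]:
    """Yield the most frequent category over each most-recent section."""
    recent: Deque[str] = deque(maxlen=section_length)
    for category in categories:
        recent.append(category)  # the oldest unit drops out automatically
        if len(recent) == section_length:
            yield Counter(recent).most_common(1)[0][0]
```

For example, the per-unit labels `bus, bus, street, bus, street, street` give the judgments `bus, bus, street, street`, so a single stray label does not flip the result.
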
Method and apparatus for defining an artificial brain via a plurality of concept nodes defined by frame semantics

Patent number: 8463720

Abstract: A method is provided for defining a network of nodes, each representing a unique concept, and for making connections between individual concepts through unique relationships to other concepts. Each of the nodes is operable to store a unique identifier in the network and information regarding the concept in addition to the unique relationships.

Type: Grant

Filed: March 26, 2010

Date of Patent: June 11, 2013

Assignee: Neuric Technologies, LLC

Inventors: Jennifer Seale, Hannah Lindsley, Timothy Allen Margheim
Method and apparatus for generating synthetic speech with contrastive stress

Patent number: 8447610

Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.

Type: Grant

Filed: August 9, 2010

Date of Patent: May 21, 2013

Assignee: Nuance Communications, Inc.

Inventors: Darren C. Meyer, Stephen R. Springer
Voice authentication system and method

Patent number: 8396711

Abstract: A user's voice is authenticated by prompting a user to say a challenge phrase from a list of predetermined phrases and comparing the user's response with a prerecorded version of the same response. The user's stored recordings are associated with an electronic identification or serial number for a specific device, so that when communication is established using the device, only the specific user may authenticate the session. When several phrases and recordings are used, one may be selected at random for authentication so that fraudulent authentication using a recording of the user's voice may be thwarted. The system and method may be used for authenticating a device when it is first activated, such as a telephony device, or may be used when authenticating a specific communications session.

Type: Grant

Filed: May 1, 2006

Date of Patent: March 12, 2013

Assignee: Microsoft Corporation

Inventors: Dawson Yee, Gurdeep S. Pall
Speech recognition system and method for generating a mask of the system

Patent number: 8392185

Abstract: The speech recognition system of the present invention includes: a sound source separating section which separates mixed speeches from multiple sound sources; a mask generating section which generates a soft mask which can take continuous values between 0 and 1 for each separated speech according to reliability of separation in separating operation of the sound source separating section; and a speech recognizing section which recognizes speeches separated by the sound source separating section using soft masks generated by the mask generating section.

Type: Grant

Filed: August 19, 2009

Date of Patent: March 5, 2013

Assignee: Honda Motor Co., Ltd.

Inventors: Kazuhiro Nakadai, Toru Takahashi, Hiroshi Okuno
Methods and systems for determining audio loudness levels in programming

Patent number: 8379880

Abstract: An example of a method of correcting an audio level of a stored program asset comprises retrieving a stored program asset having audio encoded at a first loudness setting. Dialog of the audio of the asset is identified, a loudness of the dialog is determined and the determined loudness is compared to the first loudness setting. The asset is re-encoded at a second loudness setting corresponding to the determined loudness, if the first loudness setting and the second loudness are different by more than a predetermined amount. The determined loudness is preferably a DIALNORM of the dialog. The asset may be stored with the re-encoded loudness setting. The method may be applied to programs as they are being received from a source, as well. Aspects of the method may also be applied to programs to be provided by a source. Systems are also disclosed.

Type: Grant

Filed: June 2, 2008

Date of Patent: February 19, 2013

Assignee: Time Warner Cable Inc.

Inventor: Steven E. Riedl
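
The comparison step described in the abstract above (patent 8379880) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, parameter names, and the assumption that all loudness values share one unit are hypothetical.

```python
def corrected_loudness_setting(encoded_setting, measured_dialog_loudness, max_difference):
    """Return the loudness setting an asset should carry after the check.

    encoded_setting: loudness setting the asset was originally encoded with.
    measured_dialog_loudness: loudness determined from the asset's dialog.
    max_difference: predetermined amount above which re-encoding is triggered.
    All names and units here are illustrative assumptions.
    """
    if abs(encoded_setting - measured_dialog_loudness) > max_difference:
        # The two values disagree by more than the allowed amount:
        # adopt a setting corresponding to the measured dialog loudness.
        return measured_dialog_loudness
    # Otherwise the original setting is kept unchanged.
    return encoded_setting
```

Under these assumptions, an asset encoded at -27 whose dialog measures -21 with an allowed difference of 2 would be re-encoded at -21, while one measuring -26 would keep its original setting.
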
Training and applying prosody models

Patent number: 8374873

Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.

Type: Grant

Filed: August 11, 2009

Date of Patent: February 12, 2013

Assignee: Morphism, LLC

Inventor: James H. Stephens, Jr.
System and method for low overhead time domain voice authentication

Patent number: 8326625

Abstract: A system and method are provided to authenticate a voice in a time domain. The initial rise time, initial fall time, second rise time, second fall time and final oscillation time are digitized into bits to form at least part of a voice ID. The voice IDs are used to authenticate a user's voice.

Type: Grant

Filed: November 10, 2009

Date of Patent: December 4, 2012

Assignee: Research In Motion Limited

Inventor: Sasan Adibi
Method and apparatus for obtaining media over a communications network

Patent number: 8316148

Abstract: A method and apparatus for obtaining a real time media stream provided as a plurality of media fragments from a plurality of remote nodes in a communications network. A first series of media fragments satisfying a first selection criterion is requested from a first remote node and a further series of media fragments satisfying a further different selection criterion is requested from at least one further remote node. When combined, the first series of fragments and the further series of fragments provide the complete media stream.

Type: Grant

Filed: February 22, 2008

Date of Patent: November 20, 2012

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Andreas Ljunggren, Robert Skog
Method and apparatus for obtaining media over a communications network

Patent number: 8316108

Abstract: A method and apparatus for obtaining a real time media stream provided as a plurality of media fragments from a plurality of remote nodes in a communications network is described. Media fragments are requested from the plurality of remote nodes. A series of media fragments is received from at least one of the plurality of remote nodes. A selection criterion is determined for identifying the series of data fragments, and a blocking request is sent to at least one other of the plurality of remote nodes, the blocking request instructing the at least one other node to block the media fragments satisfying the selection criterion from being sent.

Type: Grant

Filed: February 22, 2008

Date of Patent: November 20, 2012

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Andreas Ljunggren, Robert Skog
Parameterized temporal feature analysis

Patent number: 8311821

Abstract: A method (1) for classifying at least one audio signal (A) into at least one audio class (AC), the method (1) comprising the steps of analyzing (10) said audio signal to extract at least one predetermined audio feature, performing (12) a frequency analysis on a set of values of said audio feature at different time instances, deriving (12) at least one further audio feature representing a temporal behavior of said audio feature based on said frequency analysis, and classifying (14) said audio signal based on said further audio feature. With the further audio feature, information is obtained about the temporal fluctuation of an audio feature, which may be advantageous for a classification of audio.

Type: Grant

Filed: April 21, 2004

Date of Patent: November 13, 2012

Assignee: Koninklijke Philips Electronics N.V.

Inventors: Dirk Jeroen Breebaart, Martin Franciscus McKinney
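
The idea in the abstract above (patent 8311821), deriving a further feature from a frequency analysis of how a base feature changes over time, can be sketched roughly as below. The choice of frame energy as the base feature, the plain DFT, and the "dominant modulation frequency" as the derived feature are all assumptions made for illustration; the patent does not prescribe them, and the final classification step is omitted.

```python
import cmath
import math

def frame_energies(signal, frame_len):
    """Base audio feature (assumed): mean energy of each non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]

def power_spectrum(values):
    """Power spectrum of a feature trajectory, computed with a plain DFT.

    The mean is removed first so the constant offset does not dominate.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum

def dominant_modulation_hz(values, frame_rate_hz):
    """Derived feature (assumed): frequency at which the base feature fluctuates most."""
    spectrum = power_spectrum(values)
    strongest_bin = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
    return strongest_bin * frame_rate_hz / len(values)
```

A classifier would then take such derived features as input, for example to separate signals whose energy fluctuates at syllable-like rates from signals with steadier energy.
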
Mobile voice self service system

Patent number: 8311837

Abstract: A Mobile Voice Self Service (MVSS) mobile system that includes an MVSS mobile device, on which a VoiceXML browser is implemented directly. The VoiceXML browser may request a VoiceXML application from a VoiceXML application server and process it. A client system may include the VoiceXML application server that the VoiceXML application is requested from. Upon request, the VoiceXML application server may deliver the requested VoiceXML application to the VoiceXML application browser. A vendor media resource system may provide advanced Media Resource Control Protocol (MRCP) services, such as Automatic Speech Recognition (ASR) or Text-To-Speech (TTS), to the VoiceXML application that is being processed by the VoiceXML application browser. A call data manager may also be implemented on the MVSS mobile device and may provide call data that, in conjunction with data from the VoiceXML application server, may authorize access to advanced Media Resource Control Protocol (MRCP) services.

Type: Grant

Filed: June 13, 2008

Date of Patent: November 13, 2012

Assignee: West Corporation

Inventor: Chad Daniel Fox
Enhanced automatic speech recognition using mapping between unsupervised and supervised speech model parameters trained on same acoustic training data

Patent number: 8306819

Abstract: Techniques for enhanced automatic speech recognition are described. An enhanced ASR system may be operative to generate an error correction function. The error correction function may represent a mapping between a supervised set of parameters and an unsupervised training set of parameters generated using a same set of acoustic training data, and apply the error correction function to an unsupervised testing set of parameters to form a corrected set of parameters used to perform speaker adaptation. Other embodiments are described and claimed.

Type: Grant

Filed: March 9, 2009

Date of Patent: November 6, 2012

Assignee: Microsoft Corporation

Inventors: Chaojun Liu, Yifan Gong
Mobile voice self service device and method thereof

Patent number: 8296148

Abstract: A Mobile Voice Self Service (MVSS) mobile device and method thereof. A VoiceXML browser that is implemented directly on the MVSS mobile device may request a VoiceXML application from a VoiceXML application server and process it. A call data manager may also be implemented on the MVSS mobile device and may provide call data that, in conjunction with data from the VoiceXML application server, may authorize access to advanced Media Resource Control Protocol (MRCP) services, such as Automatic Speech Recognition (ASR) or Text-To-Speech (TTS). A media resource gateway may then provide the advanced MRCP services to the VoiceXML application processed by the VoiceXML application browser. Hotkey navigations and bookmarked application points to VoiceXML applications may be created and applied through application analysis and state tracking. Therein, VoiceXML document transitions and user input are stored to maintain application state changes until the user requests creation of an application bookmark.

Type: Grant

Filed: June 13, 2008

Date of Patent: October 23, 2012

Assignee: West Corporation

Inventor: Chad Daniel Fox
Online Maximum-Likelihood Mean and Variance Normalization for Speech Recognition

Publication number: 20120259632

Abstract: A feature transform for speech recognition is described. An input speech utterance is processed to produce a sequence of representative speech vectors. A time-synchronous speech recognition pass is performed using a decoding search to determine a recognition output corresponding to the speech input. The decoding search includes, for each speech vector after some first threshold number of speech vectors, estimating a feature transform based on the preceding speech vectors in the utterance and partial decoding results of the decoding search. The current speech vector is then adjusted based on the current feature transform, and the adjusted speech vector is used in a current frame of the decoding search.

Type: Application

Filed: February 22, 2010

Publication date: October 11, 2012

Applicant: NUANCE COMMUNICATIONS, INC.

Inventor: Daniel Willett
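
A simplified sketch of the online normalization idea in the abstract above (publication 20120259632) follows. It normalizes each speech vector using only statistics of the preceding vectors, once a threshold number of vectors has been seen. Note the substitution: this sketch uses plain running mean and variance (Welford's update), not a maximum-likelihood estimate driven by partial decoding results, and the class name and parameters are hypothetical.

```python
import math

class OnlineMeanVarianceNormalizer:
    """Per-dimension running normalization using preceding frames only."""

    def __init__(self, dim, min_frames=10, eps=1e-8):
        self.min_frames = min_frames  # threshold before normalization starts
        self.eps = eps                # guards against division by zero
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim         # running sum of squared deviations

    def process(self, frame):
        # Adjust the current frame with statistics of the frames seen so far.
        if self.count >= self.min_frames:
            out = [
                (x - m) / math.sqrt(m2 / self.count + self.eps)
                for x, m, m2 in zip(frame, self.mean, self.m2)
            ]
        else:
            out = list(frame)  # too little history: pass through unchanged
        # Welford update, so the current frame informs later frames.
        self.count += 1
        for i, x in enumerate(frame):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])
        return out
```

Because each frame is adjusted before it is added to the statistics, the normalizer can run frame by frame alongside a time-synchronous decoder without waiting for the end of the utterance.
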
Apparatus for processing an audio signal and method thereof

Patent number: 8275154

Abstract: An apparatus for processing an audio signal and method thereof are disclosed, by which a local dynamic range of an audio signal can be adaptively normalized as well as a maximum dynamic range of the audio signal. The present invention includes receiving, by an audio processing apparatus, a signal, and feedback information estimated based on a normalizing gain; generating a noise estimation based on the signal; computing a gain filter for noise canceling, based on the noise estimation and the signal; and, obtaining a restricted gain filter by applying the feedback information to the gain filter.

Type: Grant

Filed: July 29, 2009

Date of Patent: September 25, 2012

Assignee: LG Electronics Inc.

Inventors: Jong Ha Moon, Hyen O Oh, Joon Il Lee, Myung Hoon Lee, Yang Won Jung, Alexis Favrot, Christof Faller
Tonal correction of speech

Patent number: 8249873

Abstract: Tonal correction of speech is provided. Received speech is analyzed and compared to a table of commonly mispronounced phrases. These phrases are mapped to the phrase likely intended by the speaker. The phrase determined to be the phrase the user likely intended can be suggested to the user. If the user approves of the suggestion, tonal correction can be applied to the speech before that speech is delivered to a recipient.

Type: Grant

Filed: August 12, 2005

Date of Patent: August 21, 2012

Assignee: Avaya Inc.

Inventors: Colin Blair, Kevin Chan, Christopher R. Gentle, Neil Hepworth, Andrew W. Lang, Paul R. Michaelis
SYSTEMS AND METHODS FOR MONITORING COMMUNICATIONS

Publication number: 20120196629

Abstract: In one embodiment, a method provides for monitoring and analyzing communications of a monitored user on behalf of a monitoring user, to determine whether the communication includes a violation. For example, SMS messages, MMS messages, IMs, e-mails, social network site postings or voice mails of a child may be monitored on behalf of a parent. In one embodiment, an algorithm is used to analyze a normalized version of the communication, which algorithm is retrained using results of past analysis, to determine a probability of a communication including a violation.

Type: Application

Filed: January 28, 2011

Publication date: August 2, 2012

Applicant: PROTEXT MOBILITY, INC.

Inventors: Edward Movsesyan, Igor Slavinsky
Providing enhanced content

Patent number: 8234411

Abstract: Methods, systems, computer readable media, and apparatuses for providing enhanced content are presented. Data including a first program, a first caption stream associated with the first program, and a second caption stream associated with the first program may be received. The second caption stream may be extracted from the data, and a second program may be encoded with the second caption stream. The first program may be transmitted with the first caption stream including first captions and may include first content configured to be played back at a first speed. In response to receiving an instruction to change play back speed, the second program may be transmitted with the second caption stream. The second program may include the first content configured to be played back at a second speed different from the first speed, and the second caption stream may include second captions different from the first captions.

Type: Grant

Filed: September 2, 2010

Date of Patent: July 31, 2012

Assignee: Comcast Cable Communications, LLC

Inventor: Ross Gilson
Class detection scheme and time mediated averaging of class dependent models

Patent number: 8229744

Abstract: A method, system, and computer program for class detection and time mediated averaging of class dependent models. A technique is described to take advantage of gender information in training data and how to obtain female, male, and gender independent models from this information. By using a probability value to average male and female Gaussian Mixture Models (GMMs), dramatic deterioration in cross gender decoding performance is avoided.

Type: Grant

Filed: August 26, 2003

Date of Patent: July 24, 2012

Assignee: Nuance Communications, Inc.

Inventors: Satyanarayana Dharanipragada, Peder A. Olsen
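
One way to picture the probability-weighted averaging of male and female GMMs mentioned in the abstract above (patent 8229744) is the sketch below. It assumes one-dimensional GMMs given as (weight, mean, variance) tuples and a single probability value supplied by the caller; how the patent detects the class or updates the probability over time is not shown, and all names are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_likelihood(x, components):
    """Likelihood of x under a 1-D GMM given as (weight, mean, variance) tuples."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

def averaged_likelihood(x, male_gmm, female_gmm, p_male):
    """Blend the two class-dependent models using the probability p_male.

    p_male = 1.0 reduces to the male model, p_male = 0.0 to the female
    model, and intermediate values interpolate between the two.
    """
    return (
        p_male * gmm_likelihood(x, male_gmm)
        + (1.0 - p_male) * gmm_likelihood(x, female_gmm)
    )
```

With an uncertain class estimate the blended score stays close to whichever model fits the observation, which is the intuition behind avoiding a hard, possibly wrong, gender decision.
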
Method and system for accent correction

Patent number: 8175882

Abstract: A method for task execution improvement includes: generating a baseline model for executing a task; recording a user executing a task; comparing the baseline model to the user's execution of the task; and providing feedback to the user based on the differences in the user's execution and the baseline model.

Type: Grant

Filed: January 25, 2008

Date of Patent: May 8, 2012

Assignee: International Business Machines Corporation

Inventors: Sara H. Basson, Dimitri Kanevsky, Edward E. Kelley, Bhuvana Ramabhadran
System and method for improving robustness of speech recognition using vocal tract length normalization codebooks

Patent number: 8160875

Abstract: Disclosed are systems, methods, and computer readable media for performing speech recognition. The method embodiment comprises (1) selecting a codebook from a plurality of codebooks with a minimal acoustic distance to a received speech sample, the plurality of codebooks generated by a process of (a) computing a vocal tract length for each of a plurality of speakers, (b) for each of the plurality of speakers, clustering speech vectors, and (c) creating a codebook for each speaker, the codebook containing entries for the respective speaker's vocal tract length, speech vectors, and an optional vector weight for each speech vector, (2) applying the respective vocal tract length associated with the selected codebook to normalize the received speech sample for use in speech recognition, and (3) recognizing the received speech sample based on the respective vocal tract length associated with the selected codebook.

Type: Grant

Filed: August 26, 2010

Date of Patent: April 17, 2012

Assignee: AT&T Intellectual Property II, L.P.

Inventor: Mazin Gilbert
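
The codebook selection step in the abstract above (patent 8160875) can be sketched as below. The acoustic distance used here, the average Euclidean distance from each sample vector to its nearest codebook entry, is an assumption chosen for illustration, as are the dictionary keys "vtl" and "vectors"; optional vector weights are ignored.

```python
import math

def nearest_entry_distance(vector, entries):
    """Euclidean distance from one speech vector to its closest codebook entry."""
    return min(math.dist(vector, entry) for entry in entries)

def select_codebook(sample_vectors, codebooks):
    """Pick the codebook with the smallest average distance to the sample.

    codebooks: list of dicts with hypothetical keys
        "vtl"     - the speaker's vocal tract length
        "vectors" - the clustered speech vectors for that speaker
    """
    def average_distance(codebook):
        total = sum(
            nearest_entry_distance(v, codebook["vectors"]) for v in sample_vectors
        )
        return total / len(sample_vectors)

    return min(codebooks, key=average_distance)
```

The "vtl" value of the returned codebook would then serve as the vocal tract length used to normalize the received speech sample before recognition.
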
Voice recognizing apparatus, voice recognizing method, voice recognizing program, interference reducing apparatus, interference reducing method, and interference reducing program

Patent number: 8150688

Abstract: A voice recognizing apparatus includes a microphone 12 which inputs an input voice including speech voice uttered by a user speaker and interference voice uttered by an interference speaker other than the user speaker; a superimposition amount determining unit 14 which determines a noise superimposition amount for the input voice on the basis of a speech voice and an interference voice separately input as the input voice; a noise superimposing unit 16 which superimposes noise according to the noise superimposition amount onto the input voice and outputs the resultant voice as noise-superimposed voice; and a voice recognizing unit 18 which recognizes the noise-superimposed voice.

Type: Grant

Filed: January 10, 2007

Date of Patent: April 3, 2012

Assignee: NEC Corporation

Inventor: Toru Iwasawa
