Hidden Markov (HM) Network (EPO) Patents (Class 704/256.5)

Estimation of sleep quality parameters from whole night audio analysis

Patent number: 11712198

Abstract: The present invention relates to a system and method for determining sleep quality parameters according to audio analyses, comprising: obtaining an audio recorded signal comprising sleep sounds of a subject; segmenting the signal into epochs; generating a feature vector for each epoch, wherein each of said feature vectors comprises one or more feature parameters that are associated with a particular characteristic of the signal and that are calculated according to the epoch signal or according to a signal generated from the epoch signal; inputting the generated feature vectors into a machine learning classifier and applying a preformed classifying model on the feature vectors that outputs a probabilities vector for each epoch, wherein each of the probabilities vectors comprises the probabilities of the epoch being each of the sleep quality parameters; inputting the probabilities vectors for each epoch into a machine learning time series model and applying a preformed sleep quality time series pattern function

Type: Grant

Filed: July 11, 2017

Date of Patent: August 1, 2023

Assignees: B.G. NEGEV TECHNOLOGIES AND APPLICATIONS LTD., AT BEN-GURION UNIVERSITY, MOR RESEARCH APPLICATIONS LTD.

Inventors: Eliran Dafna, Yaniv Zigel, Dvir Ben Or, Matan Halevi, Ariel Tarasiuk
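
The abstract above describes a pipeline of epoch segmentation, per-epoch class probabilities, and a time-series model. A minimal sketch of that flow follows. It assumes fixed-length non-overlapping epochs and uses a plain Viterbi pass as a stand-in for the time-series step; the function names, the transition matrix, and the priors are hypothetical, and the patent's actual features, classifier, and pattern function are not reproduced here.

```python
import math

def segment_into_epochs(signal, epoch_len):
    """Split a sample sequence into consecutive, non-overlapping epochs."""
    return [signal[i:i + epoch_len]
            for i in range(0, len(signal) - epoch_len + 1, epoch_len)]

def viterbi_smooth(prob_vectors, transitions, priors):
    """Most likely state sequence given per-epoch classifier probabilities.

    prob_vectors: one probability vector per epoch (hypothetical classifier output)
    transitions:  transitions[i][j] = P(state j at t+1 | state i at t) (assumed values)
    priors:       initial state probabilities (assumed values)
    """
    n_states = len(priors)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    score = [log(priors[s]) + log(prob_vectors[0][s]) for s in range(n_states)]
    back = []
    for probs in prob_vectors[1:]:
        prev, score, ptr = score, [], []
        for j in range(n_states):
            # Best predecessor state for landing in state j at this epoch.
            best = max(range(n_states),
                       key=lambda i: prev[i] + log(transitions[i][j]))
            ptr.append(best)
            score.append(prev[best] + log(transitions[best][j]) + log(probs[j]))
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

With "sticky" transitions, an isolated epoch whose classifier output weakly favors another state is absorbed into the surrounding run, which is the usual reason for adding a sequence model after a per-epoch classifier.
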
Machine translation method, device, and computer-readable storage medium

Patent number: 11593571

Abstract: A machine translation method includes: receiving to-be-processed information expressed in a source language; encoding the to-be-processed information, and generating an expression vector sequence of the to-be-processed information; and predicting feature information of a target foresight word at a first moment by using a prediction model. The feature information includes at least one of a part of speech or a word category of the target foresight word. The method also includes: determining a context vector corresponding to the first moment in the expression vector sequence according to the feature information of the target foresight word; and decoding the context vector by using a decoder, to obtain target content that corresponds to the context vector and expressed in a target language.

Type: Grant

Filed: June 8, 2020

Date of Patent: February 28, 2023

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Lemao Liu, Xintong Li, Shuming Shi
Improving speech recognition transcriptions

Patent number: 11580959

Abstract: An approach to correcting transcriptions of speech recognition models may be provided. A list of similar sounding phonemes associated with the phonemes of high frequency terms may be generated for a particular node associated with a virtual assistant. An utterance may be transcribed and receive a confidence score regarding the correctness of the transcription based on audio metrics and other factors. The phonemes of the utterance can be compared to the phonemes of the high frequency terms from the list and a score for the matching phonemes and similar sounding phonemes can be determined. If it is determined that the sounds-similar score for a term from the high frequency term list is above a threshold, the transcription can be replaced with the term, providing a corrected transcription.

Type: Grant

Filed: September 28, 2020

Date of Patent: February 14, 2023

Assignee: International Business Machines Corporation

Inventors: Andrew R. Freed, Marco Noel, Aishwarya Hariharan, Martha Holloman, Mohammad Gorji-Sefidmazgi, Daniel Zyska
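
The correction step in the abstract above (score matching and similar-sounding phonemes, then replace the transcription when the score clears a threshold) might look roughly like the sketch below. The similar-phoneme table, the 1.0 and 0.5 weights, both thresholds, and the function names are illustrative assumptions, not values from the patent.

```python
# Hypothetical table of similar-sounding phoneme pairs.
SIMILAR = {frozenset(("B", "P")), frozenset(("D", "T")), frozenset(("M", "N"))}

def sounds_similar_score(utterance, term):
    """Average per-position score: 1.0 for an exact match, 0.5 for a similar pair."""
    if not term or len(utterance) != len(term):
        return 0.0
    total = 0.0
    for a, b in zip(utterance, term):
        if a == b:
            total += 1.0
        elif frozenset((a, b)) in SIMILAR:
            total += 0.5
    return total / len(term)

def correct(transcript, phonemes, confidence, high_freq_terms,
            min_conf=0.8, threshold=0.7):
    """Replace a low-confidence transcript with the best-scoring frequent term."""
    if confidence >= min_conf:
        return transcript  # transcription is trusted as-is
    best_term, best_score = transcript, threshold
    for term, term_phonemes in high_freq_terms.items():
        score = sounds_similar_score(phonemes, term_phonemes)
        if score > best_score:
            best_term, best_score = term, score
    return best_term
```
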
Electronic device and method for controlling the electronic device

Patent number: 11545149

Abstract: Disclosed are an electronic device capable of efficiently performing speech recognition and natural language understanding and a method for controlling thereof. The electronic device includes: a microphone; a non-volatile memory configured to store virtual assistant model data comprising data that is classified according to a plurality of domains and data that is commonly used for the plurality of domains; a volatile memory; and a processor configured to: based on receiving, through the microphone, a trigger input to perform speech recognition for a user speech, initiate loading the virtual assistant model data from the non-volatile memory into the volatile memory, load, into the volatile memory, first data from among the data classified according to the plurality of domains and, while loading the first data into the volatile memory, load at least a part of the data commonly used for the plurality of domains into the volatile memory.

Type: Grant

Filed: June 10, 2020

Date of Patent: January 3, 2023

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Saebom Jang, Hyeonmok Ko, Kyenghun Lee, Kunal Sharma, Raghavendra Hanumantasetty Ramasetty
Media playback system with concurrent voice assistance

Patent number: 11288039

Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, an NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD forgoes selection of the cloud-based VAS when the local voice assistant is selected.

Type: Grant

Filed: March 30, 2020

Date of Patent: March 29, 2022

Assignee: Sonos, Inc.

Inventors: Dayn Wilberding, John Tolomei
Media playback system with concurrent voice assistance

Patent number: 11175888

Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, an NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD forgoes selection of the cloud-based VAS when the local voice assistant is selected.

Type: Grant

Filed: January 25, 2021

Date of Patent: November 16, 2021

Assignee: Sonos, Inc.

Inventors: Dayn Wilberding, John Tolomei
Method and system of high accuracy keyphrase detection for low resource devices

Patent number: 11127394

Abstract: Techniques related to keyphrase detection for applications such as wake on voice are disclosed herein. Such techniques may have high accuracy by using scores of phone positions in triphones to select which triphones to use with a rejection model, using context-related phones for the rejection model, adding silence before keyphrase sounds for a keyphrase model, or any combination of these.

Type: Grant

Filed: March 29, 2019

Date of Patent: September 21, 2021

Assignee: Intel Corporation

Inventors: Sebastian Czyryba, Tobias Bocklet, Kuba Lopatka
Information processing device, information processing method, and program

Patent number: 11100919

Abstract: There is provided an information processing device including an analysis unit configured to analyze a character string indicating contents of utterance obtained as a result of speech recognition, and a display control unit configured to display the character string indicating the contents of the utterance and an analysis result on a display screen.

Type: Grant

Filed: July 26, 2019

Date of Patent: August 24, 2021

Assignee: SATURN LICENSING LLC

Inventors: Tomoaki Takemura, Shinya Masunaga, Koji Fujita, Katsutoshi Ishiwata, Kenichi Ikenaga, Katsutoshi Kusumoto
Media playback system with concurrent voice assistance

Patent number: 10606555

Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, an NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD forgoes selection of the cloud-based VAS when the local voice assistant is selected.

Type: Grant

Filed: November 4, 2019

Date of Patent: March 31, 2020

Assignee: Sonos, Inc.

Inventors: Dayn Wilberding, John Tolomei
Media playback system with voice assistance

Patent number: 10466962

Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a media playback system is configured to (i) capture a voice input via at least one microphone device, (ii) detect inclusion of one or more of the commands within the voice input, (iii) determine that the one or more commands meet corresponding command criteria associated with the one or more commands within the set of command information, and (iv) in response to the determination, select a first voice assistant service (VAS) and (a) forgo selection of a second VAS, (b) send the voice input to the first VAS, and (c) after sending the voice input, receive a response to the voice input from the first VAS.

Type: Grant

Filed: September 29, 2017

Date of Patent: November 5, 2019

Assignee: Sonos, Inc.

Inventors: Dayn Wilberding, John Tolomei
Information processing device, information processing method, and program

Patent number: 10424291

Abstract: There is provided an information processing device including an analysis unit configured to analyze a character string indicating contents of utterance obtained as a result of speech recognition, and a display control unit configured to display the character string indicating the contents of the utterance and an analysis result on a display screen.

Type: Grant

Filed: November 20, 2013

Date of Patent: September 24, 2019

Assignee: Saturn Licensing LLC

Inventors: Tomoaki Takemura, Shinya Masunaga, Koji Fujita, Katsutoshi Ishiwata, Kenichi Ikenaga, Katsutoshi Kusumoto
False positive wake word

Patent number: 10204624

Abstract: One embodiment provides a method, comprising: receiving, at an information handling device, user input comprising a word that activates a digital assistant associated with the information handling device; determining, using a processor, whether the word comprises a false positive wake word, wherein the false positive wake word comprises a stored wake word; and responsive to determining that the word is associated with a false positive wake word, providing output associated with an alternative wake word. Other aspects are described and claimed.

Type: Grant

Filed: August 14, 2017

Date of Patent: February 12, 2019

Assignee: Lenovo (Singapore) Pte. Ltd.

Inventors: Ryan Charles Knudson, Russell Speight VanBlon, Roderick Echols, Jonathan Gaither Knox
Parsing annotator framework from external services

Patent number: 9183196

Abstract: A language processing system uses annotation services that are external to the language processing system to identify n-grams that identify entities in an input sentence. The n-grams are annotated by the annotation services. The annotations are used to determine which n-grams, if any, correspond to instances of an entity type (e.g., values for a variable or terminals for a non-terminal). After determining which n-grams correspond to entity types, parse initializations are generated for parsing rules and parses for each rule are attempted. The rules that successfully parse are used to determine whether the input sentence invokes a specific action, and if so, what arguments are to be passed to the invocation of the action.

Type: Grant

Filed: June 25, 2013

Date of Patent: November 10, 2015

Assignee: Google Inc.

Inventors: Jakob D. Uszkoreit, Percy Liang, Daniel M. Bikel
State detecting apparatus, communication apparatus, and storage medium storing state detecting program

Patent number: 9020820

Abstract: A state detecting apparatus includes: a processor to execute acquiring utterance data related to uttered speech, computing a plurality of statistical quantities for feature parameters regarding features of the utterance data, creating, on the basis of the plurality of statistical quantities regarding the utterance data and another plurality of statistical quantities regarding reference utterance data based on other uttered speech, pseudo-utterance data having at least one statistical quantity equal to a statistical quantity in the other plurality of statistical quantities, computing a plurality of statistical quantities for synthetic utterance data synthesized on the basis of the pseudo-utterance data and the utterance data, and determining, on the basis of a comparison between statistical quantities of the synthetic utterance data and statistical quantities of the reference utterance data, whether the speaker who produced the uttered speech is in a first state or a second state; and a memory.

Type: Grant

Filed: April 13, 2012

Date of Patent: April 28, 2015

Assignee: Fujitsu Limited

Inventors: Shoji Hayakawa, Naoshi Matsuo
Recognizing the numeric language in natural spoken dialogue

Patent number: 8949127

Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.

Type: Grant

Filed: February 17, 2014

Date of Patent: February 3, 2015

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
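
The two stages named in the abstract above, rule-based conversion of recognized words into digits followed by a check against a validation database, can be sketched as follows. The word table, the "double"/"triple" rules, and the in-memory set standing in for the validation database are assumptions for illustration; the patent's rule classes are not specified in this listing.

```python
# Hypothetical numeric vocabulary; "oh" is treated as zero.
DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
REPEATS = {"double": 2, "triple": 3}

def words_to_digits(words):
    """Convert a recognized word string into a digit sequence, skipping filler words."""
    out = []
    repeat = 1
    for word in words:
        word = word.lower()
        if word in REPEATS:
            repeat = REPEATS[word]      # applies to the next digit word
        elif word in DIGITS:
            out.append(DIGITS[word] * repeat)
            repeat = 1
        # Any other word is treated as non-numeric filler and ignored.
    return "".join(out)

def is_valid(digits, valid_sequences):
    """Stand-in for the validation database lookup."""
    return digits in valid_sequences
```
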
Method and apparatus for detecting a target keyword

Publication number: 20140337031

Abstract: A method of detecting a target keyword for activating a function in an electronic device is disclosed. The method includes receiving an input sound starting from one of a plurality of portions of the target keyword. The input sound may be periodically received based on a duty cycle. The method extracts a plurality of sound features from the input sound, and obtains state information on a plurality of states associated with the portions of the target keyword. Based on the extracted sound features and the state information, the input sound may be detected as the target keyword. The plurality of states includes a predetermined number of entry states indicative of a predetermined number of the plurality of portions.

Type: Application

Filed: November 22, 2013

Publication date: November 13, 2014

Applicant: QUALCOMM Incorporated

Inventors: Sungwoong Kim, Minsub Lee, Taesu Kim, Minho Jin, Kyu Woong Hwang
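
The idea of entry states in the abstract above is that detection may begin partway into the keyword, for example when a duty-cycled microphone misses its first portion. A deliberately simplified sketch follows. It assumes the input has already been mapped to portion labels, which skips the feature extraction and state scoring the abstract mentions; the function name and labels are hypothetical.

```python
def detect_keyword(observed, keyword_portions, num_entry_states):
    """Return True if the observed portions match the keyword from any entry state.

    observed:          portion labels decoded from the input sound (assumed given)
    keyword_portions:  ordered portions of the target keyword
    num_entry_states:  how many leading portions may serve as a starting point
    """
    for entry in range(num_entry_states):
        if observed == keyword_portions[entry:]:
            return True
    return False
```
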
Unified treatment of data-sparseness and data-overfitting in maximum entropy modeling

Patent number: 8700403

Abstract: A method of statistical modeling is provided which includes constructing a statistical model and incorporating Gaussian priors during feature selection and during parameter optimization for the construction of the statistical model.

Type: Grant

Filed: November 3, 2005

Date of Patent: April 15, 2014

Assignee: Robert Bosch GmbH

Inventors: Fuliang Weng, Lin Zhao
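
For orientation, the standard way a Gaussian prior enters maximum entropy parameter estimation is as a quadratic penalty on the log-likelihood. The form below is the textbook objective, not a formula taken from the patent; the symbols (weights \lambda_i, prior means \mu_i, prior variances \sigma_i^2) are notation chosen here.

```latex
\hat{\lambda} = \arg\max_{\lambda}
  \left[ \sum_{(x,y)} \log p_{\lambda}(y \mid x)
       - \sum_{i} \frac{(\lambda_i - \mu_i)^2}{2\sigma_i^2} \right]
```

A small variance \sigma_i^2 pulls the corresponding weight strongly toward its prior mean, which limits overfitting on sparse features.
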
Methods and apparatus relating to searching of spoken audio data

Patent number: 8694317

Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.

Type: Grant

Filed: February 6, 2006

Date of Patent: April 8, 2014

Assignee: Aurix Limited

Inventors: Adrian I Skilling, Howard A K Wright
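
The search step in the abstract above, dynamic programming over an index that keeps a score for every reference phone in every frame, can be sketched as below. The index layout (one dict of phone costs per frame, lower meaning closer), the rule that each phone spans at least one frame, and the function name are assumptions; the patent's actual scoring and search details are not given in this listing.

```python
def search_index(index, phones):
    """Find the best-matching occurrence of a phone sequence in a per-frame index.

    index:  list of dicts, one per frame, mapping phone -> cost (lower is closer)
    phones: phone sequence of the search term
    Returns (end_frame, total_cost) of the lowest-cost match.
    """
    INF = float("inf")
    k = len(phones)
    prev = [INF] * k
    best_cost, best_end = INF, None
    for t, frame in enumerate(index):
        cur = [INF] * k
        for j, phone in enumerate(phones):
            if j == 0:
                # Either start a fresh match at this frame or stay in the first phone.
                enter = min(0.0, prev[0])
            else:
                # Either stay in this phone or advance from the previous phone.
                enter = min(prev[j], prev[j - 1])
            cur[j] = enter + frame[phone]
        if cur[-1] < best_cost:
            best_cost, best_end = cur[-1], t
        prev = cur
    return best_end, best_cost
```

Because every phone's score is kept for every frame, any search term can be evaluated later without re-processing the audio.
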
Automatic speech and concept recognition

Patent number: 8676580

Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.

Type: Grant

Filed: August 16, 2011

Date of Patent: March 18, 2014

Assignee: International Business Machines Corporation

Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
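
A rough sketch of the adjustment described in the abstract above: language model words that sound like a rule-based grammar word receive a higher transition cost. Similarity of phoneme sequences via `difflib.SequenceMatcher` is a stand-in for a real acoustic similarity measure, and the weight, the threshold, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

def acoustic_similarity(phones_a, phones_b):
    """Stand-in similarity in [0, 1] based on shared phoneme subsequences."""
    return SequenceMatcher(None, phones_a, phones_b).ratio()

def penalize_confusable_words(lm_costs, lm_prons, grammar_prons,
                              weight=2.0, min_sim=0.6):
    """Return a copy of lm_costs with extra cost on words that sound like grammar words."""
    modified = dict(lm_costs)
    for word, cost in lm_costs.items():
        sim = max((acoustic_similarity(lm_prons[word], pron)
                   for pron in grammar_prons.values()), default=0.0)
        if sim >= min_sim:
            modified[word] = cost + weight * sim
    return modified
```
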
Recognizing the numeric language in natural spoken dialogue

Patent number: 8655658

Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.

Type: Grant

Filed: October 25, 2011

Date of Patent: February 18, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
Method and equipment of pattern recognition, its program and its recording medium for improving searching efficiency in speech recognition

Patent number: 8612227

Abstract: The present invention provides a method and equipment of pattern recognition capable of efficiently pruning partial hypotheses without lowering recognition accuracy, its pattern recognition program, and its recording medium. In a second search unit, a likelihood calculation unit calculates an acoustic likelihood by matching time series data of acoustic feature parameters against a lexical tree stored in a second database and an acoustic model stored in a third database to determine an accumulated likelihood by accumulating the acoustic likelihood in a time direction. A self-transition unit causes each partial hypothesis to make a self-transition in a search process. An LR transition unit causes each partial hypothesis to make an LR transition. A reward attachment unit adds a reward R(x) in accordance with the number of reachable words to each partial hypothesis to raise the accumulated likelihood. A pruning unit excludes partial hypotheses with less likelihood from search targets.

Type: Grant

Filed: July 22, 2010

Date of Patent: December 17, 2013

Assignee: KDDI Corporation

Inventor: Tsuneo Kato
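
The reward-then-prune step in the abstract above can be sketched as a beam prune applied after a reward has been added to each hypothesis. The shape of R(x) is not given in this listing, so the reward is passed in as a caller-supplied function; the tuple layout, the beam width, and the function name are likewise assumptions.

```python
def prune(hypotheses, reward, beam_width):
    """Add a reward to each partial hypothesis, then keep those within the beam.

    hypotheses: list of (name, accumulated_log_likelihood, reachable_words)
    reward:     hypothetical function R(x) of the number of reachable words
    beam_width: maximum allowed gap to the best rewarded likelihood
    """
    rewarded = [(name, ll + reward(n), n) for name, ll, n in hypotheses]
    best = max(ll for _, ll, _ in rewarded)
    return [h for h in rewarded if h[1] >= best - beam_width]
```
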
Method and system for processing audio signals

Patent number: 8462193

Abstract: A multimedia conferencing system includes a loud speaker system, one or more microphones for receiving a local audio signal and a remote audio signal, a state machine and an echo canceller that operates in conjunction with two reference signals to remove substantially all of a feedback signal component in the local audio signal that results from reinforcing and playing the local audio signal over the loud speaker system. The state machine operates to detect that only the local audio is active, and if so controls the operation of the echo canceller such that only the feedback component of the local audio signal is removed and the local audio signal is not suppressed.

Type: Grant

Filed: January 8, 2010

Date of Patent: June 11, 2013

Assignee: Polycom, Inc.

Inventors: Kwan Truong, Peter Chu, Michael Pocino, John Allen
Botnet early detection using hybrid hidden Markov model algorithm

Patent number: 8307459

Abstract: A botnet detection system is provided. A bursty feature extractor receives an Internet Relay Chat (IRC) packet value from a detection object network, and determines a bursty feature accordingly. A Hybrid Hidden Markov Model (HHMM) parameter estimator determines probability parameters for a Hybrid Hidden Markov Model according to the bursty feature. A traffic profile generator establishes a probability sequential model for the Hybrid Hidden Markov Model according to the probability parameters and pre-defined network traffic categories. A dubious state detector determines a traffic state corresponding to a network relaying the IRC packet in response to reception of a new IRC packet, determines whether the IRC packet flow of the object network is dubious by applying the bursty feature to the probability sequential model for the Hybrid Hidden Markov Model, and generates a warning signal when the IRC packet flow is regarded as having a dubious traffic state.

Type: Grant

Filed: March 17, 2010

Date of Patent: November 6, 2012

Assignee: National Taiwan University of Science and Technology

Inventors: Hahn-Ming Lee, Ching-Hao Mao, Yu-Jie Chen, Yi-Hsun Wang, Jerome Yeh, Tsu-Han Chen
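
As background for the probability sequential model mentioned in the abstract above, the sketch below computes the likelihood of an observation sequence under an ordinary discrete hidden Markov model with the forward algorithm. It is not the patent's Hybrid HMM; the two-symbol observation coding (0 for quiet, 1 for bursty) and all probabilities in the usage are made-up values.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs) under a discrete HMM, computed with the forward algorithm.

    obs:   sequence of observation symbol indices
    start: start[s] = P(first state is s)
    trans: trans[i][j] = P(next state j | current state i)
    emit:  emit[s][o] = P(observing symbol o | state s)
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
    return sum(alpha)
```

Comparing this likelihood across models trained on different traffic categories is one conventional way to decide which category a new sequence resembles.
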
Hybrid speech recognition

Patent number: 8249877

Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.

Type: Grant

Filed: September 24, 2010

Date of Patent: August 21, 2012

Assignee: Multimodal Technologies, LLC

Inventor: Detlef Koll
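
One simple arbitration policy consistent with the abstract above is to take whichever engine reports the higher confidence and fall back to the other when one is unavailable. The (text, confidence) tuple format and the policy itself are assumptions; the patent may arbitrate on other criteria.

```python
def arbitrate(client_result, server_result):
    """Pick the output text from client-side and server-side recognition results.

    Each result is a (text, confidence) tuple, or None if that engine
    produced nothing (for example, the server was unreachable).
    """
    candidates = [r for r in (client_result, server_result) if r is not None]
    if not candidates:
        return None
    # On a confidence tie, the client result wins because it is listed first.
    return max(candidates, key=lambda r: r[1])[0]
```
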
Class detection scheme and time mediated averaging of class dependent models

Patent number: 8229744

Abstract: A method, system, and computer program for class detection and time mediated averaging of class dependent models. A technique is described to take advantage of gender information in training data and how to obtain female, male, and gender independent models from this information. By using a probability value to average male and female Gaussian Mixture Models (GMMs), dramatic deterioration in cross gender decoding performance is avoided.

Type: Grant

Filed: August 26, 2003

Date of Patent: July 24, 2012

Assignee: Nuance Communications, Inc.

Inventors: Satyanarayana Dharanipragada, Peder A. Olsen
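
Averaging two class-dependent GMMs with a probability value, as the abstract above describes, amounts to a weighted sum of their likelihoods. The sketch below uses one-dimensional mixtures for brevity; the function names and the example parameters are illustrative, and how the class probability evolves over time is left out.

```python
import math

def gmm_pdf(x, components):
    """Density of a 1-D Gaussian mixture; components are (weight, mean, variance)."""
    return sum(w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
               for w, m, v in components)

def blended_likelihood(x, male_gmm, female_gmm, p_male):
    """Average the two class-dependent models using the class probability p_male."""
    return p_male * gmm_pdf(x, male_gmm) + (1.0 - p_male) * gmm_pdf(x, female_gmm)
```

When the detected class is uncertain (p_male near 0.5), the blend behaves like a gender-independent model, so a wrong class guess degrades scores gradually instead of abruptly.
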
Comparing events in word spotting

Patent number: 8170873

Abstract: An approach to comparing events in word spotting, such as comparing putative and reference instances of a keyword, makes use of a set of models of subword units. For each of two acoustic events and for each of a series of times in each of the events, a probability associated with each of the models of the set of subword units is computed. Then, a quantity characterizing a comparison of the two acoustic events, one occurring in each of the two acoustic signals, is computed using the computed probabilities associated with each of the models.

Type: Grant

Filed: July 22, 2004

Date of Patent: May 1, 2012

Assignee: Nexidia Inc.

Inventor: Robert W. Morris
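
To make the comparison in the abstract above concrete, the sketch below represents each acoustic event as a series of per-frame probability vectors over the subword models and aligns the two series with dynamic time warping. DTW and the total-variation frame distance are swapped-in choices for illustration; the listing does not say which comparison quantity the patent computes.

```python
def frame_distance(p, q):
    """Total variation distance between two probability vectors (0 = identical)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def compare_events(event_a, event_b):
    """DTW alignment cost between two sequences of per-frame probability vectors."""
    INF = float("inf")
    n, m = len(event_a), len(event_b)
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(event_a[i - 1], event_b[j - 1])
            # Extend the cheapest of: skip in a, skip in b, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```
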
Recognizing the numeric language in natural spoken dialogue

Patent number: 8050925

Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.

Type: Grant

Filed: November 5, 2009

Date of Patent: November 1, 2011

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
Method of speaker adaptation for a hidden Markov model based voice recognition system

Patent number: 8041567

Abstract: Commercially available voice recognition systems are generally speaker-dependent, with the voice recognition system first being trained to the voice of the speaker before it can be used. A disadvantage with this method is that modified reference data has to be buffered and permanently saved in several steps when the speaker adaptation algorithm is executed, and thus requires a lot of memory space. This primarily negatively affects applications on devices with restricted processor power and limited memory space, such as mobile radio terminals for example. A method of speaker adaptation for a Hidden Markov Model based voice recognition system may address these issues. In the method, the memory space requirement and thus also the processor power required can be considerably reduced. This is achieved by using modified reference data in a speaker adaptation algorithm to adapt a new speaker to a reference speaker. The modified reference data is processed in compressed form.

Type: Grant

Filed: September 22, 2005

Date of Patent: October 18, 2011

Assignee: Siemens Aktiengesellschaft

Inventors: Sergey Astrov, Josef Bauer
Method for uncovering hidden Markov models

Patent number: 7912717

Abstract: The invention uses the ModelGrower program to generate possible candidates from an original or aggregated model. An isomorphic reduction program operates on the candidates to identify and exclude isomorphic models. A Markov model evaluation and optimization program operates on the remaining non-isomorphic candidates. The candidates are optimized and the ones that most closely conform to the data are kept. The best optimized candidate of one stage becomes the starting candidate for the next stage where ModelGrower and the other programs operate on the optimized candidate to generate a new optimized candidate. The invention repeats the steps of growing, excluding isomorphs, evaluating and optimizing until such repetitions yield no significantly better results.

Type: Grant

Filed: November 18, 2005

Date of Patent: March 22, 2011

Inventor: Albert Galick
Speech recognition device and speech recognition method and recording medium utilizing preliminary word selection

Patent number: 7881935

Abstract: A speech recognition apparatus that improves recognition accuracy while preventing resource requirements from increasing. Words that are probable as the result of the speech recognition are selected on the basis of an acoustic score and a linguistic score, while word selection is also performed on the basis of a measure different from the acoustic score, such as the number of phonemes being small, the part of speech being a pre-set one, inclusion in past results of speech recognition, or the linguistic score being not less than a pre-set value. The words so selected are subjected to matching processing.

Type: Grant

Filed: February 16, 2001

Date of Patent: February 1, 2011

Assignee: Sony Corporation

Inventors: Yasuharu Asano, Katsuki Minamino, Hiroaki Ogawa, Helmut Lucke
Speech recognition system for mobile terminal

Patent number: 7856356

Abstract: A speech recognition system for a mobile terminal includes an acoustic variation channel unit and a pronunciation variation channel unit. The acoustic variation channel unit transforms a speech signal into feature parameters and Viterbi-decodes the speech signal to produce a varied phoneme sequence by using the feature parameters and predetermined models. Further, the pronunciation variation channel unit Viterbi-decodes the varied phoneme sequence to produce a word phoneme sequence by using the varied phoneme sequence and a preset DHMM (Discrete Hidden Markov Model) based context-dependent error model.

Type: Grant

Filed: December 20, 2006

Date of Patent: December 21, 2010

Assignee: Electronics and Telecommunications Research Institute

Inventors: Hoon Chung, Yunkeun Lee
Voice recognition method and system based on the contexual modeling of voice units

Patent number: 7818172

Abstract: The method of recognizing speech in an acoustic signal comprises developing acoustic stochastic models of voice units in the form of a set of states of an acoustic signal and using the acoustic models for recognition by a comparison of the signal with predetermined acoustic models obtained via a prior learning process. While developing the acoustic models, the voice units are modeled by means of a first portion of the states independent of adjacent voice units and by means of a second portion of the states dependent on adjacent voice units. The second portion of states dependent on adjacent voice units shares common parameters with a plurality of units sharing same phonemes.

Type: Grant

Filed: April 20, 2004

Date of Patent: October 19, 2010

Assignee: France Telecom

Inventors: Ronaldo Messina, Denis Jouvet
Hidden trajectory modeling with differential cepstra for speech recognition

Patent number: 7805308

Abstract: A novel system for speech recognition uses differential cepstra over time frames as acoustic features, together with the traditional static cepstral features, for hidden trajectory modeling, and provides greater accuracy and performance in automatic speech recognition. According to one illustrative embodiment, an automatic speech recognition method includes receiving a speech input, generating an interpretation of the speech, and providing an output based at least in part on the interpretation of the speech input. The interpretation of the speech uses hidden trajectory modeling with observation vectors that are based on cepstra and on differential cepstra derived from the cepstra. A method is developed that can automatically train the hidden trajectory model's parameters that correspond to the components of the differential cepstra in the full acoustic feature vectors.

Type: Grant

Filed: January 19, 2007

Date of Patent: September 28, 2010

Assignee: Microsoft Corporation

Inventors: Li Deng, Dong Yu
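
As a rough illustration of the feature construction described in the abstract of patent 7805308 above, the sketch below builds observation vectors from static cepstra plus differential cepstra. The function names, the symmetric frame difference, the `step` parameter, and the edge clamping are all assumptions made for this example; the abstract does not specify how the differences are computed.

```python
from typing import List

Frame = List[float]


def differential_cepstra(cepstra: List[Frame], step: int = 1) -> List[Frame]:
    """Difference between the frame `step` ahead and the frame `step` behind.

    Assumption: indices are clamped at the first and last frame.
    """
    last = len(cepstra) - 1
    deltas = []
    for t in range(len(cepstra)):
        ahead = cepstra[min(t + step, last)]
        behind = cepstra[max(t - step, 0)]
        deltas.append([a - b for a, b in zip(ahead, behind)])
    return deltas


def observation_vectors(cepstra: List[Frame], step: int = 1) -> List[Frame]:
    """Concatenate each static cepstral frame with its differential cepstra."""
    deltas = differential_cepstra(cepstra, step)
    return [static + delta for static, delta in zip(cepstra, deltas)]


# Three hypothetical frames with two cepstral coefficients each.
cepstra = [[1.0, 0.5], [2.0, 0.0], [4.0, -1.0]]
obs = observation_vectors(cepstra)
```

Each resulting vector is twice as long as the static frame: the first half holds the cepstra and the second half holds the differences over time.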
Speech recognition system and technique

Patent number: 7711561

Abstract: The present invention relates to speech recognition systems, particularly speech-to-text systems and software and decoders for the same.

Type: Grant

Filed: April 15, 2004

Date of Patent: May 4, 2010

Assignee: Kabushiki Kaisha Toshiba

Inventors: Wide Hogenhout, Kean Kheong Chin
Methods and apparatus for flexible speech recognition

Patent number: 7698136

Abstract: The present invention is directed to a computer-implemented method and apparatus for flexibly recognizing meaningful data items within an arbitrary user utterance. According to one example embodiment of the invention, a set of one or more key phrases and a set of one or more filler phrases are defined, probabilities are assigned to the key phrases and/or the filler phrases, and the user utterance is evaluated against the set of key phrases and the set of filler phrases using the probabilities.

Type: Grant

Filed: January 28, 2003

Date of Patent: April 13, 2010

Assignee: Voxify, Inc.

Inventors: Patrick T. M. Nguyen, Adeeb W. M. Shana'a, Amit V. Desai
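
One way to picture the key-phrase and filler-phrase evaluation described in the abstract of patent 7698136 above is a dynamic-programming search for the most probable way to cover an utterance with phrases. This is only a sketch under stated assumptions: the abstract does not name a search algorithm, and `best_parse`, the phrase sets, and the probabilities below are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple


def best_parse(tokens: List[str],
               key_phrases: Dict[str, float],
               filler_phrases: Dict[str, float]
               ) -> Optional[List[Tuple[str, bool]]]:
    """Return the most probable cover of `tokens` as (phrase, is_key) pairs.

    Returns None when the utterance cannot be covered by the given phrases.
    """
    phrases = [(tuple(p.split()), prob, True) for p, prob in key_phrases.items()]
    phrases += [(tuple(p.split()), prob, False) for p, prob in filler_phrases.items()]
    n = len(tokens)
    score = [-math.inf] * (n + 1)   # best log-probability of covering tokens[:i]
    back = [None] * (n + 1)         # back-pointer: (start, phrase, is_key)
    score[0] = 0.0
    for end in range(1, n + 1):
        for words, prob, is_key in phrases:
            start = end - len(words)
            if start < 0 or score[start] == -math.inf:
                continue
            if tuple(tokens[start:end]) != words:
                continue
            candidate = score[start] + math.log(prob)
            if candidate > score[end]:
                score[end] = candidate
                back[end] = (start, " ".join(words), is_key)
    if score[n] == -math.inf:
        return None
    parse = []
    pos = n
    while pos > 0:                  # walk the back-pointers to recover the parse
        start, phrase, is_key = back[pos]
        parse.append((phrase, is_key))
        pos = start
    parse.reverse()
    return parse


keys = {"check my balance": 0.3, "balance": 0.1}
fillers = {"um": 0.1, "i want to": 0.2, "please": 0.1, "check": 0.05, "my": 0.05}
parse = best_parse("um i want to check my balance please".split(), keys, fillers)
found_keys = [phrase for phrase, is_key in parse if is_key]
```

In this example the longer key phrase wins over the word-by-word alternative because its assigned probability is higher than the product of the shorter phrases' probabilities.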
Recognizing the numeric language in natural spoken dialogue

Patent number: 7624015

Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.

Type: Grant

Filed: March 2, 2006

Date of Patent: November 24, 2009

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
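
The last two stages described in the abstract of patent 7624015 above (rule-based conversion of a word string into digits, followed by a lookup against stored valid sequences) can be sketched as follows. The word table, the "double" rule, and the contents of `VALID_SEQUENCES` are hypothetical stand-ins; the abstract does not list the actual rule classes or database contents.

```python
from typing import Iterable, Set

# Hypothetical subset of a numeric vocabulary.
WORD_TO_DIGIT = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}

# Hypothetical validation database of known-good digit strings.
VALID_SEQUENCES: Set[str] = {"5550123", "8005551212"}


def words_to_digits(words: Iterable[str]) -> str:
    """Convert recognized words to a digit string, ignoring non-numeric words.

    Assumed rule: "double" repeats the digit that immediately follows it.
    """
    digits = []
    repeat = 1
    for word in words:
        word = word.lower()
        if word == "double":
            repeat = 2
        elif word in WORD_TO_DIGIT:
            digits.append(WORD_TO_DIGIT[word] * repeat)
            repeat = 1
        else:
            repeat = 1  # a non-numeric word cancels a pending "double"
    return "".join(digits)


def is_valid(digit_string: str, database: Set[str] = VALID_SEQUENCES) -> bool:
    """Report whether the digit string appears in the validation database."""
    return digit_string in database


recognized = "my number is five five five oh one two three".split()
digits = words_to_digits(recognized)
valid = is_valid(digits)
```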
Method of speech recognition using hidden trajectory Hidden Markov Models

Patent number: 7617104

Abstract: A method of speech recognition is provided that determines a production-related value, vocal-tract resonance frequencies in particular, for a state at a particular frame based on the production-related values associated with two preceding frames using a recursion. The production-related value is used to determine a probability distribution of the observed feature vector for the state. A probability for an observed value received for the frame is then determined from the probability distribution. Under one embodiment, the production-related value is determined using a noise-free recursive definition for the value. Use of the recursion substantially improves the decoding speed. When the decoding algorithm is applied to training data with known phonetic transcripts, forced alignment is created which improves the phone segmentation obtained from the prior art.

Type: Grant

Filed: January 21, 2003

Date of Patent: November 10, 2009

Assignee: Microsoft Corporation

Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide
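
The abstract of patent 7617104 above describes a noise-free recursion in which the value at a frame depends on the values at the two preceding frames, and that value then determines a probability distribution for the observation. The sketch below shows one recursion of that general shape. The specific second-order form, the smoothing constant `gamma`, the per-frame targets, and the use of a Gaussian whose mean is the recursed value are assumptions for illustration, not details taken from the abstract.

```python
import math
from typing import List


def trajectory(targets: List[float], gamma: float, z_init: float) -> List[float]:
    """Noise-free second-order recursion toward per-frame targets.

    Assumed form: z[t] = 2*gamma*z[t-1] - gamma**2*z[t-2] + (1-gamma)**2*target[t]
    The first two frames are initialized to `z_init`.
    """
    z = [z_init, z_init]
    for t in range(2, len(targets)):
        z.append(2 * gamma * z[t - 1]
                 - gamma ** 2 * z[t - 2]
                 + (1 - gamma) ** 2 * targets[t])
    return z[:len(targets)]


def log_gaussian(x: float, mean: float, var: float) -> float:
    """Log density of a one-dimensional Gaussian observation model."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


# Hypothetical resonance target of 1000 Hz, starting from 500 Hz.
z = trajectory([1000.0] * 4, gamma=0.5, z_init=500.0)

# Score an observed value at the last frame against the predicted value.
score = log_gaussian(740.0, mean=z[-1], var=100.0)
```

Because the recursion has no noise term, the value for a state at a frame is fully determined by the two preceding values, so it can be computed once per hypothesis rather than integrated over.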
Natural language system and method based on unisolated performance metric

Patent number: 7574358

Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.

Type: Grant

Filed: February 28, 2005

Date of Patent: August 11, 2009

Assignee: International Business Machines Corporation

Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
Block synchronous decoding

Patent number: 7529671

Abstract: A pattern recognition system and method are provided. Aspects of the invention are particularly useful in combination with multi-state Hidden Markov Models. Pattern recognition is effected by processing Hidden Markov Model Blocks. This block-processing allows the processor to perform more operations upon data while such data is in cache memory. By so increasing cache locality, aspects of the invention provide significantly improved pattern recognition speed.

Type: Grant

Filed: March 4, 2003

Date of Patent: May 5, 2009

Assignee: Microsoft Corporation

Inventors: William H. Rockenbeck, Julian J. Odell
System and Method for Text Tagging and Segmentation Using a Generative/Discriminative Hybrid Hidden Markov Model

Publication number: 20090055183

Abstract: A method for sequence tagging medical patient records includes providing a labeled corpus of sentences taken from a set of medical records, initializing generative parameters $\theta$ and discriminative parameters $\tilde{\theta}$, providing a functional $LL - C \times \mathrm{Penalty}$, where $LL$ is a log-likelihood function $LL = \log p(\theta, \tilde{\theta}) + \sum_{l=1}^{M} \left[ \log p(X_l, Y_l \mid \tilde{\theta}) - \log p(X_l \mid \tilde{\theta}) \right] + \sum_{l=1}^{M} \log p(X_l \mid \theta)$, $\mathrm{Penalty} = \sum_{y \in V_Y} \left( em_y^2 + tr_y^2 + e\tilde{m}_y^2 + t\tilde{r}_y^2 \right)$, where $em_y = 1 - \sum_{x_i \in V_X} p(x_i \mid y)$, $e\tilde{m}_y = 1 - \sum_{x_i \in V_X} \tilde{p}(x_i \mid y)$ are emission probability constraints, $tr_y = 1 - \sum_{y_i \in V_Y} p(y_i \mid y)$, $t\tilde{r}_y = 1 - \sum_{y_i \in V_Y} \tilde{p}(y_i \mid y)$ are transition probability constraints, and extracting gradients of $LL - C \times \mathrm{Penalty}$ with respect to the transition and emission probabilities and solving …

Type: Application

Filed: August 21, 2008

Publication date: February 26, 2009

Applicant: Siemens Medical Solutions USA, Inc.

Inventors: Oksana Yakhnenko, Romer E. Rosales, Radu Stefan Niculescu, Lucian Vlad Lita
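
The penalty term in the abstract of publication 20090055183 above measures how far each emission and transition distribution is from summing to one, for both the generative and the discriminative parameter sets. A minimal sketch of that computation follows, assuming each table is stored as a nested dictionary keyed by label; the function names, data layout, and numbers are illustrative, and the log-likelihood is passed in as a precomputed value rather than derived here.

```python
from typing import Dict

Table = Dict[str, Dict[str, float]]  # label y -> {symbol or next label: probability}


def constraint(distribution: Dict[str, float]) -> float:
    """One minus the total probability mass; zero when the distribution is normalized."""
    return 1.0 - sum(distribution.values())


def penalty(emission: Table, transition: Table,
            emission_disc: Table, transition_disc: Table) -> float:
    """Sum over labels of the squared emission and transition constraints
    for the generative tables and the discriminative tables."""
    total = 0.0
    for y in transition:
        for table in (emission, transition, emission_disc, transition_disc):
            total += constraint(table[y]) ** 2
    return total


def objective(log_likelihood: float, c: float, penalty_value: float) -> float:
    """The functional LL - C x Penalty."""
    return log_likelihood - c * penalty_value


# Hypothetical two-label model; only the generative emission row for "A"
# fails to sum to one (0.5 + 0.25 = 0.75).
emission = {"A": {"x1": 0.5, "x2": 0.25}, "B": {"x1": 0.5, "x2": 0.5}}
normalized = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}
emission_disc = {"A": {"x1": 0.5, "x2": 0.5}, "B": {"x1": 0.5, "x2": 0.5}}

pen = penalty(emission, normalized, emission_disc, normalized)
value = objective(-10.0, 4.0, pen)
```

Treating normalization as a soft penalty rather than a hard constraint lets the parameters be updated by gradient steps, with the weight C controlling how strongly unnormalized rows are discouraged.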
Speech recognition apparatus

Patent number: 7437288

Abstract: A speech recognition apparatus using a probability model that employs a mixed distribution, the apparatus formed by a standard pattern storage means for storing a standard pattern; a recognition means for outputting recognition results corresponding to an input speech by using the standard pattern; a standard pattern generating means for inputting learning speech and generating the standard pattern; and a standard pattern adjustment means, provided between the standard pattern generating means and the standard pattern storage means, for adjusting the number of element distributions of the mixed distribution of the standard pattern.

Type: Grant

Filed: March 11, 2002

Date of Patent: October 14, 2008

Assignee: NEC Corporation

Inventor: Koichi Shinoda
Embedded multi-layer coupled hidden Markov model

Patent number: 7089185

Abstract: An arrangement is provided for an embedded coupled hidden Markov model. To train an embedded coupled hidden Markov model, training data is first segmented into uniform segments at different layers of the embedded coupled hidden Markov model. At each layer, a uniform segment corresponds to a state of a coupled hidden Markov model at that layer. An optimal segmentation is generated at the lower layer based on the uniform segmentation and is then used to update parameters of models associated with the states of coupled hidden Markov models at the lower layer. The updated model parameters at the lower layer are then used to update the model parameters associated with states at the super layer.

Type: Grant

Filed: June 27, 2002

Date of Patent: August 8, 2006

Assignee: Intel Corporation

Inventor: Ara V. Nefian
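
The first training step described in the abstract of patent 7089185 above, uniform segmentation with one segment per state, can be sketched for a single layer as follows. This is a simplified illustration: the function names are hypothetical, the observations are scalars rather than feature vectors, and the coupling between channels and the later re-segmentation steps are not shown.

```python
from typing import List, Tuple


def uniform_segments(num_frames: int, num_states: int) -> List[Tuple[int, int]]:
    """Split frame indices into contiguous, near-equal (start, end) ranges,
    one per state; `end` is exclusive."""
    bounds = [i * num_frames // num_states for i in range(num_states + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_states)]


def initial_state_means(frames: List[float], num_states: int) -> List[float]:
    """Initial per-state mean estimated from the frames in each uniform segment.

    Assumption: an empty segment (more states than frames) gets a mean of 0.0.
    """
    means = []
    for start, end in uniform_segments(len(frames), num_states):
        segment = frames[start:end]
        means.append(sum(segment) / len(segment) if segment else 0.0)
    return means


# Ten hypothetical scalar observations assigned to four states.
frames = [1.0, 1.0, 2.0, 2.0, 2.0, 5.0, 5.0, 9.0, 9.0, 9.0]
segments = uniform_segments(len(frames), 4)
means = initial_state_means(frames, 4)
```

These initial estimates would then serve as the starting point that a later optimal segmentation refines.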