Hidden Markov (hm) Network (epo) Patents (Class 704/256.5)
  • Patent number: 11712198
    Abstract: The present invention relates to a system and method for determining sleep quality parameters according to audio analyses, comprising: obtaining an audio recorded signal comprising sleep sounds of a subject; segmenting the signal into epochs; generating a feature vector for each epoch, wherein each of said feature vectors comprises one or more feature parameters that are associated with a particular characteristic of the signal and that are calculated according to the epoch signal or according to a signal generated from the epoch signal; inputting the generated feature vectors into a machine learning classifier and applying a preformed classifying model on the feature vectors that outputs a probabilities vector for each epoch, wherein each of the probabilities vectors comprises the probabilities of the epoch being each of the sleep quality parameters; inputting the probabilities vectors for each epoch into a machine learning time series model and applying a preformed sleep quality time series pattern function
    Type: Grant
    Filed: July 11, 2017
    Date of Patent: August 1, 2023
    Assignees: B.G. NEGEV TECHNOLOGIES AND APPLICATIONS LTD., AT BEN-GURION UNIVERSITY, MOR RESEARCH APPLICATIONS LTD.
    Inventors: Eliran Dafna, Yaniv Zigel, Dvir Ben Or, Matan Halevi, Ariel Tarasiuk
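
    The abstract describes a two-stage pipeline: a per-epoch classifier emits a probabilities vector, and a time series model then works on the sequence of those vectors. The sketch below is one possible way to illustrate the second stage, using Viterbi-style decoding. It is not taken from the patent: the function name `smooth_epoch_labels`, the transition matrix, the prior, and the use of classifier probabilities in place of emission likelihoods are all assumptions made for illustration.

```python
import math


def smooth_epoch_labels(prob_vectors, transition, prior):
    """Return the most likely label sequence for a run of epochs.

    prob_vectors: one list of class probabilities per epoch (classifier output).
    transition:   transition[r][s] = assumed probability of label s following r.
    prior:        assumed probability of each label at the first epoch.
    """
    n_states = len(prior)

    def log(p):
        # Floor tiny values so log() never sees zero.
        return math.log(max(p, 1e-12))

    # score[s] holds the best log-score of any path ending in label s.
    score = [log(prior[s]) + log(prob_vectors[0][s]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of label s at epoch t + 1
    for probs in prob_vectors[1:]:
        prev, score, step = score, [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda r: prev[r] + log(transition[r][s]))
            step.append(best)
            score.append(prev[best] + log(transition[best][s]) + log(probs[s]))
        back.append(step)

    # Trace the best path backwards from the best final label.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for step in reversed(back):
        state = step[state]
        path.append(state)
    return path[::-1]
```

    With a "sticky" transition matrix, a single epoch whose classifier output narrowly favours another label is pulled back to the label of its neighbours, which is the kind of temporal smoothing a sequence model adds on top of independent per-epoch decisions.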
  • Patent number: 11593571
    Abstract: A machine translation method includes: receiving to-be-processed information expressed in a source language; encoding the to-be-processed information, and generating an expression vector sequence of the to-be-processed information; and predicting feature information of a target foresight word at a first moment by using a prediction model. The feature information includes at least one of a part of speech or a word category of the target foresight word. The method also includes: determining a context vector corresponding to the first moment in the expression vector sequence according to the feature information of the target foresight word; and decoding the context vector by using a decoder, to obtain target content that corresponds to the context vector and expressed in a target language.
    Type: Grant
    Filed: June 8, 2020
    Date of Patent: February 28, 2023
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Lemao Liu, Xintong Li, Shuming Shi
  • Patent number: 11580959
    Abstract: An approach to correcting transcriptions of speech recognition models may be provided. A list of similar sounding phonemes associated with the phonemes of high frequency terms may be generated for a particular node associated with a virtual assistant. An utterance may be transcribed and receive a confidence score regarding the correctness of the transcription based on audio metrics and other factors. The phonemes of the utterance can be compared to the phonemes of the high frequency terms from the list and a score for the matching phonemes and similar sounding phonemes can be determined. If it is determined the sounds similar score for a term from the high frequency term list is above a threshold, the transcription can be replaced with the term, providing a corrected transcription.
    Type: Grant
    Filed: September 28, 2020
    Date of Patent: February 14, 2023
    Assignee: International Business Machines Corporation
    Inventors: Andrew R. Freed, Marco Noel, Aishwarya Hariharan, Martha Holloman, Mohammad Gorji-Sefidmazgi, Daniel Zyska
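
    The correction step in this abstract (compare phonemes, score exact and similar-sounding matches, replace the transcription when the score clears a threshold) can be sketched as follows. Everything concrete here is hypothetical: the `SIMILAR` table, the 1.0 and 0.5 weights, both thresholds, and the function names are illustrative choices, not values from the patent.

```python
# Hypothetical table of phoneme pairs treated as similar sounding.
SIMILAR = {frozenset(("B", "P")), frozenset(("D", "T")), frozenset(("M", "N"))}


def sounds_similar_score(utterance, term):
    """Score two equal-length phoneme lists in the range 0..1."""
    if not term or len(utterance) != len(term):
        return 0.0
    total = 0.0
    for a, b in zip(utterance, term):
        if a == b:
            total += 1.0          # exact phoneme match
        elif frozenset((a, b)) in SIMILAR:
            total += 0.5          # similar sounding phoneme (assumed weight)
    return total / len(term)


def correct_transcription(transcript, confidence, utterance_phonemes,
                          high_freq_terms, min_confidence=0.8, threshold=0.75):
    """Replace a low-confidence transcript with the best-matching frequent term."""
    if confidence >= min_confidence:
        return transcript  # trust the recognizer as-is
    best_term, best_score = None, 0.0
    for term, phonemes in high_freq_terms.items():
        score = sounds_similar_score(utterance_phonemes, phonemes)
        if score > best_score:
            best_term, best_score = term, score
    if best_term is not None and best_score > threshold:
        return best_term
    return transcript
```

    For example, a low-confidence "pill" with phonemes P IH L scores (0.5 + 1 + 1) / 3 against a frequent term "bill" and would be replaced under these assumed settings.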
  • Patent number: 11545149
    Abstract: Disclosed are an electronic device capable of efficiently performing speech recognition and natural language understanding and a method for controlling thereof. The electronic device includes: a microphone; a non-volatile memory configured to store virtual assistant model data comprising data that is classified according to a plurality of domains and data that is commonly used for the plurality of domains; a volatile memory; and a processor configured to: based on receiving, through the microphone, a trigger input to perform speech recognition for a user speech, initiate loading the virtual assistant model data from the non-volatile memory into the volatile memory, load, into the volatile memory, first data from among the data classified according to the plurality of domains and, while loading the first data into the volatile memory, load at least a part of the data commonly used for the plurality of domains into the volatile memory.
    Type: Grant
    Filed: June 10, 2020
    Date of Patent: January 3, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Saebom Jang, Hyeonmok Ko, Kyenghun Lee, Kunal Sharma, Raghavendra Hanumantasetty Ramasetty
  • Patent number: 11288039
    Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD foregoes selection of the cloud-based VAS when the local voice assistant is selected.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: March 29, 2022
    Assignee: Sonos, Inc.
    Inventors: Dayn Wilberding, John Tolomei
  • Patent number: 11175888
    Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD foregoes selection of the cloud-based VAS when the local voice assistant is selected.
    Type: Grant
    Filed: January 25, 2021
    Date of Patent: November 16, 2021
    Assignee: Sonos, Inc.
    Inventors: Dayn Wilberding, John Tolomei
  • Patent number: 11127394
    Abstract: Techniques related to keyphrase detection for applications such as wake on voice are disclosed herein. Such techniques may have high accuracy by using scores of phone positions in triphones to select which triphones to use with a rejection model, using context-related phones for the rejection model, adding silence before keyphrase sounds for a keyphrase model, or any combination of these.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: September 21, 2021
    Assignee: Intel Corporation
    Inventors: Sebastian Czyryba, Tobias Bocklet, Kuba Lopatka
  • Patent number: 11100919
    Abstract: There is provided an information processing device including an analysis unit configured to analyze a character string indicating contents of utterance obtained as a result of speech recognition, and a display control unit configured to display the character string indicating the contents of the utterance and an analysis result on a display screen.
    Type: Grant
    Filed: July 26, 2019
    Date of Patent: August 24, 2021
    Assignee: SATURN LICENSING LLC
    Inventors: Tomoaki Takemura, Shinya Masunaga, Koji Fujita, Katsutoshi Ishiwata, Kenichi Ikenaga, Katsutoshi Kusumoto
  • Patent number: 10606555
    Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD foregoes selection of the cloud-based VAS when the local voice assistant is selected.
    Type: Grant
    Filed: November 4, 2019
    Date of Patent: March 31, 2020
    Assignee: Sonos, Inc.
    Inventors: Dayn Wilberding, John Tolomei
  • Patent number: 10466962
    Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a media playback system is configured to (i) capture a voice input via at least one microphone device, (ii) detect inclusion of one or more of the commands within the voice input, (iii) determine that the one or more commands meet corresponding command criteria associated with the one or more commands within the set of command information, and (iv) in response to the determination, select a first voice assistant service (VAS) and (a) forego selection of a second VAS, (b) send the voice input to the first VAS, and (c) after sending the voice input, receive a response to the voice input from the first VAS.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: November 5, 2019
    Assignee: Sonos, Inc.
    Inventors: Dayn Wilberding, John Tolomei
  • Patent number: 10424291
    Abstract: There is provided an information processing device including an analysis unit configured to analyze a character string indicating contents of utterance obtained as a result of speech recognition, and a display control unit configured to display the character string indicating the contents of the utterance and an analysis result on a display screen.
    Type: Grant
    Filed: November 20, 2013
    Date of Patent: September 24, 2019
    Assignee: Saturn Licensing LLC
    Inventors: Tomoaki Takemura, Shinya Masunaga, Koji Fujita, Katsutoshi Ishiwata, Kenichi Ikenaga, Katsutoshi Kusumoto
  • Patent number: 10204624
    Abstract: One embodiment provides a method, comprising: receiving, at an information handling device, user input comprising a word that activates a digital assistant associated with the information handling device; determining, using a processor, whether the word comprises a false positive wake word, wherein the false positive wake word comprises a stored wake word; and responsive to determining that the word is associated with a false positive wake word, providing output associated with an alternative wake word. Other aspects are described and claimed.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: February 12, 2019
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Ryan Charles Knudson, Russell Speight VanBlon, Roderick Echols, Jonathan Gaither Knox
  • Patent number: 9183196
    Abstract: A language processing system uses annotation services that are external to the language processing system to identify n-grams that identify entities in an input sentence. The n-grams are annotated by the annotation services. The annotations are used to determine which n-grams, if any, correspond to instances of an entity type (e.g., values for a variable or terminals for a non-terminal). After determining which n-grams correspond to entity types, parse initializations are generated for parsing rules and parses for each rule are attempted. The rules that successfully parse are used to determine whether the input sentence invokes a specific action, and if so, what arguments are to be passed to the invocation of the action.
    Type: Grant
    Filed: June 25, 2013
    Date of Patent: November 10, 2015
    Assignee: Google Inc.
    Inventors: Jakob D. Uszkoreit, Percy Liang, Daniel M. Bikel
  • Patent number: 9020820
    Abstract: A state detecting apparatus includes: a processor to execute acquiring utterance data related to uttered speech, computing a plurality of statistical quantities for feature parameters regarding features of the utterance data, creating, on the basis of the plurality of statistical quantities regarding the utterance data and another plurality of statistical quantities regarding reference utterance data based on other uttered speech, pseudo-utterance data having at least one statistical quantity equal to a statistical quantity in the other plurality of statistical quantities, computing a plurality of statistical quantities for synthetic utterance data synthesized on the basis of the pseudo-utterance data and the utterance data, and determining, on the basis of a comparison between statistical quantities of the synthetic utterance data and statistical quantities of the reference utterance data, whether the speaker who produced the uttered speech is in a first state or a second state; and a memory.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: April 28, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 8949127
    Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
    Type: Grant
    Filed: February 17, 2014
    Date of Patent: February 3, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
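
    The last two stages in this abstract, converting a recognized word string into a digit sequence and checking it against stored valid sequences, can be illustrated with a small sketch. The rule set below (digit words, "oh" for zero, "double" and "triple" as repeat markers, other words ignored) is a hypothetical stand-in for the patent's classes of rules, and the names `words_to_digits` and `validate` are assumptions.

```python
# Hypothetical word-to-digit rules; the patent's actual rule classes are not given here.
DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
REPEATS = {"double": 2, "triple": 3}


def words_to_digits(words):
    """Convert recognized words to a digit string, skipping non-numeric words."""
    out = []
    repeat = 1
    for word in words:
        if word in REPEATS:
            repeat = REPEATS[word]      # applies to the next digit word
        elif word in DIGITS:
            out.append(DIGITS[word] * repeat)
            repeat = 1
        # Any other word is treated as filler and ignored.
    return "".join(out)


def validate(words, valid_sequences):
    """Return the digit string and whether it appears in the validation set."""
    digits = words_to_digits(words)
    return digits, digits in valid_sequences
```

    Under these assumed rules, "my number is nine oh double five" yields "9055", which is then accepted or rejected by a simple set lookup.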
  • Publication number: 20140337031
    Abstract: A method of detecting a target keyword for activating a function in an electronic device is disclosed. The method includes receiving an input sound starting from one of the plurality of portions of the target keyword. The input sound may be periodically received based on a duty cycle. The method extracts a plurality of sound features from the input sound, and obtains state information on a plurality of states associated with the portions of the target keyword. Based on the extracted sound features and the state information, the input sound may be detected as the target keyword. The plurality of states includes a predetermined number of entry states indicative of a predetermined number of the plurality of portions.
    Type: Application
    Filed: November 22, 2013
    Publication date: November 13, 2014
    Applicant: QUALCOMM Incorporated
    Inventors: Sungwoong Kim, Minsub Lee, Taesu Kim, Minho Jin, Kyu Woong Hwang
  • Patent number: 8700403
    Abstract: A method of statistical modeling is provided which includes constructing a statistical model and incorporating Gaussian priors during feature selection and during parameter optimization for the construction of the statistical model.
    Type: Grant
    Filed: November 3, 2005
    Date of Patent: April 15, 2014
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Lin Zhao
  • Patent number: 8694317
    Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: April 8, 2014
    Assignee: Aurix Limited
    Inventors: Adrian I Skilling, Howard A K Wright
  • Patent number: 8676580
    Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
    Type: Grant
    Filed: August 16, 2011
    Date of Patent: March 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
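
    The core idea in this abstract is to raise the transition cost of language model words that sound like rule-based grammar words. A minimal sketch follows, assuming costs are stored per word and that an acoustic similarity function returning a value in 0..1 is supplied by the caller. The linear penalty, `scale`, and `min_similarity` are illustrative assumptions, not the patent's formula.

```python
def penalize_confusable_words(lm_costs, grammar_words, similarity,
                              scale=2.0, min_similarity=0.5):
    """Return a copy of lm_costs with higher costs for confusable words.

    lm_costs:      dict mapping language model word -> transition cost.
    grammar_words: words belonging to the rule-based grammar.
    similarity:    callable(lm_word, grammar_word) -> value in 0..1 (assumed).
    """
    modified = dict(lm_costs)
    for lm_word in lm_costs:
        # Use the most similar grammar word as the basis for the penalty.
        worst = max((similarity(lm_word, g) for g in grammar_words), default=0.0)
        if worst >= min_similarity:
            modified[lm_word] += scale * worst
    return modified
```

    Words with no close acoustic neighbour in the grammar keep their original cost, so only the confusable part of the language model is made less attractive to the decoder.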
  • Patent number: 8655658
    Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
    Type: Grant
    Filed: October 25, 2011
    Date of Patent: February 18, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
  • Patent number: 8612227
    Abstract: The present invention provides a method and equipment of pattern recognition capable of efficiently pruning partial hypotheses without lowering recognition accuracy, its pattern recognition program, and its recording medium. In a second search unit, a likelihood calculation unit calculates an acoustic likelihood by matching time series data of acoustic feature parameters against a lexical tree stored in a second database and an acoustic model stored in a third database to determine an accumulated likelihood by accumulating the acoustic likelihood in a time direction. A self-transition unit causes each partial hypothesis to make a self-transition in a search process. An LR transition unit causes each partial hypothesis to make an LR transition. A reward attachment unit adds a reward R(x) in accordance with the number of reachable words to each partial hypothesis to raise the accumulated likelihood. A pruning unit excludes partial hypotheses with less likelihood from search targets.
    Type: Grant
    Filed: July 22, 2010
    Date of Patent: December 17, 2013
    Assignee: KDDI Corporation
    Inventor: Tsuneo Kato
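
    The reward-then-prune step in this abstract can be sketched as a beam pruning pass. The abstract does not give the form of R(x), so the reward is passed in as a function of the number of reachable words. The dictionary keys, the beam width, and the function name are assumptions for illustration.

```python
def prune_with_reward(hypotheses, reward, beam_width):
    """Keep partial hypotheses whose rewarded score is within the beam.

    hypotheses: list of dicts with "log_likelihood" (accumulated likelihood)
                and "reachable_words" (words still reachable in the lexical tree).
    reward:     callable(reachable_words) -> bonus added to the likelihood;
                its shape is an assumption, not specified by the abstract.
    """
    if not hypotheses:
        return []
    scored = [(h["log_likelihood"] + reward(h["reachable_words"]), h)
              for h in hypotheses]
    best = max(score for score, _ in scored)
    # Exclude hypotheses that fall more than beam_width below the best one.
    return [h for score, h in scored if score >= best - beam_width]
```

    Because the reward is applied before the beam comparison, a hypothesis that would be cut on raw likelihood alone can survive if its reward is large enough.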
  • Patent number: 8462193
    Abstract: A multimedia conferencing system includes a loud speaker system, one or more microphones for receiving a local audio signal and a remote audio signal, a state machine and an echo canceller that operates in conjunction with two reference signals to remove substantially all of a feedback signal component in the local audio signal that results from reinforcing and playing the local audio signal over the loud speaker system. The state machine operates to detect that only the local audio is active, and if so controls the operation of the echo canceller such that only the feedback component of the local audio signal is removed and the local audio signal is not suppressed.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: June 11, 2013
    Assignee: Polycom, Inc.
    Inventors: Kwan Truong, Peter Chu, Michael Pocino, John Allen
  • Patent number: 8307459
    Abstract: A botnet detection system is provided. A bursty feature extractor receives an Internet Relay Chat (IRC) packet value from a detection object network, and determines a bursty feature accordingly. A Hybrid Hidden Markov Model (HHMM) parameter estimator determines probability parameters for a Hybrid Hidden Markov Model according to the bursty feature. A traffic profile generator establishes a probability sequential model for the Hybrid Hidden Markov Model according to the probability parameters and pre-defined network traffic categories. A dubious state detector determines a traffic state corresponding to a network relaying the IRC packet in response to reception of a new IRC packet, determines whether the IRC packet flow of the object network is dubious by applying the bursty feature to the probability sequential model for the Hybrid Hidden Markov Model, and generates a warning signal when the IRC packet flow is regarded as having a dubious traffic state.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: November 6, 2012
    Assignee: National Taiwan University of Science and Technology
    Inventors: Hahn-Ming Lee, Ching-Hao Mao, Yu-Jie Chen, Yi-Hsun Wang, Jerome Yeh, Tsu-Han Chen
  • Patent number: 8249877
    Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.
    Type: Grant
    Filed: September 24, 2010
    Date of Patent: August 21, 2012
    Assignee: Multimodal Technologies, LLC
    Inventor: Detlef Koll
  • Patent number: 8229744
    Abstract: A method, system, and computer program for class detection and time mediated averaging of class dependent models. A technique is described to take advantage of gender information in training data and how to obtain female, male, and gender independent models from this information. By using a probability value to average male and female Gaussian Mixture Models (GMMs), dramatic deterioration in cross gender decoding performance is avoided.
    Type: Grant
    Filed: August 26, 2003
    Date of Patent: July 24, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Satyanarayana Dharanipragada, Peder A. Olsen
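
    Averaging two class-dependent GMMs with a probability value can be written as a weighted sum of their likelihoods. The sketch below uses one-dimensional GMMs for brevity. The update rule in `update_p_female` (a posterior blended into the running value at a fixed rate) is only one assumed reading of "time mediated" averaging, and all names and parameters are hypothetical.

```python
import math


def gmm_pdf(x, components):
    """Density of a 1-D GMM given (weight, mean, std_dev) components."""
    return sum(
        w * math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        for w, mu, sigma in components
    )


def averaged_likelihood(x, female, male, p_female):
    """Average the two class-dependent models using a class probability."""
    return p_female * gmm_pdf(x, female) + (1.0 - p_female) * gmm_pdf(x, male)


def update_p_female(p_female, x, female, male, rate=0.1):
    """Move the class probability toward the posterior for observation x."""
    f = p_female * gmm_pdf(x, female)
    m = (1.0 - p_female) * gmm_pdf(x, male)
    if f + m == 0.0:
        return p_female  # observation carries no information
    posterior = f / (f + m)
    return (1.0 - rate) * p_female + rate * posterior
```

    With `p_female` at 0.5 the result behaves like a gender-independent model, and as evidence accumulates the average shifts toward the matching class model instead of committing to a hard decision.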
  • Patent number: 8170873
    Abstract: An approach to comparing events in word spotting, such as comparing putative and reference instances of a keyword, makes use of a set of models of subword units. For each of two acoustic events and for each of a series of times in each of the events, a probability associated with each of the models of the set of subword units is computed. Then, a quantity characterizing a comparison of the two acoustic events, one occurring in each of the two acoustic signals, is computed using the computed probabilities associated with each of the models.
    Type: Grant
    Filed: July 22, 2004
    Date of Patent: May 1, 2012
    Assignee: Nexidia Inc.
    Inventor: Robert W. Morris
  • Patent number: 8050925
    Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
    Type: Grant
    Filed: November 5, 2009
    Date of Patent: November 1, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
  • Patent number: 8041567
    Abstract: Commercially available voice recognition systems are generally speaker-dependent, with the voice recognition system first being trained to the voice of the speaker before it can be used. A disadvantage with this method is that modified reference data has to be buffered and permanently saved in several steps when the speaker adaptation algorithm is executed, and thus requires a lot of memory space. This primarily negatively affects applications on devices with restricted processor power and limited memory space, such as mobile radio terminals for example. A method of speaker adaptation for a Hidden Markov Model based voice recognition system may address these issues. In the method, the memory space requirement and thus also the processor power required can be considerably reduced. This is achieved by using modified reference data in a speaker adaptation algorithm to adapt a new speaker to a reference speaker. The modified reference data is processed in compressed form.
    Type: Grant
    Filed: September 22, 2005
    Date of Patent: October 18, 2011
    Assignee: Siemens Aktiengesellschaft
    Inventors: Sergey Astrov, Josef Bauer
  • Patent number: 7912717
    Abstract: The invention uses the ModelGrower program to generate possible candidates from an original or aggregated model. An isomorphic reduction program operates on the candidates to identify and exclude isomorphic models. A Markov model evaluation and optimization program operates on the remaining non-isomorphic candidates. The candidates are optimized and the ones that most closely conform to the data are kept. The best optimized candidate of one stage becomes the starting candidate for the next stage where ModelGrower and the other programs operate on the optimized candidate to generate a new optimized candidate. The invention repeats the steps of growing, excluding isomorphs, evaluating and optimizing until such repetitions yield no significantly better results.
    Type: Grant
    Filed: November 18, 2005
    Date of Patent: March 22, 2011
    Inventor: Albert Galick
  • Patent number: 7881935
    Abstract: A speech recognition apparatus in which the accuracy in speech recognition is improved as the resource is prevented from increasing. Such a word which is probable as the result of the speech recognition is selected on the basis of an acoustic score and a linguistic score, while word selection is also performed on the basis of a measure different from the acoustic score, such as the number of phonemes being small, a part of speech being a pre-set one, inclusion in the past results of speech recognition or the linguistic score being not less than a pre-set value. The words so selected are subjected to matching processing.
    Type: Grant
    Filed: February 16, 2001
    Date of Patent: February 1, 2011
    Assignee: Sony Corporation
    Inventors: Yasuharu Asano, Katsuki Minamino, Hiroaki Ogawa, Helmut Lucke
  • Patent number: 7856356
    Abstract: A speech recognition system for a mobile terminal includes an acoustic variation channel unit and a pronunciation variation channel unit. The acoustic variation channel unit transforms a speech signal into feature parameters and Viterbi-decodes the speech signal to produce a varied phoneme sequence by using the feature parameters and predetermined models. Further, the pronunciation variation channel unit Viterbi-decodes the varied phoneme sequence to produce a word phoneme sequence by using the varied phoneme sequence and a preset DHMM (Discrete Hidden Markov Model) based context-dependent error model.
    Type: Grant
    Filed: December 20, 2006
    Date of Patent: December 21, 2010
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hoon Chung, Yunkeun Lee
  • Patent number: 7818172
    Abstract: The method of recognizing speech in an acoustic signal comprises developing acoustic stochastic models of voice units in the form of a set of states of an acoustic signal and using the acoustic models for recognition by a comparison of the signal with predetermined acoustic models obtained via a prior learning process. While developing the acoustic models, the voice units are modeled by means of a first portion of the states independent of adjacent voice units and by means of a second portion of the states dependent on adjacent voice units. The second portion of states dependent on adjacent voice units shares common parameters with a plurality of units sharing same phonemes.
    Type: Grant
    Filed: April 20, 2004
    Date of Patent: October 19, 2010
    Assignee: France Telecom
    Inventors: Ronaldo Messina, Denis Jouvet
  • Patent number: 7805308
    Abstract: A novel system for speech recognition uses differential cepstra over time frames as acoustic features, together with the traditional static cepstral features, for hidden trajectory modeling, and provides greater accuracy and performance in automatic speech recognition. According to one illustrative embodiment, an automatic speech recognition method includes receiving a speech input, generating an interpretation of the speech, and providing an output based at least in part on the interpretation of the speech input. The interpretation of the speech uses hidden trajectory modeling with observation vectors that are based on cepstra and on differential cepstra derived from the cepstra. A method is developed that can automatically train the hidden trajectory model's parameters that are corresponding to the components of the differential cepstra in the full acoustic feature vectors.
    Type: Grant
    Filed: January 19, 2007
    Date of Patent: September 28, 2010
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Dong Yu
  • Patent number: 7711561
    Abstract: The present invention relates to speech recognition systems, particularly speech-to-text systems and software and decoders for the same.
    Type: Grant
    Filed: April 15, 2004
    Date of Patent: May 4, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Wide Hogenhout, Kean Kheong Chin
  • Patent number: 7698136
    Abstract: The present invention is directed to a computer implemented method and apparatus for flexibly recognizing meaningful data items within an arbitrary user utterance. According to one example embodiment of the invention, a set of one or more key phrases and a set of one or more filler phrases are defined, probabilities are assigned to the key phrases and/or the filler phrases, and the user utterance is evaluated against the set of key phrases and the set of filler phrases using the probabilities.
    Type: Grant
    Filed: January 28, 2003
    Date of Patent: April 13, 2010
    Assignee: Voxify, Inc.
    Inventors: Patrick T. M. Nguyen, Adeeb W. M. Shana'a, Amit V. Desai
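The key-phrase and filler-phrase evaluation in this abstract can be sketched as a search for the most probable segmentation of an utterance. The dynamic program below is one plausible reading, not the patented method; the function name `spot_key_phrases`, the example phrases, and the probabilities are all hypothetical.

```python
import math
from typing import Dict, List, Optional


def spot_key_phrases(
    words: List[str],
    key_probs: Dict[str, float],
    filler_probs: Dict[str, float],
) -> Optional[List[str]]:
    """Return the key phrases in the most probable key/filler segmentation.

    Returns None when the utterance cannot be covered by the given phrases.
    """
    n = len(words)
    phrases = [(p.split(), math.log(pr), True) for p, pr in key_probs.items()]
    phrases += [(p.split(), math.log(pr), False) for p, pr in filler_probs.items()]

    # best[i] = (log-probability of the best cover of words[:i], backpointer)
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for i in range(n):
        score_i = best[i][0]
        if score_i == -math.inf:
            continue  # position i is unreachable
        for toks, logp, is_key in phrases:
            j = i + len(toks)
            if j <= n and words[i:j] == toks and score_i + logp > best[j][0]:
                best[j] = (score_i + logp, (i, " ".join(toks), is_key))

    if best[n][0] == -math.inf:
        return None

    # Walk the backpointers and keep only the key phrases.
    keys: List[str] = []
    i = n
    while i > 0:
        prev, phrase, is_key = best[i][1]
        if is_key:
            keys.append(phrase)
        i = prev
    return list(reversed(keys))


key_probs = {"check balance": 0.6, "transfer funds": 0.4}
filler_probs = {"i": 0.2, "want": 0.2, "to": 0.2, "please": 0.2, "my": 0.2}
found = spot_key_phrases("i want to check balance please".split(), key_probs, filler_probs)
```

Here the filler phrases absorb the words around the request, so only the key phrase is reported.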
  • Patent number: 7624015
    Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
    Type: Grant
    Filed: March 2, 2006
    Date of Patent: November 24, 2009
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
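The last two stages in this abstract, numeric understanding followed by string validation, can be sketched as follows. The rule set (digit words, "oh" for zero, a "double" rule), the function names, and the contents of the validation set are assumptions for illustration; the abstract does not list the actual rule classes.

```python
from typing import List, Set

# Assumed mapping from recognized words to digits.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}


def words_to_digits(words: List[str]) -> str:
    """Convert a recognized word string into a digit sequence."""
    digits: List[str] = []
    i = 0
    while i < len(words):
        w = words[i]
        if w == "double" and i + 1 < len(words) and words[i + 1] in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[words[i + 1]] * 2)  # "double oh" -> "00"
            i += 2
        elif w in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[w])
            i += 1
        else:
            i += 1  # skip words that carry no numeric meaning
    return "".join(digits)


def is_valid(digits: str, valid_sequences: Set[str]) -> bool:
    """Check a digit sequence against the stored valid sequences."""
    return digits in valid_sequences


recognized = "my number is five five five double oh one two".split()
digits = words_to_digits(recognized)
valid = is_valid(digits, {"5550012", "5550199"})
```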
  • Patent number: 7617104
    Abstract: A method of speech recognition is provided that determines a production-related value, vocal-tract resonance frequencies in particular, for a state at a particular frame based on the production-related values associated with two preceding frames using a recursion. The production-related value is used to determine a probability distribution of the observed feature vector for the state. A probability for an observed value received for the frame is then determined from the probability distribution. Under one embodiment, the production-related value is determined using a noise-free recursive definition for the value. Use of the recursion substantially improves the decoding speed. When the decoding algorithm is applied to training data with known phonetic transcripts, forced alignment is created which improves the phone segmentation obtained from the prior art.
    Type: Grant
    Filed: January 21, 2003
    Date of Patent: November 10, 2009
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Jian-lai Zhou, Frank Torsten Bernd Seide
  • Patent number: 7574358
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: February 28, 2005
    Date of Patent: August 11, 2009
    Assignee: International Business Machines Corporation
    Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 7529671
    Abstract: A pattern recognition system and method are provided. Aspects of the invention are particularly useful in combination with multi-state Hidden Markov Models. Pattern recognition is effected by processing Hidden Markov Model Blocks. This block-processing allows the processor to perform more operations upon data while such data is in cache memory. By so increasing cache locality, aspects of the invention provide significantly improved pattern recognition speed.
    Type: Grant
    Filed: March 4, 2003
    Date of Patent: May 5, 2009
    Assignee: Microsoft Corporation
    Inventors: William H. Rockenbeck, Julian J. Odell
  • Publication number: 20090055183
    Abstract: A method for sequence tagging medical patient records includes providing a labeled corpus of sentences taken from a set of medical records, initializing generative parameters $\theta$ and discriminative parameters $\tilde{\theta}$, providing a functional $LL - C \times \mathrm{Penalty}$, where $LL$ is a log-likelihood function $LL = \log p(\theta, \tilde{\theta}) + \sum_{l=1}^{M} [\log p(X_l, Y_l \mid \tilde{\theta}) - \log p(X_l \mid \tilde{\theta})] + \sum_{l=1}^{M} \log p(X_l \mid \theta)$, $\mathrm{Penalty} = \sum_{y \in V_Y} (em_y^2 + tr_y^2 + e\tilde{m}_y^2 + t\tilde{r}_y^2)$, where $em_y = 1 - \sum_{x_i \in V_X} p(x_i \mid y)$, $e\tilde{m}_y = 1 - \sum_{x_i \in V_X} \tilde{p}(x_i \mid y)$ are emission probability constraints, $tr_y = 1 - \sum_{y_i \in V_Y} p(y_i \mid y)$, $t\tilde{r}_y = 1 - \sum_{y_i \in V_Y} \tilde{p}(y_i \mid y)$ are transition probability constraints, and extracting gradients of $LL - C \times \mathrm{Penalty}$ with respect to the transition and emission probabilities and solving ?*k,{tilde o
    Type: Application
    Filed: August 21, 2008
    Publication date: February 26, 2009
    Applicant: Siemens Medical Solutions USA, Inc.
    Inventors: Oksana Yakhnenko, Romer E. Rosales, Radu Stefan Niculescu, Lucian Vlad Lita
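One reading of the penalty term in this abstract is that it sums, over every label y, the squared amount by which each emission and transition distribution fails to sum to 1, for both the generative and the discriminative parameter sets. The sketch below computes that quantity under this reading; the nested-dictionary layout `table[y][x]`, the function names, and the toy numbers are assumptions for illustration.

```python
from typing import Dict

Table = Dict[str, Dict[str, float]]  # table[y][x] = p(x | y)


def normalization_gaps(table: Table) -> Dict[str, float]:
    """Return {y: 1 - sum_x p(x | y)} for a conditional probability table."""
    return {y: 1.0 - sum(row.values()) for y, row in table.items()}


def penalty(emit: Table, trans: Table, emit_tilde: Table, trans_tilde: Table) -> float:
    """Sum of squared normalization gaps over all four tables."""
    total = 0.0
    for table in (emit, trans, emit_tilde, trans_tilde):
        total += sum(gap ** 2 for gap in normalization_gaps(table).values())
    return total


# Toy tables: label "B" has emissions that sum to 0.75 instead of 1.
emit = {"A": {"x": 0.5, "y": 0.5}, "B": {"x": 0.5, "y": 0.25}}
trans = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}

# Using the same tables for both parameter sets gives 2 * 0.25**2.
value = penalty(emit, trans, emit, trans)
```

The penalty is zero exactly when every row is a proper distribution, which is why it can stand in for explicit normalization constraints during gradient-based optimization.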
  • Patent number: 7437288
    Abstract: A speech recognition apparatus using a probability model that employs a mixed distribution, the apparatus formed by a standard pattern storage means for storing a standard pattern; a recognition means for outputting recognition results corresponding to an input speech by using the standard pattern; a standard pattern generating means for inputting learning speech and generating the standard pattern; and a standard pattern adjustment means, provided between the standard pattern generating means and the standard pattern storage means, for adjusting the number of element distributions of the mixed distribution of the standard pattern.
    Type: Grant
    Filed: March 11, 2002
    Date of Patent: October 14, 2008
    Assignee: NEC Corporation
    Inventor: Koichi Shinoda
  • Patent number: 7089185
    Abstract: An arrangement is provided for an embedded coupled hidden Markov model. To train an embedded coupled hidden Markov model, training data is first segmented into uniform segments at different layers of the embedded coupled hidden Markov model. At each layer, a uniform segment corresponds to a state of a coupled hidden Markov model at that layer. An optimal segmentation is generated at the lower layer based on the uniform segmentation and is then used to update parameters of models associated with the states of coupled hidden Markov models at the lower layer. The updated model parameters at the lower layer are then used to update the model parameters associated with states at the super layer.
    Type: Grant
    Filed: June 27, 2002
    Date of Patent: August 8, 2006
    Assignee: Intel Corporation
    Inventor: Ara V Nefian
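The initial uniform segmentation step in this abstract can be sketched in a few lines: the data is first split evenly among the super-layer states, and each super-layer segment is then split evenly among the lower-layer states. The function names, the integer-division boundary rule, and the state counts are assumptions for illustration; the later optimal re-segmentation and parameter updates are not shown.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open index range [start, end)


def uniform_segments(start: int, end: int, num_states: int) -> List[Segment]:
    """Split [start, end) into num_states contiguous, near-equal segments."""
    length = end - start
    bounds = [start + k * length // num_states for k in range(num_states + 1)]
    return [(bounds[k], bounds[k + 1]) for k in range(num_states)]


def two_layer_segmentation(
    length: int, super_states: int, lower_states: int
) -> List[List[Segment]]:
    """Uniformly segment at the super layer, then within each super segment."""
    return [
        uniform_segments(s, e, lower_states)
        for s, e in uniform_segments(0, length, super_states)
    ]


# 12 observations, 2 super-layer states, 3 lower-layer states per super state.
layers = two_layer_segmentation(12, super_states=2, lower_states=3)
```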