Hidden Markov (hm) Network (epo) Patents (Class 704/256.5)
-
Patent number: 11712198
Abstract: The present invention relates to a system and method for determining sleep quality parameters according to audio analyses, comprising: obtaining an audio recorded signal comprising sleep sounds of a subject; segmenting the signal into epochs; generating a feature vector for each epoch, wherein each of said feature vectors comprises one or more feature parameters that are associated with a particular characteristic of the signal and that are calculated according to the epoch signal or according to a signal generated from the epoch signal; inputting the generated feature vectors into a machine learning classifier and applying a preformed classifying model on the feature vectors that outputs a probabilities vector for each epoch, wherein each of the probabilities vectors comprises the probabilities of the epoch being each of the sleep quality parameters; inputting the probabilities vectors for each epoch into a machine learning time series model and applying a preformed sleep quality time series pattern function
Type: Grant
Filed: July 11, 2017
Date of Patent: August 1, 2023
Assignees: B.G. NEGEV TECHNOLOGIES AND APPLICATIONS LTD., AT BEN-GURION UNIVERSITY, MOR RESEARCH APPLICATIONS LTD.
Inventors: Eliran Dafna, Yaniv Zigel, Dvir Ben Or, Matan Halevi, Ariel Tarasiuk
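The first two steps of the abstract above (segmenting the signal into epochs and generating a feature vector per epoch) can be illustrated with a minimal sketch. The function names, the epoch length, and the two features chosen here (RMS energy and zero-crossing rate) are assumptions made for illustration; the abstract does not say which features the patent uses.

```python
import math

def segment_into_epochs(signal, epoch_len):
    """Split a list of samples into consecutive, non-overlapping epochs.

    Samples left over after the last full epoch are dropped.
    """
    n = len(signal) // epoch_len
    return [signal[i * epoch_len:(i + 1) * epoch_len] for i in range(n)]

def feature_vector(epoch):
    """Return two illustrative features: RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in epoch) / len(epoch))
    crossings = sum(1 for a, b in zip(epoch, epoch[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(epoch) - 1)
    return [rms, zcr]

# Toy signal: one oscillating epoch, one flat epoch, and one leftover sample.
signal = [1.0, -1.0, 1.0, -1.0, 0.5, 0.5, 0.5, 0.5, 9.0]
epochs = segment_into_epochs(signal, 4)
features = [feature_vector(e) for e in epochs]
```

In a full pipeline, each feature vector would then be passed to a classifier that outputs a probabilities vector per epoch; that stage is not sketched here.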
-
Patent number: 11593571
Abstract: A machine translation method includes: receiving to-be-processed information expressed in a source language; encoding the to-be-processed information, and generating an expression vector sequence of the to-be-processed information; and predicting feature information of a target foresight word at a first moment by using a prediction model. The feature information includes at least one of a part of speech or a word category of the target foresight word. The method also includes: determining a context vector corresponding to the first moment in the expression vector sequence according to the feature information of the target foresight word; and decoding the context vector by using a decoder, to obtain target content that corresponds to the context vector and expressed in a target language.
Type: Grant
Filed: June 8, 2020
Date of Patent: February 28, 2023
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Lemao Liu, Xintong Li, Shuming Shi
-
Patent number: 11580959
Abstract: An approach to correcting transcriptions of speech recognition models may be provided. A list of similar sounding phonemes associated with the phonemes of high frequency terms may be generated for a particular node associated with a virtual assistant. An utterance may be transcribed and receive a confidence score regarding the correctness of the transcription based on audio metrics and other factors. The phonemes of the utterance can be compared to the phonemes of the high frequency terms from the list and a score for the matching phonemes and similar sounding phonemes can be determined. If it is determined the sounds similar score for a term from the high frequency term list is above a threshold, the transcription can be replaced with the term, providing a corrected transcription.
Type: Grant
Filed: September 28, 2020
Date of Patent: February 14, 2023
Assignee: International Business Machines Corporation
Inventors: Andrew R. Freed, Marco Noel, Aishwarya Hariharan, Martha Holloman, Mohammad Gorji-Sefidmazgi, Daniel Zyska
-
Patent number: 11545149
Abstract: Disclosed are an electronic device capable of efficiently performing speech recognition and natural language understanding and a method for controlling thereof. The electronic device includes: a microphone; a non-volatile memory configured to store virtual assistant model data comprising data that is classified according to a plurality of domains and data that is commonly used for the plurality of domains; a volatile memory; and a processor configured to: based on receiving, through the microphone, a trigger input to perform speech recognition for a user speech, initiate loading the virtual assistant model data from the non-volatile memory into the volatile memory, load, into the volatile memory, first data from among the data classified according to the plurality of domains and, while loading the first data into the volatile memory, load at least a part of the data commonly used for the plurality of domains into the volatile memory.
Type: Grant
Filed: June 10, 2020
Date of Patent: January 3, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Saebom Jang, Hyeonmok Ko, Kyenghun Lee, Kunal Sharma, Raghavendra Hanumantasetty Ramasetty
-
Patent number: 11288039
Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD foregoes selection of the cloud-based VAS when the local voice assistant is selected.
Type: Grant
Filed: March 30, 2020
Date of Patent: March 29, 2022
Assignee: Sonos, Inc.
Inventors: Dayn Wilberding, John Tolomei
-
Patent number: 11175888
Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD foregoes selection of the cloud-based VAS when the local voice assistant is selected.
Type: Grant
Filed: January 25, 2021
Date of Patent: November 16, 2021
Assignee: Sonos, Inc.
Inventors: Dayn Wilberding, John Tolomei
-
Patent number: 11127394
Abstract: Techniques related to keyphrase detection for applications such as wake on voice are disclosed herein. Such techniques may have high accuracy by using scores of phone positions in triphones to select which triphones to use with a rejection model, using context-related phones for the rejection model, adding silence before keyphrase sounds for a keyphrase model, or any combination of these.
Type: Grant
Filed: March 29, 2019
Date of Patent: September 21, 2021
Assignee: Intel Corporation
Inventors: Sebastian Czyryba, Tobias Bocklet, Kuba Lopatka
-
Patent number: 11100919
Abstract: There is provided an information processing device including an analysis unit configured to analyze a character string indicating contents of utterance obtained as a result of speech recognition, and a display control unit configured to display the character string indicating the contents of the utterance and an analysis result on a display screen.
Type: Grant
Filed: July 26, 2019
Date of Patent: August 24, 2021
Assignee: SATURN LICENSING LLC
Inventors: Tomoaki Takemura, Shinya Masunaga, Koji Fujita, Katsutoshi Ishiwata, Kenichi Ikenaga, Katsutoshi Kusumoto
-
Patent number: 10606555
Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD foregoes selection of the cloud-based VAS when the local voice assistant is selected.
Type: Grant
Filed: November 4, 2019
Date of Patent: March 31, 2020
Assignee: Sonos, Inc.
Inventors: Dayn Wilberding, John Tolomei
-
Patent number: 10466962
Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, a media playback system is configured to (i) capture a voice input via at least one microphone device, (ii) detect inclusion of one or more of the commands within the voice input, (iii) determine that the one or more commands meets corresponding command criteria associated with the one or more commands within the set of command information, and (iv) in response to the determination, select a first voice assistant service (VAS) and (a) forego selection of a second VAS, (b) send the voice input to the first VAS, and (c) after sending the voice input, receive a response to the voice input from the first VAS.
Type: Grant
Filed: September 29, 2017
Date of Patent: November 5, 2019
Assignee: Sonos, Inc.
Inventors: Dayn Wilberding, John Tolomei
-
Patent number: 10424291
Abstract: There is provided an information processing device including an analysis unit configured to analyze a character string indicating contents of utterance obtained as a result of speech recognition, and a display control unit configured to display the character string indicating the contents of the utterance and an analysis result on a display screen.
Type: Grant
Filed: November 20, 2013
Date of Patent: September 24, 2019
Assignee: Saturn Licensing LLC
Inventors: Tomoaki Takemura, Shinya Masunaga, Koji Fujita, Katsutoshi Ishiwata, Kenichi Ikenaga, Katsutoshi Kusumoto
-
Patent number: 10204624
Abstract: One embodiment provides a method, comprising: receiving, at an information handling device, user input comprising a word that activates a digital assistant associated with the information handling device; determining, using a processor, whether the word comprises a false positive wake word, wherein the false positive wake word comprises a stored wake word; and responsive to determining that the word is associated with a false positive wake word, providing output associated with an alternative wake word. Other aspects are described and claimed.
Type: Grant
Filed: August 14, 2017
Date of Patent: February 12, 2019
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: Ryan Charles Knudson, Russell Speight VanBlon, Roderick Echols, Jonathan Gaither Knox
-
Patent number: 9183196
Abstract: A language processing system uses annotation services that are external to the language processing system to identify n-grams that identify entities in an input sentence. The n-grams are annotated by the annotation services. The annotations are used to determine which n-grams, if any, correspond to instances of an entity type (e.g., values for a variable or terminals for a non-terminal). After determining which n-grams correspond to entity types, parse initializations are generated for parsing rules and parses for each rule are attempted. The rules that successfully parse are used to determine whether the input sentence invokes a specific action, and if so, what arguments are to be passed to the invocation of the action.
Type: Grant
Filed: June 25, 2013
Date of Patent: November 10, 2015
Assignee: Google Inc.
Inventors: Jakob D. Uszkoreit, Percy Liang, Daniel M. Bikel
-
Patent number: 9020820
Abstract: A state detecting apparatus includes: a processor to execute acquiring utterance data related to uttered speech, computing a plurality of statistical quantities for feature parameters regarding features of the utterance data, creating, on the basis of the plurality of statistical quantities regarding the utterance data and another plurality of statistical quantities regarding reference utterance data based on other uttered speech, pseudo-utterance data having at least one statistical quantity equal to a statistical quantity in the other plurality of statistical quantities, computing a plurality of statistical quantities for synthetic utterance data synthesized on the basis of the pseudo-utterance data and the utterance data, and determining, on the basis of a comparison between statistical quantities of the synthetic utterance data and statistical quantities of the reference utterance data, whether the speaker who produced the uttered speech is in a first state or a second state; and a memory.
Type: Grant
Filed: April 13, 2012
Date of Patent: April 28, 2015
Assignee: Fujitsu Limited
Inventors: Shoji Hayakawa, Naoshi Matsuo
-
Patent number: 8949127
Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
Type: Grant
Filed: February 17, 2014
Date of Patent: February 3, 2015
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
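The conversion and validation stages described in the abstract above might look roughly like the sketch below. The word table, the "double"/"triple" rules, and the function names are hypothetical stand-ins; the abstract does not list the patent's actual rule classes.

```python
# Hypothetical rule table mapping recognized words to digits.
WORD_TO_DIGIT = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}

def words_to_digits(words):
    """Convert a recognized word string into a digit sequence.

    Words outside the numeric vocabulary are treated as filler and skipped.
    "double" and "triple" repeat the next digit word (an assumed rule).
    """
    digits = []
    repeat = 1
    for word in words:
        word = word.lower()
        if word == "double":
            repeat = 2
        elif word == "triple":
            repeat = 3
        elif word in WORD_TO_DIGIT:
            digits.append(WORD_TO_DIGIT[word] * repeat)
            repeat = 1
    return "".join(digits)

def is_valid(digits, valid_sequences):
    """Compare the digit sequence with a set of known valid sequences."""
    return digits in valid_sequences

recognized = "my number is four double five oh one".split()
digit_string = words_to_digits(recognized)
valid = is_valid(digit_string, {"45501", "12345"})
```

Here the validation database is reduced to an in-memory set; a real system would presumably query a larger store.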
-
Publication number: 20140337031
Abstract: A method of detecting a target keyword for activating a function in an electronic device is disclosed. The method includes receiving an input sound starting from one of the plurality of portions of the target keyword. The input sound may be periodically received based on a duty cycle. The method extracts a plurality of sound features from the input sound, and obtains state information on a plurality of states associated with the portions of the target keyword. Based on the extracted sound features and the state information, the input sound may be detected as the target keyword. The plurality of states includes a predetermined number of entry states indicative of a predetermined number of the plurality of portions.
Type: Application
Filed: November 22, 2013
Publication date: November 13, 2014
Applicant: QUALCOMM Incorporated
Inventors: Sungwoong Kim, Minsub Lee, Taesu Kim, Minho Jin, Kyu Woong Hwang
-
Patent number: 8700403
Abstract: A method of statistical modeling is provided which includes constructing a statistical model and incorporating Gaussian priors during feature selection and during parameter optimization for the construction of the statistical model.
Type: Grant
Filed: November 3, 2005
Date of Patent: April 15, 2014
Assignee: Robert Bosch GmbH
Inventors: Fuliang Weng, Lin Zhao
-
Patent number: 8694317
Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
Type: Grant
Filed: February 6, 2006
Date of Patent: April 8, 2014
Assignee: Aurix Limited
Inventors: Adrian I Skilling, Howard A K Wright
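To make the idea of a dynamic programming search over per-frame phone scores concrete, here is a minimal sketch. It assumes the index is a list of dictionaries (one per frame) mapping each reference phone to a distance score where lower is better, and that each phone of the search term covers at least one frame. The recurrence and the name `search_term` are illustrative assumptions, not the search method claimed in the patent.

```python
import math

def search_term(frame_scores, term):
    """Find the cheapest alignment of a phone sequence to any span of frames.

    frame_scores: list of dicts, one per frame, mapping phone -> distance.
    term: list of phones making up the search term.
    Returns (best_cost, end_frame) for the best-matching span.
    """
    prev_row = None
    for i, phone in enumerate(term):
        row = []
        for t, scores in enumerate(frame_scores):
            stay = row[t - 1] if t > 0 else math.inf  # remain in this phone
            if i == 0:
                # The first phone may start at any frame with zero prior cost.
                best_prev = min(0.0, stay)
            else:
                # Otherwise enter from the previous phone at the previous frame.
                enter = prev_row[t - 1] if t > 0 else math.inf
                best_prev = min(stay, enter)
            row.append(scores[phone] + best_prev)
        prev_row = row
    end = min(range(len(prev_row)), key=prev_row.__getitem__)
    return prev_row[end], end

# Toy index: phone "a" matches frame 1 well, phone "b" matches frame 2 well.
index = [
    {"a": 5.0, "b": 5.0},
    {"a": 1.0, "b": 5.0},
    {"a": 5.0, "b": 1.0},
    {"a": 5.0, "b": 5.0},
]
cost, end_frame = search_term(index, ["a", "b"])
```

Because every reference phone keeps a score in every frame, any phone sequence can be searched without re-processing the audio, which is the property the abstract emphasizes.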
-
Patent number: 8676580
Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
Type: Grant
Filed: August 16, 2011
Date of Patent: March 18, 2014
Assignee: International Business Machines Corporation
Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
-
Patent number: 8655658
Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
Type: Grant
Filed: October 25, 2011
Date of Patent: February 18, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
-
Patent number: 8612227
Abstract: The present invention provides a method and equipment of pattern recognition capable of efficiently pruning partial hypotheses without lowering recognition accuracy, its pattern recognition program, and its recording medium. In a second search unit, a likelihood calculation unit calculates an acoustic likelihood by matching time series data of acoustic feature parameters against a lexical tree stored in a second database and an acoustic model stored in a third database to determine an accumulated likelihood by accumulating the acoustic likelihood in a time direction. A self-transition unit causes each partial hypothesis to make a self-transition in a search process. An LR transition unit causes each partial hypothesis to make an LR transition. A reward attachment unit adds a reward R(x) in accordance with the number of reachable words to each partial hypothesis to raise the accumulated likelihood. A pruning unit excludes partial hypotheses with less likelihood from search targets.
Type: Grant
Filed: July 22, 2010
Date of Patent: December 17, 2013
Assignee: KDDI Corporation
Inventor: Tsuneo Kato
-
Patent number: 8462193
Abstract: A multimedia conferencing system includes a loud speaker system, one or more microphones for receiving a local audio signal and a remote audio signal, a state machine and an echo canceller that operates in conjunction with two reference signals to remove substantially all of a feedback signal component in the local audio signal that results from reinforcing and playing the local audio signal over the loud speaker system. The state machine operates to detect that only the local audio is active, and if so controls the operation of the echo canceller such that only the feedback component of the local audio signal is removed and the local audio signal is not suppressed.
Type: Grant
Filed: January 8, 2010
Date of Patent: June 11, 2013
Assignee: Polycom, Inc.
Inventors: Kwan Truong, Peter Chu, Michael Pocino, John Allen
-
Patent number: 8307459
Abstract: A botnet detection system is provided. A bursty feature extractor receives an Internet Relay Chat (IRC) packet value from a detection object network, and determines a bursty feature accordingly. A Hybrid Hidden Markov Model (HHMM) parameter estimator determines probability parameters for a Hybrid Hidden Markov Model according to the bursty feature. A traffic profile generator establishes a probability sequential model for the Hybrid Hidden Markov Model according to the probability parameters and pre-defined network traffic categories. A dubious state detector determines a traffic state corresponding to a network relaying the IRC packet in response to reception of a new IRC packet, determines whether the IRC packet flow of the object network is dubious by applying the bursty feature to the probability sequential model for the Hybrid Hidden Markov Model, and generates a warning signal when the IRC packet flow is regarded as having a dubious traffic state.
Type: Grant
Filed: March 17, 2010
Date of Patent: November 6, 2012
Assignee: National Taiwan University of Science and Technology
Inventors: Hahn-Ming Lee, Ching-Hao Mao, Yu-Jie Chen, Yi-Hsun Wang, Jerome Yeh, Tsu-Han Chen
-
Patent number: 8249877
Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.
Type: Grant
Filed: September 24, 2010
Date of Patent: August 21, 2012
Assignee: Multimodal Technologies, LLC
Inventor: Detlef Koll
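One simple way an arbitration step between two recognizers could work is sketched below. The confidence-based rule, the `Result` type, and the `arbitrate` function are assumptions for illustration only; the abstract does not specify how the arbitration engine combines the two results.

```python
from typing import NamedTuple, Optional

class Result(NamedTuple):
    """A recognition hypothesis with an engine-reported confidence in [0, 1]."""
    text: str
    confidence: float

def arbitrate(client: Optional[Result], server: Optional[Result]) -> Optional[Result]:
    """Pick an output from the client-side and server-side results.

    Assumed rule: if only one result is available, use it; otherwise prefer
    the higher-confidence result, keeping the client result on ties.
    """
    if client is None:
        return server
    if server is None:
        return client
    return server if server.confidence > client.confidence else client

client_result = Result("call home", 0.72)
server_result = Result("call Holm", 0.64)
chosen = arbitrate(client_result, server_result)
```

Falling back to the client result when the server result is missing reflects one plausible motivation for a hybrid design (for example, a slow or unavailable network), which is an assumption here rather than something stated in the abstract.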
-
Patent number: 8229744
Abstract: A method, system, and computer program for class detection and time mediated averaging of class dependent models. A technique is described to take advantage of gender information in training data and how to obtain female, male, and gender independent models from this information. By using a probability value to average male and female Gaussian Mixture Models (GMMs), dramatic deterioration in cross gender decoding performance is avoided.
Type: Grant
Filed: August 26, 2003
Date of Patent: July 24, 2012
Assignee: Nuance Communications, Inc.
Inventors: Satyanarayana Dharanipragada, Peder A. Olsen
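The averaging step mentioned in the abstract above can be illustrated with one-dimensional mixtures. This is a minimal sketch under stated assumptions: the mixture parameters are invented toy values, the class probability `p_female` is held fixed, and the time-mediated update of that probability described in the patent is not modeled.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gmm_pdf(x, components):
    """Density of a mixture given as a list of (weight, mean, variance)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

def averaged_likelihood(x, female_gmm, male_gmm, p_female):
    """Average the two class-dependent likelihoods with a class probability."""
    return p_female * gmm_pdf(x, female_gmm) + (1.0 - p_female) * gmm_pdf(x, male_gmm)

# Toy single-component "GMMs" over one feature (all values are assumptions).
female_gmm = [(1.0, 210.0, 900.0)]
male_gmm = [(1.0, 120.0, 900.0)]
x = 200.0
soft = averaged_likelihood(x, female_gmm, male_gmm, 0.5)
```

With `p_female` at 1.0 or 0.0 the function reduces to a purely gender-dependent model; intermediate values avoid committing fully to the wrong class.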
-
Patent number: 8170873
Abstract: An approach to comparing events in word spotting, such as comparing putative and reference instances of a keyword, makes use of a set of models of subword units. For each of two acoustic events and for each of a series of times in each of the events, a probability associated with each of the models of the set of subword units is computed. Then, a quantity characterizing a comparison of the two acoustic events, one occurring in each of the two acoustic signals, is computed using the computed probabilities associated with each of the models.
Type: Grant
Filed: July 22, 2004
Date of Patent: May 1, 2012
Assignee: Nexidia Inc.
Inventor: Robert W. Morris
-
Patent number: 8050925
Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
Type: Grant
Filed: November 5, 2009
Date of Patent: November 1, 2011
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
-
Patent number: 8041567
Abstract: Commercially available voice recognition systems are generally speaker-dependent, with the voice recognition system first being trained to the voice of the speaker before it can be used. A disadvantage with this method is that modified reference data has to be buffered and permanently saved in several steps when the speaker adaptation algorithm is executed, and thus requires a lot of memory space. This primarily negatively affects applications on devices with restricted processor power and limited memory space, such as mobile radio terminals for example. A method of speaker adaptation for a Hidden Markov Model based voice recognition system may address these issues. In the method, the memory space requirement and thus also the processor power required can be considerably reduced. This is achieved by using modified reference data in a speaker adaptation algorithm to adapt a new speaker to a reference speaker. The modified reference data is processed in compressed form.
Type: Grant
Filed: September 22, 2005
Date of Patent: October 18, 2011
Assignee: Siemens Aktiengesellschaft
Inventors: Sergey Astrov, Josef Bauer
-
Patent number: 7912717
Abstract: The invention uses the ModelGrower program to generate possible candidates from an original or aggregated model. An isomorphic reduction program operates on the candidates to identify and exclude isomorphic models. A Markov model evaluation and optimization program operates on the remaining non-isomorphic candidates. The candidates are optimized and the ones that most closely conform to the data are kept. The best optimized candidate of one stage becomes the starting candidate for the next stage where ModelGrower and the other programs operate on the optimized candidate to generate a new optimized candidate. The invention repeats the steps of growing, excluding isomorphs, evaluating and optimizing until such repetitions yield no significantly better results.
Type: Grant
Filed: November 18, 2005
Date of Patent: March 22, 2011
Inventor: Albert Galick
-
Patent number: 7881935
Abstract: A speech recognition apparatus in which the accuracy in speech recognition is improved as the resource is prevented from increasing. Such a word which is probable as the result of the speech recognition is selected on the basis of an acoustic score and a linguistic score, while word selection is also performed on the basis of a measure different from the acoustic score, such as the number of phonemes being small, a part of speech being a pre-set one, inclusion in the past results of speech recognition or the linguistic score being not less than a pre-set value. The words so selected are subjected to matching processing.
Type: Grant
Filed: February 16, 2001
Date of Patent: February 1, 2011
Assignee: Sony Corporation
Inventors: Yasuharu Asano, Katsuki Minamino, Hiroaki Ogawa, Helmut Lucke
-
Patent number: 7856356
Abstract: A speech recognition system for a mobile terminal includes an acoustic variation channel unit and a pronunciation variation channel unit. The acoustic variation channel unit transforms a speech signal into feature parameters and Viterbi-decodes the speech signal to produce a varied phoneme sequence by using the feature parameters and predetermined models. Further, the pronunciation variation channel unit Viterbi-decodes the varied phoneme sequence to produce a word phoneme sequence by using the varied phoneme sequence and a preset DHMM (Discrete Hidden Markov Model) based context-dependent error model.
Type: Grant
Filed: December 20, 2006
Date of Patent: December 21, 2010
Assignee: Electronics and Telecommunications Research Institute
Inventors: Hoon Chung, Yunkeun Lee
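Both channel units in the abstract above rely on Viterbi decoding. The following is a generic sketch of Viterbi decoding for a discrete HMM, not the patent's context-dependent error model; the state labels, observation symbols, and probability tables are toy values chosen for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for a discrete HMM."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            p, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (p * emit_p[s][o], arg)
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path

# Toy model: hidden states "A" and "B" emit observed symbols "x", "y", "z".
states = ("A", "B")
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {
    "A": {"x": 0.5, "y": 0.4, "z": 0.1},
    "B": {"x": 0.1, "y": 0.3, "z": 0.6},
}
decoded = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
```

In the setting of the abstract, the observations would be the varied phoneme sequence and the hidden states the intended word phonemes; a practical decoder would also work in log probabilities to avoid underflow on long sequences.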
-
Patent number: 7818172
Abstract: The method of recognizing speech in an acoustic signal comprises developing acoustic stochastic models of voice units in the form of a set of states of an acoustic signal and using the acoustic models for recognition by a comparison of the signal with predetermined acoustic models obtained via a prior learning process. While developing the acoustic models, the voice units are modeled by means of a first portion of the states independent of adjacent voice units and by means of a second portion of the states dependent on adjacent voice units. The second portion of states dependent on adjacent voice units shares common parameters with a plurality of units sharing same phonemes.
Type: Grant
Filed: April 20, 2004
Date of Patent: October 19, 2010
Assignee: France Telecom
Inventors: Ronaldo Messina, Denis Jouvet
-
Patent number: 7805308
Abstract: A novel system for speech recognition uses differential cepstra over time frames as acoustic features, together with the traditional static cepstral features, for hidden trajectory modeling, and provides greater accuracy and performance in automatic speech recognition. According to one illustrative embodiment, an automatic speech recognition method includes receiving a speech input, generating an interpretation of the speech, and providing an output based at least in part on the interpretation of the speech input. The interpretation of the speech uses hidden trajectory modeling with observation vectors that are based on cepstra and on differential cepstra derived from the cepstra. A method is developed that can automatically train the hidden trajectory model's parameters that are corresponding to the components of the differential cepstra in the full acoustic feature vectors.
Type: Grant
Filed: January 19, 2007
Date of Patent: September 28, 2010
Assignee: Microsoft Corporation
Inventors: Li Deng, Dong Yu
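As an illustration of observation vectors that combine static cepstra with differential cepstra, the sketch below appends a simple central difference to each frame. The difference formula, the edge handling, and the function name are assumptions; the abstract does not say how the patent derives its differential cepstra.

```python
def add_differential_cepstra(cepstra):
    """Append a first-order difference to each static cepstral frame.

    The difference at frame t is (c[t+1] - c[t-1]) / 2, with the first and
    last frames repeated at the edges.
    """
    n = len(cepstra)
    out = []
    for t, frame in enumerate(cepstra):
        prev_frame = cepstra[max(t - 1, 0)]
        next_frame = cepstra[min(t + 1, n - 1)]
        delta = [(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)]
        out.append(list(frame) + delta)
    return out

# Three frames of two static coefficients each (toy values).
static = [[1.0, 0.0], [3.0, 2.0], [5.0, 2.0]]
observations = add_differential_cepstra(static)
```

Each output vector holds the static coefficients followed by their differences, so its dimension is twice that of the input frame.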
-
Patent number: 7711561
Abstract: The present invention relates to speech recognition systems, particularly speech-to-text systems and software and decoders for the same.
Type: Grant
Filed: April 15, 2004
Date of Patent: May 4, 2010
Assignee: Kabushiki Kaisha Toshiba
Inventors: Wide Hogenhout, Kean Kheong Chin
-
Patent number: 7698136
Abstract: The present invention is directed to a computer implemented method and apparatus for flexibly recognizing meaningful data items within an arbitrary user utterance. According to one example embodiment of the invention, a set of one or more key phrases and a set of one or more filler phrases are defined, probabilities are assigned to the key phrases and/or the filler phrases, and the user utterance is evaluated against the set of key phrases and the set of filler phrases using the probabilities.
Type: Grant
Filed: January 28, 2003
Date of Patent: April 13, 2010
Assignee: Voxify, Inc.
Inventors: Patrick T. M. Nguyen, Adeeb W. M. Shana'a, Amit V. Desai
-
Patent number: 7624015
Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
Type: Grant
Filed: March 2, 2006
Date of Patent: November 24, 2009
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
-
Patent number: 7617104Abstract: A method of speech recognition is provided that determines a production-related value, vocal-tract resonance frequencies in particular, for a state at a particular frame based on the production-related values associated with two preceding frames using a recursion. The production-related value is used to determine a probability distribution of the observed feature vector for the state. A probability for an observed value received for the frame is then determined from the probability distribution. Under one embodiment, the production-related value is determined using a noise-free recursive definition for the value. Use of the recursion substantially improves the decoding speed. When the decoding algorithm is applied to training data with known phonetic transcripts, forced alignment is created which improves the phone segmentation obtained from the prior art.Type: GrantFiled: January 21, 2003Date of Patent: November 10, 2009Assignee: Microsoft CorporationInventors: Li Deng, Jian-Iai Zhou, Frank Torsten Bernd Seide
-
Patent number: 7574358Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.Type: GrantFiled: February 28, 2005Date of Patent: August 11, 2009Assignee: International Business Machines CorporationInventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
-
Patent number: 7529671Abstract: A pattern recognition system and method are provided. Aspects of the invention are particularly useful in combination with multi-state Hidden Markov Models. Pattern recognition is effected by processing Hidden Markov Model Blocks. This block-processing allows the processor to perform more operations upon data while such data is in cache memory. By so increasing cache locality, aspects of the invention provide significantly improved pattern recognition speed.Type: GrantFiled: March 4, 2003Date of Patent: May 5, 2009Assignee: Microsoft CorporationInventors: William H. Rockenbeck, Julian J. Odell
-
Publication number: 20090055183Abstract: A method for sequence tagging medical patient records includes providing a labeled corpus of sentences taken from a set of medical records, initializing generative parameters $\theta$ and discriminative parameters $\tilde{\theta}$, providing a functional $LL - C \times \text{Penalty}$, where $LL$ is a log-likelihood function $LL = \log p(\theta, \tilde{\theta}) + \sum_{l=1}^{M} \left[ \log p(X_l, Y_l \mid \tilde{\theta}) - \log p(X_l \mid \tilde{\theta}) \right] + \sum_{l=1}^{M} \log p(X_l \mid \theta)$, $\text{Penalty} = \sum_{y \in V_Y} \left( em_y^2 + tr_y^2 + e\tilde{m}_y^2 + t\tilde{r}_y^2 \right)$, where $em_y = 1 - \sum_{x_i \in V_X} p(x_i \mid y)$, $e\tilde{m}_y = 1 - \sum_{x_i \in V_X} \tilde{p}(x_i \mid y)$ are emission probability constraints, $tr_y = 1 - \sum_{y_i \in V_Y} p(y_i \mid y)$, $t\tilde{r}_y = 1 - \sum_{y_i \in V_Y} \tilde{p}(y_i \mid y)$ are transition probability constraints, and extracting gradients of $LL - C \times \text{Penalty}$ with respect to the transition and emission probabilities and solving $\theta^{*}_{k}$, …Type: ApplicationFiled: August 21, 2008Publication date: February 26, 2009Applicant: Siemens Medical Solutions USA, Inc.Inventors: Oksana Yakhnenko, Romer E. Rosales, Radu Stefan Niculescu, Lucian Vlad Lita
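The penalty term in the abstract above sums, over every label y, the squared amount by which each emission and transition distribution fails to sum to one, for both the generative and the discriminative parameter sets. A minimal sketch of that computation follows; the nested-dictionary layout and the function names are assumptions for illustration, and the sketch covers only the penalty, not the log-likelihood or the gradient step.

```python
def normalization_gap(table):
    """Return 1 - sum_i p(i | y) for each label y in a conditional table.

    `table` maps a label y to a dict of probabilities p(. | y)
    (hypothetical layout chosen for this example).
    """
    return {y: 1.0 - sum(probs.values()) for y, probs in table.items()}


def penalty(emission, transition, emission_tilde, transition_tilde):
    """Sum of squared normalization gaps over all labels and all four tables."""
    em = normalization_gap(emission)
    tr = normalization_gap(transition)
    em_t = normalization_gap(emission_tilde)
    tr_t = normalization_gap(transition_tilde)
    return sum(
        em[y] ** 2 + tr[y] ** 2 + em_t[y] ** 2 + tr_t[y] ** 2
        for y in emission
    )


# Two labels, two observation symbols; every row sums to one.
emission = {"A": {"x": 0.5, "y": 0.5}, "B": {"x": 0.25, "y": 0.75}}
transition = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}

# Same tables, except the emission row for label "A" sums to 0.75.
emission_bad = {"A": {"x": 0.5, "y": 0.25}, "B": {"x": 0.25, "y": 0.75}}

print(penalty(emission, transition, emission, transition))
print(penalty(emission_bad, transition, emission, transition))
```

With properly normalized tables the penalty is zero, so subtracting C times the penalty from the log-likelihood only affects parameter settings that drift away from valid probability distributions.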
-
Patent number: 7437288Abstract: A speech recognition apparatus using a probability model that employs a mixed distribution, the apparatus formed by a standard pattern storage means for storing a standard pattern; a recognition means for outputting recognition results corresponding to an input speech by using the standard pattern; a standard pattern generating means for inputting learning speech and generating the standard pattern; and a standard pattern adjustment means, provided between the standard pattern generating means and the standard pattern storage means, for adjusting the number of element distributions of the mixed distribution of the standard pattern.Type: GrantFiled: March 11, 2002Date of Patent: October 14, 2008Assignee: NEC CorporationInventor: Koichi Shinoda
-
Patent number: 7089185Abstract: An arrangement is provided for an embedded coupled hidden Markov model. To train an embedded coupled hidden Markov model, training data is first segmented into uniform segments at different layers of the embedded coupled hidden Markov model. At each layer, a uniform segment corresponds to a state of a coupled hidden Markov model at that layer. An optimal segmentation is generated at the lower layer based on the uniform segmentation and is then used to update parameters of models associated with the states of coupled hidden Markov models at the lower layer. The updated model parameters at the lower layer are then used to update the model parameters associated with states at the super layer.Type: GrantFiled: June 27, 2002Date of Patent: August 8, 2006Assignee: Intel CorporationInventor: Ara V Nefian
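The initial step in the abstract above, cutting training data into uniform segments with one segment per state, can be sketched for a single layer as follows. This is only an illustration under stated assumptions: the function name `uniform_segments` is hypothetical, the data is treated as a one-dimensional run of frames, and the later optimal re-segmentation and parameter updates are not shown.

```python
def uniform_segments(num_frames, num_states):
    """Split frame indices 0..num_frames-1 into contiguous, near-equal segments.

    Returns one (start, end) pair per state, with `end` exclusive.
    """
    if num_states <= 0 or num_frames < num_states:
        raise ValueError("need at least one frame per state")
    # Integer boundaries spread as evenly as possible across the frames.
    bounds = [(i * num_frames) // num_states for i in range(num_states + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_states)]


# Ten frames assigned to three states.
for state, (start, end) in enumerate(uniform_segments(10, 3)):
    print(state, list(range(start, end)))
```

Such an even split serves only as a starting assignment of frames to states, which a subsequent alignment pass would then refine.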