Viterbi Trellis Patents (Class 704/242)
  • Patent number: 10733977
    Abstract: Methods, systems, and computer readable media for automated transcription model adaptation include obtaining audio data from a plurality of audio files. The audio data is transcribed to produce at least one audio file transcription which represents a plurality of transcription alternatives for each audio file. Speech analytics are applied to each audio file transcription. A best transcription is selected from the plurality of transcription alternatives for each audio file. Statistics from the selected best transcription are calculated. An adapted model is created from the calculated statistics.
    Type: Grant
    Filed: March 29, 2017
    Date of Patent: August 4, 2020
    Assignee: Verint Systems Ltd.
    Inventors: Ran Achituv, Omer Ziv, Roni Romano, Ido Shapira, Daniel Baum
  • Patent number: 10665227
    Abstract: A voice recognition device extracts, from a first voice signal of a user, a first string of phonemes included in the first voice signal, extracts, from a second voice signal of the user, a second string of phonemes included in the second voice signal, extracts a string of common phonemes from the first string and the second string, calculates, for each of a plurality of registered keywords, a degree of similarity between a string of phonemes corresponding to the keyword and the string of common phonemes, and selects, among the plurality of keywords, a prescribed number of keywords based on the degree of similarity for each keyword.
    Type: Grant
    Filed: August 10, 2017
    Date of Patent: May 26, 2020
    Assignee: FUJITSU LIMITED
    Inventor: Shoji Hayakawa
  • Patent number: 10586537
    Abstract: Methods, computer program products, and systems are presented. The methods, computer program products, and systems can include, for instance: receiving, from a user, voice data defining a candidate directive invoking vocal utterance for invoking a directive to execute a first text based command to perform a first computer function of a computer system, wherein the candidate directive invoking vocal utterance includes at least one word or phrase of the text based command, wherein the computer system is configured to perform the first computer function in response to the first text based command and wherein the computer system is configured to perform a second computer function in response to a second text based command; determining, based on machine logic, whether a word or phrase of the candidate vocal utterance sounds confusingly similar to a speech rendering of a word or phrase defining the second text based command.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: March 10, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jeremy A. Greenberger, Nicholas R. Sandonato
  • Patent number: 9984685
    Abstract: A speech recognition system used for hands-free data entry receives and analyzes speech input to recognize and accept a user's response. Under certain conditions, a user's response might be expected. In these situations, the expected response may modify the behavior of the speech recognition system to improve performance. For example, if the hypothesis of a user's response matches the expected response then there is a high probability that the user's response was recognized correctly. This information may be used to make adjustments. An expected response may include expected response parts, each part containing expected words. By considering an expected response as the concatenation of expected response parts, each part may be considered independently for the purposes of adjusting an acceptance algorithm, adjusting a model, or recording an apparent error. In this way, the speech recognition system may make modifications based on a wide range of user responses.
    Type: Grant
    Filed: November 7, 2014
    Date of Patent: May 29, 2018
    Assignee: Hand Held Products, Inc.
    Inventors: Keith Braho, Jason M. Makay
  • Patent number: 9865254
    Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime portions of the FSTs may be decompressed for processing by an ASR engine.
    Type: Grant
    Filed: June 20, 2016
    Date of Patent: January 9, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow
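    Example (illustrative only, not taken from the patent): the abstract describes binning individual arc weights so that each weight needs fewer bits. A minimal sketch of that idea, assuming uniform bins between the smallest and largest weight; the function names `bin_weights` and `unbin_weights` and the choice of uniform spacing are assumptions made here for illustration.

```python
def bin_weights(weights, n_bits=4):
    """Map each float weight to a small integer code (uniform binning).

    Assumption: bins are evenly spaced between min and max weight.
    The patent abstract does not say how its bins are chosen.
    """
    lo, hi = min(weights), max(weights)
    n_bins = 1 << n_bits                      # 2**n_bits distinct codes
    step = (hi - lo) / (n_bins - 1) if hi > lo else 1.0
    codes = [round((w - lo) / step) for w in weights]
    return codes, lo, step


def unbin_weights(codes, lo, step):
    """Recover approximate weights from their bin codes."""
    return [lo + c * step for c in codes]


# Hypothetical arc weights, stored with 2 bits each instead of a full float.
weights = [0.0, 1.4, 3.0, 0.7]
codes, lo, step = bin_weights(weights, n_bits=2)
approx = unbin_weights(codes, lo, step)
```

    With uniform bins the reconstruction error of any weight is at most half a bin width, which is the trade-off against the smaller storage size.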
  • Patent number: 9787830
    Abstract: Systems, methods and apparatus for initiating communication. According to one implementation, a recording of a particular user speaking a name of a contact is obtained, a voice dialing command including an utterance of the name of the contact by the particular user is received, and in response to receiving the voice dialing command including the utterance of the name of the contact by the particular user, the recording of the particular user speaking the name of the contact or a text-to-speech audio output of the name is provided for output, and communication is initiated between the particular user and the contact.
    Type: Grant
    Filed: August 10, 2016
    Date of Patent: October 10, 2017
    Assignee: Google Inc.
    Inventors: Craig L. Reding, Suzi Levas
  • Patent number: 9684437
    Abstract: A system for memorization of content includes a memory storing the content. The memory includes a tangible computer readable medium with an instruction set, and at least one database. The database has information corresponding to the content. A processor communicating with the memory is configured to: execute the instruction set to present information corresponding to an original discrete portion of the content, and conduct an accuracy analysis of at least one version of the discrete portion audibly repeated by a user. An audio input device communicating with the processor receives the version from the user for use in the accuracy analysis by the processor. An input device inputs a user predetermined quantity of repetitions of the version presented to the user prior to the user progressing to a next discrete portion of the original discrete portion when a user predetermined accuracy threshold of the accuracy analysis is met.
    Type: Grant
    Filed: June 30, 2014
    Date of Patent: June 20, 2017
    Inventor: Michael L. Thornton, II
  • Patent number: 9607618
    Abstract: A method for adapting a speech recognition system for out-of-vocabulary, comprising, decoding by a hybrid speech recognition a speech including out-of-vocabulary terms, thereby generating graphemic transcriptions of the speech with a mixture of recognized in-vocabulary words and unrecognized sub-words, while keeping a track of the decoded segments of the speech, determining in the transcription sequences of sub-words as candidate out-of-vocabulary words based on a first condition with respect to lengths of the sequences of sub-words and a second condition with respect to the number of repetitions of the sequences, audibly presenting to a user the candidate out-of-vocabulary words from the corresponding segments of the speech according to the track, and receiving from the user indications of valid words corresponding to audible presentations of the sequences of sub-words in the candidate out-of-vocabulary words, and training a speech recognition to additionally recognize the candidate out-of-vocabulary words,
    Type: Grant
    Filed: December 16, 2014
    Date of Patent: March 28, 2017
    Assignee: NICE-SYSTEMS LTD
    Inventors: Maor Nissan, Ronny Bretter
  • Patent number: 9553741
    Abstract: A neural network demodulator is used within a receiver to provide Inter Symbol Interference (ISI) channel equalization and to correct for I/Q phase imbalance. The neural network is trained with a single integrated training step to simultaneously handle the channel impairments of ISI equalization and I/Q phase imbalance as opposed to prior art methods of separately addressing each channel impairment in sequence.
    Type: Grant
    Filed: May 2, 2016
    Date of Patent: January 24, 2017
    Assignee: Kelquan Holdings Ltd.
    Inventors: Dhadesugoor Vaman, Oluwatobi Olabiyi
  • Patent number: 9218412
    Abstract: A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm.
    Type: Grant
    Filed: May 10, 2007
    Date of Patent: December 22, 2015
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero, Geoffrey G. Zweig
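    Example (illustrative only, not taken from the patent): the abstract refers to Tf/Idf scoring over short listings. A minimal sketch of plain TF-IDF ranking, assuming whitespace tokenization and the textbook weight tf * log(N / df); the function name `tfidf_rank` and the sample listings are hypothetical, and any listing-specific adjustments the patent makes are not reproduced here.

```python
import math
from collections import Counter


def tfidf_rank(listings, query):
    """Return indices of listings ordered by TF-IDF score for the query.

    Assumptions: lowercase whitespace tokens, idf = log(N / df),
    ties broken by original position, zero-score listings dropped.
    """
    docs = [Counter(text.lower().split()) for text in listings]
    n_docs = len(docs)
    # Document frequency: in how many listings each term occurs.
    doc_freq = Counter(term for doc in docs for term in doc)
    scored = []
    for idx, doc in enumerate(docs):
        score = 0.0
        for term in query.lower().split():
            if term in doc:
                score += doc[term] * math.log(n_docs / doc_freq[term])
        scored.append((score, idx))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for score, idx in scored if score > 0]


# Hypothetical business listings.
listings = ["joe pizza seattle", "pizza hut seattle", "joe coffee bar"]
ranking = tfidf_rank(listings, "joe pizza")
```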
  • Patent number: 9110880
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for pruning a language model are disclosed. The methods, systems, and apparatus include actions of selecting a candidate portion of the language model to evaluate for pruning, obtaining an entropy score representing information loss that would result from pruning the candidate portion of the language model, obtaining an acoustic score representing acoustic confusability of one or more words modeled by the candidate portion of the language model, and evaluating whether to prune the candidate portion of the language model using the entropy score and the acoustic score.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: August 18, 2015
    Assignee: Google Inc.
    Inventors: Brian Strope, Francoise Beaufays
  • Patent number: 9110852
    Abstract: Information may be extracted from a text corpus. The text corpus may be parsed into a parse tree structure based on the parts of speech of the words of the text corpus. A path in the parse tree structure may be identified as linking an entity and a value, and the path may be applied to the same or other text corpuses to extract other instances of entity-value pairs. Extracted information, associated paths, or both may be validated in some instances.
    Type: Grant
    Filed: July 20, 2012
    Date of Patent: August 18, 2015
    Assignee: Google Inc.
    Inventor: Kevin Lerman
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 8959019
    Abstract: Efficient empirical determination, computation, and use of an acoustic confusability measure comprises: (1) an empirically derived acoustic confusability measure, comprising a means for determining the acoustic confusability between any two textual phrases in a given language, where the measure of acoustic confusability is empirically derived from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology, and does not depend upon any particular internal structure or modeling technique, and where the procedure is based upon iterative improvement from an initial estimate; (2) techniques for efficient computation of empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and (3) a method for using acoustic confusability measures to make principled
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: February 17, 2015
    Assignee: Promptu Systems Corporation
    Inventors: Harry Printz, Narren Chittar
  • Patent number: 8953889
    Abstract: An augmented reality environment allows interaction between virtual and real objects and enhances an unstructured real-world environment. An object datastore comprising attributes of an object within the environment may be built and/or maintained from sources including manufacturers, retailers, shippers, and users. This object datastore may be local, cloud based, or a combination thereof. Applications may interrogate the object datastore to provide user functionality.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: February 10, 2015
    Assignee: Rawles LLC
    Inventors: William Spencer Worley, III, Edward Dietz Crump
  • Patent number: 8897397
    Abstract: A method and system for a reconfigurable orthogonal frequency division multiplexing (OFDM) chip supporting single weight diversity are provided. The reconfigurable OFDM chip may be configured to process signals such as IEEE 802.11, 802.16, and digital video broadcasting (DVB). The OFDM chip may generate channel weights to be applied to signals received in receive antennas. The weighted signals may be combined into a single received signal and channel estimates may be generated from the single received signal. Updated channel weights may be generated from the generated channel estimates. Updates to the channel weights may be performed dynamically. The configurable OFDM chip may be utilized to provide collaborative cellular and OFDM-based communication. The reconfigurable OFDM chip and the cellular chip may communicate data and/or control information via a memory coupled to a common bus.
    Type: Grant
    Filed: June 3, 2013
    Date of Patent: November 25, 2014
    Assignee: Broadcom Corporation
    Inventor: Pieter van Rooyen
  • Patent number: 8880399
    Abstract: In the field of language learning systems, proper pronunciation of words and phrases is an integral aspect of language learning. Determining the proximity of the language learner's pronunciation to a standardized, i.e. ‘perfect’, pronunciation is utilized to guide the learner from imperfect toward perfect pronunciation. In this regard, a phoneme lattice scoring system is utilized, whereby an input from a user is transduced into the perfect pronunciation example in a phoneme lattice. The cost of this transduction may be determined based on a summation of substitutions, deletions and insertions of phonemes needed to transduce from the input to the perfect pronunciation of the utterance.
    Type: Grant
    Filed: September 27, 2010
    Date of Patent: November 4, 2014
    Assignee: Rosetta Stone, Ltd.
    Inventors: Andreas Hagen, Bryan Pellom
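    Example (illustrative only, not taken from the patent): the abstract describes a cost summed over the phoneme substitutions, deletions and insertions needed to reach the reference pronunciation. A minimal sketch of that cost as a standard edit distance, assuming a single phoneme sequence per utterance rather than the phoneme lattice the patent uses; the function name `phoneme_edit_cost`, the unit costs, and the ARPAbet-style symbols are assumptions.

```python
def phoneme_edit_cost(spoken, target, sub_cost=1, del_cost=1, ins_cost=1):
    """Minimum total cost to turn the `spoken` phoneme list into `target`.

    Assumption: one flat sequence per utterance and fixed unit costs;
    the patent scores against a phoneme lattice instead.
    """
    rows, cols = len(spoken) + 1, len(target) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * del_cost          # delete every spoken phoneme
    for j in range(1, cols):
        cost[0][j] = j * ins_cost          # insert every target phoneme
    for i in range(1, rows):
        for j in range(1, cols):
            same = spoken[i - 1] == target[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitute
                cost[i - 1][j] + del_cost,                       # delete from spoken
                cost[i][j - 1] + ins_cost,                       # insert into spoken
            )
    return cost[-1][-1]


# Hypothetical learner attempt versus reference pronunciation.
learner = ["T", "AH", "M", "EY", "T", "OW"]
reference = ["T", "AH", "M", "AA", "T", "OW"]
distance = phoneme_edit_cost(learner, reference)
```

    A lower cost means the attempt is closer to the reference; a cost of zero means the two phoneme sequences are identical.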
  • Publication number: 20140324426
    Abstract: The present invention, pertaining to the field of speech recognition, discloses a reminder setting method and apparatus. The method includes: acquiring speech signals; acquiring time information in speech signals by using keyword recognition, and determining reminder time for reminder setting according to the time information; acquiring text sequence corresponding to the speech signals by using continuous speech recognition, and determining reminder content for reminder setting according to the time information and the text sequence; and setting a reminder according to the reminder time and the reminder content.
    Type: Application
    Filed: May 28, 2013
    Publication date: October 30, 2014
    Inventors: Li LU, Feng RAO, Song LIU, Zongyao TANG, Xiang ZHANG, Shuai YUE, Bo CHEN
  • Patent number: 8874540
    Abstract: A system and method for semantically classifying numerical data includes using semantic classification techniques on ‘nearby’ non-numerical data to identify a context whereby opaque data sets of numbers can be semantically classified inside of that context. An Electronic Knowledge Base is used to query against the context and determine the semantics of the opaque numeric data sets.
    Type: Grant
    Filed: September 7, 2011
    Date of Patent: October 28, 2014
    Assignee: Xerox Corporation
    Inventors: Michael David Shepherd, Dale Ellen Gaucas, Kirk J. Ocke
  • Patent number: 8861652
    Abstract: A system and method for determining a transport format of a transport channel is described. A guiding stream is received on a guiding transport channel and a guided stream is received on a guided transport channel. The guided stream is convolutionally decoded to produce a plurality of Viterbi path metrics. A transport format for the guided transport channel is selected from possible transport formats. The possible transport formats are determined by information provided on the guiding transport channel. The selection of the transport format is based at least in part on a metric computed from a combination of the Viterbi path metrics.
    Type: Grant
    Filed: March 31, 2012
    Date of Patent: October 14, 2014
    Assignee: Broadcom Corporation
    Inventor: Chuan-Hsuan Kuo
  • Patent number: 8849668
    Abstract: A speech recognition apparatus is provided. The speech recognition apparatus includes a primary speech recognition unit configured to perform speech recognition on input speech and thus to generate word lattice information, a word string generation unit configured to generate one or more word strings based on the word lattice information, a language model score calculation unit configured to calculate bidirectional language model scores of the generated word strings selectively using forward and backward language models for each of words in each of the generated word strings, and a sentence output unit configured to output one or more of the generated word strings with high scores as results of the speech recognition of the input speech based on the calculated bidirectional language model scores.
    Type: Grant
    Filed: May 24, 2011
    Date of Patent: September 30, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ick-Sang Han, Chi-Youn Park, Jeong-Su Kim, Jeong-Mi Cho
  • Patent number: 8849667
    Abstract: A computer-implemented method, apparatus and computer program product. The computer-implemented method performed by a computerized device, comprising: transforming a hidden Markov model to qubits; transforming data into groups of qubits, the data being determined upon the hidden Markov model and features extracted from an audio signal, the data representing a likelihood observation matrix representing likelihood of phoneme and state combinations in an audio signal; applying a quantum search algorithm for finding a maximal value of the qubits; and transforming the maximal value of the qubits into a number, the number representing an entry in a delta array used in speech recognition.
    Type: Grant
    Filed: July 7, 2013
    Date of Patent: September 30, 2014
    Assignee: Novospeech Ltd.
    Inventor: Yossef Ben-Ezra
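    Example (illustrative only, not taken from the patent): the abstract mentions a delta array whose entries are maximal values. In the classical Viterbi algorithm each delta entry is the best log score of any state path ending in a given state at a given frame. A minimal classical sketch follows, assuming a small hidden Markov model given as log probabilities; the function name `viterbi` and the toy numbers are hypothetical, and the quantum search step claimed by the patent is not modeled.

```python
import math


def viterbi(log_init, log_trans, log_obs):
    """Classical Viterbi decoding in the log domain.

    log_init[s]     : log prior of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_obs[t][s]   : log likelihood of frame t under state s
    Returns (best_state_path, best_log_score).
    """
    n_states = len(log_init)
    delta = [[log_init[s] + log_obs[0][s] for s in range(n_states)]]
    backptr = []
    for t in range(1, len(log_obs)):
        prev = delta[-1]
        row, ptr = [], []
        for s in range(n_states):
            # The max over predecessors is the step the patent targets.
            best_prev = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            row.append(prev[best_prev] + log_trans[best_prev][s] + log_obs[t][s])
            ptr.append(best_prev)
        delta.append(row)
        backptr.append(ptr)
    last = max(range(n_states), key=lambda s: delta[-1][s])
    path = [last]
    for ptr in reversed(backptr):          # trace back through the trellis
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[-1][last]


# Toy two-state model with three observation frames (made-up numbers).
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.4), math.log(0.6)]]
log_obs = [[math.log(0.9), math.log(0.1)],
           [math.log(0.2), math.log(0.8)],
           [math.log(0.1), math.log(0.9)]]
best_path, best_score = viterbi(log_init, log_trans, log_obs)
```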
  • Patent number: 8812315
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: August 19, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 8768696
    Abstract: A speech recognition circuit comprises a memory containing lexical data for word recognition, the lexical data comprising a plurality of lexical data structures stored in each of a plurality of parts of the memory; and a parallel processor structure connected to the memory to process speech parameters by performing parallel processing on a plurality of the lexical data structures.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: July 1, 2014
    Assignee: Zentian Limited
    Inventor: Mark Catchpole
  • Patent number: 8700399
    Abstract: In one embodiment the present invention includes a method comprising receiving an acoustic input signal and processing the acoustic input signal with a plurality of acoustic recognition processes configured to recognize the same target sound. Different acoustic recognition processes start processing different segments of the acoustic input signal at different time points in the acoustic input signal. In one embodiment, initial states in the recognition processes may be configured on each time step.
    Type: Grant
    Filed: July 6, 2010
    Date of Patent: April 15, 2014
    Assignee: Sensory, Inc.
    Inventors: Pieter J. Vermeulen, Jonathan Shaw, Todd F. Mozer
  • Patent number: 8688449
    Abstract: A weight coefficient generation device, a speech recognition device, a navigation system, a vehicle, a weight coefficient generation method, and a weight coefficient generation program are provided for the purpose of improving a speech recognition performance of place names. In order to address the above purpose, an address database 12 has address information data items including country names, city names, street names, and house numbers, and manages the address information having a tree structure indicating hierarchical relationships between the place names from a wide area to a narrow area. Each of the place names stored in the address database 12 is taken as a speech recognition candidate. A weight coefficient calculation unit 11 of a weight coefficient generation device 10 calculates a weight coefficient of the likelihood of the aforementioned recognition candidate based on the number of the street names belonging to the lower hierarchy below the city names.
    Type: Grant
    Filed: November 17, 2009
    Date of Patent: April 1, 2014
    Assignee: Asahi Kasei Kabushiki Kaisha
    Inventor: Toshiyuki Miyazaki
  • Publication number: 20140088963
    Abstract: A method and an apparatus for speech signal processing are provided. The method includes: receiving an encoded speech signal sent by a user equipment, where the encoded speech signal includes a first substream, a second substream, and a third substream, and the first substream is attached with a cyclic redundancy check (CRC); performing decoding processing on the first substream, the second substream, and the third substream by adopting a decoding algorithm, where a decoding algorithm that is based on an auxiliary decision of the CRC is adopted to perform decoding processing on the first substream; and sending decoding results of the first substream, the second substream, and the third substream to a base station controller, where the decoding result of the first substream includes a decoded bit stream and a CRC result. Decoding performance of the first substream is improved, and users' higher requirements for the speech quality are met.
    Type: Application
    Filed: November 27, 2013
    Publication date: March 27, 2014
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Yuejun Wei, Xin Tang, Qian Zhu
  • Patent number: 8676574
    Abstract: In a spoken language processing method for tone/intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.
    Type: Grant
    Filed: November 10, 2010
    Date of Patent: March 18, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli
  • Patent number: 8650032
    Abstract: The present invention discloses converting a text form into a speech. In the present invention, partial word lists of a data source are obtained by parsing the data source in parallel or in series. The partial word lists are then compiled to obtain phoneme graphs corresponding, respectively, to the partial word lists, and then the obtained phoneme graphs are combined. Speech recognition is then conducted according to the combination results. According to the present invention, computational complexity may be reduced and recognition efficiency may be improved during speech recognition.
    Type: Grant
    Filed: November 2, 2011
    Date of Patent: February 11, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Guo Kang Fu, Zhao Bing Han, Bin Jia, Ying Liu
  • Publication number: 20140006023
    Abstract: A method and apparatus to search a codebook including pulses that model a predetermined component of a speech signal. The method includes the operations of selecting a predetermined number of paths corresponding to a predetermined number of pulse locations that are most consistent with the predetermined component, from among paths corresponding to pulse locations of a predetermined pulse location set allocated to at least one branch that connects one state of a predetermined Trellis structure to another state, performing the path selecting operation on each of states other than the one state, and selecting a path corresponding to pulse locations that are most consistent with the predetermined component, from among paths including the selected paths. Accordingly, the number of calculations required during a codebook search is reduced.
    Type: Application
    Filed: September 6, 2013
    Publication date: January 2, 2014
    Applicant: SAMSUNG Electronics Co., Ltd.
    Inventors: Hosang SUNG, Kangeun LEE, Sang-won KANG, Thomas R. FISCHER, Ja-kyoung JUN
  • Patent number: 8620655
    Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
    Type: Grant
    Filed: August 10, 2011
    Date of Patent: December 31, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
  • Patent number: 8620656
    Abstract: The present invention discloses converting a text form into a speech. In the present invention, partial word lists of a data source are obtained by parsing the data source in parallel or in series. The partial word lists are then compiled to obtain phoneme graphs corresponding, respectively, to the partial word lists, and then the obtained phoneme graphs are combined. Speech recognition is then conducted according to the combination results. According to the present invention, computational complexity may be reduced and recognition efficiency may be improved during speech recognition.
    Type: Grant
    Filed: March 4, 2012
    Date of Patent: December 31, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Guo Kang Fu, Zhao Bing Han, Bin Jia, Ying Liu
  • Patent number: 8612225
    Abstract: A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.
    Type: Grant
    Filed: February 26, 2008
    Date of Patent: December 17, 2013
    Assignee: NEC Corporation
    Inventors: Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa
  • Patent number: 8606578
    Abstract: According to some embodiments, a method and apparatus are provided to buffer N audio frames of a plurality of audio frames associated with an audio signal, pre-compute scores for a subset of context dependent models (CDMs), and perform a graphical model search associated with the N audio frames where a score of a context independent model (CIM) associated with a CDM is used in lieu of a score for the CDM when a score for the CDM is needed and has not been pre-computed.
    Type: Grant
    Filed: June 25, 2009
    Date of Patent: December 10, 2013
    Assignee: Intel Corporation
    Inventors: Michael Eugene Deisher, Tao Ma
  • Patent number: 8589163
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset, receives input speech, generates a speech recognition lattice based on the received input speech using the masked language model, removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset, and recognizes the received speech based on the lattice. Alternatively during the generation step, the system can only add words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask.
    Type: Grant
    Filed: December 4, 2009
    Date of Patent: November 19, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Mazin Gilbert
  • Patent number: 8582677
    Abstract: There is provided a communication apparatus, including a transmission pattern generation unit that generates a transmission pattern according to a modulation method, a metric calculation unit that calculates an inter-signal distance between a received signal vector of received signals and an estimation vector, which is a product of channel information and the transmission pattern, a maximum likelihood pattern determination unit that determines a maximum likelihood signal pattern from the inter-signal distance calculated by the metric calculation unit, and an error estimation unit that estimates a phase error component and an amplitude error component contained in the received signal vector. The metric calculation unit calculates the inter-signal distance between the received signal vector and the estimation vector by using the phase error component and/or the amplitude error component estimated by the error estimation unit.
    Type: Grant
    Filed: June 8, 2009
    Date of Patent: November 12, 2013
    Assignee: Sony Corporation
    Inventor: Ryo Sawai
  • Patent number: 8578254
    Abstract: Systems and methods are provided for generating error events for decoded bits using a Soft output Viterbi algorithm (SOVA). A winning path through a trellis can be determined and decoded information can be generated. Path metric differences can be computed within the trellis based on the winning path. A plurality of error event masks and error event metrics can be generated based on the decoded information and the path metric differences.
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: November 5, 2013
    Assignee: Marvell International Ltd.
    Inventor: Manoj Kumar Yadav
  • Patent number: 8548807
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: October 1, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 8542760
    Abstract: A full-rate distributed space-time (ST) code design is provided for amplify-and-forward cooperative wireless channels. A signal space diversity technique is employed at the source node and a unique signature vector at each relay node. The distributed space-time (ST) codes can achieve full cooperative diversity and full rate. The achievable diversity gain is M+1, where M is the number of relay nodes. Optimal power allocation can be used to maximize the coding gain under a total power constraint.
    Type: Grant
    Filed: November 16, 2007
    Date of Patent: September 24, 2013
    Assignee: Lingna Holdings Pte., LLC
    Inventors: Wei Zhang, Khaled Ben Letaief
  • Patent number: 8509359
    Abstract: A multi-channel sequential Viterbi decoder includes: an input buffer, a “Read Data from Input Buffer” signal driver, a processing unit selector, a decoder channel parameters registers unit, a processing unit for a “Reset Path Metrics” command, a processing unit for a “Set Path Metric Value for the Given Path Number” command, a processing unit for a “Get Single Bit from the Path with Given Number” command, a processing unit for a “Process Input Samples” command, a memory for storing decoding paths and path metrics, a unit for generating an address for the memory, and data buffers for decoder channels output.
    Type: Grant
    Filed: October 15, 2012
    Date of Patent: August 13, 2013
    Assignee: Topcon Positioning Systems, Inc.
    Inventors: Timur G. Kelin, Dmitry D. Murzinov, Dmitry A. Pyatkov
  • Patent number: 8478589
    Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.
    Type: Grant
    Filed: January 5, 2005
    Date of Patent: July 2, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
  • Patent number: 8457230
    Abstract: A method and system for a reconfigurable orthogonal frequency division multiplexing (OFDM) chip supporting single weight diversity are provided. The reconfigurable OFDM chip may be configured to process signals such as IEEE 802.11, 802.16, and digital video broadcasting (DVB). The OFDM chip may generate channel weights to be applied to signals received in receive antennas. The weighted signals may be combined into a single received signal and channel estimates may be generated from the single received signal. Updated channel weights may be generated from the generated channel estimates. Updates to the channel weights may be performed dynamically. The configurable OFDM chip may be utilized to provide collaborative cellular and OFDM-based communication. The reconfigurable OFDM chip and the cellular chip may communicate data and/or control information via a memory coupled to a common bus.
    Type: Grant
    Filed: September 28, 2005
    Date of Patent: June 4, 2013
    Assignee: Broadcom Corporation
    Inventor: Pieter van Rooyen
  • Patent number: 8442163
    Abstract: Two decoding algorithms are introduced for the decoding of multi-level coded modulation and other types of coded modulation involving component codes and interleaving operations. An improved hard iterative decoding (IHID) algorithm is presented that improves upon a hard iteration decoding technique by adding a stopping criterion. Also, a list Viterbi hard iteration decoding (LV-IHID) algorithm is presented that employs list decoding in conjunction with the IHID algorithm. Both of these decoding algorithms improve upon conventional multi-stage decoding by reducing the effective error multiplicity that is observed at the lowest coding level. It is demonstrated that the LV-IHID algorithm performs close to soft iterative decoding. The computational and delay complexity of the proposed decoding algorithms compare favorably with soft iterative decoding strategies. Also, a novel labeling strategy for MLC design is presented.
    Type: Grant
    Filed: August 24, 2009
    Date of Patent: May 14, 2013
    Inventors: Eric Morgan Dowling, John P. Fonseka
  • Publication number: 20130117020
    Abstract: Disclosed are a personalized advertisement device based on speech recognition SMS services and a personalized advertisement exposure method based on speech recognition SMS services. The present invention provides a personalized advertisement device based on speech recognition SMS services and a personalized advertisement exposure method based on speech recognition SMS services capable of maximizing an effect of advertisement by grasping user's intention, an emotion state, and positional information from speech data uttered by a user during a process of providing speech recognition SMS services, configuring advertisements based thereon, and exposing the configured advertisements to a user.
    Type: Application
    Filed: September 5, 2012
    Publication date: May 9, 2013
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Hoon CHUNG, Jeon Gue Park, Hyung Bae Jeon, Ki Young Park, Yun Keun Lee, Sang Kyu Park
  • Patent number: 8428950
    Abstract: A speech recognition apparatus (110) selects an optimum recognition result from recognition results output from a set of speech recognizers (s1-sM) based on a majority decision. This decision takes into account weight values for the set of speech recognizers, learned by a learning apparatus (100). The learning apparatus includes a unit (103) selecting speech recognizers corresponding to characteristics of speech for learning (101), a unit (104) finding recognition results of the speech for learning by using the selected speech recognizers, a unit (105) unifying the recognition results and generating a word string network, and a unit (106) finding weight values concerning a set of the speech recognizers by implementing learning processing.
    Type: Grant
    Filed: January 18, 2008
    Date of Patent: April 23, 2013
    Assignee: NEC Corporation
    Inventors: Yoshifumi Onishi, Tadashi Emori
  • Patent number: 8392187
    Abstract: Methods, speech recognition systems, and computer readable media are provided that recognize speech using dynamic pruning techniques. A search network is expanded based on a frame from a speech signal, a best hypothesis is determined in the search network, a default beam threshold is modified, and the search network is pruned using the modified beam threshold. The search network may be further pruned based on the search depth of the best hypothesis and/or the average number of frames per state for a search path.
    Type: Grant
    Filed: January 30, 2009
    Date of Patent: March 5, 2013
    Assignee: Texas Instruments Incorporated
    Inventor: Qifeng Zhu
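    The pruning step in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say how the default beam threshold is modified, so the rule used here (tighten the beam when too many hypotheses are active) and the names `prune`, `max_active`, and `shrink` are assumptions made for the example.

```python
def prune(hyps, default_beam, max_active=1000, shrink=0.5):
    """Beam-prune a set of active hypotheses.

    hyps maps a search state to its log score (higher is better).
    Returns the hypotheses whose score lies within the beam of the best one.
    """
    if not hyps:
        return {}
    # Best hypothesis in the current search network.
    best = max(hyps.values())
    beam = default_beam
    # Hypothetical modification rule (an assumption, not from the abstract):
    # narrow the beam when the number of active hypotheses is too large.
    if len(hyps) > max_active:
        beam = default_beam * shrink
    # Keep only hypotheses within `beam` of the best score.
    return {state: score for state, score in hyps.items() if score >= best - beam}


hyps = {"a": -1.0, "b": -3.0, "c": -10.0}
kept_default = prune(hyps, default_beam=5.0)
kept_tight = prune(hyps, default_beam=5.0, max_active=2, shrink=0.2)
```

    With the default beam of 5.0, states "a" and "b" survive; once the beam is narrowed to 1.0, only "a" does.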
  • Patent number: 8385479
    Abstract: An apparatus and a method for canceling interference based on Maximum Likelihood (ML) at a receiver of a multi-antenna system are provided. The method includes estimating a channel using a signal received over one or more receiver antennas; generating one or more weights using the estimated channel to cancel interference; detecting a candidate symbol having the shortest Euclidean distance through Maximum Likelihood using a first weight; and soft-decoding the candidate symbol using a second weight. Therefore, by receiving the signal based on the ML under the interference, the system capacity can be increased and the reception performance can be enhanced.
    Type: Grant
    Filed: December 2, 2008
    Date of Patent: February 26, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Joo-Hyun Lee, Suk-Seung Hwang, Jong-Ho Lee, Soong-Yoon Choi
  • Patent number: 8355918
    Abstract: A method (10) in a speech recognition application callflow can include the steps of assigning (11) an individual option and a pre-built grammar to a same prompt, treating (15) the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase (12) or an annotation (13) in the pre-built grammar, and treating (14) the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.
    Type: Grant
    Filed: January 5, 2012
    Date of Patent: January 15, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Ciprian Agapi, Felipe Gomez, James R. Lewis, Vanessa V. Michelini
  • Patent number: 8340204
    Abstract: A Viterbi trellis processing technique in which soft decisions and hard decisions are derived from a received signal and the soft decisions are enhanced by being modified using the hard decisions. A log likelihood ratio for a bit of the received signal can be derived by grouping candidate metrics associated with the decision that the bit has a first state, grouping candidate metrics associated with the decision that the bit has a second state, applying respective functions to the groups and calculating the difference of the function values.
    Type: Grant
    Filed: August 5, 2005
    Date of Patent: December 25, 2012
    Assignees: MStar Semiconductor, Inc., MStar Software R&D (Shenzhen) Ltd., MStar France SAS, MStar Semiconductor, Inc.
    Inventors: Navid Fatemi-Ghomi, Cyril Valadon
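    The log likelihood ratio computation described in the abstract above (group the candidate metrics by bit decision, apply a function to each group, take the difference) can be sketched as below. This is only an illustration under stated assumptions: metrics are treated as log-likelihood-like values where higher means more likely, a positive result favors bit 1, and the group function is either `max` (the common max-log approximation) or a log-sum-exp. The abstract does not specify these choices, and the step that modifies soft decisions using hard decisions is not shown.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def bit_llr(candidates, func=max):
    """Compute an LLR for one bit from candidate metrics.

    candidates is a list of (bit, metric) pairs, where bit is 0 or 1 and
    metric is assumed to be a log-domain score (higher = more likely).
    func is the function applied to each group (assumed: max or logsumexp).
    """
    ones = [metric for bit, metric in candidates if bit == 1]
    zeros = [metric for bit, metric in candidates if bit == 0]
    if not ones or not zeros:
        raise ValueError("need at least one candidate metric for each bit value")
    # Difference of the function values of the two groups.
    return func(ones) - func(zeros)


candidates = [(1, -1.0), (1, -4.0), (0, -2.0), (0, -3.0)]
llr_maxlog = bit_llr(candidates)            # max-log approximation
llr_exact = bit_llr(candidates, logsumexp)  # log-sum-exp over each group
```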
  • Patent number: 8332222
    Abstract: A Viterbi decoder includes: an observation vector sequence generator for generating an observation vector sequence by converting an input speech to a sequence of observation vectors; a local optimal state calculator for obtaining a partial state sequence having a maximum similarity up to a current observation vector as an optimal state; an observation probability calculator for obtaining, as a current observation probability, a probability for observing the current observation vector in the optimal state; a buffer for storing therein a specific number of previous observation probabilities; a non-linear filter for calculating a filtered probability by using the previous observation probabilities stored in the buffer and the current observation probability; and a maximum likelihood calculator for calculating a partial maximum likelihood by using the filtered probability.
    Type: Grant
    Filed: July 21, 2009
    Date of Patent: December 11, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hoon Chung, Jeon Gue Park, Yunkeun Lee, Ho-Young Jung, Hyung-Bae Jeon, Jeom Ja Kang, Sung Joo Lee, Euisok Chung, Ji Hyun Wang, Byung Ok Kang, Ki-young Park, Jong Jin Kim
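    To make the decoding idea in the abstract above concrete, here is a small log-domain Viterbi sketch in which each state's observation score is passed through a filter over a short buffer of recent values before being added to the path score. The filter choice (a running median per state) and the names `viterbi` and `window` are assumptions for illustration; the abstract only says that a non-linear filter combines buffered previous observation probabilities with the current one. With `window=1` the filter is the identity and this reduces to the ordinary Viterbi algorithm.

```python
import math
from collections import deque
from statistics import median


def viterbi(log_init, log_trans, log_obs, window=1):
    """Return (best state path, best log score) for an HMM.

    log_init[s]     : log initial probability of state s
    log_trans[p][s] : log transition probability from p to s
    log_obs[t][s]   : log probability of observing frame t in state s
    window          : buffer length for the (assumed) median filter
    """
    n_states = len(log_init)
    buffers = [deque(maxlen=window) for _ in range(n_states)]

    def filtered(t, s):
        # Store the current observation score and filter over the buffer.
        buffers[s].append(log_obs[t][s])
        return median(buffers[s])

    delta = [log_init[s] + filtered(0, s) for s in range(n_states)]
    backptrs = []
    for t in range(1, len(log_obs)):
        new_delta, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            prev = max(range(n_states), key=lambda p: delta[p] + log_trans[p][s])
            ptr.append(prev)
            new_delta.append(delta[prev] + log_trans[prev][s] + filtered(t, s))
        delta = new_delta
        backptrs.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    best_score = delta[state]
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score


# Toy two-state model (values chosen only for the example).
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.4), math.log(0.6)]]
log_obs = [[math.log(0.5), math.log(0.1)],
           [math.log(0.4), math.log(0.3)],
           [math.log(0.1), math.log(0.6)]]
path, score = viterbi(log_init, log_trans, log_obs)
```

    Larger `window` values smooth the per-state observation scores over time, which changes the resulting path scores relative to the unfiltered decoder.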