Viterbi Trellis Patents (Class 704/242)
-
Patent number: 10963500
Abstract: A computer-implemented method includes receiving an input query including a comparative question referencing a plurality of entities. The comparative question includes a comparative term regarding a metric associated with the plurality of entities. The computer-implemented method includes generating a plurality of polar questions. Each of the plurality of polar questions seeks a polar answer to a comparison, along the metric and according to the comparative term, between a target entity of the plurality of entities and at least one second entity of the plurality of entities. The computer-implemented method includes determining confidence scores for the plurality of polar questions. The computer-implemented method includes identifying a particular polar question of the plurality of polar questions based on the confidence scores. The computer-implemented method includes outputting a response to the input query that identifies the target entity of the particular polar question as an answer to the input query.
Type: Grant
Filed: September 4, 2018
Date of Patent: March 30, 2021
Assignee: International Business Machines Corporation
Inventors: Stanley J. Vernier, Stephen A. Boxwell, Kyle M. Brake, Keith G. Frost
-
Patent number: 10896672
Abstract: Implementations relate to determining a language for speech recognition of a spoken utterance, received via an automated assistant interface, for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Selection of a speech recognition model for a particular language can be based on one or more interaction characteristics exhibited during a dialog session between a user and an automated assistant. Such interaction characteristics can include anticipated user input types, anticipated user input durations, a duration for monitoring for a user response, and/or an actual duration of a provided user response.
Type: Grant
Filed: April 16, 2018
Date of Patent: January 19, 2021
Assignee: GOOGLE LLC
Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno
-
Patent number: 10733977
Abstract: Methods, systems, and computer readable media for automated transcription model adaptation includes obtaining audio data from a plurality of audio files. The audio data is transcribed to produce at least one audio file transcription which represents a plurality of transcription alternatives for each audio file. Speech analytics are applied to each audio file transcription. A best transcription is selected from the plurality of transcription alternatives for each audio file. Statistics from the selected best transcription are calculated. An adapted model is created from the calculated statistics.
Type: Grant
Filed: March 29, 2017
Date of Patent: August 4, 2020
Assignee: Verint Systems Ltd.
Inventors: Ran Achituv, Omer Ziv, Roni Romano, Ido Shapira, Daniel Baum
-
Patent number: 10665227
Abstract: A voice recognition device extracts, from a first voice signal of a user, a first string of phonemes included in the first voice signal, extracts, from a second voice signal of the user, a second string of phonemes included in the second voice signal, extracts a string of common phonemes from the first string and the second string, calculates, for each of a plurality of registered keywords, a degree of similarity between a string of phonemes corresponding to the keyword and the string of common phonemes, and selects, among the plurality of keywords, a prescribed number of keywords based on the degree of similarity for each keyword.
Type: Grant
Filed: August 10, 2017
Date of Patent: May 26, 2020
Assignee: FUJITSU LIMITED
Inventor: Shoji Hayakawa
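The abstract above does not say how the common phonemes or the degree of similarity are computed. As a rough sketch only, the following assumes the common string is a longest common subsequence and the similarity is a normalized overlap; the function names `common_phonemes`, `similarity`, and `top_keywords` are hypothetical and not taken from the patent.

```python
def common_phonemes(first, second):
    """Longest common subsequence of two phoneme lists.

    Assumption: "common phonemes" is read here as an LCS; the patent
    abstract does not specify the extraction rule.
    """
    rows, cols = len(first), len(second)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if first[i] == second[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover one LCS.
    common = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if first[i - 1] == second[j - 1]:
            common.append(first[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return common[::-1]


def similarity(keyword_phonemes, common):
    """Overlap ratio in [0, 1]; an assumed stand-in for the degree of similarity."""
    total = len(keyword_phonemes) + len(common)
    if total == 0:
        return 0.0
    return 2 * len(common_phonemes(keyword_phonemes, common)) / total


def top_keywords(keywords, common, count):
    """Return the `count` keyword names whose phonemes best match `common`."""
    ranked = sorted(keywords, key=lambda name: similarity(keywords[name], common),
                    reverse=True)
    return ranked[:count]
```

With two utterances of roughly the same word, the common string keeps only the phonemes both recognitions agree on, and the registered keywords are ranked against that string.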
-
Patent number: 10586537
Abstract: Methods, computer program products, and systems are presented. The methods, computer program products, and systems can include, for instance: receiving, from a user, voice data defining a candidate directive invoking vocal utterance for invoking a directive to execute a first text based command to perform a first computer function of a computer system, wherein the candidate directive invoking vocal utterance includes at least one word or phrase of the text based command, wherein the computer system is configured to perform the first computer function in response to the first text based command and wherein the computer system is configured to perform a second computer function in response to a second text based command; determining, based on machine logic, whether a word or phrase of the candidate vocal utterance sounds confusingly similar to a speech rendering of a word or phrase defining the second text based command.
Type: Grant
Filed: November 30, 2017
Date of Patent: March 10, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jeremy A. Greenberger, Nicholas R. Sandonato
-
Patent number: 9984685
Abstract: A speech recognition system used for hands-free data entry receives and analyzes speech input to recognize and accept a user's response. Under certain conditions, a user's response might be expected. In these situations, the expected response may modify the behavior of the speech recognition system to improve performance. For example, if the hypothesis of a user's response matches the expected response then there is a high probability that the user's response was recognized correctly. This information may be used to make adjustments. An expected response may include expected response parts, each part containing expected words. By considering an expected response as the concatenation of expected response parts, each part may be considered independently for the purposes of adjusting an acceptance algorithm, adjusting a model, or recording an apparent error. In this way, the speech recognition system may make modifications based on a wide range of user responses.
Type: Grant
Filed: November 7, 2014
Date of Patent: May 29, 2018
Assignee: Hand Held Products, Inc.
Inventors: Keith Braho, Jason M. Makay
-
Patent number: 9865254
Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime portions of the FSTs may be decompressed for processing by an ASR engine.
Type: Grant
Filed: June 20, 2016
Date of Patent: January 9, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow
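Binning weights to save bits can be illustrated with a small quantizer. This is a minimal sketch assuming evenly spaced bins over the observed weight range; the abstract does not state how bins are chosen, and `build_bins`, `compress`, and `decompress` are hypothetical names.

```python
def build_bins(weights, bits):
    """Return 2**bits evenly spaced bin centers covering the weight range.

    Assumption: uniform bins; a real system might fit bins to the weight
    distribution instead.
    """
    count = 2 ** bits
    low, high = min(weights), max(weights)
    if count == 1 or high == low:
        return [low]
    step = (high - low) / (count - 1)
    return [low + k * step for k in range(count)]


def compress(weights, centers):
    """Replace each weight by the index of its nearest bin center."""
    return [min(range(len(centers)), key=lambda k: abs(centers[k] - w))
            for w in weights]


def decompress(indices, centers):
    """Recover approximate weights from bin indices at runtime."""
    return [centers[k] for k in indices]
```

Each arc then stores a small bin index (here `bits` wide) instead of a full floating-point weight, at the cost of a bounded rounding error.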
-
Patent number: 9787830
Abstract: Systems, methods and apparatus for initiating communication. According to one implementation, a recording of a particular user speaking a name of a contact is obtained, a voice dialing command including an utterance of the name of the contact by the particular user is received, and in response to receiving the voice dialing command including the utterance of the name of the contact by the particular user, the recording of the particular user speaking the name of the contact or a text-to-speech audio output of the name is provided for output, and communication is initiated between the particular user and the contact.
Type: Grant
Filed: August 10, 2016
Date of Patent: October 10, 2017
Assignee: Google Inc.
Inventors: Craig L. Reding, Suzi Levas
-
Patent number: 9684437
Abstract: A system for memorization of content includes a memory storing the content. The memory includes a tangible computer readable medium with an instruction set, and at least one database. The database has information corresponding to the content. A processor communicating with the memory is configured to: execute the instruction set to present information corresponding to an original discrete portion of the content, and conduct an accuracy analysis of at least one version of the discrete portion audibly repeated by a user. An audio input device communicating with the processor receives the version from the user for use in the accuracy analysis by the processor. An input device inputs a user predetermined quantity of repetitions of the version presented to the user prior to the user progressing to a next discrete portion of the original discrete portion when a user predetermined accuracy threshold of the accuracy analysis is met.
Type: Grant
Filed: June 30, 2014
Date of Patent: June 20, 2017
Inventor: Michael L. Thornton, II
-
Patent number: 9607618
Abstract: A method for adapting a speech recognition system for out-of-vocabulary, comprising, decoding by a hybrid speech recognition a speech including out-of-vocabulary terms, thereby generating graphemic transcriptions of the speech with a mixture of recognized in-vocabulary words and unrecognized sub-words, while keeping a track of the decoded segments of the speech, determining in the transcription sequences of sub-words as candidate out-of-vocabulary words based on a first condition with respect to lengths of the sequences of sub-words and a second condition with respect to the number of repetitions of the sequences, audibly presenting to a user the candidate out-of-vocabulary words from the corresponding segments of the speech according to the track, and receiving from the user indications of valid words corresponding to audible presentations of the sequences of sub-words in the candidate out-of-vocabulary words, and training a speech recognition to additionally recognize the candidate out-of-vocabulary words.
Type: Grant
Filed: December 16, 2014
Date of Patent: March 28, 2017
Assignee: NICE-SYSTEMS LTD
Inventors: Maor Nissan, Ronny Bretter
-
Patent number: 9553741
Abstract: A neural network demodulator is used within a receiver to provide Inter Symbol Interference (ISI) channel equalization and to correct for I/Q/phase imbalance. The neural network is trained with a single integrated training step to simultaneously handle the channel impairments of ISI equalization and I/Q phase imbalance as opposed to prior art methods of separately addressing each channel impairment in sequence.
Type: Grant
Filed: May 2, 2016
Date of Patent: January 24, 2017
Assignee: Kelquan Holdings Ltd.
Inventors: Dhadesugoor Vaman, Oluwatobi Olabiyi
-
Patent number: 9218412
Abstract: A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm.
Type: Grant
Filed: May 10, 2007
Date of Patent: December 22, 2015
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero, Geoffrey G. Zweig
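For readers unfamiliar with Tf/Idf scoring over short listings, a textbook version can be sketched as follows. This is a generic formulation, not the patented variant; the function name `tfidf_rank` and the whitespace tokenization are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_rank(listings, query):
    """Rank listing indices by summed Tf/Idf of the query terms, best first."""
    docs = [listing.lower().split() for listing in listings]
    # Document frequency: in how many listings each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    total = len(docs)
    scores = []
    for doc in docs:
        counts = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if counts[term]:
                tf = counts[term] / len(doc)
                idf = math.log(total / doc_freq[term])
                score += tf * idf
        scores.append(score)
    return sorted(range(total), key=lambda i: scores[i], reverse=True)
```

Because listings are only a few words long, term frequency carries little signal and the inverse document frequency term does most of the ranking work.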
-
Patent number: 9110880
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for pruning a language model are disclosed. The methods, systems, and apparatus include actions of selecting a candidate portion of the language model to evaluate for pruning, obtaining an entropy score representing information loss that would result from pruning the candidate portion of the language model, obtaining an acoustic score representing acoustic confusability of one or more words modeled by the candidate portion of the language model, and evaluating whether to prune the candidate portion of the language model using the entropy score and the acoustic score.
Type: Grant
Filed: March 15, 2013
Date of Patent: August 18, 2015
Assignee: Google Inc.
Inventors: Brian Strope, Francoise Beaufays
-
Patent number: 9110852
Abstract: Information may be extracted from a text corpus. The text corpus may be parsed into a parse tree structure based on the parts of speech of the words of the text corpus. A path in the parse tree structure may be identified as linking an entity and a value, and the path may be applied to the same or other text corpuses to extract other instances of entity-value pairs. Extracted information, associated paths, or both may be validated in some instances.
Type: Grant
Filed: July 20, 2012
Date of Patent: August 18, 2015
Assignee: Google Inc.
Inventor: Kevin Lerman
-
Patent number: 9026442
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: August 14, 2014
Date of Patent: May 5, 2015
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
-
Patent number: 8959019
Abstract: Efficient empirical determination, computation, and use of an acoustic confusability measure comprises: (1) an empirically derived acoustic confusability measure, comprising a means for determining the acoustic confusability between any two textual phrases in a given language, where the measure of acoustic confusability is empirically derived from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology, and does not depend upon any particular internal structure or modeling technique, and where the procedure is based upon iterative improvement from an initial estimate; (2) techniques for efficient computation of empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and (3) a method for using acoustic confusability measures to make principled
Type: Grant
Filed: October 31, 2007
Date of Patent: February 17, 2015
Assignee: Promptu Systems Corporation
Inventors: Harry Printz, Narren Chittar
-
Patent number: 8953889
Abstract: An augmented reality environment allows interaction between virtual and real objects and enhances an unstructured real-world environment. An object datastore comprising attributes of an object within the environment may be built and/or maintained from sources including manufacturers, retailers, shippers, and users. This object datastore may be local, cloud based, or a combination thereof. Applications may interrogate the object datastore to provide user functionality.
Type: Grant
Filed: September 14, 2011
Date of Patent: February 10, 2015
Assignee: Rawles LLC
Inventors: William Spencer Worley, III, Edward Dietz Crump
-
Patent number: 8897397
Abstract: A method and system for a reconfigurable orthogonal frequency division multiplexing (OFDM) chip supporting single weight diversity are provided. The reconfigurable OFDM chip may be configured to process signals such as IEEE 802.11, 802.16, and digital video broadcasting (DVB). The OFDM chip may generate channel weights to be applied to signals received in receive antennas. The weighted signals may be combined into a single received signal and channel estimates may be generated from the single received signal. Updated channel weights may be generated from the generated channel estimates. Updates to the channel weights may be performed dynamically. The configurable OFDM chip may be utilized to provide collaborative cellular and OFDM-based communication. The reconfigurable OFDM chip and the cellular chip may communicate data and/or control information via a memory coupled to a common bus.
Type: Grant
Filed: June 3, 2013
Date of Patent: November 25, 2014
Assignee: Broadcom Corporation
Inventor: Pieter van Rooyen
-
Patent number: 8880399
Abstract: In the field of language learning systems, proper pronunciation of words and phrases is an integral aspect of language learning, determining the proximity of the language learner's pronunciation to a standardized, i.e. ‘perfect’, pronunciation is utilized to guide the learner from imperfect toward perfect pronunciation. In this regard, a phoneme lattice scoring system is utilized, whereby an input from a user is transduced into the perfect pronunciation example in a phoneme lattice. The cost of this transduction may be determined based on a summation of substitutions, deletions and insertions of phonemes needed to transduce from the input to the perfect pronunciation of the utterance.
Type: Grant
Filed: September 27, 2010
Date of Patent: November 4, 2014
Assignee: Rosetta Stone, Ltd.
Inventors: Andreas Hagen, Bryan Pellom
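The summation of substitutions, deletions and insertions described above resembles a classic edit distance over phoneme sequences. The sketch below assumes unit cost for every operation and plain sequences rather than a lattice, neither of which the abstract specifies; `transduction_cost` is a hypothetical name.

```python
def transduction_cost(spoken, reference):
    """Minimum number of phoneme substitutions, deletions and insertions
    needed to turn `spoken` into `reference` (unit costs assumed)."""
    previous = list(range(len(reference) + 1))
    for i, spoken_phoneme in enumerate(spoken, start=1):
        current = [i]
        for j, reference_phoneme in enumerate(reference, start=1):
            substitute = previous[j - 1] + (spoken_phoneme != reference_phoneme)
            delete = previous[j] + 1       # drop a spoken phoneme
            insert = current[j - 1] + 1    # add a missing reference phoneme
            current.append(min(substitute, delete, insert))
        previous = current
    return previous[-1]
```

A cost of zero means the learner's phonemes match the reference exactly; larger values indicate a pronunciation further from the reference.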
-
Publication number: 20140324426
Abstract: The present invention, pertaining to the field of speech recognition, discloses a reminder setting method and apparatus. The method includes: acquiring speech signals; acquiring time information in speech signals by using keyword recognition, and determining reminder time for reminder setting according to the time information; acquiring text sequence corresponding to the speech signals by using continuous speech recognition, and determining reminder content for reminder setting according to the time information and the text sequence; and setting a reminder according to the reminder time and the reminder content.
Type: Application
Filed: May 28, 2013
Publication date: October 30, 2014
Inventors: Li LU, Feng RAO, Song LIU, Zongyao TANG, Xiang ZHANG, Shuai YUE, Bo CHEN
-
Patent number: 8874540
Abstract: A system and method for semantically classifying numerical data includes using semantic classification techniques on ‘nearby’ non-numerical data to identify a context whereby opaque data sets of numbers can be semantically classified inside of that context. An Electronic Knowledge Base is used to query against the context and determine the semantics of the opaque numeric data sets.
Type: Grant
Filed: September 7, 2011
Date of Patent: October 28, 2014
Assignee: Xerox Corporation
Inventors: Michael David Shepherd, Dale Ellen Gaucas, Kirk J. Ocke
-
Patent number: 8861652
Abstract: A system and method for determining a transport format of a transport channel is described. A guiding stream is received on a guiding transport channel and a guided stream is received on a guided transport channel. The guided stream is convolutionally decoded to produce a plurality of Viterbi path metrics. A transport format for the guided transport channel is selected from possible transport formats. The possible transport formats are determined by information provided on the guiding transport channel. The selection of the transport format is based at least in part on a metric computed from a combination of the Viterbi path metrics.
Type: Grant
Filed: March 31, 2012
Date of Patent: October 14, 2014
Assignee: Broadcom Corporation
Inventor: Chuan-Hsuan Kuo
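To show where Viterbi path metrics come from, here is a small hard-decision decoder. It assumes a rate-1/2, constraint-length-3 convolutional code with generators 7 and 5 (octal), chosen only for illustration; the patent's actual code and its transport-format selection rule are not modeled, and `encode` and `viterbi_decode` are hypothetical names.

```python
def encode(bits):
    """Rate-1/2 convolutional encoder, generators 111 and 101 (assumed code)."""
    s1 = s2 = 0
    out = []
    for b in bits:
        out.append(b ^ s1 ^ s2)  # generator 111
        out.append(b ^ s2)       # generator 101
        s1, s2 = b, s1
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding over the 4-state trellis.

    Returns the most likely bit sequence and the final path metric
    (accumulated Hamming distance) of every state.
    """
    inf = float("inf")
    metrics = [0, inf, inf, inf]  # state = (s1 << 1) | s2, start in state 0
    paths = [[], [], [], []]
    for t in range(0, len(received), 2):
        r0, r1 = received[t], received[t + 1]
        new_metrics = [inf] * 4
        new_paths = [None] * 4
        for state in range(4):
            if metrics[state] == inf:
                continue
            s1, s2 = state >> 1, state & 1
            for b in (0, 1):
                o0, o1 = b ^ s1 ^ s2, b ^ s2
                cost = metrics[state] + (o0 != r0) + (o1 != r1)
                nxt = (b << 1) | s1
                if cost < new_metrics[nxt]:  # keep the survivor path
                    new_metrics[nxt] = cost
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(4), key=lambda s: metrics[s])
    return paths[best], metrics
```

The returned per-state metrics are the kind of quantity a receiver could combine into a confidence measure when comparing candidate formats, though how the patent combines them is not stated in the abstract.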
-
Patent number: 8849668
Abstract: A speech recognition apparatus is provided. The speech recognition apparatus includes a primary speech recognition unit configured to perform speech recognition on input speech and thus to generate word lattice information, a word string generation unit configured to generate one or more word strings based on the word lattice information, a language model score calculation unit configured to calculate bidirectional language model scores of the generated word strings selectively using forward and backward language models for each of words in each of the generated word strings, and a sentence output unit configured to output one or more of the generated word strings with high scores as results of the speech recognition of the input speech based on the calculated bidirectional language model scores.
Type: Grant
Filed: May 24, 2011
Date of Patent: September 30, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Ick-Sang Han, Chi-Youn Park, Jeong-Su Kim, Jeong-Mi Cho
-
Patent number: 8849667
Abstract: A computer-implemented method, apparatus and computer program product. The computer-implemented method performed by a computerized device, comprising: transforming a hidden Markov model to qubits; transforming data into groups of qubits, the data being determined upon the hidden Markov model and features extracted from an audio signal, the data representing a likelihood observation matrix representing likelihood of phoneme and state combinations in an audio signal; applying a quantum search algorithm for finding a maximal value of the qubits; and transforming the maximal value of the qubits into a number, the number representing an entry in a delta array used in speech recognition.
Type: Grant
Filed: July 7, 2013
Date of Patent: September 30, 2014
Assignee: Novospeech Ltd.
Inventor: Yossef Ben-Ezra
-
Patent number: 8812315
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: October 1, 2013
Date of Patent: August 19, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
-
Patent number: 8768696
Abstract: A speech recognition circuit comprises a memory containing lexical data for word recognition, the lexical data comprising a plurality of lexical data structures stored in each of a plurality of parts of the memory; and a parallel processor structure connected to the memory to process speech parameters by performing parallel processing on a plurality of the lexical data structures.
Type: Grant
Filed: October 5, 2011
Date of Patent: July 1, 2014
Assignee: Zentian Limited
Inventor: Mark Catchpole
-
Patent number: 8700399
Abstract: In one embodiment the present invention includes a method comprising receiving an acoustic input signal and processing the acoustic input signal with a plurality of acoustic recognition processes configured to recognize the same target sound. Different acoustic recognition processes start processing different segments of the acoustic input signal at different time points in the acoustic input signal. In one embodiment, initial states in the recognition processes may be configured on each time step.
Type: Grant
Filed: July 6, 2010
Date of Patent: April 15, 2014
Assignee: Sensory, Inc.
Inventors: Pieter J. Vermeulen, Jonathan Shaw, Todd F. Mozer
-
Patent number: 8688449
Abstract: A weight coefficient generation device, a speech recognition device, a navigation system, a vehicle, a weight coefficient generation method, and a weight coefficient generation program are provided for the purpose of improving a speech recognition performance of place names. In order to address the above purpose, an address database 12 has address information data items including country names, city names, street names, and house numbers, and manages the address information having a tree structure indicating hierarchical relationships between the place names from wide area to a narrow area. Each of the place names stored in the address database 12 is taken as a speech recognition candidate. A weight coefficient calculation unit 11 of a weight coefficient generation device 10 calculates a weight coefficient of the likelihood of the aforementioned recognition candidate based on the number of the street names belonging to the lower hierarchy below the city names.
Type: Grant
Filed: November 17, 2009
Date of Patent: April 1, 2014
Assignee: Asahi Kasei Kabushiki Kaisha
Inventor: Toshiyuki Miyazaki
-
Publication number: 20140088963
Abstract: A method and an apparatus for speech signal processing are provided. The method includes: receiving an encoded speech signal sent by a user equipment, where the encoded speech signal includes a first substream, a second substream, and a third substream, and the first substream is attached with a cyclic redundancy check (CRC); performing decoding processing on the first substream, the second substream, and the third substream by adopting a decoding algorithm, where a decoding algorithm that is based on an auxiliary decision of the CRC is adopted to perform decoding processing on the first substream; and sending decoding results of the first substream, the second substream, and the third substream to a base station controller, where the decoding result of the first substream includes a decoded bit stream and a CRC result. Decoding performance of the first substream is improved, and users' higher requirements for the speech quality are met.
Type: Application
Filed: November 27, 2013
Publication date: March 27, 2014
Applicant: Huawei Technologies Co., Ltd.
Inventors: Yuejun Wei, Xin Tang, Qian Zhu
-
Patent number: 8676574
Abstract: In a spoken language processing method for tone/intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.
Type: Grant
Filed: November 10, 2010
Date of Patent: March 18, 2014
Assignee: Sony Computer Entertainment Inc.
Inventor: Ozlem Kalinli
-
Patent number: 8650032
Abstract: The present invention discloses converting a text form into a speech. In the present invention, partial word lists of a data source are obtained by parsing the data source in parallel or in series. The partial word lists are then compiled to obtain phoneme graphs corresponding, respectively, to the partial word lists, and then the obtained phoneme graphs are combined. Speech recognition is then conducted according to the combination results. According to the present invention, computational complexity may be reduced and recognition efficiency may be improved during speech recognition.
Type: Grant
Filed: November 2, 2011
Date of Patent: February 11, 2014
Assignee: Nuance Communications, Inc.
Inventors: Guo Kang Fu, Zhao Bing Han, Bin Jia, Ying Liu
-
Publication number: 20140006023
Abstract: A method and apparatus to search a codebook including pulses that model a predetermined component of a speech signal. The method includes the operations of selecting a predetermined number of paths corresponding to a predetermined number of pulse locations that are most consistent with the predetermined component, from among paths corresponding to pulse locations of a predetermined pulse location set allocated to at least one branch that connects one state of a predetermined Trellis structure to another state, performing the path selecting operation on each of states other than the one state, and selecting a path corresponding to pulse locations that are most consistent with the predetermined component, from among paths including the selected paths. Accordingly, the number of calculations required during a codebook search is reduced.
Type: Application
Filed: September 6, 2013
Publication date: January 2, 2014
Applicant: SAMSUNG Electronics Co., Ltd.
Inventors: Hosang SUNG, Kangeun LEE, Sang-won KANG, Thomas R. FISCHER, Ja-kyoung JUN
-
Patent number: 8620655
Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
Type: Grant
Filed: August 10, 2011
Date of Patent: December 31, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
-
Patent number: 8620656
Abstract: The present invention discloses converting a text form into a speech. In the present invention, partial word lists of a data source are obtained by parsing the data source in parallel or in series. The partial word lists are then compiled to obtain phoneme graphs corresponding, respectively, to the partial word lists, and then the obtained phoneme graphs are combined. Speech recognition is then conducted according to the combination results. According to the present invention, computational complexity may be reduced and recognition efficiency may be improved during speech recognition.
Type: Grant
Filed: March 4, 2012
Date of Patent: December 31, 2013
Assignee: Nuance Communications, Inc.
Inventors: Guo Kang Fu, Zhao Bing Han, Bin Jia, Ying Liu
-
Patent number: 8612225
Abstract: A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.
Type: Grant
Filed: February 26, 2008
Date of Patent: December 17, 2013
Assignee: NEC Corporation
Inventors: Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa
-
Patent number: 8606578Abstract: According to some embodiments, a method and apparatus are provided to buffer N audio frames of a plurality of audio frames associated with an audio signal, pre-compute scores for a subset of context dependent models (CDMs), and perform a graphical model search associated with the N audio frames where a score of a context independent model (CIM) associated with a CDM is used in lieu of a score for the CDM when a score for the CDM is needed and has not been pre-computed.Type: GrantFiled: June 25, 2009Date of Patent: December 10, 2013Assignee: Intel CorporationInventors: Michael Eugene Deisher, Tao Ma
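The fallback rule in this abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function name `model_score`, the dictionary layout, and the example scores are all hypothetical.

```python
def model_score(cdm, precomputed_cdm_scores, cim_scores, cim_of):
    """Return the CDM score if it was pre-computed; otherwise fall back
    to the score of the context independent model (CIM) tied to that CDM.

    All argument names and structures are assumptions for illustration.
    """
    if cdm in precomputed_cdm_scores:
        return precomputed_cdm_scores[cdm]
    return cim_scores[cim_of[cdm]]


# Hypothetical log-likelihood scores for one buffered frame.
precomputed = {"a-b+c": -4.2}               # only a subset of CDMs is scored ahead of time
cim_scores = {"b": -5.0}                    # CIM scores are assumed always available
cim_of = {"a-b+c": "b", "x-b+y": "b"}       # CDM -> associated CIM

hit = model_score("a-b+c", precomputed, cim_scores, cim_of)    # pre-computed CDM score
miss = model_score("x-b+y", precomputed, cim_scores, cim_of)   # CIM score used in lieu
```

The search pays for an exact context dependent score only where one was prepared in advance and accepts the coarser context independent score elsewhere.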
-
Patent number: 8589163Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset, receives input speech, generates a speech recognition lattice based on the received input speech using the masked language model, removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset, and recognizes the received speech based on the lattice. Alternatively during the generation step, the system can only add words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask.Type: GrantFiled: December 4, 2009Date of Patent: November 19, 2013Assignee: AT&T Intellectual Property I, L.P.Inventors: Andrej Ljolje, Mazin Gilbert
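As a rough sketch of the masking step described above, the snippet below removes lattice arcs whose words are disallowed by a bit mask. The lattice representation (a list of `(start, end, word)` arcs), the vocabulary indexing, and the function names are assumptions made for illustration; the abstract does not specify them.

```python
def is_allowed(word_index, bit_mask):
    """True if the bit for this word is set in the mask (bit i <-> word i)."""
    return ((bit_mask >> word_index) & 1) == 1


def prune_lattice(arcs, vocab_index, bit_mask):
    """Keep only arcs whose word is allowed for the current adaptation subset."""
    return [arc for arc in arcs if is_allowed(vocab_index[arc[2]], bit_mask)]


# Hypothetical three-word vocabulary and a tiny lattice.
vocab_index = {"call": 0, "play": 1, "stop": 2}
bit_mask = 0b101                            # "call" and "stop" allowed, "play" disallowed
arcs = [(0, 1, "call"), (0, 1, "play"), (1, 2, "stop")]

pruned = prune_lattice(arcs, vocab_index, bit_mask)
```

Updating the adaptation subset then only requires swapping the integer mask; the language model itself stays untouched.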
-
Patent number: 8582677Abstract: There is provided a communication apparatus, including a transmission pattern generation unit that generates a transmission pattern according to a modulation method, a metric calculation unit that calculates an inter-signal distance between a received signal vector of received signals and an estimation vector, which is a product of channel information and the transmission pattern, a maximum likelihood pattern determination unit that determines a maximum likelihood signal pattern from the inter-signal distance calculated by the metric calculation unit, and an error estimation unit that estimates a phase error component and an amplitude error component contained in the received signal vector. The metric calculation unit calculates the inter-signal distance between the received signal vector and the estimation vector by using the phase error component and/or the amplitude error component estimated by the error estimation unit.Type: GrantFiled: June 8, 2009Date of Patent: November 12, 2013Assignee: Sony CorporationInventor: Ryo Sawai
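The metric calculation and maximum likelihood pattern selection can be illustrated with an exhaustive search. This is only a sketch: applying the estimated amplitude and phase errors as a single complex factor on the estimation vector is an assumption, as are the function name `ml_detect` and the example channel values.

```python
import cmath
from itertools import product


def ml_detect(y, H, constellation, amp_err=1.0, phase_err=0.0):
    """Return the transmission pattern minimizing ||y - c * H x||^2,
    where c = amp_err * exp(j * phase_err) is an assumed error model."""
    n_rx, n_tx = len(H), len(H[0])
    corr = amp_err * cmath.exp(1j * phase_err)
    best_pattern, best_dist = None, float("inf")
    for pattern in product(constellation, repeat=n_tx):
        # Estimation vector: channel matrix times candidate pattern.
        est = [corr * sum(H[r][c] * pattern[c] for c in range(n_tx))
               for r in range(n_rx)]
        # Inter-signal distance between received and estimation vectors.
        dist = sum(abs(yr - er) ** 2 for yr, er in zip(y, est))
        if dist < best_dist:
            best_pattern, best_dist = pattern, dist
    return best_pattern, best_dist


# Hypothetical 2x2 channel, BPSK symbols, and a slightly noisy observation of (1, -1).
H = [[1.0, 0.5], [0.2, 1.0]]
y = [0.55, -0.75]
pattern, dist = ml_detect(y, H, constellation=(1, -1))
```

The search visits every pattern, so its cost grows exponentially with the number of transmit streams; practical receivers usually restrict the candidate set.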
-
Patent number: 8578254Abstract: Systems and methods are provided for generating error events for decoded bits using a Soft output Viterbi algorithm (SOVA). A winning path through a trellis can be determined and decoded information can be generated. Path metric differences can be computed within the trellis based on the winning path. A plurality of error event masks and error event metrics can be generated based on the decoded information and the path metric differences.Type: GrantFiled: November 6, 2012Date of Patent: November 5, 2013Assignee: Marvell International Ltd.Inventor: Manoj Kumar Yadav
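The first two steps named in this abstract, finding the winning path and computing path metric differences along it, can be sketched as follows. This is a generic add-compare-select loop under stated assumptions (lower metric is better, decoding starts in state 0, branch metrics are supplied as a table); it does not attempt the error event masks or error event metrics, and the function name and data layout are hypothetical.

```python
import math


def viterbi_with_deltas(branch_metrics, num_states):
    """branch_metrics[t][prev][cur] is the cost of moving prev -> cur at step t.
    Returns (winning state path, metric differences along it, final metric)."""
    pm = [0.0] + [math.inf] * (num_states - 1)   # assumed start in state 0
    back, deltas = [], []
    for bm in branch_metrics:
        new_pm, bp, dl = [], [], []
        for cur in range(num_states):
            # Add-compare-select: rank all incoming candidates by metric.
            cands = sorted((pm[prev] + bm[prev][cur], prev)
                           for prev in range(num_states))
            best, best_prev = cands[0]
            runner_up = cands[1][0] if len(cands) > 1 else math.inf
            new_pm.append(best)
            bp.append(best_prev)
            # Difference between the losing and surviving path metrics.
            dl.append(runner_up - best if best < math.inf else math.inf)
        pm = new_pm
        back.append(bp)
        deltas.append(dl)
    state = min(range(num_states), key=lambda s: pm[s])
    final_metric = pm[state]
    path, path_deltas = [state], []
    for t in range(len(back) - 1, -1, -1):       # trace back the winning path
        path_deltas.append(deltas[t][state])
        state = back[t][state]
        path.append(state)
    path.reverse()
    path_deltas.reverse()
    return path, path_deltas, final_metric


# Hypothetical 2-state trellis with three steps.
bms = [
    [[0.0, 2.0], [1.0, 1.0]],
    [[1.0, 0.5], [0.0, 3.0]],
    [[2.0, 2.0], [0.5, 1.0]],
]
path, path_deltas, metric = viterbi_with_deltas(bms, num_states=2)
```

A small difference at some step means the competing path was nearly as good there, which is the usual starting point for soft-output reliability estimates.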
-
Patent number: 8548807Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.Type: GrantFiled: June 9, 2009Date of Patent: October 1, 2013Assignee: AT&T Intellectual Property I, L.P.Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
-
Patent number: 8542760Abstract: A full-rate distributed space-time (ST) code design is provided for amplify-and-forward cooperative wireless channels. A signal space diversity technique is employed at the source node and a unique signature vector at each relay node. The distributed space-time (ST) codes can achieve full cooperative diversity and full rate. The achievable diversity gain is M+1, where M is the number of relay nodes. Optimal power allocation can be used to maximize the coding gain under a total power constraint.Type: GrantFiled: November 16, 2007Date of Patent: September 24, 2013Assignee: Lingna Holdings Pte., LLCInventors: Wei Zhang, Khaled Ben Letaief
-
Patent number: 8509359Abstract: A multi-channel sequential Viterbi decoder includes: an input buffer, a “Read Data from Input Buffer” signal driver, a processing unit selector, a decoder channel parameters registers unit, a processing unit for a “Reset Path Metrics” command, a processing unit for a “Set Path Metric Value for the Given Path Number” command, a processing unit for a “Get Single Bit from the Path with Given Number” command, a processing unit for a “Process Input Samples” command, a memory for storing decoding paths and path metrics, a unit for generating an address for the memory, and data buffers for decoder channels output.Type: GrantFiled: October 15, 2012Date of Patent: August 13, 2013Assignee: Topcon Positioning Systems, Inc.Inventors: Timur G. Kelin, Dmitry D. Murzinov, Dmitry A. Pyatkov
-
Patent number: 8478589Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.Type: GrantFiled: January 5, 2005Date of Patent: July 2, 2013Assignee: AT&T Intellectual Property II, L.P.Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
-
Patent number: 8457230Abstract: A method and system for a reconfigurable orthogonal frequency division multiplexing (OFDM) chip supporting single weight diversity are provided. The reconfigurable OFDM chip may be configured to process signals such as IEEE 802.11, 802.16, and digital video broadcasting (DVB). The OFDM chip may generate channel weights to be applied to signals received in receive antennas. The weighted signals may be combined into a single received signal and channel estimates may be generated from the single received signal. Updated channel weights may be generated from the generated channel estimates. Updates to the channel weights may be performed dynamically. The configurable OFDM chip may be utilized to provide collaborative cellular and OFDM-based communication. The reconfigurable OFDM chip and the cellular chip may communicate data and/or control information via a memory coupled to a common bus.Type: GrantFiled: September 28, 2005Date of Patent: June 4, 2013Assignee: Broadcom CorporationInventor: Pieter van Rooyen
-
Patent number: 8442163Abstract: Two decoding algorithms are introduced for the decoding of multi-level coded modulation and other types of coded modulation involving component codes and interleaving operations. An improved hard iterative decoding (IHID) algorithm is presented that improves upon a hard iteration decoding technique by adding a stopping criterion. Also, a list Viterbi hard iteration decoding (LV-IHID) algorithm is presented that employs list decoding in conjunction with the IHID algorithm. Both of these decoding algorithms improve upon conventional multi-stage decoding by reducing the effective error multiplicity that is observed at the lowest coding level. It is demonstrated that the LV-IHID algorithm performs close to soft iterative decoding. The computational and delay complexity of the proposed decoding algorithms compare favorably with soft iterative decoding strategies. Also, a novel labeling strategy for MLC design is presented.Type: GrantFiled: August 24, 2009Date of Patent: May 14, 2013Inventors: Eric Morgan Dowling, John P. Fonseka
-
Publication number: 20130117020Abstract: Disclosed are a personalized advertisement device and a personalized advertisement exposure method, both based on speech recognition SMS services. The device and method aim to maximize advertising effect by inferring the user's intention, emotional state, and positional information from speech data uttered by the user while speech recognition SMS services are being provided, configuring advertisements based on that information, and exposing the configured advertisements to the user.Type: ApplicationFiled: September 5, 2012Publication date: May 9, 2013Applicant: Electronics and Telecommunications Research InstituteInventors: Hoon Chung, Jeon Gue Park, Hyung Bae Jeon, Ki Young Park, Yun Keun Lee, Sang Kyu Park
-
Patent number: 8428950Abstract: A speech recognition apparatus (110) selects an optimum recognition result from recognition results output from a set of speech recognizers (s1-sM) based on a majority decision. This decision is implemented with taking into account weight values, as to the set of the speech recognizers, learned by a learning apparatus (100). The learning apparatus includes a unit (103) selecting speech recognizers corresponding to characteristics of speech for learning (101), a unit (104) finding recognition results of the speech for learning by using the selected speech recognizers, a unit (105) unifying the recognition results and generating a word string network, and a unit (106) finding weight values concerning a set of the speech recognizers by implementing learning processing.Type: GrantFiled: January 18, 2008Date of Patent: April 23, 2013Assignee: NEC CorporationInventors: Yoshifumi Onishi, Tadashi Emori
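A weighted majority decision of the kind described here can be sketched briefly. The version below votes over whole recognition results rather than over the word string network mentioned in the abstract, which is a simplification; the weights, the results, and the function name `weighted_vote` are hypothetical.

```python
from collections import defaultdict


def weighted_vote(results, weights):
    """Sum the weight of every recognizer that produced each result and
    return the result with the largest total."""
    totals = defaultdict(float)
    for result, weight in zip(results, weights):
        totals[result] += weight
    return max(totals, key=totals.get)


# Hypothetical outputs of three recognizers and their learned weights.
results = ["turn left", "turn left", "turn right"]
weights = [0.2, 0.3, 0.6]

chosen = weighted_vote(results, weights)
```

With these numbers the single heavily weighted recognizer (0.6) outvotes the two that agree (0.5 in total), which is the difference between a weighted and a plain majority decision.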
-
Patent number: 8392187Abstract: Methods, speech recognition systems, and computer readable media are provided that recognize speech using dynamic pruning techniques. A search network is expanded based on a frame from a speech signal, a best hypothesis is determined in the search network, a default beam threshold is modified, and the search network is pruned using the modified beam threshold. The search network may be further pruned based on the search depth of the best hypothesis and/or the average number of frames per state for a search path.Type: GrantFiled: January 30, 2009Date of Patent: March 5, 2013Assignee: Texas Instruments IncorporatedInventor: Qifeng Zhu
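The pruning step can be illustrated with a simple beam filter. The abstract does not say how the default beam threshold is modified, so the multiplicative `beam_scale` below is purely an assumption, as are the function name and the example log scores.

```python
def prune_hypotheses(hyps, default_beam, beam_scale=1.0):
    """Keep hypotheses whose log score lies within the (modified) beam
    of the best hypothesis. Higher scores are better."""
    best = max(hyps.values())
    beam = default_beam * beam_scale      # assumed form of the threshold modification
    return {h: s for h, s in hyps.items() if s >= best - beam}


# Hypothetical log scores for three active search paths after one frame.
hyps = {"a": -10.0, "b": -12.0, "c": -20.0}

wide = prune_hypotheses(hyps, default_beam=15.0)                    # default beam
narrow = prune_hypotheses(hyps, default_beam=15.0, beam_scale=0.5)  # tightened beam
```

Tightening the beam removes path "c" here, which shows the trade-off: fewer active hypotheses per frame at the risk of discarding a path that would have won later.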
-
Patent number: 8385479Abstract: An apparatus and a method for canceling interference based on Maximum Likelihood (ML) at a receiver of a multi-antenna system are provided. The method includes estimating a channel using a signal received over one or more receiver antennas; generating one or more weights using the estimated channel to cancel interference; detecting a candidate symbol having the shortest Euclidean distance through Maximum Likelihood using a first weight; and soft-decoding the candidate symbol using a second weight. Therefore, by receiving the signal based on the ML under the interference, the system capacity can be increased and the reception performance can be enhanced.Type: GrantFiled: December 2, 2008Date of Patent: February 26, 2013Assignee: Samsung Electronics Co., Ltd.Inventors: Joo-Hyun Lee, Suk-Seung Hwang, Jong-Ho Lee, Soong-Yoon Choi
-
Patent number: 8355918Abstract: A method (10) in a speech recognition application callflow can include the steps of assigning (11) an individual option and a pre-built grammar to a same prompt, treating (15) the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase (12) or an annotation (13) in the pre-built grammar, and treating (14) the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.Type: GrantFiled: January 5, 2012Date of Patent: January 15, 2013Assignee: Nuance Communications, Inc.Inventors: Ciprian Agapi, Felipe Gomez, James R. Lewis, Vanessa V. Michelini