Viterbi Trellis Patents (Class 704/242)

Determining answers to comparative questions

Patent number: 10963500

Abstract: A computer-implemented method includes receiving an input query including a comparative question referencing a plurality of entities. The comparative question includes a comparative term regarding a metric associated with the plurality of entities. The computer-implemented method includes generating a plurality of polar questions. Each of the plurality of polar questions seeks a polar answer to a comparison, along the metric and according to the comparative term, between a target entity of the plurality of entities and at least one second entity of the plurality of entities. The computer-implemented method includes determining confidence scores for the plurality of polar questions. The computer-implemented method includes identifying a particular polar question of the plurality of polar questions based on the confidence scores. The computer-implemented method includes outputting a response to the input query that identifies the target entity of the particular polar question as an answer to the input query.

Type: Grant

Filed: September 4, 2018

Date of Patent: March 30, 2021

Assignee: International Business Machines Corporation

Inventors: Stanley J. Vernier, Stephen A. Boxwell, Kyle M. Brake, Keith G. Frost

Automatically determining language for speech recognition of spoken utterance received via an automated assistant interface

Patent number: 10896672

Abstract: Implementations relate to determining a language for speech recognition of a spoken utterance, received via an automated assistant interface, for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating that a user explicitly designate a language to be utilized for each interaction. Selection of a speech recognition model for a particular language can be based on one or more interaction characteristics exhibited during a dialog session between a user and an automated assistant. Such interaction characteristics can include anticipated user input types, anticipated user input durations, a duration for monitoring for a user response, and/or an actual duration of a provided user response.

Type: Grant

Filed: April 16, 2018

Date of Patent: January 19, 2021

Assignee: GOOGLE LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno

System and method of automated model adaptation

Patent number: 10733977

Abstract: Methods, systems, and computer readable media for automated transcription model adaptation include obtaining audio data from a plurality of audio files. The audio data is transcribed to produce at least one audio file transcription which represents a plurality of transcription alternatives for each audio file. Speech analytics are applied to each audio file transcription. A best transcription is selected from the plurality of transcription alternatives for each audio file. Statistics from the selected best transcription are calculated. An adapted model is created from the calculated statistics.

Type: Grant

Filed: March 29, 2017

Date of Patent: August 4, 2020

Assignee: Verint Systems Ltd.

Inventors: Ran Achituv, Omer Ziv, Roni Romano, Ido Shapira, Daniel Baum

Voice recognition device and voice recognition method

Patent number: 10665227

Abstract: A voice recognition device extracts, from a first voice signal of a user, a first string of phonemes included in the first voice signal, extracts, from a second voice signal of the user, a second string of phonemes included in the second voice signal, extracts a string of common phonemes from the first string and the second string, calculates, for each of a plurality of registered keywords, a degree of similarity between a string of phonemes corresponding to the keyword and the string of common phonemes, and selects, among the plurality of keywords, a prescribed number of keywords based on the degree of similarity for each keyword.
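The flow in this abstract (two phoneme strings, a string of common phonemes, then a per-keyword similarity) can be illustrated with a minimal Python sketch. The abstract does not say how the common string or the degree of similarity is computed, so the use of `difflib.SequenceMatcher`, its `ratio()` score, and the names `common_phonemes` and `select_keywords` are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def common_phonemes(first, second):
    """Return phonemes shared by both utterances, in order (assumed method)."""
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    common = []
    # get_matching_blocks() yields non-overlapping matching runs in order;
    # the final sentinel block has size 0 and contributes nothing.
    for block in matcher.get_matching_blocks():
        common.extend(first[block.a:block.a + block.size])
    return common


def select_keywords(first, second, keyword_phonemes, top_n=1):
    """Rank registered keywords by similarity to the common phoneme string."""
    common = common_phonemes(first, second)
    scored = [
        (SequenceMatcher(None, phonemes, common, autojunk=False).ratio(), word)
        for word, phonemes in keyword_phonemes.items()
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top_n]]


# Hypothetical phoneme strings from two attempts at the same keyword.
first = ["e", "t", "o", "n", "a", "b", "i"]
second = ["n", "a", "b", "i", "d", "e"]
keywords = {
    "navi": ["n", "a", "b", "i"],
    "radio": ["r", "a", "j", "i", "o"],
    "phone": ["d", "e", "n", "w", "a"],
}
best = select_keywords(first, second, keywords, top_n=1)
```

Extracting the common part first discards filler that appears in only one utterance, so the keyword comparison runs against the portion the user repeated.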

Type: Grant

Filed: August 10, 2017

Date of Patent: May 26, 2020

Assignee: FUJITSU LIMITED

Inventor: Shoji Hayakawa

Filtering directive invoking vocal utterances

Patent number: 10586537

Abstract: Methods, computer program products, and systems are presented. The methods, computer program products, and systems can include, for instance: receiving, from a user, voice data defining a candidate directive invoking vocal utterance for invoking a directive to execute a first text based command to perform a first computer function of a computer system, wherein the candidate directive invoking vocal utterance includes at least one word or phrase of the text based command, wherein the computer system is configured to perform the first computer function in response to the first text based command and wherein the computer system is configured to perform a second computer function in response to a second text based command; determining, based on machine logic, whether a word or phrase of the candidate vocal utterance sounds confusingly similar to a speech rendering of a word or phrase defining the second text based command.

Type: Grant

Filed: November 30, 2017

Date of Patent: March 10, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jeremy A. Greenberger, Nicholas R. Sandonato

Concatenated expected responses for speech recognition using expected response boundaries to determine corresponding hypothesis boundaries

Patent number: 9984685

Abstract: A speech recognition system used for hands-free data entry receives and analyzes speech input to recognize and accept a user's response. Under certain conditions, a user's response might be expected. In these situations, the expected response may modify the behavior of the speech recognition system to improve performance. For example, if the hypothesis of a user's response matches the expected response then there is a high probability that the user's response was recognized correctly. This information may be used to make adjustments. An expected response may include expected response parts, each part containing expected words. By considering an expected response as the concatenation of expected response parts, each part may be considered independently for the purposes of adjusting an acceptance algorithm, adjusting a model, or recording an apparent error. In this way, the speech recognition system may make modifications based on a wide range of user responses.

Type: Grant

Filed: November 7, 2014

Date of Patent: May 29, 2018

Assignee: Hand Held Products, Inc.

Inventors: Keith Braho, Jason M. Makay

Compressed finite state transducers for automatic speech recognition

Patent number: 9865254

Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime portions of the FSTs may be decompressed for processing by an ASR engine.
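The weight-binning idea in this abstract can be sketched in a few lines of Python. The abstract does not specify the binning scheme, so the uniform bins, the 16-bin default, and the names `compress_weights` and `decompress_weights` are assumptions chosen only to show how binning reduces the bits stored per weight.

```python
def compress_weights(weights, num_bins=16):
    """Map each arc weight to a small integer bin index (uniform bins assumed)."""
    lo, hi = min(weights), max(weights)
    # Step between adjacent bin centres; 0.0 when all weights are equal.
    step = (hi - lo) / (num_bins - 1) if hi > lo and num_bins > 1 else 0.0
    codes = [round((w - lo) / step) if step else 0 for w in weights]
    return codes, lo, step


def decompress_weights(codes, lo, step):
    """Recover approximate weights from bin indices at runtime."""
    return [lo + code * step for code in codes]


# Hypothetical arc weights (e.g. negative log probabilities).
weights = [0.0, 0.7, 1.3, 2.9, 3.0]
codes, lo, step = compress_weights(weights, num_bins=16)
restored = decompress_weights(codes, lo, step)
```

With 16 bins each code fits in 4 bits instead of a 32-bit float, at the price of a reconstruction error of at most half a bin width per weight.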

Type: Grant

Filed: June 20, 2016

Date of Patent: January 9, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow

Performing speech recognition over a network and using speech recognition results based on determining that a network connection exists

Patent number: 9787830

Abstract: Systems, methods and apparatus for initiating communication. According to one implementation, a recording of a particular user speaking a name of a contact is obtained, a voice dialing command including an utterance of the name of the contact by the particular user is received, and in response to receiving the voice dialing command including the utterance of the name of the contact by the particular user, the recording of the particular user speaking the name of the contact or a text-to-speech audio output of the name is provided for output, and communication is initiated between the particular user and the contact.

Type: Grant

Filed: August 10, 2016

Date of Patent: October 10, 2017

Assignee: Google Inc.

Inventors: Craig L. Reding, Suzi Levas

Memorization system and method

Patent number: 9684437

Abstract: A system for memorization of content includes a memory storing the content. The memory includes a tangible computer readable medium with an instruction set, and at least one database. The database has information corresponding to the content. A processor communicating with the memory is configured to: execute the instruction set to present information corresponding to an original discrete portion of the content, and conduct an accuracy analysis of at least one version of the discrete portion audibly repeated by a user. An audio input device communicating with the processor receives the version from the user for use in the accuracy analysis by the processor. An input device inputs a user predetermined quantity of repetitions of the version presented to the user prior to the user progressing to a next discrete portion of the original discrete portion when a user predetermined accuracy threshold of the accuracy analysis is met.

Type: Grant

Filed: June 30, 2014

Date of Patent: June 20, 2017

Inventor: Michael L. Thornton, II

Out of vocabulary pattern learning

Patent number: 9607618

Abstract: A method for adapting a speech recognition system for out-of-vocabulary, comprising, decoding by a hybrid speech recognition a speech including out-of-vocabulary terms, thereby generating graphemic transcriptions of the speech with a mixture of recognized in-vocabulary words and unrecognized sub-words, while keeping a track of the decoded segments of the speech, determining in the transcription sequences of sub-words as candidate out-of-vocabulary words based on a first condition with respect to lengths of the sequences of sub-words and a second condition with respect to the number of repetitions of the sequences, audibly presenting to a user the candidate out-of-vocabulary words from the corresponding segments of the speech according to the track, and receiving from the user indications of valid words corresponding to audible presentations of the sequences of sub-words in the candidate out-of-vocabulary words, and training a speech recognition to additionally recognize the candidate out-of-vocabulary words,

Type: Grant

Filed: December 16, 2014

Date of Patent: March 28, 2017

Assignee: NICE-SYSTEMS LTD

Inventors: Maor Nissan, Ronny Bretter

Adaptive demodulation method and apparatus using an artificial neural network to improve data recovery in high speed channels

Patent number: 9553741

Abstract: A neural network demodulator is used within a receiver to provide Inter Symbol Interference (ISI) channel equalization and to correct for I/Q phase imbalance. The neural network is trained with a single integrated training step to simultaneously handle the channel impairments of ISI equalization and I/Q phase imbalance as opposed to prior art methods of separately addressing each channel impairment in sequence.

Type: Grant

Filed: May 2, 2016

Date of Patent: January 24, 2017

Assignee: Kelquan Holdings Ltd.

Inventors: Dhadesugoor Vaman, Oluwatobi Olabiyi

Searching a database of listings

Patent number: 9218412

Abstract: A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm.
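As a rough illustration of Tf/Idf ranking over short listings, the following Python sketch scores each listing against a query. The abstract names only the algorithm family, so the smoothed IDF formula, the length-normalized term frequency, and the function name `rank_listings` are assumptions, not the patented variant.

```python
import math
from collections import Counter


def rank_listings(query, listings):
    """Return listings that match the query, best Tf/Idf score first."""
    docs = [listing.lower().split() for listing in listings]
    n_docs = len(docs)
    # Document frequency: in how many listings each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scored = []
    for index, doc in enumerate(docs):
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term in term_freq:
                # Smoothed IDF (an assumed variant) keeps the value positive.
                idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
                score += (term_freq[term] / len(doc)) * idf
        scored.append((score, index))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [listings[index] for score, index in scored if score > 0]


# Hypothetical business listings.
listings = ["Joe's Pizza Seattle", "Pizza Hut", "Seattle Coffee Works"]
results = rank_listings("seattle pizza", listings)
```

Because listings are only a few words long, normalizing term frequency by listing length matters more here than in long-document search: a two-word listing that matches one query term scores higher than a three-word listing that matches the same single term.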

Type: Grant

Filed: May 10, 2007

Date of Patent: December 22, 2015

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero, Geoffrey G. Zweig

Methods and systems for extracting information from text

Patent number: 9110852

Abstract: Information may be extracted from a text corpus. The text corpus may be parsed into a parse tree structure based on the parts of speech of the words of the text corpus. A path in the parse tree structure may be identified as linking an entity and a value, and the path may be applied to the same or other text corpuses to extract other instances of entity-value pairs. Extracted information, associated paths, or both may be validated in some instances.

Type: Grant

Filed: July 20, 2012

Date of Patent: August 18, 2015

Assignee: Google Inc.

Inventor: Kevin Lerman

Acoustically informed pruning for language modeling

Patent number: 9110880

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for pruning a language model are disclosed. The methods, systems, and apparatus include actions of selecting a candidate portion of the language model to evaluate for pruning, obtaining an entropy score representing information loss that would result from pruning the candidate portion of the language model, obtaining an acoustic score representing acoustic confusability of one or more words modeled by the candidate portion of the language model, and evaluating whether to prune the candidate portion of the language model using the entropy score and the acoustic score.

Type: Grant

Filed: March 15, 2013

Date of Patent: August 18, 2015

Assignee: Google Inc.

Inventors: Brian Strope, Francoise Beaufays

System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring

Patent number: 9026442

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.

Type: Grant

Filed: August 14, 2014

Date of Patent: May 5, 2015

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal

Efficient empirical determination, computation, and use of acoustic confusability measures

Patent number: 8959019

Abstract: Efficient empirical determination, computation, and use of an acoustic confusability measure comprises: (1) an empirically derived acoustic confusability measure, comprising a means for determining the acoustic confusability between any two textual phrases in a given language, where the measure of acoustic confusability is empirically derived from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology, and does not depend upon any particular internal structure or modeling technique, and where the procedure is based upon iterative improvement from an initial estimate; (2) techniques for efficient computation of empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and (3) a method for using acoustic confusability measures to make principled

Type: Grant

Filed: October 31, 2007

Date of Patent: February 17, 2015

Assignee: Promptu Systems Corporation

Inventors: Harry Printz, Narren Chittar

Object datastore in an augmented reality environment

Patent number: 8953889

Abstract: An augmented reality environment allows interaction between virtual and real objects and enhances an unstructured real-world environment. An object datastore comprising attributes of an object within the environment may be built and/or maintained from sources including manufacturers, retailers, shippers, and users. This object datastore may be local, cloud based, or a combination thereof. Applications may interrogate the object datastore to provide user functionality.

Type: Grant

Filed: September 14, 2011

Date of Patent: February 10, 2015

Assignee: Rawles LLC

Inventors: William Spencer Worley, III, Edward Dietz Crump

Reconfigurable orthogonal frequency division multiplexing (OFDM) chip supporting single weight diversity

Patent number: 8897397

Abstract: A method and system for a reconfigurable orthogonal frequency division multiplexing (OFDM) chip supporting single weight diversity are provided. The reconfigurable OFDM chip may be configured to process signals such as IEEE 802.11, 802.16, and digital video broadcasting (DVB). The OFDM chip may generate channel weights to be applied to signals received in receive antennas. The weighted signals may be combined into a single received signal and channel estimates may be generated from the single received signal. Updated channel weights may be generated from the generated channel estimates. Updates to the channel weights may be performed dynamically. The configurable OFDM chip may be utilized to provide collaborative cellular and OFDM-based communication. The reconfigurable OFDM chip and the cellular chip may communicate data and/or control information via a memory coupled to a common bus.

Type: Grant

Filed: June 3, 2013

Date of Patent: November 25, 2014

Assignee: Broadcom Corporation

Inventor: Pieter van Rooyen

Utterance verification and pronunciation scoring by lattice transduction

Patent number: 8880399

Abstract: In the field of language learning systems, proper pronunciation of words and phrases is an integral aspect of language learning. Determining the proximity of the language learner's pronunciation to a standardized, i.e. ‘perfect’, pronunciation is utilized to guide the learner from imperfect toward perfect pronunciation. In this regard, a phoneme lattice scoring system is utilized, whereby an input from a user is transduced into the perfect pronunciation example in a phoneme lattice. The cost of this transduction may be determined based on a summation of substitutions, deletions and insertions of phonemes needed to transduce from the input to the perfect pronunciation of the utterance.
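The cost described here, a sum of phoneme substitutions, deletions and insertions, is the familiar edit-distance computation. The Python sketch below applies it to two flat phoneme sequences; the patent works on a phoneme lattice, so the single-sequence simplification, the unit costs, and the name `transduction_cost` are assumptions for illustration.

```python
def transduction_cost(spoken, reference, sub_cost=1, del_cost=1, ins_cost=1):
    """Minimum total edit cost to turn `spoken` into `reference`."""
    rows, cols = len(spoken) + 1, len(reference) + 1
    cost = [[0] * cols for _ in range(rows)]
    # Turning a prefix of `spoken` into nothing requires only deletions.
    for i in range(1, rows):
        cost[i][0] = i * del_cost
    # Building a prefix of `reference` from nothing requires only insertions.
    for j in range(1, cols):
        cost[0][j] = j * ins_cost
    for i in range(1, rows):
        for j in range(1, cols):
            same = spoken[i - 1] == reference[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitute
                cost[i - 1][j] + del_cost,                        # delete from spoken
                cost[i][j - 1] + ins_cost,                        # insert into spoken
            )
    return cost[-1][-1]


# Hypothetical learner attempt vs. reference pronunciation of "hello".
learner = ["HH", "AH", "L", "OW"]
target = ["HH", "EH", "L", "OW"]
score = transduction_cost(learner, target)
```

A lower cost means the learner's pronunciation is closer to the reference; separate cost parameters allow, for example, penalizing dropped phonemes more heavily than substituted ones.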

Type: Grant

Filed: September 27, 2010

Date of Patent: November 4, 2014

Assignee: Rosetta Stone, Ltd.

Inventors: Andreas Hagen, Bryan Pellom

REMINDER SETTING METHOD AND APPARATUS

Publication number: 20140324426

Abstract: The present invention, pertaining to the field of speech recognition, discloses a reminder setting method and apparatus. The method includes: acquiring speech signals; acquiring time information in speech signals by using keyword recognition, and determining reminder time for reminder setting according to the time information; acquiring text sequence corresponding to the speech signals by using continuous speech recognition, and determining reminder content for reminder setting according to the time information and the text sequence; and setting a reminder according to the reminder time and the reminder content.

Type: Application

Filed: May 28, 2013

Publication date: October 30, 2014

Inventors: Li LU, Feng RAO, Song LIU, Zongyao TANG, Xiang ZHANG, Shuai YUE, Bo CHEN

Method for semantic classification of numeric data sets

Patent number: 8874540

Abstract: A system and method for semantically classifying numerical data includes using semantic classification techniques on ‘nearby’ non-numerical data to identify a context whereby opaque data sets of numbers can be semantically classified inside of that context. An Electronic Knowledge Base is used to query against the context and determine the semantics of the opaque numeric data sets.

Type: Grant

Filed: September 7, 2011

Date of Patent: October 28, 2014

Assignee: Xerox Corporation

Inventors: Michael David Shepherd, Dale Ellen Gaucas, Kirk J. Ocke

Detecting format of a transport channel

Patent number: 8861652

Abstract: A system and method for determining a transport format of a transport channel is described. A guiding stream is received on a guiding transport channel and a guided stream is received on a guided transport channel. The guided stream is convolutionally decoded to produce a plurality of Viterbi path metrics. A transport format for the guided transport channel is selected from possible transport formats. The possible transport formats are determined by information provided on the guiding transport channel. The selection of the transport format is based at least in part on a metric computed from a combination of the Viterbi path metrics.

Type: Grant

Filed: March 31, 2012

Date of Patent: October 14, 2014

Assignee: Broadcom Corporation

Inventor: Chuan-Hsuan Kuo

Method and apparatus for speech recognition

Patent number: 8849667

Abstract: A computer-implemented method, apparatus and computer program product. The computer-implemented method performed by a computerized device, comprising: transforming a hidden Markov model to qubits; transforming data into groups of qubits, the data being determined upon the hidden Markov model and features extracted from an audio signal, the data representing a likelihood observation matrix representing likelihood of phoneme and state combinations in an audio signal; applying a quantum search algorithm for finding a maximal value of the qubits; and transforming the maximal value of the qubits into a number, the number representing an entry in a delta array used in speech recognition.
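For context, the "delta array" mentioned here is the table filled in by the classical Viterbi recursion over a hidden Markov model, where each entry is a maximum over predecessor states. The Python sketch below computes it with an ordinary `max`, which is an assumed classical stand-in for the quantum search step in the abstract; the function name `viterbi_delta` and the toy model values are illustrative only.

```python
def viterbi_delta(init, trans, emit, observations):
    """Fill the Viterbi delta table and recover the most likely state path."""
    n_states = len(init)
    # delta[t][j]: probability of the best path ending in state j at time t.
    delta = [[init[s] * emit[s][observations[0]] for s in range(n_states)]]
    back = []
    for obs in observations[1:]:
        prev = delta[-1]
        row, pointers = [], []
        for j in range(n_states):
            # Classical maximum over predecessor states (the step the
            # abstract replaces with a quantum search).
            best_i = max(range(n_states), key=lambda i: prev[i] * trans[i][j])
            row.append(prev[best_i] * trans[best_i][j] * emit[j][obs])
            pointers.append(best_i)
        delta.append(row)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: delta[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return delta, path


# Toy two-state model with three observation symbols (illustrative values).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
delta, path = viterbi_delta(init, trans, emit, [0, 1, 2])
```

Real recognizers usually work with log probabilities to avoid underflow on long utterances; plain probabilities are used here to keep the recursion easy to read.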

Type: Grant

Filed: July 7, 2013

Date of Patent: September 30, 2014

Assignee: Novospeech Ltd.

Inventor: Yossef Ben-Ezra

Speech recognition apparatus and method

Patent number: 8849668

Abstract: A speech recognition apparatus is provided. The speech recognition apparatus includes a primary speech recognition unit configured to perform speech recognition on input speech and thus to generate word lattice information, a word string generation unit configured to generate one or more word strings based on the word lattice information, a language model score calculation unit configured to calculate bidirectional language model scores of the generated word strings selectively using forward and backward language models for each of words in each of the generated word strings, and a sentence output unit configured to output one or more of the generated word strings with high scores as results of the speech recognition of the input speech based on the calculated bidirectional language model scores.

Type: Grant

Filed: May 24, 2011

Date of Patent: September 30, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ick-Sang Han, Chi-Youn Park, Jeong-Su Kim, Jeong-Mi Cho

System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring

Patent number: 8812315

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.

Type: Grant

Filed: October 1, 2013

Date of Patent: August 19, 2014

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal

Speech recognition circuit using parallel processors

Patent number: 8768696

Abstract: A speech recognition circuit comprises a memory containing lexical data for word recognition, the lexical data comprising a plurality of lexical data structures stored in each of a plurality of parts of the memory; and a parallel processor structure connected to the memory to process speech parameters by performing parallel processing on a plurality of the lexical data structures.

Type: Grant

Filed: October 5, 2011

Date of Patent: July 1, 2014

Assignee: Zentian Limited

Inventor: Mark Catchpole

Systems and methods for hands-free voice control and voice search

Patent number: 8700399

Abstract: In one embodiment the present invention includes a method comprising receiving an acoustic input signal and processing the acoustic input signal with a plurality of acoustic recognition processes configured to recognize the same target sound. Different acoustic recognition processes start processing different segments of the acoustic input signal at different time points in the acoustic input signal. In one embodiment, initial states in the recognition processes may be configured on each time step.

Type: Grant

Filed: July 6, 2010

Date of Patent: April 15, 2014

Assignee: Sensory, Inc.

Inventors: Pieter J. Vermeulen, Jonathan Shaw, Todd F. Mozer

Weight coefficient generation device, voice recognition device, navigation device, vehicle, weight coefficient generation method, and weight coefficient generation program

Patent number: 8688449

Abstract: A weight coefficient generation device, a speech recognition device, a navigation system, a vehicle, a weight coefficient generation method, and a weight coefficient generation program are provided for the purpose of improving a speech recognition performance of place names. In order to address the above purpose, an address database 12 has address information data items including country names, city names, street names, and house numbers, and manages the address information having a tree structure indicating hierarchical relationships between the place names from a wide area to a narrow area. Each of the place names stored in the address database 12 is taken as a speech recognition candidate. A weight coefficient calculation unit 11 of a weight coefficient generation device 10 calculates a weight coefficient of the likelihood of the aforementioned recognition candidate based on the number of the street names belonging to the lower hierarchy below the city names.

Type: Grant

Filed: November 17, 2009

Date of Patent: April 1, 2014

Assignee: Asahi Kasei Kabushiki Kaisha

Inventor: Toshiyuki Miyazaki

METHOD, APPARATUS, AND ACCESS NETWORK SYSTEM FOR SPEECH SIGNAL PROCESSING

Publication number: 20140088963

Abstract: A method and an apparatus for speech signal processing are provided. The method includes: receiving an encoded speech signal sent by a user equipment, where the encoded speech signal includes a first substream, a second substream, and a third substream, and the first substream is attached with a cyclic redundancy check (CRC); performing decoding processing on the first substream, the second substream, and the third substream by adopting a decoding algorithm, where a decoding algorithm that is based on an auxiliary decision of the CRC is adopted to perform decoding processing on the first substream; and sending decoding results of the first substream, the second substream, and the third substream to a base station controller, where the decoding result of the first substream includes a decoded bit stream and a CRC result. Decoding performance of the first substream is improved, and users' higher requirements for the speech quality are met.

Type: Application

Filed: November 27, 2013

Publication date: March 27, 2014

Applicant: Huawei Technologies Co., Ltd.

Inventors: Yuejun Wei, Xin Tang, Qian Zhu

Method for tone/intonation recognition using auditory attention cues

Patent number: 8676574

Abstract: In a spoken language processing method for tone/intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.

Type: Grant

Filed: November 10, 2010

Date of Patent: March 18, 2014

Assignee: Sony Computer Entertainment Inc.

Inventor: Ozlem Kalinli

Partial word lists into a phoneme tree

Patent number: 8650032

Abstract: The present invention discloses converting a text form into a speech. In the present invention, partial word lists of a data source are obtained by parsing the data source in parallel or in series. The partial word lists are then compiled to obtain phoneme graphs corresponding, respectively, to the partial word lists, and then the obtained phoneme graphs are combined. Speech recognition is then conducted according to the combination results. According to the present invention, computational complexity may be reduced and recognition efficiency may be improved during speech recognition.

Type: Grant

Filed: November 2, 2011

Date of Patent: February 11, 2014

Assignee: Nuance Communications, Inc.

Inventors: Guo Kang Fu, Zhao Bing Han, Bin Jia, Ying Liu

METHOD AND APPARATUS TO SEARCH FIXED CODEBOOK

Publication number: 20140006023

Abstract: A method and apparatus to search a codebook including pulses that model a predetermined component of a speech signal. The method includes the operations of selecting a predetermined number of paths corresponding to a predetermined number of pulse locations that are most consistent with the predetermined component, from among paths corresponding to pulse locations of a predetermined pulse location set allocated to at least one branch that connects one state of a predetermined Trellis structure to another state, performing the path selecting operation on each of states other than the one state, and selecting a path corresponding to pulse locations that are most consistent with the predetermined component, from among paths including the selected paths. Accordingly, the number of calculations required during a codebook search is reduced.

Type: Application

Filed: September 6, 2013

Publication date: January 2, 2014

Applicant: SAMSUNG Electronics Co., Ltd.

Inventors: Hosang SUNG, Kangeun LEE, Sang-won KANG, Thomas R. FISCHER, Ja-kyoung JUN

Speech processing system and method

Patent number: 8620655

Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic

Type: Grant

Filed: August 10, 2011

Date of Patent: December 31, 2013

Assignee: Kabushiki Kaisha Toshiba

Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
Converting partial word lists into a phoneme tree for speech recognition

Patent number: 8620656

Abstract: The present invention discloses converting a text form into a speech. In the present invention, partial word lists of a data source are obtained by parsing the data source in parallel or in series. The partial word lists are then compiled to obtain phoneme graphs corresponding, respectively, to the partial word lists, and then the obtained phoneme graphs are combined. Speech recognition is then conducted according to the combination results. According to the present invention, computational complexity may be reduced and recognition efficiency may be improved during speech recognition.

Type: Grant

Filed: March 4, 2012

Date of Patent: December 31, 2013

Assignee: Nuance Communications, Inc.

Inventors: Guo Kang Fu, Zhao Bing Han, Bin Jia, Ying Liu
Voice recognition device, voice recognition method, and voice recognition program

Patent number: 8612225

Abstract: A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.

Type: Grant

Filed: February 26, 2008

Date of Patent: December 17, 2013

Assignee: NEC Corporation

Inventors: Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa
Method and apparatus for improving memory locality for real-time speech recognition

Patent number: 8606578

Abstract: According to some embodiments, a method and apparatus are provided to buffer N audio frames of a plurality of audio frames associated with an audio signal, pre-compute scores for a subset of context dependent models (CDMs), and perform a graphical model search associated with the N audio frames where a score of a context independent model (CIM) associated with a CDM is used in lieu of a score for the CDM when a score for the CDM is needed and has not been pre-computed.
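
The fallback rule in this abstract can be illustrated with a minimal sketch. The function name, the dictionary layout, and the model labels below are hypothetical and are not taken from the patent; the sketch only shows a context independent model (CIM) score standing in for a context dependent model (CDM) score that has not been pre-computed.

```python
def score_for(cdm, precomputed, cdm_to_cim, cim_scores):
    """Return the CDM score if it was pre-computed, else the score of its CIM.

    All names here are illustrative assumptions, not the patent's terminology.
    """
    if cdm in precomputed:
        return precomputed[cdm]
    # Fallback: use the context independent model associated with this CDM.
    return cim_scores[cdm_to_cim[cdm]]


# Hypothetical scores for one buffered frame.
precomputed = {"a-b+c": -1.2}                 # CDM scores computed ahead of time
cdm_to_cim = {"a-b+c": "b", "x-b+y": "b"}     # each CDM maps to one CIM
cim_scores = {"b": -2.5}

hit = score_for("a-b+c", precomputed, cdm_to_cim, cim_scores)
miss = score_for("x-b+y", precomputed, cdm_to_cim, cim_scores)
```

Here `hit` comes from the pre-computed table, while `miss` falls back to the CIM score because `"x-b+y"` was never pre-computed.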

Type: Grant

Filed: June 25, 2009

Date of Patent: December 10, 2013

Assignee: Intel Corporation

Inventors: Michael Eugene Deisher, Tao Ma
Adapting language models with a bit mask for a subset of related words

Patent number: 8589163

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset, receives input speech, generates a speech recognition lattice based on the received input speech using the masked language model, removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset, and recognizes the received speech based on the lattice. Alternatively during the generation step, the system can only add words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask.
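
As a rough illustration of the lattice-pruning step described above, the sketch below stores the bit mask as a Python integer and the lattice as a list of `(start, end, word)` arcs. Both representations, and the names `prune_lattice` and `vocab_index`, are assumptions made for the example; the abstract does not specify them.

```python
def prune_lattice(arcs, vocab_index, mask):
    """Keep only arcs whose word has its bit set in `mask`.

    arcs: list of (start_node, end_node, word) tuples (assumed layout).
    vocab_index: maps each word to its bit position in the mask.
    """
    return [arc for arc in arcs if (mask >> vocab_index[arc[2]]) & 1]


vocab_index = {"call": 0, "home": 1, "pizza": 2}
mask = 0b011  # bits 0 and 1 set: "call" and "home" allowed, "pizza" disallowed
arcs = [(0, 1, "call"), (1, 2, "home"), (1, 2, "pizza")]

allowed = prune_lattice(arcs, vocab_index, mask)
```

The alternative mentioned in the abstract, adding only allowed words during lattice generation, would apply the same bit test before an arc is created rather than after.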

Type: Grant

Filed: December 4, 2009

Date of Patent: November 19, 2013

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Mazin Gilbert
Communication apparatus, communication method and computer program

Patent number: 8582677

Abstract: There is provided a communication apparatus, including a transmission pattern generation unit that generates a transmission pattern according to a modulation method, a metric calculation unit that calculates an inter-signal distance between a received signal vector of received signals and an estimation vector, which is a product of channel information and the transmission pattern, a maximum likelihood pattern determination unit that determines a maximum likelihood signal pattern from the inter-signal distance calculated by the metric calculation unit, and an error estimation unit that estimates a phase error component and an amplitude error component contained in the received signal vector. The metric calculation unit calculates the inter-signal distance between the received signal vector and the estimation vector by using the phase error component and/or the amplitude error component estimated by the error estimation unit.
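
The metric calculation and maximum likelihood selection can be sketched as an exhaustive search, shown below for a small real-valued example. This is a simplified illustration under stated assumptions: the function name `ml_detect`, the BPSK constellation, and the 2x2 channel are hypothetical, and the phase and amplitude error compensation described in the abstract is left out.

```python
from itertools import product


def ml_detect(received, channel, constellation):
    """Exhaustive maximum likelihood search over all transmission patterns.

    received: list of received samples, one per receive antenna.
    channel: channel matrix as a list of rows (receive x transmit).
    Returns the pattern with the smallest squared inter-signal distance.
    """
    n_tx = len(channel[0])
    best_pattern, best_dist = None, float("inf")
    for pattern in product(constellation, repeat=n_tx):
        dist = 0.0
        for row, y in zip(channel, received):
            # Estimation vector entry: channel row times transmission pattern.
            estimate = sum(h * x for h, x in zip(row, pattern))
            dist += abs(y - estimate) ** 2
        if dist < best_dist:
            best_pattern, best_dist = pattern, dist
    return best_pattern, best_dist


bpsk = (1 + 0j, -1 + 0j)
channel = [[1.0 + 0j, 0.5 + 0j], [0.2 + 0j, 1.0 + 0j]]
# Noisy observation of the pattern (1, -1), whose noiseless value is (0.5, -0.8).
received = [0.55 + 0.05j, -0.75 - 0.02j]

pattern, distance = ml_detect(received, channel, bpsk)
```

The search cost grows exponentially with the number of transmit antennas, so this form is only practical for small patterns.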

Type: Grant

Filed: June 8, 2009

Date of Patent: November 12, 2013

Assignee: Sony Corporation

Inventor: Ryo Sawai
Modified trace-back using soft output Viterbi algorithm (SOVA)

Patent number: 8578254

Abstract: Systems and methods are provided for generating error events for decoded bits using a soft output Viterbi algorithm (SOVA). A winning path through a trellis can be determined and decoded information can be generated. Path metric differences can be computed within the trellis based on the winning path. A plurality of error event masks and error event metrics can be generated based on the decoded information and the path metric differences.
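
To make the first two steps concrete, here is a minimal sketch of a Viterbi pass that records, at every state, the difference between the best and second-best incoming path metrics, then reads those differences off along the winning path. The trellis layout (fully connected, starting in state 0), the function name, and the branch costs are assumptions for illustration; the generation of error event masks and metrics described in the abstract is not shown.

```python
def viterbi_with_deltas(branch_costs, n_states):
    """Viterbi search that also returns path metric differences.

    branch_costs[t][p][s] is the cost of moving from state p to state s at
    step t (lower is better). Assumes n_states >= 2 and a start in state 0.
    """
    metrics = [0.0] + [float("inf")] * (n_states - 1)
    back, deltas = [], []
    for step in branch_costs:
        new_metrics, step_back, step_delta = [], [], []
        for s in range(n_states):
            cands = sorted((metrics[p] + step[p][s], p) for p in range(n_states))
            (best, best_prev), (runner_up, _) = cands[0], cands[1]
            new_metrics.append(best)
            step_back.append(best_prev)
            step_delta.append(runner_up - best)  # path metric difference
        metrics = new_metrics
        back.append(step_back)
        deltas.append(step_delta)

    # Trace back from the best final state, collecting deltas on the way.
    state = min(range(n_states), key=metrics.__getitem__)
    path, path_deltas = [state], []
    for step_back, step_delta in zip(reversed(back), reversed(deltas)):
        path_deltas.append(step_delta[state])
        state = step_back[state]
        path.append(state)
    return path[::-1], path_deltas[::-1]


costs = [
    [[0, 2], [1, 1]],
    [[1, 0], [0, 3]],
    [[2, 1], [0, 1]],
]
path, path_deltas = viterbi_with_deltas(costs, n_states=2)
```

A small difference marks a step where a competing path nearly won, which is where a decoded bit is least reliable. The infinite first entry reflects that only one path reaches each state after the first step.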

Type: Grant

Filed: November 6, 2012

Date of Patent: November 5, 2013

Assignee: Marvell International Ltd.

Inventor: Manoj Kumar Yadav
System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring

Patent number: 8548807

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.

Type: Grant

Filed: June 9, 2009

Date of Patent: October 1, 2013

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
Full-rate distributed space-time codes for cooperative communications

Patent number: 8542760

Abstract: A full-rate distributed space-time (ST) code design is provided for amplify-and-forward cooperative wireless channels. A signal space diversity technique is employed at the source node and a unique signature vector at each relay node. The distributed space-time (ST) codes can achieve full cooperative diversity and full rate. The achievable diversity gain is M+1, where M is the number of relay nodes. Optimal power allocation can be used to maximize the coding gain under a total power constraint.

Type: Grant

Filed: November 16, 2007

Date of Patent: September 24, 2013

Assignee: Lingna Holdings Pte., LLC

Inventors: Wei Zhang, Khaled Ben Letaief
Multi-channel sequential Viterbi decoder

Patent number: 8509359

Abstract: A multi-channel sequential Viterbi decoder includes: an input buffer, a “Read Data from Input Buffer” signal driver, a processing unit selector, a decoder channel parameters registers unit, a processing unit for a “Reset Path Metrics” command, a processing unit for a “Set Path Metric Value for the Given Path Number” command, a processing unit for a “Get Single Bit from the Path with Given Number” command, a processing unit for a “Process Input Samples” command, a memory for storing decoding paths and path metrics, a unit for generating an address for the memory, and data buffers for decoder channels output.

Type: Grant

Filed: October 15, 2012

Date of Patent: August 13, 2013

Assignee: Topcon Positioning Systems, Inc.

Inventors: Timur G. Kelin, Dmitry D. Murzinov, Dmitry A. Pyatkov
Library of existing spoken dialog data for use in generating new natural language spoken dialog systems

Patent number: 8478589

Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.

Type: Grant

Filed: January 5, 2005

Date of Patent: July 2, 2013

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
Reconfigurable orthogonal frequency division multiplexing (OFDM) chip supporting single weight diversity

Patent number: 8457230

Abstract: A method and system for a reconfigurable orthogonal frequency division multiplexing (OFDM) chip supporting single weight diversity are provided. The reconfigurable OFDM chip may be configured to process signals such as IEEE 802.11, 802.16, and digital video broadcasting (DVB). The OFDM chip may generate channel weights to be applied to signals received in receive antennas. The weighted signals may be combined into a single received signal and channel estimates may be generated from the single received signal. Updated channel weights may be generated from the generated channel estimates. Updates to the channel weights may be performed dynamically. The configurable OFDM chip may be utilized to provide collaborative cellular and OFDM-based communication. The reconfigurable OFDM chip and the cellular chip may communicate data and/or control information via a memory coupled to a common bus.

Type: Grant

Filed: September 28, 2005

Date of Patent: June 4, 2013

Assignee: Broadcom Corporation

Inventor: Pieter van Rooyen
List-Viterbi hard iterative decoder for multilevel codes

Patent number: 8442163

Abstract: Two decoding algorithms are introduced for the decoding of multi-level coded modulation and other types of coded modulation involving component codes and interleaving operations. An improved hard iterative decoding (IHID) algorithm is presented that improves upon a hard iteration decoding technique by adding a stopping criterion. Also, a list Viterbi hard iteration decoding (LV-IHID) algorithm is presented that employs list decoding in conjunction with the IHID algorithm. Both of these decoding algorithms improve upon conventional multi-stage decoding by reducing the effective error multiplicity that is observed at the lowest coding level. It is demonstrated that the LV-IHID algorithm performs close to soft iterative decoding. The computational and delay complexity of the proposed decoding algorithms compare favorably with soft iterative decoding strategies. Also, a novel labeling strategy for MLC design is presented.

Type: Grant

Filed: August 24, 2009

Date of Patent: May 14, 2013

Inventors: Eric Morgan Dowling, John P. Fonseka
Personalized advertisement device based on speech recognition SMS service, and personalized advertisement exposure method based on speech recognition SMS service

Publication number: 20130117020

Abstract: Disclosed are a personalized advertisement device based on speech recognition SMS services and a personalized advertisement exposure method based on speech recognition SMS services. The present invention provides a personalized advertisement device based on speech recognition SMS services and a personalized advertisement exposure method based on speech recognition SMS services capable of maximizing an effect of advertisement by grasping user's intention, an emotion state, and positional information from speech data uttered by a user during a process of providing speech recognition SMS services, configuring advertisements based thereon, and exposing the configured advertisements to a user.

Type: Application

Filed: September 5, 2012

Publication date: May 9, 2013

Applicant: Electronics and Telecommunications Research Institute

Inventors: Hoon CHUNG, Jeon Gue Park, Hyung Bae Jeon, Ki Young Park, Yun Keun Lee, Sang Kyu Park
Recognizer weight learning apparatus, speech recognition apparatus, and system

Patent number: 8428950

Abstract: A speech recognition apparatus (110) selects an optimum recognition result from recognition results output from a set of speech recognizers (s1-sM) based on a majority decision. This decision takes into account weight values for the set of speech recognizers, learned by a learning apparatus (100). The learning apparatus includes a unit (103) selecting speech recognizers corresponding to characteristics of speech for learning (101), a unit (104) finding recognition results of the speech for learning by using the selected speech recognizers, a unit (105) unifying the recognition results and generating a word string network, and a unit (106) finding weight values concerning a set of the speech recognizers by implementing learning processing.
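
A weighted majority decision of this kind can be sketched in a few lines. The example below votes over whole recognition strings, which is a simplification: the abstract describes unifying results into a word string network, and the weights shown are made-up values rather than learned ones.

```python
from collections import defaultdict


def weighted_vote(results, weights):
    """Pick the recognition result with the largest total recognizer weight.

    results[i] is the output of recognizer i and weights[i] is its weight
    (hypothetical values here; the patent learns them from training speech).
    """
    totals = defaultdict(float)
    for result, weight in zip(results, weights):
        totals[result] += weight
    return max(totals, key=totals.get)


results = ["recognize speech", "wreck a nice beach", "wreck a nice beach"]
weights = [0.7, 0.2, 0.3]

best = weighted_vote(results, weights)
```

With equal weights the two-recognizer majority would win; with these weights the single high-weight recognizer outvotes the other two.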

Type: Grant

Filed: January 18, 2008

Date of Patent: April 23, 2013

Assignee: NEC Corporation

Inventors: Yoshifumi Onishi, Tadashi Emori
Dynamic pruning for automatic speech recognition

Patent number: 8392187

Abstract: Methods, speech recognition systems, and computer readable media are provided that recognize speech using dynamic pruning techniques. A search network is expanded based on a frame from a speech signal, a best hypothesis is determined in the search network, a default beam threshold is modified, and the search network is pruned using the modified beam threshold. The search network may be further pruned based on the search depth of the best hypothesis and/or the average number of frames per state for a search path.
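
The pruning loop can be sketched as follows. The abstract does not say how the default beam threshold is modified, so the rule used here (halve the beam until at most `max_active` hypotheses survive) is purely an assumption, as are the function and parameter names.

```python
def prune(hyps, default_beam, max_active, min_beam=1e-3):
    """Beam-prune hypotheses, tightening the beam when too many survive.

    hyps: dict mapping hypothesis -> log score (higher is better).
    Returns the surviving hypotheses and the beam width actually used.
    """
    best = max(hyps.values())  # score of the best hypothesis
    beam = default_beam

    def survivors(width):
        return sum(score >= best - width for score in hyps.values())

    # Assumed modification rule: halve the beam while too many survive.
    while beam > min_beam and survivors(beam) > max_active:
        beam /= 2.0

    kept = {h: s for h, s in hyps.items() if s >= best - beam}
    return kept, beam


hyps = {"a": -1.0, "b": -2.0, "c": -4.0, "d": -9.0}
kept, beam = prune(hyps, default_beam=10.0, max_active=2)
```

In this run the beam shrinks from 10.0 to 5.0 to 2.5, at which point only `"a"` and `"b"` remain within the beam of the best score. The `min_beam` floor keeps the loop from running forever when many hypotheses tie with the best one.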

Type: Grant

Filed: January 30, 2009

Date of Patent: March 5, 2013

Assignee: Texas Instruments Incorporated

Inventor: Qifeng Zhu
Apparatus and method for canceling interference in multi-antenna system

Patent number: 8385479

Abstract: An apparatus and a method for canceling interference based on Maximum Likelihood (ML) at a receiver of a multi-antenna system are provided. The method includes estimating a channel using a signal received over one or more receiver antennas; generating one or more weights using the estimated channel to cancel interference; detecting a candidate symbol having the shortest Euclidean distance through Maximum Likelihood using a first weight; and soft-decoding the candidate symbol using a second weight. Therefore, by receiving the signal based on the ML under the interference, the system capacity can be increased and the reception performance can be enhanced.

Type: Grant

Filed: December 2, 2008

Date of Patent: February 26, 2013

Assignee: Samsung Electronics Co., Ltd.

Inventors: Joo-Hyun Lee, Suk-Seung Hwang, Jong-Ho Lee, Soong-Yoon Choi
Method and arrangement for managing grammar options in a graphical callflow builder

Patent number: 8355918

Abstract: A method (10) in a speech recognition application callflow can include the steps of assigning (11) an individual option and a pre-built grammar to a same prompt, treating (15) the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase (12) or an annotation (13) in the pre-built grammar, and treating (14) the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.

Type: Grant

Filed: January 5, 2012

Date of Patent: January 15, 2013

Assignee: Nuance Communications, Inc.

Inventors: Ciprian Agapi, Felipe Gomez, James R. Lewis, Vanessa V. Michelini
