Patents Examined by David D. Knepper

In-the-field adaptation of a large vocabulary automatic speech recognizer (ASR)

Patent number: 7505905

Abstract: A technique for improving the recognition accuracy of a speech recognizer includes deploying the speech recognizer, wherein live input data is received by the recognizer as an input for a given speaker independent adaptation algorithm associated with the speech recognizer. The algorithm enhances the accuracy of the speech recognizer without human supervision. This technique is particularly suitable for adapting a large vocabulary ASR engine.

Type: Grant

Filed: May 13, 1999

Date of Patent: March 17, 2009

Assignee: Nuance Communications, Inc.

Inventors: Roger Scott Zimmerman, Gary Neil Tajchman, Ian Scott Boardman, Hejko Willy Rahmel, Thomas B. Schalk
Using source-channel models for word segmentation

Patent number: 7493251

Abstract: A method and apparatus for segmenting text is provided that identifies a sequence of entity types from a sequence of characters and thereby identifies a segmentation for the sequence of characters. Under the invention, the sequence of entity types is identified using probabilistic models that describe the likelihood of a sequence of entities and the likelihood of sequences of characters given particular entities. Under one aspect of the invention, organization name entities are identified from a first sequence of identified entities to form a final sequence of identified entities.

Type: Grant

Filed: May 30, 2003

Date of Patent: February 17, 2009

Assignee: Microsoft Corporation

Inventors: Jianfeng Gao, Mu Li, Chang-Ning Huang, Jian Sun, Lei Zhang, Ming Zhou
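The source-channel idea in the abstract of patent 7493251 above can be illustrated with a small dynamic-programming search. This is only a minimal sketch under simplifying assumptions, not the patented method: the prior over entity sequences is reduced to a unigram prior, and the names `segment`, `class_prior`, `channel`, and `max_len` are hypothetical.

```python
import math


def segment(text, class_prior, channel, max_len=4):
    """Return the most likely [(piece, entity_type), ...] split of text.

    class_prior: dict entity_type -> P(entity_type)   (assumed unigram prior)
    channel:     dict entity_type -> dict string -> P(string | entity_type)
    Returns None when no segmentation has non-zero probability.
    """
    n = len(text)
    # best[i] = (log-probability of best split of text[:i], backpointer)
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            if best[start][0] == -math.inf:
                continue  # no valid split reaches this start position
            piece = text[start:end]
            for cls, prior in class_prior.items():
                p = channel[cls].get(piece, 0.0)
                if p <= 0.0 or prior <= 0.0:
                    continue
                score = best[start][0] + math.log(prior) + math.log(p)
                if score > best[end][0]:
                    best[end] = (score, (start, cls))
    if n > 0 and best[n][1] is None:
        return None
    # Follow backpointers from the end of the text to recover the split.
    pieces = []
    pos = n
    while pos > 0:
        start, cls = best[pos][1]
        pieces.append((text[start:pos], cls))
        pos = start
    return pieces[::-1]
```

Choosing the entity types fixes the segmentation at the same time, which is the point the abstract makes; a fuller model would replace the unigram prior with a model over entity sequences.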
Coupled hidden Markov model (CHMM) for continuous audiovisual speech recognition

Patent number: 7454342

Abstract: Method and apparatus for an audiovisual continuous speech recognition (AVCSR) system using a coupled hidden Markov model (CHMM) are described herein. In one aspect, an exemplary process includes receiving an audio data stream and a video data stream, and performing continuous speech recognition based on the audio and video data streams using a plurality of hidden Markov models (HMMs), a node of each of the HMMs at a time slot being subject to one or more nodes of related HMMs at a preceding time slot. Other methods and apparatuses are also described.

Type: Grant

Filed: March 19, 2003

Date of Patent: November 18, 2008

Assignee: Intel Corporation

Inventors: Ara Victor Nefian, Xiaoxing Liu, Xiaobo Pi, Luhong Liang, Yibao Zhao
Speech recognizing apparatus having optimal phoneme series comparing unit and speech recognizing method

Patent number: 7447634

Abstract: A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit obtains a likelihood that respective recognizing-unit standard patterns coincide with a time series of the amount of characteristics representing the characteristics of the input speech.

Type: Grant

Filed: June 11, 2007

Date of Patent: November 4, 2008

Assignee: Kabushiki Kaisha Toshiba

Inventor: Ryosuke Koshiba
Method and apparatus for formant tracking using a residual model

Patent number: 7424423

Abstract: A method of tracking formants defines a formant search space comprising sets of formants to be searched. Formants are identified for a first frame in the speech utterance by searching the entirety of the formant search space using the codebook, and for the remaining frames by searching the same space using both the codebook and the continuity constraint across adjacent frames. Under one embodiment, the formants are identified by mapping sets of formants into feature vectors and applying the feature vectors to a model. Formants are also identified by applying dynamic programming to search for the best sequence that optimally satisfies the continuity constraint required by the model.

Type: Grant

Filed: April 1, 2003

Date of Patent: September 9, 2008

Assignee: Microsoft Corporation

Inventors: Issam Bazzi, Li Deng, Alejandro Acero
Communication system noise cancellation power signal calculation techniques

Patent number: 7424424

Abstract: In order to enhance the quality of a communication signal derived from speech and noise, a filter divides the communication signal into a plurality of frequency band signals. A calculator generates a plurality of power band signals each having a power band value and corresponding to one of the frequency band signals. The power band values are based on estimating, over a time period, the power of one of the frequency band signals. The time period is different for different ones of the frequency band signals. The power band values are used to calculate weighting factors which are used to alter the frequency band signals that are combined to generate an improved communication signal.

Type: Grant

Filed: June 28, 2006

Date of Patent: September 9, 2008

Assignee: Tellabs Operations, Inc.

Inventors: Ravi Chandran, Bruce E. Dunne, Daniel J. Marchok
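The per-band power calculation in the abstract of patent 7424424 above can be sketched as follows. This is a hedged illustration rather than the patented calculation: the averaging windows, the noise estimates, the gain rule in `weighting_factors`, and all function names are assumptions. The abstract states only that the estimation period differs between bands and that the power values drive the weighting factors.

```python
def band_powers(band_signals, window_lengths):
    """Estimate one power value per band over a band-specific time period.

    band_signals:   list of sample lists, one per (already filtered) band
    window_lengths: number of most recent samples averaged for each band
                    (hypothetical values)
    """
    powers = []
    for samples, n in zip(band_signals, window_lengths):
        recent = samples[-n:]
        powers.append(sum(x * x for x in recent) / len(recent))
    return powers


def weighting_factors(powers, noise_powers, floor=0.1):
    """Assumed gain rule: attenuate bands whose power is close to the noise."""
    gains = []
    for p, n in zip(powers, noise_powers):
        gains.append(max(floor, 1.0 - n / p) if p > 0 else floor)
    return gains


def combine(band_signals, gains):
    """Scale each band by its gain and sum the bands sample by sample."""
    length = min(len(s) for s in band_signals)
    return [
        sum(g * s[i] for s, g in zip(band_signals, gains))
        for i in range(length)
    ]
```

A longer window gives a steadier estimate for slowly varying bands, while a shorter one tracks fast changes; which bands get which window is left open here.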
Speech recognizing apparatus with noise model adapting processing unit and speech recognizing method

Patent number: 7415408

Abstract: A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit compares the time series of the amount of characteristics with one recognizing standard pattern or with two or more combined recognizing standard patterns one-by-one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics. A rejection determining unit determines whether or not the input signal is noise by comparing the likelihood obtained by the recognizing target vocabulary comparing step with the likelihood obtained by the environment adaptive noise model comparing step.

Type: Grant

Filed: June 11, 2007

Date of Patent: August 19, 2008

Assignee: Kabushiki Kaisha Toshiba

Inventor: Ryosuke Koshiba
Language-enhanced programming tools

Patent number: 7412388

Abstract: Statements of a computer program expressed using a first source natural language are made meaningful to a programmer familiar with a second target natural language. The first source natural language of the computer program is determined from the programmer, or through analysis, and the second target natural language desired by the programmer is selected. Textual constructs may be parsed, with reference to stored coding conventions to determine meaningful lexical tokens. Such tokens are translated with a translation engine, and displayed to the programmer, desirably using a graphical user interface feature of an integrated development environment (IDE) for computer programming in a particular programming language.

Type: Grant

Filed: December 12, 2003

Date of Patent: August 12, 2008

Assignee: International Business Machines Corporation

Inventors: Baiju Dalal, Mohit Kalra
Speech recognizing apparatus with noise model adapting processing unit, speech recognizing method and computer-readable medium

Patent number: 7409341

Abstract: A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit compares the time series of the amount of characteristics with one recognizing standard pattern or with two or more combined recognizing standard patterns one-by-one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics. A rejection determining unit determines whether or not the input signal is noise by comparing the likelihood obtained by the recognizing target vocabulary comparing step with the likelihood obtained by the environment adaptive noise model comparing step.

Type: Grant

Filed: June 11, 2007

Date of Patent: August 5, 2008

Assignee: Kabushiki Kaisha Toshiba

Inventor: Ryosuke Koshiba
Encoding device, decoding device and audio data distribution system

Patent number: 7392176

Abstract: An audio data input unit of an encoding device splits an audio data string into contiguous samples of audio data, and a transforming unit transforms the split audio data into spectral data in a frequency domain. A data dividing unit divides the spectral data into a lower frequency band and a higher frequency band at 11.025 kHz (f1) as a boundary. The spectral data in the lower frequency band is quantized and encoded by a first quantizing unit and an encoding unit. A second quantizing unit generates sub information indicating a characteristic of the spectral data in the higher frequency band, and a second encoding unit encodes the sub information. A stream output unit integrates the codes obtained by the first and second encoding units and outputs the integrated one. Here, f1 is a half or less of a sampling frequency f2 at which the audio data string is created.

Type: Grant

Filed: November 1, 2002

Date of Patent: June 24, 2008

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Kosuke Nishio, Takeshi Norimatsu, Mineo Tsushima, Naoya Tanaka
LPC vector quantization apparatus

Patent number: 7392179

Abstract: The present invention carries out pre-selection on many LPC codevectors stored in an LSF codebook 101 using a weighted Euclidean distortion as a measure and carries out a full-code selection on the LPC codevectors left after the pre-selection using an amount of distortion in a spectral space as a measure. This makes it possible to improve the quantization performance of the LPC parameter vector quantizer and improve the quality of synthesized speech of the speech coder/decoder.

Type: Grant

Filed: November 29, 2001

Date of Patent: June 24, 2008

Assignees: Matsushita Electric Industrial Co., Ltd., Nippon Telegraph and Telephone Corporation

Inventors: Kazutoshi Yasunaga, Toshiyuki Morii, Hiroyuki Ehara, Kazunori Mano, Yusuke Hiwasaki
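The two-stage codebook search in the abstract of patent 7392179 above can be sketched as below. This is a minimal sketch under stated assumptions: `two_stage_search`, `keep`, and the pluggable `final_distortion` callable are hypothetical names, and the spectral-space distortion measure itself is not implemented here because the abstract does not specify it.

```python
import heapq


def weighted_euclidean(target, codevector, weights):
    """Weighted squared Euclidean distortion between two parameter vectors."""
    return sum(w * (t - c) ** 2 for t, c, w in zip(target, codevector, weights))


def two_stage_search(target, codebook, weights, final_distortion, keep=4):
    """Return the index of the selected codevector.

    Stage 1 (pre-selection): keep the `keep` codevectors with the smallest
    weighted Euclidean distortion, which is cheap to evaluate.
    Stage 2 (final selection): among the survivors, pick the one that
    minimises `final_distortion(target, codevector)`, a caller-supplied
    stand-in for a costlier spectral-domain measure.
    """
    shortlist = heapq.nsmallest(
        keep,
        range(len(codebook)),
        key=lambda i: weighted_euclidean(target, codebook[i], weights),
    )
    return min(shortlist, key=lambda i: final_distortion(target, codebook[i]))
```

The design point is that the expensive measure is evaluated only `keep` times instead of once per codebook entry.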
Schedule event context for speech recognition

Patent number: 7392183

Abstract: A processor-based system obtains information about an event from schedule data and uses the information to assist speech recognition of speech occurring during at least a portion of the event.

Type: Grant

Filed: December 27, 2002

Date of Patent: June 24, 2008

Assignee: Intel Corporation

Inventor: Michael E. Deisher
Speech coder and method

Patent number: 7386447

Abstract: An overflow problem of LSF quantization in G.729 Annex B speech encoding may lead to non-assignment of a codebook index. Preferred embodiments fix the problem with default or limited random variable assignments, or by flagging the overflow and adjusting the frame encoding, such as by limiting spectral components or changing quantization targets.

Type: Grant

Filed: November 4, 2002

Date of Patent: June 10, 2008

Assignee: Texas Instruments Incorporated

Inventors: Dunling Li, Gokhan Sisli, John T. Dowdal, Zoran Mladenovic
Method for determining a characteristic data record for a data signal

Patent number: 7383184

Abstract: In a method for determining a characteristic data set (“fingerprint”) for a sound signal, the sound signal itself is searched for characteristic locations, and these characteristic locations are used to produce the characteristic data set. For this, the frequency spectrum is evaluated over a time interval, subdivided into frequency bands, and averaged over each frequency band into a value. The fingerprint then consists of data obtained from these values after possible further averaging, wherein only data belonging to certain time segments is included.

Type: Grant

Filed: April 17, 2001

Date of Patent: June 3, 2008

Assignee: Creaholic SA

Inventor: Christoph Dworzak
Reducing acoustic noise in wireless and landline based telephony

Patent number: 7369990

Abstract: Acoustic noise for wireless or landline telephony is reduced through optimal filtering in which each frequency band of every time frame is filtered as a function of the estimated signal-to-noise ratio and the estimated total noise energy for the frame. Non-speech bands, non-speech frames and other special frames are further attenuated by one or more predetermined multiplier values. Noise in a transmitted signal formed of frames each formed of frequency bands is reduced. A respective total signal energy and a respective current estimate of the noise energy for at least one of the frequency bands is determined. A respective local signal-to-noise ratio for at least one of the frequency bands is determined as a function of the respective signal energy and the respective current estimate of the noise energy. A respective smoothed signal-to-noise ratio is determined from the respective local signal-to-noise ratio and another respective signal-to-noise ratio estimated for a previous frame.

Type: Grant

Filed: June 5, 2006

Date of Patent: May 6, 2008

Assignee: Nortel Networks Limited

Inventor: Elias J. Nemer
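The signal-to-noise smoothing in the abstract of patent 7369990 above can be sketched for a single frequency band as follows. This is an illustration under assumptions, not the patented filter: the smoothing constant `alpha`, the subtraction used in `local_snr`, and the Wiener-style rule in `band_gain` are all hypothetical choices.

```python
def local_snr(signal_energy, noise_energy):
    """Local SNR from the band's total energy and current noise estimate.

    Assumes the total energy contains speech plus noise, so the speech
    energy is approximated by subtracting the noise estimate.
    """
    return max(signal_energy - noise_energy, 0.0) / noise_energy


def smoothed_snr(local, previous, alpha=0.5):
    """Blend the local SNR with the SNR estimated for the previous frame."""
    return alpha * previous + (1.0 - alpha) * local


def band_gain(snr):
    """Assumed Wiener-style gain: near 0 for noisy bands, near 1 for clean."""
    return snr / (1.0 + snr)


def process_band(energies, noise_energy, alpha=0.5):
    """Return one gain per frame for a single band."""
    gains = []
    previous = 0.0
    for energy in energies:
        snr = smoothed_snr(local_snr(energy, noise_energy), previous, alpha)
        gains.append(band_gain(snr))
        previous = snr
    return gains
```

Smoothing across frames keeps the gain from jumping between frames. The extra attenuation of non-speech bands and frames mentioned in the abstract is not modelled here.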
Method for improving speech quality in speech transmission tasks

Patent number: 7318025

Abstract: A method for calculating the amplification factor, which co-determines the volume, for a speech signal transmitted in encoded form includes dividing the speech signal into short temporal signal segments. The individual signal segments are encoded and transmitted separately from each other, and the amplification factor for each signal segment is calculated, transmitted and used by the decoder to reconstruct the signal. The amplification factor is determined by minimizing the value E(g_opt2)=(1-a)*f1(g_opt2)+a*f2(g_opt2), the weighting factor a being determined taking into account both the periodicity and the stationarity of the encoded speech signal.

Type: Grant

Filed: March 8, 2001

Date of Patent: January 8, 2008

Assignee: Deutsche Telekom AG

Inventors: Alexander Kyrill Fischer, Christoph Erdmann
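The minimization in the abstract of patent 7318025 above, where the weighted criterion E(g) = (1-a)*f1(g) + a*f2(g) is minimized over the gain g, can be sketched with a simple grid search. The abstract does not define f1 and f2, so they are passed in as callables here; `optimal_gain`, `g_max`, and `steps` are hypothetical names, and the grid search is a stand-in for whatever solver an implementation would use.

```python
def optimal_gain(f1, f2, a, g_max=4.0, steps=4000):
    """Return the gain g in [0, g_max] minimising (1 - a)*f1(g) + a*f2(g).

    f1, f2: error criteria as functions of the gain (assumed by the caller)
    a:      weighting factor between 0 and 1
    """
    best_g, best_e = 0.0, float("inf")
    for k in range(steps + 1):
        g = g_max * k / steps
        e = (1.0 - a) * f1(g) + a * f2(g)
        if e < best_e:
            best_g, best_e = g, e
    return best_g
```

With a = 0 the result is the minimizer of f1 alone, and with a = 1 that of f2 alone; intermediate values of a trade the two criteria off, which is how the periodicity and stationarity of the signal influence the chosen gain.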
Apparatus for performing speaker identification and speaker searching in speech or sound image data, and method thereof

Patent number: 7315819

Abstract: A process of identifying a speaker in coded speech data and a process of searching for the speaker are efficiently performed with fewer computations and with a smaller storage capacity. In an information search apparatus, an LSP decoding section extracts and decodes only LSP information from coded speech data which is read for each block. An LPC conversion section converts the LSP information into LPC information. A Cepstrum conversion section converts the obtained LPC information into an LPC Cepstrum which represents features of speech. A vector quantization section performs vector quantization on the LPC Cepstrum. A speaker identification section identifies a speaker on the basis of the result of the vector quantization. Furthermore, the identified speaker is compared with a search condition in a condition comparison section, and based on the result, the search result is output.

Type: Grant

Filed: July 23, 2002

Date of Patent: January 1, 2008

Assignee: Sony Corporation

Inventors: Yasuhiro Toguri, Masayuki Nishiguchi
Method and system for embedding and extracting data from encoded voice code

Patent number: 7310596

Abstract: When a voice encoding apparatus embeds any data in encoded voice code, the apparatus determines whether a data embedding condition is satisfied using a first element code from among element codes constituting the encoded voice code, and a threshold value. If the data embedding condition is satisfied, the apparatus embeds optional data in the encoded voice code by replacing a second element code with the optional data. When a voice decoding apparatus extracts data that has been embedded in encoded voice code, the apparatus determines whether the data embedding condition is satisfied using a first element code from among element codes constituting the encoded voice code, and a threshold value. If the data embedding condition is satisfied, the apparatus determines that optional data has been embedded in the second element code portion of the encoded voice code and extracts this embedded data.

Type: Grant

Filed: February 3, 2003

Date of Patent: December 18, 2007

Assignee: Fujitsu Limited

Inventors: Yasuji Ota, Masanao Suzuki, Yoshiteru Tsuchinaga, Masakiyo Tanaka, Shigeru Sasaki
Three-stage individual word recognition

Patent number: 7299179

Abstract: In a three-stage speech recognition process, a phoneme sequence is first assigned to a speech unit, then those vocabulary entries which are most similar to the phoneme sequence are sought in a selection vocabulary, and finally the speech unit is recognized using a speech unit recognizer which uses, as its vocabulary, the selected vocabulary entries which are most like the phoneme sequence.

Type: Grant

Filed: January 19, 2004

Date of Patent: November 20, 2007

Assignee: Siemens Aktiengesellschaft

Inventors: Hans-Ulrich Block, Stefanie Schachtl
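The three stages in the abstract of patent 7299179 above can be sketched as follows. This is a hedged outline, not the patented recognizer: edit distance is an assumed similarity measure, and `phoneme_recognizer` and `unit_recognizer` are hypothetical caller-supplied callables standing in for real acoustic models.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def select_entries(phonemes, vocabulary, n):
    """Stage 2: the n vocabulary entries closest to the phoneme sequence.

    vocabulary: dict mapping each entry to its phoneme sequence
    """
    ranked = sorted(vocabulary,
                    key=lambda w: edit_distance(phonemes, vocabulary[w]))
    return ranked[:n]


def recognize(speech_unit, phoneme_recognizer, vocabulary, unit_recognizer, n=3):
    """Run all three stages and return the recognized entry."""
    phonemes = phoneme_recognizer(speech_unit)            # stage 1
    shortlist = select_entries(phonemes, vocabulary, n)   # stage 2
    return unit_recognizer(speech_unit, shortlist)        # stage 3
```

The final recognizer only has to discriminate among `n` candidates, so a large selection vocabulary does not enlarge the search space of the last stage.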
Signal processing utilizing a tree-structured array

Patent number: RE40281

Abstract: A communication system for sending a sequence of symbols on a communication link. The system includes a transmitter for placing information indicative of the sequence of symbols on the communication link and a receiver for receiving the information placed on the communication link by the transmitter. The transmitter includes a clock for defining successive frames, each of the frames including M time intervals, where M is an integer greater than 1. A modulator modulates each of M carrier signals with a signal related to the value of one of the symbols thereby generating a modulated carrier signal corresponding to each of the carrier signals. The modulated carriers are combined into a sum signal which is transmitted on the communication link. The carrier signals include first and second carriers, the first carrier having a different bandwidth than the second carrier.

Type: Grant

Filed: November 23, 2004

Date of Patent: April 29, 2008

Assignee: Aware, Inc.

Inventors: Michael A. Tzannes, Peter N. Heller, John P. Stautner, William R. Morrell, Sriram Jayasimha
