Patents Examined by Matthew J. Sked
  • Patent number: 7769587
    Abstract: An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string to a prestored audio file and to recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings linguistically, phonetically, or as a combination of both, and also allows logic functions to be specified by indicating how far separated specific phonemes are in time.
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: August 3, 2010
    Assignee: Georgia Tech Research Corporation
    Inventors: Peter S. Cardillo, Mark A. Clements, William E. Price
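The time-separation constraint described in this abstract can be sketched as a toy search over a phoneme track; the function and data names below are illustrative, not taken from the patent.

```python
# Hypothetical sketch: phonetic search where a query constrains how far
# apart two phonemes may occur in time (all names are illustrative).

def find_matches(phoneme_track, first, second, max_gap):
    """Return (t1, t2) pairs where `second` follows `first` within max_gap seconds.

    phoneme_track: list of (phoneme, time) tuples from a prestored audio file.
    """
    matches = []
    for p1, t1 in phoneme_track:
        if p1 != first:
            continue
        for p2, t2 in phoneme_track:
            if p2 == second and 0 < t2 - t1 <= max_gap:
                matches.append((t1, t2))
    return matches

track = [("K", 0.10), ("AE", 0.18), ("T", 0.31), ("K", 1.50), ("T", 2.90)]
print(find_matches(track, "K", "T", 0.5))  # → [(0.1, 0.31)]
```

A real implementation would index the phoneme lattice rather than scanning pairwise, but the gap test is the core of the "how far separated in time" logic.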
  • Patent number: 7761301
    Abstract: A prosodic control rule generation method includes dividing an input text into language units, estimating a punctuation mark incidence at a boundary between language units in the input text, the punctuation mark incidence indicating the degree to which a punctuation mark occurs at the boundary, based on attribute information items of a plurality of language units adjacent to the boundary, and generating a prosodic control rule for speech synthesis including a condition for the punctuation mark incidence based on a plurality of learning data items, each concerning prosody and including the punctuation mark incidence.
    Type: Grant
    Filed: October 20, 2006
    Date of Patent: July 20, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Dawei Xu
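The punctuation-incidence estimate can be illustrated with a minimal frequency count over attribute pairs of adjacent language units; this is a sketch of the general idea, not Toshiba's actual estimator.

```python
from collections import defaultdict

# Illustrative sketch: estimate punctuation-mark incidence at a boundary
# from the attributes (here, parts of speech) of the adjacent language units.

def train_incidence(corpus):
    """corpus: list of (left_attr, right_attr, has_punct) observations."""
    counts = defaultdict(lambda: [0, 0])  # key -> [punct count, total count]
    for left, right, has_punct in corpus:
        entry = counts[(left, right)]
        entry[1] += 1
        if has_punct:
            entry[0] += 1
    return {k: p / n for k, (p, n) in counts.items()}

data = [("noun", "conj", True), ("noun", "conj", True),
        ("noun", "conj", False), ("verb", "noun", False)]
incidence = train_incidence(data)
print(incidence[("noun", "conj")])  # → 0.666...
```

A prosodic control rule could then insert a pause at boundaries whose incidence exceeds some threshold.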
  • Patent number: 7756708
    Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise the probabilities of a portion of a sound occurring based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Grant
    Filed: April 3, 2006
    Date of Patent: July 13, 2010
    Assignee: Google Inc.
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
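Revising word probabilities from recent search queries can be sketched as a simple interpolation between a baseline unigram model and query-derived frequencies; the interpolation scheme here is an assumption, not Google's actual method.

```python
from collections import Counter

# Minimal sketch: blend baseline word probabilities with frequencies
# observed in recent search queries (weight is an assumed parameter).

def update_model(baseline, queries, weight=0.3):
    counts = Counter(w for q in queries for w in q.split())
    total = sum(counts.values())
    return {w: (1 - weight) * p + weight * counts[w] / total
            for w, p in baseline.items()}

baseline = {"pizza": 0.5, "weather": 0.5}
queries = ["weather today", "weather radar", "weather"]
model = update_model(baseline, queries)
print(model["weather"])  # rises above its baseline of 0.5
```

Words trending in queries gain probability mass, which is the point of mining "recent language usage" for recognition.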
  • Patent number: 7752031
    Abstract: Multiple speaker cadence is managed for translated conversations by separating a multi-speaker audio stream into single-speaker audio tracks containing one or more first language audio snippets organized according to a timing relationship as related in the multi-speaker audio stream, generating a pause relationship model by determining time relationships between the single-speaker snippets and assigning pause marker values denoting each beginning and each ending of each mutual silence pause, collecting a translated language audio track corresponding to each single-speaker track, generating pause relationship controls according to the pause relationship model, and producing a translated multi-speaker audio output including the translated tracks in which the translated snippets are related in time according to the pause relationship controls.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: July 6, 2010
    Assignee: International Business Machines Corporation
    Inventors: Rhonda L. Childress, Stewart Jason Hyman, David Bruce Kumhyr, Stephen James Watt
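The pause relationship model can be sketched as a pass over time-sorted snippets that records each mutual-silence gap; the data layout is assumed for illustration.

```python
# Illustrative sketch: derive pause markers (start, end of each mutual
# silence) from single-speaker snippets so translated audio can reuse
# the original conversation's timing.

def pause_model(snippets):
    """snippets: list of (speaker, start, end), sorted by start time.
    Returns (pause_start, pause_end) markers for each mutual silence."""
    pauses = []
    for (_, _, end), (_, start, _) in zip(snippets, snippets[1:]):
        if start > end:  # a gap where nobody is speaking
            pauses.append((end, start))
    return pauses

snips = [("A", 0.0, 2.0), ("B", 2.5, 4.0), ("A", 4.0, 6.0)]
print(pause_model(snips))  # → [(2.0, 2.5)]
```

The translated snippets would then be laid out so the same pauses separate them, preserving the conversation's cadence.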
  • Patent number: 7752038
    Abstract: Autocorrelation values are determined as a basis for an estimation of a pitch lag in a segment of an audio signal. A first considered delay range for the autocorrelation computations is divided into a first set of sections, and first autocorrelation values are determined for delays in a plurality of sections of this first set of sections. A second considered delay range for the autocorrelation computations is divided into a second set of sections such that sections of the first set and sections of the second set are overlapping. Second autocorrelation values are determined for delays in a plurality of sections of this second set of sections.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: July 6, 2010
    Assignee: Nokia Corporation
    Inventors: Lasse Laaksonen, Anssi Ramo, Adriana Vasilache
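The autocorrelation-based pitch-lag search can be sketched over explicit delay sections; the division into overlapping sections below is a simplified stand-in for Nokia's scheme.

```python
# Sketch: autocorrelation values over candidate delay sections (possibly
# overlapping), with the best-correlating lag taken as the pitch estimate.

def autocorr(signal, lag):
    return sum(signal[i] * signal[i - lag] for i in range(lag, len(signal)))

def best_lag(signal, sections):
    """sections: list of (lo, hi) inclusive delay ranges, possibly overlapping."""
    best = None
    for lo, hi in sections:
        for lag in range(lo, hi + 1):
            r = autocorr(signal, lag)
            if best is None or r > best[1]:
                best = (lag, r)
    return best[0]

# A signal with period 4 should yield a pitch lag of 4.
sig = [1.0, 0.0, -1.0, 0.0] * 8
print(best_lag(sig, [(2, 5), (4, 8)]))  # → 4
```

Splitting the delay range into sections lets an implementation reuse partial sums and prune cheaply; this sketch only shows the search itself.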
  • Patent number: 7747429
    Abstract: A method of generating a caption abstract, including: generating a target text from a predetermined caption, analyzing a morpheme of a word included in the target text, and analyzing a grammatical structure of the target text by referring to the morpheme; extracting and removing low content words from the target text by using the morpheme or information on the grammatical structure and determining a main predicate; extracting a major sentence component with respect to the main predicate by referring to the information on the grammatical structure, as a candidate abstract word; substituting a relevant word for a complex noun phrase or a predicate phrase from the candidate abstract words by referring to a predetermined database; and generating an abstract by rearranging the candidate abstract words according to a predetermined rule.
    Type: Grant
    Filed: October 30, 2006
    Date of Patent: June 29, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jeong Mi Cho, Hyoung Gook Kim
  • Patent number: 7739113
    Abstract: A voice synthesizer includes a recorded voice storage portion (124) that stores recorded voices that are pre-recorded; a voice input portion (110) that is input with a reading voice reading out a text that is to be generated by the synthesized voice; an attribute information input portion (112) that is input with a label string, which is a string of labels assigned to each phoneme included in the reading voice, and label information, which indicates the border position of each phoneme corresponding to each label; a parameter extraction portion (116) that extracts characteristic parameters of the reading voice based on the label string, the label information, and the reading voice; and a voice synthesis portion (122) that selects the recorded voices from the recorded voice storage portion in accordance with the characteristic parameters, synthesizes the recorded voices, and generates the synthesized voice that reads out the text.
    Type: Grant
    Filed: November 9, 2006
    Date of Patent: June 15, 2010
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Tsutomu Kaneyasu
  • Patent number: 7729899
    Abstract: An automated system and method is provided for debugging training data used to train an automated language identifier. The system and method collects texts written in a particular language, generates an occurrence count for words in each text by counting the number of times each of the words is found within the text, and generates an occurrence ratio (OR) of each of the words by dividing the occurrence count by the total number of words in each text. Words whose occurrence ratios are substantially higher than their occurrence ratios in at least one of the other texts are then filtered from the text to generate a clean text.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: June 1, 2010
    Assignee: Basis Technology Corporation
    Inventor: Nobuo Otsuka
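The occurrence-ratio filter can be sketched directly; the fixed threshold factor and the fallback ratio for unseen words are assumptions for illustration.

```python
from collections import Counter

# Minimal sketch of occurrence-ratio filtering: drop words whose ratio in
# one text far exceeds their ratio in a reference text (threshold assumed).

def occurrence_ratios(text):
    words = text.split()
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}

def clean(text, reference, factor=4.0):
    ours, theirs = occurrence_ratios(text), occurrence_ratios(reference)
    fallback = 1 / len(reference.split())  # assumed smoothing for unseen words
    kept = [w for w in text.split()
            if ours[w] <= factor * theirs.get(w, fallback)]
    return " ".join(kept)

noisy = "bonjour bonjour bonjour hello world"
ref = "hello world hello again world today bonjour maybe"
print(clean(noisy, ref))  # → hello world
```

Here "bonjour" is heavily over-represented in the noisy text relative to the reference, so it is filtered out as probable contamination.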
  • Patent number: 7729901
    Abstract: A system is disclosed for determining probable meanings of a word. A prior probability is established of the probable meanings of the word. A context frequency probability is established of the probable meanings of the word. The probability that each meaning is a correct meaning may be provided in accordance with both the prior probability and the context frequency probability.
    Type: Grant
    Filed: December 13, 2005
    Date of Patent: June 1, 2010
    Assignee: Yahoo! Inc.
    Inventors: David Richardson-Bunbury, Soren Riise, Devesh Patel, Eugene H. Stipp, Paul J. Grealish
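Combining a prior with a context frequency probability can be sketched as a naive multiplicative combination followed by normalization; the combination rule is an assumption, not necessarily the patented one.

```python
# Sketch: score each candidate meaning by prior * context probability,
# then normalize so the scores form a probability distribution.

def disambiguate(priors, context_probs):
    """priors / context_probs: dicts mapping sense -> probability."""
    scores = {s: priors[s] * context_probs.get(s, 0.0) for s in priors}
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()}

# "jaguar": the animal sense is more common a priori, but a context with
# "engine" and "dealer" strongly favors the car sense.
priors = {"animal": 0.7, "car": 0.3}
context = {"animal": 0.1, "car": 0.9}
result = disambiguate(priors, context)
print(max(result, key=result.get))  # → car
```

The example shows how strong contextual evidence can override a prior favoring the more common meaning.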
  • Patent number: 7725316
    Abstract: A speech recognition adaptation method for a vehicle having a telematics unit with an embedded speech recognition system. Speech is received and pre-processed to generate acoustic feature vectors, and an adaptation parameter is applied to the acoustic feature vectors to yield transformed acoustic feature vectors. The transformed acoustic feature vectors are decoded and a hypothesis of the speech is selected, and the adaptation parameter is trained using acoustic feature vectors from the hypothesis. The method also includes one or more of the following steps: the speech is observed for a certain characteristic and the trained adaptation parameter is saved in accordance with the certain characteristic for use in transforming feature vectors of subsequent speech having the certain characteristic; use of the trained adaptation parameter persists from one vehicle ignition cycle to the next; and use of the trained adaptation parameter is ceased upon detection of a system fault.
    Type: Grant
    Filed: July 5, 2006
    Date of Patent: May 25, 2010
    Assignee: General Motors LLC
    Inventors: Rathinavelu Chengalvarayan, John J Correia, Scott M Pennock
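Applying and training an adaptation parameter on acoustic feature vectors can be sketched with a simple additive bias; the bias form and update rule are assumptions for illustration, not GM's actual adaptation scheme.

```python
# Illustrative sketch: apply a bias adaptation parameter to feature
# vectors, then nudge the bias using features from the decoded hypothesis.

def transform(features, bias):
    return [[x + b for x, b in zip(vec, bias)] for vec in features]

def train_bias(bias, hypothesis_features, rate=0.1):
    """Move the bias a small step toward the mean of the hypothesis features."""
    dims = len(bias)
    mean = [sum(vec[d] for vec in hypothesis_features) / len(hypothesis_features)
            for d in range(dims)]
    return [b + rate * (m - b) for b, m in zip(bias, mean)]

feats = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -0.5]
print(transform(feats, bias))  # → [[1.5, 1.5], [3.5, 3.5]]
print(train_bias(bias, feats))  # bias drifts toward the feature mean
```

Persisting the trained bias across ignition cycles, as the abstract describes, would just mean saving and reloading this parameter vector.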
  • Patent number: 7720681
    Abstract: Generally described, the present invention is directed toward generating, maintaining, updating, and applying digital voice profiles. Voice profiles may be generated for individuals. The voice profiles include information that is unique to each individual and which may be applied to digital representations of that individual's voice to improve the quality of a transmitted digital representation of that individual's voice. A voice profile may include, but is not limited to, basic information about the individual, and filter definitions relating to the individual's voice patterns, such as a frequency range and amplitude range. The voice profile may also include a speech definition that includes digital representations of the individual's unique speech patterns.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: May 18, 2010
    Assignee: Microsoft Corporation
    Inventors: David Milstein, Kuansan Wang, Linda Criddle
  • Patent number: 7716049
    Abstract: An apparatus for providing adaptive language model scaling includes an adaptive scaling element and an interface element. The adaptive scaling element is configured to receive input speech comprising a sequence of spoken words and to determine a plurality of candidate sequences of text words in which each of the candidate sequences has a corresponding sentence score representing a probability that a candidate sequence matches the sequence of spoken words. Each corresponding sentence score is calculated using an adaptive scaling factor. The interface element is configured to receive a user input selecting one of the candidate sequences. The adaptive scaling element is further configured to estimate an objective function based on the user input and to modify the adaptive scaling factor based on the estimated objective function.
    Type: Grant
    Filed: June 30, 2006
    Date of Patent: May 11, 2010
    Assignee: Nokia Corporation
    Inventor: Jilei Tian
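The role of the adaptive scaling factor in the sentence score can be sketched with log-probabilities; the additive log-domain form below is a common convention and an assumption here, not necessarily Nokia's exact formula.

```python
# Sketch: rank candidate sentences by acoustic score plus a scaled
# language-model score; the scaling factor controls the LM's influence.

def sentence_score(acoustic_logp, lm_logp, scale):
    return acoustic_logp + scale * lm_logp

def rank(candidates, scale):
    """candidates: list of (text, acoustic_logp, lm_logp) tuples."""
    return sorted(candidates,
                  key=lambda c: sentence_score(c[1], c[2], scale),
                  reverse=True)

cands = [("recognize speech", -10.0, -2.0),
         ("wreck a nice beach", -9.0, -6.0)]
# A larger scaling factor gives the language model more say.
print(rank(cands, scale=0.1)[0][0])  # → wreck a nice beach
print(rank(cands, scale=1.0)[0][0])  # → recognize speech
```

Adapting the factor from user corrections, as the abstract describes, amounts to adjusting `scale` so that the user-selected candidate would have ranked first.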
  • Patent number: 7716051
    Abstract: A distributed voice recognition system (500) and method employs principles of bottom-up (i.e., raw input) and top-down (i.e., prediction based on past experience) processing to perform client-side and server-side processing by (i) at the client-side, replacing application data by a phonotactic table (504); (ii) at the server-side, tracking separate confidence scores for matches against an acoustic model and comparison to a grammar; and (iii) at the server-side using a contention resolver (514) to weight the client-side and server-side results to establish a single output which represents the collaboration between client-side processing and server-side processing.
    Type: Grant
    Filed: June 28, 2006
    Date of Patent: May 11, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Barry Neil Dow, Stephen Graham Lawrence, John Brian Pickering
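The contention-resolver step can be sketched as a weighted comparison of client-side and server-side hypotheses; the weighting rule is an assumption for illustration.

```python
# Sketch of a contention resolver: weight client and server recognition
# results to produce the single collaborative output the abstract describes.

def resolve(client_result, server_result, client_weight=0.4):
    """Each result is a (hypothesis, confidence in [0, 1]) pair."""
    c_text, c_conf = client_result
    s_text, s_conf = server_result
    if c_text == s_text:
        return c_text  # both sides agree; no contention to resolve
    server_weight = 1.0 - client_weight
    return c_text if client_weight * c_conf > server_weight * s_conf else s_text

print(resolve(("call home", 0.9), ("call Rome", 0.8)))  # → call Rome
```

With these assumed weights the server-side hypothesis wins despite the client's higher raw confidence, reflecting the server's richer models.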
  • Patent number: 7716040
    Abstract: Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself.
    Type: Grant
    Filed: June 21, 2007
    Date of Patent: May 11, 2010
    Assignee: Multimodal Technologies, Inc.
    Inventors: Detlef Koll, Michael Finke
  • Patent number: 7707035
    Abstract: A sound processing system including a user headset for use in tactical military operations provides integrated sound and speech analysis including sound filtering and amplification, sound analysis and speech recognition for analyzing speech and non-speech sounds and taking programmed actions based on the analysis, recognizing language of speech for purposes of one-way and two-way voice translation, word spotting to detect and identify elements of conversation, and non-speech recognition and identification. The headset includes housings with form factors for insulating a user's ear from direct exposure to ambient sounds, with at least one microphone for receiving sound around the user and a microphone for receiving user speech. The user headset can further include interconnections for connecting the headset with systems outside of the headset, including target designation systems, communication networks, and radio transmitters.
    Type: Grant
    Filed: October 13, 2005
    Date of Patent: April 27, 2010
    Assignee: Integrated Wave Technologies, Inc.
    Inventor: Timothy S. McCune
  • Patent number: 7702510
    Abstract: Systems and methods for dynamically selecting among text-to-speech (TTS) systems. Exemplary embodiments of the systems and methods include identifying text for converting into a speech waveform, synthesizing said text by three TTS systems, generating a candidate waveform from each of the three systems, generating a score from each of the three systems, comparing the three scores, selecting a score based on a criterion, and selecting one of the three waveforms based on the selected score.
    Type: Grant
    Filed: January 12, 2007
    Date of Patent: April 20, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Ellen M. Eide, Raul Fernandez, Wael M. Hamza, Michael A. Picheny
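The selection flow can be sketched with stand-in TTS systems; the callables and their scores below are illustrative, not real engines, and "highest score wins" is an assumed criterion.

```python
# Minimal sketch of dynamic TTS selection: synthesize with each system,
# compare scores, and return the waveform from the best-scoring system.

def select_waveform(text, systems):
    """systems: list of callables returning (waveform, score) for the text."""
    results = [system(text) for system in systems]
    best_waveform, best_score = max(results, key=lambda r: r[1])
    return best_waveform

# Three stand-in TTS systems (waveforms and scores are illustrative).
tts_a = lambda t: (f"wave_a({t})", 0.72)
tts_b = lambda t: (f"wave_b({t})", 0.91)
tts_c = lambda t: (f"wave_c({t})", 0.64)
print(select_waveform("hello", [tts_a, tts_b, tts_c]))  # → wave_b(hello)
```

Because selection happens per utterance, different sentences can be served by different engines depending on which scores best.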
  • Patent number: 7693712
    Abstract: A pre-processing system for a signal of interest in an automatic speech recognition system in a vehicle includes an acoustic sensor to sense the signal of interest, a non-acoustic sensor to sense a non-acoustic noise signal, and a pre-processing unit for the signal of interest, comprising a processing section for coherent frequency band signals for suppressing the noise from the received signal, a processing section for non-coherent frequency band signals, comprising a transfer function estimation device for a signal through the vehicle cabin, and a method selection section for determining the coherence properties of the received signal and for selecting the processing section for coherent frequency band signals or the processing section for non-coherent frequency band signals depending on the properties of the received signal.
    Type: Grant
    Filed: March 27, 2006
    Date of Patent: April 6, 2010
    Assignee: Aisin Seiki Kabushiki Kaisha
    Inventors: Michel Gaeta, Abderrahman Essebbar
  • Patent number: 7693713
    Abstract: Speech models are trained using one or more of three different training systems. They include competitive training which reduces a distance between a recognized result and a true result, data boosting which divides and weights training data, and asymmetric training which trains different model components differently.
    Type: Grant
    Filed: June 17, 2005
    Date of Patent: April 6, 2010
    Assignee: Microsoft Corporation
    Inventors: Xiaodong He, Jian Wu
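The "data boosting" component, which divides and weights training data, can be sketched with a simple reweighting that emphasizes examples the current model misclassifies; the weighting scheme is an assumption for illustration.

```python
# Sketch of data boosting: up-weight training examples the current model
# still gets wrong, then normalize the weights (factor is assumed).

def boost_weights(examples, model, factor=2.0):
    """examples: list of (features, label); model: callable features -> label."""
    weights = []
    for features, label in examples:
        wrong = model(features) != label
        weights.append(factor if wrong else 1.0)
    total = sum(weights)
    return [w / total for w in weights]

model = lambda x: "yes" if x > 0 else "no"
data = [(1, "yes"), (-1, "no"), (2, "no")]  # the last example is misclassified
print(boost_weights(data, model))  # → [0.25, 0.25, 0.5]
```

Retraining on the reweighted data makes the model spend more effort on its current mistakes, which is the intuition behind boosting-style training.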
  • Patent number: 7689407
    Abstract: A method of learning a second language through the guidance of pictures, enabling users to learn multiple languages through computers. After users input a plurality of words, a picture/text interface displays the plurality of input words, a plurality of output words, and a plurality of pictures. The plurality of output words represent the plurality of input words in another language, and the plurality of pictures represent the plurality of input words.
    Type: Grant
    Filed: February 7, 2007
    Date of Patent: March 30, 2010
    Inventors: Kuo-Ping Yang, Chao-Jen Huang, Chien-Liang Chiang, Kun-Yi Hua, Chih-Long Chang, Ming-Hsiang Cheng, Yen-Jui Chiao
  • Patent number: 7689424
    Abstract: This invention relates to a distributed speech recognition method comprising at least one user terminal and at least one server which can communicate with each other by means of a telecommunication network. The inventive method comprises the following steps: at the user terminal, attempting to associate a saved form with the signal to be recognized and, independently of said step, transmitting to the server a signal indicating the signal to be recognized; and, at the server, attempting to associate a saved form with the received signal.
    Type: Grant
    Filed: March 8, 2004
    Date of Patent: March 30, 2010
    Assignee: France Telecom
    Inventors: Jean Monne, Jean-Pierre Petit, Patrick Brisard