Patents Examined by Matthew J. Sked
  • Patent number: 7769587
    Abstract: An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string to a prestored audio file and to recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings linguistically, phonetically, or as a combination of both, and also allows logic functions to be specified by indicating how far separated specific phonemes are in time.
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: August 3, 2010
    Assignee: Georgia Tech Research Corporation
    Inventors: Peter S. Cardillo, Mark A. Clements, William E. Price
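The time-separation constraint described in this abstract can be sketched as a toy search over a phoneme track; the function and data names below are illustrative, not taken from the patent.

```python
# Hypothetical sketch: phonetic search where a query constrains how far
# apart two phonemes may occur in time (all names are illustrative).

def find_matches(phoneme_track, first, second, max_gap):
    """Return (t1, t2) pairs where `second` follows `first` within max_gap seconds.

    phoneme_track: list of (phoneme, time) tuples from a prestored audio file.
    """
    matches = []
    for p1, t1 in phoneme_track:
        if p1 != first:
            continue
        for p2, t2 in phoneme_track:
            if p2 == second and 0 < t2 - t1 <= max_gap:
                matches.append((t1, t2))
    return matches

track = [("K", 0.10), ("AE", 0.18), ("T", 0.31), ("K", 1.50), ("T", 2.90)]
print(find_matches(track, "K", "T", 0.5))  # → [(0.1, 0.31)]
```

A real implementation would index the phoneme lattice rather than scanning pairwise, but the gap test is the core of the "how far separated in time" logic.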
  • Patent number: 7761301
    Abstract: A prosodic control rule generation method includes dividing an input text into language units, estimating a punctuation mark incidence at a boundary between language units in the input text, the punctuation mark incidence indicating the degree to which a punctuation mark occurs at the boundary, based on attribute information items of a plurality of language units adjacent to the boundary, and generating a prosodic control rule for speech synthesis including a condition for the punctuation mark incidence based on a plurality of learning data items, each concerning prosody and including the punctuation mark incidence.
    Type: Grant
    Filed: October 20, 2006
    Date of Patent: July 20, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Dawei Xu
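The punctuation-incidence estimate can be illustrated with a minimal frequency count over attribute pairs of adjacent language units; this is a sketch of the general idea, not Toshiba's actual estimator.

```python
from collections import defaultdict

# Illustrative sketch: estimate punctuation-mark incidence at a boundary
# from the attributes (here, parts of speech) of the adjacent language units.

def train_incidence(corpus):
    """corpus: list of (left_attr, right_attr, has_punct) observations."""
    counts = defaultdict(lambda: [0, 0])  # key -> [punct count, total count]
    for left, right, has_punct in corpus:
        entry = counts[(left, right)]
        entry[1] += 1
        if has_punct:
            entry[0] += 1
    return {k: p / n for k, (p, n) in counts.items()}

data = [("noun", "conj", True), ("noun", "conj", True),
        ("noun", "conj", False), ("verb", "noun", False)]
incidence = train_incidence(data)
print(incidence[("noun", "conj")])  # → 0.666...
```

A prosodic control rule could then insert a pause at boundaries whose incidence exceeds some threshold.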
  • Patent number: 7756708
    Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise the probabilities of a portion of a sound occurring based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Grant
    Filed: April 3, 2006
    Date of Patent: July 13, 2010
    Assignee: Google Inc.
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
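Revising word probabilities from recent search queries can be sketched as a simple interpolation between a baseline unigram model and query-derived frequencies; the interpolation scheme here is an assumption, not Google's actual method.

```python
from collections import Counter

# Minimal sketch: blend baseline word probabilities with frequencies
# observed in recent search queries (weight is an assumed parameter).

def update_model(baseline, queries, weight=0.3):
    counts = Counter(w for q in queries for w in q.split())
    total = sum(counts.values())
    return {w: (1 - weight) * p + weight * counts[w] / total
            for w, p in baseline.items()}

baseline = {"pizza": 0.5, "weather": 0.5}
queries = ["weather today", "weather radar", "weather"]
model = update_model(baseline, queries)
print(model["weather"])  # rises above its baseline of 0.5
```

Words trending in queries gain probability mass, which is the point of mining "recent language usage" for recognition.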
  • Patent number: 7752031
    Abstract: Multiple speaker cadence is managed for translated conversations by separating a multi-speaker audio stream into single-speaker audio tracks containing one or more first language audio snippets organized according to a timing relationship as related in the multi-speaker audio stream, generating a pause relationship model by determining time relationships between the single-speaker snippets and assigning pause marker values denoting each beginning and each ending of each mutual silence pause, collecting a translated language audio track corresponding to each single-speaker track, generating pause relationship controls according to the pause relationship model, and producing a translated multi-speaker audio output including the translated tracks in which the translated snippets are related in time according to the pause relationship controls.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: July 6, 2010
    Assignee: International Business Machines Corporation
    Inventors: Rhonda L. Childress, Stewart Jason Hyman, David Bruce Kumhyr, Stephen James Watt
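The pause relationship model can be sketched as a pass over time-sorted snippets that records each mutual-silence gap; the data layout is assumed for illustration.

```python
# Illustrative sketch: derive pause markers (start, end of each mutual
# silence) from single-speaker snippets so translated audio can reuse
# the original conversation's timing.

def pause_model(snippets):
    """snippets: list of (speaker, start, end), sorted by start time.
    Returns (pause_start, pause_end) markers for each mutual silence."""
    pauses = []
    for (_, _, end), (_, start, _) in zip(snippets, snippets[1:]):
        if start > end:  # a gap where nobody is speaking
            pauses.append((end, start))
    return pauses

snips = [("A", 0.0, 2.0), ("B", 2.5, 4.0), ("A", 4.0, 6.0)]
print(pause_model(snips))  # → [(2.0, 2.5)]
```

The translated snippets would then be laid out so the same pauses separate them, preserving the conversation's cadence.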
  • Patent number: 7752038
    Abstract: Autocorrelation values are determined as a basis for an estimation of a pitch lag in a segment of an audio signal. A first considered delay range for the autocorrelation computations is divided into a first set of sections, and first autocorrelation values are determined for delays in a plurality of sections of this first set of sections. A second considered delay range for the autocorrelation computations is divided into a second set of sections such that sections of the first set and sections of the second set are overlapping. Second autocorrelation values are determined for delays in a plurality of sections of this second set of sections.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: July 6, 2010
    Assignee: Nokia Corporation
    Inventors: Lasse Laaksonen, Anssi Ramo, Adriana Vasilache
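The autocorrelation-based pitch-lag search can be sketched over explicit delay sections; the division into overlapping sections below is a simplified stand-in for Nokia's scheme.

```python
# Sketch: autocorrelation values over candidate delay sections (possibly
# overlapping), with the best-correlating lag taken as the pitch estimate.

def autocorr(signal, lag):
    return sum(signal[i] * signal[i - lag] for i in range(lag, len(signal)))

def best_lag(signal, sections):
    """sections: list of (lo, hi) inclusive delay ranges, possibly overlapping."""
    best = None
    for lo, hi in sections:
        for lag in range(lo, hi + 1):
            r = autocorr(signal, lag)
            if best is None or r > best[1]:
                best = (lag, r)
    return best[0]

# A signal with period 4 should yield a pitch lag of 4.
sig = [1.0, 0.0, -1.0, 0.0] * 8
print(best_lag(sig, [(2, 5), (4, 8)]))  # → 4
```

Splitting the delay range into sections lets an implementation reuse partial sums and prune cheaply; this sketch only shows the search itself.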
  • Patent number: 7747429
    Abstract: A method of generating a caption abstract, including: generating a target text from a predetermined caption, analyzing a morpheme of a word included in the target text, and analyzing a grammatical structure of the target text by referring to the morpheme; extracting and removing low content words from the target text by using the morpheme or information on the grammatical structure and determining a main predicate; extracting a major sentence component with respect to the main predicate by referring to the information on the grammatical structure, as a candidate abstract word; substituting a relevant word for a complex noun phrase or a predicate phrase from the candidate abstract words by referring to a predetermined database; and generating an abstract by rearranging the candidate abstract words according to a predetermined rule.
    Type: Grant
    Filed: October 30, 2006
    Date of Patent: June 29, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jeong Mi Cho, Hyoung Gook Kim
  • Patent number: 7739113
    Abstract: A voice synthesizer includes a recorded voice storage portion (124) that stores recorded voices that are pre-recorded; a voice input portion (110) that is input with a reading voice reading out a text that is to be generated by the synthesized voice; an attribute information input portion (112) that is input with a label string, which is a string of labels assigned to each phoneme included in the reading voice, and label information, which indicates the border position of each phoneme corresponding to each label; a parameter extraction portion (116) that extracts characteristic parameters of the reading voice based on the label string, the label information, and the reading voice; and a voice synthesis portion (122) that selects the recorded voices from the recorded voice storage portion in accordance with the characteristic parameters, synthesizes the recorded voices, and generates the synthesized voice that reads out the text.
    Type: Grant
    Filed: November 9, 2006
    Date of Patent: June 15, 2010
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Tsutomu Kaneyasu
  • Patent number: 7729899
    Abstract: An automated system and method is provided for debugging training data used to train an automated language identifier. The system and method collects texts written in a particular language, generates an occurrence count for words in each text by counting the number of times each of the words is found within the text, and generates an occurrence ratio (OR) of each of the words by dividing the occurrence count by the total number of words in each text. Words whose occurrence ratios are substantially higher than their occurrence ratios in at least one of the other texts are then filtered from the text to generate a clean text.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: June 1, 2010
    Assignee: Basis Technology Corporation
    Inventor: Nobuo Otsuka
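The occurrence-ratio filter can be sketched directly; the fixed threshold factor and the fallback ratio for unseen words are assumptions for illustration.

```python
from collections import Counter

# Minimal sketch of occurrence-ratio filtering: drop words whose ratio in
# one text far exceeds their ratio in a reference text (threshold assumed).

def occurrence_ratios(text):
    words = text.split()
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}

def clean(text, reference, factor=4.0):
    ours, theirs = occurrence_ratios(text), occurrence_ratios(reference)
    fallback = 1 / len(reference.split())  # assumed smoothing for unseen words
    kept = [w for w in text.split()
            if ours[w] <= factor * theirs.get(w, fallback)]
    return " ".join(kept)

noisy = "bonjour bonjour bonjour hello world"
ref = "hello world hello again world today bonjour maybe"
print(clean(noisy, ref))  # → hello world
```

Here "bonjour" is heavily over-represented in the noisy text relative to the reference, so it is filtered out as probable contamination.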
  • Patent number: 7729901
    Abstract: A system is disclosed for determining probable meanings of a word. A prior probability is established of the probable meanings of the word. A context frequency probability is established of the probable meanings of the word. The probability that each meaning is a correct meaning may be provided in accordance with both the prior probability and the context frequency probability.
    Type: Grant
    Filed: December 13, 2005
    Date of Patent: June 1, 2010
    Assignee: Yahoo! Inc.
    Inventors: David Richardson-Bunbury, Soren Riise, Devesh Patel, Eugene H. Stipp, Paul J. Grealish
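Combining a prior with a context frequency probability can be sketched as a naive multiplicative combination followed by normalization; the combination rule is an assumption, not necessarily the patented one.

```python
# Sketch: score each candidate meaning by prior * context probability,
# then normalize so the scores form a probability distribution.

def disambiguate(priors, context_probs):
    """priors / context_probs: dicts mapping sense -> probability."""
    scores = {s: priors[s] * context_probs.get(s, 0.0) for s in priors}
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()}

# "jaguar": the animal sense is more common a priori, but a context with
# "engine" and "dealer" strongly favors the car sense.
priors = {"animal": 0.7, "car": 0.3}
context = {"animal": 0.1, "car": 0.9}
result = disambiguate(priors, context)
print(max(result, key=result.get))  # → car
```

The example shows how strong contextual evidence can override a prior favoring the more common meaning.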
  • Patent number: 7725316
    Abstract: A speech recognition adaptation method for a vehicle having a telematics unit with an embedded speech recognition system. Speech is received and pre-processed to generate acoustic feature vectors, and an adaptation parameter is applied to the acoustic feature vectors to yield transformed acoustic feature vectors. The transformed acoustic feature vectors are decoded and a hypothesis of the speech is selected, and the adaptation parameter is trained using acoustic feature vectors from the hypothesis. The method also includes one or more of the following steps: the speech is observed for a certain characteristic and the trained adaptation parameter is saved in accordance with the certain characteristic for use in transforming feature vectors of subsequent speech having the certain characteristic; use of the trained adaptation parameter persists from one vehicle ignition cycle to the next; and use of the trained adaptation parameter is ceased upon detection of a system fault.
    Type: Grant
    Filed: July 5, 2006
    Date of Patent: May 25, 2010
    Assignee: General Motors LLC
    Inventors: Rathinavelu Chengalvarayan, John J Correia, Scott M Pennock
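Applying and training an adaptation parameter on acoustic feature vectors can be sketched with a simple additive bias; the bias form and update rule are assumptions for illustration, not GM's actual adaptation scheme.

```python
# Illustrative sketch: apply a bias adaptation parameter to feature
# vectors, then nudge the bias using features from the decoded hypothesis.

def transform(features, bias):
    return [[x + b for x, b in zip(vec, bias)] for vec in features]

def train_bias(bias, hypothesis_features, rate=0.1):
    """Move the bias a small step toward the mean of the hypothesis features."""
    dims = len(bias)
    mean = [sum(vec[d] for vec in hypothesis_features) / len(hypothesis_features)
            for d in range(dims)]
    return [b + rate * (m - b) for b, m in zip(bias, mean)]

feats = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -0.5]
print(transform(feats, bias))  # → [[1.5, 1.5], [3.5, 3.5]]
print(train_bias(bias, feats))  # bias drifts toward the feature mean
```

Persisting the trained bias across ignition cycles, as the abstract describes, would just mean saving and reloading this parameter vector.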
  • Patent number: 7720681
    Abstract: Generally described, the present invention is directed toward generating, maintaining, updating, and applying digital voice profiles. Voice profiles may be generated for individuals. The voice profiles include information that is unique to each individual and which may be applied to digital representations of that individual's voice to improve the quality of a transmitted digital representation of that individual's voice. A voice profile may include, but is not limited to, basic information about the individual, and filter definitions relating to the individual's voice patterns, such as a frequency range and amplitude range. The voice profile may also include a speech definition that includes digital representations of the individual's unique speech patterns.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: May 18, 2010
    Assignee: Microsoft Corporation
    Inventors: David Milstein, Kuansan Wang, Linda Criddle
  • Patent number: 7716049
    Abstract: An apparatus for providing adaptive language model scaling includes an adaptive scaling element and an interface element. The adaptive scaling element is configured to receive input speech comprising a sequence of spoken words and to determine a plurality of candidate sequences of text words in which each of the candidate sequences has a corresponding sentence score representing a probability that a candidate sequence matches the sequence of spoken words. Each corresponding sentence score is calculated using an adaptive scaling factor. The interface element is configured to receive a user input selecting one of the candidate sequences. The adaptive scaling element is further configured to estimate an objective function based on the user input and to modify the adaptive scaling factor based on the estimated objective function.
    Type: Grant
    Filed: June 30, 2006
    Date of Patent: May 11, 2010
    Assignee: Nokia Corporation
    Inventor: Jilei Tian
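The role of the adaptive scaling factor in the sentence score can be sketched with log-probabilities; the additive log-domain form below is a common convention and an assumption here, not necessarily Nokia's exact formula.

```python
# Sketch: rank candidate sentences by acoustic score plus a scaled
# language-model score; the scaling factor controls the LM's influence.

def sentence_score(acoustic_logp, lm_logp, scale):
    return acoustic_logp + scale * lm_logp

def rank(candidates, scale):
    """candidates: list of (text, acoustic_logp, lm_logp) tuples."""
    return sorted(candidates,
                  key=lambda c: sentence_score(c[1], c[2], scale),
                  reverse=True)

cands = [("recognize speech", -10.0, -2.0),
         ("wreck a nice beach", -9.0, -6.0)]
# A larger scaling factor gives the language model more say.
print(rank(cands, scale=0.1)[0][0])  # → wreck a nice beach
print(rank(cands, scale=1.0)[0][0])  # → recognize speech
```

Adapting the factor from user corrections, as the abstract describes, amounts to adjusting `scale` so that the user-selected candidate would have ranked first.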
  • Patent number: 7716051
    Abstract: A distributed voice recognition system (500) and method employs principles of bottom-up (i.e., raw input) and top-down (i.e., prediction based on past experience) processing to perform client-side and server-side processing by (i) at the client-side, replacing application data by a phonotactic table (504); (ii) at the server-side, tracking separate confidence scores for matches against an acoustic model and comparison to a grammar; and (iii) at the server-side using a contention resolver (514) to weight the client-side and server-side results to establish a single output which represents the collaboration between client-side processing and server-side processing.
    Type: Grant
    Filed: June 28, 2006
    Date of Patent: May 11, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Barry Neil Dow, Stephen Graham Lawrence, John Brian Pickering
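The contention-resolver step can be sketched as a weighted comparison of client-side and server-side hypotheses; the weighting rule is an assumption for illustration.

```python
# Sketch of a contention resolver: weight client and server recognition
# results to produce the single collaborative output the abstract describes.

def resolve(client_result, server_result, client_weight=0.4):
    """Each result is a (hypothesis, confidence in [0, 1]) pair."""
    c_text, c_conf = client_result
    s_text, s_conf = server_result
    if c_text == s_text:
        return c_text  # both sides agree; no contention to resolve
    server_weight = 1.0 - client_weight
    return c_text if client_weight * c_conf > server_weight * s_conf else s_text

print(resolve(("call home", 0.9), ("call Rome", 0.8)))  # → call Rome
```

With these assumed weights the server-side hypothesis wins despite the client's higher raw confidence, reflecting the server's richer models.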
  • Patent number: 7716040
    Abstract: Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself.
    Type: Grant
    Filed: June 21, 2007
    Date of Patent: May 11, 2010
    Assignee: Multimodal Technologies, Inc.
    Inventors: Detlef Koll, Michael Finke
  • Patent number: 7707035
    Abstract: A sound processing system including a user headset for use in tactical military operations provides integrated sound and speech analysis including sound filtering and amplification, sound analysis and speech recognition for analyzing speech and non-speech sounds and taking programmed actions based on the analysis, recognizing language of speech for purposes of one-way and two-way voice translation, word spotting to detect and identify elements of conversation, and non-speech recognition and identification. The headset includes housings with form factors for insulating a user's ear from direct exposure to ambient sounds, with at least one microphone for receiving sound around the user and a microphone for receiving user speech. The user headset can further include interconnections for connecting the headset with systems outside of the headset, including target designation systems, communication networks, and radio transmitters.
    Type: Grant
    Filed: October 13, 2005
    Date of Patent: April 27, 2010
    Assignee: Integrated Wave Technologies, Inc.
    Inventor: Timothy S. McCune
  • Patent number: 7702510
    Abstract: Systems and methods for dynamically selecting among text-to-speech (TTS) systems. Exemplary embodiments of the systems and methods include identifying text for converting into a speech waveform, synthesizing said text by three TTS systems, generating a candidate waveform from each of the three systems, generating a score from each of the three systems, comparing the three scores, selecting a score based on a criterion, and selecting one of the three waveforms based on the selected score.
    Type: Grant
    Filed: January 12, 2007
    Date of Patent: April 20, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Ellen M. Eide, Raul Fernandez, Wael M. Hamza, Michael A. Picheny
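The selection flow can be sketched with stand-in TTS systems; the callables and their scores below are illustrative, not real engines, and "highest score wins" is an assumed criterion.

```python
# Minimal sketch of dynamic TTS selection: synthesize with each system,
# compare scores, and return the waveform from the best-scoring system.

def select_waveform(text, systems):
    """systems: list of callables returning (waveform, score) for the text."""
    results = [system(text) for system in systems]
    best_waveform, best_score = max(results, key=lambda r: r[1])
    return best_waveform

# Three stand-in TTS systems (waveforms and scores are illustrative).
tts_a = lambda t: (f"wave_a({t})", 0.72)
tts_b = lambda t: (f"wave_b({t})", 0.91)
tts_c = lambda t: (f"wave_c({t})", 0.64)
print(select_waveform("hello", [tts_a, tts_b, tts_c]))  # → wave_b(hello)
```

Because selection happens per utterance, different sentences can be served by different engines depending on which scores best.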
  • Patent number: 7693712
    Abstract: A pre-processing system for a signal of interest in an automatic speech recognition system in a vehicle includes an acoustic sensor to sense the signal of interest, a non-acoustic sensor to sense a non-acoustic noise signal, and a pre-processing unit for the signal of interest, comprising a processing section for coherent frequency band signals for suppressing the noise from the received signal, a processing section for non-coherent frequency band signals, comprising a transfer function estimation device for a signal through the vehicle cabin, and a method selection section for determining the coherence properties of the received signal and for selecting the processing section for coherent frequency band signals or the processing section for non-coherent frequency band signals depending on the properties of the received signal.
    Type: Grant
    Filed: March 27, 2006
    Date of Patent: April 6, 2010
    Assignee: Aisin Seiki Kabushiki Kaisha
    Inventors: Michel Gaeta, Abderrahman Essebbar
  • Patent number: 7693713
    Abstract: Speech models are trained using one or more of three different training systems. They include competitive training which reduces a distance between a recognized result and a true result, data boosting which divides and weights training data, and asymmetric training which trains different model components differently.
    Type: Grant
    Filed: June 17, 2005
    Date of Patent: April 6, 2010
    Assignee: Microsoft Corporation
    Inventors: Xiaodong He, Jian Wu
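The "data boosting" component, which divides and weights training data, can be sketched with a simple reweighting that emphasizes examples the current model misclassifies; the weighting scheme is an assumption for illustration.

```python
# Sketch of data boosting: up-weight training examples the current model
# still gets wrong, then normalize the weights (factor is assumed).

def boost_weights(examples, model, factor=2.0):
    """examples: list of (features, label); model: callable features -> label."""
    weights = []
    for features, label in examples:
        wrong = model(features) != label
        weights.append(factor if wrong else 1.0)
    total = sum(weights)
    return [w / total for w in weights]

model = lambda x: "yes" if x > 0 else "no"
data = [(1, "yes"), (-1, "no"), (2, "no")]  # the last example is misclassified
print(boost_weights(data, model))  # → [0.25, 0.25, 0.5]
```

Retraining on the reweighted data makes the model spend more effort on its current mistakes, which is the intuition behind boosting-style training.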
  • Patent number: 7689407
    Abstract: A method of learning a second language through the guidance of pictures, enabling users to learn multiple languages through computers. After users input a plurality of words, a picture/text interface displays the plurality of input words, a plurality of output words, and a plurality of pictures. The plurality of output words represent the plurality of input words in another language, and the plurality of pictures represent the plurality of input words.
    Type: Grant
    Filed: February 7, 2007
    Date of Patent: March 30, 2010
    Inventors: Kuo-Ping Yang, Chao-Jen Huang, Chien-Liang Chiang, Kun-Yi Hua, Chih-Long Chang, Ming-Hsiang Cheng, Yen-Jui Chiao
  • Patent number: 7689424
    Abstract: This invention relates to a distributed speech recognition method comprising at least one user terminal and at least one server which can communicate with each other by means of a telecommunication network. The inventive method comprises the following steps: at the user terminal, attempting to associate a saved form with the signal to be recognized and, independently of said step, transmitting to the server a signal indicating the signal to be recognized; and, at the server, attempting to associate a saved form with the received signal.
    Type: Grant
    Filed: March 8, 2004
    Date of Patent: March 30, 2010
    Assignee: France Telecom
    Inventors: Jean Monne, Jean-Pierre Petit, Patrick Brisard