Patents Examined by Matthew J. Sked
  • Patent number: 7689428
    Abstract: An acoustic signal encoding device for down-mixing at different ratios to encode a multichannel signal with a small number of channels, and an acoustic signal decoding device for decoding the signal encoded by the acoustic signal encoding device. In these devices, weighting means (103) in the acoustic signal encoding device (100) weights input signals of two channels individually according to a down-mixing coefficient thereby to calculate the level difference of the signals of two channels weighted by a level difference calculation unit (104). A separating unit (202) in the acoustic signal decoding device (200) separates the down-mixed signals into signals of two channels with the level difference information weighted.
    Type: Grant
    Filed: October 13, 2005
    Date of Patent: March 30, 2010
    Assignee: Panasonic Corporation
    Inventors: Yoshiaki Takagi, Naoya Tanaka
  • Patent number: 7684986
    Abstract: An apparatus, medium, and method recognizing speech. The method may include the calculating of scores indicating the degree of similarity between a characteristic of an input speech and characteristics of speech models based on the degree of similarity between the length of each phoneme included in an input speech and the length of phonemes included in each speech model, and determining a speech model with the highest score among the scores to be the corresponding recognized speech for the input speech. By doing so, the speech recognition rate may be greatly enhanced and when an input speech includes continuous identical phonemes the word error rate (WER) may be greatly reduced.
    Type: Grant
    Filed: December 23, 2005
    Date of Patent: March 23, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Icksang Han, Sangbae Jeong, Jeongsu Kim
  • Patent number: 7680663
    Abstract: A hidden dynamics value in speech is represented by a higher order, discretized dynamic model, which predicts the discretized dynamic variable that changes over time. Parameters are trained for the model. A decoder algorithm is developed for estimating the underlying phonological speech units in sequence that correspond to the observed speech signal using the higher order, discretized dynamic model.
    Type: Grant
    Filed: August 21, 2006
    Date of Patent: March 16, 2010
    Assignee: Microsoft Corporation
    Inventor: Li Deng
  • Patent number: 7680657
    Abstract: Possible segmentations for an audio signal are scored based on distortions for feature vectors of the audio signal and the total number of segments in the segmentation. The scores are used to select a segmentation and the selected segmentation is used to identify a starting point and an ending point for a speech signal in the audio signal.
    Type: Grant
    Filed: August 15, 2006
    Date of Patent: March 16, 2010
    Assignee: Microsoft Corporation
    Inventors: Yu Shi, Frank Kao-ping Soong, Jian-lai Zhou
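    Illustrative sketch (not taken from the patent): the abstract for 7680657 says candidate segmentations are scored from feature-vector distortion and the total number of segments. The code below assumes scalar features, squared distance to the segment mean as the distortion, an additive per-segment penalty, and that a lower score is better. All of these choices, and the names `segment_distortion`, `score`, and `best_segmentation`, are hypothetical.

```python
from typing import Sequence


def segment_distortion(features: Sequence[float]) -> float:
    """Sum of squared distances from each feature to the segment mean."""
    mean = sum(features) / len(features)
    return sum((x - mean) ** 2 for x in features)


def score(features: Sequence[float], boundaries: Sequence[int], penalty: float) -> float:
    """Total distortion plus a penalty per segment (assumed form; lower is better).

    `boundaries` must be strictly increasing indices inside the feature range.
    """
    edges = [0, *boundaries, len(features)]
    distortion = sum(
        segment_distortion(features[a:b]) for a, b in zip(edges, edges[1:])
    )
    return distortion + penalty * (len(edges) - 1)


def best_segmentation(
    features: Sequence[float],
    candidates: Sequence[Sequence[int]],
    penalty: float,
) -> Sequence[int]:
    """Pick the candidate boundary list with the lowest score."""
    return min(candidates, key=lambda b: score(features, b, penalty))


# Toy signal: low values stand in for background, high values for speech.
features = [0.0, 0.0, 5.0, 5.0, 0.0, 0.0]
candidates = [[], [2, 4], [1, 2, 3, 4, 5]]
chosen = best_segmentation(features, candidates, penalty=1.0)
start, end = chosen[0], chosen[-1]  # first and last boundary as speech endpoints
```

    In this toy case the penalty term keeps the search from splitting every frame into its own segment, which would otherwise always give zero distortion.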
  • Patent number: 7680664
    Abstract: A multi-state pattern recognition model with non-uniform kernel allocation is formed by setting a number of states for a multi-state pattern recognition model and assigning different numbers of kernels to different states. The kernels are then trained using training data to form the multi-state pattern recognition model.
    Type: Grant
    Filed: August 16, 2006
    Date of Patent: March 16, 2010
    Assignee: Microsoft Corporation
    Inventors: Peng Liu, Jian-lai Zhou, Frank Kao-ping Soong
  • Patent number: 7676363
    Abstract: A speech recognition method includes the steps of receiving speech in a vehicle, extracting acoustic data from the received speech, and applying a vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data. The speech recognition method may also include one or more of the following steps: pre-processing the normalized acoustic data to extract acoustic feature vectors; decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models built according to a plurality of Lombard levels of a Lombard speech corpus covering a plurality of vehicles; calculating the Lombard level of vehicle noise; and/or selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step.
    Type: Grant
    Filed: June 29, 2006
    Date of Patent: March 9, 2010
    Assignee: General Motors LLC
    Inventors: Rathinavelu Chengalvarayan, Scott M. Pennock
  • Patent number: 7676374
    Abstract: A filtering method and system for a subband-domain is provided. A first analysis filter bank is configured to divide an input signal into a plurality of subbands. A second analysis filter bank divides one or more of the subbands into a second set of subbands. A modification unit accepts the plurality of subbands, the second set of subbands and modification data and outputs a plurality of modified frequency subbands. A first synthesis filter bank synthesizes the plurality of modified subbands. A filter then filters the plurality of modified subbands and the one or more synthesized modified subbands to obtain a plurality of filtered subbands. A second synthesis filter bank synthesizes the plurality of filtered subbands to obtain an output signal.
    Type: Grant
    Filed: March 28, 2006
    Date of Patent: March 9, 2010
    Assignee: Nokia Corporation
    Inventor: Mikko Tammi
  • Patent number: 7668716
    Abstract: A device improves speech recognition accuracy by utilizing an external knowledge source. The device receives a speech recognition result from an automatic speech recognition (ASR) engine, the speech recognition result including a plurality of ordered interpretations, wherein each of the ordered interpretations includes a plurality of information items. The device analyzes and filters the plurality of interpretations using an external knowledge source to create a filtered plurality of ordered interpretations. The device stores the filtered plurality of ordered interpretations to a memory. The device transmits the filtered plurality of ordered interpretations to a dialog manager module to create a textual output. Alternatively, the dialog manager module retrieves the filtered plurality of ordered interpretations from a memory.
    Type: Grant
    Filed: May 5, 2006
    Date of Patent: February 23, 2010
    Assignee: Dictaphone Corporation
    Inventors: Marc Helbing, Klaus Reifenrath
  • Patent number: 7660717
    Abstract: Speech recognition is performed by matching between a characteristic quantity of an inputted speech and a composite HMM obtained by synthesizing a speech HMM (hidden Markov model) and a noise HMM for each speech frame of the inputted speech by use of the composite HMM.
    Type: Grant
    Filed: January 9, 2008
    Date of Patent: February 9, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Tetsuya Takiguchi, Masafumi Nishimura
  • Patent number: 7657421
    Abstract: The “Idiom Identifier” converts an original text document to a “neutral” form containing no punctuation, no capital letters, and having only a single space between each word. Neutral form text also removes hidden markup such as line breaks, paragraph breaks or page breaks. The Idiom Identifier performs an enhanced text search to locate idioms listed in a library file. The Idiom Identifier marks each identified idiom in a marked-up copy of the original text document. A reader can click on the marked-up idiom to see a definition of the idiom.
    Type: Grant
    Filed: June 28, 2006
    Date of Patent: February 2, 2010
    Assignee: International Business Machines Corporation
    Inventors: Thomas H. Barnes, Yen-Fu Chen, John W. Dunsmoir, Sheryl S. Kinstler, Carol S. Walton
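    Illustrative sketch (not taken from the patent): the "neutral" form described in the abstract for 7657421 (no punctuation, no capital letters, single spaces, no line breaks) can be approximated with the standard library. The function names and the dictionary-based idiom library are assumptions made for this example, and the plain substring lookup stands in for the patent's "enhanced text search", which the abstract does not detail.

```python
import string


def to_neutral(text: str) -> str:
    """Drop ASCII punctuation, lowercase, and collapse all whitespace
    (including line and paragraph breaks) into single spaces."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.lower().split())


def find_idioms(text: str, library: dict[str, str]) -> list[tuple[str, str]]:
    """Return (idiom, definition) pairs whose neutral form occurs in the text.

    Padding with spaces makes the match whole-word only.
    """
    neutral = f" {to_neutral(text)} "
    return [
        (idiom, definition)
        for idiom, definition in library.items()
        if f" {to_neutral(idiom)} " in neutral
    ]


library = {"kick the bucket": "to die"}  # hypothetical library file contents
hits = find_idioms("It was time to Kick the\nbucket, he said.", library)
```

    Because the match runs on neutral text, an idiom split across a line break or written with different capitalization is still found; inflected forms such as "kicked the bucket" are not.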
  • Patent number: 7657425
    Abstract: A method, a system and a computer program for extracting information from a natural language text corpus based on a natural language query are disclosed. The natural language text corpus is indexed and stored. A natural language query is analyzed with respect to phrases, phrase types, syntactic roles, word tokens of phrases, and lexical meaning of word tokens. One or more surface variants are created for at least one phrase of the natural language query. The one or more surface variants each have the same phrase type as the at least one phrase of the natural language query, and each comprise a word token which is a lexical head and has the same lexical meaning as a word token which is a lexical head of the at least one phrase of the natural language query. The one or more surface variants and the at least one phrase of the natural language query are compared with the indexed and stored natural language text corpus.
    Type: Grant
    Filed: March 16, 2007
    Date of Patent: February 2, 2010
    Assignee: Hapax Limited
    Inventors: Eva Ingegerd Ejerhed, Peter A. Braroe
  • Patent number: 7657427
    Abstract: Speech signal classification and encoding systems and methods are disclosed herein. The signal classification is done in three steps each of them discriminating a specific signal class. First, a voice activity detector (VAD) discriminates between active and inactive speech frames. If an inactive speech frame is detected (background noise signal) then the classification chain ends and the frame is encoded with comfort noise generation (CNG). If an active speech frame is detected, the frame is subjected to a second classifier dedicated to discriminate unvoiced frames. If the classifier classifies the frame as unvoiced speech signal, the classification chain ends, and the frame is encoded using a coding method optimized for unvoiced signals. Otherwise, the speech frame is passed through to the “stable voiced” classification module. If the frame is classified as stable voiced frame, then the frame is encoded using a coding method optimized for stable voiced signals.
    Type: Grant
    Filed: January 19, 2005
    Date of Patent: February 2, 2010
    Assignee: Nokia Corporation
    Inventor: Milan Jelinek
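    Illustrative sketch (not taken from the patent): the three-step classification chain in the abstract for 7657427 can be written as a sequence of early exits. The detectors are passed in as callables because the abstract does not specify them; the energy and zero-crossing helpers, the mode labels, and the final "generic" fallback are assumptions made for this example.

```python
from typing import Callable, Sequence

Frame = Sequence[float]
Detector = Callable[[Frame], bool]


def energy(frame: Frame) -> float:
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame: Frame) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    pairs = list(zip(frame, frame[1:]))
    return sum(1 for a, b in pairs if (a < 0) != (b < 0)) / len(pairs)


def select_coding_mode(
    frame: Frame,
    is_active: Detector,
    is_unvoiced: Detector,
    is_stable_voiced: Detector,
) -> str:
    """Walk the chain: VAD, then unvoiced check, then stable-voiced check."""
    if not is_active(frame):
        return "cng"  # inactive frame: comfort noise generation
    if is_unvoiced(frame):
        return "unvoiced"
    if is_stable_voiced(frame):
        return "stable_voiced"
    return "generic"  # assumed fallback; the abstract does not name this case


# Toy detectors with made-up thresholds, for demonstration only.
def toy_vad(f: Frame) -> bool:
    return energy(f) > 1e-4


def toy_unvoiced(f: Frame) -> bool:
    return zero_crossing_rate(f) > 0.5


mode = select_coding_mode([0.5, -0.5] * 4, toy_vad, toy_unvoiced, lambda f: True)
```

    Each classifier only sees frames that the previous stage let through, so the cheaper decisions end the chain early.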
  • Patent number: 7653543
    Abstract: The present invention is directed toward a method, device, and system for providing a high quality communication session. The system provides a way of determining speech characteristics of participants in the communication session and adjusting, if necessary, signals from a speaker to a listener such that the listener can more intelligibly understand what the speaker is saying.
    Type: Grant
    Filed: March 24, 2006
    Date of Patent: January 26, 2010
    Assignee: Avaya Inc.
    Inventors: Colin Blair, Jonathan R. Yee-Hang Choy, Andrew W. Lang, David Preshan Thambiratnam, Paul Roller Michaelis
  • Patent number: 7653546
    Abstract: Provided is a system and method for creating program code via voice input. The method includes providing a client application configured to compare a voice input to a grammar specified in a document; mapping a plurality of commands specified in the grammar to programming language commands; and enhancing the mapped programming language commands to enable compiling. The enhancing can include creating programming code by inserting at least implicit parentheses, punctuation, and default variable values. The programming language commands can be associated with Java or another language. The document can be a VoiceXML file that can be altered to permit a number of different programming language. A voice programming system includes a receiver to receive voice commands; a voice programming processor configured to process the voice commands to create code; and an enhancement block configured to alter the code into compilable code.
    Type: Grant
    Filed: November 18, 2004
    Date of Patent: January 26, 2010
    Assignee: Nuance Communications, Inc.
    Inventor: Karl J. Weinmeister
  • Patent number: 7653545
    Abstract: A method of developing an interactive system, including inputting application data representative of an application for the system, the application data including operations and parameters for the application, generating prompts on the basis of the application data, and generating grammar on the basis of the application data. The prompts and grammar are generated on the basis of a predetermined pattern or structure for the prompts and grammar. The grammar also includes predefined grammar. Grammatical inference is also executed to enhance the grammar. The grammatical inference method for developing the grammar may include processing rules of the grammar, creating additional rules representative of repeated phrases, and merging equivalent symbols of the grammar, wherein the rules define slots to represent data on which an interactive system executes operations and include symbols representing at least a phrase or term.
    Type: Grant
    Filed: June 9, 2000
    Date of Patent: January 26, 2010
    Assignee: Telstra Corporation Limited
    Inventor: Bradford Craig Starkie
  • Patent number: 7650284
    Abstract: A method, system and apparatus for enabling voice clicks in a multimodal page. In accordance with the present invention, a method for enabling voice clicks in a multimodal page can include toggling a display of indicia binding selected user interface elements in the multimodal page to corresponding voice logic; and, processing a selection of the selected user interface elements in the multimodal page through different selection modalities. In particular, the toggling step can include toggling a display of both indexing indicia for the selected user interface elements, and also a text display indicating that a voice selection of the selected user interface elements is supported.
    Type: Grant
    Filed: November 19, 2004
    Date of Patent: January 19, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Charles W. Cross, Marc White
  • Patent number: 7636664
    Abstract: A putting device comprising a storage means having a plurality of recorded audio samples stored thereon; a randomizing means, and a sensing means; wherein all said means are cooperatively connected together; upon activation of said sensing means, said randomizing means selects an audio sample from said storage means for said putting device to audibilize.
    Type: Grant
    Filed: June 17, 2004
    Date of Patent: December 22, 2009
    Inventor: Steven J. Lee
  • Patent number: 7630890
    Abstract: A block-constrained Trellis coded quantization (TCQ) method and a method and apparatus for quantizing line spectral frequency (LSF) parameters employing the same in a speech coding system wherein the LSF coefficient quantizing method includes: removing the direct current (DC) component in an input LSF coefficient vector; generating a first prediction error vector by performing inter-frame and intra-frame prediction for the LSF coefficient vector, in which the DC component is removed, quantizing the first prediction error vector by using the BC-TCQ algorithm, and by performing intra-frame and inter-frame prediction compensation, generating a quantized first LSF coefficient vector; generating a second prediction error vector by performing intra-frame prediction for the LSF coefficient vector, in which the DC component is removed, quantizing the second prediction error vector by using the BC-TCQ algorithm, and then, by performing intra-frame prediction compensation, generating a quantized second LSF coefficient
    Type: Grant
    Filed: February 19, 2004
    Date of Patent: December 8, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Chang-yong Son, Sang-won Kang, Yong-won Shin, Thomas R. Fischer
  • Patent number: 7627475
    Abstract: A system and method are provided for detecting emotional states using statistics. First, a speech signal is received. At least one acoustic parameter is extracted from the speech signal. Then statistics or features from samples of the voice are calculated from extracted speech parameters. The features serve as inputs to a classifier, which can be a computer program, a device or both. The classifier assigns at least one emotional state from a finite number of possible emotional states to the speech signal. The classifier also estimates the confidence of its decision. Features that are calculated may include a maximum value of a fundamental frequency, a standard deviation of the fundamental frequency, a range of the fundamental frequency, a mean of the fundamental frequency, and a variety of other statistics.
    Type: Grant
    Filed: March 8, 2007
    Date of Patent: December 1, 2009
    Assignee: Accenture LLP
    Inventor: Valery A. Petrushin
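    Illustrative sketch (not taken from the patent): the fundamental-frequency statistics listed in the abstract for 7627475 (maximum, standard deviation, range, mean) can be computed from a per-frame pitch track. The function name, the feature keys, the convention that 0 marks an unvoiced frame, and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev
from typing import Sequence


def f0_features(f0_hz: Sequence[float]) -> dict[str, float]:
    """Summary statistics of a pitch track, ignoring unvoiced frames."""
    voiced = [f for f in f0_hz if f > 0]  # assumption: 0 means "no pitch detected"
    if not voiced:
        raise ValueError("no voiced frames in pitch track")
    return {
        "f0_max": max(voiced),
        "f0_std": pstdev(voiced),
        "f0_range": max(voiced) - min(voiced),
        "f0_mean": mean(voiced),
    }


# Hypothetical pitch track in Hz, one value per frame.
features = f0_features([0.0, 100.0, 120.0, 140.0, 0.0])
```

    A feature dictionary like this would then be the input to whatever classifier assigns the emotional state; the classifier itself is outside this sketch.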
  • Patent number: 7620549
    Abstract: A system and method are provided for receiving speech and/or non-speech communications of natural language questions and/or commands and executing the questions and/or commands. The invention provides a conversational human-machine interface that includes a conversational speech analyzer, a general cognitive model, an environmental model, and a personalized cognitive model to determine context, domain knowledge, and invoke prior information to interpret a spoken utterance or a received non-spoken message. The system and method creates, stores and uses extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech or non-speech communication and presenting the expected results for a particular question or command.
    Type: Grant
    Filed: August 10, 2005
    Date of Patent: November 17, 2009
    Assignee: VoiceBox Technologies, Inc.
    Inventors: Philippe Di Cristo, Chris Weider, Robert A. Kennewick