Patents Examined by Matthew J. Sked

Acoustic signal encoding device, and acoustic signal decoding device

Patent number: 7689428

Abstract: An acoustic signal encoding device for down-mixing at different ratios to encode a multichannel signal with a small number of channels, and an acoustic signal decoding device for decoding the signal encoded by the acoustic signal encoding device. In these devices, weighting means (103) in the acoustic signal encoding device (100) weights input signals of two channels individually according to a down-mixing coefficient thereby to calculate the level difference of the signals of two channels weighted by a level difference calculation unit (104). A separating unit (202) in the acoustic signal decoding device (200) separates the down-mixed signals into signals of two channels with the level difference information weighted.

Type: Grant

Filed: October 13, 2005

Date of Patent: March 30, 2010

Assignee: Panasonic Corporation

Inventors: Yoshiaki Takagi, Naoya Tanaka

Method, medium, and apparatus recognizing speech considering similarity between the lengths of phonemes

Patent number: 7684986

Abstract: An apparatus, medium, and method recognizing speech. The method may include the calculating of scores indicating the degree of similarity between a characteristic of an input speech and characteristics of speech models based on the degree of similarity between the length of each phoneme included in an input speech and the length of phonemes included in each speech model, and determining a speech model with the highest score among the scores to be the corresponding recognized speech for the input speech. By doing so, the speech recognition rate may be greatly enhanced and when an input speech includes continuous identical phonemes the word error rate (WER) may be greatly reduced.

Type: Grant

Filed: December 23, 2005

Date of Patent: March 23, 2010

Assignee: Samsung Electronics Co., Ltd.

Inventors: Icksang Han, Sangbae Jeong, Jeongsu Kim

Using a discretized, higher order representation of hidden dynamic variables for speech recognition

Patent number: 7680663

Abstract: A hidden dynamics value in speech is represented by a higher order, discretized dynamic model, which predicts the discretized dynamic variable that changes over time. Parameters are trained for the model. A decoder algorithm is developed for estimating the underlying phonological speech units in sequence that correspond to the observed speech signal using the higher order, discretized dynamic model.

Type: Grant

Filed: August 21, 2006

Date of Patent: March 16, 2010

Assignee: Microsoft Corporation

Inventor: Li Deng

Auto segmentation based partitioning and clustering approach to robust endpointing

Patent number: 7680657

Abstract: Possible segmentations for an audio signal are scored based on distortions for feature vectors of the audio signal and the total number of segments in the segmentation. The scores are used to select a segmentation and the selected segmentation is used to identify a starting point and an ending point for a speech signal in the audio signal.

Type: Grant

Filed: August 15, 2006

Date of Patent: March 16, 2010

Assignee: Microsoft Corporation

Inventors: Yu Shi, Frank Kao-ping Soong, Jian-lai Zhou

Parsimonious modeling by non-uniform kernel allocation

Patent number: 7680664

Abstract: A multi-state pattern recognition model with non-uniform kernel allocation is formed by setting a number of states for a multi-state pattern recognition model and assigning different numbers of kernels to different states. The kernels are then trained using training data to form the multi-state pattern recognition model.

Type: Grant

Filed: August 16, 2006

Date of Patent: March 16, 2010

Assignee: Microsoft Corporation

Inventors: Peng Liu, Jian-lai Zhou, Frank Kao-ping Soong

Automated speech recognition using normalized in-vehicle speech

Patent number: 7676363

Abstract: A speech recognition method includes the steps of receiving speech in a vehicle, extracting acoustic data from the received speech, and applying a vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data. The speech recognition method may also include one or more of the following steps: pre-processing the normalized acoustic data to extract acoustic feature vectors; decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models built according to a plurality of Lombard levels of a Lombard speech corpus covering a plurality of vehicles; calculating the Lombard level of vehicle noise; and/or selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step.

Type: Grant

Filed: June 29, 2006

Date of Patent: March 9, 2010

Assignee: General Motors LLC

Inventors: Rathinavelu Chengalvarayan, Scott M Pennock

Low complexity subband-domain filtering in the case of cascaded filter banks

Patent number: 7676374

Abstract: A filtering method and system for a subband-domain is provided. A first analysis filter bank is configured to divide an input signal into a plurality of subbands. A second analysis filter bank divides one or more of the subbands into a second set of subbands. A modification unit accepts the plurality of subbands, the second set of subbands and modification data and outputs a plurality of modified frequency subbands. A first synthesis filter bank synthesizes the plurality of modified subbands. A filter then filters the plurality of modified subbands and the one or more synthesized modified subbands to obtain a plurality of filtered subbands. A second synthesis filter bank synthesizes the plurality of filtered subbands to obtain an output signal.

Type: Grant

Filed: March 28, 2006

Date of Patent: March 9, 2010

Assignee: Nokia Corporation

Inventor: Mikko Tammi

Incorporation of external knowledge in multimodal dialog systems

Patent number: 7668716

Abstract: A device improves speech recognition accuracy by utilizing an external knowledge source. The device receives a speech recognition result from an automatic speech recognition (ASR) engine, the speech recognition result including a plurality of ordered interpretations, wherein each of the ordered interpretations includes a plurality of information items. The device analyzes and filters the plurality of interpretations using an external knowledge source to create a filtered plurality of ordered interpretations. The device stores the filtered plurality of ordered interpretations to a memory. The device transmits the filtered plurality of ordered interpretations to a dialog manager module to create a textual output. Alternatively, the dialog manager module retrieves the filtered plurality of ordered interpretations from a memory.

Type: Grant

Filed: May 5, 2006

Date of Patent: February 23, 2010

Assignee: Dictaphone Corporation

Inventors: Marc Helbing, Klaus Reifenrath
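The filtering step described in the abstract of patent 7668716 can be illustrated with a minimal Python sketch. It assumes each interpretation is a dictionary of information items and that the external knowledge source is a plain set of known values; the function name, the `city` field, and the sample data are hypothetical and are not taken from the patent.

```python
from typing import Callable, Dict, List

Interpretation = Dict[str, str]  # hypothetical shape: information item name -> value


def filter_interpretations(
    interpretations: List[Interpretation],
    is_plausible: Callable[[Interpretation], bool],
) -> List[Interpretation]:
    """Keep only interpretations accepted by the knowledge check.

    The original ranking order from the recognizer is preserved.
    """
    return [interp for interp in interpretations if is_plausible(interp)]


# Hypothetical external knowledge source: a set of valid city names.
KNOWN_CITIES = {"Boston", "Austin"}

# Hypothetical ordered recognition result, best hypothesis first.
n_best = [{"city": "Bostan"}, {"city": "Boston"}, {"city": "Austin"}]

filtered = filter_interpretations(n_best, lambda i: i.get("city") in KNOWN_CITIES)
print(filtered)
```

Passing the knowledge check as a callable keeps the sketch independent of how the knowledge source is actually stored or queried.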
Speech recognition system and program thereof

Patent number: 7660717

Abstract: Speech recognition is performed by matching between a characteristic quantity of an inputted speech and a composite HMM obtained by synthesizing a speech HMM (hidden Markov model) and a noise HMM for each speech frame of the inputted speech by use of the composite HMM.

Type: Grant

Filed: January 9, 2008

Date of Patent: February 9, 2010

Assignee: Nuance Communications, Inc.

Inventors: Tetsuya Takiguchi, Masafumi Nishimura

System and method for identifying and defining idioms

Patent number: 7657421

Abstract: The “Idiom Identifier” converts an original text document to a “neutral” form containing no punctuation, no capital letters, and having only a single space between each word. Neutral form text also removes hidden markup such as line breaks, paragraph breaks or page breaks. The Idiom Identifier performs an enhanced text search to locate idioms listed in a library file. The Idiom Identifier marks each identified idiom in a marked-up copy of the original text document. A reader can click on the marked-up idiom to see a definition of the idiom.

Type: Grant

Filed: June 28, 2006

Date of Patent: February 2, 2010

Assignee: International Business Machines Corporation

Inventors: Thomas H. Barnes, Yen-Fu Chen, John W. Dunsmoir, Sheryl S. Kinstler, Carol S. Walton
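The "neutral" form described in the abstract of patent 7657421 (no punctuation, no capital letters, single spaces, no line, paragraph, or page breaks) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: it strips ASCII punctuation only, the idiom library is a plain list, and the names `to_neutral_form` and `find_idioms` are hypothetical.

```python
import string

# Translation table that deletes ASCII punctuation (an assumption; the
# abstract does not say which characters count as punctuation).
_STRIP_PUNCTUATION = str.maketrans("", "", string.punctuation)


def to_neutral_form(text: str) -> str:
    """Lowercase, drop punctuation, and collapse all whitespace to single spaces."""
    without_punctuation = text.lower().translate(_STRIP_PUNCTUATION)
    # str.split() with no argument splits on any whitespace run, including
    # line breaks and form feeds, so breaks disappear here as well.
    return " ".join(without_punctuation.split())


def find_idioms(text: str, idiom_library: list[str]) -> list[str]:
    """Return the library idioms that occur as whole-word phrases in the text."""
    padded = f" {to_neutral_form(text)} "
    return [
        idiom for idiom in idiom_library
        if f" {to_neutral_form(idiom)} " in padded
    ]


print(to_neutral_form("He KICKED the\n\nbucket,  finally!"))
print(find_idioms("It's raining cats\nand dogs.", ["raining cats and dogs", "kick the bucket"]))
```

Padding both strings with spaces is a simple way to avoid matching an idiom inside a longer word; the enhanced text search in the patent may work differently.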
Method and system for information extraction

Patent number: 7657425

Abstract: A method, a system and a computer program for extracting information from a natural language text corpus based on a natural language query are disclosed. The natural language text corpus is indexed and stored. A natural language query is analyzed with respect to phrases, phrase types, syntactic roles, word tokens of phrases, and lexical meaning of word tokens. One or more surface variants are created for at least one phrase of the natural language query. The one or more surface variants each have the same phrase type as the at least one phrase of the natural language query, and each comprise a word token which is a lexical head and has the same lexical meaning as a word token which is a lexical head of the at least one phrase of the natural language query. The one or more surface variants and the at least one phrase of the natural language query are compared with the indexed and stored natural language text corpus.

Type: Grant

Filed: March 16, 2007

Date of Patent: February 2, 2010

Assignee: Hapax Limited

Inventors: Eva Ingegerd Ejerhed, Peter A. Braroe

Methods and devices for source controlled variable bit-rate wideband speech coding

Patent number: 7657427

Abstract: Speech signal classification and encoding systems and methods are disclosed herein. The signal classification is done in three steps each of them discriminating a specific signal class. First, a voice activity detector (VAD) discriminates between active and inactive speech frames. If an inactive speech frame is detected (background noise signal) then the classification chain ends and the frame is encoded with comfort noise generation (CNG). If an active speech frame is detected, the frame is subjected to a second classifier dedicated to discriminate unvoiced frames. If the classifier classifies the frame as unvoiced speech signal, the classification chain ends, and the frame is encoded using a coding method optimized for unvoiced signals. Otherwise, the speech frame is passed through to the “stable voiced” classification module. If the frame is classified as stable voiced frame, then the frame is encoded using a coding method optimized for stable voiced signals.

Type: Grant

Filed: January 19, 2005

Date of Patent: February 2, 2010

Assignee: Nokia Corporation

Inventor: Milan Jelinek
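The three-step classification chain in the abstract of patent 7657427 can be sketched as a short decision cascade. The three detector callables and the mode labels are hypothetical placeholders for the classifiers the abstract names. The final `"other"` branch is an assumption, since the abstract does not say how frames that fail all three tests are encoded.

```python
from typing import Callable, Sequence

Frame = Sequence[float]  # hypothetical frame type: a sequence of samples


def choose_coding_mode(
    frame: Frame,
    is_active: Callable[[Frame], bool],
    is_unvoiced: Callable[[Frame], bool],
    is_stable_voiced: Callable[[Frame], bool],
) -> str:
    """Walk the classification chain and return a label for the coding mode."""
    # Step 1: voice activity detection. Inactive frames end the chain
    # and are encoded with comfort noise generation.
    if not is_active(frame):
        return "cng"
    # Step 2: unvoiced classifier. Unvoiced frames end the chain.
    if is_unvoiced(frame):
        return "unvoiced"
    # Step 3: stable voiced classifier.
    if is_stable_voiced(frame):
        return "stable_voiced"
    # Assumption: anything else falls through to some other coding mode.
    return "other"


# Toy detectors for demonstration only; real detectors would be far richer.
silent_frame = [0.0, 0.0, 0.0, 0.0]
mode = choose_coding_mode(
    silent_frame,
    is_active=lambda f: max(abs(s) for s in f) > 0.01,
    is_unvoiced=lambda f: False,
    is_stable_voiced=lambda f: False,
)
print(mode)
```

Each test runs only if the previous one did not end the chain, which mirrors the early exits described in the abstract.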
Automatic signal adjustment based on intelligibility

Patent number: 7653543

Abstract: The present invention is directed toward a method, device, and system for providing a high quality communication session. The system provides a way of determining speech characteristics of participants in the communication session and adjusting, if necessary, signals from a speaker to a listener such that the listener can more intelligibly understand what the speaker is saying.

Type: Grant

Filed: March 24, 2006

Date of Patent: January 26, 2010

Assignee: Avaya Inc.

Inventors: Colin Blair, Jonathan R. Yee-Hang Choy, Andrew W. Lang, David Preshan Thambiratnam, Paul Roller Michaelis

Method and system for efficient voice-based programming

Patent number: 7653546

Abstract: Provided is a system and method for creating program code via voice input. The method includes providing a client application configured to compare a voice input to a grammar specified in a document; mapping a plurality of commands specified in the grammar to programming language commands; and enhancing the mapped programming language commands to enable compiling. The enhancing can include creating programming code by inserting at least implicit parentheses, punctuation, and default variable values. The programming language commands can be associated with Java or another language. The document can be a VoiceXML file that can be altered to permit a number of different programming languages. A voice programming system includes a receiver to receive voice commands; a voice programming processor configured to process the voice commands to create code; and an enhancement block configured to alter the code into compilable code.

Type: Grant

Filed: November 18, 2004

Date of Patent: January 26, 2010

Assignee: Nuance Communications, Inc.

Inventor: Karl J. Weinmeister

Method of developing an interactive system

Patent number: 7653545

Abstract: A method of developing an interactive system, including inputting application data representative of an application for the system, the application data including operations and parameters for the application, generating prompts on the basis of the application data, and generating grammar on the basis of the application data. The prompts and grammar are generated on the basis of a predetermined pattern or structure for the prompts and grammar. The grammar also includes predefined grammar. Grammatical inference is also executed to enhance the grammar. The grammatical inference method for developing the grammar may include processing rules of the grammar, creating additional rules representative of repeated phrases, and merging equivalent symbols of the grammar, wherein the rules define slots to represent data on which an interactive system executes operations and include symbols representing at least a phrase or term.

Type: Grant

Filed: June 9, 2000

Date of Patent: January 26, 2010

Assignee: Telstra Corporation Limited

Inventor: Bradford Craig Starkie

Enabling voice click in a multimodal page

Patent number: 7650284

Abstract: A method, system and apparatus for enabling voice clicks in a multimodal page. In accordance with the present invention, a method for enabling voice clicks in a multimodal page can include toggling a display of indicia binding selected user interface elements in the multimodal page to corresponding voice logic; and, processing a selection of the selected user interface elements in the multimodal page through different selection modalities. In particular, the toggling step can include toggling a display of both indexing indicia for the selected user interface elements, and also a text display indicating that a voice selection of the selected user interface elements is supported.

Type: Grant

Filed: November 19, 2004

Date of Patent: January 19, 2010

Assignee: Nuance Communications, Inc.

Inventors: Charles W. Cross, Marc White

Golf commentator

Patent number: 7636664

Abstract: A putting device comprising a storage means having a plurality of recorded audio samples stored thereon; a randomizing means, and a sensing means; wherein all said means are cooperatively connected together; upon activation of said sensing means, said randomizing means selects an audio sample from said storage means for said putting device to audibilize.

Type: Grant

Filed: June 17, 2004

Date of Patent: December 22, 2009

Inventor: Steven J. Lee

Block-constrained TCQ method, and method and apparatus for quantizing LSF parameter employing the same in speech coding system

Patent number: 7630890

Abstract: A block-constrained Trellis coded quantization (TCQ) method and a method and apparatus for quantizing line spectral frequency (LSF) parameters employing the same in a speech coding system wherein the LSF coefficient quantizing method includes: removing the direct current (DC) component in an input LSF coefficient vector; generating a first prediction error vector by performing inter-frame and intra-frame prediction for the LSF coefficient vector, in which the DC component is removed, quantizing the first prediction error vector by using the BC-TCQ algorithm, and by performing intra-frame and inter-frame prediction compensation, generating a quantized first LSF coefficient vector; generating a second prediction error vector by performing intra-frame prediction for the LSF coefficient vector, in which the DC component is removed, quantizing the second prediction error vector by using the BC-TCQ algorithm, and then, by performing intra-frame prediction compensation, generating a quantized second LSF coefficient

Type: Grant

Filed: February 19, 2004

Date of Patent: December 8, 2009

Assignee: Samsung Electronics Co., Ltd.

Inventors: Chang-yong Son, Sang-won Kang, Yong-won Shin, Thomas R. Fischer

Detecting emotions using voice signal analysis

Patent number: 7627475

Abstract: A system and method are provided for detecting emotional states using statistics. First, a speech signal is received. At least one acoustic parameter is extracted from the speech signal. Then statistics or features from samples of the voice are calculated from extracted speech parameters. The features serve as inputs to a classifier, which can be a computer program, a device or both. The classifier assigns at least one emotional state from a finite number of possible emotional states to the speech signal. The classifier also estimates the confidence of its decision. Features that are calculated may include a maximum value of a fundamental frequency, a standard deviation of the fundamental frequency, a range of the fundamental frequency, a mean of the fundamental frequency, and a variety of other statistics.

Type: Grant

Filed: March 8, 2007

Date of Patent: December 1, 2009

Assignee: Accenture LLP

Inventor: Valery A. Petrushin
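The fundamental-frequency statistics listed in the abstract of patent 7627475 (maximum, standard deviation, range, and mean) can be computed with a few lines of Python. This sketch assumes the pitch track is already available as one value in Hz per frame, with 0 marking unvoiced frames. That convention, the function name `f0_features`, and the use of the population standard deviation are assumptions, not details from the patent.

```python
import statistics
from typing import Dict, Sequence


def f0_features(f0_track: Sequence[float]) -> Dict[str, float]:
    """Summarize a pitch track with the four statistics named in the abstract."""
    # Assumption: frames with a value of 0 are unvoiced and carry no pitch.
    voiced = [f0 for f0 in f0_track if f0 > 0]
    if not voiced:
        raise ValueError("pitch track contains no voiced frames")
    return {
        "max": max(voiced),
        "mean": statistics.fmean(voiced),
        "std": statistics.pstdev(voiced),  # population standard deviation
        "range": max(voiced) - min(voiced),
    }


# Hypothetical pitch track in Hz; zeros are unvoiced frames.
features = f0_features([0.0, 100.0, 200.0, 0.0, 150.0])
print(features)
```

A feature dictionary like this would then be passed to whatever classifier assigns the emotional state; the classifier itself is outside the scope of the sketch.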
System and method of supporting adaptive misrecognition in conversational speech

Patent number: 7620549

Abstract: A system and method are provided for receiving speech and/or non-speech communications of natural language questions and/or commands and executing the questions and/or commands. The invention provides a conversational human-machine interface that includes a conversational speech analyzer, a general cognitive model, an environmental model, and a personalized cognitive model to determine context, domain knowledge, and invoke prior information to interpret a spoken utterance or a received non-spoken message. The system and method creates, stores and uses extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech or non-speech communication and presenting the expected results for a particular question or command.

Type: Grant

Filed: August 10, 2005

Date of Patent: November 17, 2009

Assignee: VoiceBox Technologies, Inc.

Inventors: Philippe Di Cristo, Chris Weider, Robert A. Kennewick
