Patents Examined by Justin W. Rider

N-best list rescoring in speech recognition

Patent number: 7747437

Abstract: A method of speech recognition processing is described based on an N-best list of recognition hypotheses corresponding to a spoken input. Each hypothesis on the N-best list is rescored based on its rank in the rescored N-best list. The rescoring may be based on a Statistical Language Model (SLM) or Dynamic Semantic Model (DSM). One or more rescoring categories may be associated with each recognition hypothesis to affect or bias the rescoring.

Type: Grant

Filed: December 16, 2005

Date of Patent: June 29, 2010

Assignee: Nuance Communications, Inc.

Inventors: Jan Verhasselt, Helmut Dercks
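The rescoring idea in the abstract above can be illustrated with a minimal Python sketch. It is not the patented method: the `Hypothesis` class, the toy unigram table, the `lm_weight` parameter, and the additive `category_bias` are all assumptions made for illustration. The sketch combines a first-pass recognizer score with a weighted language-model score, applies an optional per-category bias, and re-sorts the list.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    recognizer_score: float  # assumed: log-domain score from the first pass
    category: str = ""       # assumed: optional rescoring category label


def rescore_nbest(hypotheses, lm_logprob, lm_weight=0.5, category_bias=None):
    """Re-rank hypotheses by recognizer score plus a weighted LM score."""
    category_bias = category_bias or {}
    rescored = []
    for hyp in hypotheses:
        total = hyp.recognizer_score + lm_weight * lm_logprob(hyp.text)
        # Hypothetical additive bias keyed by the hypothesis category.
        total += category_bias.get(hyp.category, 0.0)
        rescored.append((total, hyp))
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return [hyp for _, hyp in rescored]


# Toy unigram log-probabilities standing in for a statistical language model.
TOY_UNIGRAM_LOGPROB = {
    "recognize": -1.0, "speech": -1.0, "wreck": -4.0,
    "a": -0.5, "nice": -2.0, "beach": -3.0,
}


def toy_lm_logprob(text):
    # Unknown words get a fixed low log-probability.
    return sum(TOY_UNIGRAM_LOGPROB.get(word, -6.0) for word in text.split())


nbest = [
    Hypothesis("wreck a nice beach", -10.0),
    Hypothesis("recognize speech", -10.5),
]
reranked = rescore_nbest(nbest, toy_lm_logprob)
print([hyp.text for hyp in reranked])
```

In this toy case the second hypothesis starts with the lower recognizer score but moves to the top once the language-model term is added.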
System and method for updating information for various dialog modalities in a dialog scenario according to a semantic context

Patent number: 7742924

Abstract: The present invention provides a dialog system, a dialog execution method and a computer program which are capable of easily updating input information and output information of a dialog scenario and easily changing a plurality of modalities by using a general-purpose dialog scenario. In a dialog system that receives information from outside, controls the pursuit of dialog along the stored dialog scenario and outputs information along the dialog scenario to the outside, a dialog scenario written by using information for identifying the meaning of words/phrases used in the input information and the output information is stored, one or a plurality of words/phrases are stored in association with information for identifying the meaning of words/phrases, input information is analyzed, a corresponding word/phrase is extracted based on the derived information for identifying the meaning of words/phrases, and output information along a dialog scenario stored, based on the extracted word/phrase is outputted.

Type: Grant

Filed: September 30, 2004

Date of Patent: June 22, 2010

Assignee: Fujitsu Limited

Inventors: Ryosuke Miyata, Toshiyuki Fukuoka, Eiji Kitagawa
Method for searching fixed codebook based upon global pulse replacement

Patent number: 7739108

Abstract: The present research can decrease the amount of computation and enhance speech quality by using a global pulse replacement method in a fixed codebook search. The fixed codebook search method in a speech encoder based upon global pulse replacement includes the steps of: (a) computing absolute values of the pulse-position likelihood-estimator vectors; (b) temporarily obtaining a codebook vector; (c) computing a mathematical equation by replacing a pulse; (d) determining whether a value computed based upon the mathematical equation is increased after pulse replacement; (e) obtaining a new codebook vector by replacing the pulse; and (f) maintaining a previous codebook vector.

Type: Grant

Filed: December 17, 2003

Date of Patent: June 15, 2010

Assignee: Electronics and Telecommunications Research Institute

Inventors: Eung-Don Lee, Do-Young Kim
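Steps (a) through (f) of the abstract above can be sketched as a greedy search in Python. The abstract does not give the likelihood estimator or the "mathematical equation", so this sketch assumes the usual ACELP-style criterion, (d·c)² / (cᵀΦc), and uses the correlation vector `d` itself as the likelihood estimator. The function names, track layout, and sample data are hypothetical.

```python
def criterion(positions, d, phi):
    """Return (d.c)^2 / (c' phi c) for one signed pulse per track."""
    # Assumed sign rule: each pulse takes the sign of d at its position.
    signs = [1.0 if d[p] >= 0 else -1.0 for p in positions]
    corr = sum(s * d[p] for s, p in zip(signs, positions))
    energy = sum(
        si * sj * phi[pi][pj]
        for si, pi in zip(signs, positions)
        for sj, pj in zip(signs, positions)
    )
    return corr * corr / energy if energy > 0 else 0.0


def global_pulse_replacement(d, phi, tracks, max_iters=4):
    # (a), (b): initial codevector, the largest |d| position in each track.
    positions = [max(track, key=lambda p: abs(d[p])) for track in tracks]
    best_q = criterion(positions, d, phi)
    for _ in range(max_iters):
        best_candidate = None
        # (c): try replacing a single pulse, over all tracks and positions.
        for t, track in enumerate(tracks):
            for p in track:
                if p == positions[t]:
                    continue
                trial = positions[:t] + [p] + positions[t + 1:]
                q = criterion(trial, d, phi)
                # (d): keep the replacement that increases the criterion most.
                if q > best_q:
                    best_q, best_candidate = q, trial
        if best_candidate is None:
            break  # (f): no replacement helps, keep the previous codevector
        positions = best_candidate  # (e): adopt the new codevector
    return positions, best_q


# Hypothetical data: 8 positions in 2 interleaved tracks.
d = [0.9, -0.2, 0.1, 0.8, 0.85, 0.3, -0.4, 0.1]
phi = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
phi[0][3] = phi[3][0] = 0.9  # positions 0 and 3 are strongly correlated
tracks = [[0, 2, 4, 6], [1, 3, 5, 7]]

positions, q = global_pulse_replacement(d, phi, tracks)
print(positions, q)
```

With this data the initial pick (positions 0 and 3) is penalized by the large cross term in `phi`, so one replacement moves the first pulse to position 4 and the search then stops.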
Information transmission system and information transmission method

Patent number: 7739118

Abstract: In this information transmission system, a first user terminal accesses a business server through a communication network and transmits the body text of an electronic mail to it as voice information. The business server converts the voice information into text information, identifies the portions of that text to be converted into pictograms, and converts them. A user of the first user terminal recognizes a destination from the mail body text and transmits an electronic mail including the mixed pictogram/text body to, for example, a second user terminal. In a TV phone as well, the voice information is similarly converted into text information and pictograms, combined with an image, and displayed.

Type: Grant

Filed: June 1, 2005

Date of Patent: June 15, 2010

Assignee: NEC Corporation

Inventor: Kimio Ueno
Method and system for voice-enabled autofill

Patent number: 7739117

Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.

Type: Grant

Filed: September 20, 2004

Date of Patent: June 15, 2010

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
Speech recognition enhancer

Patent number: 7734472

Abstract: The invention concerns a speech recognition enhancer (51) and a speech recognition system comprising such speech recognition enhancer (51), an audio input unit (41) and a speech recognizer (61, 3). The speech recognition enhancer (51) is arranged between the audio input unit (41) and the speech recognizer (61, 3). The speech recognition enhancer (51) has a parametrizable pre-filtering unit (511), a parametrizable dynamic voice level control unit (512), a parametrizable noise reduction unit (513) and a parametrizable voice level control unit (514). The parameters of these parametrizable units (511, 512, 513, 514) are adjusted to the characteristics of the specific audio input unit (41) and/or the characteristics of the specific speech recognizer (61, 3) for adapting the audio input unit (41) to the speech recognizer (61, 3).

Type: Grant

Filed: September 29, 2004

Date of Patent: June 8, 2010

Assignee: Alcatel

Inventor: Michael Walker
Time asynchronous decoding for long-span trajectory model

Patent number: 7734460

Abstract: A time-asynchronous lattice-constrained search algorithm is developed and used to process a linguistic model of speech that has a long-contextual-span capability. In the algorithm, nodes and links in the lattices developed from the model are expanded via look-ahead. Heuristics as utilized by a search algorithm are estimated. Additionally, pruning strategies can be applied to speed up the search.

Type: Grant

Filed: December 20, 2005

Date of Patent: June 8, 2010

Assignee: Microsoft Corporation

Inventors: Dong Yu, Li Deng, Alejandro Acero
Method and system for interactive conversational dialogue for cognitively overloaded device users

Patent number: 7716056

Abstract: A system and method to interactively converse with a cognitively overloaded user of a device include maintaining a knowledge base of information regarding the device and a domain, organizing the information in at least one of a relational manner and an ontological manner, receiving speech from the user, converting the speech into a word sequence, recognizing a partial proper name in the word sequence, identifying meaning structures from the word sequence using a model of the domain information, adjusting a boundary of the partial proper name to enhance an accuracy of the meaning structures, interpreting the meaning structures in a context of the conversation with the cognitively overloaded user using the knowledge base, selecting a content for a response to the cognitively overloaded user, generating the response based on the selected content, the context of the conversation, and grammatical rules, and synthesizing speech wave forms for the response.

Type: Grant

Filed: September 27, 2004

Date of Patent: May 11, 2010

Assignees: Robert Bosch Corporation, Volkswagen of America

Inventors: Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Laura Hiatt, Hauke Schmidt, Alexander Gruenstein, Stanley Peters
Information transmission method and information transmission system in which content is varied in process of information transmission

Patent number: 7716053

Abstract: A system and method for transmitting messages among terminals with increasing uncertainty. According to one embodiment, players in a game exchange text messages via wireless computing devices. A first player selects from among a menu of fixed phrases and inserts words within the phrase to complete a message. When the message is received at a second wireless computing device, the message may be degraded, for example, by removing one or more words or reducing a reliability rating. As the message is passed to third and subsequent wireless computing devices, the message can be further degraded. By degrading the message, the curiosity of a player can be piqued, thus resulting in more engaging game play.

Type: Grant

Filed: October 27, 2006

Date of Patent: May 11, 2010

Assignee: Sega Corporation

Inventors: Noriyuki Shimoda, Wataru Nakanishi, Taku Kihara, Daichi Katagiri, Makoto Osaki
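The hop-by-hop degradation described in the abstract above might look like the following Python sketch. It covers only the "removing one or more words" variant, and the function names, the `___` placeholder, and the seeded random generator are assumptions for illustration, not details from the patent.

```python
import random


def degrade_message(words, rng, drop_count=1, placeholder="___"):
    """Return a copy of the message with up to drop_count more words blanked."""
    degraded = list(words)
    # Only words that are still intact can be removed on this hop.
    intact = [i for i, word in enumerate(degraded) if word != placeholder]
    for i in rng.sample(intact, min(drop_count, len(intact))):
        degraded[i] = placeholder
    return degraded


def relay(words, hops, seed=0):
    """Pass a message through several devices, degrading it at each hop."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    history = [list(words)]
    for _ in range(hops):
        history.append(degrade_message(history[-1], rng))
    return history


history = relay("the treasure is under the old bridge".split(), hops=3)
for hop, message in enumerate(history):
    print(hop, " ".join(message))
```

Each later device sees one more blank than the previous one, so the message grows less certain as it travels.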
Static analysis to identify defects in grammars

Patent number: 7711551

Abstract: The present invention provides static analysis of speech grammars prior to the speech grammars being deployed in a speech system.

Type: Grant

Filed: June 13, 2005

Date of Patent: May 4, 2010

Assignee: Microsoft Corporation

Inventors: Ricardo Lopez-Barquilla, Craig Campbell
Voice review of privacy policy in a mobile environment

Patent number: 7707036

Abstract: Systems and methods for providing aural review of a privacy policy are disclosed. Generally, a first version of a privacy policy is retrieved. A natural language version of the privacy policy is then retrieved based on at least one user preference and an audio representation of the natural language version of the privacy policy is played through an audio system of a device to a user.

Type: Grant

Filed: March 7, 2007

Date of Patent: April 27, 2010

Assignee: SBC Technology Resources Inc

Inventor: Lalitha Suryanaraya
Message transcription, voice query and query delivery system

Patent number: 7698140

Abstract: A message transmission system accepts a telephone call from a user who wishes to send an e-mail message, send an SMS message, perform an Internet query or retrieve his or her electronic mail. The voice call is transcribed and the message is sent, or the question in the voice call is transcribed and answered by an agent. Any number of agents connect to a central site over an Internet connection and transcribe messages or answer queries in an assembly line like fashion. In addition, a Web query delivery system accepts a query or statement from a user; the query is transcribed, classified, and then broadcast over any medium to any number of experts or web sites that desire to answer the particular type of query received. The entire query is delivered to an expert or web site who provides a full answer to the user.

Type: Grant

Filed: March 6, 2006

Date of Patent: April 13, 2010

Assignee: FoneWeb, Inc.

Inventors: Vinod K. Bhardwaj, Scott England, Dean Whitlock
Simultaneous audio decoding apparatus for plural compressed audio streams

Patent number: 7693722

Abstract: An audio decoding apparatus for decoding and reproducing a plurality of compressed audio streams simultaneously without sound interruption, even when the number of samples per frame is different. The audio decoding apparatus includes: a first and second audio decoder which decode two inputted compressed audio streams, and output audio data; a first and second intermediate buffer which temporarily hold the outputted audio data; a first and second audio output unit which convert the audio data into audio signals and output such audio signals; an output control unit which reads the audio data from the first and second intermediate buffer, and transmits the audio data to the first and second audio output unit. The output control unit repeats the reading and transmission of either the same number of samples of audio data or the number of samples of audio data for the same amount of transmission time.

Type: Grant

Filed: June 29, 2005

Date of Patent: April 6, 2010

Assignee: Panasonic Corporation

Inventors: Hideyuki Kakuno, Masahiro Sueyoshi, Kosuke Nishio
Hybrid multi-channel/cue coding/decoding of audio signals

Patent number: 7693721

Abstract: Part of the spectrum of two or more input signals is encoded using conventional coding techniques, while encoding the rest of the spectrum using binaural cue coding (BCC). In BCC coding, spectral components of the input signals are downmixed and BCC parameters (e.g., inter-channel level and/or time differences) are generated. In a stereo implementation, after converting the left and right channels to the frequency domain, pairs of left- and right-channel spectral components are downmixed to mono. The mono components are then converted back to the time domain, along with those left- and right-channel spectral components that were not downmixed, to form hybrid stereo signals, which can then be encoded using conventional coding techniques. For playback, the encoded bitstream is decoded using conventional decoding techniques. BCC synthesis techniques may then apply the BCC parameters to synthesize an auditory scene based on the mono components as well as the unmixed stereo components.

Type: Grant

Filed: December 10, 2007

Date of Patent: April 6, 2010

Assignee: Agere Systems Inc.

Inventors: Frank Baumgarte, Peter Kroon
Techniques for disambiguating speech input using multimodal interfaces

Patent number: 7684985

Abstract: A technique is disclosed for disambiguating speech input for multimodal systems by using a combination of speech and visual I/O interfaces. When the user's speech input is not recognized with sufficiently high confidence, the user is presented with a set of possible matches using a visual display and/or speech output. The user then selects the intended input from the list of matches via one or more available input mechanisms (e.g., stylus, buttons, keyboard, mouse, or speech input). These techniques involve the combined use of speech and visual interfaces to correctly identify the user's speech input. The techniques disclosed herein may be utilized in computer devices such as PDAs, cellphones, desktop and laptop computers, tablet PCs, etc.

Type: Grant

Filed: December 10, 2003

Date of Patent: March 23, 2010

Inventors: Richard Dominach, Sastry Isukapalli, Sandeep Sibal, Shirish Vaidya
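The confidence-based fallback in the abstract above can be sketched in a few lines of Python. The function name, the `threshold` value, and the `max_choices` limit are hypothetical; the abstract says only that a set of possible matches is presented when confidence is not sufficiently high.

```python
def resolve_speech_input(candidates, threshold=0.8, max_choices=3):
    """Accept the top match, or return a short list for the user to pick from.

    candidates: list of (text, confidence) pairs from a recognizer (assumed).
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if not ranked:
        return ("ask_user", [])
    top_text, top_conf = ranked[0]
    if top_conf >= threshold:
        return ("accept", top_text)
    # Low confidence: hand the best few matches to the visual/speech UI.
    return ("ask_user", [text for text, _ in ranked[:max_choices]])


print(resolve_speech_input([("Boston", 0.93), ("Austin", 0.05)]))
print(resolve_speech_input([("Austin", 0.40), ("Boston", 0.55), ("Houston", 0.05)]))
```

The caller would render the `ask_user` list on screen or read it aloud, then take the user's selection through whichever input mechanism is available.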
Interactive robot, speech recognition method and computer program product

Patent number: 7680667

Abstract: An interactive robot capable of speech recognition includes a sound-source-direction estimating unit that estimates a direction of a sound source for target voices which are required to undergo speech recognition; a moving unit that moves the interactive robot in the sound-source direction; a target-voice acquiring unit that acquires the target voices at a position after moving; and a speech recognizing unit that performs speech recognition of the target voices.

Type: Grant

Filed: December 20, 2005

Date of Patent: March 16, 2010

Assignee: Kabushiki Kaisha Toshiba

Inventors: Takafumi Sonoura, Kaoru Suzuki
Prosthetic hearing device that transforms a detected speech into a speech of a speech form assistive in understanding the semantic meaning in the detected speech

Patent number: 7676372

Abstract: A speech transformation apparatus comprises a microphone 21 for detecting speech and generating a speech signal; a signal processor 22 for performing a speech recognition process using the speech signal; a speech information generator for transforming the recognition result responsive to the physical state of the user, the operating conditions, and/or the purpose for using the apparatus; and a display unit 26 and loudspeaker 25 for generating a control signal for outputting a raw recognition result and/or a transformed recognition result. In a speech transformation apparatus thus constituted, speech enunciated by a spoken-language-impaired individual can be transformed and presented to the user, and sounds from outside sources can also be transformed and presented to the user.

Type: Grant

Filed: February 16, 2000

Date of Patent: March 9, 2010

Assignee: Yugen Kaisha GM&M

Inventor: Toshihiko Oba
Method and system for FFT-based companding for automatic speech recognition

Patent number: 7672842

Abstract: A method and system processes a speech signal. A fast Fourier transform is performed on a speech signal to produce a speech signal having a plurality of frequency bands in a frequency domain.

Type: Grant

Filed: July 26, 2006

Date of Patent: March 2, 2010

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Bhiksha Ramakrishnan, Bent Schmidt-Nielsen, Lorenzo Turicchia, Rahul Sarpeshkar
Language processing device, language processing method and language processing program

Patent number: 7664631

Abstract: According to an aspect of the present invention, a language processing device has a text input section and an anaphora analysis section. The text input section acquires text data. The anaphora analysis section analyzes whether a correct anaphora relation is included in the text data acquired by the text input section. The correct anaphora relation has an anaphor and an antecedent corresponding to the anaphor.

Type: Grant

Filed: December 20, 2005

Date of Patent: February 16, 2010

Assignee: Fuji Xerox Co., Ltd.

Inventors: Daigo Sugihara, Hiroshi Masuichi, Shunichi Kimura, Katsuhiko Itonori, Hideaki Ashikaga, Hiroki Yoshimura, Masanori Onda, Masahiro Kato, Masanori Satake
Method and system for detecting voice activity based on cross-correlation

Patent number: 7653537

Abstract: A system and method are provided for determining whether a data frame of a coded speech signal corresponds to voice or to noise. In one embodiment, a voice activity detector determines a cross-correlation of data. If the cross-correlation is lower than a predetermined cross-correlation value, then the data frame corresponds to noise. If not, then the voice activity detector determines a periodicity of the cross-correlation and a variance of the periodicity. If the variance is less than a predetermined variance value, then the data frame corresponds to voice. In another embodiment, a method determines energy of the data frame and an average energy of the coded speech signal. If the data frame is one of a predetermined number of initial data frames, then a comparison of the average energy with the energy of the data frame is used to determine whether the data frame is noise or voice.

Type: Grant

Filed: September 28, 2004

Date of Patent: January 26, 2010

Assignee: STMicroelectronics Asia Pacific Pte. Ltd.

Inventors: Kabi Prakash Padhi, Sapna George
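The first embodiment in the abstract above (correlation threshold, then variance of the periodicity) can be sketched in Python under several assumptions. The abstract does not say which data are cross-correlated, so this sketch correlates each sub-frame of raw samples with lagged copies of the same frame. The sub-frame length, lag range, both thresholds, and the "noise" result when the variance is high are all hypothetical choices.

```python
import math
from statistics import pvariance


def normalized_xcorr(a, b):
    """Normalized cross-correlation of two equal-length sample lists."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def classify_frame(frame, sub_len=40, lags=range(20, 61),
                   corr_threshold=0.7, var_threshold=4.0):
    peaks, peak_lags = [], []
    last_start = len(frame) - sub_len - max(lags)
    for start in range(0, last_start + 1, sub_len):
        ref = frame[start:start + sub_len]
        # Best-matching lag for this sub-frame (its estimated period).
        peak, lag = max(
            (normalized_xcorr(ref, frame[start + k:start + k + sub_len]), k)
            for k in lags
        )
        peaks.append(peak)
        peak_lags.append(lag)
    # Step 1: weak correlation means noise.
    if not peaks or sum(peaks) / len(peaks) < corr_threshold:
        return "noise"
    # Step 2: a stable period (low variance of the peak lags) means voice.
    # Assumption: a high variance is treated as noise here.
    return "voice" if pvariance(peak_lags) < var_threshold else "noise"


# A 240-sample tone with a 40-sample period, and a silent frame.
tone = [math.sin(2 * math.pi * n / 40) for n in range(240)]
silence = [0.0] * 240
print(classify_frame(tone), classify_frame(silence))
```

A production detector would work on decoded or parameter-domain data from the coded stream and would tune the thresholds empirically; the sketch shows only the two-stage decision structure.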
