Patents Examined by Justin W. Rider
  • Patent number: 7747437
    Abstract: A method of speech recognition processing is described based on an N-best list of recognition hypotheses corresponding to a spoken input. Each hypothesis on the N-best list is rescored based on its rank in the N-best list. The rescoring may be based on a Statistical Language Model (SLM) or Dynamic Semantic Model (DSM). One or more rescoring categories may be associated with each recognition hypothesis to affect or bias the rescoring.
    Type: Grant
    Filed: December 16, 2005
    Date of Patent: June 29, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Jan Verhasselt, Helmut Dercks
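The rescoring idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the combination formula, the weights, and the toy language model (`TOY_LM`, `toy_lm`) are assumptions made for illustration only. Each hypothesis keeps its recognizer score, gains a language-model log-probability, and loses a penalty that grows with its rank in the incoming N-best list.

```python
from math import log

def rescore_nbest(nbest, lm_logprob, rank_weight=0.5, lm_weight=1.0):
    """Re-rank an N-best list by a combined score (illustrative formula).

    nbest: list of (text, recognizer_score) pairs, best hypothesis first.
    lm_logprob: callable mapping text to a log-probability (assumed interface).
    """
    rescored = []
    for rank, (text, score) in enumerate(nbest, start=1):
        # Rank-based penalty: log(1) = 0, so the top entry is not penalized.
        rank_penalty = rank_weight * log(rank)
        new_score = score + lm_weight * lm_logprob(text) - rank_penalty
        rescored.append((text, new_score))
    # A higher combined score is better.
    return sorted(rescored, key=lambda item: item[1], reverse=True)

# Hypothetical toy language model standing in for an SLM.
TOY_LM = {"recognize speech": -1.0, "wreck a nice beach": -6.0}

def toy_lm(text):
    return TOY_LM.get(text, -10.0)

nbest = [("wreck a nice beach", -2.0), ("recognize speech", -2.5)]
best_text, best_score = rescore_nbest(nbest, toy_lm)[0]
print(best_text)
```

In this toy input the second hypothesis overtakes the first because its language-model score outweighs the small rank penalty.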
  • Patent number: 7742924
    Abstract: The present invention provides a dialog system, a dialog execution method and a computer program which are capable of easily updating input information and output information of a dialog scenario and easily changing a plurality of modalities by using a general-purpose dialog scenario. In a dialog system that receives information from outside, controls the pursuit of dialog along the stored dialog scenario and outputs information along the dialog scenario to the outside, a dialog scenario written by using information for identifying the meaning of words/phrases used in the input information and the output information is stored, one or a plurality of words/phrases are stored in association with information for identifying the meaning of words/phrases, input information is analyzed, a corresponding word/phrase is extracted based on the derived information for identifying the meaning of words/phrases, and output information along a dialog scenario stored, based on the extracted word/phrase is outputted.
    Type: Grant
    Filed: September 30, 2004
    Date of Patent: June 22, 2010
    Assignee: Fujitsu Limited
    Inventors: Ryosuke Miyata, Toshiyuki Fukuoka, Eiji Kitagawa
  • Patent number: 7739118
    Abstract: In this information transmission system, a first user terminal gains access to a business server through a communication network, to transmit a body text of an electronic mail in voice information there. The business server converts the voice information into text information, discriminates the information to be converted into a pictogram, from this information, and further converts it into a pictogram. A user of the first user terminal recognizes a destination from the mail body text and transmits an electronic mail including the pictogram/text mixed body text to, for example, a second user terminal. Also in a TV phone, the voice information is similarly converted into text information and pictogram and combined with an image, and displayed.
    Type: Grant
    Filed: June 1, 2005
    Date of Patent: June 15, 2010
    Assignee: NEC Corporation
    Inventor: Kimio Ueno
  • Patent number: 7739117
    Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
    Type: Grant
    Filed: September 20, 2004
    Date of Patent: June 15, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 7739108
    Abstract: The present research can decrease the amount of computation and enhance speech quality by using a global pulse replacement method in a fixed codebook search. The fixed codebook search method in a speech encoder based upon global pulse replacement, includes the steps of: (a) computing absolute values of the pulse-position likelihood-estimator vectors; (b) temporarily obtaining a codebook vector; (c) computing a mathematical equation by replacing a pulse; (d) determining whether a value computed based upon the mathematical equation is increased after pulse replacement; (e) obtaining a new codebook vector by replacing the pulse; and (f) maintaining a previous codebook vector.
    Type: Grant
    Filed: December 17, 2003
    Date of Patent: June 15, 2010
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Eung-Don Lee, Do-Young Kim
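Steps (a) through (f) of the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented encoder: one pulse per interleaved track, pulse signs taken from the likelihood estimator, and the usual algebraic-codebook criterion Q = C²/E standing in for the "mathematical equation". The inputs `b` (estimator), `d` (correlation vector) and `phi` (correlation matrix) are hypothetical test data.

```python
def criterion(positions, signs, d, phi):
    """Q = C^2 / E for a set of signed unit pulses."""
    c = sum(signs[p] * d[p] for p in positions)
    e = sum(signs[p] * signs[q] * phi[p][q]
            for p in positions for q in positions)
    return c * c / e

def global_pulse_replacement(b, d, phi, n_tracks, max_iters=4):
    n = len(b)
    signs = [1.0 if x >= 0 else -1.0 for x in b]
    mags = [abs(x) for x in b]                                   # step (a)
    tracks = [range(t, n, n_tracks) for t in range(n_tracks)]
    # Step (b): provisional codevector, strongest position in each track.
    positions = [max(tr, key=lambda p: mags[p]) for tr in tracks]
    best_q = criterion(positions, signs, d, phi)
    for _ in range(max_iters):
        best_move = None
        # Search every single-pulse replacement across all tracks.
        for t, tr in enumerate(tracks):
            for p in tr:
                if p == positions[t]:
                    continue
                trial = positions[:t] + [p] + positions[t + 1:]
                q = criterion(trial, signs, d, phi)              # step (c)
                if q > best_q:                                   # step (d)
                    best_q, best_move = q, trial
        if best_move is None:
            break                                                # step (f)
        positions = best_move                                    # step (e)
    return positions, best_q

# Hypothetical data: 8 positions, 2 tracks, a positive-definite phi.
d = [0.1, -0.9, 0.8, 0.2, -0.3, 0.7, 0.05, -0.4]
phi = [[0.5 ** abs(i - j) for j in range(8)] for i in range(8)]
positions, q = global_pulse_replacement(d, d, phi, n_tracks=2)
print(positions, q)
```

The loop accepts only the single best replacement per iteration and stops as soon as no replacement raises Q, which is what keeps the search cheaper than an exhaustive scan of all pulse combinations.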
  • Patent number: 7734472
    Abstract: The invention concerns a speech recognition enhancer (51) and a speech recognition system comprising such speech recognition enhancer (51), an audio input unit (41) and a speech recognizer (61, 3). The speech recognition enhancer (51) is arranged between the audio input unit (41) and the speech recognizer (61, 3). The speech recognition enhancer (51) has a parametrizable pre-filtering unit (511), a parametrizable dynamic voice level control unit (512), a parametrizable noise reduction unit (513) and a parametrizable voice level control unit (514). The parameters of these parametrizable units (511, 512, 513, 514) are adjusted to the characteristics of the specific audio input unit (41) and/or the characteristics of the specific speech recognizer (61, 3) for adapting the audio input unit (41) to the speech recognizer (61, 3).
    Type: Grant
    Filed: September 29, 2004
    Date of Patent: June 8, 2010
    Assignee: Alcatel
    Inventor: Michael Walker
  • Patent number: 7734460
    Abstract: A time-asynchronous lattice-constrained search algorithm is developed and used to process a linguistic model of speech that has a long-contextual-span capability. In the algorithm, nodes and links in the lattices developed from the model are expanded via look-ahead. Heuristics as utilized by a search algorithm are estimated. Additionally, pruning strategies can be applied to speed up the search.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: June 8, 2010
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Li Deng, Alejandro Acero
  • Patent number: 7716053
    Abstract: A system and method for transmitting messages among terminals with increasing uncertainty. According to one embodiment, players in a game exchange text messages via wireless computing devices. A first player selects from among a menu of fixed phrases and inserts words within the phrase to complete a message. When the message is received at a second wireless computing device, the message may be degraded by, for example, removing one or more words or reducing a reliability rating. As the message is passed to third and subsequent wireless computing devices, the message can be further degraded. By degrading the message, the curiosity of a player can be piqued, thus resulting in more engaging game play.
    Type: Grant
    Filed: October 27, 2006
    Date of Patent: May 11, 2010
    Assignee: Sega Corporation
    Inventors: Noriyuki Shimoda, Wataru Nakanishi, Taku Kihara, Daichi Katagiri, Makoto Osaki
  • Patent number: 7716056
    Abstract: A system and method to interactively converse with a cognitively overloaded user of a device include maintaining a knowledge base of information regarding the device and a domain, organizing the information in at least one of a relational manner and an ontological manner, receiving speech from the user, converting the speech into a word sequence, recognizing a partial proper name in the word sequence, identifying meaning structures from the word sequence using a model of the domain information, adjusting a boundary of the partial proper name to enhance an accuracy of the meaning structures, interpreting the meaning structures in a context of the conversation with the cognitively overloaded user using the knowledge base, selecting a content for a response to the cognitively overloaded user, generating the response based on the selected content, the context of the conversation, and grammatical rules, and synthesizing speech wave forms for the response.
    Type: Grant
    Filed: September 27, 2004
    Date of Patent: May 11, 2010
    Assignees: Robert Bosch Corporation, Volkswagen of America
    Inventors: Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Laura Hiatt, Hauke Schmidt, Alexander Gruenstein, Stanley Peters
  • Patent number: 7711551
    Abstract: The present invention provides static analysis of speech grammars prior to the speech grammars being deployed in a speech system.
    Type: Grant
    Filed: June 13, 2005
    Date of Patent: May 4, 2010
    Assignee: Microsoft Corporation
    Inventors: Ricardo Lopez-Barquilla, Craig Campbell
  • Patent number: 7707036
    Abstract: Systems and methods for providing aural review of a privacy policy are disclosed. Generally, a first version of a privacy policy is retrieved. A natural language version of the privacy policy is then retrieved based on at least one user preference and an audio representation of the natural language version of the privacy policy is played through an audio system of a device to a user.
    Type: Grant
    Filed: March 7, 2007
    Date of Patent: April 27, 2010
    Assignee: SBC Technology Resources Inc
    Inventor: Lalitha Suryanaraya
  • Patent number: 7698140
    Abstract: A message transmission system accepts a telephone call from a user who wishes to send an e-mail message, send an SMS message, perform an Internet query or retrieve his or her electronic mail. The voice call is transcribed and the message is sent, or the question in the voice call is transcribed and answered by an agent. Any number of agents connect to a central site over an Internet connection and transcribe messages or answer queries in an assembly-line-like fashion. In addition, a Web query delivery system accepts a query or statement from a user; the query is transcribed, classified, and then broadcast over any medium to any number of experts or web sites that desire to answer the particular type of query received. The entire query is delivered to an expert or web site that provides a full answer to the user.
    Type: Grant
    Filed: March 6, 2006
    Date of Patent: April 13, 2010
    Assignee: FoneWeb, Inc.
    Inventors: Vinod K. Bhardwaj, Scott England, Dean Whitlock
  • Patent number: 7693722
    Abstract: An audio decoding apparatus for decoding and reproducing a plurality of compressed audio streams simultaneously without sound interruption, even when the number of samples per frame is different. The audio decoding apparatus includes: a first and second audio decoder which decode two inputted compressed audio streams, and output audio data; a first and second intermediate buffer which temporarily hold the outputted audio data; a first and second audio output unit which convert the audio data into audio signals and output such audio signals; an output control unit which reads the audio data from the first and second intermediate buffer, and transmits the audio data to the first and second audio output unit. The output control unit repeats the reading and transmission of either the same number of samples of audio data or the number of samples of audio data for the same amount of transmission time.
    Type: Grant
    Filed: June 29, 2005
    Date of Patent: April 6, 2010
    Assignee: Panasonic Corporation
    Inventors: Hideyuki Kakuno, Masahiro Sueyoshi, Kosuke Nishio
  • Patent number: 7693721
    Abstract: Part of the spectrum of two or more input signals is encoded using conventional coding techniques, while encoding the rest of the spectrum using binaural cue coding (BCC). In BCC coding, spectral components of the input signals are downmixed and BCC parameters (e.g., inter-channel level and/or time differences) are generated. In a stereo implementation, after converting the left and right channels to the frequency domain, pairs of left- and right-channel spectral components are downmixed to mono. The mono components are then converted back to the time domain, along with those left- and right-channel spectral components that were not downmixed, to form hybrid stereo signals, which can then be encoded using conventional coding techniques. For playback, the encoded bitstream is decoded using conventional decoding techniques. BCC synthesis techniques may then apply the BCC parameters to synthesize an auditory scene based on the mono components as well as the unmixed stereo components.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: April 6, 2010
    Assignee: Agere Systems Inc.
    Inventors: Frank Baumgarte, Peter Kroon
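The hybrid split described in the abstract above can be sketched on spectral components that are already in the frequency domain. This is an illustration only: the function name `bcc_downmix`, the per-bin (rather than per-band) level difference, and the simple averaging downmix are assumptions, not details taken from the patent.

```python
import math

def bcc_downmix(left, right, cutoff):
    """Keep bins below `cutoff` as stereo; downmix the rest to mono.

    left, right: lists of complex spectral components of equal length.
    Returns (left_low, right_low, mono_high, level_diff_db).
    """
    mono, level_diff_db = [], []
    for l, r in zip(left[cutoff:], right[cutoff:]):
        # Downmix the pair to a single mono component.
        mono.append((l + r) / 2)
        # Inter-channel level difference in dB, kept as a BCC parameter.
        pl, pr = abs(l) ** 2, abs(r) ** 2
        if pl > 0 and pr > 0:
            level_diff_db.append(10 * math.log10(pl / pr))
        else:
            level_diff_db.append(0.0)  # assumed fallback for silent bins
    return left[:cutoff], right[:cutoff], mono, level_diff_db

# Hypothetical three-bin spectra; only the last bin is downmixed.
left = [1 + 0j, 2 + 0j, 4 + 0j]
right = [1 + 0j, 2 + 0j, 2 + 0j]
low_l, low_r, mono, ild = bcc_downmix(left, right, cutoff=2)
print(mono, ild)
```

A decoder would use the stored level differences to redistribute each mono component between the two output channels, while the untouched low bins pass through as ordinary stereo.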
  • Patent number: 7684985
    Abstract: A technique is disclosed for disambiguating speech input for multimodal systems by using a combination of speech and visual I/O interfaces. When the user's speech input is not recognized with sufficiently high confidence, the user is presented with a set of possible matches using a visual display and/or speech output. The user then selects the intended input from the list of matches via one or more available input mechanisms (e.g., stylus, buttons, keyboard, mouse, or speech input). These techniques involve the combined use of speech and visual interfaces to correctly identify the user's speech input. The techniques disclosed herein may be utilized in computer devices such as PDAs, cellphones, desktop and laptop computers, tablet PCs, etc.
    Type: Grant
    Filed: December 10, 2003
    Date of Patent: March 23, 2010
    Inventors: Richard Dominach, Sastry Isukapalli, Sandeep Sibal, Shirish Vaidya
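The confidence-based fallback in the abstract above can be sketched as a small decision function. The threshold value and the `choose` callback, which stands in for the visual list and the user's selection, are hypothetical names introduced for this example.

```python
def resolve_input(candidates, choose, threshold=0.8):
    """Return the recognized text, asking the user when confidence is low.

    candidates: list of (text, confidence) pairs in any order.
    choose: callable that receives the ranked option texts and returns
            the one the user picked (assumed interface).
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_text, best_conf = ranked[0]
    if best_conf >= threshold:
        # Confident enough: accept the top match without asking.
        return best_text
    # Otherwise present all matches and let the user decide.
    return choose([text for text, _ in ranked])

# Hypothetical recognizer output and a user who picks the second option.
candidates = [("call Bob", 0.55), ("call Rob", 0.60)]
picked = resolve_input(candidates, choose=lambda options: options[1])
print(picked)
```

Keeping the selection step behind a callback means the same logic works whether the choice arrives by stylus, button, keyboard, or a second utterance.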
  • Patent number: 7680667
    Abstract: An interactive robot capable of speech recognition includes a sound-source-direction estimating unit that estimates a direction of a sound source for target voices which are required to undergo speech recognition; a moving unit that moves the interactive robot in the sound-source direction; a target-voice acquiring unit that acquires the target voices at a position after moving; and a speech recognizing unit that performs speech recognition of the target voices.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: March 16, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takafumi Sonoura, Kaoru Suzuki
  • Patent number: 7676372
    Abstract: A speech transformation apparatus comprises a microphone 21 for detecting speech and generating a speech signal; a signal processor 22 for performing a speech recognition process using the speech signal; a speech information generator for transforming the recognition result responsive to the physical state of the user, the operating conditions, and/or the purpose for using the apparatus; and a display unit 26 and loudspeaker 25 for generating a control signal for outputting a raw recognition result and/or a transformed recognition result. In a speech transformation apparatus thus constituted, speech enunciated by a spoken-language-impaired individual can be transformed and presented to the user, and sounds from outside sources can also be transformed and presented to the user.
    Type: Grant
    Filed: February 16, 2000
    Date of Patent: March 9, 2010
    Assignee: Yugen Kaisha GM&M
    Inventor: Toshihiko Oba
  • Patent number: 7672842
    Abstract: A method and system processes a speech signal. A fast Fourier transform is performed on a speech signal to produce a speech signal having a plurality of frequency bands in a frequency domain.
    Type: Grant
    Filed: July 26, 2006
    Date of Patent: March 2, 2010
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Bhiksha Ramakrishnan, Bent Schmidt-Nielsen, Lorenzo Turicchia, Rahul Sarpeshkar
  • Patent number: 7664631
    Abstract: According to an aspect of the present invention, a language processing device has a text input section and an anaphora analysis section. The text input section acquires text data. The anaphora analysis section analyzes whether a correct anaphora relation is included in the text data acquired by the text input section. The correct anaphora relation has an anaphor and an antecedent corresponding to the anaphor.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: February 16, 2010
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Daigo Sugihara, Hiroshi Masuichi, Shunichi Kimura, Katsuhiko Itonori, Hideaki Ashikaga, Hiroki Yoshimura, Masanori Onda, Masahiro Kato, Masanori Satake
  • Patent number: 7653537
    Abstract: A system and method are provided for determining whether a data frame of a coded speech signal corresponds to voice or to noise. In one embodiment, a voice activity detector determines a cross-correlation of data. If the cross-correlation is lower than a predetermined cross-correlation value, then the data frame corresponds to noise. If not, then the voice activity detector determines a periodicity of the cross-correlation and a variance of the periodicity. If the variance is less than a predetermined variance value, then the data frame corresponds to voice. In another embodiment, a method determines energy of the data frame and an average energy of the coded speech signal. If the data frame is one of a predetermined number of initial data frames, then a comparison of the average energy to the energy of the data frame is used to determine whether the data frame is noise or voice.
    Type: Grant
    Filed: September 28, 2004
    Date of Patent: January 26, 2010
    Assignee: STMicroelectronics Asia Pacific Pte. Ltd.
    Inventors: Kabi Prakash Padhi, Sapna George
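The first embodiment in the abstract above (correlation test, then variance of the periodicity) can be sketched as follows. This is a simplified illustration, not the patented detector: it works on raw samples rather than coded-speech parameters, measures periodicity as the best correlation lag in each sub-frame, and all thresholds and the sub-frame count are assumed values. The abstract does not say what happens when the variance is too high; this sketch treats that case as noise.

```python
import math
import random
from statistics import pvariance

def best_lag(x, min_lag, max_lag):
    """Return (lag, normalized correlation) of the strongest peak."""
    best = (min_lag, -1.0)
    for lag in range(min_lag, max_lag + 1):
        a, b = x[:-lag], x[lag:]
        num = sum(u * v for u, v in zip(a, b))
        den = math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))
        corr = num / den if den > 0 else 0.0
        if corr > best[1]:
            best = (lag, corr)
    return best

def classify_frame(frame, n_sub=4, min_lag=20, max_lag=60,
                   corr_min=0.5, var_max=4.0):
    size = len(frame) // n_sub
    subs = [frame[i * size:(i + 1) * size] for i in range(n_sub)]
    peaks = [best_lag(s, min_lag, max_lag) for s in subs]
    # Weak correlation everywhere: classify the frame as noise.
    if max(corr for _, corr in peaks) < corr_min:
        return "noise"
    # Stable periodicity across sub-frames suggests voiced speech.
    lags = [lag for lag, _ in peaks]
    return "voice" if pvariance(lags) < var_max else "noise"

# Hypothetical inputs: a 40-sample-period tone and seeded white noise.
voiced = [math.sin(2 * math.pi * n / 40) for n in range(640)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(640)]
print(classify_frame(voiced), classify_frame(noise))
```

The two-stage test is cheap: most noise frames are rejected by the correlation threshold alone, so the variance is only examined for frames that already look periodic.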