Patents Examined by Justin W. Rider
-
Patent number: 7747437
Abstract: A method of speech recognition processing is described based on an N-best list of recognition hypotheses corresponding to a spoken input. Each hypothesis on the N-best list is rescored based on its rank in the rescored N-best list. The rescoring may be based on a Statistical Language Model (SLM) or Dynamic Semantic Model (DSM). One or more rescoring categories may be associated with each recognition hypothesis to affect or bias the rescoring.
Type: Grant
Filed: December 16, 2005
Date of Patent: June 29, 2010
Assignee: Nuance Communications, Inc.
Inventors: Jan Verhasselt, Helmut Dercks
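As a rough illustration of N-best rescoring with a language model, the sketch below re-sorts a list of hypotheses by recognizer score plus a weighted language-model score. It is not the patented method: the function names, the toy unigram model standing in for an SLM, the interpolation weight and the example scores are all assumptions made for illustration.

```python
import math

# Hypothetical unigram probabilities standing in for a statistical language model.
UNIGRAM = {"recognize": 0.4, "speech": 0.4, "wreck": 0.05,
           "a": 0.1, "nice": 0.025, "beach": 0.025}

def toy_lm_logprob(text):
    """Log-probability of a word sequence under the toy unigram model."""
    return sum(math.log(UNIGRAM.get(word, 1e-4)) for word in text.split())

def rescore_nbest(nbest, lm_logprob, lm_weight=0.5):
    """Re-sort (text, score) pairs by score + lm_weight * LM log-probability."""
    rescored = [(text, score + lm_weight * lm_logprob(text))
                for text, score in nbest]
    return sorted(rescored, key=lambda item: item[1], reverse=True)

# Assumed recognizer output: the first hypothesis has the better raw score.
nbest = [("wreck a nice beach", -10.0), ("recognize speech", -10.5)]
reranked = rescore_nbest(nbest, toy_lm_logprob)
print(reranked[0][0])
```

With these made-up numbers the language-model term outweighs the small difference in recognizer score, so the second hypothesis moves to the top of the list.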
-
Patent number: 7742924
Abstract: The present invention provides a dialog system, a dialog execution method and a computer program which are capable of easily updating input information and output information of a dialog scenario and easily changing a plurality of modalities by using a general-purpose dialog scenario. In a dialog system that receives information from outside, controls the pursuit of dialog along the stored dialog scenario and outputs information along the dialog scenario to the outside, a dialog scenario written by using information for identifying the meaning of words/phrases used in the input information and the output information is stored, one or a plurality of words/phrases are stored in association with information for identifying the meaning of words/phrases, input information is analyzed, a corresponding word/phrase is extracted based on the derived information for identifying the meaning of words/phrases, and output information along a dialog scenario stored, based on the extracted word/phrase is outputted.
Type: Grant
Filed: September 30, 2004
Date of Patent: June 22, 2010
Assignee: Fujitsu Limited
Inventors: Ryosuke Miyata, Toshiyuki Fukuoka, Eiji Kitagawa
-
Patent number: 7739117
Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
Type: Grant
Filed: September 20, 2004
Date of Patent: June 15, 2010
Assignees: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7739108
Abstract: The present research can decrease the amount of computation and enhance speech quality by using a global pulse replacement method in a fixed codebook search. The fixed codebook search method in a speech encoder based upon global pulse replacement, includes the steps of: (a) computing absolute values of the pulse-position likelihood-estimator vectors; (b) temporarily obtaining a codebook vector; (c) computing a mathematical equation by replacing a pulse; (d) determining whether a value computed based upon the mathematical equation is increased after pulse replacement; (e) obtaining a new codebook vector by replacing the pulse; and (f) maintaining a previous codebook vector.
Type: Grant
Filed: December 17, 2003
Date of Patent: June 15, 2010
Assignee: Electronics and Telecommunications Research Institute
Inventors: Eung-Don Lee, Do-Young Kim
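Steps (a) through (f) can be sketched loosely as follows. The abstract does not say which "mathematical equation" is evaluated, so this sketch assumes the criterion commonly used in algebraic codebook searches, (d^T c)^2 / (c^T Phi c), where d is a correlation vector and Phi holds the correlations of the impulse response. It also assumes one pulse per track, uses |d| as a stand-in for the likelihood estimator, and takes each pulse sign from d. All names and example values are hypothetical.

```python
def autocorrelation_matrix(h, n):
    """phi[i][j] = sum_k h[k-i] * h[k-j] for an impulse response h of length n."""
    return [[sum(h[k - i] * h[k - j] for k in range(max(i, j), n))
             for j in range(n)] for i in range(n)]

def criterion(positions, signs, d, phi):
    """Assumed search criterion: (d^T c)^2 / (c^T Phi c) for signed unit pulses."""
    corr = sum(s * d[p] for p, s in zip(positions, signs))
    energy = sum(si * sj * phi[pi][pj]
                 for pi, si in zip(positions, signs)
                 for pj, sj in zip(positions, signs))
    return corr * corr / energy

def global_pulse_replacement(d, phi, tracks, max_iters=4):
    # (a) absolute values of the likelihood estimator (assumed here to be |d|)
    likelihood = [abs(x) for x in d]
    sign = [1 if x >= 0 else -1 for x in d]
    # (b) provisional codevector: the most likely position on each track
    positions = [max(track, key=lambda n: likelihood[n]) for track in tracks]
    best = criterion(positions, [sign[p] for p in positions], d, phi)
    for _ in range(max_iters):
        best_trial = None
        # (c) evaluate the criterion with one pulse moved within its track
        for t, track in enumerate(tracks):
            for n in track:
                if n == positions[t]:
                    continue
                trial = positions[:t] + [n] + positions[t + 1:]
                q = criterion(trial, [sign[p] for p in trial], d, phi)
                # (d) remember the replacement that increases the criterion most
                if q > best:
                    best, best_trial = q, trial
        if best_trial is None:
            break                   # (f) no improvement: keep the previous vector
        positions = best_trial      # (e) adopt the new codevector
    return positions, best

# Hypothetical 8-sample subframe with two interleaved tracks.
h = [1.0, 0.6, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0]
d = [0.2, -0.9, 0.4, 0.1, -0.3, 0.8, 0.05, -0.6]
tracks = [[0, 2, 4, 6], [1, 3, 5, 7]]
phi = autocorrelation_matrix(h, len(d))
positions, value = global_pulse_replacement(d, phi, tracks)
print(positions, value)
```

Searching all single-pulse replacements and accepting only the best one per pass keeps the loop short compared with an exhaustive search over every pulse combination, which is the kind of saving the abstract refers to.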
-
Patent number: 7739118
Abstract: In this information transmission system, a first user terminal gains access to a business server through a communication network, to transmit a body text of an electronic mail in voice information there. The business server converts the voice information into text information, discriminates the information to be converted into a pictogram, from this information, and further converts it into a pictogram. A user of the first user terminal recognizes a destination from the mail body text and transmits an electronic mail including the pictogram/text mixed body text to, for example, a second user terminal. Also in a TV phone, the voice information is similarly converted into text information and pictogram and combined with an image, and displayed.
Type: Grant
Filed: June 1, 2005
Date of Patent: June 15, 2010
Assignee: NEC Corporation
Inventor: Kimio Ueno
-
Patent number: 7734460
Abstract: A time-asynchronous lattice-constrained search algorithm is developed and used to process a linguistic model of speech that has a long-contextual-span capability. In the algorithm, nodes and links in the lattices developed from the model are expanded via look-ahead. Heuristics as utilized by a search algorithm are estimated. Additionally, pruning strategies can be applied to speed up the search.
Type: Grant
Filed: December 20, 2005
Date of Patent: June 8, 2010
Assignee: Microsoft Corporation
Inventors: Dong Yu, Li Deng, Alejandro Acero
-
Patent number: 7734472
Abstract: The invention concerns a speech recognition enhancer (51) and a speech recognition system comprising such speech recognition enhancer (51), an audio input unit (41) and a speech recognizer (61, 3). The speech recognition enhancer (51) is arranged between the audio input unit (41) and the speech recognizer (61, 3). The speech recognition enhancer (51) has a parametrizable pre-filtering unit (511), a parametrizable dynamic voice level control unit (512), a parametrizable noise reduction unit (513) and a parametrizable voice level control unit (514). The parameters of these parametrizable units (511, 512, 513, 514) are adjusted to the characteristics of the specific audio input unit (41) and/or the characteristics of the specific speech recognizer (61, 3) for adapting the audio input unit (41) to the speech recognizer (61, 3).
Type: Grant
Filed: September 29, 2004
Date of Patent: June 8, 2010
Assignee: Alcatel
Inventor: Michael Walker
-
Patent number: 7716056
Abstract: A system and method to interactively converse with a cognitively overloaded user of a device, includes maintaining a knowledge base of information regarding the device and a domain, organizing the information in at least one of a relational manner and an ontological manner, receiving speech from the user, converting the speech into a word sequence, recognizing a partial proper name in the word sequence, identifying meaning structures from the word sequence using a model of the domain information, adjusting a boundary of the partial proper names to enhance an accuracy of the meaning structures, interpreting the meaning structures in a context of the conversation with the cognitively overloaded user using the knowledge base, selecting a content for a response to the cognitively overloaded user, generating the response based on the selected content, the context of the conversation, and grammatical rules, and synthesizing speech wave forms for the response.
Type: Grant
Filed: September 27, 2004
Date of Patent: May 11, 2010
Assignees: Robert Bosch Corporation, Volkswagen of America
Inventors: Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Laura Hiatt, Hauke Schmidt, Alexander Gruenstein, Stanley Peters
-
Patent number: 7716053
Abstract: A system and method for transmitting messages among terminals with increasing uncertainty. According to one embodiment, players in a game exchange text messages via wireless computing devices. A first player selects from among a menu of fixed phrases and inserts words within the phrase to complete a message. When the message is received at a second wireless computing device, the message may be degraded by, for example, removing one or more words or reducing a reliability rating. As the message is passed to third and subsequent wireless computer devices, the message can be further degraded. By degrading the message, the curiosity of a player can be piqued, thus resulting in more engaging game play.
Type: Grant
Filed: October 27, 2006
Date of Patent: May 11, 2010
Assignee: Sega Corporation
Inventors: Noriyuki Shimoda, Wataru Nakanishi, Taku Kihara, Daichi Katagiri, Makoto Osaki
-
Patent number: 7711551
Abstract: The present invention provides static analysis of speech grammars prior to the speech grammars being deployed in a speech system.
Type: Grant
Filed: June 13, 2005
Date of Patent: May 4, 2010
Assignee: Microsoft Corporation
Inventors: Ricardo Lopez-Barquilla, Craig Campbell
-
Patent number: 7707036
Abstract: Systems and methods for providing aural review of a privacy policy are disclosed. Generally, a first version of a privacy policy is retrieved. A natural language version of the privacy policy is then retrieved based on at least one user preference and an audio representation of the natural language version of the privacy policy is played through an audio system of a device to a user.
Type: Grant
Filed: March 7, 2007
Date of Patent: April 27, 2010
Assignee: SBC Technology Resources Inc
Inventor: Lalitha Suryanaraya
-
Patent number: 7698140
Abstract: A message transmission system accepts a telephone call from a user who wishes to send an e-mail message, send an SMS message, perform an Internet query or retrieve his or her electronic mail. The voice call is transcribed and the message is sent, or the question in the voice call is transcribed and answered by an agent. Any number of agents connect to a central site over an Internet connection and transcribe messages or answer queries in an assembly line like fashion. In addition, a Web query delivery system accepts a query or statement from a user; the query is transcribed, classified, and then broadcast over any medium to any number of experts or web sites that desire to answer the particular type of query received. The entire query is delivered to an expert or web site who provides a full answer to the user.
Type: Grant
Filed: March 6, 2006
Date of Patent: April 13, 2010
Assignee: FoneWeb, Inc.
Inventors: Vinod K. Bhardwaj, Scott England, Dean Whitlock
-
Patent number: 7693721
Abstract: Part of the spectrum of two or more input signals is encoded using conventional coding techniques, while encoding the rest of the spectrum using binaural cue coding (BCC). In BCC coding, spectral components of the input signals are downmixed and BCC parameters (e.g., inter-channel level and/or time differences) are generated. In a stereo implementation, after converting the left and right channels to the frequency domain, pairs of left- and right-channel spectral components are downmixed to mono. The mono components are then converted back to the time domain, along with those left- and right-channel spectral components that were not downmixed, to form hybrid stereo signals, which can then be encoded using conventional coding techniques. For playback, the encoded bitstream is decoded using conventional decoding techniques. BCC synthesis techniques may then apply the BCC parameters to synthesize an auditory scene based on the mono components as well as the unmixed stereo components.
Type: Grant
Filed: December 10, 2007
Date of Patent: April 6, 2010
Assignee: Agere Systems Inc.
Inventors: Frank Baumgarte, Peter Kroon
-
Patent number: 7693722
Abstract: An audio decoding apparatus for decoding and reproducing a plurality of compressed audio streams simultaneously without sound interruption, even when the number of samples per frame is different. The audio decoding apparatus includes: a first and second audio decoder which decode two inputted compressed audio streams, and output audio data; a first and second intermediate buffer which temporarily hold the outputted audio data; a first and second audio output unit which convert the audio data into audio signals and output such audio signals; an output control unit which reads the audio data from the first and second intermediate buffer, and transmits the audio data to the first and second audio output unit. The output control unit repeats the reading and transmission of either the same number of samples of audio data or the number of samples of audio data for the same amount of transmission time.
Type: Grant
Filed: June 29, 2005
Date of Patent: April 6, 2010
Assignee: Panasonic Corporation
Inventors: Hideyuki Kakuno, Masahiro Sueyoshi, Kosuke Nishio
-
Patent number: 7684985
Abstract: A technique is disclosed for disambiguating speech input for multimodal systems by using a combination of speech and visual I/O interfaces. When the user's speech input is not recognized with sufficiently high confidence, the user is presented with a set of possible matches using a visual display and/or speech output. The user then selects the intended input from the list of matches via one or more available input mechanisms (e.g., stylus, buttons, keyboard, mouse, or speech input). These techniques involve the combined use of speech and visual interfaces to correctly identify the user's speech input. The techniques disclosed herein may be utilized in computer devices such as PDAs, cellphones, desktop and laptop computers, tablet PCs, etc.
Type: Grant
Filed: December 10, 2003
Date of Patent: March 23, 2010
Inventors: Richard Dominach, Sastry Isukapalli, Sandeep Sibal, Shirish Vaidya
-
Patent number: 7680667
Abstract: An interactive robot capable of speech recognition includes a sound-source-direction estimating unit that estimates a direction of a sound source for target voices which are required to undergo speech recognition; a moving unit that moves the interactive robot in the sound-source direction; a target-voice acquiring unit that acquires the target voices at a position after moving; and a speech recognizing unit that performs speech recognition of the target voices.
Type: Grant
Filed: December 20, 2005
Date of Patent: March 16, 2010
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takafumi Sonoura, Kaoru Suzuki
-
Patent number: 7676372
Abstract: A speech transformation apparatus comprises a microphone 21 for detecting speech and generating a speech signal; a signal processor 22 for performing a speech recognition process using the speech signal; a speech information generator for transforming the recognition result responsive to the physical state of the user, the operating conditions, and/or the purpose for using the apparatus; and a display unit 26 and loudspeaker 25 for generating a control signal for outputting a raw recognition result and/or a transformed recognition result. In a speech transformation apparatus thus constituted, speech enunciated by a spoken-language-impaired individual can be transformed and presented to the user, and sounds from outside sources can also be transformed and presented to the user.
Type: Grant
Filed: February 16, 2000
Date of Patent: March 9, 2010
Assignee: Yugen Kaisha GM&M
Inventor: Toshihiko Oba
-
Patent number: 7672842
Abstract: A method and system processes a speech signal. A fast Fourier transform is performed on a speech signal to produce a speech signal having a plurality of frequency bands in a frequency domain.
Type: Grant
Filed: July 26, 2006
Date of Patent: March 2, 2010
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Bhiksha Ramakrishnan, Bent Schmidt-Nielsen, Lorenzo Turicchia, Rahul Sarpeshkar
-
Patent number: 7664631
Abstract: According to an aspect of the present invention, a language processing device has a text input section and an anaphora analysis section. The text input section acquires text data. The anaphora analysis section analyzes whether a correct anaphora relation is included in the text data acquired by the text input section. The correct anaphora relation has an anaphor and an antecedent corresponding to the anaphor.
Type: Grant
Filed: December 20, 2005
Date of Patent: February 16, 2010
Assignee: Fuji Xerox Co., Ltd.
Inventors: Daigo Sugihara, Hiroshi Masuichi, Shunichi Kimura, Katsuhiko Itonori, Hideaki Ashikaga, Hiroki Yoshimura, Masanori Onda, Masahiro Kato, Masanori Satake
-
Patent number: 7653537
Abstract: A system and method is provided for determining whether a data frame of a coded speech signal corresponds to voice or to noise. In one embodiment, a voice activity detector determines a cross-correlation of data. If the cross-correlation is lower than a predetermined cross-correlation value, then the data frame corresponds to noise. If not, then the voice activity detector determines a periodicity of the cross-correlation and a variance of the periodicity. If the variance is less than a predetermined variance value, then the data frame corresponds to voice. In another embodiment, a method determines energy of the data frame and an average energy of the coded speech signal. If the data frame is one of a predetermined number of initial data frames, then a comparison between the average energy and the energy of the data frame is used to determine whether the data frame is noise or voice.
Type: Grant
Filed: September 28, 2004
Date of Patent: January 26, 2010
Assignee: STMicroelectronics Asia Pacific Pte. Ltd.
Inventors: Kabi Prakash Padhi, Sapna George
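The first embodiment's decision flow can be sketched roughly as below. The abstract gives no formulas, so several choices here are assumptions: the "cross-correlation of data" is taken to be the normalized correlation of a frame with delayed copies of itself, "periodicity" is measured as the spacing between correlation peaks, and the thresholds, frame length and function names are hypothetical. The abstract also does not state the outcome when the variance is not below its threshold; this sketch returns "noise" in that case.

```python
import math
from statistics import pvariance

def normalized_correlation(frame, max_lag):
    """Correlation of the frame with itself at lags 1..max_lag, divided by frame energy."""
    energy = sum(x * x for x in frame) or 1.0
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame))) / energy
            for lag in range(1, max_lag + 1)]

def classify_frame(frame, max_lag=80, corr_threshold=0.5, var_threshold=4.0):
    corr = normalized_correlation(frame, max_lag)
    # Step 1: weak correlation everywhere means the frame is treated as noise.
    if max(corr) < corr_threshold:
        return "noise"
    # Step 2: locate correlation peaks (as lags) and measure the gaps between them.
    peaks = [i + 1 for i in range(1, len(corr) - 1)
             if corr[i] >= corr_threshold
             and corr[i] > corr[i - 1] and corr[i] >= corr[i + 1]]
    gaps = [later - earlier for earlier, later in zip(peaks, peaks[1:])]
    if len(gaps) < 2:
        return "noise"  # assumption: too few peaks to estimate a variance
    # Step 3: evenly spaced peaks (low variance) indicate a periodic, voiced frame.
    return "voice" if pvariance(gaps) < var_threshold else "noise"

# Synthetic frames: a tone with a 20-sample period and a single click.
tone = [math.sin(2 * math.pi * n / 20) for n in range(160)]
click = [1.0] + [0.0] * 159
print(classify_frame(tone), classify_frame(click))
```

The tone produces correlation peaks at regular 20-sample intervals, so the gap variance is near zero, while the click has no correlation at any non-zero lag and is rejected at the first threshold.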