Patents Examined by Talivaldis Ivars Smit
  • Patent number: 8255220
    Abstract: A device, a method, and a medium for establishing a language model for speech recognition are disclosed. The language-model-establishing device includes: a schema expander for expanding a state schema which is composed of at least one state defined by a finite state grammar using a general grammar database; a grammatical-structure-expander for expanding grammatical structures which can be expressed by each state of the expanded state schema using the general grammar database; and a grammatical-structure-filter for filtering out any incorrect grammatical structure from the expanded grammatical structures using the general grammar database. Since the state schema is expanded using the general grammar database, it is possible to improve recognition of unlearned grammatical structures.
    Type: Grant
    Filed: October 11, 2006
    Date of Patent: August 28, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jeong-mi Cho, Byung-kwan Kwak
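The expand-then-filter pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the grammar database contents, state names, and validity table are all invented stand-ins.

```python
# Sketch of the expand-then-filter pipeline: each state in the schema is
# expanded into candidate grammatical structures via a general grammar
# database, then structures flagged as ungrammatical are filtered out.
# All data below is illustrative, not from the patent.

GENERAL_GRAMMAR = {
    "REQUEST": ["VERB NOUN", "VERB DET NOUN", "NOUN VERB"],
    "CONFIRM": ["YES", "NO", "YES VERB"],
}

VALID_STRUCTURES = {"VERB NOUN", "VERB DET NOUN", "YES", "NO"}

def expand_schema(states):
    """Expand each state into the structures the grammar database allows."""
    return {state: GENERAL_GRAMMAR.get(state, []) for state in states}

def filter_structures(expanded):
    """Drop structures the grammar database flags as ungrammatical."""
    return {state: [s for s in structs if s in VALID_STRUCTURES]
            for state, structs in expanded.items()}

expanded = expand_schema(["REQUEST", "CONFIRM"])
model = filter_structures(expanded)
```

Because the expansion draws on a general grammar rather than only on learned utterances, structures never seen in training can still enter the model, which is the recognition benefit the abstract claims.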
  • Patent number: 8249878
    Abstract: A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes a first portion of the speech stream and, if a predetermined criterion is satisfied by the speech recognition result, waits until the speech recognizer has been reconfigured before recognizing a second portion of the speech stream. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.
    Type: Grant
    Filed: August 2, 2011
    Date of Patent: August 21, 2012
    Assignee: Multimodal Technologies, LLC
    Inventors: Eric Carraux, Detlef Koll
  • Patent number: 8249879
    Abstract: Disclosed is a system and method for training a spoken dialog service component from website data. Spoken dialog service components typically include an automatic speech recognition module, a language understanding module, a dialog management module, a language generation module and a text-to-speech module. The method includes converting data from a structured database associated with a website to a structured text data set and a structured task knowledge base, extracting linguistic items from the structured database, and training a spoken dialog service component using at least one of the structured text data, the structured task knowledge base, or the linguistic items. The system includes modules configured to implement the method.
    Type: Grant
    Filed: November 7, 2011
    Date of Patent: August 21, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, Junlan Feng, Mazin G. Rahim
  • Patent number: 8249866
    Abstract: A speech decoding method which generates an excitation signal and a synthesis filter from coded data and which obtains a speech signal based on the excitation signal and the synthesis filter. The method includes acquiring identification information used for determining whether the speech signal to be decoded is a narrowband signal or a wideband signal; and modifying the excitation signal based on the identification information by controlling strength or presence of emphasis of pitch periodicity with respect to the excitation signal generated from the coded data, so as to generate the speech signal by use of the modified excitation signal and the synthesis filter.
    Type: Grant
    Filed: March 31, 2010
    Date of Patent: August 21, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Kimio Miseki
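The band-dependent modification described above can be sketched as below. The emphasis gain, the one-pitch-period delayed-copy form of the emphasis, and the sample values are assumptions for illustration only.

```python
# Sketch: modify an excitation signal by emphasizing pitch periodicity
# only when the identification information marks the signal as narrowband.
# The emphasis here adds a scaled copy of the signal delayed by one pitch
# period; the gain value is illustrative.

def modify_excitation(excitation, pitch_period, is_narrowband):
    """Return excitation with pitch emphasis controlled by the band flag."""
    gain = 0.5 if is_narrowband else 0.0  # emphasis disabled for wideband
    out = list(excitation)
    for n in range(pitch_period, len(out)):
        out[n] = excitation[n] + gain * excitation[n - pitch_period]
    return out

signal = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
narrow = modify_excitation(signal, 3, True)
wide = modify_excitation(signal, 3, False)
```

The modified excitation would then drive the synthesis filter to produce the decoded speech.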
  • Patent number: 8244529
    Abstract: A method is provided for multi-pass echo residue detection. The method includes detecting audio data, and determining whether the audio data is recognized as speech. Additionally, the method categorizes the audio data recognized as speech as including an acceptable level of residual echo, and categorizes unrecognizable audio data as including an unacceptable level of residual echo. Furthermore, the method determines whether the unrecognizable audio data contains a user input, and also determines whether a duration of the user input is at least a predetermined duration; when the user input is at least the predetermined duration, the method extracts the predetermined duration of the user input from a total duration of the user input.
    Type: Grant
    Filed: September 20, 2011
    Date of Patent: August 14, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Ngai Chiu Wong
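The categorization flow above can be sketched as a simple decision function. The threshold value is invented, recognition is stubbed out as a boolean input, and "extracting the predetermined duration from the total duration" is read here as a subtraction, which is one plausible interpretation.

```python
# Sketch of the multi-pass categorization: recognized audio is treated as
# having acceptable residual echo; unrecognized audio is checked for a user
# input of at least a predetermined duration, which is then extracted from
# the input's total duration. All values are illustrative.

PREDETERMINED_DURATION = 2.0  # seconds; hypothetical threshold

def categorize(recognized_as_speech, is_user_input, input_duration):
    """Return (echo_category, remaining_duration_after_extraction)."""
    if recognized_as_speech:
        return ("acceptable", None)
    if is_user_input and input_duration >= PREDETERMINED_DURATION:
        # extract the predetermined duration from the total duration
        return ("unacceptable", input_duration - PREDETERMINED_DURATION)
    return ("unacceptable", None)
```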
  • Patent number: 8239196
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: July 28, 2011
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventor: Marco Paniconi
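The per-bin classification and cross-channel fusion can be sketched as follows. Averaging the channel probabilities is a simplistic stand-in for the layered network model in the abstract, and the probability grids and threshold are invented.

```python
# Sketch: per time/frequency-bin speech probabilities are estimated for
# each input channel, then fused across channels (simple averaging here)
# to yield a speech/noise label per bin.

def classify_bins(channel_probs, threshold=0.5):
    """channel_probs: list of per-channel [time][freq] probability grids."""
    n_time = len(channel_probs[0])
    n_freq = len(channel_probs[0][0])
    fused = [[0.0] * n_freq for _ in range(n_time)]
    for grid in channel_probs:          # fuse data across input channels
        for t in range(n_time):
            for f in range(n_freq):
                fused[t][f] += grid[t][f] / len(channel_probs)
    return [["speech" if p >= threshold else "noise" for p in row]
            for row in fused]

ch1 = [[0.9, 0.2], [0.6, 0.1]]   # e.g. microphone 1
ch2 = [[0.7, 0.4], [0.2, 0.3]]   # e.g. microphone 2
labels = classify_bins([ch1, ch2])
```

The resulting labels would feed the noise-spectrum estimate used by the suppressor.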
  • Patent number: 8239202
    Abstract: A method and system for audibly outputting text messages include: setting a vocalizing function for audibly outputting text messages, searching a character speech library for each character of a received text message, and acquiring pronunciation data of each character of the received text message. The method and the system further include vocalizing the pronunciation data of each character of the received text message, generating a voice message, and audibly outputting the generated voice message.
    Type: Grant
    Filed: December 23, 2008
    Date of Patent: August 7, 2012
    Assignee: Chi Mei Communication Systems, Inc.
    Inventor: Chi-Ming Hsiao
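The per-character lookup can be sketched as below; the library contents and the concatenation of pronunciation units into a "voice message" string are illustrative assumptions.

```python
# Sketch: look up each character of a text message in a character speech
# library, concatenate the pronunciation data, and return the resulting
# voice message. The library entries are invented for illustration.

CHARACTER_SPEECH_LIBRARY = {
    "h": "hh", "i": "iy", " ": "_",
}

def vocalize(text):
    """Build a voice message from per-character pronunciation data."""
    units = []
    for ch in text:
        pron = CHARACTER_SPEECH_LIBRARY.get(ch)
        if pron is not None:          # skip characters with no entry
            units.append(pron)
    return " ".join(units)

message = vocalize("hi")
```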
  • Patent number: 8239188
    Abstract: A translation apparatus is provided that includes a bilingual example sentence dictionary that stores plural example sentences in a first language and plural example sentences in a second language being translation of the plural example sentences, an input unit that inputs an input sentence in the first language, a first search unit that searches whether the input sentence matches any of the plural example sentences in the first language, a second search unit that searches for at least one example sentence candidate that is similar to the input sentence from the plural example sentences in the first language, when a matching example sentence is not found in the first search unit, and an output unit that outputs an example sentence in the second language that is translation of an example sentence searched in the first search unit or the example sentence candidate searched in the second search unit.
    Type: Grant
    Filed: March 28, 2007
    Date of Patent: August 7, 2012
    Assignee: Fuji Xerox Co., Ltd.
    Inventor: Shaoming Liu
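The two-stage lookup above can be sketched as follows. `difflib.SequenceMatcher` is used here as a stand-in for whatever similarity measure the patent contemplates, and the example-sentence pairs and cutoff ratio are invented.

```python
# Sketch: first search for an exact match against the stored first-language
# example sentences; if none is found, search for a similar example
# sentence and output its second-language translation.
import difflib

EXAMPLES = {
    "where is the station": "eki wa doko desu ka",
    "how much is this": "kore wa ikura desu ka",
}

def translate(sentence, min_ratio=0.6):
    if sentence in EXAMPLES:                       # first search unit
        return EXAMPLES[sentence]
    best, best_ratio = None, 0.0                   # second search unit
    for example in EXAMPLES:
        ratio = difflib.SequenceMatcher(None, sentence, example).ratio()
        if ratio > best_ratio:
            best, best_ratio = example, ratio
    if best is not None and best_ratio >= min_ratio:
        return EXAMPLES[best]
    return None

exact = translate("where is the station")
similar = translate("where is a station")   # no exact match; similar hit
```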
  • Patent number: 8239194
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventor: Marco Paniconi
  • Patent number: 8234120
    Abstract: The present invention discloses a solution for assuring user-defined voice commands are unambiguous. The solution can include a step of identifying a user attempt to enter a user-defined voice command into a voice-enabled system. A safety analysis can be performed on the user-defined voice command to determine a likelihood that the user-defined voice command will be confused with preexisting voice commands recognized by the voice-enabled system. When a high likelihood of confusion is determined by the safety analysis, a notification can be presented that the user-defined voice command is subject to confusion. A user can then define a different voice command or can choose to continue to use the potentially confusing command, possibly subject to a system imposed confusion mitigating condition or action.
    Type: Grant
    Filed: July 26, 2006
    Date of Patent: July 31, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Ciprian Agapi, Oscar J. Blass, Brennan D. Monteiro, Roberto Vila
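The safety analysis can be sketched as a similarity check against the preexisting commands. String similarity via `difflib` is a crude stand-in for acoustic or phonetic confusability, and the command lists and threshold are invented.

```python
# Sketch: estimate how likely a user-defined command is to be confused
# with preexisting commands; a high similarity score triggers the
# "subject to confusion" notification path.
import difflib

def safety_analysis(new_command, existing_commands, threshold=0.8):
    """Return (is_risky, most_similar_existing_command)."""
    best, best_score = None, 0.0
    for cmd in existing_commands:
        score = difflib.SequenceMatcher(None, new_command, cmd).ratio()
        if score > best_score:
            best, best_score = cmd, score
    return (best_score >= threshold, best)

risky, clash = safety_analysis("call home", ["call phone", "play music"])
safe, _ = safety_analysis("open garage", ["call phone", "play music"])
```

When `risky` is true, the system would notify the user, who could then choose a different command or keep the risky one under a mitigating condition.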
  • Patent number: 8229752
    Abstract: Systems and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a method may include conducting the voice interaction between the agent and a client, wherein the agent follows the script via a plurality of panels. From there, the voice interaction is evaluated via the plurality of panels, employing panel-by-panel playback with an automatic speech recognition component adapted to analyze the voice interaction. Whether the agent has adequately followed the script may then be determined by generating a score using the confidence level thresholds of the automatic speech recognition component, with thresholds assigned to each of the plurality of panels, and by evaluating the score against at least one of a static standard and a varying standard.
    Type: Grant
    Filed: April 26, 2010
    Date of Patent: July 24, 2012
    Assignee: West Corporation
    Inventors: Mark J Pettay, Jill M Vacek
  • Patent number: 8224651
    Abstract: A speech recognition and control system including a sound card for receiving speech and converting the speech into digital data, the sound card removably connected to an input of a computer, recognizer software executing on the computer for interpreting at least a portion of the digital data, event detection software executing on the computer for detecting connectivity of the sound card, and command control software executing on the computer for generating a command based on at least one of the digital data and the connectivity of the sound card.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: July 17, 2012
    Assignee: Storz Endoskop Produktions GmbH
    Inventors: Gang Wang, Chengyi Zheng, Heinz-Werner Stiller, Matteo Contolini
  • Patent number: 8219388
    Abstract: The sensation of presence in voice chat in a virtual space is enhanced. A user speech synthesizer is used in a virtual space sharing system in which information processing devices share the virtual space. The user speech synthesizer comprises a speech data acquiring section (60) for acquiring speech data representing a speech uttered by the user of one of the information processing devices, an environment sound storage section (66) for storing an environment sound associated with one or more regions defined in the virtual space, a region specifying section (64) for specifying a region corresponding to the user in the virtual space, and an environment sound synthesizing section (68) for acquiring the environment sound associated with the specified region from the environment sound storage section (66) and combining the acquired environment sound with the speech data to produce synthesized speech data.
    Type: Grant
    Filed: June 7, 2006
    Date of Patent: July 10, 2012
    Assignee: Konami Digital Entertainment Co., Ltd.
    Inventors: Hiromasa Kaneko, Masaki Takeuchi
  • Patent number: 8219406
    Abstract: A multi-modal human computer interface (HCI) receives a plurality of available information inputs concurrently, or serially, and employs a subset of the inputs to determine or infer user intent with respect to a communication or information goal. Received inputs are respectively parsed, and the parsed inputs are analyzed and optionally synthesized with respect to one or more of each other. In the event sufficient information is not available to determine user intent or goal, feedback can be provided to the user in order to facilitate clarifying, confirming, or augmenting the information inputs.
    Type: Grant
    Filed: March 15, 2007
    Date of Patent: July 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Li Deng
  • Patent number: 8219403
    Abstract: In the case of an incoming call, one of a plurality of different types of hardware platforms is selected and allocated to the incoming call, if possible, based on initial signaling information and load criteria. If such an allocation cannot be provided, the allocation is attempted based on other signaling information following the initial signaling information. If the allocation still cannot be provided, a relevant voice page is requested from a storage device and a pre-analysis is performed, during which the requests included in the page are determined and allocation of the browser function is attempted based on that determination; if still no allocation can be achieved, a universally usable browser functionality is allocated.
    Type: Grant
    Filed: January 12, 2007
    Date of Patent: July 10, 2012
    Assignee: Nokia Siemens Networks GmbH & Co. KG
    Inventors: Detlev Freund, Norbert Löbig
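The fallback chain above can be sketched as a sequence of allocation attempts. The platform table, load criterion, and request names are invented stubs; only the ordering of attempts follows the abstract.

```python
# Sketch of the fallback chain: try to allocate a platform from the initial
# signaling plus load criteria, then from later signaling, then from a
# pre-analysis of the voice page's requests, and finally fall back to a
# universally usable browser.

def allocate(initial_info, later_info, voice_page_requests, platforms):
    for key in (initial_info, later_info):       # signaling-based attempts
        if key in platforms and platforms[key]["load"] < 1.0:
            return key
    for req in voice_page_requests:              # pre-analysis of voice page
        if req in platforms and platforms[req]["load"] < 1.0:
            return req
    return "universal-browser"                   # last-resort allocation

platforms = {"asr": {"load": 1.0}, "dtmf": {"load": 0.2}}
chosen = allocate("asr", "unknown", ["dtmf"], platforms)
fallback = allocate("asr", "unknown", [], platforms)
```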
  • Patent number: 8219397
    Abstract: A method, system, and computer program product for autonomously transcribing and building tagging data of a conversation. A corpus processing agent monitors a conversation and utilizes a speech recognition agent to identify the spoken languages, speakers, and emotional patterns of speakers of the conversation. While monitoring the conversation, the corpus processing agent determines emotional patterns by monitoring voice modulation of the speakers and evaluating the context of the conversation. When the conversation is complete, the corpus processing agent determines synonyms and paraphrases of spoken words and phrases of the conversation taking into consideration any localized dialect of the speakers. Additionally, metadata of the conversation is created and stored in a link database, for comparison with other processed conversations. A corpus, a transcription of the conversation containing metadata links, is then created.
    Type: Grant
    Filed: June 10, 2008
    Date of Patent: July 10, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Peeyush Jaiswal, Vikram S. Khatri, Naveen Narayan, Burt Vialpando
  • Patent number: 8219381
    Abstract: A storing unit stores therein dictionary information in which a first text in a first language is associated with a second text that is a translation of the first text into a second language. An extracting unit extracts, when an input text includes an unregistered text that is not registered as the first text in the dictionary information, the unregistered text from the input text. A translating unit translates an input similar text that expresses the unregistered text with a different text, into the second language. A registering unit registers the unregistered text in association with the translated similar text in the dictionary information.
    Type: Grant
    Filed: March 26, 2007
    Date of Patent: July 10, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Noriko Yamanaka
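The extract-translate-register loop can be sketched as below. The dictionary, the paraphrase table standing in for the "similar text" input, and word-level granularity are all assumptions for illustration.

```python
# Sketch: detect words of an input text missing from the dictionary,
# translate a similar registered expression for each, and register the
# result so later inputs hit the dictionary directly. All table contents
# are invented.

DICTIONARY = {"hello": "bonjour", "world": "monde"}
PARAPHRASES = {"hi": "hello"}  # unregistered word -> similar registered text

def translate_and_register(text):
    out = []
    for word in text.split():
        if word not in DICTIONARY:                      # unregistered text
            similar = PARAPHRASES.get(word)
            if similar in DICTIONARY:
                DICTIONARY[word] = DICTIONARY[similar]  # register it
        out.append(DICTIONARY.get(word, word))
    return " ".join(out)

first = translate_and_register("hi world")
second = translate_and_register("hi world")   # now a direct dictionary hit
```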
  • Patent number: 8214207
    Abstract: Provided are, among other things, systems, methods and techniques for quantizing a joint-channel-encoded audio signal, e.g., by: identifying a target quantization unit for reduction of quantization step size based on quantization errors; determining whether the target quantization unit has been jointly sum/difference encoded with another quantization unit; if the target quantization unit has been jointly sum/difference encoded with another quantization unit, then (i) designating the sum or difference channel quantization unit as a target S/D quantization unit based on which has a greater quantization error and (ii) re-quantizing the target S/D quantization unit using a decreased quantization step size; recalculating the quantization error for the target quantization unit; and repeating the process until a specified criterion is satisfied.
    Type: Grant
    Filed: August 23, 2011
    Date of Patent: July 3, 2012
    Assignee: Digital Rise Technology Co., Ltd.
    Inventor: Yuli You
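The outer refinement loop can be sketched as below. The sum/difference-channel coupling of the patent is omitted; this shows only the generic pattern of repeatedly halving the step size of the worst-error unit until a tolerance is met, with invented values throughout.

```python
# Sketch of the iterative refinement: quantize each unit, find the unit
# with the largest quantization error, shrink its step size, and repeat
# until every error falls under a tolerance.

def quantize(value, step):
    return round(value / step) * step

def refine(values, steps, tol=0.05, max_iters=100):
    for _ in range(max_iters):
        errors = [abs(v - quantize(v, s)) for v, s in zip(values, steps)]
        worst = max(range(len(values)), key=lambda i: errors[i])
        if errors[worst] <= tol:          # stopping criterion satisfied
            break
        steps[worst] /= 2.0               # decrease the quantization step
    return steps

steps = refine([0.33, 0.71], [0.5, 0.5])
```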
  • Patent number: 8214213
    Abstract: A system and method for performing speech recognition are disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations, and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.
    Type: Grant
    Filed: April 27, 2006
    Date of Patent: July 3, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Andrej Ljolje
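The idea of folding pronunciation probabilities into the language model can be sketched as below. The word identifiers, pronunciations, and probabilities are invented, and multiplying word and pronunciation probabilities is a simplifying assumption.

```python
# Sketch: each unique word identifier in the language model carries
# P(pronunciation | word) alongside its word probability, so a hypothesis
# can be scored jointly. All probabilities are illustrative.

LM = {
    # word id -> (word probability, {pronunciation: probability})
    "read:1": (0.6, {"r eh d": 0.7, "r iy d": 0.3}),
    "red:1":  (0.4, {"r eh d": 1.0}),
}

def score(word_id, pron):
    word_prob, prons = LM[word_id]
    return word_prob * prons.get(pron, 0.0)

def best_word(pron):
    """Pick the word identifier the model prefers for a pronunciation."""
    return max(LM, key=lambda w: score(w, pron))

winner = best_word("r eh d")
```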
  • Patent number: 8214199
    Abstract: A method and computer system for translating sentences between languages from an intermediate language-independent semantic representation is provided. On the basis of comprehensive understanding about languages and semantics, exhaustive linguistic descriptions are used to analyze sentences, to build syntactic structures and language-independent semantic structures and representations, and to synthesize one or more sentences in a natural or artificial language. A computer system is also provided to analyze and synthesize various linguistic structures and to perform translation of a wide spectrum of various sentence types. As a result, a generalized data structure, such as a semantic structure, is generated from a sentence of an input language and can be transformed into a natural sentence expressing its meaning correctly in an output language.
    Type: Grant
    Filed: March 22, 2007
    Date of Patent: July 3, 2012
    Assignee: ABBYY Software, Ltd.
    Inventors: Konstantin Anisimovich, Vladimir Selegey, Konstantin Zuev