Patents Examined by Talivaldis Ivars Smit
  • Patent number: 8255220
    Abstract: A device, a method, and a medium for establishing a language model for speech recognition are disclosed. The language-model-establishing device includes: a schema expander for expanding a state schema which is composed of at least one state defined by a finite state grammar using a general grammar database; a grammatical-structure-expander for expanding grammatical structures which can be expressed by each state of the expanded state schema using the general grammar database; and a grammatical-structure-filter for filtering out any incorrect grammatical structure from the expanded grammatical structures using the general grammar database. Since the state schema is expanded using the general grammar database, it is possible to improve recognition of unlearned grammatical structures.
    Type: Grant
    Filed: October 11, 2006
    Date of Patent: August 28, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jeong-mi Cho, Byung-kwan Kwak
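The expand-then-filter pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the grammar database contents, state names, and validity table are all invented stand-ins.

```python
# Sketch of the expand-then-filter pipeline: each state in the schema is
# expanded into candidate grammatical structures via a general grammar
# database, then structures flagged as ungrammatical are filtered out.
# All data below is illustrative, not from the patent.

GENERAL_GRAMMAR = {
    "REQUEST": ["VERB NOUN", "VERB DET NOUN", "NOUN VERB"],
    "CONFIRM": ["YES", "NO", "YES VERB"],
}

VALID_STRUCTURES = {"VERB NOUN", "VERB DET NOUN", "YES", "NO"}

def expand_schema(states):
    """Expand each state into the structures the grammar database allows."""
    return {state: GENERAL_GRAMMAR.get(state, []) for state in states}

def filter_structures(expanded):
    """Drop structures the grammar database flags as ungrammatical."""
    return {state: [s for s in structs if s in VALID_STRUCTURES]
            for state, structs in expanded.items()}

expanded = expand_schema(["REQUEST", "CONFIRM"])
model = filter_structures(expanded)
```

Because the expansion draws on a general grammar rather than only on learned utterances, structures never seen in training can still enter the model, which is the recognition benefit the abstract claims.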
  • Patent number: 8249878
    Abstract: A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes a first portion of the speech stream and, if a predetermined criterion is satisfied by the speech recognition result, waits until the speech recognizer has been reconfigured before recognizing a second portion of the speech stream. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.
    Type: Grant
    Filed: August 2, 2011
    Date of Patent: August 21, 2012
    Assignee: Multimodal Technologies, LLC
    Inventors: Eric Carraux, Detlef Koll
  • Patent number: 8249879
    Abstract: Disclosed is a system and method for training a spoken dialog service component from website data. Spoken dialog service components typically include an automatic speech recognition module, a language understanding module, a dialog management module, a language generation module and a text-to-speech module. The method includes converting data from a structured database associated with a website to a structured text data set and a structured task knowledge base, extracting linguistic items from the structured database, and training a spoken dialog service component using at least one of the structured text data, the structured task knowledge base, or the linguistic items. The system includes modules configured to implement the method.
    Type: Grant
    Filed: November 7, 2011
    Date of Patent: August 21, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, Junlan Feng, Mazin G. Rahim
  • Patent number: 8249866
    Abstract: A speech decoding method which generates an excitation signal and a synthesis filter from coded data and which obtains a speech signal based on the excitation signal and the synthesis filter. The method includes acquiring identification information used for determining whether the speech signal to be decoded is a narrowband signal or a wideband signal; and modifying the excitation signal based on the identification information by controlling strength or presence of emphasis of pitch periodicity with respect to the excitation signal generated from the coded data, so as to generate the speech signal by use of the modified excitation signal and the synthesis filter.
    Type: Grant
    Filed: March 31, 2010
    Date of Patent: August 21, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Kimio Miseki
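The band-dependent modification described above can be sketched as below. The emphasis gain, the one-pitch-period delayed-copy form of the emphasis, and the sample values are assumptions for illustration only.

```python
# Sketch: modify an excitation signal by emphasizing pitch periodicity
# only when the identification information marks the signal as narrowband.
# The emphasis here adds a scaled copy of the signal delayed by one pitch
# period; the gain value is illustrative.

def modify_excitation(excitation, pitch_period, is_narrowband):
    """Return excitation with pitch emphasis controlled by the band flag."""
    gain = 0.5 if is_narrowband else 0.0  # emphasis disabled for wideband
    out = list(excitation)
    for n in range(pitch_period, len(out)):
        out[n] = excitation[n] + gain * excitation[n - pitch_period]
    return out

signal = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
narrow = modify_excitation(signal, 3, True)
wide = modify_excitation(signal, 3, False)
```

The modified excitation would then drive the synthesis filter to produce the decoded speech.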
  • Patent number: 8244529
    Abstract: A method is provided for multi-pass echo residue detection. The method includes detecting audio data, and determining whether the audio data is recognized as speech. Additionally, the method categorizes the audio data recognized as speech as including an acceptable level of residual echo, and categorizes unrecognizable audio data as including an unacceptable level of residual echo. Furthermore, the method determines whether the unrecognizable audio data contains a user input, and also determines whether a duration of the user input is at least a predetermined duration; when the user input is at least the predetermined duration, the method extracts the predetermined duration of the user input from a total duration of the user input.
    Type: Grant
    Filed: September 20, 2011
    Date of Patent: August 14, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Ngai Chiu Wong
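The categorization flow above can be sketched as a simple decision function. The threshold value is invented, recognition is stubbed out as a boolean input, and "extracting the predetermined duration from the total duration" is read here as a subtraction, which is one plausible interpretation.

```python
# Sketch of the multi-pass categorization: recognized audio is treated as
# having acceptable residual echo; unrecognized audio is checked for a user
# input of at least a predetermined duration, which is then extracted from
# the input's total duration. All values are illustrative.

PREDETERMINED_DURATION = 2.0  # seconds; hypothetical threshold

def categorize(recognized_as_speech, is_user_input, input_duration):
    """Return (echo_category, remaining_duration_after_extraction)."""
    if recognized_as_speech:
        return ("acceptable", None)
    if is_user_input and input_duration >= PREDETERMINED_DURATION:
        # extract the predetermined duration from the total duration
        return ("unacceptable", input_duration - PREDETERMINED_DURATION)
    return ("unacceptable", None)
```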
  • Patent number: 8239196
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: July 28, 2011
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventor: Marco Paniconi
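The per-bin classification and cross-channel fusion can be sketched as follows. Averaging the channel probabilities is a simplistic stand-in for the layered network model in the abstract, and the probability grids and threshold are invented.

```python
# Sketch: per time/frequency-bin speech probabilities are estimated for
# each input channel, then fused across channels (simple averaging here)
# to yield a speech/noise label per bin.

def classify_bins(channel_probs, threshold=0.5):
    """channel_probs: list of per-channel [time][freq] probability grids."""
    n_time = len(channel_probs[0])
    n_freq = len(channel_probs[0][0])
    fused = [[0.0] * n_freq for _ in range(n_time)]
    for grid in channel_probs:          # fuse data across input channels
        for t in range(n_time):
            for f in range(n_freq):
                fused[t][f] += grid[t][f] / len(channel_probs)
    return [["speech" if p >= threshold else "noise" for p in row]
            for row in fused]

ch1 = [[0.9, 0.2], [0.6, 0.1]]   # e.g. microphone 1
ch2 = [[0.7, 0.4], [0.2, 0.3]]   # e.g. microphone 2
labels = classify_bins([ch1, ch2])
```

The resulting labels would feed the noise-spectrum estimate used by the suppressor.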
  • Patent number: 8239202
    Abstract: A method and system for audibly outputting text messages include: setting a vocalizing function for audibly outputting text messages, searching a character speech library for each character of a received text message, and acquiring pronunciation data of each character of the received text message. The method and the system further include vocalizing the pronunciation data of each character of the received text message, generating a voice message, and audibly outputting the generated voice message.
    Type: Grant
    Filed: December 23, 2008
    Date of Patent: August 7, 2012
    Assignee: Chi Mei Communication Systems, Inc.
    Inventor: Chi-Ming Hsiao
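The per-character lookup can be sketched as below; the library contents and the concatenation of pronunciation units into a "voice message" string are illustrative assumptions.

```python
# Sketch: look up each character of a text message in a character speech
# library, concatenate the pronunciation data, and return the resulting
# voice message. The library entries are invented for illustration.

CHARACTER_SPEECH_LIBRARY = {
    "h": "hh", "i": "iy", " ": "_",
}

def vocalize(text):
    """Build a voice message from per-character pronunciation data."""
    units = []
    for ch in text:
        pron = CHARACTER_SPEECH_LIBRARY.get(ch)
        if pron is not None:          # skip characters with no entry
            units.append(pron)
    return " ".join(units)

message = vocalize("hi")
```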
  • Patent number: 8239188
    Abstract: A translation apparatus is provided that includes a bilingual example sentence dictionary that stores plural example sentences in a first language and plural example sentences in a second language being translation of the plural example sentences, an input unit that inputs an input sentence in the first language, a first search unit that searches whether the input sentence matches any of the plural example sentences in the first language, a second search unit that searches for at least one example sentence candidate that is similar to the input sentence from the plural example sentences in the first language, when a matching example sentence is not found in the first search unit, and an output unit that outputs an example sentence in the second language that is translation of an example sentence searched in the first search unit or the example sentence candidate searched in the second search unit.
    Type: Grant
    Filed: March 28, 2007
    Date of Patent: August 7, 2012
    Assignee: Fuji Xerox Co., Ltd.
    Inventor: Shaoming Liu
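The two-stage lookup above can be sketched as follows. `difflib.SequenceMatcher` is used here as a stand-in for whatever similarity measure the patent contemplates, and the example-sentence pairs and cutoff ratio are invented.

```python
# Sketch: first search for an exact match against the stored first-language
# example sentences; if none is found, search for a similar example
# sentence and output its second-language translation.
import difflib

EXAMPLES = {
    "where is the station": "eki wa doko desu ka",
    "how much is this": "kore wa ikura desu ka",
}

def translate(sentence, min_ratio=0.6):
    if sentence in EXAMPLES:                       # first search unit
        return EXAMPLES[sentence]
    best, best_ratio = None, 0.0                   # second search unit
    for example in EXAMPLES:
        ratio = difflib.SequenceMatcher(None, sentence, example).ratio()
        if ratio > best_ratio:
            best, best_ratio = example, ratio
    if best is not None and best_ratio >= min_ratio:
        return EXAMPLES[best]
    return None

exact = translate("where is the station")
similar = translate("where is a station")   # no exact match; similar hit
```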
  • Patent number: 8239194
    Abstract: An architecture and framework for speech/noise classification of an audio signal using multiple features with multiple input channels (e.g., microphones) are provided. The architecture may be implemented with noise suppression in a multi-channel environment where noise suppression is based on an estimation of the noise spectrum. The noise spectrum is estimated using a model that classifies each time/frame and frequency component of a signal as speech or noise by applying a speech/noise probability function. The speech/noise probability function estimates a speech/noise probability for each frequency and time bin. A speech/noise classification estimate is obtained by fusing (e.g., combining) data across different input channels using a layered network model.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventor: Marco Paniconi
  • Patent number: 8234120
    Abstract: The present invention discloses a solution for assuring user-defined voice commands are unambiguous. The solution can include a step of identifying a user attempt to enter a user-defined voice command into a voice-enabled system. A safety analysis can be performed on the user-defined voice command to determine a likelihood that the user-defined voice command will be confused with preexisting voice commands recognized by the voice-enabled system. When a high likelihood of confusion is determined by the safety analysis, a notification can be presented that the user-defined voice command is subject to confusion. A user can then define a different voice command or can choose to continue to use the potentially confusing command, possibly subject to a system imposed confusion mitigating condition or action.
    Type: Grant
    Filed: July 26, 2006
    Date of Patent: July 31, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Ciprian Agapi, Oscar J. Blass, Brennan D. Monteiro, Roberto Vila
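The safety analysis can be sketched as a similarity check against the preexisting commands. String similarity via `difflib` is a crude stand-in for acoustic or phonetic confusability, and the command lists and threshold are invented.

```python
# Sketch: estimate how likely a user-defined command is to be confused
# with preexisting commands; a high similarity score triggers the
# "subject to confusion" notification path.
import difflib

def safety_analysis(new_command, existing_commands, threshold=0.8):
    """Return (is_risky, most_similar_existing_command)."""
    best, best_score = None, 0.0
    for cmd in existing_commands:
        score = difflib.SequenceMatcher(None, new_command, cmd).ratio()
        if score > best_score:
            best, best_score = cmd, score
    return (best_score >= threshold, best)

risky, clash = safety_analysis("call home", ["call phone", "play music"])
safe, _ = safety_analysis("open garage", ["call phone", "play music"])
```

When `risky` is true, the system would notify the user, who could then choose a different command or keep the risky one under a mitigating condition.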
  • Patent number: 8229752
    Abstract: Systems and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a method may include conducting the voice interaction between the agent and a client, wherein the agent follows the script via a plurality of panels. From there, the voice interaction is evaluated via the plurality of panels, employing panel-by-panel playback with an automatic speech recognition component adapted to analyze the voice interaction. Whether the agent has adequately followed the script may then be determined by generating a score using the confidence level thresholds of the automatic speech recognition component, with thresholds assigned to each of the plurality of panels, and by evaluating the score against at least one of a static standard and a varying standard.
    Type: Grant
    Filed: April 26, 2010
    Date of Patent: July 24, 2012
    Assignee: West Corporation
    Inventors: Mark J Pettay, Jill M Vacek
  • Patent number: 8224651
    Abstract: A speech recognition and control system including a sound card for receiving speech and converting the speech into digital data, the sound card removably connected to an input of a computer, recognizer software executing on the computer for interpreting at least a portion of the digital data, event detection software executing on the computer for detecting connectivity of the sound card, and command control software executing on the computer for generating a command based on at least one of the digital data and the connectivity of the sound card.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: July 17, 2012
    Assignee: Storz Endoskop Produktions GmbH
    Inventors: Gang Wang, Chengyi Zheng, Heinz-Werner Stiller, Matteo Contolini
  • Patent number: 8219388
    Abstract: The sensation of presence in voice chat in a virtual space is enhanced. A user speech synthesizer is used in a virtual space sharing system in which information processing devices share the virtual space. The user speech synthesizer comprises a speech data acquiring section (60) for acquiring speech data representing a speech uttered by the user of one of the information processing devices, an environment sound storage section (66) for storing an environment sound associated with one or more regions defined in the virtual space, a region specifying section (64) for specifying a region corresponding to the user in the virtual space, and an environment sound synthesizing section (68) for acquiring the environment sound associated with the specified region from the environment sound storage section (66) and combining the acquired environment sound with the speech data to produce synthesized speech data.
    Type: Grant
    Filed: June 7, 2006
    Date of Patent: July 10, 2012
    Assignee: Konami Digital Entertainment Co., Ltd.
    Inventors: Hiromasa Kaneko, Masaki Takeuchi
  • Patent number: 8219406
    Abstract: A multi-modal human computer interface (HCI) receives a plurality of available information inputs concurrently, or serially, and employs a subset of the inputs to determine or infer user intent with respect to a communication or information goal. Received inputs are respectively parsed, and the parsed inputs are analyzed and optionally synthesized with respect to one or more of each other. In the event sufficient information is not available to determine user intent or goal, feedback can be provided to the user in order to facilitate clarifying, confirming, or augmenting the information inputs.
    Type: Grant
    Filed: March 15, 2007
    Date of Patent: July 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Li Deng
  • Patent number: 8219403
    Abstract: In the case of an incoming call, one of a plurality of different types of hardware platforms is selected and allocated to the incoming call, if possible, based on initial signaling information and load criteria. If such an allocation cannot be provided, the allocation is attempted based on other signaling information following the initial signaling information. If the allocation still cannot be provided, a relevant voice page is requested from a storage device and a pre-analysis is performed, during which the requests included in the page are determined and allocation of the browser function is attempted based on that determination; if still no allocation can be achieved, a universally usable browser functionality is allocated.
    Type: Grant
    Filed: January 12, 2007
    Date of Patent: July 10, 2012
    Assignee: Nokia Siemens Networks GmbH & Co. KG
    Inventors: Detlev Freund, Norbert Löbig
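The fallback chain above can be sketched as a sequence of allocation attempts. The platform table, load criterion, and request names are invented stubs; only the ordering of attempts follows the abstract.

```python
# Sketch of the fallback chain: try to allocate a platform from the initial
# signaling plus load criteria, then from later signaling, then from a
# pre-analysis of the voice page's requests, and finally fall back to a
# universally usable browser.

def allocate(initial_info, later_info, voice_page_requests, platforms):
    for key in (initial_info, later_info):       # signaling-based attempts
        if key in platforms and platforms[key]["load"] < 1.0:
            return key
    for req in voice_page_requests:              # pre-analysis of voice page
        if req in platforms and platforms[req]["load"] < 1.0:
            return req
    return "universal-browser"                   # last-resort allocation

platforms = {"asr": {"load": 1.0}, "dtmf": {"load": 0.2}}
chosen = allocate("asr", "unknown", ["dtmf"], platforms)
fallback = allocate("asr", "unknown", [], platforms)
```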
  • Patent number: 8219397
    Abstract: A method, system, and computer program product for autonomously transcribing and building tagging data of a conversation. A corpus processing agent monitors a conversation and utilizes a speech recognition agent to identify the spoken languages, speakers, and emotional patterns of speakers of the conversation. While monitoring the conversation, the corpus processing agent determines emotional patterns by monitoring voice modulation of the speakers and evaluating the context of the conversation. When the conversation is complete, the corpus processing agent determines synonyms and paraphrases of spoken words and phrases of the conversation taking into consideration any localized dialect of the speakers. Additionally, metadata of the conversation is created and stored in a link database, for comparison with other processed conversations. A corpus, a transcription of the conversation containing metadata links, is then created.
    Type: Grant
    Filed: June 10, 2008
    Date of Patent: July 10, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Peeyush Jaiswal, Vikram S. Khatri, Naveen Narayan, Burt Vialpando
  • Patent number: 8219381
    Abstract: A storing unit stores therein dictionary information in which a first text in a first language is associated with a second text that is a translation of the first text into a second language. An extracting unit extracts, when an input text includes an unregistered text that is not registered as the first text in the dictionary information, the unregistered text from the input text. A translating unit translates an input similar text that expresses the unregistered text with a different text, into the second language. A registering unit registers the unregistered text in association with the translated similar text in the dictionary information.
    Type: Grant
    Filed: March 26, 2007
    Date of Patent: July 10, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Noriko Yamanaka
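The extract-translate-register loop can be sketched as below. The dictionary, the paraphrase table standing in for the "similar text" input, and word-level granularity are all assumptions for illustration.

```python
# Sketch: detect words of an input text missing from the dictionary,
# translate a similar registered expression for each, and register the
# result so later inputs hit the dictionary directly. All table contents
# are invented.

DICTIONARY = {"hello": "bonjour", "world": "monde"}
PARAPHRASES = {"hi": "hello"}  # unregistered word -> similar registered text

def translate_and_register(text):
    out = []
    for word in text.split():
        if word not in DICTIONARY:                      # unregistered text
            similar = PARAPHRASES.get(word)
            if similar in DICTIONARY:
                DICTIONARY[word] = DICTIONARY[similar]  # register it
        out.append(DICTIONARY.get(word, word))
    return " ".join(out)

first = translate_and_register("hi world")
second = translate_and_register("hi world")   # now a direct dictionary hit
```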
  • Patent number: 8214207
    Abstract: Provided are, among other things, systems, methods and techniques for quantizing a joint-channel-encoded audio signal, e.g., by: identifying a target quantization unit for reduction of quantization step size based on quantization errors; determining whether the target quantization unit has been jointly sum/difference encoded with another quantization unit; if the target quantization unit has been jointly sum/difference encoded with another quantization unit, then (i) designating the sum or difference channel quantization unit as a target S/D quantization unit based on which has a greater quantization error and (ii) re-quantizing the target S/D quantization unit using a decreased quantization step size; recalculating the quantization error for the target quantization unit; and repeating the process until a specified criterion is satisfied.
    Type: Grant
    Filed: August 23, 2011
    Date of Patent: July 3, 2012
    Assignee: Digital Rise Technology Co., Ltd.
    Inventor: Yuli You
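The outer refinement loop can be sketched as below. The sum/difference-channel coupling of the patent is omitted; this shows only the generic pattern of repeatedly halving the step size of the worst-error unit until a tolerance is met, with invented values throughout.

```python
# Sketch of the iterative refinement: quantize each unit, find the unit
# with the largest quantization error, shrink its step size, and repeat
# until every error falls under a tolerance.

def quantize(value, step):
    return round(value / step) * step

def refine(values, steps, tol=0.05, max_iters=100):
    for _ in range(max_iters):
        errors = [abs(v - quantize(v, s)) for v, s in zip(values, steps)]
        worst = max(range(len(values)), key=lambda i: errors[i])
        if errors[worst] <= tol:          # stopping criterion satisfied
            break
        steps[worst] /= 2.0               # decrease the quantization step
    return steps

steps = refine([0.33, 0.71], [0.5, 0.5])
```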
  • Patent number: 8214213
    Abstract: A system and method for performing speech recognition are disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations, and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.
    Type: Grant
    Filed: April 27, 2006
    Date of Patent: July 3, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Andrej Ljolje
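The idea of folding pronunciation probabilities into the language model can be sketched as below. The word identifiers, pronunciations, and probabilities are invented, and multiplying word and pronunciation probabilities is a simplifying assumption.

```python
# Sketch: each unique word identifier in the language model carries
# P(pronunciation | word) alongside its word probability, so a hypothesis
# can be scored jointly. All probabilities are illustrative.

LM = {
    # word id -> (word probability, {pronunciation: probability})
    "read:1": (0.6, {"r eh d": 0.7, "r iy d": 0.3}),
    "red:1":  (0.4, {"r eh d": 1.0}),
}

def score(word_id, pron):
    word_prob, prons = LM[word_id]
    return word_prob * prons.get(pron, 0.0)

def best_word(pron):
    """Pick the word identifier the model prefers for a pronunciation."""
    return max(LM, key=lambda w: score(w, pron))

winner = best_word("r eh d")
```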
  • Patent number: 8214199
    Abstract: A method and computer system for translating sentences between languages from an intermediate language-independent semantic representation is provided. On the basis of comprehensive understanding about languages and semantics, exhaustive linguistic descriptions are used to analyze sentences, to build syntactic structures and language-independent semantic structures and representations, and to synthesize one or more sentences in a natural or artificial language. A computer system is also provided to analyze and synthesize various linguistic structures and to perform translation of a wide spectrum of various sentence types. As a result, a generalized data structure, such as a semantic structure, is generated from a sentence of an input language and can be transformed into a natural sentence expressing its meaning correctly in an output language.
    Type: Grant
    Filed: March 22, 2007
    Date of Patent: July 3, 2012
    Assignee: ABBYY Software, Ltd.
    Inventors: Konstantin Anisimovich, Vladimir Selegey, Konstantin Zuev