Patents Examined by James S. Wozniak

Digital recording and playback system with voice recognition capability for concurrent text generation

Patent number: 8112270

Abstract: A recording and playback system is provided. The system includes an audio capturing device configured to receive an analog input and an encoder coupled to the audio capturing device configured to generate a digital signal based on the analog input. The system further includes a recognition engine coupled to the audio capturing device and configured to generate text data based on the analog input, wherein the encoder and the recognition engine simultaneously generate the digital signal and the text data such that the digital signal and the text data can be provided in a synchronized manner.

Type: Grant

Filed: October 9, 2007

Date of Patent: February 7, 2012

Assignees: Sony Corporation, Sony Electronics, Inc.

Inventor: Takashi Nakatsuyama
Methods and apparatus for understanding machine vocabulary

Patent number: 8108207

Abstract: Configurations herein provide a language processing mechanism operable to define a machine vocabulary and identify a machine language version of the words that preserves context and identifies the proper definition of the words by identifying and preserving context of a particular set of words, such as a sentence or paragraph. The machine vocabulary includes a definition section for each definition of a word. Each definition section includes a set of one or more definition elements. The definition elements include a predetermined format of definition fields, and each has a corresponding mask indicative of significant definition fields. The set of definition elements corresponding to a particular definition describe the usage of the word in a context matching that particular definition. Each definition element captures a characteristic of the definition according to fuzzy logic such that the definition elements collectively capture the context.

Type: Grant

Filed: September 1, 2009

Date of Patent: January 31, 2012

Assignee: Artificial Cognition Inc.

Inventors: George H. Harvey, Donald R. Greenbaum, Charles H. Collins, Charles D. Harvey
Voice interface and search for electronic devices including bluetooth headsets and remote systems

Patent number: 8099289

Abstract: Systems and methods for improving the interaction between a user and a small electronic device such as a Bluetooth headset are described. The use of a voice user interface in electronic devices may be used. In one embodiment, recognition processing limitations of some devices are overcome by employing speech synthesizers and recognizers in series where one electronic device responds to simple audio commands and sends audio requests to a remote device with more significant recognition analysis capability. Embodiments of the present invention may include systems and methods for utilizing speech recognizers and synthesizers in series to provide simple, reliable, and hands-free interfaces with users.

Type: Grant

Filed: May 28, 2008

Date of Patent: January 17, 2012

Assignee: Sensory, Inc.

Inventors: Todd F. Mozer, Forrest S. Mozer
System for improving speech intelligibility through high frequency compression

Patent number: 8086451

Abstract: A speech enhancement system that improves the intelligibility and the perceived quality of processed speech includes a frequency transformer and a spectral compressor. The frequency transformer converts speech signals from the time domain to the frequency domain. The spectral compressor compresses a pre-selected portion of the high frequency band and maps the compressed high frequency band to a lower band limited frequency range.

Type: Grant

Filed: December 9, 2005

Date of Patent: December 27, 2011

Assignee: QNX Software Systems Co.

Inventors: Phillip A. Hetherington, Xueman Li
System and method for detection and analysis of speech

Patent number: 8078465

Abstract: Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify words, vocalizations, or other noises directed to or spoken by the key child, independent of content. The analysis can include the number of responses between the child and another, such as an adult and the number of words spoken by the child and/or another, independent of content of the speech. One or more metrics can be determined based on the analysis and provided to assist in improving the language environment and/or tracking language development of the key child.

Type: Grant

Filed: January 23, 2008

Date of Patent: December 13, 2011

Assignee: LENA Foundation

Inventors: Terrance Paul, Dongxin Xu, Umit Yapenel, Sharmistha Gray
Systems, methods, and apparatus for highband time warping

Patent number: 8078474

Abstract: In one embodiment, a method of signal processing including includes encoding a low-frequency portion of a speech signal into at least an encoded narrowband excitation signal and a plurality of narrowband filter parameters; and generating a highband excitation signal based on a narrowband excitation signal. The encoded narrowband excitation signal includes a time warping, and the method includes applying a time shift to a high-frequency portion of the speech signal based on the information related to the time warping. The method also includes encoding the time-shifted high-frequency portion of the speech signal into at least one (A) a plurality of highband filter parameters and (B) a plurality of high band gain factors.

Type: Grant

Filed: April 3, 2006

Date of Patent: December 13, 2011

Assignee: QUALCOMM Incorporated

Inventors: Koen Bernard Vos, Ananthapadmanabhan Aasanipalai Kandhadai
Repetitive transient noise removal

Patent number: 8073689

Abstract: A system improves the perceptual quality of a speech signal by dampening undesired repetitive transient noises. The system includes a repetitive transient noise detector adapted to detect repetitive transient noise in a received signal. The received signal may include a harmonic and a noise spectrum. The system further includes a repetitive transient noise attenuator that substantially removes or dampens repetitive transient noises from the received signal. The method of dampening the repetitive transient noises includes modeling characteristics of repetitive transient noises; detecting characteristics in the received signal that correspond to the modeled characteristics of the repetitive transient noises; and substantially removing components of the repetitive transient noises from the received signal that correspond to some or all of the modeled characteristics of the repetitive transient noises.

Type: Grant

Filed: January 13, 2006

Date of Patent: December 6, 2011

Assignee: QNX Software Systems Co.

Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe
Translation device, translation method, and storage medium

Patent number: 8073678

Abstract: A translation device has a dictionary that stores a set of words and their corresponding meanings in plural languages; an input unit that inputs a document; a recognizing unit that recognizes text in the inputted document; an analyzing unit that devides the text recognized by the recognizing unit into words; a translating unit that translates each of the words obtained by the analyzing unit into a translated term by using the dictionary; and an output unit that outputs an output image containing the translated term for a key word.

Type: Grant

Filed: September 1, 2005

Date of Patent: December 6, 2011

Assignee: Fuji Xerox Co., Ltd.

Inventors: Hiroshi Masuichi, Michihiro Tamune, Masatoshi Tagawa, Shaoming Liu, Kiyoshi Tashiro, Atsushi Itoh, Kyosuke Ishikawa, Naoko Sato
Method, apparatus, and computer program product for one-step correction of voice interaction

Patent number: 8065148

Abstract: A one-step correction mechanism for voice interaction is provided. Correction of a previous state is enabled simultaneously with recognition in a current or subsequent state. An application is decomposed into a set of tasks. Each task is associated with the collection of one piece of information. Each task may be in a different state. At any point during the interaction, while a task/state pair is active, the dialog manager may enable multiple other task/state pairs to be active in latent fashion. The application developer may then use those facilities or resources to the active task/state and the latent task/state pairs depending on contextual condition of the interaction state of the application.

Type: Grant

Filed: March 25, 2010

Date of Patent: November 22, 2011

Assignee: Nuance Communications, Inc.

Inventors: Juan Manuel Huerta, Roberto Pieraccini
Adapting masking thresholds for encoding a low frequency transient signal in audio data

Patent number: 8060375

Abstract: An improved audio coding technique encodes audio having a low frequency transient signal, using a long block, but with a set of adapted masking thresholds. Upon identifying an audio window that contains a low frequency transient signal, masking thresholds for the long block may be calculated as usual. A set of masking thresholds calculated for the 8 short blocks corresponding to the long block are calculated. The masking thresholds for low frequency critical bands are adapted based on the thresholds calculated for the short blocks, and the resulting adapted masking thresholds are used to encode the long block of audio data. The result is encoded audio with rich harmonic content and negligible coder noise resulting from the low frequency transient signal.

Type: Grant

Filed: January 12, 2011

Date of Patent: November 15, 2011

Assignee: Apple Inc.

Inventors: Shyh-Shiaw Kuo, Frank Baumgarte
Methods and appratus for characterizing media

Patent number: 8060372

Abstract: Methods and apparatus for characterizing media are described. In one example, a method of characterizing media includes capturing a block of audio; converting at least a portion of the block of audio into a frequency domain representation including a plurality of complex-valued frequency components; defining a band of complex-valued frequency components for consideration; determining a decision metric using the band of complex-valued frequency components; and determining a signature bit based on a value of the decision metric. Other examples are shown and described.

Type: Grant

Filed: February 20, 2008

Date of Patent: November 15, 2011

Assignee: The Nielsen Company (US), LLC

Inventors: Alexander Topchy, Venugopal Srinivasan, Arun Ramaswamy
Apparatus and method for translating input speech sentences in accordance with information obtained from a pointing device

Patent number: 8055495

Abstract: An associated-information storage unit stores a name of associated information and a display position in association with each other. An example storage unit stores a semantic class, an example in a source language, and an example in a target language in association with each other. A dictionary storage unit stores the name of associated information and the semantic class in association with each other. An acquiring unit acquires the name of the associated information corresponding to the display position of the selected associated information from the associated-information storage unit, and acquires a semantic class corresponding to the acquired name of the associated information from the dictionary storage unit. A translation unit acquires an example in the target language corresponding to the acquired semantic class and a speech recognition result from the example storage unit, thereby translating the recognition result.

Type: Grant

Filed: September 6, 2007

Date of Patent: November 8, 2011

Assignee: Kabushiki Kaisha Toshiba

Inventor: Kazuo Sumita
Method and apparatus for transmitting speech activity in distributed voice recognition systems

Patent number: 8050911

Abstract: A system, method, apparatus, signal-bearing medium, and means for transmitting speech activity in a distributed voice recognition (VR) system. The distributed voice recognition system includes a local VR engine in a subscriber unit (102) and a server VR engine on a server (160). The local VR engine comprises a voice activity detection (VAD) module (106) that detects voice activity within a speech signal, and comprises an advanced feature extraction (AFE) module (104) that extracts features from a speech signal. The detected voice activity information is transmitted over a first wireless communication channel to the server (160). The feature extraction information is transmitted over a second wireless communication channel, separate from the first wireless communication channel, to the server (160). The server (160) processes the received information to determine a linguistic estimate of the electrical speech signal, and transmits the linguistic estimate to the subscriber unit (102).

Type: Grant

Filed: March 1, 2007

Date of Patent: November 1, 2011

Assignee: QUALCOMM Incorporated

Inventor: Harinath Garudadri
Telephone voice command enabled computer administration method and system

Patent number: 8050930

Abstract: An administration method and system. The method includes receiving by a computing system, a telephone call from an administrator. The computing system presents an audible menu associated with a plurality of computers to the administrator. The computing system receives from the administrator, an audible selection for a computer from the audible menu. The computing system receives from the administrator, an audible verbal command for performing a maintenance operation on the computer. The computing system executes the maintenance operation on the computer. The computing system receives from the computer, confirmation data indicating that the maintenance operation has been completed. The computing system converts the confirmation data into an audible verbal message. The computing system transmits the second audible verbal message to the administrator.

Type: Grant

Filed: May 1, 2008

Date of Patent: November 1, 2011

Assignee: International Business Machines Corporation

Inventors: Peeyush Jaiswal, Naveen Narayan
Voice recognition with dynamic filter bank adjustment based on speaker categorization

Patent number: 8050922

Abstract: Voice recognition methods and systems are disclosed. A voice signal is obtained for an utterance of a speaker. The speaker is categorized as a male, female, or child and the categorization is used as a basis for dynamically adjusting a maximum frequency fmax and a minimum frequency fmin of a filter bank used for processing the input utterance to produce an output. Corresponding gender or age specific acoustic models are used to perform voice recognition based on the filter bank output.

Type: Grant

Filed: July 21, 2010

Date of Patent: November 1, 2011

Assignee: Sony Computer Entertainment Inc.

Inventor: Ruxin Chen
Speech synthesis method and apparatus using pre-recorded speech and rule-based synthesized speech

Patent number: 8041569

Abstract: A language processing unit identifies a word by performing language analysis on a text supplied from a text holding unit. A synthesis selection unit selects speech synthesis processing performed by a rule-based synthesis unit or speech synthesis processing performed by a pre-recorded-speech-based synthesis unit for a word of interest extracted from the language analysis result. The selected rule-based synthesis unit or pre-recorded-speech-based synthesis unit executes speech synthesis processing for the word of interest.

Type: Grant

Filed: February 22, 2008

Date of Patent: October 18, 2011

Assignee: Canon Kabushiki Kaisha

Inventors: Yasuo Okutani, Michio Aizawa, Toshiaki Fukada
Method and system for identifying and correcting accent-induced speech recognition difficulties

Patent number: 8036893

Abstract: A system for use in speech recognition includes an acoustic module accessing a plurality of distinct-language acoustic models, each based upon a different language; a lexicon module accessing at least one lexicon model; and a speech recognition output module. The speech recognition output module generates a first speech recognition output using a first model combination that combines one of the plurality of distinct-language acoustic models with the at least one lexicon model. In response to a threshold determination, the speech recognition output module generates a second speech recognition output using a second model combination that combines a different one of the plurality of distinct-language acoustic models with the at least one distinct-language lexicon model.

Type: Grant

Filed: July 22, 2004

Date of Patent: October 11, 2011

Assignee: Nuance Communications, Inc.

Inventor: David E. Reich
Methods of defining ontologies, word disambiguation methods, computer systems, and articles of manufacture

Patent number: 8036876

Abstract: Methods of defining ontologies, word disambiguation methods, computer systems, and articles of manufacture are described according to some aspects. In one aspect, a word disambiguation method includes accessing textual content to be disambiguated, wherein the textual content comprises a plurality of words individually comprising a plurality of word senses, for an individual word of the textual content, identifying one of the word senses of the word as indicative of the meaning of the word in the textual content, for the individual word, selecting one of a plurality of event classes of a lexical database ontology using the identified word sense of the individual word, and for the individual word, associating the selected one of the event classes with the textual content to provide disambiguation of a meaning of the individual word in the textual content.

Type: Grant

Filed: November 4, 2005

Date of Patent: October 11, 2011

Assignee: Battelle Memorial Institute

Inventors: Antonio P. Sanfilippo, Stephen C. Tratz, Michelle L. Gregory, Alan R. Chappell, Paul D. Whitney, Christian Posse, Robert L. Baddeley, Ryan E. Hohimer
Device incorporating improved text input mechanism

Patent number: 8036878

Abstract: A device including a display screen for displaying m-words of data, a text entry device for entering data, a processor receiving data from the text entry device and causing it to be displayed on the display screen. Upon activation the processor initializes a precursor to a predefined value. The device further includes a non-volatile memory storing a dictionary containing a plurality of entries, each entry including an index, a candidate word, and a score. The processor selects a list of n-number of candidate words from the dictionary whose index matches the precursor, and causes m-number of candidate words from the list of candidate words to be displayed on the display screen. The processor causes the display to prompt the user to select one of the displayed candidate words or enter a desired word using the text entry device.

Type: Grant

Filed: May 18, 2005

Date of Patent: October 11, 2011

Assignee: Never Wall Treuhand GmbH

Inventor: Ramin Oliver Assadollahi
Adaptive postfiltering methods and systems for decoding speech

Patent number: 8032363

Abstract: A method of processing a decoded speech (DS) signal including successive DS frames, each DS frame including DS samples. The method comprises: adaptively filtering the DS signal to produce a filtered signal; gain-scaling the filtered signal with an adaptive gain updated once a DS frame, thereby producing a gain-scaled signal; and performing a smoothing operation to smooth possible waveform discontinuities in the gain-scaled signal.

Type: Grant

Filed: August 9, 2002

Date of Patent: October 4, 2011

Assignee: Broadcom Corporation

Inventors: Juin-Hwey Chen, Jes Thyssen, Chris C Lee

prev 1 2 3 4 5 6 7 8 9 … next