Patents Examined by Justin W. Rider

Enabling speech within a multimodal program using markup

Patent number: 7356472

Abstract: A method for speech enabling an application can include the step of specifying a speech input within a speech-enabled markup. The speech-enabled markup can also specify an application operation that is to be executed responsive to the detection of the speech input. After the speech input has been defined within the speech-enabled markup, the application can be instantiated. The specified speech input can then be detected and the application operation can be responsively executed in accordance with the specified speech-enabled markup.

Type: Grant

Filed: December 11, 2003

Date of Patent: April 8, 2008

Assignee: International Business Machines Corporation

Inventors: Charles W. Cross, Leslie R. Wilson, Steven G. Woodward
Management and assistance system for the deaf

Patent number: 7356473

Abstract: A computer-aided communication and assistance system that uses signal processing and other algorithms in a processor in wireless communication with a microphone system to aid a deaf person. An instrumented communication module receives information from one or more microphones and provides textual and, optionally, stimulatory information to the deaf person. In one embodiment, a microphone is provided in a piece of jewelry or clothing. In one embodiment, a wireless (or wired) earpiece is provided to provide microphones and vibration stimulators.

Type: Grant

Filed: January 21, 2005

Date of Patent: April 8, 2008

Inventor: Lawrence Kates
Linguistically informed statistical models of constituent structure for ordering in sentence realization for a natural language generation system

Patent number: 7346493

Abstract: The present invention is a tree ordering component within a sentence realization system which receives an unordered syntax tree and generates a ranked list of alternative ordered syntax trees from the unordered syntax tree. The present invention also includes statistical models of constituent structure employed by the tree ordering component in scoring the alternative ordered trees.

Type: Grant

Filed: March 25, 2003

Date of Patent: March 18, 2008

Assignee: Microsoft Corporation

Inventors: Eric Ringger, Michael Gamon, Martine Smets, Simon Corston-Oliver, Robert C. Moore
Audio encoder utilizing bandwidth-limiting processing based on code amount characteristics

Patent number: 7343292

Abstract: A mapping transform unit subjects input audio signals to a mapping transform and generates frequency region signals that take frequency as a variable; a code amount designation unit supplies a preset coding bit rate as a code amount output; a frequency region signal compression encoder, based on the code amount, subjects input frequency region signals to a compression encoding process and generates a bitstream; and a bandwidth-limiting unit executes a bandwidth-limiting processing in which a part of the frequency zone covered by frequency region signals is allotted to an attenuation frequency zone, and in which the value of the frequency region signal is multiplied by an attenuation coefficient having a value less than 1 in the attenuation frequency zone to attenuate the frequency region signal in the attenuation frequency zone, and supplies the frequency region signals that have undergone the bandwidth-limiting processing to the frequency region signal compression encoder.

Type: Grant

Filed: October 11, 2001

Date of Patent: March 11, 2008

Assignee: NEC Corporation

Inventors: Yuichiro Takamizawa, Toshiyuki Nomura
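The bandwidth-limiting step in the abstract above can be illustrated with a minimal sketch. It assumes the attenuation zone sits at the high-frequency end of the coefficient list and that the zone widens as the bit rate falls; neither detail is stated in the abstract, and the names `band_limit`, `zone_start_for_bit_rate`, and `full_band_rate` are hypothetical.

```python
def band_limit(coeffs, zone_start, attenuation):
    """Attenuate frequency-region values at or above index zone_start.

    coeffs: frequency-region values ordered from low to high frequency.
    zone_start: hypothetical index where the attenuation zone begins.
    attenuation: coefficient below 1 applied inside the zone.
    """
    if not 0.0 <= attenuation < 1.0:
        raise ValueError("attenuation must be in [0, 1)")
    return [
        c * attenuation if i >= zone_start else c
        for i, c in enumerate(coeffs)
    ]


def zone_start_for_bit_rate(num_bins, bit_rate, full_band_rate=128_000):
    """Pick a zone start that shrinks the passband at lower bit rates.

    This mapping is an assumed rule for illustration only.
    """
    fraction = min(1.0, bit_rate / full_band_rate)
    return max(1, int(num_bins * fraction))
```

The attenuated values would then be handed to the compression encoder in place of the original frequency-region signals.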
Method for transcoding audio signals, transcoder, network element, wireless communications network and communications system

Patent number: 7343282

Abstract: The invention relates to a method for transcoding audio signals in a communications system. In order to improve the inter-operability between units (2,40) capable of handling wideband audio signals and units (3,46) or network components (50) capable of handling narrowband audio signals, it is proposed that first, an audio signal is received in a network element (42) of a communications network via which said audio signal is transmitted. Next, it is determined in said network element (42) whether a transcoding of the received audio signal is required. In case a narrowband-to-wideband transcoding of the received signal is required, the received narrowband audio signal is transcoded into a wideband audio signal in the network element (1,42). The generated wideband audio signal is then forwarded to the receiving terminal (2,40). The invention equally relates to a corresponding communications system and its components.

Type: Grant

Filed: June 26, 2001

Date of Patent: March 11, 2008

Assignee: Nokia Corporation

Inventors: Olli Kirla, Henrik Lepanaho, Teemu Himanen
Statistical translation using a large monolingual corpus

Patent number: 7340388

Abstract: A statistical machine translation (MT) system may use a large monolingual corpus to improve the accuracy of translated phrases/sentences. The MT system may produce alternative translations and use the large monolingual corpus to (re)rank the alternative translations.

Type: Grant

Filed: March 26, 2003

Date of Patent: March 4, 2008

Assignee: University of Southern California

Inventors: Radu Soricut, Daniel Marcu, Kevin Knight
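The reranking idea in the abstract above can be sketched as follows. The scoring rule (the fraction of a candidate's word bigrams that appear in the monolingual corpus) is an assumption chosen for brevity, not the method the patent claims, and `bigram_counts` and `rerank` are hypothetical names.

```python
from collections import Counter


def bigram_counts(corpus_sentences):
    """Count word bigrams in a monolingual corpus given as a list of strings."""
    counts = Counter()
    for sentence in corpus_sentences:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts


def rerank(candidates, counts):
    """Order candidate translations by the share of their bigrams seen in the corpus."""
    def score(candidate):
        words = candidate.lower().split()
        bigrams = list(zip(words, words[1:]))
        if not bigrams:
            return 0.0
        return sum(1 for b in bigrams if counts[b] > 0) / len(bigrams)

    return sorted(candidates, key=score, reverse=True)
```

A production system would more likely use smoothed n-gram probabilities, but the control flow (generate alternatives, score against target-language text, sort) stays the same.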
Tonal analysis for perceptual audio coding using a compressed spectral representation

Patent number: 7333930

Abstract: The present invention provides an apparatus, method and tangible medium storing instructions for determining tonality of an input audio signal, for selection of corresponding masked thresholds for use in perceptual audio coding. In the various embodiments, the input audio signal is sampled and transformed using a compressed spectral operation to form a compressed spectral representation, such as a cepstral representation. A peak magnitude and an average magnitude of the compressed spectral representation are determined. Depending upon the ratio of peak-to-average magnitudes, a masked threshold is selected having a corresponding degree of tonality, and is used to determine a plurality of quantization levels and a plurality of bit allocations to perceptually encode the input audio signal with a distortion spectrum beneath a level of just noticeable distortion (JND).

Type: Grant

Filed: March 14, 2003

Date of Patent: February 19, 2008

Assignee: Agere Systems Inc.

Inventor: Frank Baumgarte
Bubble splitting for compact acoustic modeling

Patent number: 7328154

Abstract: An improved method is provided for constructing compact acoustic models for use in a speech recognizer. The method includes: partitioning speech data from a plurality of training speakers according to at least one speech-related criterion (i.e., vocal tract length); grouping together the partitioned speech data from training speakers having a similar speech characteristic; and training an acoustic bubble model for each group using the speech data within the group.

Type: Grant

Filed: August 13, 2003

Date of Patent: February 5, 2008

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Ambroise Mutel, Patrick Nguyen, Luca Rigazio
Unilingual translator

Patent number: 7319949

Abstract: A machine translator trained with textual inputs generated by other machine translators is disclosed. A textual input in a first language is provided by a user or other source. This textual input is then translated by a first machine translator to generate a translated version of the textual input in a second language. The textual input and the translated version are parsed and passed through a training architecture to develop a transfer mapping, and a bilingual dictionary. These components are then used by a second machine translator when translating other textual inputs.

Type: Grant

Filed: May 27, 2003

Date of Patent: January 15, 2008

Assignee: Microsoft Corporation

Inventor: Jessie Pinkham
Encoding apparatus and method, decoding apparatus and method, and recording medium recording apparatus and method

Patent number: 7318026

Abstract: An encoding method comprising the steps of forming a difference signal which is the difference between a first channel signal and a second channel signal of an input PCM signal, encoding the difference signal and the second channel signal with a time difference, dividing a signal which has been encoded with the time difference in the unit of a predetermined number of bits, adaptively encoding the divided data in the unit of the predetermined number of bits, and arranging the adaptively encoded data in a predetermined format.

Type: Grant

Filed: September 30, 2002

Date of Patent: January 8, 2008

Assignee: Sony Corporation

Inventor: Tatsuya Inokuchi
Method and apparatus for providing an interactive language tutor

Patent number: 7299188

Abstract: A method and apparatus for generating a pronunciation score by receiving a user phrase intended to conform to a reference phrase and processing the user phrase in accordance with at least one of an articulation-scoring engine, a duration-scoring engine and an intonation-scoring engine to derive thereby the pronunciation score. The scores provided by the various scoring engines are adapted to provide a visual and/or numerical feedback that provides information pertaining to correctness or incorrectness in one or more speech-features such as intonation, articulation, voicing, phoneme error and relative word duration. Such useful interactive feedback will allow a user to quickly identify the problem area and take remedial action in reciting “tutor” sentences or phrases.

Type: Grant

Filed: February 10, 2003

Date of Patent: November 20, 2007

Assignee: Lucent Technologies Inc.

Inventors: Sunil K. Gupta, ZiYi Lu, Prabhu Raghavan, Zulfiquar Sayeed, Aravind Sethuraman, Chetan Vinchhi
Voice command processing system and computer therefor, and voice command processing method

Patent number: 7299187

Abstract: When a user-issued voice command does not match grammars registered in advance, the voice command is identified as a sentence (step S305). This sentence is compared with the registered grammars to calculate a similarity (step S307). When the similarity is higher than a first threshold value (TH1), the voice command is executed (step S315). When the similarity is equal to or lower than the first threshold value (TH1) and higher than a second threshold value (TH2), command choices are displayed for the user and the user is permitted to select a command to be executed (step S319). When the similarity is equal to or lower than the second threshold value (TH2), the command is not executed (step S321). Furthermore, once a command has been executed it is added as a grammar, so that it can be identified the next time it is used.

Type: Grant

Filed: February 10, 2003

Date of Patent: November 20, 2007

Assignee: International Business Machines Corporation

Inventors: Yoshinori Tahara, Daisuke Tomoda, Kikuo Mitsubo, Yoshinori Atake
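The three-way threshold rule in the abstract above (execute, offer choices, or reject) can be sketched as below. The similarity measure (`difflib.SequenceMatcher.ratio`) and the numeric values of `th1` and `th2` are assumptions; the abstract names the thresholds TH1 and TH2 but gives neither values nor a similarity formula. `handle_command` is a hypothetical name.

```python
from difflib import SequenceMatcher


def handle_command(sentence, grammars, th1=0.8, th2=0.5):
    """Return an (action, payload) pair following the three-way threshold rule.

    th1 and th2 are placeholder values; grammars must be a non-empty list.
    """
    scored = sorted(
        (
            (SequenceMatcher(None, sentence.lower(), g.lower()).ratio(), g)
            for g in grammars
        ),
        reverse=True,
    )
    best_score, best_grammar = scored[0]
    if best_score > th1:
        return ("execute", best_grammar)
    if best_score > th2:
        # Offer every grammar that clears the lower threshold as a choice.
        return ("choose", [g for s, g in scored if s > th2])
    return ("reject", None)
```

A caller that receives `"execute"` could also append the recognized sentence to `grammars`, mirroring the abstract's final step of registering an executed command for later use.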
Quantization and inverse quantization for audio

Patent number: 7299190

Abstract: An audio encoder and decoder use architectures and techniques that improve the efficiency of quantization (e.g., weighting) and inverse quantization (e.g., inverse weighting) in audio coding and decoding. The described strategies include various techniques and tools, which can be used in combination or independently. For example, an audio encoder quantizes audio data in multiple channels, applying multiple channel-specific quantizer step modifiers, which give the encoder more control over balancing reconstruction quality between channels. The encoder also applies multiple quantization matrices and varies the resolution of the quantization matrices, which allows the encoder to use more resolution if overall quality is good and use less resolution if overall quality is poor. Finally, the encoder compresses one or more quantization matrices using temporal prediction to reduce the bitrate associated with the quantization matrices. An audio decoder performs corresponding inverse processing and decoding.

Type: Grant

Filed: August 15, 2003

Date of Patent: November 20, 2007

Assignee: Microsoft Corporation

Inventors: Naveen Thumpudi, Wei-Ge Chen
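The channel-specific quantizer step modifiers mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each modifier simply multiplies a step size shared by all channels, which the abstract does not specify; quantization matrices and temporal prediction are left out, and `quantize` and `dequantize` are hypothetical names.

```python
def quantize(channels, base_step, modifiers):
    """Quantize each channel with its own effective step size.

    channels: list of per-channel coefficient lists.
    base_step: step size shared by all channels.
    modifiers: one multiplier per channel (assumed to scale base_step).
    """
    if len(channels) != len(modifiers):
        raise ValueError("need one modifier per channel")
    return [
        [round(x / (base_step * m)) for x in channel]
        for channel, m in zip(channels, modifiers)
    ]


def dequantize(levels, base_step, modifiers):
    """Invert quantize() up to rounding error."""
    return [
        [q * base_step * m for q in channel]
        for channel, m in zip(levels, modifiers)
    ]
```

A larger modifier gives a channel a coarser step, so fewer bits are spent on it; this is the lever for balancing reconstruction quality between channels.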
Pattern matching method and apparatus

Patent number: 7295980

Abstract: A system is provided for matching two or more sequences of phonemes both or all of which may be generated from text or speech. A dynamic programming matching technique is preferably used having constraints which depend upon whether or not the two sequences are generated from text or speech and in which the scoring of the dynamic programming paths is weighted by phoneme confusion scores, phoneme insertion scores and phoneme deletion scores where appropriate.

Type: Grant

Filed: August 31, 2006

Date of Patent: November 13, 2007

Assignee: Canon Kabushiki Kaisha

Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth, Asako Higuchi
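The dynamic programming match described in the abstract above can be sketched as a weighted edit distance over phoneme sequences. This version works with costs (lower is better) rather than the scores the abstract mentions, and the default costs are assumptions; `match_phonemes` and the `confusion` table layout are hypothetical.

```python
def match_phonemes(seq_a, seq_b, confusion, insert_cost=1.0, delete_cost=1.0):
    """Return the minimum alignment cost between two phoneme sequences.

    confusion maps (phoneme_a, phoneme_b) to a substitution cost; pairs that
    are missing cost 0.0 when identical and 1.0 otherwise (assumed defaults).
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    cost = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * delete_cost
    for j in range(1, cols):
        cost[0][j] = j * insert_cost
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = seq_a[i - 1], seq_b[j - 1]
            substitute = confusion.get((a, b), 0.0 if a == b else 1.0)
            cost[i][j] = min(
                cost[i - 1][j - 1] + substitute,  # match or confusion
                cost[i - 1][j] + delete_cost,     # phoneme missing from seq_b
                cost[i][j - 1] + insert_cost,     # extra phoneme in seq_b
            )
    return cost[-1][-1]
```

The abstract's text-versus-speech constraints could be modeled by passing different insertion and deletion costs depending on where each sequence came from.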
Active learning process for spoken dialog systems

Patent number: 7292976

Abstract: A large amount of human labor is required to transcribe and annotate a training corpus that is needed to create and update models for automatic speech recognition (ASR) and spoken language understanding (SLU). Active learning enables a reduction in the amount of transcribed and annotated data required to train ASR and SLU models. In one aspect of the present invention, an active learning ASR process and active learning SLU process are coupled, thereby enabling further efficiencies to be gained relative to a process that maintains an isolation of data in both the ASR and SLU domains.

Type: Grant

Filed: May 29, 2003

Date of Patent: November 6, 2007

Assignee: AT&T Corp.

Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
Active labeling for spoken language understanding

Patent number: 7292982

Abstract: An active labeling process is provided that aims to minimize the number of utterances to be checked again by automatically selecting the ones that are likely to be erroneous or inconsistent with the previously labeled examples. In one embodiment, the errors and inconsistencies are identified based on the confidences obtained from a previously trained classifier model. In a second embodiment, the errors and inconsistencies are identified based on an unsupervised learning process. In both embodiments, the active labeling process is not dependent upon the particular classifier model.

Type: Grant

Filed: May 29, 2003

Date of Patent: November 6, 2007

Assignee: AT&T Corp.

Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Gokhan Tur
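The first embodiment in the abstract above (flagging labels using confidences from a previously trained classifier) can be sketched as a simple filter. The data layout, the `threshold` value, and the name `select_for_review` are assumptions made for illustration.

```python
def select_for_review(labeled, confidences, threshold=0.5):
    """Return utterances whose assigned label the classifier trusts least.

    labeled: list of (utterance, assigned_label) pairs.
    confidences: dict mapping utterance -> {label: confidence} from a
        previously trained classifier (hypothetical data layout).
    """
    flagged = []
    for utterance, label in labeled:
        score = confidences.get(utterance, {}).get(label, 0.0)
        if score < threshold:
            flagged.append((score, utterance))
    # Lowest-confidence items first, so they are rechecked first.
    return [u for _, u in sorted(flagged)]
```

Only the flagged utterances go back to a human labeler, which is how the number of utterances to be checked again is kept small.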
Computer, display control device, pointer position control method, and program

Patent number: 7286991

Abstract: Provides a pointer position control method and the like for manipulating a pointer more easily. The user moves the pointer P two-dimensionally and performs click and other operations using only the voice, by varying the volume and pitch of the produced voice without uttering any specific command. The user moves the pointer P by varying the volume and switches the travel direction of the pointer P by changing the pitch. Also, by ceasing to vary the volume, the user automatically enters a fine adjustment mode in which fine adjustments can be made. Furthermore, the user can perform a click by suddenly ceasing to produce voice, and can return to normal speech recognition mode by keeping silent.

Type: Grant

Filed: May 30, 2003

Date of Patent: October 23, 2007

Assignee: International Business Machines Corporation

Inventors: Yoshinori Tahara, Tooru Tabara, Reiko Kawase, Masaru Horioka
System, method and program product for bidirectional text translation

Patent number: 7283949

Abstract: A system, method, and program product for translating text. The invention provides a bidirectional translation corpus that is used to translate phrases from a first language to a second language and vice versa. The bidirectional translation corpus has multiple entries, each having a phrase in the first language and a corresponding phrase in the second language. A source phrase is compared with each entry in the bidirectional translation corpus to determine if it matches one of the entries. If a match is found, the corresponding phrase is used as a translated phrase. Otherwise, the phrase is translated using a translation system.

Type: Grant

Filed: April 4, 2003

Date of Patent: October 16, 2007

Assignee: International Business Machines Corporation

Inventor: Winston Tsu-Rong Shieh
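The lookup-then-fallback flow in the abstract above can be sketched with two dictionaries built from the same phrase pairs. Exact-match dictionary lookup stands in for the entry-by-entry comparison the abstract describes, and the `fallback` callable stands in for the unspecified translation system; `BidirectionalCorpus` is a hypothetical name.

```python
class BidirectionalCorpus:
    """Phrase pairs searchable from either language."""

    def __init__(self, pairs):
        # Each pair is (phrase_in_first_language, phrase_in_second_language).
        self._forward = {first: second for first, second in pairs}
        self._backward = {second: first for first, second in pairs}

    def translate(self, phrase, to_second, fallback):
        """Return the stored counterpart, or call fallback(phrase) on a miss."""
        table = self._forward if to_second else self._backward
        if phrase in table:
            return table[phrase]
        return fallback(phrase)
```

For example, `BidirectionalCorpus([("good morning", "buenos días")])` answers queries in both directions from a single list of entries and defers everything else to the fallback translator.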
Method, system and storage medium for commercial and musical composition recognition and storage

Patent number: 7277852

Abstract: A playlist generating method for generating a playlist of content from received broadcasted data is provided. The playlist generating method includes the steps of: extracting features of broadcast content beforehand, storing the features in a content feature file, and storing information relating to the broadcast content in a content information DB; extracting features from the received data, and storing the features in a data feature file; searching for broadcast content of a predetermined kind by comparing data in the content feature file and data in the data feature file; when a name of the predetermined kind of content is determined, storing data corresponding to the broadcast content of the predetermined kind in a search result file; generating a playlist for the broadcast content of the predetermined kind from the search result file and the content information DB.

Type: Grant

Filed: October 22, 2001

Date of Patent: October 2, 2007

Assignee: NTT Communications Corporation

Inventors: Miwako Iyoku, Tatsuhiro Kobayashi
Disambiguating results within a speech based IVR session

Patent number: 7260537

Abstract: Within an interactive voice response system, a method of automatically disambiguating results presented to a user can include determining the identity of a user within an interactive voice response session, receiving user inputs specifying selections in an interactive voice response menu hierarchy, and storing historical information specifying the user selections within a profile associated with the identity of the user. For at least one subsequent input from the user, the method can include identifying the historical information associated with the identity of the user and using the historical information to reduce the number of possible selections in the interactive voice response menu hierarchy that are presented to the user.

Type: Grant

Filed: March 25, 2003

Date of Patent: August 21, 2007

Assignee: International Business Machines Corporation

Inventors: Thomas E. Creamer, Brent L. Davis, Peeyush Jaiswal, Victor S. Moore
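The history-based narrowing in the abstract above can be sketched as a per-user counter of past selections. The ranking rule (most frequently chosen first), the `limit` parameter, and the name `SelectionHistory` are assumptions; the abstract only says that history is used to reduce the number of selections presented.

```python
from collections import Counter, defaultdict


class SelectionHistory:
    """Per-user record of past menu selections."""

    def __init__(self):
        self._profiles = defaultdict(Counter)

    def record(self, user_id, selection):
        """Store one selection in the profile tied to this user."""
        self._profiles[user_id][selection] += 1

    def narrow(self, user_id, options, limit=3):
        """Return up to `limit` options, preferring ones the user chose before."""
        history = self._profiles[user_id]
        seen = [o for o in options if history[o] > 0]
        if not seen:
            return list(options)  # no history: present the full menu
        seen.sort(key=lambda o: -history[o])
        return seen[:limit]
```

A caller with no recorded history gets the full menu back, so first-time users are unaffected.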
