Patents Examined by Donald L. Storm

Audio overhang reduction by silent frame deletion in wireless calls

Patent number: 6999921

Abstract: To address the need for reducing audio overhang in wireless communication systems (e.g., 100), the present invention provides for the deletion of silent frames before they are converted to audio by the listening devices. The present invention only provides for the deletion of a portion of the silent frames that make up a period of silence or low voice activity in the speaker's audio. Voice frames that make up periods of silence less than a given length of time are not deleted.

Type: Grant

Filed: December 13, 2001

Date of Patent: February 14, 2006

Assignee: Motorola, Inc.

Inventors: John M. Harris, Philip J. Fleming, Joseph Tobin
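
The partial deletion described in the abstract of patent 6999921 can be illustrated with a minimal sketch. It is not the patented method: the function name `trim_silence`, the per-frame `is_silent` flags, and the parameters `min_run` and `keep` are all hypothetical, and keeping the first frames of a long pause is an assumed policy.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def trim_silence(frames: Sequence[Frame], is_silent: Sequence[bool],
                 min_run: int = 5, keep: int = 2) -> List[Frame]:
    """Drop part of each long silent run; leave short silent runs untouched.

    min_run and keep are illustrative values, not taken from the patent.
    """
    out: List[Frame] = []
    i, n = 0, len(frames)
    while i < n:
        if not is_silent[i]:
            out.append(frames[i])      # voiced frame: always kept
            i += 1
            continue
        j = i
        while j < n and is_silent[j]:  # find the end of this silent run
            j += 1
        run = list(frames[i:j])
        if len(run) < min_run:
            out.extend(run)            # short pause: nothing is deleted
        else:
            out.extend(run[:keep])     # long pause: keep only a portion
        i = j
    return out
```

With `min_run=5` and `keep=2`, a two-frame pause passes through unchanged while a six-frame pause is shortened to two frames, which reduces the audio backlog without removing natural short pauses.
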
Database annotation and retrieval including phoneme data

Patent number: 6990448

Abstract: The data structure is used in accessing a plurality of data files. The data structure comprises a plurality of annotation storage areas adapted to correspond with the data files, each annotation storage area containing an annotation representing a time sequential signal and each annotation storage area comprising a plurality of block storage areas each containing phoneme and word data forming a respective temporal block of the annotation and each block having an associated time index identifying a timing of the block within the corresponding annotation. Each block storage area includes a plurality of node storage areas, each associated with a node which represents a point in time at which a word and/or phoneme begins or ends within the corresponding annotation, and each node storage area having a time offset storage area containing a time offset defining the point in time represented by the node relative to the time index associated with the corresponding block.

Type: Grant

Filed: August 23, 2001

Date of Patent: January 24, 2006

Assignee: Canon Kabushiki Kaisha

Inventors: Jason Peter Andrew Charlesworth, Philip Neil Garner, Jebu Jacob Rajan
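
The block and node layout in the abstract of patent 6990448 can be pictured with a small sketch. The class and field names (`Node`, `Block`, `Annotation`, `node_times`) are hypothetical and chosen only for illustration. The one detail taken from the abstract is that a node stores an offset relative to its block's time index, not an absolute time.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Offset of this node relative to the time index of its owning block.
    time_offset: float
    phonemes: List[str] = field(default_factory=list)
    words: List[str] = field(default_factory=list)


@dataclass
class Block:
    # Timing of the block within the whole annotation.
    time_index: float
    nodes: List[Node] = field(default_factory=list)


@dataclass
class Annotation:
    blocks: List[Block] = field(default_factory=list)

    def node_times(self) -> List[float]:
        """Absolute node times: block time index plus node offset."""
        return [b.time_index + n.time_offset
                for b in self.blocks for n in b.nodes]
```

Storing offsets per block keeps the numbers inside each node small, and a block can be moved in time by changing only its time index.
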
System and method for combined frequency-domain and time-domain pitch extraction for speech signals

Patent number: 6988064

Abstract: A system, computer readable medium, and method for sampling a speech signal; dividing the sampled speech signal into overlapped frames; extracting first pitch information from a frame using frequency domain analysis; providing at least one pitch candidate, each being associated with a spectral score, from the first pitch information, each of the at least one pitch candidate representing a possible pitch estimate for the frame; extracting second pitch information from the frame using a time domain analysis; providing a correlation score for the at least one pitch candidate from the second pitch information; and selecting one of the at least one pitch candidate to represent the pitch estimate of the frame. The system, computer readable medium, and method are suitable for speech coding and for distributed speech recognition.

Type: Grant

Filed: March 31, 2003

Date of Patent: January 17, 2006

Assignees: Motorola, Inc., International Business Machines Corporation

Inventors: Tenkasi V. Ramabadran, Alexander Sorin
Controlling the listening horizon of an automatic speech recognition system for use in handsfree conversational dialogue

Patent number: 6988072

Abstract: Conversational dialog with a computer or other processor-based device without requiring push-to-talk functionality. In one embodiment, a computer-implemented method first determines that a user desires to engage in a dialog. Based thereon the method turns on a speech recognition functionality for a period of time referred to as a listening horizon. Upon the listening horizon expiring, the method turns off the speech recognition functionality.

Type: Grant

Filed: July 7, 2004

Date of Patent: January 17, 2006

Assignee: Microsoft Corporation

Inventor: Eric Horvitz
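
The listening horizon in the abstract of patent 6988072 behaves like a timer that gates recognition. A minimal sketch follows, assuming the caller supplies the current time in seconds. The class name `ListeningHorizon` and its methods are hypothetical, and how the system decides that the user wants a dialog is outside the sketch.

```python
from typing import Optional


class ListeningHorizon:
    """Keeps recognition on for a fixed period after the user engages."""

    def __init__(self, horizon_s: float) -> None:
        self.horizon_s = horizon_s
        self._deadline: Optional[float] = None  # None: never engaged

    def user_engaged(self, now: float) -> None:
        # Turn recognition on by opening a new listening window.
        self._deadline = now + self.horizon_s

    def is_listening(self, now: float) -> bool:
        # Recognition is off once the horizon has expired.
        return self._deadline is not None and now < self._deadline
```

Passing `now` in explicitly, instead of reading a clock inside the class, keeps the sketch deterministic and easy to test.
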
Compressed audio stream data decoder memory sharing techniques

Patent number: 6985853

Abstract: A decoder (10) decodes compressed data. A memory (44) stores the compressed data and stores operating data and operating code for a plurality of decompression algorithms requiring different amounts of memory for the operating data and operating code and requiring different amounts of memory to store compressed data corresponding to a predetermined amount of uncompressed data. A processor (42) is arranged to select one of the decompression algorithms, to allocate an amount of the memory for storing compressed data and operating data and operating code depending on the decompression algorithm selected and to decode the compressed data stored in the allocated amount of memory.

Type: Grant

Filed: February 28, 2002

Date of Patent: January 10, 2006

Assignee: Broadcom Corporation

Inventors: Paul Morton, Darwin Rambo
Systems and methods for combining subword recognition and whole word recognition of a spoken input

Patent number: 6985861

Abstract: A computer-based detection (e.g., speech recognition) system combines a word decoder and subword decoder to detect words (or phrases) in a spoken input provided by a user into a speaker connected to the detection system. The word decoder detects words by comparing an input pattern (e.g., of hypothetical word matches) to reference patterns (e.g., words). The subword decoder compares an input pattern (e.g., hypothetical word matches based on subword or phoneme recognition) to reference patterns (e.g., words) based on a word pronunciation distance measure that indicates how close each input pattern is to matching each reference pattern. The subword decoder sorts the source set of reference patterns based on a closeness of each reference pattern to correctly matching the input pattern based on generated pattern comparisons. The word decoder and subword decoder each provide an N-best list of hypothetical matches to the spoken input.

Type: Grant

Filed: December 12, 2001

Date of Patent: January 10, 2006

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Jean-Manuel Van Thong, Ernest Pusateri
Speech processing apparatus and method measuring signal to noise ratio and scaling speech and noise

Patent number: 6965860

Abstract: A speech processing apparatus and method are provided for processing an input speech signal to compensate for the effects of noise in the input speech signal. The method and apparatus divide the input speech signal into a plurality of sequential time frames and a set of spectral parameters are extracted for each time frame, which parameters are representative of the input signal during the time frame. The system then processes the input speech by scaling the parameters for each frame in dependence upon a measure of the signal to noise ratio for the input frame. In this way, the effects of additive noise on the input signal can be reduced.

Type: Grant

Filed: April 19, 2000

Date of Patent: November 15, 2005

Assignee: Canon Kabushiki Kaisha

Inventors: David Llewellyn Rees, Robert Alexander Keiller
Linguistic prosodic model-based text to speech

Patent number: 6961704

Abstract: An arrangement is provided for text to speech processing based on linguistic prosodic models. Linguistic prosodic models are established to characterize different linguistic prosodic characteristics. When an input text is received, a target unit sequence is generated with a linguistic target that annotates target units in the target unit sequence with a plurality of linguistic prosodic characteristics so that speech synthesized in accordance with the target unit sequence and the linguistic target has certain desired prosodic properties. A unit sequence is selected in accordance with the target unit sequence and the linguistic target based on joint cost information evaluated using established linguistic prosodic models. The selected unit sequence is used to produce synthesized speech corresponding to the input text.

Type: Grant

Filed: January 31, 2003

Date of Patent: November 1, 2005

Assignee: Speechworks International, Inc.

Inventors: Michael S. Phillips, Daniel S. Faulkner, Marek A. Przezdzieci
Quantization matrices based on critical band pattern information for digital audio wherein quantization bands differ from critical bands

Patent number: 6934677

Abstract: Quantization matrices facilitate digital audio encoding and decoding. An audio encoder generates and compresses quantization matrices; an audio decoder decompresses and applies the quantization matrices. The invention includes several techniques and tools, which can be used in combination or separately. For example, the audio encoder can generate quantization matrices from critical band patterns for blocks of audio data. The encoder can compute the quantization matrices directly from the critical band patterns, which can be computed from the same audio data that is being compressed. The audio encoder/decoder can use different modes for generating/applying quantization matrices depending on the coding channel mode of multi-channel audio data. The audio encoder/decoder can use different compression/decompression modes for the quantization matrices, including a parametric compression/decompression mode.

Type: Grant

Filed: December 14, 2001

Date of Patent: August 23, 2005

Assignee: Microsoft Corporation

Inventors: Wei-Ge Chen, Naveen Thumpudi, Ming-Chieh Lee
Method and apparatus for musical modification of speech signal

Patent number: 6928410

Abstract: A method and apparatus for modification of a speech signal indicative of a stream of speech data having a plurality of syllables. The method comprises the steps of mapping the stream of speech data from the speech signal into a stream of tone data according to a linguistic rule regarding the syllables for providing a tone signal indicative of the stream of tone data; forming a string of musical notes responsive to the tone signal for providing a carrier signal indicative of the string of musical notes; modulating the carrier signal with the speech signal for providing a modified signal; and providing an audible signal representative of the speech signal, musically modified according to the linguistic rule. The linguistic rule includes an assignment of a tone to a syllable of the speech data based on a vowel of the syllable, a consonant of the syllable, the intonation of the syllable for a monosyllabic language.

Type: Grant

Filed: November 6, 2000

Date of Patent: August 9, 2005

Assignee: Nokia Mobile Phones Ltd.

Inventors: Juha Marila, Sami Ronkainen, Mika Röykkee, Fumiko Ichikawa
Visual interactive response system and method translated from interactive voice response for telephone utility

Patent number: 6920425

Abstract: A system, method, and computer readable medium storing a software program for translating a script for an interactive voice response system to a script for a visual interactive response system. The visual interactive response system executes the translated visual-based script when a user using a display telephone calls the visual interactive response system. The visual interactive response system then transmits a visual menu to the display telephone to allow the user to select a desired response, which is subsequently sent back to the visual interactive response system for processing. The voice-based script may be defined in voice extensible markup language and the visual-based script may be defined in wireless markup language, hypertext markup language, or handheld device markup language.

Type: Grant

Filed: May 16, 2000

Date of Patent: July 19, 2005

Assignee: Nortel Networks Limited

Inventors: Craig A. Will, Wayne N. Shelley
Method and system for frame alignment and unsupervised adaptation of acoustic models

Patent number: 6917918

Abstract: An unsupervised adaptation method and apparatus are provided that reduce the storage and time requirements associated with adaptation. Under the invention, utterances are converted into feature vectors, which are decoded to produce a transcript and alignment unit boundaries for the utterance. Individual alignment units and the feature vectors associated with those alignment units are then provided to an alignment function, which aligns the feature vectors with the states of each alignment unit. Because the alignment is performed within alignment unit boundaries, fewer feature vectors are used and the time for alignment is reduced. After alignment, the feature vector dimensions aligned to a state are added to dimension sums that are kept for that state. After all the states in an utterance have had their sums updated, the speech signal and the alignment units are deleted. Once sufficient frames of data have been received to perform adaptive training, the acoustic model is adapted.

Type: Grant

Filed: December 22, 2000

Date of Patent: July 12, 2005

Assignee: Microsoft Corporation

Inventors: William H. Rockenbeck, Milind V. Mahajan, Fileno A. Alleva
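
The storage saving described in the abstract of patent 6917918 comes from keeping running per-state sums instead of the utterances themselves. A minimal sketch follows, assuming the alignment is already available as `(state, feature_vector)` pairs. The function names and the use of per-state means as the adaptation statistic are assumptions, not details from the patent.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def add_to_sums(sums: Dict[str, List[float]], counts: Dict[str, int],
                aligned: Iterable[Tuple[str, Sequence[float]]]) -> None:
    """Fold aligned feature vectors into per-state dimension sums."""
    for state, vector in aligned:
        acc = sums.setdefault(state, [0.0] * len(vector))
        for d, value in enumerate(vector):
            acc[d] += value                      # one running sum per dimension
        counts[state] = counts.get(state, 0) + 1


def state_means(sums: Dict[str, List[float]],
                counts: Dict[str, int]) -> Dict[str, List[float]]:
    """Assumed statistic: mean feature vector per state."""
    return {s: [v / counts[s] for v in acc] for s, acc in sums.items()}
```

Once `add_to_sums` has run for an utterance, the feature vectors and the alignment can be discarded; only `sums` and `counts` need to persist until enough frames have accumulated.
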
Preprocessing modules for quality enhancement of MBE coders and decoders for signals having transmission path characteristics

Patent number: 6912496

Abstract: Pursuant to one aspect of the invention, a prefilter module that incorporates an inverse filter is used in conjunction with an encoder. The inverse filter has an inverse frequency response of a frequency response of a filter that simulates speech having transmission path characteristics, such as telephone-channel bandwidth speech, and/or noisy speech. The inverse filter is used to compensate for transmission path characteristics of an input signal. The inverse filter can be designed using several methods, such as, for example, an autoregressive model or a moving average model. Pursuant to a second aspect of the invention, a parameter preprocessor is used in conjunction with a decoder. The parameter preprocessor performs pitch rectification through use of a median and linear filter, and updates spectral amplitudes and voicing parameter depending on the pitch rectification.

Type: Grant

Filed: October 26, 2000

Date of Patent: June 28, 2005

Assignee: Silicon Automation Systems

Inventors: Puranjoy Bhattacharya, Manoj Kumar Singhal, Sangeetha Dummy
Method and apparatus for improving the intelligibility of digitally compressed speech

Patent number: 6889186

Abstract: A system for processing a speech signal to enhance signal intelligibility identifies portions of the speech signal that include sounds that typically present intelligibility problems and modifies those portions in an appropriate manner. First, the speech signal is divided into a plurality of time-based frames. Each of the frames is then analyzed to determine a sound type associated with the frame. Selected frames are then modified based on the sound type associated with the frame or with surrounding frames. For example, the amplitude of frames determined to include unvoiced plosive sounds may be boosted as these sounds are known to be important to intelligibility and are typically harder to hear than other sounds in normal speech. In a similar manner, the amplitudes of frames preceding such unvoiced plosive sounds can be reduced to better accentuate the plosive. Such techniques will make these sounds easier to distinguish upon subsequent playback.

Type: Grant

Filed: June 1, 2000

Date of Patent: May 3, 2005

Assignee: Avaya Technology Corp.

Inventor: Paul Roller Michaelis
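
The frame-level modification in the abstract of patent 6889186 can be sketched as follows, assuming each frame already carries a sound-type label. The label string `"unvoiced_plosive"`, the function name, and the gain values `boost` and `cut` are hypothetical; the abstract does not specify how frames are classified or which gains are used.

```python
from typing import List, Sequence


def accentuate_plosives(frames: Sequence[Sequence[float]],
                        sound_types: Sequence[str],
                        boost: float = 1.5,
                        cut: float = 0.75) -> List[List[float]]:
    """Boost unvoiced-plosive frames and attenuate the frame just before each."""
    out = [list(f) for f in frames]              # copy, leave the input untouched
    for i, kind in enumerate(sound_types):
        if kind != "unvoiced_plosive":
            continue
        out[i] = [s * boost for s in frames[i]]  # make the plosive louder
        if i > 0 and sound_types[i - 1] != "unvoiced_plosive":
            # Quieter lead-in so the plosive stands out more.
            out[i - 1] = [s * cut for s in frames[i - 1]]
    return out
```

Only the single preceding frame is attenuated here; a real system might reduce several preceding frames or apply a smooth gain ramp.
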
Systems and methods for TV navigation with compressed voice-activated commands

Patent number: 6889191

Abstract: A method, apparatus and system that receive speech commands at a remote control device, digitize those speech commands, and transmit the digitized speech commands to an electronic device, such as a digital home communication terminal (DHCT). The electronic device interprets the speech commands to allow the remote control operator to control the electronic device.

Type: Grant

Filed: December 3, 2001

Date of Patent: May 3, 2005

Assignee: Scientific-Atlanta, Inc.

Inventors: Arturo A. Rodriguez, David A. Sedacca, Albert Garcia
Bit error concealment methods for speech coding

Patent number: 6885988

Abstract: A method of concealing bit errors in a signal is provided. The method comprises encoding a signal parameter according to a set of constraints placed on a signal parameter quantizer. The encoded signal parameter is decoded and compared against the set of constraints. Finally, the method includes declaring the decoded signal parameter invalid when the set of constraints is violated. Training binned ranges of gain values provide a threshold for selecting data segments to examine for violation of constraints on gain differences. Further, an additional method comprises training a threshold function T(qlg(m-1), Δqlg(m-1)) used in a codec bit error detecting technique. The threshold function is based upon a first training file having N signal segments. The method includes encoding the first training file and determining gain values qlg(m) of each of the N signal segments within the encoded first training file. The gain values form a range and the range is divided into bins.

Type: Grant

Filed: August 19, 2002

Date of Patent: April 26, 2005

Assignee: Broadcom Corporation

Inventor: Juin-Hwey Chen
Method for producing a speech rendition of text from diphone sounds

Patent number: 6879957

Abstract: A text-to-speech system utilizes a method for producing a speech rendition of text based on dividing some or all words of a sentence into component diphones. A phonetic dictionary is aligned so that each letter within each word has a single corresponding phoneme. The aligned dictionary is analyzed to determine the most common phoneme representation of the letter in the context of a string of letters before and after it. The results for each letter are stored in a phoneme rule matrix. A diphone database is created using a wav editor to cut 2,000 distinct diphones out of specially selected words. A computer algorithm selects a phoneme for each letter. Then, two phonemes are used to create a diphone. Words are then read aloud by concatenating sounds from the diphone database. In one embodiment, diphones are used only when a word is not one of a list of pre-recorded words.

Type: Grant

Filed: September 1, 2000

Date of Patent: April 12, 2005

Inventors: William H. Pechter, Joseph E. Pechter
Speech recognition system and method permitting user customization

Patent number: 6873951

Abstract: A system and method for speech recognition includes a speaker-independent set of stored word representations derived from speech of many users deemed to be typical speakers and for use by all users, and may further include speaker-dependent sets of stored word representations specific to each user. The speaker-dependent sets may be used to store custom commands, so that a user may replace default commands to customize and simplify use of the system. Utterances from a user which match stored words in either set according to the ordering rules are reported as words.

Type: Grant

Filed: September 29, 2000

Date of Patent: March 29, 2005

Assignee: Nortel Networks Limited

Inventors: Lin Lin, Jason Adelman, Lloyd Florence, Ping Lin
Pattern recognition with criterion for output from selected model to trigger succeeding models

Patent number: 6871177

Abstract: A method and apparatus of recognizing a pattern comprising a sequence of sub-patterns includes a set of possible patterns being modelled by a network of sub-pattern models. One or more initial software model objects are instantiated first. As these models produce outputs, succeeding model objects are instantiated if they have not already been instantiated. However, the succeeding model objects are only instantiated if a triggering model output meets a predetermined criterion. This ensures that the processing required is maintained at a manageable level. If the models comprise finite state networks, pruning of internal states may also be performed. The criterion applied to this pruning is less harsh than that applied when determining whether to instantiate a succeeding model.

Type: Grant

Filed: October 27, 1998

Date of Patent: March 22, 2005

Assignee: British Telecommunications public limited company

Inventors: Simon A Hovell, Mark Wright, Simon P. A Ringland
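
The conditional instantiation in the abstract of patent 6871177 can be sketched as one propagation step over a network of models. The representation is assumed: `successors` maps each model name to the models that follow it, `outputs` holds the latest score for each live model, and `threshold` stands in for the "predetermined criterion". All names are hypothetical.

```python
from typing import Dict, List, Set


def instantiate_successors(successors: Dict[str, List[str]],
                           outputs: Dict[str, float],
                           live: Set[str],
                           threshold: float) -> Set[str]:
    """Return the set of live models after one propagation step."""
    new_live = set(live)
    for model in live:
        score = outputs.get(model)
        if score is None or score < threshold:
            continue                  # output too weak: successors stay dormant
        for nxt in successors.get(model, []):
            new_live.add(nxt)         # set semantics: no duplicate instantiation
    return new_live
```

Because weakly scoring models never trigger their successors, the number of live models grows only along promising paths, which is how the processing load stays bounded.
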
Speech synthesis device handling phoneme units of extended CV

Patent number: 6847932

Abstract: Given phonetic information is divided into speech units of extended CV which is a contiguous sequence of phonemes without clear distinction containing a vowel or some vowels. Contour of vocal tract transmission function of phoneme of the speech unit of extended CV is obtained from the phoneme directory which contains a contour of vocal tract transmission function of each phoneme associated with phonetic information in a unit of extended CV. Speech waveform data is generated based on the contour of vocal tract transmission function of phoneme of the speech unit of extended CV. Speech waveform data is converted into analog voice signal.

Type: Grant

Filed: September 28, 2000

Date of Patent: January 25, 2005

Assignee: Arcadia, Inc.

Inventors: Kazuyuki Ashimura, Seiichi Tenpaku
