Patents Examined by Donald L. Storm
  • Patent number: 6999921
    Abstract: To address the need for reducing audio overhang in wireless communication systems (e.g., 100), the present invention provides for the deletion of silent frames before they are converted to audio by the listening devices. The present invention only provides for the deletion of a portion of the silent frames that make up a period of silence or low voice activity in the speaker's audio. Voice frames that make up periods of silence less than a given length of time are not deleted.
    Type: Grant
    Filed: December 13, 2001
    Date of Patent: February 14, 2006
    Assignee: Motorola, Inc.
    Inventors: John M. Harris, Philip J. Fleming, Joseph Tobin
  • Patent number: 6990448
    Abstract: The data structure is used in accessing a plurality of data files. The data structure comprises a plurality of annotation storage areas adapted to correspond with the data files, each annotation storage area containing an annotation representing a time sequential signal and each annotation storage area comprising a plurality of block storage areas each containing phoneme and word data forming a respective temporal block of the annotation and each block having an associated time index identifying a timing of the block within the corresponding annotation. Each block storage area includes a plurality of node storage areas, each associated with a node which represents a point in time at which a word and/or phoneme begins or ends within the corresponding annotation, and each node storage area having a time offset storage area containing a time offset defining the point in time represented by the node relative to the time index associated with the corresponding block.
    Type: Grant
    Filed: August 23, 2001
    Date of Patent: January 24, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventors: Jason Peter Andrew Charlesworth, Philip Neil Garner, Jebu Jacob Rajan
  • Patent number: 6988064
    Abstract: A system, computer readable medium, and method for sampling a speech signal; dividing the sampled speech signal into overlapped frames; extracting first pitch information from a frame using frequency domain analysis; providing at least one pitch candidate, each being associated with a spectral score, from the first pitch information, each of the at least one pitch candidate representing a possible pitch estimate for the frame; extracting second pitch information from the frame using a time domain analysis; providing a correlation score for the at least one pitch candidate from the second pitch information; and selecting one of the at least one pitch candidate to represent the pitch estimate of the frame. The system, computer readable medium, and method are suitable for speech coding and for distributed speech recognition.
    Type: Grant
    Filed: March 31, 2003
    Date of Patent: January 17, 2006
    Assignees: Motorola, Inc., International Business Machines Corporation
    Inventors: Tenkasi V. Ramabadran, Alexander Sorin
  • Patent number: 6988072
    Abstract: Conversational dialog with a computer or other processor-based device without requiring push-to-talk functionality. In one embodiment, a computer-implemented method first determines that a user desires to engage in a dialog. Based thereon the method turns on a speech recognition functionality for a period of time referred to as a listening horizon. Upon the listening horizon expiring, the method turns off the speech recognition functionality.
    Type: Grant
    Filed: July 7, 2004
    Date of Patent: January 17, 2006
    Assignee: Microsoft Corporation
    Inventor: Eric Horvitz
  • Patent number: 6985853
    Abstract: A decoder (10) decodes compressed data. A memory (44) stores the compressed data and stores operating data and operating code for a plurality of decompression algorithms requiring different amounts of memory for the operating data and operating code and requiring different amounts of memory to store compressed data corresponding to a predetermined amount of uncompressed data. A processor (42) is arranged to select one of the decompression algorithms, to allocate an amount of the memory for storing compressed data and operating data and operating code depending on the decompression algorithm selected and to decode the compressed data stored in the allocated amount of memory.
    Type: Grant
    Filed: February 28, 2002
    Date of Patent: January 10, 2006
    Assignee: Broadcom Corporation
    Inventors: Paul Morton, Darwin Rambo
  • Patent number: 6985861
    Abstract: A computer-based detection (e.g., speech recognition) system combines a word decoder and subword decoder to detect words (or phrases) in a spoken input provided by a user into a speaker connected to the detection system. The word decoder detects words by comparing an input pattern (e.g., of hypothetical word matches) to reference patterns (e.g., words). The subword decoder compares an input pattern (e.g., hypothetical word matches based on subword or phoneme recognition) to reference patterns (e.g., words) based on a word pronunciation distance measure that indicates how close each input pattern is to matching each reference pattern. The subword decoder sorts the source set of reference patterns based on a closeness of each reference pattern to correctly matching the input pattern based on generated pattern comparisons. The word decoder and subword decoder each provide an N-best list of hypothetical matches to the spoken input.
    Type: Grant
    Filed: December 12, 2001
    Date of Patent: January 10, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Jean-Manuel Van Thong, Ernest Pusateri
  • Patent number: 6965860
    Abstract: A speech processing apparatus and method are provided for processing an input speech signal to compensate for the effects of noise in the input speech signal. The method and apparatus divide the input speech signal into a plurality of sequential time frames and a set of spectral parameters are extracted for each time frame, which parameters are representative of the input signal during the time frame. The system then processes the input speech by scaling the parameters for each frame in dependence upon a measure of the signal to noise ratio for the input frame. In this way, the effects of additive noise on the input signal can be reduced.
    Type: Grant
    Filed: April 19, 2000
    Date of Patent: November 15, 2005
    Assignee: Canon Kabushiki Kaisha
    Inventors: David Llewellyn Rees, Robert Alexander Keiller
  • Patent number: 6961704
    Abstract: An arrangement is provided for text to speech processing based on linguistic prosodic models. Linguistic prosodic models are established to characterize different linguistic prosodic characteristics. When an input text is received, a target unit sequence is generated with a linguistic target that annotates target units in the target unit sequence with a plurality of linguistic prosodic characteristics so that speech synthesized in accordance with the target unit sequence and the linguistic target has certain desired prosodic properties. A unit sequence is selected in accordance with the target unit sequence and the linguistic target based on joint cost information evaluated using established linguistic prosodic models. The selected unit sequence is used to produce synthesized speech corresponding to the input text.
    Type: Grant
    Filed: January 31, 2003
    Date of Patent: November 1, 2005
    Assignee: Speechworks International, Inc.
    Inventors: Michael S. Phillips, Daniel S. Faulkner, Marek A. Przezdzieci
  • Patent number: 6934677
    Abstract: Quantization matrices facilitate digital audio encoding and decoding. An audio encoder generates and compresses quantization matrices; an audio decoder decompresses and applies the quantization matrices. The invention includes several techniques and tools, which can be used in combination or separately. For example, the audio encoder can generate quantization matrices from critical band patterns for blocks of audio data. The encoder can compute the quantization matrices directly from the critical band patterns, which can be computed from the same audio data that is being compressed. The audio encoder/decoder can use different modes for generating/applying quantization matrices depending on the coding channel mode of multi-channel audio data. The audio encoder/decoder can use different compression/decompression modes for the quantization matrices, including a parametric compression/decompression mode.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: August 23, 2005
    Assignee: Microsoft Corporation
    Inventors: Wei-Ge Chen, Naveen Thumpudi, Ming-Chieh Lee
  • Patent number: 6928410
    Abstract: A method and apparatus for modification of a speech signal indicative of a stream of speech data having a plurality of syllables. The method comprises the steps of mapping the stream of speech data from the speech signal into a stream of tone data according to a linguistic rule regarding the syllables for providing a tone signal indicative of the stream of tone data; forming a string of musical notes responsive to the tone signal for providing a carrier signal indicative of the string of musical notes; modulating the carrier signal with the speech signal for providing a modified signal; and providing an audible signal representative of the speech signal, musically modified according to the linguistic rule. The linguistic rule includes an assignment of a tone to a syllable of the speech data based on a vowel of the syllable, a consonant of the syllable, the intonation of the syllable for a monosyllabic language.
    Type: Grant
    Filed: November 6, 2000
    Date of Patent: August 9, 2005
    Assignee: Nokia Mobile Phones Ltd.
    Inventors: Juha Marila, Sami Ronkainen, Mika Röykkee, Fumiko Ichikawa
  • Patent number: 6920425
    Abstract: A system, method, and computer readable medium storing a software program for translating a script for an interactive voice response system to a script for a visual interactive response system. The visual interactive response system executes the translated visual-based script when a user using a display telephone calls the visual interactive response system. The visual interactive response system then transmits a visual menu to the display telephone to allow the user to select a desired response, which is subsequently sent back to the visual interactive response system for processing. The voice-based script may be defined in voice extensible markup language and the visual-based script may be defined in wireless markup language, hypertext markup language, or handheld device markup language.
    Type: Grant
    Filed: May 16, 2000
    Date of Patent: July 19, 2005
    Assignee: Nortel Networks Limited
    Inventors: Craig A. Will, Wayne N. Shelley
  • Patent number: 6917918
    Abstract: An unsupervised adaptation method and apparatus are provided that reduce the storage and time requirements associated with adaptation. Under the invention, utterances are converted into feature vectors, which are decoded to produce a transcript and alignment unit boundaries for the utterance. Individual alignment units and the feature vectors associated with those alignment units are then provided to an alignment function, which aligns the feature vectors with the states of each alignment unit. Because the alignment is performed within alignment unit boundaries, fewer feature vectors are used and the time for alignment is reduced. After alignment, the feature vector dimensions aligned to a state are added to dimension sums that are kept for that state. After all the states in an utterance have had their sums updated, the speech signal and the alignment units are deleted. Once sufficient frames of data have been received to perform adaptive training, the acoustic model is adapted.
    Type: Grant
    Filed: December 22, 2000
    Date of Patent: July 12, 2005
    Assignee: Microsoft Corporation
    Inventors: William H. Rockenbeck, Milind V. Mahajan, Fileno A. Alleva
  • Patent number: 6912496
    Abstract: Pursuant to one aspect of the invention, a prefilter module that incorporates an inverse filter is used in conjunction with an encoder. The inverse filter has an inverse frequency response of a frequency response of a filter that simulates speech having transmission path characteristics, such as telephone-channel bandwidth speech, and/or noisy speech. The inverse filter is used to compensate transmission path characteristics of an input signal. The inverse filter can be designed using several methods, such as, for example, an autoregressive model or a moving average model. Pursuant to a second aspect of the invention, a parameter preprocessor is used in conjunction with a decoder. The parameter preprocessor performs pitch rectification through use of a median and linear filter, and updates spectral amplitudes and voicing parameter depending on the pitch rectification.
    Type: Grant
    Filed: October 26, 2000
    Date of Patent: June 28, 2005
    Assignee: Silicon Automation Systems
    Inventors: Puranjoy Bhattacharya, Manoj Kumar Singhal, Sangeetha Dummy
  • Patent number: 6889186
    Abstract: A system for processing a speech signal to enhance signal intelligibility identifies portions of the speech signal that include sounds that typically present intelligibility problems and modifies those portions in an appropriate manner. First, the speech signal is divided into a plurality of time-based frames. Each of the frames is then analyzed to determine a sound type associated with the frame. Selected frames are then modified based on the sound type associated with the frame or with surrounding frames. For example, the amplitude of frames determined to include unvoiced plosive sounds may be boosted as these sounds are known to be important to intelligibility and are typically harder to hear than other sounds in normal speech. In a similar manner, the amplitudes of frames preceding such unvoiced plosive sounds can be reduced to better accentuate the plosive. Such techniques will make these sounds easier to distinguish upon subsequent playback.
    Type: Grant
    Filed: June 1, 2000
    Date of Patent: May 3, 2005
    Assignee: Avaya Technology Corp.
    Inventor: Paul Roller Michaelis
  • Patent number: 6889191
    Abstract: A method, apparatus and system that receives speech commands at a remote control device, digitizes those speech commands, and transmits the digitized speech commands to an electronic device, such as a digital home communication terminal (DHCT). The electronic device interprets the speech commands to allow the remote control operator to control the electronic device.
    Type: Grant
    Filed: December 3, 2001
    Date of Patent: May 3, 2005
    Assignee: Scientific-Atlanta, Inc.
    Inventors: Arturo A. Rodriguez, David A. Sedacca, Albert Garcia
  • Patent number: 6885988
    Abstract: A method of concealing bit errors in a signal is provided. The method comprises encoding a signal parameter according to a set of constraints placed on a signal parameter quantizer. The encoded signal parameter is decoded and compared against the set of constraints. Finally, the method includes declaring the decoded signal parameter invalid when the set of constraints is violated. Training binned ranges of gain values provide a threshold for selecting data segments to examine for violation of constraints on gain differences. Further, an additional method comprises training a threshold function T(qlg(m−1), Δqlg(m−1)) used in a codec bit error detecting technique. The threshold function is based upon a first training file having N signal segments. The method includes encoding the first training file and determining gain values qlg(m) of each of the N signal segments within the encoded first training file. The gain values form a range and the range is divided into bins.
    Type: Grant
    Filed: August 19, 2002
    Date of Patent: April 26, 2005
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
  • Patent number: 6879957
    Abstract: A text-to-speech system utilizes a method for producing a speech rendition of text based on dividing some or all words of a sentence into component diphones. A phonetic dictionary is aligned so that each letter within each word has a single corresponding phoneme. The aligned dictionary is analyzed to determine the most common phoneme representation of the letter in the context of a string of letters before and after it. The results for each letter are stored in a phoneme rule matrix. A diphone database is created using a wav editor to cut 2,000 distinct diphones out of specially selected words. A computer algorithm selects a phoneme for each letter. Then, two phonemes are used to create a diphone. Words are then read aloud by concatenating sounds from the diphone database. In one embodiment, diphones are used only when a word is not one of a list of pre-recorded words.
    Type: Grant
    Filed: September 1, 2000
    Date of Patent: April 12, 2005
    Inventors: William H. Pechter, Joseph E. Pechter
  • Patent number: 6873951
    Abstract: A system and method for speech recognition includes a speaker-independent set of stored word representations derived from speech of many users deemed to be typical speakers and for use by all users, and may further include speaker-dependent sets of stored word representations specific to each user. The speaker-dependent sets may be used to store custom commands, so that a user may replace default commands to customize and simplify use of the system. Utterances from a user which match stored words in either set according to the ordering rules are reported as words.
    Type: Grant
    Filed: September 29, 2000
    Date of Patent: March 29, 2005
    Assignee: Nortel Networks Limited
    Inventors: Lin Lin, Jason Adelman, Lloyd Florence, Ping Lin
  • Patent number: 6871177
    Abstract: A method and apparatus of recognizing a pattern comprising a sequence of sub-patterns includes a set of possible patterns being modelled by a network of sub-pattern models. One or more initial software model objects are instantiated first. As these models produce outputs, succeeding model objects are instantiated if they have not already been instantiated. However, the succeeding model objects are only instantiated if a triggering model output meets a predetermined criterion. This ensures that the processing required is maintained at a manageable level. If the models comprise finite state networks, pruning of internal states may also be performed. The criterion applied to this pruning is less harsh than that applied when determining whether to instantiate a succeeding model.
    Type: Grant
    Filed: October 27, 1998
    Date of Patent: March 22, 2005
    Assignee: British Telecommunications public limited company
    Inventors: Simon A Hovell, Mark Wright, Simon P. A Ringland
  • Patent number: 6847932
    Abstract: Given phonetic information is divided into speech units of extended CV, each of which is a contiguous sequence of phonemes without clear distinction containing one or more vowels. A contour of the vocal tract transmission function of each phoneme of the speech unit of extended CV is obtained from the phoneme directory, which contains a contour of the vocal tract transmission function of each phoneme associated with phonetic information in a unit of extended CV. Speech waveform data is generated based on the contour of the vocal tract transmission function of each phoneme of the speech unit of extended CV. The speech waveform data is converted into an analog voice signal.
    Type: Grant
    Filed: September 28, 2000
    Date of Patent: January 25, 2005
    Assignee: Arcadia, Inc.
    Inventors: Kazuyuki Ashimura, Seiichi Tenpaku