Patents Examined by Daniel Nolan

Speech driven data selection in a voice-enabled program

Patent number: 6832196

Abstract: A method of dynamically formatting a speech menu construct can include a series of steps. A markup language document containing a reference to a server-side program can be provided. The server-side program can be programmed to dynamically format data using a voice-enabled markup language. A database can be accessed using the server-side program. The database can have a plurality of data items. Using the voice-enabled markup language, the selected data items can be formatted thereby creating speech menu items. The speech menu items can specify a speech menu construct resulting in a menu interface that is dynamically generated from data in a data store, rather than being written by a programmer, and allows the user to “speak to the data.”

Type: Grant

Filed: March 30, 2001

Date of Patent: December 14, 2004

Assignee: International Business Machines Corporation

Inventor: David E. Reich

Concealment of frame erasures and method

Patent number: 6826527

Abstract: A decoder for code excited LP encoded frames with both adaptive and fixed codebooks; erased frame concealment uses muted repetitive excitation, threshold-adapted bandwidth expanded repetitive synthesis filter, and jittered repetitive pitch lag.

Type: Grant

Filed: November 3, 2000

Date of Patent: November 30, 2004

Assignee: Texas Instruments Incorporated

Inventor: Takahiro Unno

Low bit-rate coding of unvoiced segments of speech

Patent number: 6820052

Abstract: A low-bit-rate coding technique for unvoiced segments of speech includes the steps of extracting high-time-resolution energy coefficients from a frame of speech, quantizing the energy coefficients, generating a high-time-resolution energy envelope from the quantized energy coefficients, and reconstituting a residue signal by shaping a randomly generated noise vector with quantized values of the energy envelope. The energy envelope may be generated with a linear interpolation technique. A post-processing measure may be obtained and compared with a predefined threshold to determine whether the coding algorithm is performing adequately.
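The steps named in this abstract can be illustrated with a minimal Python sketch. It is not the patented coder: the subframe count, the uniform quantizer, its step size, and the Gaussian noise source are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math
import random

def subframe_energies(frame, n_sub):
    """Mean energy of each of n_sub equal-length subframes (assumed layout)."""
    size = len(frame) // n_sub
    return [sum(x * x for x in frame[i * size:(i + 1) * size]) / size
            for i in range(n_sub)]

def quantize(values, step):
    """Uniform scalar quantizer; an illustrative stand-in for the real one."""
    return [round(v / step) * step for v in values]

def interpolate_envelope(coeffs, length):
    """Linearly interpolate the coefficients to one value per sample.

    Assumes length >= 2.
    """
    if len(coeffs) == 1:
        return [coeffs[0]] * length
    envelope = []
    for n in range(length):
        pos = n * (len(coeffs) - 1) / (length - 1)
        lo = min(int(pos), len(coeffs) - 2)
        frac = pos - lo
        envelope.append((1 - frac) * coeffs[lo] + frac * coeffs[lo + 1])
    return envelope

def synthesize_unvoiced(frame, n_sub=8, step=0.01, seed=0):
    """Rebuild a residue by shaping random noise with the energy envelope."""
    energies = quantize(subframe_energies(frame, n_sub), step)
    envelope = interpolate_envelope(energies, len(frame))
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    noise = [rng.gauss(0.0, 1.0) for _ in frame]
    # Energy is a squared quantity, so scale the noise by its square root.
    return [math.sqrt(e) * w for e, w in zip(envelope, noise)]
```

Only the quantized energies would need to be transmitted in such a scheme; the decoder can generate its own noise vector, which is why the approach suits low bit rates.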

Type: Grant

Filed: July 17, 2002

Date of Patent: November 16, 2004

Assignee: Qualcomm Incorporated

Inventors: Amitava Das, Sharath Manjunath

Expressivity of voice synthesis by emphasizing source signal features

Patent number: 6804649

Abstract: Voice synthesis with improved expressivity is obtained in a voice synthesiser of source-filter type by making use of a library of source sound categories in the source module. Each source sound category corresponds to a particular morphological category and is derived from analysis of real vocal sounds, by inverse filtering so as to subtract the effect of the vocal tract. The library may be parametrical, that is, the stored data corresponds not to the inverse-filtered sounds themselves but to synthesis coefficients for resynthesising the inverse-filtered sounds using any suitable re-synthesis technique, such as the phase vocoder technique. The coefficients are derived by Short Time Fourier Transform (STFT) analysis.

Type: Grant

Filed: June 1, 2001

Date of Patent: October 12, 2004

Assignee: Sony France S.A.

Inventor: Eduardo Reck Miranda

Variable bit rate speech encoding after gain suppression

Patent number: 6799161

Abstract: A speech coding apparatus having a speech input unit for receiving input speech, a speech coding rate selector for selecting an appropriate speech coding rate according to the power of the input speech, a speech analyzer for processing the input speech to estimate a transfer function of the speaker's oral cavity, and a speech coding unit forming a synthesis filter based on the transfer function of the oral cavity. The speech coding unit also codes an excitation signal of the synthesis filter on the basis of an estimation result supplied by the speech analyzer. A gain suppressor interposed between the speech input unit and the speech coding unit suppresses the gain of a signal supplied from the speech input unit to the speech coding unit during an unvoiced period according to information from the speech coding rate selector.

Type: Grant

Filed: January 15, 2002

Date of Patent: September 28, 2004

Assignee: Oki Electric Industry Co., Ltd.

Inventor: Atsushi Yokoyama

Alternate window compression/decompression method, apparatus, and system

Patent number: 6785644

Abstract: With respect to data having periodicity to be compressed, windows of the same size are set for every two sections according to an interval of peaks appearing substantially periodically and processing for sorting sample data alternately among the set windows of the same size is sequentially performed, whereby a frequency of data having periodicity is replaced with an approximately half frequency without damaging reproducibility to original data at all to make it possible to apply compression processing to data of the replaced low frequency. If this sorting processing is applied to compression processing having a characteristic that a compression ratio is not increased in a high-frequency region, it becomes possible to improve a compression ratio without damaging a quality of reproduced data by decompression at all.

Type: Grant

Filed: December 16, 2002

Date of Patent: August 31, 2004

Assignee: Yasue Sakai

Inventor: Yukio Koyanagi

Digital audio reproducing apparatus

Patent number: 6775654

Abstract: A digital audio reproducing apparatus including a receiver receiving modulated data, a demodulator demodulating the modulated data received by the receiver, an audio decoder decoding, in a unit of a frame, digital audio information contained in the modulated data demodulated by the demodulator, and an audibility corrector for effecting audibility correction on failing digital audio information contained in a frame that failed to be decoded, when the audio decoder fails to decode the digital audio information.

Type: Grant

Filed: August 31, 1999

Date of Patent: August 10, 2004

Assignees: Fujitsu Limited, FFC Limited

Inventors: Hideaki Yokoyama, Kazuhisa Matsushima, Hiroshi Okubo, Tadayoshi Katoh, Takashi Saito

Application server configured for dynamically generating web pages for voice enabled web applications

Patent number: 6766298

Abstract: A unified web-based voice messaging system provides voice application control between a web browser and an application server via a hypertext transport protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an HTML page from the application server having an XML element that defines data for an audio operation to be performed by an executable audio resource. The application server executes the voice-enabled web application by runtime execution of extensible markup language (XML) documents that define the voice-enabled web application to be executed. The application server, in response to receiving a user request from a user, accesses a selected XML page that defines at least a part of the voice application to be executed for the user. The application server then parses the XML page, and executes the operation described by the XML page.

Type: Grant

Filed: January 11, 2000

Date of Patent: July 20, 2004

Assignee: Cisco Technology, Inc.

Inventors: Lewis Dean Dodrill, Geetha Ravishankar, Satish Joshi, Keith M. Basil, Ryan Alan Danner, James Richard Grove, Jr., Steven J. Martin

Codebook tables for multi-rate encoding and decoding with pre-gain and delayed-gain quantization tables

Patent number: 6757649

Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

Type: Grant

Filed: April 8, 2003

Date of Patent: June 29, 2004

Assignee: Mindspeed Technologies Inc.

Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

System and method for concurrent presentation of multiple audio information sources

Patent number: 6757656

Abstract: A method for concurrent presentation of multiple audio information sources. In the method, audio information from at least two audio information sources is concurrently presented, and a user speech selection of one of the audio information sources is accepted. At least one of the audio information sources can then be reconfigured. The reconfiguration audibly distinguishes the user selected audio information source from other audio information sources.

Type: Grant

Filed: June 15, 2000

Date of Patent: June 29, 2004

Assignee: International Business Machines Corporation

Inventors: Qing Gong, James R. Lewis, Ronald E. Vanbuskirk, Huifang Wang

Voice-actuated generation of documents containing photographic identification

Patent number: 6751589

Abstract: A preferred method for generating a document includes the steps of: providing an applicant with a visual representation, via a visual display device, of at least a portion of a document; prompting an applicant to provide first information corresponding to a first portion of the document; receiving the first information, as a first vocal response, from the applicant; converting the first vocal response to corresponding first textual data; providing the applicant with an updated visual representation, via the visual display device, of the first textual data appearing at the first portion of the document; and generating a printed document corresponding to the updated visual representation of the document. Systems and computer readable media also are provided.

Type: Grant

Filed: September 18, 2000

Date of Patent: June 15, 2004

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Gustavo M. Guillemin

Data transfer system and data transfer method

Patent number: 6741964

Abstract: When recording digital data corresponding to a voice signal, a voice data recording and reproducing apparatus generates an error correction code and records this code together with the digital data in semiconductor memory. When transferring the digital data to the PC, a system control section in the voice data recording and reproducing apparatus transmits voice data including the error correction code without performing error correction. The system control section provides a lower data processing capability than that of a PC's CPU. The PC's CPU having a higher data processing capability performs error correction of the voice data by using the error correction code included in the received voice data.

Type: Grant

Filed: January 8, 2001

Date of Patent: May 25, 2004

Assignee: Olympus Optical Co., Ltd.

Inventor: Hideo Okano

Encoding and decoding speech signals variably based on signal classification

Patent number: 6735567

Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

Type: Grant

Filed: April 8, 2003

Date of Patent: May 11, 2004

Assignee: Mindspeed Technologies, Inc.

Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su

Portrayal of talk group at a location in virtual audio space for identification in telecommunication system management

Patent number: 6735564

Abstract: A method and arrangement for managing talk groups of a telecommunication system at a dispatcher station of the telecommunications system having one or more talk groups which may consist of one or more users and which are controlled by the dispatcher at the dispatcher station. The arrangement includes a two-channel or a multichannel sound reproducing system which is configured to create an artificial acoustic space at the dispatcher station, and reproduce voices of each talk group so that the voices are heard from a certain point of the acoustic space, which allows the dispatcher to recognize the talk group to which the voice belongs on the basis of the location of the voice.

Type: Grant

Filed: December 28, 2000

Date of Patent: May 11, 2004

Assignee: Nokia Networks Oy

Inventor: Pekka Puhakainen

System and method for automated multimedia content indexing and retrieval

Patent number: 6714909

Abstract: The invention provides a system and method for automatically indexing and retrieving multimedia content. The method may include separating a multimedia data stream into audio, visual and text components, segmenting the audio, visual and text components based on semantic differences, identifying at least one target speaker using the audio and visual components, identifying a topic of the multimedia event using the segmented text and topic category models, generating a summary of the multimedia event based on the audio, visual and text components, the identified topic and the identified target speaker, and generating a multimedia description of the multimedia event based on the identified target speaker, the identified topic, and the generated summary.

Type: Grant

Filed: November 21, 2000

Date of Patent: March 30, 2004

Assignee: AT&T Corp.

Inventors: David Crawford Gibbon, Qian Huang, Zhu Liu, Aaron Edward Rosenberg, Behzad Shahraray

Hand-held transmitter having speech storage actuated by transmission failure

Patent number: 6711545

Abstract: A device for processing a speech signal is disclosed comprising a speech signal input and speech signal digitizer, a control signal input and control signal digitizer, a digital signal processor, a transmitter to transmit processed digital signals in a cordless fashion to a base station, and a storage means, which is connected upstream of the transmitter and which, depending on the quality of the communication link between the device and the base station, buffers the processed digital signals to be transmitted between the device and the base station.

Type: Grant

Filed: October 11, 2000

Date of Patent: March 23, 2004

Assignee: Koninklijke Philips Electronics N.V.

Inventor: Manfred Hörndl

Method and apparatus for using formant models in resonance control for speech systems

Patent number: 6708154

Abstract: A model is provided for formants found in human speech. Under one aspect of the invention, the model is used to synthesize speech. Under this aspect of the invention, the formant model is used to identify a most likely formant track for the synthesized speech. Based on this track, a series of resonators are used to introduce the formants into the speech signal.

Type: Grant

Filed: November 14, 2002

Date of Patent: March 16, 2004

Assignee: Microsoft Corporation

Inventor: Alejandro Acero

Recursively excited linear prediction speech coder

Patent number: 6704703

Abstract: The excitation in a CELP-like speech coder is recursively calculated. For a given bitrate and a given complexity, the recursive approach described lowers the complexity with minimum impact on speech quality. The excitation signal is a sum of at least three vector terms, each vector term being a product of a codebook vector $z_k$ and an associated gain term $g_k$. A first vector term $g_0 z_0$ is determined that is representative of a target excitation vector $x$. Each remaining vector term is recursively determined as a vector term $g_k z_k$ representative of the difference between the target excitation vector $x$ and the sum of previously determined vector terms, $\sum_{i=0}^{k-1} g_i z_i$.
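The recursion in this abstract resembles a greedy multi-stage residual search, which the following Python sketch illustrates. It is an assumption-laden simplification: a real CELP coder would typically compare candidates through a perceptually weighted synthesis filter, whereas this sketch uses a plain least-squares match, and the function names and codebook layout are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def recursive_excitation(target, codebooks):
    """Pick one (gain, vector) term per codebook, each fitted to the residual.

    Stage k approximates x - sum_{i<k} g_i z_i, as the abstract describes.
    Assumes every codebook vector is non-zero.
    """
    residual = list(target)
    terms = []
    for codebook in codebooks:
        # Best vector: the one removing the most residual energy.
        best = max(codebook, key=lambda z: dot(residual, z) ** 2 / dot(z, z))
        # Least-squares optimal gain for that vector.
        gain = dot(residual, best) / dot(best, best)
        terms.append((gain, best))
        residual = [r - gain * z for r, z in zip(residual, best)]
    return terms, residual

# Usage: three stages, matching the "at least three vector terms" wording.
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
terms, residual = recursive_excitation([3, 4, 0], [basis, basis, basis])
```

Because each stage searches a single codebook against a fixed residual, the cost grows linearly with the number of terms rather than with the product of the codebook sizes, which is one plausible reading of the complexity claim.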

Type: Grant

Filed: February 2, 2001

Date of Patent: March 9, 2004

Assignee: ScanSoft, Inc.

Inventors: Mohand Ferhaoul, Jean-Francois Rasaminjanahary, Stefaan Van Gerven, Abderrahman Essebbar

System and method for improving the accuracy of a speech recognition program

Patent number: 6704709

Abstract: A system and method for improving the accuracy of a speech recognition program. The system is based on a speech recognition program that automatically converts a pre-recorded audio file into a written text. The system parses the written text into segments, each of which can be corrected by the system and saved in a retrievable manner in association with the computer. The standard speech files are saved towards improving accuracy in speech-to-text conversion by the speech recognition program. The system further includes facilities to repetitively establish an independent instance of the written text from the pre-recorded audio file using the speech recognition program. This independent instance can then be broken into segments and each erroneous segment in said independent instance replaced with the corrected segment associated with that segment. In this manner, repetitive instruction of a speech recognition program can be facilitated.

Type: Grant

Filed: July 26, 2000

Date of Patent: March 9, 2004

Assignee: Custom Speech USA, Inc.

Inventors: Jonathan Kahn, Thomas P Flynn, Charles Qin, Nicholas A. Linden

Formant-based speech synthesizer employing demi-syllable concatenation with independent cross fade in the filter parameter and source domains

Patent number: RE39336

Abstract: The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model that uses source signals that correspond closely to the human glottal source and that uses filter parameters that correspond closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques, one applied in the time domain in the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
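The two independent cross fades can be pictured with a small Python sketch. The linear fade shape, the overlap length, and the representation of filter parameters as plain lists of numbers (for example formant frequencies) are assumptions for illustration; the abstract does not specify them, and both function names are hypothetical.

```python
def crossfade(a, b, overlap):
    """Join waveforms a and b, linearly fading a's tail into b's head."""
    if not 0 < overlap <= min(len(a), len(b)):
        raise ValueError("overlap must fit inside both waveforms")
    faded = []
    for n in range(overlap):
        w = (n + 1) / (overlap + 1)  # weight of b rises towards 1
        faded.append((1 - w) * a[len(a) - overlap + n] + w * b[n])
    return a[:len(a) - overlap] + faded + b[overlap:]

def interpolate_params(p_start, p_end, steps):
    """Per-frame filter parameters moving linearly from p_start to p_end.

    Assumes steps >= 2.
    """
    frames = []
    for t in range(steps):
        w = t / (steps - 1)
        frames.append([(1 - w) * s + w * e for s, e in zip(p_start, p_end)])
    return frames

# Usage: fade two source waveforms, and separately glide two parameter sets.
joined = crossfade([1.0] * 4, [0.0] * 4, 3)
track = interpolate_params([0.0, 10.0], [2.0, 20.0], 3)
```

Keeping the two fades separate means the source waveform can be smoothed at the join without averaging the resonances themselves, which is the property the abstract attributes to the dual technique.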

Type: Grant

Filed: November 5, 2002

Date of Patent: October 10, 2006

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Steve Pearson, Nicholas Kibre, Nancy Niedzielski
