Patents Represented by Attorney, Agent or Law Firm Thomas A. Restaino

Speech enhancement with gain limitations based on speech activity

Patent number: 6542864

Abstract: An apparatus and method for data processing that improves estimation of spectral parameters of speech data and reduces algorithmic delay in a data coding operation. Estimation of spectral parameters is improved by adaptively adjusting a gain function used to enhance data based on whether the data contains information speech and noise or noise only. Delay is reduced by extracting coding parameters using incompletely processed data. This data is formed by multiplying a less current portion of an input data frame with a synthesis window and a more current portion of the data frame with an inverse analysis window, and performing an overlap-add process on the data frame and a similarly processed previous data frame.

Type: Grant

Filed: October 2, 2001

Date of Patent: April 1, 2003

Assignee: AT&T Corp.

Inventors: Richard Vandervoort Cox, Rainer Martin

Generalized analysis-by-synthesis speech coding method and apparatus

Patent number: 6169970

Abstract: A generalized analysis-by-synthesis method and apparatus are disclosed. A plurality of trial original signals are generated based on an original signal for coding. The trial original signals are constrained to be perceptually similar to the original signal. Trial original signals are coded to produce one or more parameters representative thereof. Estimates of the trial original signals are synthesized from these parameters. Errors between the trial original signals and the synthesized estimates are determined. A coded representation of the original signal is determined which comprises parameters of the trial original signal having an associated error which satisfies an error evaluation process. Trial original signals may be generated by application of time-warps or time-shifts to the original signal. Coding of a trial original signal may be performed with conventional analysis-by-synthesis coding such as code-excited linear prediction coding (CELP).

Type: Grant

Filed: January 8, 1998

Date of Patent: January 2, 2001

Assignee: Lucent Technologies Inc.

Inventor: Willem Bastiaan Kleijn

Voice command control and verification system

Patent number: 6081782

Abstract: A voice command control and verification system and method stores for each authorized user, one or a series of speech models of voice commands or phrases uttered by the authorized user. Each speech model has an associated action component which specifies the specific action that the authorized user desires in response to the issuance of the corresponding voice command. Each user has a means of asserting his or her claimed identity to the system, preferably without an overt action such as the entry of digits. When an identity is asserted, and a voice command is thereafter spoken by a person, the system first matches a model of the voice command against the stored models for the user having the claimed identity.

Type: Grant

Filed: December 29, 1993

Date of Patent: June 27, 2000

Assignee: Lucent Technologies Inc.

Inventor: Michael D. Rabin

Synthesis of speech signals in the absence of coded parameters

Patent number: 6014621

Abstract: A speech compression system called "Transform Predictive Coding", or TPC, provides for encoding 7 kHz wideband speech (16 kHz sampling) at a target bit-rate range of 16 to 32 kb/s (1 to 2 bits/sample). The system uses short-term and long-term prediction to remove the redundancy in speech. A prediction residual is transformed and coded in the frequency domain to take advantage of knowledge in human auditory perception. The TPC coder uses only open-loop quantization and therefore has a fairly low complexity. The speech quality of TPC is essentially transparent at 32 kb/s, very good at 24 kb/s, and acceptable at 16 kb/s.

Type: Grant

Filed: April 2, 1997

Date of Patent: January 11, 2000

Assignee: Lucent Technologies Inc.

Inventor: Juin-Hwey Chen

Prototype waveform speech coding with interpolation of pitch, pitch-period waveforms, and synthesis filter

Patent number: 5884253

Abstract: A speech coding system providing reconstructed voiced speech with a smoothly evolving pitch-cycle waveform. A speech signal is represented by isolating and coding prototype waveforms. Each prototype waveform is an exemplary pitch-cycle of voiced speech. A coded prototype waveform is transmitted at regular intervals to a receiver which synthesizes (or reconstructs) an estimate of the original speech segment based on the prototypes. The estimate of the original speech signal is provided by a prototype interpolation process which provides a smooth time-evolution of pitch-cycle waveforms in the reconstructed speech. Illustratively, a frame of original speech is coded by first filtering the frame with a linear predictive filter. Next a pitch-cycle of the filtered original is identified and extracted as a prototype waveform. The prototype waveform is then represented as a set of Fourier series (frequency domain) coefficients.

Type: Grant

Filed: October 3, 1997

Date of Patent: March 16, 1999

Assignee: Lucent Technologies, Inc.

Inventor: Willem Bastiaan Kleijn

Linear prediction coefficient generation during frame erasure or packet loss

Patent number: 5884010

Abstract: A speech coding system robust to frame erasure (or packet loss) is described. Illustrative embodiments are directed to a modified version of CCITT standard G.728. In the event of frame erasure, vectors of an excitation signal are synthesized based on previously stored excitation signal vectors generated during non-erased frames. Specifically, the decoder generates and stores samples of a first excitation signal in a memory, and then, in response to a signal indicating a frame erasure, the decoder synthesizes a second excitation signal based on the previously stored samples. In particular, the second excitation is synthesized by correlating a first subset of the stored samples with a second subset thereof, identifying a set of stored excitation signal samples based on the correlation, and synthesizing the second excitation signal based on the identified samples. Finally, the decoder then filters the second excitation signal to synthesize a signal reflecting human speech.
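The correlation step in this abstract can be illustrated with a minimal sketch. It assumes a plain (unnormalized) dot-product correlation and periodic repetition of the stored samples at the best-matching lag; the function names, the window length, and the lag range are hypothetical and are not taken from the patent or from G.728.

```python
def best_lag(history, window, min_lag, max_lag):
    """Return the lag at which the most recent `window` stored samples
    correlate best with an equally long, earlier stretch of the history."""
    recent = history[-window:]          # first subset: newest samples
    best, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - window - lag
        if start < 0:                   # not enough stored samples for this lag
            break
        earlier = history[start:start + window]   # second subset
        score = sum(r * e for r, e in zip(recent, earlier))
        if score > best_score:
            best, best_score = lag, score
    if best is None:
        raise ValueError("history too short for the requested lags")
    return best


def extrapolate_excitation(history, num_samples, lag):
    """Synthesize `num_samples` new samples by repeating the stored
    excitation with period `lag`."""
    buf = list(history)
    for _ in range(num_samples):
        buf.append(buf[-lag])
    return buf[len(history):]


# Stored excitation with an obvious period of 4 samples.
stored = [1, 0, -1, 0] * 4
lag = best_lag(stored, window=4, min_lag=2, max_lag=8)
replacement = extrapolate_excitation(stored, num_samples=6, lag=lag)
```

A real decoder would likely normalize the correlation and treat unvoiced frames differently; this sketch only shows the "correlate, pick samples, repeat" idea.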

Type: Grant

Filed: February 16, 1995

Date of Patent: March 16, 1999

Assignee: Lucent Technologies Inc.

Inventors: Juin-Hwey Chen, Craig Robert Watkins

Multimedia networked system detecting congestion by monitoring buffers' threshold and compensating by reducing video transmittal rate then reducing audio playback rate

Patent number: 5822537

Abstract: Disclosed is a networked multimedia information system which may be utilized to record, store and distribute multimedia presentations together with any supplemental materials that may be referenced during the presentation. The recorded presentation, together with the associated supplemental materials, may be simultaneously presented on a display containing two separate viewing windows. The effects of network congestion are minimized by prefetching audio and video data for storage in audio and video buffers. An adaptive control algorithm compensates for network congestion by dynamically varying the rate at which video frames are retrieved over the network, in response to network traffic conditions. The audio playback speed is reduced if the audio data does not arrive fast enough over the network to maintain the desired size of the audio buffer after the amount of video data transmitted across the network has been reduced to a minimum value.
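The ordering described in this abstract (cut the video rate first, slow the audio only once video is at its minimum) can be sketched as a single control step. This is an illustration under stated assumptions: the function name, the halving rule, the 0.05 speed step, and the minimum values are all hypothetical, and recovery after congestion clears is not modeled.

```python
def adapt(audio_buffer_s, target_buffer_s, video_fps, playback_speed,
          min_video_fps=1, min_playback_speed=0.8):
    """One step of a congestion controller (all constants are assumptions).

    Returns the new (video_fps, playback_speed) pair.
    """
    if audio_buffer_s >= target_buffer_s:
        # Audio buffer is at or above its desired size: change nothing.
        return video_fps, playback_speed
    if video_fps > min_video_fps:
        # First remedy: fetch fewer video frames over the network.
        return max(min_video_fps, video_fps // 2), playback_speed
    # Video is already at its minimum, so slow audio playback instead.
    slower = round(playback_speed - 0.05, 2)
    return video_fps, max(min_playback_speed, slower)


print(adapt(2.0, 4.0, video_fps=30, playback_speed=1.0))  # video rate is cut
print(adapt(2.0, 4.0, video_fps=1, playback_speed=1.0))   # audio is slowed
```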

Type: Grant

Filed: April 5, 1996

Date of Patent: October 13, 1998

Assignee: AT&T Corp.

Inventors: Howard P. Katseff, Bethany Scott Robinson

Call notification feature for a telephone line connected to the internet

Patent number: 5805587

Abstract: A facility is provided to alert a subscriber whose telephone station set is connected to the Internet of a waiting call via the Internet connection. Specifically, a call that is waiting may be forwarded via the Public Switched Network to a services platform, which, in turn, establishes a connection to the subscriber using the Internet. The platform then notifies the subscriber of the waiting call via the Internet connection. The platform may then forward the telephone call to the subscriber via the Internet responsive to a subscriber request to do so without interrupting the subscriber's Internet connection.

Type: Grant

Filed: November 27, 1995

Date of Patent: September 8, 1998

Assignee: AT&T Corp.

Inventors: John H. Norris, Thomas Leonard Russell, Jr.

Signal conditioned minimum error rate training for continuous speech recognition

Patent number: 5806029

Abstract: Hierarchical signal bias removal (HSBR) signal conditioning uses a codebook constructed from the set of recognition models and is updated as the recognition models are modified during recognition model training. As a result, HSBR signal conditioning and recognition model training are based on the same set of recognition model parameters, which provides significant reduction in recognition error rate for the speech recognition system.

Type: Grant

Filed: September 15, 1995

Date of Patent: September 8, 1998

Assignee: AT&T Corp.

Inventors: Eric Rolfe Buhrke, Wu Chou, Mazin G. Rahim

Perceptual noise masking measure based on synthesis filter frequency response

Patent number: 5790759

Abstract: A speech compression system called "Transform Predictive Coding", or TPC, provides for encoding 7 kHz wideband speech (16 kHz sampling) at a target bit-rate range of 16 to 32 kb/s (1 to 2 bits/sample). The system uses short-term and long-term prediction to remove the redundancy in speech. A prediction residual is transformed and coded in the frequency domain to take advantage of knowledge in human auditory perception. The TPC coder uses only open-loop quantization and therefore has a fairly low complexity.

Type: Grant

Filed: September 19, 1995

Date of Patent: August 4, 1998

Assignee: Lucent Technologies Inc.

Inventor: Juin-Hwey Chen

Multimedia system

Patent number: 5754784

Abstract: The provisioning of a multimedia application is enhanced by using a communications protocol which defines a number of different functions and which allows a multimedia application to be segmented into a plurality of logical blocks such that a user may enter a request characterizing one such function and, in response thereto, the system performs the function with respect to a particular one of the logical blocks. The underlying multimedia system is also enhanced by arranging it so that an application provider may interact with the system via a telecommunications network for the purpose of, for example, storing the application thereon and/or "debugging" the stored application.

Type: Grant

Filed: August 15, 1996

Date of Patent: May 19, 1998

Assignee: AT&T Corp.

Inventors: J. David Garland, Andrew R. McGee

Automatic call-back system and method using data indicating best time to call

Patent number: 5742674

Abstract: A system and method are disclosed for providing an automatic call-back when a calling party encounters a ring-no-answer condition upon calling a called party. An exemplary system includes: a first originating switch node (OSN) coupled to a first originating telephone call location corresponding to a calling party, and a terminating switch node (TSN) coupled to a destination telephone call location corresponding to a called party; a database coupled to the TSN for storing the called party's times of calling activity; a processor coupled to the TSN and to the database for processing the times of calling activity to determine a best time to call (BTTC) the called party (wherein the BTTC comprises a best exact time and/or a window of time) and providing the TSN with the BTTC; a signaling network for routing the BTTC to the OSN; and an adjunct coupled to the OSN for receiving the calling party's automatic call-back request and initiating an automatic call back to the calling party at the BTTC for the called party.

Type: Grant

Filed: December 22, 1995

Date of Patent: April 21, 1998

Assignee: AT&T Corp.

Inventors: Ajay K. Jain, Paramdeep S. Sahni

Speech recognition employing a permissive recognition criterion for a repeated phrase utterance

Patent number: 5737724

Abstract: The invention relates to a method and apparatus for speech recognition, the speech to be recognized including one or more words. Recognition is based on an analysis of a first and a second utterance. In accordance with the invention, the first utterance is compared to one or more models of speech to determine a similarity metric for each such comparison. The model of speech which most closely matches the first utterance is determined based on the one or more similarity metrics. The similarity metric corresponding to the most closely matching model of speech is analyzed to determine whether the similarity metric satisfies a first recognition criterion. The second utterance is compared to one or more models of speech associated with the most closely matching model (which may include the most closely matching model) to determine a second utterance similarity metric for each such comparison.

Type: Grant

Filed: August 8, 1996

Date of Patent: April 7, 1998

Assignee: Lucent Technologies Inc.

Inventors: Bishnu Saroop Atal, Raziel Haimi-Cohen, David Bjorn Roe

Voiced/unvoiced classification of speech for excitation codebook selection in CELP speech decoding during frame erasures

Patent number: 5732389

Abstract: A CELP speech decoder includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook. The CS-ACELP decoder generates a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information. The decoder does this by classifying the speech signal to be generated as periodic (voiced) or non-periodic (unvoiced) and then generating an excitation signal based on this classification. If the speech signal is classified as periodic, the excitation signal is generated based on the output signal from the first portion and not on the output signal from the second portion. If the speech signal is classified as non-periodic, the excitation signal is generated based on the output signal from said second portion and not on the output signal from said first portion.

Type: Grant

Filed: June 7, 1995

Date of Patent: March 24, 1998

Assignee: Lucent Technologies Inc.

Inventors: Peter Kroon, Yair Shoham

Network based multimedia messaging method for non-CCITT compliant switches

Patent number: 5724407

Abstract: A method is disclosed for messaging multimedia calls with a non-CCITT compliant switch when the intended recipient is unavailable. The method is used in connection with a telecommunication network which has an identified multimedia server. According to the method, multimedia calls from a caller, using a multimedia device running a first application and a second application, are messaged when the called party is unavailable. The method involves initiating a multimedia call to the called party using the first application and determining that the multimedia call is an unanswered call using the second application. Using the second application, the first application is signalled to indicate that the multimedia call was unanswered. In response to the signal, an X.25 packet message is sent from the multimedia device to the identified network based multimedia server. The multimedia device receives back a messaging address identifying a multimedia messaging server from the network based multimedia server.

Type: Grant

Filed: March 13, 1996

Date of Patent: March 3, 1998

Assignee: AT&T Corp.

Inventors: Richard F. Bruno, Robert E. Markowitz, Carlos A. Perea, Peter H. Stuntebeck, Roy P. Weber

Long term predictor

Patent number: 5719993

Abstract: An improved long-term predictor (LTP) for use in analysis-by-synthesis coding systems, such as CELP, is disclosed. The invention provides control of the periodicity of speech signals generated by the LTP. This control facilitates a reduction in perceptible noise/buzziness in reconstructed speech. An embodiment of the invention includes a conventional LTP in combination with a two-tap finite impulse response filter. The filter augments operation of the LTP by generating precursor signals of LTP output signals. These precursor signals are combined with the LTP output signals to form the output of the improved LTP.

Type: Grant

Filed: December 21, 1995

Date of Patent: February 17, 1998

Assignee: Lucent Technologies Inc.

Inventor: Willem Bastiaan Kleijn

Speech-rate modification for linear-prediction based analysis-by-synthesis speech coders

Patent number: 5717823

Abstract: Synergy between operations performed by a speech-rate modification system and those operations performed in a speech coding system is exploited to provide a speech-rate modification system with reduced hardware requirements. The speech rate of an input signal is modified based on a signal representing a predetermined change in speech rate. The modified speech-rate signal is then filtered to generate a speech signal having increased short-term correlation. Modification of the input speech signal may be performed by inserting in the input speech signal a previous sequence of samples corresponding substantially to a pitch cycle. Alternatively, the input speech signal may be modified by removing from the input speech signal a sequence of samples corresponding substantially to a pitch cycle.
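The two modifications named at the end of this abstract, inserting or removing roughly one pitch cycle, can be sketched on a plain list of samples. The function names are hypothetical, and the pitch period and edit position are assumed to be known already; how the patent chooses them is not shown here.

```python
def insert_pitch_cycle(samples, position, pitch_period):
    """Slow speech down by repeating the pitch cycle that ends at `position`."""
    if pitch_period <= 0 or position < pitch_period or position > len(samples):
        raise ValueError("position must leave one full pitch cycle before it")
    cycle = samples[position - pitch_period:position]   # previous cycle
    return samples[:position] + cycle + samples[position:]


def remove_pitch_cycle(samples, position, pitch_period):
    """Speed speech up by deleting the pitch cycle that starts at `position`."""
    if pitch_period <= 0 or position < 0 or position + pitch_period > len(samples):
        raise ValueError("position must leave one full pitch cycle after it")
    return samples[:position] + samples[position + pitch_period:]


signal = [1, 2, 3, 4, 5, 6]
slower = insert_pitch_cycle(signal, position=4, pitch_period=2)
faster = remove_pitch_cycle(signal, position=2, pitch_period=2)
```

Each call changes the signal length by exactly one pitch period, so repeated calls give a coarse control over the overall speech rate.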

Type: Grant

Filed: April 14, 1994

Date of Patent: February 10, 1998

Assignee: Lucent Technologies Inc.

Inventor: Willem Bastiaan Kleijn

Computational complexity reduction during frame erasure or packet loss

Patent number: 5717822

Abstract: A speech coding system robust to frame erasure (or packet loss) is described. Illustrative embodiments are directed to a modified version of CCITT standard G.728. In the event of frame erasure, vectors of an excitation signal are synthesized based on previously stored excitation signal vectors generated during non-erased frames. This synthesis differs for voiced and non-voiced speech. During erased frames, linear prediction filter coefficients are synthesized as a weighted extrapolation of a set of linear prediction filter coefficients determined during non-erased frames. The weighting factor is a number less than 1. This weighting accomplishes a bandwidth-expansion of peaks in the frequency response of a linear predictive filter. Computational complexity during erased frames is reduced through the elimination of certain computations needed during non-erased frames only.
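Bandwidth expansion of a linear prediction filter is commonly done by scaling the i-th coefficient by the i-th power of a factor below 1, which pulls the filter poles toward the origin and widens spectral peaks. The sketch below assumes that common form; the function name and the default factor of 0.97 are hypothetical and are not taken from the patent.

```python
def bandwidth_expand(lpc_coeffs, gamma=0.97):
    """Scale the i-th LPC coefficient (i = 1, 2, ...) by gamma**i.

    `gamma` is an assumed weighting factor; it must lie strictly
    between 0 and 1 for the peaks to be broadened.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return [a * gamma ** i for i, a in enumerate(lpc_coeffs, start=1)]


# Coefficients from the last good frame, reused with weighting
# while the current frame is erased.
last_good = [1.2, -0.8, 0.3]
expanded = bandwidth_expand(last_good, gamma=0.5)
```

Applying the same weighting again on each consecutive erased frame would broaden the peaks further, which is one plausible reading of "weighted extrapolation".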

Type: Grant

Filed: February 16, 1996

Date of Patent: February 10, 1998

Assignee: Lucent Technologies Inc.

Inventor: Juin-Hwey Chen

Multimedia networked system detecting congestion by monitoring buffers' threshold and compensating by reducing video transmittal rate then reducing audio playback rate

Patent number: 5715404

Abstract: Disclosed is a networked multimedia information system which may be utilized to record, store and distribute multimedia presentations together with any supplemental materials that may be referenced during the presentation. The recorded presentation, together with the associated supplemental materials, may be simultaneously presented on a display containing two separate viewing windows. The effects of network congestion are minimized by prefetching audio and video data for storage in audio and video buffers. An adaptive control algorithm compensates for network congestion by dynamically varying the rate at which video frames are retrieved over the network, in response to network traffic conditions. The audio playback speed is reduced if the audio data does not arrive fast enough over the network to maintain the desired size of the audio buffer after the amount of video data transmitted across the network has been reduced to a minimum value.

Type: Grant

Filed: April 5, 1996

Date of Patent: February 3, 1998

Assignee: AT&T

Inventors: Howard P. Katseff, Bethany Scott Robinson

Speech signal quantization using human auditory models in predictive coding systems

Patent number: 5710863

Abstract: A speech compression system called "Transform Predictive Coding", or TPC, provides for encoding 7 kHz wideband speech (16 kHz sampling) at a target bit-rate range of 16 to 32 kb/s (1 to 2 bits/sample). The system uses short-term and long-term prediction to remove the redundancy in speech. A prediction residual is transformed and coded in the frequency domain to take advantage of knowledge in human auditory perception. The TPC coder uses only open-loop quantization and therefore has a fairly low complexity. The speech quality of TPC is essentially transparent at 32 kb/s, very good at 24 kb/s, and acceptable at 16 kb/s.

Type: Grant

Filed: September 19, 1995

Date of Patent: January 20, 1998

Inventor: Juin-Hwey Chen
