Patents Examined by Thomas J. Onka

Method and device for discriminating voiced and unvoiced sounds

Patent number: 5664052

Abstract: A method and a device for discriminating a voiced sound from an unvoiced sound or background noise in speech signals are disclosed. Each block or frame of input speech signals is divided into plural sub-blocks and the standard deviation, effective value or the peak value is detected in a detection unit for detecting statistical characteristics from one sub-block to another. A bias detection unit detects a bias on the time scale of the standard deviation, effective value or the peak value to decide whether the speech signals are voiced or unvoiced from one block to another.

Type: Grant

Filed: April 14, 1993

Date of Patent: September 2, 1997

Assignee: Sony Corporation

Inventors: Masayuki Nishiguchi, Jun Matsumoto
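The abstract of patent 5664052 above describes splitting each block into sub-blocks, measuring a statistic per sub-block, and deciding voiced or unvoiced from how unevenly that statistic is distributed over time. The following is only a minimal sketch of that idea, not the patented method: the function names, the choice of the arithmetic-to-geometric mean ratio as the bias measure, the threshold value, and the small epsilon are all assumptions made for illustration.

```python
import math
import statistics


def subblock_std(frame, n_sub):
    """Standard deviation of each of n_sub equal-length sub-blocks of a frame."""
    size = len(frame) // n_sub
    return [statistics.pstdev(frame[i * size:(i + 1) * size]) for i in range(n_sub)]


def time_bias(frame, n_sub=8, eps=1e-9):
    """Hypothetical bias measure: arithmetic mean / geometric mean of the
    sub-block standard deviations. It is close to 1 when the energy is spread
    evenly over the frame and grows when the energy is concentrated in time."""
    stds = [s + eps for s in subblock_std(frame, n_sub)]  # eps avoids log(0)
    arithmetic = sum(stds) / len(stds)
    geometric = math.exp(sum(math.log(s) for s in stds) / len(stds))
    return arithmetic / geometric


def is_voiced(frame, n_sub=8, threshold=1.2):
    """Label the frame voiced when the bias exceeds an assumed threshold."""
    return time_bias(frame, n_sub) > threshold


# A frame whose energy sits in one sub-block versus a stationary frame.
burst_frame = [1.0, -1.0] * 4 + [0.0] * 56
steady_frame = [1.0, -1.0] * 32
print(is_voiced(burst_frame), is_voiced(steady_frame))
```

In this sketch a frame with evenly spread energy, including an all-zero frame, is treated as unvoiced or background noise.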
Stereophonic voice recording and playback device

Patent number: 5642463

Abstract: A stereophonic voice recording and playback device for stereophonically recording and reproducing voice signals, includes: an adding circuit for receiving a first channel analog voice signal and a second channel analog voice signal, performing orthogonal conversion of the respective analog voice signals, and adding the orthogonally converted signals; an Analog-to-Digital (A/D) converter for receiving the added signal from the adding circuit and converting the added signal to a digital signal; a compressing circuit for receiving the digital signal from the A/D converter and compressing the digital signal; a memory circuit for storing the compressed digital signal; and an expanding circuit for reading out the digital signal from the memory circuit and reproducing a stereophonic signal.

Type: Grant

Filed: December 21, 1993

Date of Patent: June 24, 1997

Assignee: Sharp Kabushiki Kaisha

Inventors: Takahiko Nakano, Yasumoto Murata
Method of and apparatus for performing time-scale modification of speech signals

Patent number: 5630013

Abstract: An apparatus for transforming an input signal having a time length L into an output signal having a time length αL in accordance with a given time-scale modification ratio α, including a correlator for calculating a value of a correlation function between a first signal and a second signal having a time length T and for determining a time delay T_c at which the value of the correlation function becomes the greatest; an adder for adding the first signal multiplied by a first window function to the second signal multiplied by a second window function with a displacement of the time delay T_c; and an outputting circuit for selectively outputting the output of the adder and a third signal succeeding the output of the adder so that the sum of a time length of the output of the adder and a time length of the third signal is substantially equal to a time length defined by the time-scale modification ratio α, the time delay T_c and the time length T.

Type: Grant

Filed: January 25, 1994

Date of Patent: May 13, 1997

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Ryoji Suzuki, Masayuki Misaki
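The abstract of patent 5630013 above combines a correlation search for the best time delay with a windowed addition of two signals displaced by that delay. The sketch below illustrates only those two steps under simplifying assumptions: the function names, the unnormalized correlation, and the linear cross-fade windows are illustrative choices, and the output-length control described in the abstract is not modelled.

```python
def best_delay(first, second, max_delay):
    """Return the delay d in 0..max_delay that maximizes
    sum(first[n + d] * second[n]), an unnormalized correlation."""
    def correlation(d):
        return sum(a * b for a, b in zip(first[d:], second))
    return max(range(max_delay + 1), key=correlation)


def cross_fade(first, second, delay):
    """Add `first` (falling linear window) to `second` (rising linear window),
    with `second` displaced by `delay` samples relative to `first`."""
    overlap = first[delay:]
    n = min(len(overlap), len(second))
    out = list(first[:delay])            # part of `first` before the overlap
    for i in range(n):
        w = (i + 0.5) / n                # rising weight applied to `second`
        out.append((1.0 - w) * overlap[i] + w * second[i])
    out.extend(second[n:])               # remainder of `second`
    return out


first = [0.0, 0.0, 1.0, 2.0, 3.0, 0.0]
second = [1.0, 2.0, 3.0, 0.0, 0.0, 0.0]
delay = best_delay(first, second, max_delay=3)
print(delay, cross_fade(first, second, delay))
```

Because the two example signals match exactly at the chosen delay, the cross-fade leaves the overlapping samples unchanged.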
Efficient codebook structure for code excited linear prediction coding

Patent number: 5621852

Abstract: A code excited linear prediction speech communication system includes a ternary codebook of innovation sequences. The ternary codebook is formed as the sum of first and second binary codebooks containing binary codevectors. Code sequences C_k in the ternary codebook are constructed from the set of values {-1,0,1}. To form the ternary codebook, one binary codebook has the binary codevector values {0,1}, and the other binary codebook has the binary codevector values {-1,0}. The sum of one binary codevector from each binary codebook forms a ternary codevector. The present codebook structure permits several efficient search procedures for an optimum innovation sequence. In particular, the binary codevectors may be searched using a given fidelity criterion function in order to find an optimum ternary codevector representing the optimum innovation sequence.

Type: Grant

Filed: December 14, 1993

Date of Patent: April 15, 1997

Assignee: Interdigital Technology Corporation

Inventor: Daniel Lin
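The codebook structure in the abstract of patent 5621852 above can be illustrated with a toy example: one binary codebook over {0,1}, one over {-1,0}, and ternary codevectors formed as element-wise sums. This is a sketch under stated assumptions, not the patent's search procedure: the function names are hypothetical, the search is a plain exhaustive loop rather than any of the efficient searches the abstract alludes to, and squared error stands in for the unspecified fidelity criterion.

```python
from itertools import product


def binary_codebook(length, values):
    """All codevectors of the given length over a two-valued alphabet."""
    return [tuple(v) for v in product(values, repeat=length)]


def ternary_codevector(pos, neg):
    """Element-wise sum of a {0,1} codevector and a {-1,0} codevector."""
    return tuple(p + n for p, n in zip(pos, neg))


def search(target):
    """Exhaustively pick the ternary codevector closest to `target`
    in squared error (an assumed fidelity criterion)."""
    length = len(target)
    best, best_err = None, float("inf")
    for pos in binary_codebook(length, (0, 1)):
        for neg in binary_codebook(length, (-1, 0)):
            candidate = ternary_codevector(pos, neg)
            err = sum((t - c) ** 2 for t, c in zip(target, candidate))
            if err < best_err:
                best, best_err = candidate, err
    return best, best_err


best, best_err = search((0.9, -0.8, 0.1))
print(best, best_err)
```

Several pairs of binary codevectors sum to the same ternary codevector (0 can arise as 0+0 or 1+(-1)), so the sketch returns only the resulting ternary codevector.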
Subband coding of a digital signal in a stereo intensity mode

Patent number: 5621855

Abstract: Subband coding a digital signal having a first and a second signal component in a stereo intensity mode. The digital signal is subband coded to produce a first subband signal having a first q-sample signal block in response to the first signal component, and a second subband signal having a second q-sample signal block in response to the second signal component, the first and the second subband signals being in the same subband, and the first and second signal blocks being time-equivalent. The first and second signal blocks are processed to obtain a minimum distance value representative of a distance between a line and a plurality of points if (a) the points correspond to the respective pairs of time-equivalent samples in the first and second signal blocks, and are plotted in a coordinate system, and (b) the line is plotted such that it traverses the origin and the points at a minimum distance from the points in the coordinate system.

Type: Grant

Filed: October 19, 1994

Date of Patent: April 15, 1997

Assignee: U.S. Philips Corporation

Inventors: Raymond N. J. Veldhuis, Robbert G. van der Waal, Leon M. van de Kerkhof
Rejection of non-digit strings for connected digit speech recognition

Patent number: 5613037

Abstract: A high reliability digit string recognizer/rejection system that processes spoken words through an HMM recognizer to determine a string of candidate digits, a filler model for each digit in the digit string, and other information. Next, a weighted sum is generated for each digit in the string and for a filler model for each digit in the string. A confidence score is generated for each digit by subtracting the filler weighted sum from the digit weighted sum. The confidence score for each digit is then compared to a threshold and, if the confidence score for any of the digits is less than the threshold, the entire digit string is rejected. If the confidence scores for all of the digits in the digit string are equal to or greater than the threshold, then the candidate digit string is accepted as a digit string.

Type: Grant

Filed: December 21, 1993

Date of Patent: March 18, 1997

Assignee: Lucent Technologies Inc.

Inventor: Rafid A. Sukkar
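The accept/reject rule in the abstract of patent 5613037 above reduces to a per-digit subtraction followed by a threshold test. A minimal sketch follows, assuming the digit and filler weighted sums have already been produced by some recognizer; the function name, argument names, and example numbers are hypothetical.

```python
def accept_digit_string(digit_sums, filler_sums, threshold):
    """Return (accepted, confidences).

    The confidence for each digit is its weighted sum minus the weighted sum
    of its filler model. The whole string is rejected if any confidence falls
    below the threshold, and accepted if all are equal to or greater than it."""
    confidences = [d - f for d, f in zip(digit_sums, filler_sums)]
    accepted = all(c >= threshold for c in confidences)
    return accepted, confidences


digit_sums = [5.0, 4.0, 6.0]     # illustrative values only
filler_sums = [2.0, 3.5, 1.0]
print(accept_digit_string(digit_sums, filler_sums, threshold=1.0))
print(accept_digit_string(digit_sums, filler_sums, threshold=0.5))
```

With these made-up numbers the second digit has a confidence of 0.5, so the string is rejected at a threshold of 1.0 and accepted at a threshold of 0.5.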
Audio decompression system employing multi-rate signal analysis

Patent number: 5606642

Abstract: An audio decompression system is disclosed. The corresponding compression system utilizes sub-band analysis filters whose bandwidths are chosen to approximate the critical bands of the human auditory system while avoiding the aliasing problems encountered in QMF filter banks designed to provide similar band splitting. One embodiment of the invention may be implemented on a digital computer. The computational requirements of the synthesis filters may be varied in response to the available computational resources of the computer, thereby allowing a single compressed audio signal to be played back in real time on a variety of platforms by trading off audio quality against available computational resources. Similar trade offs can be made in compressing an audio signal, thereby allowing a platform having limited computational capacity to compress a signal in real time.

Type: Grant

Filed: September 16, 1994

Date of Patent: February 25, 1997

Assignee: Aware, Inc.

Inventors: John P. Stautner, William R. Morrell, Sriram Jayasimha
Voice compression system and method

Patent number: 5592586

Abstract: A system for performing voice compression which includes, a conversion means for converting analog voice signals into discrete samples of digital voice data and collecting the discrete samples into segments, with a means for dividing the segments into subsegments and for producing therefrom a current voice subsegment. A pitch prediction means is used for determining the long term predicted gain of the current voice subsegment by comparing the current voice subsegment to reconstructed voice samples to produce a pitch predictor gain and a lag component and a pitch filter means is used for pitch filtering the current voice subsegment based on the pitch predictor gain and the lag component which then produces long term residual samples.

Type: Grant

Filed: August 11, 1994

Date of Patent: January 7, 1997

Assignee: Multi-Tech Systems, Inc.

Inventors: Sidhartha Maitra, Ashish Thanawala, Steve Young
Neural network acoustic and visual speech recognition system

Patent number: 5586215

Abstract: The apparatus for the recognition of speech comprises an acoustic preprocessor, a visual preprocessor, and a speech classifier that operates on the acoustic and visual preprocessed data. The acoustic preprocessor comprises a log mel spectrum analyzer that produces an equal mel bandwidth log power spectrum. The visual processor detects the motion of a set of fiducial markers on the speaker's face and extracts a set of normalized distance vectors describing lip and mouth movement. The speech classifier uses a multilevel time-delay neural network operating on the preprocessed acoustic and visual data to form an output probability distribution that indicates the probability of each candidate utterance having been spoken, based on the acoustic and visual data.

Type: Grant

Filed: May 26, 1992

Date of Patent: December 17, 1996

Assignees: Ricoh Corporation, Ricoh Company, Ltd.

Inventors: David G. Stork, Gregory J. Wolff, Earl I. Levine
Recording method and apparatus and audio data user interface

Patent number: 5586216

Abstract: A method and apparatus for marking audio data as it is recorded, and a user interface for the audio data in a computerized system, is disclosed. A recorder, such as a tape recorder, having a plurality of marker buttons is provided. The audio data is recorded on one channel of a magnetic tape. Any time one of the marker buttons is pressed, a distinct tone is recorded on another channel of the tape as a marker. The audio data and markers are then transferred to the computer system. The user interface provides a graphical display of the audio data, and provides graphical markers which correspond to the marker buttons on the recorder. The audio data can be accessed at any random point, including a point marked by a marker. Without changing modes, a user can access the data at any random point, stop play, select a new point to initiate playback and restart playback, and change the speed of playback.

Type: Grant

Filed: September 25, 1992

Date of Patent: December 17, 1996

Assignee: Apple Computer, Inc.

Inventors: Leo M. W. F. Degen, S. Joy Mountford, Richard Mander, Gitta B. Salomon
System for predictive coding/decoding of a digital speech signal by embedded-code adaptive transform

Patent number: 5583963

Abstract: A system for predictive coding of a digital speech signal with embedded codes used in any transmission system or for storing speech signals. The coded digital signal (S_n) is formed by a coded speech signal and, if appropriate, by auxiliary data. A perceptual weighting filter is formed by a filter for short-term prediction of the speech signal to be coded, in order to produce a frequency distribution of the quantization noise. A circuit makes it possible to perform the subtraction from the perceptual signal of the contribution of the past excitation signal P^0_n to deliver an updated perceptual signal P_n. A long-term prediction circuit is formed, as a closed loop, from a dictionary updated by the modelled page excitation r^1_n for the lowest throughput and makes it possible to deliver an optimal waveform and an associated estimated gain which make up the estimated perceptual signal P^1_n.

Type: Grant

Filed: January 21, 1994

Date of Patent: December 10, 1996

Assignee: France Telecom

Inventor: Bruno Lozach
Speech signal bandwidth compression and expansion apparatus, and bandwidth compressing speech signal transmission method, and reproducing method

Patent number: 5579434

Abstract: A speech signal bandwidth compression and expansion apparatus and its method. On the transmitting side, system parameters are extracted from a speech signal by a linear prediction analyzer. A prediction residual signal is obtained by inverse filtering processing by using the system parameters. The prediction residual signal is lowered in sampling rate by a down-sampler and converted to a baseband signal. From the baseband signal, a time series signal is derived by a linear prediction synthesizer. Thereafter, the time series signal is converted to an analog signal and transmitted. On the receiving side, a received signal is subjected to inverse filtering processing to reproduce a baseband signal. The sampling rate of the reproduced baseband signal is raised to derive a time series signal. From the time series signal, a high frequency band component is generated. The high frequency band component is added to the baseband signal to generate an excitation signal.

Type: Grant

Filed: December 6, 1994

Date of Patent: November 26, 1996

Assignee: Hitachi Denshi Kabushiki Kaisha

Inventors: Yasushi Kudo, Yoshiro Kokuryo
System for recognizing or counting spoken itemized expressions

Patent number: 5577163

Abstract: A speech categorization system includes first and second timers which generate first and second measured durations indicative of duration of selected higher and lower amplitude segments included in a voice message. A higher amplitude segment is classified in a first category when the first and second measured durations corresponding to the higher amplitude segment and an adjacent lower amplitude segment satisfy a classification test, and a counter counts the number of the higher amplitude segments classified in the first category. Accented syllables in the higher amplitude segment are recognized to aid classification.

Type: Grant

Filed: December 29, 1993

Date of Patent: November 19, 1996

Inventor: Peter F. Theis
Speech codec and a method of processing a speech signal with speech codec

Patent number: 5572681

Abstract: One frame of speech signal data and subframes of speech signal data divided from the frame of speech signal data are encoded by frame and subframe encoding processes while another frame is decoded by frame and subframe decoding processes. The frame and subframe encoding processes and the frame and subframe decoding processes are interleaved to reduce the DSP memory capacity needed for a speech codec.

Type: Grant

Filed: August 16, 1993

Date of Patent: November 5, 1996

Assignee: NEC Corporation

Inventors: Makio Nakamura, Akira Hioki
Method and apparatus for detecting and extending temporal gaps in speech signal and appliances using the same

Patent number: 5572593

Abstract: A method and an apparatus for controllably detecting and extending temporal gaps in speech in dependence on the power thereof, for the purpose of aiding an auditory sense organ. A temporal gap detecting facility for detecting temporal gaps in the input speech signal and a temporal gap extension facility for extending the temporal gap by repetitive addition thereof are provided, wherein the number of repetitions is selected to be proportional to the power of the input speech signal at a time point immediately preceding the temporal gap. Alternatively, the temporal gap extension facility adds repeatedly to the temporal gap a part thereof exclusive of start and end parts.

Type: Grant

Filed: June 23, 1993

Date of Patent: November 5, 1996

Assignee: Hitachi, Ltd.

Inventors: Yoshito Nejime, Hiroshi Ikeda, Yukio Kumagai
Method of speech detection

Patent number: 5572623

Abstract: A method for detecting the start and end of speech from a noisy signal including the steps of: detecting a voiced frame; searching for noise frames preceding this voiced frame; constructing an autoregressive model of the noise and a mean noise spectrum; bleaching the frames preceding the voicing; searching for the actual start of speech in the bleached frames; removing the noise from the voiced frames and parameterizing them; and searching for the actual end of speech.

Type: Grant

Filed: October 21, 1993

Date of Patent: November 5, 1996

Assignee: Sextant Avionique

Inventor: Dominique Pastor
Multi-pulse analysis speech processing System and method

Patent number: 5568588

Abstract: A speech processing system and method are disclosed. In one embodiment of the present invention, the system includes at least a maximum likelihood quantization (MLQ) multi-pulse analysis unit operating on a target vector. The MLQ multi-pulse analysis unit typically determines an initial gain level for the multi-pulse sequence and performs single gain multi-pulse analysis (MPA) a number of times, each with a different gain level. The pulse sequence which most closely represents the target vector is provided as an output signal. In another embodiment, the system includes at least a pulse train multi-pulse analysis unit wherein the target vector is modeled as a series of pulse trains. Each pulse train comprises a plurality of single gain pulses, wherein each pulse is at a position which is a pitch value distance apart from the previous pulse in the pulse train. Combinations of maximum likelihood analyses with pulse trains are also part of the present invention.

Type: Grant

Filed: April 29, 1994

Date of Patent: October 22, 1996

Assignee: AudioCodes Ltd.

Inventors: Leon Bialik, Felix Flomen
Automatic speech recognition (ASR) processing using confidence measures

Patent number: 5566272

Abstract: The user interface in an automatic speech recognition (ASR) system is dynamically controlled, based upon the level of confidence in the results of the ASR process. In one embodiment, the system is arranged to distinguish error-prone ASR interpretations from those likely to be correct, using a degree of confidence in the output of the ASR system determined as a function of the difference between the confidence in the "first choice" selected by the ASR system and the confidence in the "second choice" selected by the ASR system. In this embodiment, the user interface is arranged so that the explicit verification steps taken by the system as a result of uncertain information are different from the action taken when the confidence is high. In addition, different treatment can be provided based upon the "consequences" of misinterpretation as well as the historical performance of the system with respect to the specific user whose speech is being processed.

Type: Grant

Filed: October 27, 1993

Date of Patent: October 15, 1996

Assignee: Lucent Technologies Inc.

Inventors: Douglas J. Brems, Max S. Schoeffler
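The abstract of patent 5566272 above bases the degree of confidence on the difference between the "first choice" and "second choice" confidences and varies the treatment with the consequences of a misinterpretation. The sketch below shows one way such a rule could look; the function name, the margin values, and the idea of using a larger margin for high-consequence actions are assumptions for illustration, not details taken from the patent.

```python
def needs_verification(first_choice_conf, second_choice_conf, min_margin):
    """Ask for explicit verification when the gap between the first-choice and
    second-choice confidences is smaller than the required margin."""
    return (first_choice_conf - second_choice_conf) < min_margin


# Hypothetical margins: a costly action demands a clearer winner.
LOW_CONSEQUENCE_MARGIN = 0.25
HIGH_CONSEQUENCE_MARGIN = 0.75

print(needs_verification(0.9, 0.3, LOW_CONSEQUENCE_MARGIN))
print(needs_verification(0.9, 0.3, HIGH_CONSEQUENCE_MARGIN))
print(needs_verification(0.55, 0.5, LOW_CONSEQUENCE_MARGIN))
```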
Data recording/replay device and data recording medium

Patent number: 5563719

Abstract: A data recording and/or reproducing device for recording and/or reproducing data on or from a recording medium includes information concerning the recording method used for recording data on a recording medium, that is the encoding/decoding instruction data is recorded on the recording medium along with the data. Using the encoding/decoding instruction on the recording medium, specifically the decoding instruction, the data on the recording medium is restored by a DSP. The data recording and/or reproducing device enables data processed with an inconvenient system or an older model system to be restored easily and inexpensively.

Type: Grant

Filed: January 7, 1994

Date of Patent: October 8, 1996

Assignee: Sony Corporation

Inventor: Yoshiaki Oikawa
Companding of voice signal for simultaneous voice and data transmission

Patent number: 5559791

Abstract: In a simultaneous voice and data communications system, a voice signal is added to a data signal before transmission over the public switched telephone network (PSTN). In particular, in every signaling interval, a signal point is selected for transmission as a function of both the voice signal and the data signal. Since the voice signal is effectively offset by the data signal, compandors normally found in the PSTN are not effective in improving the signal to noise ratio of the transmitted voice and data signal. Therefore, the voice signal is additionally companded in the transmitter before transmission over the PSTN. This additional companding by the transmitter improves the signal to noise ratio of the combined voice and data signal.

Type: Grant

Filed: June 14, 1993

Date of Patent: September 24, 1996

Assignee: Lucent Technologies Inc.

Inventors: Gordon Bremer, Kenneth D. Ko, Luke J. Smithwick
