Patents Examined by Thomas J. Onka
  • Patent number: 5664052
    Abstract: A method and a device for discriminating a voiced sound from an unvoiced sound or background noise in speech signals are disclosed. Each block or frame of input speech signals is divided into plural sub-blocks and the standard deviation, effective value or the peak value is detected in a detection unit for detecting statistical characteristics from one sub-block to another. A bias detection unit detects a bias on the time scale of the standard deviation, effective value or the peak value to decide whether the speech signals are voiced or unvoiced from one block to another.
    Type: Grant
    Filed: April 14, 1993
    Date of Patent: September 2, 1997
    Assignee: Sony Corporation
    Inventors: Masayuki Nishiguchi, Jun Matsumoto
  • Patent number: 5642463
    Abstract: A stereophonic voice recording and playback device for stereophonically recording and reproducing voice signals, includes: an adding circuit for receiving a first channel analog voice signal and a second channel analog voice signal, performing orthogonal conversion of the respective analog voice signals, and adding the orthogonally converted signals; an Analog-to-Digital (A/D) converter for receiving the added signal from the adding circuit and converting the added signal to a digital signal; a compressing circuit for receiving the digital signal from the A/D converter and compressing the digital signal; a memory circuit for storing the compressed digital signal; and an expanding circuit for reading out the digital signal from the memory circuit and reproducing a stereophonic signal.
    Type: Grant
    Filed: December 21, 1993
    Date of Patent: June 24, 1997
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Takahiko Nakano, Yasumoto Murata
  • Patent number: 5630013
    Abstract: An apparatus for transforming an input signal having a time length L into an output signal having a time length αL in accordance with a given time-scale modification ratio α, including a correlator for calculating a value of a correlation function between a first signal and a second signal having a time length T and for determining a time delay T_c at which the value of the correlation function becomes the greatest; an adder for adding the first signal multiplied by a first window function to the second signal multiplied by a second window function with a displacement of the time delay T_c; and an outputting circuit for selectively outputting the output of the adder and a third signal succeeding the output of the adder so that the sum of a time length of the output of the adder and a time length of the third signal is substantially equal to a time length defined by the time-scale modification ratio α, the time delay T_c and the time length T.
    Type: Grant
    Filed: January 25, 1994
    Date of Patent: May 13, 1997
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Ryoji Suzuki, Masayuki Misaki
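Note (illustrative, not part of the patent record): the correlator and adder described in the abstract of patent 5630013 can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the patented method: the function names `best_delay` and `overlap_add`, the unnormalized correlation, and the linear fade windows are all hypothetical choices, and the outputting circuit that selects the third signal according to the modification ratio is omitted.

```python
def best_delay(first, second, max_delay):
    """Return the delay d in [0, max_delay] that maximizes the
    unnormalized correlation sum(first[n + d] * second[n]).
    The unnormalized form is an assumption made for this sketch."""
    best_d, best_val = 0, float("-inf")
    for d in range(max_delay + 1):
        overlap = min(len(first) - d, len(second))
        if overlap <= 0:
            break
        val = sum(first[n + d] * second[n] for n in range(overlap))
        if val > best_val:
            best_d, best_val = d, val
    return best_d


def overlap_add(first, second, delay):
    """Cross-fade `second` into `first`, displaced by `delay` samples.
    Linear fade-out and fade-in windows are assumed here; the abstract
    only speaks of a first and a second window function."""
    if not 0 <= delay < len(first):
        raise ValueError("delay must lie inside the first signal")
    overlap = min(len(first) - delay, len(second))
    out = list(first[:delay])
    for n in range(overlap):
        w = (n + 1) / (overlap + 1)  # fade-in weight for `second`
        out.append((1 - w) * first[delay + n] + w * second[n])
    out.extend(second[overlap:])
    return out
```

For example, with `first = [0, 0, 1, 2, 3]` and `second = [1, 2, 3, 0, 0]`, the delay search aligns the two ramps before they are blended, so the joined signal contains the ramp once rather than twice.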
  • Patent number: 5621852
    Abstract: A code excited linear prediction speech communication system includes a ternary codebook of innovation sequences. The ternary codebook is formed as the sum of first and second binary codebooks containing binary codevectors. Code sequences C_k in the ternary codebook are constructed from the set of values {-1,0,1}. To form the ternary codebook, one binary codebook has the binary codevector values {0,1}, and the other binary codebook has the binary codevector values {-1,0}. The sum of one binary codevector from each binary codebook forms a ternary codevector. The present codebook structure permits several efficient search procedures for an optimum innovation sequence. In particular, the binary codevectors may be searched using a given fidelity criterion function in order to find an optimum ternary codevector representing the optimum innovation sequence.
    Type: Grant
    Filed: December 14, 1993
    Date of Patent: April 15, 1997
    Assignee: Interdigital Technology Corporation
    Inventor: Daniel Lin
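Note (illustrative, not part of the patent record): the codebook construction in the abstract of patent 5621852 can be sketched as follows. The names `ternary_codevector` and `best_ternary` are hypothetical, the squared-error fidelity criterion is an assumption (the abstract does not name one), and the exhaustive search shown here is the naive baseline, not one of the efficient search procedures the patent refers to.

```python
from itertools import product


def ternary_codevector(pos, neg):
    """Sum a {0,1} codevector and a {-1,0} codevector element-wise,
    giving a codevector over {-1,0,1}."""
    if len(pos) != len(neg):
        raise ValueError("codevectors must have equal length")
    if any(p not in (0, 1) for p in pos):
        raise ValueError("first codevector must use values {0,1}")
    if any(n not in (-1, 0) for n in neg):
        raise ValueError("second codevector must use values {-1,0}")
    return [p + n for p, n in zip(pos, neg)]


def best_ternary(target, pos_book, neg_book):
    """Exhaustively pair every codevector of one binary codebook with
    every codevector of the other and keep the sum closest to `target`.
    Squared error is an assumed fidelity criterion for this sketch."""
    def squared_error(vec):
        return sum((t - v) ** 2 for t, v in zip(target, vec))

    candidates = (ternary_codevector(p, n)
                  for p, n in product(pos_book, neg_book))
    return min(candidates, key=squared_error)
```

With two codevectors in each binary codebook the search covers four ternary candidates; the cost of this naive pairing grows with the product of the codebook sizes, which is the cost a structured search would aim to avoid.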
  • Patent number: 5621855
    Abstract: Subband coding a digital signal having a first and a second signal component in a stereo intensity mode. The digital signal is subband coded to produce a first subband signal having a first q-sample signal block in response to the first signal component, and a second subband signal having a second q-sample signal block in response to the second signal component, the first and the second subband signals being in the same subband, and the first and second signal blocks being time-equivalent. The first and second signal blocks are processed to obtain a minimum distance value representative of a distance between a line and a plurality of points if (a) the points correspond to the respective pairs of time-equivalent samples in the first and second signal blocks, and are plotted in a coordinate system, and (b) the line is plotted such that it traverses the origin and the points at a minimum distance from the points in the coordinate system.
    Type: Grant
    Filed: October 19, 1994
    Date of Patent: April 15, 1997
    Assignee: U.S. Philips Corporation
    Inventors: Raymond N. J. Veldhuis, Robbert G. van der Waal, Leon M. van de Kerkhof
  • Patent number: 5613037
    Abstract: A high reliability digit string recognizer/rejection system that processes spoken words through an HMM recognizer to determine a string of candidate digits, a filler model for each digit in the digit string, and other information. Next, a weighted sum is generated for each digit in the string and for a filler model for each digit in the string. A confidence score is generated for each digit by subtracting the filler weighted sum from the digit weighted sum. The confidence score for each digit is then compared to a threshold and, if the confidence score for any of the digits is less than the threshold, the entire digit string is rejected. If the confidence scores for all of the digits in the digit string are equal to or greater than the threshold, then the candidate digit string is accepted as a digit string.
    Type: Grant
    Filed: December 21, 1993
    Date of Patent: March 18, 1997
    Assignee: Lucent Technologies Inc.
    Inventor: Rafid A. Sukkar
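Note (illustrative, not part of the patent record): the accept/reject rule in the abstract of patent 5613037 can be sketched as below. The function name `accept_digit_string` is hypothetical, the weighted sums are taken as already computed (the abstract does not say how the weights are obtained), and rejecting an empty string is an assumption of this sketch.

```python
def accept_digit_string(digit_sums, filler_sums, threshold):
    """Return (accepted, confidences) for a candidate digit string.

    digit_sums[i]  -- weighted sum for the i-th candidate digit
    filler_sums[i] -- weighted sum for that digit's filler model
    The confidence score of each digit is digit sum minus filler sum;
    the whole string is rejected if any score falls below `threshold`.
    """
    if len(digit_sums) != len(filler_sums):
        raise ValueError("one filler sum is required per digit")
    confidences = [d - f for d, f in zip(digit_sums, filler_sums)]
    # Assumption: an empty candidate string is rejected outright.
    accepted = bool(confidences) and all(c >= threshold for c in confidences)
    return accepted, confidences
```

A score exactly equal to the threshold counts as acceptable, matching the "equal to or greater than" wording of the abstract.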
  • Patent number: 5606642
    Abstract: An audio decompression system is disclosed. The corresponding compression system utilizes sub-band analysis filters whose bandwidths are chosen to approximate the critical bands of the human auditory system while avoiding the aliasing problems encountered in QMF filter banks designed to provide similar band splitting. One embodiment of the invention may be implemented on a digital computer. The computational requirements of the synthesis filters may be varied in response to the available computational resources of the computer, thereby allowing a single compressed audio signal to be played back in real time on a variety of platforms by trading off audio quality against available computational resources. Similar trade offs can be made in compressing an audio signal, thereby allowing a platform having limited computational capacity to compress a signal in real time.
    Type: Grant
    Filed: September 16, 1994
    Date of Patent: February 25, 1997
    Assignee: Aware, Inc.
    Inventors: John P. Stautner, William R. Morrell, Sriram Jayasimha
  • Patent number: 5592586
    Abstract: A system for performing voice compression which includes a conversion means for converting analog voice signals into discrete samples of digital voice data and collecting the discrete samples into segments, with a means for dividing the segments into subsegments and for producing therefrom a current voice subsegment. A pitch prediction means is used for determining the long term predicted gain of the current voice subsegment by comparing the current voice subsegment to reconstructed voice samples to produce a pitch predictor gain and a lag component, and a pitch filter means is used for pitch filtering the current voice subsegment based on the pitch predictor gain and the lag component, which then produces long term residual samples.
    Type: Grant
    Filed: August 11, 1994
    Date of Patent: January 7, 1997
    Assignee: Multi-Tech Systems, Inc.
    Inventors: Sidhartha Maitra, Ashish Thanawala, Steve Young
  • Patent number: 5586215
    Abstract: The apparatus for the recognition of speech comprises an acoustic preprocessor, a visual preprocessor, and a speech classifier that operates on the acoustic and visual preprocessed data. The acoustic preprocessor comprises a log mel spectrum analyzer that produces an equal mel bandwidth log power spectrum. The visual processor detects the motion of a set of fiducial markers on the speaker's face and extracts a set of normalized distance vectors describing lip and mouth movement. The speech classifier uses a multilevel time-delay neural network operating on the preprocessed acoustic and visual data to form an output probability distribution that indicates the probability of each candidate utterance having been spoken, based on the acoustic and visual data.
    Type: Grant
    Filed: May 26, 1992
    Date of Patent: December 17, 1996
    Assignees: Ricoh Corporation, Ricoh Company, Ltd.
    Inventors: David G. Stork, Gregory J. Wolff, Earl I. Levine
  • Patent number: 5586216
    Abstract: A method and apparatus for marking audio data as it is recorded, and a user interface for the audio data in a computerized system, is disclosed. A recorder, such as a tape recorder, having a plurality of marker buttons is provided. The audio data is recorded on one channel of a magnetic tape. Any time one of the marker buttons is pressed, a distinct tone is recorded on another channel of the tape as a marker. The audio data and markers are then transferred to the computer system. The user interface provides a graphical display of the audio data, and provides graphical markers which correspond to the marker buttons on the recorder. The audio data can be accessed at any random point, including a point marked by a marker. Without changing modes, a user can access the data at any random point, stop play, select a new point to initiate playback and restart playback, and change the speed of playback.
    Type: Grant
    Filed: September 25, 1992
    Date of Patent: December 17, 1996
    Assignee: Apple Computer, Inc.
    Inventors: Leo M. W. F. Degen, S. Joy Mountford, Richard Mander, Gitta B. Salomon
  • Patent number: 5583963
    Abstract: A system for predictive coding of a digital speech signal with embedded codes used in any transmission system or for storing speech signals. The coded digital signal (S_n) is formed by a coded speech signal and, if appropriate, by auxiliary data. A perceptual weighting filter is formed by a filter for short-term prediction of the speech signal to be coded, in order to produce a frequency distribution of the quantization noise. A circuit makes it possible to perform the subtraction from the perceptual signal of the contribution of the past excitation signal P^0_n to deliver an updated perceptual signal P_n. A long-term prediction circuit is formed, as a closed loop, from a dictionary updated by the modelled page excitation r^1_n for the lowest throughput and makes it possible to deliver an optimal waveform and an associated estimated gain which make up the estimated perceptual signal P^1_n.
    Type: Grant
    Filed: January 21, 1994
    Date of Patent: December 10, 1996
    Assignee: France Telecom
    Inventor: Bruno Lozach
  • Patent number: 5579434
    Abstract: A speech signal bandwidth compression and expansion apparatus and its method. On the transmitting side, system parameters are extracted from a speech signal by a linear prediction analyzer. A prediction residual signal is obtained by inverse filtering processing by using the system parameters. The prediction residual signal is lowered in sampling rate by a down-sampler and converted to a baseband signal. From the baseband signal, a time series signal is derived by a linear prediction synthesizer. Thereafter, the time series signal is converted to an analog signal and transmitted. On the receiving side, a received signal is subjected to inverse filtering processing to reproduce a baseband signal. The sampling rate of the reproduced baseband signal is raised to derive a time series signal. From the time series signal, a high frequency band component is generated. The high frequency band component is added to the baseband signal to generate an excitation signal.
    Type: Grant
    Filed: December 6, 1994
    Date of Patent: November 26, 1996
    Assignee: Hitachi Denshi Kabushiki Kaisha
    Inventors: Yasushi Kudo, Yoshiro Kokuryo
  • Patent number: 5577163
    Abstract: A speech categorization system includes first and second timers which generate first and second measured durations indicative of duration of selected higher and lower amplitude segments included in a voice message. A higher amplitude segment is classified in a first category when the first and second measured durations corresponding to the higher amplitude segment and an adjacent lower amplitude segment satisfy a classification test, and a counter counts the number of the higher amplitude segments classified in the first category. Accented syllables in the higher amplitude segment are recognized to aid classification.
    Type: Grant
    Filed: December 29, 1993
    Date of Patent: November 19, 1996
    Inventor: Peter F. Theis
  • Patent number: 5572681
    Abstract: One frame of speech signal data and subframes of speech signal data divided from the frame of speech signal data are encoded by frame and subframe encoding processes while another frame is decoded by frame and subframe decoding processes. The frame and subframe encoding processes and the frame and subframe decoding processes are interleaved to reduce the DSP memory capacity needed for a speech codec.
    Type: Grant
    Filed: August 16, 1993
    Date of Patent: November 5, 1996
    Assignee: NEC Corporation
    Inventors: Makio Nakamura, Akira Hioki
  • Patent number: 5572593
    Abstract: A method and an apparatus for detecting and controllably extending temporal gaps in speech in dependence on the power thereof, for the purpose of aiding an auditory sense organ. A temporal gap detecting facility for detecting temporal gaps in the input speech signal and a temporal gap extension facility for extending the temporal gap by repetitive addition thereof are provided, wherein the number of repetitions is selected to be proportional to the power of the input speech signal at a time point immediately preceding the temporal gap. Alternatively, the temporal gap extension facility repeatedly adds to the temporal gap a part thereof exclusive of the start and end parts.
    Type: Grant
    Filed: June 23, 1993
    Date of Patent: November 5, 1996
    Assignee: Hitachi, Ltd.
    Inventors: Yoshito Nejime, Hiroshi Ikeda, Yukio Kumagai
  • Patent number: 5572623
    Abstract: A method for detecting the start and end of speech from a noisy signal including the steps of: detecting a voiced frame; searching for noise frames preceding this voiced frame; constructing an autoregressive model of the noise and a mean noise spectrum; bleaching the frames preceding the voicing; searching for the actual start of speech in the bleached frames; removing the noise from the voiced frames and parameterizing them; and searching for the actual end of speech.
    Type: Grant
    Filed: October 21, 1993
    Date of Patent: November 5, 1996
    Assignee: Sextant Avionique
    Inventor: Dominique Pastor
  • Patent number: 5568588
    Abstract: A speech processing system and method are disclosed. In one embodiment of the present invention, the system includes at least a maximum likelihood quantization (MLQ) multi-pulse analysis unit operating on a target vector. The MLQ multi-pulse analysis unit typically determines an initial gain level for the multi-pulse sequence and performs single gain multi-pulse analysis (MPA) a number of times, each with a different gain level. The pulse sequence which most closely represents the target vector is provided as an output signal. In another embodiment, the system includes at least a pulse train multi-pulse analysis unit wherein the target vector is modeled as a series of pulse trains. Each pulse train comprises a plurality of single gain pulses, wherein each pulse is at a position which is a pitch value distance apart from the previous pulse in the pulse train. Combinations of maximum likelihood analyses with pulse trains are also part of the present invention.
    Type: Grant
    Filed: April 29, 1994
    Date of Patent: October 22, 1996
    Assignee: AudioCodes Ltd.
    Inventors: Leon Bialik, Felix Flomen
  • Patent number: 5566272
    Abstract: The user interface in an automatic speech recognition (ASR) system is dynamically controlled, based upon the level of confidence in the results of the ASR process. In one embodiment, the system is arranged to distinguish error-prone ASR interpretations from those likely to be correct, using a degree of confidence in the output of the ASR system determined as a function of the difference between the confidence in the "first choice" selected by the ASR system and the confidence in the "second choice" selected by the ASR system. In this embodiment, the user interface is arranged so that the explicit verification steps taken by the system as a result of uncertain information are different from the action taken when the confidence is high. In addition, different treatment can be provided based upon the "consequences" of misinterpretation as well as the historical performance of the system with respect to the specific user whose speech is being processed.
    Type: Grant
    Filed: October 27, 1993
    Date of Patent: October 15, 1996
    Assignee: Lucent Technologies Inc.
    Inventors: Douglas J. Brems, Max S. Schoeffler
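Note (illustrative, not part of the patent record): the first-choice versus second-choice comparison in the abstract of patent 5566272 can be sketched as below. The function name `choose_action`, the action labels, the single fixed margin threshold, and the treatment of a missing second choice as confidence 0.0 are all assumptions of this sketch; the abstract also mentions the consequences of misinterpretation and per-user history, which are not modeled here.

```python
def choose_action(ranked, margin_threshold):
    """Pick a user-interface action from ranked ASR hypotheses.

    ranked -- list of (hypothesis, confidence) pairs, best first.
    Returns ("accept", text) when the first choice beats the second
    by at least `margin_threshold`, ("verify", text) when an explicit
    verification step is warranted, and ("reprompt", None) when there
    is no hypothesis at all.
    """
    if not ranked:
        return ("reprompt", None)
    best_text, best_conf = ranked[0]
    # Assumption: with no second choice, compare against 0.0.
    second_conf = ranked[1][1] if len(ranked) > 1 else 0.0
    margin = best_conf - second_conf
    if margin >= margin_threshold:
        return ("accept", best_text)
    return ("verify", best_text)
```

Two close candidates such as "call home" at 0.90 and "call Rome" at 0.85 would lead to a verification prompt under a 0.2 margin, while a clear winner would be accepted without one.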
  • Patent number: 5563719
    Abstract: A data recording and/or reproducing device for recording and/or reproducing data on or from a recording medium includes information concerning the recording method used for recording data on a recording medium, that is the encoding/decoding instruction data is recorded on the recording medium along with the data. Using the encoding/decoding instruction on the recording medium, specifically the decoding instruction, the data on the recording medium is restored by a DSP. The data recording and/or reproducing device enables data processed with an inconvenient system or an older model system to be restored easily and inexpensively.
    Type: Grant
    Filed: January 7, 1994
    Date of Patent: October 8, 1996
    Assignee: Sony Corporation
    Inventor: Yoshiaki Oikawa
  • Patent number: 5559791
    Abstract: In a simultaneous voice and data communications system, a voice signal is added to a data signal before transmission over the public switched telephone network (PSTN). In particular, in every signaling interval, a signal point is selected for transmission as a function of both the voice signal and the data signal. Since the voice signal is effectively offset by the data signal, compandors normally found in the PSTN are not effective in improving the signal to noise ratio of the transmitted voice and data signal. Therefore, the voice signal is additionally companded in the transmitter before transmission over the PSTN. This additional companding by the transmitter improves the signal to noise ratio of the combined voice and data signal.
    Type: Grant
    Filed: June 14, 1993
    Date of Patent: September 24, 1996
    Assignee: Lucent Technologies Inc.
    Inventors: Gordon Bremer, Kenneth D. Ko, Luke J. Smithwick