Patents Examined by Thomas J. Onka
-
Patent number: 5664052
Abstract: A method and a device for discriminating a voiced sound from an unvoiced sound or background noise in speech signals are disclosed. Each block or frame of input speech signals is divided into plural sub-blocks and the standard deviation, effective value or the peak value is detected in a detection unit for detecting statistical characteristics from one sub-block to another. A bias detection unit detects a bias on the time scale of the standard deviation, effective value or the peak value to decide whether the speech signals are voiced or unvoiced from one block to another.
Type: Grant
Filed: April 14, 1993
Date of Patent: September 2, 1997
Assignee: Sony Corporation
Inventors: Masayuki Nishiguchi, Jun Matsumoto
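The sub-block idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the bias is measured or thresholded, so the arithmetic-to-geometric-mean ratio, the names `sub_block_rms`, `temporal_bias` and `is_voiced`, the threshold of 1.5, and the rule that an even energy spread means "voiced" are all assumptions made for illustration.

```python
import math

def sub_block_rms(block, n_sub):
    """Split one block into n_sub equal sub-blocks and return each sub-block's RMS."""
    size = len(block) // n_sub
    return [
        math.sqrt(sum(x * x for x in block[i * size:(i + 1) * size]) / size)
        for i in range(n_sub)
    ]

def temporal_bias(values, eps=1e-12):
    """Arithmetic mean over geometric mean: about 1.0 when flat, larger when uneven."""
    arith = sum(values) / len(values)
    geom = math.exp(sum(math.log(v + eps) for v in values) / len(values))
    return arith / geom

def is_voiced(block, n_sub=8, threshold=1.5):
    """Hypothetical decision rule: little bias over time is treated as voiced."""
    return temporal_bias(sub_block_rms(block, n_sub)) < threshold

# A steady tone has the same RMS in every sub-block; a short burst does not.
steady = [math.sin(2 * math.pi * i / 16) for i in range(256)]
burst = [0.0] * 224 + [1.0] * 32
print(is_voiced(steady), is_voiced(burst))
```

The effective value (RMS) is used here; the standard deviation or peak value named in the abstract could be substituted in `sub_block_rms` without changing the rest.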
-
Patent number: 5642463
Abstract: A stereophonic voice recording and playback device for stereophonically recording and reproducing voice signals, includes: an adding circuit for receiving a first channel analog voice signal and a second channel analog voice signal, performing orthogonal conversion of the respective analog voice signals, and adding the orthogonally converted signals; an Analog-to-Digital (A/D) converter for receiving the added signal from the adding circuit and converting the added signal to a digital signal; a compressing circuit for receiving the digital signal from the A/D converter and compressing the digital signal; a memory circuit for storing the compressed digital signal; and an expanding circuit for reading out the digital signal from the memory circuit and reproducing a stereophonic signal.
Type: Grant
Filed: December 21, 1993
Date of Patent: June 24, 1997
Assignee: Sharp Kabushiki Kaisha
Inventors: Takahiko Nakano, Yasumoto Murata
-
Patent number: 5630013
Abstract: An apparatus for transforming an input signal having a time length L into an output signal having a time length αL in accordance with a given time-scale modification ratio α, including a correlator for calculating a value of a correlation function between a first signal and a second signal having a time length T and for determining a time delay T_c at which the value of the correlation function becomes the greatest; an adder for adding the first signal multiplied by a first window function to the second signal multiplied by a second window function with a displacement of the time delay T_c; and an outputting circuit for selectively outputting the output of the adder and a third signal succeeding the output of the adder so that the sum of a time length of the output of the adder and a time length of the third signal is substantially equal to a time length defined by the time-scale modification ratio α, the time delay T_c and the time length T.
Type: Grant
Filed: January 25, 1994
Date of Patent: May 13, 1997
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Ryoji Suzuki, Masayuki Misaki
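The correlator and adder stages described in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not the patented apparatus: the function names `best_delay` and `overlap_add`, the unnormalized correlation, and the linear cross-fade windows are all hypothetical choices, and the output-selection stage governed by the modification ratio is left out.

```python
def best_delay(first, second, max_delay):
    """Return the delay (0..max_delay) at which `second` correlates best with `first`."""
    def corr(delay):
        n = min(len(first) - delay, len(second))
        return sum(first[delay + k] * second[k] for k in range(n))
    return max(range(max_delay + 1), key=corr)

def overlap_add(first, second, delay):
    """Cross-fade from `first` into `second`, with `second` displaced by `delay` samples."""
    overlap = min(len(first) - delay, len(second))
    out = list(first[:delay])
    for k in range(overlap):
        w = (k + 1) / (overlap + 1)  # rising window for `second`, 1 - w for `first`
        out.append((1 - w) * first[delay + k] + w * second[k])
    out.extend(second[overlap:])     # continue with the rest of `second`
    return out

# Two segments of the same periodic waveform, one sample out of phase.
first = [0, 1, 0, -1] * 2
second = [1, 0, -1, 0] * 3
t_c = best_delay(first, second, max_delay=4)
spliced = overlap_add(first, second, t_c)
print(t_c, len(spliced))
```

Aligning the segments at the best-correlated delay before cross-fading keeps the waveform periodic across the splice, which is the point of searching for the delay instead of joining the segments at a fixed offset.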
-
Patent number: 5621852
Abstract: A code excited linear prediction speech communication system includes a ternary codebook of innovation sequences. The ternary codebook is formed as the sum of first and second binary codebooks containing binary codevectors. Code sequences C_k in the ternary codebook are constructed from the set of values, {-1,0,1}. To form the ternary codebook, one binary codebook has the binary codevector values {0,1}, and the other binary codebook has the binary codevector values {-1,0}. The sum of one binary codevector from each binary codebook forms a ternary codevector. The present codebook structure permits several efficient search procedures for an optimum innovation sequence. In particular, the binary codevectors may be searched using a given fidelity criterion function in order to find an optimum ternary codevector representing the optimum innovation sequence.
Type: Grant
Filed: December 14, 1993
Date of Patent: April 15, 1997
Assignee: Interdigital Technology Corporation
Inventor: Daniel Lin
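The codebook construction in this abstract lends itself to a short Python sketch. The summing of a {0,1} codevector and a {-1,0} codevector follows the abstract; everything else is an assumption for illustration: the names `ternary_vector` and `search`, the exhaustive search over all pairs, and the fidelity criterion (squared correlation with the target divided by codevector energy, with no synthesis filter applied).

```python
from itertools import product

def ternary_vector(a, b):
    """Sum a {0,1} codevector and a {-1,0} codevector into a {-1,0,1} codevector."""
    return [x + y for x, y in zip(a, b)]

def search(target, book_pos, book_neg):
    """Exhaustively score every pair and return (score, i, j, codevector) for the best."""
    best = None
    for (i, a), (j, b) in product(enumerate(book_pos), enumerate(book_neg)):
        c = ternary_vector(a, b)
        energy = sum(v * v for v in c)
        if energy == 0:
            continue  # an all-zero codevector cannot represent the target
        corr = sum(t * v for t, v in zip(target, c))
        score = corr * corr / energy  # assumed fidelity criterion
        if best is None or score > best[0]:
            best = (score, i, j, c)
    return best

book_pos = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]     # values in {0, 1}
book_neg = [[0, -1, 0, 0], [0, 0, -1, 0], [-1, 0, 0, 0]]  # values in {-1, 0}
score, i, j, c = search([0.9, -1.1, 0.0, 1.0], book_pos, book_neg)
print(i, j, c)
```

The exhaustive loop is only the simplest possible search; the abstract states that the structure permits more efficient procedures, which this sketch does not attempt.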
-
Patent number: 5621855
Abstract: Subband coding a digital signal having a first and a second signal component in a stereo intensity mode. The digital signal is subband coded to produce a first subband signal having a first q-sample signal block in response to the first signal component, and a second subband signal having a second q-sample signal block in response to the second signal component, the first and the second subband signals being in the same subband, and the first and second signal blocks being time-equivalent. The first and second signal blocks are processed to obtain a minimum distance value representative of a distance between a line and a plurality of points if (a) the points correspond to the respective pairs of time-equivalent samples in the first and second signal blocks, and are plotted in a coordinate system, and (b) the line is plotted such that it traverses the origin and the points at a minimum distance from the points in the coordinate system.
Type: Grant
Filed: October 19, 1994
Date of Patent: April 15, 1997
Assignee: U.S. Philips Corporation
Inventors: Raymond N. J. Veldhuis, Robbert G. van der Waal, Leon M. van de Kerkhof
-
Patent number: 5613037
Abstract: A high reliability digit string recognizer/rejection system that processes spoken words through an HMM recognizer to determine a string of candidate digits, a filler model for each digit in the digit string, and other information. Next, a weighted sum is generated for each digit in the string and for a filler model for each digit in the string. A confidence score is generated for each digit by subtracting the filler weighted sum from the digit weighted sum. The confidence score for each digit is then compared to a threshold and, if the confidence score for any of the digits is less than the threshold, the entire digit string is rejected. If the confidence scores for all of the digits in the digit string are equal to or greater than the threshold, then the candidate digit string is accepted as a digit string.
Type: Grant
Filed: December 21, 1993
Date of Patent: March 18, 1997
Assignee: Lucent Technologies Inc.
Inventor: Rafid A. Sukkar
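The accept/reject rule at the end of this abstract is simple enough to write down directly. In the Python sketch below, the weighted sums are taken as given inputs (the abstract does not say how the HMM recognizer produces them), and the function name `accept_digit_string` and the example numbers are hypothetical.

```python
def accept_digit_string(digit_sums, filler_sums, threshold):
    """Return (accepted, confidences) for one candidate digit string.

    Each confidence is the digit weighted sum minus the filler weighted sum.
    The whole string is rejected if any single confidence is below the threshold.
    """
    confidences = [d - f for d, f in zip(digit_sums, filler_sums)]
    accepted = all(c >= threshold for c in confidences)
    return accepted, confidences

# One weak digit (confidence 1.5) is enough to reject the entire string.
print(accept_digit_string([5.0, 4.0, 6.0], [1.0, 1.5, 2.0], threshold=2.0))
print(accept_digit_string([5.0, 3.0, 6.0], [1.0, 1.5, 2.0], threshold=2.0))
```

A confidence exactly equal to the threshold is accepted, matching the "equal to or greater than" wording of the abstract.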
-
Patent number: 5606642
Abstract: An audio decompression system is disclosed. The corresponding compression system utilizes sub-band analysis filters whose bandwidths are chosen to approximate the critical bands of the human auditory system while avoiding the aliasing problems encountered in QMF filter banks designed to provide similar band splitting. One embodiment of the invention may be implemented on a digital computer. The computational requirements of the synthesis filters may be varied in response to the available computational resources of the computer, thereby allowing a single compressed audio signal to be played back in real time on a variety of platforms by trading off audio quality against available computational resources. Similar trade offs can be made in compressing an audio signal, thereby allowing a platform having limited computational capacity to compress a signal in real time.
Type: Grant
Filed: September 16, 1994
Date of Patent: February 25, 1997
Assignee: Aware, Inc.
Inventors: John P. Stautner, William R. Morrell, Sriram Jayasimha
-
Patent number: 5592586
Abstract: A system for performing voice compression which includes a conversion means for converting analog voice signals into discrete samples of digital voice data and collecting the discrete samples into segments, with a means for dividing the segments into subsegments and for producing therefrom a current voice subsegment. A pitch prediction means is used for determining the long term predicted gain of the current voice subsegment by comparing the current voice subsegment to reconstructed voice samples to produce a pitch predictor gain and a lag component, and a pitch filter means is used for pitch filtering the current voice subsegment based on the pitch predictor gain and the lag component, which then produces long term residual samples.
Type: Grant
Filed: August 11, 1994
Date of Patent: January 7, 1997
Assignee: Multi-Tech Systems, Inc.
Inventors: Sidhartha Maitra, Ashish Thanawala, Steve Young
-
Patent number: 5586215
Abstract: The apparatus for the recognition of speech comprises an acoustic preprocessor, a visual preprocessor, and a speech classifier that operates on the acoustic and visual preprocessed data. The acoustic preprocessor comprises a log mel spectrum analyzer that produces an equal mel bandwidth log power spectrum. The visual processor detects the motion of a set of fiducial markers on the speaker's face and extracts a set of normalized distance vectors describing lip and mouth movement. The speech classifier uses a multilevel time-delay neural network operating on the preprocessed acoustic and visual data to form an output probability distribution that indicates the probability of each candidate utterance having been spoken, based on the acoustic and visual data.
Type: Grant
Filed: May 26, 1992
Date of Patent: December 17, 1996
Assignees: Ricoh Corporation, Ricoh Company, Ltd.
Inventors: David G. Stork, Gregory J. Wolff, Earl I. Levine
-
Patent number: 5586216
Abstract: A method and apparatus for marking audio data as it is recorded, and a user interface for the audio data in a computerized system, is disclosed. A recorder, such as a tape recorder, having a plurality of marker buttons is provided. The audio data is recorded on one channel of a magnetic tape. Any time one of the marker buttons is pressed, a distinct tone is recorded on another channel of the tape as a marker. The audio data and markers are then transferred to the computer system. The user interface provides a graphical display of the audio data, and provides graphical markers which correspond to the marker buttons on the recorder. The audio data can be accessed at any random point, including a point marked by a marker. Without changing modes, a user can access the data at any random point, stop play, select a new point to initiate playback and restart playback, and change the speed of playback.
Type: Grant
Filed: September 25, 1992
Date of Patent: December 17, 1996
Assignee: Apple Computer, Inc.
Inventors: Leo M. W. F. Degen, S. Joy Mountford, Richard Mander, Gitta B. Salomon
-
System for predictive coding/decoding of a digital speech signal by embedded-code adaptive transform
Patent number: 5583963
Abstract: A system for predictive coding of a digital speech signal with embedded codes used in any transmission system or for storing speech signals. The coded digital signal (S_n) is formed by a coded speech signal and, if appropriate, by auxiliary data. A perceptual weighting filter is formed by a filter for short-term prediction of the speech signal to be coded, in order to produce a frequency distribution of the quantization noise. A circuit makes it possible to perform the subtraction from the perceptual signal of the contribution of the past excitation signal P^0_n to deliver an updated perceptual signal P_n. A long-term prediction circuit is formed, as a closed loop, from a dictionary updated by the modelled page excitation r^1_n for the lowest throughput and makes it possible to deliver an optimal waveform and an associated estimated gain which make up the estimated perceptual signal P^1_n.
Type: Grant
Filed: January 21, 1994
Date of Patent: December 10, 1996
Assignee: France Telecom
Inventor: Bruno Lozach
-
Patent number: 5579434
Abstract: A speech signal bandwidth compression and expansion apparatus and its method. On the transmitting side, system parameters are extracted from a speech signal by a linear prediction analyzer. A prediction residual signal is obtained by inverse filtering processing by using the system parameters. The prediction residual signal is lowered in sampling rate by a down-sampler and converted to a baseband signal. From the baseband signal, a time series signal is derived by a linear prediction synthesizer. Thereafter, the time series signal is converted to an analog signal and transmitted. On the receiving side, a received signal is subjected to inverse filtering processing to reproduce a baseband signal. The sampling rate of the reproduced baseband signal is raised to derive a time series signal. From the time series signal, a high frequency band component is generated. The high frequency band component is added to the baseband signal to generate an excitation signal.
Type: Grant
Filed: December 6, 1994
Date of Patent: November 26, 1996
Assignee: Hitachi Denshi Kabushiki Kaisha
Inventors: Yasushi Kudo, Yoshiro Kokuryo
-
Patent number: 5577163
Abstract: A speech categorization system includes first and second timers which generate first and second measured durations indicative of duration of selected higher and lower amplitude segments included in a voice message. A higher amplitude segment is classified in a first category when the first and second measured durations corresponding to the higher amplitude segment and an adjacent lower amplitude segment satisfy a classification test, and a counter counts the number of the higher amplitude segments classified in the first category. Accented syllables in the higher amplitude segment are recognized to aid classification.
Type: Grant
Filed: December 29, 1993
Date of Patent: November 19, 1996
Inventor: Peter F. Theis
-
Patent number: 5572681
Abstract: One frame of speech signal data and subframes of speech signal data divided from the frame of speech signal data are encoded by frame and subframe encoding processes while another frame is decoded by frame and subframe decoding processes. The frame and subframe encoding processes and the frame and subframe decoding processes are interleaved to reduce the DSP memory capacity needed for a speech codec.
Type: Grant
Filed: August 16, 1993
Date of Patent: November 5, 1996
Assignee: NEC Corporation
Inventors: Makio Nakamura, Akira Hioki
-
Patent number: 5572593
Abstract: A method and an apparatus for detecting and controllably extending temporal gaps in speech in dependence on its power, for the purpose of aiding an auditory sense organ. A temporal gap detecting facility for detecting temporal gaps in the input speech signal and a temporal gap extension facility for extending the temporal gap by repetitive addition thereof are provided, wherein the number of repetitions is selected to be proportional to the power of the input speech signal at a time point immediately preceding the temporal gap. Alternatively, the temporal gap extension facility repeatedly adds to the temporal gap a part thereof exclusive of start and end parts.
Type: Grant
Filed: June 23, 1993
Date of Patent: November 5, 1996
Assignee: Hitachi, Ltd.
Inventors: Yoshito Nejime, Hiroshi Ikeda, Yukio Kumagai
-
Patent number: 5572623
Abstract: A method for detecting the start and end of speech from a noisy signal including the steps of: detecting a voiced frame; searching for noise frames preceding this voiced frame; constructing an autoregressive model of the noise and a mean noise spectrum; bleaching the frames preceding the voicing; searching for the actual start of speech in the bleached frames; removing the noise from the voiced frames and parameterizing them; and searching for the actual end of speech.
Type: Grant
Filed: October 21, 1993
Date of Patent: November 5, 1996
Assignee: Sextant Avionique
Inventor: Dominique Pastor
-
Patent number: 5568588
Abstract: A speech processing system and method are disclosed. In one embodiment of the present invention, the system includes at least a maximum likelihood quantization (MLQ) multi-pulse analysis unit operating on a target vector. The MLQ multi-pulse analysis unit typically determines an initial gain level for the multi-pulse sequence and performs single gain multi-pulse analysis (MPA) a number of times, each with a different gain level. The pulse sequence which most closely represents the target vector is provided as an output signal. In another embodiment, the system includes at least a pulse train multi-pulse analysis unit wherein the target vector is modeled as a series of pulse trains. Each pulse train comprises a plurality of single gain pulses, wherein each pulse is at a position which is a pitch value distance apart from the previous pulse in the pulse train. Combinations of maximum likelihood analyses with pulse trains are also part of the present invention.
Type: Grant
Filed: April 29, 1994
Date of Patent: October 22, 1996
Assignee: AudioCodes Ltd.
Inventors: Leon Bialik, Felix Flomen
-
Patent number: 5566272
Abstract: The user interface in an automatic speech recognition (ASR) system is dynamically controlled, based upon the level of confidence in the results of the ASR process. In one embodiment, the system is arranged to distinguish error prone ASR interpretations from those likely to be correct, using a degree of confidence in the output of the ASR system determined as a function of the difference between the confidence in the "first choice" selected by the ASR system and the confidence in the "second choice" selected by the ASR system. In this embodiment, the user interface is arranged so that the explicit verification steps taken by the system as a result of uncertain information are different from the action taken when the confidence is high. In addition, different treatment can be provided based upon the "consequences" of misinterpretation as well as the historical performance of the system with respect to the specific user whose speech is being processed.
Type: Grant
Filed: October 27, 1993
Date of Patent: October 15, 1996
Assignee: Lucent Technologies Inc.
Inventors: Douglas J. Brems, Max S. Schoeffler
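The first-choice versus second-choice comparison in this abstract might look like the following Python sketch. Only the use of the difference between the two confidences comes from the abstract; the function name `choose_action`, the action labels, the default margin of 0.2, and the doubling of the margin for high-consequence requests are assumptions made for illustration.

```python
def choose_action(first_conf, second_conf, margin_threshold=0.2, high_stakes=False):
    """Pick a user-interface action from the gap between the top two ASR hypotheses.

    A small gap means the recognizer could not clearly separate its first and
    second choices, so the user is asked to confirm explicitly.
    """
    if high_stakes:
        margin_threshold *= 2  # assumed: demand a clearer win when errors are costly
    margin = first_conf - second_conf
    return "proceed" if margin >= margin_threshold else "verify"

print(choose_action(0.9, 0.3))                    # clear winner
print(choose_action(0.6, 0.5))                    # close call
print(choose_action(0.9, 0.6, high_stakes=True))  # same gap, stricter margin
```

Per-user historical performance, which the abstract also mentions, could be folded in by adjusting `margin_threshold` per speaker; that is not shown here.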
-
Patent number: 5563719
Abstract: A data recording and/or reproducing device for recording and/or reproducing data on or from a recording medium, in which information concerning the recording method used for recording data on the recording medium, that is, the encoding/decoding instruction data, is recorded on the recording medium along with the data. Using the encoding/decoding instruction on the recording medium, specifically the decoding instruction, the data on the recording medium is restored by a DSP. The data recording and/or reproducing device enables data processed with an inconvenient system or an older model system to be restored easily and inexpensively.
Type: Grant
Filed: January 7, 1994
Date of Patent: October 8, 1996
Assignee: Sony Corporation
Inventor: Yoshiaki Oikawa
-
Patent number: 5559791
Abstract: In a simultaneous voice and data communications system, a voice signal is added to a data signal before transmission over the public switched telephone network (PSTN). In particular, in every signaling interval, a signal point is selected for transmission as a function of both the voice signal and the data signal. Since the voice signal is effectively offset by the data signal, compandors normally found in the PSTN are not effective in improving the signal to noise ratio of the transmitted voice and data signal. Therefore, the voice signal is additionally companded in the transmitter before transmission over the PSTN. This additional companding by the transmitter improves the signal to noise ratio of the combined voice and data signal.
Type: Grant
Filed: June 14, 1993
Date of Patent: September 24, 1996
Assignee: Lucent Technologies Inc.
Inventors: Gordon Bremer, Kenneth D. Ko, Luke J. Smithwick