Patents Examined by Richard J. Kim

Music/voice discriminating apparatus

Patent number: 5375188

Abstract: A music/voice discriminating apparatus is composed of a signal processing portion for effecting the signal processing upon input acoustic signals, a music/voice deciding portion for discriminating whether the input acoustic signals are music or voice, a first signal processing portion for optimally setting acoustic parameters for the signal processing respectively for music or voice, and a second signal processing portion for controlling the acoustic parameters of the first signal processing portion in accordance with the decision results of the music/voice deciding portion so that it may become a desirable value set in the second parameter setting portion.

Type: Grant

Filed: June 8, 1992

Date of Patent: December 20, 1994

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Mitsuhiko Serikawa, Akihisa Kawamura, Masaharu Matsumoto, Hiroko Numazu
Method and system for CELP speech coding and codebook for use therewith

Patent number: 5371853

Abstract: Apparatus and method for encoding speech using a codebook excited linear predictive (CELP) speech processor and an algebraic codebook for use therewith. The CELP speech processor receives a digital speech input representative of human speech and performs linear predictive code analysis and perceptual weighting filtering to produce a short term speech information and a long term speech information. The CELP speech processor utilizes an organized, non-overlapping, algebraic codebook containing a predetermined number of vectors, uniformly distributed over a multi-dimensional sphere to generate a remaining speech residual. The short term speech information, long term speech information and remaining speech residual are combinable to form a quality reproduction of the digital speech input.

Type: Grant

Filed: October 28, 1991

Date of Patent: December 6, 1994

Assignee: University of Maryland at College Park

Inventors: Yuhung Kao, John Baras
Speech recognition circuitry employing nonlinear processing speech element modeling and phoneme estimation

Patent number: 5369726

Abstract: A phoneme estimator in a speech-recognition system includes energy detect circuitry for detecting the segments of a speech signal that should be analyzed for phoneme content. Speech-element processors then process the speech signal segments, calculating nonlinear representations of the segments. The nonlinear representation data is applied to speech-element modeling circuitry which reduces the data through speech element specific modeling. The reduced data are then subjected to further nonlinear processing. The results of the further nonlinear processing are again applied to speech-element modeling circuitry, producing phoneme isotype estimates. The phoneme isotype estimates are rearranged and consolidated, that is, the estimates are uniformly labeled and duplicate estimates are consolidated, forming estimates of words or phrases containing minimal numbers of phonemes. The estimates may then be compared with stored words or phrases to determine what was spoken.

Type: Grant

Filed: February 9, 1993

Date of Patent: November 29, 1994

Assignee: Eliza Corporation

Inventors: John P. Kroeker, Robert L. Powers
Speech synthesizer

Patent number: 5369730

Abstract: In an overlap addition unit, speech waveform data is subjected to overlap addition every period read out from a period storage unit, and in a simple addition unit, the waveform data obtained by the overlap addition and the aperiodic waveform data read out from an aperiodic waveform storage unit are added to each other. Thus, the aperiodic waveform is given to the speech waveform to improve the quality of synthesized speech.

Type: Grant

Filed: May 26, 1992

Date of Patent: November 29, 1994

Assignee: Hitachi, Ltd.

Inventor: Shunichi Yajima
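
The abstract of patent 5369730 above describes two additions: pitch-period waveforms are overlap-added at intervals given by stored periods, and an aperiodic waveform is then simply added to the result. A minimal sketch of that idea follows. The function name `synthesize`, the use of a single repeated waveform, and the sample-count periods are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def synthesize(waveform, periods, aperiodic):
    """Overlap-add `waveform` at offsets set by `periods`, then add `aperiodic`.

    Hypothetical helper: `periods` holds the spacing (in samples) between
    successive placements of the waveform, one entry per placement.
    """
    waveform = np.asarray(waveform, dtype=float)
    # Placement k starts where the previous periods have elapsed.
    starts = np.concatenate(([0], np.cumsum(periods)[:-1])).astype(int)
    out = np.zeros(starts[-1] + len(waveform))
    for s in starts:
        # Overlap addition: samples from neighbouring placements are summed.
        out[s:s + len(waveform)] += waveform
    # Simple addition of the aperiodic component (truncated to fit).
    k = min(len(out), len(aperiodic))
    out[:k] += np.asarray(aperiodic, dtype=float)[:k]
    return out
```

When the period is shorter than the waveform, adjacent placements overlap and their samples sum, which is the overlap-addition step; the aperiodic component is added afterwards without any windowing.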
Signal encoding device

Patent number: 5361323

Abstract: In a signal encoding device, a voice signal is divided into subsignals each comprising a predetermined number of successive samples, and data trains are generated by giving initial values to a recurrence equation. The generated data trains are divided into patterns each comprising the same number of samples as the predetermined number. The distance between each pattern produced by the pattern dividing unit and each of the subsignals is calculated. The pattern that provides the smallest distance with respect to each of the subsignals is identified, and the initial value set in the recurrence equation for generation of the data train that constitutes the pattern is output as coded data representing the respective subsignal.

Type: Grant

Filed: November 29, 1991

Date of Patent: November 1, 1994

Assignee: Sharp Kabushiki Kaisha

Inventors: Yasumoto Murata, Shuichi Yoshikawa, Yuji Nishiwaki, Shuichi Kawama, Tomokazu Morio, Atsunori Kitoh
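
The encoding scheme in the abstract of patent 5361323 above can be sketched as a nearest-pattern search in which each pattern is regenerated from an initial value rather than stored. The abstract does not name the recurrence equation, so the logistic map used here is purely an assumption, as are the function names, the squared Euclidean distance, and the simplification that each data train yields exactly one pattern.

```python
import numpy as np

def generate_pattern(x0, n, r=3.9):
    """Generate n samples from an ASSUMED recurrence x[k+1] = r*x[k]*(1-x[k])."""
    out = np.empty(n)
    x = x0
    for k in range(n):
        out[k] = x
        x = r * x * (1.0 - x)
    return out * 2.0 - 1.0  # shift from [0, 1] to [-1, 1]

def encode(signal, initial_values, n):
    """Return, per subsignal of n samples, the initial value of the closest pattern."""
    patterns = np.stack([generate_pattern(x0, n) for x0 in initial_values])
    usable = len(signal) - len(signal) % n  # drop an incomplete trailing block
    subsignals = np.asarray(signal[:usable], dtype=float).reshape(-1, n)
    # Squared Euclidean distance between every subsignal and every pattern.
    dist = ((subsignals[:, None, :] - patterns[None, :, :]) ** 2).sum(axis=2)
    best = dist.argmin(axis=1)
    return [initial_values[i] for i in best]

def decode(codes, n):
    """Rebuild the signal by rerunning the recurrence from each coded initial value."""
    return np.concatenate([generate_pattern(x0, n) for x0 in codes])
```

Because the decoder can rerun the same recurrence, only the initial values need to be transmitted or stored; the patterns themselves never are.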
Speech recognition system

Patent number: 5355432

Abstract: A speech recognition system includes an acoustic analyzer which produces a time sequence of acoustic parameters from an input speech signal in an utterance boundary thereof, and estimates a trajectory in a parameter space from the time sequence of acoustic parameters. The trajectory is re-sampled in the parameter space at predetermined constant intervals sequentially each time the acoustic parameters are produced by the acoustic analyzing means, thereby producing an input utterance pattern. The input utterance pattern is matched with reference speech patterns to recognize the input speech signal. The speech recognition system also has an utterance boundary detector for detecting the utterance boundary of the input speech signal. The trajectory is re-sampled while the utterance boundary is being detected by the utterance boundary detector.

Type: Grant

Filed: August 12, 1992

Date of Patent: October 11, 1994

Assignee: Sony Corporation

Inventors: Miyuki Tanaka, Masao Watari, Yasuhiko Kato
Noise suppressor

Patent number: 5353408

Abstract: A code conversion table, in which a code of a voice with noise added thereto and a code of a voice without noise are associated with each other in terms of probability, is referred to in a code converter. Using the code converter, a code is obtained in a vector quantizer by vector-quantizing cepstrum coefficients extracted from the voice with noise added thereto, and is converted into a code of a voice obtained by suppressing the noise in the voice with noise added thereto. Linear predictive coefficients are obtained from the code, and the voice signal is reproduced in a synthesis filter according to the linear predictive coefficients.

Type: Grant

Filed: December 30, 1992

Date of Patent: October 4, 1994

Assignee: Sony Corporation

Inventors: Yasuhiko Kato, Masao Watari, Makoto Akabane
System for embedded coding of speech signals

Patent number: 5353373

Abstract: The set of possible excitation signals is subdivided into a plurality of subsets, the first of which provides the contribution to the coded signal necessary to set up a transmission at a minimum rate guaranteed by the network, while the others supply a contribution which, when added to that of the first subset, causes a rate increase by successive steps. At the receiving side, a decoded signal is generated by using the excitation contribution of the first subset alone if the coded signals are received at the minimum rate, while for rates higher than the minimum rate the contributions of the subsets which have allowed such rate increase are also used.

Type: Grant

Filed: December 4, 1991

Date of Patent: October 4, 1994

Assignee: SIP - Societa Italiana per l'Esercizio delle Telecomunicazioni P.A.

Inventors: Rosario Drogo de Iacovo, Roberto Montagna, Daniele Sereno
System and method for improved speech acquisition for hands-free voice telecommunication in a noisy environment

Patent number: 5353376

Abstract: Systems and methods for improved speech acquisition are disclosed including a plurality of linearly arrayed sensors to detect spoken input and to output signals in response thereto, a beamformer connected to the sensors to cancel a preselected noise portion of the signals to thereby produce a processed signal, and a speech recognition system to recognize the processed signal and to respond thereto. The beamformer may also include an adaptive filter with enable/disable circuitry for selectively training the adaptive filter for a predetermined period of time. A highpass filter may also be used to filter a preselected noise portion of the sensed signals before the signals are forwarded to the beamformer. The speech recognition system may include a speaker independent base which is able to be adapted by a predetermined amount of training by a speaker, and which system includes a voice dialer or a speech coder for telecommunication.

Type: Grant

Filed: March 20, 1992

Date of Patent: October 4, 1994

Assignee: Texas Instruments Incorporated

Inventors: Sang G. Oh, Vishu R. Viswanathan
Method of speech recognition

Patent number: 5345536

Abstract: A set of "m" feature parameters is generated every frame from reference speech which is spoken by at least one speaker and which represents recognition-object words, where "m" denotes a preset integer. A set of "n" types of standard patterns is previously generated on the basis of speech data of a plurality of speakers, where "n" denotes a preset integer. Matching between the feature parameters of the reference speech and each of the standard patterns is executed to generate a vector of "n" reference similarities between the feature parameters of the reference speech and each of the standard patterns every frame. The reference similarity vectors of respective frames are arranged into temporal sequences corresponding to the recognition-object words respectively. The reference similarity vector sequences are previously registered as dictionary similarity vector sequences. Input speech to be recognized is analyzed to generate "m" feature parameters from the input speech.

Type: Grant

Filed: December 17, 1991

Date of Patent: September 6, 1994

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Masakatsu Hoshimi, Maki Miyata, Shoji Hiraoka, Katsuyuki Niyada
Speech recognizer

Patent number: 5337394

Abstract: In the speech recognizer disclosed herein, alignment of an unknown speech segment, represented by a finely gradiated sequence of frames, with a model segment represented by a sequence of states is performed by first preparing respective coarse sequences representing the unknown and model segments thereby to define a coarse matrix representing possible alignments. The fine sequences correspondingly define a fine matrix. A best alignment of the coarse sequences is determined thereby to define a coarse path through the coarse matrix. The coarse path is overlaid on the fine matrix and a corridor is defined which includes fine matrix locations which lie within a preselected metric of the coarse path. Only transitions within the corridor are calculated in determining the fine alignment of the unknown speech segment with the model segment, thereby significantly reducing the number of computations required.

Type: Grant

Filed: June 9, 1992

Date of Patent: August 9, 1994

Assignee: Kurzweil Applied Intelligence, Inc.

Inventor: Vladimir Sejnoha
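
The coarse-to-fine alignment in the abstract of patent 5337394 above can be illustrated with a corridor-constrained dynamic time warping. This is only a sketch under stated assumptions: the patent aligns frames with model states, whereas this example uses plain DTW on scalar sequences, builds the coarse sequences by block averaging, and widens the corridor by a fixed number of cells. All function names and the parameters `f` and `width` are hypothetical.

```python
import numpy as np

def dtw(a, b, allowed=None):
    """DTW cost and path for 1-D sequences; cells with allowed == False are skipped."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if allowed is not None and not allowed[i - 1, j - 1]:
                continue  # outside the corridor: never computed
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    if not np.isfinite(cost[n, m]):
        raise ValueError("no alignment path inside the allowed region")
    path, i, j = [], n, m
    while i > 0 and j > 0:  # backtrack along the cheapest predecessors
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1, j - 1], i - 1, j - 1),
                      (cost[i - 1, j], i - 1, j),
                      (cost[i, j - 1], i, j - 1))
    return cost[n, m], path[::-1]

def coarsen(x, f):
    """Average blocks of f samples (the last block is padded with the final value)."""
    x = np.asarray(x, dtype=float)
    pad = (-len(x)) % f
    if pad:
        x = np.concatenate([x, np.full(pad, x[-1])])
    return x.reshape(-1, f).mean(axis=1)

def corridor(coarse_path, shape, f, width):
    """Mark fine-matrix cells within `width` cells of each coarse path block."""
    n, m = shape
    allowed = np.zeros(shape, dtype=bool)
    for ci, cj in coarse_path:
        i0, i1 = max(ci * f - width, 0), min((ci + 1) * f + width, n)
        j0, j1 = max(cj * f - width, 0), min((cj + 1) * f + width, m)
        allowed[i0:i1, j0:j1] = True
    return allowed

def coarse_to_fine_align(unknown, model, f=4, width=2):
    """Align coarsely first, then align finely inside the resulting corridor only."""
    _, coarse_path = dtw(coarsen(unknown, f), coarsen(model, f))
    allowed = corridor(coarse_path, (len(unknown), len(model)), f, width)
    cost, path = dtw(unknown, model, allowed)
    return cost, path, allowed
```

The constrained cost can never be lower than the unconstrained one, so the corridor width trades accuracy against the number of fine-matrix cells evaluated.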
IC card with built-in voice synthesizing function

Patent number: 5325463

Abstract: An IC card with a built-in voice synthesizing function is provided which includes a solid-state memory containing vector-quantized coded data, a pattern generating means for generating patterns each composed of a prescribed number of digital data by repeatedly performing calculations using a recurrence equation with initial values given by coded data read out of the solid-state memory, a converter circuit for limiting the band of each pattern generated by the pattern generating means and converting into an analog voice signal, and a gain control circuit for adjusting the gain of the analog voice signal output from the converter circuit on the basis of gain data read out of the solid-state memory.

Type: Grant

Filed: January 30, 1992

Date of Patent: June 28, 1994

Assignee: Sharp Kabushiki Kaisha

Inventors: Yasumoto Murata, Shuichi Yoshikawa, Yuji Nishiwaki
System and method for speech synthesis employing improved formant composition

Patent number: 5325462

Abstract: A method, system and process to improve the formant composition in a speech synthesis system so that the formants are more intelligible. The system employs a process in the memory of a processor to change the starting and ending frequency of phonemes from the frequency of the independent phonemes. The process examines preceding and succeeding ending phoneme frequency values to detect similar phoneme frequency values. If a dissimilar value is detected, then the invention provides for exchange of the formants to render the resulting speech more intelligible.

Type: Grant

Filed: August 3, 1992

Date of Patent: June 28, 1994

Assignee: International Business Machines Corporation

Inventor: Peter W. Farrett
Low-delay audio signal coder, using analysis-by-synthesis techniques

Patent number: 5321793

Abstract: A low-delay audio signal coding system, using analysis-by-synthesis techniques, has circuitry for adapting the spectral parameters and the prediction order of synthesis filters, and of perceptual weighting filters in the order at each frame, starting from the reconstructed signal relevant to the previous frame. In the case of a CELP coder, gain controls are also provided to adapt, starting from the reconstructed signal, a factor, bound to the average power of the input signal, of the gain by which the innovation vectors are weighted.

Type: Grant

Filed: May 21, 1993

Date of Patent: June 14, 1994

Assignee: SIP - Societa Italiana per l'Esercizio delle Telecomunicazioni P.A.

Inventors: Rosario Drogo De Iacovo, Roberto Montagna, Daniele Sereno
Method and apparatus for context-dependent estimation of multiple probability distributions of phonetic classes with multilayer perceptrons in a speech recognition system

Patent number: 5317673

Abstract: In a hidden Markov model-based speech recognition system, multilayer perceptrons (MLPs) are used in context-dependent estimation of a plurality of state-dependent observation probability distributions of phonetic classes. Estimation is obtained by the Bayesian factorization of the observation likelihood in terms of posterior probabilities of phone classes assuming the context and the input speech vector. The context-dependent estimation is employed as the state-dependent observation probabilities needed as parameter input to a hidden Markov model speech processor to identify the word sequence representing the unknown speech input of input speech vectors. Within the speech processor, models are provided which employ the observation probabilities in the recognition process.

Type: Grant

Filed: June 22, 1992

Date of Patent: May 31, 1994

Assignee: SRI International

Inventors: Michael H. Cohen, Horacio E. Franco
Pen recorder

Patent number: 5313557

Abstract: A pen recorder comprises a hollow pen body having a proximal end and a distal end, and a solid state audio recorder in its interior. A retractable pen mechanism is mounted on the distal end of the pen body, and a microphone transducer and speaker transducer are mounted at the proximal end. The speaker transducer is transversely mounted across an open proximal end of the pen body so that the interior of the pen body acts as a resonator improving the sound quality of the speaker.

Type: Grant

Filed: December 17, 1991

Date of Patent: May 17, 1994

Assignee: Machina

Inventor: Ralph Osterhout
Apparatus for processing digital audio signal

Patent number: 5303374

Abstract: A digital audio signal processing apparatus is provided having a predictive error generator for generating predictive error data by processing input digital data to acquire a plurality of different frequency characteristics. A selector selects one of the plural predictive error data. A requantizer requantizes the selected predictive error data. A corrector processes with a predetermined frequency characteristic, the requantization error induced during the operation of the requantizer, thereby correcting the requantization error caused in the requantizer. A frequency characteristic control selects at least two of the predictive error data obtained with the plural frequency characteristics, then calculates the selected predictive error data and controls the frequency characteristic in the corrector in accordance with the result of such calculation.

Type: Grant

Filed: October 15, 1991

Date of Patent: April 12, 1994

Assignee: Sony Corporation

Inventors: Satoshi Mitsuhashi, Masayuki Nishiguchi