Abstract: There is provided a voice recognition system comprising a standard pattern memory in which a voice pattern of a predetermined word is stored as a positive reference pattern and also voice patterns of words similar to but different from the first-mentioned word are stored as negative reference patterns, a pattern comparator for calculating dissimilarities of an input voice pattern with respect to the positive reference pattern and negative reference patterns, and a discriminator for providing a coincidence confirmation output signal when the dissimilarity with respect to the positive reference pattern is less than a predetermined threshold value and less than the dissimilarities with respect to the negative reference patterns while otherwise rejecting the result of recognition.
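The discriminator rule in the abstract above can be sketched as follows. This is a minimal illustration, not the patented circuit: the sum-of-absolute-differences metric and the names `dissimilarity` and `confirm` are assumptions, since the abstract does not say how dissimilarity is computed.

```python
from typing import Sequence


def dissimilarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences (an assumed metric, not from the abstract)."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def confirm(input_pattern: Sequence[float],
            positive: Sequence[float],
            negatives: Sequence[Sequence[float]],
            threshold: float) -> bool:
    """Return True only if the input is closer to the positive pattern than
    the threshold AND closer to it than to every negative pattern."""
    d_pos = dissimilarity(input_pattern, positive)
    if d_pos >= threshold:
        return False  # too far from the target word: reject
    # Reject if any similar-but-different word matches at least as well.
    return all(d_pos < dissimilarity(input_pattern, n) for n in negatives)
```

With `positive = [1, 2, 3]` and `negatives = [[1, 2, 6], [4, 2, 3]]`, the input `[1, 2, 4]` is confirmed at threshold 2, while `[1, 2, 5]` is rejected because it is closer to the first negative pattern.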
Abstract: Speech is synthesized by repeated readout of prestored basic speech waveforms. To vary the speech tone frequency, readout is done at a fixed rate while skipping some of the sequentially stored samples.
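A minimal sketch of fixed-rate readout with sample skipping, assuming the stored waveform is one period held in a list and that the skip is a constant integer step (the function name `read_with_skip` and the integer-step restriction are illustrative assumptions):

```python
from typing import List, Sequence


def read_with_skip(waveform: Sequence[float], step: int, n_out: int) -> List[float]:
    """Read n_out samples cyclically from a stored waveform, advancing the
    read address by `step` stored samples per output sample.

    The output clock is unchanged; a larger step shortens the period seen
    at the output and therefore raises the tone frequency.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    if not waveform:
        raise ValueError("waveform must not be empty")
    return [waveform[(i * step) % len(waveform)] for i in range(n_out)]
```

For an 8-sample stored period, `step=1` repeats every 8 output samples and `step=2` repeats every 4, which doubles the frequency at the same readout rate.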
Abstract: A synthesizer for a channel vocoder for the transmission of speech with considerable frequency band reduction in digital technology provides a reduction of circuit expense. This is accomplished, given non-recursive filters having finite impulse response, not with a weighting of the pulse-shaped excitation variable with the transmitted envelope values of the spectral channels, but with a time variance of the filters by weighting their filter coefficients with the transmitted envelope values. In addition, with proper dimensioning of the filter coefficients, an optimum speech quality of synthesized speech is obtained at the output of the synthesizer even given elimination of the transmission of a voiced/voiceless signal.
Abstract: Techniques are disclosed for converting a multibit digital data word into a corresponding analog signal. The converter of the present invention utilizes information in a received signal plus noise associated with the information in such a manner as to reduce the mean squared error in the generated analog signal. The actual voltage associated with each bit position in the digital word is measured. In addition, a measurement of the signal-to-noise ratio is made. From these two measurements a processing device is adapted to generate a value corresponding to the statistical probability that the detected voltage level for each bit is a logical one. The probability values are used as scaling factors to modify the weights to be given each bit during the normal digital to analog conversion process.
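The probability-weighted conversion described above can be sketched as follows. The abstract does not give the probability model, so this example assumes antipodal signalling (+A for a one, -A for a zero), additive Gaussian noise with a known standard deviation, and equally likely bits; `bit_one_probability` and `soft_dac` are hypothetical names.

```python
import math
from typing import Sequence


def bit_one_probability(voltage: float, amplitude: float, noise_sigma: float) -> float:
    """P(bit = 1 | measured voltage) under the assumed +/-A Gaussian model."""
    z = 2.0 * amplitude * voltage / (noise_sigma ** 2)
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_dac(voltages: Sequence[float], amplitude: float, noise_sigma: float) -> float:
    """Convert measured bit voltages (most significant bit first) to an analog
    value, scaling each binary weight by the probability that the bit is one."""
    n = len(voltages)
    return sum(
        bit_one_probability(v, amplitude, noise_sigma) * 2 ** (n - 1 - i)
        for i, v in enumerate(voltages)
    )
```

At high signal-to-noise ratio the probabilities saturate at 0 or 1 and the result matches an ordinary binary-weighted conversion; for an ambiguous bit the weight is reduced, which limits the error a wrong hard decision would cause.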
Abstract: In a speech recognition and control apparatus, the speech signal is filtered into sub-bands, each sub-band signal sampled and binary quantized to form a digital string which is simplified (data reduced or compressed) by further encoding based on detecting in the string an isolated sample having one binary value surrounded by two samples having the other binary value.
Type:
Grant
Filed:
May 17, 1982
Date of Patent:
February 25, 1986
Assignee:
Asulab S.A.
Inventors:
Ngoc C. Bui, Jean-Georges Michel, Jean-Jacques Monbaron
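The isolated-sample detection mentioned in the Asulab abstract above can be illustrated with a short sketch. Only the binary quantization and the detection step are shown; the abstract does not specify the subsequent encoding, so none is attempted here. The zero threshold and the function names are assumptions.

```python
from typing import List, Sequence


def binary_quantize(samples: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Map each sub-band sample to 1 if it exceeds the threshold, else 0."""
    return [1 if s > threshold else 0 for s in samples]


def isolated_indices(bits: Sequence[int]) -> List[int]:
    """Indices of samples whose two neighbours share the opposite binary value,
    for example the middle sample of 0,1,0 or of 1,0,1."""
    return [
        i
        for i in range(1, len(bits) - 1)
        if bits[i - 1] == bits[i + 1] and bits[i] != bits[i - 1]
    ]
```

For the string `0 0 1 0 0 1 1 0 1 0`, the isolated samples are at indices 2, 7 and 8.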
Abstract: In a speech recognition system using time-warping for better matching, the accuracy and efficiency of the dynamic programming computation are improved by expanding the approximated start and end points into an assembly of points, and performing successive iteration for lattice points d_{i,j} spaced N ≥ 3 points apart rather than every lattice point.
Type:
Grant
Filed:
December 9, 1982
Date of Patent:
February 11, 1986
Assignee:
Nippon Telegraph & Telephone Public Corporation
Abstract: The baseband portion x(t) of a voice signal is first applied to a complex filter which provides a couple of signal components y and y in quadrature with each other. Each of the components y and y is then fed to a bank of QMF filters which split the spectra of y and y into N subbands providing subband components y_i and y_i (with i = 1, 2, ..., N), respectively. The subband components y_i and y_i are combined together to generate information representative of the phase (φ_i) and amplitude (ρ_i) of the contents of each subband, which information is then transcoded by a coder which dynamically distributes the available bit resources to the various subbands.
Type:
Grant
Filed:
July 19, 1982
Date of Patent:
February 4, 1986
Assignee:
International Business Machines Corporation
Abstract: A continuous speech recognition system includes a plurality of processors doing template comparisons of speech data. Each processor has an associated memory shared with the other processors by direct memory access (DMA) through a shared data bus. The DMA circuitry is distributed between the processors to eliminate redundancy, since if each processor had a full DMA circuit, one of the circuits would be idle when the processors communicated.
Type:
Grant
Filed:
November 3, 1982
Date of Patent:
January 28, 1986
Assignee:
International Telephone and Telegraph Corporation
Inventors:
George Vensko, Lawrence Carlin, John Potter, Allen R. Smith
Abstract: A speech synthesis system uses a programmable digital divider to generate desired formant frequencies. The divide factor is controlled by an MPU. The output of the divider is passed through a sinewave shaper and then through an electronically controlled amplitude amplifier to provide an audio output. The system is based upon the realization that alternating between two formant frequencies at the pitch rate generates a sound comparable to generating two formants simultaneously and adding algebraically.
Abstract: A speech synthesizer may generate random sounds during die-down after power turn-off. To prevent such sound generation, the synthesizer clock circuit is grounded by an FET simultaneously with power turn-off.
Abstract: An electrical measuring instrument, such as a multi-meter, for measuring such electrical variables as current, voltage, resistance, impedance, frequency and other variables. The results of such measurement are both displayed and are provided as sounds of words of speech indicating measurements as they are made. A memory temporarily records the results of measurement and provides signals, on demand, for repeating the generation of synthetic speech indicating the results of a measurement. In one form, signals representative of a plurality of different measurements are recorded in the memory and each may be selectively reproduced for display and/or generating synthetic speech indicative of the particular measurement represented thereby. A keyboard forming part of the measuring instrument may be employed for the selective reproduction of signals from the memory as well as for coding the recorded signals so that they may be selectively reproduced from the memory.
Abstract: Speech signal pitch detection uses the residual signal output of an LPC (linear prediction coder) filter. The residual signal (or its Hilbert transform) is divided into frames of 20 milliseconds, and each frame is searched for signal peak periodicity by first detecting the highest peak, then detecting a set of equally-spaced selected samples from which the pitch period is determined. If no periodicity is found over three frames, an unvoiced decision is made.
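One possible reading of the peak-periodicity search above is sketched below. It assumes the frame already holds LPC residual samples (computing the residual is not shown), scores each candidate period by the average height of the samples spaced one period apart from the highest peak, and treats the `min_ratio` acceptance threshold as an illustrative assumption. The function names are hypothetical.

```python
from typing import List, Optional, Sequence


def estimate_pitch_period(frame: Sequence[float], min_period: int,
                          max_period: int, min_ratio: float = 0.5) -> Optional[int]:
    """Return the pitch period in samples, or None if no periodicity is found."""
    peak = max(range(len(frame)), key=lambda i: frame[i])  # highest peak
    peak_value = frame[peak]
    if peak_value <= 0:
        return None
    best_period, best_score = None, 0.0
    for period in range(min_period, max_period + 1):
        # Samples equally spaced from the highest peak, on both sides of it.
        idx = [i for i in range(peak % period, len(frame), period) if i != peak]
        if len(idx) < 2:
            continue  # too few points to call this spacing periodic
        score = sum(frame[i] for i in idx) / (len(idx) * peak_value)
        if score > best_score:  # strict '>' keeps the shortest tied period
            best_period, best_score = period, score
    return best_period if best_score >= min_ratio else None


def is_unvoiced(recent_periods: List[Optional[int]]) -> bool:
    """Unvoiced decision: no periodicity found in the last three frames."""
    return len(recent_periods) >= 3 and all(p is None for p in recent_periods[-3:])
```

For example, a 160-sample frame (20 ms at an assumed 8 kHz sampling rate) with unit impulses at samples 10, 60 and 110 yields a period of 50 samples when searched between 20 and 100.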
Abstract: In a machine-implemented voice recognition method, speech signals are first analyzed for feature vectors, which are used to compare input signals with prestored reference signals. Patterns of any suitable form are used to calculate a similarity distance measure d_IJ, which is tested against a threshold to select likely candidates as a first step. A second step selects the most likely candidate by using "common nature" parameters of phonemes such as relative occurrence. Five embodiments of the second step are disclosed, each using a "common nature" criterion of inference to infer (select) the most likely candidate: (1) d'_I = W_1 · W_2 · W_3, where W is a weighting factor; (2) d"_I = C_I d'_I, where C_I is a correction factor; (3) max p(i,j), where p(i) is the probability of occurrence of the i-th phoneme; (4) min d'_ij as a variation of max p(i,j); and (5) N(i), the numerical similarity of the common characteristics of the selected candidates.
Abstract: In a speech recognition and control system for an automobile, the microphone preamplifier gain is made inversely proportional to the background sound noise level. Noise level is determined by averaging the microphone output for 100 milliseconds after the recognition switch is closed, before speech.
Type:
Grant
Filed:
January 6, 1983
Date of Patent:
December 10, 1985
Assignee:
Nissan Motor Company, Limited
Inventors:
Kazunori Noso, Norimasa Kishi, Toru Futami
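The noise-dependent gain rule in the Nissan abstract above can be sketched as follows. The mean-absolute-value level estimate, the proportionality constant `k`, and the `max_gain` cap are assumptions added for illustration; the abstract states only that the level is averaged over the first 100 milliseconds and that the gain is inversely proportional to it.

```python
from typing import Sequence


def noise_level(samples: Sequence[float], sample_rate_hz: int,
                window_ms: int = 100) -> float:
    """Average absolute microphone output over the first window_ms,
    taken after the recognition switch closes and before speech begins."""
    n = int(sample_rate_hz * window_ms / 1000)
    window = samples[:n]
    if not window:
        raise ValueError("not enough samples for the noise window")
    return sum(abs(s) for s in window) / len(window)


def preamp_gain(level: float, k: float = 1.0, max_gain: float = 100.0) -> float:
    """Gain inversely proportional to the noise level, capped at max_gain
    (the cap is an assumption that avoids division by zero in silence)."""
    if level <= 0:
        return max_gain
    return min(max_gain, k / level)
```

With these definitions, doubling the background noise level halves the preamplifier gain.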
Abstract: A connected word recognition system operable according to a DP algorithm and in compliance with a regular grammar, is put into operation in synchronism with successive specification of feature vectors of an input pattern. In an m-th period in which an m-th feature vector is specified, similarity measures are calculated (58, 59) between reference patterns representative of reference words and those fragmentary patterns of the input pattern, which start at several previous periods and end at the m-th period, for start and end states of the reference words. In the m-th period, an extremum of the similarity measures is found (66, 69, 86), together with a particular word and a particular pair of start and end states thereof, and stored (61-63). Moreover, a particular start period is selected (67, 86) and stored (64).
Abstract: A circuit that combines the functions of audio compression and voice detection comprises an amplifier in a negative-feedback configuration in which the feedback signal is the product of the output of the amplifier and a syllabic voltage derived from that output. The syllabic voltage is compared with a threshold level to provide an output that is a measure of the presence of a voice signal. In an alternate embodiment, the syllabic voltage is compared with the sum of a threshold level and a detected syllabic voltage to generate a signal that is a measure of the presence of voice. The syllabic detector may be either an averaging detector or a valley detector.
Type:
Grant
Filed:
December 11, 1984
Date of Patent:
October 29, 1985
Assignee:
Motorola, Inc.
Inventors:
Steven F. Gillig, George H. Fergus, Michael F. Barnes
Abstract: An analog speech signal is sampled at a nominal rate of 6 kilohertz and digitized in a Mu-Law Encoder. The digital output of the Mu-Law Encoder is converted by a microprocessor performing table look-up to linearized pulse code modulation (PCM) samples nominally of eight bits per sample. Using a BSPCM (Block Scaled Pulse Code Modulation) method, in each block of nominally 246 eight-bit PCM samples (representing approximately 41 milliseconds), the maximum and minimum sample values are found and used to calculate a scale factor equal to the maximum sample value minus the minimum sample value, with the difference being then divided by a constant number nominally equaling 16. Then the BSPCM samples are generated from the PCM samples each as a corresponding one PCM sample minus the minimum PCM sample value, the difference being then divided by the scale factor. In effect, the bit rate is reduced by adjusting the step size to follow the local block dynamic range.
Type:
Grant
Filed:
September 20, 1982
Date of Patent:
October 29, 1985
Assignee:
Sperry Corporation
Inventors:
David P. Andersen, Raymond C. Hedin, John F. Siebenand
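The block-scaling arithmetic in the Sperry abstract above can be worked through in a short sketch. The scale factor and the per-sample formula follow the abstract; truncating to an integer, clamping the top code to 15 so that every code fits in four bits, and the decoder are assumptions, as are the function names. A block of 246 samples at 6 kHz spans 246 / 6000 = 0.041 seconds, which matches the stated 41 milliseconds.

```python
from typing import List, Sequence, Tuple


def bspcm_encode_block(pcm: Sequence[int],
                       levels: int = 16) -> Tuple[int, float, List[int]]:
    """Encode one block as (minimum, scale factor, list of BSPCM codes)."""
    lo, hi = min(pcm), max(pcm)
    scale = (hi - lo) / levels  # (max - min) / 16, as in the abstract
    if scale == 0:
        return lo, 0.0, [0] * len(pcm)  # constant block
    # (sample - min) / scale, truncated; the maximum sample would give 16,
    # so it is clamped to 15 here (an assumption, not from the abstract).
    codes = [min(levels - 1, int((s - lo) / scale)) for s in pcm]
    return lo, scale, codes


def bspcm_decode_block(lo: int, scale: float, codes: Sequence[int]) -> List[float]:
    """Approximate reconstruction: minimum + code * scale (assumed decoder)."""
    return [lo + c * scale for c in codes]
```

For the block `[10, 26, 42, 18]` the minimum is 10, the scale factor is 2.0, and the codes are `[0, 8, 15, 4]`, so each sample is carried in four bits instead of eight, plus the per-block minimum and scale factor.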
Abstract: High-frequency broadband noise (at frequencies above the lower-frequency analog information signal) is actually required for this system which transforms the information-plus-noise signal (FIG. 2a) into a binary-coded pulse-width-modulated signal (FIG. 2c) whose modulation represents the original information and noise. Such coding uses broadband infinite clipping to enhance the information to noise ratio, and also enhance the dynamic-range capability. A second embodiment synchronizes the binary-coded signal to a signal clock. Enhanced intelligibility for the hearing-impaired is discussed.
Abstract: Speech and music phonation formed by spectral formants is synthesized by a composite filter containing parallel filters in cascade. The composite filter design is generated by partial-function expansion of the approximate sound channel transfer function ##EQU1## and the composite filter is implemented with mutually adjacent formants in cascade filter elements.
Abstract: A speech-controlled dialing circuit identifies input utterances which may be a command word (mode select), repertory word (dialing name or number), or nonrecognized ("Other"). Responsive to the identification of each occurring input utterance, a set of predetermined templates is selected to identify the next occurring utterance. A programmed microprocessor system is described to implement the main controller function.
Type:
Grant
Filed:
September 7, 1984
Date of Patent:
October 22, 1985
Assignee:
AT&T Bell Laboratories
Inventors:
Frank C. Pirz, Lawrence R. Rabiner, Aaron E. Rosenberg, Jay G. Wilpon