Patents Examined by Harold Zintel
  • Patent number: 6324502
    Abstract: Noisy speech parameters are enhanced by determining a background noise power spectral density (PSD) estimate, determining noisy speech parameters, determining a noisy speech PSD estimate from the speech parameters, subtracting a background noise PSD estimate from the noisy speech PSD estimate, and estimating enhanced speech parameters from the enhanced speech PSD estimate.
    Type: Grant
    Filed: January 9, 1997
    Date of Patent: November 27, 2001
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Peter Handel, Patrik Sörqvist
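The subtraction step in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the function name `subtract_noise_psd` and the `floor` parameter are hypothetical, and clamping negative results to a floor is an assumption that the abstract does not state.

```python
def subtract_noise_psd(noisy_psd, noise_psd, floor=0.0):
    """Subtract a background-noise PSD estimate from a noisy-speech PSD estimate, bin by bin."""
    if len(noisy_psd) != len(noise_psd):
        raise ValueError("PSD estimates must have the same number of bins")
    # Assumption: clamp at `floor` so the enhanced PSD never goes negative.
    return [max(s - n, floor) for s, n in zip(noisy_psd, noise_psd)]


# Three frequency bins; the last bin is dominated by noise and is clamped.
enhanced = subtract_noise_psd([4.0, 2.5, 1.0], [1.0, 0.5, 1.5])
```

Estimating enhanced speech parameters from the resulting PSD is a separate step that this sketch does not cover.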
  • Patent number: 6256609
    Abstract: A method and apparatus for speech analysis and synthesis, including speaker recognition, includes a programmable lattice-ladder notch filter which may be programmed to exhibit both filter poles and filter zeros and thereby exhibit a power spectral density with a better fit to that of a speech frame such that, when energized by a selected signal sample, a more accurate regeneration of speech is achieved. The filter parameters may be reliably and systematically determined as a single solution to a mathematical analysis given a set of gain parameters matching the observed covariance data and having a prescribed set of transmission zeros. These transmission zeros may either be preselected as a design specification, or recovered from analysis of the speech data. A speech frame may be analyzed and provide a set of parameters which may be transmitted to a remote location where a synthesizer may accurately reproduce the speech frame.
    Type: Grant
    Filed: August 5, 1998
    Date of Patent: July 3, 2001
    Assignee: Washington University
    Inventors: Christopher I. Byrnes, Anders Lindquist
  • Patent number: 6243679
    Abstract: A pattern recognition system and method for optimal reduction of redundancy and size of a weighted and labeled graph includes receiving speech signals, converting the speech signals into word sequences, interpreting the word sequences in a graph where the graph is labeled with word sequences and weighted with probabilities, and determinizing the graph by removing redundant word sequences. The size of the graph can also be minimized by collapsing some nodes of the graph in a reverse determinizing manner. The graph can further be tested for determinizability to determine if the graph can be determinized. The resulting word sequence in the graph may be shown in a display device so that recognition of speech signals can be demonstrated.
    Type: Grant
    Filed: October 2, 1998
    Date of Patent: June 5, 2001
    Assignee: AT&T Corporation
    Inventors: Mehryar Mohri, Fernando Carlos Neves Pereira, Michael Dennis Riley
  • Patent number: 6243672
    Abstract: A pitch detection method and apparatus capable of realizing high-precision pitch detection even for speech signals in which half-pitch or double-pitch exhibits stronger autocorrelation than the pitch for detection. An input speech signal is judged as to voicedness or unvoicedness and a voiced portion and an unvoiced portion of the input speech signal are encoded by a sinusoidal analytic encoding unit 114 and by a code excitation encoding unit 120, respectively, for producing respective encoded outputs. The sinusoidal analytic encoding unit 114 performs pitch search on the encoded outputs for finding the pitch information from the input speech signal and sets the high-reliability pitch information based on the detected pitch information. The results of pitch detection are determined based on the high-reliability pitch information.
    Type: Grant
    Filed: September 11, 1997
    Date of Patent: June 5, 2001
    Assignee: Sony Corporation
    Inventors: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto
  • Patent number: 6236962
    Abstract: An apparatus, method, and storage medium for eliminating the influence of line characteristics in a real-time manner in order to raise recognition precision of input speech and to enable the speech to be recognized in a real-time manner, includes a device and step for obtaining an estimate value of a long-time mean of a parameter from speech feature parameters which are sequentially inputted by using the speech feature parameters which have already been inputted, and a device and step for normalizing the speech feature parameter inputted at that time point by using the obtained estimate value. Each time the speech feature parameter is inputted, the latest estimate value is obtained by using the already inputted parameters including the inputted speech feature parameter, and the latest input speech feature parameter is normalized by using the updated estimate value.
    Type: Grant
    Filed: March 12, 1998
    Date of Patent: May 22, 2001
    Assignee: Canon Kabushiki Kaisha
    Inventors: Tetsuo Kosaka, Yasuhiro Komori
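The update-then-normalize loop described in the abstract above can be sketched as a running mean. This is an illustration under stated assumptions: the class name `RunningMeanNormalizer` is hypothetical, the long-time mean is estimated as a plain cumulative average, and normalization is taken to mean subtracting that average. The abstract specifies neither choice.

```python
class RunningMeanNormalizer:
    """Normalize each incoming feature vector by the mean of all vectors seen so far."""

    def __init__(self):
        self.count = 0
        self.mean = None

    def push(self, frame):
        if self.mean is None:
            self.mean = [0.0] * len(frame)
        self.count += 1
        # Update the mean first so that it includes the frame just received.
        self.mean = [m + (x - m) / self.count for m, x in zip(self.mean, frame)]
        # Assumption: "normalizing" means subtracting the updated mean.
        return [x - m for x, m in zip(frame, self.mean)]


normalizer = RunningMeanNormalizer()
first = normalizer.push([2.0, 4.0])
second = normalizer.push([4.0, 0.0])
```

Because the estimate is refreshed on every frame, no look-ahead over the whole utterance is needed, which is what allows real-time operation.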
  • Patent number: 6223160
    Abstract: A portable input apparatus permits an elevator destination call to be generated remotely from an associated elevator installation with respect to time and location. The input apparatus can be formed as, for example, a wristwatch having a user interface including a first display for visually indicating data, a first keyboard for the manual input of data or travel commands and an audio unit for the acoustic input of data or travel commands and for the generation of audio information to the elevator user. A destination floor identification, such as the English word “twenty-seven”, can be an acoustic command destination call entered into the input apparatus and indicated on the first display as “27” whereby in the proximity of the elevator installation, the destination call is automatically communicated to an elevator control of the elevator installation.
    Type: Grant
    Filed: May 20, 1998
    Date of Patent: April 24, 2001
    Assignee: Inventio AG
    Inventors: Miroslav Kostka, Paul Friedli
  • Patent number: 6199035
    Abstract: A method of speech coding a sampled speech signal using long term prediction (LTP). A LTP pitch-lag parameter is determined for each frame of the speech signal by first determining the autocorrelation function for the frame within the signal, between predefined maximum and minimum delays. The autocorrelation function is then weighted to emphasize the function for delays in the neighborhood of the pitch-lag parameter determined for the most recent voiced frame. The maximum value for the weighted autocorrelation function is then found and identified as the pitch-lag parameter for the frame.
    Type: Grant
    Filed: May 6, 1998
    Date of Patent: March 6, 2001
    Assignee: Nokia Mobile Phones Limited
    Inventors: Ari Lakaniemi, Janne Vainio, Pasi Ojala, Petri Haavisto
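The pitch-lag search in the abstract above can be sketched as follows. The weighting function here is an assumption chosen for illustration (a factor that decays with distance from the previous voiced frame's lag); the abstract does not give the actual weighting, and the names `pitch_lag`, `prev_lag`, and `strength` are hypothetical.

```python
def autocorrelation(frame, lag):
    """Autocorrelation of one frame at a single delay."""
    return sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))


def pitch_lag(frame, min_lag, max_lag, prev_lag=None, strength=0.1):
    """Return the delay that maximizes the weighted autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = autocorrelation(frame, lag)
        if prev_lag is not None:
            # Assumed weighting: emphasize delays near the most recent voiced pitch lag.
            score *= 1.0 / (1.0 + strength * abs(lag - prev_lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Impulse train with a period of 8 samples.
frame = [1.0 if n % 8 == 0 else 0.0 for n in range(64)]
lag = pitch_lag(frame, min_lag=4, max_lag=30, prev_lag=9)
```

With this toy weighting, a previous lag close to the true period reinforces the correct peak, while a previous lag far from it can pull the result toward a multiple of the period.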
  • Patent number: 6199038
    Abstract: A signal encoding method for encoding input digital data by high-efficiency encoding whereby acoustic signals can be encoded on a real-time basis with small-sized hardware. The spectral signal components of an input signal are divided into first band units as encoding units, that is units used for encoding, and second band units, for setting an initial value of quantization precision. An estimated value of the required number of bits for each encoding unit is computed. A total number of bits required to encode the spectral components of the input signal is adjusted based on an estimated value of the required number of bits as separated and extracted from one encoding unit.
    Type: Grant
    Filed: January 15, 1997
    Date of Patent: March 6, 2001
    Assignee: Sony Corporation
    Inventor: Kyoya Tsutsui
  • Patent number: 6199042
    Abstract: A reading system includes a computer and a mass storage device and software including instructions for causing a computer to accept an image file generated from optically scanning an image of a document. The software converts the image file into a converted text file that includes text information, and positional information associating the text with the position of its representation in the image file. The software records the voice of an operator of the reading machine as a series of voice samples in synchronization with a highlighting indicia applied to a displayed representation of the document and stores the series of voice samples in a data structure that associates the voice samples with displayed representation. The reading machine plays back the stored, recorded voice samples corresponding to words in the document as displayed by the monitor while highlighting is applied to the words in the displayed document.
    Type: Grant
    Filed: June 19, 1998
    Date of Patent: March 6, 2001
    Assignee: L&H Applications USA, Inc.
    Inventor: Raymond C. Kurzweil
  • Patent number: 6188985
    Abstract: A hand-held wireless voice-activated device (10) for controlling a host system (11), such as a computer connected to the World Wide Web. The device (10) has a display (10a), a microphone (10b), and a wireless transmitter (10g) and receiver (10h). It may also have a processor (10e) and memory (10f) for performing voice recognition. A device (20) can be specifically designed for Web browsing, by having a processor (20e) and memory (20f) that perform both voice recognition and interpretation of results of the voice recognition.
    Type: Grant
    Filed: October 3, 1997
    Date of Patent: February 13, 2001
    Assignee: Texas Instruments Incorporated
    Inventors: Philip R. Thrift, Charles T. Hemphill
  • Patent number: 6185539
    Abstract: In a method for coding an audio signal digitized at a low sampling rate to obtain time domain audio samples, a frequency domain representation of the time domain audio samples is produced. The frequency domain representation includes successive frequency lines. These frequency lines are grouped into a plurality of scale factor bands. The successive frequency lines in a scale factor band are coded with the same scale factor. A plurality of regions is formed by grouping the scale factor bands, wherein successive scale factor bands form a region within which all the scale factors are coded with the same number of bits, which is determined according to the largest scale factor of the region. The scale factors assigned to scale factor bands within the highest region that includes the higher frequency successive frequency lines are set to zero. The frequency lines in the highest region are coded using the zero-valued scale factors that correspond to a multiplication factor of 1.
    Type: Grant
    Filed: May 26, 1998
    Date of Patent: February 6, 2001
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Oliver Kunz, Martin Dietz, Rainer Buchta, Jürgen Zeller, Karlheinz Brandenburg, Martin Sieler, Heinz Gerhäuser
  • Patent number: 6182028
    Abstract: A method (300), device (408), and system (400) provide part-of-speech disambiguation for words based on hybrid neural-network and stochastic processing. The method disambiguates the part-of-speech tags of text tokens by obtaining a set of probabilistically annotated tags for each text token, determining a locally predicted tag for each text token based on the local context of the text token, determining an alternative tag for each text token based on the expanded context of the text token, and choosing between the locally predicted tag and the alternative tag when the locally predicted tag and the alternative tag are different.
    Type: Grant
    Filed: November 7, 1997
    Date of Patent: January 30, 2001
    Assignee: Motorola, Inc.
    Inventors: Orhan Karaali, Andrew William Mackie
  • Patent number: 6182037
    Abstract: Fast and detailed match techniques for speaker recognition are combined into a hybrid system in which speakers are associated in groups when potential confusion is detected between a speaker being enrolled and a previously enrolled speaker. Thus the detailed match techniques are invoked only at the potential onset of saturation of the fast match technique while the detailed match is facilitated by limitation of comparisons to the group and the development of speaker-dependent models which principally function to distinguish between members of a group rather than to more fully characterize each speaker. Thus storage and computational requirements are limited and fast and accurate speaker recognition can be extended over populations of speakers which would degrade or saturate fast match systems and degrade performance of detailed match systems.
    Type: Grant
    Filed: May 6, 1997
    Date of Patent: January 30, 2001
    Assignee: International Business Machines Corporation
    Inventor: Stephane Herman Maes
  • Patent number: 6178238
    Abstract: An arrangement for permitting callers to make speed dialing calls away from their home telephone. A speed dialing list is maintained for those who subscribe to this service for a particular telephone line. When the caller who normally uses that line, and makes use of his/her speed dialing feature from that line, is away from his/her telephone, speed dialing service can be obtained if the call is a calling card or credit/debit card telephone call. Such calls are routed initially to a switch for serving such calls. This switch then queries a calling card or credit card data base to obtain either a number corresponding to the caller's speed calling number, or to obtain an identification of the switch which contains that caller's speed dialing list. The caller is identified by his/her calling card, or credit/debit card number.
    Type: Grant
    Filed: April 9, 1998
    Date of Patent: January 23, 2001
    Assignee: Lucent Technologies Inc.
    Inventors: Barbara Ann Bozek, James Lee Turner
  • Patent number: 6173059
    Abstract: A telephone system includes two or more cardioid microphones held together and directed outwardly from a central point. Mixing circuitry and control circuitry combines and analyzes signals from the microphones and selects the signal from one of the microphones or from one of one or more predetermined combinations of microphone signals in order to track a speaker as the speaker moves about a room or as various speakers situated about the room speak then fall silent. Visual indicators, in the form of light emitting diodes (LEDs) are evenly spaced around the perimeter of a circle concentric with the microphone array. Mixing circuitry produces ten combination signals, A+B, A+C, B+C, A+B+C, A−B, B−C, A−C, A−0.5(B+C), B−0.5(A+C), and C−0.5(B+A), with the “listening beam” formed by combinations, such as A−0.
    Type: Grant
    Filed: April 24, 1998
    Date of Patent: January 9, 2001
    Assignee: Gentner Communications Corporation
    Inventors: Jixiong Huang, Richard S. Grinnell
  • Patent number: 6151578
    Abstract: A system for broadcasting data (D) that can transmit information in the passband of a broadcast audio-frequency signal (S). The system can determine at least one frequency band (F'_13, ..., F'_24) and the amplitude (A'_13, ..., A'_24) of the audio-frequency signal (S). The system compares this amplitude with an auditory masking level (Nm(13), ..., Nm(24)) associated with this frequency band and eliminates the frequency components of the audio-frequency signal in the frequency band if the amplitude of the signal is lower than the auditory masking level of the band. The system can insert the data in this frequency band at a level lower than or equal to the auditory masking level of the frequency band.
    Type: Grant
    Filed: November 21, 1997
    Date of Patent: November 21, 2000
    Assignee: Telediffusion de France
    Inventors: Patrice Bourcet, Denis Masse, Bruno Jahan
  • Patent number: 6141640
    Abstract: A digital transmitter/receiver communications system transmits audio voice signals over a channel with increased quality for a specified bit rate. The method of encoding takes advantage of spherical symmetry of error vectors associated with encoding Line Spectral Frequency (LSF) coefficients, to reduce the information transmitted. Errors in encoding the LSF coefficient sets, vectors J, are modeled by a number of vectors J_p having all positive components, and a sign vector s indicating the polarity of each component of the vector. Each LSF vector J intended to be transmitted is approximated by a positive vector J_p and a sign vector s. An index I_p of the positive vector J_p and the sign vector corresponding to vector J are transmitted, along with other audio information to a receiver/decoder where the signal is decoded into an audio signal closely representing the original signal intended to be transmitted.
    Type: Grant
    Filed: February 20, 1998
    Date of Patent: October 31, 2000
    Assignee: General Electric Company
    Inventor: Peter Warren Moo
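The positive-vector-plus-sign-vector representation described in the abstract above can be sketched as follows. This is a toy illustration only: the codebook contents, the squared-error nearest-neighbor search, and the function names `encode` and `decode` are all assumptions that do not appear in the abstract.

```python
def split_sign_magnitude(vec):
    """Split a vector into a sign vector s and an all-positive magnitude vector."""
    signs = [1 if x >= 0 else -1 for x in vec]
    magnitudes = [abs(x) for x in vec]
    return signs, magnitudes


def nearest_index(codebook, target):
    """Index of the codebook entry with the smallest squared error to target."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - t) ** 2 for c, t in zip(codebook[i], target)),
    )


def encode(vec, positive_codebook):
    # Transmit only the index of the positive vector plus one sign per component.
    signs, magnitudes = split_sign_magnitude(vec)
    return nearest_index(positive_codebook, magnitudes), signs


def decode(index, signs, positive_codebook):
    # Reapply the signs to the selected positive vector.
    return [s * c for s, c in zip(signs, positive_codebook[index])]


positive_codebook = [[1.0, 1.0], [3.0, 0.5]]  # hypothetical entries
index, signs = encode([-2.8, 0.4], positive_codebook)
approximation = decode(index, signs, positive_codebook)
```

Because the codebook only needs to cover the all-positive region, it can be smaller than a codebook covering every sign combination, at a cost of one sign bit per component.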
  • Patent number: 6138091
    Abstract: This invention relates to a method by means of which more than one audio signal can be recorded in compressed form in a memory element, and to a system implementing such a method. In the system according to the invention, audio signal samples are recorded only when voice is detected in the audio signals. The system according to the invention saves memory capacity required by the recording by combining the audio signal samples when voice is detected in samples of more than one audio signal. Furthermore, an audio signal is not recorded when no voice is detected in the signal. The invention also reduces the average computing capacity needed and thus power consumption, since signal combination, or mixing, is advantageously performed only when voice is detected in the samples of more than one audio signal.
    Type: Grant
    Filed: December 17, 1997
    Date of Patent: October 24, 2000
    Assignee: Nokia Mobile Phones Ltd.
    Inventors: Tero Haataja, Ari Sinisalo
  • Patent number: 6138096
    Abstract: Signal processing apparatus for providing a conversion of audio and E-mail signals received from a network interface 13 and for supplying the signals either to an audio reception device 80 in the form of a telephone handset or to a video reception device in the form of a video processor 8 for display on a television set 9 and, selectively, for converting the received signals for reception on the other reception device.
    Type: Grant
    Filed: March 18, 1997
    Date of Patent: October 24, 2000
    Assignee: Add Value Holdings Pte Ltd.
    Inventors: Colin Kum Lok Chan, Khai Pang Tan
  • Patent number: 6134529
    Abstract: The invention extends the capability of conventional computer speech recognition programs to reliably recognize and understand large word and phrase vocabularies for teaching written language skills. At each step of a teaching program, information is supplied to the user such that some responses in the language being taught are correct (or appropriate) and some are incorrect (or inappropriate), with these respective sets of responses judiciously selected to teach some language aspect (e.g., vocabulary, sentence structure). A subset of allowable correct and incorrect responses is selected such that a speech recognition subprogram readily discerns certain allowable responses from other allowable responses, including each incorrect response being discriminable from each correct response. The meanings of at least the correct allowable responses are made clear by aural or visual information, such as graphic images, printed text, or translations into the user's native language.
    Type: Grant
    Filed: February 9, 1998
    Date of Patent: October 17, 2000
    Assignee: Syracuse Language Systems, Inc.
    Inventor: Martin Rothenberg