Patents Examined by Donald L. Storm
  • Patent number: 6732074
    Abstract: A standard dictionary; a feature extracting unit which extracts features from an input speech; a matching unit which performs matching between the features of the input speech extracted by the feature extracting unit and the standard dictionary; a result outputting unit which outputs a matching result in the matching unit; and a dictionary updating portion which updates the standard dictionary are provided. The standard dictionary is built initially as a dictionary to be used for recognizing speeches produced by any independent speaker; and the dictionary updating unit updates the standard dictionary so as to provide a dictionary to be used for recognizing speeches produced by a dependent speaker based on the result of matching between the features extracted from the input speech and the standard dictionary.
    Type: Grant
    Filed: January 27, 2000
    Date of Patent: May 4, 2004
    Assignee: Ricoh Company, Ltd.
    Inventor: Masaru Kuroda
  • Patent number: 6728673
    Abstract: A video retrieval data generation apparatus includes an extractor that is configured to extract a characteristic pattern from a voice signal synchronous with a video signal. The video retrieval data generation apparatus also includes an index generator that is configured to set the voice signal for a voice period as a processing target. The index generator is further configured to prepare standard voice patterns of a subword corresponding to a plurality of subwords, detect, for each subword, a characteristic pattern similar to a standard voice pattern at each of the voice periods, and generate, for each subword, an index containing time synchronization information corresponding to a position where the similar characteristic pattern is detected. The video retrieval data generation apparatus also includes a multiplexer that is configured to multiplex video signals, voice signals and indexes to output in a data stream format.
    Type: Grant
    Filed: May 9, 2003
    Date of Patent: April 27, 2004
    Assignee: Matsushita Electric Industrial Co., LTD
    Inventors: Hiroshi Furuyama, Hitoshi Yashio, Ikuo Inoue, Mitsuru Endo, Masakatsu Hoshimi
  • Patent number: 6725190
    Abstract: A speech reconstruction method and system for converting a series of binned spectra or functions thereof such as the Mel Frequency Cepstra Coefficients (MFCC), of an original digitized speech signal, into a reconstructed speech signal, where each binned spectrum has a respective pitch value and voicing decision. The binned spectra are derived from the original digitized speech signal at successive instances by multiplying each estimate of the spectral envelope by a predetermined set of frequency domain window functions and computing the integrals thereof. At each respective time instance, harmonic frequencies and weights are generated according to the respective pitch value and voicing decision. Basis functions having bounded supports on the frequency axis are each sampled at all said harmonic frequencies, which are within its support and multiplied by respective harmonic weights.
    Type: Grant
    Filed: November 2, 1999
    Date of Patent: April 20, 2004
    Assignee: International Business Machines Corporation
    Inventors: Dan Chazan, Gilad Cohen, Ron Hoory
  • Patent number: 6724859
    Abstract: A method for determining the make up of a subscriber loop via improved time-domain reflectometry techniques by analyzing the echo responses generated by transmittal of pulses onto the subscriber loop. In the method, discontinuities along a loop are identified sequentially and in a step-by-step fashion by comparing the measured waveform to suitable waveforms generated on the basis of a hypothesized topology. Once the generated waveform that best matches the measured data has been found and a discontinuity identified, the waveform generated in correspondence of the loop topology identified so far is subtracted from the measured data to produce a compensated waveform, which is more suitable for detection and location of the next echo.
    Type: Grant
    Filed: September 29, 2000
    Date of Patent: April 20, 2004
    Assignee: Telcordia Technologies, Inc.
    Inventor: Stefano Galli
  • Patent number: 6718297
    Abstract: The present invention classifies an input signal as either voice or data with reduced energy consumption. The present invention includes a frequency estimator and an energy estimator for processing an input signal and a classification unit connected to both the frequency and energy estimators for classifying the input signal. The frequency estimator includes a delay and difference integrator. In operation, the delay receives the input signal and generates a delayed input signal and the difference integrator receives the delayed and input signals and generates a frequency estimate value representing both the estimated central frequency of the input signal and the estimated energy of the input signal. The energy estimator generates an estimate of the energy level of the input signal. The classification unit classifies the input as either voice or data based on a comparison of the frequency and energy estimate values and a data threshold value.
    Type: Grant
    Filed: February 15, 2000
    Date of Patent: April 6, 2004
    Assignee: The Boeing Company
    Inventors: Joseph Peebles Pride, III, Edward James Carroll, Cheryl Jean Franklin
  • Patent number: 6718302
    Abstract: A method for utilizing validity constraints in a speech endpoint detector comprises a validity manager that may utilize a pulse width module to validate utterances that include a plurality of energy pulses during a certain time period. The validity manager also may utilize a minimum power module to ensure that speech energy below a pre-determined level is not classified as a valid utterance. In addition the validity manager may use a duration module to ensure that valid utterances fall within a specified duration. Finally, the validity manager may utilize a short-utterance minimum power module to specifically distinguish an utterance of short duration from background noise based on the energy level of the short utterance.
    Type: Grant
    Filed: January 12, 2000
    Date of Patent: April 6, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
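The validity checks described in the abstract of patent 6718302 can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `Utterance` fields, the threshold values, and the order of the checks are hypothetical and are not taken from the patent, and the pulse width module is left out.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    duration_s: float     # length of the candidate utterance, in seconds
    peak_power_db: float  # peak energy of the candidate, in dB


# Hypothetical thresholds; the abstract gives no numeric values.
MIN_POWER_DB = 30.0        # minimum power for any valid utterance
MIN_DURATION_S = 0.1       # shortest accepted utterance
MAX_DURATION_S = 5.0       # longest accepted utterance
SHORT_DURATION_S = 0.3     # below this, the utterance counts as "short"
SHORT_MIN_POWER_DB = 40.0  # stricter power floor for short utterances


def is_valid_utterance(u: Utterance) -> bool:
    """Apply minimum-power, duration, and short-utterance checks in turn."""
    if u.peak_power_db < MIN_POWER_DB:
        return False  # too quiet to be speech
    if not (MIN_DURATION_S <= u.duration_s <= MAX_DURATION_S):
        return False  # outside the accepted duration range
    if u.duration_s < SHORT_DURATION_S and u.peak_power_db < SHORT_MIN_POWER_DB:
        return False  # short and weak: treated as background noise
    return True


candidates = [Utterance(1.0, 35.0), Utterance(0.2, 35.0), Utterance(0.2, 45.0)]
accepted = [is_valid_utterance(u) for u in candidates]
```

A short candidate is accepted only when it also clears the stricter power floor, which is one plausible reading of the short-utterance minimum power module.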
  • Patent number: 6718305
    Abstract: Disclosed is a method for use by a speech recognizer. The method includes determining a regression class tree structure for the speech recognizer, wherein the tree structure includes, representing word subunits or regression classes, as tree leaves, combining the word subunits to form tree nodes using a distance measure for the word subunits in the acoustic space, and combining regression classes to a regression class that lies closer to a tree root of the tree structure using a correlation measure, and wherein at least two of regression classes having the largest correlation parameter are combined to a new regression class that is used in the formation of the regression tree structure, instead of the two combined regression classes, to determine a regression class representing the tree root.
    Type: Grant
    Filed: March 17, 2000
    Date of Patent: April 6, 2004
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Reinhold Häb-Umbach
  • Patent number: 6718308
    Abstract: A system and method for searching, assembling, and manipulating a variety of multi-media using voice converted to text commands. Digital images, movies, audio, or text is verbally searched and retrieved from a variety of video and audio databases using a combination of directional commands and a means for juxtaposing and assembling search results. The desired media is then placed onto a platform means for manipulating and editing the media files. Any retrieved media files and/or images can be manipulated and assembled on-screen using commands such as “zoom” or “move left” by having corners and borders read by the grid of the platform means. The image(s) are also capable of being stacked, or overlay one another to define re-proportioned backgrounds. The image(s) from the platform means are displayed without the grid using an image platter as a means of providing a preliminary view of the presentation prior to projection.
    Type: Grant
    Filed: July 7, 2000
    Date of Patent: April 6, 2004
    Inventor: Daniel L. Nolting
  • Patent number: 6701294
    Abstract: A natural language-based interface to data presentation systems, for example, information visualization system interfaces, is realized by employing so-called open-ended natural language inquiries to the interface, which translates them into database queries and a set of information to be provided to a user. More specifically, a natural language inquiry is translated to database queries by determining if any complete database queries can be formulated based on the natural language inquiry and, if so, specifying which complete database queries are to be made. In accordance with one aspect of the invention, knowledge of the information visualization presentation is advantageously employed in the interface to guide a user in response to the user's natural language inquiries.
    Type: Grant
    Filed: January 19, 2000
    Date of Patent: March 2, 2004
    Assignee: Lucent Technologies, Inc.
    Inventors: Thomas J. Ball, Kenneth Charles Cox, Rebecca Elizabeth Grinter, Stacie Lynn Hibino, Lalita Jategaonkar Jagadeesan, David Alejandro Mantilla
  • Patent number: 6694294
    Abstract: A method and system that improves voice recognition by improving the voice recognizer of a voice recognition system. Mu-law compression of bark amplitudes is used to reduce the effect of additive noise and thus improve the accuracy of the voice recognition system. A-law compression of bark amplitudes is used to improve the accuracy of the voice recognizer. Both mu-law compression and mu-law expansion can be used in the voice recognizer to improve the accuracy of the voice recognizer. Both A-law compression and A-law expansion can be used in the voice recognizer to improve the accuracy of the voice recognizer.
    Type: Grant
    Filed: October 31, 2000
    Date of Patent: February 17, 2004
    Assignee: Qualcomm Incorporated
    Inventor: Harinath Garudadri
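For readers unfamiliar with the mu-law compression mentioned in the abstract of patent 6694294, the following is a minimal sketch of the standard mu-law companding curve and its inverse. It is not the patent's implementation: the value `MU = 255` (common in telephony), the normalization of amplitudes to [-1, 1], and the sample `bark_amplitudes` are all assumptions made for illustration.

```python
import math

MU = 255.0  # assumed value; the abstract does not state which mu is used


def mu_law_compress(x: float, mu: float = MU) -> float:
    """Standard mu-law curve for x in [-1, 1]: sgn(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: float = MU) -> float:
    """Inverse of mu_law_compress for y in [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# Hypothetical bark-band amplitudes, already normalized to [0, 1].
bark_amplitudes = [0.0, 0.01, 0.1, 0.5, 1.0]
compressed = [mu_law_compress(a) for a in bark_amplitudes]
restored = [mu_law_expand(c) for c in compressed]
```

The curve boosts small amplitudes relative to large ones, so low-level band energies occupy more of the output range after compression.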
  • Patent number: 6678655
    Abstract: A method for encoding a digitized speech signal so as to generate data capable of being decoded as speech. A digitized speech signal is first converted to a series of feature vectors using, for example, known Mel-frequency Cepstral coefficients (MFCC) techniques. At successive instances of time a respective pitch value of the digitized speech signal is computed, and successive acoustic vectors each containing the respective pitch value and feature vector are compressed so as to derive therefrom a bit stream. A suitable decoder reverses the operation so as to extract the feature vectors and pitch values, thus allowing speech reproduction and playback. In addition, speech recognition is possible using the decompressed feature vectors, with no impairment of the recognition accuracy and no computational overhead.
    Type: Grant
    Filed: November 12, 2002
    Date of Patent: January 13, 2004
    Assignee: International Business Machines Corporation
    Inventors: Ron Hoory, Dan Chazan, Ezra Silvera, Meir Zibulski
  • Patent number: 6678661
    Abstract: A method for highlighting a desired portion in an audio sequence for use in a visual display challenged environment. The method includes storing the audio sequence in memory. Next, the user selects a desired portion of the audio sequence and the selected portion is distinguished from the remainder of the audio sequence by automatically varying an audio characteristic of the selected portion during playback, without permanently altering the selected portion. In a related embodiment, the audio characteristic that is varied is pitch of the selected portion.
    Type: Grant
    Filed: February 11, 2000
    Date of Patent: January 13, 2004
    Assignee: International Business Machines Corporation
    Inventors: Gordon James Smith, George Willard Van Leeuwen
  • Patent number: 6671668
    Abstract: A speech recognition system is trained to be sensitive not only to the actual spoken text, but also to the manner in which the text is spoken, for example, whether something is said confidently, or hesitatingly. In the preferred embodiment, this is achieved by using a Hidden Markov Model (HMM) as the recognition engine, and training the HMM to recognise different styles of input. This approach finds particular application in the telephony voice processing environment, where short caller responses need to be recognised, and the system can then react in a fashion appropriate to the tone or manner in which the caller has spoken.
    Type: Grant
    Filed: December 20, 2002
    Date of Patent: December 30, 2003
    Assignee: International Business Machines Corporation
    Inventor: Robert Harris
  • Patent number: 6658385
    Abstract: An improved transformation method uses an initial set of Hidden Markov Models (HMMs) trained on a large amount of speech recorded in a low noise environment R to provide rich information on co-articulation and speaker variation and a smaller database in a more noisy target environment T. A set H of HMMs is trained with data provided in the low noise environment R and the utterances in the noisy environment T are transcribed phonetically using set H of HMMs. The transcribed segments are grouped into a set of Classes C. For each subclass c of Classes C, the transformation Φc is found to maximize the likelihood of the utterances in T, given H. The HMMs are transformed and steps repeated until likelihood stabilizes.
    Type: Grant
    Filed: February 10, 2000
    Date of Patent: December 2, 2003
    Assignee: Texas Instruments Incorporated
    Inventors: Yifan Gong, John J. Godfrey
  • Patent number: 6631352
    Abstract: A decoding circuit, for receiving a bit stream including an encoded audio signal and header information used for decoding the encoded audio signal, and decoding the encoded audio signal based on the header information, includes a header analysis section for outputting at least one decoding parameter obtained from the header information and decoding parameter change information indicating whether or not the at least one decoding parameter has been changed; a signal processing section for decoding the encoded audio signal, based on the at least one decoding parameter, into a decoded signal and outputting the decoded signal; an automatic mute processing section for executing automatic mute on the decoded signal after the at least one decoding parameter is changed; and an output section for outputting the decoded signal output from the automatic mute processing section.
    Type: Grant
    Filed: January 3, 2000
    Date of Patent: October 7, 2003
    Assignee: Matsushita Electric Industrial Co. Ltd.
    Inventors: Takeshi Fujita, Takashi Katayama, Masahiro Sueyoshi, Shuji Miyasaka, Masaharu Matsumoto, Akihisa Kawamura, Kazutaka Abe, Kousuke Nishio
  • Patent number: 6629078
    Abstract: A method of coding a time-discrete stereo signal, the stereo signal having a first and a second channel, permits scalable stereo coding. At first, a mono signal is formed from the stereo signal, which is then coded, whereupon the coded mono signal is transmitted to a bit stream. Thereafter, the coded mono signal is decoded again, whereupon stereo information is formed on the basis of the coded/decoded mono signal and the first and second channels, with such stereo information being coded and being also written into the bit stream in order to obtain a bit stream comprising a complete coded monolayer as well as a layer with coded stereo information.
    Type: Grant
    Filed: December 13, 1999
    Date of Patent: September 30, 2003
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Bernhard Grill, Bodo Teichmann, Karlheinz Brandenburg
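The two-layer structure described in the abstract of patent 6629078 can be sketched as follows. This is a toy illustration under stated assumptions, not the patented coder: the uniform quantizer, the step sizes, the averaging downmix, and the per-channel residual used as "stereo information" are all hypothetical stand-ins for whatever codec the patent actually uses.

```python
MONO_STEP = 0.1     # assumed quantizer step for the mono layer
STEREO_STEP = 0.05  # assumed quantizer step for the stereo layer


def quantize(samples, step):
    """Toy coder: map each sample to an integer index."""
    return [round(s / step) for s in samples]


def dequantize(codes, step):
    """Toy decoder: map integer indices back to sample values."""
    return [c * step for c in codes]


def encode(left, right):
    """Return (mono_layer, stereo_layer) as two separate bit-stream layers."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    mono_layer = quantize(mono, MONO_STEP)
    # Decode the coded mono signal again, as the abstract describes, so the
    # stereo layer is formed relative to what the decoder will actually see.
    mono_decoded = dequantize(mono_layer, MONO_STEP)
    left_res = [l - m for l, m in zip(left, mono_decoded)]
    right_res = [r - m for r, m in zip(right, mono_decoded)]
    stereo_layer = (quantize(left_res, STEREO_STEP), quantize(right_res, STEREO_STEP))
    return mono_layer, stereo_layer


def decode(mono_layer, stereo_layer=None):
    """Decode the mono layer alone, or both layers for full stereo."""
    mono = dequantize(mono_layer, MONO_STEP)
    if stereo_layer is None:
        return mono, mono  # mono-only playback: both channels identical
    left_res = dequantize(stereo_layer[0], STEREO_STEP)
    right_res = dequantize(stereo_layer[1], STEREO_STEP)
    left = [m + d for m, d in zip(mono, left_res)]
    right = [m + d for m, d in zip(mono, right_res)]
    return left, right


left_in = [0.30, -0.12, 0.57]
right_in = [0.10, 0.44, -0.33]
mono_layer, stereo_layer = encode(left_in, right_in)
left_out, right_out = decode(mono_layer, stereo_layer)
```

Because the stereo layer is computed against the decoded mono signal rather than the original one, a decoder that receives only the mono layer still produces a complete signal, and one that receives both layers does not inherit the mono layer's quantization error.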
  • Patent number: 6615171
    Abstract: A portable speech signal preprocessing (SSP) device having a microphone for receiving spoken speech and background noise, a digital signal processor (DSP) for processing the received noise into feature vectors, a coupler for coupling to a communication device and for transmission over a communication channel. An automatic speech/speaker recognition (ASSR) server receives over the communication channel the preprocessed speech data and recognizes the spoken speech/speaker. A system having the portable SSP device and the ASSR server can be used to remotely activate, reset, or change PIN codes in smartcards, magnetic cards, or electronic money cards.
    Type: Grant
    Filed: August 13, 1999
    Date of Patent: September 2, 2003
    Assignee: International Business Machines Corporation
    Inventors: Dimitri Kanevsky, Stephane Herman Maes, Peter S Poon, Carl Prochilo
  • Patent number: 6611803
    Abstract: A video retrieval apparatus includes a retrieval data generator that is configured to extract a characteristic pattern from a voice signal synchronous with a video signal to generate an index for video retrieval. The video retrieval apparatus also includes a retrieval processor that is configured to input a key word from a retriever and collate the key word with the index to retrieve a desired video. The retrieval data generator includes a multiplexor that is configured to multiplex video signals, voice signals and indexes to output in data stream format. The retrieval processor includes a demultiplexor that is configured to demultiplex the multiplexed data stream into the video signals, the voice signals and the indexes. A video reproduction apparatus may collate a visual pattern of the key word visual pattern data of the video signal at the time a person vocalizes a sound as the index for retrieval.
    Type: Grant
    Filed: August 14, 2000
    Date of Patent: August 26, 2003
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Hiroshi Furuyama, Hitoshi Yashio, Ikuo Inoue, Mitsuru Endo, Masakatsu Hoshimi
  • Patent number: 6604071
    Abstract: An apparatus and method for data processing that improves estimation of spectral parameters of speech data and reduces algorithmic delay in a data coding operation. Estimation of spectral parameters is improved by adaptively adjusting a gain function used to enhance data based on whether the data contains information speech and noise or noise only. A determination is made concerning whether the speech signal to be processed represents articulated speech or a speech pause and a gain is formed for application to the speech signal. The lowest value the gain may assume (i.e., its lower limit) is determined based on whether the speech signal is known to represent articulated speech or not. The lower limit of the gain during periods of speech activity is constrained to be lower than the lower limit of the gain during speech pause. Also, the gain that is applied to a data frame of the speech signal is adaptively limited based on limited a priori signal-to-noise (SNR) values.
    Type: Grant
    Filed: February 8, 2000
    Date of Patent: August 5, 2003
    Assignee: AT&T Corp.
    Inventors: Richard Vandervoort Cox, Rainer Martin
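The gain-limiting idea in the abstract of patent 6604071 can be illustrated with a short sketch. The Wiener-style gain rule and both floor values are assumptions chosen for illustration; only the ordering of the two floors (lower during speech activity than during speech pause) comes from the abstract, and the limiting of the a priori SNR itself is not modeled here.

```python
# Hypothetical floor values; the abstract only requires FLOOR_SPEECH < FLOOR_PAUSE.
FLOOR_SPEECH = 0.1
FLOOR_PAUSE = 0.3


def wiener_gain(a_priori_snr: float) -> float:
    """Assumed enhancement gain: SNR / (1 + SNR), with SNR on a linear scale."""
    return a_priori_snr / (1.0 + a_priori_snr)


def limited_gain(a_priori_snr: float, is_speech: bool) -> float:
    """Clamp the gain from below, using a floor that depends on speech activity."""
    floor = FLOOR_SPEECH if is_speech else FLOOR_PAUSE
    return max(wiener_gain(a_priori_snr), floor)


gain_speech = limited_gain(0.0, is_speech=True)   # clamped to the speech floor
gain_pause = limited_gain(0.0, is_speech=False)   # clamped to the pause floor
```

With a higher floor during pauses, background noise is attenuated less aggressively when no speech is present, which keeps the residual noise level steadier between speech and pause segments.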
  • Patent number: 6587817
    Abstract: A method which comprises forming a first noise reduction frame (18) containing speech samples, which is windowed by a first window function. For the windowed frame, noise reduction is performed for producing a second noise reduction frame (19; 45). A speech coding frame (44) to be formed comprises noise-reduced samples of at least two successive second noise reduction frames (45, 46), partly summed with one another. On the basis of said speech coding frame (44), a set of speech coding parameters pj are determined. A lookahead part (42) of the speech coding frame is at least partly formed of a first slope (41), the first slope (10, 41) comprising a set of most recent noise-reduced samples of the second noise reduction frame, not summed with the samples of any other second noise reduction frame. The method reduces the delay caused by speech coding and noise reduction.
    Type: Grant
    Filed: January 7, 2000
    Date of Patent: July 1, 2003
    Assignee: Nokia Mobile Phones Ltd.
    Inventors: Antti Vähätalo, Erkki Paajanen