Patents Examined by Tālivaldis Ivars Šmits
  • Patent number: 6816837
    Abstract: A voice controlled capture device contains a processor that receives voice macros to control its operation. The capture device receives voice input, digitizes and sends the input to a second processor in a host computer system where speech recognition software within the host computer interprets the voice input to select a macro, and returns commands from the macro to the capture device where they are executed. Utilizing an interface or macro recorder within the capture device, and the speech recognition software within the host computer, the user can create voice macros incorporating individual voice commands. In a second embodiment, the capture device both analyzes the voice input and executes the commands of the macro.
    Type: Grant
    Filed: May 6, 1999
    Date of Patent: November 9, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Kenneth P. Davis
  • Patent number: 6807524
    Abstract: A perceptual weighting device for producing a perceptually weighted signal in response to a wideband signal comprises a signal pre-emphasis filter, a synthesis filter calculator, and a perceptual weighting filter. The signal pre-emphasis filter enhances the high frequency content of the wideband signal to thereby produce a pre-emphasized signal. The signal pre-emphasis filter has a transfer function of the form: $P(z) = 1 - \mu z^{-1}$, wherein $\mu$ is a pre-emphasis factor having a value located between 0 and 1. The synthesis filter calculator is responsive to the pre-emphasized signal for producing synthesis filter coefficients. Finally, the perceptual weighting filter processes the pre-emphasized signal in relation to the synthesis filter coefficients to produce the perceptually weighted signal. The perceptual weighting filter has a transfer function, with fixed denominator, of the form: $W(z) = A(z/\gamma_1) / (1 - \gamma_2 z^{-1})$, where $0 < \gamma_2 < \gamma_1 \le 1$.
    Type: Grant
    Filed: June 20, 2001
    Date of Patent: October 19, 2004
    Assignee: Voiceage Corporation
    Inventors: Bruno Bessette, Redwan Salami, Roch Lefebvre
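The two filters named in the abstract of patent 6807524 can be sketched in a few lines of Python. This is only an illustration of the stated transfer functions, not the patented implementation: the function names, the zero initial filter state, and the convention that the LPC polynomial A(z) is passed as a list `[1, a1, ..., ap]` are assumptions made here.

```python
def pre_emphasize(x, mu):
    """Apply P(z) = 1 - mu * z^-1, i.e. y[n] = x[n] - mu * x[n-1].

    Assumes x[-1] = 0 (zero initial state) and 0 < mu < 1.
    """
    out = []
    prev = 0.0
    for sample in x:
        out.append(sample - mu * prev)
        prev = sample
    return out


def perceptual_weight(x, a, gamma1, gamma2):
    """Apply W(z) = A(z/gamma1) / (1 - gamma2 * z^-1).

    a is the (hypothetical) coefficient list of A(z) = sum_i a[i] * z^-i.
    The abstract requires 0 < gamma2 < gamma1 <= 1.
    """
    # Substituting z/gamma1 for z scales the i-th coefficient by gamma1**i.
    b = [coef * gamma1 ** i for i, coef in enumerate(a)]
    out = []
    prev_y = 0.0
    for n in range(len(x)):
        # Numerator: FIR part using the scaled LPC coefficients.
        fir = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
        # Denominator: one-pole recursion with the fixed coefficient gamma2.
        y = fir + gamma2 * prev_y
        out.append(y)
        prev_y = y
    return out
```

Because the denominator depends only on `gamma2` and not on the LPC coefficients, it stays fixed from frame to frame, which is the "fixed denominator" property the abstract mentions.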
  • Patent number: 6807529
    Abstract: A multimodal network element facilitates concurrent multimodal communication sessions through differing user agent programs on one or more devices. For example, a user agent program communicating in a voice mode, such as a voice browser in a voice gateway that includes a speech engine and call/session termination, is synchronized with another user agent program operating in a different modality, such as a graphical browser on a mobile device. The plurality of user agent programs are operatively coupled with a content server during a session to enable concurrent multimodal interaction.
    Type: Grant
    Filed: February 27, 2002
    Date of Patent: October 19, 2004
    Assignee: Motorola, Inc.
    Inventors: Greg Johnson, Senaka Balasuriya, James Ferrans, Jerome Jahnke, Rainu Pierce, David Cuka, Dilani Galagedara
  • Patent number: 6792404
    Abstract: A handheld audio spectrum analyzer includes a stored STI measurement algorithm and a selector for selecting the stored STI measurement algorithm to process a transduced sound signal received by a microphone associated with the handheld audio spectrum analyzer to provide the STI between the microphone and the source of the sound signal that transduces an audio test signal related to the STI measurement algorithm stored in the handheld audio spectrum analyzer.
    Type: Grant
    Filed: January 22, 2001
    Date of Patent: September 14, 2004
    Assignee: Bose Corporation
    Inventor: Kenneth Dylan Jacob
  • Patent number: 6772120
    Abstract: Computer method and apparatus for segmenting text streams is disclosed. Given is an input text stream formed of a series of words. A probability member provides working probabilities that a group of words is of a topic selected from a plurality of predetermined topics. The probability member accounts for relationships between words. A processing module receives the input text stream and using the probability member determines probability of certain words in the input text stream being of a same topic. As such, the processing module segments the input text stream into single topic groupings of words, where each grouping is of a respective single topic.
    Type: Grant
    Filed: November 21, 2000
    Date of Patent: August 3, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Pedro J. Moreno, David M. Blei
  • Patent number: 6766293
    Abstract: In a method for signalling a noise substitution when coding an audio signal, the time-domain audio signal is first transformed into the frequency domain to obtain spectral values. The spectral values are subsequently grouped together to form groups of spectral values. On the basis of a detection establishing whether a group of spectral values is a noisy group or not, a codebook is allocated to a non-noisy or tonal group by means of a codebook number for redundancy coding of the same. If a group is noisy, an additional codebook number which does not refer to a codebook is allocated to it in order to signal that this group is noisy and therefore does not have to be redundancy coded. By signalling noise substitution by means of a Huffman codebook number for noisy groups of spectral values, which are e.g.
    Type: Grant
    Filed: August 18, 1999
    Date of Patent: July 20, 2004
    Assignee: Fraunhofer-Gesellschaft Zur Foerderung der Angewandten Forschung E.V.
    Inventors: Jürgen Herre, Uwe Gbur, Andreas Ehret, Martin Dietz, Bodo Teichmann, Oliver Kunz, Karlheinz Brandenburg, Heinz Gerhäuser
  • Patent number: 6760701
    Abstract: The voice print system of the present invention is a subword-based, text-dependent automatic speaker verification system that embodies the capability of user-selectable passwords with no constraints on the choice of vocabulary words or the language. An estimate of the enrollment channel and of the test channel is developed for inverse filtering of the enrollment or the test speech, respectively. Automatic blind speech segmentation allows speech to be segmented into subword units without any linguistic knowledge of the password. Subword modeling is performed using multiple classifiers. The system also takes advantage of such concepts as multiple classifier fusion and data resampling to successfully boost the performance. Key word/key phrase spotting is used to optimally locate the password phrase. Numerous adaptation techniques increase the flexibility of the base system, and include: channel adaptation, fusion adaptation, model adaptation and threshold adaptation.
    Type: Grant
    Filed: January 8, 2002
    Date of Patent: July 6, 2004
    Assignee: T-NETIX, Inc.
    Inventors: Manish Sharma, Xiaoyu Zhang, Richard J. Mammone
  • Patent number: 6760698
    Abstract: A speech coding system includes an adaptive codebook containing excitation vector data associated with corresponding adaptive codebook indices (e.g., pitch lags). Different excitation vectors in the adaptive codebook have distinct corresponding resolution levels. The resolution levels include a first resolution range of continuously variable or finely variable resolution levels. A gain adjuster scales a selected excitation vector data or preferential excitation vector data from the adaptive codebook. A synthesis filter synthesizes a synthesized speech signal in response to an input of the scaled excitation vector data. The speech coding system may be applied to an encoder, a decoder, or both.
    Type: Grant
    Filed: February 12, 2001
    Date of Patent: July 6, 2004
    Assignee: Mindspeed Technologies Inc.
    Inventor: Yang Gao
  • Patent number: 6748355
    Abstract: A sound synthesis method for modeling and synthesizing dynamic, parameterized sounds. The sound synthesis method yields perceptually convincing sounds and provides flexibility through model parameterization. By manipulating model parameters, a variety of related, but perceptually different sounds can be generated. The result is subtle changes in sounds, in addition to synthesis of a variety of sounds, all from a small set of models. The sound models can change dynamically according to changes in the simulation environment. The method is applicable to both stochastic (impulse-based) and non-stochastic (pitched) sounds.
    Type: Grant
    Filed: January 28, 1998
    Date of Patent: June 8, 2004
    Assignee: Sandia Corporation
    Inventors: Nadine E. Miner, Thomas P. Caudell
  • Patent number: 6745165
    Abstract: A method and system uses a finite state command grammar coordinated with application scripting to recognize voice command structures for performing an event from an initial location to a new location. The method involves a series of steps, including: recognizing an enabling voice command specifying the event to be performed from the initial location; determining a functional expression for the enabling voice command defined by one or more actions and objects; storing the action and object in a memory location; receiving input specifying the new location; recognizing an activating voice command for performing the event up to the new location; retrieving the stored action and object from the memory location; and performing the event from the initial location to the new location according to the retrieved action and object. Preferably, the enabling-activating command is phrased as “from here . . . to here”.
    Type: Grant
    Filed: June 16, 1999
    Date of Patent: June 1, 2004
    Assignee: International Business Machines Corporation
    Inventors: James R. Lewis, Kerry A. Ortega, Maria E. Smith, Thomas A. Kist, Linda M. Boyer
  • Patent number: 6738738
    Abstract: A method of transforming a voice application program designed for US English speakers to a voice application program for UK English speakers using a computer system is described. In one embodiment, scripts and grammars associated with the voice application program are converted from US-to-UK English. The process includes spelling normalization, lexical normalization, and pronunciation conversion (including where appropriate accounting for stress shifts). The result is necessary word pronunciations for speech recognition of UK English speaker (especially for proper nouns) as well as a script that has been conformed to use UK English spelling and lexical conventions. Additionally, the script can be annotated with pronunciations as a part of the process. Further, in one embodiment a web based interface to the conversion process is provided either standalone or as part of a voice application development environment.
    Type: Grant
    Filed: December 23, 2000
    Date of Patent: May 18, 2004
    Assignee: Tellme Networks, Inc.
    Inventor: Caroline G. Henton
  • Patent number: 6725195
    Abstract: Probabilistic recognition using clusters and simple probability functions provides improved performance by employing a limited number of clusters each using a relatively large number of simple probability functions. The simple probability functions for each of the limited number of state clusters are greater in number than the limited number of state clusters.
    Type: Grant
    Filed: October 22, 2001
    Date of Patent: April 20, 2004
    Assignee: SRI International
    Inventors: Ananth Sankar, Venkata Ramana Rao Gadde
  • Patent number: 6725194
    Abstract: In a speech recognition device (1) comprising receiving means (36) for receiving voice information (AI) uttered by a speaker and including speech coefficient memory means (38) for storing a speech coefficient indicator (SKI, PRI, SMI, WI) and including speech recognition means (42) which are arranged for recognizing text information (RTI) corresponding to the received voice information (AI) by evaluating the voice information (AI) and the speech coefficient indicator (SKI, PRI, SMI, WI), and including correction means (49) for correcting the recognized text information (RTI) and for producing corrected text information (CTI), text comparing means (52) are provided for comparing the recognized text information (RTI) with the corrected text information (CTI) and for determining at least a correspondence indicator (CI) and the adjusting means (50) are provided for adjusting the stored speech coefficient indicator (SKI, PRI, SMI, WI) by evaluating only one of such text parts (P2) of the corrected text information
    Type: Grant
    Filed: July 6, 2000
    Date of Patent: April 20, 2004
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Heinrich Bartosik, Walter Müller, Martin Schatz
  • Patent number: 6711542
    Abstract: The invention relates to a method of identifying a language in which a text is composed in the form of a string of characters, and also to a method of controlling a speech reproduction unit and to a communication device. To be able to carry out language identification with little expenditure, it is provided according to the invention that a frequency distribution (h1(x), h2(x,y), h3(x,y,z)) of letters in the text is ascertained, the ascertained frequency distribution (h1(x), h2(x,y), h3(x,y,z)) is compared with corresponding frequency distributions (l1(x), l2(x,y), l3(x,y,z)) of available languages, in order to ascertain similarity factors (s1, s2, s3) which indicate the similarity of the language of the text with the available languages, and the language for which the ascertained similarity factor (s1, s2, s3) is the greatest is established as the language of the text.
    Type: Grant
    Filed: December 28, 2000
    Date of Patent: March 23, 2004
    Assignee: Nokia Mobile Phones Ltd.
    Inventor: Wofgang Theimer
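The comparison of letter frequency distributions described in the abstract of patent 6711542 can be illustrated with a short Python sketch. The abstract does not say how the similarity factors are computed or combined, so two assumptions are made here: similarity is measured as histogram overlap (the sum of the smaller of the two relative frequencies per n-gram), and the factors for single letters, letter pairs, and letter triples are simply added. All function names and the profile layout are hypothetical.

```python
from collections import Counter


def ngram_distribution(text, n):
    """Relative frequencies of letter n-grams (n = 1, 2, 3) in text."""
    cleaned = "".join(c for c in text.lower() if c.isalpha() or c == " ")
    grams = [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: count / total for gram, count in Counter(grams).items()}


def similarity(h, l):
    """Histogram overlap between two distributions (an assumed measure)."""
    return sum(min(p, l.get(gram, 0.0)) for gram, p in h.items())


def build_profile(sample):
    """Reference distributions l1, l2, l3 for one available language."""
    return {n: ngram_distribution(sample, n) for n in (1, 2, 3)}


def identify_language(text, profiles):
    """Return the language whose combined similarity is greatest."""
    scores = {}
    for language, profile in profiles.items():
        # Adding the three factors is an assumption, not the patent's rule.
        scores[language] = sum(
            similarity(ngram_distribution(text, n), profile[n])
            for n in (1, 2, 3)
        )
    return max(scores, key=scores.get), scores
```

In use, each profile would be built once from a large sample of the language with `build_profile`, and `identify_language` would then be called on each incoming text, for example to pick the voice used by a speech reproduction unit.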
  • Patent number: 6701291
    Abstract: A method and apparatus for extracting speech features from a speech signal in which the linear frequency spectrum data, as generated, for example, by a conventional frequency transform, is first converted to logarithmic frequency spectrum data having frequency data distributed on a substantially logarithmic (rather than linear) frequency scale. Then, a plurality of digital auditory filters is applied to the resultant logarithmic frequency spectrum data, each of these filters having a substantially similar shape, but centered at different points on the logarithmic frequency scale. Because each of the filters has a similar shape, the feature extraction approach of the present invention advantageously can be easily modified or tuned by adjusting each of the filters in a coordinated manner, with the adjustment of only a handful of filter parameters.
    Type: Grant
    Filed: April 2, 2001
    Date of Patent: March 2, 2004
    Assignee: Lucent Technologies Inc.
    Inventors: Qi P. Li, Olivier Siohan, Frank Kao-Ping Soong
  • Patent number: 6687670
    Abstract: A digital audio receiver stores received frames temporarily for decoding and error concealment. A reconstructing block (14) in the decoder reads stored frames using a read window (43) wherein the latest received frame (+cnnxt) is undecoded. Decoding is carried out in stages so that the correctness of the current frame (0) is examined and possible errors are concealed using corresponding data of other frames in the window. Detection of errors is based on checksums (19, 26) and allowed values of bit combinations in certain parts of the frame. In addition, the receiver maintains an estimate (60) for the signal's bit error ratio and uses it to control the operation of the error concealment algorithm.
    Type: Grant
    Filed: June 16, 1999
    Date of Patent: February 3, 2004
    Assignee: Nokia OYJ
    Inventors: Matti Sydänmaa, Mauri Väänänen, Aki Mäkivirta
  • Patent number: 6681205
    Abstract: A method and apparatus enrolls a user for voice recognition by prompting the user to speak a social security number or other number. A voiceprint is extracted from the social security number. Additional sequences of numbers are generated so that the total number of times each decimal digit appears in the social security number or the additional sequences meets or exceeds a threshold value. The user is then prompted to speak the additional sequences and the voiceprint extracted from the social security number is refined to include the additional information received from the responses to the prompts for the sequences. A standard sequence may also be prompted and a voiceprint of the standard sequence compared with the voiceprints of other users speaking the same standard sequence to identify the level of differentiation between the user's voice and other users' voices.
    Type: Grant
    Filed: December 17, 1999
    Date of Patent: January 20, 2004
    Assignee: Charles Schwab & Co., Inc.
    Inventors: Michelle San Martin, Robert C Wohlsen, Cecily Baptist
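The digit-coverage step in the abstract of patent 6681205 (generate additional sequences until every decimal digit appears at least a threshold number of times) can be sketched as follows. The round-robin ordering, the sequence length, and the function name are assumptions chosen for illustration; the abstract only states the coverage requirement.

```python
from collections import Counter

DIGITS = "0123456789"


def additional_sequences(base_number, threshold, length):
    """Generate digit sequences so each digit is covered `threshold` times.

    base_number is the number already spoken (non-digit characters such as
    dashes are ignored). The last sequence may be shorter than `length`.
    """
    counts = Counter(ch for ch in base_number if ch.isdigit())
    needed = {d: max(0, threshold - counts[d]) for d in DIGITS}

    # Round-robin over the digits so that a sequence is not just a run of
    # one repeated digit (an assumed design choice).
    pool = []
    while any(needed.values()):
        for d in DIGITS:
            if needed[d] > 0:
                pool.append(d)
                needed[d] -= 1

    return ["".join(pool[i:i + length]) for i in range(0, len(pool), length)]
```

With the made-up base number `"123-45-6789"`, a threshold of 2, and a length of 4, every digit from 1 to 9 is missing one occurrence and 0 is missing two, so the sketch produces three short prompts that together close the gap.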
  • Patent number: 6678657
    Abstract: The present invention relates to a method and an apparatus for a robust feature extraction for speech recognition in a noisy environment, wherein the speech signal is segmented and is characterized by spectral components. The speech signal is split into a number of short-term spectral components in L subbands, with L=1, 2, . . . and a noise spectrum is estimated from segments that only contain noise. Then a spectral subtraction of the estimated noise spectrum from the corresponding short-term spectrum is performed and a probability for each short-term spectral component to contain noise is calculated. Finally, those spectral components of each short-term spectrum that have a low probability to contain speech are interpolated in order to smooth those short-term spectra that only contain noise. With the interpolation, the spectral components containing noise are interpolated by reliable spectral speech components that could be found in the neighborhood.
    Type: Grant
    Filed: October 23, 2000
    Date of Patent: January 13, 2004
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventors: Raymond Brückner, Hans-Günter Hirsch, Rainer Klisch, Volker Springer
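The processing chain in the abstract of patent 6678657 (estimate noise, subtract it, flag unreliable components, interpolate them from reliable neighbors) can be sketched in Python. The abstract does not give the probability calculation, so this sketch swaps in a simple stand-in: a component is treated as reliable when its magnitude exceeds the noise estimate by a chosen factor. That rule, the linear interpolation, and all names are assumptions for illustration only.

```python
def estimate_noise(noise_frames):
    """Average the spectra of frames known to contain only noise."""
    n = len(noise_frames)
    return [sum(column) / n for column in zip(*noise_frames)]


def subtract_noise(spectrum, noise, floor=0.0):
    """Spectral subtraction with a floor to avoid negative magnitudes."""
    return [max(s - nz, floor) for s, nz in zip(spectrum, noise)]


def reliability_mask(spectrum, noise, factor):
    """Stand-in for the probability step: True where speech likely dominates."""
    return [s > factor * nz for s, nz in zip(spectrum, noise)]


def interpolate_unreliable(values, reliable):
    """Replace unreliable components using the nearest reliable neighbors."""
    good = [i for i, ok in enumerate(reliable) if ok]
    out = list(values)
    if not good:
        return out  # nothing reliable to interpolate from
    for i, ok in enumerate(reliable):
        if ok:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] + t * (values[right] - values[left])
    return out
```

A frame would pass through `subtract_noise`, then `reliability_mask`, then `interpolate_unreliable`, so that components dominated by noise are replaced by values derived from nearby components that are more likely to carry speech.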
  • Patent number: 6678662
    Abstract: A recording medium encoded with audio data encoded using a lossless encoding apparatus. The lossless encoding apparatus includes a lossless compression unit which losslessly compression encodes the audio data stored in an input buffer in units of predetermined data and outputs the encoded data in sequence, and an output buffer which stores the encoded audio data output from the lossless compression unit. A bitrate controller divides a plurality of the encoded audio data stored in the output buffer into first data having a data amount exceeding the maximum bitrate and second data having a data amount less than the maximum bitrate, divides the first data into third data being the encoded audio data having a data amount of the maximum bitrate and fourth data being the encoded data of the portion exceeding the maximum bitrate, and controls the output buffer so that the fourth data is output together with the second data.
    Type: Grant
    Filed: November 8, 2002
    Date of Patent: January 13, 2004
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jae-Hoon Heo
  • Patent number: 6665638
    Abstract: Methods and systems for filtering synthesized or reconstructed speech are implemented. A filter based on a set of linear predictive coding (LPC) coefficients is constructed by transforming the LPC coefficients to the pseudo-cepstrum, a domain existing between LPC domain and the line spectral frequency (LSF) domain. The resulting filter can emphasize spectral frequencies associated with various formants, or spectral peaks, of an inverse transfer function relating to the LPC coefficients, and can de-emphasize spectral frequencies associated with various spectral minima, or spectral valleys, of the inverse transfer function relating to the LPC coefficients.
    Type: Grant
    Filed: April 13, 2001
    Date of Patent: December 16, 2003
    Assignee: AT&T Corp.
    Inventors: Hong-Goo Kang, Hong Kook Kim