Patents Examined by Tālivaldis Ivars Šmits
  • Patent number: 6574602
    Abstract: A method and apparatus for subband phase flag determination for coupling of channels in a dual channel audio encoder is based on a psychoacoustic model of the human auditory system. The method and apparatus are applicable to audio encoders which utilize a coupling channel to combine certain frequency components of the input audio signals. The method ensures a least square error between the original channel frequency coefficients at the encoder and the estimated coefficients at the decoder by determining the sign of the dot product of the coefficients for one of the channels and the coupling coefficients. No restriction is placed on the strategy utilized for generating the coupling channel coefficients or the coupling coordinates.
    Type: Grant
    Filed: September 8, 2000
    Date of Patent: June 3, 2003
    Assignee: STMicroelectronics Asia Pacific PTE Limited
    Inventors: Mohammed Javed Absar, Sapna George, Antonio Mario Alvarez-Tinoco
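
The sign rule described in the abstract above can be illustrated with a minimal sketch. It assumes, beyond what the listing states, that the decoder estimates a subband as flag × coupling coordinate × coupling coefficients, with a non-negative coupling coordinate. The function names `phase_flag` and `squared_error` are hypothetical and are not taken from the patent.

```python
def phase_flag(channel, coupling):
    """Return +1 or -1 for one subband.

    channel:  original frequency coefficients of one channel in the subband
    coupling: coupling-channel coefficients in the same subband
    The flag is the sign of their dot product (ties resolved to +1).
    """
    dot = sum(a * g for a, g in zip(channel, coupling))
    return 1 if dot >= 0 else -1


def squared_error(channel, coupling, coord, flag):
    """Squared error between the original coefficients and the assumed
    decoder estimate flag * coord * coupling (coord >= 0)."""
    return sum((a - flag * coord * g) ** 2 for a, g in zip(channel, coupling))


# Why the sign of the dot product works under this assumption:
# sum((a - s*c*g)^2) = sum(a^2) - 2*s*c*sum(a*g) + c^2*sum(g^2),
# and only the middle term depends on s, so s = sign(sum(a*g)) minimizes it.
if __name__ == "__main__":
    ch = [1.0, -2.0, 0.5]
    cp = [-0.8, 1.5, -0.1]
    flag = phase_flag(ch, cp)
    print(flag, squared_error(ch, cp, 1.0, flag), squared_error(ch, cp, 1.0, -flag))
```

The sketch places no constraint on how the coupling coefficients or the coordinate are produced, which matches the abstract's statement that the strategy for generating them is unrestricted.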
  • Patent number: 6571207
    Abstract: A device for processing the phase information of an acoustic signal, and a method thereof are provided. This device processes the phase information of a digital speech signal which is expressed as a discrete sum of periodic signals having different frequency components. Also, this device includes a critical bandwidth calculator for calculating the critical bandwidth of each frequency according to the bandwidth characteristics of a human's auditory filter, a frequency range setting unit for setting the frequency ranges of local phase changes using critical bandwidths corrected by multiplying the critical bandwidths by a predetermined scaling coefficient, and a phase significance discriminator for checking whether frequency components adjacent to each frequency are within the frequency range corresponding to the frequency, and discriminating whether the phase of a signal having the frequency component is significant in terms of auditory characteristics.
    Type: Grant
    Filed: May 15, 2000
    Date of Patent: May 27, 2003
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Doh-suk Kim
  • Patent number: 6567778
    Abstract: A stream of input speech is coupled as an input to a speech recognizer. The speech can be provided to the speech recognizer directly from a user or first stored and provided from a memory circuit. Each input word is recognized by the speech recognizer and a word confidence score is associated with each corresponding recognized word. The recognized words and their associated word confidence scores are provided to a natural language interpreter which parses the stream of recognized words into predetermined edges. From the edges, the natural language interpreter forms semantic slots which represent a semantic meaning. A slot confidence score related to the word or phone confidence scores for each of the words in the slot is determined for each slot. Based upon the slot confidence score, an ancillary application program determines whether to accept the words used to fill each slot.
    Type: Grant
    Filed: June 23, 1999
    Date of Patent: May 20, 2003
    Assignee: Nuance Communications
    Inventors: Eric I Chao Chang, Eric G. Jackson
  • Patent number: 6564187
    Abstract: An apparatus and method for compression and expansion of a wave signal on a time axis. A memory device stores waveform data representative of a waveform for each sub-frequency band of each main-frequency band of a wave signal, in which the wave signal is divided into a plurality of the main-frequency bands, and each of the main-frequency bands is divided into a plurality of the sub-frequency bands. A plurality of time axis compression and expansion devices are provided for each of the sub-frequency bands for performing time axis compression and expansion of the waveform. A mixing device mixes signals provided from the time axis compression and expansion devices.
    Type: Grant
    Filed: March 28, 2000
    Date of Patent: May 13, 2003
    Assignee: Roland Corporation
    Inventors: Tadao Kikumoto, Atsushi Hoshiai, Satoshi Kusakabe
  • Patent number: 6560577
    Abstract: The present invention discloses a method and system of encoding audio information (e.g., music, speech, etc.) from an analog medium (e.g., vinyl recordings, cassette tapes, etc.) into a compressed, track-oriented, digital format using attribute information and silence detection. The invention generates an analog audio signal by reproducing the audio information recorded on the analog medium, which is recorded on a plurality of discrete analog tracks. The analog audio signal is then sampled to generate a digital audio file (i.e., WAV file). The digital audio file is filtered to correct defects and then separated into a plurality of discrete digital audio tracks. Attribute information, such as track length, is accessed and confirmed by silence detection techniques in order to provide separation.
    Type: Grant
    Filed: March 21, 2000
    Date of Patent: May 6, 2003
    Assignee: Intel Corporation
    Inventors: Jay G. Gilbert, Preston J. Hunt, Andrew S. Liu
  • Patent number: 6556965
    Abstract: A telephone that communicates high-quality audio signals and a method for communicating an audio signal with an extended frequency range over a telephone network. The cordless telephone has a handset unit with a sampler circuit, a compression circuit, a decompression circuit, as well as analog-to-digital (A/D) and digital-to-analog (D/A) converters, which handset is coupled by an infrared (IR) wireless link to a base unit connected to the Public Switched Telephone Network (PSTN). In one embodiment, the telephone includes a sampler, a compression block, a telephone port, a decompression block, and a digital-to-analog (D/A) converter. The sampler receives an analog audio signal and digitizes the signal into a high-quality (705.6 kbps) digital audio signal. The compression block then encodes the digital audio signal into a compressed digital signal at a lower bit rate (such as 150 kbps or 56 kbps) with a compression algorithm such as MP3.
    Type: Grant
    Filed: March 24, 1999
    Date of Patent: April 29, 2003
    Assignee: Legerity, Inc.
    Inventors: David J. Borland, Paul R. Teich
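
The bit rates quoted in the abstract above can be checked with a little arithmetic. The 705.6 kbps figure is consistent with a single channel of 16-bit PCM sampled at 44.1 kHz; that reading is an assumption, since the listing gives only the rate. The helper names below are hypothetical.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compression_ratio(source_kbps, target_kbps):
    """How many times smaller the compressed stream is than the source."""
    return source_kbps / target_kbps


if __name__ == "__main__":
    # Assumed sampling parameters: 44.1 kHz, 16 bits, mono.
    source = pcm_bit_rate_kbps(44_100, 16, channels=1)
    print(f"source rate: {source:.1f} kbps")
    for target in (150, 56):  # target rates named in the abstract
        print(f"{target} kbps needs about {compression_ratio(source, target):.1f}:1")
```

Under these assumptions the two target rates correspond to compression ratios of roughly 4.7:1 and 12.6:1.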
  • Patent number: 6549887
    Abstract: Inputted sign language word labels and editing items such as speeds and positions of moving portions for specifying manual signs and/or sign gestures corresponding to the respective sign language word labels are displayed on an editing screen. These editing items are modified by the user to add non-language information such as emphasis/feeling information to the contents of communication, thereby generating modified sign language animation information data including the inputted sign language word label string having the added non-language information. For communication or interaction, the non-language information is extracted from the modified sign language animation information data and stored into a memory with the inputted sign language word label string. When a hearing impaired person communicates or interacts with another person through text, the user can emphasize the contents of communication or show the user's feeling for the contents of communication to the other person.
    Type: Grant
    Filed: January 20, 2000
    Date of Patent: April 15, 2003
    Assignee: Hitachi, Ltd.
    Inventors: Haru Ando, Hirohiko Sagawa, Masaru Takeuchi
  • Patent number: 6546370
    Abstract: A recording medium encoded with audio data encoded using a lossless encoding apparatus. The lossless encoding apparatus includes a lossless compression unit which losslessly compression encodes the audio data stored in an input buffer in units of predetermined data and outputs the encoded data in sequence, and an output buffer which stores the encoded audio data output from the lossless compression unit. A bitrate controller divides a plurality of the encoded audio data stored in the output buffer into first data having a data amount exceeding the maximum bitrate and second data having a data amount less than the maximum bitrate, divides the first data into third data being the encoded audio data having a data amount of the maximum bitrate and fourth data being the encoded data of the portion exceeding the maximum bitrate, and controls the output buffer so that the fourth data is output together with the second data.
    Type: Grant
    Filed: July 19, 2001
    Date of Patent: April 8, 2003
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jae-Hoon Heo
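
The bitrate controller in the abstract above can be sketched as follows. The abstract does not say which under-rate units carry the excess, so this sketch assumes the excess of an over-rate unit is moved into the spare capacity of earlier units, nearest first. The function name `smooth_bitrate` and the list-of-sizes representation are hypothetical.

```python
def smooth_bitrate(sizes, max_size):
    """Redistribute encoded unit sizes so none exceeds max_size.

    sizes:    encoded size of each unit, in output order
    max_size: largest amount one unit may carry (the maximum bitrate)
    Returns the amount of data sent in each unit slot.
    """
    # Units over the limit are capped ("third data"); others are kept whole.
    out = [min(s, max_size) for s in sizes]
    for i, s in enumerate(sizes):
        excess = s - max_size  # "fourth data" when positive
        j = i - 1
        # Assumption: fill spare room in earlier units, nearest first.
        while excess > 0 and j >= 0:
            moved = min(max_size - out[j], excess)
            out[j] += moved
            excess -= moved
            j -= 1
        if excess > 0:
            raise ValueError(f"unit {i}: not enough spare capacity before it")
    return out


if __name__ == "__main__":
    print(smooth_bitrate([40, 60, 130, 50], max_size=100))
```

The total amount of data is unchanged; only its placement across unit slots differs, which is what keeps the scheme lossless.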
  • Patent number: 6542865
    Abstract: A method according to the present invention generates weight data for each audio band and assigns a number of bits to each band according to the weight data. The method then calculates a total of the numbers of bits of one block and compares the total with an upper limit and with a lower limit of a compression target value. Based on the comparison result, the method increases or decreases the value of the weight data to update it. The method reassigns a number of bits based on the updated weight data.
    Type: Grant
    Filed: February 18, 1999
    Date of Patent: April 1, 2003
    Assignee: Sanyo Electric Co., Ltd.
    Inventors: Fumiaki Nagao, Masato Fuma, Miyuki Okamoto
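
The allocate, compare, and update loop in the abstract above can be sketched as below. The mapping from weight to bit count (truncation), the uniform step size, and the iteration cap are all assumptions made for illustration; the name `allocate_bits` is hypothetical.

```python
def allocate_bits(weights, lower, upper, step=0.25, max_iters=1000):
    """Assign bits per band from weights until the block total is in range.

    weights: one weight per audio band
    lower, upper: limits of the compression target for one block
    Returns (bits_per_band, final_weights).
    """
    weights = list(weights)
    for _ in range(max_iters):
        # Assumed rule: a band gets the integer part of its weight, never < 0.
        bits = [max(0, int(w)) for w in weights]
        total = sum(bits)
        if total > upper:
            weights = [w - step for w in weights]  # too many bits: lower weights
        elif total < lower:
            weights = [w + step for w in weights]  # too few bits: raise weights
        else:
            return bits, weights
    raise RuntimeError("no allocation found within the target range")


if __name__ == "__main__":
    bits, final_weights = allocate_bits([6.0, 4.0, 2.0, 1.0], lower=8, upper=10)
    print(bits, sum(bits), final_weights)
```

A fixed step can oscillate around a narrow target range, which is why the sketch caps the number of iterations rather than looping indefinitely.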
  • Patent number: 6539352
    Abstract: The voice print system of the present invention is a subword-based, text-dependent automatic speaker verification system that embodies the capability of user-selectable passwords with no constraints on the choice of vocabulary words or the language. Automatic blind speech segmentation allows speech to be segmented into subword units without any linguistic knowledge of the password. Subword modeling is performed using multiple classifiers. The system also takes advantage of such concepts as multiple classifier fusion and data resampling to successfully boost the performance. Key word/key phrase spotting is used to optimally locate the password phrase. Numerous adaptation techniques increase the flexibility of the base system, and include: channel adaptation, fusion adaptation, model adaptation and threshold adaptation.
    Type: Grant
    Filed: November 21, 1997
    Date of Patent: March 25, 2003
    Inventors: Manish Sharma, Xiaoyu Zhang, Richard J. Mammone
  • Patent number: 6538666
    Abstract: An image processing device which changes the way speech recognition results are processed as the program progresses. A video game machine body 10 causes a television receiver 30 to display given images and to output given sounds in accordance with a game program stored in a ROM cartridge 20. When a player enters a speech from a microphone 60, a speech recognition unit 50 recognizes a word corresponding to the speech and sends the result to the video game machine body 10. The video game machine body 10 causes the state of a dialogue partner object displayed on the television receiver 30 to change on the basis of the recognized result received from the speech recognition unit 50. The relation between the recognition result and the control of the displayed dialogue partner object is changed as the program progresses, which gives variety to the game and makes it more amusing.
    Type: Grant
    Filed: December 10, 1999
    Date of Patent: March 25, 2003
    Assignee: Nintendo Co., Ltd.
    Inventors: Muneaki Ozawa, Koji Mitsunari, Takeshi Nagareda
  • Patent number: 6539353
    Abstract: A method and apparatus is provided for speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract at least one feature from the digital signal. A hypothesis word string that consists of sub-word units is identified from the extracted feature. For each identified word, a word confidence measure is determined based on weighted confidence measure scores for each sub-word unit in the word. The weighted confidence measure scores are created by applying different weights to confidence scores associated with different sub-words of the hypothesis word.
    Type: Grant
    Filed: October 12, 1999
    Date of Patent: March 25, 2003
    Assignee: Microsoft Corporation
    Inventors: Li Jiang, Xuedong Huang
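
The weighted word confidence measure in the abstract above can be illustrated with a short sketch. The abstract does not say how the weighted scores are combined, so a normalized weighted average is assumed here; the name `word_confidence` and the example weights are hypothetical.

```python
def word_confidence(scores, weights):
    """Combine per-sub-word confidence scores into one word-level score.

    scores:  confidence score of each sub-word unit in the word
    weights: weight applied to each sub-word unit (same length as scores)
    Assumption: the word score is the weight-normalized sum.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


if __name__ == "__main__":
    # Three sub-word units; the first is trusted twice as much as the others.
    print(word_confidence([0.9, 0.5, 0.7], [2.0, 1.0, 1.0]))
```

Normalizing by the total weight keeps word scores comparable across words with different numbers of sub-word units.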
  • Patent number: 6505162
    Abstract: A portable dialogue management system includes a dialogue manager and a hierarchical task description table. The hierarchical task description table has a plurality of base tables connected with a hierarchical structure. Each base table defines the strategy of a sub-dialogue and stores the dialogue states, a number of domain parameters, and a plurality of response actions corresponding to each dialogue state. The dialogue manager manages the dialogue state of a dialogue system, determines the dialogue state and executes the appropriate response action. Because the domain knowledge is defined in the hierarchical task description table and the dialogue manager is not dependent on the application domain, the dialogue management system is easily portable to different applications. A stack may also be used to push in or pop up a dialogue state so that dialogues of multiple purposes can be accomplished.
    Type: Grant
    Filed: October 25, 1999
    Date of Patent: January 7, 2003
    Assignee: Industrial Technology Research Institute
    Inventors: Huei-Ming Wang, Yi-Chung Lin, Tung-Hui Chiang
  • Patent number: 6502067
    Abstract: A method for processing a sound signal y in which redundancy, consisting mainly of almost repetitions of signal profiles, is detected and correlations between the signal profiles are determined within segments of the sound signal. Correlated signal components are allocated to a power component and uncorrelated signal components to a noise component of the sound signal. The correlations between the signal profiles are determined by methods of nonlinear noise reduction in deterministic systems in reconstructed vector spaces based on the time domain.
    Type: Grant
    Filed: December 17, 1999
    Date of Patent: December 31, 2002
    Assignee: Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V.
    Inventors: Rainer Hegger, Holger Kantz, Lorenzo Matassini
  • Patent number: 6502072
    Abstract: A method and apparatus is provided for two-tier noise rejection in speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract features from the digital signal. A hypothesis speech word and a hypothesis noise word are identified from respective extracted features. The features associated with the hypothesis speech word are examined in a second tier of noise rejection to determine if the features are more likely to represent noise than speech. The hypothesis speech word is replaced by a noise marker if the features are more likely to represent noise than speech.
    Type: Grant
    Filed: October 12, 1999
    Date of Patent: December 31, 2002
    Assignee: Microsoft Corporation
    Inventors: Li Jiang, Xuedong Huang
  • Patent number: 6502071
    Abstract: Power conservation, when generating background noise samples in a radio receiver, is disclosed. Background noise data is generated using at least one noise parameter that is transmitted in a manner included in framed noise information. This information is transmitted at predetermined time intervals during a period of no-speech. A controller is provided so as to check to determine if an incoming framed data is the noise information. In the case where the incoming framed data is specified as the noise information, a check is made to determine if a time period, which corresponds to a predetermined number of consecutive frames, has expired. When the time period has not yet elapsed, the background noise data is generated using at least one noise parameter in a manner of extending to the predetermined number of frames.
    Type: Grant
    Filed: July 15, 1999
    Date of Patent: December 31, 2002
    Assignee: NEC Corporation
    Inventor: Mayumi Nagasaki
  • Patent number: 6502066
    Abstract: Formants, corresponding to input speech units based either on a known text or the results of a speech recognition procedure, are generated from a formant synthesizer. A frequency response is generated based on the synthesized formants. A second frequency response is generated based on a speech signal which is received and which corresponds to utterances of speech units. The synthesized formants are modified based on a comparison of the frequency response corresponding to the synthesized formants and specific proportional characteristics of a frequency response of the input speech signal. In one illustrative embodiment, the comparison is then recalculated and further modifications are made accordingly to improve accuracy. In one illustrative embodiment, time aligning and frequency warping are utilized as modification functions.
    Type: Grant
    Filed: April 2, 2001
    Date of Patent: December 31, 2002
    Assignee: Microsoft Corporation
    Inventor: Michael D. Plumpe
  • Patent number: 6499014
    Abstract: The speech synthesis apparatus of the present invention includes: a text analyzer operable to generate a phonetic and prosodic symbol string from character information of an input text; a word dictionary storing a reading and an accent of a word; a voice segment dictionary storing a phoneme that is a basic unit of speech; a parameter generator operable to generate synthesizing parameters including at least a phoneme, a duration of the phoneme and a fundamental frequency for the phonetic and prosodic symbol string, the parameter generator including a calculating means operable to obtain a sum of phrase components and a sum of accent components and to calculate an average pitch from the sum of the phrase components and the sum of the accent components, and a determining means operable to determine a base pitch from the average pitch; and a waveform generator operable to generate a synthesized waveform by making waveform-overlapping referring to the synthesizing parameters generated by the parameter generator a
    Type: Grant
    Filed: March 7, 2000
    Date of Patent: December 24, 2002
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Keiichi Chihara
  • Patent number: 6499008
    Abstract: A radio signal transceiver receiving at its input a speech signal and producing an output signal at a given output rate, the speech signal having undergone a source coding intended to sufficiently compress the input signal to obtain the desired output rate while an acceptable distortion ratio is maintained. In order to improve the compromise of transmission quality of the speech signal and transmission rate by selecting the optimum coder from the available coders, the transceiver comprises a measuring device for measuring the distortion of the output signal of a coder and a check circuit for comparing the estimated distortion with set values and deriving therefrom the optimum coder for the measured distortion.
    Type: Grant
    Filed: May 21, 1999
    Date of Patent: December 24, 2002
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Gilles Miet
  • Patent number: 6496801
    Abstract: A speech synthesis system for generating voice dialog for a message frame having a fixed and a variable portion. A prosody module selects a prosodic template for each of the fixed and variable portions wherein at least one portion comprises a phrase of multiple words. An acoustic module selects an acoustic template for each of the fixed and variable portions wherein at least one portion comprises a phrase of multiple words. A frame generator concatenates the respective prosodic templates and acoustic templates. A sound module generates the voice dialog in accordance with the concatenated prosodic and acoustic templates.
    Type: Grant
    Filed: November 2, 1999
    Date of Patent: December 17, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Peter Veprek, Steve Pearson, Jean-Claude Junqua