Patents Examined by Tālivaldis Ivars Šmits

Dual channel phase flag determination for coupling bands in a transform coder for high quality audio

Patent number: 6574602

Abstract: A method and apparatus for subband phase flag determination for coupling of channels in a dual channel audio encoder is based on a psychoacoustic model of the human auditory system. The method and apparatus are applicable to audio encoders which utilize a coupling channel to combine certain frequency components of the input audio signals. The method ensures a least square error between the original channel frequency coefficients at the encoder and the estimated coefficients at the decoder by determining the sign of the dot product of the coefficients for one of the channels and the coupling coefficients. No restriction is placed on the strategy utilized for generating the coupling channel coefficients or the coupling coordinates.
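
The sign test described in this abstract can be sketched as follows. This is a minimal illustration and not the patented implementation: the names `phase_flag` and `squared_error`, the single positive coupling coordinate `coord`, and the sample coefficients are assumptions made for the example. For a positive coordinate g, the error ||a - s·g·c||² expands to ||a||² - 2·s·g·(a·c) + g²·||c||², so the sign s that minimizes it is the sign of the dot product a·c.

```python
def phase_flag(channel, coupling):
    """Return +1 or -1: the sign of the dot product of one channel's
    coefficients with the coupling-channel coefficients in a band."""
    dot = sum(a * c for a, c in zip(channel, coupling))
    return 1 if dot >= 0 else -1


def squared_error(channel, coupling, coord, flag):
    """Squared error between the original coefficients and the decoder's
    estimate flag * coord * coupling (coord is a hypothetical coordinate)."""
    return sum((a - flag * coord * c) ** 2 for a, c in zip(channel, coupling))


# Hypothetical band in which the channel is roughly out of phase with the
# coupling channel, so the dot product is negative.
channel = [1.0, -2.0, 0.5]
coupling = [-0.9, 2.1, -0.4]
flag = phase_flag(channel, coupling)
```

With these sample values the flag chosen by the sign test gives a smaller squared error than the opposite flag, whatever positive coordinate is used.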

Type: Grant

Filed: September 8, 2000

Date of Patent: June 3, 2003

Assignee: STMicroelectronics Asia Pacific PTE Limited

Inventors: Mohammed Javed Absar, Sapna George, Antonio Mario Alvarez-Tinoco

Device for processing phase information of acoustic signal and method thereof

Patent number: 6571207

Abstract: A device for processing the phase information of an acoustic signal, and a method thereof are provided. This device processes the phase information of a digital speech signal which is expressed as a discrete sum of periodic signals having different frequency components. Also, this device includes a critical bandwidth calculator for calculating the critical bandwidth of each frequency according to the bandwidth characteristics of a human's auditory filter, a frequency range setting unit for setting the frequency ranges of local phase changes using critical bandwidths corrected by multiplying the critical bandwidths by a predetermined scaling coefficient, and a phase significance discriminator for checking whether frequency components adjacent to each frequency are within the frequency range corresponding to the frequency, and discriminating whether the phase of a signal having the frequency component is significant in terms of auditory characteristics.
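
A rough sketch of the three units named in this abstract is given below. Several things are assumptions, since the abstract does not give formulas: the Glasberg and Moore equivalent rectangular bandwidth stands in for the critical bandwidth, the function names and the default `scale` are hypothetical, and a phase is treated as significant when an adjacent component falls within the scaled bandwidth.

```python
def erb_bandwidth_hz(freq_hz):
    """Equivalent rectangular bandwidth (Glasberg and Moore approximation),
    used here as a stand-in for the critical bandwidth at freq_hz."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def phase_significance(freqs_hz, scale=1.0):
    """For each component of an ascending frequency list, report whether an
    adjacent component lies within scale * bandwidth of it."""
    flags = []
    for i, f in enumerate(freqs_hz):
        limit = scale * erb_bandwidth_hz(f)  # corrected bandwidth
        # Previous and next components, where they exist.
        neighbours = freqs_hz[max(i - 1, 0):i] + freqs_hz[i + 1:i + 2]
        flags.append(any(abs(n - f) <= limit for n in neighbours))
    return flags


components = [200.0, 250.0, 1000.0, 3000.0]
flags = phase_significance(components)
```

Because the bandwidth grows with frequency, the test is not symmetric: 250 Hz sees 200 Hz inside its range, while 200 Hz does not see 250 Hz inside its narrower one.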

Type: Grant

Filed: May 15, 2000

Date of Patent: May 27, 2003

Assignee: Samsung Electronics Co., Ltd.

Inventor: Doh-suk Kim

Natural language speech recognition using slot semantic confidence scores related to their word recognition confidence scores

Patent number: 6567778

Abstract: A stream of input speech is coupled as an input to a speech recognizer. The speech can be provided to the speech recognizer directly from a user or first stored and provided from a memory circuit. Each input word is recognized by the speech recognizer and a word confidence score is associated with each corresponding recognized word. The recognized words and their associated word confidence scores are provided to a natural language interpreter which parses the stream of recognized words into predetermined edges. From the edges, the natural language interpreter forms semantic slots which represent a semantic meaning. A slot confidence score related to the word or phone confidence scores for each of the words in the slot is determined for each slot. Based upon the slot confidence score, an ancillary application program determines whether to accept the words used to fill each slot.
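
The accept-or-reject step at the end of this abstract might look like the sketch below. The abstract only says the slot score is "related to" the word scores, so the use of a plain mean, the names `slot_confidence` and `accepted_slots`, and the threshold value are all assumptions for illustration.

```python
def slot_confidence(word_scores):
    """Combine word confidence scores into one slot score.
    A plain mean is assumed; the abstract does not fix the formula."""
    return sum(word_scores) / len(word_scores)


def accepted_slots(slots, threshold):
    """Keep only the slots whose confidence reaches the threshold.
    slots maps a slot name to (words, word_scores)."""
    return {
        name: words
        for name, (words, scores) in slots.items()
        if slot_confidence(scores) >= threshold
    }


# Hypothetical parse of "new york tuesday" into two semantic slots.
slots = {
    "city": (["new", "york"], [0.9, 0.8]),
    "date": (["tuesday"], [0.4]),
}
accepted = accepted_slots(slots, threshold=0.6)
```

An application could prompt the user again for any slot that is missing from the accepted set.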

Type: Grant

Filed: June 23, 1999

Date of Patent: May 20, 2003

Assignee: Nuance Communications

Inventors: Eric I Chao Chang, Eric G. Jackson

Waveform signal compression and expansion along time axis having different sampling rates for different main-frequency bands

Patent number: 6564187

Abstract: An apparatus and method for compression and expansion of a wave signal on a time axis. A memory device stores waveform data representative of a waveform for each sub-frequency band of each main-frequency band of a wave signal, in which the wave signal is divided into a plurality of the main-frequency bands, each of the main-frequency bands is divided into a plurality of the sub-frequency bands. A plurality of time axis compression and expansion devices are provided for each of the sub-frequency bands for performing time axis compression and expansion of the waveform. A mixing device mixes signals provided from the time axis compression and expansion devices.

Type: Grant

Filed: March 28, 2000

Date of Patent: May 13, 2003

Assignee: Roland Corporation

Inventors: Tadao Kikumoto, Atsushi Hoshiai, Satoshi Kusakabe

Process for encoding audio from an analog medium into a compressed digital format using attribute information and silence detection

Patent number: 6560577

Abstract: The present invention discloses a method and system for encoding audio information (e.g., music, speech, etc.) from an analog medium (e.g., vinyl recordings, cassette tapes, etc.) into a compressed, track-oriented, digital format using attribute information and silence detection. The invention generates an analog audio signal by reproducing the audio information recorded on the analog medium, which is recorded on a plurality of discrete analog tracks. The analog audio signal is then sampled to generate a digital audio file (i.e., WAV file). The digital audio file is filtered to correct defects and then separated into a plurality of discrete digital audio tracks. Attribute information, such as track length, is accessed and confirmed by silence detection techniques in order to provide separation.
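
The silence-detection part of this process can be illustrated with a small sketch. It covers only the splitting step, not the attribute lookup, and the function name `split_on_silence`, the amplitude threshold, and the minimum silence length are assumptions rather than values from the patent.

```python
def split_on_silence(samples, threshold, min_silence):
    """Return (start, end) index pairs of tracks, where a run of at least
    min_silence samples with |sample| <= threshold separates two tracks."""
    tracks = []
    start = None  # index where the current track began, if any
    quiet = 0     # length of the current run of quiet samples
    for i, sample in enumerate(samples):
        if abs(sample) > threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                # Close the track at the first sample of the quiet run.
                tracks.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Signal ended inside a track; drop any trailing quiet samples.
        tracks.append((start, len(samples) - quiet))
    return tracks


# Toy signal: two tracks separated by three quiet samples.
signal = [0, 0, 5, 6, 0, 7, 0, 0, 0, 8, 9, 0]
tracks = split_on_silence(signal, threshold=1, min_silence=3)
```

The single quiet sample inside the first track is shorter than `min_silence`, so it does not split the track. Boundaries found this way could then be compared with the expected track lengths from the attribute information.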

Type: Grant

Filed: March 21, 2000

Date of Patent: May 6, 2003

Assignee: Intel Corporation

Inventors: Jay G. Gilbert, Preston J. Hunt, Andrew S. Liu

Wired and cordless telephone systems with extended frequency range

Patent number: 6556965

Abstract: A telephone that communicates high-quality audio signals and a method for communicating an audio signal with an extended frequency range over a telephone network. The cordless telephone has a handset unit with a sampler circuit, a compression circuit, a decompression circuit, as well as analog-to-digital (A/D) and digital-to-analog (D/A) converters; the handset is coupled by an infrared (IR) wireless link to a base unit connected to the Public Switched Telephone Network (PSTN). In one embodiment, the telephone includes a sampler, a compression block, a telephone port, a decompression block, and a digital-to-analog (D/A) converter. The sampler receives an analog audio signal and digitizes the signal into a high-quality (705.6 kbps) digital audio signal. The compression block then encodes the digital audio signal into a compressed digital signal at a lower bit rate (such as 150 kbps or 56 kbps) with a compression algorithm such as MP3.
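
The bit rates quoted in this abstract can be checked with a little arithmetic. The abstract does not state the sampling parameters; the sketch below assumes 44.1 kHz, 16-bit, single-channel sampling, which happens to give exactly 705.6 kbps. The constant and function names are made up for the example.

```python
# Assumed sampling parameters (not stated in the abstract).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 1

# Uncompressed bit rate in kilobits per second.
raw_kbps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000


def compression_ratio(target_kbps):
    """How many times smaller the compressed stream must be."""
    return raw_kbps / target_kbps


ratio_150 = compression_ratio(150)  # target rate named in the abstract
ratio_56 = compression_ratio(56)    # target rate named in the abstract
```

Under these assumptions the encoder needs a ratio of roughly 4.7 to reach 150 kbps and 12.6 to reach 56 kbps.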

Type: Grant

Filed: March 24, 1999

Date of Patent: April 29, 2003

Assignee: Legerity, Inc.

Inventors: David J. Borland, Paul R. Teich

Apparatus capable of processing sign language information

Patent number: 6549887

Abstract: Inputted sign language word labels and editing items such as speeds and positions of moving portions for specifying manual signs and/or sign gestures corresponding to the respective sign language word labels are displayed on an editing screen. These editing items are modified by the user to add non-language information such as emphasis/feeling information to the contents of communication, thereby generating modified sign language animation information data including the inputted sign language word label string having the added non-language information. For communication or interaction, the non-language information is extracted from the modified sign language animation information data and stored into a memory with the inputted sign language word label string. When a hearing impaired person communicates or interacts with another person through text, the user can emphasize the contents of communication or show the user's feeling for the contents of communication to the other person.

Type: Grant

Filed: January 20, 2000

Date of Patent: April 15, 2003

Assignee: Hitachi, Ltd.

Inventors: Haru Ando, Hirohiko Sagawa, Masaru Takeuchi

Recording medium with audio data from coder using constant bitrate real-time lossless encoding by moving excess data amounts

Patent number: 6546370

Abstract: A recording medium encoded with audio data encoded using a lossless encoding apparatus. The lossless encoding apparatus includes a lossless compression unit which losslessly compression encodes the audio data stored in an input buffer in units of predetermined data and outputs the encoded data in sequence, and an output buffer which stores the encoded audio data output from the lossless compression unit. A bitrate controller divides a plurality of the encoded audio data stored in the output buffer into first data having a data amount exceeding the maximum bitrate and second data having a data amount less than the maximum bitrate, divides the first data into third data being the encoded audio data having a data amount of the maximum bitrate and fourth data being the encoded data of the portion exceeding the maximum bitrate, and controls the output buffer so that the fourth data is output together with the second data.
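
The data movement described for the bitrate controller can be sketched in terms of frame sizes alone. This is a simplified illustration under assumptions: the function name `move_excess` is hypothetical, sizes are abstract units, and spare room is filled in frame order without modelling the timing constraints a real-time decoder would impose.

```python
def move_excess(frame_sizes, max_size):
    """Cap every encoded frame at max_size and pack the overflow
    (the 'fourth data') into frames that have spare room (the 'second data').
    Returns the output sizes and any overflow that could not be placed."""
    # Oversized frames are cut down to max_size (the 'third data').
    kept = [min(size, max_size) for size in frame_sizes]
    excess = sum(size - max_size for size in frame_sizes if size > max_size)
    output = []
    for size in kept:
        room = max_size - size
        moved = min(room, excess)  # fill spare room with overflow
        output.append(size + moved)
        excess -= moved
    return output, excess


sizes = [80, 120, 90, 130, 60]
output, leftover = move_excess(sizes, max_size=100)
```

No output frame exceeds the maximum, and the total amount of data is unchanged as long as the leftover is zero.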

Type: Grant

Filed: July 19, 2001

Date of Patent: April 8, 2003

Assignee: Samsung Electronics Co., Ltd.

Inventor: Jae-Hoon Heo

Method and apparatus for subband coding, allocating available frame bits based on changable subband weights

Patent number: 6542865

Abstract: A method according to the present invention generates weight data for each audio band and assigns a number of bits to each band according to the weight data. The method then calculates a total of the numbers of bits of one block and compares the total with an upper limit and with a lower limit of a compression target value. Based on the comparison result, the method increases or decreases the value of the weight data to update it. The method reassigns a number of bits based on the updated weight data.
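
The allocate, compare, and update loop in this abstract can be sketched as below. The mapping from weight to bit count (truncation to an integer), the uniform `step` applied to every band, and the name `allocate_bits` are assumptions; the abstract does not say how weights translate into bits or how large each adjustment is.

```python
def allocate_bits(weights, lower, upper, step=0.25, max_iter=1000):
    """Assign bits per band from weights, then raise or lower the weights
    until the block total lies between lower and upper (inclusive)."""
    current = list(weights)
    for _ in range(max_iter):
        # Assumed mapping: truncate each weight, never below zero bits.
        bits = [max(0, int(w)) for w in current]
        total = sum(bits)
        if total > upper:
            current = [w - step for w in current]  # too many bits
        elif total < lower:
            current = [w + step for w in current]  # too few bits
        else:
            return bits, current
    raise RuntimeError("no allocation found within max_iter updates")


bits, final_weights = allocate_bits([6.0, 4.0, 2.5, 1.0], lower=8, upper=10)
```

A narrow target window can make a fixed step overshoot in both directions, which is why the sketch gives up after `max_iter` updates instead of looping forever.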

Type: Grant

Filed: February 18, 1999

Date of Patent: April 1, 2003

Assignee: Sanyo Electric Co., Ltd.

Inventors: Fumiaki Nagao, Masato Fuma, Miyuki Okamoto

Subword-based speaker verification with multiple-classifier score fusion weight and threshold adaptation

Patent number: 6539352

Abstract: The voice print system of the present invention is a subword-based, text-dependent automatic speaker verification system that embodies the capability of user-selectable passwords with no constraints on the choice of vocabulary words or the language. Automatic blind speech segmentation allows speech to be segmented into subword units without any linguistic knowledge of the password. Subword modeling is performed using multiple classifiers. The system also takes advantage of such concepts as multiple classifier fusion and data resampling to successfully boost the performance. Key word/key phrase spotting is used to optimally locate the password phrase. Numerous adaptation techniques increase the flexibility of the base system, and include: channel adaptation, fusion adaptation, model adaptation and threshold adaptation.

Type: Grant

Filed: November 21, 1997

Date of Patent: March 25, 2003

Inventors: Manish Sharma, Xiaoyu Zhang, Richard J. Mammone

Image processing device using speech recognition to control a displayed object

Patent number: 6538666

Abstract: An image processing device which changes the way speech recognition results are processed as the program progresses. A video game machine body 10 causes a television receiver 30 to display given images and to output given sounds in accordance with a game program stored in a ROM cartridge 20. When a player enters a speech from a microphone 60, a speech recognition unit 50 recognizes a word corresponding to the speech and sends the result to the video game machine body 10. The video game machine body 10 causes the state of a dialogue partner object displayed on the television receiver 30 to change on the basis of the recognized result received from the speech recognition unit 50. The relation between the recognition result and the control of the displayed dialogue partner object is changed as the program progresses, which gives variety to the game and makes it more amusing.

Type: Grant

Filed: December 10, 1999

Date of Patent: March 25, 2003

Assignee: Nintendo Co., Ltd.

Inventors: Muneaki Ozawa, Koji Mitsunari, Takeshi Nagareda

Confidence measures using sub-word-dependent weighting of sub-word confidence scores for robust speech recognition

Patent number: 6539353

Abstract: A method and apparatus are provided for speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract at least one feature from the digital signal. A hypothesis word string that consists of sub-word units is identified from the extracted feature. For each identified word, a word confidence measure is determined based on weighted confidence measure scores for each sub-word unit in the word. The weighted confidence measure scores are created by applying different weights to confidence scores associated with different sub-words of the hypothesis word.
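
One way to picture the sub-word-dependent weighting is a weighted average in which the weight is looked up by sub-word identity. The sketch below assumes exactly that: normalizing by the total weight, the name `word_confidence`, the unit labels, and the weight values are illustrative choices and are not taken from the patent.

```python
def word_confidence(subword_scores, weights, default_weight=1.0):
    """Weighted average of sub-word confidence scores.
    subword_scores is a list of (unit, score); weights maps unit -> weight."""
    total_weight = sum(weights.get(unit, default_weight)
                       for unit, _ in subword_scores)
    weighted_sum = sum(weights.get(unit, default_weight) * score
                       for unit, score in subword_scores)
    return weighted_sum / total_weight


# Hypothetical word "cat" where the vowel is trusted twice as much
# as the consonants.
scores = [("k", 0.9), ("ae", 0.5), ("t", 0.8)]
weights = {"ae": 2.0}
confidence = word_confidence(scores, weights)
```

Here the low vowel score pulls the word confidence below the unweighted mean, because the vowel carries the largest weight.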

Type: Grant

Filed: October 12, 1999

Date of Patent: March 25, 2003

Assignee: Microsoft Corporation

Inventors: Li Jiang, Xuedong Huang

Apparatus and method for portable dialogue management using a hierarchial task description table

Patent number: 6505162

Abstract: A portable dialogue management system includes a dialogue manager and a hierarchical task description table. The hierarchical task description table has a plurality of base tables connected with a hierarchical structure. Each base table defines the strategy of a sub-dialogue and stores the dialogue states, a number of domain parameters, and a plurality of response actions corresponding to each dialogue state. The dialogue manager manages the dialogue state of a dialogue system, determines the dialogue state and executes the appropriate response action. Because the domain knowledge is defined in the hierarchical task description table and the dialogue manager is not dependent on the application domain, the dialogue management system is easily portable to different applications. A stack may also be used to push in or pop up a dialogue state so that dialogues of multiple purposes can be accomplished.

Type: Grant

Filed: October 25, 1999

Date of Patent: January 7, 2003

Assignee: Industrial Technology Research Institute

Inventors: Huei-Ming Wang, Yi-Chung Lin, Tung-Hui Chiang

Method and apparatus for processing noisy sound signals

Patent number: 6502067

Abstract: A method for processing a sound signal y in which redundancy, consisting mainly of near repetitions of signal profiles, is detected and correlations between the signal profiles are determined within segments of the sound signal. Correlated signal components are allocated to a power component and uncorrelated signal components to a noise component of the sound signal. The correlations between the signal profiles are determined by methods of nonlinear noise reduction in deterministic systems in reconstructed vector spaces based on the time domain.

Type: Grant

Filed: December 17, 1999

Date of Patent: December 31, 2002

Assignee: Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V.

Inventors: Rainer Hegger, Holger Kantz, Lorenzo Matassini

Two-tier noise rejection in speech recognition

Patent number: 6502072

Abstract: A method and apparatus are provided for two-tier noise rejection in speech recognition. The method and apparatus convert an analog speech signal into a digital signal and extract features from the digital signal. A hypothesis speech word and a hypothesis noise word are identified from respective extracted features. The features associated with the hypothesis speech word are examined in a second tier of noise rejection to determine if the features are more likely to represent noise than speech. The hypothesis speech word is replaced by a noise marker if the features are more likely to represent noise than speech.
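
The second-tier check can be reduced to a comparison of two scores per hypothesized word. In the sketch below the scores are assumed to be log-likelihoods from a speech model and a noise model; how the patent actually computes them is not stated in the abstract, and the marker string and function name are hypothetical.

```python
NOISE_MARKER = "<noise>"  # hypothetical marker token


def second_tier(hypotheses):
    """Replace a hypothesized speech word by the noise marker when its
    features score higher under the noise model than the speech model.
    hypotheses is a list of (word, speech_score, noise_score)."""
    return [
        NOISE_MARKER if noise_score > speech_score else word
        for word, speech_score, noise_score in hypotheses
    ]


# Hypothetical first-tier output with assumed log-likelihood scores.
first_tier = [
    ("call", -12.0, -30.0),
    ("uh", -25.0, -18.0),   # scores better as noise
    ("home", -10.5, -28.0),
]
cleaned = second_tier(first_tier)
```

Words already labelled as noise by the first tier would pass through unchanged; only speech hypotheses are re-examined here.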

Type: Grant

Filed: October 12, 1999

Date of Patent: December 31, 2002

Assignee: Microsoft Corporation

Inventors: Li Jiang, Xuedong Huang

Comfort noise generation in a radio receiver, using stored, previously-decoded noise after deactivating decoder during no-speech periods

Patent number: 6502071

Abstract: Power conservation when generating background noise samples in a radio receiver is disclosed. Background noise data is generated using at least one noise parameter that is transmitted as part of framed noise information. This information is transmitted at predetermined time intervals during a period of no-speech. A controller checks whether incoming framed data is the noise information. If the incoming framed data is identified as the noise information, a check is made to determine whether a time period, which corresponds to a predetermined number of consecutive frames, has expired. When the time period has not yet elapsed, the background noise data is generated using the at least one noise parameter in a manner that extends to the predetermined number of frames.

Type: Grant

Filed: July 15, 1999

Date of Patent: December 31, 2002

Assignee: NEC Corporation

Inventor: Mayumi Nagasaki

System for generating formant tracks by modifying formants synthesized from speech units

Patent number: 6502066

Abstract: Formants, corresponding to input speech units based either on a known text or the results of a speech recognition procedure, are generated from a formant synthesizer. A frequency response is generated based on the synthesized formants. A second frequency response is generated based on a speech signal which is received and which corresponds to utterances of speech units. The synthesized formants are modified based on a comparison of the frequency response corresponding to the synthesized formants and specific proportional characteristics of a frequency response of the input speech signal. In one illustrative embodiment, the comparison is then recalculated and further modifications are made accordingly to improve accuracy. In one illustrative embodiment, time aligning and frequency warping are utilized as modification functions.

Type: Grant

Filed: April 2, 2001

Date of Patent: December 31, 2002

Assignee: Microsoft Corporation

Inventor: Michael D. Plumpe

Speech synthesis apparatus

Patent number: 6499014

Abstract: The speech synthesis apparatus of the present invention includes: a text analyzer operable to generate a phonetic and prosodic symbol string from character information of an input text; a word dictionary storing a reading and an accent of a word; a voice segment dictionary storing a phoneme that is a basic unit of speech; a parameter generator operable to generate synthesizing parameters including at least a phoneme, a duration of the phoneme and a fundamental frequency for the phonetic and prosodic symbol string, the parameter generator including a calculating means operable to obtain a sum of phrase components and a sum of accent components and to calculate an average pitch from the sum of the phrase components and the sum of the accent components, and a determining means operable to determine a base pitch from the average pitch; and a waveform generator operable to generate a synthesized waveform by making waveform-overlapping referring to the synthesizing parameters generated by the parameter generator a

Type: Grant

Filed: March 7, 2000

Date of Patent: December 24, 2002

Assignee: Oki Electric Industry Co., Ltd.

Inventor: Keiichi Chihara

Transceiver for selecting a source coder based on signal distortion estimate

Patent number: 6499008

Abstract: A radio signal transceiver receiving at its input a speech signal and producing an output signal at a given output rate, the speech signal having undergone a source coding intended to sufficiently compress the input signal to obtain the desired output rate while an acceptable distortion ratio is maintained. In order to improve the compromise of transmission quality of the speech signal and transmission rate by selecting the optimum coder from the available coders, the transceiver comprises a measuring device for measuring the distortion of the output signal of a coder and a check circuit for comparing the estimated distortion with set values and deriving therefrom the optimum coder for the measured distortion.
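
The check circuit's comparison of a distortion estimate with set values can be sketched as a threshold lookup. The abstract does not list the set values or the coders, so the thresholds, the coder names, and the rule that low distortion permits a lower-rate coder are assumptions made for this example.

```python
from bisect import bisect_right


def select_coder(distortion, set_values, coders):
    """Pick a coder by comparing the measured distortion with ascending
    set values. coders must have one more entry than set_values."""
    if len(coders) != len(set_values) + 1:
        raise ValueError("need exactly one coder per distortion range")
    # bisect_right counts how many set values are <= distortion.
    return coders[bisect_right(set_values, distortion)]


# Hypothetical configuration: the cleaner the output, the more
# compression the transceiver can afford.
SET_VALUES = [0.1, 0.3]
CODERS = ["low_rate", "mid_rate", "high_rate"]

choice_clean = select_coder(0.05, SET_VALUES, CODERS)
choice_noisy = select_coder(0.50, SET_VALUES, CODERS)
```

A lookup of this kind keeps the trade-off between transmission rate and quality in one table of set values that can be tuned without changing the selection logic.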

Type: Grant

Filed: May 21, 1999

Date of Patent: December 24, 2002

Assignee: Koninklijke Philips Electronics N.V.

Inventor: Gilles Miet

Speech synthesis employing concatenated prosodic and acoustic templates for phrases of multiple words

Patent number: 6496801

Abstract: A speech synthesis system for generating voice dialog for a message frame having a fixed and a variable portion. A prosody module selects a prosodic template for each of the fixed and variable portions wherein at least one portion comprises a phrase of multiple words. An acoustic module selects an acoustic template for each of the fixed and variable portions wherein at least one portion comprises a phrase of multiple words. A frame generator concatenates the respective prosodic templates and acoustic templates. A sound module generates the voice dialog in accordance with the concatenated prosodic and acoustic templates.

Type: Grant

Filed: November 2, 1999

Date of Patent: December 17, 2002

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Peter Veprek, Steve Pearson, Jean-Claude Junqua
