Patents Examined by W. R. Young
  • Patent number: 7031917
    Abstract: The present invention relates to a speech recognition apparatus and a speech recognition method for speech recognition with improved accuracy. A distance calculator 47 determines the distance from a microphone 21 to the user who is speaking. Data indicating the determined distance is supplied to a speech recognition unit 41B. The speech recognition unit 41B has plural sets of acoustic models produced from speech data obtained by capturing speech uttered at various distances. From those sets of acoustic models, the speech recognition unit 41B selects the set produced from speech data uttered at the distance closest to the distance determined by the distance calculator 47, and performs speech recognition using the selected set of acoustic models.
    Type: Grant
    Filed: October 21, 2002
    Date of Patent: April 18, 2006
    Assignee: Sony Corporation
    Inventor: Yasuharu Asano
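
The selection step in the abstract above (pick the acoustic model set trained at the distance nearest to the measured one) can be illustrated with a minimal Python sketch. The function name, the use of centimetres, and the string labels standing in for model sets are all assumptions made for illustration; the patent does not specify them.

```python
from typing import Dict


def select_acoustic_model(measured_distance_cm: float,
                          models_by_distance: Dict[float, str]) -> str:
    """Return the model set whose training distance is closest to the measurement.

    `models_by_distance` maps a training distance (hypothetical unit: cm)
    to a placeholder label for the corresponding set of acoustic models.
    """
    if not models_by_distance:
        raise ValueError("no acoustic model sets available")
    # Nearest-neighbour choice over the available training distances.
    nearest = min(models_by_distance,
                  key=lambda d: abs(d - measured_distance_cm))
    return models_by_distance[nearest]


# Hypothetical model sets recorded at 30 cm, 1 m, and 3 m.
models = {30.0: "models_30cm", 100.0: "models_100cm", 300.0: "models_300cm"}
selected = select_acoustic_model(80.0, models)
print(selected)
```

A real recognizer would hold trained model parameters rather than labels; the sketch only shows the nearest-distance lookup.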
  • Patent number: 7031916
    Abstract: A method of initializing an ITU Recommendation G.729 Annex B voice activity detection (VAD) device is disclosed, having the steps of (1) extracting a set of parameters from a signal that characterize the signal; (2) calculating an energy measure of the signal from the set of parameters; (3) comparing the energy measure with a reference value; (4) determining an initial value for an average of a noise characteristic of the signal; and (5) counting the number of times the energy measure equals or exceeds the reference level. Also disclosed is a method of converging an ITU Recommendation G.
    Type: Grant
    Filed: June 1, 2001
    Date of Patent: April 18, 2006
    Assignee: Texas Instruments Incorporated
    Inventors: Dunling Li, Daniel C. Thomas, Gokhan Sisli
  • Patent number: 7027976
    Abstract: Methods and apparatus for document based ambiguous character resolution. An application searches a document for words that do not contain ambiguous characters and adds them to a dictionary, then searches the document for words that do contain ambiguous characters. For each ambiguous word, a set of candidate solutions is created by resolving the ambiguous characters in all possible ways. The dictionary is searched for words matching members of the candidate solution set. When a single member is matched, the ambiguous characters are resolved accordingly. When no member or more than one member is matched, a user is prompted to resolve the ambiguous characters. Alternatively, when more than one member is matched, the ambiguous characters are resolved to obtain the largest word, the smallest word, the most words, or the fewest words.
    Type: Grant
    Filed: January 29, 2001
    Date of Patent: April 11, 2006
    Assignee: Adobe Systems Incorporated
    Inventor: Richard L. Sites
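
As a rough illustration of the document-based resolution described in the abstract above, the following Python sketch builds a dictionary from the unambiguous words of a text, expands each ambiguous word into its candidate set, and resolves it only when exactly one candidate matches. Treating the hyphen as the ambiguous character (either a removable line-break hyphen or a real hyphen) is an assumption chosen for the example, as are all names; returning `None` stands in for prompting the user.

```python
import re
from itertools import product

# Hypothetical ambiguity table: a hyphen may be dropped or kept.
AMBIGUOUS = {"-": ["", "-"]}


def candidates(word):
    """Resolve every ambiguous character in all possible ways."""
    options = [AMBIGUOUS.get(ch, [ch]) for ch in word]
    return {"".join(parts) for parts in product(*options)}


def resolve(text):
    """Map each ambiguous word to its resolution, or None if undecided."""
    words = re.findall(r"[A-Za-z-]+", text)
    # Pass 1: words without ambiguous characters form the dictionary.
    dictionary = {w.lower() for w in words
                  if not any(ch in AMBIGUOUS for ch in w)}
    results = {}
    # Pass 2: try to resolve each ambiguous word against the dictionary.
    for w in words:
        if any(ch in AMBIGUOUS for ch in w):
            matches = sorted(c for c in candidates(w)
                             if c.lower() in dictionary)
            # Exactly one match resolves the word; zero or several
            # matches would be handed to the user (None here).
            results[w] = matches[0] if len(matches) == 1 else None
    return results


text = ("The database stores records. Each data-base entry is indexed. "
        "A well-known case.")
resolved = resolve(text)
print(resolved)
```

The alternative tie-breaking policies named in the abstract (largest word, smallest word, and so on) are not implemented in this sketch.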
  • Patent number: 7027978
    Abstract: A system control portion of a voice data recording and reproducing apparatus converts inputted voice signals into digitized voice data, adds header information stored in a table composed of a rewritable nonvolatile storage medium to the converted voice data and records them in a semiconductor memory as a recording medium. A PC to which such a voice data recording and reproducing apparatus can be connected acquires header information stored in the data table, and, when the changing of the header information is designated, sends the changed header information to the voice data recording and reproducing apparatus. Based upon the sent header information, the system control portion of the voice recording and reproducing apparatus rewrites the header information in the data table.
    Type: Grant
    Filed: January 29, 2001
    Date of Patent: April 11, 2006
    Assignee: Olympus Optical Co., Ltd.
    Inventor: Hideo Okano
  • Patent number: 7024362
    Abstract: A method for estimating mean opinion score or naturalness of synthesized speech is provided. The method includes using an objective measure that has components derived directly from textual information used to form synthesized utterances. The objective measure has a high correlation with mean opinion score such that a relationship can be formed between the objective measure and corresponding mean opinion score. An estimated mean opinion score can be obtained easily from the relationship when the objective measure is applied to utterances of a modified speech synthesizer.
    Type: Grant
    Filed: February 11, 2002
    Date of Patent: April 4, 2006
    Assignee: Microsoft Corporation
    Inventors: Min Chu, Hu Peng
  • Patent number: 7024354
    Abstract: In response to a coded speech signal output from a speech coder, a speech decoder decodes the coded speech signal into a reproduction speech signal. If the reproduction speech signal meets predetermined conditions, for example, “silence”, “unvoiced sound”, and the like, the speech decoder further operates as follows. The speech decoder calculates spectral parameters based on the reproduction speech signal, and calculates an excitation signal on the basis of the reproduction speech signal and the spectral parameters. In the calculation, a level of the excitation signal is also obtained. The speech decoder smooths in time at least one of the spectral parameters and the level of the excitation signal. The speech decoder synthesizes the excitation signal by using the synthesis filter constructed with the spectral parameters, so as to reproduce the speech signal. The speech signal has an excellent quality even if a bit rate is low.
    Type: Grant
    Filed: November 6, 2001
    Date of Patent: April 4, 2006
    Assignee: NEC Corporation
    Inventor: Kazunori Ozawa
  • Patent number: 7024357
    Abstract: An apparatus for detecting at least one tone having a known frequency and duration in an input signal. The input signal is input over a period of time which is divided into frame portions including at least an initial frame portion and a last frame portion. An energy signal indicative of the energy of the input signal during each frame portion is generated. A signal filter receives the energy signal and generates a noise indicator for each frame portion based on whether noise is detected in the energy signal. A dynamic threshold determiner generates an energy threshold for each frame portion. The energy threshold for the initial frame portion is generated based on a minimum expected value of the energy signal for a subsequent frame portion. The energy thresholds for frame portions subsequent to the initial frame portion are generated based on values of the energy signals during previous frame portions and the noise indicator.
    Type: Grant
    Filed: March 22, 2004
    Date of Patent: April 4, 2006
    Assignee: Legerity, Inc.
    Inventor: John G. Bartkowiak
  • Patent number: 7020605
    Abstract: A speech coding system is provided with time-domain noise attenuation. The speech coding system has an encoder operatively connected to a decoder via a communication medium. A preprocessor processes a digitized speech signal from an analog-to-digital converter. Speech coding systems are used to encode and decode a bitstream. Gains from the speech coding are adjusted by a gain factor Gf that provides time-domain background noise attenuation.
    Type: Grant
    Filed: February 13, 2001
    Date of Patent: March 28, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
  • Patent number: 7016840
    Abstract: A speech synthesis apparatus (10) comprises speech segment disassembling means (101) for disassembling the speech segments each including at least one phoneme into a plurality of pitch waveforms, phase characteristic transforming means (103) for transforming the phase characteristics of the pitch waveforms into a uniformed phase characteristic, pitch waveform classifying means (104) for classifying the pitch waveforms into a plurality of groups, pitch waveform registering means (106) for registering the pitch waveforms in the database (111) by extracting one pitch waveform from among the pitch waveforms in each of the groups, and synthesizing means (107) for synthesizing the speech with the pitch waveforms registered in the database (111). The speech synthesis apparatus (10) thus constructed can synthesize a natural speech using a relatively small database capacity.
    Type: Grant
    Filed: September 12, 2001
    Date of Patent: March 21, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Ryo Mochizuki, Toshiyuki Isono, Hirofumi Nishimura
  • Patent number: 7016829
    Abstract: A method of training a natural language processing unit applies a candidate learning set to at least one component of the natural language unit. The natural language unit is then used to generate a meaning set from a first corpus. A second meaning set is generated from a second corpus using a second natural language unit and the two meaning sets are compared to each other to form a score for the candidate learning set. This score is used to determine whether to modify the natural language unit based on the candidate learning set.
    Type: Grant
    Filed: May 4, 2001
    Date of Patent: March 21, 2006
    Assignee: Microsoft Corporation
    Inventors: Eric D. Brill, Arul A. Menezes
  • Patent number: 7016692
    Abstract: A channel for location estimation based on a wireless data communication from a mobile station is selected based on one or more of signal duration, variability and power level/signal-to-noise ratio of at least a portion of the wireless signals transmitted on the selected channel by the mobile station under the applicable configuration. Acceptable channels reducing location estimation error over alternatives include the access channel for Short Message Service (SMS) systems, the reverse pilot channel or the enhanced access channel for IS2000 systems, and the reverse link traffic channel for 1×EV-DO or 1×EV-DV systems. Location estimation is performed on wireless data communications on the selected channel.
    Type: Grant
    Filed: March 20, 2002
    Date of Patent: March 21, 2006
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Purva R. Rajkotia
  • Patent number: 7016837
    Abstract: An initial combination HMM 16 is generated from a voice HMM 10 having multiplicative distortions and an initial noise HMM of additive noise, and at the same time, a Jacobian matrix J is calculated by a Jacobian matrix calculating section 19. Noise variation Namh(cep), in which an estimated value Ha^(cep) of the multiplicative distortions that are obtained from voice that is actually uttered, additive noise Na(cep) that is obtained in a non-utterance period, and additive noise Nm(cep) of the initial noise HMM 17 are combined, is multiplied by a Jacobian matrix, wherein the result of the multiplication and initial combination HMM 16 are combined, and an adaptive HMM 26 is generated. Thereby, an adaptive HMM 26 that is matched to the observation value series RNah(cep) generated from actual utterance voice can be generated in advance.
    Type: Grant
    Filed: September 18, 2001
    Date of Patent: March 21, 2006
    Assignee: Pioneer Corporation
    Inventors: Hiroshi Seo, Mitsuya Komamura, Soichi Toyama
  • Patent number: 7016833
    Abstract: A method and system for speech characterization. One embodiment includes a method for speaker verification which includes collecting data from a speaker, wherein the data comprises acoustic data and non-acoustic data. The data is used to generate a template that includes a first set of “template” parameters. The method further includes receiving a real-time identity claim from a claimant, and using acoustic data and non-acoustic data from the identity claim to generate a second set of parameters. The method further includes comparing the first set of parameters to the second set of parameters to determine whether the claimant is the speaker. The first set of parameters and the second set of parameters include at least one purely non-acoustic parameter, including a non-acoustic glottal shape parameter derived from averaging multiple glottal cycle waveforms.
    Type: Grant
    Filed: June 12, 2001
    Date of Patent: March 21, 2006
    Assignee: The Regents of the University of California
    Inventors: Todd J. Gable, Lawrence C. Ng, John F. Holzrichter, Greg C. Burnett
  • Patent number: 7013261
    Abstract: A system provides accelerated morphological analysis and in particular a speed-up of morphological look-up via a caching mechanism. The system determines whether each incoming token in a token stream is unique or recurring. Unique tokens, which occur for the first time in the token stream, are marked with a unique numerical identification (ID). A pointer is added to recurring tokens, which already occurred in the token stream, and directed towards the unique numerical ID which was defined for the respective token when occurring for the first time. A morphological look-up is performed on the unique tokens. Subsequently, the tokens carrying the pointer are detected and replaced with the results of morphological look-up stored under the unique numerical ID of the respective unique token.
    Type: Grant
    Filed: October 16, 2001
    Date of Patent: March 14, 2006
    Assignee: Xerox Corporation
    Inventor: Andreas Eisele
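
The caching scheme in the abstract above can be sketched in a few lines of Python: first occurrences get a numeric ID, repeats get a pointer to that ID, the expensive look-up runs once per unique token, and pointers are then replaced by the stored results. The tuple layout, the function names, and the toy suffix-stripping look-up are assumptions for illustration only and do not come from the patent.

```python
def analyze_stream(tokens, lookup):
    """Run `lookup` once per distinct token and fan results out to repeats."""
    first_id = {}   # token -> unique numerical ID assigned at first occurrence
    marked = []     # ("unique", id, token) or ("pointer", id)
    for tok in tokens:
        if tok in first_id:
            # Recurring token: store only a pointer to the earlier ID.
            marked.append(("pointer", first_id[tok]))
        else:
            first_id[tok] = len(first_id)
            marked.append(("unique", first_id[tok], tok))

    # Morphological look-up is performed on the unique tokens only.
    results = {}
    for entry in marked:
        if entry[0] == "unique":
            results[entry[1]] = lookup(entry[2])

    # Replace every entry, pointer or not, with the result stored under its ID.
    return [results[entry[1]] for entry in marked]


calls = []


def toy_lookup(token):
    """Stand-in for a real morphological analyzer: strips a final 's'."""
    calls.append(token)
    return token[:-1] if token.endswith("s") else token


analysis = analyze_stream(["cats", "chase", "cats", "dogs", "chase"], toy_lookup)
print(analysis, len(calls))
```

Five tokens are analyzed with three look-up calls here; the saving grows with the repetition rate of the token stream.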
  • Patent number: 7013280
    Abstract: A method for correcting ambiguations in directory assistance systems includes the steps of receiving and processing a directory assistance request from a caller. If the processing results in an ambiguation of at least two names, audio information is provided to the caller. The audio information includes, at least in part, playback of an audio recording of at least one person included in the ambiguation. The audio playback helps the caller resolve the ambiguation. A voice activated directory assistance system having structure for resolving ambiguations is also disclosed.
    Type: Grant
    Filed: February 27, 2001
    Date of Patent: March 14, 2006
    Assignee: International Business Machines Corporation
    Inventors: Brent L. Davis, Reza Ghasemi, Susan M. Hill, Tracy Kong, John R. Lauria, Vanessa V. Michelini
  • Patent number: 7013273
    Abstract: A system and associated method of converting audio data from a television signal into textual data for display as a closed caption on a display device is provided. The audio data is decoded and audio speech signals are filtered from the audio data. The audio speech signals are parsed into phonemes by a speech recognition module. The parsed phonemes are grouped into words and sentences responsive to a database of words corresponding to the grouped phonemes. The words are converted into text data which is formatted for presentation on the display device as closed captioned textual data.
    Type: Grant
    Filed: March 29, 2001
    Date of Patent: March 14, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Michael Kahn
  • Patent number: 7010476
    Abstract: A system constructs finite-state networks. The system initially compiles an intermediate finite-state network from a source file of regular expressions. The intermediate finite-state network includes a delimited subpath that defines a substring having the form of a regular expression. The system subsequently produces an output finite-state network in which the delimited subpath is replaced with an FSN compiled from the substring encoded by the delimited subpath.
    Type: Grant
    Filed: December 18, 2000
    Date of Patent: March 7, 2006
    Assignee: Xerox Corporation
    Inventors: Lauri J. Karttunen, Kenneth R. Beesley
  • Patent number: 7010480
    Abstract: A method for preparing a speech signal for encoding comprises determining whether the spectral content of an input speech signal is representative of a defined spectral characteristic (e.g., a defined characteristic slope). A frequency specific filter component of a weighting filter is controlled based on the determination of the spectral content of the speech signal and/or its location in the encoder. A core weighting filter component of the weighting filter may be maintained regardless of the spectral content of the speech signal.
    Type: Grant
    Filed: September 13, 2001
    Date of Patent: March 7, 2006
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Yang Gao, Huan-Yu Su
  • Patent number: 7010481
    Abstract: In a method for performing a segmentation operation upon a synthesizing speech signal and an input speech signal, a synthesized speech signal and a speech element duration signal are generated from the synthesizing speech signal. A first feature parameter is extracted from the synthesized speech signal, and a second feature parameter is extracted from the input speech signal. A dynamic programming matching operation is performed upon the second feature parameter with reference to the first feature parameter and the speech element duration signal to obtain segmentation points of the input speech signal.
    Type: Grant
    Filed: March 27, 2002
    Date of Patent: March 7, 2006
    Assignee: NEC Corporation
    Inventor: Takuya Takizawa
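
One common form of dynamic programming matching is dynamic time warping (DTW), and the Python sketch below uses it to show how known unit boundaries in a synthesized signal could be projected onto an input signal. This is an illustrative stand-in, not the patented procedure: the scalar per-frame features, the absolute-difference cost, the frame-count durations, and all function names are assumptions.

```python
def dtw_path(ref, inp):
    """Return the minimum-cost monotonic alignment as (ref_idx, inp_idx) pairs."""
    n, m = len(ref), len(inp)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - inp[j - 1])  # local frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return path


def segmentation_points(ref, inp, durations):
    """Map unit boundaries of `ref` (given as frame counts) onto `inp`."""
    path = dtw_path(ref, inp)
    points, total = [], 0
    for d in durations[:-1]:
        total += d  # first reference frame of the next unit
        # Earliest input frame aligned with that reference frame.
        points.append(min(j for i, j in path if i == total))
    return points


# Synthesized features: a 3-frame unit followed by a 2-frame unit.
synth = [0.0, 0.0, 0.0, 5.0, 5.0]
spoken = [0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
points = segmentation_points(synth, spoken, [3, 2])
print(points)
```

In practice the features would be multidimensional vectors (for example cepstral coefficients) and the local distance a vector norm; the alignment logic stays the same.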
  • Patent number: RE39013
    Abstract: Disclosed is an optical recording and reproducing apparatus comprising a light source directing a light spot toward a recording medium, a detection system detecting light reflected from the recording medium to derive an electrical signal from the reflected light, an information processing circuit modulating the intensity of the light spot according to writing pulses to record information on the recording medium and using the electrical signal to reproduce information from the recording medium, and a tracking servo circuit carrying out tracking servo operation on the basis of the electrical signal and including an extracting circuit connected to a source of extracting pulses having a pulse width at least equal to the writing pulse width so that writing pulse parts contained in the electrical signal are extracted during recording information, whereby a track offset occurring during information recording can be minimized, and the stability of the tracking servo system can be improved.
    Type: Grant
    Filed: August 5, 2003
    Date of Patent: March 14, 2006
    Assignee: Hitachi, Ltd.
    Inventors: Toshimitsu Kaku, Kazuo Shigematsu, Hisataka Sugiyama, Takeshi Maeda, Masahiro Takasago