Patents Examined by W. R. Young
-
Patent number: 7031917
Abstract: The present invention relates to a speech recognition apparatus and a speech recognition method for speech recognition with improved accuracy. A distance calculator 47 determines the distance from a microphone 21 to a user uttering. Data indicating the determined distance is supplied to a speech recognition unit 41B. The speech recognition unit 41B has plural sets of acoustic models produced from speech data obtained by capturing speeches uttered at various distances. From those sets of acoustic models, the speech recognition unit 41B selects a set of acoustic models produced from speech data uttered at a distance closest to the distance determined by the distance calculator 47, and the speech recognition unit 41B performs speech recognition using the selected set of acoustic models.
Type: Grant
Filed: October 21, 2002
Date of Patent: April 18, 2006
Assignee: Sony Corporation
Inventor: Yasuharu Asano
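The selection step described in this abstract can be illustrated with a minimal sketch. It assumes the acoustic model sets are stored in a mapping keyed by the distance (in meters) at which their training speech was captured; the function name, the mapping, and the unit are hypothetical and not taken from the patent.

```python
def select_model_set(model_sets, measured_distance):
    """Pick the model set trained at the distance nearest to the measured one.

    model_sets: dict mapping training distance -> model set (hypothetical layout).
    measured_distance: distance from microphone to speaker, same unit as the keys.
    """
    # Find the training distance with the smallest absolute difference.
    nearest = min(model_sets, key=lambda d: abs(d - measured_distance))
    return model_sets[nearest]


# Placeholder strings stand in for real acoustic model sets.
models = {0.3: "near-field models", 1.0: "mid-range models", 3.0: "far-field models"}
chosen = select_model_set(models, 1.4)
```

The recognizer would then decode with `chosen`; how the distance itself is measured is outside this sketch.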
-
Patent number: 7031916
Abstract: A method of initializing an ITU Recommendation G.729 Annex B voice activity detection (VAD) device is disclosed, having the steps of (1) extracting a set of parameters from a signal that characterize the signal; (2) calculating an energy measure of the signal from the set of parameters; (3) comparing the energy measure with a reference value; (4) determining an initial value for an average of a noise characteristic of the signal; and (5) counting the number of times the energy measure equals or exceeds the reference level. Also disclosed is a method of converging an ITU Recommendation G.
Type: Grant
Filed: June 1, 2001
Date of Patent: April 18, 2006
Assignee: Texas Instruments Incorporated
Inventors: Dunling Li, Daniel C. Thomas, Gokhan Sisli
-
Patent number: 7027976
Abstract: Methods and apparatus for document based ambiguous character resolution. An application searches a document for words that do not contain ambiguous characters and adds them to a dictionary, then searches the document for words that do contain ambiguous characters. For each ambiguous word, a set of candidate solutions is created by resolving the ambiguous characters in all possible ways. The dictionary is searched for words matching members of the candidate solution set. When a single member is matched, the ambiguous characters are resolved accordingly. When no member or more than one member is matched, a user is prompted to resolve the ambiguous characters. Alternatively, when more than one member is matched, the ambiguous characters are resolved to obtain the largest word, the smallest word, the most words, or the fewest words.
Type: Grant
Filed: January 29, 2001
Date of Patent: April 11, 2006
Assignee: Adobe Systems Incorporated
Inventor: Richard L. Sites
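A rough sketch of the candidate-expansion and dictionary-matching steps in this abstract follows. It assumes ambiguous characters are given as a mapping from a marker character to its possible resolutions; the marker `*`, the function names, and the choice to return `None` where the abstract would prompt the user are all illustrative assumptions, not details from the patent.

```python
from itertools import product


def build_dictionary(words, alternatives):
    """Collect the document words that contain no ambiguous characters."""
    return {w for w in words if not any(ch in alternatives for ch in w)}


def resolve_ambiguous(word, alternatives, dictionary):
    """Return the unique dictionary match for `word`, or None.

    None covers both "no match" and "several matches", the cases in which
    the abstract says a user is prompted.
    """
    # Each position contributes either its possible resolutions or itself.
    choices = [alternatives.get(ch, ch) for ch in word]
    candidates = {"".join(combo) for combo in product(*choices)}
    matches = candidates & dictionary
    return next(iter(matches)) if len(matches) == 1 else None


alternatives = {"*": "iau"}          # hypothetical: '*' may be i, a, or u
document = ["fill", "the", "f*ll"]
dictionary = build_dictionary(document, alternatives)
resolved = resolve_ambiguous("f*ll", alternatives, dictionary)
```

Note that the candidate set grows exponentially with the number of ambiguous characters in a word, so a practical version would likely cap or prune it.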
-
Patent number: 7027978
Abstract: A system control portion of a voice data recording and reproducing apparatus converts inputted voice signals into digitized voice data, adds header information stored in a table composed of a rewritable nonvolatile storage medium to the converted voice data and records them in a semiconductor memory as a recording medium. A PC to which such a voice data recording and reproducing apparatus can be connected acquires header information stored in the data table, and, when the changing of the header information is designated, sends the changed header information to the voice data recording and reproducing apparatus. Based upon the sent header information, the system control portion of the voice recording and reproducing apparatus rewrites the header information in the data table.
Type: Grant
Filed: January 29, 2001
Date of Patent: April 11, 2006
Assignee: Olympus Optical Co., Ltd.
Inventor: Hideo Okano
-
Patent number: 7024362
Abstract: A method for estimating mean opinion score or naturalness of synthesized speech is provided. The method includes using an objective measure that has components derived directly from textual information used to form synthesized utterances. The objective measure has a high correlation with mean opinion score such that a relationship can be formed between the objective measure and corresponding mean opinion score. An estimated mean opinion score can be obtained easily from the relationship when the objective measure is applied to utterances of a modified speech synthesizer.
Type: Grant
Filed: February 11, 2002
Date of Patent: April 4, 2006
Assignee: Microsoft Corporation
Inventors: Min Chu, Hu Peng
-
Patent number: 7024354
Abstract: In response to a coded speech signal output from a speech coder, a speech decoder decodes the coded speech signal into a reproduction speech signal. If the reproduction speech signal meets predetermined conditions, for example, “silence”, “unvoiced sound”, and the like, the speech decoder further operates as follows. The speech decoder calculates spectral parameters based on the reproduction speech signal, and calculates an excitation signal on the basis of the reproduction speech signal and the spectral parameters. In the calculation, a level of the excitation signal is also obtained. The speech decoder smoothes in time at least one of the spectral parameters and the level of the excitation signal. The speech decoder synthesizes the excitation signal by using the synthesis filter constructed with the spectral parameters, so as to reproduce the speech signal. The speech signal has an excellent quality even if a bit rate is low.
Type: Grant
Filed: November 6, 2001
Date of Patent: April 4, 2006
Assignee: NEC Corporation
Inventor: Kazunori Ozawa
-
Patent number: 7024357
Abstract: An apparatus for detecting at least one tone having a known frequency and duration in an input signal. The input signal is input over a period of time which is divided into frame portions including at least an initial frame portion and a last frame portion. An energy signal indicative of the energy of the input signal during each frame portion is generated. A signal filter receives the energy signal and generates a noise indicator for each frame portion based on whether noise is detected in the energy signal. A dynamic threshold determiner generates an energy threshold for each frame portion. The energy threshold for the initial frame portion is generated based on a minimum expected value of the energy signal for a subsequent frame portion. The energy thresholds for frame portions subsequent to the initial frame portion are generated based on values of the energy signals during previous frame portions and the noise indicator.
Type: Grant
Filed: March 22, 2004
Date of Patent: April 4, 2006
Assignee: Legerity, Inc.
Inventor: John G. Bartkowiak
-
Patent number: 7020605
Abstract: A speech coding system is provided with time-domain noise attenuation. The speech coding system has an encoder operatively connected to a decoder via a communication medium. A preprocessor processes a digitized speech signal from an analog-to-digital converter. Speech coding systems are used to encode and decode a bitstream. Gains from the speech coding are adjusted by a gain factor Gf that provides time-domain background noise attenuation.
Type: Grant
Filed: February 13, 2001
Date of Patent: March 28, 2006
Assignee: Mindspeed Technologies, Inc.
Inventor: Yang Gao
-
Patent number: 7016840
Abstract: A speech synthesis apparatus (10) comprises speech segment disassembling means (101) for disassembling the speech segments each including at least one phoneme into a plurality of pitch waveforms, phase characteristic transforming means (103) for transforming the phase characteristics of the pitch waveforms into a uniformed phase characteristic, pitch waveform classifying means (104) for classifying the pitch waveforms into a plurality of groups, pitch waveform registering means (106) for registering the pitch waveforms in the database (111) by extracting one pitch waveform from among the pitch waveforms in each of the groups, and synthesizing means (107) for synthesizing the speech with the pitch waveforms registered in the database (111). The speech synthesis apparatus (10) thus constructed can synthesize a natural speech using a relatively small database capacity.
Type: Grant
Filed: September 12, 2001
Date of Patent: March 21, 2006
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Ryo Mochizuki, Toshiyuki Isono, Hirofumi Nishimura
-
Patent number: 7016829
Abstract: A method of training a natural language processing unit applies a candidate learning set to at least one component of the natural language unit. The natural language unit is then used to generate a meaning set from a first corpus. A second meaning set is generated from a second corpus using a second natural language unit and the two meaning sets are compared to each other to form a score for the candidate learning set. This score is used to determine whether to modify the natural language unit based on the candidate learning set.
Type: Grant
Filed: May 4, 2001
Date of Patent: March 21, 2006
Assignee: Microsoft Corporation
Inventors: Eric D. Brill, Arul A. Menezes
-
Patent number: 7016692
Abstract: A channel for location estimation based on a wireless data communication from a mobile station is selected based on one or more of signal duration, variability and power level/signal-to-noise ratio of at least a portion of the wireless signals transmitted on the selected channel by the mobile station under the applicable configuration. Acceptable channels reducing location estimation error over alternatives include the access channel for Short Message Service (SMS) systems, the reverse pilot channel or the enhanced access channel for IS2000 systems, and the reverse link traffic channel for 1×EV-DO or 1×EV-DV systems. Location estimation is performed on wireless data communications on the selected channel.
Type: Grant
Filed: March 20, 2002
Date of Patent: March 21, 2006
Assignee: Samsung Electronics Co., Ltd.
Inventor: Purva R. Rajkotia
-
Patent number: 7016837
Abstract: An initial combination HMM 16 is generated from a voice HMM 10 having multiplicative distortions and an initial noise HMM of additive noise, and at the same time, a Jacobian matrix J is calculated by a Jacobian matrix calculating section 19. Noise variation Namh (cep), in which an estimated value Ha^(cep) of the multiplicative distortions that are obtained from voice that is actually uttered, additive noise Na(cep) that is obtained in a non-utterance period, and additive noise Nm(cep) of the initial noise HMM 17 are combined, is multiplied by a Jacobian matrix, wherein the result of the multiplication and initial combination HMM 16 are combined, and an adaptive HMM 26 is generated. Thereby, an adaptive HMM 26 that is matched to the observation value series RNah(cep) generated from actual utterance voice can be generated in advance.
Type: Grant
Filed: September 18, 2001
Date of Patent: March 21, 2006
Assignee: Pioneer Corporation
Inventors: Hiroshi Seo, Mitsuya Komamura, Soichi Toyama
-
Patent number: 7016833
Abstract: A method and system for speech characterization. One embodiment includes a method for speaker verification which includes collecting data from a speaker, wherein the data comprises acoustic data and non-acoustic data. The data is used to generate a template that includes a first set of “template” parameters. The method further includes receiving a real-time identity claim from a claimant, and using acoustic data and non-acoustic data from the identity claim to generate a second set of parameters. The method further includes comparing the first set of parameters to the second set of parameters to determine whether the claimant is the speaker. The first set of parameters and the second set of parameters include at least one purely non-acoustic parameter, including a non-acoustic glottal shape parameter derived from averaging multiple glottal cycle waveforms.
Type: Grant
Filed: June 12, 2001
Date of Patent: March 21, 2006
Assignee: The Regents of the University of California
Inventors: Todd J. Gable, Lawrence C. Ng, John F. Holzrichter, Greg C. Burnett
-
Patent number: 7013261
Abstract: A system provides accelerated morphological analysis and in particular a speed-up of morphological look-up via a caching mechanism. The system determines whether each incoming token in a token stream is unique or recurring. Unique tokens, which occur for the first time in the token stream, are marked with a unique numerical identification (ID). A pointer is added to recurring tokens, which already occurred in the token stream, and directed towards the unique numerical ID which was defined for the respective token when occurring for the first time. A morphological look-up is performed on the unique tokens. Subsequently, the tokens carrying the pointer are detected and replaced with the results of morphological look-up stored under the unique numerical ID of the respective unique token.
Type: Grant
Filed: October 16, 2001
Date of Patent: March 14, 2006
Assignee: Xerox Corporation
Inventor: Andreas Eisele
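The caching scheme in this abstract can be sketched in a few lines. The version below is only an illustration under assumptions: `lookup` stands in for whatever morphological analyzer is used, and the function name and tuple-based marking are hypothetical rather than taken from the patent.

```python
def analyze_stream(tokens, lookup):
    """Run `lookup` once per distinct token and reuse the result for repeats."""
    first_id = {}   # token text -> unique numerical ID assigned at first occurrence
    marked = []     # one entry per token: ("unique", id) or ("pointer", id)

    # Pass 1: mark first occurrences with a new ID, repeats with a pointer to it.
    for tok in tokens:
        if tok in first_id:
            marked.append(("pointer", first_id[tok]))
        else:
            first_id[tok] = len(first_id)
            marked.append(("unique", first_id[tok]))

    # Pass 2: perform the (expensive) look-up on unique tokens only.
    results = {tok_id: lookup(tok) for tok, tok_id in first_id.items()}

    # Pass 3: replace every entry, pointer or unique, with the stored result.
    return [results[tok_id] for _, tok_id in marked]


# Toy stand-in for a morphological analyzer: strips a trailing "s".
analyses = analyze_stream(["walks", "runs", "walks"], lambda t: t.rstrip("s"))
```

The saving depends on how often tokens repeat; for natural-language text, where a small number of word forms account for most tokens, the number of look-ups can be far smaller than the stream length.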
-
Patent number: 7013280
Abstract: A method for correcting ambiguations in directory assistance systems includes the steps of receiving and processing a directory assistance request from a caller. If the processing results in an ambiguation of at least two names, audio information is provided to the caller. The audio information includes, at least in part, playback of an audio recording of at least one person included in the ambiguation. The audio playback helps the caller resolve the ambiguation. A voice activated directory assistance system having structure for resolving ambiguations is also disclosed.
Type: Grant
Filed: February 27, 2001
Date of Patent: March 14, 2006
Assignee: International Business Machines Corporation
Inventors: Brent L. Davis, Reza Ghasemi, Susan M. Hill, Tracy Kong, John r Lauria, Vanessa V. Michelini
-
Patent number: 7013273
Abstract: A system and associated method of converting audio data from a television signal into textual data for display as a closed caption on a display device is provided. The audio data is decoded and audio speech signals are filtered from the audio data. The audio speech signals are parsed into phonemes by a speech recognition module. The parsed phonemes are grouped into words and sentences responsive to a database of words corresponding to the grouped phonemes. The words are converted into text data which is formatted for presentation on the display device as closed captioned textual data.
Type: Grant
Filed: March 29, 2001
Date of Patent: March 14, 2006
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventor: Michael Kahn
-
Patent number: 7010476
Abstract: A system constructs finite-state networks. The system initially compiles an intermediate finite-state network from a source file of regular expressions. The intermediate finite-state network includes a delimited subpath that defines a substring having the form of a regular expression. The system subsequently produces an output finite-state network in which the delimited subpath is replaced with an FSN compiled from the substring encoded by the delimited subpath.
Type: Grant
Filed: December 18, 2000
Date of Patent: March 7, 2006
Assignee: Xerox Corporation
Inventors: Lauri J Karttunen, Kenneth R Beesley
-
Patent number: 7010480
Abstract: A method for preparing a speech signal for encoding comprises determining whether the spectral content of an input speech signal is representative of a defined spectral characteristic (e.g., a defined characteristic slope). A frequency specific filter component of a weighting filter is controlled based on the determination of the spectral content of the speech signal and/or its location in the encoder. A core weighting filter component of the weighting filter may be maintained regardless of the spectral content of the speech signal.
Type: Grant
Filed: September 13, 2001
Date of Patent: March 7, 2006
Assignee: Mindspeed Technologies, Inc.
Inventors: Yang Gao, Huan-Yu Su
-
Patent number: 7010481
Abstract: In a method for performing a segmentation operation upon a synthesizing speech signal and an input speech signal, a synthesized speech signal and a speech element duration signal are generated from the synthesizing speech signal. A first feature parameter is extracted from the synthesized speech signal, and a second feature parameter is extracted from the input speech signal. A dynamic programming matching operation is performed upon the second feature parameter with reference to the first feature parameter and the speech element duration signal to obtain segmentation points of the input speech signal.
Type: Grant
Filed: March 27, 2002
Date of Patent: March 7, 2006
Assignee: NEC Corporation
Inventor: Takuya Takizawa
-
Method and apparatus for optical recording and reproducing with tracking servo reducing track offset
Patent number: RE39013
Abstract: Disclosed is an optical recording and reproducing apparatus comprising a light source directing a light spot toward a recording medium, a detection system detecting light reflected from the recording medium to derive an electrical signal from the reflected light, an information processing circuit modulating the intensity of the light spot according to writing pulses to record information on the recording medium and using the electrical signal to reproduce information from the recording medium, and a tracking servo circuit carrying out tracking servo operation on the basis of the electrical signal and including an extracting circuit connected to a source of extracting pulses having a pulse width at least equal to the writing pulse width so that writing pulse parts contained in the electrical signal are extracted during recording information, whereby a track offset occurring during information recording can be minimized, and the stability of the tracking servo system can be improved.
Type: Grant
Filed: August 5, 2003
Date of Patent: March 14, 2006
Assignee: Hitachi, Ltd.
Inventors: Toshimitsu Kaku, Kazuo Shigematsu, Hisataka Sugiyama, Takeshi Maeda, Masahiro Takasago