Patents Examined by Abul K. Azad
-
Patent number: 8290774
Abstract: An improved ε-removal method is disclosed that computes for any input weighted automaton A with ε-transitions an equivalent weighted automaton B with no ε-transitions. The method comprises two main steps. The first step comprises computing for each state “p” of the automaton A its ε-closure. The second step in the method comprises modifying the outgoing transitions of each state “p” by removing those labeled with ε. The method next comprises adding to the set of transitions leaving the state “p” non-ε-transitions leaving each state “q” in the set of states reachable from “p” via a path labeled with ε, with their weights pre-⊗-multiplied by the ε-distance from state “p” to state “q” in the automaton A. State “p” is a final state if some state “q” within the set of states reachable from “p” via a path labeled with ε is final, and the final weight is $\rho[p] = \bigoplus_{q \in \varepsilon[p] \cap F} \left( d[p, q] \otimes \rho[q] \right)$.
Type: Grant
Filed: July 20, 2001
Date of Patent: October 16, 2012
Assignee: AT&T Intellectual Property II, L.P.
Inventor: Mehryar Mohri
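The two steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the tropical semiring (⊕ is `min`, ⊗ is `+`) with non-negative weights so that Dijkstra's algorithm can compute the ε-distances, and the names `EPS`, `eps_closure`, and `remove_epsilons`, as well as the dictionary-based automaton layout, are hypothetical choices made for illustration.

```python
import heapq

EPS = None  # hypothetical marker for an ε label in this sketch


def eps_closure(arcs, p):
    """Step 1: map each state q reachable from p by ε-paths to its ε-distance d[p, q].

    Assumes the tropical semiring with non-negative weights. The state p itself
    is included at distance 0 (the empty path).
    """
    dist = {p: 0.0}
    heap = [(0.0, p)]
    while heap:
        d, q = heapq.heappop(heap)
        if d > dist[q]:
            continue  # stale heap entry
        for label, w, r in arcs.get(q, []):
            if label is EPS and d + w < dist.get(r, float("inf")):
                dist[r] = d + w
                heapq.heappush(heap, (d + w, r))
    return dist


def remove_epsilons(arcs, finals):
    """Step 2: rebuild each state's outgoing arcs and final weight without ε.

    arcs:   {state: [(label, weight, next_state), ...]}
    finals: {state: final_weight}
    """
    states = set(arcs) | set(finals) | {r for out in arcs.values() for _, _, r in out}
    new_arcs, new_finals = {}, {}
    for p in states:
        best = {}
        for q, d in eps_closure(arcs, p).items():
            # Copy q's non-ε arcs to p, pre-multiplying (here: adding) d[p, q].
            for label, w, r in arcs.get(q, []):
                if label is not EPS:
                    key = (label, r)
                    best[key] = min(best.get(key, float("inf")), d + w)
            # Final weight: ⊕ over final q in the closure of d[p, q] ⊗ ρ[q].
            if q in finals:
                new_finals[p] = min(new_finals.get(p, float("inf")), d + finals[q])
        new_arcs[p] = sorted((label, w, r) for (label, r), w in best.items())
    return new_arcs, new_finals
```

Parallel arcs with the same label and destination are merged with `min` here, which is a simplification of this sketch. For other semirings the closure step would need a generic shortest-distance algorithm rather than Dijkstra's.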
-
Patent number: 7464030
Abstract: Each of the M basic vectors in a noise code book 260 is multiplied by a factor ±1 in a sign adder 270 and combined in an adder 280 to create 2^M noise signed vectors. The characteristic of the binary Gray code is utilized as follows. A change ΔGu obtained between a noise signed vector based on a sign word i of the binary Gray code and a noise signed vector based on a sign word u adjacent to the sign word i and different from the sign word i only in a predetermined bit position v is used in such a manner that a sign word u′ which is next to reverse the bit position v on the Gray code sequence can express a change ΔGu′ from the noise signed vector by utilizing the fact that the sign word u′ differs from the sign word u only in one bit position w excluding the bit position v. Thus, calculation is simplified, increasing the vector search speed.
Type: Grant
Filed: March 26, 1998
Date of Patent: December 9, 2008
Assignee: Sony Corporation
Inventors: Yuji Maeda, Shuichi Maeda
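The general idea behind a Gray-code search can be shown with a short Python sketch. It covers only the basic property that consecutive Gray-code sign words differ in one bit, so the combined vector is updated by one basis vector instead of being recomputed; it does not reproduce the patent's further reuse of the change ΔGu. The function name `gray_code_search`, the squared-error criterion, and the plain-list vectors are assumptions made for illustration.

```python
def gray_code_search(basis, target):
    """Search all 2^M sign patterns for the signed sum of basis vectors closest to target.

    Sign words are visited in binary Gray-code order, so each step flips
    exactly one sign and costs one vector update.
    """
    m, n = len(basis), len(target)
    signs = [1] * m  # sign word 0: every factor is +1
    combined = [sum(b[k] for b in basis) for k in range(n)]

    def sq_error(vec):
        return sum((t - c) ** 2 for t, c in zip(target, vec))

    best_err, best_signs = sq_error(combined), signs[:]
    for i in range(1, 2 ** m):
        # Between Gray(i-1) and Gray(i) the flipped bit is the lowest set bit of i.
        v = (i & -i).bit_length() - 1
        signs[v] = -signs[v]
        # Flipping a sign changes the sum by twice the newly signed basis vector.
        for k in range(n):
            combined[k] += 2 * signs[v] * basis[v][k]
        err = sq_error(combined)
        if err < best_err:
            best_err, best_signs = err, signs[:]
    return best_signs, best_err
```

Each step costs one vector addition rather than M, which is the source of the speed-up that Gray-code ordering offers in this simplified setting.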
-
Patent number: 7440897
Abstract: In an embodiment, a lattice of phone strings in an input communication of a user may be recognized, wherein the lattice may represent a distribution over the phone strings. Morphemes in the input communication of the user may be detected using the recognized lattice. Task-type classification decisions may be made based on the detected morphemes in the input communication of the user.
Type: Grant
Filed: May 27, 2006
Date of Patent: October 21, 2008
Assignee: AT&T Corp.
Inventors: Allen Louis Gorin, Dijana Petrovska-Delacretaz, Giuseppe Riccardi, Jeremy Huntley Wright
-
Patent number: 7433820
Abstract: A system, method and program storage device implementing a method for modeling a data generating process, wherein the modeling comprises observing a data sequence comprising irregularly sampled data, obtaining an observation sequence based on the observed data sequence, assigning a time index sequence to the data sequence, obtaining a hidden state sequence of the data sequence, and decoding the data sequence based on a combination of the time index sequence and the hidden state sequence to model the data sequence. The method further comprises assigning a probability distribution over time stamp values of the observation sequence, wherein the decoding comprises using a Hidden Markov Model. The method further comprises using an expectation maximization methodology to learn the Hidden Markov Model.
Type: Grant
Filed: May 12, 2004
Date of Patent: October 7, 2008
Assignee: International Business Machines Corporation
Inventors: Ashutosh Garg, Sreeram V. Balakrishnan, Shivakumar Vaithyanathan
-
Patent number: 7433822
Abstract: At an audio source, pause information is added to audio data, the combination of which is subsequently packetized. The resulting packets are transmitted to an audio destination via a network in which different packets may be subjected to varying levels of delay. At the audio destination, the pause information may be used to insert pauses at appropriate times to accommodate the occurrence of delays in packet delivery. In one embodiment, pauses are inserted based on a hierarchy of pause types. During pauses, audio filler information may be injected. In this manner, the effects of variable network delays upon reconstructed audio may be mitigated.
Type: Grant
Filed: April 25, 2005
Date of Patent: October 7, 2008
Assignee: Research In Motion Limited
Inventors: Dale R. Buchholz, Bashar Jano, Ira Gerson
-
Patent number: 7421386
Abstract: A lexicon stored on a computer readable medium and used by language processing systems. The lexicon can store word information in a plurality of data fields associated with each entered word. The data fields can include information on spelling and grammar, parts of speech, steps by which the entered word can be transformed into another word, a word description, and a segmentation for a compound word. Information that cannot be stored in the lexicon can be stored in an intermediate indexes table. Associated methods of constructing, updating and using the lexicon are introduced.
Type: Grant
Filed: March 19, 2004
Date of Patent: September 2, 2008
Assignee: Microsoft Corporation
Inventors: Kevin R. Powell, Andrea Jessee, Douglas W. Potter
-
Patent number: 7415414
Abstract: Techniques are provided for determining and using interaction models. Discourse functions, prosodic features and turn information are determined from the speech information in a training corpus. Statistics, decision trees, rules and/or various other methods are used to determine a predictive interaction model based on the discourse functions, the prosodic features and the turn information. Predictive interaction models are optionally determined for individual users, genres, languages and/or other characteristics of the speech information. The predictive interaction model is useable to predict turns in a dialogue based on the discourse functions and prosodic features identified in the speech information. Speech information is presented and/or received based on the predictive interaction model.
Type: Grant
Filed: March 23, 2004
Date of Patent: August 19, 2008
Assignee: Fuji Xerox Co., Ltd.
Inventors: Misty Azara, Livia Polanyi, Giovanni L. Thione, Martin H Van Den Berg
-
Patent number: 7389232
Abstract: A portable communication device and learning tool for use by speech impaired individuals or monolinguistic individuals is provided by the present invention. The device is foldable for convenient carrying and storage. A method of using the communication device and learning tool is also provided.
Type: Grant
Filed: June 27, 2003
Date of Patent: June 17, 2008
Inventors: Jeanne Bedford, Suzanne Hasko
-
Patent number: 7389229
Abstract: A unified clustering tree (500) generates phoneme clusters based on an input sequence of phonemes. The number of possible clusters is significantly less than the number of possible combinations of input phonemes. Nodes (510, 511) in the unified clustering tree are arranged into levels such that the clustering tree generates clusters for multiple speech recognition models. Models that correspond to higher levels in the unified clustering tree are coarse models relative to more fine-grain models at lower levels of the clustering tree.
Type: Grant
Filed: October 16, 2003
Date of Patent: June 17, 2008
Assignee: BBN Technologies Corp.
Inventors: Jayadev Billa, Daniel Kiecza, Francis G. Kubala
-
Patent number: 7383177
Abstract: High-quality speech is reproduced with a small amount of data in speech coding and decoding that perform compression coding and decoding of a speech signal to a digital signal. In a speech coding method according to code-excited linear prediction (CELP) speech coding, the noise level of the speech in the coding period concerned is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and various excitation codebooks are used based on the evaluation result.
Type: Grant
Filed: July 26, 2005
Date of Patent: June 3, 2008
Assignee: Mitsubishi Denki Kabushiki Kaisha
Inventor: Tadashi Yamaura
-
Patent number: 7379873
Abstract: Voice synthesis unit data stored in a phoneme database 10 is selected by a voice synthesis unit selector 12 in accordance with MIDI information stored in a performance data storage unit 11. Characteristic parameters are derived from the selected voice synthesis unit data. A characteristic parameter correction unit 21 corrects the characteristic parameters based on pitch information, etc. A spectrum envelope generating unit 23 generates a spectrum envelope in accordance with the corrected characteristic parameter. A timbre transformation unit 25 changes timbre by correcting the characteristic parameters in accordance with timbre transformation parameters in a time axis. Timbres in the same song position can be transformed into different arbitrary timbres respectively; therefore, the synthesized singing voice will be rich in variety and reality.
Type: Grant
Filed: July 3, 2003
Date of Patent: May 27, 2008
Assignee: Yamaha Corporation
Inventor: Hideki Kemmochi
-
Patent number: 7373291
Abstract: The present invention provides a new source of information, linguistic models, to improve the accuracy of mathematical recognition. Specifically, the present invention is an extension of linguistic methods to the mathematical domain thereby providing recognition of the artificial language of mathematics in a way analogous to natural language recognition. Parse trees are the basic units of the mathematical language, and a linguistic model for mathematics is a method for assigning a linguistic score to each parse tree. The models are generally created by taking a large body of known text and counting the occurrence of various linguistic events such as word bigrams in that body. The raw counts are modified by smoothing and other algorithms before taking their place as probabilities in the model.
Type: Grant
Filed: February 18, 2003
Date of Patent: May 13, 2008
Assignee: Mathsoft Engineering & Education, Inc.
Inventor: Peter F. Garst
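The counting-and-smoothing step described in this abstract can be sketched in Python as follows. The abstract does not name a smoothing method, so additive (add-alpha) smoothing is assumed here purely for illustration. The sketch also works on flat token sequences rather than parse trees, and the function name `train_bigram_model` and the parameter `alpha` are hypothetical.

```python
from collections import Counter


def train_bigram_model(sequences, alpha=1.0):
    """Count bigrams in tokenized expressions and return a smoothed probability function.

    sequences: iterable of token lists, e.g. [["x", "+", "y"], ...]
    alpha:     additive smoothing constant (an assumption of this sketch)
    """
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        vocab.update(seq)
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1         # times prev appears as a left context
            bigram_counts[(prev, cur)] += 1   # raw bigram count
    vocab_size = len(vocab)

    def prob(prev, cur):
        # Smoothed estimate of P(cur | prev); unseen bigrams get non-zero mass.
        return (bigram_counts[(prev, cur)] + alpha) / (context_counts[prev] + alpha * vocab_size)

    return prob
```

With this estimate, the probabilities of all vocabulary tokens following a given context sum to one, and a bigram that never occurred in the training text still receives a small score instead of zero.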
-
Patent number: 7366659
Abstract: Time-scaled sound signals (i.e., sounds output at differing speeds) are generated by mixing weighted time- and frequency-domain processed signals, the former signal generally representing speech-based signals while the latter represents music-based signals. The weights applied to each type of signal may be determined by a scaling factor, which in turn is related to the desired speed at which a listener desires to hear a sound signal. In one example of the invention, only stationary signal portions of an input sound signal are used to generate time-scaled processed signals. An adaptive frame-size may also be used to pre-process the separate signals prior to being weighted, which at least decreases the amount of unwanted reverberative sound qualities in a resulting sound signal. Together, techniques envisioned by the present invention produce improved, speed-adjusted sound signals.
Type: Grant
Filed: June 7, 2002
Date of Patent: April 29, 2008
Assignee: Lucent Technologies Inc.
Inventor: Walter Etter
-
Patent number: 7366660
Abstract: The present invention relates to a transceiver which provides a high-quality decoded voice. A mobile telephone 1011 encodes voice data, and outputs the encoded voice data. Furthermore, the mobile telephone 1011 learns quality-enhancement data which improves the quality of a voice output from a mobile telephone 1012, based on voice data used in past learning and newly input voice data, thereby transmitting the encoded voice data and quality-enhancement data. The mobile telephone 1012 receives the encoded voice data transmitted from the mobile telephone 1011, and selects quality-enhancement data correspondingly associated with a telephone number of the mobile telephone 1011. The mobile telephone 1012 decodes the received encoded voice data based on the selected quality-enhancement data. The present invention is applied to a mobile telephone that transmits and receives voices.
Type: Grant
Filed: June 20, 2002
Date of Patent: April 29, 2008
Assignee: Sony Corporation
Inventors: Tetsujiro Kondo, Masaaki Hattori, Tsutomu Watanabe, Hiroto Kimura
-
Patent number: 7363220
Abstract: High-quality speech is reproduced with a small amount of data in speech coding and decoding that perform compression coding and decoding of a speech signal to a digital signal.
Type: Grant
Filed: March 28, 2005
Date of Patent: April 22, 2008
Assignee: Mitsubishi Denki Kabushiki Kaisha
Inventor: Tadashi Yamaura
-
Patent number: 7359853
Abstract: An implementation of the present invention for 4800 bits per second comprises a voice encoder and decoder method and system that uses voice excitation, eliminating the voiced/unvoiced pitch tracking, and the first formant up to 2400 Hertz, does not use pulse code modulation encoding, but uses the zero crossings only of the first formant, dividing by two and sampling at 2400 Hertz. The resulting combination uses half of the bit rate for excitation and the remainder for short term spectrum analysis. The spectrum is updated each 20.8 milliseconds using 50 bits per frame. The decoder extracts the excitation, multiplies it by two and uses a Hanning modified sawtooth and spectral flattening to excite the spectrum generator. This waveform produces both even and odd harmonics for both periodic (voiced) and aperiodic (unvoiced) frequencies and gives naturalness to all languages and speakers.
Type: Grant
Filed: February 11, 2005
Date of Patent: April 15, 2008
Inventor: Clyde Holmes
-
Patent number: 7353171
Abstract: Methods and apparatus to operate an audience metering device with voice commands are described herein. In an example method, at least one of a television program audio signal or a voice command from an audience member is transduced into an audio input signal. Based on the audio input signal and a television audio line signal, a residual audio signal is generated. One or more vectors from the residual audio signal are extracted. Based on the one or more vectors extracted from the residual audio signal, the voice command is identified.
Type: Grant
Filed: March 14, 2006
Date of Patent: April 1, 2008
Assignee: Nielsen Media Research, Inc.
Inventor: Venugopal Srinivasan
-
Patent number: 7346509
Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
Type: Grant
Filed: September 26, 2003
Date of Patent: March 18, 2008
Assignee: Callminer, Inc.
Inventor: Jeffrey A. Gallino
-
Patent number: 7346510
Abstract: A method and computer-readable medium are provided that determine predicted acoustic values for a sequence of hypothesized speech units using modeled articulatory or VTR dynamics values and using the modeled relationship between the articulatory (or VTR) and acoustic values for the same speech events. Under one embodiment, the articulatory (or VTR) dynamics value depends on articulatory dynamics values at previous time frames and articulation targets. In another embodiment, the articulatory dynamics value depends in part on an acoustic environment value such as noise or distortion. In a third embodiment, a time constant that defines the articulatory dynamics value is trained using a variety of articulation styles. By modeling the articulatory or VTR dynamics value in these manners, hyper-articulated, hypo-articulated, fast, and slow speech can be better recognized and the requirement for the training data can be reduced.
Type: Grant
Filed: March 19, 2002
Date of Patent: March 18, 2008
Assignee: Microsoft Corporation
Inventor: Li Deng
-
Patent number: 7346517
Abstract: Many compressed audio or video frames contain silence (if audio), or a blank image (if video); these essentially information content free (e.g. silent if audio or blank if video) frames can be both detected whilst still in compressed form and then used to carry the additional data. In an MPEG implementation, subbands associated with silent frames are rendered digitally silent and then used to carry PAD (Programme Associated Data).
Type: Grant
Filed: February 8, 2002
Date of Patent: March 18, 2008
Assignee: Radioscape Limited
Inventors: Gavin Robert Ferris, Michael Vincent Woodward