Patents Examined by Abul K. Azad
  • Patent number: 8290774
    Abstract: An improved ε-removal method is disclosed that computes for any input weighted automaton A with ε-transitions an equivalent weighted automaton B with no ε-transitions. The method comprises two main steps. The first step comprises computing for each state “p” of the automaton A its ε-closure. The second step in the method comprises modifying the outgoing transitions of each state “p” by removing those labeled with ε. The method next comprises adding to the set of transitions leaving the state “p” non-ε-transitions leaving each state “q” in the set of states reachable from “p” via a path labeled with ε, with their weights pre-⊗-multiplied by the ε-distance from state “p” to state “q” in the automaton A. State “p” is a final state if some state “q” within the set of states reachable from “p” via a path labeled with ε is final, and the final weight is $\rho[p] = \bigoplus_{q \in \varepsilon[p] \cap F} \left( d[p, q] \otimes \rho[q] \right)$.
    Type: Grant
    Filed: July 20, 2001
    Date of Patent: October 16, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Mehryar Mohri
  • Patent number: 7464030
    Abstract: Each of the M basic vectors in a noise code book 260 is multiplied by a factor ±1 in a sign adder 270 and combined in an adder 280 to create 2^M noise signed vectors. The characteristic of the binary Gray code is utilized as follows. A change ΔGu obtained between a noise signed vector based on a sign word i of the binary Gray code and a noise signed vector based on a sign word u adjacent to the sign word i and different from the sign word i only in a predetermined bit position v is used in such a manner that a sign word u′ which is next to reverse the bit position v on the Gray code sequence can express a change ΔGu′ from the noise signed vector by utilizing the fact that the sign word u′ differs from the sign word u only in one bit position w excluding the bit position v. Thus, calculation is simplified, increasing the vector search speed.
    Type: Grant
    Filed: March 26, 1998
    Date of Patent: December 9, 2008
    Assignee: Sony Corporation
    Inventors: Yuji Maeda, Shuichi Maeda
  • Patent number: 7440897
    Abstract: In an embodiment, a lattice of phone strings in an input communication of a user may be recognized, wherein the lattice may represent a distribution over the phone strings. Morphemes in the input communication of the user may be detected using the recognized lattice. Task-type classification decisions may be made based on the detected morphemes in the input communication of the user.
    Type: Grant
    Filed: May 27, 2006
    Date of Patent: October 21, 2008
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Dijana Petrovska-Delacretaz, Giuseppe Riccardi, Jeremy Huntley Wright
  • Patent number: 7433820
    Abstract: A system, method and program storage device implementing a method for modeling a data generating process, wherein the modeling comprises observing a data sequence comprising irregularly sampled data, obtaining an observation sequence based on the observed data sequence, assigning a time index sequence to the data sequence, obtaining a hidden state sequence of the data sequence, and decoding the data sequence based on a combination of the time index sequence and the hidden state sequence to model the data sequence. The method further comprises assigning a probability distribution over time stamp values of the observation sequence, wherein the decoding comprises using a Hidden Markov Model. The method further comprises using an expectation maximization methodology to learn the Hidden Markov Model.
    Type: Grant
    Filed: May 12, 2004
    Date of Patent: October 7, 2008
    Assignee: International Business Machines Corporation
    Inventors: Ashutosh Garg, Sreeram V. Balakrishnan, Shivakumar Vaithyanathan
  • Patent number: 7433822
    Abstract: At an audio source, pause information is added to audio data, the combination of which is subsequently packetized. The resulting packets are transmitted to an audio destination via a network in which different packets may be subjected to varying levels of delay. At the audio destination, the pause information may be used to insert pauses at appropriate times to accommodate the occurrence of delays in packet delivery. In one embodiment, pauses are inserted based on a hierarchy of pause types. During pauses, audio filler information may be injected. In this manner, the effects of variable network delays upon reconstructed audio may be mitigated.
    Type: Grant
    Filed: April 25, 2005
    Date of Patent: October 7, 2008
    Assignee: Research In Motion Limited
    Inventors: Dale R. Buchholz, Bashar Jano, Ira Gerson
  • Patent number: 7421386
    Abstract: A lexicon stored on a computer readable medium and used by language processing systems. The lexicon can store word information in a plurality of data fields associated with each entered word. The data fields can include information on spelling and grammar, parts of speech, steps by which the entered word can be transformed into another word, a word description, and a segmentation for a compound word. Information that cannot be stored in the lexicon can be stored in an intermediate indexes table. Associated methods of constructing, updating and using the lexicon are introduced.
    Type: Grant
    Filed: March 19, 2004
    Date of Patent: September 2, 2008
    Assignee: Microsoft Corporation
    Inventors: Kevin R. Powell, Andrea Jessee, Douglas W. Potter
  • Patent number: 7415414
    Abstract: Techniques are provided for determining and using interaction models. Discourse functions, prosodic features and turn information are determined from the speech information in a training corpus. Statistics, decision trees, rules and/or various other methods are used to determine a predictive interaction model based on the discourse functions, the prosodic features and the turn information. Predictive interaction models are optionally determined for individual users, genres, languages and/or other characteristics of the speech information. The predictive interaction model is useable to predict turns in a dialogue based on the discourse functions and prosodic features identified in the speech information. Speech information is presented and/or received based on the predictive interaction model.
    Type: Grant
    Filed: March 23, 2004
    Date of Patent: August 19, 2008
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Misty Azara, Livia Polanyi, Giovanni L. Thione, Martin H Van Den Berg
  • Patent number: 7389232
    Abstract: A portable communication device and learning tool for use by speech impaired individuals or monolinguistic individuals is provided by the present invention. The device is foldable for convenient carrying and storage. A method of using the communication device and learning tool is also provided.
    Type: Grant
    Filed: June 27, 2003
    Date of Patent: June 17, 2008
    Inventors: Jeanne Bedford, Suzanne Hasko
  • Patent number: 7389229
    Abstract: A unified clustering tree (500) generates phoneme clusters based on an input sequence of phonemes. The number of possible clusters is significantly less than the number of possible combinations of input phonemes. Nodes (510, 511) in the unified clustering tree are arranged into levels such that the clustering tree generates clusters for multiple speech recognition models. Models that correspond to higher levels in the unified clustering tree are coarse models relative to more fine-grain models at lower levels of the clustering tree.
    Type: Grant
    Filed: October 16, 2003
    Date of Patent: June 17, 2008
    Assignee: BBN Technologies Corp.
    Inventors: Jayadev Billa, Daniel Kiecza, Francis G. Kubala
  • Patent number: 7383177
    Abstract: High-quality speech is reproduced with a small amount of data in speech coding and decoding for performing compression coding and decoding of a speech signal to a digital signal. In a speech coding method according to code-excited linear prediction (CELP) speech coding, the noise level of the speech in the coding period concerned is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and various excitation codebooks are used based on the evaluation result.
    Type: Grant
    Filed: July 26, 2005
    Date of Patent: June 3, 2008
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventor: Tadashi Yamaura
  • Patent number: 7379873
    Abstract: Voice synthesis unit data stored in a phoneme database 10 is selected by a voice synthesis unit selector 12 in accordance with MIDI information stored in a performance data storage unit 11. Characteristic parameters are derived from the selected voice synthesis unit data. A characteristic parameter correction unit 21 corrects the characteristic parameters based on pitch information, etc. A spectrum envelope generating unit 23 generates a spectrum envelope in accordance with the corrected characteristic parameter. A timbre transformation unit 25 changes timbre by correcting the characteristic parameters in accordance with timbre transformation parameters in a time axis. Timbres in the same song position can be transformed into different arbitrary timbres respectively; therefore, the synthesized singing voice will be rich in variety and reality.
    Type: Grant
    Filed: July 3, 2003
    Date of Patent: May 27, 2008
    Assignee: Yamaha Corporation
    Inventor: Hideki Kemmochi
  • Patent number: 7373291
    Abstract: The present invention provides a new source of information, linguistic models, to improve the accuracy of mathematical recognition. Specifically, the present invention is an extension of linguistic methods to the mathematical domain thereby providing recognition of the artificial language of mathematics in a way analogous to natural language recognition. Parse trees are the basic units of the mathematical language, and a linguistic model for mathematics is a method for assigning a linguistic score to each parse tree. The models are generally created by taking a large body of known text and counting the occurrence of various linguistic events such as word bigrams in that body. The raw counts are modified by smoothing and other algorithms before taking their place as probabilities in the model.
    Type: Grant
    Filed: February 18, 2003
    Date of Patent: May 13, 2008
    Assignee: Mathsoft Engineering & Education, Inc.
    Inventor: Peter F. Garst
  • Patent number: 7366659
    Abstract: Time-scaled sound signals (i.e., sounds output at differing speeds) are generated by mixing weighted time- and frequency-domain processed signals, the former generally representing speech-based signals and the latter representing music-based signals. The weights applied to each type of signal may be determined by a scaling factor, which in turn is related to the speed at which a listener desires to hear a sound signal. In one example of the invention, only stationary signal portions of an input sound signal are used to generate time-scaled processed signals. An adaptive frame size may also be used to pre-process the separate signals prior to being weighted, which at least decreases the amount of unwanted reverberative sound qualities in a resulting sound signal. Together, techniques envisioned by the present invention produce improved, speed-adjusted sound signals.
    Type: Grant
    Filed: June 7, 2002
    Date of Patent: April 29, 2008
    Assignee: Lucent Technologies Inc.
    Inventor: Walter Etter
  • Patent number: 7366660
    Abstract: The present invention relates to a transceiver which provides a high-quality decoded voice. A mobile telephone 1011 encodes voice data, and outputs the encoded voice data. Furthermore, the mobile telephone 1011 learns quality-enhancement data which improves the quality of a voice output from a mobile telephone 1012, based on voice data used in past learning and newly input voice data, thereby transmitting the encoded voice data and quality-enhancement data. The mobile telephone 1012 receives the encoded voice data transmitted from the mobile telephone 1011, and selects quality-enhancement data correspondingly associated with a telephone number of the mobile telephone 1011. The mobile telephone 1012 decodes the received encoded voice data based on the selected quality-enhancement data. The present invention is applied to a mobile telephone that transmits and receives voices.
    Type: Grant
    Filed: June 20, 2002
    Date of Patent: April 29, 2008
    Assignee: Sony Corporation
    Inventors: Tetsujiro Kondo, Masaaki Hattori, Tsutomu Watanabe, Hiroto Kimura
  • Patent number: 7363220
    Abstract: High-quality speech is reproduced with a small amount of data in speech coding and decoding for performing compression coding and decoding of a speech signal to a digital signal.
    Type: Grant
    Filed: March 28, 2005
    Date of Patent: April 22, 2008
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventor: Tadashi Yamaura
  • Patent number: 7359853
    Abstract: An implementation of the present invention for 4800 bits per second comprises a voice encoder and decoder method and system that uses voice excitation, eliminating the voiced/unvoiced pitch tracking, and the first formant up to 2400 Hertz. It does not use pulse code modulation encoding, but uses only the zero crossings of the first formant, dividing by two and sampling at 2400 Hertz. The resulting combination uses half of the bit rate for excitation and the remainder for short-term spectrum analysis. The spectrum is updated every 20.8 milliseconds using 50 bits per frame. The decoder extracts the excitation, multiplies it by two and uses a Hanning-modified sawtooth and spectral flattening to excite the spectrum generator. This waveform produces both even and odd harmonics for both periodic (voiced) and aperiodic (unvoiced) frequencies and gives naturalness to all languages and speakers.
    Type: Grant
    Filed: February 11, 2005
    Date of Patent: April 15, 2008
    Inventor: Clyde Holmes
  • Patent number: 7353171
    Abstract: Methods and apparatus to operate an audience metering device with voice commands are described herein. In an example method, at least one of a television program audio signal or a voice command from an audience member is transduced into an audio input signal. Based on the audio input signal and a television audio line signal, a residual audio signal is generated. One or more vectors from the residual audio signal are extracted. Based on the one or more vectors extracted from the residual audio signal, the voice command is identified.
    Type: Grant
    Filed: March 14, 2006
    Date of Patent: April 1, 2008
    Assignee: Nielsen Media Research, Inc.
    Inventor: Venugopal Srinivasan
  • Patent number: 7346509
    Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
    Type: Grant
    Filed: September 26, 2003
    Date of Patent: March 18, 2008
    Assignee: Callminer, Inc.
    Inventor: Jeffrey A. Gallino
  • Patent number: 7346510
    Abstract: A method and computer-readable medium are provided that determine predicted acoustic values for a sequence of hypothesized speech units using modeled articulatory or VTR dynamics values and using the modeled relationship between the articulatory (or VTR) and acoustic values for the same speech events. Under one embodiment, the articulatory (or VTR) dynamics value depends on articulatory dynamics values at previous time frames and articulation targets. In another embodiment, the articulatory dynamics value depends in part on an acoustic environment value such as noise or distortion. In a third embodiment, a time constant that defines the articulatory dynamics value is trained using a variety of articulation styles. By modeling the articulatory or VTR dynamics value in these manners, hyper-articulated, hypo-articulated, fast, and slow speech can be better recognized and the requirement for the training data can be reduced.
    Type: Grant
    Filed: March 19, 2002
    Date of Patent: March 18, 2008
    Assignee: Microsoft Corporation
    Inventor: Li Deng
  • Patent number: 7346517
    Abstract: Many compressed audio or video frames contain silence (if audio) or a blank image (if video); these essentially information-free frames (e.g., silent if audio or blank if video) can be detected whilst still in compressed form and then used to carry additional data. In an MPEG implementation, subbands associated with silent frames are rendered digitally silent and then used to carry PAD (Programme Associated Data).
    Type: Grant
    Filed: February 8, 2002
    Date of Patent: March 18, 2008
    Assignee: Radioscape Limited
    Inventors: Gavin Robert Ferris, Michael Vincent Woodward