Patents Examined by David Kovacek
  • Patent number: 8898055
    Abstract: A voice quality conversion device including: a target vowel vocal tract information hold unit holding target vowel vocal tract information of each vowel indicating target voice quality; a vowel conversion unit (i) receiving vocal tract information with phoneme boundary information of the speech including information of phonemes and phoneme durations, (ii) approximating a temporal change of vocal tract information of a vowel in the vocal tract information with phoneme boundary information applying a first function, (iii) approximating a temporal change of vocal tract information of the same vowel held in the target vowel vocal tract information hold unit applying a second function, (iv) calculating a third function by combining the first function with the second function, and (v) converting the vocal tract information of the vowel applying the third function; and a synthesis unit synthesizing a speech using the converted information.
    Type: Grant
    Filed: May 8, 2008
    Date of Patent: November 25, 2014
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Yoshifumi Hirose, Takahiro Kamai, Yumiko Kato
  • Patent number: 8856003
    Abstract: A method for dual channel monitoring on a radio device as provided enables efficient use of communication network resources. The method includes receiving at the radio device a first speech signal over a first channel, while simultaneously receiving at the radio device a second speech signal over a second channel. The first speech signal is then processed at the radio device to generate a text transcription of the first speech signal, and the text transcription of the first speech signal is displayed on a display screen of the radio device. An audible voice signal is then produced from a speaker that is operatively connected to the radio device simultaneously with displaying the text transcription of the first speech signal.
    Type: Grant
    Filed: April 30, 2008
    Date of Patent: October 7, 2014
    Assignee: Motorola Solutions, Inc.
    Inventors: Wei Tuck Chong, Swee Aun Khor, Ing Boh Wong
  • Patent number: 8849662
    Abstract: A method and a system for segmenting phonemes from voice signals. A method for accurately segmenting phonemes, in which a histogram showing a peak distribution corresponding to an order is formed by using a high order concept, and a boundary indicating a starting point and an ending point of each phoneme is determined by calculating a peak statistic based on the histogram. The phoneme segmentation method can remarkably reduce an amount of calculation, and has an advantage of being applied to sound signal systems which perform sound coding, sound recognition, sound synthesizing, sound reinforcement, etc.
    Type: Grant
    Filed: December 28, 2006
    Date of Patent: September 30, 2014
    Assignee: Samsung Electronics Co., Ltd
    Inventor: Hyun-Soo Kim
  • Patent number: 8831943
    Abstract: A language model learning system for learning a language model on an identifiable basis relating to a word error rate used in speech recognition. The language model learning system (10) includes a recognizing device (101) for recognizing an input speech by using a sound model and a language model and outputting the recognized word sequence as the recognition result, a reliability degree computing device (103) for computing the degree of reliability of the word sequence, and a language model parameter updating device (104) for updating the parameters of the language model by using the degree of reliability. The language model parameter updating device updates the parameters of the language model to heighten the degree of reliability of the word sequence the computed degree of reliability of which is low when the recognizing device recognizes by using the updated language model and the reliability degree computing device computes the degree of reliability.
    Type: Grant
    Filed: May 30, 2007
    Date of Patent: September 9, 2014
    Assignee: NEC Corporation
    Inventors: Tadashi Emori, Yoshifumi Onishi
  • Patent number: 8812309
    Abstract: A method for suppressing ambient noise using multiple audio signals may include providing at least two audio signals captured by at least two electro-acoustic transducers. The at least two audio signals may include desired audio and ambient noise. The method may also include performing beamforming on the at least two audio signals in order to obtain a desired audio reference signal that is separate from a noise reference signal.
    Type: Grant
    Filed: November 25, 2008
    Date of Patent: August 19, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Dinesh Ramakrishnan, Song Wang
  • Patent number: 8744847
    Abstract: Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify phones or speech sounds spoken by the key child, independent of content. The number and type of phones is analyzed to automatically assess the key child's expressive language development. The assessment can result in a standard score, an estimated developmental age, or an estimated mean length of utterance.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: June 3, 2014
    Assignee: LENA Foundation
    Inventors: Terrance Paul, Dongxin Xu, Jeffrey A. Richards
  • Patent number: 8738374
    Abstract: Described is a speech-to-text conversion system and method that provides secure, real-time and high-accuracy conversion of general-quality speech into text. The system is designed to interface with external devices and services, providing a simple and convenient manner to transcribe audio that may be stored elsewhere such as a wireless phone's voice mail, or occurring between two or more parties such as a conference call. The first step in the system's process ensures secure and private transcription by separating an audio stream into many audio shreds, each of which has duration of only a few seconds and cannot reveal the context of the conversation. A workforce of geographically distributed transcription agents who transcribe the audio shreds is able to generate transcription in real time, with many agents working in parallel on a single conversation. No one agent (or group of agents) receives a sufficient number of audio shreds to reconstruct the context of any conversation.
    Type: Grant
    Filed: May 22, 2009
    Date of Patent: May 27, 2014
    Assignee: j2 Global Communications, Inc.
    Inventor: Jon Jaroker
  • Patent number: 8738373
    Abstract: In a signal processing method and apparatus, a predetermined correcting signal having a same frame length as a second frame signal in which predetermined processing is performed to a frequency spectrum of a first frame signal of a frame length to which a predetermined window function is performed and is converted into a time domain is adjusted so that amplitudes of both ends of the correcting signal become equal to amplitudes of both or one of frame ends of the second frame signal, and a corrected frame signal is obtained by subtracting an adjusted correcting signal from the second frame signal.
    Type: Grant
    Filed: December 13, 2006
    Date of Patent: May 27, 2014
    Assignee: Fujitsu Limited
    Inventors: Takeshi Otani, Masanao Suzuki
  • Patent number: 8731925
    Abstract: The present invention can include a speech enrollment system including an ordered stack of grammars and a recognition engine. The ordered stack of grammars can include an application grammars layer, a confusable grammar layer, a personal grammar layer, a phrase enrolled grammar layer, and an enrollment grammar layer. The recognition engine can return recognition results for speech input by processing the input using the ordered stack of grammars. The processing can occur from the topmost layer in the stack to the bottommost layer in the stack. Each layer in the stack can include exit criteria based upon a defined condition. When the exit criteria are satisfied, a result can be returned based upon that layer and lower layers of the ordered stack can be ignored.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: May 20, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: William V. Da Palma, Brien H. Muschett
  • Patent number: 8725501
    Abstract: There is disclosed an audio decoding device capable of improving audio quality of a decoded signal by considering the energy change of a past signal in erasure concealment processing. In this device, an energy change calculation unit (143) calculates an average energy of an audio source signal of one-pitch cycle from the end of the ACB vector outputted from an adaptive codebook (106). Moreover, the energy change calculation unit (143) calculates a ratio of the average energy of the current sub-frame and the sub-frame immediately before and outputs the ratio to an ACB gain generation unit (135). The ACB gain generation unit (135) outputs a conceal processing ACB gain defined by the ACB gain decoded in the past or information on the energy change ratio outputted from the energy change calculation unit (143) to a multiplier (132).
    Type: Grant
    Filed: July 14, 2005
    Date of Patent: May 13, 2014
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Ehara
  • Patent number: 8725517
    Abstract: A dialog manager and spoken dialog service having a dialog manager generated according to a method comprising selecting a top level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog and testing and deploying the spoken dialog service using the selected top level flow controller, selected reusable subdialogs and developed subdialogs. The dialog manager is capable of handling context shifts in a spoken dialog with a user. Application dependencies are established in the top level flow controller, thus enabling the subdialogs to be reusable and to be capable of managing context shifts and mixed initiative dialogs.
    Type: Grant
    Filed: June 25, 2013
    Date of Patent: May 13, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis
  • Patent number: 8700404
    Abstract: Disclosed herein is a system, method and computer readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment of the invention is a method for classifying utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance, generating a semantic and syntactic graph associated with the received utterance, extracting all n-grams as features from the generated semantic and syntactic graph, and classifying the utterance. Classifying the utterance may be performed in any number of ways, such as using the extracted n-grams, syntactic and semantic graphs, or writing rules.
    Type: Grant
    Filed: August 27, 2005
    Date of Patent: April 15, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
  • Patent number: 8700396
    Abstract: This document generally describes computer technologies relating to generating speech data collection prompts, such as textual scripts and/or textual scenarios. Speech data collection prompts for a particular language can be generated based on a variety of factors, including the frequency with which linguistic elements (e.g., phonemes, syllables, words, phrases) in the particular language occur in one or more corpora of textual information associated with the particular language. Textual prompts can also and/or alternatively be generated based on statistics for previously recorded speech data.
    Type: Grant
    Filed: October 8, 2012
    Date of Patent: April 15, 2014
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Eugene Weinstein
  • Patent number: 8682681
    Abstract: An audio decoder has an arithmetic decoder for providing decoded spectral values on the basis of an arithmetically-encoded representation and a frequency-domain-to-time-domain converter for providing a time-domain audio representation. The arithmetic decoder selects a mapping rule describing a mapping of a code value onto a symbol code in dependence on a context state described by a numeric current context value which is determined in dependence on previously decoded spectral values. The arithmetic decoder obtains a plurality of context subregion values on the basis of previously decoded spectral values and derives a numeric current context value associated with one or more spectral values to be decoded in dependence on stored context subregion values. The arithmetic decoder computes the norm of a vector formed by a plurality of previously decoded spectral values in order to obtain a common context subregion value. An audio encoder uses a similar concept.
    Type: Grant
    Filed: July 12, 2012
    Date of Patent: March 25, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Guillaume Fuchs, Markus Multrus, Nikolaus Rettelbach, Vignesh Subbaraman, Oliver Weiss, Marc Gayer, Patrick Warmbold, Christian Griebel
  • Patent number: 8676588
    Abstract: Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage.
    Type: Grant
    Filed: May 22, 2009
    Date of Patent: March 18, 2014
    Assignee: Accenture Global Services Limited
    Inventors: Thomas J. Ryan, Biji K. Janan
  • Patent number: 8660840
    Abstract: A method and apparatus for predictively quantizing voiced speech includes a parameter generator and a quantizer. The parameter generator is configured to extract parameters from frames of predictive speech such as voiced speech, and to transform the extracted information to a frequency-domain representation. The quantizer is configured to subtract a weighted sum of the parameters for previous frames from the parameter for the current frame. The quantizer is configured to quantize the difference value. A prototype extractor may be added to first extract a pitch period prototype to be processed by the parameter generator.
    Type: Grant
    Filed: August 12, 2008
    Date of Patent: February 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Arasanipalai K. Ananthapadmanabhan, Sarath Manjunath, Pengjun Huang, Eddie-Lun Tik Choy, Andrew P. Dejaco
  • Patent number: 8639509
    Abstract: In a confidence computing method and system, a processor may interpret speech signals as a text string or directly receive a text string as input, generate a syntactical parse tree representing the interpreted string and including a plurality of sub-trees which each represents a corresponding section of the interpreted text string, determine for each sub-tree whether the sub-tree is accurate, obtain replacement speech signals for each sub-tree determined to be inaccurate, and provide output based on corresponding text string sections of at least one sub-tree determined to be accurate.
    Type: Grant
    Filed: July 27, 2007
    Date of Patent: January 28, 2014
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Feng Lin, Zhe Feng
  • Patent number: 8630859
    Abstract: A method of developing a dialog manager for a spoken dialog service is disclosed. The method comprises selecting a top level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog and testing and deploying the spoken dialog service using the selected top level flow controller, selected reusable subdialogs and developed subdialogs. The method enables a developer to create a dialog manager that has individual reusable dialog modules that operate independent of the dialog model of the other modules. Application dependencies and context shifts are defined independent of the subdialogs to enable them to be reusable. The spoken dialog server manages context shifts in the spoken dialog by transitioning between dialog modules and subdialog modules.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: January 14, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe DiFabbrizio, Charles Alfred Lewis
  • Patent number: 8583440
    Abstract: An apparatus and method for providing visual indication of character ambiguity and ensuing reduction of such ambiguity during text entry are described. An application text entry field is presented in a display screen, into which the user enters text by means of a reduced keyboard and a disambiguating system. The default or most likely word construct for the current key sequence may be presented at the insertion point of the text entry field. An indication of ambiguity is presented in the display screen to communicate to the user the possible ambiguous characters associated with each key. A word choice list field may also be present to display at least one word construct matching the current key sequence.
    Type: Grant
    Filed: August 26, 2005
    Date of Patent: November 12, 2013
    Assignee: Tegic Communications, Inc.
    Inventors: James Stephanick, Ethan R. Bradford, Pim Van Meurs, Richard Eyraud, Michael R. Longé
  • Patent number: 8560326
    Abstract: Techniques for employing improved prompts in a speech-to-speech translation system are disclosed. By way of example, a technique for use in indicating a dialogue turn in an automated speech-to-speech translation system comprises the following steps/operations. One or more text-based scripts are obtained. The one or more text-based scripts are synthesizable into one or more voice prompts. At least one of the one or more voice prompts is synthesized for playback from at least one of the one or more text-based scripts, the at least one synthesized voice prompt comprising an audible message in a language understandable to a speaker interacting with the speech-to-speech translation system, the audible message indicating a dialogue turn in the automated speech-to-speech translation system.
    Type: Grant
    Filed: May 5, 2008
    Date of Patent: October 15, 2013
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Liang Gu, Fu-Hua Liu
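
The quantization step summarized in the abstract of patent 8660840 above (subtract a weighted sum of the parameters for previous frames from the parameter for the current frame, then quantize the difference) can be illustrated with a minimal Python sketch. This is not the patented method: the function name, the weights, the step size, the uniform scalar quantizer, and the choice to predict from previously reconstructed values are all assumptions made for illustration.

```python
def predictive_quantize(params, weights, step):
    """Quantize per-frame parameters as differences from a weighted prediction.

    params  -- one scalar parameter per frame (hypothetical input)
    weights -- prediction weights, most recent previous frame first (assumed)
    step    -- step size of a uniform scalar quantizer (assumed)
    """
    # Reconstructed parameters of previous frames, most recent first.
    # Starting from zeros is an assumption for the first frames.
    history = [0.0] * len(weights)
    indices, reconstructed = [], []
    for p in params:
        # Weighted sum of the parameters for previous frames.
        prediction = sum(w * h for w, h in zip(weights, history))
        # Difference value that is actually quantized.
        diff = p - prediction
        idx = round(diff / step)
        # Value a decoder would rebuild from the same history and index.
        rec = prediction + idx * step
        indices.append(idx)
        reconstructed.append(rec)
        history = [rec] + history[:-1]
    return indices, reconstructed


indices, reconstructed = predictive_quantize([1.0, 1.2, 0.9], [0.5], 0.1)
```

Because each frame is coded relative to a prediction, slowly varying parameters tend to produce small difference values; with this uniform quantizer the reconstruction error per frame stays within half a step.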