Subportions Patents (Class 704/254)
  • Patent number: 7508393
    Abstract: A system comprising a plurality of three dimensional artificially animated portraits for performing preprogrammed animations of voice and facial expressions in the form of a scripted dialogue orchestrated by a central source. The system is operable to prepare animations of recorded voice and selected depictions of facial expressions to be transferred to the animated portraits and performed by the animated portraits. The system is operable to combine prepared animations in a scripted dialogue to be performed so as to mimic an interactive conversation.
    Type: Grant
    Filed: June 6, 2006
    Date of Patent: March 24, 2009
    Inventors: Patricia L. Gordon, Robert E. Glaser
  • Publication number: 20090063151
    Abstract: In some aspects, a wordspotter is used to locate occurrences in an audio corpus of each of a set of predetermined subword units, which may be phoneme sequences. To locate a query (e.g., a keyword or phrase) in the audio corpus, constituent subword units in the query are identified, and then the locations of those subword units are determined based on the locations determined earlier by the wordspotter, for example, using a pre-built inverted index that maps subword units to their locations.
    Type: Application
    Filed: August 27, 2008
    Publication date: March 5, 2009
    Applicant: NEXIDIA INC.
    Inventors: Jon A. Arrowood, Robert W. Morris, Mark Finlay, Scott A. Judy
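The inverted-index idea this abstract describes can be sketched in a few lines. The sketch below assumes the wordspotter has already produced `(subword, start_time)` hits over the corpus; the index layout, the `max_gap` threshold, and the phoneme-sequence strings are illustrative assumptions, not the patented implementation.

```python
# Minimal sketch of a subword inverted index, assuming a wordspotter has
# already located each predetermined subword unit in the audio corpus.

def build_index(hits):
    """Map each subword unit (e.g. a phoneme sequence) to its time locations."""
    index = {}
    for subword, time in hits:
        index.setdefault(subword, []).append(time)
    return index

def locate_query(index, query_subwords, max_gap=0.5):
    """Return start times where the query's subword units occur in order,
    each within max_gap seconds of the previous unit."""
    matches = []
    for start in index.get(query_subwords[0], []):
        t = start
        ok = True
        for sw in query_subwords[1:]:
            nxt = [x for x in index.get(sw, []) if t < x <= t + max_gap]
            if not nxt:
                ok = False
                break
            t = min(nxt)
        if ok:
            matches.append(start)
    return matches
```

A query such as "cat a" would first be decomposed into its subword units, then located purely through index lookups rather than rescanning the audio.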
  • Publication number: 20090048832
    Abstract: [Problems] To provide a speech-to-text system and the like capable of matching edit result text acquired by editing recognition result text or edit result text which is newly-written text information with speech data. [Means for Solving Problems] A speech-to-text system (1) includes a matching unit (27) which collates edit result text acquired by a text editor unit (22) with speech recognition result information having time information created by a speech recognition unit (11) to thereby match the edit result text and speech data.
    Type: Application
    Filed: November 8, 2006
    Publication date: February 19, 2009
    Applicant: NEC Corporation
    Inventor: Makoto Terao
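The collation step the abstract describes (matching edited text against a recognition result that carries time information) can be approximated with a standard sequence alignment. The sketch below uses Python's `difflib` as a stand-in for the patent's matching unit; the word-level granularity and the example times are assumptions.

```python
# Rough sketch of matching edited text to timestamped recognition output:
# align the two word sequences so edited words inherit the recognizer's
# time information wherever the alignment finds agreement.
import difflib

def match_edit_to_speech(edited_words, recognized):
    """recognized: list of (word, start_time, end_time) from the recognizer.
    Returns each edited word paired with a timestamp, or (None, None)
    where the edit diverges from the recognition result."""
    rec_words = [w for w, _, _ in recognized]
    sm = difflib.SequenceMatcher(a=edited_words, b=rec_words)
    aligned = []
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        for k in range(i2 - i1):
            if op == "equal":
                _, start, end = recognized[j1 + k]
                aligned.append((edited_words[i1 + k], start, end))
            else:
                aligned.append((edited_words[i1 + k], None, None))
    return aligned
```

Words the editor rewrote keep their position in the alignment, so their timestamps could later be interpolated from the surrounding matched words.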
  • Publication number: 20090048837
    Abstract: A system and method that utilizes common symbols for marking the tones of alphabet letters of different languages. The marking system and method employs the symbols from the standard English typing keyboard to denote tones. There are seven phonetic tone marks. Each mark represents a unique tone. The system can be applied to any alphabetic writing letters of different languages to denote specific language tones. The method makes it possible for alphabetic writing of any kind of language and for people to effectively capture the tones of words in different languages.
    Type: Application
    Filed: August 14, 2007
    Publication date: February 19, 2009
    Inventors: Ling Ju Su, Kuojui Su
  • Publication number: 20090048838
    Abstract: Provided is a system and method for building and managing a customized voice of an end-user, comprising the steps of designing a set of prompts for collection from the user, wherein the prompts are selected from both an analysis tool and by the user's own choosing to capture voice characteristics unique to the user. The prompts are delivered to the user over a network to allow the user to save a user recording on a server of a service provider. This recording is then retrieved and stored on the server and then set up on the server to build a voice database using text-to-speech synthesis tools. A graphical interface allows the user to continuously refine the data file to improve the voice and customize parameter and configuration settings, thereby forming a customized voice database which can be deployed or accessed.
    Type: Application
    Filed: May 29, 2008
    Publication date: February 19, 2009
    Inventors: Craig F. Campbell, Kevin A. Lenzo, Alexandre D. Cox
  • Publication number: 20090043581
    Abstract: This invention relates to a method of searching spoken audio data for one or more search terms comprising performing a phonetic search of the audio data to identify likely matches to a search term and producing textual data corresponding to a portion of the spoken audio data including a likely match. An embodiment of the method comprises the steps of taking phonetic index data corresponding to the spoken audio data, searching the phonetic index data for likely matches to the search term, wherein when a likely match is detected a portion of the spoken audio data or phonetic index data is selected which includes the likely match and said selected portion of the spoken audio data or phonetic index data is processed using a large vocabulary speech recogniser. The large vocabulary speech recogniser may derive textual data which can be used for further processing or may be used to present a transcript to a user.
    Type: Application
    Filed: August 7, 2008
    Publication date: February 12, 2009
    Applicant: AURIX LIMITED
    Inventors: Martin G. Abbott, Keith M. Ponting
  • Publication number: 20090030680
    Abstract: A method and system of indexing speech data. The method includes indexing word transcripts including a timestamp for a word occurrence; and indexing sub-word transcripts including a timestamp for a sub-word occurrence. A timestamp in the index indicates the time and duration of occurrence of the word or sub-word in the speech data, and word and sub-word occurrences can be correlated using the timestamps. A method of searching speech transcripts is also provided in which a search query in the form of a phrase to be searched includes at least one in-vocabulary word and at least one out-of-vocabulary word.
    Type: Application
    Filed: July 23, 2007
    Publication date: January 29, 2009
    Inventor: Jonathan Joseph Mamou
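A minimal sketch of how the two indexes might be correlated through timestamps for a phrase containing one in-vocabulary word followed by an out-of-vocabulary word rendered as sub-word units. The index layouts, the gap threshold, and all data are illustrative assumptions.

```python
# Sketch of the dual index: the in-vocabulary word is looked up in the word
# index, the OOV word's sub-word units in the sub-word index, and hits are
# joined when their timestamps are adjacent.

def search_phrase(word_index, subword_index, iv_word, oov_subwords, max_gap=0.3):
    """word_index: word -> [(start, duration)];
    subword_index: subword -> [(start, duration)].
    Returns start times where iv_word is immediately followed by the
    OOV sub-word sequence."""
    results = []
    for w_start, w_dur in word_index.get(iv_word, []):
        t = w_start + w_dur
        ok = True
        for sw in oov_subwords:
            hits = [h for h in subword_index.get(sw, []) if t <= h[0] <= t + max_gap]
            if not hits:
                ok = False
                break
            t = hits[0][0] + hits[0][1]
        if ok:
            results.append(w_start)
    return results
```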
  • Publication number: 20090024392
    Abstract: A speech recognition dictionary making supporting system for efficiently making/updating a speech recognition dictionary/language model with reduced speech recognition errors by using text data available at low cost. The speech recognition dictionary making supporting system comprises a recognition dictionary storage section (105), a language model storage section (106), and a sound model storage section (107). A virtual speech recognizing section (102) creates virtual speech recognition result text data in regard to an analyzed text data created by a text analyzing section (101) with reference to a recognition dictionary, language model, and sound model, and compares the virtual speech recognition result text data with the original analyzed text data. An updating section (103) updates the recognition dictionary and language model so that the different portions in both the text data may be lessened.
    Type: Application
    Filed: February 2, 2007
    Publication date: January 22, 2009
    Applicant: NEC CORPORATION
    Inventor: Takafumi Koshinaka
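The update loop can be caricatured as: recognize the analyzed text virtually, diff the result against the original, and add the missed words to the dictionary so the differences lessen on the next pass. The positional `zip` diff and set-based dictionary below are illustrative simplifications that assume the two word sequences stay aligned.

```python
# Sketch of the comparison/update cycle: words where the virtual speech
# recognition result disagrees with the analyzed text become candidates
# for the recognition dictionary.

def find_update_candidates(analyzed_text, virtual_result):
    """Return words in the analyzed text that the virtual recognizer missed
    (assumes the two equal-length sequences are position-aligned)."""
    return [w for w, v in zip(analyzed_text, virtual_result) if w != v]

def update_dictionary(dictionary, candidates):
    """Add missed words so a later pass can generate their pronunciations."""
    dictionary.update(candidates)
    return dictionary
```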
  • Patent number: 7480616
    Abstract: Information relating to an amount of muscle activity is extracted from a myo-electrical signal by activity amount information extraction means, and information recognition is performed by activity amount information recognition means using the information relating to the amount of muscle activity of a speaker. There is a prescribed correspondence relationship between the amount of muscle activity of a speaker and a phoneme uttered by a speaker, so the content of an utterance can be recognized with a high recognition rate by information recognition using information relating to an amount of muscle activity.
    Type: Grant
    Filed: February 27, 2003
    Date of Patent: January 20, 2009
    Assignee: NTT DoCoMo, Inc.
    Inventors: Hiroyuki Manabe, Akira Hiraiwa, Toshiaki Sugimura
  • Patent number: 7472061
    Abstract: Systems and methods are provided for automatically building a native phonetic lexicon for a speech-based application trained to process a native (base) language, wherein the native phonetic lexicon includes native phonetic transcriptions (base forms) for non-native (foreign) words which are automatically derived from non-native phonetic transcriptions of the non-native words.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: December 30, 2008
    Assignee: International Business Machines Corporation
    Inventors: Neal Alewine, Eric Janke, Paul Sharp, Roberto Sicconi
  • Publication number: 20080319749
    Abstract: A system and method for creating a mnemonics Language Model for use with a speech recognition software application, wherein the method includes generating an n-gram Language Model containing a predefined large body of characters, wherein the n-gram Language Model includes at least one character from the predefined large body of characters, constructing a new Language Model (LM) token for each of the at least one character, extracting pronunciations for each of the at least one character responsive to a predefined pronunciation dictionary to obtain a character pronunciation representation, creating at least one alternative pronunciation for each of the at least one character responsive to the character pronunciation representation to create an alternative pronunciation dictionary and compiling the n-gram Language Model for use with the speech recognition software application, wherein compiling the Language Model is responsive to the new Language Model token and the alternative pronunciation dictionary.
    Type: Application
    Filed: July 11, 2008
    Publication date: December 25, 2008
    Applicant: MICROSOFT CORPORATION
    Inventors: David Mowatt, Robert Chambers, Ciprian Chelba, Qiang Wu
  • Patent number: 7467086
    Abstract: A system and method for effectively performing speech recognition procedures includes enhanced demiphone acoustic models that a speech recognition engine utilizes to perform the speech recognition procedures. The enhanced demiphone acoustic models each have three states that are collectively arranged to form a preceding demiphone and a succeeding demiphone. An acoustic model generator may utilize a decision tree for analyzing speech context information from a training database. The acoustic model generator then effectively configures each of the enhanced demiphone acoustic models as either a succeeding-dominant enhanced demiphone acoustic model or a preceding-dominant enhanced demiphone acoustic model to accurately model speech characteristics.
    Type: Grant
    Filed: December 16, 2004
    Date of Patent: December 16, 2008
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menendez-Pidal, Lex S. Olorenshaw, Gustavo Hernandez Abrego
  • Publication number: 20080294441
    Abstract: The invention deals with speech recognition, such as a system for recognizing words in continuous speech. A speech recognition system is disclosed which is capable of recognizing a huge number of words, and in principle even an unlimited number of words. The speech recognition system comprises a word recognizer for deriving a best path through a word graph, and wherein words are assigned to the speech based on the best path. The word score is obtained by applying a phonemic language model to each word of the word graph. Moreover, the invention deals with an apparatus and a method for identifying words from a sound block and to computer readable code for implementing the method.
    Type: Application
    Filed: December 6, 2006
    Publication date: November 27, 2008
    Inventor: Zsolt Saffer
  • Patent number: 7457751
    Abstract: A speech recognition system and method are provided to correctly distinguish among multiple interpretations of an utterance. This system is particularly useful when the set of possible interpretations is large, changes dynamically, and/or contains items that are not phonetically distinctive. The speech recognition system extends the capabilities of mobile wireless communication devices that are voice operated after their initial activation.
    Type: Grant
    Filed: November 30, 2004
    Date of Patent: November 25, 2008
    Assignee: Vocera Communications, Inc.
    Inventor: Robert E. Shostak
  • Patent number: 7453994
    Abstract: A method of presenting instructions to a user sending an incoming communication to a service center includes presenting a menu to the user. The menu includes a plurality of procedure descriptors to the user. The user is presented, according to a selection of one of the procedure descriptors by the user, a sequence of instructions which enable completion of a procedure described by the selected procedure descriptor. The incoming communication is transferred at a position in the sequence of instructions to a representative. The incoming communication is also transferred back to the same position in the sequence of instructions.
    Type: Grant
    Filed: October 22, 2007
    Date of Patent: November 18, 2008
    Assignee: AT&T Labs, Inc.
    Inventors: Philip Ted Kortum, Robert R. Bushey
  • Patent number: 7451125
    Abstract: A system, a method, and a machine-readable medium are provided. A group of linear rules and associated weights are provided as a result of machine learning. Each one of the group of linear rules is partitioned into a respective one of a group of types of rules. A respective transducer for each of the linear rules is compiled. A combined finite state transducer is created from a union of the respective transducers compiled from the linear rules.
    Type: Grant
    Filed: November 7, 2005
    Date of Patent: November 11, 2008
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Srinivas Bangalore
  • Patent number: 7447634
    Abstract: A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit obtains a likelihood that respective recognizing-unit standard patterns coincide with a time series of the amount of characteristics representing the characteristics of the input speech.
    Type: Grant
    Filed: June 11, 2007
    Date of Patent: November 4, 2008
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Ryosuke Koshiba
  • Publication number: 20080270129
    Abstract: A method for automatically providing a hypothesis of a linguistic formulation that is uttered by users of a voice service based on an automatic speech recognition system and that is outside a recognition domain of the automatic speech recognition system. The method includes providing a constrained and an unconstrained speech recognition from an input speech signal, identifying a part of the constrained speech recognition outside the recognition domain, identifying a part of the unconstrained speech recognition corresponding to the identified part of the constrained speech recognition, and providing the linguistic formulation hypothesis based on the identified part of the unconstrained speech recognition.
    Type: Application
    Filed: February 17, 2005
    Publication date: October 30, 2008
    Applicant: Loquendo S.p.A.
    Inventors: Daniele Colibro, Claudio Vair, Luciano Fissore, Cosmin Popovici
  • Patent number: 7440897
    Abstract: In an embodiment, a lattice of phone strings in an input communication of a user may be recognized, wherein the lattice may represent a distribution over the phone strings. Morphemes in the input communication of the user may be detected using the recognized lattice. Task-type classification decisions may be made based on the detected morphemes in the input communication of the user.
    Type: Grant
    Filed: May 27, 2006
    Date of Patent: October 21, 2008
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Dijana Petrovska-Delacretaz, Giuseppe Riccardi, Jeremy Huntley Wright
  • Publication number: 20080228485
    Abstract: The aural similarity measuring system and method provides a measure of the aural similarity between a target text (10) and one or more reference texts (11). Both the target text (10) and the reference texts (11) are converted into a string of phonemes (15) and then one or other of the phoneme strings are adjusted (16) so that both are equal in length. The phoneme strings are compared (12) and a score generated representative of the degree of similarity of the two phoneme strings. Finally, where there is a plurality of reference texts the similarity scores for each of the reference texts are ranked (13). With this aural similarity measuring system the analysis is automated thereby reducing risks of errors and omissions. Moreover, the system provides an objective measure of aural similarity enabling consistency of comparison in results and reproducibility of results.
    Type: Application
    Filed: March 5, 2008
    Publication date: September 18, 2008
    Applicant: MONGOOSE VENTURES LIMITED
    Inventor: Mark Owen
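The pipeline above (phoneme conversion, comparison, scoring, ranking) can be sketched with a normalized edit-distance score. Note one substitution: edit distance handles strings of unequal length directly, so it stands in here for the patent's explicit length-adjustment step. The phoneme lists are illustrative; the grapheme-to-phoneme conversion is assumed to have happened already.

```python
# Sketch of an aural similarity score over phoneme strings, with 1.0 for
# identical strings and 0.0 for strings sharing nothing, plus ranking of
# reference texts by that score.

def edit_distance(a, b):
    """Classic Levenshtein distance over phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def aural_similarity(target_phonemes, reference_phonemes):
    d = edit_distance(target_phonemes, reference_phonemes)
    longest = max(len(target_phonemes), len(reference_phonemes), 1)
    return 1.0 - d / longest

def rank_references(target, references):
    """Order reference phoneme strings from most to least similar."""
    return sorted(references, key=lambda r: aural_similarity(target, r), reverse=True)
```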
  • Publication number: 20080215328
    Abstract: The invention concerns a method and system for detecting morphemes in a user's communication. The method may include recognizing a lattice of phone strings from the user's input communication, the lattice representing a distribution over the phone strings, and detecting morphemes in the user's input communication using the lattice. The morphemes may be acoustic and/or non-acoustic. The morphemes may represent any unit or sub-unit of communication including phones, diphones, phone-phrases, syllables, grammars, words, gestures, tablet strokes, body movements, mouse clicks, etc. The training speech may be verbal, non-verbal, a combination of verbal and non-verbal, or multimodal.
    Type: Application
    Filed: September 13, 2007
    Publication date: September 4, 2008
    Applicant: AT&T Corp.
    Inventors: Allen Louis Gorin, Dijana Petrovska-Delacretaz, Giuseppe Riccardi, Jeremy Huntley Wright
  • Patent number: 7418385
    Abstract: This voice detection device is composed of a myoelectric signal acquisition part for acquiring, from a plurality of regions, myoelectric signals generated at the time of a vocalization operation, a parameter calculation part for calculating, as parameters, the fluctuations of the acquired myoelectric signals relative to a predetermined value in every channel corresponding to one of the plurality of regions, a vowel vocalization recognition part for specifying the vowel vocalization operation timing at the time of the vocalization operation, based on the fluctuations of the calculated parameters, and a vowel specification part for specifying a vowel corresponding to the vocalization operation, based on the fluctuation condition of the parameters before and after the specified vocalization operation timing in every channel.
    Type: Grant
    Filed: June 18, 2004
    Date of Patent: August 26, 2008
    Assignee: NTT DoCoMo, Inc.
    Inventors: Hiroyuki Manabe, Yumiko Hiraiwa, legal representative, Kouki Hayashi, Takashi Ninjouji, Toshiaki Sugimura, Akira Hiraiwa
  • Publication number: 20080201147
    Abstract: Provided are a distributed speech recognition system, a distributed speech recognition speech method, and a terminal and a server for distributed speech recognition. The distributed speech recognition system includes a terminal which decodes a feature vector that is extracted from an input speech signal into a sequence of phonemes and generates the final recognition result by rescoring a candidate list provided from the outside; and a server which generates the candidate list by performing symbol matching on the recognized sequence of phonemes provided from the terminal and transmits the candidate list for the rescoring to the terminal.
    Type: Application
    Filed: July 13, 2007
    Publication date: August 21, 2008
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ick-sang Han, Kyu-hong Kim, Jeong-su Kim
  • Patent number: 7415408
    Abstract: A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit compares the time series of the amount of characteristics with one recognizing standard pattern or with two or more combined recognizing standard patterns one-by-one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics. A rejection determining unit determines whether or not the input signal is noise by comparing the likelihood obtained by the recognizing target vocabulary comparing step with the likelihood obtained by the environment adaptive noise model comparing step.
    Type: Grant
    Filed: June 11, 2007
    Date of Patent: August 19, 2008
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Ryosuke Koshiba
  • Patent number: 7412386
    Abstract: A method, apparatus, computer program product and service for directory dialer name recognition. The directory dialer has a directory of names and a first name grammar and a second name grammar representing phonetic baseforms of first names and second names respectively. The method includes: receiving voice data for a spoken name after requesting a user to speak the required name; extracting a set of phonetic baseforms for the voice data; and finding the best matches between the extracted set of phonetic baseforms voice data and any combination of the first name grammar and the second name grammar. The method can further include: checking the best match against the directory of names; if the best match does not exist in the directory, informing the user and prompting the next best match as an alternative; and if the best match does exist in the directory, forwarding the call to that best match.
    Type: Grant
    Filed: November 24, 2004
    Date of Patent: August 12, 2008
    Assignee: International Business Machines Corporation
    Inventors: Eric William Janke, Keith Sloan
  • Patent number: 7409341
    Abstract: A recognizing target vocabulary comparing unit calculates a compared likelihood of recognizing target vocabulary, i.e., a compared likelihood of registered vocabulary, by using the time series of the amount of characteristics of an input speech. An environment adaptive noise model comparing unit compares the time series of the amount of characteristics with one recognizing standard pattern or with two or more combined recognizing standard patterns one-by-one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics. A rejection determining unit determines whether or not the input signal is noise by comparing the likelihood obtained by the recognizing target vocabulary comparing step with the likelihood obtained by the environment adaptive noise model comparing step.
    Type: Grant
    Filed: June 11, 2007
    Date of Patent: August 5, 2008
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Ryosuke Koshiba
  • Patent number: 7409345
    Abstract: Techniques for improving an automatic baseform generation system. More particularly, the invention provides techniques for reducing insertion of spurious speech events in a word or phone sequence generated by an automatic baseform generation system. Such automatic baseform generation techniques may be accomplished by enhancing the scores of long-lasting speech events with respect to the scores of short-lasting events. For example, this may be achieved by merging competing candidates that relate to the same speech event (e.g., phone or word) and that overlap in time into a single candidate, the score of which may be equal to the sum of the scores of the merged candidates.
    Type: Grant
    Filed: April 4, 2003
    Date of Patent: August 5, 2008
    Assignee: International Business Machines Corporation
    Inventors: Sabine V. Deligne, Lidia L. Mangu
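The merging idea this abstract outlines is small enough to sketch directly: candidates for the same speech event that overlap in time collapse into one candidate whose score is the sum of the merged scores, which boosts long-lasting events over spurious short ones. The tuple layout and data are illustrative assumptions.

```python
# Sketch of merging competing, time-overlapping candidates for the same
# phone into a single candidate scored by the sum of the merged scores.

def merge_candidates(candidates):
    """candidates: list of (phone, start, end, score), in any order."""
    merged = []
    for phone, start, end, score in sorted(candidates, key=lambda c: (c[0], c[1])):
        if merged and merged[-1][0] == phone and start < merged[-1][2]:
            # Same phone and the intervals overlap: fold into the previous
            # candidate, extending its span and summing the scores.
            p, s, e, sc = merged[-1]
            merged[-1] = (p, s, max(e, end), sc + score)
        else:
            merged.append((phone, start, end, score))
    return merged
```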
  • Patent number: 7409346
    Abstract: A structured generative model of a speech coarticulation and reduction is described with a novel two-stage implementation. At the first stage, the dynamics of formants or vocal tract resonance (VTR) are generated using prior information of resonance targets in the phone sequence. Bi-directional temporal filtering with finite impulse response (FIR) is applied to the segmental target sequence as the FIR filter's input. At the second stage the dynamics of speech cepstra are predicted analytically based on the FIR filtered VTR targets. The combined system of these two stages thus generates correlated and causally related VTR and cepstral dynamics where phonetic reduction is represented explicitly in the hidden resonance space and implicitly in the observed cepstral space. The combined system also gives the acoustic observation probability given a phone sequence. Using this probability, different phone sequences can be compared and ranked in terms of their respective probability values.
    Type: Grant
    Filed: March 1, 2005
    Date of Patent: August 5, 2008
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, Dong Yu, Li Deng
  • Publication number: 20080183471
    Abstract: A system and method of recognizing speech comprises an audio receiving element and a computer server. The audio receiving element and the computer server perform the process steps of the method. The method involves training a stored set of phonemes by converting them into n-dimensional space, where n is a relatively large number. Once the stored phonemes are converted, they are transformed using singular value decomposition to conform the data generally into a hypersphere. The received phonemes from the audio-receiving element are also converted into n-dimensional space and transformed using singular value decomposition to conform the data into a hypersphere. The method compares the transformed received phoneme to each transformed stored phoneme by comparing a first distance from a center of the hypersphere to a point associated with the transformed received phoneme and a second distance from the center of the hypersphere to a point associated with the respective transformed stored phoneme.
    Type: Application
    Filed: March 28, 2008
    Publication date: July 31, 2008
    Applicant: AT&T Corp.
    Inventor: Bishnu Saroop Atal
  • Publication number: 20080177544
    Abstract: The invention concerns a method and system for detecting morphemes in a user's communication. The method may include recognizing a lattice of phone strings from the user's input communication, the lattice representing a distribution over the phone strings, and detecting morphemes in the user's input communication using the lattice. The morphemes may be acoustic and/or non-acoustic. The morphemes may represent any unit or sub-unit of communication including phones, diphones, phone-phrases, syllables, grammars, words, gestures, tablet strokes, body movements, mouse clicks, etc. The training speech may be verbal, non-verbal, a combination of verbal and non-verbal, or multimodal.
    Type: Application
    Filed: September 13, 2007
    Publication date: July 24, 2008
    Applicant: AT&T Corp.
    Inventors: Allen Louis Gorin, Dijana Petrovska-Delacretaz, Giuseppe Riccardi, Jeremy Huntley Wright
  • Publication number: 20080172224
    Abstract: A representation of a speech signal is received and is decoded to identify a sequence of position-dependent phonetic tokens wherein each token comprises a phone and a position indicator that indicates the position of the phone within a syllable.
    Type: Application
    Filed: January 11, 2007
    Publication date: July 17, 2008
    Applicant: Microsoft Corporation
    Inventors: Peng Liu, Yu Shi, Frank Kao-ping Soong
  • Patent number: 7401019
    Abstract: A method of searching audio data is provided including receiving a query defining multiple phonetic possibilities. The method also includes comparing the query with a lattice of phonetic hypotheses associated with the audio data to identify if at least one of the multiple phonetic possibilities is approximated by at least one phonetic hypothesis in the lattice of phonetic hypotheses.
    Type: Grant
    Filed: January 15, 2004
    Date of Patent: July 15, 2008
    Assignee: Microsoft Corporation
    Inventors: Frank T. Seide, Eric I-Chao Chang
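The comparison the abstract describes can be sketched by expanding the query into several phonetic possibilities and checking each against the lattice's phonetic hypotheses with a small edit-distance tolerance. Representing the lattice as a flat list of hypothesis sequences is an illustrative simplification of a real lattice structure.

```python
# Sketch of matching a multi-possibility phonetic query against phonetic
# hypotheses, where "approximated" is modeled as edit distance within a
# tolerance.

def _dist(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def query_matches_lattice(possibilities, hypotheses, tolerance=1):
    """True if any phonetic possibility is approximated by any hypothesis."""
    return any(_dist(p, h) <= tolerance
               for p in possibilities for h in hypotheses)
```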
  • Publication number: 20080167873
    Abstract: A method for pronouncing English alphas according to indications at different orientations of the alpha, comprising the steps of: dividing the area around an alpha into six sections; indicating short sounds, long sounds and strong sounds by points, lines and slashes; placing a small piece of line (at different angles) or a point on an alpha to indicate that it is pronounced with the pronunciation of another alpha; using underlines to indicate long and short sounds of the phonetic symbols of a set of double alphas; using a delete line to indicate that the alpha is not pronounced; using a space area to divide the syllables of a word; using a vertical cut line to indicate that one alpha is pronounced with two sounds; indicating an original-sound line at the upper side of the first stroke to represent that the alpha is pronounced with an original sound; and placing a “?” under a double-alpha set to represent that the alpha is pronounced with a reverse sound.
    Type: Application
    Filed: January 8, 2007
    Publication date: July 10, 2008
    Inventor: Wei-Chou Su
  • Patent number: 7392178
    Abstract: The present invention is a preprocessing apparatus including a voice input apparatus for acquiring an uttered voice, an analog-digital conversion apparatus for converting the acquired uttered voice to digital voice data, and a comparator for selecting voice data, having a level which is equal to or higher than a certain level, from the digital voice data and for outputting the selected voice data. The preprocessing apparatus also includes a voice data cutout apparatus capable of cutting out voice data having a level which is equal to or higher than a certain level output from the comparator, while taking a phoneme as a unit, and a voice data output apparatus for outputting voice data of the phoneme unit output from the voice data cutout apparatus.
    Type: Grant
    Filed: February 26, 2003
    Date of Patent: June 24, 2008
    Assignees: Electronic Navigation Research Institute, An Independent Administration Institution, Mitsubishi Space Software Co., Ltd.
    Inventors: Kakuichi Shiomi, Naritomo Meguro, Tomoya Maruyama
  • Patent number: 7392189
    Abstract: A speech recognition system for processing voice inputs from a user to select a list element from a list or group of list elements. Recognition procedures are carried out on the voice input of the user. One recognition procedure separates the voice input of a whole word into at least one sequence of speech subunits to produce a vocabulary of list elements. Another recognition procedure compares the voice input of the whole word with the vocabulary of list elements.
    Type: Grant
    Filed: February 21, 2003
    Date of Patent: June 24, 2008
    Assignee: Harman Becker Automotive Systems GmbH
    Inventors: Marcus Hennecke, Walter Koch, Gerhard Nüβle, Richard Reng
  • Publication number: 20080147403
    Abstract: A method, system and article of manufacture for recognizing a voice command. One embodiment of the invention comprises: receiving a voice input; determining a number of sound fragments to be processed in a first set of sound fragments; determining whether the first set of sound fragments of the voice input matches the first set of sound fragments of a voice command; and, if the first set matches, determining whether one or more remaining sound fragments match one or more remaining sound fragments of the voice command.
    Type: Application
    Filed: March 3, 2008
    Publication date: June 19, 2008
    Inventors: Joseph Herbert McIntyre, Victor S. Moore
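The two-stage matching in this abstract can be sketched as follows (a hedged illustration; the fragment representation and names are assumptions, not taken from the patent):

```python
def matches_command(fragments, command, first_n):
    """Stage 1: compare the first first_n sound fragments of the
    voice input against the command; stage 2: only on a match,
    compare the remaining fragments."""
    if fragments[:first_n] != command[:first_n]:
        return False  # early rejection without processing the rest
    return fragments[first_n:] == command[first_n:]
```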
  • Publication number: 20080133239
    Abstract: Provided are an apparatus and method for recognizing continuous speech using search space restriction based on phoneme recognition. In the apparatus and method, a search space can be primarily reduced by restricting connection words to be shifted at a boundary between words based on the phoneme recognition result. In addition, the search space can be secondarily reduced by rapidly calculating a degree of similarity between the connection word to be shifted and the phoneme recognition result using a phoneme code and shifting the corresponding phonemes to only connection words having degrees of similarity equal to or higher than a predetermined reference value. Therefore, the speed and performance of the speech recognition process can be improved in various speech recognition services.
    Type: Application
    Filed: December 4, 2007
    Publication date: June 5, 2008
    Inventors: Hyung Bae Jeon, Jun Park, Seung Hi Kim, Kyu Woong Hwang
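As a rough illustration of the secondary search-space reduction described above (the similarity measure here is a simple positional match ratio, an assumption standing in for the patent's phoneme-code calculation):

```python
def similarity(code_a, code_b):
    """Fraction of phoneme codes that agree at aligned positions,
    normalized by the longer code."""
    if not code_a or not code_b:
        return 0.0
    same = sum(a == b for a, b in zip(code_a, code_b))
    return same / max(len(code_a), len(code_b))

def restrict_connection_words(candidates, recognized, reference):
    """Keep only connection words whose similarity to the phoneme
    recognition result is equal to or higher than the reference value."""
    return [w for w in candidates if similarity(w, recognized) >= reference]
```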
  • Publication number: 20080126093
    Abstract: An apparatus for providing a language based interactive multimedia system includes a selection element, a comparison element and a processing element. The selection element may be configured to select a phoneme graph based on a type of speech processing associated with an input sequence of phonemes. The comparison element may be configured to compare the input sequence of phonemes to the selected phoneme graph. The processing element may be in communication with the comparison element and configured to process the input sequence of phonemes based on the comparison.
    Type: Application
    Filed: November 28, 2006
    Publication date: May 29, 2008
    Inventor: Sunil Sivadas
  • Publication number: 20080120108
    Abstract: Speech recognition on a tonal language is performed using a plurality of tonal models. Each tonal model has a multi-space distribution and corresponds to a known syllable in the language. A first data stream indicative of an observation of an utterance is received. The observation has both a discrete and a continuous tonal feature. A second data stream indicative of spectral features of a syllable of the utterance is also received. The first data stream is compared against at least one of the plurality of tonal models, and the second data stream is compared against a spectral model.
    Type: Application
    Filed: November 16, 2006
    Publication date: May 22, 2008
    Inventors: Frank Kao-Ping Soong, Yao Qian
  • Patent number: 7376648
    Abstract: A computer-implemented method for selecting a desired Roman or non-Roman-alphabet character or object from a set of non-Roman-alphabet characters or objects may include steps of: providing an association database that includes, for each non-Roman-alphabet character of the set, a Roman-alphabet or other phonetic transliteration associated with that character and a plurality of entries associated with that character; receiving a phonetic transliteration of the desired non-Roman-alphabet character or data object and at least one associated entry that is associated with the desired non-Roman-alphabet character or other similar symbolic input; accessing the association database and identifying as candidate characters those characters of the set that are associated with the received phonetic transliteration and with the at least one received associated entry; and, if the number of candidate characters is greater than one, receiving additional associated entries and repeating
    Type: Grant
    Filed: October 20, 2004
    Date of Patent: May 20, 2008
    Assignee: Oracle International Corporation
    Inventor: Richard C. Johnson
  • Publication number: 20080114598
    Abstract: A motor vehicle has a speech interface for acoustic input of commands for operating the motor vehicle or a module of the motor vehicle. The speech interface includes a speech recognition database in which a substantial portion of the commands or command components that can be input is stored both in a version according to a pronunciation in a first language and in a version according to a pronunciation in at least a second language, and a speech recognition engine for automatically comparing an acoustic command to the commands and/or command components stored in the speech recognition database in the version according to the pronunciation in the first language and to those stored in the version according to the pronunciation in the second language.
    Type: Application
    Filed: November 9, 2006
    Publication date: May 15, 2008
    Applicant: Volkswagen of America, Inc.
    Inventors: Ramon Prieto, M. Kashif Imam, Carsten Bergmann, Wai Yin Cheung, Carly Williams
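A toy version of the dual-language lookup (the database structure mapping each command to its stored pronunciation versions is an assumption for illustration):

```python
def lookup_command(acoustic, db):
    """Compare an acoustic command against stored pronunciation
    versions in both languages; return the matched command, or
    None if no stored version matches."""
    for command, pronunciations in db.items():
        if acoustic in pronunciations:
            return command
    return None
```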
  • Patent number: 7369993
    Abstract: A system and method of recognizing speech comprises an audio-receiving element and a computer server, which together perform the process steps of the method. The method involves training a stored set of phonemes by converting them into n-dimensional space, where n is a relatively large number. Once the stored phonemes are converted, they are transformed using singular value decomposition to conform the data generally into a hypersphere. The received phonemes from the audio-receiving element are likewise converted into n-dimensional space and transformed using singular value decomposition to conform the data into a hypersphere. The method compares the transformed received phoneme to each transformed stored phoneme by comparing a first distance from the center of the hypersphere to a point associated with the transformed received phoneme and a second distance from the center of the hypersphere to a point associated with the respective transformed stored phoneme.
    Type: Grant
    Filed: December 29, 2006
    Date of Patent: May 6, 2008
    Assignee: AT&T Corp.
    Inventor: Bishnu Saroop Atal
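The radial-distance comparison can be sketched like this (a simplification: the SVD transform is omitted, vectors are plain lists, and comparing only radii from the center is the assumption the sketch makes explicit):

```python
import math

def radius(vec):
    """Distance from the hypersphere's center (taken as the origin)."""
    return math.sqrt(sum(x * x for x in vec))

def closest_stored(received, stored):
    """Pick the stored phoneme whose radial distance best matches
    that of the received phoneme; stored is a list of
    (name, vector) pairs."""
    r = radius(received)
    return min(stored, key=lambda pair: abs(radius(pair[1]) - r))[0]
```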
  • Publication number: 20080103774
    Abstract: A method of and a system for processing speech. A spoken utterance of a plurality of characters can be received. A plurality of known character sequences that potentially correspond to the spoken utterance can be selected. Each selected known character sequence can be scored based, at least in part, on a weighting of the individual characters that comprise the known character sequence.
    Type: Application
    Filed: October 30, 2006
    Publication date: May 1, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Kenneth D. White
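One plausible form of the per-character weighted scoring (the exact scoring function is not given in the abstract; this additive version is an assumption):

```python
def score_sequence(candidate, spoken, weights):
    """Score a known character sequence against the spoken characters:
    each position that matches contributes that character's weight
    (default weight 1.0)."""
    return sum(weights.get(c, 1.0)
               for c, s in zip(candidate, spoken) if c == s)
```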
  • Publication number: 20080091428
    Abstract: The present invention provides, among other things, automatic identification of near-redundant units in a large TTS voice table, identifying which units are distinctive enough to keep and which are sufficiently redundant to discard. According to an aspect of the invention, pruning is treated as a clustering problem in a suitable feature space. All instances of a given unit (e.g. words or characters expressed as Unicode strings) are mapped onto the feature space and clustered in that space using a suitable similarity measure. Since all units in a given cluster are, by construction, closely related from the point of view of the measure used, they are suitably redundant and can be replaced by a single instance. The disclosed method can detect near-redundancy in TTS units in a completely unsupervised manner, based on an original feature extraction and clustering strategy.
    Type: Application
    Filed: October 10, 2006
    Publication date: April 17, 2008
    Inventor: Jerome R. Bellegarda
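The pruning-as-clustering idea can be sketched with a greedy pass (featurize and similar are placeholders standing in for the patent's feature extraction and similarity measure):

```python
def prune_units(units, featurize, similar):
    """Keep one representative per cluster: a unit instance is
    discarded as near-redundant when it is similar, in feature
    space, to an instance already kept."""
    kept = []
    for unit in units:
        feat = featurize(unit)
        if not any(similar(feat, featurize(k)) for k in kept):
            kept.append(unit)
    return kept
```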
  • Publication number: 20080091427
    Abstract: Systems and methods are provided for compressing data models, for example, N-gram language models used in speech recognition applications. Words in the vocabulary of the language model are assigned to classes of words, for example, by syntactic criteria, semantic criteria, or statistical analysis of an existing language model. After word classes are defined, the follower lists for words in the vocabulary may be stored as hierarchical sets of class indexes and word indexes within each class. Hierarchical word indexes may reduce the storage requirements for the N-gram language model by more efficiently representing multiple words in a single list in the same follower list.
    Type: Application
    Filed: October 11, 2006
    Publication date: April 17, 2008
    Applicant: Nokia Corporation
    Inventor: Jesper Olsen
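A sketch of the hierarchical follower-list layout the abstract describes (the mapping from a word to its (class index, word index within class) pair is assumed to exist already):

```python
def compress_followers(followers, word_index):
    """Store a follower list as {class index: [word indexes within
    that class]} instead of one flat list of words, so words sharing
    a class share a single class entry."""
    compressed = {}
    for word in followers:
        cls, idx = word_index[word]
        compressed.setdefault(cls, []).append(idx)
    return compressed
```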
  • Publication number: 20080082337
    Abstract: A method and system for obtaining a pool of speech syllable models. The model pool is generated by first detecting a training segment using unsupervised speech segmentation or speech unit spotting. If the model pool is empty, a first speech syllable model is trained and added to the pool. If the model pool is not empty, the existing model that best matches the training segment is determined and scored against the segment. If the score is less than a predefined threshold, a new model is created for the training segment and added to the pool. If the score equals or exceeds the threshold, the training segment is used to improve or re-estimate the existing model.
    Type: Application
    Filed: September 21, 2007
    Publication date: April 3, 2008
    Applicant: HONDA RESEARCH INSTITUTE EUROPE GMBH
    Inventors: Frank Joublin, Holger Brandl
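The pool-update logic above reads as simple control flow; it is sketched below with the unspecified subroutines (train, best_match, score) passed in as callables and a model represented as the list of segments it was built from (all assumptions for illustration):

```python
def update_pool(pool, segment, train, best_match, score, threshold):
    """Add the first model if the pool is empty; otherwise score the
    best-matching existing model and either create a new model
    (score below threshold) or refine the existing one with the
    training segment (score at or above threshold)."""
    if not pool:
        pool.append(train(segment))
    else:
        model = best_match(pool, segment)
        if score(model, segment) < threshold:
            pool.append(train(segment))
        else:
            model.append(segment)  # stand-in for re-estimating the model
    return pool
```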
  • Publication number: 20080082336
    Abstract: Included are embodiments for providing speech analysis. At least one embodiment of a method includes receiving audio data associated with a communication, determining at least one phoneme from the audio data, and providing the at least one phoneme in a phonetic transcript, the phonetic transcript including at least one character from a phonetic alphabet.
    Type: Application
    Filed: September 29, 2006
    Publication date: April 3, 2008
    Inventors: Gary Duke, Joseph Watson
  • Publication number: 20080082335
    Abstract: A method and a system for automatically converting alphabetic words into a plurality of independent spellings. The method can include parsing textual input to identify at least one word and converting the word into a first word object having a first spelling including letter objects. The method also can include converting the word into a second word object having a second spelling including phonetic objects, each of the phonetic objects correlating to at least one of the letter objects. Further, the first word object and the second word object can be presented in a visual field such that each of the phonetic objects is visually associated with the letter object to which it correlates.
    Type: Application
    Filed: September 28, 2006
    Publication date: April 3, 2008
    Inventor: Howard Engelsen
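The side-by-side presentation of the two spellings can be illustrated minimally (assuming a precomputed one-to-one letter-to-phoneme correspondence, which the patent's actual correlation step would relax):

```python
def align_spellings(letters, phonetics):
    """Pair each letter object with the phonetic object it
    correlates to, for presentation in a shared visual field."""
    return list(zip(letters, phonetics))
```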
  • Patent number: 7353174
    Abstract: The present invention comprises a system and method for effectively implementing a Mandarin Chinese speech recognition dictionary, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may efficiently be implemented by utilizing an allophone and phonemic variation technique. In addition, the foregoing vocabulary dictionary may be implemented by utilizing unified dictionary optimization techniques to provide robust and accurate speech recognition. Furthermore, the vocabulary dictionary may be implemented as an optimized dictionary to accurately recognize either Northern Mandarin Chinese speech or Southern Mandarin Chinese speech during the speech recognition procedure.
    Type: Grant
    Filed: March 31, 2003
    Date of Patent: April 1, 2008
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw
  • Patent number: 7353173
    Abstract: The present invention comprises a system and method for implementing a Mandarin Chinese speech recognizer with an optimized phone set, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may be implemented with a phonetic technique to separately include consonantal phones and vocalic phones. For reasons of system efficiency, the optimized Mandarin Chinese phone set may preferably be implemented in a compact manner to include only a minimum required number of consonantal phones and vocalic phones to accurately represent Mandarin Chinese speech during the speech recognition procedure.
    Type: Grant
    Filed: March 31, 2003
    Date of Patent: April 1, 2008
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw