Similarity Patents (Class 704/239)
  • Publication number: 20120232899
    Abstract: A system and method for identification of a speaker by phonograms of oral speech is disclosed. Similarity between a first phonogram of the speaker and a second, or sample, phonogram is evaluated by matching formant frequencies in referential utterances of a speech signal, where the utterances for comparison are selected from the two phonograms. The selected referential utterances include formant paths of at least three formant frequencies; those that share at least two identical formant frequencies are compared with each other. Similarity of the compared referential utterances is evaluated from the matching of their other formant frequencies, and similarity of the phonograms is determined from the evaluation of similarity of all the compared referential utterances.
    Type: Application
    Filed: March 23, 2012
    Publication date: September 13, 2012
    Applicant: Obschestvo s ogranichennoi otvetstvennost'yu "Centr Rechevyh Technologij"
    Inventor: Sergey Lvovich Koval
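The formant-matching comparison described in the abstract above can be sketched in a few lines. This is a toy illustration with invented function names, tolerances, and anchor counts, not the patent's actual method: utterance pairs sharing at least two near-identical formants qualify for comparison, and similarity is scored from how well the remaining formants match.

```python
def utterance_similarity(formants_a, formants_b, tol=50.0, anchors=2):
    """Compare two referential utterances, each a list of >= 3 formant
    frequencies (Hz). At least `anchors` formants must agree within `tol`
    Hz before the pair is comparable; similarity is then the fraction of
    the remaining formants that also agree. Illustrative sketch only."""
    matched = [abs(a - b) <= tol for a, b in zip(formants_a, formants_b)]
    if sum(matched) < anchors:
        return None  # too few identical formants: pair is not comparable
    # The first `anchors` matching formants only license the comparison;
    # similarity is judged from the other formant frequencies.
    anchor_idx = [i for i, m in enumerate(matched) if m][:anchors]
    rest = [m for i, m in enumerate(matched) if i not in anchor_idx]
    return 1.0 if not rest else sum(rest) / len(rest)

def phonogram_similarity(utts_a, utts_b, tol=50.0):
    """Average utterance-pair similarity over all comparable pairs."""
    scores = [s for fa in utts_a for fb in utts_b
              if (s := utterance_similarity(fa, fb, tol)) is not None]
    return sum(scores) / len(scores) if scores else 0.0
```

A pair like `[700, 1200, 2600]` vs `[710, 1190, 2580]` anchors on the first two formants and scores the third, while `[700, 1200, 2600]` vs `[900, 1500, 2600]` has only one matching formant and is skipped.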
  • Publication number: 20120232900
    Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
    Type: Application
    Filed: November 12, 2009
    Publication date: September 13, 2012
    Inventors: Johan Nikolaas Langehoveen Brummer, Luis Buera Rodriguez, Martha Garcia Gomar
  • Patent number: 8260061
    Abstract: In an image data output processing apparatus of the present invention, an image matching section is capable of determining whether a similarity exists between each image of an N-up document and a reference document when input image data is indicative of the N-up document. An output process control section is capable of regulating an output process of each image in accordance with a result of determining whether the similarity exists between each image of the N-up document and the reference document. This allows detecting with high accuracy a document image under regulation on the output process and regulating the output process, when the input image data is indicative of an N-up document and includes the document image under regulation on the output process.
    Type: Grant
    Filed: September 18, 2008
    Date of Patent: September 4, 2012
    Assignee: Sharp Kabushiki Kaisha
    Inventor: Hitoshi Hirohata
  • Patent number: 8255214
    Abstract: A first signal of two signals to be compared for similarity is divided into small areas and one small area is selected for calculating the correlation with a second signal using a correlative method. Then, the quantity of translation, expansion rate and similarity in an area where the similarity, which is the square of the correlation value, reaches its maximum, are found. Values based on the similarity are integrated at a position represented by the quantity of translation and expansion rate. Similar processing is performed with respect to all the small areas, and at a peak where the maximum integral value of the similarity is obtained, its magnitude is compared with a threshold value to evaluate the similarity. The small area voted for that peak can be extracted.
    Type: Grant
    Filed: October 15, 2002
    Date of Patent: August 28, 2012
    Assignee: Sony Corporation
    Inventors: Mototsugu Abe, Masayuki Nishiguchi
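The core step of the abstract above, sliding a small area over the second signal and taking the square of the normalized correlation as similarity, can be sketched as follows. This covers only the translation search, not the expansion-rate estimation or the voting/integration over all small areas; names and the 1-D simplification are mine, not the patent's.

```python
import math

def best_match(small, signal):
    """Slide `small` over `signal` and return (best_offset, similarity),
    where similarity is the square of the normalized correlation value,
    as in the abstract. 1-D pure-Python sketch."""
    n = len(small)
    mu_s = sum(small) / n
    es = math.sqrt(sum((x - mu_s) ** 2 for x in small))
    best = (0, -1.0)
    for off in range(len(signal) - n + 1):
        win = signal[off:off + n]
        mu_w = sum(win) / n
        ew = math.sqrt(sum((x - mu_w) ** 2 for x in win))
        if es == 0 or ew == 0:
            continue  # flat segment: correlation undefined
        r = sum((a - mu_s) * (b - mu_w)
                for a, b in zip(small, win)) / (es * ew)
        sim = r * r  # similarity = square of the correlation value
        if sim > best[1]:
            best = (off, sim)
    return best
```

In the full scheme each small area votes its (translation, expansion-rate) estimate into an accumulator, and the peak of the accumulated similarity is compared against a threshold.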
  • Publication number: 20120209606
    Abstract: Obtaining information from audio interactions associated with an organization. The information may comprise entities, relations or events. The method comprises: receiving a corpus comprising audio interactions; performing audio analysis on audio interactions of the corpus to obtain text documents; performing linguistic analysis of the text documents; matching the text documents with one or more rules to obtain one or more matches; and unifying or filtering the matches.
    Type: Application
    Filed: February 14, 2011
    Publication date: August 16, 2012
    Applicant: Nice Systems Ltd.
    Inventors: Maya Gorodetsky, Ezra Daya, Oren Pereg
  • Patent number: 8239197
    Abstract: A system and method for efficiently transcribing verbal messages transmitted over the Internet (or other network) into text. The verbal messages are initially checked to ensure that they are in a valid format and include a return network address, and if so, are processed either as whole verbal messages or split into segments. These whole verbal messages and segments are processed by an automated speech recognition (ASR) program, which produces automatically recognized text. The automatically recognized text messages or segments are assigned to selected workbenches for manual editing and transcription, producing edited text. The segments of edited text are reassembled to produce whole edited text messages, undergo post processing to correct minor errors and output as an email, an SMS message, a file, or an input to a program. The automatically recognized text and manual edits thereof are returned as feedback to the ASR program to improve its accuracy.
    Type: Grant
    Filed: October 29, 2008
    Date of Patent: August 7, 2012
    Assignee: Intellisist, Inc.
    Inventors: Mike O. Webb, Bruce J. Peterson, Janet S. Kaseda
  • Publication number: 20120191453
    Abstract: A system and methods for matching at least one word of an utterance against a set of template hierarchies to select the best matching template or set of templates corresponding to the utterance. The system and methods determine at least one exact, inexact, and partial match between the at least one word of the utterance and at least one term within the template hierarchy to select and populate a template or set of templates corresponding to the utterance. The populated template or set of templates may then be used to generate a narrative template or a report template.
    Type: Application
    Filed: March 30, 2012
    Publication date: July 26, 2012
    Applicant: Cyberpulse L.L.C.
    Inventors: James ROBERGE, Jeffrey Soble
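The exact/inexact/partial distinction in the abstract above can be illustrated with ordinary string similarity. The thresholds, tie-breaking order, and use of `difflib` are my assumptions for the sketch; the patent does not specify them.

```python
import difflib

def classify_match(word, term, inexact_cutoff=0.8):
    """Classify how an utterance word matches a template term:
    'exact' (identical), 'inexact' (high string similarity, e.g. a
    morphological variant), 'partial' (prefix overlap), or None.
    The 0.8 cutoff is illustrative, not from the patent."""
    w, t = word.lower(), term.lower()
    if w == t:
        return "exact"
    if difflib.SequenceMatcher(None, w, t).ratio() >= inexact_cutoff:
        return "inexact"
    if t.startswith(w) or w.startswith(t):
        return "partial"
    return None
```

For example, "fractures" vs. "fracture" is an inexact match, "frac" vs. "fracture" a partial one; a template slot could be populated whenever any of the three match types fires.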
  • Patent number: 8229744
    Abstract: A method, system, and computer program for class detection and time-mediated averaging of class-dependent models. A technique is described to take advantage of gender information in training data and to obtain female, male, and gender-independent models from this information. By using a probability value to average male and female Gaussian Mixture Models (GMMs), dramatic deterioration in cross-gender decoding performance is avoided.
    Type: Grant
    Filed: August 26, 2003
    Date of Patent: July 24, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Satyanarayana Dharanipragada, Peder A. Olsen
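The probability-weighted averaging of male and female GMMs described above can be shown in one dimension. The means, variances, and the idea of using a pitch-like feature are invented for illustration; the point is only the interpolation of the two class-dependent likelihoods by a gender probability.

```python
import math

def gauss(x, mean, var):
    """Univariate Gaussian density."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gmm_pdf(x, comps):
    """GMM likelihood; comps is a list of (weight, mean, var) tuples."""
    return sum(w * gauss(x, m, v) for w, m, v in comps)

def blended_pdf(x, male_gmm, female_gmm, p_female):
    """Average the class-dependent models with the probability that the
    current frame is female, so decoding degrades gracefully when the
    gender decision is uncertain (the cross-gender failure mode the
    abstract mentions)."""
    return (1.0 - p_female) * gmm_pdf(x, male_gmm) + p_female * gmm_pdf(x, female_gmm)
```

With `p_female = 0.5` the blended likelihood is exactly midway between the two models, so a wrong hard gender decision can never zero out the correct model's contribution.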
  • Patent number: 8214211
    Abstract: In a voice processing device, a male voice index calculator calculates a male voice index indicating a similarity of the input sound relative to a male speaker sound model. A female voice index calculator calculates a female voice index indicating a similarity of the input sound relative to a female speaker sound model. A first discriminator discriminates the input sound between a non-human-voice sound and a human voice sound which may be either of the male voice sound or the female voice sound. A second discriminator discriminates the input sound between the male voice sound and the female voice sound based on the male voice index and the female voice index in case that the first discriminator discriminates the human voice sound.
    Type: Grant
    Filed: August 26, 2008
    Date of Patent: July 3, 2012
    Assignee: Yamaha Corporation
    Inventor: Yasuo Yoshioka
  • Patent number: 8209174
    Abstract: A text-independent speaker verification system utilizes mel frequency cepstral coefficients analysis in the feature extraction blocks, template modeling with vector quantization in the pattern matching blocks, an adaptive threshold and an adaptive decision verdict and is implemented in a stand-alone device using less powerful microprocessors and smaller data storage devices than used by comparable systems of the prior art.
    Type: Grant
    Filed: April 17, 2009
    Date of Patent: June 26, 2012
    Assignee: Saudi Arabian Oil Company
    Inventor: Essam Abed Al-Telmissani
  • Patent number: 8204749
    Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include, for example, differential statistics, joint statistics and distance statistics.
    Type: Grant
    Filed: March 21, 2011
    Date of Patent: June 19, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
  • Patent number: 8175869
    Abstract: A method, apparatus, and medium for classifying a speech signal and a method, apparatus, and medium for encoding the speech signal using the same are provided. The method for classifying a speech signal includes calculating classification parameters from an input signal having block units, calculating a plurality of classification criteria from the classification parameters, and classifying the level of the input signal using the plurality of classification criteria. The classification parameters include at least one of an energy parameter of the input signal, a cross-correlation parameter between a specific block of a present frame and the input signal, and an integrated cross-correlation parameter obtained by accumulating the cross-correlation parameter.
    Type: Grant
    Filed: July 5, 2006
    Date of Patent: May 8, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hosang Sung, Rakesh Taori, Kangeun Lee
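Two of the classification parameters named in the abstract above, block energy and the cross-correlation between a block and a reference signal, are easy to compute directly. The class labels and thresholds below are invented for illustration; the patent's actual classification criteria and levels are more elaborate.

```python
import math

def block_energy(block):
    """Energy parameter of one block of samples."""
    return sum(x * x for x in block)

def norm_xcorr(a, b):
    """Energy-normalized cross-correlation between two equal-length blocks."""
    ea, eb = math.sqrt(block_energy(a)), math.sqrt(block_energy(b))
    if ea == 0 or eb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (ea * eb)

def classify_level(block, prev_block, e_thresh=1.0, c_thresh=0.7):
    """Toy level classifier: low energy -> 'silence'; strong correlation
    with the previous block (periodicity) -> 'voiced'; else 'unvoiced'.
    Thresholds are illustrative assumptions."""
    if block_energy(block) < e_thresh:
        return "silence"
    if abs(norm_xcorr(block, prev_block)) >= c_thresh:
        return "voiced"
    return "unvoiced"
```

The integrated cross-correlation parameter of the abstract would then be an accumulation of `norm_xcorr` values over successive blocks.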
  • Patent number: 8175882
    Abstract: A method for task execution improvement, the method includes: generating a baseline model for executing a task; recording a user executing a task; comparing the baseline model to the user's execution of the task; and providing feedback to the user based on the differences in the user's execution and the baseline model.
    Type: Grant
    Filed: January 25, 2008
    Date of Patent: May 8, 2012
    Assignee: International Business Machines Corporation
    Inventors: Sara H. Basson, Dimitiri Kanevsky, Edward E. Kelley, Bhuvana Ramabhadran
  • Patent number: 8160370
    Abstract: An image processing apparatus includes: an image pyramid forming section configured to form an image pyramid by hierarchically creating layered image data including differently scaled images from inputted image data; a position calculating section configured to determine an in-image-pyramid position as a height position in the image pyramid to which template image data having an image portion of a target object at a fixed scale corresponds; an upper-layer-side likelihood calculating section configured to determine a likelihood for a target object by matching between upper-side layer image data directly above the in-image-pyramid position, and a state prediction; a lower-layer-side likelihood calculating section configured to determine a likelihood for a target object by matching between lower-side layer image data directly below the in-image-pyramid position, and a state prediction; and a true likelihood calculating section configured to determine a true likelihood from the likelihoods determined by the upper-layer-side and lower-layer-side likelihood calculating sections.
    Type: Grant
    Filed: November 6, 2009
    Date of Patent: April 17, 2012
    Assignee: Sony Corporation
    Inventors: Yuyu Liu, Keisuke Yamaoka, Takayuki Yoshigahara, Xi Chen
  • Patent number: 8160866
    Abstract: The present invention can recognize both English and Chinese at the same time. The key technique is that the features of all English words (without samples) are entirely extracted from the features of Chinese syllables. The invention normalizes the variable-length signal waveforms of English words (Chinese syllables) such that the same words (syllables) have the same features at the same time positions. Hence the Bayesian classifier can recognize both fast and slow utterances of sentences. The invention can improve the features such that speech recognition of unknown English (Chinese) is guaranteed to be correct. Furthermore, since the invention can create the features of English words from the features of Chinese syllables, it can likewise create the features of other languages from the features of Chinese syllables, and hence it can also recognize other languages, such as German, French, Japanese, Korean, Russian, etc.
    Type: Grant
    Filed: October 10, 2008
    Date of Patent: April 17, 2012
    Inventors: Tze Fen Li, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
  • Patent number: 8145484
    Abstract: The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user.
    Type: Grant
    Filed: November 11, 2008
    Date of Patent: March 27, 2012
    Assignee: Microsoft Corporation
    Inventor: Geoffrey Zweig
  • Patent number: 8140330
    Abstract: Embodiments of a method and system for detecting repeated patterns in dialog systems are described. The system includes a dynamic time warping (DTW) based pattern comparison algorithm that is used to find the best matching parts between a correction utterance and an original utterance. Reference patterns are generated from the correction utterance by an unsupervised segmentation scheme. No significant information about the position of the repeated parts in the correction utterance is assumed, as each reference pattern is compared with the original utterance from the beginning of the utterance to the end. A pattern comparison process with DTW is executed without knowledge of fixed end-points. A recursive DTW computation is executed to find the best matching parts that are considered as the repeated parts as well as the end-points of the utterance.
    Type: Grant
    Filed: June 13, 2008
    Date of Patent: March 20, 2012
    Assignee: Robert Bosch GmbH
    Inventors: Mert Cevik, Fuliang Weng
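The DTW comparison at the heart of the abstract above follows the standard textbook recurrence. This is a sketch of plain fixed-endpoint DTW, not Bosch's recursive open-endpoint variant, but it shows why warping lets a slow repetition match the original utterance.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Dynamic time warping distance between sequences a and b.
    Standard O(len(a)*len(b)) dynamic program; the patent's variant
    additionally searches over end-points."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] = cost of best alignment of a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = dist(a[i - 1], b[j - 1]) + min(
                D[i - 1][j],      # insertion
                D[i][j - 1],      # deletion
                D[i - 1][j - 1])  # match
    return D[n][m]
```

A sequence aligned against a time-stretched copy of itself, e.g. `[1, 2, 3]` vs. `[1, 1, 2, 2, 3, 3]`, still has distance zero, which is exactly the property that makes DTW suitable for matching a slowly repeated correction against the original utterance.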
  • Patent number: 8140334
    Abstract: An apparatus and method for recognizing voice.
    Type: Grant
    Filed: June 28, 2006
    Date of Patent: March 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-bae Jeong, Nam-hoon Kim, Jeong-su Kim, In-jeong Choi, Ick-sang Han
  • Patent number: 8073262
    Abstract: In an image matching apparatus of the present invention, only a connected region in which the number of pixels included therein exceeds a threshold value, among connected regions that are specified by a labeling process section, is sent to a centroid calculation process section from a threshold value processing section, and a centroid (feature point) of the connected region is calculated. When it is determined that a target document to be matched is an N-up document, the threshold value processing section uses, instead of a default threshold value, a variant threshold value that varies depending on the number of images laid out on the N-up document and a document size that are found and detected by an N-up document determination section and a document size detection section. This makes it possible to determine a similarity to a reference document with high accuracy even in a case of an N-up document, i.e., a case where each target image to be matched is reduced in size from an original image.
    Type: Grant
    Filed: September 8, 2008
    Date of Patent: December 6, 2011
    Assignee: Sharp Kabushiki Kaisha
    Inventor: Hitoshi Hirohata
  • Patent number: 8050918
    Abstract: A method and system for evaluating the quality of voice input recognition by a voice portal is provided. An analysis interface extracts a set of current grammars from the voice portal. A test pattern generator generates a test input for each current grammar. The test input includes a test pattern and a set of active grammars corresponding to each current grammar. The system further includes a text-to-speech engine for entering each test pattern into the voice server. A results collector analyzes each test pattern entered into the voice server with the speech recognition engine against the set of active grammars corresponding to the current grammar for said test pattern. A results analyzer derives a set of statistics of a quality of recognition of each current grammar.
    Type: Grant
    Filed: December 11, 2003
    Date of Patent: November 1, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Reza Ghasemi, Walter Haenel
  • Patent number: 8050505
    Abstract: In a card serving as an external storage device to be inserted into a digital color multi-function printer, features of a reference image and processing rule information indicating processing content to be applied to input image data judged to be similar to the reference image are prestored. Then, in cases where the input image data is judged to be similar to the reference image, the content of a process to be performed on the input image data is controlled in accordance with the processing rule information corresponding to the reference image. This makes it possible to save users the trouble of setting the content of a process to be performed on input image data, and to prevent a shortage of memory capacity of a memory of the image processing apparatus even in cases where there is an increase in the number of reference images.
    Type: Grant
    Filed: July 30, 2008
    Date of Patent: November 1, 2011
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Hitoshi Hirohata, Masakazu Ohira
  • Patent number: 8041570
    Abstract: Representation-neutral dialogue systems and methods (“RNDS”) are described that include multi-application, multi-device spoken-language dialogue systems based on the information-state update approach. The RNDS includes representation-neutral core components of a dialogue system that provide scripted domain-specific extensions to routines such as dialogue move modeling and reference resolution, easy substitution of specific semantic representations and associated routines, and clean interfaces to external components for language-understanding (i.e., speech-recognition and parsing) and language-generation, and to domain-specific knowledge sources. The RNDS also allows seamless interaction with a community of devices.
    Type: Grant
    Filed: May 31, 2005
    Date of Patent: October 18, 2011
    Assignee: Robert Bosch Corporation
    Inventors: Danilo Mirkovic, Lawrence Cavedon
  • Publication number: 20110238409
    Abstract: Semantic clustering techniques are described. In various implementations, a conversational agent is configured to perform semantic clustering of a corpus of user utterances. Semantic clustering may be used to provide a variety of functionality, such as to group a corpus of utterances into semantic clusters in which each cluster pertains to a similar topic. These clusters may then be leveraged to identify topics and assess their relative importance, for example to prioritize topics whose handling by the conversational agent should be improved. A variety of utterances may be processed using these techniques, such as spoken words, textual descriptions entered via live chat, instant messaging, a website interface, email, SMS, a social network, a blogging or micro-blogging interface, and so on.
    Type: Application
    Filed: March 26, 2010
    Publication date: September 29, 2011
    Inventors: Jean-Marie Henri Daniel Larcheveque, Elizabeth Ireland Powers, Freya Kate Recksiek, Dan Teodosiu
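Grouping utterances into topic clusters, as the abstract above describes, can be approximated with a greedy single-pass scheme over bag-of-words vectors. This is a toy stand-in for the patent's semantic clustering: the representation (word counts), the similarity (cosine), and the 0.5 threshold are all my assumptions.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two word-count Counters."""
    dot = sum(u[w] * v.get(w, 0) for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster_utterances(utterances, threshold=0.5):
    """Greedy single-pass clustering: each utterance joins the first
    cluster whose founding utterance it resembles closely enough,
    otherwise it starts a new cluster."""
    clusters = []  # list of (representative Counter, member utterances)
    for utt in utterances:
        vec = Counter(utt.lower().split())
        for rep, members in clusters:
            if cosine(vec, rep) >= threshold:
                members.append(utt)
                break
        else:
            clusters.append((vec, [utt]))
    return [members for _, members in clusters]
```

Cluster sizes then give the relative importance of each topic, e.g. which intents the agent should handle better first.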
  • Patent number: 8024187
    Abstract: A pulse allocating method capable of coding stereophonic voice signals efficiently. In the fixed codebook searches of this pulse allocating method, the stereophonic voice signals are compared for each subframe to judge the similarity between channels and to judge their characteristics. On the basis of the similarity between the channels and the characteristics of the stereophonic signals, the numbers of pulses to be allocated to the individual channels are determined. Pulse searches are then executed to determine the pulse positions for the individual channels, and the determined pulses are coded.
    Type: Grant
    Filed: February 9, 2006
    Date of Patent: September 20, 2011
    Assignee: Panasonic Corporation
    Inventors: Chun Woei Teo, Sua Hong Neo, Koji Yoshida, Michiyo Goto
  • Publication number: 20110202341
    Abstract: A device receives a voice recognition statistic from a voice recognition application and applies a grammar improvement rule based on the voice recognition statistic. The device also automatically adjusts a weight of the voice recognition statistic based on the grammar improvement rule, and outputs the weight adjusted voice recognition statistic for use in the voice recognition application.
    Type: Application
    Filed: April 29, 2011
    Publication date: August 18, 2011
    Applicant: VERIZON PATENT AND LICENSING INC.
    Inventor: Kevin W. BROWN
  • Patent number: 7962338
    Abstract: When the degree of similarity of the recognition candidates is greater than the second threshold value, the speech verification unit outputs the recognition candidates as a recognition result. When the degree of similarity of the recognition candidates is smaller than the second threshold value, it outputs the recognition candidates as a recognition result only if their degree of similarity is greater than the first threshold value and, at the same time, greater than the degree of similarity of the rejection candidates. It should be noted that the first threshold value is a measure used for rejecting input speech. The second threshold value is larger than the first threshold value and is used as a measure for outputting recognition candidates as a recognition result.
    Type: Grant
    Filed: November 13, 2007
    Date of Patent: June 14, 2011
    Assignee: Fujitsu Limited
    Inventor: Shouji Harada
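The two-threshold decision logic of the abstract above fits in a single function. The names are mine; the branch structure follows the abstract directly: accept outright above the higher threshold, accept between the thresholds only when the candidate also beats the best rejection candidate, and reject otherwise.

```python
def verdict(cand_score, reject_score, t1, t2):
    """Two-threshold speech verification verdict.

    t1: lower threshold, the measure for rejecting input speech.
    t2: higher threshold (t2 > t1), the measure for outputting the
        recognition candidates directly as a result.
    """
    if cand_score > t2:
        return "accept"  # confidently above the second threshold
    if cand_score > t1 and cand_score > reject_score:
        return "accept"  # borderline, but beats the rejection candidates
    return "reject"
```

The middle branch is what distinguishes this scheme from a single fixed threshold: a borderline candidate is only kept when it outranks the competing rejection hypotheses.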
  • Publication number: 20110071828
    Abstract: A speech discriminability assessment system includes: a biological signal measurement section for measuring an electroencephalogram signal of a user; a presented-speech sound control section for determining a speech sound to be presented to the user by referring to a speech sound database retaining a plurality of monosyllabic sound data; an audio presentation section for presenting an audio associated with the determined speech sound to the user; a character presentation section for presenting a character associated with the determined speech sound to the user, subsequent to the presentation of the audio by the audio presentation section; an unexpectedness detection section for detecting presence or absence of an unexpectedness signal from the measured electroencephalogram signal of the user, the unexpectedness signal representing a positive component at 600 ms±100 ms after a time point when the character was presented to the user; and a speech sound discriminability determination section for determining a speech sound discriminability.
    Type: Application
    Filed: December 3, 2010
    Publication date: March 24, 2011
    Inventors: Shinobu ADACHI, Koji Morikawa
  • Patent number: 7912720
    Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog between a human and a computing device, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include, for example, differential statistics, joint statistics and distance statistics.
    Type: Grant
    Filed: July 20, 2005
    Date of Patent: March 22, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
  • Patent number: 7904297
    Abstract: Representation-neutral dialogue systems and methods (“RNDS”) are described that include multi-application, multi-device spoken-language dialogue systems based on the information-state update approach. The RNDS includes representation-neutral core components of a dialogue system that provide scripted domain-specific extensions to routines such as dialogue move modeling and reference resolution, easy substitution of specific semantic representations and associated routines, and clean interfaces to external components for language-understanding (i.e., speech-recognition and parsing) and language-generation, and to domain-specific knowledge sources. The RNDS also resolves multi-device dialogue by evaluating and selecting among candidate dialogue moves based on features at multiple levels. Multiple sources of information are combined, multiple speech recognition and parsing hypotheses tested, and multiple device and moves considered to choose the highest scoring hypothesis overall.
    Type: Grant
    Filed: December 8, 2005
    Date of Patent: March 8, 2011
    Assignee: Robert Bosch GmbH
    Inventors: Danilo Mirkovic, Lawrence Cavedon, Matthew Purver, Florin Ratiu, Tobias Scheideck, Fuliang Weng, Qi Zhang, Kui Xu
  • Publication number: 20110035219
    Abstract: A language identification system that includes a universal phoneme decoder (UPD) is described. The UPD contains a universal phoneme set representing both 1) all phonemes occurring in the set of two or more spoken languages, and 2) captures phoneme correspondences across languages, such that a set of unique phoneme patterns and probabilities are calculated in order to identify a most likely phoneme occurring each time in the audio files in the set of two or more potential languages in which the UPD was trained on. Each statistical language models (SLM) uses the set of unique phoneme patterns created for each language in the set to distinguish between spoken human languages in the set of languages. The run-time language identifier module identifies a particular human language being spoken by utilizing the linguistic probabilities supplied by the one or more SLMs that are based on the set of unique phoneme patterns created for each language.
    Type: Application
    Filed: August 4, 2009
    Publication date: February 10, 2011
    Applicant: AUTONOMY CORPORATION LTD.
    Inventors: Mahapathy Kadirkamanathan, Christopher John Waple
  • Patent number: 7860714
    Abstract: The present invention is a detection system of a segment including specific sound signal which detects a segment in a stored sound signal similar to a reference sound signal, including: a reference signal spectrogram division portion which divides a reference signal spectrogram into spectrograms of small-regions; a small-region reference signal spectrogram coding portion which encodes the small-region reference signal spectrogram to a reference signal small-region code; a small-region stored signal spectrogram coding portion which encodes a small-region stored signal spectrogram to a stored signal small-region code; a similar small-region spectrogram detection portion which detects a small-region spectrogram similar to the small-region reference signal spectrograms based on a degree of similarity of a code; and a degree of segment similarity calculation portion which uses a degree of small-region similarity and calculates a degree of similarity between the segment of the stored signal and the reference signal.
    Type: Grant
    Filed: July 1, 2005
    Date of Patent: December 28, 2010
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Hidehisa Nagano, Takayuki Kurozumi, Kunio Kashino
  • Patent number: 7842873
    Abstract: A system and method for detecting a refrain in an audio file having vocal components. The method and system includes generating a phonetic transcription of a portion of the audio file, analyzing the phonetic transcription and identifying a vocal segment in the generated phonetic transcription that is repeated frequently. The method and system further relate to the speech-driven selection based on similarity of detected refrain and user input.
    Type: Grant
    Filed: February 12, 2007
    Date of Patent: November 30, 2010
    Assignee: Harman Becker Automotive Systems GmbH
    Inventors: Franz S. Gerl, Daniel Willett, Raymond Brueckner
  • Patent number: 7822604
    Abstract: One embodiment of the present method and apparatus for identifying a conversing pair of users of a two-way speech medium includes receiving a plurality of binary voice activity streams, where the plurality of voice activity streams includes a first voice activity stream associated with a first user, and pairing the first voice activity stream with a second voice activity stream associated with a second user, in accordance with a complementary similarity between the first voice activity stream and the second voice activity stream.
    Type: Grant
    Filed: October 31, 2006
    Date of Patent: October 26, 2010
    Assignee: International Business Machines Corporation
    Inventors: Lisa Amini, Eric Bouillet, Olivier Verscheure, Michail Vlachos
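The pairing idea in the abstract above relies on the fact that in a real conversation the two voice activity streams tend to interleave: one party speaks while the other listens. One plausible formalization (my assumption; the patent's exact measure may differ) scores a pair by the fraction of frames in which exactly one stream is active.

```python
def complementary_similarity(a, b):
    """Fraction of frames where exactly one of two equal-length binary
    voice activity streams is active. High values suggest turn-taking,
    i.e. a conversing pair."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b)) / len(a)

def best_pair(first, others):
    """Pair the first user's stream with the candidate stream that
    interleaves with it best. `others` maps user names to streams."""
    return max(others, key=lambda name: complementary_similarity(first, others[name]))
```

Two identical streams (both talking over each other, or a stream paired with itself) score 0, while perfectly alternating streams score 1.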
  • Patent number: 7813925
    Abstract: At adjacent times, or when an observation signal changes only slightly, the distribution that maximizes the output probability of a mixture distribution is highly likely to remain the same. Using this fact, when obtaining the output probability of a mixture-distribution HMM, the distribution yielding the maximum output probability is stored. At adjacent times, or when the observation signal is determined to have changed only slightly, the output probability of the stored distribution serves as the output probability of the mixture distribution. This avoids computing the output probabilities of the other distributions when calculating the output probability of the mixture distribution, thereby reducing the calculation amount required for output probabilities.
    Type: Grant
    Filed: April 6, 2006
    Date of Patent: October 12, 2010
    Assignee: Canon Kabushiki Kaisha
    Inventors: Hiroki Yamamoto, Masayuki Yamada
  • Patent number: 7813928
    Abstract: A speech recognition device presenting whether a user's utterance is an unregistered word and whether the utterance should be repeated. The device includes a vocabulary storage unit (102) defining a vocabulary for speech recognition, and a speech recognition unit (101) checking the uttered speech against registered words. The device also includes a similarity calculation unit (103) calculating a similarity between the uttered speech and acoustic units, a judgment unit (104) judging, based on the check by the speech recognition unit (101) and the calculation performed by the similarity calculation unit (103), whether the uttered speech is a registered or unregistered word, an unregistered word unit (106) storing unregistered words, an unregistered word candidate search unit (105) searching the unregistered word unit (106) for unregistered word candidates when the judgment unit (104) judges that the uttered speech is an unregistered word, and a display unit (107) displaying the result.
    Type: Grant
    Filed: June 2, 2005
    Date of Patent: October 12, 2010
    Assignee: Panasonic Corporation
    Inventors: Yoshiyuki Okimoto, Tsuyoshi Inoue, Takashi Tsuzuki
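The registered/unregistered judgment and candidate search can be illustrated as a score comparison against a free acoustic-unit (filler) path, followed by an edit-distance search of the unregistered-word store. All names and the decision rule are illustrative assumptions, not the patent's numbered units.

```python
def judge_and_search(best_word_score, filler_score, utterance,
                     unregistered_store, margin=0.0):
    """If the best registered-word score does not beat the filler score by
    `margin`, judge the utterance unregistered and rank stored candidates
    by edit distance to the (text form of the) utterance."""
    if best_word_score >= filler_score + margin:
        return "registered", []

    def edit_distance(a, b):
        # Standard Levenshtein distance via dynamic programming.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,          # deletion
                               cur[j - 1] + 1,       # insertion
                               prev[j - 1] + (ca != cb)))  # substitution
            prev = cur
        return prev[-1]

    ranked = sorted(unregistered_store, key=lambda w: edit_distance(utterance, w))
    return "unregistered", ranked[:3]
```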
  • Patent number: 7809579
    Abstract: Polyphonic signals are used to create a main signal, typically a mono signal, and a side signal. A number of encoding schemes for the side signal are provided. Each encoding scheme is characterized by a set of sub-frames of different lengths. The total length of the sub-frames corresponds to the length of the encoding frame of the encoding scheme. The encoding scheme to be used on the side signal is selected dependent on the present signal content of the polyphonic signals. In a preferred embodiment, a side residual signal is created as the difference between the side signal and the main signal scaled with a balance factor. The balance factor is selected to minimize the side residual signal. The optimized side residual signal and the balance factor are encoded and provided as encoding parameters representing the side signal.
    Type: Grant
    Filed: December 15, 2004
    Date of Patent: October 5, 2010
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Stefan Bruhn, Ingemar Johansson, Anisse Taleb, Daniel Enström
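The balance-factor step has a closed form: choosing b to minimize the energy of the side residual s − b·m is a least-squares fit, giving b = Σ(s·m) / Σ(m·m). A minimal sketch under that reading (function name assumed):

```python
def encode_side(main, side):
    """Least-squares balance factor b minimising sum((side - b*main)**2),
    plus the resulting side residual signal."""
    num = sum(s * m for s, m in zip(side, main))
    den = sum(m * m for m in main)
    b = num / den if den else 0.0
    residual = [s - b * m for s, m in zip(side, main)]
    return b, residual
```

With main = (L+R)/2 and side = (L−R)/2, a side signal that is just a scaled copy of the main signal yields a zero residual, so only the balance factor needs to be transmitted.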
  • Patent number: 7752044
    Abstract: To increase the robustness and/or the recognition rate of methods for recognizing speech, it is proposed to include phone boundary verification measure features in the process of obtaining and/or generating confidence measures for obtained recognition results.
    Type: Grant
    Filed: October 10, 2003
    Date of Patent: July 6, 2010
    Assignee: Sony Deutschland GmbH
    Inventors: Yin Hay Lam, Ralf Kompe
  • Patent number: 7739110
    Abstract: A method and an apparatus for multimedia data management are disclosed. The method provides an indexing and retrieval scheme for digital photos with speech annotations, based on image-like patterns transformed from the recognized syllable candidates. For annotated spoken content, the recognized n-best syllable candidates are transformed into a sequence of syllable-transformed patterns. Eigen-image analysis is further adopted to extract the significant information and reduce the dimensionality. Vector quantization is applied to quantize the syllable-transformed patterns into feature vectors for indexing. The indexing scheme reduces the dimensionality and noise of the data, and improves speech-annotated photo retrieval performance by 16.26%.
    Type: Grant
    Filed: December 1, 2006
    Date of Patent: June 15, 2010
    Assignee: Industrial Technology Research Institute
    Inventors: Chung-Hsien Wu, Yu-Sheng Lai, Chien-Lin Huang, Chia-Hao Kang
  • Patent number: 7738635
    Abstract: A method for improving the recognition confidence of alphanumeric spoken input, suitable for use in a speech recognition telephony application such as a voice response system. An alphanumeric candidate is determined from the spoken input, which may be the best available representation of the spoken input. Recognition confidence is compared with a preestablished threshold. If the recognition confidence exceeds the threshold, the alphanumeric candidate is selected to represent the spoken input. Otherwise, present call data associated with the spoken input is determined. Call data may include automatic number identification (ANI) information, caller-ID information, and/or dialed number information service (DNIS) information. Information associated with the alphanumeric candidate and information associated with the present call data are correlated in order to select alphanumeric information that best represents the spoken input.
    Type: Grant
    Filed: January 6, 2005
    Date of Patent: June 15, 2010
    Assignees: International Business Machines Corporation; Nuance Communications, Inc.
    Inventors: Christopher Ryan Groves, Kevin James Muterspaugh
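The threshold-then-correlate flow can be sketched as follows; the character-overlap correlation and all names are assumptions standing in for the patent's matching of candidate information against ANI/caller-ID/DNIS records.

```python
def resolve_alphanumeric(candidate, confidence, call_records, threshold=0.8):
    """Accept the recognizer's candidate when confidence clears the
    threshold; otherwise fall back to call data (e.g. entries associated
    with the caller's ANI) and pick the stored entry that best matches
    the candidate."""
    if confidence >= threshold:
        return candidate

    def overlap(a, b):
        # Positional character agreement as a crude correlation measure.
        return sum(1 for x, y in zip(a, b) if x == y)

    return max(call_records, key=lambda rec: overlap(rec, candidate))
```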
  • Publication number: 20100121639
    Abstract: The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user.
    Type: Application
    Filed: November 11, 2008
    Publication date: May 13, 2010
    Applicant: Microsoft Corporation
    Inventor: Geoffrey Zweig
  • Patent number: 7698138
    Abstract: A broadcast receiving system includes a broadcast receiving part for receiving a broadcast in which additional information that corresponds to an object appearing in broadcast contents and that contains keyword information for specifying the object is broadcasted simultaneously with the broadcast contents; a recognition vocabulary generating section for generating a recognition vocabulary set in a manner corresponding to the additional information by using a synonym dictionary; a speech recognition section for performing the speech recognition of a voice uttered by a viewing person, and for thereby specifying keyword information corresponding to a recognition vocabulary set when a word recognized as the speech recognition result is contained in the recognition vocabulary set; and a displaying section for displaying additional information corresponding to the specified keyword information.
    Type: Grant
    Filed: December 26, 2003
    Date of Patent: April 13, 2010
    Assignee: Panasonic Corporation
    Inventors: Yumiko Kato, Takahiro Kamai, Hideyuki Yoshida, Yoshifumi Hirose
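The synonym-based vocabulary generation and keyword lookup can be sketched as a simple dictionary expansion. The dictionary shape and function names are assumptions, not the patent's components.

```python
def build_vocabulary(keyword, synonym_dict):
    """Recognition vocabulary set for a keyword: the keyword itself plus
    its synonyms from the dictionary."""
    return {keyword, *synonym_dict.get(keyword, ())}

def match_keyword(recognized_word, keywords, synonym_dict):
    """Return the keyword whose vocabulary set contains the recognized
    word, or None if no set matches."""
    for kw in keywords:
        if recognized_word in build_vocabulary(kw, synonym_dict):
            return kw
    return None
```

A viewer saying "film" would thus select the additional information keyed to "movie" if the synonym dictionary links the two.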
  • Patent number: 7684986
    Abstract: An apparatus, medium, and method for recognizing speech. The method may include calculating scores indicating the degree of similarity between a characteristic of an input speech and the characteristics of speech models, based on the degree of similarity between the length of each phoneme in the input speech and the length of the phonemes in each speech model, and determining the speech model with the highest score to be the recognized speech for the input. By doing so, the speech recognition rate may be greatly enhanced, and when an input speech includes continuous identical phonemes the word error rate (WER) may be greatly reduced.
    Type: Grant
    Filed: December 23, 2005
    Date of Patent: March 23, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Icksang Han, Sangbae Jeong, Jeongsu Kim
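The duration term can be illustrated as a per-phoneme length ratio averaged over the utterance and interpolated with the acoustic score. The ratio form and the weight `w` are assumptions, not the patent's exact scoring.

```python
def duration_score(input_lengths, model_lengths):
    """Similarity between two phoneme-duration sequences: the ratio of the
    shorter to the longer duration per phoneme, averaged; 1.0 means the
    durations agree exactly."""
    ratios = [min(a, b) / max(a, b)
              for a, b in zip(input_lengths, model_lengths)]
    return sum(ratios) / len(ratios)

def combined_score(acoustic, input_lengths, model_lengths, w=0.3):
    """Interpolate the acoustic similarity with the duration similarity."""
    return (1 - w) * acoustic + w * duration_score(input_lengths, model_lengths)
```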
  • Patent number: 7680663
    Abstract: A hidden dynamics value in speech is represented by a higher order, discretized dynamic model, which predicts the discretized dynamic variable that changes over time. Parameters are trained for the model. A decoder algorithm is developed for estimating the underlying phonological speech units in sequence that correspond to the observed speech signal using the higher order, discretized dynamic model.
    Type: Grant
    Filed: August 21, 2006
    Date of Patent: March 16, 2010
    Assignee: Microsoft Corporation
    Inventor: Li Deng
  • Patent number: 7634405
    Abstract: The subject invention leverages spectral “palettes” or representations of an input sequence to provide recognition and/or synthesizing of a class of data. The class can include, but is not limited to, individual events, distributions of events, and/or environments relating to the input sequence. The representations are compressed versions of the data that utilize a substantially smaller amount of system resources to store and/or manipulate. Segments of the palettes are employed to facilitate in reconstruction of an event occurring in the input sequence. This provides an efficient means to recognize events, even when they occur in complex environments. The palettes themselves are constructed or “trained” utilizing any number of data compression techniques such as, for example, epitomes, vector quantization, and/or Huffman codes and the like.
    Type: Grant
    Filed: January 24, 2005
    Date of Patent: December 15, 2009
    Assignee: Microsoft Corporation
    Inventors: Sumit Basu, Nebojsa Jojic, Ashish Kapoor
  • Publication number: 20090271195
    Abstract: A speech recognition apparatus is provided that attains high recognition accuracy within practical processing time on a computing machine of standard performance, by appropriately adapting a language model to speech about a certain topic, irrespective of the topic's degree of detail and diversity and of the confidence score of an initial speech recognition result.
    Type: Application
    Filed: July 6, 2007
    Publication date: October 29, 2009
    Applicant: NEC Corporation
    Inventors: Tasuku Kitade, Takafumi Koshinaka
  • Publication number: 20090259465
    Abstract: A method and system for training an automatic speech recognition system are provided. The method includes separating training data into speaker specific segments, and for each speaker specific segment, performing the following acts: generating spectral data, selecting a first warping factor and warping the spectral data, and comparing the warped spectral data with a speech model. The method also includes iteratively performing the steps of selecting another warping factor and generating another warped spectral data, comparing the other warped spectral data with the speech model, and if the other warping factor produces a closer match to the speech model, saving the other warping factor as the best warping factor for the speaker specific segment. The system includes modules configured to control a processor in the system to perform the steps of the method.
    Type: Application
    Filed: June 24, 2009
    Publication date: October 15, 2009
    Applicant: AT&T Corp.
    Inventors: Vincent Goffin, Andrej Ljolje, Murat Saraclar
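The warping-factor search described above amounts to a grid search: warp the spectrum's frequency axis by each candidate factor and keep the factor whose warped spectrum best matches the speech model. A sketch with piecewise-linear resampling and squared-error scoring (both are assumptions; real systems typically score against an HMM likelihood):

```python
def best_warping_factor(spectrum, model, alphas=(0.88, 0.94, 1.0, 1.06, 1.12)):
    """Return the warping factor whose warped spectrum is closest to the model."""
    n = len(spectrum)

    def warp(alpha):
        # Resample the spectrum at f/alpha via linear interpolation,
        # stretching (alpha < 1) or compressing (alpha > 1) the axis.
        out = []
        for i in range(n):
            pos = min(i / alpha, n - 1)
            lo = int(pos)
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            out.append((1 - frac) * spectrum[lo] + frac * spectrum[hi])
        return out

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(alphas, key=lambda a: dist(warp(a), model))
```

Iterating this per speaker-specific segment, and saving the winning alpha, mirrors the loop the abstract describes.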
  • Publication number: 20090240498
    Abstract: Systems and methods to perform short text segment similarity measures. Illustratively, a short text segment similarity environment comprises a short text engine operative to process data representative of short segments of text, and an instruction set comprising at least one instruction to instruct the short text engine to process data representative of short text segment inputs according to a selected short text similarity identification paradigm. Illustratively, two or more short text segments can be received as input by the short text engine, along with a request to identify similarities among them. Responsive to the request and data input, the short text engine executes a selected similarity identification technique in accordance with the short text similarity identification paradigm to process the received data and to identify similarities between the short text segment inputs.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: Microsoft Corporation
    Inventors: Wen-tau Yih, Alexei V. Bocharov, Christopher A. Meek
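One concrete similarity identification technique such an engine might select is bag-of-words cosine similarity; this is an illustrative choice, not necessarily one of the patent's paradigms.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two short text segments over
    lower-cased word counts; 1.0 for identical bags, 0.0 for disjoint."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```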
  • Patent number: 7590537
    Abstract: A speech recognition method and apparatus perform speaker clustering and speaker adaptation using average model variation information over speakers, analyzing both the quantity variation amount and the directional variation amount. In the speaker clustering method, a speaker group model variation is generated based on the model variation between a speaker-independent model and a training speaker ML model. In the speaker adaptation method, the model whose variation between the test speaker ML model and the ML model of the speaker group to which the test speaker belongs is most similar to a training speaker group model variation is found, and speaker adaptation is performed on that model. Herein, the model variations in the speaker clustering and the speaker adaptation are calculated by analyzing both the quantity variation amount and the directional variation amount. The present invention may be applied to any speaker adaptation algorithm, such as MLLR and MAP.
    Type: Grant
    Filed: December 27, 2004
    Date of Patent: September 15, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Namhoon Kim, Injeong Choi, Yoonkyung Song
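Treating a model variation as a pair (quantity, direction) suggests comparing two variations by both a magnitude ratio and the cosine between their directions. A sketch over mean vectors (the combination rule below is an assumption, not the patent's measure):

```python
import math

def model_variation(model_a, model_b):
    """Model variation between two mean vectors as a (quantity, direction)
    pair: the Euclidean norm of the difference and its unit direction."""
    diff = [b - a for a, b in zip(model_a, model_b)]
    magnitude = math.sqrt(sum(d * d for d in diff))
    direction = [d / magnitude for d in diff] if magnitude else diff
    return magnitude, direction

def variation_similarity(var1, var2):
    """Compare two variations on both amount and direction: ratio of the
    magnitudes times the cosine of the angle between the directions."""
    (m1, d1), (m2, d2) = var1, var2
    if not m1 or not m2:
        return 0.0
    cos = sum(x * y for x, y in zip(d1, d2))
    return (min(m1, m2) / max(m1, m2)) * cos
```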
  • Patent number: 7584099
    Abstract: A method, a system and a computer program product for interpreting a verbal input in a multimodal dialog system are provided. The method includes assigning (302) a confidence value to at least one word generated by a verbal recognition component. The method further includes generating (304) a semantic unit confidence score for the verbal input. The generation of a semantic unit confidence score is based on the confidence value of at least one word and at least one semantic confidence operator.
    Type: Grant
    Filed: April 6, 2005
    Date of Patent: September 1, 2009
    Assignee: Motorola, Inc.
    Inventors: Changxue C. Ma, Harry M. Bliss, Yan M. Cheng
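A semantic unit confidence score built from word confidences via a semantic confidence operator might, for instance, take the minimum, average, or product of the word scores; these operators are illustrative assumptions, not the patent's defined set.

```python
import math

def semantic_confidence(word_confidences, operator="min"):
    """Combine per-word confidence values into a semantic unit confidence
    score using a named semantic confidence operator."""
    ops = {
        "min": min,                                   # weakest-link score
        "avg": lambda xs: sum(xs) / len(xs),          # mean confidence
        "prod": lambda xs: math.prod(xs),             # joint confidence
    }
    return ops[operator](word_confidences)
```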
  • Patent number: 7571093
    Abstract: A method of identifying duplicate voice recordings by receiving digital voice recordings; selecting one of the recordings; segmenting the selected recording; extracting a pitch value per segment; estimating the total time that voice appears in the recording; removing pitch values less than or equal to a user-definable value; identifying unique pitch values; determining the frequency of occurrence of the unique pitch values; normalizing the frequencies of occurrence; determining an average pitch value; determining the distribution percentiles of the frequencies of occurrence; returning to the second step if additional recordings are to be processed; and otherwise comparing the total voice time, average pitch value, and distribution percentiles for each recording processed, and declaring as duplicates those recordings that match to within user-definable thresholds for total voice time, average pitch value, and distribution percentiles.
    Type: Grant
    Filed: August 17, 2006
    Date of Patent: August 4, 2009
    Assignee: The United States of America as represented by the Director, National Security Agency
    Inventor: Adolf Cusmariu
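The comparison pipeline in this abstract maps naturally to a per-recording signature plus a thresholded comparison. A simplified sketch (the percentile convention, the quartile choice, and the thresholds are assumptions):

```python
def pitch_signature(pitch_values, floor=60.0):
    """Summary of per-segment pitch values: drop values at or below the
    floor, then return (voiced segment count, average pitch, quartiles)."""
    voiced = sorted(p for p in pitch_values if p > floor)
    if not voiced:
        return 0, 0.0, (0.0, 0.0, 0.0)
    avg = sum(voiced) / len(voiced)

    def pct(q):
        # Nearest-rank percentile of the retained pitch values.
        return voiced[min(int(q * len(voiced)), len(voiced) - 1)]

    return len(voiced), avg, (pct(0.25), pct(0.5), pct(0.75))

def is_duplicate(sig_a, sig_b, time_tol=2, pitch_tol=5.0):
    """Declare two recordings duplicates when voiced time, average pitch,
    and the quartiles all agree within the given tolerances."""
    ta, pa, qa = sig_a
    tb, pb, qb = sig_b
    return (abs(ta - tb) <= time_tol and abs(pa - pb) <= pitch_tol
            and all(abs(x - y) <= pitch_tol for x, y in zip(qa, qb)))
```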