Subportions Patents (Class 704/249)
  • Patent number: 8812318
    Abstract: One-to-many comparisons of callers' voice prints with known voice prints to identify any matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract at least a portion of the customer's voice to create a customer voice print, and it formats the segmented voice print for network transmission to a server. The server compares the customer's voice print with multiple known voice prints to determine any matches, meaning that the customer's voice print and one of the known voice prints are likely from the same person. The identification of any matches can be used for a variety of purposes, such as determining whether to authorize a transaction requested by the customer.
    Type: Grant
    Filed: February 6, 2012
    Date of Patent: August 19, 2014
    Assignee: III Holdings 1, LLC
    Inventors: Vicki Broman, Vernon Marshall, Seshasayee Bellamkonda, Marcel Leyva, Cynthia Hanson
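The one-to-many matching step described above could be sketched as follows. This is a minimal illustration only; the patent does not specify the comparison metric, so fixed-length embedding vectors and a cosine-similarity threshold are assumed here.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two fixed-length voice-print vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_matches(customer_print, known_prints, threshold=0.85):
    # One-to-many comparison: return ids of known prints likely to be
    # from the same person as the customer's print.
    return [pid for pid, vec in known_prints.items()
            if cosine_similarity(customer_print, vec) >= threshold]

known = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
matches = find_matches([0.98, 0.05, 0.0], known)
```

A real deployment would derive the vectors from a speaker-embedding model rather than raw audio, and would tune the threshold against false-accept/false-reject rates.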
  • Publication number: 20140214425
    Abstract: A voice recognition apparatus and a method for providing response information are provided. The voice recognition apparatus according to the present disclosure includes an extractor configured to extract a first utterance element representing a user action and a second utterance element representing an object from a user's utterance voice signal; a domain determiner configured to detect an expansion domain related to the extracted first and second utterance elements based on a hierarchical domain model, and determine at least one candidate domain related to the detected expansion domain as a final domain; a communicator which performs communication with an external apparatus; and a controller configured to control the communicator to transmit information regarding the first and second utterance elements and information regarding the determined final domain.
    Type: Application
    Filed: January 31, 2014
    Publication date: July 31, 2014
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Eun-sang BAK, Kyung-duk KIM, Myung-jae KIM, Yu LIU, Seong-han RYU, Geun-bae LEE
  • Publication number: 20140195236
    Abstract: In one embodiment, a computer system stores speech data for a plurality of speakers, where the speech data includes a plurality of feature vectors and, for each feature vector, an associated sub-phonetic class. The computer system then builds, based on the speech data, an artificial neural network (ANN) for modeling speech of a target speaker in the plurality of speakers, where the ANN is configured to discriminate between instances of sub-phonetic classes uttered by the target speaker and instances of sub-phonetic classes uttered by other speakers in the plurality of speakers.
    Type: Application
    Filed: January 10, 2013
    Publication date: July 10, 2014
    Applicant: Sensory, Incorporated
    Inventors: John-Paul Hosom, Pieter J. Vermeulen, Jonathan Shaw
  • Publication number: 20140195237
    Abstract: A method and system for training a user authentication by voice signal are described. In one embodiment, a set of feature vectors are decomposed into speaker-specific recognition units. The speaker-specific recognition units are used to compute distribution values to train the voice signal. In addition, spectral feature vectors are decomposed into speaker-specific characteristic units which are compared to the speaker-specific distribution values. If the speaker-specific characteristic units are within a threshold limit of the speaker-specific distribution values, the speech signal is authenticated.
    Type: Application
    Filed: January 9, 2014
    Publication date: July 10, 2014
    Applicant: APPLE INC.
    Inventors: Jerome R. BELLEGARDA, Kim E. A. SILVERMAN
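The final comparison step, checking characteristic units against speaker-specific distribution values within a threshold limit, could be sketched as below. The representation is hypothetical: per-unit (mean, std) pairs stand in for the patent's unspecified distribution values.

```python
def authenticate(characteristic_units, distribution_values, threshold=2.0):
    # Accept the speaker only if every characteristic unit falls within
    # `threshold` standard deviations of the trained distribution mean.
    for unit, (mean, std) in zip(characteristic_units, distribution_values):
        if abs(unit - mean) > threshold * std:
            return False
    return True

trained = [(1.0, 0.2), (5.0, 0.5)]  # (mean, std) per recognition unit
accepted = authenticate([1.1, 4.8], trained)
rejected = authenticate([2.5, 5.0], trained)
```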
  • Patent number: 8768706
    Abstract: Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
    Type: Grant
    Filed: August 20, 2010
    Date of Patent: July 1, 2014
    Assignee: Multimodal Technologies, LLC
    Inventors: Kjell Schubert, Juergen Fritsch, Michael Finke, Detlef Koll
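The emphasis-by-slower-playback idea could be sketched as a per-region rate selection. The scoring rule here is an assumption; the patent describes emphasizing high-relevance and likely-incorrect regions but does not fix a formula.

```python
def playback_rate(relevance, error_likelihood, slow=0.6, normal=1.0):
    # Regions that matter most, or are most likely transcribed wrong,
    # are played back more slowly so the proofreader can catch errors.
    score = max(relevance, error_likelihood)  # both in 0.0 .. 1.0
    return slow if score >= 0.5 else normal

regions = [
    {"text": "patient denies pain", "relevance": 0.9, "error_likelihood": 0.2},
    {"text": "um, okay",            "relevance": 0.1, "error_likelihood": 0.1},
]
rates = [playback_rate(r["relevance"], r["error_likelihood"]) for r in regions]
```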
  • Patent number: 8762149
    Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice, comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify (10) that the speaker's voice corresponds to the speaker whose identity is to be verified, based on the received voice utterance; c) verifying (12, 13) that the received voice utterance is not falsified, preferably after having verified the speaker's voice; and d) accepting (16) the asserted identity if both verification steps give a positive result, and rejecting (15) it if either verification step gives a negative result. The invention further refers to a corresponding computer-readable medium and a computer.
    Type: Grant
    Filed: December 10, 2008
    Date of Patent: June 24, 2014
    Inventors: Marta Sánchez Asenjo, Alfredo Gutiérrez Navarro, Alberto Martín de los Santos de las Heras, Marta García Gomar
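The accept/reject logic of the two verification steps could be sketched as follows, with stub callables standing in for the real biometric and anti-falsification checks (both hypothetical here):

```python
def verify_speaker(utterance, verify_biometrics, verify_liveness):
    # Accept only if the voice matches the claimed identity AND the
    # utterance is judged genuine (not a replayed or synthetic recording).
    if not verify_biometrics(utterance):
        return False
    if not verify_liveness(utterance):
        return False
    return True

# Stub checkers standing in for real biometric / anti-spoofing models.
accepted = verify_speaker("hello", lambda u: True, lambda u: True)
spoofed = verify_speaker("hello", lambda u: True, lambda u: False)
```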
  • Patent number: 8751233
    Abstract: A speaker-verification digital signature system is disclosed that provides greater confidence in communications having digital signatures because a signing party may be prompted to speak a text-phrase that may be different for each digital signature, thus making it difficult for anyone other than the legitimate signing party to provide a valid signature.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Pradeep K. Bansal, Lee Begeja, Carroll W. Creswell, Jeffrey Farah, Benjamin J. Stern, Jay Wilpon
  • Publication number: 20140142943
    Abstract: A signal processing device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: receiving speech of a speaker as a first signal; detecting an expiration period included in the first signal; extracting a number of phonemes included in the expiration period; and controlling a second signal, which is output to the speaker, on the basis of the number of phonemes and the length of the expiration period.
    Type: Application
    Filed: October 15, 2013
    Publication date: May 22, 2014
    Applicant: FUJITSU LIMITED
    Inventors: Chisato Ishikawa, Taro TOGAWA, Takeshi OTANI, Masanao SUZUKI
  • Patent number: 8731940
    Abstract: A method of controlling a system which includes the steps of obtaining at least one signal representative of information communicated by a user via an input device in an environment of the user, wherein a signal from a first source is available in a perceptible form in the environment; estimating at least a point in time when a transition between information flowing from the first source and information flowing from the user is expected to occur; and timing the performance of a function by the system in relation to the estimated time.
    Type: Grant
    Filed: September 11, 2009
    Date of Patent: May 20, 2014
    Assignee: Koninklijke Philips N.V.
    Inventor: Aki Sakari Harma
  • Publication number: 20140136204
    Abstract: Methods and systems are provided for a speech system of a vehicle. In one embodiment, the method includes: generating an utterance signature from a speech utterance received from a user of the speech system without a specific need for a user identification interaction; developing a user signature for a user based on the utterance signature; and managing a dialog with the user based on the user signature.
    Type: Application
    Filed: October 22, 2013
    Publication date: May 15, 2014
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: RON M. HECHT, OMER TSIMHONI, UTE WINTER, ROBERT D. SIMS, III
  • Publication number: 20140136206
    Abstract: Provided are a mash-up service generation apparatus and method based on a voice command. The mash-up service generation apparatus includes a voice recognizer configured to convert a voice command into a character string, a mash-up natural language processor configured to extract a word corresponding to a mash-up module based on the character string, and convert the word into at least one of metadata of the mash-up module and metadata of a mash-up sequence in which a plurality of mash-up modules are combined, and a mash-up sequence processor configured to search for and select a target mash-up sequence corresponding to the metadata of the mash-up sequence, and newly generate the target mash-up sequence. Accordingly, a customized mash-up service can be provided to a user.
    Type: Application
    Filed: November 12, 2013
    Publication date: May 15, 2014
    Applicant: Electronics & Telecommunications Research Institute
    Inventors: Jae Chul KIM, Seong Ho LEE, Young Jae LIM, Yoon Seop CHANG
  • Publication number: 20140136205
    Abstract: Disclosed are a display apparatus, a voice acquiring apparatus and a voice recognition method thereof, the display apparatus including: a display unit which displays an image; a communication unit which communicates with a plurality of external apparatuses; and a controller which includes a voice recognition engine to recognize a user's voice, receives a voice signal from a voice acquiring unit, and controls the communication unit to receive candidate instruction words from at least one of the plurality of external apparatuses to recognize the received voice signal.
    Type: Application
    Filed: November 11, 2013
    Publication date: May 15, 2014
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Jong-hyuk JANG, Chan-hee CHOI, Hee-seob RYU, Kyung-mi PARK, Seung-kwon PARK, Jae-hyun BAE
  • Patent number: 8725508
    Abstract: A computer-implemented method and apparatus for searching for an element sequence, the method comprising: receiving a signal; determining an initial segment of the signal; inputting the initial segment into an element extraction engine to obtain a first element sequence; determining one or more second segments, each of the second segments at least partially overlapping with the initial segment; inputting the second segments into the element extraction engine to obtain at least one second element sequence; and searching for an element subsequence common to at least a predetermined number of sequences of the first element sequence and the second element sequences.
    Type: Grant
    Filed: March 27, 2012
    Date of Patent: May 13, 2014
    Assignee: Novospeech
    Inventor: Yossef Ben-Ezra
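The final search step, finding an element subsequence common to at least a predetermined number of the extracted sequences, could be sketched as below. A brute-force scan over contiguous runs is assumed; the patent does not prescribe the search algorithm.

```python
def common_subsequence(sequences, min_count):
    # Longest contiguous element run that appears in at least
    # `min_count` of the extracted element sequences.
    first = sequences[0]
    best = []
    for i in range(len(first)):
        for j in range(i + 1, len(first) + 1):
            cand = first[i:j]
            hits = sum(1 for seq in sequences
                       if any(seq[k:k + len(cand)] == cand
                              for k in range(len(seq) - len(cand) + 1)))
            if hits >= min_count and len(cand) > len(best):
                best = cand
    return best

# Phoneme sequences from the initial segment and two overlapping segments.
seqs = [["k", "ae", "t"], ["ae", "t", "s"], ["k", "ae", "p"]]
result = common_subsequence(seqs, min_count=2)
```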
  • Publication number: 20140129219
    Abstract: A computer-implemented system and method for masking special data is provided. Speakers of a call recording are identified. The call recording is separated into strands corresponding to each of the speakers. A prompt list of elements that prompt the speaker of the other strand to utter special information is applied to one of the strands. At least one of the elements of the prompt list is identified in the one strand. A special information candidate is identified in the other strand and is located after a location in time where the element was found in the voice recording of the one strand. A confidence score is assigned to the element located in the one strand and to the special information candidate in the other strand. The confidence scores are combined and a threshold is applied. The special information candidate is rendered unintelligible when the combined confidence scores satisfy the threshold.
    Type: Application
    Filed: November 4, 2013
    Publication date: May 8, 2014
    Applicant: Intellisist, Inc.
    Inventors: Howard M. Lee, Steven Lutz, Gilad Odinak
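The combine-and-threshold decision at the end of the abstract could be sketched as below. Simple additive combination is assumed; the patent only states that the two confidence scores are combined and a threshold applied.

```python
def should_mask(prompt_confidence, candidate_confidence, threshold=1.2):
    # Combine the confidence that a prompt was found in the agent strand
    # with the confidence that what follows in the caller strand is
    # special information; mask the candidate when the sum passes threshold.
    return (prompt_confidence + candidate_confidence) >= threshold

mask_hit = should_mask(0.7, 0.6)   # strong prompt + strong candidate
mask_miss = should_mask(0.4, 0.5)  # weak evidence on both strands
```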
  • Publication number: 20140129224
    Abstract: A method and apparatus for utterance verification are provided for verifying a recognized vocabulary output from speech recognition. The apparatus for utterance verification includes a reference score accumulator, a verification score generator and a decision device. A log-likelihood score obtained from speech recognition is processed by taking a logarithm of the value of the probability of one of feature vectors of an input speech conditioned on one of states of each model vocabulary. A verification score is generated based on the processed result. The verification score is compared with a predetermined threshold value so as to reject or accept the recognized vocabulary.
    Type: Application
    Filed: December 17, 2012
    Publication date: May 8, 2014
    Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
    Inventor: Shih-Chieh Chien
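The verification-score step could be sketched as follows, using a frame-averaged log-likelihood against a fixed threshold. The per-frame probabilities and the threshold value are illustrative assumptions, not the patent's actual parameters.

```python
import math

def verification_score(frame_probs):
    # Average log-likelihood of the feature vectors under the
    # recognized vocabulary's model states.
    return sum(math.log(p) for p in frame_probs) / len(frame_probs)

def accept(frame_probs, threshold=-1.0):
    # Accept the recognized word only if the verification score
    # clears the predetermined threshold; otherwise reject it.
    return verification_score(frame_probs) >= threshold

confident = accept([0.8, 0.7, 0.9])
doubtful = accept([0.1, 0.2, 0.1])
```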
  • Publication number: 20140122077
    Abstract: A voice agent device includes: a position detection unit which detects a position of a person in a conversation space to which the voice agent device is capable of providing information; a voice volume detection unit which detects a voice volume of the person from a sound signal in the conversation space obtained by a sound acquisition unit; a conversation area determination unit which determines a conversation area as a first area including the position when the voice volume has a first voice volume value and determines the conversation area as a second area including the position and being smaller than the first area when the voice volume has a second voice volume value smaller than the first voice volume value, the conversation area being a spatial range where an utterance of the person can be heard; and an information provision unit which provides provision information to the conversation area.
    Type: Application
    Filed: October 25, 2013
    Publication date: May 1, 2014
    Applicant: Panasonic Corporation
    Inventors: Yuri NISHIKAWA, Kazunori YAMADA
  • Patent number: 8706499
    Abstract: Client devices periodically capture ambient audio waveforms, generate waveform fingerprints, and upload the fingerprints to a server for analysis. The server compares the waveforms to a database of stored waveform fingerprints, and upon finding a match, pushes content or other information to the client device. The fingerprints in the database may be uploaded by other users, and compared to the received client waveform fingerprint based on common location or other social factors. Thus a client's location may be enhanced if the location of users whose fingerprints match the client's is known. In particular embodiments, the server may instruct clients whose fingerprints partially match to capture waveform data at a particular time and duration for further analysis and increased match confidence.
    Type: Grant
    Filed: August 16, 2011
    Date of Patent: April 22, 2014
    Assignee: Facebook, Inc.
    Inventors: Matthew Nicholas Papakipos, David Harry Garcia
  • Publication number: 20140095162
    Abstract: Improved techniques are disclosed for permitting a user to employ more human-based grammar (i.e., free form or conversational input) while addressing a target system via a voice system. For example, a technique for determining intent associated with a spoken utterance of a user comprises the following steps/operations. Decoded speech uttered by the user is obtained. An intent is then extracted from the decoded speech uttered by the user. The intent is extracted in an iterative manner such that a first class is determined after a first iteration and a sub-class of the first class is determined after a second iteration. The first class and the sub-class of the first class are hierarchically indicative of the intent of the user, e.g., a target and data that may be associated with the target. The multi-stage intent extraction approach may have more than two iterations.
    Type: Application
    Filed: October 1, 2013
    Publication date: April 3, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Dimitri Kanevsky, Joseph Simon Reisinger, Roberto Sicconi, Mahesh Viswanathan
  • Patent number: 8688451
    Abstract: A speech recognition method includes receiving input speech from a user, processing the input speech using a first grammar to obtain parameter values of a first N-best list of vocabulary, comparing a parameter value of a top result of the first N-best list to a threshold value, and if the compared parameter value is below the threshold value, then additionally processing the input speech using a second grammar to obtain parameter values of a second N-best list of vocabulary. Other preferred steps include: determining the input speech to be in-vocabulary if any of the results of the first N-best list is also present within the second N-best list, but out-of-vocabulary if none of the results of the first N-best list is within the second N-best list; and providing audible feedback to the user if the input speech is determined to be out-of-vocabulary.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: April 1, 2014
    Assignee: General Motors LLC
    Inventors: Timothy J. Grost, Rathinavelu Chengalvarayan
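The two-grammar control flow above could be sketched as follows, with each grammar modeled as a callable returning an N-best list of (word, confidence) pairs (a hypothetical interface, not GM's actual API):

```python
def recognize(speech, grammar1, grammar2, threshold=0.6):
    # grammar1 / grammar2 map input speech to an N-best list of
    # (word, confidence) pairs, best first.
    nbest1 = grammar1(speech)
    if nbest1[0][1] >= threshold:
        return nbest1[0][0]              # confident: accept the top result
    nbest2 = grammar2(speech)            # otherwise consult the second grammar
    words2 = {w for w, _ in nbest2}
    for word, _ in nbest1:
        if word in words2:               # in-vocabulary: the lists agree
            return word
    return None                          # out-of-vocabulary: give audible feedback

g1 = lambda s: [("call", 0.4), ("fall", 0.3)]
g2 = lambda s: [("fall", 0.5), ("tall", 0.2)]
result = recognize("audio", g1, g2)
```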
  • Publication number: 20140081639
    Abstract: The communication support device includes: a storing unit configured to store an utterance of a first speaker transmitted from a first terminal as utterance information; an analyzing unit configured to obtain a holding notice which sets communications with the first terminal to a holding state, the communications being transmitted from a second terminal used by a second speaker who communicates with the first speaker, and to analyze features of utterance information which correspond to a time of a holding state; and an instructing unit configured to output to the second terminal determination information on the first speaker based on the features of the utterance information of the first speaker.
    Type: Application
    Filed: August 30, 2013
    Publication date: March 20, 2014
    Applicant: FUJITSU LIMITED
    Inventors: Naoto KAWASHIMA, Naoto MATSUDAIRA, Yuusuke TOUNAI, Hiroshi YOSHIDA, Shingo HIRONO
  • Publication number: 20140081638
    Abstract: The invention refers to a method for comparing voice utterances, the method comprising the steps: extracting a plurality of features (201) from a first voice utterance of a given text sample and extracting a plurality of features (201) from a second voice utterance of said given text sample, wherein each feature is extracted as a function of time, and wherein each feature of the second voice utterance corresponds to a feature of the first voice utterance; and applying dynamic time warping (202) to one or more time-dependent characteristics of the first and/or second voice utterance.
    Type: Application
    Filed: December 10, 2009
    Publication date: March 20, 2014
    Inventors: Jesus Antonio Villalba Lopez, Alfonso Ortega Gimenez, Eduardo Lleida Solano, Sara Varela Redondo, Marta Garcia Gomar
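The dynamic time warping step referenced above is a standard algorithm; a minimal version over 1-D feature tracks could look like this (the patent applies it to extracted per-feature time series, for which simple scalar tracks are assumed here):

```python
def dtw_distance(a, b):
    # Classic dynamic time warping between two 1-D feature tracks
    # (e.g. pitch contours), aligning them despite tempo differences.
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[len(a)][len(b)]

# The same contour spoken at a different tempo still aligns cheaply.
same_tempo_shift = dtw_distance([1, 2, 3], [1, 2, 2, 3])
different_content = dtw_distance([1, 2, 3], [4, 5, 6])
```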
  • Publication number: 20140081640
    Abstract: One aspect includes determining validity of an identity asserted by a speaker using a voice print associated with a user whose identity the speaker is asserting, the voice print obtained from characteristic features of at least one first voice signal obtained from the user uttering at least one enrollment utterance including at least one enrollment word by obtaining a second voice signal of the speaker uttering at least one challenge utterance that includes at least one word not in the at least one enrollment utterance, obtaining at least one characteristic feature from the second voice signal, comparing the at least one characteristic feature with at least a portion of the voice print to determine a similarity between the at least one characteristic feature and the at least a portion of the voice print, and determining whether the speaker is the user based, at least in part, on the similarity.
    Type: Application
    Filed: November 21, 2013
    Publication date: March 20, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Kevin R. Farrell, David A. James, William F. Ganong, III, Jerry K. Carter
  • Patent number: 8676580
    Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
    Type: Grant
    Filed: August 16, 2011
    Date of Patent: March 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
  • Publication number: 20140046666
    Abstract: According to an embodiment, an information processing apparatus includes a dividing unit, an assigning unit, and a generating unit. The dividing unit is configured to divide speech data into pieces of utterance data. The assigning unit is configured to assign speaker identification information to each piece of utterance data based on an acoustic feature of the each piece of utterance data. The generating unit is configured to generate a candidate list that indicates candidate speaker names so as to enable a user to determine a speaker name to be given to the piece of utterance data identified by instruction information, based on operation history information in which at least pieces of utterance identification information, pieces of the speaker identification information, and speaker names given by the user to the respective pieces of utterance data are associated with one another.
    Type: Application
    Filed: August 6, 2013
    Publication date: February 13, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Osamu Nishiyama, Taira Ashikawa, Tomoo Ikeda, Kouji Ueno, Kouta Nakata
  • Publication number: 20140039893
    Abstract: Disclosed embodiments provide for personalizing a voice user interface of a remote multi-user service. A voice user interface for the remote multi-user service can be provided and voice information from an identified user can be received at the multi-user service through the voice user interface. A language model specific to the identified user can be retrieved that models one or more language elements. The retrieved language model can be applied to interpret the received voice information and a response can be generated by the multi-user service in response the interpreted voice information.
    Type: Application
    Filed: July 31, 2012
    Publication date: February 6, 2014
    Applicant: SRI INTERNATIONAL
    Inventor: Steven Weiner
  • Patent number: 8639508
    Abstract: A method of automatic speech recognition includes receiving an utterance from a user via a microphone that converts the utterance into a speech signal, pre-processing the speech signal using a processor to extract acoustic data from the received speech signal, and identifying at least one user-specific characteristic in response to the extracted acoustic data. The method also includes determining a user-specific confidence threshold responsive to the at least one user-specific characteristic, and using the user-specific confidence threshold to recognize the utterance received from the user and/or to assess confusability of the utterance with stored vocabulary.
    Type: Grant
    Filed: February 14, 2011
    Date of Patent: January 28, 2014
    Assignee: General Motors LLC
    Inventors: Xufang Zhao, Gaurav Talwar
  • Publication number: 20140025377
    Abstract: A speech recognition system, method of recognizing speech and a computer program product therefor. A client device identified with a context for an associated user selectively streams audio to a provider computer, e.g., a cloud computer. Speech recognition receives streaming audio, maps utterances to specific textual candidates and determines a likelihood of a correct match for each mapped textual candidate. A context model selectively winnows candidate to resolve recognition ambiguity according to context whenever multiple textual candidates are recognized as potential matches for the same mapped utterance. Matches are used to update the context model, which may be used for multiple users in the same context.
    Type: Application
    Filed: August 10, 2012
    Publication date: January 23, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fernando Luiz Koch, Julio Nogima
  • Publication number: 20140025378
    Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.
    Type: Application
    Filed: September 24, 2013
    Publication date: January 23, 2014
    Applicant: Google Inc.
    Inventors: Petar Aleksic, Xin Lei
  • Patent number: 8630860
    Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
    Type: Grant
    Filed: March 3, 2011
    Date of Patent: January 14, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
  • Publication number: 20140012576
    Abstract: A signal processing method includes separating a mixed sound signal in which a plurality of excitations are mixed into the respective excitations, and performing speech detection on the plurality of separated excitation signals, judging whether or not the plurality of excitation signals are speech and generating speech section information indicating speech/non-speech information for each excitation signal. The signal processing method also includes at least one of calculating and analyzing an utterance overlap duration using the speech section information for combinations of the plurality of excitation signals and of calculating and analyzing a silence duration. The signal processing method further includes calculating a degree of establishment of a conversation based on the extracted utterance overlap duration or the silence duration.
    Type: Application
    Filed: June 26, 2013
    Publication date: January 9, 2014
    Applicant: PANASONIC CORPORATION
    Inventors: Maki YAMADA, Mitsuru ENDO, Koichiro MIZUSHIMA
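The overlap- and silence-duration calculation from the speech section information could be sketched as below, assuming discrete time units and (start, end) speech intervals per separated source (a representation chosen for illustration):

```python
def overlap_and_silence(sections_a, sections_b, total):
    # sections_* are (start, end) speech intervals for each separated
    # excitation; overlap = both speaking, silence = neither speaking.
    overlap = silence = 0
    for t in range(total):
        a = any(s <= t < e for s, e in sections_a)
        b = any(s <= t < e for s, e in sections_b)
        overlap += a and b
        silence += not a and not b
    return overlap, silence

# Two speakers over 10 time units: a brief overlap, then a shared pause.
ov, si = overlap_and_silence([(0, 4)], [(3, 7)], total=10)
```

A low overlap together with short silences would then feed the degree-of-establishment score; the scoring function itself is not specified in the abstract.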
  • Publication number: 20140012577
    Abstract: The system and method described herein may use various natural language models to deliver targeted advertisements and track advertisement interactions in voice recognition contexts. In particular, in response to an input device receiving an utterance, a conversational language processor may select and deliver one or more advertisements targeted to a user that spoke the utterance based on cognitive models associated with the user, various users having similar characteristics to the user, an environment in which the user spoke the utterance, or other criteria. Further, subsequent interaction with the targeted advertisements may be tracked to build and refine the cognitive models and thereby enhance the information used to deliver targeted advertisements in response to subsequent utterances.
    Type: Application
    Filed: September 3, 2013
    Publication date: January 9, 2014
    Applicant: VoiceBox Technologies Corporation
    Inventors: Tom Freeman, Mike Kennewick
  • Publication number: 20140012575
    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) based on an analysis of a speech input, are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for a domain, such as the medical domain. In some embodiments, words and/or phrases that may be confused by an ASR system may be determined and associated in sets of words and/or phrases. Words and/or phrases that may be determined include those that change a meaning of a phrase or sentence when included in the phrase/sentence.
    Type: Application
    Filed: July 9, 2012
    Publication date: January 9, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
  • Patent number: 8626505
    Abstract: A computer implemented method, system, and/or computer program product generates an audio cohort. Audio data from a set of audio sensors is received by an audio analysis engine. The audio data, which is associated with a plurality of objects, comprises a set of audio patterns. The audio data is processed to identify audio attributes associated with the plurality of objects to form digital audio data. This digital audio data comprises metadata that describes the audio attributes of the set of objects. A set of audio cohorts is generated using the audio attributes associated with the digital audio data and cohort criteria, where each audio cohort in the set of audio cohorts is a cohort of accompanied customers in a store, and where processing the audio data identifies a type of zoological creature that is accompanying each of the accompanied customers.
    Type: Grant
    Filed: September 6, 2012
    Date of Patent: January 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Robert L. Angell, Robert R. Friedlander, James R. Kraemer
  • Patent number: 8626508
    Abstract: Provided are a speech search device and speech search method that perform fuzzy search with very fast search speed and excellent search performance. In addition to the fuzzy search, the distance between phoneme discrimination features included in the speech data is calculated to determine similarity to the speech, using both a suffix array and dynamic programming. The search space is narrowed by dividing the search keyword into phoneme-based parts and applying search thresholds to the divided keywords; the search is repeated while increasing the thresholds in order, and whether to divide the keyword is determined from its length, thereby implementing speech search that is both very fast and highly accurate.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: January 7, 2014
    Assignee: National University Corporation TOYOHASHI UNIVERSITY OF TECHNOLOGY
    Inventors: Koichi Katsurada, Tsuneo Nitta, Shigeki Teshima
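A toy sketch of the repeated search with thresholds increased in order, as described in the abstract above: each utterance stores a per-keyword-part phoneme distance, and the distance threshold is widened step by step until some utterance matches every part. The entries, distances, and thresholds are invented; the patented method computes these distances with a suffix array plus dynamic programming over phoneme discrimination features.

```python
# Invented per-utterance phoneme distances for two keyword parts.
ENTRIES = {
    "utt1": {"to": 1.0, "kyo": 4.0},
    "utt2": {"to": 2.5, "kyo": 2.0},
}

def fuzzy_search(entries, parts, thresholds=(1.0, 2.0, 3.0)):
    """Search repeatedly, widening the distance threshold in order."""
    for thresh in thresholds:
        hits = [utt for utt, dists in entries.items()
                if all(dists.get(p, float("inf")) <= thresh for p in parts)]
        if hits:
            return hits, thresh      # stop at the first threshold that matches
    return [], None

print(fuzzy_search(ENTRIES, ["to", "kyo"]))
```

Starting strict and loosening in order keeps the candidate set small at each pass, which is the narrowing effect the abstract describes.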
  • Patent number: 8626507
    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: January 7, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, Michael J. Johnston
  • Patent number: 8612225
    Abstract: A voice recognition device that recognizes the voice of an input voice signal comprises: a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.
    Type: Grant
    Filed: February 26, 2008
    Date of Patent: December 17, 2013
    Assignee: NEC Corporation
    Inventors: Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa
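The detail-level selection step above can be sketched as a nearest-neighbour choice over stored feature vectors. The feature vectors and level names below are invented stand-ins for whatever feature properties the device actually stores.

```python
import math

# Invented detail levels, each described by a small feature vector.
DETAIL_LEVELS = {
    "coarse": [1.0, 1.0],
    "medium": [2.0, 2.0],
    "fine":   [3.0, 3.0],
}

def select_detail_level(features, levels=DETAIL_LEVELS):
    """Pick the detail level whose stored features are closest to the input."""
    return min(levels, key=lambda name: math.dist(levels[name], features))

print(select_detail_level([2.1, 1.9]))
```

Here Euclidean distance is an assumption; any similarity measure over the stored feature properties would fit the same selection scheme.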
  • Patent number: 8612224
    Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers; the method comprising: receiving speech; dividing the speech into segments as it is received; processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising: performing primary decoding of the segment using an acoustic model and a language model; obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile; outputting the decoded speech for the identified speaker, wherein the speaker profiles are updated.
    Type: Grant
    Filed: August 23, 2011
    Date of Patent: December 17, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Catherine Breslin, Mark John Francis Gales, Kean Kheong Chin, Katherine Mary Knill
  • Publication number: 20130325473
    Abstract: Embodiments of systems and methods for speaker verification are provided. In various embodiments, a method includes receiving an utterance from a speaker and determining a text-independent speaker verification score and a text-dependent speaker verification score in response to the utterance. Various embodiments include a system for speaker verification, the system comprising an audio receiving device for receiving an utterance from a speaker and converting the utterance to an utterance signal, and a processor coupled to the audio receiving device for determining speaker verification in response to the utterance signal, wherein the processor determines speaker verification in response to a UBM-independent speaker-normalized score.
    Type: Application
    Filed: May 23, 2013
    Publication date: December 5, 2013
    Applicant: Agency for Science, Technology and Research
    Inventors: Anthony Larcher, Kong Aik Lee, Bin Ma, Thai Ngoc Thuy Huong
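The verification method above combines a text-independent and a text-dependent score. A simple weighted sum is one common fusion rule and is used here as an illustrative assumption; the weights, threshold, and score values are invented, and the patent's actual combination (including the UBM-independent speaker-normalized score) may differ.

```python
def fused_score(ti_score, td_score, w_ti=0.5, w_td=0.5):
    """Weighted fusion of text-independent and text-dependent scores."""
    return w_ti * ti_score + w_td * td_score

def verify(ti_score, td_score, threshold=0.5):
    """Accept the speaker when the fused score clears the threshold."""
    return fused_score(ti_score, td_score) >= threshold

print(verify(0.8, 0.4))  # strong TI score lifts a weak TD score
print(verify(0.2, 0.3))  # both scores low: rejected
```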
  • Publication number: 20130289992
    Abstract: A voice recognition method includes: detecting a vocal section including a vocal sound in a voice, based on a feature value of an audio signal representing the voice; identifying a word expressed by the vocal sound in the vocal section, by matching the feature value of the audio signal of the vocal section and an acoustic model of each of a plurality of words; and selecting, with a processor, the word expressed by the vocal sound in a word section based on a comparison result between a signal characteristic of the word section and a signal characteristic of the vocal section.
    Type: Application
    Filed: March 18, 2013
    Publication date: October 31, 2013
    Applicant: FUJITSU LIMITED
    Inventor: Shouji HARADA
  • Patent number: 8571865
    Abstract: Systems, methods performed by data processing apparatus, and computer storage media encoded with computer programs for receiving information relating to (i) a communication device that has received an utterance and (ii) a voice associated with the received utterance, comparing the received voice information with voice signatures in a comparison group, the comparison group including one or more individuals identified from one or more connections arising from the received information relating to the communication device, attempting to identify the voice associated with the utterance as matching one of the individuals in the comparison group, and based on a result of the attempt to identify, selectively providing the communication device with access to one or more resources associated with the matched individual.
    Type: Grant
    Filed: August 10, 2012
    Date of Patent: October 29, 2013
    Assignee: Google Inc.
    Inventor: Philip Hewinson
  • Patent number: 8571867
    Abstract: A method (700) and system (900) for authenticating a user is provided. The method can include receiving one or more spoken utterances from a user (702), recognizing a phrase corresponding to one or more spoken utterances (704), identifying a biometric voice print of the user from one or more spoken utterances of the phrase (706), determining a device identifier associated with the device (708), and authenticating the user based on the phrase, the biometric voice print, and the device identifier (710). A location of the handset or the user can be employed as criteria for granting access to one or more resources (712).
    Type: Grant
    Filed: September 13, 2012
    Date of Patent: October 29, 2013
    Assignee: Porticus Technology, Inc.
    Inventors: Germano Di Mambro, Bernardas Salna
  • Publication number: 20130268273
    Abstract: A method of recognizing the gender or age of a speaker according to speech emotion or arousal includes the following steps: A) segmenting speech signals into a plurality of speech segments; B) fetching the first speech segment from the plural speech segments to acquire at least one of an emotional feature or an arousal degree in the speech segment; C) determining whether at least one of the emotional feature and the arousal degree conforms to a given condition; if yes, proceeding to step D); if no, returning to step B) and fetching the next speech segment; D) fetching the feature indicative of gender or age from the speech segment to acquire at least one feature parameter; and E) recognizing the at least one feature parameter to determine the gender or age of the speaker at the currently-processed speech segment.
    Type: Application
    Filed: July 27, 2012
    Publication date: October 10, 2013
    Inventors: Oscal Tzyh-Chiang Chen, Ping-Tsung Lu, Jia-You Ke
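Steps A) through E) above can be sketched as an arousal-gated pipeline. The arousal values, the pitch-based gender stand-in, and all thresholds below are invented for illustration; the patent does not specify these particular features.

```python
def arousal(segment):
    return segment["arousal"]          # step B: per-segment arousal degree

def classify_gender(segment):
    # Steps D/E stand-in: a crude pitch cutoff in place of a real classifier.
    return "female" if segment["pitch_hz"] > 165 else "male"

def recognize(segments, arousal_threshold=0.5):
    results = []
    for seg in segments:               # step A: speech already segmented
        if arousal(seg) < arousal_threshold:
            continue                   # step C fails: fetch the next segment
        results.append(classify_gender(seg))
    return results

segments = [
    {"arousal": 0.2, "pitch_hz": 120},   # skipped: arousal too low
    {"arousal": 0.8, "pitch_hz": 210},
    {"arousal": 0.7, "pitch_hz": 110},
]
print(recognize(segments))
```

The gating in step C) means only sufficiently emotional or aroused segments reach the gender/age classifier, which is the core of the claimed method.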
  • Patent number: 8554566
    Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
    Type: Grant
    Filed: November 29, 2012
    Date of Patent: October 8, 2013
    Assignee: Morphism LLC
    Inventor: James H. Stephens, Jr.
  • Patent number: 8548807
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: October 1, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
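The restructured acoustic space above replaces each dictionary phoneme's model with a weighted sum over plausible phonemes' acoustic models. A minimal numeric sketch, with invented weights and scores (real acoustic models score feature frames, not single numbers):

```python
def custom_phoneme_score(weights, model_scores):
    """Score a dictionary phoneme as a weighted sum of plausible phonemes.

    weights: plausible phoneme -> mixture weight
    model_scores: phoneme -> acoustic model score for the current input
    """
    return sum(w * model_scores[p] for p, w in weights.items())

# e.g. a target speaker's /r/ modelled as mostly /r/ with some /l/ mixed in
weights = {"r": 0.7, "l": 0.3}
model_scores = {"r": 0.9, "l": 0.5}
print(custom_phoneme_score(weights, model_scores))  # ≈ 0.78
```

Note that the pronouncing dictionary itself never changes; only the acoustic model behind each dictionary phoneme becomes a mixture.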
  • Patent number: 8532989
    Abstract: A command recognition device includes: an utterance understanding unit that determines or selects word sequence information from speech information; a speech confidence degree calculating unit that calculates a degree of speech confidence based on the speech information and the word sequence information; a phrase confidence degree calculating unit that calculates a degree of phrase confidence based on image information and phrase information included in the word sequence information; and a motion control instructing unit that determines whether a command of the word sequence information should be executed based on the degree of speech confidence and the degree of phrase confidence.
    Type: Grant
    Filed: September 2, 2010
    Date of Patent: September 10, 2013
    Assignee: Honda Motor Co., Ltd.
    Inventors: Kotaro Funakoshi, Mikio Nakano, Xiang Zuo, Naoto Iwahashi, Ryo Taguchi
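The execute-or-reject decision above depends on both confidence degrees. The combination rule below (independent thresholds on each degree) is an assumption for illustration; the patent only states that the decision is based on both confidences, not how they are combined.

```python
def should_execute(speech_conf, phrase_conf,
                   speech_thresh=0.6, phrase_thresh=0.6):
    """Execute the command only when both confidences clear their thresholds."""
    return speech_conf >= speech_thresh and phrase_conf >= phrase_thresh

print(should_execute(0.8, 0.7))  # both confidences high: execute
print(should_execute(0.8, 0.3))  # phrase confidence too low: reject
```

Requiring both degrees guards against executing a command that was heard clearly but grounded on the wrong visual object, or vice versa.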
  • Patent number: 8521527
    Abstract: A computer-implemented system and method for processing audio in a voice response environment is provided. A database of host scripts each comprising signature files of audio phrases and actions to take when one of the audio phrases is recognized is maintained. The host scripts are loaded and a call to a voice mail server is initiated. Incoming audio buffers are received during the call from voice messages stored on the voice mail server. The incoming audio buffers are processed. A signature data structure is created for each audio buffer. The signature data structure is compared with signatures of expected phrases in the host scripts. The actions stored in the host scripts are executed when the signature data structure matches the signature of the expected phrase.
    Type: Grant
    Filed: September 10, 2012
    Date of Patent: August 27, 2013
    Assignee: Intellisist, Inc.
    Inventor: Martin R. M. Dunsmuir
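The host-script flow above (signature of each incoming audio buffer compared against expected phrases, with the matching script's action executed) can be sketched as a lookup loop. Signatures are simplified to plain strings here, and the script table is invented; the patented system computes real signature data structures from the audio.

```python
# Invented host script: expected-phrase signature -> action to take.
HOST_SCRIPTS = {
    "sig:greeting":  "press_1",
    "sig:main_menu": "press_2",
}

def process_buffers(buffer_signatures, scripts=HOST_SCRIPTS):
    """Compare each buffer's signature with the script and collect actions."""
    actions = []
    for sig in buffer_signatures:
        if sig in scripts:             # signature matches an expected phrase
            actions.append(scripts[sig])
    return actions

print(process_buffers(["sig:noise", "sig:greeting", "sig:main_menu"]))
```

Unmatched buffers (noise, silence) simply fall through, so the script only drives the call forward on recognized prompts.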
  • Patent number: 8515753
    Abstract: The example embodiment of the present invention provides an acoustic model adaptation method for enhancing recognition performance for a non-native speaker's speech. In order to adapt acoustic models, first, pronunciation variations are examined by analyzing a non-native speaker's speech. Thereafter, based on the pronunciation variations of a non-native speaker's speech, acoustic models are adapted in a state-tying step during a training process of acoustic models. When the present invention for adapting acoustic models is combined with a conventional acoustic model adaptation scheme, further enhanced recognition performance can be obtained. The example embodiment of the present invention enhances recognition performance for a non-native speaker's speech while reducing the degradation of recognition performance for a native speaker's speech.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: August 20, 2013
    Assignee: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Yoo Rhee Oh, Jae Sam Yoon
  • Patent number: 8489397
    Abstract: A machine-readable medium and a network device are provided for speech-to-text translation. Speech packets are received at a broadband telephony interface and stored in a buffer. The speech packets are processed and textual representations thereof are displayed as words on a display device. Speech processing is activated and deactivated in response to a command from a subscriber.
    Type: Grant
    Filed: September 11, 2012
    Date of Patent: July 16, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Charles David Caldwell, John Bruce Harlow, Robert J. Sayko, Norman Shaye
  • Publication number: 20130173268
    Abstract: A method for verifying that a person is registered to use a telemedical device includes identifying an unprompted trigger phrase in words spoken by a person and received by the telemedical device. The telemedical device prompts the person to state a name of a registered user and optionally prompts the person to state health tips for the person. The telemedical device verifies that the person is the registered user using utterance data generated from the unprompted trigger phrase, name of the registered user, and health tips.
    Type: Application
    Filed: December 29, 2011
    Publication date: July 4, 2013
    Applicant: Robert Bosch GmbH
    Inventors: Fuliang Weng, Taufiq Hasan, Zhe Feng
  • Publication number: 20130166283
    Abstract: A phoneme rule generating apparatus includes a spectrum analyzer configured to analyze pronunciation patterns of voices included in a plurality of voice data, a clusterer configured to cluster the plurality of voice data based on the analyzed pronunciation patterns, a voice group generator configured to generate voice groups from the clustered voice data, a phoneme rule generator configured to generate a phoneme rule corresponding to each respective voice group from among the generated voice groups, and a group mapping DB configured to store the generated voice groups and the generated phoneme rules for accurate voice recognition.
    Type: Application
    Filed: December 26, 2012
    Publication date: June 27, 2013
    Applicant: KT CORPORATION
    Inventor: KT Corporation