Subportions Patents (Class 704/249)
-
Patent number: 10192219
Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
Type: Grant
Filed: January 8, 2015
Date of Patent: January 29, 2019
Assignee: Capital One Services, LLC
Inventors: Lawrence Douglas, Paul Y. Moreton
-
Patent number: 10056094
Abstract: In some example embodiments, a system is provided for real-time analysis of audio signals. First digital audio signals are retrieved from memory. First computed streamed signal information corresponding to each of the first digital audio signals is generated by computing first metrics data for the first digital audio signals, the first computed streamed signal information including the first metrics data. The first computed streamed signal information is stored in the memory. The first computed streamed signal information is transmitted to one or more computing devices. Transmitting the first computed streamed signal information to the one or more computing devices causes the first computed streamed signal information to be displayed at the one or more computing devices.
Type: Grant
Filed: March 12, 2015
Date of Patent: August 21, 2018
Assignee: Cogito Corporation
Inventors: Joshua Feast, Ali Azarbayejani, Skyler Place
-
Patent number: 10037760
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
Type: Grant
Filed: August 4, 2017
Date of Patent: July 31, 2018
Inventors: Dominik Roblek, Matthew Sharifi
-
Patent number: 9912617
Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
Type: Grant
Filed: January 4, 2017
Date of Patent: March 6, 2018
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
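The onset-caching scheme this abstract describes (hold back non-voice blocks, then flush a fixed-length subsequence of them when voice onset is decided, so the start of the utterance is not clipped) can be sketched as a toy. This is an illustrative sketch only, not the patented implementation; the names `OnsetBuffer` and `process` are invented for the example.

```python
from collections import deque

class OnsetBuffer:
    """Cache non-voice audio blocks; on voice onset, transmit the cached
    subsequence (marked as reprocessed) ahead of the present block."""

    def __init__(self, subsequence_length=3):
        # deque with maxlen keeps only the most recent non-voice blocks
        self.cache = deque(maxlen=subsequence_length)

    def process(self, block, is_voice):
        """Return the list of (block, tag) pairs to transmit for this block."""
        if is_voice:
            # Voice onset: send the cached blocks first, then the present one.
            out = [(b, "reprocessed") for b in self.cache] + [(block, "voice")]
            self.cache.clear()
            return out
        # Non-voice: cache the present block instead of transmitting it.
        self.cache.append(block)
        return []

buf = OnsetBuffer(subsequence_length=2)
assert buf.process("b1", False) == []          # cached, nothing sent
assert buf.process("b2", False) == []
assert buf.process("b3", False) == []          # cache now holds b2, b3
assert buf.process("b4", True) == [
    ("b2", "reprocessed"), ("b3", "reprocessed"), ("b4", "voice")]
```

The `maxlen` on the deque is what enforces the "predetermined length" of the retrieved subsequence: older non-voice blocks silently fall off the far end.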
-
Patent number: 9875474
Abstract: A method is provided for securing a transaction made by bank card, the transaction involving a remote provision, by a user, of data existing in a bank card in his possession. The method includes: obtaining data existing in the bank card to be used, called textual data; obtaining at least one portion of the textual data in the form of an audio data stream, called a sound sample, resulting from reading the data existing in the bank card to be used; computing a current voice signature from said sound sample; comparing said current voice signature with a reference voice signature pre-recorded and associated with the textual data of the bank card; and rejecting the transaction when the current voice signature differs from the reference voice signature by a value greater than a predetermined threshold.
Type: Grant
Filed: January 16, 2015
Date of Patent: January 23, 2018
Assignee: INGENICO GROUP
Inventor: Michel Leger
-
Patent number: 9858919
Abstract: A method includes providing a deep neural network acoustic model, receiving audio data including one or more utterances of a speaker, extracting a plurality of speech recognition features from the one or more utterances of the speaker, creating a speaker identity vector for the speaker based on the extracted speech recognition features, and adapting the deep neural network acoustic model for automatic speech recognition using the extracted speech recognition features and the speaker identity vector.
Type: Grant
Filed: September 29, 2014
Date of Patent: January 2, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: George A. Saon
-
Patent number: 9804822
Abstract: An electronic apparatus and a controlling method thereof are disclosed. The electronic apparatus includes a voice input unit configured to receive a user voice, a storage unit configured to store a plurality of voice print feature models representing a plurality of user voices and a plurality of utterance environment models representing a plurality of environmental disturbances, and a controller, in response to a user voice being input through the voice input unit, configured to extract utterance environment information of an utterance environment model among the plurality of utterance environment models corresponding to a location where the user voice is input, compare a voice print feature of the input user voice with the plurality of voice print feature models, revise a result of the comparison based on the extracted utterance environment information, and recognize a user corresponding to the input user voice based on the revised result.
Type: Grant
Filed: April 28, 2015
Date of Patent: October 31, 2017
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Chi-sang Jung, Byung-jin Hwang
-
Patent number: 9741348
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
Type: Grant
Filed: June 24, 2016
Date of Patent: August 22, 2017
Assignee: Google Inc.
Inventors: Dominik Roblek, Matthew Sharifi
-
Patent number: 9727603
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining query refinements using search data. In one aspect, a method includes receiving a first query and a second query each comprising one or more n-grams for a user session, determining a first set of query refinements for the first query, determining a second set of query refinements from the first set of query refinements, each query refinement in the second set of query refinements including at least one n-gram that is similar to an n-gram from the first query and at least one n-gram that is similar to an n-gram from the second query, scoring each query refinement in the second set of query refinements, selecting a third query from a group consisting of the second set of query refinements and the second query, and providing the third query as input to a search operation.
Type: Grant
Filed: July 30, 2015
Date of Patent: August 8, 2017
Assignee: Google Inc.
Inventors: Matthias Heiler, Behshad Behzadi, Evgeny A. Cherepanov, Nils Grimsmo, Aurelien Boffy, Alessandro Agostini, Karoly Csalogany, Fredrik Bergenlid, Marcin M. Nowak-Przygodzki
-
Patent number: 9728191
Abstract: Techniques for automatically identifying a speaker in a conversation as a known person based on processing of audio of the speaker's voice to extract characteristics of that voice and on an automated comparison of those characteristics to known characteristics of the known person's voice. A speaker segmentation process may be performed on audio of the conversation to produce, for each speaker in the conversation, a segment that includes the audio of that speaker. Audio of each of the segments may then be processed to extract characteristics of that speaker's voice. The characteristics derived from each segment (and thus for multiple speakers) may then be compared to characteristics of the known person's voice to determine whether the speaker for that segment is the known person. For each segment, a degree of match between the voice characteristics of the speaker and the voice characteristics of the known person may be calculated.
Type: Grant
Filed: August 27, 2015
Date of Patent: August 8, 2017
Assignee: Nuance Communications, Inc.
Inventors: Emanuele Dalmasso, Daniele Colibro, Claudio Vair, Kevin R. Farrell
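The per-segment matching step (a degree of match between each segment's voice characteristics and the known person's) is commonly realized as a vector similarity against a threshold. The sketch below assumes voice characteristics are already extracted as fixed-length vectors and uses cosine similarity as the degree of match; the function names and the 0.8 threshold are illustrative, not from the patent.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def label_segments(segment_vectors, known_vector, threshold=0.8):
    """For each speaker segment, compute a degree of match against the
    known person's voice characteristics and flag likely matches."""
    return [(i, cosine(vec, known_vector) >= threshold)
            for i, vec in enumerate(segment_vectors)]

known = [1.0, 0.0, 0.5]                     # known person's characteristics
segments = [[0.9, 0.1, 0.45],               # close to the known person
            [0.0, 1.0, 0.0]]                # a different speaker
labels = label_segments(segments, known)
assert labels == [(0, True), (1, False)]
```

Diarization (producing one segment per speaker) happens upstream; this sketch only covers the comparison stage the abstract ends on.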
-
Patent number: 9704485
Abstract: The present invention relates to a multimedia information retrieval method and electronic device, the multimedia information retrieval method comprising the steps of: extracting from a to-be-retrieved multimedia the voice of the to-be-retrieved multimedia; recognizing the voice of the to-be-retrieved multimedia to obtain a recognized text; and retrieving a multimedia database according to the recognized text to obtain the multimedia information of the to-be-retrieved multimedia. The present invention also relates to an electronic device. The multimedia information retrieval method and electronic device of the present invention can automatically, quickly, and comprehensively present to a user the multimedia information the user wants to know, thus greatly improving user retrieval efficiency and retrieval success rate.
Type: Grant
Filed: February 4, 2015
Date of Patent: July 11, 2017
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventors: Peng Hu, Teng Zhang
-
Patent number: 9646605
Abstract: A system and method are presented for using spoken word verification to reduce false alarms by exploiting global and local contexts on a lexical level, a phoneme level, and on an acoustical level. The reduction of false alarms may occur through a process that determines whether a word has been detected or if it is a false alarm. Training examples are used to generate models of internal and external contexts which are compared to test word examples. The word may be accepted or rejected based on comparison results. Comparison may be performed either at the end of the process or at multiple steps of the process to determine whether the word is rejected.
Type: Grant
Filed: January 22, 2013
Date of Patent: May 9, 2017
Assignee: Interactive Intelligence Group, Inc.
Inventors: Konstantin Biatov, Aravind Ganapathiraju, Felix Immanuel Wyss
-
Patent number: 9571425
Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
Type: Grant
Filed: March 21, 2013
Date of Patent: February 14, 2017
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
-
Patent number: 9508341
Abstract: Features are disclosed for active learning to identify the words which are likely to improve the guessing and automatic speech recognition (ASR) after manual annotation. When a speech recognition system needs pronunciations for words, a lexicon is typically used. For unknown words, pronunciation-guessing (G2P) may be included to provide pronunciations in an unattended (e.g., automatic) fashion. However, having manually (e.g., by a human) annotated pronunciations provides better ASR than having automatic pronunciations that may, in some instances, be wrong. The included active learning features help to direct these limited annotation resources.
Type: Grant
Filed: September 3, 2014
Date of Patent: November 29, 2016
Assignee: Amazon Technologies, Inc.
Inventors: Alok Ulhas Parlikar, Andrew Jake Rosenbaum, Jeffrey Paul Lilly, Jeffrey Penrod Adams
-
Patent number: 9401148
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for inputting speech data that corresponds to a particular utterance to a neural network; determining an evaluation vector based on output at a hidden layer of the neural network; comparing the evaluation vector with a reference vector that corresponds to a past utterance of a particular speaker; and based on comparing the evaluation vector and the reference vector, determining whether the particular utterance was likely spoken by the particular speaker.
Type: Grant
Filed: March 28, 2014
Date of Patent: July 26, 2016
Assignee: Google Inc.
Inventors: Xin Lei, Erik McDermott, Ehsan Variani, Ignacio L. Moreno
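The key idea here is that the speaker vector is read off a hidden layer rather than the network's output. A minimal sketch, assuming a toy randomly initialized network standing in for a trained one, and cosine similarity with an arbitrary 0.7 threshold as the comparison (both are assumptions for illustration, not details from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy one-hidden-layer network; in practice this would be trained on speech.
# The output layer is omitted because verification uses the hidden activation.
W_hidden = rng.standard_normal((8, 16))

def hidden_vector(features):
    """Evaluation vector: the hidden-layer activation for an utterance."""
    return np.tanh(W_hidden @ features)

def verify(features, reference_vector, threshold=0.7):
    """Cosine-compare the evaluation vector against a reference vector
    derived from a past utterance of the claimed speaker."""
    ev = hidden_vector(features)
    sim = ev @ reference_vector / (
        np.linalg.norm(ev) * np.linalg.norm(reference_vector))
    return sim >= threshold

enroll = np.ones(16)                # stand-in for enrollment speech features
reference = hidden_vector(enroll)   # stored from a past utterance
assert verify(enroll, reference)        # same features: accepted
assert not verify(-enroll, reference)   # mirrored features: rejected
```

In real systems the reference vector is usually an average over several enrollment utterances, which this sketch skips.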
-
Patent number: 9390709
Abstract: A semiconductor integrated circuit device for voice recognition includes: a signal processing unit which generates a feature pattern representing a state of distribution of frequency components of an input voice signal; a voice recognition database storage unit which stores a voice recognition database including a standard pattern representing a state of distribution of frequency components of plural phonemes; a conversion list storage unit which stores a conversion list including plural words or sentences to be conversion candidates; a standard pattern extraction unit which extracts a standard pattern corresponding to character data representing the first syllable of each word or sentence included in the conversion list, from the voice recognition database; and a matching detection unit which compares the feature pattern generated from the first syllable of the voice signal with the extracted standard pattern and thus detects the matching of the syllable.
Type: Grant
Filed: September 20, 2013
Date of Patent: July 12, 2016
Assignee: SEIKO EPSON CORPORATION
Inventor: Tsutomu Nonaka
-
Patent number: 9286892
Abstract: Some implementations include a computer-implemented method. The method can include providing a training set of text samples to a semantic parser that associates text samples with actions. The method can include obtaining, for each of one or more of the text samples of the training set, data that indicates one or more domains that the semantic parser has associated with the text sample. For each of one or more domains, a subset of the text samples of the training set can be generated that the semantic parser has associated with the domain. Using the subset of text samples associated with the domain, a language model can be generated for one or more of the domains. Speech recognition can be performed on an utterance using the one or more language models that are generated for the one or more of the domains.
Type: Grant
Filed: April 1, 2014
Date of Patent: March 15, 2016
Assignee: Google Inc.
Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein
-
Patent number: 9043207
Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
Type: Grant
Filed: November 12, 2009
Date of Patent: May 26, 2015
Assignee: Agnitio S.L.
Inventors: Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez, Marta Garcia Gomar
-
Publication number: 20150142441
Abstract: A display apparatus is provided. The display apparatus includes a communicator configured to communicate with a voice recognition apparatus that recognizes an uttered voice of a user, an input unit configured to receive the uttered voice of the user, a display unit configured to receive voice recognition result information about the uttered voice of the user from the voice recognition apparatus and display the voice recognition result information, and a processor configured to, when the display apparatus is turned on, perform an access to the voice recognition apparatus by transmitting access request information to the voice recognition apparatus, and when the uttered voice is inputted through the input unit, transmit voice information on the uttered voice to the voice recognition apparatus through the communicator.
Type: Application
Filed: November 18, 2014
Publication date: May 21, 2015
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Myung-jae KIM, Hee-seob RYU, Kwang-il HWANG
-
Publication number: 20150142440
Abstract: Feedback mechanisms to the user of a Head Mounted Display (HMD) are provided. It is important to provide feedback to the user when speech is recognized as soon as possible after the user utters a voice command. The HMD displays and/or audibly renders an ASR acknowledgment in a manner that ensures the user that the HMD has received/understood his voiced command.
Type: Application
Filed: November 13, 2014
Publication date: May 21, 2015
Inventors: Christopher Parkinson, James Woodall
-
Patent number: 9026442
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: August 14, 2014
Date of Patent: May 5, 2015
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
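The "weighted sum of acoustic models" step can be illustrated with a deliberate simplification: here each phoneme's acoustic model is reduced to a single mean vector, and the custom model for a dictionary phoneme is the normalized weighted combination of the mean vectors of the plausible phonemes from the lattice. Real acoustic models (e.g. Gaussian mixtures) are richer than this; the function and weight values are invented for the example.

```python
import numpy as np

def custom_phoneme_model(plausible_models, weights):
    """Build a dictionary phoneme's custom model as a weighted sum of the
    acoustic models (simplified to mean vectors) of all plausible phonemes."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize lattice weights
    return sum(wi * m for wi, m in zip(w, plausible_models))

# Two plausible phonemes with toy 2-D mean vectors; the lattice weighs
# the first three times as heavily as the second.
models = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
custom = custom_phoneme_model(models, [3.0, 1.0])
assert np.allclose(custom, [0.75, 0.25])
```

The pronouncing dictionary itself is untouched, matching the abstract: only the acoustic representation behind each dictionary phoneme is replaced.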
-
Publication number: 20150112681
Abstract: A voice retrieval device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: setting detection criteria for a retrieval word, based on a characteristic of the retrieval word, such that the higher the detection accuracy of the retrieval word or the lower the pronunciation difficulty of the retrieval word or the lower the appearance probability of the retrieval word, the stricter the detection criteria; performing first voice retrieval processing on voice data according to the detection criteria and detecting a section that possibly includes the retrieval word as a candidate section from the voice data; and performing second voice retrieval processing different from the first voice retrieval processing on each candidate section and determining whether or not the retrieval word is included in each candidate section.
Type: Application
Filed: October 16, 2014
Publication date: April 23, 2015
Applicant: Fujitsu Limited
Inventors: Masakiyo TANAKA, Hitoshi Iwamida, Nobuyuki Washio
-
Publication number: 20150112682
Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify that the speaker's voice corresponds to the speaker whose identity is to be verified based on the received voice utterance; and c) verifying that the received voice utterance is not falsified, preferably after having verified the speaker's voice; d) accepting the speaker's identity to be verified in case that both verification steps give a positive result and not accepting the speaker's identity to be verified if any of the verification steps give a negative result. The invention further refers to a corresponding computer readable medium and a computer.
Type: Application
Filed: January 5, 2015
Publication date: April 23, 2015
Inventors: Luis Buera Rodriguez, Marta Garcia Gomar, Marta Sanchez Asenjo, Alberto Martin de los Santos de las Heras, Alfredo Gutierrez, Carlos Vaquero Aviles-Casco, Alfonso Ortega Gimenez
-
Publication number: 20150095029
Abstract: Engaging persona candidates are provided with a skills assessment that includes vocal behavior. Each candidate provides both scripted and spontaneous answers to questions in a situational setting that closely matches the daily demands of the customer support industry. Samples of the candidate's speech are evaluated to identify distinct voice cues that qualitatively describe speech characteristics, which are scored based on the candidate's spoken performance. One or more of the voice cues are mapped to phonetic analytics that quantitatively describe vocal behavior. Each voice cue also has an assigned weight. The voice cue scores for each phonetic analytic are multiplied by their assigned weights and added together to form a weighted phonetic analytic, which is then used to form a part of the vocal behavior risk assessments.
Type: Application
Filed: October 2, 2013
Publication date: April 2, 2015
Applicant: StarTek, Inc.
Inventors: Ted Nardin, James Keaten
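The scoring arithmetic in this abstract (multiply each voice-cue score by its assigned weight, then sum) is a plain weighted sum. A minimal sketch, with cue names, scores, and weights invented for illustration:

```python
def weighted_analytic(cue_scores, cue_weights):
    """Multiply each voice-cue score by its assigned weight and sum the
    products to form a weighted phonetic analytic."""
    return sum(cue_scores[cue] * cue_weights[cue] for cue in cue_scores)

# Hypothetical cue scores (from a candidate's spoken performance)
# and assigned weights (which cues matter most for this analytic).
scores = {"pace": 4.0, "clarity": 3.5, "warmth": 5.0}
weights = {"pace": 0.2, "clarity": 0.5, "warmth": 0.3}

# 4.0*0.2 + 3.5*0.5 + 5.0*0.3 = 0.8 + 1.75 + 1.5 = 4.05
assert abs(weighted_analytic(scores, weights) - 4.05) < 1e-9
```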
-
Patent number: 8996387
Abstract: For clearing transaction data selected for a processing, there is generated in a portable data carrier (1) a transaction acoustic signal (003; 103; 203) (S007; S107; S207) upon whose acoustic reproduction by an end device (10) at least transaction data selected for the processing are reproduced superimposed acoustically with a melody specific to a user of the data carrier (1) (S009; S109; S209). The generated transaction acoustic signal (003; 103; 203) is electronically transferred to an end device (10) (S108; S208), which processes the selected transaction data (S011; S121; S216) only when the user of the data carrier (1) confirms vis-à-vis the end device (10) an at least partial match both of the acoustically reproduced melody with the user-specific melody and of the acoustically reproduced transaction data with the selected transaction data (S010; S110, S116; S210).
Type: Grant
Filed: September 8, 2009
Date of Patent: March 31, 2015
Assignee: Giesecke & Devrient GmbH
Inventors: Thomas Stocker, Michael Baldischweiler
-
Patent number: 8996373
Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
Type: Grant
Filed: October 5, 2011
Date of Patent: March 31, 2015
Assignee: Fujitsu Limited
Inventors: Shoji Hayakawa, Naoshi Matsuo
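The determination step reduces to comparing two model likelihoods for the same input. A toy sketch, assuming each speaker-state model is a one-dimensional Gaussian over a single speech feature (real models would be far richer; the means and variances here are invented):

```python
import math

def log_likelihood(sample, mean, var):
    """Log-likelihood of a 1-D feature under a Gaussian state model."""
    return -0.5 * (math.log(2 * math.pi * var) + (sample - mean) ** 2 / var)

def detect_state(sample, undepressed=(0.0, 1.0), depressed=(2.0, 1.0)):
    """Compare the likelihoods of the two specific-speaker models and
    return the state whose model better explains the input voice."""
    l_undep = log_likelihood(sample, *undepressed)
    l_dep = log_likelihood(sample, *depressed)
    return "undepressed" if l_undep >= l_dep else "depressed"

assert detect_state(0.2) == "undepressed"   # feature near the first model
assert detect_state(1.8) == "depressed"     # feature near the second model
```

Working in log-likelihoods is the standard trick: products of many per-frame probabilities become sums, avoiding floating-point underflow.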
-
Publication number: 20150088514
Abstract: Techniques for providing virtual assistants to assist users during a voice communication between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. For instance, the first user may utilize her device to place a telephone call to the device of the second user. A virtual assistant may also join the call and, upon invocation by a user on the call, may identify voice commands from the call and may perform corresponding tasks for the users in response.
Type: Application
Filed: September 25, 2013
Publication date: March 26, 2015
Applicant: Rawles LLC
Inventor: Marcello Typrin
-
Publication number: 20150081301
Abstract: A system includes a user speech profile stored on a computer readable storage device, the speech profile containing a plurality of phonemes with user identifying characteristics for the phonemes, and a speech processor coupled to access the speech profile to generate a phrase containing user distinguishing phonemes based on a difference between the user identifying characteristics for such phonemes and average user identifying characteristics, such that the phrase has discriminability from other users. The speech processor may also or alternatively select the phrase as a function of ambient noise.
Type: Application
Filed: September 18, 2013
Publication date: March 19, 2015
Applicant: Lenovo (Singapore) Pte, Ltd.
Inventors: John Weldon Nicholson, Steven Richard Perrin
-
Publication number: 20150081302
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic feature can include a movement pattern of head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.
Type: Application
Filed: November 24, 2014
Publication date: March 19, 2015
Inventors: Ann K. SYRDAL, Sumit CHOPRA, Patrick Haffner, Taniya MISHRA, Ilija ZELJKOVIC, Eric Zavesky
-
Patent number: 8977547
Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold Tl; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
Type: Grant
Filed: October 8, 2009
Date of Patent: March 10, 2015
Assignee: Mitsubishi Electric Corporation
Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
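The stability check (accept a registration only if the repeated utterances are similar enough to one another) can be sketched with a pairwise comparison. This is an illustrative approximation: the similarity function, the scalar "features", and the threshold are all invented for the example, and the patent's actual similarity measure may differ.

```python
def registration_acceptable(utterance_features, similarity, threshold=0.75):
    """Accept registration only if every pair of repeated utterances
    exceeds the similarity threshold (utterance stability check)."""
    pairs = [(a, b)
             for i, a in enumerate(utterance_features)
             for b in utterance_features[i + 1:]]
    return all(similarity(a, b) > threshold for a, b in pairs)

# Toy similarity on scalar features: closer values are more similar.
sim = lambda a, b: 1.0 - abs(a - b)

assert registration_acceptable([0.5, 0.55, 0.6], sim)       # stable: accept
assert not registration_acceptable([0.1, 0.9], sim)         # unstable: reject
```

Only accepted utterance sets would then feed the standard-pattern creation step.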
-
Publication number: 20150058017
Abstract: Disclosed in some examples are systems, methods, devices, and machine readable mediums which may produce an audio recording with included verification from the individuals in the recording that the recording is accurate. In some examples, the system may also provide rights management control to those individuals. This may ensure that individuals participating in audio events that are to be recorded are assured that their words are not changed, taken out of context, or otherwise altered and that they retain control over the use of their words even after the physical file has left their control.
Type: Application
Filed: August 20, 2013
Publication date: February 26, 2015
Inventors: Dave Paul Singh, Dominic Fulginti, Mahendra Tadi Tadikonda, Tobias Kohlenberg
-
Patent number: 8949125
Abstract: Systems and methods are provided to select a most typical pronunciation of a location name on a map from a plurality of user pronunciations. A server generates a reference speech model based on user pronunciations, compares the user pronunciations with the speech model and selects a pronunciation based on the comparison. Alternatively, the server compares the distances between each user pronunciation and every other user pronunciation and selects a pronunciation based on the comparison. The server then annotates the map with the selected pronunciation and provides the audio output of the location name to a user device upon a user's request.
Type: Grant
Filed: June 16, 2010
Date of Patent: February 3, 2015
Assignee: Google Inc.
Inventor: Gal Chechik
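The pairwise-distance variant amounts to picking a medoid: the pronunciation whose summed distance to all the others is smallest is the "most typical" one. A sketch, using one-dimensional stand-in features and absolute difference as the distance (a real system would compare acoustic feature sequences, e.g. with a DTW-style distance; this choice is an assumption for illustration):

```python
def select_typical(pronunciations, distance):
    """Return the pronunciation whose summed distance to every other
    pronunciation is smallest (the medoid of the set)."""
    return min(pronunciations,
               key=lambda p: sum(distance(p, q) for q in pronunciations))

# Toy 1-D pronunciation features; one outlier at 5.0.
samples = [0.9, 1.0, 1.1, 1.3, 5.0]
typical = select_typical(samples, lambda a, b: abs(a - b))
assert typical == 1.1   # the central sample, robust to the outlier
```

Unlike averaging, medoid selection always returns one of the actual user pronunciations, so the annotation on the map is a real recording rather than a synthetic blend.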
-
Patent number: 8947499
Abstract: Methods and systems for communicating with rate control. A communication is sent and received from a first device to a second device over a network, wherein the communication comprises at least one audio stream and a second communication stream. A capacity of the network is probed at the first device for the sending and receiving the communication. A presence of a voice in the at least one audio stream is detected at the first device via a voice activity detection of the at least one audio stream. A rate limit is set for the sending and receiving the communication at the first device based on the capacity of the network and the detection of the presence of the at least one audio stream.
Type: Grant
Filed: December 6, 2012
Date of Patent: February 3, 2015
Assignee: TangoMe, Inc.
Inventors: Alexander Subbotin, Olivier Furon, Shaowei Su, Yevgeni Litvin, Xu Liu
-
Publication number: 20150019219Abstract: Systems and methods for arbitrating spoken dialog services include determining a capability catalog associated with a plurality of devices accessible within an environment. The capability catalog includes a list of the plurality of devices mapped to a list of spoken dialog services provided by each of the plurality of devices. The system arbitrates between the plurality of devices and the spoken dialog services in the capability catalog to determine a selected device and a selected dialog service.Type: ApplicationFiled: December 2, 2013Publication date: January 15, 2015Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLCInventors: ELI TZIRKEL-HANCOCK, GREG T. LINDEMANN, ROBERT D. SIMS, OMER TSIMHONI
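The capability catalog described above is essentially a device-to-services map, and arbitration is a lookup plus a preference rule. A hedged sketch, where the device names, service names, and fixed priority order are all invented for illustration:

```python
# Illustrative arbitration over a capability catalog: the catalog maps each
# device to the spoken dialog services it provides; given a requested
# service, pick the first device offering it in a priority order. The
# priority scheme below is an assumption, not the patented logic.

catalog = {
    "head_unit": ["navigation", "media"],
    "phone": ["navigation", "calls", "messages"],
}

def arbitrate(service, priority=("head_unit", "phone")):
    """Return (selected device, selected service), or (None, service)."""
    for device in priority:
        if service in catalog.get(device, []):
            return device, service
    return None, service

print(arbitrate("calls"))       # → ('phone', 'calls')
print(arbitrate("navigation"))  # → ('head_unit', 'navigation')
```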
-
Publication number: 20150006176Abstract: A speech-based audio device may be configured to detect a user-uttered wake expression and to respond by interpreting subsequent words or phrases as commands. In order to distinguish between utterance of the wake expression by the user and generation of the wake expression by the device itself, directional audio signals may be analyzed to detect whether the wake expression has been received from multiple directions. If the wake expression has been received from many directions, it is declared as being generated by the audio device and ignored. Otherwise, if the wake expression is received from a single direction or a limited number of directions, the wake expression is declared as being uttered by the user and subsequent words or phrases are interpreted and acted upon by the audio device.Type: ApplicationFiled: June 27, 2013Publication date: January 1, 2015Inventors: Michael Alan Pogue, Philip Ryan Hilmes
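The core of the direction test is a simple count: a wake word heard on many directional beams at once is likely the device's own output. A minimal sketch, assuming detections arrive as a list of beam angles and that "limited number" means at most two distinct directions (the threshold is an illustrative assumption):

```python
# Minimal sketch of the direction test: if a wake expression is detected on
# many directional beams at once, treat it as device self-generated audio
# and ignore it. The threshold of 2 distinct directions is an assumption.

def is_user_utterance(detected_directions, max_directions=2):
    """True if the wake word came from few enough directions to be a user."""
    return len(set(detected_directions)) <= max_directions

print(is_user_utterance([30]))               # → True  (single direction)
print(is_user_utterance([0, 90, 180, 270]))  # → False (heard everywhere)
```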
-
Patent number: 8914278Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.Type: GrantFiled: July 31, 2008Date of Patent: December 16, 2014Assignee: Ginger Software, Inc.Inventors: Yael Karov Zangvil, Avner Zangvil
-
Publication number: 20140365219Abstract: A method for verifying that a person is registered to use a telemedical device includes identifying an unprompted trigger phrase in words spoken by a person and received by the telemedical device. The telemedical device prompts the person to state a name of a registered user and optionally prompts the person to state health tips for the person. The telemedical device verifies that the person is the registered user using utterance data generated from the unprompted trigger phrase, name of the registered user, and health tips.Type: ApplicationFiled: August 26, 2014Publication date: December 11, 2014Inventors: Fuliang Weng, Taufiq Hasan, Zhe Feng
-
Publication number: 20140358535Abstract: A method of performing a voice command function in an electronic device includes detecting voice of a user, acquiring one or more pieces of attribute information from the voice, and authenticating the user by comparing the attribute information with pre-stored authentic attribute information, using a recognition model. An electronic device includes a voice input module configured to detect a voice of a user, a first processor configured to acquire one or more pieces of attribute information from the voice and authenticate the user by comparing the attribute information with a recognition model, and a second processor configured to, when the attribute information matches the recognition model, activate the voice command function, receive a voice command of the user, and execute an application corresponding to the voice command. Other embodiments are also disclosed.Type: ApplicationFiled: May 28, 2014Publication date: December 4, 2014Applicant: Samsung Electronics Co., Ltd.Inventors: Sanghoon Lee, Kyungtae Kim, Subhojit Chakladar, Taejin Lee, Seokyeong Jung
-
Publication number: 20140350933Abstract: A voice recognition apparatus includes: an extractor configured to extract utterance elements from a user's uttered voice; an LSP converter configured to convert the extracted utterance elements into LSP formats; and a controller configured to determine whether an utterance element related to an OOV exists among the utterance elements converted into the LSP formats with reference to vocabulary list information including pre-registered vocabularies, and to determine an OOD area in which it is impossible to provide response information in response to the uttered voice, in response to determining that the utterance element related to the OOV exists. Accordingly, the voice recognition apparatus provides appropriate response information according to a user's intent by considering a variety of utterances and possibilities regarding a user's uttered voice.Type: ApplicationFiled: May 27, 2014Publication date: November 27, 2014Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventors: Eun-sang BAK, Kyung-duk KIM, Hyung-jong NOH, Seong-han RYU, Geun-bae LEE
-
Publication number: 20140350934Abstract: Systems and methods are provided for voice identification. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as an identification result.Type: ApplicationFiled: May 30, 2014Publication date: November 27, 2014Applicant: Tencent Technology (Shenzhen) Company LimitedInventors: Lou Li, Li Lu, Xiang Zhang, Feng Rao, Shuai Yue, Bo Chen, Jianxiong Ma, Haibo Liu
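The pipeline in this abstract runs syllable confusion network → word lattice → optimal character sequence. A heavily simplified sketch: the network is modeled as a list of slots, each a dict of candidate syllables to scores; picking the best candidate per slot and mapping through a toy phonetic dictionary stands in for the full lattice search. The network, dictionary, and scores below are all invented for illustration:

```python
# Hedged sketch of the decoding chain: a syllable confusion network as a
# list of slots (candidate syllable -> score), reduced to the best syllable
# per slot, then mapped through a toy phonetic dictionary. The real method
# builds and searches a word lattice; this greedy stand-in is illustrative.

def best_sequence(confusion_network, dictionary):
    """Pick the top syllable in each slot and look the sequence up."""
    syllables = [max(slot, key=slot.get) for slot in confusion_network]
    return dictionary.get(tuple(syllables), " ".join(syllables))

network = [{"ni": 0.9, "li": 0.1}, {"hao": 0.8, "hou": 0.2}]
dictionary = {("ni", "hao"): "你好"}
print(best_sequence(network, dictionary))  # → 你好
```

A production decoder would score full paths jointly (e.g. with dynamic programming) rather than per slot.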
-
Publication number: 20140324432Abstract: A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.Type: ApplicationFiled: May 2, 2014Publication date: October 30, 2014Applicant: AT&T Intellectual Property I, L.P.Inventor: Robert Wesley Bossemeyer, JR.
-
Patent number: 8868431Abstract: A recognition dictionary creation device identifies the language of a reading of an inputted text which is a target to be registered and adds a reading with phonemes in the language identified thereby to the target text to be registered, and also converts the reading of the target text to be registered from the phonemes in the language identified thereby to phonemes in a language to be recognized which is handled in voice recognition to create a recognition dictionary in which the converted reading of the target text to be registered is registered.Type: GrantFiled: February 5, 2010Date of Patent: October 21, 2014Assignee: Mitsubishi Electric CorporationInventors: Michihiro Yamazaki, Jun Ishii, Yasushi Ishikawa
-
Publication number: 20140288932Abstract: An interactive response system mixes HSR subsystems with ASR subsystems to facilitate overall capability of voice user interfaces. The system permits imperfect ASR subsystems to nonetheless relieve burden on HSR subsystems. An ASR proxy is used to implement an IVR system, and the proxy dynamically determines how many ASR and HSR subsystems are to perform recognition for any particular utterance, based on factors such as confidence thresholds of the ASRs and availability of human resources for HSRs.Type: ApplicationFiled: July 8, 2013Publication date: September 25, 2014Inventors: Yoryos Yeracaris, Alwin B. Carus, Larissa Lapshina
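The proxy's routing decision turns on two of the factors the abstract names: ASR confidence and human (HSR) availability. A minimal sketch of that decision, with the threshold value and availability model invented for illustration (the patented proxy weighs more factors and can fan out to several subsystems):

```python
# Illustrative ASR-proxy routing: accept the automated result when its
# confidence clears a threshold; otherwise fall back to a human recognizer
# if one is free. Threshold and availability model are assumptions.

def route_utterance(asr_confidence, humans_available, threshold=0.85):
    """Return which subsystem's result to use for this utterance."""
    if asr_confidence >= threshold:
        return "asr"
    return "hsr" if humans_available > 0 else "asr"

print(route_utterance(0.92, humans_available=3))  # → asr
print(route_utterance(0.40, humans_available=3))  # → hsr
print(route_utterance(0.40, humans_available=0))  # → asr (degraded)
```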
-
Publication number: 20140278419Abstract: A voice command definition file (VCDF) declaratively defines voice commands for an application. For example, the VCDF may include definitions for: voice commands; one or more phrases/utterances that may be said to execute each of the commands; a navigation location to navigate to within the application (e.g. a page); phrase lists containing items that may be used as a parameter in a voice command; examples; feedback; and the like. A user may say a single utterance to launch the application, navigate to the associated location of the command and execute the command. The VCDF may define multiple ways to listen for a particular command. The VCDF may be edited/defined by a user and may include a user friendly name for an application. A speech engine loads the VCDF for use such that it may recognize the commands associated with an application. The definitions may be updated during runtime.Type: ApplicationFiled: March 14, 2013Publication date: September 18, 2014Inventors: F. Avery Bishop, Travis Wilson, Robert Chambers, Robert Brown
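The abstract enumerates what a VCDF declares: commands, listen phrases, a navigation target, and phrase lists usable as parameters. A hypothetical, much-simplified version of such a structure and a matcher over it; the schema, app name, command names, and phrases below are all invented, and the real VCDF format is not specified here:

```python
# Hypothetical command-definition structure in the spirit of the abstract:
# each command lists listen-phrases, a navigation target, and a phrase list
# usable as a parameter. The schema is invented for illustration only.

VCDF = {
    "app_name": "MovieApp",
    "commands": {
        "PlayMovie": {
            "listen_for": ["play {title}", "start {title}"],
            "navigate_to": "PlayerPage",
            "phrase_lists": {"title": ["Inception", "Up"]},
        }
    },
}

def match(utterance, vcdf):
    """Return (command, parameter) for the first matching listen phrase."""
    for name, cmd in vcdf["commands"].items():
        for pattern in cmd["listen_for"]:
            prefix = pattern.split("{")[0]       # text before the slot
            if utterance.startswith(prefix):
                title = utterance[len(prefix):]
                # Single phrase list assumed for this sketch.
                if title in cmd["phrase_lists"]["title"]:
                    return name, title
    return None, None

print(match("play Inception", VCDF))  # → ('PlayMovie', 'Inception')
```

Note how one utterance carries both the command and its parameter, matching the "single utterance" flow the abstract describes.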
-
Publication number: 20140278420Abstract: An electronic device digitally combines a single voice input with each of a series of noise samples. Each noise sample is taken from a different audio environment (e.g., street noise, babble, interior car noise). The voice input/noise sample combinations are used to train a voice recognition model database without the user having to repeat the voice input in each of the different environments. In one variation, the electronic device transmits the user's voice input to a server that maintains and trains the voice recognition model database.Type: ApplicationFiled: December 3, 2013Publication date: September 18, 2014Applicant: Motorola Mobility LLCInventors: John R. Meloney, Joel A. Clark, Joseph C. Dwyer, Adrian M. Schuster, Snehitha Singaraju, Robert A. Zurek
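The augmentation step is a per-sample digital mix: one clean voice recording combined with each noise sample so every environment yields a training utterance without re-recording. A minimal sketch using plain Python lists as stand-ins for audio buffers; the 0.3 noise gain and noise looping are illustrative assumptions:

```python
# Minimal sketch of the training-data augmentation: one clean voice sample
# is digitally combined with several noise samples so each environment
# yields one training utterance. Lists stand in for audio buffers; the
# 0.3 noise gain is an illustrative assumption.

def mix(voice, noise, noise_gain=0.3):
    """Element-wise mix of a voice buffer with a (looped) noise buffer."""
    return [v + noise_gain * noise[i % len(noise)] for i, v in enumerate(voice)]

voice = [0.5, -0.2, 0.1]
environments = {"street": [0.05, 0.02], "babble": [0.1, -0.1]}

# One noisy training copy per environment, from a single recording.
training_set = {name: mix(voice, n) for name, n in environments.items()}
print(sorted(training_set))  # one entry per noise environment
```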
-
Patent number: 8831942Abstract: A method is provided for identifying a gender of a speaker. The method steps include obtaining speech data of the speaker, extracting vowel-like speech frames from the speech data, analyzing the vowel-like speech frames to generate a feature vector having pitch values corresponding to the vowel-like frames, analyzing the pitch values to generate a most frequent pitch value, determining, in response to the most frequent pitch value being between a first pre-determined threshold and a second pre-determined threshold, an output of a male Gaussian Mixture Model (GMM) and an output of a female GMM using the pitch values as inputs to the male GMM and the female GMM, and identifying the gender of the speaker by comparing the output of the male GMM and the output of the female GMM based on a pre-determined criterion.Type: GrantFiled: March 19, 2010Date of Patent: September 9, 2014Assignee: Narus, Inc.Inventor: Antonio Nucci
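The decision path in this abstract hinges on the most frequent pitch value: below one threshold it indicates male, above another female, and only the in-between band requires the GMM comparison. A hedged sketch of that control flow, with the threshold values invented and the GMM scoring stubbed out:

```python
# Hedged sketch of the pitch-based decision path: take the most frequent
# pitch value from the vowel-like frames; outside the two thresholds the
# answer is immediate, and only the in-between band would invoke the
# male/female GMM comparison (stubbed here). Thresholds are assumptions.

from collections import Counter

def classify_gender(pitch_values, low=160.0, high=190.0):
    mode_pitch = Counter(pitch_values).most_common(1)[0][0]
    if mode_pitch < low:
        return "male"
    if mode_pitch > high:
        return "female"
    return "ambiguous: compare male/female GMM outputs"

print(classify_gender([110, 112, 110, 115]))  # → male
print(classify_gender([210, 208, 210]))       # → female
```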
-
Publication number: 20140244258Abstract: A voice recognition method for a single sentence including a multi-instruction in an interactive voice user interface includes the steps of detecting a connection ending by analyzing the morphemes of a single sentence on which voice recognition has been performed, separating the single sentence into a plurality of passages based on the connection ending, detecting a multi-connection ending by analyzing the connection ending, extracting instructions by specifically analyzing passages including the multi-connection ending, and outputting a multi-instruction included in the single sentence by combining the instructions extracted in the step of extracting instructions. In accordance with the present invention, consumer usability can be significantly increased because a multi-operation intention can be checked in one sentence.Type: ApplicationFiled: October 18, 2013Publication date: August 28, 2014Applicant: Mediazen Co., Ltd.Inventors: Minkyu SONG, Hyejin KIM, Sangyoon KIM
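The separation step splits one sentence into passages at connective morphemes, yielding one instruction per passage. A rough sketch in which English connectives ("and then" / "and") stand in for the Korean connection endings the patent actually analyzes; real morpheme analysis is far more involved than this string split:

```python
# Illustrative sketch: split one sentence into passages on connective
# markers and treat each passage as one instruction. English "and then" /
# "and" stand in for the Korean connection endings analyzed in the patent.

import re

def extract_instructions(sentence):
    passages = re.split(r"\s+(?:and then|and)\s+", sentence)
    return [p.strip() for p in passages if p.strip()]

print(extract_instructions("turn on the TV and then lower the volume"))
# → ['turn on the TV', 'lower the volume']
```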
-
Publication number: 20140236598Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.Type: ApplicationFiled: April 29, 2013Publication date: August 21, 2014Applicant: Google Inc.Inventor: Google Inc.
-
Publication number: 20140236599Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. A particular method includes detecting that a frequency of occurrence of a particular type of utterance satisfies a threshold. The method further includes tuning a speech recognition system with respect to the particular type of utterance.Type: ApplicationFiled: April 25, 2014Publication date: August 21, 2014Applicant: AT&T Intellectual Property I, L.P.Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
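The trigger condition is just an occurrence count against a threshold. A minimal sketch: tally utterance types from a log and flag the types frequent enough to warrant targeted tuning (the threshold of 3 and the type labels are illustrative assumptions):

```python
# Minimal sketch of the tuning trigger: count occurrences of each utterance
# type and flag types whose frequency meets a threshold for targeted
# tuning. The threshold of 3 is an illustrative assumption.

from collections import Counter

def types_to_tune(utterance_types, threshold=3):
    counts = Counter(utterance_types)
    return sorted(t for t, c in counts.items() if c >= threshold)

log = ["billing", "outage", "billing", "billing", "outage"]
print(types_to_tune(log))  # → ['billing']
```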
-
Patent number: 8812315Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.Type: GrantFiled: October 1, 2013Date of Patent: August 19, 2014Assignee: AT&T Intellectual Property I, L.P.Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
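The restructuring step replaces each dictionary phoneme's acoustic model with a weighted sum of the models of all plausible phonemes from the transcription lattice. A hedged numerical sketch in which single scalar "model means" stand in for full acoustic models, and the weights are invented for illustration:

```python
# Hedged sketch of acoustic model restructuring: each dictionary phoneme's
# model becomes a weighted sum of the native models of all plausible
# phonemes seen in the lattice. Scalar means stand in for full acoustic
# models; the phonemes and weights below are illustrative.

def restructure(native_models, weights):
    """Blend native phoneme models by the lattice-derived weights."""
    custom = {}
    for phoneme, w in weights.items():
        custom[phoneme] = sum(w[p] * native_models[p] for p in w)
    return custom

native_models = {"ae": 1.0, "eh": 2.0}    # toy 1-D model "means"
weights = {"ae": {"ae": 0.7, "eh": 0.3}}  # /ae/ drifts toward /eh/
print(restructure(native_models, weights))  # /ae/ blended toward /eh/
```

The pronouncing dictionary is untouched, which matches the abstract: only the acoustic space behind each phoneme changes.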