Subportions Patents (Class 704/249)
  • Patent number: 10192219
    Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
    Type: Grant
    Filed: January 8, 2015
    Date of Patent: January 29, 2019
    Assignee: Capital One Services, LLC
    Inventors: Lawrence Douglas, Paul Y. Moreton
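The authorization flow in this abstract (a location check at the point of sale, then a match of the spoken-captcha voice sample against stored recordings) can be sketched as follows. The function names, the cosine-similarity voiceprint comparison, and all thresholds are illustrative assumptions, not the patent's actual method:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two voice-feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a))
                  * math.sqrt(sum(y * y for y in b)))

def authorize_payment(customer_location, pos_location, voice_sample,
                      stored_voiceprints, max_distance=0.5, threshold=0.8):
    """Authorize only if the customer is near the point of sale AND the
    spoken-captcha voice sample matches a stored voice recording."""
    if math.dist(customer_location, pos_location) > max_distance:
        return False  # customer not verified as present at the point of sale
    return any(cosine_similarity(voice_sample, vp) >= threshold
               for vp in stored_voiceprints)
```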
  • Patent number: 10056094
    Abstract: In some example embodiments, a system is provided for real-time analysis of audio signals. First digital audio signals are retrieved from memory. First computed streamed signal information corresponding to each of the first digital audio signals is generated by computing first metrics data for the first digital audio signals, the first computed streamed signal information including the first metrics data. The computed first streamed signal information is stored in the memory. The first computed streamed signal information is transmitted to one or more computing devices. Transmitting the first computed streamed signal information to the one or more computing devices causes the first computed streamed signal information to be displayed at the one or more computing devices.
    Type: Grant
    Filed: March 12, 2015
    Date of Patent: August 21, 2018
    Assignee: Cogito Corporation
    Inventors: Joshua Feast, Ali Azarbayejani, Skyler Place
  • Patent number: 10037760
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: August 4, 2017
    Date of Patent: July 31, 2018
    Inventors: Dominik Roblek, Matthew Sharifi
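One way to read the "candidate phrase that includes at least some of the identified subwords" step is a coverage search over a pool of natural phrases. This toy sketch is an assumption about the mechanism, not the patented algorithm:

```python
def choose_verification_phrase(target_subwords, candidate_phrases, min_hits=2):
    """Return a candidate phrase covering at least `min_hits` of the subwords
    selected for verification, preferring the best coverage; returns None
    when no candidate qualifies."""
    best, best_hits = None, 0
    for phrase in candidate_phrases:
        hits = sum(1 for sw in target_subwords if sw in phrase)
        if hits >= min_hits and hits > best_hits:
            best, best_hits = phrase, hits
    return best
```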
  • Patent number: 9912617
    Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
    Type: Grant
    Filed: January 4, 2017
    Date of Patent: March 6, 2018
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
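The onset-handling scheme above (cache non-voice blocks; on a voice onset, retransmit the immediately preceding subsequence flagged as reprocessed, then the voiced blocks) can be illustrated with a toy simulation. Block contents, the `is_voice` callback, and the preroll length are placeholders:

```python
from collections import deque

def transmit_with_preroll(blocks, is_voice, preroll=3):
    """Cache non-voice blocks; on voice onset, send the cached preceding
    subsequence first, flagged as reprocessed, then the voiced blocks."""
    cache = deque(maxlen=preroll)   # most recent non-voice blocks
    sent = []                       # (block, reprocessed_flag) pairs
    prev_voice = False
    for block in blocks:
        if is_voice(block):
            if not prev_voice:      # voice onset decided for this block
                sent.extend((b, True) for b in cache)
                cache.clear()
            sent.append((block, False))
            prev_voice = True
        else:
            cache.append(block)     # non-voice: cache instead of sending
            prev_voice = False
    return sent
```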
  • Patent number: 9875474
    Abstract: A method is provided for securing a transaction made by bank card, the transaction involving a remote provision, by a user, of data existing in a bank card in his possession. The method includes: obtaining data existing in the bank card to be used, called textual data; obtaining at least one portion of the textual data in the form of an audio data stream, called a sound sample, resulting from reading the data existing in the bank card to be used; computing a current voice signature from said sound sample; comparing said current voice signature with a reference voice signature pre-recorded and associated with the textual data of the bank card; and when the reference voice signature differs from the current voice signature by a value greater than that defined by a predetermined parameter, rejecting the transaction.
    Type: Grant
    Filed: January 16, 2015
    Date of Patent: January 23, 2018
    Assignee: INGENICO GROUP
    Inventor: Michel Leger
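The accept/reject step above reduces to a threshold comparison between the current and reference voice signatures. A minimal sketch, assuming signatures are equal-length feature vectors and using mean absolute difference as the deviation measure (the abstract does not specify one):

```python
def should_reject(current_sig, reference_sig, max_deviation=0.25):
    """Reject the transaction when the current voice signature deviates from
    the pre-recorded reference by more than the predetermined parameter."""
    deviation = (sum(abs(c - r) for c, r in zip(current_sig, reference_sig))
                 / len(reference_sig))
    return deviation > max_deviation
```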
  • Patent number: 9858919
    Abstract: A method includes providing a deep neural network acoustic model, receiving audio data including one or more utterances of a speaker, extracting a plurality of speech recognition features from the one or more utterances of the speaker, creating a speaker identity vector for the speaker based on the extracted speech recognition features, and adapting the deep neural network acoustic model for automatic speech recognition using the extracted speech recognition features and the speaker identity vector.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: January 2, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: George A. Saon
  • Patent number: 9804822
    Abstract: An electronic apparatus and a controlling method thereof are disclosed. The electronic apparatus includes a voice input unit configured to receive a user voice, a storage unit configured to store a plurality of voice print feature models representing a plurality of user voices and a plurality of utterance environment models representing a plurality of environmental disturbances, a controller, in response to a user voice being input through the voice input unit, configured to extract utterance environment information of an utterance environment model among the plurality of utterance environment models corresponding to a location where the user voice is input, compare a voice print feature of the input user voice with the plurality of voice print feature models, revise a result of the comparison based on the extracted utterance environment information, and recognize a user corresponding to the input user voice based on the revised result.
    Type: Grant
    Filed: April 28, 2015
    Date of Patent: October 31, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Chi-sang Jung, Byung-jin Hwang
  • Patent number: 9741348
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: August 22, 2017
    Assignee: Google Inc.
    Inventors: Dominik Roblek, Matthew Sharifi
  • Patent number: 9727603
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining query refinements using search data. In one aspect, a method includes receiving a first query and a second query each comprising one or more n-grams for a user session, determining a first set of query refinements for the first query, determining a second set of query refinements from the first set of query refinements, each query refinement in the second set of query refinements including at least one n-gram that is similar to an n-gram from the first query and at least one n-gram that is similar to an n-gram from the second query, scoring each query refinement in the second set of query refinements, selecting a third query from a group consisting of the second set of query refinements and the second query, and providing the third query as input to a search operation.
    Type: Grant
    Filed: July 30, 2015
    Date of Patent: August 8, 2017
    Assignee: Google Inc.
    Inventors: Matthias Heiler, Behshad Behzadi, Evgeny A. Cherepanov, Nils Grimsmo, Aurelien Boffy, Alessandro Agostini, Karoly Csalogany, Fredrik Bergenlid, Marcin M. Nowak-Przygodzki
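The filter-score-select pipeline described above (keep refinements sharing n-grams with both queries, score them, and fall back to the second query when none qualify) might look like this. Unigram overlap standing in for n-gram similarity, and the additive scoring rule, are illustrative simplifications:

```python
def words(query):
    """Unigram set of a query (a stand-in for n-gram similarity)."""
    return set(query.split())

def refine(first_query, second_query, refinements):
    """Keep refinements sharing at least one word with EACH query, score by
    total overlap, and select the best; otherwise fall back to the second query."""
    q1, q2 = words(first_query), words(second_query)
    scored = []
    for r in refinements:
        rw = words(r)
        if rw & q1 and rw & q2:
            scored.append((len(rw & q1) + len(rw & q2), r))
    return max(scored)[1] if scored else second_query
```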
  • Patent number: 9728191
    Abstract: Techniques for automatically identifying a speaker in a conversation as a known person based on processing of audio of the speaker's voice to extract characteristics of that voice and on an automated comparison of those characteristics to known characteristics of the known person's voice. A speaker segmentation process may be performed on audio of the conversation to produce, for each speaker in the conversation, a segment that includes the audio of that speaker. Audio of each of the segments may then be processed to extract characteristics of that speaker's voice. The characteristics derived from each segment (and thus for multiple speakers) may then be compared to characteristics of the known person's voice to determine whether the speaker for that segment is the known person. For each segment, a degree of match between the voice characteristics of the speaker and the voice characteristics of the known person may be calculated.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: August 8, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Emanuele Dalmasso, Daniele Colibro, Claudio Vair, Kevin R. Farrell
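Per-segment matching against a known person's voice can be sketched as a similarity threshold over per-speaker characteristic vectors. The vectors and the cosine degree-of-match below are stand-ins for whatever features the product actually extracts:

```python
import math

def degree_of_match(voice_vec, known_vec):
    """Cosine similarity between a segment's voice characteristics and the
    known person's voice characteristics."""
    dot = sum(a * b for a, b in zip(voice_vec, known_vec))
    return dot / (math.sqrt(sum(a * a for a in voice_vec))
                  * math.sqrt(sum(b * b for b in known_vec)))

def identify_known_person(segments, known_vec, threshold=0.9):
    """Return the speakers whose segment characteristics match the known
    person's voice above the threshold."""
    return [spk for spk, vec in segments.items()
            if degree_of_match(vec, known_vec) >= threshold]
```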
  • Patent number: 9704485
    Abstract: The present invention relates to a multimedia information retrieval method and electronic device, the multimedia information retrieval method comprising the steps of: extracting from a to-be-retrieved multimedia the voice of the to-be-retrieved multimedia; recognizing the voice of the to-be-retrieved multimedia to obtain a recognized text; and searching a multimedia database according to the recognized text to obtain the multimedia information of the to-be-retrieved multimedia. The present invention also relates to an electronic device. The multimedia information retrieval method and electronic device of the present invention can automatically, quickly, and comprehensively present to a user the multimedia information the user wants to know, thus greatly improving user retrieval efficiency and retrieval success rate.
    Type: Grant
    Filed: February 4, 2015
    Date of Patent: July 11, 2017
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventors: Peng Hu, Teng Zhang
  • Patent number: 9646605
    Abstract: A system and method are presented for using spoken word verification to reduce false alarms by exploiting global and local contexts on a lexical level, a phoneme level, and on an acoustical level. The reduction of false alarms may occur through a process that determines whether a word has been detected or if it is a false alarm. Training examples are used to generate models of internal and external contexts which are compared to test word examples. The word may be accepted or rejected based on comparison results. Comparison may be performed either at the end of the process or at multiple steps of the process to determine whether the word is rejected.
    Type: Grant
    Filed: January 22, 2013
    Date of Patent: May 9, 2017
    Assignee: Interactive Intelligence Group, Inc.
    Inventors: Konstantin Biatov, Aravind Ganapathiraju, Felix Immanuel Wyss
  • Patent number: 9571425
    Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: February 14, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
  • Patent number: 9508341
    Abstract: Features are disclosed for active learning to identify the words which are likely to improve the guessing and automatic speech recognition (ASR) after manual annotation. When a speech recognition system needs pronunciations for words, a lexicon is typically used. For unknown words, pronunciation-guessing (G2P) may be included to provide pronunciations in an unattended (e.g., automatic) fashion. However, having manually (e.g., by a human) annotated pronunciations provides better ASR than having automatic pronunciations that may, in some instances, be wrong. The included active learning features help to direct these limited annotation resources.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: November 29, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Alok Ulhas Parlikar, Andrew Jake Rosenbaum, Jeffrey Paul Lilly, Jeffrey Penrod Adams
  • Patent number: 9401148
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for inputting speech data that corresponds to a particular utterance to a neural network; determining an evaluation vector based on output at a hidden layer of the neural network; comparing the evaluation vector with a reference vector that corresponds to a past utterance of a particular speaker; and based on comparing the evaluation vector and the reference vector, determining whether the particular utterance was likely spoken by the particular speaker.
    Type: Grant
    Filed: March 28, 2014
    Date of Patent: July 26, 2016
    Assignee: Google Inc.
    Inventors: Xin Lei, Erik McDermott, Ehsan Variani, Ignacio L. Moreno
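The idea of taking an evaluation vector from a hidden layer and comparing it to a reference vector (often called a d-vector approach) can be shown with a toy one-layer network. The weights, the tanh nonlinearity, and the cosine threshold are all assumptions for illustration:

```python
import math

def hidden_activations(x, weights, biases):
    """Toy hidden layer: tanh(Wx + b). The activations serve as the
    evaluation vector for the input utterance."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def same_speaker(eval_vec, reference_vec, threshold=0.95):
    """Cosine-compare the evaluation vector with a reference vector built
    from the speaker's past utterances."""
    dot = sum(a * b for a, b in zip(eval_vec, reference_vec))
    norm = (math.sqrt(sum(a * a for a in eval_vec))
            * math.sqrt(sum(b * b for b in reference_vec)))
    return dot / norm >= threshold
```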
  • Patent number: 9390709
    Abstract: A semiconductor integrated circuit device for voice recognition includes: a signal processing unit which generates a feature pattern representing a state of distribution of frequency components of an input voice signal; a voice recognition database storage unit which stores a voice recognition database including a standard pattern representing a state of distribution of frequency components of plural phonemes; a conversion list storage unit which stores a conversion list including plural words or sentences to be conversion candidates; a standard pattern extraction unit which extracts a standard pattern corresponding to character data representing the first syllable of each word or sentence included in the conversion list, from the voice recognition database; and a matching detection unit which compares the feature pattern generated from the first syllable of the voice signal with the extracted standard pattern and thus detects the matching of the syllable.
    Type: Grant
    Filed: September 20, 2013
    Date of Patent: July 12, 2016
    Assignee: SEIKO EPSON CORPORATION
    Inventor: Tsutomu Nonaka
  • Patent number: 9286892
    Abstract: Some implementations include a computer-implemented method. The method can include providing a training set of text samples to a semantic parser that associates text samples with actions. The method can include obtaining, for each of one or more of the text samples of the training set, data that indicates one or more domains that the semantic parser has associated with the text sample. For each of one or more domains, a subset of the text samples of the training set can be generated that the semantic parser has associated with the domain. Using the subset of text samples associated with the domain, a language model can be generated for one or more of the domains. Speech recognition can be performed on an utterance using the one or more language models that are generated for the one or more domains.
    Type: Grant
    Filed: April 1, 2014
    Date of Patent: March 15, 2016
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein
  • Patent number: 9043207
    Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
    Type: Grant
    Filed: November 12, 2009
    Date of Patent: May 26, 2015
    Assignee: Agnitio S.L.
    Inventors: Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez, Marta Garcia Gomar
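The combine-then-compare step can be sketched by averaging per-class speaker-information vectors and thresholding a similarity against the stored target information. Averaging and cosine similarity are assumed here; the abstract does not fix a particular combination method:

```python
import math

def combine_and_compare(class_samples, target_info, threshold=0.9):
    """Average the speaker-information vectors within each speaker-dependent
    class, then compare the combined vector with the stored target-speaker
    information via cosine similarity."""
    results = {}
    for speaker, vectors in class_samples.items():
        dim = len(vectors[0])
        combined = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
        dot = sum(a * b for a, b in zip(combined, target_info))
        norm = (math.sqrt(sum(a * a for a in combined))
                * math.sqrt(sum(b * b for b in target_info)))
        results[speaker] = dot / norm >= threshold
    return results
```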
  • Publication number: 20150142441
    Abstract: A display apparatus is provided. The display apparatus includes a communicator configured to communicate with a voice recognition apparatus that recognizes an uttered voice of a user, an input unit configured to receive the uttered voice of the user, a display unit configured to receive voice recognition result information about the uttered voice of the user from the voice recognition apparatus and display the voice recognition result information, and a processor configured to, when the display apparatus is turned on, perform an access to the voice recognition apparatus by transmitting access request information to the voice recognition apparatus, and when the uttered voice is inputted through the input unit, transmit voice information on the uttered voice to the voice recognition apparatus through the communicator.
    Type: Application
    Filed: November 18, 2014
    Publication date: May 21, 2015
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Myung-jae KIM, Hee-seob RYU, Kwang-il HWANG
  • Publication number: 20150142440
    Abstract: Feedback mechanisms to the user of a Head Mounted Display (HMD) are provided. It is important to provide feedback to the user when speech is recognized as soon as possible after the user utters a voice command. The HMD displays and/or audibly renders an ASR acknowledgment in a manner that ensures the user that the HMD has received/understood his voiced command.
    Type: Application
    Filed: November 13, 2014
    Publication date: May 21, 2015
    Inventors: Christopher Parkinson, James Woodall
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Publication number: 20150112681
    Abstract: A voice retrieval device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: setting detection criteria for a retrieval word, based on a characteristic of the retrieval word, such that the higher the detection accuracy of the retrieval word or the lower the pronunciation difficulty of the retrieval word or the lower the appearance probability of the retrieval word, the stricter the detection criteria; performing first voice retrieval processing on voice data according to the detection criteria and detecting a section that possibly includes the retrieval word as a candidate section from the voice data; and performing second voice retrieval processing different from the first voice retrieval processing on each candidate section and determining whether or not the retrieval word is included in each candidate section.
    Type: Application
    Filed: October 16, 2014
    Publication date: April 23, 2015
    Applicant: Fujitsu Limited
    Inventors: Masakiyo TANAKA, Hitoshi Iwamida, Nobuyuki Washio
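The word-dependent strictness rule (higher detection accuracy, lower pronunciation difficulty, or lower appearance probability means stricter criteria), followed by a different second pass over the candidate sections, might be sketched as below. The linear threshold formula and the normalized score ranges are invented for illustration:

```python
def detection_threshold(accuracy, difficulty, frequency, base=0.5, span=0.3):
    """Stricter (higher) first-pass threshold for words with high detection
    accuracy, low pronunciation difficulty, or low appearance probability.
    All three inputs are assumed normalized to [0, 1]."""
    strictness = (accuracy + (1.0 - difficulty) + (1.0 - frequency)) / 3.0
    return base + span * strictness

def retrieve(sections, word_stats, second_pass, confirm_at=0.5):
    """First pass: keep sections whose coarse score clears the word-specific
    threshold. Second pass: confirm each candidate with a different scorer."""
    t = detection_threshold(**word_stats)
    candidates = [name for name, coarse in sections if coarse >= t]
    return [name for name in candidates if second_pass(name) >= confirm_at]
```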
  • Publication number: 20150112682
    Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice, comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify that the speaker's voice corresponds to the speaker whose identity is to be verified, based on the received voice utterance; c) verifying that the received voice utterance is not falsified, preferably after having verified the speaker's voice; and d) accepting the speaker's identity if both verification steps give a positive result, and not accepting it if either verification step gives a negative result. The invention further refers to a corresponding computer-readable medium and a computer.
    Type: Application
    Filed: January 5, 2015
    Publication date: April 23, 2015
    Inventors: Luis Buera Rodriguez, Marta Garcia Gomar, Marta Sanchez Asenjo, Alberto Martin de los Santos de las Heras, Alfredo Gutierrez, Carlos Vaquero Aviles-Casco, Alfonso Ortega Gimenez
  • Publication number: 20150095029
    Abstract: Engaging persona candidates are provided with a skills assessment that includes vocal behavior. Each candidate provides both scripted and spontaneous answers to questions in a situational setting that closely matches the daily demands of the customer support industry. Samples of the candidate's speech are evaluated to identify distinct voice cues that qualitatively describe speech characteristics, which are scored based on the candidate's spoken performance. One or more of the voice cues are mapped to phonetic analytics that quantitatively describe vocal behavior. Each voice cue also has an assigned weight. The voice cue scores for each phonetic analytic are multiplied by their assigned weights and added together to form a weighted phonetic analytic, which is then used to form a part of the vocal behavior risk assessments.
    Type: Application
    Filed: October 2, 2013
    Publication date: April 2, 2015
    Applicant: StarTek, Inc.
    Inventors: Ted Nardin, James Keaten
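The weighted phonetic analytic described above is a plain weighted sum of voice-cue scores; the cue names and weights below are placeholders:

```python
def weighted_phonetic_analytic(cue_scores, cue_weights):
    """Multiply each voice-cue score by its assigned weight and sum the
    products to form the weighted phonetic analytic."""
    return sum(score * cue_weights[cue] for cue, score in cue_scores.items())
```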
  • Patent number: 8996387
    Abstract: To clear transaction data selected for processing, a portable data carrier (1) generates a transaction acoustic signal (003; 103; 203) (S007; S107; S207). When this signal is reproduced acoustically by an end device (10), at least the transaction data selected for processing are reproduced superimposed acoustically with a melody specific to the user of the data carrier (1) (S009; S109; S209). The generated transaction acoustic signal (003; 103; 203) is electronically transferred to an end device (10) (S108; S208), which processes the selected transaction data (S011; S121; S216) only when the user of the data carrier (1) confirms to the end device (10) an at least partial match both of the acoustically reproduced melody with the user-specific melody and of the acoustically reproduced transaction data with the selected transaction data (S010; S110, S116; S210).
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: March 31, 2015
    Assignee: Giesecke & Devrient GmbH
    Inventors: Thomas Stocker, Michael Baldischweiler
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Publication number: 20150088514
    Abstract: Techniques for providing virtual assistants to assist users during a voice communication between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. For instance, the first user may utilize her device to place a telephone call to the device of the second user. A virtual assistant may also join the call and, upon invocation by a user on the call, may identify voice commands from the call and may perform corresponding tasks for the users in response.
    Type: Application
    Filed: September 25, 2013
    Publication date: March 26, 2015
    Applicant: Rawles LLC
    Inventor: Marcello Typrin
  • Publication number: 20150081301
    Abstract: A system includes a user speech profile stored on a computer readable storage device, the speech profile containing a plurality of phonemes with user identifying characteristics for the phonemes, and a speech processor coupled to access the speech profile to generate a phrase containing user distinguishing phonemes based on a difference between the user identifying characteristics for such phonemes and average user identifying characteristics, such that the phrase has discriminability from other users. The speech processor may also or alternatively select the phrase as a function of ambient noise.
    Type: Application
    Filed: September 18, 2013
    Publication date: March 19, 2015
    Applicant: Lenovo (Singapore) Pte, Ltd.
    Inventors: John Weldon Nicholson, Steven Richard Perrin
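Selecting a phrase whose phonemes best separate the user from the average user can be sketched as ranking phonemes by deviation from the population mean and maximizing coverage of the top ones. Representing each phoneme's "user identifying characteristics" as a single scalar is a simplification:

```python
def pick_phrase(user_profile, population_avg, phrases, top_k=2):
    """Rank phonemes by how far the user's identifying characteristic
    deviates from the population average, then pick the phrase containing
    the most of the top-k distinguishing phonemes."""
    ranked = sorted(user_profile,
                    key=lambda p: abs(user_profile[p] - population_avg[p]),
                    reverse=True)
    distinctive = set(ranked[:top_k])
    text, _ = max(phrases, key=lambda ph: len(distinctive & set(ph[1])))
    return text
```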
  • Publication number: 20150081302
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic feature can include a movement pattern of head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.
    Type: Application
    Filed: November 24, 2014
    Publication date: March 19, 2015
    Inventors: Ann K. SYRDAL, Sumit CHOPRA, Patrick Haffner, Taniya MISHRA, Ilija ZELJKOVIC, Eric Zavesky
  • Patent number: 8977547
    Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: March 10, 2015
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
  • Publication number: 20150058017
    Abstract: Disclosed in some examples are systems, methods, devices, and machine readable mediums which may produce an audio recording with included verification from the individuals in the recording that the recording is accurate. In some examples, the system may also provide rights management control to those individuals. This may ensure that individuals participating in audio events that are to be recorded are assured that their words are not changed, taken out of context, or otherwise altered and that they retain control over the use of their words even after the physical file has left their control.
    Type: Application
    Filed: August 20, 2013
    Publication date: February 26, 2015
    Inventors: Dave Paul Singh, Dominic Fulginti, Mahendra Tadi Tadikonda, Tobias Kohlenberg
  • Patent number: 8949125
    Abstract: Systems and methods are provided to select the most typical pronunciation of a location name on a map from a plurality of user pronunciations. A server generates a reference speech model based on the user pronunciations, compares each user pronunciation with the speech model, and selects a pronunciation based on the comparison. Alternatively, the server compares the distance between each user pronunciation and every other user pronunciation and selects a pronunciation based on the comparison. The server then annotates the map with the selected pronunciation and provides the audio output of the location name to a user device upon a user's request.
    Type: Grant
    Filed: June 16, 2010
    Date of Patent: February 3, 2015
    Assignee: Google Inc.
    Inventor: Gal Chechik
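    The alternative selection strategy in the abstract (compare each pronunciation's distance to every other one) amounts to picking the medoid. A minimal sketch, assuming each pronunciation is reduced to a feature vector and using Euclidean distance, neither of which is specified by the abstract:

    ```python
    def euclidean(a, b):
        """Euclidean distance between two pronunciation feature vectors."""
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def most_typical(pronunciations):
        """Return the index of the pronunciation whose total distance
        to all the others is smallest (the medoid)."""
        best_idx, best_total = 0, float("inf")
        for i, p in enumerate(pronunciations):
            total = sum(euclidean(p, q) for q in pronunciations)
            if total < best_total:
                best_idx, best_total = i, total
        return best_idx
    ```

    The medoid is robust to outlier pronunciations in a way that a simple average is not, which fits the goal of choosing the "most typical" sample.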
  • Patent number: 8947499
    Abstract: Methods and systems for communicating with rate control. A communication is sent and received from a first device to a second device over a network, wherein the communication comprises at least one audio stream and a second communication stream. A capacity of the network is probed at the first device for the sending and receiving the communication. A presence of a voice in the at least one audio stream is detected at the first device via a voice activity detection of the at least one audio stream. A rate limit is set for the sending and receiving the communication at the first device based on the capacity of the network and the detection of the presence of the at least one audio stream.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: February 3, 2015
    Assignee: TangoMe, Inc.
    Inventors: Alexander Subbotin, Olivier Furon, Shaowei Su, Yevgeni Litvin, Xu Liu
  • Publication number: 20150019219
    Abstract: Systems and methods for arbitrating spoken dialog services include determining a capability catalog associated with a plurality of devices accessible within an environment. The capability catalog includes a list of the plurality of devices mapped to a list of spoken dialog services provided by each of the plurality of devices. The system arbitrates between the plurality of devices and the spoken dialog services in the capability catalog to determine a selected device and a selected dialog service.
    Type: Application
    Filed: December 2, 2013
    Publication date: January 15, 2015
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: ELI TZIRKEL-HANCOCK, GREG T. LINDEMANN, ROBERT D. SIMS, OMER TSIMHONI
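    The capability catalog this abstract describes is essentially a mapping from devices to the dialog services each provides, with arbitration selecting one (device, service) pair. A toy sketch; the data shapes, device names, and preference-order policy are assumptions for illustration only:

    ```python
    def build_catalog(devices):
        """devices: dict mapping device name -> list of dialog services
        it provides. Returns the catalog with service sets for fast lookup."""
        return {dev: set(services) for dev, services in devices.items()}

    def arbitrate(catalog, requested_service, preference_order):
        """Pick the first device in preference order that offers the
        requested dialog service."""
        for dev in preference_order:
            if requested_service in catalog.get(dev, set()):
                return dev, requested_service
        return None, None
    ```

    A real arbitrator would weigh more than a static preference list (e.g. connectivity, recognition confidence), but the catalog lookup itself is this simple.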
  • Publication number: 20150006176
    Abstract: A speech-based audio device may be configured to detect a user-uttered wake expression and to respond by interpreting subsequent words or phrases as commands. In order to distinguish between utterance of the wake expression by the user and generation of the wake expression by the device itself, directional audio signals may be analyzed to detect whether the wake expression has been received from multiple directions. If the wake expression has been received from many directions, it is declared as being generated by the audio device and ignored. Otherwise, if the wake expression is received from a single direction or a limited number of directions, the wake expression is declared as being uttered by the user and subsequent words or phrases are interpreted and acted upon by the audio device.
    Type: Application
    Filed: June 27, 2013
    Publication date: January 1, 2015
    Inventors: Michael Alan Pogue, Philip Ryan Hilmes
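    The direction-counting decision rule described above reduces to a very small piece of logic once the beamformer has reported which directions the wake expression was detected in. A sketch under assumed inputs (direction angles in degrees, and an assumed cutoff of two directions for a human talker):

    ```python
    def classify_wake_detection(detected_directions, max_user_directions=2):
        """If the wake expression arrives from more distinct directions
        than a user's voice plausibly could (e.g. via room-wide playback
        from the device's own speaker), treat it as self-generated audio
        and ignore it; otherwise attribute it to the user."""
        distinct = len(set(detected_directions))
        return "user" if distinct <= max_user_directions else "self"
    ```

    The interesting engineering is upstream, in forming the directional audio signals; the final classification is just this threshold on the number of directions.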
  • Patent number: 8914278
    Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: December 16, 2014
    Assignee: Ginger Software, Inc.
    Inventors: Yael Karov Zangvil, Avner Zangvil
  • Publication number: 20140365219
    Abstract: A method for verifying that a person is registered to use a telemedical device includes identifying an unprompted trigger phrase in words spoken by a person and received by the telemedical device. The telemedical device prompts the person to state a name of a registered user and optionally prompts the person to state health tips for the person. The telemedical device verifies that the person is the registered user using utterance data generated from the unprompted trigger phrase, name of the registered user, and health tips.
    Type: Application
    Filed: August 26, 2014
    Publication date: December 11, 2014
    Inventors: Fuliang Weng, Taufiq Hasan, Zhe Feng
  • Publication number: 20140358535
    Abstract: A method of performing a voice command function in an electronic device includes detecting voice of a user, acquiring one or more pieces of attribute information from the voice, and authenticating the user by comparing the attribute information with pre-stored authentic attribute information, using a recognition model. An electronic device includes a voice input module configured to detect a voice of a user, a first processor configured to acquire one or more pieces of attribute information from the voice and authenticate the user by comparing the attribute information with a recognition model, and a second processor configured to, when the attribute information matches the recognition model, activate the voice command function, receive a voice command of the user, and execute an application corresponding to the voice command. Other embodiments are also disclosed.
    Type: Application
    Filed: May 28, 2014
    Publication date: December 4, 2014
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Sanghoon Lee, Kyungtae Kim, Subhojit Chakladar, Taejin Lee, Seokyeong Jung
  • Publication number: 20140350933
    Abstract: A voice recognition apparatus includes: an extractor configured to extract utterance elements from a user's uttered voice; an LSP converter configured to convert the extracted utterance elements into LSP formats; and a controller configured to determine whether an utterance element related to an OOV exists among the utterance elements converted into the LSP formats with reference to vocabulary list information including pre-registered vocabularies, and to determine an OOD area in which it is impossible to provide response information in response to the uttered voice, in response to determining that the utterance element related to the OOV exists. Accordingly, the voice recognition apparatus provides appropriate response information according to a user's intent by considering a variety of utterances and possibilities regarding a user's uttered voice.
    Type: Application
    Filed: May 27, 2014
    Publication date: November 27, 2014
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Eun-sang BAK, Kyung-duk KIM, Hyung-jong NOH, Seong-han RYU, Geun-bae LEE
  • Publication number: 20140350934
    Abstract: Systems and methods are provided for voice identification. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as an identification result.
    Type: Application
    Filed: May 30, 2014
    Publication date: November 27, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Lou Li, Li Lu, Xiang Zhang, Feng Rao, Shuai Yue, Bo Chen, Jianxiong Ma, Haibo Liu
  • Publication number: 20140324432
    Abstract: A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.
    Type: Application
    Filed: May 2, 2014
    Publication date: October 30, 2014
    Applicant: AT&T Intellectual Property I, L.P.
    Inventor: Robert Wesley Bossemeyer, JR.
  • Patent number: 8868431
    Abstract: A recognition dictionary creation device identifies the language of a reading of an inputted text which is a target to be registered and adds a reading with phonemes in the language identified thereby to the target text to be registered, and also converts the reading of the target text to be registered from the phonemes in the language identified thereby to phonemes in a language to be recognized which is handled in voice recognition to create a recognition dictionary in which the converted reading of the target text to be registered is registered.
    Type: Grant
    Filed: February 5, 2010
    Date of Patent: October 21, 2014
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Yasushi Ishikawa
  • Publication number: 20140288932
    Abstract: An interactive response system mixes HSR subsystems with ASR subsystems to facilitate overall capability of voice user interfaces. The system permits imperfect ASR subsystems to nonetheless relieve burden on HSR subsystems. An ASR proxy is used to implement an IVR system, and the proxy dynamically determines how many ASR and HSR subsystems are to perform recognition for any particular utterance, based on factors such as confidence thresholds of the ASRs and availability of human resources for HSRs.
    Type: Application
    Filed: July 8, 2013
    Publication date: September 25, 2014
    Inventors: Yoryos Yeracaris, Alwin B. Carus, Larissa Lapshina
  • Publication number: 20140278419
    Abstract: A voice command definition file (VCDF) declaratively defines voice commands for an application. For example, the VCDF may include definitions for: voice commands; one or more phrases/utterances that may be said to execute each of the commands; a navigation location to navigate to within the application (e.g. a page); phrase lists containing items that may be used as a parameter in a voice command; examples; feedback; and the like. A user may say a single utterance to launch the application, navigate to the associated location of the command and execute the command. The VCDF may define multiple ways to listen for a particular command. The VCDF may be edited/defined by a user and may include a user friendly name for an application. A speech engine loads the VCDF for use such that it may recognize the commands associated with an application. The definitions may be updated during runtime.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Inventors: F. Avery Bishop, Travis Wilson, Robert Chambers, Robert Brown
  • Publication number: 20140278420
    Abstract: An electronic device digitally combines a single voice input with each of a series of noise samples. Each noise sample is taken from a different audio environment (e.g., street noise, babble, interior car noise). The voice input/noise sample combinations are used to train a voice recognition model database without the user having to repeat the voice input in each of the different environments. In one variation, the electronic device transmits the user's voice input to a server that maintains and trains the voice recognition model database.
    Type: Application
    Filed: December 3, 2013
    Publication date: September 18, 2014
    Applicant: Motorola Mobility LLC
    Inventors: John R. Meloney, Joel A. Clark, Joseph C. Dwyer, Adrian M. Schuster, Snehitha Singaraju, Robert A. Zurek
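    The core of this abstract, digitally combining one clean voice input with each of a series of noise samples to synthesize training data, is straightforward sample-wise mixing. A minimal sketch; representing audio as plain lists of float samples and using a fixed noise gain are assumptions made for illustration:

    ```python
    def mix(voice, noise, noise_gain=0.5):
        """Digitally combine a voice recording with a noise sample
        (noise is looped to cover the full voice length)."""
        return [v + noise_gain * noise[i % len(noise)]
                for i, v in enumerate(voice)]

    def build_training_set(voice, noise_samples):
        """One clean utterance -> one noisy copy per audio environment,
        without the user having to repeat the utterance in each one."""
        return {env: mix(voice, n) for env, n in noise_samples.items()}
    ```

    A production version would scale the noise to a target SNR per environment rather than use a fixed gain, but the data-augmentation idea is the same.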
  • Patent number: 8831942
    Abstract: A method is provided for identifying a gender of a speaker. The method steps include obtaining speech data of the speaker, extracting vowel-like speech frames from the speech data, analyzing the vowel-like speech frames to generate a feature vector having pitch values corresponding to the vowel-like frames, analyzing the pitch values to generate a most frequent pitch value, determining, in response to the most frequent pitch value being between a first pre-determined threshold and a second pre-determined threshold, an output of a male Gaussian Mixture Model (GMM) and an output of a female GMM using the pitch values as inputs to the male GMM and the female GMM, and identifying the gender of the speaker by comparing the output of the male GMM and the output of the female GMM based on a pre-determined criterion.
    Type: Grant
    Filed: March 19, 2010
    Date of Patent: September 9, 2014
    Assignee: Narus, Inc.
    Inventor: Antonio Nucci
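    The decision procedure in this abstract can be sketched end to end: take the most frequent pitch value, gate it between two thresholds, then compare the outputs of a male model and a female model over the pitch values. Here single Gaussians stand in for the patent's GMMs, and the threshold and model parameters are invented for illustration:

    ```python
    import math
    from collections import Counter

    def gaussian_loglik(x, mean, var):
        """Log-likelihood of x under a single Gaussian."""
        return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

    def identify_gender(pitches, low=60.0, high=320.0):
        """Gate on the most frequent pitch, then compare male vs. female
        model likelihoods accumulated over all pitch values."""
        mode_pitch = Counter(pitches).most_common(1)[0][0]
        if not (low < mode_pitch < high):
            return "unknown"
        male = sum(gaussian_loglik(p, 120.0, 900.0) for p in pitches)
        female = sum(gaussian_loglik(p, 210.0, 900.0) for p in pitches)
        return "male" if male > female else "female"
    ```

    The gating step cheaply rejects frames whose dominant pitch is outside the plausible human range before the heavier model comparison runs.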
  • Publication number: 20140244258
    Abstract: A voice recognition method for a single sentence including a multi-instruction in an interactive voice user interface. The method includes the steps of detecting a connection ending by analyzing the morphemes of a single sentence on which voice recognition has been performed, separating the single sentence into a plurality of passages based on the connection ending, detecting a multi-connection ending by analyzing the connection ending, extracting instructions by specifically analyzing passages including the multi-connection ending, and outputting a multi-instruction included in the single sentence by combining the instructions extracted in the step of extracting instructions. In accordance with the present invention, consumer usability can be significantly increased because a multi-operation intention can be checked in one sentence.
    Type: Application
    Filed: October 18, 2013
    Publication date: August 28, 2014
    Applicant: Mediazen Co., Ltd.
    Inventors: Minkyu SONG, Hyejin KIM, Sangyoon KIM
  • Publication number: 20140236598
    Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.
    Type: Application
    Filed: April 29, 2013
    Publication date: August 21, 2014
    Applicant: Google Inc.
    Inventor: Google Inc.
  • Publication number: 20140236599
    Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. A particular method includes detecting that a frequency of occurrence of a particular type of utterance satisfies a threshold. The method further includes tuning a speech recognition system with respect to the particular type of utterance.
    Type: Application
    Filed: April 25, 2014
    Publication date: August 21, 2014
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
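    The triggering condition in this abstract, detecting that a particular type of utterance occurs often enough to satisfy a threshold, can be sketched as a frequency counter over logged utterance types. The type labels and the threshold value are invented for illustration:

    ```python
    from collections import Counter

    def utterances_to_tune(observed_types, threshold=3):
        """Return the utterance types whose frequency of occurrence
        satisfies (here: meets or exceeds) the tuning threshold, i.e.
        the candidates for targeted tuning of the recognizer."""
        counts = Counter(observed_types)
        return sorted(t for t, c in counts.items() if c >= threshold)
    ```

    Targeted tuning then retrains or reweights the recognizer only for those frequent types, rather than retuning the whole system.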
  • Patent number: 8812315
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: August 19, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
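    The restructuring step in this abstract, replacing each dictionary phoneme's acoustic model with a weighted sum of the models of all plausible phonemes, has a simple mathematical core. A sketch in which one-dimensional Gaussians stand in for full acoustic models; the weights and parameters are placeholders, not values from the patent:

    ```python
    import math

    def gaussian(x, mean, var):
        """Density of x under a single Gaussian acoustic model."""
        return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

    def restructured_likelihood(x, weights, models):
        """Likelihood of observation x under a phoneme whose restructured
        acoustic model is a weighted sum of the models of all plausible
        phonemes (models given as (mean, var) pairs)."""
        return sum(w * gaussian(x, m, v) for w, (m, v) in zip(weights, models))
    ```

    Because only the acoustic side is reweighted, the pronouncing dictionary itself never changes, exactly as the abstract states.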