Subportions Patents (Class 704/249)
-
Patent number: 10192219
Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
Type: Grant
Filed: January 8, 2015
Date of Patent: January 29, 2019
Assignee: Capital One Services, LLC
Inventors: Lawrence Douglas, Paul Y. Moreton
-
Patent number: 10056094
Abstract: In some example embodiments, a system is provided for real-time analysis of audio signals. First digital audio signals are retrieved from memory. First computed streamed signal information corresponding to each of the first digital audio signals is generated by computing first metrics data for the first digital audio signals, the first computed streamed signal information including the first metrics data. The first computed streamed signal information is stored in the memory. The first computed streamed signal information is transmitted to one or more computing devices. Transmitting the first computed streamed signal information to the one or more computing devices causes the first computed streamed signal information to be displayed at the one or more computing devices.
Type: Grant
Filed: March 12, 2015
Date of Patent: August 21, 2018
Assignee: Cogito Corporation
Inventors: Joshua Feast, Ali Azarbayejani, Skyler Place
-
Patent number: 10037760
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
Type: Grant
Filed: August 4, 2017
Date of Patent: July 31, 2018
Inventors: Dominik Roblek, Matthew Sharifi
-
Patent number: 9912617
Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
Type: Grant
Filed: January 4, 2017
Date of Patent: March 6, 2018
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
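The onset-caching scheme this abstract describes (hold back non-voice blocks, then flush a fixed-length subsequence of them when voice onset is decided, so the start of the utterance is not clipped) can be sketched as a toy. This is an illustrative sketch only, not the patented implementation; the names `OnsetBuffer` and `process` are invented for the example.

```python
from collections import deque

class OnsetBuffer:
    """Cache non-voice audio blocks; on voice onset, transmit the cached
    subsequence (marked as reprocessed) ahead of the present block."""

    def __init__(self, subsequence_length=3):
        # deque with maxlen keeps only the most recent non-voice blocks
        self.cache = deque(maxlen=subsequence_length)

    def process(self, block, is_voice):
        """Return the list of (block, tag) pairs to transmit for this block."""
        if is_voice:
            # Voice onset: send the cached blocks first, then the present one.
            out = [(b, "reprocessed") for b in self.cache] + [(block, "voice")]
            self.cache.clear()
            return out
        # Non-voice: cache the present block instead of transmitting it.
        self.cache.append(block)
        return []

buf = OnsetBuffer(subsequence_length=2)
assert buf.process("b1", False) == []          # cached, nothing sent
assert buf.process("b2", False) == []
assert buf.process("b3", False) == []          # cache now holds b2, b3
assert buf.process("b4", True) == [
    ("b2", "reprocessed"), ("b3", "reprocessed"), ("b4", "voice")]
```

The `maxlen` on the deque is what enforces the "predetermined length" of the retrieved subsequence: older non-voice blocks silently fall off the far end.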
-
Patent number: 9875474
Abstract: A method is provided for securing a transaction made by bank card, the transaction involving a remote provision, by a user, of data existing in a bank card in his possession. The method includes: obtaining data existing in the bank card to be used, called textual data; obtaining at least one portion of the textual data in the form of an audio data stream, called a sound sample, resulting from reading the data existing in the bank card to be used; computing a current voice signature from said sound sample; comparing said current voice signature with a reference voice signature pre-recorded and associated with the textual data of the bank card; and rejecting the transaction when the current voice signature differs from the reference voice signature by a value greater than a predetermined threshold.
Type: Grant
Filed: January 16, 2015
Date of Patent: January 23, 2018
Assignee: INGENICO GROUP
Inventor: Michel Leger
-
Patent number: 9858919
Abstract: A method includes providing a deep neural network acoustic model, receiving audio data including one or more utterances of a speaker, extracting a plurality of speech recognition features from the one or more utterances of the speaker, creating a speaker identity vector for the speaker based on the extracted speech recognition features, and adapting the deep neural network acoustic model for automatic speech recognition using the extracted speech recognition features and the speaker identity vector.
Type: Grant
Filed: September 29, 2014
Date of Patent: January 2, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: George A. Saon
-
Patent number: 9804822
Abstract: An electronic apparatus and a controlling method thereof are disclosed. The electronic apparatus includes a voice input unit configured to receive a user voice, a storage unit configured to store a plurality of voice print feature models representing a plurality of user voices and a plurality of utterance environment models representing a plurality of environmental disturbances, and a controller, in response to a user voice being input through the voice input unit, configured to extract utterance environment information of an utterance environment model among the plurality of utterance environment models corresponding to a location where the user voice is input, compare a voice print feature of the input user voice with the plurality of voice print feature models, revise a result of the comparison based on the extracted utterance environment information, and recognize a user corresponding to the input user voice based on the revised result.
Type: Grant
Filed: April 28, 2015
Date of Patent: October 31, 2017
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Chi-sang Jung, Byung-jin Hwang
-
Patent number: 9741348
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
Type: Grant
Filed: June 24, 2016
Date of Patent: August 22, 2017
Assignee: Google Inc.
Inventors: Dominik Roblek, Matthew Sharifi
-
Patent number: 9727603
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining query refinements using search data. In one aspect, a method includes receiving a first query and a second query each comprising one or more n-grams for a user session, determining a first set of query refinements for the first query, determining a second set of query refinements from the first set of query refinements, each query refinement in the second set of query refinements including at least one n-gram that is similar to an n-gram from the first query and at least one n-gram that is similar to an n-gram from the second query, scoring each query refinement in the second set of query refinements, selecting a third query from a group consisting of the second set of query refinements and the second query, and providing the third query as input to a search operation.
Type: Grant
Filed: July 30, 2015
Date of Patent: August 8, 2017
Assignee: Google Inc.
Inventors: Matthias Heiler, Behshad Behzadi, Evgeny A. Cherepanov, Nils Grimsmo, Aurelien Boffy, Alessandro Agostini, Karoly Csalogany, Fredrik Bergenlid, Marcin M. Nowak-Przygodzki
-
Patent number: 9728191
Abstract: Techniques for automatically identifying a speaker in a conversation as a known person based on processing of audio of the speaker's voice to extract characteristics of that voice and on an automated comparison of those characteristics to known characteristics of the known person's voice. A speaker segmentation process may be performed on audio of the conversation to produce, for each speaker in the conversation, a segment that includes the audio of that speaker. Audio of each of the segments may then be processed to extract characteristics of that speaker's voice. The characteristics derived from each segment (and thus for multiple speakers) may then be compared to characteristics of the known person's voice to determine whether the speaker for that segment is the known person. For each segment, a degree of match between the voice characteristics of the speaker and the voice characteristics of the known person may be calculated.
Type: Grant
Filed: August 27, 2015
Date of Patent: August 8, 2017
Assignee: Nuance Communications, Inc.
Inventors: Emanuele Dalmasso, Daniele Colibro, Claudio Vair, Kevin R. Farrell
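The per-segment matching step (a degree of match between each segment's voice characteristics and the known person's) is commonly realized as a vector similarity against a threshold. The sketch below assumes voice characteristics are already extracted as fixed-length vectors and uses cosine similarity as the degree of match; the function names and the 0.8 threshold are illustrative, not from the patent.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def label_segments(segment_vectors, known_vector, threshold=0.8):
    """For each speaker segment, compute a degree of match against the
    known person's voice characteristics and flag likely matches."""
    return [(i, cosine(vec, known_vector) >= threshold)
            for i, vec in enumerate(segment_vectors)]

known = [1.0, 0.0, 0.5]                     # known person's characteristics
segments = [[0.9, 0.1, 0.45],               # close to the known person
            [0.0, 1.0, 0.0]]                # a different speaker
labels = label_segments(segments, known)
assert labels == [(0, True), (1, False)]
```

Diarization (producing one segment per speaker) happens upstream; this sketch only covers the comparison stage the abstract ends on.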
-
Patent number: 9704485
Abstract: The present invention relates to a multimedia information retrieval method and electronic device, the multimedia information retrieval method comprising the steps of: extracting from a to-be-retrieved multimedia the voice of the to-be-retrieved multimedia; recognizing the voice of the to-be-retrieved multimedia to obtain a recognized text; and retrieving a multimedia database according to the recognized text to obtain the multimedia information of the to-be-retrieved multimedia. The present invention also relates to an electronic device. The multimedia information retrieval method and electronic device of the present invention can automatically, quickly, and comprehensively present to a user the multimedia information the user wants to know, thus greatly improving user retrieval efficiency and retrieval success rate.
Type: Grant
Filed: February 4, 2015
Date of Patent: July 11, 2017
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventors: Peng Hu, Teng Zhang
-
Patent number: 9646605
Abstract: A system and method are presented for using spoken word verification to reduce false alarms by exploiting global and local contexts on a lexical level, a phoneme level, and on an acoustical level. The reduction of false alarms may occur through a process that determines whether a word has been detected or if it is a false alarm. Training examples are used to generate models of internal and external contexts which are compared to test word examples. The word may be accepted or rejected based on comparison results. Comparison may be performed either at the end of the process or at multiple steps of the process to determine whether the word is rejected.
Type: Grant
Filed: January 22, 2013
Date of Patent: May 9, 2017
Assignee: Interactive Intelligence Group, Inc.
Inventors: Konstantin Biatov, Aravind Ganapathiraju, Felix Immanuel Wyss
-
Patent number: 9571425
Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
Type: Grant
Filed: March 21, 2013
Date of Patent: February 14, 2017
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
-
Patent number: 9508341
Abstract: Features are disclosed for active learning to identify the words which are likely to improve the guessing and automatic speech recognition (ASR) after manual annotation. When a speech recognition system needs pronunciations for words, a lexicon is typically used. For unknown words, pronunciation-guessing (G2P) may be included to provide pronunciations in an unattended (e.g., automatic) fashion. However, having manually (e.g., by a human) annotated pronunciations provides better ASR than having automatic pronunciations that may, in some instances, be wrong. The included active learning features help to direct these limited annotation resources.
Type: Grant
Filed: September 3, 2014
Date of Patent: November 29, 2016
Assignee: Amazon Technologies, Inc.
Inventors: Alok Ulhas Parlikar, Andrew Jake Rosenbaum, Jeffrey Paul Lilly, Jeffrey Penrod Adams
-
Patent number: 9401148
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for inputting speech data that corresponds to a particular utterance to a neural network; determining an evaluation vector based on output at a hidden layer of the neural network; comparing the evaluation vector with a reference vector that corresponds to a past utterance of a particular speaker; and based on comparing the evaluation vector and the reference vector, determining whether the particular utterance was likely spoken by the particular speaker.
Type: Grant
Filed: March 28, 2014
Date of Patent: July 26, 2016
Assignee: Google Inc.
Inventors: Xin Lei, Erik McDermott, Ehsan Variani, Ignacio L. Moreno
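The key idea here is that the speaker vector is read off a hidden layer rather than the network's output. A minimal sketch, assuming a toy randomly initialized network standing in for a trained one, and cosine similarity with an arbitrary 0.7 threshold as the comparison (both are assumptions for illustration, not details from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy one-hidden-layer network; in practice this would be trained on speech.
# The output layer is omitted because verification uses the hidden activation.
W_hidden = rng.standard_normal((8, 16))

def hidden_vector(features):
    """Evaluation vector: the hidden-layer activation for an utterance."""
    return np.tanh(W_hidden @ features)

def verify(features, reference_vector, threshold=0.7):
    """Cosine-compare the evaluation vector against a reference vector
    derived from a past utterance of the claimed speaker."""
    ev = hidden_vector(features)
    sim = ev @ reference_vector / (
        np.linalg.norm(ev) * np.linalg.norm(reference_vector))
    return sim >= threshold

enroll = np.ones(16)                # stand-in for enrollment speech features
reference = hidden_vector(enroll)   # stored from a past utterance
assert verify(enroll, reference)        # same features: accepted
assert not verify(-enroll, reference)   # mirrored features: rejected
```

In real systems the reference vector is usually an average over several enrollment utterances, which this sketch skips.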
-
Patent number: 9390709
Abstract: A semiconductor integrated circuit device for voice recognition includes: a signal processing unit which generates a feature pattern representing a state of distribution of frequency components of an input voice signal; a voice recognition database storage unit which stores a voice recognition database including a standard pattern representing a state of distribution of frequency components of plural phonemes; a conversion list storage unit which stores a conversion list including plural words or sentences to be conversion candidates; a standard pattern extraction unit which extracts a standard pattern corresponding to character data representing the first syllable of each word or sentence included in the conversion list, from the voice recognition database; and a matching detection unit which compares the feature pattern generated from the first syllable of the voice signal with the extracted standard pattern and thus detects the matching of the syllable.
Type: Grant
Filed: September 20, 2013
Date of Patent: July 12, 2016
Assignee: SEIKO EPSON CORPORATION
Inventor: Tsutomu Nonaka
-
Patent number: 9286892
Abstract: Some implementations include a computer-implemented method. The method can include providing a training set of text samples to a semantic parser that associates text samples with actions. The method can include obtaining, for each of one or more of the text samples of the training set, data that indicates one or more domains that the semantic parser has associated with the text sample. For each of one or more domains, a subset of the text samples of the training set can be generated that the semantic parser has associated with the domain. Using the subset of text samples associated with the domain, a language model can be generated for one or more of the domains. Speech recognition can be performed on an utterance using the one or more language models that are generated for the one or more of the domains.
Type: Grant
Filed: April 1, 2014
Date of Patent: March 15, 2016
Assignee: Google Inc.
Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein
-
Patent number: 9043207
Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
Type: Grant
Filed: November 12, 2009
Date of Patent: May 26, 2015
Assignee: Agnitio S.L.
Inventors: Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez, Marta Garcia Gomar
-
Publication number: 20150142441
Abstract: A display apparatus is provided. The display apparatus includes a communicator configured to communicate with a voice recognition apparatus that recognizes an uttered voice of a user, an input unit configured to receive the uttered voice of the user, a display unit configured to receive voice recognition result information about the uttered voice of the user from the voice recognition apparatus and display the voice recognition result information, and a processor configured to, when the display apparatus is turned on, perform an access to the voice recognition apparatus by transmitting access request information to the voice recognition apparatus, and when the uttered voice is inputted through the input unit, transmit voice information on the uttered voice to the voice recognition apparatus through the communicator.
Type: Application
Filed: November 18, 2014
Publication date: May 21, 2015
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Myung-jae KIM, Hee-seob RYU, Kwang-il HWANG
-
Publication number: 20150142440
Abstract: Feedback mechanisms to the user of a Head Mounted Display (HMD) are provided. It is important to provide feedback to the user when speech is recognized as soon as possible after the user utters a voice command. The HMD displays and/or audibly renders an ASR acknowledgment in a manner that ensures the user that the HMD has received/understood his voiced command.
Type: Application
Filed: November 13, 2014
Publication date: May 21, 2015
Inventors: Christopher Parkinson, James Woodall
-
Patent number: 9026442
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: August 14, 2014
Date of Patent: May 5, 2015
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
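The "weighted sum of acoustic models" step can be illustrated with a deliberate simplification: here each phoneme's acoustic model is reduced to a single mean vector, and the custom model for a dictionary phoneme is the normalized weighted combination of the mean vectors of the plausible phonemes from the lattice. Real acoustic models (e.g. Gaussian mixtures) are richer than this; the function and weight values are invented for the example.

```python
import numpy as np

def custom_phoneme_model(plausible_models, weights):
    """Build a dictionary phoneme's custom model as a weighted sum of the
    acoustic models (simplified to mean vectors) of all plausible phonemes."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize lattice weights
    return sum(wi * m for wi, m in zip(w, plausible_models))

# Two plausible phonemes with toy 2-D mean vectors; the lattice weighs
# the first three times as heavily as the second.
models = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
custom = custom_phoneme_model(models, [3.0, 1.0])
assert np.allclose(custom, [0.75, 0.25])
```

The pronouncing dictionary itself is untouched, matching the abstract: only the acoustic representation behind each dictionary phoneme is replaced.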
-
Publication number: 20150112681
Abstract: A voice retrieval device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: setting detection criteria for a retrieval word, based on a characteristic of the retrieval word, such that the higher the detection accuracy of the retrieval word or the lower the pronunciation difficulty of the retrieval word or the lower the appearance probability of the retrieval word, the stricter the detection criteria; performing first voice retrieval processing on voice data according to the detection criteria and detecting a section that possibly includes the retrieval word as a candidate section from the voice data; and performing second voice retrieval processing different from the first voice retrieval processing on each candidate section and determining whether or not the retrieval word is included in each candidate section.
Type: Application
Filed: October 16, 2014
Publication date: April 23, 2015
Applicant: Fujitsu Limited
Inventors: Masakiyo TANAKA, Hitoshi Iwamida, Nobuyuki Washio
-
Publication number: 20150112682
Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify that the speaker's voice corresponds to the speaker whose identity is to be verified based on the received voice utterance; and c) verifying that the received voice utterance is not falsified, preferably after having verified the speaker's voice; d) accepting the speaker's identity to be verified in case that both verification steps give a positive result and not accepting the speaker's identity to be verified if any of the verification steps give a negative result. The invention further refers to a corresponding computer readable medium and a computer.
Type: Application
Filed: January 5, 2015
Publication date: April 23, 2015
Inventors: Luis Buera Rodriguez, Marta Garcia Gomar, Marta Sanchez Asenjo, Alberto Martin de los Santos de las Heras, Alfredo Gutierrez, Carlos Vaquero Aviles-Casco, Alfonso Ortega Gimenez
-
Publication number: 20150095029
Abstract: Engaging persona candidates are provided with a skills assessment that includes vocal behavior. Each candidate provides both scripted and spontaneous answers to questions in a situational setting that closely matches the daily demands of the customer support industry. Samples of the candidate's speech are evaluated to identify distinct voice cues that qualitatively describe speech characteristics, which are scored based on the candidate's spoken performance. One or more of the voice cues are mapped to phonetic analytics that quantitatively describe vocal behavior. Each voice cue also has an assigned weight. The voice cue scores for each phonetic analytic are multiplied by their assigned weights and added together to form a weighted phonetic analytic, which is then used to form a part of the vocal behavior risk assessments.
Type: Application
Filed: October 2, 2013
Publication date: April 2, 2015
Applicant: StarTek, Inc.
Inventors: Ted Nardin, James Keaten
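The scoring arithmetic in this abstract (multiply each voice-cue score by its assigned weight, then sum) is a plain weighted sum. A minimal sketch, with cue names, scores, and weights invented for illustration:

```python
def weighted_analytic(cue_scores, cue_weights):
    """Multiply each voice-cue score by its assigned weight and sum the
    products to form a weighted phonetic analytic."""
    return sum(cue_scores[cue] * cue_weights[cue] for cue in cue_scores)

# Hypothetical cue scores (from a candidate's spoken performance)
# and assigned weights (which cues matter most for this analytic).
scores = {"pace": 4.0, "clarity": 3.5, "warmth": 5.0}
weights = {"pace": 0.2, "clarity": 0.5, "warmth": 0.3}

# 4.0*0.2 + 3.5*0.5 + 5.0*0.3 = 0.8 + 1.75 + 1.5 = 4.05
assert abs(weighted_analytic(scores, weights) - 4.05) < 1e-9
```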
-
Patent number: 8996387
Abstract: For clearing transaction data selected for a processing, there is generated in a portable data carrier (1) a transaction acoustic signal (003; 103; 203) (S007; S107; S207) upon whose acoustic reproduction by an end device (10) at least transaction data selected for the processing are reproduced superimposed acoustically with a melody specific to a user of the data carrier (1) (S009; S109; S209). The generated transaction acoustic signal (003; 103; 203) is electronically transferred to an end device (10) (S108; S208), which processes the selected transaction data (S011; S121; S216) only when the user of the data carrier (1) confirms vis-à-vis the end device (10) an at least partial match both of the acoustically reproduced melody with the user-specific melody and of the acoustically reproduced transaction data with the selected transaction data (S010; S110, S116; S210).
Type: Grant
Filed: September 8, 2009
Date of Patent: March 31, 2015
Assignee: Giesecke & Devrient GmbH
Inventors: Thomas Stocker, Michael Baldischweiler
-
Patent number: 8996373
Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
Type: Grant
Filed: October 5, 2011
Date of Patent: March 31, 2015
Assignee: Fujitsu Limited
Inventors: Shoji Hayakawa, Naoshi Matsuo
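The determination step reduces to comparing two model likelihoods for the same input. A toy sketch, assuming each speaker-state model is a one-dimensional Gaussian over a single speech feature (real models would be far richer; the means and variances here are invented):

```python
import math

def log_likelihood(sample, mean, var):
    """Log-likelihood of a 1-D feature under a Gaussian state model."""
    return -0.5 * (math.log(2 * math.pi * var) + (sample - mean) ** 2 / var)

def detect_state(sample, undepressed=(0.0, 1.0), depressed=(2.0, 1.0)):
    """Compare the likelihoods of the two specific-speaker models and
    return the state whose model better explains the input voice."""
    l_undep = log_likelihood(sample, *undepressed)
    l_dep = log_likelihood(sample, *depressed)
    return "undepressed" if l_undep >= l_dep else "depressed"

assert detect_state(0.2) == "undepressed"   # feature near the first model
assert detect_state(1.8) == "depressed"     # feature near the second model
```

Working in log-likelihoods is the standard trick: products of many per-frame probabilities become sums, avoiding floating-point underflow.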
-
Publication number: 20150088514
Abstract: Techniques for providing virtual assistants to assist users during a voice communication between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. For instance, the first user may utilize her device to place a telephone call to the device of the second user. A virtual assistant may also join the call and, upon invocation by a user on the call, may identify voice commands from the call and may perform corresponding tasks for the users in response.
Type: Application
Filed: September 25, 2013
Publication date: March 26, 2015
Applicant: Rawles LLC
Inventor: Marcello Typrin
-
Publication number: 20150081301
Abstract: A system includes a user speech profile stored on a computer readable storage device, the speech profile containing a plurality of phonemes with user identifying characteristics for the phonemes, and a speech processor coupled to access the speech profile to generate a phrase containing user distinguishing phonemes based on a difference between the user identifying characteristics for such phonemes and average user identifying characteristics, such that the phrase has discriminability from other users. The speech processor may also or alternatively select the phrase as a function of ambient noise.
Type: Application
Filed: September 18, 2013
Publication date: March 19, 2015
Applicant: Lenovo (Singapore) Pte, Ltd.
Inventors: John Weldon Nicholson, Steven Richard Perrin
-
Publication number: 20150081302
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic feature can include a movement pattern of head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.
Type: Application
Filed: November 24, 2014
Publication date: March 19, 2015
Inventors: Ann K. SYRDAL, Sumit CHOPRA, Patrick Haffner, Taniya MISHRA, Ilija ZELJKOVIC, Eric Zavesky
-
Patent number: 8977547
Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold Tl; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
Type: Grant
Filed: October 8, 2009
Date of Patent: March 10, 2015
Assignee: Mitsubishi Electric Corporation
Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
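The stability check (accept a registration only if the repeated utterances are similar enough to one another) can be sketched with a pairwise comparison. This is an illustrative approximation: the similarity function, the scalar "features", and the threshold are all invented for the example, and the patent's actual similarity measure may differ.

```python
def registration_acceptable(utterance_features, similarity, threshold=0.75):
    """Accept registration only if every pair of repeated utterances
    exceeds the similarity threshold (utterance stability check)."""
    pairs = [(a, b)
             for i, a in enumerate(utterance_features)
             for b in utterance_features[i + 1:]]
    return all(similarity(a, b) > threshold for a, b in pairs)

# Toy similarity on scalar features: closer values are more similar.
sim = lambda a, b: 1.0 - abs(a - b)

assert registration_acceptable([0.5, 0.55, 0.6], sim)       # stable: accept
assert not registration_acceptable([0.1, 0.9], sim)         # unstable: reject
```

Only accepted utterance sets would then feed the standard-pattern creation step.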
-
Publication number: 20150058017
Abstract: Disclosed in some examples are systems, methods, devices, and machine readable mediums which may produce an audio recording with included verification from the individuals in the recording that the recording is accurate. In some examples, the system may also provide rights management control to those individuals. This may ensure that individuals participating in audio events that are to be recorded are assured that their words are not changed, taken out of context, or otherwise altered and that they retain control over the use of their words even after the physical file has left their control.
Type: Application
Filed: August 20, 2013
Publication date: February 26, 2015
Inventors: Dave Paul Singh, Dominic Fulginti, Mahendra Tadi Tadikonda, Tobias Kohlenberg
-
Patent number: 8949125
Abstract: Systems and methods are provided to select a most typical pronunciation of a location name on a map from a plurality of user pronunciations. A server generates a reference speech model based on user pronunciations, compares the user pronunciations with the speech model and selects a pronunciation based on the comparison. Alternatively, the server compares the distances between each user pronunciation and every other user pronunciation and selects a pronunciation based on the comparison. The server then annotates the map with the selected pronunciation and provides the audio output of the location name to a user device upon a user's request.
Type: Grant
Filed: June 16, 2010
Date of Patent: February 3, 2015
Assignee: Google Inc.
Inventor: Gal Chechik
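The pairwise-distance variant amounts to picking a medoid: the pronunciation whose summed distance to all the others is smallest is the "most typical" one. A sketch, using one-dimensional stand-in features and absolute difference as the distance (a real system would compare acoustic feature sequences, e.g. with a DTW-style distance; this choice is an assumption for illustration):

```python
def select_typical(pronunciations, distance):
    """Return the pronunciation whose summed distance to every other
    pronunciation is smallest (the medoid of the set)."""
    return min(pronunciations,
               key=lambda p: sum(distance(p, q) for q in pronunciations))

# Toy 1-D pronunciation features; one outlier at 5.0.
samples = [0.9, 1.0, 1.1, 1.3, 5.0]
typical = select_typical(samples, lambda a, b: abs(a - b))
assert typical == 1.1   # the central sample, robust to the outlier
```

Unlike averaging, medoid selection always returns one of the actual user pronunciations, so the annotation on the map is a real recording rather than a synthetic blend.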
-
Patent number: 8947499
Abstract: Methods and systems for communicating with rate control. A communication is sent and received from a first device to a second device over a network, wherein the communication comprises at least one audio stream and a second communication stream. A capacity of the network is probed at the first device for the sending and receiving the communication. A presence of a voice in the at least one audio stream is detected at the first device via a voice activity detection of the at least one audio stream. A rate limit is set for the sending and receiving the communication at the first device based on the capacity of the network and the detection of the presence of the at least one audio stream.
Type: Grant
Filed: December 6, 2012
Date of Patent: February 3, 2015
Assignee: TangoMe, Inc.
Inventors: Alexander Subbotin, Olivier Furon, Shaowei Su, Yevgeni Litvin, Xu Liu
-
Publication number: 20150019219Abstract: Systems and methods for arbitrating spoken dialog services include determining a capability catalog associated with a plurality of devices accessible within an environment. The capability catalog includes a list of the plurality of devices mapped to a list of spoken dialog services provided by each of the plurality of devices. The system arbitrates between the plurality of devices and the spoken dialog services in the capability catalog to determine a selected device and a selected dialog service.Type: ApplicationFiled: December 2, 2013Publication date: January 15, 2015Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLCInventors: ELI TZIRKEL-HANCOCK, GREG T. LINDEMANN, ROBERT D. SIMS, OMER TSIMHONI
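The capability catalog described above is essentially a device-to-services map, and arbitration is a lookup plus a preference rule. A hedged sketch, where the device names, service names, and fixed priority order are all invented for illustration:

```python
# Illustrative arbitration over a capability catalog: the catalog maps each
# device to the spoken dialog services it provides; given a requested
# service, pick the first device offering it in a priority order. The
# priority scheme below is an assumption, not the patented logic.

catalog = {
    "head_unit": ["navigation", "media"],
    "phone": ["navigation", "calls", "messages"],
}

def arbitrate(service, priority=("head_unit", "phone")):
    """Return (selected device, selected service), or (None, service)."""
    for device in priority:
        if service in catalog.get(device, []):
            return device, service
    return None, service

print(arbitrate("calls"))       # → ('phone', 'calls')
print(arbitrate("navigation"))  # → ('head_unit', 'navigation')
```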
-
Publication number: 20150006176Abstract: A speech-based audio device may be configured to detect a user-uttered wake expression and to respond by interpreting subsequent words or phrases as commands. In order to distinguish between utterance of the wake expression by the user and generation of the wake expression by the device itself, directional audio signals may be analyzed to detect whether the wake expression has been received from multiple directions. If the wake expression has been received from many directions, it is declared as being generated by the audio device and ignored. Otherwise, if the wake expression is received from a single direction or a limited number of directions, the wake expression is declared as being uttered by the user and subsequent words or phrases are interpreted and acted upon by the audio device.Type: ApplicationFiled: June 27, 2013Publication date: January 1, 2015Inventors: Michael Alan Pogue, Philip Ryan Hilmes
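The core of the direction test is a simple count: a wake word heard on many directional beams at once is likely the device's own output. A minimal sketch, assuming detections arrive as a list of beam angles and that "limited number" means at most two distinct directions (the threshold is an illustrative assumption):

```python
# Minimal sketch of the direction test: if a wake expression is detected on
# many directional beams at once, treat it as device self-generated audio
# and ignore it. The threshold of 2 distinct directions is an assumption.

def is_user_utterance(detected_directions, max_directions=2):
    """True if the wake word came from few enough directions to be a user."""
    return len(set(detected_directions)) <= max_directions

print(is_user_utterance([30]))               # → True  (single direction)
print(is_user_utterance([0, 90, 180, 270]))  # → False (heard everywhere)
```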
-
Patent number: 8914278Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.Type: GrantFiled: July 31, 2008Date of Patent: December 16, 2014Assignee: Ginger Software, Inc.Inventors: Yael Karov Zangvil, Avner Zangvil
-
Publication number: 20140365219Abstract: A method for verifying that a person is registered to use a telemedical device includes identifying an unprompted trigger phrase in words spoken by a person and received by the telemedical device. The telemedical device prompts the person to state a name of a registered user and optionally prompts the person to state health tips for the person. The telemedical device verifies that the person is the registered user using utterance data generated from the unprompted trigger phrase, name of the registered user, and health tips.Type: ApplicationFiled: August 26, 2014Publication date: December 11, 2014Inventors: Fuliang Weng, Taufiq Hasan, Zhe Feng
-
Publication number: 20140358535Abstract: A method of performing a voice command function in an electronic device includes detecting voice of a user, acquiring one or more pieces of attribute information from the voice, and authenticating the user by comparing the attribute information with pre-stored authentic attribute information, using a recognition model. An electronic device includes a voice input module configured to detect a voice of a user, a first processor configured to acquire one or more pieces of attribute information from the voice and authenticate the user by comparing the attribute information with a recognition model, and a second processor configured to, when the attribute information matches the recognition model, activate the voice command function, receive a voice command of the user, and execute an application corresponding to the voice command. Other embodiments are also disclosed.Type: ApplicationFiled: May 28, 2014Publication date: December 4, 2014Applicant: Samsung Electronics Co., Ltd.Inventors: Sanghoon Lee, Kyungtae Kim, Subhojit Chakladar, Taejin Lee, Seokyeong Jung
-
Publication number: 20140350933Abstract: A voice recognition apparatus includes: an extractor configured to extract utterance elements from a user's uttered voice; an LSP converter configured to convert the extracted utterance elements into LSP formats; and a controller configured to determine whether an utterance element related to an OOV exists among the utterance elements converted into the LSP formats with reference to vocabulary list information including pre-registered vocabularies, and to determine an OOD area in which it is impossible to provide response information in response to the uttered voice, in response to determining that the utterance element related to the OOV exists. Accordingly, the voice recognition apparatus provides appropriate response information according to a user's intent by considering a variety of utterances and possibilities regarding a user's uttered voice.Type: ApplicationFiled: May 27, 2014Publication date: November 27, 2014Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventors: Eun-sang BAK, Kyung-duk KIM, Hyung-jong NOH, Seong-han RYU, Geun-bae LEE
-
Publication number: 20140350934Abstract: Systems and methods are provided for voice identification. For example, audio characteristics are extracted from acquired voice signals; a syllable confusion network is identified based on at least information associated with the audio characteristics; a word lattice is generated based on at least information associated with the syllable confusion network and a predetermined phonetic dictionary; and an optimal character sequence is calculated in the word lattice as an identification result.Type: ApplicationFiled: May 30, 2014Publication date: November 27, 2014Applicant: Tencent Technology (Shenzhen) Company LimitedInventors: Lou Li, Li Lu, Xiang Zhang, Feng Rao, Shuai Yue, Bo Chen, Jianxiong Ma, Haibo Liu
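The pipeline in this abstract runs syllable confusion network → word lattice → optimal character sequence. A heavily simplified sketch: the network is modeled as a list of slots, each a dict of candidate syllables to scores; picking the best candidate per slot and mapping through a toy phonetic dictionary stands in for the full lattice search. The network, dictionary, and scores below are all invented for illustration:

```python
# Hedged sketch of the decoding chain: a syllable confusion network as a
# list of slots (candidate syllable -> score), reduced to the best syllable
# per slot, then mapped through a toy phonetic dictionary. The real method
# builds and searches a word lattice; this greedy stand-in is illustrative.

def best_sequence(confusion_network, dictionary):
    """Pick the top syllable in each slot and look the sequence up."""
    syllables = [max(slot, key=slot.get) for slot in confusion_network]
    return dictionary.get(tuple(syllables), " ".join(syllables))

network = [{"ni": 0.9, "li": 0.1}, {"hao": 0.8, "hou": 0.2}]
dictionary = {("ni", "hao"): "你好"}
print(best_sequence(network, dictionary))  # → 你好
```

A production decoder would score full paths jointly (e.g. with dynamic programming) rather than per slot.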
-
Publication number: 20140324432Abstract: A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.Type: ApplicationFiled: May 2, 2014Publication date: October 30, 2014Applicant: AT&T Intellectual Property I, L.P.Inventor: Robert Wesley Bossemeyer, JR.
-
Patent number: 8868431Abstract: A recognition dictionary creation device identifies the language of a reading of an inputted text which is a target to be registered and adds a reading with phonemes in the language identified thereby to the target text to be registered, and also converts the reading of the target text to be registered from the phonemes in the language identified thereby to phonemes in a language to be recognized which is handled in voice recognition to create a recognition dictionary in which the converted reading of the target text to be registered is registered.Type: GrantFiled: February 5, 2010Date of Patent: October 21, 2014Assignee: Mitsubishi Electric CorporationInventors: Michihiro Yamazaki, Jun Ishii, Yasushi Ishikawa
-
Publication number: 20140288932Abstract: An interactive response system mixes HSR subsystems with ASR subsystems to facilitate overall capability of voice user interfaces. The system permits imperfect ASR subsystems to nonetheless relieve burden on HSR subsystems. An ASR proxy is used to implement an IVR system, and the proxy dynamically determines how many ASR and HSR subsystems are to perform recognition for any particular utterance, based on factors such as confidence thresholds of the ASRs and availability of human resources for HSRs.Type: ApplicationFiled: July 8, 2013Publication date: September 25, 2014Inventors: Yoryos Yeracaris, Alwin B. Carus, Larissa Lapshina
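The proxy's routing decision turns on two of the factors the abstract names: ASR confidence and human (HSR) availability. A minimal sketch of that decision, with the threshold value and availability model invented for illustration (the patented proxy weighs more factors and can fan out to several subsystems):

```python
# Illustrative ASR-proxy routing: accept the automated result when its
# confidence clears a threshold; otherwise fall back to a human recognizer
# if one is free. Threshold and availability model are assumptions.

def route_utterance(asr_confidence, humans_available, threshold=0.85):
    """Return which subsystem's result to use for this utterance."""
    if asr_confidence >= threshold:
        return "asr"
    return "hsr" if humans_available > 0 else "asr"

print(route_utterance(0.92, humans_available=3))  # → asr
print(route_utterance(0.40, humans_available=3))  # → hsr
print(route_utterance(0.40, humans_available=0))  # → asr (degraded)
```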
-
Publication number: 20140278419Abstract: A voice command definition file (VCDF) declaratively defines voice commands for an application. For example, the VCDF may include definitions for: voice commands; one or more phrases/utterances that may be said to execute each of the commands; a navigation location to navigate to within the application (e.g. a page); phrase lists containing items that may be used as a parameter in a voice command; examples; feedback; and the like. A user may say a single utterance to launch the application, navigate to the associated location of the command and execute the command. The VCDF may define multiple ways to listen for a particular command. The VCDF may be edited/defined by a user and may include a user friendly name for an application. A speech engine loads the VCDF for use such that it may recognize the commands associated with an application. The definitions may be updated during runtime.Type: ApplicationFiled: March 14, 2013Publication date: September 18, 2014Inventors: F. Avery Bishop, Travis Wilson, Robert Chambers, Robert Brown
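The abstract enumerates what a VCDF declares: commands, listen phrases, a navigation target, and phrase lists usable as parameters. A hypothetical, much-simplified version of such a structure and a matcher over it; the schema, app name, command names, and phrases below are all invented, and the real VCDF format is not specified here:

```python
# Hypothetical command-definition structure in the spirit of the abstract:
# each command lists listen-phrases, a navigation target, and a phrase list
# usable as a parameter. The schema is invented for illustration only.

VCDF = {
    "app_name": "MovieApp",
    "commands": {
        "PlayMovie": {
            "listen_for": ["play {title}", "start {title}"],
            "navigate_to": "PlayerPage",
            "phrase_lists": {"title": ["Inception", "Up"]},
        }
    },
}

def match(utterance, vcdf):
    """Return (command, parameter) for the first matching listen phrase."""
    for name, cmd in vcdf["commands"].items():
        for pattern in cmd["listen_for"]:
            prefix = pattern.split("{")[0]       # text before the slot
            if utterance.startswith(prefix):
                title = utterance[len(prefix):]
                # Single phrase list assumed for this sketch.
                if title in cmd["phrase_lists"]["title"]:
                    return name, title
    return None, None

print(match("play Inception", VCDF))  # → ('PlayMovie', 'Inception')
```

Note how one utterance carries both the command and its parameter, matching the "single utterance" flow the abstract describes.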
-
Publication number: 20140278420Abstract: An electronic device digitally combines a single voice input with each of a series of noise samples. Each noise sample is taken from a different audio environment (e.g., street noise, babble, interior car noise). The voice input/noise sample combinations are used to train a voice recognition model database without the user having to repeat the voice input in each of the different environments. In one variation, the electronic device transmits the user's voice input to a server that maintains and trains the voice recognition model database.Type: ApplicationFiled: December 3, 2013Publication date: September 18, 2014Applicant: Motorola Mobility LLCInventors: John R. Meloney, Joel A. Clark, Joseph C. Dwyer, Adrian M. Schuster, Snehitha Singaraju, Robert A. Zurek
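The augmentation step is a per-sample digital mix: one clean voice recording combined with each noise sample so every environment yields a training utterance without re-recording. A minimal sketch using plain Python lists as stand-ins for audio buffers; the 0.3 noise gain and noise looping are illustrative assumptions:

```python
# Minimal sketch of the training-data augmentation: one clean voice sample
# is digitally combined with several noise samples so each environment
# yields one training utterance. Lists stand in for audio buffers; the
# 0.3 noise gain is an illustrative assumption.

def mix(voice, noise, noise_gain=0.3):
    """Element-wise mix of a voice buffer with a (looped) noise buffer."""
    return [v + noise_gain * noise[i % len(noise)] for i, v in enumerate(voice)]

voice = [0.5, -0.2, 0.1]
environments = {"street": [0.05, 0.02], "babble": [0.1, -0.1]}

# One noisy training copy per environment, from a single recording.
training_set = {name: mix(voice, n) for name, n in environments.items()}
print(sorted(training_set))  # one entry per noise environment
```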
-
Patent number: 8831942Abstract: A method is provided for identifying a gender of a speaker. The method steps include obtaining speech data of the speaker, extracting vowel-like speech frames from the speech data, analyzing the vowel-like speech frames to generate a feature vector having pitch values corresponding to the vowel-like frames, analyzing the pitch values to generate a most frequent pitch value, determining, in response to the most frequent pitch value being between a first pre-determined threshold and a second pre-determined threshold, an output of a male Gaussian Mixture Model (GMM) and an output of a female GMM using the pitch values as inputs to the male GMM and the female GMM, and identifying the gender of the speaker by comparing the output of the male GMM and the output of the female GMM based on a pre-determined criterion.Type: GrantFiled: March 19, 2010Date of Patent: September 9, 2014Assignee: Narus, Inc.Inventor: Antonio Nucci
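The decision path in this abstract hinges on the most frequent pitch value: below one threshold it indicates male, above another female, and only the in-between band requires the GMM comparison. A hedged sketch of that control flow, with the threshold values invented and the GMM scoring stubbed out:

```python
# Hedged sketch of the pitch-based decision path: take the most frequent
# pitch value from the vowel-like frames; outside the two thresholds the
# answer is immediate, and only the in-between band would invoke the
# male/female GMM comparison (stubbed here). Thresholds are assumptions.

from collections import Counter

def classify_gender(pitch_values, low=160.0, high=190.0):
    mode_pitch = Counter(pitch_values).most_common(1)[0][0]
    if mode_pitch < low:
        return "male"
    if mode_pitch > high:
        return "female"
    return "ambiguous: compare male/female GMM outputs"

print(classify_gender([110, 112, 110, 115]))  # → male
print(classify_gender([210, 208, 210]))       # → female
```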
-
Publication number: 20140244258Abstract: A voice recognition method for a single sentence including a multi-instruction in an interactive voice user interface includes the steps of detecting a connection ending by analyzing the morphemes of a single sentence on which voice recognition has been performed, separating the single sentence into a plurality of passages based on the connection ending, detecting a multi-connection ending by analyzing the connection ending, extracting instructions by specifically analyzing passages including the multi-connection ending, and outputting a multi-instruction included in the single sentence by combining the instructions extracted in the step of extracting instructions. In accordance with the present invention, consumer usability can be significantly increased because a multi-operation intention can be checked in one sentence.Type: ApplicationFiled: October 18, 2013Publication date: August 28, 2014Applicant: Mediazen Co., Ltd.Inventors: Minkyu SONG, Hyejin KIM, Sangyoon KIM
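The separation step splits one sentence into passages at connective morphemes, yielding one instruction per passage. A rough sketch in which English connectives ("and then" / "and") stand in for the Korean connection endings the patent actually analyzes; real morpheme analysis is far more involved than this string split:

```python
# Illustrative sketch: split one sentence into passages on connective
# markers and treat each passage as one instruction. English "and then" /
# "and" stand in for the Korean connection endings analyzed in the patent.

import re

def extract_instructions(sentence):
    passages = re.split(r"\s+(?:and then|and)\s+", sentence)
    return [p.strip() for p in passages if p.strip()]

print(extract_instructions("turn on the TV and then lower the volume"))
# → ['turn on the TV', 'lower the volume']
```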
-
Publication number: 20140236598Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.Type: ApplicationFiled: April 29, 2013Publication date: August 21, 2014Applicant: Google Inc.Inventor: Google Inc.
-
Publication number: 20140236599Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. A particular method includes detecting that a frequency of occurrence of a particular type of utterance satisfies a threshold. The method further includes tuning a speech recognition system with respect to the particular type of utterance.Type: ApplicationFiled: April 25, 2014Publication date: August 21, 2014Applicant: AT&T Intellectual Property I, L.P.Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
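The trigger condition is just an occurrence count against a threshold. A minimal sketch: tally utterance types from a log and flag the types frequent enough to warrant targeted tuning (the threshold of 3 and the type labels are illustrative assumptions):

```python
# Minimal sketch of the tuning trigger: count occurrences of each utterance
# type and flag types whose frequency meets a threshold for targeted
# tuning. The threshold of 3 is an illustrative assumption.

from collections import Counter

def types_to_tune(utterance_types, threshold=3):
    counts = Counter(utterance_types)
    return sorted(t for t, c in counts.items() if c >= threshold)

log = ["billing", "outage", "billing", "billing", "outage"]
print(types_to_tune(log))  # → ['billing']
```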
-
Patent number: 8812315Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.Type: GrantFiled: October 1, 2013Date of Patent: August 19, 2014Assignee: AT&T Intellectual Property I, L.P.Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
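The restructuring step replaces each dictionary phoneme's acoustic model with a weighted sum of the models of all plausible phonemes from the transcription lattice. A hedged numerical sketch in which single scalar "model means" stand in for full acoustic models, and the weights are invented for illustration:

```python
# Hedged sketch of acoustic model restructuring: each dictionary phoneme's
# model becomes a weighted sum of the native models of all plausible
# phonemes seen in the lattice. Scalar means stand in for full acoustic
# models; the phonemes and weights below are illustrative.

def restructure(native_models, weights):
    """Blend native phoneme models by the lattice-derived weights."""
    custom = {}
    for phoneme, w in weights.items():
        custom[phoneme] = sum(w[p] * native_models[p] for p in w)
    return custom

native_models = {"ae": 1.0, "eh": 2.0}    # toy 1-D model "means"
weights = {"ae": {"ae": 0.7, "eh": 0.3}}  # /ae/ drifts toward /eh/
print(restructure(native_models, weights))  # /ae/ blended toward /eh/
```

The pronouncing dictionary is untouched, which matches the abstract: only the acoustic space behind each phoneme changes.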