Subportions Patents (Class 704/249)
  • Patent number: 10963522
    Abstract: A method of determining relevancies of objects to a search query includes associating multiple tags with multiple objects, recording bookmarks to the multiple objects, or both, and determining a relevance score for each of the multiple objects and a search query. One embodiment of the method combines full-text relevance algorithms with tag relevance algorithms. Other embodiments include statistical relevance algorithms such as statistical classification or rank regression algorithms. When a user executes a search query, a results list containing the objects is returned, with the objects organized based on the relevance scores. The objects are organized by, for example, listing those with the highest relevance scores first or by marking them with an indication of their relevance.
    Type: Grant
    Filed: June 16, 2017
    Date of Patent: March 30, 2021
    Assignee: Pinterest, Inc.
    Inventors: Yunshan Lu, Michael Tanne
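A minimal sketch of the kind of score combination the abstract above describes: a full-text relevance score and a tag/bookmark relevance score are blended and the results list is ordered by the combined score. The linear weighting and both toy scoring functions are illustrative assumptions, not the patented algorithm.

```python
# Sketch: combine full-text and tag relevance, then order results by score.
from dataclasses import dataclass, field

@dataclass
class Obj:
    doc_id: str
    text: str
    tags: set = field(default_factory=set)
    bookmarks: int = 0

def full_text_score(obj: Obj, query: str) -> float:
    # Toy term-overlap score standing in for a real full-text relevance algorithm.
    terms = query.lower().split()
    words = obj.text.lower().split()
    return sum(words.count(t) for t in terms) / (len(words) or 1)

def tag_score(obj: Obj, query: str) -> float:
    # Fraction of query terms appearing as tags, nudged by recorded bookmarks.
    terms = set(query.lower().split())
    overlap = len(terms & {t.lower() for t in obj.tags}) / (len(terms) or 1)
    return overlap * (1.0 + 0.1 * obj.bookmarks)

def rank(objects, query, w_text=0.6, w_tag=0.4):
    scored = [(w_text * full_text_score(o, query) + w_tag * tag_score(o, query), o)
              for o in objects]
    # Highest relevance scores listed first, as in the abstract's results list.
    return sorted(scored, key=lambda s: s[0], reverse=True)

if __name__ == "__main__":
    docs = [Obj("a", "speech recognition on mobile devices", {"speech", "asr"}, 3),
            Obj("b", "cooking recipes for pasta", {"food"}, 0)]
    for score, obj in rank(docs, "speech recognition"):
        print(f"{obj.doc_id}: {score:.3f}")
```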
  • Patent number: 10848591
    Abstract: Systems and methods for sender profile and/or recipient profile disambiguation and/or confirmation are disclosed. In instances where a sender profile is not indicated by a user sending a communication from a communal device, heuristic data may be utilized to infer the sender profile. Similar heuristic data may also be used when selection of the sender profile is associated with a low confidence level. Heuristic data may also be used to infer the recipient profile when the user does not indicate the recipient profile or when selection of the recipient profile is associated with a low confidence. Various confirmations may result from the sender and recipient profile disambiguation.
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: November 24, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Christo Frank Devaraj, James Alexander Stanton, Sumedha Arvind Kshirsagar, Christopher Geiger Parker, Aaron Takayanagi Barnet, Venkatesh Kancharla, Gregory Michael Hart
  • Patent number: 10832682
    Abstract: The method comprises receiving first audio comprising speech from a user of a computing device, detecting an end of speech in the first audio, generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech, determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result, and processing second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: November 10, 2020
    Assignee: Nuance Communications, Inc.
    Inventor: Mark Fanty
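A minimal control-flow sketch of the decision in the entry above: detect the end of speech, run ASR on the audio captured before that point, and only fall back to processing more audio when the speech-enabled application cannot perform a valid action with the ASR result. All classes here are toy stand-ins, not Nuance components.

```python
# Sketch: end-of-speech detection, ASR on the prior portion, valid-action check.
class ToyVAD:
    def find_end_of_speech(self, audio):
        # Pretend a run of "silence" markers ("___") marks the end of speech.
        return audio.find("___") if "___" in audio else None

class ToyASR:
    def recognize(self, audio):
        return audio.replace("_", " ").strip()

class ToyApp:
    VALID = {"call mom", "play music"}
    def can_perform(self, text):
        return text in self.VALID
    def perform(self, text):
        return f"performed: {text}"

def handle(first_audio, second_audio, app, asr, vad):
    end = vad.find_end_of_speech(first_audio)
    if end is None:
        return "still listening"
    result = asr.recognize(first_audio[:end])   # ASR on audio prior to end of speech
    if app.can_perform(result):                 # valid action available?
        return app.perform(result)
    # No valid action: process the second audio as well.
    return asr.recognize(first_audio[:end] + "_" + second_audio)

if __name__ == "__main__":
    print(handle("call_mom___noise", "anything", ToyApp(), ToyASR(), ToyVAD()))
    print(handle("call_dad___", "please", ToyApp(), ToyASR(), ToyVAD()))
```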
  • Patent number: 10811006
    Abstract: An information processing method for information stored in a storage includes: holding a dialog history of a dialog including a question to a user and a reply from the user to the question, determining whether a manner in which a third reply indicating neither a first reply nor a second reply appears in a reply history of the reply included in the held dialog history satisfies a predetermined condition, the first reply indicating an affirmative in response to the question, the second reply indicating a negative in response to the question; and performing presentation regarding the information stored in the storage if the manner is determined to satisfy the predetermined condition.
    Type: Grant
    Filed: November 22, 2017
    Date of Patent: October 20, 2020
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Katsuyoshi Yamagami, Mitsuru Endo, Takashi Ushio
  • Patent number: 10804938
    Abstract: Systems and methods are disclosed for decoding data. A first block of data may be obtained from a storage medium or received from a computing device. The first block of data includes a first codeword generated based on an error correction code. A first set of likelihood values is obtained from a neural network. The first set of likelihood values indicates probabilities that the first codeword will be decoded into one of a plurality of decoded values. A second set of likelihood values is obtained from a decoder based on the first block of data. The second set of likelihood values indicates probabilities that the first codeword will be decoded into one of the plurality of decoded values. The first codeword is decoded to obtain a decoded value based on the first set of likelihood values and the second set of likelihood values.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: October 13, 2020
    Assignee: Western Digital Technologies, Inc.
    Inventor: Minghai Qin
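A sketch of fusing two likelihood estimates over candidate decoded values, as the abstract above describes: one set from a neural network and one from a conventional decoder, combined before picking the decoded value. The equal-weight log-domain product rule is an assumption for illustration, not Western Digital's scheme.

```python
# Sketch: combine NN likelihoods with decoder likelihoods and decode.
import numpy as np

def decode(nn_likelihoods: np.ndarray, decoder_likelihoods: np.ndarray) -> int:
    """Return the index of the most likely decoded value."""
    # Work in log space and weight both sources equally.
    combined = np.log(nn_likelihoods + 1e-12) + np.log(decoder_likelihoods + 1e-12)
    return int(np.argmax(combined))

if __name__ == "__main__":
    nn = np.array([0.10, 0.70, 0.20])    # P(decoded value | codeword) from the network
    dec = np.array([0.30, 0.40, 0.30])   # same probabilities from the ECC decoder
    print("decoded value index:", decode(nn, dec))   # -> 1
```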
  • Patent number: 10726832
    Abstract: Systems and processes for operating an intelligent automated assistant to perform intelligent list reading are provided. In accordance with one example, a method includes, at an electronic device having one or more processors, receiving a natural-language input corresponding to a domain; providing the natural-language input to an external device; receiving, from the external device, a process flow corresponding to the domain; determining, with the process flow corresponding to the domain, a task associated with the natural-language input; performing the task; and providing an output indicating whether the task has been performed.
    Type: Grant
    Filed: March 9, 2018
    Date of Patent: July 28, 2020
    Assignee: Apple Inc.
    Inventors: Brandon J. Newendorp, Joanna S. Peterson
  • Patent number: 10726849
    Abstract: This application describes methods and apparatus for speaker recognition. An apparatus according to an embodiment has an analyzer (202) for analyzing each frame of a sequence of frames of audio data (AIN) which correspond to speech sounds uttered by a user to determine at least one characteristic of the speech sound of that frame. An assessment module (203) determines, for each frame of audio data, a contribution indicator of the extent to which the frame of audio data should be used for speaker recognition processing based on the determined characteristic of the speech sound. In this way frames which correspond to speech sounds that are of most use for speaker discrimination may be emphasized and/or frames which correspond to speech sounds that are of least use for speaker discrimination may be de-emphasized.
    Type: Grant
    Filed: August 1, 2017
    Date of Patent: July 28, 2020
    Assignee: Cirrus Logic, Inc.
    Inventors: John Paul Lesso, John Laurence Melanson
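A sketch of weighting frames by how useful their speech sound is for speaker discrimination and pooling them into one utterance-level vector, in the spirit of the contribution indicator above. The per-class weights and the simple weighted mean are illustrative assumptions, not Cirrus Logic's assessment module.

```python
# Sketch: per-frame contribution weights drive a weighted pooling of features.
import numpy as np

# Assumed contribution indicators: voiced vowels carry more speaker information
# than unvoiced fricatives or silence.
CONTRIBUTION = {"vowel": 1.0, "nasal": 0.8, "fricative": 0.3, "silence": 0.0}

def pool_frames(frame_features: np.ndarray, frame_classes: list[str]) -> np.ndarray:
    """Weighted average of per-frame features using per-class contributions."""
    w = np.array([CONTRIBUTION.get(c, 0.5) for c in frame_classes])
    if w.sum() == 0:
        return np.zeros(frame_features.shape[1])
    return (w[:, None] * frame_features).sum(axis=0) / w.sum()

if __name__ == "__main__":
    feats = np.random.default_rng(0).normal(size=(4, 3))   # 4 frames, 3-dim features
    classes = ["silence", "vowel", "vowel", "fricative"]
    print(pool_frames(feats, classes))
```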
  • Patent number: 10706422
    Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: July 7, 2020
    Assignee: Capital One Services, LLC
    Inventors: Lawrence Douglas, Paul Y. Moreton
  • Patent number: 10650829
    Abstract: Methods, systems and computer program products for operating a voice response system in a multiuser environment are provided. Aspects include receiving a voice command from a first user and determining an identity of the first user based at least in part on a voice recognition of the first user. Aspects also include determining an identity of one or more other users in range of the voice response system and obtaining a command hierarchy. Aspects further include performing an action requested by the voice command based on a determination that the first user is authorized to request the voice command, wherein the determination that the first user is authorized to request the voice command is based at least upon the identity of the first user, the identity of one or more other users in range of the voice response system and the command hierarchy.
    Type: Grant
    Filed: June 6, 2018
    Date of Patent: May 12, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric V. Kline, Sarbajit K. Rakshit
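A sketch of the authorization decision described in the entry above: the action is performed only if the identified speaker is not outranked, under a command hierarchy, by any other user detected in range. The rank table and tie-breaking rule are assumptions for illustration.

```python
# Sketch: command-hierarchy check against other users in range of the device.
COMMAND_HIERARCHY = {"parent": 2, "teen": 1, "child": 0}

def authorized(speaker: str, others_in_range: list[str], hierarchy=COMMAND_HIERARCHY) -> bool:
    speaker_rank = hierarchy.get(speaker, -1)
    return all(speaker_rank >= hierarchy.get(o, -1) for o in others_in_range)

def handle_command(speaker: str, others_in_range: list[str], command: str) -> str:
    if authorized(speaker, others_in_range):
        return f"performing '{command}' for {speaker}"
    return f"'{command}' refused: {speaker} is outranked by someone in range"

if __name__ == "__main__":
    print(handle_command("teen", ["child"], "play music"))
    print(handle_command("child", ["parent"], "unlock door"))
```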
  • Patent number: 10600423
    Abstract: Methods and systems for transforming a text-independent enrolment of a customer into a text-dependent enrolment are provided. A request for authentication of a customer that is enrolled in the self-service system with a text-independent voice print is received. A request is transmitted to the customer to repeat a passphrase and the customer's response is received as an audio stream of the passphrase. The customer is authenticated by comparing the audio stream of the passphrase against the text-independent voice print and if the customer is authenticated then a text-dependent voice print is created based on the passphrase. Upon receipt of a subsequent request for authentication of the customer, a request may be transmitted to the customer to repeat the passphrase. Another audio stream of the passphrase may be received. The customer may be authenticated by comparing the another audio stream of the passphrase with the text-dependent voice print.
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: March 24, 2020
    Assignee: Nice Ltd.
    Inventors: Matan Keret, Omer Kochba, Amnon Buzaglo
  • Patent number: 10522123
    Abstract: An electronic apparatus is disclosed, which includes an input interface configured to receive an audio signal, a processor configured to process the received audio signal, and an output interface configured to output the processed audio signal, in which the processor is configured to obtain a scale of a first octave by applying a filter bank to the audio signal based on a sampling frequency of the audio signal; down-sample the audio signal; and obtain a scale of a second octave lower than the first octave by applying the filter bank to the down-sampled signal.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: December 31, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Jong-woo Kim
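A multirate sketch of the idea in the abstract above: a fixed filter bank analyzes the octave just below Nyquist, then the signal is down-sampled by two so that the same bank covers the next octave down. The Goertzel-style single-bin DFT "filters" and the semitone spacing are simplifications, not Samsung's filter-bank design.

```python
# Sketch: fixed semitone bank + down-sampling to walk down the octaves.
import numpy as np
from scipy.signal import decimate

def semitone_bank(signal: np.ndarray, fs: float) -> dict[float, float]:
    """Magnitude at 12 semitone frequencies in the octave [fs/4, fs/2)."""
    n = len(signal)
    t = np.arange(n) / fs
    out = {}
    for k in range(12):
        f = (fs / 4.0) * 2 ** (k / 12.0)   # semitone within the top octave
        out[round(f, 1)] = abs(np.sum(signal * np.exp(-2j * np.pi * f * t))) / n
    return out

def octave_scales(signal: np.ndarray, fs: float, n_octaves: int = 2):
    scales = []
    for _ in range(n_octaves):
        scales.append(semitone_bank(signal, fs))   # scale of the current octave
        signal = decimate(signal, 2)               # halve the rate: bank now one octave lower
        fs /= 2.0
    return scales

if __name__ == "__main__":
    fs = 8000.0
    t = np.arange(int(fs)) / fs
    x = np.sin(2 * np.pi * 2500 * t) + np.sin(2 * np.pi * 1250 * t)  # tones one octave apart
    first, second = octave_scales(x, fs)
    print("strongest bin, first octave :", max(first, key=first.get))
    print("strongest bin, second octave:", max(second, key=second.get))
```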
  • Patent number: 10497369
    Abstract: Methods and systems for controlling a portable computing device (“PCD”) are disclosed. In an example method, an always on processor (AoP) of a voice recognition module of the PCD receives a voice command. The AoP determines, without decoding the received voice command, that the received voice command corresponds to a previously determined keyword. The AoP retrieves context data associated with the previously determined keyword. The AoP acts on the voice command using the context data, including in some embodiments automatically triggering a fast dormancy of a communications channel.
    Type: Grant
    Filed: August 23, 2017
    Date of Patent: December 3, 2019
    Assignee: Qualcomm Incorporated
    Inventors: Nishith Chaubey, Anil Rao, James Francis Geekie
  • Patent number: 10423959
    Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: September 24, 2019
    Assignee: Capital One Services, LLC
    Inventors: Lawrence Douglas, Paul Y. Moreton
  • Patent number: 10354656
    Abstract: Improvements in speaker identification and verification are provided via an attention model for speaker recognition and the end-to-end training thereof. A speaker discriminative convolutional neural network (CNN) is used to directly extract frame-level speaker features that are weighted and combined to form an utterance-level speaker recognition vector via the attention model. The CNN and attention model are jointly optimized via an end-to-end training algorithm that imitates the speaker recognition process and uses the most-similar utterances from imposters for each speaker.
    Type: Grant
    Filed: June 23, 2017
    Date of Patent: July 16, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yong Zhao, Jinyu Li, Yifan Gong, Shixiong Zhang, Zhuo Chen
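A sketch of the attention pooling described above: frame-level speaker features are scored, the scores are soft-maxed into weights, and the weighted sum forms an utterance-level speaker vector. The single attention vector stands in for learned parameters; the patent's CNN front end and end-to-end training loop are omitted.

```python
# Sketch: attention-weighted pooling of frame-level speaker features.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def utterance_embedding(frame_features: np.ndarray, attention_vector: np.ndarray) -> np.ndarray:
    """frame_features: (T, D) frame-level speaker features."""
    scores = frame_features @ attention_vector   # one score per frame
    weights = softmax(scores)                    # attention weights sum to 1
    return weights @ frame_features              # (D,) utterance-level vector

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    frames = rng.normal(size=(50, 8))            # 50 frames of 8-dim features
    attn = rng.normal(size=8)                    # stands in for learned parameters
    print(utterance_embedding(frames, attn).shape)   # (8,)
```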
  • Patent number: 10313779
    Abstract: A system, method, and wireless earpieces for implementing a virtual assistant. A first virtual assistant for a wireless device is activated in response to receiving a request. A second virtual assistant on the wireless earpieces is executed to retrieve information associated with the request. An action is implemented utilizing the wireless device to fulfill the request utilizing the information.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: June 4, 2019
    Inventor: Peter Vincent Boesen
  • Patent number: 10289702
    Abstract: A system and method for linking a hash code to a portion of an image. A plurality of lattice points is selected in a multidimensional lattice to form a smallest enclosing region about a feature vector representing the portion of the image, and a lattice point is determined from the selected plurality of lattice points according to a distribution criterion. The determined lattice point is common to the smallest enclosing region and a region of the lattice adjacent to the smallest enclosing region located within a query radius distance of the feature vector. When the feature vector is located within the query radius of a query vector, the feature vector is considered a match. The method assigns the feature vector to the determined lattice point and stores a link between a hash code associated with the determined lattice point and the portion of the image.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: May 14, 2019
    Assignee: CANON KABUSHIKI KAISHA
    Inventors: Barry James Drake, Alan Valev Tonisson, Scott Alexander Rudkin
  • Patent number: 10255906
    Abstract: An approach is provided that receives, from a neurological sensor worn by a user, words as they are silently read by the user. The words being read by the user correspond to a set of actual words that are included in a passage that is being read by the user. The approach compares the words as read by the user with the actual words included in the passage to identify one or more reading mistakes. The reading mistakes are analyzed, resulting in a set of feedback that is provided to the user.
    Type: Grant
    Filed: December 14, 2016
    Date of Patent: April 9, 2019
    Assignee: International Business Machines Corporation
    Inventors: Marc K. Johlic, Susann M. Keohane, Emi K. Olsson
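A sketch of the comparison step in the entry above: align the words the sensor reports the user silently read against the actual words of the passage and report substitutions, omissions, and insertions as reading mistakes. Using difflib for the alignment is an assumption purely for illustration.

```python
# Sketch: align read words against passage words and list reading mistakes.
import difflib

def reading_mistakes(read_words: list[str], actual_words: list[str]) -> list[str]:
    mistakes = []
    sm = difflib.SequenceMatcher(a=actual_words, b=read_words)
    for op, a1, a2, b1, b2 in sm.get_opcodes():
        if op == "replace":
            mistakes.append(f"misread {actual_words[a1:a2]} as {read_words[b1:b2]}")
        elif op == "delete":
            mistakes.append(f"skipped {actual_words[a1:a2]}")
        elif op == "insert":
            mistakes.append(f"added {read_words[b1:b2]}")
    return mistakes

if __name__ == "__main__":
    passage = "the quick brown fox jumps over the lazy dog".split()
    read = "the quick brown fix jumps the lazy dog".split()
    for m in reading_mistakes(read, passage):
        print(m)   # feedback derived from the identified mistakes
```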
  • Patent number: 10192219
    Abstract: Systems and methods are provided for authenticating mobile payments from a customer account to a merchant. The systems and methods may include a financial service provider receiving a request to authorize an electronic transaction at a point-of-sale. A financial service provider server computer may verify that the customer is present at the point-of-sale using received location data. An image having distorted text such as a captcha may be transmitted to a device at the point-of-sale, and the customer may read the captcha aloud. A voice sample of the customer may be sent to the financial service provider for comparison to stored voice recordings, to verify that the customer's voice sample is authentic if the voice matches a previously generated voice recording for the account. If the voice sample is authentic, the financial service provider may authorize the mobile payment.
    Type: Grant
    Filed: January 8, 2015
    Date of Patent: January 29, 2019
    Assignee: Capital One Services, LLC
    Inventors: Lawrence Douglas, Paul Y. Moreton
  • Patent number: 10056094
    Abstract: In some example embodiments, a system is provided for real-time analysis of audio signals. First digital audio signals are retrieved from memory. First computed streamed signal information corresponding to each of the first digital audio signals is generated by computing first metrics data for the first digital audio signals, the first computed streamed signal information including the first metrics data. The computed first streamed signal information is stored in the memory. The first computed streamed signal information is transmitted to one or more computing devices. Transmitting the first computed streamed signal information to the one or more computing devices causes the first computed streamed signal information to be displayed at the one or more computing devices.
    Type: Grant
    Filed: March 12, 2015
    Date of Patent: August 21, 2018
    Assignee: Cogito Corporation
    Inventors: Joshua Feast, Ali Azarbayejani, Skyler Place
  • Patent number: 10037760
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: August 4, 2017
    Date of Patent: July 31, 2018
    Inventors: Dominik Roblek, Matthew Sharifi
  • Patent number: 9912617
    Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
    Type: Grant
    Filed: January 4, 2017
    Date of Patent: March 6, 2018
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
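A sketch of the onset handling described above: non-voice blocks are cached, and when voice onset is decided, a fixed-length subsequence of the most recently cached blocks is transmitted ahead of the present block and marked as reprocessed. Block contents and the voice-activity decision are toy stand-ins.

```python
# Sketch: cache non-voice blocks; on voice onset, flush them as reprocessed blocks.
from collections import deque

class OnsetSender:
    def __init__(self, history_len: int = 3):
        self.cache = deque(maxlen=history_len)   # immediately preceding non-voice blocks

    def push(self, block, is_voice: bool, send):
        if is_voice:
            for cached in self.cache:            # transmit the preceding subsequence first,
                send(cached, reprocessed=True)   # identified as reprocessed audio blocks
            self.cache.clear()
            send(block, reprocessed=False)       # then the present (voice) block
        else:
            self.cache.append(block)             # non-voice: cache the present block

if __name__ == "__main__":
    sent = []
    sender = OnsetSender(history_len=2)
    def send(block, reprocessed):
        sent.append((block, reprocessed))
    for i, voice in enumerate([False, False, False, True, True]):
        sender.push(f"block{i}", voice, send)
    print(sent)   # block1/block2 sent as reprocessed, then block3 and block4
```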
  • Patent number: 9875474
    Abstract: A method is provided for securing a transaction made by bank card, the transaction involving a remote provision, by a user, of data existing in a bank card in his possession. The method includes: obtaining data existing in the bank card to be used, called textual data; obtaining at least one portion of the textual data in the form of an audio data stream, called a sound sample, resulting from reading the data existing in the bank card to be used; computing a current voice signature from said sound sample; comparing said current voice signature with a reference voice signature pre-recorded and associated with the textual data of the bank card; and, when the reference voice signature differs from the current voice signature by a value greater than a first value defined by a predetermined parameter, rejecting the transaction.
    Type: Grant
    Filed: January 16, 2015
    Date of Patent: January 23, 2018
    Assignee: INGENICO GROUP
    Inventor: Michel Leger
  • Patent number: 9858919
    Abstract: A method includes providing a deep neural network acoustic model, receiving audio data including one or more utterances of a speaker, extracting a plurality of speech recognition features from the one or more utterances of the speaker, creating a speaker identity vector for the speaker based on the extracted speech recognition features, and adapting the deep neural network acoustic model for automatic speech recognition using the extracted speech recognition features and the speaker identity vector.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: January 2, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: George A. Saon
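A sketch of one common way a speaker identity vector is used to adapt a DNN acoustic model: the same utterance-level identity vector is appended to every frame's acoustic features before they reach the network. This i-vector-style concatenation is an assumption for illustration; the patent covers its own adaptation method, not necessarily this exact wiring.

```python
# Sketch: append a speaker identity vector to every frame of acoustic features.
import numpy as np

def adapt_inputs(frame_features: np.ndarray, speaker_identity: np.ndarray) -> np.ndarray:
    """frame_features: (T, D); speaker_identity: (K,) -> returns (T, D + K)."""
    tiled = np.tile(speaker_identity, (frame_features.shape[0], 1))
    return np.concatenate([frame_features, tiled], axis=1)

if __name__ == "__main__":
    frames = np.random.default_rng(2).normal(size=(100, 40))   # 100 frames, 40-dim features
    ivec = np.random.default_rng(3).normal(size=10)            # 10-dim speaker identity vector
    print(adapt_inputs(frames, ivec).shape)                    # (100, 50)
```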
  • Patent number: 9804822
    Abstract: An electronic apparatus and a controlling method thereof are disclosed. The electronic apparatus includes a voice input unit configured to receive a user voice, a storage unit configured to store a plurality of voice print feature models representing a plurality of user voices and a plurality of utterance environment models representing a plurality of environmental disturbances, and a controller configured to, in response to a user voice being input through the voice input unit, extract utterance environment information of an utterance environment model among the plurality of utterance environment models corresponding to a location where the user voice is input, compare a voice print feature of the input user voice with the plurality of voice print feature models, revise a result of the comparison based on the extracted utterance environment information, and recognize a user corresponding to the input user voice based on the revised result.
    Type: Grant
    Filed: April 28, 2015
    Date of Patent: October 31, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Chi-sang Jung, Byung-jin Hwang
  • Patent number: 9741348
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: August 22, 2017
    Assignee: Google Inc.
    Inventors: Dominik Roblek, Matthew Sharifi
  • Patent number: 9727603
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining query refinements using search data. In one aspect, a method includes receiving a first query and a second query each comprising one or more n-grams for a user session, determining a first set of query refinements for the first query, determining a second set of query refinements from the first set of query refinements, each query refinement in the second set of query refinements including at least one n-gram that is similar to an n-gram from the first query and at least one n-gram that is similar to an n-gram from the second query, scoring each query refinement in the second set of query refinements, selecting a third query from a group consisting of the second set of query refinements and the second query, and providing the third query as input to a search operation.
    Type: Grant
    Filed: July 30, 2015
    Date of Patent: August 8, 2017
    Assignee: Google Inc.
    Inventors: Matthias Heiler, Behshad Behzadi, Evgeny A. Cherepanov, Nils Grimsmo, Aurelien Boffy, Alessandro Agostini, Karoly Csalogany, Fredrik Bergenlid, Marcin M. Nowak-Przygodzki
  • Patent number: 9728191
    Abstract: Techniques for automatically identifying a speaker in a conversation as a known person based on processing of audio of the speaker's voice to extract characteristics of that voice and on an automated comparison of those characteristics to known characteristics of the known person's voice. A speaker segmentation process may be performed on audio of the conversation to produce, for each speaker in the conversation, a segment that includes the audio of that speaker. Audio of each of the segments may then be processed to extract characteristics of that speaker's voice. The characteristics derived from each segment (and thus for multiple speakers) may then be compared to characteristics of the known person's voice to determine whether the speaker for that segment is the known person. For each segment, a degree of match between the voice characteristics of the speaker and the voice characteristics of the known person may be calculated.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: August 8, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Emanuele Dalmasso, Daniele Colibro, Claudio Vair, Kevin R. Farrell
  • Patent number: 9704485
    Abstract: The present invention relates to a multimedia information retrieval method and electronic device, the multimedia information retrieval method comprising the steps of: extracting from a to-be-retrieved multimedia the voice of the to-be-retrieved multimedia; recognizing the voice of the to-be-retrieved multimedia to obtain a recognized text; and retrieving a multimedia database according to the recognized text to obtain the multimedia information of the to-be-retrieved multimedia. The present invention also relates to an electronic device. The multimedia information retrieval method and electronic device of the present invention can automatically, quickly, and comprehensively present to a user the multimedia information the user wants to know, thus greatly improving user retrieval efficiency and retrieval success rate.
    Type: Grant
    Filed: February 4, 2015
    Date of Patent: July 11, 2017
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventors: Peng Hu, Teng Zhang
  • Patent number: 9646605
    Abstract: A system and method are presented for using spoken word verification to reduce false alarms by exploiting global and local contexts on a lexical level, a phoneme level, and on an acoustical level. The reduction of false alarms may occur through a process that determines whether a word has been detected or if it is a false alarm. Training examples are used to generate models of internal and external contexts which are compared to test word examples. The word may be accepted or rejected based on comparison results. Comparison may be performed either at the end of the process or at multiple steps of the process to determine whether the word is rejected.
    Type: Grant
    Filed: January 22, 2013
    Date of Patent: May 9, 2017
    Assignee: Interactive Intelligence Group, Inc.
    Inventors: Konstantin Biatov, Aravind Ganapathiraju, Felix Immanuel Wyss
  • Patent number: 9571425
    Abstract: Voice communication method and apparatus and method and apparatus for operating jitter buffer are described. Audio blocks are acquired in sequence. Each of the audio blocks includes one or more audio frames. Voice activity detection is performed on the audio blocks. In response to deciding voice onset for a present one of the audio blocks, a subsequence of the sequence of the acquired audio blocks is retrieved. The subsequence precedes the present audio block immediately. The subsequence has a predetermined length and non-voice is decided for each audio block in the subsequence. The present audio block and the audio blocks in the subsequence are transmitted to a receiving party. The audio blocks in the subsequence are identified as reprocessed audio blocks. In response to deciding non-voice for the present audio block, the present audio block is cached.
    Type: Grant
    Filed: March 21, 2013
    Date of Patent: February 14, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Glenn N. Dickins, Xuejing Sun, Brendon Costa
  • Patent number: 9508341
    Abstract: Features are disclosed for active learning to identify the words which are likely to improve the guessing and automatic speech recognition (ASR) after manual annotation. When a speech recognition system needs pronunciations for words, a lexicon is typically used. For unknown words, pronunciation-guessing (G2P) may be included to provide pronunciations in an unattended (e.g., automatic) fashion. However, having manually (e.g., by a human) annotated pronunciations provides better ASR than having automatic pronunciations that may, in some instances, be wrong. The included active learning features help to direct these limited annotation resources.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: November 29, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Alok Ulhas Parlikar, Andrew Jake Rosenbaum, Jeffrey Paul Lilly, Jeffrey Penrod Adams
  • Patent number: 9401148
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for inputting speech data that corresponds to a particular utterance to a neural network; determining an evaluation vector based on output at a hidden layer of the neural network; comparing the evaluation vector with a reference vector that corresponds to a past utterance of a particular speaker; and based on comparing the evaluation vector and the reference vector, determining whether the particular utterance was likely spoken by the particular speaker.
    Type: Grant
    Filed: March 28, 2014
    Date of Patent: July 26, 2016
    Assignee: Google Inc.
    Inventors: Xin Lei, Erik McDermott, Ehsan Variani, Ignacio L. Moreno
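A sketch of the comparison described above: activations from a hidden layer of the network serve as an evaluation vector for the utterance, which is compared with a stored reference vector for the claimed speaker, here via cosine similarity against a fixed threshold. The tiny feed-forward layer, the averaging over frames, and the threshold value are illustrative assumptions.

```python
# Sketch: hidden-layer evaluation vector compared with a reference vector.
import numpy as np

def hidden_layer_output(frames: np.ndarray, w1: np.ndarray, b1: np.ndarray) -> np.ndarray:
    """Average the first hidden layer's activations over all frames."""
    hidden = np.maximum(0.0, frames @ w1 + b1)   # ReLU hidden layer
    return hidden.mean(axis=0)                   # evaluation vector

def same_speaker(evaluation: np.ndarray, reference: np.ndarray, threshold: float = 0.7) -> bool:
    cos = evaluation @ reference / (np.linalg.norm(evaluation) * np.linalg.norm(reference) + 1e-12)
    return cos >= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    w1, b1 = rng.normal(size=(40, 64)), np.zeros(64)
    enroll = hidden_layer_output(rng.normal(size=(80, 40)), w1, b1)   # past utterance
    test = hidden_layer_output(rng.normal(size=(60, 40)), w1, b1)     # particular utterance
    print("likely same speaker:", same_speaker(test, enroll))
```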
  • Patent number: 9390709
    Abstract: A semiconductor integrated circuit device for voice recognition includes: a signal processing unit which generates a feature pattern representing a state of distribution of frequency components of an input voice signal; a voice recognition database storage unit which stores a voice recognition database including a standard pattern representing a state of distribution of frequency components of plural phonemes; a conversion list storage unit which stores a conversion list including plural words or sentences to be conversion candidates; a standard pattern extraction unit which extracts a standard pattern corresponding to character data representing the first syllable of each word or sentence included in the conversion list, from the voice recognition database; and a matching detection unit which compares the feature pattern generated from the first syllable of the voice signal with the extracted standard pattern and thus detects the matching of the syllable.
    Type: Grant
    Filed: September 20, 2013
    Date of Patent: July 12, 2016
    Assignee: SEIKO EPSON CORPORATION
    Inventor: Tsutomu Nonaka
  • Patent number: 9286892
    Abstract: Some implementations include a computer-implemented method. The method can include providing a training set of text samples to a semantic parser that associates text samples with actions. The method can include obtaining, for each of one or more of the text samples of the training set, data that indicates one or more domains that the semantic parser has associated with the text sample. For each of one or more domains, a subset of the text samples of the training set can be generated that the semantic parser has associated with the domain. Using the subset of text samples associated with the domain, a language model can be generated for one or more of the domains. Speech recognition can be performed on an utterance using the one or more language models that are generated for one or more of the domains.
    Type: Grant
    Filed: April 1, 2014
    Date of Patent: March 15, 2016
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Mark Edward Epstein
  • Patent number: 9043207
    Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
    Type: Grant
    Filed: November 12, 2009
    Date of Patent: May 26, 2015
    Assignee: Agnitio S.L.
    Inventors: Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez, Marta Garcia Gomar
  • Publication number: 20150142441
    Abstract: A display apparatus is provided. The display apparatus includes a communicator configured to communicate with a voice recognition apparatus that recognizes an uttered voice of a user, an input unit configured to receive the uttered voice of the user, a display unit configured to receive, from the voice recognition apparatus, voice recognition result information about the uttered voice of the user and display the voice recognition result information, and a processor configured to, when the display apparatus is turned on, perform an access to the voice recognition apparatus by transmitting access request information to the voice recognition apparatus, and when the uttered voice is inputted through the input unit, transmit voice information on the uttered voice to the voice recognition apparatus through the communicator.
    Type: Application
    Filed: November 18, 2014
    Publication date: May 21, 2015
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Myung-jae Kim, Hee-seob Ryu, Kwang-il Hwang
  • Publication number: 20150142440
    Abstract: Feedback mechanisms to the user of a Head Mounted Display (HMD) are provided. It is important to provide feedback to the user, as soon as possible after the user utters a voice command, that speech has been recognized. The HMD displays and/or audibly renders an ASR acknowledgment in a manner that assures the user that the HMD has received and understood the voiced command.
    Type: Application
    Filed: November 13, 2014
    Publication date: May 21, 2015
    Inventors: Christopher Parkinson, James Woodall
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Publication number: 20150112681
    Abstract: A voice retrieval device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: setting detection criteria for a retrieval word, based on a characteristic of the retrieval word, such that the higher the detection accuracy of the retrieval word or the lower the pronunciation difficulty of the retrieval word or the lower the appearance probability of the retrieval word, the stricter the detection criteria; performing first voice retrieval processing on voice data according to the detection criteria and detecting a section that possibly includes the retrieval word as a candidate section from the voice data; and performing second voice retrieval processing different from the first voice retrieval processing on each candidate section and determining whether or not the retrieval word is included in each candidate section.
    Type: Application
    Filed: October 16, 2014
    Publication date: April 23, 2015
    Applicant: Fujitsu Limited
    Inventors: Masakiyo Tanaka, Hitoshi Iwamida, Nobuyuki Washio
  • Publication number: 20150112682
    Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify that the speaker's voice corresponds to the speaker whose identity is to be verified based on the received voice utterance; and c) verifying that the received voice utterance is not falsified, preferably after having verified the speaker's voice; d) accepting the speaker's identity to be verified in case that both verification steps give a positive result and not accepting the speaker's identity to be verified if any of the verification steps gives a negative result. The invention further refers to a corresponding computer readable medium and a computer.
    Type: Application
    Filed: January 5, 2015
    Publication date: April 23, 2015
    Inventors: Luis Buera Rodriguez, Marta Garcia Gomar, Marta Sanchez Asenjo, Alberto Martin de los Santos de las Heras, Alfredo Gutierrez, Carlos Vaquero Aviles-Casco, Alfonso Ortega Gimenez
  • Publication number: 20150095029
    Abstract: Engaging persona candidates are provided with a skills assessment that includes vocal behavior. Each candidate provides both scripted and spontaneous answers to questions in a situational setting that closely matches the daily demands of the customer support industry. Samples of the candidate's speech are evaluated to identify distinct voice cues that qualitatively describe speech characteristics, which are scored based on the candidate's spoken performance. One or more of the voice cues are mapped to phonetic analytics that quantitatively describe vocal behavior. Each voice cue also has an assigned weight. The voice cue scores for each phonetic analytic are multiplied by their assigned weights and added together to form a weighted phonetic analytic, which is then used to form a part of the vocal behavior risk assessments.
    Type: Application
    Filed: October 2, 2013
    Publication date: April 2, 2015
    Applicant: StarTek, Inc.
    Inventors: Ted Nardin, James Keaten
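A sketch of the arithmetic described in the entry above: voice cue scores are multiplied by their assigned weights and summed to form a weighted phonetic analytic. The cue names, weights, and scores here are invented for illustration.

```python
# Sketch: weighted sum of voice cue scores -> weighted phonetic analytic.
CUE_WEIGHTS = {"clarity": 0.4, "pace": 0.25, "warmth": 0.35}

def weighted_phonetic_analytic(cue_scores: dict[str, float],
                               weights: dict[str, float] = CUE_WEIGHTS) -> float:
    return sum(weights[cue] * score for cue, score in cue_scores.items() if cue in weights)

if __name__ == "__main__":
    candidate_scores = {"clarity": 4.0, "pace": 3.0, "warmth": 5.0}   # e.g. on a 1-5 scale
    print(weighted_phonetic_analytic(candidate_scores))               # 0.4*4 + 0.25*3 + 0.35*5 = 4.1
```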
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 8996387
    Abstract: For clearing transaction data selected for a processing, there is generated in a portable data carrier (1) a transaction acoustic signal (003; 103; 203) (S007; S107; S207) upon whose acoustic reproduction by an end device (10) at least transaction data selected for the processing are reproduced superimposed acoustically with a melody specific to a user of the data carrier (1) (S009; S109; S209). The generated transaction acoustic signal (003; 103; 203) is electronically transferred to an end device (10) (S108; S208), which processes the selected transaction data (S011; S121; S216) only when the user of the data carrier (1) confirms vis-à-vis the end device (10) an at least partial match both of the acoustically reproduced melody with the user-specific melody and of the acoustically reproduced transaction data with the selected transaction data (S010; S110, S116; S210).
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: March 31, 2015
    Assignee: Giesecke & Devrient GmbH
    Inventors: Thomas Stocker, Michael Baldischweiler
  • Publication number: 20150088514
    Abstract: Techniques for providing virtual assistants to assist users during a voice communication between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. For instance, the first user may utilize her device to place a telephone call to the device of the second user. A virtual assistant may also join the call and, upon invocation by a user on the call, may identify voice commands from the call and may perform corresponding tasks for the users in response.
    Type: Application
    Filed: September 25, 2013
    Publication date: March 26, 2015
    Applicant: Rawles LLC
    Inventor: Marcello Typrin
  • Publication number: 20150081301
    Abstract: A system includes a user speech profile stored on a computer readable storage device, the speech profile containing a plurality of phonemes with user identifying characteristics for the phonemes, and a speech processor coupled to access the speech profile to generate a phrase containing user distinguishing phonemes based on a difference between the user identifying characteristics for such phonemes and average user identifying characteristics, such that the phrase has discriminability from other users. The speech processor may also or alternatively select the phrase as a function of ambient noise.
    Type: Application
    Filed: September 18, 2013
    Publication date: March 19, 2015
    Applicant: Lenovo (Singapore) Pte, Ltd.
    Inventors: John Weldon Nicholson, Steven Richard Perrin
  • Publication number: 20150081302
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic feature can include a movement pattern of head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.
    Type: Application
    Filed: November 24, 2014
    Publication date: March 19, 2015
    Inventors: Ann K. Syrdal, Sumit Chopra, Patrick Haffner, Taniya Mishra, Ilija Zeljkovic, Eric Zavesky
  • Patent number: 8977547
    Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: March 10, 2015
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
  • Publication number: 20150058017
    Abstract: Disclosed in some examples are systems, methods, devices, and machine readable mediums which may produce an audio recording with included verification from the individuals in the recording that the recording is accurate. In some examples, the system may also provide rights management control to those individuals. This may ensure that individuals participating in audio events that are to be recorded are assured that their words are not changed, taken out of context, or otherwise altered and that they retain control over the use of their words even after the physical file has left their control.
    Type: Application
    Filed: August 20, 2013
    Publication date: February 26, 2015
    Inventors: Dave Paul Singh, Dominic Fulginti, Mahendra Tadi Tadikonda, Tobias Kohlenberg
  • Patent number: 8949125
    Abstract: Systems and methods are provided to select a most typical pronunciation of a location name on a map from a plurality of user pronunciations. A server generates a reference speech model based on user pronunciations, compares the user pronunciations with the speech model, and selects a pronunciation based on the comparison. Alternatively, the server compares the distance between each of the user pronunciations and every other user pronunciation and selects a pronunciation based on the comparison. The server then annotates the map with the selected pronunciation and provides the audio output of the location name to a user device upon a user's request.
    Type: Grant
    Filed: June 16, 2010
    Date of Patent: February 3, 2015
    Assignee: Google Inc.
    Inventor: Gal Chechik
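A sketch of the alternative selection rule in the abstract above: compare every user pronunciation with every other one and pick the one with the smallest total distance (a medoid), standing in for "most typical". The feature-vector representation and Euclidean distance are simplifying assumptions; the patent also describes comparison against a reference speech model.

```python
# Sketch: medoid selection over pairwise distances between pronunciations.
import numpy as np

def most_typical(pronunciation_features: np.ndarray) -> int:
    """pronunciation_features: (N, D), one feature vector per user pronunciation."""
    diffs = pronunciation_features[:, None, :] - pronunciation_features[None, :, :]
    distances = np.linalg.norm(diffs, axis=-1)    # (N, N) pairwise distances
    return int(np.argmin(distances.sum(axis=1)))  # index with least total distance

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    cluster = rng.normal(loc=0.0, size=(9, 16))   # nine similar pronunciations
    outlier = rng.normal(loc=5.0, size=(1, 16))   # one very different pronunciation
    prons = np.vstack([cluster, outlier])
    print("selected pronunciation index:", most_typical(prons))   # one of 0..8, not 9
```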
  • Patent number: 8947499
    Abstract: Methods and systems for communicating with rate control. A communication is sent and received from a first device to a second device over a network, wherein the communication comprises at least one audio stream and a second communication stream. A capacity of the network is probed at the first device for the sending and receiving the communication. A presence of a voice in the at least one audio stream is detected at the first device via a voice activity detection of the at least one audio stream. A rate limit is set for the sending and receiving the communication at the first device based on the capacity of the network and the detection of the presence of the at least one audio stream.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: February 3, 2015
    Assignee: TangoMe, Inc.
    Inventors: Alexander Subbotin, Olivier Furon, Shaowei Su, Yevgeni Litvin, Xu Liu