Creating Patterns For Matching Patents (Class 704/243)
  • Patent number: 9659092
    Abstract: A music information searching method includes extracting modulating spectrums from audio data, generating modulating spectrum peak point audio fingerprints by using position information which relates to preset peak points from the extracted modulating spectrums, converting the generated modulating spectrum peak point audio fingerprints into hash keys which indicate addresses of hash tables and hash values that are stored on the hash tables via hash functions, and searching music information by extracting hash keys which relate to audio query clips and comparing the extracted hash keys with the indicated addresses of the hash tables.
    Type: Grant
    Filed: November 13, 2013
    Date of Patent: May 23, 2017
    Assignees: SAMSUNG ELECTRONICS CO., LTD., KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
    Inventors: Ki-wan Eom, Hyoung-Gook Kim, Kwang-ki Kim
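    The abstract above (patent 9659092) describes the classic fingerprint-lookup pattern: peak-point fingerprints are converted into hash keys that address a hash table, and queries are matched by looking up the same keys. Below is a minimal Python sketch of that pattern; the peak-pair packing scheme, the fan_out parameter, and all function names are illustrative assumptions, not the patented algorithm.

      from collections import defaultdict

      def peak_pairs_to_hashes(peaks, fan_out=3):
          """Encode pairs of (time, frequency) peak points as integer hash keys."""
          hashes = []
          for i, (t1, f1) in enumerate(peaks):
              for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
                  dt = t2 - t1
                  # Pack anchor frequency, paired frequency, and time delta into one key.
                  key = (f1 & 0x3FF) << 20 | (f2 & 0x3FF) << 10 | (dt & 0x3FF)
                  hashes.append((key, t1))
          return hashes

      def build_index(tracks):
          """tracks: {track_id: [(time, freq), ...]} -> hash table of fingerprints."""
          index = defaultdict(list)
          for track_id, peaks in tracks.items():
              for key, t in peak_pairs_to_hashes(peaks):
                  index[key].append((track_id, t))
          return index

      def search(index, query_peaks):
          """Count hash-key matches per track; the best-voted track wins."""
          votes = defaultdict(int)
          for key, _ in peak_pairs_to_hashes(query_peaks):
              for track_id, _ in index.get(key, []):
                  votes[track_id] += 1
          return max(votes, key=votes.get) if votes else None

      if __name__ == "__main__":
          tracks = {"song_a": [(0, 100), (1, 220), (2, 330), (3, 150)],
                    "song_b": [(0, 400), (1, 410), (2, 90), (3, 500)]}
          index = build_index(tracks)
          print(search(index, [(10, 100), (11, 220), (12, 330)]))  # -> song_a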
  • Patent number: 9659562
    Abstract: A system estimates environment-specific alterations of a user sound received at the system. The system modifies the received user sound to formulate a modified user sound by at least compensating for the audio modifications and/or formulates an expected audio model of a user by modifying a stored user-dependent audio model of the user with the audio modifications. The system is also capable of estimating whether the received user sound is from a particular user by use of a corresponding user-dependent audio model.
    Type: Grant
    Filed: August 30, 2016
    Date of Patent: May 23, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Andrew William Lovitt
  • Patent number: 9652745
    Abstract: A method for detecting bias in an evaluation process is provided. The method includes operations of receiving evaluation data from a candidate evaluation system. The evaluation data is provided by a set of evaluators based on digital interview data collected from evaluation candidates. The operations of the method further include extracting indicators of characteristics of the evaluation candidates from the digital interview data, classifying the evaluation candidates based on the indicators extracted from the digital interview data, and determining whether the evaluation data indicates a bias of one or more evaluators with respect to a classification of the evaluation candidates.
    Type: Grant
    Filed: November 17, 2014
    Date of Patent: May 16, 2017
    Assignee: HIREVUE, INC.
    Inventors: Benjamin Taylor, Loren Larsen
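    As a rough illustration of the bias check summarized in the abstract of patent 9652745, the Python sketch below aggregates evaluation outcomes per evaluator and per candidate classification, then flags evaluators whose outcome rates diverge sharply across classifications. The pass/fail outcome model, the rate-difference metric, and the threshold value are assumptions for illustration only.

      from collections import defaultdict

      def bias_flags(evaluations, threshold=0.2):
          """evaluations: list of (evaluator, candidate_class, passed: bool)."""
          counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # evaluator -> class -> [passes, total]
          for evaluator, cls, passed in evaluations:
              stats = counts[evaluator][cls]
              stats[0] += int(passed)
              stats[1] += 1
          flagged = {}
          for evaluator, per_class in counts.items():
              rates = {cls: p / n for cls, (p, n) in per_class.items() if n}
              # Flag evaluators whose best- and worst-treated classes differ by more than the threshold.
              if len(rates) > 1 and max(rates.values()) - min(rates.values()) > threshold:
                  flagged[evaluator] = rates
          return flagged

      if __name__ == "__main__":
          data = [("eval1", "A", True), ("eval1", "A", True), ("eval1", "B", False), ("eval1", "B", False),
                  ("eval2", "A", True), ("eval2", "B", True)]
          print(bias_flags(data))  # eval1 shows a large pass-rate gap between classes A and B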
  • Patent number: 9653069
    Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: May 16, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie
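    A hedged Python sketch of the parallel-decoding idea in patent 9653069's abstract: one speaker-independent model plus as many speaker-dependent models as resources allow are run in parallel, and the model producing the best-scoring hypothesis is kept as the dominant model. The Recognizer interface, the confidence-based selection rule, and the max_models cap are assumptions, not the patented system.

      from concurrent.futures import ThreadPoolExecutor

      class Recognizer:
          """Stand-in for an ASR model: returns (hypothesis, confidence)."""
          def __init__(self, name, confidence):
              self.name, self._confidence = name, confidence
          def recognize(self, audio):
              return f"hypothesis from {self.name}", self._confidence

      def recognize_parallel(audio, speaker_independent, speaker_dependent, max_models=4):
          # Cap the speaker-dependent models by available resources; always keep the SI model.
          selected = [speaker_independent] + speaker_dependent[:max_models - 1]
          with ThreadPoolExecutor(max_workers=len(selected)) as pool:
              results = list(pool.map(lambda m: (m, *m.recognize(audio)), selected))
          # The "dominant" model is the one with the highest-confidence hypothesis.
          dominant, hypothesis, confidence = max(results, key=lambda r: r[2])
          return dominant.name, hypothesis, confidence

      if __name__ == "__main__":
          si = Recognizer("speaker-independent", 0.72)
          sd = [Recognizer("speaker-1", 0.81), Recognizer("speaker-2", 0.64)]
          print(recognize_parallel(b"pcm-bytes", si, sd))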
  • Patent number: 9653071
    Abstract: A method and system are disclosed for recognizing speech errors, such as in spoken short messages, using an audio input device to receive an utterance of a short message, using an automated speech recognition module to generate a text sentence corresponding to the utterance, and generating an N-best list of predicted error sequences for the text sentence using a linear-chain conditional random field (CRF) module, where each word of the text sentence is assigned a label in each of the predicted error sequences, and each label is assigned a probability score. The predicted error sequence labels are rescored using a metacost matrix module, the best rescored error sequence from the N-best list of predicted error sequences is selected using a Recognition Output Voting Error Reduction (ROVER) module, and a dialog action is executed by a dialog action module based on the best rescored error sequence and a dialog action policy.
    Type: Grant
    Filed: August 22, 2014
    Date of Patent: May 16, 2017
    Assignee: Honda Motor Co., Ltd.
    Inventors: Rakesh Gupta, Teruhisa Misu, Aasish Pappu
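    The rescoring-and-voting step in patent 9653071's abstract can be illustrated with the short Python sketch below: per-word error labels with probabilities are reweighted by a metacost matrix, and a ROVER-style per-position vote selects the final label sequence. The label set, the metacost values, and the additive voting rule are assumptions; the patent's actual modules are not reproduced here.

      def rescore(sequence, metacost):
          """sequence: [(word, label, prob), ...]; returns the metacost-rescored sequence."""
          return [(word, label, prob * metacost.get(label, 1.0)) for word, label, prob in sequence]

      def rover_vote(nbest):
          """Per word position, pick the label with the highest total rescored score."""
          result = []
          for position in zip(*nbest):
              word = position[0][0]
              scores = {}
              for _, label, score in position:
                  scores[label] = scores.get(label, 0.0) + score
              result.append((word, max(scores, key=scores.get)))
          return result

      if __name__ == "__main__":
          # Penalize missing an error more than a false alarm (illustrative metacost).
          metacost = {"ERROR": 1.5, "OK": 1.0}
          nbest = [
              [("send", "OK", 0.9), ("test", "ERROR", 0.4)],
              [("send", "OK", 0.8), ("test", "OK", 0.5)],
          ]
          rescored = [rescore(seq, metacost) for seq in nbest]
          print(rover_vote(rescored))  # [('send', 'OK'), ('test', 'ERROR')]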
  • Patent number: 9646609
    Abstract: Systems and processes for generating a shared pronunciation lexicon and using the shared pronunciation lexicon to interpret spoken user inputs received by a virtual assistant are provided. In one example, the process can include receiving pronunciations for words or named entities from multiple users. The pronunciations can be tagged with context tags and stored in the shared pronunciation lexicon. The shared pronunciation lexicon can then be used to interpret a spoken user input received by a user device by determining a relevant subset of the shared pronunciation lexicon based on contextual information associated with the user device and performing speech-to-text conversion on the spoken user input using the determined subset of the shared pronunciation lexicon.
    Type: Grant
    Filed: August 25, 2015
    Date of Patent: May 9, 2017
    Assignee: Apple Inc.
    Inventors: Devang K. Naik, Ali S. Mohamed, Hong M. Chen
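    A minimal Python sketch of the context-tagged shared lexicon described in patent 9646609's abstract: pronunciations contributed by many users are stored with context tags, and a device's contextual information selects the relevant subset before speech-to-text conversion. The tag format and the SharedLexicon class are illustrative assumptions.

      from collections import defaultdict

      class SharedLexicon:
          def __init__(self):
              self._entries = defaultdict(list)  # word -> [(pronunciation, tags)]

          def add(self, word, pronunciation, tags):
              self._entries[word].append((pronunciation, set(tags)))

          def subset(self, context_tags):
              """Keep only pronunciations whose tags overlap the device context."""
              context = set(context_tags)
              return {word: [p for p, tags in prons if tags & context]
                      for word, prons in self._entries.items()}

      if __name__ == "__main__":
          lexicon = SharedLexicon()
          lexicon.add("Nguyen", "W IH N", {"locale:en-US"})
          lexicon.add("Nguyen", "NG UW EH N", {"locale:vi-VN"})
          print(lexicon.subset({"locale:en-US"}))  # only the en-US pronunciation survives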
  • Patent number: 9640175
    Abstract: Systems and methods are described for adding entries to a custom lexicon used by a speech recognition engine of a speech interface in response to user interaction with the speech interface. In one embodiment, a speech signal is obtained when the user speaks a name of a particular item to be selected from among a finite set of items. If a phonetic description of the speech signal is not recognized by the speech recognition engine, then the user is presented with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item. After the user has selected the particular item via the means for selecting, the phonetic description of the speech signal is stored in association with a text description of the particular item in the custom lexicon.
    Type: Grant
    Filed: October 7, 2011
    Date of Patent: May 2, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wei-Ting Frank Liu, Andrew Lovitt, Stefanie Tomko, Yun-Cheng Ju
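    The fallback flow in patent 9640175's abstract reduces to a small amount of control logic, sketched in Python below: if the spoken name is not recognized, the user selects the item by other means, and the unrecognized phonetic form is then stored against that item's text in a custom lexicon. Every name and interface here is an assumption made for illustration.

      custom_lexicon = {}  # phonetic description -> item text description

      def select_item(phonetics, recognized_item, pick_manually):
          if recognized_item is not None:
              return recognized_item
          # Recognition failed: let the user choose without speaking, then learn the pronunciation.
          item = pick_manually()
          custom_lexicon[phonetics] = item
          return item

      if __name__ == "__main__":
          chosen = select_item("K EH L T IH K", None, lambda: "Celtic Woman")
          print(chosen, custom_lexicon)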
  • Patent number: 9613621
    Abstract: A speech recognition method and an electronic apparatus are provided. The speech recognition method includes the following steps. A plurality of phonetic transcriptions of a speech signal is obtained according to an acoustic model. Phonetic spellings and intonation information matched to the phonetic transcriptions are obtained according to a phonetic transcription sequence and a syllable acoustic lexicon of the invention. According to the phonetic spellings and the intonation information, a plurality of phonetic spelling sequences and a plurality of phonetic spelling sequence probabilities are obtained from a language model. The phonetic spelling sequence corresponding to the largest one among the phonetic spelling sequence probabilities is selected as a recognition result of the speech signal.
    Type: Grant
    Filed: September 19, 2014
    Date of Patent: April 4, 2017
    Assignee: VIA Technologies, Inc.
    Inventors: Guo-Feng Zhang, Yi-Fei Zhu
  • Patent number: 9589560
    Abstract: Features are disclosed for estimating a false rejection rate in a detection system. The false rejection rate can be estimated by fitting a model to a distribution of detection confidence scores. An estimated false rejection rate can then be computed for confidence scores that fall below a threshold. The false rejection rate and model can be verified once the detection system has been deployed by obtaining additional data with confidence scores falling below the threshold. Adjustments to the model or other operational parameters can be implemented based on the verified false rejection rate, model, or additional data.
    Type: Grant
    Filed: December 19, 2013
    Date of Patent: March 7, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Rohit Prasad
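    A sketch of the estimation idea in patent 9589560's abstract: fit a simple model to the distribution of detection confidence scores for true events and compute the probability mass that falls below the acceptance threshold, i.e. the estimated false rejection rate. The Gaussian fit below is an assumption; the patent does not specify this particular model.

      import math
      import statistics

      def estimated_false_rejection_rate(true_detection_scores, threshold):
          mu = statistics.mean(true_detection_scores)
          sigma = statistics.stdev(true_detection_scores)
          # Gaussian CDF at the threshold = fraction of true events expected to score
          # below it and therefore be (falsely) rejected.
          return 0.5 * (1.0 + math.erf((threshold - mu) / (sigma * math.sqrt(2.0))))

      if __name__ == "__main__":
          scores = [0.81, 0.77, 0.90, 0.85, 0.72, 0.88, 0.79, 0.83]
          print(f"estimated FRR at 0.75: {estimated_false_rejection_rate(scores, 0.75):.3f}")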
  • Patent number: 9564134
    Abstract: The present invention relates to a method and apparatus for speaker-calibrated speaker detection. One embodiment of a method for generating a speaker model for use in detecting a speaker of interest includes identifying one or more speech features that best distinguish the speaker of interest from a plurality of impostor speakers and then incorporating the speech features in the speaker model.
    Type: Grant
    Filed: September 28, 2015
    Date of Patent: February 7, 2017
    Assignee: SRI INTERNATIONAL
    Inventors: Elizabeth Shriberg, Luciana Ferrer, Andreas Stolcke, Martin Graciarena, Nicolas Scheffer
  • Patent number: 9557819
    Abstract: Gesture input with multiple displays, views, and physics is described. In one example, a method includes generating a three dimensional space having a plurality of objects in different positions relative to a user and a virtual object to be manipulated by the user, presenting, on a display, a displayed area having at least a portion of the plurality of different objects, detecting an air gesture of the user against the virtual object, the virtual object being outside the displayed area, generating a trajectory of the virtual object in the three-dimensional space based on the air gesture, the trajectory including interactions with objects of the plurality of objects in the three-dimensional space, and presenting a portion of the generated trajectory on the displayed area.
    Type: Grant
    Filed: November 23, 2011
    Date of Patent: January 31, 2017
    Assignee: Intel Corporation
    Inventor: Glen J. Anderson
  • Patent number: 9547755
    Abstract: A system and methods are provided for digital content creation and upload through a managed website, providing network-based access to authorized users who pay for predetermined rights that allow for use of the content by the authorized user on a multiplicity of devices, without having to repurchase access to the same content.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: January 17, 2017
    Inventor: Jill Lewis Maurer
  • Patent number: 9536526
    Abstract: According to one embodiment, an electronic device includes a display controller and circuitry. The display controller displays a first object indicative of a first speaker, a first object indicative of a second speaker different from the first speaker, a second object indicative of a first speech period identified as speech of the first speaker, and a second object indicative of a second speech period identified as speech of the second speaker. The circuitry integrates the first speech period and the second speech period into a speech period of a same speaker when a first operation of associating the first object indicative of the first speaker with the first object indicative of the second speaker is performed.
    Type: Grant
    Filed: March 19, 2015
    Date of Patent: January 3, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Ryuichi Yamaguchi
  • Patent number: 9536567
    Abstract: In an aspect, in general, a method for aligning an audio recording and a transcript includes receiving a transcript including a plurality of terms, each term of the plurality of terms associated with a time location within a different version of the audio recording, forming a plurality of search terms from the terms of the transcript, determining possible time locations of the search terms in the audio recording, determining a correspondence between the time locations within the different version of the audio recording associated with the search terms and the possible time locations of the search terms in the audio recording, and aligning the audio recording and the transcript, including updating the time locations associated with terms of the transcript based on the determined correspondence.
    Type: Grant
    Filed: September 4, 2012
    Date of Patent: January 3, 2017
    Assignee: NEXIDIA INC.
    Inventors: Jacob B. Garland, Drew Lanham, Daryl Kip Watters, Marsal Gavalda, Mark Finlay, Kenneth K. Griggs
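    As an illustration of the realignment step in patent 9536567's abstract, the Python sketch below locates search terms in the new audio, estimates a correspondence against the transcript's old timestamps (here a simple median time offset), and updates every transcript time accordingly. The offset-only correspondence model is an assumption made to keep the example small.

      import statistics

      def realign(transcript, located):
          """transcript: [(term, old_time)]; located: {term: new_time_in_audio}."""
          offsets = [located[term] - old for term, old in transcript if term in located]
          shift = statistics.median(offsets) if offsets else 0.0
          # Apply the estimated correspondence to every term, found or not.
          return [(term, old + shift) for term, old in transcript]

      if __name__ == "__main__":
          transcript = [("opening remarks", 0.0), ("quarterly results", 42.5), ("questions", 300.0)]
          located = {"opening remarks": 12.1, "quarterly results": 54.4}
          print(realign(transcript, located))  # every timestamp shifted by about 12 seconds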
  • Patent number: 9535987
    Abstract: System and method to search audio data, including: receiving audio data representing speech; receiving a search query related to the audio data; compiling, by use of a processor, the search query into a hierarchy of scored speech recognition sub-searches; searching, by use of a processor, the audio data for speech identified by one or more of the sub-searches to produce hits; and combining, by use of a processor, the hits by use of at least one combination function to provide a composite search score of the audio data. The combination function may include an at-least-M-of-N function that produces a high score when at least M of N function inputs exceed a predetermined threshold value. The composite search score may employ a soft time window such as a spline function.
    Type: Grant
    Filed: January 25, 2016
    Date of Patent: January 3, 2017
    Assignee: Avaya Inc.
    Inventor: Keith Michael Ponting
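    The at-least-M-of-N combination function named in the abstract of patent 9535987 is easy to state directly; the Python sketch below returns a high composite score when at least M of the N sub-search scores exceed a threshold. The high/low output values and the threshold are illustrative assumptions.

      def at_least_m_of_n(scores, m, threshold, high=1.0, low=0.0):
          # Count sub-search scores that clear the threshold; require at least m of them.
          hits = sum(1 for s in scores if s > threshold)
          return high if hits >= m else low

      if __name__ == "__main__":
          sub_search_scores = [0.92, 0.40, 0.81, 0.77]
          # Require at least 3 of the 4 sub-searches to exceed 0.7.
          print(at_least_m_of_n(sub_search_scores, m=3, threshold=0.7))  # 1.0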
  • Patent number: 9536253
    Abstract: A method including the steps of: receiving, by a computer system including at least one computer, a first electronic media work uploaded from a first electronic device; extracting one or more features from the first electronic media work; linking the first electronic media work with a reference electronic media work identifier associated with a reference electronic media work to generate correlation information relating the first electronic media work with at least an action associated with the reference electronic media work identifier; storing the correlation information; receiving, from a second electronic device, a query related to the first electronic media work; correlating the query with action information related to an action to be performed based at least in part on the correlation information; generating machine-readable instructions based upon the action information; and providing the machine-readable instructions to the second electronic device to be used in performing the action.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: January 3, 2017
    Assignee: NETWORK-1 TECHNOLOGIES, INC.
    Inventor: Ingemar J. Cox
  • Patent number: 9536516
    Abstract: A speech recognition circuit comprises an input buffer for receiving processed speech parameters. A lexical memory contains lexical data for word recognition. The lexical data comprises a plurality of lexical tree data structures. Each lexical tree data structure comprises a model of words having common prefix components. An initial component of each lexical tree structure is unique. A plurality of lexical tree processors are connected in parallel to the input buffer for processing the speech parameters in parallel to perform parallel lexical tree processing for word recognition by accessing the lexical data in the lexical memory. A results memory is connected to the lexical tree processors for storing processing results from the lexical tree processors and lexical tree identifiers to identify lexical trees to be processed by the lexical tree processors.
    Type: Grant
    Filed: June 19, 2014
    Date of Patent: January 3, 2017
    Assignee: Zentian Limited
    Inventor: Mark Catchpole
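    A generic Python sketch of the lexical-tree idea in patent 9536516's abstract: words sharing a prefix share a path in a tree, each tree's initial component is unique, and the resulting trees could therefore be handed to separate processors. The nested-dict trie below illustrates the data structure only; it is not the patented circuit.

      def build_lexical_trees(words):
          trees = {}  # unique initial component -> nested dict trie
          for word in words:
              node = trees.setdefault(word[0], {})
              for ch in word[1:]:
                  node = node.setdefault(ch, {})
              node["<end>"] = True  # mark a complete word
          return trees

      if __name__ == "__main__":
          trees = build_lexical_trees(["cat", "car", "card", "dog"])
          # Two lexical trees: one rooted at 'c' (cat/car/card share prefixes), one at 'd'.
          print(sorted(trees))           # ['c', 'd']
          print(trees["c"]["a"].keys())  # dict_keys(['t', 'r'])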
  • Patent number: 9529793
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resolving ambiguity in received voice queries. An original voice query is received following one or more earlier voice queries, wherein the original voice query includes a pronoun or phrase. In one implementation, a plurality of acoustic parameters is identified for one or more words in the original voice query. A concept represented by the pronoun is identified based on the plurality of acoustic parameters, wherein the concept is associated with a particular query of the one or more earlier queries. The concept is associated with the pronoun. Alternatively, a concept may be associated with a phrase by using grammatical analysis of the query to relate the phrase to a concept derived from a prior query.
    Type: Grant
    Filed: February 22, 2013
    Date of Patent: December 27, 2016
    Assignee: Google Inc.
    Inventors: Gabriel Taubman, John J. Lee
  • Patent number: 9513712
    Abstract: A processing device and method are provided. According to an illustrative embodiment, the device and method are implemented by detecting a face region of an image, setting at least one action region according to the position of the face region, comparing image data corresponding to the at least one action region to the detection information for purposes of determining whether or not a predetermined action has been performed, and generating a notification when it is determined that the predetermined action has been performed.
    Type: Grant
    Filed: April 24, 2014
    Date of Patent: December 6, 2016
    Assignee: SONY CORPORATION
    Inventors: Yusuke Sakai, Shingo Tsurumi, Masao Kondo
  • Patent number: 9502032
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a method comprises receiving audio data encoding one or more utterances; performing a first speech recognition on the audio data; identifying a context based on the first speech recognition; performing a second speech recognition on the audio data that is biased towards the context; and providing an output of the second speech recognition.
    Type: Grant
    Filed: October 28, 2014
    Date of Patent: November 22, 2016
    Assignee: Google Inc.
    Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
  • Patent number: 9502030
    Abstract: Methods and systems are provided for adapting a speech system of a vehicle. In one example a method includes: logging data from the vehicle; logging speech data from the speech system; processing the data from the vehicle and the data from the speech system to determine a pattern of context and a relation to user interaction behavior; and selectively updating a user profile of the speech system based on the pattern of context.
    Type: Grant
    Filed: November 1, 2013
    Date of Patent: November 22, 2016
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ute Winter, Timothy J. Grost, Ron M. Hecht, Robert D. Sims, III
  • Patent number: 9466292
    Abstract: Methods and systems for online incremental adaptation of neural networks using Gaussian mixture models in speech recognition are described. In an example, a computing device may be configured to receive an audio signal and a subsequent audio signal, both signals having speech content. The computing device may be configured to apply a speaker-specific feature transform to the audio signal to obtain a transformed audio signal. The speaker-specific feature transform may be configured to include speaker-specific speech characteristics of a speaker-profile relating to the speech content. Further, the computing device may be configured to process the transformed audio signal using a neural network trained to estimate a respective speech content of the audio signal. Based on outputs of the neural network, the computing device may be configured to modify the speaker-specific feature transform, and apply the modified speaker-specific feature transform to a subsequent audio signal.
    Type: Grant
    Filed: May 3, 2013
    Date of Patent: October 11, 2016
    Assignee: Google Inc.
    Inventors: Xin Lei, Petar Aleksic
  • Patent number: 9443272
    Abstract: A data processing system includes components for providing a pleasant user experience. Those components may include a family interaction engine that provides a family channel. The family interaction engine may provide for creation of a user group. The family channel may present content of interest to multiple users in the user group. When a user is detected near the data processing system, the family interaction engine may automatically present content of interest to that user. When used for presenting media content, the data processing system may also cause supplemental data to automatically be presented, wherein the supplemental data is relevant to the media content and to a predetermined interest of the user. The data processing system may also provide a ranked list of applications for potential activation by the user. The applications may be ordered based on the current context. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 13, 2012
    Date of Patent: September 13, 2016
    Assignee: Intel Corporation
    Inventors: Chieh-Yih Wan, Giuseppe Raffa, Junaith Ahemed Shahabdeen, Lama Nachman, Adam Jordan, Ashwini Asokan
  • Patent number: 9443287
    Abstract: The image processing method includes providing first dictionaries produced by dictionary learning and second dictionaries corresponding to the first dictionaries, performing, on each first dictionary, a process to approximate a first image by a linear combination of elements of the first dictionary so as to produce a linear combination coefficient and thereby acquiring multiple linear combination coefficients, and calculating, for each linear combination coefficient, a ratio between a largest coefficient element and a second-largest coefficient element and selecting a specific linear combination coefficient in which the ratio is largest among the multiple linear combination coefficients.
    Type: Grant
    Filed: February 3, 2015
    Date of Patent: September 13, 2016
    Assignee: CANON KABUSHIKI KAISHA
    Inventor: Yoshinori Kimura
  • Patent number: 9436673
    Abstract: An apparatus and method for applying a layout template to content are disclosed herein. A plurality of content included in a visual workspace is automatically grouped into one or more clusters, with one or more content of the plurality of content being at different spatial positions from each other. At least one cluster is automatically located to a respective content placeholder included in the layout template. The clusters with the layout template are presented in accordance with the automatic locating of the clusters.
    Type: Grant
    Filed: March 28, 2013
    Date of Patent: September 6, 2016
    Assignee: Prezi, Inc
    Inventors: Zoltán Gera, Andrei Boghiu, Lior Paz, Péter Zimon, Péter Polgár Balázs, Peter Arvai
  • Patent number: 9424846
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: July 30, 2014
    Date of Patent: August 23, 2016
    Assignee: Google Inc.
    Inventors: Dominik Roblek, Matthew Sharifi
  • Patent number: 9412370
    Abstract: A method and a system for a speech recognition system are provided, in which an electronic speech-based document is associated with a document template and comprises one or more sections of text recognized or transcribed from sections of speech. The sections of speech are transcribed by the speech recognition system into corresponding sections of text of the electronic speech-based document. The method includes the steps of dynamically creating sub-contexts and associating the sub-contexts with sections of text of the document template.
    Type: Grant
    Filed: June 20, 2014
    Date of Patent: August 9, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Gerhard Grobauer, Miklos Papai
  • Patent number: 9384735
    Abstract: A method for facilitating the updating of a language model includes receiving, at a client device, via a microphone, an audio message corresponding to speech of a user; communicating the audio message to a first remote server; receiving, at the client device, a result, transcribed at the first remote server using an automatic speech recognition system (“ASR”), from the audio message; receiving, at the client device from the user, an affirmation of the result; storing, at the client device, the result in association with an identifier corresponding to the audio message; and communicating, to a second remote server, the stored result together with the identifier.
    Type: Grant
    Filed: July 25, 2014
    Date of Patent: July 5, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Marc White, Igor Roditis Jablokov, Victor Roman Jablokov
  • Patent number: 9361289
    Abstract: Features are disclosed for maintaining data that can be used to personalize spoken language processing, such as automatic speech recognition (“ASR”), natural language understanding (“NLU”), natural language processing (“NLP”), etc. The data may be obtained from various data sources, such as applications or services used by the user. User-specific data maintained by the data sources can be retrieved and stored for use in generating personal models. Updates to data at the data sources may be reflected by separate data sets in the personalization data, such that other processes can obtain the update data sets separate from other data.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: June 7, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Madan Mohan Rao Jampani, Arushan Rajasekaram, Nikko Strom, Yuzo Watanabe, Stan Weidner Salvador
  • Patent number: 9348411
    Abstract: Described herein are technologies relating to display of a representation of an object on a display screen with visual verisimilitude to a viewer. A location of eyes of the viewer relative to a reference point on the display screen is determined. Additionally, a direction of gaze of the eyes of the viewer is determined. Based upon the location and direction of gaze of the eyes of the viewer, the representation of the object can be displayed at a scale and orientation such that it appears with visual verisimilitude to the viewer.
    Type: Grant
    Filed: May 24, 2013
    Date of Patent: May 24, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Timothy S. Paek, Johnson Apacible
  • Patent number: 9342582
    Abstract: Methods are provided for populating search indexes with atoms identified in documents. Documents that are to be indexed are identified, and for each document, atoms are identified and are categorized as unigrams, n-grams, and n-tuples. A list of atom/document pairs is generated such that an information metric can be computed for each pair. An information metric represents a ranking of the atom in relation to the particular document. Based on the information metric, some atom/document pairs are discarded and others are indexed.
    Type: Grant
    Filed: March 10, 2011
    Date of Patent: May 17, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Knut Magne Risvik, Mike Hopcroft, John G. Bennett, Karthik Kalyanaraman, Trishul Chilimbi
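    To make the indexing decision in patent 9342582's abstract concrete, the Python sketch below scores every atom/document pair with an information metric and keeps only the pairs above a cutoff. A TF-IDF-style score stands in for the patent's metric, which the abstract does not specify; the cutoff value is likewise an assumption.

      import math
      from collections import Counter

      def score_pairs(docs):
          """docs: {doc_id: [atom, ...]} -> {(atom, doc_id): information metric}."""
          doc_freq = Counter()
          for atoms in docs.values():
              doc_freq.update(set(atoms))
          scores = {}
          for doc_id, atoms in docs.items():
              counts = Counter(atoms)
              for atom, tf in counts.items():
                  idf = math.log(len(docs) / doc_freq[atom])
                  scores[(atom, doc_id)] = (tf / len(atoms)) * idf
          return scores

      def select_for_index(scores, cutoff):
          # Low-scoring atom/document pairs are discarded; the rest get indexed.
          return {pair for pair, s in scores.items() if s >= cutoff}

      if __name__ == "__main__":
          docs = {"d1": ["speech", "index", "atom", "atom"], "d2": ["speech", "image"]}
          scores = score_pairs(docs)
          print(select_for_index(scores, cutoff=0.1))  # atoms common to every document drop out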
  • Patent number: 9330667
    Abstract: A method and system for endpoint automatic detection of an audio record is provided. The method comprises the following steps: acquiring an audio record text and affirming the text endpoint acoustic model for the audio record text; acquiring the audio record data of each frame in turn, starting from the audio record start frame in the audio record data; affirming the characteristics acoustic model of the decoding optimal path for the acquired current frame of the audio record data; comparing the characteristics acoustic model of the decoding optimal path acquired from the current frame of the audio record data with the endpoint acoustic model to determine if they are the same; and if so, updating a mute duration threshold with a second time threshold, wherein the second time threshold is less than a first time threshold. This method can improve the efficiency of recognizing the audio record endpoint.
    Type: Grant
    Filed: October 29, 2010
    Date of Patent: May 3, 2016
    Assignee: iFLYTEK Co., Ltd.
    Inventors: Si Wei, Guoping Hu, Yu Hu, Qingfeng Liu
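    A hedged Python sketch of the threshold switch in patent 9330667's abstract: once the best decoding path for the current frame reaches the acoustic model of the text's endpoint, the required silence (mute) duration is shortened from a first threshold to a smaller second threshold so the recording ends sooner. The frame representation, frame length, and threshold values are assumptions.

      def detect_endpoint(frames, endpoint_model, first_threshold=0.8, second_threshold=0.3):
          """frames: [(best_path_model, is_silent)]; returns the stop time in seconds, or None."""
          mute_threshold, silence = first_threshold, 0.0
          frame_seconds = 0.01
          for i, (best_path_model, is_silent) in enumerate(frames):
              if best_path_model == endpoint_model:
                  mute_threshold = second_threshold  # text fully read: allow a shorter pause
              silence = silence + frame_seconds if is_silent else 0.0
              if silence >= mute_threshold:
                  return i * frame_seconds
          return None

      if __name__ == "__main__":
          frames = [("mid", False)] * 100 + [("end", False)] * 5 + [("end", True)] * 40
          print(detect_endpoint(frames, endpoint_model="end"))  # stops about 0.3 s into the silence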
  • Patent number: 9311914
    Abstract: The subject matter discloses a method for two-phase phonetic indexing and search, comprising: receiving a digital representation of an audio signal; producing a phonetic index of the audio signal; producing a phonetic N-gram sequence from the phonetic index by segmenting the phonetic index into a plurality of phonetic N-grams; and producing an inverted index of the plurality of phonetic N-grams.
    Type: Grant
    Filed: September 3, 2012
    Date of Patent: April 12, 2016
    Assignee: NICE-SYSTEMS LTD
    Inventors: Moshe Wasserblat, Dan Eylon, Tzach Ashkenazi, Oren Pereg, Ronen Laperdon
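    The two-phase structure in patent 9311914's abstract maps naturally onto a small amount of code; the Python sketch below segments a phonetic index into phonetic N-grams and builds an inverted index from each N-gram to its occurrences. The N-gram length and the (call_id, position) posting format are assumptions.

      from collections import defaultdict

      def phonetic_ngrams(phonemes, n=3):
          return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]

      def build_inverted_index(calls, n=3):
          """calls: {call_id: [phoneme, ...]} -> {ngram: [(call_id, position)]}."""
          index = defaultdict(list)
          for call_id, phonemes in calls.items():
              for pos, gram in enumerate(phonetic_ngrams(phonemes, n)):
                  index[gram].append((call_id, pos))
          return index

      if __name__ == "__main__":
          calls = {"call_7": ["HH", "AH", "L", "OW", "W", "ER", "L", "D"]}
          index = build_inverted_index(calls)
          print(index[("L", "OW", "W")])  # [('call_7', 2)]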
  • Patent number: 9311298
    Abstract: Tools are provided to allow developers to enable applications for Conversational Understanding (CU) using assets from a CU service. The tools may be used to select functionality from existing domains, extend the coverage of one or more domains, as well as to create new domains in the CU service. A developer may provide example Natural Language (NL) sentences that are analyzed by the tools to assist the developer in labeling data that is used to update the models in the CU service. For example, the tools may assist a developer in identifying domains, determining intent actions, determining intent objects and determining slots from example NL sentences. After the developer tags all or a portion of the example NL sentences, the models in the CU service are automatically updated and validated. For example, validation tools may be used to determine an accuracy of the model against test data.
    Type: Grant
    Filed: June 21, 2013
    Date of Patent: April 12, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ruhi Sarikaya, Daniel Boies, Larry Heck, Tasos Anastasakos
  • Patent number: 9305545
    Abstract: A method for vocabulary integration of speech recognition comprises converting multiple speech signals into multiple words using a processor, applying confidence scores to the multiple words, classifying the multiple words into a plurality of classifications based on classification criteria and the confidence score for each word, determining if one or more of the multiple words are unrecognized based on the plurality of classifications, classifying each unrecognized word and detecting a match for the unrecognized word based on additional classification criteria, and upon detecting a match for an unrecognized word, converting at least a portion of the multiple speech signals corresponding to the unrecognized word into words.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: April 5, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Chun Shing Cheung
  • Patent number: 9299110
    Abstract: Client devices periodically capture ambient audio waveforms and modify their own device configuration based on the captured audio waveform. In particular embodiments, client devices generate waveform fingerprints and upload the fingerprints to a server for analysis. The server compares the waveform fingerprints to a database of stored waveform fingerprints, and upon finding a match, pushes content or other information to the client device. The fingerprints in the database may be uploaded by other users, and compared to the received client waveform fingerprint based on common location or other social factors. Thus a client's location may be enhanced if the location of users whose fingerprints match the client's is known, and, based upon this enhanced location, the server may transmit an instruction to the device to modify its device configuration.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: March 29, 2016
    Assignee: Facebook, Inc.
    Inventors: Matthew Nicholas Papakipos, David Harry Garcia
  • Patent number: 9275647
    Abstract: In particular embodiments, one or more computer-readable non-transitory storage media embody software that is operable when executed to receive an audio waveform fingerprint and a client-determined location from a client device. The received audio waveform fingerprint may be compared to a database of stored audio waveform fingerprints, each stored audio waveform fingerprint associated with an object in an object database. One or more matching audio waveform fingerprints may be found from a comparison set of audio waveform fingerprints obtained from the audio waveform fingerprint database. Location information associated with a location of the client device may be determined, and the location information may be sent to the client device. The client device may be operable to update the client-determined location based at least in part on the location information.
    Type: Grant
    Filed: April 18, 2014
    Date of Patent: March 1, 2016
    Assignee: Facebook, Inc.
    Inventors: Matthew Nicholas Papakipos, David Harry Garcia
  • Patent number: 9263034
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving voice queries, obtaining, for one or more of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query, generating, for the one or more voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query, selecting a subset of the one or more voice queries based on the posterior recognition confidence measures, and adapting an acoustic model using the subset of the voice queries.
    Type: Grant
    Filed: July 13, 2010
    Date of Patent: February 16, 2016
    Assignee: Google Inc.
    Inventors: Brian Strope, Douglas H. Beeferman
  • Patent number: 9251796
    Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset to be in synchrony to the media stream being rendered by the media rendering source.
    Type: Grant
    Filed: August 21, 2014
    Date of Patent: February 2, 2016
    Assignee: Shazam Entertainment Ltd.
    Inventor: Avery Li-Chun Wang
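    The real-time offset computation in patent 9251796's abstract can be written as one line of arithmetic; the Python sketch below combines the identified time offset, the sample's timestamp, the present time, and an optional timescale ratio. The exact formula shown is an assumption consistent with the abstract, not a quotation of the claims.

      import time

      def real_time_offset(time_offset, sample_timestamp, now=None, timescale_ratio=1.0):
          now = time.time() if now is None else now
          # Position in the media stream at the moment of sampling, advanced by the
          # (possibly speed-adjusted) wall-clock time elapsed since then.
          return time_offset + timescale_ratio * (now - sample_timestamp)

      if __name__ == "__main__":
          sampled_at = time.time() - 5.0        # sample captured five seconds ago
          position = real_time_offset(time_offset=93.2, sample_timestamp=sampled_at)
          print(f"render second stream from {position:.1f}s")  # about 98.2 s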
  • Patent number: 9218807
    Abstract: A system and method provide acoustic training of a voice or speech recognition engine and/or voice or speech recognition software application. Instead of requiring a user to read from a prepared or predetermined script, the system and method described herein enable acoustic training using any free text spoken phrases provided by the user directly, or by a previously recorded speech, presentation, or the like, performed by the user.
    Type: Grant
    Filed: January 7, 2011
    Date of Patent: December 22, 2015
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Eric Hon-Anderson, Robert W. Stuller
  • Patent number: 9202461
    Abstract: A set of benchmark text strings may be classified to provide a set of benchmark classifications. The benchmark text strings in the set may correspond to a benchmark corpus of benchmark utterances in a particular language. A benchmark classification distribution of the set of benchmark classifications may be determined. A respective classification for each text string in a corpus of text strings may also be determined. Text strings from the corpus of text strings may be sampled to form a training corpus of training text strings such that the classifications of the training text strings have a training text string classification distribution that is based on the benchmark classification distribution. The training corpus of training text strings may be used to train an automatic speech recognition (ASR) system.
    Type: Grant
    Filed: January 18, 2013
    Date of Patent: December 1, 2015
    Assignee: Google Inc.
    Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar, Kaisuke Nakajima, Daniel Martin Bikel
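    A rough Python sketch of the corpus-shaping step in patent 9202461's abstract: text strings are sampled so that the classification distribution of the training corpus follows the benchmark classification distribution. The quota-based sampling procedure below is one way to achieve that and is an assumption; the patent does not prescribe it.

      import random
      from collections import defaultdict

      def sample_to_distribution(classified_strings, benchmark_dist, corpus_size, seed=0):
          """classified_strings: [(text, class)]; benchmark_dist: {class: fraction}."""
          random.seed(seed)
          by_class = defaultdict(list)
          for text, cls in classified_strings:
              by_class[cls].append(text)
          training = []
          for cls, fraction in benchmark_dist.items():
              # Take a per-class quota proportional to the benchmark distribution.
              quota = min(int(round(fraction * corpus_size)), len(by_class[cls]))
              training.extend(random.sample(by_class[cls], quota))
          return training

      if __name__ == "__main__":
          strings = [(f"navigation query {i}", "navigation") for i in range(50)] + \
                    [(f"dictation sentence {i}", "dictation") for i in range(50)]
          benchmark = {"navigation": 0.7, "dictation": 0.3}
          corpus = sample_to_distribution(strings, benchmark, corpus_size=20)
          print(sum(1 for s in corpus if s.startswith("navigation")), "navigation of", len(corpus))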
  • Patent number: 9195913
    Abstract: This method of configuring a device for detecting a situation, from among a set of situations in which it is possible to find a physical system observed by at least one sensor, comprises the following steps: receiving (102) a training sequence corresponding to a determined situation of the physical system; determining (118) parameters of a statistical hidden Markov model recorded on the detection device and related to the determined situation, based on a prior initialization (104-116) of these parameters. The prior initialization (104-116) comprises the following steps: determining (104, 106) multiple probability distributions from the training sequence; distributing (108-114) the determined probability distributions between the hidden states of the statistical model being used; and initializing the parameters of the statistical model being used from representative probability distributions determined for each hidden state of the statistical model being used.
    Type: Grant
    Filed: August 31, 2011
    Date of Patent: November 24, 2015
    Assignee: Commissariat à l'énergie atomique et aux énergies alternatives
    Inventor: Pierre Jallon
  • Patent number: 9190050
    Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
    Type: Grant
    Filed: April 3, 2014
    Date of Patent: November 17, 2015
    Assignee: MModal IP LLC
    Inventors: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch
  • Patent number: 9177552
    Abstract: Methods and systems for setting selected automatic speech recognition parameters are described. A data set associated with operation of a speech recognition application is defined and includes: i. recognition states characterizing the semantic progression of a user interaction with the speech recognition application, and ii. recognition outcomes associated with each recognition state. For a selected user interaction with the speech recognition application, an application cost function is defined that characterizes an estimated cost of the user interaction for each recognition outcome. For one or more system performance parameters indirectly related to the user interaction, the parameters are set to values which optimize the cost of the user interaction over the recognition states.
    Type: Grant
    Filed: February 3, 2012
    Date of Patent: November 3, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Jeffrey N. Marcus
  • Patent number: 9159317
    Abstract: A system and a method recognize speech including a sequence of words. A set of interpretations of the speech is generated using an acoustic model and a language model, and, for each interpretation, a score representing correctness of an interpretation in representing the sequence of words is determined to produce a set of scores. Next, the set of scores is updated based on a consistency of each interpretation with a constraint determined in response to receiving a word sequence constraint.
    Type: Grant
    Filed: June 14, 2013
    Date of Patent: October 13, 2015
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Bret Harsham, John R. Hershey
  • Patent number: 9159318
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
    Type: Grant
    Filed: August 26, 2014
    Date of Patent: October 13, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
  • Patent number: 9141855
    Abstract: Systems, apparatus and methods are described related to an accelerated object detection filter using a video estimation module.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: September 22, 2015
    Assignee: Intel Corporation
    Inventors: Lin Xu, Yangzhou Du, Jianguo Li, Qiang Li, Ya-Ti Peng, Yi-Jen Chiu
  • Patent number: 9123338
    Abstract: Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: September 1, 2015
    Assignee: Google Inc.
    Inventors: Jason Sanders, Gabriel Taubman, John J. Lee
  • Patent number: 9113190
    Abstract: A processor-implemented method, system and computer readable medium for intelligently controlling the power level of an electronic device in a multimedia system based on user intent, is provided. The method includes receiving data relating to a first user interaction with a device in a multimedia system. The method includes determining if the first user interaction corresponds to a user's intent to interact with the device. The method then includes setting a power level for the device based on the first user interaction. The method further includes receiving data relating to a second user interaction with the device. The method then includes altering the power level of the device based on the second user interaction to activate the device for the user.
    Type: Grant
    Filed: June 4, 2010
    Date of Patent: August 18, 2015
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: John Clavin, John Tardif
  • Patent number: 9075444
    Abstract: An information input apparatus includes an observation unit that observes an environment including a user and one or more apparatuses to be controlled and includes a sensor; a learning unit that separates a foreground including the user and the one or more apparatuses to be controlled and a background including the environment except for the foreground from observation data obtained by the observation unit and learns three-dimensional models of the foreground and the background; a state estimation unit that estimates positions and postures of already modeled foregrounds in the environment; a user recognition unit that identifies fingers of the user from the foreground and recognizes a shape, position, and posture of the fingers; and an apparatus control unit that outputs a control command to the one or more apparatuses to be controlled on the basis of the recognized shape, position, and posture of the fingers.
    Type: Grant
    Filed: February 13, 2013
    Date of Patent: July 7, 2015
    Assignee: SONY CORPORATION
    Inventors: Kuniaki Noda, Hirotaka Suzuki, Haruto Takeda, Yusuke Watanabe