Creating Patterns For Matching Patents (Class 704/243)
  • Patent number: 9659092
    Abstract: A music information searching method includes extracting modulating spectrums from audio data, generating modulating spectrum peak point audio fingerprints by using position information which relates to preset peak points from the extracted modulating spectrums, converting the generated modulating spectrum peak point audio fingerprints into hash keys which indicate addresses of hash tables and hash values that are stored on the hash tables via hash functions, and searching music information by extracting hash keys which relate to audio query clips and comparing the extracted hash keys with the indicated addresses of the hash tables.
    Type: Grant
    Filed: November 13, 2013
    Date of Patent: May 23, 2017
    Assignees: SAMSUNG ELECTRONICS CO., LTD., KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
    Inventors: Ki-wan Eom, Hyoung-Gook Kim, Kwang-ki Kim
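    The abstract above (patent 9659092) describes the classic fingerprint-lookup pattern: peak-point fingerprints are converted into hash keys that address a hash table, and queries are matched by looking up the same keys. Below is a minimal Python sketch of that pattern; the peak-pair packing scheme, the fan_out parameter, and all function names are illustrative assumptions, not the patented algorithm.

      from collections import defaultdict

      def peak_pairs_to_hashes(peaks, fan_out=3):
          """Encode pairs of (time, frequency) peak points as integer hash keys."""
          hashes = []
          for i, (t1, f1) in enumerate(peaks):
              for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
                  dt = t2 - t1
                  # Pack anchor frequency, paired frequency, and time delta into one key.
                  key = (f1 & 0x3FF) << 20 | (f2 & 0x3FF) << 10 | (dt & 0x3FF)
                  hashes.append((key, t1))
          return hashes

      def build_index(tracks):
          """tracks: {track_id: [(time, freq), ...]} -> hash table of fingerprints."""
          index = defaultdict(list)
          for track_id, peaks in tracks.items():
              for key, t in peak_pairs_to_hashes(peaks):
                  index[key].append((track_id, t))
          return index

      def search(index, query_peaks):
          """Count hash-key matches per track; the best-voted track wins."""
          votes = defaultdict(int)
          for key, _ in peak_pairs_to_hashes(query_peaks):
              for track_id, _ in index.get(key, []):
                  votes[track_id] += 1
          return max(votes, key=votes.get) if votes else None

      if __name__ == "__main__":
          tracks = {"song_a": [(0, 100), (1, 220), (2, 330), (3, 150)],
                    "song_b": [(0, 400), (1, 410), (2, 90), (3, 500)]}
          index = build_index(tracks)
          print(search(index, [(10, 100), (11, 220), (12, 330)]))  # -> song_a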
  • Patent number: 9659562
    Abstract: A system estimates environment-specific alterations of a user sound received at the system. The system modifies the received user sound to formulate a modified user sound by at least compensating for the audio modifications and/or formulates an expected audio model of a user by modifying a stored user-dependent audio model of the user with the audio modifications. The system is also capable of estimating whether the received user sound is from a particular user by use of a corresponding user-dependent audio model.
    Type: Grant
    Filed: August 30, 2016
    Date of Patent: May 23, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Andrew William Lovitt
  • Patent number: 9652745
    Abstract: A method for detecting bias in an evaluation process is provided. The method includes operations of receiving evaluation data from a candidate evaluation system. The evaluation data is provided by a set of evaluators based on digital interview data collected from evaluation candidates. The operations of the method further include extracting indicators of characteristics of the evaluation candidates from the digital interview data, classifying the evaluation candidates based on the indicators extracted from the digital interview data, and determining whether the evaluation data indicates a bias of one or more evaluators with respect to a classification of the evaluation candidates.
    Type: Grant
    Filed: November 17, 2014
    Date of Patent: May 16, 2017
    Assignee: HIREVUE, INC.
    Inventors: Benjamin Taylor, Loren Larsen
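    As a rough illustration of the bias check summarized in the abstract of patent 9652745, the Python sketch below aggregates evaluation outcomes per evaluator and per candidate classification, then flags evaluators whose outcome rates diverge sharply across classifications. The pass/fail outcome model, the rate-difference metric, and the threshold value are assumptions for illustration only.

      from collections import defaultdict

      def bias_flags(evaluations, threshold=0.2):
          """evaluations: list of (evaluator, candidate_class, passed: bool)."""
          counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # evaluator -> class -> [passes, total]
          for evaluator, cls, passed in evaluations:
              stats = counts[evaluator][cls]
              stats[0] += int(passed)
              stats[1] += 1
          flagged = {}
          for evaluator, per_class in counts.items():
              rates = {cls: p / n for cls, (p, n) in per_class.items() if n}
              # Flag evaluators whose best- and worst-treated classes differ by more than the threshold.
              if len(rates) > 1 and max(rates.values()) - min(rates.values()) > threshold:
                  flagged[evaluator] = rates
          return flagged

      if __name__ == "__main__":
          data = [("eval1", "A", True), ("eval1", "A", True), ("eval1", "B", False), ("eval1", "B", False),
                  ("eval2", "A", True), ("eval2", "B", True)]
          print(bias_flags(data))  # eval1 shows a large pass-rate gap between classes A and B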
  • Patent number: 9653069
    Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: May 16, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie
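    A hedged Python sketch of the parallel-decoding idea in patent 9653069's abstract: one speaker-independent model plus as many speaker-dependent models as resources allow are run in parallel, and the model producing the best-scoring hypothesis is kept as the dominant model. The Recognizer interface, the confidence-based selection rule, and the max_models cap are assumptions, not the patented system.

      from concurrent.futures import ThreadPoolExecutor

      class Recognizer:
          """Stand-in for an ASR model: returns (hypothesis, confidence)."""
          def __init__(self, name, confidence):
              self.name, self._confidence = name, confidence
          def recognize(self, audio):
              return f"hypothesis from {self.name}", self._confidence

      def recognize_parallel(audio, speaker_independent, speaker_dependent, max_models=4):
          # Cap the speaker-dependent models by available resources; always keep the SI model.
          selected = [speaker_independent] + speaker_dependent[:max_models - 1]
          with ThreadPoolExecutor(max_workers=len(selected)) as pool:
              results = list(pool.map(lambda m: (m, *m.recognize(audio)), selected))
          # The "dominant" model is the one with the highest-confidence hypothesis.
          dominant, hypothesis, confidence = max(results, key=lambda r: r[2])
          return dominant.name, hypothesis, confidence

      if __name__ == "__main__":
          si = Recognizer("speaker-independent", 0.72)
          sd = [Recognizer("speaker-1", 0.81), Recognizer("speaker-2", 0.64)]
          print(recognize_parallel(b"pcm-bytes", si, sd))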
  • Patent number: 9653071
    Abstract: A method and system are disclosed for recognizing speech errors, such as in spoken short messages, using an audio input device to receive an utterance of a short message, using an automated speech recognition module to generate a text sentence corresponding to the utterance, and generating an N-best list of predicted error sequences for the text sentence using a linear-chain conditional random field (CRF) module, where each word of the text sentence is assigned a label in each of the predicted error sequences, and each label is assigned a probability score. The predicted error sequence labels are rescored using a metacost matrix module, the best rescored error sequence from the N-best list of predicted error sequences is selected using a Recognition Output Voting Error Reduction (ROVER) module, and a dialog action is executed by a dialog action module based on the best rescored error sequence and a dialog action policy.
    Type: Grant
    Filed: August 22, 2014
    Date of Patent: May 16, 2017
    Assignee: Honda Motor Co., Ltd.
    Inventors: Rakesh Gupta, Teruhisa Misu, Aasish Pappu
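    The rescoring-and-voting step in patent 9653071's abstract can be illustrated with the short Python sketch below: per-word error labels with probabilities are reweighted by a metacost matrix, and a ROVER-style per-position vote selects the final label sequence. The label set, the metacost values, and the additive voting rule are assumptions; the patent's actual modules are not reproduced here.

      def rescore(sequence, metacost):
          """sequence: [(word, label, prob), ...]; returns the metacost-rescored sequence."""
          return [(word, label, prob * metacost.get(label, 1.0)) for word, label, prob in sequence]

      def rover_vote(nbest):
          """Per word position, pick the label with the highest total rescored score."""
          result = []
          for position in zip(*nbest):
              word = position[0][0]
              scores = {}
              for _, label, score in position:
                  scores[label] = scores.get(label, 0.0) + score
              result.append((word, max(scores, key=scores.get)))
          return result

      if __name__ == "__main__":
          # Penalize missing an error more than a false alarm (illustrative metacost).
          metacost = {"ERROR": 1.5, "OK": 1.0}
          nbest = [
              [("send", "OK", 0.9), ("test", "ERROR", 0.4)],
              [("send", "OK", 0.8), ("test", "OK", 0.5)],
          ]
          rescored = [rescore(seq, metacost) for seq in nbest]
          print(rover_vote(rescored))  # [('send', 'OK'), ('test', 'ERROR')]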
  • Patent number: 9646609
    Abstract: Systems and processes for generating a shared pronunciation lexicon and using the shared pronunciation lexicon to interpret spoken user inputs received by a virtual assistant are provided. In one example, the process can include receiving pronunciations for words or named entities from multiple users. The pronunciations can be tagged with context tags and stored in the shared pronunciation lexicon. The shared pronunciation lexicon can then be used to interpret a spoken user input received by a user device by determining a relevant subset of the shared pronunciation lexicon based on contextual information associated with the user device and performing speech-to-text conversion on the spoken user input using the determined subset of the shared pronunciation lexicon.
    Type: Grant
    Filed: August 25, 2015
    Date of Patent: May 9, 2017
    Assignee: Apple Inc.
    Inventors: Devang K. Naik, Ali S. Mohamed, Hong M. Chen
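    A minimal Python sketch of the context-tagged shared lexicon described in patent 9646609's abstract: pronunciations contributed by many users are stored with context tags, and a device's contextual information selects the relevant subset before speech-to-text conversion. The tag format and the SharedLexicon class are illustrative assumptions.

      from collections import defaultdict

      class SharedLexicon:
          def __init__(self):
              self._entries = defaultdict(list)  # word -> [(pronunciation, tags)]

          def add(self, word, pronunciation, tags):
              self._entries[word].append((pronunciation, set(tags)))

          def subset(self, context_tags):
              """Keep only pronunciations whose tags overlap the device context."""
              context = set(context_tags)
              return {word: [p for p, tags in prons if tags & context]
                      for word, prons in self._entries.items()}

      if __name__ == "__main__":
          lexicon = SharedLexicon()
          lexicon.add("Nguyen", "W IH N", {"locale:en-US"})
          lexicon.add("Nguyen", "NG UW EH N", {"locale:vi-VN"})
          print(lexicon.subset({"locale:en-US"}))  # only the en-US pronunciation survives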
  • Patent number: 9640175
    Abstract: Systems and methods are described for adding entries to a custom lexicon used by a speech recognition engine of a speech interface in response to user interaction with the speech interface. In one embodiment, a speech signal is obtained when the user speaks a name of a particular item to be selected from among a finite set of items. If a phonetic description of the speech signal is not recognized by the speech recognition engine, then the user is presented with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item. After the user has selected the particular item via the means for selecting, the phonetic description of the speech signal is stored in association with a text description of the particular item in the custom lexicon.
    Type: Grant
    Filed: October 7, 2011
    Date of Patent: May 2, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wei-Ting Frank Liu, Andrew Lovitt, Stefanie Tomko, Yun-Cheng Ju
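    The fallback flow in patent 9640175's abstract reduces to a small amount of control logic, sketched in Python below: if the spoken name is not recognized, the user selects the item by other means, and the unrecognized phonetic form is then stored against that item's text in a custom lexicon. Every name and interface here is an assumption made for illustration.

      custom_lexicon = {}  # phonetic description -> item text description

      def select_item(phonetics, recognized_item, pick_manually):
          if recognized_item is not None:
              return recognized_item
          # Recognition failed: let the user choose without speaking, then learn the pronunciation.
          item = pick_manually()
          custom_lexicon[phonetics] = item
          return item

      if __name__ == "__main__":
          chosen = select_item("K EH L T IH K", None, lambda: "Celtic Woman")
          print(chosen, custom_lexicon)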
  • Patent number: 9613621
    Abstract: A speech recognition method and an electronic apparatus are provided. The speech recognition method includes the following steps. A plurality of phonetic transcriptions of a speech signal is obtained according to an acoustic model. Phonetic spellings and intonation information matched to the phonetic transcriptions are obtained according to a phonetic transcription sequence and a syllable acoustic lexicon of the invention. According to the phonetic spellings and the intonation information, a plurality of phonetic spelling sequences and a plurality of phonetic spelling sequence probabilities are obtained from a language model. The phonetic spelling sequence corresponding to the largest one among the phonetic spelling sequence probabilities is selected as a recognition result of the speech signal.
    Type: Grant
    Filed: September 19, 2014
    Date of Patent: April 4, 2017
    Assignee: VIA Technologies, Inc.
    Inventors: Guo-Feng Zhang, Yi-Fei Zhu
  • Patent number: 9589560
    Abstract: Features are disclosed for estimating a false rejection rate in a detection system. The false rejection rate can be estimated by fitting a model to a distribution of detection confidence scores. An estimated false rejection rate can then be computed for confidence scores that fall below a threshold. The false rejection rate and model can be verified once the detection system has been deployed by obtaining additional data with confidence scores falling below the threshold. Adjustments to the model or other operational parameters can be implemented based on the verified false rejection rate, model, or additional data.
    Type: Grant
    Filed: December 19, 2013
    Date of Patent: March 7, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Rohit Prasad
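    A sketch of the estimation idea in patent 9589560's abstract: fit a simple model to the distribution of detection confidence scores for true events and compute the probability mass that falls below the acceptance threshold, i.e. the estimated false rejection rate. The Gaussian fit below is an assumption; the patent does not specify this particular model.

      import math
      import statistics

      def estimated_false_rejection_rate(true_detection_scores, threshold):
          mu = statistics.mean(true_detection_scores)
          sigma = statistics.stdev(true_detection_scores)
          # Gaussian CDF at the threshold = fraction of true events expected to score
          # below it and therefore be (falsely) rejected.
          return 0.5 * (1.0 + math.erf((threshold - mu) / (sigma * math.sqrt(2.0))))

      if __name__ == "__main__":
          scores = [0.81, 0.77, 0.90, 0.85, 0.72, 0.88, 0.79, 0.83]
          print(f"estimated FRR at 0.75: {estimated_false_rejection_rate(scores, 0.75):.3f}")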
  • Patent number: 9564134
    Abstract: The present invention relates to a method and apparatus for speaker-calibrated speaker detection. One embodiment of a method for generating a speaker model for use in detecting a speaker of interest includes identifying one or more speech features that best distinguish the speaker of interest from a plurality of impostor speakers and then incorporating the speech features in the speaker model.
    Type: Grant
    Filed: September 28, 2015
    Date of Patent: February 7, 2017
    Assignee: SRI INTERNATIONAL
    Inventors: Elizabeth Shriberg, Luciana Ferrer, Andreas Stolcke, Martin Graciarena, Nicolas Scheffer
  • Patent number: 9557819
    Abstract: Gesture input with multiple displays, views, and physics is described. In one example, a method includes generating a three dimensional space having a plurality of objects in different positions relative to a user and a virtual object to be manipulated by the user, presenting, on a display, a displayed area having at least a portion of the plurality of different objects, detecting an air gesture of the user against the virtual object, the virtual object being outside the displayed area, generating a trajectory of the virtual object in the three-dimensional space based on the air gesture, the trajectory including interactions with objects of the plurality of objects in the three-dimensional space, and presenting a portion of the generated trajectory on the displayed area.
    Type: Grant
    Filed: November 23, 2011
    Date of Patent: January 31, 2017
    Assignee: Intel Corporation
    Inventor: Glen J. Anderson
  • Patent number: 9547755
    Abstract: A system and methods are provided for digital content creation and upload through a managed website, providing network-based access to authorized users who pay for predetermined rights that allow for use of the content by the authorized user on a multiplicity of devices, without having to repurchase access to the same content.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: January 17, 2017
    Inventor: Jill Lewis Maurer
  • Patent number: 9536526
    Abstract: According to one embodiment, an electronic device includes a display controller and circuitry. The display controller displays a first object indicative of a first speaker, a first object indicative of a second speaker different from the first speaker, a second object indicative of a first speech period identified as speech of the first speaker, and a second object indicative of a second speech period identified as speech of the second speaker. The circuitry integrates the first speech period and the second speech period into a speech period of a same speaker when a first operation of associating the first object indicative of the first speaker with the first object indicative of the second speaker is performed.
    Type: Grant
    Filed: March 19, 2015
    Date of Patent: January 3, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Ryuichi Yamaguchi
  • Patent number: 9536567
    Abstract: In an aspect, in general, a method for aligning an audio recording and a transcript includes receiving a transcript including a plurality of terms, each term of the plurality of terms associated with a time location within a different version of the audio recording, forming a plurality of search terms from the terms of the transcript, determining possible time locations of the search terms in the audio recording, determining a correspondence between the time locations within the different version of the audio recording associated with the search terms and the possible time locations of the search terms in the audio recording, and aligning the audio recording and the transcript, including updating the time locations associated with terms of the transcript based on the determined correspondence.
    Type: Grant
    Filed: September 4, 2012
    Date of Patent: January 3, 2017
    Assignee: NEXIDIA INC.
    Inventors: Jacob B. Garland, Drew Lanham, Daryl Kip Watters, Marsal Gavalda, Mark Finlay, Kenneth K. Griggs
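    As an illustration of the realignment step in patent 9536567's abstract, the Python sketch below locates search terms in the new audio, estimates a correspondence against the transcript's old timestamps (here a simple median time offset), and updates every transcript time accordingly. The offset-only correspondence model is an assumption made to keep the example small.

      import statistics

      def realign(transcript, located):
          """transcript: [(term, old_time)]; located: {term: new_time_in_audio}."""
          offsets = [located[term] - old for term, old in transcript if term in located]
          shift = statistics.median(offsets) if offsets else 0.0
          # Apply the estimated correspondence to every term, found or not.
          return [(term, old + shift) for term, old in transcript]

      if __name__ == "__main__":
          transcript = [("opening remarks", 0.0), ("quarterly results", 42.5), ("questions", 300.0)]
          located = {"opening remarks": 12.1, "quarterly results": 54.4}
          print(realign(transcript, located))  # every timestamp shifted by about 12 seconds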
  • Patent number: 9535987
    Abstract: System and method to search audio data, including: receiving audio data representing speech; receiving a search query related to the audio data; compiling, by use of a processor, the search query into a hierarchy of scored speech recognition sub-searches; searching, by use of a processor, the audio data for speech identified by one or more of the sub-searches to produce hits; and combining, by use of a processor, the hits by use of at least one combination function to provide a composite search score of the audio data. The combination function may include an at-least-M-of-N function that produces a high score when at least M of N function inputs exceed a predetermined threshold value. The composite search score may employ a soft time window such as a spline function.
    Type: Grant
    Filed: January 25, 2016
    Date of Patent: January 3, 2017
    Assignee: Avaya Inc.
    Inventor: Keith Michael Ponting
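    The at-least-M-of-N combination function named in the abstract of patent 9535987 is easy to state directly; the Python sketch below returns a high composite score when at least M of the N sub-search scores exceed a threshold. The high/low output values and the threshold are illustrative assumptions.

      def at_least_m_of_n(scores, m, threshold, high=1.0, low=0.0):
          # Count sub-search scores that clear the threshold; require at least m of them.
          hits = sum(1 for s in scores if s > threshold)
          return high if hits >= m else low

      if __name__ == "__main__":
          sub_search_scores = [0.92, 0.40, 0.81, 0.77]
          # Require at least 3 of the 4 sub-searches to exceed 0.7.
          print(at_least_m_of_n(sub_search_scores, m=3, threshold=0.7))  # 1.0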
  • Patent number: 9536253
    Abstract: A method including the steps of: receiving, by a computer system including at least one computer, a first electronic media work uploaded from a first electronic device; extracting one or more features from the first electronic media work; linking the first electronic media work with a reference electronic media work identifier associated with a reference electronic media work to generate correlation information relating the first electronic media work with at least an action associated with the reference electronic media work identifier; storing the correlation information; receiving, from a second electronic device, a query related to the first electronic media work; correlating the query with action information related to an action to be performed based at least in part on the correlation information; generating machine-readable instructions based upon the action information; and providing the machine-readable instructions to the second electronic device to be used in performing the action.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: January 3, 2017
    Assignee: NETWORK-1 TECHNOLOGIES, INC.
    Inventor: Ingemar J. Cox
  • Patent number: 9536516
    Abstract: A speech recognition circuit comprises an input buffer for receiving processed speech parameters. A lexical memory contains lexical data for word recognition. The lexical data comprises a plurality of lexical tree data structures. Each lexical tree data structure comprises a model of words having common prefix components. An initial component of each lexical tree structure is unique. A plurality of lexical tree processors are connected in parallel to the input buffer for processing the speech parameters in parallel to perform parallel lexical tree processing for word recognition by accessing the lexical data in the lexical memory. A results memory is connected to the lexical tree processors for storing processing results from the lexical tree processors and lexical tree identifiers to identify lexical trees to be processed by the lexical tree processors.
    Type: Grant
    Filed: June 19, 2014
    Date of Patent: January 3, 2017
    Assignee: Zentian Limited
    Inventor: Mark Catchpole
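    A generic Python sketch of the lexical-tree idea in patent 9536516's abstract: words sharing a prefix share a path in a tree, each tree's initial component is unique, and the resulting trees could therefore be handed to separate processors. The nested-dict trie below illustrates the data structure only; it is not the patented circuit.

      def build_lexical_trees(words):
          trees = {}  # unique initial component -> nested dict trie
          for word in words:
              node = trees.setdefault(word[0], {})
              for ch in word[1:]:
                  node = node.setdefault(ch, {})
              node["<end>"] = True  # mark a complete word
          return trees

      if __name__ == "__main__":
          trees = build_lexical_trees(["cat", "car", "card", "dog"])
          # Two lexical trees: one rooted at 'c' (cat/car/card share prefixes), one at 'd'.
          print(sorted(trees))           # ['c', 'd']
          print(trees["c"]["a"].keys())  # dict_keys(['t', 'r'])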
  • Patent number: 9529793
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resolving ambiguity in received voice queries. An original voice query is received following one or more earlier voice queries, wherein the original voice query includes a pronoun or phrase. In one implementation, a plurality of acoustic parameters is identified for one or more words in the original voice query. A concept represented by the pronoun is identified based on the plurality of acoustic parameters, wherein the concept is associated with a particular query of the one or more earlier queries. The concept is associated with the pronoun. Alternatively, a concept may be associated with a phrase by using grammatical analysis of the query to relate the phrase to a concept derived from a prior query.
    Type: Grant
    Filed: February 22, 2013
    Date of Patent: December 27, 2016
    Assignee: Google Inc.
    Inventors: Gabriel Taubman, John J. Lee
  • Patent number: 9513712
    Abstract: A processing device and method are provided. According to an illustrative embodiment, the device and method are implemented by detecting a face region of an image, setting at least one action region according to the position of the face region, comparing image data corresponding to the at least one action region to the detection information for purposes of determining whether or not a predetermined action has been performed, and generating a notification when it is determined that the predetermined action has been performed.
    Type: Grant
    Filed: April 24, 2014
    Date of Patent: December 6, 2016
    Assignee: SONY CORPORATION
    Inventors: Yusuke Sakai, Shingo Tsurumi, Masao Kondo
  • Patent number: 9502032
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a method comprises receiving audio data encoding one or more utterances; performing a first speech recognition on the audio data; identifying a context based on the first speech recognition; performing a second speech recognition on the audio data that is biased towards the context; and providing an output of the second speech recognition.
    Type: Grant
    Filed: October 28, 2014
    Date of Patent: November 22, 2016
    Assignee: Google Inc.
    Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
  • Patent number: 9502030
    Abstract: Methods and systems are provided for adapting a speech system of a vehicle. In one example a method includes: logging data from the vehicle; logging speech data from the speech system; processing the data from the vehicle and the data from the speech system to determine a pattern of context and a relation to user interaction behavior; and selectively updating a user profile of the speech system based on the pattern of context.
    Type: Grant
    Filed: November 1, 2013
    Date of Patent: November 22, 2016
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ute Winter, Timothy J. Grost, Ron M. Hecht, Robert D. Sims, III
  • Patent number: 9466292
    Abstract: Methods and systems for online incremental adaptation of neural networks using Gaussian mixture models in speech recognition are described. In an example, a computing device may be configured to receive an audio signal and a subsequent audio signal, both signals having speech content. The computing device may be configured to apply a speaker-specific feature transform to the audio signal to obtain a transformed audio signal. The speaker-specific feature transform may be configured to include speaker-specific speech characteristics of a speaker-profile relating to the speech content. Further, the computing device may be configured to process the transformed audio signal using a neural network trained to estimate a respective speech content of the audio signal. Based on outputs of the neural network, the computing device may be configured to modify the speaker-specific feature transform, and apply the modified speaker-specific feature transform to a subsequent audio signal.
    Type: Grant
    Filed: May 3, 2013
    Date of Patent: October 11, 2016
    Assignee: Google Inc.
    Inventors: Xin Lei, Petar Aleksic
  • Patent number: 9443272
    Abstract: A data processing system includes components for providing a pleasant user experience. Those components may include a family interaction engine that provides a family channel. The family interaction engine may provide for creation of a user group. The family channel may present content of interest to multiple users in the user group. When a user is detected near the data processing system, the family interaction engine may automatically present content of interest to that user. When used for presenting media content, the data processing system may also cause supplemental data to automatically be presented, wherein the supplemental data is relevant to the media content and to a predetermined interest of the user. The data processing system may also provide a ranked list of applications for potential activation by the user. The applications may be ordered based on the current context. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 13, 2012
    Date of Patent: September 13, 2016
    Assignee: Intel Corporation
    Inventors: Chieh-Yih Wan, Giuseppe Raffa, Junaith Ahemed Shahabdeen, Lama Nachman, Adam Jordan, Ashwini Asokan
  • Patent number: 9443287
    Abstract: The image processing method includes providing first dictionaries produced by dictionary learning and second dictionaries corresponding to the first dictionaries, performing, on each first dictionary, a process to approximate a first image by a linear combination of elements of the first dictionary so as to produce a linear combination coefficient and thereby acquiring multiple linear combination coefficients, and calculating, for each linear combination coefficient, a ratio between a largest coefficient element and a second-largest coefficient element and selecting a specific linear combination coefficient in which the ratio is largest among the multiple linear combination coefficients.
    Type: Grant
    Filed: February 3, 2015
    Date of Patent: September 13, 2016
    Assignee: CANON KABUSHIKI KAISHA
    Inventor: Yoshinori Kimura
  • Patent number: 9436673
    Abstract: An apparatus and method for applying a layout template to content are disclosed herein. A plurality of content included in a visual workspace is automatically grouped into one or more clusters, with one or more content of the plurality of content being at different spatial positions from each other. At least one cluster is automatically located to a respective content placeholder included in the layout template. The clusters with the layout template are presented in accordance with the automatic locating of the clusters.
    Type: Grant
    Filed: March 28, 2013
    Date of Patent: September 6, 2016
    Assignee: Prezi, Inc
    Inventors: Zoltán Gera, Andrei Boghiu, Lior Paz, Péter Zimon, Péter Polgár Balázs, Peter Arvai
  • Patent number: 9424846
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: July 30, 2014
    Date of Patent: August 23, 2016
    Assignee: Google Inc.
    Inventors: Dominik Roblek, Matthew Sharifi
  • Patent number: 9412370
    Abstract: A method and a system for a speech recognition system are provided, in which an electronic speech-based document is associated with a document template and comprises one or more sections of text recognized or transcribed from sections of speech. The sections of speech are transcribed by the speech recognition system into corresponding sections of text of the electronic speech-based document. The method includes the steps of dynamically creating sub-contexts and associating the sub-contexts with sections of text of the document template.
    Type: Grant
    Filed: June 20, 2014
    Date of Patent: August 9, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Gerhard Grobauer, Miklos Papai
  • Patent number: 9384735
    Abstract: A method for facilitating the updating of a language model includes receiving, at a client device, via a microphone, an audio message corresponding to speech of a user; communicating the audio message to a first remote server; receiving, at the client device, a result, transcribed at the first remote server using an automatic speech recognition system (“ASR”), from the audio message; receiving, at the client device from the user, an affirmation of the result; storing, at the client device, the result in association with an identifier corresponding to the audio message; and communicating, to a second remote server, the stored result together with the identifier.
    Type: Grant
    Filed: July 25, 2014
    Date of Patent: July 5, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Marc White, Igor Roditis Jablokov, Victor Roman Jablokov
  • Patent number: 9361289
    Abstract: Features are disclosed for maintaining data that can be used to personalize spoken language processing, such as automatic speech recognition (“ASR”), natural language understanding (“NLU”), natural language processing (“NLP”), etc. The data may be obtained from various data sources, such as applications or services used by the user. User-specific data maintained by the data sources can be retrieved and stored for use in generating personal models. Updates to data at the data sources may be reflected by separate data sets in the personalization data, such that other processes can obtain the update data sets separate from other data.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: June 7, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Madan Mohan Rao Jampani, Arushan Rajasekaram, Nikko Strom, Yuzo Watanabe, Stan Weidner Salvador
  • Patent number: 9348411
    Abstract: Described herein are technologies relating to display of a representation of an object on a display screen with visual verisimilitude to a viewer. A location of eyes of the viewer relative to a reference point on the display screen is determined. Additionally, a direction of gaze of the eyes of the viewer is determined. Based upon the location and direction of gaze of the eyes of the viewer, the representation of the object can be displayed at a scale and orientation such that it appears with visual verisimilitude to the viewer.
    Type: Grant
    Filed: May 24, 2013
    Date of Patent: May 24, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Timothy S. Paek, Johnson Apacible
  • Patent number: 9342582
    Abstract: Methods are provided for populating search indexes with atoms identified in documents. Documents that are to be indexed are identified, and for each document, atoms are identified and are categorized as unigrams, n-grams, and n-tuples. A list of atom/document pairs is generated such that an information metric can be computed for each pair. An information metric represents a ranking of the atom in relation to the particular document. Based on the information metric, some atom/document pairs are discarded and others are indexed.
    Type: Grant
    Filed: March 10, 2011
    Date of Patent: May 17, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Knut Magne Risvik, Mike Hopcroft, John G. Bennett, Karthik Kalyanaraman, Trishul Chilimbi
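    To make the indexing decision in patent 9342582's abstract concrete, the Python sketch below scores every atom/document pair with an information metric and keeps only the pairs above a cutoff. A TF-IDF-style score stands in for the patent's metric, which the abstract does not specify; the cutoff value is likewise an assumption.

      import math
      from collections import Counter

      def score_pairs(docs):
          """docs: {doc_id: [atom, ...]} -> {(atom, doc_id): information metric}."""
          doc_freq = Counter()
          for atoms in docs.values():
              doc_freq.update(set(atoms))
          scores = {}
          for doc_id, atoms in docs.items():
              counts = Counter(atoms)
              for atom, tf in counts.items():
                  idf = math.log(len(docs) / doc_freq[atom])
                  scores[(atom, doc_id)] = (tf / len(atoms)) * idf
          return scores

      def select_for_index(scores, cutoff):
          # Low-scoring atom/document pairs are discarded; the rest get indexed.
          return {pair for pair, s in scores.items() if s >= cutoff}

      if __name__ == "__main__":
          docs = {"d1": ["speech", "index", "atom", "atom"], "d2": ["speech", "image"]}
          scores = score_pairs(docs)
          print(select_for_index(scores, cutoff=0.1))  # atoms common to every document drop out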
  • Patent number: 9330667
    Abstract: A method and system for endpoint automatic detection of an audio record is provided. The method comprises the following steps: acquiring an audio record text and affirming the text endpoint acoustic model for the audio record text; acquiring the audio record data of each frame in turn, starting from the audio record start frame in the audio record data; affirming the characteristics acoustic model of the decoding optimal path for the acquired current frame of the audio record data; comparing the characteristics acoustic model of the decoding optimal path acquired from the current frame of the audio record data with the endpoint acoustic model to determine if they are the same; and if so, updating a mute duration threshold with a second time threshold, wherein the second time threshold is less than a first time threshold. This method can improve the efficiency of recognizing the audio record endpoint.
    Type: Grant
    Filed: October 29, 2010
    Date of Patent: May 3, 2016
    Assignee: iFLYTEK Co., Ltd.
    Inventors: Si Wei, Guoping Hu, Yu Hu, Qingfeng Liu
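    A hedged Python sketch of the threshold switch in patent 9330667's abstract: once the best decoding path for the current frame reaches the acoustic model of the text's endpoint, the required silence (mute) duration is shortened from a first threshold to a smaller second threshold so the recording ends sooner. The frame representation, frame length, and threshold values are assumptions.

      def detect_endpoint(frames, endpoint_model, first_threshold=0.8, second_threshold=0.3):
          """frames: [(best_path_model, is_silent)]; returns the stop time in seconds, or None."""
          mute_threshold, silence = first_threshold, 0.0
          frame_seconds = 0.01
          for i, (best_path_model, is_silent) in enumerate(frames):
              if best_path_model == endpoint_model:
                  mute_threshold = second_threshold  # text fully read: allow a shorter pause
              silence = silence + frame_seconds if is_silent else 0.0
              if silence >= mute_threshold:
                  return i * frame_seconds
          return None

      if __name__ == "__main__":
          frames = [("mid", False)] * 100 + [("end", False)] * 5 + [("end", True)] * 40
          print(detect_endpoint(frames, endpoint_model="end"))  # stops about 0.3 s into the silence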
  • Patent number: 9311914
    Abstract: The subject matter discloses a method for two-phase phonetic indexing and search, comprising: receiving a digital representation of an audio signal; producing a phonetic index of the audio signal; producing a phonetic N-gram sequence from the phonetic index by segmenting the phonetic index into a plurality of phonetic N-grams; and producing an inverted index of the plurality of phonetic N-grams.
    Type: Grant
    Filed: September 3, 2012
    Date of Patent: April 12, 2016
    Assignee: NICE-SYSTEMS LTD
    Inventors: Moshe Wasserblat, Dan Eylon, Tzach Ashkenazi, Oren Pereg, Ronen Laperdon
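    The two-phase structure in patent 9311914's abstract maps naturally onto a small amount of code; the Python sketch below segments a phonetic index into phonetic N-grams and builds an inverted index from each N-gram to its occurrences. The N-gram length and the (call_id, position) posting format are assumptions.

      from collections import defaultdict

      def phonetic_ngrams(phonemes, n=3):
          return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]

      def build_inverted_index(calls, n=3):
          """calls: {call_id: [phoneme, ...]} -> {ngram: [(call_id, position)]}."""
          index = defaultdict(list)
          for call_id, phonemes in calls.items():
              for pos, gram in enumerate(phonetic_ngrams(phonemes, n)):
                  index[gram].append((call_id, pos))
          return index

      if __name__ == "__main__":
          calls = {"call_7": ["HH", "AH", "L", "OW", "W", "ER", "L", "D"]}
          index = build_inverted_index(calls)
          print(index[("L", "OW", "W")])  # [('call_7', 2)]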
  • Patent number: 9311298
    Abstract: Tools are provided to allow developers to enable applications for Conversational Understanding (CU) using assets from a CU service. The tools may be used to select functionality from existing domains, extend the coverage of one or more domains, as well as to create new domains in the CU service. A developer may provide example Natural Language (NL) sentences that are analyzed by the tools to assist the developer in labeling data that is used to update the models in the CU service. For example, the tools may assist a developer in identifying domains, determining intent actions, determining intent objects and determining slots from example NL sentences. After the developer tags all or a portion of the example NL sentences, the models in the CU service are automatically updated and validated. For example, validation tools may be used to determine an accuracy of the model against test data.
    Type: Grant
    Filed: June 21, 2013
    Date of Patent: April 12, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ruhi Sarikaya, Daniel Boies, Larry Heck, Tasos Anastasakos
  • Patent number: 9305545
    Abstract: A method for vocabulary integration of speech recognition comprises converting multiple speech signals into multiple words using a processor, applying confidence scores to the multiple words, classifying the multiple words into a plurality of classifications based on classification criteria and the confidence score for each word, determining if one or more of the multiple words are unrecognized based on the plurality of classifications, classifying each unrecognized word and detecting a match for the unrecognized word based on additional classification criteria, and upon detecting a match for an unrecognized word, converting at least a portion of the multiple speech signals corresponding to the unrecognized word into words.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: April 5, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Chun Shing Cheung
  • Patent number: 9299110
    Abstract: Client devices periodically capture ambient audio waveforms and modify their own device configuration based on the captured audio waveform. In particular embodiments, client devices generate waveform fingerprints and upload the fingerprints to a server for analysis. The server compares the waveform fingerprints to a database of stored waveform fingerprints, and upon finding a match, pushes content or other information to the client device. The fingerprints in the database may be uploaded by other users, and compared to the received client waveform fingerprint based on common location or other social factors. Thus a client's location may be enhanced if the location of users whose fingerprints match the client's is known, and, based upon this enhanced location, the server may transmit an instruction to the device to modify its device configuration.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: March 29, 2016
    Assignee: Facebook, Inc.
    Inventors: Matthew Nicholas Papakipos, David Harry Garcia
  • Patent number: 9275647
    Abstract: In particular embodiments, one or more computer-readable non-transitory storage media embody software that is operable when executed to receive an audio waveform fingerprint and a client-determined location from a client device. The received audio waveform fingerprint may be compared to a database of stored audio waveform fingerprints, each stored audio waveform fingerprint associated with an object in an object database. One or more matching audio waveform fingerprints may be found from a comparison set of audio waveform fingerprints obtained from the audio waveform fingerprint database. Location information associated with a location of the client device may be determined, and the location information may be sent to the client device. The client device may be operable to update the client-determined location based at least in part on the location information.
    Type: Grant
    Filed: April 18, 2014
    Date of Patent: March 1, 2016
    Assignee: Facebook, Inc.
    Inventors: Matthew Nicholas Papakipos, David Harry Garcia
  • Patent number: 9263034
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving voice queries, obtaining, for one or more of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query, generating, for the one or more voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query, selecting a subset of the one or more voice queries based on the posterior recognition confidence measures, and adapting an acoustic model using the subset of the voice queries.
    Type: Grant
    Filed: July 13, 2010
    Date of Patent: February 16, 2016
    Assignee: Google Inc.
    Inventors: Brian Strope, Douglas H. Beeferman
  • Patent number: 9251796
    Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset to be in synchrony to the media stream being rendered by the media rendering source.
    Type: Grant
    Filed: August 21, 2014
    Date of Patent: February 2, 2016
    Assignee: Shazam Entertainment Ltd.
    Inventor: Avery Li-Chun Wang
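    The real-time offset computation in patent 9251796's abstract can be written as one line of arithmetic; the Python sketch below combines the identified time offset, the sample's timestamp, the present time, and an optional timescale ratio. The exact formula shown is an assumption consistent with the abstract, not a quotation of the claims.

      import time

      def real_time_offset(time_offset, sample_timestamp, now=None, timescale_ratio=1.0):
          now = time.time() if now is None else now
          # Position in the media stream at the moment of sampling, advanced by the
          # (possibly speed-adjusted) wall-clock time elapsed since then.
          return time_offset + timescale_ratio * (now - sample_timestamp)

      if __name__ == "__main__":
          sampled_at = time.time() - 5.0        # sample captured five seconds ago
          position = real_time_offset(time_offset=93.2, sample_timestamp=sampled_at)
          print(f"render second stream from {position:.1f}s")  # about 98.2 s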
  • Patent number: 9218807
    Abstract: A system and method provide acoustic training of a voice or speech recognition engine and/or voice or speech recognition software application. Instead of requiring a user to read from a prepared or predetermined script, the system and method described herein enable acoustic training using any free text spoken phrases provided by the user directly, or by a previously recorded speech, presentation, or the like, performed by the user.
    Type: Grant
    Filed: January 7, 2011
    Date of Patent: December 22, 2015
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Eric Hon-Anderson, Robert W. Stuller
  • Patent number: 9202461
    Abstract: A set of benchmark text strings may be classified to provide a set of benchmark classifications. The benchmark text strings in the set may correspond to a benchmark corpus of benchmark utterances in a particular language. A benchmark classification distribution of the set of benchmark classifications may be determined. A respective classification for each text string in a corpus of text strings may also be determined. Text strings from the corpus of text strings may be sampled to form a training corpus of training text strings such that the classifications of the training text strings have a training text string classification distribution that is based on the benchmark classification distribution. The training corpus of training text strings may be used to train an automatic speech recognition (ASR) system.
    Type: Grant
    Filed: January 18, 2013
    Date of Patent: December 1, 2015
    Assignee: Google Inc.
    Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar, Kaisuke Nakajima, Daniel Martin Bikel
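    A rough Python sketch of the corpus-shaping step in patent 9202461's abstract: text strings are sampled so that the classification distribution of the training corpus follows the benchmark classification distribution. The quota-based sampling procedure below is one way to achieve that and is an assumption; the patent does not prescribe it.

      import random
      from collections import defaultdict

      def sample_to_distribution(classified_strings, benchmark_dist, corpus_size, seed=0):
          """classified_strings: [(text, class)]; benchmark_dist: {class: fraction}."""
          random.seed(seed)
          by_class = defaultdict(list)
          for text, cls in classified_strings:
              by_class[cls].append(text)
          training = []
          for cls, fraction in benchmark_dist.items():
              # Take a per-class quota proportional to the benchmark distribution.
              quota = min(int(round(fraction * corpus_size)), len(by_class[cls]))
              training.extend(random.sample(by_class[cls], quota))
          return training

      if __name__ == "__main__":
          strings = [(f"navigation query {i}", "navigation") for i in range(50)] + \
                    [(f"dictation sentence {i}", "dictation") for i in range(50)]
          benchmark = {"navigation": 0.7, "dictation": 0.3}
          corpus = sample_to_distribution(strings, benchmark, corpus_size=20)
          print(sum(1 for s in corpus if s.startswith("navigation")), "navigation of", len(corpus))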
  • Patent number: 9195913
    Abstract: This method of configuring a device for detecting a situation, from among a set of situations in which it is possible to find a physical system observed by at least one sensor, comprises the following steps: receiving (102) a training sequence corresponding to a determined situation of the physical system; determining (118) parameters of a statistical hidden Markov model recorded on the detection device and related to the determined situation, based on a prior initialization (104-116) of these parameters. The prior initialization (104-116) comprises the following steps: determining (104, 106) multiple probability distributions from the training sequence; distributing (108-114) the determined probability distributions between the hidden states of the statistical model being used; and initializing the parameters of the statistical model being used from representative probability distributions determined for each hidden state of the statistical model being used.
    Type: Grant
    Filed: August 31, 2011
    Date of Patent: November 24, 2015
    Assignee: Commissariat à l'énergie atomique et aux énergies alternatives
    Inventor: Pierre Jallon
  • Patent number: 9190050
    Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
    Type: Grant
    Filed: April 3, 2014
    Date of Patent: November 17, 2015
    Assignee: MModal IP LLC
    Inventors: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch
  • Patent number: 9177552
    Abstract: Methods and systems for setting selected automatic speech recognition parameters are described. A data set associated with operation of a speech recognition application is defined and includes: i. recognition states characterizing the semantic progression of a user interaction with the speech recognition application, and ii. recognition outcomes associated with each recognition state. For a selected user interaction with the speech recognition application, an application cost function is defined that characterizes an estimated cost of the user interaction for each recognition outcome. For one or more system performance parameters indirectly related to the user interaction, the parameters are set to values which optimize the cost of the user interaction over the recognition states.
    Type: Grant
    Filed: February 3, 2012
    Date of Patent: November 3, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Jeffrey N. Marcus
  • Patent number: 9159317
    Abstract: A system and a method recognize speech including a sequence of words. A set of interpretations of the speech is generated using an acoustic model and a language model, and, for each interpretation, a score representing correctness of an interpretation in representing the sequence of words is determined to produce a set of scores. Next, the set of scores is updated based on a consistency of each interpretation with a constraint determined in response to receiving a word sequence constraint.
    Type: Grant
    Filed: June 14, 2013
    Date of Patent: October 13, 2015
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Bret Harsham, John R. Hershey
  • Patent number: 9159318
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
    Type: Grant
    Filed: August 26, 2014
    Date of Patent: October 13, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
  • Patent number: 9141855
    Abstract: Systems, apparatus and methods are described related to an accelerated object detection filter using a video estimation module.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: September 22, 2015
    Assignee: Intel Corporation
    Inventors: Lin Xu, Yangzhou Du, Jianguo Li, Qiang Li, Ya-Ti Peng, Yi-Jen Chiu
  • Patent number: 9123338
    Abstract: Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: September 1, 2015
    Assignee: Google Inc.
    Inventors: Jason Sanders, Gabriel Taubman, John J. Lee
  • Patent number: 9113190
    Abstract: A processor-implemented method, system and computer readable medium for intelligently controlling the power level of an electronic device in a multimedia system based on user intent, is provided. The method includes receiving data relating to a first user interaction with a device in a multimedia system. The method includes determining if the first user interaction corresponds to a user's intent to interact with the device. The method then includes setting a power level for the device based on the first user interaction. The method further includes receiving data relating to a second user interaction with the device. The method then includes altering the power level of the device based on the second user interaction to activate the device for the user.
    Type: Grant
    Filed: June 4, 2010
    Date of Patent: August 18, 2015
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: John Clavin, John Tardif
  • Patent number: 9075444
    Abstract: An information input apparatus includes an observation unit that observes an environment including a user and one or more apparatuses to be controlled and includes a sensor; a learning unit that separates a foreground including the user and the one or more apparatuses to be controlled and a background including the environment except for the foreground from observation data obtained by the observation unit and learns three-dimensional models of the foreground and the background; a state estimation unit that estimates positions and postures of already modeled foregrounds in the environment; a user recognition unit that identifies fingers of the user from the foreground and recognizes a shape, position, and posture of the fingers; and an apparatus control unit that outputs a control command to the one or more apparatuses to be controlled on the basis of the recognized shape, position, and posture of the fingers.
    Type: Grant
    Filed: February 13, 2013
    Date of Patent: July 7, 2015
    Assignee: SONY CORPORATION
    Inventors: Kuniaki Noda, Hirotaka Suzuki, Haruto Takeda, Yusuke Watanabe