Patents Examined by Thuykhanh Le

Information processing device, information processing method, computer program product, and recognition system

Patent number: 10319373

Abstract: An information processing device includes a phonetic converting unit, an HMM converting unit, and a searching unit. The phonetic converting unit converts a phonetic symbol sequence into a hidden Markov model (HMM) state sequence in which states of an HMM are aligned. The HMM converting unit converts the HMM state sequence into a score vector sequence indicating the degree of similarity to a specific pronunciation using a similarity matrix defining the similarity between the states of the HMM. The searching unit searches for a path having a better score for the score vector sequence than that of the other paths out of paths included in a search network and outputs a phonetic symbol sequence corresponding to the retrieved path.

Type: Grant

Filed: December 23, 2016

Date of Patent: June 11, 2019

Assignee: Kabushiki Kaisha Toshiba

Inventor: Manabu Nagao

Audio human interactive proof based on text-to-speech and semantics

Patent number: 10319363

Abstract: The text-to-speech audio HIP technique described herein in some embodiments uses different correlated or uncorrelated words or sentences generated via a text-to-speech engine as audio HIP challenges. The technique can apply different effects in the text-to-speech synthesizer speaking a sentence to be used as a HIP challenge string. The different effects can include, for example, spectral frequency warping; vowel duration warping; background addition; echo addition; and varying the time duration between words, among others. In some embodiments the technique varies the set of parameters to prevent using Automated Speech Recognition tools from using previously used audio HIP challenges to learn a model which can then be used to recognize future audio HIP challenges generated by the technique. Additionally, in some embodiments the technique introduces the requirement of semantic understanding in HIP challenges.

Type: Grant

Filed: February 17, 2012

Date of Patent: June 11, 2019

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Yao Qian, Frank Kao-Ping Soong, Bin Benjamin Zhu

Fast out-of-vocabulary search in automatic speech recognition systems

Patent number: 10290301

Abstract: A method including: receiving, on a computer system, a text search query, the query including one or more query words; generating, on the computer system, for each query word in the query, one or more anchor segments within a plurality of speech recognition processed audio files, the one or more anchor segments identifying possible locations containing the query word; post-processing, on the computer system, the one or more anchor segments, the post-processing including: expanding the one or more anchor segments; sorting the one or more anchor segments; and merging overlapping ones of the one or more anchor segments; and searching, on the computer system, the post-processed one or more anchor segments for instances of at least one of the one or more query words using a constrained grammar.

Type: Grant

Filed: January 9, 2017

Date of Patent: May 14, 2019

Inventors: Amir Lev-Tov, Avraham Faizakof, Yochai Konig
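
The post-processing step in the abstract of patent 10290301 (expand the anchor segments, sort them, merge the overlapping ones) resembles a standard interval-merge routine. The following is a minimal Python sketch of that idea only, not the patented method. The `(start, end)` representation in seconds, the function name `post_process`, and the fixed `margin` are assumptions introduced for illustration; the abstract does not specify them.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds; an assumed representation


def post_process(segments: List[Segment], margin: float = 0.5) -> List[Segment]:
    """Expand each segment by `margin`, sort by start time, merge overlaps."""
    # Expand: pad both ends, clamping the start at zero.
    expanded = [(max(0.0, s - margin), e + margin) for s, e in segments]
    # Sort by start time so that overlapping segments become adjacent.
    expanded.sort()
    merged: List[Segment] = []
    for start, end in expanded:
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Merging before the constrained-grammar search means each stretch of audio would be searched once, even when several query words point at nearly the same location.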

Device for generating aligned corpus based on unsupervised-learning alignment, method thereof, device for analyzing destructive expression morpheme using aligned corpus, and method for analyzing morpheme thereof

Patent number: 10282413

Abstract: Disclosed is a device for generating an aligned corpus based on unsupervised-learning alignment, and a method thereof, a device for analyzing a destructive expression morpheme using an aligned corpus, and a method for analyzing a morpheme thereof. The morpheme analyzing device includes a knowledge database and an analyzer. The knowledge database includes an aligned corpus for storing a plurality of knowledge information sets used for a per-language morpheme analysis, and stores a morpheme dictionary for storing morpheme information corresponding to a normal expression and normal expression information corresponding to a destructive expression (here, the destructive expression represents an expression that is erroneous in orthography or is not normalized and standardized).

Type: Grant

Filed: August 27, 2014

Date of Patent: May 7, 2019

Assignee: SYSTRAN INTERNATIONAL CO., LTD.

Inventor: Chang Jin Ji

Performance modification based on aggregation of audience traits and natural language feedback

Patent number: 10282409

Abstract: Mechanisms, in a natural language processing (NLP) system comprising a processor and a memory are provided. The NLP system receives a plurality of communications from a plurality of devices associated with audience members of a real-time presentation by a presenter of the presentation while the presentation is being presented. The NLP system analyzes the plurality of communications using natural language processing techniques, to identify attributes of the audience members and generates an aggregate audience model based on the identified attributes of the audience members. The aggregate audience model specifies an aggregate of attributes of the audience. Moreover, the NLP system outputs, to the presenter via a device associated with the presenter, a suggestion output identifying one or more portions of the presentation that are currently of interest to the audience members based on the aggregate audience model.

Type: Grant

Filed: December 11, 2014

Date of Patent: May 7, 2019

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Laura J. Rodriguez

System, method, and recording medium for natural language learning

Patent number: 10282411

Abstract: A natural language learning method, system, and non-transitory computer readable medium include analyzing a corpus of sentences stored in a database to identify an internal structure of words in the corpus of sentences, creating a plurality of new words that are a combination of the internal structure of a word of the words in the corpus of sentences and the word, clustering the plurality of new words created by the creating that match into a plurality of cluster groups, filtering the plurality of cluster groups to create a partial set of each of the plurality of cluster groups, and performing word embedding processing on the partial set of each of the plurality of cluster groups to obtain vectors for new words.

Type: Grant

Filed: March 31, 2016

Date of Patent: May 7, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Octavian Popescu, Vadim Sheinin

Multi-domain natural language processing architecture

Patent number: 10282419

Abstract: An arrangement and corresponding method are described for multi-domain natural language processing. Multiple parallel domain pipelines are used for processing a natural language input. Each domain pipeline represents a different specific subject domain of related concepts. Each domain pipeline includes a mention module that processes the natural language input using natural language understanding (NLU) to determine a corresponding list of mentions, and an interpretation generator that receives the list of mentions and produces a rank-ordered domain output set of sentence-level interpretation candidates. A global evidence ranker receives the domain output sets from the domain pipelines and produces an overall rank-ordered final output set of sentence-level interpretations.

Type: Grant

Filed: December 12, 2012

Date of Patent: May 7, 2019

Assignee: Nuance Communications, Inc.

Inventors: Matthieu Hebert, Jean-Philippe Robichaud, Christopher M. Parisien, Nicolae Duta, Jerome Tremblay, Amjad Almahairi, Lakshmish Kaushik, Maryse Boisvert

Document processing device, document processing method, program, and information storage medium

Patent number: 10275445

Abstract: Displaying supplemental information for an element in a document based on changes in a user's ability to read the document. A document processing device configured to: acquire information on a document including a plurality of words; acquire pieces of supplemental information being linked with the plurality of words; decide whether or not a piece of supplemental information linked with corresponding one of the plurality of words is to be displayed based on a frequency with which each of the plurality of words has appeared; and control displaying the plurality of words and the pieces of supplemental information. In the deciding, it is decided whether or not the corresponding one of the piece of supplemental information is to be displayed based on a frequency with which each of the plurality of words has been displayed along with the piece of supplemental information.

Type: Grant

Filed: March 19, 2013

Date of Patent: April 30, 2019

Assignee: RAKUTEN, INC.

Inventor: Kazuyoshi Hayase
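
The display decision in the abstract of patent 10275445 (show a word's supplemental information depending on how often the word has already been displayed with it) can be illustrated with a simple counter. This is a hedged Python sketch under stated assumptions: the names `annotate`, `glossary`, `shown`, and the threshold `max_shown` are hypothetical and do not come from the patent.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def annotate(words: List[str], glossary: Dict[str, str], shown: Counter,
             max_shown: int = 2) -> List[Tuple[str, Optional[str]]]:
    """Pair each word with its supplemental text, or None once it is familiar."""
    out: List[Tuple[str, Optional[str]]] = []
    for word in words:
        note = glossary.get(word)
        if note is not None and shown[word] < max_shown:
            shown[word] += 1  # count displays of the word together with its note
            out.append((word, note))
        else:
            # No note exists, or the reader has already seen it often enough.
            out.append((word, None))
    return out
```

Keeping `shown` outside the function lets the count persist across documents, which matches the idea of tracking a reader's growing familiarity over time.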

Contextual hotwords

Patent number: 10276161

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for contextual hotwords are disclosed. In one aspect, a method, during a boot process of a computing device, includes the actions of determining, by a computing device, a context associated with the computing device. The actions further include, based on the context associated with the computing device, determining a hotword. The actions further include, after determining the hotword, receiving audio data that corresponds to an utterance. The actions further include determining that the audio data includes the hotword. The actions further include, in response to determining that the audio data includes the hotword, performing an operation associated with the hotword.

Type: Grant

Filed: December 27, 2016

Date of Patent: April 30, 2019

Assignee: Google LLC

Inventors: Christopher Thaddeus Hughes, Ignacio Lopez Moreno, Aleksandar Kracun

Dialog simulation

Patent number: 10272349

Abstract: A method of producing simulated dialog outputs selected phrases in a sequence defined by a selected dialog model. When a state change is recorded, one or more dialog models which have non-zero state change type weights corresponding to the recorded state change type are identified, and one of the identified dialog models is selected probabilistically with influence from the non-zero state change type weights. One or more unique characters are mapped to character indices of the selected dialog model, and one or more phrases defined in the one or more mapped unique characters are identified which have non-zero phrase type weights corresponding to the phrase types of the selected dialog model. For each phrase type defined in the selected dialog model, at least one of the phrases is selected probabilistically with influence from the non-zero phrase type weights.

Type: Grant

Filed: September 7, 2016

Date of Patent: April 30, 2019

Inventor: Isaac Davenport
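
The abstract of patent 10272349 relies twice on selecting an item "probabilistically with influence from" non-zero weights. One common way to realize such a choice is weighted random sampling, sketched below in Python. The function name `pick_weighted` and the dictionary layout are assumptions for illustration; the patent may weight or combine candidates differently.

```python
import random
from typing import Dict, Optional


def pick_weighted(weights: Dict[str, float], rng: random.Random) -> Optional[str]:
    """Pick a key with probability proportional to its weight, skipping zero weights."""
    # Only candidates with a non-zero (positive) weight are eligible.
    candidates = {name: w for name, w in weights.items() if w > 0}
    if not candidates:
        return None
    names = list(candidates)
    return rng.choices(names, weights=[candidates[n] for n in names], k=1)[0]
```

Passing an explicit `random.Random` instance keeps the selection reproducible when a seed is fixed, which is convenient for testing simulated dialog.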

Desired signal spotting in noisy, flawed environments

Patent number: 10269376

Abstract: There are disclosed devices, systems and methods for desired signal spotting in noisy, flawed environments by identifying a signal to be spotted, identifying a target confidence level, and then passing a pool of cabined arrays through a comparator to detect the identified signal, wherein the cabined arrays are derived from respective distinct environments. The arrays may include plural converted samples, each converted sample including a product of a conversion of a respective original sample, the conversion including filtering noise and transforming the original sample from a first form to a second form. Detecting may include measuring a confidence of the presence of the identified signal in each of plural converted samples using correlation of the identified signal to bodies of known matching samples. If the confidence for a given converted sample satisfies the target confidence level, the given sample is flagged.

Type: Grant

Filed: June 28, 2018

Date of Patent: April 23, 2019

Assignee: Invoca, Inc.

Inventors: Sean Michael Storlie, Victor Jara Borda, Michael Kingsley McCourt, Jr., Leland W. Kirchhoff, Colin Denison Kelley, Nicholas James Burwell

Methods of decoding speech from brain activity data and devices for practicing the same

Patent number: 10264990

Abstract: The present disclosure provides methods of decoding speech from brain activity data. Aspects of the methods include receiving brain speech activity data from a subject, and processing the brain speech activity data to output speech feature data. Also provided are devices and systems for practicing the subject methods.

Type: Grant

Filed: October 25, 2013

Date of Patent: April 23, 2019

Assignee: The Regents of the University of California

Inventors: Brian Pasley, Robert T. Knight

Detecting pause in audible input to device

Patent number: 10269377

Abstract: A device includes a processor and a memory accessible to the processor and bearing instructions executable by the processor to process an audible input sequence provided by a user of the device, determine that a pause in providing the audible input sequence has occurred at least partially based on a first signal from at least one camera communicating with the device, cease to process the audible input sequence responsive to a determination that the pause has occurred, determine that providing the audible input sequence has resumed based at least partially on a second signal from the camera, and resume processing of the audible input sequence responsive to a determination that providing the audible input sequence has resumed.

Type: Grant

Filed: August 31, 2018

Date of Patent: April 23, 2019

Assignee: LENOVO (SINGAPORE) PTE. LTD.

Inventors: Russell Speight VanBlon, Suzanne Marion Beaumont, Rod David Waltermann

System and method for detecting phonetically similar imposter phrases

Patent number: 10269352

Abstract: A system and method for detecting phonetically similar imposter phrases may include using automatic speech recognition (ASR) to search for a first phrase in a set of objects; producing a list of references by searching for the first phrase in the set of objects using phonetic search; using output produced by the ASR to determine whether or not a reference in the list points to a phrase that is the same as the first phrase; and if it is determined that the reference points to a second phrase that is different from the first phrase then marking the second phrase as a potential cause for a phrase search false positive.

Type: Grant

Filed: December 23, 2016

Date of Patent: April 23, 2019

Assignee: NICE Ltd.

Inventors: Robert William Morris, Neeraj Singh Verma

Method and apparatus for recognizing speech by lip reading

Patent number: 10204626

Abstract: A dictation device includes: an audio input device configured to receive a voice utterance including a plurality of words; a video input device configured to receive video of lip motion during the voice utterance; a memory portion; a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and a transceiver for sending the first data packets to a server end device and receiving second data packets including combined dictation based upon the audio stream and the video stream from the server end device. In the combined dictation, first dictation generated based upon the audio stream has been corrected by second dictation generated based upon the video stream.

Type: Grant

Filed: May 10, 2018

Date of Patent: February 12, 2019

Assignee: Panasonic Intellectual Property Corporation of America

Inventors: Yuichiro Takayanagi, Masashi Kusaka

Linear scoring for low power wake on voice

Patent number: 10170115

Abstract: Key phrase detection techniques for applications such as wake on voice are discussed. The techniques include performing a vectorized operation on a multiple element acoustic score vector for a current time instance, including a single state rejection model score and scores for a multiple state key phrase model, and a multiple element state score vector for a previous time instance, including a previous state score for the single state rejection model and previous state scores for the multiple state key phrase model, to generate a multiple element score summation vector, and performing a second vectorized operation on the multiple element score summation vector to determine a multiple element state score vector for the current time instance. The multiple element state score vector for the current time instance may then be evaluated to determine whether received audio input includes a key phrase corresponding to the multiple state key phrase model.

Type: Grant

Filed: July 12, 2018

Date of Patent: January 1, 2019

Assignee: Intel Corporation

Inventors: Tobias Bocklet, Tomasz Dorau, Przemyslaw Sobon, Przemyslaw Tomaszewski
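
The two-step update in the abstract of patent 10170115 can be read as an element-wise addition followed by a second element-wise operation on the result. The Python sketch below shows one plausible reading, assuming the second operation is a maximum over each element and its left neighbour (stay in a state or advance from the previous one). That choice, the function name `update_state_scores`, and the list layout (index 0 for the rejection state) are assumptions, not details confirmed by the abstract. Plain lists stand in for hardware vector operations.

```python
from typing import List


def update_state_scores(acoustic: List[float], prev_state: List[float]) -> List[float]:
    """One time step: element-wise sum, then max with the left neighbour.

    Index 0 is the single rejection state; indices 1.. are key phrase states.
    """
    # First element-wise operation: the score summation vector.
    summed = [a + s for a, s in zip(acoustic, prev_state)]
    # Second element-wise operation (assumed): each key phrase state keeps the
    # better of staying in place (summed[i]) or advancing from the previous
    # state (summed[i - 1]). The rejection state has no predecessor here.
    return [summed[0]] + [max(summed[i], summed[i - 1]) for i in range(1, len(summed))]
```

Under this reading, the score of the last key phrase state after each step would be compared against the rejection score to decide whether the key phrase was heard.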

Detecting pause in audible input to device

Patent number: 10163455

Abstract: A device includes a processor and a memory accessible to the processor and bearing instructions executable by the processor to process an audible input sequence provided by a user of the device, determine that a pause in providing the audible input sequence has occurred at least partially based on a first signal from at least one camera communicating with the device, cease to process the audible input sequence responsive to a determination that the pause has occurred, determine that providing the audible input sequence has resumed based at least partially on a second signal from the camera, and resume processing of the audible input sequence responsive to a determination that providing the audible input sequence has resumed.

Type: Grant

Filed: December 3, 2013

Date of Patent: December 25, 2018

Assignee: Lenovo (Singapore) Pte. Ltd.

Inventors: Russell Speight VanBlon, Suzanne Marion Beaumont, Rod David Waltermann

Speech model retrieval in distributed speech recognition systems

Patent number: 10152973

Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that it may be used to update the models and data as it becomes available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.

Type: Grant

Filed: November 16, 2015

Date of Patent: December 11, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Bjorn Hoffmeister, Hugh Evan Secker-Walker, Jeffrey Cornelius O'Neill

Comfort noise addition for modeling background noise at low bit-rates

Patent number: 10147432

Abstract: The invention provides a decoder being configured for processing an encoded audio bitstream, wherein the decoder includes: a bitstream decoder configured to derive a decoded audio signal from the bitstream, wherein the decoded audio signal includes at least one decoded frame; a noise estimation device configured to produce a noise estimation signal containing an estimation of the level and/or the spectral shape of a noise in the decoded audio signal; a comfort noise generating device configured to derive a comfort noise signal from the noise estimation signal; and a combiner configured to combine the decoded frame of the decoded audio signal and the comfort noise signal in order to obtain an audio output signal.

Type: Grant

Filed: June 19, 2015

Date of Patent: December 4, 2018

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Guillaume Fuchs, Anthony Lombard, Emmanuel Ravelli, Stefan Doehla, Jeremie Lecomte, Martin Dietz

Automatic censoring of objectionable song lyrics in audio

Patent number: 10141010

Abstract: Embodiments relate to censoring audio data. A censoring system receives audio data including a song tag and amplitude data as a function of time. The amplitude data represents spoken words occurring over a duration, as well as non-spoken word sound overlapping with some of the spoken words during the duration. The system accesses a set of song lyrics and processes the set of song lyrics and the amplitude data together to identify timestamps in the amplitude data. These timestamps indicate a time during the duration when one of the words from the lyrics begins in the amplitude data. The system compares the words in the set of song lyrics to a blacklist and adjusts the amplitude data at the timestamps of blacklisted word occurrences to render the audio at the blacklisted words incomprehensible. The system outputs the adjusted amplitude data.

Type: Grant

Filed: October 1, 2015

Date of Patent: November 27, 2018

Assignee: Google LLC

Inventor: Eric Paul Nichols
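
The final steps in the abstract of patent 10141010 (compare aligned lyric words to a blacklist, then adjust the amplitude data at the matching timestamps) can be sketched in a few lines of Python. This sketch assumes the alignment has already been computed and is given as `(word, start_sample, end_sample)` tuples, and it silences the span by zeroing samples. Both the data layout and the function name `censor` are hypothetical, and zeroing is only one possible adjustment.

```python
from typing import List, Set, Tuple

# (word, start_sample, end_sample): an assumed alignment format
AlignedWord = Tuple[str, int, int]


def censor(samples: List[float], aligned: List[AlignedWord],
           blacklist: Set[str]) -> List[float]:
    """Return a copy of `samples` with blacklisted word spans silenced."""
    out = list(samples)  # leave the input amplitude data untouched
    for word, start, end in aligned:
        if word.lower() in blacklist:
            # Clamp the span to the valid sample range before overwriting.
            for i in range(max(0, start), min(end, len(out))):
                out[i] = 0.0  # silence; other adjustments (e.g. a tone) are possible
    return out
```

The blacklist is expected in lower case here, since each aligned word is lower-cased before the lookup.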
