Patents Examined by Thuykhanh Le
  • Patent number: 10319373
    Abstract: An information processing device includes a phonetic converting unit, an HMM converting unit, and a searching unit. The phonetic converting unit converts a phonetic symbol sequence into a hidden Markov model (HMM) state sequence in which states of an HMM are aligned. The HMM converting unit converts the HMM state sequence into a score vector sequence indicating the degree of similarity to a specific pronunciation using a similarity matrix defining the similarity between the states of the HMM. The searching unit searches for a path having a better score for the score vector sequence than that of the other paths out of paths included in a search network and outputs a phonetic symbol sequence corresponding to the retrieved path.
    Type: Grant
    Filed: December 23, 2016
    Date of Patent: June 11, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Manabu Nagao
  • Patent number: 10319363
    Abstract: The text-to-speech audio HIP technique described herein in some embodiments uses different correlated or uncorrelated words or sentences generated via a text-to-speech engine as audio HIP challenges. The technique can apply different effects in the text-to-speech synthesizer speaking a sentence to be used as a HIP challenge string. The different effects can include, for example, spectral frequency warping; vowel duration warping; background addition; echo addition; and varying the time duration between words, among others. In some embodiments the technique varies the set of parameters to prevent Automated Speech Recognition tools from using previously used audio HIP challenges to learn a model which can then be used to recognize future audio HIP challenges generated by the technique. Additionally, in some embodiments the technique introduces the requirement of semantic understanding in HIP challenges.
    Type: Grant
    Filed: February 17, 2012
    Date of Patent: June 11, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yao Qian, Frank Kao-Ping Soong, Bin Benjamin Zhu
  • Patent number: 10290301
    Abstract: A method including: receiving, on a computer system, a text search query, the query including one or more query words; generating, on the computer system, for each query word in the query, one or more anchor segments within a plurality of speech recognition processed audio files, the one or more anchor segments identifying possible locations containing the query word; post-processing, on the computer system, the one or more anchor segments, the post-processing including: expanding the one or more anchor segments; sorting the one or more anchor segments; and merging overlapping ones of the one or more anchor segments; and searching, on the computer system, the post-processed one or more anchor segments for instances of at least one of the one or more query words using a constrained grammar.
    Type: Grant
    Filed: January 9, 2017
    Date of Patent: May 14, 2019
    Inventors: Amir Lev-Tov, Avraham Faizakof, Yochai Konig
  • Patent number: 10282413
    Abstract: Disclosed is a device for generating an aligned corpus based on unsupervised-learning alignment, and a method thereof, a device for analyzing a destructive expression morpheme using an aligned corpus, and a method for analyzing a morpheme thereof. The morpheme analyzing device includes a knowledge database and an analyzer. The knowledge database includes an aligned corpus for storing a plurality of knowledge information sets used for a per-language morpheme analysis, and stores a morpheme dictionary for storing morpheme information corresponding to a normal expression and normal expression information corresponding to a destructive expression (here, the destructive expression represents an expression that is erroneous in orthography or is not normalized and standardized).
    Type: Grant
    Filed: August 27, 2014
    Date of Patent: May 7, 2019
    Assignee: SYSTRAN INTERNATIONAL CO., LTD.
    Inventor: Chang Jin Ji
  • Patent number: 10282409
    Abstract: Mechanisms in a natural language processing (NLP) system comprising a processor and a memory are provided. The NLP system receives a plurality of communications from a plurality of devices associated with audience members of a real-time presentation by a presenter of the presentation while the presentation is being presented. The NLP system analyzes the plurality of communications using natural language processing techniques to identify attributes of the audience members and generates an aggregate audience model based on the identified attributes of the audience members. The aggregate audience model specifies an aggregate of attributes of the audience. Moreover, the NLP system outputs, to the presenter via a device associated with the presenter, a suggestion output identifying one or more portions of the presentation that are currently of interest to the audience members based on the aggregate audience model.
    Type: Grant
    Filed: December 11, 2014
    Date of Patent: May 7, 2019
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Laura J. Rodriguez
  • Patent number: 10282411
    Abstract: A natural language learning method, system, and non-transitory computer readable medium include analyzing a corpus of sentences stored in a database to identify an internal structure of words in the corpus of sentences, creating a plurality of new words that are a combination of the internal structure of a word of the words in the corpus of sentences and the word, clustering the plurality of new words created by the creating that match into a plurality of cluster groups, filtering the plurality of cluster groups to create a partial set of each of the plurality of cluster groups, and performing word embedding processing on the partial set of each of the plurality of cluster groups to obtain vectors for new words.
    Type: Grant
    Filed: March 31, 2016
    Date of Patent: May 7, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Octavian Popescu, Vadim Sheinin
  • Patent number: 10282419
    Abstract: An arrangement and corresponding method are described for multi-domain natural language processing. Multiple parallel domain pipelines are used for processing a natural language input. Each domain pipeline represents a different specific subject domain of related concepts. Each domain pipeline includes a mention module that processes the natural language input using natural language understanding (NLU) to determine a corresponding list of mentions, and an interpretation generator that receives the list of mentions and produces a rank-ordered domain output set of sentence-level interpretation candidates. A global evidence ranker receives the domain output sets from the domain pipelines and produces an overall rank-ordered final output set of sentence-level interpretations.
    Type: Grant
    Filed: December 12, 2012
    Date of Patent: May 7, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Matthieu Hebert, Jean-Philippe Robichaud, Christopher M. Parisien, Nicolae Duta, Jerome Tremblay, Amjad Almahairi, Lakshmish Kaushik, Maryse Boisvert
  • Patent number: 10275445
    Abstract: Displaying supplemental information for an element in a document based on changes in a user's ability to read the document. A document processing device configured to: acquire information on a document including a plurality of words; acquire pieces of supplemental information being linked with the plurality of words; decide whether or not a piece of supplemental information linked with a corresponding one of the plurality of words is to be displayed based on a frequency with which each of the plurality of words has appeared; and control displaying the plurality of words and the pieces of supplemental information. In the deciding, it is decided whether or not the corresponding one of the pieces of supplemental information is to be displayed based on a frequency with which each of the plurality of words has been displayed along with the piece of supplemental information.
    Type: Grant
    Filed: March 19, 2013
    Date of Patent: April 30, 2019
    Assignee: RAKUTEN, INC.
    Inventor: Kazuyoshi Hayase
  • Patent number: 10276161
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for contextual hotwords are disclosed. In one aspect, a method, during a boot process of a computing device, includes the actions of determining, by a computing device, a context associated with the computing device. The actions further include, based on the context associated with the computing device, determining a hotword. The actions further include, after determining the hotword, receiving audio data that corresponds to an utterance. The actions further include determining that the audio data includes the hotword. The actions further include, in response to determining that the audio data includes the hotword, performing an operation associated with the hotword.
    Type: Grant
    Filed: December 27, 2016
    Date of Patent: April 30, 2019
    Assignee: Google LLC
    Inventors: Christopher Thaddeus Hughes, Ignacio Lopez Moreno, Aleksandar Kracun
  • Patent number: 10272349
    Abstract: A method of producing simulated dialog outputs selected phrases in a sequence defined by a selected dialog model. When a state change is recorded, one or more dialog models which have non-zero state change type weights corresponding to the recorded state change type are identified and one of the identified dialog models is selected probabilistically with influence from the non-zero state change type weights. One or more unique characters are mapped to character indices of the selected dialog model and one or more phrases defined in the one or more mapped unique characters are identified which have non-zero phrase type weights corresponding to the phrase types of the selected dialog model. For each phrase type defined in the selected dialog model, at least one of the phrases is selected probabilistically with influence from the non-zero phrase type weights.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: April 30, 2019
    Inventor: Isaac Davenport
  • Patent number: 10269376
    Abstract: There are disclosed devices, systems, and methods for desired signal spotting in noisy, flawed environments by identifying a signal to be spotted, identifying a target confidence level, and then passing a pool of cabined arrays through a comparator to detect the identified signal, wherein the cabined arrays are derived from respective distinct environments. The arrays may include plural converted samples, each converted sample including a product of a conversion of a respective original sample, the conversion including filtering noise and transforming the original sample from a first form to a second form. Detecting may include measuring a confidence of the presence of the identified signal in each of plural converted samples using correlation of the identified signal to bodies of known matching samples. If the confidence for a given converted sample satisfies the target confidence level, the given sample is flagged.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: April 23, 2019
    Assignee: Invoca, Inc.
    Inventors: Sean Michael Storlie, Victor Jara Borda, Michael Kingsley McCourt, Jr., Leland W. Kirchhoff, Colin Denison Kelley, Nicholas James Burwell
  • Patent number: 10264990
    Abstract: The present disclosure provides methods of decoding speech from brain activity data. Aspects of the methods include receiving brain speech activity data from a subject, and processing the brain speech activity data to output speech feature data. Also provided are devices and systems for practicing the subject methods.
    Type: Grant
    Filed: October 25, 2013
    Date of Patent: April 23, 2019
    Assignee: The Regents of the University of California
    Inventors: Brian Pasley, Robert T. Knight
  • Patent number: 10269377
    Abstract: A device includes a processor and a memory accessible to the processor and bearing instructions executable by the processor to process an audible input sequence provided by a user of the device, determine that a pause in providing the audible input sequence has occurred at least partially based on a first signal from at least one camera communicating with the device, cease to process the audible input sequence responsive to a determination that the pause has occurred, determine that providing the audible input sequence has resumed at least partially based on a second signal from the camera, and resume processing of the audible input sequence responsive to a determination that providing the audible input sequence has resumed.
    Type: Grant
    Filed: August 31, 2018
    Date of Patent: April 23, 2019
    Assignee: LENOVO (SINGAPORE) PTE. LTD.
    Inventors: Russell Speight VanBlon, Suzanne Marion Beaumont, Rod David Waltermann
  • Patent number: 10269352
    Abstract: A system and method for detecting phonetically similar imposter phrases may include using automatic speech recognition (ASR) to search for a first phrase in a set of objects; producing a list of references by searching for the first phrase in the set of objects using phonetic search; using output produced by the ASR to determine whether or not a reference in the list points to a phrase that is the same as the first phrase; and if it is determined that the reference points to a second phrase that is different from the first phrase then marking the second phrase as a potential cause for a phrase search false positive.
    Type: Grant
    Filed: December 23, 2016
    Date of Patent: April 23, 2019
    Assignee: NICE Ltd.
    Inventors: Robert William Morris, Neeraj Singh Verma
  • Patent number: 10204626
    Abstract: A dictation device includes: an audio input device configured to receive a voice utterance including a plurality of words; a video input device configured to receive video of lip motion during the voice utterance; a memory portion; a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and a transceiver for sending the first data packets to a server end device and receiving second data packets including combined dictation based upon the audio stream and the video stream from the server end device. In the combined dictation, first dictation generated based upon the audio stream has been corrected by second dictation generated based upon the video stream.
    Type: Grant
    Filed: May 10, 2018
    Date of Patent: February 12, 2019
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Yuichiro Takayanagi, Masashi Kusaka
  • Patent number: 10170115
    Abstract: Key phrase detection techniques for applications such as wake on voice are discussed. The techniques include performing a vectorized operation on a multiple element acoustic score vector for a current time instance including a single state rejection model score and scores for a multiple state key phrase model and a multiple element state score vector for a previous time instance including a previous state score for the single state rejection model and previous state scores for the multiple state key phrase model to generate a multiple element score summation vector and a second vectorized operation on the multiple element score summation vector to determine a multiple element state score vector for the current time instance. The multiple element state score vector for the current time instance may then be evaluated to determine whether received audio input includes a key phrase corresponding to the multiple state key phrase model.
    Type: Grant
    Filed: July 12, 2018
    Date of Patent: January 1, 2019
    Assignee: Intel Corporation
    Inventors: Tobias Bocklet, Tomasz Dorau, Przemyslaw Sobon, Przemyslaw Tomaszewski
  • Patent number: 10163455
    Abstract: A device includes a processor and a memory accessible to the processor and bearing instructions executable by the processor to process an audible input sequence provided by a user of the device, determine that a pause in providing the audible input sequence has occurred at least partially based on a first signal from at least one camera communicating with the device, cease to process the audible input sequence responsive to a determination that the pause has occurred, determine that providing the audible input sequence has resumed at least partially based on a second signal from the camera, and resume processing of the audible input sequence responsive to a determination that providing the audible input sequence has resumed.
    Type: Grant
    Filed: December 3, 2013
    Date of Patent: December 25, 2018
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Russell Speight VanBlon, Suzanne Marion Beaumont, Rod David Waltermann
  • Patent number: 10152973
    Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that they may be used to update the models and data as they become available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
    Type: Grant
    Filed: November 16, 2015
    Date of Patent: December 11, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Bjorn Hoffmeister, Hugh Evan Secker-Walker, Jeffrey Cornelius O'Neill
  • Patent number: 10147432
    Abstract: The invention provides a decoder being configured for processing an encoded audio bitstream, wherein the decoder includes: a bitstream decoder configured to derive a decoded audio signal from the bitstream, wherein the decoded audio signal includes at least one decoded frame; a noise estimation device configured to produce a noise estimation signal containing an estimation of the level and/or the spectral shape of a noise in the decoded audio signal; a comfort noise generating device configured to derive a comfort noise signal from the noise estimation signal; and a combiner configured to combine the decoded frame of the decoded audio signal and the comfort noise signal in order to obtain an audio output signal.
    Type: Grant
    Filed: June 19, 2015
    Date of Patent: December 4, 2018
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Guillaume Fuchs, Anthony Lombard, Emmanuel Ravelli, Stefan Doehla, Jeremie Lecomte, Martin Dietz
  • Patent number: 10141010
    Abstract: Embodiments relate to censoring audio data. A censoring system receives audio data including a song tag and amplitude data as a function of time. The amplitude data represents spoken words occurring over a duration, as well as non-spoken word sound overlapping with some of the spoken words during the duration. The system accesses a set of song lyrics and processes the set of song lyrics and the amplitude data together to identify timestamps in the amplitude data. These timestamps indicate a time during the duration when one of the words from the lyrics begins in the amplitude data. The system compares the words in the set of song lyrics to a blacklist and adjusts the amplitude data at the timestamps of blacklisted word occurrences to render the audio at the blacklisted words incomprehensible. The system outputs the adjusted amplitude data.
    Type: Grant
    Filed: October 1, 2015
    Date of Patent: November 27, 2018
    Assignee: Google LLC
    Inventor: Eric Paul Nichols
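
The censoring step in the abstract of patent 10141010 (compare lyric words to a blacklist, then adjust the amplitude data at the matching timestamps) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes the lyric-to-audio alignment has already been done and that each word carries both a start and an end time, whereas the abstract only mentions the time at which a word begins. The names `censor_amplitudes`, `WordSpan`, and all parameters are hypothetical, and silencing the samples is just one possible way to make a word incomprehensible.

```python
from typing import List, Set, Tuple

# Hypothetical alignment record: (word, start_seconds, end_seconds).
# The end time is an assumption; the abstract only describes start timestamps.
WordSpan = Tuple[str, float, float]


def censor_amplitudes(
    amplitudes: List[float],
    sample_rate: int,
    aligned_words: List[WordSpan],
    blacklist: Set[str],
) -> List[float]:
    """Return a copy of the amplitude data with blacklisted words silenced."""
    censored = list(amplitudes)
    blocked = {word.lower() for word in blacklist}
    for word, start, end in aligned_words:
        if word.lower() not in blocked:
            continue
        # Convert the word's time span to sample indices, clamped to the data.
        first = max(0, int(start * sample_rate))
        last = min(len(censored), int(end * sample_rate))
        for i in range(first, last):
            censored[i] = 0.0  # Assumed adjustment: mute the samples.
    return censored


if __name__ == "__main__":
    samples = [1.0] * 8  # Two seconds of audio at 4 samples per second.
    words = [("hello", 0.0, 0.5), ("darn", 0.5, 1.0), ("world", 1.0, 2.0)]
    print(censor_amplitudes(samples, 4, words, {"darn"}))
```

In this sketch only the samples inside a blacklisted word's span are changed, so overlapping non-spoken sound in the rest of the track is left untouched.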