Patents Examined by Thuykhanh Le
-
Patent number: 10319373
Abstract: An information processing device includes a phonetic converting unit, an HMM converting unit, and a searching unit. The phonetic converting unit converts a phonetic symbol sequence into a hidden Markov model (HMM) state sequence in which states of an HMM are aligned. The HMM converting unit converts the HMM state sequence into a score vector sequence indicating the degree of similarity to a specific pronunciation using a similarity matrix defining the similarity between the states of the HMM. The searching unit searches for a path having a better score for the score vector sequence than that of the other paths out of paths included in a search network and outputs a phonetic symbol sequence corresponding to the retrieved path.
Type: Grant
Filed: December 23, 2016
Date of Patent: June 11, 2019
Assignee: Kabushiki Kaisha Toshiba
Inventor: Manabu Nagao
-
Patent number: 10319363
Abstract: The text-to-speech audio HIP technique described herein in some embodiments uses different correlated or uncorrelated words or sentences generated via a text-to-speech engine as audio HIP challenges. The technique can apply different effects in the text-to-speech synthesizer speaking a sentence to be used as a HIP challenge string. The different effects can include, for example, spectral frequency warping; vowel duration warping; background addition; echo addition; and varying the time duration between words, among others. In some embodiments the technique varies the set of parameters to prevent using Automated Speech Recognition tools from using previously used audio HIP challenges to learn a model which can then be used to recognize future audio HIP challenges generated by the technique. Additionally, in some embodiments the technique introduces the requirement of semantic understanding in HIP challenges.
Type: Grant
Filed: February 17, 2012
Date of Patent: June 11, 2019
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Yao Qian, Frank Kao-Ping Soong, Bin Benjamin Zhu
-
Patent number: 10290301
Abstract: A method including: receiving, on a computer system, a text search query, the query including one or more query words; generating, on the computer system, for each query word in the query, one or more anchor segments within a plurality of speech recognition processed audio files, the one or more anchor segments identifying possible locations containing the query word; post-processing, on the computer system, the one or more anchor segments, the post-processing including: expanding the one or more anchor segments; sorting the one or more anchor segments; and merging overlapping ones of the one or more anchor segments; and searching, on the computer system, the post-processed one or more anchor segments for instances of at least one of the one or more query words using a constrained grammar.
Type: Grant
Filed: January 9, 2017
Date of Patent: May 14, 2019
Inventors: Amir Lev-Tov, Avraham Faizakof, Yochai Konig
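The post-processing step named in the abstract above (expand, sort, then merge overlapping anchor segments) can be illustrated with a minimal sketch. The function name `postprocess_anchor_segments`, the `(start, end)` tuple representation in seconds, and the `padding` parameter are assumptions made for illustration only; the abstract does not specify how segments are represented or by how much they are expanded.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds); hypothetical representation


def postprocess_anchor_segments(segments: List[Segment], padding: float = 0.5) -> List[Segment]:
    """Expand each segment by `padding`, sort by start time, and merge overlaps."""
    # Expand: widen each segment on both sides, clamping the start at zero.
    expanded = [(max(0.0, start - padding), end + padding) for start, end in segments]
    # Sort: order by start time so overlapping segments become adjacent.
    expanded.sort()
    merged: List[Segment] = []
    for start, end in expanded:
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous segment: extend that segment.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


segments = [(10.0, 11.0), (3.0, 4.0), (11.5, 12.0)]
print(postprocess_anchor_segments(segments))  # [(2.5, 4.5), (9.5, 12.5)]
```

Merging after expansion keeps the later constrained-grammar search from scanning the same stretch of audio twice.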
-
Patent number: 10282413
Abstract: Disclosed is a device for generating an aligned corpus based on unsupervised-learning alignment, and a method thereof, a device for analyzing a destructive expression morpheme using an aligned corpus, and a method for analyzing a morpheme thereof. The morpheme analyzing device includes a knowledge database and an analyzer. The knowledge database includes an aligned corpus for storing a plurality of knowledge information sets used for a per-language morpheme analysis, and stores a morpheme dictionary for storing morpheme information corresponding to a normal expression and normal expression information corresponding to a destructive expression (here, the destructive expression represents an expression that is erroneous in orthography or is not normalized and standardized).
Type: Grant
Filed: August 27, 2014
Date of Patent: May 7, 2019
Assignee: SYSTRAN INTERNATIONAL CO., LTD.
Inventor: Chang Jin Ji
-
Patent number: 10282409
Abstract: Mechanisms, in a natural language processing (NLP) system comprising a processor and a memory are provided. The NLP system receives a plurality of communications from a plurality of devices associated with audience members of a real-time presentation by a presenter of the presentation while the presentation is being presented. The NLP system analyzes the plurality of communications using natural language processing techniques, to identify attributes of the audience members and generates an aggregate audience model based on the identified attributes of the audience members. The aggregate audience model specifies an aggregate of attributes of the audience. Moreover, the NLP system outputs, to the presenter via a device associated with the presenter, a suggestion output identifying one or more portions of the presentation that are currently of interest to the audience members based on the aggregate audience model.
Type: Grant
Filed: December 11, 2014
Date of Patent: May 7, 2019
Assignee: International Business Machines Corporation
Inventors: Corville O. Allen, Laura J. Rodriguez
-
Patent number: 10282411
Abstract: A natural language learning method, system, and non-transitory computer readable medium include analyzing a corpus of sentences stored in a database to identify an internal structure of words in the corpus of sentences, creating a plurality of new words that are a combination of the internal structure of a word of the words in the corpus of sentences and the word, clustering the plurality of new words created by the creating that match into a plurality of cluster groups, filtering the plurality of cluster groups to create a partial set of each of the plurality of cluster groups, and performing word embedding processing on the partial set of each of the plurality of cluster groups to obtain vectors for new words.
Type: Grant
Filed: March 31, 2016
Date of Patent: May 7, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Octavian Popescu, Vadim Sheinin
-
Patent number: 10282419
Abstract: An arrangement and corresponding method are described for multi-domain natural language processing. Multiple parallel domain pipelines are used for processing a natural language input. Each domain pipeline represents a different specific subject domain of related concepts. Each domain pipeline includes a mention module that processes the natural language input using natural language understanding (NLU) to determine a corresponding list of mentions, and an interpretation generator that receives the list of mentions and produces a rank-ordered domain output set of sentence-level interpretation candidates. A global evidence ranker receives the domain output sets from the domain pipelines and produces an overall rank-ordered final output set of sentence-level interpretations.
Type: Grant
Filed: December 12, 2012
Date of Patent: May 7, 2019
Assignee: Nuance Communications, Inc.
Inventors: Matthieu Hebert, Jean-Philippe Robichaud, Christopher M. Parisien, Nicolae Duta, Jerome Tremblay, Amjad Almahairi, Lakshmish Kaushik, Maryse Boisvert
-
Patent number: 10275445
Abstract: Displaying supplemental information for an element in a document based on changes in a user's ability to read the document. A document processing device configured to: acquire information on a document including a plurality of words; acquire pieces of supplemental information being linked with the plurality of words; decide whether or not a piece of supplemental information linked with corresponding one of the plurality of words is to be displayed based on a frequency with which each of the plurality of words has appeared; and control displaying the plurality of words and the pieces of supplemental information. In the deciding, it is decided whether or not the corresponding one of the piece of supplemental information is to be displayed based on a frequency with which each of the plurality of words has been displayed along with the piece of supplemental information.
Type: Grant
Filed: March 19, 2013
Date of Patent: April 30, 2019
Assignee: RAKUTEN, INC.
Inventor: Kazuyoshi Hayase
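As a rough illustration of the deciding step in the abstract above, the sketch below stops showing a word's supplemental information once that word has already been displayed with it a fixed number of times. The class name `SupplementDisplayPolicy` and the `max_displays` threshold are hypothetical; the abstract states only that the decision depends on display frequency, not what rule is applied.

```python
from collections import Counter


class SupplementDisplayPolicy:
    """Hide a word's supplemental information once it has been shown often enough."""

    def __init__(self, max_displays: int = 3) -> None:
        self.max_displays = max_displays  # hypothetical threshold, not from the patent
        self.shown = Counter()  # word -> times displayed together with its supplement

    def should_display(self, word: str) -> bool:
        """Return True and record the display if the supplement should still be shown."""
        if self.shown[word] >= self.max_displays:
            return False
        self.shown[word] += 1
        return True


policy = SupplementDisplayPolicy(max_displays=2)
decisions = [policy.should_display("ephemeral") for _ in range(3)]
print(decisions)  # [True, True, False]
```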
-
Patent number: 10276161
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for contextual hotwords are disclosed. In one aspect, a method, during a boot process of a computing device, includes the actions of determining, by a computing device, a context associated with the computing device. The actions further include, based on the context associated with the computing device, determining a hotword. The actions further include, after determining the hotword, receiving audio data that corresponds to an utterance. The actions further include determining that the audio data includes the hotword. The actions further include, in response to determining that the audio data includes the hotword, performing an operation associated with the hotword.
Type: Grant
Filed: December 27, 2016
Date of Patent: April 30, 2019
Assignee: Google LLC
Inventors: Christopher Thaddeus Hughes, Ignacio Lopez Moreno, Aleksandar Kracun
-
Patent number: 10272349
Abstract: A method of producing simulated dialog outputs selected phrases in a sequence defined by a selected dialog model. When a state change is recorded one or more of dialog models which have non-zero state change type weights corresponding to the recorded state change type are identified and one of the identified dialog models is selected probabilistically with influence from the non-zero state change type weights. One or more unique characters are mapped to character indices of the selected dialog model and one or more phrases defined in the one or more mapped unique characters are identified which have non-zero phrase type weights corresponding to the phrase types of the selected dialog model. For each phrase type defined in the selected dialog model, at least one of the phrases are selected probabilistically with influence from the non-zero phrase type weights.
Type: Grant
Filed: September 7, 2016
Date of Patent: April 30, 2019
Inventor: Isaac Davenport
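The weighted selection described in the abstract above could look something like the following sketch, which picks one dialog model with probability proportional to its weight after discarding models whose weight is zero. The function name `select_dialog_model`, the dictionary layout, the example model names, and the assumption that weights are non-negative are all illustrative; the abstract says only that selection is probabilistic "with influence from" the non-zero weights.

```python
import random
from typing import Dict, Optional


def select_dialog_model(weights: Dict[str, float], rng: random.Random) -> Optional[str]:
    """Pick one dialog model with probability proportional to its non-zero weight."""
    # Keep only models whose weight for the recorded state change type is non-zero
    # (weights are assumed to be non-negative in this sketch).
    candidates = {name: w for name, w in weights.items() if w > 0}
    if not candidates:
        return None
    names = list(candidates)
    return rng.choices(names, weights=[candidates[n] for n in names], k=1)[0]


# Hypothetical weights for a single state change type.
weights = {"greeting": 3.0, "complaint": 1.0, "farewell": 0.0}
rng = random.Random(0)
picks = [select_dialog_model(weights, rng) for _ in range(1000)]
# "farewell" is never picked; "greeting" is picked roughly three times as often as "complaint".
print(picks.count("greeting"), picks.count("complaint"), picks.count("farewell"))
```

The same helper could be reused for the phrase selection step, with phrase type weights in place of state change type weights.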
-
Patent number: 10269376
Abstract: There are disclosed devices, system and methods for desired signal spotting in noisy, flawed environments by identifying a signal to be spotted, identifying a target confidence level, and then passing a pool of cabined arrays through a comparator to detect the identified signal, wherein the cabined arrays are derived from respective distinct environments. The arrays may include plural converted samples, each converted sample include a product of a conversion of a respective original sample, the conversion including filtering noise and transforming the original sample from a first form to a second form. Detecting may include measuring a confidence of the presence of the identified signal in each of plural converted samples using correlation of the identified signal to bodies of known matching samples. If the confidence for a given converted sample satisfies the target confidence level, the given sample is flagged.
Type: Grant
Filed: June 28, 2018
Date of Patent: April 23, 2019
Assignee: Invoca, Inc.
Inventors: Sean Michael Storlie, Victor Jara Borda, Michael Kingsley McCourt, Jr., Leland W. Kirchhoff, Colin Denison Kelley, Nicholas James Burwell
-
Patent number: 10264990
Abstract: The present disclosure provides methods of decoding speech from brain activity data. Aspects of the methods include receiving brain speech activity data from a subject, and processing the brain speech activity data to output speech feature data. Also provided are devices and systems for practicing the subject methods.
Type: Grant
Filed: October 25, 2013
Date of Patent: April 23, 2019
Assignee: The Regents of the University of California
Inventors: Brian Pasley, Robert T. Knight
-
Patent number: 10269377
Abstract: A device includes a processor and a memory accessible to the processor and bearing instructions executable by the processor to process an audible input sequence provided by a user of the device, determine that a pause in providing the audible input sequence has occurred at least partially based on a first signal from at least one camera communicating with the device, cease to process the audible input sequence responsive to a determination that the pause has occurred, determine that providing the audible input sequence has resumed based at least partially based on a second signal from the camera, and resume processing of the audible input sequence responsive to a determination that providing the audible input sequence has resumed.
Type: Grant
Filed: August 31, 2018
Date of Patent: April 23, 2019
Assignee: LENOVO (SINGAPORE) PTE. LTD.
Inventors: Russell Speight VanBlon, Suzanne Marion Beaumont, Rod David Waltermann
-
Patent number: 10269352
Abstract: A system and method for detecting phonetically similar imposter phrases may include using automatic speech recognition (ASR) to search for a first phrase in a set of objects; producing a list of references by searching for the first phrase in the set of objects using phonetic search; using output produced by the ASR to determine whether or not a reference in the list points to a phrase that is the same as the first phrase; and if it is determined that the reference points to a second phrase that is different from the first phrase then marking the second phrase as a potential cause for a phrase search false positive.
Type: Grant
Filed: December 23, 2016
Date of Patent: April 23, 2019
Assignee: NICE Ltd.
Inventors: Robert William Morris, Neeraj Singh Verma
-
Patent number: 10204626
Abstract: A dictation device includes: an audio input device configured to receive a voice utterance including a plurality of words; a video input device configured to receive video of lip motion during the voice utterance; a memory portion; a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and a transceiver for sending the first data packets to a server end device and receiving second data packets including combined dictation based upon the audio stream and the video stream from the server end device. In the combined dictation, first dictation generated based upon the audio stream has been corrected by second dictation generated based upon the video stream.
Type: Grant
Filed: May 10, 2018
Date of Patent: February 12, 2019
Assignee: Panasonic Intellectual Property Corporation of America
Inventors: Yuichiro Takayanagi, Masashi Kusaka
-
Patent number: 10170115
Abstract: Key phrase detection techniques for applications such as wake on voice are discussed include performing a vectorized operation on a multiple element acoustic score vector for a current time instance including a single state rejection model score and scores for a multiple state key phrase model and a multiple element state score vector for a previous time instance including a previous state score for the single state rejection model and previous state scores for the multiple state key phrase model to generate a multiple element score summation vector and a second vectorized operation on the multiple element score summation vector to determine a multiple element state score vector for the current time instance. The multiple element state score vector for the current time instance may then be evaluated to determine whether received audio input includes a key phrase corresponding to the multiple state key phrase model.
Type: Grant
Filed: July 12, 2018
Date of Patent: January 1, 2019
Assignee: Intel Corporation
Inventors: Tobias Bocklet, Tomasz Dorau, Przemyslaw Sobon, Przemyslaw Tomaszewski
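The two-step score update in the abstract above can be pictured with the sketch below. It is an illustration under stated assumptions, not the patented method: the abstract does not say what the second operation is, so this sketch assumes an elementwise maximum between each summed score and the summed score of the state before it. The detection rule in `key_phrase_detected` (final key phrase state beats the rejection state by a threshold) is likewise assumed, plain Python lists stand in for hardware vector operations, and all function names are hypothetical.

```python
from typing import List


def update_state_scores(prev_states: List[float], acoustic: List[float]) -> List[float]:
    """One time step; index 0 is the rejection state, indices 1..n are key phrase states."""
    # First operation: elementwise sum of previous state scores and current acoustic scores.
    summed = [p + a for p, a in zip(prev_states, acoustic)]
    # Second operation (assumed): each key phrase state keeps the better of staying in
    # place or advancing from the state before it; the rejection state only self-loops.
    shifted = [summed[0]] + summed[:-1]
    return [max(s, t) for s, t in zip(summed, shifted)]


def key_phrase_detected(states: List[float], threshold: float) -> bool:
    """Assumed evaluation: the last key phrase state beats the rejection state by a margin."""
    return states[-1] - states[0] > threshold


prev_states = [0.0, -5.0, -5.0]   # log-domain scores from the previous time instance
acoustic = [-1.0, -0.5, -3.0]     # acoustic scores for the current time instance
current = update_state_scores(prev_states, acoustic)
print(current)  # [-1.0, -1.0, -5.5]
print(key_phrase_detected(current, threshold=0.0))  # False
```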
-
Patent number: 10163455
Abstract: A device includes a processor and a memory accessible to the processor and bearing instructions executable by the processor to process an audible input sequence provided by a user of the device, determine that a pause in providing the audible input sequence has occurred at least partially based on a first signal from at least one camera communicating with the device, cease to process the audible input sequence responsive to a determination that the pause has occurred, determine that providing the audible input sequence has resumed based at least partially based on a second signal from the camera, and resume processing of the audible input sequence responsive to a determination that providing the audible input sequence has resumed.
Type: Grant
Filed: December 3, 2013
Date of Patent: December 25, 2018
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: Russell Speight VanBlon, Suzanne Marion Beaumont, Rod David Waltermann
-
Patent number: 10152973
Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that it may be used to update the models and data as it becomes available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
Type: Grant
Filed: November 16, 2015
Date of Patent: December 11, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Bjorn Hoffmeister, Hugh Evan Secker-Walker, Jeffrey Cornelius O'Neill
-
Patent number: 10147432
Abstract: The invention provides a decoder being configured for processing an encoded audio bitstream, wherein the decoder includes: a bitstream decoder configured to derive a decoded audio signal from the bitstream, wherein the decoded audio signal includes at least one decoded frame; a noise estimation device configured to produce a noise estimation signal containing an estimation of the level and/or the spectral shape of a noise in the decoded audio signal; a comfort noise generating device configured to derive a comfort noise signal from the noise estimation signal; and a combiner configured to combine the decoded frame of the decoded audio signal and the comfort noise signal in order to obtain an audio output signal.
Type: Grant
Filed: June 19, 2015
Date of Patent: December 4, 2018
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Guillaume Fuchs, Anthony Lombard, Emmanuel Ravelli, Stefan Doehla, Jeremie Lecomte, Martin Dietz
-
Patent number: 10141010
Abstract: Embodiments relate to censoring audio data. A censoring system receives audio data including a song tag and amplitude data as a function of time. The amplitude data represents spoken words occurring over a duration, as well as non-spoken word sound overlapping with some of the spoken words during the duration. The system accesses a set of song lyrics and processes the set of song lyrics and the amplitude data together to identify timestamps in the amplitude data. These timestamps indicate a time during the duration when one of the words from the lyrics begins in the amplitude data. The system compares the words in the set of song lyrics to a blacklist and adjusts the amplitude data at the timestamps of blacklisted word occurrences to render the audio at the blacklisted words incomprehensible. The system outputs the adjusted amplitude data.
Type: Grant
Filed: October 1, 2015
Date of Patent: November 27, 2018
Assignee: Google LLC
Inventor: Eric Paul Nichols
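The final adjustment step in the abstract above might be sketched as follows. The sketch assumes the lyric-to-audio alignment has already produced a start and end time for each word (the abstract mentions only the time at which a word begins), and it "adjusts" the audio by silencing the samples, which is just one way to make a word incomprehensible. The names `censor` and `WordTiming` and the sample layout are hypothetical.

```python
from typing import List, Set, Tuple

WordTiming = Tuple[str, float, float]  # (word, start_seconds, end_seconds); hypothetical


def censor(samples: List[float], sample_rate: int,
           timings: List[WordTiming], blacklist: Set[str]) -> List[float]:
    """Return a copy of `samples` with the spans of blacklisted words silenced."""
    out = list(samples)
    for word, start, end in timings:
        if word.lower() in blacklist:
            # Convert the word's time span to sample indices, clamped to the signal.
            first = max(0, int(start * sample_rate))
            last = min(len(out), int(end * sample_rate))
            for i in range(first, last):
                out[i] = 0.0
    return out


samples = [0.5] * 8  # two seconds of audio at a toy rate of 4 samples per second
timings = [("hello", 0.0, 1.0), ("darn", 1.0, 1.5)]
print(censor(samples, 4, timings, {"darn"}))
# [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.5, 0.5]
```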