Patents by Inventor Ariya Rastrow
Ariya Rastrow has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10140981
Abstract: Features are disclosed for performing speech recognition on utterances using dynamic weights with speech recognition models. An automatic speech recognition system may use a general speech recognition model, such as a large finite state transducer-based language model, to generate speech recognition results for various utterances. The general speech recognition model may include sub-models or other portions that are customized for particular tasks, such as speech recognition on utterances regarding particular topics. Individual weights within the general speech recognition model can be dynamically replaced based on the context in which an utterance is made or received, thereby providing a further degree of customization without requiring additional speech recognition models to be generated, maintained, or loaded.
Type: Grant
Filed: June 10, 2014
Date of Patent: November 27, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Denis Sergeyevich Filimonov, Ariya Rastrow
-
Patent number: 10121471
Abstract: An automatic speech recognition (ASR) system detects an endpoint of an utterance using the active hypotheses under consideration by a decoder. The ASR system calculates the amount of non-speech detected by a plurality of hypotheses and weights the non-speech duration by the probability of each hypothesis. When the aggregate weighted non-speech exceeds a threshold, an endpoint may be declared.
Type: Grant
Filed: June 29, 2015
Date of Patent: November 6, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Bjorn Hoffmeister, Ariya Rastrow, Baiyang Liu
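The endpointing idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Hypothesis` class, its field names, the normalization of probabilities, and the 500 ms default threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    # Hypothetical fields; a real decoder would expose its own structures.
    probability: float            # score of this active hypothesis (need not be normalized)
    trailing_nonspeech_ms: float  # non-speech duration at the end of this hypothesis


def expected_nonspeech_ms(hypotheses: List[Hypothesis]) -> float:
    """Probability-weighted average of trailing non-speech over active hypotheses."""
    total = sum(h.probability for h in hypotheses)
    if total <= 0:
        return 0.0
    weighted = sum(h.probability * h.trailing_nonspeech_ms for h in hypotheses)
    # Normalizing by the total probability is an assumption of this sketch.
    return weighted / total


def is_endpoint(hypotheses: List[Hypothesis], threshold_ms: float = 500.0) -> bool:
    """Declare an endpoint when the aggregate weighted non-speech exceeds the threshold."""
    return expected_nonspeech_ms(hypotheses) > threshold_ms
```

In this sketch, a likely hypothesis that ends in a long silence pushes the aggregate over the threshold, while the same silence on an unlikely hypothesis does not.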
-
Patent number: 10121467
Abstract: A language model for automatic speech processing, such as a finite state transducer (FST), may be configured to incorporate information about how a particular word sequence (N-gram) may be used in a manner similar to another N-gram. A score of a component of the FST (such as an arc or state) relating to the first N-gram may be based on information of the second N-gram. Further, the FST may be configured to have an arc between a state of the first N-gram and a state of the second N-gram to allow for cross N-gram backoff, rather than backoff from a larger N-gram to a smaller N-gram during traversal of the FST during speech processing.
Type: Grant
Filed: June 30, 2016
Date of Patent: November 6, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Ankur Gandhe, Denis Sergeyevich Filimonov, Ariya Rastrow, Björn Hoffmeister
-
Patent number: 10049656
Abstract: Features are disclosed for generating predictive personal natural language processing models based on user-specific profile information. The predictive personal models can provide broader coverage of the various terms, named entities, and/or intents of an utterance by the user than a personal model, while providing better accuracy than a general model. Profile information may be obtained from various data sources. Predictions regarding the content or subject of future user utterances may be made from the profile information. Predictive personal models may be generated based on the predictions. Future user utterances may be processed using the predictive personal models.
Type: Grant
Filed: September 20, 2013
Date of Patent: August 14, 2018
Assignee: Amazon Technologies, Inc.
Inventors: William Folwell Barton, Rohit Prasad, Stephen Frederick Potter, Nikko Strom, Yuzo Watanabe, Madan Mohan Rao Jampani, Ariya Rastrow, Arushan Rajasekaram
-
Patent number: 10032463
Abstract: An automatic speech recognition (“ASR”) system produces, for particular users, customized speech recognition results by using data regarding prior interactions of the users with the system. A portion of the ASR system (e.g., a neural-network-based language model) can be trained to produce an encoded representation of a user's interactions with the system based on, e.g., transcriptions of prior utterances made by the user. This user-specific encoded representation of interaction history is then used by the language model to customize ASR processing for the user.
Type: Grant
Filed: December 29, 2015
Date of Patent: July 24, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Ariya Rastrow, Nikko Ström, Spyridon Matsoukas, Markus Dreyer, Ankur Gandhe, Denis Sergeyevich Filimonov, Julian Chan, Rohit Prasad
-
Patent number: 10013974
Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime, portions of the FSTs may be decompressed for processing by an ASR engine.
Type: Grant
Filed: June 20, 2016
Date of Patent: July 3, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow
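As a rough illustration of "binning individual weights", the following Python sketch quantizes arc weights into a small number of evenly spaced bins and stores only the bin index. The uniform spacing, the function names, and the default of 4 bits are assumptions of this sketch; the abstract does not say how the bins are chosen.

```python
from typing import List, Tuple


def build_codebook(weights: List[float], bits: int = 4) -> Tuple[List[float], float]:
    """Return (codebook, step) for 2**bits evenly spaced bins over the weight range."""
    levels = 2 ** bits
    lo, hi = min(weights), max(weights)
    # Guard against a zero-width range so the step is never zero.
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codebook = [lo + i * step for i in range(levels)]
    return codebook, step


def compress(weights: List[float], codebook: List[float], step: float) -> List[int]:
    """Replace each weight with the index of its nearest bin (fits in `bits` bits)."""
    lo, last = codebook[0], len(codebook) - 1
    return [min(last, max(0, round((w - lo) / step))) for w in weights]


def decompress(indices: List[int], codebook: List[float]) -> List[float]:
    """Recover approximate weights from bin indices at runtime."""
    return [codebook[i] for i in indices]
```

With 4-bit indices in place of 32-bit floats, the weight field alone shrinks by a factor of eight, at the cost of a rounding error of at most half a bin width per weight.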
-
Patent number: 9934777
Abstract: User-specific language models (LMs) that include internal word indexes to a word table specific to the user-specific LM rather than a word table specific to a system-wide LM. When the system-wide LM is updated, the word table of the user-specific LM may be updated to translate the user-specific indices to system-wide indices. This prevents having to update the internal indices of the user-specific LM every time the system-wide LM is updated.
Type: Grant
Filed: August 26, 2016
Date of Patent: April 3, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Shaun Nidhiri Joseph, Sonal Pareek, Ariya Rastrow, Gautam Tiwari, Alexander David Rosen
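The indirection this abstract describes can be sketched in Python as follows. The `UserLM` class, its method names, and the use of -1 for words missing from the system vocabulary are hypothetical choices for illustration, not details taken from the patent.

```python
from typing import Dict, List, Tuple


class UserLM:
    """Toy user-specific LM whose n-grams use indices into its own word table."""

    def __init__(self, words: List[str], system_vocab: Dict[str, int]) -> None:
        self.words = list(words)                       # local index -> word
        self.local_index = {w: i for i, w in enumerate(self.words)}
        self.ngrams: Dict[Tuple[int, ...], float] = {}  # keyed by local indices
        self.to_system: List[int] = []
        self.refresh(system_vocab)

    def refresh(self, system_vocab: Dict[str, int]) -> None:
        """Rebuild only the translation table when the system-wide LM changes."""
        # -1 marks a word absent from the system vocabulary (an assumption here).
        self.to_system = [system_vocab.get(w, -1) for w in self.words]

    def add_ngram(self, words: List[str], score: float) -> None:
        self.ngrams[tuple(self.local_index[w] for w in words)] = score

    def system_ngrams(self) -> Dict[Tuple[int, ...], float]:
        """View the n-grams in system-wide indices via the translation table."""
        return {
            tuple(self.to_system[i] for i in key): score
            for key, score in self.ngrams.items()
        }
```

After a system-wide vocabulary update, calling `refresh` with the new vocabulary changes the translated view while the stored n-gram keys stay exactly as they were.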
-
Patent number: 9922650
Abstract: Features are disclosed for generating intent-specific results in an automatic speech recognition system. The results can be generated by utilizing a decoding graph containing tags that identify portions of the graph corresponding to a given intent. The tags can also identify high-information content slots and low-information carrier phrases for a given intent. The automatic speech recognition system may utilize these tags to provide a semantic representation based on a plurality of different tokens for the content slot portions and low information for the carrier portions. A user can be presented with a user interface containing top intent results with corresponding intent-specific top content slot values.
Type: Grant
Filed: December 20, 2013
Date of Patent: March 20, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Hugh Evan Secker-Walker, Aaron Lee Mathers Challenner, Ariya Rastrow
-
Patent number: 9892726
Abstract: Features are disclosed for modifying a statistical model to more accurately discriminate between classes of input data. A subspace of the total model parameter space can be learned such that individual points in the subspace, corresponding to the various classes, are discriminative with respect to the classes. The subspace can be learned using an iterative process whereby an initial subspace is used to generate data and maximize an objective function. The objective function can correspond to maximizing the posterior probability of the correct class for a given input. The initial subspace, data, and objective function can be used to generate a new subspace that better discriminates between classes. The process may be repeated as desired. A model modified using such a subspace can be used to classify input data.
Type: Grant
Filed: December 17, 2014
Date of Patent: February 13, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Sri Venkata Surya Siva Rama Krishna Garimella, Spyridon Matsoukas, Ariya Rastrow, Bjorn Hoffmeister
-
Patent number: 9865254
Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime, portions of the FSTs may be decompressed for processing by an ASR engine.
Type: Grant
Filed: June 20, 2016
Date of Patent: January 9, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow
-
Patent number: 9653093
Abstract: Features are disclosed for using an artificial neural network to generate customized speech recognition models during the speech recognition process. By dynamically generating the speech recognition models during the speech recognition process, the models can be customized based on the specific context of individual frames within the audio data currently being processed. In this way, dependencies between frames in the current sequence can form the basis of the models used to score individual frames of the current sequence. Thus, each frame of the current sequence (or some subset thereof) may be scored using one or more models customized for the particular frame in context.
Type: Grant
Filed: August 19, 2014
Date of Patent: May 16, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Spyridon Matsoukas, Nikko Ström, Ariya Rastrow, Sri Venkata Surya Siva Rama Krishna Garimella
-
Patent number: 9600764
Abstract: Features are disclosed for using a neural network to tag sequential input without using an internal representation of the neural network generated when scoring previous positions in the sequence. A predicted or determined label (e.g., the highest scoring or otherwise most probable label) for input at a given position in the sequence can be used when scoring input corresponding to the next position in the sequence. Additional features are disclosed for training a neural network for use in tagging sequential input without using an internal representation of the neural network generated when scoring previous positions in the sequence.
Type: Grant
Filed: June 17, 2014
Date of Patent: March 21, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Ariya Rastrow, Spyros Matsoukas, Sri Venkata Surya Siva Rama Krishna Garimella, Nikko Ström, Bjorn Hoffmeister
-
Publication number: 20160379632
Abstract: An automatic speech recognition (ASR) system detects an endpoint of an utterance using the active hypotheses under consideration by a decoder. The ASR system calculates the amount of non-speech detected by a plurality of hypotheses and weights the non-speech duration by the probability of each hypothesis. When the aggregate weighted non-speech exceeds a threshold, an endpoint may be declared.
Type: Application
Filed: June 29, 2015
Publication date: December 29, 2016
Inventors: Bjorn Hoffmeister, Ariya Rastrow, Baiyang Liu
-
Patent number: 9449598
Abstract: Features are disclosed for performing speech recognition on utterances using a grammar and a statistical language model, such as an n-gram model. States of the grammar may correspond to states of the statistical language model. Speech recognition may be initiated using the grammar. At a given state of the grammar, speech recognition may continue at a corresponding state of the statistical language model. Speech recognition may continue using the grammar in parallel with the statistical language model, or it may continue using the statistical language model exclusively. Scores associated with the correspondences between states (e.g., backoff arcs) may be determined heuristically or based on test data.
Type: Grant
Filed: September 26, 2013
Date of Patent: September 20, 2016
Assignee: Amazon Technologies, Inc.
Inventors: Ariya Rastrow, Bjorn Hoffmeister, Sri Venkata Surya Siva Rama Krishna Garimella, Rohit Krishna Prasad
-
Publication number: 20150255069
Abstract: An automatic speech recognition (ASR) device may be configured to predict pronunciations of textual identifiers (for example, song names, etc.) based on predicting one or more languages of origin of the textual identifier. The one or more languages of origin may be determined based on the textual identifier. The pronunciations may include a pronunciation in one language, a pronunciation in a second language, and a hybrid pronunciation that combines multiple languages. The pronunciations may be added to a lexicon and matched to the content item (e.g., song) and/or textual identifier. The ASR device may receive a spoken utterance from a user requesting the ASR device to access the content item. The ASR device determines whether the spoken utterance matches one of the pronunciations of the content item in the lexicon. The ASR device then accesses the content when the spoken utterance matches one of the potential textual identifier pronunciations.
Type: Application
Filed: March 4, 2014
Publication date: September 10, 2015
Applicant: Amazon Technologies, Inc.
Inventors: Jeffrey Penrod Adams, Alok Ulhas Parlikar, Jeffrey Paul Lilly, Ariya Rastrow