Patents by Inventor Ariya Rastrow

Ariya Rastrow has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ON-DEVICE LEARNING IN A HYBRID SPEECH PROCESSING SYSTEM

Publication number: 20220020357

Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as remote NLU data. The response data from the remote speech processing system may be compared to local NLU data to improve a speech processing model on the device. Thus, the device may perform supervised on-device learning based on the remote NLU data. The device may determine differences between the updated speech processing model and an original speech processing model received from the remote system and may send data indicating these differences to the remote system. The remote system may aggregate data received from a plurality of devices and may generate an improved speech processing model.

Type: Application

Filed: July 27, 2021

Publication date: January 20, 2022

Inventors: Ariya Rastrow, Rohit Prasad, Nikko Strom

Disambiguation in automatic speech processing

Patent number: 11211058

Abstract: Described herein is a system for prompting a user for clarification when an automatic speech recognition (ASR) system encounters ambiguity with respect to the user's input. The feedback provided by the user is used to retrain machine-learning models and/or to generate new machine-learning models. Based on the type of ambiguity, the system may determine to retrain one or more ASR models that are widely used by the system or to generate/update one or more user-specific models that are used to process inputs from one or more particular users.

Type: Grant

Filed: September 20, 2019

Date of Patent: December 28, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Aaron Eakin, Angela Sun, Ankur Gandhe, Ariya Rastrow, Chenlei Guo, Xing Fan

Language and grammar model adaptation

Patent number: 11145296

Abstract: Systems and methods described herein relate to adapting a language model for automatic speech recognition (ASR) for a new set of words. Instead of retraining the ASR models, language models and grammar models, the system only modifies one grammar model and ensures its compatibility with the existing models in the ASR system.

Type: Grant

Filed: March 25, 2019

Date of Patent: October 12, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Ankur Gandhe, Ariya Rastrow, Gautam Tiwari, Ashish Vishwanath Shenoy, Chun Chen

SPEECH RECOGNITION USING DIALOG HISTORY

Publication number: 20210312914

Abstract: Described herein is a system for rescoring automatic speech recognition hypotheses for conversational devices that have multi-turn dialogs with a user. The system leverages dialog context by incorporating data related to past user utterances and data related to the system generated response corresponding to the past user utterance. Incorporation of this data improves recognition of a particular user utterance within the dialog.

Type: Application

Filed: June 7, 2021

Publication date: October 7, 2021

Inventors: Behnam Hedayatnia, Anirudh Raju, Ankur Gandhe, Chandra Prakash Khatri, Ariya Rastrow, Anushree Venkatesh, Arindam Mandal, Raefer Christopher Gabriel, Ahmad Shikib Mehri

DEVICE-DIRECTED UTTERANCE DETECTION

Publication number: 20210295833

Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.

Type: Application

Filed: March 18, 2020

Publication date: September 23, 2021

Inventors: Ariya Rastrow, Eli Joshua Fidler, Roland Maximilian Rolf Maas, Nikko Strom, Aaron Eakin, Diamond Bishop, Bjorn Hoffmeister, Sanjeev Mishra

On-device learning in a hybrid speech processing system

Patent number: 11087739

Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as remote NLU data. The response data from the remote speech processing system may be compared to local NLU data to improve a speech processing model on the device. Thus, the device may perform supervised on-device learning based on the remote NLU data. The device may determine differences between the updated speech processing model and an original speech processing model received from the remote system and may send data indicating these differences to the remote system. The remote system may aggregate data received from a plurality of devices and may generate an improved speech processing model.

Type: Grant

Filed: November 13, 2018

Date of Patent: August 10, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Ariya Rastrow, Rohit Prasad, Nikko Strom
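
The abstract above describes devices sending model differences to a remote system, which aggregates them into an improved model. The following is a minimal sketch of that idea, not the patented method: it assumes a model is a flat mapping from parameter names to floats and that aggregation is a plain average of the per-device differences. The names `model_delta` and `aggregate` are hypothetical.

```python
from typing import Dict, List

# Assumption: a "model" is a flat mapping of parameter name -> value.
Model = Dict[str, float]


def model_delta(original: Model, updated: Model) -> Model:
    """Difference between a locally updated model and the original model."""
    return {name: updated[name] - original[name] for name in original}


def aggregate(original: Model, deltas: List[Model]) -> Model:
    """Apply the average of the per-device differences to the original model.

    Averaging is an assumption made for this sketch; the abstract only says
    that the remote system aggregates data from a plurality of devices.
    """
    if not deltas:
        return dict(original)
    count = len(deltas)
    return {
        name: value + sum(delta[name] for delta in deltas) / count
        for name, value in original.items()
    }


original = {"w": 1.0, "b": 0.0}
delta_a = model_delta(original, {"w": 1.5, "b": 0.5})  # device A after local learning
delta_b = model_delta(original, {"w": 0.5, "b": 0.5})  # device B after local learning
improved = aggregate(original, [delta_a, delta_b])
```

Only the differences leave the device in this sketch, so the remote side never needs the full locally updated model.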
Speech recognition using dialog history

Patent number: 11043214

Abstract: Described herein is a system for rescoring automatic speech recognition hypotheses for conversational devices that have multi-turn dialogs with a user. The system leverages dialog context by incorporating data related to past user utterances and data related to the system generated response corresponding to the past user utterance. Incorporation of this data improves recognition of a particular user utterance within the dialog.

Type: Grant

Filed: November 29, 2018

Date of Patent: June 22, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Behnam Hedayatnia, Anirudh Raju, Ankur Gandhe, Chandra Prakash Khatri, Ariya Rastrow, Anushree Venkatesh, Arindam Mandal, Raefer Christopher Gabriel, Ahmad Shikib Mehri

Generation of predictive natural language processing models

Patent number: 10964312

Abstract: Features are disclosed for generating predictive personal natural language processing models based on user-specific profile information. The predictive personal models can provide broader coverage of the various terms, named entities, and/or intents of an utterance by the user than a personal model, while providing better accuracy than a general model. Profile information may be obtained from various data sources. Predictions regarding the content or subject of future user utterances may be made from the profile information. Predictive personal models may be generated based on the predictions. Future user utterances may be processed using the predictive personal models.

Type: Grant

Filed: August 13, 2018

Date of Patent: March 30, 2021

Assignee: Amazon Technologies, Inc.

Inventors: William Folwell Barton, Rohit Prasad, Stephen Frederick Potter, Nikko Strom, Yuzo Watanabe, Madan Mohan Rao Jampani, Ariya Rastrow, Arushan Rajasekaram

Creation of language models for speech recognition

Patent number: 10943583

Abstract: A system to perform automatic speech recognition (ASR) using a dynamic language model. Portions of the language model can include a group of probabilities rather than a single probability. At runtime individual probabilities of the group are weighted and combined to create an adjusted probability for the portion of the language model. The adjusted probability can be used for ASR processing. The weights can be determined based on a characteristic of the utterance, for example an associated speechlet/application, the specific user speaking, or other characteristic. By applying the weights at runtime the system can use a single language model to dynamically adjust to different utterance conditions.

Type: Grant

Filed: March 23, 2018

Date of Patent: March 9, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Ankur Gandhe, Ariya Rastrow, Shaswat Pratap Shah
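
The abstract above describes weighting and combining a group of probabilities at runtime to get one adjusted probability. A minimal sketch of that step follows, assuming the combination is a normalized weighted sum (linear interpolation) and that the weights are looked up by an utterance characteristic. The table `CONTEXT_WEIGHTS`, its keys, and the function names are hypothetical and do not come from the patent.

```python
from typing import Dict, Sequence, Tuple


def adjusted_probability(group: Sequence[float], weights: Sequence[float]) -> float:
    """Combine a group of probabilities into a single adjusted probability.

    Assumption: the combination is a weighted sum normalized by the total
    weight. The abstract says only "weighted and combined".
    """
    if len(group) != len(weights):
        raise ValueError("group and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(group, weights)) / total


# Hypothetical runtime weights, keyed by a characteristic of the utterance.
CONTEXT_WEIGHTS: Dict[str, Tuple[float, float]] = {
    "music": (0.75, 0.25),
    "shopping": (0.25, 0.75),
}


def score(group: Sequence[float], context: str) -> float:
    """Score one portion of the language model under the given context."""
    return adjusted_probability(group, CONTEXT_WEIGHTS[context])


group = (0.5, 0.25)  # one probability per component for the same model portion
music_score = score(group, "music")
shopping_score = score(group, "shopping")
```

Because the weights are applied at lookup time, the same stored group serves both contexts without building a separate language model for each.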
INTENT-SPECIFIC AUTOMATIC SPEECH RECOGNITION RESULT GENERATION

Publication number: 20200388282

Abstract: Features are disclosed for generating intent-specific results in an automatic speech recognition system. The results can be generated by utilizing a decoding graph containing tags that identify portions of the graph corresponding to a given intent. The tags can also identify high-information content slots and low-information carrier phrases for a given intent. The automatic speech recognition system may utilize these tags to provide a semantic representation based on a plurality of different tokens for the content slot portions and low information for the carrier portions. A user can be presented with a user interface containing top intent results with corresponding intent-specific top content slot values.

Type: Application

Filed: May 21, 2020

Publication date: December 10, 2020

Inventors: Hugh Evan Secker-Walker, Aaron Lee Mathers Challenner, Ariya Rastrow

Domain specific endpointing

Patent number: 10854192

Abstract: An automatic speech recognition (ASR) system detects an endpoint of an utterance based on a domain of the utterance. The ASR system processes a first portion of the utterance to determine the domain and then determines an endpoint of the remainder of the utterance depending on the domain.

Type: Grant

Filed: March 30, 2016

Date of Patent: December 1, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Roland Maas, Ariya Rastrow, Rohit Prasad
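
The abstract above describes choosing the endpoint of an utterance according to its domain. One way this could look is sketched below, assuming the domain has already been determined from the first portion of the utterance and that endpointing means waiting for a domain-dependent amount of trailing silence. The threshold values, domain names, and `find_endpoint` are hypothetical.

```python
from typing import Dict, Optional, Sequence

# Hypothetical trailing-silence thresholds per domain, in milliseconds.
PAUSE_THRESHOLD_MS: Dict[str, int] = {"weather": 300, "messaging": 900}
DEFAULT_THRESHOLD_MS = 500


def find_endpoint(domain: str, frames: Sequence[bool], frame_ms: int = 10) -> Optional[int]:
    """Return the index of the frame at which the endpoint is declared.

    `frames` holds one flag per audio frame: True for speech, False for
    silence. The endpoint is declared once the run of consecutive silent
    frames reaches the threshold for the domain. Returns None if no
    endpoint is found.
    """
    needed_ms = PAUSE_THRESHOLD_MS.get(domain, DEFAULT_THRESHOLD_MS)
    silent_ms = 0
    for index, is_speech in enumerate(frames):
        silent_ms = 0 if is_speech else silent_ms + frame_ms
        if silent_ms >= needed_ms:
            return index
    return None


# 50 ms speech, 500 ms pause, 50 ms speech, 1000 ms silence.
frames = [True] * 5 + [False] * 50 + [True] * 5 + [False] * 100
weather_end = find_endpoint("weather", frames)
messaging_end = find_endpoint("messaging", frames)
```

In this sketch a short-command domain ends at the first pause, while a dictation-like domain tolerates the same pause and ends only after the longer silence.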
Intent-specific automatic speech recognition result generation

Patent number: 10811013

Abstract: Features are disclosed for generating intent-specific results in an automatic speech recognition system. The results can be generated by utilizing a decoding graph containing tags that identify portions of the graph corresponding to a given intent. The tags can also identify high-information content slots and low-information carrier phrases for a given intent. The automatic speech recognition system may utilize these tags to provide a semantic representation based on a plurality of different tokens for the content slot portions and low information for the carrier portions. A user can be presented with a user interface containing top intent results with corresponding intent-specific top content slot values.

Type: Grant

Filed: December 20, 2013

Date of Patent: October 20, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Hugh Evan Secker-Walker, Aaron Lee Mathers Challenner, Ariya Rastrow

GENERATION OF AUTOMATED MESSAGE RESPONSES

Publication number: 20200045130

Abstract: Systems, methods, and devices for computer-generating responses and sending responses to communications when the recipient of the communication is unavailable are disclosed. An individual may send a message (either audio or text) to a recipient. The recipient may be unavailable to contemporaneously respond to the message (e.g., the recipient may be performing an action that makes it difficult or impractical for the recipient to contemporaneously respond to the audio message). When the recipient is unavailable, a response to the message is generated and sent without receiving an instruction from the recipient to do so. The response may be sent to the message originating individual, and content of the response may thereafter be sent to the recipient to receive feedback regarding the correctness of the response. Alternatively, the response content may first be sent to the recipient to receive the feedback, and thereafter the response may be sent to the message originating individual.

Type: Application

Filed: June 27, 2019

Publication date: February 6, 2020

Inventors: Ariya Rastrow, Tony Hardie, Rohit Prasad

Compressed finite state transducers for automatic speech recognition

Patent number: 10381000

Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime portions of the FSTs may be decompressed for processing by an ASR engine.

Type: Grant

Filed: January 8, 2018

Date of Patent: August 13, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow
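
The abstract above mentions binning individual weights so that each weight needs fewer bits. The sketch below illustrates that idea under stated assumptions: it uses uniform bins over the observed weight range and stores the bin center as the restored value. The patent abstract does not specify the binning scheme, and `bin_weights` and `restore_weights` are hypothetical names.

```python
from typing import List, Tuple


def bin_weights(weights: List[float], bits: int) -> Tuple[List[int], List[float]]:
    """Quantize weights into 2**bits uniform bins.

    Returns the per-weight bin indices (each fits in `bits` bits) and a
    codebook of bin centers used to restore approximate weights.
    Assumption: bins are uniform over [min(weights), max(weights)].
    """
    n_bins = 1 << bits
    low, high = min(weights), max(weights)
    # Fall back to a width of 1.0 when all weights are equal.
    width = (high - low) / n_bins or 1.0
    indices = [min(int((w - low) / width), n_bins - 1) for w in weights]
    codebook = [low + (i + 0.5) * width for i in range(n_bins)]
    return indices, codebook


def restore_weights(indices: List[int], codebook: List[float]) -> List[float]:
    """Look up the approximate weight for each stored bin index."""
    return [codebook[i] for i in indices]


arc_weights = [0.0, 1.0, 2.0, 4.0]
indices, codebook = bin_weights(arc_weights, bits=2)
restored = restore_weights(indices, codebook)
```

Each arc then stores a 2-bit index instead of a full floating-point weight, at the cost of an error of at most half a bin width per weight.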
Predicting pronunciation in speech recognition

Patent number: 10339920

Abstract: An automatic speech recognition (ASR) device may be configured to predict pronunciations of textual identifiers (for example, song names, etc.) based on predicting one or more languages of origin of the textual identifier. The one or more languages of origin may be determined based on the textual identifier. The pronunciations may include a hybrid pronunciation including a pronunciation in one language, a pronunciation in a second language and a hybrid pronunciation that combines multiple languages. The pronunciations may be added to a lexicon and matched to the content item (e.g., song) and/or textual identifier. The ASR device may receive a spoken utterance from a user requesting the ASR device to access the content item. The ASR device determines whether the spoken utterance matches one of the pronunciations of the content item in the lexicon. The ASR device then accesses the content when the spoken utterance matches one of the potential textual identifier pronunciations.

Type: Grant

Filed: March 4, 2014

Date of Patent: July 2, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Jeffrey Penrod Adams, Alok Ulhas Parlikar, Jeffrey Paul Lilly, Ariya Rastrow

Generation of automated message responses

Patent number: 10339925

Abstract: Systems, methods, and devices for computer-generating responses and sending responses to communications when the recipient of the communication is unavailable are disclosed. An individual may send a message (either audio or text) to a recipient. The recipient may be unavailable to contemporaneously respond to the message (e.g., the recipient may be performing an action that makes it difficult or impractical for the recipient to contemporaneously respond to the audio message). When the recipient is unavailable, a response to the message is generated and sent without receiving an instruction from the recipient to do so. The response may be sent to the message originating individual, and content of the response may thereafter be sent to the recipient to receive feedback regarding the correctness of the response. Alternatively, the response content may first be sent to the recipient to receive the feedback, and thereafter the response may be sent to the message originating individual.

Type: Grant

Filed: September 26, 2016

Date of Patent: July 2, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Ariya Rastrow, Tony Hardie, Rohit Prasad

GENERATION OF PREDICTIVE NATURAL LANGUAGE PROCESSING MODELS

Publication number: 20190180736

Abstract: Features are disclosed for generating predictive personal natural language processing models based on user-specific profile information. The predictive personal models can provide broader coverage of the various terms, named entities, and/or intents of an utterance by the user than a personal model, while providing better accuracy than a general model. Profile information may be obtained from various data sources. Predictions regarding the content or subject of future user utterances may be made from the profile information. Predictive personal models may be generated based on the predictions. Future user utterances may be processed using the predictive personal models.

Type: Application

Filed: August 13, 2018

Publication date: June 13, 2019

Inventors: William Folwell Barton, Rohit Prasad, Stephen Frederick Potter, Nikko Strom, Yuzo Watanabe, Madan Mohan Rao Jampani, Ariya Rastrow, Arushan Rajasekaram

Error tolerant neural network model compression

Patent number: 10229356

Abstract: Features are disclosed for error tolerant model compression. Such features could be used to reduce the size of a deep neural network model including several hidden node layers. The size reduction in an error tolerant fashion ensures predictive applications relying on the model do not experience performance degradation due to model compression. Such predictive applications include automatic recognition of speech, image recognition, and recommendation engines. Partially quantized models are re-trained such that any degradation of accuracy is “trained out” of the model providing improved error tolerance with compression.

Type: Grant

Filed: December 23, 2014

Date of Patent: March 12, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Baiyang Liu, Michael Reese Bastian, Bjorn Hoffmeister, Sankaran Panchapagesan, Ariya Rastrow

Lattice decoding and result confirmation using recurrent neural networks

Patent number: 10210862

Abstract: Neural networks may be used in certain automatic speech recognition systems. To improve performance at these neural networks, the present system converts the lattice into a matrix form, thus maintaining certain information included in the lattice that might otherwise be lost while also placing the lattice in a form that may be manipulated by other components to perform operations such as checking ASR results. The matrix representation of the lattice may be transformed into a vector representation by calculations performed at a recurrent neural network (RNN). By representing the lattice as a vector representation the system may perform additional operations, such as ASR results confirmation.

Type: Grant

Filed: April 6, 2016

Date of Patent: February 19, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Faisal Ladhak, Ankur Gandhe, Markus Dreyer, Ariya Rastrow, Björn Hoffmeister, Lambert Mathias

Lattice encoding using recurrent neural networks

Patent number: 10176802

Abstract: An automatic speech recognition (ASR) system may convert an ASR output lattice into a matrix form, thus maintaining certain information included in the lattice that might otherwise be lost in an N-best list output. The matrix representation of the lattice may be encoded using a recurrent neural network (RNN) to create a vector representation of the lattice. The vector representation may then be used by the system to perform additional operations, such as ASR results confirmation.

Type: Grant

Filed: April 6, 2016

Date of Patent: January 8, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Faisal Ladhak, Ankur Gandhe, Markus Dreyer, Ariya Rastrow, Björn Hoffmeister, Lambert Mathias
