Patents by Inventor Denis Sergeyevich Filimonov

Denis Sergeyevich Filimonov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Compressed finite state transducers for automatic speech recognition

Patent number: 10381000

Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime, portions of the FSTs may be decompressed for processing by an ASR engine.

Type: Grant

Filed: January 8, 2018

Date of Patent: August 13, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow
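The weight binning mentioned in the abstract above can be illustrated with a minimal sketch. It is not the patented method: it assumes evenly spaced bins over the observed weight range, and the names build_codebook, compress and decompress are hypothetical. Each arc weight is replaced by a small integer bin index, so only `bits` bits per weight plus one shared table of bin values need to be stored.

```python
from typing import List, Tuple


def build_codebook(weights: List[float], bits: int) -> List[float]:
    """Return up to 2**bits evenly spaced bin values spanning the weights.

    Assumption: uniform bins. The abstract does not say how bins are chosen.
    """
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits
    if hi == lo or levels == 1:
        return [lo]  # a single bin is enough
    step = (hi - lo) / (levels - 1)
    return [lo + i * step for i in range(levels)]


def compress(weights: List[float], bits: int) -> Tuple[List[float], List[int]]:
    """Replace each weight with the index of its nearest bin value."""
    codebook = build_codebook(weights, bits)
    indices = [
        min(range(len(codebook)), key=lambda i: abs(codebook[i] - w))
        for w in weights
    ]
    return codebook, indices


def decompress(codebook: List[float], indices: List[int]) -> List[float]:
    """Recover approximate weights from bin indices at runtime."""
    return [codebook[i] for i in indices]
```

With uniform bins, the reconstruction error of any weight is at most half the bin spacing, which is the trade-off between memory size and score precision.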

Adaptive beam pruning for automatic speech recognition

Patent number: 10199037

Abstract: A reduced latency system for automatic speech recognition (ASR). The system can use certain feature values describing the state of ASR processing to estimate how far a lowest scoring node for an audio frame is from a potential node likely to be part of the Viterbi path. The system can then adjust its beam width in a manner likely to encompass the node likely to be on the Viterbi path, thus pruning unnecessary nodes and reducing latency. The feature values and estimated distances may be based on a set of training data, where the system identifies specific nodes on the Viterbi path and determines what feature values correspond to what desired beam widths. Trained models or other data may be created at training and used at runtime to dynamically adjust the beam width, as well as other settings such as the threshold number of active nodes.

Type: Grant

Filed: June 29, 2016

Date of Patent: February 5, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Denis Sergeyevich Filimonov, Yuan Shangguan
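The idea of adjusting the beam width per frame, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: scores are treated as costs (lower is better), the trained model is stood in for by a hypothetical linear function of the feature values, and the names predict_beam_width and prune are not taken from the patent.

```python
from typing import Dict, List, Optional


def predict_beam_width(features: List[float], coeffs: List[float], bias: float,
                       min_beam: float, max_beam: float) -> float:
    """Hypothetical trained model: a linear map from features to a beam width,
    clamped to a safe range."""
    raw = bias + sum(c * f for c, f in zip(coeffs, features))
    return max(min_beam, min(max_beam, raw))


def prune(scores: Dict[str, float], beam: float,
          max_active: Optional[int] = None) -> Dict[str, float]:
    """Keep nodes whose cost is within `beam` of the best (lowest) cost.

    Optionally cap the number of surviving nodes at `max_active`.
    """
    if not scores:
        return {}
    best = min(scores.values())
    kept = [(node, cost) for node, cost in scores.items() if cost <= best + beam]
    kept.sort(key=lambda pair: pair[1])  # best nodes first
    if max_active is not None:
        kept = kept[:max_active]
    return dict(kept)
```

A decoder loop would call predict_beam_width once per audio frame and pass the result to prune, so that a narrow beam is used when the features suggest the best path is easy to identify and a wider beam otherwise.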

Dynamic arc weights in speech recognition models

Patent number: 10140981

Abstract: Features are disclosed for performing speech recognition on utterances using dynamic weights with speech recognition models. An automatic speech recognition system may use a general speech recognition model, such as a large finite state transducer-based language model, to generate speech recognition results for various utterances. The general speech recognition model may include sub-models or other portions that are customized for particular tasks, such as speech recognition on utterances regarding particular topics. Individual weights within the general speech recognition model can be dynamically replaced based on the context in which an utterance is made or received, thereby providing a further degree of customization without requiring additional speech recognition models to be generated, maintained, or loaded.

Type: Grant

Filed: June 10, 2014

Date of Patent: November 27, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Denis Sergeyevich Filimonov, Ariya Rastrow
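One way to picture the dynamic weight replacement described in the abstract above is a shared, read-only set of arcs plus a small per-context table of overrides that is consulted at traversal time. The sketch below is an assumption-laden illustration, not the patented design: the Arc fields, the override table keyed by arc ID, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Arc:
    arc_id: int
    src: int
    dst: int
    label: str
    weight: float  # default cost stored in the general model


def effective_weight(arc: Arc, overrides: Dict[int, float]) -> float:
    """Use the context-specific weight if one exists, else the default."""
    return overrides.get(arc.arc_id, arc.weight)


def path_cost(path: List[Arc], overrides: Optional[Dict[int, float]] = None) -> float:
    """Sum arc costs along a path under the given context overrides."""
    table = overrides or {}
    return sum(effective_weight(arc, table) for arc in path)
```

Because the arcs themselves are never modified, the same model can serve many contexts at once; only the small override table changes per utterance.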

Automatic speech recognition incorporating word usage information

Patent number: 10121467

Abstract: A language model for automatic speech processing, such as a finite state transducer (FST), may be configured to incorporate information about how a particular word sequence (a first N-gram) may be used in a manner similar to another (second) N-gram. A score of a component of the FST (such as an arc or state) relating to the first N-gram may be based on information about the second N-gram. Further, the FST may be configured to have an arc between a state of the first N-gram and a state of the second N-gram to allow for cross N-gram backoff, rather than backoff from a larger N-gram to a smaller N-gram, during traversal of the FST during speech processing.

Type: Grant

Filed: June 30, 2016

Date of Patent: November 6, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Ankur Gandhe, Denis Sergeyevich Filimonov, Ariya Rastrow, Björn Hoffmeister
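The cross N-gram backoff described in the abstract above can be sketched at the level of a simple lookup table rather than a real FST. Everything here is a hypothetical illustration: the `similar` map from one history to a history with similar usage, the penalty values, and the function name ngram_cost are assumptions, and scores are treated as costs where lower is better.

```python
from typing import Dict, Tuple

History = Tuple[str, ...]


def ngram_cost(history: History, word: str,
               costs: Dict[Tuple[History, str], float],
               similar: Dict[History, History],
               cross_penalty: float, backoff_penalty: float,
               floor_cost: float) -> float:
    """Score `word` after `history`, preferring a similar history of the
    same length before falling back to a shorter history."""
    if (history, word) in costs:
        return costs[(history, word)]
    alt = similar.get(history)
    if alt is not None and (alt, word) in costs:
        # Cross N-gram backoff: borrow the score from a similar N-gram.
        return cross_penalty + costs[(alt, word)]
    if history:
        # Conventional backoff: drop the oldest word of the history.
        return backoff_penalty + ngram_cost(
            history[1:], word, costs, similar,
            cross_penalty, backoff_penalty, floor_cost)
    return floor_cost  # unseen even as a unigram
```

For example, if "fly to boston" was seen in training but "drive to boston" was not, a `similar` entry linking the history ("drive", "to") to ("fly", "to") lets the model reuse the trigram evidence instead of dropping to the bigram "to boston".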

Speech processing with learned representation of user interaction history

Patent number: 10032463

Abstract: An automatic speech recognition (“ASR”) system produces, for particular users, customized speech recognition results by using data regarding prior interactions of the users with the system. A portion of the ASR system (e.g., a neural-network-based language model) can be trained to produce an encoded representation of a user's interactions with the system based on, e.g., transcriptions of prior utterances made by the user. This user-specific encoded representation of interaction history is then used by the language model to customize ASR processing for the user.

Type: Grant

Filed: December 29, 2015

Date of Patent: July 24, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Ariya Rastrow, Nikko Ström, Spyridon Matsoukas, Markus Dreyer, Ankur Gandhe, Denis Sergeyevich Filimonov, Julian Chan, Rohit Prasad

Compact HCLG FST

Patent number: 10013974

Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime, portions of the FSTs may be decompressed for processing by an ASR engine.

Type: Grant

Filed: June 20, 2016

Date of Patent: July 3, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow

Compressed finite state transducers for automatic speech recognition

Patent number: 9865254

Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime, portions of the FSTs may be decompressed for processing by an ASR engine.

Type: Grant

Filed: June 20, 2016

Date of Patent: January 9, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow
