Abstract: Some embodiments described herein cover a machine learning architecture with a separated perception subsystem and application subsystem. These subsystems can be co-trained. In one example embodiment, a data item is received and information from the data item is processed by a first node to generate a sparse feature vector. A second node processes the sparse feature vector to determine an output. A relevancy rating associated with the output is determined. A determination is made as to whether to update the first node based on update criteria associated with the first node, wherein the update criteria comprise a relevancy criterion and a novelty criterion. The second node is updated based on the relevancy rating.
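The update decision described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method. The abstract names a relevancy criterion and a novelty criterion but does not define them, so the thresholds, the Jaccard-based novelty measure, the requirement that both criteria be met, and every name here (`UpdateCriteria`, `novelty`, `should_update_first_node`) are assumptions. The sparse feature vector is represented as the set of its active indices.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class UpdateCriteria:
    # Hypothetical thresholds; the abstract does not give values or forms.
    min_relevancy: float = 0.5
    min_novelty: float = 0.3


def jaccard(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Overlap between two sets of active feature indices."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def novelty(active: FrozenSet[int], seen: Iterable[FrozenSet[int]]) -> float:
    """Assumed novelty measure: 1 minus the best overlap with any earlier vector."""
    seen = list(seen)
    if not seen:
        return 1.0
    return 1.0 - max(jaccard(active, s) for s in seen)


def should_update_first_node(relevancy: float, novelty_score: float,
                             criteria: UpdateCriteria) -> bool:
    """Assumption: the first node is updated only when both criteria are met."""
    return (relevancy >= criteria.min_relevancy
            and novelty_score >= criteria.min_novelty)


# Usage: one earlier vector, then a repeated one and an unseen one.
seen_vectors = [frozenset({1, 2, 3})]
repeat_score = novelty(frozenset({1, 2, 3}), seen_vectors)
fresh_score = novelty(frozenset({7, 8}), seen_vectors)
```

In this sketch the second node would be updated from the relevancy rating on every step, while the first node changes only when `should_update_first_node` returns true.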
Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
Type: Grant
Filed: December 10, 2018
Date of Patent: February 25, 2020
Assignee: Apprente LLC
Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz
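The replacement step in the abstract above can be sketched as follows, under stated assumptions. The item layout (`SyntheticItem` with a `symbols` field and a `target` field), the function `reconcile`, and the lookup-table `toy_model` are all hypothetical. In the abstract the second sequence comes from a trained machine learning model, for which the lookup table is only a stand-in, and the symbol strings are invented for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class SyntheticItem:
    symbols: Tuple[str, ...]  # first sequence of symbols, as output by the simulator
    target: str               # hypothetical field: whatever the downstream model learns


def reconcile(item: SyntheticItem,
              model: Callable[[Sequence[str]], Sequence[str]]) -> SyntheticItem:
    """Return a copy of the item whose symbols are replaced by the model's output."""
    second_sequence = tuple(model(item.symbols))
    return replace(item, symbols=second_sequence)


def toy_model(symbols: Sequence[str]) -> Sequence[str]:
    """Stand-in for the trained model: a fixed substitution table (invented)."""
    table = {"AH": "AX"}
    return [table.get(s, s) for s in symbols]


original = SyntheticItem(symbols=("HH", "AH", "L", "OW"), target="greeting")
modified = reconcile(original, toy_model)
```

The modified item keeps everything except the symbol sequence, so it can be passed to the training of the second model in place of the original.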
Abstract: A synthetic training data item comprising a first sequence of symbols that represent a synthetic sentence output by a simulator is received. The synthetic training data item is processed using a machine learning model, which outputs a second sequence of symbols that represent the synthetic sentence. The synthetic training data item is modified by replacing the first sequence of symbols with the second sequence of symbols. A statistically significant mismatch exists between the first sequence of symbols and a third sequence of symbols that would be output by an acoustic model that processes a set of acoustic features that represent an utterance of the synthetic sentence, and no statistically significant mismatch exists between the second sequence of symbols and the third sequence of symbols. The modified synthetic training data item may be used to train a second machine learning model that processes data output by the acoustic model.
Type: Grant
Filed: June 10, 2019
Date of Patent: February 11, 2020
Assignee: Apprente LLC
Inventors: Itamar Arel, Joshua Benjamin Looks, Ali Ziaei, Michael Lefkowitz