Patents by Inventor Ashutosh Modi
Ashutosh Modi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11887600
Abstract: In various embodiments, a communication fusion application enables other software application(s) to interpret spoken user input. In operation, a communication fusion application determines that a prediction is relevant to a text input derived from a spoken input received from a user. Subsequently, the communication fusion application generates a predicted context based on the prediction. The communication fusion application then transmits the predicted context and the text input to the other software application(s). The other software application(s) perform additional action(s) based on the text input and the predicted context. Advantageously, by providing additional, relevant information to the software application(s), the communication fusion application increases the level of understanding during interactions with the user, improving the overall user experience.
Type: Grant
Filed: October 4, 2019
Date of Patent: January 30, 2024
Assignee: DISNEY ENTERPRISES, INC.
Inventors: Erika Doggett, Nathan Nocon, Ashutosh Modi, Joseph Charles Sengir, Maxwell McCoy
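The fusion flow this abstract describes can be sketched minimally: a relevance check attaches a predicted context to transcribed speech before both are handed to downstream applications. All names below, and the keyword-overlap relevance test, are illustrative assumptions, not the patented implementation:

```python
# Minimal sketch of communication fusion: attach a predicted context to a
# text input derived from speech, then bundle both for downstream apps.

def is_relevant(prediction, text_input):
    """Toy relevance check: the prediction shares a keyword with the input."""
    words = set(text_input.lower().split())
    return bool(words & set(prediction.get("keywords", [])))

def fuse(text_input, predictions):
    """Build a predicted context from relevant predictions and bundle it
    with the text input for the other software application(s)."""
    relevant = [p for p in predictions if is_relevant(p, text_input)]
    context = {"topics": sorted({t for p in relevant for t in p["keywords"]})}
    return {"text": text_input, "context": context}

message = fuse(
    "play the lion king soundtrack",
    [{"keywords": ["soundtrack", "music"]}, {"keywords": ["weather"]}],
)
```

A downstream application would then act on both `message["text"]` and `message["context"]` rather than on the bare transcription.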
-
Patent number: 11749265
Abstract: Various embodiments disclosed herein provide techniques for performing incremental natural language understanding with a natural language understanding (NLU) system. The NLU system acquires a first audio speech segment associated with a user utterance. The NLU system converts the first audio speech segment into a first text segment. The NLU system determines a first intent based on a text string associated with the first text segment, wherein the text string represents a portion of the user utterance. The NLU system generates a first response based on the first intent before the user utterance completes.
Type: Grant
Filed: October 4, 2019
Date of Patent: September 5, 2023
Assignee: DISNEY ENTERPRISES, INC.
Inventors: Erika Varis Doggett, Ashutosh Modi, Nathan Nocon
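The incremental idea in this abstract — determine an intent from a partial text string and respond before the utterance finishes — can be sketched as follows. The keyword rules stand in for the real intent model; every name is an illustrative assumption:

```python
# Hedged sketch of incremental NLU: text segments arrive one at a time,
# and a response intent is returned at the first confident match,
# before the full utterance has been heard.

INTENT_KEYWORDS = {
    "play_music": ["play", "song"],
    "tell_weather": ["weather", "forecast"],
}

def classify_partial(text_so_far):
    """Return an intent as soon as a keyword appears in the partial text."""
    tokens = text_so_far.lower().split()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in tokens for k in keywords):
            return intent
    return None

def incremental_nlu(segments):
    """Feed text segments in order; stop at the first recognized intent.
    Returns the intent and how many segments were consumed."""
    text = ""
    for i, segment in enumerate(segments, start=1):
        text = (text + " " + segment).strip()
        intent = classify_partial(text)
        if intent is not None:
            return intent, i  # responded before the utterance completed
    return "unknown", len(segments)

intent, used = incremental_nlu(["could you", "play", "that song again"])
```

Here the intent is resolved after the second segment, so a response can be generated while the user is still speaking.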
-
Publication number: 20210104241
Abstract: In various embodiments, a communication fusion application enables other software application(s) to interpret spoken user input. In operation, a communication fusion application determines that a prediction is relevant to a text input derived from a spoken input received from a user. Subsequently, the communication fusion application generates a predicted context based on the prediction. The communication fusion application then transmits the predicted context and the text input to the other software application(s). The other software application(s) perform additional action(s) based on the text input and the predicted context. Advantageously, by providing additional, relevant information to the software application(s), the communication fusion application increases the level of understanding during interactions with the user, improving the overall user experience.
Type: Application
Filed: October 4, 2019
Publication date: April 8, 2021
Inventors: Erika Doggett, Nathan Nocon, Ashutosh Modi, Joseph Charles Sengir, Maxwell McCoy
-
Publication number: 20210104236
Abstract: Various embodiments disclosed herein provide techniques for performing incremental natural language understanding with a natural language understanding (NLU) system. The NLU system acquires a first audio speech segment associated with a user utterance. The NLU system converts the first audio speech segment into a first text segment. The NLU system determines a first intent based on a text string associated with the first text segment, wherein the text string represents a portion of the user utterance. The NLU system generates a first response based on the first intent before the user utterance completes.
Type: Application
Filed: October 4, 2019
Publication date: April 8, 2021
Inventors: Erika Varis Doggett, Ashutosh Modi, Nathan Nocon
-
Patent number: 10818312
Abstract: According to one implementation, an affect-driven dialog generation system includes a computing platform having a hardware processor and a system memory storing a software code including a sequence-to-sequence (seq2seq) architecture trained using a loss function having an affective regularizer term based on a difference in emotional content between a target dialog response and a dialog sequence determined by the seq2seq architecture during training. The hardware processor executes the software code to receive an input dialog sequence, and to use the seq2seq architecture to generate emotionally diverse dialog responses based on the input dialog sequence and a predetermined target emotion. The hardware processor further executes the software code to determine, using the seq2seq architecture, a final dialog sequence responsive to the input dialog sequence based on an emotional relevance of each of the emotionally diverse dialog responses, and to provide the final dialog sequence as an output.
Type: Grant
Filed: December 19, 2018
Date of Patent: October 27, 2020
Assignee: Disney Enterprises, Inc.
Inventors: Ashutosh Modi, Mubbasir Kapadia, Douglas A. Fidaleo, James R. Kennedy, Wojciech Witon, Pierre Colombo
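The loss function this abstract describes — a standard training objective plus an affective regularizer term penalizing the gap in emotional content between target and generated responses — can be illustrated with toy numbers. The emotion vectors, the Euclidean distance, and the weight `lam` are assumptions for illustration, not the patented formulation:

```python
# Toy illustration of a loss with an affective regularizer term:
# L = cross-entropy + lam * distance(emotion(target), emotion(generated)).

import math

def cross_entropy(predicted_probs, target_index):
    """Negative log-likelihood of the target token."""
    return -math.log(predicted_probs[target_index])

def affective_distance(target_emotion, generated_emotion):
    """Euclidean distance between two emotion-embedding vectors."""
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(target_emotion, generated_emotion)))

def affect_regularized_loss(predicted_probs, target_index,
                            target_emotion, generated_emotion, lam=0.5):
    """Combined loss: language-model term plus weighted affective term."""
    return (cross_entropy(predicted_probs, target_index)
            + lam * affective_distance(target_emotion, generated_emotion))

loss = affect_regularized_loss(
    predicted_probs=[0.1, 0.7, 0.2], target_index=1,
    target_emotion=[0.9, 0.1], generated_emotion=[0.6, 0.4],
)
```

During training, the second term pulls generated responses toward the emotional content of the target, which is what lets the system steer toward a predetermined target emotion.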
-
Publication number: 20200202887
Abstract: According to one implementation, an affect-driven dialog generation system includes a computing platform having a hardware processor and a system memory storing a software code including a sequence-to-sequence (seq2seq) architecture trained using a loss function having an affective regularizer term based on a difference in emotional content between a target dialog response and a dialog sequence determined by the seq2seq architecture during training. The hardware processor executes the software code to receive an input dialog sequence, and to use the seq2seq architecture to generate emotionally diverse dialog responses based on the input dialog sequence and a predetermined target emotion. The hardware processor further executes the software code to determine, using the seq2seq architecture, a final dialog sequence responsive to the input dialog sequence based on an emotional relevance of each of the emotionally diverse dialog responses, and to provide the final dialog sequence as an output.
Type: Application
Filed: December 19, 2018
Publication date: June 25, 2020
Inventors: Ashutosh Modi, Mubbasir Kapadia, Douglas A. Fidaleo, James R. Kennedy, Wojciech Witon, Pierre Colombo
-
Patent number: 9984772
Abstract: A computer-implemented method for predicting answers to questions concerning medical image analytics reports includes splitting a medical image analytics report into a plurality of sentences and generating a plurality of sentence embedding vectors by applying a natural language processing framework to the plurality of sentences. A question related to subject matter included in the medical image analytics report is received and a question embedding vector is generated by applying the natural language processing framework to the question. A subset of the sentence embedding vectors most similar to the question embedding vector is identified by applying a similarity matching process to the sentence embedding vectors and the question embedding vector. A trained recurrent neural network (RNN) is used to determine a predicted answer to the question based on the subset of the sentence embedding vectors.
Type: Grant
Filed: March 10, 2017
Date of Patent: May 29, 2018
Assignee: Siemens Healthcare GmbH
Inventors: Wen Liu, Ashutosh Modi, Bogdan Georgescu, Francisco Pereira
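The retrieval step in this abstract — embed the report's sentences and the question, then select the sentences most similar to the question before answer prediction — can be sketched as follows. A bag-of-words vector stands in for the real sentence-embedding framework, and cosine similarity is an assumed choice of similarity matching; the report text is invented for illustration:

```python
# Hedged sketch of similarity matching between a question embedding and
# sentence embeddings: the top-k most similar sentences would be passed
# on to the answer-prediction model.

import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words counts with punctuation stripped."""
    return Counter(w.strip("?,.").lower() for w in text.split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def top_k_sentences(report, question, k=2):
    """Split the report into sentences; return the k most question-similar."""
    sentences = [s.strip() for s in report.split(".") if s.strip()]
    q_vec = embed(question)
    ranked = sorted(sentences, key=lambda s: cosine(embed(s), q_vec),
                    reverse=True)
    return ranked[:k]

report = ("The scan shows a small lesion in the left lung. "
          "Cardiac silhouette is normal. No pleural effusion is seen.")
selected = top_k_sentences(report, "Is there a lesion in the lung?", k=1)
```

In the method as described, the embeddings of `selected` would then be fed to a trained RNN to produce the predicted answer.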
-
Publication number: 20170293725
Abstract: A computer-implemented method for predicting answers to questions concerning medical image analytics reports includes splitting a medical image analytics report into a plurality of sentences and generating a plurality of sentence embedding vectors by applying a natural language processing framework to the plurality of sentences. A question related to subject matter included in the medical image analytics report is received and a question embedding vector is generated by applying the natural language processing framework to the question. A subset of the sentence embedding vectors most similar to the question embedding vector is identified by applying a similarity matching process to the sentence embedding vectors and the question embedding vector. A trained recurrent neural network (RNN) is used to determine a predicted answer to the question based on the subset of the sentence embedding vectors.
Type: Application
Filed: March 10, 2017
Publication date: October 12, 2017
Inventors: Wen Liu, Ashutosh Modi, Bogdan Georgescu, Francisco Pereira