Patents by Inventor Sandeepkumar SATPAL

Sandeepkumar SATPAL has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Computing system for domain expressive text to speech

Patent number: 12293756

Abstract: A computing system obtains text that includes words and provides the text as input to an emotional classifier model that has been trained based upon emotional classification. The computing system obtains a textual embedding of the computer-readable text as output of the emotional classifier model. The computing system generates a phoneme sequence based upon the words of the text. The computing system generates, by way of an encoder of a text to speech (TTS) model, a phoneme encoding based upon the phoneme sequence. The computing system provides the textual embedding and the phoneme encoding as input to a decoder of the TTS model. The computing system causes speech that includes the words to be played over a speaker based upon output of the decoder of the TTS model, where the speech reflects an emotion underlying the text due to the textual embedding provided to the encoder.
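The abstract above says the textual embedding and the phoneme encoding are both provided to the decoder, but it does not say how they are combined. One common way to condition a decoder on an utterance-level vector is to append that vector to every frame of the sequence. The sketch below illustrates only that idea and is not taken from the patent: the function name, the concatenation step, and the toy values are all assumptions.

```python
def condition_on_text_embedding(phoneme_encoding, text_embedding):
    """Append one utterance-level embedding to every phoneme frame.

    Hypothetical helper: concatenation is an assumed combination method,
    not something the abstract specifies.
    """
    return [frame + text_embedding for frame in phoneme_encoding]


# Toy stand-in for an encoder output: 3 phonemes, 2 dimensions each.
phoneme_encoding = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Toy stand-in for the emotional classifier's embedding of the whole text.
text_embedding = [0.9, -0.7, 0.0]

# Each decoder input frame now carries phoneme and emotion information.
decoder_input = condition_on_text_embedding(phoneme_encoding, text_embedding)
```

Under this assumption the decoder sees the same emotion vector at every time step, so the sequence length is unchanged and only the per-frame width grows.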

Type: Grant

Filed: November 11, 2021

Date of Patent: May 6, 2025

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Arijit Mukherjee, Shubham Bansal, Sandeepkumar Satpal, Rupeshkumar Rasiklal Mehta

COMPUTING SYSTEM FOR UNSUPERVISED EMOTIONAL TEXT TO SPEECH TRAINING

Publication number: 20230111824

Abstract: A text to speech (TTS) model is trained based on training data including text samples. The text samples are provided to a text embedding model for outputting text embeddings for the text samples. The text embeddings are clustered into several clusters of text embeddings. The several clusters are representative of variations in emotion. The TTS model is then trained based upon the several clusters of text embeddings. Upon being trained, the TTS model is configured to receive text input and output a spoken utterance that corresponds to the text input. The TTS model is configured to output the spoken utterance with emotion. The emotion is based upon the text input and the training of the TTS model.
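The clustering step in the abstract above can be illustrated with a small k-means sketch. The abstract does not name a clustering algorithm, so k-means, the farthest-point initialization, the function names, and the two-dimensional toy embeddings are all assumptions chosen for illustration. A real text embedding model would produce much higher-dimensional vectors.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def farthest_point_init(embeddings, k):
    """Pick k starting centroids: the first point, then repeatedly the
    point farthest from all centroids chosen so far (deterministic)."""
    centroids = [list(embeddings[0])]
    while len(centroids) < k:
        farthest = max(
            embeddings,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(embeddings, k, iterations=20):
    """Assumed clustering method: plain k-means over text embeddings."""
    centroids = farthest_point_init(embeddings, k)
    labels = [0] * len(embeddings)
    for _ in range(iterations):
        # Assign every embedding to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in embeddings
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(embeddings, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids


# Toy embeddings: two well-separated groups standing in for two emotions.
toy_embeddings = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 5.0], [4.9, 5.0],
]
labels, centroids = kmeans(toy_embeddings, k=2)
```

In a setup like this, the cluster index of each text sample could serve as an unsupervised stand-in for an emotion label during TTS training, which avoids hand-annotating the training text. That use of the labels is also an assumption; the abstract states only that the model is trained based upon the clusters.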

Type: Application

Filed: February 22, 2022

Publication date: April 13, 2023

Inventors: Arijit MUKHERJEE, Shubham BANSAL, Sandeepkumar SATPAL, Rupeshkumar Rasiklal MEHTA

COMPUTING SYSTEM FOR DOMAIN EXPRESSIVE TEXT TO SPEECH

Publication number: 20230099732

Abstract: A computing system obtains text that includes words and provides the text as input to an emotional classifier model that has been trained based upon emotional classification. The computing system obtains a textual embedding of the computer-readable text as output of the emotional classifier model. The computing system generates a phoneme sequence based upon the words of the text. The computing system generates, by way of an encoder of a text to speech (TTS) model, a phoneme encoding based upon the phoneme sequence. The computing system provides the textual embedding and the phoneme encoding as input to a decoder of the TTS model. The computing system causes speech that includes the words to be played over a speaker based upon output of the decoder of the TTS model, where the speech reflects an emotion underlying the text due to the textual embedding provided to the encoder.

Type: Application

Filed: November 11, 2021

Publication date: March 30, 2023

Inventors: Arijit MUKHERJEE, Shubham BANSAL, Sandeepkumar SATPAL, Rupeshkumar Rasiklal MEHTA
