Patents by Inventor Nishant Prateek

Nishant Prateek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

TEXT-TO-SPEECH PROCESSING USING INPUT VOICE CHARACTERISTIC DATA

Publication number: 20230043916

Abstract: During text-to-speech processing, a speech model creates synthesized speech that corresponds to input data. The speech model may include an encoder for encoding the input data into a context vector and a decoder for decoding the context vector into spectrogram data. The speech model may further include a voice decoder that receives vocal characteristic data representing a desired vocal characteristic of synthesized speech. The voice decoder may process the vocal characteristic data to determine configuration data, such as weights, for use by the speech decoder.

Type: Application

Filed: June 24, 2022

Publication date: February 9, 2023

Inventors: Roberto Barra Chicote, Vatsal Aggarwal, Andrew Paul Breen, Javier Gonzalez Hernandez, Nishant Prateek

Text-to-speech processing using input voice characteristic data

Patent number: 11373633

Abstract: During text-to-speech processing, a speech model creates synthesized speech that corresponds to input data. The speech model may include an encoder for encoding the input data into a context vector and a decoder for decoding the context vector into spectrogram data. The speech model may further include a voice decoder that receives vocal characteristic data representing a desired vocal characteristic of synthesized speech. The voice decoder may process the vocal characteristic data to determine configuration data, such as weights, for use by the speech decoder.

Type: Grant

Filed: September 27, 2019

Date of Patent: June 28, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Roberto Barra Chicote, Vatsal Aggarwal, Andrew Paul Breen, Javier Gonzalez Hernandez, Nishant Prateek

Synthetic speech processing

Patent number: 11017763

Abstract: During text-to-speech processing, a sequence-to-sequence neural network model may process text data and determine corresponding spectrogram data. A normalizing flow component may then process this spectrogram data to predict corresponding phase data. An inverse Fourier transform may then be performed on the spectrogram and phase data to create an audio waveform that includes speech corresponding to the text.

Type: Grant

Filed: December 12, 2019

Date of Patent: May 25, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Vatsal Aggarwal, Nishant Prateek, Roberto Barra Chicote, Andrew Paul Breen

TEXT-TO-SPEECH PROCESSING

Publication number: 20210097976

Abstract: During text-to-speech processing, a speech model creates synthesized speech that corresponds to input data. The speech model may include an encoder for encoding the input data into a context vector and a decoder for decoding the context vector into spectrogram data. The speech model may further include a voice decoder that receives vocal characteristic data representing a desired vocal characteristic of synthesized speech. The voice decoder may process the vocal characteristic data to determine configuration data, such as weights, for use by the speech decoder.

Type: Application

Filed: September 27, 2019

Publication date: April 1, 2021

Inventors: Roberto Barra Chicote, Vatsal Aggarwal, Andrew Paul Breen, Javier Gonzalez Hernandez, Nishant Prateek

TEXT-TO-SPEECH (TTS) PROCESSING

Publication number: 20200410981

Abstract: A speech model is trained using multi-task learning. A first task may correspond to how well predicted audio matches training audio; a second task may correspond to a metric of perceived audio quality. The speech model may include, during training, layers related to the second task that are discarded at runtime.

Type: Application

Filed: May 19, 2020

Publication date: December 31, 2020

Inventors: Thomas Edward Merritt, Adam Franciszek Nadolski, Nishant Prateek, Bartosz Putrycz, Roberto Barra Chicote, Vatsal Aggarwal, Andrew Paul Breen

Text-to-speech (TTS) processing

Patent number: 10692484

Abstract: A speech model is trained using multi-task learning. A first task may correspond to how well predicted audio matches training audio; a second task may correspond to a metric of perceived audio quality. The speech model may include, during training, layers related to the second task that are discarded at runtime.

Type: Grant

Filed: June 13, 2018

Date of Patent: June 23, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Thomas Edward Merritt, Adam Franciszek Nadolski, Nishant Prateek, Bartosz Putrycz, Roberto Barra Chicote, Vatsal Aggarwal, Andrew Paul Breen
