Patents by Inventor Ankita Jha

Ankita Jha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250166603
    Abstract: The disclosed technology relates to methods, speech processing systems, and non-transitory computer readable media for real-time accent mimicking. In some examples, trained machine learning model(s) are applied to first input audio data to extract accent features of first input speech associated with a first accent of a first user. Obtained second input data associated with second input speech associated with a second accent of a second user is analyzed to generate characteristics specific to a natural voice of the second user. A modified version of the second input speech is synthesized based on the generated characteristics and the extracted accent features. The modified version of the second input speech advantageously preserves aspects of the natural voice of the second user and mimics the first accent. Output audio data generated based on the modified version of the second input speech is provided for output via an audio output device.
    Type: Application
    Filed: January 17, 2025
    Publication date: May 22, 2025
    Inventors: Ankita Jha, Lukas Pfeifenberger, Piotr Dura, David Braude, Alvaro Escudero, Shawn Zhang, Maxim Serebryakov
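The abstract above describes a three-stage pipeline: extract accent features from a first speaker, extract natural-voice characteristics from a second speaker, and synthesize the second speaker's speech so it keeps their voice identity while mimicking the first accent. A minimal structural sketch follows; every name, field, and placeholder value is illustrative — the patent does not disclose this API or these features.

```python
# Hypothetical sketch of the accent-mimicking pipeline. The two extractor
# functions stand in for the trained machine learning models; their
# outputs are made-up feature summaries, not real model outputs.
from dataclasses import dataclass

@dataclass
class AccentFeatures:
    vowel_shifts: dict      # e.g. per-vowel formant adjustments
    rhythm_ratio: float     # timing characteristic of the accent

@dataclass
class VoiceCharacteristics:
    pitch_hz: float         # second speaker's natural fundamental frequency
    timbre: list            # spectral-envelope summary of the natural voice

def extract_accent_features(first_audio) -> AccentFeatures:
    # Placeholder for the trained model that isolates accent cues.
    return AccentFeatures(vowel_shifts={"ae": 0.1}, rhythm_ratio=0.9)

def extract_voice_characteristics(second_audio) -> VoiceCharacteristics:
    # Placeholder for the analysis of the second speaker's natural voice.
    return VoiceCharacteristics(pitch_hz=180.0, timbre=[0.2, 0.5, 0.3])

def synthesize(voice: VoiceCharacteristics, accent: AccentFeatures) -> dict:
    # The modified speech preserves the second speaker's voice identity
    # while carrying the first speaker's accent features.
    return {
        "pitch_hz": voice.pitch_hz,           # preserved from speaker 2
        "timbre": voice.timbre,               # preserved from speaker 2
        "vowel_shifts": accent.vowel_shifts,  # mimicked from speaker 1
        "rhythm_ratio": accent.rhythm_ratio,  # mimicked from speaker 1
    }

modified = synthesize(extract_voice_characteristics(b"..."),
                      extract_accent_features(b"..."))
```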
  • Publication number: 20250095665
    Abstract: The disclosed technology relates to methods, speech processing systems, and non-transitory computer readable media for real-time accent localization. In some examples, a geolocation of a first user device is determined, and accent features are extracted from first input speech, in response to first input audio data comprising the first input speech obtained from the first user device. Accent profiles identified based on the determined geolocation are compared to the extracted accent features to identify one of the accent profiles most closely matching the extracted accent features. Second input speech is modified to adjust an accent represented in the second input speech based on the identified one of the accent profiles. The second input speech with the adjusted accent is then provided to an audio interface of a second user device to improve communication bridging between users of the first and second user devices.
    Type: Application
    Filed: December 2, 2024
    Publication date: March 20, 2025
    Inventors: Ankita Jha, Piotr Dura, David Braude, Lukas Pfeifenberger, Alvaro Escudero, Shawn Zhang, Maxim Serebryakov
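The matching step in the abstract above — narrow candidate accent profiles by geolocation, then pick the profile closest to the extracted accent features — can be sketched as a nearest-neighbor lookup. The region keys, profile vectors, and Euclidean distance metric here are all assumptions for illustration, not details from the patent.

```python
# Illustrative profile matching: for a given geolocation, select the
# accent profile whose feature vector is closest to the features
# extracted from the first speaker's audio. Profile data is made up.
import math

ACCENT_PROFILES_BY_REGION = {
    "US-TX": {"southern_us": [0.8, 0.3], "general_american": [0.2, 0.6]},
    "UK-LDN": {"rp": [0.1, 0.9], "estuary": [0.3, 0.7]},
}

def closest_profile(geolocation: str, features: list) -> str:
    """Return the profile for this region with minimal Euclidean
    distance to the extracted accent-feature vector."""
    candidates = ACCENT_PROFILES_BY_REGION[geolocation]
    return min(candidates,
               key=lambda name: math.dist(candidates[name], features))

# A feature vector extracted from the first speaker's speech (made up).
match = closest_profile("US-TX", [0.75, 0.35])
```

The selected profile would then drive the accent adjustment applied to the second input speech before it is delivered to the second user device.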
  • Publication number: 20250029626
    Abstract: The disclosed technology relates to methods, voice enhancement systems, and non-transitory computer readable media for real-time voice enhancement. In some examples, input audio data including foreground speech content, non-content elements, and speech characteristics is fragmented into input speech frames. The input speech frames are converted to low-dimensional representations of the input speech frames. One or more of the fragmentation or the conversion is based on an application of a first trained neural network to the input audio data. The low-dimensional representations of the input speech frames omit one or more of the non-content elements. A second trained neural network is applied to the low-dimensional representations of the input speech frames to generate target speech frames. The target speech frames are combined to generate output audio data. The output audio data further includes one or more portions of the foreground speech content and one or more of the speech characteristics.
    Type: Application
    Filed: October 4, 2024
    Publication date: January 23, 2025
    Inventors: Shawn Zhang, Lukas Pfeifenberger, Jason Wu, Piotr Dura, David Braude, Bajibabu Bollepalli, Alvaro Escudero, Gokce Keskin, Ankita Jha, Maxim Serebryakov
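The frame-based flow in the abstract above — fragment audio into frames, map each frame to a low-dimensional representation, map those representations to target frames, and combine the frames into output audio — can be sketched numerically. Here two random linear maps stand in for the two trained neural networks; frame size, dimensions, and the reshape-based fragmentation are assumptions, not the patented models.

```python
# Minimal numeric sketch of the two-stage, frame-based enhancement
# pipeline. The "encoder" and "decoder" matrices are stand-ins for the
# first and second trained neural networks of the abstract.
import numpy as np

rng = np.random.default_rng(0)
FRAME = 8      # samples per input speech frame (assumed)
LOW_DIM = 3    # size of the low-dimensional representation (assumed)

encoder = rng.standard_normal((LOW_DIM, FRAME))   # stands in for network 1
decoder = rng.standard_normal((FRAME, LOW_DIM))   # stands in for network 2

def enhance(audio: np.ndarray) -> np.ndarray:
    frames = audio.reshape(-1, FRAME)   # fragmentation into speech frames
    low_dim = frames @ encoder.T        # low-dimensional representations
    target = low_dim @ decoder.T        # target speech frames
    return target.reshape(-1)           # frames combined into output audio

audio_in = rng.standard_normal(4 * FRAME)
audio_out = enhance(audio_in)
```

Because the low-dimensional bottleneck cannot represent everything in a frame, detail is necessarily discarded on the way through — which is the mechanism by which non-content elements are omitted while the output keeps the same overall shape as the input.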
  • Publication number: 20250014587
    Abstract: The disclosed technology relates to methods, background noise suppression systems, and non-transitory computer readable media for background noise suppression. In some examples, frames fragmented from input audio data are projected into a higher dimension space than the input audio data. An estimated speech mask is applied to the frames to separate speech components and noise components of the frames. The speech components are then transformed into a feature domain of the input audio data by performing an inverse projection on the speech components to generate output audio data. The output audio data is provided via an audio interface. The output audio data advantageously comprises a noise-suppressed version of the input audio data.
    Type: Application
    Filed: September 19, 2024
    Publication date: January 9, 2025
    Inventors: Lukas Pfeifenberger, Shawn Zhang, Monal Patel, Maxim Serebryakov, Raj Vardhan, Lan Shek, Ankita Jha
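The suppression loop in the abstract above — project each frame into a higher-dimensional space, apply an estimated speech mask to separate speech from noise, then inverse-project the speech components back to the original domain — matches the familiar shape of spectral masking. The sketch below uses a DFT as the projection and a fixed binary mask in place of the estimated speech mask; both choices are illustrative assumptions, not the patented method.

```python
# Sketch of mask-based noise suppression: frequency-domain projection,
# masking, and inverse projection. The fixed mask stands in for the
# estimated speech mask of the abstract.
import numpy as np

def suppress(frames: np.ndarray, speech_mask: np.ndarray) -> np.ndarray:
    spectra = np.fft.rfft(frames, axis=1)   # projection to a higher-dim space
    speech = spectra * speech_mask          # keep speech-dominated bins only
    # Inverse projection returns the speech components to the input domain.
    return np.fft.irfft(speech, n=frames.shape[1], axis=1)

frames = np.array([[1.0, 0.0, -1.0, 0.0],   # tone frame ("speech")
                   [1.0, 1.0, 1.0, 1.0]])   # constant frame ("noise")
mask = np.array([0.0, 1.0, 0.0])            # pass only the middle bin
clean = suppress(frames, mask)
# The tone frame survives unchanged; the constant frame is zeroed out.
```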
  • Patent number: 12125496
    Abstract: The disclosed technology relates to methods, voice enhancement systems, and non-transitory computer readable media for real-time voice enhancement. In some examples, input audio data including foreground speech content, non-content elements, and speech characteristics is fragmented into input speech frames. The input speech frames are converted to low-dimensional representations of the input speech frames. One or more of the fragmentation or the conversion is based on an application of a first trained neural network to the input audio data. The low-dimensional representations of the input speech frames omit one or more of the non-content elements. A second trained neural network is applied to the low-dimensional representations of the input speech frames to generate target speech frames. The target speech frames are combined to generate output audio data. The output audio data further includes one or more portions of the foreground speech content and one or more of the speech characteristics.
    Type: Grant
    Filed: April 24, 2024
    Date of Patent: October 22, 2024
    Assignee: SANAS.AI INC.
    Inventors: Shawn Zhang, Lukas Pfeifenberger, Jason Wu, Piotr Dura, David Braude, Bajibabu Bollepalli, Alvaro Escudero, Gokce Keskin, Ankita Jha, Maxim Serebryakov