Patents by Inventor Valentin Alain Jean Perret

Valentin Alain Jean Perret has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SPEAKER SEPARATION BASED ON REAL-TIME LATENT SPEAKER STATE CHARACTERIZATION

Publication number: 20240153509

Abstract: Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.

Type: Application

Filed: September 14, 2023

Publication date: May 9, 2024

Inventors: Valentin Alain Jean Perret, Nándor Kedves, Nicolas Lucien Perony
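
The segmentation flow described in the abstract above can be illustrated with a minimal sketch. It is not the patented method: it assumes fixed-length chunks, exactly one identity embedding per chunk, and a simple online nearest-centroid assignment by cosine similarity. The names StreamingSegmenter, Segment, chunk_seconds and threshold are hypothetical and do not come from the filing.

```python
import math
from dataclasses import dataclass


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Segment:
    speaker: int   # hypothetical index, assigned in order of first appearance
    start: float   # seconds
    end: float     # seconds


class StreamingSegmenter:
    """Assumed online scheme: one embedding per fixed-length chunk."""

    def __init__(self, chunk_seconds=1.0, threshold=0.8):
        self.chunk_seconds = chunk_seconds
        self.threshold = threshold   # assumed similarity cut-off for "same speaker"
        self.centroids = []          # running-mean embedding per known speaker
        self.counts = []             # number of chunks folded into each centroid
        self.segments = []
        self.time = 0.0

    def _assign(self, emb):
        # Pick the most similar known speaker at or above the threshold.
        best, best_sim = None, self.threshold
        for i, centroid in enumerate(self.centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            # No centroid is close enough: treat the chunk as a new speaker.
            self.centroids.append(list(emb))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Update the matched centroid as a running mean.
        n = self.counts[best] + 1
        self.centroids[best] = [
            c + (e - c) / n for c, e in zip(self.centroids[best], emb)
        ]
        self.counts[best] = n
        return best

    def push(self, emb):
        """Consume the embedding of the next chunk as the stream arrives."""
        speaker = self._assign(emb)
        start, end = self.time, self.time + self.chunk_seconds
        self.time = end
        if self.segments and self.segments[-1].speaker == speaker:
            self.segments[-1].end = end   # extend the current segment
        else:
            self.segments.append(Segment(speaker, start, end))
        return speaker
```

Calling push once per chunk keeps the segments list up to date while audio is still arriving, so segment information can be reported without waiting for the end of the stream.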

SAMPLE-EFFICIENT REPRESENTATION LEARNING FOR REAL-TIME LATENT SPEAKER STATE CHARACTERISATION

Publication number: 20230352031

Abstract: Systems, methods, and non-transitory computer-readable media can provide audio waveform data that corresponds to a voice sample to a temporal convolutional network for evaluation. The temporal convolutional network can pre-process the audio waveform data and can output an identity embedding associated with the audio waveform data. The identity embedding associated with the voice sample can be obtained from the temporal convolutional network. Information describing a speaker associated with the voice sample can be determined based at least in part on the identity embedding.

Type: Application

Filed: March 31, 2023

Publication date: November 2, 2023

Inventors: Valentin Alain Jean Perret, Nicolas Lucien Perony, Nándor Kedves
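
To make the idea of a temporal convolutional network that outputs an identity embedding more concrete, here is a toy sketch. It is an assumption-laden illustration, not the network in the filing: it stacks causal dilated 1-D convolutions with ReLU and averages over time to get one value per filter. The function names, the dilations (1, 2, 4) and any filter weights passed in are hypothetical and untrained.

```python
def causal_dilated_conv(signal, kernel, dilation):
    """1-D causal convolution: out[t] uses only signal[t], signal[t-d], signal[t-2d], ..."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:               # samples before the start count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def relu(xs):
    """Element-wise rectifier."""
    return [x if x > 0.0 else 0.0 for x in xs]


def embed(waveform, filters, dilations=(1, 2, 4)):
    """Map a waveform of any non-zero length to a vector with one entry per filter."""
    if not waveform:
        raise ValueError("waveform must contain at least one sample")
    embedding = []
    for kernel in filters:
        h = list(waveform)
        for d in dilations:            # growing dilation widens the receptive field
            h = relu(causal_dilated_conv(h, kernel, d))
        embedding.append(sum(h) / len(h))   # mean over time gives a fixed size
    return embedding
```

With a kernel of size 2 and dilations 1, 2 and 4, each output sample in this sketch depends on the current sample and the 7 before it. The resulting vector could then be compared against stored embeddings, for example by cosine similarity, to determine information about the speaker.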

Speaker separation based on real-time latent speaker state characterization

Patent number: 11790921

Abstract: Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.

Type: Grant

Filed: February 8, 2021

Date of Patent: October 17, 2023

Assignee: OTO Systems Inc.

Inventors: Valentin Alain Jean Perret, Nándor Kedves, Nicolas Lucien Perony

Sample-efficient representation learning for real-time latent speaker state characterization

Patent number: 11646037

Abstract: Systems, methods, and non-transitory computer-readable media can provide audio waveform data that corresponds to a voice sample to a temporal convolutional network for evaluation. The temporal convolutional network can pre-process the audio waveform data and can output an identity embedding associated with the audio waveform data. The identity embedding associated with the voice sample can be obtained from the temporal convolutional network. Information describing a speaker associated with the voice sample can be determined based at least in part on the identity embedding.

Type: Grant

Filed: December 8, 2020

Date of Patent: May 9, 2023

Assignee: OTO Systems Inc.

Inventors: Valentin Alain Jean Perret, Nicolas Lucien Perony, Nándor Kedves

SAMPLE-EFFICIENT REPRESENTATION LEARNING FOR REAL-TIME LATENT SPEAKER STATE CHARACTERIZATION

Publication number: 20220044688

Abstract: Systems, methods, and non-transitory computer-readable media can provide audio waveform data that corresponds to a voice sample to a temporal convolutional network for evaluation. The temporal convolutional network can pre-process the audio waveform data and can output an identity embedding associated with the audio waveform data. The identity embedding associated with the voice sample can be obtained from the temporal convolutional network. Information describing a speaker associated with the voice sample can be determined based at least in part on the identity embedding.

Type: Application

Filed: December 8, 2020

Publication date: February 10, 2022

Inventors: Valentin Alain Jean Perret, Nicolas Lucien Perony, Nándor Kedves

SPEAKER SEPARATION BASED ON REAL-TIME LATENT SPEAKER STATE CHARACTERIZATION

Publication number: 20220044687

Abstract: Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.

Type: Application

Filed: February 8, 2021

Publication date: February 10, 2022

Inventors: Valentin Alain Jean Perret, Nándor Kedves, Nicolas Lucien Perony