Patents by Inventor Lyle Patrick Stein

Lyle Patrick Stein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Text to audio conversion with style conditioning

Patent number: 12573367

Abstract: A style encoder can be trained to encode audio style and audio characteristics into selected regions of a style vector. The style vector can be used to condition a text to speech (TTS) model to generate speech with human-understandable and controllable styles. Various training strategies of the style encoder are described, including a first, second and third training strategy that can be used to disentangle audio styles into selected regions of a style vector. The distinct regions of the style vector can be used to provide numerous customization options to a user of the described system, along with tools to generate speech with a speaker identity and using selected audio styles and characteristics.

Type: Grant

Filed: August 11, 2023

Date of Patent: March 10, 2026

Assignee: Naro Corp.

Inventors: Todd Silverstein, Max Florian Frenzel, Lyle Patrick Stein

Text to audio conversion with disentangled style conditioning

Patent number: 12567421

Abstract: A style encoder can be trained to encode audio style and audio characteristics into selected regions of a style vector. The style vector can be used to condition a text to speech (TTS) model to generate speech with human-understandable and controllable styles. Various training strategies of the style encoder are described, including a first, second and third training strategy that can be used to disentangle audio styles into selected regions of a style vector. The distinct regions of the style vector can be used to provide numerous customization options to a user of the described system, along with tools to generate speech with a speaker identity and using selected audio styles and characteristics.

Type: Grant

Filed: August 11, 2023

Date of Patent: March 3, 2026

Assignee: Naro Corp.

Inventors: Lyle Patrick Stein, Max Florian Frenzel, Todd Silverstein

TEXT TO AUDIO CONVERSION WITH STYLE CONDITIONING

Publication number: 20250054487

Abstract: A style encoder can be trained to encode audio style and audio characteristics into selected regions of a style vector. The style vector can be used to condition a text to speech (TTS) model to generate speech with human-understandable and controllable styles. Various training strategies of the style encoder are described, including a first, second and third training strategy that can be used to disentangle audio styles into selected regions of a style vector. The distinct regions of the style vector can be used to provide numerous customization options to a user of the described system, along with tools to generate speech with a speaker identity and using selected audio styles and characteristics.

Type: Application

Filed: August 11, 2023

Publication date: February 13, 2025

Inventors: Todd Silverstein, Max Florian Frenzel, Lyle Patrick Stein

TEXT TO AUDIO CONVERSION WITH DISENTANGLED STYLE CONDITIONING

Publication number: 20250054502

Abstract: A style encoder can be trained to encode audio style and audio characteristics into selected regions of a style vector. The style vector can be used to condition a text to speech (TTS) model to generate speech with human-understandable and controllable styles. Various training strategies of the style encoder are described, including a first, second and third training strategy that can be used to disentangle audio styles into selected regions of a style vector. The distinct regions of the style vector can be used to provide numerous customization options to a user of the described system, along with tools to generate speech with a speaker identity and using selected audio styles and characteristics.

Type: Application

Filed: August 11, 2023

Publication date: February 13, 2025

Inventors: Lyle Patrick Stein, Max Florian Frenzel, Todd Silverstein

SYSTEMS AND METHODS OF TEXT TO AUDIO CONVERSION

Publication number: 20230386475

Abstract: A text to speech system can be implemented by training artificial intelligence models directed to encoding speech characteristics into an audio fingerprint and synthesizing audio based on the fingerprint. The speech characteristics can include a variety of attributes that can occur in natural speech, such as speech variation due to prosody. Speaker identity can, but does not have to, also be used in synthesizing speech. A pipeline using an audio processing device can receive a video clip or a collection of video clips and generate a synthesized video with varying degrees of association with the received video. A user of the pipeline can enter customization to modify the synthesized audio. A trained encoder can generate a fingerprint and a synthesizer can generate synthesized audio based on the fingerprint.

Type: Application

Filed: May 29, 2022

Publication date: November 30, 2023

Inventors: Max Florian Frenzel, Todd Silverstein, Lyle Patrick Stein
