Patents by Inventor Lyle Patrick Stein

Lyle Patrick Stein has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12573367
    Abstract: A style encoder can be trained to encode audio style and audio characteristics into selected regions of a style vector. The style vector can be used to condition a text to speech (TTS) model to generate speech with human-understandable and controllable styles. Various training strategies of the style encoder are described, including a first, second and third training strategy that can be used to disentangle audio styles into selected regions of a style vector. The distinct regions of the style vector can be used to provide numerous customization options to a user of the described system, along with tools to generate speech with a speaker identity and using selected audio styles and characteristics.
    Type: Grant
    Filed: August 11, 2023
    Date of Patent: March 10, 2026
    Assignee: Naro Corp.
    Inventors: Todd Silverstein, Max Florian Frenzel, Lyle Patrick Stein
  • Patent number: 12567421
    Abstract: A style encoder can be trained to encode audio style and audio characteristics into selected regions of a style vector. The style vector can be used to condition a text to speech (TTS) model to generate speech with human-understandable and controllable styles. Various training strategies of the style encoder are described, including a first, second and third training strategy that can be used to disentangle audio styles into selected regions of a style vector. The distinct regions of the style vector can be used to provide numerous customization options to a user of the described system, along with tools to generate speech with a speaker identity and using selected audio styles and characteristics.
    Type: Grant
    Filed: August 11, 2023
    Date of Patent: March 3, 2026
    Assignee: Naro Corp.
    Inventors: Lyle Patrick Stein, Max Florian Frenzel, Todd Silverstein
  • Publication number: 20250054487
    Abstract: A style encoder can be trained to encode audio style and audio characteristics into selected regions of a style vector. The style vector can be used to condition a text to speech (TTS) model to generate speech with human-understandable and controllable styles. Various training strategies of the style encoder are described, including a first, second and third training strategy that can be used to disentangle audio styles into selected regions of a style vector. The distinct regions of the style vector can be used to provide numerous customization options to a user of the described system, along with tools to generate speech with a speaker identity and using selected audio styles and characteristics.
    Type: Application
    Filed: August 11, 2023
    Publication date: February 13, 2025
    Inventors: Todd Silverstein, Max Florian Frenzel, Lyle Patrick Stein
  • Publication number: 20250054502
    Abstract: A style encoder can be trained to encode audio style and audio characteristics into selected regions of a style vector. The style vector can be used to condition a text to speech (TTS) model to generate speech with human-understandable and controllable styles. Various training strategies of the style encoder are described, including a first, second and third training strategy that can be used to disentangle audio styles into selected regions of a style vector. The distinct regions of the style vector can be used to provide numerous customization options to a user of the described system, along with tools to generate speech with a speaker identity and using selected audio styles and characteristics.
    Type: Application
    Filed: August 11, 2023
    Publication date: February 13, 2025
    Inventors: Lyle Patrick Stein, Max Florian Frenzel, Todd Silverstein
  • Publication number: 20230386475
    Abstract: A text to speech system can be implemented by training artificial intelligence models directed to encoding speech characteristics into an audio fingerprint and synthesizing audio based on the fingerprint. The speech characteristics can include a variety of attributes that can occur in natural speech, such as speech variation due to prosody. Speaker identity can, but does not have to, also be used in synthesizing speech. A pipeline using an audio processing device can receive a video clip or a collection of video clips and generate a synthesized video with varying degrees of association with the received video. A user of the pipeline can enter customization to modify the synthesized audio. A trained encoder can generate a fingerprint and a synthesizer can generate synthesized audio based on the fingerprint.
    Type: Application
    Filed: May 29, 2022
    Publication date: November 30, 2023
    Inventors: Max Florian Frenzel, Todd Silverstein, Lyle Patrick Stein