Patents by Inventor Michael Getty HORGAN

Michael Getty HORGAN has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods for adapting human speaker embeddings in speech synthesis

Patent number: 11929058

Abstract: Novel methods and systems for adapting a voice cloning synthesizer for a new speaker using real speech data are disclosed. Utterances from one or more target speakers are parameterized and are used to initialize an embedding vector for use with a voice synthesizer, by means of clustering the utterance data and determining the centroid of the data, using a speaker identification neural network, and/or by finding the closest stored embedded vector to the utterance data.
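
Two of the initialization options named in this abstract (taking the centroid of the utterance data, and finding the closest stored embedding) can be illustrated with a minimal sketch. This is not the patented method: the names `centroid`, `nearest_stored`, `speaker_a`, and `speaker_b` are hypothetical, the two-dimensional vectors are made up, Euclidean distance is an assumed metric, and all utterances are treated as a single cluster rather than being clustered first.

```python
import math

def centroid(vectors):
    """Element-wise mean of equal-length vectors (all utterances as one cluster)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def nearest_stored(query, stored):
    """Name of the stored embedding closest to query (Euclidean distance, an assumption)."""
    return min(stored, key=lambda name: math.dist(query, stored[name]))

# Hypothetical parameterized utterances from one target speaker.
utterances = [[1.0, 2.0], [3.0, 4.0], [2.0, 3.0]]
init = centroid(utterances)

# Hypothetical table of previously stored speaker embeddings.
stored = {"speaker_a": [0.0, 0.0], "speaker_b": [2.5, 3.0]}
closest = nearest_stored(init, stored)

print(init)     # [2.0, 3.0]
print(closest)  # speaker_b
```

Either result could serve as a starting embedding vector; which one works better for a given synthesizer is not stated in the abstract.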

Type: Grant

Filed: August 18, 2020

Date of Patent: March 12, 2024

Assignee: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Cong Zhou, Xiaoyu Liu, Michael Getty Horgan, Vivek Kumar

DEEP-LEARNING BASED SPEECH ENHANCEMENT

Publication number: 20230368807

Abstract: A system for suppressing noise and enhancing speech and a related method are disclosed. The system trains a neural network model that takes banded energies corresponding to an original noisy waveform and produces a speech value indicating the amount of speech present in each band at each frame. The neural model comprises a feature extraction block that implements some lookahead. The feature extraction block is followed by an encoder with steady down-sampling along the frequency domain forming a contracting path. The encoder is followed by a corresponding decoder with steady up-sampling along the frequency domain forming an expanding path. The decoder receives scaled output feature maps from the encoder at a corresponding level. The decoder is followed by a classification block that generates a speech value indicating an amount of speech present for each frequency band of the plurality of frequency bands at each frame of the plurality of frames.
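
The model's input and output described in this abstract (banded energies in, one speech value per band per frame out) can be pictured with a small sketch. Everything here is an assumption made for illustration: the helper names `banded_energies` and `apply_speech_values`, the band edges, the sample numbers, and in particular the idea of multiplying each band energy by its speech value, which the abstract does not state.

```python
def banded_energies(power_spectrum, band_edges):
    """Sum power-spectrum bins into bands; band i covers bins edges[i]..edges[i+1]-1."""
    return [sum(power_spectrum[lo:hi]) for lo, hi in zip(band_edges, band_edges[1:])]

def apply_speech_values(energies, speech_values):
    """Scale each band energy by a speech value in [0, 1] (assumed usage, not from the source)."""
    return [energy * value for energy, value in zip(energies, speech_values)]

# Hypothetical power spectrum for one frame, grouped into three bands of two bins.
frame = [1.0, 1.0, 4.0, 4.0, 9.0, 9.0]
edges = [0, 2, 4, 6]
energies = banded_energies(frame, edges)

# Hypothetical per-band speech values, standing in for the model's output.
speech = [0.0, 0.5, 1.0]
enhanced = apply_speech_values(energies, speech)

print(energies)  # [2.0, 8.0, 18.0]
print(enhanced)  # [0.0, 4.0, 18.0]
```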

Type: Application

Filed: October 29, 2021

Publication date: November 16, 2023

Applicant: Dolby Laboratories Licensing Corporation

Inventors: Xiaoyu LIU, Michael Getty HORGAN, Roy M. FEJGIN, Paul HOLMBERG

Speech style transfer

Patent number: 11538455

Abstract: Computer-implemented methods for speech synthesis are provided. A speech synthesizer may be trained to generate synthesized audio data that corresponds to words uttered by a source speaker according to speech characteristics of a target speaker. The speech synthesizer may be trained by time-stamped phoneme sequences, pitch contour data and speaker identification data. The speech synthesizer may include a voice modeling neural network and a conditioning neural network.

Type: Grant

Filed: February 14, 2019

Date of Patent: December 27, 2022

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Cong Zhou, Michael Getty Horgan, Vivek Kumar, Jaime H. Morales, Cristina Michel Vasco

DEEP SOURCE SEPARATION ARCHITECTURE

Publication number: 20220406323

Abstract: A speech separation server comprises a deep-learning encoder with nonlinear activation. The encoder is programmed to take a mixture audio waveform in the time domain, learn generalized patterns from the mixture audio waveform, and generate an encoded representation that effectively characterizes the mixture audio waveform for speech separation.

Type: Application

Filed: October 20, 2020

Publication date: December 22, 2022

Applicants: Dolby Laboratories Licensing Corporation, Dolby International AB

Inventors: Berkan KADIOGLU, Michael Getty HORGAN, Jordi Pons PUIG, Xiaoyu LIU

SYSTEMS AND METHODS FOR ADAPTING HUMAN SPEAKER EMBEDDINGS IN SPEECH SYNTHESIS

Publication number: 20220335925

Abstract: Novel methods and systems for adapting a voice cloning synthesizer for a new speaker using real speech data are disclosed. Utterances from one or more target speakers are parameterized and are used to initialize an embedding vector for use with a voice synthesizer, by means of clustering the utterance data and determining the centroid of the data, using a speaker identification neural network, and/or by finding the closest stored embedded vector to the utterance data.

Type: Application

Filed: August 18, 2020

Publication date: October 20, 2022

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Cong ZHOU, Xiaoyu LIU, Michael Getty HORGAN, Vivek KUMAR

SPEECH STYLE TRANSFER

Publication number: 20200410976

Abstract: Computer-implemented methods for speech synthesis are provided. A speech synthesizer may be trained to generate synthesized audio data that corresponds to words uttered by a source speaker according to speech characteristics of a target speaker. The speech synthesizer may be trained by time-stamped phoneme sequences, pitch contour data and speaker identification data. The speech synthesizer may include a voice modeling neural network and a conditioning neural network.

Type: Application

Filed: February 14, 2019

Publication date: December 31, 2020

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Cong ZHOU, Michael Getty HORGAN, Vivek KUMAR, Jaime H. MORALES, Cristina Michel VASCO
