Patents by Inventor Vincent Pollet

Vincent Pollet has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Adaptation and training of neural speech synthesis

Publication number: 20250006175

Abstract: Disclosed are systems, methods and other implementations for speech generation, including a method that includes obtaining a speech sample for a target speaker, processing, using a trained encoder, the speech sample to produce a parametric representation of the speech sample for the target speaker, receiving configuration data for a speech synthesis system that accepts as an input the parametric representation, and adapting the configuration data for the speech synthesis system according to an input comprising the parametric representation, and a time-domain representation for the speech sample, to generate adapted configuration data for the speech synthesis system.

Type: Application

Filed: December 7, 2022

Publication date: January 2, 2025

Inventors: Paolo Coppo, Matteo Testa, Vincent Pollet

Speech synthesis using one or more recurrent neural networks

Patent number: 11069335

Abstract: Aspects of the disclosure are related to synthesizing speech or other audio based on input data. Additionally, aspects of the disclosure are related to using one or more recurrent neural networks. For example, a computing device may receive text input; may determine features based on the text input; may provide the features as input to a recurrent neural network; may determine embedded data from one or more activations of a hidden layer of the recurrent neural network; may determine speech data based on a speech unit search that attempts to select, from a database, speech units based on the embedded data; and may generate speech output based on the speech data.

Type: Grant

Filed: July 12, 2017

Date of Patent: July 20, 2021

Assignee: Cerence Operating Company

Inventors: Vincent Pollet, Enrico Zovato

SYSTEMS AND METHODS FOR MULTI-STYLE SPEECH SYNTHESIS

Publication number: 20200211529

Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.

Type: Application

Filed: February 11, 2020

Publication date: July 2, 2020

Applicant: Nuance Communications, Inc.

Inventor: Vincent Pollet

SYSTEMS AND METHODS FOR MULTI-STYLE SPEECH SYNTHESIS

Publication number: 20190108830

Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.

Type: Application

Filed: June 4, 2018

Publication date: April 11, 2019

Applicant: Nuance Communications, Inc.

Inventor: Vincent Pollet

Systems and methods for multi-style speech synthesis

Patent number: 9990915

Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in synthesizing the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.

Type: Grant

Filed: December 28, 2016

Date of Patent: June 5, 2018

Assignee: Nuance Communications, Inc.

Inventor: Vincent Pollet

Speech Synthesis

Publication number: 20180096677

Abstract: Aspects of the disclosure are related to synthesizing speech or other audio based on input data. Additionally, aspects of the disclosure are related to using one or more recurrent neural networks. For example, a computing device may receive text input; may determine features based on the text input; may provide the features as input to a recurrent neural network; may determine embedded data from one or more activations of a hidden layer of the recurrent neural network; may determine speech data based on a speech unit search that attempts to select, from a database, speech units based on the embedded data; and may generate speech output based on the speech data.

Type: Application

Filed: July 12, 2017

Publication date: April 5, 2018

Inventors: Vincent Pollet, Enrico Zovato

SYSTEMS AND METHODS FOR MULTI-STYLE SPEECH SYNTHESIS

Publication number: 20170110110

Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a first speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identified plurality of speech segments comprising a first speech segment having the first speaking style and a second speech segment having a second speaking style different from the first speaking style; and rendering the text as speech having the first speaking style, at least in part, by using the identified plurality of speech segments.

Type: Application

Filed: December 28, 2016

Publication date: April 20, 2017

Applicant: Nuance Communications, Inc.

Inventor: Vincent Pollet

Systems and methods for multi-style speech synthesis

Patent number: 9570065

Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a first speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identified plurality of speech segments comprising a first speech segment having the first speaking style and a second speech segment having a second speaking style different from the first speaking style; and rendering the text as speech having the first speaking style, at least in part, by using the identified plurality of speech segments.

Type: Grant

Filed: September 29, 2014

Date of Patent: February 14, 2017

Assignee: Nuance Communications, Inc.

Inventor: Vincent Pollet

System and method for automatic prediction of speech suitability for statistical modeling

Patent number: 9484045

Abstract: An embodiment according to the invention provides a capability of automatically predicting how favorable a given speech signal is for statistical modeling, which is advantageous in a variety of different contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment according to the invention uses prediction capability to provide an automatic acoustic driven template versus model decision maker with an output quality that is high, stable and depends gradually on the system footprint. In speaker selection for a statistical Text-to-Speech synthesis (TTS) system build, as another example context, an embodiment according to the invention enables a fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.

Type: Grant

Filed: September 7, 2012

Date of Patent: November 1, 2016

Assignee: Nuance Communications, Inc.

Inventors: Alexander Sorin, Slava Shechtman, Vincent Pollet

SYSTEMS AND METHODS FOR MULTI-STYLE SPEECH SYNTHESIS

Publication number: 20160093289

Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a first speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identified plurality of speech segments comprising a first speech segment having the first speaking style and a second speech segment having a second speaking style different from the first speaking style; and rendering the text as speech having the first speaking style, at least in part, by using the identified plurality of speech segments.

Type: Application

Filed: September 29, 2014

Publication date: March 31, 2016

Inventor: Vincent Pollet

System and Method for Automatic Prediction of Speech Suitability for Statistical Modeling

Publication number: 20140074468

Abstract: An embodiment according to the invention provides a capability of automatically predicting how favorable a given speech signal is for statistical modeling, which is advantageous in a variety of different contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment according to the invention uses prediction capability to provide an automatic acoustic driven template versus model decision maker with an output quality that is high, stable and depends gradually on the system footprint. In speaker selection for a statistical Text-to-Speech synthesis (TTS) system build, as another example context, an embodiment according to the invention enables a fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.

Type: Application

Filed: September 7, 2012

Publication date: March 13, 2014

Applicant: Nuance Communications, Inc.

Inventors: Alexander Sorin, Slava Shechtman, Vincent Pollet

Synthesis by generation and concatenation of multi-form segments

Patent number: 8321222

Abstract: A speech synthesis system and method is described. A speech segment database references speech segments having various different speech representational structures. A speech segment selector selects from the speech segment database a sequence of speech segment candidates corresponding to a target text. A speech segment sequencer generates from the speech segment candidates sequenced speech segments corresponding to the target text. A speech segment synthesizer combines the selected sequenced speech segments to produce a synthesized speech signal output corresponding to the target text.

Type: Grant

Filed: August 14, 2007

Date of Patent: November 27, 2012

Assignee: Nuance Communications, Inc.

Inventors: Vincent Pollet, Andrew Breen

Corpus-based speech synthesis based on segment recombination

Patent number: 7567896

Abstract: A system and method generate synthesized speech through concatenation of speech segments that are derived from a large prosodically-rich corpus of speech segments including using an additional dictionary of speech segment identifier sequences.

Type: Grant

Filed: January 18, 2005

Date of Patent: July 28, 2009

Assignee: Nuance Communications, Inc.

Inventors: Geert Coorman, Vincent Pollet, Stefaan Van Gerven, Mario De Bock, Bert Van Coile, Jan De Moortel

Synthesis by Generation and Concatenation of Multi-Form Segments

Publication number: 20090048841

Abstract: A speech synthesis system and method is described. A speech segment database references speech segments having various different speech representational structures. A speech segment selector selects from the speech segment database a sequence of speech segment candidates corresponding to a target text. A speech segment sequencer generates from the speech segment candidates sequenced speech segments corresponding to the target text. A speech segment synthesizer combines the selected sequenced speech segments to produce a synthesized speech signal output corresponding to the target text.

Type: Application

Filed: August 14, 2007

Publication date: February 19, 2009

Applicant: Nuance Communications, Inc.

Inventors: Vincent Pollet, Andrew Breen

Corpus-based speech synthesis based on segment recombination

Publication number: 20050182629

Abstract: A system and method generate synthesized speech through concatenation of speech segments that are derived from a large prosodically-rich corpus of speech segments including using an additional dictionary of speech segment identifier sequences.

Type: Application

Filed: January 18, 2005

Publication date: August 18, 2005

Inventors: Geert Coorman, Vincent Pollet, Stefaan Van Gerven, Mario De Bock, Bert Van Coile, Jan De Moortel