Patents by Inventor Vincent Pollet

Vincent Pollet has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11069335
    Abstract: Aspects of the disclosure are related to synthesizing speech or other audio based on input data. Additionally, aspects of the disclosure are related to using one or more recurrent neural networks. For example, a computing device may receive text input; may determine features based on the text input; may provide the features as input to a recurrent neural network; may determine embedded data from one or more activations of a hidden layer of the recurrent neural network; may determine speech data based on a speech unit search that attempts to select, from a database, speech units based on the embedded data; and may generate speech output based on the speech data.
    Type: Grant
    Filed: July 12, 2017
    Date of Patent: July 20, 2021
    Assignee: Cerence Operating Company
    Inventors: Vincent Pollet, Enrico Zovato
  • Publication number: 20200211529
    Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.
    Type: Application
    Filed: February 11, 2020
    Publication date: July 2, 2020
    Applicant: Nuance Communications, Inc.
    Inventor: Vincent Pollet
  • Publication number: 20190108830
    Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.
    Type: Application
    Filed: June 4, 2018
    Publication date: April 11, 2019
    Applicant: Nuance Communications, Inc.
    Inventor: Vincent Pollet
  • Patent number: 9990915
    Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in synthesizing the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.
    Type: Grant
    Filed: December 28, 2016
    Date of Patent: June 5, 2018
    Assignee: Nuance Communications, Inc.
    Inventor: Vincent Pollet
  • Publication number: 20180096677
    Abstract: Aspects of the disclosure are related to synthesizing speech or other audio based on input data. Additionally, aspects of the disclosure are related to using one or more recurrent neural networks. For example, a computing device may receive text input; may determine features based on the text input; may provide the features as input to a recurrent neural network; may determine embedded data from one or more activations of a hidden layer of the recurrent neural network; may determine speech data based on a speech unit search that attempts to select, from a database, speech units based on the embedded data; and may generate speech output based on the speech data.
    Type: Application
    Filed: July 12, 2017
    Publication date: April 5, 2018
    Inventors: Vincent Pollet, Enrico Zovato
  • Publication number: 20170110110
    Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a first speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identified plurality of speech segments comprising a first speech segment having the first speaking style and a second speech segment having a second speaking style different from the first speaking style; and rendering the text as speech having the first speaking style, at least in part, by using the identified plurality of speech segments.
    Type: Application
    Filed: December 28, 2016
    Publication date: April 20, 2017
    Applicant: Nuance Communications, Inc.
    Inventor: Vincent Pollet
  • Patent number: 9570065
    Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a first speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identified plurality of speech segments comprising a first speech segment having the first speaking style and a second speech segment having a second speaking style different from the first speaking style; and rendering the text as speech having the first speaking style, at least in part, by using the identified plurality of speech segments.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: February 14, 2017
    Assignee: Nuance Communications, Inc.
    Inventor: Vincent Pollet
  • Patent number: 9484045
    Abstract: An embodiment according to the invention provides a capability of automatically predicting how favorable a given speech signal is for statistical modeling, which is advantageous in a variety of different contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment according to the invention uses prediction capability to provide an automatic acoustic driven template versus model decision maker with an output quality that is high, stable and depends gradually on the system footprint. In speaker selection for a statistical Text-to-Speech synthesis (TTS) system build, as another example context, an embodiment according to the invention enables a fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.
    Type: Grant
    Filed: September 7, 2012
    Date of Patent: November 1, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Alexander Sorin, Slava Shechtman, Vincent Pollet
  • Publication number: 20160093289
    Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a first speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identified plurality of speech segments comprising a first speech segment having the first speaking style and a second speech segment having a second speaking style different from the first speaking style; and rendering the text as speech having the first speaking style, at least in part, by using the identified plurality of speech segments.
    Type: Application
    Filed: September 29, 2014
    Publication date: March 31, 2016
    Inventor: Vincent Pollet
  • Publication number: 20140074468
    Abstract: An embodiment according to the invention provides a capability of automatically predicting how favorable a given speech signal is for statistical modeling, which is advantageous in a variety of different contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment according to the invention uses prediction capability to provide an automatic acoustic driven template versus model decision maker with an output quality that is high, stable and depends gradually on the system footprint. In speaker selection for a statistical Text-to-Speech synthesis (TTS) system build, as another example context, an embodiment according to the invention enables a fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.
    Type: Application
    Filed: September 7, 2012
    Publication date: March 13, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Alexander Sorin, Slava Shechtman, Vincent Pollet
  • Patent number: 8321222
    Abstract: A speech synthesis system and method is described. A speech segment database references speech segments having various different speech representational structures. A speech segment selector selects from the speech segment database a sequence of speech segment candidates corresponding to a target text. A speech segment sequencer generates from the speech segment candidates sequenced speech segments corresponding to the target text. A speech segment synthesizer combines the selected sequenced speech segments to produce a synthesized speech signal output corresponding to the target text.
    Type: Grant
    Filed: August 14, 2007
    Date of Patent: November 27, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Vincent Pollet, Andrew Breen
  • Patent number: 7567896
    Abstract: A system and method generate synthesized speech through concatenation of speech segments that are derived from a large prosodically-rich corpus of speech segments including using an additional dictionary of speech segment identifier sequences.
    Type: Grant
    Filed: January 18, 2005
    Date of Patent: July 28, 2009
    Assignee: Nuance Communications, Inc.
    Inventors: Geert Coorman, Vincent Pollet, Stefaan Van Gerven, Mario De Bock, Bert Van Coile, Jan De Moortel
  • Publication number: 20090048841
    Abstract: A speech synthesis system and method is described. A speech segment database references speech segments having various different speech representational structures. A speech segment selector selects from the speech segment database a sequence of speech segment candidates corresponding to a target text. A speech segment sequencer generates from the speech segment candidates sequenced speech segments corresponding to the target text. A speech segment synthesizer combines the selected sequenced speech segments to produce a synthesized speech signal output corresponding to the target text.
    Type: Application
    Filed: August 14, 2007
    Publication date: February 19, 2009
    Applicant: Nuance Communications, Inc.
    Inventors: Vincent Pollet, Andrew Breen
  • Publication number: 20050182629
    Abstract: A system and method generate synthesized speech through concatenation of speech segments that are derived from a large prosodically-rich corpus of speech segments including using an additional dictionary of speech segment identifier sequences.
    Type: Application
    Filed: January 18, 2005
    Publication date: August 18, 2005
    Inventors: Geert Coorman, Vincent Pollet, Stefaan Van Gerven, Mario De Bock, Bert Van Coile, Jan De Moortel