Patents by Inventor Steve Pearson

Steve Pearson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12573372
    Abstract: A neural TTS system is trained to generate key acoustic frames at variable rates while omitting other frames. The frame skipping depends on the acoustic features to be generated for the input text. The TTS system can interpolate frames between the key frames at a target rate for a vocoder to synthesize audio samples.
    Type: Grant
    Filed: October 31, 2022
    Date of Patent: March 10, 2026
    Assignee: SoundHound AI IP, LLC
    Inventors: Steve Pearson, Jon Grossman
  • Publication number: 20240144910
    Abstract: A neural TTS system is trained to generate key acoustic frames at variable rates while omitting other frames. The frame skipping depends on the acoustic features to be generated for the input text. The TTS system can interpolate frames between the key frames at a target rate for a vocoder to synthesize audio samples.
    Type: Application
    Filed: October 31, 2022
    Publication date: May 2, 2024
    Applicant: SoundHound, Inc.
    Inventors: Steve Pearson, Jon Grossman
  • Patent number: 11600284
    Abstract: A voice morphing apparatus having adjustable parameters is described. The disclosed system and method include a voice morphing apparatus that morphs input audio to mask a speaker's identity. Parameter adjustment uses evaluation of an objective function that is based on the input audio and output of the voice morphing apparatus. The voice morphing apparatus includes objectives that are based adversarially on speaker identification and positively on audio fidelity. Thus, the voice morphing apparatus is adjusted to reduce identifiability of speakers while maintaining fidelity of the morphed audio. The voice morphing apparatus may be used as part of an automatic speech recognition system.
    Type: Grant
    Filed: January 11, 2020
    Date of Patent: March 7, 2023
    Assignee: SOUNDHOUND, INC.
    Inventor: Steve Pearson
  • Patent number: 11100940
    Abstract: Systems and methods for training a voice morphing apparatus are described. The voice morphing apparatus is trained to morph input audio data to mask an identity of a speaker. Training is performed by evaluating an objective function that is a function of the input audio data and an output of the voice morphing apparatus. The objective function may have a first term that is based on speaker identification and a second term that is based on audio fidelity. By optimizing the objective function, parameters of the voice morphing apparatus may be adjusted so as to reduce a confidence of speaker identification and maintain an audio fidelity of the morphed audio data. The voice morphing apparatus, once trained, may be used as part of an automatic speech recognition system.
    Type: Grant
    Filed: January 10, 2020
    Date of Patent: August 24, 2021
    Assignee: SOUNDHOUND, INC.
    Inventor: Steve Pearson
  • Publication number: 20210217431
    Abstract: A voice morphing apparatus having adjustable parameters is described. The disclosed system and method include a voice morphing apparatus that morphs input audio to mask a speaker's identity. Parameter adjustment uses evaluation of an objective function that is based on the input audio and output of the voice morphing apparatus. The voice morphing apparatus includes objectives that are based adversarially on speaker identification and positively on audio fidelity. Thus, the voice morphing apparatus is adjusted to reduce identifiability of speakers while maintaining fidelity of the morphed audio. The voice morphing apparatus may be used as part of an automatic speech recognition system.
    Type: Application
    Filed: January 11, 2020
    Publication date: July 15, 2021
    Applicant: SoundHound, Inc.
    Inventor: Steve Pearson
  • Publication number: 20210193159
    Abstract: Systems and methods for training a voice morphing apparatus are described. The voice morphing apparatus is trained to morph input audio data to mask an identity of a speaker. Training is performed by evaluating an objective function that is a function of the input audio data and an output of the voice morphing apparatus. The objective function may have a first term that is based on speaker identification and a second term that is based on audio fidelity. By optimizing the objective function, parameters of the voice morphing apparatus may be adjusted so as to reduce a confidence of speaker identification and maintain an audio fidelity of the morphed audio data. The voice morphing apparatus, once trained, may be used as part of an automatic speech recognition system.
    Type: Application
    Filed: January 10, 2020
    Publication date: June 24, 2021
    Applicant: SoundHound, Inc.
    Inventor: Steve Pearson
  • Patent number: 10008216
    Abstract: Method and apparatus for reducing a size of databases required for recorded speech data.
    Type: Grant
    Filed: April 15, 2014
    Date of Patent: June 26, 2018
    Assignee: SPEECH MORPHING SYSTEMS, INC.
    Inventors: Fathy Yassa, Benjamin Reaves, Steve Pearson
  • Patent number: 9905218
    Abstract: Method and apparatus for diphone or concatenative synthesis to compensate for insufficient or missing diphones.
    Type: Grant
    Filed: April 18, 2014
    Date of Patent: February 27, 2018
    Assignee: SPEECH MORPHING SYSTEMS, INC.
    Inventors: Benjamin Reaves, Steve Pearson, Fathy Yassa
  • Publication number: 20170249953
    Abstract: Method and apparatus for reducing a size of databases required for recorded speech data.
    Type: Application
    Filed: April 15, 2014
    Publication date: August 31, 2017
    Inventors: Fathy Yassa, Benjamin Reaves, Steve Pearson
  • Patent number: 7908019
    Abstract: A taxonomy engine in a software architecture generates a taxonomy dataset establishing the group of well formed commands, and at least one command generator of the system is adapted to generate a well formed command using the taxonomy dataset. The taxonomy engine is configured to deliver the taxonomy dataset to the command generator, and the command generator is configured to deliver the well formed command to the controller.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: March 15, 2011
    Assignee: Whirlpool Corporation
    Inventors: Matthew P. Ebrom, Mark E. Glotzbach, Richard A. McCoy, Steve Pearson
  • Publication number: 20090064008
    Abstract: A graphic user interface system for use with a content based retrieval system includes an active display having display areas. For example, the display areas include a main area providing an overview of database contents by displaying representative samples of the database contents. The display areas also include one or more query areas into which one or more of the representative samples can be moved from the main area by a user employing gesture based interaction. A query formulation module employs the one or more representative samples moved into the query area to provide feedback to the content based retrieval system.
    Type: Application
    Filed: August 31, 2007
    Publication date: March 5, 2009
    Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
    Inventors: Chaojun Liu, Luca Rigazio, Peter Veprek, David Kryze, Steve Pearson
  • Patent number: 7472843
    Abstract: A spray nozzle assembly comprising a nozzle body and a separate nozzle body insert adapted for removable snap action engagement with the body. The nozzle body insert defines a liquid flow passage that includes an inlet section for communicating with a liquid inlet, a metering orifice for accelerating the liquid flow stream, and a downstream expansion section. The body insert further includes a venturi passage communicating with the expansion section for drawing ambient air into the liquid flow stream for stabilizing the liquid prior to discharge from the nozzle. In the illustrated embodiment, the expansion chamber communicates with a transverse passage of the nozzle body, which in turn communicates with a laterally oriented discharge orifice.
    Type: Grant
    Filed: June 23, 2005
    Date of Patent: January 6, 2009
    Assignee: Spraying Systems Co.
    Inventors: Steve Pearson, Marc Arenson, Eric Greenawalt
  • Publication number: 20080103610
    Abstract: A taxonomy engine in a software architecture generates a taxonomy dataset establishing the group of well formed commands, and at least one command generator of the system is adapted to generate a well formed command using the taxonomy dataset. The taxonomy engine is configured to deliver the taxonomy dataset to the command generator, and the command generator is configured to deliver the well formed command to the controller.
    Type: Application
    Filed: October 31, 2007
    Publication date: May 1, 2008
    Applicant: WHIRLPOOL CORPORATION
    Inventors: Matthew Ebrom, Mark Glotzbach, Richard McCoy, Steve Pearson
  • Publication number: 20080087745
    Abstract: A spray nozzle assembly comprising a nozzle body and a separate nozzle body insert adapted for removable snap action engagement with the body. The nozzle body insert defines a liquid flow passage that includes an inlet section for communicating with a liquid inlet, a metering orifice for accelerating the liquid flow stream, and a downstream expansion section. The body insert further includes a venturi passage communicating with the expansion section for drawing ambient air into the liquid flow stream for stabilizing the liquid prior to discharge from the nozzle. In the illustrated embodiment, the expansion chamber communicates with a transverse passage of the nozzle body, which in turn communicates with a laterally oriented discharge orifice.
    Type: Application
    Filed: June 23, 2005
    Publication date: April 17, 2008
    Applicant: Spraying Systems Co.
    Inventors: Steve Pearson, Marc Arenson, Eric Greenawalt
  • Patent number: 6513008
    Abstract: A speech synthesizer customization system provides a mechanism for generating a hierarchical customized user database. The customization system has a template management tool for generating the templates based on customization data from a user and associated replicated dynamic synthesis data from a text-to-speech (TTS) synthesizer. The replicated dynamic synthesis data is arranged in a dynamic data structure having hierarchical levels. The customization system further includes a user database that supplements a standard database of the synthesizer. The tool populates the user database with the templates such that the templates enable the user database to uniformly override subsequently generated speech synthesis data at all hierarchical levels of the dynamic data structure.
    Type: Grant
    Filed: March 15, 2001
    Date of Patent: January 28, 2003
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Steve Pearson, Peter Veprek, Jean-Claude Junqua
  • Patent number: 6496801
    Abstract: A speech synthesis system for generating voice dialog for a message frame having a fixed and a variable portion. A prosody module selects a prosodic template for each of the fixed and variable portions wherein at least one portion comprises a phrase of multiple words. An acoustic module selects an acoustic template for each of the fixed and variable portions wherein at least one portion comprises a phrase of multiple words. A frame generator concatenates the respective prosodic templates and acoustic templates. A sound module generates the voice dialog in accordance with the concatenated prosodic and acoustic templates.
    Type: Grant
    Filed: November 2, 1999
    Date of Patent: December 17, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Peter Veprek, Steve Pearson, Jean-Claude Junqua
  • Publication number: 20020133348
    Abstract: A speech synthesizer customization system provides a mechanism for generating a hierarchical customized user database. The customization system has a template management tool for generating the templates based on customization data from a user and associated replicated dynamic synthesis data from a text-to-speech (TTS) synthesizer. The replicated dynamic synthesis data is arranged in a dynamic data structure having hierarchical levels. The customization system further includes a user database that supplements a standard database of the synthesizer. The tool populates the user database with the templates such that the templates enable the user database to uniformly override subsequently generated speech synthesis data at all hierarchical levels of the dynamic data structure.
    Type: Application
    Filed: March 15, 2001
    Publication date: September 19, 2002
    Inventors: Steve Pearson, Peter Veprek, Jean-Claude Junqua
  • Patent number: 6363342
    Abstract: An editing tool is provided for developing word-pronunciation pairs based on a spelled word input. The editing tool includes a transcription generator that receives the spelled word input from the user and generates a list of suggested phonetic transcriptions. The editor displays the list of suggested phonetic transcriptions to the user and provides a mechanism for selecting the desired pronunciation from the list of suggested phonetic transcriptions. The editing tool further includes a speech recognizer to aid the user in selecting the desired pronunciation from the list of suggested phonetic transcriptions based on speech data input that corresponds to the spelled word input, and a syllable editor that enables the user to manipulate a syllabic part of a selected pronunciation. Lastly, the desired pronunciation can be tested at any point through the use of a text-to-speech synthesizer that generates audible speech data for the selected phonetic transcription.
    Type: Grant
    Filed: December 18, 1998
    Date of Patent: March 26, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Rhonda Shaw, Roland Kuhn, Steve Pearson
  • Publication number: 20020013707
    Abstract: An editing tool is provided for developing word-pronunciation pairs based on a spelled word input. The editing tool includes a transcription generator that receives the spelled word input from the user and generates a list of suggested phonetic transcriptions. The editor displays the list of suggested phonetic transcriptions to the user and provides a mechanism for selecting the desired pronunciation from the list of suggested phonetic transcriptions. The editing tool further includes a speech recognizer to aid the user in selecting the desired pronunciation from the list of suggested phonetic transcriptions based on speech data input that corresponds to the spelled word input, and a syllable editor that enables the user to manipulate a syllabic part of a selected pronunciation. Lastly, the desired pronunciation can be tested at any point through the use of a text-to-speech synthesizer that generates audible speech data for the selected phonetic transcription.
    Type: Application
    Filed: December 18, 1998
    Publication date: January 31, 2002
    Inventors: Rhonda Shaw, Roland Kuhn, Steve Pearson
  • Patent number: RE39336
    Abstract: The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model that uses source signals that correspond closely to the human glottal source and that uses filter parameters that correspond closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques, one applied in the time domain in the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
    Type: Grant
    Filed: November 5, 2002
    Date of Patent: October 10, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Steve Pearson, Nicholas Kibre, Nancy Niedzielski
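
The dual cross fade described in the abstract of RE39336 can be illustrated with a minimal Python sketch. This is not the patented method: the function names, the linear fade weights, and the representation of filter parameters as plain lists of numbers are all assumptions made for illustration. It only shows the general idea of blending source waveforms in the time domain and, separately, interpolating filter parameters across a unit boundary.

```python
def crossfade_waveforms(left, right, overlap):
    """Time-domain cross fade (hypothetical helper).

    Blends the last `overlap` samples of `left` with the first `overlap`
    samples of `right` using linear weights, an assumed fade shape.
    """
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter unit length")
    faded = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight given to the incoming unit
        faded.append((1 - w) * left[len(left) - overlap + i] + w * right[i])
    return left[:-overlap] + faded + right[overlap:]


def interpolate_filter_params(params_a, params_b, steps):
    """Filter-parameter cross fade (hypothetical helper).

    Returns `steps` intermediate parameter vectors between two units,
    using linear interpolation as an assumed scheme.
    """
    frames = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)
        frames.append([(1 - t) * a + t * b for a, b in zip(params_a, params_b)])
    return frames


if __name__ == "__main__":
    # Two toy "source signal" segments joined with a 2-sample overlap.
    joined = crossfade_waveforms([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], 2)
    print(joined)

    # Three transition frames between two toy filter-parameter vectors.
    print(interpolate_filter_params([0.0, 10.0], [4.0, 20.0], 3))
```

Keeping the two fades as separate functions mirrors the abstract's distinction between the source signal and the filter parameters; a real synthesizer would choose fade shapes and parameter representations suited to its own filter model.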