Patents by Inventor Steve Pearson
Steve Pearson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12573372
Abstract: A neural TTS system is trained to generate key acoustic frames at variable rates while omitting other frames. The frame skipping depends on the acoustic features to be generated for the input text. The TTS system can interpolate frames between the key frames at a target rate for a vocoder to synthesize audio samples.
Type: Grant
Filed: October 31, 2022
Date of Patent: March 10, 2026
Assignee: SoundHound AI IP, LLC
Inventors: Steve Pearson, Jon Grossman
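The interpolation step described in this abstract can be illustrated with a minimal sketch. The abstract does not state which interpolation scheme is used, so linear interpolation is an assumption here, and the function name `interpolate_frames`, its parameters, and the toy one-dimensional "acoustic features" are all hypothetical.

```python
from bisect import bisect_right


def interpolate_frames(key_times, key_frames, frame_rate_hz):
    """Fill in frames between key frames at a fixed target rate.

    key_times  : increasing timestamps (seconds) of the key frames
    key_frames : one feature vector (list of floats) per key time
    Linear interpolation is an assumption, not the patented method.
    """
    if not key_times or len(key_times) != len(key_frames):
        raise ValueError("need one timestamp per key frame")
    if len(key_times) == 1:
        return [list(key_frames[0])]

    step = 1.0 / frame_rate_hz
    span = key_times[-1] - key_times[0]
    n_out = int(span / step + 1e-9) + 1  # small epsilon guards float rounding

    frames = []
    for i in range(n_out):
        t = key_times[0] + i * step
        # Index of the key frame at or before t, clamped to a valid segment.
        j = min(max(bisect_right(key_times, t) - 1, 0), len(key_times) - 2)
        t0, t1 = key_times[j], key_times[j + 1]
        w = (t - t0) / (t1 - t0)
        frames.append(
            [(1.0 - w) * a + w * b for a, b in zip(key_frames[j], key_frames[j + 1])]
        )
    return frames


# Three key frames at irregular times, expanded to a 100 Hz frame sequence.
dense = interpolate_frames([0.0, 0.04, 0.10], [[0.0], [4.0], [7.0]], 100)
```

In this sketch the key frames are spaced unevenly (40 ms, then 60 ms), while the output is evenly spaced at the rate a vocoder would expect.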
-
Publication number: 20240144910
Abstract: A neural TTS system is trained to generate key acoustic frames at variable rates while omitting other frames. The frame skipping depends on the acoustic features to be generated for the input text. The TTS system can interpolate frames between the key frames at a target rate for a vocoder to synthesize audio samples.
Type: Application
Filed: October 31, 2022
Publication date: May 2, 2024
Applicant: SoundHound, Inc.
Inventors: Steve PEARSON, Jon GROSSMAN
-
Patent number: 11600284
Abstract: A voice morphing apparatus having adjustable parameters is described. The disclosed system and method include a voice morphing apparatus that morphs input audio to mask a speaker's identity. Parameter adjustment uses evaluation of an objective function that is based on the input audio and output of the voice morphing apparatus. The voice morphing apparatus includes objectives that are based adversarially on speaker identification and positively on audio fidelity. Thus, the voice morphing apparatus is adjusted to reduce identifiability of speakers while maintaining fidelity of the morphed audio. The voice morphing apparatus may be used as part of an automatic speech recognition system.
Type: Grant
Filed: January 11, 2020
Date of Patent: March 7, 2023
Assignee: SOUNDHOUND, INC.
Inventor: Steve Pearson
-
Patent number: 11100940
Abstract: Systems and methods for training a voice morphing apparatus are described. The voice morphing apparatus is trained to morph input audio data to mask an identity of a speaker. Training is performed by evaluating an objective function that is a function of the input audio data and an output of the voice morphing apparatus. The objective function may have a first term that is based on speaker identification and a second term that is based on audio fidelity. By optimizing the objective function, parameters of the voice morphing apparatus may be adjusted so as to reduce a confidence of speaker identification and maintain an audio fidelity of the morphed audio data. The voice morphing apparatus, once trained, may be used as part of an automatic speech recognition system.
Type: Grant
Filed: January 10, 2020
Date of Patent: August 24, 2021
Assignee: SOUNDHOUND, INC.
Inventor: Steve Pearson
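One way to read the two-term objective in this abstract is sketched below. The abstract gives no formula, so every symbol is an assumption introduced for illustration: $f_\theta$ is the morphing apparatus with parameters $\theta$, $x$ is the input audio, $C_{\mathrm{id}}$ is a speaker-identification confidence, $D$ is a distortion measure standing in for (lack of) audio fidelity, and $\lambda_{\mathrm{id}}, \lambda_{\mathrm{fid}} > 0$ are weights.

```latex
\mathcal{L}(\theta)
  = \lambda_{\mathrm{id}} \, C_{\mathrm{id}}\bigl(f_\theta(x)\bigr)
  + \lambda_{\mathrm{fid}} \, D\bigl(x, f_\theta(x)\bigr),
\qquad
\theta^{*} = \arg\min_{\theta} \mathcal{L}(\theta)
```

Under these assumptions, minimizing the first term lowers the confidence of speaker identification on the morphed audio, while minimizing the second keeps the morphed audio close to the input.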
-
Publication number: 20210217431
Abstract: A voice morphing apparatus having adjustable parameters is described. The disclosed system and method include a voice morphing apparatus that morphs input audio to mask a speaker's identity. Parameter adjustment uses evaluation of an objective function that is based on the input audio and output of the voice morphing apparatus. The voice morphing apparatus includes objectives that are based adversarially on speaker identification and positively on audio fidelity. Thus, the voice morphing apparatus is adjusted to reduce identifiability of speakers while maintaining fidelity of the morphed audio. The voice morphing apparatus may be used as part of an automatic speech recognition system.
Type: Application
Filed: January 11, 2020
Publication date: July 15, 2021
Applicant: SoundHound, Inc.
Inventor: Steve PEARSON
-
Publication number: 20210193159
Abstract: Systems and methods for training a voice morphing apparatus are described. The voice morphing apparatus is trained to morph input audio data to mask an identity of a speaker. Training is performed by evaluating an objective function that is a function of the input audio data and an output of the voice morphing apparatus. The objective function may have a first term that is based on speaker identification and a second term that is based on audio fidelity. By optimizing the objective function, parameters of the voice morphing apparatus may be adjusted so as to reduce a confidence of speaker identification and maintain an audio fidelity of the morphed audio data. The voice morphing apparatus, once trained, may be used as part of an automatic speech recognition system.
Type: Application
Filed: January 10, 2020
Publication date: June 24, 2021
Applicant: SoundHound, Inc.
Inventor: Steve PEARSON
-
Patent number: 10008216
Abstract: Method and apparatus for reducing a size of databases required for recorded speech data.
Type: Grant
Filed: April 15, 2014
Date of Patent: June 26, 2018
Assignee: SPEECH MORPHING SYSTEMS, INC.
Inventors: Fathy Yassa, Benjamin Reaves, Steve Pearson
-
Patent number: 9905218
Abstract: Method and apparatus for diphone or concatenative synthesis to compensate for insufficient or missing diphones.
Type: Grant
Filed: April 18, 2014
Date of Patent: February 27, 2018
Assignee: SPEECH MORPHING SYSTEMS, INC.
Inventors: Benjamin Reaves, Steve Pearson, Fathy Yassa
-
Publication number: 20170249953
Abstract: Method and apparatus for reducing a size of databases required for recorded speech data.
Type: Application
Filed: April 15, 2014
Publication date: August 31, 2017
Inventors: Fathy Yassa, Benjamin Reaves, Steve Pearson
-
Patent number: 7908019
Abstract: A taxonomy engine in a software architecture generates a taxonomy dataset establishing the group of well formed commands, and at least one command generator of the system is adapted to generate a well formed command using the taxonomy dataset. The taxonomy engine is configured to deliver the taxonomy dataset to the command generator, and the command generator is configured to deliver the well formed command to the controller.
Type: Grant
Filed: October 31, 2007
Date of Patent: March 15, 2011
Assignee: Whirlpool Corporation
Inventors: Matthew P. Ebrom, Mark E. Glotzbach, Richard A. McCoy, Steve Pearson
-
Publication number: 20090064008
Abstract: A graphic user interface system for use with a content based retrieval system includes an active display having display areas. For example, the display areas include a main area providing an overview of database contents by displaying representative samples of the database contents. The display areas also include one or more query areas into which one or more of the representative samples can be moved from the main area by a user employing gesture based interaction. A query formulation module employs the one or more representative samples moved into the query area to provide feedback to the content based retrieval system.
Type: Application
Filed: August 31, 2007
Publication date: March 5, 2009
Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
Inventors: Chaojun Liu, Luca Rigazio, Peter Veprek, David Kryze, Steve Pearson
-
Patent number: 7472843
Abstract: A spray nozzle assembly comprising a nozzle body and a separate nozzle body insert adapted for removable snap action engagement with the body. The nozzle body insert defines a liquid flow passage that includes an inlet section for communicating with a liquid inlet, a metering orifice for accelerating the liquid flow stream, and a downstream expansion section. The body insert further includes a venturi passage communicating with the expansion section for drawing ambient air into the liquid flow stream for stabilizing the liquid prior to discharge from the nozzle. In the illustrated embodiment, the expansion chamber communicates with a transverse passage of the nozzle body, which in turn communicates with a laterally oriented discharge orifice.
Type: Grant
Filed: June 23, 2005
Date of Patent: January 6, 2009
Assignee: Spraying Systems Co.
Inventors: Steve Pearson, Marc Arenson, Eric Greenawalt
-
Publication number: 20080103610
Abstract: A taxonomy engine in a software architecture generates a taxonomy dataset establishing the group of well formed commands, and at least one command generator of the system is adapted to generate a well formed command using the taxonomy dataset. The taxonomy engine is configured to deliver the taxonomy dataset to the command generator, and the command generator is configured to deliver the well formed command to the controller.
Type: Application
Filed: October 31, 2007
Publication date: May 1, 2008
Applicant: WHIRLPOOL CORPORATION
Inventors: Matthew Ebrom, Mark Glotzbach, Richard McCoy, Steve Pearson
-
Publication number: 20080087745
Abstract: A spray nozzle assembly comprising a nozzle body and a separate nozzle body insert adapted for removable snap action engagement with the body. The nozzle body insert defines a liquid flow passage that includes an inlet section for communicating with a liquid inlet, a metering orifice for accelerating the liquid flow stream, and a downstream expansion section. The body insert further includes a venturi passage communicating with the expansion section for drawing ambient air into the liquid flow stream for stabilizing the liquid prior to discharge from the nozzle. In the illustrated embodiment, the expansion chamber communicates with a transverse passage of the nozzle body, which in turn communicates with a laterally oriented discharge orifice.
Type: Application
Filed: June 23, 2005
Publication date: April 17, 2008
Applicant: Spraying Systems Co.
Inventors: Steve Pearson, Marc Arenson, Eric Greenawalt
-
Patent number: 6513008
Abstract: A speech synthesizer customization system provides a mechanism for generating a hierarchical customized user database. The customization system has a template management tool for generating the templates based on customization data from a user and associated replicated dynamic synthesis data from a text-to-speech (TTS) synthesizer. The replicated dynamic synthesis data is arranged in a dynamic data structure having hierarchical levels. The customization system further includes a user database that supplements a standard database of the synthesizer. The tool populates the user database with the templates such that the templates enable the user database to uniformly override subsequently generated speech synthesis data at all hierarchical levels of the dynamic data structure.
Type: Grant
Filed: March 15, 2001
Date of Patent: January 28, 2003
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Steve Pearson, Peter Veprek, Jean-Claude Junqua
-
Patent number: 6496801
Abstract: A speech synthesis system for generating voice dialog for a message frame having a fixed and a variable portion. A prosody module selects a prosodic template for each of the fixed and variable portions wherein at least one portion comprises a phrase of multiple words. An acoustic module selects an acoustic template for each of the fixed and variable portions wherein at least one portion comprises a phrase of multiple words. A frame generator concatenates the respective prosodic templates and acoustic templates. A sound module generates the voice dialog in accordance with the concatenated prosodic and acoustic templates.
Type: Grant
Filed: November 2, 1999
Date of Patent: December 17, 2002
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Peter Veprek, Steve Pearson, Jean-Claude Junqua
-
Publication number: 20020133348
Abstract: A speech synthesizer customization system provides a mechanism for generating a hierarchical customized user database. The customization system has a template management tool for generating the templates based on customization data from a user and associated replicated dynamic synthesis data from a text-to-speech (TTS) synthesizer. The replicated dynamic synthesis data is arranged in a dynamic data structure having hierarchical levels. The customization system further includes a user database that supplements a standard database of the synthesizer. The tool populates the user database with the templates such that the templates enable the user database to uniformly override subsequently generated speech synthesis data at all hierarchical levels of the dynamic data structure.
Type: Application
Filed: March 15, 2001
Publication date: September 19, 2002
Inventors: Steve Pearson, Peter Veprek, Jean-Claude Junqua
-
Patent number: 6363342
Abstract: An editing tool is provided for developing word-pronunciation pairs based on a spelled word input. The editing tool includes a transcription generator that receives the spelled word input from the user and generates a list of suggested phonetic transcriptions. The editor displays the list of suggested phonetic transcriptions to the user and provides a mechanism for selecting the desired pronunciation from the list of suggested phonetic transcriptions. The editing tool further includes a speech recognizer to aid the user in selecting the desired pronunciation from the list of suggested phonetic transcriptions based on speech data input that corresponds to the spelled word input, and a syllable editor that enables the user to manipulate a syllabic part of a selected pronunciation. Lastly, the desired pronunciation can be tested at any point through the use of a text-to-speech synthesizer that generates audible speech data for the selected phonetic transcription.
Type: Grant
Filed: December 18, 1998
Date of Patent: March 26, 2002
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Rhonda Shaw, Roland Kuhn, Steve Pearson
-
Publication number: 20020013707
Abstract: An editing tool is provided for developing word-pronunciation pairs based on a spelled word input. The editing tool includes a transcription generator that receives the spelled word input from the user and generates a list of suggested phonetic transcriptions. The editor displays the list of suggested phonetic transcriptions to the user and provides a mechanism for selecting the desired pronunciation from the list of suggested phonetic transcriptions. The editing tool further includes a speech recognizer to aid the user in selecting the desired pronunciation from the list of suggested phonetic transcriptions based on speech data input that corresponds to the spelled word input, and a syllable editor that enables the user to manipulate a syllabic part of a selected pronunciation. Lastly, the desired pronunciation can be tested at any point through the use of a text-to-speech synthesizer that generates audible speech data for the selected phonetic transcription.
Type: Application
Filed: December 18, 1998
Publication date: January 31, 2002
Inventors: RHONDA SHAW, ROLAND KUHN, STEVE PEARSON
-
Patent number: RE39336
Abstract: The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model that uses source signals that correspond closely to the human glottal source and that uses filter parameters that correspond closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques, one applied in the time domain in the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
Type: Grant
Filed: November 5, 2002
Date of Patent: October 10, 2006
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Steve Pearson, Nicholas Kibre, Nancy Niedzielski
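The dual cross fade in this abstract can be pictured with a minimal sketch: one fade blends the overlapping samples of two source waveforms, and a second ramps between two filter-parameter vectors. The abstract does not give the fade shapes or the parameter representation, so the linear ramps, the function names `crossfade` and `interpolate_params`, and the toy values are all assumptions made for illustration.

```python
def crossfade(a, b, overlap):
    """Join sample lists a and b, blending the last `overlap` samples of a
    with the first `overlap` samples of b using a linear ramp (assumed)."""
    if overlap < 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must fit inside both signals")
    start = len(a) - overlap
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of b rises from near 0 to near 1
        mixed.append((1.0 - w) * a[start + i] + w * b[i])
    return list(a[:start]) + mixed + list(b[overlap:])


def interpolate_params(p_left, p_right, steps):
    """Ramp linearly from one filter-parameter vector to another over
    `steps` frames, endpoints included (hypothetical parameterization)."""
    if steps < 2:
        raise ValueError("need at least two steps")
    frames = []
    for k in range(steps):
        w = k / (steps - 1)
        frames.append([(1.0 - w) * x + w * y for x, y in zip(p_left, p_right)])
    return frames


# Time-domain fade between two toy source waveforms.
joined = crossfade([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], 2)

# Frequency-domain side: glide between two toy filter-parameter vectors.
glide = interpolate_params([0.0, 2.0], [1.0, 4.0], 3)
```

Keeping the two fades separate mirrors the abstract's point that the source waveform and the filter parameters are smoothed independently at each unit boundary.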