Patents by Inventor Matthias Neeracher

Matthias Neeracher has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Text normalization based on a data-driven learning network

Patent number: 10395654

Abstract: Systems and processes for operating an intelligent automated assistant to perform text-to-speech conversion are provided. An example method includes, at an electronic device having one or more processors, receiving a text corpus comprising unstructured natural language text. The method further includes generating a sequence of normalized text based on the received text corpus; and generating a pronunciation sequence representing the sequence of the normalized text. The method further includes causing an audio output to be provided to the user based on the pronunciation sequence. At least one of the sequence of normalized text and the pronunciation sequence is generated based on a data-driven learning network.
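The stages named in this abstract (normalize the text, derive a pronunciation sequence, then produce audio) can be illustrated with a minimal sketch. It is not the patented method: the abstract says at least one stage uses a data-driven learning network, whereas this sketch swaps in small hand-written lookup tables. The names `ABBREVIATIONS`, `DIGITS`, `LEXICON`, `normalize`, and `pronounce` are all hypothetical, and the audio stage is omitted.

```python
import re

# Hypothetical lookup tables standing in for the learned network.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]
LEXICON = {"two": ["T", "UW"], "street": ["S", "T", "R", "IY", "T"]}


def normalize(text):
    """Turn raw text into a sequence of plain spoken-form words."""
    tokens = []
    for raw in text.lower().split():
        if raw in ABBREVIATIONS:
            tokens.append(ABBREVIATIONS[raw])
        elif raw.isdecimal():
            # Read numbers digit by digit (a deliberate simplification).
            tokens.extend(DIGITS[int(d)] for d in raw)
        else:
            word = re.sub(r"[^a-z']", "", raw)
            if word:
                tokens.append(word)
    return tokens


def pronounce(tokens):
    """Map each normalized word to a phoneme list, or a placeholder."""
    return [LEXICON.get(tok, ["<unk>"]) for tok in tokens]


words = normalize("Dr. Smith lives at 42 Elm St.")
phonemes = pronounce(words)
```

A real system would pass the resulting pronunciation sequence to a synthesizer to produce the audio output.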

Type: Grant

Filed: August 10, 2017

Date of Patent: August 27, 2019

Assignee: Apple Inc.

Inventors: Ladan Golipour, Matthias Neeracher, Ramya Rasipuram

TEXT NORMALIZATION BASED ON A DATA-DRIVEN LEARNING NETWORK

Publication number: 20180330729

Abstract: Systems and processes for operating an intelligent automated assistant to perform text-to-speech conversion are provided. An example method includes, at an electronic device having one or more processors, receiving a text corpus comprising unstructured natural language text. The method further includes generating a sequence of normalized text based on the received text corpus; and generating a pronunciation sequence representing the sequence of the normalized text. The method further includes causing an audio output to be provided to the user based on the pronunciation sequence. At least one of the sequence of normalized text and the pronunciation sequence is generated based on a data-driven learning network.

Type: Application

Filed: August 10, 2017

Publication date: November 15, 2018

Inventors: Ladan GOLIPOUR, Matthias NEERACHER, Ramya RASIPURAM

Multi-unit approach to text-to-speech synthesis

Patent number: 8036894

Abstract: Methods, apparatus, systems, and computer program products are provided for synthesizing speech. One method includes matching a first level of units of a received input string to audio segments from a plurality of audio segments including using properties of or between first level units to locate matching audio segments from a plurality of selections, parsing unmatched first level units into second level units, matching the second level units to audio segments using properties of or between the units to locate matching audio segments from a plurality of selections and synthesizing the input string, including combining the audio segments associated with the first and second units.
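The two-level matching described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: first-level units are taken to be whitespace-separated words, second-level units are produced by a caller-supplied `split` function, audio segments are represented by file-name strings, and the "properties of or between units" used to choose among candidate segments are ignored. The names `synthesize`, `word_segments`, and `sub_segments` are hypothetical.

```python
def synthesize(text, word_segments, sub_segments, split):
    """Return the ordered list of audio segments for `text`.

    Words with a recorded segment are used whole; any other word is
    split into smaller units, each of which must have its own segment.
    """
    plan = []
    for word in text.split():
        if word in word_segments:
            # First-level match: the whole word has a recording.
            plan.append(word_segments[word])
        else:
            # Second-level match: fall back to smaller units.
            for unit in split(word):
                plan.append(sub_segments[unit])  # KeyError if unit is unknown
    return plan


word_segments = {"hello": "hello.wav"}
sub_segments = {"b": "b.wav", "o": "o.wav"}

# `list` splits a word into single characters (a toy second-level unit).
plan = synthesize("hello bob", word_segments, sub_segments, list)
```

Here `plan` holds one whole-word segment for "hello" followed by three character-level segments for "bob", in playback order.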

Type: Grant

Filed: February 16, 2006

Date of Patent: October 11, 2011

Assignee: Apple Inc.

Inventors: Matthias Neeracher, Devang K. Naik, Kevin B. Aitken, Jerome R. Bellegarda, Kim E. A. Silverman

Using non-speech sounds during text-to-speech synthesis

Patent number: 8027837

Abstract: Systems, apparatus, methods and computer program products are described for producing text-to-speech synthesis with non-speech sounds. In general, some of the pauses or silences that would otherwise be generated in synthesized speech are instead synthesized as non-speech sounds such as breaths. Non-speech sounds can be identified from pre-recorded speech that can include meta-data such as the grammatical and phrasal structure of words and sounds that precede and succeed non-speech sounds. A non-speech sound can be selected for use in synthesized speech based on the words, punctuation, grammatical and phrasal structure of text from which the speech is being synthesized, or other characteristics.
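As a rough illustration of the idea in this abstract, the sketch below plans a sequence of synthesis events in which some pauses become breaths. The selection rule is an assumption made purely for the example (a breath after sentence-final punctuation when more text follows, plain silence elsewhere); the abstract describes selection driven by words, punctuation, and grammatical and phrasal structure learned from pre-recorded speech. The name `plan_events` and the event labels are hypothetical.

```python
import re

SENTENCE_END = {".", "!", "?"}
CLAUSE_BREAK = {",", ";"}


def plan_events(text):
    """Return (kind, value) events: words, silences, and breaths."""
    tokens = re.findall(r"[A-Za-z']+|[.,!?;]", text)
    events = []
    for i, tok in enumerate(tokens):
        if tok in SENTENCE_END:
            more_text_follows = i < len(tokens) - 1
            # Assumed rule: breathe between sentences, stay silent at the end.
            events.append(("pause", "breath" if more_text_follows else "silence"))
        elif tok in CLAUSE_BREAK:
            events.append(("pause", "silence"))
        else:
            events.append(("word", tok))
    return events


events = plan_events("Wait, I see. Go on.")
```

For this input, the comma and the final period yield silences, while the period between the two sentences yields a breath.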

Type: Grant

Filed: September 15, 2006

Date of Patent: September 27, 2011

Assignee: Apple Inc.

Inventors: Kim E. A. Silverman, Matthias Neeracher

USING NON-SPEECH SOUNDS DURING TEXT-TO-SPEECH SYNTHESIS

Publication number: 20080071529

Abstract: Systems, apparatus, methods and computer program products are described for producing text-to-speech synthesis with non-speech sounds. In general, some of the pauses or silences that would otherwise be generated in synthesized speech are instead synthesized as non-speech sounds such as breaths. Non-speech sounds can be identified from pre-recorded speech that can include meta-data such as the grammatical and phrasal structure of words and sounds that precede and succeed non-speech sounds. A non-speech sound can be selected for use in synthesized speech based on the words, punctuation, grammatical and phrasal structure of text from which the speech is being synthesized, or other characteristics.

Type: Application

Filed: September 15, 2006

Publication date: March 20, 2008

Inventors: Kim E. A. Silverman, Matthias Neeracher

Multi-unit approach to text-to-speech synthesis

Publication number: 20070192105

Abstract: Methods, apparatus, systems, and computer program products are provided for synthesizing speech. One method includes matching a first level of units of a received input string to audio segments from a plurality of audio segments including using properties of or between first level units to locate matching audio segments from a plurality of selections, parsing unmatched first level units into second level units, matching the second level units to audio segments using properties of or between the units to locate matching audio segments from a plurality of selections and synthesizing the input string, including combining the audio segments associated with the first and second units.

Type: Application

Filed: February 16, 2006

Publication date: August 16, 2007

Inventors: Matthias Neeracher, Devang K. Naik, Kevin B. Aitken, Jerome R. Bellegarda, Kim E. A. Silverman

Combined dual spectral and temporal alignment method for user authentication by voice

Patent number: 6697779

Abstract: A method and system for training a user authentication by voice signal are described. In one embodiment, during training, a set of all spectral feature vectors for a given speaker is globally decomposed into speaker-specific decomposition units and a speaker-specific recognition unit. During recognition, spectral feature vectors are locally decomposed into speaker-specific characteristic units. The speaker-specific recognition unit is used together with selected speaker-specific characteristic units to compute a speaker-specific comparison unit. If the speaker-specific comparison unit is within a threshold limit, then the voice signal is authenticated. In addition, a speaker-specific content unit is time-aligned with selected speaker-specific characteristic units. If the alignment is within a threshold limit, then the voice signal is authenticated. In one embodiment, if both thresholds are satisfied, then the user is authenticated.
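The "both thresholds" embodiment can be sketched as a two-part check. This is only an illustration and not the patented method: the speaker-specific decomposition described in the abstract is replaced here by a plain Euclidean distance between mean feature vectors, and the time alignment is replaced by a basic dynamic time warping (DTW) cost. The function names, the feature frames, and the threshold values are all hypothetical.

```python
import math


def mean_vector(frames):
    """Average a list of equal-length feature frames, column by column."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def spectral_distance(a, b):
    """Order-independent comparison: distance between mean frames."""
    return math.dist(mean_vector(a), mean_vector(b))


def dtw_distance(a, b):
    """Order-sensitive comparison: length-normalized DTW alignment cost."""
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[-1][-1] / (len(a) + len(b))


def authenticate(enrolled, attempt, spectral_limit, temporal_limit):
    """Accept only if both the spectral and the temporal check pass."""
    return (spectral_distance(enrolled, attempt) <= spectral_limit
            and dtw_distance(enrolled, attempt) <= temporal_limit)


enrolled = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
reordered = [[2.0, 2.0], [1.0, 1.0], [0.0, 0.0]]  # same frames, wrong order
```

The `reordered` attempt shows why two checks are useful: its mean frame equals that of `enrolled`, so the spectral check alone would accept it, but the alignment cost is high and `authenticate` rejects it.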

Type: Grant

Filed: September 29, 2000

Date of Patent: February 24, 2004

Assignee: Apple Computer, Inc.

Inventors: Jerome Bellegarda, Devang Naik, Matthias Neeracher, Kim Silverman
