Patents by Inventor Alexander Gutkin

Alexander Gutkin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and system for building text-to-speech voice from diverse recordings

Patent number: 9542927

Abstract: A method and system is disclosed for building a speech database for a text-to-speech (TTS) synthesis system from multiple speakers recorded under diverse conditions. For a plurality of utterances of a reference speaker, a set of reference-speaker vectors may be extracted, and for each of a plurality of utterances of a colloquial speaker, a respective set of colloquial-speaker vectors may be extracted. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each colloquial-speaker vector to a reference-speaker vector. The colloquial-speaker vector may be replaced with the matched reference-speaker vector. The matching-and-replacing can be carried out separately for each set of colloquial-speaker vectors. A conditioned set of speaker vectors can then be constructed by aggregating all the replaced speaker vectors. The conditioned set of speaker vectors can be used to train the TTS system.
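The matching-and-replacing step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not specify the transform or the distance measure, so the sketch assumes a simple mean shift as the speaker-compensating transform and squared Euclidean distance for matching. The names `match_and_replace` and `build_conditioned_set` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_and_replace(colloquial: Sequence[Vector],
                      reference: Sequence[Vector]) -> List[List[float]]:
    """Replace each colloquial-speaker vector with its matched reference vector.

    ASSUMPTION: the speaker-compensating transform is a mean shift that moves
    the colloquial vectors so their mean coincides with the reference mean.
    """
    shift = [r - c for r, c in zip(mean(reference), mean(colloquial))]
    replaced = []
    for vec in colloquial:
        shifted = [x + s for x, s in zip(vec, shift)]
        # Match under the transform: nearest reference vector to the shifted one.
        best = min(reference, key=lambda ref: sq_dist(shifted, ref))
        replaced.append(list(best))
    return replaced


def build_conditioned_set(colloquial_sets: Sequence[Sequence[Vector]],
                          reference: Sequence[Vector]) -> List[List[float]]:
    """Run matching-and-replacing separately per set, then aggregate."""
    conditioned: List[List[float]] = []
    for vectors in colloquial_sets:
        conditioned.extend(match_and_replace(vectors, reference))
    return conditioned
```

Each colloquial set gets its own shift, which mirrors the abstract's statement that matching-and-replacing is carried out separately for each set before aggregation.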

Type: Grant

Filed: November 13, 2014

Date of Patent: January 10, 2017

Assignee: Google Inc.

Inventors: Ioannis Agiomyrgiannakis, Alexander Gutkin

Statistical unit selection language models based on acoustic fingerprinting

Patent number: 9424835

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for providing statistical unit selection language modeling based on acoustic fingerprinting. The methods, systems and apparatus include the actions of obtaining a unit database of acoustic units and, for each acoustic unit, linguistic data corresponding to the acoustic unit; obtaining stored data associating each acoustic unit with (i) a corresponding acoustic fingerprint and (ii) a probability of the linguistic data corresponding to the acoustic unit occurring in a text corpus; determining that the unit database of acoustic units has been updated to include one or more new acoustic units; for each new acoustic unit in the updated unit database: generating an acoustic fingerprint for the new acoustic unit; identifying an acoustic unit that (i) has an acoustic fingerprint that is indicated as similar to the fingerprint of the new acoustic unit, and (ii) has a stored associated probability.
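The identification step at the end of this abstract can be sketched as a nearest-fingerprint lookup. The sketch rests on assumptions the abstract does not state: fingerprints are modeled as integer bit strings, similarity is Hamming distance, and `max_distance` is an illustrative threshold. The function name `find_similar_unit` is hypothetical. The sketch stops at identifying the similar unit, as the abstract text does.

```python
from typing import Dict, Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded fingerprints."""
    return bin(a ^ b).count("1")


def find_similar_unit(new_fingerprint: int,
                      fingerprints: Dict[str, int],
                      probabilities: Dict[str, float],
                      max_distance: int = 4) -> Optional[str]:
    """Return the id of the closest unit that has a stored probability.

    ASSUMPTION: "similar" means Hamming distance <= max_distance.
    Units without a stored probability are skipped, matching condition (ii).
    Returns None when no unit is close enough.
    """
    best_id: Optional[str] = None
    best_dist = max_distance + 1
    for unit_id, fingerprint in fingerprints.items():
        if unit_id not in probabilities:
            continue
        dist = hamming(new_fingerprint, fingerprint)
        if dist < best_dist:
            best_id, best_dist = unit_id, dist
    return best_id
```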

Type: Grant

Filed: September 10, 2015

Date of Patent: August 23, 2016

Assignee: Google Inc.

Inventors: Alexander Gutkin, Javier Gonzalvo Fructuoso, Cyril Georges Luc Allauzen

Method and System for Building Text-to-Speech Voice from Diverse Recordings

Publication number: 20160140951

Abstract: A method and system is disclosed for building a speech database for a text-to-speech (TTS) synthesis system from multiple speakers recorded under diverse conditions. For a plurality of utterances of a reference speaker, a set of reference-speaker vectors may be extracted, and for each of a plurality of utterances of a colloquial speaker, a respective set of colloquial-speaker vectors may be extracted. A matching procedure, carried out under a transform that compensates for speaker differences, may be used to match each colloquial-speaker vector to a reference-speaker vector. The colloquial-speaker vector may be replaced with the matched reference-speaker vector. The matching-and-replacing can be carried out separately for each set of colloquial-speaker vectors. A conditioned set of speaker vectors can then be constructed by aggregating all the replaced speaker vectors. The conditioned set of speaker vectors can be used to train the TTS system.

Type: Application

Filed: November 13, 2014

Publication date: May 19, 2016

Inventors: Ioannis Agiomyrgiannakis, Alexander Gutkin

STATISTICAL UNIT SELECTION LANGUAGE MODELS BASED ON ACOUSTIC FINGERPRINTING

Publication number: 20160093295

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for providing statistical unit selection language modeling based on acoustic fingerprinting. The methods, systems and apparatus include the actions of obtaining a unit database of acoustic units and, for each acoustic unit, linguistic data corresponding to the acoustic unit; obtaining stored data associating each acoustic unit with (i) a corresponding acoustic fingerprint and (ii) a probability of the linguistic data corresponding to the acoustic unit occurring in a text corpus; determining that the unit database of acoustic units has been updated to include one or more new acoustic units; for each new acoustic unit in the updated unit database: generating an acoustic fingerprint for the new acoustic unit; identifying an acoustic unit that (i) has an acoustic fingerprint that is indicated as similar to the fingerprint of the new acoustic unit, and (ii) has a stored associated probability.

Type: Application

Filed: September 10, 2015

Publication date: March 31, 2016

Inventors: Alexander Gutkin, Javier Gonzalvo Fructuoso, Cyril Georges Luc Allauzen

JOINT MULTIGRAM-BASED DETECTION OF SPELLING VARIANTS

Publication number: 20150234804

Abstract: Content processing includes receiving a set of correctly spelled alert words and at least one spelling variant corresponding to each correctly spelled alert word; determining at least one alignment of joint multigrams for each correctly spelled alert word/corresponding spelling variant pair; training a model of correspondence between the set of received orthographic alert words and corresponding spelling variants using the determined alignments; and receiving a spelling variant observation from a content block. Using the trained model, the technology determines a probability that the received spelling variant observation corresponds to a received correctly spelled alert word. For a determined probability exceeding a configured threshold, the technology denies automatic acceptance of the content block.

Type: Application

Filed: August 26, 2014

Publication date: August 20, 2015

Inventors: Matthew Nicholas Stuttle, Alexander Gutkin

Text-to-speech synthesis

Patent number: 9082401

Abstract: The present disclosure describes example systems, methods, and devices for generating a synthetic speech signal. An example method may include determining a phonemic representation of text. The example method may also include identifying one or more finite-state machines (“FSMs”) corresponding to one or more phonemes included in the phonemic representation of the text. A given FSM may be a compressed unit of recorded speech that simulates a Hidden Markov Model. The example method may further include determining a selected sequence of models that minimizes a cost function that represents a likelihood that a possible sequence of models substantially matches a phonemic representation of text. Each possible sequence of models may include at least one FSM. The method may additionally include generating a synthetic speech signal based on the selected sequence that includes one or more spectral features generated from at least one FSM included in the selected sequence.
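Selecting the sequence of models that minimizes a cost function can be sketched with dynamic programming over per-phoneme candidates. This is an illustration only: the abstract does not define the cost function, so the sketch assumes it decomposes into a per-model cost plus a cost for joining adjacent models. The names `select_sequence`, `unit_cost`, and `join_cost` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def select_sequence(candidates: Sequence[Sequence[str]],
                    unit_cost: Callable[[str], float],
                    join_cost: Callable[[str, str], float]
                    ) -> Tuple[float, List[str]]:
    """Return (total cost, model sequence) with the lowest total cost.

    candidates[i] lists the candidate model ids for the i-th phoneme.
    ASSUMPTIONS: every position has at least one candidate, and the total
    cost is the sum of unit costs plus join costs between neighbours.
    """
    # best[m] = cheapest (cost, path) among partial sequences ending in model m.
    best: Dict[str, Tuple[float, List[str]]] = {
        m: (unit_cost(m), [m]) for m in candidates[0]
    }
    for position in candidates[1:]:
        nxt: Dict[str, Tuple[float, List[str]]] = {}
        for m in position:
            # Pick the predecessor that gives the cheapest way to reach m.
            prev_cost, prev_path = min(
                ((cost + join_cost(p, m), path)
                 for p, (cost, path) in best.items()),
                key=lambda item: item[0],
            )
            nxt[m] = (prev_cost + unit_cost(m), prev_path + [m])
        best = nxt
    return min(best.values(), key=lambda item: item[0])
```

Keeping only the cheapest path into each candidate at every position avoids enumerating all possible sequences, which grow exponentially with the number of phonemes.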

Type: Grant

Filed: January 9, 2013

Date of Patent: July 14, 2015

Assignee: Google Inc.

Inventors: Javier Gonzalvo Fructuoso, Alexander Gutkin

Devices and methods for speech unit reduction in text-to-speech synthesis systems

Patent number: 8751236

Abstract: A device may receive a plurality of speech sounds that are indicative of pronunciations of a first linguistic term. The device may determine concatenation features of the plurality of speech sounds. The concatenation features may be indicative of an acoustic transition between a first speech sound and a second speech sound when the first speech sound and the second speech sound are concatenated. The first speech sound may be included in the plurality of speech sounds and the second speech sound may be indicative of a pronunciation of a second linguistic term. The device may cluster the plurality of speech sounds into one or more clusters based on the concatenation features. The device may provide a representative speech sound of the given cluster as the first speech sound when the first speech sound and the second speech sound are concatenated.
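The clustering and representative-selection steps can be sketched as follows. The abstract names neither a clustering algorithm nor a rule for choosing the representative, so this sketch swaps in simple leader (threshold) clustering and picks the medoid of each cluster. Feature vectors, the `threshold` parameter, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_sounds(features: Dict[str, Sequence[float]],
                   threshold: float) -> List[List[str]]:
    """Group speech sounds whose concatenation features are close.

    ASSUMPTION: leader clustering. A sound joins the first cluster whose
    first member (the leader) lies within `threshold`; otherwise it starts
    a new cluster.
    """
    clusters: List[List[str]] = []
    for sound_id, vec in features.items():
        for cluster in clusters:
            if sq_dist(vec, features[cluster[0]]) <= threshold ** 2:
                cluster.append(sound_id)
                break
        else:
            clusters.append([sound_id])
    return clusters


def representative(cluster: Sequence[str],
                   features: Dict[str, Sequence[float]]) -> str:
    """Pick the medoid: the member closest, in total, to all other members."""
    return min(
        cluster,
        key=lambda s: sum(sq_dist(features[s], features[o]) for o in cluster),
    )
```

Storing only the representative of each cluster is one way to reduce the number of speech units a synthesizer keeps, which is the goal the patent title describes.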

Type: Grant

Filed: October 23, 2013

Date of Patent: June 10, 2014

Assignee: Google Inc.

Inventors: Javier Gonzalvo Fructuoso, Alexander Gutkin, Ioannis Agiomyrgiannakis

Canless bi-cell

Publication number: 20070015021

Abstract: The invention provides a substantially flat, planar, metal-air canless bi-cell having two major surfaces formed of oppositely-disposed spaced-apart gas-permeable liquid-impermeable air-electrode cathodic material, defining therebetween a space containing a fluid anodic material comprising anodic metal particles and electrolyte.

Type: Application

Filed: July 18, 2005

Publication date: January 18, 2007

Inventors: Yaron Shrim, Ronald Putt, Jacob Rosenberg, Victor Bogdanovsky, Alexander Gutkin, Neal Naimer