Patents by Inventor Petar Aleksic

Petar Aleksic has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

NEGATIVE N-GRAM BIASING

Publication number: 20160365092

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing dynamic, stroke-based alignment of touch displays. In one aspect, a method includes obtaining a candidate transcription that an automated speech recognizer generates for an utterance, determining a particular context associated with the utterance, determining that a particular n-gram that is included in the candidate transcription is included among a set of undesirable n-grams that is associated with the context, adjusting a speech recognition confidence score associated with the transcription based on determining that the particular n-gram that is included in the candidate transcription is included among the set of undesirable n-grams that is associated with the context, and determining whether to provide the candidate transcription for output based at least on the adjusted speech recognition confidence score.

Type: Application

Filed: June 15, 2015

Publication date: December 15, 2016

Inventors: Pedro J. Moreno Mengibar, Petar Aleksic
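
The steps in the abstract above can be sketched in a few lines. This is only an illustrative sketch, not the method claimed in the application: the function names, the multiplicative `penalty`, the output `threshold`, and the example n-gram set are all assumptions introduced here.

```python
from typing import Dict, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> Set[NGram]:
    """Return the set of n-grams of length n found in tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def adjust_confidence(
    transcription: str,
    confidence: float,
    context: str,
    undesirable: Dict[str, Set[NGram]],
    penalty: float = 0.5,  # hypothetical down-weighting factor
    max_n: int = 3,        # hypothetical longest n-gram to check
) -> float:
    """Lower the score if the candidate contains an n-gram that is
    marked undesirable for the given context."""
    tokens = transcription.lower().split()
    blocked = undesirable.get(context, set())
    for n in range(1, max_n + 1):
        if ngrams(tokens, n) & blocked:
            return confidence * penalty
    return confidence


def should_output(adjusted: float, threshold: float = 0.6) -> bool:
    """Decide whether to provide the candidate, using the adjusted score."""
    return adjusted >= threshold
```

For example, with `{"navigation": {("call", "mom")}}` as the undesirable set, the candidate "Call mom now" is penalized in the "navigation" context and left unchanged in any other context.
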
Dynamically biasing language models

Patent number: 9502032

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a method comprises receiving audio data encoding one or more utterances; performing a first speech recognition on the audio data; identifying a context based on the first speech recognition; performing a second speech recognition on the audio data that is biased towards the context; and providing an output of the second speech recognition.

Type: Grant

Filed: October 28, 2014

Date of Patent: November 22, 2016

Assignee: Google Inc.

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
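
A toy sketch of the two-pass flow described in the abstract above, under loud assumptions: the "audio data" is stood in for by a dictionary of candidate transcripts with base scores, and the keyword and bias tables (`CONTEXT_KEYWORDS`, `CONTEXT_BIAS`) are hypothetical. It is not the patented implementation, only an illustration of recognizing once, identifying a context, and recognizing again with a bias towards that context.

```python
from typing import Dict, Optional

# Toy stand-in for audio data: candidate transcript -> base recognizer score.
Hypotheses = Dict[str, float]

# Hypothetical tables: words that signal a context, and per-context word boosts.
CONTEXT_KEYWORDS = {"music": {"play"}, "navigation": {"navigate", "directions"}}
CONTEXT_BIAS = {"music": {"beatles": 0.3}, "navigation": {"street": 0.3}}


def recognize(audio: Hypotheses, context: Optional[str] = None) -> str:
    """Pick the best candidate, adding context boosts when a context is given."""
    boosts = CONTEXT_BIAS.get(context, {})

    def score(text: str) -> float:
        return audio[text] + sum(boosts.get(word, 0.0) for word in text.split())

    return max(audio, key=score)


def identify_context(transcript: str) -> Optional[str]:
    """Infer a context from keywords in the first-pass transcript."""
    words = set(transcript.split())
    for context, keywords in CONTEXT_KEYWORDS.items():
        if words & keywords:
            return context
    return None


def two_pass_recognize(audio: Hypotheses) -> str:
    first = recognize(audio)             # first speech recognition
    context = identify_context(first)    # context from the first result
    return recognize(audio, context)     # second, context-biased recognition
```

With candidates `{"play the beetles": 0.5, "play the beatles": 0.4}`, the first pass prefers "play the beetles", the word "play" selects the music context, and the biased second pass returns "play the beatles".
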
SPEECH RECOGNITION FOR KEYWORDS

Publication number: 20160335677

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition are disclosed. In one aspect, a method includes receiving a candidate adword from an advertiser. The method further includes generating a score for the candidate adword based on a likelihood of a speech recognizer generating, based on an utterance of the candidate adword, a transcription that includes a word that is associated with an expected pronunciation of the candidate adword. The method further includes classifying, based at least on the score, the candidate adword as an appropriate adword for use in a bidding process for advertisements that are selected based on a transcription of a speech query or as not an appropriate adword for use in the bidding process for advertisements that are selected based on the transcription of the speech query.

Type: Application

Filed: May 13, 2015

Publication date: November 17, 2016

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
Online incremental adaptation of deep neural networks using auxiliary Gaussian mixture models in speech recognition

Patent number: 9466292

Abstract: Methods and systems for online incremental adaptation of neural networks using Gaussian mixture models in speech recognition are described. In an example, a computing device may be configured to receive an audio signal and a subsequent audio signal, both signals having speech content. The computing device may be configured to apply a speaker-specific feature transform to the audio signal to obtain a transformed audio signal. The speaker-specific feature transform may be configured to include speaker-specific speech characteristics of a speaker-profile relating to the speech content. Further, the computing device may be configured to process the transformed audio signal using a neural network trained to estimate a respective speech content of the audio signal. Based on outputs of the neural network, the computing device may be configured to modify the speaker-specific feature transform, and apply the modified speaker-specific feature transform to a subsequent audio signal.

Type: Grant

Filed: May 3, 2013

Date of Patent: October 11, 2016

Assignee: Google Inc.

Inventors: Xin Lei, Petar Aleksic
LANGUAGE MODEL BIASING MODULATION

Publication number: 20160293163

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).

Type: Application

Filed: March 30, 2015

Publication date: October 6, 2016

Inventors: Pedro J. Moreno Mengibar, Petar Aleksic
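
One way to picture the modulation described in the abstract above is to scale each biasing parameter by the context confidence before applying it to a baseline model. The sketch below assumes a unigram model stored as a dictionary, a normalized-score definition of confidence, and a linear scaling rule; none of these details come from the application.

```python
from typing import Dict, Tuple


def likely_context(context_scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the highest-scoring context and a confidence in [0, 1]
    (assumed here to be its share of the total score)."""
    total = sum(context_scores.values())
    best = max(context_scores, key=context_scores.get)
    return best, context_scores[best] / total


def bias_language_model(
    baseline: Dict[str, float],
    boosts: Dict[str, float],
    confidence: float,
) -> Dict[str, float]:
    """Apply boosts scaled by confidence, then renormalize.

    confidence 0 leaves the baseline unchanged; confidence 1 applies
    each boost factor in full.
    """
    biased = {
        word: prob * (1.0 + confidence * (boosts.get(word, 1.0) - 1.0))
        for word, prob in baseline.items()
    }
    norm = sum(biased.values())
    return {word: prob / norm for word, prob in biased.items()}
```

For instance, context scores of `{"restaurant": 3.0, "travel": 1.0}` give the context "restaurant" with confidence 0.75, so a hypothetical boost on "pizza" is applied at three quarters of its full strength.
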
Language model biasing modulation

Patent number: 9460713

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).

Type: Grant

Filed: March 30, 2015

Date of Patent: October 4, 2016

Assignee: Google Inc.

Inventors: Pedro J. Moreno Mengibar, Petar Aleksic
DYNAMICALLY BIASING LANGUAGE MODELS

Publication number: 20160104482

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a method comprises receiving audio data encoding one or more utterances; performing a first speech recognition on the audio data; identifying a context based on the first speech recognition; performing a second speech recognition on the audio data that is biased towards the context; and providing an output of the second speech recognition.

Type: Application

Filed: October 28, 2014

Publication date: April 14, 2016

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
Multiple recognizer speech recognition

Patent number: 9293136

Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving audio data that corresponds to an utterance, obtaining a first transcription of the utterance that was generated using a limited speech recognizer. The limited speech recognizer includes a speech recognizer that includes a language model that is trained over a limited speech recognition vocabulary that includes one or more terms from a voice command grammar, but that includes fewer than all terms of an expanded grammar. A second transcription of the utterance is obtained that was generated using an expanded speech recognizer. The expanded speech recognizer includes a speech recognizer that includes a language model that is trained over an expanded speech recognition vocabulary that includes all of the terms of the expanded grammar. The utterance is classified based at least on a portion of the first transcription or the second transcription.

Type: Grant

Filed: June 1, 2015

Date of Patent: March 22, 2016

Assignee: Google Inc.

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar, Fadi Biadsy
Multiple Recognizer Speech Recognition

Publication number: 20150262581

Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving audio data that corresponds to an utterance, obtaining a first transcription of the utterance that was generated using a limited speech recognizer. The limited speech recognizer includes a speech recognizer that includes a language model that is trained over a limited speech recognition vocabulary that includes one or more terms from a voice command grammar, but that includes fewer than all terms of an expanded grammar. A second transcription of the utterance is obtained that was generated using an expanded speech recognizer. The expanded speech recognizer includes a speech recognizer that includes a language model that is trained over an expanded speech recognition vocabulary that includes all of the terms of the expanded grammar. The utterance is classified based at least on a portion of the first transcription or the second transcription.

Type: Application

Filed: June 1, 2015

Publication date: September 17, 2015

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar, Fadi Biadsy
Multiple recognizer speech recognition

Patent number: 9058805

Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving audio data that corresponds to an utterance, obtaining a first transcription of the utterance that was generated using a limited speech recognizer. The limited speech recognizer includes a speech recognizer that includes a language model that is trained over a limited speech recognition vocabulary that includes one or more terms from a voice command grammar, but that includes fewer than all terms of an expanded grammar. A second transcription of the utterance is obtained that was generated using an expanded speech recognizer. The expanded speech recognizer includes a speech recognizer that includes a language model that is trained over an expanded speech recognition vocabulary that includes all of the terms of the expanded grammar. The utterance is classified based at least on a portion of the first transcription or the second transcription.

Type: Grant

Filed: May 13, 2013

Date of Patent: June 16, 2015

Assignee: Google Inc.

Inventors: Petar Aleksic, Pedro J. Mengibar, Fadi Biadsy
Multi-stage speaker adaptation

Patent number: 8996366

Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.

Type: Grant

Filed: February 17, 2014

Date of Patent: March 31, 2015

Assignee: Google Inc.

Inventors: Petar Aleksic, Xin Lei
VIDEO ANALYSIS BASED LANGUAGE MODEL ADAPTATION

Publication number: 20140379346

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data obtained by a microphone of a wearable computing device, wherein the audio data encodes a user utterance, receiving image data obtained by a camera of the wearable computing device, identifying one or more image features based on the image data, identifying one or more concepts based on the one or more image features, selecting one or more terms associated with a language model used by a speech recognizer to generate transcriptions, adjusting one or more probabilities associated with the language model that correspond to one or more of the selected terms based on the relevance of one or more of the selected terms to the one or more concepts, and obtaining a transcription of the user utterance using the speech recognizer.

Type: Application

Filed: June 21, 2013

Publication date: December 25, 2014

Inventors: Petar Aleksic, Xin Lei
Multiple Recognizer Speech Recognition

Publication number: 20140337032

Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving audio data that corresponds to an utterance, obtaining a first transcription of the utterance that was generated using a limited speech recognizer. The limited speech recognizer includes a speech recognizer that includes a language model that is trained over a limited speech recognition vocabulary that includes one or more terms from a voice command grammar, but that includes fewer than all terms of an expanded grammar. A second transcription of the utterance is obtained that was generated using an expanded speech recognizer. The expanded speech recognizer includes a speech recognizer that includes a language model that is trained over an expanded speech recognition vocabulary that includes all of the terms of the expanded grammar. The utterance is classified based at least on a portion of the first transcription or the second transcription.

Type: Application

Filed: May 13, 2013

Publication date: November 13, 2014

Inventors: Petar Aleksic, Pedro J. Moreno Mengibar, Fadi Biadsy
Localized speech recognition with offload

Patent number: 8880398

Abstract: A local computing device may receive an utterance from a user device. In response to receiving the utterance, the local computing device may obtain a text string transcription of the utterance, and determine a response mode for the utterance. If the response mode is a text-based mode, the local computing device may provide the text string transcription to a target device. If the response mode is a non-text-based mode, the local computing device may convert the text string transcription into one or more commands from a command set supported by the target device, and provide the one or more commands to the target device.

Type: Grant

Filed: January 21, 2013

Date of Patent: November 4, 2014

Assignee: Google Inc.

Inventors: Petar Aleksic, Xin Lei
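
The branching on response mode described in the abstract above might look roughly like the following. The mode names, the word-to-command table, and the return shape are hypothetical choices made for this sketch, not details from the patent.

```python
from typing import Dict, List, Tuple, Union

Payload = Union[str, List[str]]


def route(
    transcription: str,
    response_mode: str,
    command_map: Dict[str, str],
) -> Tuple[str, Payload]:
    """Return what the local device would send to the target device.

    In text mode the transcription is forwarded as-is. Otherwise the
    transcription is converted into commands from the target device's
    supported command set (here, a simple word lookup).
    """
    if response_mode == "text":
        return "text", transcription
    commands = [
        command_map[word]
        for word in transcription.lower().split()
        if word in command_map
    ]
    return "commands", commands
```

With a command set of `{"up": "VOLUME_UP", "mute": "MUTE"}`, the utterance "Turn it up" is passed through unchanged in text mode and becomes `["VOLUME_UP"]` in the non-text mode.
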
Realtime acoustic adaptation using stability measures

Patent number: 8849664

Abstract: Methods, systems, and computer programs encoded on a computer storage medium for real-time acoustic adaptation using stability measures are disclosed. The methods include the actions of receiving a transcription of a first portion of a speech session, wherein the transcription of the first portion of the speech session is generated using a speaker adaptation profile. The actions further include receiving a stability measure for a segment of the transcription and determining that the stability measure for the segment satisfies a threshold. Additionally, the actions include triggering an update of the speaker adaptation profile using the segment, or using a portion of speech data that corresponds to the segment. And the actions include receiving a transcription of a second portion of the speech session, wherein the transcription of the second portion of the speech session is generated using the updated speaker adaptation profile.

Type: Grant

Filed: July 16, 2013

Date of Patent: September 30, 2014

Assignee: Google Inc.

Inventors: Xin Lei, Petar Aleksic
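
The gating idea in the abstract above, updating the speaker adaptation profile only from segments whose stability measure satisfies a threshold, can be sketched as follows. The `Segment` and `SpeakerProfile` classes, the 0.8 threshold, and the version counter are assumptions for illustration; a real system would re-estimate acoustic adaptation parameters rather than store text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    text: str
    stability: float  # assumed to lie in [0, 1]


@dataclass
class SpeakerProfile:
    version: int = 0
    adaptation_texts: List[str] = field(default_factory=list)


def update_profile(
    profile: SpeakerProfile,
    segments: List[Segment],
    threshold: float = 0.8,  # hypothetical stability threshold
) -> bool:
    """Trigger a profile update using only sufficiently stable segments.

    Returns True if an update was triggered, False otherwise.
    """
    stable = [s.text for s in segments if s.stability >= threshold]
    if not stable:
        return False
    profile.adaptation_texts.extend(stable)
    profile.version += 1
    return True
```

The second portion of the speech session would then be transcribed with the updated profile, while unstable segments never influence it.
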
Distributed speaker adaptation

Patent number: 8805684

Abstract: Automatic speech recognition (ASR) may be performed on received utterances. The ASR may be performed by an ASR module of a computing device (e.g., a client device). The ASR may include: generating feature vectors based on the utterances, updating the feature vectors based on feature-space speaker adaptation parameters, transcribing the utterances to text strings, and updating the feature-space speaker adaptation parameters based on the feature vectors. The transcriptions may be based, at least in part, on an acoustic model and the updated feature vectors. Updated speaker adaptation parameters may be received from another computing device and incorporated into the ASR module.

Type: Grant

Filed: October 17, 2012

Date of Patent: August 12, 2014

Assignee: Google Inc.

Inventors: Petar Aleksic, Xin Lei
Multi-Stage Speaker Adaptation

Publication number: 20140163985

Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.

Type: Application

Filed: February 17, 2014

Publication date: June 12, 2014

Applicant: Google Inc.

Inventors: Petar Aleksic, Xin Lei
Multi-stage speaker adaptation

Patent number: 8700393

Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.

Type: Grant

Filed: September 24, 2013

Date of Patent: April 15, 2014

Assignee: Google Inc.

Inventors: Petar Aleksic, Xin Lei
Rotating packed bed

Patent number: 8679232

Abstract: A rotating packed bed RPB that includes a first and second packed bed both arranged on the same rotatable shaft. A gas is directed via a gas inlet through the first packed bed in co-current flow with a liquid in a radially outward direction towards the outer radius of the packed bed. The liquid enters the first packed bed via a first liquid inlet. The gas exiting the first packed bed is directed to the second packed bed and forced through it in a radially inward direction in counter-current flow with a liquid, which enters through a second liquid inlet. The arrangement allows an operation of the rotating packed bed with less energy compared to RPBs of the prior art operating in counter-current flow only. The apparatus allows low-cost design and high design flexibility.

Type: Grant

Filed: August 7, 2013

Date of Patent: March 25, 2014

Assignee: ALSTOM Technology Ltd

Inventors: Hartwig Wolf, Petar Aleksic, Frank Klaus Ennenbach, Mark Harvey Tothill
Multi-Stage Speaker Adaptation

Publication number: 20140025378

Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.

Type: Application

Filed: September 24, 2013

Publication date: January 23, 2014

Applicant: Google Inc.

Inventors: Petar Aleksic, Xin Lei
