Patents by Inventor Carolina Parada

Carolina Parada has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200168242
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end of query model. The actions further include determining a confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score that reflects the likelihood that the utterance is a complete utterance to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.
    Type: Application
    Filed: January 31, 2020
    Publication date: May 28, 2020
    Applicant: Google LLC
    Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
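The abstract's decision flow — score the audio with an end-of-query model, compare against a threshold, then keep or release the microphone — can be sketched as follows. This is only an illustrative stand-in: `end_of_query_score` is a hypothetical placeholder for the trained model, which the abstract does not specify.

```python
def end_of_query_score(audio_frames):
    # Placeholder "model": fraction of the trailing frames that are near-silent.
    tail = audio_frames[-5:]
    return sum(1 for f in tail if abs(f) < 0.01) / len(tail)

def microphone_instruction(audio_frames, threshold=0.6):
    """Return 'deactivate' if the utterance is likely complete,
    else 'maintain' to keep the microphone in an active state."""
    confidence = end_of_query_score(audio_frames)
    likely_complete = confidence >= threshold
    return "deactivate" if likely_complete else "maintain"

# Utterance ending in silence -> the query is likely complete.
speech = [0.5, -0.4, 0.3, 0.2, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0]
print(microphone_instruction(speech))  # "deactivate"
```

In a real system the confidence would come from a learned model over acoustic features; only the threshold comparison and the two-way microphone instruction mirror the claim structure.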
  • Publication number: 20200117996
    Abstract: A method for training an endpointer model includes obtaining short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.
    Type: Application
    Filed: December 11, 2019
    Publication date: April 16, 2020
    Applicant: Google LLC
    Inventors: Shuo-yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
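The multitask loss described above — two classifier heads over a shared representation, each scored with cross-entropy against reference labels, then combined — can be sketched in a few lines. The per-frame probabilities below are invented stand-ins for the outputs of the shared network and its two heads, which the abstract does not detail.

```python
import math

def cross_entropy(predicted_probs, reference_labels):
    """Mean cross-entropy between per-frame probabilities of label 1
    and a sequence of 0/1 reference labels."""
    total = 0.0
    for p, y in zip(predicted_probs, reference_labels):
        total += -math.log(p if y == 1 else 1.0 - p)
    return total / len(reference_labels)

# Hypothetical per-frame outputs of the VAD and EOQ classifier heads.
vad_probs = [0.9, 0.8, 0.7, 0.2]   # speech vs. silence
eoq_probs = [0.1, 0.1, 0.2, 0.8]   # end-of-query vs. not
vad_refs = [1, 1, 1, 0]
eoq_refs = [0, 0, 0, 1]

vad_loss = cross_entropy(vad_probs, vad_refs)
eoq_loss = cross_entropy(eoq_probs, eoq_refs)
total_loss = vad_loss + eoq_loss   # joint objective for the endpointer
```

Training the shared network on the summed loss is what lets one model serve both detection tasks.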
  • Patent number: 10593352
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end of query model. The actions further include determining a confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score that reflects the likelihood that the utterance is a complete utterance to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.
    Type: Grant
    Filed: June 6, 2018
    Date of Patent: March 17, 2020
    Assignee: Google LLC
    Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
  • Publication number: 20200051551
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for keyword spotting. One of the methods includes training, by a keyword detection system, a convolutional neural network for keyword detection by providing a two-dimensional set of input values to the convolutional neural network, the input values including a first dimension in time and a second dimension in frequency, and performing convolutional multiplication on the two-dimensional set of input values for a filter using a frequency stride greater than one to generate a feature map.
    Type: Application
    Filed: October 16, 2019
    Publication date: February 13, 2020
    Applicant: Google LLC
    Inventors: Tara N. Sainath, Maria Carolina Parada San Martin
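The key operation above — a convolution over a two-dimensional time-by-frequency input, sliding the filter with a stride greater than one along the frequency axis — can be sketched directly. The filter values and dimensions are illustrative assumptions, not taken from the patent; the point is how the frequency stride shrinks the feature map.

```python
def conv2d(inputs, filt, freq_stride):
    """Valid 2-D convolution with stride 1 in time and the given
    stride in frequency, over inputs[time][frequency]."""
    t_in, f_in = len(inputs), len(inputs[0])
    t_k, f_k = len(filt), len(filt[0])
    feature_map = []
    for t in range(t_in - t_k + 1):
        row = []
        for f in range(0, f_in - f_k + 1, freq_stride):
            acc = sum(inputs[t + i][f + j] * filt[i][j]
                      for i in range(t_k) for j in range(f_k))
            row.append(acc)
        feature_map.append(row)
    return feature_map

spectrogram = [[1.0] * 8 for _ in range(4)]  # 4 time frames x 8 frequency bins
filt = [[0.5, 0.5], [0.5, 0.5]]              # 2x2 filter
fmap = conv2d(spectrogram, filt, freq_stride=2)
# 8 bins, 2-wide filter, stride 2 -> 4 frequency positions per time step
```

With a frequency stride of 1 the same input would yield 7 frequency positions per time step; the stride of 2 roughly halves the multiplications, which is the efficiency the claim targets.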
  • Publication number: 20190384304
    Abstract: In various examples, a deep learning solution for path detection is implemented to generate a more abstract definition of a drivable path without reliance on explicit lane-markings—by using a detection-based approach. Using approaches of the present disclosure, the identification of drivable paths may be possible in environments where conventional approaches are unreliable, or fail—such as where lane markings do not exist or are occluded. The deep learning solution may generate outputs that represent geometries for one or more drivable paths in an environment and confidence values corresponding to the path types or classes to which the geometries correspond. These outputs may be directly usable by an autonomous vehicle—such as an autonomous driving software stack—with minimal post-processing.
    Type: Application
    Filed: June 6, 2019
    Publication date: December 19, 2019
    Inventors: Regan Blythe Towal, Maroof Mohammed Farooq, Vijay Chintalapudi, Carolina Parada, David Nister
  • Publication number: 20190304459
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classification using neural networks. One method includes receiving audio data corresponding to an utterance, obtaining a transcription of the utterance, generating a representation of the audio data, generating a representation of the transcription of the utterance, and providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistant or is likely not directed to an automated assistant.
    Type: Application
    Filed: May 2, 2019
    Publication date: October 3, 2019
    Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
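The classification step above — combine a representation of the audio with a representation of its transcription and feed both to a trained classifier — can be sketched as follows. The logistic "classifier" and its weights are invented placeholders; the patent covers trained classifiers generally, not this particular form.

```python
import math

def classify_directedness(audio_repr, transcript_repr, weights, bias):
    """Concatenate the two representations and apply a logistic unit;
    returns the probability the utterance is directed at the assistant."""
    features = audio_repr + transcript_repr
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))

audio_repr = [0.2, 0.9]        # e.g. a pooled acoustic embedding (assumed)
transcript_repr = [0.8, 0.1]   # e.g. a pooled text embedding (assumed)
weights = [1.0, 2.0, 1.5, -0.5]
p = classify_directedness(audio_repr, transcript_repr, weights, bias=-1.0)
likely_directed = p >= 0.5
```

Fusing both modalities is the substance of the claim: prosody in the audio and content in the transcription each carry part of the "was that meant for me?" signal.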
  • Publication number: 20190266418
    Abstract: In various examples, sensor data representative of an image of a field of view of a vehicle sensor may be received and the sensor data may be applied to a machine learning model. The machine learning model may compute a segmentation mask representative of portions of the image corresponding to lane markings of the driving surface of the vehicle. Analysis of the segmentation mask may be performed to determine lane marking types, and lane boundaries may be generated by performing curve fitting on the lane markings corresponding to each of the lane marking types. The data representative of the lane boundaries may then be sent to a component of the vehicle for use in navigating the vehicle through the driving surface.
    Type: Application
    Filed: February 26, 2019
    Publication date: August 29, 2019
    Inventors: Yifang Xu, Xin Liu, Chia-Chih Chen, Carolina Parada, Davide Onofrio, Minwoo Park, Mehdi Sajjadi Mohammadabadi, Vijay Chintalapudi, Ozan Tonkal, John Zedlewski, Pekka Janis, Jan Nikolaus Fritsch, Gordon Grigor, Zuoguan Wang, I-Kuei Chen, Miguel Sainz
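The final step above — gather the mask pixels of one lane-marking class and fit a curve through them to form a lane boundary — can be sketched with a least-squares line standing in for whatever curve model the patent actually uses (an assumption; the abstract says only "curve fitting").

```python
def fit_lane_boundary(points):
    """Least-squares fit of x = a*y + b through (x, y) mask pixels,
    giving a lane boundary parameterized by image row y."""
    n = len(points)
    sy = sum(y for _, y in points)
    sx = sum(x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * syy - sy * sy)
    b = (sx - a * sy) / n
    return a, b

# Pixels the segmentation mask assigned to one lane-marking class.
mask_pixels = [(100, 0), (102, 10), (104, 20), (106, 30)]
a, b = fit_lane_boundary(mask_pixels)
# Fitted boundary: x = 0.2*y + 100
```

Real lane boundaries are usually fit with higher-order polynomials or splines per marking class; the flow from mask pixels to a parametric boundary sent downstream is what the sketch illustrates.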
  • Patent number: 10311872
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classification using neural networks. One method includes receiving audio data corresponding to an utterance, obtaining a transcription of the utterance, generating a representation of the audio data, generating a representation of the transcription of the utterance, and providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistant or is likely not directed to an automated assistant.
    Type: Grant
    Filed: July 25, 2017
    Date of Patent: June 4, 2019
    Assignee: Google LLC
    Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
  • Patent number: 10229700
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting voice activity. In one aspect, a method includes actions of receiving, by a neural network included in an automated voice activity detection system, a raw audio waveform, processing, by the neural network, the raw audio waveform to determine whether the audio waveform includes speech, and providing, by the neural network, a classification of the raw audio waveform indicating whether the raw audio waveform includes speech.
    Type: Grant
    Filed: January 4, 2016
    Date of Patent: March 12, 2019
    Assignee: Google LLC
    Inventors: Tara N. Sainath, Gabor Simko, Maria Carolina Parada San Martin, Ruben Zazo Candil
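The pipeline above feeds the raw audio waveform straight into a neural network that emits a speech/non-speech classification, with no hand-crafted spectral front end. A toy single-unit stand-in for that network (the weights and threshold are invented for the demo; the patent covers learned networks generally) looks like this:

```python
import math

def vad_classify(waveform, weights, bias, threshold=0.5):
    """Classify a raw waveform as speech (True) or non-speech (False)
    using one logistic unit over sample magnitudes."""
    activation = sum(w * abs(s) for w, s in zip(weights, waveform)) + bias
    prob = 1.0 / (1.0 + math.exp(-activation))
    return prob >= threshold

weights = [2.0] * 8            # learned in practice; fixed here for the demo
silence = [0.0] * 8
speech = [0.6, -0.5, 0.7, -0.4, 0.5, -0.6, 0.4, -0.5]
print(vad_classify(silence, weights, bias=-1.0))  # False
print(vad_classify(speech, weights, bias=-1.0))   # True
```

The disclosed system would use a deep network over raw samples; the interesting property is that feature extraction is learned end to end rather than engineered.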
  • Publication number: 20190035390
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classification using neural networks. One method includes receiving audio data corresponding to an utterance, obtaining a transcription of the utterance, generating a representation of the audio data, generating a representation of the transcription of the utterance, and providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistant or is likely not directed to an automated assistant.
    Type: Application
    Filed: July 25, 2017
    Publication date: January 31, 2019
    Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
  • Publication number: 20180350395
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end of query model. The actions further include determining a confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score that reflects the likelihood that the utterance is a complete utterance to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.
    Type: Application
    Filed: June 6, 2018
    Publication date: December 6, 2018
    Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
  • Publication number: 20180336906
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on the evaluation, and providing a representation of the hotword suitability score for display to the user.
    Type: Application
    Filed: May 16, 2018
    Publication date: November 22, 2018
    Inventors: Andrew Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
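The scoring flow above — evaluate a candidate hotword's transcription against predetermined criteria and combine the results into a suitability score for display — can be sketched as follows. The specific criteria and equal weighting below are invented examples; the patent does not enumerate them.

```python
def hotword_suitability(transcription):
    """Score a candidate hotword in [0, 1] from simple, predetermined
    criteria (all hypothetical): length, word count, and no digits."""
    words = transcription.split()
    long_enough = len(transcription) >= 6    # very short phrases misfire
    multi_word = len(words) >= 2             # multi-word is more distinctive
    no_digits = not any(c.isdigit() for c in transcription)
    criteria = [long_enough, multi_word, no_digits]
    return sum(criteria) / len(criteria)

print(hotword_suitability("ok computer"))  # 1.0
print(hotword_suitability("hi"))           # ~0.33
```

A production system would also score the speech data itself (e.g. pronunciation consistency across repetitions); the display of the resulting score back to the user is part of the claim.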
  • Patent number: 10002613
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on the evaluation, and providing a representation of the hotword suitability score for display to the user.
    Type: Grant
    Filed: January 20, 2016
    Date of Patent: June 19, 2018
    Assignee: Google LLC
    Inventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
  • Patent number: 9754584
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing keywords using a long short term memory neural network. One of the methods includes receiving, by a device for each of multiple variable length enrollment audio signals, a respective plurality of enrollment feature vectors that represent features of the respective variable length enrollment audio signal, processing each of the plurality of enrollment feature vectors using a long short term memory (LSTM) neural network to generate a respective enrollment LSTM output vector for each enrollment feature vector, and generating, for the respective variable length enrollment audio signal, a template fixed length representation for use in determining whether another audio signal encodes another spoken utterance of the enrollment phrase by combining at most a quantity k of the enrollment LSTM output vectors for the enrollment audio signal.
    Type: Grant
    Filed: November 8, 2016
    Date of Patent: September 5, 2017
    Assignee: Google Inc.
    Inventors: Maria Carolina Parada San Martin, Tara N. Sainath, Guoguo Chen
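The template step above — each variable-length enrollment utterance yields a sequence of LSTM output vectors, and at most k of them are combined into one fixed-length template — can be sketched with the LSTM omitted and its output vectors given directly. Averaging the last k vectors is an assumption; the abstract says only that at most k vectors are combined.

```python
def fixed_length_template(lstm_outputs, k):
    """Average the last k LSTM output vectors into one fixed-length
    template, regardless of utterance length."""
    tail = lstm_outputs[-k:]
    dim = len(tail[0])
    return [sum(vec[d] for vec in tail) / len(tail) for d in range(dim)]

# Hypothetical LSTM outputs for two enrollment utterances of
# different lengths; both reduce to a same-size template.
short_utt = [[0.0, 1.0], [0.2, 0.8]]
long_utt = [[0.9, 0.1], [0.5, 0.5], [0.4, 0.6], [0.2, 0.8]]
t1 = fixed_length_template(short_utt, k=2)  # [0.1, 0.9]
t2 = fixed_length_template(long_utt, k=2)   # ~[0.3, 0.7]
```

A later audio signal would be reduced the same way and compared against the template (e.g. by a distance measure) to decide whether it repeats the enrollment phrase.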
  • Patent number: 9715660
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes training a deep neural network with a first training set by adjusting values for each of a plurality of weights included in the neural network, and training the deep neural network to determine a probability that data received by the deep neural network has features similar to key features of one or more keywords or key phrases, the training comprising providing the deep neural network with a second training set and adjusting the values for a first subset of the plurality of weights, wherein the second training set includes data representing the key features of the one or more keywords or key phrases.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: July 25, 2017
    Assignee: Google Inc.
    Inventors: Maria Carolina Parada San Martin, Guoguo Chen, Georg Heigold
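The two-stage scheme above — train all weights on a first training set, then adjust only a designated subset on keyword data — can be sketched minimally. The "training step" below is a trivial placeholder update, not real backpropagation; only the freeze-then-fine-tune structure mirrors the claim.

```python
def train_step(weights, trainable_idx, gradients, lr=0.1):
    """Apply a gradient step only to weights whose indices are
    marked trainable; all other weights stay frozen."""
    return [w - lr * g if i in trainable_idx else w
            for i, (w, g) in enumerate(zip(weights, gradients))]

weights = [0.5, 0.5, 0.5, 0.5]

# Stage 1: first training set, every weight adjustable.
weights = train_step(weights, {0, 1, 2, 3}, [1.0, 1.0, 1.0, 1.0])

# Stage 2: keyword training set, only the first subset adjustable.
weights = train_step(weights, {0, 1}, [1.0, 1.0, 1.0, 1.0])
# Result: the first two weights moved twice, the last two only once.
```

Freezing most of the network during the second stage lets a small amount of keyword data specialize the model without erasing the general representation learned in stage one.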
  • Patent number: 9646634
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods for training a deep neural network that includes a low rank hidden input layer and an adjoining hidden layer, the low rank hidden input layer including a first matrix A and a second matrix B with dimensions i×m and m×o, respectively, to identify a keyword includes receiving a feature vector including i values that represent features of an audio signal encoding an utterance, determining, using the low rank hidden input layer, an output vector including o values using the feature vector, determining, using the adjoining hidden layer, another vector using the output vector, determining a confidence score that indicates whether the utterance includes the keyword using the other vector, and adjusting weights for the low rank hidden input layer using the confidence score.
    Type: Grant
    Filed: February 9, 2015
    Date of Patent: May 9, 2017
    Assignee: Google Inc.
    Inventors: Tara N. Sainath, Maria Carolina Parada San Martin
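The low-rank input layer above factors one large i x o weight matrix into A (i x m) and B (m x o) with small m, so a feature vector passes through both in turn. The dimensions below (i=4, m=2, o=3) and matrix values are tiny invented examples:

```python
def matvec(vec, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    cols = len(matrix[0])
    return [sum(vec[r] * matrix[r][c] for r in range(len(vec)))
            for c in range(cols)]

A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]  # i x m
B = [[1.0, 2.0, 0.0], [0.0, 1.0, 2.0]]                # m x o

features = [1.0, 2.0, 3.0, 4.0]          # i values from the audio signal
hidden = matvec(matvec(features, A), B)  # o values for the adjoining layer
```

The payoff is parameter count: a full layer needs i*o weights (12 here), while the factored pair needs i*m + m*o (14 here, but far fewer at realistic sizes such as i=400, m=32, o=400), which shrinks the keyword model enough for on-device use.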
  • Publication number: 20170092297
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting voice activity. In one aspect, a method includes actions of receiving, by a neural network included in an automated voice activity detection system, a raw audio waveform, processing, by the neural network, the raw audio waveform to determine whether the audio waveform includes speech, and providing, by the neural network, a classification of the raw audio waveform indicating whether the raw audio waveform includes speech.
    Type: Application
    Filed: January 4, 2016
    Publication date: March 30, 2017
    Inventors: Tara N. Sainath, Gabor Simko, Maria Carolina Parada San Martin
  • Publication number: 20170076717
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing keywords using a long short term memory neural network. One of the methods includes receiving, by a device for each of multiple variable length enrollment audio signals, a respective plurality of enrollment feature vectors that represent features of the respective variable length enrollment audio signal, processing each of the plurality of enrollment feature vectors using a long short term memory (LSTM) neural network to generate a respective enrollment LSTM output vector for each enrollment feature vector, and generating, for the respective variable length enrollment audio signal, a template fixed length representation for use in determining whether another audio signal encodes another spoken utterance of the enrollment phrase by combining at most a quantity k of the enrollment LSTM output vectors for the enrollment audio signal.
    Type: Application
    Filed: November 8, 2016
    Publication date: March 16, 2017
    Inventors: Maria Carolina Parada San Martin, Tara N. Sainath, Guoguo Chen
  • Patent number: 9536528
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on the evaluation, and providing a representation of the hotword suitability score for display to the user.
    Type: Grant
    Filed: August 6, 2012
    Date of Patent: January 3, 2017
    Assignee: Google Inc.
    Inventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
  • Patent number: 9508340
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing keywords using a long short term memory neural network. One of the methods includes receiving, by a device for each of multiple variable length enrollment audio signals, a respective plurality of enrollment feature vectors that represent features of the respective variable length enrollment audio signal, processing each of the plurality of enrollment feature vectors using a long short term memory (LSTM) neural network to generate a respective enrollment LSTM output vector for each enrollment feature vector, and generating, for the respective variable length enrollment audio signal, a template fixed length representation for use in determining whether another audio signal encodes another spoken utterance of the enrollment phrase by combining at most a quantity k of the enrollment LSTM output vectors for the enrollment audio signal.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: November 29, 2016
    Assignee: Google Inc.
    Inventors: Maria Carolina Parada San Martin, Tara N. Sainath, Guoguo Chen