Patents by Inventor Carolina Parada
Carolina Parada has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240096326Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the of the plurality of words of the spoken utterance. The neural network-based utterance classifier trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further including determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.Type: ApplicationFiled: December 1, 2023Publication date: March 21, 2024Applicant: Google LLCInventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
-
Patent number: 11848018Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the of the plurality of words of the spoken utterance. The neural network-based utterance classifier trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further including determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.Type: GrantFiled: May 31, 2022Date of Patent: December 19, 2023Assignee: Google LLCInventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
-
Publication number: 20230311335Abstract: Implementations process, using a large language model, a free-form natural language (NL) instruction to generate to generate LLM output. Those implementations generate, based on the LLM output and a NL skill description of a robotic skill, a task-grounding measure that reflects a probability of the skill description in the probability distribution of the LLM output. Those implementations further generate, based on the robotic skill and current environmental state data, a world-grounding measure that reflects a probability of the robotic skill being successful based on the current environmental state data. Those implementations further determine, based on both the task-grounding measure and the world-grounding measure, whether to implement the robotic skill.Type: ApplicationFiled: March 30, 2023Publication date: October 5, 2023Inventors: Karol Hausman, Brian Ichter, Sergey Levine, Alexander Toshev, Fei Xia, Carolina Parada
-
Patent number: 11741970Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, and providing a representation of the hotword suitability score for display to the user.Type: GrantFiled: January 6, 2022Date of Patent: August 29, 2023Assignee: Google LLCInventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
-
Publication number: 20230267701Abstract: In various examples, sensor data representative of an image of a field of view of a vehicle sensor may be received and the sensor data may be applied to a machine learning model. The machine learning model may compute a segmentation mask representative of portions of the image corresponding to lane markings of the driving surface of the vehicle. Analysis of the segmentation mask may be performed to determine lane marking types, and lane boundaries may be generated by performing curve fitting on the lane markings corresponding to each of the lane marking types. The data representative of the lane boundaries may then be sent to a component of the vehicle for use in navigating the vehicle through the driving surface.Type: ApplicationFiled: May 1, 2023Publication date: August 24, 2023Inventors: Yifang Xu, Xin Liu, Chia-Chin Chen, Carolina Parada, Davide Onofrio, Minwoo Park, Mehdi Sajjadi Mohammadabadi, Vijay Chintalapudi, Ozan Tonkal, John Zedlewski, Pekka Janis, Jan Nikolaus Fritsch, Gordon Grigor, Zuoguan Wang, I-Kuei Chen, Miguel Sainz
-
Publication number: 20230205219Abstract: In various examples, a deep learning solution for path detection is implemented to generate a more abstract definition of a drivable path—without reliance on explicit lane-markings—by using a detection-based approach. Using approaches of the present disclosure, the identification of drivable paths may be possible in environments where conventional approaches are unreliable, or fail—such as where lane markings do not exist or are occluded. The deep learning solution may generate outputs that represent geometries for one or more drivable paths in an environment and confidence values corresponding to path types or classes that the geometries correspond. These outputs may be directly useable by an autonomous vehicle—such as an autonomous driving software stack—with minimal post-processing.Type: ApplicationFiled: March 10, 2023Publication date: June 29, 2023Inventors: Regan Blythe Towal, Maroof Mohammed Farooq, Vijay Chintalapudi, Carolina Parada, David Nister
-
Patent number: 11676625Abstract: A method for training an endpointer model includes short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes, generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.Type: GrantFiled: January 20, 2021Date of Patent: June 13, 2023Assignee: Google LLCInventors: Shuo-Yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
-
Patent number: 11675359Abstract: In various examples, a deep learning solution for path detection is implemented to generate a more abstract definition of a drivable path without reliance on explicit lane-markings—by using a detection-based approach. Using approaches of the present disclosure, the identification of drivable paths may be possible in environments where conventional approaches are unreliable, or fail—such as where lane markings do not exist or are occluded. The deep learning solution may generate outputs that represent geometries for one or more drivable paths in an environment and confidence values corresponding to path types or classes that the geometries correspond. These outputs may be directly useable by an autonomous vehicle—such as an autonomous driving software stack—with minimal post-processing.Type: GrantFiled: June 6, 2019Date of Patent: June 13, 2023Assignee: NVIDIA CorporationInventors: Regan Blythe Towal, Maroof Mohammed Farooq, Vijay Chintalapudi, Carolina Parada, David Nister
-
Patent number: 11676364Abstract: In various examples, sensor data representative of an image of a field of view of a vehicle sensor may be received and the sensor data may be applied to a machine learning model. The machine learning model may compute a segmentation mask representative of portions of the image corresponding to lane markings of the driving surface of the vehicle. Analysis of the segmentation mask may be performed to determine lane marking types, and lane boundaries may be generated by performing curve fitting on the lane markings corresponding to each of the lane marking types. The data representative of the lane boundaries may then be sent to a component of the vehicle for use in navigating the vehicle through the driving surface.Type: GrantFiled: April 5, 2021Date of Patent: June 13, 2023Assignee: NVIDIA CorporationInventors: Yifang Xu, Xin Liu, Chia-Chih Chen, Carolina Parada, Davide Onofrio, Minwoo Park, Mehdi Sajjadi Mohammadabadi, Vijay Chintalapudi, Ozan Tonkal, John Zedlewski, Pekka Janis, Jan Nikolaus Fritsch, Gordon Grigor, Zuoguan Wang, I-Kuei Chen, Miguel Sainz
-
Patent number: 11551709Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end of query model. The actions further include determining the confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score that reflects the likelihood that the utterance is a complete utterance to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.Type: GrantFiled: January 31, 2020Date of Patent: January 10, 2023Assignee: Google LLCInventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
-
Patent number: 11545147Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for classification using neural networks. One method includes receiving audio data corresponding to an utterance. Obtaining a transcription of the utterance. Generating a representation of the audio data. Generating a representation of the transcription of the utterance. Providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistance or is likely not directed to an automated assistant.Type: GrantFiled: May 2, 2019Date of Patent: January 3, 2023Assignee: Google LLCInventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
-
Publication number: 20220293101Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the of the plurality of words of the spoken utterance. The neural network-based utterance classifier trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further including determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.Type: ApplicationFiled: May 31, 2022Publication date: September 15, 2022Applicant: Google LLCInventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
-
Patent number: 11361768Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the of the plurality of words of the spoken utterance. The neural network-based utterance classifier trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further including determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.Type: GrantFiled: July 21, 2020Date of Patent: June 14, 2022Assignee: Google LLCInventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
-
Publication number: 20220130399Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, and providing a representation of the hotword suitability score for display to the user.Type: ApplicationFiled: January 6, 2022Publication date: April 28, 2022Applicant: Google LLCInventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
-
Publication number: 20210224556Abstract: In various examples, sensor data representative of an image of a field of view of a vehicle sensor may be received and the sensor data may be applied to a machine learning model. The machine learning model may compute a segmentation mask representative of portions of the image corresponding to lane markings of the driving surface of the vehicle. Analysis of the segmentation mask may be performed to determine lane marking types, and lane boundaries may be generated by performing curve fitting on the lane markings corresponding to each of the lane marking types. The data representative of the lane boundaries may then be sent to a component of the vehicle for use in navigating the vehicle through the driving surface.Type: ApplicationFiled: April 5, 2021Publication date: July 22, 2021Inventors: Yifang Xu, Xin Liu, Chia-Chih Chen, Carolina Parada, Davide Onofrio, Minwoo Park, Mehdi Sajjadi Mohammadabadi, Vijay Chintalapudi, Ozan Tonkal, John Zedlewski, Pekka Janis, Jan Nikolaus Fritsch, Gordon Grigor, Zuoguan Wang, I-Kuei Chen, Miguel Sainz
-
Patent number: 10997433Abstract: In various examples, sensor data representative of an image of a field of view of a vehicle sensor may be received and the sensor data may be applied to a machine learning model. The machine learning model may compute a segmentation mask representative of portions of the image corresponding to lane markings of the driving surface of the vehicle. Analysis of the segmentation mask may be performed to determine lane marking types, and lane boundaries may be generated by performing curve fitting on the lane markings corresponding to each of the lane marking types. The data representative of the lane boundaries may then be sent to a component of the vehicle for use in navigating the vehicle through the driving surface.Type: GrantFiled: February 26, 2019Date of Patent: May 4, 2021Assignee: NVIDIA CorporationInventors: Yifang Xu, Xin Liu, Chia-Chih Chen, Carolina Parada, Davide Onofrio, Minwoo Park, Mehdi Sajjadi Mohammadabadi, Vijay Chintalapudi, Ozan Tonkal, John Zedlewski, Pekka Janis, Jan Nikolaus Fritsch, Gordon Grigor, Zuoguan Wang, I-Kuei Chen, Miguel Sainz
-
Patent number: 10929754Abstract: A method for training an endpointer model includes short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes, generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.Type: GrantFiled: December 11, 2019Date of Patent: February 23, 2021Assignee: Google LLCInventors: Shuo-yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
-
Publication number: 20200349946Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the of the plurality of words of the spoken utterance. The neural network-based utterance classifier trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further including determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.Type: ApplicationFiled: July 21, 2020Publication date: November 5, 2020Applicant: Google LLCInventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
-
Patent number: 10762894Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for keyword spotting. One of the methods includes training, by a keyword detection system, a convolutional neural network for keyword detection by providing a two-dimensional set of input values to the convolutional neural network, the input values including a first dimension in time and a second dimension in frequency, and performing convolutional multiplication on the two-dimensional set of input values for a filter using a frequency stride greater than one to generate a feature map.Type: GrantFiled: July 22, 2015Date of Patent: September 1, 2020Assignee: GOOGLE LLCInventors: Tara N. Sainath, Maria Carolina Parada San Martin
-
Patent number: 10714096Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on evaluating the speech data or a transcription of the candidate hotword, using one or more predetermined criteria, and providing a representation of the hotword suitability score for display to the user.Type: GrantFiled: May 16, 2018Date of Patent: July 14, 2020Assignee: Google LLCInventors: Andrew Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin