Patents by Inventor Antonio Torralba
Antonio Torralba has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240256831
Abstract: In various examples, systems and methods are disclosed relating to generating a response from image and/or video input for image/video-based artificial intelligence (AI) systems and applications. Systems and methods are disclosed for a first model (e.g., a teacher model) distilling its knowledge to a second model (a student model). The second model receives a downstream image in a downstream task and generates at least one feature. The first model generates first features corresponding to an image, which can be a real image or a synthetic image. The second model generates second features using the image as an input to the second model. A loss with respect to the first features is determined, and the second model is updated using the loss.
Type: Application
Filed: January 26, 2023
Publication date: August 1, 2024
Applicant: NVIDIA Corporation
Inventors: Daiqing Li, Huan Ling, Seung Wook Kim, Karsten Julian Kreis, Antonio Torralba Barriuso, Sanja Fidler, Amlan Kar
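As a rough illustration of the feature-distillation pattern the abstract describes, here is a minimal PyTorch sketch: a frozen teacher produces first features for an image, a smaller student produces second features for the same image, and the student is updated with a loss against the teacher's features. The toy architectures, shapes, and plain MSE feature loss are assumptions for illustration, not the patented method.

```python
# Minimal sketch of feature-level knowledge distillation, assuming a frozen
# "teacher" backbone and a smaller trainable "student" (both stand-ins).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(64, 128, 3, padding=1))  # stand-in teacher
student = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 128, 3, padding=1))  # smaller student
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)  # teacher is frozen; only the student learns

opt = torch.optim.Adam(student.parameters(), lr=1e-4)
images = torch.randn(8, 3, 64, 64)   # real or synthetic training images

with torch.no_grad():
    t_feats = teacher(images)        # "first features" from the first model
s_feats = student(images)            # "second features" from the second model
loss = F.mse_loss(s_feats, t_feats)  # loss with respect to the first features
loss.backward()
opt.step()                           # update the second model using the loss
```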
-
Publication number: 20240096064
Abstract: Apparatuses, systems, and techniques to annotate images using neural models. In at least one embodiment, neural networks generate mask information from labels of one or more objects within one or more images identified by one or more other neural networks.
Type: Application
Filed: June 3, 2022
Publication date: March 21, 2024
Inventors: Daiqing Li, Huan Ling, Seung Wook Kim, Karsten Julian Kreis, Sanja Fidler, Antonio Torralba Barriuso
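A minimal sketch of the label-to-mask idea: one network has already identified an object and produced a class label, and a second network turns the object crop plus that label into per-pixel mask information. The architecture, shapes, and class count below are illustrative assumptions only.

```python
# Label-conditioned mask generation sketch: crop + class label -> mask logits.
import torch
import torch.nn as nn

class MaskHead(nn.Module):
    def __init__(self, num_classes=10, emb=16):
        super().__init__()
        self.label_emb = nn.Embedding(num_classes, emb)
        self.net = nn.Sequential(
            nn.Conv2d(3 + emb, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))  # 1-channel mask logits

    def forward(self, crop, label):
        b, _, h, w = crop.shape
        e = self.label_emb(label)[:, :, None, None].expand(b, -1, h, w)
        return self.net(torch.cat([crop, e], dim=1))

head = MaskHead()
crop = torch.randn(2, 3, 56, 56)         # object crops from another network
label = torch.tensor([3, 7])             # class labels from that network
mask = torch.sigmoid(head(crop, label))  # per-pixel mask probabilities
```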
-
Publication number: 20230377324
Abstract: In various examples, systems and methods are disclosed relating to multi-domain generative adversarial networks with learned warp fields. Input data can be generated according to a noise function and provided as input to a generative machine-learning model. The generative machine-learning model can determine a plurality of output images, each corresponding to one of a respective plurality of image domains. The generative machine-learning model can include at least one layer to generate a plurality of morph maps, each corresponding to one of the respective plurality of image domains. The output images can be presented using a display device.
Type: Application
Filed: May 18, 2023
Publication date: November 23, 2023
Applicant: NVIDIA Corporation
Inventors: Seung Wook KIM, Karsten Julian KREIS, Daiqing LI, Sanja FIDLER, Antonio TORRALBA BARRIUSO
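To make the "morph map per domain" idea concrete, the sketch below warps one shared base image into several domain-specific outputs using a learned offset field per domain, via `torch.nn.functional.grid_sample`. The offset parameterization and sizes are assumptions; the patent's actual layers may differ.

```python
# Per-domain learned warp ("morph map") sketch applied to a shared base image.
import torch
import torch.nn.functional as F

n_domains, h, w = 3, 64, 64
base = torch.randn(1, 3, h, w)                  # shared source image/features
offsets = torch.zeros(n_domains, h, w, 2,
                      requires_grad=True)       # one learned warp per domain

# Identity sampling grid in [-1, 1], one copy per domain.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                        torch.linspace(-1, 1, w), indexing="ij")
identity = torch.stack([xs, ys], dim=-1).expand(n_domains, h, w, 2)

grid = identity + offsets                       # warped sampling locations
outputs = F.grid_sample(base.expand(n_domains, -1, -1, -1), grid,
                        align_corners=True)     # one warped image per domain
```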
-
Publication number: 20230229919
Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar—such as a probabilistic grammar—and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.
Type: Application
Filed: March 20, 2023
Publication date: July 20, 2023
Inventors: Amlan Kar, Aayush Prakash, Ming-Yu Liu, David Jesus Acuna Marrero, Antonio Torralba Barriuso, Sanja Fidler
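To illustrate what "sampling a scene graph from a probabilistic grammar" can look like, here is a toy grammar: a rule expands a road into a random number of cars, and each car's attributes are drawn from simple distributions. All rules, attribute names, and ranges are illustrative assumptions; the resulting graph is the kind of object a generative model would then refine toward real-world attribute distributions.

```python
# Toy probabilistic scene grammar producing a scene graph.
import random

def sample_scene_graph():
    scene = {"type": "road", "children": []}
    for _ in range(random.randint(1, 4)):         # grammar rule: road -> cars
        car = {
            "type": "car",
            "position": random.uniform(0.0, 100.0),  # attribute distributions
            "lane": random.choice([0, 1]),
            "color": random.choice(["red", "blue", "gray"]),
        }
        scene["children"].append(car)
    return scene

graph = sample_scene_graph()
print(graph)  # input the generative model would update toward real data
```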
-
Publication number: 20230134690
Abstract: Approaches are presented for training an inverse graphics network. An image synthesis network can generate training data for an inverse graphics network. In turn, the inverse graphics network can teach the synthesis network about the physical three-dimensional (3D) controls. Such an approach can provide for accurate 3D reconstruction of objects from 2D images using the trained inverse graphics network, while requiring little annotation of the provided training data. Such an approach can extract and disentangle 3D knowledge learned by generative models by utilizing differentiable renderers, enabling a disentangled generative model to function as a controllable 3D “neural renderer,” complementing traditional graphics renderers.
Type: Application
Filed: November 7, 2022
Publication date: May 4, 2023
Inventors: Wenzheng Chen, Yuxuan Zhang, Sanja Fidler, Huan Ling, Jun Gao, Antonio Torralba Barriuso
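The shape of the training loop is the key idea here: synthesized images go through the inverse graphics network to predict 3D parameters, a differentiable renderer re-renders those parameters, and a reconstruction loss closes the cycle. The sketch below uses a trivially differentiable stand-in "renderer" and tiny networks; both are assumptions for illustration, not a real renderer such as the paper family's differentiable rasterizer.

```python
# Inverse-graphics training cycle sketch: image -> 3D params -> re-rendered image.
import torch
import torch.nn as nn
import torch.nn.functional as F

inverse_net = nn.Sequential(nn.Flatten(),
                            nn.Linear(3 * 32 * 32, 9))  # pose/shape/light codes

proj = torch.randn(9, 3 * 32 * 32)  # fixed stand-in "rendering" projection

def toy_renderer(params):
    # Stand-in: any differentiable map from 3D parameters back to an image.
    return torch.tanh(params @ proj).view(-1, 3, 32, 32)

opt = torch.optim.Adam(inverse_net.parameters(), lr=1e-4)
synthetic = torch.randn(4, 3, 32, 32)     # images from the synthesis network
params = inverse_net(synthetic)           # predicted 3D controls
rerendered = toy_renderer(params)
loss = F.mse_loss(rerendered, synthetic)  # cycle-consistency reconstruction loss
loss.backward()
opt.step()
```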
-
Patent number: 11610115
Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar—such as a probabilistic grammar—and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.
Type: Grant
Filed: November 15, 2019
Date of Patent: March 21, 2023
Assignee: NVIDIA Corporation
Inventors: Amlan Kar, Aayush Prakash, Ming-Yu Liu, David Jesus Acuna Marrero, Antonio Torralba Barriuso, Sanja Fidler
-
Publication number: 20220383570
Abstract: In various examples, high-precision semantic image editing for machine learning systems and applications is described. For example, a generative adversarial network (GAN) may be used to jointly model images and their semantic segmentations based on the same underlying latent code. Image editing may be achieved by using segmentation mask modifications (e.g., provided by a user, or otherwise) to optimize the latent code to be consistent with the updated segmentation, thus effectively changing the original (e.g., RGB) image. To improve the efficiency of the system, and to avoid a separate optimization for each edit on each image, editing vectors may be learned in latent space that realize the edits and that can be directly applied to other images, with or without additional optimization. As a result, a GAN combined with the optimization approaches described herein may simultaneously allow for high-precision editing in real time with straightforward compositionality of multiple edits.
Type: Application
Filed: May 27, 2022
Publication date: December 1, 2022
Inventors: Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba Barriuso, Sanja Fidler
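The core mechanism is latent optimization against an edited mask. Below is a minimal sketch assuming a toy generator that emits an image and a segmentation from one latent code; the latent is optimized until the predicted segmentation matches the user-edited mask, which changes the image accordingly. The generator, sizes, and loss are illustrative assumptions.

```python
# Mask-driven latent optimization sketch for semantic image editing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointGenerator(nn.Module):
    def __init__(self, latent_dim=64):
        super().__init__()
        self.trunk = nn.Linear(latent_dim, 4 * 16 * 16)  # shared latent trunk

    def forward(self, z):
        x = self.trunk(z).view(-1, 4, 16, 16)
        return torch.tanh(x[:, :3]), x[:, 3:]            # image, mask logits

gen = JointGenerator()
z = torch.randn(1, 64, requires_grad=True)               # latent being edited
z0 = z.detach().clone()                                  # original latent
edited_mask = (torch.rand(1, 1, 16, 16) > 0.5).float()   # user-modified mask

opt = torch.optim.Adam([z], lr=0.05)
for _ in range(100):
    opt.zero_grad()
    image, mask_logits = gen(z)                          # image changes too
    loss = F.binary_cross_entropy_with_logits(mask_logits, edited_mask)
    loss.backward()
    opt.step()

edit_vector = z - z0  # reusable "editing vector" applicable to other latents
```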
-
Patent number: 11494976
Abstract: Approaches are presented for training an inverse graphics network. An image synthesis network can generate training data for an inverse graphics network. In turn, the inverse graphics network can teach the synthesis network about the physical three-dimensional (3D) controls. Such an approach can provide for accurate 3D reconstruction of objects from 2D images using the trained inverse graphics network, while requiring little annotation of the provided training data. Such an approach can extract and disentangle 3D knowledge learned by generative models by utilizing differentiable renderers, enabling a disentangled generative model to function as a controllable 3D “neural renderer,” complementing traditional graphics renderers.
Type: Grant
Filed: March 5, 2021
Date of Patent: November 8, 2022
Assignee: Nvidia Corporation
Inventors: Wenzheng Chen, Yuxuan Zhang, Sanja Fidler, Huan Ling, Jun Gao, Antonio Torralba Barriuso
-
Patent number: 11436839
Abstract: The present disclosure provides systems and methods to detect occluded objects using shadow information to anticipate moving obstacles that are occluded behind a corner or other obstacle. The system may perform a dynamic threshold analysis on enhanced images, allowing the detection of even weakly visible shadows. The system may classify an image sequence as either “dynamic” or “static,” enabling an autonomous vehicle, or other moving platform, to react and respond to a moving yet occluded object by slowing down or stopping.
Type: Grant
Filed: November 2, 2018
Date of Patent: September 6, 2022
Assignees: TOYOTA RESEARCH INSTITUTE, INC., MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Inventors: Felix Maximilian Naser, Igor Gilitschenski, Guy Rosman, Alexander Andre Amini, Fredo Durand, Antonio Torralba, Gregory Wornell, William Freeman, Sertac Karaman, Daniela Rus
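A minimal sketch of the dynamic/static classification idea: subtract the per-pixel temporal mean to expose faint changes (such as a moving shadow), estimate the sequence's own noise level, and flag the sequence as dynamic when any frame's change energy exceeds a threshold scaled by that noise. The robust-median threshold and all parameters are illustrative assumptions, not the patented analysis.

```python
# "Dynamic" vs. "static" sequence classification sketch using a noise-scaled
# (dynamic) threshold on temporal residuals.
import numpy as np

def classify_sequence(frames, k=4.0):
    frames = frames.astype(np.float64)
    mean = frames.mean(axis=0)              # per-pixel temporal mean
    residual = np.abs(frames - mean)        # deviation from the static scene
    noise = np.median(residual)             # robust per-sequence noise estimate
    signal = residual.mean(axis=(1, 2))     # per-frame change energy
    return "dynamic" if signal.max() > k * noise else "static"

static_seq = np.random.rand(10, 48, 64) * 0.01   # near-constant frames
moving_seq = static_seq.copy()
moving_seq[5:, :, 50:] += 0.2                    # a faint "shadow" appears
print(classify_sequence(static_seq), classify_sequence(moving_seq))
```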
-
Patent number: 11430084
Abstract: A method includes receiving, with a computing device, an image, identifying one or more salient features in the image, and generating a saliency map of the image including the one or more salient features. The method further includes sampling the image based on the saliency map such that the one or more salient features are sampled at a first density of sampling and at least one portion of the image other than the one or more salient features is sampled at a second density of sampling, where the first density of sampling is greater than the second density of sampling, and storing the sampled image in a non-transitory computer readable memory.
Type: Grant
Filed: September 5, 2018
Date of Patent: August 30, 2022
Assignees: TOYOTA RESEARCH INSTITUTE, INC., MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Inventors: Simon A. I. Stent, Adrià Recasens, Antonio Torralba, Petr Kellnhofer, Wojciech Matusik
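The two-density scheme is easy to show directly: keep salient pixels with a high probability (the first density) and all other pixels with a low probability (the second density). The toy saliency map and the specific density values below are illustrative assumptions.

```python
# Saliency-weighted image sampling sketch: two sampling densities.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
saliency = np.zeros((64, 64))
saliency[20:40, 20:40] = 1.0               # one salient region, for example

dense, sparse = 0.9, 0.1                   # first vs. second sampling density
keep_prob = np.where(saliency > 0.5, dense, sparse)
keep = rng.random((64, 64)) < keep_prob    # per-pixel Bernoulli sampling

sampled = np.zeros_like(image)
sampled[keep] = image[keep]                # store only the sampled pixels
print(keep.mean())                         # overall fraction of pixels kept
```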
-
Publication number: 20220269937
Abstract: Apparatuses, systems, and techniques to use one or more neural networks to generate one or more images based, at least in part, on one or more spatially-independent features within the one or more images. In at least one embodiment, the one or more neural networks determine spatially-independent information and spatially-dependent information of the one or more images and process the spatially-independent information and the spatially-dependent information to generate the one or more spatially-independent features and one or more spatially-dependent features within the one or more images.
Type: Application
Filed: February 24, 2021
Publication date: August 25, 2022
Inventors: Seung Wook Kim, Jonah Philion, Sanja Fidler, Antonio Torralba Barriuso
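One common way to realize this split, sketched below under illustrative assumptions: a spatially-independent code is a single vector broadcast to every location (global information such as overall style), while a spatially-dependent code is a per-location feature map; a decoder consumes both. The tiny decoder and all dimensions are assumptions.

```python
# Generator sketch with separate spatially-independent and -dependent inputs.
import torch
import torch.nn as nn

class SplitGenerator(nn.Module):
    def __init__(self, g_dim=32, s_dim=8):
        super().__init__()
        self.decode = nn.Conv2d(g_dim + s_dim, 3, 3, padding=1)

    def forward(self, global_code, spatial_code):
        b, _, h, w = spatial_code.shape
        g = global_code[:, :, None, None].expand(b, -1, h, w)  # broadcast
        return torch.tanh(self.decode(torch.cat([g, spatial_code], dim=1)))

gen = SplitGenerator()
global_code = torch.randn(2, 32)           # spatially-independent information
spatial_code = torch.randn(2, 8, 16, 16)   # spatially-dependent information
images = gen(global_code, spatial_code)    # (2, 3, 16, 16)
```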
-
Publication number: 20220083807
Abstract: Apparatuses, systems, and techniques to determine pixel-level labels of a synthetic image. In at least one embodiment, the synthetic image is generated by one or more generative networks, and the pixel-level labels are generated using a combination of data output by a plurality of layers of the generative networks.
Type: Application
Filed: September 14, 2020
Publication date: March 17, 2022
Inventors: Yuxuan Zhang, Huan Ling, Jun Gao, Wenzheng Chen, Antonio Torralba Barriuso, Sanja Fidler
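To illustrate "combining data output by a plurality of layers": upsample feature maps tapped from several generator layers to image resolution, concatenate them per pixel, and classify each pixel with a small head. The layer shapes, class count, and 1x1-conv classifier are illustrative assumptions.

```python
# Pixel-level labeling sketch from multiple generator-layer outputs.
import torch
import torch.nn as nn
import torch.nn.functional as F

layer_feats = [torch.randn(1, 64, 8, 8),    # features tapped from a generator's
               torch.randn(1, 32, 16, 16),  # intermediate layers while it
               torch.randn(1, 16, 32, 32)]  # synthesizes one image

upsampled = [F.interpolate(f, size=(32, 32), mode="bilinear",
                           align_corners=False) for f in layer_feats]
per_pixel = torch.cat(upsampled, dim=1)     # (1, 112, 32, 32) combined features

classifier = nn.Conv2d(112, 5, 1)           # e.g., 5 semantic classes
labels = classifier(per_pixel).argmax(dim=1)  # pixel-level label map (1, 32, 32)
```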
-
Patent number: 11221671
Abstract: A system includes a camera positioned in an environment to capture image data of a subject; a computing device communicatively coupled to the camera, the computing device comprising a processor and a non-transitory computer-readable memory; and a machine-readable instruction set stored in the non-transitory computer-readable memory. The machine-readable instruction set causes the computing device to perform at least the following when executed by the processor: receive the image data from the camera; analyze the image data captured by the camera using a neural network trained on training data generated from a 360-degree panoramic camera configured to collect image data of a subject and a visual target that is moved about an environment; and predict a gaze direction vector of the subject with the neural network.
Type: Grant
Filed: January 16, 2020
Date of Patent: January 11, 2022
Assignees: TOYOTA RESEARCH INSTITUTE, INC., MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Inventors: Simon A. I. Stent, Adrià Recasens, Petr Kellnhofer, Wojciech Matusik, Antonio Torralba
-
Publication number: 20210390778
Abstract: Apparatuses, systems, and techniques are presented to generate a simulated environment. In at least one embodiment, one or more neural networks are used to generate a simulated environment based, at least in part, on stored information associated with objects within the simulated environment.
Type: Application
Filed: June 10, 2020
Publication date: December 16, 2021
Inventors: Seung Wook Kim, Sanja Fidler, Jonah Philion, Antonio Torralba Barriuso
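One way to read "stored information associated with objects" is as a recurrent memory in a neural simulator: each step consumes a control input and updates a state from which the next observation is decoded. The GRU-based dynamics, sizes, and linear decoder below are illustrative assumptions, not the patented design.

```python
# Neural-simulator step sketch: recurrent state as stored environment memory.
import torch
import torch.nn as nn

class NeuralSimulator(nn.Module):
    def __init__(self, action_dim=4, state_dim=128, obs_dim=3 * 16 * 16):
        super().__init__()
        self.dynamics = nn.GRUCell(action_dim, state_dim)  # object memory
        self.render = nn.Linear(state_dim, obs_dim)        # state -> frame

    def step(self, state, action):
        state = self.dynamics(action, state)
        frame = torch.tanh(self.render(state)).view(-1, 3, 16, 16)
        return state, frame

sim = NeuralSimulator()
state = torch.zeros(1, 128)                 # stored environment information
for t in range(5):
    action = torch.randn(1, 4)              # e.g., a user's control input
    state, frame = sim.step(state, action)  # simulate the next observation
```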
-
Publication number: 20210315485
Abstract: Systems and methods are provided for estimating 3D poses of a subject based on tactile interactions with the ground. Test subject interactions with the ground are recorded using a sensor system along with reference information (e.g., synchronized video information) for use in correlating tactile information with specific 3D poses, e.g., by training a neural network based on the reference information. Then, tactile information received in response to a given subject interacting with the ground can be used to estimate the 3D pose of the given subject directly, i.e., without reference to corresponding reference information. Certain exemplary embodiments use a sensor system in the form of a pressure-sensing carpet or mat, although other types of sensor systems using pressure or other sensors can be used in various alternative embodiments.
Type: Application
Filed: April 9, 2021
Publication date: October 14, 2021
Inventors: Wojciech Matusik, Antonio Torralba, Michael J. Foshey, Wan Shou, Yiyue Luo, Pratyusha Sharma, Yunzhu Li
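The training setup reduces to ordinary supervised regression: pressure frames from the mat are the inputs, and 3D poses extracted from synchronized video are the targets; at test time only the pressure data is needed. The grid size, joint count, and plain MLP below are illustrative assumptions.

```python
# Tactile-to-pose regression sketch: pressure frame -> 3D joint positions.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_joints = 17
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 256), nn.ReLU(),
                      nn.Linear(256, n_joints * 3))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

pressure = torch.rand(16, 1, 32, 32)        # frames from the pressure mat
video_pose = torch.randn(16, n_joints * 3)  # 3D poses from synchronized video

pred = model(pressure)                      # estimate pose from touch alone
loss = F.mse_loss(pred, video_pose)         # video supplies labels only
loss.backward()
opt.step()
# At test time, only pressure data is needed: pose = model(pressure_frame)
```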
-
Publication number: 20210279952
Abstract: Approaches are presented for training an inverse graphics network. An image synthesis network can generate training data for an inverse graphics network. In turn, the inverse graphics network can teach the synthesis network about the physical three-dimensional (3D) controls. Such an approach can provide for accurate 3D reconstruction of objects from 2D images using the trained inverse graphics network, while requiring little annotation of the provided training data. Such an approach can extract and disentangle 3D knowledge learned by generative models by utilizing differentiable renderers, enabling a disentangled generative model to function as a controllable 3D “neural renderer,” complementing traditional graphics renderers.
Type: Application
Filed: March 5, 2021
Publication date: September 9, 2021
Inventors: Wenzheng Chen, Yuxuan Zhang, Sanja Fidler, Huan Ling, Jun Gao, Antonio Torralba Barriuso
-
Patent number: 11042994
Abstract: A system for determining the gaze direction of a subject includes a camera, a computing device, and a machine-readable instruction set. The camera is positioned in an environment to capture image data of the head of a subject. The computing device is communicatively coupled to the camera, and the computing device includes a processor and a non-transitory computer-readable memory. The machine-readable instruction set is stored in the non-transitory computer-readable memory and causes the computing device to: receive image data from the camera, analyze the image data using a convolutional neural network trained on an image dataset comprising images of a head of a subject captured from viewpoints distributed around up to 360 degrees of head yaw, and predict a gaze direction vector of the subject based upon a combination of head appearance and eye appearance image data from the image dataset.
Type: Grant
Filed: October 12, 2018
Date of Patent: June 22, 2021
Assignee: TOYOTA RESEARCH INSTITUTE, INC.
Inventors: Simon Stent, Adria Recasens, Antonio Torralba, Petr Kellnhofer, Wojciech Matusik
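The "combination of head appearance and eye appearance" suggests a two-stream network: one branch encodes the head image, another the eye crop, and a fused head regresses a unit-length 3D gaze direction vector. The branch architectures, crop sizes, and fusion below are illustrative assumptions, not the patented network.

```python
# Two-stream gaze regression sketch: head + eye appearance -> gaze vector.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GazeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.head_branch = nn.Sequential(nn.Flatten(),
                                         nn.Linear(3 * 32 * 32, 64))
        self.eye_branch = nn.Sequential(nn.Flatten(),
                                        nn.Linear(3 * 16 * 16, 64))
        self.out = nn.Linear(128, 3)          # 3D gaze direction vector

    def forward(self, head_img, eye_img):
        h = torch.relu(self.head_branch(head_img))
        e = torch.relu(self.eye_branch(eye_img))
        gaze = self.out(torch.cat([h, e], dim=1))
        return F.normalize(gaze, dim=1)       # unit-length direction

net = GazeNet()
gaze = net(torch.randn(2, 3, 32, 32),         # head crops
           torch.randn(2, 3, 16, 16))         # eye crops
```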
-
Publication number: 20200302250
Abstract: A generative model can be used for generation of spatial layouts and graphs. Such a model can progressively grow these layouts and graphs based on local statistics, where nodes can represent spatial control points of the layout and edges can represent segments or paths between nodes, such as may correspond to road segments. A generative model can utilize an encoder-decoder architecture where the encoder is a recurrent neural network (RNN) that encodes local incoming paths into a node and the decoder is another RNN that generates outgoing nodes and edges connecting an existing node to the newly generated nodes. Generation is done iteratively and can finish once all nodes are visited or another end condition is satisfied. Such a model can generate layouts by additionally conditioning on a set of attributes, giving control to a user in generating the layout.
Type: Application
Filed: March 20, 2020
Publication date: September 24, 2020
Inventors: Hang Chu, Daiqing Li, David Jesus Acuna Marrero, Amlan Kar, Maria Shugrina, Ming-Yu Liu, Antonio Torralba Barriuso, Sanja Fidler
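The encoder-decoder shape of the growth step can be shown directly: a GRU encodes the sequence of 2D points on the path arriving at the current node, and a decoder cell proposes outgoing node positions one at a time until a stop condition fires. Dimensions, the stop rule, and the point parameterization are illustrative assumptions.

```python
# Iterative layout-growth sketch: encode incoming path, decode outgoing nodes.
import torch
import torch.nn as nn

encoder = nn.GRU(input_size=2, hidden_size=32, batch_first=True)
decoder_cell = nn.GRUCell(2, 32)
to_point = nn.Linear(32, 2)   # next node position (2D control point)
to_stop = nn.Linear(32, 1)    # probability of stopping at this node

incoming_path = torch.randn(1, 5, 2)   # 5 points leading into the node
_, h = encoder(incoming_path)          # encode the local incoming path
h = h.squeeze(0)                       # (1, 32) decoder state

node = incoming_path[:, -1, :]         # current node position
new_nodes = []
for _ in range(4):                     # propose up to 4 outgoing nodes
    h = decoder_cell(node, h)
    node = to_point(h)                 # new node; edge = (previous, node)
    new_nodes.append(node)
    if torch.sigmoid(to_stop(h)).item() > 0.9:
        break                          # end condition for this node
```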
-
Publication number: 20200249753
Abstract: A system includes a camera positioned in an environment to capture image data of a subject; a computing device communicatively coupled to the camera, the computing device comprising a processor and a non-transitory computer-readable memory; and a machine-readable instruction set stored in the non-transitory computer-readable memory. The machine-readable instruction set causes the computing device to perform at least the following when executed by the processor: receive the image data from the camera; analyze the image data captured by the camera using a neural network trained on training data generated from a 360-degree panoramic camera configured to collect image data of a subject and a visual target that is moved about an environment; and predict a gaze direction vector of the subject with the neural network.
Type: Application
Filed: January 16, 2020
Publication date: August 6, 2020
Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology
Inventors: Simon A.I. Stent, Adrià Recasens, Petr Kellnhofer, Wojciech Matusik, Antonio Torralba
-
Publication number: 20200160178
Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar—such as a probabilistic grammar—and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.
Type: Application
Filed: November 15, 2019
Publication date: May 21, 2020
Inventors: Amlan Kar, Aayush Prakash, Ming-Yu Liu, David Jesus Acuna Marrero, Antonio Torralba Barriuso, Sanja Fidler