Patents by Inventor Sanja Fidler

Sanja Fidler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230244985
Abstract: In various examples, a representative subset of data points is queried or selected using integer programming to minimize the Wasserstein distance between the selected data points and the data set from which they were selected. A Generalized Benders Decomposition (GBD) may be used to decompose and iteratively solve the minimization problem, providing a globally optimal solution (an identified subset of data points that match the distribution of their data set) within a threshold tolerance. Data selection may be accelerated by applying one or more constraints while iterating, such as optimality cuts that leverage properties of the Wasserstein distance and/or pruning constraints that reduce the search space of candidate data points. In an active learning implementation, a representative subset of unlabeled data points may be selected using GBD, labeled, and used to train machine learning model(s) over one or more cycles of active learning.
    Type: Application
    Filed: February 2, 2022
    Publication date: August 3, 2023
    Inventors: Rafid Reza Mahmood, Sanja Fidler, Marc Law
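As a toy illustration of the selection objective described above (not the patented GBD procedure, which scales to large data sets), the sketch below brute-forces the size-k subset of a small 1D data set that is closest, in Wasserstein-1 distance, to the full data set's quantiles. All function names are hypothetical:

```python
from itertools import combinations

def wasserstein_1d(a, b):
    # Wasserstein-1 distance between two equal-size 1D empirical
    # distributions: mean absolute difference of matched order statistics.
    assert len(a) == len(b)
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def best_subset(data, k):
    # Exhaustive search for the size-k subset whose empirical distribution
    # is closest to the full data set, compared at k evenly spaced quantiles.
    # The patent's GBD formulation solves this selection problem at scale;
    # enumeration is only viable for tiny illustrative inputs like this one.
    ordered = sorted(data)
    quantiles = [ordered[int(i * len(data) / k)] for i in range(k)]
    best, best_d = None, float("inf")
    for subset in combinations(data, k):
        d = wasserstein_1d(list(subset), quantiles)
        if d < best_d:
            best, best_d = subset, d
    return best, best_d
```

For `data = [0, 1, 2, 3, 4, 5]` and `k = 3`, the search recovers the evenly spread subset `(0, 2, 4)` at distance zero, matching the intuition that the selected points should mirror the distribution of the full set.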
  • Publication number: 20230229919
Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar—such as a probabilistic grammar—and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.
    Type: Application
    Filed: March 20, 2023
    Publication date: July 20, 2023
    Inventors: Amlan Kar, Aayush Prakash, Ming-Yu Liu, David Jesus Acuna Marrero, Antonio Torralba Barriuso, Sanja Fidler
  • Patent number: 11676284
Abstract: Various types of image analysis benefit from a multi-stream architecture that allows the analysis to consider shape data. A shape stream can process image data in parallel with a primary stream, where data from layers of a network in the primary stream is provided as input to a network of the shape stream. The shape data can be fused with the primary analysis data to produce more accurate output, such as to produce accurate boundary information when the shape data is used with semantic segmentation data produced by the primary stream. A gate structure can be used to connect the intermediate layers of the primary and shape streams, using higher level activations to gate lower level activations in the shape stream. Such a gate structure can help focus the shape stream on the relevant information and reduce the additional weight of the shape stream.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: June 13, 2023
    Assignee: Nvidia Corporation
    Inventors: David Jesus Acuna Marrero, Towaki Takikawa, Varun Jampani, Sanja Fidler
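The gating idea above can be reduced to a one-line operation: a sigmoid of the higher-level activation scales each lower-level shape-stream activation. This is a minimal elementwise sketch of the concept, not the network's actual layer implementation; the function name is hypothetical:

```python
import math

def gate_activations(lower, higher):
    # Gate structure between streams: a sigmoid of each higher-level
    # activation scales the corresponding lower-level shape-stream
    # activation, suppressing responses that higher layers deem
    # irrelevant to shape/boundary information.
    def sigmoid(h):
        return 1.0 / (1.0 + math.exp(-h))
    return [l * sigmoid(h) for l, h in zip(lower, higher)]
```

A neutral higher-level signal (0.0) halves the lower-level activation, while a strongly positive signal passes it through almost unchanged.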
  • Publication number: 20230134690
    Abstract: Approaches are presented for training an inverse graphics network. An image synthesis network can generate training data for an inverse graphics network. In turn, the inverse graphics network can teach the synthesis network about the physical three-dimensional (3D) controls. Such an approach can provide for accurate 3D reconstruction of objects from 2D images using the trained inverse graphics network, while requiring little annotation of the provided training data. Such an approach can extract and disentangle 3D knowledge learned by generative models by utilizing differentiable renderers, enabling a disentangled generative model to function as a controllable 3D “neural renderer,” complementing traditional graphics renderers.
    Type: Application
    Filed: November 7, 2022
    Publication date: May 4, 2023
    Inventors: Wenzheng Chen, Yuxuan Zhang, Sanja Fidler, Huan Ling, Jun Gao, Antonio Torralba Barriuso
  • Publication number: 20230140460
Abstract: A technique is described for extracting or constructing a three-dimensional (3D) model from multiple two-dimensional (2D) images. In an embodiment, a foreground segmentation mask or depth field may be provided as an additional supervision input with each 2D image. In an embodiment, the foreground segmentation mask or depth field is automatically generated for each 2D image. The constructed 3D model comprises a triangular mesh topology, materials, and environment lighting. The constructed 3D model is represented in a format that can be directly edited and/or rendered by conventional application programs, such as digital content creation (DCC) tools. For example, the constructed 3D model may be represented as a triangular surface mesh (with arbitrary topology), a set of 2D textures representing spatially-varying material parameters, and an environment map. Furthermore, the constructed 3D model may be included in 3D scenes and interact realistically with other objects.
    Type: Application
    Filed: May 30, 2022
    Publication date: May 4, 2023
    Inventors: Carl Jacob Munkberg, Jon Niklas Theodor Hasselgren, Tianchang Shen, Jun Gao, Wenzheng Chen, Alex John Bauld Evans, Thomas Müller-Höhne, Sanja Fidler
  • Patent number: 11610115
    Abstract: In various examples, a generative model is used to synthesize datasets for use in training a downstream machine learning model to perform an associated task. The synthesized datasets may be generated by sampling a scene graph from a scene grammar—such as a probabilistic grammar—and applying the scene graph to the generative model to compute updated scene graphs more representative of object attribute distributions of real-world datasets. The downstream machine learning model may be validated against a real-world validation dataset, and the performance of the model on the real-world validation dataset may be used as an additional factor in further training or fine-tuning the generative model for generating the synthesized datasets specific to the task of the downstream machine learning model.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: March 21, 2023
    Assignee: NVIDIA Corporation
    Inventors: Amlan Kar, Aayush Prakash, Ming-Yu Liu, David Jesus Acuna Marrero, Antonio Torralba Barriuso, Sanja Fidler
  • Publication number: 20230074420
    Abstract: Generation of three-dimensional (3D) object models may be challenging for users without a sufficient skill set for content creation and may also be resource intensive. One or more style transfer networks may be used for part-aware style transformation of both geometric features and textural components of a source asset to a target asset. The source asset may be segmented into particular parts and then ellipsoid approximations may be warped according to correspondence of the particular parts to the target assets. Moreover, a texture associated with the target asset may be used to warp or adjust a source texture, where the new texture can be applied to the warped parts.
    Type: Application
    Filed: September 7, 2021
    Publication date: March 9, 2023
    Inventors: Kangxue Yin, Jun Gao, Masha Shugrina, Sameh Khamis, Sanja Fidler
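The part-aware geometric transfer described above can be illustrated at its simplest: map a part's points from their source ellipsoid approximation onto the corresponding target ellipsoid by per-axis scaling of the radii. This is a hedged toy sketch of the warping idea, not the patented networks; the function name is hypothetical:

```python
def warp_part(points, src_radii, tgt_radii):
    # Warp one segmented part: scale each axis by the ratio of the target
    # ellipsoid radius to the source ellipsoid radius, so the part's
    # source ellipsoid approximation maps onto the target's.
    scale = [t / s for t, s in zip(tgt_radii, src_radii)]
    return [tuple(c * f for c, f in zip(p, scale)) for p in points]
```

For example, a part whose source ellipsoid has radii (1, 2, 3) warped toward target radii (2, 2, 3) is stretched only along the first axis.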
  • Patent number: 11556797
    Abstract: The present invention relates generally to object annotation, specifically to polygonal annotations of objects. Described are methods of annotating an object including steps of receiving an image depicting an object, generating a set of image features using a CNN encoder implemented on one or more computers, and producing a polygon object annotation via a recurrent decoder or a Graph Neural Network. The recurrent decoder may include a recurrent neural network, a graph neural network or a gated graph neural network. A system for annotating an object and a method of training an object annotation system are also described.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: January 17, 2023
    Inventors: Sanja Fidler, Amlan Kar, Huan Ling, Jun Gao, Wenzheng Chen, David Jesus Acuna Marrero
  • Publication number: 20220392162
    Abstract: In various examples, a deep three-dimensional (3D) conditional generative model is implemented that can synthesize high resolution 3D shapes using simple guides—such as coarse voxels, point clouds, etc.—by marrying implicit and explicit 3D representations into a hybrid 3D representation. The present approach may directly optimize for the reconstructed surface, allowing for the synthesis of finer geometric details with fewer artifacts. The systems and methods described herein may use a deformable tetrahedral grid that encodes a discretized signed distance function (SDF) and a differentiable marching tetrahedral layer that converts the implicit SDF representation to an explicit surface mesh representation. This combination allows joint optimization of the surface geometry and topology as well as generation of the hierarchy of subdivisions using reconstruction and adversarial losses defined explicitly on the surface mesh.
    Type: Application
    Filed: April 11, 2022
    Publication date: December 8, 2022
    Inventors: Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler
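The core of any marching-style extraction from a discretized SDF is finding grid edges whose endpoints have opposite signs: each such edge crosses the implicit surface and would receive a mesh vertex. The sketch below demonstrates this on a cubic grid with an analytic sphere SDF (the patent uses a deformable tetrahedral grid and a learned SDF; names here are hypothetical):

```python
def sphere_sdf(x, y, z, r=1.0):
    # Signed distance to a sphere of radius r at the origin (negative inside).
    return (x * x + y * y + z * z) ** 0.5 - r

def count_crossing_edges(n=8, extent=1.5):
    # Sample the SDF on a regular n^3 grid and count axis-aligned edges
    # whose endpoints have opposite signs. Each such edge crosses the
    # implicit surface; a marching-tetrahedra-style layer would place a
    # mesh vertex on it (here on a cubic rather than tetrahedral grid).
    pts = [-extent + i * (2 * extent / (n - 1)) for i in range(n)]
    sdf = [[[sphere_sdf(x, y, z) for z in pts] for y in pts] for x in pts]
    count = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for ni, nj, nk in ((i + 1, j, k), (i, j + 1, k), (i, j, k + 1)):
                    if ni < n and nj < n and nk < n:
                        if sdf[i][j][k] * sdf[ni][nj][nk] < 0:
                            count += 1
    return count
```

Because the unit sphere fits inside the sampled volume, the grid always contains interior (negative) and exterior (positive) samples, so some edges cross the surface.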
  • Publication number: 20220391766
Abstract: In various examples, systems and methods are disclosed that use a domain-adaptation theory to minimize the reality gap between simulated and real-world domains for training machine learning models. For example, sampling of spatial priors may be used to generate synthetic data that more closely matches the diversity of data from the real-world. To train models using this synthetic data that still perform well in the real-world, the systems and methods of the present disclosure may use a discriminator that allows a model to learn domain-invariant representations to minimize the divergence between the virtual world and the real-world in a latent space. As such, the techniques described herein allow for a principled approach to learn neural-invariant representations and a theoretically inspired approach on how to sample data from a simulator that, in combination, allow for training of machine learning models using synthetic data.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 8, 2022
    Inventors: David Jesus Acuna Marrero, Sanja Fidler, Jonah Philion
  • Publication number: 20220391781
    Abstract: A method performed by a server is provided. The method comprises sending copies of a set of parameters of a hyper network (HN) to at least one client device, receiving from each client device in the at least one client device, a corresponding set of updated parameters of the HN, and determining a next set of parameters of the HN based on the corresponding sets of updated parameters received from the at least one client device. Each client device generates the corresponding set of updated parameters based on a local model architecture of the client device.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 8, 2022
    Inventors: Or Litany, Haggai Maron, David Jesus Acuna Marrero, Jan Kautz, Sanja Fidler, Gal Chechik
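The server-side step of the method above ("determining a next set of parameters ... based on the corresponding sets of updated parameters") can be sketched as a simple aggregation. A plain (optionally weighted) average is one common instance; the claim leaves the exact rule open, and the function name is hypothetical:

```python
def aggregate(client_updates, weights=None):
    # Combine per-client updated HN parameter sets into the next set of
    # parameters the server will distribute. A weighted average (uniform
    # by default) is one simple aggregation rule; others are possible.
    if weights is None:
        weights = [1.0 / len(client_updates)] * len(client_updates)
    dim = len(client_updates[0])
    return [sum(w * u[i] for w, u in zip(weights, client_updates))
            for i in range(dim)]
```

Two clients returning parameter vectors `[1, 2]` and `[3, 4]` yield the next set `[2, 3]` under uniform weighting; a round then repeats by sending these back to the clients.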
  • Publication number: 20220383073
Abstract: In various examples, machine learning models (MLMs) may be updated using multi-order gradients in order to train the MLMs, such as at least a first order gradient and any number of higher-order gradients. At least a first of the MLMs may be trained to generate a representation of features that is invariant to a first domain corresponding to a first dataset and a second domain corresponding to a second dataset. At least a second of the MLMs may be trained to classify whether the representation corresponds to the first domain or the second domain. At least a third of the MLMs may be trained to perform a task. The first dataset may correspond to a labeled source domain and the second dataset may correspond to an unlabeled target domain. The training may include transferring knowledge from the first domain to the second domain in a representation space.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: David Jesus Acuna Marrero, Sanja Fidler, Marc Law, Guojun Zhang
  • Publication number: 20220383582
    Abstract: In various examples, information may be received for a 3D model, such as 3D geometry information, lighting information, and material information. A machine learning model may be trained to disentangle the 3D geometry information, the lighting information, and/or material information from input data to provide the information, which may be used to project geometry of the 3D model onto an image plane to generate a mapping between pixels and portions of the 3D model. Rasterization may then use the mapping to determine which pixels are covered and in what manner, by the geometry. The mapping may also be used to compute radiance for points corresponding to the one or more 3D models using light transport simulation. Disclosed approaches may be used in various applications, such as image editing, 3D model editing, synthetic data generation, and/or data set augmentation.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: Wenzheng Chen, Joey Litalien, Jun Gao, Zian Wang, Clement Tse Tsian Christophe Louis Fuji Tsang, Sameh Khamis, Or Litany, Sanja Fidler
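The pixel-to-geometry mapping that the rasterization step above consumes can be illustrated with a minimal pinhole projection: project camera-space 3D points onto the image plane and record which pixel each lands in. This is a hedged stand-in for the patented pipeline's differentiable rasterizer; names and the simple camera model are assumptions:

```python
def pixel_geometry_mapping(points, f=1.0, width=64, height=64):
    # Project camera-space points through a pinhole model (focal length f,
    # normalized image plane mapped to a width x height pixel grid) and
    # record which pixel each point maps to. Points behind the camera
    # (z <= 0) or outside the image are skipped.
    mapping = {}
    for idx, (x, y, z) in enumerate(points):
        if z <= 0:
            continue
        u = int(width / 2 + f * (x / z) * (width / 2))
        v = int(height / 2 + f * (y / z) * (height / 2))
        if 0 <= u < width and 0 <= v < height:
            mapping[(u, v)] = idx
    return mapping
```

A point on the optical axis lands in the center pixel; light-transport simulation would then compute radiance only for pixels that the mapping reports as covered.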
  • Publication number: 20220383570
Abstract: In various examples, high-precision semantic image editing for machine learning systems and applications are described. For example, a generative adversarial network (GAN) may be used to jointly model images and their semantic segmentations based on a same underlying latent code. Image editing may be achieved by using segmentation mask modifications (e.g., provided by a user, or otherwise) to optimize the latent code to be consistent with the updated segmentation, thus effectively changing the original (e.g., RGB) image. To improve efficiency of the system, and to not require optimizations for each edit on each image, editing vectors may be learned in latent space that realize the edits, and that can be directly applied on other images with or without additional optimizations. As a result, a GAN in combination with the optimization approaches described herein may simultaneously allow for high precision editing in real-time with straightforward compositionality of multiple edits.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba Barriuso, Sanja Fidler
  • Patent number: 11494976
    Abstract: Approaches are presented for training an inverse graphics network. An image synthesis network can generate training data for an inverse graphics network. In turn, the inverse graphics network can teach the synthesis network about the physical three-dimensional (3D) controls. Such an approach can provide for accurate 3D reconstruction of objects from 2D images using the trained inverse graphics network, while requiring little annotation of the provided training data. Such an approach can extract and disentangle 3D knowledge learned by generative models by utilizing differentiable renderers, enabling a disentangled generative model to function as a controllable 3D “neural renderer,” complementing traditional graphics renderers.
    Type: Grant
    Filed: March 5, 2021
    Date of Patent: November 8, 2022
    Assignee: Nvidia Corporation
    Inventors: Wenzheng Chen, Yuxuan Zhang, Sanja Fidler, Huan Ling, Jun Gao, Antonio Torralba Barriuso
  • Publication number: 20220284659
    Abstract: Systems and methods are described for rendering complex surfaces or geometry. In at least one embodiment, neural signed distance functions (SDFs) can be used that efficiently capture multiple levels of detail (LODs), and that can be used to reconstruct multi-dimensional geometry or surfaces with high image quality. An example architecture can represent complex shapes in a compressed format with high visual fidelity, and can generalize across different geometries from a single learned example. Extremely small multi-layer perceptrons (MLPs) can be used with an octree-based feature representation for the learned neural SDFs.
    Type: Application
    Filed: May 16, 2022
    Publication date: September 8, 2022
    Inventors: Towaki Alan Takikawa, Joey Litalien, Kangxue Yin, Karsten Julian Kreis, Charles Loop, Morgan McGuire, Sanja Fidler
  • Publication number: 20220269937
    Abstract: Apparatuses, systems, and techniques to use one or more neural networks to generate one or more images based, at least in part, on one or more spatially-independent features within the one or more images. In at least one embodiment, the one or more neural networks determine spatially-independent information and spatially-dependent information of the one or more images and process the spatially-independent information and the spatially-dependent information to generate the one or more spatially-independent features and one or more spatially-dependent features within the one or more images.
    Type: Application
    Filed: February 24, 2021
    Publication date: August 25, 2022
    Inventors: Seung Wook Kim, Jonah Philion, Sanja Fidler, Antonio Torralba Barriuso
  • Publication number: 20220172423
    Abstract: Systems and methods are described for rendering complex surfaces or geometry. In at least one embodiment, neural signed distance functions (SDFs) can be used that efficiently capture multiple levels of detail (LODs), and that can be used to reconstruct multi-dimensional geometry or surfaces with high image quality. An example architecture can represent complex shapes in a compressed format with high visual fidelity, and can generalize across different geometries from a single learned example. Extremely small multi-layer perceptrons (MLPs) can be used with an octree-based feature representation for the learned neural SDFs.
    Type: Application
    Filed: May 7, 2021
    Publication date: June 2, 2022
    Inventors: Towaki Alan Takikawa, Joey Litalien, Kangxue Yin, Karsten Julian Kreis, Charles Loop, Morgan McGuire, Sanja Fidler
  • Patent number: 11335056
    Abstract: Systems and methods are described for rendering complex surfaces or geometry. In at least one embodiment, neural signed distance functions (SDFs) can be used that efficiently capture multiple levels of detail (LODs), and that can be used to reconstruct multi-dimensional geometry or surfaces with high image quality. An example architecture can represent complex shapes in a compressed format with high visual fidelity, and can generalize across different geometries from a single learned example. Extremely small multi-layer perceptrons (MLPs) can be used with an octree-based feature representation for the learned neural SDFs.
    Type: Grant
    Filed: May 7, 2021
    Date of Patent: May 17, 2022
    Assignee: Nvidia Corporation
    Inventors: Towaki Alan Takikawa, Joey Litalien, Kangxue Yin, Karsten Julian Kreis, Charles Loop, Morgan McGuire, Sanja Fidler
  • Publication number: 20220108134
    Abstract: Approaches presented herein provide for unsupervised domain transfer learning. In particular, three neural networks can be trained together using at least labeled data from a first domain and unlabeled data from a second domain. Features of the data are extracted using a feature extraction network. A first classifier network uses these features to classify the data, while a second classifier network uses these features to determine the relevant domain. A combined loss function is used to optimize the networks, with a goal of the feature extraction network extracting features that the first classifier network is able to use to accurately classify the data, but prevent the second classifier from determining the domain for the image. Such optimization enables object classification to be performed with high accuracy for either domain, even though there may have been little to no labeled training data for the second domain.
    Type: Application
    Filed: April 9, 2021
    Publication date: April 7, 2022
    Inventors: David Acuna Marrero, Guojun Zhang, Marc Law, Sanja Fidler
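The combined-loss optimization described above has a simple core: the feature extractor descends the task-classifier gradient while ascending the domain-classifier gradient, so features stay predictive for the task but uninformative about the domain. The sketch below shows one such update step (a standard gradient-reversal form, offered as an illustration rather than the patented training procedure; names are hypothetical):

```python
def reversed_gradient_step(theta, grad_task, grad_domain, lr=0.1, lam=1.0):
    # One feature-extractor update under the combined objective: descend
    # the task gradient, ascend the domain-classifier gradient (reversal),
    # with lam trading off task accuracy against domain invariance.
    return [t - lr * (gt - lam * gd)
            for t, gt, gd in zip(theta, grad_task, grad_domain)]
```

When the two gradients agree exactly and `lam=1`, the update cancels; with `lam=0` the step reduces to plain task-gradient descent.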