Patents by Inventor Vincent Sitzmann

Vincent Sitzmann has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240161389
    Abstract: Systems and methods described herein support enhanced computer vision capabilities which may be applicable to, for example, autonomous vehicle operation. An example method includes generating a latent space and a decoder based on image data that includes multiple images, where each image has a different viewing frame of a scene. The method also includes generating a volumetric embedding that is representative of a novel viewing frame of the scene. The method includes decoding, with the decoder, the latent space using cross-attention with the volumetric embedding, and generating a novel viewing frame of the scene based on an output of the decoder.
    Type: Application
    Filed: August 3, 2023
    Publication date: May 16, 2024
    Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
  • Publication number: 20240161471
    Abstract: Systems and methods described herein support enhanced computer vision capabilities which may be applicable to, for example, autonomous vehicle operation. An example method includes generating, through training, a shared latent space based on (i) image data that include multiple images, where each image has a different viewing frame of a scene, and (ii) first and second types of embeddings, and training a decoder based on the first type of embeddings. The method also includes generating an embedding based on the first type of embeddings that is representative of a novel viewing frame of the scene, decoding, with the decoder, the shared latent space using cross-attention with the generated embedding, and generating the novel viewing frame of the scene based on an output of the decoder.
    Type: Application
    Filed: August 3, 2023
    Publication date: May 16, 2024
    Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
  • Publication number: 20240161510
    Abstract: Systems and methods described herein support enhanced computer vision capabilities which may be applicable to, for example, autonomous vehicle operation. An example method includes An example method includes training a shared latent space and a first decoder based on first image data that includes multiple images, and training the shared latent space and a second decoder based on second image data that includes multiple images. The method also includes generating a volumetric embedding that is representative of a novel viewing frame the first scene. Further, the method includes decoding, with the first decoders, the shared latent space with the volumetric embedding, and generating the novel viewing frame of the first scene based on the output of the first decoder.
    Type: Application
    Filed: August 3, 2023
    Publication date: May 16, 2024
    Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
  • Publication number: 20240153197
    Abstract: An example method includes generating embeddings of image data that includes multiple images, where each image has a different viewpoints of a scene, generating a latent space and a decoder, wherein the decoder receives embeddings as input to generate an output viewpoint, for each viewpoint in the image data, determining a volumetric rendering view synthesis loss and a multi-view photometric loss, and applying an optimization algorithm to the latent space and the decoder over a number of epochs until the volumetric rendering view synthesis loss is within a volumetric threshold and the multi-view photometric loss is within a multi-view threshold.
    Type: Application
    Filed: August 3, 2023
    Publication date: May 9, 2024
    Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Jiading Fang, Sergey Zakharov, Vincent Sitzmann, Igor Vasiljevic, Adrien Gaidon
  • Patent number: 11887248
    Abstract: Systems and methods described herein relate to reconstructing a scene in three dimensions from a two-dimensional image. One embodiment processes an image using a detection transformer to detect an object in the scene and to generate a NOCS map of the object and a background depth map; uses MLPs to relate the object to a differentiable database of object priors (PriorDB); recovers, from the NOCS map, a partial 3D object shape; estimates an initial object pose; fits a PriorDB object prior to align in geometry and appearance with the partial 3D shape to produce a complete shape and refines the initial pose estimate; generates an editable and re-renderable 3D scene reconstruction based, at least in part, on the complete shape, the refined pose estimate, and the depth map; and controls the operation of a robot based, at least in part, on the editable and re-renderable 3D scene reconstruction.
    Type: Grant
    Filed: March 16, 2022
    Date of Patent: January 30, 2024
    Assignees: Toyota Research Institute, Inc., Massachusetts Institute of Technology, The Board of Trustees of the Leland Standford Junior Univeristy
    Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus, Dennis Park, Joshua Tenenbaum, Jiajun Wu, Fredo Durand, Vincent Sitzmann
  • Publication number: 20240005627
    Abstract: A method of conditional neural ground planes for static-dynamic disentanglement is described. The method includes extracting, using a convolutional neural network (CNN), CNN image features from an image to form a feature tensor. The method also includes resampling unprojected 2D features of the feature tensor to form feature pillars. The method further includes aggregating the feature pillars to form an entangled neural ground plane. The method also includes decomposing the entangled neural ground plane into a static neural ground plane and a dynamic neural ground plane.
    Type: Application
    Filed: April 18, 2023
    Publication date: January 4, 2024
    Applicants: TOYOTA RESEARCH INSTITUTE, INC., TOYOTA JIDOSHA KABUSHIKI KAISHA, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Prafull SHARMA, Ayush TEWARI, Yilun DU, Sergey ZAKHAROV, Rares Andrei AMBRUS, Adrien David GAIDON, William Tafel FREEMAN, Frederic Pierre DURAND, Joshua B. TENENBAUM, Vincent SITZMANN
  • Publication number: 20220414974
    Abstract: Systems and methods described herein relate to reconstructing a scene in three dimensions from a two-dimensional image. One embodiment processes an image using a detection transformer to detect an object in the scene and to generate a NOCS map of the object and a background depth map; uses MLPs to relate the object to a differentiable database of object priors (PriorDB); recovers, from the NOCS map, a partial 3D object shape; estimates an initial object pose; fits a PriorDB object prior to align in geometry and appearance with the partial 3D shape to produce a complete shape and refines the initial pose estimate; generates an editable and re-renderable 3D scene reconstruction based, at least in part, on the complete shape, the refined pose estimate, and the depth map; and controls the operation of a robot based, at least in part, on the editable and re-renderable 3D scene reconstruction.
    Type: Application
    Filed: March 16, 2022
    Publication date: December 29, 2022
    Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
    Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus, Dennis Park, Joshua Tenenbaum, Jiajun Wu, Fredo Durand, Vincent Sitzmann