Patents by Inventor Sergey Zakharov

Sergey Zakharov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240135721
    Abstract: A method for improving 3D object detection via object-level augmentations is described. The method includes recognizing, using an image recognition model of a differentiable data generation pipeline, an object in an image of a scene. The method also includes generating, using a 3D reconstruction model, a 3D reconstruction of the scene from the image including the recognized object. The method further includes manipulating, using an object level augmentation model, a random property of the object by a random magnitude at an object level to determine a set of properties and a set of magnitudes of an object manipulation that maximizes a loss function of the image recognition model. The method also includes training a downstream task network based on a set of training data generated based on the set of properties and the set of magnitudes of the object manipulation, such that the loss function is minimized.
    Type: Application
    Filed: October 12, 2022
    Publication date: April 25, 2024
    Applicants: TOYOTA RESEARCH INSTITUTE, INC., TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Rares Andrei AMBRUS, Sergey ZAKHAROV, Vitor GUIZILINI, Adrien David GAIDON
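The augmentation search described in the abstract above — randomly perturbing an object-level property and keeping the (property, magnitude) pair that maximizes the recognition loss — can be sketched as a toy illustration. `model_loss` and `brightness` below are hypothetical stand-ins, not the patented models, and the search acts on raw pixels rather than a 3D scene reconstruction:

```python
import random

def model_loss(image):
    """Hypothetical surrogate for the recognition model's loss:
    mean deviation from mid-gray (stands in for a real network loss)."""
    return sum(abs(p - 128) for p in image) / len(image)

def brightness(pixel, mag):
    """Hypothetical object-level property: a clamped brightness shift."""
    return min(255, max(0, pixel + 100 * mag))

def search_augmentation(image, properties, trials=50, seed=0):
    """Sample random (property, magnitude) manipulations and keep the one
    that maximizes the surrogate loss."""
    rng = random.Random(seed)
    best_prop, best_mag, best_loss = None, 0.0, float("-inf")
    for _ in range(trials):
        prop = rng.choice(properties)        # random property
        mag = rng.uniform(-1.0, 1.0)         # random magnitude
        loss = model_loss([prop(p, mag) for p in image])
        if loss > best_loss:
            best_prop, best_mag, best_loss = prop, mag, loss
    return best_prop, best_mag, best_loss

image = [120, 130, 125, 128]                 # a tiny fake "object crop"
prop, mag, loss = search_augmentation(image, [brightness])
```

In the claimed pipeline, the maximizing manipulations are then used to generate training data on which the downstream task loss is minimized.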
  • Publication number: 20240104774
    Abstract: Various embodiments include a pose estimation method for refining an initial multi-dimensional pose of an object of interest to generate a refined multi-dimensional object pose Tpr(NL), with NL ≥ 1. The method may include: providing the initial object pose Tpr(0) and at least one 2D-3D correspondence map, indexed i = 1, . . . , I with I ≥ 1; and estimating the refined object pose Tpr(NL) using an iterative optimization procedure of a loss according to a given loss function LF(k), based on discrepancies between the one or more provided 2D-3D correspondence maps and the one or more respective rendered 2D-3D correspondence maps.
    Type: Application
    Filed: December 9, 2021
    Publication date: March 28, 2024
    Applicant: Siemens Aktiengesellschaft
    Inventors: Slobodan Ilic, Ivan Shugurov, Sergey Zakharov, Ivan Pavlov
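The iterative refinement loop can be illustrated with a translation-only toy problem: gradient descent on a squared reprojection discrepancy stands in for the loss function LF(k). The numeric gradients and pinhole setup are assumptions for illustration, not the patented procedure:

```python
import numpy as np

def refine_pose(t0, pts3d, pts2d, f=1.0, lr=0.05, iters=200):
    """Refine a translation-only 'pose' t = (tx, ty, tz) by gradient descent
    on the squared reprojection discrepancy (numeric forward differences)."""
    t = np.asarray(t0, dtype=float).copy()

    def loss(tv):
        p = pts3d + tv                       # transform the model points
        return np.sum((f * p[:, :2] / p[:, 2:3] - pts2d) ** 2)

    eps = 1e-6
    for _ in range(iters):
        base = loss(t)
        g = np.zeros(3)
        for k in range(3):                   # forward-difference gradient
            tp = t.copy()
            tp[k] += eps
            g[k] = (loss(tp) - base) / eps
        t -= lr * g
    return t

pts3d = np.array([[0., 0., 5.], [1., 0., 5.], [0., 1., 5.], [1., 1., 6.]])
true_t = np.array([0.3, -0.2, 0.5])          # ground-truth translation
shifted = pts3d + true_t
pts2d = shifted[:, :2] / shifted[:, 2:3]     # "observed" correspondences
t_refined = refine_pose([0., 0., 0.], pts3d, pts2d)
```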
  • Patent number: 11915451
    Abstract: A method and a system for object detection and pose estimation within an input image. A 6-degree-of-freedom object detection and pose estimation is performed using a trained encoder-decoder convolutional artificial neural network including an encoder head, an ID mask decoder head, a first correspondence color channel decoder head and a second correspondence color channel decoder head. The ID mask decoder head creates an ID mask for identifying objects, and the color channel decoder heads are used to create a 2D-to-3D-correspondence map. For at least one object identified by the ID mask, a pose estimation based on the generated 2D-to-3D-correspondence map and on a pre-generated bijective association of points of the object with unique value combinations in the first and the second correspondence color channels is generated.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: February 27, 2024
    Assignee: Siemens Aktiengesellschaft
    Inventors: Ivan Shugurov, Andreas Hutter, Sergey Zakharov, Slobodan Ilic
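The "pre-generated bijective association of points of the object with unique value combinations in the first and the second correspondence color channels" can be illustrated with a two-channel integer coding. This is an assumed toy scheme, not the mapping prescribed by the patent:

```python
import numpy as np

def point_to_colors(idx):
    """Bijectively encode a surface-point index (0..65535) into two 8-bit
    correspondence color channels."""
    return idx // 256, idx % 256

def colors_to_point(c1, c2):
    """Invert the coding: recover the surface-point index from the two
    decoded color-channel values."""
    return c1 * 256 + c2

# A rendered "correspondence image": each pixel stores the two channel
# values of the object point visible there, plus an ID mask.
ids = np.array([[0, 7], [7, 7]])             # 0 = background, 7 = object ID
corr = np.array([[(0, 0), (1, 42)],
                 [(2, 7), (3, 255)]])        # (channel-1, channel-2) per pixel
points = [colors_to_point(*corr[i, j])
          for i, j in zip(*np.nonzero(ids == 7))]
```

With such 2D-to-3D correspondences in hand, the pose itself would typically be estimated with a PnP-style solver.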
  • Patent number: 11887248
    Abstract: Systems and methods described herein relate to reconstructing a scene in three dimensions from a two-dimensional image. One embodiment processes an image using a detection transformer to detect an object in the scene and to generate a NOCS map of the object and a background depth map; uses MLPs to relate the object to a differentiable database of object priors (PriorDB); recovers, from the NOCS map, a partial 3D object shape; estimates an initial object pose; fits a PriorDB object prior to align in geometry and appearance with the partial 3D shape to produce a complete shape and refines the initial pose estimate; generates an editable and re-renderable 3D scene reconstruction based, at least in part, on the complete shape, the refined pose estimate, and the depth map; and controls the operation of a robot based, at least in part, on the editable and re-renderable 3D scene reconstruction.
    Type: Grant
    Filed: March 16, 2022
    Date of Patent: January 30, 2024
    Assignees: Toyota Research Institute, Inc., Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
    Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus, Dennis Park, Joshua Tenenbaum, Jiajun Wu, Fredo Durand, Vincent Sitzmann
  • Publication number: 20240028792
    Abstract: The disclosure provides implicit representations for multi-object 3D shape, 6D pose and size, and appearance optimization, including obtaining shape, 6D pose and size, and appearance codes. Training is employed using shape and appearance priors from an implicit joint differential database. 2D masks are also obtained and are used in an optimization process that utilizes a combined loss minimizing function and an Octree-based coarse-to-fine differentiable optimization to jointly optimize the latent shape, appearance, pose and size, and 2D masks. An object surface is recovered from the latent shape codes to a desired resolution level. The database represents shapes as Signed Distance Fields (SDF), and appearance as Texture Fields (TF).
    Type: Application
    Filed: July 19, 2022
    Publication date: January 25, 2024
    Applicants: TOYOTA RESEARCH INSTITUTE, INC., TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: MUHAMMAD ZUBAIR IRSHAD, Sergey Zakharov, Rares A. Ambrus, Adrien D. Gaidon
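Representing shape as a Signed Distance Field and recovering the surface near the zero level set can be sketched as below. A uniform grid stands in for the patent's Octree-based coarse-to-fine optimization, and the analytic sphere stands in for a learned SDF decoder:

```python
import numpy as np

def sphere_sdf(pts, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(pts - center, axis=-1) - radius

def extract_surface(sdf_fn, lo, hi, steps):
    """Sample the SDF on a regular grid and keep samples within one cell of
    the zero level set — a crude surface recovery at a fixed resolution."""
    xs = np.linspace(lo, hi, steps)
    X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
    pts = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)
    d = sdf_fn(pts)
    cell = (hi - lo) / (steps - 1)           # grid spacing = tolerance
    return pts[np.abs(d) < cell]

surface = extract_surface(lambda p: sphere_sdf(p, np.zeros(3), 1.0),
                          -1.5, 1.5, 24)
```

Refining `steps` only where the coarse pass found surface cells would give the coarse-to-fine behavior the abstract describes.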
  • Publication number: 20240013409
    Abstract: A method for multiple object tracking includes receiving, with a computing device, a point cloud dataset, detecting one or more objects in the point cloud dataset, each of the detected one or more objects defined by points of the point cloud dataset and a bounding box, querying one or more historical tracklets for historical tracklet states corresponding to each of the one or more detected objects, implementing a 4D encoding backbone comprising two branches: a first branch configured to compute per-point features for each of the one or more objects and the corresponding historical tracklet states, and a second branch configured to obtain 4D point features, concatenating the per-point features and the 4D point features, and predicting, with a decoder receiving the concatenated per-point features, current tracklet states for each of the one or more objects.
    Type: Application
    Filed: May 26, 2023
    Publication date: January 11, 2024
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha, The Board of Trustees of the Leland Stanford Junior University
    Inventors: Colton Stearns, Jie Li, Rares A. Ambrus, Vitor Campagnolo Guizilini, Sergey Zakharov, Adrien D. Gaidon, Davis Rempe, Tolga Birdal, Leonidas J. Guibas
  • Publication number: 20240005627
    Abstract: A method of conditional neural ground planes for static-dynamic disentanglement is described. The method includes extracting, using a convolutional neural network (CNN), CNN image features from an image to form a feature tensor. The method also includes resampling unprojected 2D features of the feature tensor to form feature pillars. The method further includes aggregating the feature pillars to form an entangled neural ground plane. The method also includes decomposing the entangled neural ground plane into a static neural ground plane and a dynamic neural ground plane.
    Type: Application
    Filed: April 18, 2023
    Publication date: January 4, 2024
    Applicants: TOYOTA RESEARCH INSTITUTE, INC., TOYOTA JIDOSHA KABUSHIKI KAISHA, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Prafull SHARMA, Ayush TEWARI, Yilun DU, Sergey ZAKHAROV, Rares Andrei AMBRUS, Adrien David GAIDON, William Tafel FREEMAN, Frederic Pierre DURAND, Joshua B. TENENBAUM, Vincent SITZMANN
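The aggregation of unprojected image features into ground-plane pillars can be sketched as a mean-pooling scatter. The CNN features and the learned static/dynamic decomposition are outside the scope of this toy; `cells` is an assumed pixel-to-ground-cell assignment:

```python
import numpy as np

def aggregate_pillars(feats, cells, grid=2):
    """Scatter per-pixel features into ground-plane cells and average them,
    forming a crude (entangled) neural ground plane from feature pillars."""
    plane = np.zeros((grid, grid, feats.shape[1]))
    count = np.zeros((grid, grid, 1))
    for f, (i, j) in zip(feats, cells):
        plane[i, j] += f                     # accumulate pillar features
        count[i, j] += 1
    return plane / np.maximum(count, 1)      # mean-pool; empty cells stay zero

feats = np.array([[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]])
cells = [(0, 0), (0, 0), (1, 1)]             # ground-plane cell per feature
plane = aggregate_pillars(feats, cells)
```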
  • Publication number: 20240005540
    Abstract: System, methods, and other embodiments described herein relate to an improved approach to training a depth model to derive depth estimates from monocular images using cost volumes. In one embodiment, a method includes predicting, using a depth model, depth values from at least one input image that is a monocular image. The method includes generating a cost volume by sampling the depth values corresponding to bins of the cost volume. The method includes determining loss values for the bins of the cost volume. The method includes training the depth model according to the loss values of the cost volume.
    Type: Application
    Filed: May 27, 2022
    Publication date: January 4, 2024
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
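A simplified reading of the claim — sampling predicted depths into uniform cost-volume bins and scoring each bin — might look like this. The uniform bin layout and per-bin L1 loss are assumptions, not the patented formulation:

```python
import numpy as np

def cost_volume_losses(pred, gt, d_min, d_max, n_bins):
    """Assign predicted depths to uniform cost-volume bins and compute a
    per-bin L1 loss against ground truth (NaN for empty bins)."""
    edges = np.linspace(d_min, d_max, n_bins + 1)
    idx = np.clip(np.digitize(pred, edges) - 1, 0, n_bins - 1)
    losses = np.full(n_bins, np.nan)
    for b in range(n_bins):
        m = idx == b
        if m.any():
            losses[b] = np.mean(np.abs(pred[m] - gt[m]))
    return losses

losses = cost_volume_losses(np.array([1.0, 2.5, 9.0]),
                            np.array([1.5, 2.0, 8.0]),
                            d_min=0.0, d_max=10.0, n_bins=2)
```

Training would then backpropagate these per-bin loss values through the depth model.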
  • Publication number: 20230386060
    Abstract: System, methods, and other embodiments described herein relate to an improved approach to training a depth model to derive depth estimates from monocular images using histograms to assess photometric losses. In one embodiment, a method includes determining loss values according to a photometric loss function. The loss values are associated with a depth map derived from an input image that is a monocular image. The method includes generating histograms for the loss values corresponding to different regions of a target image. The method includes, responsive to identifying erroneous values of the loss values, masking the erroneous values to avoid considering the erroneous values during training of the depth model.
    Type: Application
    Filed: May 27, 2022
    Publication date: November 30, 2023
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
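The histogram-based masking of erroneous photometric losses can be sketched as: build a histogram of per-pixel losses, locate the bin where the cumulative mass crosses a quantile, and mask everything above that edge. The specific cutoff criterion is an assumption:

```python
import numpy as np

def mask_outlier_losses(losses, n_bins=10, frac=0.9):
    """Histogram the per-pixel losses, find the bin edge where cumulative
    mass crosses `frac`, and zero out (mask) losses above it."""
    hist, edges = np.histogram(losses, bins=n_bins)
    cum = np.cumsum(hist) / losses.size
    k = int(np.searchsorted(cum, frac))      # first bin reaching the quantile
    cutoff = edges[min(k + 1, n_bins)]
    mask = losses <= cutoff
    return np.where(mask, losses, 0.0), mask

losses = np.array([0.1] * 9 + [100.0])       # one grossly erroneous pixel
masked, mask = mask_outlier_losses(losses)
```

The masked values are then simply excluded when the depth model's training loss is accumulated.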
  • Publication number: 20230386059
    Abstract: System, methods, and other embodiments described herein relate to an improved approach to training a depth model for monocular depth estimation by warping depth features prior to decoding. In one embodiment, a method includes encoding, using an encoder of a depth model, a source image into depth features of a scene depicted by the source image. The method includes warping the depth features into warped features of a target frame of a target image associated with the source image. The method includes decoding, using a decoder of the depth model, the warped features into a depth map. The method includes training the depth model according to a loss derived from the depth map.
    Type: Application
    Filed: May 27, 2022
    Publication date: November 30, 2023
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
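The "warp features, then decode" idea can be illustrated with an integer-pixel shift of a feature map. A real pipeline would warp using depth and camera motion; the fixed row shift here is an assumed stand-in:

```python
import numpy as np

def warp_features(feats, shift):
    """Warp an encoder feature map (rows = pixels) into the target frame by
    a known integer row shift; out-of-view rows are zero-filled."""
    out = np.zeros_like(feats)
    h = feats.shape[0]
    if shift >= 0:
        out[shift:] = feats[:h - shift]
    else:
        out[:h + shift] = feats[-shift:]
    return out

feats = np.array([[0.0], [1.0], [2.0], [3.0]])   # a 4-pixel, 1-channel "map"
warped = warp_features(feats, 1)                 # features in the target frame
```

The decoder then consumes `warped` rather than the source-frame features, so the predicted depth map is already aligned with the target image.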
  • Publication number: 20230377180
    Abstract: In accordance with one embodiment of the present disclosure, a method includes receiving a set of images, each image depicting a view of a scene, generating sparse depth data from each image of the set of images, training a monocular depth estimation model with the sparse depth data, generating, with the trained monocular depth estimation model, depth data and uncertainty data for each image, training a NeRF model with the set of images, wherein the training is constrained by the depth data and uncertainty data, and rendering, with the trained NeRF model, a new image having a new view of the scene.
    Type: Application
    Filed: May 18, 2022
    Publication date: November 23, 2023
    Applicant: Toyota Research Institute, Inc.
    Inventors: Rares Ambrus, Sergey Zakharov, Vitor C. Guizilini, Adrien Gaidon
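The depth- and uncertainty-constrained NeRF training can be illustrated by its supervision term: an uncertainty-weighted depth loss in which confident sparse depths constrain the rendered field more strongly. The inverse-variance weighting form is an assumption:

```python
import numpy as np

def depth_loss(rendered, sparse_depth, uncertainty):
    """Uncertainty-weighted depth supervision: low-uncertainty sparse depths
    pull the rendered depth harder than high-uncertainty ones."""
    w = 1.0 / (uncertainty ** 2 + 1e-8)      # inverse-variance weights
    return float(np.mean(w * (rendered - sparse_depth) ** 2))

loss = depth_loss(np.array([2.0, 2.0]),      # depths rendered by the NeRF
                  np.array([1.0, 3.0]),      # sparse monocular-model depths
                  np.array([1.0, 10.0]))     # per-point uncertainty
```

Here both residuals are equal, but the confident point dominates the loss, which is the intended constraining behavior.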
  • Patent number: 11809524
    Abstract: Systems and methods for training an adapter network that adapts a model pre-trained on synthetic images to real-world data are disclosed herein. A system may include a processor and a memory in communication with the processor and having machine-readable instructions that cause the processor to output, using a neural network, a predicted scene that includes a three-dimensional bounding box having pose information of an object, generate a rendered map of the object that includes a rendered shape of the object and a rendered surface normal of the object, and train the adapter network, which adapts the predicted scene to adjust for a deformation of the input image by comparing the rendered map to the output map acting as a ground truth.
    Type: Grant
    Filed: July 23, 2021
    Date of Patent: November 7, 2023
    Assignees: Woven Planet North America, Inc., Toyota Research Institute, Inc.
    Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon
  • Publication number: 20230326049
    Abstract: System, methods, and other embodiments described herein relate to training a depth model for monocular depth estimation using photometric loss masks derived from motion estimates of dynamic objects. In one embodiment, a method includes generating depth maps from images of an environment. The method includes determining motion of points within the depth maps. The method includes associating the points between the depth maps to identify an object according to a correlation of the motion for a first cluster of the points with a second cluster of the points. The method includes providing the depth maps and the object as an electronic output.
    Type: Application
    Filed: April 7, 2022
    Publication date: October 12, 2023
    Inventors: Rares A. Ambrus, Sergey Zakharov, Vitor Guizilini, Adrien David Gaidon
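Associating points between depth maps by correlated motion can be sketched as a greedy clustering of per-point motion vectors. The fixed threshold rule is an assumed simplification of the learned association:

```python
import numpy as np

def group_by_motion(points_t0, points_t1, thresh=0.1):
    """Greedily cluster points whose frame-to-frame motion vectors agree —
    a crude stand-in for associating dynamic-object points across depth maps."""
    flows = points_t1 - points_t0            # per-point motion estimates
    labels = -np.ones(len(flows), dtype=int)
    next_label = 0
    for i in range(len(flows)):
        if labels[i] >= 0:
            continue
        labels[i] = next_label               # seed a new cluster
        for j in range(i + 1, len(flows)):
            if labels[j] < 0 and np.linalg.norm(flows[j] - flows[i]) < thresh:
                labels[j] = next_label       # correlated motion: same object
        next_label += 1
    return labels

static = np.zeros((3, 2))
moving = np.array([[0.0, 0.0], [1.0, 1.0]])
t0 = np.vstack([static, moving])
t1 = np.vstack([static, moving + [1.0, 0.0]])   # the last two points translate
labels = group_by_motion(t0, t1)
```

Points sharing a label form a dynamic object, whose region can then be masked out of the photometric loss.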
  • Publication number: 20230169677
    Abstract: Various embodiments of the teachings herein include a computer implemented pose estimation method for providing poses of objects of interest in a scene. The scene comprises a visual representation of the objects of interest in an environment. The method comprising: conducting for each one of the objects of interest a pose estimation; determining edge data of the object of interest from the visual representation representing the edges of the respective object of interest; determining keypoints of the respective object of interest by a previously trained artificial neural keypoint detection network, wherein the artificial neural keypoint detection network utilizes the determined edge data of the respective object of interest Oi as input and provides the keypoints of the respective object of interest as output; and estimating the pose of the respective object of interest based on the respective object's keypoints provided by the artificial neural keypoint detection network.
    Type: Application
    Filed: April 30, 2021
    Publication date: June 1, 2023
    Applicant: Siemens Aktiengesellschaft
    Inventors: Slobodan Ilic, Roman Kaskman, Ivan Shugurov, Sergey Zakharov
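The edge-data-to-keypoints stage can be illustrated with gradient-magnitude edges and a strongest-response heuristic standing in for the trained keypoint detection network:

```python
import numpy as np

def edge_map(img):
    """Gradient-magnitude edges — a simple stand-in for the patent's
    edge data determined from the visual representation."""
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:] = img[:, 1:] - img[:, :-1]     # horizontal differences
    gy[1:, :] = img[1:, :] - img[:-1, :]     # vertical differences
    return np.hypot(gx, gy)

def top_keypoints(edges, k):
    """Return the k strongest edge responses as (row, col) keypoints — a
    heuristic stand-in for the keypoint detection network's output."""
    idx = np.argsort(edges.ravel())[::-1][:k]
    return np.stack(np.unravel_index(idx, edges.shape), axis=1)

img = np.zeros((5, 5))
img[2, 2] = 1.0                              # a single bright blob
keypoints = top_keypoints(edge_map(img), 1)
```

The detected keypoints would then feed a pose solver, as in the abstract's final step.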
  • Publication number: 20230154145
    Abstract: In accordance with one embodiment of the present disclosure, a method includes receiving an input image having an object and a background, intrinsically decomposing the object and the background into an input image data having a set of features, augmenting the input image data with a 2.5D differentiable renderer for each feature of the set of features to create a set of augmented images, and compiling the input image and the set of augmented images into a training data set for training a downstream task network.
    Type: Application
    Filed: January 19, 2022
    Publication date: May 18, 2023
    Applicant: Toyota Research Institute, Inc.
    Inventors: Sergey Zakharov, Rares Ambrus, Vitor Guizilini, Adrien Gaidon
  • Publication number: 20230154024
    Abstract: A system for producing a depth map can include a processor and a memory. The memory can store a neural network. The neural network can include an encoding portion module, a multi-frame feature matching portion module, and a decoding portion module. The encoding portion module can include instructions that, when executed by the processor, cause the processor to encode an image to produce single-frame features. The multi-frame feature matching portion module can include instructions that, when executed by the processor, cause the processor to process the single-frame features to produce information. The decoding portion module can include instructions that, when executed by the processor, cause the processor to decode the information to produce the depth map. A first training dataset, used to train the multi-frame feature matching portion module, can be different from a second training dataset used to train the encoding portion module and the decoding portion module.
    Type: Application
    Filed: August 2, 2022
    Publication date: May 18, 2023
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Dian Chen, Adrien David Gaidon, Sergey Zakharov
  • Publication number: 20230154038
    Abstract: A system for producing a depth map can include a processor and a memory. The memory can store a candidate depth production module and a depth map production module. The candidate depth production module can include instructions that cause the processor to: (1) identify, in a first image, an epipolar line associated with a pixel in a second image and (2) sample, from a first image feature set, a set of candidate depths for pixels along the epipolar line. The depth map production module can include instructions that cause the processor to: (1) determine a similarity measure between a feature, from a second image feature set, and a member of the set and (2) produce, from the second image, the depth map with a depth for the pixel being a depth associated with a member, of the set, having a greatest similarity measure.
    Type: Application
    Filed: August 2, 2022
    Publication date: May 18, 2023
    Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
    Inventors: Vitor Guizilini, Rares A. Ambrus, Dian Chen, Adrien David Gaidon, Sergey Zakharov
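The candidate-depth selection can be sketched directly from the claim: compare the query pixel's feature against features sampled along the epipolar line and keep the depth of the best match. Dot-product similarity is an assumption; the claim only requires some similarity measure:

```python
import numpy as np

def best_depth(query_feat, line_feats, line_depths):
    """Pick the candidate depth along the epipolar line whose feature is
    most similar (dot product here) to the query pixel's feature."""
    sims = line_feats @ query_feat           # similarity per candidate
    return line_depths[int(np.argmax(sims))]

query = np.array([1.0, 0.0])                     # feature of the target pixel
candidates = np.array([[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]])
depths = np.array([1.0, 2.0, 3.0])               # depth hypothesis per sample
d = best_depth(query, candidates, depths)
```

Repeating this per pixel yields the depth map, with each depth taken from the greatest-similarity candidate.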
  • Publication number: 20220414974
    Abstract: Systems and methods described herein relate to reconstructing a scene in three dimensions from a two-dimensional image. One embodiment processes an image using a detection transformer to detect an object in the scene and to generate a NOCS map of the object and a background depth map; uses MLPs to relate the object to a differentiable database of object priors (PriorDB); recovers, from the NOCS map, a partial 3D object shape; estimates an initial object pose; fits a PriorDB object prior to align in geometry and appearance with the partial 3D shape to produce a complete shape and refines the initial pose estimate; generates an editable and re-renderable 3D scene reconstruction based, at least in part, on the complete shape, the refined pose estimate, and the depth map; and controls the operation of a robot based, at least in part, on the editable and re-renderable 3D scene reconstruction.
    Type: Application
    Filed: March 16, 2022
    Publication date: December 29, 2022
    Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
    Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus, Dennis Park, Joshua Tenenbaum, Jiajun Wu, Fredo Durand, Vincent Sitzmann
  • Publication number: 20220397567
    Abstract: The method for increasing contractility in patients with systolic heart failure involves screening for candidate small molecules which block the interaction between Rad and the plasma membrane and/or block the interaction between Rad and the CaV1.2/CaV?2 complex, or between Rad and CaV?2, in order to increase cardiac contractility. A method for preventing calcium overload and arrhythmias in heart disease involves preventing the dissociation of Rad and the CaV1.2/CaV?2 complex, or between Rad and CaV?2, during beta-adrenergic system activation. Additionally, a method of screening for drugs that block interaction between an RGK GTPase protein and a ?-subunit of the calcium channel is provided. A suitable technique, such as fluorescence resonance energy transfer (FRET), may be used to assess blocking of the interaction between the RGK GTPase protein and the ?-subunit of the calcium channel for the treatment of heart disease, pain, diabetes, skeletal muscle disorders and/or central nervous system (CNS) disorders.
    Type: Application
    Filed: July 2, 2020
    Publication date: December 15, 2022
    Inventors: Steven O. MARX, Alexander KUSHNIR, Sergey ZAKHAROV, Alexander KATCHMAN, Steven P. GYGI, Marian KALOCSAY, Manu BEN-JOHNY, Henry M. COLECRAFT, Guoxia LIU
  • Patent number: 11482014
    Abstract: A method for 3D auto-labeling of objects with predetermined structural and physical constraints includes identifying initial object-seeds for all frames from a given frame sequence of a scene. The method also includes refining each of the initial object-seeds over the 2D/3D data, while complying with the predetermined structural and physical constraints to auto-label 3D object vehicles within the scene. The method further includes linking the auto-label 3D object vehicles over time into trajectories while respecting the predetermined structural and physical constraints.
    Type: Grant
    Filed: September 18, 2020
    Date of Patent: October 25, 2022
    Assignee: TOYOTA RESEARCH INSTITUTE, INC.
    Inventors: Wadim Kehl, Sergey Zakharov, Adrien David Gaidon