Patents by Inventor Sergey Zakharov
Sergey Zakharov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240135721
Abstract: A method for improving 3D object detection via object-level augmentations is described. The method includes recognizing, using an image recognition model of a differentiable data generation pipeline, an object in an image of a scene. The method also includes generating, using a 3D reconstruction model, a 3D reconstruction of the scene from the image including the recognized object. The method further includes manipulating, using an object level augmentation model, a random property of the object by a random magnitude at an object level to determine a set of properties and a set of magnitudes of an object manipulation that maximizes a loss function of the image recognition model. The method also includes training a downstream task network based on a set of training data generated based on the set of properties and the set of magnitudes of the object manipulation, such that the loss function is minimized.
Type: Application
Filed: October 12, 2022
Publication date: April 25, 2024
Applicants: TOYOTA RESEARCH INSTITUTE, INC., TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: Rares Andrei AMBRUS, Sergey ZAKHAROV, Vitor GUIZILINI, Adrien David GAIDON
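The adversarial search this abstract describes — sample random object-level manipulations and keep the one that maximizes the recognition loss — can be sketched as follows. The property list, the weights, and the `recognition_loss` stand-in are all hypothetical; a real pipeline would re-render the scene and evaluate the actual recognition model:

```python
import random

# Hypothetical object-level properties a manipulation could change.
PROPERTIES = ["scale", "rotation", "translation", "texture"]

def recognition_loss(prop: str, magnitude: float) -> float:
    # Placeholder loss: a real system would re-render the scene with the
    # manipulated object and score the image recognition model on it.
    weights = {"scale": 1.0, "rotation": 0.5, "translation": 0.8, "texture": 0.3}
    return weights[prop] * abs(magnitude)

def hardest_manipulation(trials: int = 100, seed: int = 0):
    """Random search for the (property, magnitude) pair maximizing the loss."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        prop = rng.choice(PROPERTIES)
        mag = rng.uniform(-1.0, 1.0)
        loss = recognition_loss(prop, mag)
        if best is None or loss > best[2]:
            best = (prop, mag, loss)
    return best
```

The selected manipulation would then drive generation of training data on which the downstream task network is trained to drive that loss back down.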
-
Publication number: 20240104774
Abstract: Various embodiments include a pose estimation method for refining an initial multi-dimensional pose of an object of interest to generate a refined multi-dimensional object pose Tpr(NL) with NL ≥ 1. The method may include: providing the initial object pose Tpr(0) and at least one 2D-3D-correspondence map ?pri with i = 1, ..., I and I ≥ 1; and estimating the refined object pose Tpr(NL) using an iterative optimization procedure of a loss according to a given loss function LF(k) based on discrepancies between the one or more provided 2D-3D-correspondence maps ?pri and one or more respective rendered 2D-3D-correspondence maps ?rendk,i.
Type: Application
Filed: December 9, 2021
Publication date: March 28, 2024
Applicant: Siemens Aktiengesellschaft
Inventors: Slobodan Ilic, Ivan Shugurov, Sergey Zakharov, Ivan Pavlov
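The iterative optimization described here can be illustrated with a toy version in which the "pose" is a 2D translation, the correspondence maps are point sets, and the loss is the mean squared discrepancy. This is a minimal sketch under those assumptions, not the patented procedure:

```python
import numpy as np

def refine_translation(points, observed, pose0, iters=300, lr=0.1):
    """Gradient-descend a 2D translation so rendered points match observations."""
    pose = np.asarray(pose0, dtype=float)
    for _ in range(iters):
        rendered = points + pose            # "render" under the current pose
        residual = rendered - observed      # per-point discrepancy
        grad = 2.0 * residual.mean(axis=0)  # gradient of the mean-squared loss
        pose = pose - lr * grad             # one refinement iteration
    return pose
```

Starting from a zero pose and observations shifted by an unknown offset, the loop recovers the offset; the real method iterates an analogous loss over rendered versus provided 2D-3D-correspondence maps.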
-
Patent number: 11915451
Abstract: A method and a system for object detection and pose estimation within an input image. A 6-degree-of-freedom object detection and pose estimation is performed using a trained encoder-decoder convolutional artificial neural network including an encoder head, an ID mask decoder head, a first correspondence color channel decoder head and a second correspondence color channel decoder head. The ID mask decoder head creates an ID mask for identifying objects, and the color channel decoder heads are used to create a 2D-to-3D-correspondence map. For at least one object identified by the ID mask, a pose estimation based on the generated 2D-to-3D-correspondence map and on a pre-generated bijective association of points of the object with unique value combinations in the first and the second correspondence color channels is generated.
Type: Grant
Filed: January 17, 2020
Date of Patent: February 27, 2024
Assignee: Siemens Aktiengesellschaft
Inventors: Ivan Shugurov, Andreas Hutter, Sergey Zakharov, Slobodan Ilic
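The bijective association of object points with unique color-channel value combinations can be illustrated with a small sketch. The encoding scheme below is hypothetical; in the patented system the per-pixel colors come from the two decoder heads rather than a hand-built table:

```python
def build_association(model_points):
    """Bijectively assign a unique two-channel color pair to each 3D point."""
    # i % 256 fills the first channel, i // 256 the second, so up to
    # 256 * 256 points get distinct (c1, c2) pairs.
    return {(i % 256, i // 256): tuple(p) for i, p in enumerate(model_points)}

def correspondences(decoded_colors, association):
    """Map decoded pixel colors back to 3D model points (a 2D-to-3D map)."""
    return [association[c] for c in decoded_colors if c in association]
```

The recovered 2D-to-3D correspondences would then feed a pose solver (e.g. a PnP-style estimator) for the identified object.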
-
Patent number: 11887248
Abstract: Systems and methods described herein relate to reconstructing a scene in three dimensions from a two-dimensional image. One embodiment processes an image using a detection transformer to detect an object in the scene and to generate a NOCS map of the object and a background depth map; uses MLPs to relate the object to a differentiable database of object priors (PriorDB); recovers, from the NOCS map, a partial 3D object shape; estimates an initial object pose; fits a PriorDB object prior to align in geometry and appearance with the partial 3D shape to produce a complete shape and refines the initial pose estimate; generates an editable and re-renderable 3D scene reconstruction based, at least in part, on the complete shape, the refined pose estimate, and the depth map; and controls the operation of a robot based, at least in part, on the editable and re-renderable 3D scene reconstruction.
Type: Grant
Filed: March 16, 2022
Date of Patent: January 30, 2024
Assignees: Toyota Research Institute, Inc., Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus, Dennis Park, Joshua Tenenbaum, Jiajun Wu, Fredo Durand, Vincent Sitzmann
-
Publication number: 20240028792
Abstract: The disclosure provides implicit representations for multi-object 3D shape, 6D pose and size, and appearance optimization, including obtaining shape, 6D pose and size, and appearance codes. Training is employed using shape and appearance priors from an implicit joint differential database. 2D masks are also obtained and are used in an optimization process that utilizes a combined loss minimizing function and an Octree-based coarse-to-fine differentiable optimization to jointly optimize the latest shape, appearance, pose and size, and 2D masks. An object surface is recovered from the latest shape codes to a desired resolution level. The database represents shapes as Signed Distance Fields (SDF), and appearance as Texture Fields (TF).
Type: Application
Filed: July 19, 2022
Publication date: January 25, 2024
Applicants: TOYOTA RESEARCH INSTITUTE, INC., TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: MUHAMMAD ZUBAIR IRSHAD, Sergey Zakharov, Rares A. Ambrus, Adrien D. Gaidon
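A Signed Distance Field, the shape representation the database above relies on, is easiest to see for the simplest case of a sphere: the field is negative inside the surface, zero on it, and positive outside. This is a generic illustration of the representation, not code from the disclosure:

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return np.linalg.norm(np.asarray(points) - center, axis=-1) - radius
```

A surface can be recovered from such a field by locating its zero level set, which is what coarse-to-fine (e.g. Octree-based) extraction schemes refine to the desired resolution.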
-
Publication number: 20240013409
Abstract: A method for multiple object tracking includes receiving, with a computing device, a point cloud dataset, detecting one or more objects in the point cloud dataset, each of the detected one or more objects defined by points of the point cloud dataset and a bounding box, querying one or more historical tracklets for historical tracklet states corresponding to each of the one or more detected objects, implementing a 4D encoding backbone comprising two branches: a first branch configured to compute per-point features for each of the one or more objects and the corresponding historical tracklet states, and a second branch configured to obtain 4D point features, concatenating the per-point features and the 4D point features, and predicting, with a decoder receiving the concatenated per-point features, current tracklet states for each of the one or more objects.
Type: Application
Filed: May 26, 2023
Publication date: January 11, 2024
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha, The Board of Trustees of the Leland Stanford Junior University
Inventors: Colton Stearns, Jie Li, Rares A. Ambrus, Vitor Campagnolo Guizilini, Sergey Zakharov, Adrien D. Gaidon, Davis Rempe, Tolga Birdal, Leonidas J. Guibas
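The fusion step — concatenating the two branch outputs along the feature axis before decoding — can be shown abstractly; the toy linear "decoder" below is a stand-in for the real network, and the shapes are illustrative:

```python
import numpy as np

def fuse_and_decode(per_point, four_d, weights):
    """Concatenate the two branch outputs, then decode a tracklet state."""
    assert per_point.shape[0] == four_d.shape[0]   # same number of points
    fused = np.concatenate([per_point, four_d], axis=1)
    return fused.mean(axis=0) @ weights            # toy linear "decoder"
```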
-
Publication number: 20240005627
Abstract: A method of conditional neural ground planes for static-dynamic disentanglement is described. The method includes extracting, using a convolutional neural network (CNN), CNN image features from an image to form a feature tensor. The method also includes resampling unprojected 2D features of the feature tensor to form feature pillars. The method further includes aggregating the feature pillars to form an entangled neural ground plane. The method also includes decomposing the entangled neural ground plane into a static neural ground plane and a dynamic neural ground plane.
Type: Application
Filed: April 18, 2023
Publication date: January 4, 2024
Applicants: TOYOTA RESEARCH INSTITUTE, INC., TOYOTA JIDOSHA KABUSHIKI KAISHA, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Inventors: Prafull SHARMA, Ayush TEWARI, Yilun DU, Sergey ZAKHAROV, Rares Andrei AMBRUS, Adrien David GAIDON, William Tafel FREEMAN, Frederic Pierre DURAND, Joshua B. TENENBAUM, Vincent SITZMANN
-
Publication number: 20240005540
Abstract: System, methods, and other embodiments described herein relate to an improved approach to training a depth model to derive depth estimates from monocular images using cost volumes. In one embodiment, a method includes predicting, using a depth model, depth values from at least one input image that is a monocular image. The method includes generating a cost volume by sampling the depth values corresponding to bins of the cost volume. The method includes determining loss values for the bins of the cost volume. The method includes training the depth model according to the loss values of the cost volume.
Type: Application
Filed: May 27, 2022
Publication date: January 4, 2024
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
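Assigning predicted depths to bins and computing a per-bin loss might look like the following simplified sketch; the bin edges and the L1 loss are illustrative choices, not values from the application:

```python
import numpy as np

def cost_volume_losses(pred_depths, bin_edges, target_depths):
    """Per-bin L1 loss between predicted and target depth values."""
    pred = np.asarray(pred_depths, float)
    target = np.asarray(target_depths, float)
    idx = np.digitize(pred, bin_edges) - 1          # bin index per prediction
    losses = np.full(len(bin_edges) - 1, np.nan)    # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        sel = idx == b
        if sel.any():
            losses[b] = np.abs(pred[sel] - target[sel]).mean()
    return losses
```

Training would then aggregate these per-bin losses into the objective used to update the depth model.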
-
Publication number: 20230386060
Abstract: System, methods, and other embodiments described herein relate to an improved approach to training a depth model to derive depth estimates from monocular images using histograms to assess photometric losses. In one embodiment, a method includes determining loss values according to a photometric loss function. The loss values are associated with a depth map derived from an input image that is a monocular image. The method includes generating histograms for the loss values corresponding to different regions of a target image. The method includes, responsive to identifying erroneous values of the loss values, masking the erroneous values to avoid considering the erroneous values during training of the depth model.
Type: Application
Filed: May 27, 2022
Publication date: November 30, 2023
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
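Histogram-based masking of extreme photometric-loss values can be sketched as follows; the bin count and keep fraction are hypothetical parameters, and a real system would apply this per image region:

```python
import numpy as np

def mask_erroneous_losses(loss_map, num_bins=20, keep_fraction=0.95):
    """Histogram the per-pixel losses and mask the extreme tail as erroneous."""
    loss_map = np.asarray(loss_map, float)
    hist, edges = np.histogram(loss_map, bins=num_bins)
    # Smallest bin edge below which `keep_fraction` of the pixels fall.
    cdf = np.cumsum(hist) / hist.sum()
    cut = int(np.searchsorted(cdf, keep_fraction))
    threshold = edges[min(cut + 1, num_bins)]
    mask = loss_map <= threshold              # True where the loss is kept
    return np.where(mask, loss_map, 0.0), mask
```

Masked pixels contribute zero loss, so the erroneous values are ignored when the depth model is trained.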
-
Publication number: 20230386059
Abstract: System, methods, and other embodiments described herein relate to an improved approach to training a depth model for monocular depth estimation by warping depth features prior to decoding. In one embodiment, a method includes encoding, using an encoder of a depth model, a source image into depth features of a scene depicted by the source image. The method includes warping the depth features into warped features of a target frame of a target image associated with the source image. The method includes decoding, using a decoder of the depth model, the warped features into a depth map. The method includes training the depth model according to a loss derived from the depth map.
Type: Application
Filed: May 27, 2022
Publication date: November 30, 2023
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Sergey Zakharov
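The warping step is sketched below with a plain integer pixel offset applied to an (H, W, C) feature map; an actual implementation would warp by camera geometry with sub-pixel (e.g. bilinear) sampling rather than a wrap-around shift:

```python
import numpy as np

def warp_features(feats, offset):
    """Warp an (H, W, C) feature map by an integer (dy, dx) offset."""
    dy, dx = offset
    # np.roll wraps around the borders; real warping would mask them instead.
    return np.roll(np.roll(feats, dy, axis=0), dx, axis=1)
```

Applying the inverse offset recovers the original map, which is the property that lets the decoder operate in the target frame.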
-
Publication number: 20230377180
Abstract: In accordance with one embodiment of the present disclosure, a method includes receiving a set of images, each image depicting a view of a scene, generating sparse depth data from each image of the set of images, training a monocular depth estimation model with the sparse depth data, generating, with the trained monocular depth estimation model, depth data and uncertainty data for each image, training a NeRF model with the set of images, wherein the training is constrained by the depth data and uncertainty data, and rendering, with the trained NeRF model, a new image having a new view of the scene.
Type: Application
Filed: May 18, 2022
Publication date: November 23, 2023
Applicant: Toyota Research Institute Inc.
Inventors: Rares Ambrus, Sergey Zakharov, Vitor C. Guizilini, Adrien Gaidon
-
Patent number: 11809524
Abstract: Systems and methods for training an adapter network that adapts a model pre-trained on synthetic images to real-world data are disclosed herein. A system may include a processor and a memory in communication with the processor and having machine-readable instructions that cause the processor to output, using a neural network, a predicted scene that includes a three-dimensional bounding box having pose information of an object, generate a rendered map of the object that includes a rendered shape of the object and a rendered surface normal of the object, and train the adapter network, which adapts the predicted scene to adjust for a deformation of the input image by comparing the rendered map to the output map acting as a ground truth.
Type: Grant
Filed: July 23, 2021
Date of Patent: November 7, 2023
Assignees: Woven Planet North America, Inc., Toyota Research Institute, Inc.
Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon
-
Publication number: 20230326049
Abstract: System, methods, and other embodiments described herein relate to training a depth model for monocular depth estimation using photometric loss masks derived from motion estimates of dynamic objects. In one embodiment, a method includes generating depth maps from images of an environment. The method includes determining motion of points within the depth maps. The method includes associating the points between the depth maps to identify an object according to a correlation of the motion for a first cluster of the points with a second cluster of the points. The method includes providing the depth maps and the object as an electronic output.
Type: Application
Filed: April 7, 2022
Publication date: October 12, 2023
Inventors: Rares A. Ambrus, Sergey Zakharov, Vitor Guizilini, Adrien David Gaidon
-
Publication number: 20230169677
Abstract: Various embodiments of the teachings herein include a computer implemented pose estimation method for providing poses of objects of interest in a scene. The scene comprises a visual representation of the objects of interest in an environment. The method comprising: conducting for each one of the objects of interest a pose estimation; determining edge data of the object of interest from the visual representation representing the edges of the respective object of interest; determining keypoints of the respective object of interest by a previously trained artificial neural keypoint detection network, wherein the artificial neural keypoint detection network utilizes the determined edge data of the respective object of interest Oi as input and provides the keypoints of the respective object of interest as output; and estimating the pose of the respective object of interest based on the respective object's keypoints provided by the artificial neural keypoint detection network.
Type: Application
Filed: April 30, 2021
Publication date: June 1, 2023
Applicant: Siemens Aktiengesellschaft
Inventors: Slobodan Ilic, Roman Kaskman, Ivan Shugurov, Sergey Zakharov
-
Publication number: 20230154145
Abstract: In accordance with one embodiment of the present disclosure, a method includes receiving an input image having an object and a background, intrinsically decomposing the object and the background into an input image data having a set of features, augmenting the input image data with a 2.5D differentiable renderer for each feature of the set of features to create a set of augmented images, and compiling the input image and the set of augmented images into a training data set for training a downstream task network.
Type: Application
Filed: January 19, 2022
Publication date: May 18, 2023
Applicant: Toyota Research Institute, Inc.
Inventors: Sergey Zakharov, Rares Ambrus, Vitor Guizilini, Adrien Gaidon
-
Publication number: 20230154024
Abstract: A system for producing a depth map can include a processor and a memory. The memory can store a neural network. The neural network can include an encoding portion module, a multi-frame feature matching portion module, and a decoding portion module. The encoding portion module can include instructions that, when executed by the processor, cause the processor to encode an image to produce single-frame features. The multi-frame feature matching portion module can include instructions that, when executed by the processor, cause the processor to process the single-frame features to produce information. The decoding portion module can include instructions that, when executed by the processor, cause the processor to decode the information to produce the depth map. A first training dataset, used to train the multi-frame feature matching portion module, can be different from a second training dataset used to train the encoding portion module and the decoding portion module.
Type: Application
Filed: August 2, 2022
Publication date: May 18, 2023
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Dian Chen, Adrien David Gaidon, Sergey Zakharov
-
Publication number: 20230154038
Abstract: A system for producing a depth map can include a processor and a memory. The memory can store a candidate depth production module and a depth map production module. The candidate depth production module can include instructions that cause the processor to: (1) identify, in a first image, an epipolar line associated with a pixel in a second image and (2) sample, from a first image feature set, a set of candidate depths for pixels along the epipolar line. The depth map production module can include instructions that cause the processor to: (1) determine a similarity measure between a feature, from a second image feature set, and a member of the set and (2) produce, from the second image, the depth map with a depth for the pixel being a depth associated with a member, of the set, having a greatest similarity measure.
Type: Application
Filed: August 2, 2022
Publication date: May 18, 2023
Applicants: Toyota Research Institute, Inc., Toyota Jidosha Kabushiki Kaisha
Inventors: Vitor Guizilini, Rares A. Ambrus, Dian Chen, Adrien David Gaidon, Sergey Zakharov
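Choosing the candidate along the epipolar line with the greatest similarity can be illustrated with cosine similarity over feature vectors; the similarity measure and the shapes here are assumptions for the sketch:

```python
import numpy as np

def best_depth(query_feat, line_feats, line_depths):
    """Return the depth of the epipolar candidate most similar to the query."""
    q = query_feat / np.linalg.norm(query_feat)
    f = line_feats / np.linalg.norm(line_feats, axis=1, keepdims=True)
    sims = f @ q                       # cosine similarity per candidate
    return line_depths[int(np.argmax(sims))]
```

Repeating this per pixel of the second image yields the depth map described in the abstract.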
-
Publication number: 20220414974
Abstract: Systems and methods described herein relate to reconstructing a scene in three dimensions from a two-dimensional image. One embodiment processes an image using a detection transformer to detect an object in the scene and to generate a NOCS map of the object and a background depth map; uses MLPs to relate the object to a differentiable database of object priors (PriorDB); recovers, from the NOCS map, a partial 3D object shape; estimates an initial object pose; fits a PriorDB object prior to align in geometry and appearance with the partial 3D shape to produce a complete shape and refines the initial pose estimate; generates an editable and re-renderable 3D scene reconstruction based, at least in part, on the complete shape, the refined pose estimate, and the depth map; and controls the operation of a robot based, at least in part, on the editable and re-renderable 3D scene reconstruction.
Type: Application
Filed: March 16, 2022
Publication date: December 29, 2022
Applicants: Toyota Research Institute, Inc., Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
Inventors: Sergey Zakharov, Wadim Kehl, Vitor Guizilini, Adrien David Gaidon, Rares A. Ambrus, Dennis Park, Joshua Tenenbaum, Jiajun Wu, Fredo Durand, Vincent Sitzmann
-
Publication number: 20220397567
Abstract: The method for increasing contractility in patients with systolic heart failure involves screening for candidate small molecules which block the interaction between Rad and the plasma membrane and/or block the interaction between Rad and the CaV1.2/CaVβ2 complex, or between Rad and CaVβ2, in order to increase cardiac contractility. A method for preventing calcium overload and arrhythmias in heart disease involves preventing the dissociation of Rad and the CaV1.2/CaVβ2 complex, or between Rad and CaVβ2, during beta-adrenergic system activation. Additionally, a method of screening for drugs that block interaction between an RGK GTPase protein and a β-subunit of the calcium channel is provided. A suitable technique, such as fluorescence resonance energy transfer (FRET), may be used to assess blocking of the interaction between the RGK GTPase protein and the β-subunit of the calcium channel for the treatment of heart disease, pain, diabetes, skeletal muscle disorders and/or central nervous system (CNS) disorders.
Type: Application
Filed: July 2, 2020
Publication date: December 15, 2022
Inventors: Steven O. MARX, Alexander KUSHNIR, Sergey ZAKHAROV, Alexander KATCHMAN, Steven P. GYGI, Marian KALOCSAY, Manu BEN-JOHNY, Henry M. COLECRAFT, Guoxia LIU
-
Patent number: 11482014
Abstract: A method for 3D auto-labeling of objects with predetermined structural and physical constraints includes identifying initial object-seeds for all frames from a given frame sequence of a scene. The method also includes refining each of the initial object-seeds over the 2D/3D data, while complying with the predetermined structural and physical constraints to auto-label 3D object vehicles within the scene. The method further includes linking the auto-labeled 3D object vehicles over time into trajectories while respecting the predetermined structural and physical constraints.
Type: Grant
Filed: September 18, 2020
Date of Patent: October 25, 2022
Assignee: TOYOTA RESEARCH INSTITUTE, INC.
Inventors: Wadim Kehl, Sergey Zakharov, Adrien David Gaidon
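The final linking step — chaining per-frame labels over time into trajectories — can be sketched as greedy nearest-neighbor association, with a distance threshold standing in for the structural and physical constraints; this is an illustrative simplification, not the patented method:

```python
import numpy as np

def link_tracks(frames, max_dist=1.0):
    """Greedily link per-frame detections into trajectories by proximity."""
    tracks = [[np.asarray(d, float)] for d in frames[0]]
    for dets in frames[1:]:
        remaining = [np.asarray(d, float) for d in dets]
        for track in tracks:
            if not remaining:
                break
            dists = [np.linalg.norm(d - track[-1]) for d in remaining]
            j = int(np.argmin(dists))
            if dists[j] <= max_dist:       # a stand-in physical constraint
                track.append(remaining.pop(j))
    return tracks
```

Two well-separated objects observed over three frames come out as two three-point trajectories, each detection claimed by exactly one track.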