Patents by Inventor Kyle Olszewski
Kyle Olszewski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20260162367
Abstract: A system and method are described for generating 3D garments from two-dimensional (2D) scribble images drawn by users. The system includes a conditional 2D generator, a conditional 3D generator, and two intermediate media including dimension-coupling color-density pairs and flat point clouds that bridge the gap between dimensions. Given a scribble image, the 2D generator synthesizes dimension-coupling color-density pairs including the RGB projection and density map from the front and rear views of the scribble image. A density-aware sampling algorithm converts the 2D dimension-coupling color-density pairs into a 3D flat point cloud representation, where the depth information is ignored. The 3D generator predicts the depth information from the flat point cloud. Dynamic variations per garment due to deformations resulting from a wearer's pose as well as irregular wrinkles and folds may be bypassed by taking advantage of 2D generative models to bridge the dimension gap in a non-parametric way.
Type: Application
Filed: April 15, 2025
Publication date: June 11, 2026
Inventors: Panagiotis Achlioptas, Menglei Chai, Hsin-Ying Lee, Kyle Olszewski, Jian Ren, Sergey Tulyakov
-
Publication number: 20260057606
Abstract: Systems and methods for generating static and articulated 3D assets are provided that include a 3D autodecoder at their core. The 3D autodecoder framework embeds properties learned from the target dataset in the latent space, which can then be decoded into a volumetric representation for rendering view-consistent appearance and geometry. The appropriate intermediate volumetric latent space is then identified and robust normalization and de-normalization operations are implemented to learn a 3D diffusion from 2D images or monocular videos of rigid or articulated objects. The methods are flexible enough to use either existing camera supervision or no camera information at all—instead efficiently learning the camera information during training.
Type: Application
Filed: October 30, 2025
Publication date: February 26, 2026
Inventors: Evangelos Ntavelis, Kyle Olszewski, Aliaksandr Siarohin, Sergey Tulyakov
-
Patent number: 12494013
Abstract: Systems and methods for generating static and articulated 3D assets are provided that include a 3D autodecoder at their core. The 3D autodecoder framework embeds properties learned from the target dataset in the latent space, which can then be decoded into a volumetric representation for rendering view-consistent appearance and geometry. The appropriate intermediate volumetric latent space is then identified and robust normalization and de-normalization operations are implemented to learn a 3D diffusion from 2D images or monocular videos of rigid or articulated objects. The methods are flexible enough to use either existing camera supervision or no camera information at all—instead efficiently learning the camera information during training.
Type: Grant
Filed: June 16, 2023
Date of Patent: December 9, 2025
Assignee: Snap Inc.
Inventors: Evangelos Ntavelis, Kyle Olszewski, Aliaksandr Siarohin, Sergey Tulyakov
-
Publication number: 20250356569
Abstract: Unsupervised volumetric 3D animation (UVA) of non-rigid deformable objects without annotations learns the 3D structure and dynamics of objects solely from single-view red/green/blue (RGB) videos and decomposes the single-view RGB videos into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework, paired with a keypoint estimator via a differentiable perspective-n-point (PnP) algorithm, the UVA model learns the underlying object 3D geometry and parts decomposition in an entirely unsupervised manner from still or video images. This allows the UVA model to perform 3D segmentation, 3D keypoint estimation, novel view synthesis, and animation. The UVA model can obtain animatable 3D objects from a single or a few images. The UVA method also features a space in which all objects are represented in their canonical, animation-ready form. Applications include the creation of lenses from images or videos for social media applications.
Type: Application
Filed: July 29, 2025
Publication date: November 20, 2025
Inventors: Menglei Chai, Hsin-Ying Lee, Willi Menapace, Kyle Olszewski, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Sergey Tulyakov
-
Publication number: 20250330679
Abstract: A multimodal video generation framework (MMVID) that benefits from text and images provided jointly or separately as input. Quantized representations of videos are utilized with a bidirectional transformer with multiple modalities as inputs to predict a discrete video representation. A new video token trained with self-learning and an improved mask-prediction algorithm for sampling video tokens is used to improve video quality and consistency. Text augmentation is utilized to improve the robustness of the textual representation and diversity of generated videos. The framework incorporates various visual modalities, such as segmentation masks, drawings, and partially occluded images. In addition, the MMVID extracts visual information as suggested by a textual prompt.
Type: Application
Filed: June 27, 2025
Publication date: October 23, 2025
Inventors: Francesco Barbieri, Ligong Han, Hsin-Ying Lee, Shervin Minaee, Kyle Olszewski, Jian Ren, Sergey Tulyakov
-
Patent number: 12450822
Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
Type: Grant
Filed: April 24, 2024
Date of Patent: October 21, 2025
Assignee: Snap Inc.
Inventors: Zeng Huang, Jian Ren, Sergey Tulyakov, Menglei Chai, Kyle Olszewski, Huan Wang
-
Publication number: 20250322605
Abstract: A system to enable 3D hair reconstruction and rendering from a single reference image which performs a multi-stage process that utilizes both a 3D implicit representation and a 2D parametric embedding space.
Type: Application
Filed: June 24, 2025
Publication date: October 16, 2025
Inventors: Zeng Huang, Menglei Chai, Sergey Tulyakov, Kyle Olszewski, Hsin-Ying Lee
-
Patent number: 12400388
Abstract: Unsupervised volumetric 3D animation (UVA) of non-rigid deformable objects without annotations learns the 3D structure and dynamics of objects solely from single-view red/green/blue (RGB) videos and decomposes the single-view RGB videos into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework, paired with a keypoint estimator via a differentiable perspective-n-point (PnP) algorithm, the UVA model learns the underlying object 3D geometry and parts decomposition in an entirely unsupervised manner from still or video images. This allows the UVA model to perform 3D segmentation, 3D keypoint estimation, novel view synthesis, and animation. The UVA model can obtain animatable 3D objects from a single or a few images. The UVA method also features a space in which all objects are represented in their canonical, animation-ready form. Applications include the creation of lenses from images or videos for social media applications.
Type: Grant
Filed: December 28, 2022
Date of Patent: August 26, 2025
Assignee: Snap Inc.
Inventors: Menglei Chai, Hsin-Ying Lee, Willi Menapace, Kyle Olszewski, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Sergey Tulyakov
-
Publication number: 20250252660
Abstract: Three-dimensional object representation and re-rendering systems and methods for producing a 3D representation of an object from 2D images including the object that enables object-centric rendering. A modular approach is used that optimizes a Neural Radiance Field (NeRF) model to estimate object geometry and refine camera parameters and, then, infer surface material properties and per-image lighting conditions that fit the 2D images.
Type: Application
Filed: April 28, 2025
Publication date: August 7, 2025
Inventors: Kyle Olszewski, Sergey Tulyakov, Zhengfei Kuang, Menglei Chai
-
Patent number: 12375766
Abstract: A multimodal video generation framework (MMVID) that benefits from text and images provided jointly or separately as input. Quantized representations of videos are utilized with a bidirectional transformer with multiple modalities as inputs to predict a discrete video representation. A new video token trained with self-learning and an improved mask-prediction algorithm for sampling video tokens is used to improve video quality and consistency. Text augmentation is utilized to improve the robustness of the textual representation and diversity of generated videos. The framework incorporates various visual modalities, such as segmentation masks, drawings, and partially occluded images. In addition, the MMVID extracts visual information as suggested by a textual prompt.
Type: Grant
Filed: September 30, 2022
Date of Patent: July 29, 2025
Assignee: Snap Inc.
Inventors: Francesco Barbieri, Ligong Han, Hsin-Ying Lee, Shervin Minaee, Kyle Olszewski, Jian Ren, Sergey Tulyakov
-
Patent number: 12374036
Abstract: A system to enable 3D hair reconstruction and rendering from a single reference image which performs a multi-stage process that utilizes both a 3D implicit representation and a 2D parametric embedding space.
Type: Grant
Filed: July 21, 2022
Date of Patent: July 29, 2025
Assignee: Snap Inc.
Inventors: Zeng Huang, Menglei Chai, Sergey Tulyakov, Kyle Olszewski, Hsin-Ying Lee
-
Patent number: 12315075
Abstract: Three-dimensional object representation and re-rendering systems and methods for producing a 3D representation of an object from 2D images including the object that enables object-centric rendering. A modular approach is used that optimizes a Neural Radiance Field (NeRF) model to estimate object geometry and refine camera parameters and, then, infer surface material properties and per-image lighting conditions that fit the 2D images.
Type: Grant
Filed: December 28, 2022
Date of Patent: May 27, 2025
Assignee: Snap Inc.
Inventors: Kyle Olszewski, Sergey Tulyakov, Zhengfei Kuang, Menglei Chai
-
Publication number: 20240420407
Abstract: Systems and methods for generating static and articulated 3D assets are provided that include a 3D autodecoder at their core. The 3D autodecoder framework embeds properties learned from the target dataset in the latent space, which can then be decoded into a volumetric representation for rendering view-consistent appearance and geometry. The appropriate intermediate volumetric latent space is then identified and robust normalization and de-normalization operations are implemented to learn a 3D diffusion from 2D images or monocular videos of rigid or articulated objects. The methods are flexible enough to use either existing camera supervision or no camera information at all—instead efficiently learning the camera information during training.
Type: Application
Filed: June 16, 2023
Publication date: December 19, 2024
Inventors: Evangelos Ntavelis, Kyle Olszewski, Aliaksandr Siarohin, Sergey Tulyakov
-
Patent number: 12094073
Abstract: Systems, computer readable media, and methods herein describe an editing system where a three-dimensional (3D) object can be edited by editing a 2D sketch or 2D RGB views of the 3D object. The editing system uses multi-modal (MM) variational auto-decoders (VADs) (MM-VADs) that are trained with a shared latent space that enables editing 3D objects by editing 2D sketches of the 3D objects. The system determines a latent code that corresponds to an edited or sketched 2D sketch. The latent code is then used to generate a 3D object using the MM-VADs with the latent code as input. The latent space is divided into a latent space for shapes and a latent space for colors. The MM-VADs are trained with variational auto-encoders (VAE) and a ground truth.
Type: Grant
Filed: July 22, 2022
Date of Patent: September 17, 2024
Assignee: SNAP INC.
Inventors: Menglei Chai, Sergey Tulyakov, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Zezhou Cheng
-
Publication number: 20240273809
Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
Type: Application
Filed: April 24, 2024
Publication date: August 15, 2024
Inventors: Zeng Huang, Jian Ren, Sergey Tulyakov, Menglei Chai, Kyle Olszewski, Huan Wang
-
Patent number: 12056792
Abstract: Systems and methods herein describe a motion retargeting system. The motion retargeting system accesses a plurality of two-dimensional images comprising a person performing a plurality of body poses, extracts a plurality of implicit volumetric representations from the plurality of body poses, generates a three-dimensional warping field, the three-dimensional warping field configured to warp the plurality of implicit volumetric representations from a canonical pose to a target pose, and based on the three-dimensional warping field, generates a two-dimensional image of an artificial person performing the target pose.
Type: Grant
Filed: December 21, 2021
Date of Patent: August 6, 2024
Assignee: Snap Inc.
Inventors: Jian Ren, Menglei Chai, Oliver Woodford, Kyle Olszewski, Sergey Tulyakov
-
Publication number: 20240221258
Abstract: Unsupervised volumetric 3D animation (UVA) of non-rigid deformable objects without annotations learns the 3D structure and dynamics of objects solely from single-view red/green/blue (RGB) videos and decomposes the single-view RGB videos into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework, paired with a keypoint estimator via a differentiable perspective-n-point (PnP) algorithm, the UVA model learns the underlying object 3D geometry and parts decomposition in an entirely unsupervised manner from still or video images. This allows the UVA model to perform 3D segmentation, 3D keypoint estimation, novel view synthesis, and animation. The UVA model can obtain animatable 3D objects from a single or a few images. The UVA method also features a space in which all objects are represented in their canonical, animation-ready form. Applications include the creation of lenses from images or videos for social media applications.
Type: Application
Filed: December 28, 2022
Publication date: July 4, 2024
Inventors: Menglei Chai, Hsin-Ying Lee, Willi Menapace, Kyle Olszewski, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Sergey Tulyakov
-
Patent number: 12002146
Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
Type: Grant
Filed: March 28, 2022
Date of Patent: June 4, 2024
Assignee: Snap Inc.
Inventors: Zeng Huang, Jian Ren, Sergey Tulyakov, Menglei Chai, Kyle Olszewski, Huan Wang
-
Publication number: 20240112401
Abstract: A system and method are described for generating 3D garments from two-dimensional (2D) scribble images drawn by users. The system includes a conditional 2D generator, a conditional 3D generator, and two intermediate media including dimension-coupling color-density pairs and flat point clouds that bridge the gap between dimensions. Given a scribble image, the 2D generator synthesizes dimension-coupling color-density pairs including the RGB projection and density map from the front and rear views of the scribble image. A density-aware sampling algorithm converts the 2D dimension-coupling color-density pairs into a 3D flat point cloud representation, where the depth information is ignored. The 3D generator predicts the depth information from the flat point cloud. Dynamic variations per garment due to deformations resulting from a wearer's pose as well as irregular wrinkles and folds may be bypassed by taking advantage of 2D generative models to bridge the dimension gap in a non-parametric way.
Type: Application
Filed: September 30, 2022
Publication date: April 4, 2024
Inventors: Panagiotis Achlioptas, Menglei Chai, Hsin-Ying Lee, Kyle Olszewski, Jian Ren, Sergey Tulyakov
-
Publication number: 20240029346
Abstract: A system to enable 3D hair reconstruction and rendering from a single reference image which performs a multi-stage process that utilizes both a 3D implicit representation and a 2D parametric embedding space.
Type: Application
Filed: July 21, 2022
Publication date: January 25, 2024
Inventors: Zeng Huang, Menglei Chai, Sergey Tulyakov, Kyle Olszewski, Hsin-Ying Lee