Patents by Inventor Kyle Olszewski

Kyle Olszewski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

AUTODECODING LATENT 3D DIFFUSION MODELS

Publication number: 20240420407

Abstract: Systems and methods for generating static and articulated 3D assets are provided that include a 3D autodecoder at their core. The 3D autodecoder framework embeds properties learned from the target dataset in the latent space, which can then be decoded into a volumetric representation for rendering view-consistent appearance and geometry. The appropriate intermediate volumetric latent space is then identified and robust normalization and de-normalization operations are implemented to learn a 3D diffusion from 2D images or monocular videos of rigid or articulated objects. The methods are flexible enough to use either existing camera supervision or no camera information at all, instead efficiently learning the camera information during training.

Type: Application

Filed: June 16, 2023

Publication date: December 19, 2024

Inventors: Evangelos Ntavelis, Kyle Olszewski, Aliaksandr Siarohin, Sergey Tulyakov

Cross-modal shape and color manipulation

Patent number: 12094073

Abstract: Systems, computer-readable media, and methods herein describe an editing system where a three-dimensional (3D) object can be edited by editing a 2D sketch or 2D RGB views of the 3D object. The editing system uses multi-modal (MM) variational auto-decoders (VADs) (MM-VADs) that are trained with a shared latent space that enables editing 3D objects by editing 2D sketches of the 3D objects. The system determines a latent code that corresponds to an edited or sketched 2D sketch. The latent code is then used to generate a 3D object using the MM-VADs with the latent code as input. The latent space is divided into a latent space for shapes and a latent space for colors. The MM-VADs are trained with variational auto-encoders (VAEs) and a ground truth.

Type: Grant

Filed: July 22, 2022

Date of Patent: September 17, 2024

Assignee: SNAP INC.

Inventors: Menglei Chai, Sergey Tulyakov, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Zezhou Cheng

3D MODELING BASED ON NEURAL LIGHT FIELD

Publication number: 20240273809

Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
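
The core idea in this abstract, a network that maps a ray origin and direction directly to a pixel value, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the layer sizes, the random untrained weights, and the function names init_mlp and light_field_rgb are all hypothetical.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Create random (untrained) weights for a small fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def light_field_rgb(params, origins, directions):
    """Map each ray (origin, unit direction) straight to an RGB value in [0, 1]."""
    directions = directions / np.linalg.norm(directions, axis=-1, keepdims=True)
    h = np.concatenate([origins, directions], axis=-1)  # (N, 6) ray encoding
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layers
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))  # sigmoid keeps colors in [0, 1]

# Hypothetical usage: four rays leaving a camera placed at the world origin.
rng = np.random.default_rng(0)
params = init_mlp(rng, [6, 64, 64, 3])
origins = np.zeros((4, 3))
directions = rng.normal(size=(4, 3))
colors = light_field_rgb(params, origins, directions)
```

Each ray needs a single forward pass, with no sampling of points along the ray. A trained network of this kind would be queried once per pixel of the target view.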

Type: Application

Filed: April 24, 2024

Publication date: August 15, 2024

Inventors: Zeng Huang, Jian Ren, Sergey Tulyakov, Menglei Chai, Kyle Olszewski, Huan Wang

Flow-guided motion retargeting

Patent number: 12056792

Abstract: Systems and methods herein describe a motion retargeting system. The motion retargeting system accesses a plurality of two-dimensional images comprising a person performing a plurality of body poses, extracts a plurality of implicit volumetric representations from the plurality of body poses, generates a three-dimensional warping field, the three-dimensional warping field configured to warp the plurality of implicit volumetric representations from a canonical pose to a target pose, and based on the three-dimensional warping field, generates a two-dimensional image of an artificial person performing the target pose.

Type: Grant

Filed: December 21, 2021

Date of Patent: August 6, 2024

Assignee: Snap Inc.

Inventors: Jian Ren, Menglei Chai, Oliver Woodford, Kyle Olszewski, Sergey Tulyakov

UNSUPERVISED VOLUMETRIC ANIMATION

Publication number: 20240221258

Abstract: Unsupervised volumetric 3D animation (UVA) of non-rigid deformable objects without annotations learns the 3D structure and dynamics of objects solely from single-view red/green/blue (RGB) videos and decomposes the single-view RGB videos into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework, paired with a keypoint estimator via a differentiable perspective-n-point (PnP) algorithm, the UVA model learns the underlying object 3D geometry and parts decomposition in an entirely unsupervised manner from still or video images. This allows the UVA model to perform 3D segmentation, 3D keypoint estimation, novel view synthesis, and animation. The UVA model can obtain animatable 3D objects from a single or a few images. The UVA method also features a space in which all objects are represented in their canonical, animation-ready form. Applications include the creation of lenses from images or videos for social media applications.

Type: Application

Filed: December 28, 2022

Publication date: July 4, 2024

Inventors: Menglei Chai, Hsin-Ying Lee, Willi Menapace, Kyle Olszewski, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Sergey Tulyakov

3D modeling based on neural light field

Patent number: 12002146

Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.

Type: Grant

Filed: March 28, 2022

Date of Patent: June 4, 2024

Assignee: Snap Inc.

Inventors: Zeng Huang, Jian Ren, Sergey Tulyakov, Menglei Chai, Kyle Olszewski, Huan Wang

3D GARMENT GENERATION FROM 2D SCRIBBLE IMAGES

Publication number: 20240112401

Abstract: A system and method are described for generating 3D garments from two-dimensional (2D) scribble images drawn by users. The system includes a conditional 2D generator, a conditional 3D generator, and two intermediate media including dimension-coupling color-density pairs and flat point clouds that bridge the gap between dimensions. Given a scribble image, the 2D generator synthesizes dimension-coupling color-density pairs including the RGB projection and density map from the front and rear views of the scribble image. A density-aware sampling algorithm converts the 2D dimension-coupling color-density pairs into a 3D flat point cloud representation, where the depth information is ignored. The 3D generator predicts the depth information from the flat point cloud. Dynamic variations per garment due to deformations resulting from a wearer's pose as well as irregular wrinkles and folds may be bypassed by taking advantage of 2D generative models to bridge the dimension gap in a non-parametric way.
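
The abstract does not spell out the density-aware sampling algorithm, so the following is only one plausible reading, sketched under stated assumptions: pixels are drawn with probability proportional to their density value, and every sampled point gets a depth of zero, which makes the cloud "flat". The function name sample_flat_point_cloud and the (x, y, z, r, g, b) row layout are hypothetical.

```python
import numpy as np

def sample_flat_point_cloud(density, rgb, num_points, rng):
    """Draw pixels with probability proportional to density; return (N, 6) points.

    Each row is (x, y, z, r, g, b) with z fixed at 0, i.e. a "flat" cloud.
    """
    h, w = density.shape
    probs = density.ravel() / density.sum()
    picks = rng.choice(h * w, size=num_points, p=probs)
    ys, xs = np.divmod(picks, w)  # flat index -> (row, column)
    xyz = np.stack([xs, ys, np.zeros(num_points)], axis=1).astype(float)
    return np.concatenate([xyz, rgb[ys, xs]], axis=1)

# Hypothetical usage: an 8x8 density map that is nonzero only in a small patch.
rng = np.random.default_rng(0)
density = np.zeros((8, 8))
density[2:6, 3:5] = 1.0
rgb = rng.random((8, 8, 3))
points = sample_flat_point_cloud(density, rgb, 100, rng)
```

Under this reading, a separate 3D stage would then predict the missing depth for each point, as the abstract describes for the 3D generator.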

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Inventors: Panagiotis Achlioptas, Menglei Chai, Hsin-Ying Lee, Kyle Olszewski, Jian Ren, Sergey Tulyakov

SINGLE IMAGE THREE-DIMENSIONAL HAIR RECONSTRUCTION

Publication number: 20240029346

Abstract: A system to enable 3D hair reconstruction and rendering from a single reference image which performs a multi-stage process that utilizes both a 3D implicit representation and a 2D parametric embedding space.

Type: Application

Filed: July 21, 2022

Publication date: January 25, 2024

Inventors: Zeng Huang, Menglei Chai, Sergey Tulyakov, Kyle Olszewski, Hsin-Ying Lee

CROSS-MODAL SHAPE AND COLOR MANIPULATION

Publication number: 20230386158

Abstract: Systems, computer-readable media, and methods herein describe an editing system where a three-dimensional (3D) object can be edited by editing a 2D sketch or 2D RGB views of the 3D object. The editing system uses multi-modal (MM) variational auto-decoders (VADs) (MM-VADs) that are trained with a shared latent space that enables editing 3D objects by editing 2D sketches of the 3D objects. The system determines a latent code that corresponds to an edited or sketched 2D sketch. The latent code is then used to generate a 3D object using the MM-VADs with the latent code as input. The latent space is divided into a latent space for shapes and a latent space for colors. The MM-VADs are trained with variational auto-encoders (VAEs) and a ground truth.

Type: Application

Filed: July 22, 2022

Publication date: November 30, 2023

Inventors: Menglei Chai, Sergey Tulyakov, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Zezhou Cheng

VECTOR-QUANTIZED TRANSFORMABLE BOTTLENECK NETWORKS

Publication number: 20230316454

Abstract: The 3D structure and appearance of objects extracted from 2D images are represented in a volumetric grid containing quantized feature vectors of values representing different aspects of the appearance and shape of an object, such as local features, structures, or colors that define the object. An encoder-decoder framework applies spatial transformations directly to a latent volumetric representation of the encoded image content. The volumetric representation is quantized to substantially reduce the space required to represent the image content. The volumetric representation is also spatially disentangled, such that each voxel acts as a primitive building block and supports various manipulations, including novel view synthesis and non-rigid creative manipulations.
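
The quantization step described in this abstract can be illustrated with a standard nearest-neighbor codebook lookup. This is a generic vector-quantization sketch, not the patented method: the codebook size, feature dimension, random values, and the function name quantize_volume are assumptions chosen for illustration.

```python
import numpy as np

def quantize_volume(features, codebook):
    """Replace each voxel's feature vector with its nearest codebook entry.

    features: (D, H, W, C) volumetric grid; codebook: (K, C).
    Returns the (D, H, W) index grid and the (D, H, W, C) quantized grid.
    """
    flat = features.reshape(-1, features.shape[-1])
    # Squared Euclidean distance from every voxel feature to every code.
    dists = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    quantized = codebook[indices].reshape(features.shape)
    return indices.reshape(features.shape[:-1]), quantized

# Hypothetical usage: a 4x4x4 grid of 8-dimensional features, 16 codes.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))
volume = rng.normal(size=(4, 4, 4, 8))
indices, quantized = quantize_volume(volume, codebook)
```

In this sketch the grid can be stored as 64 small integer indices plus the shared codebook rather than 64 full feature vectors, which is the kind of space reduction the abstract refers to.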

Type: Application

Filed: March 31, 2022

Publication date: October 5, 2023

Inventors: Kyle Olszewski, Sergey Tulyakov, Menglei Chai, Jian Ren, Zeng Huang

3D MODELING BASED ON NEURAL LIGHT FIELD

Publication number: 20230306675

Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.

Type: Application

Filed: March 28, 2022

Publication date: September 28, 2023

Inventors: Zeng Huang, Jian Ren, Sergey Tulyakov, Menglei Chai, Kyle Olszewski, Huan Wang

VIDEO SYNTHESIS VIA MULTIMODAL CONDITIONING

Publication number: 20230262293

Abstract: A multimodal video generation framework (MMVID) that benefits from text and images provided jointly or separately as input. Quantized representations of videos are utilized with a bidirectional transformer with multiple modalities as inputs to predict a discrete video representation. A new video token trained with self-learning and an improved mask-prediction algorithm for sampling video tokens is used to improve video quality and consistency. Text augmentation is utilized to improve the robustness of the textual representation and diversity of generated videos. The framework incorporates various visual modalities, such as segmentation masks, drawings, and partially occluded images. In addition, the MMVID extracts visual information as suggested by a textual prompt.

Type: Application

Filed: September 30, 2022

Publication date: August 17, 2023

Inventors: Francesco Barbieri, Ligong Han, Hsin-Ying Lee, Shervin Minaee, Kyle Olszewski, Jian Ren, Sergey Tulyakov

OBJECT-CENTRIC NEURAL DECOMPOSITION FOR IMAGE RE-RENDERING

Publication number: 20230215085

Abstract: Three-dimensional object representation and re-rendering systems and methods for producing a 3D representation of an object from 2D images including the object that enables object-centric rendering. A modular approach is used that optimizes a Neural Radiance Field (NeRF) model to estimate object geometry and refine camera parameters and, then, infer surface material properties and per-image lighting conditions that fit the 2D images.

Type: Application

Filed: December 28, 2022

Publication date: July 6, 2023

Inventors: Kyle Olszewski, Sergey Tulyakov, Zhengfei Kuang, Menglei Chai

FLOW-GUIDED MOTION RETARGETING

Publication number: 20220207786

Abstract: Systems and methods herein describe a motion retargeting system. The motion retargeting system accesses a plurality of two-dimensional images comprising a person performing a plurality of body poses, extracts a plurality of implicit volumetric representations from the plurality of body poses, generates a three-dimensional warping field, the three-dimensional warping field configured to warp the plurality of implicit volumetric representations from a canonical pose to a target pose, and based on the three-dimensional warping field, generates a two-dimensional image of an artificial person performing the target pose.

Type: Application

Filed: December 21, 2021

Publication date: June 30, 2022

Inventors: Jian Ren, Menglei Chai, Oliver Woodford, Kyle Olszewski, Sergey Tulyakov

VIDEO SYNTHESIS WITHIN A MESSAGING SYSTEM

Publication number: 20220101104

Abstract: Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and method for video synthesis. The program and method provide for accessing a primary generative adversarial network (GAN) comprising a pre-trained image generator, a motion generator comprising a plurality of neural networks, and a video discriminator; generating an updated GAN based on the primary GAN, by performing operations comprising identifying input data of the updated GAN, the input data comprising an initial latent code and a motion domain dataset, training the motion generator based on the input data, and adjusting weights of the plurality of neural networks of the primary GAN based on an output of the video discriminator; and generating a synthesized video based on the primary GAN and the input data.

Type: Application

Filed: September 30, 2021

Publication date: March 31, 2022

Inventors: Menglei Chai, Kyle Olszewski, Jian Ren, Yu Tian, Sergey Tulyakov

Synthesizing hair features in image content based on orientation data from user guidance

Patent number: 10515456

Abstract: Certain embodiments involve synthesizing image content depicting facial hair or other hair features based on orientation data obtained using guidance inputs or other user-provided guidance data. For instance, a graphic manipulation application accesses guidance data identifying a desired hair feature and an appearance exemplar having image data with color information for the desired hair feature. The graphic manipulation application transforms the guidance data into an input orientation map. The graphic manipulation application matches the input orientation map to an exemplar orientation map having a higher resolution than the input orientation map. The graphic manipulation application generates the desired hair feature by applying the color information from the appearance exemplar to the exemplar orientation map. The graphic manipulation application outputs the desired hair feature at a presentation device.

Type: Grant

Filed: March 22, 2018

Date of Patent: December 24, 2019

Assignee: Adobe Inc.

Inventors: Duygu Ceylan Aksit, Zhili Chen, Jose Ignacio Echevarria Vallespi, Kyle Olszewski

SYNTHESIZING HAIR FEATURES IN IMAGE CONTENT BASED ON ORIENTATION DATA FROM USER GUIDANCE

Publication number: 20190295272

Abstract: Certain embodiments involve synthesizing image content depicting facial hair or other hair features based on orientation data obtained using guidance inputs or other user-provided guidance data. For instance, a graphic manipulation application accesses guidance data identifying a desired hair feature and an appearance exemplar having image data with color information for the desired hair feature. The graphic manipulation application transforms the guidance data into an input orientation map. The graphic manipulation application matches the input orientation map to an exemplar orientation map having a higher resolution than the input orientation map. The graphic manipulation application generates the desired hair feature by applying the color information from the appearance exemplar to the exemplar orientation map. The graphic manipulation application outputs the desired hair feature at a presentation device.

Type: Application

Filed: March 22, 2018

Publication date: September 26, 2019

Inventors: Duygu Ceylan Aksit, Zhili Chen, Jose Ignacio Echevarria Vallespi, Kyle Olszewski

Deep learning-based facial animation for head-mounted display

Patent number: 10217261

Abstract: There is disclosed a system and method for training a set of expression and neutral convolutional neural networks using a single performance mapped to a set of known phonemes and visemes in the form of predetermined sentences and facial expressions. Then, subsequent training of the convolutional neural networks can occur using temporal data derived from audio data within the original performance mapped to a set of professionally created three-dimensional animations. Thereafter, with sufficient training, the expression and neutral convolutional neural networks can generate facial animations from facial image data in real time without individual-specific training.

Type: Grant

Filed: February 21, 2017

Date of Patent: February 26, 2019

Assignee: PINSCREEN, INC.

Inventors: Hao Li, Joseph J. Lim, Kyle Olszewski

HIGH-FIDELITY FACIAL AND SPEECH ANIMATION FOR VIRTUAL REALITY HEAD MOUNTED DISPLAYS

Publication number: 20170243387

Abstract: There is disclosed a system and method for training a set of expression and neutral convolutional neural networks using a single performance mapped to a set of known phonemes and visemes in the form of predetermined sentences and facial expressions. Then, subsequent training of the convolutional neural networks can occur using temporal data derived from audio data within the original performance mapped to a set of professionally created three-dimensional animations. Thereafter, with sufficient training, the expression and neutral convolutional neural networks can generate facial animations from facial image data in real time without individual-specific training.

Type: Application

Filed: February 21, 2017

Publication date: August 24, 2017

Inventors: Hao Li, Joseph J. Lim, Kyle Olszewski