Patents by Inventor Sergey Tulyakov

Sergey Tulyakov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240112401
    Abstract: A system and method are described for generating 3D garments from two-dimensional (2D) scribble images drawn by users. The system includes a conditional 2D generator, a conditional 3D generator, and two intermediate media including dimension-coupling color-density pairs and flat point clouds that bridge the gap between dimensions. Given a scribble image, the 2D generator synthesizes dimension-coupling color-density pairs including the RGB projection and density map from the front and rear views of the scribble image. A density-aware sampling algorithm converts the 2D dimension-coupling color-density pairs into a 3D flat point cloud representation, where the depth information is ignored. The 3D generator predicts the depth information from the flat point cloud. Dynamic variations per garment due to deformations resulting from a wearer's pose as well as irregular wrinkles and folds may be bypassed by taking advantage of 2D generative models to bridge the dimension gap in a non-parametric way.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 4, 2024
    Inventors: Panagiotis Achlioptas, Menglei Chai, Hsin-Ying Lee, Kyle Olszewski, Jian Ren, Sergey Tulyakov
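The density-aware sampling step described in this abstract can be illustrated with a small sketch (a hypothetical toy, not the patented algorithm; the function name and data layout are invented for illustration): pixel locations are drawn with probability proportional to a 2D density map, yielding a "flat" point cloud whose depth is left to the 3D generator.

```python
import random

def density_aware_sample(density_map, n_points, seed=0):
    """Sample (row, col) locations with probability proportional to density.

    `density_map` is a 2D list of non-negative floats; the result is a
    "flat" point cloud: 2D positions carrying no depth information yet.
    """
    rng = random.Random(seed)
    cells, weights = [], []
    for r, row in enumerate(density_map):
        for c, d in enumerate(row):
            if d > 0:
                cells.append((r, c))
                weights.append(d)
    return rng.choices(cells, weights=weights, k=n_points)

# Toy 2x3 density map: most mass concentrated in one cell.
density = [[0.0, 0.0, 1.0],
           [0.0, 9.0, 0.0]]
points = density_aware_sample(density, n_points=5)
```

In a full pipeline, a 3D generator would then predict a depth value for each sampled 2D point.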
  • Publication number: 20240104789
    Abstract: A method of generating an image for use in a conversation taking place in a messaging application is disclosed. Conversation input text is received from a user of a portable device that includes a display. Model input text is generated from the conversation input text and is processed with a text-to-image model to generate an image based on the model input text. The coordinates of a face in the image are determined, and the face of the user or another person is added to the image at that location. The final image is displayed on the portable device, and user input is received to transmit the image to a remote recipient.
    Type: Application
    Filed: September 22, 2022
    Publication date: March 28, 2024
    Inventors: Arnab Ghosh, Jian Ren, Pavel Savchenkov, Sergey Tulyakov
  • Publication number: 20240070521
    Abstract: A layer freezing and data sieving technique used in a sparse training domain for object recognition, providing end-to-end dataset-efficient training. The layer freezing and data sieving methods are seamlessly incorporated into a sparse training algorithm to form a generic framework. The generic framework consistently outperforms prior approaches and significantly reduces training floating-point operations (FLOPs) and memory costs while preserving high accuracy. The reduction in training FLOPs comes from three sources: weight sparsity, frozen layers, and a shrunken dataset. The training acceleration depends on different factors, e.g., the support of the sparse computation, layer type and size, and system overhead. The FLOPs reduction from the frozen layers and shrunken dataset leads to higher actual training acceleration than weight sparsity.
    Type: Application
    Filed: August 23, 2022
    Publication date: February 29, 2024
    Inventors: Jian Ren, Sergey Tulyakov, Yanyu Li, Geng Yuan
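The two savings mechanisms in this abstract can be sketched in miniature (a hypothetical illustration, not the patented framework; all names and the scalar "parameters" are invented): frozen layers are skipped during the update step, and data sieving keeps only the most informative (highest-loss) examples.

```python
def train_step(layers, frozen, grads, lr=0.1):
    """Apply a gradient step, skipping frozen layers entirely.

    `layers` maps layer name -> parameter value (a single float, for
    illustration); `frozen` is the set of layer names whose updates --
    and, in a real system, backward computation -- are skipped, which
    is where the training-FLOPs savings come from.
    """
    updated = 0
    for name in layers:
        if name in frozen:
            continue
        layers[name] -= lr * grads[name]
        updated += 1
    return updated

def sieve(dataset, losses, keep_ratio=0.5):
    """Data sieving (toy): keep the highest-loss examples, shrinking
    the dataset seen by subsequent epochs."""
    ranked = sorted(zip(losses, dataset), reverse=True)
    k = max(1, int(len(dataset) * keep_ratio))
    return [x for _, x in ranked[:k]]

layers = {"conv1": 1.0, "conv2": 1.0, "fc": 1.0}
grads = {"conv1": 0.5, "conv2": 0.5, "fc": 0.5}
n_updated = train_step(layers, frozen={"conv1"}, grads=grads)
kept = sieve(["a", "b", "c", "d"], losses=[0.9, 0.1, 0.7, 0.2])
```

Here only two of the three layers are updated, and the dataset is halved, mirroring the abstract's frozen-layer and shrunken-dataset FLOPs reductions.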
  • Publication number: 20240062008
    Abstract: A method of generating an image for use in a conversation taking place in a messaging application is disclosed. Conversation input text is received from a user of a portable device that includes a display. Model input text is generated from the conversation input text and is processed with a text-to-image model to generate an image based on the model input text. The generated image is displayed on the portable device, and user input is received to transmit the image to a remote recipient.
    Type: Application
    Filed: August 17, 2022
    Publication date: February 22, 2024
    Inventors: Arnab Ghosh, Jian Ren, Pavel Savchenkov, Sergey Tulyakov
  • Publication number: 20240029346
    Abstract: A system that enables 3D hair reconstruction and rendering from a single reference image by performing a multi-stage process that utilizes both a 3D implicit representation and a 2D parametric embedding space.
    Type: Application
    Filed: July 21, 2022
    Publication date: January 25, 2024
    Inventors: Zeng Huang, Menglei Chai, Sergey Tulyakov, Kyle Olszewski, Hsin-Ying Lee
  • Publication number: 20240020948
    Abstract: A vision transformer network having extremely low latency and usable on mobile devices, such as smart eyewear devices and other augmented reality (AR) and virtual reality (VR) devices. The transformer network processes an input image, and the network includes a convolution stem configured to patch embed the image. A first stack of stages including at least two stages of 4-Dimension (4D) metablocks (MBs) (MB4D) follow the convolution stem. A second stack of stages including at least two stages of 3-Dimension MBs (MB3D) follow the MB4D stages. Each of the MB4D stages and each of the MB3D stages include different layer configurations, and each of the MB4D stages and each of the MB3D stages include a token mixer. The MB3D stages each additionally include a multi-head self-attention (MHSA) processing block.
    Type: Application
    Filed: July 14, 2022
    Publication date: January 18, 2024
    Inventors: Jian Ren, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanyu Li, Geng Yuan
  • Publication number: 20230410376
    Abstract: Systems and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant of class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers the image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this image compression system and method smooths abrupt boundaries and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel size. This method does not introduce extra multiplications but only additions, and thus does not introduce much computational overhead, as the division can be absorbed into the parameters after training.
    Type: Application
    Filed: August 28, 2023
    Publication date: December 21, 2023
    Inventors: Jian Ren, Menglei Chai, Sergey Tulyakov, Qing Jin
  • Patent number: 11836835
    Abstract: Systems and methods herein describe novel motion representations for animating articulated objects consisting of distinct parts. The described systems and methods access source image data, identify driving image data to modify image feature data in the source image data, generate, using an image transformation neural network, modified source image data comprising a plurality of modified source images depicting modified versions of the image feature data, and store the modified source image data. The image transformation neural network is trained to identify, for each image in the source image data, a driving image from the driving image data; the identified driving image is used by the image transformation neural network to modify the corresponding source image using motion estimation differences between the identified driving image and that source image.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: December 5, 2023
    Assignee: Snap Inc.
    Inventors: Menglei Chai, Jian Ren, Aliaksandr Siarohin, Sergey Tulyakov, Oliver Woodford
  • Publication number: 20230384918
    Abstract: A system of machine learning schemes can be configured to efficiently perform image processing tasks on a user device, such as a mobile phone. The system can selectively detect and transform individual regions within each frame of a live streaming video. The system can selectively partition and toggle image effects within the live streaming video.
    Type: Application
    Filed: August 9, 2023
    Publication date: November 30, 2023
    Inventors: Theresa Barton, Yanping Chen, Jaewook Chung, Christopher Crutchfield, Aymeric Damien, Sergei Kotcur, Igor Kudriashov, Sergey Tulyakov, Andrew Wan, Emre Yamangil
  • Publication number: 20230386158
    Abstract: Systems, computer readable media, and methods herein describe an editing system where a three-dimensional (3D) object can be edited by editing a 2D sketch or 2D RGB views of the 3D object. The editing system uses multi-modal (MM) variational auto-decoders (VADs), or MM-VADs, that are trained with a shared latent space that enables editing 3D objects by editing 2D sketches of the 3D objects. The system determines a latent code that corresponds to an edited or sketched 2D sketch; the latent code is then used as input to the MM-VADs to generate a 3D object. The latent space is divided into a latent space for shapes and a latent space for colors. The MM-VADs are trained with variational auto-encoders (VAEs) and a ground truth.
    Type: Application
    Filed: July 22, 2022
    Publication date: November 30, 2023
    Inventors: Menglei Chai, Sergey Tulyakov, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Zezhou Cheng
  • Publication number: 20230379491
    Abstract: Systems and methods herein describe a video compression system. The described systems and methods access a sequence of image frames from a first computing device, the sequence comprising a first image frame and a second image frame; detect a first set of keypoints for the first image frame; transmit the first image frame and the first set of keypoints to a second computing device; detect a second set of keypoints for the second image frame; transmit the second set of keypoints to the second computing device; and cause an animated image to be displayed on the second computing device.
    Type: Application
    Filed: August 4, 2023
    Publication date: November 23, 2023
    Inventors: Sergey Demyanov, Andrew Cheng-min Lin, Walton Lin, Aleksei Podkin, Aleksei Stoliar, Sergey Tulyakov
  • Patent number: 11798213
    Abstract: Systems and methods herein describe novel motion representations for animating articulated objects consisting of distinct parts. The described systems and methods access source image data, identify driving image data to modify image feature data in the source image data, generate, using an image transformation neural network, modified source image data comprising a plurality of modified source images depicting modified versions of the image feature data, and store the modified source image data. The image transformation neural network is trained to identify, for each image in the source image data, a driving image from the driving image data; the identified driving image is used by the image transformation neural network to modify the corresponding source image using motion estimation differences between the identified driving image and that source image.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: October 24, 2023
    Assignee: Snap Inc.
    Inventors: Menglei Chai, Jian Ren, Aliaksandr Siarohin, Sergey Tulyakov, Oliver Woodford
  • Patent number: 11798261
    Abstract: Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and a method for synthesizing a realistic image with a new expression of a face in an input image by receiving an input image comprising a face having a first expression; obtaining a target expression for the face; and extracting a texture of the face and a shape of the face. The program and method generate, based on the extracted texture of the face, a target texture corresponding to the obtained target expression using a first machine learning technique; generate, based on the extracted shape of the face, a target shape corresponding to the obtained target expression using a second machine learning technique; and combine the generated target texture and generated target shape into an output image comprising the face having a second expression corresponding to the obtained target expression.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: October 24, 2023
    Assignee: Snap Inc.
    Inventors: Chen Cao, Sergey Tulyakov, Zhenglin Geng
  • Publication number: 20230334327
    Abstract: A compact generative neural network can be distilled from a teacher generative neural network using a training network. The compact network can be trained on the input data and output data of the teacher network. The training network trains the student network using a discrimination layer and one or more types of losses, such as perceptual loss and adversarial loss.
    Type: Application
    Filed: June 22, 2023
    Publication date: October 19, 2023
    Inventors: Sergey Tulyakov, Sergei Korolev, Aleksei Stoliar, Maksim Gusarov, Sergei Kotcur, Christopher Yale Crutchfield, Andrew Wan
  • Patent number: 11790565
    Abstract: Systems and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant of class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers the image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this image compression system and method smooths abrupt boundaries and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel size. This method does not introduce extra multiplications but only additions, and thus does not introduce much computational overhead, as the division can be absorbed into the parameters after training.
    Type: Grant
    Filed: March 4, 2021
    Date of Patent: October 17, 2023
    Assignee: Snap Inc.
    Inventors: Jian Ren, Menglei Chai, Sergey Tulyakov, Qing Jin
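The average-smoothing idea in the CLADE-Avg abstract can be sketched as a plain box filter over a 2D map of modulation parameters (a toy illustration, not the patented implementation; the function name and the clamped-border handling are assumptions):

```python
def box_smooth(param_map, k=3):
    """Average-smooth a 2D modulation-parameter map with a k x k kernel.

    Values near an abrupt class boundary are replaced by local means,
    which softens the boundary and yields more distinct scale/shift
    values than the raw per-class constants. Borders are handled by
    shrinking the window, so each output needs only additions plus one
    division (which, per the abstract, can be absorbed after training).
    """
    h, w = len(param_map), len(param_map[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total, count = 0.0, 0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        total += param_map[ii][jj]
                        count += 1
            out[i][j] = total / count
    return out

# Two-class scale map with an abrupt vertical boundary.
scale = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
smoothed = box_smooth(scale)
```

After smoothing, the hard 0/1 step becomes a gradient of intermediate values (e.g. 1/3 and 2/3 next to the boundary), which is the "more possible values for the scaling and shift" the abstract refers to.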
  • Publication number: 20230316454
    Abstract: The 3D structure and appearance of objects extracted from 2D images are represented in a volumetric grid containing quantized feature vectors of values representing different aspects of the appearance and shape of an object, such as local features, structures, or colors that define the object. An encoder-decoder framework applies spatial transformations directly to a latent volumetric representation of the encoded image content. The volumetric representation is quantized to substantially reduce the space required to represent the image content. The volumetric representation is also spatially disentangled, such that each voxel acts as a primitive building block and supports various manipulations, including novel view synthesis and non-rigid creative manipulations.
    Type: Application
    Filed: March 31, 2022
    Publication date: October 5, 2023
    Inventors: Kyle Olszewski, Sergey Tulyakov, Menglei Chai, Jian Ren, Zeng Huang
  • Patent number: 11775158
    Abstract: A system of machine learning schemes can be configured to efficiently perform image processing tasks on a user device, such as a mobile phone. The system can selectively detect and transform individual regions within each frame of a live streaming video. The system can selectively partition and toggle image effects within the live streaming video.
    Type: Grant
    Filed: June 22, 2021
    Date of Patent: October 3, 2023
    Assignee: Snap Inc.
    Inventors: Theresa Barton, Yanping Chen, Jaewook Chung, Christopher Yale Crutchfield, Aymeric Damien, Sergei Kotcur, Igor Kudriashov, Sergey Tulyakov, Andrew Wan, Emre Yamangil
  • Publication number: 20230306675
    Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
    Type: Application
    Filed: March 28, 2022
    Publication date: September 28, 2023
    Inventors: Zeng Huang, Jian Ren, Sergey Tulyakov, Menglei Chai, Kyle Olszewski, Huan Wang
  • Patent number: 11736717
    Abstract: Systems and methods herein describe a video compression system. The described systems and methods access a sequence of image frames from a first computing device, the sequence comprising a first image frame and a second image frame; detect a first set of keypoints for the first image frame; transmit the first image frame and the first set of keypoints to a second computing device; detect a second set of keypoints for the second image frame; transmit the second set of keypoints to the second computing device; and cause an animated image to be displayed on the second computing device.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: August 22, 2023
    Assignee: Snap Inc.
    Inventors: Sergey Demyanov, Andrew Cheng-min Lin, Walton Lin, Aleksei Podkin, Aleksei Stoliar, Sergey Tulyakov
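The transmission scheme in this video-compression abstract can be sketched as a toy protocol (hypothetical code, not the patented system; frames are strings and "keypoints" are a trivial detector's output purely for illustration): full pixels are sent only for the first frame, and every later frame is reduced to its keypoints.

```python
def encode_stream(frames, detect):
    """Sender side: full pixels for the first frame only, then just
    keypoints per frame -- the bandwidth saving behind the scheme."""
    first = frames[0]
    messages = [("frame+kp", first, detect(first))]
    for f in frames[1:]:
        messages.append(("kp", None, detect(f)))
    return messages

def decode_stream(messages):
    """Receiver side: animate the reference frame with each keypoint
    set. Real systems drive a warping/generator network; here the
    'animated' frame is just the reference paired with new keypoints."""
    ref = messages[0][1]
    return [(ref, kp) for _, _, kp in messages]

# Toy frames; the 'keypoint detector' just counts 'B' characters.
frames = ["AAAA", "AAAB", "AABB"]
msgs = encode_stream(frames, detect=lambda f: f.count("B"))
animated = decode_stream(msgs)
```

Only one message carries pixel data; the rest carry keypoints alone, which is why such schemes compress talking-head video so aggressively.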
  • Publication number: 20230262293
    Abstract: A multimodal video generation framework (MMVID) that benefits from text and images provided jointly or separately as input. Quantized representations of videos are utilized with a bidirectional transformer with multiple modalities as inputs to predict a discrete video representation. A new video token trained with self-learning, together with an improved mask-prediction algorithm for sampling video tokens, is used to improve video quality and consistency. Text augmentation is utilized to improve the robustness of the textual representation and diversity of generated videos. The framework incorporates various visual modalities, such as segmentation masks, drawings, and partially occluded images. In addition, the MMVID extracts visual information as suggested by a textual prompt.
    Type: Application
    Filed: September 30, 2022
    Publication date: August 17, 2023
    Inventors: Francesco Barbieri, Ligong Han, Hsin-Ying Lee, Shervin Minaee, Kyle Olszewski, Jian Ren, Sergey Tulyakov
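The mask-prediction sampling mentioned in the MMVID abstract can be illustrated with a toy loop (hypothetical code, not the patented algorithm; the schedule and the stand-in "model" are invented): all token positions start masked, and over a few steps the most confident predictions are committed while the rest stay masked for re-prediction.

```python
MASK = -1

def mask_predict(n_tokens, predict, steps=3):
    """Iterative mask-prediction sampling (toy version): start from a
    fully masked token sequence and, over `steps` rounds, commit the
    predictions the model is most confident about while leaving the
    rest masked. `predict(tokens)` returns (token, confidence) per
    position."""
    tokens = [MASK] * n_tokens
    for step in range(steps):
        proposals = predict(tokens)
        # Commit a growing fraction of positions, most confident first.
        n_keep = (step + 1) * n_tokens // steps
        order = sorted(range(n_tokens), key=lambda i: -proposals[i][1])
        new = [MASK] * n_tokens
        for i in order[:n_keep]:
            new[i] = proposals[i][0]
        # Never re-mask a token that was committed in an earlier step.
        tokens = [t if t != MASK else n for t, n in zip(tokens, new)]
    return tokens

# Stand-in 'model': position i predicts token i, less confidently as
# i grows, so early positions are committed first.
toy_predict = lambda toks: [(i, 1.0 / (i + 1)) for i in range(len(toks))]
sampled = mask_predict(4, toy_predict)
```

A real system runs the bidirectional transformer in place of `toy_predict` and decodes the resulting discrete tokens into video frames.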