Patents by Inventor Sergey Tulyakov

Sergey Tulyakov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240112401
    Abstract: A system and method are described for generating 3D garments from two-dimensional (2D) scribble images drawn by users. The system includes a conditional 2D generator, a conditional 3D generator, and two intermediate media including dimension-coupling color-density pairs and flat point clouds that bridge the gap between dimensions. Given a scribble image, the 2D generator synthesizes dimension-coupling color-density pairs including the RGB projection and density map from the front and rear views of the scribble image. A density-aware sampling algorithm converts the 2D dimension-coupling color-density pairs into a 3D flat point cloud representation, where the depth information is ignored. The 3D generator predicts the depth information from the flat point cloud. Dynamic variations per garment due to deformations resulting from a wearer's pose as well as irregular wrinkles and folds may be bypassed by taking advantage of 2D generative models to bridge the dimension gap in a non-parametric way.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 4, 2024
    Inventors: Panagiotis Achlioptas, Menglei Chai, Hsin-Ying Lee, Kyle Olszewski, Jian Ren, Sergey Tulyakov
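The density-aware sampling step described in this abstract can be illustrated with a small sketch (a hypothetical toy, not the patented algorithm; the function name and data layout are invented for illustration): pixel locations are drawn with probability proportional to a 2D density map, yielding a "flat" point cloud whose depth is left to the 3D generator.

```python
import random

def density_aware_sample(density_map, n_points, seed=0):
    """Sample (row, col) locations with probability proportional to density.

    `density_map` is a 2D list of non-negative floats; the result is a
    "flat" point cloud: 2D positions carrying no depth information yet.
    """
    rng = random.Random(seed)
    cells, weights = [], []
    for r, row in enumerate(density_map):
        for c, d in enumerate(row):
            if d > 0:
                cells.append((r, c))
                weights.append(d)
    return rng.choices(cells, weights=weights, k=n_points)

# Toy 2x3 density map: most mass concentrated in one cell.
density = [[0.0, 0.0, 1.0],
           [0.0, 9.0, 0.0]]
points = density_aware_sample(density, n_points=5)
```

In a full pipeline, a 3D generator would then predict a depth value for each sampled 2D point.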
  • Publication number: 20240104789
    Abstract: A method of generating an image for use in a conversation taking place in a messaging application is disclosed. Conversation input text is received from a user of a portable device that includes a display. Model input text is generated from the conversation input text and is processed with a text-to-image model to generate an image based on the model input text. The coordinates of a face in the image are determined, and the face of the user or another person is added to the image at that location. The final image is displayed on the portable device, and user input is received to transmit the image to a remote recipient.
    Type: Application
    Filed: September 22, 2022
    Publication date: March 28, 2024
    Inventors: Arnab Ghosh, Jian Ren, Pavel Savchenkov, Sergey Tulyakov
  • Publication number: 20240070521
    Abstract: A layer freezing and data sieving technique used in a sparse training domain for object recognition, providing end-to-end dataset-efficient training. The layer freezing and data sieving methods are seamlessly incorporated into a sparse training algorithm to form a generic framework. The generic framework consistently outperforms prior approaches and significantly reduces training floating-point operations (FLOPs) and memory costs while preserving high accuracy. The reduction in training FLOPs comes from three sources: weight sparsity, frozen layers, and a shrunken dataset. The training acceleration depends on different factors, e.g., the support of the sparse computation, layer type and size, and system overhead. The FLOPs reduction from the frozen layers and shrunken dataset leads to higher actual training acceleration than weight sparsity.
    Type: Application
    Filed: August 23, 2022
    Publication date: February 29, 2024
    Inventors: Jian Ren, Sergey Tulyakov, Yanyu Li, Geng Yuan
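The two savings mechanisms in this abstract can be sketched in miniature (a hypothetical illustration, not the patented framework; all names and the scalar "parameters" are invented): frozen layers are skipped during the update step, and data sieving keeps only the most informative (highest-loss) examples.

```python
def train_step(layers, frozen, grads, lr=0.1):
    """Apply a gradient step, skipping frozen layers entirely.

    `layers` maps layer name -> parameter value (a single float, for
    illustration); `frozen` is the set of layer names whose updates --
    and, in a real system, backward computation -- are skipped, which
    is where the training-FLOPs savings come from.
    """
    updated = 0
    for name in layers:
        if name in frozen:
            continue
        layers[name] -= lr * grads[name]
        updated += 1
    return updated

def sieve(dataset, losses, keep_ratio=0.5):
    """Data sieving (toy): keep the highest-loss examples, shrinking
    the dataset seen by subsequent epochs."""
    ranked = sorted(zip(losses, dataset), reverse=True)
    k = max(1, int(len(dataset) * keep_ratio))
    return [x for _, x in ranked[:k]]

layers = {"conv1": 1.0, "conv2": 1.0, "fc": 1.0}
grads = {"conv1": 0.5, "conv2": 0.5, "fc": 0.5}
n_updated = train_step(layers, frozen={"conv1"}, grads=grads)
kept = sieve(["a", "b", "c", "d"], losses=[0.9, 0.1, 0.7, 0.2])
```

Here only two of the three layers are updated, and the dataset is halved, mirroring the abstract's frozen-layer and shrunken-dataset FLOPs reductions.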
  • Publication number: 20240062008
    Abstract: A method of generating an image for use in a conversation taking place in a messaging application is disclosed. Conversation input text is received from a user of a portable device that includes a display. Model input text is generated from the conversation input text and is processed with a text-to-image model to generate an image based on the model input text. The generated image is displayed on the portable device, and user input is received to transmit the image to a remote recipient.
    Type: Application
    Filed: August 17, 2022
    Publication date: February 22, 2024
    Inventors: Arnab Ghosh, Jian Ren, Pavel Savchenkov, Sergey Tulyakov
  • Publication number: 20240029346
    Abstract: A system that enables 3D hair reconstruction and rendering from a single reference image by performing a multi-stage process that utilizes both a 3D implicit representation and a 2D parametric embedding space.
    Type: Application
    Filed: July 21, 2022
    Publication date: January 25, 2024
    Inventors: Zeng Huang, Menglei Chai, Sergey Tulyakov, Kyle Olszewski, Hsin-Ying Lee
  • Publication number: 20240020948
    Abstract: A vision transformer network having extremely low latency and usable on mobile devices, such as smart eyewear devices and other augmented reality (AR) and virtual reality (VR) devices. The transformer network processes an input image, and the network includes a convolution stem configured to patch embed the image. A first stack of stages including at least two stages of 4-Dimension (4D) metablocks (MBs) (MB4D) follow the convolution stem. A second stack of stages including at least two stages of 3-Dimension MBs (MB3D) follow the MB4D stages. Each of the MB4D stages and each of the MB3D stages include different layer configurations, and each of the MB4D stages and each of the MB3D stages include a token mixer. The MB3D stages each additionally include a multi-head self-attention (MHSA) processing block.
    Type: Application
    Filed: July 14, 2022
    Publication date: January 18, 2024
    Inventors: Jian Ren, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanyu Li, Geng Yuan
  • Publication number: 20230410376
    Abstract: Systems and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant of class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers the image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this image compression system and method smooths abrupt boundaries and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel size. This method does not introduce extra multiplications but only additions, and thus does not introduce much computational overhead, as the division can be absorbed into the parameters after training.
    Type: Application
    Filed: August 28, 2023
    Publication date: December 21, 2023
    Inventors: Jian Ren, Menglei Chai, Sergey Tulyakov, Qing Jin
  • Patent number: 11836835
    Abstract: Systems and methods herein describe novel motion representations for animating articulated objects consisting of distinct parts. The described systems and methods access source image data, identify driving image data to modify image feature data in the source image data, generate, using an image transformation neural network, modified source image data comprising a plurality of modified source images depicting modified versions of the image feature data, and store the modified source image data. The image transformation neural network is trained to identify, for each image in the source image data, a driving image from the driving image data; the identified driving image is used by the image transformation neural network to modify the corresponding source image using motion estimation differences between the identified driving image and that source image.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: December 5, 2023
    Assignee: Snap Inc.
    Inventors: Menglei Chai, Jian Ren, Aliaksandr Siarohin, Sergey Tulyakov, Oliver Woodford
  • Publication number: 20230384918
    Abstract: A system of machine learning schemes can be configured to efficiently perform image processing tasks on a user device, such as a mobile phone. The system can selectively detect and transform individual regions within each frame of a live streaming video. The system can selectively partition and toggle image effects within the live streaming video.
    Type: Application
    Filed: August 9, 2023
    Publication date: November 30, 2023
    Inventors: Theresa Barton, Yanping Chen, Jaewook Chung, Christopher Crutchfield, Aymeric Damien, Sergei Kotcur, Igor Kudriashov, Sergey Tulyakov, Andrew Wan, Emre Yamangil
  • Publication number: 20230386158
    Abstract: Systems, computer readable media, and methods herein describe an editing system where a three-dimensional (3D) object can be edited by editing a 2D sketch or 2D RGB views of the 3D object. The editing system uses multi-modal (MM) variational auto-decoders (VADs), or MM-VADs, that are trained with a shared latent space that enables editing 3D objects by editing 2D sketches of the 3D objects. The system determines a latent code that corresponds to an edited or sketched 2D sketch; the latent code is then used as input to the MM-VADs to generate a 3D object. The latent space is divided into a latent space for shapes and a latent space for colors. The MM-VADs are trained with variational auto-encoders (VAEs) and a ground truth.
    Type: Application
    Filed: July 22, 2022
    Publication date: November 30, 2023
    Inventors: Menglei Chai, Sergey Tulyakov, Jian Ren, Hsin-Ying Lee, Kyle Olszewski, Zeng Huang, Zezhou Cheng
  • Publication number: 20230379491
    Abstract: Systems and methods herein describe a video compression system. The described systems and methods access a sequence of image frames from a first computing device, the sequence comprising a first image frame and a second image frame; detect a first set of keypoints for the first image frame; transmit the first image frame and the first set of keypoints to a second computing device; detect a second set of keypoints for the second image frame; transmit the second set of keypoints to the second computing device; and cause an animated image to be displayed on the second computing device.
    Type: Application
    Filed: August 4, 2023
    Publication date: November 23, 2023
    Inventors: Sergey Demyanov, Andrew Cheng-min Lin, Walton Lin, Aleksei Podkin, Aleksei Stoliar, Sergey Tulyakov
  • Patent number: 11798213
    Abstract: Systems and methods herein describe novel motion representations for animating articulated objects consisting of distinct parts. The described systems and methods access source image data, identify driving image data to modify image feature data in the source image data, generate, using an image transformation neural network, modified source image data comprising a plurality of modified source images depicting modified versions of the image feature data, and store the modified source image data. The image transformation neural network is trained to identify, for each image in the source image data, a driving image from the driving image data; the identified driving image is used by the image transformation neural network to modify the corresponding source image using motion estimation differences between the identified driving image and that source image.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: October 24, 2023
    Assignee: Snap Inc.
    Inventors: Menglei Chai, Jian Ren, Aliaksandr Siarohin, Sergey Tulyakov, Oliver Woodford
  • Patent number: 11798261
    Abstract: Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and a method for synthesizing a realistic image with a new expression of a face in an input image by receiving an input image comprising a face having a first expression; obtaining a target expression for the face; and extracting a texture of the face and a shape of the face. The program and method generate, based on the extracted texture of the face, a target texture corresponding to the obtained target expression using a first machine learning technique; generate, based on the extracted shape of the face, a target shape corresponding to the obtained target expression using a second machine learning technique; and combine the generated target texture and generated target shape into an output image comprising the face having a second expression corresponding to the obtained target expression.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: October 24, 2023
    Assignee: Snap Inc.
    Inventors: Chen Cao, Sergey Tulyakov, Zhenglin Geng
  • Publication number: 20230334327
    Abstract: A compact generative neural network can be distilled from a teacher generative neural network using a training network. The compact network can be trained on the input data and output data of the teacher network. The training network trains the student network using a discrimination layer and one or more types of losses, such as perceptual loss and adversarial loss.
    Type: Application
    Filed: June 22, 2023
    Publication date: October 19, 2023
    Inventors: Sergey Tulyakov, Sergei Korolev, Aleksei Stoliar, Maksim Gusarov, Sergei Kotcur, Christopher Yale Crutchfield, Andrew Wan
  • Patent number: 11790565
    Abstract: Systems and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant of class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers the image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this image compression system and method smooths abrupt boundaries and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel size. This method does not introduce extra multiplications but only additions, and thus does not introduce much computational overhead, as the division can be absorbed into the parameters after training.
    Type: Grant
    Filed: March 4, 2021
    Date of Patent: October 17, 2023
    Assignee: Snap Inc.
    Inventors: Jian Ren, Menglei Chai, Sergey Tulyakov, Qing Jin
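The average-smoothing idea in the CLADE-Avg abstract can be sketched as a plain box filter over a 2D map of modulation parameters (a toy illustration, not the patented implementation; the function name and the clamped-border handling are assumptions):

```python
def box_smooth(param_map, k=3):
    """Average-smooth a 2D modulation-parameter map with a k x k kernel.

    Values near an abrupt class boundary are replaced by local means,
    which softens the boundary and yields more distinct scale/shift
    values than the raw per-class constants. Borders are handled by
    shrinking the window, so each output needs only additions plus one
    division (which, per the abstract, can be absorbed after training).
    """
    h, w = len(param_map), len(param_map[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total, count = 0.0, 0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        total += param_map[ii][jj]
                        count += 1
            out[i][j] = total / count
    return out

# Two-class scale map with an abrupt vertical boundary.
scale = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
smoothed = box_smooth(scale)
```

After smoothing, the hard 0/1 step becomes a gradient of intermediate values (e.g. 1/3 and 2/3 next to the boundary), which is the "more possible values for the scaling and shift" the abstract refers to.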
  • Publication number: 20230316454
    Abstract: The 3D structure and appearance of objects extracted from 2D images are represented in a volumetric grid containing quantized feature vectors of values representing different aspects of the appearance and shape of an object, such as local features, structures, or colors that define the object. An encoder-decoder framework applies spatial transformations directly to a latent volumetric representation of the encoded image content. The volumetric representation is quantized to substantially reduce the space required to represent the image content. The volumetric representation is also spatially disentangled, such that each voxel acts as a primitive building block and supports various manipulations, including novel view synthesis and non-rigid creative manipulations.
    Type: Application
    Filed: March 31, 2022
    Publication date: October 5, 2023
    Inventors: Kyle Olszewski, Sergey Tulyakov, Menglei Chai, Jian Ren, Zeng Huang
  • Patent number: 11775158
    Abstract: A system of machine learning schemes can be configured to efficiently perform image processing tasks on a user device, such as a mobile phone. The system can selectively detect and transform individual regions within each frame of a live streaming video. The system can selectively partition and toggle image effects within the live streaming video.
    Type: Grant
    Filed: June 22, 2021
    Date of Patent: October 3, 2023
    Assignee: Snap Inc.
    Inventors: Theresa Barton, Yanping Chen, Jaewook Chung, Christopher Yale Crutchfield, Aymeric Damien, Sergei Kotcur, Igor Kudriashov, Sergey Tulyakov, Andrew Wan, Emre Yamangil
  • Publication number: 20230306675
    Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
    Type: Application
    Filed: March 28, 2022
    Publication date: September 28, 2023
    Inventors: Zeng Huang, Jian Ren, Sergey Tulyakov, Menglei Chai, Kyle Olszewski, Huan Wang
  • Patent number: 11736717
    Abstract: Systems and methods herein describe a video compression system. The described systems and methods access a sequence of image frames from a first computing device, the sequence comprising a first image frame and a second image frame; detect a first set of keypoints for the first image frame; transmit the first image frame and the first set of keypoints to a second computing device; detect a second set of keypoints for the second image frame; transmit the second set of keypoints to the second computing device; and cause an animated image to be displayed on the second computing device.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: August 22, 2023
    Assignee: Snap Inc.
    Inventors: Sergey Demyanov, Andrew Cheng-min Lin, Walton Lin, Aleksei Podkin, Aleksei Stoliar, Sergey Tulyakov
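The transmission scheme in this video-compression abstract can be sketched as a toy protocol (hypothetical code, not the patented system; frames are strings and "keypoints" are a trivial detector's output purely for illustration): full pixels are sent only for the first frame, and every later frame is reduced to its keypoints.

```python
def encode_stream(frames, detect):
    """Sender side: full pixels for the first frame only, then just
    keypoints per frame -- the bandwidth saving behind the scheme."""
    first = frames[0]
    messages = [("frame+kp", first, detect(first))]
    for f in frames[1:]:
        messages.append(("kp", None, detect(f)))
    return messages

def decode_stream(messages):
    """Receiver side: animate the reference frame with each keypoint
    set. Real systems drive a warping/generator network; here the
    'animated' frame is just the reference paired with new keypoints."""
    ref = messages[0][1]
    return [(ref, kp) for _, _, kp in messages]

# Toy frames; the 'keypoint detector' just counts 'B' characters.
frames = ["AAAA", "AAAB", "AABB"]
msgs = encode_stream(frames, detect=lambda f: f.count("B"))
animated = decode_stream(msgs)
```

Only one message carries pixel data; the rest carry keypoints alone, which is why such schemes compress talking-head video so aggressively.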
  • Publication number: 20230262293
    Abstract: A multimodal video generation framework (MMVID) that benefits from text and images provided jointly or separately as input. Quantized representations of videos are utilized with a bidirectional transformer with multiple modalities as inputs to predict a discrete video representation. A new video token trained with self-learning, together with an improved mask-prediction algorithm for sampling video tokens, is used to improve video quality and consistency. Text augmentation is utilized to improve the robustness of the textual representation and diversity of generated videos. The framework incorporates various visual modalities, such as segmentation masks, drawings, and partially occluded images. In addition, the MMVID extracts visual information as suggested by a textual prompt.
    Type: Application
    Filed: September 30, 2022
    Publication date: August 17, 2023
    Inventors: Francesco Barbieri, Ligong Han, Hsin-Ying Lee, Shervin Minaee, Kyle Olszewski, Jian Ren, Sergey Tulyakov
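The mask-prediction sampling mentioned in the MMVID abstract can be illustrated with a toy loop (hypothetical code, not the patented algorithm; the schedule and the stand-in "model" are invented): all token positions start masked, and over a few steps the most confident predictions are committed while the rest stay masked for re-prediction.

```python
MASK = -1

def mask_predict(n_tokens, predict, steps=3):
    """Iterative mask-prediction sampling (toy version): start from a
    fully masked token sequence and, over `steps` rounds, commit the
    predictions the model is most confident about while leaving the
    rest masked. `predict(tokens)` returns (token, confidence) per
    position."""
    tokens = [MASK] * n_tokens
    for step in range(steps):
        proposals = predict(tokens)
        # Commit a growing fraction of positions, most confident first.
        n_keep = (step + 1) * n_tokens // steps
        order = sorted(range(n_tokens), key=lambda i: -proposals[i][1])
        new = [MASK] * n_tokens
        for i in order[:n_keep]:
            new[i] = proposals[i][0]
        # Never re-mask a token that was committed in an earlier step.
        tokens = [t if t != MASK else n for t, n in zip(tokens, new)]
    return tokens

# Stand-in 'model': position i predicts token i, less confidently as
# i grows, so early positions are committed first.
toy_predict = lambda toks: [(i, 1.0 / (i + 1)) for i in range(len(toks))]
sampled = mask_predict(4, toy_predict)
```

A real system runs the bidirectional transformer in place of `toy_predict` and decodes the resulting discrete tokens into video frames.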