Patents by Inventor Yinda Zhang

Yinda Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Active stereo depth prediction based on coarse matching

Patent number: 12316825

Abstract: An electronic device estimates a depth map of an environment based on matching reduced-resolution stereo depth images captured by depth cameras to generate a coarse disparity (depth) map. The electronic device downsamples depth images captured by the depth cameras and matches sections of the reduced-resolution images to each other to generate a coarse depth map. The electronic device upsamples the coarse depth map to a higher resolution and refines the upsampled depth map to generate a high-resolution depth map to support location-based functionality.

Type: Grant

Filed: February 17, 2023

Date of Patent: May 27, 2025

Assignee: Google LLC

Inventors: Sameh Khamis, Yinda Zhang, Christoph Rhemann, Julien Valentin, Adarsh Kowdle, Vladimir Tankovich, Michael Schoenberg, Shahram Izadi, Thomas Funkhouser, Sean Fanello
Selecting avatar for videoconference

Patent number: 12317002

Abstract: A method can include selecting, from at least a first avatar and a second avatar based on at least one attribute of a calendar event associated with a user, a session avatar, the first avatar being based on a first set of images of a user wearing a first outfit and the second avatar being based on a second set of images of the user wearing a second outfit, and presenting the session avatar during a videoconference, the presentation of the session avatar changing based on audio input received from the user during the videoconference.

Type: Grant

Filed: October 18, 2022

Date of Patent: May 27, 2025

Assignee: Google LLC

Inventors: Yinda Zhang, Ruofei Du
HIGH-RESOLUTION MULTIVIEW-CONSISTENT RENDERING AND ALPHA MATTING FROM SPARSE VIEWS

Publication number: 20250111477

Abstract: A method including capturing a first plurality of images that include a foreground object and a background, capturing a second plurality of images that include the background, generating an alpha matte based on the first plurality of images and the second plurality of images using a trained machine learned model trained using a loss function configured to cause the trained machine learned model to learn high-frequency details of the foreground object, generating a foreground object image based on the first plurality of images and the second plurality of images using the trained machine learned model, and synthesizing an image including the foreground object image and a second background scene using the alpha matte.

Type: Application

Filed: September 28, 2023

Publication date: April 3, 2025

Inventors: Sergio Orts Escolano, Zhiwen Fan, Di Qiu, Yinda Zhang, Daoye Wang, Erroll Wood, Abhimitra Meka, Hossam Isack, Paulo Fabiano Urnau Gotardo, Kripasindhu Sarkar, Thabo Beeler, Zhengyang Shen, Alexander Sahba Koumis
GENERATING REPRESENTATION OF USER BASED ON DEPTH MAP

Publication number: 20240303918

Abstract: A method can include receiving, via a camera, a first video stream of a face of a user; determining a location of the face of the user based on the first video stream and a facial landmark detection model; receiving, via the camera, a second video stream of the face of the user; generating a depth map based on the second video stream, the location of the face of the user, and a depth prediction model; and generating a representation of the user based on the depth map and the second video stream.

Type: Application

Filed: October 11, 2023

Publication date: September 12, 2024

Inventors: Ruofei Du, Xun Qian, Yinda Zhang, Alex Olwal
MULTIRESOLUTION DEEP IMPLICIT FUNCTIONS FOR THREE-DIMENSIONAL SHAPE REPRESENTATION

Publication number: 20240303908

Abstract: A method including generating a first vector based on a first grid and a three-dimensional (3D) position associated with a first implicit representation (IR) of a 3D object, generating at least one second vector based on at least one second grid and an upsampled first grid, decoding the first vector to generate a second IR of the 3D object, decoding the at least one second vector to generate at least one third IR of the 3D object, generating a composite IR of the 3D object based on the second IR of the 3D object and the at least one third IR of the 3D object, and generating a reconstructed volume representing the 3D object based on the composite IR of the 3D object.

Type: Application

Filed: April 30, 2021

Publication date: September 12, 2024

Inventors: Yinda Zhang, Danhang Tang, Ruofei Du, Zhang Chen, Kyle Genova, Sofien Bouaziz, Thomas Allen Funkhouser, Sean Ryan Francesco Fanello, Christian Haene
AVATAR BASED ON MONOCULAR IMAGES

Publication number: 20240290025

Abstract: A method comprises receiving a first sequence of images of a portion of a user, the first sequence of images being monocular images; generating an avatar based on the first sequence of images, the avatar being based on a model including a feature vector associated with a vertex; receiving a second sequence of images of the portion of the user; and based on the second sequence of images, modifying the avatar with a displacement of the vertex to represent a gesture of the avatar.

Type: Application

Filed: February 27, 2024

Publication date: August 29, 2024

Inventors: Yinda Zhang, Sean Ryan Francesco Fanello, Ziqian Bai, Feitong Tan, Zeng Huang, Kripasindhu Sarkar, Danhang Tang, Di Qiu, Abhimitra Meka, Ruofei Du, Mingsong Dou, Sergio Orts Escolano, Rohit Kumar Pandey, Thabo Beeler
INTERMEDIATE VIEW SYNTHESIS BETWEEN WIDE-BASELINE PANORAMAS

Publication number: 20240212184

Abstract: A method including predicting a stereo depth associated with a first panoramic image and a second panoramic image, the first panoramic image and the second panoramic image being captured with a time interlude between the capture of the first panoramic image and the second panoramic image, generating a first mesh representation based on the first panoramic image and a stereo depth corresponding to the first panoramic image, generating a second mesh representation based on the second panoramic image and a stereo depth corresponding to the second panoramic image, and synthesizing a third panoramic image based on fusing the first mesh representation with the second mesh representation.

Type: Application

Filed: April 30, 2021

Publication date: June 27, 2024

Inventors: Ruofei Du, David Li, Danhang Tang, Yinda Zhang
Systems and Methods for Training Models to Predict Dense Correspondences in Images Using Geodesic Distances

Publication number: 20240212325

Abstract: Systems and methods for training models to predict dense correspondences across images such as human images. A model may be trained using synthetic training data created from one or more 3D computer models of a subject. In addition, one or more geodesic distances derived from the surfaces of one or more of the 3D models may be used to generate one or more loss values, which may in turn be used in modifying the model's parameters during training.

Type: Application

Filed: March 6, 2024

Publication date: June 27, 2024

Inventors: Yinda Zhang, Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Sean Ryan Francesco Fanello, Sofien Bouaziz, Cem Keskin, Ruofei Du, Rohit Kumar Pandey, Deqing Sun
SELECTING AVATAR FOR VIDEOCONFERENCE

Publication number: 20240129437

Abstract: A method can include selecting, from at least a first avatar and a second avatar based on at least one attribute of a calendar event associated with a user, a session avatar, the first avatar being based on a first set of images of a user wearing a first outfit and the second avatar being based on a second set of images of the user wearing a second outfit, and presenting the session avatar during a videoconference, the presentation of the session avatar changing based on audio input received from the user during the videoconference.

Type: Application

Filed: October 18, 2022

Publication date: April 18, 2024

Inventors: Yinda Zhang, Ruofei Du
Systems and methods for training models to predict dense correspondences in images using geodesic distances

Patent number: 11954899

Abstract: Systems and methods for training models to predict dense correspondences across images such as human images. A model may be trained using synthetic training data created from one or more 3D computer models of a subject. In addition, one or more geodesic distances derived from the surfaces of one or more of the 3D models may be used to generate one or more loss values, which may in turn be used in modifying the model's parameters during training.

Type: Grant

Filed: March 11, 2021

Date of Patent: April 9, 2024

Assignee: GOOGLE LLC

Inventors: Yinda Zhang, Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Sean Ryan Francesco Fanello, Sofien Bouaziz, Cem Keskin, Ruofei Du, Rohit Kumar Pandey, Deqing Sun
COMPUTER VISION MODELS USING GLOBAL AND LOCAL INFORMATION

Publication number: 20240062046

Abstract: A system including a computer vision model configured to perform a machine learning task is described. The computer vision model includes multiple wrapped convolutional layers, in which each wrapped convolutional layer includes a respective convolutional layer configured to receive, for each time step of multiple time steps, a layer input and to process the layer input to generate an initial output for the current time step, and a respective note-taking module configured to receive the initial output and to process the initial output to generate a feature vector for the current time step, the feature vector representing local information of the wrapped convolutional layer. The model includes a summarization module configured to receive the feature vectors and to process the feature vectors to generate a revision vector for the current time step, the revision vector representing global information of the plurality of wrapped convolutional layers.

Type: Application

Filed: March 31, 2021

Publication date: February 22, 2024

Inventors: Ruofei Du, Yinda Zhang, Weihao Zeng
Systems and Methods for Training Models to Predict Dense Correspondences in Images Using Geodesic Distances

Publication number: 20240046618

Abstract: Systems and methods for training models to predict dense correspondences across images such as human images. A model may be trained using synthetic training data created from one or more 3D computer models of a subject. In addition, one or more geodesic distances derived from the surfaces of one or more of the 3D models may be used to generate one or more loss values, which may in turn be used in modifying the model's parameters during training.

Type: Application

Filed: March 11, 2021

Publication date: February 8, 2024

Inventors: Yinda Zhang, Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Sean Ryan Francesco Fanello, Sofien Bouaziz, Cem Keskin, Ruofei Du, Rohit Kumar Pandey, Deqing Sun
GENERATIVE MODEL FOR 3D FACE SYNTHESIS WITH HDRI RELIGHTING

Publication number: 20240020915

Abstract: Techniques include introducing a neural generator configured to produce novel faces that can be rendered at free camera viewpoints (e.g., at any angle with respect to the camera) and relit under an arbitrary high dynamic range (HDR) light map. A neural implicit intrinsic field takes a randomly sampled latent vector as input and produces as output per-point albedo, volume density, and reflectance properties for any queried 3D location. These outputs are aggregated via a volumetric rendering to produce low resolution albedo, diffuse shading, specular shading, and neural feature maps. The low resolution maps are then upsampled to produce high resolution maps and input into a neural renderer to produce relit images.

Type: Application

Filed: July 17, 2023

Publication date: January 18, 2024

Inventors: Yinda Zhang, Feitong Tan, Sean Ryan Francesco Fanello, Abhimitra Meka, Sergio Orts Escolano, Danhang Tang, Rohit Kumar Pandey, Jonathan James Taylor
VOLUMETRIC PERFORMANCE CAPTURE WITH NEURAL RENDERING

Publication number: 20230419600

Abstract: Example embodiments relate to techniques for volumetric performance capture with neural rendering. A technique may involve initially obtaining images that depict a subject from multiple viewpoints and under various lighting conditions using a light stage and depth data corresponding to the subject using infrared cameras. A neural network may extract features of the subject from the images based on the depth data and map the features into a texture space (e.g., the UV texture space). A neural renderer can be used to generate an output image depicting the subject from a target view such that illumination of the subject in the output image aligns with the target view. The neural render may resample the features of the subject from the texture space to an image space to generate the output image.

Type: Application

Filed: November 5, 2020

Publication date: December 28, 2023

Inventors: Sean Ryan Francesco FANELLO, Abhi MEKA, Rohit Kumar PANDEY, Christian HAENE, Sergio Orts ESCOLANO, Christoph RHEMANN, Paul DEBEVEC, Sofien BOUAZIZ, Thabo BEELER, Ryan OVERBECK, Peter BARNUM, Daniel ERICKSON, Philip DAVIDSON, Yinda ZHANG, Jonathan TAYLOR, Chloe LeGENDRE, Shahram IZADI
Real-time stereo matching using a hierarchical iterative refinement network

Patent number: 11810313

Abstract: According to an aspect, a real-time active stereo system includes a capture system configured to capture stereo data, where the stereo data includes a first input image and a second input image, and a depth sensing computing system configured to predict a depth map. The depth sensing computing system includes a feature extractor configured to extract features from the first and second images at a plurality of resolutions, an initialization engine configured to generate a plurality of depth estimations, where each of the plurality of depth estimations corresponds to a different resolution, and a propagation engine configured to iteratively refine the plurality of depth estimations based on image warping and spatial propagation.

Type: Grant

Filed: February 19, 2021

Date of Patent: November 7, 2023

Assignee: GOOGLE LLC

Inventors: Vladimir Tankovich, Christian Haene, Sean Ryan Francesco Fanello, Yinda Zhang, Shahram Izadi, Sofien Bouaziz, Adarsh Prakash Murthy Kowdle, Sameh Khamis
ACTIVE STEREO DEPTH PREDICTION BASED ON COARSE MATCHING

Publication number: 20230209036

Abstract: An electronic device estimates a depth map of an environment based on matching reduced-resolution stereo depth images captured by depth cameras to generate a coarse disparity (depth) map. The electronic device downsamples depth images captured by the depth cameras and matches sections of the reduced-resolution images to each other to generate a coarse depth map. The electronic device upsamples the coarse depth map to a higher resolution and refines the upsampled depth map to generate a high-resolution depth map to support location-based functionality.

Type: Application

Filed: February 17, 2023

Publication date: June 29, 2023

Inventors: Sameh KHAMIS, Yinda ZHANG, Christoph RHEMANN, Julien VALENTIN, Adarsh KOWDLE, Vladimir TANKOVICH, Michael SCHOENBERG, Shahram IZADI, Thomas FUNKHOUSER, Sean FANELLO
Systems and Methods for Compression of Three-Dimensional Volumetric Representations

Publication number: 20230154051

Abstract: Systems and methods are directed to encoding and/or decoding of the textures/geometry of a three-dimensional volumetric representation. An encoding computing system can obtain voxel blocks from a three-dimensional volumetric representation of an object. The encoding computing system can encode voxel blocks with a machine-learned voxel encoding model to obtain encoded voxel blocks. The encoding computing system can decode the encoded voxel blocks with a machine-learned voxel decoding model to obtain reconstructed voxel blocks. The encoding computing system can generate a reconstructed mesh representation of the object based at least in part on the one or more reconstructed voxel blocks. The encoding computing system can encode textures associated with the voxel blocks according to an encoding scheme and based at least in part on the reconstructed mesh representation of the object to obtain encoded textures.

Type: Application

Filed: April 17, 2020

Publication date: May 18, 2023

Inventors: Danhang Tang, Saurabh Singh, Cem Keskin, Phillip Andrew Chou, Christian Haene, Mingsong Dou, Sean Ryan Francesco Fanello, Jonathan Taylor, Andrea Tagliasacchi, Philip Lindsley Davidson, Yinda Zhang, Onur Gonen Guleryuz, Shahram Izadi, Sofien Bouaziz
Active stereo depth prediction based on coarse matching

Patent number: 11589031

Abstract: An electronic device estimates a depth map of an environment based on matching reduced-resolution stereo depth images captured by depth cameras to generate a coarse disparity (depth) map. The electronic device downsamples depth images captured by the depth cameras and matches sections of the reduced-resolution images to each other to generate a coarse depth map. The electronic device upsamples the coarse depth map to a higher resolution and refines the upsampled depth map to generate a high-resolution depth map to support location-based functionality.

Type: Grant

Filed: September 24, 2019

Date of Patent: February 21, 2023

Assignee: GOOGLE LLC

Inventors: Sameh Khamis, Yinda Zhang, Christoph Rhemann, Julien Valentin, Adarsh Kowdle, Vladimir Tankovich, Michael Schoenberg, Shahram Izadi, Thomas Funkhouser, Sean Fanello
COMPUTATIONALLY EFFICIENT AND ROBUST EAR SADDLE POINT DETECTION

Publication number: 20220405500

Abstract: A computer-implemented method includes receiving a two-dimensional (2-D) side view face image of a person, identifying a bounded portion or area of the 2-D side view face image of the person as an ear region-of-interest (ROI) area showing at least a portion of an ear of the person, and processing the identified ear ROI area of the 2-D side view face image, pixel-by-pixel, through a trained fully convolutional neural network model (FCNN model) to predict a 2-D ear saddle point (ESP) location for the ear shown in the ear ROI area. The FCNN model has an image segmentation architecture.

Type: Application

Filed: June 21, 2021

Publication date: December 22, 2022

Inventors: Mayank Bhargava, Idris Syed Aleem, Yinda Zhang, Sushant Umesh Kulkarni, Rees Anwyl Simmons, Ahmed Gawish
JOINT DEPTH PREDICTION FROM DUAL-CAMERAS AND DUAL-PIXELS

Publication number: 20220343525

Abstract: Example implementations relate to joint depth prediction from dual cameras and dual pixels. An example method may involve obtaining a first set of depth information representing a scene from a first source and a second set of depth information representing the scene from a second source. The method may further involve determining, using a neural network, a joint depth map that conveys respective depths for elements in the scene. The neural network may determine the joint depth map based on a combination of the first set of depth information and the second set of depth information. In addition, the method may involve modifying an image representing the scene based on the joint depth map. For example, background portions of the image may be partially blurred based on the joint depth map.

Type: Application

Filed: April 27, 2020

Publication date: October 27, 2022

Inventors: Rahul GARG, Neal WADHWA, Sean FANELLO, Christian HAENE, Yinda ZHANG, Sergio Orts ESCOLANO, Yael Pritch KNAAN, Marc LEVOY, Shahram IZADI

1 2 next