Patents by Inventor Yinhao ZHU
Yinhao ZHU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240412493
Abstract: Systems and techniques are provided for processing image data. According to some aspects, a computing device can generate a gradient (e.g., a classifier gradient using a trained classifier) associated with a current sample. The computing device can combine the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate. The computing device can predict, using the diffusion machine learning model and based on the score function estimate, a new sample.
Type: Application
Filed: December 12, 2023
Publication date: December 12, 2024
Inventors: Risheek GARREPALLI, Yunxiao SHI, Hong CAI, Yinhao ZHU, Shubhankar Mangesh BORSE, Jisoo JEONG, Debasmit DAS, Manish Kumar SINGH, Rajeev YASARLA, Shizhong Steve HAN, Fatih Murat PORIKLI
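Illustrative note: the combination step in this abstract resembles classifier-guided sampling. The sketch below is only a toy under stated assumptions, not the claimed method: the function names, the additive weighting by `guidance_scale`, and the noise-free update with `step_size` are all hypothetical choices made for illustration.

```python
def guided_score(model_score, classifier_grad, guidance_scale=1.0):
    """Combine a model-estimated score with a classifier gradient.

    Assumption: a simple weighted sum; the abstract does not state the rule.
    """
    return [s + guidance_scale * g for s, g in zip(model_score, classifier_grad)]


def predict_next_sample(sample, score, step_size=0.5):
    """Move the current sample along the score estimate (noise term omitted)."""
    return [x + step_size * s for x, s in zip(sample, score)]


# Toy two-dimensional example with made-up numbers.
current_sample = [1.0, -2.0]
model_score = [-1.0, 2.0]
classifier_grad = [0.5, 0.5]

score_estimate = guided_score(model_score, classifier_grad, guidance_scale=2.0)
new_sample = predict_next_sample(current_sample, score_estimate, step_size=0.5)
```

A real diffusion sampler would obtain both inputs from trained networks and would usually add noise at each step; both are left out here to keep the sketch deterministic.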
-
Publication number: 20240386650
Abstract: Systems and techniques are provided for processing image data corresponding to a scene. A process can include generating a planar distance map including a planar distance value for each pixel of at least one image corresponding to the scene. Planar segmentation is performed based on the planar distance map, a normal map corresponding to the at least one image, and positional encoding information of the planar distance map. A triangular mesh fragment is initialized based on sampling points from each planar segment of a plurality of planar segments from the planar segmentation. Ray-triangle intersections are determined based on performing ray casting for a reconstructed planar mesh including a plurality of triangular mesh fragments each corresponding to a different image. A planar reconstruction and segmentation machine learning network is optimized for the scene, based on training the planar reconstruction and segmentation machine learning network using one or more loss functions.
Type: Application
Filed: November 14, 2023
Publication date: November 21, 2024
Inventors: Farhad GHAZVINIAN ZANJANI, Leyla MIRVAKHABOVA, Yinhao ZHU, Hong CAI, Fatih Murat PORIKLI
-
Patent number: 12132919
Abstract: A processor-implemented method for image compression using an artificial neural network (ANN) includes receiving, at an encoder of the ANN, an image and a spatial segmentation map corresponding to the image. The spatial segmentation map indicates one or more regions of interest. The encoder compresses the image according to a controllable spatial bit allocation. The controllable spatial bit allocation is based on a learned quantization bin size.
Type: Grant
Filed: November 15, 2022
Date of Patent: October 29, 2024
Assignee: QUALCOMM Incorporated
Inventors: Yang Yang, Hoang Cong Minh Le, Yinhao Zhu, Reza Pourreza, Amir Said, Yizhe Zhang, Taco Sebastiaan Cohen
-
Patent number: 12120348
Abstract: Systems and techniques are described herein for processing media data using a neural network system. For instance, a process can include obtaining a latent representation of a frame of encoded image data and generating, by a plurality of decoder transformer layers of a decoder sub-network using the latent representation of the frame of encoded image data as input, a frame of decoded image data. At least one decoder transformer layer of the plurality of decoder transformer layers includes: one or more transformer blocks for generating one or more patches of features and determining self-attention locally within one or more window partitions and shifted window partitions applied over the one or more patches; and a patch un-merging engine for decreasing a respective size of each patch of the one or more patches.
Type: Grant
Filed: September 27, 2021
Date of Patent: October 15, 2024
Assignee: QUALCOMM INCORPORATED
Inventors: Yinhao Zhu, Yang Yang, Taco Sebastiaan Cohen
-
Publication number: 20240303913
Abstract: Systems and techniques are provided for physical-based light estimation for inverse rendering of indoor scenes. For example, a computing device can obtain an estimated scene geometry based on a multi-view observation of a scene. The computing device can further obtain a light emission mask based on the multi-view observation of the scene. The computing device can also obtain an emitted radiance field based on the multi-view observation of the scene. The computing device can then determine, based on the light emission mask and the emitted radiance field, a geometry of at least one light source of the estimated scene geometry.
Type: Application
Filed: March 8, 2023
Publication date: September 12, 2024
Inventors: Yinhao ZHU, Rui ZHU, Hong CAI, Fatih Murat PORIKLI
-
Patent number: 12008731
Abstract: Certain aspects of the present disclosure provide techniques for compressing content using a neural network. An example method generally includes receiving content for compression. The content is encoded into a first latent code space through an encoder implemented by an artificial neural network trained to generate a latent space representation of the content. A first compressed version of the encoded content is generated using a first quantization bin size of a series of quantization bin sizes. A refined compressed version of the encoded content is generated by scaling the first compressed version of the encoded content into one or more second quantization bin sizes smaller than the first quantization bin size, conditioned at least on a value of the first compressed version of the encoded content. The refined compressed version of the encoded content is output for transmission.
Type: Grant
Filed: January 24, 2022
Date of Patent: June 11, 2024
Assignee: QUALCOMM Incorporated
Inventors: Yadong Lu, Yang Yang, Yinhao Zhu, Amir Said, Taco Sebastiaan Cohen
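Illustrative note: the coarse-then-fine quantization idea in this abstract can be pictured with a scalar toy. This is a minimal sketch under assumptions, not the patented procedure: the helper names `quantize` and `refine` are hypothetical, the latent is a single number rather than a learned tensor, and the refinement simply re-quantizes the residual around the coarse value with a smaller bin.

```python
def quantize(value, bin_size):
    """Return the integer index of the nearest bin of width bin_size."""
    return round(value / bin_size)


def refine(value, coarse_index, coarse_bin, fine_bin):
    """Refine a coarse quantization using a smaller bin size.

    The fine index is computed relative to the coarse reconstruction, so the
    refinement is conditioned on the value of the first compressed version.
    """
    coarse_value = coarse_index * coarse_bin
    fine_index = round((value - coarse_value) / fine_bin)
    return coarse_value + fine_index * fine_bin, fine_index


latent = 2.7                              # made-up scalar latent
coarse_index = quantize(latent, 1.0)      # first, larger bin size
refined, fine_index = refine(latent, coarse_index, 1.0, 0.25)
```

In this toy the coarse reconstruction is 3.0 and the refinement moves it one fine bin down, to about 2.75, which is closer to the original latent.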
-
Publication number: 20240177329
Abstract: Systems and techniques are provided for processing sensor data. For example, a process can include determining, using a trained machine learning system, a predicted depth map for an image, the predicted depth map including a respective predicted depth value for each pixel of the image. The process can further include obtaining depth values for the image, the depth values including depth values for less than all pixels of the image from a tracker configured to determine the depth values based on one or more feature points between frames. The process can further include scaling the predicted depth map for the image using the depth values. The output of the process can be scale-correct depth prediction values.
Type: Application
Filed: October 4, 2023
Publication date: May 30, 2024
Inventors: Hong CAI, Yinhao ZHU, Jisoo JEONG, Yunxiao SHI, Fatih Murat PORIKLI
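Illustrative note: one simple way to picture scaling a dense predicted depth map with sparse depth values is a single global scale factor. The sketch below assumes a median-of-ratios rule, which the abstract does not specify; the function name `scale_depth_map` and the data layout (nested lists plus a dictionary of pixel coordinates) are hypothetical.

```python
from statistics import median


def scale_depth_map(predicted, sparse):
    """Rescale a dense predicted depth map using sparse depth values.

    predicted: 2D list of predicted depths (arbitrary scale).
    sparse: dict mapping (row, col) to a tracked depth for that pixel.
    Assumption: one global scale, taken as the median depth ratio.
    """
    ratios = [
        depth / predicted[r][c]
        for (r, c), depth in sparse.items()
        if predicted[r][c] > 0
    ]
    if not ratios:
        raise ValueError("no usable sparse depth values")
    scale = median(ratios)
    return [[scale * d for d in row] for row in predicted], scale


# Toy 2x2 image with depth known at three of the four pixels.
predicted = [[1.0, 2.0], [4.0, 8.0]]
sparse = {(0, 0): 2.0, (1, 1): 16.0, (0, 1): 4.4}
scaled, scale = scale_depth_map(predicted, sparse)
```

The median keeps a single noisy tracked point (here the pixel at row 0, column 1) from skewing the scale; a least-squares fit would be another reasonable choice.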
-
Publication number: 20240144589
Abstract: Systems and techniques are provided for part segmentation. For example, a process for performing part segmentation can include obtaining a three-dimensional capture of an object. The method can include generating one or more two-dimensional images of the object from the three-dimensional capture of the object. The method can further include processing the one or more two-dimensional images of the object to generate at least one two-dimensional bounding box associated with a part of the object. The method can include performing three-dimensional part segmentation of the part of the object based on a three-dimensional point cloud generated from the one or more two-dimensional images of the object and the at least one two-dimensional bounding box and based on semantically labeled super points which are merged into subgroups associated with the part of the object.
Type: Application
Filed: March 1, 2023
Publication date: May 2, 2024
Inventors: Minghua LIU, Yinhao ZHU, Hong CAI, Fatih Murat PORIKLI, Hao SU
-
Patent number: 11943460
Abstract: A computer-implemented method for operating an artificial neural network (ANN) includes receiving an input by the ANN. The ANN generates a latent representation of the input. The latent representation is communicated according to a bit rate based on a learned latent scaling parameter. The latent scaling parameter is learned based on a channel index and a tradeoff parameter value that corresponds to a value that balances the bit rate and a distortion.
Type: Grant
Filed: January 11, 2022
Date of Patent: March 26, 2024
Assignee: QUALCOMM INCORPORATED
Inventors: Yadong Lu, Yang Yang, Yinhao Zhu, Amir Said, Reza Pourreza, Taco Sebastiaan Cohen
-
Patent number: 11798197
Abstract: A method of image compression includes receiving an image. Multiple quantized latent representations are generated to represent features of the image. Each of the quantized latent representations has a different resolution and is generated at staggered timings. Each of the later generated quantized latent representations is conditioned on each of the prior generated quantized latent representations. The multiple quantized latent representations are decoded to reconstruct the image.
Type: Grant
Filed: March 12, 2021
Date of Patent: October 24, 2023
Assignee: QUALCOMM Incorporated
Inventors: Hoang Cong Minh Le, Reza Pourreza, Yang Yang, Yinhao Zhu, Amir Said, Yizhe Zhang, Taco Sebastiaan Cohen
-
Publication number: 20230262267
Abstract: This disclosure describes entropy coding techniques for media data coded using neural-based techniques. A media coder is configured to determine a probability distribution function parameter for a data element of a data stream coded by a neural-based media compression technique, wherein the probability distribution function parameter is a logarithmic function of a standard deviation of a probability distribution function of the data stream, determine a code vector based on the probability distribution function parameter, and entropy code the data element using the code vector.
Type: Application
Filed: February 11, 2022
Publication date: August 17, 2023
Inventors: Amir Said, Yinhao Zhu
-
Publication number: 20230169694
Abstract: A processor-implemented method for video compression using an artificial neural network (ANN) includes receiving a video via the ANN. The ANN extracts a first set of features of a current frame of the video and a second set of features of a reference frame of the video. The ANN determines an estimate of correlation features between the first set of features of the current frame and the second set of features of the reference frame. The estimate of the correlation features is encoded and transmitted to a receiver.
Type: Application
Filed: October 27, 2022
Publication date: June 1, 2023
Inventors: Hoang Cong Minh LE, Reza POURREZA, Yang YANG, Yinhao ZHU, Amir SAID, Taco Sebastiaan COHEN
-
Publication number: 20230156207
Abstract: A processor-implemented method for image compression using an artificial neural network (ANN) includes receiving, at an encoder of the ANN, an image and a spatial segmentation map corresponding to the image. The spatial segmentation map indicates one or more regions of interest. The encoder compresses the image according to a controllable spatial bit allocation. The controllable spatial bit allocation is based on a learned quantization bin size.
Type: Application
Filed: November 15, 2022
Publication date: May 18, 2023
Inventors: Yang YANG, Hoang Cong Minh LE, Yinhao ZHU, Reza POURREZA, Amir SAID, Yizhe ZHANG, Taco Sebastiaan COHEN
-
Patent number: 11638025
Abstract: Systems and techniques are described for encoding and/or decoding data based on motion estimation that applies variable-scale warping. An encoding device can receive an input frame and a reference frame that depict a scene at different times. The encoding device can generate an optical flow identifying movements in the scene between the two frames. The encoding device can generate a weight map identifying how finely or coarsely the reference frame can be warped for input frame prediction. The encoding device can generate encoded video data based on the optical flow and the weight map. A decoding device can generate a reconstructed optical flow and a reconstructed weight map from the encoded data. A decoding device can generate a prediction frame by warping the reference frame based on the reconstructed optical flow and the reconstructed weight map. The decoding device can generate a reconstructed input frame based on the prediction frame.
Type: Grant
Filed: March 19, 2021
Date of Patent: April 25, 2023
Assignee: QUALCOMM Incorporated
Inventors: Reza Pourreza, Amir Said, Yang Yang, Yinhao Zhu, Taco Sebastiaan Cohen
-
Publication number: 20230100413
Abstract: Systems and techniques are described herein for processing media data using a neural network system. For instance, a process can include obtaining a latent representation of a frame of encoded image data and generating, by a plurality of decoder transformer layers of a decoder sub-network using the latent representation of the frame of encoded image data as input, a frame of decoded image data. At least one decoder transformer layer of the plurality of decoder transformer layers includes: one or more transformer blocks for generating one or more patches of features and determining self-attention locally within one or more window partitions and shifted window partitions applied over the one or more patches; and a patch un-merging engine for decreasing a respective size of each patch of the one or more patches.
Type: Application
Filed: September 27, 2021
Publication date: March 30, 2023
Inventors: Yinhao ZHU, Yang YANG, Taco Sebastiaan COHEN
-
Publication number: 20220303568
Abstract: Systems and techniques are described for encoding and/or decoding data based on motion estimation that applies variable-scale warping. An encoding device can receive an input frame and a reference frame that depict a scene at different times. The encoding device can generate an optical flow identifying movements in the scene between the two frames. The encoding device can generate a weight map identifying how finely or coarsely the reference frame can be warped for input frame prediction. The encoding device can generate encoded video data based on the optical flow and the weight map. A decoding device can generate a reconstructed optical flow and a reconstructed weight map from the encoded data. A decoding device can generate a prediction frame by warping the reference frame based on the reconstructed optical flow and the reconstructed weight map. The decoding device can generate a reconstructed input frame based on the prediction frame.
Type: Application
Filed: March 19, 2021
Publication date: September 22, 2022
Inventors: Reza POURREZA, Amir SAID, Yang YANG, Yinhao ZHU, Taco Sebastiaan COHEN
-
Publication number: 20220292725
Abstract: A method of image compression includes receiving an image. Multiple quantized latent representations are generated to represent features of the image. Each of the quantized latent representations has a different resolution and is generated at staggered timings. Each of the later generated quantized latent representations is conditioned on each of the prior generated quantized latent representations. The multiple quantized latent representations are decoded to reconstruct the image.
Type: Application
Filed: March 12, 2021
Publication date: September 15, 2022
Inventors: Hoang Cong Minh LE, Reza POURREZA, Yang YANG, Yinhao ZHU, Amir SAID, Yizhe ZHANG, Taco Sebastiaan COHEN
-
Publication number: 20220237740
Abstract: Certain aspects of the present disclosure provide techniques for compressing content using a neural network. An example method generally includes receiving content for compression. The content is encoded into a first latent code space through an encoder implemented by an artificial neural network trained to generate a latent space representation of the content. A first compressed version of the encoded content is generated using a first quantization bin size of a series of quantization bin sizes. A refined compressed version of the encoded content is generated by scaling the first compressed version of the encoded content into one or more second quantization bin sizes smaller than the first quantization bin size, conditioned at least on a value of the first compressed version of the encoded content. The refined compressed version of the encoded content is output for transmission.
Type: Application
Filed: January 24, 2022
Publication date: July 28, 2022
Inventors: Yadong LU, Yang YANG, Yinhao ZHU, Amir SAID, Taco Sebastiaan COHEN
-
Patent number: 11399198
Abstract: Techniques are described for learned bidirectional predicted frame (B-frame) coding. An example method can include receiving a residual associated with a frame of a current time step; determining first motion information for a first reference frame associated with a first time step and second motion information for a second reference frame associated with a second time step, wherein the current time step is after the first time step and before the second time step; determining third motion information for the frame based on the first motion information and second motion information; generating a predicted frame based on the third motion information, first reference frame and second reference frame; and generating, using the predicted frame and residual, a reconstructed B-frame for the current time step, the reconstructed B-frame representing the frame.
Type: Grant
Filed: March 1, 2021
Date of Patent: July 26, 2022
Assignee: QUALCOMM Incorporated
Inventors: Reza Pourreza, Yang Yang, Amir Said, Yinhao Zhu, Taco Sebastiaan Cohen
-
Publication number: 20220224926
Abstract: A computer-implemented method for operating an artificial neural network (ANN) includes receiving an input by the ANN. The ANN generates a latent representation of the input. The latent representation is communicated according to a bit rate based on a learned latent scaling parameter. The latent scaling parameter is learned based on a channel index and a tradeoff parameter value that corresponds to a value that balances the bit rate and a distortion.
Type: Application
Filed: January 11, 2022
Publication date: July 14, 2022
Inventors: Yadong LU, Yang YANG, Yinhao ZHU, Amir SAID, Reza POURREZA, Taco Sebastiaan COHEN