Patents by Inventor Yinhao ZHU

Yinhao ZHU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240412493
    Abstract: Systems and techniques are provided for processing image data. According to some aspects, a computing device can generate a gradient (e.g., a classifier gradient using a trained classifier) associated with a current sample. The computing device can combine the gradient with an iterative model estimated score function or data associated with the current sample to generate a score function estimate. The computing device can predict, using a diffusion machine learning model and based on the score function estimate, a new sample.
    Type: Application
    Filed: December 12, 2023
    Publication date: December 12, 2024
    Inventors: Risheek GARREPALLI, Yunxiao SHI, Hong CAI, Yinhao ZHU, Shubhankar Mangesh BORSE, Jisoo JEONG, Debasmit DAS, Manish Kumar SINGH, Rajeev YASARLA, Shizhong Steve HAN, Fatih Murat PORIKLI
  • Publication number: 20240386650
    Abstract: Systems and techniques are provided for processing image data corresponding to a scene. A process can include generating a planar distance map including a planar distance value for each pixel of at least one image corresponding to the scene. Planar segmentation is performed based on the planar distance map, a normal map corresponding to the at least one image, and positional encoding information of the planar distance map. A triangular mesh fragment is initialized based on sampling points from each planar segment of a plurality of planar segments from the planar segmentation. Ray-triangle intersections are determined based on performing ray casting for a reconstructed planar mesh including a plurality of triangular mesh fragments each corresponding to a different image. A planar reconstruction and segmentation machine learning network is optimized for the scene, based on training the planar reconstruction and segmentation machine learning network using one or more loss functions.
    Type: Application
    Filed: November 14, 2023
    Publication date: November 21, 2024
    Inventors: Farhad GHAZVINIAN ZANJANI, Leyla MIRVAKHABOVA, Yinhao ZHU, Hong CAI, Fatih Murat PORIKLI
  • Patent number: 12132919
    Abstract: A processor-implemented method for image compression using an artificial neural network (ANN) includes receiving, at an encoder of the ANN, an image and a spatial segmentation map corresponding to the image. The spatial segmentation map indicates one or more regions of interest. The encoder compresses the image according to a controllable spatial bit allocation. The controllable spatial bit allocation is based on a learned quantization bin size.
    Type: Grant
    Filed: November 15, 2022
    Date of Patent: October 29, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Yang Yang, Hoang Cong Minh Le, Yinhao Zhu, Reza Pourreza, Amir Said, Yizhe Zhang, Taco Sebastiaan Cohen
  • Patent number: 12120348
    Abstract: Systems and techniques are described herein for processing media data using a neural network system. For instance, a process can include obtaining a latent representation of a frame of encoded image data and generating, by a plurality of decoder transformer layers of a decoder sub-network using the latent representation of the frame of encoded image data as input, a frame of decoded image data. At least one decoder transformer layer of the plurality of decoder transformer layers includes: one or more transformer blocks for generating one or more patches of features and determining self-attention locally within one or more window partitions and shifted window partitions applied over the one or more patches; and a patch un-merging engine for decreasing a respective size of each patch of the one or more patches.
    Type: Grant
    Filed: September 27, 2021
    Date of Patent: October 15, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Yinhao Zhu, Yang Yang, Taco Sebastiaan Cohen
  • Publication number: 20240303913
    Abstract: Systems and techniques are provided for physical-based light estimation for inverse rendering of indoor scenes. For example, a computing device can obtain an estimated scene geometry based on a multi-view observation of a scene. The computing device can further obtain a light emission mask based on the multi-view observation of the scene. The computing device can also obtain an emitted radiance field based on the multi-view observation of the scene. The computing device can then determine, based on the light emission mask and the emitted radiance field, a geometry of at least one light source of the estimated scene geometry.
    Type: Application
    Filed: March 8, 2023
    Publication date: September 12, 2024
    Inventors: Yinhao ZHU, Rui ZHU, Hong CAI, Fatih Murat PORIKLI
  • Patent number: 12008731
    Abstract: Certain aspects of the present disclosure provide techniques for compressing content using a neural network. An example method generally includes receiving content for compression. The content is encoded into a first latent code space through an encoder implemented by an artificial neural network trained to generate a latent space representation of the content. A first compressed version of the encoded content is generated using a first quantization bin size of a series of quantization bin sizes. A refined compressed version of the encoded content is generated by scaling the first compressed version of the encoded content into one or more second quantization bin sizes smaller than the first quantization bin size, conditioned at least on a value of the first compressed version of the encoded content. The refined compressed version of the encoded content is output for transmission.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: June 11, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Yadong Lu, Yang Yang, Yinhao Zhu, Amir Said, Taco Sebastiaan Cohen
  • Publication number: 20240177329
    Abstract: Systems and techniques are provided for processing sensor data. For example, a process can include determining, using a trained machine learning system, a predicted depth map for an image, the predicted depth map including a respective predicted depth value for each pixel of the image. The process can further include obtaining depth values for the image, the depth values including depth values for less than all pixels of the image from a tracker configured to determine the depth values based on one or more feature points between frames. The process can further include scaling the predicted depth map for the image using the depth values. The output of the process can be scale-correct depth prediction values.
    Type: Application
    Filed: October 4, 2023
    Publication date: May 30, 2024
    Inventors: Hong CAI, Yinhao ZHU, Jisoo JEONG, Yunxiao SHI, Fatih Murat PORIKLI
  • Publication number: 20240144589
    Abstract: Systems and techniques are provided for part segmentation. For example, a process for performing part segmentation can include obtaining a three-dimensional capture of an object. The process can include generating one or more two-dimensional images of the object from the three-dimensional capture of the object. The process can further include processing the one or more two-dimensional images of the object to generate at least one two-dimensional bounding box associated with a part of the object. The process can include performing three-dimensional part segmentation of the part of the object based on a three-dimensional point cloud generated from the one or more two-dimensional images of the object and the at least one two-dimensional bounding box and based on semantically labeled super points which are merged into subgroups associated with the part of the object.
    Type: Application
    Filed: March 1, 2023
    Publication date: May 2, 2024
    Inventors: Minghua LIU, Yinhao ZHU, Hong CAI, Fatih Murat PORIKLI, Hao SU
  • Patent number: 11943460
    Abstract: A computer-implemented method for operating an artificial neural network (ANN) includes receiving an input by the ANN. The ANN generates a latent representation of the input. The latent representation is communicated according to a bit rate based on a learned latent scaling parameter. The latent scaling parameter is learned based on a channel index and a tradeoff parameter value that corresponds to a value that balances the bit rate and a distortion.
    Type: Grant
    Filed: January 11, 2022
    Date of Patent: March 26, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Yadong Lu, Yang Yang, Yinhao Zhu, Amir Said, Reza Pourreza, Taco Sebastiaan Cohen
  • Patent number: 11798197
    Abstract: A method of image compression includes receiving an image. Multiple quantized latent representations are generated to represent features of the image. Each of the quantized latent representations has a different resolution and is generated at staggered timings. Each of the later generated quantized latent representations is conditioned on each of the prior generated quantized latent representations. The multiple quantized latent representations are decoded to reconstruct the image.
    Type: Grant
    Filed: March 12, 2021
    Date of Patent: October 24, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Hoang Cong Minh Le, Reza Pourreza, Yang Yang, Yinhao Zhu, Amir Said, Yizhe Zhang, Taco Sebastiaan Cohen
  • Publication number: 20230262267
    Abstract: This disclosure describes entropy coding techniques for media data coded using neural-based techniques. A media coder is configured to determine a probability distribution function parameter for a data element of a data stream coded by a neural-based media compression technique, wherein the probability distribution function parameter is a logarithmic function of a standard deviation of a probability distribution function of the data stream, determine a code vector based on the probability distribution function parameter, and entropy code the data element using the code vector.
    Type: Application
    Filed: February 11, 2022
    Publication date: August 17, 2023
    Inventors: Amir Said, Yinhao Zhu
  • Publication number: 20230169694
    Abstract: A processor-implemented method for video compression using an artificial neural network (ANN) includes receiving a video via the ANN. The ANN extracts a first set of features of a current frame of the video and a second set of features of a reference frame of the video. The ANN determines an estimate of correlation features between the first set of features of the current frame and the second set of features of the reference frame. The estimate of the correlation features is encoded and transmitted to a receiver.
    Type: Application
    Filed: October 27, 2022
    Publication date: June 1, 2023
    Inventors: Hoang Cong Minh LE, Reza POURREZA, Yang YANG, Yinhao ZHU, Amir SAID, Taco Sebastiaan COHEN
  • Publication number: 20230156207
    Abstract: A processor-implemented method for image compression using an artificial neural network (ANN) includes receiving, at an encoder of the ANN, an image and a spatial segmentation map corresponding to the image. The spatial segmentation map indicates one or more regions of interest. The encoder compresses the image according to a controllable spatial bit allocation. The controllable spatial bit allocation is based on a learned quantization bin size.
    Type: Application
    Filed: November 15, 2022
    Publication date: May 18, 2023
    Inventors: Yang YANG, Hoang Cong Minh LE, Yinhao ZHU, Reza POURREZA, Amir SAID, Yizhe ZHANG, Taco Sebastiaan COHEN
  • Patent number: 11638025
    Abstract: Systems and techniques are described for encoding and/or decoding data based on motion estimation that applies variable-scale warping. An encoding device can receive an input frame and a reference frame that depict a scene at different times. The encoding device can generate an optical flow identifying movements in the scene between the two frames. The encoding device can generate a weight map identifying how finely or coarsely the reference frame can be warped for input frame prediction. The encoding device can generate encoded video data based on the optical flow and the weight map. A decoding device can generate a reconstructed optical flow and a reconstructed weight map from the encoded data. A decoding device can generate a prediction frame by warping the reference frame based on the reconstructed optical flow and the reconstructed weight map. The decoding device can generate a reconstructed input frame based on the prediction frame.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: April 25, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Reza Pourreza, Amir Said, Yang Yang, Yinhao Zhu, Taco Sebastiaan Cohen
  • Publication number: 20230100413
    Abstract: Systems and techniques are described herein for processing media data using a neural network system. For instance, a process can include obtaining a latent representation of a frame of encoded image data and generating, by a plurality of decoder transformer layers of a decoder sub-network using the latent representation of the frame of encoded image data as input, a frame of decoded image data. At least one decoder transformer layer of the plurality of decoder transformer layers includes: one or more transformer blocks for generating one or more patches of features and determining self-attention locally within one or more window partitions and shifted window partitions applied over the one or more patches; and a patch un-merging engine for decreasing a respective size of each patch of the one or more patches.
    Type: Application
    Filed: September 27, 2021
    Publication date: March 30, 2023
    Inventors: Yinhao ZHU, Yang YANG, Taco Sebastiaan COHEN
  • Publication number: 20220303568
    Abstract: Systems and techniques are described for encoding and/or decoding data based on motion estimation that applies variable-scale warping. An encoding device can receive an input frame and a reference frame that depict a scene at different times. The encoding device can generate an optical flow identifying movements in the scene between the two frames. The encoding device can generate a weight map identifying how finely or coarsely the reference frame can be warped for input frame prediction. The encoding device can generate encoded video data based on the optical flow and the weight map. A decoding device can generate a reconstructed optical flow and a reconstructed weight map from the encoded data. A decoding device can generate a prediction frame by warping the reference frame based on the reconstructed optical flow and the reconstructed weight map. The decoding device can generate a reconstructed input frame based on the prediction frame.
    Type: Application
    Filed: March 19, 2021
    Publication date: September 22, 2022
    Inventors: Reza POURREZA, Amir SAID, Yang YANG, Yinhao ZHU, Taco Sebastiaan COHEN
  • Publication number: 20220292725
    Abstract: A method of image compression includes receiving an image. Multiple quantized latent representations are generated to represent features of the image. Each of the quantized latent representations has a different resolution and is generated at staggered timings. Each of the later generated quantized latent representations is conditioned on each of the prior generated quantized latent representations. The multiple quantized latent representations are decoded to reconstruct the image.
    Type: Application
    Filed: March 12, 2021
    Publication date: September 15, 2022
    Inventors: Hoang Cong Minh LE, Reza POURREZA, Yang YANG, Yinhao ZHU, Amir SAID, Yizhe ZHANG, Taco Sebastiaan COHEN
  • Publication number: 20220237740
    Abstract: Certain aspects of the present disclosure provide techniques for compressing content using a neural network. An example method generally includes receiving content for compression. The content is encoded into a first latent code space through an encoder implemented by an artificial neural network trained to generate a latent space representation of the content. A first compressed version of the encoded content is generated using a first quantization bin size of a series of quantization bin sizes. A refined compressed version of the encoded content is generated by scaling the first compressed version of the encoded content into one or more second quantization bin sizes smaller than the first quantization bin size, conditioned at least on a value of the first compressed version of the encoded content. The refined compressed version of the encoded content is output for transmission.
    Type: Application
    Filed: January 24, 2022
    Publication date: July 28, 2022
    Inventors: Yadong LU, Yang YANG, Yinhao ZHU, Amir SAID, Taco Sebastiaan COHEN
  • Patent number: 11399198
    Abstract: Techniques are described for learned bidirectional predicted frame (B-frame) coding. An example method can include receiving a residual associated with a frame of a current time step; determining first motion information for a first reference frame associated with a first time step and second motion information for a second reference frame associated with a second time step, wherein the current time step is after the first time step and before the second time step; determining third motion information for the frame based on the first motion information and second motion information; generating a predicted frame based on the third motion information, first reference frame and second reference frame; and generating, using the predicted frame and residual, a reconstructed B-frame for the current time step, the reconstructed B-frame representing the frame.
    Type: Grant
    Filed: March 1, 2021
    Date of Patent: July 26, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Reza Pourreza, Yang Yang, Amir Said, Yinhao Zhu, Taco Sebastiaan Cohen
  • Publication number: 20220224926
    Abstract: A computer-implemented method for operating an artificial neural network (ANN) includes receiving an input by the ANN. The ANN generates a latent representation of the input. The latent representation is communicated according to a bit rate based on a learned latent scaling parameter. The latent scaling parameter is learned based on a channel index and a tradeoff parameter value that corresponds to a value that balances the bit rate and a distortion.
    Type: Application
    Filed: January 11, 2022
    Publication date: July 14, 2022
    Inventors: Yadong LU, Yang YANG, Yinhao ZHU, Amir SAID, Reza POURREZA, Taco Sebastiaan COHEN