Patents by Inventor Junfeng He

Junfeng He has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

IMAGE DESCRIPTION METHOD AND RELATED DEVICE

Publication number: 20260188032

Abstract: An image description method includes obtaining an image; obtaining an image description model and a description prompt text, the description prompt text indicating an image description rule for performing image description by the image description model; inputting the description prompt text and the image into the image description model; and performing image description on the image based on the description prompt text through the image description model, to obtain a first image description text that conforms to the image description rule. The description prompt text is obtained by correcting an initial prompt text corresponding to the image description rule based on a first text difference between a sample image description text and a predicted image description text. The sample image description text is obtained based on a sample image and the image description rule.

Type: Application

Filed: February 27, 2026

Publication date: July 2, 2026

Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Junxian CAI, Junfeng HE, Xi CHEN

VIDEO GENERATION METHOD, APPARATUS, DEVICE AND MEDIUM

Publication number: 20260154859

Abstract: The disclosed embodiments relate to a video generation method, apparatus, device, and medium. The method comprises: obtaining product information of a target product; determining, based on the product information, character information corresponding to at least two target virtual characters; wherein the character information comprises character characteristics, character lines, and character appearance time; and generating a target video based on the character information of the target virtual characters; wherein the target video is a communication video of the at least two target virtual characters for the target product. The disclosed embodiments can directly generate a communication video of a plurality of virtual characters for the target product, requiring low cost and providing efficient production, thereby effectively showcasing and introducing the relevant product.

Type: Application

Filed: November 25, 2025

Publication date: June 4, 2026

Inventors: Wenhe ZHAO, Yijun ZHAO, Xiaoping LUO, Kaibo YU, Jinju YANG, Weidong YANG, Junfeng HE, Hui REN, Gongmian WANG

CALIBRATED PREFERENCE OPTIMIZATION FOR GENERATIVE NEURAL NETWORKS

Publication number: 20260134289

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a generative neural network that has parameters. In one aspect, one of the methods includes: obtaining a context input; processing, by the generative neural network, the context input to generate a plurality of training outputs; for each objective in a set of objectives and for each of the plurality of training outputs: determining a respective quality score of the training output relative to each other training output in the plurality of training outputs with respect to the objective; and determining a calibrated reward for the training output with respect to the objective based on the respective quality scores of the training output with respect to the objective; selecting a positive training output and a negative training output; and training the generative neural network on the positive training output and the negative training output.

Type: Application

Filed: November 14, 2025

Publication date: May 14, 2026

Inventors: Kyungmin Lee, Yinxiao Li, Feng Yang, Junfeng He, Irfan Aziz Essa, Ming-Hsuan Yang, Xiaohang Li, Junjie Ke
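The reward calibration and pair selection described in the abstract of publication 20260134289 can be illustrated with a minimal sketch. The abstract does not say how quality scores are turned into a calibrated reward or how the positive and negative outputs are chosen, so everything specific here is an assumption made for illustration: the reward is taken to be an output's win rate against the other outputs, rewards are averaged across objectives, and the function names `calibrated_rewards` and `select_pair` are hypothetical.

```python
def calibrated_rewards(scores):
    """Map raw scores for one objective to win rates in [0, 1].

    Assumption: the calibrated reward of an output is the fraction of the
    other outputs it beats, with ties counted as half a win.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two training outputs to compare")
    rewards = []
    for i, s in enumerate(scores):
        wins = 0.0
        for j, t in enumerate(scores):
            if i == j:
                continue
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
        rewards.append(wins / (n - 1))
    return rewards


def select_pair(scores_by_objective):
    """Pick (positive_index, negative_index, averaged_rewards).

    scores_by_objective maps an objective name to a list holding one raw
    score per training output. Averaging across objectives is an assumption.
    """
    per_objective = [calibrated_rewards(s) for s in scores_by_objective.values()]
    n = len(per_objective[0])
    totals = [sum(r[i] for r in per_objective) / len(per_objective) for i in range(n)]
    positive = max(range(n), key=totals.__getitem__)
    negative = min(range(n), key=totals.__getitem__)
    return positive, negative, totals


# Three sampled outputs scored on two objectives with very different scales.
scores = {
    "aesthetics": [0.2, 0.9, 0.5],
    "alignment": [10.0, 30.0, 20.0],
}
positive, negative, totals = select_pair(scores)
```

Because each objective is reduced to a win rate before averaging, objectives with very different raw score ranges contribute on the same scale in this sketch.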

VIDEO PROCESSING METHOD AND RELATED DEVICES

Publication number: 20260120721

Abstract: The present disclosure provides a video processing method and related devices. The method includes: acquiring attribute information of a target object and an initial video, where the initial video includes audio data related to the target object; obtaining a video category and a video effect list of the initial video based on a target text corresponding to the audio data and the attribute information; determining a target material corresponding to an effect object based on the video category and an effect material label; and adding the target material to the initial video based on a target timestamp of the text segment in the audio data, to obtain a target video.

Type: Application

Filed: October 29, 2025

Publication date: April 30, 2026

Inventors: Junfeng HE, Weidong Yang, Jiepeng Cen, Lingfu Li

MODIFYING TARGET REGIONS WITHIN AN IMAGE USING A DIFFUSION NEURAL NETWORK

Publication number: 20260094247

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a diffusion neural network using a region-aware fine-tuning process. After training, the diffusion neural network can be used to generate an image conditioned on a conditioning input.

Type: Application

Filed: October 2, 2025

Publication date: April 2, 2026

Inventors: Paul Adrian Vicol, Yinxiao Li, Xiaoying Xing, Avinab Saha, Mungyung Ryu, Susan Hao, Feng Yang, Deepak Ramachandran, Junfeng He, Gang Li, Sarah Ming Young, Sahil Singla

Automatic Identification of Distracting Vivid Regions in An Image

Publication number: 20260038093

Abstract: Methods and systems for modifying a digital image are described herein. The method can include performing vividness scoring for a plurality of pixels of the digital image, determining one or more candidate pixels based on the vividness scoring for the plurality of pixels, and agglomerating the one or more candidate pixels into one or more suggested agglomerates. The method can also include determining at least one subject of the digital image, removing at least one agglomerate from the one or more suggested agglomerates based on at least one of the at least one subject of the digital image or one or more characteristics of the at least one agglomerate, generating a modified digital image with the one or more suggested agglomerates modified, and outputting the modified digital image.

Type: Application

Filed: July 29, 2022

Publication date: February 5, 2026

Inventors: Orly Liba, Junfeng He, Bryan Eric Feldman, Yael Pritch Knaan, Kfir Aberman

Multimodal Machine-Learned Models for Unified Attention and Response Predictions for Visual Content

Publication number: 20260004191

Abstract: Aspects of the disclosed technology include computer-implemented systems and methods for machine-learned multimodal models. A machine-learned multimodal model includes one or more embedding layers configured to generate one or more image tokens and one or more text tokens in response to the imagery and the text, a transformer encoder configured to receive the one or more image tokens and the one or more text tokens and generate one or more fused image tokens and one or more fused text tokens, a heatmap predictor configured to obtain the one or more fused image tokens and generate at least one image heatmap, and a sequence predictor configured to obtain the one or more fused image tokens and the one or more fused text tokens and generate a predicted sequence associated with the image.

Type: Application

Filed: June 26, 2025

Publication date: January 1, 2026

Inventors: Junfeng He, Gang Li, Peizhao Li, Nachiappan Valliappan, Vidhya Navalpakkam, Yang Li, Kai Jochen Kohlhoff

VISION-LANGUAGE MODEL FOR IMAGE CROPPING THROUGH IN-CONTEXT LEARNING

Publication number: 20260004546

Abstract: The technology provides for enhanced image cropping via in-context learning. It includes an efficient prompt retrieval mechanism for image cropping to automate the selection of in-context examples. It also includes an iterative refinement strategy to iteratively enhance the predicted crops. The image cropping framework is applicable to a wide range of cropping tasks, including free-form cropping, subject-aware cropping, and aspect ratio-aware cropping. The approach employs a trained large vision-language model associated with in-context learning. For instance, given an input image (whether from free-form, subject-aware or aspect ratio-aware cropping), the top-K semantically similar images from a dataset are retrieved as an in-context learning prompt. Then the in-context learning prompt is fed to a pretrained vision-language model to generate a set of crops. The crop candidates of the set are iteratively refined to yield a final output crop.

Type: Application

Filed: June 18, 2025

Publication date: January 1, 2026

Inventors: Feng Yang, Seunghyun Lee, Junjie Ke, Yinxiao Li, Junfeng He, Steven Hickson, Ekaterina Datsenko, Ming-Hsuan Yang, Irfan Essa
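The prompt retrieval step in the abstract of publication 20260004546 (retrieving the top-K semantically similar images as in-context examples) can be sketched as a nearest-neighbour search over image embeddings. The abstract does not name a similarity measure or an embedding model, so cosine similarity, the tuple layout of the dataset, and the function names below are assumptions for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_examples(query_embedding, dataset, k):
    """Return the k dataset items most similar to the query, best first.

    dataset is a list of (image_id, embedding, crop_box) tuples; this layout
    is hypothetical and stands in for whatever the real dataset stores.
    """
    ranked = sorted(
        dataset,
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return ranked[:k]


# Toy two-dimensional embeddings; real embeddings would come from an image encoder.
dataset = [
    ("a", [1.0, 0.0], (0, 0, 10, 10)),
    ("b", [0.0, 1.0], (2, 2, 8, 8)),
    ("c", [0.9, 0.1], (1, 0, 9, 10)),
]
examples = top_k_examples([1.0, 0.0], dataset, k=2)
example_ids = [image_id for image_id, _, _ in examples]
```

The retrieved items and their crop boxes would then be assembled into the in-context prompt; that assembly and the iterative refinement of candidate crops are outside this sketch.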

EMBEDDING CASSETTE MARKING MACHINE AND OPERATING METHOD

Publication number: 20260001170

Abstract: An embedding cassette marking machine and an operating method are provided. The embedding cassette marking machine includes: a housing; a feeding mechanism; a laser marking mechanism for marking the embedding cassette that slides down to the slideway assembly; and a discharging mechanism, which includes a pushing assembly and a material-carrying rail. The pushing assembly includes a pushing block, and the pushing block is configured to reciprocate toward or away from the material-carrying rail, so as to push the marked embedding cassette to the material-carrying rail.

Type: Application

Filed: September 4, 2025

Publication date: January 1, 2026

Inventors: Changyou LIAO, Zengguang JIA, Xufeng YIN, Youfa WANG, Yuan JIANG, Changxin LI, Xiongbing ZHOU, Wentao LI, Weiwei LONG, Hui LIU, Yuhao TANG, Junfeng HE

Feedback Predictions for Machine-Learned Generative Models

Publication number: 20260004490

Abstract: Aspects of the disclosed technology include computer-implemented systems and methods for machine-learned multimodal models for feedback predictions for synthetic content. A machine-learned multimodal model is configured to generate a feature map based at least in part on fusion of image information and text information from a synthetic image and a text prompt. The model is configured to generate a set of text tokens based at least in part on fusion of the image information and the text information. The model is configured to generate at least one misalignment or implausibility heatmap based at least in part on the at least one feature map. The model is configured to generate at least one predicted misalignment sequence based at least in part on the set of text tokens.

Type: Application

Filed: June 26, 2025

Publication date: January 1, 2026

Inventors: Junfeng He, Youwei Liang, Gang Li, Feng Yang, Junjie Ke, Peizhao Li, Vidhya Navalpakkam, Jiao Sun, Yang Li, Kai Jochen Kohlhoff, Jordi Pont-Tuset, Deepak Ramachandran

Techniques for Removing a Distraction in an Image

Publication number: 20250363643

Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.

Type: Application

Filed: August 7, 2025

Publication date: November 27, 2025

Inventors: Kfir Aberman, David Edward Jacobs, Kai Jochen Kohlhoff, Michael Rubinstein, Yossi Gandelsman, Junfeng He, Inbar Mosseri, Yael Pritch Knaan
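The saliency loss in the abstract of publication 20250363643 compares saliency values inside the region of interest to target values. The abstract does not state the form of the comparison, so the sketch below assumes a mean squared error against a single scalar target; the function name `saliency_loss` and the list-of-lists representation are likewise hypothetical.

```python
def saliency_loss(saliency_map, mask, target):
    """Mean squared gap between masked saliency values and a target value.

    saliency_map and mask are equal-sized 2D lists; mask entries that are
    truthy mark the region of interest. Using MSE is an assumption.
    """
    total = 0.0
    count = 0
    for saliency_row, mask_row in zip(saliency_map, mask):
        for value, inside in zip(saliency_row, mask_row):
            if inside:
                total += (value - target) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    return total / count


# A 2x2 saliency map where the left column is the distracting region.
saliency_map = [[0.8, 0.2],
                [0.6, 0.1]]
mask = [[1, 0],
        [1, 0]]
loss = saliency_loss(saliency_map, mask, target=0.0)
```

In a tuning loop of the kind the abstract describes, this value would be recomputed after each edit and used to adjust the editing operator's parameters; the optimizer and the saliency model themselves are not shown.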

Differentially private heatmaps

Patent number: 12468850

Abstract: Improved methods are provided for generating heatmaps or other summary map data from multiple users' data (e.g., probability distributions) in a manner that preserves the privacy of the users' data while also generating heatmaps that are visually similar to the ‘true’ heatmap. These methods include decomposing the average of the users' data (the ‘true’ heatmap) into multiple different spatial scales, injecting random noise into the data at the multiple different spatial scales, and then reconstructing the privacy-preserving heatmap based on the noisy multi-scale representations. The magnitude of the noise injected at each spatial scale is selected to ensure preservation of privacy while also resulting in heatmaps that are visually similar to the ‘true’ heatmap.

Type: Grant

Filed: July 12, 2022

Date of Patent: November 11, 2025

Assignee: Google LLC

Inventors: Vidhya Navalpakkam, Pasin Manurangsi, Nachiappan Valliappan, Kai Kohlhoff, Junfeng He, Badih Ghazi, Shanmugasundaram Ravikumar
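The decompose, add noise, and reconstruct pipeline in the abstract of patent 12468850 can be sketched on a small grid. This is only a structural illustration under stated assumptions: the decomposition is a simple 2x2 averaging pyramid, the noise is Laplace, and the per-scale noise magnitudes are placeholder inputs that are not calibrated to any privacy budget, so the sketch gives no privacy guarantee on its own. All function names are hypothetical.

```python
import random


def downsample(grid):
    """Average each 2x2 block of a square grid with even side length."""
    n = len(grid) // 2
    return [[(grid[2 * r][2 * c] + grid[2 * r][2 * c + 1]
              + grid[2 * r + 1][2 * c] + grid[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(n)] for r in range(n)]


def upsample(grid):
    """Repeat every cell into a 2x2 block."""
    n = len(grid) * 2
    return [[grid[r // 2][c // 2] for c in range(n)] for r in range(n)]


def laplace(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_multiscale(heatmap, noise_scales, rng):
    """Decompose, add per-scale noise, and reconstruct.

    heatmap: square 2D list whose side is a power of two.
    noise_scales: one scale per level, finest detail first, coarsest last.
    """
    details = []
    current = heatmap
    while len(current) > 1:
        coarse = downsample(current)
        up = upsample(coarse)
        size = len(current)
        details.append([[current[r][c] - up[r][c] for c in range(size)]
                        for r in range(size)])
        current = coarse
    levels = details + [current]
    if len(noise_scales) != len(levels):
        raise ValueError("expected %d noise scales" % len(levels))
    noisy = [[[v + laplace(s, rng) for v in row] for row in level]
             for level, s in zip(levels, noise_scales)]
    result = noisy[-1]
    for detail in reversed(noisy[:-1]):
        up = upsample(result)
        size = len(detail)
        result = [[up[r][c] + detail[r][c] for c in range(size)]
                  for r in range(size)]
    return result


average_heatmap = [[0.0, 0.1, 0.0, 0.0],
                   [0.1, 0.5, 0.1, 0.0],
                   [0.0, 0.1, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0]]
private_heatmap = noisy_multiscale(average_heatmap, [0.05, 0.02, 0.01],
                                   random.Random(0))
```

With every noise scale set to zero the reconstruction returns the input, which is a convenient check that the decomposition loses no information before noise is added.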

Machine Learning for Computation of Visual Attention Center

Publication number: 20250316075

Abstract: Provided are systems and methods for training and using a machine-learned model to predict a visual attention center for an image. As one example, the predicted visual attention center for the image can be used in ordering image regions for encoding, decoding, transmitting, and/or loading in a progressive image loading format.

Type: Application

Filed: May 13, 2022

Publication date: October 9, 2025

Inventors: Junfeng He, Moritz Firsching, Jyrki Antero Alakuijala, Kai Jochen Kohlhoff
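The abstract of publication 20250316075 mentions using a predicted visual attention center to order image regions for progressive loading. One way to picture that ordering is to sort fixed-size tiles by their distance from the predicted center. The tiling scheme, the squared Euclidean distance, and the function name `tile_order` are assumptions for illustration; the abstract does not specify how regions are defined or ranked.

```python
def tile_order(width, height, tile, center):
    """Return tile origins (left, top), nearest to the attention center first.

    width, height: image size in pixels; tile: tile side in pixels;
    center: (x, y) of the predicted attention center. Ties are broken by
    top, then left, so the ordering is deterministic.
    """
    cx, cy = center
    tiles = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Edge tiles may be smaller than the nominal tile size.
            mid_x = left + min(tile, width - left) / 2.0
            mid_y = top + min(tile, height - top) / 2.0
            dist2 = (mid_x - cx) ** 2 + (mid_y - cy) ** 2
            tiles.append((dist2, top, left))
    tiles.sort()
    return [(left, top) for _, top, left in tiles]


# A 4x4 image split into 2x2 tiles, with attention predicted near the bottom right.
order = tile_order(width=4, height=4, tile=2, center=(3, 3))
```

An encoder or loader following this order would deliver the tile containing the attention center first and the farthest tile last.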

Techniques for reducing a distraction in an image

Patent number: 12406377

Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.

Type: Grant

Filed: July 1, 2022

Date of Patent: September 2, 2025

Assignee: GOOGLE LLC

Inventors: Kfir Aberman, David Edward Jacobs, Kai Jochen Kohlhoff, Michael Rubinstein, Yossi Gandelsman, Junfeng He, Inbar Mosseri, Yael Pritch Knaan

Eye gaze tracking using neural networks

Patent number: 12254685

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for characterizing a gaze position of a user in a query image. One of the methods includes obtaining a query image of a user captured by a camera of a mobile device; obtaining device characteristics data specifying (i) characteristics of the mobile device, (ii) characteristics of the camera of the mobile device, or (iii) both; and processing a neural network input comprising (i) one or more images derived from the query image and (ii) the device characteristics data using a gaze prediction neural network, wherein the gaze prediction neural network is configured to, at run time and after the gaze prediction neural network has been trained, process the neural network input to generate a neural network output that characterizes a gaze position of the user in the query image.

Type: Grant

Filed: January 9, 2023

Date of Patent: March 18, 2025

Assignee: Google LLC

Inventors: Dmitry Lagun, Junfeng He, Pingmei Xu

Bioaugmentation treatment process for lithium battery producing wastewater

Patent number: 12246977

Abstract: The present invention relates to the technical field of wastewater treatment, and discloses a bioaugmentation treatment process for lithium battery producing wastewater. The method comprises the following steps: 1) introducing wastewater into a hydrolytic acidification tank, and adding Enterobacter sp. NJUST50 and activated sludge to the hydrolytic acidification tank for hydrolytic acidification treatment; 2) introducing the effluent into an anoxic tank, and adding Enterobacter sp. NJUST50 and anaerobic activated sludge for anoxic treatment; 3) introducing the effluent into an aerobic tank, and adding Enterobacter sp. NJUST50 and aerobic activated sludge for aerobic treatment; 4) introducing the effluent into an anoxic filter tank, and adding Enterobacter sp. NJUST50 and anaerobic activated sludge to the filter tank for treatment; and 5) introducing the effluent into a biological aerated filter tank, and adding a sludge mixture of Enterobacter sp.

Type: Grant

Filed: August 8, 2022

Date of Patent: March 11, 2025

Assignees: Nanjing University of Science and Technology, Zhenrun Environmental Science and Technology Co., Ltd.

Inventors: Jinyou Shen, Hebing Zhang, Jing Wang, Junfeng He, Xinbai Jiang, Hong Wang, Cheng Hou, Xiaodong Liu

EYE GAZE TRACKING USING NEURAL NETWORKS

Publication number: 20230274537

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for characterizing a gaze position of a user in a query image. One of the methods includes obtaining a query image of a user captured by a camera of a mobile device; obtaining device characteristics data specifying (i) characteristics of the mobile device, (ii) characteristics of the camera of the mobile device, or (iii) both; and processing a neural network input comprising (i) one or more images derived from the query image and (ii) the device characteristics data using a gaze prediction neural network, wherein the gaze prediction neural network is configured to, at run time and after the gaze prediction neural network has been trained, process the neural network input to generate a neural network output that characterizes a gaze position of the user in the query image.

Type: Application

Filed: January 9, 2023

Publication date: August 31, 2023

Inventors: Dmitry Lagun, Junfeng He, Pingmei Xu

Differentially Private Heatmaps

Publication number: 20230032705

Abstract: Improved methods are provided for generating heatmaps or other summary map data from multiple users' data (e.g., probability distributions) in a manner that preserves the privacy of the users' data while also generating heatmaps that are visually similar to the ‘true’ heatmap. These methods include decomposing the average of the users' data (the ‘true’ heatmap) into multiple different spatial scales, injecting random noise into the data at the multiple different spatial scales, and then reconstructing the privacy-preserving heatmap based on the noisy multi-scale representations. The magnitude of the noise injected at each spatial scale is selected to ensure preservation of privacy while also resulting in heatmaps that are visually similar to the ‘true’ heatmap.

Type: Application

Filed: July 12, 2022

Publication date: February 2, 2023

Inventors: Vidhya Navalpakkam, Pasin Manurangsi, Nachiappan Valliappan, Kai Kohlhoff, Junfeng He, Badih Ghazi, Shanmugasundaram Ravikumar

Deep Saliency Prior

Publication number: 20230015117

Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.

Type: Application

Filed: July 1, 2022

Publication date: January 19, 2023

Inventors: Kfir Aberman, David Edward Jacobs, Kai Jochen Kohlhoff, Michael Rubinstein, Yossi Gandelsman, Junfeng He, Inbar Mosseri, Yael Pritch Knaan

Eye gaze tracking using neural networks

Patent number: 11551377

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for characterizing a gaze position of a user in a query image. One of the methods includes obtaining a query image of a user captured by a camera of a mobile device; obtaining device characteristics data specifying (i) characteristics of the mobile device, (ii) characteristics of the camera of the mobile device, or (iii) both; and processing a neural network input comprising (i) one or more images derived from the query image and (ii) the device characteristics data using a gaze prediction neural network, wherein the gaze prediction neural network is configured to, at run time and after the gaze prediction neural network has been trained, process the neural network input to generate a neural network output that characterizes a gaze position of the user in the query image.

Type: Grant

Filed: November 23, 2020

Date of Patent: January 10, 2023

Assignee: Google LLC

Inventors: Dmitry Lagun, Junfeng He, Pingmei Xu