Patents by Inventor Kai Jochen Kohlhoff

Kai Jochen Kohlhoff has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20260004191
    Abstract: Aspects of the disclosed technology include computer-implemented systems and methods for machine-learned multimodal models. A machine-learned multimodal model includes one or more embedding layers configured to generate one or more image tokens and one or more text tokens in response to the imagery and the text, a transformer encoder configured to receive the one or more image tokens and the one or more text tokens and generate one or more fused image tokens and one or more fused text tokens, a heatmap predictor configured to obtain the one or more fused image tokens and generate at least one image heatmap, and a sequence predictor configured to obtain the one or more fused image tokens and the one or more fused text tokens and generate a predicted sequence associated with the image.
    Type: Application
    Filed: June 26, 2025
    Publication date: January 1, 2026
    Inventors: Junfeng He, Gang Li, Peizhao Li, Nachiappan Valliappan, Vidhya Navalpakkam, Yang Li, Kai Jochen Kohlhoff
  • Publication number: 20260004490
    Abstract: Aspects of the disclosed technology include computer-implemented systems and methods for machine-learned multimodal models for feedback predictions for synthetic content. A machine-learned multimodal model is configured to generate a feature map based at least in part on fusion of image information and text information from a synthetic image and a text prompt. The model is configured to generate a set of text tokens based at least in part on fusion of the image information and the text information. The model is configured to generate at least one misalignment or implausibility heatmap based at least in part on the at least one feature map. The model is configured to generate at least one predicted misalignment sequence based at least in part on the set of text tokens.
    Type: Application
    Filed: June 26, 2025
    Publication date: January 1, 2026
    Inventors: Junfeng He, Youwei Liang, Gang Li, Feng Yang, Junjie Ke, Peizhao Li, Vidhya Navalpakkam, Jiao Sun, Yang Li, Kai Jochen Kohlhoff, Jordi Pont-Tuset, Deepak Ramachandran
  • Publication number: 20250363643
    Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.
    Type: Application
    Filed: August 7, 2025
    Publication date: November 27, 2025
    Inventors: Kfir Aberman, David Edward Jacobs, Kai Jochen Kohlhoff, Michael Rubinstein, Yossi Gandelsman, Junfeng He, Inbar Mosseri, Yael Pritch Knaan
  • Publication number: 20250316075
    Abstract: Provided are systems and methods for training and using a machine-learned model to predict a visual attention center for an image. As one example, the predicted visual attention center for the image can be used in ordering image regions for encoding, decoding, transmitting, and/or loading in a progressive image loading format.
    Type: Application
    Filed: May 13, 2022
    Publication date: October 9, 2025
    Inventors: Junfeng He, Moritz Firsching, Jyrki Antero Alakuijala, Kai Jochen Kohlhoff
  • Patent number: 12406377
    Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.
    Type: Grant
    Filed: July 1, 2022
    Date of Patent: September 2, 2025
    Assignee: GOOGLE LLC
    Inventors: Kfir Aberman, David Edward Jacobs, Kai Jochen Kohlhoff, Michael Rubinstein, Yossi Gandelsman, Junfeng He, Inbar Mosseri, Yael Pritch Knaan
  • Publication number: 20230015117
    Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.
    Type: Application
    Filed: July 1, 2022
    Publication date: January 19, 2023
    Inventors: Kfir Aberman, David Edward Jacobs, Kai Jochen Kohlhoff, Michael Rubinstein, Yossi Gandelsman, Junfeng He, Inbar Mosseri, Yael Pritch Knaan
  • Patent number: 9703373
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying a direction in which a user is looking. In one aspect, a method includes receiving an image of a sequence of images. The image can depict a face of a user. A template image for each particular facial feature point can be compared to one or more image portions of the image. The template image for the particular facial feature point can include a portion of a previous image of the sequence of images that depicted the facial feature point. Based on the comparison, a matching image portion of the image that matches the template image for the particular facial feature point is identified. A location of the matching image portion is identified in the image. A direction in which the user is looking is determined based on the identified location for each template image.
    Type: Grant
    Filed: April 23, 2015
    Date of Patent: July 11, 2017
    Assignee: Google Inc.
    Inventors: Kai Jochen Kohlhoff, Xiaoyi Zhang
  • Publication number: 20150309569
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying a direction in which a user is looking. In one aspect, a method includes receiving an image of a sequence of images. The image can depict a face of a user. A template image for each particular facial feature point can be compared to one or more image portions of the image. The template image for the particular facial feature point can include a portion of a previous image of the sequence of images that depicted the facial feature point. Based on the comparison, a matching image portion of the image that matches the template image for the particular facial feature point is identified. A location of the matching image portion is identified in the image. A direction in which the user is looking is determined based on the identified location for each template image.
    Type: Application
    Filed: April 23, 2015
    Publication date: October 29, 2015
    Inventors: Kai Jochen Kohlhoff, Xiaoyi Zhang
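
The template comparison described in the abstract of publication 20150309569 (and patent 9703373) can be illustrated with a minimal sketch. It does not reproduce the patented method: the abstract does not say how the template is compared to image portions, so the sum-of-squared-differences score below is an assumption. The names `match_template` and `crop` and the tiny grayscale grids are hypothetical too. The sketch only shows how a patch cut from a previous frame can be located in the current frame.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def crop(image: Image, row: int, col: int, height: int, width: int) -> Image:
    """Cut a height x width patch whose top-left corner is (row, col)."""
    return [line[col:col + width] for line in image[row:row + height]]


def match_template(image: Image, template: Image) -> Tuple[int, int]:
    """Return the (row, col) top-left corner of the best-matching portion.

    Assumption: similarity is the sum of squared differences (lower is
    better). The source abstract does not name a comparison measure.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best_pos, best_score = (0, 0), float("inf")
    # Slide the template over every position where it fits entirely.
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(th)
                for dc in range(tw)
            )
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos


# Hypothetical frames: a bright 2x2 "feature" moves one pixel down and right.
previous = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 6, 0, 0],
    [0, 0, 0, 0, 0],
]
current = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
]

# The template is a portion of the previous image that depicted the feature.
template = crop(previous, 1, 1, 2, 2)
location = match_template(current, template)
print(location)
```

In this toy case the feature is found at row 2, column 2 of the current frame. Per the abstract, the locations identified for each facial feature point would then feed the determination of gaze direction, a step this sketch does not attempt.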