Patents by Inventor Changbo HU

Changbo HU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Processing image-bearing electronic documents using a multimodal fusion framework

Patent number: 11301732

Abstract: A computer-implemented technique uses one or more neural networks to identify at least one item name associated with an input image using a multi-modal fusion approach. The technique is said to be multi-modal because it collects and processes different kinds of evidence regarding each detected item name. The technique is said to adopt a fusion approach because it fuses the multi-modal evidence into an output conclusion that identifies at least one item name associated with the input image. In one example, a first mode collects evidence by identifying and analyzing regions in the input image that are likely to include item name-related information. A second mode collects and analyzes any text that appears as part of the input image itself. A third mode collects and analyzes text that is not included in the input image itself, but is nonetheless associated with the input image.
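
The fusion step described in this abstract can be illustrated with a toy late-fusion sketch. This is not the patented method: the abstract does not say how evidence is fused, so the weighted sum below is a simple stand-in, and the function name `fuse_evidence`, the mode names, the weights, and the sample scores are all hypothetical.

```python
from collections import defaultdict


def fuse_evidence(mode_scores, weights):
    """Combine per-mode candidate scores into one ranked list.

    mode_scores: {mode_name: {item_name: score in [0, 1]}}
    weights:     {mode_name: non-negative weight}
    Assumption: a plain weighted sum stands in for the fusion step.
    """
    totals = defaultdict(float)
    for mode, scores in mode_scores.items():
        w = weights.get(mode, 0.0)
        for item_name, score in scores.items():
            totals[item_name] += w * score
    # Highest fused score first; ties broken alphabetically.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical evidence from the three modes named in the abstract.
evidence = {
    "image_regions": {"Acme": 0.7, "Globex": 0.2},    # regions in the image
    "text_in_image": {"Acme": 0.9},                   # text inside the image
    "associated_text": {"Globex": 0.6, "Acme": 0.3},  # text around the image
}
weights = {"image_regions": 0.5, "text_in_image": 0.3, "associated_text": 0.2}

ranked = fuse_evidence(evidence, weights)
best_name = ranked[0][0]
print(best_name)
```

In a real system each mode's scores would come from a neural network rather than a hand-written dictionary.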

Type: Grant

Filed: March 25, 2020

Date of Patent: April 12, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Changbo Hu, Qun Li, Ruofei Zhang, Keng-hao Chang

Multi-task GAN, and image translator and image classifier trained thereby

Patent number: 11263487

Abstract: A computer-implemented technique uses a generative adversarial network (GAN) to jointly train a generator neural network (“generator”) and a discriminator neural network (“discriminator”). Unlike traditional GAN designs, the discriminator performs the dual role of: (a) determining one or more attribute values associated with an object depicted in an input image fed to the discriminator; and (b) determining whether the input image fed to the discriminator is real or synthesized by the generator. Also unlike traditional GAN designs, an image classifier can make use of a model produced by the GAN's discriminator. The generator receives generator input information that includes a conditional input image and one or more conditional values that express desired characteristics of the generator output image. The discriminator receives the conditional input image in conjunction with a discriminator input image, which corresponds to either the generator output image or a real image.
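
The discriminator's dual role can be illustrated by a combined loss with two terms, one per task. This is a minimal sketch under stated assumptions, not the patent's formulation: the abstract does not give a loss function, so the use of cross-entropy terms, the function names, and the `attr_weight` parameter are all hypothetical.

```python
import math

EPS = 1e-7  # avoids log(0)


def binary_cross_entropy(p_real, is_real):
    """Loss for the real-versus-synthesized decision."""
    p = min(max(p_real, EPS), 1.0 - EPS)
    return -math.log(p) if is_real else -math.log(1.0 - p)


def categorical_cross_entropy(probs, true_index):
    """Loss for predicting one attribute value out of several."""
    return -math.log(max(probs[true_index], EPS))


def discriminator_loss(p_real, is_real, attr_probs, attr_label, attr_weight=1.0):
    """Assumed multi-task objective: adversarial term plus attribute term."""
    adversarial = binary_cross_entropy(p_real, is_real)
    attribute = categorical_cross_entropy(attr_probs, attr_label)
    return adversarial + attr_weight * attribute


# A real image: the discriminator is unsure about realness (0.5)
# and assigns probability 0.75 to the correct attribute value.
loss = discriminator_loss(0.5, True, [0.25, 0.75], 1)
print(round(loss, 4))
```

Because one network is trained on both terms, its attribute head could later be reused as an image classifier, which is the reuse the abstract mentions.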

Type: Grant

Filed: March 25, 2020

Date of Patent: March 1, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Qun Li, Changbo Hu, Keng-hao Chang, Ruofei Zhang

Pipeline for identifying supplemental content items that are related to objects in images

Patent number: 11163940

Abstract: Technologies are described herein that relate to identifying supplemental content items that are related to objects captured in images of webpages. A computing system receives an indication that a client computing device has a webpage displayed thereon that includes an image. The image is provided to a first DNN that is configured to identify a portion of the image that includes an object of a type from amongst a plurality of predefined types. Once the portion of the image is identified, the portion of the image is provided to a plurality of DNNs, with each of the DNNs configured to output a word or phrase that represents a value of a respective attribute of the object. A sequence of words or phrases output by the plurality of DNNs is provided to a search computing system, which identifies a supplemental content item based upon the sequence of words or phrases.
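
The stages in this abstract (detect an object region, run one model per attribute, send the resulting words to a search system) can be sketched with plain callables standing in for the DNNs. Everything here is hypothetical and for illustration only: the function names, the dictionary-based "image", the lambda stand-ins, and the tag-overlap search are assumptions, not details from the patent.

```python
def build_query(image, detector, attribute_models):
    """Run the detector, then one attribute model per attribute."""
    region = detector(image)
    if region is None:
        return []  # no object of a predefined type was found
    return [model(region) for model in attribute_models]


def find_supplemental_item(query_terms, catalog):
    """Toy search: return the catalog item sharing the most terms."""
    terms = set(query_terms)
    if not terms:
        return None
    best = max(catalog, key=lambda item: len(terms & set(item["tags"])))
    return best if terms & set(best["tags"]) else None


# Stand-ins for the first DNN and the per-attribute DNNs.
detector = lambda img: img.get("region")
attribute_models = [
    lambda region: region["color"],
    lambda region: region["category"],
]

image = {"region": {"color": "red", "category": "dress"}}
catalog = [
    {"id": "item-1", "tags": ["blue", "dress"]},
    {"id": "item-2", "tags": ["red", "dress"]},
]

query = build_query(image, detector, attribute_models)
match = find_supplemental_item(query, catalog)
print(query, match["id"])
```

Keeping one model per attribute means the word sequence has a fixed, predictable order, which is what lets it be passed directly to a search system.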

Type: Grant

Filed: May 25, 2019

Date of Patent: November 2, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Qun Li, Changbo Hu, Keng-hao Chang, Ruofei Zhang

Multi-Task GAN, and Image Translator and Image Classifier Trained Thereby

Publication number: 20210303927

Abstract: A computer-implemented technique uses a generative adversarial network (GAN) to jointly train a generator neural network (“generator”) and a discriminator neural network (“discriminator”). Unlike traditional GAN designs, the discriminator performs the dual role of: (a) determining one or more attribute values associated with an object depicted in an input image fed to the discriminator; and (b) determining whether the input image fed to the discriminator is real or synthesized by the generator. Also unlike traditional GAN designs, an image classifier can make use of a model produced by the GAN's discriminator. The generator receives generator input information that includes a conditional input image and one or more conditional values that express desired characteristics of the generator output image. The discriminator receives the conditional input image in conjunction with a discriminator input image, which corresponds to either the generator output image or a real image.

Type: Application

Filed: March 25, 2020

Publication date: September 30, 2021

Inventors: Qun LI, Changbo HU, Keng-hao CHANG, Ruofei ZHANG

Processing Image-Bearing Electronic Documents using a Multimodal Fusion Framework

Publication number: 20210303939

Abstract: A computer-implemented technique uses one or more neural networks to identify at least one item name associated with an input image using a multi-modal fusion approach. The technique is said to be multi-modal because it collects and processes different kinds of evidence regarding each detected item name. The technique is said to adopt a fusion approach because it fuses the multi-modal evidence into an output conclusion that identifies at least one item name associated with the input image. In one example, a first mode collects evidence by identifying and analyzing regions in the input image that are likely to include item name-related information. A second mode collects and analyzes any text that appears as part of the input image itself. A third mode collects and analyzes text that is not included in the input image itself, but is nonetheless associated with the input image.

Type: Application

Filed: March 25, 2020

Publication date: September 30, 2021

Inventors: Changbo HU, Qun LI, Ruofei ZHANG, Keng-hao CHANG

PIPELINE FOR IDENTIFYING SUPPLEMENTAL CONTENT ITEMS THAT ARE RELATED TO OBJECTS IN IMAGES

Publication number: 20200372103

Abstract: Technologies are described herein that relate to identifying supplemental content items that are related to objects captured in images of webpages. A computing system receives an indication that a client computing device has a webpage displayed thereon that includes an image. The image is provided to a first DNN that is configured to identify a portion of the image that includes an object of a type from amongst a plurality of predefined types. Once the portion of the image is identified, the portion of the image is provided to a plurality of DNNs, with each of the DNNs configured to output a word or phrase that represents a value of a respective attribute of the object. A sequence of words or phrases output by the plurality of DNNs is provided to a search computing system, which identifies a supplemental content item based upon the sequence of words or phrases.

Type: Application

Filed: May 25, 2019

Publication date: November 26, 2020

Inventors: Qun LI, Changbo HU, Keng-hao CHANG, Ruofei ZHANG
