Patents by Inventor Houdong HU

Houdong HU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240046332
    Abstract: The present disclosure provides a method and apparatus for determining a food item from a photograph and a corresponding restaurant serving the food item. An image is received from a user, the image being associated with a consumable item. One or more ingredients of the consumable item in the image are identified, along with a location of the user, and a neural network is used to determine one or more similar images from a database. A restaurant associated with each of the one or more similar images is determined, along with a similarity score indicating a similarity between the restaurant and the identified content of the image. The one or more restaurants and/or associated similar food items are ranked based on the similarity score, and a list of ranked restaurants is provided to the user.
    Type: Application
    Filed: October 20, 2023
    Publication date: February 8, 2024
    Inventors: Julia X. Gong, Jyotkumar Patel, Yale Song, Xuetao Yin, Xiujia Guo, Rajiv S. Binwade, Houdong Hu
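
    A rough Python sketch of the similarity-based ranking described in this abstract; the embedding model, database contents, and the cosine-similarity choice are assumptions for illustration, not the patented implementation:

        # Hypothetical sketch of the ranking step: given an embedding of the user's
        # food photo, score restaurant images by cosine similarity and return
        # restaurants ranked by their best-matching image. The embedding model
        # (the "neural network" in the abstract) is out of scope here.
        import numpy as np

        def rank_restaurants(query_vec, database):
            """database: list of (restaurant_name, image_embedding) pairs."""
            q = query_vec / np.linalg.norm(query_vec)
            best = {}
            for name, emb in database:
                score = float(np.dot(q, emb / np.linalg.norm(emb)))  # cosine similarity
                best[name] = max(score, best.get(name, -1.0))
            # Sort restaurants by their highest-scoring similar image.
            return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            db = [("Trattoria A", rng.normal(size=128)), ("Noodle Bar B", rng.normal(size=128))]
            print(rank_restaurants(rng.normal(size=128), db))
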
  • Publication number: 20240037810
    Abstract: A data processing system implements receiving a textual input comprising a query for a first image. The data processing system also implements analyzing the textual input to determine a predicted color palette associated with a subject matter of the query, and procedurally generating the first image using the predicted color palette. Another implementation of the data processing system provides the textual input to a first machine learning model trained using a dataset comprising abstract imagery, and analyzes the textual input using the first machine learning model to obtain the first image in response to receiving the textual input.
    Type: Application
    Filed: July 30, 2022
    Publication date: February 1, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Julia GONG, Houdong HU, William Douglas GUYMAN
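
    An illustrative Python sketch of the palette-prediction idea; the keyword-to-palette table and the noise-based generator below are placeholders, not the machine learning models described in the filing:

        import numpy as np

        PALETTES = {  # hypothetical subject-matter -> RGB palette mapping
            "ocean":  [(10, 60, 120), (30, 110, 160), (200, 220, 235)],
            "forest": [(20, 80, 40), (60, 130, 70), (200, 190, 120)],
        }

        def generate_abstract_image(query, size=64, seed=0):
            palette = next((p for k, p in PALETTES.items() if k in query.lower()),
                           [(128, 128, 128), (200, 200, 200)])
            rng = np.random.default_rng(seed)
            # Assign each pixel one palette color based on a random field.
            field = rng.random((size, size))
            idx = np.minimum((field * len(palette)).astype(int), len(palette) - 1)
            return np.array(palette, dtype=np.uint8)[idx]   # (size, size, 3) RGB array

        print(generate_abstract_image("calm ocean sunrise").shape)
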
  • Patent number: 11830056
    Abstract: The present disclosure provides a method and apparatus for determining a food item from a photograph and a corresponding restaurant serving the food item. An image is received from a user, the image being associated with a consumable item. One or more ingredients of the consumable item in the image are identified, along with a location of the user, and a neural network is used to determine one or more similar images from a database. A restaurant associated with each of the one or more similar images is determined, along with a similarity score indicating a similarity between the restaurant and the identified content of the image. The one or more restaurants and/or associated similar food items are ranked based on the similarity score, and a list of ranked restaurants is provided to the user.
    Type: Grant
    Filed: November 23, 2020
    Date of Patent: November 28, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Julia X. Gong, Jyotkumar Patel, Yale Song, Xuetao Yin, Xiujia Guo, Rajiv S. Binwade, Houdong Hu
  • Patent number: 11669558
    Abstract: A computer-implemented technique generates a dense embedding vector that provides a distributed representation of input text. The technique includes: generating an input term-frequency (TF) vector of dimension g that includes frequency information relating to frequency of occurrence of terms in an instance of input text; using a TF-modifying component to modify the term-specific frequency information in the input TF vector by respective machine-trained weighting factors, to produce an intermediate vector of dimension g; using a projection component to project the intermediate vector of dimension g into an embedding vector of dimension k, where k is less than g. Both the TF-modifying component and the projection component use respective machine-trained neural networks. An application performs any of a retrieval-based function, a recognition-based function, a recommendation-based function, a classification-based function, etc. based on the embedding vector.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: June 6, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yan Wang, Ye Wu, Houdong Hu, Surendra Ulabala, Vishal Thakkar, Arun Sacheti
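
    A minimal numpy sketch of the described pipeline (a TF vector of dimension g, learned elementwise weighting, projection to dimension k < g); the vocabulary and random parameters stand in for the machine-trained components:

        import numpy as np

        vocab = {"neural": 0, "search": 1, "image": 2, "query": 3}     # g = 4
        g, k = len(vocab), 2

        def tf_vector(text):
            v = np.zeros(g)
            for tok in text.lower().split():
                if tok in vocab:
                    v[vocab[tok]] += 1.0
            return v

        rng = np.random.default_rng(0)
        w = rng.random(g)                 # stand-in for learned per-term weighting factors
        P = rng.normal(size=(g, k))       # stand-in for the learned projection

        def embed(text):
            tf = tf_vector(text)
            intermediate = tf * w         # TF-modifying step (dimension g)
            return intermediate @ P       # projection step (dimension k < g)

        print(embed("image search query"))
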
  • Patent number: 11372914
    Abstract: The description relates to diversified hybrid image annotation for annotating images. One implementation includes generating first image annotations for a query image using a retrieval-based image annotation technique. Second image annotations can be generated for the query image using a model-based image annotation technique. The first and second image annotations can be integrated to generate a diversified hybrid image annotation result for the query image.
    Type: Grant
    Filed: March 26, 2018
    Date of Patent: June 28, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yokesh Kumar, Kuang-Huei Lee, Houdong Hu, Li Huang, Arun Sacheti, Meenaz Merchant, Linjun Yang, Tianjun Xiao, Saurajit Mukherjee
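
    A toy Python sketch of the integration step, merging retrieval-based and model-based annotations into one diversified list; the interleaving policy is an assumed illustration rather than the patented method:

        from itertools import zip_longest

        def integrate_annotations(retrieval_tags, model_tags):
            """Each argument: list of (tag, score) pairs, sorted by descending score."""
            merged, seen = [], set()
            # Alternate between the two sources so both annotation styles are represented.
            for x, y in zip_longest(retrieval_tags, model_tags):
                for item in (x, y):
                    if item is not None and item[0] not in seen:
                        seen.add(item[0])
                        merged.append(item)
            return merged

        print(integrate_annotations([("dog", 0.9), ("pet", 0.7)],
                                    [("golden retriever", 0.8), ("dog", 0.6)]))
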
  • Publication number: 20220164853
    Abstract: The present disclosure provides a method and apparatus for determining a food item from a photograph and a corresponding restaurant serving the food item. An image is received from a user, the image being associated with a consumable item. One or more ingredients of the consumable item in the image are identified, along with a location of the user, and a neural network is used to determine one or more similar images from a database. A restaurant associated with each of the one or more similar images is determined, along with a similarity score indicating a similarity between the restaurant and the identified content of the image. The one or more restaurants and/or associated similar food items are ranked based on the similarity score, and a list of ranked restaurants is provided to the user.
    Type: Application
    Filed: November 23, 2020
    Publication date: May 26, 2022
    Inventors: Julia X GONG, Jyotkumar PATEL, Yale SONG, Xuetao YIN, Xiujia GUO, Rajiv S. BINWADE, Houdong HU
  • Publication number: 20210382935
    Abstract: A visual search system comprises a computing device that includes an image processing engine for generating a feature vector representing a user-selected object in an image input; an object detection engine for locating one or more objects in the image input and for determining a category of the user-selected object from objects in the image input, the object detection engine using the category to generate a plurality of attributes for the user-selected object; a product data store for storing a plurality of tables storing one or more attributes associated with the category of the user-selected object; an attribute generation engine for generating a plurality of attribute options for each of the attributes of the user-selected object; and an attribute matching engine for comparing attributes and attribute options of the user-selected object with attributes and attribute options of visually similar products and images.
    Type: Application
    Filed: August 17, 2021
    Publication date: December 9, 2021
    Inventors: Li Huang, Meenaz Merchant, Houdong Hu, Arun Sacheti
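
    A hypothetical Python sketch of the attribute-matching engine's role, comparing attributes of the user-selected object with attributes of candidate products in the same category; the attribute table and catalog below are made up:

        CATEGORY_ATTRIBUTES = {"dress": ["color", "sleeve", "pattern"]}   # stand-in product data store

        def attribute_match_score(selected, candidate, category):
            attrs = CATEGORY_ATTRIBUTES.get(category, [])
            matches = sum(1 for a in attrs if selected.get(a) == candidate.get(a))
            return matches / len(attrs) if attrs else 0.0

        selected = {"color": "red", "sleeve": "long", "pattern": "floral"}
        catalog = [
            ("SKU-1", {"color": "red", "sleeve": "long", "pattern": "plain"}),
            ("SKU-2", {"color": "red", "sleeve": "long", "pattern": "floral"}),
        ]
        ranked = sorted(catalog, key=lambda kv: attribute_match_score(selected, kv[1], "dress"),
                        reverse=True)
        print(ranked[0][0])   # best-matching visually similar product
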
  • Patent number: 11182408
    Abstract: A computer-implemented technique is described herein for using a machine-trained model to identify individual objects within images. The technique then creates a relational index for the identified objects. That is, each index entry in the relational index is associated with a given object, and includes a set of attributes pertaining to the given object. One such attribute identifies at least one latent semantic vector associated with the given object. Each attribute provides a way of linking the given object to one or more other objects in the relational index. In one application of this technique, a user may submit a query that specifies a query object. The technique consults the relational index to find one or more objects that are related to the query object. In some cases, the query object and each of the other objects have a complementary relationship.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: November 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kun Wu, Yiran Shen, Houdong Hu, Soudamini Sreepada, Arun Sacheti, Mithun Das Gupta, Rushabh Rajesh Gandhi, Sudhir Kumar
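
    A toy Python sketch of a relational index entry and a vector-similarity lookup for related objects; the schema, categories, and vectors are illustrative assumptions:

        import numpy as np

        index = [  # one entry per detected object, with a latent vector and linking attributes
            {"id": "sofa-01",  "category": "sofa",         "vec": np.array([0.9, 0.1, 0.0])},
            {"id": "table-07", "category": "coffee table", "vec": np.array([0.8, 0.2, 0.1])},
            {"id": "bike-12",  "category": "bicycle",      "vec": np.array([0.0, 0.1, 0.9])},
        ]

        def related_objects(query_vec, top_k=2):
            def cos(a, b):
                return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
            scored = [(cos(query_vec, e["vec"]), e["id"], e["category"]) for e in index]
            return sorted(scored, reverse=True)[:top_k]

        print(related_objects(np.array([0.85, 0.15, 0.05])))  # e.g., complementary furniture
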
  • Patent number: 11120070
    Abstract: A visual search system detects one or more user-selected objects represented in an image. A first group of attributes for the user-selected objects is identified. A category for the user-selected objects is identified, and a second group of pre-defined attributes associated with the category is retrieved. The first and second groups of attributes are combined into an attribute set. The combined set of attributes is presented to the user. The user selects one or more attributes, and a search is performed to identify images matching the user-selected attributes. The images are ranked and a subset is presented to the user.
    Type: Grant
    Filed: May 21, 2018
    Date of Patent: September 14, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Li Huang, Meenaz Merchant, Houdong Hu, Arun Sacheti
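
    A minimal Python sketch of combining the two attribute groups and ranking catalog images by the user's selected attributes; all attribute names and the catalog are hypothetical:

        detected = ["red", "leather"]                      # first group (from the object)
        category_defaults = ["strap", "buckle", "red"]     # second group (from the category)
        attribute_set = list(dict.fromkeys(detected + category_defaults))  # ordered de-duplication

        user_selected = {"red", "leather"}
        catalog = {"img_a": {"red", "leather", "strap"}, "img_b": {"red", "buckle"}}

        # Rank images by how many user-chosen attributes they satisfy.
        ranked = sorted(catalog, key=lambda img: len(user_selected & catalog[img]), reverse=True)
        print(attribute_set, ranked[:1])
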
  • Patent number: 11093560
    Abstract: The present concepts relate to matching data of two different modalities using two stages of attention. First data is encoded as a set of first vectors representing components of the first data, and second data is encoded as a set of second vectors representing components of the second data. In the first stage, the components of the first data are attended by comparing the first vectors and the second vectors to generate a set of attended vectors. In the second stage, the components of the second data are attended by comparing the second vectors and the attended vectors to generate a plurality of relevance scores. Then, the relevance scores are pooled to calculate a similarity score that indicates a degree of similarity between the first data and the second data.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: August 17, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kuang-Huei Lee, Gang Hua, Xi Chen, Houdong Hu, Xiaodong He
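
    A rough numpy sketch of the two attention stages and the final pooling (e.g., image-region vectors matched against word vectors); softmax attention, cosine relevance, and mean pooling are assumptions about the exact operators:

        import numpy as np

        def softmax(x, axis=-1):
            e = np.exp(x - x.max(axis=axis, keepdims=True))
            return e / e.sum(axis=axis, keepdims=True)

        def similarity(first_vecs, second_vecs):
            """first_vecs: (m, d) components of the first modality; second_vecs: (n, d)."""
            # Stage 1: for each second-modality component, attend over the
            # first-modality components to build an attended vector.
            attn = softmax(second_vecs @ first_vecs.T, axis=1)        # (n, m)
            attended = attn @ first_vecs                               # (n, d)
            # Stage 2: compare each second-modality vector with its attended vector.
            relevance = np.sum(second_vecs * attended, axis=1) / (
                np.linalg.norm(second_vecs, axis=1) * np.linalg.norm(attended, axis=1))
            # Pool the relevance scores into a single similarity score.
            return float(relevance.mean())

        rng = np.random.default_rng(0)
        print(similarity(rng.normal(size=(5, 8)), rng.normal(size=(3, 8))))
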
  • Patent number: 11074289
    Abstract: Systems and methods can be implemented to conduct searches based on images used as queries in a variety of applications. In various embodiments, a set of visual words representing a query image are generated from features extracted from the query image and are compared with visual words of index images. A set of candidate images is generated from the index images resulting from matching one or more visual words in the comparison. A multi-level ranking is conducted to sort the candidate images of the set of candidate images, and results of the multi-level ranking are returned to a user device that provided the query image. Additional systems and methods are disclosed.
    Type: Grant
    Filed: January 31, 2018
    Date of Patent: July 27, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Houdong Hu, Yan Wang, Linjun Yang, Li Huang, Xi Chen, Jiapei Huang, Ye Wu, Arun K. Sacheti, Meenaz Merchant
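
    An illustrative Python sketch of the two-level flow: coarse candidate generation by visual-word overlap, then a finer re-ranking of the top candidates; the "fine score" is a stand-in for the real re-ranking features:

        query_words = {"vw_12", "vw_98", "vw_40"}
        index = {"img_a": {"vw_12", "vw_98", "vw_3"},
                 "img_b": {"vw_40"},
                 "img_c": {"vw_7", "vw_8"}}

        # Level 1: coarse candidate generation by visual-word overlap.
        candidates = [(img, len(query_words & words)) for img, words in index.items()
                      if query_words & words]
        candidates.sort(key=lambda kv: kv[1], reverse=True)

        # Level 2: re-rank the top candidates with a (hypothetical) finer score.
        fine_score = {"img_a": 0.92, "img_b": 0.35}
        reranked = sorted(candidates[:2], key=lambda kv: fine_score.get(kv[0], 0.0), reverse=True)
        print(reranked)
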
  • Patent number: 11036724
    Abstract: A visual search engine is described herein. The visual search engine is configured to return information to a client computing device based upon a multimodal query received from the client computing device (wherein the multimodal query comprises an image and text). The visual search engine is further configured to interact with a user of the client computing device to disambiguate information retrieval intent of the user.
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: June 15, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Li Huang, Houdong Hu, Meenaz Merchant, Arun Sacheti
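
    A toy Python sketch of the multimodal-query flow, combining text with tags inferred from the image and asking a clarifying question when intent remains ambiguous; the intent rules and tag extractor are assumptions, not the engine's actual logic:

        def handle_multimodal_query(image_tags, text):
            terms = set(image_tags) | set(text.lower().split())
            if "price" in terms or "buy" in terms:
                return {"intent": "shopping", "query_terms": sorted(terms)}
            if len(terms) < 3:
                # Not enough signal: interact with the user to disambiguate intent.
                return {"clarify": "Are you looking to identify this item or to buy it?"}
            return {"intent": "identification", "query_terms": sorted(terms)}

        print(handle_multimodal_query(["sneaker", "red"], "where to buy"))
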
  • Publication number: 20210064612
    Abstract: A visual search engine is described herein. The visual search engine is configured to return information to a client computing device based upon a multimodal query received from the client computing device (wherein the multimodal query comprises an image and text). The visual search engine is further configured to interact with a user of the client computing device to disambiguate information retrieval intent of the user.
    Type: Application
    Filed: September 4, 2019
    Publication date: March 4, 2021
    Inventors: Li HUANG, Houdong HU, Meenaz MERCHANT, Arun SACHETI
  • Patent number: 10902051
    Abstract: Methods, systems, and computer programs are presented for identifying the brand and model of products embedded within an image. One method includes operations for receiving, via a graphical user interface (GUI), a selection of an image, and for analyzing the image to determine a location within the image of one or more products. For each product in the image, a unique identification of the product is determined, the unique identification including a manufacturer of the product and a model identifier. The method further includes an operation for presenting information about the one or more products in the GUI with a selection option for selecting each of the one or more products. Additionally, the method includes operations for receiving a product selection for one of the one or more products, and presenting shopping options in the GUI for purchasing the selected product.
    Type: Grant
    Filed: April 16, 2018
    Date of Patent: January 26, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Houdong Hu, Li Huang
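
    A hypothetical Python sketch of the described flow, with the image-analysis step and the product catalog replaced by stubs:

        CATALOG = {"prod-001": {"manufacturer": "Acme", "model": "X100",
                                "offers": ["store-a $199", "store-b $210"]}}

        def detect_products(image_bytes):
            # Stand-in for the image-analysis step; returns product ids with locations.
            return [("prod-001", (40, 60, 200, 220))]

        def shopping_options_for(image_bytes, selected_index=0):
            products = detect_products(image_bytes)
            product_id, _bbox = products[selected_index]
            info = CATALOG[product_id]
            return f"{info['manufacturer']} {info['model']}", info["offers"]

        print(shopping_options_for(b"...jpeg bytes..."))
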
  • Patent number: 10902052
    Abstract: Systems and methods for identifying search results in response to a search query are presented. More particularly, images are selected as search results, at least in part, according to an attractiveness value associated with the images. Upon receiving a search query, a set of content is identified according to the query intent of the search query and includes at least one image. The identified set of content is ordered according to an overall score determined according to relevance and, in the case of the at least one image, according to an attractiveness value. A search results generator selects items from the set of content according to their overall scores, including the at least one image, generates a search results page, and returns the search results page to the requesting party.
    Type: Grant
    Filed: March 26, 2018
    Date of Patent: January 26, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Mark Robert Bolin, Ning Ma, Aleksandr Livshits, Alexey Volkov, Pawel Michal Pietrusinski, Houdong Hu
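
    A minimal Python sketch of the overall-score idea, blending relevance with an attractiveness value for images; the 0.7/0.3 weighting is an arbitrary illustration, not from the patent:

        def overall_score(item):
            # Images blend relevance with attractiveness; other items use relevance alone.
            if item.get("is_image"):
                return 0.7 * item["relevance"] + 0.3 * item["attractiveness"]
            return item["relevance"]

        results = [
            {"id": "page-1", "relevance": 0.80, "is_image": False},
            {"id": "img-1",  "relevance": 0.75, "attractiveness": 0.95, "is_image": True},
        ]
        print(sorted(results, key=overall_score, reverse=True)[0]["id"])
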
  • Patent number: 10891969
    Abstract: A technique is described herein for transforming audio content into images. The technique may include: receiving the audio content from a source; converting the audio content into a temporal stream of audio features; and converting the stream of audio features into one or more images using one or more machine-trained models. The technique generates the image(s) based on recognition of: semantic information that conveys one or more semantic topics associated with the audio content; and sentiment information that conveys one or more sentiments associated with the audio content. The technique then generates an output presentation that includes the image(s), which it provides to one or more display devices for display thereat. The output presentation serves as a summary of salient semantic and sentiment-related characteristics of the audio content.
    Type: Grant
    Filed: October 19, 2018
    Date of Patent: January 12, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Li Huang, Houdong Hu, Congyong Su
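
    A structural Python sketch of the staged pipeline (audio to feature stream, to semantic topics and sentiment, to images), with every machine-trained model replaced by a trivial stub:

        def audio_to_features(audio):
            return [len(chunk) for chunk in audio.split()]           # stand-in feature stream

        def recognize_topics_and_sentiment(features):
            return ["weather"], "calm"                               # stand-in model outputs

        def features_to_images(features, topics, sentiment):
            return [f"image({t}, mood={sentiment})" for t in topics]  # stand-in image generator

        def summarize(audio):
            features = audio_to_features(audio)
            topics, sentiment = recognize_topics_and_sentiment(features)
            return features_to_images(features, topics, sentiment)   # the output presentation

        print(summarize("light rain expected this evening"))
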
  • Publication number: 20200372047
    Abstract: A computer-implemented technique is described herein for using a machine-trained model to identify individual objects within images. The technique then creates a relational index for the identified objects. That is, each index entry in the relational index is associated with a given object, and includes a set of attributes pertaining to the given object. One such attribute identifies at least one latent semantic vector associated with the given object. Each attribute provides a way of linking the given object to one or more other objects in the relational index. In one application of this technique, a user may submit a query that specifies a query object. The technique consults the relational index to find one or more objects that are related to the query object. In some cases, the query object and each of the other objects have a complementary relationship.
    Type: Application
    Filed: May 21, 2019
    Publication date: November 26, 2020
    Inventors: Kun WU, Yiran SHEN, Houdong HU, Soudamini SREEPADA, Arun SACHETI, Mithun Das GUPTA, Rushabh Rajesh GANDHI, Sudhir KUMAR
  • Publication number: 20200356592
    Abstract: A computer-implemented technique is described herein for generating query results based on both an image and an instance of text submitted by a user. The technique allows a user to more precisely express his or her search intent compared to the case in which a user submits text or an image by itself. This, in turn, enables the user to quickly and efficiently identify relevant search results. In a text-based retrieval path, the technique supplements the text submitted by the user with insight extracted from the input image, and then conducts a text-based search. In an image-based retrieval path, the technique uses insight extracted from the input text to guide the manner in which it processes the input image. In another implementation, the technique generates query results based on an image submitted by the user together with information provided by some other mode of expression besides text.
    Type: Application
    Filed: May 9, 2019
    Publication date: November 12, 2020
    Inventors: Ravi Theja YADA, Houdong HU, Yan WANG, Saurajit MUKHERJEE, Vishal THAKKAR, Arun SACHETI
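
    A toy Python sketch of the two retrieval paths: a text path supplemented with labels extracted from the image, and an image path whose processing is guided by the text; both extractors are placeholders, not the described implementation:

        def labels_from_image(image_id):
            return {"floral", "dress"}                      # stand-in image understanding

        def text_path(text, image_id):
            # Supplement the user's text with insight extracted from the image.
            expanded_query = set(text.lower().split()) | labels_from_image(image_id)
            return f"text search for: {' '.join(sorted(expanded_query))}"

        def image_path(text, image_id):
            # The text guides which aspect of the image to match on.
            focus = "sleeve" if "sleeve" in text.lower() else "whole item"
            return f"visual search on {image_id}, focused on {focus}"

        print(text_path("long sleeve", "img-42"))
        print(image_path("long sleeve", "img-42"))
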
  • Publication number: 20200311542
    Abstract: A computer-implemented technique is described herein for generating a dense embedding vector that provides a distributed representation of input text. In one implementation, the technique includes: generating an input term-frequency (TF) vector of dimension g that includes frequency information relating to frequency of occurrence of terms in an instance of input text; using a TF-modifying component to modify the term-specific frequency information in the input TF vector by respective machine-trained weighting factors, to produce an intermediate vector of dimension g; using a projection component to project the intermediate vector of dimension g into an embedding vector of dimension k, where k is less than g. Both the TF-modifying component and the projection component can use respective machine-trained neural networks. An application component can perform any of a retrieval-based function, a recognition-based function, a recommendation-based function, a classification-based function, etc.
    Type: Application
    Filed: March 28, 2019
    Publication date: October 1, 2020
    Inventors: Yan WANG, Ye WU, Houdong HU, Surendra ULABALA, Vishal THAKKAR, Arun SACHETI
  • Publication number: 20200126584
    Abstract: A technique is described herein for transforming audio content into images. The technique may include: receiving the audio content from a source; converting the audio content into a temporal stream of audio features; and converting the stream of audio features into one or more images using one or more machine-trained models. The technique generates the image(s) based on recognition of: semantic information that conveys one or more semantic topics associated with the audio content; and sentiment information that conveys one or more sentiments associated with the audio content. The technique then generates an output presentation that includes the image(s), which it provides to one or more display devices for display thereat. The output presentation serves as a summary of salient semantic and sentiment-related characteristics of the audio content.
    Type: Application
    Filed: October 19, 2018
    Publication date: April 23, 2020
    Inventors: Li HUANG, Houdong HU, Congyong SU