Patents by Inventor Xiyang Dai

Xiyang Dai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ANNOTATING IMAGES FOR TRAINING COMPUTER VISION MODELS

Publication number: 20250148765

Abstract: A method for annotating images to create a corpus for training a multi-task computer vision machine learning model is presented. The method comprises receiving, at one or more annotation specialist models, a plurality of images to be annotated. Via operation of the one or more annotation specialist models, pre-filtered annotations are generated for the plurality of images. Via operation of a data filtering and enhancement module, the pre-filtered annotations are filtered in accordance with predefined noise criteria so as to output candidate annotations for the plurality of images. The method further comprises, for each of one or more candidate annotations, selectively (1) storing the candidate annotation into the corpus as a final annotation for its associated image, or (2) adding the candidate annotation to its associated image using the one or more annotation specialist models and the data filtering and enhancement module for subsequent iterative annotation and filtering.

Type: Application

Filed: January 30, 2024

Publication date: May 8, 2025

Applicant: Microsoft Technology Licensing, LLC

Inventors: Lu YUAN, Bin XIAO, Haiping WU, Weijian XU, Xiyang DAI, Houdong HU, Yumao LU, Nanshan ZENG, Ce Christopher LIU
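
The annotate, filter, then store-or-iterate loop described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed method: `Annotation`, `specialist_annotate`, `filter_noise`, `build_corpus`, the numeric confidence score, and the two thresholds are all hypothetical names and values introduced here. The sketch also assumes that an annotation rejected as noise removes its image from further rounds.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    image_id: str
    label: str
    score: float   # hypothetical confidence reported by the specialist model
    rounds: int = 0


def specialist_annotate(image_id, base_score, prior=None):
    """Stand-in for an annotation specialist model (hypothetical).

    Toy behaviour: every refinement round raises the confidence by 0.25.
    """
    rounds = 0 if prior is None else prior.rounds + 1
    score = min(1.0, base_score + 0.25 * rounds)
    return Annotation(image_id, "label-for-" + image_id, score, rounds)


def filter_noise(annotations, noise_floor):
    """Stand-in for the data filtering and enhancement module."""
    return [a for a in annotations if a.score >= noise_floor]


def build_corpus(base_scores, noise_floor=0.2, accept_score=0.8, max_rounds=5):
    corpus = {}
    pending = {image_id: None for image_id in base_scores}
    for _ in range(max_rounds):
        if not pending:
            break
        # Specialist models produce pre-filtered annotations.
        pre_filtered = [
            specialist_annotate(image_id, base_scores[image_id], prior)
            for image_id, prior in pending.items()
        ]
        # Filtering yields candidate annotations.
        candidates = filter_noise(pre_filtered, noise_floor)
        pending = {}
        for cand in candidates:
            if cand.score >= accept_score:
                corpus[cand.image_id] = cand    # (1) store as final annotation
            else:
                pending[cand.image_id] = cand   # (2) send back for another pass
    return corpus
```

Calling `build_corpus({"a": 0.9, "b": 0.5, "c": 0.05})` stores image `a` in the first round, stores image `b` after two further passes, and never stores image `c`, whose annotation falls below the noise floor.
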

Dynamic matrix convolution with channel fusion

Patent number: 12223412

Abstract: A computer device for automatic feature detection comprises a processor, a communication device, and a memory configured to hold instructions executable by the processor to instantiate a dynamic convolution neural network, receive input data via the communication network, and execute the dynamic convolution neural network to automatically detect features in the input data. The dynamic convolution neural network compresses the input data from an input space having a dimensionality equal to a predetermined number of channels into an intermediate space having a dimensionality less than the number of channels. The dynamic convolution neural network dynamically fuses the channels into an intermediate representation within the intermediate space and expands the intermediate representation from the intermediate space to an expanded representation in an output space having a higher dimensionality than the dimensionality of the intermediate space.

Type: Grant

Filed: December 16, 2020

Date of Patent: February 11, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Yinpeng Chen, Xiyang Dai, Mengchen Liu, Dongdong Chen, Lu Yuan, Zicheng Liu, Ye Yu, Mei Chen, Yunsheng Li
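
The compress, fuse, expand sequence in the abstract above can be illustrated with a small sketch for a single feature vector. It assumes a fixed compression matrix `Q` (L x C), an input-dependent L x L fusion matrix, and a fixed expansion matrix `P`. All names are hypothetical, and the rule used to build the fusion matrix (identity plus a term proportional to the input mean) is a placeholder chosen for readability, not the patented mechanism.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def dynamic_fusion_matrix(x, latent_dim):
    """Hypothetical input-dependent L x L matrix.

    Identity plus a constant offset proportional to the mean of the input,
    so different inputs mix the latent channels differently.
    """
    gate = 0.5 * sum(x) / len(x)
    return [
        [(1.0 if i == j else 0.0) + gate for j in range(latent_dim)]
        for i in range(latent_dim)
    ]


def channel_fusion(x, compress, expand):
    """Compress C channels to L < C, fuse dynamically, then expand."""
    latent = matvec(compress, x)                    # C -> L
    phi = dynamic_fusion_matrix(x, len(latent))     # depends on the input
    fused = matvec(phi, latent)                     # fusion inside the L-dim space
    return matvec(expand, fused)                    # L -> output channels


# Example with C = 4 input channels and L = 2 latent channels.
Q = [[1, 0, 1, 0],
     [0, 1, 0, 1]]
P = [[1, 0],
     [0, 1],
     [1, 1],
     [1, -1]]
y = channel_fusion([1.0, 2.0, 3.0, 4.0], Q, P)
```

Because the fusion matrix is only L x L, the input-dependent part of the computation stays small even when the number of channels is large, which is the motivation for working in the lower-dimensional intermediate space.
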

GENERATING AN INPAINTED IMAGE FROM A MASKED IMAGE USING A PATCH-BASED ENCODER

Publication number: 20250037252

Abstract: The disclosure herein describes generating an inpainted image from a masked image using a patch-based encoder and an unquantized transformer. An image including a masked region and an unmasked region is received, and the received image is divided into a plurality of patches including masked patches. The plurality of patches is encoded into a plurality of feature vectors, wherein each patch is encoded to a feature vector. Using a transformer, a predicted token is generated for each masked patch using a feature vector encoded from the masked patch, and a quantized vector of the masked patch is determined using the generated predicted token and a masked patch-specific codebook. The determined quantized vector of the masked patch is included in a set of quantized vectors associated with the plurality of patches, and an output image is generated from the set of quantized vectors using a decoder.

Type: Application

Filed: October 11, 2024

Publication date: January 30, 2025

Inventors: Dongdong CHEN, Xiyang DAI, Yinpeng CHEN, Mengchen LIU, Lu YUAN
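
The data flow in the abstract above (split into patches, encode, predict a token for each masked patch, look the token up in a codebook, decode) can be sketched with toy stand-ins. Everything here is an assumption made for illustration: the encoder is a patch mean, `predict_token` replaces the transformer with a nearest-codebook lookup driven by the visible patches only, the codebook holds scalar values, and the decoder simply paints the looked-up value into masked pixels while copying visible pixels through.

```python
def split_into_patches(image, mask, size):
    """Return (row, col, pixels, is_masked) for every size x size patch."""
    patches = []
    for r in range(0, len(image), size):
        for c in range(0, len(image[0]), size):
            cells = [(r + i, c + j) for i in range(size) for j in range(size)]
            pixels = [image[y][x] for y, x in cells]
            masked = any(mask[y][x] for y, x in cells)
            patches.append((r, c, pixels, masked))
    return patches


def encode(pixels):
    """Toy encoder: a one-dimensional, unquantized feature (the patch mean)."""
    return sum(pixels) / len(pixels)


def predict_token(features, masked_flags, codebook):
    """Stand-in for the transformer (hypothetical).

    Picks the codebook index closest to the mean feature of visible patches;
    falls back to all patches if nothing is visible.
    """
    visible = [f for f, m in zip(features, masked_flags) if not m] or features
    target = sum(visible) / len(visible)
    return min(range(len(codebook)), key=lambda k: abs(codebook[k] - target))


def inpaint(image, mask, codebook, size=2):
    patches = split_into_patches(image, mask, size)
    features = [encode(pixels) for _, _, pixels, _ in patches]
    flags = [masked for _, _, _, masked in patches]
    output = [row[:] for row in image]
    for (r, c, _, masked) in patches:
        if not masked:
            continue
        token = predict_token(features, flags, codebook)
        value = codebook[token]              # quantized vector for this patch
        for i in range(size):
            for j in range(size):
                if mask[r + i][c + j]:
                    output[r + i][c + j] = value   # toy decoder
    return output
```

The point the sketch preserves is that features stay unquantized on the way into the predictor, and quantization happens only when a predicted token selects a codebook entry for a masked patch.
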

Generating an inpainted image from a masked image using a patch-based encoder

Patent number: 12148131

Abstract: The disclosure herein describes generating an inpainted image from a masked image using a patch-based encoder and an unquantized transformer. An image including a masked region and an unmasked region is received, and the received image is divided into a plurality of patches including masked patches. The plurality of patches is encoded into a plurality of feature vectors, wherein each patch is encoded to a feature vector. Using a transformer, a predicted token is generated for each masked patch using a feature vector encoded from the masked patch, and a quantized vector of the masked patch is determined using the generated predicted token and a masked patch-specific codebook. The determined quantized vector of the masked patch is included in a set of quantized vectors associated with the plurality of patches, and an output image is generated from the set of quantized vectors using a decoder.

Type: Grant

Filed: April 29, 2022

Date of Patent: November 19, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Dongdong Chen, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan

Dynamic head for object detection

Patent number: 11989956

Abstract: Systems and methods for object detection generate a feature pyramid corresponding to image data and rescale the feature pyramid to a scale corresponding to a median level of the feature pyramid, wherein the rescaled feature pyramid is a four-dimensional (4D) tensor. The 4D tensor is reshaped into a three-dimensional (3D) tensor having individual perspectives including scale features, spatial features, and task features corresponding to different dimensions of the 3D tensor. The 3D tensor is used with a plurality of attention layers to update a plurality of feature maps associated with the image data. Object detection is performed on the image data using the updated plurality of feature maps.

Type: Grant

Filed: April 5, 2021

Date of Patent: May 21, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Xiyang Dai, Yinpeng Chen, Bin Xiao, Dongdong Chen, Mengchen Liu, Lu Yuan, Lei Zhang
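
The tensor handling in the abstract above can be sketched as follows: every pyramid level is resized to the median level's resolution, giving a 4D tensor (level x height x width x channel), which is then flattened over height and width into a 3D tensor (level x space x channel). The nearest-neighbour resize and the single toy attention layer over the level axis are assumptions made for illustration. The function names are hypothetical, and the abstract does not specify the form of the attention layers.

```python
def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize; fmap[h][w] is a list of C channel values."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def to_3d_tensor(pyramid):
    """Rescale all levels to the median level, then flatten H x W into S."""
    sizes = sorted((len(f), len(f[0])) for f in pyramid)
    med_h, med_w = sizes[len(sizes) // 2]
    # 4D tensor: L x H x W x C
    levels = [resize_nearest(f, med_h, med_w) for f in pyramid]
    # 3D tensor: L (scale) x S (spatial) x C (task)
    return [[pixel for row in level for pixel in row] for level in levels]


def scale_attention(tensor):
    """Toy attention over the level axis.

    Each level is reweighted by a hard sigmoid of its mean activation.
    """
    updated = []
    for level in tensor:
        mean = sum(sum(pixel) for pixel in level) / (len(level) * len(level[0]))
        weight = max(0.0, min(1.0, (mean + 1.0) / 2.0))
        updated.append([[weight * v for v in pixel] for pixel in level])
    return updated
```

With the three axes separated this way, an attention layer can act along one axis at a time: across levels for scale, across positions for space, and across channels for task.
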

GENERATING AN INPAINTED IMAGE FROM A MASKED IMAGE USING A PATCH-BASED ENCODER

Publication number: 20230351558

Abstract: The disclosure herein describes generating an inpainted image from a masked image using a patch-based encoder and an unquantized transformer. An image including a masked region and an unmasked region is received, and the received image is divided into a plurality of patches including masked patches. The plurality of patches is encoded into a plurality of feature vectors, wherein each patch is encoded to a feature vector. Using a transformer, a predicted token is generated for each masked patch using a feature vector encoded from the masked patch, and a quantized vector of the masked patch is determined using the generated predicted token and a masked patch-specific codebook. The determined quantized vector of the masked patch is included in a set of quantized vectors associated with the plurality of patches, and an output image is generated from the set of quantized vectors using a decoder.

Type: Application

Filed: April 29, 2022

Publication date: November 2, 2023

Inventors: Dongdong CHEN, Xiyang DAI, Yinpeng CHEN, Mengchen LIU, Lu YUAN

DYNAMIC HEAD FOR OBJECT DETECTION

Publication number: 20220318541

Abstract: Systems and methods for object detection generate a feature pyramid corresponding to image data and rescale the feature pyramid to a scale corresponding to a median level of the feature pyramid, wherein the rescaled feature pyramid is a four-dimensional (4D) tensor. The 4D tensor is reshaped into a three-dimensional (3D) tensor having individual perspectives including scale features, spatial features, and task features corresponding to different dimensions of the 3D tensor. The 3D tensor is used with a plurality of attention layers to update a plurality of feature maps associated with the image data. Object detection is performed on the image data using the updated plurality of feature maps.

Type: Application

Filed: April 5, 2021

Publication date: October 6, 2022

Inventors: Xiyang DAI, Yinpeng CHEN, Bin XIAO, Dongdong CHEN, Mengchen LIU, Lu YUAN, Lei ZHANG

DYNAMIC MATRIX CONVOLUTION WITH CHANNEL FUSION

Publication number: 20220188595

Abstract: A computer device for automatic feature detection comprises a processor, a communication device, and a memory configured to hold instructions executable by the processor to instantiate a dynamic convolution neural network, receive input data via the communication network, and execute the dynamic convolution neural network to automatically detect features in the input data. The dynamic convolution neural network compresses the input data from an input space having a dimensionality equal to a predetermined number of channels into an intermediate space having a dimensionality less than the number of channels. The dynamic convolution neural network dynamically fuses the channels into an intermediate representation within the intermediate space and expands the intermediate representation from the intermediate space to an expanded representation in an output space having a higher dimensionality than the dimensionality of the intermediate space.

Type: Application

Filed: December 16, 2020

Publication date: June 16, 2022

Applicant: Microsoft Technology Licensing, LLC

Inventors: Yinpeng CHEN, Xiyang DAI, Mengchen LIU, Dongdong CHEN, Lu YUAN, Zicheng LIU, Ye YU, Mei CHEN, Yunsheng LI

WEAK NEURAL ARCHITECTURE SEARCH (NAS) PREDICTOR

Publication number: 20220188599

Abstract: A neural architecture search (NAS) with a weak predictor comprises: receiving network architecture scoring information; iteratively sampling a search space, wherein the sampling comprises: generating a set of candidate architectures within the search space; learning a first predictor; evaluating performance of the candidate architectures; and based on at least the performance of the set of candidate architectures and the network architecture scoring information, refining the search space to a smaller search space; based on at least the network architecture scoring information, thresholding the performance of candidate architectures to determine scored output candidate architectures; and reporting the scored output candidate architectures. In some examples, the candidate architectures each comprise a machine learning (ML) model, for example a neural network (NN).

Type: Application

Filed: December 15, 2020

Publication date: June 16, 2022

Inventors: Xiyang DAI, Dongdong CHEN, Yinpeng CHEN, Mengchen LIU, Ye YU, Zicheng LIU, Mei CHEN, Lu YUAN, Junru WU
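
The iterative loop in the abstract above (sample candidates, evaluate them, learn a predictor, shrink the search space, then threshold the results) can be sketched on a toy problem. All specifics are assumptions: architectures are hypothetical `(depth, width)` pairs, `evaluate` is a made-up score standing in for training a network, the weak predictor is a nearest-neighbour lookup, and the search space is refined by keeping the top half of predicted scores.

```python
import random


def evaluate(arch):
    """Hypothetical ground-truth score; stands in for training and validation."""
    depth, width = arch
    return -((depth - 7) ** 2) - ((width - 3) ** 2)


def fit_weak_predictor(evaluated):
    """Weak predictor: reuse the score of the nearest evaluated architecture."""
    def predict(arch):
        nearest = min(
            evaluated,
            key=lambda a: (a[0] - arch[0]) ** 2 + (a[1] - arch[1]) ** 2,
        )
        return evaluated[nearest]
    return predict


def weak_predictor_search(space, threshold, rounds=4, samples_per_round=8,
                          keep_fraction=0.5, seed=0):
    rng = random.Random(seed)
    current = list(space)
    evaluated = {}
    for _ in range(rounds):
        # Generate and evaluate a set of candidates inside the current space.
        unseen = [a for a in current if a not in evaluated]
        for arch in rng.sample(unseen, min(samples_per_round, len(unseen))):
            evaluated[arch] = evaluate(arch)
        # Learn a predictor from everything evaluated so far.
        predict = fit_weak_predictor(evaluated)
        # Refine the search space to the region the predictor ranks highest.
        ranked = sorted(current, key=predict, reverse=True)
        current = ranked[:max(1, int(len(ranked) * keep_fraction))]
    # Threshold the measured performance and report the scored candidates.
    scored = [(a, s) for a, s in evaluated.items() if s >= threshold]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

The predictor only has to be accurate enough to discard the weaker half of the current space in each round, which is why a coarse model is acceptable in this setting.
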

Machine learning system for generating classification data and part localization data for objects depicted in images

Patent number: 10769491

Abstract: Techniques are disclosed for identifying discriminative, fine-grained features of an object in an image. In one example, an input device receives an image. A machine learning system includes a model comprising a first set, a second set, and a third set of filters. The machine learning system applies the first set of filters to the received image to generate an intermediate representation of the received image. The machine learning system applies the second set of filters to the intermediate representation to generate part localization data identifying sub-parts of an object and one or more regions of the image in which the sub-parts are located. The machine learning system applies the third set of filters to the intermediate representation to generate classification data identifying a subordinate category to which the object belongs. The system uses the part localization and classification data to perform fine-grained classification of the object.

Type: Grant

Filed: August 31, 2018

Date of Patent: September 8, 2020

Assignee: SRI International

Inventors: Bogdan Calin Mihai Matei, Xiyang Dai, John Benjamin Southall, Nhon Hoc Trinh, Harpreet Sawhney
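
The three filter sets in the abstract above form a shared trunk with two heads, which can be sketched as follows. The concrete choices are assumptions made for illustration: the first set is a list of small convolution kernels, the second set is one weight vector per part whose heatmap peak gives the part location, and the third set is a linear classifier over globally averaged features. The function names and the part and class labels are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation without padding."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ]
        for r in range(len(image) - kh + 1)
    ]


def analyze(image, trunk_kernels, part_weights, class_weights):
    # First set of filters: shared intermediate representation.
    feats = [conv2d_valid(image, k) for k in trunk_kernels]
    h, w = len(feats[0]), len(feats[0][0])

    # Second set: one heatmap per part; its peak is the part location.
    parts = {}
    for name, weights in part_weights.items():
        heat = {
            (r, c): sum(wt * f[r][c] for wt, f in zip(weights, feats))
            for r in range(h) for c in range(w)
        }
        parts[name] = max(heat, key=heat.get)

    # Third set: linear classifier on globally averaged features.
    pooled = [sum(map(sum, f)) / (h * w) for f in feats]
    scores = {
        label: sum(wt * p for wt, p in zip(weights, pooled))
        for label, weights in class_weights.items()
    }
    return parts, max(scores, key=scores.get)
```

Both heads read the same intermediate representation, so the features that locate the parts are also available to the classifier that picks the subordinate category.
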

MACHINE LEARNING SYSTEM FOR GENERATING CLASSIFICATION DATA AND PART LOCALIZATION DATA FOR OBJECTS DEPICTED IN IMAGES

Publication number: 20190073560

Abstract: Techniques are disclosed for identifying discriminative, fine-grained features of an object in an image. In one example, an input device receives an image. A machine learning system includes a model comprising a first set, a second set, and a third set of filters. The machine learning system applies the first set of filters to the received image to generate an intermediate representation of the received image. The machine learning system applies the second set of filters to the intermediate representation to generate part localization data identifying sub-parts of an object and one or more regions of the image in which the sub-parts are located. The machine learning system applies the third set of filters to the intermediate representation to generate classification data identifying a subordinate category to which the object belongs. The system uses the part localization and classification data to perform fine-grained classification of the object.

Type: Application

Filed: August 31, 2018

Publication date: March 7, 2019

Inventors: Bogdan Calin Mihai Matei, Xiyang Dai, John Benjamin Southall, Nhon Hoc Trinh, Harpreet Sawhney