Patents by Inventor Siyao Sun

Siyao Sun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DECENTRALIZED CROSS-NODE LEARNING FOR AUDIENCE PROPENSITY PREDICTION

Publication number: 20230351247

Abstract: Embodiments of the disclosed technologies receive a first-party trained model and a first-party data set from a first-party system into a protected environment, receive a first third-party data set into the protected environment, and, in a data clean room, joining the first-party data set and the first third-party data set to create a joint data set for the particular segment, tuning a first-party trained model with the joint data set to create a third-party tuned model, sending model parameter data learned in the data clean room as a result of the tuning to an aggregator node, receiving a globally tuned version of the first-party trained model from the aggregator node, applying the globally tuned version of the first-party trained model to a second third-party data set to produce a scored third-party data set, and providing the scored third-party data set to a content distribution service of the first-party system.

Type: Application

Filed: May 2, 2022

Publication date: November 2, 2023

Inventors: Boyi Chen, Tong Zhou, Siyao Sun, Lijun Peng, Xinruo Jing, Vakwadi Thejaswini Holla, Yi Wu, Pankhuri Goyal, Souvik Ghosh, Zheng Li, Yi Zhang, Onkar A. Dalal, Jing Wang, Aarthi Jayaram
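
The aggregation step described in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the aggregator node combines the model parameter data it receives; the example-weighted average below (in the style of federated averaging) is an assumption, and the names `aggregate_parameters` and `node_updates` are hypothetical.

```python
from typing import List, Tuple


def aggregate_parameters(node_updates: List[Tuple[List[float], int]]) -> List[float]:
    """Combine per-node parameters into one global parameter vector.

    Each update is (parameters, example_count). Weighting by example count is
    an assumption for illustration; the abstract does not specify the rule.
    """
    total = sum(count for _, count in node_updates)
    dim = len(node_updates[0][0])
    return [
        sum(params[i] * count for params, count in node_updates) / total
        for i in range(dim)
    ]


# Two hypothetical clean-room nodes report their tuned parameters.
global_params = aggregate_parameters([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
print(global_params)
```

Only parameters and counts cross the node boundary in this sketch; the joined data sets themselves never leave the node that tuned on them.
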

TRAINING DATA GENERATION TECHNIQUES TO CAPTURE ENTITY-TO-ENTITY AFFINITIES

Publication number: 20220207484

Abstract: Techniques for generating training data to capture entity-to-entity affinities are provided. In one technique, first interaction data is stored that indicates interactions, that occurred during a first time period, between a first set of users and content items associated with a first set of entities. Also, second interaction data is stored that indicates interactions, that occurred during a second time period, between a second set of users and content items associated with a second set of entities. For each interaction in the first interaction data: (1) a training instance is generated; (2) it is determined whether the interaction matches one in the second interaction data; and (3) if the interaction does not match, then a negative label is generated for the training instance, else a positive label is generated for the training instance. Machine learning techniques are then used to train a machine-learned model based on the generating training instances.

Type: Application

Filed: December 31, 2020

Publication date: June 30, 2022

Inventors: Ankan SAHA, Siyao SUN, Zhanglong LIU, Aastha JAIN
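
The labeling rule in the abstract above can be sketched as follows. What counts as a "match" between two interactions is not defined in the abstract; this sketch assumes a match means the same user and entity pair appears in both periods. The `Interaction` type and `label_interactions` function are hypothetical names.

```python
from typing import List, NamedTuple, Tuple


class Interaction(NamedTuple):
    user_id: str
    entity_id: str


def label_interactions(
    first_period: List[Interaction], second_period: List[Interaction]
) -> List[Tuple[Interaction, int]]:
    """Create one training instance per first-period interaction.

    The label is 1 if the interaction matches one in the second period,
    otherwise 0. Matching on (user_id, entity_id) is an assumption.
    """
    later = set(second_period)
    return [(item, 1 if item in later else 0) for item in first_period]


first = [Interaction("u1", "e1"), Interaction("u2", "e1"), Interaction("u3", "e2")]
second = [Interaction("u1", "e1"), Interaction("u3", "e9")]
labeled = label_interactions(first, second)
labels = [label for _, label in labeled]
print(labels)
```
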

Two-stage training with non-randomized and randomized data

Patent number: 11204973

Abstract: In an example embodiment, position bias and other types of bias may be compensated for by using two-phase training of a machine-learned model. In a first phase, the machine-learned model is trained using non-randomized training data. Since certain types of machine-learned models, such as those involving deep learning (e.g., neural networks) require a lot of training data, this allows the bulk of the training to be devoted to training using non-randomized training data. However, since this non-randomized training data may be biased, a second training phase is then used to revise the machine-learned model based on randomized training data to remove the bias from the machine-learned model. Since this randomized training data may be less plentiful, this allows the deep learning machine-learned model to be trained to operate in an unbiased manner without the need to generate additional randomized training data.

Type: Grant

Filed: June 21, 2019

Date of Patent: December 21, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Daniel Sairom Krishnan Hewlett, Dan Liu, Qi Guo, Wenxiang Chen, Xiaoyi Zhang, Lester Gilbert Cottle, III, Xuebin Yan, Yu Gong, Haitong Tian, Siyao Sun, Pei-Lun Liao
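
The two-phase idea in the abstract above can be shown with a toy sketch. It is not the patented method: it assumes a one-parameter linear model, plain gradient descent, and made-up data in which the plentiful non-randomized set suggests a slope of 3 while the smaller randomized set reflects a slope of 2. The function `fit` and both data sets are hypothetical.

```python
from typing import List, Tuple


def fit(w: float, data: List[Tuple[float, float]], lr: float, epochs: int) -> float:
    """Gradient descent on mean squared error for the model y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


# Phase 1: bulk training on the larger, possibly biased, non-randomized data.
biased = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w_phase1 = fit(0.0, biased, lr=0.01, epochs=200)

# Phase 2: a shorter pass on the smaller randomized data revises the model.
randomized = [(1.0, 2.0), (2.0, 4.0)]
w_phase2 = fit(w_phase1, randomized, lr=0.01, epochs=50)

print(w_phase1, w_phase2)
```

The second phase starts from the phase-one weight rather than from scratch, so the randomized data only has to correct the model, not train it.
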

TWO-STAGE TRAINING WITH NON-RANDOMIZED AND RANDOMIZED DATA

Publication number: 20200401644

Abstract: In an example embodiment, position bias and other types of bias may be compensated for by using two-phase training of a machine-learned model. In a first phase, the machine-learned model is trained using non-randomized training data. Since certain types of machine-learned models, such as those involving deep learning (e.g., neural networks) require a lot of training data, this allows the bulk of the training to be devoted to training using non-randomized training data. However, since this non-randomized training data may be biased, a second training phase is then used to revise the machine-learned model based on randomized training data to remove the bias from the machine-learned model. Since this randomized training data may be less plentiful, this allows the deep learning machine-learned model to be trained to operate in an unbiased manner without the need to generate additional randomized training data.

Type: Application

Filed: June 21, 2019

Publication date: December 24, 2020

Inventors: Daniel Sairom Krishnan Hewlett, Dan Liu, Qi Guo, Wenxiang Chen, Xiaoyi Zhang, Lester Gilbert Cottle, Xuebin Yan, Yu Gong, Haitong Tian, Siyao Sun, Pei-Lun Liao

OPTIMIZING FEATURE EVALUATION IN MACHINE LEARNING

Publication number: 20190325352

Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains a feature dependency graph of features for a machine learning model and an operator dependency graph comprising operators to be applied to the features. Next, the system generates feature values of the features according to an evaluation order associated with the operator dependency graph and feature dependencies from the feature dependency graph. During evaluation of an operator in the evaluation order, the system updates a list of calculated features with one or more features that have been calculated for use with the operator. During evaluation of a subsequent operator in the evaluation order, the system uses the list of calculated features to omit recalculation of the feature(s) for use with the subsequent operator.

Type: Application

Filed: April 20, 2018

Publication date: October 24, 2019

Applicant: Microsoft Technology Licensing, LLC

Inventors: Chang-Ming Tsai, Fei Chen, Siyao Sun, Shihai He, Yu Gong, Scott A. Banachowski, Joel D. Young
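
The reuse of calculated features described in the abstract above can be sketched as follows. The sketch assumes the operators are already listed in their evaluation order and that each feature is a function of the features it depends on; `evaluate_operators`, `feature_deps`, and `compute` are hypothetical names, not taken from the application.

```python
from typing import Callable, Dict, List, Tuple


def evaluate_operators(
    operators: List[Tuple[str, List[str]]],
    feature_deps: Dict[str, List[str]],
    compute: Dict[str, Callable[..., float]],
) -> Tuple[Dict[str, List[float]], List[str]]:
    """Evaluate operators in order, computing each feature at most once."""
    calculated: Dict[str, float] = {}  # features calculated so far
    computed_order: List[str] = []     # records each actual calculation

    def value_of(feature: str) -> float:
        if feature not in calculated:
            inputs = [value_of(dep) for dep in feature_deps.get(feature, [])]
            calculated[feature] = compute[feature](*inputs)
            computed_order.append(feature)
        return calculated[feature]

    outputs = {op: [value_of(f) for f in needed] for op, needed in operators}
    return outputs, computed_order


feature_deps = {"a": [], "b": ["a"], "c": ["a"]}
compute = {"a": lambda: 2.0, "b": lambda a: a * 10, "c": lambda a: a + 1}
operators = [("op1", ["b"]), ("op2", ["b", "c"])]
outputs, computed_order = evaluate_operators(operators, feature_deps, compute)
print(outputs, computed_order)
```

Here `op2` reuses `b` and `a` from the evaluation of `op1`, so only `c` is newly calculated for it.
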

FLEXIBLE CONFIGURATION OF MODEL TRAINING PIPELINES

Publication number: 20190228343

Abstract: The disclosed embodiments provide a system for processing data. During operation, the system obtains a model definition and a training configuration for a machine-learning model, wherein the training configuration includes a set of required features, a training technique, and a scoring function. Next, the system uses the model definition and the training configuration to load the machine-learning model and the set of required features into a training pipeline without requiring a user to manually identify the set of required features. The system then uses the training pipeline and the training configuration to update a set of parameters for the machine-learning model. Finally, the system stores mappings containing the updated set of parameters and the set of required features in a representation of the machine-learning model.

Type: Application

Filed: January 23, 2018

Publication date: July 25, 2019

Applicant: Microsoft Technology Licensing, LLC

Inventors: Songxiang Gu, Xuebin Yan, Shihai He, Andris Birkmanis, Fei Chen, Yu Gong, Chang-Ming Tsai, Siyao Sun, Joel D. Young
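
The configuration-driven flow in the abstract above can be sketched roughly as follows. Every name here (`TrainingConfig`, `run_pipeline`, `mean_technique`, `dot_score`) is hypothetical, and the "training technique" is a trivial stand-in that sets each parameter to the mean of its feature; the application does not describe its pipeline at this level of detail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]
Params = Dict[str, float]


@dataclass
class TrainingConfig:
    required_features: List[str]
    training_technique: Callable[[Params, List[Row]], Params]
    scoring_function: Callable[[Params, Row], float]


def run_pipeline(model_definition: Params, config: TrainingConfig, rows: List[Row]) -> dict:
    """Select the required features from the config, then update parameters."""
    inputs = [{f: row[f] for f in config.required_features} for row in rows]
    params = config.training_technique(dict(model_definition), inputs)
    # Store the updated parameters together with the features they need.
    return {"parameters": params, "required_features": list(config.required_features)}


def mean_technique(params: Params, inputs: List[Row]) -> Params:
    # Stand-in for a real training technique (an assumption for illustration).
    return {f: sum(r[f] for r in inputs) / len(inputs) for f in params}


def dot_score(params: Params, row: Row) -> float:
    return sum(params[f] * row[f] for f in params)


config = TrainingConfig(["clicks", "views"], mean_technique, dot_score)
rows = [
    {"clicks": 1.0, "views": 4.0, "unused": 9.0},
    {"clicks": 3.0, "views": 6.0, "unused": 9.0},
]
model = run_pipeline({"clicks": 0.0, "views": 0.0}, config, rows)
score = config.scoring_function(model["parameters"], {"clicks": 1.0, "views": 1.0})
print(model, score)
```

The caller never lists the input columns by hand; the pipeline reads them from the configuration and ignores the `unused` column.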
