Patents by Inventor Yeyun GONG

Yeyun GONG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11966428
    Abstract: A training system produces a resource-efficient machine-trained model via a training architecture that employs plural processing paths. Some of the processing paths incorporate the use of auxiliary information that imparts external knowledge about source items being processed. The training architecture also employs contrastive learning that operates at different respective levels within the training architecture. For instance, the training architecture uses encoder-level contrastive learning to compare output information generated by different encoders within the training architecture. The training architecture uses decoder-level contrastive learning to compare output information produced by different decoders within the training architecture. An inference-stage system performs an application task using the model produced by the training system.
    Type: Grant
    Filed: July 1, 2021
    Date of Patent: April 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jian Jiao, Yeyun Gong, Nan Duan, Ruofei Zhang
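    Illustrative sketch for the entry above: the multi-level contrastive learning it describes can be approximated with an InfoNCE-style loss applied once to encoder outputs and once to decoder outputs. The tensor names, pooling, temperature, and the choice of InfoNCE are assumptions for illustration, not the patented implementation.
    ```python
    import torch
    import torch.nn.functional as F

    def contrastive_loss(reps_a: torch.Tensor, reps_b: torch.Tensor,
                         temperature: float = 0.1) -> torch.Tensor:
        """InfoNCE-style loss: row i of reps_a should match row i of reps_b.

        reps_a, reps_b: [batch, hidden] pooled outputs from two processing
        paths (e.g., two encoders, or two decoders).
        """
        a = F.normalize(reps_a, dim=-1)
        b = F.normalize(reps_b, dim=-1)
        logits = a @ b.t() / temperature                # [batch, batch] similarities
        targets = torch.arange(a.size(0), device=a.device)
        return F.cross_entropy(logits, targets)         # diagonal pairs are positives

    # Hypothetical combined objective with encoder- and decoder-level terms:
    # loss = task_loss + contrastive_loss(enc_out_1, enc_out_2) \
    #                  + contrastive_loss(dec_out_1, dec_out_2)
    ```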
  • Publication number: 20240046037
    Abstract: Systems and methods are provided for training a data model based on training data. The training includes pre-training and fine-tuning the data model based on a combination of an autoregressive (AR) model and a non-autoregressive (NAR) model. Training data may be received and encoded into streams of tokens. During decoding, a pre-trainer generates a continuum of data structures for the combined AR and NAR model, including a main stream and a series of predicting streams. Masked tokens in the predicting streams reference, or attend to, one or more preceding tokens in the main stream or in the preceding predicting streams. A fine-tuner selects streams to generate a trained model according to a target data model. The target data model is determined by balancing an accuracy constraint and an efficiency constraint for predicting tokens. The decoder acts as a bridge between the AR and NAR models in generating a trained data model.
    Type: Application
    Filed: December 25, 2020
    Publication date: February 8, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jian JIAO, Yeyun GONG, Nan DUAN, Weizhu CHEN, Kewen TANG, Qiang LOU, Ruofei ZHANG, Yu YAN, Jiusheng CHEN
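    Illustrative sketch for the entry above: the main stream and predicting streams can be realised as one attention mask over stacked streams, where each predicting-stream position may attend only to earlier main-stream tokens. The stream layout and masking details here are assumptions for illustration, not the patented design.
    ```python
    import torch

    def stream_attention_mask(seq_len: int, num_predicting_streams: int) -> torch.Tensor:
        """Boolean mask (True = may attend) over [main | stream_1 | ... | stream_k].

        The main stream is causal; position i of every predicting stream sees
        only main-stream tokens 0..i-1, so its own future tokens must be
        predicted rather than copied.
        """
        total = seq_len * (1 + num_predicting_streams)
        mask = torch.zeros(total, total, dtype=torch.bool)
        causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
        mask[:seq_len, :seq_len] = causal                       # main stream
        strictly_before = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool),
                                     diagonal=-1)
        for k in range(1, num_predicting_streams + 1):
            rows = slice(k * seq_len, (k + 1) * seq_len)
            mask[rows, :seq_len] = strictly_before              # predicting stream k
        return mask
    ```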
  • Publication number: 20230394333
    Abstract: A knowledge injection model for generative commonsense reasoning. In examples, an encoder-decoder model is used to generate a model output (204) that is a plausible description for a set of concepts. A prototype (218) is generated from an in-domain or out-of-domain knowledge corpus, which is further used as input (202) for the encoder-decoder model. Concept input tokens and prototype input tokens are scaled to limit potential skew that may be introduced by the prototype (218). Additionally, position indicators are generated for each input token, which indicate the relative position of each respective input token as compared to other input tokens. As such, when decoding the scaled encoded input tokens, the decoder (214) may be more attuned to the scenario bias that is introduced by the prototype (218) when generating a model output (204). Thus, the encoder-decoder model need not rely solely on the set of concepts when generating the model output (204).
    Type: Application
    Filed: November 12, 2020
    Publication date: December 7, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jian JIAO, Yeyun GONG, Nan DUAN, Yameng HUANG, Ruofei ZHANG, Ming ZHOU
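    Illustrative sketch for the entry above: prototype token embeddings can be down-scaled before concatenation with the concept tokens, and every token given a position indicator. The scaling factor and the simple positional scheme below are illustrative assumptions, not the patented mechanism.
    ```python
    import torch

    def prepare_inputs(concept_emb: torch.Tensor, prototype_emb: torch.Tensor,
                       prototype_scale: float = 0.5):
        """concept_emb: [n_concept, hidden]; prototype_emb: [n_prototype, hidden].

        The prototype embeddings are attenuated to limit the skew they can
        introduce, and each token of the concatenated input receives a
        position indicator expressing its relative position.
        """
        combined = torch.cat([concept_emb, prototype_scale * prototype_emb], dim=0)
        position_indicators = torch.arange(combined.size(0))
        return combined, position_indicators
    ```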
  • Publication number: 20230385315
    Abstract: Systems and methods are provided for generating a keyword sequence from an input query. A first text sequence corresponding to an input query may be received and encoded into a source sequence representation using an encoder of a machine learning model. A keyword sequence may then be generated from the source sequence representation using a decoder of the machine learning model. The decoder may generate a modified generation score for a plurality of prediction tokens, wherein the modified generation score is based on the respective prediction token's generation score and a maximum generation score for a suffix of each prediction token. The decoder may then select the prediction token of the plurality of prediction tokens based on the modified generation score, and add the selected prediction token to the previously decoded partial hypothesis provided by the decoder.
    Type: Application
    Filed: October 14, 2020
    Publication date: November 30, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jian JIAO, Yeyun GONG, Nan DUAN, Ruofei ZHANG, Ming ZHOU
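    Illustrative sketch for the entry above: each candidate token's score is combined with the best score achievable by any suffix that can follow it, and the winner is appended to the partial hypothesis. The additive combination and the precomputed suffix table are assumptions for illustration only.
    ```python
    from typing import Dict, List

    def pick_next_token(token_scores: Dict[str, float],
                        best_suffix_score: Dict[str, float],
                        partial_hypothesis: List[str]) -> List[str]:
        """Append the candidate with the highest modified generation score.

        token_scores:      generation score of each candidate prediction token
        best_suffix_score: maximum generation score over suffixes that can
                           follow the candidate (assumed to be precomputed)
        """
        modified = {tok: token_scores[tok] + best_suffix_score.get(tok, 0.0)
                    for tok in token_scores}
        best = max(modified, key=modified.get)
        return partial_hypothesis + [best]

    # e.g. pick_next_token({"shoes": -0.4, "boots": -0.2},
    #                      {"shoes": -0.1, "boots": -0.9}, ["running"])
    # -> ["running", "shoes"]
    ```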
  • Publication number: 20230267328
    Abstract: Described herein is a mechanism to identify user intent in requests submitted to a system such as a digital assistant or a question-answering system. Embodiments utilize a match methodology instead of a classification methodology. Features derived from a subgraph retrieved from a knowledge base based on the request are concatenated with pretrained word embeddings for both the request and a candidate predicate. The concatenated inputs for both the request and the predicate are encoded using two independent LSTM networks, and then a matching score is calculated using a match LSTM network. The result is identified based on the matching scores for a plurality of candidate predicates. The pretrained word embeddings allow for knowledge transfer, since pretrained word embeddings in one intent domain can apply to another intent domain without retraining.
    Type: Application
    Filed: May 1, 2023
    Publication date: August 24, 2023
    Inventors: Jianshu JI, Yeyun GONG, Nan DUAN, Yi-Cheng PAN, Guihong CAO
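    Illustrative sketch for the entry above: a request and a candidate predicate are encoded by two independent LSTMs and compared by a match LSTM that yields a matching score; the predicate with the highest score gives the intent. The layer sizes and the way the match LSTM combines the two encodings are simplifying assumptions, not the patented architecture.
    ```python
    import torch
    import torch.nn as nn

    class PredicateMatcher(nn.Module):
        def __init__(self, input_dim: int, hidden_dim: int):
            super().__init__()
            # Inputs are assumed to be pretrained word embeddings concatenated
            # with knowledge-base subgraph features, as described above.
            self.request_lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
            self.predicate_lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
            self.match_lstm = nn.LSTM(2 * hidden_dim, hidden_dim, batch_first=True)
            self.score = nn.Linear(hidden_dim, 1)

        def forward(self, request_feats, predicate_feats):
            req, _ = self.request_lstm(request_feats)        # [B, Lr, H]
            pred, _ = self.predicate_lstm(predicate_feats)   # [B, Lp, H]
            pred_summary = pred[:, -1:, :].expand(-1, req.size(1), -1)
            matched, _ = self.match_lstm(torch.cat([req, pred_summary], dim=-1))
            return self.score(matched[:, -1, :]).squeeze(-1)  # one score per pair

    # The candidate predicate with the highest score is taken as the user intent.
    ```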
  • Publication number: 20230004588
    Abstract: A training system produces a resource-efficient machine-trained model via a training architecture that employs plural processing paths. Some of the processing paths incorporate the use of auxiliary information that imparts external knowledge about source items being processed. The training architecture also employs contrastive learning that operates at different respective levels within the training architecture. For instance, the training architecture uses encoder-level contrastive learning to compare output information generated by different encoders within the training architecture. The training architecture uses decoder-level contrastive learning to compare output information produced by different decoders within the training architecture. An inference-stage system performs an application task using the model produced by the training system.
    Type: Application
    Filed: July 1, 2021
    Publication date: January 5, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jian JIAO, Yeyun GONG, Nan DUAN, Ruofei ZHANG
  • Publication number: 20220318601
    Abstract: Computing technology is described herein that provides an attention mechanism, implemented by a neural network, that generates attention information based on head-specific query information and shared key and value (KV) information, without computing head-specific key information and head-specific value information, and without caching the head-specific key information and the head-specific value information in memory. This manner of operation allows the computing technology to make efficient use of processing and memory resources. In some implementations, the attention mechanism is part of a decoder of an encoder-decoder system, or a standalone decoder system. In some implementations, the computing technology leverages the attention information to generate synthesized text based on input text.
    Type: Application
    Filed: April 3, 2021
    Publication date: October 6, 2022
    Inventors: Yu YAN, Jiusheng CHEN, Nikhil BHENDAWADE, Yeyun GONG, Nan DUAN, Ruofei ZHANG
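    Illustrative sketch for the entry above: queries are kept head-specific while a single key/value pair is shared across heads, so only the shared K and V ever need to be cached. The shapes and the einsum formulation are assumptions for illustration, not the patented implementation.
    ```python
    import torch
    import torch.nn.functional as F

    def shared_kv_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
        """q: [batch, heads, q_len, head_dim]; k, v: [batch, kv_len, head_dim].

        The shared k and v are broadcast across the head dimension instead of
        materialising per-head copies, which is the processing/memory saving
        the description above refers to.
        """
        scale = q.size(-1) ** -0.5
        scores = torch.einsum("bhqd,bkd->bhqk", q, k) * scale
        weights = F.softmax(scores, dim=-1)
        return torch.einsum("bhqk,bkd->bhqd", weights, v)
    ```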
  • Publication number: 20220067533
    Abstract: A transformer-based neural network includes at least one mask attention network (MAN). The MAN computes an original attention data structure that expresses influence between pairs of data items in a sequence of data items. The MAN then modifies the original data structure by mask values in a mask data structure, to produce a modified attention data structure. Compared to the original attention data structure, the modified attention data structure better accounts for the influence of neighboring data items in the sequence of data items, given a particular data item under consideration. The mask data structure used by the MAN can have static and/or machine-trained mask values. In one implementation, the transformer-based neural network includes at least one MAN in combination with at least one other attention network that does not use a mask data structure, and at least one feed-forward neural network.
    Type: Application
    Filed: August 27, 2020
    Publication date: March 3, 2022
    Inventors: Jian JIAO, Yeyun GONG, Nan DUAN, Ruofei ZHANG, Ming ZHOU
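    Illustrative sketch for the entry above: the original attention weights are rescaled by a mask data structure that favours neighbouring positions. The static window mask below stands in for the static and/or machine-trained mask values mentioned in the abstract; it is an illustrative assumption only.
    ```python
    import torch
    import torch.nn.functional as F

    def mask_attention(q, k, v, window: int = 2):
        """q, k, v: [batch, seq_len, hidden]."""
        seq_len = q.size(1)
        scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5   # original attention structure
        weights = F.softmax(scores, dim=-1)
        idx = torch.arange(seq_len)
        locality = ((idx[None, :] - idx[:, None]).abs() <= window).float()  # mask data structure
        modified = weights * locality                           # modified attention structure
        modified = modified / modified.sum(dim=-1, keepdim=True).clamp_min(1e-9)
        return modified @ v
    ```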
  • Publication number: 20200293874
    Abstract: Described herein is a mechanism to identify user intent in requests submitted to a system such as a digital assistant or a question-answering system. Embodiments utilize a match methodology instead of a classification methodology. Features derived from a subgraph retrieved from a knowledge base based on the request are concatenated with pretrained word embeddings for both the request and a candidate predicate. The concatenated inputs for both the request and the predicate are encoded using two independent LSTM networks, and then a matching score is calculated using a match LSTM network. The result is identified based on the matching scores for a plurality of candidate predicates. The pretrained word embeddings allow for knowledge transfer, since pretrained word embeddings in one intent domain can apply to another intent domain without retraining.
    Type: Application
    Filed: March 12, 2019
    Publication date: September 17, 2020
    Inventors: Jianshu JI, Yeyun GONG, Nan DUAN, Yi-Cheng PAN, Guihong CAO