Patents by Inventor Jianfeng Gao

Jianfeng Gao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240144049
    Abstract: A method for computer question answering includes, at a retriever subsystem of a question answering computer system, identifying a plurality of relevant text evidence strings for an input text question. At a linker subsystem of the question answering computer system, one or more of the plurality of relevant text evidence strings are associated with a respective secondary text evidence string to form a plurality of evidence chains via a previously-trained entity-linking machine-learning model. At a chainer subsystem of the question answering computer system, a ranked set of the evidence chains is identified based at least in part on an output of a generative machine-learning model applied to each of the plurality of evidence chains. At a reader subsystem of the question answering computer system, an answer to the input text question is output based at least in part on the ranked set of evidence chains. (A code sketch follows this entry.)
    Type: Application
    Filed: October 5, 2022
    Publication date: May 2, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Hao CHENG, Xiaodong LIU, Jianfeng GAO, Kaixin MA
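A minimal sketch of the four-stage retrieve → link → chain → read control flow described in publication 20240144049. The function names, token-overlap scoring, and toy corpus below are hypothetical stand-ins for the trained retriever, entity-linking, generative chainer, and reader models the application describes.

```python
# Toy pipeline sketch; every heuristic here stands in for a trained model.
def retrieve(question, corpus, k=3):
    """Retriever: rank evidence strings by token overlap with the question."""
    q = set(question.lower().split())
    return sorted(corpus, key=lambda s: -len(q & set(s.lower().split())))[:k]

def link(evidence, entity_index):
    """Linker: pair each evidence string with a secondary string sharing an entity
    (standing in for the previously-trained entity-linking model)."""
    chains = []
    for ev in evidence:
        for entity, secondary in entity_index.items():
            if entity.lower() in ev.lower():
                chains.append((ev, secondary))
    return chains

def chain_rank(question, chains):
    """Chainer: rank evidence chains; overlap with the question stands in for a
    generative model's score."""
    q = set(question.lower().split())
    return sorted(chains, key=lambda c: len(q & set(" ".join(c).lower().split())),
                  reverse=True)

def read(question, ranked_chains):
    """Reader: emit an answer from the top-ranked chain (a real reader is a trained model)."""
    return ranked_chains[0][1] if ranked_chains else None

corpus = ["Kyoto was the capital of Japan for over a millennium.",
          "Tokyo is the current capital of Japan."]
entity_index = {"Kyoto": "Kyoto is located on the island of Honshu."}
question = "Where is the former capital of Japan located?"
print(read(question, chain_rank(question, link(retrieve(question, corpus), entity_index))))
```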
  • Publication number: 20240126993
    Abstract: A computing system includes a logic subsystem and a storage subsystem holding instructions executable by the logic subsystem to implement a transformer-based text encoder. The transformer-based text encoder includes a plurality of transformer blocks previously-trained to apply encoding operations to computer-readable text representations of input text strings, the computer-readable text representations including computer-readable question representations of input text questions, and computer-readable passage representations of input text passages. The plurality of transformer blocks include a shared transformer block, trained for both the computer-readable question representations and the computer-readable passage representations, and a specialized transformer block that includes two or more input-specific subnetworks and a routing function to select an input-specific subnetwork of the two or more input-specific subnetworks for each of the computer-readable text representations. (A code sketch follows this entry.)
    Type: Application
    Filed: October 5, 2022
    Publication date: April 18, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Hao CHENG, Hao FANG, Xiaodong LIU, Jianfeng GAO
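A minimal PyTorch sketch of the shared/specialized encoder layout in publication 20240126993. The dimensions, the two-expert feed-forward design, and the hard routing rule keyed on a declared input type are illustrative assumptions; the application also covers routing computed from the representations themselves.

```python
import torch
import torch.nn as nn

class SpecializedBlock(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        # Two input-specific subnetworks: index 0 for questions, 1 for passages.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(2)
        ])

    def route(self, input_type):
        # Routing function (assumed): a hard rule on the declared input type.
        return 0 if input_type == "question" else 1

    def forward(self, x, input_type):
        h, _ = self.attn(x, x, x)
        x = x + h
        return x + self.experts[self.route(input_type)](x)

class Encoder(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.shared = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.specialized = SpecializedBlock(dim)

    def forward(self, token_ids, input_type):
        x = self.shared(self.embed(token_ids))   # shared across both input kinds
        return self.specialized(x, input_type)   # input-specific processing

enc = Encoder()
q = enc(torch.randint(0, 1000, (2, 8)), "question")
p = enc(torch.randint(0, 1000, (2, 32)), "passage")
print(q.shape, p.shape)  # torch.Size([2, 8, 64]) torch.Size([2, 32, 64])
```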
  • Patent number: 11961509
    Abstract: Methods and systems are disclosed for improving dialog management for task-oriented dialog systems. The disclosed dialog builder leverages machine teaching to improve the development of dialog managers. In this way, the dialog builder combines the strengths of both rule-based and machine-learned approaches to allow dialog authors to: (1) import a dialog graph developed using popular dialog composers, (2) convert the dialog graph to text-based training dialogs, (3) continuously improve the trained dialogs based on log dialogs, and (4) generate a corrected dialog for retraining the machine learning model. (A code sketch of step (2) follows this entry.)
    Type: Grant
    Filed: April 3, 2020
    Date of Patent: April 16, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Swadheen Kumar Shukla, Lars Hasso Liden, Thomas Park, Matthew David Mazzola, Shahin Shayandeh, Jianfeng Gao, Eslam Kamal Abdelreheem
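A hypothetical sketch of step (2) in patent 11961509: converting an imported dialog graph into text-based training dialogs by enumerating root-to-leaf paths. The graph format and dialog layout below are illustrative, not the patented schema.

```python
def graph_to_dialogs(graph, node, path=None):
    """Depth-first walk that yields one text-based training dialog per path."""
    path = (path or []) + [graph[node]["text"]]
    children = graph[node].get("next", [])
    if not children:
        yield path
    for child in children:
        yield from graph_to_dialogs(graph, child, path)

dialog_graph = {
    "root":  {"text": "bot: How can I help?", "next": ["order", "track"]},
    "order": {"text": "user: I want to order a pizza.", "next": ["size"]},
    "size":  {"text": "bot: What size?", "next": []},
    "track": {"text": "user: Where is my order?", "next": []},
}

for dialog in graph_to_dialogs(dialog_graph, "root"):
    print("\n".join(dialog), end="\n---\n")
```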
  • Publication number: 20240086619
    Abstract: Generally discussed herein are devices, systems, and methods for generating an embedding that is both local string dependent and global string dependent. The generated embedding can improve machine learning (ML) model performance. A method can include converting a string of words to a series of tokens, generating a local string-dependent embedding of each token of the series of tokens, generating a global string-dependent embedding of each token of the series of tokens, combining the local string-dependent embedding and the global string-dependent embedding to generate an n-gram induced embedding of each token of the series of tokens, obtaining a masked language model (MLM) previously trained to generate a masked word prediction, and executing the MLM based on the n-gram induced embedding of each token to generate the masked word prediction. (A code sketch follows this entry.)
    Type: Application
    Filed: October 26, 2023
    Publication date: March 14, 2024
    Inventors: Pengcheng HE, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
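Publication 20240086619 (and the granted patent 11836438 below) combines a local, n-gram-neighborhood embedding with a global, whole-string embedding per token. The PyTorch sketch below is one hedged reading: the convolutional local encoder, self-attention global encoder, and additive combination are assumptions, not the claimed implementation.

```python
import torch
import torch.nn as nn

class NGramInducedEmbedding(nn.Module):
    def __init__(self, vocab=1000, dim=64, max_len=128, ngram=3):
        super().__init__()
        self.tok = nn.Embedding(vocab, dim)
        self.pos = nn.Embedding(max_len, dim)
        # Local string-dependent: each token depends on its n-gram neighborhood.
        self.local = nn.Conv1d(dim, dim, kernel_size=ngram, padding=ngram // 2)
        # Global string-dependent: a self-attention pass over the full string.
        self.globl = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, token_ids):
        pos_ids = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok(token_ids) + self.pos(pos_ids)
        local = self.local(x.transpose(1, 2)).transpose(1, 2)
        globl, _ = self.globl(x, x, x)
        return local + globl  # combined n-gram induced embedding, fed to the MLM

emb = NGramInducedEmbedding()
print(emb(torch.randint(0, 1000, (2, 16))).shape)  # torch.Size([2, 16, 64])
```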
  • Publication number: 20240062018
    Abstract: Systems and methods are provided for training and using a novel unified language foundation model. An encoder-decoder natural language model is obtained and various training data is obtained and used for training. The training process integrates a combination of replaced token detection, corrupted span reconstruction, and disentangled attention methodologies to produce a unified encoder-decoder model. The trained model is trained for performing both natural language understanding (NLU) tasks and natural language generation (NLG) tasks. Attention is applied discretely to segmented chunks of encoded data during processing, improving the efficiency of the model's attention computation. (A code sketch follows this entry.)
    Type: Application
    Filed: October 20, 2022
    Publication date: February 22, 2024
    Inventors: Pengcheng HE, Jianfeng GAO, Nanshan ZENG, Xuedong HUANG, Wei XIONG, Baolin PENG
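A sketch of the chunked-attention idea shared by publications 20240062018 and 20240062020 (next entry): rather than attending over the full sequence, attention is applied discretely within fixed-size segments of the encoded data. The chunk size and shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

def chunked_attention(x, attn, chunk=4):
    """Apply self-attention independently within each chunk of the sequence.
    Cost falls from O(L^2) to O(L * chunk), at the price of no cross-chunk mixing."""
    B, L, D = x.shape
    assert L % chunk == 0, "pad the sequence to a multiple of the chunk size"
    x = x.reshape(B * (L // chunk), chunk, D)  # fold chunks into the batch dim
    out, _ = attn(x, x, x)
    return out.reshape(B, L, D)

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
x = torch.randn(2, 16, 64)
print(chunked_attention(x, attn).shape)  # torch.Size([2, 16, 64])
```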
  • Publication number: 20240062020
    Abstract: Systems and methods are provided for training and using a novel unified language foundation model. An encoder-decoder natural language model is obtained and various training data is obtained and used for training. The training process integrates a combination of replaced token detection, corrupted span reconstruction, and disentangled attention methodologies to produce a unified encoder-decoder model. The trained model is trained for performing both natural language understanding (NLU) tasks and natural language generation (NLG) tasks. Attention is applied discretely to segmented chunks of encoded data during processing, improving the efficiency of the model's attention computation. (See the sketch following the previous entry, publication 20240062018.)
    Type: Application
    Filed: October 20, 2022
    Publication date: February 22, 2024
    Inventors: Pengcheng HE, Jianfeng GAO, Nanshan ZENG, Xuedong HUANG, Wei XIONG, Baolin PENG
  • Patent number: 11875787
    Abstract: This document relates to machine learning. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining a task-semantically-conditioned generative model that has been pretrained based at least on a first training data set having unlabeled training examples and semantically conditioned based at least on a second training data set having dialog act-labeled utterances. The method or technique can also include inputting dialog acts into the semantically-conditioned generative model and obtaining synthetic utterances that are output by the semantically-conditioned generative model. The method or technique can also include outputting the synthetic utterances. (A code sketch follows this entry.)
    Type: Grant
    Filed: October 11, 2022
    Date of Patent: January 16, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Nanshan Zeng, Jianfeng Gao
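A hedged sketch of the usage pattern in patent 11875787: linearize a dialog act, feed it to a semantically-conditioned generative model, and collect the synthetic utterance. Here an off-the-shelf GPT-2 stands in for the pretrained, semantically-conditioned model, and the dialog-act linearization format is an assumption.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical linearization of a dialog act as a conditioning prompt.
dialog_act = "inform ( restaurant = Via Roma ; food = italian ; price = cheap ) ->"
inputs = tok(dialog_act, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_p=0.9,
                     pad_token_id=tok.eos_token_id)
# Strip the prompt; what remains is the synthetic utterance.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```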
  • Publication number: 20240013055
    Abstract: This document relates to training of machine learning models. One example method involves providing a machine learning model having one or more mapping layers. The one or more mapping layers can include at least a first mapping layer configured to map components of pretraining examples into first representations in a space. The example method also includes performing a pretraining stage on the one or more mapping layers using the pretraining examples. The pretraining stage can include adding noise to the first representations of the components of the pretraining examples to obtain noise-adjusted first representations. The pretraining stage can also include performing a self-supervised learning process to pretrain the one or more mapping layers using at least the first representations of the pretraining examples and the noise-adjusted first representations of the pretraining examples. (A code sketch follows this entry.)
    Type: Application
    Filed: September 26, 2023
    Publication date: January 11, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Xiaodong Liu, Hao Cheng, Yu Wang, Jianfeng Gao, Weizhu Chen, Pengcheng He, Hoifung Poon
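A minimal sketch of the noise-adjusted pretraining step in publication 20240013055 (and the granted patent 11803758 below): map inputs to first representations, add noise, and train with a self-supervised objective over the clean and noise-adjusted views. The Gaussian noise and symmetric KL consistency loss are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Embedding(1000, 64)   # mapping layer -> first representations
head = nn.Linear(64, 1000)       # toy prediction head
opt = torch.optim.Adam(list(embed.parameters()) + list(head.parameters()), lr=1e-3)

tokens = torch.randint(0, 1000, (8, 16))
reps = embed(tokens)                             # first representations
noisy = reps + 0.1 * torch.randn_like(reps)      # noise-adjusted first representations

logp_clean = F.log_softmax(head(reps), dim=-1)
logp_noisy = F.log_softmax(head(noisy), dim=-1)
# Self-supervised consistency: clean and noisy views should predict alike.
loss = (F.kl_div(logp_noisy, logp_clean.exp(), reduction="batchmean")
        + F.kl_div(logp_clean, logp_noisy.exp(), reduction="batchmean"))
loss.backward()
opt.step()
print(float(loss))
```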
  • Publication number: 20230401445
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long short-term memory (RNN-LSTM), for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains. (A code sketch follows this entry.)
    Type: Application
    Filed: August 29, 2023
    Publication date: December 14, 2023
    Inventors: Dilek Z. Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
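A sketch of the joint model in publication 20230401445 (and granted patent 11783173 below): one bidirectional LSTM encoder feeds three heads that jointly predict slots per token plus intent and domain per utterance, trained with a multi-task loss over the semantic frame. Sizes and the mean-pooled readout are assumptions.

```python
import torch
import torch.nn as nn

class JRNN(nn.Module):
    def __init__(self, vocab=1000, dim=64, n_slots=10, n_intents=5, n_domains=3):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.slot_head = nn.Linear(2 * dim, n_slots)      # one label per token
        self.intent_head = nn.Linear(2 * dim, n_intents)  # one label per query
        self.domain_head = nn.Linear(2 * dim, n_domains)  # one label per query

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        pooled = h.mean(dim=1)  # utterance summary for intent/domain
        return self.slot_head(h), self.intent_head(pooled), self.domain_head(pooled)

model = JRNN()
slots, intent, domain = model(torch.randint(0, 1000, (2, 12)))
# Multi-task loss over the complete semantic frame (random stand-in labels):
loss = (nn.functional.cross_entropy(slots.reshape(-1, 10), torch.randint(0, 10, (24,)))
        + nn.functional.cross_entropy(intent, torch.randint(0, 5, (2,)))
        + nn.functional.cross_entropy(domain, torch.randint(0, 3, (2,))))
loss.backward()
```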
  • Patent number: 11836438
    Abstract: Generally discussed herein are devices, systems, and methods for generating an embedding that is both local string dependent and global string dependent. The generated embedding can improve machine learning (ML) model performance. A method can include converting a string of words to a series of tokens, generating a local string-dependent embedding of each token of the series of tokens, generating a global string-dependent embedding of each token of the series of tokens, combining the local string-dependent embedding and the global string-dependent embedding to generate an n-gram induced embedding of each token of the series of tokens, obtaining a masked language model (MLM) previously trained to generate a masked word prediction, and executing the MLM based on the n-gram induced embedding of each token to generate the masked word prediction.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: December 5, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
  • Patent number: 11803758
    Abstract: This document relates to training of machine learning models. One example method involves providing a machine learning model having one or more mapping layers. The one or more mapping layers can include at least a first mapping layer configured to map components of pretraining examples into first representations in a space. The example method also includes performing a pretraining stage on the one or more mapping layers using the pretraining examples. The pretraining stage can include adding noise to the first representations of the components of the pretraining examples to obtain noise-adjusted first representations. The pretraining stage can also include performing a self-supervised learning process to pretrain the one or more mapping layers using at least the first representations of the pretraining examples and the noise-adjusted first representations of the pretraining examples.
    Type: Grant
    Filed: May 22, 2020
    Date of Patent: October 31, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xiaodong Liu, Hao Cheng, Yu Wang, Jianfeng Gao, Weizhu Chen, Pengcheng He, Hoifung Poon
  • Patent number: 11783173
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long short-term memory (RNN-LSTM), for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains.
    Type: Grant
    Filed: August 4, 2016
    Date of Patent: October 10, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dilek Z. Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
  • Publication number: 20230267308
    Abstract: Knowledge graphs can greatly improve the quality of content recommendation systems. There is a broad variety of knowledge graphs in the domain, including clicked user-ad graphs, clicked query-ad graphs, keyword-display URL graphs, etc. A hierarchical Transformer model learns entity embeddings in knowledge graphs. The model consists of two different Transformer blocks where the bottom block generates relation-dependent embeddings for the source entity and its neighbors, and the top block aggregates the outputs from the bottom block to produce the target entity embedding. To balance the information from contextual entities and the source entity itself, a masked entity model (MEM) task is combined with a link prediction task in model training. (A code sketch follows this entry.)
    Type: Application
    Filed: May 4, 2023
    Publication date: August 24, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jian JIAO, Xiaodong LIU, Ruofei ZHANG, Jianfeng GAO
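An illustrative sketch of the two-level Transformer in publication 20230267308 (and granted patent 11676001 below): a bottom block builds relation-dependent embeddings for the source entity paired with each neighbor, and a top block aggregates them into the target entity embedding. The triple encoding and mean-pooling choices are assumptions.

```python
import torch
import torch.nn as nn

dim = 64
bottom = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
top = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)

# One (source, relation, neighbor) triple per row; embeddings are random stand-ins.
source, relations, neighbors = torch.randn(1, dim), torch.randn(5, dim), torch.randn(5, dim)
pairs = torch.stack([source.expand(5, dim), relations, neighbors], dim=1)  # (5, 3, dim)

relation_dependent = bottom(pairs).mean(dim=1)     # (5, dim), one per neighbor
aggregated = top(relation_dependent.unsqueeze(0))  # top block over the neighbor set
target_entity_embedding = aggregated.mean(dim=1)   # (1, dim)
print(target_entity_embedding.shape)
```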
  • Patent number: 11734066
    Abstract: Generally discussed herein are devices, systems, and methods for scheduling tasks to be completed by resources. A method can include identifying features of the task, the features including a time-dependent feature and a time-independent feature, the time-dependent feature indicating a time the task is more likely to be successfully completed by the resource, converting the features to feature values based on a predefined mapping of features to feature values in a first memory device, determining, by a gradient boost tree model and based on a first current time and the feature values, a likelihood the resource will successfully complete the task, and scheduling the task to be performed by the resource based on the determined likelihood. (A code sketch follows this entry.)
    Type: Grant
    Filed: January 8, 2020
    Date of Patent: August 22, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinchao Li, Yu Wang, Karan Srivastava, Jianfeng Gao, Prabhdeep Singh, Haiyuan Cao, Xinying Song, Hui Su, Jaideep Sarkar
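A hedged sketch of the scoring step in patent 11734066: map a time-dependent and a time-independent task feature to values, train a gradient-boosted tree model, and score the success likelihood at a candidate time. The synthetic data, feature choices, and 0.5 scheduling threshold are assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
hour = rng.integers(0, 24, 500)      # time-dependent feature (hour of day)
priority = rng.integers(0, 3, 500)   # time-independent feature
# Synthetic ground truth: tasks succeed more often during business hours.
success = ((hour >= 9) & (hour <= 17) & (priority > 0)).astype(int)

model = GradientBoostingClassifier().fit(np.column_stack([hour, priority]), success)

def schedule(task_priority, candidate_hour):
    likelihood = model.predict_proba([[candidate_hour, task_priority]])[0, 1]
    return likelihood, likelihood > 0.5   # schedule if success is likely

print(schedule(task_priority=2, candidate_hour=10))  # high likelihood
print(schedule(task_priority=2, candidate_hour=3))   # low likelihood
```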
  • Patent number: 11704551
    Abstract: Techniques for iterative query-based analysis of text are described. According to various implementations, a neural network architecture is implemented that receives a query for information about text content and iteratively analyzes the content using the query. During the analysis, the state of the query evolves until it reaches a termination state, at which point the state of the query is output as an answer to the initial query. (A code sketch follows this entry.)
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: July 18, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Po-Sen Huang, Jianfeng Gao, Weizhu Chen, Yelong Shen
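A sketch of the iterative loop in patent 11704551: a query state attends over the encoded text, is updated, and the loop stops once a termination gate fires, at which point the state is read out as the answer. The GRU update, gate threshold, and step cap are assumptions.

```python
import torch
import torch.nn as nn

dim = 64
memory = torch.randn(20, dim)   # encoded text content
state = torch.randn(1, dim)     # initial query state
update = nn.GRUCell(dim, dim)   # state-evolution step
terminate = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())

for step in range(10):          # hard cap on reasoning steps
    attn = torch.softmax(memory @ state.t(), dim=0)  # (20, 1) attention over text
    context = (attn * memory).sum(dim=0, keepdim=True)
    state = update(context, state)                   # evolve the query state
    if terminate(state).item() > 0.5:                # termination state reached
        break

answer_representation = state   # decoded into the answer downstream
print(step, answer_representation.shape)
```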
  • Publication number: 20230222295
    Abstract: Systems and methods are provided for facilitating the building and use of natural language understanding models. The systems and methods identify a plurality of tokens and use them to generate one or more pre-trained natural language models using a transformer. The transformer disentangles the content embedding and positional embedding in the computation of its attention matrix. Systems and methods are also provided to facilitate self-training of the pre-trained natural language model by utilizing multi-step decoding to better reconstruct masked tokens and improve pre-training convergence. (A code sketch follows this entry.)
    Type: Application
    Filed: December 9, 2022
    Publication date: July 13, 2023
    Inventors: Pengcheng HE, Xiaodong LIU, Jianfeng GAO, Weizhu CHEN
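A simplified sketch of the disentangled attention in publication 20230222295: content and position are embedded separately, and the attention matrix sums content-to-content, content-to-position, and position-to-content terms. Relative-position handling is simplified to absolute positions here for brevity, an assumption on top of the abstract.

```python
import torch
import torch.nn as nn

L, dim = 8, 64
content = torch.randn(L, dim)                      # content embeddings
position = nn.Embedding(L, dim)(torch.arange(L))   # positional embeddings

q_c, k_c = nn.Linear(dim, dim)(content), nn.Linear(dim, dim)(content)
q_p, k_p = nn.Linear(dim, dim)(position), nn.Linear(dim, dim)(position)

attn = (q_c @ k_c.t()     # content-to-content
        + q_c @ k_p.t()   # content-to-position
        + q_p @ k_c.t())  # position-to-content
attn = torch.softmax(attn / dim ** 0.5, dim=-1)
print(attn.shape)  # torch.Size([8, 8])
```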
  • Patent number: 11676001
    Abstract: Knowledge graphs can greatly improve the quality of content recommendation systems. There is a broad variety of knowledge graphs in the domain, including clicked user-ad graphs, clicked query-ad graphs, keyword-display URL graphs, etc. A hierarchical Transformer model learns entity embeddings in knowledge graphs. The model consists of two different Transformer blocks where the bottom block generates relation-dependent embeddings for the source entity and its neighbors, and the top block aggregates the outputs from the bottom block to produce the target entity embedding. To balance the information from contextual entities and the source entity itself, a masked entity model (MEM) task is combined with a link prediction task in model training.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: June 13, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jian Jiao, Xiaodong Liu, Ruofei Zhang, Jianfeng Gao
  • Publication number: 20230153532
    Abstract: A method for training a language model comprises (a) receiving vectorized training data as input to a multitask pretraining problem; (b) generating modified vectorized training data based on the vectorized training data, according to an upstream data embedding; (c) emitting pretraining output based on the modified vectorized training data, according to a downstream data embedding equivalent to the upstream data embedding; and (d) adjusting the upstream data embedding and the downstream data embedding by computing, based on the pretraining output, a gradient of the upstream data embedding disentangled from a gradient of the downstream data embedding, thereby advancing the multitask pretraining problem toward a pretrained state. (A code sketch follows this entry.)
    Type: Application
    Filed: May 18, 2022
    Publication date: May 18, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Pengcheng HE, Jianfeng GAO, Weizhu CHEN
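A sketch of the gradient disentanglement in publication 20230153532: the downstream embedding reuses the upstream one, but the upstream table only receives gradients from its own objective because the shared copy is detached. The residual delta table below is one common way to realize such sharing and is treated here as an assumption.

```python
import torch
import torch.nn as nn

vocab, dim = 1000, 64
upstream = nn.Parameter(torch.randn(vocab, dim))  # upstream data embedding
delta = nn.Parameter(torch.zeros(vocab, dim))     # downstream-only adjustment

def downstream_embedding():
    # Equivalent to the upstream embedding, but .detach() blocks the downstream
    # task's gradient from flowing back into the upstream table.
    return upstream.detach() + delta

tokens = torch.randint(0, vocab, (4,))
upstream_loss = upstream[tokens].sum()                   # stand-in upstream objective
downstream_loss = downstream_embedding()[tokens].sum()   # stand-in downstream objective
(upstream_loss + downstream_loss).backward()

print(upstream.grad[tokens[0]].abs().sum() > 0)  # gradient from upstream task only
print(delta.grad[tokens[0]].abs().sum() > 0)     # downstream gradient lands in delta
```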
  • Publication number: 20230153348
    Abstract: Systems and methods are provided for determining a response to a query in a dialog. An entity extractor extracts rules and conditions associated with the query and determines a particular task. The disclosed technology generates a transformer-based dialog embedding by pre-training a transformer using dialog corpora including a plurality of tasks. A task-specific classifier generates a first set of candidate responses based on rules and conditions associated with the task. The transformer-based dialog embedding generates a second set of candidate responses to the query. The classifier accommodates changes made to a task through an interactive dialog editor, a form of machine teaching. A response generator generates a response based on the first and second sets of candidate responses using an optimization function. (A code sketch follows this entry.)
    Type: Application
    Filed: November 15, 2021
    Publication date: May 18, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jinchao LI, Lars H. LIDEN, Baolin PENG, Thomas PARK, Swadheen Kumar SHUKLA, Jianfeng GAO
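A toy sketch of the response selection in publication 20230153348: merge the rule-based classifier's candidates with the transformer's candidates and pick the response maximizing a combined score. The linear blend standing in for the optimization function, and its weight, are assumptions.

```python
def select_response(rule_candidates, model_candidates, alpha=0.6):
    """Each argument maps candidate response -> score in [0, 1]."""
    pool = set(rule_candidates) | set(model_candidates)
    def combined(r):
        return (alpha * rule_candidates.get(r, 0.0)
                + (1 - alpha) * model_candidates.get(r, 0.0))
    return max(pool, key=combined)

rules = {"Your order ships Friday.": 0.9}
model = {"It ships later this week.": 0.8, "Your order ships Friday.": 0.6}
print(select_response(rules, model))  # "Your order ships Friday."
```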
  • Publication number: 20230081624
    Abstract: A training technique trains a neural network having sparsely-activated sub-networks. It does so by processing plural batches of training data in two respective passes of the neural network, yielding first prediction information and second prediction information. For each batch, the technique randomly assigns different sub-networks in the first and second passes of the neural network to process the batch. Over the course of training, the technique attempts to minimize loss information, which describes the difference between the first prediction information and ground-truth information, and the difference between the second prediction information and the ground-truth information. Simultaneously, the technique attempts to minimize divergence information, which describes the divergence of the first prediction information from the second prediction information (and vice versa). (A code sketch follows this entry.)
    Type: Application
    Filed: October 11, 2021
    Publication date: March 16, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jian JIAO, Xiaodong LIU, Jianfeng GAO, Ruofei ZHANG
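A sketch of the two-pass scheme in publication 20230081624: each batch is routed to a randomly assigned sub-network twice, and the loss combines the task error from both passes with a symmetric divergence between their predictions. The two-expert layer and the loss weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

experts = nn.ModuleList([nn.Linear(16, 4) for _ in range(2)])  # sparse sub-networks

x = torch.randn(8, 16)
y = torch.randint(0, 4, (8,))

# Two passes with independently sampled sub-network assignments for the batch.
logits1 = experts[torch.randint(2, (1,)).item()](x)
logits2 = experts[torch.randint(2, (1,)).item()](x)

# Loss information: task error of both passes against the ground truth.
task_loss = F.cross_entropy(logits1, y) + F.cross_entropy(logits2, y)
# Divergence information: symmetric KL between the two passes' predictions.
p1, p2 = F.log_softmax(logits1, dim=-1), F.log_softmax(logits2, dim=-1)
divergence = (F.kl_div(p1, p2.exp(), reduction="batchmean")
              + F.kl_div(p2, p1.exp(), reduction="batchmean"))
loss = task_loss + 0.5 * divergence
loss.backward()
print(float(loss))
```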