Patents by Inventor Pengcheng He

Pengcheng He has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Detecting Computer-Generated Hallucinations using Progressive Scope-of-Analysis Enlargement

Publication number: 20250103800

Abstract: A technique determines whether a target item is adequately supported by a source item, and therefore likely free of hallucinations. The technique operates by progressively expanding a scope of source content to be considered when determining whether an individual target part of the target item has support in the source item. For instance, the technique initially determines whether any individual source part in the source item supports the target part. If this stage fails to identify support, the technique next considers whether a larger portion of the source item supports the particular target item. In some implementations, the technique selects a scope of analysis at a particular stage by choosing a group of source parts that most closely match the target part under consideration. The technique concatenates those source parts in the same order in which they appear in the source item.

Type: Application

Filed: September 27, 2023

Publication date: March 27, 2025

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ahmed Elgohary GHONEIM, Pengcheng HE
Adversarial training of machine learning models

Patent number: 12242971

Abstract: This document relates to training of machine learning models such as neural networks. One example method involves providing a machine learning model having one or more layers and associated parameters and performing a pretraining stage on the parameters of the machine learning model to obtain pretrained parameters. The example method also involves performing a tuning stage on the machine learning model by using labeled training samples to tune the pretrained parameters. The tuning stage can include performing noise adjustment of the labeled training examples to obtain noise-adjusted training samples. The tuning stage can also include adjusting the pretrained parameters based at least on the labeled training examples and the noise-adjusted training examples to obtain adapted parameters. The example method can also include outputting a tuned machine learning model having the adapted parameters.

Type: Grant

Filed: January 29, 2020

Date of Patent: March 4, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Xiaodong Liu, Jianfeng Gao, Pengcheng He, Weizhu Chen
Language-model pretraining with gradient-disentangled embedding sharing

Patent number: 12223269

Abstract: A method for training a language model comprises (a) receiving vectorized training data as input to a multitask pretraining problem; (b) generating modified vectorized training data based on the vectorized training data, according to an upstream data embedding; (c) emitting pretraining output based on the modified vectorized training data, according to a downstream data embedding equivalent to the upstream data embedding; and (d) adjusting the upstream data embedding and the downstream data embedding by computing, based on the pretraining output, a gradient of the upstream data embedding disentangled from a gradient of the downstream data embedding, thereby advancing the multitask pretraining problem toward a pretrained state.

Type: Grant

Filed: May 18, 2022

Date of Patent: February 11, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Pengcheng He, Jianfeng Gao, Weizhu Chen
Compressing Information Provided to a Machine-Trained Generative Model

Publication number: 20250028750

Abstract: A technique is described for compressing input information fed to a machine-trained generative model. The technique includes: receiving original input information having a plurality of sentences; performing word-level encoding of the original input information using a first part of a machine-trained transformer model, to provide word-level encoded information; performing sentence-level encoding of the word-level encoded information using a second part of the machine-trained transformer model, to provide scores associated with the first plurality of sentences; selecting a subset of the sentences in the original input information based on the scores, to provide modified input information; and providing the modified input information to the machine-trained generative model. The operation of word-level encoding performs parallel processing on portions of the original input information.

Type: Application

Filed: July 21, 2023

Publication date: January 23, 2025

Applicant: Microsoft Technology Licensing, LLC

Inventors: Lesly Sadiht MICULICICH WERLEN, Pengcheng HE, Yujia XIE, Wei XIONG, Siqing CHEN, Xun WANG, Elsie Prasuna NALLIPOGU, Yanling XIONG
Interacting with a Language Model using External Knowledge and Feedback

Publication number: 20240362418

Abstract: A technique supplements a language model with knowledge information retrieved from external sources. The technique operates by: receiving a query; receiving knowledge information based on the query; generating original model-input information that includes the query and the knowledge information; and presenting the original model-input information to the language model. The technique further includes: receiving an original response from the language model; generating a usefulness measure that identifies usefulness of the original response; and determining whether the usefulness measure satisfies a prescribed test. Upon determining that the usefulness measure does not satisfy the test, the technique includes: generating revised model-input information that includes feedback information; presenting the revised model-input information to the language model; and receiving a revised response from the language model.

Type: Application

Filed: April 28, 2023

Publication date: October 31, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Baolin PENG, Michel GALLEY, Hao CHENG, Pengcheng HE, Nguyen Hung BACH, Weizhu CHEN, Jianfeng GAO
MULTI-TASK MACHINE LEARNING ARCHITECTURES AND TRAINING PROCEDURES

Publication number: 20240346295

Abstract: This document relates to architectures and training procedures for multi-task machine learning models, such as neural networks. One example method involves providing a multi-task machine learning model having one or more shared layers and two or more task-specific layers. The method can also involve performing a pretraining stage on the one or more shared layers using one or more unsupervised prediction tasks.

Type: Application

Filed: May 3, 2024

Publication date: October 17, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Weizhu CHEN, Pengcheng HE, Xiaodong LIU, Jianfeng GAO
Efficient transformer language models with disentangled attention and multi-step decoding

Patent number: 12061876

Abstract: Systems and methods are provided for facilitating the building and use of natural language understanding models. The systems and methods identify a plurality of tokens and use them to generate one or more pre-trained natural language models using a transformer. The transformer disentangles the content embedding and positional embedding in the computation of its attention matrix. Systems and methods are also provided to facilitate self-training of the pre-trained natural language model by utilizing multi-step decoding to better reconstruct masked tokens and improve pre-training convergence.

Type: Grant

Filed: December 9, 2022

Date of Patent: August 13, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
LENGTH-CONTROLLED TEXT GENERATION USING A TEXT PROCESSING MODEL

Publication number: 20240193350

Abstract: The disclosure herein describes training a text processing model to generate model output text data using input text data and a sentence count. A training data entry including input text data and output text data is obtained. A sentence count of the output text data is determined, and the output text data is labeled with a sentence count label and a sentence number label. Model output text data is generated with a text processing model using the input text data and determined sentence count as input data. Loss data associated with a difference between the generated model output text data and the labeled output text data is determined and the text processing model is adjusted using the determined loss data. The use of labeled output text data enables the model to be trained to produce output text data with a target sentence count in a computationally efficient manner.

Type: Application

Filed: December 9, 2022

Publication date: June 13, 2024

Inventors: Yujia XIE, Lesly Sadiht MICULICICH WERLEN, Song WANG, Pengcheng HE, Yuantao WANG, Wei XIONG, Yanling XIONG
Multi-task machine learning architectures and training procedures

Patent number: 12008459

Abstract: This document relates to architectures and training procedures for multi-task machine learning models, such as neural networks. One example method involves providing a multi-task machine learning model having one or more shared layers and two or more task-specific layers. The method can also involve performing a pretraining stage on the one or more shared layers using one or more unsupervised prediction tasks. The method can also involve performing a tuning stage on the one or more shared layers and the two or more task-specific layers using respective task-specific objectives.

Type: Grant

Filed: June 17, 2019

Date of Patent: June 11, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Weizhu Chen, Pengcheng He, Xiaodong Liu, Jianfeng Gao
Split-type friction stir welding tool with adjustable stirring pin length

Patent number: 11969818

Abstract: A split-type friction stir welding head with an adjustable stirring pin length includes a stirring head housing, where a clamping handle and a detachable stirring pin are successively mounted in the stirring head housing towards a welding direction, the clamping handle is provided with external threads on a periphery thereof and in drive connection with an adjusting plate through threads, the adjusting plate is limited, fixed and mounted in the stirring head housing, and a pore-diameter-adjustable aperture shoulder is mounted between a bottom of the stirring head housing and the detachable stirring pin in order to compensate for an outside gap between a stirring pin channel of the stirring head housing and the detachable stirring pin.

Type: Grant

Filed: September 22, 2023

Date of Patent: April 30, 2024

Assignee: Hefei University of Technology

Inventors: Jingfeng Wang, Beibei Li, Pengcheng He, Ao Liu
Shoulder-angle-adjustable friction stir welding head suitable for fillet joint

Patent number: 11958127

Abstract: A shoulder-angle-adjustable friction stir welding head suitable for a fillet joint includes a stirring head body. A front end of the stirring head body is mounted with a movable shoulder, a stirring pin channel is arranged on the movable shoulder, and the stirring pin channel may allow a stirring pin of the stirring head body to pass through. The present disclosure can respond to welding tasks of the fillet joint of different angles and enlarges an application scope of the friction stir welding head in a manner that the front end of the stirring head body is mounted with the movable shoulder, the stirring pin channel is arranged on the movable shoulder, the stirring pin channel is used for allowing the stirring pin of the stirring head body to pass through, and the angle of the movable shoulder is adjusted.

Type: Grant

Filed: September 14, 2023

Date of Patent: April 16, 2024

Assignee: Hefei University of Technology

Inventors: Beibei Li, Pengcheng He, Jingfeng Wang, Wenqi Qi, Guoqiang Li
ML USING N-GRAM INDUCED INPUT REPRESENTATION

Publication number: 20240086619

Abstract: Generally discussed herein are devices, systems, and methods for generating an embedding that is both local string dependent and global string dependent. The generated embedding can improve machine learning (ML) model performance. A method can include converting a string of words to a series of tokens, generating a local string-dependent embedding of each token of the series of tokens, generating a global string-dependent embedding of each token of the series of tokens, combining the local string dependent embedding the global string dependent embedding to generate an n-gram induced embedding of each token of the series of tokens, obtaining a masked language model (MLM) previously trained to generate a masked word prediction, and executing the MLM based on the n-gram induced embedding of each token to generate the masked word prediction.

Type: Application

Filed: October 26, 2023

Publication date: March 14, 2024

Inventors: Pengcheng HE, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
UNIFIED NATURAL LANGUAGE MODEL WITH SEGMENTED AND AGGREGATE ATTENTION

Publication number: 20240062020

Abstract: Systems and methods are provided for training and using a novel unified language foundation model. An encoder-decoder natural language model is obtained and various training data is obtained and used for training. The training process integrates a combination of replaced token detection, corrupted span reconstruction, and disentangled attention methodologies to produce a unified encoder-decoder model. The trained model is trained for performing both natural language understanding (NLU) tasks and natural language generation (NLG) tasks. Attention applied to the model is applied discretely to segmented chunks of encoded data during processing to improve the efficiency of applying attention by the model.

Type: Application

Filed: October 20, 2022

Publication date: February 22, 2024

Inventors: Pengcheng HE, Jianfeng GAO, Nanshan ZENG, Xuedong HUANG, Wei XIONG, Baolin PENG
PRE-TRAINING A UNIFIED NATURAL LANGUAGE MODEL WITH CORRUPTED SPAN AND REPLACED TOKEN DETECTION

Publication number: 20240062018

Abstract: Systems and methods are provided for training and using a novel unified language foundation model. An encoder-decoder natural language model is obtained and various training data is obtained and used for training. The training process integrates a combination of replaced token detection, corrupted span reconstruction, and disentangled attention methodologies to produce a unified encoder-decoder model. The trained model is trained for performing both natural language understanding (NLU) tasks and natural language generation (NLG) tasks. Attention applied to the model is applied discretely to segmented chunks of encoded data during processing to improve the efficiency of applying attention by the model.

Type: Application

Filed: October 20, 2022

Publication date: February 22, 2024

Inventors: Pengcheng HE, Jianfeng GAO, Nanshan ZENG, Xuedong HUANG, Wei XIONG, Baolin PENG
ADVERSARIAL PRETRAINING OF MACHINE LEARNING MODELS

Publication number: 20240013055

Abstract: This document relates to training of machine learning models. One example method involves providing a machine learning model having one or more mapping layers. The one or more mapping layers can include at least a first mapping layer configured to map components of pretraining examples into first representations in a space. The example method also includes performing a pretraining stage on the one or more mapping layers using the pretraining examples. The pretraining stage can include adding noise to the first representations of the components of the pretraining examples to obtain noise-adjusted first representations. The pretraining stage can also include performing a self-supervised learning process to pretrain the one or more mapping layers using at least the first representations of the training data items and the noise-adjusted first representations of the training data items.

Type: Application

Filed: September 26, 2023

Publication date: January 11, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Xiaodong Liu, Hao Cheng, Yu Wang, Jianfeng Gao, Weizhu Chen, Pengcheng He, Hoifung Poon
ML using n-gram induced input representation

Patent number: 11836438

Abstract: Generally discussed herein are devices, systems, and methods for generating an embedding that is both local string dependent and global string dependent. The generated embedding can improve machine learning (ML) model performance. A method can include converting a string of words to a series of tokens, generating a local string-dependent embedding of each token of the series of tokens, generating a global string-dependent embedding of each token of the series of tokens, combining the local string dependent embedding the global string dependent embedding to generate an n-gram induced embedding of each token of the series of tokens, obtaining a masked language model (MLM) previously trained to generate a masked word prediction, and executing the MLM based on the n-based induced embedding of each token to generate the masked word prediction.

Type: Grant

Filed: April 13, 2021

Date of Patent: December 5, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
Adversarial pretraining of machine learning models

Patent number: 11803758

Abstract: This document relates to training of machine learning models. One example method involves providing a machine learning model having one or more mapping layers. The one or more mapping layers can include at least a first mapping layer configured to map components of pretraining examples into first representations in a space. The example method also includes performing a pretraining stage on the one or more mapping layers using the pretraining examples. The pretraining stage can include adding noise to the first representations of the components of the pretraining examples to obtain noise-adjusted first representations. The pretraining stage can also include performing a self-supervised learning process to pretrain the one or more mapping layers using at least the first representations of the training data items and the noise-adjusted first representations of the training data items.

Type: Grant

Filed: May 22, 2020

Date of Patent: October 31, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Xiaodong Liu, Hao Cheng, Yu Wang, Jianfeng Gao, Weizhu Chen, Pengcheng He, Hoifung Poon
Example based entity extraction, slot filling and value recommendation

Patent number: 11720757

Abstract: Methods, systems, apparatuses, and computer program products are provided for extracting an entity value from a sentence. An embedding set that may include one or more sentence embeddings is generated for at least part of a first sentence that is tagged to associate a first named entity in the sentence with an entity type. A plurality of candidate embeddings is also generated for at least part of a second sentence. The one or more sentence embeddings in the embedding set may be compared with each of the plurality of candidate embeddings, and a match score may be assigned to each comparison to generate a match score set. A particular match score of the match score set may be identified that exceeds a similarity threshold, and an entity value of the entity type may be extracted from the second sentence associated with the identified match score.

Type: Grant

Filed: August 19, 2019

Date of Patent: August 8, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Vikas Bahirwani, Jade Huang, Matthew Brigham Hall, Yu Zhao, Pengcheng He, Weizhu Chen, Eslam K. Abdelreheem, Jiayuan Huang, Yuting Sun
EFFICIENT TRANSFORMER LANGUAGE MODELS WITH DISENTANGLED ATTENTION AND MULTI-STEP DECODING

Publication number: 20230222295

Abstract: Systems and methods are provided for facilitating the building and use of natural language understanding models. The systems and methods identify a plurality of tokens and use them to generate one or more pre-trained natural language models using a transformer. The transformer disentangles the content embedding and positional embedding in the computation of its attention matrix. Systems and methods are also provided to facilitate self-training of the pre-trained natural language model by utilizing multi-step decoding to better reconstruct masked tokens and improve pre-training convergence.

Type: Application

Filed: December 9, 2022

Publication date: July 13, 2023

Inventors: Pengcheng HE, Xiaodong LIU, Jianfeng GAO, Weizhu CHEN
Nursing pillow

Patent number: D1062303

Type: Grant

Filed: April 17, 2023

Date of Patent: February 18, 2025

Inventor: Pengcheng He

1 2 next