Patents by Inventor Weizhu An
Weizhu An has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240166986
Abstract: Disclosed are a recombinant Escherichia coli for producing L-tyrosine and an application thereof, belonging to the technical fields of genetic engineering and bioengineering. In the present disclosure, the genes aroP and tyrP are knocked out; the endogenous E. coli gene yddG is expressed; fpk from Bifidobacterium adolescentis is heterologously expressed; the endogenous E. coli genes ppsA and tktA are expressed; and aroGfbr and tyrAfbr are then expressed. Knocking out tyrR, trpE, and pheA increases the synthesis flux of L-tyrosine. Finally, the endogenous gene poxB is knocked out to achieve stable fermentation performance at high glucose concentrations.
Type: Application
Filed: January 31, 2024
Publication date: May 23, 2024
Inventors: Jingwen Zhou, Jian Chen, Jurong Ping, Weizhu Zeng
-
Publication number: 20240117387
Abstract: The present disclosure provides a P450 cytochrome enzyme for andrographolide synthesis and its application, belonging to the field of bioengineering. The present disclosure uses Saccharomyces cerevisiae CEN.PK2-1D as a host, knocks out the ROX1 and GAL80 genes on the genome, and integratively expresses the GGPP synthase-encoding gene and the CPS diterpene synthase-encoding gene at the ROX1 site; it also freely expresses ApCPR, CYP71A8, and CYP71D10, the latter two with truncated signal peptides, successfully constructing a recombinant S. cerevisiae and achieving de novo synthesis of 3,15,19-trihydroxy-8(17),13-ent-labdadiene-16-oic acid. Compared with the blank control, the response value of the product peak reaches 1.9×10^6, and this strategy provides a necessary reference for analyzing the biosynthetic pathway of andrographolide and for using metabolic engineering to synthesize andrographolide and related derivatives thereof.
Type: Application
Filed: December 15, 2023
Publication date: April 11, 2024
Inventors: Jingwen Zhou, Shan Li, Song Gao, Sha Xu, Weizhu Zeng, Shiqin Yu
-
Publication number: 20240084338
Abstract: The present disclosure discloses a recombinant Escherichia coli for producing rosmarinic acid and an application thereof, belonging to the technical fields of genetic engineering and bioengineering. In the present disclosure, FjTA derived from Flavobacterium johnsoniae, the endogenous hpaBC of E. coli, CbRAS derived from Coleus blumei, HPPR derived from Coleus scutellarioides, and Pc4CL1 derived from Petroselinum crispum are heterologously expressed in E. coli, realizing the synthesis of rosmarinic acid. TcTAL derived from Trichosporon cutaneum and tyrC, which removes feedback inhibition, are introduced, further increasing the synthesis flux of caffeic acid, and PmLAAD derived from Proteus myxofaciens is heterologously expressed, realizing redistribution of L-DOPA. The endogenous gene menl is knocked out, improving the content and stability of a rosmarinic acid precursor. The recombinant strain constructed in the present disclosure can produce rosmarinic acid by fermentation at a yield of up to 511.
Type: Application
Filed: November 21, 2023
Publication date: March 14, 2024
Inventors: Jingwen ZHOU, Jian Chen, Lian Wang, Weizhu Zeng, Shiqin Yu
-
Publication number: 20240086619
Abstract: Generally discussed herein are devices, systems, and methods for generating an embedding that is both local string dependent and global string dependent. The generated embedding can improve machine learning (ML) model performance. A method can include converting a string of words to a series of tokens, generating a local string-dependent embedding of each token of the series of tokens, generating a global string-dependent embedding of each token of the series of tokens, combining the local string-dependent embedding and the global string-dependent embedding to generate an n-gram induced embedding of each token of the series of tokens, obtaining a masked language model (MLM) previously trained to generate a masked word prediction, and executing the MLM based on the n-gram induced embedding of each token to generate the masked word prediction.
Type: Application
Filed: October 26, 2023
Publication date: March 14, 2024
Inventors: Pengcheng HE, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
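The combination step in this abstract can be illustrated with a toy sketch: a position-independent global lookup summed with a local embedding averaged over an n-gram window. This is not the patented implementation; the table names, window averaging, and additive combination are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, N = 100, 8, 3  # toy vocabulary size, embedding width, n-gram width

global_table = rng.normal(size=(VOCAB, DIM))  # global: one vector per token id
ngram_table = rng.normal(size=(VOCAB, DIM))   # local: contributes via the window

def local_embedding(tokens, i):
    # average the local-table vectors of the n-gram window centred on token i
    lo, hi = max(0, i - N // 2), min(len(tokens), i + N // 2 + 1)
    return ngram_table[tokens[lo:hi]].mean(axis=0)

def ngram_induced_embedding(tokens):
    # combine the global (string-independent) and local (window) embeddings
    tokens = np.asarray(tokens)
    return np.stack([global_table[t] + local_embedding(tokens, i)
                     for i, t in enumerate(tokens)])

emb = ngram_induced_embedding([5, 17, 42, 7])
print(emb.shape)  # (4, 8)
```

The resulting per-token vectors would then be fed to the masked language model in place of plain token embeddings.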
-
Publication number: 20240046037
Abstract: Systems and methods are provided for training a data model based on training data. The training includes pre-training and fine-tuning the data model based on a combination of an autoregressive (AR) model and a non-autoregressive (NAR) model. Training data may be received and encoded into streams of tokens. A pre-trainer during decoding generates a continuum of data structures of the AR and NAR combined model including a main stream and a series of predicting streams. Masked tokens in predicting streams reference or attend to one or more preceding tokens in the main stream or the preceding predicting streams. A fine-tuner selects streams to generate a trained model according to a target data model. The target data model is determined based on balancing an accuracy constraint and an efficiency constraint for predicting tokens. The decoder acts as a bridge between the AR and NAR models in generating a trained data model.
Type: Application
Filed: December 25, 2020
Publication date: February 8, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Jian JIAO, Yeyun GONG, Nan DUAN, Weizhu CHEN, Kewen TANG, Qiang LOU, Ruofei ZHANG, Yu YAN, Jiusheng CHEN
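The main-stream/predicting-stream attention pattern can be loosely sketched as a boolean mask in which each main-stream token attends causally and each predicting-stream token attends to the main-stream prefix. This is a simplification of the structure described in the abstract (cross-stream attention between predicting streams is omitted), and every name here is hypothetical.

```python
import numpy as np

def stream_attention_mask(n, n_pred=2):
    # Rows are queries: n main-stream tokens followed by n_pred predicting
    # streams of n tokens each. Columns are the n main-stream keys.
    mask = np.zeros((n * (1 + n_pred), n), dtype=bool)
    for i in range(n):
        mask[i, :i + 1] = True                    # main stream: causal
        for s in range(n_pred):
            # predicting stream s at step i sees the main-stream prefix
            mask[n * (1 + s) + i, :i + 1] = True
    return mask

m = stream_attention_mask(4)
print(m.shape)  # (12, 4)
```

Selecting only the main stream would recover AR-style decoding, while keeping several predicting streams approximates the NAR side, consistent with the accuracy/efficiency trade-off the abstract mentions.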
-
Publication number: 20240013055
Abstract: This document relates to training of machine learning models. One example method involves providing a machine learning model having one or more mapping layers. The one or more mapping layers can include at least a first mapping layer configured to map components of pretraining examples into first representations in a space. The example method also includes performing a pretraining stage on the one or more mapping layers using the pretraining examples. The pretraining stage can include adding noise to the first representations of the components of the pretraining examples to obtain noise-adjusted first representations. The pretraining stage can also include performing a self-supervised learning process to pretrain the one or more mapping layers using at least the first representations of the training data items and the noise-adjusted first representations of the training data items.
Type: Application
Filed: September 26, 2023
Publication date: January 11, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Xiaodong Liu, Hao Cheng, Yu Wang, Jianfeng Gao, Weizhu Chen, Pengcheng He, Hoifung Poon
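The noise-adjustment step can be illustrated with a toy sketch. The abstract does not specify the noise distribution or the self-supervised objective, so the Gaussian perturbation and mean-squared consistency term below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
embed = rng.normal(size=(50, 16))  # toy "first mapping layer": id -> representation

def noise_adjusted(ids, scale=0.1):
    # map components to first representations, then perturb with Gaussian noise
    reps = embed[np.asarray(ids)]
    noisy = reps + rng.normal(scale=scale, size=reps.shape)
    return reps, noisy

clean, noisy = noise_adjusted([3, 14, 9])
# a self-supervised objective could use both views, e.g. pull them together
consistency_loss = float(np.mean((clean - noisy) ** 2))
```

In an actual pretraining loop the mapping layer would be updated from a loss computed over both the clean and noise-adjusted representations.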
-
Patent number: 11836438
Abstract: Generally discussed herein are devices, systems, and methods for generating an embedding that is both local string dependent and global string dependent. The generated embedding can improve machine learning (ML) model performance. A method can include converting a string of words to a series of tokens, generating a local string-dependent embedding of each token of the series of tokens, generating a global string-dependent embedding of each token of the series of tokens, combining the local string-dependent embedding and the global string-dependent embedding to generate an n-gram induced embedding of each token of the series of tokens, obtaining a masked language model (MLM) previously trained to generate a masked word prediction, and executing the MLM based on the n-gram induced embedding of each token to generate the masked word prediction.
Type: Grant
Filed: April 13, 2021
Date of Patent: December 5, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
-
Patent number: 11827878
Abstract: The disclosure describes the construction of a recombinant Saccharomyces cerevisiae for synthesizing carminic acid and an application thereof, and belongs to the technical field of genetic engineering and bioengineering. The disclosure obtains the recombinant S. cerevisiae CA-B2, capable of synthesizing carminic acid, by heterologously expressing the cyclase ZhuI, the aromatase ZhuJ, octaketide synthase 1 (OKS), the C-glucosyltransferase UGT2, the monooxygenase aptC, and the 4′-phosphopantetheinyl transferase npgA in S. cerevisiae. The recombinant S. cerevisiae can synthesize carminic acid using self-synthesized acetyl-CoA and malonyl-CoA as precursors. On this basis, the OKS, cyclase, aromatase, C-glucosyltransferase, and monooxygenase genes relevant to carminic acid are integrated at a high-copy site, which remarkably improves the yield of carminic acid. The yield of carminic acid can be increased to 2664.6 μg/L by optimizing fermentation conditions, and the fermentation time is shortened significantly.
Type: Grant
Filed: August 9, 2022
Date of Patent: November 28, 2023
Assignee: JIANGNAN UNIVERSITY
Inventors: Jingwen Zhou, Qian Zhang, Song Gao, Jian Chen, Weizhu Zeng, Guocheng Du
-
Patent number: 11803758
Abstract: This document relates to training of machine learning models. One example method involves providing a machine learning model having one or more mapping layers. The one or more mapping layers can include at least a first mapping layer configured to map components of pretraining examples into first representations in a space. The example method also includes performing a pretraining stage on the one or more mapping layers using the pretraining examples. The pretraining stage can include adding noise to the first representations of the components of the pretraining examples to obtain noise-adjusted first representations. The pretraining stage can also include performing a self-supervised learning process to pretrain the one or more mapping layers using at least the first representations of the training data items and the noise-adjusted first representations of the training data items.
Type: Grant
Filed: May 22, 2020
Date of Patent: October 31, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Xiaodong Liu, Hao Cheng, Yu Wang, Jianfeng Gao, Weizhu Chen, Pengcheng He, Hoifung Poon
-
Patent number: 11720757
Abstract: Methods, systems, apparatuses, and computer program products are provided for extracting an entity value from a sentence. An embedding set that may include one or more sentence embeddings is generated for at least part of a first sentence that is tagged to associate a first named entity in the sentence with an entity type. A plurality of candidate embeddings is also generated for at least part of a second sentence. The one or more sentence embeddings in the embedding set may be compared with each of the plurality of candidate embeddings, and a match score may be assigned to each comparison to generate a match score set. A particular match score of the match score set may be identified that exceeds a similarity threshold, and an entity value of the entity type may be extracted from the second sentence associated with the identified match score.
Type: Grant
Filed: August 19, 2019
Date of Patent: August 8, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Vikas Bahirwani, Jade Huang, Matthew Brigham Hall, Yu Zhao, Pengcheng He, Weizhu Chen, Eslam K. Abdelreheem, Jiayuan Huang, Yuting Sun
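The match-scoring step described above can be sketched minimally as follows. The abstract does not name a comparison function, so cosine similarity is an assumption, as are the function and variable names.

```python
import numpy as np

def cosine(a, b):
    # cosine similarity between two embedding vectors
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def extract_candidates(embedding_set, candidate_embeddings, threshold=0.8):
    # keep each candidate whose best match score against the tagged-sentence
    # embeddings exceeds the similarity threshold
    hits = []
    for j, cand in enumerate(candidate_embeddings):
        score = max(cosine(ref, cand) for ref in embedding_set)
        if score > threshold:
            hits.append((j, score))
    return hits

refs = [np.array([1.0, 0.0])]                          # tagged first sentence
cands = [np.array([0.9, 0.1]), np.array([0.0, 1.0])]   # second-sentence spans
matches = extract_candidates(refs, cands)              # only the first passes
```

The index of each passing candidate would then identify the span of the second sentence from which the entity value is extracted.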
-
Patent number: 11704551
Abstract: Techniques for iterative query-based analysis of text are described. According to various implementations, a neural network architecture is implemented that receives a query for information about text content and iteratively analyzes the content using the query. During the analysis, the state of the query evolves until it reaches a termination state, at which point it is output as the answer to the initial query.
Type: Grant
Filed: June 30, 2017
Date of Patent: July 18, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Po-Sen Huang, Jianfeng Gao, Weizhu Chen, Yelong Shen
-
Publication number: 20230222295
Abstract: Systems and methods are provided for facilitating the building and use of natural language understanding models. The systems and methods identify a plurality of tokens and use them to generate one or more pre-trained natural language models using a transformer. The transformer disentangles the content embedding and positional embedding in the computation of its attention matrix. Systems and methods are also provided to facilitate self-training of the pre-trained natural language model by utilizing multi-step decoding to better reconstruct masked tokens and improve pre-training convergence.
Type: Application
Filed: December 9, 2022
Publication date: July 13, 2023
Inventors: Pengcheng HE, Xiaodong LIU, Jianfeng GAO, Weizhu CHEN
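The disentangled attention computation can be loosely sketched as a sum of content-to-content, content-to-position, and position-to-content terms. This is a simplification under stated assumptions: real implementations use relative-position indexing and per-head projections, and all matrix names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
L, D = 4, 8                                  # sequence length, hidden width
Hc = rng.normal(size=(L, D))                 # content embeddings
P = rng.normal(size=(L, D))                  # position embeddings (simplified)
Wq, Wk = rng.normal(size=(D, D)), rng.normal(size=(D, D))    # content projections
Wqp, Wkp = rng.normal(size=(D, D)), rng.normal(size=(D, D))  # position projections

Qc, Kc = Hc @ Wq, Hc @ Wk
Qp, Kp = P @ Wqp, P @ Wkp
# attention decomposed into content-content, content-position, position-content
A = (Qc @ Kc.T + Qc @ Kp.T + Kc @ Qp.T) / np.sqrt(3 * D)
weights = np.exp(A - A.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
```

Keeping separate projections for content and position is what lets the two embeddings stay disentangled rather than being summed into a single input vector.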
-
Publication number: 20230153532
Abstract: A method for training a language model comprises (a) receiving vectorized training data as input to a multitask pretraining problem; (b) generating modified vectorized training data based on the vectorized training data, according to an upstream data embedding; (c) emitting pretraining output based on the modified vectorized training data, according to a downstream data embedding equivalent to the upstream data embedding; and (d) adjusting the upstream data embedding and the downstream data embedding by computing, based on the pretraining output, a gradient of the upstream data embedding disentangled from a gradient of the downstream data embedding, thereby advancing the multitask pretraining problem toward a pretrained state.
Type: Application
Filed: May 18, 2022
Publication date: May 18, 2023
Applicant: Microsoft Technology Licensing, LLC
Inventors: Pengcheng HE, Jianfeng GAO, Weizhu CHEN
-
Publication number: 20230127135
Abstract: The disclosure describes the construction of a recombinant Saccharomyces cerevisiae for synthesizing carminic acid and an application thereof, and belongs to the technical field of genetic engineering and bioengineering. The disclosure obtains the recombinant S. cerevisiae CA-B2, capable of synthesizing carminic acid, by heterologously expressing the cyclase ZhuI, the aromatase ZhuJ, octaketide synthase 1 (OKS), the C-glucosyltransferase UGT2, the monooxygenase aptC, and the 4′-phosphopantetheinyl transferase npgA in S. cerevisiae. The recombinant S. cerevisiae can synthesize carminic acid using self-synthesized acetyl-CoA and malonyl-CoA as precursors. On this basis, the OKS, cyclase, aromatase, C-glucosyltransferase, and monooxygenase genes relevant to carminic acid are integrated at a high-copy site, which remarkably improves the yield of carminic acid. The yield of carminic acid can be increased to 2664.6 µg/L by optimizing fermentation conditions, and the fermentation time is shortened significantly.
Type: Application
Filed: August 9, 2022
Publication date: April 27, 2023
Inventors: Jingwen Zhou, Qian Zhang, Song Gao, Jian Chen, Weizhu Zeng, Guocheng Du
-
Publication number: 20230119613
Abstract: Examples described herein generate training data for machine learning (ML) for natural language (NL) processing (such as semantic parsing for translating NL). A formula tree is generated based on sampling both a formula grammar and NL templates. Using the formula tree, an ML training data instance pair is generated comprising a formula example and an NL example. A context example may also be used during instantiation of the formula tree. An ML model is trained with training data including the ML training data instance pair, and ML output is generated from NL input. The ML output includes, for example, a machine-interpretable formula, a database querying language command, or a general programming language instruction. Some examples support context-free grammar, probabilistic context-free grammar, and/or non-context-free production rules.
Type: Application
Filed: October 19, 2021
Publication date: April 20, 2023
Inventors: Zeqi LIN, Yu HU, Haiyuan CAO, Yi LIU, Jian-Guang LOU, Kuralmani ELANGO, PalaniRaj KALIYAPERUMAL, Weizhu CHEN, Kunal MUKERJEE
-
Publication number: 20230116498
Abstract: A concentration method and concentration equipment are provided. The concentration method includes performing reverse osmosis concentration processing on raw milk using a reverse osmosis membrane. The reverse osmosis concentration processing includes low-pressure reverse osmosis membrane concentration processing, in which the feeding materials are processed at a first predetermined pressure, and high-pressure reverse osmosis membrane concentration processing, in which the feeding materials are processed at a second predetermined pressure, the first predetermined pressure being lower than the second predetermined pressure. The concentration equipment includes a temporary storage unit, a homogenizing unit, and a particular concentration unit.
Type: Application
Filed: March 24, 2020
Publication date: April 13, 2023
Applicant: INNER MONGOLIA MENGNIU DAIRY (GROUP) CO., LTD.
Inventors: Weizhu YU, Jie ZHANG, Mengyuan FAN, Shengbo YU, Yonghong ZHANG, Xinghai LIU, Heqian DONG, Xianfeng REN, Hui WANG, Ru BAI, Hongli SHI, Wenting LIU, Xu WANG
-
Patent number: 11526679
Abstract: Systems and methods are provided for facilitating the building and use of natural language understanding models. The systems and methods identify a plurality of tokens and use them to generate one or more pre-trained natural language models using a transformer. The transformer disentangles the content embedding and positional embedding in the computation of its attention matrix. Systems and methods are also provided to facilitate self-training of the pre-trained natural language model by utilizing multi-step decoding to better reconstruct masked tokens and improve pre-training convergence.
Type: Grant
Filed: June 24, 2020
Date of Patent: December 13, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
-
Publication number: 20220392434
Abstract: The disclosure herein describes reducing training bias in outputs generated by a generative language model. A communication segment associated with a communication is obtained by at least one processor of a generative language model. An output value associated with the communication segment is generated by the generative language model. The output value is mapped to a set of training bias values associated with the generative language model, and, based on the mapping of the output value to a training bias value of the set, an alternative output value is generated. The alternative output value is used in a generated segment output for the communication segment. The accuracy of segment outputs generated by the generative language model is improved by reducing or eliminating its training biases.
Type: Application
Filed: June 8, 2021
Publication date: December 8, 2022
Inventors: Abedelkader ASI, Yarin KUPER, Royi RONEN, Song WANG, Olga GOLDENBERG, Shimrit Rada BEMIS, Erez ALTUS, Yi MAO, Weizhu CHEN
-
Publication number: 20220383126
Abstract: A computer-implemented method obtains the base model weight matrices of a neural network-based model for each of multiple neural network layers. First low-rank factorization matrices are added to corresponding base model weight matrices to form a first domain model. The low-rank factorization matrices are treated as trainable parameters. The first domain model is trained with first-domain-specific training data without modifying the base model weight matrices.
Type: Application
Filed: May 19, 2021
Publication date: December 1, 2022
Inventors: Weizhu Chen, Jingfeng HU, Yelong SHEN, Shean WANG, Yabin LIU
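The trainable low-rank update added to a frozen base weight matrix can be sketched minimally as below. The dimensions, rank, and initialization are illustrative assumptions, not details from the filing; the zero initialization of one factor simply makes the update start as a no-op.

```python
import numpy as np

rng = np.random.default_rng(3)
d, r = 16, 2                         # layer width, low rank (r << d)
W0 = rng.normal(size=(d, d))         # frozen base model weight matrix
A = rng.normal(size=(d, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, d))                 # trainable; zero-init so the update starts at zero

def forward(x):
    # base output plus the low-rank update; only A and B would receive
    # gradients during domain-specific training, W0 stays untouched
    return x @ W0 + x @ A @ B

x = rng.normal(size=(1, d))
out = forward(x)                     # equals x @ W0 at initialization
```

Because only the 2·d·r factor entries are trained per layer, a separate small set of factors can be kept per domain while the full base weights are shared, which matches the abstract's point that the base matrices are never modified.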
-
Publication number: 20220167654
Abstract: The present disclosure provides an automatic production line for manufacturing and processing plant protein meat, and belongs to the field of application of food equipment. The automatic production line for manufacturing and processing the plant protein meat comprises: an extrusion-expansion machine, plant protein meat raw material blocks, a bearing plate, a first manipulator, a storage rack, a first conveying device, a second manipulator, a second conveying device, a main console, a third conveying device, a shredding device, a seasoning adding and mixing device, a third manipulator, a recycling rack, a storage cabinet and a secondary console.
Type: Application
Filed: February 18, 2022
Publication date: June 2, 2022
Inventors: Jingwen ZHOU, Meng NING, Jian CHEN, Weizhu ZENG, Xiaolin LIANG, Jie CHEN, Zhaojun WANG