Patents by Inventor Furu Wei

Furu Wei has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Reading order detection in a document

Patent number: 12619828

Abstract: According to embodiments of the present disclosure, there is provided a solution for reading order detection in a document. In the solution, a computer-implemented method includes: determining a text sequence and layout information presented in a document, the text sequence comprising a plurality of text elements, the layout information indicating a spatial layout of the plurality of text elements in the document; generating a plurality of semantic feature representations corresponding to the plurality of text elements based at least on the text sequence and the layout information; and determining a reading order of the plurality of text elements in the document based on the plurality of semantic feature representations. According to the solution, the introduction of the layout information can better characterize a spatial layout manner of the text elements in a specific document, thereby determining the reading order more effectively and accurately.

Type: Grant

Filed: May 23, 2022

Date of Patent: May 5, 2026

Assignee: Microsoft Technology Licensing, LLC

Inventors: Lei Cui, Yiheng Xu, Yang Xu, Furu Wei, Zilong Wang

ULTRA-LOW PRECISION WEIGHT QUANTIZATION OF MACHINE LEARNING MODEL

Publication number: 20260023956

Abstract: A computer system is provided that includes processing circuitry. The computer system is configured to implement a machine learning (ML) model having a transformer architecture that, during a training operation or inference operation, is configured to receive an activation input matrix of activation input values and obtain a weight matrix of weight values. The ML model is further configured to perform ultra-low precision (ULP) quantization by quantizing each of the weight values in the weight matrix to a corresponding selected value from a predefined set of binary or ternary quantized weight values, and to compute a matrix arithmetic result based on at least a portion of the weight matrix with the quantized weight values and at least a portion of the activation input matrix.

Type: Application

Filed: July 16, 2024

Publication date: January 22, 2026

Applicant: Microsoft Technology Licensing, LLC

Inventors: Shuming MA, Li DONG, Shaohan HUANG, Wenhui WANG, Furu WEI, Jilong XUE, Lingxiao MA, Hongyu WANG
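
The ternary case described in the abstract above can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract only states that each weight is mapped to a value from a predefined binary or ternary set, so the selection rule used here (divide by the mean absolute weight, round, clip to -1, 0, or 1) and the names `quantize_ternary` and `matmul` are assumptions made for the example.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_ternary(weights: Matrix, eps: float = 1e-8) -> Tuple[List[List[int]], float]:
    """Map each weight to -1, 0, or 1 and return (quantized, scale).

    Assumption: the scale is the mean absolute weight. The abstract
    does not say how the quantized value is selected.
    """
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat) + eps
    quantized = [
        [max(-1, min(1, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, scale


def matmul(activations: Matrix, q_weights: List[List[int]], scale: float) -> Matrix:
    """Compute activations @ q_weights, then rescale the result by `scale`.

    With ternary weights the inner loop needs only additions and
    subtractions; zero weights are skipped.
    """
    cols = len(q_weights[0])
    out = []
    for a_row in activations:
        out_row = []
        for j in range(cols):
            acc = 0.0
            for k, a in enumerate(a_row):
                q = q_weights[k][j]
                if q == 1:
                    acc += a
                elif q == -1:
                    acc -= a
            out_row.append(acc * scale)
        out.append(out_row)
    return out
```

Calling `quantize_ternary` on a weight matrix and passing the result to `matmul` together with an activation matrix gives an approximation of the full-precision product.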

SEMANTIC REPRESENTATION OF TEXT IN DOCUMENT

Publication number: 20250259469

Abstract: According to implementations of the subject matter described herein, there is provided a solution for semantic representation of text in a document. In this solution, textual information comprising a sequence of text elements and layout information of the text elements are determined from a document. The layout information indicates a spatial arrangement of the plurality of text elements presented within the document. Based at least in part on the plurality of text elements and the layout information, respective semantic feature representations of the plurality of text elements are generated. By jointly using both the textual information and the layout information, rich semantics of the text elements in the document can be effectively captured in the feature representations.

Type: Application

Filed: April 28, 2025

Publication date: August 14, 2025

Inventors: Lei CUI, Shaohan HUANG, Li DONG, Furu WEI

Semantic representation of text in document

Patent number: 12374141

Abstract: There is provided a solution for semantic representation of text in a document. In this solution, textual information comprising a sequence of text elements (220) and layout information (230) of the text elements are determined from a document. The layout information (230) indicates a spatial arrangement of the plurality of text elements (220) presented within the document. Based at least in part on the plurality of text elements (220) and the layout information (230), respective semantic feature representations (180) of the plurality of text elements (220) are generated. By jointly using both the textual information and the layout information (230), rich semantics of the text elements (220) in the document can be effectively captured in the feature representations.

Type: Grant

Filed: June 12, 2020

Date of Patent: July 29, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Lei Cui, Shaohan Huang, Li Dong, Furu Wei

INFERENCING ON HOMOMORPHICALLY ENCRYPTED VECTORS AT TRANSFORMER

Publication number: 20250202679

Abstract: A server computing device is provided, including a processor configured to receive a homomorphically encrypted input embedding vector from a client computing device. At a transformer network, the processor may generate a plurality of homomorphically encrypted intermediate vectors at least in part by performing inferencing on the homomorphically encrypted input embedding vector. The processor may transmit the plurality of homomorphically encrypted intermediate output vectors to the client computing device. The processor may receive a plurality of homomorphically encrypted intermediate input vectors from the client computing device subsequently to transmitting the homomorphically encrypted intermediate output vectors to the client computing device. At the transformer network, the processor may generate a homomorphically encrypted output vector at least in part by performing additional inferencing on the homomorphically encrypted intermediate input vectors.

Type: Application

Filed: March 30, 2022

Publication date: June 19, 2025

Applicant: Microsoft Technology Licensing, LLC

Inventors: Shaohan HUANG, Li DONG, Shuming MA, Furu WEI

Unified speech representation learning

Patent number: 12217745

Abstract: A system obtains a first training data set comprising labeled speech data or both labeled and unlabeled data corresponding to a high-resource data set as well as latent speech representations based on the first training data set. The system trains a machine learning model on the first training data set to learn phonetically aware speech representations corresponding to the first training data set. The system applies the latent speech representations to a transformer context network to generate contextual representations. The system aligns each of the contextual representations with a phoneme label to generate phonetically-aware contextual representations. The system causes a refinement engine to further refine the machine learning model.

Type: Grant

Filed: July 3, 2023

Date of Patent: February 4, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Yao Qian, Yu Wu, Kenichi Kumatani, Shujie Liu, Furu Wei, Nanshan Zeng, Xuedong David Huang, Chengyi Wang

Multilingual content recommendation pipeline

Patent number: 12124812

Abstract: A data processing system implements obtaining first textual content in a first language from a first client device; determining that the first language is supported by a first machine learning model; obtaining a guard list of prohibited terms associated with the first language; determining that the textual content does not include one or more prohibited terms based on the guard list; providing the first textual content as an input to the first machine learning model responsive to the textual content not including the one or more prohibited terms; analyzing the first textual content with the first machine learning model to obtain a first content recommendation; obtaining a first content recommendation policy that identifies content associated with the first language that may not be provided as a content recommendation; determining that the first content recommendation is not prohibited; and providing the first content recommendation to the first client device.

Type: Grant

Filed: October 26, 2021

Date of Patent: October 22, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Ji Li, Amit Srivastava, Xingxing Zhang, Furu Wei
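
The sequence of checks in the abstract above can be sketched as a small function. This is an illustrative outline under stated assumptions, not the patented system: the function `recommend`, its dictionary arguments, and the word-level guard-list match are hypothetical, and the model is any callable that maps text to a recommendation identifier.

```python
import re
from typing import Callable, Dict, Optional, Set


def recommend(
    text: str,
    language: str,
    models: Dict[str, Callable[[str], str]],
    guard_lists: Dict[str, Set[str]],
    blocked_content: Dict[str, Set[str]],
) -> Optional[str]:
    """Return a content recommendation, or None if any check fails.

    Assumptions: guard-list terms are lowercase single words, and the
    recommendation policy is a set of blocked recommendation ids.
    """
    # Step 1: the language must be supported by some model.
    model = models.get(language)
    if model is None:
        return None

    # Step 2: the input must not contain a prohibited term.
    tokens = set(re.findall(r"\w+", text.lower()))
    if tokens & guard_lists.get(language, set()):
        return None

    # Step 3: run the model only after the input passed the guard list.
    recommendation = model(text)

    # Step 4: the output must not be prohibited by the policy.
    if recommendation in blocked_content.get(language, set()):
        return None
    return recommendation
```

Filtering both before and after the model call means a prohibited input never reaches the model, and a prohibited output never reaches the client.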

TRANSFORMER NETWORK WITH NORMALIZATION INCLUDING SCALING PARAMETER

Publication number: 20240320482

Abstract: A computing system is provided, including a processor configured to receive a training data set. Based at least in part on the training data set, the processor is further configured to train a transformer network that includes a plurality of layers. The plurality of layers each respectively include a plurality of sub-layers including an attention sub-layer, a feed-forward sub-layer, and a plurality of normalization sub-layers. The plurality of normalization sub-layers are downstream from corresponding sub-layers of the plurality of sub-layers. Each of the plurality of normalization sub-layers is configured to apply layer normalization to a sum of: a first scaling parameter multiplied by an input vector of the sub-layer; and an output vector of the sub-layer.

Type: Application

Filed: February 28, 2023

Publication date: September 26, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Shuming MA, Li DONG, Shaohan HUANG, Dongdong ZHANG, Furu WEI, Hongyu WANG
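
The normalization step in the abstract above, layer normalization applied to the sum of a scaled sub-layer input and the sub-layer output, can be written out directly. The sketch below is a minimal illustration under assumptions: the names `layer_norm` and `scaled_residual_norm` are hypothetical, the layer normalization has no learned gain or bias, and the value of the scaling parameter `alpha` is left to the caller because the abstract does not give one.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(v: Vector, eps: float = 1e-5) -> Vector:
    """Plain layer normalization without learned gain or bias."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def scaled_residual_norm(
    x: Vector,
    sublayer: Callable[[Vector], Vector],
    alpha: float,
) -> Vector:
    """Return layer_norm(alpha * x + sublayer(x)).

    `sublayer` stands in for an attention or feed-forward sub-layer;
    `alpha` is the scaling parameter applied to the sub-layer input.
    """
    out = sublayer(x)
    return layer_norm([alpha * xi + oi for xi, oi in zip(x, out)])
```

With `alpha` set to 1 this reduces to the usual post-normalization residual connection; larger values give the residual path more weight relative to the sub-layer output.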

READING ORDER DETECTION IN A DOCUMENT

Publication number: 20240265206

Abstract: According to embodiments of the present disclosure, there is provided a solution for reading order detection in a document. In the solution, a computer-implemented method includes: determining a text sequence and layout information presented in a document, the text sequence comprising a plurality of text elements, the layout information indicating a spatial layout of the plurality of text elements in the document; generating a plurality of semantic feature representations corresponding to the plurality of text elements based at least on the text sequence and the layout information; and determining a reading order of the plurality of text elements in the document based on the plurality of semantic feature representations. According to the solution, the introduction of the layout information can better characterize a spatial layout manner of the text elements in a specific document, thereby determining the reading order more effectively and accurately.

Type: Application

Filed: May 23, 2022

Publication date: August 8, 2024

Inventors: Lei CUI, Yiheng Xu, Yang Xu, Furu WEI, Zilong WANG

Generating document summary

Patent number: 12050636

Abstract: According to implementations of the subject matter described herein, there is provided a solution for generating a summary of a document. In this solution, feature information of pages comprised in a document is extracted, which characterizes at least one type of content contained in each page. Respective importance of the pages is determined at least based on the extracted feature information. A summary of the document is generated by selecting, based on the respective importance, a predetermined number of pages that is less than the total number of pages. Through the solution, instead of providing all the pages, pages containing important content may be determined automatically to serve as the summary of the document. This summary allows the user to quickly learn the main content of the document, shortens the time spent browsing all documents, and/or helps locate a document of interest as soon as possible.

Type: Grant

Filed: June 17, 2019

Date of Patent: July 30, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Xingxing Zhang, Shaohan Huang, Lei Cui, Tao Ge, Furu Wei, Ming Zhou
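
The selection step in the abstract above can be illustrated in a few lines. This sketch assumes the per-page importance scores have already been computed (the abstract derives them from extracted feature information, which is not modeled here); the function name `select_summary_pages` is hypothetical.

```python
from typing import List


def select_summary_pages(importance: List[float], k: int) -> List[int]:
    """Return the indices of the k most important pages, in page order.

    `k` must be positive and smaller than the number of pages, so the
    summary is always a strict subset of the document.
    """
    if not 0 < k < len(importance):
        raise ValueError("k must be between 1 and the page count minus 1")
    ranked = sorted(
        range(len(importance)),
        key=lambda i: importance[i],
        reverse=True,
    )
    # Restore document order so the summary reads front to back.
    return sorted(ranked[:k])
```

For a four-page document with scores `[0.1, 0.9, 0.3, 0.7]` and `k = 2`, the function picks the second and fourth pages.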

ABSTRACTIVE CONTENT TRANSFORMATION

Publication number: 20240249068

Abstract: A sequence-to-sequence summarizer receives source content to be summarized and determines whether the source content has a size that meets a size threshold. If so, the source content is divided into sections and the sequence-to-sequence summarizer generates a summary for each section. The summaries for each section are merged into a document summary and surfaced for user interaction.

Type: Application

Filed: June 24, 2021

Publication date: July 25, 2024

Inventors: Warren A. ALDRED, Si-Qing CHEN, Rama S. GANESAMOORTHY KASTHURI, Xun WANG, Weixin CAI, Xinyu HE, Xingxing ZHANG, Zhang LI, Kaushik R. NARAYANAN, Furu WEI, Cheng YANG
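
The divide, summarize, and merge flow in the abstract above can be sketched as follows. Everything specific here is an assumption made for illustration: the threshold is measured in characters, sections are formed by greedily packing paragraphs, content below the threshold is summarized in one pass, and `summarizer` is any callable standing in for the sequence-to-sequence model.

```python
from typing import Callable, List


def split_into_sections(text: str, max_chars: int) -> List[str]:
    """Greedily pack paragraphs into sections of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own
    oversized section rather than being cut mid-sentence.
    """
    sections: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            sections.append(current)
            current = para
        else:
            current = candidate
    if current:
        sections.append(current)
    return sections


def summarize(
    text: str,
    summarizer: Callable[[str], str],
    max_chars: int = 2000,
) -> str:
    """Summarize text, sectioning it first when it meets the threshold."""
    if len(text) < max_chars:
        # Assumption: small inputs are summarized in a single pass.
        return summarizer(text)
    parts = [summarizer(s) for s in split_into_sections(text, max_chars)]
    # Merge the per-section summaries into one document summary.
    return "\n".join(parts)
```

Any function from string to string can be passed as `summarizer`, which makes the sectioning logic easy to exercise without a trained model.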

Document conversion engine

Patent number: 11983482

Abstract: A system and method for converting a document is described. The system accesses a text document comprising one or more section breaks. The system detects sections of the text document demarked by the one or more section breaks and generates section title metadata and section summary metadata for each of the detected sections. The system inserts the section title metadata and the section summary metadata at the corresponding section breaks in the text document. The system modifies the text document into slides, each slide being formed for a section based on the corresponding section title metadata and section summary metadata. The system generates a presentation document based on the slides.

Type: Grant

Filed: May 20, 2021

Date of Patent: May 14, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Tomasz Lukasz Religa, Utsab Bose, Si-Qing Chen, Lei Cui, Tao Ge, Huitian Jiao, Ravi Mandliya, Kaushik Ramaiah Narayanan, Max Wang, Furu Wei

Live comments generating

Patent number: 11877016

Abstract: The present disclosure provides a technical solution for generating live comments, which may acquire candidate texts that are highly similar to segments of a video as live comments for the corresponding segments by matching the candidate texts with the segments, and may further generate new live comments based on the video segments and existing live comments to enrich the live comment information of the related video.

Type: Grant

Filed: March 16, 2020

Date of Patent: January 16, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Lei Cui, Furu Wei, Shaohan Huang, Ming Zhou

UNIFIED SPEECH REPRESENTATION LEARNING

Publication number: 20230368782

Abstract: Systems and methods are provided for training a machine learning model to learn speech representations. Labeled speech data or both labeled and unlabeled data sets are applied to a feature extractor of a machine learning model to generate latent speech representations. The latent speech representations are applied to a quantizer to generate quantized latent speech representations and to a transformer context network to generate contextual representations. Each contextual representation included in the contextual representations is aligned with a phoneme label to generate phonetically aware contextual representations. Quantized latent representations are aligned with phoneme labels to generate phonetically aware latent speech representations.

Type: Application

Filed: July 3, 2023

Publication date: November 16, 2023

Inventors: Yao QIAN, Yu WU, Kenichi KUMATANI, Shujie LIU, Furu WEI, Nanshan ZENG, Xuedong David HUANG, Chengyi WANG

DOCUMENT CONVERSION ENGINE

Publication number: 20230315969

Abstract: A system and method for converting a document is described. The system accesses a text document comprising one or more section breaks. The system detects sections of the text document demarked by the one or more section breaks and generates section title metadata and section summary metadata for each of the detected sections. The system inserts the section title metadata and the section summary metadata at the corresponding section breaks in the text document. The system modifies the text document into slides, each slide being formed for a section based on the corresponding section title metadata and section summary metadata. The system generates a presentation document based on the slides.

Type: Application

Filed: May 20, 2021

Publication date: October 5, 2023

Inventors: Tomasz L. Religa, Utsab Bose, Si-Qing CHEN, Lei CUI, Tao Ge, Huitian JIAO, Ravi Mandliya, Kaushik Ramaiah Narayanan, Max Wang, Furu WEI

Unified speech representation learning

Patent number: 11735171

Abstract: Systems and methods are provided for training a machine learning model to learn speech representations. Labeled speech data or both labeled and unlabeled data sets are applied to a feature extractor of a machine learning model to generate latent speech representations. The latent speech representations are applied to a quantizer to generate quantized latent speech representations and to a transformer context network to generate contextual representations. Each contextual representation included in the contextual representations is aligned with a phoneme label to generate phonetically aware contextual representations. Quantized latent representations are aligned with phoneme labels to generate phonetically aware latent speech representations.

Type: Grant

Filed: May 14, 2021

Date of Patent: August 22, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Yao Qian, Yu Wu, Kenichi Kumatani, Shujie Liu, Furu Wei, Nanshan Zeng, Xuedong David Huang, Chengyi Wang

Cross data set knowledge distillation for training machine learning models

Patent number: 11727270

Abstract: A method and system for training a text-to-content recommendation ML model includes training a first ML model using a first training data set, utilizing the trained first ML model to infer information about the data contained in the first training data set, collecting the inferred information to generate a second training data set, and utilizing the first training data set and the second training data set to train a second ML model. The second ML model may be a text-to-content recommendation ML model.

Type: Grant

Filed: February 24, 2020

Date of Patent: August 15, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Ji Li, Amit Srivastava, Xingxing Zhang, Furu Wei, Ming Zhou

Autonomous generation of melody

Patent number: 11705096

Abstract: Implementations of the subject matter described herein provide a solution that enables a machine to automatically generate a melody. In this solution, user emotion and/or environment information is used to select a first melody feature parameter from a plurality of melody feature parameters, wherein each of the plurality of melody feature parameters corresponds to a music style of one of a plurality of reference melodies. The first melody feature parameter is further used to generate a first melody that conforms to the music style and is different from the reference melody. Thus, a melody that matches user emotions and/or environmental information may be automatically created.

Type: Grant

Filed: May 24, 2019

Date of Patent: July 18, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Shaohan Huang, Lei Cui, Tao Ge, Furu Wei, Ming Zhou

SEMANTIC REPRESENTATION OF TEXT IN DOCUMENT

Publication number: 20230206670

Abstract: There is provided a solution for semantic representation of text in a document. In this solution, textual information comprising a sequence of text elements (220) and layout information (230) of the text elements are determined from a document. The layout information (230) indicates a spatial arrangement of the plurality of text elements (220) presented within the document. Based at least in part on the plurality of text elements (220) and the layout information (230), respective semantic feature representations (180) of the plurality of text elements (220) are generated. By jointly using both the textual information and the layout information (230), rich semantics of the text elements (220) in the document can be effectively captured in the feature representations.

Type: Application

Filed: June 12, 2020

Publication date: June 29, 2023

Inventors: Lei Cui, Shaohan HUANG, Li DONG, Furu WEI

Multilingual Content Recommendation Pipeline

Publication number: 20230129314

Abstract: A data processing system implements obtaining first textual content in a first language from a first client device; determining that the first language is supported by a first machine learning model; obtaining a guard list of prohibited terms associated with the first language; determining that the textual content does not include one or more prohibited terms based on the guard list; providing the first textual content as an input to the first machine learning model responsive to the textual content not including the one or more prohibited terms; analyzing the first textual content with the first machine learning model to obtain a first content recommendation; obtaining a first content recommendation policy that identifies content associated with the first language that may not be provided as a content recommendation; determining that the first content recommendation is not prohibited; and providing the first content recommendation to the first client device.

Type: Application

Filed: October 26, 2021

Publication date: April 27, 2023

Applicant: Microsoft Technology Licensing, LLC

Inventors: Ji LI, Amit SRIVASTAVA, XingXing ZHANG, Furu WEI
