Patents by Inventor Chenyan XIONG
Chenyan XIONG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240338414
Abstract: This document relates to natural language processing using a framework such as a neural network. One example method involves obtaining a first document and a second document and propagating attention from the first document to the second document. The example method also involves producing contextualized semantic representations of individual words in the second document based at least on the propagating. The contextualized semantic representations can provide a basis for performing one or more natural language processing operations.
Type: Application
Filed: May 10, 2024
Publication date: October 10, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Chenyan Xiong, Chen Zhao, Corbin Louis Rosset, Paul Nathan Bennett, Xia Song, Saurabh Kumar Tiwary
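The attention-propagation idea in this abstract can be illustrated with a minimal NumPy sketch, assuming plain scaled dot-product cross-attention: each word of the second document attends over the first document's word vectors to produce a contextualized representation. The function name and toy embeddings are hypothetical, not taken from the patent.

```python
import numpy as np

def propagate_attention(doc_a, doc_b):
    """Contextualize doc_b word vectors by attending over doc_a word vectors.

    doc_a: (m, d) array of first-document word embeddings.
    doc_b: (n, d) array of second-document word embeddings.
    Returns an (n, d) array of contextualized representations for doc_b.
    """
    d = doc_b.shape[1]
    scores = doc_b @ doc_a.T / np.sqrt(d)          # (n, m) attention logits
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over doc_a words
    return weights @ doc_a                          # attention-weighted mixture
```

Each output row is a convex combination of the first document's word vectors, which is the sense in which attention "propagates" context from one document to the other.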
-
Patent number: 12099552
Abstract: A computer-implemented technique is described herein for assisting a user in advancing a task objective. The technique uses a suggestion-generating system (SGS) to provide one or more suggestions to a user in response to at least a last-submitted query provided by the user. The SGS may correspond to a classification-type or generative-type neural network. The SGS uses a machine-trained model that is trained using a multi-task training framework based on plural groups of training examples, which, in turn, are produced using different respective example-generating methods. One such example-generating method constructs a training example from queries in a search session. It operates by identifying the task-related intent of the queries, and then identifying at least one sequence of queries in the search session that exhibits a coherent task-related intent. A training example is constructed based on queries in such a sequence.
Type: Grant
Filed: November 7, 2023
Date of Patent: September 24, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Corby Louis Rosset, Chenyan Xiong, Paul Nathan Bennett, Saurabh Kumar Tiwary, Daniel Fernando Campos, Xia Song, Nicholas Eric Craswell
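The example-generating method described here — finding sequences of session queries that share a coherent task-related intent — can be sketched as a simple run-splitting pass. This is a toy illustration under the assumption that an `intent_of` classifier is available; the function and its labels are hypothetical, not the patented implementation.

```python
def coherent_sequences(session, intent_of, min_len=2):
    """Split a search session (a list of query strings) into maximal runs of
    consecutive queries sharing one task-related intent. Each sufficiently
    long run can serve as one training example."""
    runs, current = [], []
    for q in session:
        # Start a new run whenever the inferred intent changes.
        if current and intent_of(q) != intent_of(current[-1]):
            if len(current) >= min_len:
                runs.append(current)
            current = []
        current.append(q)
    if len(current) >= min_len:
        runs.append(current)
    return runs

# Usage with a stand-in intent classifier:
intent_of = lambda q: "shopping" if "shoes" in q else "other"
examples = coherent_sequences(
    ["buy shoes", "nike shoes", "weather today", "news"], intent_of)
```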
-
Patent number: 12013902
Abstract: This document relates to natural language processing using a framework such as a neural network. One example method involves obtaining a first document and a second document and propagating attention from the first document to the second document. The example method also involves producing contextualized semantic representations of individual words in the second document based at least on the propagating. The contextualized semantic representations can provide a basis for performing one or more natural language processing operations.
Type: Grant
Filed: July 18, 2022
Date of Patent: June 18, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Chenyan Xiong, Chen Zhao, Corbin Louis Rosset, Paul Nathan Bennett, Xia Song, Saurabh Kumar Tiwary
-
Publication number: 20240070202
Abstract: A computer-implemented technique is described herein for assisting a user in advancing a task objective. The technique uses a suggestion-generating system (SGS) to provide one or more suggestions to a user in response to at least a last-submitted query provided by the user. The SGS may correspond to a classification-type or generative-type neural network. The SGS uses a machine-trained model that is trained using a multi-task training framework based on plural groups of training examples, which, in turn, are produced using different respective example-generating methods. One such example-generating method constructs a training example from queries in a search session. It operates by identifying the task-related intent of the queries, and then identifying at least one sequence of queries in the search session that exhibits a coherent task-related intent. A training example is constructed based on queries in such a sequence.
Type: Application
Filed: November 7, 2023
Publication date: February 29, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Corby Louis ROSSET, Chenyan XIONG, Paul Nathan BENNETT, Saurabh Kumar TIWARY, Daniel Fernando CAMPOS, Xia SONG, Nicholas Eric CRASWELL
-
Publication number: 20240046026
Abstract: A method for text compression comprises recognizing a prefix string of one or more text characters preceding a target string of a plurality of text characters to be compressed. The prefix string is provided to a natural language generation (NLG) model configured to output one or more predicted continuations each having an associated rank. If the one or more predicted continuations include a matching predicted continuation relative to the next one or more text characters of the target string, the next one or more text characters are compressed as an NLG-type compressed representation. If no predicted continuations match the next one or more text characters of the target string, a longest matching entry in a compression dictionary is identified. The next one or more text characters of the target string are compressed as a dictionary-type compressed representation that includes the dictionary index value of the longest matching entry.
Type: Application
Filed: October 17, 2023
Publication date: February 8, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Ronny LEMPEL, Chenyan XIONG
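The two-stage compression loop this abstract describes — try a ranked language-model continuation first, fall back to the longest dictionary match — can be sketched as follows. The `predict` callable and toy dictionary are hypothetical stand-ins; a real NLG model and entropy coding of the emitted tokens are out of scope here.

```python
def compress(text, predict, dictionary, prefix_len=8):
    """Compress `text` into (kind, value, length) tokens.

    predict(prefix) returns ranked candidate continuations; a match is
    encoded by its rank alone ("nlg"), otherwise the longest matching
    dictionary entry is encoded by its index ("dict"), else a literal.
    """
    out, i = [], 0
    while i < len(text):
        prefix = text[max(0, i - prefix_len):i]
        matched = False
        # 1) NLG path: encode by the rank of the matching continuation.
        for rank, cont in enumerate(predict(prefix)):
            if cont and text.startswith(cont, i):
                out.append(("nlg", rank, len(cont)))
                i += len(cont)
                matched = True
                break
        if matched:
            continue
        # 2) Dictionary path: longest matching entry, encoded by index.
        candidates = [e for e in dictionary if e and text.startswith(e, i)]
        if candidates:
            best = max(candidates, key=len)
            out.append(("dict", dictionary.index(best), len(best)))
            i += len(best)
        else:
            out.append(("lit", text[i], 1))  # literal fallback
            i += 1
    return out

# Usage with a stand-in predictor and dictionary:
predict = lambda prefix: ["world"] if prefix.endswith("hello ") else []
dictionary = ["hello ", "hel"]
tokens = compress("hello world", predict, dictionary)
```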
-
Patent number: 11853362
Abstract: A computer-implemented technique is described herein for assisting a user in advancing a task objective. The technique uses a suggestion-generating system (SGS) to provide one or more suggestions to a user in response to at least a last-submitted query provided by the user. The SGS may correspond to a classification-type or generative-type neural network. The SGS uses a machine-trained model that is trained using a multi-task training framework based on plural groups of training examples, which, in turn, are produced using different respective example-generating methods. One such example-generating method constructs a training example from queries in a search session. It operates by identifying the task-related intent of the queries, and then identifying at least one sequence of queries in the search session that exhibits a coherent task-related intent. A training example is constructed based on queries in such a sequence.
Type: Grant
Filed: April 16, 2020
Date of Patent: December 26, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Corby Louis Rosset, Chenyan Xiong, Paul Nathan Bennett, Saurabh Kumar Tiwary, Daniel Fernando Campos, Xia Song, Nicholas Eric Craswell
-
Patent number: 11829374
Abstract: Document embedding vectors for each document of a corpus may be generated by combining embedding vectors for document subparts, thereby yielding a final embedding vector for the document. A machine learning model is trained using a query corpus and the document corpus, where the model generates a ranking score for a given (query, document) pair. During training, ranking scores are generated using the model, such that the training dataset is further refined using the generated ranking scores. For example, top documents and a negative document may be determined for a given query and subsequently used as training data. Multiple negative documents may therefore be determined for a given query. A negative document for a given query may be determined from the negative documents using noise-contrastive estimation. Such determined negative documents may be evaluated using a loss function during model training, thereby yielding a more robust model for search processing.
Type: Grant
Filed: March 19, 2021
Date of Patent: November 28, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Junaid Ahmed, Li Xiong, Arnold Overwijk, Chenyan Xiong
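One way to read the negative-selection step above is that non-relevant documents are sampled in proportion to the current model's score, so documents the model wrongly ranks high are picked most often as hard negatives. The sketch below is a hedged interpretation, not the patented procedure; the dot-product scorer and function name are stand-ins.

```python
import math
import random

def pick_hard_negative(query_vec, doc_vecs, positive_ids, rng=None):
    """Sample one hard-negative document id for a query: candidates that are
    not known positives are drawn with probability proportional to
    exp(model score), biasing selection toward high-scoring mistakes."""
    rng = rng or random.Random(0)
    score = lambda d: sum(q * x for q, x in zip(query_vec, d))  # dot product
    candidates = [i for i in range(len(doc_vecs)) if i not in positive_ids]
    weights = [math.exp(score(doc_vecs[i])) for i in candidates]
    r = rng.random() * sum(weights)
    for i, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return i
    return candidates[-1]
```

The selected negative would then be paired with the query and its positives inside a contrastive loss during training.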
-
Patent number: 11803693
Abstract: A method for text compression comprises recognizing a prefix string of one or more text characters preceding a target string of a plurality of text characters to be compressed. The prefix string is provided to a natural language generation (NLG) model configured to output one or more predicted continuations each having an associated rank. If the one or more predicted continuations include a matching predicted continuation relative to the next one or more text characters of the target string, the next one or more text characters are compressed as an NLG-type compressed representation. If no predicted continuations match the next one or more text characters of the target string, a longest matching entry in a compression dictionary is identified. The next one or more text characters of the target string are compressed as a dictionary-type compressed representation that includes the dictionary index value of the longest matching entry.
Type: Grant
Filed: June 18, 2021
Date of Patent: October 31, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ronny Lempel, Chenyan Xiong
-
Patent number: 11734559
Abstract: To provide automated categorization of structured textual content, individual nodes of textual content from a document object model encapsulation of the structured textual content have a multidimensional vector associated with them, where the values of the various dimensions of the multidimensional vector are based on the textual content in the corresponding node, the visual features applied or associated with the textual content of the corresponding node, and positional information of the textual content of the corresponding node. The multidimensional vectors are input to a neighbor-imbuing neural network. The enhanced multidimensional vectors output by the neighbor-imbuing neural network are then provided to a categorization neural network. The resulting output can be in the form of multidimensional vectors whose dimensionality is proportional to the number of categories into which the structured textual content is to be categorized. A weighted merge takes into account multiple nodes that are grouped together.
Type: Grant
Filed: June 19, 2020
Date of Patent: August 22, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Charumathi Lakshmanan, Ye Li, Arnold Overwijk, Chenyan Xiong, Jiguang Shen, Junaid Ahmed, Jiaming Guo
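The final weighted-merge step — combining per-node category scores for nodes that are grouped together — can be sketched as a weighted average, here assuming each node is weighted by something like its text length. The weighting choice is an illustrative assumption, not specified by the abstract.

```python
def weighted_merge(node_scores, weights):
    """Merge per-node category score vectors for a group of DOM nodes.

    node_scores: list of equal-length score vectors, one per node.
    weights: one non-negative weight per node (e.g., node text length).
    Returns a single score vector for the whole group.
    """
    total = sum(weights)
    dims = len(node_scores[0])
    return [sum(s[d] * w for s, w in zip(node_scores, weights)) / total
            for d in range(dims)]
```

With the merged vector in hand, the group's category would be the dimension with the highest merged score.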
-
Patent number: 11657223
Abstract: A system for extracting a key phrase from a document includes a neural key phrase extraction model (“BLING-KPE”) having a first layer to extract a word sequence from the document, a second layer to represent each word in the word sequence by ELMo embedding, position embedding, and visual features, and a third layer to concatenate the ELMo embedding, the position embedding, and the visual features to produce hybrid word embeddings. A convolutional transformer models the hybrid word embeddings to n-gram embeddings, and a feedforward layer converts the n-gram embeddings into a probability distribution over a set of n-grams and calculates a key phrase score of each n-gram. The neural key phrase extraction model is trained on annotated data based on a labeled loss function to compute cross entropy loss of the key phrase score of each n-gram as compared with a label from the annotated dataset.
Type: Grant
Filed: December 16, 2021
Date of Patent: May 23, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Li Xiong, Chuan Hu, Arnold Overwijk, Junaid Ahmed, Daniel Fernando Campos, Chenyan Xiong
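The last stage of this pipeline — scoring n-grams and normalizing the scores into a distribution — can be illustrated with a small sketch. The `score_fn` below stands in for the convolutional-transformer-plus-feedforward scorer, which is not reproduced here.

```python
import math

def extract_key_phrases(words, score_fn, max_n=3, top_k=2):
    """Enumerate n-grams of `words` (n = 1..max_n), score each with
    score_fn, softmax-normalize the scores into a distribution over the
    n-gram set, and return the top_k phrases."""
    ngrams = [" ".join(words[i:i + n])
              for n in range(1, max_n + 1)
              for i in range(len(words) - n + 1)]
    logits = [score_fn(g) for g in ngrams]
    z = sum(math.exp(l) for l in logits)           # softmax normalizer
    scored = [(g, math.exp(l) / z) for g, l in zip(ngrams, logits)]
    scored.sort(key=lambda p: p[1], reverse=True)  # stable sort keeps tie order
    return [g for g, _ in scored[:top_k]]
```

In the patented model the logits come from the learned network; here any callable mapping an n-gram string to a score will do.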
-
Publication number: 20220405461
Abstract: A method for text compression comprises recognizing a prefix string of one or more text characters preceding a target string of a plurality of text characters to be compressed. The prefix string is provided to a natural language generation (NLG) model configured to output one or more predicted continuations each having an associated rank. If the one or more predicted continuations include a matching predicted continuation relative to the next one or more text characters of the target string, the next one or more text characters are compressed as an NLG-type compressed representation. If no predicted continuations match the next one or more text characters of the target string, a longest matching entry in a compression dictionary is identified. The next one or more text characters of the target string are compressed as a dictionary-type compressed representation that includes the dictionary index value of the longest matching entry.
Type: Application
Filed: June 18, 2021
Publication date: December 22, 2022
Applicant: Microsoft Technology Licensing, LLC
Inventors: Ronny LEMPEL, Chenyan XIONG
-
Publication number: 20220374479
Abstract: This document relates to natural language processing using a framework such as a neural network. One example method involves obtaining a first document and a second document and propagating attention from the first document to the second document. The example method also involves producing contextualized semantic representations of individual words in the second document based at least on the propagating. The contextualized semantic representations can provide a basis for performing one or more natural language processing operations.
Type: Application
Filed: July 18, 2022
Publication date: November 24, 2022
Applicant: Microsoft Technology Licensing, LLC
Inventors: Chenyan Xiong, Chen Zhao, Corbin Louis Rosset, Paul Nathan Bennett, Xia Song, Saurabh Kumar Tiwary
-
Patent number: 11423093
Abstract: This document relates to natural language processing using a framework such as a neural network. One example method involves obtaining a first document and a second document and propagating attention from the first document to the second document. The example method also involves producing contextualized semantic representations of individual words in the second document based at least on the propagating. The contextualized semantic representations can provide a basis for performing one or more natural language processing operations.
Type: Grant
Filed: September 25, 2019
Date of Patent: August 23, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Chenyan Xiong, Chen Zhao, Corbin Louis Rosset, Paul Nathan Bennett, Xia Song, Saurabh Kumar Tiwary
-
Publication number: 20220179871
Abstract: Document embedding vectors for each document of a corpus may be generated by combining embedding vectors for document subparts, thereby yielding a final embedding vector for the document. A machine learning model is trained using a query corpus and the document corpus, where the model generates a ranking score for a given (query, document) pair. During training, ranking scores are generated using the model, such that the training dataset is further refined using the generated ranking scores. For example, top documents and a negative document may be determined for a given query and subsequently used as training data. Multiple negative documents may therefore be determined for a given query. A negative document for a given query may be determined from the negative documents using noise-contrastive estimation. Such determined negative documents may be evaluated using a loss function during model training, thereby yielding a more robust model for search processing.
Type: Application
Filed: March 19, 2021
Publication date: June 9, 2022
Applicant: Microsoft Technology Licensing, LLC
Inventors: Junaid AHMED, Li XIONG, Arnold OVERWIJK, Chenyan XIONG
-
Publication number: 20220108078
Abstract: A system for extracting a key phrase from a document includes a neural key phrase extraction model (“BLING-KPE”) having a first layer to extract a word sequence from the document, a second layer to represent each word in the word sequence by ELMo embedding, position embedding, and visual features, and a third layer to concatenate the ELMo embedding, the position embedding, and the visual features to produce hybrid word embeddings. A convolutional transformer models the hybrid word embeddings to n-gram embeddings, and a feedforward layer converts the n-gram embeddings into a probability distribution over a set of n-grams and calculates a key phrase score of each n-gram. The neural key phrase extraction model is trained on annotated data based on a labeled loss function to compute cross entropy loss of the key phrase score of each n-gram as compared with a label from the annotated dataset.
Type: Application
Filed: December 16, 2021
Publication date: April 7, 2022
Inventors: Li XIONG, Chuan HU, Arnold OVERWIJK, Junaid AHMED, Daniel Fernando CAMPOS, Chenyan XIONG
-
Patent number: 11250214
Abstract: A system for extracting a key phrase from a document includes a neural key phrase extraction model (“BLING-KPE”) having a first layer to extract a word sequence from the document, a second layer to represent each word in the word sequence by ELMo embedding, position embedding, and visual features, and a third layer to concatenate the ELMo embedding, the position embedding, and the visual features to produce hybrid word embeddings. A convolutional transformer models the hybrid word embeddings to n-gram embeddings, and a feedforward layer converts the n-gram embeddings into a probability distribution over a set of n-grams and calculates a key phrase score of each n-gram. The neural key phrase extraction model is trained on annotated data based on a labeled loss function to compute cross entropy loss of the key phrase score of each n-gram as compared with a label from the annotated dataset.
Type: Grant
Filed: July 2, 2019
Date of Patent: February 15, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Li Xiong, Chuan Hu, Arnold Overwijk, Junaid Ahmed, Daniel Fernando Campos, Chenyan Xiong
-
Publication number: 20210397944
Abstract: To provide automated categorization of structured textual content, individual nodes of textual content from a document object model encapsulation of the structured textual content have a multidimensional vector associated with them, where the values of the various dimensions of the multidimensional vector are based on the textual content in the corresponding node, the visual features applied or associated with the textual content of the corresponding node, and positional information of the textual content of the corresponding node. The multidimensional vectors are input to a neighbor-imbuing neural network. The enhanced multidimensional vectors output by the neighbor-imbuing neural network are then provided to a categorization neural network. The resulting output can be in the form of multidimensional vectors whose dimensionality is proportional to the number of categories into which the structured textual content is to be categorized. A weighted merge takes into account multiple nodes that are grouped together.
Type: Application
Filed: June 19, 2020
Publication date: December 23, 2021
Inventors: Charumathi Lakshmanan, Ye Li, Arnold Overwijk, Chenyan Xiong, Jiguang Shen, Junaid Ahmed, Jiaming Guo
-
Publication number: 20210326742
Abstract: A computer-implemented technique is described herein for assisting a user in advancing a task objective. The technique uses a suggestion-generating system (SGS) to provide one or more suggestions to a user in response to at least a last-submitted query provided by the user. The SGS may correspond to a classification-type or generative-type neural network. The SGS uses a machine-trained model that is trained using a multi-task training framework based on plural groups of training examples, which, in turn, are produced using different respective example-generating methods. One such example-generating method constructs a training example from queries in a search session. It operates by identifying the task-related intent of the queries, and then identifying at least one sequence of queries in the search session that exhibits a coherent task-related intent. A training example is constructed based on queries in such a sequence.
Type: Application
Filed: April 16, 2020
Publication date: October 21, 2021
Inventors: Corby Louis ROSSET, Chenyan XIONG, Paul Nathan BENNETT, Saurabh Kumar TIWARY, Daniel Fernando CAMPOS, Xia SONG, Nicholas Eric CRASWELL
-
Patent number: 11138285
Abstract: A computer-implemented technique receives an input expression that a user submits with an intent to accomplish some objective. The technique then uses a machine-trained intent encoder component to map the input expression into an input expression intent vector (IEIV). The IEIV corresponds to a distributed representation of the intent associated with the input expression, within an intent vector space. The technique then leverages the intent vector to facilitate some downstream application task, such as the retrieval of information. Some application tasks also use a neighbor search component to find expressions that express an intent similar to that of the input expression. A training system trains the intent encoder component based on the nexus between queries and user clicks, as recorded in a search engine's search log.
Type: Grant
Filed: March 7, 2019
Date of Patent: October 5, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Hongfei Zhang, Xia Song, Chenyan Xiong, Corbin Louis Rosset, Paul Nathan Bennett, Nicholas Eric Craswell, Saurabh Kumar Tiwary
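The neighbor-search step mentioned in this abstract amounts to finding catalog expressions whose intent vectors lie closest to the query's intent vector. A minimal cosine-similarity sketch follows; the catalog format and function name are illustrative assumptions (a production system would use an approximate nearest-neighbor index rather than a linear scan).

```python
import math

def nearest_intents(query_vec, catalog, k=2):
    """Return the k (expression, vector) pairs from `catalog` whose intent
    vectors have the highest cosine similarity to query_vec."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    return sorted(catalog, key=lambda item: cos(query_vec, item[1]),
                  reverse=True)[:k]
```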
-
Publication number: 20210089594
Abstract: This document relates to natural language processing using a framework such as a neural network. One example method involves obtaining a first document and a second document and propagating attention from the first document to the second document. The example method also involves producing contextualized semantic representations of individual words in the second document based at least on the propagating. The contextualized semantic representations can provide a basis for performing one or more natural language processing operations.
Type: Application
Filed: September 25, 2019
Publication date: March 25, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Chenyan Xiong, Chen Zhao, Corbin Louis Rosset, Paul Nathan Bennett, Xia Song, Saurabh Kumar Tiwary