Patents by Inventor Wojciech Kryscinski
Wojciech Kryscinski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11934781
Abstract: Embodiments described herein provide a flexible, controllable summarization system that allows users to control the generation of summaries without manually editing or writing the summary, e.g., without the user having to add or delete information at various levels of granularity. Specifically, the summarization system performs controllable summarization through keyword manipulation. A neural network model is trained to generate summaries conditioned on both the keywords and the source document, so that at test time a user can interact with the model through a keyword interface, potentially enabling multi-factor control.
Type: Grant
Filed: December 17, 2020
Date of Patent: March 19, 2024
Assignee: Salesforce, Inc.
Inventors: Junxian He, Wojciech Kryscinski, Bryan McCann
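The keyword-conditioning idea above can be sketched as a simple input-construction step: keywords and source document are serialized into one string the summarizer conditions on. The prefix format and separator token here are illustrative assumptions, not the patent's actual encoding:

```python
def build_keyword_input(keywords, document, sep="=>"):
    """Concatenate user-supplied keywords and the source document into a
    single conditioning string for a seq2seq summarizer.

    The keyword prefix and the separator token are hypothetical; the
    patent does not fix a particular serialization.
    """
    return " ".join(keywords) + f" {sep} " + document

# A user steering the summary toward chosen topics at test time:
prompt = build_keyword_input(["earnings", "forecast"], "The company reported...")
```

At inference, changing only the keyword list changes the conditioning prompt, which is what makes the interactive, multi-factor control possible without retraining.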
-
Patent number: 11790184
Abstract: Embodiments described herein provide natural language processing (NLP) systems and methods that provide a customized summarization of scientific or technical articles, which disentangles background information from new contributions, and summarizes the background information or the new information (or both) based on a user's preference. Specifically, the systems and methods utilize machine learning classifiers to classify portions of sentences within the article as containing background information or as containing a new contribution attributable to the article. The systems and methods then incorporate the background information or the new contribution in the summary and output the summary. In this way, the systems and methods can provide summaries of scientific literature, greatly accelerating literature review in scientific fields.
Type: Grant
Filed: January 28, 2021
Date of Patent: October 17, 2023
Assignee: SALESFORCE.COM, INC.
Inventors: Hiroaki Hayashi, Wojciech Kryscinski
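The background-vs-contribution classification step can be illustrated with a toy cue-phrase rule. The patent describes trained machine-learning classifiers; the cue list below is a hypothetical stand-in for that learned decision:

```python
def classify_span(span, cue_phrases=("we propose", "we present",
                                     "our method", "this paper introduces")):
    """Label a sentence span as a new CONTRIBUTION if it contains a
    first-person cue phrase, else as BACKGROUND.

    A crude, illustrative substitute for the patent's trained
    classifiers; the cue phrases are assumptions.
    """
    text = span.lower()
    return "CONTRIBUTION" if any(c in text for c in cue_phrases) else "BACKGROUND"
```

A downstream summarizer would then keep only the spans whose label matches the user's preference before generating the summary.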
-
Patent number: 11755637
Abstract: The decoder network includes multiple decoders trained to generate different types of summaries. The lower layers of the multiple decoders are shared. The upper layers of the multiple decoders do not overlap. The multiple decoders generate probability distributions. A gating mechanism combines the probability distributions of the multiple decoders into a probability distribution of the decoder network. Words in the summary are selected based on the probability distribution of the decoder network.
Type: Grant
Filed: January 10, 2022
Date of Patent: September 12, 2023
Assignee: Salesforce, Inc.
Inventors: Tanya Goyal, Wojciech Kryscinski, Nazneen Rajani
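The gating step can be sketched as a convex combination of the per-decoder word distributions. In the patented network the gate weights would come from a learned gating mechanism; here they are passed in directly, and the distributions are plain dicts over a shared vocabulary:

```python
def gated_mixture(distributions, gates):
    """Combine per-decoder word distributions into a single distribution
    for the decoder network, weighted by gate values that sum to 1.

    A minimal sketch: real decoders emit tensors over a vocabulary, and
    the gates are produced by a learned module, not supplied by hand.
    """
    assert abs(sum(gates) - 1.0) < 1e-9, "gates must sum to 1"
    vocab = distributions[0].keys()
    return {w: sum(g * d[w] for g, d in zip(gates, distributions))
            for w in vocab}
```

Because the gates form a convex combination, the mixture is itself a valid probability distribution, from which the next summary word can be sampled or argmax-selected.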
-
Patent number: 11741142
Abstract: Embodiments described herein provide document summarization systems and methods that utilize fine-tuning of pre-trained abstractive summarization models to produce summaries that more faithfully track the content of the documents. Such abstractive summarization models may be pre-trained using a corpus consisting of pairs of articles and associated summaries. For each article-summary pair, a pseudo label or control code is generated and represents a faithfulness of the summary with respect to the article. The pre-trained model is then fine-tuned based on the article-summary pairs and the corresponding control codes. The resulting fine-tuned models then provide improved faithfulness in document summarization tasks.
Type: Grant
Filed: January 31, 2022
Date of Patent: August 29, 2023
Assignee: salesforce.com, inc.
Inventors: Haopeng Zheng, Semih Yavuz, Wojciech Kryscinski, Kazuma Hashimoto, Yingbo Zhou
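The pseudo-labeling step can be illustrated with a toy faithfulness signal. Token overlap below is a crude stand-in for whatever faithfulness measure the patent's pipeline actually computes, and the control-code tokens are likewise illustrative:

```python
def control_code(article_tokens, summary_tokens, threshold=0.8):
    """Assign a pseudo-label control code to an article-summary pair.

    The fraction of summary tokens that also occur in the article is
    used as a stand-in faithfulness score; pairs above the threshold
    get a <faithful> code, others <hallucinated>. Both the metric and
    the token names are assumptions for illustration.
    """
    vocab = set(article_tokens)
    supported = sum(1 for t in summary_tokens if t in vocab)
    ratio = supported / max(len(summary_tokens), 1)
    return "<faithful>" if ratio >= threshold else "<hallucinated>"
```

During fine-tuning the code would be prepended to each training input, so that at inference time conditioning on `<faithful>` steers generation toward content-supported summaries.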
-
Patent number: 11699026
Abstract: Embodiments described herein provide methods and systems for summarizing multiple documents. A system receives a plurality of documents and generates embeddings of the sentences from the plurality of documents. The embedded sentences are clustered in a representation space. Sentences from a reference summary are embedded and aligned with the closest cluster. Sentences from each cluster are summarized with the aligned reference sentences as a target. A loss is computed based on the summarized sentences and the aligned references, and the natural language processing model is updated based on the loss. Sentences may be masked from being used in the summarization by identifying sentences that are contradicted by other sentences within the plurality of documents.
Type: Grant
Filed: January 31, 2022
Date of Patent: July 11, 2023
Assignee: Salesforce, Inc.
Inventors: Jered McInerney, Wojciech Kryscinski, Nazneen Rajani
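The alignment step above, matching each reference-summary sentence to its closest cluster, reduces to a nearest-centroid lookup in the embedding space. A minimal sketch, assuming embeddings are plain numeric tuples and clusters are represented by their centroids:

```python
def align_to_cluster(embedding, centroids):
    """Return the index of the centroid closest (by squared Euclidean
    distance) to a reference-sentence embedding, i.e. the cluster that
    sentence is aligned with as a summarization target."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)),
               key=lambda i: sq_dist(embedding, centroids[i]))
```

Each cluster's sentences are then summarized with its aligned reference sentences as the training target, and the loss between the two drives the model update.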
-
Publication number: 20230070497
Abstract: Embodiments described herein provide methods and systems for summarizing multiple documents. A system receives a plurality of documents and generates embeddings of the sentences from the plurality of documents. The embedded sentences are clustered in a representation space. Sentences from a reference summary are embedded and aligned with the closest cluster. Sentences from each cluster are summarized with the aligned reference sentences as a target. A loss is computed based on the summarized sentences and the aligned references, and the natural language processing model is updated based on the loss. Sentences may be masked from being used in the summarization by identifying sentences that are contradicted by other sentences within the plurality of documents.
Type: Application
Filed: January 31, 2022
Publication date: March 9, 2023
Inventors: Jered McInerney, Wojciech Kryscinski, Nazneen Rajani
-
Publication number: 20230065155
Abstract: The decoder network includes multiple decoders trained to generate different types of summaries. The lower layers of the multiple decoders are shared. The upper layers of the multiple decoders do not overlap. The multiple decoders generate probability distributions. A gating mechanism combines the probability distributions of the multiple decoders into a probability distribution of the decoder network. Words in the summary are selected based on the probability distribution of the decoder network.
Type: Application
Filed: January 10, 2022
Publication date: March 2, 2023
Inventors: Tanya Goyal, Wojciech Kryscinski, Nazneen Rajani
-
Publication number: 20230054068
Abstract: Embodiments described herein provide document summarization systems and methods that utilize fine-tuning of pre-trained abstractive summarization models to produce summaries that more faithfully track the content of the documents. Such abstractive summarization models may be pre-trained using a corpus consisting of pairs of articles and associated summaries. For each article-summary pair, a pseudo label or control code is generated and represents a faithfulness of the summary with respect to the article. The pre-trained model is then fine-tuned based on the article-summary pairs and the corresponding control codes. The resulting fine-tuned models then provide improved faithfulness in document summarization tasks.
Type: Application
Filed: January 31, 2022
Publication date: February 23, 2023
Inventors: Haopeng Zheng, Semih Yavuz, Wojciech Kryscinski, Kazuma Hashimoto, Yingbo Zhou
-
Publication number: 20220277135
Abstract: Embodiments described herein provide a query-focused summarization model that employs a single or dual encoder model. A two-step approach may be adopted that first extracts parts of the source document and then synthesizes the extracted segments into a final summary. In another embodiment, an end-to-end approach may be adopted that splits the source document into overlapping segments, and then concatenates encodings into a single embedding sequence for the decoder to output a summary.
Type: Application
Filed: May 20, 2022
Publication date: September 1, 2022
Inventors: Wojciech Kryscinski, Alexander R. Fabbri, Jesse Vig
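The segmentation step of the end-to-end approach, splitting the source document into overlapping windows that can each be encoded separately, can be sketched as follows; the window and stride sizes are illustrative defaults, not values from the publication:

```python
def overlapping_segments(tokens, size=512, stride=256):
    """Split a long token sequence into fixed-size overlapping windows.

    Each window is short enough to encode on its own; the overlap
    (size - stride tokens) preserves context across window boundaries.
    """
    segments = []
    for start in range(0, len(tokens), stride):
        segments.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return segments
```

The per-segment encodings would then be concatenated into one embedding sequence for the decoder, as the abstract describes.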
-
Publication number: 20220067302
Abstract: Embodiments described herein provide natural language processing (NLP) systems and methods that provide a customized summarization of scientific or technical articles, which disentangles background information from new contributions, and summarizes the background information or the new information (or both) based on a user's preference. Specifically, the systems and methods utilize machine learning classifiers to classify portions of sentences within the article as containing background information or as containing a new contribution attributable to the article. The systems and methods then incorporate the background information or the new contribution in the summary and output the summary. In this way, the systems and methods can provide summaries of scientific literature, greatly accelerating literature review in scientific fields.
Type: Application
Filed: January 28, 2021
Publication date: March 3, 2022
Inventors: Hiroaki Hayashi, Wojciech Kryscinski
-
Publication number: 20220067284
Abstract: Embodiments described herein provide a flexible, controllable summarization system that allows users to control the generation of summaries without manually editing or writing the summary, e.g., without the user having to add or delete information at various levels of granularity. Specifically, the summarization system performs controllable summarization through keyword manipulation. A neural network model is trained to generate summaries conditioned on both the keywords and the source document, so that at test time a user can interact with the model through a keyword interface, potentially enabling multi-factor control.
Type: Application
Filed: December 17, 2020
Publication date: March 3, 2022
Inventors: Junxian He, Wojciech Kryscinski, Bryan McCann
-
Patent number: 11087092
Abstract: Approaches for determining a response for an agent in an undirected dialogue are provided. The approaches include a dialogue generating framework comprising an encoder neural network, a decoder neural network, and a language model neural network. The dialogue generating framework generates a sketch sentence response with at least one slot. The sketch sentence response is generated word by word and takes into account the undirected dialogue and the traits of the agent making the response. The dialogue generating framework generates sentence responses by filling the slot with words from the agent traits. The dialogue generating framework ranks the sentence responses according to perplexity by passing the sentence responses through a language model and selects a final response, which is the sentence response that has the lowest perplexity.
Type: Grant
Filed: April 30, 2019
Date of Patent: August 10, 2021
Assignee: salesforce.com, inc.
Inventors: Stephan Zheng, Wojciech Kryscinski, Michael Shum, Richard Socher, Caiming Xiong
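The last two steps, filling sketch slots from agent traits and selecting the lowest-perplexity candidate, can be sketched as below. The `<slot>` token format is an assumption, and the perplexity scorer is passed in as a callable rather than a real language model:

```python
def fill_slots(sketch, trait_words):
    """Replace each <slot> placeholder in a sketch response with the
    next agent-trait word (the placeholder syntax is illustrative)."""
    traits = iter(trait_words)
    return " ".join(next(traits) if tok == "<slot>" else tok
                    for tok in sketch.split())

def select_response(candidates, perplexity):
    """Return the candidate response scored lowest by a supplied
    language-model perplexity callable."""
    return min(candidates, key=perplexity)
```

In the patented framework the scoring callable would be the language model neural network; here a stand-in scorer suffices to show the ranking logic.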
-
Publication number: 20210124876
Abstract: A weakly-supervised, model-based approach is provided for verifying or checking factual consistency and identifying conflicts between source documents and a generated summary. In some embodiments, an artificially generated training dataset is created by applying rule-based transformations to sentences sampled from one or more unannotated source documents of a dataset. Each of the resulting transformed sentences can be either semantically variant or invariant from the respective original sampled sentence, and labeled accordingly. In some embodiments, the generated training dataset is used to train a factual consistency checking model. The factual consistency checking model can classify whether a corresponding text summary is factually consistent with a source text document, and if so, may identify a span in the source text document that supports the corresponding text summary.
Type: Application
Filed: January 23, 2020
Publication date: April 29, 2021
Inventors: Wojciech Kryscinski, Bryan McCann
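One rule-based transformation of the kind described above can be sketched as a toy negation rule: it produces a semantically variant sentence, so the pair is labeled inconsistent for training. The rule and the label strings are illustrative; the publication describes a family of such transformations, not this one specifically:

```python
def negation_transform(sentence):
    """Toy rule-based transformation for building weak supervision:
    negate the first ' is ', yielding a semantically variant sentence
    labeled INCONSISTENT. If the rule does not apply, the sentence is
    returned unchanged and labeled CONSISTENT (invariant)."""
    if " is " in sentence:
        return sentence.replace(" is ", " is not ", 1), "INCONSISTENT"
    return sentence, "CONSISTENT"
```

Applying many such transformations to sampled sentences yields labeled (source, claim) pairs on which the consistency-checking classifier is trained.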
-
Patent number: 10909157
Abstract: A system is disclosed for providing an abstractive summary of a source textual document. The system includes an encoder, a decoder, and a fusion layer. The encoder is capable of generating an encoding for the source textual document. The decoder is separated into a contextual model and a language model. The contextual model is capable of extracting words from the source textual document using the encoding. The language model is capable of generating vectors paraphrasing the source textual document based on pre-training with a training dataset. The fusion layer is capable of generating the abstractive summary of the source textual document from the extracted words and the generated vectors for paraphrasing. In some embodiments, the system utilizes a novelty metric to encourage the generation of novel phrases for inclusion in the abstractive summary.
Type: Grant
Filed: July 31, 2018
Date of Patent: February 2, 2021
Assignee: salesforce.com, inc.
Inventors: Romain Paulus, Wojciech Kryscinski, Caiming Xiong
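The novelty metric mentioned in the last sentence can be illustrated as the fraction of summary n-grams absent from the source document. This is a simple reconstruction; the patent does not publish this exact formula:

```python
def novelty(summary_tokens, source_tokens, n=2):
    """Fraction of summary n-grams that do not appear in the source.

    A hypothetical instantiation of a novelty metric: higher values
    mean the summary contains more phrases not copied verbatim from
    the source, which a reward can encourage during training.
    """
    def ngrams(toks):
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
    summary_ngrams = ngrams(summary_tokens)
    if not summary_ngrams:
        return 0.0
    return len(summary_ngrams - ngrams(source_tokens)) / len(summary_ngrams)
```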
-
Publication number: 20200285705
Abstract: Approaches for determining a response for an agent in an undirected dialogue are provided. The approaches include a dialogue generating framework comprising an encoder neural network, a decoder neural network, and a language model neural network. The dialogue generating framework generates a sketch sentence response with at least one slot. The sketch sentence response is generated word by word and takes into account the undirected dialogue and the traits of the agent making the response. The dialogue generating framework generates sentence responses by filling the slot with words from the agent traits. The dialogue generating framework ranks the sentence responses according to perplexity by passing the sentence responses through a language model and selects a final response, which is the sentence response that has the lowest perplexity.
Type: Application
Filed: April 30, 2019
Publication date: September 10, 2020
Inventors: Stephan Zheng, Wojciech Kryscinski, Michael Shum, Richard Socher, Caiming Xiong
-
Publication number: 20190362020
Abstract: A system is disclosed for providing an abstractive summary of a source textual document. The system includes an encoder, a decoder, and a fusion layer. The encoder is capable of generating an encoding for the source textual document. The decoder is separated into a contextual model and a language model. The contextual model is capable of extracting words from the source textual document using the encoding. The language model is capable of generating vectors paraphrasing the source textual document based on pre-training with a training dataset. The fusion layer is capable of generating the abstractive summary of the source textual document from the extracted words and the generated vectors for paraphrasing. In some embodiments, the system utilizes a novelty metric to encourage the generation of novel phrases for inclusion in the abstractive summary.
Type: Application
Filed: July 31, 2018
Publication date: November 28, 2019
Inventors: Romain Paulus, Wojciech Kryscinski, Caiming Xiong