Patents by Inventor Wojciech Kryscinski

Wojciech Kryscinski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11934781
    Abstract: Embodiments described herein provide a flexible, controllable summarization system that allows users to control the generation of summaries without manually editing or writing the summary, e.g., without the user having to add or delete information at various levels of granularity. Specifically, the summarization system performs controllable summarization through keyword manipulation. A neural network model is trained to generate summaries conditioned on both the keywords and the source document, so that at test time a user can interact with the model through a keyword interface, potentially enabling multi-factor control.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: March 19, 2024
    Assignee: Salesforce, Inc.
    Inventors: Junxian He, Wojciech Kryscinski, Bryan McCann
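
To make the keyword-conditioning idea in patent 11934781 concrete, here is a minimal sketch of how a generic text-to-text summarizer could be conditioned on user-supplied keywords; the prompt format, the separator, and the summarizer callable are illustrative assumptions, not the patented system.

    # Minimal sketch: one plausible way to condition a generic text-to-text
    # summarizer on user-supplied keywords. The prompt format and the
    # `summarizer` callable are assumptions, not the patented system.
    from typing import Callable, Sequence

    KEYWORD_PREFIX = "KEYWORDS:"   # assumed control-token format
    SEPARATOR = " | "              # assumed separator between keywords and source

    def build_keyword_conditioned_input(keywords: Sequence[str], document: str) -> str:
        """Prepend the control keywords to the source document as one input string."""
        return f"{KEYWORD_PREFIX} {', '.join(keywords)}{SEPARATOR}{document}"

    def controllable_summary(summarizer: Callable[[str], str],
                             document: str,
                             keywords: Sequence[str]) -> str:
        """Generate a summary conditioned on both the keywords and the source document."""
        return summarizer(build_keyword_conditioned_input(keywords, document))

    if __name__ == "__main__":
        echo_model = lambda text: text[:120]   # stand-in for a trained seq2seq model
        doc = "The quarterly report covers revenue growth, hiring, and a new product line."
        print(controllable_summary(echo_model, doc, ["revenue", "product line"]))
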
  • Patent number: 11790184
    Abstract: Embodiments described herein provide natural language processing (NLP) systems and methods that provide a customized summarization of scientific or technical articles, which disentangles background information from new contributions, and summarizes the background information or the new information (or both) based on a user's preference. Specifically, the systems and methods utilize machine learning classifiers to classify portions of sentences within the article as containing background information or as containing a new contribution attributable to the article. The systems and methods then incorporate the background information or the new contribution in the summary and output the summary. In this way, the systems and methods can provide summaries of the scientific literature, which greatly accelerates literature review in scientific fields.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: October 17, 2023
    Assignee: SALESFORCE.COM, INC.
    Inventors: Hiroaki Hayashi, Wojciech Kryscinski
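
The customized-summarization flow in patent 11790184 can be pictured with a short sketch: sentences are labeled as background or new contribution, and the summary keeps only the portions the user asks for. The cue-phrase heuristic below is an assumption standing in for the trained classifiers described in the abstract.

    # Hedged sketch of the background-vs-contribution flow; the cue-phrase
    # heuristic is a stand-in for a trained machine-learning classifier.
    import re

    CONTRIBUTION_CUES = ("we propose", "we introduce", "our method", "in this paper")

    def label_sentence(sentence: str) -> str:
        """Label a sentence as 'contribution' or 'background' via simple cue phrases."""
        lowered = sentence.lower()
        return "contribution" if any(cue in lowered for cue in CONTRIBUTION_CUES) else "background"

    def customized_summary(article: str, include: set) -> str:
        """Keep only sentences whose label the user asked to include."""
        sentences = re.split(r"(?<=[.!?])\s+", article.strip())
        return " ".join(s for s in sentences if label_sentence(s) in include)

    if __name__ == "__main__":
        text = ("Prior work studies extractive summarization. "
                "We propose a classifier that separates new contributions from background.")
        print(customized_summary(text, include={"contribution"}))
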
  • Patent number: 11755637
    Abstract: The decoder network includes multiple decoders trained to generate different types of summaries. The lower layers of the multiple decoders are shared. The upper layers of the multiple decoders do not overlap. The multiple decoders generate probability distributions. A gating mechanism combines the probability distributions of the multiple decoders into a probability distribution of the decoder network. Words in the summary are selected based on the probability distribution of the decoder network.
    Type: Grant
    Filed: January 10, 2022
    Date of Patent: September 12, 2023
    Assignee: Salesforce, Inc.
    Inventors: Tanya Goyal, Wojciech Kryscinski, Nazneen Rajani
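
The gating step in patent 11755637 can be sketched with toy tensors: per-decoder next-word distributions are mixed into a single distribution, and the next word is chosen from the mixture. The gate weights and vocabulary below are made up; in the patented network they would come from trained layers.

    # Toy illustration of the gating mechanism only; the shared/unshared decoder
    # layers and the gate weights would come from a trained network.
    import numpy as np

    def gate_distributions(decoder_probs: np.ndarray, gate_weights: np.ndarray) -> np.ndarray:
        """decoder_probs: (num_decoders, vocab); gate_weights: (num_decoders,) summing to 1."""
        return gate_weights @ decoder_probs

    if __name__ == "__main__":
        vocab = ["the", "summary", "is", "short"]
        probs = np.array([[0.1, 0.6, 0.2, 0.1],    # decoder tuned for one summary type
                          [0.4, 0.1, 0.1, 0.4]])   # decoder tuned for another type
        gate = np.array([0.7, 0.3])                 # assumed gate output for this step
        mixed = gate_distributions(probs, gate)
        print(vocab[int(np.argmax(mixed))])         # word selected from the combined distribution
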
  • Patent number: 11741142
    Abstract: Embodiments described herein provide document summarization systems and methods that utilize fine-tuning of pre-trained abstractive summarization models to produce summaries that more faithfully track the content of the documents. Such abstractive summarization models may be pre-trained using a corpus consisting of pairs of articles and associated summaries. For each article-summary pair, a pseudo label or control code is generated that represents the faithfulness of the summary with respect to the article. The pre-trained model is then fine-tuned based on the article-summary pairs and the corresponding control codes. The resulting fine-tuned models then provide improved faithfulness in document summarization tasks.
    Type: Grant
    Filed: January 31, 2022
    Date of Patent: August 29, 2023
    Assignee: salesforce.com, inc.
    Inventors: Haopeng Zheng, Semih Yavuz, Wojciech Kryscinski, Kazuma Hashimoto, Yingbo Zhou
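
One way to picture the pseudo-labeling step in patent 11741142 is the sketch below: each article-summary pair is tagged with a faithfulness control code before fine-tuning. The token-overlap heuristic and the code strings are illustrative assumptions, not the method disclosed in the patent.

    # Illustrative data-preparation sketch; the overlap heuristic and control-code
    # strings are assumptions standing in for the patent's pseudo labels.
    def faithfulness_pseudo_label(article: str, summary: str, threshold: float = 0.8) -> str:
        """Label a pair by the share of summary tokens that also appear in the article."""
        article_tokens = set(article.lower().split())
        summary_tokens = summary.lower().split()
        overlap = sum(tok in article_tokens for tok in summary_tokens) / max(len(summary_tokens), 1)
        return "<faithful>" if overlap >= threshold else "<hallucinated>"

    def build_finetuning_example(article: str, summary: str) -> tuple:
        """Prepend the control code to the source side of a fine-tuning pair."""
        code = faithfulness_pseudo_label(article, summary)
        return f"{code} {article}", summary

    if __name__ == "__main__":
        src, tgt = build_finetuning_example("The court ruled on Tuesday.",
                                            "The judge resigned on Friday.")
        print(src)
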
  • Patent number: 11699026
    Abstract: Embodiments described herein provide methods and systems for summarizing multiple documents. A system receives a plurality of documents and generates embeddings of the sentences from the plurality of documents. The embedded sentences are clustered in a representation space. Sentences from a reference summary are embedded and aligned with the closest cluster. Sentences from each cluster are summarized with the aligned reference sentences as a target. A loss is computed based on the summarized sentences and the aligned references, and the natural language processing model is updated based on the loss. Sentences may be masked from being used in the summarization by identifying sentences that are contradicted by other sentences within the plurality of documents.
    Type: Grant
    Filed: January 31, 2022
    Date of Patent: July 11, 2023
    Assignee: Salesforce, Inc.
    Inventors: Jered McInerney, Wojciech Kryscinski, Nazneen Rajani
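
The clustering and alignment step in patent 11699026 can be sketched with toy vectors: source-sentence embeddings are clustered, and each reference-summary sentence is assigned to the nearest centroid. The random embeddings and the use of scikit-learn's KMeans are illustrative assumptions; in practice the embeddings would come from a trained encoder.

    # Toy sketch of the cluster-and-align step only; random vectors stand in for
    # sentence embeddings produced by a trained encoder.
    import numpy as np
    from sklearn.cluster import KMeans

    def align_references(sentence_embs: np.ndarray, reference_embs: np.ndarray, k: int):
        """Cluster source-sentence embeddings, then map each reference sentence to a cluster."""
        kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(sentence_embs)
        assignments = kmeans.predict(reference_embs)   # nearest centroid per reference sentence
        return kmeans.labels_, assignments

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        source_embs = rng.normal(size=(12, 8))    # 12 sentences drawn from several documents
        reference_embs = rng.normal(size=(3, 8))  # 3 sentences from the reference summary
        labels, aligned = align_references(source_embs, reference_embs, k=3)
        print(labels, aligned)
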
  • Publication number: 20230070497
    Abstract: Embodiments described herein provide methods and systems for summarizing multiple documents. A system receives a plurality of documents and generates embeddings of the sentences from the plurality of documents. The embedded sentences are clustered in a representation space. Sentences from a reference summary are embedded and aligned with the closest cluster. Sentences from each cluster are summarized with the aligned reference sentences as a target. A loss is computed based on the summarized sentences and the aligned references, and the natural language processing model is updated based on the loss. Sentences may be masked from being used in the summarization by identifying sentences that are contradicted by other sentences within the plurality of documents.
    Type: Application
    Filed: January 31, 2022
    Publication date: March 9, 2023
    Inventors: Jered McInerney, Wojciech Kryscinski, Nazneen Rajani
  • Publication number: 20230065155
    Abstract: The decoder network includes multiple decoders trained to generate different types of summaries. The lower layers of the multiple decoders are shared. The upper layers of the multiple decoders do not overlap. The multiple decoders generate probability distributions. A gating mechanism combines the probability distributions of the multiple decoders into a probability distribution of the decoder network. Words in the summary are selected based on the probability distribution of the decoder network.
    Type: Application
    Filed: January 10, 2022
    Publication date: March 2, 2023
    Inventors: Tanya Goyal, Wojciech Kryscinski, Nazneen Rajani
  • Publication number: 20230054068
    Abstract: Embodiments described herein provide document summarization systems and methods that utilize fine-tuning of pre-trained abstractive summarization models to produce summaries that more faithfully track the content of the documents. Such abstractive summarization models may be pre-trained using a corpus consisting of pairs of articles and associated summaries. For each article-summary pair, a pseudo label or control code is generated that represents the faithfulness of the summary with respect to the article. The pre-trained model is then fine-tuned based on the article-summary pairs and the corresponding control codes. The resulting fine-tuned models then provide improved faithfulness in document summarization tasks.
    Type: Application
    Filed: January 31, 2022
    Publication date: February 23, 2023
    Inventors: Haopeng Zheng, Semih Yavuz, Wojciech Kryscinski, Kazuma Hashimoto, Yingbo Zhou
  • Publication number: 20220277135
    Abstract: Embodiments described herein provide a query-focused summarization model that employs either a single- or dual-encoder model. A two-step approach may be adopted that first extracts parts of the source document and then synthesizes the extracted segments into a final summary. In another embodiment, an end-to-end approach may be adopted that splits the source document into overlapping segments and then concatenates the segment encodings into a single embedding sequence for the decoder to output a summary.
    Type: Application
    Filed: May 20, 2022
    Publication date: September 1, 2022
    Inventors: Wojciech Kryscinski, Alexander R. Fabbri, Jesse Vig
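
A small sketch of one piece of publication 20220277135, the end-to-end variant: the source document is split into overlapping segments, and each segment is paired with the query before encoding. The window size, stride, and pairing format are illustrative choices, not the patented configuration.

    # Hedged sketch of the segment-splitting step; window/stride values are assumptions.
    from typing import List, Tuple

    def overlapping_segments(tokens: List[str], window: int = 6, stride: int = 4) -> List[List[str]]:
        """Slide a fixed-size window over the document so neighboring segments overlap."""
        if len(tokens) <= window:
            return [tokens]
        return [tokens[i:i + window] for i in range(0, len(tokens) - window + 1, stride)]

    def pair_with_query(query: str, document: str) -> List[Tuple[str, str]]:
        """Pair every overlapping segment with the query for later encoding."""
        return [(query, " ".join(seg)) for seg in overlapping_segments(document.split())]

    if __name__ == "__main__":
        pairs = pair_with_query(
            "What changed in pricing?",
            "The release notes cover pricing changes new tiers regional rollout and support policy updates",
        )
        for query, segment in pairs:
            print(query, "||", segment)
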
  • Publication number: 20220067302
    Abstract: Embodiments described herein provide natural language processing (NLP) systems and methods that provide a customized summarization of scientific or technical articles, which disentangles background information from new contributions, and summarizes the background information or the new information (or both) based on a user's preference. Specifically, the systems and methods utilize machine learning classifiers to classify portions of sentences within the article as containing background information or as containing a new contribution attributable to the article. The systems and methods then incorporate the background information or the new contribution in the summary and output the summary. In this way, the systems and methods can provide summaries of the scientific literature, which greatly accelerates literature review in scientific fields.
    Type: Application
    Filed: January 28, 2021
    Publication date: March 3, 2022
    Inventors: Hiroaki Hayashi, Wojciech Kryscinski
  • Publication number: 20220067284
    Abstract: Embodiments described herein provide a flexible, controllable summarization system that allows users to control the generation of summaries without manually editing or writing the summary, e.g., without the user having to add or delete information at various levels of granularity. Specifically, the summarization system performs controllable summarization through keyword manipulation. A neural network model is trained to generate summaries conditioned on both the keywords and the source document, so that at test time a user can interact with the model through a keyword interface, potentially enabling multi-factor control.
    Type: Application
    Filed: December 17, 2020
    Publication date: March 3, 2022
    Inventors: Junxian He, Wojciech Kryscinski, Bryan McCann
  • Patent number: 11087092
    Abstract: Approaches for determining a response for an agent in an undirected dialogue are provided. The approaches include a dialogue generating framework comprising an encoder neural network, a decoder neural network, and a language model neural network. The dialogue generating framework generates a sketch sentence response with at least one slot. The sketch sentence response is generated word by word and takes into account the undirected dialogue and agent traits of the agent making the response. The dialogue generating framework generates sentence responses by filling the slot with words from the agent traits. The dialogue generating framework ranks the sentence responses according to perplexity by passing the sentence responses through a language model and selects a final response which is a sentence response that has a lowest perplexity.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: August 10, 2021
    Assignee: salesforce.com, inc.
    Inventors: Stephan Zheng, Wojciech Kryscinski, Michael Shum, Richard Socher, Caiming Xiong
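
The slot-filling and ranking flow in patent 11087092 can be pictured with the short sketch below; the <slot> template, the trait list, and the toy scorer standing in for language-model perplexity are all illustrative assumptions.

    # Hedged sketch of sketch-sentence slot filling and lowest-perplexity selection;
    # pseudo_perplexity is a toy stand-in for a trained language model's score.
    def fill_slots(sketch: str, traits: list) -> list:
        """Produce one candidate response per agent trait by filling the <slot> placeholder."""
        return [sketch.replace("<slot>", trait) for trait in traits]

    def pseudo_perplexity(sentence: str) -> float:
        """Toy proxy for perplexity: shorter candidates score lower."""
        return float(len(sentence.split()))

    def best_response(sketch: str, traits: list) -> str:
        """Rank candidates by the (toy) perplexity and return the lowest-scoring one."""
        return min(fill_slots(sketch, traits), key=pseudo_perplexity)

    if __name__ == "__main__":
        print(best_response("I really enjoy <slot> on weekends.",
                            ["hiking", "playing the piano"]))
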
  • Publication number: 20210124876
    Abstract: A weakly-supervised, model-based approach is provided for verifying or checking factual consistency and identifying conflicts between source documents and a generated summary. In some embodiments, an artificially generated training dataset is created by applying rule-based transformations to sentences sampled from one or more unannotated source documents of a dataset. Each of the resulting transformed sentences can be either semantically variant or invariant from the respective original sampled sentence, and labeled accordingly. In some embodiments, the generated training dataset is used to train a factual consistency checking model. The factual consistency checking model can classify whether a corresponding text summary is factually consistent with a source text document, and if so, may identify a span in the source text document that supports the corresponding text summary.
    Type: Application
    Filed: January 23, 2020
    Publication date: April 29, 2021
    Inventors: Wojciech Kryscinski, Bryan McCann
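
The weakly supervised data generation in publication 20210124876 can be illustrated with a short sketch: rule-based transformations turn sampled source sentences into semantically invariant (consistent) or variant (inconsistent) training pairs. The two rules below are assumptions, not the patented transformation set.

    # Illustrative sketch of rule-based pair generation; both rules are assumptions.
    import random

    def invariant_transform(sentence: str) -> str:
        """A meaning-preserving edit, e.g. lowercasing the whole sentence."""
        return sentence.lower()

    def variant_transform(sentence: str, rng: random.Random) -> str:
        """A meaning-changing edit, e.g. perturbing a number in the sentence."""
        words = sentence.split()
        digit_positions = [i for i, w in enumerate(words) if w.isdigit()]
        if digit_positions:
            i = rng.choice(digit_positions)
            words[i] = str(int(words[i]) + rng.randint(1, 9))
        else:
            words.insert(0, "Reportedly, not")   # crude fallback when no number is present
        return " ".join(words)

    def make_training_pair(sentence: str, rng: random.Random) -> tuple:
        """Return (source sentence, transformed sentence, consistency label)."""
        if rng.random() < 0.5:
            return sentence, invariant_transform(sentence), "CONSISTENT"
        return sentence, variant_transform(sentence, rng), "INCONSISTENT"

    if __name__ == "__main__":
        rng = random.Random(0)
        print(make_training_pair("The company hired 120 engineers in 2019 .", rng))
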
  • Patent number: 10909157
    Abstract: A system is disclosed for providing an abstractive summary of a source textual document. The system includes an encoder, a decoder, and a fusion layer. The encoder is capable of generating an encoding for the source textual document. The decoder is separated into a contextual model and a language model. The contextual model is capable of extracting words from the source textual document using the encoding. The language model is capable of generating vectors paraphrasing the source textual document based on pre-training with a training dataset. The fusion layer is capable of generating the abstractive summary of the source textual document from the extracted words and the generated vectors for paraphrasing. In some embodiments, the system utilizes a novelty metric to encourage the generation of novel phrases for inclusion in the abstractive summary.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: February 2, 2021
    Assignee: salesforce.com, inc.
    Inventors: Romain Paulus, Wojciech Kryscinski, Caiming Xiong
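
One component named in patent 10909157, the novelty metric, can be sketched in a few lines: the fraction of summary n-grams that do not appear verbatim in the source. The exact metric used in the patent is not reproduced here; this ratio is an illustrative assumption.

    # Hedged sketch of a novelty metric; an illustrative ratio, not the patented one.
    def ngrams(tokens: list, n: int) -> set:
        """Collect the set of n-grams in a token list."""
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    def novelty(source: str, summary: str, n: int = 2) -> float:
        """Fraction of summary n-grams that are absent from the source document."""
        src = ngrams(source.lower().split(), n)
        summ = ngrams(summary.lower().split(), n)
        return len(summ - src) / len(summ) if summ else 0.0

    if __name__ == "__main__":
        doc = "the model extracts words from the source document using the encoding"
        print(novelty(doc, "the model rewrites the document in its own words"))
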
  • Publication number: 20200285705
    Abstract: Approaches for determining a response for an agent in an undirected dialogue are provided. The approaches include a dialogue generating framework comprising an encoder neural network, a decoder neural network, and a language model neural network. The dialogue generating framework generates a sketch sentence response with at least one slot. The sketch sentence response is generated word by word and takes into account the undirected dialogue and agent traits of the agent making the response. The dialogue generating framework generates sentence responses by filling the slot with words from the agent traits. The dialogue generating framework ranks the sentence responses according to perplexity by passing the sentence responses through a language model and selects a final response which is a sentence response that has a lowest perplexity.
    Type: Application
    Filed: April 30, 2019
    Publication date: September 10, 2020
    Inventors: Stephan Zheng, Wojciech Kryscinski, Michael Shum, Richard Socher, Caiming Xiong
  • Publication number: 20190362020
    Abstract: A system is disclosed for providing an abstractive summary of a source textual document. The system includes an encoder, a decoder, and a fusion layer. The encoder is capable of generating an encoding for the source textual document. The decoder is separated into a contextual model and a language model. The contextual model is capable of extracting words from the source textual document using the encoding. The language model is capable of generating vectors paraphrasing the source textual document based on pre-training with a training dataset. The fusion layer is capable of generating the abstractive summary of the source textual document from the extracted words and the generated vectors for paraphrasing. In some embodiments, the system utilizes a novelty metric to encourage the generation of novel phrases for inclusion in the abstractive summary.
    Type: Application
    Filed: July 31, 2018
    Publication date: November 28, 2019
    Inventors: Romain Paulus, Wojciech Kryscinski, Caiming Xiong