Patents by Inventor Thorsten H. Brants

Thorsten H. Brants has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8650187
    Abstract: Techniques for training and using linked event detection systems and transforming source-identified stopwords are provided. A training corpus of source-identified stories and a reference language are determined. Optionally, stopwords for source-identified stories are transformed based on statistical analysis of parallel verified and unverified transformations. Reference language and non-reference language terms are selectively included in source-pair term frequency-inverse story frequency models. Optionally, incremental source-identified term frequency-inverse story frequency models are determined. Selected terms are weighted and similarity metrics determined. Associated source-pair statistics, computed in part from a training corpus, are combined with the values of each similarity metric in the set of similarity metrics to form a similarity vector. Similarity vectors and verified link label information are used to determine a predictive model.
    Type: Grant
    Filed: July 25, 2003
    Date of Patent: February 11, 2014
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Francine R. Chen, Ayman O. Farahat, Thorsten H. Brants
  • Patent number: 7577654
    Abstract: Techniques for new event detection are provided. For a new story and a corpus of stories, story-pairs based on the new story and each corpus story are determined. Adjustments to the importance of terms are determined based on story characteristics associated with each story. Story characteristics are based on direct or indirect characteristics. Direct story characteristics include authorship, language associated with a story and the like. Indirect story characteristics may include derived characteristics such as an ROI category characteristic, a same ROI characteristic, a same event-same source characteristic, an average story similarity characteristic or any other known or later developed characteristic associated with a story. Adjustments to the inter-story similarity metrics are then determined based on story characteristics and/or a weighting function.
    Type: Grant
    Filed: July 25, 2003
    Date of Patent: August 18, 2009
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Thorsten H. Brants, Francine R. Chen, Ayman O. Farahat
  • Patent number: 7529765
    Abstract: One aspect of the invention is that of efficiently and incrementally adding new terms to an already trained probabilistic latent semantic analysis (PLSA) model.
    Type: Grant
    Filed: November 23, 2004
    Date of Patent: May 5, 2009
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Thorsten H. Brants, Ioannis Tsochantaridis, Thomas Hofmann, Francine R. Chen
  • Patent number: 7451395
    Abstract: Techniques for determining interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text provides contextualized access to an interactive topic-based text summary and to an original text.
    Type: Grant
    Filed: December 16, 2002
    Date of Patent: November 11, 2008
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Thorsten H. Brants, Francine R. Chen, Annie E. Zaenen
  • Patent number: 7376893
    Abstract: Techniques for determining sentence based interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text.
    Type: Grant
    Filed: December 16, 2002
    Date of Patent: May 20, 2008
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Francine R. Chen, Thorsten H. Brants, Annie E. Zaenen
  • Patent number: 7280957
    Abstract: A method is provided for digesting the content of hierarchically related information. The method, which obtains relatively short overviews, selects a proportion of representative nodes and then extracts and organizes one or more sentences from the text associated with each selected node. For text trees representing archived discussions, the selection of nodes and sentences is from comment/response sequences drawn from lexically central nodes which will capture those aspects of the discussion considered most important to discussion participants.
    Type: Grant
    Filed: December 16, 2002
    Date of Patent: October 9, 2007
    Assignee: Palo Alto Research Center, Incorporated
    Inventors: Paula S. Newman, John C. Blitzer, Thorsten H. Brants
  • Patent number: 7130837
    Abstract: Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to “topics”, latent variables in the PLSA model, and “topics” to text segments. A system executing the method exhibits significant performance improvement. Once determined, the topic structure of a document may be employed for document retrieval and/or document summarization.
    Type: Grant
    Filed: March 22, 2002
    Date of Patent: October 31, 2006
    Assignee: Xerox Corporation
    Inventors: Ioannis Tsochantaridis, Thorsten H. Brants, Francine R. Chen
  • Patent number: 7117437
    Abstract: Techniques for displaying interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text provides contextualized access to an interactive topic-based text summary and to an original text.
    Type: Grant
    Filed: December 16, 2002
    Date of Patent: October 3, 2006
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Francine R. Chen, Thorsten H. Brants, Annie E. Zaenen
  • Publication number: 20040122657
    Abstract: Techniques for determining interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text provides contextualized access to an interactive topic-based text summary and to an original text.
    Type: Application
    Filed: December 16, 2002
    Publication date: June 24, 2004
    Inventors: Thorsten H. Brants, Francine R. Chen, Annie E. Zaenen
  • Publication number: 20040117449
    Abstract: A method is provided for digesting the content of hierarchically related information. The method, which obtains relatively short overviews, selects a proportion of representative nodes and then extracts and organizes one or more sentences from the text associated with each selected node. For text trees representing archived discussions, the selection of nodes and sentences is from comment/response sequences drawn from lexically central nodes which will capture those aspects of the discussion considered most important to discussion participants.
    Type: Application
    Filed: December 16, 2002
    Publication date: June 17, 2004
    Applicant: Palo Alto Research Center, Incorporated
    Inventors: Paula S. Newman, John C. Blitzer, Thorsten H. Brants
  • Publication number: 20040117740
    Abstract: Techniques for displaying interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text provides contextualized access to an interactive topic-based text summary and to an original text.
    Type: Application
    Filed: December 16, 2002
    Publication date: June 17, 2004
    Inventors: Francine R. Chen, Thorsten H. Brants, Annie E. Zaenen
  • Publication number: 20040117725
    Abstract: Techniques for determining sentence based interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text.
    Type: Application
    Filed: December 16, 2002
    Publication date: June 17, 2004
    Inventors: Francine R. Chen, Thorsten H. Brants, Annie E. Zaenen
  • Publication number: 20030182631
    Abstract: Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to “topics”, latent variables in the PLSA model, and “topics” to text segments. A system executing the method exhibits significant performance improvement. Once determined, the topic structure of a document may be employed for document retrieval and/or document summarization.
    Type: Application
    Filed: March 22, 2002
    Publication date: September 25, 2003
    Applicant: Xerox Corporation
    Inventors: Ioannis Tsochantaridis, Thorsten H. Brants, Francine R. Chen