Patents by Inventor Thorsten H. Brants
Thorsten H. Brants has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8650187
Abstract: Techniques for training and using linked event detection systems and transforming source-identified stopwords are provided. A training corpus of source identified stories and a reference language is determined. Optionally, stopwords for source-identified stories are transformed based on statistical analysis of parallel verified and un-verified transformations. Reference language and non-reference language terms are selectively included in source-pair term frequency-inverse story frequency models. Optionally, incremental source-identified term frequency-inverse story frequency models are determined. Selected terms are weighted and similarity metrics determined. Associated source-pair statistics, computed in part from a training corpus, are combined with the values of each similarity metric in the set of similarity metrics to form a similarity vector. Similarity vectors and verified link label information are used to determine a predictive model.
Type: Grant
Filed: July 25, 2003
Date of Patent: February 11, 2014
Assignee: Palo Alto Research Center Incorporated
Inventors: Francine R. Chen, Ayman O. Farahat, Thorsten H. Brants
-
Patent number: 7577654
Abstract: Techniques for new event detection are provided. For a new story and a corpus of stories, story-pairs based on the new story and each corpus story are determined. Adjustments to the importance of terms are determined based on story characteristics associated with each story. Story characteristics are based on direct or indirect characteristics. Direct story characteristics include authorship, language associated with a story and the like. Indirect story characteristics may include derived characteristics such as an ROI category characteristic, a same ROI characteristic, a same event-same source characteristic, an average story similarity characteristic or any other known or later developed characteristic associated with a story. Adjustments to the inter-story similarity metrics are then determined based on story characteristics and/or a weighting function.
Type: Grant
Filed: July 25, 2003
Date of Patent: August 18, 2009
Assignee: Palo Alto Research Center Incorporated
Inventors: Thorsten H. Brants, Francine R. Chen, Ayman O. Farahat
-
Patent number: 7529765
Abstract: One aspect of the invention is that of efficiently and incrementally adding new terms to an already trained probabilistic latent semantic analysis (PLSA) model.
Type: Grant
Filed: November 23, 2004
Date of Patent: May 5, 2009
Assignee: Palo Alto Research Center Incorporated
Inventors: Thorsten H. Brants, Ioannis Tsochantaridis, Thomas Hofmann, Francine R. Chen
-
Patent number: 7451395
Abstract: Techniques for determining interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text provides contextualized access to an interactive topic-based text summary and to an original text.
Type: Grant
Filed: December 16, 2002
Date of Patent: November 11, 2008
Assignee: Palo Alto Research Center Incorporated
Inventors: Thorsten H. Brants, Francine R. Chen, Annie E. Zaenen
-
Patent number: 7376893
Abstract: Techniques for determining sentence based interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text.
Type: Grant
Filed: December 16, 2002
Date of Patent: May 20, 2008
Assignee: Palo Alto Research Center Incorporated
Inventors: Francine R. Chen, Thorsten H. Brants, Annie E. Zaenen
-
Patent number: 7280957
Abstract: A method is provided for digesting the content of hierarchically related information. The method, which obtains relatively short overviews, selects a proportion of representative nodes and then extracts and organizes one or more sentences from the text associated with each selected node. For text trees representing archived discussions, the selection of nodes and sentences is from comment/response sequences drawn from lexically central nodes which will capture those aspects of the discussion considered most important to discussion participants.
Type: Grant
Filed: December 16, 2002
Date of Patent: October 9, 2007
Assignee: Palo Alto Research Center, Incorporated
Inventors: Paula S. Newman, John C. Blitzer, Thorsten H. Brants
-
Patent number: 7130837
Abstract: Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to “topics”, latent variables in the PLSA model, and “topics” to text segments. A system executing the method exhibits significant performance improvement. Once determined, the topic structure of a document may be employed for document retrieval and/or document summarization.
Type: Grant
Filed: March 22, 2002
Date of Patent: October 31, 2006
Assignee: Xerox Corporation
Inventors: Ioannis Tsochantaridis, Thorsten H. Brants, Francine R. Chen
-
Patent number: 7117437
Abstract: Techniques for displaying interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text to provide contextualized access to an interactive topic-based text summary and to an original text.
Type: Grant
Filed: December 16, 2002
Date of Patent: October 3, 2006
Assignee: Palo Alto Research Center Incorporated
Inventors: Francine R. Chen, Thorsten H. Brants, Annie E. Zaenen
-
Publication number: 20040122657
Abstract: Techniques for determining interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text provides contextualized access to an interactive topic-based text summary and to an original text.
Type: Application
Filed: December 16, 2002
Publication date: June 24, 2004
Inventors: Thorsten H. Brants, Francine R. Chen, Annie E. Zaenen
-
Publication number: 20040117449
Abstract: A method is provided for digesting the content of hierarchically related information. The method, which obtains relatively short overviews, selects a proportion of representative nodes and then extracts and organizes one or more sentences from the text associated with each selected node. For text trees representing archived discussions, the selection of nodes and sentences is from comment/response sequences drawn from lexically central nodes which will capture those aspects of the discussion considered most important to discussion participants.
Type: Application
Filed: December 16, 2002
Publication date: June 17, 2004
Applicant: Palo Alto Research Center, Incorporated
Inventors: Paula S. Newman, John C. Blitzer, Thorsten H. Brants
-
Publication number: 20040117740
Abstract: Techniques for displaying interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical and interactive display of texts based on the use of discrete sentence constituent based summaries which associates expansible and contractible displayed text to provide contextualized access to an interactive topic-based text summary and to an original text.
Type: Application
Filed: December 16, 2002
Publication date: June 17, 2004
Inventors: Francine R. Chen, Thorsten H. Brants, Annie E. Zaenen
-
Publication number: 20040117725
Abstract: Techniques for determining sentence based interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text.
Type: Application
Filed: December 16, 2002
Publication date: June 17, 2004
Inventors: Francine R. Chen, Thorsten H. Brants, Annie E. Zaenen
-
Publication number: 20030182631
Abstract: Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to “topics”, latent variables in the PLSA model, and “topics” to text segments. A system executing the method exhibits significant performance improvement. Once determined, the topic structure of a document may be employed for document retrieval and/or document summarization.
Type: Application
Filed: March 22, 2002
Publication date: September 25, 2003
Applicant: XEROX CORPORATION
Inventors: Ioannis Tsochantaridis, Thorsten H. Brants, Francine R. Chen