Patents by Inventor Hinrich Schuetze
Hinrich Schuetze has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7383241
Abstract: A method for estimating the performance of a statistical classifier. The method includes inputting a first set of business data in a first format from a real business process and storing the first set of business data in the first format into memory. The method includes applying a statistical classifier to the first set of business data, recording its classification decisions, and obtaining a labeling that contains the correct decision for each data item. The method includes computing a weight for each data item that reflects its true frequency and computing a performance measure of the statistical classifier based on the weights that reflect true frequency. The method also displays the performance measure to a user.
Type: Grant
Filed: July 14, 2004
Date of Patent: June 3, 2008
Assignee: ENKATA Technologies, Inc.
Inventors: Omer Emre Velipasaoglu, Hinrich Schuetze, Chia-Hao Yu, Stan Stukov
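The weighting step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the weights are derived or which performance measure is used; the example assumes each weight is the number of real items a labeled sample item stands for, and uses weighted accuracy as the measure. The function name and the sample data are hypothetical.

```python
def weighted_accuracy(predicted, correct, weights):
    """Accuracy in which each item counts in proportion to its weight.

    Assumption: weights[i] is the number of real-world items that
    sample item i represents (for example, the inverse of its sampling rate).
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    hits = sum(w for p, c, w in zip(predicted, correct, weights) if p == c)
    return hits / total


# Hypothetical labeled sample in which the rare class was oversampled,
# so each rare item carries a smaller weight than each common item.
predicted = ["ok", "ok", "fraud", "fraud"]
correct = ["ok", "ok", "fraud", "ok"]
weights = [45.0, 45.0, 5.0, 5.0]

estimate = weighted_accuracy(predicted, correct, weights)
```

On this sample the unweighted accuracy is 3 of 4 items, while the weighted estimate gives the single error only 5 of 100 units of weight.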
-
Patent number: 7089237
Abstract: A search and recommendation system employs the preferences and profiles of individual users and groups within a community of users, as well as information derived from categorically organized content pointers, to augment electronic commerce related searches, re-rank search results, and provide recommendations for commerce related objects based on an initial subject-matter query and an interaction history of a user. The search and recommendation system operates in the context of a content pointer manager, which stores individual users' content pointers (some of which may be published or shared for group use) on a centralized content pointer database connected to a network. The shared content pointer manager is implemented as a distributed program, portions of which operate on users' terminals and other portions of which operate on the centralized content pointer database. A user's content pointers are organized in accordance with a local topical categorical hierarchy.
Type: Grant
Filed: January 26, 2001
Date of Patent: August 8, 2006
Assignee: Google, Inc.
Inventors: Donald R. Turnbull, Hinrich Schuetze
-
Patent number: 7031961
Abstract: A search and recommendation system employs the preferences and profiles of individual users and groups within a community of users, as well as information derived from categorically organized content pointers, to augment Internet searches, re-rank search results, and provide recommendations for objects based on an initial subject-matter query. The search and recommendation system operates in the context of a content pointer manager, which stores individual users' content pointers (some of which may be published or shared for group use) on a centralized content pointer database connected to the Internet. The shared content pointer manager is implemented as a distributed program, portions of which operate on users' terminals and other portions of which operate on the centralized content pointer database. A user's content pointers are organized in accordance with a local topical categorical hierarchy.
Type: Grant
Filed: December 4, 2000
Date of Patent: April 18, 2006
Assignee: Google, Inc.
Inventors: James B. Pitkow, Hinrich Schuetze
-
Patent number: 7013304
Abstract: Improved method, data structure and computer readable medium for searching for digital information files. Files referenced by URLs may be quickly located by finding a minimum unique prefix for the desired URL, breaking the prefix into substrings, and traversing a trie data structure to find indices to another trie data structure that will yield the physical location of the stored digital information file. A node data structure may be used to construct the trie data structures, and may be compressed to allow the tries to occupy less memory, thus allowing the tries to be maintained in memory and requiring less access to storage devices. The result is faster retrieval times for digital information files.
Type: Grant
Filed: October 20, 1999
Date of Patent: March 14, 2006
Assignee: Xerox Corporation
Inventors: Hinrich Schuetze, James E. Pitkow
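As a rough illustration of the prefix idea in this abstract, the sketch below stores URLs in a single character-level trie and reports the shortest prefix that identifies one stored URL. It is a simplification under stated assumptions: the patent's two-trie arrangement, the splitting of the prefix into substrings, and the node compression are not reproduced, and all class, method, and file names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.count = 0       # number of stored keys passing through this node
        self.value = None    # payload stored at the end of a key


class Trie:
    """Character-level trie. Assumes each key is inserted at most once."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
            node.count += 1
        node.value = value

    def min_unique_prefix(self, key):
        """Shortest prefix of a stored key that no other stored key shares."""
        node = self.root
        for i, ch in enumerate(key):
            node = node.children[ch]  # raises KeyError if key was never stored
            if node.count == 1:
                return key[: i + 1]
        return key

    def lookup(self, key):
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.value


# Hypothetical URLs and storage locations.
trie = Trie()
trie.insert("http://a.com/x", "file1")
trie.insert("http://a.com/y", "file2")
trie.insert("http://b.org/", "file3")
```

Here `trie.min_unique_prefix("http://b.org/")` stops at the first character that separates the URL from the others, while the two `a.com` URLs differ only in their last character and therefore need their full length.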
-
Patent number: 6973423
Abstract: A processor implemented method of identifying the text genre of a machine-readable, untagged text. The processor implemented method begins by generating a cue vector from the text, which represents occurrences in the text of a first set of nonstructural, surface cues, which are easily computable. Afterward, the processor determines whether the text is an instance of a first text genre using the cue vector and a weighting vector associated with the first text genre.
Type: Grant
Filed: June 18, 1998
Date of Patent: December 6, 2005
Assignee: Xerox Corporation
Inventors: Geoffrey D. Nunberg, Hinrich Schuetze, Jan O. Pedersen, Brett L. Kessler, Gregory Grefenstette
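A cue vector combined with a per-genre weighting vector can be sketched as a simple linear decision. The abstract does not name the cues or the decision rule, so the four cues, the weights, and the zero threshold below are assumptions chosen only for illustration.

```python
import re


def cue_vector(text):
    """Compute a few easily computable surface cues (hypothetical choice)."""
    words = re.findall(r"[A-Za-z']+", text)
    n = max(len(words), 1)
    pronouns = {"i", "we", "you"}
    return [
        text.count("?") / n,                           # question marks per word
        text.count("!") / n,                           # exclamation marks per word
        sum(w.lower() in pronouns for w in words) / n, # share of 1st/2nd person pronouns
        sum(len(w) for w in words) / n,                # average word length
    ]


def is_genre(cues, weights, bias=0.0):
    """Assumed decision rule: weighted sum of cues above zero."""
    score = sum(c * w for c, w in zip(cues, weights)) + bias
    return score > 0


# Hypothetical weighting vector for an informal, conversational genre.
informal_weights = [5.0, 5.0, 5.0, -1.0]

chatty = is_genre(cue_vector("Do you agree? I do!"), informal_weights)
formal = is_genre(cue_vector("The committee approved the proposal."), informal_weights)
```

With these weights the short conversational sentence scores above zero and the formal sentence, with its longer words and no pronouns, scores below it.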
-
Patent number: 6941321
Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
Type: Grant
Filed: October 19, 1999
Date of Patent: September 6, 2005
Assignee: Xerox Corporation
Inventors: Hinrich Schuetze, Francine R. Chen, Peter L. Pirolli, James E. Pitkow, Ed H. Chi, Jun Li
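The three steps named in this abstract (vector representation, similarity, clustering) can be sketched in a few lines. The abstract does not specify the features, the similarity measure, or the clustering algorithm; the example assumes plain word counts, cosine similarity, and a greedy single-pass grouping, and covers text only rather than multi-modal features. All names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a document as a sparse vector of word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_clusters(vectors, threshold):
    """Put each vector into the first cluster whose first member is
    at least `threshold` similar; otherwise start a new cluster."""
    clusters = []  # each cluster is a list of indices into `vectors`
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    "trie index search",
    "search index trie lookup",
    "genre summary sentence",
]
groups = greedy_clusters([vectorize(d) for d in docs], threshold=0.5)
```

The same functions would apply to users if each user were represented as a vector, for instance of visited documents, which is the analogy the abstract draws between user clusters and document clusters.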
-
Publication number: 20050192824
Abstract: A method for detecting change in business data using a statistical classifier process. The method includes inputting a first set of business data in a first format from a real business process from a first data source and storing the first set of business data into one or more memories. The method also includes inputting a second set of business data in a second format from a real business process from a second data source and storing the second set of business data into one or more memories. The method forms a statistical classifier by inputting the first set of business data into a learning process associated with the statistical classifier that processes the business data in the first format. The method stores the classifier into the one or more memories, the classifier being associated with the first set of data in the first format, and processes the data from the first data source in the statistical classifier to derive a first result.
Type: Application
Filed: July 12, 2004
Publication date: September 1, 2005
Applicant: ENKATA Technologies
Inventors: Hinrich Schuetze, Omer Velipasaoglu, Chia-Hao Yu, Stan Stukov
-
Patent number: 6922699
Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
Type: Grant
Filed: October 19, 1999
Date of Patent: July 26, 2005
Assignee: Xerox Corporation
Inventors: Hinrich Schuetze, Francine R. Chen, Peter L. Pirolli, James E. Pitkow, Ed H. Chi, Jun Li, Ullas Gargi
-
Patent number: 6907562
Abstract: A system and method in accordance with an embodiment of the invention addresses the problems of unlinked or sparsely linked documents by linking them using a set of automatically extracted content words, the “index terms.” Upon receiving a list of documents for indexing, the system and method in accordance with an embodiment of the invention automatically selects the terms to be indexed and generates a hypertext concordance (an “HC”). A concordance is an index where each of the indexed terms is listed with surrounding text, i.e., in context. As well, each of the indexed terms in the HC is given a hyperlink, instead of a page number, back to the occurrence of the term in a version of the indexed document. In one embodiment of the invention, the original document that has been indexed is also revised to include hyperlinks from the index terms into the HC.
Type: Grant
Filed: July 26, 1999
Date of Patent: June 14, 2005
Assignee: Xerox Corporation
Inventor: Hinrich Schuetze
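A concordance entry of the kind described here pairs an index term with its surrounding text and a link target. The sketch below assumes the index terms are already given (the abstract says they are selected automatically but not how) and uses hypothetical `#occ-N` anchors in place of real hyperlinks.

```python
import re


def concordance(text, index_terms, width=20):
    """Return (term, context, anchor) entries for each occurrence of an
    index term. `index_terms` is a set of lowercase words; `anchor` is a
    hypothetical link target naming the N-th word of the text."""
    entries = []
    for n, match in enumerate(re.finditer(r"\w+", text)):
        word = match.group().lower()
        if word in index_terms:
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            context = f"{left}[{match.group()}]{right}"
            entries.append((word, context, f"#occ-{n}"))
    # Sort by term, then by context, as a printed concordance would be.
    return sorted(entries)


sample = "A trie stores keys. The trie is compact."
entries = concordance(sample, {"trie"})
```

Each entry keeps up to `width` characters on either side of the term, which is the "in context" listing the abstract describes; the anchor plays the role that a page number plays in a printed index.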
-
Publication number: 20050065967
Abstract: A method for processing semi-structured data. The method includes receiving semi-structured data in a first format from a real business process. Preferably, the semi-structured data are machine generated. The method includes tokenizing the semi-structured data into a second format, storing the semi-structured data in the second format into one or more memories, and clustering the tokenized data to form a plurality of clusters. The method also includes identifying a selected low frequency term in each of the clusters, and processing at least two of the clusters and the associated selected low frequency terms to form a single template for the at least two of the clusters. In a preferred embodiment, the method replaces the selected low frequency term with a wild card character.
Type: Application
Filed: July 20, 2004
Publication date: March 24, 2005
Applicant: Enkata Technologies, Inc.
Inventors: Hinrich Schuetze, Chia-Hao Yu, Omer Velipasaoglu, Stan Stukov
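The wildcard step in this abstract can be sketched on machine-generated log lines. This is a simplification under stated assumptions: tokenization is whitespace splitting, a term counts as low frequency when it appears fewer than `min_count` times in the whole input, and the clustering step is collapsed into grouping lines that end up with the same template. The function name and the sample lines are hypothetical.

```python
from collections import Counter


def induce_templates(lines, min_count=2):
    """Replace low-frequency tokens with '*' and count lines per template."""
    tokenized = [line.split() for line in lines]
    # Frequency of every token across the whole input.
    freq = Counter(tok for toks in tokenized for tok in toks)
    templates = Counter()
    for toks in tokenized:
        template = " ".join(t if freq[t] >= min_count else "*" for t in toks)
        templates[template] += 1
    return templates


logs = [
    "login user alice ok",
    "login user bob ok",
    "disk full on sda1",
    "disk full on sdb2",
]
result = induce_templates(logs)
```

On the sample, the user names and device names each occur once, so they become wildcards and the four lines reduce to two templates.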
-
Publication number: 20050060340
Abstract: A method for loading data into a datamart. The method includes identifying business data in a first format from a real business process and identifying a desired second format. The method includes designing a transformation algorithm that transforms the business data in the first format into the second format and implementing the transformation algorithm in computer executable code. The method includes running the computer executable code on business data in the first format and generating business data in the second format and storing the business data in the second format in a datamart.
Type: Application
Filed: July 22, 2004
Publication date: March 17, 2005
Applicant: Enkata Technologies
Inventors: Daniel Sommerfield, Hinrich Schuetze, Stan Stukov
-
Publication number: 20050021357
Abstract: A system and method for the efficient creation of training data for automatic classification.
Type: Application
Filed: May 19, 2004
Publication date: January 27, 2005
Applicant: ENKATA Technologies
Inventors: Hinrich Schuetze, Omer Velipasaoglu, Chia-Hao Yu, Stan Stukov
-
Publication number: 20050021290
Abstract: A method for estimating the performance of a statistical classifier. The method includes inputting a first set of business data in a first format from a real business process and storing the first set of business data in the first format into memory. The method includes applying a statistical classifier to the first set of business data, recording its classification decisions, and obtaining a labeling that contains the correct decision for each data item. The method includes computing a weight for each data item that reflects its true frequency and computing a performance measure of the statistical classifier based on the weights that reflect true frequency. The method also displays the performance measure to a user.
Type: Application
Filed: July 14, 2004
Publication date: January 27, 2005
Applicant: ENKATA Technologies, Inc.
Inventors: Omer Velipasaoglu, Hinrich Schuetze, Chia-Hao Yu, Stan Stukov
-
Patent number: 6766287
Abstract: A system for genre-specific summarization of documents is provided that overcomes the problem of summarizing heterogeneous document collections by taking the genre, or type, of document into account when selecting summary sentences. The system of the present invention takes advantage of the structure and wording of various document genres to provide faster and more accurate summaries.
Type: Grant
Filed: December 15, 1999
Date of Patent: July 20, 2004
Assignee: Xerox Corporation
Inventors: Julian M. Kupiec, Hinrich Schuetze
-
Patent number: 6757669
Abstract: This invention provides a device that can be plugged into an intranet and offers searchable index functionality of that intranet without requiring information about system configuration or administration.
Type: Grant
Filed: January 28, 2000
Date of Patent: June 29, 2004
Assignee: Xerox Corporation
Inventors: Eytan Adar, Hinrich Schuetze, Blake D. Ward
-
Patent number: 6751612
Abstract: A system, computer readable medium and method for searching for recently altered documents on the World Wide Web is provided. The method selects a server to be searched or crawled by a Web crawler based on a user selected ranking. Servers are ranked by a filter program which compares a user query with the content of a server and the frequency with which content is altered. A top percentage of ranked servers are crawled and the recently altered information, such as hyperlinks, is then provided to the user.
Type: Grant
Filed: November 29, 1999
Date of Patent: June 15, 2004
Assignee: Xerox Corporation
Inventors: Hinrich Schuetze, Jan Pedersen
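The server selection described in this abstract can be sketched as a score-and-cut step. The abstract does not give the scoring formula; the example assumes the score is the share of query terms found in a server's content multiplied by how often that content changes, and it keeps a fixed top fraction of servers. The field names, host names, and numbers are hypothetical.

```python
import math


def rank_servers(servers, query, top_fraction=0.5):
    """Return the hosts of the top-ranked servers to crawl.

    Assumed score: query-term overlap with server content, multiplied by
    the server's observed change rate.
    """
    terms = set(query.lower().split())

    def score(server):
        content_terms = set(server["content"].lower().split())
        overlap = len(terms & content_terms) / max(len(terms), 1)
        return overlap * server["changes_per_day"]

    ranked = sorted(servers, key=score, reverse=True)
    keep = max(1, math.ceil(len(ranked) * top_fraction))
    return [server["host"] for server in ranked[:keep]]


servers = [
    {"host": "a.example", "content": "patent search news", "changes_per_day": 10},
    {"host": "b.example", "content": "cooking recipes", "changes_per_day": 50},
    {"host": "c.example", "content": "patent law archive", "changes_per_day": 1},
    {"host": "d.example", "content": "patent search engine updates", "changes_per_day": 2},
]
to_crawl = rank_servers(servers, "patent search")
```

A frequently changing server that does not match the query scores zero under this assumption, so it is not crawled no matter how often it changes.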
-
Patent number: 6728752
Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
Type: Grant
Filed: October 19, 1999
Date of Patent: April 27, 2004
Assignee: Xerox Corporation
Inventors: Francine R. Chen, Hinrich Schuetze, Ullas Gargi
-
Patent number: 6598054
Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
Type: Grant
Filed: October 19, 1999
Date of Patent: July 22, 2003
Assignee: Xerox Corporation
Inventors: Hinrich Schuetze, Peter L. Pirolli, James E. Pitkow, Ed H. Chi, Jun Li
-
Publication number: 20030110181
Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
Type: Application
Filed: October 19, 1999
Publication date: June 12, 2003
Inventors: Hinrich Schuetze, Peter L. Pirolli, James E. Pitkow, Ed H. Chi, Jun Li
-
Patent number: 6567797
Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.
Type: Grant
Filed: October 19, 1999
Date of Patent: May 20, 2003
Assignee: Xerox Corporation
Inventors: Hinrich Schuetze, James E. Pitkow, Peter L. Pirolli, Ed H. Chi, Jun Li