Patents by Inventor Hinrich Schuetze

Hinrich Schuetze has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for estimating performance of a classifier

Patent number: 7383241

Abstract: A method for estimating the performance of a statistical classifier. The method includes inputting a first set of business data in a first format from a real business process and storing the first set of business data in the first format into memory. The method includes applying a statistical classifier to the first set of business data, recording its classification decisions, and obtaining a labeling that contains the correct decision for each data item. The method includes computing a weight for each data item that reflects its true frequency and computing a performance measure of the statistical classifier based on the weights that reflect true frequency. The method also displays the performance measure to a user.

Type: Grant

Filed: July 14, 2004

Date of Patent: June 3, 2008

Assignee: ENKATA Technologies, Inc.

Inventors: Omer Emre Velipasaoglu, Hinrich Schuetze, Chia-Hao Yu, Stan Stukov
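
The weighted performance measure described in this abstract can be illustrated with a minimal sketch. The function name, the sample labels, and the weight values below are hypothetical; the listing does not say how the patent derives its weights, so they are simply supplied as inputs here.

```python
def weighted_accuracy(predictions, labels, weights):
    """Accuracy in which each item counts in proportion to its weight.

    The weights stand in for the "true frequency" of each data item;
    how they are obtained is an assumption left outside this sketch.
    """
    if not (len(predictions) == len(labels) == len(weights)):
        raise ValueError("inputs must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Sum the weights of the items the classifier got right.
    correct = sum(w for p, y, w in zip(predictions, labels, weights) if p == y)
    return correct / total


# Hypothetical sample: the third item is misclassified and carries a large weight.
predictions = ["spam", "ham", "spam", "ham"]
labels = ["spam", "ham", "ham", "ham"]
weights = [1.0, 1.0, 6.0, 2.0]
estimate = weighted_accuracy(predictions, labels, weights)
```

With these inputs the unweighted accuracy would be 3 out of 4, while the weighted estimate is lower because the misclassified item stands for many real-world items.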
Interface and system for providing persistent contextual relevance for commerce activities in a networked environment

Patent number: 7089237

Abstract: A search and recommendation system employs the preferences and profiles of individual users and groups within a community of users, as well as information derived from categorically organized content pointers, to augment electronic commerce related searches, re-rank search results, and provide recommendations for commerce related objects based on an initial subject-matter query and an interaction history of a user. The search and recommendation system operates in the context of a content pointer manager, which stores individual users' content pointers (some of which may be published or shared for group use) on a centralized content pointer database connected to a network. The shared content pointer manager is implemented as a distributed program, portions of which operate on users' terminals and other portions of which operate on the centralized content pointer database. A user's content pointers are organized in accordance with a local topical categorical hierarchy.

Type: Grant

Filed: January 26, 2001

Date of Patent: August 8, 2006

Assignee: Google, Inc.

Inventors: Donald R. Turnbull, Hinrich Schuetze
System and method for searching and recommending objects from a categorically organized information repository

Patent number: 7031961

Abstract: A search and recommendation system employs the preferences and profiles of individual users and groups within a community of users, as well as information derived from categorically organized content pointers, to augment Internet searches, re-rank search results, and provide recommendations for objects based on an initial subject-matter query. The search and recommendation system operates in the context of a content pointer manager, which stores individual users' content pointers (some of which may be published or shared for group use) on a centralized content pointer database connected to the Internet. The shared content pointer manager is implemented as a distributed program, portions of which operate on users' terminals and other portions of which operate on the centralized content pointer database. A user's content pointers are organized in accordance with a local topical categorical hierarchy.

Type: Grant

Filed: December 4, 2000

Date of Patent: April 18, 2006

Assignee: Google, Inc.

Inventors: James E. Pitkow, Hinrich Schuetze
Method for locating digital information files

Patent number: 7013304

Abstract: Improved method, data structure and computer readable medium for searching for digital information files. Files referenced by URLs may be quickly located by finding a minimum unique prefix for the desired URL, breaking the prefix into substrings, and traversing a trie data structure to find indices to another trie data structure that will yield the physical location of the stored digital information file. A node data structure may be used to construct the trie data structures, and may be compressed to allow the tries to occupy less memory, thus allowing the tries to be maintained in memory and less access to storage devices. The result is faster retrieval times for digital information files.

Type: Grant

Filed: October 20, 1999

Date of Patent: March 14, 2006

Assignee: Xerox Corporation

Inventors: Hinrich Schuetze, James E. Pitkow
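
As a rough illustration of the minimum-unique-prefix idea in this abstract, the sketch below uses a single character-level trie. It is a simplification under stated assumptions: the patent describes splitting the prefix into substrings, chaining two tries, and compressing nodes, none of which is reproduced here. The class names, method names, and the `location` strings are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0        # number of stored URLs passing through this node
        self.location = None  # hypothetical storage location of a complete URL


class UrlTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, url, location):
        """Store a URL; each distinct URL is assumed to be inserted once."""
        node = self.root
        for ch in url:
            node = node.children.setdefault(ch, TrieNode())
            node.count += 1
        node.location = location

    def min_unique_prefix(self, url):
        """Shortest prefix shared with no other stored URL.

        Assumes `url` was inserted earlier. Returns the whole URL when it is
        itself a prefix of another stored URL, and None if the path is absent.
        """
        node = self.root
        for i, ch in enumerate(url):
            node = node.children.get(ch)
            if node is None:
                return None
            if node.count == 1:
                return url[: i + 1]
        return url

    def lookup(self, url):
        """Return the stored location for a complete URL, or None."""
        node = self.root
        for ch in url:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.location


trie = UrlTrie()
trie.insert("example.com/a", "block-1")
trie.insert("example.com/b", "block-2")
trie.insert("example.org/", "block-3")
prefix = trie.min_unique_prefix("example.org/")
```

Here `prefix` is `"example.o"`, the first point at which the third URL diverges from the other two.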
Article and method of automatically determining text genre using surface features of untagged texts

Patent number: 6973423

Abstract: A processor implemented method of identifying the text genre of a machine-readable, untagged text. The processor implemented method begins by generating a cue vector from the text, which represents occurrences in the text of a first set of nonstructural, surface cues, which are easily computable. Afterward, the processor determines whether the text is an instance of a first text genre using the cue vector and a weighting vector associated with the first text genre.

Type: Grant

Filed: June 18, 1998

Date of Patent: December 6, 2005

Assignee: Xerox Corporation

Inventors: Geoffrey D. Nunberg, Hinrich Schuetze, Jan O. Pedersen, Brett L. Kessler, Gregory Grefenstette
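
The cue-vector test described in this abstract can be sketched as a linear score against a per-genre weighting vector. The particular surface cues, the weight values, and the zero decision threshold are all assumptions chosen for illustration; the listing does not name the cues or the decision rule the patent uses.

```python
import re

FIRST_SECOND_PERSON = {"i", "we", "you"}  # hypothetical cue word list


def cue_vector(text):
    """Compute a few easily counted surface cues from untagged text."""
    words = re.findall(r"[A-Za-z']+", text)
    n = max(len(words), 1)  # avoid division by zero on empty text
    return [
        text.count("?") / n,                                       # question marks per word
        text.count("!") / n,                                       # exclamation marks per word
        sum(1 for w in words if w.lower() in FIRST_SECOND_PERSON) / n,  # pronoun rate
        sum(len(w) for w in words) / n,                            # mean word length
    ]


def is_genre(cues, weights, bias=0.0):
    """Return True when the weighted sum of cues is positive."""
    score = bias + sum(c * w for c, w in zip(cues, weights))
    return score > 0


# Hypothetical weighting vector for an "editorial" genre.
EDITORIAL_WEIGHTS = [5.0, 5.0, 3.0, -0.2]

opinion = is_genre(cue_vector("Why should we accept this? We must not!"), EDITORIAL_WEIGHTS)
report = is_genre(cue_vector("The committee published quarterly revenue figures."), EDITORIAL_WEIGHTS)
```

In practice the weights would be learned from labeled texts rather than set by hand as they are here.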
System and method for identifying similarities among objects in a collection

Patent number: 6941321

Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.

Type: Grant

Filed: October 19, 1999

Date of Patent: September 6, 2005

Assignee: Xerox Corporation

Inventors: Hinrich Schuetze, Francine R. Chen, Peter L. Pirolli, James E. Pitkow, Ed H. Chi, Jun Li
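
The notion of representing documents as vectors and measuring their similarity can be illustrated with cosine similarity over term counts. This is a generic technique picked for the sketch, not necessarily the measure the patent discloses, and it covers text features only rather than the multi-modal features the abstract mentions. The sample documents are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


doc_a = Counter("search ranking for web search".split())
doc_b = Counter("web search ranking".split())
doc_c = Counter("genre summary".split())

similar = cosine_similarity(doc_a, doc_b)
unrelated = cosine_similarity(doc_a, doc_c)
```

Pairwise scores of this kind are the usual input to a clustering step, which would group `doc_a` with `doc_b` and leave `doc_c` apart.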
System and method for determining a behavior of a classifier for use with business data

Publication number: 20050192824

Abstract: A method for detecting change in business data using a statistical classifier process. The method includes inputting a first set of business data in a first format from a real business process from a first data source and storing the first set of business data into one or more memories. The method also includes inputting a second set of business data in a second format from a real business process from a second data source and storing the second set of business data into one or more memories. The method forms a statistical classifier by inputting the first set of business data into a learning process associated with the statistical classifier that processes the business data in the first format. The method stores the classifier into the one or more memories, the classifier being associated with the first set of data in the first format, and processes the data from the first data source in the statistical classifier to derive a first result.

Type: Application

Filed: July 12, 2004

Publication date: September 1, 2005

Applicant: ENKATA Technologies

Inventors: Hinrich Schuetze, Omer Velipasaoglu, Chia-Hao Yu, Stan Stukov
System and method for quantitatively representing data objects in vector space

Patent number: 6922699

Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.

Type: Grant

Filed: October 19, 1999

Date of Patent: July 26, 2005

Assignee: Xerox Corporation

Inventors: Hinrich Schuetze, Francine R. Chen, Peter L. Pirolli, James E. Pitkow, Ed H. Chi, Jun Li, Ullas Gargi
Hypertext concordance

Patent number: 6907562

Abstract: A system and method in accordance with an embodiment of the invention addresses the problems of unlinked or sparsely linked documents by linking them using a set of automatically extracted content words, the “index terms.” Upon receiving a list of documents for indexing, the system and method in accordance with an embodiment of the invention automatically selects the terms to be indexed and generates a hypertext concordance (an “HC”). A concordance is an index where each of the indexed terms is listed with surrounding text, i.e., in context. As well, each of the indexed terms in the HC is given a hyperlink, instead of a page number, back to the occurrence of the term in a version of the indexed document. In one embodiment of the invention, the original document that has been indexed is also revised to include hyperlinks from the index terms into the HC.

Type: Grant

Filed: July 26, 1999

Date of Patent: June 14, 2005

Assignee: Xerox Corporation

Inventor: Hinrich Schuetze
System and method for processing semi-structured business data using selected template designs

Publication number: 20050065967

Abstract: A method for processing semi-structured data. The method includes receiving semi-structured data into a first format from a real business process. Preferably, the semi-structured data are machine generated. The method includes tokenizing the semi-structured data into a second format and storing the semi-structured data in the second format into one or more memories and clustering the tokenized data to form a plurality of clusters. The method also includes identifying a selected low frequency term in each of the clusters, and processing at least two of the clusters and the associated selected low frequency terms to form a single template for the at least two of the clusters. In a preferred embodiment, the method replaces the selected low frequency term with a wild card character.

Type: Application

Filed: July 20, 2004

Publication date: March 24, 2005

Applicant: Enkata Technologies, Inc.

Inventors: Hinrich Schuetze, Chia-Hao Yu, Omer Velipasaoglu, Stan Stukov
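
The template step in this abstract (replacing low-frequency terms with a wild card so that similar records collapse into one template) can be sketched as follows. The sketch assumes the input lines already belong to one cluster and skips the clustering step entirely; the function name, the whitespace tokenizer, and the `min_fraction` threshold are hypothetical.

```python
from collections import Counter


def build_templates(lines, min_fraction=0.5, wildcard="*"):
    """Replace rare tokens with a wildcard and merge identical results.

    A token is "low frequency" here if it appears in fewer than
    `min_fraction` of the lines; that rule is an assumption of this sketch.
    """
    tokenized = [line.split() for line in lines]
    # Count the number of lines each token appears in.
    line_freq = Counter(tok for toks in tokenized for tok in set(toks))
    threshold = min_fraction * len(tokenized)

    templates = []
    for toks in tokenized:
        template = " ".join(
            tok if line_freq[tok] >= threshold else wildcard for tok in toks
        )
        if template not in templates:  # merge lines that yield the same template
            templates.append(template)
    return templates


# Hypothetical machine-generated records assumed to form one cluster.
records = [
    "ERROR disk sda1 full",
    "ERROR disk sdb2 full",
    "ERROR disk sdc3 full",
]
result = build_templates(records)
```

The three records differ only in the device name, so they reduce to the single template `ERROR disk * full`.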
System and method for efficient enrichment of business data

Publication number: 20050060340

Abstract: A method for loading data into a datamart. The method includes identifying business data in a first format from a real business process and identifying a desired second format. The method includes designing a transformation algorithm that transforms the business data in the first format into the second format and implementing the transformation algorithm in computer executable code. The method includes running the computer executable code on business data in the first format and generating business data in the second format and storing the business data in the second format in a datamart.

Type: Application

Filed: July 22, 2004

Publication date: March 17, 2005

Applicant: Enkata Technologies

Inventors: Daniel Sommerfield, Hinrich Schuetze, Stan Stukov
System and method for the efficient creation of training data for automatic classification

Publication number: 20050021357

Abstract: A system and method for the efficient creation of training data for automatic classification.

Type: Application

Filed: May 19, 2004

Publication date: January 27, 2005

Applicant: ENKATA Technologies

Inventors: Hinrich Schuetze, Omer Velipasaoglu, Chia-Hao Yu, Stan Stukov
System and method for estimating performance of a classifier

Publication number: 20050021290

Abstract: A method for estimating the performance of a statistical classifier. The method includes inputting a first set of business data in a first format from a real business process and storing the first set of business data in the first format into memory. The method includes applying a statistical classifier to the first set of business data, recording its classification decisions, and obtaining a labeling that contains the correct decision for each data item. The method includes computing a weight for each data item that reflects its true frequency and computing a performance measure of the statistical classifier based on the weights that reflect true frequency. The method also displays the performance measure to a user.

Type: Application

Filed: July 14, 2004

Publication date: January 27, 2005

Applicant: ENKATA Technologies, Inc.

Inventors: Omer Velipasaoglu, Hinrich Schuetze, Chia-Hao Yu, Stan Stukov
System for genre-specific summarization of documents

Patent number: 6766287

Abstract: A system for genre-specific summarization of documents is provided that overcomes the problem of summarizing heterogeneous document collections by taking the genre, or type, of document into account when selecting summary sentences. The system of the present invention takes advantage of the structure and wording of various document genres to provide faster and more accurate summaries.

Type: Grant

Filed: December 15, 1999

Date of Patent: July 20, 2004

Assignee: Xerox Corporation

Inventors: Julian M. Kupiec, Hinrich Schuetze
Self-contained indexing system for an intranet

Patent number: 6757669

Abstract: This invention provides a device that can be plugged into an intranet and offers searchable index functionality of that intranet without requiring information about system configuration or administration.

Type: Grant

Filed: January 28, 2000

Date of Patent: June 29, 2004

Assignee: Xerox Corporation

Inventors: Eytan Adar, Hinrich Schuetze, Blake D. Ward
User query generate search results that rank set of servers where ranking is based on comparing content on each server with user query, frequency at which content on each server is altered using Web crawler in a search engine

Patent number: 6751612

Abstract: A system, computer readable medium and method for searching for recently altered documents on the World Wide Web is provided. The method selects a server to be searched or crawled by a Web crawler based on a user selected ranking. Servers are ranked by a filter program which compares a user query with the content of a server and the frequency in which content is altered. A top percentage of ranked servers are crawled and the recently altered information, such as hyperlinks, are then provided to the user.

Type: Grant

Filed: November 29, 1999

Date of Patent: June 15, 2004

Assignee: Xerox Corporation

Inventors: Hinrich Schuetze, Jan Pedersen
System and method for information browsing using multi-modal features

Patent number: 6728752

Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.

Type: Grant

Filed: October 19, 1999

Date of Patent: April 27, 2004

Assignee: Xerox Corporation

Inventors: Francine R. Chen, Hinrich Schuetze, Ullas Gargi
System and method for clustering data objects in a collection

Patent number: 6598054

Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.

Type: Grant

Filed: October 19, 1999

Date of Patent: July 22, 2003

Assignee: Xerox Corporation

Inventors: Hinrich Schuetze, Peter L. Pirolli, James E. Pitkow, Ed H. Chi, Jun Li
System and method for clustering data objects in a collection

Publication number: 20030110181

Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.

Type: Application

Filed: October 19, 1999

Publication date: June 12, 2003

Inventors: Hinrich Schuetze, Peter L. Pirolli, James E. Pitkow, Ed H. Chi, Jun Li
System and method for providing recommendations based on multi-modal user clusters

Patent number: 6567797

Abstract: A system and method for browsing, retrieving, and recommending information from a collection uses multi-modal features of the documents in the collection, as well as an analysis of users' prior browsing and retrieval behavior. The system and method are premised on various disclosed methods for quantitatively representing documents in a document collection as vectors in multi-dimensional vector spaces, quantitatively determining similarity between documents, and clustering documents according to those similarities. The system and method also rely on methods for quantitatively representing users in a user population, quantitatively determining similarity between users, clustering users according to those similarities, and visually representing clusters of users by analogy to clusters of documents.

Type: Grant

Filed: October 19, 1999

Date of Patent: May 20, 2003

Assignee: Xerox Corporation

Inventors: Hinrich Schuetze, James E. Pitkow, Peter L. Pirolli, Ed H. Chi, Jun Li
