Patents by Inventor William Scott Spangler

William Scott Spangler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Clustering hypertext with applications to WEB searching

Publication number: 20040049503

Abstract: A method and structure for providing a database of documents comprising performing a search of the database using a query to produce query result documents, constructing a word dictionary of words within the query result documents, pruning function words from the word dictionary, forming first vectors for words remaining in a word dictionary, constructing an out-link dictionary of documents within the database that are pointed to by the query result documents, adding the query result documents to the out-link dictionary, pruning documents from the out-link dictionary that are pointed to by fewer than a first predetermined number of the query result documents, forming second vectors for documents remaining in the out-link dictionary, constructing an in-link dictionary of documents within the database that point to the query result documents, adding the query result documents to the in-link dictionary, pruning documents from the in-link dictionary that point to fewer than a second predetermined number of the qu

Type: Application

Filed: September 11, 2003

Publication date: March 11, 2004

Inventors: Dharmendra Shantilal Modha, William Scott Spangler

Clustering hypertext with applications to web searching

Patent number: 6684205

Abstract: A method and structure of searching a database containing hypertext documents comprising searching the database using a query to produce a set of hypertext documents; and geometrically clustering the set of hypertext documents into various clusters using a toric k-means similarity measure such that documents within each cluster are similar to each other, wherein the clustering has a linear-time complexity in producing the set of hypertext documents, wherein the similarity measure comprises a weighted sum of maximized individual components of the set of hypertext documents, and wherein the clustering is based upon words contained in each hypertext document, out-links from each hypertext document, and in-links to each hypertext document.

Type: Grant

Filed: October 18, 2000

Date of Patent: January 27, 2004

Assignee: International Business Machines Corporation

Inventors: Dharmendra Shantilal Modha, William Scott Spangler

Method for automatically finding frequently asked questions in a helpdesk data set

Publication number: 20030050908

Abstract: A system and method automatically identify candidate helpdesk problem categories that are most amenable to automated solutions. The system generates a dictionary wherein each word in the text data set is identified, and the number of documents containing these words is counted, and a corresponding count is generated. The documents are partitioned into clusters. For each generated cluster, the system sorts the dictionary terms in order of decreasing occurrence frequency. It then determines a search space by selecting the top dictionary terms as specified by a user defined depth of search. Next, the system chooses a set of terms from the search space as specified by a user-defined value indicating the desired level of detail.

Type: Application

Filed: August 22, 2001

Publication date: March 13, 2003

Applicant: International Business Machines Corporation

Inventors: Jeffrey Thomas Kreulen, Justin Thomas Lessler, Michael Ponce Sanchez, William Scott Spangler
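The first steps of the abstract above (count how many documents contain each word, then rank each cluster's terms by decreasing frequency and keep the top ones) can be sketched as follows. This is only an illustration, not the patented method: the clusters are assumed to be given already, frequency is counted within each cluster, tokenization is a plain whitespace split, and the function names and the `depth` parameter are hypothetical.

```python
from collections import Counter


def document_frequencies(docs):
    """Count, for each word, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # set() so a word repeated inside one document is counted once
        df.update(set(doc.lower().split()))
    return df


def cluster_search_space(cluster_docs, depth):
    """Rank a cluster's terms by decreasing document frequency, keep the top `depth`.

    Ties are broken alphabetically so the result is deterministic
    (an assumption of this sketch, not something the abstract specifies).
    """
    df = document_frequencies(cluster_docs)
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:depth]]


cluster = ["password reset failed", "reset password please", "password expired"]
top_terms = cluster_search_space(cluster, depth=2)
```

Here `depth` plays the role of the user-defined depth of search. The later step of choosing a term set for a desired level of detail is not sketched.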

Method and system for the routing of requests using an automated classification and profile matching in a networked environment

Patent number: 6510431

Abstract: A system and method for routing customer requests to advisors is disclosed. The system and method comprises at least one customer server process for receiving customer requests and classifying the information to produce a classified request, the classified request comprising the original request and at least one attribute. The system further comprises at least one advisor server process for receiving the classified requests, comparing the classified requests against associated profiles from the advisors to find attributes matching the classified request, and creating a connection between the requesting customer and at least one advisor, the at least one advisor having submitted a profile with matching attributes. A routing system in accordance with the present invention reduces response time to a problem and saves advisor time. The system also provides for an automatic response to frequent problems at increased efficiency.

Type: Grant

Filed: June 28, 1999

Date of Patent: January 21, 2003

Assignee: International Business Machines Corporation

Inventors: Matthias Eichstaedt, Jeffrey Thomas Kreulen, Vikas Krishna, William Scott Spangler, Hovey Raymond Strong, Jr.

Feature weighting in k-means clustering

Publication number: 20030005258

Abstract: A method and system are provided for integrating multiple feature spaces in a k-means clustering algorithm when analyzing data records having multiple, heterogeneous feature spaces. The method assigns different relative weights to these various feature spaces. Optimal feature weights are also determined that lead to a clustering that simultaneously minimizes the average intra-cluster dispersion and maximizes the average inter-cluster dispersion along all the feature spaces. Examples are provided that empirically demonstrate the effectiveness of feature weighting in clustering using two different feature domains.

Type: Application

Filed: March 22, 2001

Publication date: January 2, 2003

Inventors: Dharmendra Shantilal Modha, William Scott Spangler
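One way to picture the weighting idea in the abstract above is a k-means assignment step in which each record holds one vector per feature space and the distance to a centroid is a weighted sum of per-space distances. The sketch below is an illustration under assumptions: squared Euclidean distance in every feature space, weights supplied by the caller, and hypothetical function names. It does not show how optimal weights would be determined.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_distortion(record, centroid, weights):
    """Weighted sum of per-feature-space distances.

    `record` and `centroid` are tuples holding one vector per feature space;
    `weights` holds one non-negative weight per feature space.
    """
    return sum(w * sq_dist(r, c) for w, r, c in zip(weights, record, centroid))


def assign(records, centroids, weights):
    """Return, for each record, the index of the closest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda j: weighted_distortion(rec, centroids[j], weights))
        for rec in records
    ]


# One record with two feature spaces (a 2-d space and a 1-d space).
record = ([0.0, 0.0], [10.0])
centroids = [([1.0, 0.0], [0.0]), ([5.0, 0.0], [10.0])]

only_first_space = assign([record], centroids, weights=(1.0, 0.0))
balanced = assign([record], centroids, weights=(0.5, 0.5))
```

With all weight on the first feature space the record goes to centroid 0. With equal weights the second feature space dominates and it goes to centroid 1. This is why the choice of weights affects the resulting clustering.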

Method and apparatus for discovering knowledge gaps between problems and solutions in text databases

Publication number: 20020169783

Abstract: A method (and system) of determining a knowledge gap between a first database containing a set of problems records and a second database containing solutions documents, includes developing a set of clusters of the problems records of the first database, where each cluster has a centroid, developing a dictionary having entries based on the problems records in the first database, developing a vector space correlated to the solutions documents in the second database, where the vector space is based on the dictionary entries, developing a listing of distances between the cluster centroids and the vector space, and determining a knowledge gap for each cluster.

Type: Application

Filed: April 18, 2001

Publication date: November 14, 2002

Applicant: International Business Machines Corporation

Inventors: Jeffrey Thomas Kreulen, Michael A. Lamb, William Scott Spangler

Efficient storage mechanism for representing term occurrence in unstructured text documents

Publication number: 20020165884

Abstract: A method and structure converts a document corpus containing an ordered plurality of documents into a compact representation in memory of occurrence data, where the representation is to be based on a dictionary previously developed for the document corpus and where each term in the dictionary has associated therewith a corresponding unique integer. The method includes developing a first vector for the entire document corpus, the first vector being a sequential listing of the unique integers such that each document in the document corpus is sequentially represented in the listing according to the occurrence in the document of the corresponding dictionary terms. A second vector is also developed for the entire document corpus and indicates the location of each document's representation in the first vector.

Type: Application

Filed: May 4, 2001

Publication date: November 7, 2002

Applicant: International Business Machines Corporation

Inventors: Jeffrey Thomas Kreulen, William Scott Spangler
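The two-vector layout described in the abstract above resembles a compressed row layout: one flat list of term integers for the whole corpus, plus a list of starting positions, one per document. A minimal sketch under assumptions: whitespace tokenization, words missing from the dictionary are skipped, and a final end-of-data entry is appended to the second vector for convenient slicing. The function names are hypothetical and the end entry is not taken from the abstract.

```python
def build_occurrence_vectors(corpus, dictionary):
    """Build the flat term-id vector and the per-document offset vector.

    `corpus` is an ordered list of document strings; `dictionary` maps each
    term to its unique integer.
    """
    term_ids = []      # first vector: term integers, document after document
    doc_offsets = []   # second vector: where each document starts in term_ids
    for doc in corpus:
        doc_offsets.append(len(term_ids))
        for token in doc.split():
            term_id = dictionary.get(token)
            if term_id is not None:  # skip words outside the dictionary
                term_ids.append(term_id)
    doc_offsets.append(len(term_ids))  # end marker (assumption of this sketch)
    return term_ids, doc_offsets


def terms_for_document(term_ids, doc_offsets, i):
    """Recover the term integers of document i by slicing the first vector."""
    return term_ids[doc_offsets[i]:doc_offsets[i + 1]]


dictionary = {"disk": 0, "full": 1, "error": 2}
corpus = ["disk full error", "the disk", "error error"]
term_ids, doc_offsets = build_occurrence_vectors(corpus, dictionary)
```

For this corpus `term_ids` is `[0, 1, 2, 0, 2, 2]` and `doc_offsets` is `[0, 3, 4, 6]`, so each document costs one offset entry plus one integer per dictionary term occurrence.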

Method and system for identifying relationships between text documents and structured variables pertaining to the text documents

Publication number: 20020156810

Abstract: A method and system for identifying interesting relationships in text documents includes generating a dictionary of keywords in the text documents, forming categories of the text documents using the dictionary and an automated algorithm, counting occurrences of the structured variables, categories and structured variable/category combinations in the text documents, and calculating probabilities of occurrences of the structured variable/category combinations.

Type: Application

Filed: April 19, 2001

Publication date: October 24, 2002

Applicant: International Business Machines Corporation

Inventors: Karen Mae Holland, Jeffrey Thomas Kreulen, William Scott Spangler
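The counting step in the abstract above can be illustrated with a small co-occurrence table. In this sketch each document is assumed to be reduced to a pair of one structured value and one category. Comparing the observed pair count with the count expected if the two were independent is an assumption made here to show what an "interesting" combination might look like; it is not a statistic taken from the abstract. All names are hypothetical.

```python
from collections import Counter


def cooccurrence_stats(records):
    """Summarize structured-value/category combinations.

    `records` is a list of (structured_value, category) tuples, one per
    document. For each observed pair, report its count, its empirical joint
    probability, and the count expected under independence.
    """
    n = len(records)
    value_counts = Counter(value for value, _ in records)
    category_counts = Counter(category for _, category in records)
    pair_counts = Counter(records)

    stats = {}
    for (value, category), observed in pair_counts.items():
        expected = value_counts[value] * category_counts[category] / n
        stats[(value, category)] = {
            "observed": observed,
            "expected": expected,
            "p_joint": observed / n,
        }
    return stats


records = [
    ("laptop", "battery"),
    ("laptop", "battery"),
    ("laptop", "screen"),
    ("desktop", "screen"),
]
stats = cooccurrence_stats(records)
```

A pair whose observed count is well above its expected count, such as `("laptop", "battery")` here (2 observed against 1.5 expected), would be a candidate relationship to inspect.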

System and method for interactive classification and analysis of data

Patent number: 6424971

Abstract: A system, method, and computer program product for interactively classifying and analyzing data is particularly applicable to classification and analysis of textual data. It is particularly useful in identification of helpdesk inquiry and problem categories amenable to automated fulfillment or solution. A dictionary is generated based on a frequency of occurrence of words in a document set. A count of occurrences of each word in the dictionary within each document in the document set is generated. The set of documents is partitioned into a plurality of clusters. A name, a centroid, a cohesion score, and a distinctness score are generated for each cluster and displayed in a table. The documents contained in the clusters are sorted based on their similarity to other documents in the cluster.

Type: Grant

Filed: October 29, 1999

Date of Patent: July 23, 2002

Assignee: International Business Machines Corporation

Inventors: Jeffrey Thomas Kreulen, Dharmendra Shantilal Modha, William Scott Spangler, Hovey Raymond Strong, Jr.

Method and system for automatic comparison of text classifications

Patent number: 6397215

Abstract: A system and method for automatic generation of a comparison list given two different classifications, and automatic sorting of the list in order of similarity. A first dictionary is generated including a subset of words contained in a first document set, the first document set including at least one document and having an associated first classification including at least one class, each class having a class name. A second dictionary is generated including a subset of words contained in a second document set, the second document set including at least one document and having an associated second classification including at least one class, each class having a class name. A common dictionary including words that are common to both the first dictionary and the second dictionary is generated. A count of occurrences of each word in the common dictionary within each document in each document set is generated. A centroid of each class in the space of the common dictionary is generated.

Type: Grant

Filed: October 29, 1999

Date of Patent: May 28, 2002

Assignee: International Business Machines Corporation

Inventors: Jeffrey Thomas Kreulen, William Scott Spangler, Hovey Raymond Strong, Jr.

Surfaid predictor: web-based system for predicting surfer behavior

Patent number: 6338066

Abstract: Given a log of previous web-surfer behavior listing the order in which each surfer downloaded specific items at the web site, and given a meaningful classification of those same items, future surfer behavior is predicted by the present invention. The algorithm utilizes a quantitative model relating items downloaded prior to some specified event to items downloaded after that same event. When the model is applied to a new surfer's session prior to an analogous event, the present invention predicts the likely behavior of the surfer subsequent to that event. The predicted behavior is then further analyzed to derive a quantitative value for the utility of the expected behavior. By randomly selecting sample sessions from a web log, multiple models of surfer behavior can be generated. The multiple models can then be applied to a new surfer's session to produce a predicted behavior/utility distribution and thus a confidence interval for the predicted behavior/utility.

Type: Grant

Filed: September 25, 1998

Date of Patent: January 8, 2002

Assignee: International Business Machines Corporation

Inventors: David Charles Martin, Hansel Joseph Miranda, Mark Paul Plutowski, William Scott Spangler, Shivakumar Vaithyanathan, Kevin Wheeler, David Hilton Wolpert

Method and apparatus for cluster exploration and visualization

Patent number: 6100901

Abstract: A method and apparatus for visualizing a multi-dimensional data set in which the multi-dimensional data set is clustered into k clusters, with each cluster having a centroid. Then, either two distinct current centroids or three distinct non-collinear current centroids are selected. A current 2-dimensional cluster projection is generated based on the selected current centroids. In the case when two distinct current centroids are selected, two distinct target centroids are selected, with at least one of the two target centroids being different from the two current centroids. In the case when three distinct current centroids are selected, three distinct non-collinear target centroids are selected, with at least one of the three target centroids being different from the three current centroids.

Type: Grant

Filed: June 22, 1998

Date of Patent: August 8, 2000

Assignee: International Business Machines Corporation

Inventors: Dharmendra Shantilal Modha, David Charles Martin, William Scott Spangler, Shivakumar Vaithyanathan