Patents by Inventor W. Scott Spangler

W. Scott Spangler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Building Entity Relationship Networks from n-ary Relative Neighborhood Trees

Publication number: 20190138510

Abstract: Entities are objects with feature values that can be thought of as vectors in N-space, where N is the number of features. Similarity between any two entities can be calculated as a distance between the two entity vectors. A similarity network can be drawn between a set of entities based on connecting two entities that are relatively near to each other in N-space. Binary relative neighborhood trees are a special type of entity relationship network, designed to be useful in visualizing the entity space. They have the intuitively simple property that the more typical entities occur at the top of the tree and the more unusual entities occur at the leaf nodes. By limiting the number of links to n+1 per node (one parent, n children), a regularized flat tree structure is created that is much easier to visualize and navigate at both a coarse and a fine level by domain experts.
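
As an illustration only, the idea in this abstract can be sketched in a few lines of Python. This is not the patented construction: the function name build_nary_tree, the use of the centroid to rank "typicality", and the greedy attachment rule are all assumptions made for the sketch. It places the most typical entity at the root and attaches each remaining entity to its nearest already-placed entity that still has fewer than n children.

```python
import math

def build_nary_tree(entities, n=2):
    """Greedy sketch: entities maps a name to a feature vector (list of floats).

    Returns (root, children), where children maps each name to its child list.
    Assumption: "typical" means close to the centroid of all entity vectors.
    """
    names = list(entities)
    dims = len(entities[names[0]])
    centroid = [sum(entities[k][d] for k in names) / len(names) for d in range(dims)]

    # Most typical entities first, so they end up near the top of the tree.
    order = sorted(names, key=lambda k: math.dist(entities[k], centroid))

    root = order[0]
    children = {root: []}
    for name in order[1:]:
        # Only nodes with fewer than n children may accept another link.
        open_nodes = [p for p in children if len(children[p]) < n]
        nearest = min(open_nodes, key=lambda p: math.dist(entities[p], entities[name]))
        children[nearest].append(name)
        children[name] = []
    return root, children


entities = {"a": [0, 0], "b": [1, 0], "c": [-1, 0], "d": [5, 0], "e": [0, 1]}
root, children = build_nary_tree(entities, n=2)
print(root, children)
```

Each node ends up with at most one parent and at most n children, which is the n+1 link limit the abstract describes; the outlying entity "d" lands at a leaf.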

Type: Application

Filed: December 31, 2018

Publication date: May 9, 2019

Inventor: W. Scott Spangler

System and method for using text analytics to identify a set of related documents from a source document

Patent number: 9495349

Abstract: A system and method for processing a document to generate a set of related documents. A system is provided that includes a textual analytics system that analyzes unstructured data contained in a source document and extracts a set of structured information about the source document; and a compare system that identifies a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.

Type: Grant

Filed: November 17, 2005

Date of Patent: November 15, 2016

Assignee: International Business Machines Corporation

Inventors: Robert L. Angell, Stephen K. Boyer, James W. Cooper, Richard A. Hennessy, Tapas Kanungo, Jeffrey T. Kreulen, David C. Martin, James J. Rhodes, W. Scott Spangler, Herschel J. R. Weintraub

Semi-supervised data integration model for named entity classification

Patent number: 9292797

Abstract: According to one embodiment, a semi-supervised data integration model for named entity classification from a first repository of entity information in view of an auxiliary repository of classification assistance data is provided. Training data are compared to named entity candidates taken from the first repository to form a positive training seed set. A decision tree is populated and classification rules are created for classifying the named entity candidates. A number of entities are sampled from the named entity candidates. The sampled entities are labeled as positive examples and/or negative examples. The positive training seed set is updated to include identified commonality between the positive examples and the auxiliary repository. A negative training seed set is updated to include negative examples which lack commonality with the auxiliary repository. In view of both the updated positive and negative training seed sets, the decision tree and the classification rules are updated.

Type: Grant

Filed: December 14, 2012

Date of Patent: March 22, 2016

Assignee: International Business Machines Corporation

Inventors: Qi He, W. Scott Spangler

MINING STRONG RELEVANCE BETWEEN HETEROGENEOUS ENTITIES FROM THEIR CO-OCCURRENCES

Publication number: 20150332158

Abstract: Given two heterogeneous entities, the prevalence of text data provides rich co-occurrence information for them. However, co-occurrence alone is noisy: it may reflect an accidental mention, or merely domain-specific common words. Only strong relevance between entities, supported by rich relevance contexts in the data, can indicate meaningful entity relationships. Strong relevance between heterogeneous entities is mined from their co-occurrences. Drug-disease therapeutic relationships are used as an example to demonstrate an application of this work.

Type: Application

Filed: May 16, 2014

Publication date: November 19, 2015

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Qi He, Ming Ji, W. Scott Spangler

Building Entity Relationship Networks from n-ary Relative Neighborhood Trees

Publication number: 20150324481

Abstract: Entities are objects with feature values that can be thought of as vectors in N-space, where N is the number of features. Similarity between any two entities can be calculated as a distance between the two entity vectors. A similarity network can be drawn between a set of entities based on connecting two entities that are relatively near to each other in N-space. Binary relative neighborhood trees are a special type of entity relationship network, designed to be useful in visualizing the entity space. They have the intuitively simple property that the more typical entities occur at the top of the tree and the more unusual entities occur at the leaf nodes. By limiting the number of links to n+1 per node (one parent, n children), a regularized flat tree structure is created that is much easier to visualize and navigate at both a coarse and a fine level by domain experts.

Type: Application

Filed: May 6, 2014

Publication date: November 12, 2015

Applicant: International Business Machines Corporation

Inventor: W. Scott Spangler

Technology prediction

Patent number: 9183600

Abstract: Embodiments of the invention relate to technology prediction. A technical dictionary of technical terms is constructed based on a collection of documents. The technical terms are partitioned into equivalence classes. A table is generated that correlates technical terms across equivalence classes based on temporal co-occurrence of the technical terms across the equivalence classes. For a given technical term the table is accessed to determine a first set of technical terms that correlate to the given technical term. The table is accessed again to determine a second set of technical terms that correlate to the first set of technical terms. It is predicted that the second set of technical terms will correlate to the given technical term in the future.
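
The two table lookups in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the correlation table is assumed to be already built and is represented as a plain dict from a term to the set of terms it correlates with, and the function name predict_future_terms, the sample terms, and the choice to exclude terms that are already correlated are assumptions of the sketch.

```python
def predict_future_terms(table, term):
    """Two-hop lookup over a term correlation table (dict: term -> set of terms).

    The first lookup finds terms correlated with `term`; the second finds terms
    correlated with those. Assumption: terms already in the first set, and the
    term itself, are dropped, since they are not "future" correlations.
    """
    first_set = set(table.get(term, ()))
    second_set = set()
    for related in first_set:
        second_set |= set(table.get(related, ()))
    return second_set - first_set - {term}


# Hypothetical table contents, for illustration only.
table = {
    "lidar": {"point cloud"},
    "point cloud": {"lidar", "mesh reconstruction"},
}
print(predict_future_terms(table, "lidar"))
```

How the table itself is derived from temporal co-occurrence across equivalence classes is not specified in enough detail in the abstract to sketch here.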

Type: Grant

Filed: January 10, 2013

Date of Patent: November 10, 2015

Assignee: International Business Machines Corporation

Inventors: Ying Chen, Bin He, Qi He, Xin Jin, W. Scott Spangler

INFERRING BIOLOGICAL PATHWAYS FROM UNSTRUCTURED TEXT ANALYSIS

Publication number: 20150220680

Abstract: A biological pathway is a series of actions that take place in an organism that lead to some resulting pathology or otherwise change the organism state. In the cell, these actions typically take place between molecules called proteins. Proteins within the cell interact in ways that are not fully understood, but evidence concerning these interactions is constantly being collected and published by microbiologists. The disclosed method automatically infers such biological pathways between proteins by looking at the overall system of published literature about those proteins.

Type: Application

Filed: January 31, 2014

Publication date: August 6, 2015

Applicant: International Business Machines Corporation

Inventors: Stephen K. Boyer, Jeffrey T. Kreulen, W. Scott Spangler

TECHNOLOGY PREDICTION

Publication number: 20140195471

Abstract: Embodiments of the invention relate to technology prediction. A technical dictionary of technical terms is constructed based on a collection of documents. The technical terms are partitioned into equivalence classes. A table is generated that correlates technical terms across equivalence classes based on temporal co-occurrence of the technical terms across the equivalence classes. For a given technical term the table is accessed to determine a first set of technical terms that correlate to the given technical term. The table is accessed again to determine a second set of technical terms that correlate to the first set of technical terms. It is predicted that the second set of technical terms will correlate to the given technical term in the future.

Type: Application

Filed: January 10, 2013

Publication date: July 10, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ying Chen, Bin He, Qi He, Xin Jin, W. Scott Spangler

SEMI-SUPERVISED DATA INTEGRATION MODEL FOR NAMED ENTITY CLASSIFICATION

Publication number: 20140172754

Abstract: According to one embodiment, a semi-supervised data integration model for named entity classification from a first repository of entity information in view of an auxiliary repository of classification assistance data is provided. Training data are compared to named entity candidates taken from the first repository to form a positive training seed set. A decision tree is populated and classification rules are created for classifying the named entity candidates. A number of entities are sampled from the named entity candidates. The sampled entities are labeled as positive examples and/or negative examples. The positive training seed set is updated to include identified commonality between the positive examples and the auxiliary repository. A negative training seed set is updated to include negative examples which lack commonality with the auxiliary repository. In view of both the updated positive and negative training seed sets, the decision tree and the classification rules are updated.

Type: Application

Filed: December 14, 2012

Publication date: June 19, 2014

Applicant: International Business Machines Corporation

Inventors: Qi He, W. Scott Spangler

CLASSIFYING DOCUMENTS ACCORDING TO READERSHIP

Publication number: 20120226695

Abstract: A system for classifying documents in a collection of documents according to their intended readerships includes: a computer configured to select a document in the collection of documents; and a computer to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. A computer classifies the selected document as misleading, commercial, or personal according to its determined characteristic; and a computer repeats the steps of select document, determines a characteristic of the selected document, and classifies the selected document for additional documents in the collection. At least some documents are classified as misleading, some as commercial, and at least some as personal.

Type: Application

Filed: May 16, 2012

Publication date: September 6, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ying Chen, Bin He, W. Scott Spangler

CLASSIFYING DOCUMENTS ACCORDING TO READERSHIP

Publication number: 20110276553

Abstract: One embodiment is a computer-implemented method for classifying documents in a collection of documents according to their intended readerships. The method comprises using a computer to select a document in the collection of documents; and using a computer to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. The method further includes using a computer to classify the selected document as misleading, commercial, or personal according to its determined characteristic; and using a computer to repeat the steps of select document, determine a characteristic of the selected document, and classify the selected document for additional documents in the collection.

Type: Application

Filed: May 10, 2010

Publication date: November 10, 2011

Applicant: International Business Machines Corporation

Inventors: Ying Chen, Bin He, W. Scott Spangler

Method of monitoring electronic media

Patent number: 8010524

Abstract: Consumer-generated media (CGM) and/or other media are monitored to allow an organization to become aware of, and respond to, issues that may affect how it is perceived by the public. An extract, transform, load (ETL) engine is used to process CGM and other media content, and an analytical engine utilizes a multi-step progressive filtering approach to identify those documents that are most relevant. The filtering approach includes executing broad queries to extract relevant content from different CGM and other sources, extracting text snippets from the relevant content and performing de-duplication, defining organizational identity (e.g., brand name, trade name, or company name) and hot-topic models using a rule-based and statistical-based approach, and using the models together in an orthogonal filtering approach to effectively generate alerts and reports. The methodology is found to be substantially more effective compared to a conventional keyword based approach.

Type: Grant

Filed: October 29, 2007

Date of Patent: August 30, 2011

Assignee: International Business Machines Corporation

Inventors: Ying Chen, Amit Behal, Thomas D. Griffin, Larry L. Proctor, W. Scott Spangler

SYSTEMS AND METHODS FOR DETECTING SENTIMENT-BASED TOPICS

Publication number: 20110137906

Abstract: A method for analyzing sentiment comprising: collecting an object from an external content repository, the collected objects forming a content database; extracting a snippet related to the subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating a sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative or neutral; identifying topic words within the sentiment taxonomy; classifying the topic words as sentiment topic word candidates or non-sentiment topic word candidates; filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of the non-sentiment topic word for each of the sentiment categories; and ranking the topic word, wherein the rank is calculated by combining the frequency of the topic words in each of the categories with its importance.

Type: Application

Filed: December 9, 2009

Publication date: June 9, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES, INC.

Inventors: Keke Cai, Ying Chen, W. Scott Spangler, Li Zhang

METHOD OF MONITORING ELECTRONIC MEDIA

Publication number: 20090119275

Abstract: Consumer-generated media (CGM) and/or other media are monitored to allow an organization to become aware of, and respond to, issues that may affect how it is perceived by the public. An extract, transform, load (ETL) engine is used to process CGM and other media content, and an analytical engine utilizes a multi-step progressive filtering approach to identify those documents that are most relevant. The filtering approach includes executing broad queries to extract relevant content from different CGM and other sources, extracting text snippets from the relevant content and performing de-duplication, defining organizational identity (e.g., brand name, trade name, or company name) and hot-topic models using a rule-based and statistical-based approach, and using the models together in an orthogonal filtering approach to effectively generate alerts and reports. The methodology is found to be substantially more effective compared to a conventional keyword based approach.

Type: Application

Filed: October 29, 2007

Publication date: May 7, 2009

Applicant: International Business Machines Corporation

Inventors: Ying Chen, Amit Behal, Thomas D. Griffin, Larry L. Proctor, W. Scott Spangler

Interactive process for recognition and evaluation of a partial search query and display of interactive results

Patent number: 6862713

Abstract: A method for presenting to an end-user the intermediate matching search results of a keyword search in an indexed list of information. The method comprises the steps of: coupling to a search engine a graphical user interface for accepting keyword search terms for searching the indexed list of information with the search engine; receiving one or more keyword search terms with one or more separation characters separating therebetween; performing a keyword search with the one or more keyword search terms received when a separation character is received; and presenting the number of documents matching the keyword search terms to the end-user, and presenting a graphical menu item on a display. In accordance with another embodiment of the present invention, an information processing system and computer readable storage medium carry out the above method.
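
The core behavior in this abstract, running a search each time a separation character arrives and reporting the match count, can be sketched as below. This is an illustrative sketch, not the patented system: the class name IncrementalSearch, the in-memory index (a dict from document id to a set of words), the choice of space and comma as separators, and the rule that a document must contain every term are all assumptions.

```python
class IncrementalSearch:
    """Counts matching documents each time a separator character is typed."""

    def __init__(self, index, separators=" ,"):
        self.index = index            # assumed shape: {doc_id: set of lowercase words}
        self.separators = separators  # assumed separator characters
        self.buffer = ""              # characters of the term being typed
        self.terms = []               # completed keyword terms so far

    def type_char(self, ch):
        """Feed one typed character.

        Returns the number of matching documents when a separator completes
        the query so far, otherwise None.
        """
        if ch not in self.separators:
            self.buffer += ch
            return None
        if self.buffer:
            self.terms.append(self.buffer.lower())
            self.buffer = ""
        if not self.terms:
            return None
        # Assumption: a document matches only if it contains every term.
        return sum(
            1 for words in self.index.values()
            if all(term in words for term in self.terms)
        )


index = {1: {"patent", "search"}, 2: {"patent", "tree"}, 3: {"search"}}
session = IncrementalSearch(index)
counts = [session.type_char(ch) for ch in "patent search "]
print([c for c in counts if c is not None])  # prints [2, 1]
```

A real interface would present the count and a menu item to the user at each step; here the counts are simply returned to the caller.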

Type: Grant

Filed: August 31, 1999

Date of Patent: March 1, 2005

Assignee: International Business Machines Corporation

Inventors: Reiner Kraft, W. Scott Spangler

Method and system for knowledge repository exploration and visualization

Patent number: 6725217

Abstract: A computing system and method explores a knowledge repository by accepting a natural language query from a user, determining a distance between the query and every category in every partitioning of the knowledge repository, and displaying a radial graph (322) of the nearest categories. Further, in response to a user selecting a category, visually displaying matching elements in the category along with its nearest neighbor categories in a scatter plot (324).
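
The distance step in this abstract can be sketched as follows. This is an assumption-laden illustration, not the patented method: the abstract does not name a distance measure or a document representation, so the sketch uses cosine distance between bag-of-words counts, and the names cosine_distance, nearest_categories, and the sample partitionings are hypothetical. The radial graph and scatter plot displays are not covered.

```python
import math
from collections import Counter

def cosine_distance(a, b):
    """1 minus the cosine similarity of two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def nearest_categories(query, partitionings, k=3):
    """Score the query against every category in every partitioning.

    partitionings: {partitioning name: {category name: [document text, ...]}}
    Returns the k closest (distance, partitioning, category) tuples.
    """
    query_vec = Counter(query.lower().split())
    scored = []
    for part_name, categories in partitionings.items():
        for cat_name, docs in categories.items():
            # Assumption: a category is summarized by its summed word counts.
            centroid = Counter()
            for doc in docs:
                centroid.update(doc.lower().split())
            scored.append((cosine_distance(query_vec, centroid), part_name, cat_name))
    return sorted(scored)[:k]


partitionings = {
    "product": {
        "printers": ["printer jam paper", "printer toner low"],
        "laptops": ["laptop battery drain"],
    },
    "issue": {
        "power": ["battery will not charge"],
        "paper": ["paper jam in tray"],
    },
}
for distance, part, cat in nearest_categories("paper jam", partitionings, k=2):
    print(f"{part}/{cat}: {distance:.3f}")
```

Because every partitioning is scanned, one query can surface its nearest categories from several different ways of dividing the same repository.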

Type: Grant

Filed: June 20, 2001

Date of Patent: April 20, 2004

Assignee: International Business Machines Corporation

Inventors: Amy W. Chow, Jeffrey T. Kreulen, Justin T. Lessler, Larry L. Proctor, W. Scott Spangler

Method and system for knowledge repository exploration and visualization

Publication number: 20030004932

Abstract: A computing system and method explores a knowledge repository by accepting a natural language query from a user, determining a distance between the query and every category in every partitioning of the knowledge repository, and displaying a radial graph (322) of the nearest categories. Further, in response to a user selecting a category, visually displaying matching elements in the category along with its nearest neighbor categories in a scatter plot (324).

Type: Application

Filed: June 20, 2001

Publication date: January 2, 2003

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Amy W. Chow, Jeffrey T. Kreulen, Justin T. Lessler, Larry L. Proctor, W. Scott Spangler