Patents by Inventor W. Scott Spangler
W. Scott Spangler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20190138510
Abstract: Entities are objects with feature values that can be thought of as vectors in N-space, where N is the number of features. Similarity between any two entities can be calculated as a distance between the two entity vectors. A similarity network can be drawn between a set of entities based on connecting two entities that are relatively near to each other in N-space. Binary relative neighborhood trees are a special type of entity relationship network, designed to be useful in visualizing the entity space. They have the intuitively simple property that the more typical entities occur at the top of the tree and the more unusual entities occur at the leaf nodes. By limiting the number of links to n+1 per node (one parent, n children), a regularized flat tree structure is created that is much easier to visualize and navigate at both a coarse and a fine level by domain experts.
Type: Application
Filed: December 31, 2018
Publication date: May 9, 2019
Inventor: W. Scott Spangler
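The abstract above describes the idea only at a high level, so the following is a minimal sketch under stated assumptions rather than the patented method. It assumes Euclidean distance as the similarity measure, treats the entity closest to the centroid as the "most typical" one, and attaches each remaining entity to its nearest already-placed node that still has room for a child. The names euclidean and build_tree are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_tree(entities, max_children=2):
    """Return a dict mapping each entity name to its parent (root maps to None).

    entities: dict of name -> feature vector (list of numbers).
    Assumption: "typical" means close to the centroid of all entities.
    """
    names = list(entities)
    dims = len(entities[names[0]])
    centroid = [sum(entities[n][d] for n in names) / len(names) for d in range(dims)]

    # Place the most typical entities first so they end up near the top.
    order = sorted(names, key=lambda n: euclidean(entities[n], centroid))

    parent = {order[0]: None}
    child_count = {order[0]: 0}
    for name in order[1:]:
        # Only nodes with a free child slot may accept a new child,
        # which caps every node at one parent plus max_children children.
        open_nodes = [p for p in parent if child_count[p] < max_children]
        best = min(open_nodes, key=lambda p: euclidean(entities[name], entities[p]))
        parent[name] = best
        child_count[best] += 1
        child_count[name] = 0
    return parent


if __name__ == "__main__":
    sample = {"a": [0, 0], "b": [1, 0], "c": [-1, 0], "d": [5, 0]}
    print(build_tree(sample))
```

With max_children=2 this yields a binary tree; the outlying entity in the sample ends up as a leaf, which matches the property the abstract describes.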
-
Patent number: 9495349
Abstract: A system and method for processing a document to generate a set of related documents. A system is provided that includes a textual analytics system that analyzes unstructured data contained in a source document and extracts a set of structured information about the source document; and a compare system that identifies a set of related documents by comparing the set of structured information with metadata indexed from a set of publications.
Type: Grant
Filed: November 17, 2005
Date of Patent: November 15, 2016
Assignee: International Business Machines Corporation
Inventors: Robert L. Angell, Stephen K. Boyer, James W. Cooper, Richard A. Hennessy, Tapas Kanungo, Jeffrey T. Kreulen, David C. Martin, James J. Rhodes, W. Scott Spangler, Herschel J. R. Weintraub
-
Patent number: 9292797
Abstract: According to one embodiment, a semi-supervised data integration model for named entity classification from a first repository of entity information in view of an auxiliary repository of classification assistance data is provided. Training data are compared to named entity candidates taken from the first repository to form a positive training seed set. A decision tree is populated and classification rules are created for classifying the named entity candidates. A number of entities are sampled from the named entity candidates. The sampled entities are labeled as positive examples and/or negative examples. The positive training seed set is updated to include identified commonality between the positive examples and the auxiliary repository. A negative training seed set is updated to include negative examples which lack commonality with the auxiliary repository. In view of both the updated positive and negative training seed sets, the decision tree and the classification rules are updated.
Type: Grant
Filed: December 14, 2012
Date of Patent: March 22, 2016
Assignee: International Business Machines Corporation
Inventors: Qi He, W. Scott Spangler
-
Publication number: 20150332158
Abstract: Given two heterogeneous entities, the prevalence of text data provides rich co-occurrence information for them. However, co-occurrence alone is noisy: it may reflect nothing more than an accidental mention, or it may merely reflect domain-specific common words. Only strong relevance between entities, supported by rich relevance contexts in the data, can indicate meaningful entity relationships. Strong relevance between heterogeneous entities is mined from their co-occurrences. Drug-disease therapeutic relationships are used as the example to demonstrate an application of this work.
Type: Application
Filed: May 16, 2014
Publication date: November 19, 2015
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Qi He, Ming Ji, W. Scott Spangler
-
Publication number: 20150324481
Abstract: Entities are objects with feature values that can be thought of as vectors in N-space, where N is the number of features. Similarity between any two entities can be calculated as a distance between the two entity vectors. A similarity network can be drawn between a set of entities based on connecting two entities that are relatively near to each other in N-space. Binary relative neighborhood trees are a special type of entity relationship network, designed to be useful in visualizing the entity space. They have the intuitively simple property that the more typical entities occur at the top of the tree and the more unusual entities occur at the leaf nodes. By limiting the number of links to n+1 per node (one parent, n children), a regularized flat tree structure is created that is much easier to visualize and navigate at both a coarse and a fine level by domain experts.
Type: Application
Filed: May 6, 2014
Publication date: November 12, 2015
Applicant: International Business Machines Corporation
Inventor: W. Scott Spangler
-
Patent number: 9183600
Abstract: Embodiments of the invention relate to technology prediction. A technical dictionary of technical terms is constructed based on a collection of documents. The technical terms are partitioned into equivalence classes. A table is generated that correlates technical terms across equivalence classes based on temporal co-occurrence of the technical terms across the equivalence classes. For a given technical term the table is accessed to determine a first set of technical terms that correlate to the given technical term. The table is accessed again to determine a second set of technical terms that correlate to the first set of technical terms. It is predicted that the second set of technical terms will correlate to the given technical term in the future.
Type: Grant
Filed: January 10, 2013
Date of Patent: November 10, 2015
Assignee: International Business Machines Corporation
Inventors: Ying Chen, Bin He, Qi He, Xin Jin, W. Scott Spangler
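The two table lookups in the abstract above can be illustrated with a short sketch. It assumes the correlation table already exists as a plain dictionary mapping each term to the set of terms it correlates with; how that table is built from temporal co-occurrence is not shown here. Excluding terms that already correlate with the given term is an assumption of this sketch, and the function name predict_future_terms and the sample terms are hypothetical.

```python
def predict_future_terms(table, term):
    """Return terms reachable in two lookups that are not yet correlated with term.

    table: dict of term -> set of correlated terms (assumed to be precomputed).
    """
    # First lookup: terms that currently correlate with the given term.
    first = set(table.get(term, ()))

    # Second lookup: terms that correlate with any term in the first set.
    second = set()
    for neighbor in first:
        second |= set(table.get(neighbor, ()))

    # Assumption: only report terms that are new relative to the first set.
    return second - first - {term}


if __name__ == "__main__":
    table = {
        "battery": {"lithium"},
        "lithium": {"battery", "solid electrolyte"},
        "solid electrolyte": {"lithium"},
    }
    print(predict_future_terms(table, "battery"))
```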
-
Publication number: 20150220680
Abstract: A biological pathway is a series of actions that take place in an organism that lead to some resulting pathology or otherwise change the organism state. In the cell, these actions typically take place between molecules called proteins. Proteins within the cell interact in ways that are not fully understood, but evidence concerning these interactions is constantly being collected and published by microbiologists. The disclosed method automatically infers such biological pathways between proteins by looking at the overall system of published literature about those proteins.
Type: Application
Filed: January 31, 2014
Publication date: August 6, 2015
Applicant: International Business Machines Corporation
Inventors: Stephen K. Boyer, Jeffrey T. Kreulen, W. Scott Spangler
-
Publication number: 20140195471
Abstract: Embodiments of the invention relate to technology prediction. A technical dictionary of technical terms is constructed based on a collection of documents. The technical terms are partitioned into equivalence classes. A table is generated that correlates technical terms across equivalence classes based on temporal co-occurrence of the technical terms across the equivalence classes. For a given technical term the table is accessed to determine a first set of technical terms that correlate to the given technical term. The table is accessed again to determine a second set of technical terms that correlate to the first set of technical terms. It is predicted that the second set of technical terms will correlate to the given technical term in the future.
Type: Application
Filed: January 10, 2013
Publication date: July 10, 2014
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ying Chen, Bin He, Qi He, Xin Jin, W. Scott Spangler
-
Publication number: 20140172754
Abstract: According to one embodiment, a semi-supervised data integration model for named entity classification from a first repository of entity information in view of an auxiliary repository of classification assistance data is provided. Training data are compared to named entity candidates taken from the first repository to form a positive training seed set. A decision tree is populated and classification rules are created for classifying the named entity candidates. A number of entities are sampled from the named entity candidates. The sampled entities are labeled as positive examples and/or negative examples. The positive training seed set is updated to include identified commonality between the positive examples and the auxiliary repository. A negative training seed set is updated to include negative examples which lack commonality with the auxiliary repository. In view of both the updated positive and negative training seed sets, the decision tree and the classification rules are updated.
Type: Application
Filed: December 14, 2012
Publication date: June 19, 2014
Applicant: International Business Machines Corporation
Inventors: Qi He, W. Scott Spangler
-
Publication number: 20120226695
Abstract: A system for classifying documents in a collection of documents according to their intended readerships includes: a computer configured to select a document in the collection of documents; and a computer to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. A computer classifies the selected document as misleading, commercial, or personal according to its determined characteristic; and a computer repeats the steps of selecting a document, determining a characteristic of the selected document, and classifying the selected document for additional documents in the collection. At least some documents are classified as misleading, some as commercial, and at least some as personal.
Type: Application
Filed: May 16, 2012
Publication date: September 6, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ying Chen, Bin He, W. Scott Spangler
-
Publication number: 20110276553
Abstract: One embodiment is a computer-implemented method for classifying documents in a collection of documents according to their intended readerships. The method comprises using a computer to select a document in the collection of documents; and using a computer to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. The method further includes using a computer to classify the selected document as misleading, commercial, or personal according to its determined characteristic; and using a computer to repeat the steps of selecting a document, determining a characteristic of the selected document, and classifying the selected document for additional documents in the collection.
Type: Application
Filed: May 10, 2010
Publication date: November 10, 2011
Applicant: International Business Machines Corporation
Inventors: Ying Chen, Bin He, W. Scott Spangler
-
Patent number: 8010524
Abstract: Consumer-generated media (CGM) and/or other media are monitored to allow an organization to become aware of, and respond to, issues that may affect how it is perceived by the public. An extract, transform, load (ETL) engine is used to process CGM and other media content, and an analytical engine utilizes a multi-step progressive filtering approach to identify those documents that are most relevant. The filtering approach includes executing broad queries to extract relevant content from different CGM and other sources, extracting text snippets from the relevant content and performing de-duplication, defining organizational identity (e.g., brand name, trade name, or company name) and hot-topic models using a rule-based and statistical-based approach, and using the models together in an orthogonal filtering approach to effectively generate alerts and reports. The methodology is found to be substantially more effective compared to a conventional keyword based approach.
Type: Grant
Filed: October 29, 2007
Date of Patent: August 30, 2011
Assignee: International Business Machines Corporation
Inventors: Ying Chen, Amit Behal, Thomas D. Griffin, Larry L. Proctor, W. Scott Spangler
-
Publication number: 20110137906
Abstract: A method for analyzing sentiment comprising: collecting an object from an external content repository, the collected objects forming a content database; extracting a snippet related to the subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating a sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative or neutral; identifying topic words within the sentiment taxonomy; classifying the topic words as sentiment topic word candidates or non-sentiment topic word candidates; filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of the non-sentiment topic word for each of the sentiment categories; and ranking the topic words, wherein the rank is calculated by combining the frequency of the topic words in each of the categories with their importance.
Type: Application
Filed: December 9, 2009
Publication date: June 9, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES, INC.
Inventors: Keke Cai, Ying Chen, W. Scott Spangler, LI Zhang
-
Publication number: 20090119275
Abstract: Consumer-generated media (CGM) and/or other media are monitored to allow an organization to become aware of, and respond to, issues that may affect how it is perceived by the public. An extract, transform, load (ETL) engine is used to process CGM and other media content, and an analytical engine utilizes a multi-step progressive filtering approach to identify those documents that are most relevant. The filtering approach includes executing broad queries to extract relevant content from different CGM and other sources, extracting text snippets from the relevant content and performing de-duplication, defining organizational identity (e.g., brand name, trade name, or company name) and hot-topic models using a rule-based and statistical-based approach, and using the models together in an orthogonal filtering approach to effectively generate alerts and reports. The methodology is found to be substantially more effective compared to a conventional keyword based approach.
Type: Application
Filed: October 29, 2007
Publication date: May 7, 2009
Applicant: International Business Machines Corporation
Inventors: Ying Chen, Amit Behal, Thomas D. Griffin, Larry L. Proctor, W. Scott Spangler
-
Patent number: 6862713
Abstract: A method for presenting to an end-user the intermediate matching search results of a keyword search in an indexed list of information. The method comprises the steps of: coupling to a search engine a graphical user interface for accepting keyword search terms for searching the indexed list of information with the search engine; receiving one or more keyword search terms with one or more separation characters between them; performing a keyword search with the one or more keyword search terms received when a separation character is received; and presenting the number of documents matching the keyword search terms to the end-user, and presenting a graphical menu item on a display. In accordance with another embodiment of the present invention, an information processing system and computer readable storage medium carries out the above method.
Type: Grant
Filed: August 31, 1999
Date of Patent: March 1, 2005
Assignee: International Business Machines Corporation
Inventors: Reiner Kraft, W. Scott Spangler
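The step of running a search each time a separation character arrives can be sketched as follows. This is an illustration under assumptions, not the patented implementation: it treats space and comma as the separation characters, uses a small in-memory inverted index in place of a search engine, requires all terms to match, and omits the graphical interface. The class name IncrementalSearch and its feed method are hypothetical.

```python
import re

# Assumption: space and comma act as the separation characters.
SEPARATORS = " ,"


class IncrementalSearch:
    """Counts matching documents each time a separation character is typed."""

    def __init__(self, documents):
        # Build a simple inverted index: word -> set of document ids.
        self.index = {}
        for doc_id, text in documents.items():
            for word in text.lower().split():
                self.index.setdefault(word, set()).add(doc_id)
        self.buffer = ""

    def feed(self, char):
        """Accept one typed character; return a match count on a separator, else None."""
        self.buffer += char
        if char not in SEPARATORS:
            return None
        terms = [t for t in re.split(r"[ ,]+", self.buffer.lower()) if t]
        if not terms:
            return None
        # Assumption: a document matches only if it contains every term.
        matches = set.intersection(*(self.index.get(t, set()) for t in terms))
        return len(matches)


if __name__ == "__main__":
    docs = {1: "fast keyword search", 2: "keyword index", 3: "graph search"}
    search = IncrementalSearch(docs)
    for ch in "keyword search ":
        count = search.feed(ch)
        if count is not None:
            print(f"{search.buffer!r}: {count} matching documents")
```

Returning None for ordinary characters keeps the caller from refreshing the displayed count until a term is complete.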
-
Patent number: 6725217
Abstract: A computing system and method explores a knowledge repository by accepting a natural language query from a user, determining a distance between the query and every category in every partitioning of the knowledge repository, and displaying a radial graph (322) of the nearest categories. Further, in response to a user selecting a category, visually displaying matching elements in the category along with its nearest neighbor categories in a scatter plot (324).
Type: Grant
Filed: June 20, 2001
Date of Patent: April 20, 2004
Assignee: International Business Machines Corporation
Inventors: Amy W. Chow, Jeffrey T. Kreulen, Justin T. Lessler, Larry L. Proctor, W. Scott Spangler
-
Publication number: 20030004932
Abstract: A computing system and method explores a knowledge repository by accepting a natural language query from a user, determining a distance between the query and every category in every partitioning of the knowledge repository, and displaying a radial graph (322) of the nearest categories. Further, in response to a user selecting a category, visually displaying matching elements in the category along with its nearest neighbor categories in a scatter plot (324).
Type: Application
Filed: June 20, 2001
Publication date: January 2, 2003
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Amy W. Chow, Jeffrey T. Kreulen, Justin T. Lessler, Larry L. Proctor, W. Scott Spangler