Patents by Inventor Steven Minton

Steven Minton has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Textual information extraction, parsing, and inferential analysis

Patent number: 10043135

Abstract: Textual information extraction, parsing, and inferential analysis systems and methods are provided herein. An example method includes extracting content for each of a plurality of types from a corpus of textual information, the plurality of types corresponding to segments of an inference scheme, the inference scheme including a dependency that orders the segments together so as to create a summation of the corpus of textual information when the extracted content is assembled, and assembling one or more inferred statements using the inference scheme and the extracted content.

Type: Grant

Filed: January 31, 2017

Date of Patent: August 7, 2018

Assignee: InferLink Corporation

Inventors: Matthew Michelson, Steven Minton

Textual Information Extraction, Parsing, and Inferential Analysis

Publication number: 20170262430

Abstract: Textual information extraction, parsing, and inferential analysis systems and methods are provided herein. An example method includes extracting content for each of a plurality of types from a corpus of textual information, the plurality of types corresponding to segments of an inference scheme, the inference scheme including a dependency that orders the segments together so as to create a summation of the corpus of textual information when the extracted content is assembled, and assembling one or more inferred statements using the inference scheme and the extracted content.

Type: Application

Filed: January 31, 2017

Publication date: September 14, 2017

Inventors: Matthew Michelson, Steven Minton

SYSTEM AND METHOD FOR MANAGING ENTITY KNOWLEDGEBASES

Publication number: 20090319515

Abstract: Systems and methods are presented for building comprehensive entity knowledgebases that can consolidate multiple linked references to the same entity. The resulting virtual repository can be efficiently queried. An incoming record is clustered into entities, which are collections of attributes. The system can determine the entity that most closely matches an incoming record. Coarse-grain representations (blocking) may be used initially to select a set of the most closely-matching entities, and then fine-grain representations (linkage) may be used. Coarse-grain and fine-grain match probabilities may be integrated to obtain integrated match probabilities between the record and each of the closest-matching entities. Entities are updated, including creating a new entity, merging two or more entities into one, dividing one entity, and making no change in the entities, after which the record is entered into the appropriate entity or entities. Embodiments support both free-form querying and document matching.

Type: Application

Filed: June 1, 2009

Publication date: December 24, 2009

Inventors: Steven Minton, Evan Gamble, Greg Barish, Kane See
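The coarse-then-fine matching flow described in the abstract above can be illustrated with a minimal Python sketch. It is not the method claimed in the application: the blocking key, the string-similarity measure, the threshold, and all function names are assumptions chosen for illustration. Blocking is used here as a hard filter, and the integration of coarse-grain and fine-grain match probabilities is not modeled.

```python
from difflib import SequenceMatcher


def blocking_key(name):
    """Coarse-grain representation (assumed): first 3 letters of the last token."""
    tokens = name.lower().split()
    return tokens[-1][:3] if tokens else ""


def fine_similarity(a, b):
    """Fine-grain comparison (assumed): character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_entity(record, entities, threshold=0.8):
    """Return the id of the closest-matching entity, or None if none is close enough."""
    key = blocking_key(record)
    # Blocking: keep only entities that share the coarse key with the record.
    candidates = {
        eid: names
        for eid, names in entities.items()
        if any(blocking_key(n) == key for n in names)
    }
    best_id, best_score = None, 0.0
    # Linkage: score the surviving candidates with the finer comparison.
    for eid, names in candidates.items():
        score = max(fine_similarity(record, n) for n in names)
        if score > best_score:
            best_id, best_score = eid, score
    return best_id if best_score >= threshold else None


def add_record(record, entities, threshold=0.8):
    """Enter the record into its best entity, creating a new entity if needed."""
    eid = best_entity(record, entities, threshold)
    if eid is None:
        eid = len(entities)  # ids are 0..n-1 because entities are never deleted here
        entities[eid] = []
    entities[eid].append(record)
    return eid
```

Starting from an empty dictionary, adding "Steven Minton" and then "Steve Minton" places both references in the same entity, while "Craig Knoblock" starts a new one. Merging and dividing entities, which the abstract also lists, are left out of this sketch.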

METHOD AND SYSTEM FOR AUTOMATICALLY EXTRACTING DATA FROM WEB SITES

Publication number: 20080114800

Abstract: In accordance with an embodiment, data may be automatically extracted from semi-structured web sites. Unsupervised learning may be used to analyze web sites and discover their structure. One method utilizes a set of heterogeneous “experts,” each expert being capable of identifying certain types of generic structure. Each expert represents its discoveries as “hints.” Based on these hints, the system may cluster the pages and text segments and identify semi-structured data that can be extracted. To identify a good clustering, a probabilistic model of the hint-generation process may be used.

Type: Application

Filed: January 15, 2008

Publication date: May 15, 2008

Applicant: FETCH TECHNOLOGIES, INC.

Inventors: Bora GAZEN, Steven MINTON

Client-centric information extraction system for an information network

Publication number: 20050165789

Abstract: A client-centric online navigation architecture extracts relevant data from documents as a user interacts with an information network, proposes related information services based on the types of data and data values extracted from the currently viewed document, and presents a menu of related information. A browser plug-in extracts data from a web page as a user browses the Internet, and provides additional services to the web user as he browses. Data extraction wrappers created by a developer are distributed to the client machines. The wrapper-supported information extraction process occurs apart from the content server, e.g., on the client machine or a proxy server. Extracted data can trigger the launching of services, called “hyperservices,” either on the local machine or remote machines.

Type: Application

Filed: December 22, 2004

Publication date: July 28, 2005

Inventors: Steven Minton, Bryan Pelz

Learning data prototypes for information extraction

Patent number: 6714941

Abstract: A method for determining statistically significant token sequences lends itself to use in recognizing broken wrappers as well as constructing new wrapper rules. When new wrapper rules are needed because the underlying wrapped data has changed, training examples are used to recognize data rule candidates, which are culled with a bias toward candidates that are more likely to succeed. The resulting rule candidate set is clustered according to feature characteristics, then compared to the training examples. Those rule candidates most similar to the training examples are used to create new wrapper rules.

Type: Grant

Filed: July 19, 2000

Date of Patent: March 30, 2004

Assignee: University of Southern California

Inventors: Kristina Lerman, Steven Minton
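As a rough illustration of how learned token patterns could flag a broken wrapper, the Python sketch below learns the token-type sequences that commonly begin known-good examples and then checks newly extracted values against them. This is an assumption-laden simplification, not the patented method: the token types, the simple frequency cutoff (standing in for a statistical significance test), the thresholds, and all function names are hypothetical.

```python
import re
from collections import Counter


def token_type(token):
    """Map a token to a coarse syntactic type (assumed type set)."""
    if token.isdigit():
        return "NUM"
    if token.isalpha():
        return "CAP" if token[0].isupper() else "LOWER"
    if token.isalnum():
        return "ALNUM"
    return "PUNCT"


def pattern(text, length=2):
    """Type sequence of the first `length` tokens of a field value."""
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return tuple(token_type(t) for t in tokens[:length])


def learn_patterns(examples, min_support=0.3):
    """Keep starting patterns seen in at least `min_support` of the examples.

    A plain frequency cutoff is used here in place of a significance test.
    """
    counts = Counter(pattern(e) for e in examples)
    return {p for p, c in counts.items() if c / len(examples) >= min_support}


def looks_broken(extracted, patterns, min_match=0.5):
    """Flag the wrapper when too few extracted values fit the learned patterns."""
    if not extracted:
        return True
    matched = sum(pattern(e) in patterns for e in extracted)
    return matched / len(extracted) < min_match
```

With street addresses such as "4676 Admiralty Way" as training examples, the learned pattern is a number followed by a capitalized word. Values like "Click here" would then fail the check, suggesting that the page layout changed and the wrapper needs new rules.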

Wrapper induction by hierarchical data analysis

Patent number: 6606625

Abstract: An inductive algorithm, denominated STALKER, generates high-accuracy extraction rules based on user-labeled training examples. With the tremendous amount of information that becomes available on the Web on a daily basis, the ability to quickly develop information agents has become a crucial problem. A vital component of any Web-based information agent is a set of wrappers that can extract the relevant data from semistructured information sources. The novel approach to wrapper induction provided herein is based on the idea of hierarchical information extraction, which turns the hard problem of extracting data from an arbitrarily complex document into a series of easier extraction tasks.

Type: Grant

Filed: June 2, 2000

Date of Patent: August 12, 2003

Assignee: University of Southern California

Inventors: Ion Muslea, Steven Minton, Craig A. Knoblock
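
The idea of hierarchical extraction in the abstract above, splitting one hard extraction into a series of easier ones, can be sketched in Python as follows. The rule format (a sequence of landmark strings to skip past), the sample markup in the usage note, and all function names are assumptions made for illustration. The sketch applies hand-written rules only and does not show how rules are induced from labeled examples.

```python
def skip_to(text, landmarks, start=0):
    """Return the position just past each landmark in turn, or -1 if one is missing."""
    pos = start
    for landmark in landmarks:
        idx = text.find(landmark, pos)
        if idx == -1:
            return -1
        pos = idx + len(landmark)
    return pos


def extract(text, start_rule, end_landmark=None):
    """Extract one field: skip past the start landmarks, stop at the end landmark."""
    begin = skip_to(text, start_rule)
    if begin == -1:
        return None
    if end_landmark is None:
        return text[begin:]  # field runs to the end of the enclosing item
    end = text.find(end_landmark, begin)
    return text[begin:end] if end != -1 else None


def extract_all(text, start_rule, end_landmark):
    """Extract every item delimited by the start rule and the end landmark."""
    items, pos = [], 0
    while True:
        begin = skip_to(text, start_rule, pos)
        if begin == -1:
            break
        end = text.find(end_landmark, begin)
        if end == -1:
            break
        items.append(text[begin:end])
        pos = end + len(end_landmark)
    return items


def extract_records(page):
    """Two-level extraction: first the list items, then the fields inside each item."""
    records = []
    for item in extract_all(page, ["<li>"], "</li>"):
        records.append({
            "name": extract(item, ["<b>"], "</b>"),
            "phone": extract(item, ["</b>", "Phone: "]),
        })
    return records
```

For a hypothetical page such as `<ul><li><b>Blue Cup</b> Phone: 555-0101</li></ul>`, `extract_records` first isolates the list item and then pulls the name and phone number out of that smaller string, so each rule only has to cope with a short, regular fragment rather than the whole document.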
