Patents Assigned to Fetch Technologies, Inc.
  • Patent number: 8117203
    Abstract: In accordance with an embodiment, data may be automatically extracted from semi-structured web sites. Unsupervised learning may be used to analyze web sites and discover their structure. One method utilizes a set of heterogeneous “experts,” each expert being capable of identifying certain types of generic structure. Each expert represents its discoveries as “hints.” Based on these hints, the system may cluster the pages and text segments and identify semi-structured data that can be extracted. To identify a good clustering, a probabilistic model of the hint-generation process may be used.
    Type: Grant
    Filed: January 15, 2008
    Date of Patent: February 14, 2012
    Assignee: Fetch Technologies, Inc.
    Inventors: Bora C. Gazen, Steven N. Minton
  • Publication number: 20110282877
    Abstract: In accordance with an embodiment, data may be automatically extracted from semi-structured web sites. Unsupervised learning may be used to analyze web sites and discover their structure. One method utilizes a set of heterogeneous “experts,” each expert being capable of identifying certain types of generic structure. Each expert represents its discoveries as “hints.” Based on these hints, the system may cluster the pages and text segments and identify semi-structured data that can be extracted. To identify a good clustering, a probabilistic model of the hint-generation process may be used.
    Type: Application
    Filed: July 26, 2011
    Publication date: November 17, 2011
    Applicant: Fetch Technologies, Inc.
    Inventors: Bora C. Gazen, Steven N. Minton
  • Publication number: 20080114800
    Abstract: In accordance with an embodiment, data may be automatically extracted from semi-structured web sites. Unsupervised learning may be used to analyze web sites and discover their structure. One method utilizes a set of heterogeneous “experts,” each expert being capable of identifying certain types of generic structure. Each expert represents its discoveries as “hints.” Based on these hints, the system may cluster the pages and text segments and identify semi-structured data that can be extracted. To identify a good clustering, a probabilistic model of the hint-generation process may be used.
    Type: Application
    Filed: January 15, 2008
    Publication date: May 15, 2008
    Applicant: Fetch Technologies, Inc.
    Inventors: Bora Gazen, Steven Minton
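
The abstract shared by the three records above describes experts that emit hints, and a probabilistic model of hint generation used to pick a good clustering. The following is a minimal sketch of that idea, not the patented method. The two experts, the pairwise likelihood with its `p_same` and `p_diff` values, the greedy merge loop, and the sample pages are all hypothetical choices made for illustration, and only pages (not text segments) are clustered.

```python
import math
from itertools import combinations

# Hypothetical experts: each takes {page_id: text} and returns hints,
# where a hint is a set of page ids the expert believes share structure.
def url_prefix_expert(pages):
    groups = {}
    for pid in pages:
        groups.setdefault(pid.rsplit("/", 1)[0], set()).add(pid)
    return [g for g in groups.values() if len(g) > 1]

def first_token_expert(pages):
    groups = {}
    for pid, text in pages.items():
        tokens = text.split()
        groups.setdefault(tokens[0] if tokens else "", set()).add(pid)
    return [g for g in groups.values() if len(g) > 1]

def log_likelihood(clusters, hints_per_expert, p_same=0.8, p_diff=0.1):
    """Toy hint-generation model (an assumption): an expert links two pages
    with probability p_same if they share a cluster, else p_diff."""
    label = {pid: i for i, c in enumerate(clusters) for pid in c}
    total = 0.0
    for hints in hints_per_expert:
        linked = {frozenset(pair)
                  for h in hints
                  for pair in combinations(sorted(h), 2)}
        for a, b in combinations(sorted(label), 2):
            p = p_same if label[a] == label[b] else p_diff
            total += math.log(p if frozenset((a, b)) in linked else 1 - p)
    return total

def cluster(pages, experts):
    """Greedy agglomerative search: merge while the likelihood improves."""
    hints_per_expert = [expert(pages) for expert in experts]
    clusters = [{pid} for pid in sorted(pages)]
    best = log_likelihood(clusters, hints_per_expert)
    while len(clusters) > 1:
        candidates = []
        for i, j in combinations(range(len(clusters)), 2):
            merged = [c for k, c in enumerate(clusters) if k not in (i, j)]
            merged.append(clusters[i] | clusters[j])
            candidates.append((log_likelihood(merged, hints_per_expert), merged))
        score, merged = max(candidates, key=lambda t: t[0])
        if score <= best:
            break  # no merge improves the model's fit to the hints
        best, clusters = score, merged
    return clusters

# Made-up sample pages, for illustration only.
PAGES = {
    "site/item/1": "Product Widget A",
    "site/item/2": "Product Widget B",
    "site/cat/1": "Category Tools",
    "site/cat/2": "Category Toys",
}

if __name__ == "__main__":
    for group in cluster(PAGES, [url_prefix_expert, first_token_expert]):
        print(sorted(group))
```

Because the score treats the absence of a hint as evidence too, merging everything into one cluster is penalized, so the search stops once the item pages and the category pages are grouped separately.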