Patents by Inventor Junghoo Cho

Junghoo Cho has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7685112
    Abstract: A method and system for autonomously downloading and indexing Hidden Web pages from Websites includes the steps of selecting a query term and issuing a query to a site-specific search interface containing Hidden Web pages. A results index is then acquired and the Hidden Web pages are downloaded from the results index. A plurality of potential query terms are then identified from the downloaded Hidden Web pages. The efficiency of each potential query term is then estimated and a next query term is selected from the plurality of potential query terms, wherein the next selected query term has the greatest efficiency. The next selected query term is then issued to the site-specific search interface. The process is repeated until all or most of the Hidden Web pages are discovered.
    Type: Grant
    Filed: May 27, 2005
    Date of Patent: March 23, 2010
    Assignee: The Regents of the University of California
    Inventors: Alexandros Ntoulas, Junghoo Cho, Petros Zerfos
  • Publication number: 20080097958
    Abstract: A method and system is provided for autonomously downloading and indexing Hidden Web pages from Websites having site-specific search interfaces. The method may be implemented using a crawler program or the like to autonomously cull Hidden Web content. The method includes the steps of selecting a query term and issuing a query to a site-specific search interface containing Hidden Web pages. A results index is then acquired and the Hidden Web pages are downloaded from the results index. A plurality of potential query terms are then identified from the downloaded Hidden Web pages. The efficiency of each potential query term is then estimated and a next query term is selected from the plurality of potential query terms, wherein the next selected query term has the greatest efficiency. The next selected query term is then issued to the site-specific search interface. The process is repeated until all or most of the Hidden Web pages are discovered.
    Type: Application
    Filed: May 27, 2005
    Publication date: April 24, 2008
    Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: Alexandros Ntoulas, Junghoo Cho, Petros Zerfos
  • Publication number: 20060294124
    Abstract: The pages in a network of linked pages are ranked based on the quality of the pages. Page quality is obtained by determining the change over time of the link structure of the page, which is obtained by determining the link structure of the page at different periods of time by taking multiple snapshots of the link structure of the network. The link structures are approximated by their PageRanks, page quality being determined by the formula: $Q(p) \approx D \cdot \frac{\Delta PR(p)}{PR(p)} + PR(p)$, where $Q(p)$ is the quality of the page, $PR(p)$ is the current PageRank of the page, $\Delta PR(p)$ is the change over time in the PageRank of the page, and $D$ is a constant that determines the relative weight of the terms $\Delta PR(p)/PR(p)$ and $PR(p)$.
    Type: Application
    Filed: January 12, 2005
    Publication date: December 28, 2006
    Inventor: Junghoo Cho
  • Patent number: 6754650
    Abstract: A system and method for executing a regular expression (regex) query against a large data repository such as the World Wide Web includes an index engine that constructs multigram indices based on regex. A run time then receives a regex query and accesses the indices to return a set of potentially matching pages, which are then efficiently and quickly searched for matches to the regex query.
    Type: Grant
    Filed: May 8, 2001
    Date of Patent: June 22, 2004
    Assignee: International Business Machines Corporation
    Inventors: Junghoo Cho, Sridhar Rajagopalan
  • Publication number: 20040015909
    Abstract: A system and method for executing a regular expression (regex) query against a large data repository such as the World Wide Web includes an index engine that constructs multigram indices based on regex. A run time then receives a regex query and accesses the indices to return a set of potentially matching pages, which are then efficiently and quickly searched for matches to the regex query.
    Type: Application
    Filed: May 8, 2001
    Publication date: January 22, 2004
    Applicant: International Business Machines Corporation
    Inventors: Junghoo Cho, Sridhar Rajagopalan
  • Patent number: 6317740
    Abstract: A method and apparatus are defined for assigning keywords to media objects in files. The media objects include image, video and audio, for example. Various criteria are used for assigning the keywords, including measuring visual distance, measuring syntactical distance, detecting regular patterns in a table and detecting groups of images.
    Type: Grant
    Filed: December 16, 1998
    Date of Patent: November 13, 2001
    Assignee: NEC USA, Inc.
    Inventors: Sougata Mukherjea, Junghoo Cho
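
As an illustration of the page-quality estimate described in publication 20060294124 (a constant-weighted relative change in PageRank, plus the current PageRank), the following is a minimal sketch and not the applicant's implementation. It assumes that the change over time is taken as the difference between two PageRank snapshots and that the constant D defaults to 1.0; the function name, parameter names, and default value are hypothetical.

```python
def page_quality(pr_now: float, pr_before: float, d: float = 1.0) -> float:
    """Estimate page quality from two PageRank snapshots.

    Computes d * (change in PageRank) / (current PageRank) + (current PageRank).
    Assumption: the change is simply pr_now - pr_before.
    """
    if pr_now <= 0:
        # The relative-change term is undefined for a non-positive PageRank.
        raise ValueError("current PageRank must be positive")
    delta = pr_now - pr_before
    return d * delta / pr_now + pr_now


# A page whose PageRank rose from 0.25 to 0.5, with the change term weighted by 2.
rising = page_quality(0.5, 0.25, d=2.0)

# A page whose PageRank did not change: the estimate equals its current PageRank.
stable = page_quality(0.5, 0.5)
```

Under these assumptions, a page with a rising PageRank scores above its current PageRank, and a larger D gives more weight to the recent trend than to the current standing.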