Patents Examined by Rehanba Perveen
  • Patent number: 5974455
    Abstract: A Web crawler system and method for quickly fetching and analyzing Web pages on the World Wide Web includes a hash table stored in random access memory (RAM) and a sequential Web information disk file. For every Web page known to the system, the Web crawler system stores an entry in the sequential disk file as well as a smaller entry in the hash table. The hash table entry includes a fingerprint value, a fetched flag that is set true only if the corresponding Web page has been successfully fetched, and a file location indicator that indicates where the corresponding entry is stored in the sequential disk file. Each sequential disk file entry includes the URL of a corresponding Web page, plus fetch status information concerning that Web page. All accesses to the Web information disk file are made sequentially via an input buffer, such that a large number of entries from the sequential disk file are moved into the input buffer as a single I/O operation.
    Type: Grant
    Filed: December 13, 1995
    Date of Patent: October 26, 1999
    Assignee: Digital Equipment Corporation
    Inventor: Louis M. Monier
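
The two-level structure described in the abstract above (a small in-memory hash table entry pointing at a larger record in a sequential disk file) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not specify the fingerprint function, the record layout, or the buffer size, so the SHA-256-based `fingerprint`, the `HEADER` format, the `CrawlIndex` class, and its method names are all assumptions made for this example.

```python
import hashlib
import io
import struct
from dataclasses import dataclass

# Assumed on-disk record layout: 2-byte URL length, 1-byte fetch status, URL bytes.
HEADER = struct.Struct(">HB")


@dataclass
class HashEntry:
    fetched: bool  # set true only after a successful fetch
    offset: int    # byte position of the full record in the sequential file


def fingerprint(url: str) -> int:
    # Assumption: a 64-bit fingerprint taken from the first 8 bytes of SHA-256.
    return int.from_bytes(hashlib.sha256(url.encode("utf-8")).digest()[:8], "big")


class CrawlIndex:
    """Hypothetical pairing of an in-RAM hash table with an append-only file."""

    def __init__(self, disk_file):
        self.table = {}        # fingerprint -> HashEntry (the small RAM entry)
        self.disk = disk_file  # binary file object opened for reading and writing

    def add_url(self, url: str, status: int = 0) -> bool:
        fp = fingerprint(url)
        if fp in self.table:   # page already known: the disk is never touched
            return False
        self.disk.seek(0, io.SEEK_END)
        offset = self.disk.tell()
        data = url.encode("utf-8")
        self.disk.write(HEADER.pack(len(data), status) + data)
        self.table[fp] = HashEntry(fetched=False, offset=offset)
        return True

    def mark_fetched(self, url: str) -> None:
        # Only the in-memory flag is updated in this sketch.
        self.table[fingerprint(url)].fetched = True

    def scan(self, buffer_size: int = 65536):
        """Yield (url, status) in file order, reading one large chunk at a time."""
        self.disk.seek(0)
        buf = b""
        while True:
            chunk = self.disk.read(buffer_size)  # one sequential I/O per chunk
            if not chunk:
                break
            buf += chunk
            pos = 0
            while len(buf) - pos >= HEADER.size:
                length, status = HEADER.unpack_from(buf, pos)
                end = pos + HEADER.size + length
                if end > len(buf):
                    break  # record continues in the next chunk
                yield buf[pos + HEADER.size:end].decode("utf-8"), status
                pos = end
            buf = buf[pos:]  # keep any partial record for the next read
```

For example, `CrawlIndex(io.BytesIO())` gives an in-memory stand-in for the disk file. Calling `add_url` twice with the same URL returns `True` and then `False`, and the duplicate check is answered from the hash table alone. The sketch does not interleave `add_url` calls with an active `scan`, since both move the same file position.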