Patents by Inventor Monika R. Henzinger

Monika R. Henzinger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7630973
    Abstract: A method is described for identifying related pages among a plurality of pages in a linked database such as the World Wide Web. An initial page is selected from the plurality of pages. Pages linked to the initial page are represented as a graph in a memory. The pages represented in the graph are scored on content, and a set of pages is selected, the selected set of pages having scores greater than a first predetermined threshold. The selected set of pages is scored on connectivity, and a subset of the set of pages that have scores greater than a second predetermined threshold are selected as related pages.
    Type: Grant
    Filed: November 3, 2003
    Date of Patent: December 8, 2009
    Assignee: Yahoo! Inc.
    Inventors: Jeffrey Dean Black, Monika R. Henzinger, Andrei Z. Broder
  • Patent number: 7451388
    Abstract: A method, system, and computer program product for determining relative quality of search engine indexes and search results include performing a two-level random walk through a hypertext-linked document set. Search engine index quality is measured based on the number of encountered documents that are indexed by the search engine index. Search result quality is measured based on the number and quality of documents that link to the result document.
    Type: Grant
    Filed: September 8, 1999
    Date of Patent: November 11, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Monika R. Henzinger, Michael D. Mitzenmacher
  • Patent number: 7117206
    Abstract: A computerized method determines the ranking of documents including information content. The present method uses both content and connectivity analysis. An input set of documents is represented as a neighborhood graph in a memory. In the graph, each node represents one document, and each directed edge connecting a pair of nodes represents a linkage between the pair of documents. The input set of documents represented in the graph is ranked according to the contents of the documents. A subset of documents is selected from the input set of documents if the content ranking of the selected documents is greater than a first predetermined threshold. Nodes representing any documents, other than the selected documents, are deleted from the graph. The selected subset of documents is ranked according to the linkage of the documents, and an output set of documents exceeding a second predetermined threshold is selected for presentation to users.
    Type: Grant
    Filed: May 5, 2003
    Date of Patent: October 3, 2006
    Assignee: Overture Services, Inc.
    Inventors: Krishna Asur Bharat, Monika R. Henzinger
  • Publication number: 20040193636
    Abstract: A method is described for identifying related pages among a plurality of pages in a linked database such as the World Wide Web. An initial page is selected from the plurality of pages. Pages linked to the initial page are represented as a graph in a memory. The pages represented in the graph are scored on content, and a set of pages is selected, the selected set of pages having scores greater than a first predetermined threshold. The selected set of pages is scored on connectivity, and a subset of the set of pages that have scores greater than a second predetermined threshold are selected as related pages.
    Type: Application
    Filed: November 3, 2003
    Publication date: September 30, 2004
    Inventors: Jeffrey Dean Black, Monika R. Henzinger, Andrei Z. Broder
  • Patent number: 6738678
    Abstract: A computerized method determines the ranking of documents including information content. The present method uses both content and connectivity analysis. An input set of documents is represented as a neighborhood graph in a memory. In the graph, each node represents one document, and each directed edge connecting a pair of nodes represents a linkage between the pair of documents. The input set of documents represented in the graph is ranked according to the contents of the documents. A subset of documents is selected from the input set of documents if the content ranking of the selected documents is greater than a first predetermined threshold. Nodes representing any documents, other than the selected documents, are deleted from the graph. The selected subset of documents is ranked according to the linkage of the documents, and an output set of documents exceeding a second predetermined threshold is selected for presentation to users.
    Type: Grant
    Filed: January 15, 1998
    Date of Patent: May 18, 2004
    Inventors: Krishna Asur Bharat, Monika R. Henzinger
  • Patent number: 6665837
    Abstract: A method is described for identifying related pages among a plurality of pages in a linked database such as the World Wide Web. An initial page is selected from the plurality of pages. Pages linked to the initial page are represented as a graph in a memory. The pages represented in the graph are scored on content, and a set of pages is selected, the selected set of pages having scores greater than a first predetermined threshold. The selected set of pages is scored on connectivity, and a subset of the set of pages that have scores greater than a second predetermined threshold are selected as related pages.
    Type: Grant
    Filed: August 10, 1998
    Date of Patent: December 16, 2003
    Assignee: Overture Services, Inc.
    Inventors: Jeffrey Dean, Monika R. Henzinger, Andrei Z. Broder
  • Patent number: 6487555
    Abstract: A method and system that detects mirrored host pairs using information about a large set of pages, including one or more of: URLs, IP addresses, and connectivity information. The identities of the detected mirrored hosts are then saved so that browsers, crawlers, proxy servers, or the like can correctly identify mirrored web sites. The described embodiments of the present invention use one or a combination of techniques to identify mirrors. A first group of techniques involves determining mirrors based on URLs and information about connectivity (i.e., hyperlinks) between pages. A second group of techniques looks at connectivity information at a higher granularity, considering all links from all pages on a host as one group and ignoring the target of each link beyond the host level.
    Type: Grant
    Filed: May 7, 1999
    Date of Patent: November 26, 2002
    Assignee: Alta Vista Company
    Inventors: Krishna A. Bharat, Andrei Z. Broder, Steven C. Glassman, Jeffrey Dean, Monika R. Henzinger
  • Patent number: 6321220
    Abstract: A method and apparatus for preventing topic drift in queries in hyperlinked environments uses equivalence components for ranking pages containing information that is relevant to the topic of a user query input to a search engine. The method includes the steps of providing a query to a search engine, where the query represents a predetermined topic; retrieving at least one page associated with the query; constructing a graph representing the pages in memory; creating at least one equivalence component representing a subset of the graph; processing each equivalence component; eliminating the equivalence component in accordance with whether it matches the predetermined topic; and ranking the remaining pages.
    Type: Grant
    Filed: December 7, 1998
    Date of Patent: November 20, 2001
    Assignee: AltaVista Company
    Inventors: Jeffrey Dean, Monika R. Henzinger, Krishna Asur Bharat
  • Patent number: 6286006
    Abstract: A method and apparatus that detects mirrored host pairs using information about a large set of pages, including URLs. The identities of the detected mirrored hosts are then saved so that browsers, crawlers, proxy servers, or the like can correctly identify mirrored web sites. The described embodiments of the present invention look at the URLs of pages on hosts to determine whether the hosts are potentially mirrored.
    Type: Grant
    Filed: May 7, 1999
    Date of Patent: September 4, 2001
    Assignee: Alta Vista Company
    Inventors: Krishna A. Bharat, Andrei Broder, Steven C. Glassman, Jeffrey Dean, Monika R. Henzinger
  • Patent number: 6138113
    Abstract: A method is described for identifying pages that are near duplicates in a linked database. In the linked database, pages can have incoming links and outgoing links. Two pages are selected, a first page and a second page. For each selected page, the number of outgoing links is determined. The two pages are marked as near duplicates based on the number of common outgoing links for the two pages.
    Type: Grant
    Filed: August 10, 1998
    Date of Patent: October 24, 2000
    Assignee: AltaVista Company
    Inventors: Jeffrey Dean, Monika R. Henzinger
  • Patent number: 6112203
    Abstract: In a computerized method, a set of documents is ranked according to their content and their connectivity by using topic distillation. The documents include links that connect the documents to each other, either directly or indirectly. A graph is constructed in a memory of a computer system. In the graph, nodes represent the documents, and directed edges represent the links. Based on the number of links connecting the various nodes, a subset of documents is selected to form a topic. A second subset of the documents is chosen based on the number of directed edges connecting the nodes. Nodes in the second subset are compared with the topic to determine similarity to the topic, and a relevance weight is correspondingly assigned to each node. Nodes in the second subset having a relevance weight less than a predetermined threshold are pruned from the graph. The documents represented by the remaining nodes in the graph are ranked by a connectivity-based ranking scheme.
    Type: Grant
    Filed: April 9, 1998
    Date of Patent: August 29, 2000
    Assignee: AltaVista Company
    Inventors: Krishna Asur Bharat, Monika R. Henzinger
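
Illustrative sketch (not taken from any of the patents listed above): the abstract of patent 6138113 describes marking two pages as near duplicates based on the number of outgoing links they have in common. The Python below is one possible reading of that idea, not the patented method. The function name near_duplicates, the min_overlap parameter, its default of 0.9, the choice to compare against the larger link set, and the handling of pages with no outgoing links are all assumptions made for illustration.

```python
def near_duplicates(out_links_a, out_links_b, min_overlap=0.9):
    """Return True when two pages share most of their outgoing links.

    out_links_a, out_links_b: iterables of link targets (e.g. URL strings).
    min_overlap: hypothetical threshold; the abstract does not give a value.
    """
    a, b = set(out_links_a), set(out_links_b)

    # Assumption: a page with no outgoing links is never reported as a duplicate.
    if not a or not b:
        return False

    # Number of outgoing links the two pages have in common.
    common = len(a & b)

    # Assumption: normalize by the larger link set so that a small page
    # linking to a few of a large page's targets does not count as a match.
    return common / max(len(a), len(b)) >= min_overlap


# Example usage with made-up link targets.
page_1 = ["a.html", "b.html", "c.html", "d.html"]
page_2 = ["a.html", "b.html", "c.html", "e.html"]
print(near_duplicates(page_1, page_2))                    # 3 of 4 shared, below 0.9
print(near_duplicates(page_1, page_2, min_overlap=0.75))  # 3 of 4 shared, meets 0.75
```

Comparing sets of link targets ignores repeated links and link order; whether the patented method does the same is not stated in the abstract.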