Patents by Inventor Andrei Broder

Andrei Broder has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20070038707
    Abstract: A method includes describing the thread configurations of a volume of well-ordered electronic message transmissions (EMT) and utilizing the thread configuration data to conduct selective searches of the EMT volume. An apparatus includes a thread processor and a query manager. The thread processor analyzes the EMT threads and records the thread configuration data. The query manager utilizes the thread configuration data to conduct selective searches of the EMT volume.
    Type: Application
    Filed: August 10, 2005
    Publication date: February 15, 2007
    Applicant: International Business Machines Corporation
    Inventors: Andrei Broder, Nadav Eiron, Marcus Fontoura, Michael Herscovici, Ronny Lempel, John McPherson, Eugene Shekita
  • Publication number: 20070016545
    Abstract: A method and system for the detection of missing content in a searchable repository is provided. A system includes: a missing content query identifier (401) for identifying queries to a search engine (102) for which no or little relevant content is returned; a missing content detector (110) which clusters missing content queries by topic; and an output provider for providing details of a missing content topic.
    Type: Application
    Filed: July 14, 2005
    Publication date: January 18, 2007
    Applicant: International Business Machines Corporation
    Inventors: Andrei Broder, David Carmel, Adam Darlow, Shai Fine, Elad Yom-Tov
  • Publication number: 20060155739
    Abstract: A method for indexing a plurality of documents, that includes a plurality of duplicate documents, first identifies one or more duplicate groups of documents from among the plurality of documents. Then, one index of content for the duplicate group is created instead of indexing the content from every document within the duplicate group. However, in contrast to the content index, an index of metadata for each of the documents in the duplicate group is created. Thus the content of each duplicate group is indexed only once, while a search engine using such indexing techniques retains the capability to answer queries as if the duplicated content was indexed for each document of the group.
    Type: Application
    Filed: January 12, 2005
    Publication date: July 13, 2006
    Applicant: International Business Machines Corporation
    Inventors: Andrei Broder, Marcus Fontoura, Michael Herscovici, Ronny Lempel, John McPherson, Andreas Neumann, Runping Qi, Eugene Shekita
  • Publication number: 20060112089
    Abstract: Systems and methods are herein disclosed for assessing the staleness of a web page. In particular, in one method of the present invention, the staleness of a web page is assessed by examining internal date references within the web page. In another method of the present invention, the staleness of a web page is assessed by examining the meta-data associated with the web page. In a further method of the present invention, the staleness of a hyperlinked web page is determined by examining the link status of the hyperlinks. If the web page has a relatively large number of dead links, it is assessed as being a stale web page. In a still further method of the present invention, the link status of web pages in the neighborhood of the web page being assessed is likewise examined.
    Type: Application
    Filed: November 22, 2004
    Publication date: May 25, 2006
    Inventors: Andrei Broder, Ziv Bar-Yossef, Shanmagasundaram Ravikumar, Andrew Tomkins
  • Publication number: 20050165757
    Abstract: A method and apparatus for ranking a plurality of pages identified during a search of a linked database includes forming a linear combination of two or more matrices, and using the coefficients of the eigenvector of the resulting matrix to rank the quality of the pages. The matrices include information about the pages and are generally normalized, stochastic matrices. The linear combination can include attractor matrices that indicate desirable or “high quality” sites, and/or non-attractor matrices that indicate sites that are undesirable. Attractor matrices and non-attractor matrices can be used alone or in combination with each other in the linear combination. Additional bias toward high quality sites, or away from undesirable sites, can be further introduced with probability weighting matrices for attractor and non-attractor matrices. Other known matrices, such as a co-citation matrix or a bibliographic coupling matrix, can also be used in the present invention.
    Type: Application
    Filed: October 27, 2004
    Publication date: July 28, 2005
    Inventor: Andrei Broder
  • Publication number: 20050055342
    Abstract: A computerized method is used to estimate the relative coverage of Web search engines. Each search engine maintains an index of words of pages located at specific URL addresses in a network. The method generates a random query. The random query is a logical combination of words found in a subset of the pages. The random query is submitted to a first search engine. In response a set of URLs of pages matching the query are received. Each URL identifies a page indexed by the first search engine that satisfies the random query. A particular URL identifying a sample page is randomly selected. A strong query corresponding to the sample page is generated, and the strong query is submitted to a second search engine. Result information received in response to the strong query is compared to determine if the second search engine has indexed the sample page, or a page substantially similar to the sample page.
    Type: Application
    Filed: January 21, 2004
    Publication date: March 10, 2005
    Inventors: Krishna Bharat, Andrei Broder
  • Publication number: 20030037093
    Abstract: A system and method are disclosed for selecting a resource, among a plurality of resources, for servicing a request. To select a resource to service the request, a first resource is randomly selected. If a first load value associated with the first resource does not exceed a threshold value, the request is assigned to the first resource for servicing. Otherwise a second resource is randomly selected. If a second load value associated with the second resource does not exceed a threshold value, the request is assigned to the second resource for servicing. If the second load value exceeds the threshold value, the request is assigned to whichever of the first and second resources has a lower load value.
    Type: Application
    Filed: May 25, 2001
    Publication date: February 20, 2003
    Inventors: Prashanth B. Bhat, Andrei Broder, Richard A. Kasperski
  • Patent number: 6286006
    Abstract: A method and apparatus that detects mirrored host pairs using information about a large set of pages, including URLs. The identities of the detected mirrored hosts are then saved so that browsers, crawlers, proxy servers, or the like can correctly identify mirrored web sites. The described embodiments of the present invention look at the URLs of pages on hosts to determine whether the hosts are potentially mirrored.
    Type: Grant
    Filed: May 7, 1999
    Date of Patent: September 4, 2001
    Assignee: Alta Vista Company
    Inventors: Krishna A. Bharat, Andrei Broder, Steven C. Glassman, Jeffrey Dean, Monika R. Henzinger
  • Patent number: 5991808
    Abstract: A method of operating a multiprocessor system having a predefined number of processing units for processing data, includes obtaining load information representing a loading of each of a number of randomly selected ones of the processing units. The number of randomly selected processing units is greater than 1 and substantially less than the predefined number of processing units. A least loaded of the randomly selected processing units is identified from the obtained load information. The data is directed to the identified least loaded randomly selected processing unit for processing.
    Type: Grant
    Filed: June 2, 1997
    Date of Patent: November 23, 1999
    Assignee: Digital Equipment Corporation
    Inventors: Andrei Broder, Michael Mitzenmacher
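
Illustrative sketch: the two load-balancing abstracts above (patent 5991808 and publication 20030037093) both describe selection among randomly sampled candidates. The Python below is a minimal sketch of those two descriptions only, not the patented implementations. The function names, the representation of load as a list of numbers, the tie-breaking rule, and the choice to allow the second random pick to coincide with the first are all assumptions made for illustration.

```python
import random
from typing import Optional, Sequence


def pick_least_loaded(loads: Sequence[float], d: int = 2,
                      rng: Optional[random.Random] = None) -> int:
    """Sample d distinct units at random; return the index of the least loaded.

    Follows the description in patent 5991808: d is greater than 1 and is
    meant to be much smaller than the total number of units.
    """
    rng = rng or random.Random()
    if not 1 < d <= len(loads):
        raise ValueError("d must be greater than 1 and at most len(loads)")
    candidates = rng.sample(range(len(loads)), d)
    return min(candidates, key=lambda i: loads[i])


def pick_with_threshold(loads: Sequence[float], threshold: float,
                        rng: Optional[random.Random] = None) -> int:
    """Threshold variant described in publication 20030037093.

    Accept the first random pick if its load does not exceed the threshold;
    otherwise try a second random pick; if both exceed the threshold, fall
    back to whichever of the two has the lower load.
    """
    rng = rng or random.Random()
    first = rng.randrange(len(loads))
    if loads[first] <= threshold:
        return first
    # Assumption: the second pick is drawn independently and may repeat the first.
    second = rng.randrange(len(loads))
    if loads[second] <= threshold:
        return second
    # Assumption: ties go to the first pick.
    return first if loads[first] <= loads[second] else second
```

For example, pick_least_loaded(loads, d=2) inspects only two load values per request regardless of how many units exist, which is the point of sampling a small random subset rather than polling every unit.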