Patents by Inventor Andrei Z. Broder

Andrei Z. Broder has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20040111412
    Abstract: A method and apparatus for ranking a plurality of pages identified during a search of a linked database includes forming a linear combination of two or more matrices, and using the coefficients of the eigenvector of the resulting matrix to rank the quality of the pages. The matrices include information about the pages and are generally normalized, stochastic matrices. The linear combination can include attractor matrices that indicate desirable or “high quality” sites, and/or non-attractor matrices that indicate sites that are undesirable. Attractor matrices and non-attractor matrices can be used alone or in combination with each other in the linear combination. Additional bias toward high quality sites, or away from undesirable sites, can be further introduced with probability weighting matrices for attractor and non-attractor matrices. Other known matrices, such as a co-citation matrix or a bibliographic coupling matrix, can also be used in the present invention.
    Type: Application
    Filed: May 6, 2003
    Publication date: June 10, 2004
    Applicant: AltaVista Company
    Inventor: Andrei Z. Broder
  • Patent number: 6665837
    Abstract: A method is described for identifying related pages among a plurality of pages in a linked database such as the World Wide Web. An initial page is selected from the plurality of pages. Pages linked to the initial page are represented as a graph in a memory. The pages represented in the graph are scored on content, and a set of pages is selected, the selected set of pages having scores greater than a first predetermined threshold. The selected set of pages is scored on connectivity, and a subset of the set of pages that have scores greater than a second predetermined threshold are selected as related pages.
    Type: Grant
    Filed: August 10, 1998
    Date of Patent: December 16, 2003
    Assignee: Overture Services, Inc.
    Inventors: Jeffrey Dean, Monika R. Henzinger, Andrei Z. Broder
  • Patent number: 6560600
    Abstract: A method and apparatus for ranking a plurality of pages identified during a search of a linked database includes forming a linear combination of two or more matrices, and using the coefficients of the eigenvector of the resulting matrix to rank the quality of the pages. The matrices include information about the pages and are generally normalized, stochastic matrices. The linear combination can include attractor matrices that indicate desirable or “high quality” sites, and/or non-attractor matrices that indicate sites that are undesirable. Attractor matrices and non-attractor matrices can be used alone or in combination with each other in the linear combination. Additional bias toward high quality sites, or away from undesirable sites, can be further introduced with probability weighting matrices for attractor and non-attractor matrices. Other known matrices, such as a co-citation matrix or a bibliographic coupling matrix, can also be used in the present invention.
    Type: Grant
    Filed: October 25, 2000
    Date of Patent: May 6, 2003
    Assignee: Alta Vista Company
    Inventor: Andrei Z. Broder
  • Publication number: 20030061233
    Abstract: A system and method for finding one or more target biometric samples that are similar to or match a query biometric sample. A query feature vector is generated from a query biometric vector. The query biometric vector represents the query biometric sample as a set of characteristics. The characteristics are either invariable or variable. The query feature vector comprises a plurality of features which are derived from the query biometric vector using a process that includes canonicalization of the characters in the biometric vector. The query feature vector is compared to a plurality of similarly created target feature vectors, each target feature vector representing a respective target biometric sample. A target biometric sample is a potential match to the query biometric sample when a threshold number of features in the corresponding target feature vector are identical to features in the query biometric vector.
    Type: Application
    Filed: September 21, 2001
    Publication date: March 27, 2003
    Inventors: Mark S. Manasse, Andrei Z. Broder
  • Patent number: 6487555
    Abstract: A method and system that detects mirrored host pairs using information about a large set of pages, including one or more of: URLs, IP addresses, and connectivity information. The identities of the detected mirrored hosts are then saved so that browsers, crawlers, proxy servers, or the like can correctly identify mirrored web sites. The described embodiments of the present invention use one or a combination of techniques to identify mirrors. A first group of techniques involves determining mirrors based on URLs and information about connectivity (i.e., hyperlinks) between pages. A second group of techniques looks at connectivity information at a higher granularity, considering all links from all pages on a host as one group and ignoring the target of each link beyond the host level.
    Type: Grant
    Filed: May 7, 1999
    Date of Patent: November 26, 2002
    Assignee: Alta Vista Company
    Inventors: Krishna A. Bharat, Andrei Z. Broder, Steven C. Glassman, Jeffrey Dean, Monika R. Henzinger
  • Patent number: 6349296
    Abstract: A computer-implemented method determines the resemblance of data objects such as Web pages. Each data object is partitioned into a sequence of tokens. The tokens are grouped into overlapping sets of the tokens to form shingles. Each shingle is represented by a unique identification element encoded as a fingerprint. A minimum element from each of the images of the set of fingerprints associated with a document under each of a plurality of pseudo random permutations of the set of all fingerprints is selected to generate a sketch of each data object. The sketches characterize the resemblance of the data objects. The sketches can be further partitioned into a plurality of groups. Each group is fingerprinted to form a feature. Data objects that share more than a certain number of features are estimated to be nearly identical.
    Type: Grant
    Filed: August 21, 2000
    Date of Patent: February 19, 2002
    Assignee: AltaVista Company
    Inventors: Andrei Z. Broder, Steven C. Glassman, Charles G. Nelson, Mark S. Manasse, Geoffrey G. Zweig
  • Patent number: 6292762
    Abstract: A method determines a random permutation of input lines that produced a permuted set of bits in a bitstream. In a source design, the method replaces a logic element whose input lines are permutable with a test function. The source design is processed by a design tool to generate the bitstream including the permuted set of bits. The test function is probed with test values, and the probe results are compared with the permuted set of bits to discover the permutation of the set of bits. The test values can include a message.
    Type: Grant
    Filed: July 13, 1998
    Date of Patent: September 18, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Laurent Rene Moll, Michael David Mitzenmacher, Andrei Z. Broder, Mark Alexander Shand
  • Patent number: 6195698
    Abstract: A computerized method selectively accepts access requests from a client computer connected to a server computer by a network. The server computer receives an access request from the client computer. In response, the server computer generates a predetermined number of random characters. The random characters are used to form a string in the server computer. The string is randomly modified either visually or audibly to form a riddle. The original string becomes the correct answer to the riddle. The server computer renders the riddle on an output device of the client computer. In response, the client computer sends an answer to the server. Hopefully, the answer is a user's guess for the correct answer. The server determines if the guess is the correct answer, and if so, the access request is accepted.
    Type: Grant
    Filed: April 13, 1998
    Date of Patent: February 27, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Mark D. Lillibridge, Martin Abadi, Krishna Bharat, Andrei Z. Broder
  • Patent number: 6119124
    Abstract: A computer-implemented method determines the resemblance of data objects such as Web pages. Each data object is partitioned into a sequence of tokens. The tokens are grouped into overlapping sets of the tokens to form shingles. Each shingle is represented by a unique identification element encoded as a fingerprint. A minimum element from each of the images of the set of fingerprints associated with a document under each of a plurality of pseudo random permutations of the set of all fingerprints is selected to generate a sketch of each data object. The sketches characterize the resemblance of the data objects. The sketches can be further partitioned into a plurality of groups. Each group is fingerprinted to form a feature. Data objects that share more than a certain number of features are estimated to be nearly identical.
    Type: Grant
    Filed: March 26, 1998
    Date of Patent: September 12, 2000
    Assignee: Digital Equipment Corporation
    Inventors: Andrei Z. Broder, Steven C. Glassman, Charles G. Nelson, Mark S. Manasse, Geoffrey G. Zweig
  • Patent number: 6073135
    Abstract: A server computer is provided for representing and navigating the connectivity of Web pages. The Web pages include links to other Web pages. The links and Web pages have associated names (URLs). The names of the Web pages are sorted in a memory of the connectivity server. The sorted names are delta encoded while periodically storing full names as checkpoints in the memory. Each delta encoded name and checkpoint has a unique identification. A list of pairs of identifications representing existent links is sorted twice, first according to the first identification of each pair to produce an inlist, and second according to the second identification of each pair to produce an outlist. An array of elements is stored in the memory, with one array element for each Web page. Each element includes a first pointer to one of the checkpoints, a second pointer to an associated inlist of the Web page, and a third pointer to an associated outlist of the Web page.
    Type: Grant
    Filed: March 10, 1998
    Date of Patent: June 6, 2000
    Assignee: Alta Vista Company
    Inventors: Andrei Z. Broder, Michael Burrows, Monika H. Henzinger, Sanjay Ghemawat, Puneet Kumar, Suresh Venkatasubramanian
  • Patent number: 5032987
    Abstract: A data processing system and method particularly useful for network address lookup in interconnected local area networks uses a family of hashing algorithms that allow implementation of a dictionary that is particularly advantageous when the underlying hardware allows parallel memory reads in different memory banks. The system and method requires exactly one memory cycle for deletes and lookup and constant expected amortized cost for insertions. The system and method uses easy-to-compute hash functions and makes no unproven assumptions about their randomness properties, or about any property of keys drawn from a universe U. The system and method makes it possible to build a dictionary that is able to answer 20 parallel searches among 64,000 entries in less than 5 μs using relatively inexpensive hardware.
    Type: Grant
    Filed: May 11, 1990
    Date of Patent: July 16, 1991
    Assignee: Digital Equipment Corporation
    Inventors: Andrei Z. Broder, Anna R. Karlin
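
The shingle-and-sketch procedure described in the abstracts of patents 6349296 and 6119124 can be illustrated with a minimal Python sketch. It is not the patented implementation: the whitespace tokenizer, the SHA-1 based fingerprint, the linear maps modulo a prime that stand in for the pseudo random permutations, and every function name below are assumptions made for illustration only.

```python
import hashlib
import random

# A Mersenne prime; x -> (a*x + b) mod PRIME with a != 0 permutes 0..PRIME-1.
PRIME = (1 << 61) - 1


def shingles(text, k=3):
    """Group the token sequence into overlapping k-token shingles."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + k]) for i in range(max(len(tokens) - k + 1, 1))}


def fingerprint(value):
    """Assumed fingerprint: 8 bytes of SHA-1, reduced into 0..PRIME-1."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % PRIME


def make_permutations(count, seed=0):
    """Draw (a, b) pairs, each defining one stand-in permutation."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(count)]


def sketch(text, perms, k=3):
    """Keep the minimum permuted fingerprint under each permutation."""
    prints = [fingerprint(s) for s in shingles(text, k)]
    return [min((a * f + b) % PRIME for f in prints) for a, b in perms]


def resemblance(sketch_a, sketch_b):
    """Fraction of sketch positions that agree; an estimate, not an exact value."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)


def features(sk, group_size=4):
    """Partition a sketch into groups and fingerprint each group."""
    return [
        fingerprint(",".join(str(v) for v in sk[i:i + group_size]))
        for i in range(0, len(sk), group_size)
    ]
```

With a shared list from make_permutations, two documents can be compared by calling sketch on each and passing the results to resemblance. Counting how many entries of their features lists coincide gives the coarser "nearly identical" test mentioned in the abstract; the threshold to use is left open here.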
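
The ranking idea in the abstracts of patent 6560600 and publication 20040111412 (a linear combination of stochastic matrices whose eigenvector scores the pages) can be sketched as follows. This is an illustration under stated assumptions, not the method claimed in the patent: the matrices are taken to be row-stochastic, the scores are taken from the left principal eigenvector found by power iteration, and the three-page link matrix, the attractor matrix, the 0.8/0.2 weights, and all function names are hypothetical.

```python
def combine(matrices, weights):
    """Form the weighted sum of equally sized square matrices."""
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) for j in range(n)]
        for i in range(n)
    ]


def principal_eigenvector(matrix, iterations=100):
    """Power iteration for the left eigenvector (v = v M), normalized to sum to 1."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(v[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        total = sum(nxt)
        v = [x / total for x in nxt]
    return v


def rank_pages(matrices, weights):
    """Score pages by the eigenvector of the combined matrix."""
    return principal_eigenvector(combine(matrices, weights))


# Hypothetical data: page 0 links to 1; page 1 links to 0 and 2; page 2 links to 0.
link_matrix = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
]
# Hypothetical attractor matrix: every row points at page 2, the "high quality" page.
attractor_matrix = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
scores = rank_pages([link_matrix, attractor_matrix], [0.8, 0.2])
```

Because the weights sum to 1 and both inputs are row-stochastic, the combined matrix is also row-stochastic, so the scores form a probability distribution over pages. Raising the attractor weight shifts score toward page 2 in this toy example.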