Patents by Inventor Ziv Bar-Yossef

Ziv Bar-Yossef has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20080097988
    Abstract: Systems and methods are herein disclosed for assessing the staleness of a web page. In particular, in one method of the present invention, the staleness of a web page is assessed by examining internal date references within the web page. In another method of the present invention, the staleness of a web page is assessed by examining the meta-data associated with the web page. In a further method of the present invention, the staleness of a hyperlinked web page is determined by examining the link status of the hyperlinks. If the web page has a relatively large number of dead links, it is assessed as being a stale web page. In a still further method of the present invention, the link status of web pages in the neighborhood of the web page being assessed is likewise examined.
    Type: Application
    Filed: December 13, 2007
    Publication date: April 24, 2008
    Inventors: Andrei Broder, Ziv Bar-Yossef, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Publication number: 20080097977
    Abstract: Systems and methods are herein disclosed for assessing the staleness of a web page. In particular, in one method of the present invention, the staleness of a web page is assessed by examining internal date references within the web page. In another method of the present invention, the staleness of a web page is assessed by examining the meta-data associated with the web page. In a further method of the present invention, the staleness of a hyperlinked web page is determined by examining the link status of the hyperlinks. If the web page has a relatively large number of dead links, it is assessed as being a stale web page. In a still further method of the present invention, the link status of web pages in the neighborhood of the web page being assessed is likewise examined.
    Type: Application
    Filed: December 13, 2007
    Publication date: April 24, 2008
    Inventors: Andrei Broder, Ziv Bar-Yossef, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Publication number: 20080097978
    Abstract: Systems and methods are herein disclosed for assessing the staleness of a web page. In particular, in one method of the present invention, the staleness of a web page is assessed by examining internal date references within the web page. In another method of the present invention, the staleness of a web page is assessed by examining the meta-data associated with the web page. In a further method of the present invention, the staleness of a hyperlinked web page is determined by examining the link status of the hyperlinks. If the web page has a relatively large number of dead links, it is assessed as being a stale web page. In a still further method of the present invention, the link status of web pages in the neighborhood of the web page being assessed is likewise examined.
    Type: Application
    Filed: December 13, 2007
    Publication date: April 24, 2008
    Inventors: Andrei Broder, Ziv Bar-Yossef, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Publication number: 20070250471
    Abstract: A method that eagerly evaluates predicates of XPath queries over XML document nodes for a set of commonly known functions and operators (including arithmetic, general comparison, value comparison, Boolean operators, etc.) without materializing sequences is discussed. Such eager evaluation of predicates reduces the amount of buffer space required since evaluation sequences have to be buffered only partially during the predicate evaluation process. Document nodes to be selected by a query are determined earlier so that they can be outputted without buffering.
    Type: Application
    Filed: April 25, 2006
    Publication date: October 25, 2007
    Applicant: International Business Machines Corporation
    Inventors: Marcus Fontoura, Vanja Josifovski, Ziv Bar-Yossef
  • Publication number: 20070085716
    Abstract: A system and method of approximating edit distance for a set of character strings in a database includes producing a representative sketch for each of the character strings; and approximating an edit distance between two selected character strings based only on the representative sketch for each of the selected character strings. The character strings may comprise text, wherein the method further comprises encoding positions of substrings in the text using anchors, wherein the anchors comprise identical substrings occurring in two input character strings at a nearby position. A set of anchors may be used in a correlated manner, wherein character strings with a sufficiently small edit distance are likely to use a same sequence of anchors. The character strings may be substantially non-repetitive. The representative sketch of a first character string is preferably constructed absent knowledge of a second character string. A size of the representative sketch may be constant.
    Type: Application
    Filed: September 30, 2005
    Publication date: April 19, 2007
    Applicant: International Business Machines Corporation
    Inventors: Ziv Bar-Yossef, Robert Krauthgamer, Shanmugasundaram Ravikumar, Jayram Thathachar
  • Publication number: 20060122998
    Abstract: A focused random walk system produces samples of on-topic pages from a collection of hyper-linked pages such as Web pages. The focused random walk system utilizes a focused random walk to produce a focused sample, which is a random sample of Web pages focused on a topic. The focused random walk system uniformly samples pages iteratively, where each iteration follows a random link from a union of the in-links and out-links of a page. The system then classifies this randomly selected link to determine whether the page is on-topic. The random walk sampling process could comprise a hard-focus method that selects only on-topic pages at each step of the focused random walk, or a soft-focus method that allows limited divergence to off-topic pages.
    Type: Application
    Filed: December 4, 2004
    Publication date: June 8, 2006
    Applicant: International Business Machines Corporation
    Inventors: Ziv Bar-Yossef, Tapas Kanungo, Robert Krauthgamer
  • Publication number: 20060112089
    Abstract: Systems and methods are herein disclosed for assessing the staleness of a web page. In particular, in one method of the present invention, the staleness of a web page is assessed by examining internal date references within the web page. In another method of the present invention, the staleness of a web page is assessed by examining the meta-data associated with the web page. In a further method of the present invention, the staleness of a hyperlinked web page is determined by examining the link status of the hyperlinks. If the web page has a relatively large number of dead links, it is assessed as being a stale web page. In a still further method of the present invention, the link status of web pages in the neighborhood of the web page being assessed is likewise examined.
    Type: Application
    Filed: November 22, 2004
    Publication date: May 25, 2006
    Inventors: Andrei Broder, Ziv Bar-Yossef, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Patent number: 6968331
    Abstract: A computing system and method clean a set of hypertext documents to minimize violations of a Hypertext Information Retrieval (IR) rule set. Then, the system and method perform an information retrieval operation on the resulting cleaned data. The cleaning process includes decomposing each page of the set of hypertext documents into one or more pagelets; identifying possible templates; and eliminating the templates from the data. Traditional IR search and mining algorithms can then be used to search on the remaining pagelets, as opposed to the original pages, to provide cleaner, more precise results.
    Type: Grant
    Filed: January 22, 2002
    Date of Patent: November 22, 2005
    Assignee: International Business Machines Corporation
    Inventors: Ziv Bar-Yossef, Sridhar Rajagopalan
  • Publication number: 20030140307
    Abstract: A computing system and method clean a set of hypertext documents to minimize violations of a Hypertext Information Retrieval (IR) rule set. Then, the system and method perform an information retrieval operation on the resulting cleaned data. The cleaning process includes decomposing each page of the set of hypertext documents into one or more pagelets; identifying possible templates; and eliminating the templates from the data. Traditional IR search and mining algorithms can then be used to search on the remaining pagelets, as opposed to the original pages, to provide cleaner, more precise results.
    Type: Application
    Filed: January 22, 2002
    Publication date: July 24, 2003
    Applicant: International Business Machines Corporation
    Inventors: Ziv Bar-Yossef, Sridhar Rajagopalan
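
The staleness abstracts in this listing say that a page with "a relatively large number of dead links" is assessed as stale. A minimal sketch of that idea follows. The function names, the precomputed link-status map, and the 0.3 threshold are illustrative assumptions; none of them come from the patent applications, which do not state a specific cutoff here.

```python
from typing import Dict, Iterable


def dead_link_fraction(links: Iterable[str], is_dead: Dict[str, bool]) -> float:
    """Return the share of a page's hyperlinks that are known to be dead.

    `is_dead` is a hypothetical precomputed map from URL to link status;
    links missing from the map are treated as live.
    """
    links = list(links)
    if not links:
        return 0.0  # a page without links gives no dead-link evidence
    dead = sum(1 for url in links if is_dead.get(url, False))
    return dead / len(links)


def is_stale(links: Iterable[str], is_dead: Dict[str, bool],
             threshold: float = 0.3) -> bool:
    """Flag a page as stale when its dead-link share reaches the threshold.

    The 0.3 default is an arbitrary assumption chosen for illustration.
    """
    return dead_link_fraction(links, is_dead) >= threshold
```

For example, a page with four links, two of which are dead, has a dead-link fraction of 0.5 and would be flagged under the assumed threshold. The neighborhood variant mentioned in the abstracts could apply the same measure to pages linked from the page being assessed.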
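
The focused random walk abstract (publication 20060122998) describes following a random link from the union of a page's in-links and out-links, classifying the result, and either selecting only on-topic pages (hard focus) or allowing limited divergence (soft focus). The sketch below is one possible reading of that description, not the method claimed in the application. The graph representation, the `is_on_topic` classifier callback, and the `max_off_topic` parameter are all hypothetical, and the sketch makes no claim about the uniformity of the resulting sample.

```python
import random
from typing import Callable, Dict, List, Optional, Set


def neighbors(graph: Dict[str, Set[str]], page: str) -> List[str]:
    """Union of a page's out-links and in-links, sorted for reproducibility."""
    out_links = graph.get(page, set())
    in_links = {src for src, dsts in graph.items() if page in dsts}
    return sorted(out_links | in_links)


def focused_walk(graph: Dict[str, Set[str]], start: str,
                 is_on_topic: Callable[[str], bool], steps: int,
                 max_off_topic: int = 0,
                 rng: Optional[random.Random] = None) -> List[str]:
    """Walk the link graph and return the on-topic pages visited.

    max_off_topic = 0 models hard focus (off-topic moves are rejected);
    a positive value models soft focus by allowing that many consecutive
    off-topic steps. Both interpretations are assumptions.
    """
    rng = rng or random.Random()
    current, off_run, visited = start, 0, []
    for _ in range(steps):
        candidates = neighbors(graph, current)
        if not candidates:
            break
        nxt = rng.choice(candidates)
        if is_on_topic(nxt):
            off_run = 0
        elif off_run < max_off_topic:
            off_run += 1
        else:
            continue  # reject the move and stay on the current page
        current = nxt
        if is_on_topic(current):
            visited.append(current)
    return visited
```

Calling `focused_walk(graph, start, classifier, steps)` with the default `max_off_topic=0` gives the hard-focus behavior; passing a small positive value lets the walk pass through a few off-topic pages before it must return to on-topic ones.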