Patents by Inventor Alexandros Ntoulas

Alexandros Ntoulas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100241647
    Abstract: Described is a search-related technology in which context information regarding a user's prior search actions is used in making query recommendations for a current user action, such as a query or click. To determine whether each set or subset of context information is relevant to the user action, data obtained from a query log is evaluated. More particularly, a query transition (query-query) graph and a query click (query-URL) graph are extracted from the query log; vectors are computed for the current action and each context/sub-context and evaluated against vectors in the graphs to determine current action-to-context similarity. Also described is using similar context to provide the query recommendations, using parameters to control the similarity strictness, and/or whether more recent context information is more relevant than less recent context information, and using context information to distinguish between user sessions.
    Type: Application
    Filed: March 23, 2009
    Publication date: September 23, 2010
    Applicant: Microsoft Corporation
    Inventors: Alexandros Ntoulas, Heasoo Hwang, Lise C. Getoor, Stelios Paparizos, Hady Wirawan Lauw
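
The context-relevance test in the abstract above can be illustrated with a minimal sketch. It assumes each query is represented by a sparse vector of click counts taken from a query-URL graph, and that similarity is measured with cosine similarity. The function names, the `threshold` (similarity strictness) and `decay` (recency weighting) parameters, and the use of cosine similarity are illustrative assumptions, not details taken from the application.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def relevant_context(current, context, click_graph, threshold=0.5, decay=1.0):
    """Return the prior queries judged relevant to the current query.

    `context` is ordered oldest to newest. `click_graph` maps a query to a
    dict of {url: click_count} (a hypothetical query-URL graph).
    `threshold` controls how strict the similarity test is; `decay` < 1.0
    down-weights older context (both are assumed parameters).
    """
    current_vec = click_graph.get(current, {})
    kept = []
    # Walk from the most recent prior query (age 0) back to the oldest.
    for age, query in enumerate(reversed(context)):
        score = cosine(current_vec, click_graph.get(query, {})) * (decay ** age)
        if score >= threshold:
            kept.append(query)
    return kept[::-1]  # restore oldest-to-newest order
```

Prior queries that pass the test could then be used together with the current query when looking up recommendations, while the rest are treated as belonging to a different session or intent.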
  • Patent number: 7685112
    Abstract: A method and system for autonomously downloading and indexing Hidden Web pages from Websites includes the steps of selecting a query term and issuing a query to a site-specific search interface containing Hidden Web pages. A results index is then acquired and the Hidden Web pages are downloaded from the results index. A plurality of potential query terms are then identified from the downloaded Hidden Web pages. The efficiency of each potential query term is then estimated and a next query term is selected from the plurality of potential query terms, wherein the next selected query term has the greatest efficiency. The next selected query term is then issued to the site-specific search interface using the next query term. The process is repeated until all or most of the Hidden Web pages are discovered.
    Type: Grant
    Filed: May 27, 2005
    Date of Patent: March 23, 2010
    Assignee: The Regents of the University of California
    Inventors: Alexandros Ntoulas, Junghoo Cho, Petros Zerfos
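
The iterative loop described in the abstract above (issue a query, download the results, extract candidate terms, pick the most efficient one, repeat) can be sketched as follows. This is a rough illustration only: `search` is a hypothetical callable standing in for the site-specific search interface, and the efficiency estimate used here (document frequency among pages downloaded so far) is a simple stand-in heuristic, not the estimator claimed in the patent. A fixed query budget replaces the "all or most pages discovered" stopping condition, since a crawler cannot observe the total directly.

```python
from collections import Counter


def crawl_hidden_site(search, seed_term, max_queries=10):
    """Greedy query-selection loop for a search-form-only site.

    `search` is a hypothetical callable: term -> {doc_id: page_text}.
    Returns a dict of every page downloaded, keyed by doc_id.
    """
    downloaded = {}
    issued = set()
    term = seed_term
    for _ in range(max_queries):
        issued.add(term)
        downloaded.update(search(term))  # fetch pages from the results index

        # Candidate terms: every word seen in the pages downloaded so far,
        # counted once per page (document frequency).
        doc_freq = Counter()
        for text in downloaded.values():
            doc_freq.update(set(text.lower().split()))
        candidates = {t: n for t, n in doc_freq.items() if t not in issued}
        if not candidates:
            break  # nothing left to try

        # Assumed efficiency estimate: prefer the most widespread term.
        # Ties are broken alphabetically so the run is deterministic.
        term = max(candidates, key=lambda t: (candidates[t], t))
    return downloaded
```

A more faithful estimator would also account for how many of the matching pages are likely to be new and for the cost of issuing each query; the heuristic above ignores both.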
  • Publication number: 20090327256
    Abstract: Compression of extensive, rule-based grammars used to facilitate search queries is provided herein. Rule-based grammars include a list of rules that each comprise a sequence of token classes. Each token class is a logical grouping of tokens, and each token is a string of characters. A grammar is parsed to identify rules and token classes. Unimportant token classes are identified and sets of unimportant token classes are merged to generate merged token classes. A compressed grammar is generated by substituting the merged token classes into the grammar for corresponding unimportant token classes used to generate the merged token classes.
    Type: Application
    Filed: June 26, 2008
    Publication date: December 31, 2009
    Applicant: Microsoft Corporation
    Inventors: Stelios Paparizos, Christopher Walter Anderson, Wei Liu, Ajay Nair, Alexandros Ntoulas, Naga Srinivas Vemuri
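
The merge-and-substitute step in the abstract above can be illustrated with a small sketch. It assumes the set of unimportant token classes is already known (the abstract does not say how importance is decided) and, for simplicity, merges all of them into a single class. The function name, the `merged_name` parameter, and the data layout are hypothetical.

```python
def compress_grammar(rules, token_classes, unimportant, merged_name="MERGED"):
    """Merge unimportant token classes and rewrite the rules.

    rules:         list of tuples of token-class names
    token_classes: dict mapping class name -> set of token strings
    unimportant:   set of class names to merge (assumed to be given)
    Returns (compressed_rules, compressed_token_classes).
    """
    # Build the merged class as the union of all unimportant classes.
    merged_tokens = set()
    for name in unimportant:
        merged_tokens |= token_classes[name]

    new_classes = {
        name: set(tokens)
        for name, tokens in token_classes.items()
        if name not in unimportant
    }
    if unimportant:
        new_classes[merged_name] = merged_tokens

    # Substitute the merged class into each rule and drop duplicates,
    # keeping the first occurrence so rule order is preserved.
    new_rules, seen = [], set()
    for rule in rules:
        rewritten = tuple(merged_name if c in unimportant else c for c in rule)
        if rewritten not in seen:
            seen.add(rewritten)
            new_rules.append(rewritten)
    return new_rules, new_classes
```

In this sketch the compression comes from rules that become identical after substitution. The trade-off is that the compressed grammar accepts more token sequences than the original, because any token of the merged class now matches wherever one of the original classes did.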
  • Publication number: 20080195601
    Abstract: A method of retrieving documents using a search engine includes providing a reverse index including one or more keywords and a list of documents containing the one or more keywords, the reverse index further including a measure of confidence (MOC) value associated with the one or more keywords. One or more query terms are input into the search engine. The query terms are disambiguated and a MOC value is associated with each meaning of the disambiguated query term. A list of documents is retrieved containing the query terms wherein the documents are initially ranked based at least in part on the MOC values of the keywords and query terms. The list of documents may be re-ranked based at least in part on the semantic similarity of each document to the disambiguated query terms.
    Type: Application
    Filed: April 13, 2006
    Publication date: August 14, 2008
    Applicant: The Regents of the University of California
    Inventors: Alexandros Ntoulas, Gerald C. Chao
  • Publication number: 20080097958
    Abstract: A method and system is provided for autonomously downloading and indexing Hidden Web pages from Websites having site-specific search interfaces. The method may be implemented using a crawler program or the like to autonomously cull Hidden Web content. The method includes the steps of selecting a query term and issuing a query to a site-specific search interface containing Hidden Web pages. A results index is then acquired and the Hidden Web pages are downloaded from the results index. A plurality of potential query terms are then identified from the downloaded Hidden Web pages. The efficiency of each potential query term is then estimated and a next query term is selected from the plurality of potential query terms, wherein the next selected query term has the greatest efficiency. The next selected query term is then issued to the site-specific search interface using the next query term. The process is repeated until all or most of the Hidden Web pages are discovered.
    Type: Application
    Filed: May 27, 2005
    Publication date: April 24, 2008
    Applicant: The Regents of the University of California
    Inventors: Alexandros Ntoulas, Junghoo Cho, Petros Zerfos
  • Publication number: 20060184500
    Abstract: Evaluating content includes receiving content, analyzing the content for web spam using a content-based identification technique, and classifying the content according to the analysis. An index of analyzed contents may be created. A system for evaluating content includes a storage device configured to store data and a processor configured to analyze content using content-based identification techniques to determine whether web spam is present.
    Type: Application
    Filed: February 11, 2005
    Publication date: August 17, 2006
    Applicant: Microsoft Corporation
    Inventors: Marc Najork, Dennis Fetterly, Mark Manasse, Alexandros Ntoulas