Patents by Inventor Gautham Thambidorai

Gautham Thambidorai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8131712
    Abstract: A corpus of documents is identified, such as a large corpus of web documents. A quality score is applied to each, and at least some of the documents in the corpus of documents are identified based on their respective quality scores. At least one query characteristic, for instance, the language of a query, associated with a plurality of search queries is identified. A subset of documents in the corpus of documents is identified that satisfy the at least one query characteristic. An index is built that includes the identified at least some documents and the identified subset of documents.
    Type: Grant
    Filed: October 15, 2007
    Date of Patent: March 6, 2012
    Assignee: Google Inc.
    Inventors: Gautham Thambidorai, Eisar A. Lipkovitz, Cosmos Nicolaou, Li Fan
  • Publication number: 20120023073
    Abstract: A set of documents may be stored and indexed as a compressed sequence of tokens. A set of documents is grouped into clusters. Sequences of tokens representing the clusters of documents are encoded to elide some repeating instances of tokens. A compressed sequence of tokens is generated from the compressed cluster sequences of tokens. Queries on the compressed sequence are performed by identifying cluster sequences within the compressed sequence that are likely to have documents that satisfy the query and then identifying, within these identified clusters, the documents that actually satisfy the query.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Jeffrey A. Dean, Sanjay Ghemawat, Gautham Thambidorai
  • Publication number: 20070220023
    Abstract: The disclosed embodiments enable multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. The mapping scheme includes a first mapping between unique tokens contained in a set of documents and unique global token identifiers (e.g., 32-bit integers) contained in a global-lexicon (i.e., dictionary). The mapping scheme also includes a second mapping between the global token identifiers and a set of fixed-length local token identifiers (e.g., 8-bit integers) contained in one or more mini-lexicons (i.e., sub-dictionaries). Each mini-lexicon is associated with a range of token positions in the tokenized documents. The first and second mappings are used to encode/decode documents into local token identifiers having fixed widths which can be compactly stored in the tokenspace repository. The use of fixed-length local token identifiers allows for fast and efficient decoding of tokenized documents.
    Type: Application
    Filed: August 13, 2004
    Publication date: September 20, 2007
    Inventors: Jeffrey Dean, Gautham Thambidorai, Sanjay Ghemawat, Benedict Gomes, Olcan Sercinoglu
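
The two-tier mapping described in the abstract for publication 20070220023 can be illustrated with a small sketch. This is not the patented implementation; it only assumes what the abstract states (a global lexicon of token identifiers, and mini-lexicons of 8-bit local identifiers, each tied to a range of token positions). The function names, the fixed block size, and the in-memory layout are hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical parameters: the abstract only says each mini-lexicon covers
# "a range of token positions" and that local ids can be 8-bit integers.
BLOCK_SIZE = 4        # token positions covered by one mini-lexicon
MAX_LOCAL_IDS = 256   # number of distinct values an 8-bit local id can take

# One block = (mini-lexicon mapping local id -> global id, local ids as bytes)
Block = Tuple[List[int], bytes]


def encode(tokens: List[str], block_size: int = BLOCK_SIZE
           ) -> Tuple[Dict[str, int], List[Block]]:
    """Encode tokens into fixed-width local ids, one mini-lexicon per block."""
    global_lexicon: Dict[str, int] = {}   # first mapping: token -> global id
    blocks: List[Block] = []
    for start in range(0, len(tokens), block_size):
        mini: List[int] = []              # second mapping: local id -> global id
        local_of: Dict[int, int] = {}     # reverse lookup used while encoding
        local_ids = bytearray()
        for tok in tokens[start:start + block_size]:
            gid = global_lexicon.setdefault(tok, len(global_lexicon))
            if gid not in local_of:
                if len(mini) >= MAX_LOCAL_IDS:
                    raise ValueError("block has too many distinct tokens")
                local_of[gid] = len(mini)
                mini.append(gid)
            local_ids.append(local_of[gid])   # exactly one byte per token
        blocks.append((mini, bytes(local_ids)))
    return global_lexicon, blocks


def decode_range(global_lexicon: Dict[str, int], blocks: List[Block],
                 first_block: int, last_block: int) -> List[str]:
    """Reconstruct only the requested blocks, e.g. to build a snippet."""
    id_to_token = {gid: tok for tok, gid in global_lexicon.items()}
    out: List[str] = []
    for mini, local_ids in blocks[first_block:last_block + 1]:
        out.extend(id_to_token[mini[local]] for local in local_ids)
    return out


tokens = "the quick brown fox jumps over the lazy dog".split()
lexicon, blocks = encode(tokens)
snippet = decode_range(lexicon, blocks, 1, 1)   # decode the second block only
```

Because every local identifier occupies one byte, a token position maps directly to a byte offset within its block, which is one plausible reading of why the abstract calls fixed-length identifiers fast to decode. Decoding a single block also shows the "incremental document reconstruction" idea in miniature: only the mini-lexicon for that position range is consulted.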