Patents by Inventor Mahesh Tiyyagura

Mahesh Tiyyagura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8046360
    Abstract: Documents, such as web pages of a domain, are annotated to facilitate extracting structured information from the documents. The documents are clustered. Each cluster is such that the documents within that cluster are similar to each other at least with respect to a first threshold, such as according to a shingling metric, where the first threshold is an 8/8 shingling match. There is at least one overlap cluster, each overlap cluster including at least one of the plurality of clusters such that documents of the at least one cluster included in that overlap cluster are similar to each other at least with respect to a second threshold that is lower than the first threshold. A particular overlap cluster is designated, as is a particular cluster of the particular overlap cluster. For the particular designated cluster, an obtained annotation is transferred to other clusters included in the designated particular overlap cluster.
    Type: Grant
    Filed: December 13, 2007
    Date of Patent: October 25, 2011
    Assignee: Yahoo! Inc.
    Inventor: Mahesh Tiyyagura
  • Patent number: 8010544
    Abstract: A method is provided for information extraction from among a multiplicity of documents each having a corresponding document object model (DOM) comprising: computing signatures associated with nodes of a multiplicity of DOMs corresponding to the multiplicity of documents; producing an index that associates computed signatures to each document that has a DOM that has one or more nodes corresponding to such signature; annotating one or more nodes of a DOM that corresponds to the at least one selected document; wherein the one or more annotated nodes respectively correspond to one or more respective signatures included in the index; and matching the signatures that correspond to the annotated nodes with signatures in the index to determine which documents from the multiplicity of documents have one or more DOM nodes that correspond to one or more of the annotated nodes.
    Type: Grant
    Filed: June 6, 2008
    Date of Patent: August 30, 2011
    Assignee: Yahoo! Inc.
    Inventor: Mahesh Tiyyagura
  • Patent number: 7941421
    Abstract: A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: May 10, 2011
    Assignee: Yahoo! Inc.
    Inventor: Mahesh Tiyyagura
  • Publication number: 20100161588
    Abstract: A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
    Type: Application
    Filed: March 2, 2010
    Publication date: June 24, 2010
    Applicant: Yahoo! Inc.
    Inventor: Mahesh Tiyyagura
  • Patent number: 7707229
    Abstract: A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
    Type: Grant
    Filed: December 12, 2007
    Date of Patent: April 27, 2010
    Assignee: Yahoo! Inc.
    Inventor: Mahesh Tiyyagura
  • Publication number: 20090319481
    Abstract: The present invention is directed towards systems and methods for extending media annotations using collective knowledge. The method according to one embodiment of the present invention comprises receiving a plurality of content items and associated annotations. The method further normalizes the plurality of associated annotations and calculates pair frequencies for the plurality of associated annotations. The method then retrieves a plurality of alternative annotations and provides the plurality of alternative annotations.
    Type: Application
    Filed: June 18, 2008
    Publication date: December 24, 2009
    Applicant: Yahoo! Inc.
    Inventors: Krishna Prasad Chitrapura, Krishna Leela Poola, Mahesh Tiyyagura
  • Publication number: 20090313127
    Abstract: An improved system and method for using contextual sections of web page content for serving advertisements in online advertising is provided. A publisher may use a tool to identify sections of a web page that represent content to be used in contextual advertising. When rendered by a web browser, content from marked sections may be extracted from the web page and sent to an advertisement server for selectively matching advertisements for display to a user. Features may be identified from the content sections and used to select advertisements matching the extracted content of the web page. In particular, the features identified from the content sections may be matched with features designated by advertisers for advertisements. Web page placements may be allocated for advertisements matching the extracted content, and the advertisements may be served for display with the web page.
    Type: Application
    Filed: June 11, 2008
    Publication date: December 17, 2009
    Applicant: Yahoo! Inc.
    Inventors: David Chaiken, Kalyan Kumar Kanuri, Arun Ramanujapuram, Mahesh Tiyyagura
  • Publication number: 20090307256
    Abstract: A method is provided for information extraction from among a multiplicity of documents each having a corresponding document object model (DOM) comprising: computing signatures associated with nodes of a multiplicity of DOMs corresponding to the multiplicity of documents; producing an index that associates computed signatures to each document that has a DOM that has one or more nodes corresponding to such signature; annotating one or more nodes of a DOM that corresponds to the at least one selected document; wherein the one or more annotated nodes respectively correspond to one or more respective signatures included in the index; and matching the signatures that correspond to the annotated nodes with signatures in the index to determine which documents from the multiplicity of documents have one or more DOM nodes that correspond to one or more of the annotated nodes.
    Type: Application
    Filed: June 6, 2008
    Publication date: December 10, 2009
    Applicant: Yahoo! Inc.
    Inventor: Mahesh Tiyyagura
  • Publication number: 20090240670
    Abstract: Subject matter disclosed herein may relate to alignment of uniform resource identifiers associated with web pages, and further may relate to multiple sequence alignment of uniform resource identifiers. In one or more example embodiments, multiple sequence alignment techniques may provide improved tokenization of uniform resource identifiers associated with web pages, which may provide improved performance of applications such as, for example, uniform resource identifier normalization, sitemap construction, etc.
    Type: Application
    Filed: March 20, 2008
    Publication date: September 24, 2009
    Applicant: Yahoo! Inc.
    Inventors: Mahesh Tiyyagura, Krishna Leela Poola
  • Publication number: 20090171986
    Abstract: A decision tree may be determined that is a site map for a domain of web pages. A clustering of a plurality of web pages of a domain is determined, in an unsupervised fashion, based on content-related features of the plurality of web pages. Each determined cluster includes a plurality of web pages, each of the plurality of web pages characterized by a resource locator and each of the resource locators being characterized by at least one resource locator token. The clustering is processed to organize indications of the content-related features of the plurality of web pages into a decision tree characterized by a plurality of nodes, each node characterized by a feature and a value, the feature being at least one of the resource locator tokens and the value being a value of that resource locator token.
    Type: Application
    Filed: December 27, 2007
    Publication date: July 2, 2009
    Applicant: Yahoo! Inc.
    Inventors: Krishna Prasad Chitrapura, Pavan Kumar Ganganahalli Marulappa, Krishna Leela Poola, Mahesh Tiyyagura
  • Publication number: 20090157597
    Abstract: Documents, such as web pages of a domain, are annotated to facilitate extracting structured information from the documents. The documents are clustered. Each cluster is such that the documents within that cluster are similar to each other at least with respect to a first threshold, such as according to a shingling metric, where the first threshold is an 8/8 shingling match. There is at least one overlap cluster, each overlap cluster including at least one of the plurality of clusters such that documents of the at least one cluster included in that overlap cluster are similar to each other at least with respect to a second threshold that is lower than the first threshold. A particular overlap cluster is designated, as is a particular cluster of the particular overlap cluster. For the particular designated cluster, an obtained annotation is transferred to other clusters included in the designated particular overlap cluster.
    Type: Application
    Filed: December 13, 2007
    Publication date: June 18, 2009
    Applicant: Yahoo! Inc.
    Inventor: Mahesh Tiyyagura
  • Publication number: 20090157607
    Abstract: A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
    Type: Application
    Filed: December 12, 2007
    Publication date: June 18, 2009
    Applicant: Yahoo! Inc.
    Inventor: Mahesh Tiyyagura
  • Publication number: 20090063538
    Abstract: Techniques are described for normalizing dynamic URLs using a hierarchical organization of a web site. Given web pages associated with a web site, an information extraction method is used to generate data structures that represent the content or structure of each of the web pages. These data structures are appended to the corresponding dynamic URLs. The modified URLs with the data structures are tokenized with the resulting tokens clustered to create a hierarchical organization. Nodes of the hierarchical organization may be merged based upon occurrence or patterns of content and structure. The merged hierarchical organization may then be pruned to remove irrelevant information and to reduce the memory footprint of the hierarchical organization. When a new dynamic URL is received, the new dynamic URL is matched to the hierarchical organization. Important parameters are taken into account and irrelevant information may be removed.
    Type: Application
    Filed: August 30, 2007
    Publication date: March 5, 2009
    Inventors: Krishna Prasad Chitrapura, Anandsudhakar Kesari, Alok Kirpal, Mahesh Tiyyagura
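
Illustrative sketch: the two-threshold clustering described in the abstracts for patent 8046360 and publication 20090157597 can be pictured roughly as follows. This is not the patented method. It assumes word-level shingles and Jaccard similarity as the shingling metric, uses a simple greedy clustering, and stands in an exact match (1.0) for the "8/8 shingling match", whose precise definition the abstract does not give. All function names, the sample documents, and the thresholds are hypothetical.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a text (assumed metric basis)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold, k=3):
    """Greedy clustering: a document joins the first cluster whose first
    member is at least `threshold` similar to it, else starts a new one."""
    clusters = []  # each cluster is a list of document ids
    for doc_id, text in docs.items():
        s = shingles(text, k)
        for members in clusters:
            if jaccard(s, shingles(docs[members[0]], k)) >= threshold:
                members.append(doc_id)
                break
        else:
            clusters.append([doc_id])
    return clusters


def overlap_clusters(docs, clusters, low_threshold, k=3):
    """Group clusters whose first members are similar at a lower threshold."""
    groups = []  # each group is a list of clusters
    for c in clusters:
        s = shingles(docs[c[0]], k)
        for g in groups:
            if jaccard(s, shingles(docs[g[0][0]], k)) >= low_threshold:
                g.append(c)
                break
        else:
            groups.append([c])
    return groups


def transfer_annotation(annotation, group):
    """Copy an annotation obtained for one cluster to every document in
    all clusters of the same overlap group."""
    return {doc_id: annotation for members in group for doc_id in members}


# Hypothetical sample documents.
DOCS = {
    "a": "price of the red widget is ten dollars",
    "b": "price of the red widget is ten dollars",
    "c": "price of the red widget is nine euros",
    "d": "contact us by phone or by email today",
}

tight = cluster(DOCS, threshold=1.0)                     # first (strict) threshold
groups = overlap_clusters(DOCS, tight, low_threshold=0.5)  # second (lower) threshold
labels = transfer_annotation("product-page", groups[0])
```

Here the strict threshold keeps the near-identical pages "a" and "b" together, the lower threshold pulls the cluster containing "c" into the same overlap group, and an annotation obtained for one cluster is then reused for the whole group.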