Patents by Inventor Mahesh Tiyyagura
Mahesh Tiyyagura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8046360
Abstract: Documents, such as web pages of a domain, are annotated to facilitate extracting structured information from the documents. The documents are clustered. Each cluster is such that the documents within that cluster are similar to each other at least with respect to a first threshold, such as according to a shingling metric, where the first threshold is an 8/8 shingling match. There is at least one overlap cluster, each overlap cluster including at least one of the plurality of clusters such that documents of the at least one cluster included in that overlap cluster are similar to each other at least with respect to a second threshold that is lower than the first threshold. A particular overlap cluster is designated, as is a particular cluster of the particular overlap cluster. For the particular designated cluster, an obtained annotation is transferred to other clusters included in the designated particular overlap cluster.
Type: Grant
Filed: December 13, 2007
Date of Patent: October 25, 2011
Assignee: Yahoo! Inc.
Inventor: Mahesh Tiyyagura
-
Patent number: 8010544
Abstract: A method is provided for information extraction from among a multiplicity of documents each having a corresponding document object model (DOM) comprising: computing signatures associated with nodes of a multiplicity of DOMs corresponding to the multiplicity of documents; producing an index that associates computed signatures to each document that has a DOM that has one or more nodes corresponding to such signature; annotating one or more nodes of a DOM that corresponds to the at least one selected document; wherein the one or more annotated nodes respectively correspond to one or more respective signatures included in the index; and matching the signatures that correspond to the annotated nodes with signatures in the index to determine which documents from the multiplicity of documents have one or more DOM nodes that correspond to one or more of the annotated nodes.
Type: Grant
Filed: June 6, 2008
Date of Patent: August 30, 2011
Assignee: Yahoo! Inc.
Inventor: Mahesh Tiyyagura
-
Patent number: 7941421
Abstract: A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
Type: Grant
Filed: March 2, 2010
Date of Patent: May 10, 2011
Assignee: Yahoo! Inc.
Inventor: Mahesh Tiyyagura
-
Publication number: 20100161588
Abstract: A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
Type: Application
Filed: March 2, 2010
Publication date: June 24, 2010
Applicant: YAHOO! INC.
Inventor: Mahesh Tiyyagura
-
Patent number: 7707229
Abstract: A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
Type: Grant
Filed: December 12, 2007
Date of Patent: April 27, 2010
Assignee: Yahoo! Inc.
Inventor: Mahesh Tiyyagura
-
Publication number: 20090319481
Abstract: The present invention is directed towards systems and methods for extending media annotations using collective knowledge. The method according to one embodiment of the present invention comprises receiving a plurality of content items and associated annotations. The method further normalizes the plurality of associated annotations and calculates pair frequencies for the plurality of associated annotations. The method then retrieves a plurality of alternative annotations and provides the plurality of alternative annotations.
Type: Application
Filed: June 18, 2008
Publication date: December 24, 2009
Applicant: Yahoo! Inc.
Inventors: Krishna Prasad Chitrapura, Krishna Leela Poola, Mahesh Tiyyagura
-
Publication number: 20090313127
Abstract: An improved system and method for using contextual sections of web page content for serving advertisements in online advertising is provided. A publisher may use a tool to identify sections of a web page that represent content to be used in contextual advertising. When rendered by a web browser, content from marked sections may be extracted from the web page and sent to an advertisement server for selectively matching advertisements for display to a user. Features may be identified from the content sections and used to select advertisements matching the extracted content of the web page. In particular, the features identified from the content sections may be matched with features designated by advertisers for advertisements. Web page placements may be allocated for advertisements matching the extracted content, and the advertisements may be served for display with the web page.
Type: Application
Filed: June 11, 2008
Publication date: December 17, 2009
Applicant: Yahoo! Inc.
Inventors: David Chaiken, Kalyan Kumar Kanuri, Arun Ramanujapuram, Mahesh Tiyyagura
-
Publication number: 20090307256
Abstract: A method is provided for information extraction from among a multiplicity of documents each having a corresponding document object model (DOM) comprising: computing signatures associated with nodes of a multiplicity of DOMs corresponding to the multiplicity of documents; producing an index that associates computed signatures to each document that has a DOM that has one or more nodes corresponding to such signature; annotating one or more nodes of a DOM that corresponds to the at least one selected document; wherein the one or more annotated nodes respectively correspond to one or more respective signatures included in the index; and matching the signatures that correspond to the annotated nodes with signatures in the index to determine which documents from the multiplicity of documents have one or more DOM nodes that correspond to one or more of the annotated nodes.
Type: Application
Filed: June 6, 2008
Publication date: December 10, 2009
Applicant: Yahoo! Inc.
Inventor: Mahesh TIYYAGURA
-
Publication number: 20090240670
Abstract: Subject matter disclosed herein may relate to alignment of uniform resource identifiers associated with web pages, and further may relate to multiple sequence alignment of uniform resource identifiers. In one or more example embodiments, multiple sequence alignment techniques may provide improved tokenization of uniform resource identifiers associated with web pages, which may provide improved performance of applications such as, for example, uniform resource identifier normalization, sitemap construction, etc.
Type: Application
Filed: March 20, 2008
Publication date: September 24, 2009
Applicant: Yahoo! Inc.
Inventors: Mahesh Tiyyagura, Krishna Leela Poola
-
Publication number: 20090171986
Abstract: A decision tree may be determined that is a site map for a domain of web pages. A clustering of a plurality of web pages of a domain is determined, in an unsupervised fashion, based on content-related features of the plurality of web pages. Each determined cluster includes a plurality of web pages, each of the plurality of web pages characterized by a resource locator and each of the resource locators being characterized by at least one resource locator token. The clustering is processed to organize indications of the content-related features of the plurality of web pages into a decision tree characterized by a plurality of nodes, each node characterized by a feature and a value, the feature being at least one of the resource locator tokens and the value being a value of that resource locator token.
Type: Application
Filed: December 27, 2007
Publication date: July 2, 2009
Applicant: YAHOO! INC.
Inventors: Krishna Prasad Chitrapura, Pavan Kumar Ganganahalli Marulappa, Krishna Leela Poola, Mahesh Tiyyagura
-
Publication number: 20090157597
Abstract: Documents, such as web pages of a domain, are annotated to facilitate extracting structured information from the documents. The documents are clustered. Each cluster is such that the documents within that cluster are similar to each other at least with respect to a first threshold, such as according to a shingling metric, where the first threshold is an 8/8 shingling match. There is at least one overlap cluster, each overlap cluster including at least one of the plurality of clusters such that documents of the at least one cluster included in that overlap cluster are similar to each other at least with respect to a second threshold that is lower than the first threshold. A particular overlap cluster is designated, as is a particular cluster of the particular overlap cluster. For the particular designated cluster, an obtained annotation is transferred to other clusters included in the designated particular overlap cluster.
Type: Application
Filed: December 13, 2007
Publication date: June 18, 2009
Applicant: YAHOO! INC.
Inventor: Mahesh Tiyyagura
-
Publication number: 20090157607
Abstract: A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
Type: Application
Filed: December 12, 2007
Publication date: June 18, 2009
Applicant: YAHOO! INC.
Inventor: Mahesh TIYYAGURA
-
Publication number: 20090063538
Abstract: Techniques are described for normalizing dynamic URLs using a hierarchical organization of a web site. Given web pages associated with a web site, an information extraction method is used to generate data structures that represent the content or structure of each of the web pages. These data structures are appended to the corresponding dynamic URLs. The modified URLs with the data structures are tokenized with the resulting tokens clustered to create a hierarchical organization. Nodes of the hierarchical organization may be merged based upon occurrence or patterns of content and structure. The merged hierarchical organization may then be pruned to remove irrelevant information and to reduce the memory footprint of the hierarchical organization. When a new dynamic URL is received, the new dynamic URL is matched to the hierarchical organization. Important parameters are taken into account and irrelevant information may be removed.
Type: Application
Filed: August 30, 2007
Publication date: March 5, 2009
Inventors: Krishna Prasad CHITRAPURA, Anandsudhakar Kesari, Alok Kirpal, Mahesh Tiyyagura
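
Illustrative note on publication 20090063538 (dynamic URL normalization): the abstract describes tokenizing URLs, organizing the tokens hierarchically, and dropping irrelevant parameters when a new URL arrives. The Python sketch below is only a rough, simplified illustration of that general idea and is not the method claimed in the application. The function names (tokenize_url, build_hierarchy, normalize), the example URLs, and the ignored_params set are all hypothetical; the sketch omits the content-derived data structures, clustering, merging, and pruning steps the abstract mentions, and it assumes the set of irrelevant parameters is already known.

```python
from urllib.parse import urlsplit, parse_qsl


def tokenize_url(url):
    """Split a URL into ordered tokens: host, path segments, sorted key=value pairs."""
    parts = urlsplit(url)
    tokens = [parts.netloc]
    tokens += [seg for seg in parts.path.split("/") if seg]
    tokens += [f"{k}={v}" for k, v in sorted(parse_qsl(parts.query))]
    return tokens


def build_hierarchy(urls):
    """Arrange URL tokens into a nested-dict prefix tree (a stand-in for the
    'hierarchical organization'; no merging or pruning is done here)."""
    root = {}
    for url in urls:
        node = root
        for token in tokenize_url(url):
            node = node.setdefault(token, {})
    return root


def normalize(url, ignored_params):
    """Rebuild a URL without the query parameters assumed to be irrelevant."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in sorted(parse_qsl(parts.query))
            if k not in ignored_params]
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    query = "&".join(f"{k}={v}" for k, v in kept)
    return f"{base}?{query}" if query else base


if __name__ == "__main__":
    # Hypothetical URLs: two differ only in a session parameter.
    urls = [
        "http://example.com/shop/item?id=7&sid=abc",
        "http://example.com/shop/item?id=7&sid=xyz",
    ]
    print(tokenize_url(urls[0]))
    print(build_hierarchy(urls))
    print(normalize(urls[0], ignored_params={"sid"}))
```

In this sketch, URLs that differ only in an ignored parameter share every tree node except the leaf for that parameter, and they normalize to the same string.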