Patents by Inventor Andrew Tomkins

Andrew Tomkins has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for selecting object metadata evolving over time

Publication number: 20070271270

Abstract: An improved system and method for selecting and visualizing object metadata evolving over time is provided. An application may generate a visualization depicting the temporal evolution of metadata describing objects in an object store over a plurality of time intervals. The application may switch between a visualization of object metadata flowing like a river or cascading like a waterfall over time. A ranked list of metadata items may be determined for some pre-selected intervals during a pre-processing step. Then at runtime when a request may be received for providing a ranked list of metadata items for a query interval, a combination of time intervals from the pre-selected time intervals may be determined that cover the query time interval, and the ranked lists of metadata items for each time interval in the combination of time intervals that cover the query time interval may be aggregated and output for visualization.

Type: Application

Filed: May 19, 2006

Publication date: November 22, 2007

Applicant: Yahoo! Inc.

Inventors: Micah Joel Dubinko, Shanmugasundaram Ravikumar, Joseph Andrew Magnani, Jasmine Novak, Prabhakar Raghavan, Andrew Tomkins
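The runtime step described in this abstract (cover the query interval with pre-selected intervals, then aggregate their ranked lists) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the application: the `PRECOMPUTED` table, the half-open day ranges, the greedy longest-first cover, and aggregation by summing scores.

```python
from collections import Counter

# Hypothetical pre-processing output: tag scores for pre-selected intervals,
# keyed by half-open (start, end) day ranges. All values are illustrative.
PRECOMPUTED = {
    (0, 4): Counter({"beach": 9, "wedding": 5}),
    (0, 2): Counter({"beach": 6, "wedding": 1}),
    (2, 4): Counter({"beach": 3, "wedding": 4}),
    (4, 6): Counter({"snow": 7, "wedding": 1}),
    (6, 8): Counter({"snow": 4, "party": 3}),
}


def cover(query_start, query_end, intervals):
    """Greedily pick stored intervals that tile [query_start, query_end)."""
    chosen, position = [], query_start
    while position < query_end:
        # Among intervals that start here and do not overshoot the query end,
        # take the longest one (a simplification; it can miss some tilings).
        candidates = [iv for iv in intervals
                      if iv[0] == position and iv[1] <= query_end]
        if not candidates:
            raise ValueError(f"no pre-selected interval starts at {position}")
        best = max(candidates, key=lambda iv: iv[1])
        chosen.append(best)
        position = best[1]
    return chosen


def ranked_metadata(query_start, query_end, top_k=3):
    """Aggregate the per-interval ranked lists into one ranked list."""
    total = Counter()
    for interval in cover(query_start, query_end, PRECOMPUTED):
        total.update(PRECOMPUTED[interval])  # sums scores per metadata item
    return total.most_common(top_k)
```

Calling `ranked_metadata(0, 6)` combines the stored lists for days 0 to 4 and 4 to 6 instead of rescanning the underlying objects, which is the point of the pre-processing step.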
System and method using flat clustering for evolutionary clustering of sequential data sets

Publication number: 20070255684

Abstract: An improved system and method for evolutionary clustering of sequential data sets is provided. A snapshot cost may be determined for representing the data set for a particular clustering method used and may determine the cost of clustering the data set independently of a series of clusterings of the data sets in the sequence. A history cost may also be determined for measuring the distance between corresponding clusters of the data set and the previous data set in the sequence of data sets to determine a cost of clustering the data set as part of a series of clusterings of the data sets in the sequence. An overall cost may be determined for clustering the data set by minimizing the combination of the snapshot cost and the history cost. Any clustering method may be used, including flat clustering and hierarchical clustering.

Type: Application

Filed: April 29, 2006

Publication date: November 1, 2007

Applicant: Yahoo! Inc.

Inventors: Deepayan Chakrabarti, Shanmugasundaram Ravikumar, Andrew Tomkins
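The cost structure this abstract describes (a snapshot cost plus a history cost, minimized together) can be sketched as follows. This is only an illustration under stated assumptions, not the method claimed in the application: clusters are represented by centroids, the snapshot cost is a k-means-style sum of squared distances, each cluster "corresponds" to its nearest previous centroid, and `history_weight` is a hypothetical trade-off parameter.

```python
import math


def snapshot_cost(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points)


def history_cost(centroids, previous_centroids):
    """Sum of distances from each centroid to its nearest previous centroid."""
    return sum(min(math.dist(c, q) for q in previous_centroids)
               for c in centroids)


def overall_cost(points, centroids, previous_centroids, history_weight=1.0):
    """Combine both costs; history_weight is an assumed trade-off knob."""
    return (snapshot_cost(points, centroids)
            + history_weight * history_cost(centroids, previous_centroids))


def pick_clustering(points, candidates, previous_centroids, history_weight=1.0):
    """Choose the candidate set of centroids with the lowest overall cost."""
    return min(candidates,
               key=lambda c: overall_cost(points, c, previous_centroids,
                                          history_weight))


# Four corners of a square can be split horizontally or vertically with the
# same snapshot cost; the history cost breaks the tie toward the split that
# matches the previous time step.
points = [(0, 0), (2, 0), (0, 2), (2, 2)]
horizontal = [(1, 0), (1, 2)]
vertical = [(0, 1), (2, 1)]
previous = [(0, 1), (2, 1)]
chosen = pick_clustering(points, [horizontal, vertical], previous)
```

Here `chosen` is the vertical split: both candidates fit the current data equally well, so the clustering that stays closest to the previous one wins.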
System and method using hierarchical clustering for evolutionary clustering of sequential data sets

Publication number: 20070255736

Abstract: An improved system and method for evolutionary clustering of sequential data sets is provided. A snapshot cost may be determined for representing the data set for a particular clustering method used and may determine the cost of clustering the data set independently of a series of clusterings of the data sets in the sequence. A history cost may also be determined for measuring the distance between corresponding clusters of the data set and the previous data set in the sequence of data sets to determine a cost of clustering the data set as part of a series of clusterings of the data sets in the sequence. An overall cost may be determined for clustering the data set by minimizing the combination of the snapshot cost and the history cost. Any clustering method may be used, including flat clustering and hierarchical clustering.

Type: Application

Filed: April 29, 2006

Publication date: November 1, 2007

Applicant: Yahoo! Inc.

Inventors: Deepayan Chakrabarti, Shanmugasundaram Ravikumar, Andrew Tomkins

System and method for evolutionary clustering of sequential data sets

Publication number: 20070255737

Abstract: An improved system and method for evolutionary clustering of sequential data sets is provided. A snapshot cost may be determined for representing the data set for a particular clustering method used and may determine the cost of clustering the data set independently of a series of clusterings of the data sets in the sequence. A history cost may also be determined for measuring the distance between corresponding clusters of the data set and the previous data set in the sequence of data sets to determine a cost of clustering the data set as part of a series of clusterings of the data sets in the sequence. An overall cost may be determined for clustering the data set by minimizing the combination of the snapshot cost and the history cost. Any clustering method may be used, including flat clustering and hierarchical clustering.

Type: Application

Filed: April 29, 2006

Publication date: November 1, 2007

Applicant: Yahoo! Inc.

Inventors: Deepayan Chakrabarti, Shanmugasundaram Ravikumar, Andrew Tomkins

System and method for automatically extracting by-line information

Publication number: 20070094232

Abstract: A by-line extraction system detects a set of potential headlines from a title meta-tag of a crawled document, selects a candidate headline from the set of potential headlines, and extracts the by-line information from the document using the location of the selected candidate headline. The system constructs the set of potential headlines based on the title meta-tag. The system selects a candidate headline by evaluating the set of potential headlines in order of the lengths of the potential headlines. The system extracts the by-line information from the document by using the location of the selected candidate headline to extract a string representing a date, a name, or a source located within a minimum distance from the location of the potential headline.

Type: Application

Filed: October 25, 2005

Publication date: April 26, 2007

Inventors: Stephen Dill, Madhukar Korupolu, Andrew Tomkins

Method and framework to support indexing and searching taxonomies in large scale full text indexes

Publication number: 20070078880

Abstract: A system and method of indexing a plurality of entities located in a taxonomy, the entities comprising sets of terms, comprises receiving terms in an index structure; building a posting list for an entity with respect to the locations of the set of terms defining the entity and data associated with the respective terms; and indexing a name of a group comprising the entities within this group at the location of the entities with the data of the group comprising the name of the respective entity at each location. The building of the posting list comprises storing the location of the term and data associated with the term in an entry in the posting list for the term. The method comprises indexing aliases of the name of the group comprising the term, and using an inverted list index to associate data with each occurrence of an index term.

Type: Application

Filed: September 30, 2005

Publication date: April 5, 2007

Applicant: International Business Machines Corporation

Inventors: Nadav Eiron, Daniel Meredith, Joerg Meyer, Jan Pieper, Andrew Tomkins

Microhubs and its applications

Publication number: 20070078811

Abstract: A system and method of crawling at least one website comprising at least one URL includes maintaining a lookup structure comprising all of the URLs known to be on a website; calculating a hub score for each webpage of the website to be recrawled, wherein the hub score measures how likely the to be recrawled webpage includes links to fresh content published on the website; sorting all the to be recrawled pages by their hub scores; and crawling the to be recrawled pages in order from highest hub scores to lowest hub scores. The calculating comprises computing a first value equaling a percentage of a number of new relative URLs on the to be recrawled page; computing a second value equaling a percentage of a previous hub score of the to be recrawled page; and computing the hub score as a sum of the first and the second values.

Type: Application

Filed: September 30, 2005

Publication date: April 5, 2007

Applicant: International Business Machines Corporation

Inventors: Srinivasan Balasubramanian, Michael Ching, Piyoosh Jalan, Satish Penmetsa, Andrew Tomkins
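The hub-score calculation in this abstract (a share of the page's new URLs plus a share of its previous hub score, then crawling in descending score order) can be sketched roughly as below. The weights `new_weight` and `history_weight`, the reading of "percentage of a number of new relative URLs" as a weighted fraction of new links, and the input layout are all assumptions for illustration; the abstract does not specify them.

```python
def hub_score(page_urls, known_urls, previous_score,
              new_weight=0.5, history_weight=0.5):
    """Weighted new-URL fraction plus weighted previous hub score."""
    if not page_urls:
        new_fraction = 0.0
    else:
        # A URL counts as new if it is missing from the site's lookup structure.
        new_count = sum(1 for url in page_urls if url not in known_urls)
        new_fraction = new_count / len(page_urls)
    return new_weight * new_fraction + history_weight * previous_score


def recrawl_order(pages, known_urls):
    """pages maps a page URL to (links found on it, its previous hub score)."""
    scores = {page: hub_score(links, known_urls, previous)
              for page, (links, previous) in pages.items()}
    # Highest hub score first, as the abstract describes.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical site state, for illustration only.
known = {"/a", "/b"}
pages = {
    "/index": (["/a", "/c", "/d", "/b"], 0.2),
    "/archive": (["/a", "/b"], 0.6),
    "/news": (["/e", "/f"], 0.0),
}
order = recrawl_order(pages, known)
```

With these inputs `/news` is crawled first because every link on it is new, while `/archive` ranks last despite its high previous score, since it currently exposes no unseen URLs.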
System, service, and method for predicting sales from online public discussions

Publication number: 20070027741

Abstract: A sales prediction system predicts sales from online public discussions. The system utilizes manually or automatically formulated predicates to capture subsets of postings in online public discussions. The system predicts spikes in sales rank based on online chatter. The system comprises automated algorithms that predict spikes in sales rank given a time series of counts of online discussions such as blog postings. The system utilizes a stateless model of customer behavior based on a series of states of excitation that are increasingly likely to lead to a purchase decision. The stateless model of customer behavior yields a predictor of sales rank spikes that is significantly more accurate than conventional techniques operating on sales rank data alone.

Type: Application

Filed: July 27, 2005

Publication date: February 1, 2007

Inventors: Daniel Gruhl, Ramanathan Guha, Jasmine Novak, Shanmugasundaram Ravikumar, Andrew Tomkins

System and method for performing a high-level multi-dimensional query on a multi-structural database

Publication number: 20060282411

Abstract: A multi-structural query system performs a high-level multi-dimensional query on a multi-structural database. The query system enables a user to navigate a search by adding restrictions incrementally. The query system uses a schema to discover structure in a multi-structural database. The query system leaves a choice of nodes to return in response to a query as a constrained set of choices available to the algorithm. The query system further casts the selection of a set of nodes as an optimization. The query system uses pairwise-disjoint collections to capture a concise set of highlights of a data set within the allowed schema. The query system further comprises efficient algorithms that yield approximately optimal solutions for several classes of objective functions.

Type: Application

Filed: June 13, 2005

Publication date: December 14, 2006

Inventors: Ronald Fagin, Ramanathan Guha, Phokion Kolaitis, Jasmine Novak, Shanmugasundaram Ravikumar, Dandapani Sivakumar, Andrew Tomkins

Annotation of inverted list text indexes using search queries

Publication number: 20060248037

Abstract: A system and method of data mining comprises processing contents of a primary posting index; and producing a posting within a secondary posting index based on the processing of the contents of the primary posting index, wherein the processing of contents of the primary posting index comprises submitting a disjunction of terms or phrases to the primary posting index. The processing of contents of the primary posting index comprises generating a query result by submitting a query to the primary posting index using a query language of the primary posting index. Moreover, the processing of contents of the primary posting index comprises processing the primary posting index in order to generate results, wherein the results comprise a set of candidate entries with additional metadata; and filtering the results in order to produce the posting within the secondary posting index.

Type: Application

Filed: April 29, 2005

Publication date: November 2, 2006

Applicant: International Business Machines Corporation

Inventors: Joerg Meyer, Jan Pieper, Andrew Tomkins

Methods and apparatus for assessing web page decay

Publication number: 20060112089

Abstract: Systems and methods are herein disclosed for assessing the staleness of a web page. In particular, in one method of the present invention, the staleness of a web page is assessed by examining internal date references within the web page. In another method of the present invention, the staleness of a web page is assessed by examining the meta-data associated with the web page. In a further method of the present invention, the staleness of a hyperlinked web page is determined by examining the link status of the hyperlinks. If the web page has a relatively large number of dead links, it is assessed as being a stale web page. In a still further method of the present invention, the link status of web pages in the neighborhood of the web page being assessed is likewise examined.

Type: Application

Filed: November 22, 2004

Publication date: May 25, 2006

Inventors: Andrei Broder, Ziv Bar-Yossef, Shanmugasundaram Ravikumar, Andrew Tomkins
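The dead-link test mentioned in this abstract can be illustrated with a small sketch. The abstract only says a page with "a relatively large number of dead links" is assessed as stale; treating HTTP status codes of 400 and above as dead, and the `threshold` of 0.3, are assumptions chosen for the example.

```python
def dead_link_fraction(link_statuses):
    """Fraction of a page's outgoing links whose status code marks them dead."""
    statuses = list(link_statuses)
    if not statuses:
        return 0.0
    # Assumption: any 4xx or 5xx response counts as a dead link.
    dead = sum(1 for status in statuses if status >= 400)
    return dead / len(statuses)


def is_stale(link_statuses, threshold=0.3):
    """Flag a page as stale when too many of its links are dead."""
    return dead_link_fraction(link_statuses) > threshold
```

For example, `is_stale([200, 404, 200, 410])` is true because half of the links are dead, while `is_stale([200, 200, 301])` is false. The same fraction could be averaged over neighboring pages to approximate the neighborhood check the abstract mentions.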
System, method, and service for inducing a pattern of communication among various parties

Publication number: 20050256949

Abstract: A communication pattern inducing system focuses on the propagation of topics amongst a plurality of nodes based on the text of the node rather than hyperlinks of the node. A node could represent a weblog or any other source of information such as a person, a conversation, images, etc. The system utilizes a model for information diffusion, wherein the parameters of the model capture how a new topic spreads from node to node. The system further comprises a process to learn the parameters of the model based on real data and to apply the process to real (or synthetic) node data. Consequently, the system is able to identify particular individuals that are highly effective at contributing to the spread of topics.

Type: Application

Filed: May 14, 2004

Publication date: November 17, 2005

Applicant: International Business Machines Corporation

Inventors: Daniel Gruhl, Ramanathan Guha, Andrew Tomkins

System, method, and service for segmenting a topic into chatter and subtopics

Publication number: 20050256905

Abstract: A topic segmenting system segments a topic into chatter and subtopics. The system decomposes a conversation into topics, producing a time-based structure for topics and subtopics in the conversation. The system extracts a large number of topics at all levels of granularity. Some of the topics extracted correspond to broad topics and some correspond to “spiky” topics or subtopics. The system comprises a process for automatically detecting spiky regions of a topic. For each possible broad topic, the present system finds regions where coverage of the broad topic overlaps significantly with the spiky region of another topic. The system then removes the spiky subtopic from the conversation. Processing is repeated until all discernable topics have been identified and removed from the conversation, yielding random topics of little duration or intensity.

Type: Application

Filed: May 15, 2004

Publication date: November 17, 2005

Applicant: International Business Machines Corporation

Inventors: Daniel Gruhl, Ramanathan Guha, Andrew Tomkins

System, method, and service for finding an optimal collection of paths among a plurality of paths between two nodes in a complex network

Publication number: 20050243736

Abstract: An optimal path selection system extracts a connection subgraph in real time from an undirected, edge-weighted graph such as a social network that best captures the connections between two nodes of the graph. The system models the undirected, edge-weighted graph as an electrical circuit and solves for a relationship between two nodes in the undirected edge-weighted graph based on electrical analogues in the electric graph model. The system optionally accelerates the computations to produce approximate, high-quality connection subgraphs in real time on very large (disk resident) graphs. The connection subgraph is constrained to the integer budget that comprises a first node, a second node and a collection of paths from the first node to the second node that maximizes a “goodness” function g(H). The goodness function g(H) is tailored to capture salient aspects of a relationship between the first node and the second node.

Type: Application

Filed: April 19, 2004

Publication date: November 3, 2005

Applicant: International Business Machines Corporation

Inventors: Christos Faloutsos, Kevin Snow McCurley, Andrew Tomkins

Border-less clock free two-dimensional barcode and method for printing and reading the same

Patent number: 6418244

Abstract: Inventive two-dimensional barcodes, each having encoded digital information in a bitmap representing preferably randomized encoded data bits, are printed onto a printed medium. Preferably, error correction codes are added to the digital information to ensure that the decoding process accurately reproduces the digital information. In one embodiment, the bitmap may further include “anchor” bits in each corner, which are used as part of the skew estimation and deskewing processes during decoding. In a second embodiment, no “anchor” bits are required. The encoded digital information is mapped into the two-dimensional barcode in such a way as to minimize the errors caused by damage to particular rows and/or columns, for example, row damage caused by faxing the printed barcode. To extract the encoded digital information from the printed medium, the printed medium is scanned, then the bitmap is located within the printed medium.

Type: Grant

Filed: January 23, 2001

Date of Patent: July 9, 2002

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Jiangying Zhou, Daniel P. Lopresti, Andrew Tomkins

Method for interactively creating an information database including preferred information elements, such as preferred-authority, world wide web pages

Patent number: 6356899

Abstract: A method for identifying, filtering, ranking and cataloging information elements, as for example, World Wide Web pages of the Internet, in whole, part, or in combination. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred World Wide Web pages in whole, part, or in combination. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.

Type: Grant

Filed: March 3, 1999

Date of Patent: March 12, 2002

Assignee: International Business Machines Corporation

Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins

Method for interactively creating an information database including preferred information elements, such as, preferred-authority, world wide web pages

Patent number: 6336112

Abstract: A method for cataloging, filtering and ranking information, as for example, World Wide Web pages of the Internet. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred-authority World Wide Web pages. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.

Type: Grant

Filed: March 16, 2001

Date of Patent: January 1, 2002

Assignee: International Business Machines Corporation

Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins

Method for cataloging, filtering, and relevance ranking frame-based hierarchical information structures

Patent number: 6334131

Abstract: A method for cataloging, filtering and ranking information, as for example, World Wide Web pages of the Internet. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred-authority World Wide Web pages. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.

Type: Grant

Filed: August 29, 1998

Date of Patent: December 25, 2001

Assignee: International Business Machines Corporation

Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins

Method for interactively creating an information database including preferred information elements, such as preferred authority, world

Publication number: 20010039544

Abstract: A method for cataloging, filtering and ranking information, as for example, World Wide Web pages of the Internet. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred-authority World Wide Web pages. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.

Type: Application

Filed: August 29, 1998

Publication date: November 8, 2001

Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins

Method for interactively creating an information database including preferred information elements, such as, preferred-authority, world wide web pages

Publication number: 20010016846

Abstract: A method for cataloging, filtering and ranking information, as for example, World Wide Web pages of the Internet. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred-authority World Wide Web pages. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.

Type: Application

Filed: March 16, 2001

Publication date: August 23, 2001

Applicant: International Business Machines Corp.

Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins
