Patents by Inventor Andrew S. Tomkins

Andrew S. Tomkins has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10867337
    Abstract: A social environment is provided by creating an object in response to recognition of an entity in a portion of web content, wherein the object represents the entity, the object is associated with a type selected from a set of types, and the type is associated with a schema selected from a set of schemas, where the social environment includes a set of objects including the object, wherein the objects are instances of corresponding types in a rich system of predefined types, the schemas are associated with the types, metadata is associated with the objects, and there is at least one relationship between at least two objects selected from the set of objects, where the set of objects and the metadata are extensible, such that extensions provided by a first user are available for use by a second user. In one example, metadata provided by a first user is only available to a second user having a relationship with the first user.
    Type: Grant
    Filed: February 22, 2017
    Date of Patent: December 15, 2020
    Assignee: VERIZON MEDIA INC.
    Inventors: Andrew S. Tomkins, Cameron A. Marlow, Raghu Ramakrishnan, Shanmugasundaram Ravikumar
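The abstract above ends with an example in which metadata provided by one user is visible only to users who have a relationship with that user. A minimal Python sketch of that visibility rule follows. The class name, field names, sample data, and the `relationships` set of unordered user pairs are illustrative assumptions and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class SocialObject:
    """An object created for a recognized entity (hypothetical structure)."""
    entity: str
    type_name: str  # the type would select a schema in a fuller model
    # Each metadata entry is an (author, key, value) triple.
    metadata: list = field(default_factory=list)

    def annotate(self, author, key, value):
        """Attach user-provided metadata to this object."""
        self.metadata.append((author, key, value))

    def visible_metadata(self, viewer, relationships):
        """Return entries the viewer authored or that a related user authored."""
        return [
            (key, value)
            for author, key, value in self.metadata
            if author == viewer or frozenset((author, viewer)) in relationships
        ]


# Assumed relationship store: unordered pairs of related users.
relationships = {frozenset(("alice", "bob"))}

cafe = SocialObject(entity="Example Cafe", type_name="restaurant")
cafe.annotate("alice", "rating", 4)
cafe.annotate("carol", "rating", 2)
```

Here `cafe.visible_metadata("bob", relationships)` returns only the entry from alice, because bob has no relationship with carol.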
  • Publication number: 20170161816
    Abstract: A social environment is provided by creating an object in response to recognition of an entity in a portion of web content, wherein the object represents the entity, the object is associated with a type selected from a set of types, and the type is associated with a schema selected from a set of schemas, where the social environment includes a set of objects including the object, wherein the objects are instances of corresponding types in a rich system of predefined types, the schemas are associated with the types, metadata is associated with the objects, and there is at least one relationship between at least two objects selected from the set of objects, where the set of objects and the metadata are extensible, such that extensions provided by a first user are available for use by a second user. In one example, metadata provided by a first user is only available to a second user having a relationship with the first user.
    Type: Application
    Filed: February 22, 2017
    Publication date: June 8, 2017
    Inventors: Andrew S. Tomkins, Cameron A. Marlow, Raghu Ramakrishnan, Shanmugasundaram Ravikumar
  • Patent number: 9600800
    Abstract: A social environment is provided by creating an object in response to recognition of an entity in a portion of web content, wherein the object represents the entity, the object is associated with a type selected from a set of types, and the type is associated with a schema selected from a set of schemas, where the social environment includes a set of objects including the object, wherein the objects are instances of corresponding types in a rich system of predefined types, the schemas are associated with the types, metadata is associated with the objects, and there is at least one relationship between at least two objects selected from the set of objects, where the set of objects and the metadata are extensible, such that extensions provided by a first user are available for use by a second user. In one example, metadata provided by a first user is only available to a second user having a relationship with the first user.
    Type: Grant
    Filed: November 10, 2009
    Date of Patent: March 21, 2017
    Assignee: Yahoo! Inc.
    Inventors: Andrew S. Tomkins, Raghu Ramakrishnan, Shanmugasundaram Ravikumar
  • Patent number: 8745162
    Abstract: Method and system for presenting information on a user device are disclosed. The method includes collecting a plurality of data objects on the Internet, annotating each data object in the plurality of data objects in accordance with user-defined data and implicit data, wherein the user-defined data and implicit data form metadata associated with the plurality of data objects, creating correlations between the plurality of data objects using the metadata associated with the plurality of data objects, and presenting the plurality of data objects in multiple views on the user device simultaneously according to the correlations between the plurality of data objects.
    Type: Grant
    Filed: January 19, 2007
    Date of Patent: June 3, 2014
    Assignee: Yahoo! Inc.
    Inventors: Karon A. Weber, Andrew S. Tomkins, Reiner Kraft, Samantha M. Tripodi, Chetana Deorah
  • Patent number: 8600997
    Abstract: A system and method of indexing a plurality of entities located in a taxonomy, the entities comprising sets of terms, comprises receiving terms in an index structure; building a posting list for an entity with respect to the locations of the set of terms defining the entity and data associated with the respective terms; and indexing a name of a group comprising the entities within this group at the location of the entities with the data of the group comprising the name of the respective entity at each location. The building of the posting list comprises storing the location of the term and data associated with the term in an entry in the posting list for the term. The method comprises indexing aliases of the name of the group comprising the term, and using an inverted list index to associate data with each occurrence of an index term.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: December 3, 2013
    Assignee: International Business Machines Corporation
    Inventors: Nadav Eiron, Daniel N. Meredith, Joerg Meyer, Jan H. Pieper, Andrew S. Tomkins
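The posting-list idea in the abstract above (store each term's location and associated data, then index a group name at the locations of its members) can be sketched in Python as follows. This is only an illustration under assumptions: the function names, the `(doc_id, position, data)` entry layout, and the sample document are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def build_index(documents):
    """Map each term to a posting list of (doc_id, position, data) entries."""
    index = defaultdict(list)
    for doc_id, tokens in documents.items():
        for position, (term, data) in enumerate(tokens):
            index[term].append((doc_id, position, data))
    return index


def index_group_name(index, group_name, member_terms):
    """Index the group name wherever a member occurs.

    The data stored with each group entry is the name of the member
    found at that location.
    """
    for member in member_terms:
        # Copy the list so appending to the index cannot disturb iteration.
        for doc_id, position, _ in list(index.get(member, [])):
            index[group_name].append((doc_id, position, member))


# Hypothetical document: a sequence of (term, data) pairs.
documents = {"d1": [("ibm", None), ("hires", None), ("yahoo", None)]}
index = build_index(documents)
index_group_name(index, "company", ["ibm", "yahoo"])
```

A query for the group name `company` then finds both member occurrences, and each entry reports which member matched.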
  • Patent number: 8321396
    Abstract: A by-line extraction system detects a set of potential headlines from a title meta-tag of a crawled document, selects a candidate headline from the set of potential headlines, and extracts the by-line information from the document using the location of the selected candidate headline. The system constructs the set of potential headlines based on the title meta-tag. The system selects a candidate headline by evaluating the set of potential headlines in order of the lengths of the potential headlines. The system extracts the by-line information from the document by using the location of the selected candidate headline to extract a string representing a date, a name, or a source located within a minimum distance from the location of the potential headline.
    Type: Grant
    Filed: August 15, 2008
    Date of Patent: November 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Stephen Dill, Madhukar R. Korupolu, Andrew S. Tomkins
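The three steps in the abstract above (derive potential headlines from the title meta-tag, evaluate them in order of length, then look near the selected headline for by-line strings) can be sketched in Python. Several details are assumptions made for illustration: splitting the title on `|`, `-`, and `:` separators, trying the longest candidate first, searching only for a date, and using a 200-character window after the headline. The abstract specifies none of these.

```python
import re

# Assumed date format for the sketch: "Month D, YYYY".
DATE_RE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}\b"
)


def potential_headlines(title):
    """Build candidate headlines from the title, longest first."""
    parts = [p.strip() for p in re.split(r"\s[|\-:]\s", title)]
    candidates = {title.strip(), *parts}
    return sorted((c for c in candidates if c), key=len, reverse=True)


def extract_date(title, body, max_distance=200):
    """Find a date string shortly after the first candidate found in the body."""
    for headline in potential_headlines(title):
        pos = body.find(headline)
        if pos == -1:
            continue  # this candidate does not appear in the document
        start = pos + len(headline)
        match = DATE_RE.search(body[start:start + max_distance])
        return match.group(0) if match else None
    return None


# Hypothetical crawled document.
title = "Daily News | Storm Hits Coast"
body = "Daily News\nStorm Hits Coast\nBy Jane Doe, March 3, 2008\nRain fell all day."
date = extract_date(title, body)
```

The full title does not occur in the body, so the sketch falls through to the next-longest candidate, "Storm Hits Coast", and searches for a date after it.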
  • Publication number: 20120259890
    Abstract: In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.
    Type: Application
    Filed: June 18, 2012
    Publication date: October 11, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Matthew Denesuk, Daniel Frederick Gruhl, Sridhar Rajagopalan, Andrew S. Tomkins
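The abstract above describes data miners that apply rules to entities and append keys recording what they found, so that later requests can combine keys written by different miners. A small Python sketch of that pattern follows. The two miner functions, the dictionary layout of an entity, and the key strings are hypothetical choices for illustration only.

```python
def topic_miner(entity):
    """Hypothetical miner: tag entities whose text mentions crawling."""
    if "crawl" in entity["text"].lower():
        return ["topic:crawling"]
    return []


def length_miner(entity):
    """Hypothetical miner: tag entities by word count."""
    if len(entity["text"].split()) > 5:
        return ["length:long"]
    return ["length:short"]


def run_miners(entities, miners):
    """Let every miner append its keys to every entity."""
    for entity in entities:
        for miner in miners:
            entity.setdefault("keys", []).extend(miner(entity))
    return entities


def query(entities, required_keys):
    """Return entities that carry all of the requested keys."""
    wanted = set(required_keys)
    return [e for e in entities if wanted <= set(e.get("keys", []))]


entities = [
    {"id": 1, "text": "A crawler fetches pages"},
    {"id": 2, "text": "This longer note says nothing about that subject at all"},
]
run_miners(entities, [topic_miner, length_miner])
matches = query(entities, ["topic:crawling", "length:short"])
```

Because each miner only appends keys, miners written by different authors can be added or removed without changing the query code.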
  • Patent number: 8214391
    Abstract: In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.
    Type: Grant
    Filed: May 8, 2002
    Date of Patent: July 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Matthew Denesuk, Daniel Frederick Gruhl, Kevin Snow McCurley, Sridhar Rajagopalan, Andrew S. Tomkins
  • Patent number: 8041705
    Abstract: A system and method of crawling at least one website comprising at least one URL includes maintaining a lookup structure comprising all of the URLs known to be on a website; calculating a hub score for each webpage of the website to be recrawled, wherein the hub score measures how likely the to be recrawled webpage includes links to fresh content published on the website; sorting all the to be recrawled pages by their hub scores; and crawling the to be recrawled pages in order from highest hub scores to lowest hub scores. The calculating comprises computing a first value equaling a percentage of a number of new relative URLs on the to be recrawled page; computing a second value equaling a percentage of a previous hub score of the to be recrawled page; and computing the hub score as a sum of the first and the second values.
    Type: Grant
    Filed: January 5, 2009
    Date of Patent: October 18, 2011
    Assignee: International Business Machines Corporation
    Inventors: Srinivasan Balasubramanian, Michael Ching, Piyoosh Jalan, Satish C. Penmetsa, Andrew S. Tomkins
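The hub-score calculation in the abstract above (a percentage of the number of new relative URLs plus a percentage of the previous hub score, then crawling in descending score order) can be sketched in Python. The two weights of 0.5, the function names, and the sample pages are assumptions for illustration; the abstract does not state specific percentages.

```python
def hub_score(new_url_count, previous_score, new_weight=0.5, history_weight=0.5):
    """Sum a fraction of the new-URL count and a fraction of the old score.

    Both weights are assumed values, not figures from the patent.
    """
    return new_weight * new_url_count + history_weight * previous_score


def recrawl_order(pages):
    """Return page URLs sorted from highest hub score to lowest.

    `pages` maps a URL to (number of new relative URLs, previous hub score).
    """
    scores = {
        url: hub_score(new_count, previous)
        for url, (new_count, previous) in pages.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical pages awaiting recrawl.
pages = {
    "/news": (8, 2.0),   # score 0.5 * 8 + 0.5 * 2.0 = 5.0
    "/about": (0, 1.0),  # score 0.5 * 0 + 0.5 * 1.0 = 0.5
    "/blog": (3, 6.0),   # score 0.5 * 3 + 0.5 * 6.0 = 4.5
}
order = recrawl_order(pages)
```

Carrying part of the previous score forward means a page that has often linked to fresh content keeps some priority even after one quiet crawl.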
  • Patent number: 7970895
    Abstract: A communication pattern inducing system focuses on the propagation of topics amongst a plurality of nodes based on the text of the node rather than hyperlinks of the node. A node could represent a weblog or any other source of information such as person, a conversation, images, etc. The system utilizes a model for information diffusion, wherein the parameters of the model capture how a new topic spreads from node to node. The system further comprises a process to learn the parameters of the model based on real data and to apply the process to real (or synthetic) node data. Consequently, the system is able to identify particular individuals that are highly effective at contributing to the spread of topics.
    Type: Grant
    Filed: August 14, 2008
    Date of Patent: June 28, 2011
    Assignee: International Business Machines Corporation
    Inventors: Daniel Frederick Gruhl, Ramanathan Vaidhyanath Guha, Andrew S. Tomkins
  • Publication number: 20100281104
    Abstract: A social environment is provided by creating an object in response to recognition of an entity in a portion of web content, wherein the object represents the entity, the object is associated with a type selected from a set of types, and the type is associated with a schema selected from a set of schemas, where the social environment includes a set of objects including the object, wherein the objects are instances of corresponding types in a rich system of predefined types, the schemas are associated with the types, metadata is associated with the objects, and there is at least one relationship between at least two objects selected from the set of objects, where the set of objects and the metadata are extensible, such that extensions provided by a first user are available for use by a second user. In one example, metadata provided by a first user is only available to a second user having a relationship with the first user.
    Type: Application
    Filed: November 10, 2009
    Publication date: November 4, 2010
    Applicant: YAHOO! INC.
    Inventors: Andrew S. Tomkins, Cameron A. Marlow, Raghu Ramakrishnan, Shanmugasundaram Ravikumar
  • Patent number: 7725346
    Abstract: A sales prediction system predicts sales from online public discussions. The system utilizes manually or automatically formulated predicates to capture subsets of postings in online public discussions. The system predicts spikes in sales rank based on online chatter. The system comprises automated algorithms that predict spikes in sales rank given a time series of counts of online discussions such as blog postings. The system utilizes a stateless model of customer behavior based on a series of states of excitation that are increasingly likely to lead to a purchase decision. The stateless model of customer behavior yields a predictor of sales rank spikes that is significantly more accurate than conventional techniques operating on sales rank data alone.
    Type: Grant
    Filed: July 27, 2005
    Date of Patent: May 25, 2010
    Assignee: International Business Machines Corporation
    Inventors: Daniel Frederick Gruhl, Ramanathan Vaidhyanath Guha, Jasmine Novak, Shanmugasundaram Ravikumar, Andrew S. Tomkins
  • Publication number: 20090119291
    Abstract: A system and method of crawling at least one website comprising at least one URL includes maintaining a lookup structure comprising all of the URLs known to be on a website; calculating a hub score for each webpage of the website to be recrawled, wherein the hub score measures how likely the to be recrawled webpage includes links to fresh content published on the website; sorting all the to be recrawled pages by their hub scores; and crawling the to be recrawled pages in order from highest hub scores to lowest hub scores. The calculating comprises computing a first value equaling a percentage of a number of new relative URLs on the to be recrawled page; computing a second value equaling a percentage of a previous hub score of the to be recrawled page; and computing the hub score as a sum of the first and the second values.
    Type: Application
    Filed: January 5, 2009
    Publication date: May 7, 2009
    Applicant: International Business Machines Corporation
    Inventors: Srinivasan Balasubramanian, Michael Ching, Piyoosh Jalan, Satish C. Penmetsa, Andrew S. Tomkins
  • Patent number: 7496557
    Abstract: A system and method of crawling at least one website comprising at least one URL includes maintaining a lookup structure comprising all of the URLs known to be on a website; calculating a hub score for each webpage of the website to be recrawled, wherein the hub score measures how likely the to be recrawled webpage includes links to fresh content published on the website; sorting all the to be recrawled pages by their hub scores; and crawling the to be recrawled pages in order from highest hub scores to lowest hub scores. The calculating comprises computing a first value equaling a percentage of a number of new relative URLs on the to be recrawled page; computing a second value equaling a percentage of a previous hub score of the to be recrawled page; and computing the hub score as a sum of the first and the second values.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: February 24, 2009
    Assignee: International Business Machines Corporation
    Inventors: Srinivasan Balasubramanian, Michael Ching, Piyoosh Jalan, Satish C. Penmetsa, Andrew S. Tomkins
  • Publication number: 20080306941
    Abstract: A by-line extraction system detects a set of potential headlines from a title meta-tag of a crawled document, selects a candidate headline from the set of potential headlines, and extracts the by-line information from the document using the location of the selected candidate headline. The system constructs the set of potential headlines based on the title meta-tag. The system selects a candidate headline by evaluating the set of potential headlines in order of the lengths of the potential headlines. The system extracts the by-line information from the document by using the location of the selected candidate headline to extract a string representing a date, a name, or a source located within a minimum distance from the location of the potential headline.
    Type: Application
    Filed: August 15, 2008
    Publication date: December 11, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Stephen Dill, Madhukar R. Korupolu, Andrew S. Tomkins
  • Publication number: 20080307326
    Abstract: A communication pattern inducing system focuses on the propagation of topics amongst a plurality of nodes based on the text of the node rather than hyperlinks of the node. A node could represent a weblog or any other source of information such as person, a conversation, images, etc. The system utilizes a model for information diffusion, wherein the parameters of the model capture how a new topic spreads from node to node. The system further comprises a process to learn the parameters of the model based on real data and to apply the process to real (or synthetic) node data. Consequently, the system is able to identify particular individuals that are highly effective at contributing to the spread of topics.
    Type: Application
    Filed: August 14, 2008
    Publication date: December 11, 2008
    Applicant: International Business Machines
    Inventors: Daniel Frederick Gruhl, Ramanathan Vaidhyanath Guha, Andrew S. Tomkins
  • Patent number: 7464078
    Abstract: A by-line extraction method detects a set of potential headlines from a title meta-tag of a crawled document, selects a candidate headline from the set of potential headlines, and extracts the by-line information from the document using the location of the selected candidate headline. The method constructs the set of potential headlines based on the title meta-tag. The method selects a candidate headline by evaluating the set of potential headlines in order of the lengths of the potential headlines. The method extracts the by-line information from the document by using the location of the selected candidate headline to extract a string representing a date, a name, or a source located within a minimum distance from the location of the potential headline.
    Type: Grant
    Filed: October 25, 2005
    Date of Patent: December 9, 2008
    Assignee: International Business Machines Corporation
    Inventors: Stephen Dill, Madhukar R. Korupolu, Andrew S. Tomkins
  • Patent number: 7426557
    Abstract: A communication pattern inducing system focuses on the propagation of topics amongst a plurality of nodes based on the text of the node rather than hyperlinks of the node. A node could represent a weblog or any other source of information such as person, a conversation, images, etc. The system utilizes a model for information diffusion, wherein the parameters of the model capture how a new topic spreads from node to node. The system further comprises a process to learn the parameters of the model based on real data and to apply the process to real (or synthetic) node data. Consequently, the system is able to identify particular individuals that are highly effective at contributing to the spread of topics.
    Type: Grant
    Filed: May 14, 2004
    Date of Patent: September 16, 2008
    Assignee: International Business Machines Corporation
    Inventors: Daniel Frederick Gruhl, Ramanathan Vaidhyanath Guha, Andrew S. Tomkins
  • Publication number: 20080052372
    Abstract: Method and system for presenting information on a user device are disclosed. The method includes collecting a plurality of data objects on the Internet, annotating each data object in the plurality of data objects in accordance with user-defined data and implicit data, wherein the user-defined data and implicit data form metadata associated with the plurality of data objects, creating correlations between the plurality of data objects using the metadata associated with the plurality of data objects, and presenting the plurality of data objects in multiple views on the user device simultaneously according to the correlations between the plurality of data objects.
    Type: Application
    Filed: January 19, 2007
    Publication date: February 28, 2008
    Applicant: Yahoo! Inc.
    Inventors: Karon A. Weber, Andrew S. Tomkins, Reiner Kraft, Samantha M. Tripodi, Chetana Deorah
  • Patent number: 7281022
    Abstract: A topic segmenting system segments a topic into chatter and subtopics. The system decomposes a conversation into topics, producing a time-based structure for topics and subtopics in the conversation. The system extracts a large number of topics at all levels of granularity. Some of the topics extracted correspond to broad topics and some correspond to “spiky” topics or subtopics. The system comprises a process for automatically detecting spiky regions of a topic. For each possible broad topic, the present system finds regions where coverage of the broad topic overlaps significantly with the spiky region of another topic. The system then removes the spiky subtopic from the conversation. Processing is repeated until all discernable topics have been identified and removed from the conversation, yielding random topics of little duration or intensity.
    Type: Grant
    Filed: May 15, 2004
    Date of Patent: October 9, 2007
    Assignee: International Business Machines Corporation
    Inventors: Daniel Frederick Gruhl, Ramanathan Vaidhyanath Guha, Andrew S. Tomkins
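The abstract for patent 7281022 mentions automatically detecting "spiky" regions of a topic but does not say how. One simple way to illustrate the idea is to flag runs of time steps whose mention count exceeds the mean by some multiple of the standard deviation. The Python sketch below uses that threshold rule purely as an assumption; it is not the patented method, and the function name, the parameter `k`, and the sample counts are hypothetical.

```python
from statistics import mean, pstdev


def spiky_regions(counts, k=1.0):
    """Return (start, end) index pairs for runs of counts above the threshold.

    The threshold mean + k * standard deviation is an assumed rule.
    """
    threshold = mean(counts) + k * pstdev(counts)
    regions, start = [], None
    for i, count in enumerate(counts):
        if count > threshold and start is None:
            start = i  # a spike begins
        elif count <= threshold and start is not None:
            regions.append((start, i - 1))  # the spike ended on the previous step
            start = None
    if start is not None:
        regions.append((start, len(counts) - 1))
    return regions


# Hypothetical daily mention counts for one topic.
daily_mentions = [2, 3, 2, 20, 25, 3, 2, 2]
regions = spiky_regions(daily_mentions)
```

With these counts the threshold is about 16.2, so days 3 and 4 form the only spiky region.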