Patents by Inventor Soumen Chakrabarti

Soumen Chakrabarti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9996530
    Abstract: Systems, methods, and computer readable media related to determining whether a compound is a non-compositional noun compound (“NCC”). Some implementations are additionally or alternatively directed to using determined NCCs to adapt performance of one or more computer-based actions such as indexing or otherwise annotating electronic resources (e.g., web pages or other Internet resources), processing search queries, identifying and/or ranking electronic resources in response to search queries, identifying and/or ranking search query suggestions for search queries, etc.
    Type: Grant
    Filed: March 30, 2017
    Date of Patent: June 12, 2018
    Assignee: GOOGLE LLC
    Inventors: Mohammad Mehdi Hafezi Manshadi, Iftekhar Naim, Soumen Chakrabarti
  • Patent number: 8447766
    Abstract: Disclosed is a method of querying a collection of electronic documents, comprising defining a query for retrieving a numerical answer, said query comprising one or more search terms and a tolerance for said numerical answer; defining a set of document portions from said collection, each document portion in said set being extracted from an electronic document and comprising at least one term relevant to at least one of the one or more search terms and a numerical value associated with the at least one term; ordering the associated numerical values contained in said set; defining a plurality of results groups, each results group comprising an interval of ordered numerical values, each interval having a range not exceeding the tolerance; ranking the results groups; and returning at least the interval of the highest ranked results group as a response to said query. A computer program product for executing this method on a computer processor and a server are also disclosed.
    Type: Grant
    Filed: April 26, 2010
    Date of Patent: May 21, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Somnath Banerjee, Soumen Chakrabarti, Ganesh Ramakrishnan
  • Publication number: 20110264670
    Abstract: Disclosed is a method of querying a collection of electronic documents, comprising defining a query for retrieving a numerical answer, said query comprising one or more search terms and a tolerance for said numerical answer; defining a set of document portions from said collection, each document portion in said set being extracted from an electronic document and comprising at least one term relevant to at least one of the one or more search terms and a numerical value associated with the at least one term; ordering the associated numerical values contained in said set; defining a plurality of results groups, each results group comprising an interval of ordered numerical values, each interval having a range not exceeding the tolerance; ranking the results groups; and returning at least the interval of the highest ranked results group as a response to said query. A computer program product for executing this method on a computer processor and a server are also disclosed.
    Type: Application
    Filed: April 26, 2010
    Publication date: October 27, 2011
    Inventors: Somnath Banerjee, Soumen Chakrabarti, Ganesh Ramakrishnan
  • Patent number: 7266762
    Abstract: A Web server stores a table of Web page inlinks. When a Web page is accessed and a user wants to access other pages related to the accessed page, the user requests the table of inlinks, and from it generates a list of sibling links to the accessed page, the sibling links being outlinks of one or more of the inlinks in the table.
    Type: Grant
    Filed: March 10, 2000
    Date of Patent: September 4, 2007
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Kevin Snow McCurley, Martin Henk van den Berg
  • Patent number: 6996572
    Abstract: A system and method are provided for eliciting interesting structure from a collection of entities or resources with explicit and/or implicit, static and/or dynamic relations, called “affinities,” between them. Interesting structure includes (1) notions of quality, authority, or definitiveness of information, (2) notions of relevance to a user's information need, (3) notions of similarity among the plurality of resources retrieved from a universe of resources by a query process, and (4) notions of similarity among the usages of resources by different users/servers. Similarities between entities are computed, based on similarities between the affinity values for the entities. That is, where the affinity values for two entities resemble each other, the two entities have a high degree of similarity. Using the similarities, the entities are ranked, clustered, etc., based on a significance derived from the similarities. The ranking, clustering, etc., makes up the interesting structure which is sought.
    Type: Grant
    Filed: October 8, 1997
    Date of Patent: February 7, 2006
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Jon Michael Kleinberg, Prabhakar Raghavan, Sridhar Rajagopalan
  • Patent number: 6640224
    Abstract: A system and method for optimizing I/O to low-level index access during bulk-routing through a taxonomy to classify documents, e.g., Web pages, in the taxonomy. In a first optimization, bulk-routing is regarded as a generalized join operation in a relational database framework. In a second optimization, instead of processing each document individually through nodes of the taxonomy, a group of documents are processed node by node in a wavefront-style routing scheme for better amortization of index probes.
    Type: Grant
    Filed: February 4, 2000
    Date of Patent: October 28, 2003
    Assignee: International Business Machines Corporation
    Inventor: Soumen Chakrabarti
  • Patent number: 6418433
    Abstract: A focussed Web crawler learns to recognize Web pages that are relevant to the interest of one or more users, from a set of examples provided by the users. It then explores the Web starting from the example set, using the statistics collected from the examples and other analysis on the link graph of the growing crawl database, to guide itself towards relevant, valuable resources and away from irrelevant and/or low quality material on the Web. Thereby, the Web crawler builds a comprehensive topic-specific library for the benefit of specific users.
    Type: Grant
    Filed: January 28, 1999
    Date of Patent: July 9, 2002
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, Martin Henk van den Berg
  • Patent number: 6389436
    Abstract: A method, apparatus, and article of manufacture for a computer implemented hypertext classifier. A new document containing citations to and from other documents is classified. Initially, documents within a neighborhood of the new document are identified. For each document and each class, an initial probability is determined that indicates the probability that the document fits a particular class. Next, iterative relaxation is performed to identify a class for each document using the initial probabilities. A class is selected into which the new document is to be classified based on the initial probabilities and identified classes.
    Type: Grant
    Filed: December 15, 1997
    Date of Patent: May 14, 2002
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, Piotr Indyk
  • Patent number: 6356899
    Abstract: A method for identifying, filtering, ranking and cataloging information elements, as for example, World Wide Web pages of the Internet, in whole, part, or in combination. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred World Wide Web pages in whole, part, or in combination. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.
    Type: Grant
    Filed: March 3, 1999
    Date of Patent: March 12, 2002
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Patent number: 6336112
    Abstract: A method for cataloging, filtering and ranking information, as for example, World Wide Web pages of the Internet. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred-authority World Wide Web pages. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.
    Type: Grant
    Filed: March 16, 2001
    Date of Patent: January 1, 2002
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Patent number: 6334131
    Abstract: A method for cataloging, filtering and ranking information, as for example, World Wide Web pages of the Internet. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred-authority World Wide Web pages. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.
    Type: Grant
    Filed: August 29, 1998
    Date of Patent: December 25, 2001
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Publication number: 20010039544
    Abstract: A method for cataloging, filtering and ranking information, as for example, World Wide Web pages of the Internet. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred-authority World Wide Web pages. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.
    Type: Application
    Filed: August 29, 1998
    Publication date: November 8, 2001
    Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Publication number: 20010037324
    Abstract: A system, process, and article of manufacture for organizing a large text database into a hierarchy of topics and for maintaining this organization as documents are added and deleted and as the topic hierarchy changes. Given sample documents belonging to various nodes in the topic hierarchy, the tokens (terms, phrases, dates, or other usable feature in the document) that are most useful at each internal decision node for the purpose of routing new documents to the children of that node are automatically detected. Using feature terms, statistical models are constructed for each topic node. The models are used in an estimation technique to assign topic paths to new unlabeled documents. The hierarchical technique, in which feature terms can be very different at different nodes, leads to an efficient context-sensitive classification technique. The hierarchical technique can handle millions of documents and tens of thousands of topics.
    Type: Application
    Filed: February 5, 2001
    Publication date: November 1, 2001
    Applicant: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Soumen Chakrabarti, Byron Edward Dom, Prabhakar Raghavan
  • Publication number: 20010016846
    Abstract: A method for cataloging, filtering and ranking information, as for example, World Wide Web pages of the Internet. The method is preferably implemented in computer software and features steps for enabling a user to interactively create an information database including preferred information elements such as preferred-authority World Wide Web pages. The method includes steps for enabling a user to interactively create a frame-based, hierarchical organizational structure for the information elements, and steps for identifying and automatically filtering and ranking by relevance, information elements, such as World Wide Web pages for populating the structure, to form, for example, a searchable, World Wide Web page database.
    Type: Application
    Filed: March 16, 2001
    Publication date: August 23, 2001
    Applicant: International Business Machines Corp.
    Inventors: Soumen Chakrabarti, Byron Edward Dom, David Andrew Gibson, Prabhakar Raghavan, Sridhar Rajagopalan, Shanmugasundaram Ravikumar, Andrew Tomkins
  • Patent number: 6233575
    Abstract: A system, process, and article of manufacture for organizing a large text database into a hierarchy of topics and for maintaining this organization as documents are added and deleted and as the topic hierarchy changes. Given sample documents belonging to various nodes in the topic hierarchy, the tokens (terms, phrases, dates, or other usable feature in the document) that are most useful at each internal decision node for the purpose of routing new documents to the children of that node are automatically detected. Using feature terms, statistical models are constructed for each topic node. The models are used in an estimation technique to assign topic paths to new unlabeled documents. The hierarchical technique, in which feature terms can be very different at different nodes, leads to an efficient context-sensitive classification technique. The hierarchical technique can handle millions of documents and tens of thousands of topics.
    Type: Grant
    Filed: June 23, 1998
    Date of Patent: May 15, 2001
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Soumen Chakrabarti, Byron Edward Dom, Prabhakar Raghavan
  • Patent number: 6189005
    Abstract: A system and method for data mining is provided in which temporal patterns of itemsets in transactions having unexpected support values are identified. A surprising temporal pattern is an itemset whose support changes over time. The method may use a minimum description length formulation to discover these surprising temporal patterns.
    Type: Grant
    Filed: August 21, 1998
    Date of Patent: February 13, 2001
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, Sunita Sarawagi
  • Patent number: 6125361
    Abstract: A system and method for ranking wide area computer network (e.g., Web) pages by popularity in response to a query. Further, using a query and the response thereto from a search engine, the system and method finds additional key words that might be good extended search terms, essentially generating a local thesaurus on the fly at query time.
    Type: Grant
    Filed: April 10, 1998
    Date of Patent: September 26, 2000
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom
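
The abstract of patent 8447766 above describes ordering extracted numerical values, grouping them into intervals no wider than a query tolerance, ranking the groups, and returning the top interval. The following is a minimal illustrative sketch of that outline only, not the patented implementation. The function name best_interval, the use of a sliding window to form the groups, and ranking groups by how many values they contain are all assumptions made for illustration; the abstract does not say how groups are formed or ranked.

```python
from typing import List, Tuple


def best_interval(values: List[float], tolerance: float) -> Tuple[float, float, int]:
    """Return (low, high, count) for the fullest interval of width <= tolerance.

    Hypothetical helper: ranking by value count is an assumed criterion.
    """
    if not values:
        raise ValueError("no candidate values")

    # Step 1: order the numerical values extracted from document portions.
    ordered = sorted(values)

    best = (ordered[0], ordered[0], 1)
    left = 0
    # Step 2: slide a window over the ordered values so that each candidate
    # group spans a range that does not exceed the tolerance.
    for right, high in enumerate(ordered):
        while high - ordered[left] > tolerance:
            left += 1
        count = right - left + 1
        # Step 3: rank groups (here, by count) and remember the best one.
        if count > best[2]:
            best = (ordered[left], high, count)
    return best


# Made-up candidate values for illustration only.
print(best_interval([100, 102, 104, 150, 300], tolerance=5))
```

With these made-up inputs, the three values 100, 102 and 104 fall within the tolerance of 5, so the sketch returns the interval from 100 to 104 with a count of 3.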