Patents by Inventor Sachindra Joshi

Sachindra Joshi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7203679
    Abstract: Documents are represented based on their structure, which arises from the relationship between various elements in the document. After representing documents based on their structure in vector form, a method of measuring similarity between vectors is used to obtain the measure of structural similarity between two given documents.
    Type: Grant
    Filed: July 29, 2003
    Date of Patent: April 10, 2007
    Assignee: International Business Machines Corporation
    Inventors: Neeraj Agrawal, Sachindra Joshi, Raghuram Krishnapuram, Sumit Negi
  • Publication number: 20060143464
    Abstract: Methods, systems and computer program products for automatically enforcing obligations in accordance with a data-handling policy are disclosed. Requests by users for accessing data stored in a data repository are intercepted. A determination is made whether any obligations apply to each data item requested in accordance with the data handling policy. The determination may relate to whether rules having associated obligations identified in the data-handling policy apply to data items requested by a user. The obligations are automatically executed at an appropriate time after access of the data. Association of a data item requested by the user with an obligation may be recorded and tracked to determine the appropriate time for executing the obligation.
    Type: Application
    Filed: December 29, 2004
    Publication date: June 29, 2006
    Applicant: International Business Machines Corporation
    Inventors: Rema Ananthanarayanan, Mukesh Mohania, Ajay Gupta, Calvin Powers, Sachindra Joshi, Manish Bhide
  • Publication number: 20060026496
    Abstract: Methods, apparatus and computer programs are provided for characterizing Web-based information resources based on their interactions. A Web-based information resource is a single Web document or a collection of related Web documents. Unlike simple text documents, Web documents contain hyperlinks and other HTML tags. Different types of interactions, including inbound hyperlinks, outbound hyperlinks and internal links associated with a Web-based information resource, are used to characterize the Web-based information resource. A DOM tree representing the tag structure of a Web-based information resource is used to identify text items likely to be useful as context for a hyperlink anchor text, and the anchor text is combined with the context to generate a representation. The representation of Web-based information resources based on interactions can be used for clustering and classification, and in Web mining applications such as query disambiguation and automatic taxonomy generation.
    Type: Application
    Filed: July 28, 2004
    Publication date: February 2, 2006
    Inventors: Sachindra Joshi, Raghuram Krishnapuram, Shourya Roy
  • Publication number: 20060026157
    Abstract: Provided are methods, apparatus and computer programs for evaluating the resilience, to structural changes in a data source, of a representative label representing a data element within the data source. Also disclosed are applications using a resilient representative label. For example, a representative label may represent a particular data field or other data element within a semi-structured data source, such as within XML or HTML Web pages. An estimate of resilience to changes can be used to determine whether a candidate representative label satisfies a required degree of resilience, or to enable selection of a label with the highest resilience score among a set of representative labels. The validated or selected representative label may then be used for data extraction, remaining usable despite the possibility of future changes to the structure of a Web page, or for template clustering/classification.
    Type: Application
    Filed: June 29, 2004
    Publication date: February 2, 2006
    Inventors: Rahul Gupta, Sachindra Joshi, Raghuram Krishnapuram
  • Publication number: 20050038785
    Abstract: Documents are represented based on their structure, which arises from the relationship between various elements in the document. After representing documents based on their structure in vector form, a method of measuring similarity between vectors is used to obtain the measure of structural similarity between two given documents.
    Type: Application
    Filed: July 29, 2003
    Publication date: February 17, 2005
    Inventors: Neeraj Agrawal, Sachindra Joshi, Raghuram Krishnapuram, Sumit Negi
  • Patent number: 6754651
    Abstract: The present invention provides a system and a method for mining a new kind of association rules called disjunctive association rules, where the antecedent or the consequent of a rule may contain disjuncts of terms (X∨Y or X⊕Y). Such rules are a natural generalisation to the kind of rules that have been mined hitherto. Furthermore, disjunctive association rules are generalised in the sense that the algorithm also mines rules which have disjunctions of conjuncts (C⇒(A∧B)∨(D∧E)). Since the number of combinations of disjuncts is explosive, we use clustering to find a generalized subset. The said clustering is preferably performed using agglomerative clustering methods for finding the greedy subset.
    Type: Grant
    Filed: April 17, 2001
    Date of Patent: June 22, 2004
    Assignee: International Business Machines Corporation
    Inventors: Amit Anil Nanavati, Krishna Prasad Chitrapura, Sachindra Joshi, Raghuram Krishnapuram
  • Publication number: 20040111438
    Abstract: Disclosed are a system and method for automated populating of an existing concept hierarchy of items with new items, using entropy as a measure of the correctness of a potential classification. User-defined concept hierarchies include, for example, document hierarchies such as directories for the Internet (such as Yahoo), library catalogues, patent databases and journals, and product hierarchies. These concept hierarchies can be huge and are usually maintained manually. An Internet directory may have, for example, millions of Web sites, thousands of editors and hundreds of thousands of different categories.
    Type: Application
    Filed: December 4, 2002
    Publication date: June 10, 2004
    Inventors: Krishna Prasad Chitrapura, Raghuram Krishnapuram, Sachindra Joshi
  • Publication number: 20020152201
    Abstract: The present invention provides a system and a method for mining a new kind of association rules called disjunctive association rules, where the antecedent or the consequent of a rule may contain disjuncts of terms (X∨Y or X⊕Y). Such rules are a natural generalization to the kind of rules that have been mined hitherto. Furthermore, disjunctive association rules are generalized in the sense that the algorithm also mines rules which have disjunctions of conjuncts (C⇒(A∧B)∨(D∧E)). Since the number of combinations of disjuncts is explosive, we use clustering to find a generalized subset. The said clustering is preferably performed using agglomerative clustering methods for finding the greedy subset.
    Type: Application
    Filed: April 17, 2001
    Publication date: October 17, 2002
    Applicant: International Business Machines Corporation
    Inventors: Amit Anil Nanavati, Krishna Prasad Chitrapura, Sachindra Joshi, Raghuram Krishnapuram
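
The abstract for patent 7203679 (and its application, 20050038785) describes representing documents by their structure in vector form and then comparing the vectors. The following is a minimal sketch of that general idea, not the patented method: it assumes XML input, assumes that root-to-element tag paths are a reasonable structural feature, and assumes cosine similarity as the vector measure. The abstract specifies none of these choices, and the function names are hypothetical.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def structure_vector(xml_text):
    """Count root-to-element tag paths; text content is ignored (assumed feature choice)."""
    counts = Counter()

    def walk(element, prefix):
        path = prefix + "/" + element.tag
        counts[path] += 1
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors (assumed measure)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def structural_similarity(doc_a, doc_b):
    """Compare two XML documents by structure only."""
    return cosine_similarity(structure_vector(doc_a), structure_vector(doc_b))
```

Under these assumptions, two documents with the same tag layout but different text score close to 1, and two documents with no tag paths in common score 0.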