Patents by Inventor Sunita Sarawagi

Sunita Sarawagi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230222290
    Abstract: A system, computer program product, and method are provided for active learning (AL) for matching heterogeneous entity representations. The task in entity resolution (ER) is to find pairs from datasets that correspond to the same entity. A labeled training dataset is leveraged to train a first artificial intelligence (AI) model, with the first AI model training employing a pre-trained language model. A second AI model is trained with the language model updated by the first AI model, with the second AI model creating a candidate set of likely duplicate pairs. A subset is selectively identified from the candidate set. The labeled training set is augmented with the subset.
    Type: Application
    Filed: January 11, 2022
    Publication date: July 13, 2023
    Inventors: Prithviraj Sen, Sunita Sarawagi, Arjit Jain
  • Patent number: 7346601
    Abstract: A method is provided for evaluating a user query on a database having a mining model that classifies records contained in the database into classes, when the query comprises at least one mining predicate that refers to a class of database records. An upper envelope is derived for the class referred to by the mining predicate; the envelope corresponds to a query that returns a set of database records that includes all of the database records belonging to the class. The upper envelope is included in the user query for query evaluation. The method may be practiced during a preprocessing phase by evaluating the mining model to extract a set of classes of the database records and deriving an upper envelope for each class. These upper envelopes are stored for access during user query evaluation.
    Type: Grant
    Filed: June 3, 2002
    Date of Patent: March 18, 2008
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Sunita Sarawagi
  • Patent number: 6691098
    Abstract: A system and method for explaining why an exceptional element in a multidimensional database is exceptional by presenting the element using at least two of the dimensions responsible for the exception. Maximal terms are identified in the monolithic equation that is used to identify exceptions, and based on the maximal terms the dimensions that are to be displayed are selected as a visual indication of why a displayed element is exceptional.
    Type: Grant
    Filed: February 8, 2000
    Date of Patent: February 10, 2004
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Sunita Sarawagi
  • Publication number: 20030229635
    Abstract: A method is provided for evaluating a user query on a database having a mining model that classifies records contained in the database into classes, when the query comprises at least one mining predicate that refers to a class of database records. An upper envelope is derived for the class referred to by the mining predicate; the envelope corresponds to a query that returns a set of database records that includes all of the database records belonging to the class. The upper envelope is included in the user query for query evaluation. The method may be practiced during a preprocessing phase by evaluating the mining model to extract a set of classes of the database records and deriving an upper envelope for each class. These upper envelopes are stored for access during user query evaluation.
    Type: Application
    Filed: June 3, 2002
    Publication date: December 11, 2003
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Sunita Sarawagi
  • Patent number: 6592627
    Abstract: A user can easily organize computerized document folders by associating a few sample documents in the document database with each folder. The present invention learns folder profiles based on the sample documents and moves the remaining documents into the folders accordingly. In this way, the user can construct new folders, or rearrange existing folders, or cause the computer to automatically rearrange and maintain the folders. This is particularly useful for managing a database of perhaps thousands of emails.
    Type: Grant
    Filed: June 10, 1999
    Date of Patent: July 15, 2003
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Roberto Javier Bayardo, Dimitrios Gunopulos, Ching-Tien Howard Ho, Sunita Sarawagi, John Christopher Shafer, Ramakrishnan Srikant
  • Patent number: 6324533
    Abstract: A method and apparatus for mining data relationships from an integrated database and data-mining system are disclosed. A set of frequent 1-itemsets is generated using a group-by query on data transactions. From these frequent 1-itemsets and the transactions, frequent 2-itemsets are determined. A candidate set of (n+2)-itemsets is generated from the frequent 2-itemsets, where n=1. Frequent (n+2)-itemsets are determined from the candidate set and the transaction table using a query operation. The candidate set and frequent (n+2)-itemsets are generated for (n+1) until the candidate set is empty. Rules are then extracted from the union of the determined frequent itemsets.
    Type: Grant
    Filed: May 29, 1998
    Date of Patent: November 27, 2001
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Sunita Sarawagi, Shiby Thomas
  • Patent number: 6189005
    Abstract: A system and method for data mining is provided in which temporal patterns of itemsets in transactions having unexpected support values are identified. A surprising temporal pattern is an itemset whose support changes over time. The method may use a minimum description length formulation to discover these surprising temporal patterns.
    Type: Grant
    Filed: August 21, 1998
    Date of Patent: February 13, 2001
    Assignee: International Business Machines Corporation
    Inventors: Soumen Chakrabarti, Byron Edward Dom, Sunita Sarawagi
  • Patent number: 6094651
    Abstract: A method for locating data anomalies in a k-dimensional data cube includes the steps of associating a surprise value with each cell of a data cube, and indicating a data anomaly when the surprise value associated with a cell exceeds a predetermined exception threshold. According to one aspect of the invention, the surprise value associated with each cell is a composite value that is based on at least one of a Self-Exp value for the cell, an In-Exp value for the cell and a Path-Exp value for the cell. Preferably, the step of associating the surprise value with each cell includes the steps of determining a Self-Exp value for the cell, determining an In-Exp value for the cell, determining a Path-Exp value for the cell, and then generating the surprise value for the cell based on the Self-Exp value, the In-Exp value and the Path-Exp value.
    Type: Grant
    Filed: August 22, 1997
    Date of Patent: July 25, 2000
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Sunita Sarawagi
  • Patent number: 5832475
    Abstract: Disclosed is a system and method for performing database queries including GROUP-BY operations, in which aggregate values for attributes are desired for distinct, partitioned subsets of tuples satisfying a query. A special case of the aggregation problem is addressed, employing a structure, called the data cube operator, which provides information useful for expediting execution of GROUP-BY operations in queries. Algorithms are provided for constructing the data cube by efficiently computing a collection of GROUP-BYs on the attributes of the relation. Decision support systems often require computation of multiple GROUP-BY operations on a given set of attributes, the GROUP-BYs being related in the sense that their attributes are subsets or supersets of each other. The invention extends hash-based and sort-based grouping methods with optimizations, including combining common operations across multiple GROUP-BYs and using pre-computed GROUP-BYs for computing other GROUP-BYs.
    Type: Grant
    Filed: March 29, 1996
    Date of Patent: November 3, 1998
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Ashish Gupta, Sunita Sarawagi
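
The abstract of patent 5832475 above mentions computing a collection of related GROUP-BYs and using pre-computed GROUP-BYs to compute other GROUP-BYs. The following is a minimal Python sketch of that general idea only, not the patented algorithms: it assumes a SUM aggregate (which can be rolled up from partial sums), and the names `group_by`, `data_cube`, and the sample fields `product`, `region`, and `sales` are hypothetical, chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def group_by(rows, attrs):
    """Sum measures grouped by attrs; rows is a list of (key_dict, measure)."""
    totals = defaultdict(int)
    for key, measure in rows:
        totals[tuple(key[a] for a in attrs)] += measure
    return dict(totals)


def data_cube(records, dims, measure):
    """Compute SUM(measure) for every subset of dims (illustrative only).

    Only the finest GROUP-BY reads the base records. Each coarser GROUP-BY
    is derived from an already computed GROUP-BY with one more attribute,
    picking the parent with the fewest groups. This relies on SUM being
    computable from partial sums, which is an assumption of this sketch.
    """
    dims = tuple(dims)
    base = [({d: r[d] for d in dims}, r[measure]) for r in records]
    cube = {dims: group_by(base, dims)}
    for size in range(len(dims) - 1, -1, -1):
        for attrs in combinations(dims, size):
            # Candidate parents: computed GROUP-BYs with one extra attribute.
            parents = [p for p in cube
                       if len(p) == size + 1 and set(attrs) <= set(p)]
            parent = min(parents, key=lambda p: len(cube[p]))
            parent_rows = [(dict(zip(parent, key)), total)
                           for key, total in cube[parent].items()]
            cube[attrs] = group_by(parent_rows, attrs)
    return cube


# Hypothetical sample data for illustration.
records = [
    {"product": "pen", "region": "east", "sales": 3},
    {"product": "pen", "region": "west", "sales": 4},
    {"product": "ink", "region": "east", "sales": 5},
]
cube = data_cube(records, ["product", "region"], "sales")
print(cube[("product",)])  # totals per product
print(cube[()])            # grand total
```

Choosing the smallest available parent is one simple heuristic for reducing work. The abstract also refers to hash-based and sort-based grouping methods and to combining common operations across GROUP-BYs, which this sketch does not attempt to reproduce.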