Patents by Inventor Rajeev Motwani

Rajeev Motwani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20040122797
    Abstract: A technique that uses a weighted divide and conquer approach for clustering a set S of n data points to find k final centers. The technique comprises 1) partitioning the set S into P disjoint pieces S1, . . . , SP; 2) for each piece Si, determining a set Di of k intermediate centers; 3) assigning each data point in each piece Si to the nearest one of the k intermediate centers; 4) weighting each of the k intermediate centers in each set Di by the number of points in the corresponding piece Si assigned to that center; and 5) clustering the weighted intermediate centers together to find said k final centers, the clustering performed using a specific error metric and a clustering method A.
    Type: Application
    Filed: December 1, 2003
    Publication date: June 24, 2004
    Inventors: Nina Mishra, Liadan O'Callaghan, Sudipto Guha, Rajeev Motwani
  • Patent number: 6684177
    Abstract: A technique that uses a weighted divide and conquer approach for clustering a set S of n data points to find k final centers. The technique comprises 1) partitioning the set S into P disjoint pieces S1, . . . , SP; 2) for each piece Si, determining a set Di of k intermediate centers; 3) assigning each data point in each piece Si to the nearest one of the k intermediate centers; 4) weighting each of the k intermediate centers in each set Di by the number of points in the corresponding piece Si assigned to that center; and 5) clustering the weighted intermediate centers together to find said k final centers, the clustering performed using a specific error metric and a clustering method A.
    Type: Grant
    Filed: May 10, 2001
    Date of Patent: January 27, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Nina Mishra, Liadan O'Callaghan, Sudipto Guha, Rajeev Motwani
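
The five steps in the abstract above can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract leaves the clustering method A and the error metric open, so the sketch assumes weighted Lloyd iterations (k-means with squared Euclidean distance) and a deterministic farthest-point initialization. All function names are hypothetical and do not come from the patent.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def farthest_point_init(points, k):
    """Assumed initialization: start at points[0], then repeatedly add the
    point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def weighted_kmeans(points, weights, k, iters=20):
    """Stand-in for 'clustering method A' (an assumption): weighted Lloyd
    iterations that minimize the weighted sum of squared distances."""
    centers = farthest_point_init(points, k)
    dim = len(points[0])
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        totals = [0.0] * k
        for p, w in zip(points, weights):
            j = nearest(p, centers)
            totals[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Keep the old center if no weight was assigned to it.
        centers = [
            tuple(s / totals[j] for s in sums[j]) if totals[j] > 0 else centers[j]
            for j in range(k)
        ]
    return centers


def divide_and_conquer_cluster(S, k, P):
    # Step 1: partition S into (at most) P disjoint pieces.
    size = -(-len(S) // P)  # ceiling division
    pieces = [S[i:i + size] for i in range(0, len(S), size)]

    inter_centers, inter_weights = [], []
    for piece in pieces:
        # Step 2: k intermediate centers for this piece.
        D = weighted_kmeans(piece, [1.0] * len(piece), min(k, len(piece)))
        # Steps 3 and 4: assign each point to its nearest intermediate
        # center and weight that center by the number of points it receives.
        counts = [0] * len(D)
        for p in piece:
            counts[nearest(p, D)] += 1
        inter_centers.extend(D)
        inter_weights.extend(counts)

    # Step 5: cluster the weighted intermediate centers into k final centers.
    return weighted_kmeans(inter_centers, inter_weights, k)
```

For example, calling divide_and_conquer_cluster(S, k=2, P=5) on a list S of 2-D tuples drawn from two well-separated groups returns two tuples, one near the mean of each group. Only the P small pieces and the P*k weighted intermediate centers are ever clustered, never the full set S at once.
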
  • Patent number: 6542886
    Abstract: A database server supports weighted and unweighted sampling of records or tuples in accordance with desired sampling semantics such as with replacement (WR), without replacement (WoR), or independent coin flips (CF) semantics, for example. The database server may perform such sampling sequentially not only to sample non-materialized records, such as those produced as a stream by a pipeline in a query tree for example, but also to sample records, whether materialized or not, in a single pass. The database server also supports sampling over a join of two relations of records or tuples without requiring the computation of the full join and without requiring the materialization of both relations and/or indexes on the join attribute values of both relations.
    Type: Grant
    Filed: March 15, 1999
    Date of Patent: April 1, 2003
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya
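
The three sampling semantics named in the abstract above (WR, WoR, CF) can each be realized in a single pass over a stream. The Python sketch below covers the unweighted case only and uses textbook techniques (independent Bernoulli trials and reservoir sampling) as stand-ins; it is not claimed to be the patented method, and the function names are hypothetical. Weighted sampling and sampling over a join are not covered.

```python
import random


def sample_cf(stream, f, rng):
    """Independent coin flips (CF): keep each record with probability f.
    The sample size is random."""
    return [rec for rec in stream if rng.random() < f]


def sample_wor(stream, r, rng):
    """Without replacement (WoR): classic reservoir sampling.
    Returns r distinct stream positions, or the whole stream if it is shorter."""
    reservoir = []
    for t, rec in enumerate(stream, start=1):
        if t <= r:
            reservoir.append(rec)
        else:
            j = rng.randrange(t)  # uniform in [0, t)
            if j < r:             # happens with probability r / t
                reservoir[j] = rec
    return reservoir


def sample_wr(stream, r, rng):
    """With replacement (WR): r independent size-1 reservoirs.
    Each slot is overwritten by the t-th record with probability 1 / t.
    Slots stay None if the stream is empty."""
    slots = [None] * r
    for t, rec in enumerate(stream, start=1):
        for i in range(r):
            if rng.random() < 1.0 / t:
                slots[i] = rec
    return slots
```

Because each function consumes its input once and never indexes into it, the input can be a generator, for example sample_wor((row for row in cursor), 100, random.Random()), where cursor is any hypothetical iterable of records. This mirrors the abstract's point that non-materialized records can be sampled as they stream by.
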
  • Patent number: 6532458
    Abstract: A database server supports weighted and unweighted sampling of records or tuples in accordance with desired sampling semantics such as with replacement (WR), without replacement (WoR), or independent coin flips (CF) semantics, for example. The database server may perform such sampling sequentially not only to sample non-materialized records, such as those produced as a stream by a pipeline in a query tree for example, but also to sample records, whether materialized or not, in a single pass. The database server also supports sampling over a join of two relations of records or tuples without requiring the computation of the full join and without requiring the materialization of both relations and/or indexes on the join attribute values of both relations.
    Type: Grant
    Filed: March 15, 1999
    Date of Patent: March 11, 2003
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya
  • Publication number: 20030018615
    Abstract: A database server supports weighted and unweighted sampling of records or tuples in accordance with desired sampling semantics such as with replacement (WR), without replacement (WoR), or independent coin flips (CF) semantics, for example. The database server may perform such sampling sequentially not only to sample non-materialized records, such as those produced as a stream by a pipeline in a query tree for example, but also to sample records, whether materialized or not, in a single pass. The database server also supports sampling over a join of two relations of records or tuples without requiring the computation of the full join and without requiring the materialization of both relations and/or indexes on the join attribute values of both relations.
    Type: Application
    Filed: September 10, 2002
    Publication date: January 23, 2003
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya
  • Publication number: 20020183966
    Abstract: A technique that uses a weighted divide and conquer approach for clustering a set S of n data points to find k final centers. The technique comprises 1) partitioning the set S into P disjoint pieces S1, . . . , SP; 2) for each piece Si, determining a set Di of k intermediate centers; 3) assigning each data point in each piece Si to the nearest one of the k intermediate centers; 4) weighting each of the k intermediate centers in each set Di by the number of points in the corresponding piece Si assigned to that center; and 5) clustering the weighted intermediate centers together to find said k final centers, the clustering performed using a specific error metric and a clustering method A.
    Type: Application
    Filed: May 10, 2001
    Publication date: December 5, 2002
    Inventors: Nina Mishra, Liadan O'Callaghan, Sudipto Guha, Rajeev Motwani
  • Publication number: 20020124001
    Abstract: Aggregation queries are performed by first identifying outlier values, aggregating the outlier values, and sampling the remaining data after pruning the outlier values. The sampled data is extrapolated and added to the aggregated outlier values to provide an estimate for each aggregation query. Outlier values are identified by selecting values outside of a selected sliding window of data having the lowest variance. An index is created for the outlier values. The outlier data is removed from the window of data, and separately aggregated. The remaining data without the outliers is then sampled in one of many known ways to provide a statistically relevant sample that is then aggregated and extrapolated to provide an estimate for the remaining data. This sampled estimate is combined with the outlier aggregate to form an estimate for the entire set of data. Further methods involve the use of weighted sampling and weighted selection of outlier values for low selectivity queries, or queries having group by.
    Type: Application
    Filed: January 12, 2001
    Publication date: September 5, 2002
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek R. Narasayya, Rajeev Motwani, Mayur D. Datar
  • Publication number: 20020123979
    Abstract: Aggregation queries are performed by first identifying outlier values, aggregating the outlier values, and sampling the remaining data after pruning the outlier values. The sampled data is extrapolated and added to the aggregated outlier values to provide an estimate for each aggregation query. Outlier values are identified by selecting values outside of a selected sliding window of data having the lowest variance. An index is created for the outlier values. The outlier data is removed from the window of data, and separately aggregated. The remaining data without the outliers is then sampled in one of many known ways to provide a statistically relevant sample that is then aggregated and extrapolated to provide an estimate for the remaining data. This sampled estimate is combined with the outlier aggregate to form an estimate for the entire set of data.
    Type: Application
    Filed: January 12, 2001
    Publication date: September 5, 2002
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek R. Narasayya, Rajeev Motwani, Mayur D. Datar
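
The outlier-based estimate described in the abstract above can be sketched for a SUM query as follows. The sketch makes several assumptions that the abstract does not state: the values are sorted before the window is slid over them, the number of outliers is a fixed budget, and the remaining data is sampled uniformly without replacement. The function names are hypothetical.

```python
import random


def split_outliers(values, num_outliers):
    """Slide a window of size len(values) - num_outliers over the sorted
    values, pick the window with the lowest variance, and treat everything
    outside it as outliers. Requires 0 <= num_outliers < len(values)."""
    ordered = sorted(values)
    w = len(ordered) - num_outliers

    # Prefix sums of v and v*v give each window's variance in O(1).
    ps, pq = [0], [0]
    for v in ordered:
        ps.append(ps[-1] + v)
        pq.append(pq[-1] + v * v)

    def window_variance(start):
        s = ps[start + w] - ps[start]
        q = pq[start + w] - pq[start]
        return q / w - (s / w) ** 2

    best = min(range(num_outliers + 1), key=window_variance)
    outliers = ordered[:best] + ordered[best + w:]
    rest = ordered[best:best + w]
    return outliers, rest


def estimate_sum(values, num_outliers, sample_size, rng):
    """Exact aggregate over the outliers plus an extrapolated sample
    aggregate over the remaining data."""
    outliers, rest = split_outliers(values, num_outliers)
    sample = rng.sample(rest, min(sample_size, len(rest)))
    extrapolated = len(rest) * sum(sample) / len(sample)
    return sum(outliers) + extrapolated
```

With a column that holds mostly small values and a handful of very large ones, the large values are summed exactly and only the low-variance remainder is sampled, which is where the reduction in estimation error comes from.
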
  • Patent number: 6278989
    Abstract: Using adaptive random sampling with cross-validation helps determine when enough data of a database has been sampled to construct histograms on one or more columns of one or more tables of the database within a desired or predetermined degree of accuracy. An adaptive random sampling histogram construction tool constructs an approximate equi-height k-histogram using an initial sample of data values from the database and iteratively updates the histogram using an additional sample of data values from the database until the histogram is within the desired degree of accuracy. The accuracy of the histogram is cross-validated against the additional sample at each iteration, and the additional sample is used to update the histogram to help improve its accuracy. The accuracy of the histogram may be measured by an error in distribution of the additional sample over the histogram as compared to a threshold error using a suitable error metric.
    Type: Grant
    Filed: August 25, 1998
    Date of Patent: August 21, 2001
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya
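
The adaptive loop described in the abstract above (build an equi-height histogram from an initial sample, cross-validate it against an additional sample, then fold that sample in) can be sketched as follows. The error metric here, the largest deviation of any bucket's share of the additional sample from 1/k, is an assumption, as are the doubling sample schedule and all function names.

```python
import bisect
import random


def equi_height_separators(sample, k):
    """k - 1 separator values that split the sorted sample into k buckets
    of (nearly) equal size."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def distribution_error(separators, fresh, k):
    """Assumed error metric: distribute the fresh sample over the buckets
    and report the largest deviation of a bucket's share from 1 / k."""
    counts = [0] * k
    for v in fresh:
        counts[bisect.bisect_right(separators, v)] += 1
    return max(abs(c / len(fresh) - 1 / k) for c in counts)


def adaptive_histogram(data, k, initial_size, threshold, rng):
    """Grow the sample until the histogram built so far passes
    cross-validation against an additional sample."""
    sample = rng.sample(data, initial_size)
    separators = equi_height_separators(sample, k)
    while len(sample) < len(data):
        # Additional sample, drawn independently of the earlier ones
        # (a simplification: it may overlap with records already sampled).
        fresh = rng.sample(data, min(len(sample), len(data)))
        error = distribution_error(separators, fresh, k)
        # The additional sample is always used to update the histogram.
        sample = sample + fresh
        separators = equi_height_separators(sample, k)
        if error <= threshold:
            break
    return separators, len(sample)
```

A call such as adaptive_histogram(column_values, k=10, initial_size=500, threshold=0.02, rng=random.Random()) returns the separators and the number of values sampled, where column_values is a hypothetical list standing in for one database column. A tighter threshold forces more iterations and therefore a larger sample.
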