Patents by Inventor Vivek Narasayya

Vivek Narasayya has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20060085410
    Abstract: Results of a database query are estimated by performing a sampling of weighted tuples in a database based on a probability of usage of tuples required in executing a workload. A probability is associated with each tuple sampled, and an aggregate is computed over values in each sampled tuple while multiplying by the inverses of the probabilities associated with each tuple sampled.
    Type: Application
    Filed: December 7, 2005
    Publication date: April 20, 2006
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Rajeev Motwani, Mayur Datar
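The inverse-probability weighting described in the abstract above can be sketched as follows. This is only a minimal illustration, not the patented method: the function names, the independent per-row sampling, and the example probabilities are all assumptions, and the abstract's derivation of probabilities from workload usage is not modeled.

```python
import random

def weighted_sample(rows, probs, rng):
    """Include each row independently with its own probability (assumed scheme);
    keep the probability alongside the sampled value."""
    return [(row, p) for row, p in zip(rows, probs) if rng.random() < p]

def estimate_sum(sample):
    """Estimate SUM by scaling each sampled value by the inverse of its probability."""
    return sum(value / p for value, p in sample)

# Hypothetical data: rows judged more likely to be used get a higher probability.
values = [10.0, 20.0, 30.0, 40.0]
probs = [0.25, 0.5, 0.5, 1.0]
sample = weighted_sample(values, probs, random.Random(0))
print(estimate_sum(sample))
```

Dividing by the inclusion probability is what keeps the estimate unbiased in expectation even though rows are sampled at different rates.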
  • Publication number: 20060085484
    Abstract: An automated physical database design tool may provide an integrated physical design recommendation for horizontal partitioning, indexes and indexed views, all three features being tuned together (in concert). Manageability requirements may be specified when optimizing for performance. User-specified configuration may enable the specification of a partial physical design without materialization of the physical design. The tuning process may be performed for a production server but may be conducted substantially on a test server. Secondary indexes may be suggested for XML columns. Tuning of a database may be invoked by any owner of a database. Usage of objects may be evaluated and a recommendation for dropping unused objects may be issued. Reports may be provided concerning the count and percentage of queries in the workload that reference a particular database, and/or the count and percentage of queries in the workload that reference a particular table or column.
    Type: Application
    Filed: October 15, 2004
    Publication date: April 20, 2006
    Applicant: Microsoft Corporation
    Inventors: Alexander Raizman, Arunprasad Marathe, Djana Milton, Dmitry Sonkin, Lubor Kollar, Maciej Sarnowicz, Manoj Syamala, Raja Duddupudi, Sanjay Agrawal, Surajit Chaudhuri, Vivek Narasayya
  • Publication number: 20060085378
    Abstract: Internal communications within components of an automated physical database design tool may be conducted in a data description language such as XML. Inputs to and outputs from the automated physical database design tool may also be presented in the data description language (e.g., XML). The communications, inputs and outputs may comply with a schema for the data description language. The schema may be written in a schema language such as XSD. Inputs presented in the data description language may comprise tuning options. Outputs may comprise a proposed physical design for a database and reports.
    Type: Application
    Filed: October 15, 2004
    Publication date: April 20, 2006
    Applicant: Microsoft Corporation
    Inventors: Alexander Raizman, Arunprasad Marathe, Djana Ophelia Milton, Dmitry Sonkin, Lubor Kollar, Maciej Sarnowicz, Manoj Syamala, Raja Duddupudi, Sanjay Agrawal, Surajit Chaudhuri, Vivek Narasayya
  • Publication number: 20060053103
    Abstract: Aggregation queries are performed by first identifying outlier values, aggregating the outlier values, and sampling the remaining data after pruning the outlier values. The sampled data is extrapolated and added to the aggregated outlier values to provide an estimate for each aggregation query. Outlier values are identified by selecting values outside of a selected sliding window of data having the lowest variance. An index is created for the outlier values. The outlier data is removed from the window of data, and separately aggregated. The remaining data without the outliers is then sampled to provide a statistically relevant sample that is then aggregated and extrapolated to provide an estimate for the remaining data. This sampled estimate is combined with the outlier aggregate to form an estimate for the entire set of data.
    Type: Application
    Filed: October 7, 2005
    Publication date: March 9, 2006
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Rajeev Motwani, Mayur Datar
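A rough sketch of the outlier handling described in the abstract above, for a SUM query. The helper names, the fixed outlier count, and the uniform sampling of the remaining values are assumptions made for illustration; the abstract's outlier index is not modeled.

```python
import random
import statistics

def split_outliers(values, num_outliers):
    """Sort the values, slide a window of len(values) - num_outliers over them,
    and treat everything outside the lowest-variance window as an outlier."""
    ordered = sorted(values)
    width = len(ordered) - num_outliers
    best_start = min(
        range(num_outliers + 1),
        key=lambda s: statistics.pvariance(ordered[s:s + width]),
    )
    inliers = ordered[best_start:best_start + width]
    outliers = ordered[:best_start] + ordered[best_start + width:]
    return inliers, outliers

def estimate_sum(values, num_outliers, sample_size, rng):
    """Aggregate outliers exactly, then add an extrapolated sample of the rest."""
    inliers, outliers = split_outliers(values, num_outliers)
    sample = rng.sample(inliers, sample_size)
    scale = len(inliers) / sample_size  # extrapolate the sample to all inliers
    return sum(outliers) + scale * sum(sample)
```

Because the extreme values are summed exactly, the sampled part only has to cover the low-variance remainder, which is where uniform sampling behaves well.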
  • Publication number: 20060036989
    Abstract: A monitoring component of a database server collects a subset of a query workload along with related statistics. A remote index tuning component uses the workload subset and related statistics to determine a physical design that minimizes the cost of executing queries in the workload subset while ensuring that queries omitted from the subset do not degrade in performance.
    Type: Application
    Filed: August 10, 2004
    Publication date: February 16, 2006
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Arnd Konig, Vivek Narasayya
  • Publication number: 20060036600
    Abstract: Aggregation queries are performed by first identifying outlier values, aggregating the outlier values, and sampling the remaining data after pruning the outlier values. The sampled data is extrapolated and added to the aggregated outlier values to provide an estimate for each aggregation query. Outlier values are identified by selecting values outside of a selected sliding window of data having the lowest variance. An index is created for the outlier values. The outlier data is removed from the window of data, and separately aggregated. The remaining data without the outliers is then sampled to provide a statistically relevant sample that is then aggregated and extrapolated to provide an estimate for the remaining data. This sampled estimate is combined with the outlier aggregate to form an estimate for the entire set of data.
    Type: Application
    Filed: October 7, 2005
    Publication date: February 16, 2006
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Rajeev Motwani, Mayur Datar
  • Publication number: 20050223026
    Abstract: A database object summarization tool is provided that selects a subset of database objects subject to filtering constraints such as a partial order or optimization of some attribute. A dominance primitive filters out tuples that are dominated according to a partial order constraint by another tuple. A representation primitive selects a representative subset of tuples such that an optimization criterion is met.
    Type: Application
    Filed: March 31, 2004
    Publication date: October 6, 2005
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Prasanna Ganesan
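As a small illustration of the dominance primitive in the abstract above, the sketch below assumes one particular partial order (a tuple dominates another if it is at least as large in every attribute and strictly larger in at least one). The abstract does not specify the order, so this choice and the function names are hypothetical.

```python
def dominates(a, b):
    """Assumed partial order: a >= b in every attribute and a > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def dominance_filter(tuples):
    """Keep only the tuples that no other tuple dominates."""
    return [t for t in tuples if not any(dominates(other, t) for other in tuples)]

# (2, 2) is dominated by (3, 3); the other three tuples are mutually incomparable.
print(dominance_filter([(1, 5), (3, 3), (2, 2), (5, 1)]))
```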
  • Publication number: 20050222965
    Abstract: A query progress indicator provides an indication to a user of the progress of a query being executed on a database. The indication of the progress of the query allows the user to decide whether the query should be allowed to complete or should be aborted. One method that may be used to estimate the progress of a query that is being executed on a database defines a model of work performed during execution of a query. The total amount of work that will be performed during execution of the query is estimated according to the model. The amount of work performed at a given point during execution of the query is estimated according to the model. The progress of the query is estimated using the estimated amount of work at the given point in time and the estimated total amount of work. This estimated progress of query execution may be provided to the user.
    Type: Application
    Filed: March 31, 2004
    Publication date: October 6, 2005
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Ravishankar Ramamurthy
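The ratio at the heart of the abstract above (work done so far over estimated total work) can be sketched as below. The work model here, rows processed per operator, is an assumption for illustration, as are the function and operator names; the abstract leaves the model unspecified.

```python
def estimate_progress(rows_done, rows_expected):
    """Return estimated progress in [0, 1] under an assumed rows-processed work model.

    rows_done:     rows processed so far, keyed by operator name
    rows_expected: estimated total rows per operator (e.g. from an optimizer)
    """
    done = sum(rows_done.get(op, 0) for op in rows_expected)
    # If an operator has already exceeded its estimate, raise the total to match.
    total = sum(max(rows_expected[op], rows_done.get(op, 0)) for op in rows_expected)
    return done / total if total else 0.0

print(estimate_progress({"scan": 50, "join": 0}, {"scan": 100, "join": 100}))
```

Clamping the total upward when an estimate has been exceeded keeps the reported progress from going past 100 percent.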
  • Publication number: 20050192921
    Abstract: A framework is provided within a database system for specifying database monitoring rules that will be evaluated as part of the execution code path of database events being monitored. The occurrence of a selected database event triggers a rule that evaluates some parameter of an object related to the event against a condition in the rule. If the condition is met, a specified action is taken that can alter the execution of the database event or database system performance. Lightweight aggregation tables are utilized to enable aggregation of object parameter values so that presently occurring events can be compared to a summary of the object parameter values from previously occurring database events. Signatures are assigned to queries based on the structure of the query plan so that information in the lightweight aggregation tables can be grouped according to query signature.
    Type: Application
    Filed: February 26, 2004
    Publication date: September 1, 2005
    Inventors: Surajit Chaudhuri, Arnd Konig, Vivek Narasayya
  • Patent number: 6912547
    Abstract: Relational database applications such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these applications are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such applications is the size of the workload. The invention concerns workload compression which helps improve the scalability of such applications. The exemplary embodiment is broadly applicable to a variety of workload-driven applications, while allowing for incorporation of application specific knowledge. The process is described in detail in the context of two workload-driven applications: index selection and approximate query processing.
    Type: Grant
    Filed: June 26, 2002
    Date of Patent: June 28, 2005
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Ashish Kumar Gupta, Vivek Narasayya, Sanjay Agrawal
  • Publication number: 20050102305
    Abstract: Relational database applications such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these applications are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such applications is the size of the workload. The invention concerns workload compression which helps improve the scalability of such applications. The exemplary embodiment is broadly applicable to a variety of workload-driven applications, while allowing for incorporation of application specific knowledge. The process is described in detail in the context of two workload-driven applications: index selection and approximate query processing.
    Type: Application
    Filed: December 8, 2004
    Publication date: May 12, 2005
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Ashish Gupta, Vivek Narasayya, Sanjay Agrawal
  • Publication number: 20050033759
    Abstract: A method for estimating the result of a query on a database having data records arranged in tables. The database has an expected workload that includes a set of queries that can be executed on the database. An expected workload is derived comprising a set of queries that can be executed on the database. A sample is constructed by selecting data records for inclusion in the sample in a manner that minimizes an estimation error when the data records are acted upon by a query in the expected workload to provide an expected result. The query accesses the sample and is executed on the sample, returning an estimated query result. The expected workload can be constructed by specifying a degree of overlap between records selected by queries in the given workload and records selected by queries in the expected workload.
    Type: Application
    Filed: September 8, 2004
    Publication date: February 10, 2005
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Gautam Das
  • Publication number: 20050033739
    Abstract: A method for estimating the result of a query on a database having data records arranged in tables. The database has an expected workload that includes a set of queries that can be executed on the database. An expected workload is derived comprising a set of queries that can be executed on the database. A sample is constructed by selecting data records for inclusion in the sample in a manner that minimizes an estimation error when the data records are acted upon by a query in the expected workload to provide an expected result. The query accesses the sample and is executed on the sample, returning an estimated query result. The expected workload can be constructed by specifying a degree of overlap between records selected by queries in the given workload and records selected by queries in the expected workload.
    Type: Application
    Filed: September 8, 2004
    Publication date: February 10, 2005
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Gautam Das
  • Publication number: 20040260684
    Abstract: Integrating the partitioning of physical design structures with the physical design process can result in more efficient query execution. When candidate structures are evaluated for their relative benefit, one or more partitioning methods is associated with each structure so that the benefits of various partitioning methods are taken into consideration when the structures are selected for use by the database. A pool of partitioned candidate structures is formed by proposing and evaluating the benefit of candidate structures with associated partitioning on a per query basis. The selected partitioned candidates are then used to construct generalized structures with associated partitioning methods that are evaluated for their benefit over the workload. Those generalized structures are added to the pool of partitioned candidate structures. From this augmented pool of partitioned candidate structures, an optimal set of partitioned structures is enumerated for use by the database system.
    Type: Application
    Filed: June 23, 2003
    Publication date: December 23, 2004
    Applicant: Microsoft Corporation
    Inventors: Sanjay Agrawal, Surajit Chaudhuri, Vivek Narasayya
  • Publication number: 20040220942
    Abstract: Layout in a database system is performed using workload information. Execution information for a workload is obtained. Cumulative access and co-access information for database objects is then assembled. A cost model is developed for quantitatively capturing the value of different layouts, and a search is performed for a recommended database layout. In one embodiment, a greedy search is performed which initially attempts to provide a layout that minimizes co-location of objects on storage objects, and then attempts to improve that layout via a greedy search.
    Type: Application
    Filed: April 30, 2003
    Publication date: November 4, 2004
    Applicant: Microsoft Corporation
    Inventors: Sanjay Agrawal, Surajit Chaudhuri, Abhinandan Das, Vivek Narasayya
  • Publication number: 20040002956
    Abstract: A method for estimating the result of an aggregation query on a database using multiple sample tables. A given workload is divided into a set of workload partitions that include queries from the workload. A set of sample tables are constructed. Samples for each sample table are selected to reduce an estimation error for one of the partitions of queries. The most appropriate sample table in the set of sample tables is identified for a given query. The given query is executed on the most appropriate sample table and an estimated result for the given query is returned.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Gautam Das, Vivek Narasayya
  • Publication number: 20040003004
    Abstract: A method is provided for tuning a database to recommend a set of physical design structures for the database that optimize database performance for a given workload given a total time bound that defines a maximum amount of time that can be spent tuning the database. A cumulative set of recommended structures is maintained and incrementally updated based on tuning that is performed in intervals over portions of the workload. The cumulative set of recommended structures is updated by tuning the database by examining a predetermined portion of the workload during a time slice that is a fraction of the total time bound. At the end of the time slice, a set of recommended structures has been enumerated that is based on the workload portions that have been examined thus far. The set of recommended structures is updated until all queries in the workload have been examined or until the time bound is reached.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Sanjay Agrawal, Vivek Narasayya
  • Publication number: 20040002954
    Abstract: Relational database applications such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these applications are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such applications is the size of the workload. The invention concerns workload compression which helps improve the scalability of such applications. The exemplary embodiment is broadly applicable to a variety of workload-driven applications, while allowing for incorporation of application specific knowledge. The process is described in detail in the context of two workload-driven applications: index selection and approximate query processing.
    Type: Application
    Filed: June 26, 2002
    Publication date: January 1, 2004
    Inventors: Surajit Chaudhuri, Ashish Kumar Gupta, Vivek Narasayya, Sanjay Agrawal
  • Publication number: 20040002957
    Abstract: In a relational database system, a set of physical design structures is enumerated that optimizes database performance over a given workload consisting of workload entries that include queries and updates that have been executed against the database. An individual benefit is calculated for each candidate structure relevant to a given workload entry and these individual benefits are summed over the entries in the workload examined thus far. A workload entry is selected from the workload and a set of candidate structures relevant to the workload entry is identified.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Mayur Datar
  • Publication number: 20030229635
    Abstract: A method for evaluating a user query on a database having a mining model that classifies records contained in the database into classes when the query comprises at least one mining predicate that refers to a class of database records. An upper envelope is derived for the class referred to by the mining predicate corresponding to a query that returns a set of database records that includes all of the database records belonging to the class. The upper envelope is included in the user query for query evaluation. The method may be practiced during a preprocessing phase by evaluating the mining model to extract a set of classes of the database records and deriving an upper envelope for each class. These upper envelopes are stored for access during user query evaluation.
    Type: Application
    Filed: June 3, 2002
    Publication date: December 11, 2003
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Sunita Sarawagi