Patents by Inventor Surajit Chaudhuri

Surajit Chaudhuri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10896229
    Abstract: The present invention extends to methods, systems, and computer program products for computing features of structured data. Aspects of the invention include computing features of table components (e.g., of rows, columns, cells, etc.). Computed features can be used for ranking the table components. When aggregated, features for different components of a table can be used for ranking the table (e.g., a web table).
    Type: Grant
    Filed: November 12, 2018
    Date of Patent: January 19, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
  • Publication number: 20210011926
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a transformation function is executed using an example input value to obtain an initial output value. Thereafter, a plurality of supplemental transformation tools is applied to the initial output value to generate a plurality of intermediary output values. Based on a comparison of each of the intermediary output values to an example output value, the supplemental transformation tool that generated an intermediary output value having a greatest extent of similarity to the example output values is identified. The identified supplemental transformation tool and the transformation function are used to generate a transformation program that transforms the example input values to the desired form in which to transform data.
    Type: Application
    Filed: September 9, 2020
    Publication date: January 14, 2021
    Inventors: Yeye HE, Kris Ganjam, Vivek Ravindraneth NARASAYYA, Surajit Chaudhuri
  • Patent number: 10853332
    Abstract: Systems, methods, and computer-executable instructions for partitioning a data set include receiving anchor attributes of a data set. The data set includes records, with each record including attributes. A set of filter attributes that are not mutually exclusive with any of the anchor attributes is determined. A set of candidate attributes that include each unique attribute from the first data set, excluding the anchor attributes and the filter attributes, is determined. For each of the anchor attributes and the anchor attributes, an attribute context is determined. For each of the candidate attributes, a context similarity between each of the anchor attributes is determined. A new anchor attribute is selected from the set of candidate attributes based on the context similarity.
    Type: Grant
    Filed: April 19, 2018
    Date of Patent: December 1, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lev Novik, Surajit Chaudhuri, Yeye He
  • Patent number: 10853344
    Abstract: The present invention extends to methods, systems, and computer program products for understanding tables for search. Aspects of the invention include identifying a subject tuple (e.g., a subject column) for a table, detecting a tuple header (e.g., a column header) using other tables, and detecting a tuple header (e.g., a column header) using a knowledge base. Implementations can be utilized in a structured data search system (SDSS) that indexes structured information, such as, tables in a relational database or html tables extracted from web pages. The SDSS allows users to search over the structured information (tables) using different mechanisms including keyword search and data finding data.
    Type: Grant
    Filed: July 27, 2017
    Date of Patent: December 1, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhongyuan Wang, Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
  • Patent number: 10838957
    Abstract: A relational database server may concurrently execute many relational queries, but a complex relational query may cause performance delays in the fulfillment of other relational queries. Instead, the relational database server may generate a query plan for the relational query, and may endeavor to partition the relational query between a spool operator and a scan operator into two or more query slices, where each query slice may be executed within a query slice threshold. Many alternative candidate query plans may be considered, such as inserting spool and scan operators after various operators and parameterizing operators in order to partition the records of a relation into two or more ranges based on an attribute of the relation. A large search space of candidate query plans may be reviewed in order to select a query plan that respects the query slice threshold while efficiently executing the logic of the relational query.
    Type: Grant
    Filed: June 17, 2010
    Date of Patent: November 17, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nicolas Bruno, Ravishankar Ramamurthy, Surajit Chaudhuri, Vivek Ravindranath Narasayya
  • Patent number: 10824592
    Abstract: Generally discussed herein are devices, systems, and methods for database management. A method may include determining a first hyperloglog (HLL) sketch of a first column of data, determining a second HLL sketch of a second column of data, estimating an inclusion coefficient based on the first and second HLL sketches, and performing operations on the first column of data or the second column of data in response to determining the inclusion coefficient is greater than, or equal to, a specified threshold.
    Type: Grant
    Filed: June 14, 2018
    Date of Patent: November 3, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Azade Nazi, Bolin Ding, Vivek R Narasayya, Surajit Chaudhuri
  • Patent number: 10810202
    Abstract: Systems, methods, and computer-executable instructions for creating a query execution plan for a query of a database includes receiving, from the database, a set of previously executed query execution plans for the query. Each previously-executed query execution plans includes subplans. Each subplan indicates a tree of physical operators. Physical operators that executed in the set of previously-executed query execution plans are determined. For each physical operator, an execution cost based is determined. Invalid physical operators from the previously-executed query execution plans that are invalid for the database are removed. Equivalent subplans from the previously-executed query execution plans are identified based on physical properties and logical expressions of the subplans. A constrained search space is created based on the equivalent subplans. A query execution plan for the query is constructed from the constrained search space based on the execution cost.
    Type: Grant
    Filed: June 14, 2018
    Date of Patent: October 20, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, Vivek R Narasayya
  • Patent number: 10810181
    Abstract: The present invention extends to methods, systems, and computer program products for refining structured data indexes. Aspects of the invention include associating structured data, such as, for example, tables, with additional content. Additional content can include content outside the <table> and </table> tags of a web table. Indexes for structured data (e.g., table indexes) can be refined based on the additional content to improve the relevance of providing parts of the structured data (e.g., parts of the table) in search results.
    Type: Grant
    Filed: April 11, 2018
    Date of Patent: October 20, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
  • Publication number: 20200320093
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values are received. A repository of transformation tools is searched to identify a new transformation tool as relevant to a data transformation associated with the received set of example values. The repository includes annotations associated with the new transformation tool. The new transformation tool is used to generate a transformation program that produces transformed output values. Additional annotations are generated for the new transformation tool based on the transformed output values.
    Type: Application
    Filed: June 19, 2020
    Publication date: October 8, 2020
    Inventors: Kris Ganjam, Yeye HE, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Patent number: 10776380
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a transformation function is executed using an example input value to obtain an initial output value. Thereafter, a plurality of supplemental transformation tools is applied to the initial output value to generate a plurality of intermediary output values. Based on a comparison of each of the intermediary output values to an example output value, the supplemental transformation tool that generated an intermediary output value having a greatest extent of similarity to the example output values is identified. The identified supplemental transformation tool and the transformation function are used to generate a transformation program that transforms the example input values to the desired form in which to transform data.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: September 15, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Patent number: 10776375
    Abstract: Various technologies that facilitate performance of a data finding data (DFD) search are described herein. A user specifies entities, for example, by entering the entities into a query field, selecting the entities from a computer-executable application, or the like. The user further specifies an attribute of the entities that is of interest. A query is constructed based upon the entities and the attribute, and a search for tables is performed based upon the entities and the attribute. Values of the attribute for the selected entities are identified in a table, and the values of the attribute are returned.
    Type: Grant
    Filed: May 21, 2014
    Date of Patent: September 15, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kris Ganjam, Zhimin Chen, Kaushik Chakrabarti, Surajit Chaudhuri, Vivek Narasayya, James Finnigan, Kanstantsyn Zoryn
  • Publication number: 20200272667
    Abstract: Systems and techniques for leveraging query executions to improve index recommendations are described herein. In an example, a machine learning model is adapted to receive a first query plan and a second query plan for performing a query with a database, where the first query plan is different from the second query plan. The machine learning model may be further adapted to determine execution cost efficiency between the first query plan and the second query plan. The machine learning model is trained using relative execution cost comparisons between a set of pairs of query plans for the database. The machine learning model is further adapted to output a ranking of the first query plan and second query plan, where the first query plan and second query plan are ranked based on execution cost efficiency.
    Type: Application
    Filed: February 21, 2019
    Publication date: August 27, 2020
    Inventors: Bailu Ding, Sudipto Das, Surajit Chaudhuri, Vivek R Narasayya, Ryan Marcus, Lin Ma, Adith Swaminathan
  • Patent number: 10740328
    Abstract: A processing unit can determine a first subset of a data set including data records selected based on measure values thereof. The processing unit can determine an index mapping a predicate to data records associated with that predicate and approximation values of the records. The processing unit can process a query against the first subset to provide a first result and a first accuracy value, determine that the first accuracy value does not satisfy an accuracy criterion, and process the query against the index. In some examples, the processing unit can process the query against a second subset including data records satisfying a predetermined predicate. In some examples, the processing unit can receive data records and determine the first subset. Data records can include respective measure values. Data records with higher measure values can occur in the first subset more frequently than data records with lower measure values.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: August 11, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bolin Ding, Silu Huang, Chi Wang, Kaushik Chakrabarti, Surajit Chaudhuri
  • Publication number: 20200242127
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values including example input values that indicate data values to be transformed and example output values that indicate a desired form in which to transform data. Based on the set of example values, a data transformation function that is relevant to the set of example values is identified. The data transformation function is used to generate a transformation program to transform the example input values to the desired form in which to transform data. A suggestion of the transformation program can be provided to a user device, wherein selection of the transformation program suggestion results in a data transformation.
    Type: Application
    Filed: April 13, 2020
    Publication date: July 30, 2020
    Inventors: Yeye HE, Kris GANJAM, Vivek Ravindranath NARASAYYA, Surajit CHAUDHURI
  • Publication number: 20200226109
    Abstract: Systems, methods, and computer-executable instructions for reorganizing a physical layout of data of a database a database. A workload is selected from previously executed database operations. A total resource consumption of the previously executed database operations and of the workload is determined. The total resource consumption of the workload is more than a predetermined threshold of the total resource consumption of the previously executed database operations. Optimization operations for the database are determined using the workload. A cloned database of the database is created. The optimization operations are executed on the cloned database. A database operation is received for the database. The database operation is executed on the database and the cloned database. The performance of the cloned database is verified as being improved compared to the performance of the databased based on the executing of the database operation on the database and the cloned database.
    Type: Application
    Filed: January 14, 2019
    Publication date: July 16, 2020
    Inventors: Sudipto Das, Vivek R. Narasayya, Gaoxiang Xu, Surajit Chaudhuri, Andrija Jovanovic, Miodrag Radulovic
  • Patent number: 10706066
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values are received. A repository of transformation tools is searched to identify a new transformation tool as relevant to a data transformation associated with the received set of example values. The repository includes annotations associated with the new transformation tool. The new transformation tool is used to generate a transformation program that produces transformed output values. Additional annotations are generated for the new transformation tool based on the transformed output values.
    Type: Grant
    Filed: October 17, 2016
    Date of Patent: July 7, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Kris Ganjam, Yeye He, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Patent number: 10685020
    Abstract: In some embodiments, the disclosed subject matter involves a server query optimizer for parametric query optimization (PQO) to address the problem of finding and reusing a relatively small number of query plans that can achieve good plan quality across multiple instances of a parameterized query. An embodiment processes query instances on-line and ensures (a) tight, bounded cost sub-optimality for each instance, (b) low optimization overheads, and (c) only a small number of plans need to be stored. A plan re-costing based approach is disclosed to provide good performance on all three metrics. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 2, 2017
    Date of Patent: June 16, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Surajit Chaudhuri, Anshuman Dutt, Vivek R Narasayya
  • Patent number: 10621195
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values including example input values that indicate data values to be transformed and example output values that indicate a desired form in which to transform data. Based on the set of example values, a data transformation function that is relevant to the set of example values is identified. The data transformation function is used to generate a transformation program to transform the example input values to the desired form in which to transform data. A suggestion of the transformation program can be provided to a user device, wherein selection of the transformation program suggestion results in a data transformation.
    Type: Grant
    Filed: September 20, 2016
    Date of Patent: April 14, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Publication number: 20200065712
    Abstract: In automated machine learning, an approximate best configuration can be selected among multiple candidate machine-learning configurations by progressively sampling training and test datasets for the iterative training and testing of the configurations while progressively pruning the set of candidate configurations based on associated estimated confidence intervals for their respective performance.
    Type: Application
    Filed: August 23, 2018
    Publication date: February 27, 2020
    Inventors: Chi Wang, Silu Huang, Surajit Chaudhuri, Bolin Ding
  • Publication number: 20190384844
    Abstract: Systems, methods, and computer-executable instructions for creating a query execution plan for a query of a database includes receiving, from the database, a set of previously executed query execution plans for the query. Each previously-executed query execution plans includes subplans. Each subplan indicates a tree of physical operators. Physical operators that executed in the set of previously-executed query execution plans are determined. For each physical operator, an execution cost based is determined. Invalid physical operators from the previously-executed query execution plans that are invalid for the database are removed. Equivalent subplans from the previously-executed query execution plans are identified based on physical properties and logical expressions of the subplans. A constrained search space is created based on the equivalent subplans. A query execution plan for the query is constructed from the constrained search space based on the execution cost.
    Type: Application
    Filed: June 14, 2018
    Publication date: December 19, 2019
    Inventors: Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, Vivek R. Narasayya