Patents by Inventor Surajit Chaudhuri

Surajit Chaudhuri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7930301
    Abstract: A search of an index database or another search method is conducted to identify preliminary results listing one or more selected computer objects having selected identifying information stored in an index database. In addition, one or more selected computer objects of the preliminary search results are correlated with one or more other computer objects that have associations with the selected computer objects of the preliminary search results. Integrated search results are then returned and include the preliminary search results and one or more other computer objects that have associations with the selected computer objects of the preliminary search results. The associations may be determined by an association system and represent relationships between computer files based upon user or other interactions between the objects. The associations between the objects may include similarities between them and their importance.
    Type: Grant
    Filed: March 31, 2003
    Date of Patent: April 19, 2011
    Assignee: Microsoft Corporation
    Inventors: Cezary Marcjan, Ryszard Kott, Surajit Chaudhuri, Lili Cheng
  • Patent number: 7917514
    Abstract: A system that can analyze a multi-dimensional input thereafter establishing a search query based upon extracted features from the input. In a particular example, an image can be used as an input to a search mechanism. Pattern recognition and image analysis can be applied to the image thereafter establishing a search query that corresponds to features extracted from the image input. The system can also facilitate indexing multi-dimensional searchable items thereby making them available to be retrieved as results to a search query. More particularly, the system can employ text analysis, pattern and/or speech recognition mechanisms to extract features from searchable items. These extracted features can be employed to index the searchable items.
    Type: Grant
    Filed: June 28, 2006
    Date of Patent: March 29, 2011
    Assignee: Microsoft Corporation
    Inventors: Stephen Lawler, Eric J. Horvitz, Joshua T. Goodman, Anoop Gupta, Christopher A. Meek, Eric D. Brill, Gary W. Flake, Ramez Naam, Surajit Chaudhuri, Oliver Hurst-Hiller
  • Publication number: 20110038531
    Abstract: Techniques are described to leverage a set of sample or example matched pairs of strings to learn string transformation rules, which may be used to match data records that are semantically equivalent. In one embodiment, matched pairs of input strings are accessed. For a set of matched pairs, a set of one or more string transformation rules are learned. A transformation rule may include two strings determined to be semantically equivalent. The transformation rules are used to determine whether a first and second string match each other.
    Type: Application
    Filed: August 14, 2009
    Publication date: February 17, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
  • Patent number: 7882121
    Abstract: A query generation using cardinality constraints process including choosing a first set of parameters for a query, calculating an additional set of parameters based on the first set of parameters, executing the query using the additional set of parameters, evaluating the cardinality error of the additional set of parameters, and refining the additional set of parameters to meet the desired cardinality constraint. Creating a query and selecting parameters for the query to meet a desired cardinality constraint or set of cardinality constraints when the query is executed against a database may be difficult. A query generation using cardinality constraints process may create a set of parameters for a query which satisfies a desired cardinality constraint or set of cardinality constraints. An application of such a query generation using cardinality constraints process may be database component and code testing.
    Type: Grant
    Filed: January 27, 2006
    Date of Patent: February 1, 2011
    Assignee: Microsoft Corporation
    Inventors: Nicolas Bruno, Surajit Chaudhuri, Dilys Thomas
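The execute, evaluate, refine loop described in the abstract of patent 7882121 can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the query is a single-parameter range predicate over an in-memory list, `run_query` and `find_parameter` are hypothetical names, and bisection is used as the refinement step because the cardinality grows monotonically with the parameter. The abstract does not say which refinement strategy the patent uses.

```python
def run_query(rows, p):
    """Stand-in for executing `SELECT * FROM t WHERE x <= p` and counting the rows."""
    return sum(1 for x in rows if x <= p)


def find_parameter(rows, target, lo, hi, tolerance=0, max_iters=50):
    """Refine p within [lo, hi] until |cardinality - target| <= tolerance.

    Bisection is an assumed refinement strategy, valid here only because
    the cardinality of `x <= p` never decreases as p grows.
    """
    best_p, best_err = lo, abs(run_query(rows, lo) - target)
    for _ in range(max_iters):
        p = (lo + hi) / 2                # next candidate parameter
        card = run_query(rows, p)        # execute the query
        err = abs(card - target)         # evaluate the cardinality error
        if err < best_err:
            best_p, best_err = p, err
        if err <= tolerance:
            break
        if card < target:                # too few rows: widen the predicate
            lo = p
        else:                            # too many rows: tighten the predicate
            hi = p
    return best_p, best_err


rows = list(range(1, 101))
p, err = find_parameter(rows, target=25, lo=0, hi=100)
```

With real queries that take several parameters, the cardinality is generally not monotone in each one, so a search over a single interval like this would not be enough.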
  • Publication number: 20100325136
    Abstract: Techniques for error-tolerant autocompletion are described. While displaying characters of an input string as they are inputted by a user, when a character is added to the input string by the user, matching strings may be selected from among a set of candidate strings by determining which of the candidate strings have a prefix whose characters match the characters of the input string within a given edit distance of the input string.
    Type: Application
    Filed: June 23, 2009
    Publication date: December 23, 2010
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Shriraghav Kaushik
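The matching condition in the abstract of publication 20100325136 (a candidate matches when one of its prefixes is within a given edit distance of the input string) can be sketched in Python as follows. This is a brute-force illustration, not the technique claimed in the application: it recomputes a dynamic-programming table for every candidate, uses plain Levenshtein distance, and the names `prefix_edit_distance` and `autocomplete` are hypothetical.

```python
def prefix_edit_distance(query, candidate):
    """Smallest Levenshtein distance between `query` and any prefix of `candidate`."""
    # prev[j] holds the distance between query[:i] and candidate[:j].
    prev = list(range(len(candidate) + 1))
    for i, qc in enumerate(query, start=1):
        curr = [i]
        for j, cc in enumerate(candidate, start=1):
            cost = 0 if qc == cc else 1
            curr.append(min(prev[j] + 1,          # delete qc from the query
                            curr[j - 1] + 1,      # insert cc into the query
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # The final row compares the whole query with every prefix of the candidate.
    return min(prev)


def autocomplete(query, candidates, max_edits=1):
    """Return the candidates that have a prefix within `max_edits` of `query`."""
    return [c for c in candidates if prefix_edit_distance(query, c) <= max_edits]


matches = autocomplete("seta", ["seattle", "seaside", "boston"])
```

Here the mistyped input "seta" is one deletion away from the prefix "sea", so both "seattle" and "seaside" are kept, while "boston" is rejected.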
  • Publication number: 20100318543
    Abstract: An architecture for providing interactive sessions for physical database design is described, allowing users to readily try different options, identify problems, and obtain physical designs in a flexible way. Embodiments based on a .NET assembly and modifications to a database management system (DBMS) are also described.
    Type: Application
    Filed: June 15, 2009
    Publication date: December 16, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Surajit Chaudhuri, Nicolas Bruno
  • Publication number: 20100313258
    Abstract: Identifying synonyms of entities using a collection of documents is disclosed herein. In some aspects, a document from a collection of documents may be analyzed to identify hit sequences that include one or more tokens (e.g., words, numbers, etc.). The hit sequences may then be used to generate discriminating token sets (DTS's) that are subsets of both the hit sequences and the entity names. The DTS's are matched with corresponding entity names, and then used to create DTS phrases by selecting adjacent text in the document that is proximate to the DTS. The DTS phrases may be analyzed to determine whether the corresponding DTS is a synonym of the entity name. In various aspects, the tokens of an associated entity name that are present in the DTS phrases are used to generate a score for the DTS. When the score at least reaches a threshold, the DTS may be designated as a synonym. A list of synonyms may be generated for each entity name.
    Type: Application
    Filed: June 4, 2009
    Publication date: December 9, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Surajit Chaudhuri, Venkatesh Ganti, Dong Xin
  • Publication number: 20100299367
    Abstract: A keyword search is executed on a view of a database based on a Boolean keyword query. The view includes multiple text columns, and the keyword search is executed on each of the multiple text columns in the view. The output results from the keyword search on each of the text columns include tuple identifiers of one or more relevant tuples and a relevancy score for ranking the results of the keyword query.
    Type: Application
    Filed: May 20, 2009
    Publication date: November 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Publication number: 20100293179
    Abstract: Identifying synonyms of entities using web search results is disclosed herein. In some aspects, a candidate string of tokens of an entity name is selected as a search term. The search term is transmitted by a server to a search engine, which in turn, transmits search results back to the server after performing a search. The server analyzes the search results, generates a score based on the search results, and then determines a status (synonym or not a synonym) of the candidate string based on the score. In further aspects, additional candidate strings are designated as synonyms or not synonyms based on status of the searched candidate string by using relationships of a lattice formed from all possible candidate strings of the entity name.
    Type: Application
    Filed: May 14, 2009
    Publication date: November 18, 2010
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Venkatesh Ganti, Dong Xin
  • Publication number: 20100262593
    Abstract: The described implementations relate to filtered index recommendations. In one case a filtered index recommendation (FIR) tool is configured to recommend a final set of filtered indexes to use with a workload. The final set is selected from a first set of candidate filtered indexes and a second set of merged filtered indexes.
    Type: Application
    Filed: April 8, 2009
    Publication date: October 14, 2010
    Applicant: Microsoft Corporation
    Inventors: Nicolas Bruno, Surajit Chaudhuri, Vivek R. Narasayya, Manoj A. Syamala
  • Publication number: 20100250518
    Abstract: A flexible query hints system and method for discovering and expressing query hints in a database management system. Embodiments of the flexible query hints system and method include a power hints (Phints) language that enables the specification of constraints to influence a query optimizer. Phints expressions are defined as tree patterns annotated with constraints. Embodiments of the flexible query hints system and method also include techniques to incorporate the power hints language expressions into an extended query optimizer. These techniques include computing a directed acyclic graph for Phints expression, deriving candidate matches using the Phints expression and the graph, computing candidate matches, and extracting a revised execution plan having a lowest cost and satisfying constraints of the Phints expression. Embodiments of the flexible query hints system and method include a flexible query hint user interface that allows users to interactively adjust query hints.
    Type: Application
    Filed: March 28, 2009
    Publication date: September 30, 2010
    Applicant: Microsoft Corporation
    Inventors: Nicolas Bruno, Ravishankar Ramamurthy, Surajit Chaudhuri
  • Publication number: 20100235347
    Abstract: An exact cardinality query optimization system and method for optimizing a query having a plurality of expressions to obtain a cardinality-optimal query execution plan for the query. Embodiments of the system and method use various techniques to shorten the time necessary to obtain the cardinality-optimal query execution plan, which contains the query execution plan when all cardinalities are exact. Embodiments of the system and method include a covering queries technique that leverages query execution feedback to obtain an unordered subset of relevant expressions for the query, an early termination technique that bounds the cardinality to determine whether the processing can be terminated before each of the expressions are executed, and an expressions ordering technique that finds an ordering of expressions that yields the greatest reduction in time to obtain the cardinality-optimal query execution plan.
    Type: Application
    Filed: March 14, 2009
    Publication date: September 16, 2010
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek Narasayya, Ravishankar Ramamurthy
  • Patent number: 7739269
    Abstract: Database systems use a plan cache to avoid the overheads (e.g., time, money) of query recompilation. Query plans can become invalidated by updates to the statistics on data or changes to the physical database design. Once a plan is invalidated, it can be repaired utilizing one or more of the disclosed embodiments. Incremental repair of query plans includes reusing parts of the current plan rather than discarding the plan entirely when it is invalidated. Repair to an existing query plan is attempted before resorting to full recompilation.
    Type: Grant
    Filed: January 19, 2007
    Date of Patent: June 15, 2010
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Ravishankar Ramamurthy
  • Patent number: 7739221
    Abstract: A system that can analyze a multi-dimensional input thereafter establishing a search query based upon extracted features from the input. In a particular example, an image can be used as an input to a search mechanism. Pattern recognition and image analysis can be applied to the image thereafter establishing a search query that corresponds to features extracted from the image input. The system can also facilitate indexing multi-dimensional searchable items thereby making them available to be retrieved as results to a search query. More particularly, the system can employ text analysis, pattern and/or speech recognition mechanisms to extract features from searchable items. These extracted features can be employed to index the searchable items.
    Type: Grant
    Filed: June 28, 2006
    Date of Patent: June 15, 2010
    Assignee: Microsoft Corporation
    Inventors: Stephen Lawler, Eric J. Horvitz, Joshua T. Goodman, Anoop Gupta, Christopher A. Meek, Eric D. Brill, Gary W. Flake, Ramez Naam, Surajit Chaudhuri, Oliver Hurst-Hiller
  • Patent number: 7707207
    Abstract: The claimed subject matter relates to incorporating a skyline operator within a relational database engine, and more particularly to a database engine that utilizes novel techniques to determine the lowest cost of generating the skyline produced by the skyline operator. The database engine receives queries and associated preferences and, based on a cardinality estimate and a cost estimate, an appropriate skyline generating technique is utilized to produce a skyline representative of the received queries and its associated preferences.
    Type: Grant
    Filed: February 17, 2006
    Date of Patent: April 27, 2010
    Assignee: Microsoft Corporation
    Inventors: Kaushik Shriraghav, Surajit Chaudhuri, Nilesh N. Dalvi
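For readers unfamiliar with the skyline operator named in the abstract of patent 7707207, the Python sketch below shows only what the operator returns: the points not dominated by any other point. It is a naive quadratic scan written for illustration, assuming that smaller values are preferred in every dimension. It does not attempt the cost-based choice among skyline-generating techniques that the patent covers, and `dominates` and `skyline` are hypothetical names.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every dimension and strictly better in one.

    Assumes smaller values are preferred in all dimensions.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (price, distance) pairs, where lower is better for both.
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 2)]
result = skyline(hotels)
```

In this data, (70, 5) is dominated by (60, 3) and (90, 2) is dominated by (80, 1), so the skyline keeps the remaining three points.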
  • Patent number: 7685090
    Abstract: The invention concerns a detection of duplicate tuples in a database. Previous domain independent detection of duplicated tuples relied on standard similarity functions (e.g., edit distance, cosine metric) between multi-attribute tuples. However, such prior art approaches result in large numbers of false positives if they are used to identify domain-specific abbreviations and conventions. In accordance with the invention a process for duplicate detection is implemented based on interpreting records from multiple dimensional tables in a data warehouse, which are associated with hierarchies specified through key—foreign key relationships in a snowflake schema. The invention exploits the extra knowledge available from the table hierarchy to develop a high quality, scalable duplicate detection process.
    Type: Grant
    Filed: July 14, 2005
    Date of Patent: March 23, 2010
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Venkatesh Ganti, Rohit Ananthakrishna
  • Patent number: 7685145
    Abstract: Various embodiments are disclosed relating to database configuration refinement. In an example embodiment, a method is provided that may include determining a size limitation for a database configuration, determining a workload of the database configuration, and making a determination that a size of the database configuration is greater than a size limit. The method may also include applying either a merge process or a reduction process to decrease the size of the database configuration. The merge process may merge a first index/view with a second index/view to produce a merged index/view, for example. The reduction process may delete a first portion of a first view to produce a reduced view.
    Type: Grant
    Filed: March 28, 2006
    Date of Patent: March 23, 2010
    Assignee: Microsoft Corporation
    Inventors: Nicolas Bruno, Surajit Chaudhuri
  • Publication number: 20100042963
    Abstract: Described is a constraint language and related technology by which complex constraints may be used in selecting configurations for use in physical database design tuning. The complex constraint (or constraints) is processed, e.g., in a search framework, to determine and output at least one configuration that meets the constraint, e.g., a best configuration found before a stopping condition is met. The search framework processes a current configuration into candidate configurations, including by searching for candidate configurations from a current configuration based upon a complex constraint, iteratively evaluating a search space until a stopping condition is satisfied, using transformation rules to generate new candidate configurations, and selecting a best candidate configuration. Transformation rules and pruning rules are applied to efficiently perform the search. Constraints may be specified as assertions that need to be satisfied, or as soft assertions that come close to satisfying the constraint.
    Type: Application
    Filed: August 13, 2008
    Publication date: February 18, 2010
    Applicant: Microsoft Corporation
    Inventors: Nicolas Bruno, Surajit Chaudhuri
  • Publication number: 20090327223
    Abstract: The described implementations relate to query portals. One technique analyzes search results generated by a web search engine responsive to a user search query. The technique also dynamically generates a query portal that lists the search results as well as entities identified from the search results.
    Type: Application
    Filed: June 26, 2008
    Publication date: December 31, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti, Dong Xin, Sanjay Agrawal, Arnd Christian Konig
  • Publication number: 20090319500
    Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
    Type: Application
    Filed: June 24, 2008
    Publication date: December 24, 2009
    Applicant: Microsoft Corporation
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
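The filtering pipeline in the abstract of publication 20090319500 (token sets, inverted-index lookup, retrieval, then an exact-match filter) can be sketched in Python as below. Several details are assumptions made for illustration: each entity string contributes its own tokens as one token set (the simplest possible cover; the abstract does not say how the covering set is chosen), tokenization is lowercase whitespace splitting, a match means the entity's tokens appear contiguously, and all function names are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each token to the set of document identifiers that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def candidate_docs(index, token_sets):
    """Identifiers of documents containing every token of at least one token set."""
    ids = set()
    for tokens in token_sets:
        postings = [index.get(t, set()) for t in tokens]
        if postings:
            ids |= set.intersection(*postings)
    return ids


def contains_phrase(text, entity):
    """True if the entity's tokens occur contiguously in the text."""
    doc_tokens, ent_tokens = tokenize(text), tokenize(entity)
    n = len(ent_tokens)
    return any(doc_tokens[i:i + n] == ent_tokens
               for i in range(len(doc_tokens) - n + 1))


def filter_documents(docs, entities):
    """Index lookup first, then keep only documents with a real entity match."""
    index = build_inverted_index(docs)
    token_sets = [frozenset(tokenize(e)) for e in entities]
    candidates = candidate_docs(index, token_sets)
    return sorted(d for d in candidates
                  if any(contains_phrase(docs[d], e) for e in entities))


docs = {
    1: "the canon eos 350d is a camera",
    2: "eos is a greek goddess and canon is a rule",
    3: "nikon d80 review",
}
kept = filter_documents(docs, ["canon eos", "nikon d80"])
```

Document 2 passes the index lookup because it contains both "canon" and "eos", but the second filter drops it because the two tokens never appear together as the entity string.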