Patents by Inventor Surajit Chaudhuri
Surajit Chaudhuri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7930301Abstract: A search of an index database or another search method is conducted to identify preliminary results listing one or more selected computer objects having selected identifying information stored in an index database. In addition, one or more selected computer objects of the preliminary search results are correlated with one or more other computer objects that have associations with the selected computer objects of the preliminary search results. Integrated search results are then returned and include the preliminary search results and one or more other computer objects that have associations with the selected computer objects of the preliminary search results. The associations may be determined by a association system and represent relationships between computer files based upon user or other interactions between the objects. The associations between the objects may include similarities between them and their importance.Type: GrantFiled: March 31, 2003Date of Patent: April 19, 2011Assignee: Microsoft CorporationInventors: Cezary Marcjan, Ryszard Kott, Surajit Chaudhuri, Lili Cheng
-
Patent number: 7917514Abstract: A system that can analyze a multi-dimensional input thereafter establishing a search query based upon extracted features from the input. In a particular example, an image can be used as an input to a search mechanism. Pattern recognition and image analysis can be applied to the image thereafter establishing a search query that corresponds to features extracted from the image input. The system can also facilitate indexing multi-dimensional searchable items thereby making them available to be retrieved as results to a search query. More particularly, the system can employ text analysis, pattern and/or speech recognition mechanisms to extract features from searchable items. These extracted features can be employed to index the searchable items.Type: GrantFiled: June 28, 2006Date of Patent: March 29, 2011Assignee: Microsoft CorporationInventors: Stephen Lawler, Eric J. Horvitz, Joshua T. Goodman, Anoop Gupta, Christopher A. Meek, Eric D. Brill, Gary W. Flake, Ramez Naam, Surajit Chaudhuri, Oliver Hurst-Hiller
-
Publication number: 20110038531Abstract: Techniques are described to leverage a set of sample or example matched pairs of strings to learn string transformation rules, which may be used to match data records that are semantically equivalent. In one embodiment, matched pairs of input strings are accessed. For a set of matched pairs, a set of one or more string transformation rules are learned. A transformation rule may include two strings determined to be semantically equivalent. The transformation rules are used to determine whether a first and second string match each other.Type: ApplicationFiled: August 14, 2009Publication date: February 17, 2011Applicant: MICROSOFT CORPORATIONInventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
-
Patent number: 7882121Abstract: A query generation using cardinality constraints process including choosing a first set of parameters for a query, calculating an additional set of parameters based on the first set of parameters, executing the query using additional set of parameters, evaluating the cardinality error the additional set of parameters, and refining the additional set of parameters to meet the desired cardinality constraint. Creating a query and selecting parameters for the query to meet a desired cardinality constraint or set of cardinality constraints when the query is executed against a database may be difficult. A query generation using cardinality constraints process may create a set of parameters for a query which satisfies a desired cardinality constraint or set of cardinality constraints. An application of such a query generation using cardinality constraints process may be database component and code testing.Type: GrantFiled: January 27, 2006Date of Patent: February 1, 2011Assignee: Microsoft CorporationInventors: Nicolas Bruno, Surajit Chaudhuri, Dilys Thomas
-
Publication number: 20100325136Abstract: Techniques for error-tolerant autocompletion are described. While displaying characters of an input string as they are inputted by a user, when a character is added to the input string by the user, matching strings may be selected from among a set of candidate strings by determining which of the candidate strings have a prefix whose characters match the characters of the input string within a given edit distance of the input string.Type: ApplicationFiled: June 23, 2009Publication date: December 23, 2010Applicant: Microsoft CorporationInventors: Surajit Chaudhuri, Shriraghav Kaushik
-
Publication number: 20100318543Abstract: An architecture for providing interactive sessions for physical database design is described, allowing users to readily try different options, identify problems, and obtain physical designs in a flexible way. Embodiments based on a .NET assembly and modifications to a database management system (DBMS) are also described.Type: ApplicationFiled: June 15, 2009Publication date: December 16, 2010Applicant: MICROSOFT CORPORATIONInventors: Surajit Chaudhuri, Nicolas Bruno
-
Publication number: 20100313258Abstract: Identifying synonyms of entities using a collection of documents is disclosed herein. In some aspects, a document from a collection of documents may be analyzed to identify hit sequences that include one or more tokens (e.g., words, number, etc.). The hit sequences may then be used to generate discriminating token sets (DTS's) that are subsets of both the hit sequences and the entity names. The DTS's are matched with corresponding entity names, and then used to create DTS phrases by selecting adjacent text in the document that is proximate to the DTS. The DTS phrases may be analyzed to determine whether the corresponding DTS is synonyms of the entity name. In various aspects, the tokens of an associated entity name that are present in the DTS phrases are used to generate a score for the DTS. When the score at least reaches a threshold, the DTS may be designated as a synonym. A list of synonyms may be generated for each entity name.Type: ApplicationFiled: June 4, 2009Publication date: December 9, 2010Applicant: MICROSOFT CORPORATIONInventors: Surajit Chaudhuri, Venkatesh Ganti, Dong Xin
-
Publication number: 20100299367Abstract: A keyword search is executed on a view of a database based on a Boolean keyword query. The view includes multiple text columns, and the keyword search is executed on each of the multiple text columns in the view. The output results from the keyword search on each of the text columns include tuple identifiers of one or more relevant tuples and a relevancy score for ranking the results of the keyword query.Type: ApplicationFiled: May 20, 2009Publication date: November 25, 2010Applicant: Microsoft CorporationInventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
-
Publication number: 20100293179Abstract: Identifying synonyms of entities using web search results is disclosed herein. In some aspects, a candidate string of tokens of an entity name is selected as a search term. The search term is transmitted by a server to a search engine, which in turn, transmits search results back to the server after performing a search. The server analyzes the search results, generates a score based on the search results, and then determines a status (synonym or not a synonym) of the candidate string based on the score. In further aspects, additional candidate strings are designated as synonyms or not synonyms based on status of the searched candidate string by using relationships of a lattice formed from all possible candidate strings of the entity name.Type: ApplicationFiled: May 14, 2009Publication date: November 18, 2010Applicant: Microsoft CorporationInventors: Surajit Chaudhuri, Venkatesh Ganti, Dong Xin
-
Publication number: 20100262593Abstract: The described implementations relate to filtered index recommendations. In one case a filtered index recommendation (FIR) tool is configured to recommend a final set of filtered indexes to use with a workload. The final set is selected from a first set of candidate filtered indexes and a second set of merged filtered indexes.Type: ApplicationFiled: April 8, 2009Publication date: October 14, 2010Applicant: Microsoft CorporationInventors: Nicolas Bruno, Surajit Chaudhuri, Vivek R. Narasayya, Manoj A. Syamala
-
Publication number: 20100250518Abstract: A flexible query hints system and method for discovering and expressing query hints in a database management system. Embodiments of the flexible query hints system and method include a power hints (Phints) language that enables the specification of constraints to influence a query optimizer. Phints expressions are defined as tree patterns annotated with constraints. Embodiments of the flexible query hints system and method also include techniques to incorporate the power hints language expressions into an extended query optimizer. Theses techniques include computing a directed acyclic graph for Phints expression, deriving candidate matches using the Phints expression and the graph, computing candidate matches, and extracting a revised execution plan having a lowest cost and satisfying constraints of the Phints expression. Embodiments of the flexible query hints system and method include a flexible query hint user interface that allow users to interactively adjust query hints.Type: ApplicationFiled: March 28, 2009Publication date: September 30, 2010Applicant: Microsoft CorporationInventors: Nicolas Bruno, Ravishankar Ramamurthy, Surajit Chaudhuri
-
Publication number: 20100235347Abstract: An exact cardinality query optimization system and method for optimizing a query having a plurality of expressions to obtain a cardinality-optimal query execution plan for the query. Embodiments of the system and method use various techniques to shorten the time necessary to obtain the cardinality-optimal query execution plan, which contains the query execution plan when all cardinalities are exact. Embodiments of the system and method include a covering queries technique that leverages query execution feedback to obtain an unordered subset of relevant expressions for the query, an early termination technique that bounds the cardinality to determine whether the processing can be terminate before each of the expressions are executed, and an expressions ordering technique that finds an ordering of expressions that yields the greatest reduction in time to obtain the cardinality-optimal query execution plan.Type: ApplicationFiled: March 14, 2009Publication date: September 16, 2010Applicant: Microsoft CorporationInventors: Surajit Chaudhuri, Vivek Narasayya, Ravishankar Ramamurthy
-
Patent number: 7739269Abstract: Database systems use a plan cache to avoid the overheads (e.g., time, money) of query recompilation. Query plans can become invalidated by updates to the statistics on data or changes to the physical database design. Once a plan is invalidated, it can be repaired utilizing one or more of the disclosed embodiments. Incremental repair of query plans includes reusing parts of the current plan rather than discarding the plan entirely when it is invalidated. Repair to an existing query plan is attempted before resorting to full recompilation.Type: GrantFiled: January 19, 2007Date of Patent: June 15, 2010Assignee: Microsoft CorporationInventors: Surajit Chaudhuri, Ravishankar Ramamurthy
-
Patent number: 7739221Abstract: A system that can analyze a multi-dimensional input thereafter establishing a search query based upon extracted features from the input. In a particular example, an image can be used as an input to a search mechanism. Pattern recognition and image analysis can be applied to the image thereafter establishing a search query that corresponds to features extracted from the image input. The system can also facilitate indexing multi-dimensional searchable items thereby making them available to be retrieved as results to a search query. More particularly, the system can employ text analysis, pattern and/or speech recognition mechanisms to extract features from searchable items. These extracted features can be employed to index the searchable items.Type: GrantFiled: June 28, 2006Date of Patent: June 15, 2010Assignee: Microsoft CorporationInventors: Stephen Lawler, Eric J. Horvitz, Joshua T. Goodman, Anoop Gupta, Christopher A. Meek, Eric D. Brill, Gary W. Flake, Ramez Naam, Surajit Chaudhuri, Oliver Hurst-Hiller
-
Patent number: 7707207Abstract: The claimed subject matter relates to incorporating a skyline operator within a relational database engine, and more particularly to a database engine that utilizes novel techniques to determine the lowest cost of generating the skyline produced by the skyline operator. The database engine receives queries and associated preferences and, based on a cardinality estimate and a cost estimate, an appropriate skyline generating technique is utilized to produce a skyline representative of the received queries and its associated preferences.Type: GrantFiled: February 17, 2006Date of Patent: April 27, 2010Assignee: Microsoft CorporationInventors: Kaushik Shriraghav, Surajit Chaudhuri, Nilesh N. Dalvi
-
Patent number: 7685090Abstract: The invention concerns a detection of duplicate tuples in a database. Previous domain independent detection of duplicated tuples relied on standard similarity functions (e.g., edit distance, cosine metric) between multi-attribute tuples. However, such prior art approaches result in large numbers of false positives if they are used to identify domain-specific abbreviations and conventions. In accordance with the invention a process for duplicate detection is implemented based on interpreting records from multiple dimensional tables in a data warehouse, which are associated with hierarchies specified through key—foreign key relationships in a snowflake schema. The invention exploits the extra knowledge available from the table hierarchy to develop a high quality, scalable duplicate detection process.Type: GrantFiled: July 14, 2005Date of Patent: March 23, 2010Assignee: Microsoft CorporationInventors: Surajit Chaudhuri, Venkatesh Ganti, Rohit Ananthakrishna
-
Patent number: 7685145Abstract: Various embodiments are disclosed relating to database configuration refinement. In an example embodiment, a method is provided that may include determining a size limitation for a database configuration, determining a workload of the database configuration, and making a determination that a size of the database configuration is greater than a size limit. The method may also include applying either a merge process or a reduction process to decrease the size of the database configuration. The merge process may merge a first index/view with a second index/view to produce a merged index/view, for example. The reduction process may delete a first portion of a first view to produce a reduced view.Type: GrantFiled: March 28, 2006Date of Patent: March 23, 2010Assignee: Microsoft CorporationInventors: Nicolas Bruno, Surajit Chaudhuri
-
Publication number: 20100042963Abstract: Described is a constraint language and related technology by which complex constraints may be used in selecting configurations for use in physical database design tuning. The complex constraint (or constraints) is processed, e.g., in a search framework, to determine and output at least one configuration that meets the constraint, e.g., a best configuration found before a stopping condition is met. The search framework processes a current configuration into candidate configurations, including by searching for candidate configurations from a current configuration based upon a complex constraint, iteratively evaluating a search space until a stopping condition is satisfied, using transformation rules to generate new candidate configurations, and selecting a best candidate configuration. Transformation rules and pruning rules are applied to efficiently perform the search. Constraints may be specified as assertions that need to be satisfied, or as soft assertions that come close to satisfying the constraint.Type: ApplicationFiled: August 13, 2008Publication date: February 18, 2010Applicant: Microsoft CorporationInventors: Nicolas Bruno, Surajit Chaudhuri
-
Publication number: 20090327223Abstract: The described implementations relate to query portals. One technique analyzes search results generated by a web search engine responsive to a user search query. The technique also dynamically generates a query portal that lists the search results as well as entities identified from the search results.Type: ApplicationFiled: June 26, 2008Publication date: December 31, 2009Applicant: MICROSOFT CORPORATIONInventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti, Dong Xin, Sanjay Agrawal, Arnd Christian Konig
-
Publication number: 20090319500Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.Type: ApplicationFiled: June 24, 2008Publication date: December 24, 2009Applicant: Microsoft CorporationInventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti