Patents by Inventor Surajit Chaudhuri

Surajit Chaudhuri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8874592
    Abstract: The subject disclosure pertains to web searches and more particularly toward influencing resultant content to increase relevancy. The resultant content can be influenced by reconfiguring a query and/or filtering results based on user location and/or context information (e.g., user characteristics/profile, prior interaction/usage, temporal, current events, and third party state/context . . . ). Furthermore, the disclosure provides for query execution on at least a subset of designated web content, for example as specified by a user. Still further yet, a localized marketing system is disclosed that provides discount offers to users that match merchant criteria including proximity. A system for actively probing populations of users with different parameters and monitoring responses can be employed to collect data for identifying the best discounts and deadlines to offer to users to achieve desired results.
    Type: Grant
    Filed: June 28, 2006
    Date of Patent: October 28, 2014
    Assignee: Microsoft Corporation
    Inventors: Gary W. Flake, William H. Gates, III, Eric J. Horvitz, Joshua T. Goodman, Surajit Chaudhuri, Trenholme J. Griffin, Oliver Hurst-Hiller, Kenneth A. Moss
  • Publication number: 20140270407
    Abstract: Various technologies pertaining to assigning metadata to images in a personal image collection of a user based upon images and associated metadata assigned thereto that are accessible to the user by way of a social network application are described. An account of the user in a social network application is accessed to retrieve images and metadata that is accessible to the user. A face recognition algorithm is trained based upon the retrieved images and metadata, and the trained face recognition algorithm is executed over the personal image collection of the user, where the personal image collection of the user is external to the social network application.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Shobana Balakrishnan, Surajit Chaudhuri
  • Publication number: 20140207740
    Abstract: Techniques for tenant performance isolation in a multiple-tenant database management system are described. These techniques may include providing a reservation of server resources. The server resources reservation may include a reservation of a central processing unit (CPU), a reservation of Input/Output throughput, and/or a reservation of buffer pool memory or working memory. The techniques may also include a metering mechanism that determines whether the resource reservation is satisfied. The metering mechanism may be independent of an actual resource allocation mechanism associated with the server resource reservation.
    Type: Application
    Filed: January 23, 2013
    Publication date: July 24, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Vivek R. Narasayya, Sudipto Das, Manoj A. Syamala, Hyunjung Park, Surajit Chaudhuri, Badrish Chandramouli, Feng Li
  • Patent number: 8782061
    Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
    Type: Grant
    Filed: June 24, 2008
    Date of Patent: July 15, 2014
    Assignee: Microsoft Corporation
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Patent number: 8745019
    Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string r_e. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string s_e and an entity reference string r_e even in the presence of sparse query log data, while another function takes into account the classes of s_e and r_e. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.
    Type: Grant
    Filed: June 4, 2012
    Date of Patent: June 3, 2014
    Assignee: Microsoft Corporation
    Inventors: Tao Cheng, Kaushik Chakrabarti, Surajit Chaudhuri, Dong Xin
  • Publication number: 20130346464
    Abstract: A data service system is described herein which processes raw data assets from at least one network-accessible system (such as a search system), to produce processed data assets. Enterprise applications can then leverage the processed data assets to perform various environment-specific tasks. In one implementation, the data service system can generate any of: synonym resources for use by an enterprise application in providing synonyms for specified terms associated with entities; augmentation resources for use by an enterprise application in providing supplemental information for specified seed information; and spelling-correction resources for use by an enterprise application in providing spelling information for specified terms, and so on.
    Type: Application
    Filed: June 20, 2012
    Publication date: December 26, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Cheng, Kris Ganjam, Kaushik Chakrabarti, Zhimin Chen, Vivek R. Narasayya, Surajit Chaudhuri
  • Publication number: 20130346421
    Abstract: A targeted disambiguation system is described herein which determines true mentions of a list of named entities in a collection of documents. The list of named entities is homogenous in the sense that the entities pertain to the same subject matter domain. The system determines the true mentions by leveraging the homogeneity in the list, and, more specifically by applying a context similarity hypothesis, a co-mention hypothesis, and an interdependency hypothesis. In one implementation, the system executes its analysis using a graph-based model. The system can operate without the existence of additional information regarding the entities in the list; nevertheless, if such information is available, the system can integrate it into its analysis.
    Type: Application
    Filed: June 22, 2012
    Publication date: December 26, 2013
    Applicant: Microsoft Corporation
    Inventors: Chi Wang, Kaushik Chakrabarti, Tao Cheng, Surajit Chaudhuri
  • Publication number: 20130297655
    Abstract: Various technologies described herein pertain to evaluating service provider compliance with terms of a performance service level agreement (SLA) for a tenant in a multi-tenant database system. The terms of the performance SLA can set a performance criterion as though a level of a resource of hardware of the multi-tenant database system is dedicated to the tenant. An actual performance metric of the resource can be tracked for a workload of the tenant. Further, a baseline performance metric of the resource can be determined for the workload of the tenant. The baseline performance metric can be based on a simulation as though the level of the resource as set in the performance SLA is dedicated to the workload of the tenant. Moreover, the actual performance metric can be compared with the baseline performance metric to evaluate compliance with the performance SLA.
    Type: Application
    Filed: May 2, 2012
    Publication date: November 7, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Vivek Ravindranath Narasayya, Feng Li, Surajit Chaudhuri
  • Publication number: 20130275434
    Abstract: A system enables metadata to be gathered about a data store beginning from the creation and generation of the data store, through subsequent use of the data store. This metadata can include keywords related to the data store and data appearing within the data store. Thus, keywords and other metadata can be generated without owner/creator intervention, with enough semantic meaning to make a discovery process associated with the data store much easier and more efficient. Usage of or communication regarding a data store is monitored and keywords are extracted from the usage or communication. The keywords are then written to or otherwise associated with metadata of the data store. During searching, keywords in the metadata are made available to be used to attempt to match query terms entered by a searcher.
    Type: Application
    Filed: April 11, 2012
    Publication date: October 17, 2013
    Applicant: Microsoft Corporation
    Inventors: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer
  • Publication number: 20130275436
    Abstract: Various embodiments promote the discoverability of data that can be contained within a database. In one or more embodiments, data within a database is organized in a structure having a schema. The structure and data can be processed in a manner that renders one or more pseudo-documents each of which constitutes a sub-structure that can be indexed. Once produced and indexed, the pseudo-documents constitute a set of searchable objects each of which relationally points back to its associated structure within the database. Searches can now be performed against the pseudo-documents which, in turn, returns a set of search results. The set of search results can include multiple sub-sets of pseudo-documents, each sub-set of which is associated with a different structure.
    Type: Application
    Filed: April 11, 2012
    Publication date: October 17, 2013
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Lev Novik, John C. Platt
  • Publication number: 20130268552
    Abstract: A data broker observes datasets that are opened or created by a user. The data broker looks for related datasets in a data catalog. If a related dataset is found, the data broker asks the user if they want to access the related dataset. If the user is interested, then the data broker asks the data owner if they are willing to share access to the related dataset with the user. The data owner may deny access, allow access, or request the user's identity. If the user does not want to provide his or her identity, then access to the related dataset is denied. If the user does provide his or her identity, then the data owner determines whether or not to share the data with that user. Once the owner approves sharing the related dataset, then the dataset or a link to the dataset is sent to the user.
    Type: Application
    Filed: April 10, 2012
    Publication date: October 10, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer, Efim Hudis
  • Publication number: 20130268531
    Abstract: In one embodiment, datasets are stored in a catalog. The datasets are enriched by establishing relationships among the domains in different datasets. A user searches for relevant datasets by providing examples of the domains of interest. The system identifies datasets corresponding to the user-provided examples. The system then identifies connected subsets of the datasets that are directly linked or indirectly linked through other domains. The user provides known relationship examples to filter the connected subsets and to identify the connected subsets that are most relevant to the user's query. The selected connected subsets may be further analyzed by business intelligence/analytics to create pivot tables or to process the data.
    Type: Application
    Filed: April 10, 2012
    Publication date: October 10, 2013
    Applicant: Microsoft Corporation
    Inventors: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer, Efim Hudis, Kunal Mukerjee, Christopher Alan Hays
  • Publication number: 20130238621
    Abstract: The subject disclosure is directed towards providing data for augmenting an entity-attribute-related task. Pre-processing is performed on entity-attribute tables extracted from the web, e.g., to provide indexes that are accessible to find data that completes augmentation tasks. The indexes are based on both direct mappings and indirect mappings between tables. Example augmentation tasks include queries for augmented data based on an attribute name or examples, or finding synonyms for augmentation. An online query is efficiently processed by accessing the indexes to return augmented data related to the task.
    Type: Application
    Filed: March 6, 2012
    Publication date: September 12, 2013
    Applicant: Microsoft Corporation
    Inventors: Kris K. Ganjam, Kaushik Chakrabarti, Mohamed A. Yakout, Surajit Chaudhuri
  • Patent number: 8533203
    Abstract: Identifying synonyms of entities using a collection of documents is disclosed herein. In some aspects, a document from a collection of documents may be analyzed to identify hit sequences that include one or more tokens (e.g., words, numbers, etc.). The hit sequences may then be used to generate discriminating token sets (DTS's) that are subsets of both the hit sequences and the entity names. The DTS's are matched with corresponding entity names, and then used to create DTS phrases by selecting adjacent text in the document that is proximate to the DTS. The DTS phrases may be analyzed to determine whether the corresponding DTS is a synonym of the entity name. In various aspects, the tokens of an associated entity name that are present in the DTS phrases are used to generate a score for the DTS. When the score at least reaches a threshold, the DTS may be designated as a synonym. A list of synonyms may be generated for each entity name.
    Type: Grant
    Filed: June 4, 2009
    Date of Patent: September 10, 2013
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Venkatesh Ganti, Dong Xin
  • Publication number: 20130232129
    Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string r_e. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string s_e and an entity reference string r_e even in the presence of sparse query log data, while another function takes into account the classes of s_e and r_e. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.
    Type: Application
    Filed: June 4, 2012
    Publication date: September 5, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Cheng, Kaushik Chakrabarti, Surajit Chaudhuri, Dong Xin
  • Patent number: 8527893
    Abstract: This patent application relates to taxonomy editing. One implementation involves a taxonomy editor configured to generate a visual representation of a taxonomy associated with a set of scientific papers. The taxonomy editor includes a properties module configured to identify properties relating to an individual node of the taxonomy and a statistics module configured to determine trends relating to the individual node. The taxonomy editor further includes a similarity module configured to evaluate keyword similarity relative to individual scientific papers associated with the individual node. The taxonomy editor also includes a suggestion module configured to utilize the properties, the trends and the keyword similarity to identify potential modifications to the taxonomy. The taxonomy editor is further configured to present at least some of the potential modifications, the properties, the trends, and the keyword similarity concurrently with the visual representation of the taxonomy.
    Type: Grant
    Filed: February 26, 2010
    Date of Patent: September 3, 2013
    Assignee: Microsoft Corporation
    Inventors: Sanjay Agrawal, Surajit Chaudhuri, Venkatesh Ganti, Yuri Siradeghyan
  • Publication number: 20130151504
    Abstract: The claimed subject matter provides a method for providing a progress estimate for a database query. The method includes determining static features of a query plan for the database query. The method also includes selecting an initial progress estimator based on the static features and a trained machine learning model. The model is trained using static features of a plurality of query plans, and dynamic features of the plurality of query plans. Further, the method includes determining dynamic features of the query plan for each of a plurality of candidate estimators. Additionally, the method includes selecting a revised progress estimator based on the static features, the dynamic features and a trained machine learning model for each of the candidate estimators. The method further includes producing the progress estimate based on the revised progress estimator.
    Type: Application
    Filed: December 9, 2011
    Publication date: June 13, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Christian Konig, Bolin Ding, Surajit Chaudhuri, Vivek Narasayya
  • Publication number: 20130132381
    Abstract: A plurality of description phrases associated with a first domain may be determined, based on an analysis of a first plurality of documents to determine co-occurrences of the description phrases with one or more name labels associated with the first domain. An entity associated with the first domain may be obtained. An analysis of a second plurality of documents may be initiated to identify co-occurrences of mentions of the obtained entity and one or more of the plurality of description phrases, and contexts associated with each of the co-occurrences of the mentions and description phrases, in each one of the second plurality of documents. A description tag association between the obtained entity and one of the description phrases may be determined, based on an analysis of the identified contexts.
    Type: Application
    Filed: November 17, 2011
    Publication date: May 23, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Tao Cheng
  • Publication number: 20130091120
    Abstract: A fuzzy joins system that is integrated in a database system generates fuzzy joins between records from two datasets. The fuzzy joins system includes a tokenizer to generate tokens for data records and a transformer to find transforms for the tokens. The fuzzy joins system invokes a signature generator, running within a runtime layer of the database system, to generate signatures for data records based on the tokens and their transforms. Subsequently, an equi-join operation joins the records from the two datasets with at least one equal signature. A similarity calculator, running within a runtime layer of the database system, computes a similarity measure using the token information of the joined records. If the similarity measure for any two records is above a threshold, the fuzzy joins system generates a fuzzy join between such two records.
    Type: Application
    Filed: October 5, 2011
    Publication date: April 11, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Kris Ganjam, Vivek Ravindranath Narasayya, Raghav Kaushik, Arvind Arasu, Surajit Chaudhuri
  • Patent number: 8386529
    Abstract: This patent application relates to foreign-key detection. One implementation obtains a set of data tables. This implementation automatically determines foreign-key relationships of columns from separate tables of the set.
    Type: Grant
    Filed: February 21, 2010
    Date of Patent: February 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Vivek R. Narasayya, Zhimin Chen