Patents by Inventor Shriraghav Kaushik

Shriraghav Kaushik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11232214
    Abstract: Methods, systems, and computer-readable media are directed towards receiving, at an untrusted component, a query for a data store. The query includes a plurality of data operations. The data store is accessible by the untrusted component. A first proper subset of data operations is determined from the plurality of data operations that do not access sensitive data within the data store. A second proper subset of data operations is determined from the plurality of data operations that access sensitive data within the data store. The first proper subset of data operations is executed, at the untrusted component, to create first results. The second proper subset of data operations is sent to a trusted component for execution. Second results based on the sending the second proper subset of data operations are received from the trusted component. Results to the query are returned based on the first results and the second results.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: January 25, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth H. Eguro, Manas Rajendra Joglekar, Donald Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
  • Publication number: 20200272744
    Abstract: Methods, systems, and computer-readable media are directed towards receiving, at an untrusted component, a query for a data store. The query includes a plurality of data operations. The data store is accessible by the untrusted component. A first proper subset of data operations is determined from the plurality of data operations that do not access sensitive data within the data store. A second proper subset of data operations is determined from the plurality of data operations that access sensitive data within the data store. The first proper subset of data operations is executed, at the untrusted component, to create first results. The second proper subset of data operations is sent to a trusted component for execution. Second results based on the sending the second proper subset of data operations are received from the trusted component. Results to the query are returned based on the first results and the second results.
    Type: Application
    Filed: May 13, 2020
    Publication date: August 27, 2020
    Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth H. Eguro, Manas Rajendra Joglekar, Donald Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
  • Patent number: 10671736
    Abstract: Methods, systems, and computer-readable media are directed towards receiving, at an untrusted component, a query for a data store. The query includes a plurality of data operations. The data store is accessible by the untrusted component. A first proper subset of data operations is determined from the plurality of data operations that do not access sensitive data within the data store. A second proper subset of data operations is determined from the plurality of data operations that access sensitive data within the data store. The first proper subset of data operations is executed, at the untrusted component, to create first results. The second proper subset of data operations is sent to a trusted component for execution. Second results based on the sending the second proper subset of data operations are received from the trusted component. Results to the query are returned based on the first results and the second results.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: June 2, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth H. Eguro, Manas Rajendra Joglekar, Donald Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
  • Publication number: 20180046812
    Abstract: Methods, systems, and computer-readable media are directed towards receiving, at an untrusted component, a query for a data store. The query includes a plurality of data operations. The data store is accessible by the untrusted component. A first proper subset of data operations is determined from the plurality of data operations that do not access sensitive data within the data store. A second proper subset of data operations is determined from the plurality of data operations that access sensitive data within the data store. The first proper subset of data operations is executed, at the untrusted component, to create first results. The second proper subset of data operations is sent to a trusted component for execution. Second results based on the sending the second proper subset of data operations are received from the trusted component. Results to the query are returned based on the first results and the second results.
    Type: Application
    Filed: October 27, 2017
    Publication date: February 15, 2018
    Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth H. Eguro, Manas Rajendra Joglekar, Donald Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
  • Patent number: 9747456
    Abstract: The subject disclosure is directed towards secure query processing over encrypted database records without disclosing information to an adversary except for permitted information. In order to adapting semantic security to a database encryption scheme, a security model for all query processing is specified by a client and used to determine which information is permitted to be disclosed and which information is not permitted. Based upon the security model, a trusted, secure query processor transforms each query and an encrypted database into secure query results. Even though the adversary can view the secure query results during communication to the client, the adversary cannot determine any reliable information regarding the secure query results or the encrypted database.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: August 29, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Arvind Arasu, Shriraghav Kaushik, Ravishankar Ramamurthy
  • Patent number: 9081817
    Abstract: An active learning record matching system and method for producing a record matching package that is used to identify pairs of duplicate records. Embodiments of the system and method allow a precision threshold to be specified and then generate a learned record matching package having precision greater than this threshold and a recall close to the best possible recall. Embodiments of the system and method use a blocking technique to restrict the space of record matching packages considered and scale to large inputs. The learning method considers several record matching packages, estimates the precision and recall of the packages, and identifies the package with maximum recall having precision greater than equal to the given precision threshold. A human domain expert labels a sample of record pairs in the output of the package as matches or non-matches and this labeling is used to estimate the precision of the package.
    Type: Grant
    Filed: April 11, 2011
    Date of Patent: July 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Arvind Arasu, Michaela Götz, Shriraghav Kaushik
  • Publication number: 20140281511
    Abstract: The subject disclosure is directed towards using trusted hardware to achieve secure data processing over a network. For a given set of data store operations, some operations are directed to sensitive data (e.g., encrypted data fields). These operations are compiled into a set of expressions invoking trusted hardware code configured to evaluate these expressions using corresponding data centric primitive programs. Because the trusted hardware is configured to maintain key data for encrypting/decrypting the sensitive data, the sensitive data is not accessible by an untrusted component while the sensitive data is decrypted.
    Type: Application
    Filed: August 27, 2013
    Publication date: September 18, 2014
    Applicant: Microsoft Corporation
    Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth Eguro, Manas Rajendra Joglekar, Donald A. Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
  • Publication number: 20140230070
    Abstract: SQL query auditing technique embodiments are presented that involve auditing data in a relational database accessed during execution of a SQL search query via a query execution plan to detect and report access to sensitive data. In one embodiment, a computer is used for inputting a SELECT trigger which specifies the sensitive data resident in the relational database that is to be monitored for access during execution of the SQL search query. In addition, the SELECT trigger specifies an action that is to be taken once execution of the SQL search query is completed, if sensitive data was accessed. Then, during execution of the query execution plan, access to sensitive data is monitored, and whenever such access is detected, it is reported. Next, upon completion of the execution of the SQL search query, the action specified in the SELECT trigger is performed if access to sensitive data was reported.
    Type: Application
    Filed: February 14, 2013
    Publication date: August 14, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Ravi Ramamurthy, Shriraghav Kaushik, Daniel Fabbri
  • Publication number: 20130132352
    Abstract: The present application provides for techniques for implementing data auditing embodiments that determine whether a query into a database is or has referenced forbidden data within the database. Various techniques are given for efficiently finding all tuples in a database referenced by a given query. A set of sensitive data is determined within a database and the set of sensitive data is employed to define a forbidden view within the database. Data within the database may be annotated to provide efficient identification of data access by query. Incoming queries may be analyzed and modified to propagate annotations for analyzing what data is or was accessed.
    Type: Application
    Filed: November 23, 2011
    Publication date: May 23, 2013
    Applicant: Microsoft Corporation
    Inventors: Shriraghav Kaushik, Ravishankar Ramamurthy, Yupeng Fu
  • Publication number: 20120259802
    Abstract: An active learning record matching system and method for producing a record matching package that is used to identify pairs of duplicate records. Embodiments of the system and method allow a precision threshold to be specified and then generate a learned record matching package having precision greater than this threshold and a recall close to the best possible recall. Embodiments of the system and method use a blocking technique to restrict the space of record matching packages considered and scale to large inputs. The learning method considers several record matching packages, estimates the precision and recall of the packages, and identifies the package with maximum recall having precision greater than equal to the given precision threshold. A human domain expert labels a sample of record pairs in the output of the package as matches or non-matches and this labeling is used to estimate the precision of the package.
    Type: Application
    Filed: April 11, 2011
    Publication date: October 11, 2012
    Applicant: Microsoft Corporation
    Inventors: Arvind Arasu, Michaela Götz, Shriraghav Kaushik
  • Patent number: 8249336
    Abstract: Techniques are described to leverage a set of sample or example matched pairs of strings to learn string transformation rules, which may be used to match data records that are semantically equivalent. In one embodiment, matched pairs of input strings are accessed. For a set of matched pairs, a set of one or more string transformation rules are learned. A transformation rule may include two strings determined to be semantically equivalent. The transformation rules are used to determine whether a first and second string match each other.
    Type: Grant
    Filed: August 14, 2009
    Date of Patent: August 21, 2012
    Assignee: Microsoft Corporation
    Inventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
  • Patent number: 8204866
    Abstract: A deduplication algorithm that provides improved accuracy in data deduplication by using aggregate and/or groupwise constraints. Deduplication is accomplished using only as many of these constraints that are satisfied rather than be imposed inflexibly as hard constraints. Additionally, textual similarity between tuples is leveraged to restrict the search space. The algorithm begins with a coarse initial partition of data records and continues by raising the similarity threshold until the threshold splits a given partition. This sequence of splits defines a rich space of alternatives. Over this space, an algorithm finds a partition of the input that maximizes constraint satisfaction. In the context of groupwise aggregation constraints for deduplication all SQL (structured query language) aggregates are allowed, including summation.
    Type: Grant
    Filed: May 18, 2007
    Date of Patent: June 19, 2012
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Venkatesh Ganti, Shriraghav Kaushik, Anish Das Sarma
  • Patent number: 8046339
    Abstract: Example-driven creation of record matching queries. The disclosed architecture employs techniques that exploit the availability of positive (or matching) and negative (non-matching) examples to search through this space and suggest an initial record matching query. The record matching task is modeled as that of designing an operator tree obtained by composing a few primitive operators. This ensures that record matching programs be executable efficiently and scalably over large input relations. The architecture joins records across multiple (e.g., two) relations (e.g., R and S). The architecture exploits the monotonicity property of similarity functions for record matching in the relations, in that, any pair of matching records have a higher similarity value than non-matching record pairs on at least one similarity function.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: October 25, 2011
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Bee Chung Chen, Venkatesh Ganti, Shriraghav Kaushik
  • Publication number: 20110038531
    Abstract: Techniques are described to leverage a set of sample or example matched pairs of strings to learn string transformation rules, which may be used to match data records that are semantically equivalent. In one embodiment, matched pairs of input strings are accessed. For a set of matched pairs, a set of one or more string transformation rules are learned. A transformation rule may include two strings determined to be semantically equivalent. The transformation rules are used to determine whether a first and second string match each other.
    Type: Application
    Filed: August 14, 2009
    Publication date: February 17, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
  • Publication number: 20100325136
    Abstract: Techniques for error-tolerant autocompletion are described. While displaying characters of an input string as they are inputted by a user, when a character is added to the input string by the user, matching strings may be selected from among a set of candidate strings by determining which of the candidate strings have a prefix whose characters match the characters of the input string within a given edit distance of the input string.
    Type: Application
    Filed: June 23, 2009
    Publication date: December 23, 2010
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Shriraghav Kaushik
  • Patent number: 7720883
    Abstract: Architecture that provides a data profile computation technique which employs key profile computation and data pattern profile computation. Key profile computation in a data table includes both exact keys as well as approximate keys, and is based on key strengths. A key strength of 100% is an exact key, and any other percentage in an approximate key. The key strength is estimated based on the number of table rows that have duplicated attribute values. Only column sets that exceed a threshold value are returned. Pattern profiling identifies a small set of regular expression patterns which best describe the patterns within a given set of attribute values. Pattern profiling includes three phases: a first phases for determining token regular expressions, a second phase for determining candidate regular expressions, and a third phase for identifying the best regular expressions of the candidates that match the attribute values.
    Type: Grant
    Filed: June 27, 2007
    Date of Patent: May 18, 2010
    Assignee: Microsoft Corporation
    Inventors: Zhimin Chen, Venkatesh Ganti, Gunjan Jha, Shriraghav Kaushik, Vivek Narasayya
  • Patent number: 7610283
    Abstract: Input set indexing for set-similarity lookups. The architecture provides input to an indexing process that enables more efficient lookups for large data sets (e.g., disk-based) without requiring a full scan of the input. A new index structure is provided, the output of which is exact, rather than approximate. The similarity of two sets is specified using a similarity function that maps two sets to a numeric value that represents similarity of the two sets. Threshold-based lookups are addressed where two sets are considered similar if the numeric similarity score is above a threshold. The structure efficiently identifies all input sets within a distance k (e.g., a hamming distance) of the query set. Additional information in the form of frequency of elements (the number of input sets in which an element occurs) is used to improve index performance.
    Type: Grant
    Filed: June 12, 2007
    Date of Patent: October 27, 2009
    Assignee: Microsoft Corporation
    Inventors: Arvind Arasu, Venkatesh Ganti, Shriraghav Kaushik
  • Publication number: 20090210418
    Abstract: A transformation-based record matching technique. The technique provides a flexible way to account for synonyms and more general forms of string equivalences when performing record matching by taking as explicit input user-defined transformation rules (such as, for example, the fact that “Robert” and “Bob” that are synonymous). The input string and user-defined transformation rules are used to generate a larger set of strings which are used when performing record matching. Both the input string and data elements in a database can be transformed using the user-defined transformation rules in order to generate a larger set of potential record matches. These potential record matches can then be subjected to a threshold test in order to determine one or more best matches. Additionally, signature-based similarity functions are used to improve the computational efficiency of the technique.
    Type: Application
    Filed: February 15, 2008
    Publication date: August 20, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
  • Publication number: 20090083238
    Abstract: Stop-and-restart query execution that partially leverages the work already performed during the initial execution of the query to reduce the execution time during a restart. The technique selectively saves information from a previous execution of the query so that the overhead associated with restarting the query execution can be bounded. Despite saving only limited information, the disclosed technique substantially reduces the running time of the restarted query. The stop-and-restart query execution technique is constrained to save and reuse only a bounded number of records (intermediate records or output records) thereby releasing all other resources, rather than some of the resources. The technique chooses a subset of the records to save that were found during normal execution and then skipping the corresponding records when performing a scan during restart to prevent the duplication of execution. A skip-scan operator is employed to facilitate the disclosed restart technique.
    Type: Application
    Filed: September 21, 2007
    Publication date: March 26, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Surajit Chaudhuri, Shriraghav Kaushik, Abhijit Pol, Ravishankar Ramamurthy
  • Publication number: 20090006392
    Abstract: Architecture that provides a data profile computation technique which employs key profile computation and data pattern profile computation. Key profile computation in a data table includes both exact keys as well as approximate keys, and is based on key strengths. A key strength of 100% is an exact key, and any other percentage in an approximate key. The key strength is estimated based on the number of table rows that have duplicated attribute values. Only column sets that exceed a threshold value are returned. Pattern profiling identifies a small set of regular expression patterns which best describe the patterns within a given set of attribute values. Pattern profiling includes three phases: a first phases for determining token regular expressions, a second phase for determining candidate regular expressions, and a third phase for identifying the best regular expressions of the candidates that match the attribute values.
    Type: Application
    Filed: June 27, 2007
    Publication date: January 1, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Zhimin Chen, Venkatesh Ganti, Gunjan Jha, Shriraghav Kaushik, Vivek Narasayya