Patents by Inventor Shriraghav Kaushik
Shriraghav Kaushik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11232214
Abstract: Methods, systems, and computer-readable media are directed towards receiving, at an untrusted component, a query for a data store. The query includes a plurality of data operations. The data store is accessible by the untrusted component. A first proper subset of data operations is determined from the plurality of data operations that do not access sensitive data within the data store. A second proper subset of data operations is determined from the plurality of data operations that access sensitive data within the data store. The first proper subset of data operations is executed, at the untrusted component, to create first results. The second proper subset of data operations is sent to a trusted component for execution. Second results based on sending the second proper subset of data operations are received from the trusted component. Results to the query are returned based on the first results and the second results.
Type: Grant
Filed: May 13, 2020
Date of Patent: January 25, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth H. Eguro, Manas Rajendra Joglekar, Donald Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
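The split described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the `Operation` type, the `SENSITIVE_COLUMNS` set, and the two executor callbacks are hypothetical names introduced only for illustration, and the abstract does not say how sensitivity is declared or how results are combined.

```python
from dataclasses import dataclass

# Hypothetical set of sensitive columns (an assumption, not from the abstract).
SENSITIVE_COLUMNS = {"ssn", "salary"}


@dataclass(frozen=True)
class Operation:
    """One data operation of a query and the columns it touches."""
    name: str
    columns: frozenset


def partition_operations(operations, sensitive=SENSITIVE_COLUMNS):
    """Split operations into those that avoid and those that touch sensitive data."""
    plain = [op for op in operations if not (op.columns & sensitive)]
    protected = [op for op in operations if op.columns & sensitive]
    return plain, protected


def run_query(operations, run_untrusted, run_trusted):
    """Run non-sensitive operations locally and hand the rest to a trusted executor."""
    plain, protected = partition_operations(operations)
    first_results = [run_untrusted(op) for op in plain]
    second_results = [run_trusted(op) for op in protected]
    # Simplest possible combination step: concatenate both result lists.
    return first_results + second_results
```

In this sketch `run_trusted` stands in for whatever channel reaches the trusted component; a real system would also have to merge the two result sets according to the query plan rather than concatenate them.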
-
Publication number: 20200272744
Abstract: Methods, systems, and computer-readable media are directed towards receiving, at an untrusted component, a query for a data store. The query includes a plurality of data operations. The data store is accessible by the untrusted component. A first proper subset of data operations is determined from the plurality of data operations that do not access sensitive data within the data store. A second proper subset of data operations is determined from the plurality of data operations that access sensitive data within the data store. The first proper subset of data operations is executed, at the untrusted component, to create first results. The second proper subset of data operations is sent to a trusted component for execution. Second results based on sending the second proper subset of data operations are received from the trusted component. Results to the query are returned based on the first results and the second results.
Type: Application
Filed: May 13, 2020
Publication date: August 27, 2020
Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth H. Eguro, Manas Rajendra Joglekar, Donald Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
-
Patent number: 10671736
Abstract: Methods, systems, and computer-readable media are directed towards receiving, at an untrusted component, a query for a data store. The query includes a plurality of data operations. The data store is accessible by the untrusted component. A first proper subset of data operations is determined from the plurality of data operations that do not access sensitive data within the data store. A second proper subset of data operations is determined from the plurality of data operations that access sensitive data within the data store. The first proper subset of data operations is executed, at the untrusted component, to create first results. The second proper subset of data operations is sent to a trusted component for execution. Second results based on sending the second proper subset of data operations are received from the trusted component. Results to the query are returned based on the first results and the second results.
Type: Grant
Filed: October 27, 2017
Date of Patent: June 2, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth H. Eguro, Manas Rajendra Joglekar, Donald Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
-
Publication number: 20180046812
Abstract: Methods, systems, and computer-readable media are directed towards receiving, at an untrusted component, a query for a data store. The query includes a plurality of data operations. The data store is accessible by the untrusted component. A first proper subset of data operations is determined from the plurality of data operations that do not access sensitive data within the data store. A second proper subset of data operations is determined from the plurality of data operations that access sensitive data within the data store. The first proper subset of data operations is executed, at the untrusted component, to create first results. The second proper subset of data operations is sent to a trusted component for execution. Second results based on sending the second proper subset of data operations are received from the trusted component. Results to the query are returned based on the first results and the second results.
Type: Application
Filed: October 27, 2017
Publication date: February 15, 2018
Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth H. Eguro, Manas Rajendra Joglekar, Donald Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
-
Patent number: 9747456
Abstract: The subject disclosure is directed towards secure query processing over encrypted database records without disclosing information to an adversary except for permitted information. In order to adapt semantic security to a database encryption scheme, a security model for all query processing is specified by a client and used to determine which information is permitted to be disclosed and which information is not permitted. Based upon the security model, a trusted, secure query processor transforms each query and an encrypted database into secure query results. Even though the adversary can view the secure query results during communication to the client, the adversary cannot determine any reliable information regarding the secure query results or the encrypted database.
Type: Grant
Filed: March 15, 2013
Date of Patent: August 29, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Arvind Arasu, Shriraghav Kaushik, Ravishankar Ramamurthy
-
Patent number: 9081817
Abstract: An active learning record matching system and method for producing a record matching package that is used to identify pairs of duplicate records. Embodiments of the system and method allow a precision threshold to be specified and then generate a learned record matching package having precision greater than this threshold and a recall close to the best possible recall. Embodiments of the system and method use a blocking technique to restrict the space of record matching packages considered and scale to large inputs. The learning method considers several record matching packages, estimates the precision and recall of the packages, and identifies the package with maximum recall having precision greater than or equal to the given precision threshold. A human domain expert labels a sample of record pairs in the output of the package as matches or non-matches and this labeling is used to estimate the precision of the package.
Type: Grant
Filed: April 11, 2011
Date of Patent: July 14, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Arvind Arasu, Michaela Götz, Shriraghav Kaushik
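The selection rule in this abstract (maximum recall subject to a precision floor) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: precision and recall are computed against a fully labeled set of true matches, whereas the abstract describes estimating precision from a labeled sample, and the function names are hypothetical.

```python
def precision_recall(predicted, true_matches):
    """Return (precision, recall) of a set of predicted duplicate pairs."""
    predicted, true_matches = set(predicted), set(true_matches)
    if not predicted:
        # Convention chosen here: an empty output is vacuously precise.
        return 1.0, 0.0
    hits = len(predicted & true_matches)
    precision = hits / len(predicted)
    recall = hits / len(true_matches) if true_matches else 0.0
    return precision, recall


def select_package(candidates, true_matches, precision_threshold):
    """Pick the candidate with maximum recall among those meeting the precision floor.

    `candidates` maps a package name to the set of record pairs it outputs.
    Returns None when no candidate reaches the threshold.
    """
    best_name, best_recall = None, -1.0
    for name, predicted in candidates.items():
        precision, recall = precision_recall(predicted, true_matches)
        if precision >= precision_threshold and recall > best_recall:
            best_name, best_recall = name, recall
    return best_name
```

With three candidates of increasing looseness, a high threshold would favor the tightest package that still recovers the most true pairs, while a low threshold lets the loosest, highest-recall package win.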
-
Publication number: 20140281511
Abstract: The subject disclosure is directed towards using trusted hardware to achieve secure data processing over a network. For a given set of data store operations, some operations are directed to sensitive data (e.g., encrypted data fields). These operations are compiled into a set of expressions invoking trusted hardware code configured to evaluate these expressions using corresponding data centric primitive programs. Because the trusted hardware is configured to maintain key data for encrypting/decrypting the sensitive data, the sensitive data is not accessible by an untrusted component while the sensitive data is decrypted.
Type: Application
Filed: August 27, 2013
Publication date: September 18, 2014
Applicant: Microsoft Corporation
Inventors: Shriraghav Kaushik, Arvind Arasu, Spyridon Blanas, Kenneth Eguro, Manas Rajendra Joglekar, Donald A. Kossmann, Ravishankar Ramamurthy, Prasang Upadhyaya, Ramarathnam Venkatesan
-
Publication number: 20140230070
Abstract: SQL query auditing technique embodiments are presented that involve auditing data in a relational database accessed during execution of a SQL search query via a query execution plan to detect and report access to sensitive data. In one embodiment, a computer is used for inputting a SELECT trigger which specifies the sensitive data resident in the relational database that is to be monitored for access during execution of the SQL search query. In addition, the SELECT trigger specifies an action that is to be taken once execution of the SQL search query is completed, if sensitive data was accessed. Then, during execution of the query execution plan, access to sensitive data is monitored, and whenever such access is detected, it is reported. Next, upon completion of the execution of the SQL search query, the action specified in the SELECT trigger is performed if access to sensitive data was reported.
Type: Application
Filed: February 14, 2013
Publication date: August 14, 2014
Applicant: MICROSOFT CORPORATION
Inventors: Ravi Ramamurthy, Shriraghav Kaushik, Daniel Fabbri
-
Publication number: 20130132352
Abstract: The present application provides for techniques for implementing data auditing embodiments that determine whether a query into a database references or has referenced forbidden data within the database. Various techniques are given for efficiently finding all tuples in a database referenced by a given query. A set of sensitive data is determined within a database and the set of sensitive data is employed to define a forbidden view within the database. Data within the database may be annotated to provide efficient identification of data access by query. Incoming queries may be analyzed and modified to propagate annotations for analyzing what data is or was accessed.
Type: Application
Filed: November 23, 2011
Publication date: May 23, 2013
Applicant: Microsoft Corporation
Inventors: Shriraghav Kaushik, Ravishankar Ramamurthy, Yupeng Fu
-
Publication number: 20120259802
Abstract: An active learning record matching system and method for producing a record matching package that is used to identify pairs of duplicate records. Embodiments of the system and method allow a precision threshold to be specified and then generate a learned record matching package having precision greater than this threshold and a recall close to the best possible recall. Embodiments of the system and method use a blocking technique to restrict the space of record matching packages considered and scale to large inputs. The learning method considers several record matching packages, estimates the precision and recall of the packages, and identifies the package with maximum recall having precision greater than or equal to the given precision threshold. A human domain expert labels a sample of record pairs in the output of the package as matches or non-matches and this labeling is used to estimate the precision of the package.
Type: Application
Filed: April 11, 2011
Publication date: October 11, 2012
Applicant: Microsoft Corporation
Inventors: Arvind Arasu, Michaela Götz, Shriraghav Kaushik
-
Patent number: 8249336
Abstract: Techniques are described to leverage a set of sample or example matched pairs of strings to learn string transformation rules, which may be used to match data records that are semantically equivalent. In one embodiment, matched pairs of input strings are accessed. For a set of matched pairs, a set of one or more string transformation rules are learned. A transformation rule may include two strings determined to be semantically equivalent. The transformation rules are used to determine whether a first and second string match each other.
Type: Grant
Filed: August 14, 2009
Date of Patent: August 21, 2012
Assignee: Microsoft Corporation
Inventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
-
Patent number: 8204866
Abstract: A deduplication algorithm that provides improved accuracy in data deduplication by using aggregate and/or groupwise constraints. Deduplication satisfies as many of these constraints as possible rather than imposing them inflexibly as hard constraints. Additionally, textual similarity between tuples is leveraged to restrict the search space. The algorithm begins with a coarse initial partition of data records and continues by raising the similarity threshold until the threshold splits a given partition. This sequence of splits defines a rich space of alternatives. Over this space, an algorithm finds a partition of the input that maximizes constraint satisfaction. In the context of groupwise aggregation constraints for deduplication, all SQL (structured query language) aggregates are allowed, including summation.
Type: Grant
Filed: May 18, 2007
Date of Patent: June 19, 2012
Assignee: Microsoft Corporation
Inventors: Surajit Chaudhuri, Venkatesh Ganti, Shriraghav Kaushik, Anish Das Sarma
-
Patent number: 8046339
Abstract: Example-driven creation of record matching queries. The disclosed architecture employs techniques that exploit the availability of positive (or matching) and negative (non-matching) examples to search through this space and suggest an initial record matching query. The record matching task is modeled as that of designing an operator tree obtained by composing a few primitive operators. This ensures that record matching programs are executable efficiently and scalably over large input relations. The architecture joins records across multiple (e.g., two) relations (e.g., R and S). The architecture exploits the monotonicity property of similarity functions for record matching in the relations, in that any pair of matching records has a higher similarity value than non-matching record pairs on at least one similarity function.
Type: Grant
Filed: June 5, 2007
Date of Patent: October 25, 2011
Assignee: Microsoft Corporation
Inventors: Surajit Chaudhuri, Bee Chung Chen, Venkatesh Ganti, Shriraghav Kaushik
-
Publication number: 20110038531
Abstract: Techniques are described to leverage a set of sample or example matched pairs of strings to learn string transformation rules, which may be used to match data records that are semantically equivalent. In one embodiment, matched pairs of input strings are accessed. For a set of matched pairs, a set of one or more string transformation rules are learned. A transformation rule may include two strings determined to be semantically equivalent. The transformation rules are used to determine whether a first and second string match each other.
Type: Application
Filed: August 14, 2009
Publication date: February 17, 2011
Applicant: MICROSOFT CORPORATION
Inventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
-
Publication number: 20100325136
Abstract: Techniques for error-tolerant autocompletion are described. While displaying characters of an input string as they are inputted by a user, when a character is added to the input string by the user, matching strings may be selected from among a set of candidate strings by determining which of the candidate strings have a prefix whose characters match the characters of the input string within a given edit distance of the input string.
Type: Application
Filed: June 23, 2009
Publication date: December 23, 2010
Applicant: Microsoft Corporation
Inventors: Surajit Chaudhuri, Shriraghav Kaushik
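The matching condition in this abstract (some prefix of a candidate lies within a given edit distance of the input) can be checked with a standard dynamic program. The Python sketch below is a brute-force illustration with hypothetical function names; it rescans every candidate on each call and says nothing about how the published technique achieves efficiency as the user types.

```python
def prefix_edit_distance(query, candidate):
    """Smallest Levenshtein distance between `query` and any prefix of `candidate`."""
    # previous[j] = distance between the current candidate prefix and query[:j];
    # start with the empty candidate prefix.
    previous = list(range(len(query) + 1))
    best = previous[-1]
    for i, cand_char in enumerate(candidate, start=1):
        current = [i]
        for j, query_char in enumerate(query, start=1):
            current.append(min(
                previous[j] + 1,                             # drop cand_char
                current[j - 1] + 1,                          # drop query_char
                previous[j - 1] + (cand_char != query_char)  # match or substitute
            ))
        # The last cell compares the whole query with candidate[:i].
        best = min(best, current[-1])
        previous = current
    return best


def autocomplete(query, candidates, max_edits=1):
    """Return candidates having a prefix within `max_edits` edits of `query`."""
    return [c for c in candidates if prefix_edit_distance(query, c) <= max_edits]
```

For instance, the input `mcr` is one insertion away from the prefix `micr` of `microsoft`, so that candidate is kept when `max_edits` is 1.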
-
Patent number: 7720883
Abstract: Architecture that provides a data profile computation technique which employs key profile computation and data pattern profile computation. Key profile computation in a data table includes both exact keys as well as approximate keys, and is based on key strengths. A key strength of 100% is an exact key, and any other percentage is an approximate key. The key strength is estimated based on the number of table rows that have duplicated attribute values. Only column sets that exceed a threshold value are returned. Pattern profiling identifies a small set of regular expression patterns which best describe the patterns within a given set of attribute values. Pattern profiling includes three phases: a first phase for determining token regular expressions, a second phase for determining candidate regular expressions, and a third phase for identifying the best regular expressions of the candidates that match the attribute values.
Type: Grant
Filed: June 27, 2007
Date of Patent: May 18, 2010
Assignee: Microsoft Corporation
Inventors: Zhimin Chen, Venkatesh Ganti, Gunjan Jha, Shriraghav Kaushik, Vivek Narasayya
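One way to picture a key strength is sketched below in Python. The formula is an assumption made for illustration only: strength is taken to be the percentage of rows whose values on the chosen columns are not duplicated by any other row, computed exactly rather than estimated. The abstract does not give the actual formula, and the function names and the inclusive threshold are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations


def key_strength(rows, columns):
    """Percentage of rows whose values on `columns` occur exactly once (assumed definition)."""
    if not rows:
        return 100.0
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    unique_rows = sum(1 for n in counts.values() if n == 1)
    return 100.0 * unique_rows / len(rows)


def candidate_keys(rows, all_columns, threshold, max_size=2):
    """List column sets of up to `max_size` columns whose strength reaches `threshold`."""
    result = []
    for size in range(1, max_size + 1):
        for cols in combinations(all_columns, size):
            strength = key_strength(rows, cols)
            # Inclusive comparison is a choice made here so exact keys pass a 100% threshold.
            if strength >= threshold:
                result.append((cols, strength))
    return result
```

Under this definition a column with all distinct values scores 100% (an exact key), and a column where half the rows share a value with another row scores 50% (an approximate key).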
-
Patent number: 7610283
Abstract: Input set indexing for set-similarity lookups. The architecture provides input to an indexing process that enables more efficient lookups for large data sets (e.g., disk-based) without requiring a full scan of the input. A new index structure is provided, the output of which is exact, rather than approximate. The similarity of two sets is specified using a similarity function that maps two sets to a numeric value that represents similarity of the two sets. Threshold-based lookups are addressed where two sets are considered similar if the numeric similarity score is above a threshold. The structure efficiently identifies all input sets within a distance k (e.g., a Hamming distance) of the query set. Additional information in the form of frequency of elements (the number of input sets in which an element occurs) is used to improve index performance.
Type: Grant
Filed: June 12, 2007
Date of Patent: October 27, 2009
Assignee: Microsoft Corporation
Inventors: Arvind Arasu, Venkatesh Ganti, Shriraghav Kaushik
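A threshold-based set-similarity lookup can be illustrated with a plain inverted index, sketched below in Python. This is a textbook baseline, not the index structure the abstract describes: the `SetIndex` class and the choice of Jaccard similarity are assumptions for illustration, and the lookup is exact only for thresholds greater than zero, since sets sharing no element with the query are never examined.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class SetIndex:
    """Inverted index from element to the ids of the input sets containing it."""

    def __init__(self, sets):
        self.sets = [frozenset(s) for s in sets]
        self.postings = defaultdict(set)
        for set_id, members in enumerate(self.sets):
            for element in members:
                self.postings[element].add(set_id)

    def lookup(self, query, threshold):
        """Return ids of input sets whose Jaccard similarity to `query` is >= threshold."""
        query = frozenset(query)
        # Only sets sharing at least one element can have positive similarity.
        candidate_ids = set()
        for element in query:
            candidate_ids |= self.postings.get(element, set())
        return sorted(i for i in candidate_ids
                      if jaccard(self.sets[i], query) >= threshold)
```

The index avoids a full scan by verifying only the sets that appear in some posting list of the query's elements; very frequent elements make those lists long, which is the kind of cost that element-frequency information could help reduce.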
-
Publication number: 20090210418
Abstract: A transformation-based record matching technique. The technique provides a flexible way to account for synonyms and more general forms of string equivalences when performing record matching by taking as explicit input user-defined transformation rules (such as, for example, the fact that “Robert” and “Bob” are synonymous). The input string and user-defined transformation rules are used to generate a larger set of strings which are used when performing record matching. Both the input string and data elements in a database can be transformed using the user-defined transformation rules in order to generate a larger set of potential record matches. These potential record matches can then be subjected to a threshold test in order to determine one or more best matches. Additionally, signature-based similarity functions are used to improve the computational efficiency of the technique.
Type: Application
Filed: February 15, 2008
Publication date: August 20, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Arvind Arasu, Surajit Chaudhuri, Shriraghav Kaushik
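The idea of expanding strings with user-defined transformation rules before a threshold test can be sketched in Python as follows. The rule table, the token-level Jaccard similarity, and the function names are all assumptions chosen for illustration; the sketch enumerates every variant and omits the signature-based similarity functions the abstract mentions for efficiency.

```python
from itertools import product

# Hypothetical user-defined rules: each token maps to its synonyms.
RULES = {"bob": {"robert"}, "robert": {"bob"}}


def variants(text, rules):
    """All strings obtained by optionally replacing each token with a synonym."""
    tokens = text.lower().split()
    options = [sorted({token} | rules.get(token, set())) for token in tokens]
    return {" ".join(choice) for choice in product(*options)}


def token_jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def best_similarity(left, right, rules):
    """Highest similarity over all pairs of transformed variants."""
    return max(token_jaccard(a, b)
               for a in variants(left, rules)
               for b in variants(right, rules))


def is_match(left, right, rules, threshold=0.8):
    """Threshold test on the best variant pair."""
    return best_similarity(left, right, rules) >= threshold
```

Here "Bob Smith" and "Robert Smith" match because one variant of the first string equals the second, while "Bob Smith" and "Alice Smith" share only one of three distinct tokens and fall below the threshold.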
-
Publication number: 20090083238
Abstract: Stop-and-restart query execution that partially leverages the work already performed during the initial execution of the query to reduce the execution time during a restart. The technique selectively saves information from a previous execution of the query so that the overhead associated with restarting the query execution can be bounded. Despite saving only limited information, the disclosed technique substantially reduces the running time of the restarted query. The stop-and-restart query execution technique is constrained to save and reuse only a bounded number of records (intermediate records or output records) thereby releasing all other resources, rather than some of the resources. The technique chooses a subset of the records to save that were found during normal execution and then skips the corresponding records when performing a scan during restart to prevent the duplication of execution. A skip-scan operator is employed to facilitate the disclosed restart technique.
Type: Application
Filed: September 21, 2007
Publication date: March 26, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Surajit Chaudhuri, Shriraghav Kaushik, Abhijit Pol, Ravishankar Ramamurthy
-
Publication number: 20090006392
Abstract: Architecture that provides a data profile computation technique which employs key profile computation and data pattern profile computation. Key profile computation in a data table includes both exact keys as well as approximate keys, and is based on key strengths. A key strength of 100% is an exact key, and any other percentage is an approximate key. The key strength is estimated based on the number of table rows that have duplicated attribute values. Only column sets that exceed a threshold value are returned. Pattern profiling identifies a small set of regular expression patterns which best describe the patterns within a given set of attribute values. Pattern profiling includes three phases: a first phase for determining token regular expressions, a second phase for determining candidate regular expressions, and a third phase for identifying the best regular expressions of the candidates that match the attribute values.
Type: Application
Filed: June 27, 2007
Publication date: January 1, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Zhimin Chen, Venkatesh Ganti, Gunjan Jha, Shriraghav Kaushik, Vivek Narasayya