Patents by Inventor Frank D. McSherry

Frank D. McSherry has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10171284
    Abstract: A computer-readable storage medium stores computer-executable instructions that, when executed by a processor, perform operations including scheduling first and second threads to operate independently on first and second partitions of data. The operations include beginning a first operation on the first and second partitions by the first and second threads, respectively. The operations include tracking progress of the first operation by the first and second threads using a replicated data structure. The operations include, for a record on which the first operation will be performed, adding an entry to the replicated data structure with a timestamp indicating an epoch and iteration. The operations include determining a number of yet-to-be-processed records for a selected entry of the replicated data structure. The selected entry has the most recent timestamp for the first thread. The operations include terminating the first thread when the number of yet-to-be-processed records for the selected entry is zero.
    Type: Grant
    Filed: November 24, 2017
    Date of Patent: January 1, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Frank D. McSherry, Rebecca Isaacs, Michael A. Isard, Derek G. Murray
  • Publication number: 20180097684
    Abstract: A computer-readable storage medium stores computer-executable instructions that, when executed by a processor, perform operations including scheduling first and second threads to operate independently on first and second partitions of data. The operations include beginning a first operation on the first and second partitions by the first and second threads, respectively. The operations include tracking progress of the first operation by the first and second threads using a replicated data structure. The operations include, for a record on which the first operation will be performed, adding an entry to the replicated data structure with a timestamp indicating an epoch and iteration. The operations include determining a number of yet-to-be-processed records for a selected entry of the replicated data structure. The selected entry has the most recent timestamp for the first thread. The operations include terminating the first thread when the number of yet-to-be-processed records for the selected entry is zero.
    Type: Application
    Filed: November 24, 2017
    Publication date: April 5, 2018
    Inventors: Frank D. McSherry, Rebecca Isaacs, Michael A. Isard, Derek G. Murray
  • Patent number: 9832068
    Abstract: Various embodiments provide techniques for working with large-scale collections of data pertaining to real world systems, such as a social network, a roadmap/GPS system, etc. The techniques perform incremental, iterative, and interactive parallel computation using a coordination clock protocol, which applies to scheduling computations and managing resources such as memory and network resources, etc., in cyclic graphs including those resulting from a differential dataflow model that performs computations on differences in the collections of data.
    Type: Grant
    Filed: December 17, 2012
    Date of Patent: November 28, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Frank D. McSherry, Rebecca Isaacs, Michael A. Isard, Derek G. Murray
  • Patent number: 9165035
    Abstract: The techniques discussed herein efficiently perform data-parallel computations on collections of data by implementing a differential dataflow model that performs computations on differences in the collections of data. The techniques discussed herein describe defined operators for use in a data-parallel program that performs the computations on the determined differences between the collections of data by creating a lattice and indexing the differences in the collection of data according to the lattice.
    Type: Grant
    Filed: May 10, 2012
    Date of Patent: October 20, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Frank D. McSherry, Rebecca Isaacs, Michael A. Isard, Derek G. Murray
  • Publication number: 20140172939
    Abstract: Various embodiments provide techniques for working with large-scale collections of data pertaining to real world systems, such as a social network, a roadmap/GPS system, etc. The techniques perform incremental, iterative, and interactive parallel computation using a coordination clock protocol, which applies to scheduling computations and managing resources such as memory and network resources, etc., in cyclic graphs including those resulting from a differential dataflow model that performs computations on differences in the collections of data.
    Type: Application
    Filed: December 17, 2012
    Publication date: June 19, 2014
    Applicant: Microsoft Corporation
    Inventors: Frank D. McSherry, Rebecca Isaacs, Michael A. Isard, Derek G. Murray
  • Patent number: 8639649
    Abstract: Given that a differentially private mechanism has a known conditional distribution, probabilistic inference techniques may be used along with the known conditional distribution, and generated results from previously computed queries on private data, to generate a posterior distribution for the differentially private mechanism used by the system. The generated posterior distribution may be used to describe the probability of every possible result being the correct result. The probability may then be used to qualify conclusions or calculations that may depend on the returned result.
    Type: Grant
    Filed: March 23, 2010
    Date of Patent: January 28, 2014
    Assignee: Microsoft Corporation
    Inventors: Frank D. McSherry, Oliver M. C. Williams
  • Patent number: 8619984
    Abstract: User rating data may be received at a correlation engine through a network. The user rating data may include ratings generated by a plurality of users for a plurality of items. Correlation data may be generated from the received user rating data by the correlation engine. The correlation data may identify correlations between the items based on the user generated ratings. Noise may be generated by the correlation engine, and the generated noise may be added to the generated correlation data by the correlation engine to provide differential privacy protection to the user rating data.
    Type: Grant
    Filed: September 11, 2009
    Date of Patent: December 31, 2013
    Assignee: Microsoft Corporation
    Inventors: Frank D. McSherry, Ilya Mironov
  • Publication number: 20130304744
    Abstract: The techniques discussed herein efficiently perform data-parallel computations on collections of data by implementing a differential dataflow model that performs computations on differences in the collections of data. The techniques discussed herein describe defined operators for use in a data-parallel program that performs the computations on the determined differences between the collections of data by creating a lattice and indexing the differences in the collection of data according to the lattice.
    Type: Application
    Filed: May 10, 2012
    Publication date: November 14, 2013
    Applicant: Microsoft Corporation
    Inventors: Frank D. McSherry, Rebecca Isaacs, Michael A. Isard, Derek G. Murray
  • Patent number: 8145682
    Abstract: A query log includes a list of queries and a count for each query representing the number of times that the query was received by a search engine. In order to provide differential privacy protection to the queries, noise is generated and added to each count, and queries that have counts that fall below a threshold are removed from the query log. A distribution associated with a function used to generate the noise is referenced to determine a distribution of a number of times that a hypothetical query having a zero count would have its count exceed the threshold after the addition of noise. Random queries of an amount equal to a sample from the distribution of number of times are added to the query log with a count that is greater than the threshold count.
    Type: Grant
    Filed: February 25, 2010
    Date of Patent: March 27, 2012
    Assignee: Microsoft Corporation
    Inventors: Frank D. McSherry, Kunal Talwar
  • Publication number: 20110238611
    Abstract: Given that a differentially private mechanism has a known conditional distribution, probabilistic inference techniques may be used along with the known conditional distribution, and generated results from previously computed queries on private data, to generate a posterior distribution for the differentially private mechanism used by the system. The generated posterior distribution may be used to describe the probability of every possible result being the correct result. The probability may then be used to qualify conclusions or calculations that may depend on the returned result.
    Type: Application
    Filed: March 23, 2010
    Publication date: September 29, 2011
    Applicant: Microsoft Corporation
    Inventors: Frank D. McSherry, Oliver M. C. Williams
  • Publication number: 20110208763
    Abstract: A query log includes a list of queries and a count for each query representing the number of times that the query was received by a search engine. In order to provide differential privacy protection to the queries, noise is generated and added to each count, and queries that have counts that fall below a threshold are removed from the query log. A distribution associated with a function used to generate the noise is referenced to determine a distribution of a number of times that a hypothetical query having a zero count would have its count exceed the threshold after the addition of noise. Random queries of an amount equal to a sample from the distribution of number of times are added to the query log with a count that is greater than the threshold count.
    Type: Application
    Filed: February 25, 2010
    Publication date: August 25, 2011
    Applicant: Microsoft Corporation
    Inventors: Frank D. McSherry, Kunal Talwar
  • Patent number: 8005821
    Abstract: Systems and methods for injecting noise into secure function evaluation to protect the privacy of the participants and for computing a collective noisy result by combining results and noise generated based on input from the participants. When implemented using distributed computing devices, each device may have access to a subset of data. A query may be distributed to the devices, and each device applies the query to its own subset of data to obtain a subset result. Each device then divides its subset result into one or more shares, and the shares are combined to form a collective result. The devices may also generate random bits. The random bits may be combined and used to generate noise. The collective result can be combined with the noise to obtain a collective noisy result.
    Type: Grant
    Filed: October 6, 2005
    Date of Patent: August 23, 2011
    Assignee: Microsoft Corporation
    Inventors: Cynthia Dwork, Frank D. McSherry
  • Publication number: 20110064221
    Abstract: User rating data may be received at a correlation engine through a network. The user rating data may include ratings generated by a plurality of users for a plurality of items. Correlation data may be generated from the received user rating data by the correlation engine. The correlation data may identify correlations between the items based on the user generated ratings. Noise may be generated by the correlation engine, and the generated noise may be added to the generated correlation data by the correlation engine to provide differential privacy protection to the user rating data.
    Type: Application
    Filed: September 11, 2009
    Publication date: March 17, 2011
    Applicant: Microsoft Corporation
    Inventors: Frank D. McSherry, Ilya Mironov
  • Patent number: 7818335
    Abstract: Systems and methods are provided for selectively determining privacy guarantees. For example, a first class of data may be guaranteed a first level of privacy, while other data classes are only guaranteed some lesser level of privacy. An amount of privacy is guaranteed by adding noise values to database query outputs. Noise distributions can be tailored to be appropriate for the particular data in a given database by calculating a “diameter” of the data. When the distribution is based on the diameter of a first class of data, and the diameter measurement does not account for additional data in the database, the result is that query outputs leak information about the additional data.
    Type: Grant
    Filed: December 22, 2005
    Date of Patent: October 19, 2010
    Assignee: Microsoft Corporation
    Inventors: Cynthia Dwork, Frank D. McSherry
  • Patent number: 7769707
    Abstract: Privacy of data can be preserved while utility of the output is maximized by selecting from an appropriately calculated distribution of noise values to add to an output. A distribution that includes a high likelihood of large noise values may lead to less useful output data. Conversely, a distribution that includes very low likelihood of large noise values may lead to less privacy. A distribution should be calculated to provide an appropriate level of output utility and privacy based on the query that is performed and the desired privacy level.
    Type: Grant
    Filed: November 30, 2005
    Date of Patent: August 3, 2010
    Assignee: Microsoft Corporation
    Inventors: Cynthia Dwork, Frank D. McSherry
  • Patent number: 7739356
    Abstract: An improved entity naming scheme employs the use of two sets of names: local names and global names. The local and global naming scheme may be applied to entities that are assigned to a number of different global compartments. Local entities are entities that are assigned to the same compartment, while non-local entities are entities that are assigned to different compartments. Each entity is assigned a local name that is unique among all local entities. Additionally, a number of global entities are identified. Global entities are entities that are referenced by one or more non-local entities. Each global entity is assigned a global name that is unique among all global entities.
    Type: Grant
    Filed: December 16, 2005
    Date of Patent: June 15, 2010
    Assignee: Microsoft Corporation
    Inventors: Frank D. McSherry, Ulfar Erlingsson
  • Patent number: 7716144
    Abstract: Techniques are provided that identify near-duplicate items in large collections of items. A list of (value, frequency) pairs is received, and a sample (value, instance) is returned. The value is chosen from the values of the first list, and the instance is a value less than frequency, in such a way that the probability of selecting the same sample from two lists is equal to the similarity of the two lists.
    Type: Grant
    Filed: March 22, 2007
    Date of Patent: May 11, 2010
    Assignee: Microsoft Corporation
    Inventors: Frank D. McSherry, Kunal Talwar, Mark Steven Manasse
  • Patent number: 7698250
    Abstract: Systems and methods are provided for controlling privacy loss associated with database participation. In general, privacy loss can be evaluated based on information available to a hypothetical adversary with access to a database under two scenarios: a first scenario in which the database does not contain data about a particular privacy principal, and a second scenario in which the database does contain data about the privacy principal. Such evaluation can be made for example by a mechanism for determining sensitivity of at least one database query output to addition to the database of data associated with a privacy principal. An appropriate noise distribution can be calculated based on the sensitivity measurement and optionally a privacy parameter. A noise value is selected from the distribution and added to query outputs.
    Type: Grant
    Filed: December 16, 2005
    Date of Patent: April 13, 2010
    Assignee: Microsoft Corporation
    Inventors: Cynthia Dwork, Frank D. McSherry
  • Publication number: 20100070511
    Abstract: Documents that are near-duplicates may be determined using techniques involving consistent uniform hashing. A biased bit may be placed in the leading position of a sequence of bits that may be generated and subsequently used in comparison techniques to determine near-duplicate documents. Unbiased bits may be used in subsequent positions of the sequence of bits, after the biased bit, for use in comparison techniques. Samples may be used collectively, as opposed to individually, in the generation of biased bits. Sequences of bits may thus be produced not on a single sample basis, but for multiple samples, thereby amortizing the cost of generating randomness for the samples. Less than one bit of randomness per sample may be used.
    Type: Application
    Filed: September 17, 2008
    Publication date: March 18, 2010
    Applicant: Microsoft Corporation
    Inventors: Mark Steven Manasse, Frank D. McSherry, Kunal Talwar
  • Patent number: 7676513
    Abstract: While consulting indexes to conduct a search, a determination is made from time to time as to whether it is more efficient to consult individual indexes in a set or to merge the indexes and consult the merged index. The cost of merging indexes is compared with the cost of individually querying indexes. In accordance with the result of this comparison, the indexes are merged and the merged index is consulted, or the indexes are individually consulted. A cost-balance invariant in the form of an inequality is used to equate the cost of merging indexes to a weighted cost of individually querying indexes. As query events are received, the costs are updated. As long as the cost-balance invariant is not violated, indexes are merged and the merged index is queried. If the cost-balance invariant is violated, indexes are not merged, and the indexes are individually queried.
    Type: Grant
    Filed: January 6, 2006
    Date of Patent: March 9, 2010
    Assignee: Microsoft Corporation
    Inventors: Frank D. McSherry, John P. MacCormick
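
Several of the abstracts above (for example patent 7698250) describe selecting a noise value from a distribution calibrated to a query's sensitivity and a privacy parameter, then adding it to the query output. The following is a minimal sketch of that general idea, not the patented method itself. It assumes the Laplace distribution with scale sensitivity / epsilon, which is a common choice in the differential privacy literature but is not named in the abstracts. The function names `laplace_noise` and `noisy_count` and the parameter `epsilon` are hypothetical and chosen only for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one zero-mean Laplace sample with the given scale.

    The difference of two independent exponential samples with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(true_count: float, sensitivity: float,
                epsilon: float, rng: random.Random) -> float:
    """Return true_count plus Laplace noise (assumed noise distribution).

    sensitivity: the most that one individual's data can change the output.
    epsilon: hypothetical privacy parameter; smaller values mean more noise.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # A counting query changes by at most 1 when one record is added.
    print(noisy_count(100, sensitivity=1.0, epsilon=0.5, rng=rng))
```

In this sketch a smaller `epsilon` widens the noise distribution, which reflects the trade-off between privacy and output utility that the abstract of patent 7769707 describes.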