Patents by Inventor Frank McSherry
Frank McSherry has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8209664
Abstract: General-purpose distributed data-parallel computing using high-level computing languages is described. Data parallel portions of a sequential program that is written by a developer in a high-level language are automatically translated into a distributed execution plan. A set of extensions to a sequential high-level computing language are provided to support distributed parallel computations and to facilitate generation and optimization of distributed execution plans. The extensions are fully integrated with the programming language, thereby enabling developers to write sequential language programs using known constructs while providing the ability to invoke the extensions to enable better generation and optimization of the execution plan for a distributed computing environment.
Type: Grant
Filed: March 18, 2009
Date of Patent: June 26, 2012
Assignee: Microsoft Corporation
Inventors: Yuan Yu, Ulfar Erlingsson, Michael A Isard, Frank McSherry
-
Publication number: 20100241827
Abstract: General-purpose distributed data-parallel computing using high-level computing languages is described. Data parallel portions of a sequential program that is written by a developer in a high-level language are automatically translated into a distributed execution plan. A set of extensions to a sequential high-level computing language are provided to support distributed parallel computations and to facilitate generation and optimization of distributed execution plans. The extensions are fully integrated with the programming language, thereby enabling developers to write sequential language programs using known constructs while providing the ability to invoke the extensions to enable better generation and optimization of the execution plan for a distributed computing environment.
Type: Application
Filed: March 18, 2009
Publication date: September 23, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Yuan Yu, Ulfar Erlingsson, Michael A. Isard, Frank McSherry
-
Publication number: 20090234734
Abstract: Advertising slots on a search engine results page may be determined based on keywords and/or results to a user query. Advertisers may use the keywords and/or the results to the query to place their ads into the advertising slots. Rules may be applied to determine how ads are displayed or not displayed. For example, a larger set of keywords may be inferred from the initial set of keywords on which the ad provider placed her bids. This greatly increases the potential reach of an advertiser's ad campaign or a search engine provider's revenue from ad placement.
Type: Application
Filed: March 17, 2008
Publication date: September 17, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Sreenivas Gollapudi, Frank McSherry, Rina Panigrahy, Kunal Talwar
-
Publication number: 20090182797
Abstract: Techniques for contingency table release provide an accurate and consistent set of tables while guaranteeing that privacy is preserved. A positive and integral database is constructed that corresponds to these tables. Therefore, a database can be generated that preserves low-order marginals up to a small error. Moreover, a gracefully degrading version of the results is provided as a database can be computed such that the error in the low-order marginals is small, and increases smoothly with the order of the marginal.
Type: Application
Filed: January 10, 2008
Publication date: July 16, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Cynthia Dwork, Frank McSherry, Kunal Talwar, Boaz Barak, Kamalika Chaudhuri, Satyen Kale
-
Publication number: 20070234005
Abstract: Hash tables comprising load factors of up to and above 97% are disclosed. The hash tables may be associated with three or more hash functions, each hash function being applied to a key to identify a location in a hash table. The load factor of a hash table may be increased, obviating any need to increase the size of the hash table to accommodate more insertions. Such increase in load factor may be accomplished by a combination of increasing the number of cells per bucket in a hash table and increasing the number of hash functions associated with the hash table.
Type: Application
Filed: March 29, 2006
Publication date: October 4, 2007
Applicant: Microsoft Corporation
Inventors: Ulfar Erlingsson, Mark Manasse, Frank McSherry, Abraham Flaxman
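Illustrative sketch (not taken from the application): the idea of giving each key several candidate buckets, each holding several cells, can be tried out with a small multi-choice table. The class name, the salted BLAKE2 hashing, the random-eviction policy, and the default parameters below are all assumptions made for this example; the application's actual insertion procedure may differ.

```python
import hashlib
import random


class MultiChoiceHashTable:
    """Each key may live in any of `num_hashes` candidate buckets of `cells_per_bucket` cells."""

    def __init__(self, num_buckets, cells_per_bucket=4, num_hashes=3, max_kicks=500, seed=0):
        self.buckets = [[] for _ in range(num_buckets)]
        self.cells_per_bucket = cells_per_bucket
        self.num_hashes = num_hashes
        self.max_kicks = max_kicks
        self.rng = random.Random(seed)

    def _candidates(self, key):
        # One salted hash per hash function (hypothetical hashing scheme).
        out = []
        for salt in range(self.num_hashes):
            digest = hashlib.blake2b(repr((salt, key)).encode(), digest_size=8).digest()
            out.append(int.from_bytes(digest, "big") % len(self.buckets))
        return out

    def lookup(self, key):
        return any(key in self.buckets[i] for i in self._candidates(key))

    def insert(self, key):
        if self.lookup(key):
            return True
        for _ in range(self.max_kicks):
            candidates = self._candidates(key)
            for i in candidates:
                if len(self.buckets[i]) < self.cells_per_bucket:
                    self.buckets[i].append(key)
                    return True
            # Every candidate bucket is full: evict a random occupant and re-place it.
            i = self.rng.choice(candidates)
            j = self.rng.randrange(self.cells_per_bucket)
            key, self.buckets[i][j] = self.buckets[i][j], key
        return False  # gave up; the key still in hand is not stored

    def load_factor(self):
        used = sum(len(b) for b in self.buckets)
        return used / (len(self.buckets) * self.cells_per_bucket)
```

Raising `cells_per_bucket` or `num_hashes` gives each key more places to land, which is the lever the abstract describes for pushing the load factor higher without growing the table.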
-
Publication number: 20070174314
Abstract: While consulting indexes to conduct a search, a determination is made from time to time as to whether it is more efficient to consult individual indexes in a set or to merge the indexes and consult the merged index. The cost of merging indexes is compared with the cost of individually querying indexes. In accordance with the result of this comparison, the indexes are merged and the merged index is consulted, or the indexes are individually consulted. A cost-balance invariant in the form of an inequality is used to equate the cost of merging indexes to a weighted cost of individually querying indexes. As query events are received, the costs are updated. As long as the cost-balance invariant is not violated, indexes are merged and the merged index is queried. If the cost-balance invariant is violated, indexes are not merged, and the indexes are individually queried.
Type: Application
Filed: January 6, 2006
Publication date: July 26, 2007
Applicant: Microsoft Corporation
Inventors: Frank McSherry, John MacCormick
-
Publication number: 20070150437
Abstract: Systems and methods are provided for obscuring an amount of a resource used to process an item. In general, contemplated techniques comprise assigning a maximum allowable amount of the resource for processing a sub-part of the item. If the maximum allowable amount of the resource is reached, processing the sub-part may be terminated. Once all sub-parts are processed, a noisy quantity of the resource that was consumed in processing the item may be released. The noisy quantity is determined by adding a positive amount of the resource, combined with a noise value, to an actual quantity of the resource that was consumed.
Type: Application
Filed: December 22, 2005
Publication date: June 28, 2007
Applicant: Microsoft Corporation
Inventors: Cynthia Dwork, Frank McSherry, Ilya Mironov
-
Publication number: 20070147606
Abstract: Systems and methods are provided for selectively determining privacy guarantees. For example, a first class of data may be guaranteed a first level of privacy, while other data classes are only guaranteed some lesser level of privacy. An amount of privacy is guaranteed by adding noise values to database query outputs. Noise distributions can be tailored to be appropriate for the particular data in a given database by calculating a “diameter” of the data. When the distribution is based on the diameter of a first class of data, and the diameter measurement does not account for additional data in the database, the result is that query outputs leak information about the additional data.
Type: Application
Filed: December 22, 2005
Publication date: June 28, 2007
Applicant: Microsoft Corporation
Inventors: Cynthia Dwork, Frank McSherry
-
Publication number: 20070143437
Abstract: An improved entity naming scheme employs the use of two sets of names: local names and global names. The local and global naming scheme may be applied to entities that are assigned to a number of different global compartments. Local entities are entities that are assigned to the same compartment, while non-local entities are entities that are assigned to different compartments. Each entity is assigned a local name that is unique among all local entities. Additionally, a number of global entities are identified. Global entities are entities that are referenced by one or more non-local entities. Each global entity is assigned a global name that is unique among all global entities.
Type: Application
Filed: December 16, 2005
Publication date: June 21, 2007
Applicant: Microsoft Corporation
Inventors: Frank McSherry, Ulfar Erlingsson
-
Publication number: 20070143289
Abstract: Systems and methods are provided for controlling privacy loss associated with database participation. In general, privacy loss can be evaluated based on information available to a hypothetical adversary with access to a database under two scenarios: a first scenario in which the database does not contain data about a particular privacy principal, and a second scenario in which the database does contain data about the privacy principal. Such evaluation can be made for example by a mechanism for determining sensitivity of at least one database query output to addition to the database of data associated with a privacy principal. An appropriate noise distribution can be calculated based on the sensitivity measurement and optionally a privacy parameter. A noise value is selected from the distribution and added to query outputs.
Type: Application
Filed: December 16, 2005
Publication date: June 21, 2007
Applicant: Microsoft Corporation
Inventors: Cynthia Dwork, Frank McSherry
-
Publication number: 20070136027
Abstract: A histogram can be generated and displayed with noisy category values, where the noise values are selected from a noise distribution that is calculated using a histogram diameter. The noise values are combined with histogram category values, thereby producing noisy histogram category values that do not reveal information about the contributors.
Type: Application
Filed: December 9, 2005
Publication date: June 14, 2007
Applicant: Microsoft Corporation
Inventors: Cynthia Dwork, Frank McSherry
-
Publication number: 20070130147
Abstract: An amount of noise to add to a query output may be selected to preserve privacy of inputs while maximizing utility of the released output. Noise values can be distributed according to a substantially symmetric exponential density function (“exponential distribution”). That is, the most likely noise value can be zero, and noise values of increasing absolute value can decrease in probability according to the exponential function.
Type: Application
Filed: December 2, 2005
Publication date: June 7, 2007
Applicant: Microsoft Corporation
Inventors: Cynthia Dwork, Frank McSherry
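Illustrative sketch (not taken from the application): a symmetric exponential density centered at zero is commonly called the Laplace distribution, and one standard way to sample it is as the difference of two independent exponential draws. The function names and the `scale` parameter below are assumptions for this example; the abstract does not say how the scale should be chosen.

```python
import random


def symmetric_exponential_noise(scale, rng):
    """Draw noise with density proportional to exp(-|x| / scale).

    The difference of two independent exponential variables with mean
    `scale` has exactly this symmetric density, peaked at zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_output(true_output, scale, rng):
    """Release a query output with symmetric exponential noise added."""
    return true_output + symmetric_exponential_noise(scale, rng)


rng = random.Random(0)  # fixed seed only so the example is repeatable
print(noisy_output(100.0, scale=2.0, rng=rng))
```

A smaller `scale` concentrates the noise near zero (more useful output, less masking); a larger one spreads it out.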
-
Publication number: 20070124268
Abstract: Privacy of data can be preserved while utility of the output is maximized by selecting from an appropriately calculated distribution of noise values to add to an output. A distribution that includes a high likelihood of large noise values may lead to less useful output data. Conversely, a distribution that includes very low likelihood of large noise values may lead to less privacy. A distribution should be calculated to provide an appropriate level of output utility and privacy based on the query that is performed and the desired privacy level.
Type: Application
Filed: November 30, 2005
Publication date: May 31, 2007
Applicant: Microsoft Corporation
Inventors: Cynthia Dwork, Frank McSherry
-
Publication number: 20070083493
Abstract: Techniques are provided for injecting noise into secure function evaluation to protect the privacy of the participants. A system and method are illustrated that can compute a collective noisy result by combining results and noise generated based on input from the participants. When implemented using distributed computing devices, each device may have access to a subset of data. A query may be distributed to the devices, and each device applies the query to its own subset of data to obtain a subset result. Each device then divides its subset result into one or more shares, and the shares are combined to form a collective result. The devices may also generate random bits. The random bits may be combined and used to generate noise. The collective result can be combined with the noise to obtain a collective noisy result.
Type: Application
Filed: October 6, 2005
Publication date: April 12, 2007
Applicant: Microsoft Corporation
Inventors: Cynthia Dwork, Frank McSherry
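Illustrative sketch (not taken from the application): the share-and-combine flow can be mimicked with additive secret sharing over a fixed modulus, with each device's random bits combined by XOR. The modulus, the XOR combination, the bit-counting noise, and all function names are assumptions for this example, and the sketch runs in one process rather than across real devices.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus, large enough that true totals never wrap


def split_into_shares(value, num_shares):
    """Split `value` into random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_shares - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def collective_result(subset_results):
    """Combine per-device results without any device revealing its own."""
    n = len(subset_results)
    # Row i holds the shares made by device i; column j is what device j receives.
    share_matrix = [split_into_shares(r, n) for r in subset_results]
    partial_sums = [sum(row[j] for row in share_matrix) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


def combined_noise(bit_strings, num_bits):
    """XOR the devices' random bit strings, then map them to a centered integer."""
    combined = 0
    for bits in bit_strings:
        combined ^= bits
    return bin(combined).count("1") - num_bits // 2


subset_results = [12, 7, 30]  # hypothetical per-device query results
device_bits = [secrets.randbits(64) for _ in subset_results]
noisy = collective_result(subset_results) + combined_noise(device_bits, 64)
print(noisy)
```

Because the bits are XORed, the combined string is uniformly random as long as at least one device's contribution is.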
-
Publication number: 20070061315
Abstract: An input or query is determined for which a search engine's static ranking computation is the answer. By understanding how this input or query differs from the posed input or query, the precise termination point of an iterative convergence problem can be determined. An iterative process provides the following inputs to the system: a graph of hyperlinks, and a vector of how the probability mass is redistributed. Given the set of ranks (the output results), it is determined how the input (e.g., the query) would have to be changed to get the rank(s) as the answer or result. Backward answer analysis is provided in the web page context. The difference between what was asked and what should have been asked is determined. After the difference is computed, it is determined if the iterative process should be stopped or not.
Type: Application
Filed: September 15, 2005
Publication date: March 15, 2007
Applicant: Microsoft Corporation
Inventor: Frank McSherry
-
Publication number: 20060200431
Abstract: A database has a plurality of entries and a plurality of attributes common to each entry, where each entry corresponds to an individual. A query is received from a querying entity and is passed to the database, and an answer is received in response. An amount of noise is generated and added to the answer to result in an obscured answer, and the obscured answer is returned to the querying entity. The noise is normally distributed around zero with a particular variance. The variance R may be determined in accordance with R > 8T log²(T/δ)/ε², where T is the permitted number of queries, δ is the utter failure probability, and ε is the largest admissible increase in confidence. Thus, a level of protection of privacy is provided to each individual represented within the database. Example noise generation techniques, systems, and methods may be used for privacy preservation in such areas as k-means, principal component analysis, statistical query learning models, and perceptron algorithms.
Type: Application
Filed: March 1, 2005
Publication date: September 7, 2006
Applicant: Microsoft Corporation
Inventors: Cynthia Dwork, Frank McSherry, Yaacov Nissim Kobliner, Avrim Blum
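Illustrative sketch (not taken from the application): the variance bound can be evaluated numerically and used to draw normally distributed noise. This example assumes the bound has the form 8 T log²(T/delta)/epsilon², with `delta` standing for the utter failure probability and `epsilon` for the largest admissible increase in confidence, and it assumes the natural logarithm; the function names are made up here.

```python
import math
import random


def min_noise_variance(num_queries, delta, epsilon):
    """Lower bound on the variance R for T = num_queries permitted queries."""
    return 8.0 * num_queries * math.log(num_queries / delta) ** 2 / epsilon ** 2


def obscured_answer(true_answer, variance, rng):
    """Add zero-mean normal noise with the given variance to a query answer."""
    return true_answer + rng.gauss(0.0, math.sqrt(variance))


rng = random.Random(0)  # fixed seed only so the example is repeatable
bound = min_noise_variance(num_queries=100, delta=1e-6, epsilon=0.5)
# Any variance strictly above the bound satisfies the inequality; 1% slack is arbitrary.
print(obscured_answer(250.0, variance=1.01 * bound, rng=rng))
```

The bound grows roughly linearly in the number of permitted queries, so allowing more queries means each answer must carry more noise.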
-
Publication number: 20060026191
Abstract: Methods and systems are described for computing page rankings more efficiently. Using an interconnectivity matrix describing the interconnection of web pages, a new matrix is computed. The new matrix is used to compute the average of values associated with each web page's neighboring web pages. The secondary eigenvector of this new matrix is computed, and indices for web pages are relabeled according to the eigenvector. The data structure storing the interconnectivity information is preferably also physically sorted according to the eigenvector. By reorganizing the matrix used in the web page ranking computations, caching is performed more efficiently, resulting in faster page ranking techniques. Methods for efficiently allocating the distribution of resources are also described.
Type: Application
Filed: July 30, 2004
Publication date: February 2, 2006
Applicant: Microsoft Corporation
Inventor: Frank McSherry
-
Publication number: 20060015588
Abstract: The present invention provides a unique system and method that facilitates reducing network traffic between a plurality of servers located on a social-based network. The system and method involve identifying a plurality of vertices or service users on the network with respect to their server or network locations. The vertices' contacts or connections can be located or determined as well. In order to minimize communication traffic, the vertices and their connections with respect to their respective server locations can be analyzed to determine whether at least a subset of nodes should be moved or relocated to another server to facilitate mitigating network traffic while balancing user load among the various servers or parts of the network. Thus, an underlying social network can be effectively partitioned. In addition, the network can be parsed into a collection of nested layers, whereby each successively less dense layer can be partitioned with respect to the previous (partitioned) more dense layer.
Type: Application
Filed: June 30, 2004
Publication date: January 19, 2006
Applicant: Microsoft Corporation
Inventors: Dimitris Achlioptas, Frank McSherry
-
Publication number: 20060004811
Abstract: Methods and systems are provided for efficiently computing page rankings of web pages or other interconnected objects. The rankings are produced by efficiently computing a principal eigenvector of a page ranking transition matrix. The methods and systems provided herein can be used to produce page rankings in a distributed and/or incremental manner, and can be used to allocate computing resources to processing page rankings for those pages that most demand them.
Type: Application
Filed: July 1, 2004
Publication date: January 5, 2006
Applicant: Microsoft Corporation
Inventor: Frank McSherry
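Illustrative sketch (not taken from the application): a textbook way to approximate the principal eigenvector of a page ranking transition matrix is plain power iteration. This is a baseline for context only, not the distributed or incremental method the application claims; the damping factor, the uniform handling of pages without out-links, and the function name are assumptions for this example.

```python
def page_ranks(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration on a link graph given as {page: [pages it links to]}.

    Every link target is assumed to also appear as a key of `links`.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by pages with no out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for target in links[v]:
                new_rank[target] += damping * rank[v] / len(links[v])
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical three-page graph: both x and y link to hub.
ranks = page_ranks({"hub": [], "x": ["hub"], "y": ["hub"]})
print(max(ranks, key=ranks.get))
```

Each pass redistributes rank along the links, so the vector converges toward the stationary distribution of the damped random walk.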
-
Publication number: 20050149502
Abstract: Methods and systems are provided for efficiently computing personalized rankings of web pages or other interconnected objects. The personalized rankings are produced by efficiently computing an approximation matrix to an ideal personalized page ranking matrix. The methods and systems provided herein can be used to produce search results with particular relevance to an individual searcher.
Type: Application
Filed: January 5, 2004
Publication date: July 7, 2005
Applicant: Microsoft Corporation
Inventor: Frank McSherry