BLOOM FILTER PARTITIONING
A partitioned Bloom filter is disclosed. In various embodiments, a representation of an item is received. The representation is used to determine a partition with which the item is associated. A partition-specific Bloom filter is used to determine at least in part whether the item may be an element of a set with which the partition is associated.
This application is a continuation of co-pending U.S. patent application Ser. No. 14/675,476, entitled BLOOM FILTER PARTITIONING, filed Mar. 31, 2015, which is incorporated herein by reference for all purposes.
BACKGROUND OF THE INVENTION
Bloom filters provide a space-efficient way to store data that can be used to test whether an element is a member of a set. A Bloom filter may comprise a bit array of m bits. A number k of hash functions may be used to map a given item to one or more corresponding locations in the array. For example, an element A may be mapped to a filter location by computing the hash of the element A modulo the size of the array. As an element is added to the set, the corresponding bits may be set, e.g., by changing an initial/default value of “0” to “1”.
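The structure described above can be sketched as follows. This is a minimal illustration, not the implementation of any embodiment: the class name BloomFilter, the method names, and the use of salted SHA-256 digests as the k hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a bit array of m bits probed by k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m                            # number of bits in the array
        self.k = k                            # number of hash functions
        self.bits = bytearray((m + 7) // 8)   # every bit starts at 0

    def _positions(self, item: bytes):
        # Assumption: the k hash functions are SHA-256 salted with the
        # function index; each digest is reduced modulo the array size.
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        # Adding an element flips its corresponding bits from 0 to 1.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # True means "possibly in the set"; False means "definitely not".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

An element that has been added always tests positive; an element that has not been added usually tests negative, but may test positive if its bit positions collide with those of other elements.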
When a Bloom filter is used to determine membership in a set, false positives are possible, since for two or more different items the respective hash values modulo the array size may be the same. However, false negatives are not possible, since if the element is already a member of the set the corresponding bit(s) in the filter would be found to have been set.
In some applications, a Bloom filter may be used to determine whether an element is already in a set. If the filter result is positive, a further query, e.g., of a database table, may be performed to determine conclusively whether the element is in the set. If the filter result is negative, the database query does not need to be performed.
Typically, for an array of a given size, the probability of a false positive increases as more elements are added to the set, and it does so at a specific, calculable rate. The false positive rate can be reduced by increasing the size of the array, but typically resizing requires that the entire filter be rebuilt, e.g., by iterating over the elements in the set to populate the newly-resized filter array. For a set having a very large number of elements, the time, computing, and other resources required to rebuild the filter after resizing may be prohibitive.
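As an illustration of the calculable rate, the sketch below uses the standard textbook approximation for a Bloom filter with m bits, k hash functions, and n inserted elements, namely (1 - e^(-kn/m))^k. The source does not state a formula; this one assumes independent, uniformly distributed hash functions, and the function name is hypothetical.

```python
import math


def false_positive_probability(m: int, k: int, n: int) -> float:
    """Approximate false positive probability of a Bloom filter.

    m: number of bits in the array
    k: number of hash functions
    n: number of elements added so far
    Assumes independent, uniformly distributed hash functions.
    """
    return (1.0 - math.exp(-k * n / m)) ** k
```

For a fixed m and k the value grows with n, and for a fixed n and k it shrinks as m grows, which is why enlarging the array lowers the false positive rate.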
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
Partitioning a set or set space into two or more partitions and providing a separate, partition-specific Bloom filter for each partition is disclosed. In various embodiments, set membership may be determined at least in part by using the partition-specific Bloom filters. In some embodiments, the computed false positive probability or other criteria may be used to determine to resize and rebuild a partition-specific Bloom filter. In various embodiments, partition-specific Bloom filters may be resized and/or rebuilt independently of other partition-specific Bloom filters associated with other partitions, enabling such other partition-specific Bloom filters to remain available for use. In various embodiments, techniques disclosed herein may be used in connection with a variety of different types of Bloom filter, including without limitation a counting Bloom filter.
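One way the routing of an item to a partition-specific filter might look is sketched below. The names SmallBloom and PartitionedBloom, the fixed number of partitions, and the choice of a salted SHA-256 digest to select the partition are assumptions for illustration only; the description requires only that the representation be used to determine the partition.

```python
import hashlib


class SmallBloom:
    """Compact Bloom filter; a Python int serves as the bit array."""

    def __init__(self, m: int, k: int = 3) -> None:
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, rep: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + rep).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, rep: bytes) -> None:
        for pos in self._positions(rep):
            self.bits |= 1 << pos

    def might_contain(self, rep: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(rep))


class PartitionedBloom:
    """One independent filter per partition; items are routed by their representation."""

    def __init__(self, num_partitions: int, bits_per_filter: int) -> None:
        self.filters = [SmallBloom(bits_per_filter) for _ in range(num_partitions)]

    def partition_index(self, rep: bytes) -> int:
        # Assumption: a separately salted digest picks the partition, so the
        # partition choice is independent of the bit positions in the filter.
        digest = hashlib.sha256(b"partition:" + rep).digest()
        return int.from_bytes(digest[:8], "big") % len(self.filters)

    def add(self, rep: bytes) -> None:
        self.filters[self.partition_index(rep)].add(rep)

    def might_contain(self, rep: bytes) -> bool:
        return self.filters[self.partition_index(rep)].might_contain(rep)
```

Because each item touches exactly one filter, the remaining filters are unaffected by additions to, or maintenance of, that partition.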
In the example shown, data comprising objects stored in the file system, such as files, is stored in a cloud-based object store 112. In some embodiments, files may be segmented into a plurality of segments or “chunks”, each of which is stored in a corresponding location in the cloud-based object store. File system calls are made to file system metadata server 110, which stores file system metadata in a file system metadata storage 114, e.g., in a database or other data store. File system metadata server 110 may store in file system metadata store 114, for example, a segment or “chunk” map for each file or other object stored and represented in the file system. For example, for each file name (e.g., pathname) the file system metadata server 110 may store in a corresponding segment map a hash or other representation of each segment, and for each a corresponding location in which the segment is (or is to be) stored in cloud-based object store 112. Other file system metadata, such as metadata typically stored by a file system, may be stored by file system metadata server 110 in file system metadata store 114. Examples include, without limitation, a directory, file, or other node/object name; an identification of parent and/or child nodes; a creation time; a user that created and/or owns the object; a time last modified and/or other time; an end-of-file (EOF) or other value indicative of object size; security attributes such as a classification, access control list, etc.; and/or other file system metadata.
While in the example shown in
The client system 102 includes a network communication interface 212 that provides network connectivity, e.g., to a network such as network 108 of
In various embodiments, file system client 208 may be configured to store in a metadata write buffer comprising or otherwise associated with file system client 208 and/or cache 210 one or more file system operations and/or requests affecting file system metadata comprising a portion of the file system metadata with respect to which a file system metadata write lease is held by file system client 208. For example, file system operations affecting metadata may be buffered as received, e.g., as a result of local file system calls by applications such as application 202 of
In various embodiments, file system objects, such as files, may be stored by a client on which a distributed file system client or other agent has been installed. Upon receiving a request to store (or modify) a file system object, in various embodiments the file system client segments the object into one or more segments or “chunks” and computes a reference (e.g., a hash) for each. The references are included in a file system request sent to the file system metadata server, e.g., via a secure connection such as connection 302 of
In various embodiments, file system objects, such as files, may be retrieved by a client on which a distributed file system client or other agent has been installed. Upon receiving a request to access a file system object, in various embodiments the file system client sends a file access request to the file system metadata server, e.g., via a secure connection such as connection 302 of
Finally, the chunk metadata table 906 includes a row for each chunk, identified by chunk id (column 908 in the example shown), and for each chunk metadata including a hash of (all or a prescribed part of) the chunk contents (sometimes referred to herein as a chunk or segment “reference”) (column 910), the size of the chunk (column 912), other metadata, and a reference count (column 914) indicating how many currently live files (or other file system objects) reference the chunk. For example, if a file is created by copying another file, each of them would reference the chunks comprising the file that was copied.
In various embodiments, chunks are stored in an object store, such as object store 112 of
One way to determine whether a given chunk has already been stored would be to query the chunk metadata table to determine whether a chunk having the same hash as the chunk is already among the chunks represented in the chunk metadata table, such as chunk metadata table 906 of
In various embodiments, a Bloom filter may be used to facilitate determining whether a chunk has been stored already. Given the characteristics of a Bloom filter, a “negative” result can be relied upon to conclude that a given chunk has not yet been stored, obviating the need to query the chunk metadata table prior to making that determination.
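The de-duplication check described above might be sketched as follows. The class ChunkStore, its dictionary standing in for the chunk metadata table, the filter parameters, and the table_queries counter are all hypothetical and exist only to show when the table is and is not consulted.

```python
import hashlib

FILTER_BITS = 4096   # illustrative filter size
NUM_HASHES = 3       # illustrative number of hash functions


def bit_positions(ref: bytes):
    for i in range(NUM_HASHES):
        digest = hashlib.sha256(bytes([i]) + ref).digest()
        yield int.from_bytes(digest[:8], "big") % FILTER_BITS


class ChunkStore:
    def __init__(self) -> None:
        self.filter_bits = bytearray(FILTER_BITS // 8)
        self.chunk_table = {}    # stand-in for the chunk metadata table: hash -> reference count
        self.table_queries = 0   # counts how often the table had to be consulted

    def store_chunk(self, data: bytes) -> bool:
        """Return True if the chunk was newly stored, False if it was a duplicate."""
        ref = hashlib.sha256(data).digest()
        positions = list(bit_positions(ref))
        maybe_present = all(
            self.filter_bits[p // 8] & (1 << (p % 8)) for p in positions
        )
        if maybe_present:
            # Positive result: confirm against the table, since it may be a false positive.
            self.table_queries += 1
            if ref in self.chunk_table:
                self.chunk_table[ref] += 1
                return False
        # Negative result (or a false positive ruled out): the chunk is new.
        self.chunk_table[ref] = 1
        for p in positions:
            self.filter_bits[p // 8] |= 1 << (p % 8)
        return True
```

Storing a chunk for the first time typically completes without any table query; only a positive filter result triggers the lookup.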
Partitioning a Bloom filter into two or more partitions, each having a relatively smaller partition-specific filter, and distributing elements of a set among the partitions, is disclosed. In various embodiments, the number of partitions and the initial size of each may be determined statically, at least initially, based on how many elements are and/or are expected to be included in the overall set. In some embodiments, a decision to partition a Bloom filter may be made dynamically, based for example on a computed probability of a false positive (e.g., based on filter size and number of elements in the set/partition) and/or based on a count of how many elements have been removed from the set/partition, e.g., by virtue of files having been modified and/or deleted from the file system.
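A dynamic decision of this kind could be sketched as below. The threshold values, the function names, and the use of the standard approximation (1 - e^(-kn/m))^k for the false positive probability are assumptions, not values taken from the description. The removal count matters because a plain (non-counting) Bloom filter cannot clear bits, so removed elements leave stale bits behind until the filter is rebuilt.

```python
import math


def estimated_false_positive(m: int, k: int, n: int) -> float:
    # Standard approximation; assumes independent, uniform hash functions.
    return (1.0 - math.exp(-k * n / m)) ** k


def needs_rebuild(
    m: int,
    k: int,
    inserted: int,
    removed: int,
    max_false_positive: float = 0.01,    # hypothetical threshold
    max_removed_fraction: float = 0.25,  # hypothetical threshold
) -> bool:
    """Decide whether a partition-specific filter should be resized or rebuilt.

    inserted: elements added to the filter since it was last built
    removed:  elements since removed from the partition (their bits remain set)
    """
    if estimated_false_positive(m, k, inserted) > max_false_positive:
        return True
    return inserted > 0 and removed / inserted > max_removed_fraction
```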
In various embodiments, the partition-specific Bloom filter may initially be set to a size smaller than what may ultimately be required. If the partition-specific Bloom filter becomes too saturated, for example as a result of the number of objects associated with the partition becoming large relative to the filter size, then in various embodiments the partition-specific filter may be resized, as indicated by the dotted lines shown adjacent to each of partition-specific filters 1204, 1206, 1208, and 1210. In various embodiments, while a partition-specific filter is being resized, the file system (or other system) may continue to use the respective Bloom filters associated with other partitions to determine whether chunks mapped to those partitions may have been stored already. In addition, the amount of time the partition-specific Bloom filter may be unavailable as it is resized and rebuilt will be much less than if a single Bloom filter for the entire set had to be resized and rebuilt, resulting in a shorter window of time during which de-duplication decisions would need to be made by querying the chunk metadata table, without the benefit and use of the Bloom filter.
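The independent resizing described above might look like the following single-threaded sketch. The Partition class, its members set (a stand-in for the authoritative records from which a filter is rebuilt, such as rows of a chunk metadata table), and the available flag are hypothetical; a real system would also need concurrency control, which is omitted here.

```python
import hashlib


def bit_positions(rep: bytes, m: int, k: int = 3):
    for i in range(k):
        digest = hashlib.sha256(bytes([i]) + rep).digest()
        yield int.from_bytes(digest[:8], "big") % m


class Partition:
    def __init__(self, m: int) -> None:
        self.m = m
        self.bits = 0            # Python int used as the bit array
        self.available = True    # False while the filter is being rebuilt
        self.members = set()     # stand-in for the authoritative records

    def add(self, rep: bytes) -> None:
        self.members.add(rep)
        for pos in bit_positions(rep, self.m):
            self.bits |= 1 << pos

    def might_contain(self, rep: bytes) -> bool:
        if not self.available:
            # Filter offline: answer "maybe" so the caller queries the table.
            return True
        return all((self.bits >> pos) & 1 for pos in bit_positions(rep, self.m))

    def resize(self, new_m: int) -> None:
        # Rebuild only this partition's filter from its own elements.
        self.available = False
        self.m, self.bits = new_m, 0
        for rep in self.members:
            for pos in bit_positions(rep, new_m):
                self.bits |= 1 << pos
        self.available = True
```

Resizing one Partition object reads and writes only that object's state, so the filters of the other partitions keep answering queries unchanged, and the rebuild iterates over one partition's elements rather than the entire set.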
In various embodiments, partitioned populations and associated partition-specific Bloom filters may enable the presence of an element in a set to be determined using space-efficient data structures, without unacceptably high false positive rates. A growing and/or very large population of elements may be managed, including by resizing and/or further partitioning partition-specific Bloom filters, as needed, independently of one another, minimizing filter unavailability.
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.
Claims
1. A method, comprising:
- receiving, by one or more processors, a representation of an item;
- using, by one or more processors, the representation to determine a particular partition with which the item is associated, wherein the particular partition with which the item is associated is one of a plurality of partitions corresponding to a volume that comprises the item, and each of the plurality of partitions have a corresponding partition-specific Bloom filter;
- determining, by one or more processors, whether the item is an element of a set with which the particular partition is associated;
- dynamically determining to partition or resize the partition-specific Bloom filter based at least in part on a computed probability of the partition-specific Bloom filter rendering a false positive; and
- in response to determining to partition or resize the partition specific Bloom filter, partitioning or resizing the partition-specific Bloom filter independent of one or more other partition-specific Bloom filters corresponding to one or more other partitions of the plurality of partitions.
2. The method of claim 1, wherein the determining of whether the item is an element in the set comprises:
- checking a partition-specific Bloom filter corresponding to the particular partition to determine if the item is an element of the set;
- in response to determining that the partition-specific Bloom filter indicates that the item is not an element of the set, determining that the item is not an element of the set; and
- in response to determining that the partition-specific Bloom filter indicates that the item is an element of the set, querying a table associated with the set for the representation of the item, and determining that the item is an element of the set in the event that the querying of the table associated with the set indicates that the set includes the item.
3. The method of claim 1, wherein the computed probability is based at least in part on one or more of a filter size and a number of elements in the particular partition corresponding to the partition-specific Bloom filter.
4. The method of claim 1, wherein the Bloom filter comprises a counting Bloom filter.
5. The method of claim 1, wherein the particular partition comprises a subset of the set.
6. The method of claim 1, wherein the representation comprises a hash.
7. The method of claim 1, wherein the item comprises a chunk included in a set of one or more chunks into which a file has been segmented.
8. The method of claim 1, wherein the item comprises a chunk of data and representation comprises a hash of the chunk of data.
9. The method of claim 1, further comprising determining a number of partitions to associate with the set.
10. The method of claim 9, further comprising determining for one or more of the plurality of partitions an initial size of a corresponding partition-specific Bloom filter.
11. The method of claim 1, wherein the particular partition comprises a first partition; and further comprising resizing the partition-specific Bloom filter associated with the first partition without affecting operation of one or more other partition-specific Bloom filters associated the one or more other partitions of the plurality of partitions.
12. The method of claim 1, further comprising determining to rebuild the particular partition based at least in part on a count reflecting a number of items that have been removed from the particular partition.
13. The method of claim 1, further comprising:
- dynamically determining whether to partition or resize the partition-specific Bloom filter based at least in part on a number of observed false positive results with respect to the partition-specific Bloom filter.
14. The method of claim 1, wherein:
- the one or more other partition-specific Bloom filters corresponding to the one or more other partitions of the plurality of partitions are responsive to queries during the partitioning or resizing of the partition-specific Bloom filter.
15. A system, comprising:
- a processor configured to:
- receive a representation of an item;
- use the representation to determine a particular partition with which the item is associated, wherein the particular partition with which the item is associated is one of a plurality of partitions corresponding to a volume that comprises the item, and each of the plurality of partitions have a corresponding partition-specific Bloom filter;
- determine whether the item is an element of a set with which the particular partition is associated, wherein the set comprises one or more objects stored in a distributed file system;
- dynamically determine to partition or resize the partition-specific Bloom filter based at least in part on a computed probability of the partition-specific Bloom filter rendering a false positive;
- in response to determining to partition or resize the partition specific Bloom filter, partition or resize the partition-specific Bloom filter independent of one or more other partition-specific Bloom filters corresponding to the one or more other partitions of the plurality of partitions; and
- a storage device coupled to the processor and configured to store the partition-specific Bloom filter.
16. A computer program product embodied in a non-transitory computer readable storage medium and comprising computer instructions for:
- receiving a representation of an item;
- using the representation to determine a particular partition with which the item is associated, wherein the particular partition with which the item is associated is one of a plurality of partitions corresponding to a volume that comprises the item, and each of the plurality of partitions have a corresponding partition-specific Bloom filter;
- determining, by one or more processors, whether the item is an element of a set with which the particular partition is associated, wherein the set comprises one or more objects stored in a distributed file system;
- dynamically determining to partition or resize the partition-specific Bloom filter based at least in part on a computed probability of the partition-specific Bloom filter rendering a false positive; and
- in response to determining to partition or resize the partition specific Bloom filter, partitioning or resizing the partition-specific Bloom filter independent of one or more other partition-specific Bloom filters corresponding to one or more other partitions of the plurality of partitions.
Type: Application
Filed: Oct 31, 2019
Publication Date: Feb 27, 2020
Inventors: Thomas Manville (Mountain View, CA), Julio Lopez (Mountain View, CA), Shrinand Javadekar (Sunnyvale, CA)
Application Number: 16/670,458