SYSTEMS, METHODS AND DEVICES FOR ELIMINATING DUPLICATES AND VALUE REDUNDANCY IN COMPUTER MEMORIES
A computer memory compression method involves analyzing (1210) computer memory content with respect to occurrence of duplicate memory objects as well as value redundancy of data values in unique memory objects. The computer memory content is encoded (1220) by eliminating the duplicate memory objects and compressing each remaining unique memory object by exploiting data value locality of the data values thereof. Metadata (500) is provided (1230) to represent the memory objects of the encoded computer memory content. The metadata reflects eliminated duplicate memory objects, remaining unique memory objects as well as a type of compression used for compressing each remaining unique memory object. A memory object in the encoded computer memory content is located (1240) using the metadata (500).
This subject matter generally relates to the field of data compression in memories in electronic computers.
BACKGROUND
Data compression is a general technique to store and transfer data more efficiently by coding frequent collections of data more efficiently than less frequent collections of data. It is of interest to generally store and transfer data more efficiently for a number of reasons. In computer memories, for example memories that keep data and computer instructions that processing devices operate on, for example in main or cache memories, it is of interest to store said data more efficiently, say K times, as it then can reduce the size of said memories potentially by K times, using potentially K times less communication capacity to transfer data from one memory to another memory and with potentially K times less energy expenditure to store and transfer said data inside or between computer systems and/or between memories. Alternatively, one can potentially store K times more data in available computer memory than without data compression. This can be of interest to achieve potentially K times higher performance of a computer without having to add more memory, which can be costly or can simply be less desirable due to resource constraints. As another example, the size and weight of a smartphone, a tablet, a laptop/desktop or a set-top box can be limited as a larger or heavier smartphone, tablet, laptop/desktop or set-top box could be of less value for an end user; hence potentially lowering the market value of such products. Yet, more memory capacity or higher memory communication bandwidth can potentially increase the market value of the product as more memory capacity or memory communication bandwidth can result in higher performance and hence better utility of the product.
To summarize, in the general landscape of computerized products, including isolated devices or interconnected ones, data compression can potentially increase the performance, lower the energy expenditure, increase the memory communication bandwidth or lower the cost and area consumed by memory. Therefore, data compression has a broad utility in a wide range of computerized products beyond those mentioned here.
Compressed memory systems in prior art typically compress a memory page when it is created, either by reading it from disk or through memory allocation. Compression can be done using a variety of well-known methods by software routines or by hardware accelerators. When the processors request data from memory, data must typically be first decompressed before serving the requesting processor. As such requests may end up on the critical memory access path, decompression is typically hardware accelerated to impose a low impact on the memory access time.
In one compression approach, called deduplication, the idea is to identify identical memory objects. For example, let us assume that the memory contains five identical instances of the same page. Then, only one of them needs to be saved whereas the remaining four can make a reference to that only instance; thus, providing a compression ratio of a factor of five. Deduplication known in prior art has been applied to fixed-size objects at a range of granularities such as memory pages whose sizes are typically on the order of a few KiloBytes to tens of KiloBytes (KB), or even more, and memory blocks whose sizes are typically a few tens of bytes, for example 64 Bytes (64B). Other prior art considers variable-grain sizes such as variable-size storage files. In any case, a limitation of deduplication is that it builds on only removing duplicates of the occurrence of identical memory objects.
In removing identical objects, the removed object must establish a reference to the sole object identical to it. References, in terms of pointers, are to point to the sole copy of the memory object and this consumes memory space. Hence, deduplication can lead to significant compression meta-data overhead. For example, let us assume that deduplication is applied to memory blocks of 64B (=2^6 bytes) in a memory of 1 Terabyte (=2^40 bytes). Then, a (40−6=) 34-bit reference pointer is needed to point to the unique copy of a deduplicated memory block.
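Purely as a non-limiting illustration, the pointer-width arithmetic of the preceding example can be checked with the short Python sketch below. The function name `reference_bits` is hypothetical, and the sketch assumes that a reference only needs to address block-aligned locations.

```python
def reference_bits(memory_bytes: int, block_bytes: int) -> int:
    """Bits needed to address any block-aligned location in the memory."""
    num_blocks = memory_bytes // block_bytes
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    return (num_blocks - 1).bit_length()


bits = reference_bits(2**40, 2**6)   # 1 Terabyte of 64B blocks
overhead = bits / (2**6 * 8)         # pointer bits per bit of block payload
print(bits, f"{overhead:.1%}")       # prints: 34 6.6%
```

Under these assumptions, each 34-bit reference corresponds to roughly 6.6% of the 512 bits of payload in a 64B block.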
Alternative compression approaches known from prior art leverage value redundancy (in terms of single words, say 32 or 64 bits). For example, a memory object that is more common than another will be encoded with fewer bits than a memory object that is not so common. Entropy-based compression techniques abound in prior art, including for example Huffman coding and arithmetic coding. Other compression techniques include Base-Delta-Immediate compression, which exploits that numerical values stored in data objects, e.g. memory pages and blocks, are numerically close to each other and encodes the difference between them densely.
Importantly, deduplication, which removes duplicates, and compression techniques exploiting value locality, such as entropy-based compression and base-delta-immediate compression, which remove value redundancy, are complementary in a number of ways. Consider for example page-based deduplication where a single copy of identical pages is stored whereas a reference pointer is provided from the copies to refer to the unique copy. Such a deduplication scheme does not, however, take advantage of the value redundancy existing at finer granularity, for example, at the word level (e.g. 32 or 64-bit entities) within the page. By combining deduplication with compression schemes that reduce value redundancy, it is possible to eliminate duplicates and store the remaining unique copies much more densely by encoding each data value inside the unique copy based on its statistical value nature. It is the intent of this document to disclose an invention that provides devices, systems and methods of a family of compression techniques applied to computer memory that eliminates duplicates as well as value redundancy.
Combining deduplication with value-locality-based compression opens up a number of technical challenges. A first challenge is how to find an encoding that offers a combined gain in compressibility by removing duplicates as well as compressing the items in the remaining unique copies using a value-locality-based approach. To locate a memory block efficiently in the compressed memory, using a combined approach of deduplication and value-locality-based compression, will open up a challenge to keep the amount of metadata low and allow for compression and decompression devices to impose a low memory latency overhead. Hence, a second challenge is to come up with compression and decompression methods, devices and systems that can keep the amount of metadata low and that impose a low memory latency overhead. At operation, data objects will change in response to processor writes. This has the effect that the nature of duplicated and unique blocks will change; both concerning the number of duplicates as well as the statistical nature of the value locality of the remaining unique copies. A third challenge is to provide methods, devices and systems that can keep the compressibility high in light of such dynamic effects. It is the intent that the disclosed invention addresses all these and other challenges.
SUMMARY
A first aspect of the present invention is a computer memory compression method. The method comprises analyzing computer memory content with respect to occurrence of duplicate memory objects as well as value redundancy of data values in unique memory objects. The method also comprises encoding said computer memory content by eliminating said duplicate memory objects and compressing each remaining unique memory object by exploiting data value locality of the data values thereof. The method further comprises providing metadata representing the memory objects of the encoded computer memory content. The metadata reflects eliminated duplicate memory objects, remaining unique memory objects as well as a type of compression used for compressing each remaining unique memory object. The method moreover comprises locating a memory object in the encoded computer memory content using said metadata.
A second aspect of the present invention is a computer memory compression device. The device comprises an analyzer unit configured for analyzing computer memory content with respect to occurrence of duplicate memory objects as well as value redundancy of data values in unique memory objects. The device also comprises an encoder unit configured for encoding said computer memory content by eliminating said duplicate memory objects and compressing each remaining unique memory object by exploiting data value locality of the data values thereof. The encoder unit is further configured for providing metadata representing the memory objects of the encoded computer memory content. The metadata reflects eliminated duplicate memory objects, remaining unique memory objects as well as a type of compression used for compressing each remaining unique memory object. The device further comprises a locator unit configured for locating a memory object in the encoded computer memory content using said metadata.
Other aspects, as well as objectives, features and advantages of the disclosed embodiments will appear from the following detailed disclosure, from the attached dependent claims as well as from the drawings.
Generally, compressing by exploiting data value locality as described in this document may involve entropy-based encoding, delta encoding, dictionary-based encoding or pattern-based encoding, without limitations.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to “a/an/the [element, device, component, means, step, etc]” are to be interpreted openly as referring to at least one instance of the element, device, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
This document discloses systems, methods and devices to compress data in computer memory with a family of compression approaches that eliminates duplicates and value redundancy in computer memories.
An exemplary embodiment of a computer system 100 is depicted in
Computer systems, as exemplified by the embodiment in
This invention disclosure considers several embodiments that differ in the level of the aforementioned exemplary memory hierarchy at which compression is applied. A first embodiment considers the invented compression method being applied at the main memory. However, other embodiments can be appreciated by someone skilled in the art. It is the intent that such embodiments are also contemplated while not being explicitly covered in this patent disclosure.
As for the first disclosed embodiment, where we consider the problem of a limited main memory capacity, the exemplary system in
As will be explained in more detail below, the analyzer unit 214 is configured for analyzing computer memory content with respect to occurrence of duplicate memory objects as well as value redundancy of data values in unique memory objects. In this regard, the data values will typically be of finer granularity than the memory objects, and the memory objects will typically be of finer granularity than the computer memory content. The computer memory content may typically be a page of computer memory, the memory objects may typically be memory blocks, and each memory block may typically comprise a plurality of data values, such as memory words.
The encoder unit 212 is configured for encoding the computer memory content by eliminating the duplicate memory objects and compressing each remaining unique memory object by exploiting data value locality of the data values thereof. The encoder unit 212 is further configured for providing metadata representing the memory objects of the encoded computer memory content. The metadata reflects eliminated duplicate memory objects, remaining unique memory objects as well as a type of compression used for compressing each remaining unique memory object. Examples of such metadata are, for instance, seen at 500 in
A corresponding general computer memory compression method 1200 is shown in
The computer memory compression device 205 is connected to the memory controllers on one side and the last-level cache C3 on the other side. A purpose of the address translation unit 211 is to translate a conventional physical address PA to a compressed address CA to locate a memory block in the compressed memory. Someone skilled in the art realizes that such address translation is needed because a conventional memory page (say 4 KB) may be compressed to any size in a compressed memory. A purpose of the encoder (compressor) unit 212 is to compress memory blocks that have been modified and are evicted from the last-level cache. To have a negligible impact on the performance of the memory system, compression must be fast and is typically accelerated by a dedicated compressor unit. Similarly, when a memory block is requested by the processor and is not available in any of the cache levels, e.g. C1, C2 and C3 in the exemplary embodiment, the memory block must be requested from memory. The address translation unit 211 will locate the block but before it is installed in the cache hierarchy, e.g. in C1, it must be decompressed. A purpose of the decompressor unit 213 is to accelerate this process so that it can have negligible impact on the performance of the memory system.
Someone skilled in the art may realize that the functionality of the compressor and the decompressor unit depends on the type of compression algorithm being used. In one embodiment, delta encoding (such as base-delta-immediate encoding) can be used, where the difference between a value and a base value is stored rather than the value itself. In another embodiment, entropy-based encoding (such as Huffman-encoding) can be used in which values that are more frequent than others use denser codes. In a third embodiment, one can use deduplication where only unique blocks are stored in memory. It is the intent of this invention disclosure to cover all compression algorithms with the purpose of removing value redundancy.
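Purely as a non-limiting illustration of the delta encoding mentioned above, the following Python sketch stores the difference between each value and a base value rather than the value itself. It is a simplified single-base scheme and not a complete base-delta-immediate implementation; the function names, the choice of the first word as the base, and the 8-bit delta width are assumptions made for the example only.

```python
def delta_encode(words, delta_bits=8):
    """Return (base, deltas) if every delta fits in a signed delta_bits field, else None."""
    base = words[0]                      # assumption: first word serves as the base
    lo = -(1 << (delta_bits - 1))
    hi = (1 << (delta_bits - 1)) - 1
    deltas = [w - base for w in words]
    if all(lo <= d <= hi for d in deltas):
        return base, deltas
    return None                          # block is left uncompressed in this sketch


def delta_decode(base, deltas):
    """Reconstruct the original words from the base and the stored deltas."""
    return [base + d for d in deltas]


words = [1000, 1003, 998, 1010]
encoded = delta_encode(words)
print(encoded)                           # prints: (1000, [0, 3, -2, 10])
```

When the values are numerically close, each delta occupies far fewer bits than a full 32-bit or 64-bit word, which is the source of the compression gain.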
Given the embodiment according to
We provide an exemplary overview of how a memory page is compressed using deduplication in combination with entropy-based compression in
Prior art also comprises compression methods that encode frequently used data denser than less frequently used data, such as Huffman encoding, or that exploit that numerical values are similar, such as delta encoding (e.g. base-delta-immediate encoding). These compression methods are referred to as value-redundancy removal compression methods. To compress a page, using a value-redundancy removal compression method, one typically analyzes all individual data items at some granularity, for example, at the word level (say 64 bits). The value frequency distribution would capture the relative occurrence of different values in the page. However, when trivially applied to the original content of the memory page, before deduplication, the existence of duplicates can drastically change the value distribution. For this reason, the proposed embodiment applies deduplication first, to remove duplicates, before value distribution of the remaining unique memory blocks is established. The rightmost exemplary layout, seen at (C) in
We now turn our attention to how the combined approach is realized as exemplified in
To establish whether a memory block is unique and must be inserted in the tree-based data structure, its signature is first compared with the signature of the top node 410 in the tree data structure 400. If it is the same, a second test is carried out to compare the content of the two memory blocks. If the memory blocks are indeed identical, a duplicate block has been detected. This same operation is carried out at each node in the tree-based data structure. However, if the signature is the same, but the two blocks are not identical, the new block has to be inserted with the same signature. This may involve the following additional test to handle false positives. When the created signature S matches 650 a signature represented in the tree data structure 400:
- determining whether said individual memory object is identical to said unique memory block represented by said matching signature; and
- if said individual memory object and said unique memory block represented by said matching signature are not identical:
- inserting a node in the tree data structure 400;
- entering the created signature S in the inserted node; and
- generating the metadata 500 for said individual memory object with the information 510 indicating that it is a unique memory object and the unique memory object reference 530, U_PTR to said individual memory object.
On the other hand, if there is a signature mismatch, the search proceeds in the left branch of the tree if the signature is less than that of the top node 410 according to the test at 460. If the signature is greater than the signature of the top node, the search proceeds in the right branch of the tree according to the test (box 470). Hence, all nodes 410, 420, 430, 440 and 450 are organized in descending order (left branch) and ascending order (right branch) to make the search time logarithmic rather than linear. As duplicates will be removed in the process, a memory block will not reside at the same address as in a conventional uncompressed page. For this reason, the new location of a block will be recorded in the tree-based data structure as depicted in each node by “Block location—BL”.
The end result of the deduplication process is that all duplicated memory blocks have been eliminated. For this reason, and as has been explained in relation to
The rightmost part of
Hence, in summary, the metadata 500 advantageously comprises, for each memory object of the encoded computer memory content:
- information 510 indicative of the memory object being an eliminated duplicate memory object or a remaining unique memory object;
- when the memory object is a unique memory object, information 520 indicative of the type of compression used and a unique memory object reference 530, U_PTR to the unique memory object; and
- when the memory object is a duplicate memory object, a unique memory object reference 530, U_PTR to a unique memory object, the non-compressed contents of which are identical to the duplicate memory object.
Advantageously, the metadata 500 further comprises, for each memory object being a unique memory object, a duplicate memory object reference 540, D_PTR to an eliminated duplicate memory object, the non-compressed contents of which are identical to the unique memory object.
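Purely as a non-limiting illustration, the per-block metadata summarized above could be modeled as in the following Python sketch. The class name, the field names, the use of a string tag for the type of compression, the use of slot indices for U_PTR, and the use of a metadata index for D_PTR are all assumptions made for the example; the actual bit-level layout is not implied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockMetadata:
    is_unique: bool                # information 510
    u_ptr: int                     # unique memory object reference 530 (U_PTR)
    enc: Optional[str] = None      # information 520, only set for unique blocks
    d_ptr: Optional[int] = None    # duplicate memory object reference 540 (D_PTR)


# Page with block contents A, B, A: the third block duplicates the first.
page_metadata = [
    BlockMetadata(is_unique=True, u_ptr=0, enc="huffman", d_ptr=2),
    BlockMetadata(is_unique=True, u_ptr=1, enc="huffman"),
    BlockMetadata(is_unique=False, u_ptr=0),
]
```

In this example, the duplicate entry carries only a reference to the unique copy, whereas the unique entry additionally records how it was compressed and where one of its duplicates is found.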
Let us now establish the entire process by which memory blocks become deduplicated by analyzing all the memory blocks within a page (other granularities such as multiple pages are also applicable). The process is depicted in the flow graph of
As will be understood from the description of
- creating the signature S, the signature being a dense representation of the data values of the memory object;
- traversing the tree data structure 400 to compare the created signature S to signatures already represented in the tree data structure 400;
- if the created signature S does not match 660 any of the signatures represented in the tree data structure 400:
- inserting a node in the tree data structure 400;
- entering the created signature S in the inserted node; and
- generating the metadata 500 for said individual memory object with the information 510 indicating that it is a unique memory object and the unique memory object reference 530, U_PTR to the individual memory object; and
- if the created signature S matches 650 a signature represented in the tree data structure 400:
- generating the metadata 500 for said individual memory object with the information 510 indicating that it is a duplicate memory object and the unique memory object reference 530, U_PTR to a unique memory block represented by the matching signature in the tree data structure 400; and
- updating the metadata 500 of the unique memory block represented by the matching signature in the tree data structure 400 to introduce a duplicate memory object reference 540, D_PTR to said individual memory object.
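Purely as a non-limiting illustration, the steps listed above can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the disclosure: CRC-32 serves as the signature, the tree is an unbalanced binary search tree, U_PTR is the index of the unique block in the original page rather than its compacted location, and the duplicate references of a unique block are kept as a list (`d_ptrs`). A block whose signature matches but whose content differs is inserted to the right of the matching node.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    signature: int
    block_index: int                     # location of the unique block
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def deduplicate(blocks, signature=zlib.crc32):
    """Return per-block metadata dicts after removing duplicate blocks."""
    root = None
    metadata = []
    for index, block in enumerate(blocks):
        sig = signature(block)
        parent, node, match = None, root, None
        while node is not None:
            # A signature match must be confirmed by comparing block contents.
            if sig == node.signature and blocks[node.block_index] == block:
                match = node
                break
            parent = node
            node = node.left if sig < node.signature else node.right
        if match is not None:
            metadata.append({"unique": False, "u_ptr": match.block_index})
            metadata[match.block_index]["d_ptrs"].append(index)
        else:
            new_node = Node(sig, index)
            if parent is None:
                root = new_node
            elif sig < parent.signature:
                parent.left = new_node
            else:
                parent.right = new_node
            metadata.append({"unique": True, "u_ptr": index, "d_ptrs": []})
    return metadata


blocks = [b"A" * 64, b"B" * 64, b"A" * 64, b"C" * 64, b"A" * 64]
metadata = deduplicate(blocks)
print([m["unique"] for m in metadata])   # prints: [True, True, False, True, False]
```

Passing a deliberately weak signature, for example `signature=lambda b: 0`, forces every lookup through the content comparison and yields the same metadata, which illustrates why the second test is needed to handle false positives.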
As has been pointed out, applying deduplication prior to any compression method aiming at leveraging on the value locality of individual data items, for example at the word level, is important as duplicates will not represent the value frequency distribution correctly. To this end, a process is needed to establish the value frequency distribution of unique blocks. Such a process 700 is depicted in
The analyzer unit 214 and the encoder unit 212 of the computer memory compression device 205 may hence be configured for, when all memory objects in the computer memory content have been processed 600:
- traversing the tree data structure 400 to generate a value-frequency table for the data values of the unique memory objects as represented by the nodes of the tree data structure 400; and
- compressing each unique memory object by an entropy-based compression scheme using the generated value frequency table.
In one such embodiment, the analyzer unit could implement a hash-table to record the frequency of each value to be used for later analysis, perhaps using software routines, to establish encodings using e.g. Huffman encoding or some other entropy-based encoding techniques.
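Purely as a non-limiting illustration, the following Python sketch builds a value-frequency table over the words of the unique blocks using a hash table and derives Huffman code lengths from it. The function names and the 4-byte word size are assumptions made for the example, and the sketch computes only code lengths, not the code words themselves.

```python
import heapq
from collections import Counter
from itertools import count


def value_frequencies(unique_blocks, word_bytes=4):
    """Count how often each word value occurs in the unique blocks."""
    freq = Counter()
    for block in unique_blocks:
        for offset in range(0, len(block), word_bytes):
            freq[block[offset:offset + word_bytes]] += 1
    return freq


def huffman_code_lengths(freq):
    """Return the Huffman code length in bits for each value."""
    if len(freq) == 1:
        return {value: 1 for value in freq}
    tie = count()                        # tie-breaker so value lists are never compared
    heap = [(n, next(tie), [value]) for value, n in freq.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    while len(heap) > 1:
        n1, _, values1 = heapq.heappop(heap)
        n2, _, values2 = heapq.heappop(heap)
        for value in values1 + values2:  # merged values move one level deeper
            lengths[value] += 1
        heapq.heappush(heap, (n1 + n2, next(tie), values1 + values2))
    return lengths


z, a, b = b"\x00\x00\x00\x00", b"\x01\x00\x00\x00", b"\x02\x00\x00\x00"
unique_blocks = [z * 4 + a * 2 + b + z]  # one 32-byte unique block
freq = value_frequencies(unique_blocks)
lengths = huffman_code_lengths(freq)
```

In this example the most frequent value receives a 1-bit code and the two rarer values receive 2-bit codes. Because only unique blocks are counted, eliminated duplicates do not skew the table.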
In an alternative embodiment, using delta encoding (e.g. base-delta-immediate encoding), the values remaining after duplicates have been removed can be used to select one or a plurality of base values. In one approach, clustering techniques can be used to analyze which base value is closest to all values in the unique copies in a page, after duplicates have been removed.
Alternatively, therefore, the analyzer unit 214 and the encoder unit 212 of the computer memory compression device 205 may be configured for, when all memory objects in the computer memory content have been processed 600:
- traversing the tree data structure 400 to select one or a plurality of base values from the data values of the unique memory objects as represented by the nodes of the tree data structure 400; and
- compressing each unique memory object by a delta encoding-based compression scheme using the selected base value or values.
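Purely as a non-limiting illustration of base value selection for the delta encoding alternative, the following Python sketch picks a single base for the values remaining after duplicates have been removed. It substitutes the median for the clustering techniques mentioned above, relying on the fact that the median minimizes the sum of absolute differences to a set of values; the function names are hypothetical.

```python
from statistics import median_low


def choose_base(values):
    """Pick one base value; the median minimizes the sum of absolute deltas."""
    return median_low(values)


def delta_bits_needed(values, base):
    """Bits sufficient to store every signed delta from the base."""
    widest = max(abs(v - base) for v in values)
    return widest.bit_length() + 1       # one extra bit for the sign


values = [1000, 1003, 998, 1010, 1001]
base = choose_base(values)
print(base, delta_bits_needed(values, base))   # prints: 1001 5
```

A plurality of base values could be selected by first partitioning the values into clusters and applying the same selection within each cluster.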
We now turn our attention to how a memory block is located and decompressed in the compressed memory using the combined deduplication and value-redundancy removal compression technique. Returning to
When one of the processors in
Now suppose that a write request is destined to the unique memory block 830 and let us turn our attention to the rightmost scenario of
An alternative way of handling a write request destined to the unique memory block 830 in
Let us now consider a scenario where a write request is destined to a deduplicated block and let us turn our attention to the leftmost scenario of
Note that in both the scenarios of
Part of the metadata of
In the event that a block is being replaced from the last-level cache C3 of the exemplary embodiment of
The description of
- receiving a read request for a memory block in a memory page having a physical memory page address PA;
- determining a compressed memory page address CA from a look-up table 1010;
- retrieving metadata 1020 for the memory block;
- calculating a compressed memory block address 1040 from the compressed memory page address CA and the unique memory object reference 530, U_PTR of the retrieved metadata;
- retrieving a compressed memory block 1105 at the calculated compressed memory block address 1040; and
- decompressing, by the decompressor unit 213; 1110, the retrieved compressed memory block 1105 using the information 520; 1120, ENC which is indicative of the type of compression and is available in the retrieved metadata for the memory block.
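Purely as a non-limiting illustration, the read path listed above can be sketched in Python as follows. Dictionaries stand in for the look-up table, the metadata and the compressed memory; `zlib` stands in for the hardware decompressor unit; and the page size, block size, key names and the presence of the ENC field in every metadata entry (including those of duplicates) are assumptions made for the example.

```python
import zlib

PAGE_SIZE = 4096
BLOCK_SIZE = 64

# Stand-ins for the decompressor unit, selected by the ENC field.
DECOMPRESSORS = {"raw": bytes, "zlib": zlib.decompress}


def read_block(pa, lookup_table, page_metadata, memory):
    """Locate and decompress the block at physical address pa."""
    page_pa = pa - pa % PAGE_SIZE
    block_index = (pa % PAGE_SIZE) // BLOCK_SIZE
    ca = lookup_table[page_pa]                       # look-up table 1010
    meta = page_metadata[page_pa][block_index]       # metadata 1020
    block_address = ca + meta["u_ptr"]               # compressed block address 1040
    compressed = memory[block_address]               # compressed block 1105
    return DECOMPRESSORS[meta["enc"]](compressed)    # decompress according to ENC


original = b"A" * BLOCK_SIZE
lookup_table = {0x1000: 0x400}
memory = {0x400: zlib.compress(original)}
page_metadata = {0x1000: {0: {"u_ptr": 0, "enc": "zlib"},
                          2: {"u_ptr": 0, "enc": "zlib"}}}  # block 2 duplicates block 0
data = read_block(0x1000 + 2 * BLOCK_SIZE, lookup_table, page_metadata, memory)
```

A read of the duplicate block 2 and a read of the unique block 0 resolve to the same compressed block, since both metadata entries carry the same U_PTR.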
As was explained with reference to
- receiving a write-back request involving an update of a unique memory block at an original memory location 830;
- copying the unique memory block 830 prior to update to a new memory location 870 in a dedicated free memory area 840 of the computer memory content;
- updating the metadata of duplicate memory blocks 810, 820 linked to the unique memory block 830, such that the duplicate memory object references 540, D_PTR thereof are redirected to the new memory location 870 in the dedicated free memory area 840; and
- updating the unique memory block at its original memory location 830 in accordance with the write-back request.
Also, the computer memory compression device 205 may advantageously be further configured for:
- providing metadata which includes a reference F_PTR to a starting address 805 of the dedicated free memory area 840; and
- updating said reference F_PTR to reflect a new starting address 880 after the copying of the unique memory block 830 to the new memory location 870 in the dedicated free memory area 840.
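Purely as a non-limiting illustration, the handling of a write-back to a unique memory block that has duplicates can be sketched in Python as follows. The sketch is a simplified model under stated assumptions: the page is a dictionary with hypothetical keys `meta`, `store` and `f_ptr`; locations are slot numbers rather than byte addresses; the duplicates are found by scanning the metadata instead of following D_PTR; and the reference redirected is the one each duplicate's metadata entry holds to the unique copy.

```python
def write_back_unique(page, index, new_data):
    """Update unique block `index` in place, preserving old content for its duplicates."""
    meta, store = page["meta"], page["store"]
    entry = meta[index]
    duplicates = [i for i, m in enumerate(meta)
                  if not m["unique"] and m["u_ptr"] == entry["u_ptr"]]
    if duplicates:
        new_location = page["f_ptr"]                 # start of the free area
        store[new_location] = store[entry["u_ptr"]]  # copy the block prior to update
        page["f_ptr"] = new_location + 1             # free area now starts one slot later
        for i in duplicates:
            meta[i]["u_ptr"] = new_location          # redirect duplicates to the copy
    store[entry["u_ptr"]] = new_data                 # update at the original location


page = {
    "meta": [{"unique": True, "u_ptr": 0},
             {"unique": False, "u_ptr": 0},
             {"unique": True, "u_ptr": 1}],
    "store": {0: b"old", 1: b"other"},
    "f_ptr": 2,
}
write_back_unique(page, 0, b"new")
```

After the call, the duplicate still reads the old content from the free area, while the updated block remains at its original location.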
As was explained as an alternative to
- receiving a write-back request involving an update of a unique memory block 830 at an original memory location;
- finding a deduplicated memory block 820 being a duplicate of the unique memory block 830;
- promoting the found deduplicated memory block 820 as unique in the tree data structure 400 by using the signature S of the unique memory block 830;
- writing updated contents of the unique memory block 830 according to the write-back request to a new memory location 870 in a dedicated free memory area 840; and
- updating the metadata of the unique memory block 830 such that the unique memory object reference 530, U_PTR thereof is redirected to the new memory location 870 in the dedicated free memory area 840, whereas any duplicate memory object references 540, D_PTR thereof are removed.
Also, the computer memory compression device 205 may advantageously be further configured for:
- providing metadata which includes a reference F_PTR to a starting address 805 of the dedicated free memory area 840; and
- updating said reference F_PTR to reflect a new starting address 880 after the writing of the updated contents of the unique memory block 830 according to the write-back request to the new memory location 870 in the dedicated free memory area 840.
As was explained with reference to
- receiving a write-back request involving an update of a duplicate memory block 920;
- storing the contents of the updated duplicate memory block as a new unique memory block 980 in a dedicated free memory area 940; and
- updating the metadata of a unique memory block 910 previously linked to the duplicate memory block 920 to reflect that the unique memory block 910 is no longer linked to the duplicate memory block 920 while maintaining any links between the unique memory block 910 and other duplicate memory blocks 930.
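Purely as a non-limiting illustration, the handling of a write-back to a duplicate memory block can be sketched in Python as follows. As before, the dictionary keys `meta`, `store`, `f_ptr` and `d_ptrs`, the slot-based locations, and the list representation of the duplicate references are assumptions made for the example only.

```python
def write_back_duplicate(page, index, new_data):
    """Turn duplicate block `index` into a new unique block stored in the free area."""
    meta, store = page["meta"], page["store"]
    entry = meta[index]
    if entry["unique"]:
        raise ValueError("block is not a duplicate")
    # Unlink from the unique block while keeping its links to other duplicates.
    owner = next(m for m in meta if m["unique"] and m["u_ptr"] == entry["u_ptr"])
    owner["d_ptrs"].remove(index)
    new_location = page["f_ptr"]
    store[new_location] = new_data                   # new unique block in the free area
    page["f_ptr"] = new_location + 1
    entry.update(unique=True, u_ptr=new_location, d_ptrs=[])


page = {
    "meta": [{"unique": True, "u_ptr": 0, "d_ptrs": [1, 2]},
             {"unique": False, "u_ptr": 0},
             {"unique": False, "u_ptr": 0}],
    "store": {0: b"shared"},
    "f_ptr": 1,
}
write_back_duplicate(page, 1, b"new")
```

After the call, the first block remains linked to the remaining duplicate only, and the written block has become a unique block in the free area.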
As a result of write-back requests, unique and deduplicated copies will be updated and will end up in the free area used to avoid unnecessary duplication to happen, as explained in relation to
Accordingly, the computer memory compression device 205 may advantageously be further configured for:
- monitoring compression ratio over time for a memory page; and
- if the compression ratio does not satisfy a given criterion, performing recompression of the memory page by performing the functionality of the computer memory compression method 1200 as described in this document.
Alternatively, or additionally, the computer memory compression device 205 may be further configured for periodically performing recompression of a memory page to improve compression ratio by performing the functionality of the computer memory compression method 1200 as described in this document.
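Purely as a non-limiting illustration, a criterion-based recompression trigger can be sketched in Python as follows. The function names, the 4 KB page size and the minimum ratio of 1.5 are assumptions made for the example; the disclosure does not prescribe a particular criterion.

```python
def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Ratio of original size to compressed size; higher is better."""
    return uncompressed_bytes / compressed_bytes


def needs_recompression(compressed_sizes, page_bytes=4096, min_ratio=1.5):
    """True when the most recent sample falls below the required ratio."""
    latest = compression_ratio(page_bytes, compressed_sizes[-1])
    return latest < min_ratio


# Compressed size of one page sampled over time as write-backs fill the free area.
samples = [1024, 2048, 3584]
print(needs_recompression(samples))      # prints: True
```

In the example, the page grows from one quarter to seven eighths of its uncompressed size, so the last sample fails the criterion and recompression would be performed.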
Although the inventive aspects have been described in this document by referring to the example embodiments, the inventive aspects are not limited to the disclosed embodiments but they cover alternative embodiments that can be realized by someone skilled in the art.
One alternative inventive aspect can be seen as a system for analysis of computer memory data with the purpose of compressing it by eliminating duplicates of data items and value redundancy, the system comprising means to eliminate duplicates and value redundancy, means to locate data items after duplicate and value redundancy removal, means for compressing and decompressing data items using said compression method, and means for recompressing data items.
Another alternative inventive aspect can be seen as a method for analysis of computer memory data with the purpose of compressing it by eliminating duplicates of data items and value redundancy, the method comprising the steps of eliminating duplicates and value redundancy, locating data items after duplicate and value redundancy removal, compressing and decompressing data items using said compression method, and recompressing data items.
Yet another alternative inventive aspect can be seen as a device for analysis of computer memory data with the purpose of compressing it by eliminating duplicates of data items and value redundancy, the device being configured to eliminate duplicates and value redundancy, locate data items after duplicate and value redundancy removal, compress and decompress data items using said compression method, and recompress data items.
Still another alternative inventive aspect can be seen as the disclosed invention comprises a system for data analysis with means to analyze the content of pages in main memory with respect to the occurrence of duplicates of memory blocks and with respect to occurrence of value redundancy of the remaining unique memory blocks. The disclosed invention comprises also a system with means for removing duplicates and value redundancy of memory. Furthermore, the disclosed invention comprises a system with means to locate individual memory blocks after duplicates and value redundancy have been removed and means for compression and decompression of memory blocks using the same. Finally, the disclosed invention comprises systems with means to re-compress memory pages.
Further alternative inventive aspect can be seen as methods that analyze the content of pages in main memory with respect to the occurrence of duplicates of memory blocks and with respect to the relative frequency of values in the remaining unique memory blocks; methods for encoding memory blocks taking into account both deduplication and value-locality-based encoding methods; and methods for locating individual memory blocks in the compressed memory for the family of combined deduplication and value-locality-based compression technique and methods for compressing and decompressing memory blocks using the same. Finally, the disclosed invention comprises methods for re-compressing memory pages.
Another alternative inventive aspect can be seen as a data analyzer device configured to analyze the content of pages in main memory with respect to the occurrence of duplicates of memory blocks and with respect to value redundancy of the remaining unique memory blocks; a data encoder device configured to encode memory blocks taking into account removal of duplicates as well as value redundancy in remaining unique blocks; a memory block locator device configured to locate individual memory blocks in the compressed memory for the family of combined deduplication and value-locality-based compression techniques; devices configured to compress and decompress memory blocks using the same; and devices configured to re-compress memory pages.
Claims
1. A computer memory compression method (1200), comprising:
- analyzing (1210) computer memory content with respect to occurrence of duplicate memory objects as well as value redundancy of data values in unique memory objects;
- encoding (1220) said computer memory content by eliminating said duplicate memory objects and compressing each remaining unique memory object by exploiting data value locality of the data values thereof;
- providing (1230) metadata (500) representing the memory objects of the encoded computer memory content, wherein the metadata (500) reflects eliminated duplicate memory objects, remaining unique memory objects as well as a type of compression used for compressing each remaining unique memory object; and
- locating (1240) a memory object in the encoded computer memory content using said metadata (500).
2. The method as defined in claim 1, wherein the metadata (500) comprises, for each memory object of the encoded computer memory content:
- information (510) indicative of the memory object being an eliminated duplicate memory object or a remaining unique memory object;
- when the memory object is a unique memory object, information (520) indicative of the type of compression used and a unique memory object reference (530, U_PTR) to the unique memory object; and
- when the memory object is a duplicate memory object, a unique memory object reference (530, U_PTR) to a unique memory object, the non-compressed contents of which are identical to the duplicate memory object.
3. The method as defined in claim 2, wherein the metadata (500) further comprises, for each memory object being a unique memory object, a duplicate memory object reference (540, D_PTR) to an eliminated duplicate memory object, the non-compressed contents of which are identical to the unique memory object.
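By way of illustration only, the per-object metadata of claims 2 and 3 could be represented as in the following minimal sketch. The class name `BlockMetadata`, the field names, the `"delta"` label and the use of integer offsets for the references are assumptions made for this example and are not prescribed by the claims.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockMetadata:
    """Hypothetical metadata entry for one memory object (names are assumptions)."""
    is_unique: bool               # information (510): unique or eliminated duplicate
    u_ptr: int                    # reference (530, U_PTR) to the unique memory object
    enc: Optional[str] = None     # information (520): compression type, unique objects only
    d_ptr: Optional[int] = None   # reference (540, D_PTR) to a duplicate, unique objects only


# Block 0 is unique and compressed with an assumed "delta" scheme; block 3 duplicates it.
page_metadata = {
    0: BlockMetadata(is_unique=True, u_ptr=0, enc="delta", d_ptr=3),
    3: BlockMetadata(is_unique=False, u_ptr=0),
}


def resolve(block_index: int) -> int:
    """Return the U_PTR of the unique object that holds this block's contents."""
    return page_metadata[block_index].u_ptr
```

In this sketch a duplicate entry carries only the reference to its unique counterpart, while a unique entry additionally records the compression type and a link back to a duplicate.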
4. The method as defined in claim 3, further comprising processing (600) each individual memory object in the computer memory content by:
- creating a signature (S), the signature being a dense representation of the data values of the memory object;
- traversing a tree data structure (400) to compare the created signature (S) to signatures already represented in the tree data structure (400);
- if the created signature (S) does not match (660) any of the signatures represented in the tree data structure (400): inserting a node in the tree data structure (400); entering the created signature (S) in the inserted node; and generating the metadata (500) for said individual memory object with the information (510) indicating that it is a unique memory object and the unique memory object reference (530, U_PTR) to said individual memory object; and
- if the created signature (S) matches (650) a signature represented in the tree data structure (400): generating the metadata (500) for said individual memory object with the information (510) indicating that it is a duplicate memory object and the unique memory object reference (530, U_PTR) to a unique memory block represented by the matching signature in the tree data structure (400); and updating the metadata (500) of the unique memory block represented by the matching signature in the tree data structure (400) to introduce a duplicate memory object reference (540, D_PTR) to said individual memory object.
5. The method as defined in claim 4, further comprising, when the created signature (S) matches (650) a signature represented in the tree data structure (400):
- determining whether said individual memory object is identical to said unique memory block represented by said matching signature; and
- if said individual memory object and said unique memory block represented by said matching signature are not identical: inserting a node in the tree data structure (400); entering the created signature (S) in the inserted node; and generating the metadata (500) for said individual memory object with the information (510) indicating that it is a unique memory object and the unique memory object reference (530, U_PTR) to said individual memory object.
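The processing of claims 4 and 5 may be sketched as follows, purely as a non-limiting example. The sketch assumes a plain binary search tree keyed by signature, uses CRC-32 as a stand-in for the signature (the claims do not fix a particular function), and simplifies the duplicate reference to a list of block indices; all identifiers are hypothetical.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    signature: int
    block_index: int                 # index of the unique block this node represents
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def signature_of(block: bytes) -> int:
    # Assumption: CRC-32 serves as the dense representation of the block's values.
    return zlib.crc32(block)


def deduplicate(blocks: List[bytes]):
    """Return (metadata list, tree root) for the given blocks."""
    root = None
    meta = []
    for i, block in enumerate(blocks):
        sig = signature_of(block)
        node, parent, went_left, match = root, None, False, None
        while node is not None:
            # A signature match is confirmed by a full comparison (claim 5).
            if sig == node.signature and blocks[node.block_index] == block:
                match = node
                break
            parent = node
            went_left = sig < node.signature   # equal signatures continue to the right
            node = node.left if went_left else node.right
        if match is not None:
            meta.append({"unique": False, "u_ptr": match.block_index})
            meta[match.block_index]["d_ptr"].append(i)
        else:
            new_node = Node(sig, i)
            if parent is None:
                root = new_node
            elif went_left:
                parent.left = new_node
            else:
                parent.right = new_node
            meta.append({"unique": True, "u_ptr": i, "d_ptr": []})
    return meta, root
```

A block whose signature matches but whose contents differ falls through the comparison and is inserted as a new node, which corresponds to the false-positive handling of claim 5.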
6. The method as defined in claim 4 or 5, further comprising, when all memory objects in the computer memory content have been processed (600):
- traversing the tree data structure (400) to generate a value frequency table for the data values of the unique memory objects as represented by the nodes of the tree data structure (400); and
- compressing each unique memory object by an entropy-based compression scheme using the generated value frequency table.
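As a hedged illustration of claim 6, the following sketch builds a value frequency table over the unique blocks and derives Huffman code lengths from it. Huffman coding is chosen here only as one example of an entropy-based scheme; the 4-byte value size and all function names are assumptions of this example.

```python
import heapq
from collections import Counter
from itertools import count


def value_frequency_table(unique_blocks, value_size=4):
    """Count how often each fixed-size value occurs across the unique blocks."""
    freq = Counter()
    for block in unique_blocks:
        for off in range(0, len(block), value_size):
            freq[block[off:off + value_size]] += 1
    return freq


def huffman_code_lengths(freq):
    """Return {value: code length in bits} for a Huffman code built from freq."""
    if len(freq) == 1:
        return {v: 1 for v in freq}
    tie = count()  # tie-breaker so that heap entries never compare the dicts
    heap = [(n, next(tie), {v: 0}) for v, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees makes every value in them one bit longer.
        merged = {v: length + 1 for v, length in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]
```

Frequent values receive short codes, so a page dominated by a few values (for example zero) needs far fewer bits per value than its uncompressed representation.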
7. The method as defined in claim 4 or 5, further comprising, when all memory objects in the computer memory content have been processed (600):
- traversing the tree data structure (400) by examining the data values of the unique memory objects as represented by the nodes of the tree data structure (400) and determining one or more base values; and
- compressing each unique memory object by a delta encoding-based compression scheme using the determined one or more base values.
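A delta encoding-based scheme in the sense of claim 7 may, for example, be sketched as below. The sketch assumes a single base value chosen as the minimum of the examined data values; other base selection policies, and multiple bases, are equally possible. The function names are hypothetical.

```python
from typing import List, Tuple


def choose_base(values: List[int]) -> int:
    # Assumption: the smallest examined value is used as the single base value.
    return min(values)


def delta_encode(values: List[int], base: int) -> Tuple[int, List[int]]:
    """Return (bits needed per delta, deltas relative to base)."""
    deltas = [v - base for v in values]
    width = max(1, max(deltas).bit_length())
    return width, deltas


def delta_decode(base: int, deltas: List[int]) -> List[int]:
    """Reconstruct the original values from the base and the deltas."""
    return [base + d for d in deltas]
```

For instance, four nearby 32-bit values can be stored as one base plus four deltas of a few bits each when the values differ only slightly from the base.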
8. The method as defined in any preceding claim, wherein said data values are of finer granularity than said memory objects, and said memory objects are of finer granularity than said computer memory content.
9. The method as defined in claim 8, wherein said computer memory content is a page of computer memory, said memory objects are memory blocks, and each memory block comprises a plurality of data values.
10. The method as defined in claim 9 when dependent on claim 2, further comprising:
- receiving a read request for a memory block in a memory page having a physical memory page address (PA);
- determining a compressed memory page address (CA) from a look-up table (1010);
- retrieving metadata (1020) for the memory block;
- calculating a compressed memory block address (1040) from the compressed memory page address (CA) and the unique memory object reference (530, U_PTR) of the retrieved metadata;
- retrieving a compressed memory block (1105) at the calculated compressed memory block address (1040); and
- decompressing (1110) the retrieved compressed memory block (1105) using the information (520; 1120, ENC) indicative of the type of compression from the retrieved metadata for the memory block.
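The read path of claim 10 can be illustrated by the following sketch. It assumes that the look-up table, the metadata and the compressed memory are modelled as dictionaries, that U_PTR is an offset relative to the compressed page address, and that each metadata entry carries the compression type; `zlib` is used merely as a stand-in decompressor and is not one of the value-locality-based schemes of the disclosure.

```python
import zlib

# Assumption: compression types are mapped to decompression functions by name.
DECOMPRESSORS = {"zlib": zlib.decompress, "none": lambda data: data}


def read_block(pa, block_index, lookup_table, metadata, compressed_memory):
    """Serve a read request for one block of the page at physical address pa."""
    ca = lookup_table[pa]                      # look-up table (1010): PA -> CA
    entry = metadata[(pa, block_index)]        # metadata (1020) for the block
    address = ca + entry["u_ptr"]              # compressed block address (1040)
    compressed = compressed_memory[address]    # compressed block (1105)
    return DECOMPRESSORS[entry["enc"]](compressed)   # decompression (1110)
```

Because a duplicate's U_PTR refers to the location of its unique counterpart, a read of a duplicate block follows the same steps and returns the same decompressed contents.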
11. The method as defined in claim 9 or 10 when dependent on claim 3, further comprising:
- receiving a write-back request involving an update of a unique memory block (830) at an original memory location;
- copying the unique memory block (830) prior to update to a new memory location (870) in a dedicated free memory area (840) of the computer memory content;
- updating the metadata of duplicate memory blocks (810, 820) linked to the unique memory block (830) such that the duplicate memory object references (540, D_PTR) thereof are redirected to the new memory location (870) in the dedicated free memory area (840); and
- updating the unique memory block at its original memory location (830) in accordance with the write-back request.
12. The method as defined in claim 11, further comprising:
- providing metadata which includes a reference (F_PTR) to a starting address (805) of the dedicated free memory area (840); and
- updating said reference (F_PTR) to reflect a new starting address (880) after the copying of the unique memory block (830) to the new memory location (870) in the dedicated free memory area (840).
13. The method as defined in claim 4 and any of claim 9 or 10, further comprising:
- receiving a write-back request involving an update of a unique memory block (830) at an original memory location;
- finding a deduplicated memory block (820) being a duplicate of the unique memory block (830);
- promoting the found deduplicated memory block (820) as unique in the tree data structure (400) by using the signature (S) of the unique memory block (830);
- writing updated contents of the unique memory block (830) according to the write-back request to a new memory location (870) in a dedicated free memory area (840); and
- updating the metadata of the unique memory block (830) such that the unique memory object reference (530, U_PTR) thereof is redirected to the new memory location (870) in the dedicated free memory area (840), whereas any duplicate memory object references (540, D_PTR) thereof are removed.
14. The method as defined in claim 13, further comprising:
- providing metadata which includes a reference (F_PTR) to a starting address (805) of the dedicated free memory area (840); and
- updating said reference (F_PTR) to reflect a new starting address (880) after the writing of the updated contents of the unique memory block (830) according to the write-back request to the new memory location (870) in the dedicated free memory area (840).
15. The method as defined in any of claims 9-14, further comprising:
- receiving a write-back request involving an update of a duplicate memory block (920);
- storing the contents of the updated duplicate memory block as a new unique memory block (980) in a dedicated free memory area (940); and
- updating the metadata of a unique memory block (910) previously linked to the duplicate memory block (920) to reflect that said unique memory block (910) is no longer linked to said duplicate memory block (920) while maintaining any links between said unique memory block (910) and other duplicate memory blocks (930).
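The handling of a write-back to a duplicate memory block according to claim 15 might look as follows in a simplified model. The page layout (a dictionary with `storage`, `meta` and `free_ptr` entries), the list-valued `d_ptrs` field and the storing of the new block in uncompressed form are assumptions made only to keep the example short.

```python
def write_back_duplicate(page, dup_index, new_data):
    """Turn an updated duplicate block into a new unique block in the free area."""
    meta = page["meta"]
    dup = meta[dup_index]
    # Find the unique block (910) that the duplicate (920) was linked to.
    owner = next(e for e in meta.values()
                 if e["unique"] and e["u_ptr"] == dup["u_ptr"])
    owner["d_ptrs"].remove(dup_index)          # unlink only this duplicate
    new_addr = page["free_ptr"]                # next location in the free area (940)
    page["storage"][new_addr] = new_data       # new unique block (980)
    page["free_ptr"] = new_addr + len(new_data)
    meta[dup_index] = {"unique": True, "u_ptr": new_addr, "d_ptrs": []}
```

Links between the original unique block and its other duplicates are left untouched, so those duplicates continue to resolve to the unchanged contents.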
16. The method as defined in any of claims 11-15, further comprising:
- monitoring compression ratio over time for a memory page; and
- if the compression ratio does not satisfy a given criterion, performing recompression of the memory page by performing the functionality of the method as defined in any of claims 1-10.
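The criterion of claim 16 is not further specified; as one possible example, recompression could be triggered when the ratio of uncompressed to compressed page size drops below a threshold. The threshold of 1.5 and the function names below are assumptions of this sketch.

```python
def compression_ratio(uncompressed_size: int, compressed_size: int) -> float:
    """Ratio of original page size to its current compressed size."""
    return uncompressed_size / compressed_size


def needs_recompression(uncompressed_size: int, compressed_size: int,
                        min_ratio: float = 1.5) -> bool:
    # Assumption: the given criterion is a fixed minimum compression ratio.
    return compression_ratio(uncompressed_size, compressed_size) < min_ratio
```

A 4096-byte page whose compressed footprint has grown to 3072 bytes after a series of write-backs would, under this assumed threshold, be selected for recompression, whereas a footprint of 2048 bytes would not.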
17. The method as defined in any of claims 15-16, further comprising:
- periodically performing recompression of a memory page to improve compression ratio by performing the functionality of the method as defined in any of claims 1-10.
18. The method as defined in any preceding claim, wherein said compressing by exploiting data value locality involves one of:
- entropy-based encoding;
- delta encoding;
- dictionary-based encoding; and
- pattern-based encoding.
19. A computer memory compression device (205), comprising:
- an analyzer unit (214) configured for analyzing computer memory content with respect to occurrence of duplicate memory objects as well as value redundancy of data values in unique memory objects;
- an encoder unit (212) configured for encoding said computer memory content by eliminating said duplicate memory objects and compressing each remaining unique memory object by exploiting data value locality of the data values thereof, the encoder unit (212) further being configured for providing metadata (500) representing the memory objects of the encoded computer memory content, wherein the metadata (500) reflects eliminated duplicate memory objects, remaining unique memory objects as well as a type of compression used for compressing each remaining unique memory object; and
- a locator unit (211) configured for locating a memory object in the encoded computer memory content using said metadata (500).
20. The device as defined in claim 19, wherein the metadata (500) comprises, for each memory object of the encoded computer memory content:
- information (510) indicative of the memory object being an eliminated duplicate memory object or a remaining unique memory object;
- when the memory object is a unique memory object, information (520) indicative of the type of compression used and a unique memory object reference (530, U_PTR) to the unique memory object; and
- when the memory object is a duplicate memory object, a unique memory object reference (530, U_PTR) to a unique memory object, the non-compressed contents of which are identical to the duplicate memory object.
21. The device as defined in claim 20, wherein the metadata (500) further comprises, for each memory object being a unique memory object, a duplicate memory object reference (540, D_PTR) to an eliminated duplicate memory object, the non-compressed contents of which are identical to the unique memory object.
22. The device as defined in claim 21, wherein the analyzer unit (214) and the encoder unit (212) are configured for processing (600) each individual memory object in the computer memory content by:
- creating a signature (S), the signature being a dense representation of the data values of the memory object;
- traversing a tree data structure (400) to compare the created signature (S) to signatures already represented in the tree data structure (400);
- if the created signature (S) does not match (660) any of the signatures represented in the tree data structure (400): inserting a node in the tree data structure (400); entering the created signature (S) in the inserted node; and generating the metadata (500) for said individual memory object with the information (510) indicating that it is a unique memory object and the unique memory object reference (530, U_PTR) to said individual memory object; and
- if the created signature (S) matches (650) a signature represented in the tree data structure (400): generating the metadata (500) for said individual memory object with the information (510) indicating that it is a duplicate memory object and the unique memory object reference (530, U_PTR) to a unique memory block represented by the matching signature in the tree data structure (400); and updating the metadata (500) of the unique memory block represented by the matching signature in the tree data structure (400) to introduce a duplicate memory object reference (540, D_PTR) to said individual memory object.
23. The device as defined in claim 22, wherein the analyzer unit (214) and the encoder unit (212) are further configured for, when the created signature (S) matches (650) a signature represented in the tree data structure (400):
- determining whether said individual memory object is identical to said unique memory block represented by said matching signature; and
- if said individual memory object and said unique memory block represented by said matching signature are not identical: inserting a node in the tree data structure (400); entering the created signature (S) in the inserted node; and generating the metadata (500) for said individual memory object with the information (510) indicating that it is a unique memory object and the unique memory object reference (530, U_PTR) to said individual memory object.
24. The device as defined in claim 23, wherein the analyzer unit (214) and the encoder unit (212) are further configured for, when all memory objects in the computer memory content have been processed (600):
- traversing the tree data structure (400) to generate a value frequency table for the data values of the unique memory objects as represented by the nodes of the tree data structure (400); and
- compressing each unique memory object by an entropy-based compression scheme using the generated value frequency table.
25. The device as defined in claim 23, wherein the analyzer unit (214) and the encoder unit (212) are further configured for, when all memory objects in the computer memory content have been processed (600):
- traversing the tree data structure (400) by examining the data values of the unique memory objects as represented by the nodes of the tree data structure (400) and determining one or more base values; and
- compressing each unique memory object by a delta encoding-based compression scheme using the determined one or more base values.
26. The device as defined in any of claims 19-25, wherein said data values are of finer granularity than said memory objects, and said memory objects are of finer granularity than said computer memory content.
27. The device as defined in claim 26, wherein said computer memory content is a page of computer memory, said memory objects are memory blocks, and each memory block comprises a plurality of data values.
28. The device as defined in claim 27 when dependent on claim 20, further comprising a decompressor unit (213; 1110) and being configured for:
- receiving a read request for a memory block in a memory page having a physical memory page address (PA);
- determining a compressed memory page address (CA) from a look-up table (1010);
- retrieving metadata (1020) for the memory block;
- calculating a compressed memory block address (1040) from the compressed memory page address (CA) and the unique memory object reference (530, U_PTR) of the retrieved metadata;
- retrieving a compressed memory block (1105) at the calculated compressed memory block address (1040); and
- decompressing, by the decompressor unit (213), the retrieved compressed memory block (1105) using the information (520; 1120, ENC) indicative of the type of compression from the retrieved metadata for the memory block.
29. The device as defined in claim 27 or 28 when dependent on claim 21, further configured for:
- receiving a write-back request involving an update of a unique memory block (830) at an original memory location;
- copying the unique memory block (830) prior to update to a new memory location (870) in a dedicated free memory area (840) of the computer memory content;
- updating the metadata of duplicate memory blocks (810, 820) linked to the unique memory block (830), such that the duplicate memory object references (540, D_PTR) thereof are redirected to the new memory location (870) in the dedicated free memory area (840); and
- updating the unique memory block at its original memory location (830) in accordance with the write-back request.
30. The device as defined in claim 29, further configured for:
- providing metadata which includes a reference (F_PTR) to a starting address (805) of the dedicated free memory area (840); and
- updating said reference (F_PTR) to reflect a new starting address (880) after the copying of the unique memory block (830) to the new memory location (870) in the dedicated free memory area (840).
31. The device as defined in claim 22 and any of claim 27 or 28, further configured for:
- receiving a write-back request involving an update of a unique memory block (830) at an original memory location;
- finding a deduplicated memory block (820) being a duplicate of the unique memory block (830);
- promoting the found deduplicated memory block (820) as unique in the tree data structure (400) by using the signature (S) of the unique memory block (830);
- writing updated contents of the unique memory block (830) according to the write-back request to a new memory location (870) in a dedicated free memory area (840); and
- updating the metadata of the unique memory block (830) such that the unique memory object reference (530, U_PTR) thereof is redirected to the new memory location (870) in the dedicated free memory area (840), whereas any duplicate memory object references (540, D_PTR) thereof are removed.
32. The device as defined in claim 31, further configured for:
- providing metadata which includes a reference (F_PTR) to a starting address (805) of the dedicated free memory area (840); and
- updating said reference (F_PTR) to reflect a new starting address (880) after the writing of the updated contents of the unique memory block (830) according to the write-back request to the new memory location (870) in the dedicated free memory area (840).
33. The device as defined in any of claims 27-32, further configured for:
- receiving a write-back request involving an update of a duplicate memory block (920);
- storing the contents of the updated duplicate memory block as a new unique memory block (980) in a dedicated free memory area (940); and
- updating the metadata of a unique memory block (910) previously linked to the duplicate memory block (920) to reflect that said unique memory block (910) is no longer linked to said duplicate memory block (920) while maintaining any links between said unique memory block (910) and other duplicate memory blocks (930).
34. The device as defined in any of claim 32 or 33, further configured for:
- monitoring compression ratio over time for a memory page; and
- if the compression ratio does not satisfy a given criterion, performing recompression of the memory page by performing the functionality of the method as defined in any of claims 1-10.
35. The device as defined in any of claims 32-34, further configured for:
- periodically performing recompression of a memory page to improve compression ratio by performing the functionality of the method as defined in any of claims 1-9.
36. The device as defined in any of claims 19-35, wherein said encoder unit (212) is configured for compressing by exploiting data value locality by applying one of:
- entropy-based compression; and
- base-delta-immediate compression.
37. A computer system (200) comprising:
- one or more processors (P1... PN);
- one or more computer memories (M1-Mk; C1-C3); and
- a computer memory compression device (205) according to any of claims 19-36.
Type: Application
Filed: Jan 9, 2020
Publication Date: Mar 9, 2023
Inventors: Angelos Arelakis (Göteborg), Per Stenström (Torslanda)
Application Number: 17/421,800