TIERED MEMORY MANAGEMENT SYSTEM
A tiered memory system includes a tiered memory management system that is coupled to a first memory subsystem associated with a first memory subsystem tier, and a second memory subsystem associated with a second memory subsystem tier that is different than the first memory subsystem tier. The tiered memory management system monitors a health of the first memory subsystem associated with the first memory subsystem tier and the second memory subsystem associated with the second memory subsystem tier. When the tiered memory management system identifies a health issue with the first memory subsystem associated with the first memory subsystem tier, it moves data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier.
The present disclosure relates generally to information handling systems, and more particularly to the management of tiered memory used by information handling systems.
As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
Information handling systems such as, for example, server devices, desktop computing devices, laptop/notebook computing devices, tablet computing devices, mobile phones, and/or other computing devices known in the art, sometimes use tiered memory systems for the storage and use of data. For example, a plurality of memory subsystems that utilize any of a plurality of different types of storage media may be included in computing devices, coupled to a network, and/or may be otherwise accessible by any of the computing devices that utilize the tiered memory system, and may be assigned to provide for the storage and/or utilization of data by any of those computing devices based on a range of factors such as cost, availability, performance, data recovery capabilities, and/or other tiered memory subsystem factors. In addition, conventional tiered memory systems monitor data access frequency, and operate to move data with relatively high access frequency to relatively higher memory subsystem tiers, while moving data with relatively low access frequency to relatively lower memory subsystem tiers.
As will be appreciated by one of skill in the art, tiered memory systems allow for the provisioning of relatively larger memory footprints and relatively lower overall costs per memory capacity, but also introduce issues. For example, tiered memory systems are associated with relatively high complexity and introduce relatively more memory locations where failures or other memory unavailability may occur that can render the tiered memory system unavailable. For example, when memory subsystems in the tiered memory system utilize particular persistent memory technologies that are subject to wearing out (e.g., due to repeated data writes to those memory subsystems), failure or other unavailability of a single memory subsystem used to provide an operating system or other application due to wear can result in failure or other unavailability of that operating system or other application.
Accordingly, it would be desirable to provide a tiered memory system that addresses the issues discussed above.
SUMMARY
According to one embodiment, an Information Handling System (IHS) includes a processing system; and a memory system that is coupled to the processing system and that includes instructions that, when executed by the processing system, cause the processing system to provide a tiered memory management engine that is configured to: monitor a health of a first memory subsystem associated with a first memory subsystem tier and a second memory subsystem associated with a second memory subsystem tier that is different than the first memory subsystem tier; identify a health issue with the first memory subsystem associated with the first memory subsystem tier; and move, in response to identifying the health issue with the first memory subsystem, data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier.
For purposes of this disclosure, an information handling system may include any instrumentality or aggregate of instrumentalities operable to compute, calculate, determine, classify, process, transmit, receive, retrieve, originate, switch, store, display, communicate, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, or other purposes. For example, an information handling system may be a personal computer (e.g., desktop or laptop), tablet computer, mobile device (e.g., personal digital assistant (PDA) or smart phone), server (e.g., blade server or rack server), a network storage device, or any other suitable device and may vary in size, shape, performance, functionality, and price. The information handling system may include random access memory (RAM), one or more processing resources such as a central processing unit (CPU) or hardware or software control logic, ROM, and/or other types of nonvolatile memory. Additional components of the information handling system may include one or more disk drives, one or more network ports for communicating with external devices as well as various input and output (I/O) devices, such as a keyboard, a mouse, touchscreen and/or a video display. The information handling system may also include one or more buses operable to transmit communications between the various hardware components.
In one embodiment, IHS 100,
Referring now to
In the illustrated embodiment, the computing devices 202a-202c are coupled to a network 204 that may be provided by a Local Area Network (LAN), the Internet, combinations thereof, and/or any other networks that would be apparent to one of skill in the art in possession of the present disclosure. In the illustrated embodiment, a network-attached memory system 206 is coupled to the computing devices 202a-202c via the network 204. In an embodiment, the network-attached memory system 206 may be provided by the IHS 100 discussed above with reference to
Referring now to
The chassis 302 may also house a storage system (not illustrated, but which may include the storage 108 discussed above with reference to
The chassis 302 may also house a communication system 310 that is coupled to the computing engine 304 (e.g., via a coupling between the communication system 310 and the processing system) and that may be provided by a Network Interface Controller (NIC), wireless communication systems (e.g., BLUETOOTH®, Near Field Communication (NFC) components, WiFi components, etc.), and/or any other communication components that would be apparent to one of skill in the art in possession of the present disclosure. In the illustrated embodiment, the chassis 302 also houses a tiered memory management system 312, discussed in further detail below with reference to
In a specific example, the tiered memory management system 312 may be provided using software in a memory management system for the memory system 308, a hypervisor provided by the computing engine 304, an operating system provided by the computing engine 304, and/or other software enabled systems in the computing device 300. As such, the tiered memory management system 312 may be provided using hardware such as a co-processor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), and/or other hardware subsystem that has access (e.g., control plane access) to Direct Memory Access (DMA) engines, memory devices, and/or other hardware/software that one of skill in the art in possession of the present disclosure would recognize as enabling the functionality described below.
However, while the tiered memory management system 312 is illustrated and described as being included in the computing device 300, one of skill in the art in possession of the present disclosure will appreciate how the tiered memory management system of the present disclosure may be coupled to any of the computing devices 202a-202c/300 via the network 204 (e.g., as a stand-alone device, as part of the network-attached memory system 206, included in a different computing device 202a-202c/300, etc.) while remaining within the scope of the present disclosure as well. As such, while a specific computing device 300 has been illustrated and described, one of skill in the art in possession of the present disclosure will recognize that computing devices (or other devices operating according to the teachings of the present disclosure in a manner similar to that described below for the computing device 300) may include a variety of components and/or component configurations for providing conventional computing device functionality, as well as the functionality discussed below, while remaining within the scope of the present disclosure as well.
Referring now to
As such, the chassis 402 may support/house a processing system (not illustrated, but which may include the processor 102 discussed above with reference to
The chassis 402 may also support/house a storage system (not illustrated, but which may include the storage 108 discussed above with reference to
The chassis 402 may also support/house one or more data mover devices 408 that are coupled to the tiered memory management engine 404 (e.g., via a coupling between the data mover device(s) 408 and the processing system) and to the communication system 410. However, while the data mover device(s) 408 are described as being supported/housed by the chassis 402 of the tiered memory management system 400, one of skill in the art in possession of the present disclosure will appreciate how the data mover device(s) 408 may be included in the computing device 300 and coupled to the tiered memory management system 400, coupled to the tiered memory management system 400 via the network 204, and/or otherwise accessible to the tiered memory management engine 404 in a variety of other manners that will fall within the scope of the present disclosure. As such, while a specific tiered memory management system 400 has been illustrated and described, one of skill in the art in possession of the present disclosure will recognize that tiered memory management systems (or other devices operating according to the teachings of the present disclosure in a manner similar to that described below for the tiered memory management system 400) may include a variety of components and/or component configurations for providing conventional functionality, as well as the functionality discussed below, while remaining within the scope of the present disclosure as well.
Referring now to
The method 500 begins at block 502 where a tiered memory management system defines a tiered memory system associating memory subsystem types with different memory subsystem tiers. In an embodiment, at block 502, the tiered memory management engine 404 in the tiered memory management system 312/400 may define a tiered memory system for the computing device 300 that associates memory subsystem types with different memory subsystem tiers. With reference to
As illustrated, the tiered memory system 600 includes a first memory subsystem tier 602 that is associated with on-package memory subsystems that may be included in the memory system 308 of the computing device 202a/300 and that may be provided by, for example, on-package High Bandwidth Memory (HBM) devices defined by Joint Electron Device Engineering Council (JEDEC) standards and including Dynamic Random Access Memory (DRAM) memory technology using Through-Silicon Vias (TSVs) to interconnect stacked DRAM die, as well as other on-package memory subsystems that would be apparent to one of skill in the art in possession of the present disclosure. The tiered memory system 600 also includes a second memory subsystem tier 604 that is associated with direct DRAM subsystems that may be included in the memory system 308 of the computing device 202a/300 and that may be provided by, for example, Double Data Rate (DDR) memory devices and/or other direct DRAM subsystems that would be apparent to one of skill in the art in possession of the present disclosure. The tiered memory system 600 also includes a third memory subsystem tier 606 that is associated with local Compute Express Link (CXL) DRAM subsystems that may be included in the memory system 308 of the computing device 202a/300.
The tiered memory system 600 also includes a fourth memory subsystem tier 608 that is associated with local CXL persistent DRAM subsystems that may be included in the memory system 308 of the computing device 202a/300. The tiered memory system 600 also includes a fifth memory subsystem tier 610 that is associated with local CXL Storage Class Memory (SCM) subsystems that may be included in the memory system 308 of the computing device 202a/300 and that may be provided by, for example, OPTANE® memory devices available from INTEL® Corporation of Santa Clara, California, United States, as well as other local CXL SCM subsystems that would be apparent to one of skill in the art in possession of the present disclosure. The tiered memory system 600 also includes a sixth memory subsystem tier 612 that is associated with external CXL memory subsystems that may be included in the computing devices 202b-202c and accessible to the computing device 202a/300, and that may be provided by, for example, OPTANE® memory devices available from INTEL® Corporation of Santa Clara, California, United States; DRAM devices; SCM devices; and/or other external CXL memory subsystems that would be apparent to one of skill in the art in possession of the present disclosure.
The tiered memory system 600 also includes a seventh memory subsystem tier 614 that is associated with local Non-Volatile Memory Express (NVMe) memory subsystems that may be included in the computing device 202a, and that may be provided by, for example, OPTANE® memory devices available from INTEL® Corporation of Santa Clara, California, United States; NAND memory devices in NVMe Solid State Drive (SSD) storage devices; and/or other local NVMe memory subsystems that would be apparent to one of skill in the art in possession of the present disclosure. The tiered memory system 600 also includes an eighth memory subsystem tier 616 that is associated with NVMe over Fabric (NVMe-oF) memory subsystems that may be included in the network-attached memory system 206 and accessible to the computing device 202a via the network 204, and that may be provided by, for example, OPTANE® memory devices available from INTEL® Corporation of Santa Clara, California, United States; NAND memory devices in NVMe Solid State Drive (SSD) storage devices; and/or other NVMe-oF memory subsystems that would be apparent to one of skill in the art in possession of the present disclosure.
As discussed above, the different memory subsystem tiers in a tiered memory system may be ranked relative to each other based on cost factors, availability factors, performance factors, capability factors, and/or other tiered memory subsystem factors that would be apparent to one of skill in the art in possession of the present disclosure. For example, performance factors used to rank different memory subsystem tiers in a tiered memory system may include memory subsystem latency, memory subsystem bandwidth, memory subsystem power consumption, memory subsystem write endurance, and/or other performance factors that would be apparent to one of skill in the art in possession of the present disclosure. In another example, capability factors used to rank different memory subsystem tiers in a tiered memory system may include data recovery capabilities of memory subsystems, Reliability/Availability/Serviceability (RAS) capabilities of memory subsystems, data persistence capabilities of memory subsystems, metadata support capabilities of memory subsystems, and/or other capabilities factors that would be apparent to one of skill in the art in possession of the present disclosure.
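The eight memory subsystem tiers described above can be sketched as an ordered enumeration. This is a minimal illustration only: the `Tier` names and the helper functions are hypothetical, and the sketch assumes that a lower tier number denotes a higher-ranked tier, which matches the later description of the fifth memory subsystem tier 610 as a lower tier than the second memory subsystem tier 604.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Memory subsystem tiers in the order described; 1 is assumed to rank highest."""
    ON_PACKAGE = 1                  # e.g., on-package HBM (tier 602)
    DIRECT_DRAM = 2                 # e.g., DDR memory devices (tier 604)
    LOCAL_CXL_DRAM = 3              # tier 606
    LOCAL_CXL_PERSISTENT_DRAM = 4   # tier 608
    LOCAL_CXL_SCM = 5               # tier 610
    EXTERNAL_CXL = 6                # tier 612
    LOCAL_NVME = 7                  # tier 614
    NVME_OF = 8                     # tier 616


def is_lower_tier(candidate: Tier, reference: Tier) -> bool:
    """Return True when `candidate` ranks below `reference`."""
    return candidate > reference


def lower_tiers(reference: Tier) -> list[Tier]:
    """List every tier ranked below `reference`, nearest tier first."""
    return [tier for tier in Tier if tier > reference]
```

A real ranking could weigh the cost, availability, performance, and capability factors listed above rather than rely on a fixed order; the fixed order is used here only to keep the sketch short.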
As such, one of skill in the art in possession of the present disclosure will appreciate how tiered memory system 600 illustrated in
The method 500 then proceeds to block 504 where the tiered memory management system configures one or more memory subsystems in the tiered memory system to store data for one or more computing subsystems. In an embodiment, at block 504, the tiered memory management engine 404 in the tiered memory management system 312/400 may configure memory subsystem(s) in the memory system 308 of the computing devices 202a/300, in the computing devices 202b-202c, and/or in the network-attached memory system 206 to store data for one or more computing subsystems provided by the computing engine 304. With reference to
To provide a specific example, the computing subsystems provided by the computing engine 304 may include an operating system and/or one or more applications, although other computing subsystems that utilize memory subsystems will fall within the scope of the present disclosure as well. In some embodiments, at block 504, the tiered memory management engine 404 in the tiered memory management system 312/400 may define a plurality of logical memory address spaces such as, for example, “bins” that represent the granularity at which the tiered memory management engine 404 and/or the data mover device(s) 408 operate, and/or any other logical memory address spaces that would be apparent to one of skill in the art in possession of the present disclosure. The tiered memory management engine 404 may then assign subsets of the logical memory address spaces to one or more computing subsystems provided by the computing engine 304 (e.g., assign a first subset of the logical memory address spaces to an operating system, assign a second subset of the logical memory address spaces to a first application, assign a third subset of the logical memory address spaces to a second application, and so on).
The tiered memory management engine 404 in the tiered memory management system 312/400 may then associate each logical memory address space with one or more of the plurality of memory subsystems, and one of skill in the art in possession of the present disclosure will appreciate how the association of any logical memory address space with memory subsystem(s) may be based, at least in part, on the computing subsystem that was assigned that logical memory address space, the data provided for storage by the computing subsystem that was assigned that logical memory address space, the memory subsystem tier in which that memory subsystem is included, as well as any other memory subsystem/logical address space association factors that would be apparent to one of skill in the art in possession of the present disclosure. For example, any logical memory address space may be associated with a memory subsystem in a memory subsystem tier that includes characteristics desired for the data that will be stored in that logical memory address space by the computing subsystem to which it is assigned.
With reference to
For example, as illustrated in
As such, one of skill in the art in possession of the present disclosure will appreciate how the assignment of logical memory address space(s) to a computing subsystem and the association of any of those logical memory address space(s) with a memory subsystem will configure that memory subsystem to store data for that computing subsystem. However, while a specific example of the configuration of memory subsystems to store data for computing subsystems has been illustrated and described, one of skill in the art in possession of the present disclosure will appreciate how other techniques for configuring the memory subsystems to store data for computing subsystems will fall within the scope of the present disclosure as well. As will be appreciated by one of skill in the art in possession of the present disclosure, following block 504, the computing subsystems in the computing device 202a/300 may operate to store and utilize data in the memory subsystems associated with their assigned logical memory address spaces.
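The assignment of logical memory address spaces ("bins") to computing subsystems, and their association with memory subsystems, can be sketched as a small lookup table. The `Bin` and `BinTable` names are hypothetical, as are the owners given to each bin; only the pairing of logical memory address spaces 2 and 5 with the DDR1 memory subsystem and of logical memory address space 1 with the LCS2 memory subsystem comes from the examples in this disclosure.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """A logical memory address space (hypothetical representation)."""
    bin_id: int
    owner: str       # computing subsystem assigned this bin (illustrative)
    subsystem: str   # memory subsystem currently associated with this bin


class BinTable:
    """Tracks which computing subsystem owns each bin and where it is stored."""

    def __init__(self) -> None:
        self._bins: dict[int, Bin] = {}

    def assign(self, bin_id: int, owner: str, subsystem: str) -> None:
        """Assign a bin to a computing subsystem and associate it with a memory subsystem."""
        self._bins[bin_id] = Bin(bin_id, owner, subsystem)

    def bins_on(self, subsystem: str) -> list[int]:
        """Return the ids of the bins associated with `subsystem`, in ascending order."""
        return sorted(b.bin_id for b in self._bins.values() if b.subsystem == subsystem)

    def subsystems_for(self, owner: str) -> set[str]:
        """Return every memory subsystem configured to store data for `owner`."""
        return {b.subsystem for b in self._bins.values() if b.owner == owner}


table = BinTable()
table.assign(2, "operating system", "DDR1")   # owners are assumptions
table.assign(5, "application A", "DDR1")
table.assign(1, "application A", "LCS2")
```

With this table, `table.bins_on("DDR1")` identifies the bins that would need to move if the DDR1 memory subsystem developed a health issue.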
The method 500 then proceeds to decision block 506 where it is determined whether a health issue exists in a first memory subsystem. In an embodiment, at decision block 506, the tiered memory management engine 404 in the tiered memory management system 312/400 may monitor the plurality of memory subsystems utilized in the tiered memory system 600 in order to determine whether a health issue exists in any of those memory subsystems. For example, the tiered memory management engine 404 may be configured to receive write telemetry data, error telemetry data, and/or other health telemetry data that would be apparent to one of skill in the art in possession of the present disclosure, from each of the memory subsystems utilized in the tiered memory system 600. At decision block 506, the tiered memory management engine 404 may analyze any health telemetry data received from any of the memory subsystems in the tiered memory system 600 in order to determine whether a health issue exists in that memory subsystem by, for example, determining whether that health telemetry data reaches or exceeds a threshold. However, while the use of thresholds to determine a health issue is described below, one of skill in the art in possession of the present disclosure will appreciate how the health issues discussed below may be determined using other techniques that will fall within the scope of the present disclosure as well.
For example, at decision block 506, the tiered memory management engine 404 in the tiered memory management system 312/400 may use the health telemetry data to determine whether a threshold number of errors have occurred in a memory subsystem, whether a threshold number of warnings have been received from a memory subsystem, whether a threshold number of write retries have been attempted when writing data to a memory subsystem, whether a memory subsystem includes a threshold number of “bad” pages, whether a memory subsystem includes a threshold number of “bad” blocks (e.g., in an SCM memory subsystem, a flash memory subsystem, and/or other storage-device-like memory subsystem), whether a memory subsystem has been written to a threshold number of times, whether a memory subsystem has experienced a threshold number of correctable errors in a particular subset of the memory subsystem (e.g., a rank, page, or other subset of the memory subsystem), and/or whether a variety of other thresholds have been reached that may be indicative of health issues in a memory subsystem. However, while the monitoring for several specific examples of health issues in memory subsystems has been described, one of skill in the art in possession of the present disclosure will appreciate how any of a variety of memory subsystem health issues may be monitored for at decision block 506 while remaining within the scope of the present disclosure.
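The threshold comparison at decision block 506 can be sketched as follows. The `HealthTelemetry` fields mirror several of the example metrics above, but the structure, the function name, and every threshold value are assumptions chosen for illustration; the disclosure does not specify particular limits.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HealthTelemetry:
    """Counters reported by one memory subsystem (hypothetical layout)."""
    errors: int = 0
    warnings: int = 0
    write_retries: int = 0
    bad_pages: int = 0
    bad_blocks: int = 0
    writes: int = 0


# Illustrative limits only; real values would depend on the memory technology.
THRESHOLDS = HealthTelemetry(
    errors=100,
    warnings=50,
    write_retries=20,
    bad_pages=8,
    bad_blocks=8,
    writes=1_000_000,
)


def health_issues(telemetry: HealthTelemetry,
                  thresholds: HealthTelemetry = THRESHOLDS) -> list[str]:
    """Return the names of the metrics that reach or exceed their threshold."""
    return [
        f.name
        for f in fields(telemetry)
        if getattr(telemetry, f.name) >= getattr(thresholds, f.name)
    ]
```

An empty result corresponds to the method 500 returning to decision block 506, while a non-empty result corresponds to proceeding to block 508.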
If, at decision block 506, it is determined that a health issue does not exist in the first memory subsystem, the method 500 returns to decision block 506. As such, the method 500 may loop such that the tiered memory management engine 404 in the tiered memory management system 312/400 monitors the health of the memory subsystems while the computing subsystems in the computing device 202a/300 operate to store and utilize data in the memory subsystems associated with their assigned logical memory address spaces.
If, at decision block 506, it is determined that a health issue exists in the first memory subsystem, the method 500 proceeds to block 508 where the tiered memory management system moves data stored in the first memory subsystem associated with a first memory subsystem tier to a second memory subsystem associated with a second memory subsystem tier that is different than the first memory subsystem tier. With reference to
With reference to
In different embodiments, the movement of the data from the DDR1 memory subsystem to the LCS2 and LCS3 memory subsystems in response to being marked for memory subsystem tier movement may be performed using a variety of tiered memory system data movement techniques that would be apparent to one of skill in the art in possession of the present disclosure. For example, one of skill in the art in possession of the present disclosure will appreciate how the marking of the data in the DDR1 memory subsystem for memory subsystem tier movement may include “demoting” the logical memory address spaces 2 and 5 associated with the DDR1 memory subsystem that has reached the memory subsystem failure threshold to a lower memory subsystem tier, and thus may result in the movement of the data stored in the logical memory address spaces 2 and 5 to memory subsystem(s) in lower memory subsystem tiers like the LCS2 and LCS3 memory subsystems in the fifth memory subsystem tier 610 in the illustrated examples. However, one of skill in the art in possession of the present disclosure will recognize how tiered memory system data movement techniques may be utilized to move the data stored in the logical memory address spaces 2 and 5 from the DDR1 memory subsystem to memory subsystem(s) in lower memory subsystem tiers, the same memory subsystem tier, and/or higher memory subsystem tiers (e.g., when such memory subsystems have available space) while remaining within the scope of the present disclosure as well.
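The "demotion" described above can be sketched as a remapping of logical memory address spaces away from the affected memory subsystem. The function name, the dictionary layout, and the first-fit choice of target are all assumptions; the disclosure states only that the data in logical memory address spaces 2 and 5 moves from the DDR1 memory subsystem to the LCS2 and LCS3 memory subsystems, not which space lands on which target.

```python
def plan_demotion(bin_to_subsystem: dict[int, str],
                  failing: str,
                  free_bins_by_target: dict[str, int]) -> dict[int, str]:
    """Return a new mapping in which every bin on `failing` is reassigned.

    Targets are tried in the order given (first fit), and each target accepts
    as many bins as its free capacity allows. The inputs are not modified.
    """
    remaining = dict(free_bins_by_target)   # copy so the caller's view is untouched
    plan = dict(bin_to_subsystem)
    marked = sorted(b for b, s in bin_to_subsystem.items() if s == failing)
    for bin_id in marked:
        target = next((t for t, free in remaining.items() if free > 0), None)
        if target is None:
            raise RuntimeError(f"no target capacity left for bin {bin_id}")
        plan[bin_id] = target
        remaining[target] -= 1
    return plan


mapping = {1: "LCS2", 2: "DDR1", 5: "DDR1"}
plan = plan_demotion(mapping, "DDR1", {"LCS2": 1, "LCS3": 1})
```

In this sketch the plan only records the new associations; the actual copy of the data would be carried out by the data mover device(s) 408 or an equivalent mechanism.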
As such, with reference to
The method 500 then proceeds to block 510 where the tiered memory management system causes the computing subsystem that was storing data in the first memory subsystem to store data in the second memory subsystem. With reference to
For example, when the computing device 300 is booted, initialized, or otherwise configured, the memory subsystems illustrated
As will be appreciated by one of skill in the art in possession of the present disclosure, following the movement of the data to the LCS2 and LCS3 memory subsystems, that data may be later moved to another memory subsystem tier as per conventional tiered memory system operations. For example, in the event the access frequency of that data reaches or exceeds a threshold, that data may be moved from the LCS2 and LCS3 memory subsystems in the fifth memory subsystem tier 610 to at least one memory subsystem in a higher memory subsystem tier in the tiered memory system 600, and one of skill in the art in possession of the present disclosure will appreciate how that data may be moved to at least one memory subsystem in a lower memory subsystem tier in a similar manner.
With reference to
With reference to
In different embodiments, the movement of the data stored in the logical memory address space 1 associated with the LCS2 memory subsystem to the ECM0 memory subsystem in response to being marked for memory subsystem tier movement may be performed using a variety of tiered memory system data movement techniques that would be apparent to one of skill in the art in possession of the present disclosure. For example, one of skill in the art in possession of the present disclosure will appreciate how the marking of the data in the logical memory address space 1 associated with the LCS2 memory subsystem for memory subsystem tier movement may include “demoting” the logical memory address space 1 associated with the LCS2 memory subsystem that has reached the relative memory subsystem write threshold to a lower memory subsystem tier, and thus may result in the movement of the data stored in the logical memory address space 1 to memory subsystem(s) in lower memory subsystem tiers like the ECM0 memory subsystem in the sixth memory subsystem tier 612 in the illustrated examples. However, one of skill in the art in possession of the present disclosure will recognize how tiered memory system data movement techniques may be utilized to move the data stored in the logical memory address space 1 from the LCS2 memory subsystem in the fifth memory subsystem tier 610 to memory subsystem(s) in lower memory subsystem tiers, the same memory subsystem tier, and/or higher memory subsystem tiers (e.g., when such memory subsystems have available space) while remaining within the scope of the present disclosure as well.
As such, with reference to
As will be appreciated by one of skill in the art in possession of the present disclosure, following the movement of the data to the ECM0 memory subsystem, the LCS2 memory subsystem should experience fewer writes (e.g., due to the data associated with logical memory address space 1 now being located in the ECM0 memory subsystem), but at the cost of reducing the amount of memory subsystem space in the fifth memory subsystem tier 610. However, the data moved to the ECM0 memory subsystem may be later moved to another memory subsystem tier as per conventional tiered memory system operations. For example, in the event the access frequency of that data reaches or exceeds a threshold, that data may be moved from the ECM0 memory subsystem in the sixth memory subsystem tier 612 to at least one memory subsystem in a higher memory subsystem tier in the tiered memory system 600, and one of skill in the art in possession of the present disclosure will appreciate how that data may be moved to at least one memory subsystem in a lower memory subsystem tier in a similar manner. In a specific example, following block 510 the disparity in writes between the LCS2 memory subsystem and the other memory subsystems in the fifth memory subsystem tier 610 (e.g., the LCS0, LCS1, and LCS3 memory subsystems) may reduce below the relative memory subsystem write threshold, and the logical memory address space 1 may again be associated with the LCS2 memory subsystem.
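One way to picture the relative memory subsystem write threshold is as a comparison of each memory subsystem's write count against its peers in the same tier. The sketch below is an assumption about how such a check might be defined: the function name, the use of the peer mean, the default ratio of 1.5, and the write counts are all illustrative and are not taken from this disclosure.

```python
from statistics import mean


def over_written(writes_by_subsystem: dict[str, int],
                 ratio: float = 1.5) -> list[str]:
    """Return subsystems whose writes reach `ratio` times the mean of their tier peers."""
    flagged = []
    for name, writes in writes_by_subsystem.items():
        peers = [w for other, w in writes_by_subsystem.items() if other != name]
        if not peers:
            continue   # a lone subsystem has nothing to be compared against
        peer_mean = mean(peers)
        if peer_mean > 0 and writes >= ratio * peer_mean:
            flagged.append(name)
    return flagged


# Made-up write counts for the fifth memory subsystem tier 610.
before = {"LCS0": 1000, "LCS1": 1100, "LCS2": 2400, "LCS3": 900}
after = {"LCS0": 1000, "LCS1": 1100, "LCS2": 1200, "LCS3": 900}
```

Under these assumed counts, only the LCS2 memory subsystem is flagged in `before`, and nothing is flagged in `after`, which corresponds to the disparity falling back below the threshold so that logical memory address space 1 may again be associated with the LCS2 memory subsystem.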
Thus, systems and methods have been described that provide for the management of memory subsystems in a tiered memory system in a manner that allows the preemptive removal of failing memory subsystems without interruption of the computing subsystems (e.g., the operating system and/or applications) using them, as well as the alleviation of uneven wear patterns in memory subsystem tiers by adjusting the rate of writes to memory subsystems that are reaching their write endurance limits faster than other memory subsystems in a memory subsystem tier. For example, the tiered memory management system of the present disclosure may be coupled to a first memory subsystem associated with a first memory subsystem tier, and a second memory subsystem associated with a second memory subsystem tier that is different than the first memory subsystem tier. The tiered memory management system monitors a health of the first memory subsystem associated with the first memory subsystem tier and the second memory subsystem associated with the second memory subsystem tier. When the tiered memory management system identifies a health issue with the first memory subsystem associated with the first memory subsystem tier, it moves data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier. As such, failure issues present in conventional tiered memory systems are eliminated.
Although illustrative embodiments have been shown and described, a wide range of modification, change and substitution is contemplated in the foregoing disclosure and in some instances, some features of the embodiments may be employed without a corresponding use of other features. Accordingly, it is appropriate that the appended claims be construed broadly and in a manner consistent with the scope of the embodiments disclosed herein.
Claims
1. A tiered memory system, comprising:
- a first memory subsystem associated with a first memory subsystem tier;
- a second memory subsystem associated with a second memory subsystem tier that is different than the first memory subsystem tier; and
- a tiered memory management system that is coupled to the first memory subsystem and the second memory subsystem, wherein the tiered memory management system is configured to: monitor a health of the first memory subsystem associated with the first memory subsystem tier and the second memory subsystem associated with the second memory subsystem tier; identify a health issue with the first memory subsystem associated with the first memory subsystem tier; and move, in response to identifying the health issue with the first memory subsystem, data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier.
2. The system of claim 1, wherein the moving the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier includes:
- marking the data stored in the first memory subsystem associated with the first memory subsystem tier for memory subsystem tier movement, wherein the marking of the data for memory subsystem tier movement is configured to cause a data mover device to move the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier.
3. The system of claim 1, wherein the second memory subsystem tier is lower than the first memory subsystem tier.
4. The system of claim 1, wherein the identifying the health issue with the first memory subsystem associated with the first memory subsystem tier includes:
- identifying that a memory subsystem failure threshold for the first memory subsystem has been reached, and wherein the tiered memory management system is configured to: identify, to at least one computing subsystem that utilizes the first memory subsystem in response to moving the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier, the first memory subsystem as unavailable.
5. The system of claim 1, wherein the tiered memory management system is configured to:
- associate a logical memory address space with the first memory subsystem to cause a computing subsystem to store the data in the first memory subsystem; and
- in response to identifying the health issue with the first memory subsystem: disassociate the logical memory address space from the first memory subsystem; and associate the logical memory address space with the second memory subsystem to cause the computing subsystem to store subsequent data in the second memory subsystem.
6. The system of claim 1, wherein the identifying the health issue with the first memory subsystem associated with the first memory subsystem tier includes:
- identifying that a relative memory subsystem write threshold for the first memory subsystem has been reached.
7. An Information Handling System (IHS), comprising:
- a processing system; and
- a memory system that is coupled to the processing system and that includes instructions that, when executed by the processing system, cause the processing system to provide a tiered memory management engine that is configured to: monitor a health of a first memory subsystem associated with a first memory subsystem tier and a second memory subsystem associated with a second memory subsystem tier that is different than the first memory subsystem tier; identify a health issue with the first memory subsystem associated with the first memory subsystem tier; and move, in response to identifying the health issue with the first memory subsystem, data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier.
8. The IHS of claim 7, wherein the moving the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier includes:
- marking the data stored in the first memory subsystem associated with the first memory subsystem tier for memory subsystem tier movement, wherein the marking of the data for memory subsystem tier movement is configured to cause a data mover device to move the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier.
9. The IHS of claim 7, wherein the second memory subsystem tier is lower than the first memory subsystem tier.
10. The IHS of claim 7, wherein the identifying the health issue with the first memory subsystem associated with the first memory subsystem tier includes:
- identifying that a memory subsystem failure threshold for the first memory subsystem has been reached, and wherein the tiered memory management engine is configured to: identify, to at least one computing subsystem that utilizes the first memory subsystem in response to moving the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier, the first memory subsystem as unavailable.
11. The IHS of claim 7, wherein the tiered memory management engine is configured to:
- associate a logical memory address space with the first memory subsystem to cause a computing subsystem to store the data in the first memory subsystem; and
- in response to identifying the health issue with the first memory subsystem: disassociate the logical memory address space from the first memory subsystem; and associate the logical memory address space with the second memory subsystem to cause the computing subsystem to store subsequent data in the second memory subsystem.
12. The IHS of claim 7, wherein the identifying the health issue with the first memory subsystem associated with the first memory subsystem tier includes:
- identifying that a relative memory subsystem write threshold for the first memory subsystem has been reached.
13. The IHS of claim 12, wherein the tiered memory management engine is configured to:
- determine that the first memory subsystem is below the relative memory subsystem write threshold and, in response, move the data stored in the second memory subsystem associated with the second memory subsystem tier to the first memory subsystem associated with the first memory subsystem tier.
14. A method for managing tiered memory, comprising:
- monitoring, by a tiered memory management system, a health of a first memory subsystem associated with a first memory subsystem tier and a second memory subsystem associated with a second memory subsystem tier that is different than the first memory subsystem tier;
- identifying, by the tiered memory management system, a health issue with the first memory subsystem associated with the first memory subsystem tier; and
- moving, by the tiered memory management system in response to identifying the health issue with the first memory subsystem, data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier.
15. The method of claim 14, wherein the moving the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier includes:
- marking the data stored in the first memory subsystem associated with the first memory subsystem tier for memory subsystem tier movement, wherein the marking of the data for memory subsystem tier movement is configured to cause a data mover device to move the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier.
16. The method of claim 14, wherein the second memory subsystem tier is lower than the first memory subsystem tier.
17. The method of claim 14, wherein the identifying the health issue with the first memory subsystem associated with the first memory subsystem tier includes:
- identifying that a memory subsystem failure threshold for the first memory subsystem has been reached, and wherein the method further comprises: identifying by the tiered memory management system to at least one computing subsystem that utilizes the first memory subsystem in response to moving the data stored in the first memory subsystem associated with the first memory subsystem tier to the second memory subsystem associated with the second memory subsystem tier, the first memory subsystem as unavailable.
18. The method of claim 14, further comprising:
- associating, by the tiered memory management system, a logical memory address space with the first memory subsystem to cause a computing subsystem to store the data in the first memory subsystem; and
- in response to identifying the health issue with the first memory subsystem: disassociating, by the tiered memory management system, the logical memory address space from the first memory subsystem; and associating, by the tiered memory management system, the logical memory address space with the second memory subsystem to cause the computing subsystem to store subsequent data in the second memory subsystem.
19. The method of claim 14, wherein the identifying the health issue with the first memory subsystem associated with the first memory subsystem tier includes:
- identifying that a relative memory subsystem write threshold for the first memory subsystem has been reached.
20. The method of claim 19, further comprising:
- determining, by the tiered memory management system, that the first memory subsystem is below the relative memory subsystem write threshold and, in response, moving the data stored in the second memory subsystem associated with the second memory subsystem tier to the first memory subsystem associated with the first memory subsystem tier.
Type: Application
Filed: Dec 2, 2022
Publication Date: Jun 6, 2024
Inventors: William Price Dawkins (Lakeway, TX), Stuart Allen Berke (Austin, TX)
Application Number: 18/073,915