Patents by Inventor Jee Ho Ryoo

Jee Ho Ryoo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO). Short, hypothetical code sketches of the main techniques appear after the listing.

  • Patent number: 11531617
    Abstract: A heterogeneous memory system is implemented using a low-latency near memory (NM) and a high-latency far memory (FM). Pages in the memory system include NM blocks stored in the NM and FM blocks stored in the FM. A page is assigned to a region in the memory system based on the proportion of NM blocks in the page. When accessing a block, the block address is used to determine a region of the memory system, and a block offset is used to determine whether the block is stored in NM or FM. The memory system may observe memory accesses to determine the access statistics of the page and the block. Based on a page's hotness and access density, the page may be migrated to a different region. Based on a block's hotness, the block may be migrated between NM and FM allocated to the page.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: December 20, 2022
    Assignee: Oracle International Corporation
    Inventors: Lizy John, Jee Ho Ryoo, Hung-Ming Hsu, Karthik Ganesan
  • Publication number: 20210141724
    Abstract: A heterogeneous memory system is implemented using a low-latency near memory (NM) and a high-latency far memory (FM). Pages in the memory system include NM blocks stored in the NM and FM blocks stored in the FM. A page is assigned to a region in the memory system based on the proportion of NM blocks in the page. When accessing a block, the block address is used to determine a region of the memory system, and a block offset is used to determine whether the block is stored in NM or FM. The memory system may observe memory accesses to determine the access statistics of the page and the block. Based on a page's hotness and access density, the page may be migrated to a different region. Based on a block's hotness, the block may be migrated between NM and FM allocated to the page.
    Type: Application
    Filed: January 22, 2021
    Publication date: May 13, 2021
    Inventors: Lizy John, Jee Ho Ryoo, Hung-Ming Hsu, Karthik Ganesan
  • Patent number: 10901894
    Abstract: A heterogeneous memory system is implemented using a low-latency near memory (NM) and a high-latency far memory (FM). Pages in the memory system include NM blocks stored in the NM and FM blocks stored in the FM. A page is assigned to a region in the memory system based on the proportion of NM blocks in the page. When accessing a block, the block address is used to determine a region of the memory system, and a block offset is used to determine whether the block is stored in NM or FM. The memory system may observe memory accesses to determine the access statistics of the page and the block. Based on a page's hotness and access density, the page may be migrated to a different region. Based on a block's hotness, the block may be migrated between NM and FM allocated to the page.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: January 26, 2021
    Assignee: Oracle International Corporation
    Inventors: Lizy John, Jee Ho Ryoo, Hung-Ming Hsu, Karthik Ganesan
  • Patent number: 10503655
    Abstract: The described embodiments include a computing device that caches data acquired from a main memory in a high-bandwidth memory (HBM), the computing device including channels for accessing data stored in corresponding portions of the HBM. During operation, the computing device sets each of the channels so that data blocks stored in the corresponding portions of the HBM include corresponding numbers of cache lines. Based on records of accesses of cache lines in the HBM that were acquired from pages in the main memory, the computing device sets a data block size for each of the pages, the data block size being a number of cache lines. The computing device stores, in the HBM, data blocks acquired from each of the pages in the main memory using a channel having a data block size corresponding to the data block size for each of the pages.
    Type: Grant
    Filed: July 21, 2016
    Date of Patent: December 10, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mitesh R. Meswani, Jee Ho Ryoo
  • Patent number: 10503658
    Abstract: The present disclosure is directed to techniques for migrating data between heterogeneous memories in a computing system. More specifically, the techniques involve migrating data between a memory having better access characteristics (e.g., lower latency but smaller capacity) and a memory having worse access characteristics (e.g., higher latency but greater capacity). Migrations occur with a variable migration granularity. A migration granularity specifies a number of memory pages, having virtual addresses that are contiguous in virtual address space, that are migrated in a single migration operation. A history-based technique that adjusts migration granularity based on the history of memory utilization by an application is provided. A profiling-based technique that adjusts migration granularity based on a profiling operation is also provided.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: December 10, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Arkaprava Basu, Jee Ho Ryoo
  • Patent number: 10296465
    Abstract: A processor architecture utilizing an L3 translation lookaside buffer (TLB) to reduce page walks. The processor includes multiple cores, where each core includes an L1 TLB and an L2 TLB. The processor further includes an L3 TLB that is shared across the processor cores, where the L3 TLB is implemented in off-chip or die-stacked dynamic random-access memory. Furthermore, the processor includes a page table connected to the L3 TLB, where the page table stores a mapping between virtual addresses and physical addresses. In such an architecture, the very large capacity of the L3 TLB may improve performance, such as execution time, by eliminating page walks, which require multiple data accesses.
    Type: Grant
    Filed: July 20, 2017
    Date of Patent: May 21, 2019
    Assignee: Board of Regents, The University of Texas System
    Inventors: Lizy K. John, Jee Ho Ryoo, Nagendra Gulur
  • Patent number: 10261915
    Abstract: A processor architecture which partitions on-chip data caches to efficiently cache translation entries alongside data, reducing conflicts between virtual-to-physical address translation and data accesses. The architecture includes processor cores that include a first level translation lookaside buffer (TLB) and a second level TLB located either internally within each processor core or shared across the processor cores. Furthermore, the architecture includes a second level data cache (e.g., located either internally within each processor core or shared across the processor cores) partitioned to store both data and translation entries. Furthermore, the architecture includes a third level data cache connected to the processor cores, where the third level data cache is partitioned to store both data and translation entries. The third level data cache is shared across the processor cores. The processor architecture can also include a data stack distance profiler and a translation stack distance profiler.
    Type: Grant
    Filed: September 15, 2017
    Date of Patent: April 16, 2019
    Assignee: Board of Regents, The University of Texas System
    Inventors: Lizy K. John, Yashwant Marathe, Jee Ho Ryoo, Nagendra Gulur
  • Publication number: 20190087350
    Abstract: A processor architecture which partitions the on-chip data caches to efficiently cache translation entries alongside data, reducing the conflicts between virtual-to-physical address translation and data accesses. The architecture includes processor cores that include a first level translation lookaside buffer (TLB) and a second level TLB located either internally within each processor core or shared across the processor cores. Furthermore, the architecture includes a second level data cache (e.g., located either internally within each processor core or shared across the processor cores) partitioned to store both data and translation entries. Furthermore, the architecture includes a third level data cache connected to the processor cores, where the third level data cache is partitioned to store both data and translation entries. The third level data cache is shared across the processor cores.
    Type: Application
    Filed: September 15, 2017
    Publication date: March 21, 2019
    Inventors: Lizy K. John, Yashwant Marathe, Jee Ho Ryoo, Nagendra Gulur
  • Publication number: 20180314436
    Abstract: The present disclosure is directed to techniques for migrating data between heterogeneous memories in a computing system. More specifically, the techniques involve migrating data between a memory having better access characteristics (e.g., lower latency but smaller capacity) and a memory having worse access characteristics (e.g., higher latency but greater capacity). Migrations occur with a variable migration granularity. A migration granularity specifies a number of memory pages, having virtual addresses that are contiguous in virtual address space, that are migrated in a single migration operation. A history-based technique that adjusts migration granularity based on the history of memory utilization by an application is provided. A profiling-based technique that adjusts migration granularity based on a profiling operation is also provided.
    Type: Application
    Filed: April 27, 2017
    Publication date: November 1, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Arkaprava Basu, Jee Ho Ryoo
  • Publication number: 20180260323
    Abstract: A heterogeneous memory system is implemented using a low-latency near memory (NM) and a high-latency far memory (FM). Pages in the memory system include NM blocks stored in the NM and FM blocks stored in the FM. A page is assigned to a region in the memory system based on the proportion of NM blocks in the page. When accessing a block, the block address is used to determine a region of the memory system, and a block offset is used to determine whether the block is stored in NM or FM. The memory system may observe memory accesses to determine the access statistics of the page and the block. Based on a page's hotness and access density, the page may be migrated to a different region. Based on a block's hotness, the block may be migrated between NM and FM allocated to the page.
    Type: Application
    Filed: December 22, 2017
    Publication date: September 13, 2018
    Inventors: Lizy John, Jee Ho Ryoo, Hung-Ming Hsu, Karthik Ganesan
  • Publication number: 20180150406
    Abstract: A processor architecture utilizing an L3 translation lookaside buffer (TLB) to reduce page walks. The processor includes multiple cores, where each core includes an L1 TLB and an L2 TLB. The processor further includes an L3 TLB that is shared across the processor cores, where the L3 TLB is implemented in off-chip or die-stacked dynamic random-access memory. Furthermore, the processor includes a page table connected to the L3 TLB, where the page table stores a mapping between virtual addresses and physical addresses. In such an architecture, the very large capacity of the L3 TLB may improve performance, such as execution time, by eliminating page walks, which require multiple data accesses.
    Type: Application
    Filed: July 20, 2017
    Publication date: May 31, 2018
    Inventors: Lizy K. John, Jee Ho Ryoo, Nagendra Gulur
  • Publication number: 20180024935
    Abstract: The described embodiments include a computing device that caches data acquired from a main memory in a high-bandwidth memory (HBM), the computing device including channels for accessing data stored in corresponding portions of the HBM. During operation, the computing device sets each of the channels so that data blocks stored in the corresponding portions of the HBM include corresponding numbers of cache lines. Based on records of accesses of cache lines in the HBM that were acquired from pages in the main memory, the computing device sets a data block size for each of the pages, the data block size being a number of cache lines. The computing device stores, in the HBM, data blocks acquired from each of the pages in the main memory using a channel having a data block size corresponding to the data block size for each of the pages.
    Type: Application
    Filed: July 21, 2016
    Publication date: January 25, 2018
    Inventors: Mitesh R. Meswani, Jee Ho Ryoo
  • Patent number: 9406361
    Abstract: A memory subsystem incorporating a die-stacked DRAM (DSDRAM) is disclosed. In one embodiment, a system includes a processor implemented on a silicon interposer of an integrated circuit (IC) package, a DSDRAM coupled to the processor, the DSDRAM implemented on the silicon interposer of the IC package, and a DRAM implemented separately from the IC package. The DSDRAM and the DRAM form a main memory having a contiguous address space comprising a range of physical addresses. The physical addresses of the DSDRAM occupy a first contiguous portion of the address space, while the DRAM occupies a second contiguous portion of the address space. Each physical address of the contiguous address space is augmented with a first bit that, when set, indicates that a page is stored in both the DRAM and the DSDRAM.
    Type: Grant
    Filed: March 27, 2014
    Date of Patent: August 2, 2016
    Assignee: Oracle International Corporation
    Inventors: Jee Ho Ryoo, Karthik Ganesan, Yao-Min Chen
  • Publication number: 20150279436
    Abstract: A memory subsystem incorporating a die-stacked DRAM (DSDRAM) is disclosed. In one embodiment, a system includes a processor implemented on a silicon interposer of an integrated circuit (IC) package, a DSDRAM coupled to the processor, the DSDRAM implemented on the silicon interposer of the IC package, and a DRAM implemented separately from the IC package. The DSDRAM and the DRAM form a main memory having a contiguous address space comprising a range of physical addresses. The physical addresses of the DSDRAM occupy a first contiguous portion of the address space, while the DRAM occupies a second contiguous portion of the address space. Each physical address of the contiguous address space is augmented with a first bit that, when set, indicates that a page is stored in both the DRAM and the DSDRAM.
    Type: Application
    Filed: March 27, 2014
    Publication date: October 1, 2015
    Inventors: Jee Ho Ryoo, Karthik Ganesan, Yao-Min Chen
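
Illustrative code sketches

A handful of techniques recurs across the filings above. The short C sketches that follow are editorial illustrations only: every constant, name, and data structure in them is an assumption made for readability, not something published in the patents, and real implementations would live in hardware rather than software.

The heterogeneous NM/FM design (patents 11531617 and 10901894 and their related applications) assigns each page to a region by the proportion of its blocks held in near memory, so a block access can be resolved from the region and the block offset alone. A minimal sketch of that lookup, assuming 4 KiB pages, 64-byte blocks, five hypothetical regions, and a software region table standing in for memory-controller state:

    /* Editorial sketch of the NM/FM block lookup; all constants and the
     * region table are assumptions, not taken from the patent text. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define BLOCKS_PER_PAGE 64   /* assumed: 4 KiB pages, 64 B blocks */
    #define NUM_REGIONS      5   /* assumed: 0%, 25%, 50%, 75%, 100% NM */

    /* Region r keeps the first nm_blocks[r] blocks of every page in near
     * memory; the rest of the page's blocks live in far memory. */
    static const unsigned nm_blocks[NUM_REGIONS] = { 0, 16, 32, 48, 64 };

    /* Hypothetical page-to-region map, indexed by page frame number. */
    static uint8_t region_of_page[1 << 20];

    /* True when the block holding 'addr' resides in near memory. */
    static bool block_in_near_memory(uint64_t addr)
    {
        uint64_t pfn          = addr >> 12;                /* page frame number */
        unsigned block_offset = (addr >> 6) & (BLOCKS_PER_PAGE - 1);
        return block_offset < nm_blocks[region_of_page[pfn]];
    }

    int main(void)
    {
        region_of_page[3] = 2;                     /* page 3: half its blocks in NM */
        uint64_t addr = (3ull << 12) | (10u << 6); /* block 10 of page 3 */
        printf("block in NM: %d\n", block_in_near_memory(addr));
        return 0;
    }

Page and block migration in the patent then follow from the observed hotness and access-density statistics; the table lookup above is only the steady-state access path.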
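Patent 10503655 sets a per-page data block size (a number of cache lines) from records of which of the page's cache lines were accessed, then stores the page's blocks through a channel configured for that block size. One plausible policy is sketched below; the thresholds and the four-channel configuration are invented for illustration:

    /* Editorial sketch of the per-page block-size choice; the abstract only
     * says the size is derived from cache-line access records. */
    #include <stdio.h>

    #define LINES_PER_PAGE 64

    /* Hypothetical channel configuration: channel i serves data blocks of
     * block_lines[i] cache lines each. */
    static const unsigned block_lines[4] = { 1, 4, 16, 64 };

    /* Pick the channel (and thus block size) for a page given how many of
     * its cache lines were accessed while resident in the HBM. */
    static unsigned channel_for_page(unsigned lines_touched)
    {
        if (lines_touched >= 48) return 3;  /* dense: whole-page blocks */
        if (lines_touched >= 16) return 2;
        if (lines_touched >= 4)  return 1;
        return 0;                           /* sparse: single-line blocks */
    }

    int main(void)
    {
        unsigned ch = channel_for_page(20);
        printf("channel %u, %u lines per block\n", ch, block_lines[ch]);
        return 0;
    }

Sparsely accessed pages thus avoid wasting HBM capacity and bandwidth on untouched lines, while densely accessed pages amortize tag overhead across large blocks.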
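Patent 10503658 migrates runs of virtually contiguous pages between the two memories and adjusts the migration granularity from the application's utilization history. The controller below is one conceivable history-based rule, with invented grow/shrink thresholds; the abstract does not specify the policy:

    /* Editorial sketch of a history-based granularity controller; the
     * doubling/halving rule and the 50%/25% cut-offs are assumptions. */
    #include <stdio.h>

    #define MIN_GRAN 1      /* pages per migration operation */
    #define MAX_GRAN 512

    /* Called after migrating 'gran' contiguous pages, once we know how many
     * of those pages the application actually touched afterwards. */
    static unsigned adjust_granularity(unsigned gran, unsigned pages_touched)
    {
        double used = (double)pages_touched / gran;
        if (used > 0.50 && gran < MAX_GRAN)
            return gran * 2;   /* neighbors were useful: migrate more at once */
        if (used < 0.25 && gran > MIN_GRAN)
            return gran / 2;   /* mostly wasted bandwidth: migrate less */
        return gran;
    }

    int main(void)
    {
        unsigned gran = 8;
        gran = adjust_granularity(gran, 7);   /* 7 of 8 touched: grow to 16 */
        printf("next migration granularity: %u pages\n", gran);
        return 0;
    }

The companion profiling-based technique named in the abstract would instead pick the granularity from an explicit profiling run rather than from online feedback.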
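Patent 10296465 places a very large L3 TLB in off-chip or die-stacked DRAM behind the per-core L1 and L2 TLBs, so that most translation misses are served without a page walk. The lookup order can be sketched as below; the helper functions are hypothetical stand-ins for hardware structures:

    /* Editorial sketch of the L1 -> L2 -> L3 TLB -> page-walk path; the
     * lookup helpers and the toy translation are assumptions. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Each level returns true on a hit and fills *ppn (physical page no.). */
    static bool l1_tlb_lookup(uint64_t vpn, uint64_t *ppn) { (void)vpn; (void)ppn; return false; }
    static bool l2_tlb_lookup(uint64_t vpn, uint64_t *ppn) { (void)vpn; (void)ppn; return false; }
    /* DRAM-resident and very large, so in this toy model it always hits. */
    static bool l3_tlb_lookup(uint64_t vpn, uint64_t *ppn) { *ppn = vpn + 100; return true; }

    /* The slow path the L3 TLB exists to avoid: a page walk needs several
     * dependent memory accesses to traverse the page table. */
    static uint64_t page_walk(uint64_t vpn) { return vpn + 100; }

    static uint64_t translate(uint64_t vpn)
    {
        uint64_t ppn;
        if (l1_tlb_lookup(vpn, &ppn)) return ppn;
        if (l2_tlb_lookup(vpn, &ppn)) return ppn;
        if (l3_tlb_lookup(vpn, &ppn)) return ppn;  /* most misses stop here */
        return page_walk(vpn);
    }

    int main(void)
    {
        printf("ppn = %llu\n", (unsigned long long)translate(42));
        return 0;
    }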
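Patent 10261915 and application 20190087350 partition the shared data caches between ordinary data and translation entries, guided by separate data and translation stack distance profilers. The greedy way-allocator below is an assumed policy built on invented per-way hit gains; the abstract names the profilers but not the allocation algorithm:

    /* Editorial sketch: split the ways of a 16-way cache between data and
     * translation entries using marginal-gain curves that a stack-distance
     * profiler could produce. All numbers are invented. */
    #include <stdio.h>

    #define WAYS 16

    /* gain[i] = extra hits a class would get from its (i+1)-th way. */
    static const unsigned data_gain[WAYS] = {90,80,70,60,50,40,30,20,15,10,8,6,4,3,2,1};
    static const unsigned xlat_gain[WAYS] = {70,50,30,15, 8, 4, 2, 1, 1, 1,0,0,0,0,0,0};

    int main(void)
    {
        unsigned d = 0, x = 0;
        /* Greedily hand each way to whichever class gains more hits from it. */
        for (unsigned w = 0; w < WAYS; w++) {
            if (data_gain[d] >= xlat_gain[x]) d++;
            else                              x++;
        }
        printf("partition: %u data ways, %u translation ways\n", d, x);
        return 0;
    }

Caching translation entries in the partitioned ways keeps them from being evicted by data streams, which is the conflict the abstract says the partitioning reduces.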
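Patent 9406361 lays the die-stacked DRAM and the off-package DRAM out as one contiguous physical address space, with the DSDRAM occupying the first contiguous portion, and augments each address with a bit marking pages stored in both memories. A sketch of the resulting address decode, with an assumed 4 GiB DSDRAM capacity and an assumed bit position:

    /* Editorial sketch of the contiguous DSDRAM/DRAM address layout; the
     * capacity boundary and the duplicate-bit position are assumptions. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define DSDRAM_BYTES (4ull << 30)  /* assumed die-stacked DRAM capacity */
    #define DUP_BIT      (1ull << 63)  /* assumed "page in both memories" bit */

    /* Low contiguous range decodes to the DSDRAM, the range above it to the
     * off-package DRAM; the tag bit is masked off before comparing. */
    static bool in_dsdram(uint64_t paddr)  { return (paddr & ~DUP_BIT) < DSDRAM_BYTES; }
    static bool duplicated(uint64_t paddr) { return (paddr & DUP_BIT) != 0; }

    int main(void)
    {
        uint64_t a = (1ull << 20) | DUP_BIT;  /* low address, duplicate bit set */
        printf("dsdram=%d duplicated=%d\n", in_dsdram(a), duplicated(a));
        return 0;
    }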