Patents by Inventor Gongxian Jeffrey Cheng
Gongxian Jeffrey Cheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12032487
Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
Type: Grant
Filed: February 8, 2022
Date of Patent: July 9, 2024
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Michael Mantor
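As a rough illustration of how software might combine the two logs described in this abstract, the following Python sketch models them as plain data structures. The class name, method names, and the 4 KiB page size are assumptions made for the example; they do not come from the patent.

```python
from collections import Counter

PAGE_SIZE = 4096  # assumed page size for this sketch


class LoggingModel:
    """Toy model of the two logs; every name here is hypothetical."""

    def __init__(self):
        self.access_log = []       # physical addresses of logged cache misses
        self.translation_log = {}  # physical page number -> virtual page number

    def record_translation(self, virtual_addr, physical_addr):
        # Models a page walk that translated virtual_addr to physical_addr.
        ppn = physical_addr // PAGE_SIZE
        self.translation_log[ppn] = virtual_addr // PAGE_SIZE

    def record_miss(self, physical_addr):
        # Models one logged cache miss.
        self.access_log.append(physical_addr)

    def hot_virtual_pages(self):
        # Count misses per physical page, then map each page back to its
        # virtual page using the translation log. Pages with no logged
        # translation are ignored.
        misses = Counter(addr // PAGE_SIZE for addr in self.access_log)
        return {
            self.translation_log[ppn]: count
            for ppn, count in misses.items()
            if ppn in self.translation_log
        }
```

A memory manager could use a result like this to decide which virtual pages are worth migrating; that policy is outside the scope of the sketch.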
-
Publication number: 20220269620
Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
Type: Application
Filed: February 8, 2022
Publication date: August 25, 2022
Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Michael Mantor
-
Patent number: 11288205
Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
Type: Grant
Filed: June 23, 2015
Date of Patent: March 29, 2022
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Mike Mantor
-
Patent number: 11100004
Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.
Type: Grant
Filed: June 23, 2015
Date of Patent: August 24, 2021
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Gongxian Jeffrey Cheng, Mark Fowler, Philip J. Rogers, Benjamin T. Sander, Anthony Asaro, Mike Mantor, Raja Koduri
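The idea of one virtual address space backed by a separate set of page tables per processing-unit type can be pictured with a small Python model. This is only a sketch: the class, its methods, the single-level tables, and the 4 KiB page size are all assumptions for illustration, and the choice to update every table on migration is one possible reading of the abstract.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class SharedAddressSpace:
    """Toy model: one virtual address space, one page table per unit type."""

    def __init__(self, unit_types):
        # Hypothetical single-level tables: virtual page -> physical page.
        self.page_tables = {unit: {} for unit in unit_types}

    def map_page(self, unit, vpn, ppn):
        self.page_tables[unit][vpn] = ppn

    def translate(self, unit, virtual_addr):
        # Translate using the page table that belongs to this unit type.
        vpn, offset = divmod(virtual_addr, PAGE_SIZE)
        return self.page_tables[unit][vpn] * PAGE_SIZE + offset

    def migrate(self, vpn, new_ppn):
        # After data moves to another memory module, update the physical
        # page in every table that maps this virtual page.
        for table in self.page_tables.values():
            if vpn in table:
                table[vpn] = new_ppn
```

Because both units use the same virtual page number, a pointer produced by code on one unit stays valid on the other before and after migration.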
-
Patent number: 10545800
Abstract: A technique for facilitating direct doorbell rings in a virtualized system is provided. A first device is configured to “ring” a “doorbell” of a second device, where both the first and second devices are not a host processor such as a central processing unit and are coupled to an interconnect fabric such as peripheral component interconnect express (“PCIe”). The first device is configured to ring the doorbell of the second device by writing to a doorbell address in a guest physical address space. For security reasons, a check block checks an offset portion of the doorbell address against a set of allowed doorbell addresses for doorbells specified in the guest physical address space, allowing the doorbell to be written if the doorbell is included in the set of allowed doorbell addresses.
Type: Grant
Filed: May 31, 2017
Date of Patent: January 28, 2020
Assignee: ATI Technologies ULC
Inventors: Anthony Asaro, Gongxian Jeffrey Cheng
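The check described in this abstract amounts to extracting an offset from the doorbell address and testing it against an allow-list. A minimal Python sketch follows; the 12-bit offset mask, the class name, and the recorded-write list are assumptions for the example and are not taken from the patent.

```python
DOORBELL_OFFSET_MASK = 0xFFF  # assumed width of the offset portion


class DoorbellCheckBlock:
    """Toy model of a check block guarding doorbell writes (hypothetical)."""

    def __init__(self, allowed_offsets):
        self.allowed_offsets = frozenset(allowed_offsets)
        self.rung = []  # (offset, value) pairs for writes that were allowed

    def write(self, doorbell_addr, value):
        # Compare only the offset portion of the address with the allow-list.
        offset = doorbell_addr & DOORBELL_OFFSET_MASK
        if offset not in self.allowed_offsets:
            return False  # write is dropped
        self.rung.append((offset, value))
        return True
```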
-
Patent number: 10521389
Abstract: Described herein is a method and system for accessing a block addressable input/output (I/O) device, such as a non-volatile memory (NVM), as byte addressable memory. A front end processor connected to a Peripheral Component Interconnect Express (PCIe) switch performs as a front end interface to the block addressable I/O device to emulate byte addressability. A PCIe device, such as a graphics processing unit (GPU), can directly access the necessary bytes via the front end processor from the block addressable I/O device. The PCIe compatible devices can access data from the block I/O devices without having to go through system memory and a host processor. In an implementation, a system can include block addressable I/O, byte addressable I/O and hybrids thereof which support direct access to byte addressable memory by the host processor, GPU and any other PCIe compatible device.
Type: Grant
Filed: December 23, 2016
Date of Patent: December 31, 2019
Assignee: ATI Technologies ULC
Inventor: Gongxian Jeffrey Cheng
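Emulating byte addressability on top of a block device comes down to mapping a byte address to a block number and an offset, then reading whole blocks and returning only the requested bytes. The Python sketch below shows that mapping under assumed names and an assumed 512-byte block size; it models only reads and says nothing about how the patented front end processor is built.

```python
BLOCK_SIZE = 512  # assumed block size for this sketch


class BlockDevice:
    """Hypothetical device that can only be read one whole block at a time."""

    def __init__(self, data):
        self.data = data

    def read_block(self, block_no):
        start = block_no * BLOCK_SIZE
        return self.data[start:start + BLOCK_SIZE]


class FrontEnd:
    """Toy front end that serves byte-granular reads from whole blocks."""

    def __init__(self, device):
        self.device = device

    def read_bytes(self, addr, length):
        out = bytearray()
        while length > 0:
            # Split the byte address into a block number and an offset.
            block_no, offset = divmod(addr, BLOCK_SIZE)
            chunk = self.device.read_block(block_no)[offset:offset + length]
            if not chunk:
                break  # read ran past the end of the device
            out += chunk
            addr += len(chunk)
            length -= len(chunk)
        return bytes(out)
```

The loop handles requests that straddle a block boundary by fetching consecutive blocks until the requested length is satisfied.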
-
Patent number: 10474490
Abstract: A technique for efficient time-division of resources in a virtualized accelerated processing device (“APD”) is provided. In a virtualization scheme implemented on the APD, different virtual machines are assigned different “time-slices” in which to use the APD. When a time-slice expires, the APD performs a virtualization context switch by stopping operations for a current virtual machine (“VM”) and starting operations for another VM. Typically, each VM is assigned a fixed length of time, after which a virtualization context switch is performed. This fixed length of time can lead to inefficiencies. Therefore, in some situations, in response to a VM having no more work to perform on the APD and the APD being idle, a virtualization context switch is performed “early.” This virtualization context switch is “early” in the sense that the virtualization context switch is performed before the fixed length of time for the time-slice expires.
Type: Grant
Filed: June 29, 2017
Date of Patent: November 12, 2019
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Gongxian Jeffrey Cheng, Louis Regniere, Anthony Asaro
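The effect of an "early" virtualization context switch can be seen in a small round-robin simulation. The Python sketch below is an assumption-laden toy: work is measured in abstract ticks, the function name and its arguments are invented for the example, and VMs with no remaining work are simply skipped.

```python
def schedule(work_per_vm, slice_len):
    """Return a list of (vm, ticks_used) entries, one per time-slice.

    A VM that runs out of work gives up the rest of its slice, so
    ticks_used can be smaller than slice_len (the "early" switch).
    """
    remaining = dict(work_per_vm)
    timeline = []
    while any(remaining.values()):
        for vm in remaining:
            if remaining[vm] == 0:
                continue  # assumption: idle VMs are skipped entirely
            # Switch early if the VM finishes before the slice expires.
            used = min(slice_len, remaining[vm])
            remaining[vm] -= used
            timeline.append((vm, used))
    return timeline
```

For example, `schedule({"vm0": 3, "vm1": 12}, 5)` gives `vm0` a slice that ends after 3 ticks instead of 5, so `vm1` starts 2 ticks sooner than it would under fixed-length slices.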
-
Patent number: 10423354
Abstract: A memory manager of a processor identifies a block of data for eviction from a first memory module to a second memory module. In response, the processor copies only those portions of the data block that have been identified as modified portions to the second memory module. The amount of data to be copied is thereby reduced, improving memory management efficiency and reducing processor power consumption.
Type: Grant
Filed: September 23, 2015
Date of Patent: September 24, 2019
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Philip Rogers, Benjamin T. Sander, Anthony Asaro, Gongxian Jeffrey Cheng
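Copying only the modified portions of a block on eviction can be sketched with a per-portion dirty flag. In the Python example below, the function name, the list-based representation of a block, and the granularity of a "portion" are all assumptions chosen for illustration.

```python
def evict_block(block, dirty, destination):
    """Copy only the dirty portions of block into destination.

    block and destination are equal-length lists of portions; dirty is a
    parallel list of booleans. Returns the number of portions copied.
    """
    copied = 0
    for index, is_dirty in enumerate(dirty):
        if is_dirty:
            destination[index] = block[index]
            copied += 1
    return copied
```

The sketch assumes the destination already holds an up-to-date copy of every clean portion, which is what makes skipping them safe.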
-
Patent number: 10176548
Abstract: A processor includes a scheduler that governs which of a plurality of pending graphics contexts is selected for execution at a graphics pipeline of the processor. The processor also includes a plurality of flip queues storing data ready to be rendered at a display device. The executing graphics context can issue a flip request to change data stored at one of the flip queues. In response to determining that the flip request targets a flip queue that is being used for rendering at the display device, the scheduler executes a context switch to schedule a different graphics context for execution at the graphics pipeline.
Type: Grant
Filed: December 18, 2015
Date of Patent: January 8, 2019
Assignee: ATI Technologies ULC
Inventor: Gongxian Jeffrey Cheng
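The scheduling rule in this abstract (switch to another graphics context when a flip targets the queue currently in use by the display) can be modeled in a few lines of Python. The class, its fields, and the round-robin choice of the next context are assumptions for this sketch rather than details from the patent.

```python
class FlipScheduler:
    """Toy scheduler that switches context on a flip to the in-use queue."""

    def __init__(self, contexts, num_queues):
        self.pending = list(contexts)
        self.current = self.pending.pop(0)  # context now on the pipeline
        self.flip_queues = [None] * num_queues
        self.displaying = 0  # index of the queue the display is using

    def flip(self, queue_index, frame):
        if queue_index == self.displaying:
            # The target queue is busy, so run a different context instead
            # of letting the current one wait. Round-robin is an assumption.
            self.pending.append(self.current)
            self.current = self.pending.pop(0)
            return False
        self.flip_queues[queue_index] = frame
        return True
```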
-
Publication number: 20190004839
Abstract: A technique for efficient time-division of resources in a virtualized accelerated processing device (“APD”) is provided. In a virtualization scheme implemented on the APD, different virtual machines are assigned different “time-slices” in which to use the APD. When a time-slice expires, the APD performs a virtualization context switch by stopping operations for a current virtual machine (“VM”) and starting operations for another VM. Typically, each VM is assigned a fixed length of time, after which a virtualization context switch is performed. This fixed length of time can lead to inefficiencies. Therefore, in some situations, in response to a VM having no more work to perform on the APD and the APD being idle, a virtualization context switch is performed “early.” This virtualization context switch is “early” in the sense that the virtualization context switch is performed before the fixed length of time for the time-slice expires.
Type: Application
Filed: June 29, 2017
Publication date: January 3, 2019
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Gongxian Jeffrey Cheng, Louis Regniere, Anthony Asaro
-
Publication number: 20180349165
Abstract: A technique for facilitating direct doorbell rings in a virtualized system is provided. A first device is configured to “ring” a “doorbell” of a second device, where both the first and second devices are not a host processor such as a central processing unit and are coupled to an interconnect fabric such as peripheral component interconnect express (“PCIe”). The first device is configured to ring the doorbell of the second device by writing to a doorbell address in a guest physical address space. For security reasons, a check block checks an offset portion of the doorbell address against a set of allowed doorbell addresses for doorbells specified in the guest physical address space, allowing the doorbell to be written if the doorbell is included in the set of allowed doorbell addresses.
Type: Application
Filed: May 31, 2017
Publication date: December 6, 2018
Applicant: ATI Technologies ULC
Inventors: Anthony Asaro, Gongxian Jeffrey Cheng
-
Publication number: 20180181519
Abstract: Described herein is a method and system for accessing a block addressable input/output (I/O) device, such as a non-volatile memory (NVM), as byte addressable memory. A front end processor connected to a Peripheral Component Interconnect Express (PCIe) switch performs as a front end interface to the block addressable I/O device to emulate byte addressability. A PCIe device, such as a graphics processing unit (GPU), can directly access the necessary bytes via the front end processor from the block addressable I/O device. The PCIe compatible devices can access data from the block I/O devices without having to go through system memory and a host processor. In an implementation, a system can include block addressable I/O, byte addressable I/O and hybrids thereof which support direct access to byte addressable memory by the host processor, GPU and any other PCIe compatible device.
Type: Application
Filed: December 23, 2016
Publication date: June 28, 2018
Applicant: ATI Technologies ULC
Inventor: Gongxian Jeffrey Cheng
-
Publication number: 20170178273
Abstract: A processor includes a scheduler that governs which of a plurality of pending graphics contexts is selected for execution at a graphics pipeline of the processor. The processor also includes a plurality of flip queues storing data ready to be rendered at a display device. The executing graphics context can issue a flip request to change data stored at one of the flip queues. In response to determining that the flip request targets a flip queue that is being used for rendering at the display device, the scheduler executes a context switch to schedule a different graphics context for execution at the graphics pipeline.
Type: Application
Filed: December 18, 2015
Publication date: June 22, 2017
Inventor: Gongxian Jeffrey Cheng
-
Publication number: 20170083240
Abstract: A memory manager of a processor identifies a block of data for eviction from a first memory module to a second memory module. In response, the processor copies only those portions of the data block that have been identified as modified portions to the second memory module. The amount of data to be copied is thereby reduced, improving memory management efficiency and reducing processor power consumption.
Type: Application
Filed: September 23, 2015
Publication date: March 23, 2017
Inventors: Philip Rogers, Benjamin T. Sander, Anthony Asaro, Gongxian Jeffrey Cheng
-
Publication number: 20160378682
Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
Type: Application
Filed: June 23, 2015
Publication date: December 29, 2016
Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Mike Mantor
-
Publication number: 20160378674
Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.
Type: Application
Filed: June 23, 2015
Publication date: December 29, 2016
Inventors: Gongxian Jeffrey Cheng, Mark Fowler, Philip J. Rogers, Benjamin T. Sander, Anthony Asaro, Mike Mantor, Raja Koduri
-
Patent number: 9201682
Abstract: In a hardware-based virtualization system, a hypervisor switches out of a first function into a second function. The first function is one of a physical function and a virtual function and the second function is one of a physical function and a virtual function. During the switching a malfunction of the first function is detected. The first function is reset without resetting the second function. The switching, detecting, and resetting operations are performed by a hypervisor of the hardware-based virtualization system. Embodiments further include a communication mechanism for the hypervisor to notify a driver of the function that was reset to enable the driver to restore the function without delay.
Type: Grant
Filed: June 21, 2013
Date of Patent: December 1, 2015
Assignee: ATI Technologies ULC
Inventors: Gongxian Jeffrey Cheng, Anthony Asaro, Yinan Jiang
-
Patent number: 9176795
Abstract: A method, system, and computer program product are disclosed for providing improved access to accelerated processing device compute resources to user mode applications. The functionality disclosed allows user mode applications to provide commands to an accelerated processing device without the need for kernel mode transitions in order to access a unified ring buffer. Instead, applications are each provided with their own buffers, which the accelerated processing device hardware can access to process commands. With full operating system support, user mode applications are able to utilize the accelerated processing device in much the same way as a CPU.
Type: Grant
Filed: November 4, 2011
Date of Patent: November 3, 2015
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Rex McCrary, Michael Houston, Philip J. Rogers, Gongxian Jeffrey Cheng, Mark Hummel, Paul Blinzer
-
Patent number: 9099051
Abstract: A method, computer program product, and system that includes a virtual function module with an emulated display timing device, a first independent resource, and a second independent resource, where the first and second independent resources signal a physical function module that a new surface has been rendered, and where the physical function module signals the virtual function module via the emulated timing device and the first and second independent resources when the rendered new surface has been displayed, copied, used, or otherwise consumed.
Type: Grant
Filed: March 2, 2012
Date of Patent: August 4, 2015
Assignee: ATI Technologies ULC
Inventors: Gongxian Jeffrey Cheng, Syed Athar Hussain
-
Patent number: 8933947
Abstract: Disclosed herein are systems, apparatuses, and methods for enabling efficient reads to a local memory of a processing unit. In an embodiment, a processing unit includes an interface and a buffer. The interface is configured to (i) send a request for a portion of data in a region of a local memory of an other processing unit and (ii) receive, responsive to the request, all the data from the region. The buffer is configured to store the data from the region of the local memory of the other processing unit.
Type: Grant
Filed: March 8, 2010
Date of Patent: January 13, 2015
Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
Inventors: David I. J. Glen, Philip J. Rogers, Gordon F. Caruk, Gongxian Jeffrey Cheng, Mark Hummel, Stephen Patrick Thompson, Anthony Asaro
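Requesting a portion of a region but receiving and buffering the whole region behaves much like a one-entry read cache. The Python sketch below illustrates that behavior; the 64-byte region size, the class name, and the single-region buffer are assumptions for the example, and requests are assumed not to cross a region boundary.

```python
REGION_SIZE = 64  # assumed region size for this sketch


class RegionReadBuffer:
    """Toy model: a read of any part of a region fetches the whole region."""

    def __init__(self, remote_memory):
        self.remote = remote_memory  # stands in for the other unit's memory
        self.base = None             # start address of the buffered region
        self.data = b""
        self.remote_reads = 0        # number of transfers from remote memory

    def read(self, addr, length):
        base = addr - addr % REGION_SIZE
        if base != self.base:
            # Miss: receive all the data from the region and keep it.
            self.data = bytes(self.remote[base:base + REGION_SIZE])
            self.base = base
            self.remote_reads += 1
        start = addr - base
        return self.data[start:start + length]
```

Later reads that fall in the same region are served from the buffer without another transfer, which is where the efficiency gain in this toy model comes from.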