Patents by Inventor Anthony Asaro

Anthony Asaro has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Instruction set and micro-architecture supporting asynchronous memory access

Patent number: 10209991

Abstract: A system and method for reducing latencies of main memory data accesses are described. A non-blocking load (NBLD) instruction identifies an address of requested data and a subroutine. The subroutine includes instructions dependent on the requested data. A processing unit verifies that address translations are available for both the address and the subroutine. The processing unit continues processing instructions with no stalls caused by younger-in-program-order instructions waiting for the requested data. The non-blocking load unit performs a cache coherent data read request on behalf of the NBLD instruction and requests that the processing unit perform an asynchronous jump to the subroutine upon return of the requested data from lower-level memory.

Type: Grant

Filed: November 16, 2016

Date of Patent: February 19, 2019

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Meenakshi Sundaram Bhaskaran, Elliot H. Mednick, David A. Roberts, Anthony Asaro, Amin Farmahini-Farahani

HANG DETECTION FOR VIRTUALIZED ACCELERATED PROCESSING DEVICE

Publication number: 20190018699

Abstract: A technique for recovering from a hang in a virtualized accelerated processing device (“APD”) is provided. In the virtualization scheme, different virtual machines are assigned different “time-slices” in which to use the APD. When a time-slice expires, the APD stops operations for a current VM and starts operations for another VM. To stop operations on the APD, a virtualization scheduler sends a request to idle the APD. The APD responds by completing work and idling. If one or more portions of the APD do not complete this idling process before a timeout expires, then a hang occurs. In response to the hang, the virtualization scheduler informs the hypervisor that a hang has occurred. The hypervisor performs a function level reset on the APD and informs the VM that the hang has occurred. The VM responds by stopping command issue to the APD and re-initializing the APD for the function.

Type: Application

Filed: July 28, 2017

Publication date: January 17, 2019

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Anthony Asaro, Yinan Jiang, Andy Sung, Ahmed M. Abdelkhalek, Xiaowei Wang, Sidney D. Fortes

EARLY VIRTUALIZATION CONTEXT SWITCH FOR VIRTUALIZED ACCELERATED PROCESSING DEVICE

Publication number: 20190004839

Abstract: A technique for efficient time-division of resources in a virtualized accelerated processing device (“APD”) is provided. In a virtualization scheme implemented on the APD, different virtual machines are assigned different “time-slices” in which to use the APD. When a time-slice expires, the APD performs a virtualization context switch by stopping operations for a current virtual machine (“VM”) and starting operations for another VM. Typically, each VM is assigned a fixed length of time, after which a virtualization context switch is performed. This fixed length of time can lead to inefficiencies. Therefore, in some situations, in response to a VM having no more work to perform on the APD and the APD being idle, a virtualization context switch is performed “early.” This virtualization context switch is “early” in the sense that the virtualization context switch is performed before the fixed length of time for the time-slice expires.
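
The early-switch idea in the abstract above can be illustrated with a minimal sketch. It is a simulation under stated assumptions, not the patented scheduler: the function name `run_time_slices`, the tick-based time model, and the VM names are all hypothetical.

```python
def run_time_slices(work_per_vm, slice_len):
    """Simulate one round of time-slices and return (vm, ticks_used) pairs.

    work_per_vm: hypothetical mapping of VM name -> ticks of pending APD work.
    slice_len:   fixed time-slice length in ticks.
    """
    schedule = []
    for vm, pending in work_per_vm.items():
        # A VM with enough work uses its whole slice. A VM that runs out of
        # work leaves the APD idle, so the switch happens "early".
        used = min(pending, slice_len)
        schedule.append((vm, used))
    return schedule


schedule = run_time_slices({"vm0": 3, "vm1": 10, "vm2": 0}, slice_len=5)
total_ticks = sum(used for _, used in schedule)
```

With fixed-length slices this round would take 15 ticks; in the sketch it takes 8, because `vm0` and `vm2` give up the device as soon as they have nothing left to run.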

Type: Application

Filed: June 29, 2017

Publication date: January 3, 2019

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Gongxian Jeffrey Cheng, Louis Regniere, Anthony Asaro

REGISTER PARTITION AND PROTECTION FOR VIRTUALIZED PROCESSING DEVICE

Publication number: 20190004840

Abstract: A register protection mechanism for a virtualized accelerated processing device (“APD”) is disclosed. The mechanism protects registers of the accelerated processing device designated as physical-function-or-virtual-function registers (“PF-or-VF* registers”), which are single architectural instance registers that are shared among different functions that share the APD in a virtualization scheme whereby each function can maintain a different value in these registers. The protection mechanism for these registers comprises comparing the function associated with the memory address specified by a particular register access request to the “currently active” function for the APD and disallowing the register access request if a match does not occur.

Type: Application

Filed: June 29, 2017

Publication date: January 3, 2019

Applicant: ATI Technologies ULC

Inventors: Anthony Asaro, Yinan Jiang, Kelly Donald Clark Zytaruk

Routing direct memory access requests in a virtualized computing environment

Patent number: 10162765

Abstract: A device may receive a direct memory access request that identifies a virtual address. The device may determine whether the virtual address is within a particular range of virtual addresses. The device may selectively perform a first action or a second action based on determining whether the virtual address is within the particular range of virtual addresses. The first action may include causing a first address translation algorithm to be performed to translate the virtual address to a physical address associated with a memory device when the virtual address is not within the particular range of virtual addresses. The second action may include causing a second address translation algorithm to be performed to translate the virtual address to the physical address when the virtual address is within the particular range of virtual addresses. The second address translation algorithm may be different from the first address translation algorithm.
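
A minimal sketch of the range-based routing described in the abstract above. The two translation algorithms shown (`page_table_walk` and `fixed_offset`), the 4 KiB page size, and all addresses are assumptions chosen for illustration; the abstract only states that the two algorithms differ.

```python
def translate_dma(virtual_addr, special_range, first_translate, second_translate):
    """Pick a translation algorithm based on whether the address is in range."""
    low, high = special_range
    if low <= virtual_addr < high:
        return second_translate(virtual_addr)  # address inside the range
    return first_translate(virtual_addr)       # address outside the range


# Hypothetical first algorithm: single-level page table with 4 KiB pages.
page_table = {0x1: 0x80}  # virtual page number -> physical page number

def page_table_walk(virtual_addr):
    physical_page = page_table[virtual_addr >> 12]
    return (physical_page << 12) | (virtual_addr & 0xFFF)


# Hypothetical second algorithm: fixed remap of the whole range.
SPECIAL_RANGE = (0x10_0000, 0x20_0000)

def fixed_offset(virtual_addr):
    return virtual_addr - SPECIAL_RANGE[0] + 0x4000_0000


outside = translate_dma(0x1234, SPECIAL_RANGE, page_table_walk, fixed_offset)
inside = translate_dma(0x10_0010, SPECIAL_RANGE, page_table_walk, fixed_offset)
```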

Type: Grant

Filed: April 19, 2017

Date of Patent: December 25, 2018

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Andrew G. Kegel, Anthony Asaro

Efficient arbitration for memory accesses

Patent number: 10152434

Abstract: A system and method for efficient arbitration of memory access requests are described. One or more functional units generate memory access requests for a partitioned memory. An arbitration unit stores the generated requests and selects a given one of the stored requests. The arbitration unit identifies a given partition of the memory which stores a memory location targeted by the selected request. The arbitration unit determines whether one or more other stored requests access memory locations in the given partition. The arbitration unit sends each of the selected memory access request and the identified one or more other memory access requests to the memory to be serviced out of order.
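
The grouping step in the abstract above can be sketched as follows. This is an illustration only: the partition size, the address-to-partition mapping, and the choice of the oldest request as the "given" request are assumptions not stated in the abstract.

```python
PARTITION_SIZE = 0x1000  # assumed size of one memory partition


def partition_of(addr):
    """Assumed mapping from an address to its memory partition."""
    return addr // PARTITION_SIZE


def arbitrate(pending):
    """Remove and return the selected request plus all others in its partition.

    pending: list of request addresses, oldest first. The list is updated in
    place so that only requests to other partitions remain.
    """
    if not pending:
        return []
    target = partition_of(pending[0])  # assumption: select the oldest request
    batch = [addr for addr in pending if partition_of(addr) == target]
    pending[:] = [addr for addr in pending if partition_of(addr) != target]
    return batch


pending = [0x1010, 0x2000, 0x1FF0, 0x3000, 0x1100]
batch = arbitrate(pending)
```

Requests `0x1FF0` and `0x1100` are sent ahead of the older `0x2000` because they target the same partition as the selected request, which is the out-of-order servicing the abstract refers to.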

Type: Grant

Filed: December 20, 2016

Date of Patent: December 11, 2018

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Rostyslav Kyrychynskyi, Anthony Asaro, Kostantinos Danny Christidis, Mark Fowler, Michael J. Mantor, Robert Scott Hartog

DIRECT DOORBELL RING IN VIRTUALIZED PROCESSING DEVICE

Publication number: 20180349165

Abstract: A technique for facilitating direct doorbell rings in a virtualized system is provided. A first device is configured to “ring” a “doorbell” of a second device, where both the first and second devices are not a host processor such as a central processing unit and are coupled to an interconnect fabric such as peripheral component interconnect express (“PCIe”). The first device is configured to ring the doorbell of the second device by writing to a doorbell address in a guest physical address space. For security reasons, a check block checks an offset portion of the doorbell address against a set of allowed doorbell addresses for doorbells specified in the guest physical address space, allowing the doorbell to be written if the doorbell is included in the set of allowed doorbell addresses.
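
A minimal sketch of the check described in the abstract above, assuming the offset portion is the low 12 bits of the doorbell address. The mask width, the function name `try_ring`, and the dictionary standing in for doorbell registers are all hypothetical.

```python
DOORBELL_OFFSET_MASK = 0xFFF  # assumed width of the offset portion


def try_ring(doorbell_addr, value, allowed_offsets, doorbells):
    """Write the doorbell only if its offset is in the allowed set."""
    offset = doorbell_addr & DOORBELL_OFFSET_MASK
    if offset not in allowed_offsets:
        return False            # write is dropped by the check block
    doorbells[offset] = value   # the doorbell is "rung"
    return True


doorbells = {}
allowed = {0x040, 0x080}
ok = try_ring(0xF000_0040, 7, allowed, doorbells)
blocked = try_ring(0xF000_00C0, 9, allowed, doorbells)
```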

Type: Application

Filed: May 31, 2017

Publication date: December 6, 2018

Applicant: ATI Technologies ULC

Inventors: Anthony Asaro, Gongxian Jeffrey Cheng

SILENT ACTIVE PAGE MIGRATION FAULTS

Publication number: 20180307414

Abstract: Systems, apparatuses, and methods for migrating memory pages are disclosed herein. In response to detecting that a migration of a first page between memory locations is being initiated, a first page table entry (PTE) corresponding to the first page is located and a migration pending indication is stored in the first PTE. In one embodiment, the migration pending indication is encoded in the first PTE by disabling read and write permissions. If a translation request targeting the first PTE is received by the MMU and the translation request corresponds to a read request, a read operation is allowed to the first page. Otherwise, if the translation request corresponds to a write request, a write operation to the first page is blocked and a silent retry request is generated and conveyed to the requesting client.
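
The read/write handling during a pending migration can be sketched as below. It follows the one embodiment named in the abstract above, where clearing both permission bits encodes the migration-pending indication; the `PTE` class, the result strings, and the ordinary permission-fault path are assumptions added to make the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class PTE:
    phys_page: int
    readable: bool = True
    writable: bool = True

    def migration_pending(self):
        # Embodiment from the abstract: both permissions disabled.
        return not self.readable and not self.writable


def begin_migration(pte):
    """Store the migration-pending indication in the PTE."""
    pte.readable = False
    pte.writable = False


def translate(pte, is_write):
    """Return (status, physical page) for a translation request."""
    if pte.migration_pending():
        if is_write:
            return ("silent_retry", None)  # write blocked, client retries
        return ("ok", pte.phys_page)       # reads proceed during migration
    allowed = pte.writable if is_write else pte.readable
    return ("ok", pte.phys_page) if allowed else ("fault", None)


pte = PTE(phys_page=0x42)
before = translate(pte, is_write=True)
begin_migration(pte)
read_during = translate(pte, is_write=False)
write_during = translate(pte, is_write=True)
```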

Type: Application

Filed: April 24, 2017

Publication date: October 25, 2018

Inventors: Wade K. Smith, Anthony Asaro

FULLY VIRTUALIZED TLBS

Publication number: 20180307622

Abstract: Systems, apparatuses, and methods for implementing a virtualized translation lookaside buffer (TLB) are disclosed herein. In one embodiment, a system includes at least an execution unit and a first TLB. The system supports the execution of a plurality of virtual machines in a virtualization environment. The system detects a translation request generated by a first virtual machine with a first virtual memory identifier (VMID). The translation request is conveyed from the execution unit to the first TLB. The first TLB performs a lookup of its cache using at least a portion of a first virtual address and the first VMID. If the lookup misses in the cache, the first TLB allocates an entry which is addressable by the first virtual address and the first VMID, and the first TLB sends the translation request with the first VMID to a second TLB.
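
A minimal sketch of a lookup keyed by both the virtual address and the VMID, as in the abstract above. The `TLB` class, the unbounded dictionary used as the cache, and the `page_walk` backing function are hypothetical; a hardware TLB would have a fixed capacity and a replacement policy.

```python
class TLB:
    def __init__(self, next_level):
        self.entries = {}             # (vmid, virtual page) -> physical page
        self.next_level = next_level  # callable used on a miss

    def translate(self, vmid, virtual_page):
        key = (vmid, virtual_page)
        if key in self.entries:       # hit: tagged by address and VMID
            return self.entries[key]
        # Miss: forward the request, VMID included, to the next level and
        # allocate an entry addressable by the same (VMID, address) pair.
        physical_page = self.next_level(vmid, virtual_page)
        self.entries[key] = physical_page
        return physical_page


# Hypothetical per-VM mappings and a page walk that records each call.
guest_tables = {(1, 0x10): 0xA0, (2, 0x10): 0xB0}
walks = []

def page_walk(vmid, virtual_page):
    walks.append((vmid, virtual_page))
    return guest_tables[(vmid, virtual_page)]


second_tlb = TLB(page_walk)
first_tlb = TLB(second_tlb.translate)
results = [first_tlb.translate(1, 0x10),
           first_tlb.translate(2, 0x10),
           first_tlb.translate(1, 0x10)]
```

Because the VMID is part of the key, two virtual machines can use the same virtual page number without flushing the TLB between them, and the repeated request from VM 1 is served without a further walk.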

Type: Application

Filed: April 24, 2017

Publication date: October 25, 2018

Inventors: Wade K. Smith, Anthony Asaro

INPUT/OUTPUT MEMORY MAP UNIT AND NORTHBRIDGE

Publication number: 20180307619

Abstract: A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.

Type: Application

Filed: July 2, 2018

Publication date: October 25, 2018

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Vydhyanathan Kalyanasundharam, Philip Ng, Maggie Chan, Vincent Cueva, Anthony Asaro, Jimshed Mirza, Greggory D. Donley, Bryan Broussard, Benjamin Tsien, Yaniv Adiri

TRANSLATE FURTHER MECHANISM

Publication number: 20180300253

Abstract: Systems, apparatuses, and methods for implementing a translate further mechanism are disclosed herein. In one embodiment, a processor detects a hit to a first entry of a page table structure during a first lookup to the page table structure. The processor retrieves a page table entry address from the first entry and uses this address to perform a second lookup to the page table structure responsive to detecting a first indication in the first entry. The processor retrieves a physical address from the first entry and uses the physical address to access the memory subsystem responsive to not detecting the first indication in the first entry. In one embodiment, the first indication is a translate further bit being set. In another embodiment, the first indication is a page directory entry as page table entry field not being activated.
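
The lookup described in the abstract above can be sketched as a short walk. The `Entry` class and the flat dictionary standing in for the page table structure are hypothetical, and following more than one translate-further hop (bounded by `max_steps`) is a generalization assumed here; the abstract describes a first and a second lookup.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    address: int                     # physical address, or address of another entry
    translate_further: bool = False  # the "first indication" in the abstract


def walk(table, index, max_steps=8):
    """Follow translate-further entries until a physical address is found."""
    for _ in range(max_steps):
        entry = table[index]
        if not entry.translate_further:
            return entry.address     # entry holds the physical address
        index = entry.address        # entry holds the next entry's address
    raise RuntimeError("translate-further chain exceeded max_steps")


table = {
    0: Entry(address=5, translate_further=True),  # points at entry 5
    1: Entry(address=0x3000),
    5: Entry(address=0x9000),
}
direct = walk(table, 1)
indirect = walk(table, 0)
```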

Type: Application

Filed: April 13, 2017

Publication date: October 18, 2018

Inventors: Wade K. Smith, Anthony Asaro, Dhirendra Partap Singh Rana

Input/output memory map unit and northbridge

Patent number: 10025721

Abstract: The present invention provides for page table access and dirty bit management in hardware via a new atomic test[0] and OR and Mask. The present invention also provides for a gasket that enables ACE to CCI translations. This gasket further provides request translation between ACE and CCI, deadlock avoidance for victim and probe collision, ARM barrier handling, and power management interactions. The present invention also provides a solution for ARM victim/probe collision handling which deadlocks the unified northbridge. These solutions include a dedicated writeback virtual channel, probes for IO requests using 4-hop protocol, and a WrBack Reorder Ability in MCT where victims update older requests with data as they pass the requests.

Type: Grant

Filed: October 24, 2014

Date of Patent: July 17, 2018

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Vydhyanathan Kalyanasundharam, Philip Ng, Maggie Chan, Vincent Cueva, Anthony Asaro, Jimshed Mirza, Greggory D. Donley, Bryan Broussard, Benjamin Tsien, Yaniv Adiri

HIGH-SPEED SELECTIVE CACHE INVALIDATES AND WRITE-BACKS ON GPUS

Publication number: 20180181488

Abstract: Techniques for performing cache invalidates and write-backs in an accelerated processing device (e.g., a graphics processing device that renders three-dimensional graphics) are disclosed. The techniques involve receiving requests from a “master” (e.g., the central processing unit). The techniques involve invalidating virtual-to-physical address translations in an address translation request. The techniques include splitting up the requests based on whether the requests target virtually or physically tagged caches. Addresses for the portions of a request that target physically tagged caches are translated using invalidated virtual-to-physical address translations for speed. The split up request is processed to generate micro-transactions for individual caches targeted by the request. Micro-transactions for physically and virtually tagged caches are processed in parallel. Once all micro-transactions for a request have been processed, the unit that made the request is notified.

Type: Application

Filed: December 23, 2016

Publication date: June 28, 2018

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Mark Fowler, Jimshed Mirza, Anthony Asaro

EFFICIENT ARBITRATION FOR MEMORY ACCESSES

Publication number: 20180173649

Abstract: A system and method for efficient arbitration of memory access requests are described. One or more functional units generate memory access requests for a partitioned memory. An arbitration unit stores the generated requests and selects a given one of the stored requests. The arbitration unit identifies a given partition of the memory which stores a memory location targeted by the selected request. The arbitration unit determines whether one or more other stored requests access memory locations in the given partition. The arbitration unit sends each of the selected memory access request and the identified one or more other memory access requests to the memory to be serviced out of order.

Type: Application

Filed: December 20, 2016

Publication date: June 21, 2018

Inventors: Rostyslav Kyrychynskyi, Anthony Asaro, Kostantinos Danny Christidis, Mark Fowler, Michael J. Mantor, Robert Scott Hartog

Managing coherent memory between an accelerated processing device and a central processing unit

Patent number: 9965392

Abstract: Existing multiprocessor computing systems often have insufficient memory coherency and, consequently, are unable to efficiently utilize separate memory systems. Specifically, a CPU cannot effectively write to a block of memory and then have a GPU access that memory unless there is explicit synchronization. In addition, because the GPU is forced to statically split memory locations between itself and the CPU, existing multiprocessor computing systems are unable to efficiently utilize the separate memory systems. Embodiments described herein overcome these deficiencies by receiving a notification within the GPU that the CPU has finished processing data that is stored in coherent memory, and invalidating data in the CPU caches that the GPU has finished processing from the coherent memory. Embodiments described herein also include dynamically partitioning a GPU memory into coherent memory and local memory through use of a probe filter.

Type: Grant

Filed: August 24, 2016

Date of Patent: May 8, 2018

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel

Memory controller having plurality of channels that provides simultaneous access to data when accessing unified graphics memory

Patent number: 9959593

Abstract: An apparatus includes a unified system/graphics memory and a memory controller. The memory controller is operative to receive client data access requests associated with one or more clients and a central processing unit (CPU) data access request associated with a CPU, to a plurality of memory channels for accessing the unified system/graphics memory. The memory controller is operative to provide access to the plurality of memory channels, in parallel, by the CPU and at least one client of the one or more clients. The memory controller is operative to prioritize the CPU data access request to the unified memory over the client data access requests to the unified memory and control the plurality of memory channels to access, in parallel, data for the CPU and data for the at least one client based on a request of the client data access requests and the CPU data access request.

Type: Grant

Filed: June 30, 2017

Date of Patent: May 1, 2018

Assignee: ATI Technologies ULC

Inventors: Milivoje Aleksic, Raymond M. Li, Danny H. M. Cheng, Carl K. Mizuyabu, Anthony Asaro

Cache access statistics accumulation for cache line replacement selection

Patent number: 9910788

Abstract: A processor device includes a cache and a memory storing a set of counters. Each counter of the set is associated with a corresponding block of a plurality of blocks of the cache. The processor device further includes a cache access monitor to, for each time quantum for a series of one or more time quanta, increment counter values of the set of counters based on accesses to the corresponding blocks of the cache. The processor device further includes a transfer engine to, after completion of each time quantum, transfer the counter values of the set of counters for the time quantum to a corresponding location in a system memory.
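
A minimal sketch of per-block counters that are copied out after each time quantum, as in the abstract above. The `AccessMonitor` class, the list standing in for system memory, and the reset of the counters at the start of each new quantum are assumptions; the abstract does not say whether counters are cleared.

```python
class AccessMonitor:
    def __init__(self, num_blocks):
        self.counters = [0] * num_blocks  # one counter per cache block
        self.system_memory = []           # one snapshot per finished quantum

    def record_access(self, block):
        """Increment the counter of the block that was accessed."""
        self.counters[block] += 1

    def end_quantum(self):
        """Transfer this quantum's counter values to system memory."""
        self.system_memory.append(list(self.counters))
        self.counters = [0] * len(self.counters)  # assumed reset


monitor = AccessMonitor(num_blocks=4)
for block in (0, 0, 3):
    monitor.record_access(block)
monitor.end_quantum()
monitor.record_access(1)
monitor.end_quantum()
```

Software could then read the per-quantum snapshots to decide which blocks are cold enough to be replacement candidates.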

Type: Grant

Filed: September 22, 2015

Date of Patent: March 6, 2018

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Philip J. Rogers, Benjamin T. Sander, Anthony Asaro

DDR MEMORY ERROR RECOVERY

Publication number: 20180018221

Abstract: In one form, a memory controller includes a command queue, an arbiter, and a replay queue. The command queue receives and stores memory access requests. The arbiter is coupled to the command queue for providing a sequence of memory commands to a memory channel. The replay queue stores the sequence of memory commands to the memory channel, and continues to store memory access commands that have not yet received responses from the memory channel. When a response indicates a completion of a corresponding memory command without any error, the replay queue removes the corresponding memory command without taking further action. When a response indicates a completion of the corresponding memory command with an error, the replay queue replays at least the corresponding memory command. In another form, a data processing system includes the memory controller, a memory accessing agent, and a memory system to which the memory controller is coupled.
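
The replay queue behavior in the abstract above can be sketched as follows. The `ReplayQueue` class and the command strings are hypothetical. The abstract says the queue replays "at least" the failing command; replaying every younger in-flight command as well is an assumption made here.

```python
class ReplayQueue:
    def __init__(self):
        self.in_flight = {}  # command id -> command, in issue order

    def issue(self, cmd_id, command):
        """Record a command sent to the memory channel."""
        self.in_flight[cmd_id] = command

    def on_response(self, cmd_id, error):
        """Handle a response and return the list of commands to replay."""
        if not error:
            del self.in_flight[cmd_id]  # clean completion, nothing to replay
            return []
        # Error: keep the commands queued and replay the failing command
        # together with everything issued after it (assumed policy).
        ids = list(self.in_flight)
        return [self.in_flight[i] for i in ids[ids.index(cmd_id):]]


queue = ReplayQueue()
queue.issue(1, "RD A")
queue.issue(2, "WR B")
queue.issue(3, "RD C")
clean = queue.on_response(1, error=False)
replayed = queue.on_response(2, error=True)
```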

Type: Application

Filed: December 9, 2016

Publication date: January 18, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: James R. Magro, Ruihua Peng, Anthony Asaro, Kedarnath Balakrishnan, Scott P. Murphy, YuBin Yao

MEMORY HEAPS IN A MEMORY MODEL FOR A UNIFIED COMPUTING SYSTEM

Publication number: 20180011798

Abstract: A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a first processor configured for unified operation with a second processor. The method includes receiving a memory operation from a processor and mapping the memory operation to one of a plurality of memory heaps. The mapping produces a mapping result. The method also includes providing the mapping result to the processor.

Type: Application

Filed: September 5, 2017

Publication date: January 11, 2018

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel

MEMORY DEVICE FOR PROVIDING DATA IN A GRAPHICS SYSTEM AND METHOD AND APPARATUS THEREOF

Publication number: 20170301058

Abstract: An apparatus includes a unified system/graphics memory and a memory controller. The memory controller is operative to receive client data access requests associated with one or more clients and a central processing unit (CPU) data access request associated with a CPU, to a plurality of memory channels for accessing the unified system/graphics memory. The memory controller is operative to provide access to the plurality of memory channels, in parallel, by the CPU and at least one client of the one or more clients. The memory controller is operative to prioritize the CPU data access request to the unified memory over the client data access requests to the unified memory and control the plurality of memory channels to access, in parallel, data for the CPU and data for the at least one client based on a request of the client data access requests and the CPU data access request.

Type: Application

Filed: June 30, 2017

Publication date: October 19, 2017

Inventors: Milivoje Aleksic, Raymond M. Li, Danny H. M. Cheng, Carl K. Mizuyabu, Anthony Asaro
