Patents by Inventor Anthony Asaro

Anthony Asaro has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9734549
    Abstract: A central processor unit (CPU) is connected to a system/graphics controller generally comprising a monolithic semiconductor device. The system/graphics controller is connected to an input output (IO) controller via a high-speed PCI bus. The IO controller interfaces to the system graphics controller via the high-speed PCI bus. The IO controller includes a lower speed PCI port controlled by an arbiter within the IO controller. Generally, the low speed PCI arbiter of the IO controller will interface to standard 33 MHz PCI cards. In addition, the IO controller interfaces to an external storage device, such as a hard drive, via either a standard or a proprietary bus protocol. A unified system/graphics memory which is accessed by the system/graphics controller. The unified memory contains both system data and graphics data. In a specific embodiment, two channels, CH0 and CH1 access the unified memory.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: August 15, 2017
    Assignee: ATI Technologies ULC
    Inventors: Milivoje Aleksic, Raymond M. Li, Danny H. M. Cheng, Carl K. Mizuyabu, Anthony Asaro
  • Publication number: 20170220485
    Abstract: A device may receive a direct memory access request that identifies a virtual address. The device may determine whether the virtual address is within a particular range of virtual addresses. The device may selectively perform a first action or a second action based on determining whether the virtual address is within the particular range of virtual addresses. The first action may include causing a first address translation algorithm to be performed to translate the virtual address to a physical address associated with a memory device when the virtual address is not within the particular range of virtual addresses. The second action may include causing a second address translation algorithm to be performed to translate the virtual address to the physical address when the virtual address is within the particular range of virtual addresses. The second address translation algorithm may be different from the first address translation algorithm.
    Type: Application
    Filed: April 19, 2017
    Publication date: August 3, 2017
    Inventors: Andrew G. KEGEL, Anthony Asaro
  • Publication number: 20170212760
    Abstract: A system and method for reducing latencies of main memory data accesses are described. A non-blocking load (NBLD) instruction identifies an address of requested data and a subroutine. The subroutine includes instructions dependent on the requested data. A processing unit verifies that address translations are available for both the address and the subroutine. The processing unit continues processing instructions with no stalls caused by younger-in-program-order instructions waiting for the requested data. The non-blocking load unit performs a cache coherent data read request on behalf of the NBLD instruction and requests that the processing unit perform an asynchronous jump to the subroutine upon return of the requested data from lower-level memory.
    Type: Application
    Filed: November 16, 2016
    Publication date: July 27, 2017
    Inventors: Meenakshi Sundaram Bhaskaran, Elliot H. Mednick, David A. Roberts, Anthony Asaro, Amin Farmahini-Farahani
  • Publication number: 20170083240
    Abstract: A memory manager of a processor identifies a block of data for eviction from a first memory module to a second memory module. In response, the processor copies only those portions of the data block that have been identified as modified portions to the second memory module. The amount of data to be copied is thereby reduced, improving memory management efficiency and reducing processor power consumption.
    Type: Application
    Filed: September 23, 2015
    Publication date: March 23, 2017
    Inventors: Philip Rogers, Benjamin T. Sander, Anthony Asaro, Gongxian Jeffrey Cheng
  • Publication number: 20170083455
    Abstract: A processor device includes a cache and a memory storing a set of counters. Each counter of the set is associated with a corresponding block of a plurality of blocks of the cache. The processor device further includes a cache access monitor to, for each time quantum for a series of one or more time quanta, increment counter values of the set of counters based on accesses to the corresponding blocks of the cache. The processor device further includes a transfer engine to, after completion of each time quantum, transfer the counter values of the set of counters for the time quantum to a corresponding location in a system memory.
    Type: Application
    Filed: September 22, 2015
    Publication date: March 23, 2017
    Inventors: Philip J. Rogers, Benjamin T. Sander, Anthony Asaro
  • Publication number: 20160378682
    Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
    Type: Application
    Filed: June 23, 2015
    Publication date: December 29, 2016
    Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Mike Mantor
  • Publication number: 20160378674
    Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.
    Type: Application
    Filed: June 23, 2015
    Publication date: December 29, 2016
    Inventors: Gongxian Jeffrey Cheng, Mark Fowler, Philip J. Rogers, Benjamin T. Sander, Anthony Asaro, Mike Mantor, Raja Koduri
  • Publication number: 20160371197
    Abstract: A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a first processor configured for unified operation with a second processor. The method includes receiving a memory operation from a processor and mapping the memory operation to one of a plurality of memory heaps. The mapping produces a mapping result. The method also includes providing the mapping result to the processor.
    Type: Application
    Filed: September 1, 2016
    Publication date: December 22, 2016
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Anthony ASARO, Kevin NORMOYLE, Mark HUMMEL
  • Publication number: 20160364334
    Abstract: Existing multiprocessor computing systems often have insufficient memory coherency and, consequently, are unable to efficiently utilize separate memory systems. Specifically, a CPU cannot effectively write to a block of memory and then have a GPU access that memory unless there is explicit synchronization. In addition, because the GPU is forced to statically split memory locations between itself and the CPU, existing multiprocessor computing systems are unable to efficiently utilize the separate memory systems. Embodiments described herein overcome these deficiencies by receiving a notification within the GPU that the CPU has finished processing data that is stored in coherent memory, and invalidating data in the CPU caches that the GPU has finished processing from the coherent memory. Embodiments described herein also include dynamically partitioning a GPU memory into coherent memory and local memory through use of a probe filter.
    Type: Application
    Filed: August 24, 2016
    Publication date: December 15, 2016
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel
  • Patent number: 9448930
    Abstract: A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a first processor configured for unified operation with a second processor. The method includes receiving a memory operation from a processor and mapping the memory operation to one of a plurality of memory heaps. The mapping produces a mapping result. The method also includes providing the mapping result to the processor.
    Type: Grant
    Filed: August 24, 2015
    Date of Patent: September 20, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel
  • Patent number: 9430391
    Abstract: Existing multiprocessor computing systems often have insufficient memory coherency and, consequently, are unable to efficiently utilize separate memory systems. Specifically, a CPU cannot effectively write to a block of memory and then have a GPU access that memory unless there is explicit synchronization. In addition, because the GPU is forced to statically split memory locations between itself and the CPU, existing multiprocessor computing systems are unable to efficiently utilize the separate memory systems. Embodiments described herein overcome these deficiencies by receiving a notification within the GPU that the CPU has finished processing data that is stored in coherent memory, and invalidating data in the CPU caches that the GPU has finished processing from the coherent memory. Embodiments described herein also include dynamically partitioning a GPU memory into coherent memory and local memory through use of a probe filter.
    Type: Grant
    Filed: August 31, 2012
    Date of Patent: August 30, 2016
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel
  • Publication number: 20160062911
    Abstract: A device may receive a direct memory access request that identifies a virtual address. The device may determine whether the virtual address is within a particular range of virtual addresses. The device may selectively perform a first action or a second action based on determining whether the virtual address is within the particular range of virtual addresses. The first action may include causing a first address translation algorithm to be performed to translate the virtual address to a physical address associated with a memory device when the virtual address is not within the particular range of virtual addresses. The second action may include causing a second address translation algorithm to be performed to translate the virtual address to the physical address when the virtual address is within the particular range of virtual addresses. The second address translation algorithm may be different from the first address translation algorithm.
    Type: Application
    Filed: August 27, 2014
    Publication date: March 3, 2016
    Inventors: Andrew G. KEGEL, Anthony ASARO
  • Publication number: 20150363310
    Abstract: A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a first processor configured for unified operation with a second processor. The method includes receiving a memory operation from a processor and mapping the memory operation to one of a plurality of memory heaps. The mapping produces a mapping result. The method also includes providing the mapping result to the processor.
    Type: Application
    Filed: August 24, 2015
    Publication date: December 17, 2015
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Anthony ASARO, Kevin NORMOYLE, Mark HUMMEL
  • Patent number: 9201682
    Abstract: In a hardware-based virtualization system, a hypervisor switches out of a first function into a second function. The first function is one of a physical function and a virtual function and the second function is one of a physical function and a virtual function. During the switching a malfunction of the first function is detected. The first function is reset without resetting the second function. The switching, detecting, and resetting operations are performed by a hypervisor of the hardware-based virtualization system. Embodiments further include a communication mechanism for the hypervisor to notify a driver of the function that was reset to enable the driver to restore the function without delay.
    Type: Grant
    Filed: June 21, 2013
    Date of Patent: December 1, 2015
    Assignee: ATI Technologies ULC
    Inventors: Gongxian Jeffrey Cheng, Anthony Asaro, Yinan Jiang
  • Patent number: 9152571
    Abstract: An input/output memory management unit (IOMMU) having an “invalidate all” command available to clear the contents of cache memory is presented. The cache memory provides fast access to address translation data that has been previously obtained by a process. A typical cache memory includes device tables, page tables and interrupt remapping entries. Cache memory data can become stale or be compromised from security breaches or malfunctioning devices. In these circumstances, a rapid approach to clearing cache memory content is provided.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: October 6, 2015
    Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Andrew G. Kegel, Mark D. Hummel, Anthony Asaro
  • Patent number: 9116809
    Abstract: A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a first processor configured for unified operation with a second processor. The method includes receiving a memory operation from a processor and mapping the memory operation to one of a plurality of memory heaps. The mapping produces a mapping result. The method also includes providing the mapping result to the processor.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: August 25, 2015
    Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel
  • Publication number: 20150120978
    Abstract: The present invention provides for page table access and dirty bit management in hardware via a new atomic test[0] and OR and Mask. The present invention also provides for a gasket that enables ACE to CCI translations. This gasket further provides request translation between ACE and CCI, deadlock avoidance for victim and probe collision, ARM barrier handling, and power management interactions. The present invention also provides a solution for ARM victim/probe collision handling which deadlocks the unified northbridge. These solutions includes a dedicated writeback virtual channel, probes for IO requests using 4-hop protocol, and a WrBack Reorder Ability in MCT where victims update older requests with data as they pass the requests.
    Type: Application
    Filed: October 24, 2014
    Publication date: April 30, 2015
    Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Vydhyanathan Kalyanasundharam, Philip Ng, Maggie Chan, Vincent Cueva, Liang Chen, Anthony Asaro, Jimshed Mirza, Greggory D. Donley, Bryan Broussard, Benjamin Tsien, Yaniv Adiri
  • Patent number: 9009419
    Abstract: Methods and systems are provided for mapping a memory instruction to a shared memory address space in a computer arrangement having a CPU and an APD. A method includes receiving a memory instruction that refers to an address in the shared memory address space, mapping the memory instruction based on the address to a memory resource associated with either the CPU or the APD, and performing the memory instruction based on the mapping.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: April 14, 2015
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Anthony Asaro, Kevin Normoyle, Mark D. Hummel, Mark Fowler
  • Patent number: 8984511
    Abstract: Provided is a method of permitting the reordering of a visibility order of operations in a computer arrangement configured for permitting a first processor and a second processor threads to access a shared memory. The method includes receiving in a program order, a first and a second operation in a first thread and permitting the reordering of the visibility order for the operations in the shared memory based on the class of each operation. The visibility order determines the visibility in the shared memory, by a second thread, of stored results from the execution of the first and second operations.
    Type: Grant
    Filed: August 17, 2012
    Date of Patent: March 17, 2015
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel
  • Patent number: 8935475
    Abstract: Embodiments of the present invention provides for the execution of threads and/or workitems on multiple processors of a heterogeneous computing system in a manner that they can share data correctly and efficiently. Disclosed method, system, and article of manufacture embodiments include, responsive to an instruction from a sequence of instructions of a work-item, determining an ordering of visibility to other work-items of one or more other data items in relation to a particular data item, and performing at least one cache operation upon at least one of the particular data item or the other data items present in any one or more cache memories in accordance with the determined ordering. The semantics of the instruction includes a memory operation upon the particular data item.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: January 13, 2015
    Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel, Norman Rubin, Mark Fowler