Patents by Inventor Joseph L. Greathouse
Joseph L. Greathouse has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240419358Abstract: A method for hardware management of DMA transfer commands includes accessing, by a first DMA engine, a DMA transfer command and determining a first portion of a data transfer requested by the DMA transfer command. Transfer of a first portion of the data transfer by the first DMA engine is initiated based at least in part on the DMA transfer command. Similarly, a second portion of the data transfer by a second DMA engine is initiated based at least in part on the DMA transfer command. After transferring the first portion and the second portion of the data transfer, an indication is generated that signals completion of the data transfer requested by the DMA transfer command.Type: ApplicationFiled: May 16, 2024Publication date: December 19, 2024Inventors: Joseph L. GREATHOUSE, Sean KEELY, Alan D. SMITH, Anthony ASARO, Ling-Ling WANG, Milind N NEMLEKAR, Hari THANGIRALA, Felix KUEHLING
-
Publication number: 20240330046Abstract: A processing device includes a hardware scheduler, an unmapped queue unit, and command processor, and a plurality of compute units. Responsive to a queue doorbell being an unmapped queue doorbell, the unmapped queue unit is configured to transmit a signal to the hardware scheduler indicating work has been placed into a queue currently unmapped to a hardware queue of the processing device. The hardware scheduler is configured to map the queue to a hardware queue of a plurality of hardware queues at the processing device in response to the signal. The command processor is configured to dispatch the work associated with the mapped queue to one or more compute units of the plurality of compute units.Type: ApplicationFiled: September 29, 2023Publication date: October 3, 2024Inventors: Alexander Fuad Ashkar, Joseph L. Greathouse, Manu Rastogi
-
Publication number: 20240202047Abstract: The disclosed computer-implemented method can include reaching, by a chiplet involved in carrying out an operation for a process, a synchronization barrier. The method can additionally include receiving, by the chiplet, dedicated control messages pushed to the chiplet by other chiplets involved in carrying out the operation for the process, wherein the dedicated control messages are pushed over a control network by the other chiplets. The method can also include advancing, by the chiplet, the synchronization barrier in response to receipt of the dedicated control messages. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 16, 2022Publication date: June 20, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Joseph L. Greathouse, Alan D. Smith, Anthony Asaro, Kostantinos Danny Christidis, Alexander Fuad Ashkar, Milind N. Nemlekar
-
Patent number: 11995351Abstract: A method for hardware management of DMA transfer commands includes accessing, by a first DMA engine, a DMA transfer command and determining a first portion of a data transfer requested by the DMA transfer command. Transfer of a first portion of the data transfer by the first DMA engine is initiated based at least in part on the DMA transfer command. Similarly, a second portion of the data transfer by a second DMA engine is initiated based at least in part on the DMA transfer command. After transferring the first portion and the second portion of the data transfer, an indication is generated that signals completion of the data transfer requested by the DMA transfer command.Type: GrantFiled: November 1, 2021Date of Patent: May 28, 2024Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULCInventors: Joseph L Greathouse, Sean Keely, Alan D. Smith, Anthony Asaro, Ling-Ling Wang, Milind N Nemlekar, Hari Thangirala, Felix Kuehling
-
Patent number: 11972261Abstract: A system includes a hardware compare and swap (CAS) module communicatively coupled to a bus, the CAS module to perform an atomic operation in response to a first request from a first request agent for the atomic operation to be performed on a data value that is shared among a plurality of request agents and obtain a first result value. The atomic operation includes initiating a CAS command via the bus. The CAS module performs the atomic operation in response to a second request from a second request agent and obtains a second result value. Responsive to determining a failure to successfully process one or more of the first request or the second request, the hardware CAS module repetitively performs the atomic operation, for one or more of the first request or the second request.Type: GrantFiled: December 27, 2021Date of Patent: April 30, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Vydhyanathan Kalyanasundharam, Joseph L. Greathouse, Shyam Sekhar
-
Patent number: 11853734Abstract: A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.Type: GrantFiled: May 10, 2022Date of Patent: December 26, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Gregory P. Rodgers, Joseph L. Greathouse
-
Publication number: 20230384947Abstract: Systems and methods for dynamic repartitioning of physical memory address mapping involve relocating data stored at one or more physical memory locations of one or more memory devices to another memory device or mass storage device, repartitioning one or more corresponding physical memory maps to include new mappings between physical memory addresses and physical memory locations of the one or more memory devices, then loading the relocated data back onto the one or more memory devices at physical memory locations determined by the new physical address mapping. Such dynamic repartitioning of the physical memory address mapping does not require a processing system to be rebooted and has various applications in connection with interleaving reconfiguration and error correcting code (ECC) reconfiguration of the processing system.Type: ApplicationFiled: June 12, 2023Publication date: November 30, 2023Inventors: Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro
-
Publication number: 20230205608Abstract: A disclosed technique includes executing, for a first wavefront, a barrier arrival notification instruction, for a first barrier, indicating arrival at a first barrier point; performing, for the first wavefront, work prior to the first barrier point; executing, for the first wavefront, a barrier check instruction; and executing, for the first wavefront, at a control flow path based on a result of the barrier check instruction.Type: ApplicationFiled: December 27, 2021Publication date: June 29, 2023Applicant: Advanced Micro Devices, Inc.Inventors: Brian Emberling, Joseph L. Greathouse
-
Publication number: 20230205523Abstract: A system includes a hardware compare and swap (CAS) module communicatively coupled to a bus, the CAS module to perform an atomic operation in response to a first request from a first request agent for the atomic operation to be performed on a data value that is shared among a plurality of request agents and obtain a first result value. The atomic operation includes initiating a CAS command via the bus. The CAS module performs the atomic operation in response to a second request from a second request agent and obtains a second result value. Responsive to determining a failure to successfully process one or more of the first request or the second request, the hardware CAS module repetitively performs the atomic operation, for one or more of the first request or the second request.Type: ApplicationFiled: December 27, 2021Publication date: June 29, 2023Inventors: Vydhyanathan KALYANASUNDHARAM, Joseph L. GREATHOUSE, Shyam SEKAR
-
Patent number: 11687251Abstract: Systems and methods for dynamic repartitioning of physical memory address mapping involve relocating data stored at one or more physical memory locations of one or more memory devices to another memory device or mass storage device, repartitioning one or more corresponding physical memory maps to include new mappings between physical memory addresses and physical memory locations of the one or more memory devices, then loading the relocated data back onto the one or more memory devices at physical memory locations determined by the new physical address mapping. Such dynamic repartitioning of the physical memory address mapping does not require a processing system to be rebooted and has various applications in connection with interleaving reconfiguration and error correcting code (ECC) reconfiguration of the processing system.Type: GrantFiled: September 28, 2021Date of Patent: June 27, 2023Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro
-
Publication number: 20230195664Abstract: A method for software management of DMA transfer commands includes receiving a DMA transfer command instructing a data transfer by a first processor device. Based at least in part on a determination of runtime system resource availability, a device different from the first processor device is assigned to assist in transfer of at least a first portion of the data transfer. In some embodiments, the DMA transfer command instructs the first processor device to write a copy of data to a third processor device. Software analyzes network bus congestion at a shared communications bus and initiates DMA transfer via a multi-hop communications path to bypass the congested network bus.Type: ApplicationFiled: December 22, 2021Publication date: June 22, 2023Inventors: Sean KEELY, Joseph L. GREATHOUSE, Hari THANGIRALA, Alan D. SMITH, Milind N. NEMLEKAR
-
Publication number: 20230132931Abstract: A method for hardware management of DMA transfer commands includes accessing, by a first DMA engine, a DMA transfer command and determining a first portion of a data transfer requested by the DMA transfer command. Transfer of a first portion of the data transfer by the first DMA engine is initiated based at least in part on the DMA transfer command. Similarly, a second portion of the data transfer by a second DMA engine is initiated based at least in part on the DMA transfer command. After transferring the first portion and the second portion of the data transfer, an indication is generated that signals completion of the data transfer requested by the DMA transfer command.Type: ApplicationFiled: November 1, 2021Publication date: May 4, 2023Inventors: Joseph L. Greathouse, Sean Keely, Alan D. Smith, Anthony Asaro, Ling-Ling Wang, Milind N. Nemlekar, Hari Thangirala, Felix Kuehling
-
Publication number: 20230108234Abstract: A processing system adjusts an operating state of one or more processors during execution of a workload based on a command paired with the workload. The command specifies a desired operating state or a performance or power efficiency target operational goal and is enqueued asynchronously with the workload. A power management controller reads the command synchronously with dispatching the workload to the processor. By asynchronously enqueuing the tag with the workload, the processing system tunes the operating state of the processor to reach higher performance, higher performance per watt, and/or higher energy efficiency during processing of the workload.Type: ApplicationFiled: September 28, 2021Publication date: April 6, 2023Inventors: Joseph L. Greathouse, Stephen Kushnir, Karthik Rao, Leopold Grinberg
-
Publication number: 20230097344Abstract: Systems and methods for dynamic repartitioning of physical memory address mapping involve relocating data stored at one or more physical memory locations of one or more memory devices to another memory device or mass storage device, repartitioning one or more corresponding physical memory maps to include new mappings between physical memory addresses and physical memory locations of the one or more memory devices, then loading the relocated data back onto the one or more memory devices at physical memory locations determined by the new physical address mapping. Such dynamic repartitioning of the physical memory address mapping does not require a processing system to be rebooted and has various applications in connection with interleaving reconfiguration and error correcting code (ECC) reconfiguration of the processing system.Type: ApplicationFiled: September 28, 2021Publication date: March 30, 2023Inventors: Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro
-
Patent number: 11604737Abstract: A processing device determines a scope indicating at least a portion of the processing system and target data from atomic memory operation to be performed. Based on the scope, the processing device determines one or more hardware parameters for at least a portion of the processing system. The processing device then compares the hardware parameters to the scope and target data to determine one or more corrections. The processing device then provides the scope, target data, hardware parameters, and corrections to a plurality of hardware lookup tables. The hardware lookup tables are configured to receive the scope, target data, hardware parameters, and corrections as inputs and output values indicating one or more coherency actions and one or more orderings. The processing device then executes one or more of the indicated coherency actions and the atomic memory operation based on the indicated ordering.Type: GrantFiled: November 2, 2021Date of Patent: March 14, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Joseph L. Greathouse, Steven Tony Tye, Mark Fowler, Milind N. Nemlekar
-
Patent number: 11556250Abstract: A system including a stack of two or more layers of volatile memory, such as layers of a 3D stacked DRAM memory, places data in the stack based on a temperature or a refresh rate. When a threshold is exceeded, data are moved from a first region to a second region in the stack, the second region having one or both of a second temperature lower than a first temperature of the first region or a second refresh rate lower than a first refresh rate of the first region.Type: GrantFiled: July 27, 2020Date of Patent: January 17, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Jagadish B. Kotra, Karthik Rao, Joseph L. Greathouse
-
Patent number: 11556162Abstract: A processor utilizes instruction based sampling to generate sampling data sampled on a per instruction basis during execution of an instruction. The sampling data indicates what processor hardware was used due to the execution of the instruction. Software receives the sampling data and generates an estimate of energy used by the instruction based on the sampling data. The sampling data may include microarchitectural events and the energy estimate utilizes a base energy amount corresponding to the instruction executed along with energy amounts corresponding to the microarchitectural events in the sampling data. The sampling data may include switching events associated with hardware blocks that switched due to execution of the instruction and the energy estimate for the instruction is based on the switching events and capacitance estimates associated with the hardware blocks.Type: GrantFiled: March 16, 2018Date of Patent: January 17, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Shijia Wei, Joseph L. Greathouse, John Kalamatianos
-
Publication number: 20220318056Abstract: A method for reducing power variations resulting from changes in processor workload includes communicating a power dip condition to a workload scheduler of a processor device in response to identifying the power dip condition. One or more target power workloads are assigned for execution at the processor device based at least in part on the power dip condition. Further, each of the one or more target power workloads is associated with a known power load.Type: ApplicationFiled: March 30, 2021Publication date: October 6, 2022Inventors: Nicholas Penha MALAYA, Stephen KUSHNIR, William C. BRANTLEY, Joseph L. GREATHOUSE
-
Publication number: 20220269492Abstract: A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.Type: ApplicationFiled: May 10, 2022Publication date: August 25, 2022Inventors: Gregory P. RODGERS, Joseph L. GREATHOUSE
-
Patent number: 11347486Abstract: A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.Type: GrantFiled: March 27, 2020Date of Patent: May 31, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Gregory P. Rodgers, Joseph L. Greathouse