Patents by Inventor Joseph L. Greathouse

Joseph L. Greathouse has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11853734
    Abstract: A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.
    Type: Grant
    Filed: May 10, 2022
    Date of Patent: December 26, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gregory P. Rodgers, Joseph L. Greathouse
  • Publication number: 20230384947
    Abstract: Systems and methods for dynamic repartitioning of physical memory address mapping involve relocating data stored at one or more physical memory locations of one or more memory devices to another memory device or mass storage device, repartitioning one or more corresponding physical memory maps to include new mappings between physical memory addresses and physical memory locations of the one or more memory devices, then loading the relocated data back onto the one or more memory devices at physical memory locations determined by the new physical address mapping. Such dynamic repartitioning of the physical memory address mapping does not require a processing system to be rebooted and has various applications in connection with interleaving reconfiguration and error correcting code (ECC) reconfiguration of the processing system.
    Type: Application
    Filed: June 12, 2023
    Publication date: November 30, 2023
    Inventors: Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro
  • Publication number: 20230205523
    Abstract: A system includes a hardware compare and swap (CAS) module communicatively coupled to a bus, the CAS module to perform an atomic operation in response to a first request from a first request agent for the atomic operation to be performed on a data value that is shared among a plurality of request agents and obtain a first result value. The atomic operation includes initiating a CAS command via the bus. The CAS module performs the atomic operation in response to a second request from a second request agent and obtains a second result value. Responsive to determining a failure to successfully process one or more of the first request or the second request, the hardware CAS module repetitively performs the atomic operation, for one or more of the first request or the second request.
    Type: Application
    Filed: December 27, 2021
    Publication date: June 29, 2023
    Inventors: Vydhyanathan KALYANASUNDHARAM, Joseph L. GREATHOUSE, Shyam SEKAR
  • Publication number: 20230205608
    Abstract: A disclosed technique includes executing, for a first wavefront, a barrier arrival notification instruction, for a first barrier, indicating arrival at a first barrier point; performing, for the first wavefront, work prior to the first barrier point; executing, for the first wavefront, a barrier check instruction; and executing, for the first wavefront, at a control flow path based on a result of the barrier check instruction.
    Type: Application
    Filed: December 27, 2021
    Publication date: June 29, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Brian Emberling, Joseph L. Greathouse
  • Patent number: 11687251
    Abstract: Systems and methods for dynamic repartitioning of physical memory address mapping involve relocating data stored at one or more physical memory locations of one or more memory devices to another memory device or mass storage device, repartitioning one or more corresponding physical memory maps to include new mappings between physical memory addresses and physical memory locations of the one or more memory devices, then loading the relocated data back onto the one or more memory devices at physical memory locations determined by the new physical address mapping. Such dynamic repartitioning of the physical memory address mapping does not require a processing system to be rebooted and has various applications in connection with interleaving reconfiguration and error correcting code (ECC) reconfiguration of the processing system.
    Type: Grant
    Filed: September 28, 2021
    Date of Patent: June 27, 2023
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro
  • Publication number: 20230195664
    Abstract: A method for software management of DMA transfer commands includes receiving a DMA transfer command instructing a data transfer by a first processor device. Based at least in part on a determination of runtime system resource availability, a device different from the first processor device is assigned to assist in transfer of at least a first portion of the data transfer. In some embodiments, the DMA transfer command instructs the first processor device to write a copy of data to a third processor device. Software analyzes network bus congestion at a shared communications bus and initiates DMA transfer via a multi-hop communications path to bypass the congested network bus.
    Type: Application
    Filed: December 22, 2021
    Publication date: June 22, 2023
    Inventors: Sean KEELY, Joseph L. GREATHOUSE, Hari THANGIRALA, Alan D. SMITH, Milind N. NEMLEKAR
  • Publication number: 20230132931
    Abstract: A method for hardware management of DMA transfer commands includes accessing, by a first DMA engine, a DMA transfer command and determining a first portion of a data transfer requested by the DMA transfer command. Transfer of a first portion of the data transfer by the first DMA engine is initiated based at least in part on the DMA transfer command. Similarly, a second portion of the data transfer by a second DMA engine is initiated based at least in part on the DMA transfer command. After transferring the first portion and the second portion of the data transfer, an indication is generated that signals completion of the data transfer requested by the DMA transfer command.
    Type: Application
    Filed: November 1, 2021
    Publication date: May 4, 2023
    Inventors: Joseph L. Greathouse, Sean Keely, Alan D. Smith, Anthony Asaro, Ling-Ling Wang, Milind N. Nemlekar, Hari Thangirala, Felix Kuehling
  • Publication number: 20230108234
    Abstract: A processing system adjusts an operating state of one or more processors during execution of a workload based on a command paired with the workload. The command specifies a desired operating state or a performance or power efficiency target operational goal and is enqueued asynchronously with the workload. A power management controller reads the command synchronously with dispatching the workload to the processor. By asynchronously enqueuing the tag with the workload, the processing system tunes the operating state of the processor to reach higher performance, higher performance per watt, and/or higher energy efficiency during processing of the workload.
    Type: Application
    Filed: September 28, 2021
    Publication date: April 6, 2023
    Inventors: Joseph L. Greathouse, Stephen Kushnir, Karthik Rao, Leopold Grinberg
  • Publication number: 20230097344
    Abstract: Systems and methods for dynamic repartitioning of physical memory address mapping involve relocating data stored at one or more physical memory locations of one or more memory devices to another memory device or mass storage device, repartitioning one or more corresponding physical memory maps to include new mappings between physical memory addresses and physical memory locations of the one or more memory devices, then loading the relocated data back onto the one or more memory devices at physical memory locations determined by the new physical address mapping. Such dynamic repartitioning of the physical memory address mapping does not require a processing system to be rebooted and has various applications in connection with interleaving reconfiguration and error correcting code (ECC) reconfiguration of the processing system.
    Type: Application
    Filed: September 28, 2021
    Publication date: March 30, 2023
    Inventors: Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro
  • Patent number: 11604737
    Abstract: A processing device determines a scope indicating at least a portion of the processing system and target data from atomic memory operation to be performed. Based on the scope, the processing device determines one or more hardware parameters for at least a portion of the processing system. The processing device then compares the hardware parameters to the scope and target data to determine one or more corrections. The processing device then provides the scope, target data, hardware parameters, and corrections to a plurality of hardware lookup tables. The hardware lookup tables are configured to receive the scope, target data, hardware parameters, and corrections as inputs and output values indicating one or more coherency actions and one or more orderings. The processing device then executes one or more of the indicated coherency actions and the atomic memory operation based on the indicated ordering.
    Type: Grant
    Filed: November 2, 2021
    Date of Patent: March 14, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Joseph L. Greathouse, Steven Tony Tye, Mark Fowler, Milind N. Nemlekar
  • Patent number: 11556250
    Abstract: A system including a stack of two or more layers of volatile memory, such as layers of a 3D stacked DRAM memory, places data in the stack based on a temperature or a refresh rate. When a threshold is exceeded, data are moved from a first region to a second region in the stack, the second region having one or both of a second temperature lower than a first temperature of the first region or a second refresh rate lower than a first refresh rate of the first region.
    Type: Grant
    Filed: July 27, 2020
    Date of Patent: January 17, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jagadish B. Kotra, Karthik Rao, Joseph L. Greathouse
  • Patent number: 11556162
    Abstract: A processor utilizes instruction based sampling to generate sampling data sampled on a per instruction basis during execution of an instruction. The sampling data indicates what processor hardware was used due to the execution of the instruction. Software receives the sampling data and generates an estimate of energy used by the instruction based on the sampling data. The sampling data may include microarchitectural events and the energy estimate utilizes a base energy amount corresponding to the instruction executed along with energy amounts corresponding to the microarchitectural events in the sampling data. The sampling data may include switching events associated with hardware blocks that switched due to execution of the instruction and the energy estimate for the instruction is based on the switching events and capacitance estimates associated with the hardware blocks.
    Type: Grant
    Filed: March 16, 2018
    Date of Patent: January 17, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shijia Wei, Joseph L. Greathouse, John Kalamatianos
  • Publication number: 20220318056
    Abstract: A method for reducing power variations resulting from changes in processor workload includes communicating a power dip condition to a workload scheduler of a processor device in response to identifying the power dip condition. One or more target power workloads are assigned for execution at the processor device based at least in part on the power dip condition. Further, each of the one or more target power workloads is associated with a known power load.
    Type: Application
    Filed: March 30, 2021
    Publication date: October 6, 2022
    Inventors: Nicholas Penha MALAYA, Stephen KUSHNIR, William C. BRANTLEY, Joseph L. GREATHOUSE
  • Publication number: 20220269492
    Abstract: A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.
    Type: Application
    Filed: May 10, 2022
    Publication date: August 25, 2022
    Inventors: Gregory P. RODGERS, Joseph L. GREATHOUSE
  • Patent number: 11347486
    Abstract: A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: May 31, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gregory P. Rodgers, Joseph L. Greathouse
  • Patent number: 11137809
    Abstract: A plurality of thermal electric cooler (TEC) elements are formed in a TEC grid structure. Control logic dynamically varies a supply current supplied to each TEC element (or group of TEC elements) in the TEC grid based on changes in power density respectively associated with areas cooled by each of the TEC elements or group of TEC elements.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: October 5, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Karthik Rao, Wei Huang, Xudong An, Manish Arora, Joseph L. Greathouse
  • Publication number: 20210303284
    Abstract: A processing system includes a compiler that automatically identifies sequences of instructions of tileable source code that can be replaced with tensor operations. The compiler generates enhanced code that replaces the identified sequences of instructions with tensor operations that invoke a special-purpose hardware accelerator. By automatically replacing instructions with tensor operations that invoke the special-purpose hardware accelerator, the compiler makes the performance improvements achievable through the special-purpose hardware accelerator available to programmers using high-level programming languages.
    Type: Application
    Filed: March 27, 2020
    Publication date: September 30, 2021
    Inventors: Gregory P. RODGERS, Joseph L. GREATHOUSE
  • Patent number: 10936697
    Abstract: A method includes storing a first portion of a sparse triangular matrix in a local memory and launching a kernel for executing a set of workgroups. The first portion includes a plurality of row blocks, and each workgroup in the set of workgroups is associated with one of the plurality of row blocks. The method also includes, for each workgroup in the set of workgroups, solving the row block. The row block is solved by, for each row segment of a first subset of row segments in the row block, calculating a partial sum for the row segment based on one or more matrix elements in the row segment, and writing the partial sum to a remote memory of a first remote processing unit prior to terminating the kernel.
    Type: Grant
    Filed: July 24, 2018
    Date of Patent: March 2, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Khaled Hamidouche, Michael W. LeBeane, Nicholas P. Malaya, Joseph L. Greathouse
  • Publication number: 20210034256
    Abstract: A system including a stack of two or more layers of volatile memory, such as layers of a 3D stacked DRAM memory, places data in the stack based on a temperature or a refresh rate. When a threshold is exceeded, data are moved from a first region to a second region in the stack, the second region having one or both of a second temperature lower than a first temperature of the first region or a second refresh rate lower than a first refresh rate of the first region.
    Type: Application
    Filed: July 27, 2020
    Publication date: February 4, 2021
    Inventors: Jagadish B. KOTRA, Karthik RAO, Joseph L. GREATHOUSE
  • Patent number: 10725670
    Abstract: A system including a stack of two or more layers of volatile memory, such as layers of a 3D stacked DRAM memory, places data in the stack based on a temperature or a refresh rate. When a threshold is exceeded, data are moved from a first region to a second region in the stack, the second region having one or both of a second temperature lower than a first temperature of the first region or a second refresh rate lower than a first refresh rate of the first region.
    Type: Grant
    Filed: August 1, 2018
    Date of Patent: July 28, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Jagadish B. Kotra, Karthik Rao, Joseph L. Greathouse