Patents by Inventor John Kalamatianos

John Kalamatianos has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240126552
    Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.
    Type: Application
    Filed: December 21, 2023
    Publication date: April 18, 2024
    Inventors: JOHN KALAMATIANOS, MICHAEL T. CLARK, MARIUS EVERS, WILLIAM L. WALKER, PAUL MOYER, JAY FLEISCHMAN, JAGADISH B. KOTRA
  • Patent number: 11960404
    Abstract: Systems, apparatuses, and methods for efficiently processing memory requests are disclosed. A computing system includes at least one processing unit coupled to a memory. Circuitry in the processing unit determines a memory request becomes a long-latency request based on detecting a translation lookaside buffer (TLB) miss, a branch misprediction, a memory dependence misprediction, or a precise exception has occurred. The circuitry marks the memory request as a long-latency request such as storing an indication of a long-latency request in an instruction tag of the memory request. The circuitry uses weighted criteria for scheduling out-of-order issue and servicing of memory requests. However, the indication of a long-latency request is not combined with other criteria in a weighted sum. Rather, the indication of the long-latency request is a separate value. The circuitry prioritizes memory requests marked as long-latency requests over memory requests not marked as long-latency requests.
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: April 16, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jagadish B. Kotra, John Kalamatianos
  • Publication number: 20240111676
    Abstract: A disclosed computing device includes at least one prefetcher and a processing device communicatively coupled to the prefetcher. The processing device is configured to detect a throttling instruction that indicates a start of a throttling region. The computing device is further configured to prevent the prefetcher from being trained on one or more memory instructions included in the throttling region in response to the throttling instruction. Various other apparatuses, systems, and methods are also disclosed.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 4, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Marko Scrbak, Gabriel H. Loh, Akhil Arunkumar
  • Publication number: 20240111677
    Abstract: A method for performing prefetching operations is disclosed. The method includes storing a recorded access pattern indicating a set of accesses for a region; in response to an access within the region, fetching the recorded access pattern; and performing prefetching based on the access pattern.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 4, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Gabriel H. Loh, Marko Scrbak, Akhil Arunkumar, John Kalamatianos
  • Publication number: 20240111678
    Abstract: Systems and methods for pushed prefetching include: multiple core complexes, each core complex having multiple cores and multiple caches, the multiple caches configured in a memory hierarchy with multiple levels; an interconnect device coupling the core complexes to each other and coupling the core complexes to shared memory, the shared memory at a lower level of the memory hierarchy than the multiple caches; and a push-based prefetcher having logic to: monitor memory traffic between caches of a first level of the memory hierarchy and the shared memory; and based on the monitoring, initiate a prefetch of data to a cache of the first level of the memory hierarchy.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 4, 2024
    Inventors: JAGADISH B. KOTRA, JOHN KALAMATIANOS, PAUL MOYER, GABRIEL H. LOH
  • Publication number: 20240111420
    Abstract: Methods, devices, and systems for retrieving information based on cache miss prediction. It is predicted, based on a history of cache misses at a private cache, that a cache lookup for the information will miss a shared victim cache. A speculative memory request is enabled based on the prediction that the cache lookup for the information will miss the shared victim cache. The information is fetched based on the enabled speculative memory request.
    Type: Application
    Filed: September 29, 2022
    Publication date: April 4, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Jagadish B. Kotra, John Kalamatianos
  • Patent number: 11921634
    Abstract: Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host is disclosed. In an implementation, a memory controller identifies a first write instruction to write first data to a first memory location, where the first write instruction is not a processing-in-memory (PIM) instruction. The memory controller then writes the first data to a first PIM register. Opportunistically, the memory controller moves the first data from the first PIM register to the first memory location. In another implementation, a memory controller identifies a first memory location associated with a first read instruction, where the first read instruction is not a processing-in-memory (PIM) instruction. The memory controller identifies that a PIM register is associated with the first memory location. The memory controller then reads, in response to the first read instruction, first data from the PIM register.
    Type: Grant
    Filed: December 28, 2021
    Date of Patent: March 5, 2024
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Jagadish B. Kotra, John Kalamatianos, Yasuko Eckert, Yonghae Kim
  • Patent number: 11874739
    Abstract: A memory module includes one or more programmable ECC engines that may be programed by a host processing element with a particular ECC implementation. As used herein, the term “ECC implementation” refers to ECC functionality for performing error detection and subsequent processing, for example using the results of the error detection to perform error correction and to encode corrupted data that cannot be corrected, etc. The approach allows an SoC designer or company to program and reprogram ECC engines in memory modules in a secure manner without having to disclose the particular ECC implementations used by the ECC engines to memory vendors or third parties.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: January 16, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sudhanva Gurumurthi, Vilas Sridharan, Shaizeen Aga, Nuwan Jayasena, Michael Ignatowski, Shrikanth Ganapathy, John Kalamatianos
  • Patent number: 11868777
    Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: January 9, 2024
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Michael T. Clark, Marius Evers, William L. Walker, Paul Moyer, Jay Fleischman, Jagadish B. Kotra
  • Publication number: 20240004584
    Abstract: In accordance with described techniques for DRAM row management for processing in memory, a plurality of instructions are obtained for execution by a processing in memory component embedded in a dynamic random access memory. An instruction is identified that last accesses a row of the dynamic random access memory, and a subsequent instruction is identified that first accesses an additional row of the dynamic random access memory. A first command is issued to close the row and a second command is issued to open the additional row after the row is last accessed by the instruction.
    Type: Application
    Filed: June 30, 2022
    Publication date: January 4, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Niti Madan, Yasuko Eckert, Varun Agrawal, John Kalamatianos
  • Publication number: 20230409336
    Abstract: In accordance with described techniques for VLIW Dynamic Communication, an instruction that causes dynamic communication of data to at least one processing element of a very long instruction word (VLIW) machine is dispatched to a plurality of processing elements of the VLIW machine. A first count of data communications issued by the plurality of processing elements and a second count of data communications served by the plurality of processing elements are maintained. At least one additional instruction is determined for dispatch to the plurality of processing elements of the VLIW machine based on the first count and the second count. For example, an instruction that is independent of the instruction is determined for dispatch while the first count and the second count are unequal, and an instruction that is dependent on the instruction is determined for dispatch based on the first count and the second count being equal.
    Type: Application
    Filed: June 17, 2022
    Publication date: December 21, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sriseshan Srikanth, Karthik Ramu Sangaiah, Anthony Thomas Gutierrez, Vedula Venkata Srikant Bharadwaj, John Kalamatianos
  • Patent number: 11847062
    Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.
    Type: Grant
    Filed: December 16, 2021
    Date of Patent: December 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
  • Patent number: 11847061
    Abstract: A technical solution to the technical problem of how to support memory-centric operations on cached data uses a novel memory-centric memory operation that invokes write back functionality on cache controllers and memory controllers. The write back functionality enforces selective flushing of dirty, i.e., modified, cached data that is needed for memory-centric memory operations from caches to the completion level of the memory-centric memory operations, and updates the coherence state appropriately at each cache level. The technical solution ensures that commands to implement the selective cache flushing are ordered before the memory-centric memory operation at the completion level of the memory-centric memory operation.
    Type: Grant
    Filed: July 26, 2021
    Date of Patent: December 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, John Kalamatianos
  • Patent number: 11842199
    Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: December 12, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Greg Sadowski, John Kalamatianos, Shomit N. Das
  • Publication number: 20230376420
    Abstract: A method includes recording a first set of consecutive memory access deltas, where each of the consecutive memory access deltas represents a difference between two memory addresses accessed by an application, updating values in a prefetch training table based on the first set of memory access deltas, and predicting one or more memory addresses for prefetching responsive to a second set of consecutive memory access deltas and based on values in the prefetch training table.
    Type: Application
    Filed: April 19, 2023
    Publication date: November 23, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Susumu Mashimo, John Kalamatianos
  • Publication number: 20230315536
    Abstract: To reduce inter- and intra-instruction register bank access conflicts in parallel processors, a processing system includes a remapping circuit to dynamically remap virtual registers to physical registers of a parallel processor during execution of a wavefront. The remapping circuit remaps virtual registers to physical registers at a register mapping table that holds the current set of virtual to physical register mappings based on a list of available registers indicating which physical registers are available for a new mapping and a register mapping policy.
    Type: Application
    Filed: March 30, 2022
    Publication date: October 5, 2023
    Inventors: Mark Wyse, Bradford Michael Beckmann, John Kalamatianos, Anthony Thomas Gutierrez
  • Patent number: 11736119
    Abstract: A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.
    Type: Grant
    Filed: April 18, 2022
    Date of Patent: August 22, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, Nuwan Jayasena, John Kalamatianos
  • Patent number: 11726917
    Abstract: A method includes recording a first set of consecutive memory access deltas, where each of the consecutive memory access deltas represents a difference between two memory addresses accessed by an application, updating values in a prefetch training table based on the first set of memory access deltas, and predicting one or more memory addresses for prefetching responsive to a second set of consecutive memory access deltas and based on values in the prefetch training table.
    Type: Grant
    Filed: July 13, 2020
    Date of Patent: August 15, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Susumu Mashimo, John Kalamatianos
  • Patent number: 11726783
    Abstract: A processor includes a micro-operation cache having a plurality of micro-operation cache entries for storing micro-operations decoded from instruction groups and a micro-operation filter having a plurality of micro-operation filter table entries for storing identifiers of instruction groups for which the micro-operations are predicted dead on fill if stored in the micro-operation cache. The micro-operation filter receives an identifier for an instruction group. The micro-operation filter then prevents a copy of the micro-operations from the first instruction group from being stored in the micro-operation cache when a micro-operation filter table entry includes an identifier that matches the first identifier.
    Type: Grant
    Filed: April 23, 2020
    Date of Patent: August 15, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Marko Scrbak, Mahzabeen Islam, John Kalamatianos, Jagadish B. Kotra
  • Patent number: 11726868
    Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: August 15, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi