Patents by Inventor John Kalamatianos

John Kalamatianos has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230244492
    Abstract: Preserving memory ordering between offloaded instructions and non-offloaded instructions is disclosed. An offload instruction for an operation to be offloaded is processed and a lock is placed on a memory address associated with the offload instruction. In response to completing a cache operation targeting the memory address, the lock on the memory address is removed. For multithreaded applications, upon determining that a plurality of processor cores have each begun executing a sequence of offload instructions, the execution of non-offload instructions that are younger than any of the offload instructions is restricted. In response to determining that each processor core has completed executing its sequence of offload instructions, the restriction is removed. The remote device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.
    Type: Application
    Filed: April 11, 2023
    Publication date: August 3, 2023
    Inventors: Jagadish B. Kotra, John Kalamatianos
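
As a rough illustration of the address-lock mechanism in the abstract above, the following Python sketch (not the patented implementation; the class and method names are invented here) models an offload instruction locking its target memory address until the corresponding cache operation completes, which is what holds back younger non-offload accesses.

```python
# Minimal sketch of the address-lock idea: an offload instruction locks its
# target address, and the lock is released only when the cache operation
# targeting that address completes.

class OffloadOrderingModel:
    def __init__(self):
        self.locked_addresses = set()

    def issue_offload(self, address):
        """Processing an offload instruction places a lock on its memory address."""
        self.locked_addresses.add(address)

    def cache_operation_complete(self, address):
        """Completion of the cache operation targeting the address removes the lock."""
        self.locked_addresses.discard(address)

    def non_offload_access_allowed(self, address):
        """A younger non-offload access to a locked address must wait."""
        return address not in self.locked_addresses


model = OffloadOrderingModel()
model.issue_offload(0x1000)
assert not model.non_offload_access_allowed(0x1000)  # younger access is held back
model.cache_operation_complete(0x1000)
assert model.non_offload_access_allowed(0x1000)      # ordering hazard cleared
```
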
  • Patent number: 11714652
    Abstract: A processing device includes a zero detection circuit to determine that an operand of a first instruction is zero and instruction conversion logic coupled with the zero detection circuit to, in response to the zero detection circuit determining that the operand is zero, convert the first instruction to a register move instruction executable by the processing device.
    Type: Grant
    Filed: July 23, 2021
    Date of Patent: August 1, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Ganesh Dasika
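
The conversion described in the entry above lends itself to a small sketch. The Python below is an illustration only; representing instructions as dictionaries is an assumption, not the patented circuit. It shows how detecting a zero operand lets an arithmetic instruction be rewritten as a register move.

```python
# Illustrative sketch: if zero detection finds that an operand is zero, the
# instruction is converted into a simpler register-move instruction.

def convert_if_zero_operand(instr):
    """instr is a dict like {'op': 'add', 'dst': 'r1', 'src1': 'r2', 'src2': 0}."""
    if instr['op'] == 'add' and instr['src2'] == 0:
        # x + 0 == x, so the add can be replaced by a register move.
        return {'op': 'mov', 'dst': instr['dst'], 'src': instr['src1']}
    if instr['op'] == 'mul' and 0 in (instr['src1'], instr['src2']):
        # x * 0 == 0, so the multiply collapses to moving a zero value.
        return {'op': 'mov', 'dst': instr['dst'], 'src': 0}
    return instr  # no zero operand detected; leave the instruction unchanged


print(convert_if_zero_operand({'op': 'add', 'dst': 'r1', 'src1': 'r2', 'src2': 0}))
# {'op': 'mov', 'dst': 'r1', 'src': 'r2'}
```
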
  • Publication number: 20230205706
    Abstract: An approach is provided for managing PIM commands and non-PIM commands at a memory controller. A memory controller enqueues PIM commands and non-PIM commands and selects the next command to process based upon various selection criteria. The memory controller maintains and uses a page table to properly configure memory elements, such as banks in a memory module, for the next memory command, whether a PIM command or a non-PIM command. The page table tracks the status of memory elements as of the most recent memory command that was issued. The page table includes an "All Banks" entry that indicates the status of the banks after processing the most recent PIM command. For example, the All Banks entry indicates whether all the banks have a row open and, if so, specifies the open row for all the banks.
    Type: Application
    Filed: December 23, 2021
    Publication date: June 29, 2023
    Inventors: Niti Madan, John Kalamatianos
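
A data-structure-level Python sketch of the page table described above; the ALL_BANKS key and the method names are assumptions made for illustration, not the patented design.

```python
# Rough sketch of a memory-controller page table that tracks the open row per
# bank, plus an "All Banks" entry summarizing bank state after the most recent
# PIM command, which touches every bank.

ALL_BANKS = "all_banks"

class PageTable:
    def __init__(self, num_banks):
        self.open_row = {bank: None for bank in range(num_banks)}
        self.open_row[ALL_BANKS] = None  # meaningful after an all-bank PIM command

    def record_non_pim(self, bank, row):
        self.open_row[bank] = row
        self.open_row[ALL_BANKS] = None  # banks no longer uniformly open

    def record_pim(self, row):
        # A PIM command opens the same row in every bank.
        for bank in self.open_row:
            self.open_row[bank] = row
        self.open_row[ALL_BANKS] = row

    def row_hit(self, bank, row):
        """Used to decide whether the next command needs a precharge/activate."""
        return self.open_row[bank] == row
```
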
  • Publication number: 20230205872
    Abstract: A method includes receiving an indication that a number of activations of a memory structure exceeds a threshold number of activations for a time period, and in response to the indication, throttling instruction execution for a thread issuing the activations.
    Type: Application
    Filed: December 23, 2021
    Publication date: June 29, 2023
    Inventors: Jagadish B. Kotra, Onur Kayiran, John Kalamatianos, Alok Garg
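
The throttling decision above reduces to a counter check. In this Python sketch the per-thread counter granularity, the window reset, and the throttle action are all assumptions made for illustration.

```python
from collections import defaultdict

class ActivationThrottle:
    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = defaultdict(int)         # activations per thread in this window
        self.throttled = set()

    def record_activation(self, thread_id):
        self.counts[thread_id] += 1
        if self.counts[thread_id] > self.threshold:
            self.throttled.add(thread_id)      # e.g., reduce this thread's dispatch rate

    def end_of_time_period(self):
        """Called when the monitored time period elapses."""
        self.counts.clear()
        self.throttled.clear()

    def is_throttled(self, thread_id):
        return thread_id in self.throttled
```
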
  • Publication number: 20230205693
    Abstract: Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host is disclosed. In an implementation, a memory controller identifies a first write instruction to write first data to a first memory location, where the first write instruction is not a processing-in-memory (PIM) instruction. The memory controller then writes the first data to a first PIM register. Opportunistically, the memory controller moves the first data from the first PIM register to the first memory location. In another implementation, a memory controller identifies a first memory location associated with a first read instruction, where the first read instruction is not a processing-in-memory (PIM) instruction. The memory controller identifies that a PIM register is associated with the first memory location. The memory controller then reads, in response to the first read instruction, first data from the PIM register.
    Type: Application
    Filed: December 28, 2021
    Publication date: June 29, 2023
    Inventors: Jagadish B. Kotra, John Kalamatianos, Yasuko Eckert, Yonghae Kim
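
A conceptual Python sketch of the bookkeeping described above; the register allocation policy and all names are assumptions, not the patented memory-controller logic.

```python
# A non-PIM write is staged in a free PIM register and drained to memory
# opportunistically; a non-PIM read whose address maps to a PIM register is
# satisfied directly from that register.

class PimRegisterFile:
    def __init__(self, num_regs):
        self.regs = [None] * num_regs          # staged data, one slot per PIM register
        self.addr_of = {}                      # memory address -> register index
        self.memory = {}                       # stand-in for DRAM contents

    def handle_write(self, address, data):
        idx = self.addr_of.get(address)
        if idx is not None:                    # address already staged: update in place
            self.regs[idx] = data
            return
        for i, slot in enumerate(self.regs):
            if slot is None:                   # free PIM register available
                self.regs[i] = data
                self.addr_of[address] = i
                return
        self.memory[address] = data            # no free register: normal write path

    def drain_opportunistically(self, address):
        """Called when the bank is idle: move staged data to its memory location."""
        idx = self.addr_of.pop(address, None)
        if idx is not None:
            self.memory[address] = self.regs[idx]
            self.regs[idx] = None

    def handle_read(self, address):
        idx = self.addr_of.get(address)
        if idx is not None:                    # the read hits a PIM register
            return self.regs[idx]
        return self.memory.get(address)        # normal memory read
```
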
  • Publication number: 20230195643
    Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.
    Type: Application
    Filed: December 16, 2021
    Publication date: June 22, 2023
    Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
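
A minimal Python sketch of the eviction-triggered fetch described above, with the last-level cache and system memory modeled as plain dictionaries purely for illustration.

```python
# When a clean block leaves an intermediate cache level, its address is used to
# bring the block into the last-level cache from system memory, so a later
# reuse can hit in the LLC.

def on_intermediate_eviction(block, llc, system_memory):
    """block: dict with 'address' and 'dirty'; llc and system_memory are dicts."""
    if block['dirty']:
        return  # dirty blocks follow the normal write-back path
    address = block['address']
    if address not in llc:
        llc[address] = system_memory[address]  # fetch the clean block into the LLC


llc, memory = {}, {0x40: b'payload'}
on_intermediate_eviction({'address': 0x40, 'dirty': False}, llc, memory)
assert 0x40 in llc
```
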
  • Patent number: 11656945
    Abstract: Methods and processing devices are provided for error protection to support instruction replay when executing idempotent instructions at a processing-in-memory (PIM) device. The processing apparatus includes a PIM device configured to execute an idempotent instruction. The processing apparatus also includes a processor, in communication with the PIM device, configured to issue the idempotent instruction to the PIM device for execution and to reissue the idempotent instruction when either execution of the idempotent instruction at the PIM device results in an error or a predetermined latency period expires from when the idempotent instruction was issued.
    Type: Grant
    Filed: December 24, 2020
    Date of Patent: May 23, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Nuwan Jayasena, Sudhanva Gurumurthi, Shaizeen Aga, Shrikanth Ganapathy
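
The replay policy above can be sketched as a small retry loop. In this Python illustration the return convention of pim_execute and the retry limit are assumptions, not part of the patent.

```python
def issue_with_replay(pim_execute, instruction, latency_limit, max_attempts=3):
    """pim_execute(instruction) is assumed to return (status, cycles), where
    status is 'ok' or 'error' and cycles is the observed completion latency."""
    for _ in range(max_attempts):
        status, cycles = pim_execute(instruction)
        if status == 'ok' and cycles <= latency_limit:
            return True
        # Error, or the latency period expired without a clean completion:
        # reissuing is safe because the instruction is idempotent, so executing
        # it again cannot change the final memory state.
    return False
```
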
  • Patent number: 11658681
    Abstract: Various energy-efficient data encoding schemes and computing devices are disclosed. In one aspect, a method of transmitting data from a transmitter to a receiver connected by plural wires is provided. The method includes sending from the transmitter, on at least one but not all of the wires, a first waveform that has first and second signal transitions. The receiver receives the first waveform and measures a first duration between the first and second signal transitions using a locally generated clock signal not received from the transmitter. The first duration is indicative of a first particular data value.
    Type: Grant
    Filed: June 15, 2021
    Date of Patent: May 23, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Greg Sadowski, John Kalamatianos
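
A toy Python sketch of the receiver side of the scheme described above; the duration-to-symbol table and the per-tick sampling model are invented here for illustration only.

```python
# The receiver counts local clock ticks between the first and second signal
# transitions on a wire and maps the measured duration to a data value.

DURATION_TO_SYMBOL = {2: 0b00, 4: 0b01, 6: 0b10, 8: 0b11}  # ticks -> 2-bit value

def decode_waveform(samples):
    """samples: per-tick wire levels sampled with the receiver's local clock."""
    transitions = [i for i in range(1, len(samples)) if samples[i] != samples[i - 1]]
    first, second = transitions[0], transitions[1]
    duration = second - first
    return DURATION_TO_SYMBOL[duration]


# A low-to-high transition at tick 1 and high-to-low at tick 5: duration 4 -> 0b01.
print(decode_waveform([0, 1, 1, 1, 1, 0, 0, 0]))
```
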
  • Patent number: 11645073
    Abstract: Address-based filtering for load/store speculation includes maintaining a filtering table including table entries associated with ranges of addresses; in response to receiving an ordering check triggering transaction, querying the filtering table using a target address of the ordering check triggering transaction to determine whether an instruction dependent upon the ordering check triggering transaction has previously generated a physical address; and, in response to determining that the filtering table lacks an indication that the instruction dependent upon the ordering check triggering transaction has previously generated a physical address, bypassing a lookup operation in an ordering violation memory structure to determine whether the instruction dependent upon the ordering check triggering transaction is currently in-flight.
    Type: Grant
    Filed: April 23, 2021
    Date of Patent: May 9, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Krishnan V. Ramani, Susumu Mashimo
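
A high-level Python sketch of the filtering table described above; the 4 KB region granularity and the set-based representation are assumptions, not the patented structure.

```python
# A table keyed by address region records whether any dependent instruction has
# already generated a physical address for that region; if not, the lookup in
# the ordering-violation structure is bypassed.

REGION_SHIFT = 12  # assume 4 KB regions as the filter granularity

class AddressFilter:
    def __init__(self):
        self.regions_with_dependents = set()

    def note_physical_address(self, address):
        """Record that a dependent instruction generated a physical address."""
        self.regions_with_dependents.add(address >> REGION_SHIFT)

    def needs_ordering_check(self, target_address):
        """Return False to bypass the ordering-violation structure lookup."""
        return (target_address >> REGION_SHIFT) in self.regions_with_dependents
```
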
  • Patent number: 11625249
    Abstract: Preserving memory ordering between offloaded instructions and non-offloaded instructions is disclosed. An offload instruction for an operation to be offloaded is processed and a lock is placed on a memory address associated with the offload instruction. In response to completing a cache operation targeting the memory address, the lock on the memory address is removed. For multithreaded applications, upon determining that a plurality of processor cores have each begun executing a sequence of offload instructions, the execution of non-offload instructions that are younger than any of the offload instructions is restricted. In response to determining that each processor core has completed executing its sequence of offload instructions, the restriction is removed. The remote device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.
    Type: Grant
    Filed: December 29, 2020
    Date of Patent: April 11, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jagadish B. Kotra, John Kalamatianos
  • Publication number: 20230077933
    Abstract: A processor for supporting PIM (Processing-in-Memory) execution in a multiprocessing environment includes logic configured to receive a request to initiate an offload of a number of PIM instructions to a PIM device. The request is issued by a first thread of a processor. The logic is also configured to reserve, based on information in the request, resources of the PIM device for execution of the PIM instructions.
    Type: Application
    Filed: September 14, 2021
    Publication date: March 16, 2023
    Inventors: John Kalamatianos, Nuwan Jayasena
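
A very small Python sketch of the reservation step described above, assuming, purely for illustration, that the contended PIM resource is a pool of registers.

```python
class PimResourceManager:
    def __init__(self, total_registers):
        self.free_registers = total_registers
        self.reservations = {}                 # thread id -> registers reserved

    def request_offload(self, thread_id, registers_needed):
        """Reserve resources based on information carried in the offload request."""
        if registers_needed <= self.free_registers:
            self.free_registers -= registers_needed
            self.reservations[thread_id] = registers_needed
            return True                        # offload of the PIM instructions may begin
        return False                           # insufficient resources; the thread must retry

    def release(self, thread_id):
        self.free_registers += self.reservations.pop(thread_id, 0)
```
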
  • Patent number: 11586441
    Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: February 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Jagadish B. Kotra
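
A simplified Python sketch of the virtualization idea described above, with the micro-operation cache as a FIFO-evicted map and the conventional cache subsystem as a plain dictionary; capacities and lookup keys are illustrative only.

```python
from collections import OrderedDict

class VirtualizedUopCache:
    def __init__(self, capacity, l2_cache):
        self.capacity = capacity
        self.entries = OrderedDict()   # instruction address -> decoded micro-ops
        self.l2_cache = l2_cache       # backing store in the conventional cache subsystem

    def insert(self, inst_addr, uops):
        if len(self.entries) >= self.capacity:
            victim_addr, victim_uops = self.entries.popitem(last=False)
            self.l2_cache[victim_addr] = victim_uops  # spill instead of discard
        self.entries[inst_addr] = uops

    def lookup(self, inst_addr):
        if inst_addr in self.entries:
            return self.entries[inst_addr]
        if inst_addr in self.l2_cache:            # refill from the cache subsystem
            uops = self.l2_cache.pop(inst_addr)
            self.insert(inst_addr, uops)
            return uops
        return None                               # miss: the instruction must be decoded again
```
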
  • Patent number: 11586555
    Abstract: Systems, apparatuses, and methods for implementing flexible dictionary sharing techniques for caches are disclosed. A set-associative cache includes a dictionary for each data array set. When a cache line is to be allocated in the cache, a cache controller determines to which set a base index of the cache line address maps. Then, a selector unit determines which dictionary of a group of dictionaries stored by those sets neighboring this set would achieve the most compression for the cache line. This dictionary is then selected to compress the cache line. An offset is added to the base index of the cache line to generate a full index in order to map the cache line to the set corresponding to this chosen dictionary. The compressed cache line is stored in this set with the chosen dictionary, and the offset is stored in the corresponding tag array entry.
    Type: Grant
    Filed: April 15, 2021
    Date of Patent: February 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, John Kalamatianos
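
A Python sketch of the dictionary-selection step described above; the compression metric, the set of neighbor offsets, and the modulo indexing are assumptions made for illustration, and the compression itself is stubbed out.

```python
def compressed_size(line, dictionary):
    # Stand-in metric: bytes of the line not covered by dictionary entries.
    return sum(1 for byte in line if byte not in dictionary)

def choose_dictionary(line, base_index, dictionaries, neighbor_offsets=(0, 1, 2, 3)):
    """dictionaries: one per set; returns (offset, full_index) for the best choice."""
    best = None
    for offset in neighbor_offsets:
        full_index = (base_index + offset) % len(dictionaries)
        size = compressed_size(line, dictionaries[full_index])
        if best is None or size < best[0]:
            best = (size, offset, full_index)
    _, offset, full_index = best
    return offset, full_index  # the offset is stored in the tag entry alongside the line
```
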
  • Publication number: 20230030679
    Abstract: A technical solution to the technical problem of how to improve dispatch throughput for memory-centric commands bypasses address checking for certain memory-centric commands. Implementations include using an Address Check Bypass (ACB) bit to specify whether address checking should be performed for a memory-centric command. ACB bit values are specified in memory-centric instructions, automatically specified by a process, such as a compiler, or by host hardware, such as dispatch hardware, based upon whether a memory-centric command explicitly references memory. Implementations include bypassing, i.e., not performing, address checking for memory-centric commands that do not access memory and also for memory-centric commands that do access memory, but that have the same physical address as a prior memory-centric command that explicitly accessed memory to ensure that any data in caches was flushed to memory and/or invalidated.
    Type: Application
    Filed: July 27, 2021
    Publication date: February 2, 2023
    Inventors: Jagadish B. Kotra, John Kalamatianos, Gagandeep Panwar
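
A minimal Python sketch of the decision described above, split into setting the ACB bit and using it at dispatch; the command fields and the bookkeeping of previously checked addresses are assumptions for illustration.

```python
def set_acb_bit(command, checked_addresses):
    """Decide the ACB bit for a memory-centric command (compiler or host-hardware step)."""
    if not command['references_memory']:
        return True                             # nothing to check for this command
    if command['phys_addr'] in checked_addresses:
        return True                             # a prior command already covered this address
    return False

def dispatch(command, checked_addresses, run_address_check):
    command['acb'] = set_acb_bit(command, checked_addresses)
    if command['acb']:
        return 'dispatched without address check'
    run_address_check(command)                  # flush dirty data / invalidate cached copies
    checked_addresses.add(command['phys_addr'])
    return 'dispatched after address check'
```
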
  • Publication number: 20230021492
    Abstract: A technical solution to the technical problem of how to support memory-centric operations on cached data uses a novel memory-centric memory operation that invokes write back functionality on cache controllers and memory controllers. The write back functionality enforces selective flushing of dirty, i.e., modified, cached data that is needed for memory-centric memory operations from caches to the completion level of the memory-centric memory operations, and updates the coherence state appropriately at each cache level. The technical solution ensures that commands to implement the selective cache flushing are ordered before the memory-centric memory operation at the completion level of the memory-centric memory operation.
    Type: Application
    Filed: July 26, 2021
    Publication date: January 26, 2023
    Inventors: Shaizeen Aga, Nuwan Jayasena, John Kalamatianos
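
A schematic Python sketch of the selective flush described above, with cache levels modeled as dictionaries; the command names and the returned ordering list are illustrative assumptions.

```python
def flush_for_memory_centric_op(mc_address, cache_levels, completion_level):
    """cache_levels: list ordered from the level closest to the core downward;
    each level is {address: {'data': ..., 'dirty': bool}}. completion_level is
    the memory at which the memory-centric operation completes."""
    ordered_commands = []
    for level in cache_levels:
        line = level.get(mc_address)
        if line is not None and line['dirty']:
            completion_level[mc_address] = line['data']   # selective write back
            line['dirty'] = False                         # coherence state updated at this level
            ordered_commands.append(('writeback', mc_address))
    # The memory-centric operation is ordered after the selective flushes.
    ordered_commands.append(('memory_centric_op', mc_address))
    return ordered_commands
```
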
  • Publication number: 20230024089
    Abstract: A processing device includes a zero detection circuit to determine that an operand of a first instruction is zero and instruction conversion logic coupled with the zero detection circuit to, in response to the zero detection circuit determining that the operand is zero, convert the first instruction to a register move instruction executable by the processing device.
    Type: Application
    Filed: July 23, 2021
    Publication date: January 26, 2023
    Inventors: John Kalamatianos, Ganesh Dasika
  • Patent number: 11556162
    Abstract: A processor utilizes instruction based sampling to generate sampling data sampled on a per instruction basis during execution of an instruction. The sampling data indicates what processor hardware was used due to the execution of the instruction. Software receives the sampling data and generates an estimate of energy used by the instruction based on the sampling data. The sampling data may include microarchitectural events and the energy estimate utilizes a base energy amount corresponding to the instruction executed along with energy amounts corresponding to the microarchitectural events in the sampling data. The sampling data may include switching events associated with hardware blocks that switched due to execution of the instruction and the energy estimate for the instruction is based on the switching events and capacitance estimates associated with the hardware blocks.
    Type: Grant
    Filed: March 16, 2018
    Date of Patent: January 17, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shijia Wei, Joseph L. Greathouse, John Kalamatianos
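
The energy arithmetic described above can be shown with a worked Python sketch; every number here (base energies, event energies, capacitances, supply voltage) is made up for illustration.

```python
BASE_ENERGY_PJ = {'load': 12.0, 'fmul': 8.5}            # per-instruction base cost
EVENT_ENERGY_PJ = {'l1_miss': 25.0, 'tlb_miss': 40.0}   # per microarchitectural event
CAPACITANCE_PF = {'alu': 1.2, 'lsu': 2.0}               # per hardware block
SUPPLY_VOLTAGE = 0.9                                    # volts (assumed)

def estimate_energy(sample):
    """sample: {'opcode': ..., 'events': [...], 'switching': {block: toggle_count}}."""
    energy = BASE_ENERGY_PJ[sample['opcode']]
    energy += sum(EVENT_ENERGY_PJ[event] for event in sample['events'])
    # Switching component: toggles * C * V^2 for each block that switched.
    for block, toggles in sample['switching'].items():
        energy += toggles * CAPACITANCE_PF[block] * SUPPLY_VOLTAGE ** 2
    return energy

sample = {'opcode': 'load', 'events': ['l1_miss'], 'switching': {'lsu': 3}}
print(estimate_energy(sample))  # about 41.9 pJ: 12 (base) + 25 (L1 miss) + 3*2.0*0.9**2
```
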
  • Patent number: 11550588
    Abstract: A branch predictor of a processor includes one or more prediction structures, including a predicted branch address and predicted branch direction, that identify predicted branches. To reduce power consumption, the branch predictor selects one or more of the prediction structures that are not expected to provide useful branch prediction information and filters the selected structures such that the filtered structures are not used for branch prediction. The branch predictor thereby reduces the amount of power used for branch prediction without substantially reducing the accuracy of the predicted branches.
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: January 10, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Adithya Yalavarti, Varun Agrawal, Subhankar Pal, Vinesh Srinivasan
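
A simplified Python sketch of the filtering decision described above; tracking usefulness with a small saturating counter is an assumption made for illustration, not the patented selection logic.

```python
class PredictorFilter:
    def __init__(self, structures, usefulness_threshold=2):
        # Start every structure at the threshold so all are consulted initially.
        self.usefulness = {name: usefulness_threshold for name in structures}
        self.threshold = usefulness_threshold

    def structures_to_consult(self):
        """Structures below the threshold are filtered out (not accessed), saving power."""
        return [s for s, score in self.usefulness.items() if score >= self.threshold]

    def update(self, structure, provided_useful_prediction):
        if provided_useful_prediction:
            self.usefulness[structure] = min(self.usefulness[structure] + 1, 7)
        else:
            self.usefulness[structure] = max(self.usefulness[structure] - 1, 0)
```
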
  • Patent number: 11513802
    Abstract: An electronic device includes a processor having a micro-operation queue, multiple scheduler entries, and scheduler compression logic. When a pair of micro-operations in the micro-operation queue is compressible in accordance with one or more compressibility rules, the scheduler compression logic acquires the pair of micro-operations from the micro-operation queue and stores information from both micro-operations of the pair of micro-operations into different portions in a single scheduler entry. In this way, the scheduler compression logic compresses the pair of micro-operations into the single scheduler entry.
    Type: Grant
    Filed: September 27, 2020
    Date of Patent: November 29, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael W. Boyer, John Kalamatianos, Pritam Majumder
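
A condensed Python sketch of the compression step described above; the single compressibility rule used here is an invented stand-in for the patented rules.

```python
def compressible(uop_a, uop_b):
    # Illustrative rule: the second micro-op consumes only the first's result,
    # so the two can share one scheduler entry's fields.
    return uop_b['srcs'] == [uop_a['dst']] and len(uop_a['srcs']) <= 2

def fill_scheduler(uop_queue, scheduler):
    i = 0
    while i < len(uop_queue):
        if i + 1 < len(uop_queue) and compressible(uop_queue[i], uop_queue[i + 1]):
            scheduler.append({'slot0': uop_queue[i], 'slot1': uop_queue[i + 1]})
            i += 2                       # one entry now holds both micro-ops
        else:
            scheduler.append({'slot0': uop_queue[i], 'slot1': None})
            i += 1
```
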
  • Patent number: 11513801
    Abstract: An electronic device is described that handles control transfer instructions (CTIs) when executing instructions in program code. The electronic device has a processor that includes a branch prediction functional block and a sequential fetch logic functional block. The sequential fetch logic functional block determines, based on a record associated with a CTI, that a specified number of fetch groups of instructions that were previously determined to include no CTIs are to be fetched for execution in sequence following the CTI. When each of the specified number of fetch groups is fetched and prepared for execution, the sequential fetch logic prevents corresponding accesses of the branch prediction functional block for acquiring branch prediction information for instructions in that fetch group.
    Type: Grant
    Filed: September 10, 2018
    Date of Patent: November 29, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Adithya Yalavarti, John Kalamatianos, Matthew R. Poremba
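
An outline Python sketch of the suppression decision described above; the record format and field names are assumptions made for illustration.

```python
# A record associated with a CTI stores how many of the following fetch groups
# were previously found to contain no CTIs; branch-predictor accesses are
# suppressed while those groups are fetched.

def should_access_branch_predictor(groups_fetched_since_cti, cti_record):
    """cti_record['cti_free_groups'] is the number of fetch groups after the
    CTI that are known to contain no control-transfer instructions."""
    return groups_fetched_since_cti > cti_record['cti_free_groups']


record = {'cti_free_groups': 3}
for n in range(1, 6):
    print(n, should_access_branch_predictor(n, record))
# Groups 1-3 skip the branch predictor; normal prediction resumes at group 4.
```
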