Patents by Inventor Jagadish B. Kotra
Jagadish B. Kotra has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12373207Abstract: Systems, apparatuses, and methods for compacting multiple groups of micro-operations into individual cache lines of a micro-operation cache are disclosed. A processor includes at least a decode unit and a micro-operation cache. When a new group of micro-operations is decoded and ready to be written to the micro-operation cache, the micro-operation cache determines which set is targeted by the new group of micro-operations. If there is a way in this set that can store the new group without evicting any existing group already stored in the way, then the new group is stored into the way with the existing group(s) of micro-operations. Metadata is then updated to indicate that the new group of micro-operations has been written to the way. Additionally, the micro-operation cache manages eviction and replacement policy at the granularity of micro-operation groups rather than at the granularity of cache lines.Type: GrantFiled: May 19, 2021Date of Patent: July 29, 2025Assignee: Advanced Micro Devices, Inc.Inventors: Jagadish B. Kotra, John Kalamatianos
-
Patent number: 12339783Abstract: A memory request issue counter (MRIC) is maintained that is incremented for every memory access a central processing unit core makes. A region reuse distance table is also maintained that includes multiple entries each of which stores the region reuse distance for a corresponding region. When a memory access request for a physical address is received, a reuse distance for the physical address is calculated. This reuse distance is the difference between the current MRIC value and a previous MRIC value for the physical address. The previous MRIC value for the physical address is the MRIC value the MRIC had when a memory access request for the physical address was last received. A region reuse distance for a region that includes the physical address is generated based on the reuse distance for the physical address and used to manage the cache.Type: GrantFiled: December 27, 2022Date of Patent: June 24, 2025Assignee: Advanced Micro Devices, Inc.Inventors: John Kalamatianos, Jagadish B. Kotra, Asmita Pal
-
Patent number: 12287739Abstract: Address translation is performed to translate a virtual address targeted by a memory request (e.g., a load or memory request for data or an instruction) to a physical address. This translation is performed using an address translation buffer, e.g., a translation lookaside buffer (TLB). One or more actions are taken to reduce data access latencies for memory requests in the event of a TLB miss where the virtual address to physical address translation is not in the TLB. Examples of actions that are performed in various implementations in response to a TLB miss include bypassing level 1 (L1) and level 2 (L2) caches in the memory system, and speculatively sending the memory request to the L2 cache while checking whether the memory request is satisfied by the L1 cache.Type: GrantFiled: December 9, 2022Date of Patent: April 29, 2025Assignee: Advanced Micro Devices, Inc.Inventors: Jagadish B Kotra, John Kalamatianos
-
Publication number: 20250110878Abstract: Selectively bypassing cache directory lookups for processing-in-memory instructions is described. In one example, a system maintains information describing a status—clean or dirty—of a memory address, where a dirty status indicates that the memory address is modified in a cache and thus different than the memory address as represented in system memory. A processing-in-memory request involving the memory address is assigned a cache directory bypass bit based on the status of the memory address. The cache directory bypass bit for a processing-in-memory request controls whether a cache directory lookup is performed after the processing-in-memory request is issued by a processor core and before the processing-in-memory request is executed by a processing-in-memory component.Type: ApplicationFiled: September 29, 2023Publication date: April 3, 2025Applicant: Advanced Micro Devices, Inc.Inventors: Travis Henry Boraten, Jagadish B. Kotra, David Andrew Werner
-
Publication number: 20250110655Abstract: Efficient memory operation using a destructive read memory array is described. In accordance with the described techniques, a system may include a memory configured to store data of a first logic state in a ferroelectric capacitor when an electric polarization of the ferroelectric capacitor is in a first direction. A system may include a controller configured to erase the data from the memory by commanding the electric polarization of the ferroelectric capacitor in a second direction, opposite of the first direction and skipping a subsequent write operation of a null value to the memory.Type: ApplicationFiled: September 28, 2023Publication date: April 3, 2025Applicant: Advanced Micro Devices, Inc.Inventors: Jagadish B. Kotra, Divya Madapusi Srinivas Prasad
-
Publication number: 20250110886Abstract: Speculative cache invalidation techniques for processing-in-memory instructions are described. In one example, a system includes a cache system including a plurality of cache levels and a cache coherence controller. The cache coherence controller is configured to perform a cache directory lookup using a cache directory. The cache directory lookup is configured to indicate whether data associated with a memory address specified by a processing-in-memory request is valid in memory. The system employs speculative evaluation logic to identify whether the data associated with the processing-in-memory request is stored in the cache system before the processing-in-memory request is transmitted to the cache coherence controller. If the data is stored in the cache system, the cache system locally invalidates or flushes the data to avoid stalling the processing-in-memory request during a cache directory lookup.Type: ApplicationFiled: September 29, 2023Publication date: April 3, 2025Applicant: Advanced Micro Devices, Inc.Inventors: Travis Henry Boraten, Jagadish B. Kotra, David Andrew Werner
-
Patent number: 12265470Abstract: Selectively bypassing cache directory lookups for processing-in-memory instructions is described. In one example, a system maintains information describing a status—clean or dirty—of a memory address, where a dirty status indicates that the memory address is modified in a cache and thus different than the memory address as represented in system memory. A processing-in-memory request involving the memory address is assigned a cache directory bypass bit based on the status of the memory address. The cache directory bypass bit for a processing-in-memory request controls whether a cache directory lookup is performed after the processing-in-memory request is issued by a processor core and before the processing-in-memory request is executed by a processing-in-memory component.Type: GrantFiled: September 29, 2023Date of Patent: April 1, 2025Assignee: Advanced Micro Devices, Inc.Inventors: Travis Henry Boraten, Jagadish B. Kotra, David Andrew Werner
-
Publication number: 20250103650Abstract: Graph analytics system are described. In accordance with the described techniques, a graph having vertices that include a first vertex and a second vertex that are associated with access control metadata are received. An updated graph is output based on a merging of the first vertex and the second vertex into a merged vertex of a group of vertices based on the first vertex and the second vertex being associated with access control metadata common to the first vertex and the second vertex and based on a reordering technique. A single copy of the access control metadata is stored for the first vertex and the second vertex.Type: ApplicationFiled: September 21, 2023Publication date: March 27, 2025Applicant: Advanced Micro Devices, Inc.Inventors: Kishore Punniyamurthy, Jagadish B. Kotra
-
Patent number: 12197378Abstract: An apparatus configured for offloading system service tasks to a processing-in-memory (“PIM”) device includes an agent configured to: receive, from a host processor, a request to offload a memory task associated with a system service to the PIM device; determine at least one PIM command and at least one memory page associated with the host processor based upon the request; and issue the at least one PIM command to the PIM device for execution by the PIM device to perform the memory task upon the at least one memory page.Type: GrantFiled: June 1, 2022Date of Patent: January 14, 2025Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Jagadish B. Kotra, Kishore Punniyamurthy
-
Patent number: 12189953Abstract: Methods, devices, and systems for retrieving information based on cache miss prediction. It is predicted, based on a history of cache misses at a private cache, that a cache lookup for the information will miss a shared victim cache. A speculative memory request is enabled based on the prediction that the cache lookup for the information will miss the shared victim cache. The information is fetched based on the enabled speculative memory request.Type: GrantFiled: September 29, 2022Date of Patent: January 7, 2025Inventors: Jagadish B. Kotra, John Kalamatianos
-
Publication number: 20250006232Abstract: An apparatus and method for creating less computationally intensive nodes for a neural network. An integrated circuit includes a host processor and multiple memory channels, each with multiple memory array banks. Each of the memory array banks includes components of a processing-in-memory (PIM) accelerator and a scatter and gather circuit used to dynamically perform quantization operations and dequantization operations that offload these operations from the host processor. The host processor executes a data model that represents a neural network. The memory array banks store a single copy of a particular data value in a single precision. Therefore, the memory array banks avoid storing replications of the same data value with different precisions to be used by a neural network node. The memory array banks dynamically perform quantization operations and dequantization operations on one or more of the weight values, input data values, and activation output values of the neural network.Type: ApplicationFiled: June 30, 2023Publication date: January 2, 2025Inventors: Ioannis Papadopoulos, Vignesh Adhinarayanan, Ashwin Aji, Jagadish B. Kotra
-
Patent number: 12153926Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.Type: GrantFiled: December 21, 2023Date of Patent: November 26, 2024Assignee: ADVANCED MICRO DEVICES, INC.Inventors: John Kalamatianos, Michael T. Clark, Marius Evers, William L. Walker, Paul Moyer, Jay Fleischman, Jagadish B. Kotra
-
Patent number: 12147338Abstract: In accordance with the described techniques for leveraging processing in memory registers as victim buffers, a computing device includes a memory, a processing in memory component having registers for data storage, and a memory controller having a victim address table that includes at least one address of a row of the memory that is stored in the registers. The memory controller receives a request to access the row of the memory and accesses data of the row from the registers based on the address of the row being included in the victim address table.Type: GrantFiled: December 27, 2022Date of Patent: November 19, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Jagadish B Kotra, Dong Kai Wang
-
Patent number: 12099723Abstract: A method for operating a memory having a plurality of banks accessible in parallel, each bank including a plurality of grains accessible in parallel is provided. The method includes: based on a memory access request that specifies a memory address, identifying a set that stores data for the memory access request, wherein the set is spread across multiple grains of the plurality of grains; and performing operations to satisfy the memory access request, using entries of the set stored across the multiple grains of the plurality of grains.Type: GrantFiled: September 29, 2022Date of Patent: September 24, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Jagadish B. Kotra, Marko Scrbak
-
Patent number: 12073251Abstract: Offloading computations from a processor to remote execution logic is disclosed. Offload instructions for remote execution on a remote device are dispatched in the form of processor instructions like conventional instructions. In the processor, an offload instruction is inserted in an offload queue. The offload instruction may be inserted at the dispatch stage or the retire stage of the processor pipeline. Metadata for the offload instruction is added to the offload instruction in the offload queue. After retirement of the offload instruction, the processor transmits an offload request generated from the offload instruction.Type: GrantFiled: December 29, 2020Date of Patent: August 27, 2024Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Nagadastagiri Reddy Challapalle, Jagadish B. Kotra, John Kalamatianos
-
Patent number: 12050531Abstract: In accordance with the described techniques for data compression and decompression for processing in memory, a page address is received by a processing in memory component that maps to a first location in memory where data of a page is maintained. The data of the page is compressed by the processing in memory component. Further, compressed data of the page is written by the processing in memory component to a compressed block device responsive to the compressed data satisfying one or more compressibility criteria. The compressed block device is a portion of the memory dedicated to storing data in a compressed form.Type: GrantFiled: September 26, 2022Date of Patent: July 30, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Kishore Punniyamurthy, Jagadish B Kotra
-
Publication number: 20240211407Abstract: A memory request issue counter (MRIC) is maintained that is incremented for every memory access a central processing unit core makes. A region reuse distance table is also maintained that includes multiple entries each of which stores the region reuse distance for a corresponding region. When a memory access request for a physical address is received, a reuse distance for the physical address is calculated. This reuse distance is the difference between the current MRIC value and a previous MRIC value for the physical address. The previous MRIC value for the physical address is the MRIC value the MRIC had when a memory access request for the physical address was last received. A region reuse distance for a region that includes the physical address is generated based on the reuse distance for the physical address and used to manage the cache.Type: ApplicationFiled: December 27, 2022Publication date: June 27, 2024Applicant: Advanced Micro Devices, Inc.Inventors: John Kalamatianos, Jagadish B. Kotra, Asmita Pal
-
Publication number: 20240211393Abstract: In accordance with the described techniques for leveraging processing in memory registers as victim buffers, a computing device includes a memory, a processing in memory component having registers for data storage, and a memory controller having a victim address table that includes at least one address of a row of the memory that is stored in the registers. The memory controller receives a request to access the row of the memory and accesses data of the row from the registers based on the address of the row being included in the victim address table.Type: ApplicationFiled: December 27, 2022Publication date: June 27, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Jagadish B Kotra, Dong Kai Wang
-
Patent number: 12019566Abstract: Arbitrating atomic memory operations, including: receiving, by a media controller, a plurality of atomic memory operations; determining, by an atomics controller associated with the media controller, based on one or more arbitration rules, an ordering for issuing the plurality of atomic memory operations; and issuing the plurality of atomic memory operations to a memory module according to the ordering.Type: GrantFiled: July 24, 2020Date of Patent: June 25, 2024Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Sergey Blagodurov, Johnathan Alsop, Jagadish B. Kotra, Marko Scrbak, Ganesh Dasika
-
Patent number: 12019547Abstract: A technical solution to the technical problem of how to improve dispatch throughput for memory-centric commands bypasses address checking for certain memory-centric commands. Implementations include using an Address Check Bypass (ACB) bit to specify whether address checking should be performed for a memory-centric command. ACB bit values are specified in memory-centric instructions, automatically specified by a process, such as a compiler, or by host hardware, such as dispatch hardware, based upon whether a memory-centric command explicitly references memory. Implementations include bypassing, i.e., not performing, address checking for memory-centric commands that do not access memory and also for memory-centric commands that do access memory, but that have the same physical address as a prior memory-centric command that explicitly accessed memory to ensure that any data in caches was flushed to memory and/or invalidated.Type: GrantFiled: July 27, 2021Date of Patent: June 25, 2024Inventors: Jagadish B. Kotra, John Kalamatianos, Gagandeep Panwar