Patents by Inventor Jagadish B. Kotra

Jagadish B. Kotra has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

LEVERAGING PROCESSING-IN-MEMORY (PIM) RESOURCES TO EXPEDITE NON-PIM INSTRUCTIONS EXECUTED ON A HOST

Publication number: 20230205693

Abstract: Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host is disclosed. In an implementation, a memory controller identifies a first write instruction to write first data to a first memory location, where the first write instruction is not a processing-in-memory (PIM) instruction. The memory controller then writes the first data to a first PIM register. Opportunistically, the memory controller moves the first data from the first PIM register to the first memory location. In another implementation, a memory controller identifies a first memory location associated with a first read instruction, where the first read instruction is not a processing-in-memory (PIM) instruction. The memory controller identifies that a PIM register is associated with the first memory location. The memory controller then reads, in response to the first read instruction, first data from the PIM register.

Type: Application

Filed: December 28, 2021

Publication date: June 29, 2023

Inventors: JAGADISH B. KOTRA, JOHN KALAMATIANOS, YASUKO ECKERT, YONGHAE KIM
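
The behavior described in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed implementation: the class name MemoryControllerSketch, the register count, and the "drain when no register is free" policy are hypothetical choices made for illustration. It shows non-PIM writes being staged in PIM registers, moved to memory later, and served to reads while still staged.

```python
class MemoryControllerSketch:
    """Toy model: PIM registers double as a write buffer for non-PIM traffic."""

    def __init__(self, num_pim_registers=2):
        self.memory = {}          # address -> data (stands in for DRAM)
        self.pim_registers = {}   # address -> data staged in a PIM register
        self.capacity = num_pim_registers  # hypothetical register count

    def write(self, address, data):
        # Stage the non-PIM write in a PIM register instead of writing DRAM.
        no_free_register = len(self.pim_registers) >= self.capacity
        if address not in self.pim_registers and no_free_register:
            self.drain()  # assumed policy: flush staged data to make room
        self.pim_registers[address] = data

    def drain(self):
        # Stand-in for the "opportunistic" move to the real memory location.
        self.memory.update(self.pim_registers)
        self.pim_registers.clear()

    def read(self, address):
        # Serve the read from a PIM register when one holds this address.
        if address in self.pim_registers:
            return self.pim_registers[address]
        return self.memory.get(address)
```

In this model a read that follows a recent write never touches the memory dictionary, which is the latency benefit the abstract points at; a real controller would choose drain times based on bus idleness rather than register pressure alone.
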

METHOD AND APPARATUS FOR RECOVERING REGULAR ACCESS PERFORMANCE IN FINE-GRAINED DRAM

Publication number: 20230186976

Abstract: A fine-grained dynamic random-access memory (DRAM) includes a first memory bank, a second memory bank, and a dual-mode I/O circuit. The first memory bank includes a memory array divided into a plurality of grains, each grain including a row buffer and input/output (I/O) circuitry. The dual-mode I/O circuit is coupled to the I/O circuitry of each grain in the first memory bank, and operates in a first mode in which commands having a first data width are routed to and fulfilled individually at each grain, and a second mode in which commands having a second data width different from the first data width are fulfilled by at least two of the grains in parallel.

Type: Application

Filed: December 13, 2021

Publication date: June 15, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Sriseshan Srikanth, Vignesh Adhinarayanan, Jagadish B. Kotra, Sergey Blagodurov

Data As Compute

Publication number: 20230169015

Abstract: A method includes storing a function representing a set of data elements stored in a backing memory and, in response to a first memory read request for a first data element of the set of data elements, calculating a function result representing the first data element based on the function.

Type: Application

Filed: November 30, 2021

Publication date: June 1, 2023

Inventors: Kishore Punniyamurthy, SeyedMohammad SeyedzadehDelcheh, Sergey Blagodurov, Ganesh Dasika, Jagadish B. Kotra
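
As a rough illustration of the idea in the abstract above, the sketch below answers reads for a registered address range by evaluating a stored function instead of fetching the element from backing memory. The names FunctionBackedRegion, MemorySketch, and register_function are hypothetical, and the linear function in the usage note is an arbitrary example rather than anything stated in the application.

```python
class FunctionBackedRegion:
    """An address range whose contents are represented by a function."""

    def __init__(self, base, count, func):
        self.base = base      # first address covered
        self.count = count    # number of data elements represented
        self.func = func      # maps element index -> element value

    def covers(self, address):
        return self.base <= address < self.base + self.count

    def read(self, address):
        # Calculate the element from the function instead of loading it.
        return self.func(address - self.base)


class MemorySketch:
    def __init__(self):
        self.backing = {}     # address -> value for ordinary data
        self.regions = []     # function-backed regions

    def register_function(self, base, count, func):
        self.regions.append(FunctionBackedRegion(base, count, func))

    def read(self, address):
        for region in self.regions:
            if region.covers(address):
                return region.read(address)
        return self.backing[address]
```

For example, after mem.register_function(100, 4, lambda i: 3 * i + 7), a read of address 102 is computed as 3 * 2 + 7 without touching mem.backing.
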

Preserving memory ordering between offloaded instructions and non-offloaded instructions

Patent number: 11625249

Abstract: Preserving memory ordering between offloaded instructions and non-offloaded instructions is disclosed. An offload instruction for an operation to be offloaded is processed and a lock is placed on a memory address associated with the offload instruction. In response to completing a cache operation targeting the memory address, the lock on the memory address is removed. For multithreaded applications, upon determining that a plurality of processor cores have each begun executing a sequence of offload instructions, the execution of non-offload instructions that are younger than any of the offload instructions is restricted. In response to determining that each processor core has completed executing its sequence of offload instructions, the restriction is removed. The remote device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.

Type: Grant

Filed: December 29, 2020

Date of Patent: April 11, 2023

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Jagadish B. Kotra, John Kalamatianos

Method and apparatus for virtualizing the micro-op cache

Patent number: 11586441

Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions.

Type: Grant

Filed: December 17, 2020

Date of Patent: February 21, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: John Kalamatianos, Jagadish B. Kotra
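
The eviction path described in the abstract above can be sketched as a two-level lookup. This is a simplified model under assumptions that the abstract does not state: least-recently-used replacement, keying by program counter, and removing an entry from the conventional cache when it is refilled are all illustrative choices, and VirtualizedUopCache is a hypothetical name.

```python
from collections import OrderedDict


class VirtualizedUopCache:
    """Toy model: evicted micro-ops spill into a conventional cache."""

    def __init__(self, capacity, decode):
        self.uop_cache = OrderedDict()  # pc -> micro-ops, oldest first
        self.conventional_cache = {}    # stand-in for the cache subsystem
        self.capacity = capacity
        self.decode = decode            # function: instruction -> micro-ops
        self.decodes = 0                # counts how often we had to decode

    def fetch(self, pc, instruction):
        if pc in self.uop_cache:
            self.uop_cache.move_to_end(pc)  # refresh LRU position
            return self.uop_cache[pc]
        if pc in self.conventional_cache:
            # Previously evicted micro-ops are reused instead of re-decoded.
            uops = self.conventional_cache.pop(pc)
        else:
            uops = self.decode(instruction)
            self.decodes += 1
        self._insert(pc, uops)
        return uops

    def _insert(self, pc, uops):
        if len(self.uop_cache) >= self.capacity:
            victim_pc, victim = self.uop_cache.popitem(last=False)
            # Rather than discarding the victim, keep it in the other cache.
            self.conventional_cache[victim_pc] = victim
        self.uop_cache[pc] = uops
```

The decodes counter makes the benefit visible: re-fetching an instruction whose micro-ops were evicted does not increase it.
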

Adaptive cache management based on programming model information

Patent number: 11586539

Abstract: A processing system selectively allocates space to store a group of one or more cache lines at a cache level of a cache hierarchy having a plurality of cache levels based on memory access patterns of a software application executing at the processing system. The processing system generates bit vectors indicating which cache levels are to allocate space to store groups of one or more cache lines based on the memory access patterns, which are derived from data granularity and movement information. Based on the bit vectors, the processing system provides hints to the cache hierarchy indicating the lowest cache level that can exploit the reuse potential for particular data.

Type: Grant

Filed: December 13, 2019

Date of Patent: February 21, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Weon Taek Na, Jagadish B. Kotra, Yasuko Eckert, Steven Raasch, Sergey Blagodurov

DISPATCH BANDWIDTH OF MEMORY-CENTRIC REQUESTS BY BYPASSING STORAGE ARRAY ADDRESS CHECKING

Publication number: 20230030679

Abstract: A technical solution to the technical problem of how to improve dispatch throughput for memory-centric commands bypasses address checking for certain memory-centric commands. Implementations include using an Address Check Bypass (ACB) bit to specify whether address checking should be performed for a memory-centric command. ACB bit values are specified in memory-centric instructions, automatically specified by a process, such as a compiler, or by host hardware, such as dispatch hardware, based upon whether a memory-centric command explicitly references memory. Implementations include bypassing, i.e., not performing, address checking for memory-centric commands that do not access memory and also for memory-centric commands that do access memory, but that have the same physical address as a prior memory-centric command that explicitly accessed memory to ensure that any data in caches was flushed to memory and/or invalidated.

Type: Application

Filed: July 27, 2021

Publication date: February 2, 2023

Inventors: Jagadish B. Kotra, John Kalamatianos, Gagandeep Panwar
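
The two bypass conditions named in the abstract above can be sketched as a small function that assigns an Address Check Bypass (ACB) bit to each command in a stream. This is an illustration only: the function name assign_acb_bits and the representation of a command as either a physical address or None are assumptions, and the model assumes nothing re-caches an address between two commands that reference it.

```python
def assign_acb_bits(command_addresses):
    """Return one ACB bit per command (1 = bypass address checking).

    command_addresses: physical address per command, or None when the
    command does not reference memory (hypothetical representation).
    """
    bits = []
    already_checked = set()  # addresses a prior command explicitly accessed
    for address in command_addresses:
        if address is None:
            bits.append(1)   # no memory access, nothing to check
        elif address in already_checked:
            bits.append(1)   # a prior command already forced flush/invalidate
        else:
            bits.append(0)   # first explicit access: perform the check
            already_checked.add(address)
    return bits
```

For the stream [0x100, None, 0x100, 0x200], only the first and last commands keep address checking enabled.
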

Method and apparatus for temperature-gradient aware data-placement for 3D stacked DRAMs

Patent number: 11556250

Abstract: A system including a stack of two or more layers of volatile memory, such as layers of a 3D stacked DRAM memory, places data in the stack based on a temperature or a refresh rate. When a threshold is exceeded, data are moved from a first region to a second region in the stack, the second region having one or both of a second temperature lower than a first temperature of the first region or a second refresh rate lower than a first refresh rate of the first region.

Type: Grant

Filed: July 27, 2020

Date of Patent: January 17, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Jagadish B. Kotra, Karthik Rao, Joseph L. Greathouse
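
A minimal sketch of the threshold-triggered migration in the abstract above follows. It is deliberately simplified and rests on assumptions: regions are identified by name, only temperature (not refresh rate) is modeled, region capacity is ignored, and the coolest region is always chosen as the destination. The function name rebalance and its parameters are hypothetical.

```python
def rebalance(region_temps_c, placement, threshold_c):
    """Move data out of regions hotter than threshold_c.

    region_temps_c: region name -> temperature in degrees Celsius
    placement:      data id -> region name currently holding it
    Returns a new placement dict; the inputs are not modified.
    """
    coolest = min(region_temps_c, key=region_temps_c.get)
    new_placement = {}
    for data_id, region in placement.items():
        too_hot = region_temps_c[region] > threshold_c
        cooler_exists = region_temps_c[coolest] < region_temps_c[region]
        # Migrate only when the threshold is exceeded and a cooler region exists.
        new_placement[data_id] = coolest if (too_hot and cooler_exists) else region
    return new_placement
```

With layers at 85 and 60 degrees and a threshold of 80, data in the hotter layer is reassigned to the cooler one, while data already in the cooler layer stays put.
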

Data compression and encryption based on translation lookaside buffer evictions

Patent number: 11507519

Abstract: A processing system selectively compresses cache lines at a cache or at a memory or encrypts cache lines at the memory based on evictions of entries mapping virtual-to-physical address translations from a translation lookaside buffer (TLB). Upon eviction of a TLB entry, the processing system identifies cache lines corresponding to the physical addresses of the evicted TLB entry and selectively compresses the cache lines to increase the effective storage capacity of the processing system or encrypts the cache lines to protect against vulnerabilities.

Type: Grant

Filed: December 28, 2020

Date of Patent: November 22, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Jagadish B. Kotra, Gabriel H. Loh, Matthew R. Poremba

Hardware-software collaborative address mapping scheme for efficient processing-in-memory systems

Patent number: 11487447

Abstract: Approaches are provided for implementing hardware-software collaborative address mapping schemes that enable mapping data elements which are accessed together in the same row of one bank or over the same rows of different banks to achieve higher performance by reducing row conflicts. Using an intra-bank frame striping policy (IBFS), corresponding subsets of data elements are interleaved into a single row of a bank. Using an intra-channel frame striping policy (ICFS), corresponding subsets of data elements are interleaved into a single channel row of a channel. A memory controller utilizes ICFS and/or IBFS to efficiently store and access data elements in memory, such as processing-in-memory (PIM) enabled memory.

Type: Grant

Filed: August 28, 2020

Date of Patent: November 1, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Mahzabeen Islam, Shaizeen Aga, Nuwan Jayasena, Jagadish B. Kotra

METHOD AND APPARATUS FOR A DRAM CACHE TAG PREFETCHER

Publication number: 20220318151

Abstract: Devices and methods for cache prefetching are provided. A device is provided which comprises memory and a processor. The memory comprises a DRAM cache, a cache dedicated to the processor and one or more intermediate caches between the dedicated cache and the DRAM cache. The processor is configured to issue prefetch requests to prefetch data, issue data access requests to fetch the data, and, when one or more previously issued prefetch requests are determined to be inaccurate, issue a prefetch request to prefetch a tag corresponding to the memory address of requested data in the DRAM cache. A tag look-up is performed at the DRAM cache without performing tag look-ups at the dedicated cache or the intermediate caches. The tag is prefetched from the DRAM cache without prefetching the requested data.

Type: Application

Filed: March 31, 2021

Publication date: October 6, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Jagadish B. Kotra, Marko Scrbak, Matthew Raymond Poremba

HARDWARE-SOFTWARE COLLABORATIVE ADDRESS MAPPING SCHEME FOR EFFICIENT PROCESSING-IN-MEMORY SYSTEMS

Publication number: 20220276795

Abstract: Approaches are provided for implementing hardware-software collaborative address mapping schemes that enable mapping data elements which are accessed together in the same row of one bank or over the same rows of different banks to achieve higher performance by reducing row conflicts. Using an intra-bank frame striping policy (IBFS), corresponding subsets of data elements are interleaved into a single row of a bank. Using an intra-channel frame striping policy (ICFS), corresponding subsets of data elements are interleaved into a single channel row of a channel. A memory controller utilizes ICFS and/or IBFS to efficiently store and access data elements in memory, such as processing-in-memory (PIM) enabled memory.

Type: Application

Filed: May 16, 2022

Publication date: September 1, 2022

Inventors: Mahzabeen Islam, Shaizeen Aga, Nuwan Jayasena, Jagadish B. Kotra

Temporal link encoding

Patent number: 11398831

Abstract: Temporal link encoding, including: identifying a data type of a data value to be transmitted; determining that the data type is included in one or more data types for temporal encoding; and transmitting the data value using temporal encoding.

Type: Grant

Filed: May 7, 2020

Date of Patent: July 26, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Onur Kayiran, Steven Raasch, Sergey Blagodurov, Jagadish B. Kotra

PRESERVING MEMORY ORDERING BETWEEN OFFLOADED INSTRUCTIONS AND NON-OFFLOADED INSTRUCTIONS

Publication number: 20220206817

Abstract: Preserving memory ordering between offloaded instructions and non-offloaded instructions is disclosed. An offload instruction for an operation to be offloaded is processed and a lock is placed on a memory address associated with the offload instruction. In response to completing a cache operation targeting the memory address, the lock on the memory address is removed. For multithreaded applications, upon determining that a plurality of processor cores have each begun executing a sequence of offload instructions, the execution of non-offload instructions that are younger than any of the offload instructions is restricted. In response to determining that each processor core has completed executing its sequence of offload instructions, the restriction is removed. The remote device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.

Type: Application

Filed: December 29, 2020

Publication date: June 30, 2022

Inventors: JAGADISH B. KOTRA, JOHN KALAMATIANOS

OFFLOADING COMPUTATIONS FROM A PROCESSOR TO REMOTE EXECUTION LOGIC

Publication number: 20220206855

Abstract: Offloading computations from a processor to remote execution logic is disclosed. Offload instructions for remote execution on a remote device are dispatched in the form of processor instructions like conventional instructions. In the processor, an offload instruction is inserted in an offload queue. The offload instruction may be inserted at the dispatch stage or the retire stage of the processor pipeline. Metadata for the offload instruction is added to the offload instruction in the offload queue. After retirement of the offload instruction, the processor transmits an offload request generated from the offload instruction.

Type: Application

Filed: December 29, 2020

Publication date: June 30, 2022

Inventors: NAGADASTAGIRI REDDY CHALLAPALLE, JAGADISH B. KOTRA, JOHN KALAMATIANOS

METHODS FOR CONFIGURING SPAN OF CONTROL UNDER VARYING TEMPERATURE

Publication number: 20220188208

Abstract: A method may include, in response to a change in an operating parameter of a processing unit, modifying a signal pathway to a processing circuit component of the processing unit, and communicating with the processing circuit component via the signal pathway.

Type: Application

Filed: December 10, 2020

Publication date: June 16, 2022

Inventors: Anthony Gutierrez, Yasuko Eckert, Sergey Blagodurov, Jagadish B. Kotra

PROCESSOR-GUIDED EXECUTION OF OFFLOADED INSTRUCTIONS USING FIXED FUNCTION OPERATIONS

Publication number: 20220188117

Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.

Type: Application

Filed: December 16, 2020

Publication date: June 16, 2022

Inventors: JOHN KALAMATIANOS, MICHAEL T. CLARK, MARIUS EVERS, WILLIAM L. WALKER, PAUL MOYER, JAY FLEISCHMAN, JAGADISH B. KOTRA

MANAGING CACHED DATA USED BY PROCESSING-IN-MEMORY INSTRUCTIONS

Publication number: 20220188233

Abstract: A system-on-chip configured for eager invalidation and flushing of cached data used by PIM (Processing-in-Memory) instructions includes: one or more processor cores; one or more caches; and an I/O (input/output) die comprising logic to: receive a cache probe request, wherein the cache probe request includes a physical memory address associated with a PIM instruction, and the PIM instruction is to be offloaded to a PIM device for execution; and issue, based on the physical memory address, a cache probe to one or more of the caches prior to receiving the PIM instruction for dispatch to the PIM device.

Type: Application

Filed: September 13, 2021

Publication date: June 16, 2022

Inventors: JOHN KALAMATIANOS, JAGADISH B. KOTRA, GAGANDEEP PANWAR

Techniques to improve translation lookaside buffer reach by leveraging idle resources

Patent number: 11321241

Abstract: Techniques are disclosed for processing address translations. The techniques include detecting a first miss for a first address translation request for a first address translation in a first translation lookaside buffer, in response to the first miss, fetching the first address translation into the first translation lookaside buffer and evicting a second address translation from the translation lookaside buffer into an instruction cache or local data share memory, detecting a second miss for a second address translation request referencing the second address translation, in the first translation lookaside buffer, and in response to the second miss, fetching the second address translation from the instruction cache or the local data share memory.

Type: Grant

Filed: August 31, 2020

Date of Patent: May 3, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Jagadish B. Kotra, Michael W. LeBeane

METHOD AND APPARATUS FOR REDUCING THE LATENCY OF LONG LATENCY MEMORY REQUESTS

Publication number: 20220091986

Abstract: Systems, apparatuses, and methods for efficiently processing memory requests are disclosed. A computing system includes at least one processing unit coupled to a memory. Circuitry in the processing unit determines that a memory request becomes a long-latency request based on detecting that a translation lookaside buffer (TLB) miss, a branch misprediction, a memory dependence misprediction, or a precise exception has occurred. The circuitry marks the memory request as a long-latency request, such as by storing an indication of a long-latency request in an instruction tag of the memory request. The circuitry uses weighted criteria for scheduling out-of-order issue and servicing of memory requests. However, the indication of a long-latency request is not combined with other criteria in a weighted sum. Rather, the indication of the long-latency request is a separate value. The circuitry prioritizes memory requests marked as long-latency requests over memory requests not marked as long-latency requests.

Type: Application

Filed: September 23, 2020

Publication date: March 24, 2022

Inventors: Jagadish B. Kotra, John Kalamatianos
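
The scheduling rule in the abstract above, where the long-latency indication is kept separate from the weighted sum, can be sketched as a two-part sort key. This is an illustration under assumptions: the criteria (request age and whether the request is a load), the weights, and the names MemoryRequest, weighted_score, and issue_order are hypothetical and not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class MemoryRequest:
    name: str
    age: int                    # cycles waiting (assumed criterion)
    is_load: bool               # loads favored over stores (assumed criterion)
    long_latency: bool = False  # separate mark, e.g. set after a TLB miss


def weighted_score(request, w_age=1.0, w_load=2.0):
    # Ordinary criteria are combined into one weighted sum.
    return w_age * request.age + w_load * (1 if request.is_load else 0)


def issue_order(requests):
    # The long-latency mark is compared first and is never folded into
    # the weighted sum, so a marked request always outranks unmarked ones.
    return sorted(
        requests,
        key=lambda r: (r.long_latency, weighted_score(r)),
        reverse=True,
    )
```

Because the mark is the first element of the key, even a young marked request is issued ahead of an old unmarked one, which a single weighted sum could not guarantee for every choice of weights.
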