Patents by Inventor Johnathan Alsop

Johnathan Alsop has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240119198
    Abstract: A physical system is simulated using a model including a plurality of elements in a mesh or grid. The elements are divided into partitions processed by different processing units. For some time steps, state data is transmitted between partitions and used to calculate flux data for updating the state of edge elements of the partitions. Periodically, transmission of state data is suppressed, and flux data is obtained by linear interpolation based on past flux data. Alternatively, flux data is obtained by processing state variables of an edge element and past flux data using a machine learning model, such as a DNN. Whether to suppress transmission of state data may be determined based on one or both of (a) uncertainty in an output of the machine learning model (e.g., Bayesian neural network) and (b) complexity of the model of the physical system (e.g., spatial or temporal gradients).
    Type: Application
    Filed: September 30, 2022
    Publication date: April 11, 2024
    Inventors: Laurent S. White, Johnathan Alsop, Ganesh Dasika
  • Publication number: 20240045606
    Abstract: Methods and apparatuses to control digital data transfer via a memory channel between a memory module and a processor are disclosed. At least one of the memory module or the processor coalesces a plurality of short data words into multicast coalesced block data comprising a single data block for transfer via the memory channel. Each of the plurality of short data words pertains to one of at least two partitioned memory submodules in the memory module. The multicast coalesced block data is communicated over the memory channel.
    Type: Application
    Filed: October 23, 2023
    Publication date: February 8, 2024
    Inventors: Johnathan Alsop, Nuwan Jayasena, Shaizeen Aga, Andrew M. McCrabb
  • Publication number: 20240004653
    Abstract: An approach is provided for managing near-memory processing commands (“PIM commands”) from multiple processor threads in a manner to prevent interference and maintain correctness at near-memory processing elements. A memory controller uses thread identification information and last command information to issue a PIM command sequence from a first processor thread, directed to a PIM-enabled memory element, while deferring the issuance of PIM command sequences from other processor threads, directed to the same PIM-enabled memory element. After the last PIM command in the PIM command sequence for the first processor thread has been issued, a PIM command sequence for another processor thread is issued, and so on. The approach allows multiple processor threads to concurrently issue fine-grained PIM commands to the same PIM-enabled memory element without having to be aware of address-to-memory element mapping, and without having to coordinate with other threads.
    Type: Application
    Filed: June 29, 2022
    Publication date: January 4, 2024
    Inventors: Johnathan Alsop, Laurent S. White, Shaizeen Aga
  • Patent number: 11803311
    Abstract: Methods and apparatuses to control digital data transfer via a memory channel between a memory module and a processor are disclosed. At least one of the memory module or the processor coalesces a plurality of short data words into multicast coalesced block data comprising a single data block for transfer via the memory channel. Each of the plurality of short data words pertains to one of at least two partitioned memory submodules in the memory module. The multicast coalesced block data is communicated over the memory channel.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: October 31, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Johnathan Alsop, Nuwan Jayasena, Shaizeen Aga, Andrew McCrabb
  • Publication number: 20230325317
    Abstract: Systems, apparatuses, and methods for reducing probe filter accesses in response to processing-in-memory (PIM) requests are disclosed. A coherent secondary unit receives PIM requests targeting a corresponding PIM device. For each PIM request that is received, the coherent secondary unit performs a lookup of a PIM address table (PAT). If the address of the PIM request matches an address of an existing entry in the PAT, the coherent secondary unit prevents the PIM request from being sent to a probe filter. Otherwise, if there is no match for the address of the PIM request in the entries of the PAT, the coherent secondary unit sends the PIM request to the probe filter, and the coherent secondary unit creates a new PAT entry for the address of the PIM request. Any subsequent PIM requests to the same address will match with the new entry in the PAT.
    Type: Application
    Filed: April 12, 2022
    Publication date: October 12, 2023
    Inventors: Michael Warren Boyer, Johnathan Alsop
  • Publication number: 20230266924
    Abstract: Systems, apparatuses, and methods for dynamically coalescing multi-bank memory commands to improve command throughput are disclosed. A system includes a processor coupled to a memory via a memory controller. The memory also includes processing-in-memory (PIM) elements which are able to perform computations within the memory. The processor generates memory requests targeting the memory which are sent to the memory controller. The memory controller stores commands received from the processor in a queue, and the memory controller determines whether opportunities exist for coalescing multiple commands together into a single multi-bank command. After coalescing multiple commands into a single combined multi-bank command, the memory controller conveys, across the memory bus to multiple separate banks, the single multi-bank command and a multi-bank code specifying which banks are targeted. The memory banks process the command in parallel, and the PIM elements process the data next to each respective bank.
    Type: Application
    Filed: May 2, 2023
    Publication date: August 24, 2023
    Inventors: Johnathan Alsop, Shaizeen Dilawarhusen Aga
  • Patent number: 11726918
    Abstract: Dynamically coalescing atomic memory operations for memory-local computing is disclosed. In an embodiment, it is determined whether a first atomic memory access and a second atomic memory access are candidates for coalescing. In response to a triggering event, the atomic memory accesses that are candidates for coalescing are coalesced in a cache prior to requesting memory-local processing by a memory-local compute unit. The atomic memory accesses may be coalesced in the same cache line or atomic memory accesses in different cache lines may be coalesced using a multicast memory-local processing command.
    Type: Grant
    Filed: June 28, 2021
    Date of Patent: August 15, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Johnathan Alsop, Alexandru Dutu, Shaizeen Aga, Nuwan Jayasena
  • Patent number: 11693725
    Abstract: Detecting execution hazards in offloaded operations is disclosed. A second offload operation is compared to a first offload operation that precedes the second offload operation. It is determined whether the second offload operation creates an execution hazard on an offload target device based on the comparison of the second offload operation to the first offload operation. If the execution hazard is detected, an error handling operation may be performed. In some examples, the offload operations are processing-in-memory operations.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: July 4, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Johnathan Alsop, Shaizeen Aga
  • Publication number: 20230195618
    Abstract: Near-memory compute elements perform memory operations and temporarily store at least a portion of address information for the memory operations in local storage. A broadcast memory command is then issued to the near-memory compute elements that causes the near-memory compute elements to perform a subsequent memory operation using their respective address information stored in the local storage. This allows a single broadcast memory command to be used to perform memory operations across multiple memory elements, such as DRAM banks, using bank-specific address information. In one implementation, the approach is used to process workloads with irregular updates to memory while consuming less command bus bandwidth than conventional approaches. Implementations include using conditional flags to selectively designate address information in local storage that is to be processed with the broadcast memory command.
    Type: Application
    Filed: December 21, 2021
    Publication date: June 22, 2023
    Inventors: Shaizeen Aga, Johnathan Alsop, Nuwan Jayasena
  • Patent number: 11681465
    Abstract: Systems, apparatuses, and methods for dynamically coalescing multi-bank memory commands to improve command throughput are disclosed. A system includes a processor coupled to a memory via a memory controller. The memory also includes processing-in-memory (PIM) elements which are able to perform computations within the memory. The processor generates memory requests targeting the memory which are sent to the memory controller. The memory controller stores commands received from the processor in a queue, and the memory controller determines whether opportunities exist for coalescing multiple commands together into a single multi-bank command. After coalescing multiple commands into a single combined multi-bank command, the memory controller conveys, across the memory bus to multiple separate banks, the single multi-bank command and a multi-bank code specifying which banks are targeted. The memory banks process the command in parallel, and the PIM elements process the data next to each respective bank.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: June 20, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Johnathan Alsop, Shaizeen Dilawarhusen Aga
  • Patent number: 11656796
    Abstract: A data processor includes a fabric-attached memory (FAM) interface for coupling to a data fabric and fulfilling memory access instructions. A requestor-side adaptive consistency controller coupled to the FAM interface requests notifications from a fabric manager for the fabric-attached memory regarding changes in requestors authorized to access a FAM region which the data processor is authorized to access. If a notification indicates that more than one requestor is authorized to access the FAM region, fences are activated for selected memory access instructions in a local application.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: May 23, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sergey Blagodurov, Brandon K. Potter, Johnathan Alsop
  • Publication number: 20220414013
    Abstract: Dynamically coalescing atomic memory operations for memory-local computing is disclosed. In an embodiment, it is determined whether a first atomic memory access and a second atomic memory access are candidates for coalescing. In response to a triggering event, the atomic memory accesses that are candidates for coalescing are coalesced in a cache prior to requesting memory-local processing by a memory-local compute unit. The atomic memory accesses may be coalesced in the same cache line or atomic memory accesses in different cache lines may be coalesced using a multicast memory-local processing command.
    Type: Application
    Filed: June 28, 2021
    Publication date: December 29, 2022
    Inventors: Johnathan Alsop, Alexandru Dutu, Shaizeen Aga, Nuwan Jayasena
  • Patent number: 11526449
    Abstract: A processing system limits the propagation of unnecessary memory updates by bypassing writing back dirty cache lines to other levels of a memory hierarchy in response to receiving an indication from software executing at a processor of the processing system that the value of the dirty cache line is dead (i.e., will not be read again or will not be read until after it has been overwritten). In response to receiving an indication from software that data is dead, a cache controller prevents propagation of the dead data to other levels of memory in response to eviction of the dead data or flushing of the cache at which the dead data is stored.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: December 13, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Johnathan Alsop, Pouya Fotouhi, Bradford Beckmann, Sergey Blagodurov
  • Patent number: 11507522
    Abstract: Systems, apparatuses, and methods for implementing memory request priority assignment techniques for parallel processors are disclosed. A system includes at least a parallel processor coupled to a memory subsystem, where the parallel processor includes at least a plurality of compute units for executing wavefronts in lock-step. The parallel processor assigns priorities to memory requests of wavefronts on a per-work-item basis by indexing into a first priority vector, with the index generated based on lane-specific information. If a given event is detected, a second priority vector is generated by applying a given priority promotion vector to the first priority vector. Then, for subsequent wavefronts, memory requests are assigned priorities by indexing into the second priority vector with lane-specific information. The use of priority vectors to assign priorities to memory requests helps to reduce the memory divergence problem experienced by different work-items of a wavefront.
    Type: Grant
    Filed: December 6, 2019
    Date of Patent: November 22, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sooraj Puthoor, Kishore Punniyamurthy, Onur Kayiran, Xianwei Zhang, Yasuko Eckert, Johnathan Alsop, Bradford Michael Beckmann
  • Publication number: 20220317926
    Abstract: Ordering between memory-centric memory operations, referred to hereinafter as “MC-Mem-Ops,” and core-centric memory operations, referred to hereinafter as “CC-Mem-Ops,” is enforced using inter-centric fences, referred to hereinafter as “IC-fences.” IC-fences are implemented by an ordering primitive or ordering instruction that causes a memory controller, a cache controller, etc., to enforce ordering of MC-Mem-Ops and CC-Mem-Ops throughout the memory pipeline and at the memory controller by not reordering MC-Mem-Ops (or sometimes CC-Mem-Ops) that arrive before the IC-fence to after the IC-fence. Processing of an IC-fence also causes the memory controller to issue an ordering acknowledgment to the thread that issued the IC-fence instruction. IC-fences are tracked at the core and designated as complete when the ordering acknowledgment is received.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Inventors: Shaizeen Aga, Nuwan Jayasena, Johnathan Alsop
  • Publication number: 20220318015
    Abstract: Enforcing data placement requirements via address bit swapping, including: receiving an instruction comprising a first memory address associated with a first address bit mapping; generating a remapped instruction by rearranging a plurality of bits of the first memory address according to a second address bit mapping; and issuing the remapped instruction to memory.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Inventors: Johnathan Alsop, Shaizeen Aga
  • Publication number: 20220317927
    Abstract: A data processor includes a fabric-attached memory (FAM) interface for coupling to a data fabric and fulfilling memory access instructions. A requestor-side adaptive consistency controller coupled to the FAM interface requests notifications from a fabric manager for the fabric-attached memory regarding changes in requestors authorized to access a FAM region which the data processor is authorized to access. If a notification indicates that more than one requestor is authorized to access the FAM region, fences are activated for selected memory access instructions in a local application.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sergey Blagodurov, Brandon K. Potter, Johnathan Alsop
  • Publication number: 20220318085
    Abstract: Detecting execution hazards in offloaded operations is disclosed. A second offload operation is compared to a first offload operation that precedes the second offload operation. It is determined whether the second offload operation creates an execution hazard on an offload target device based on the comparison of the second offload operation to the first offload operation. If the execution hazard is detected, an error handling operation may be performed. In some examples, the offload operations are processing-in-memory operations.
    Type: Application
    Filed: November 29, 2021
    Publication date: October 6, 2022
    Inventors: Johnathan Alsop, Shaizeen Aga
  • Publication number: 20220317876
    Abstract: Methods and apparatuses to control digital data transfer via a memory channel between a memory module and a processor are disclosed. At least one of the memory module or the processor coalesces a plurality of short data words into multicast coalesced block data comprising a single data block for transfer via the memory channel. Each of the plurality of short data words pertains to one of at least two partitioned memory submodules in the memory module. The multicast coalesced block data is communicated over the memory channel.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Inventors: Johnathan Alsop, Nuwan Jayasena, Shaizeen Aga, Andrew McCrabb
  • Publication number: 20220197506
    Abstract: Systems, apparatuses, and methods for determining data placement based on packet metadata are disclosed. A system includes a traffic analyzer that determines data placement across connected devices based on observed values of the metadata fields in actively exchanged packets across a plurality of protocol types. In one implementation, the protocol that is supported by the system is the compute express link (CXL) protocol. The traffic analyzer performs various actions in response to events observed in a packet stream that match items from a pre-configured list. Data movement is handled underneath the software applications by changing the virtual-to-physical address translation once the data movement is completed. After the data movement is finished, threads will pull in the new host physical address into their translation lookaside buffers (TLBs) via a page table walker or via an address translation service (ATS) request.
    Type: Application
    Filed: December 17, 2020
    Publication date: June 23, 2022
    Inventors: Sergey Blagodurov, Johnathan Alsop, SeyedMohammad SeyedzadehDelcheh
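
Illustrative note on publication 20230325317 (reducing probe filter accesses for PIM requests): the lookup flow in that abstract can be pictured with a small Python sketch. This is only a toy model built from the abstract's wording, not the patented implementation. The class and function names are hypothetical, and the PIM address table (PAT) is modeled as an unbounded set, which is an assumption; the abstract does not state the table's capacity or any eviction policy.

```python
class CoherentSecondaryUnit:
    """Toy model of the PAT lookup flow (hypothetical names, unbounded table)."""

    def __init__(self):
        self.pat = set()              # PIM address table: addresses already recorded
        self.probe_filter_sends = 0   # number of requests forwarded to the probe filter

    def handle_pim_request(self, address):
        """Return True if the request was sent to the probe filter."""
        if address in self.pat:
            # Address matches an existing PAT entry: skip the probe filter.
            return False
        # No match: send to the probe filter and record the address in the PAT.
        self.probe_filter_sends += 1
        self.pat.add(address)
        return True


def count_probe_filter_sends(addresses):
    """Feed a sequence of PIM request addresses through one unit."""
    unit = CoherentSecondaryUnit()
    for address in addresses:
        unit.handle_pim_request(address)
    return unit.probe_filter_sends
```

For example, calling count_probe_filter_sends([0x100, 0x100, 0x200, 0x100]) forwards only the first request to each distinct address, so two of the four requests reach the probe filter in this model.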