Patents by Inventor Bradford Beckmann

Bradford Beckmann has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250123846
    Abstract: A processing unit includes a plurality of processing cores and is configured to arrange a sparse matrix for parallel performance by the cores on different rows of the matrix at least in part by calculating a respective quantity of non-zero elements in each row, assigning each row to a respective collection according to the respective quantity of non-zero elements for the row, wherein the processing unit is configured to assign at least one first row of the sparse matrix to respective collections in parallel with assigning at least one second row of the sparse matrix to respective collections, and performing at least one mathematical operation on at least a first collection of the plurality of collections in parallel with performing the at least one mathematical operation on at least a second collection of the plurality of collections.
    Type: Application
    Filed: October 12, 2023
    Publication date: April 17, 2025
    Applicant: Advanced Micro Devices, Inc.
    Inventors: William Peter Ehrett, Muhammad Osama, Bradford Beckmann
  • Publication number: 20250103395
    Abstract: A computer-implemented method for dynamic resource management can include evaluating, by at least one processor, whether a priority of one or more processes associated with a request for one or more shared resources meets a threshold condition. The method can additionally include determining, by the at least one processor and in response to an evaluation that the priority meets the threshold condition, whether the one or more shared resources is available to meet the request. The method can further include completing, by the at least one processor and in response to a determination that the one or more shared resources is available, execution of the one or more processes. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: September 27, 2023
    Publication date: March 27, 2025
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Bradford Beckmann, Matthew David Sinclair, Vinay Bharadwaj Ramakrishnaiah, William Peter Ehrett
  • Publication number: 20240395289
    Abstract: Integrated circuit (IC) memory devices and methods for fabricating the same are provided. In one example, an IC memory device is provided that includes a substrate, two or more memory IC dies, and a non-memory IC die integrated in a chip package. The memory IC dies are stacked on the substrate to form a memory die stack. The non-memory IC die contains row segmentation logic having an output routed to corresponding wordline drivers of the memory IC dies through vertical wiring passing through the memory die stack.
    Type: Application
    Filed: May 22, 2024
    Publication date: November 28, 2024
    Inventors: Vignesh Adhinarayanan, Hyung-Dong Lee, Bradford Beckmann, Seyedmohammad Seyedzadehdelcheh, Sergey Blagodurov
  • Patent number: 12131199
    Abstract: A processing system monitors and synchronizes parallel execution of workgroups (WGs). One or more of the WGs perform (e.g., periodically or in response to a trigger such as an indication of oversubscription) a waiting atomic instruction. In response to a comparison between an atomic value produced as a result of the waiting atomic instruction and an expected value, WGs that fail to produce a correct atomic value are identified as being in a waiting state (e.g., waiting for a synchronization variable). Execution of WGs in the waiting state is prevented (e.g., by a context switch) until corresponding synchronization variables are released.
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: October 29, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexandru Dutu, Matthew David Sinclair, Bradford Beckmann, David A. Wood
  • Publication number: 20240329984
    Abstract: An electronic device includes processing circuitry that executes a lookup table (LUT) vector instruction. Executing the lookup table vector instruction causes the processing circuitry to acquire a set of reference values by using each input value from an input vector as an index to acquire a reference value from a reference vector. The processing circuitry then provides the set of reference values for one or more subsequent operations. The processing circuitry can also use the set of reference values for controlling vector elements from among a set of vector elements for which a vector operation is performed.
    Type: Application
    Filed: March 30, 2023
    Publication date: October 3, 2024
    Inventors: Yasuko Eckert, Vadim Vadimovich Nikiforov, Gabriel H. Loh, Bradford Beckmann
  • Publication number: 20240095180
    Abstract: The disclosed computer-implemented method for interpolating register-based lookup tables can include identifying, within a set of registers, a lookup table that has been encoded for storage within the set of registers. The method can also include receiving a request to look up a value in the lookup table and responding to the request by interpolating, from the encoded lookup table stored in the set of registers, a representation of the requested value. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: December 23, 2022
    Publication date: March 21, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Gabriel H. Loh, Michael Estlick, Jay Fleischman, Michael J. Schulte, Bradford Beckmann, Yasuko Eckert
  • Patent number: 11875425
    Abstract: Implementing heterogeneous wavefronts on a graphics processing unit (GPU) is disclosed. A scheduler assigns heterogeneous wavefronts for execution on a compute unit of a processing device. The heterogeneous wavefronts include different types of wavefronts such as vector compute wavefronts and service-level wavefronts that vary in resource requirements and instruction sets. As one example, heterogeneous wavefronts may include scalar wavefronts and vector compute wavefronts that execute on scalar units and vector units, respectively. Distinct sets of instructions are executed for the heterogeneous wavefronts on the compute unit. Heterogeneous wavefronts are processed in the same pipeline of the processing device.
    Type: Grant
    Filed: December 28, 2020
    Date of Patent: January 16, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sooraj Puthoor, Bradford Beckmann, Nuwan Jayasena, Anthony Gutierrez
  • Patent number: 11868809
    Abstract: A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such that a memory transfer of the second proxy object of the plurality of proxy objects occurs while the first task is being executed.
    Type: Grant
    Filed: January 11, 2023
    Date of Patent: January 9, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Muhammad Amber Hassaan, Anirudh Mohan Kaushik, Sooraj Puthoor, Gokul Subramanian Ravi, Bradford Beckmann, Ashwin Aji
  • Publication number: 20230393855
    Abstract: An approach is provided for implementing register based single instruction, multiple data (SIMD) lookup table operations. According to the approach, an instruction set architecture (ISA) can support one or more SIMD instructions that enable vectors or multiple values in source data registers to be processed in parallel using a lookup table or truth table stored in one or more function registers. The SIMD instructions can be flexibly configured to support functions with inputs and outputs of various sizes and data formats. Various approaches are also described for supporting very large lookup tables that span multiple registers.
    Type: Application
    Filed: June 6, 2022
    Publication date: December 7, 2023
    Inventors: Gabriel H. Loh, Yasuko Eckert, Bradford Beckmann, Michael Estlick, Jay Fleischman
  • Patent number: 11740791
    Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
    Type: Grant
    Filed: October 8, 2021
    Date of Patent: August 29, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Seyed Mohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das
  • Patent number: 11734059
    Abstract: A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such that a memory transfer of the second proxy object of the plurality of proxy objects occurs while the first task is being executed.
    Type: Grant
    Filed: March 19, 2020
    Date of Patent: August 22, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Muhammad Amber Hassaan, Anirudh Mohan Kaushik, Sooraj Puthoor, Gokul Subramanian Ravi, Bradford Beckmann, Ashwin Aji
  • Publication number: 20230229494
    Abstract: A processor includes a task scheduling unit and a compute unit coupled to the task scheduling unit. The task scheduling unit performs a task dependency assessment of a task dependency graph and task data requirements that correspond to each task of the plurality of tasks. Based on the task dependency assessment, the task scheduling unit schedules a first task of the plurality of tasks and a second proxy object of a plurality of proxy objects specified by the task data requirements such that a memory transfer of the second proxy object of the plurality of proxy objects occurs while the first task is being executed.
    Type: Application
    Filed: January 11, 2023
    Publication date: July 20, 2023
    Inventors: Muhammad Amber Hassaan, Anirudh Mohan Kaushik, Sooraj Puthoor, Gokul Subramanian Ravi, Bradford Beckmann, Ashwin Aji
  • Patent number: 11526449
    Abstract: A processing system limits the propagation of unnecessary memory updates by bypassing writing back dirty cache lines to other levels of a memory hierarchy in response to receiving an indication from software executing at a processor of the processing system that the value of the dirty cache line is dead (i.e., will not be read again or will not be read until after it has been overwritten). In response to receiving an indication from software that data is dead, a cache controller prevents propagation of the dead data to other levels of memory in response to eviction of the dead data or flushing of the cache at which the dead data is stored.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: December 13, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Johnathan Alsop, Pouya Fotouhi, Bradford Beckmann, Sergey Blagodurov
  • Patent number: 11487671
    Abstract: Wavefront loading in a processor is managed and includes monitoring a selected wavefront of a set of wavefronts. Reuse of memory access requests for the selected wavefront is counted. A cache hit rate in one or more caches of the processor is determined based on the counted reuse. Based on the cache hit rate, subsequent memory requests of other wavefronts of the set of wavefronts are modified by including a type of reuse of cache lines in requests to the caches. In the caches, storage of data in the caches is based on the type of reuse indicated by the subsequent memory access requests. Reused cache lines are protected by preventing cache line contents from being replaced by another cache line for a duration of processing the set of wavefronts. Caches are bypassed when streaming access requests are made.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: November 1, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Xianwei Zhang, John Kalamatianos, Bradford Beckmann
  • Patent number: 11481250
    Abstract: A first workgroup is preempted in response to threads in the first workgroup executing a first wait instruction including a first value of a signal and a first hint indicating a type of modification for the signal. The first workgroup is scheduled for execution on a processor core based on a first context after preemption in response to the signal having the first value. A second workgroup is scheduled for execution on the processor core based on a second context in response to preempting the first workgroup and in response to the signal having a second value. A third context is prefetched into registers of the processor core based on the first hint and the second value. The first context is stored in a first portion of the registers and the second context is prefetched into a second portion of the registers prior to preempting the first workgroup.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: October 25, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexandru Dutu, Matthew David Sinclair, Bradford Beckmann, David A. Wood
  • Publication number: 20220206869
    Abstract: Virtualizing resources of a memory-based execution device is disclosed. A host processing system orchestrates the execution of two or more offload tasks on a remote execution device. The remote execution device includes a memory array coupled to a processing unit that is shared by concurrent processes on the host processing system. The host processing system provides time-multiplexed access to the processing unit by each concurrent process for completing offload tasks on the processing unit. The host processing system initiates a context switch on the remote execution device from a first offload task to a second offload task. The context state of the first offload task is saved on the remote execution device.
    Type: Application
    Filed: December 28, 2020
    Publication date: June 30, 2022
    Inventors: Vaibhav Ramakrishnan Ramachandran, Alexandru Dutu, Bradford Beckmann
  • Publication number: 20220207643
    Abstract: Implementing heterogeneous wavefronts on a graphics processing unit (GPU) is disclosed. A scheduler assigns heterogeneous wavefronts for execution on a compute unit of a processing device. The heterogeneous wavefronts include different types of wavefronts such as vector compute wavefronts and service-level wavefronts that vary in resource requirements and instruction sets. As one example, heterogeneous wavefronts may include scalar wavefronts and vector compute wavefronts that execute on scalar units and vector units, respectively. Distinct sets of instructions are executed for the heterogeneous wavefronts on the compute unit. Heterogeneous wavefronts are processed in the same pipeline of the processing device.
    Type: Application
    Filed: December 28, 2020
    Publication date: June 30, 2022
    Inventors: Sooraj Puthoor, Bradford Beckmann, Nuwan Jayasena, Anthony Gutierrez
  • Publication number: 20220083233
    Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
    Type: Application
    Filed: October 8, 2021
    Publication date: March 17, 2022
    Inventors: Seyed Mohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das
  • Publication number: 20220066940
    Abstract: A processing system limits the propagation of unnecessary memory updates by bypassing writing back dirty cache lines to other levels of a memory hierarchy in response to receiving an indication from software executing at a processor of the processing system that the value of the dirty cache line is dead (i.e., will not be read again or will not be read until after it has been overwritten). In response to receiving an indication from software that data is dead, a cache controller prevents propagation of the dead data to other levels of memory in response to eviction of the dead data or flushing of the cache at which the dead data is stored.
    Type: Application
    Filed: August 31, 2020
    Publication date: March 3, 2022
    Inventors: Johnathan Alsop, Pouya Fotouhi, Bradford Beckmann, Sergey Blagodurov
  • Publication number: 20210373975
    Abstract: A processing system monitors and synchronizes parallel execution of workgroups (WGs). One or more of the WGs perform (e.g., periodically or in response to a trigger such as an indication of oversubscription) a waiting atomic instruction. In response to a comparison between an atomic value produced as a result of the waiting atomic instruction and an expected value, WGs that fail to produce a correct atomic value are identified as being in a waiting state (e.g., waiting for a synchronization variable). Execution of WGs in the waiting state is prevented (e.g., by a context switch) until corresponding synchronization variables are released.
    Type: Application
    Filed: September 23, 2020
    Publication date: December 2, 2021
    Inventors: Alexandru Dutu, Matthew David Sinclair, Bradford Beckmann, David A. Wood
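
As a rough illustration of the row-grouping idea in the abstract for publication number 20250123846 at the top of this listing, the sketch below counts the non-zero elements in each row of a sparse matrix, assigns rows to collections by that count, and then runs a matrix-vector product one collection at a time. It is only a sketch under stated assumptions, not the claimed method: the CSR layout, the power-of-two grouping rule, and the names `bin_rows_by_nnz` and `spmv_binned` are all hypothetical, and the loops run sequentially where the abstract describes the assignment and the per-collection operations as happening in parallel.

```python
from collections import defaultdict


def bin_rows_by_nnz(indptr):
    """Group CSR row indices into collections keyed by non-zero count.

    Assumption: rows are grouped by the bit length of their non-zero count
    (0 -> empty rows, 1 -> one non-zero, 2 -> two or three, 3 -> four to
    seven, ...). The abstract only says rows are assigned "according to"
    that count; the exact rule here is illustrative.
    """
    bins = defaultdict(list)
    for row in range(len(indptr) - 1):
        nnz = indptr[row + 1] - indptr[row]  # non-zeros stored for this row
        bins[nnz.bit_length()].append(row)
    return dict(bins)


def spmv_binned(indptr, indices, data, x):
    """Compute y = A @ x, visiting rows one collection at a time.

    Rows in the same collection have similar work per row, so on parallel
    hardware each collection could be handed to a separate group of cores.
    This version simply loops over the collections in order.
    """
    y = [0.0] * (len(indptr) - 1)
    for _, rows in sorted(bin_rows_by_nnz(indptr).items()):
        for row in rows:
            start, end = indptr[row], indptr[row + 1]
            y[row] = sum(data[k] * x[indices[k]] for k in range(start, end))
    return y


# 4x3 example matrix in CSR form:
# [[1, 0, 2],
#  [0, 0, 0],
#  [0, 3, 0],
#  [4, 5, 6]]
indptr = [0, 2, 2, 3, 6]
indices = [0, 2, 1, 0, 1, 2]
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

groups = bin_rows_by_nnz(indptr)
result = spmv_binned(indptr, indices, data, [1.0, 1.0, 1.0])
```

Grouping rows of similar length is a common way to balance work across cores, since one very long row would otherwise stall the cores that were given short ones.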