Patents by Inventor Mark Leather

Mark Leather has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12361628
    Abstract: A graphics processing unit (GPU) of a processing system is partitioned into multiple dies (referred to as GPU chiplets) that are configurable to collectively function and interface with an application as a single GPU in a first mode and as multiple GPUs in a second mode. By dividing the GPU into multiple GPU chiplets, the processing system flexibly and cost-effectively configures an amount of active GPU physical resources based on an operating mode. In addition, a configurable number of GPU chiplets are assembled into a single GPU, such that multiple different GPUs having different numbers of GPU chiplets can be assembled using a small number of tape-outs and a multiple-die GPU can be constructed out of GPU chiplets that implement varying generations of technology.
    Type: Grant
    Filed: December 8, 2022
    Date of Patent: July 15, 2025
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark Fowler, Samuel Naffziger, Michael Mantor, Mark Leather
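    Illustrative sketch: The abstract describes exposing the same set of GPU chiplets either as one logical GPU or as several, depending on an operating mode. The short C++ sketch below shows one way a driver-level enumeration of that idea could look; the type names, grouping policy, and enumeration function are assumptions for illustration, not AMD's implementation.
    ```cpp
    #include <cstdio>
    #include <vector>

    // Hypothetical driver-side view of the two modes in the abstract: the same
    // physical chiplets are presented either as a single logical GPU (first mode)
    // or as multiple logical GPUs (second mode). Names and policy are assumptions.
    enum class GpuMode { SingleLogicalGpu, MultipleLogicalGpus };

    struct Chiplet { int die_id; };
    struct LogicalGpu { std::vector<Chiplet> chiplets; };

    std::vector<LogicalGpu> enumerate_gpus(const std::vector<Chiplet>& dies, GpuMode mode) {
        std::vector<LogicalGpu> gpus;
        if (mode == GpuMode::SingleLogicalGpu) {
            gpus.push_back(LogicalGpu{dies});          // all dies behind one GPU interface
        } else {
            for (const Chiplet& die : dies)            // each die presented as its own GPU
                gpus.push_back(LogicalGpu{{die}});
        }
        return gpus;
    }

    int main() {
        std::vector<Chiplet> dies{{0}, {1}, {2}, {3}};
        for (GpuMode m : {GpuMode::SingleLogicalGpu, GpuMode::MultipleLogicalGpus})
            std::printf("mode %d -> %zu logical GPU(s)\n",
                        static_cast<int>(m), enumerate_gpus(dies, m).size());
    }
    ```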
  • Publication number: 20250209006
    Abstract: A technique for improving performance of a hash operation on a processor is provided, in which an input value is hashed into a second value corresponding to a number of bins. The number of bins is an integer that corresponds to a product of first and second integers, the first integer corresponding to a prime number and the second integer corresponding to a power of two. A first modulo hashing operation is performed in which the input value is hashed into the first integer. A second hashing operation is performed using less than all bits of the input value. An output value is formed by concatenating a result of the first hashing operation with a result of the second hashing operation.
    Type: Application
    Filed: December 20, 2023
    Publication date: June 26, 2025
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Jeffrey Christopher Allan, Mark Leather
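    Illustrative sketch: The abstract describes hashing an input into N = p × 2^k bins by combining a modulo-by-prime hash over the full input with a second, cheaper hash over only some of the input bits, then concatenating the two results. The C++ sketch below is a minimal rendering of that idea; the particular bin counts, the choice of bits feeding the second hash, and the mixing step are assumptions for illustration, not the claimed implementation.
    ```cpp
    #include <cstdint>
    #include <cstdio>

    // Hash into N = kPrime * 2^kLowBits bins by concatenating two partial hashes:
    //   h1 = full input modulo a prime           -> selects one of kPrime groups
    //   h2 = cheap hash of only some input bits  -> selects a bin within the group
    // The exact bit selection and mixing below are illustrative assumptions.
    constexpr uint32_t kPrime   = 13;                  // first integer (prime)
    constexpr uint32_t kLowBits = 4;                   // second integer is 2^4 = 16
    constexpr uint32_t kBins    = kPrime << kLowBits;  // 13 * 16 = 208 bins

    uint32_t hash_bin(uint64_t input) {
        // First hashing operation: modulo by the prime over the whole input.
        uint32_t h1 = static_cast<uint32_t>(input % kPrime);

        // Second hashing operation: uses fewer than all bits of the input
        // (here, just the low 16 bits), folded down to kLowBits bits.
        uint32_t low = static_cast<uint32_t>(input & 0xFFFFu);
        uint32_t h2 = (low ^ (low >> kLowBits)) & ((1u << kLowBits) - 1);

        // Concatenate: h1 in the high field, h2 in the low field.
        return (h1 << kLowBits) | h2;                  // always in [0, kBins)
    }

    int main() {
        for (uint64_t v : {0ull, 1ull, 12345ull, 0xDEADBEEFull})
            std::printf("%llu -> bin %u of %u\n",
                        static_cast<unsigned long long>(v), hash_bin(v), kBins);
    }
    ```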
  • Patent number: 12299413
    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands collected from vector general-purpose register (VGPR) banks at a cache, and output the results of those instructions at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
    Type: Grant
    Filed: January 16, 2024
    Date of Patent: May 13, 2025
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Bin He, Brian Emberling, Mark Leather, Michael Mantor
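    Illustrative sketch: The abstract describes an operand cache between the VGPR banks and multiple ALU pipelines, so operands already collected at the cache do not consume VGPR read bandwidth and both pipelines can issue in the same cycle. The toy C++ model below makes that bandwidth argument concrete; the cache structure, the two-operand instruction shape, and the per-cycle read budget are assumptions for illustration.
    ```cpp
    #include <array>
    #include <cstdint>
    #include <cstdio>
    #include <unordered_map>

    // Toy model: operands already captured in a small cache in front of the VGPR
    // banks cost no VGPR read bandwidth, so two ALU pipelines can both issue in
    // one cycle as long as the *uncached* operand reads fit the per-cycle budget.
    struct OperandCache {
        std::unordered_map<uint32_t, uint64_t> lines;   // VGPR index -> cached value
        bool hit(uint32_t vgpr) const { return lines.count(vgpr) != 0; }
        void fill(uint32_t vgpr, uint64_t value) { lines[vgpr] = value; }
    };

    // How many of the two pipelines can issue this cycle, given each pipeline's
    // two operand VGPR indices and a per-cycle VGPR read budget.
    int issuable_pipes(const OperandCache& cache,
                       const std::array<uint32_t, 2> ops[2],
                       int vgpr_reads_per_cycle) {
        int issued = 0, reads_used = 0;
        for (int pipe = 0; pipe < 2; ++pipe) {
            int needed = 0;
            for (uint32_t v : ops[pipe])
                if (!cache.hit(v)) ++needed;            // only misses read the banks
            if (reads_used + needed <= vgpr_reads_per_cycle) {
                reads_used += needed;
                ++issued;
            }
        }
        return issued;
    }

    int main() {
        OperandCache cache;
        cache.fill(4, 0); cache.fill(5, 0);             // operands already collected
        std::array<uint32_t, 2> ops[2] = {{{4, 5}}, {{6, 7}}};
        std::printf("pipelines issued this cycle: %d\n",
                    issuable_pipes(cache, ops, /*vgpr_reads_per_cycle=*/2));
    }
    ```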
  • Patent number: 12205218
    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
    Type: Grant
    Filed: March 29, 2022
    Date of Patent: January 21, 2025
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark Leather, Michael Mantor
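    Illustrative sketch: The abstract describes a partition switch that either lets a single front end (FE) circuit schedule geometry work for every shader engine or splits the shader engines between the first FE and one or more additional FEs. The C++ sketch below models that routing decision as a table; the split policy and all names are assumptions for illustration.
    ```cpp
    #include <cstdio>
    #include <vector>

    // In the first mode one front end (FE 0) drives every shader engine; in the
    // second mode the engines are split between FE 0 and the additional FEs.
    // The even split below is an illustrative assumption.
    enum class Mode { Unified, Partitioned };

    // Returns, for each shader engine index, which FE circuit schedules its work.
    std::vector<int> route_front_ends(int num_shader_engines, int num_extra_fes, Mode mode) {
        std::vector<int> fe_for_engine(num_shader_engines, 0);   // FE 0 is the first FE
        if (mode == Mode::Partitioned) {
            int first_subset = num_shader_engines / (num_extra_fes + 1);
            for (int se = first_subset; se < num_shader_engines; ++se)
                fe_for_engine[se] = 1 + (se - first_subset) % num_extra_fes;
        }
        return fe_for_engine;
    }

    int main() {
        for (Mode m : {Mode::Unified, Mode::Partitioned}) {
            std::printf("%s mode:", m == Mode::Unified ? "unified" : "partitioned");
            for (int fe : route_front_ends(/*engines=*/4, /*extra FEs=*/1, m))
                std::printf(" FE%d", fe);
            std::printf("\n");
        }
    }
    ```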
  • Publication number: 20240311199
    Abstract: Program code executing on a processing system includes one or more instructions, each identifying a workgroup that includes a plurality of waves and identifying resource allocations for the plurality of waves of the workgroup. In response to receiving an instruction identifying a workgroup and resource allocations for the plurality of waves of the workgroup, a processor allocates a first set of processing resources to a compute unit of the processor based on the resource allocations for the plurality of waves. The compute unit then performs operations for the workgroup using the allocated set of processing resources.
    Type: Application
    Filed: March 13, 2023
    Publication date: September 19, 2024
    Inventors: Nicolai Haehnle, Mark Leather, Brian Emberling, Michael John Bedy, Daniel Schneider
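    Illustrative sketch: The abstract describes instructions that identify a workgroup's waves together with the resources those waves need, so the processor can reserve resources on a compute unit before the workgroup runs. The C++ sketch below shows a hypothetical launch descriptor and allocation check; the field names, resource types, and capacity limits are assumptions for illustration.
    ```cpp
    #include <cstdio>

    // Hypothetical encoding of the instruction described in the abstract: it names
    // a workgroup (a set of waves) and the resources each wave needs, so the
    // processor can reserve them on a compute unit up front. Limits are assumed.
    struct WorkgroupLaunch {
        int waves;            // number of waves in the workgroup
        int vgprs_per_wave;   // vector registers requested per wave
        int lds_bytes;        // shared (LDS) memory for the whole workgroup
    };

    struct ComputeUnit {
        int free_vgprs = 1024;
        int free_lds_bytes = 64 * 1024;

        // Allocate the first set of processing resources for the workgroup, or
        // report that the workgroup does not fit on this compute unit.
        bool allocate(const WorkgroupLaunch& wg) {
            int vgprs_needed = wg.waves * wg.vgprs_per_wave;
            if (vgprs_needed > free_vgprs || wg.lds_bytes > free_lds_bytes) return false;
            free_vgprs -= vgprs_needed;
            free_lds_bytes -= wg.lds_bytes;
            return true;      // the compute unit can now run the workgroup's waves
        }
    };

    int main() {
        ComputeUnit cu;
        WorkgroupLaunch wg{/*waves=*/8, /*vgprs_per_wave=*/64, /*lds_bytes=*/16 * 1024};
        std::printf("workgroup allocated: %s\n", cu.allocate(wg) ? "yes" : "no");
    }
    ```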
  • Publication number: 20240193844
    Abstract: A graphics processing unit (GPU) of a processing system is partitioned into multiple dies (referred to as GPU chiplets) that are configurable to collectively function and interface with an application as a single GPU in a first mode and as multiple GPUs in a second mode. By dividing the GPU into multiple GPU chiplets, the processing system flexibly and cost-effectively configures an amount of active GPU physical resources based on an operating mode. In addition, a configurable number of GPU chiplets are assembled into a single GPU, such that multiple different GPUs having different numbers of GPU chiplets can be assembled using a small number of tape-outs and a multiple-die GPU can be constructed out of GPU chiplets that implement varying generations of technology.
    Type: Application
    Filed: December 8, 2022
    Publication date: June 13, 2024
    Inventors: Mark Fowler, Samuel Naffziger, Michael Mantor, Mark Leather
  • Publication number: 20240168719
    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands collected from vector general-purpose register (VGPR) banks at a cache, and output the results of those instructions at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
    Type: Application
    Filed: January 16, 2024
    Publication date: May 23, 2024
    Inventors: Bin He, Brian Emberling, Mark Leather, Michael Mantor
  • Patent number: 11675568
    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands collected from vector general-purpose register (VGPR) banks at a cache, and output the results of those instructions at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: June 13, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Bin He, Brian Emberling, Mark Leather, Michael Mantor
  • Publication number: 20220237851
    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
    Type: Application
    Filed: March 29, 2022
    Publication date: July 28, 2022
    Inventors: Mark Leather, Michael Mantor
  • Publication number: 20220188076
    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands collected from vector general-purpose register (VGPR) banks at a cache, and output the results of those instructions at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
    Type: Application
    Filed: December 14, 2020
    Publication date: June 16, 2022
    Inventors: Bin He, Brian Emberling, Mark Leather, Michael Mantor
  • Patent number: 11295507
    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: April 5, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark Leather, Michael Mantor
  • Publication number: 20210241516
    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
    Type: Application
    Filed: November 6, 2020
    Publication date: August 5, 2021
    Inventors: Mark Leather, Michael Mantor
  • Patent number: 10579388
    Abstract: A method for use in a processor for arbitrating between multiple processes to select wavefronts for execution on a shader core is provided. The processor includes a compute pipeline configured to issue wavefronts to the shader core for execution, a hardware queue descriptor associated with the compute pipeline, and the shader core. The shader core is configured to execute work for the compute pipeline corresponding to a first memory queue descriptor, using data for the first memory queue descriptor that is loaded into a first hardware queue descriptor. The processor is configured to detect a context switch condition and, responsive to the context switch condition, perform a context switch operation including loading data for a second memory queue descriptor into the first hardware queue descriptor. The shader core is configured to execute work corresponding to the second memory queue descriptor that is loaded into the first hardware queue descriptor.
    Type: Grant
    Filed: July 19, 2018
    Date of Patent: March 3, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Mark Leather, Michael Mantor, Rex McCrary, Sebastien Nussbaum, Philip J. Rogers, Ralph Clay Taylor, Thomas Woller
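    Illustrative sketch: The abstract describes queue state held in memory queue descriptors (MQDs), with the compute pipeline executing whichever MQD is currently loaded into its hardware queue descriptor (HQD); a context switch writes the active state back and loads another MQD. The C++ sketch below models that save/load step; the descriptor fields and the switch trigger are assumptions for illustration.
    ```cpp
    #include <cstdint>
    #include <cstdio>

    struct MemoryQueueDescriptor {      // persistent queue state, held in memory
        uint64_t ring_base;
        uint32_t read_ptr;
        uint32_t write_ptr;
    };

    struct HardwareQueueDescriptor {    // the compute pipeline's single hardware slot
        MemoryQueueDescriptor state{};
        MemoryQueueDescriptor* backing_mqd = nullptr;

        // On a context switch condition: save the active queue back to its MQD,
        // then load the next MQD into this HQD so the shader core works on it.
        void context_switch(MemoryQueueDescriptor* next) {
            if (backing_mqd) *backing_mqd = state;
            state = *next;
            backing_mqd = next;
        }
    };

    int main() {
        MemoryQueueDescriptor q0{0x1000, 0, 4}, q1{0x2000, 2, 8};
        HardwareQueueDescriptor hqd;
        hqd.context_switch(&q0);        // shader core now executes work for q0
        hqd.context_switch(&q1);        // context switch condition: move to q1
        std::printf("active ring base: 0x%llx\n",
                    static_cast<unsigned long long>(hqd.state.ring_base));
    }
    ```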
  • Patent number: 10242420
    Abstract: Methods and apparatus are described. A method includes an accelerated processing device running a process. When a maximum time interval during which the process is permitted to run expires before the process completes, the accelerated processing device receives an operating-system-initiated instruction to stop running the process. The accelerated processing device stops the process from running in response to the received operating-system-initiated instruction.
    Type: Grant
    Filed: November 28, 2016
    Date of Patent: March 26, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clayton Taylor, Michael Mantor, Kevin John McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas Woller
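    Illustrative sketch: The abstract describes the operating system stopping a process on the accelerated processing device (APD) when a maximum permitted running interval expires before the process completes. The C++ sketch below shows a host-side watchdog expressing that policy; the polling loop and the stop hook are assumptions for illustration (a real driver would use interrupts or doorbells rather than a sleeping thread).
    ```cpp
    #include <chrono>
    #include <cstdio>
    #include <thread>

    struct ApdProcess {
        const char* name;
        bool finished = false;
    };

    // Stand-in for the operating-system-initiated instruction to stop the process.
    void send_stop_instruction(const ApdProcess& p) {
        std::printf("OS: stopping '%s' on the APD\n", p.name);
    }

    // Let the process run for at most max_interval; if it has not completed by
    // then, tell the APD to stop running it.
    void watchdog(const ApdProcess& p, std::chrono::milliseconds max_interval) {
        auto start = std::chrono::steady_clock::now();
        while (!p.finished) {
            if (std::chrono::steady_clock::now() - start >= max_interval) {
                send_stop_instruction(p);              // maximum interval expired
                return;
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        std::printf("'%s' completed within its interval\n", p.name);
    }

    int main() {
        ApdProcess p{"long_running_kernel"};
        watchdog(p, std::chrono::milliseconds(5));     // never finishes -> gets stopped
    }
    ```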
  • Publication number: 20180321946
    Abstract: A method for use in a processor for arbitrating between multiple processes to select wavefronts for execution on a shader core is provided. The processor includes a compute pipeline configured to issue wavefronts to the shader core for execution, a hardware queue descriptor associated with the compute pipeline, and the shader core. The shader core is configured to execute work for the compute pipeline corresponding to a first memory queue descriptor, using data for the first memory queue descriptor that is loaded into a first hardware queue descriptor. The processor is configured to detect a context switch condition and, responsive to the context switch condition, perform a context switch operation including loading data for a second memory queue descriptor into the first hardware queue descriptor. The shader core is configured to execute work corresponding to the second memory queue descriptor that is loaded into the first hardware queue descriptor.
    Type: Application
    Filed: July 19, 2018
    Publication date: November 8, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Mark Leather, Michael Mantor, Rex McCrary, Sebastien Nussbaum, Philip J. Rogers, Ralph Clay Taylor, Thomas Woller
  • Publication number: 20170076421
    Abstract: Methods and apparatus are described. A method includes an accelerated processing device running a process. When a maximum time interval during which the process is permitted to run expires before the process completes, the accelerated processing device receives an operating-system-initiated instruction to stop running the process. The accelerated processing device stops the process from running in response to the received operating-system-initiated instruction.
    Type: Application
    Filed: November 28, 2016
    Publication date: March 16, 2017
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clayton Taylor, Michael Mantor, Kevin John McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas Woller
  • Patent number: 9507632
    Abstract: Methods, systems, and computer-readable media are described for preemptive context switching of processes on an accelerated processing device, based upon a comparison of a process's running time with a threshold time quantum. A method includes preempting a process running on an accelerated processing device based upon the running time of the process and the threshold time quantum.
    Type: Grant
    Filed: November 4, 2011
    Date of Patent: November 29, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Kevin McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip Rogers, Thomas Woller
  • Patent number: 9329893
    Abstract: A method resumes an accelerated processing device (APD) wavefront in which a subset of elements have faulted. A restore command for a job including a wavefront is received. A list of context states for the wavefront is read from a memory associated with an APD. An empty shell wavefront is created for restoring the list of context states. A portion of unacknowledged data is masked over a portion of acknowledged data within the restored wavefronts.
    Type: Grant
    Filed: December 14, 2011
    Date of Patent: May 3, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Thomas R. Woller, Kevin McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Philip J. Rogers, Mark Leather
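    Illustrative sketch: The abstract describes restoring a faulted wavefront by reading its saved context states back into an empty shell wavefront and using a mask so that only the lanes whose accesses were not acknowledged are replayed. The C++ sketch below shows that mask step on a 32-lane wavefront; the lane width, field names, and replay policy are assumptions for illustration.
    ```cpp
    #include <bitset>
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    constexpr int kLanes = 32;

    struct WavefrontContext {                    // state saved when the job was halted
        std::vector<uint32_t> saved_regs;        // per-lane context read from memory
        std::bitset<kLanes> acked;               // lanes whose accesses were acknowledged
    };

    struct Wavefront {
        std::vector<uint32_t> regs = std::vector<uint32_t>(kLanes, 0);  // empty shell
        std::bitset<kLanes> exec;                // lanes that must re-execute
    };

    // Restore: fill the empty shell with the saved context states, then mask the
    // not-acknowledged lanes over the acknowledged ones so only they replay.
    Wavefront restore(const WavefrontContext& ctx) {
        Wavefront w;
        w.regs = ctx.saved_regs;
        w.exec = ~ctx.acked;
        return w;
    }

    int main() {
        WavefrontContext ctx{std::vector<uint32_t>(kLanes, 7), std::bitset<kLanes>()};
        ctx.acked.set();                         // every lane acknowledged...
        ctx.acked.reset(3);                      // ...except lane 3, which faulted
        std::printf("lanes to replay: %zu\n", restore(ctx).exec.count());
    }
    ```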
  • Patent number: 9299121
    Abstract: Methods, systems, and computer-readable media are disclosed for preemptive context switching of processes running on an accelerated processing device. Embodiments include detecting, by an accelerated processing device, a memory exception, and preempting a process from running on the accelerated processing device based upon the detected exception.
    Type: Grant
    Filed: November 4, 2011
    Date of Patent: March 29, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Kevin McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas R. Woller
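    Illustrative sketch: The abstract describes preempting a process from the accelerated processing device when a memory exception is detected. The C++ sketch below separates detecting the exception from deciding whether it warrants preemption; the exception categories and the policy are assumptions for illustration, not the claimed behavior.
    ```cpp
    #include <cstdio>

    enum class MemoryException { RecoverablePageFault, ProtectionViolation };

    struct ApdContext {
        const char* process;
        bool preempted = false;
    };

    // Illustrative policy: a fault the device can service in place does not evict
    // the process; a protection violation does.
    bool should_preempt(MemoryException e) {
        return e == MemoryException::ProtectionViolation;
    }

    void on_memory_exception(ApdContext& ctx, MemoryException e) {
        if (should_preempt(e)) {
            ctx.preempted = true;      // context-switch the process off the device
            std::printf("preempting '%s' after memory exception\n", ctx.process);
        } else {
            std::printf("'%s' continues; exception handled in place\n", ctx.process);
        }
    }

    int main() {
        ApdContext ctx{"compute_job"};
        on_memory_exception(ctx, MemoryException::RecoverablePageFault);
        on_memory_exception(ctx, MemoryException::ProtectionViolation);
    }
    ```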
  • Patent number: 9256465
    Abstract: Methods, systems, and computer-readable media are disclosed for preemptive context switching of processes running on an accelerated processing device. A method includes, responsive to an exception upon access to a memory by a process running on an accelerated processing device, determining whether to preempt the process based on the exception, and preempting, based upon the determining, the process from running on the accelerated processing device.
    Type: Grant
    Filed: November 4, 2011
    Date of Patent: February 9, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Kevin McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas R. Woller