Patents by Inventor Rex Eldon MCCRARY

Rex Eldon MCCRARY has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250191111
    Abstract: A plurality of programmable processing cores is configured to process graphics primitives and corresponding data, together with a plurality of fixed-function hardware units. The processing cores and the fixed-function hardware units are configured to implement a configurable number of virtual pipelines. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.
    Type: Application
    Filed: February 25, 2025
    Publication date: June 12, 2025
    Inventors: Timour T. PALTASHEV, Michael MANTOR, Rex Eldon MCCRARY
  • Publication number: 20240394829
    Abstract: A primary processing unit includes queues configured to store commands prior to execution in corresponding pipelines. The primary processing unit also includes a first table configured to store entries indicating dependencies between commands that are to be executed on different ones of a plurality of processing units that include the primary processing unit and one or more secondary processing units. The primary processing unit also includes a scheduler configured to release commands in response to resolution of the dependencies. In some cases, a first one of the secondary processing units schedules a first command for execution in response to resolution of a dependency on a second command executing in a second one of the secondary processing units. The second one of the secondary processing units notifies the primary processing unit in response to completing execution of the second command.
    Type: Application
    Filed: May 29, 2024
    Publication date: November 28, 2024
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20240320042
    Abstract: A first processing unit such as a graphics processing unit (GPU) includes pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.
    Type: Application
    Filed: March 6, 2024
    Publication date: September 26, 2024
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20230094639
    Abstract: A first processing unit such as a graphics processing unit (GPU) includes pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.
    Type: Application
    Filed: September 16, 2022
    Publication date: March 30, 2023
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20220188963
    Abstract: A processing system includes a graphics pipeline that executes a first shader of a first type and a second shader of a second type. In some cases, the first shader is a geometry shader and the second shader is a pixel shader. The processing system also includes buffers that hold primitives generated by the first shader and provide the primitives to the second shader. The processing system also includes a primitive hub that monitors fullness of the buffers. Launching of waves from the first shader is throttled based on the fullness of the buffers. A shader processor input (SPI) selectively throttles the waves launched by the geometry shader based on a signal from the primitive hub indicating the fullness, an indication of relative resource usage of geometry waves and pixel waves in the graphics pipeline, or an indication of lifetimes of the geometry waves.
    Type: Application
    Filed: December 16, 2020
    Publication date: June 16, 2022
    Inventors: Nishank PATHAK, Randy Wayne RAMSEY, Tad LITWILLER, Rex Eldon MCCRARY
  • Publication number: 20220091847
    Abstract: In response to executing a specified command packet, a processing unit prefetches commands stored at an indirect buffer into a command queue for execution, prior to executing a command that initiates execution of the commands stored at the indirect buffer. By prefetching the commands prior to executing the indirect buffer execution command, the processing unit reduces delays in processing the commands stored at the indirect buffer.
    Type: Application
    Filed: September 23, 2020
    Publication date: March 24, 2022
    Inventors: Alexander Fuad ASHKAR, Harry J. WISE, Rex Eldon MCCRARY, Hans FERNLUND
  • Publication number: 20210272229
    Abstract: An apparatus such as a graphics processing unit (GPU) includes shader engines and front end (FE) circuits. Subsets of the FE circuits are configured to schedule commands for execution on corresponding subsets of the shader engines. The apparatus also includes a set of physical paths configured to convey information from the FE circuits to a memory via the shader engines. Subsets of the physical paths are allocated to the subsets of the FE circuits and the corresponding subsets of the shader engines. The apparatus further includes a scheduler configured to receive a reconfiguration request and modify the set of physical paths based on the reconfiguration request. In some cases, the reconfiguration request is provided by a central processing unit (CPU) that requests the modification based on characteristics of applications generating the commands.
    Type: Application
    Filed: February 28, 2020
    Publication date: September 2, 2021
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20210272347
    Abstract: An apparatus such as a graphics processing unit (GPU) includes a set of shader engines and a set of front end (FE) circuits. Subsets of the set of FE circuits schedule geometry workloads for subsets of the set of shader engines based on a mapping. The apparatus also includes a set of physical paths that convey information from the set of FE circuits to a memory via the set of shader engines. Subsets of the set of physical paths are allocated to the subsets of the set of FE circuits and the subsets of the set of shader engines based on the mapping. The mapping determines information stored in a set of registers used to configure the apparatus. In some cases, the set of registers store information indicating a spatial partitioning of the set of physical paths.
    Type: Application
    Filed: February 28, 2020
    Publication date: September 2, 2021
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20210192672
    Abstract: A primary processing unit includes queues configured to store commands prior to execution in corresponding pipelines. The primary processing unit also includes a first table configured to store entries indicating dependencies between commands that are to be executed on different ones of a plurality of processing units that include the primary processing unit and one or more secondary processing units. The primary processing unit also includes a scheduler configured to release commands in response to resolution of the dependencies. In some cases, a first one of the secondary processing units schedules a first command for execution in response to resolution of a dependency on a second command executing in a second one of the secondary processing units. The second one of the secondary processing units notifies the primary processing unit in response to completing execution of the second command.
    Type: Application
    Filed: December 19, 2019
    Publication date: June 24, 2021
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20210191730
    Abstract: A processing system includes a set of queues to store command buffers prior to execution in a corresponding plurality of pipelines. The processing system also includes one or more first doorbells and a second doorbell. The first doorbells map to one or more queues in the set of queues on a one-to-one basis. The second doorbell maps to a subset of the set of queues on a one-to-many basis. A doorbell monitor generates an interrupt in response to an empty queue in the subset becoming a non-empty queue. A scheduler polls the subset in response to the interrupt. The scheduler schedules a command buffer from the non-empty queue for execution or adds the command buffer to a pool for subsequent execution.
    Type: Application
    Filed: December 19, 2019
    Publication date: June 24, 2021
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20210191793
    Abstract: A processing unit such as a graphics processing unit (GPU) includes a set of queues that stores command buffers prior to execution in a corresponding set of pipelines. The processing unit also implements a kernel mode driver that allocates a first subset of the set of queues to a first application in response to receiving registration requests from the first application. The processing unit further includes a scheduler that schedules command buffers in the first subset of the set of queues for concurrent execution on a first subset of the set of pipelines. In some cases, an interrupt is generated in response to execution of a first command in a first command buffer in a first queue or a second queue. The interrupt includes an address indicating a location of a routine to be executed by a second subset of the set of pipelines.
    Type: Application
    Filed: December 19, 2019
    Publication date: June 24, 2021
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20210191771
    Abstract: A first processing unit such as a graphics processing unit (GPU) includes pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.
    Type: Application
    Filed: December 19, 2019
    Publication date: June 24, 2021
    Inventor: Rex Eldon MCCRARY
  • Publication number: 20210049729
    Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.
    Type: Application
    Filed: May 21, 2020
    Publication date: February 18, 2021
    Inventors: Timour T. PALTASHEV, Michael MANTOR, Rex Eldon MCCRARY
  • Publication number: 20200379767
    Abstract: A method of context bouncing includes receiving, at a command processor of a graphics processing unit (GPU), a conditional execute packet providing a hash identifier corresponding to an encapsulated state. The encapsulated state includes one or more context state packets following the conditional execute packet. A command packet following the encapsulated state is executed based at least in part on determining whether the hash identifier of the encapsulated state matches one of a plurality of hash identifiers of active context states currently stored at the GPU.
    Type: Application
    Filed: May 30, 2019
    Publication date: December 3, 2020
    Inventors: Rex Eldon MCCRARY, Yi LUO, Harry J. WISE, Alexander Fuad ASHKAR, Michael MANTOR
  • Publication number: 20200379792
    Abstract: A processing unit employs microcode wherein the jump table associated with the microcode is embedded in the microcode itself. When the microcode is compiled based on a set of programmer instructions, the compiler prepares the jump table for the microcode and stores the jump table in the same file or other storage unit as the microcode. When the processing unit is initialized to execute a program, such as an operating system, the processing unit retrieves the microcode corresponding to the program from memory, stores the microcode in a cache or other memory module for execution, and automatically loads the embedded jump table from the microcode to a specified set of jump table registers, thereby preparing the processing unit to process received packets.
    Type: Application
    Filed: May 31, 2019
    Publication date: December 3, 2020
    Inventors: Alexander Fuad ASHKAR, Rakan KHRAISHA, Rex Eldon MCCRARY, Harry J. WISE
  • Publication number: 20090172677
    Abstract: The present invention provides an efficient state management system for a complex ASIC, and applications thereof. In an embodiment, a computer-based system executes state-dependent processes. The computer-based system includes a command processor (CP) and a plurality of processing blocks. The CP receives commands in a command stream and manages a global state responsive to global context events in the command stream. The plurality of processing blocks receive the commands in the command stream and manage respective block states responsive to block context events in the command stream. Each respective processing block executes a process on data in a data stream based on the global state and the block state of the respective processing block.
    Type: Application
    Filed: December 22, 2008
    Publication date: July 2, 2009
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael MANTOR, Rex Eldon MCCRARY
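
The doorbell arrangement summarized in publication 20210191730 above (one-to-one doorbells for some queues, one shared doorbell for a subset of queues, an interrupt on the empty-to-non-empty transition, and a scheduler that polls the subset) can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the patented implementation: the class name `DoorbellModel`, its methods, and the `pipeline_free` flag are all hypothetical and chosen for illustration.

```python
from collections import deque


class DoorbellModel:
    """Toy model (hypothetical): queues, one-to-one doorbells, one shared doorbell."""

    def __init__(self, num_queues, shared_subset):
        self.queues = [deque() for _ in range(num_queues)]
        self.shared_subset = set(shared_subset)  # queues behind the shared doorbell
        self.interrupts = []  # queue ids whose transition raised an interrupt
        self.scheduled = []   # command buffers handed to a pipeline
        self.pool = []        # command buffers deferred for later execution

    def submit(self, queue_id, command_buffer):
        """Enqueue a command buffer and ring the doorbell for that queue."""
        was_empty = not self.queues[queue_id]
        self.queues[queue_id].append(command_buffer)
        if queue_id in self.shared_subset:
            # Shared doorbell: the monitor raises an interrupt only when
            # an empty queue becomes non-empty.
            if was_empty:
                self.interrupts.append(queue_id)
        else:
            # One-to-one doorbell: the queue is known, so dispatch directly.
            self._dispatch(queue_id)

    def service_interrupts(self, pipeline_free=True):
        """For each pending interrupt, poll every queue in the shared subset."""
        while self.interrupts:
            self.interrupts.pop()
            for queue_id in sorted(self.shared_subset):
                while self.queues[queue_id]:
                    self._dispatch(queue_id, pipeline_free)

    def _dispatch(self, queue_id, pipeline_free=True):
        # Assumption: a busy pipeline sends the buffer to the pool instead.
        command_buffer = self.queues[queue_id].popleft()
        target = self.scheduled if pipeline_free else self.pool
        target.append((queue_id, command_buffer))
```

In this model the scheduler has to poll the subset because the shared doorbell does not identify which queue was written, which is the trade-off a one-to-many mapping introduces compared with a dedicated doorbell per queue.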