Patents by Inventor Rex Eldon MCCRARY
Rex Eldon MCCRARY has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250191111Abstract: A plurality of programmable processing cores is configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The processing cores and the fixed-function hardware units are configured to implement a configurable number of virtual pipelines. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.Type: ApplicationFiled: February 25, 2025Publication date: June 12, 2025Inventors: Timour T. PALTASHEV, Michael MANTOR, Rex Eldon MCCRARY
-
Publication number: 20240394829Abstract: A primary processing unit includes queues configured to store commands prior to execution in corresponding pipelines. The primary processing unit also includes a first table configured to store entries indicating dependencies between commands that are to be executed on different ones of a plurality of processing units that include the primary processing unit and one or more secondary processing units. The primary processing unit also includes a scheduler configured to release commands in response to resolution of the dependencies. In some cases, a first one of the secondary processing units schedules the first command for execution in response to resolution of a dependency on a second command executing in a second one of the secondary processing units. The second one of the secondary processing units notifies the primary processing unit in response to completing execution of the second command.Type: ApplicationFiled: May 29, 2024Publication date: November 28, 2024Inventor: Rex Eldon MCCRARY
-
Publication number: 20240320042Abstract: A first processing unit such as a graphics processing unit (GPU) pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.Type: ApplicationFiled: March 6, 2024Publication date: September 26, 2024Inventor: Rex Eldon MCCRARY
-
Publication number: 20230094639Abstract: A first processing unit such as a graphics processing unit (GPU) pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.Type: ApplicationFiled: September 16, 2022Publication date: March 30, 2023Inventor: Rex Eldon MCCRARY
-
Publication number: 20220188963Abstract: A processing system includes a graphics pipeline that executes a first shader of a first type and a second shader of a second type. In some cases, the first shader is a geometry shader and the second shader is a pixel shader. The processing system also includes buffers that hold primitives generated by the first shader and provide the primitives to the second shader. The processing system also includes a primitive hub that monitors fullness of the buffers. Launching of waves from the first shader is throttled based on the fullness of the buffers. A shader processor input (SPI) selectively throttles the waves launched by the geometry shader based on a signal from the primitive hub indicating the fullness, an indication of relative resource usage of geometry waves and pixel waves in the graphics pipeline, or an indication of lifetimes of the geometry waves.Type: ApplicationFiled: December 16, 2020Publication date: June 16, 2022Inventors: Nishank PATHAK, Randy Wayne RAMSEY, Tad LITWILLER, Rex Eldon MCCRARY
-
Publication number: 20220091847Abstract: In response to executing a specified command packet, a processing unit prefetches commands stored at an indirect buffer a command queue for execution, prior to executing a command that initiates execution of the commands stored at the indirect buffer. By prefetching the data prior to executing the indirect buffer execution command, the processing unit reduces delays in processing the commands stored at the indirect buffer.Type: ApplicationFiled: September 23, 2020Publication date: March 24, 2022Inventors: Alexander Fuad ASHKAR, Harry J. WISE, Rex Eldon MCCRARY, Hans FERNLUND
-
Publication number: 20210272229Abstract: An apparatus such as a graphics processing unit (GPU) includes shader engines and front end (FE) circuits. Subsets of the FE circuits are configured to schedule commands for execution on corresponding subsets of the shader engines. The apparatus also includes a set of physical paths configured to convey information from the FE circuits to a memory via the shader engines. Subsets of the physical paths are allocated to the subsets of the FE circuits and the corresponding subsets of the shader engines. The apparatus further includes a scheduler configured to receive a reconfiguration request and modify the set of physical paths based on the reconfiguration request. In some cases, the reconfiguration request is provided by a central processing unit (CPU) that requests the modification based on characteristics of applications generating the commands.Type: ApplicationFiled: February 28, 2020Publication date: September 2, 2021Inventor: Rex Eldon MCCRARY
-
Publication number: 20210272347Abstract: An apparatus such as a graphics processing unit (GPU) includes a set of shader engines and a set of front end (FE) circuits. Subsets of the set of FE circuits schedule geometry workloads for subsets of the set of shader engines based on a mapping. The apparatus also includes a set of physical paths that convey information from the set of FE circuits to a memory via the set of shader engines. Subsets of the set of physical paths are allocated to the subsets of the set of FE circuits and the subsets of the set of shader engines based on the mapping. The mapping determines information stored in a set of registers used to configure the apparatus. In some cases, the set of registers store information indicating a spatial partitioning of the set of physical paths.Type: ApplicationFiled: February 28, 2020Publication date: September 2, 2021Inventor: Rex Eldon MCCRARY
-
Publication number: 20210192672Abstract: A primary processing unit includes queues configured to store commands prior to execution in corresponding pipelines. The primary processing unit also includes a first table configured to store entries indicating dependencies between commands that are to be executed on different ones of a plurality of processing units that include the primary processing unit and one or more secondary processing units. The primary processing unit also includes a scheduler configured to release commands in response to resolution of the dependencies. In some cases, a first one of the secondary processing units schedules the first command for execution in response to resolution of a dependency on a second command executing in a second one of the secondary processing units. The second one of the secondary processing units notifies the primary processing unit in response to completing execution of the second command.Type: ApplicationFiled: December 19, 2019Publication date: June 24, 2021Inventor: Rex Eldon MCCRARY
-
Publication number: 20210191730Abstract: A processing system includes a set of queues to store command buffers prior to execution in a corresponding plurality of pipelines. The processing system also includes one or more first doorbells and a second doorbell. The first doorbells map to one or more queues in the set of queues on a one-to-one basis. The second doorbell maps to a subset of the set of queues on a one-to-many basis. A doorbell monitor generates an interrupt in response to an empty queue in the subset becoming a non-empty queue. A scheduler polls the subset in response to the interrupt. The scheduler schedules a command buffer from the non-empty queue for execution or adds the command buffer to a pool for subsequent execution.Type: ApplicationFiled: December 19, 2019Publication date: June 24, 2021Inventor: Rex Eldon MCCRARY
-
Publication number: 20210191793Abstract: A processing unit such as a graphics processing unit (GPU) includes a set of queues that stores command buffers prior to execution in a corresponding plurality of pipelines. The processing unit also implements a kernel mode driver that allocates a first subset of the set of queues to a first application in response to receiving registration requests from the first application. The processing unit further includes a scheduler that schedules command buffers in the first subset of the set of queues for concurrent execution on a first subset of the set of pipelines. In some cases, an interrupt is generated in response to execution of a first command in a first command buffer in the first queue or the second queue. The interrupt includes an address indicating a location of a routine to be executed by a second subset of the plurality of pipelines.Type: ApplicationFiled: December 19, 2019Publication date: June 24, 2021Inventor: Rex Eldon MCCRARY
-
Publication number: 20210191771Abstract: A first processing unit such as a graphics processing unit (GPU) pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.Type: ApplicationFiled: December 19, 2019Publication date: June 24, 2021Inventor: Rex Eldon MCCRARY
-
Publication number: 20210049729Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.Type: ApplicationFiled: May 21, 2020Publication date: February 18, 2021Inventors: Timour T. PALTASHEV, Michael MANTOR, Rex Eldon MCCRARY
-
Publication number: 20200379767Abstract: A method of context bouncing includes receiving, at a command processor of a graphics processing unit (GPU), a conditional execute packet providing a hash identifier corresponding to an encapsulated state. The encapsulated state includes one or more context state packets following the conditional execute packet. A command packet following the encapsulated state is executed based at least in part on determining whether the hash identifier of the encapsulated state matches one of a plurality of hash identifiers of active context states currently stored at the GPU.Type: ApplicationFiled: May 30, 2019Publication date: December 3, 2020Inventors: Rex Eldon MCCRARY, Yi LUO, Harry J. WISE, Alexander Fuad ASHKAR, Michael MANTOR
-
Publication number: 20200379792Abstract: A processing unit employs microcode wherein the jump table associated with the microcode is embedded in the microcode itself. When the microcode is compiled based on a set of programmer instructions, the compiler prepares the jump table for the microcode and stores the jump table in the same file or other storage unit as the microcode. When the processing unit is initialized to execute a program, such as an operating system, the processing unit retrieves the microcode corresponding to the program from memory, stores the microcode in a cache or other memory module for execution, and automatically loads the embedded jump table from the microcode to a specified set of jump table registers, thereby preparing the processing unit to process received packets.Type: ApplicationFiled: May 31, 2019Publication date: December 3, 2020Inventors: Alexander Fuad ASHKAR, Rakan KHRAISHA, Rex Eldon MCCRARY, Harry J. WISE
-
Publication number: 20090172677Abstract: The present invention provides an efficient state management system for a complex ASIC, and applications thereof. In an embodiment, a computer-based system executes state-dependent processes. The computer-based system includes a command processor (CP) and a plurality of processing blocks. The CP receives commands in a command stream and manages a global state responsive to global context events in the command stream. The plurality of processing blocks receive the commands in the command stream and manage respective block states responsive to block context events in the command stream. Each respective processing block executes a process on data in a data stream based on the global state and the block state of the respective processing block.Type: ApplicationFiled: December 22, 2008Publication date: July 2, 2009Applicant: Advanced Micro Devices, Inc.Inventors: Michael MANTOR, Rex Eldon MCCRARY