Patents by Inventor Michal Mrozek

Michal Mrozek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ASYNCHRONOUS INPUT DEPENDENCY RESOLUTION MECHANISM

Publication number: 20250175282

Abstract: Described herein is a graphics processor configured to perform asynchronous input dependency resolution among a group of interdependent workloads. The graphics processor can dynamically resolve input dependencies among the workloads according to a dependency relationship defined for the workloads. Dependency resolution can be performed via a deferred submission mode which resolves input dependencies prior to thread dispatch to the processing resources or via an immediate submission mode which resolves input dependencies at the processing resources.

Type: Application

Filed: December 3, 2024

Publication date: May 29, 2025

Applicant: Intel Corporation

Inventors: Michal Mrozek, Vinod Tipparaju
Multi-tile graphics processing unit

Patent number: 12217327

Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. In one embodiment, the apparatus includes a graphics processor comprising a first semiconductor die including a first high-bandwidth memory (HBM) device, a second semiconductor die including a second HBM device, and a third semiconductor die coupled with the first semiconductor die and the second semiconductor die in a 2.5-dimensional (2.5D) arrangement. The third semiconductor die includes a graphics processing resource and a cache coupled with the graphics processing resource. The cache is configurable to cache data associated with memory accessed by the graphics processing resource and the graphics processing resource includes a general-purpose graphics processor core and a tensor core.

Type: Grant

Filed: November 1, 2022

Date of Patent: February 4, 2025

Assignee: Intel Corporation

Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
Asynchronous input dependency resolution mechanism

Patent number: 12199759

Abstract: Described herein is a graphics processor configured to perform asynchronous input dependency resolution among a group of interdependent workloads. The graphics processor can dynamically resolve input dependencies among the workloads according to a dependency relationship defined for the workloads. Dependency resolution can be performed via a deferred submission mode which resolves input dependencies prior to thread dispatch to the processing resources or via an immediate submission mode which resolves input dependencies at the processing resources.

Type: Grant

Filed: March 7, 2022

Date of Patent: January 14, 2025

Assignee: Intel Corporation

Inventors: Michal Mrozek, Vinod Tipparaju
Concurrent workload scheduling with multiple level of dependencies

Patent number: 11907756

Abstract: A graphics processing apparatus includes at least a memory device and an execution unit coupled to the memory device. The memory device can store a command buffer with at least one command that is dependent on completion of at least one other command. The command buffer can include a jump command that causes a jump to a location in the command buffer to identify any unscheduled command. The execution unit is to jump to a location in the command buffer based on execution of the jump command. The execution unit is to perform one or more jumps to one or more locations in the command buffer to attempt to schedule a command with a dependency on completion of at least one other command until that command is scheduled.

Type: Grant

Filed: February 20, 2020

Date of Patent: February 20, 2024

Assignee: Intel Corporation

Inventors: Bartosz Dunajski, Brandon Fliflet, Michal Mrozek
CONCURRENT COMPUTE CONTEXT

Publication number: 20240054595

Abstract: Embodiments described herein provide a system of concurrent compute queues that enable the scheduling of a large number of compute contexts simultaneously on graphics processor hardware. One embodiment provides an apparatus comprising a system interface and a general-purpose graphics processor coupled with the system interface. The general-purpose graphics processor comprises a plurality of graphics processor hardware resources configured to be partitioned into a plurality of isolated partitions, each of the plurality of isolated partitions including a first command streamer, a second command streamer, and circuitry configured to schedule general-purpose graphics compute workloads submitted to a first plurality of command queues associated with the first command streamer and a second plurality of command queues associated with the second command streamer.

Type: Application

Filed: August 10, 2022

Publication date: February 15, 2024

Applicant: Intel Corporation

Inventors: Joydeep Ray, Vasanth Ranganathan, James Valerio, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Ben J. Ashbaugh, Michal Mrozek, Murali Ramadoss, Hong Jiang, Ankur Shah
MULTI-TILE GRAPHICS PROCESSING UNIT

Publication number: 20230051227

Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. In one embodiment, the apparatus includes a graphics processor comprising a first semiconductor die including a first high-bandwidth memory (HBM) device, a second semiconductor die including a second HBM device, and a third semiconductor die coupled with the first semiconductor die and the second semiconductor die in a 2.5-dimensional (2.5D) arrangement. The third semiconductor die includes a graphics processing resource and a cache coupled with the graphics processing resource. The cache is configurable to cache data associated with memory accessed by the graphics processing resource and the graphics processing resource includes a general-purpose graphics processor core and a tensor core.

Type: Application

Filed: November 1, 2022

Publication date: February 16, 2023

Applicant: Intel Corporation

Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
ASYNCHRONOUS INPUT DEPENDENCY RESOLUTION MECHANISM

Publication number: 20220291955

Abstract: Described herein is a graphics processor configured to perform asynchronous input dependency resolution among a group of interdependent workloads. The graphics processor can dynamically resolve input dependencies among the workloads according to a dependency relationship defined for the workloads. Dependency resolution can be performed via a deferred submission mode which resolves input dependencies prior to thread dispatch to the processing resources or via an immediate submission mode which resolves input dependencies at the processing resources.

Type: Application

Filed: March 7, 2022

Publication date: September 15, 2022

Applicant: Intel Corporation

Inventors: Michal Mrozek, Vinod Tipparaju
MULTI-TILE GRAPHICS PROCESSING UNIT

Publication number: 20220156879

Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. The apparatus comprises a plurality of processing tiles, each including a memory device and a plurality of processing resources, coupled to the device memory, and a memory management unit to manage the memory devices in each of the plurality of tiles to perform allocation of memory resources among the memory devices for execution by the plurality of processing resources.

Type: Application

Filed: November 18, 2020

Publication date: May 19, 2022

Applicant: Intel Corporation

Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
CONCURRENT WORKLOAD SCHEDULING WITH MULTIPLE LEVEL OF DEPENDENCIES

Publication number: 20210263766

Abstract: Examples described herein include a graphics processing apparatus that includes at least a memory device and an execution unit coupled to the memory device. The memory device can store a command buffer with at least one command that is dependent on completion of at least one other command. The command buffer can include a jump command that causes a jump to a location in the command buffer to identify any unscheduled command. The execution unit is to jump to a location in the command buffer based on execution of the jump command. The execution unit is to perform one or more jumps to one or more locations in the command buffer to attempt to schedule a command with a dependency on completion of at least one other command until that command is scheduled.

Type: Application

Filed: February 20, 2020

Publication date: August 26, 2021

Inventors: Bartosz Dunajski, Brandon Fliflet, Michal Mrozek
Scheduling and dispatch of GPGPU workloads

Patent number: 10937118

Abstract: A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one dimensional local identification generation to maximize thread residency.

Type: Grant

Filed: January 28, 2019

Date of Patent: March 2, 2021

Assignee: Intel Corporation

Inventors: Jayanth N. Rao, Michal Mrozek
Method and apparatus for a highly efficient graphics processing unit (GPU) execution model

Patent number: 10521874

Abstract: An apparatus and method are described for executing workloads without host intervention. For example, one embodiment of an apparatus comprises: a host processor; and a graphics processor unit (GPU) to execute a hierarchical workload responsive to one or more commands issued by the host processor, the hierarchical workload comprising a parent workload and a plurality of child workloads interconnected in a logical graph structure; and a scheduler kernel implemented by the GPU to schedule execution of the plurality of child workloads without host intervention, the scheduler kernel to evaluate conditions required for execution of the child workloads and determine an order in which to execute the child workloads on the GPU based on the evaluated conditions; the GPU to execute the child workloads in the order determined by the scheduler kernel and to provide results of parent and child workloads to the host processor following execution of all of the child workloads.

Type: Grant

Filed: September 26, 2014

Date of Patent: December 31, 2019

Assignee: Intel Corporation

Inventors: Jayanth N. Rao, Pavan K. Lanka, Michal Mrozek
SCHEDULING AND DISPATCH OF GPGPU WORKLOADS

Publication number: 20190259129

Abstract: A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one dimensional local identification generation to maximize thread residency.

Type: Application

Filed: January 28, 2019

Publication date: August 22, 2019

Applicant: Intel Corporation

Inventors: Jayanth N. Rao, Michal Mrozek
Scheduling and dispatch of GPGPU workloads

Patent number: 10235732

Abstract: A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one dimensional local identification generation to maximize thread residency.

Type: Grant

Filed: December 27, 2013

Date of Patent: March 19, 2019

Assignee: Intel Corporation

Inventors: Jayanth N. Rao, Michal Mrozek
EMPLOYING OUT OF ORDER QUEUES FOR BETTER GPU UTILIZATION

Publication number: 20170300361

Abstract: Methods and apparatus relating to employing out-of-order queues for improved GPU (Graphics Processing Unit) utilization are described. In an embodiment, logic is used to employ out-of-order queues for improved GPU utilization. Other embodiments are also disclosed and claimed.

Type: Application

Filed: July 3, 2016

Publication date: October 19, 2017

Applicant: Intel Corporation

Inventors: Pavan K. Lanka, Krzysztof Laskowski, Michal Mrozek
METHOD AND APPARATUS FOR A HIGHLY EFFICIENT GRAPHICS PROCESSING UNIT (GPU) EXECUTION MODEL

Publication number: 20160093012

Abstract: An apparatus and method are described for executing workloads without host intervention. For example, one embodiment of an apparatus comprises: a host processor; and a graphics processor unit (GPU) to execute a hierarchical workload responsive to one or more commands issued by the host processor, the hierarchical workload comprising a parent workload and a plurality of child workloads interconnected in a logical graph structure; and a scheduler kernel implemented by the GPU to schedule execution of the plurality of child workloads without host intervention, the scheduler kernel to evaluate conditions required for execution of the child workloads and determine an order in which to execute the child workloads on the GPU based on the evaluated conditions; the GPU to execute the child workloads in the order determined by the scheduler kernel and to provide results of parent and child workloads to the host processor following execution of all of the child workloads.

Type: Application

Filed: September 26, 2014

Publication date: March 31, 2016

Inventors: Jayanth N. Rao, Pavan K. Lanka, Michal Mrozek
SCHEDULING AND DISPATCH OF GPGPU WORKLOADS

Publication number: 20150187040

Abstract: A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one dimensional local identification generation to maximize thread residency.

Type: Application

Filed: December 27, 2013

Publication date: July 2, 2015

Inventors: Jayanth N. Rao, Michal Mrozek