Patents by Inventor Michal Mrozek
Michal Mrozek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12217327
Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. In one embodiment, the apparatus includes a graphics processor comprising a first semiconductor die including a first high-bandwidth memory (HBM) device, a second semiconductor die including a second HBM device, and a third semiconductor die coupled with the first semiconductor die and the second semiconductor die in a 2.5-dimensional (2.5D) arrangement. The third semiconductor die includes a graphics processing resource and a cache coupled with the graphics processing resource. The cache is configurable to cache data associated with memory accessed by the graphics processing resource, and the graphics processing resource includes a general-purpose graphics processor core and a tensor core.
Type: Grant
Filed: November 1, 2022
Date of Patent: February 4, 2025
Assignee: Intel Corporation
Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
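The claimed arrangement is a packaging topology rather than an algorithm, but a minimal sketch can make the described relationships concrete. The Python below models the two HBM dies, the compute die with its cache, GPGPU core, and tensor core, and the 2.5D coupling between them; all class and field names (and the capacity figures) are illustrative assumptions, not terms from the patent.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class HbmDie:
    """Semiconductor die carrying a high-bandwidth memory (HBM) stack."""
    capacity_gib: int  # placeholder capacity, not taken from the patent

@dataclass
class ComputeDie:
    """Die holding the graphics processing resource and its cache."""
    cache_mib: int
    has_gpgpu_core: bool = True   # general-purpose graphics processor core
    has_tensor_core: bool = True  # tensor core

@dataclass
class MultiTilePackage:
    """2.5D package: HBM dies coupled with the compute die (e.g. via an interposer)."""
    hbm_dies: List[HbmDie]
    compute_die: ComputeDie

# Two HBM dies coupled with one compute die, as in the abstract.
package = MultiTilePackage(
    hbm_dies=[HbmDie(capacity_gib=16), HbmDie(capacity_gib=16)],
    compute_die=ComputeDie(cache_mib=256),
)
```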
-
Patent number: 12199759
Abstract: Described herein is a graphics processor configured to perform asynchronous input dependency resolution among a group of interdependent workloads. The graphics processor can dynamically resolve input dependencies among the workloads according to a dependency relationship defined for the workloads. Dependency resolution can be performed via a deferred submission mode, which resolves input dependencies prior to thread dispatch to the processing resources, or via an immediate submission mode, which resolves input dependencies at the processing resources.
Type: Grant
Filed: March 7, 2022
Date of Patent: January 14, 2025
Assignee: Intel Corporation
Inventors: Michal Mrozek, Vinod Tipparaju
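As a rough illustration of the two submission modes the abstract contrasts, the sketch below resolves input dependencies either before dispatch (deferred mode) or lets each workload wait on its inputs at execution time (immediate mode). The names, the event mechanism, and the thread-based execution are hypothetical stand-ins, not the patented hardware path.

```python
import threading
from typing import Dict, List

class Workload:
    def __init__(self, name: str, inputs: List[str]):
        self.name = name
        self.inputs = inputs           # names of workloads this one depends on
        self.done = threading.Event()  # signaled when the workload finishes

    def run(self) -> None:
        print(f"executing {self.name}")
        self.done.set()

def submit_deferred(workloads: Dict[str, Workload]) -> None:
    """Deferred mode: resolve input dependencies before dispatching each workload."""
    pending = dict(workloads)
    while pending:
        ready = [w for w in pending.values()
                 if all(workloads[i].done.is_set() for i in w.inputs)]
        if not ready:
            raise RuntimeError("cyclic dependency among workloads")
        for w in ready:
            w.run()
            del pending[w.name]

def submit_immediate(workloads: Dict[str, Workload]) -> None:
    """Immediate mode: dispatch everything; each workload waits on its own inputs."""
    def worker(w: Workload) -> None:
        for i in w.inputs:
            workloads[i].done.wait()   # dependency resolved at the processing resource
        w.run()
    threads = [threading.Thread(target=worker, args=(w,)) for w in workloads.values()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Demo of the deferred mode; submit_immediate works the same way on a fresh graph.
graph = {"a": Workload("a", []), "b": Workload("b", ["a"]), "c": Workload("c", ["a", "b"])}
submit_deferred(graph)
```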
-
Patent number: 11907756
Abstract: A graphics processing apparatus that includes at least a memory device and an execution unit coupled to the memory. The memory device can store a command buffer with at least one command that is dependent on completion of at least one other command. The command buffer can include a jump command that causes a jump to a location in the command buffer to identify any unscheduled command. The execution unit is to jump to a location in the command buffer based on execution of the jump command. The execution unit is to perform one or more jumps to one or more locations in the command buffer to attempt to schedule a command with a dependency on completion of at least one other command until that command is scheduled.
Type: Grant
Filed: February 20, 2020
Date of Patent: February 20, 2024
Assignee: Intel Corporation
Inventors: Bartosz Dunajski, Brandon Fliflet, Michal Mrozek
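To make the revisit-via-jump idea easier to picture, here is a small software sketch: a command buffer where dependent commands that cannot yet be scheduled are skipped, and a jump sends the walker back to retry them until they are scheduled. The command names and the list-based buffer are assumptions for illustration only.

```python
class Command:
    def __init__(self, name, depends_on=None):
        self.name = name
        self.depends_on = depends_on  # another Command that must complete first
        self.scheduled = False
        self.completed = False

def walk_command_buffer(buffer):
    """Repeatedly walk the buffer, 'jumping' back to unscheduled commands
    until every dependent command has been scheduled (assumes no cycles)."""
    while not all(cmd.scheduled for cmd in buffer):
        for cmd in buffer:                      # the jump lands us back at the start
            if cmd.scheduled:
                continue
            dep = cmd.depends_on
            if dep is None or dep.completed:
                cmd.scheduled = True
                cmd.completed = True            # executed immediately in this sketch
                print(f"scheduled {cmd.name}")
            else:
                print(f"{cmd.name} waiting on {dep.name}; will jump back later")

a = Command("copy")
b = Command("kernel", depends_on=a)
walk_command_buffer([b, a])   # 'kernel' is revisited after 'copy' completes
```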
-
Publication number: 20240054595
Abstract: Embodiments described herein provide a system of concurrent compute queues that enable the scheduling of a large number of compute contexts simultaneously on graphics processor hardware. One embodiment provides an apparatus comprising a system interface and a general-purpose graphics processor coupled with the system interface. The general-purpose graphics processor comprises a plurality of graphics processor hardware resources configured to be partitioned into a plurality of isolated partitions, each of the plurality of isolated partitions including a first command streamer, a second command streamer, and circuitry configured to schedule general-purpose graphics compute workloads submitted to a first plurality of command queues associated with the first command streamer and a second plurality of command queues associated with the second command streamer.
Type: Application
Filed: August 10, 2022
Publication date: February 15, 2024
Applicant: Intel Corporation
Inventors: Joydeep Ray, Vasanth Ranganathan, James Valerio, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Ben J. Ashbaugh, Michal Mrozek, Murali Ramadoss, Hong Jiang, Ankur Shah
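The structure described, hardware partitions each exposing two command streamers, with each streamer fed by several command queues, can be sketched as plain data. Everything below (names, queue counts, sample workload) is a hypothetical model of that hierarchy, not the actual hardware interface.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CommandQueue:
    queue_id: int
    pending: List[str] = field(default_factory=list)  # compute workloads awaiting dispatch

@dataclass
class CommandStreamer:
    """One of the two command streamers inside an isolated partition."""
    queues: List[CommandQueue]

@dataclass
class IsolatedPartition:
    first_streamer: CommandStreamer
    second_streamer: CommandStreamer

def make_partition(queues_per_streamer: int = 4) -> IsolatedPartition:
    def make(base: int) -> CommandStreamer:
        return CommandStreamer([CommandQueue(base + i) for i in range(queues_per_streamer)])
    return IsolatedPartition(make(0), make(queues_per_streamer))

# Two isolated partitions carved out of the same graphics processor.
gpu_partitions = [make_partition() for _ in range(2)]
gpu_partitions[0].first_streamer.queues[0].pending.append("gemm_kernel")
```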
-
Publication number: 20230051227
Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. In one embodiment, the apparatus includes a graphics processor comprising a first semiconductor die including a first high-bandwidth memory (HBM) device, a second semiconductor die including a second HBM device, and a third semiconductor die coupled with the first semiconductor die and the second semiconductor die in a 2.5-dimensional (2.5D) arrangement. The third semiconductor die includes a graphics processing resource and a cache coupled with the graphics processing resource. The cache is configurable to cache data associated with memory accessed by the graphics processing resource, and the graphics processing resource includes a general-purpose graphics processor core and a tensor core.
Type: Application
Filed: November 1, 2022
Publication date: February 16, 2023
Applicant: Intel Corporation
Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
-
Publication number: 20220291955
Abstract: Described herein is a graphics processor configured to perform asynchronous input dependency resolution among a group of interdependent workloads. The graphics processor can dynamically resolve input dependencies among the workloads according to a dependency relationship defined for the workloads. Dependency resolution can be performed via a deferred submission mode, which resolves input dependencies prior to thread dispatch to the processing resources, or via an immediate submission mode, which resolves input dependencies at the processing resources.
Type: Application
Filed: March 7, 2022
Publication date: September 15, 2022
Applicant: Intel Corporation
Inventors: Michal Mrozek, Vinod Tipparaju
-
Publication number: 20220156879
Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. The apparatus comprises a plurality of processing tiles, each including a memory device and a plurality of processing resources coupled to the memory device, and a memory management unit to manage the memory devices in each of the plurality of tiles and perform allocation of memory resources among the memory devices for execution by the plurality of processing resources.
Type: Application
Filed: November 18, 2020
Publication date: May 19, 2022
Applicant: Intel Corporation
Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
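As a rough model of the cross-tile allocation the abstract describes, the sketch below keeps a free-byte counter per tile and lets a simple memory-management object place each allocation on the tile with the most free memory; the placement policy and all names are invented for illustration.

```python
class Tile:
    def __init__(self, tile_id: int, memory_bytes: int):
        self.tile_id = tile_id
        self.free_bytes = memory_bytes

class MemoryManagementUnit:
    """Toy MMU that spreads allocations across the per-tile memory devices."""
    def __init__(self, tiles):
        self.tiles = tiles

    def allocate(self, size: int) -> int:
        # Illustrative policy: place the allocation on the tile with the most free memory.
        tile = max(self.tiles, key=lambda t: t.free_bytes)
        if tile.free_bytes < size:
            raise MemoryError("no tile has enough free memory")
        tile.free_bytes -= size
        return tile.tile_id

mmu = MemoryManagementUnit([Tile(0, 8 << 30), Tile(1, 8 << 30)])
print(mmu.allocate(1 << 20))  # lands on whichever tile currently has more free memory
```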
-
Publication number: 20210263766
Abstract: Examples described herein include a graphics processing apparatus that includes at least a memory device and an execution unit coupled to the memory. The memory device can store a command buffer with at least one command that is dependent on completion of at least one other command. The command buffer can include a jump command that causes a jump to a location in the command buffer to identify any unscheduled command. The execution unit is to jump to a location in the command buffer based on execution of the jump command. The execution unit is to perform one or more jumps to one or more locations in the command buffer to attempt to schedule a command with a dependency on completion of at least one other command until that command is scheduled.
Type: Application
Filed: February 20, 2020
Publication date: August 26, 2021
Inventors: Bartosz Dunajski, Brandon Fliflet, Michal Mrozek
-
Patent number: 10937118
Abstract: A method and system are described herein for optimizing two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one-dimensional local identification generation to maximize thread residency.
Type: Grant
Filed: January 28, 2019
Date of Patent: March 2, 2021
Assignee: Intel Corporation
Inventors: Jayanth N. Rao, Michal Mrozek
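The one-dimensional local identification generation the abstract relies on is, in essence, flattening a multi-dimensional local work-group coordinate into a single index so the hardware walker can dispatch threads densely. Below is a plain-Python illustration of that flattening using standard row-major arithmetic; it is not code from the driver.

```python
def flat_local_id(x: int, y: int, z: int, lx: int, ly: int) -> int:
    """Flatten a 3D local ID into a single 1D ID for a work-group of size (lx, ly, lz)."""
    return x + y * lx + z * lx * ly

# A 4 x 2 x 2 work-group enumerates local IDs 0..15 with no gaps,
# which is what lets the walker pack threads densely.
ids = [flat_local_id(x, y, z, 4, 2)
       for z in range(2) for y in range(2) for x in range(4)]
assert ids == list(range(16))
```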
-
Patent number: 10521874
Abstract: An apparatus and method are described for executing workloads without host intervention. For example, one embodiment of an apparatus comprises: a host processor; and a graphics processor unit (GPU) to execute a hierarchical workload responsive to one or more commands issued by the host processor, the hierarchical workload comprising a parent workload and a plurality of child workloads interconnected in a logical graph structure; and a scheduler kernel implemented by the GPU to schedule execution of the plurality of child workloads without host intervention, the scheduler kernel to evaluate conditions required for execution of the child workloads and determine an order in which to execute the child workloads on the GPU based on the evaluated conditions; the GPU to execute the child workloads in the order determined by the scheduler kernel and to provide results of parent and child workloads to the host processor following execution of all of the child workloads.
Type: Grant
Filed: September 26, 2014
Date of Patent: December 31, 2019
Assignee: Intel Corporation
Inventors: Jayanth N. Rao, Pavan K. Lanka, Michal Mrozek
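The scheduling behaviour claimed, evaluating which child workloads are ready and ordering their execution without returning to the host, can be approximated in a few lines. The sketch below walks a parent workload and its children in dependency order and only "reports back" once every child has finished; the graph representation and all names are assumptions made for illustration.

```python
def run_hierarchical_workload(parent, children, edges):
    """Execute `parent`, then its `children` in an order that respects `edges`
    (a dict: child -> list of children it depends on), without host round-trips."""
    results = {parent: f"result({parent})"}   # parent workload runs first
    remaining = set(children)
    while remaining:
        ready = [c for c in remaining if all(d in results for d in edges.get(c, []))]
        if not ready:
            raise RuntimeError("cyclic dependency between child workloads")
        for c in ready:                        # scheduler picks the next ready batch
            results[c] = f"result({c})"
            remaining.discard(c)
    return results                             # handed to the host only at the end

print(run_hierarchical_workload("parent", ["c1", "c2", "c3"],
                                {"c2": ["c1"], "c3": ["c1", "c2"]}))
```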
-
Publication number: 20190259129
Abstract: A method and system are described herein for optimizing two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one-dimensional local identification generation to maximize thread residency.
Type: Application
Filed: January 28, 2019
Publication date: August 22, 2019
Applicant: Intel Corporation
Inventors: Jayanth N. Rao, Michal Mrozek
-
Patent number: 10235732
Abstract: A method and system are described herein for optimizing two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one-dimensional local identification generation to maximize thread residency.
Type: Grant
Filed: December 27, 2013
Date of Patent: March 19, 2019
Assignee: Intel Corporation
Inventors: Jayanth N. Rao, Michal Mrozek
-
Publication number: 20170300361
Abstract: Methods and apparatus relating to employing out-of-order queues for improved GPU (Graphics Processing Unit) utilization are described. In an embodiment, logic is used to employ out-of-order queues for improved GPU utilization. Other embodiments are also disclosed and claimed.
Type: Application
Filed: July 3, 2016
Publication date: October 19, 2017
Applicant: Intel Corporation
Inventors: Pavan K. Lanka, Krzysztof Laskowski, Michal Mrozek
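The abstract is terse, but the general idea behind an out-of-order queue is that enqueued work items may complete in any order, with explicit dependencies enforced only where they matter, so independent work can fill the GPU. A small, library-free Python sketch of that idea is below; it is purely illustrative and not the claimed logic. Note that, in this sketch, dependencies must be submitted before their dependents.

```python
from concurrent.futures import ThreadPoolExecutor

def out_of_order_submit(task_names, deps):
    """Submit the named tasks to a pool; `deps` maps a task name to the task
    names it must wait for. Independent tasks may finish in any order."""
    futures = {}
    with ThreadPoolExecutor() as pool:
        def wrap(name):
            def run():
                for d in deps.get(name, ()):   # wait only on explicit dependencies
                    futures[d].result()
                return f"{name} done"
            return run
        for name in task_names:
            futures[name] = pool.submit(wrap(name))
        return {name: f.result() for name, f in futures.items()}

# The two copies may complete in either order; only the kernel waits on both.
print(out_of_order_submit(["copy_a", "copy_b", "kernel"],
                          {"kernel": ["copy_a", "copy_b"]}))
```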
-
Publication number: 20160093012
Abstract: An apparatus and method are described for executing workloads without host intervention. For example, one embodiment of an apparatus comprises: a host processor; and a graphics processor unit (GPU) to execute a hierarchical workload responsive to one or more commands issued by the host processor, the hierarchical workload comprising a parent workload and a plurality of child workloads interconnected in a logical graph structure; and a scheduler kernel implemented by the GPU to schedule execution of the plurality of child workloads without host intervention, the scheduler kernel to evaluate conditions required for execution of the child workloads and determine an order in which to execute the child workloads on the GPU based on the evaluated conditions; the GPU to execute the child workloads in the order determined by the scheduler kernel and to provide results of parent and child workloads to the host processor following execution of all of the child workloads.
Type: Application
Filed: September 26, 2014
Publication date: March 31, 2016
Inventors: Jayanth N. Rao, Pavan K. Lanka, Michal Mrozek
-
Publication number: 20150187040
Abstract: A method and system are described herein for optimizing two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one-dimensional local identification generation to maximize thread residency.
Type: Application
Filed: December 27, 2013
Publication date: July 2, 2015
Inventors: Jayanth N. Rao, Michal Mrozek