Patents by Inventor Michal Mrozek

Michal Mrozek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12217327
    Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. In one embodiment, the apparatus includes a graphics processor comprising a first semiconductor die including a first high-bandwidth memory (HBM) device, a second semiconductor die including a second HBM device, and a third semiconductor die coupled with the first semiconductor die and the second semiconductor die in a 2.5-dimensional (2.5D) arrangement. The third semiconductor die includes a graphics processing resource and a cache coupled with the graphics processing resource. The cache is configurable to cache data associated with memory accessed by the graphics processing resource and the graphics processing resource includes a general-purpose graphics processor core and a tensor core.
    Type: Grant
    Filed: November 1, 2022
    Date of Patent: February 4, 2025
    Assignee: Intel Corporation
    Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
  • Patent number: 12199759
    Abstract: Described herein is a graphics processor configured to perform asynchronous input dependency resolution among a group of interdependent workloads. The graphics processor can dynamically resolve input dependencies among the workloads according to a dependency relationship defined for the workloads. Dependency resolution can be performed via a deferred submission mode, which resolves input dependencies prior to thread dispatch to the processing resources, or via an immediate submission mode, which resolves input dependencies at the processing resources.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: January 14, 2025
    Assignee: Intel Corporation
    Inventors: Michal Mrozek, Vinod Tipparaju
  • Patent number: 11907756
    Abstract: A graphics processing apparatus that includes at least a memory device and an execution unit coupled to the memory. The memory device can store a command buffer with at least one command that is dependent on completion of at least one other command. The command buffer can include a jump command that causes a jump to a location in the command buffer to identify any unscheduled command. The execution unit is to jump to a location in the command buffer based on execution of the jump command. The execution unit is to perform one or more jumps to one or more locations in the command buffer to attempt to schedule a command with dependency on completion of at least one other command until the command with a dependency on completion of at least one other command is scheduled.
    Type: Grant
    Filed: February 20, 2020
    Date of Patent: February 20, 2024
    Assignee: Intel Corporation
    Inventors: Bartosz Dunajski, Brandon Fliflet, Michal Mrozek
  • Publication number: 20240054595
    Abstract: Embodiments described herein provide a system of concurrent compute queues that enable the scheduling of a large number of compute contexts simultaneously on graphics processor hardware. One embodiment provides an apparatus comprising a system interface and a general-purpose graphics processor coupled with the system interface. The general-purpose graphics processor comprises a plurality of graphics processor hardware resources configured to be partitioned into a plurality of isolated partitions, each of the plurality of isolated partitions including a first command streamer, a second command streamer, and circuitry configured to schedule general-purpose graphics compute workloads submitted to a first plurality of command queues associated with the first command streamer and a second plurality of command queues associated with the second command streamer.
    Type: Application
    Filed: August 10, 2022
    Publication date: February 15, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Vasanth Ranganathan, James Valerio, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Ben J. Ashbaugh, Michal Mrozek, Murali Ramadoss, Hong Jiang, Ankur Shah
  • Publication number: 20230051227
    Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. In one embodiment, the apparatus includes a graphics processor comprising a first semiconductor die including a first high-bandwidth memory (HBM) device, a second semiconductor die including a second HBM device, and a third semiconductor die coupled with the first semiconductor die and the second semiconductor die in a 2.5-dimensional (2.5D) arrangement. The third semiconductor die includes a graphics processing resource and a cache coupled with the graphics processing resource. The cache is configurable to cache data associated with memory accessed by the graphics processing resource and the graphics processing resource includes a general-purpose graphics processor core and a tensor core.
    Type: Application
    Filed: November 1, 2022
    Publication date: February 16, 2023
    Applicant: Intel Corporation
    Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
  • Publication number: 20220291955
    Abstract: Described herein is a graphics processor configured to perform asynchronous input dependency resolution among a group of interdependent workloads. The graphics processor can dynamically resolve input dependencies among the workloads according to a dependency relationship defined for the workloads. Dependency resolution can be performed via a deferred submission mode, which resolves input dependencies prior to thread dispatch to the processing resources, or via an immediate submission mode, which resolves input dependencies at the processing resources.
    Type: Application
    Filed: March 7, 2022
    Publication date: September 15, 2022
    Applicant: Intel Corporation
    Inventors: Michal Mrozek, Vinod Tipparaju
  • Publication number: 20220156879
    Abstract: An apparatus to facilitate processing in a multi-tile device is disclosed. The apparatus comprises a plurality of processing tiles, each including a memory device and a plurality of processing resources, coupled to the device memory, and a memory management unit to manage the memory devices in each of the plurality of tiles to perform allocation of memory resources among the memory devices for execution by the plurality of processing resources.
    Type: Application
    Filed: November 18, 2020
    Publication date: May 19, 2022
    Applicant: Intel Corporation
    Inventors: Michal Mrozek, Bartosz Dunajski, Ben Ashbaugh, Brandon Fliflet
  • Publication number: 20210263766
    Abstract: Examples described herein include a graphics processing apparatus that includes at least a memory device and an execution unit coupled to the memory. The memory device can store a command buffer with at least one command that is dependent on completion of at least one other command. The command buffer can include a jump command that causes a jump to a location in the command buffer to identify any unscheduled command. The execution unit is to jump to a location in the command buffer based on execution of the jump command. The execution unit is to perform one or more jumps to one or more locations in the command buffer to attempt to schedule a command with dependency on completion of at least one other command until the command with a dependency on completion of at least one other command is scheduled.
    Type: Application
    Filed: February 20, 2020
    Publication date: August 26, 2021
    Inventors: Bartosz Dunajski, Brandon Fliflet, Michal Mrozek
  • Patent number: 10937118
    Abstract: A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one-dimensional local identification generation to maximize thread residency.
    Type: Grant
    Filed: January 28, 2019
    Date of Patent: March 2, 2021
    Assignee: Intel Corporation
    Inventors: Jayanth N. Rao, Michal Mrozek
  • Patent number: 10521874
    Abstract: An apparatus and method are described for executing workloads without host intervention. For example, one embodiment of an apparatus comprises: a host processor; and a graphics processor unit (GPU) to execute a hierarchical workload responsive to one or more commands issued by the host processor, the hierarchical workload comprising a parent workload and a plurality of child workloads interconnected in a logical graph structure; and a scheduler kernel implemented by the GPU to schedule execution of the plurality of child workloads without host intervention, the scheduler kernel to evaluate conditions required for execution of the child workloads and determine an order in which to execute the child workloads on the GPU based on the evaluated conditions; the GPU to execute the child workloads in the order determined by the scheduler kernel and to provide results of parent and child workloads to the host processor following execution of all of the child workloads.
    Type: Grant
    Filed: September 26, 2014
    Date of Patent: December 31, 2019
    Assignee: Intel Corporation
    Inventors: Jayanth N. Rao, Pavan K. Lanka, Michal Mrozek
  • Publication number: 20190259129
    Abstract: A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one-dimensional local identification generation to maximize thread residency.
    Type: Application
    Filed: January 28, 2019
    Publication date: August 22, 2019
    Applicant: Intel Corporation
    Inventors: Jayanth N. Rao, Michal Mrozek
  • Patent number: 10235732
    Abstract: A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one-dimensional local identification generation to maximize thread residency.
    Type: Grant
    Filed: December 27, 2013
    Date of Patent: March 19, 2019
    Assignee: Intel Corporation
    Inventors: Jayanth N. Rao, Michal Mrozek
  • Publication number: 20170300361
    Abstract: Methods and apparatus relating to employing out-of-order queues for improved GPU (Graphics Processing Unit) utilization are described. In an embodiment, logic is used to employ out-of-order queues for improved GPU (Graphics Processing Unit) utilization. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: July 3, 2016
    Publication date: October 19, 2017
    Applicant: Intel Corporation
    Inventors: Pavan K. Lanka, Krzysztof Laskowski, Michal Mrozek
  • Publication number: 20160093012
    Abstract: An apparatus and method are described for executing workloads without host intervention. For example, one embodiment of an apparatus comprises: a host processor; and a graphics processor unit (GPU) to execute a hierarchical workload responsive to one or more commands issued by the host processor, the hierarchical workload comprising a parent workload and a plurality of child workloads interconnected in a logical graph structure; and a scheduler kernel implemented by the GPU to schedule execution of the plurality of child workloads without host intervention, the scheduler kernel to evaluate conditions required for execution of the child workloads and determine an order in which to execute the child workloads on the GPU based on the evaluated conditions; the GPU to execute the child workloads in the order determined by the scheduler kernel and to provide results of parent and child workloads to the host processor following execution of all of the child workloads.
    Type: Application
    Filed: September 26, 2014
    Publication date: March 31, 2016
    Inventors: Jayanth N. Rao, Pavan K. Lanka, Michal Mrozek
  • Publication number: 20150187040
    Abstract: A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one-dimensional local identification generation to maximize thread residency.
    Type: Application
    Filed: December 27, 2013
    Publication date: July 2, 2015
    Inventors: Jayanth N. Rao, Michal Mrozek
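
Several of the abstracts above (for example, patent 10521874 and publication 20160093012) describe a scheduler that evaluates the conditions required for each child workload and then determines an execution order over a logical graph. The following is only an illustrative host-side sketch of that general idea, not the patented method or any Intel API. It assumes that the "conditions" are simply the set of workloads that must finish first, and it uses a standard topological ordering (Kahn's algorithm). The function name `schedule_children` and the dictionary encoding of the graph are hypothetical.

```python
from collections import deque


def schedule_children(dependencies):
    """Return one valid execution order for a graph of workloads.

    `dependencies` maps each workload name to the set of workloads that
    must complete before it may run (a hypothetical stand-in for the
    "conditions required for execution" mentioned in the abstract).
    """
    # Copy the prerequisite sets so the caller's data is not modified.
    remaining = {name: set(deps) for name, deps in dependencies.items()}

    # Reverse index: for each workload, which workloads wait on it.
    dependents = {}
    for name, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Workloads with no outstanding prerequisites can run immediately.
    ready = deque(sorted(n for n, deps in remaining.items() if not deps))
    order = []

    while ready:
        current = ready.popleft()
        order.append(current)
        # Completing `current` may satisfy the last condition of a dependent.
        for name in dependents.get(current, []):
            remaining[name].discard(current)
            if not remaining[name]:
                ready.append(name)

    if len(order) != len(remaining):
        # Some workload never became ready: a cycle or an unknown prerequisite.
        raise ValueError("dependency cycle or unknown prerequisite")
    return order


if __name__ == "__main__":
    graph = {
        "parent": set(),
        "child_a": {"parent"},
        "child_b": {"parent"},
        "child_c": {"child_a", "child_b"},
    }
    print(schedule_children(graph))
```

In this sketch the parent workload runs first, the two independent children follow, and the child that depends on both runs last. A real on-device scheduler would evaluate richer conditions (such as events or queue state) rather than a static prerequisite set.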