Patents by Inventor DARIN M. STARKEY

DARIN M. STARKEY has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11900502
    Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as a single write instead of multiple separate partial writes.
    Type: Grant
    Filed: May 2, 2022
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Chandra S. Gurram, Gang Y. Chen, Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Jorge E. Parra, Darin M. Starkey, Guei-Yuan Lueh, Wei-Yu Chen
  • Publication number: 20220261949
    Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as a single write instead of multiple separate partial writes.
    Type: Application
    Filed: May 2, 2022
    Publication date: August 18, 2022
    Inventors: Chandra S. Gurram, Gang Y. Chen, Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Jorge E. Parra, Darin M. Starkey, Guei-Yuan Lueh, Wei-Yu Chen
  • Patent number: 11321799
    Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as a single write instead of multiple separate partial writes.
    Type: Grant
    Filed: December 24, 2019
    Date of Patent: May 3, 2022
    Assignee: Intel Corporation
    Inventors: Chandra S. Gurram, Gang Y. Chen, Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Jorge E. Parra, Darin M. Starkey, Guei-Yuan Lueh, Wei-Yu Chen
  • Publication number: 20210192673
    Abstract: Examples described herein relate to a software and hardware optimization that manages scenarios where a write operation to a register is less than an entirety of the register. A compiler detects instructions that make partial writes to the same register, groups such instructions, and provides hints to hardware of the partial write. The execution unit combines the output data for grouped instructions and updates the destination register as a single write instead of multiple separate partial writes.
    Type: Application
    Filed: December 24, 2019
    Publication date: June 24, 2021
    Inventors: Chandra S. Gurram, Gang Y. Chen, Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Jorge E. Parra, Darin M. Starkey, Guei-Yuan Lueh, Wei-Yu Chen
  • Patent number: 10699362
    Abstract: Embodiments provide support for divergent control flow in heterogeneous compute operations on a fused execution unit. One embodiment provides for a processing apparatus comprising a fused execution unit including multiple graphics execution units having a common instruction pointer; logic to serialize divergent function calls by the fused execution unit, the logic configured to compare a call target of execution channels within the fused execution unit and create multiple groups of channels, each group of channels associated with a single call target; and wherein the fused execution unit is to execute a first group of channels via a first execution unit and a second group of channels via a second execution unit.
    Type: Grant
    Filed: June 23, 2016
    Date of Patent: June 30, 2020
    Assignee: Intel Corporation
    Inventors: Pratik J. Ashar, Guei-Yuan Ken Lueh, Kaiyu Chen, Subramaniam Maiyuran, Brent A. Schwartz, Darin M. Starkey
  • Patent number: 10692170
    Abstract: Embodiments described herein provide a graphics processor in which dependency tracking hardware is simplified via the use of compiler provided software scoreboard information. In one embodiment the shader compiler for shader programs is configured to encode software scoreboard information into each instruction. Dependencies can be evaluated by the shader compiler and provided as scoreboard information with each instruction. The hardware can then use the provided information when scheduling instructions. In one embodiment, a software scoreboard synchronization instruction is provided to facilitate software dependency handling within a shader program. Using software to facilitate software dependency handling and synchronization can simplify hardware design, reducing the area consumed by the hardware. In one embodiment, dependencies can be evaluated by the shader compiler instead of the GPU hardware.
    Type: Grant
    Filed: June 11, 2019
    Date of Patent: June 23, 2020
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Supratim Pal, Jorge E. Parra, Chandra S. Gurram, Ashwin J. Shivani, Ashutosh Garg, Brent A. Schwartz, Jorge F. Garcia Pabon, Darin M. Starkey, Shubh B. Shah, Guei-Yuan Lueh, Kaiyu Chen, Konrad Trifunovic, Buqi Cheng, Weiyu Chen
  • Publication number: 20190362460
    Abstract: Embodiments described herein provide a graphics processor in which dependency tracking hardware is simplified via the use of compiler provided software scoreboard information. In one embodiment the shader compiler for shader programs is configured to encode software scoreboard information into each instruction. Dependencies can be evaluated by the shader compiler and provided as scoreboard information with each instruction. The hardware can then use the provided information when scheduling instructions. In one embodiment, a software scoreboard synchronization instruction is provided to facilitate software dependency handling within a shader program. Using software to facilitate software dependency handling and synchronization can simplify hardware design, reducing the area consumed by the hardware. In one embodiment, dependencies can be evaluated by the shader compiler instead of the GPU hardware.
    Type: Application
    Filed: June 11, 2019
    Publication date: November 28, 2019
    Applicant: Intel Corporation
    Inventors: Subramaniam Maiyuran, Supratim Pal, Jorge E. Parra, Chandra S. Gurram, Ashwin J. Shivani, Ashutosh Garg, Brent A. Schwartz, Jorge F. Garcia Pabon, Darin M. Starkey, Shubh B. Shah, Guei-Yuan Lueh, Kaiyu Chen, Konrad Trifunovic, Buqi Cheng, Weiyu Chen
  • Publication number: 20190265973
    Abstract: Methods and apparatus relating to techniques for fusing SIMD processing units. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive an instruction set for execution on at least two graphics processing execution units, determine whether the instruction set requires data dependent addressing, and select between a synchronized execution environment for the at least two graphics processing units and an unsynchronized execution environment for the at least two graphics processing units based at least in part on the determination whether the instruction set requires data dependent addressing. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: February 23, 2018
    Publication date: August 29, 2019
    Applicant: Intel Corporation
    Inventors: Subramaniam Maiyuran, Supratim Pal, Ashutosh Garg, Darin M. Starkey, Guei-Yuan Lueh, Jorge E. Parra, Shubh B. Shah, Wei-Yu Chen, Vikranth Vemulapalli, Narsim Krishna, Brent A. Schwartz, Chandra S. Gurram, Wei Pan, Ashwin J. Shivani
  • Patent number: 10360654
    Abstract: Embodiments described herein provide a graphics processor in which dependency tracking hardware is simplified via the use of compiler provided software scoreboard information. In one embodiment the shader compiler for shader programs is configured to encode software scoreboard information into each instruction. Dependencies can be evaluated by the shader compiler and provided as scoreboard information with each instruction. The hardware can then use the provided information when scheduling instructions. In one embodiment, a software scoreboard synchronization instruction is provided to facilitate software dependency handling within a shader program. Using software to facilitate software dependency handling and synchronization can simplify hardware design, reducing the area consumed by the hardware. In one embodiment, dependencies can be evaluated by the shader compiler instead of the GPU hardware.
    Type: Grant
    Filed: May 25, 2018
    Date of Patent: July 23, 2019
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Supratim Pal, Jorge E. Parra, Chandra S. Gurram, Ashwin J. Shivani, Ashutosh Garg, Brent A. Schwartz, Jorge F. Garcia Pabon, Darin M. Starkey, Shubh B. Shah, Guei-Yuan Lueh, Kaiyu Chen, Konrad Trifunovic, Buqi Cheng, Weiyu Chen
  • Patent number: 9983884
    Abstract: An apparatus and method for SIMD structured branching. For example, one embodiment of a processor comprises: an execution unit having a plurality of channels to execute instructions; and a branch unit to process control flow instructions and to maintain a per channel count for each channel and a control instruction count for the control flow instructions, the branch unit to enable and disable the channels based at least on the per channel count.
    Type: Grant
    Filed: September 26, 2014
    Date of Patent: May 29, 2018
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Darin M. Starkey, Thomas A. Piazza
  • Patent number: 9928076
    Abstract: An apparatus and method for SIMD unstructured branching. For example, one embodiment of a processor comprises: an execution unit having a plurality of channels to execute instructions; and a branch unit to process unstructured control flow instructions and to maintain a per channel count value for each channel, the branch unit to store instruction pointer tags for the unstructured control flow instructions in a memory and identify the instruction pointer tags using tag addresses, the branch unit to further enable and disable the channels based at least on the per channel count value.
    Type: Grant
    Filed: September 26, 2014
    Date of Patent: March 27, 2018
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Darin M. Starkey
  • Publication number: 20170372446
    Abstract: Embodiments provide support for divergent control flow in heterogeneous compute operations on a fused execution unit. One embodiment provides for a processing apparatus comprising a fused execution unit including multiple graphics execution units having a common instruction pointer; logic to serialize divergent function calls by the fused execution unit, the logic configured to compare a call target of execution channels within the fused execution unit and create multiple groups of channels, each group of channels associated with a single call target; and wherein the fused execution unit is to execute a first group of channels via a first execution unit and a second group of channels via a second execution unit.
    Type: Application
    Filed: June 23, 2016
    Publication date: December 28, 2017
    Applicant: Intel Corporation
    Inventors: Pratik J. Ashar, Guei-Yuan Ken Lueh, Kaiyu Chen, Subramaniam Maiyuran, Brent A. Schwartz, Darin M. Starkey
  • Publication number: 20160092239
    Abstract: An apparatus and method for SIMD unstructured branching. For example, one embodiment of a processor comprises: an execution unit having a plurality of channels to execute instructions; and a branch unit to process unstructured control flow instructions and to maintain a per channel count value for each channel, the branch unit to store instruction pointer tags for the unstructured control flow instructions in a memory and identify the instruction pointer tags using tag addresses, the branch unit to further enable and disable the channels based at least on the per channel count value.
    Type: Application
    Filed: September 26, 2014
    Publication date: March 31, 2016
    Inventors: Subramaniam Maiyuran, Darin M. Starkey
  • Publication number: 20160092240
    Abstract: An apparatus and method for SIMD structured branching. For example, one embodiment of a processor comprises: an execution unit having a plurality of channels to execute instructions; and a branch unit to process control flow instructions and to maintain a per channel count for each channel and a control instruction count for the control flow instructions, the branch unit to enable and disable the channels based at least on the per channel count.
    Type: Application
    Filed: September 26, 2014
    Publication date: March 31, 2016
    Inventors: Subramaniam Maiyuran, Darin M. Starkey, Thomas A. Piazza
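
The SIMD structured branching abstract above says the branch unit keeps a per channel count and enables or disables channels based on it, but it does not give the update rules. The sketch below is one plausible reading, not the patented design: it assumes a channel is enabled when its count is 0, that a disabled channel's count records how many nesting levels deep it was switched off, and that the control instruction count is simply the current nesting depth. The class `BranchUnit` and its methods `if_`, `else_`, and `endif` are hypothetical names chosen for illustration.

```python
class BranchUnit:
    """Toy model of counter-based SIMD channel masking (illustrative only)."""

    def __init__(self, num_channels):
        # Assumption: count == 0 means the channel is enabled.
        self.counts = [0] * num_channels
        # Assumption: the control instruction count tracks nesting depth.
        self.depth = 0

    def enabled(self):
        """Return the current per-channel enable mask."""
        return [count == 0 for count in self.counts]

    def if_(self, cond):
        """Enter an IF: channels whose condition is false are disabled."""
        self.depth += 1
        for ch, count in enumerate(self.counts):
            if count > 0:
                self.counts[ch] = count + 1  # already off: one level deeper
            elif not cond[ch]:
                self.counts[ch] = 1          # switched off at this level

    def else_(self):
        """Flip only the channels that were decided at the current level."""
        for ch, count in enumerate(self.counts):
            if count == 1:
                self.counts[ch] = 0
            elif count == 0:
                self.counts[ch] = 1

    def endif(self):
        """Leave the block: every disabled channel moves one level up."""
        self.depth -= 1
        for ch, count in enumerate(self.counts):
            if count > 0:
                self.counts[ch] = count - 1


if __name__ == "__main__":
    unit = BranchUnit(4)
    unit.if_([True, True, False, False])  # outer IF
    unit.if_([True, False, True, True])   # nested IF
    print(unit.enabled())                 # [True, False, False, False]
    unit.else_()
    print(unit.enabled())                 # [False, True, False, False]
    unit.endif()
    unit.endif()
    print(unit.enabled())                 # [True, True, True, True]
```

Using one small counter per channel instead of a stack of saved masks means nested blocks need no extra mask storage. Whether the patented branch unit works this way is not stated in the abstract.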