Patents by Inventor Marek Targowski

Marek Targowski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Apparatus and method for widened SIMD execution within a constrained register file

Patent number: 11029960

Abstract: Apparatus and method for widened SIMD execution on a limited register file. For example, one embodiment of an apparatus comprises: instruction dispatch circuitry to dispatch instructions of a thread for execution, including a first instruction to indicate a start of a double execution instruction sequence and a second instruction to indicate an end of a double execution instruction sequence; and execution circuitry including single instruction multiple data (SIMD) circuitry, the execution circuitry to execute the double execution instruction sequence in a first pass using a first set of lanes of the SIMD circuitry and to execute the double execution instruction sequence in a second pass following the first pass using a second set of lanes of the SIMD circuitry.

Type: Grant

Filed: December 7, 2018

Date of Patent: June 8, 2021

Assignee: Intel Corporation

Inventors: Marek Targowski, Konrad Trifunović
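
The two-pass execution described in the abstract above can be modeled roughly as follows. This is a minimal sketch, not the patented design: the function name `run_double_execution`, the `hw_lanes` parameter, and the assumption that each pass covers exactly half of the logical vector are all hypothetical and are not taken from the patent.

```python
from typing import Callable, List

# Hypothetical model: an "instruction" maps a list of lane values to new lane values.
Instruction = Callable[[List[float]], List[float]]


def run_double_execution(sequence: List[Instruction],
                         data: List[float],
                         hw_lanes: int) -> List[float]:
    """Run `sequence` twice: once per set of lanes (assumed here to be two halves)."""
    if len(data) != 2 * hw_lanes:
        raise ValueError("this sketch assumes the data is exactly twice the lane-set width")
    result = list(data)
    # Pass 1 covers lanes [0, hw_lanes); pass 2 covers lanes [hw_lanes, 2 * hw_lanes).
    for first_lane in (0, hw_lanes):
        lanes = result[first_lane:first_lane + hw_lanes]
        for instruction in sequence:  # the instructions between the start and end markers
            lanes = instruction(lanes)
        result[first_lane:first_lane + hw_lanes] = lanes
    return result


# Example sequence: multiply every lane by 2, then add 1.
double_then_increment = [
    lambda v: [x * 2 for x in v],
    lambda v: [x + 1 for x in v],
]
widened = run_double_execution(double_then_increment, list(range(8)), hw_lanes=4)
```

In this model the whole sequence finishes for the first set of lanes before it starts for the second, which matches the "second pass following the first pass" wording of the abstract.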

Smart performance of spill fill data transfers in computing environments

Patent number: 10956359

Abstract: A mechanism is described for facilitating smart spill/fill data transfers in computing environments. A method of embodiments, as described herein, includes facilitating dividing a kernel into regions including low pressure regions and high pressure regions, where the low pressure regions are associated with low use of registers hosted by a processor of a computing device, while the high pressure regions are associated with high use of the registers. The method may further include transferring of data between memory and the registers based on at least one of the low pressure regions and the high pressure regions.

Type: Grant

Filed: August 16, 2017

Date of Patent: March 23, 2021

Assignee: Intel Corporation

Inventors: Marek Targowski, Konrad Trifunovic
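
The division of a kernel into low and high pressure regions, as described in the abstract above, might look like the sketch below. It assumes that a per-instruction count of live registers is already available and that a single fixed threshold separates "low" from "high"; the name `split_by_pressure` and both of those assumptions are illustrative only and do not come from the patent.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (kind, start index, end index exclusive)


def split_by_pressure(live_registers: List[int], threshold: int) -> List[Region]:
    """Group consecutive instructions into low/high register pressure regions."""
    regions: List[Region] = []
    for index, live in enumerate(live_registers):
        kind = "high" if live > threshold else "low"
        if regions and regions[-1][0] == kind:
            # Same kind as the previous instruction: extend the current region.
            regions[-1] = (kind, regions[-1][1], index + 1)
        else:
            # Pressure crossed the threshold: start a new region.
            regions.append((kind, index, index + 1))
    return regions


# Hypothetical live-register counts for a five-instruction kernel.
kernel_regions = split_by_pressure([2, 3, 9, 10, 4], threshold=8)
```

One plausible use of the resulting boundaries, again an assumption rather than something the abstract states, is to schedule spill and fill transfers according to the kind of region a value is live in.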

APPARATUS AND METHOD FOR WIDENED SIMD EXECUTION WITHIN A CONSTRAINED REGISTER FILE

Publication number: 20200183697

Abstract: Apparatus and method for widened SIMD execution on a limited register file. For example, one embodiment of an apparatus comprises: instruction dispatch circuitry to dispatch instructions of a thread for execution, including a first instruction to indicate a start of a double execution instruction sequence and a second instruction to indicate an end of a double execution instruction sequence; and execution circuitry including single instruction multiple data (SIMD) circuitry, the execution circuitry to execute the double execution instruction sequence in a first pass using a first set of lanes of the SIMD circuitry and to execute the double execution instruction sequence in a second pass following the first pass using a second set of lanes of the SIMD circuitry.

Type: Application

Filed: December 7, 2018

Publication date: June 11, 2020

Inventors: Marek Targowski, Konrad Trifunovic

APPARATUS AND METHOD FOR EFFICIENTLY ACCESSING MEMORY WHEN PERFORMING A HORIZONTAL DATA REDUCTION

Publication number: 20190286430

Abstract: Apparatus and method for optimizing shader execution. For example, one embodiment of a graphics processing apparatus comprises: a plurality of execution units to execute shader programs; optimization detection circuitry and/or logic to identify one or more portions of shader program code to be optimized including one or more reduction operations which require read/write memory operations and associated barrier operations; and optimization circuitry and/or logic to optimize the shader program code by converting a plurality of the read/write memory operations to read/write register operations and removing one or more barrier operations to generate optimized shader program code; the execution units to execute the optimized shader program code.

Type: Application

Filed: March 15, 2018

Publication date: September 19, 2019

Inventor: Marek Targowski
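
The effect of the optimization in the abstract above, replacing memory reads and writes with register operations and dropping the associated barriers, can be illustrated with a small simulation. Both functions below are hypothetical models of a tree-style sum reduction, not code from the patent: the first counts the barrier that would follow each step when partial sums go through shared memory, while the second assumes all lanes belong to one SIMD thread running in lockstep, so no barrier is counted.

```python
from typing import List, Tuple


def _check_power_of_two(n: int) -> None:
    if n <= 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two number of values")


def reduce_via_memory(values: List[int]) -> Tuple[int, int]:
    """Tree reduction through a shared buffer; returns (sum, barriers needed)."""
    _check_power_of_two(len(values))
    shared = list(values)  # stands in for shared local memory
    barriers = 0
    stride = len(shared) // 2
    while stride > 0:
        for lane in range(stride):
            shared[lane] += shared[lane + stride]  # memory read and memory write
        barriers += 1  # every step must be synchronized before the next one reads
        stride //= 2
    return shared[0], barriers


def reduce_in_registers(values: List[int]) -> Tuple[int, int]:
    """Same reduction on register lanes of one lockstep thread; no barriers counted."""
    _check_power_of_two(len(values))
    regs = list(values)  # one register lane per value
    stride = len(regs) // 2
    while stride > 0:
        # All lanes update at once, modeled by building a fresh list per step.
        regs = [regs[lane] + regs[lane + stride] if lane < stride else regs[lane]
                for lane in range(len(regs))]
        stride //= 2
    return regs[0], 0


memory_result = reduce_via_memory([1, 2, 3, 4, 5, 6, 7, 8])
register_result = reduce_in_registers([1, 2, 3, 4, 5, 6, 7, 8])
```

Both versions produce the same sum; the difference the sketch is meant to show is only the number of barriers on the path.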

Apparatus and method for efficiently accessing memory when performing a horizontal data reduction

Patent number: 10409571

Abstract: Apparatus and method for optimizing shader execution. For example, one embodiment of a graphics processing apparatus comprises: a plurality of execution units to execute shader programs; optimization detection circuitry and/or logic to identify one or more portions of shader program code to be optimized including one or more reduction operations which require read/write memory operations and associated barrier operations; and optimization circuitry and/or logic to optimize the shader program code by converting a plurality of the read/write memory operations to read/write register operations and removing one or more barrier operations to generate optimized shader program code; the execution units to execute the optimized shader program code.

Type: Grant

Filed: March 15, 2018

Date of Patent: September 10, 2019

Assignee: Intel Corporation

Inventor: Marek Targowski

SMART PERFORMANCE OF SPILL FILL DATA TRANSFERS IN COMPUTING ENVIRONMENTS

Publication number: 20190057061

Abstract: A mechanism is described for facilitating smart spill/fill data transfers in computing environments. A method of embodiments, as described herein, includes facilitating dividing a kernel into regions including low pressure regions and high pressure regions, where the low pressure regions are associated with low use of registers hosted by a processor of a computing device, while the high pressure regions are associated with high use of the registers. The method may further include transferring of data between memory and the registers based on at least one of the low pressure regions and the high pressure regions.

Type: Application

Filed: August 16, 2017

Publication date: February 21, 2019

Applicant: Intel Corporation

Inventors: Marek Targowski, Konrad Trifunovic

Systems, methods, and computer program products for preemption of threads at a synchronization barrier

Patent number: 9652300

Abstract: Systems and methods for the processing of EU threads (also known as warps) in a thread group. The status of each EU thread in the group may be monitored, to determine if it is executing or if it is halted and waiting at a synchronization barrier. If certain threshold conditions are met, the waiting EU threads may be preempted to allow execution of threads from another thread group. The threshold conditions may include a minimum number of EUs in use, a minimum number of EU threads in the first thread group that are waiting at the synchronization barrier and/or a maximum number of EU threads that are still executing, and a minimum wait time for one or more of the EU threads waiting at the barrier.

Type: Grant

Filed: June 28, 2012

Date of Patent: May 16, 2017

Assignee: Intel Corporation

Inventor: Marek Targowski
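
The threshold conditions listed in the abstract above can be expressed as a simple predicate. This is a sketch under stated assumptions: the abstract says the conditions "may include" these checks and joins some of them with "and/or", whereas the hypothetical `should_preempt` function below requires all of them at once. The field names and example values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical threshold set; names and units are assumptions."""
    min_eus_in_use: int      # minimum number of EUs in use
    min_waiting: int         # minimum EU threads waiting at the barrier
    max_executing: int       # maximum EU threads still executing
    min_wait_time: float     # minimum wait time at the barrier (arbitrary units)


def should_preempt(eus_in_use: int, waiting: int, executing: int,
                   longest_wait: float, limits: Thresholds) -> bool:
    """Return True when the waiting EU threads may be preempted (all checks required here)."""
    return (eus_in_use >= limits.min_eus_in_use
            and waiting >= limits.min_waiting
            and executing <= limits.max_executing
            and longest_wait >= limits.min_wait_time)


limits = Thresholds(min_eus_in_use=6, min_waiting=4, max_executing=2, min_wait_time=100.0)
busy_group = should_preempt(eus_in_use=8, waiting=6, executing=1,
                            longest_wait=250.0, limits=limits)
fresh_group = should_preempt(eus_in_use=8, waiting=6, executing=1,
                             longest_wait=10.0, limits=limits)
```

Here `busy_group` satisfies every check, while `fresh_group` fails only the minimum wait time, so its threads would keep waiting at the barrier in this model.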

Reducing the number of sequential operations in an application to be performed on a shared memory cell

Patent number: 9449360

Abstract: Methods and apparatuses to reduce the number of sequential operations such as atomic operations in an application to be performed on a shared memory cell may be provided. A translation unit can detect in the application multiple atomic operations to be performed on the same memory and replace the multiple atomic operations with an equivalent single atomic operation. In some implementations, the application includes shader code. In some implementations, each of the multiple atomic operations increments a value stored at the same memory by an update amount. The translation unit may calculate the partial prefix sum over all the atomic operations and replace the multiple atomic operations with a single atomic operation to increment the value stored at memory by the sum of the update amounts.

Type: Grant

Filed: December 28, 2011

Date of Patent: September 20, 2016

Assignee: Intel Corporation

Inventors: Tomasz Janczak, Marek Targowski
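
The replacement of several atomic increments with one, as described in the abstract above, can be sketched with a prefix sum. The function name `merge_increments` is hypothetical, and the idea that each original operation recovers its own "previous value" from an exclusive prefix offset is an assumption about how the partial prefix sum would be used, not a statement from the patent.

```python
from itertools import accumulate
from typing import List, Tuple


def merge_increments(amounts: List[int]) -> Tuple[int, List[int]]:
    """Collapse increments on one memory cell into (total, per-operation offsets).

    total   -- the amount for the single replacement atomic increment
    offsets -- exclusive prefix sums: what each original operation would have
               seen added before it (an assumed use of the prefix sum)
    """
    inclusive = list(accumulate(amounts))
    total = inclusive[-1] if inclusive else 0
    offsets = [0] + inclusive[:-1] if inclusive else []
    return total, offsets


# Three increments (by 1, 4, and 3) on the same cell become one increment by 8.
total, offsets = merge_increments([1, 4, 3])

# Applying the single merged increment to a cell that currently holds 10.
cell_before = 10
cell_after = cell_before + total
values_seen = [cell_before + offset for offset in offsets]
```

With the merged form, only one sequential update touches the shared cell, while `values_seen` shows the values the individual increments would have observed had they run one after another.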

REDUCING THE NUMBER OF SEQUENTIAL OPERATIONS IN AN APPLICATION TO BE PERFORMED ON A SHARED MEMORY CELL

Publication number: 20140198110

Abstract: Methods and apparatuses to reduce the number of sequential operations such as atomic operations in an application to be performed on a shared memory cell may be provided. A translation unit can detect in the application multiple atomic operations to be performed on the same memory and replace the multiple atomic operations with an equivalent single atomic operation. In some implementations, the application includes shader code. In some implementations, each of the multiple atomic operations increments a value stored at the same memory by an update amount. The translation unit may calculate the partial prefix sum over all the atomic operations and replace the multiple atomic operations with a single atomic operation to increment the value stored at memory by the sum of the update amounts.

Type: Application

Filed: December 28, 2011

Publication date: July 17, 2014

Inventors: Tomasz Janczak, Marek Targowski

SYSTEMS, METHODS, AND COMPUTER PROGRAM PRODUCTS FOR PREEMPTION OF THREADS AT A SYNCHRONIZATION BARRIER

Publication number: 20140007111

Abstract: Systems and methods for the processing of EU threads (also known as warps) in a thread group. The status of each EU thread in the group may be monitored, to determine if it is executing or if it is halted and waiting at a synchronization barrier. If certain threshold conditions are met, the waiting EU threads may be preempted to allow execution of threads from another thread group. The threshold conditions may include a minimum number of EUs in use, a minimum number of EU threads in the first thread group that are waiting at the synchronization barrier and/or a maximum number of EU threads that are still executing, and a minimum wait time for one or more of the EU threads waiting at the barrier.

Type: Application

Filed: June 28, 2012

Publication date: January 2, 2014

Inventor: Marek Targowski