Patents by Inventor Subramaniam Maiyuran

Subramaniam Maiyuran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ACCUMULATOR POOLING MECHANISM

Publication number: 20200320662

Abstract: A processor is disclosed. The processor includes an execution unit having a register file having one or more banks of registers to store operand values, an accumulator comprising a pool of registers to store operand values determined to cause a conflict at register banks within the register file and cache circuitry to control storage of the operand values determined to cause a conflict at the register banks from the register file to the pool of registers.

Type: Application

Filed: April 8, 2019

Publication date: October 8, 2020

Applicant: Intel Corporation

Inventors: Guei-Yuan Lueh, Subramaniam Maiyuran, Wei-Yu Chen, Konrad Trifunovic, Supratim Pal, Chandra S. Gurram, Jorge E. Parra, Pratik J. Ashar, Tomasz Bujewski
MECHANISM TO PERFORM SINGLE PRECISION FLOATING POINT EXTENDED MATH OPERATIONS

Publication number: 20200319851

Abstract: A processor to facilitate execution of a single-precision floating point operation on an operand is disclosed. The processor includes one or more execution units, each having a plurality of floating point units to execute one or more instructions to perform the single-precision floating point operation on the operand, including performing a floating point operation on an exponent component of the operand; and performing a floating point operation on a mantissa component of the operand, comprising dividing the mantissa component into a first sub-component and a second sub-component, determining a result of the floating point operation for the first sub-component and determining a result of the floating point operation for the second sub-component, and returning a result of the floating point operation.

Type: Application

Filed: April 4, 2019

Publication date: October 8, 2020

Applicant: Intel Corporation

Inventors: Abhishek Rhisheekesan, Shashank Lakshminarayana, Subramaniam Maiyuran
METHOD AND APPARATUS FOR APPROXIMATION USING POLYNOMIALS

Publication number: 20200310800

Abstract: Methods and apparatus for approximation using polynomial functions are disclosed. In one embodiment, a processor comprises decoding and execution circuitry. The decoding circuitry is to decode an instruction, where the instruction comprises a first operand specifying an output location and a second operand specifying a plurality of data element values to be computed. The execution circuitry is to execute the decoded instruction. The execution includes to compute a result for each of the plurality of data element values using a polynomial function to approximate a complex function, where the computation uses coefficients stored in a lookup location for the complex function, and where data element values within different data element value ranges use different sets of coefficients. The execution further includes to store results of the computation in the output location.

Type: Application

Filed: March 27, 2019

Publication date: October 1, 2020

Inventors: Jorge PARRA, Dan BAUM, Robert CHAPPELL, Michael ESPIG, Varghese GEORGE, Alexander HEINECKE, Christopher HUGHES, Subramaniam MAIYURAN, Elmoustapha OULD-AHMED-VALL, Prasoonkumar SURTI, Ronen ZOHAR
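
The range-dependent coefficient lookup described in the abstract of METHOD AND APPARATUS FOR APPROXIMATION USING POLYNOMIALS can be illustrated with a minimal software sketch. It is not the claimed hardware: the target function (exp on [0, 1)), the four ranges, the degree-3 Taylor coefficients, and every name below are assumptions chosen for illustration.

```python
import math

NUM_RANGES = 4              # assumed number of input ranges on [0, 1)
WIDTH = 1.0 / NUM_RANGES    # width of each range


def build_table():
    """Build one coefficient set per range (the 'lookup location').

    Each entry stores the range midpoint and the degree-3 Taylor
    coefficients of exp expanded around that midpoint.
    """
    table = []
    for i in range(NUM_RANGES):
        mid = (i + 0.5) * WIDTH
        coeffs = [math.exp(mid) / math.factorial(k) for k in range(4)]
        table.append((mid, coeffs))
    return table


TABLE = build_table()


def approx_exp(x):
    """Approximate exp(x) for 0 <= x < 1 using the range's coefficients."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    index = min(int(x / WIDTH), NUM_RANGES - 1)  # select the coefficient set
    mid, coeffs = TABLE[index]
    t = x - mid
    result = 0.0
    for c in reversed(coeffs):  # Horner evaluation of the polynomial
        result = result * t + c
    return result


def approx_exp_vector(values):
    """Compute a result for each data element, as a vector instruction would."""
    return [approx_exp(v) for v in values]
```

Calling `approx_exp_vector([0.1, 0.6, 0.9])` evaluates each element with the coefficient set of its own range, which mirrors the statement that values in different ranges use different sets of coefficients.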
Dynamic thread splitting having multiple instruction pointers for the same thread

Patent number: 10789071

Abstract: Systems, apparatuses and methods may provide for associating a first instruction pointer with an IF block of a primary IF-ELSE conditional construct associated with a thread and activating a second instruction pointer in response to a dependency associated with the IF block. Additionally, the second instruction pointer may be associated with an ELSE block of the primary IF-ELSE conditional construct. In one example, the IF block and the ELSE block are executed, via the first instruction pointer and the second instruction pointer, one or more of independently from or parallel to one another.

Type: Grant

Filed: July 8, 2015

Date of Patent: September 29, 2020

Assignee: Intel Corporation

Inventors: Hema C. Nalluri, Supratim Pal, Subramaniam Maiyuran, Joy Chandra
SCALAR CORE INTEGRATION

Publication number: 20200293488

Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: JOYDEEP RAY, ARAVINDH ANANTARAMAN, ABHISHEK R. APPU, ALTUG KOKER, ELMOUSTAPHA OULD-AHMED-VALL, VALENTIN ANDREI, SUBRAMANIAM MAIYURAN, NICOLAS GALOPPO VON BORRIES, VARGHESE GEORGE, MIKE MACPHERSON, BEN ASHBAUGH, MURALI RAMADOSS, VIKRANTH VEMULAPALLI, WILLIAM SADLER, JONATHAN PEARCE, SUNGYE KIM
MEMORY PREFETCHING IN MULTIPLE GPU ENVIRONMENT

Publication number: 20200294179

Abstract: Embodiments are generally directed to memory prefetching in multiple GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher of each of the GPUs is to prefetch data from the memory to the cache of the GPU; and wherein the prefetcher of a GPU is prohibited from prefetching from a page that is not owned by the GPU or by the host processor.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Joydeep Ray, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Nicolas Galoppo von Borries, Varghese George, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran
SYSTEMS AND METHODS FOR SYNCHRONIZATION OF MULTI-THREAD LANES

Publication number: 20200293368

Abstract: Apparatuses to synchronize lanes that diverge or threads that drift are disclosed. In one embodiment, a graphics multiprocessor includes a queue having an initial state of groups with a first group having threads of first and second instruction types and a second group having threads of the first and second instruction types. A regroup engine (or regroup circuitry) regroups threads into a third group having threads of the first instruction type and a fourth group having threads of the second instruction type.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Valentin Andrei, Subramaniam Maiyuran, SungYe Kim, Varghese George, Altug Koker, Aravindh Anantaraman
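
The regrouping step in the abstract of SYSTEMS AND METHODS FOR SYNCHRONIZATION OF MULTI-THREAD LANES can be sketched in software as follows. This is only an illustration under assumptions: the `(thread_id, instruction_type)` pair layout, the function name, and the sample queue are hypothetical and do not come from the application.

```python
from collections import defaultdict


def regroup_by_instruction_type(groups):
    """Regroup threads so that each output group holds one instruction type.

    `groups` is a list of groups; each group is a list of
    (thread_id, instruction_type) pairs (an assumed representation).
    """
    regrouped = defaultdict(list)
    for group in groups:
        for thread_id, instr_type in group:
            regrouped[instr_type].append(thread_id)
    return dict(regrouped)


# Initial queue state: two groups, each mixing two instruction types.
queue = [
    [(0, "A"), (1, "B")],
    [(2, "B"), (3, "A")],
]
new_groups = regroup_by_instruction_type(queue)
```

Here `new_groups` plays the role of the third and fourth groups in the abstract, one per instruction type.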
DATA PREFETCHING FOR GRAPHICS DATA PROCESSING

Publication number: 20200293450

Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, JR., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
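
The hit-rate rule in the abstract of DATA PREFETCHING FOR GRAPHICS DATA PROCESSING reduces to a single comparison, sketched below. The counter names, the string labels, and the handling of a zero access count are assumptions made for this example, not details from the application.

```python
def choose_prefetch_level(l1_hits, l1_accesses, threshold):
    """Pick the cache level that a prefetch is allowed to fill.

    Assumption: when no accesses have been recorded, the hit rate is
    treated as 0.0, so prefetching into L1 is allowed.
    """
    hit_rate = l1_hits / l1_accesses if l1_accesses else 0.0
    # A high L1 hit rate limits the prefetch to L3; a low hit rate
    # allows the prefetch to reach L1.
    return "L3" if hit_rate >= threshold else "L1"
```

For example, `choose_prefetch_level(90, 100, 0.5)` limits the prefetch to L3, while `choose_prefetch_level(10, 100, 0.5)` allows it into L1.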
PREEMPTIVE PAGE FAULT HANDLING

Publication number: 20200293456

Abstract: Methods and apparatus relating to predictive page fault handling. In an example, an apparatus comprises a processor to receive a virtual address that triggered a page fault for a compute process, check a virtual memory space for a virtual memory allocation for the compute process that triggered the page fault and manage the page fault according to one of a first protocol in response to a determination that the virtual address that triggered the page fault is a last page in the virtual memory allocation for the compute process, or a second protocol in response to a determination that the virtual address that triggered the page fault is not a last page in the virtual memory allocation for the compute process. Other embodiments are also disclosed and claimed.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: MURALI RAMADOSS, VIKRANTH VEMULAPALLI, NIRAN COORAY, WILLIAM B. SADLER, JONATHAN D. PEARCE, MARIAN ALIN PETRE, BEN ASHBAUGH, ELMOUSTAPHA OULD-AHMED-VALL, NICOLAS GALOPPO VON BORRIES, ALTUG KOKER, ARAVINDH ANANTARAMAN, SUBRAMANIAM MAIYURAN, VARGHESE GEORGE, SUNGYE KIM, ANDREI VALENTIN
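
The protocol choice in the abstract of PREEMPTIVE PAGE FAULT HANDLING depends on whether the faulting address falls in the last page of the allocation. A minimal sketch of that test follows; the 4096-byte page size, the parameter names, and the `"first"`/`"second"` labels are assumptions, and the abstract does not say what either protocol does.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def select_protocol(fault_addr, alloc_base, alloc_size, page_size=PAGE_SIZE):
    """Return which protocol handles the fault.

    "first"  -> the faulting address lies in the last page of the allocation
    "second" -> the faulting address lies in any earlier page
    """
    if alloc_size <= 0:
        raise ValueError("allocation size must be positive")
    if not alloc_base <= fault_addr < alloc_base + alloc_size:
        raise ValueError("address is outside the allocation")
    last_page = (alloc_base + alloc_size - 1) // page_size
    fault_page = fault_addr // page_size
    return "first" if fault_page == last_page else "second"
```

With a four-page allocation starting at `0x10000`, a fault at `0x10000` selects the second protocol and a fault at `0x13008` selects the first.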
LOCAL MEMORY SHARING BETWEEN KERNELS

Publication number: 20200293367

Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a set of processing elements to execute one or more thread groups of a second kernel to be executed by the general-purpose graphics processor, an on-chip memory coupled to the set of processing elements, and a scheduler coupled with the set of processing elements, the scheduler to schedule the thread groups of the kernel to the set of processing elements, wherein the scheduler is to schedule a thread group of the second kernel to execute subsequent to a thread group of a first kernel, the thread group of the second kernel configured to access a region of the on-chip memory that contains data written by the thread group of the first kernel in response to a determination that the second kernel is dependent upon the first kernel.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Valentin Andrei, Aravindh Anantaraman, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Altug Koker, SungYe Kim, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran, Vasanth Ranganathan, Joydeep Ray
GRAPHICS SYSTEMS AND METHODS FOR ACCELERATING SYNCHRONIZATION USING FINE GRAIN DEPENDENCY CHECK AND SCHEDULING OPTIMIZATIONS BASED ON AVAILABLE SHARED MEMORY SPACE

Publication number: 20200293369

Abstract: Accelerated synchronization operations using fine grain dependency check are disclosed. A graphics multiprocessor includes a plurality of execution units and synchronization circuitry that is configured to determine availability of at least one execution unit. The synchronization circuitry to perform a fine grain dependency check of availability of dependent data or operands in shared local memory or cache when at least one execution unit is available.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Varghese George, Altug Koker, Aravindh Anantaraman, SungYe Kim, Valentin Andrei, Joydeep Ray
ON CHIP DENSE MEMORY FOR TEMPORAL BUFFERING

Publication number: 20200294182

Abstract: Apparatuses including general-purpose graphics processing units having on chip dense memory for temporal buffering are disclosed. In one embodiment, a graphics multiprocessor includes a plurality of compute engines to perform first computations to generate a first set of data, cache for storing data, and a high density memory that is integrated on chip with the plurality of compute engines and the cache. The high density memory to receive the first set of data, to temporarily store the first set of data, and to provide the first set of data to the cache during a first time period that is prior to a second time period when the plurality of compute engines will use the first set of data for second computations.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Varghese George, Altug Koker, Aravindh Anantaraman, Subramaniam Maiyuran, SungYe Kim, Valentin Andrei, Elmoustapha Ould-Ahmed-Vall, Joydeep Ray, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Prasoonkumar Surti, Mike Macpherson
SYSTEMS AND METHODS FOR EXPLOITING QUEUES AND TRANSITIONAL STORAGE FOR IMPROVED LOW-LATENCY HIGH-BANDWIDTH ON-DIE DATA RETRIEVAL

Publication number: 20200294178

Abstract: Apparatuses including general-purpose graphics processing units and graphics multiprocessors that exploit queues or transitional buffers for improved low-latency high-bandwidth on-die data retrieval are disclosed. In one embodiment, a graphics multiprocessor includes at least one compute engine to provide a request, a queue or transitional buffer, and logic coupled to the queue or transitional buffer. The logic is configured to cause a request to be transferred to a queue or transitional buffer for temporary storage without processing the request and to determine whether the queue or transitional buffer has a predetermined amount of storage capacity.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Aravindh Anantaraman, Altug Koker, Varghese George, Subramaniam Maiyuran, SungYe Kim, Valentin Andrei
Thread priority mechanism

Patent number: 10776156

Abstract: A processing apparatus is described. The apparatus includes a graphics processing unit (GPU), including a thread dispatcher to assign a priority class to each of a plurality of processing threads prior to dispatching the one or more processing threads, a plurality of execution units to process the threads, a shared resource coupled to each of the plurality of execution units and an arbitration unit to grant access to the shared resource to a first of the plurality of execution units based on the priority class of a thread being executed at the first execution unit.

Type: Grant

Filed: September 30, 2016

Date of Patent: September 15, 2020

Assignee: INTEL CORPORATION

Inventors: Altug Koker, Prasoonkumar Surti, Guei-Yuan Lueh, Subramaniam Maiyuran, Tomas G. Akenine-Moller, David J. Cowperthwaite, Balaji Vembu
REGISTER SHARING MECHANISM

Publication number: 20200285471

Abstract: An apparatus to facilitate register sharing is disclosed.

Type: Application

Filed: May 22, 2020

Publication date: September 10, 2020

Applicant: Intel Corporation

Inventors: PRATIK J. ASHAR, SUPRATIM PAL, SUBRAMANIAM MAIYURAN, WEI-YU CHEN, GUEI-YUAN LUEH
Single input multiple data processing mechanism

Patent number: 10769751

Abstract: A processing apparatus is described. The apparatus includes a graphics processing unit (GPU), including a register file having a plurality of channels to store data and an execution unit to examine data at each of the plurality of channels, read a data value from a first of the plurality of channels upon a determination that each of the plurality of channels has the same data and execute a single input multi data (SIMD) instruction based on the data value.

Type: Grant

Filed: August 19, 2019

Date of Patent: September 8, 2020

Assignee: INTEL CORPORATION

Inventors: Subramaniam Maiyuran, Jorge F. Garcia Pabon, Vikranth Vemulapalli, Chandra S. Gurram, Aditya Navale, Saurabh Sharma
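
The uniform-data check in the abstract of the Single input multiple data processing mechanism can be mimicked in software as shown below. Evaluating the operation once and broadcasting the result is an assumption about how the single value might be used; the function name and the returned evaluation count are illustrative only.

```python
def execute_simd(op, channels):
    """Apply `op` to every channel and return (results, evaluations).

    If all channels hold the same value, read only the first channel,
    evaluate `op` once, and broadcast the result (assumed behaviour).
    """
    if not channels:
        return [], 0
    first = channels[0]
    if all(value == first for value in channels):
        result = op(first)
        return [result] * len(channels), 1
    # Channels differ: fall back to one evaluation per channel.
    return [op(value) for value in channels], len(channels)
```

For instance, `execute_simd(lambda v: v * 2, [3, 3, 3, 3])` performs one evaluation, whereas `execute_simd(lambda v: v * 2, [1, 2])` performs two.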
Register bank conflict reduction for multi-threaded processor

Patent number: 10754651

Abstract: Embodiments are generally directed to register bank conflict reduction for multi-threaded processor execution units. An embodiment of an apparatus includes a processor including one or more execution units (EUs), at least a first execution unit (EU) to process a plurality of threads, the first EU including a register file including multiple register banks with each register bank including multiple registers, and one or more read multiplexers to read registers from the register file, wherein attempting to read more than one register from a single register bank of the register file in a same clock cycle generates a register bank conflict. Registers for each thread for the first EU are distributed across the register banks within the register file such that a first register for a first thread of the plurality of threads and a following second register for the first thread are located in different register banks within the register file.

Type: Grant

Filed: June 29, 2018

Date of Patent: August 25, 2020

Assignee: INTEL CORPORATION

Inventors: Chandra Gurram, Subramaniam Maiyuran, Buqi Cheng, Ashutosh Garg, Guei-Yuan Lueh, Wei-Yu Chen
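
The register distribution in the abstract of Register bank conflict reduction for multi-threaded processor can be pictured with a simple interleaved mapping. This sketch assumes two banks and a per-thread offset; neither the bank count nor the mapping formula comes from the patent, and `has_bank_conflict` expects distinct register indices.

```python
NUM_BANKS = 2  # assumed number of register banks


def bank_of(thread_id, reg_index, num_banks=NUM_BANKS):
    """Map a thread's register to a bank.

    Consecutive registers of one thread rotate across banks, so register
    r and register r + 1 never share a bank. The thread_id offset is an
    added assumption.
    """
    return (thread_id + reg_index) % num_banks


def has_bank_conflict(thread_id, reg_indices, num_banks=NUM_BANKS):
    """Return True if two registers read in one cycle share a bank."""
    banks = [bank_of(thread_id, r, num_banks) for r in reg_indices]
    return len(banks) != len(set(banks))
```

Under this mapping, reading registers 4 and 5 of a thread in the same cycle causes no conflict, while reading registers 4 and 6 does.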
HYBRID LOW POWER HOMOGENOUS GRAPHICS PROCESSING UNITS

Publication number: 20200210238

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

Type: Application

Filed: December 24, 2019

Publication date: July 2, 2020

Applicant: Intel Corporation

Inventors: Abhishek R Appu, Altug Koker, Balaji Vembu, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Kiran C. Veernapu, Subramaniam Maiyuran, Sanjeev S. Jahagirdar, Eric J. Asperheim, Guei-Yuan Lueh, David Puffer, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Josh B. Mastronarde, Linda L. Hurd, Travis T. Schluessler, Tomasz Janczak, Abhishek Venkatesh, Kai Xiao, Slawomir Grajewski
Recompiling GPU code based on spill/fill instructions and number of stall cycles

Patent number: 10698689

Abstract: An apparatus to facilitate register sharing is disclosed. The apparatus includes one or more processors to generate first machine code having a first General Purpose Register (GRF) per thread ratio, detect an occurrence of one or more spill/fill instructions in the first machine code, and generate second machine code having a second GRF per thread ratio upon a detection of one or more spill/fill instructions in the first machine code, wherein the second GRF per thread ratio is based on a disabling of a first of a plurality of hardware threads.

Type: Grant

Filed: September 1, 2018

Date of Patent: June 30, 2020

Assignee: Intel Corporation

Inventors: Pratik J. Ashar, Supratim Pal, Subramaniam Maiyuran, Wei-Yu Chen, Guei-Yuan Lueh
Divergent control flow for fused EUs

Patent number: 10699362

Abstract: Embodiments provide support for divergent control flow in heterogeneous compute operations on a fused execution unit. One embodiment provides for a processing apparatus comprising a fused execution unit including multiple graphics execution units having a common instruction pointer; logic to serialize divergent function calls by the fused execution unit, the logic configured to compare a call target of execution channels within the fused execution unit and create multiple groups of channels, each group of channels associated with a single call target; and wherein the fused execution unit is to execute a first group of channels via a first execution unit and a second group of channels via a second execution unit.

Type: Grant

Filed: June 23, 2016

Date of Patent: June 30, 2020

Assignee: INTEL CORPORATION

Inventors: Pratik J. Ashar, Guei-Yuan Ken Lueh, Kaiyu Chen, Subramaniam Maiyuran, Brent A. Schwartz, Darin M. Starkey
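
The channel grouping in the abstract of Divergent control flow for fused EUs can be sketched as a grouping of channel indices by call target. The round-robin assignment of groups to execution units, the function names, and the sample targets are assumptions for illustration; the patent's actual serialization logic is not described in this listing.

```python
def group_channels_by_call_target(call_targets):
    """Group channel indices so that each group shares one call target."""
    groups = {}
    for channel, target in enumerate(call_targets):
        groups.setdefault(target, []).append(channel)
    return groups


def assign_groups_to_eus(groups, num_eus=2):
    """Assign groups to execution units in round-robin order (assumed policy).

    Returns a list of (eu_index, call_target, channels) tuples.
    """
    schedule = []
    for position, (target, channels) in enumerate(groups.items()):
        schedule.append((position % num_eus, target, channels))
    return schedule


# Four channels whose function calls diverge into two targets.
targets = ["f", "g", "f", "g"]
groups = group_channels_by_call_target(targets)
schedule = assign_groups_to_eus(groups)
```

In this example the first group runs on execution unit 0 and the second on execution unit 1, matching the two-group case in the abstract.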
