Patents by Inventor Joydeep Ray

Joydeep Ray has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

GRAPHICS SYSTEMS AND METHODS FOR ACCELERATING SYNCHRONIZATION USING FINE GRAIN DEPENDENCY CHECK AND SCHEDULING OPTIMIZATIONS BASED ON AVAILABLE SHARED MEMORY SPACE

Publication number: 20200293369

Abstract: Accelerated synchronization operations using fine grain dependency check are disclosed. A graphics multiprocessor includes a plurality of execution units and synchronization circuitry that is configured to determine availability of at least one execution unit. The synchronization circuitry to perform a fine grain dependency check of availability of dependent data or operands in shared local memory or cache when at least one execution unit is available.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Varghese George, Altug Koker, Aravindh Anantaraman, SungYe Kim, Valentin Andrei, Joydeep Ray
DATA PREFETCHING FOR GRAPHICS DATA PROCESSING

Publication number: 20200293450

Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, JR., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
ON CHIP DENSE MEMORY FOR TEMPORAL BUFFERING

Publication number: 20200294182

Abstract: Apparatuses including general-purpose graphics processing units having on chip dense memory for temporal buffering are disclosed. In one embodiment, a graphics multiprocessor includes a plurality of compute engines to perform first computations to generate a first set of data, cache for storing data, and a high density memory that is integrated on chip with the plurality of compute engines and the cache. The high density memory to receive the first set of data, to temporarily store the first set of data, and to provide the first set of data to the cache during a first time period that is prior to a second time period when the plurality of compute engines will use the first set of data for second computations.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Varghese George, Altug Koker, Aravindh Anantaraman, Subramaniam Maiyuran, SungYe Kim, Valentin Andrei, Elmoustapha Ould-Ahmed-Vall, Joydeep Ray, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Prasoonkumar Surti, Mike Macpherson
SCALAR CORE INTEGRATION

Publication number: 20200293488

Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: JOYDEEP RAY, ARAVINDH ANANTARAMAN, ABHISHEK R. APPU, ALTUG KOKER, ELMOUSTAPHA OULD-AHMED-VALL, VALENTIN ANDREI, SUBRAMANIAM MAIYURAN, NICOLAS GALAPPO VON BORRIES, VARGHESE GEORGE, MIKE MACPHERSON, BEN ASHBAUGH, MURALI RAMADOSS, VIKRANTH VEMULAPALLI, WILLIAM SADLER, JONATHAN PEARCE, SUNGYE KIM
MEMORY PREFETCHING IN MULTIPLE GPU ENVIRONMENT

Publication number: 20200294179

Abstract: Embodiments are generally directed to memory prefetching in multiple GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher of each of the GPUs is to prefetch data from the memory to the cache of the GPU; and wherein the prefetcher of a GPU is prohibited from prefetching from a page that is not owned by the GPU or by the host processor.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Joydeep Ray, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Nicolas Galoppo von Borries, Varghese George, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran
LOCAL MEMORY SHARING BETWEEN KERNELS

Publication number: 20200293367

Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a set of processing elements to execute one or more thread groups of a second kernel to be executed by the general-purpose graphics processor, an on-chip memory coupled to the set of processing elements, and a scheduler coupled with the set of processing elements, the scheduler to schedule the thread groups of the kernel to the set of processing elements, wherein the scheduler is to schedule a thread group of the second kernel to execute subsequent to a thread group of a first kernel, the thread group of the second kernel configured to access a region of the on-chip memory that contains data written by the thread group of the first kernel in response to a determination that the second kernel is dependent upon the first kernel.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Applicant: Intel Corporation

Inventors: Valentin Andrei, Aravindh Anantaraman, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Altug Koker, SungYe Kim, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran, Vasanth Ranganathan, Joydeep Ray
System and method to support multiple walkers per command

Patent number: 10776897

Abstract: Embodiments described herein provide an apparatus comprising a processor to configure a plurality of contexts of a command engine to execute a graphics workload comprising a plurality of walkers, allocate, from a pool of execution units of a graphics processor, a subset of execution units to each walker in the plurality of walkers based at least in part on the predetermined number of walkers configured for the context, for each context in the plurality of contexts, dispatch one or more walkers of the plurality of walkers to the execution units, and upon dispatch of the one or more walkers of the plurality of walkers, write an opcode to a computer-readable memory indicating that the dispatch of the walker is complete, wherein the opcode comprises dependency data for the one or more walkers of the plurality of walkers. Other embodiments may be described and claimed.

Type: Grant

Filed: March 8, 2019

Date of Patent: September 15, 2020

Assignee: INTEL CORPORATION

Inventors: James Valerio, Vasanth Ranganathan, Joydeep Ray, Abhishek R. Appu, Ben J. Ashbaugh, Brandon Fliflet, Jeffery S. Boles, Srinivasan Embar Raghukrishnan, Rahul Kulkarni
GRAPHICS SCHEDULING MECHANISM

Publication number: 20200285480

Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.

Type: Application

Filed: March 20, 2020

Publication date: September 10, 2020

Applicant: Intel Corporation

Inventors: Balaji Vembu, Abhishek R. Appu, Joydeep Ray, Altug Koker
SYSTEM AND METHOD TO SUPPORT MULTIPLE WALKERS PER COMMAND

Publication number: 20200286201

Abstract: Embodiments described herein provide an apparatus comprising a processor to configure a plurality of contexts of a command engine to execute a graphics workload comprising a plurality of walkers, allocate, from a pool of execution units of a graphics processor, a subset of execution units to each walker in the plurality of walkers based at least in part on the predetermined number of walkers configured for the context, for each context in the plurality of contexts, dispatch one or more walkers of the plurality of walkers to the execution units, and upon dispatch of the one or more walkers of the plurality of walkers, write an opcode to a computer-readable memory indicating that the dispatch of the walker is complete, wherein the opcode comprises dependency data for the one or more walkers of the plurality of walkers. Other embodiments may be described and claimed.

Type: Application

Filed: March 8, 2019

Publication date: September 10, 2020

Applicant: Intel Corporation

Inventors: James Valerio, Vasanth Ranganathan, Joydeep Ray, Abhishek R. Appu, Ben J. Ashbaugh, Brandon Fliflet, Jeffery S. Boles, Srinivasan Embar Raghukrishnan, Rahul Kulkarni
SCATTER GATHER ENGINE

Publication number: 20200286277

Abstract: In an example, an apparatus comprises a plurality of execution units, and logic, at least partially including hardware logic, to create a scatter gather list in memory and collect a plurality of operating statistics for the plurality of execution units using the scatter gather list. Other embodiments are also disclosed and claimed.

Type: Application

Filed: March 27, 2020

Publication date: September 10, 2020

Applicant: Intel Corporation

Inventors: Balaji Vembu, Murali Ramadoss, David I. Standring, Shruti A. Sethi, Jeffrey S. Frizzell, Alan M. Curtis, Abhishek R. Appu, Joydeep Ray, Altug Koker
Apparatus and method for memory management in a graphics processing environment

Patent number: 10769078

Abstract: An apparatus and method are described for implementing memory management in a graphics processing system. For example, one embodiment of an apparatus comprises: a first plurality of graphics processing resources to execute graphics commands and process graphics data; a first memory management unit (MMU) to communicatively couple the first plurality of graphics processing resources to a system-level MMU to access a system memory; a second plurality of graphics processing resources to execute graphics commands and process graphics data; a second MMU to communicatively couple the second plurality of graphics processing resources to the first MMU; wherein the first MMU is configured as a master MMU having a direct connection to the system-level MMU and the second MMU comprises a slave MMU configured to send memory transactions to the first MMU, the first MMU either servicing a memory transaction or sending the memory transaction to the system-level MMU on behalf of the second MMU.

Type: Grant

Filed: June 26, 2019

Date of Patent: September 8, 2020

Assignee: Intel Corporation

Inventors: Niranjan L. Cooray, Abhishek R. Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Pattabhiraman K, David Puffer, David J. Cowperthwaite, Rajesh M. Sankaran, Satyeshwar Singh, Sameer Kp, Ankur N. Shah, Kun Tian
Avoid cache lookup for cold cache

Patent number: 10769072

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive, in a read/modify/write (RMW) pipeline, a cache access request from a requestor, wherein the cache request comprises a cache set identifier associated with requested data in the cache set, determine whether the cache set associated with the cache set identifier is in an inaccessible invalid state, and in response to a determination that the cache set is in an inaccessible state or an invalid state, to terminate the cache access request. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: February 15, 2019

Date of Patent: September 8, 2020

Assignee: INTEL CORPORATION

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Prasoonkumar Surti, Kamal Sinha, Kiran C. Veernapu, Balaji Vembu
Smart compression/decompression schemes for efficiency and superior results

Patent number: 10769818

Abstract: A mechanism is described for facilitating smart compression/decompression schemes at computing devices. A method of embodiments, as described herein, includes unifying a first compression scheme relating to three-dimensional (3D) content and a second compression scheme relating to media content into a unified compression scheme to perform compression of one or more of the 3D content and the media content relating to a processor including a graphics processor.

Type: Grant

Filed: April 9, 2017

Date of Patent: September 8, 2020

Assignee: INTEL CORPORATION

Inventors: Abhishek R. Appu, Kiran C. Veernapu, Prasoonkumar Surti, Joydeep Ray, Altug Koker, Eric G. Liskay
Source synchronized signaling mechanism

Patent number: 10769083

Abstract: An apparatus to facilitate source synchronous signaling is disclosed. The apparatus includes transfer protocol logic to provide for source synchronous transfer of data within an interconnect fabric, including one or more synchronizers having logic to a transmit data signal and a source clock (clk) signal during the transfer of data.

Type: Grant

Filed: August 14, 2019

Date of Patent: September 8, 2020

Assignee: INTEL CORPORATION

Inventors: Altug Koker, Joydeep Ray, Vasanth Ranganathan, Abhishek R. Appu
APPARATUS AND METHOD FOR DYNAMIC PROVISIONING, QUALITY OF SERVICE, AND SCHEDULING IN A GRAPHICS PROCESSOR

Publication number: 20200278938

Abstract: An apparatus and method for dynamic provisioning and traffic control on a memory fabric.

Type: Application

Filed: December 2, 2019

Publication date: September 3, 2020

Inventors: Balaji VEMBU, Altug KOKER, Joydeep RAY, Abhishek R. APPU, Pattabhiraman K, Niranjan L. COORAY
Shutting down GPU components in response to unchanged scene detection

Patent number: 10761591

Abstract: Methods and apparatus relating to techniques for shutting down one or more GPU (Graphics Processing Unit) components in response to unchanged scene detection are described. In one embodiment, one or more components of a processor enter a low power consumption state in response to a determination that a scene to be displayed is static. The static scene is displayed on a display device (e.g., based on information to be retrieved from memory) for as long as no change to the static scene is detected. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: April 1, 2017

Date of Patent: September 1, 2020

Assignee: Intel Corporation

Inventors: Prasoonkumar Surti, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Deepak S. Vembar, Abhishek R. Appu, Ankur N. Shah
Post-packaging environment recovery of graphics on-die memory

Patent number: 10762978

Abstract: Systems, apparatuses and methods may provide for technology that identifies a redundant portion of a packaged on-die memory and detects, during a field test of the packaged on-die memory, one or more failed cells in the packaged on-die memory. Additionally, one or more memory cells in the redundant portion may be substituted for the one or more failed memory cells.

Type: Grant

Filed: May 22, 2019

Date of Patent: September 1, 2020

Assignee: Intel Corporation

Inventors: Altug Koker, Travis T. Schluessler, Ankur N. Shah, Abhishek R. Appu, Joydeep Ray, Jonathan Kennedy
Interconnect fabric link width reduction to reduce instantaneous power consumption

Patent number: 10761589

Abstract: Described herein are various embodiments of reducing dynamic power consumption within a processor device. One embodiment provides a technique for dynamic link width reduction based on the instantaneous throughput demand for client of an interconnect fabric. One embodiment provides for a parallel processor comprising an interconnect fabric including a dynamic bus module to configure a bus width for a client of the interconnect fabric based on throughput demand from the client.

Type: Grant

Filed: April 21, 2017

Date of Patent: September 1, 2020

Assignee: Intel Corporation

Inventors: Mohammed Tameem, Altug Koker, Kiran C. Veernapu, Abhishek R. Appu, Ankur N. Shah, Joydeep Ray, Travis T. Schluessler, Jonathan Kennedy
PROCESSOR POWER MANAGEMENT

Publication number: 20200272215

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.

Type: Application

Filed: February 28, 2020

Publication date: August 27, 2020

Applicant: INTEL CORPORATION

Inventors: Altug Koker, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Eric J. Hoekstra, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy
Low granularity coarse depth test efficiency enhancement

Patent number: 10748242

Abstract: Briefly, in accordance with one or more embodiments, an apparatus comprises a processor to compute depth values for one or more 4×4 blocks of pixels using 16 source interpolators and 8 destination interpolators on an incoming fragment of pixel data if the destination is in min/max format, and a memory to store a depth test result performed on the one or more 4×4 blocks of pixels. Otherwise the processor is to compute depth values for one or more 8×4 blocks of pixels using 16 source interpolators and 16 destination interpolators if the destination is in plane format.

Type: Grant

Filed: July 26, 2018

Date of Patent: August 18, 2020

Assignee: Intel Corporation

Inventors: Vasanth Ranganathan, Saikat Mandal, Karol A. Szerszen, Saurabh Sharma, Vamsee Vardhan Chivukula, Abhishek R. Appu, Joydeep Ray, Prasoonkumar Surti, Altug Koker

prev … 30 31 32 33 34 35 36 37 38 … next