Patents by Inventor Vasanth Ranganathan

Vasanth Ranganathan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Avoid thread switching in cache management

Patent number: 11055248

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to monitor a thread switching overhead parameter for an application executing in a processing system and in response to a determination that the thread switching overhead parameter exceeds a threshold, to activate a thread management algorithm to reduce thread switching in the processing system. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: October 11, 2019

Date of Patent: July 6, 2021

Assignee: INTEL CORPORATION

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Kiran C. Veernapu, Balaji Vembu, Vasanth Ranganathan, Prasoonkumar Surti
COORDINATION AND INCREASED UTILIZATION OF GRAPHICS PROCESSORS DURING INFERENCE

Publication number: 20210201438

Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, at training time, information relating to one or more tasks to be performed according to a training dataset relating to a processor including a graphics processor. The method may further include analyzing the information to determine one or more portions of hardware relating to the processor capable of supporting the one or more tasks, and configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks.

Type: Application

Filed: January 7, 2021

Publication date: July 1, 2021

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, DUKHWAN Kim
Dynamic voltage-frequency curve mangement

Patent number: 11048605

Abstract: Methods and apparatus relating to techniques for power management. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to generate a voltage/frequency curve for at least one of a core or a sub-core in a processor and manage an operating voltage level of the at least one of a core or a sub-core using the voltage/frequency curve. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: July 30, 2019

Date of Patent: June 29, 2021

Assignee: INTEL CORPORATION

Inventors: Nikos Kaburlasos, Balaji Vembu, Josh B. Mastronarde, Altug Koker, Eric C. Samson, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Vasanth Ranganathan, Sanjeev S. Jahagirdar
SECTOR CACHE FOR COMPRESSION

Publication number: 20210191872

Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.

Type: Application

Filed: March 3, 2021

Publication date: June 24, 2021

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
MECHANISM TO PARTITION A SHARED LOCAL MEMORY

Publication number: 20210191868

Abstract: An apparatus to facilitate partitioning of local memory is disclosed. The apparatus includes a plurality of execution units to execute a plurality of execution threads, a memory coupled to share access between the plurality of execution units and partitioning hardware to partition the memory to be used as a cache and as shared local memory (SLM), wherein the partitioning hardware partitions the memory based on a quantity of the plurality of execution threads executing on the execution units that are active.

Type: Application

Filed: December 23, 2019

Publication date: June 24, 2021

Applicant: Intel Corporation

Inventors: JOYDEEP RAY, VASANTH RANGANATHAN, BEN ASHBAUGH, JAMES VALERIO
DE-CENTRALIZED LOAD-BALANCING AT PROCESSORS

Publication number: 20210182120

Abstract: A mechanism is described for facilitating localized load-balancing for processors in computing devices. A method of embodiments, as described herein, includes facilitating hosting, at a processor of a computing device, a local load-balancing mechanism. The method may further include monitoring balancing of loads at the processor and serving as a local scheduler to maintain de-centralized load-balancing at the processor and between the processor and other one or more processors.

Type: Application

Filed: December 22, 2020

Publication date: June 17, 2021

Applicant: Intel Corporation

Inventors: Prasoonkumar Surti, David Cowperthwaite, Abhishek R. Appu, Joydeep Ray, Vasanth Ranganathan, Altug Koker, Balaji Vembu
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Publication number: 20210182058

Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.

Type: Application

Filed: February 5, 2021

Publication date: June 17, 2021

Applicant: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
ENHANCING HIERARCHICAL DEPTH BUFFER CULLING EFFICIENCY VIA MASK ACCUMULATION

Publication number: 20210174575

Abstract: Embodiments described herein provide for a technique to improve the culling efficiency of coarse depth testing. One embodiment provides for a graphics processor that includes a depth pipeline that is configured to perform a method to track a history of source fragments that are tested against a destination tile. When a combination of partial fragments sum to full coverage, the most conservative source far depth value is used instead of the previous destination far depth value.

Type: Application

Filed: December 9, 2019

Publication date: June 10, 2021

Applicant: Intel Corporation

Inventors: Saikat Mandal, Vasanth Ranganathan
DATA LOCALITY ENHANCEMENT FOR GRAPHICS PROCESSING UNITS

Publication number: 20210149680

Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the first processing resource and one or more consumer tasks executing on the second processing resource and move a data output from one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource. Other embodiments may be described and claimed.

Type: Application

Filed: November 11, 2020

Publication date: May 20, 2021

Applicant: Intel Corporation

Inventors: Christopher J. HUGHES, Prasoonkumar SURTI, Guei-Yuan LUEH, Adam T. LAKE, Jill BOYCE, Subramaniam MAIYURAN, Lidong XU, James M. HOLLAND, Vasanth RANGANATHAN, Nikos KABURLASOS, Altug KOKER, Abhishek R. APPU
SYSTOLIC ARITHMETIC ON SPARSE DATA

Publication number: 20210150770

Abstract: Embodiments described herein provided for an instruction and associated logic to enable a processing resource including a tensor accelerator to perform optimized computation of sparse submatrix operations. One embodiment provides hardware logic to apply a numerical transform to matrix data to increase the sparsity of the data. Increasing the sparsity may result in a higher compression ratio when the matrix data is compressed.

Type: Application

Filed: November 11, 2020

Publication date: May 20, 2021

Applicant: Intel Corporation

Inventors: ABHISHEK R. APPU, PRASOONKUMAR SURTI, JILL BOYCE, SUBRAMANIAM MAIYURAN, MICHAEL APODACA, ADAM T. LAKE, JAMES HOLLAND, VASANTH RANGANATHAN, ALTUG KOKER, LIDONG XU, NIKOS KABURLASOS
GRAPHICS PROCESSING UNIT PROCESSING AND CACHING IMPROVEMENTS

Publication number: 20210150663

Abstract: Embodiments described herein are generally directed to improvements relating to power, latency, bandwidth and/or performance issues relating to GPU processing/caching. According to one embodiment, a system includes a producer intellectual property (IP) (e.g., a media IP), a compute core (e.g., a GPU or an AI-specific core of the GPU), a streaming buffer logically interposed between the producer IP and the compute core. The producer IP is operable to consume data from memory and output results to the streaming buffer. The compute core is operable to perform AI inference processing based on data consumed from the streaming buffer and output AI inference processing results to the memory.

Type: Application

Filed: November 11, 2020

Publication date: May 20, 2021

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Durgaprasad Bilagi, Joydeep Ray, Scott Janus, Sanjeev Jahagirdar, Brent Insko, Lidong Xu, Abhishek R. Appu, James Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Xinmin Tian, Guei-Yuan Lueh, Changliang Wang
SYSTEMS AND METHODS FOR ERROR DETECTION AND CONTROL FOR EMBEDDED MEMORY AND COMPUTE ELEMENTS

Publication number: 20210149763

Abstract: Apparatuses including a graphics processing unit, graphics multiprocessor, or graphics processor having an error detection correction logic for cache memory or shared memory are disclosed. In one embodiment, a graphics multiprocessor includes cache or local memory for storing data and error detection correction circuitry integrated with or coupled to the cache or local memory. The error detection correction circuitry is configured to perform a tag read for data of the cache or local memory to check error detection correction information.

Type: Application

Filed: November 11, 2020

Publication date: May 20, 2021

Applicant: Intel Corporation

Inventors: Vasanth Ranganathan, Joydeep Ray, Abhishek R. Appu, Nikos Kaburlasos, Lidong Xu, Subramaniam Maiyuran, Altug Koker, Naveen Matam, James Holland, Brent Insko, Sanjeev Jahagirdar, Scott Janus, Durgaprasad Bilagi, Xinmin Tian
ENHANCED PROCESSOR FUNCTIONS FOR CALCULATION

Publication number: 20210149677

Abstract: Enhanced processor functions for calculation are described. An example of an apparatus includes one or more processors including one or more processing resources and a memory to store data, the data including data for compute operations. A processing resource of the one or more processing resources includes a configurable pipeline for calculation operations, and wherein the configurable pipeline may be utilized to perform both a normal instruction for a calculation in a certain precision and a systolic instruction for a calculation in a certain precision.

Type: Application

Filed: November 11, 2020

Publication date: May 20, 2021

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Lidong Xu, Abhishek R. Appu, James M. Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker
Dynamic precision for neural network compute operations

Patent number: 11010659

Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: April 24, 2017

Date of Patent: May 18, 2021

Assignee: INTEL CORPORATION

Inventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev S. Jahagirdar
FABRIC-BASED COMPRESSION/DECOMPRESSION FOR INTERNAL DATA TRANSFER

Publication number: 20210125379

Abstract: A mechanism is described for facilitating fabric-based compression and/or decompression of data at computing devices. A method of embodiments, as described herein, includes compressing contents of a data stream traveling through an internal fabric between a source component and a destination component, wherein the contents are compressed on the internal fabric.

Type: Application

Filed: October 20, 2020

Publication date: April 29, 2021

Applicant: Intel Corporation

Inventors: Altug Koker, Vasanth Ranganathan, Joydeep Ray, Abhishek R. Appu
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Publication number: 20210124579

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

Type: Application

Filed: December 9, 2020

Publication date: April 29, 2021

Applicant: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
REGISTER SPILL/FILL USING SHARED LOCAL MEMORY SPACE

Publication number: 20210125581

Abstract: A mechanism is described for facilitating using of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.

Type: Application

Filed: October 5, 2020

Publication date: April 29, 2021

Applicant: Intel Corporation

Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, JR., Wei-Yu Chen, Subramaniam M. Maiyuran
Memory compression hashing mechanism

Patent number: 10983906

Abstract: An apparatus to facilitate memory data compression is disclosed. The apparatus includes a memory and having a plurality of banks to store main data and metadata associated with the main data and a memory management unit (MMU) coupled to the plurality of banks to perform a hash function to compute indices into virtual address locations in memory for the main data and the metadata and adjust the metadata virtual address locations to store each adjusted metadata virtual address location in a bank storing the associated main data.

Type: Grant

Filed: March 18, 2019

Date of Patent: April 20, 2021

Assignee: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Niranjan Cooray, Prasoonkumar Surti, Sudhakar Kamma, Vasanth Ranganathan
ARCHITECTURE FOR BLOCK SPARSE OPERATIONS ON A SYSTOLIC ARRAY

Publication number: 20210103550

Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides for data aware sparsity via compressed bitstreams. One embodiment provides for block sparse dot product instructions. One embodiment provides for a depth-wise adapter for a systolic array.

Type: Application

Filed: December 15, 2020

Publication date: April 8, 2021

Applicant: Intel Corporation

Inventors: Abhishek Appu, Subramaniam Maiyuran, Mike Macpherson, Fangwen Fu, Jiasheng Chen, Varghese George, Vasanth Ranganathan, Ashutosh Garg, Joydeep Ray
Power consumption management for communication bus

Patent number: 10955896

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive data for a current write operation to a memory, determine a number of bits in the received data for the current write operation to the memory which have changed from a previous write operation to the memory and in response to a determination that the number of bits in the received data for the current write operation to the memory which have changed from a previous write operation to the memory exceeds a threshold, to toggle a plurality of bits in the data for the current write operation to create an encoded data set and set an indicator bit to a value which indicates that the plurality of bits have been toggled. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: February 5, 2020

Date of Patent: March 23, 2021

Assignee: INTEL CORPORATION

Inventors: Abhishek R. Appu, Altug Koker, Eric J. Hoekstra, Kiran C. Veernapu, Prasoonkumar Surti, Vasanth Ranganathan, Kamal Sinha, Balaji Vembu, Eric J. Asperheim, Sanjeev S. Jahagirdar, Joydeep Ray

prev … 4 5 6 7 8 9 10 11 12 … next