Patents by Inventor Ashok Jagannathan
Ashok Jagannathan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240118892
Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips; and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
Type: Application
Filed: December 18, 2023
Publication date: April 11, 2024
Inventors: Swagath VENKATARAMANI, Dipankar DAS, Ashish RANJAN, Subarno BANERJEE, Sasikanth AVANCHA, Ashok JAGANNATHAN, Ajaya V. DURG, Dheemanth NAGARAJ, Bharat KAUL, Anand RAGHUNATHAN
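The tile arrangement this abstract describes is easier to picture as a data structure. The Python sketch below wires one column of compute-intensive tiles (forward-propagation, back-propagation, weight-gradient) between two memory-intensive tiles and links convolutional-layer chips to fully connected-layer chips; all class names, field names, and counts are invented for illustration and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Tile:
    kind: str  # "mem", "fwd", "bwd", or "wgrad"

@dataclass
class ComputeColumn:
    """One column: three compute-intensive tiles between two memory-intensive tiles."""
    tiles: List[Tile] = field(default_factory=lambda: [
        Tile("mem"),    # first memory-intensive tile
        Tile("fwd"),    # forward-propagation compute-intensive tile
        Tile("bwd"),    # back-propagation compute-intensive tile
        Tile("wgrad"),  # weight-gradient compute-intensive tile
        Tile("mem"),    # second memory-intensive tile
    ])

@dataclass
class ConvChip:
    columns: List[ComputeColumn]

@dataclass
class FCChip:
    conv_chips: List[ConvChip]  # each conv chip couples to a respective FC chip

# Hypothetical sizing: 2 FC chips, each linked to 2 conv chips of 4 columns.
fabric = [FCChip([ConvChip([ComputeColumn() for _ in range(4)])
                  for _ in range(2)]) for _ in range(2)]
print(sum(len(c.columns) for f in fabric for c in f.conv_chips), "columns total")
```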
-
Patent number: 11663135
Abstract: A fabric controller to provide a coherent accelerator fabric, including: a host interconnect to communicatively couple to a host device; a memory interconnect to communicatively couple to an accelerator memory; an accelerator interconnect to communicatively couple to an accelerator having a last-level cache (LLC); and an LLC controller configured to provide a bias check for memory access operations.
Type: Grant
Filed: December 20, 2021
Date of Patent: May 30, 2023
Assignee: Intel Corporation
Inventors: Ritu Gupta, Aravindh V. Anantaraman, Stephen R. Van Doren, Ashok Jagannathan
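The abstract does not spell out what the "bias check" looks like; a common reading, consistent with host/device bias schemes in coherent accelerator attach protocols, is a page-granular table that decides whether an access to accelerator memory can go direct or must be routed through the host's coherence flow. The Python sketch below is a minimal behavioral model under that assumption; the table layout and defaults are invented.

```python
from enum import Enum

class Bias(Enum):
    HOST = "host"      # host may hold cached copies; route through host coherence
    DEVICE = "device"  # accelerator owns the data; access local memory directly

PAGE_SIZE = 4096
bias_table = {}  # hypothetical page-granular table: page number -> Bias

def check_bias(addr: int) -> str:
    page = addr // PAGE_SIZE
    bias = bias_table.get(page, Bias.HOST)  # assume host bias until flipped
    if bias is Bias.DEVICE:
        return "direct access to accelerator memory"
    return "request routed through the host for coherence"

bias_table[0x1000 // PAGE_SIZE] = Bias.DEVICE
print(check_bias(0x1000))    # device-bias page: short local path
print(check_bias(0x200000))  # untracked page: coherent path via host
```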
-
Publication number: 20220198110
Abstract: A method is described. The method includes maintaining a synchronized count value in each of a plurality of logic chips within a same package. The method includes comparing the count value against a same looked-for count value in each of the plurality of logic chips. The method includes each of the plurality of logic chips recording in its respective local memory at least some of its state information in response to each of the plurality of logic chips recognizing within a same cycle that the count value has reached the same looked-for count value.
Type: Application
Filed: December 23, 2020
Publication date: June 23, 2022
Inventors: Shanker Raman NAGESH, Ashok JAGANNATHAN
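A minimal behavioral sketch of the mechanism, in Python: several chips advance a lockstep count value, compare it against the same looked-for value, and each records state into its own local memory in the cycle the values match. Names and the trigger value are invented for illustration.

```python
class LogicChip:
    def __init__(self, name: str):
        self.name = name
        self.state = 0
        self.local_memory = []  # per-chip buffer for recorded state

    def step(self, count: int, trigger: int):
        self.state += 1  # stand-in for real internal state changes
        if count == trigger:  # same comparison performed in every chip
            self.local_memory.append((count, self.state))

chips = [LogicChip(f"chip{i}") for i in range(3)]
TRIGGER = 5  # the same looked-for count value in each chip
for cycle in range(10):  # the synchronized count advances in lockstep
    for chip in chips:
        chip.step(cycle, TRIGGER)

for chip in chips:
    print(chip.name, chip.local_memory)  # every chip snapshotted at cycle 5
```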
-
Publication number: 20220114105
Abstract: A fabric controller to provide a coherent accelerator fabric, including: a host interconnect to communicatively couple to a host device; a memory interconnect to communicatively couple to an accelerator memory; an accelerator interconnect to communicatively couple to an accelerator having a last-level cache (LLC); and an LLC controller configured to provide a bias check for memory access operations.
Type: Application
Filed: December 20, 2021
Publication date: April 14, 2022
Applicant: Intel Corporation
Inventors: Ritu Gupta, Aravindh V. Anantaraman, Stephen R. Van Doren, Ashok Jagannathan
-
Patent number: 11263143
Abstract: A fabric controller is provided for a coherent accelerator fabric. The coherent accelerator fabric includes a host interconnect, a memory interconnect, and an accelerator interconnect. The host interconnect communicatively couples to a host device. The memory interconnect communicatively couples to an accelerator memory. The accelerator interconnect communicatively couples to an accelerator having a last-level cache (LLC). An LLC controller is provided that is configured to provide a bias check for memory access operations on the fabric.
Type: Grant
Filed: September 29, 2017
Date of Patent: March 1, 2022
Assignee: Intel Corporation
Inventors: Ritu Gupta, Aravindh V. Anantaraman, Stephen R. Van Doren, Ashok Jagannathan
-
Publication number: 20220050683
Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips; and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
Type: Application
Filed: October 26, 2021
Publication date: February 17, 2022
Inventors: Swagath VENKATARAMANI, Dipankar DAS, Ashish RANJAN, Subarno BANERJEE, Sasikanth AVANCHA, Ashok JAGANNATHAN, Ajaya V. DURG, Dheemanth NAGARAJ, Bharat KAUL, Anand RAGHUNATHAN
-
Publication number: 20210318980
Abstract: A processor unit comprising: a first controller to couple to a host processing unit over a first link; a second controller to couple to a second processor unit over a second link, wherein the second processor unit is to couple to the host processing unit via a third link; and circuitry to determine whether to send a cache coherent request to the host processing unit over the first link or over the second link via the second processor unit.
Type: Application
Filed: June 25, 2021
Publication date: October 14, 2021
Applicant: Intel Corporation
Inventors: Rahul Pal, Nayan Amrutlal Suthar, David M. Puffer, Ashok Jagannathan
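The abstract leaves the selection policy open, saying only that circuitry chooses between the direct link and the path through the peer processor unit. The sketch below fills that gap with a purely illustrative address-interleave rule; the real criterion could equally be bandwidth, congestion, or topology based.

```python
DIRECT = "first link: straight to the host processing unit"
VIA_PEER = "second link: via the peer processor unit (host reached over its third link)"

def route_coherent_request(addr: int) -> str:
    # Illustrative policy only: interleave by one bit of the cache-line address.
    return VIA_PEER if (addr >> 6) & 1 else DIRECT

print(route_coherent_request(0x0000))  # even line -> direct link
print(route_coherent_request(0x0040))  # odd line  -> routed via the peer
```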
-
Publication number: 20190303743
Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips; and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
Type: Application
Filed: September 27, 2016
Publication date: October 3, 2019
Inventors: Swagath VENKATARAMANI, Dipankar DAS, Ashish RANJAN, Subarno BANERJEE, Sasikanth AVANCHA, Ashok JAGANNATHAN, Ajaya V. DURG, Dheemanth NAGARAJ, Bharat KAUL, Anand RAGHUNATHAN
-
Patent number: 10339060
Abstract: System, method, and processor for enabling early deallocation of tracker entries which track memory accesses are described herein. One embodiment of a method includes: maintaining an RSF corresponding to a first processing unit of a plurality of processing units to track cache lines, wherein a cache line is tracked by the RSF if the cache line is stored in both a memory and one or more other processing units, the memory being coupled to and shared by the plurality of processing units; receiving a request to access a target cache line from a processing core of the first processing unit; allocating a tracker entry corresponding to the request, the tracker entry used to track a status of the request; performing a lookup in the RSF for the target cache line; and deallocating the tracker entry responsive to a detection that the target cache line is not tracked by the RSF.
Type: Grant
Filed: December 30, 2016
Date of Patent: July 2, 2019
Assignee: Intel Corporation
Inventors: Bahaa Fahim, Ashok Jagannathan, Jeffrey D. Chamberlain, Samuel D. Strom
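In Python pseudocode, the flow the abstract describes looks roughly like this: allocate a tracker entry for the request, probe the RSF, and free the entry early when the RSF says no other processing unit holds the line, so no cross-unit snoops are outstanding. Structure names are invented; this is a behavioral sketch, not the patent's implementation.

```python
# RSF: lines cached both in shared memory and in at least one other unit.
remote_sharer_filter = {0x80: {"unit1"}}  # line address -> remote holders
trackers = {}                             # request id -> tracked line

def handle_request(req_id: int, line: int) -> bool:
    trackers[req_id] = line               # allocate a tracker entry
    if line not in remote_sharer_filter:  # RSF lookup: no remote copies
        del trackers[req_id]              # early deallocation
        return True
    return False  # remote copies exist; hold the entry for snoop responses

print(handle_request(1, 0x40))  # True: entry freed early
print(handle_request(2, 0x80))  # False: line has a remote sharer
print(trackers)                 # only request 2 is still tracked
```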
-
Publication number: 20190102311
Abstract: A fabric controller to provide a coherent accelerator fabric, including: a host interconnect to communicatively couple to a host device; a memory interconnect to communicatively couple to an accelerator memory; an accelerator interconnect to communicatively couple to an accelerator having a last-level cache (LLC); and an LLC controller configured to provide a bias check for memory access operations.
Type: Application
Filed: September 29, 2017
Publication date: April 4, 2019
Inventors: Ritu Gupta, Aravindh V. Anantaraman, Stephen R. Van Doren, Ashok Jagannathan
-
Publication number: 20180189180
Abstract: System, method, and processor for enabling early deallocation of tracker entries which track memory accesses are described herein. One embodiment of a method includes: maintaining an RSF corresponding to a first processing unit of a plurality of processing units to track cache lines, wherein a cache line is tracked by the RSF if the cache line is stored in both a memory and one or more other processing units, the memory being coupled to and shared by the plurality of processing units; receiving a request to access a target cache line from a processing core of the first processing unit; allocating a tracker entry corresponding to the request, the tracker entry used to track a status of the request; performing a lookup in the RSF for the target cache line; and deallocating the tracker entry responsive to a detection that the target cache line is not tracked by the RSF.
Type: Application
Filed: December 30, 2016
Publication date: July 5, 2018
Inventors: Bahaa Fahim, Ashok Jagannathan, Jeffrey D. Chamberlain, Samuel D. Strom
-
Patent number: 9727475
Abstract: An apparatus and method are described for distributed snoop filtering. For example, one embodiment of a processor comprises: a plurality of cores to execute instructions and process data; first snoop logic to track a first plurality of cache lines stored in a mid-level cache (“MLC”) accessible by one or more of the cores, the first snoop logic to allocate entries for cache lines stored in the MLC and to deallocate entries for cache lines evicted from the MLC, wherein at least some of the cache lines evicted from the MLC are retained in a level 1 (L1) cache; and second snoop logic to track a second plurality of cache lines stored in a non-inclusive last level cache (NI LLC), the second snoop logic to allocate entries in the NI LLC for cache lines evicted from the MLC and to deallocate entries for cache lines stored in the MLC, wherein the second snoop logic is to store and maintain a first set of core valid bits to identify cores containing copies of the cache lines stored in the NI LLC.
Type: Grant
Filed: September 26, 2014
Date of Patent: August 8, 2017
Assignee: Intel Corporation
Inventors: Rahul Pal, Ishwar Agarwal, Yen-Cheng Liu, Joseph Nuzman, Ashok Jagannathan, Bahaa Fahim, Nithiyanandan Bashyam
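A behavioral sketch of the two cooperating filters (all names invented): one structure tracks lines resident in the mid-level caches, the other tracks lines in the non-inclusive LLC with core-valid bits, and an MLC eviction migrates tracking from the first to the second because an L1 may still hold the line.

```python
mlc_filter = {}  # line -> cores holding the line in their MLC
llc_filter = {}  # line -> core-valid bits (L1s that may still hold a copy)

def mlc_fill(line: int, core: int):
    mlc_filter.setdefault(line, set()).add(core)
    llc_filter.pop(line, None)  # tracking moves to the MLC-side filter

def mlc_evict(line: int, core: int):
    holders = mlc_filter.get(line, set())
    holders.discard(core)
    if not holders:
        mlc_filter.pop(line, None)
    # The evicting core's L1 may retain the line, so record a core-valid bit.
    llc_filter.setdefault(line, set()).add(core)

mlc_fill(0x40, core=0)
mlc_evict(0x40, core=0)
print(mlc_filter, llc_filter)  # {} {64: {0}}: snoop core 0's L1 if probed
```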
-
Patent number: 9507596
Abstract: A processor includes a core, a prefetcher, and a prefetcher control module. The prefetcher includes logic to make speculative prefetch requests through a memory subsystem for an element for execution by the core, and logic to store prefetched elements in a cache. The prefetcher control module includes logic to determine counts of memory accesses to two types of memory and, based upon the counts and the type of memory, reduce the speculative prefetch requests of the prefetcher.
Type: Grant
Filed: August 28, 2014
Date of Patent: November 29, 2016
Assignee: Intel Corporation
Inventors: Ashok Jagannathan, Prabhat Jain, Krishna N. Vinod, Avinash Sodani
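A small sketch of the control loop: count accesses per memory type and stop issuing speculative prefetches to a type once its count crosses a threshold. The two type names and the threshold are invented; the abstract only states that counts per memory type gate the prefetcher.

```python
class PrefetcherControl:
    def __init__(self, threshold: int):
        self.counts = {"type_a": 0, "type_b": 0}  # two kinds of memory
        self.threshold = threshold

    def record_access(self, mem_type: str):
        self.counts[mem_type] += 1

    def prefetch_allowed(self, mem_type: str) -> bool:
        # Reduce speculative prefetches once demand traffic is heavy.
        return self.counts[mem_type] < self.threshold

ctrl = PrefetcherControl(threshold=2)
ctrl.record_access("type_b"); ctrl.record_access("type_b")
print(ctrl.prefetch_allowed("type_b"))  # False: prefetches throttled
print(ctrl.prefetch_allowed("type_a"))  # True
```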
-
Patent number: 9430392
Abstract: Technologies for supporting large pages in hardware prefetchers are described. A processor includes a processor core comprising a pipeline, cache memory, and a hardware prefetcher coupled to the processor core and the cache memory. The hardware prefetcher is a region-based hardware prefetcher that tracks memory regions of a predefined region size defined by software to be executed by the processor. The hardware prefetcher receives incoming requests and tracks different memory regions of the predefined size with multiple streams in a stream table with stream entries. The hardware prefetcher generates a prefetch request and determines whether the prefetch request goes beyond a page boundary of the memory region being tracked. When it does, the hardware prefetcher creates a new stream entry to track the successive memory region, allowing subsequent prefetch requests to that region.
Type: Grant
Filed: March 26, 2014
Date of Patent: August 30, 2016
Assignee: Intel Corporation
Inventors: Prabhat Jain, Ashok Jagannathan
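The boundary-crossing behavior is the interesting part: instead of stalling at the edge of a tracked region, the prefetcher opens a new stream entry for the next region. A minimal Python model follows; the region size, stride, and table layout are illustrative, not the patent's.

```python
REGION_SIZE = 4096  # software-defined region size, e.g. one page
stream_table = {}   # region base -> next address to prefetch

def issue_prefetch(region_base: int, stride: int = 64) -> int:
    addr = stream_table.setdefault(region_base, region_base)
    nxt = addr + stride
    if nxt >= region_base + REGION_SIZE:
        # Crossing the page boundary: create a new stream entry for the
        # successive region so later prefetches can continue there.
        new_base = region_base + REGION_SIZE
        stream_table[new_base] = new_base
        return new_base
    stream_table[region_base] = nxt
    return nxt

addr = 0
for _ in range(64):  # 64 strides of 64 B walk through the 4 KiB region
    addr = issue_prefetch(0)
print(hex(addr), sorted(stream_table))  # a stream for region 0x1000 now exists
```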
-
Publication number: 20160092366
Abstract: An apparatus and method are described for distributed snoop filtering. For example, one embodiment of a processor comprises: a plurality of cores to execute instructions and process data; first snoop logic to track a first plurality of cache lines stored in a mid-level cache (“MLC”) accessible by one or more of the cores, the first snoop logic to allocate entries for cache lines stored in the MLC and to deallocate entries for cache lines evicted from the MLC, wherein at least some of the cache lines evicted from the MLC are retained in a level 1 (L1) cache; and second snoop logic to track a second plurality of cache lines stored in a non-inclusive last level cache (NI LLC), the second snoop logic to allocate entries in the NI LLC for cache lines evicted from the MLC and to deallocate entries for cache lines stored in the MLC, wherein the second snoop logic is to store and maintain a first set of core valid bits to identify cores containing copies of the cache lines stored in the NI LLC.
Type: Application
Filed: September 26, 2014
Publication date: March 31, 2016
Inventors: Rahul PAL, Ishwar AGARWAL, Yen-Cheng LIU, Joseph NUZMAN, Ashok JAGANNATHAN, Bahaa FAHIM, Nithiyanandan BASHYAM
-
Publication number: 20160062768
Abstract: A processor includes a core, a prefetcher, and a prefetcher control module. The prefetcher includes logic to make speculative prefetch requests through a memory subsystem for an element for execution by the core, and logic to store prefetched elements in a cache. The prefetcher control module includes logic to determine counts of memory accesses to two types of memory and, based upon the counts and the type of memory, reduce the speculative prefetch requests of the prefetcher.
Type: Application
Filed: August 28, 2014
Publication date: March 3, 2016
Inventors: Ashok Jagannathan, Prabhat Jain, Krishna N. Vinod, Avinash Sodani
-
Patent number: 9229879
Abstract: Embodiments of the present disclosure describe techniques and configurations to reduce power consumption using unmodified information in evicted cache lines. A method includes identifying unmodified information of a cache line stored in a cache of a processor, tracking the unmodified information using a bit vector comprising one or more bits to indicate the unmodified information of the cache line, and selectively suppressing a write operation or send operation for the unmodified information of the cache line that is evicted from the cache to an input/output (I/O) component coupled to the cache, the selective suppressing being based on the one or more bits, and the I/O component being an outer component external to the cache. Other embodiments may be described and/or claimed.
Type: Grant
Filed: July 11, 2011
Date of Patent: January 5, 2016
Assignee: Intel Corporation
Inventors: Mahesh K. Kumashikar, Ashok Jagannathan
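In sketch form: a bit vector marks which parts of an evicted line are unmodified, and the write or send operation is suppressed for exactly those parts. The chunk size and vector polarity below are invented for the example.

```python
CHUNK = 8  # bytes covered by each bit of the vector (illustrative)

def evict(line: bytes, unmodified_bits: int) -> list:
    """Return only the chunks that must actually be written/sent outward."""
    out = []
    for i in range(len(line) // CHUNK):
        if not (unmodified_bits >> i) & 1:  # bit set => unmodified => suppress
            out.append(line[i * CHUNK:(i + 1) * CHUNK])
    return out

line = bytes(range(64))
# Six of eight chunks flagged unmodified: only two leave the cache,
# saving power on the I/O path.
print(len(evict(line, unmodified_bits=0b11111010)))  # 2
```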
-
Publication number: 20150278099
Abstract: Technologies for supporting large pages in hardware prefetchers are described. A processor includes a processor core comprising a pipeline, cache memory, and a hardware prefetcher coupled to the processor core and the cache memory. The hardware prefetcher is a region-based hardware prefetcher that tracks memory regions of a predefined region size defined by software to be executed by the processor. The hardware prefetcher receives incoming requests and tracks different memory regions of the predefined size with multiple streams in a stream table with stream entries. The hardware prefetcher generates a prefetch request and determines whether the prefetch request goes beyond a page boundary of the memory region being tracked. When it does, the hardware prefetcher creates a new stream entry to track the successive memory region, allowing subsequent prefetch requests to that region.
Type: Application
Filed: March 26, 2014
Publication date: October 1, 2015
Inventors: PRABHAT JAIN, ASHOK JAGANNATHAN
-
Patent number: 8862828
Abstract: Method and apparatus to efficiently store and cache data. Cores of a processor and cache slices co-located with the cores may be grouped into a cluster. A memory space may be partitioned into address regions. The cluster may be associated with an address region from the address regions. Each memory address of the address region may be mapped to one or more of the cache slices grouped into the cluster. A cache access from one or more of the cores grouped into the cluster may be biased to the address region based on the association of the cluster with the address region.
Type: Grant
Filed: August 13, 2012
Date of Patent: October 14, 2014
Assignee: Intel Corporation
Inventors: Ravindra P. Saraf, Rahul Pal, Ashok Jagannathan
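The mapping the abstract describes can be shown in a few lines: an address region selects a cluster, and a line hash selects a slice within that cluster, so cores allocating in their own cluster's region stay local. Region size, counts, and the hash below are illustrative, not the patent's.

```python
NUM_CLUSTERS = 2
SLICES_PER_CLUSTER = 4
REGION_SIZE = 1 << 30  # each cluster is associated with a 1 GiB address region

def home_slice(addr: int):
    cluster = (addr // REGION_SIZE) % NUM_CLUSTERS  # address region -> cluster
    slice_idx = (addr >> 6) % SLICES_PER_CLUSTER    # line hash inside the cluster
    return cluster, slice_idx

print(home_slice(0x1000))              # homed in cluster 0
print(home_slice(REGION_SIZE + 0x40))  # homed in cluster 1
```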
-
Publication number: 20140006715
Abstract: Method and apparatus to efficiently store and cache data. Cores of a processor and cache slices co-located with the cores may be grouped into a cluster. A memory space may be partitioned into address regions. The cluster may be associated with an address region from the address regions. Each memory address of the address region may be mapped to one or more of the cache slices grouped into the cluster. A cache access from one or more of the cores grouped into the cluster may be biased to the address region based on the association of the cluster with the address region.
Type: Application
Filed: August 13, 2012
Publication date: January 2, 2014
Applicant: INTEL CORPORATION
Inventors: Ravindra P. Saraf, Rahul Pal, Ashok Jagannathan