Patents by Inventor Ashok Jagannathan

Ashok Jagannathan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240118892
    Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
    Type: Application
    Filed: December 18, 2023
    Publication date: April 11, 2024
    Inventors: Swagath Venkataramani, Dipankar Das, Ashish Ranjan, Subarno Banerjee, Sasikanth Avancha, Ashok Jagannathan, Ajaya V. Durg, Dheemanth Nagaraj, Bharat Kaul, Anand Raghunathan
  • Patent number: 11663135
    Abstract: A fabric controller to provide a coherent accelerator fabric, including: a host interconnect to communicatively couple to a host device; a memory interconnect to communicatively couple to an accelerator memory; an accelerator interconnect to communicatively couple to an accelerator having a last-level cache (LLC); and an LLC controller configured to provide a bias check for memory access operations.
    Type: Grant
    Filed: December 20, 2021
    Date of Patent: May 30, 2023
    Assignee: Intel Corporation
    Inventors: Ritu Gupta, Aravindh V. Anantaraman, Stephen R. Van Doren, Ashok Jagannathan
  • Publication number: 20220198110
    Abstract: A method is described. The method includes maintaining a synchronized count value in each of a plurality of logic chips within a same package. The method includes comparing the count value against a same looked for count value in each of the plurality of logic chips. The method includes each of the plurality of logic chips recording in its respective local memory at least some of its state information in response to each of the plurality of logic chips recognizing within a same cycle that the count value has reached the same looked for count value.
    Type: Application
    Filed: December 23, 2020
    Publication date: June 23, 2022
    Inventors: Shanker Raman Nagesh, Ashok Jagannathan
  • Publication number: 20220114105
    Abstract: A fabric controller to provide a coherent accelerator fabric, including: a host interconnect to communicatively couple to a host device; a memory interconnect to communicatively couple to an accelerator memory; an accelerator interconnect to communicatively couple to an accelerator having a last-level cache (LLC); and an LLC controller configured to provide a bias check for memory access operations.
    Type: Application
    Filed: December 20, 2021
    Publication date: April 14, 2022
    Applicant: Intel Corporation
    Inventors: Ritu Gupta, Aravindh V. Anantaraman, Stephen R. Van Doren, Ashok Jagannathan
  • Patent number: 11263143
    Abstract: A fabric controller is provided for a coherent accelerator fabric. The coherent accelerator fabric includes a host interconnect, a memory interconnect, and an accelerator interconnect. The host interconnect communicatively couples to a host device. The memory interconnect communicatively couples to an accelerator memory. The accelerator interconnect communicatively couples to an accelerator having a last-level cache (LLC). An LLC controller is provided that is configured to provide a bias check for memory access operations on the fabric.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: March 1, 2022
    Assignee: Intel Corporation
    Inventors: Ritu Gupta, Aravindh V. Anantaraman, Stephen R. Van Doren, Ashok Jagannathan
  • Publication number: 20220050683
    Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
    Type: Application
    Filed: October 26, 2021
    Publication date: February 17, 2022
    Inventors: Swagath Venkataramani, Dipankar Das, Ashish Ranjan, Subarno Banerjee, Sasikanth Avancha, Ashok Jagannathan, Ajaya V. Durg, Dheemanth Nagaraj, Bharat Kaul, Anand Raghunathan
  • Publication number: 20210318980
    Abstract: A processor unit comprising a first controller to couple to a host processing unit over a first link; a second controller to couple to a second processor unit over a second link, wherein the second processor unit is to couple to the host central processing unit via a third link; and circuitry to determine whether to send a cache coherent request to the host central processing unit over the first link or over the second link via the second processing unit.
    Type: Application
    Filed: June 25, 2021
    Publication date: October 14, 2021
    Applicant: Intel Corporation
    Inventors: Rahul Pal, Nayan Amrutlal Suthar, David M. Puffer, Ashok Jagannathan
  • Publication number: 20190303743
    Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
    Type: Application
    Filed: September 27, 2016
    Publication date: October 3, 2019
    Inventors: Swagath Venkataramani, Dipankar Das, Ashish Ranjan, Subarno Banerjee, Sasikanth Avancha, Ashok Jagannathan, Ajaya V. Durg, Dheemanth Nagaraj, Bharat Kaul, Anand Raghunathan
  • Patent number: 10339060
    Abstract: System, method, and processor for enabling early deallocation of tracker entries which track memory accesses are described herein. One embodiment of a method includes: maintaining an RSF corresponding to a first processing unit of a plurality of processing units to track cache lines, wherein a cache line is tracked by the RSF if the cache line is stored in both a memory and one or more other processing units, the memory being coupled to and shared by the plurality of processing units; receiving a request to access a target cache line from a processing core of the first processing unit; allocating a tracker entry corresponding to the request, the tracker entry used to track a status of the request; performing a lookup in the RSF for the target cache line; and deallocating the tracker entry responsive to a detection that the target cache line is not tracked by the RSF.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: July 2, 2019
    Assignee: Intel Corporation
    Inventors: Bahaa Fahim, Ashok Jagannathan, Jeffrey D. Chamberlain, Samuel D. Strom
  • Publication number: 20190102311
    Abstract: A fabric controller to provide a coherent accelerator fabric, including: a host interconnect to communicatively couple to a host device; a memory interconnect to communicatively couple to an accelerator memory; an accelerator interconnect to communicatively couple to an accelerator having a last-level cache (LLC); and an LLC controller configured to provide a bias check for memory access operations.
    Type: Application
    Filed: September 29, 2017
    Publication date: April 4, 2019
    Inventors: Ritu Gupta, Aravindh V. Anantaraman, Stephen R. Van Doren, Ashok Jagannathan
  • Publication number: 20180189180
    Abstract: System, method, and processor for enabling early deallocation of tracker entries which track memory accesses are described herein. One embodiment of a method includes: maintaining an RSF corresponding to a first processing unit of a plurality of processing units to track cache lines, wherein a cache line is tracked by the RSF if the cache line is stored in both a memory and one or more other processing units, the memory being coupled to and shared by the plurality of processing units; receiving a request to access a target cache line from a processing core of the first processing unit; allocating a tracker entry corresponding to the request, the tracker entry used to track a status of the request; performing a lookup in the RSF for the target cache line; and deallocating the tracker entry responsive to a detection that the target cache line is not tracked by the RSF.
    Type: Application
    Filed: December 30, 2016
    Publication date: July 5, 2018
    Inventors: Bahaa Fahim, Ashok Jagannathan, Jeffrey D. Chamberlain, Samuel D. Strom
  • Patent number: 9727475
    Abstract: An apparatus and method are described for distributed snoop filtering. For example, one embodiment of a processor comprises: a plurality of cores to execute instructions and process data; first snoop logic to track a first plurality of cache lines stored in a mid-level cache (“MLC”) accessible by one or more of the cores, the first snoop logic to allocate entries for cache lines stored in the MLC and to deallocate entries for cache lines evicted from the MLC, wherein at least some of the cache lines evicted from the MLC are retained in a level 1 (L1) cache; and second snoop logic to track a second plurality of cache lines stored in a non-inclusive last level cache (NI LLC), the second snoop logic to allocate entries in the NI LLC for cache lines evicted from the MLC and to deallocate entries for cache lines stored in the MLC, wherein the second snoop logic is to store and maintain a first set of core valid bits to identify cores containing copies of the cache lines stored in the NI LLC.
    Type: Grant
    Filed: September 26, 2014
    Date of Patent: August 8, 2017
    Assignee: Intel Corporation
    Inventors: Rahul Pal, Ishwar Agarwal, Yen-Cheng Liu, Joseph Nuzman, Ashok Jagannathan, Bahaa Fahim, Nithiyanandan Bashyam
  • Patent number: 9507596
    Abstract: A processor includes a core, a prefetcher, and a prefetcher control module. The prefetcher includes logic to make speculative prefetch requests through a memory subsystem for an element for execution by the core, and logic to store prefetched elements in a cache. The prefetcher control module includes logic to determine counts of memory accesses to two types of memory and, based upon the counts and the type of memory, reduce the speculative prefetch requests of the prefetcher.
    Type: Grant
    Filed: August 28, 2014
    Date of Patent: November 29, 2016
    Assignee: Intel Corporation
    Inventors: Ashok Jagannathan, Prabhat Jain, Krishna N. Vinod, Avinash Sodani
  • Patent number: 9430392
    Abstract: Technologies for supporting large pages in hardware prefetchers are described. A processor includes a processor core comprising a pipeline, cache memory and a hardware prefetcher coupled to the processor core and the cache memory. The hardware prefetcher is a region-based hardware prefetcher to track memory regions of a predefined region size that is defined by software to be executed by the processor. The hardware prefetcher is operative to receive incoming requests and track different memory regions of predefined size with multiple streams in a stream table with stream entries. The hardware prefetcher generates a prefetch request and determines whether the prefetch request goes beyond a page boundary of the one memory region. The hardware prefetcher creates a new stream entry to track a successive memory region when the prefetch request goes beyond the page boundary of the one memory region, allowing subsequent prefetch requests to the successive memory region.
    Type: Grant
    Filed: March 26, 2014
    Date of Patent: August 30, 2016
    Assignee: Intel Corporation
    Inventors: Prabhat Jain, Ashok Jagannathan
  • Publication number: 20160092366
    Abstract: An apparatus and method are described for distributed snoop filtering. For example, one embodiment of a processor comprises: a plurality of cores to execute instructions and process data; first snoop logic to track a first plurality of cache lines stored in a mid-level cache (“MLC”) accessible by one or more of the cores, the first snoop logic to allocate entries for cache lines stored in the MLC and to deallocate entries for cache lines evicted from the MLC, wherein at least some of the cache lines evicted from the MLC are retained in a level 1 (L1) cache; and second snoop logic to track a second plurality of cache lines stored in a non-inclusive last level cache (NI LLC), the second snoop logic to allocate entries in the NI LLC for cache lines evicted from the MLC and to deallocate entries for cache lines stored in the MLC, wherein the second snoop logic is to store and maintain a first set of core valid bits to identify cores containing copies of the cache lines stored in the NI LLC.
    Type: Application
    Filed: September 26, 2014
    Publication date: March 31, 2016
    Inventors: Rahul Pal, Ishwar Agarwal, Yen-Cheng Liu, Joseph Nuzman, Ashok Jagannathan, Bahaa Fahim, Nithiyanandan Bashyam
  • Publication number: 20160062768
    Abstract: A processor includes a core, a prefetcher, and a prefetcher control module. The prefetcher includes logic to make speculative prefetch requests through a memory subsystem for an element for execution by the core, and logic to store prefetched elements in a cache. The prefetcher control module includes logic to determine counts of memory accesses to two types of memory and, based upon the counts and the type of memory, reduce the speculative prefetch requests of the prefetcher.
    Type: Application
    Filed: August 28, 2014
    Publication date: March 3, 2016
    Inventors: Ashok Jagannathan, Prabhat Jain, Krishna N. Vinod, Avinash Sodani
  • Patent number: 9229879
    Abstract: Embodiments of the present disclosure describe techniques and configurations to reduce power consumption using unmodified information in evicted cache lines. A method includes identifying unmodified information of a cache line stored in a cache of a processor, tracking the unmodified information using a bit vector comprising one or more bits to indicate the unmodified information of the cache line, and selectively suppressing a write operation or send operation for the unmodified information of the cache line that is evicted from the cache to an input/output (I/O) component coupled to the cache, the selective suppressing being based on the one or more bits, and the I/O component being an outer component external to the cache. Other embodiments may be described and/or claimed.
    Type: Grant
    Filed: July 11, 2011
    Date of Patent: January 5, 2016
    Assignee: Intel Corporation
    Inventors: Mahesh K. Kumashikar, Ashok Jagannathan
  • Publication number: 20150278099
    Abstract: Technologies for supporting large pages in hardware prefetchers are described. A processor includes a processor core comprising a pipeline, cache memory and a hardware prefetcher coupled to the processor core and the cache memory. The hardware prefetcher is a region-based hardware prefetcher to track memory regions of a predefined region size that is defined by software to be executed by the processor. The hardware prefetcher is operative to receive incoming requests and track different memory regions of predefined size with multiple streams in a stream table with stream entries. The hardware prefetcher generates a prefetch request and determines whether the prefetch request goes beyond a page boundary of the one memory region. The hardware prefetcher creates a new stream entry to track a successive memory region when the prefetch request goes beyond the page boundary of the one memory region, allowing subsequent prefetch requests to the successive memory region.
    Type: Application
    Filed: March 26, 2014
    Publication date: October 1, 2015
    Inventors: Prabhat Jain, Ashok Jagannathan
  • Patent number: 8862828
    Abstract: Method and apparatus to efficiently store and cache data. Cores of a processor and cache slices co-located with the cores may be grouped into a cluster. A memory space may be partitioned into address regions. The cluster may be associated with an address region from the address regions. Each memory address of the address region may be mapped to one or more of the cache slices grouped into the cluster. A cache access from one or more of the cores grouped into the cluster may be biased to the address region based on the association of the cluster with the address region.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: October 14, 2014
    Assignee: Intel Corporation
    Inventors: Ravindra P. Saraf, Rahul Pal, Ashok Jagannathan
  • Publication number: 20140006715
    Abstract: Method and apparatus to efficiently store and cache data. Cores of a processor and cache slices co-located with the cores may be grouped into a cluster. A memory space may be partitioned into address regions. The cluster may be associated with an address region from the address regions. Each memory address of the address region may be mapped to one or more of the cache slices grouped into the cluster. A cache access from one or more of the cores grouped into the cluster may be biased to the address region based on the association of the cluster with the address region.
    Type: Application
    Filed: August 13, 2012
    Publication date: January 2, 2014
    Applicant: Intel Corporation
    Inventors: Ravindra P. Saraf, Rahul Pal, Ashok Jagannathan
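
The cluster-based caching scheme summarized in the last abstract above (publication 20140006715, granted as patent 8862828) can be illustrated with a small model. This is only a rough sketch under stated assumptions, not the patented design: the abstract does not say how addresses are mapped to cache slices, so the equal-sized contiguous regions, the 64-byte line size, and the modulo interleave used here are all assumptions, and every name (Cluster, make_clusters, slice_for_address, is_local_access) is hypothetical.

```python
from dataclasses import dataclass

LINE_BYTES = 64  # assumed cache-line size; not stated in the abstract


@dataclass(frozen=True)
class Cluster:
    """A group of cores, their co-located cache slices, and one address region."""
    cores: tuple
    slices: tuple
    region_start: int  # inclusive
    region_end: int    # exclusive

    def owns(self, addr: int) -> bool:
        return self.region_start <= addr < self.region_end


def make_clusters(num_clusters: int, cores_per_cluster: int, memory_bytes: int):
    """Partition memory into equal regions and associate one with each cluster."""
    region = memory_bytes // num_clusters
    clusters = []
    for c in range(num_clusters):
        # Slice i is assumed to be co-located with core i, so they share IDs.
        ids = tuple(range(c * cores_per_cluster, (c + 1) * cores_per_cluster))
        clusters.append(Cluster(ids, ids, c * region, (c + 1) * region))
    return clusters


def slice_for_address(clusters, addr: int) -> int:
    """Map an address to a cache slice inside the cluster that owns its region."""
    for cluster in clusters:
        if cluster.owns(addr):
            line = addr // LINE_BYTES
            # Assumed policy: interleave cache lines across the cluster's slices.
            return cluster.slices[line % len(cluster.slices)]
    raise ValueError(f"address {addr:#x} is outside the partitioned memory space")


def is_local_access(clusters, core: int, addr: int) -> bool:
    """True when the address is cached by a slice in the requesting core's cluster."""
    home = next(c for c in clusters if core in c.cores)
    return slice_for_address(clusters, addr) in home.slices


clusters = make_clusters(num_clusters=2, cores_per_cluster=4, memory_bytes=1 << 20)
```

In this model, an access from a core to an address in its own cluster's region stays within the cluster's slices, which is one way to read the "biased" access the abstract describes; a real design would also need a policy for accesses that fall outside the local region.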