Patents by Inventor Sreenivas Subramoney

Sreenivas Subramoney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190286567
    Abstract: In one embodiment, a processor includes: a cache memory to store a plurality of cache lines; and a cache controller to control the cache memory. The cache controller may include a control circuit to allocate a virtual write buffer within the cache memory in response to a bandwidth on an interconnect that exceeds a first bandwidth threshold. The cache controller may further include a replacement circuit to control eviction of cache lines from the cache memory. Other embodiments are described and claimed.
    Type: Application
    Filed: March 16, 2018
    Publication date: September 19, 2019
    Inventors: Mainak Chaudhuri, Jayesh Gaur, Sreenivas Subramoney, Hong Wang
  • Patent number: 10402413
    Abstract: A processor may include a plurality of processing elements and a hardware accelerator for selecting data elements. The hardware accelerator may: access an input data set comprising a set of data elements, each data element having a score value; increment bin counters based on the score values of the set of data elements, each bin counter to count a number of data elements with an associated score value; determine a cumulative sum of count values for a sequence of bin counters, the sequence beginning with a first bin counter of the plurality of bin counters; identify a second bin counter in the sequence of bin counters at which the cumulative sum reaches a selection quantity N; and generate an output data set based on a comparison of the set of data elements to a threshold score associated with the second bin counter.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: September 3, 2019
    Assignee: Intel Corporation
    Inventors: Mahesh Mamidipaka, Srivatsava Jandhyala, Anish N K, Nagadastagiri Reddy C, Sreenivas Subramoney
  • Publication number: 20190243760
    Abstract: A memory-efficient last level cache (LLC) architecture is described. A processor implementing a LLC architecture may include a processor core, a last level cache (LLC) operatively coupled to the processor core, and a cache controller operatively coupled to the LLC. The cache controller is to monitor a bandwidth demand of a channel between the processor core and a dynamic random-access memory (DRAM) device associated with the LLC. The cache controller is further to perform a first defined number of consecutive reads from the DRAM device when the bandwidth demand exceeds a first threshold value and perform a first defined number of consecutive writes of modified lines from the LLC to the DRAM device when the bandwidth demand exceeds the first threshold value.
    Type: Application
    Filed: December 17, 2018
    Publication date: August 8, 2019
    Inventors: Jayesh Gaur, Ayan Mandal, Anant V. Nori, Sreenivas Subramoney
  • Publication number: 20190243684
    Abstract: A processor including an execution unit, an instruction scheduler circuit to identify a first instruction of an instruction stream, identify a second instruction on which execution of the first instruction depends, and assign a first dispatch priority value to the first instruction and the second instruction, and a dispatch circuit to dispatch, based on the first dispatch priority value, the first instruction and the second instruction to an instruction execution circuit.
    Type: Application
    Filed: February 7, 2018
    Publication date: August 8, 2019
    Inventors: Pooja Roy, Jayesh Gaur, Sreenivas Subramoney, Zeev Sperber, Alexandr Titov, Lihu Rappoport, Stanislav Shwartsman, Hong Wang, Adi Yoaz, Ronak Singhal, Robert S. Chappell
  • Publication number: 20190220284
    Abstract: One embodiment provides an apparatus. The apparatus includes a store direct dependent (SDD) branch prediction circuitry and an SDD management circuitry. The store direct dependent (SDD) branch prediction circuitry is to store an SDD branch table. The SDD branch table is to store at least one record. Each record includes a branch instruction pointer (IP) field, a load IP field, a store IP field, a comparison info field and at least one of a store value field and/or a predicted outcome field. The SDD management circuitry is to populate the SDD branch table at runtime and to override a baseline branch prediction associated with an incoming branch IP with an SDD branch prediction, if the SDD branch table contains a first record populated with the incoming branch IP and at least one of a store value and/or an SDD predicted outcome.
    Type: Application
    Filed: January 12, 2018
    Publication date: July 18, 2019
    Applicant: Intel Corporation
    Inventors: Saurabh Gupta, Rahul Pal, Niranjan Soundararajan, Ragavendra Natarajan, Sreenivas Subramoney
  • Publication number: 20190205143
    Abstract: In one embodiment, a branch prediction circuit includes: a first bimodal predictor having a first plurality of entries each to store first prediction information for a corresponding branch instruction; a global predictor having a plurality of global entries each to store global prediction information for a corresponding branch instruction; a second bimodal predictor having a second plurality of entries each to store second prediction information for a corresponding branch instruction; a monitoring table having a plurality of monitoring entries each to store a counter value based on the second prediction information for a corresponding branch instruction; and a control circuit to allocate a global entry within the global predictor based at least in part on the counter value of a monitoring entry of the monitoring table for a corresponding branch instruction. Other embodiments are described and claimed.
    Type: Application
    Filed: December 29, 2017
    Publication date: July 4, 2019
    Inventors: Ragavendra Natarajan, Niranjan Soundararajan, Sreenivas Subramoney
  • Publication number: 20190205135
    Abstract: Implementations of the disclosure implement timely and context triggered (TACT) prefetching that targets particular load IPs in a program contributing to a threshold amount of the long latency accesses. A processing device comprising an execution unit; and a prefetcher circuit communicably coupled to the execution unit is provided. The prefetcher circuit is to detect a memory request for a target instruction pointer (IP) in a program to be executed by the execution unit. A trigger IP is identified to initiate a prefetch operation of memory data for the target IP. Thereupon, an association is determined between memory addresses of the trigger IP and the target IP. The association comprising a series of offsets representing a path between the trigger IP and an instance of the target IP in memory. Based on the association, an offset from the memory address of the trigger IP to prefetch the memory data is produced.
    Type: Application
    Filed: January 3, 2018
    Publication date: July 4, 2019
    Inventors: Anant Vithal Nori, Sreenivas Subramoney, Shankar Balachandran, Hong Wang
  • Patent number: 10331582
    Abstract: A processor includes a processing core and a cache controller including a read queue and a separate write queue. The read queue is to buffer read requests of the processing core to a non-volatile memory, last level cache (NVM-LLC), and the write queue is to buffer write requests to the NVM-LLC. The cache controller is to detect whether the write queue is full. The cache controller further prioritizes a first order of sending requests to the NVM-LLC when the write queue contains an empty slot, the first order specifying a first pattern of sending the read requests before the write requests, and prioritizes a second order of sending requests to the NVM-LLC in response to a determination that the write queue is full, the second order specifying a second pattern of alternating between sending a write request from the write queue and a read request from the read queue.
    Type: Grant
    Filed: February 13, 2017
    Date of Patent: June 25, 2019
    Assignee: Intel Corporation
    Inventors: Ishwar S. Bhati, Huichu Liu, Jayesh Gaur, Kunal Korgaonkar, Sasikanth Manipatruni, Sreenivas Subramoney, Tanay Karnik, Hong Wang, Ian A. Young
  • Patent number: 10318834
    Abstract: One embodiment provides an image processing circuitry. The image processing circuitry includes a feature extraction circuitry and an optimization circuitry. The feature extraction circuitry is to determine a feature descriptor based, at least in part, on a feature point location and a corresponding scale. The optimization circuitry is to optimize an operation of the feature extraction circuitry. Each optimization is to at least one of accelerate the operation of the feature extraction circuitry, reduce a power consumption of the feature extraction circuitry and/or reduce a system memory bandwidth used by the feature extraction circuitry.
    Type: Grant
    Filed: May 1, 2017
    Date of Patent: June 11, 2019
    Assignee: Intel Corporation
    Inventors: Gurpreet S. Kalsi, Om J. Omer, Biji George, Gopi Neela, Dipan Kumar Mandal, Sreenivas Subramoney
  • Publication number: 20190171909
    Abstract: An example apparatus for selecting keypoints in image includes a keypoint detector to detect keypoints in a plurality of received images. The apparatus also includes a score calculator to calculate a keypoint score for each of the detected keypoints based on a descriptor score indicating descriptor invariance. The apparatus includes a keypoint selector to select keypoints based on the calculated keypoint scores. The apparatus also further includes a descriptor calculator to calculate descriptors for each of the selected keypoints. The apparatus also includes a descriptor matcher to match corresponding descriptors between images in the plurality of received images. The apparatus further also includes a feature tracker to track a feature in the plurality of images based on the matched descriptors.
    Type: Application
    Filed: December 26, 2018
    Publication date: June 6, 2019
    Applicant: INTEL CORPORATION
    Inventors: Dipan Kumar Mandal, Gurpreet Kalsi, Om J. Omer, Prashant Laddha, Sreenivas Subramoney
  • Patent number: 10268600
    Abstract: In one embodiment, a processor includes: a first cache controller to control a first cache memory. This cache controller may include a replacement circuit to: associate a first priority indicator with a first cache line based on storage of demand data in the first cache line and first learning information associated with a set of demand-based categories of cache lines; and associate a second priority indicator with a second cache line based on storage of prefetch data in the second cache line and second learning information associated with a set of prefetch-based categories of cache lines. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 12, 2017
    Date of Patent: April 23, 2019
    Assignee: Intel Corporation
    Inventors: Jayesh Gaur, Sreenivas Subramoney, Sanjay Ganapathy
  • Publication number: 20190095331
    Abstract: A method is described. The method includes receiving a read or write request for a cache line. The method includes directing the request to a set of logical super lines based on the cache line's system memory address. The method includes associating the request with a cache line of the set of logical super lines. The method includes, if the request is a write request: compressing the cache line to form a compressed cache line, breaking the cache line down into smaller data units and storing the smaller data units into a memory side cache. The method includes, if the request is a read request: reading smaller data units of the compressed cache line from the memory side cache and decompressing the cache line.
    Type: Application
    Filed: September 28, 2017
    Publication date: March 28, 2019
    Inventors: Israel DIAMAND, Alaa R. ALAMELDEEN, Sreenivas SUBRAMONEY, Supratik MAJUMDER, Srinivas Santosh Kumar MADUGULA, Jayesh GAUR, Zvika GREENFIELD, Anant V. NORI
  • Publication number: 20190079877
    Abstract: In one embodiment, a processor includes: a first cache controller to control a first cache memory. This cache controller may include a replacement circuit to: associate a first priority indicator with a first cache line based on storage of demand data in the first cache line and first learning information associated with a set of demand-based categories of cache lines; and associate a second priority indicator with a second cache line based on storage of prefetch data in the second cache line and second learning information associated with a set of prefetch-based categories of cache lines. Other embodiments are described and claimed.
    Type: Application
    Filed: September 12, 2017
    Publication date: March 14, 2019
    Inventors: Jayesh Gaur, Sreenivas Subramoney, Sanjay Ganapathy
  • Patent number: 10191689
    Abstract: Systems for page management using local page information are disclosed. The system may include a processor, including a memory controller, and a memory, including a row buffer. The memory controller may include circuitry to determine that a page stored in the row buffer has been idle for a time exceeding a predetermined threshold determine whether the page is exempt from idle page closures, and, based on a determination that the page is exempt, refrain from closing the page. Associated methods are also disclosed.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: January 29, 2019
    Assignee: Intel Corporation
    Inventors: Sriseshan Srikanth, Lavanya Subramanian, Sreenivas Subramoney
  • Patent number: 10176124
    Abstract: A technology is described for determining an idle page close timeout for a row buffer. An example memory controller may comprise a scoreboard buffer and a predictive timeout engine. The scoreboard buffer may be configured to store a number of page hits and a number of page misses for a plurality of candidate timeout values for an idle page close timeout. The predictive timeout engine may be configured to increment the page hits and the page misses in the scoreboard buffer according to estimated page hit results and page miss results for the candidate timeout values, and identify a candidate timeout value from the scoreboard buffer estimated to maximize the number of page hits to the number of page misses.
    Type: Grant
    Filed: April 1, 2017
    Date of Patent: January 8, 2019
    Assignee: Intel Corporation
    Inventors: Sriseshan Srikanth, Lavanya Subramanian, Sreenivas Subramoney
  • Patent number: 10162756
    Abstract: A memory-efficient last level cache (LLC) architecture is described. A processor implementing a LLC architecture may include a processor core, a last level cache (LLC) operatively coupled to the processor core, and a cache controller operatively coupled to the LLC. The cache controller is to monitor a bandwidth demand of a channel between the processor core and a dynamic random-access memory (DRAM) device associated with the LLC. The cache controller is further to perform a first defined number of consecutive reads from the DRAM device when the bandwidth demand exceeds a first threshold value and perform a first defined number of consecutive writes of modified lines from the LLC to the DRAM device when the bandwidth demand exceeds the first threshold value.
    Type: Grant
    Filed: January 18, 2017
    Date of Patent: December 25, 2018
    Assignee: Intel Corporation
    Inventors: Jayesh Gaur, Ayan Mandal, Anant Nori, Sreenivas Subramoney
  • Publication number: 20180349144
    Abstract: In one embodiment, a processor comprises a branch predictor to generate, in association with a program loop, a frozen history vector comprising a snapshot of a branch history vector; track a current iteration of the program loop; and provide a prediction for a branch instruction associated with the program loop, the prediction based on the frozen history vector and the current iteration of the program loop.
    Type: Application
    Filed: June 6, 2017
    Publication date: December 6, 2018
    Applicant: Intel Corporation
    Inventors: Rahul Pal, Ragavendra Natarajan, Niranjan K. Soundararajan, Sreenivas Subramoney, Daniel Deng, Jared Warner Stark, IV, Hong Wang, Ronak Singhal
  • Publication number: 20180314903
    Abstract: One embodiment provides an image processing circuitry. The image processing circuitry includes a feature extraction circuitry and an optimization circuitry. The feature extraction circuitry is to determine a feature descriptor based, at least in part, on a feature point location and a corresponding scale. The optimization circuitry is to optimize an operation of the feature extraction circuitry. Each optimization is to at least one of accelerate the operation of the feature extraction circuitry, reduce a power consumption of the feature extraction circuitry and/or reduce a system memory bandwidth used by the feature extraction circuitry.
    Type: Application
    Filed: May 1, 2017
    Publication date: November 1, 2018
    Applicant: INTEL CORPORATION
    Inventors: Gurpreet S. Kalsi, Om J. Omer, Biji George, Gopi Neela, Dipan Kumar Mandal, Sreenivas Subramoney
  • Publication number: 20180285286
    Abstract: A technology is described for determining an idle page close timeout for a row buffer. An example memory controller may comprise a scoreboard buffer and a predictive timeout engine. The scoreboard buffer may be configured to store a number of page hits and a number of page misses for a plurality of candidate timeout values for an idle page close timeout. The predictive timeout engine may be configured to increment the page hits and the page misses in the scoreboard buffer according to estimated page hit results and page miss results for the candidate timeout values, and identify a candidate timeout value from the scoreboard buffer estimated to maximize the number of page hits to the number of page misses.
    Type: Application
    Filed: April 1, 2017
    Publication date: October 4, 2018
    Applicant: Intel Corporation
    Inventors: Sriseshan Srikanth, Lavanya Subramanian, Sreenivas Subramoney
  • Publication number: 20180285115
    Abstract: Embodiments of apparatuses, methods, and systems for misprediction-triggered local history-based branch prediction are described. In one embodiments, an apparatus includes a current pattern table and a local pattern table. The current pattern table has a plurality of entries, each entry in which to store a plurality of pattern lengths of a current pattern of one of a plurality of branch instructions. The local pattern table is to provide a first branch prediction based on the current pattern.
    Type: Application
    Filed: April 1, 2017
    Publication date: October 4, 2018
    Inventors: Niranjan K. Soundararajan, Saurabh Gupta, Sreenivas Subramoney, Rahul Pal, Ragavendra Natarajan, Daniel Deng, Jared W. Stark, Ronak Singhal, Hong Wang