Patents by Inventor Anant Nori
Anant Nori has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230409481
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
Type: Application
Filed: May 19, 2023
Publication date: December 21, 2023
Applicant: Intel Corporation
Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
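The steps in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `PointerPrefetcher`, the dictionary used as memory, and the use of instruction addresses (PCs) to identify loads are all hypothetical choices made for illustration.

```python
class PointerPrefetcher:
    """Toy model of pointer-load detection and dependent prefetching."""

    def __init__(self, memory):
        self.memory = memory           # hypothetical memory: address -> stored value
        self.recent_values = {}        # loaded value -> PC of the load that produced it
        self.pointer_load_pcs = set()  # stands in for the "list of pointer load instructions"
        self.prefetched = set()        # addresses fetched ahead of execution

    def execute_load(self, pc, address):
        """Run a load and learn whether an earlier load produced this address."""
        producer = self.recent_values.get(address)
        if producer is not None:
            # An earlier load returned exactly this address, so it loaded a pointer.
            self.pointer_load_pcs.add(producer)
        value = self.memory.get(address)
        if value is not None:
            self.recent_values[value] = pc
        return value

    def prefetch_for_load(self, pc, address):
        """Prefetch data for a load; follow the pointer if the load is a known pointer load."""
        self.prefetched.add(address)
        if pc in self.pointer_load_pcs:
            target = self.memory.get(address)
            if target is not None:
                # The prefetched data is itself an address: fetch what it points to.
                self.prefetched.add(target)


memory = {0x100: 0x200, 0x200: 0x300, 0x400: 0x500}
pf = PointerPrefetcher(memory)
pf.execute_load(pc=0x10, address=0x100)       # first load: returns 0x200
pf.execute_load(pc=0x14, address=0x200)       # second load: address matches that value
pf.prefetch_for_load(pc=0x10, address=0x400)  # third load: also prefetches 0x500
```

In this model the load at PC `0x10` is marked as a pointer load once a later load uses its result as an address, so a subsequent prefetch for the same PC also brings in the location its data points to.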
-
Patent number: 11693780
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
Type: Grant
Filed: August 2, 2021
Date of Patent: July 4, 2023
Assignee: Intel Corporation
Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Publication number: 20230205692
Abstract: Apparatus and method for leveraging simultaneous multithreading for bulk compute operations. For example, one embodiment of a processor comprises: a plurality of cores including a first core to simultaneously process instructions of a plurality of threads; a cache hierarchy coupled to the first core and the memory, the cache hierarchy comprising a Level 1 (L1) cache, a Level 2 (L2) cache, and a Level 3 (L3) cache; and a plurality of compute units coupled to the first core including a first compute unit associated with the L1 cache, a second compute unit associated with the L2 cache, and a third compute unit associated with the L3 cache, wherein the first core is to offload instructions for execution by the compute units, the first core to offload instructions from a first thread to the first compute unit, instructions from a second thread to the second compute unit, and instructions from a third thread to the third compute unit.
Type: Application
Filed: December 23, 2021
Publication date: June 29, 2023
Inventors: Anant Nori, Rahul Bera, Shankar Balachandran, Joydeep Rakshit, Om Ji Omer, Sreenivas Subramoney, Avishaii Abuhatzera, Belliappa Kuttanna
-
Publication number: 20220100514
Abstract: Techniques for processing loops are described. An exemplary apparatus at least includes decoder circuitry to decode a single instruction, the single instruction to include a field for an opcode, the opcode to indicate execution circuitry is to perform an operation to configure execution of one or more loops, wherein the one or more loops are to include a plurality of configuration instructions and instructions that are to use metadata generated by ones of the plurality of configuration instructions; and execution circuitry to perform the operation as indicated by the opcode.
Type: Application
Filed: December 26, 2020
Publication date: March 31, 2022
Inventors: Anant Nori, Shankar Balachandran, Sreenivas Subramoney, Joydeep Rakshit, Vedvyas Shanbhogue, Avishaii Abuhatzera, Belliappa Kuttanna
-
Publication number: 20210365377
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
Type: Application
Filed: August 2, 2021
Publication date: November 25, 2021
Applicant: Intel Corporation
Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Patent number: 11080194
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
Type: Grant
Filed: December 27, 2018
Date of Patent: August 3, 2021
Assignee: Intel Corporation
Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Publication number: 20200210339
Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
Type: Application
Filed: December 27, 2018
Publication date: July 2, 2020
Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Patent number: 10559348
Abstract: In one embodiment, an apparatus includes a memory array having a plurality of memory cells, a plurality of bitlines coupled to the plurality of memory cells, and a plurality of wordlines coupled to the plurality of memory cells. The memory array may further include a sense amplifier circuit to sense and amplify a value stored in a memory cell of the plurality of memory cells. The sense amplifier circuit may include: a buffer circuit to store the value, the buffer circuit coupled between a first internal node of the sense amplifier circuit and a second internal node of the sense amplifier circuit; and an equalization circuit to equalize the first internal node and the second internal node while the sense amplifier circuit is decoupled from the memory array. Other embodiments are described and claimed.
Type: Grant
Filed: May 16, 2018
Date of Patent: February 11, 2020
Assignee: Intel Corporation
Inventors: Lavanya Subramanian, Kaushik Vaidyanathan, Anant Nori, Sreenivas Subramoney, Tanay Karnik
-
Publication number: 20190355411
Abstract: In one embodiment, an apparatus includes a memory array having a plurality of memory cells, a plurality of bitlines coupled to the plurality of memory cells, and a plurality of wordlines coupled to the plurality of memory cells. The memory array may further include a sense amplifier circuit to sense and amplify a value stored in a memory cell of the plurality of memory cells. The sense amplifier circuit may include: a buffer circuit to store the value, the buffer circuit coupled between a first internal node of the sense amplifier circuit and a second internal node of the sense amplifier circuit; and an equalization circuit to equalize the first internal node and the second internal node while the sense amplifier circuit is decoupled from the memory array. Other embodiments are described and claimed.
Type: Application
Filed: May 16, 2018
Publication date: November 21, 2019
Inventors: Lavanya Subramanian, Kaushik Vaidyanathan, Anant Nori, Sreenivas Subramoney, Tanay Karnik
-
Patent number: 10162756
Abstract: A memory-efficient last level cache (LLC) architecture is described. A processor implementing a LLC architecture may include a processor core, a last level cache (LLC) operatively coupled to the processor core, and a cache controller operatively coupled to the LLC. The cache controller is to monitor a bandwidth demand of a channel between the processor core and a dynamic random-access memory (DRAM) device associated with the LLC. The cache controller is further to perform a first defined number of consecutive reads from the DRAM device when the bandwidth demand exceeds a first threshold value and perform a first defined number of consecutive writes of modified lines from the LLC to the DRAM device when the bandwidth demand exceeds the first threshold value.
Type: Grant
Filed: January 18, 2017
Date of Patent: December 25, 2018
Assignee: Intel Corporation
Inventors: Jayesh Gaur, Ayan Mandal, Anant Nori, Sreenivas Subramoney
-
Publication number: 20180203799
Abstract: A memory-efficient last level cache (LLC) architecture is described. A processor implementing a LLC architecture may include a processor core, a last level cache (LLC) operatively coupled to the processor core, and a cache controller operatively coupled to the LLC. The cache controller is to monitor a bandwidth demand of a channel between the processor core and a dynamic random-access memory (DRAM) device associated with the LLC. The cache controller is further to perform a first defined number of consecutive reads from the DRAM device when the bandwidth demand exceeds a first threshold value and perform a first defined number of consecutive writes of modified lines from the LLC to the DRAM device when the bandwidth demand exceeds the first threshold value.
Type: Application
Filed: January 18, 2017
Publication date: July 19, 2018
Inventors: Jayesh Gaur, Ayan Mandal, Anant Nori, Sreenivas Subramoney
-
Publication number: 20180088944
Abstract: A multi-core processor includes a plurality of cores to execute a plurality of threads and to monitor metrics for each of the plurality of threads during an interval, the metrics including stall cycle values, prefetches of a first type, and prefetches of a second type. The multi-core processor further includes criticality-aware thread prioritization (CATP) logic to compute a stall fraction for each of the plurality of threads during the interval using the stall cycle values, identify a thread with a highest stall fraction of the plurality of threads, determine the highest stall fraction is greater than a stall threshold, prioritize demand requests of the identified thread, compute a prefetch accuracy of the identified thread during the interval using the prefetches of the first type and the prefetches of the second type, determine the prefetch accuracy is greater than a prefetch threshold, and prioritize prefetch requests of the identified thread.
Type: Application
Filed: September 23, 2016
Publication date: March 29, 2018
Inventors: Lavanya Subramanian, Sreenivas Subramoney, Nithiyanandan Bashyam, Anant Nori
-
Patent number: 9921839
Abstract: A multi-core processor includes a plurality of cores to execute a plurality of threads and to monitor metrics for each of the plurality of threads during an interval, the metrics including stall cycle values, prefetches of a first type, and prefetches of a second type. The multi-core processor further includes criticality-aware thread prioritization (CATP) logic to compute a stall fraction for each of the plurality of threads during the interval using the stall cycle values, identify a thread with a highest stall fraction of the plurality of threads, determine the highest stall fraction is greater than a stall threshold, prioritize demand requests of the identified thread, compute a prefetch accuracy of the identified thread during the interval using the prefetches of the first type and the prefetches of the second type, determine the prefetch accuracy is greater than a prefetch threshold, and prioritize prefetch requests of the identified thread.
Type: Grant
Filed: September 23, 2016
Date of Patent: March 20, 2018
Assignee: Intel Corporation
Inventors: Lavanya Subramanian, Sreenivas Subramoney, Nithiyanandan Bashyam, Anant Nori
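The CATP decision sequence in the abstract above can be sketched in a few lines. This is a rough illustration under assumptions, not the claimed logic: the abstract does not define the two prefetch types, so the sketch assumes they are useful prefetches and issued prefetches, with accuracy taken as their ratio. The names `ThreadMetrics` and `catp_decide` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThreadMetrics:
    thread_id: int
    stall_cycles: int
    useful_prefetches: int  # assumed meaning of "prefetches of a first type"
    issued_prefetches: int  # assumed meaning of "prefetches of a second type"


def catp_decide(metrics, interval_cycles, stall_threshold, prefetch_threshold):
    """Return (thread_id, prioritize_demand, prioritize_prefetch) for one interval."""
    if not metrics or interval_cycles <= 0:
        return (None, False, False)

    # Stall fraction per thread, then pick the thread with the highest one.
    worst = max(metrics, key=lambda m: m.stall_cycles / interval_cycles)
    stall_fraction = worst.stall_cycles / interval_cycles
    if stall_fraction <= stall_threshold:
        return (None, False, False)  # no thread is critical enough to prioritize

    # Prefetch accuracy of the identified thread (assumed: useful / issued).
    if worst.issued_prefetches:
        accuracy = worst.useful_prefetches / worst.issued_prefetches
    else:
        accuracy = 0.0
    return (worst.thread_id, True, accuracy > prefetch_threshold)


metrics = [
    ThreadMetrics(thread_id=0, stall_cycles=200, useful_prefetches=10, issued_prefetches=40),
    ThreadMetrics(thread_id=1, stall_cycles=700, useful_prefetches=90, issued_prefetches=100),
]
decision = catp_decide(metrics, interval_cycles=1000,
                       stall_threshold=0.5, prefetch_threshold=0.8)
print(decision)
```

Here thread 1 stalls for 70% of the interval and 90% of its prefetches are assumed useful, so both its demand and prefetch requests would be prioritized under the thresholds chosen for this example.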