Patents by Inventor Vineeth Mekkat
Vineeth Mekkat has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230409481Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.Type: ApplicationFiled: May 19, 2023Publication date: December 21, 2023Applicant: Intel CorporationInventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Patent number: 11693780Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.Type: GrantFiled: August 2, 2021Date of Patent: July 4, 2023Assignee: Intel CorporationInventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Publication number: 20210365377Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.Type: ApplicationFiled: August 2, 2021Publication date: November 25, 2021Applicant: Intel CorporationInventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Patent number: 11080194Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.Type: GrantFiled: December 27, 2018Date of Patent: August 3, 2021Assignee: Intel CorporationInventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Patent number: 10915320Abstract: A processor includes an instruction fetch circuit to retrieve instructions from memory, and a decode unit circuit to decode retrieved instructions. The decode unit circuit identifies a shift instruction, accumulates a shift folded immediate value to track a number of bit positions shifted for a source register, and prevents the shift instruction from allocation to an execution unit of the processor.Type: GrantFiled: December 21, 2018Date of Patent: February 9, 2021Assignee: INTEL CORPORATIONInventors: Vineeth Mekkat, Xi Chen, Manjunath Shevgoor
-
Patent number: 10853078Abstract: A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer. The LPT tracks information to compare an address of a store or snoop microoperation with entries in the LICM and re-loads a load microoperation of a matching entry.Type: GrantFiled: December 21, 2018Date of Patent: December 1, 2020Assignee: INTEL CORPORATIONInventors: Vineeth Mekkat, Mark Dechene, Zhongying Zhang, John Faistl, Janghaeng Lee, Hou-Jen Ko, Sebastian Winkel, Oleg Margulis
-
Publication number: 20200310798Abstract: An integrated circuit with support for memory atomicity comprises a processor core. The processor core comprises a data cache unit (DCU), a store buffer (SB), a retirement unit, and memory atomicity facilities. The memory atomicity facilities are configured, when engaged, to (a) add an SB entry to the SB, in response to the processor core executing a store instruction that is part of an atomic region of code; (b) cause the SB entry to become senior, in response to the retirement unit retiring the store instruction; and (c) cause the SB entry to become walk enabled, in response to the retirement unit committing a transaction associated with the atomic region. Other embodiments are described and claimed.Type: ApplicationFiled: March 28, 2019Publication date: October 1, 2020Inventors: Manjunath Shevgoor, Mark Joseph Dechene, Vineeth Mekkat, Jason Michael Agron, Zhongying Zhang
-
Publication number: 20200210339Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.Type: ApplicationFiled: December 27, 2018Publication date: July 2, 2020Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
-
Publication number: 20200201632Abstract: A processor includes an instruction fetch circuit to retrieve instructions from memory, and a decode unit circuit to decode retrieved instructions. The decode unit circuit identifies a shift instruction, accumulates a shift folded immediate value to track a number of bit positions shifted for a source register, and prevents the shift instruction from allocation to an execution unit of the processor.Type: ApplicationFiled: December 21, 2018Publication date: June 25, 2020Inventors: Vineeth MEKKAT, Xi CHEN, Manjunath SHEVGOOR
-
Publication number: 20200201645Abstract: A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer. The LPT tracks information to compare an address of a store or snoop microoperation with entries in the LICM and re-loads a load microoperation of a matching entry.Type: ApplicationFiled: December 21, 2018Publication date: June 25, 2020Inventors: Vineeth MEKKAT, Mark DECHENE, Zhongying ZHANG, John FAISTL, Janghaeng LEE, Hou-Jen KO, Sebastian WINKEL, Oleg MARGULIS
-
Patent number: 10540178Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.Type: GrantFiled: September 14, 2016Date of Patent: January 21, 2020Assignee: Intel CorporationInventors: Vineeth Mekkat, Youfeng Wu, Sebastian Winkel, Oleg Margulis
-
Patent number: 10296343Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.Type: GrantFiled: March 30, 2017Date of Patent: May 21, 2019Assignee: Intel CorporationInventors: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
-
Patent number: 10235177Abstract: In an example, an apparatus includes a binary translator (BT) including circuitry to: analyze a code block; determine that an architectural register mapped to a physical register in the physical register file is available for early reclamation; and insert a reclamation hint into the code block. In another example, a processor reclaims the physical register based at least in part on the reclamation hint.Type: GrantFiled: July 2, 2016Date of Patent: March 19, 2019Assignee: Intel CorporationInventors: Vineeth Mekkat, Janghaeng Lee, Youfeng Wu
-
Patent number: 10228956Abstract: In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load to access the memory location subsequent to an execution of the first load instruction. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for an aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabled of the protection field.Type: GrantFiled: September 30, 2016Date of Patent: March 12, 2019Assignee: Intel CorporationInventors: Vineeth Mekkat, Mark J. Dechene, Zhongying Zhang, Jason Agron, Sebastian Winkel
-
Patent number: 10120686Abstract: A processor includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, and a binary translator. The binary translator includes circuitry to identify a redundant store in the instruction stream, mark the start and end of a region of the instruction stream with the redundant store, remove the redundant store, and store an amended instruction stream with the redundant store removed.Type: GrantFiled: June 7, 2016Date of Patent: November 6, 2018Assignee: Intel CorporationInventors: Vineeth Mekkat, Oleg Margulis, Ching-Tsun Chou, Youfeng Wu
-
Publication number: 20180285112Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.Type: ApplicationFiled: March 30, 2017Publication date: October 4, 2018Inventors: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
-
Patent number: 9996356Abstract: Apparatus and method for detecting and recovering from incorrect memory dependence speculation in an out-of-order processor are described herein. For example, one embodiment of a method comprises: executing a first load instruction; detecting when the first load instruction experiences a bad store-to-load forwarding event during execution; tracking the occurrences of bad store-to-load forwarding event experienced by the first load instruction during execution; controlling enablement of an S-bit in the first load instruction based on the tracked occurrences; generating a plurality of load operations responsive to an enabled S-bit in first load instruction, wherein execution of the plurality of load operations produces a result equivalent to that from the execution of the first load instruction.Type: GrantFiled: December 26, 2015Date of Patent: June 12, 2018Assignee: Intel CorporationInventors: Vineeth Mekkat, Oleg Margulis, Jason M. Agron, Ethan Schuchman, Sebastian Winkel, Youfeng Wu, Gisle Dankel
-
Publication number: 20180095765Abstract: In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load to access the memory location subsequent to an execution of the first load instruction. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for an aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabled of the protection field.Type: ApplicationFiled: September 30, 2016Publication date: April 5, 2018Inventors: Vineeth Mekkat, Mark J. Dechene, Zhongying Zhang, Jason Agron, Sebastian Winkel
-
Publication number: 20180074827Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.Type: ApplicationFiled: September 14, 2016Publication date: March 15, 2018Inventors: Vineeth Mekkat, Youfeng Wu, Sebastian Winkel, Oleg Margulis
-
Patent number: 9916164Abstract: Methods, apparatus, systems and articles of manufacture are disclosed herein. An example apparatus includes an instruction profiler to identify a predicated block within instructions to be executed by a hardware processor. The example apparatus includes a performance monitor to access a mis-prediction statistic based on an instruction address associated with the predicated block. The example apparatus includes a region former to, in response to determining that the mis-prediction statistic is above a mis-prediction threshold, include the predicated block in a predicated region for optimization.Type: GrantFiled: June 11, 2015Date of Patent: March 13, 2018Assignee: Intel CorporationInventors: Vineeth Mekkat, Girish Venkatasubramanian, Howard H. Chen