Patents by Inventor Vineeth Mekkat

Vineeth Mekkat has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230409481
    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
    Type: Application
    Filed: May 19, 2023
    Publication date: December 21, 2023
    Applicant: Intel Corporation
    Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
  • Patent number: 11693780
    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
    Type: Grant
    Filed: August 2, 2021
    Date of Patent: July 4, 2023
    Assignee: Intel Corporation
    Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
  • Publication number: 20210365377
    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
    Type: Application
    Filed: August 2, 2021
    Publication date: November 25, 2021
    Applicant: Intel Corporation
    Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
  • Patent number: 11080194
    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
  • Patent number: 10915320
    Abstract: A processor includes an instruction fetch circuit to retrieve instructions from memory, and a decode unit circuit to decode retrieved instructions. The decode unit circuit identifies a shift instruction, accumulates a shift folded immediate value to track a number of bit positions shifted for a source register, and prevents the shift instruction from allocation to an execution unit of the processor.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: February 9, 2021
    Assignee: INTEL CORPORATION
    Inventors: Vineeth Mekkat, Xi Chen, Manjunath Shevgoor
  • Patent number: 10853078
    Abstract: A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer. The LPT tracks information to compare an address of a store or snoop microoperation with entries in the LICM and re-loads a load microoperation of a matching entry.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: December 1, 2020
    Assignee: INTEL CORPORATION
    Inventors: Vineeth Mekkat, Mark Dechene, Zhongying Zhang, John Faistl, Janghaeng Lee, Hou-Jen Ko, Sebastian Winkel, Oleg Margulis
  • Publication number: 20200310798
    Abstract: An integrated circuit with support for memory atomicity comprises a processor core. The processor core comprises a data cache unit (DCU), a store buffer (SB), a retirement unit, and memory atomicity facilities. The memory atomicity facilities are configured, when engaged, to (a) add an SB entry to the SB, in response to the processor core executing a store instruction that is part of an atomic region of code; (b) cause the SB entry to become senior, in response to the retirement unit retiring the store instruction; and (c) cause the SB entry to become walk enabled, in response to the retirement unit committing a transaction associated with the atomic region. Other embodiments are described and claimed.
    Type: Application
    Filed: March 28, 2019
    Publication date: October 1, 2020
    Inventors: Manjunath Shevgoor, Mark Joseph Dechene, Vineeth Mekkat, Jason Michael Agron, Zhongying Zhang
  • Publication number: 20200210339
    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 2, 2020
    Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
  • Publication number: 20200201632
    Abstract: A processor includes an instruction fetch circuit to retrieve instructions from memory, and a decode unit circuit to decode retrieved instructions. The decode unit circuit identifies a shift instruction, accumulates a shift folded immediate value to track a number of bit positions shifted for a source register, and prevents the shift instruction from allocation to an execution unit of the processor.
    Type: Application
    Filed: December 21, 2018
    Publication date: June 25, 2020
    Inventors: Vineeth MEKKAT, Xi CHEN, Manjunath SHEVGOOR
  • Publication number: 20200201645
    Abstract: A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer. The LPT tracks information to compare an address of a store or snoop microoperation with entries in the LICM and re-loads a load microoperation of a matching entry.
    Type: Application
    Filed: December 21, 2018
    Publication date: June 25, 2020
    Inventors: Vineeth MEKKAT, Mark DECHENE, Zhongying ZHANG, John FAISTL, Janghaeng LEE, Hou-Jen KO, Sebastian WINKEL, Oleg MARGULIS
  • Patent number: 10540178
    Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.
    Type: Grant
    Filed: September 14, 2016
    Date of Patent: January 21, 2020
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Youfeng Wu, Sebastian Winkel, Oleg Margulis
  • Patent number: 10296343
    Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.
    Type: Grant
    Filed: March 30, 2017
    Date of Patent: May 21, 2019
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
  • Patent number: 10235177
    Abstract: In an example, an apparatus includes a binary translator (BT) including circuitry to: analyze a code block; determine that an architectural register mapped to a physical register in the physical register file is available for early reclamation; and insert a reclamation hint into the code block. In another example, a processor reclaims the physical register based at least in part on the reclamation hint.
    Type: Grant
    Filed: July 2, 2016
    Date of Patent: March 19, 2019
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Janghaeng Lee, Youfeng Wu
  • Patent number: 10228956
    Abstract: In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load to access the memory location subsequent to an execution of the first load instruction. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for an aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabled of the protection field.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: March 12, 2019
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Mark J. Dechene, Zhongying Zhang, Jason Agron, Sebastian Winkel
  • Patent number: 10120686
    Abstract: A processor includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, and a binary translator. The binary translator includes circuitry to identify a redundant store in the instruction stream, mark the start and end of a region of the instruction stream with the redundant store, remove the redundant store, and store an amended instruction stream with the redundant store removed.
    Type: Grant
    Filed: June 7, 2016
    Date of Patent: November 6, 2018
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Oleg Margulis, Ching-Tsun Chou, Youfeng Wu
  • Publication number: 20180285112
    Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.
    Type: Application
    Filed: March 30, 2017
    Publication date: October 4, 2018
    Inventors: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
  • Patent number: 9996356
    Abstract: Apparatus and method for detecting and recovering from incorrect memory dependence speculation in an out-of-order processor are described herein. For example, one embodiment of a method comprises: executing a first load instruction; detecting when the first load instruction experiences a bad store-to-load forwarding event during execution; tracking the occurrences of bad store-to-load forwarding event experienced by the first load instruction during execution; controlling enablement of an S-bit in the first load instruction based on the tracked occurrences; generating a plurality of load operations responsive to an enabled S-bit in first load instruction, wherein execution of the plurality of load operations produces a result equivalent to that from the execution of the first load instruction.
    Type: Grant
    Filed: December 26, 2015
    Date of Patent: June 12, 2018
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Oleg Margulis, Jason M. Agron, Ethan Schuchman, Sebastian Winkel, Youfeng Wu, Gisle Dankel
  • Publication number: 20180095765
    Abstract: In one implementation, a processing device is provided that includes a memory to store instructions and a processor core to execute the instructions. The processor core is to receive a sequence of instructions reordered by a binary translator for execution. A first load of the sequence of instructions is identified. The first load references a memory location that stores a data item to be loaded. An occurrence of a second load is detected. The second load to access the memory location subsequent to an execution of the first load instruction. A protection field in the first load is enabled based on the detected occurrence of the second load. The enabled protection field indicates that the first load is to be checked for an aliasing associated with the memory location with respect to a subsequent store instruction. The second load is eliminated based on the enabled of the protection field.
    Type: Application
    Filed: September 30, 2016
    Publication date: April 5, 2018
    Inventors: Vineeth Mekkat, Mark J. Dechene, Zhongying Zhang, Jason Agron, Sebastian Winkel
  • Publication number: 20180074827
    Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.
    Type: Application
    Filed: September 14, 2016
    Publication date: March 15, 2018
    Inventors: Vineeth Mekkat, Youfeng Wu, Sebastian Winkel, Oleg Margulis
  • Patent number: 9916164
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed herein. An example apparatus includes an instruction profiler to identify a predicated block within instructions to be executed by a hardware processor. The example apparatus includes a performance monitor to access a mis-prediction statistic based on an instruction address associated with the predicated block. The example apparatus includes a region former to, in response to determining that the mis-prediction statistic is above a mis-prediction threshold, include the predicated block in a predicated region for optimization.
    Type: Grant
    Filed: June 11, 2015
    Date of Patent: March 13, 2018
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Girish Venkatasubramanian, Howard H. Chen