Patents by Inventor Girish Venkatasubramanian

Girish Venkatasubramanian has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11372775
    Abstract: A processor comprising an instruction execution circuit to execute a second code stored at a second address of a memory, wherein the second code is translated from a first code stored at a first address of the memory and a translation table (TT) controller coupled to a translation table to store a TT entry comprising a mapping between the first address and the second address and an attribute field comprising an attribute value associated with execution of the second code, wherein the TT controller is to monitor execution of the second code by the instruction execution circuit and update, based on a performance metric of the execution, the attribute value of the TT entry.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: June 28, 2022
    Assignee: Intel Corporation
    Inventors: Girish Venkatasubramanian, Jason M. Agron, Cristiano Pereira, Rangeen Basu Roy Chowdhury
  • Publication number: 20200174944
    Abstract: A processor comprising an instruction execution circuit to execute a second code stored at a second address of a memory, wherein the second code is translated from a first code stored at a first address of the memory and a translation table (TT) controller coupled to a translation table to store a TT entry comprising a mapping between the first address and the second address and an attribute field comprising an attribute value associated with execution of the second code, wherein the TT controller is to monitor execution of the second code by the instruction execution circuit and update, based on a performance metric of the execution, the attribute value of the TT entry.
    Type: Application
    Filed: January 30, 2020
    Publication date: June 4, 2020
    Inventors: Girish Venkatasubramanian, Jason M. Agron, Cristiano Pereira, Rangeen Basu Roy Chowdhury
  • Patent number: 10474442
    Abstract: Methods, apparatus, systems and articles of manufacture to perform region formation for usage by a dynamic binary translation are disclosed. An example apparatus includes an initial region former to form an initial region starting at a first block of hot code of a control flow graph. The initial region former also adds blocks of hot code lying on a first hottest path of the control flow graph. A region extender extends the initial region to form an extended region including the initial region. The extended region begins at a hottest exit of the initial region and includes blocks of hot code lying on a second hottest path until one of a threshold path length has been satisfied or a back edge of the control flow graph is added to the extended region. A region pruner prunes the extended region to remove all loop nests except a selected loop nest, which forms a final region.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: November 12, 2019
    Assignee: Intel Corporation
    Inventors: Girish Venkatasubramanian, Tanima Dey, Dasarath Weeratunge, Cristiano Pereira, Jose Baiocchi Paredes
  • Publication number: 20190179766
    Abstract: A processor comprising an instruction execution circuit to execute a translated code generated based on a received code and a translation table (TT) controller circuit coupled to a translation table comprising a plurality of address mappings, wherein the TT controller circuit is to identify a trigger event associated with a physical memory page, determine, based on an identifier of the physical memory page, an entry in a manifest table, the entry comprising an address mapping between a first memory address within an address range comprising the physical memory page and a second memory address, and store the address mapping to the translation table.
    Type: Application
    Filed: December 12, 2017
    Publication date: June 13, 2019
    Inventors: Girish Venkatasubramanian, Jason M. Agron, Cristiano Pereira, Glenn Hinton, Sebastian Winkel, Rangeen Basu Roy Chowdhury
  • Publication number: 20190163642
    Abstract: A processor comprising an instruction execution circuit to execute a second code stored at a second address of a memory, wherein the second code is translated from a first code stored at a first address of the memory and a translation table (TT) controller coupled to a translation table to store a TT entry comprising a mapping between the first address and the second address and an attribute field comprising an attribute value associated with execution of the second code, wherein the TT controller is to monitor execution of the second code by the instruction execution circuit and update, based on a performance metric of the execution, the attribute value of the TT entry.
    Type: Application
    Filed: November 27, 2017
    Publication date: May 30, 2019
    Inventors: Girish Venkatasubramanian, Jason M. Agron, Cristiano Pereira, Rangeen Basu Roy Chowdhury
  • Publication number: 20190102150
    Abstract: Methods, apparatus, systems and articles of manufacture to perform region formation for usage by a dynamic binary translation are disclosed. An example apparatus includes an initial region former to form an initial region starting at a first block of hot code of a control flow graph. The initial region former also adds blocks of hot code lying on a first hottest path of the control flow graph. A region extender extends the initial region to form an extended region including the initial region. The extended region begins at a hottest exit of the initial region and includes blocks of hot code lying on a second hottest path until one of a threshold path length has been satisfied or a back edge of the control flow graph is added to the extended region. A region pruner prunes the extended region to remove all loop nests except a selected loop nest, which forms a final region.
    Type: Application
    Filed: September 29, 2017
    Publication date: April 4, 2019
    Inventors: Girish Venkatasubramanian, Tanima Dey, Dasarath Weeratunge, Cristiano Pereira, Jose Baiocchi Paredes
  • Patent number: 10191745
    Abstract: In one example a processor includes a region formation engine to identify a region of code for translation from a guest instruction set architecture to a native instruction set architecture. The processor also includes a binary translator to translate the region of code. The region formation engine is to perform aggressive region formation, which includes forming a region across a boundary of a return instruction. The translated region of code is to prevent a side entry into the translated region of code at a translated return target instruction included in the translated region of code. In more specific examples, performing aggressive region formation includes a region formation grow phase and a region formation cleanup phase. In the grow phase priority may be given to growing complete paths from a call target to a corresponding return. The region formation cleanup phase may comprise eliminating call targets that are not reachable.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: January 29, 2019
    Assignee: Intel Corporation
    Inventors: Hou-Jen Ko, Girish Venkatasubramanian, Jason Agron, Tyler Sondag, Youfeng Wu
  • Publication number: 20180285113
    Abstract: In one example a processor includes a region formation engine to identify a region of code for translation from a guest instruction set architecture to a native instruction set architecture. The processor also includes a binary translator to translate the region of code. The region formation engine is to perform aggressive region formation, which includes forming a region across a boundary of a return instruction. The translated region of code is to prevent a side entry into the translated region of code at a translated return target instruction included in the translated region of code. In more specific examples, performing aggressive region formation includes a region formation grow phase and a region formation cleanup phase. In the grow phase priority may be given to growing complete paths from a call target to a corresponding return. The region formation cleanup phase may comprise eliminating call targets that are not reachable.
    Type: Application
    Filed: March 31, 2017
    Publication date: October 4, 2018
    Applicant: Intel Corporation
    Inventors: Hou-Jen Ko, Girish Venkatasubramanian, Jason Agron, Tyler Sondag, Youfeng Wu
  • Patent number: 10083033
    Abstract: A method and apparatus are described for efficient register reclamation. For example, one embodiment of an apparatus comprises: single usage detection and tagging logic to examine a sequence of instructions to detect logical registers used by the sequence of instructions that have a single use and to tag an instruction as a single usage instruction if the instruction is a consumer of a logical register that has a single use; an allocator to allocate processor resources to execute the sequence of instructions, the processor resources including physical registers mapped to logical registers to execute the sequence of instructions; and register reclamation logic to free up a logical to physical mapping of a single use register in response to detecting the tag provided by the instruction tagging logic.
    Type: Grant
    Filed: March 10, 2015
    Date of Patent: September 25, 2018
    Assignee: Intel Corporation
    Inventors: Sebastian Winkel, Girish Venkatasubramanian, Tyler N. Sondag, Rolf Kassa
  • Patent number: 10055256
    Abstract: A processor includes a front end and a scheduler. The front end includes circuitry to determine whether to apply an acyclical or cyclical thread assignment scheme to code received at the processor, and to, based upon a determined thread assignment scheme, assign code to a static logical thread and to a rotating logical thread. The scheduler includes circuitry to assign the static logical thread to the same physical thread upon a subsequent control flow execution of the static logical thread, and to assign the rotating logical thread to different physical threads upon different executions of instructions in the rotating logical thread.
    Type: Grant
    Filed: February 29, 2016
    Date of Patent: August 21, 2018
    Assignee: Intel Corporation
    Inventors: Sebastian Winkel, Ethan Schuchman, Tyler Sondag, Girish Venkatasubramanian
  • Patent number: 9916164
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed herein. An example apparatus includes an instruction profiler to identify a predicated block within instructions to be executed by a hardware processor. The example apparatus includes a performance monitor to access a mis-prediction statistic based on an instruction address associated with the predicated block. The example apparatus includes a region former to, in response to determining that the mis-prediction statistic is above a mis-prediction threshold, include the predicated block in a predicated region for optimization.
    Type: Grant
    Filed: June 11, 2015
    Date of Patent: March 13, 2018
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Girish Venkatasubramanian, Howard H. Chen
  • Patent number: 9858057
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to validate translated guest code in a dynamic binary translator. An example apparatus disclosed herein includes a translator to generate a first translation of code to execute on a host machine, the first translation of the guest code to facilitate creating a first translated guest code, and the translator to generate a second translation of the translated guest code to execute on the host machine. The example apparatus also includes a translation versions manager to identify a first host machine state based on executing a portion of the first translation, and the translation versions manager to identify a second host machine state based on executing a portion of the second translation. The example system also includes a validator to determine a state divergence status of the second translation based on a comparison between the first host machine state and the second host machine state.
    Type: Grant
    Filed: December 2, 2015
    Date of Patent: January 2, 2018
    Assignee: Intel Corporation
    Inventors: Girish Venkatasubramanian, Chaitanya Mangla, Gerolf F. Hoflehner, Ethan Schuchman
  • Patent number: 9823938
    Abstract: In one embodiment, a processor includes a front end unit to fetch and decode an instruction. The front end unit includes a first random number generator to generate a random value responsive to a profileable event associated with the instruction. The processor further includes a profile logic to collect profile information associated with the instruction responsive to a sample signal, where the sample signal is based on at least a portion of the random value. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 18, 2015
    Date of Patent: November 21, 2017
    Assignee: Intel Corporation
    Inventors: Girish Venkatasubramanian, Jamison D. Collins, Jason M. Agron, Polychronis Xekalakis
  • Publication number: 20160371065
    Abstract: In one embodiment, a processor includes a front end unit to fetch and decode an instruction. The front end unit includes a first random number generator to generate a random value responsive to a profileable event associated with the instruction. The processor further includes a profile logic to collect profile information associated with the instruction responsive to a sample signal, where the sample signal is based on at least a portion of the random value. Other embodiments are described and claimed.
    Type: Application
    Filed: June 18, 2015
    Publication date: December 22, 2016
    Inventors: Girish Venkatasubramanian, Jamison D. Collins, Jason M. Agron, Polychronis Xekalakis
  • Publication number: 20160364240
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed herein. An example apparatus includes an instruction profiler to identify a predicated block within instructions to be executed by a hardware processor. The example apparatus includes a performance monitor to access a mis-prediction statistic based on an instruction address associated with the predicated block. The example apparatus includes a region former to, in response to determining that the mis-prediction statistic is above a mis-prediction threshold, include the predicated block in a predicated region for optimization.
    Type: Application
    Filed: June 11, 2015
    Publication date: December 15, 2016
    Inventors: Vineeth Mekkat, Girish Venkatasubramanian, Howard H. Chen
  • Patent number: 9460022
    Abstract: A mechanism is described for facilitating dynamic and efficient binary translation-based translation lookaside buffer prefetching according to one embodiment. A method of embodiments, as described herein, includes translating code blocks into code translation blocks at a computing device. The code translation blocks are submitted for execution. The method may further include tracking, in runtime, dynamic system behavior of the code translation blocks, and inferring translation lookaside buffer (TLB) prefetching based on the analysis of the tracked dynamic system behavior.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: October 4, 2016
    Assignee: Intel Corporation
    Inventors: Girish Venkatasubramanian, Ethan Schuchman
  • Publication number: 20160283247
    Abstract: Methods and apparatuses relating to selectively executing a commit instruction. In one embodiment, a data storage device stores code that when executed by a hardware processor causes the hardware processor to perform the following: translating an instruction into a translated instruction to be executed by the hardware processor, marking a commit instruction one of for execution and for optional execution by the hardware processor, and including a hint for a commit instruction marked for optional execution; and a hardware commit unit to determine if the commit instruction marked for optional execution is to be executed based on the hint.
    Type: Application
    Filed: March 25, 2015
    Publication date: September 29, 2016
    Inventors: Girish Venkatasubramanian, Ethan Schuchman, David Keppel, Sebastian Winkel, David N. Mackintosh, Jaroslaw Topp
  • Publication number: 20160274944
    Abstract: A processor includes a front end and a scheduler. The front end includes circuitry to determine whether to apply an acyclical or cyclical thread assignment scheme to code received at the processor, and to, based upon a determined thread assignment scheme, assign code to a static logical thread and to a rotating logical thread. The scheduler includes circuitry to assign the static logical thread to the same physical thread upon a subsequent control flow execution of the static logical thread, and to assign the rotating logical thread to different physical threads upon different executions of instructions in the rotating logical thread.
    Type: Application
    Filed: February 29, 2016
    Publication date: September 22, 2016
    Inventors: Sebastian Winkel, Ethan Schuchman, Tyler Sondag, Girish Venkatasubramanian
  • Publication number: 20160266901
    Abstract: A method and apparatus are described for efficient register reclamation. For example, one embodiment of an apparatus comprises: single usage detection and tagging logic to examine a sequence of instructions to detect logical registers used by the sequence of instructions that have a single use and to tag an instruction as a single usage instruction if the instruction is a consumer of a logical register that has a single use; an allocator to allocate processor resources to execute the sequence of instructions, the processor resources including physical registers mapped to logical registers to execute the sequence of instructions; and register reclamation logic to free up a logical to physical mapping of a single use register in response to detecting the tag provided by the instruction tagging logic.
    Type: Application
    Filed: March 10, 2015
    Publication date: September 15, 2016
    Inventors: Sebastian Winkel, Girish Venkatasubramanian, Tyler N. Sondag, Rolf Kassa
  • Publication number: 20160085556
    Abstract: A processor includes a front end and a scheduler. The front end includes logic to determine whether to apply an acyclical or cyclical thread assignment scheme to code received at the processor, and to, based upon a determined thread assignment scheme, assign code to a static logical thread and to a rotating logical thread. The scheduler includes logic to assign the static logical thread to the same physical thread upon a subsequent control flow execution of the static logical thread, and to assign the rotating logical thread to different physical threads upon different executions of instructions in the rotating logical thread.
    Type: Application
    Filed: September 24, 2014
    Publication date: March 24, 2016
    Inventors: Sebastian Winkel, Ethan Schuchman, Tyler Sondag, Girish Venkatasubramanian