Patents by Inventor Girish Venkatasubramanian
Girish Venkatasubramanian has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11372775
Abstract: A processor comprising an instruction execution circuit to execute a second code stored at a second address of a memory, wherein the second code is translated from a first code stored at a first address of the memory and a translation table (TT) controller coupled to a translation table to store a TT entry comprising a mapping between the first address and the second address and an attribute field comprising an attribute value associated with execution of the second code, wherein the TT controller is to monitor execution of the second code by the instruction execution circuit and update, based on a performance metric of the execution, the attribute value of the TT entry.
Type: Grant
Filed: January 30, 2020
Date of Patent: June 28, 2022
Assignee: Intel Corporation
Inventors: Girish Venkatasubramanian, Jason M. Agron, Cristiano Pereira, Rangeen Basu Roy Chowdhury
-
Publication number: 20200174944
Abstract: A processor comprising an instruction execution circuit to execute a second code stored at a second address of a memory, wherein the second code is translated from a first code stored at a first address of the memory and a translation table (TT) controller coupled to a translation table to store a TT entry comprising a mapping between the first address and the second address and an attribute field comprising an attribute value associated with execution of the second code, wherein the TT controller is to monitor execution of the second code by the instruction execution circuit and update, based on a performance metric of the execution, the attribute value of the TT entry.
Type: Application
Filed: January 30, 2020
Publication date: June 4, 2020
Inventors: Girish Venkatasubramanian, Jason M. Agron, Cristiano Pereira, Rangeen Basu Roy Chowdhury
-
Patent number: 10474442
Abstract: Methods, apparatus, systems and articles of manufacture to perform region formation for usage by a dynamic binary translation are disclosed. An example apparatus includes an initial region former to form an initial region starting at a first block of hot code of a control flow graph. The initial region former also adds blocks of hot code lying on a first hottest path of the control flow graph. A region extender extends the initial region to form an extended region including the initial region. The extended region begins at a hottest exit of the initial region and includes blocks of hot code lying on a second hottest path until one of a threshold path length has been satisfied or a back edge of the control flow graph is added to the extended region. A region pruner prunes the extended region to remove all loop nests except a selected loop nest, which forms a final region.
Type: Grant
Filed: September 29, 2017
Date of Patent: November 12, 2019
Assignee: Intel Corporation
Inventors: Girish Venkatasubramanian, Tanima Dey, Dasarath Weeratunge, Cristiano Pereira, Jose Baiocchi Paredes
-
Publication number: 20190179766
Abstract: A processor comprising an instruction execution circuit to execute a translated code generated based on a received code and a translation table (TT) controller circuit coupled to a translation table comprising a plurality of address mappings, wherein the TT controller circuit is to identify a trigger event associated with a physical memory page, determine, based on an identifier of the physical memory page, an entry in a manifest table, the entry comprising an address mapping between a first memory address within an address range comprising the physical memory page and a second memory address, and store the address mapping to the translation table.
Type: Application
Filed: December 12, 2017
Publication date: June 13, 2019
Inventors: Girish Venkatasubramanian, Jason M. Agron, Cristiano Pereira, Glenn Hinton, Sebastian Winkel, Rangeen Basu Roy Chowdhury
-
Publication number: 20190163642
Abstract: A processor comprising an instruction execution circuit to execute a second code stored at a second address of a memory, wherein the second code is translated from a first code stored at a first address of the memory and a translation table (TT) controller coupled to a translation table to store a TT entry comprising a mapping between the first address and the second address and an attribute field comprising an attribute value associated with execution of the second code, wherein the TT controller is to monitor execution of the second code by the instruction execution circuit and update, based on a performance metric of the execution, the attribute value of the TT entry.
Type: Application
Filed: November 27, 2017
Publication date: May 30, 2019
Inventors: Girish Venkatasubramanian, Jason M. Agron, Cristiano Pereira, Rangeen Basu Roy Chowdhury
-
Publication number: 20190102150
Abstract: Methods, apparatus, systems and articles of manufacture to perform region formation for usage by a dynamic binary translation are disclosed. An example apparatus includes an initial region former to form an initial region starting at a first block of hot code of a control flow graph. The initial region former also adds blocks of hot code lying on a first hottest path of the control flow graph. A region extender extends the initial region to form an extended region including the initial region. The extended region begins at a hottest exit of the initial region and includes blocks of hot code lying on a second hottest path until one of a threshold path length has been satisfied or a back edge of the control flow graph is added to the extended region. A region pruner prunes the extended region to remove all loop nests except a selected loop nest, which forms a final region.
Type: Application
Filed: September 29, 2017
Publication date: April 4, 2019
Inventors: Girish Venkatasubramanian, Tanima Dey, Dasarath Weeratunge, Cristiano Pereira, Jose Baiocchi Paredes
-
Patent number: 10191745
Abstract: In one example a processor includes a region formation engine to identify a region of code for translation from a guest instruction set architecture to a native instruction set architecture. The processor also includes a binary translator to translate the region of code. The region formation engine is to perform aggressive region formation, which includes forming a region across a boundary of a return instruction. The translated region of code is to prevent a side entry into the translated region of code at a translated return target instruction included in the translated region of code. In more specific examples, performing aggressive region formation includes a region formation grow phase and a region formation cleanup phase. In the grow phase priority may be given to growing complete paths from a call target to a corresponding return. The region formation cleanup phase may comprise eliminating call targets that are not reachable.
Type: Grant
Filed: March 31, 2017
Date of Patent: January 29, 2019
Assignee: Intel Corporation
Inventors: Hou-Jen Ko, Girish Venkatasubramanian, Jason Agron, Tyler Sondag, Youfeng Wu
-
Publication number: 20180285113
Abstract: In one example a processor includes a region formation engine to identify a region of code for translation from a guest instruction set architecture to a native instruction set architecture. The processor also includes a binary translator to translate the region of code. The region formation engine is to perform aggressive region formation, which includes forming a region across a boundary of a return instruction. The translated region of code is to prevent a side entry into the translated region of code at a translated return target instruction included in the translated region of code. In more specific examples, performing aggressive region formation includes a region formation grow phase and a region formation cleanup phase. In the grow phase priority may be given to growing complete paths from a call target to a corresponding return. The region formation cleanup phase may comprise eliminating call targets that are not reachable.
Type: Application
Filed: March 31, 2017
Publication date: October 4, 2018
Applicant: Intel Corporation
Inventors: Hou-Jen Ko, Girish Venkatasubramanian, Jason Agron, Tyler Sondag, Youfeng Wu
-
Patent number: 10083033
Abstract: A method and apparatus are described for efficient register reclamation. For example, one embodiment of an apparatus comprises: single usage detection and tagging logic to examine a sequence of instructions to detect logical registers used by the sequence of instructions that have a single use and to tag an instruction as a single usage instruction if the instruction is a consumer of a logical register that has a single use; an allocator to allocate processor resources to execute the sequence of instructions, the processor resources including physical registers mapped to logical registers to execute the sequence of instructions; and register reclamation logic to free up a logical to physical mapping of a single use register in response to detecting the tag provided by the instruction tagging logic.
Type: Grant
Filed: March 10, 2015
Date of Patent: September 25, 2018
Assignee: Intel Corporation
Inventors: Sebastian Winkel, Girish Venkatasubramanian, Tyler N. Sondag, Rolf Kassa
-
Patent number: 10055256
Abstract: A processor includes a front end and a scheduler. The front end includes circuitry to determine whether to apply an acyclical or cyclical thread assignment scheme to code received at the processor, and to, based upon a determined thread assignment scheme, assign code to a static logical thread and to a rotating logical thread. The scheduler includes circuitry to assign the static logical thread to the same physical thread upon a subsequent control flow execution of the static logical thread, and to assign the rotating logical thread to different physical threads upon different executions of instructions in the rotating logical thread.
Type: Grant
Filed: February 29, 2016
Date of Patent: August 21, 2018
Assignee: Intel Corporation
Inventors: Sebastian Winkel, Ethan Schuchman, Tyler Sondag, Girish Venkatasubramanian
-
Patent number: 9916164
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed herein. An example apparatus includes an instruction profiler to identify a predicated block within instructions to be executed by a hardware processor. The example apparatus includes a performance monitor to access a mis-prediction statistic based on an instruction address associated with the predicated block. The example apparatus includes a region former to, in response to determining that the mis-prediction statistic is above a mis-prediction threshold, include the predicated block in a predicated region for optimization.
Type: Grant
Filed: June 11, 2015
Date of Patent: March 13, 2018
Assignee: Intel Corporation
Inventors: Vineeth Mekkat, Girish Venkatasubramanian, Howard H. Chen
-
Patent number: 9858057
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to validate translated guest code in a dynamic binary translator. An example apparatus disclosed herein includes a translator to generate a first translation of code to execute on a host machine, the first translation of the guest code to facilitate creating a first translated guest code, and the translator to generate a second translation of the translated guest code to execute on the host machine. The example apparatus also includes a translation versions manager to identify a first host machine state based on executing a portion of the first translation, and the translation versions manager to identify a second host machine state based on executing a portion of the second translation. The example system also includes a validator to determine a state divergence status of the second translation based on a comparison between the first host machine state and the second host machine state.
Type: Grant
Filed: December 2, 2015
Date of Patent: January 2, 2018
Assignee: Intel Corporation
Inventors: Girish Venkatasubramanian, Chaitanya Mangla, Gerolf F. Hoflehner, Ethan Schuchman
-
Patent number: 9823938
Abstract: In one embodiment, a processor includes a front end unit to fetch and decode an instruction. The front end unit includes a first random number generator to generate a random value responsive to a profileable event associated with the instruction. The processor further includes a profile logic to collect profile information associated with the instruction responsive to a sample signal, where the sample signal is based on at least a portion of the random value. Other embodiments are described and claimed.
Type: Grant
Filed: June 18, 2015
Date of Patent: November 21, 2017
Assignee: Intel Corporation
Inventors: Girish Venkatasubramanian, Jamison D. Collins, Jason M. Agron, Polychronis Xekalakis
-
Publication number: 20160371065
Abstract: In one embodiment, a processor includes a front end unit to fetch and decode an instruction. The front end unit includes a first random number generator to generate a random value responsive to a profileable event associated with the instruction. The processor further includes a profile logic to collect profile information associated with the instruction responsive to a sample signal, where the sample signal is based on at least a portion of the random value. Other embodiments are described and claimed.
Type: Application
Filed: June 18, 2015
Publication date: December 22, 2016
Inventors: Girish Venkatasubramanian, Jamison D. Collins, Jason M. Agron, Polychronis Xekalakis
-
Publication number: 20160364240
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed herein. An example apparatus includes an instruction profiler to identify a predicated block within instructions to be executed by a hardware processor. The example apparatus includes a performance monitor to access a mis-prediction statistic based on an instruction address associated with the predicated block. The example apparatus includes a region former to, in response to determining that the mis-prediction statistic is above a mis-prediction threshold, include the predicated block in a predicated region for optimization.
Type: Application
Filed: June 11, 2015
Publication date: December 15, 2016
Inventors: Vineeth Mekkat, Girish Venkatasubramanian, Howard H. Chen
-
Patent number: 9460022
Abstract: A mechanism is described for facilitating dynamic and efficient binary translation-based translation lookaside buffer prefetching according to one embodiment. A method of embodiments, as described herein, includes translating code blocks into code translation blocks at a computing device. The code translation blocks are submitted for execution. The method may further include tracking, in runtime, dynamic system behavior of the code translation blocks, and inferring translation lookaside buffer (TLB) prefetching based on the analysis of the tracked dynamic system behavior.
Type: Grant
Filed: March 15, 2013
Date of Patent: October 4, 2016
Assignee: Intel Corporation
Inventors: Girish Venkatasubramanian, Ethan Schuchman
-
Publication number: 20160283247
Abstract: Methods and apparatuses relating to selectively executing a commit instruction. In one embodiment, a data storage device stores code that when executed by a hardware processor causes the hardware processor to perform the following: translating an instruction into a translated instruction to be executed by the hardware processor, marking a commit instruction one of for execution and for optional execution by the hardware processor, and including a hint for a commit instruction marked for optional execution; and a hardware commit unit to determine if the commit instruction marked for optional execution is to be executed based on the hint.
Type: Application
Filed: March 25, 2015
Publication date: September 29, 2016
Inventors: Girish Venkatasubramanian, Ethan Schuchman, David Keppel, Sebastian Winkel, David N. Mackintosh, Jaroslaw Topp
-
Publication number: 20160274944
Abstract: A processor includes a front end and a scheduler. The front end includes circuitry to determine whether to apply an acyclical or cyclical thread assignment scheme to code received at the processor, and to, based upon a determined thread assignment scheme, assign code to a static logical thread and to a rotating logical thread. The scheduler includes circuitry to assign the static logical thread to the same physical thread upon a subsequent control flow execution of the static logical thread, and to assign the rotating logical thread to different physical threads upon different executions of instructions in the rotating logical thread.
Type: Application
Filed: February 29, 2016
Publication date: September 22, 2016
Inventors: Sebastian Winkel, Ethan Schuchman, Tyler Sondag, Girish Venkatasubramanian
-
Publication number: 20160266901
Abstract: A method and apparatus are described for efficient register reclamation. For example, one embodiment of an apparatus comprises: single usage detection and tagging logic to examine a sequence of instructions to detect logical registers used by the sequence of instructions that have a single use and to tag an instruction as a single usage instruction if the instruction is a consumer of a logical register that has a single use; an allocator to allocate processor resources to execute the sequence of instructions, the processor resources including physical registers mapped to logical registers to execute the sequence of instructions; and register reclamation logic to free up a logical to physical mapping of a single use register in response to detecting the tag provided by the instruction tagging logic.
Type: Application
Filed: March 10, 2015
Publication date: September 15, 2016
Inventors: Sebastian Winkel, Girish Venkatasubramanian, Tyler N. Sondag, Rolf Kassa
-
Publication number: 20160085556
Abstract: A processor includes a front end and a scheduler. The front end includes logic to determine whether to apply an acyclical or cyclical thread assignment scheme to code received at the processor, and to, based upon a determined thread assignment scheme, assign code to a static logical thread and to a rotating logical thread. The scheduler includes logic to assign the static logical thread to the same physical thread upon a subsequent control flow execution of the static logical thread, and to assign the rotating logical thread to different physical threads upon different executions of instructions in the rotating logical thread.
Type: Application
Filed: September 24, 2014
Publication date: March 24, 2016
Inventors: Sebastian Winkel, Ethan Schuchman, Tyler Sondag, Girish Venkatasubramanian