Patents by Inventor Suresh Srinivas

Suresh Srinivas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10725755
    Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
    Type: Grant
    Filed: June 6, 2017
    Date of Patent: July 28, 2020
    Assignee: Intel Corporation
    Inventors: David J. Sager, Ruchira Sasanka, Ron Gabor, Shlomo Raikin, Joseph Nuzman, Leeor Peled, Jason A. Domer, Ho-Seop Kim, Youfeng Wu, Koichi Yamada, Tin-Fook Ngai, Howard H. Chen, Jayaram Bobba, Jeffrey J. Cook, Omar M. Shaikh, Suresh Srinivas
  • Patent number: 10120663
    Abstract: An inter-architecture compatibility apparatus of an aspect includes a control flow transfer reception module to receive a first call procedure operation, intended for a first architecture library module, from a first architecture code module. The first call procedure operation involves a first plurality of input parameters. An application binary interface (ABI) change module is coupled with the control flow transfer reception module. The ABI change module makes ABI changes to convert the first call procedure operation involving the first plurality of input parameters to a corresponding second call procedure operation involving a second plurality of input parameters. The second call procedure operation is compatible with a second architecture library module. A control flow transfer output module is coupled with the ABI change module. The control flow transfer output module provides the second call procedure operation to the second architecture library module.
    Type: Grant
    Filed: March 28, 2014
    Date of Patent: November 6, 2018
    Assignee: Intel Corporation
    Inventors: Niranjan Hasabnis, Suresh Srinivas, Jayaram Bobba
  • Publication number: 20180060049
    Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
    Type: Application
    Filed: June 6, 2017
    Publication date: March 1, 2018
    Inventors: David J. Sager, Ruchira Sasanka, Ron Gabor, Shlomo Raikin, Joseph Nuzman, Leeor Peled, Jason A. Domer, Ho-Seop Kim, Youfeng Wu, Koichi Yamada, Tin-Fook Ngai, Howard H. Chen, Jayaram Bobba, Jeffrey J. Cook, Omar M. Shaikh, Suresh Srinivas
  • Publication number: 20170212825
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Application
    Filed: January 10, 2017
    Publication date: July 27, 2017
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
  • Patent number: 9672019
    Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
    Type: Grant
    Filed: December 25, 2010
    Date of Patent: June 6, 2017
    Assignee: Intel Corporation
    Inventors: David J. Sager, Ruchira Sasanka, Ron Gabor, Shlomo Raikin, Joseph Nuzman, Leeor Peled, Jason A. Domer, Ho-Seop Kim, Youfeng Wu, Koichi Yamada, Tin-Fook Ngai, Howard H. Chen, Jayaram Bobba, Jeffrey J. Cook, Omar M. Shaikh, Suresh Srinivas
  • Patent number: 9542191
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: January 10, 2017
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
  • Patent number: 9529645
    Abstract: Example methods and apparatus to manage object locks are disclosed. A disclosed example method includes intercepting a processor request to apply the lock on the object, identifying a performance history of the object based on a number of instances of contention, reducing computing resources of the processor by, when the number of instances is below a threshold value, generating a lock bypass for the object to cause speculative execution of target code within the object, and preventing speculative execution by applying the lock on the object when the number of instances is above the threshold value.
    Type: Grant
    Filed: March 2, 2015
    Date of Patent: December 27, 2016
    Assignee: Intel Corporation
    Inventors: Suresh Srinivas, Stephen H. Dohrmann, Mingqiu Sun, Uma Srinivasan, Ravi Rajwar, Konrad K. Lai
  • Patent number: 9417855
    Abstract: A micro-architecture may provide a hardware and software co-designed dynamic binary translation. The micro-architecture may invoke a method to perform a dynamic binary translation. The method may comprise executing original software code compiled targeting a first instruction set, using processor hardware to detect a hot spot in the software code and passing control to a binary translation translator, determining a hot spot region for translation, generating the translated code using a second instruction set, placing the translated code in a translation cache, executing the translated code from the translated cache, and transitioning back to the original software code after the translated code finishes execution.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: August 16, 2016
    Assignee: Intel Corporation
    Inventors: Abhay S. Kanhere, Paul Caprioli, Koichi Yamada, Suriya Madras-Subramanian, Suresh Srinivas
  • Patent number: 9189233
    Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. For example, a method according to one embodiment comprises: analyzing a single-threaded region of executing program code, the analysis including identifying dependencies within the single-threaded region; determining portions of the single-threaded region of executing program code which may be executed in parallel based on the analysis; assigning the portions to two or more parallel execution tracks; and executing the portions in parallel across the assigned execution tracks.
    Type: Grant
    Filed: June 26, 2012
    Date of Patent: November 17, 2015
    Assignee: Intel Corporation
    Inventors: Ruchira Sasanka, Abhinav Das, Jeffrey J. Cook, Jayaram Bobba, Arvind Krishnaswamy, David J. Sager, Suresh Srinivas
  • Patent number: 9170789
    Abstract: Embodiments of computer-implemented methods, systems, computing devices, and computer-readable media (transitory and non-transitory) are described herein for analyzing execution of a plurality of executable instructions and, based on the analysis, providing an indication of a benefit to be obtained by vectorization of at least a subset of the plurality of executable instructions. In various embodiments, the analysis may include identification of the subset of the plurality of executable instructions suitable for conversion to one or more single-instruction multiple-data (“SIMD”) instructions.
    Type: Grant
    Filed: March 5, 2013
    Date of Patent: October 27, 2015
    Assignee: Intel Corporation
    Inventors: Ruchira Sasanka, Jeffrey J. Cook, Abhinav Das, Jayaram Bobba, Michael R. Greenfield, Suresh Srinivas
  • Publication number: 20150277867
    Abstract: An inter-architecture compatibility apparatus of an aspect includes a control flow transfer reception module to receive a first call procedure operation, intended for a first architecture library module, from a first architecture code module. The first call procedure operation involves a first plurality of input parameters. An application binary interface (ABI) change module is coupled with the control flow transfer reception module. The ABI change module makes ABI changes to convert the first call procedure operation involving the first plurality of input parameters to a corresponding second call procedure operation involving a second plurality of input parameters. The second call procedure operation is compatible with a second architecture library module. A control flow transfer output module is coupled with the ABI change module. The control flow transfer output module provides the second call procedure operation to the second architecture library module.
    Type: Application
    Filed: March 28, 2014
    Publication date: October 1, 2015
    Inventors: Niranjan Hasabnis, Suresh Srinivas, Jayaram Bobba
  • Publication number: 20150169384
    Abstract: Example methods and apparatus to manage object locks are disclosed. A disclosed example method includes intercepting a processor request to apply the lock on the object, identifying a performance history of the object based on a number of instances of contention, reducing computing resources of the processor by, when the number of instances is below a threshold value, generating a lock bypass for the object to cause speculative execution of target code within the object, and preventing speculative execution by applying the lock on the object when the number of instances is above the threshold value.
    Type: Application
    Filed: March 2, 2015
    Publication date: June 18, 2015
    Inventors: Suresh Srinivas, Stephen H. Dohrmann, Mingqiu Sun, Uma Srinivasan, Ravi Rajwar, Konrad K. Lai
  • Patent number: 8972994
    Abstract: Example methods and apparatus to manage object locks are disclosed. A disclosed example method includes receiving an object lock request from a processor, the lock request associated with object lock code to lock an object, and generating object lock-bypass code based on a type of the processor, the object lock-bypass code to execute in a managed runtime in response to receiving the object lock request. The example method also includes identifying a type of instruction set architecture (ISA) associated with the processor, invoking a checkpoint instruction for the processor based on the identified ISA, suspending the object lock code from executing and executing target code when the object is uncontended, and allowing the object lock code to execute when the object is contended.
    Type: Grant
    Filed: December 23, 2009
    Date of Patent: March 3, 2015
    Assignee: Intel Corporation
    Inventors: Suresh Srinivas, Stephen H. Dohrmann, Mingqiu Sun, Uma Srinivasan, Ravi Rajwar, Konrad K. Lai
  • Publication number: 20140258677
    Abstract: Embodiments of computer-implemented methods, systems, computing devices, and computer-readable media (transitory and non-transitory) are described herein for analyzing execution of a plurality of executable instructions and, based on the analysis, providing an indication of a benefit to be obtained by vectorization of at least a subset of the plurality of executable instructions. In various embodiments, the analysis may include identification of the subset of the plurality of executable instructions suitable for conversion to one or more single-instruction multiple-data (“SIMD”) instructions.
    Type: Application
    Filed: March 5, 2013
    Publication date: September 11, 2014
    Inventors: Ruchira Sasanka, Jeffrey J. Cook, Abhinav Das, Jayaram Bobba, Michael R. Greenfield, Suresh Srinivas
  • Patent number: 8812792
    Abstract: A technique for using memory attributes to relay information to a program or other agent. More particularly, embodiments of the invention relate to using memory attribute bits to check various memory properties in an efficient manner.
    Type: Grant
    Filed: September 21, 2013
    Date of Patent: August 19, 2014
    Assignee: Intel Corporation
    Inventors: Quinn A. Jacobson, Anne W. Bracy, Hong Wang, John P. Shen, Per Hammarlund, Matthew C. Merten, Suresh Srinivas, Kshitij A. Doshi, Gautham Chinya, Bratin Saha, Ali-Reza Adl-Tabatabai, Gad Sheaffer
  • Patent number: 8775153
    Abstract: In one embodiment, a processor can operate in multiple modes, including a direct execution mode and an emulation execution mode. More specifically, the processor may operate in a partial emulation model in which source instruction set architecture (ISA) instructions are directly handled in the direct execution mode and translated code generated by an emulation engine is handled in the emulation execution mode. Embodiments may also provide for efficient transitions between the modes using information that can be stored in one or more storages of the processor and elsewhere in a system. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 23, 2009
    Date of Patent: July 8, 2014
    Assignee: Intel Corporation
    Inventors: Sebastian Winkel, Koichi Yamada, Suresh Srinivas, James E. Smith
  • Patent number: 8762127
    Abstract: In one embodiment, a processor can operate in multiple modes, including a direct execution mode and an emulation execution mode. More specifically, the processor may operate in a partial emulation model in which source instruction set architecture (ISA) instructions are directly handled in the direct execution mode and translated code generated by an emulation engine is handled in the emulation execution mode. Embodiments may also provide for efficient transitions between the modes using information that can be stored in one or more storages of the processor and elsewhere in a system. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 5, 2013
    Date of Patent: June 24, 2014
    Assignee: Intel Corporation
    Inventors: Sebastian Winkel, Koichi Yamada, Suresh Srinivas, James E. Smith
  • Publication number: 20140025901
    Abstract: A technique for using memory attributes to relay information to a program or other agent. More particularly, embodiments of the invention relate to using memory attribute bits to check various memory properties in an efficient manner.
    Type: Application
    Filed: September 21, 2013
    Publication date: January 23, 2014
    Inventors: Quinn A. Jacobson, Anne C. Bracy, Hong Wang, John P. Shen, Per Hammarlund, Matthew C. Merten, Suresh Srinivas, Kshitij A. Doshi, Gautham Chinya, Bratin Saha, Ali-Reza Adl-Tabatabai, Gad Sheaffer
  • Publication number: 20130311758
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Application
    Filed: March 30, 2012
    Publication date: November 21, 2013
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
  • Publication number: 20130283249
    Abstract: A micro-architecture may provide a hardware and software co-designed dynamic binary translation. The micro-architecture may invoke a method to perform a dynamic binary translation. The method may comprise executing original software code compiled targeting a first instruction set, using processor hardware to detect a hot spot in the software code and passing control to a binary translation translator, determining a hot spot region for translation, generating the translated code using a second instruction set, placing the translated code in a translation cache, executing the translated code from the translated cache, and transitioning back to the original software code after the translated code finishes execution.
    Type: Application
    Filed: September 30, 2011
    Publication date: October 24, 2013
    Inventors: Abhay S. Kanhere, Paul Caprioli, Koichi Yamada, Suriya Madras-Subramanian, Suresh Srinivas