Patents by Inventor John K. O'Brien

John K. O'Brien has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10223091
    Abstract: In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
    Type: Grant
    Filed: July 20, 2017
    Date of Patent: March 5, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Patent number: 10223260
    Abstract: According to one embodiment, a method of creating compiler-generated memory mapping hints in a computer system includes analyzing code, by a compiler of the computer system, to identify data access patterns in the code. System configuration information defining data processing system characteristics of a target system for the code is accessed. The data processing system characteristics include a plurality of processing resources and memory domain characteristics relative to the processing resources. A preferred allocation of data in memory domains of the target system is determined based on mapping the code to one or more selected processing resources and mapping the data to one or more of the memory domains based on the memory domain characteristics relative to the one or more selected processing resources. The preferred allocation is stored as compiler-generated memory mapping hints in a format accessible by a physical memory mapping resource of the target system.
    Type: Grant
    Filed: March 19, 2014
    Date of Patent: March 5, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kathryn M. O'Brien, John K. O'Brien, Zehra N. Sura
  • Patent number: 9971713
    Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: May 15, 2018
    Assignee: GLOBALFOUNDRIES INC.
    Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
  • Patent number: 9875089
    Abstract: In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
    Type: Grant
    Filed: June 19, 2015
    Date of Patent: January 23, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Publication number: 20170351501
    Abstract: In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
    Type: Application
    Filed: July 20, 2017
    Publication date: December 7, 2017
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Patent number: 9792098
    Abstract: In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
    Type: Grant
    Filed: March 25, 2015
    Date of Patent: October 17, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Patent number: 9772825
    Abstract: Embodiments relate to program structure-based blocking. An aspect includes receiving source code corresponding to a computer program by a compiler of a computer system. Another aspect includes determining a prefetching section in the source code by a marking module of the compiler. Yet another aspect includes performing, by a blocking module of the compiler, blocking of instructions located in the prefetching section into instruction blocks, such that the instruction blocks of the prefetching section only contain instructions that are located in the prefetching section.
    Type: Grant
    Filed: June 17, 2015
    Date of Patent: September 26, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Carlo Bertolli, Alexandre E. Eichenberger, John K. O'Brien, Zehra N. Sura
  • Patent number: 9772824
    Abstract: Embodiments relate to program structure-based blocking. An aspect includes receiving source code corresponding to a computer program by a compiler of a computer system. Another aspect includes determining a prefetching section in the source code by a marking module of the compiler. Yet another aspect includes performing, by a blocking module of the compiler, blocking of instructions located in the prefetching section into instruction blocks, such that the instruction blocks of the prefetching section only contain instructions that are located in the prefetching section.
    Type: Grant
    Filed: March 25, 2015
    Date of Patent: September 26, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Carlo Bertolli, Alexandre E. Eichenberger, John K. O'Brien, Zehra N. Sura
  • Patent number: 9513828
    Abstract: An aspect includes a table of contents (TOC) that was generated by a compiler being received at an accelerator device. The TOC includes an address of global data in a host memory space. The global data is copied from the address in the host memory space to an address in the device memory space. The address in the host memory space is obtained from the received TOC. The received TOC is updated to indicate that global data is stored at the address in the device memory space. A kernel that accesses the global data from the address in the device memory space is executed. The address in the device memory space is obtained based on contents of the updated TOC. When the executing is completed, the global data from the address in the device memory space is copied to the address in the host memory space.
    Type: Grant
    Filed: June 22, 2015
    Date of Patent: December 6, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Patent number: 9513832
    Abstract: An aspect includes a table of contents (TOC) that was generated by a compiler being received at an accelerator device. The TOC includes an address of global data in a host memory space. The global data is copied from the address in the host memory space to an address in the device memory space. The address in the host memory space is obtained from the received TOC. The received TOC is updated to indicate that global data is stored at the address in the device memory space. A kernel that accesses the global data from the address in the device memory space is executed. The address in the device memory space is obtained based on contents of the updated TOC. When the executing is completed, the global data from the address in the device memory space is copied to the address in the host memory space.
    Type: Grant
    Filed: March 25, 2015
    Date of Patent: December 6, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Patent number: 9495274
    Abstract: A computer-implemented method includes selecting a runtime for executing a program. The runtime includes a first combination of feature implementations, where each feature implementation implements a feature of an application programming interface (API). Execution of the program is monitored, and the execution uses the runtime. Monitor data is generated based on the monitoring. A second combination of feature implementations are selected, by a computer processor, where the selection is based at least in part on the monitor data. The runtime is modified by activating the second combination of feature implementations to replace the first combination of feature implementations.
    Type: Grant
    Filed: November 30, 2015
    Date of Patent: November 15, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Samuel F. Antao, Carlo Bertolli, Alexandre E. Eichenberger, John K. O'Brien
  • Patent number: 9465714
    Abstract: A computer-implemented method includes selecting a runtime for executing a program. The runtime includes a first combination of feature implementations, where each feature implementation implements a feature of an application programming interface (API). Execution of the program is monitored, and the execution uses the runtime. Monitor data is generated based on the monitoring. A second combination of feature implementations are selected, by a computer processor, where the selection is based at least in part on the monitor data. The runtime is modified by activating the second combination of feature implementations to replace the first combination of feature implementations.
    Type: Grant
    Filed: September 22, 2015
    Date of Patent: October 11, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Samuel F. Antao, Carlo Bertolli, Alexandre E. Eichenberger, John K. O'Brien
  • Publication number: 20160283209
    Abstract: In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
    Type: Application
    Filed: March 25, 2015
    Publication date: September 29, 2016
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Publication number: 20160283210
    Abstract: Embodiments relate to program structure-based blocking. An aspect includes receiving source code corresponding to a computer program by a compiler of a computer system. Another aspect includes determining a prefetching section in the source code by a marking module of the compiler. Yet another aspect includes performing, by a blocking module of the compiler, blocking of instructions located in the prefetching section into instruction blocks, such that the instruction blocks of the prefetching section only contain instructions that are located in the prefetching section.
    Type: Application
    Filed: March 25, 2015
    Publication date: September 29, 2016
    Inventors: Carlo Bertolli, Alexandre E. Eichenberger, John K. O'Brien, Zehra N. Sura
  • Publication number: 20160283208
    Abstract: Embodiments relate to program structure-based blocking. An aspect includes receiving source code corresponding to a computer program by a compiler of a computer system. Another aspect includes determining a prefetching section in the source code by a marking module of the compiler. Yet another aspect includes performing, by a blocking module of the compiler, blocking of instructions located in the prefetching section into instruction blocks, such that the instruction blocks of the prefetching section only contain instructions that are located in the prefetching section.
    Type: Application
    Filed: June 17, 2015
    Publication date: September 29, 2016
    Inventors: Carlo Bertolli, Alexandre E. Eichenberger, John K. O'Brien, Zehra N. Sura
  • Publication number: 20160283144
    Abstract: An aspect includes a table of contents (TOC) that was generated by a compiler being received at an accelerator device. The TOC includes an address of global data in a host memory space. The global data is copied from the address in the host memory space to an address in the device memory space. The address in the host memory space is obtained from the received TOC. The received TOC is updated to indicate that global data is stored at the address in the device memory space. A kernel that accesses the global data from the address in the device memory space is executed. The address in the device memory space is obtained based on contents of the updated TOC. When the executing is completed, the global data from the address in the device memory space is copied to the address in the host memory space.
    Type: Application
    Filed: June 22, 2015
    Publication date: September 29, 2016
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Publication number: 20160283158
    Abstract: An aspect includes a table of contents (TOC) that was generated by a compiler being received at an accelerator device. The TOC includes an address of global data in a host memory space. The global data is copied from the address in the host memory space to an address in the device memory space. The address in the host memory space is obtained from the received TOC. The received TOC is updated to indicate that global data is stored at the address in the device memory space. A kernel that accesses the global data from the address in the device memory space is executed. The address in the device memory space is obtained based on contents of the updated TOC. When the executing is completed, the global data from the address in the device memory space is copied to the address in the host memory space.
    Type: Application
    Filed: March 25, 2015
    Publication date: September 29, 2016
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Publication number: 20160283211
    Abstract: In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
    Type: Application
    Filed: June 19, 2015
    Publication date: September 29, 2016
    Inventors: Carlo Bertolli, John K. O'Brien, Olivier H. Sallenave, Zehra N. Sura
  • Publication number: 20160011996
    Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
    Type: Application
    Filed: April 30, 2015
    Publication date: January 14, 2016
    Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
  • Patent number: 9229715
    Abstract: A monitor bit per hardware thread in a memory location may be allocated, in a multiprocessing computer system having a plurality of hardware threads, the plurality of hardware threads sharing the memory location, and each of the allocated monitor bit corresponding to one of the plurality of hardware threads. A condition bit may be allocated for each of the plurality of hardware threads, the condition bit being allocated in each context of the plurality of hardware threads. In response to detecting the memory location being accessed, it is determined whether a monitor bit corresponding to a hardware thread in the memory location is set. In response to determining that the monitor bit corresponding to a hardware thread is set in the memory location, a condition bit corresponding to a thread accessing the memory location is set in the hardware thread's context.
    Type: Grant
    Filed: May 21, 2013
    Date of Patent: January 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Michael K. Gschwind, John K. O'Brien, Valentina Salapura, Zehra N. Sura