Patents by Inventor John Kevin Patrick O'Brien

John Kevin Patrick O'Brien has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10782973
    Abstract: A method includes a computer device receiving a branch instruction; the computer device managing two tables, where a first table relates to application blocks and a second table relates to available address slots; and the computer device calculating a target of the branch instruction using a branch-to-link register, the computer device optimizes re-wiring in a cache using the calculation and the managed two tables.
    Type: Grant
    Filed: May 14, 2015
    Date of Patent: September 22, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Carlo Bertolli, John Kevin Patrick O'Brien, Alexandre E Eichenberger, Zehra Noman Sura
  • Publication number: 20160335087
    Abstract: A method includes a computer device receiving a branch instruction; the computer device managing two tables, where a first table relates to application blocks and a second table relates to available address slots; and the computer device calculating a target of the branch instruction using a branch-to-link register, the computer device optimizes re-wiring in a cache using the calculation and the managed two tables.
    Type: Application
    Filed: May 14, 2015
    Publication date: November 17, 2016
    Inventors: Carlo Bertolli, John Kevin Patrick O'Brien, Alexandre E. Eichenberger, Zehra Noman Sura
  • Patent number: 9038040
    Abstract: Partitioning programs between a general purpose core and one or more accelerators is provided. A compiler front end is provided for converting a program source code in a corresponding high level programming language into an intermediate code representation. This intermediate code representation is provided to an interprocedural optimizer which determines which core processor or accelerator each portion of the program should execute on and partitions the program into sub-programs based on this set of decisions. The interprocedural optimizer may further add instructions to the partitions to coordinate and synchronize the sub-programs as required. Each sub-program is compiled on an appropriate compiler backend for the instruction set architecture of the particular core processor or accelerator selected to execute the sub-program. The compiled sub-programs and then linked to thereby generate an executable program.
    Type: Grant
    Filed: January 25, 2006
    Date of Patent: May 19, 2015
    Assignee: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel A. Prener
  • Patent number: 8516230
    Abstract: An application thread executes a direct branch instruction that is stored in an instruction cache line. Upon execution, the direct branch instruction branches to a branch descriptor that is also stored in the instruction cache line. The branch descriptor includes a trampoline branch instruction and a target instruction space address. Next, the trampoline branch instruction sends a branch descriptor pointer, which points to the branch descriptor, to an instruction cache manager. The instruction cache manager extracts the target instruction space address from the branch descriptor, and executes a target instruction corresponding to the target instruction space address. In one embodiment, the instruction cache manager generates a target local store address by masking off a portion of bits included in the target instruction space address. In turn, the application thread executes the target instruction located at the target local store address accordingly.
    Type: Grant
    Filed: December 29, 2009
    Date of Patent: August 20, 2013
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad William Michael, Mark Richard Nutter, Kathryn M. O'Brien, John Kevin Patrick O'Brien
  • Patent number: 8375374
    Abstract: An mechanism is provided for partitioning programs between a general purpose core and one or more accelerators. With the apparatus and method, a compiler front end is provided for converting a program source code in a corresponding high level programming language into an intermediate code representation. This intermediate code representation is provided to an interprocedural optimizer which determines which core processor or accelerator each portion of the program should execute on and partitions the program into sub-programs based on this set of decisions. The interprocedural optimizer may further add instructions to the partitions to coordinate and synchronize the sub-programs as required. Each sub-program is compiled on an appropriate compiler backend for the instruction set architecture of the particular core processor or accelerator selected to execute the sub-program. The compiled sub-programs and then linked to thereby generate an executable program.
    Type: Grant
    Filed: May 27, 2008
    Date of Patent: February 12, 2013
    Assignee: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel A. Prener
  • Patent number: 8370817
    Abstract: A mechanism is provided for optimizing scalar code executed on a single instruction multiple data (SIMD) engine by aligning the slots of SIMD registers. With the mechanism, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
    Type: Grant
    Filed: May 27, 2008
    Date of Patent: February 5, 2013
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien
  • Patent number: 8370575
    Abstract: Process, cache memory, computer product and system for loading data associated with a requested address in a software cache. The process includes loading address tags associated with a set in a cache directory using a Single Instruction Multiple Data (SIMD) operation, determining a position of the requested address in the set using a SIMD comparison, and determining an actual data value associated with the position of the requested address in the set.
    Type: Grant
    Filed: September 7, 2006
    Date of Patent: February 5, 2013
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Tao Zhang
  • Patent number: 8359435
    Abstract: A method for computing includes executing a program, including multiple cacheable lines of executable code, on a processor having a software-managed cache. A run-time cache management routine running on the processor is used to assemble a profile of inter-line jumps occurring in the software-managed cache while executing the program. Based on the profile, an optimized layout of the lines in the code is computed, and the lines of the program are re-ordered in accordance with the optimized layout while continuing to execute the program.
    Type: Grant
    Filed: December 16, 2009
    Date of Patent: January 22, 2013
    Assignee: International Business Machines Corporation
    Inventors: Revital Erez, Brian Flachs, Mark Richard Nutter, John Kevin Patrick O'Brien, Ulrich Weigand, Ayal Zaks
  • Patent number: 8214808
    Abstract: A system and method for speculative assistance to a thread in a heterogeneous processing environment is provided. A first set of instructions is identified in a source code representation (e.g., a source code file) that is suitable for speculative execution. The identified set of instructions are analyzed to determine the processing requirements. Based on the analysis, a processor type is identified that will be used to execute the identified first set of instructions based. The processor type is selected from more than one processor types that are included in the heterogeneous processing environment. The heterogeneous processing environment includes more than one heterogeneous processing cores in a single silicon substrate. The various processing cores can utilize different instruction set architectures (ISAs). An object code representation is then generated for the identified first set of instructions with the object code representation being adapted to execute on the determined type of processor.
    Type: Grant
    Filed: May 7, 2007
    Date of Patent: July 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Michael Norman Day, Michael Karl Gschwind, John Kevin Patrick O'Brien, Kathryn O'Brien
  • Patent number: 8214816
    Abstract: A compiler implemented software cache in which non-aliased explicitly fetched data are excluded are provided. With the mechanisms of the illustrative embodiments, a compiler uses a forward data flow analysis to prove that there is no alias between the cached data and explicitly fetched data. Explicitly fetched data that has no alias in the cached data are excluded from the software cache. Explicitly fetched data that has aliases in the cached data are allowed to be stored in the software cache. In this way, there is no runtime overhead to maintain the correctness of the two copies of data. Moreover, the number of lines of the software cache that must be protected from eviction is decreased. This leads to a decrease in the amount of computation cycles required by the cache miss handler when evicting cache lines during cache miss handling.
    Type: Grant
    Filed: May 28, 2008
    Date of Patent: July 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, John Kevin Patrick O'Brien, Kathryn O'Brien, Byoungro So, Zehra N. Sura, Tao Zhang
  • Patent number: 8146067
    Abstract: Vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores is presented. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.
    Type: Grant
    Filed: April 23, 2008
    Date of Patent: March 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu
  • Patent number: 8141067
    Abstract: A “kill” intrinsic that may be used in programs for designating specific data objects as having been “killed” by a preceding action is provided. The concept of a data object being “killed” is that the compiler is informed that no operations (e.g., loads and stores) on that data object, or its aliases, can be moved across the point in the program flow where the data object is designated as having been “killed.” The “kill” intrinsic limits the reordering capability of an optimization scheduler of a compiler with regard to operations performed on “killed” data objects. The “kill” intrinsic may be used with direct memory access (DMA) operations. Data objects being DMA'ed from a local store of a processor may be “killed” through use of the “kill” intrinsic prior to submitting the DMA request. Data objects being DMA'ed to the local store of the processor may be “killed” after verifying the transfer completes.
    Type: Grant
    Filed: May 29, 2008
    Date of Patent: March 20, 2012
    Assignee: International Business Machines Corporation
    Inventors: Daniel A. Brokenshire, John Kevin Patrick O'Brien
  • Patent number: 8132169
    Abstract: A system and method for dividing an application into a number of logical program partitions is presented. Each of these logical program partitions are stored in a logical program package along with a execution monitor. The execution monitor runs in one of the processing environments of a heterogeneous processing environment. The logical program partition includes sets of object code for executing on each of the types of processors included in the heterogeneous processing environment. The logical program partition includes instrumentation data used to evaluate the performance of a currently executing partition. The execution monitor compares the instrumentation data to the gathered profile data. If the execution monitor determines that the partition is performing poorly then the code for the other environment is retrieved from the logical program package and loaded and executed on the other environment.
    Type: Grant
    Filed: July 21, 2006
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, John Kevin Patrick O'Brien, Kathryn O'Brien
  • Patent number: 8126957
    Abstract: An approach for managing position independent code using a software framework is presented. A software framework provides the ability to cache multiple plug-in's which are loaded in a processor's local storage. A processor receives a command or data stream from another processor, which includes information corresponding to a particular plug-in. The processor uses the plug-in identifier to load the plug-in from shared memory into local memory before it is required in order to minimize latency. When the data stream requests the processor to use the plug-in, the processor retrieves a location offset corresponding to the plug-in and applies the plug-in to the data stream. A plug-in manager manages an entry point table that identifies memory locations corresponding to each plug-in and, therefore, plug-ins may be placed anywhere in a processor's local memory.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: February 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Michael Stan Gowen, Barry L Minor, Mark Richard Nutter, John Kevin Patrick O'Brien
  • Patent number: 8037463
    Abstract: The present invention provides for a system for computer program functional partitioning for heterogeneous multi-processing systems. At least one system parameter of a computer system comprising one or more disparate processing nodes is identified. Computer program code comprising a program to be run on the computer system is received. A whole program representation is generated based on received computer program code. At least one single-entry-single-exit (SESE) region is identified based on the whole program representation. At least one node-specific SESE region is identified based on identified SESE regions and the at least one system parameter. Each node-specific SESE region is grouped into a node-specific subroutine. Each node-specific subroutine is compiled based on a specified node characteristic. The computer program code is modified based on the node-specific subroutines and the modified computer program code is compiled.
    Type: Grant
    Filed: January 8, 2009
    Date of Patent: October 11, 2011
    Assignee: International Business Machines Corporation
    Inventors: Kathryn M. O'Brien, John Kevin Patrick O'Brien
  • Patent number: 8032873
    Abstract: The present invention provides for a system for computer program code size partitioning for multiple memory multi-processor systems. At least one system parameter of a computer system comprising one or more disparate processing nodes is identified. Computer program code comprising a program to be run on the computer system is received. A program representation based on received computer program code is generated. At least one single-entry-single-exit (SESE) region is identified based on the whole program representation. At least one SESE region of less than a certain size (store-size-specific) is identified based on identified SESE regions and the at least one system parameter. Each store-size-specific SESE region is grouped into a node-specific subroutine. The non node-specific parts of the computer program code are modified based on the partitioning into node-specific subroutines. The modified computer program code including each node-specific subroutine is compiled based on a specified node characteristic.
    Type: Grant
    Filed: December 17, 2008
    Date of Patent: October 4, 2011
    Assignee: International Business Machines Corporation
    Inventors: Kathryn M. O'Brien, John Kevin Patrick O'Brien
  • Patent number: 8010957
    Abstract: A computer implemented method, apparatus, and computer usable program code for eliminating redundant read-modify-write code sequences in non-vectorizable code. Code is received comprising a sequence of operations. The sequence of operations includes a loop. Non-vectorizable operations are identified within the loop that modifies at least one sub-part of a storage location. The non-vectorizable operations are modified to include a single store operation for the number of sub-parts of the storage location.
    Type: Grant
    Filed: August 1, 2006
    Date of Patent: August 30, 2011
    Assignee: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien
  • Patent number: 8006238
    Abstract: A process, compiler, computer program product and system for workload partitioning in a heterogeneous system. The process includes determining heterogeneous alignment constraints in the workload, partitioning a portion of tasks to a processing element sensitive to alignment constraints, and partitioning a remaining portion of tasks to a processing element not sensitive to alignment constraints.
    Type: Grant
    Filed: September 26, 2006
    Date of Patent: August 23, 2011
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien, Tong Chen
  • Publication number: 20110161641
    Abstract: An application thread executes a direct branch instruction that is stored in an instruction cache line. Upon execution, the direct branch instruction branches to a branch descriptor that is also stored in the instruction cache line. The branch descriptor includes a trampoline branch instruction and a target instruction space address. Next, the trampoline branch instruction sends a branch descriptor pointer, which points to the branch descriptor, to an instruction cache manager. The instruction cache manager extracts the target instruction space address from the branch descriptor, and executes a target instruction corresponding to the target instruction space address. In one embodiment, the instruction cache manager generates a target local store address by masking off a portion of bits included in the target instruction space address. In turn, the application thread executes the target instruction located at the target local store address accordingly.
    Type: Application
    Filed: December 29, 2009
    Publication date: June 30, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: TONG CHEN, BRIAN FLACHS, BRAD WILLIAM MICHAEL, MARK RICHARD NUTTER, KATHRYN M. O'BRIEN, JOHN KEVIN PATRICK O'BRIEN
  • Publication number: 20110145503
    Abstract: A method for computing includes executing a program, including multiple cacheable lines of executable code, on a processor having a software-managed cache. A run-time cache management routine running on the processor is used to assemble a profile of inter-line jumps occurring in the software-managed cache while executing the program. Based on the profile, an optimized layout of the lines in the code is computed, and the lines of the program are re-ordered in accordance with the optimized layout while continuing to execute the program.
    Type: Application
    Filed: December 16, 2009
    Publication date: June 16, 2011
    Applicant: International Business Machines Corporation
    Inventors: Revital Erez, Brian Flachs, Mark Richard Nutter, John Kevin Patrick O'Brien, Ulrich Weigand, Ayal Zaks