Patents by Inventor Kathryn M. O'Brien

Kathryn M. O'Brien has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7512745
    Abstract: Garbage collection in heterogeneous multiprocessor systems is provided. In some illustrative embodiments, garbage collection operations are distributed across a plurality of the processors in the heterogeneous multiprocessor system. Portions of a global mark queue are assigned to processors of the heterogeneous multiprocessor system along with corresponding chunks of a shared memory. The processors perform garbage collection on their assigned portions of the global mark queue and corresponding chunk of shared memory marking memory object references as reachable or adding memory object references to a non-local mark stack. The marked memory objects are merged with a global mark stack and memory object references in the non-local mark stack are merged with a “to be traced” portion of the global mark queue for re-checking using a garbage collection operation.
    Type: Grant
    Filed: April 28, 2006
    Date of Patent: March 31, 2009
    Assignee: International Business Machines Corporation
    Inventors: Michael K. Gschwind, John Kevin Patrick O'Brien, Kathryn M. O'Brien
  • Publication number: 20090083724
    Abstract: A system and method for advanced polyhedral loop transformations of source code in a compiler are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.
    Type: Application
    Filed: September 26, 2007
    Publication date: March 26, 2009
    Inventors: Alexandre E. Eichenberger, John K. P. O'Brien, Kathryn M. O'Brien, Nicolas T. Vasilache
  • Publication number: 20090083722
    Abstract: A system and method for stable transitions in the presence of conditionals for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.
    Type: Application
    Filed: September 26, 2007
    Publication date: March 26, 2009
    Inventors: Alexandre E. Eichenberger, John K. P. O'Brien, Kathryn M. O'Brien, Nicolas T. Vasilache
  • Publication number: 20090083702
    Abstract: A system and method for selective code generation optimization for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.
    Type: Application
    Filed: September 26, 2007
    Publication date: March 26, 2009
    Inventors: Alexandre E. Eichenberger, John K.P. O'Brien, Kathryn M. O'Brien, Nicolas T. Vasilache
  • Patent number: 7493452
    Abstract: A method to efficiently pre-fetch and batch compiler-assisted software cache accesses is provided. The method reduces the overhead associated with software cache directory accesses. With the method, the local memory address of the cache line that stores the pre-fetched data is itself cached, such as in a register or well known location in local memory, so that a later data access does not need to perform address translation and software cache operations and can instead access the data directly from the software cache using the cached local memory address. This saves processor cycles that would otherwise be required to perform the address translation a second time when the data is to be used. Moreover, the system and method directly enable software cache accesses to be effectively decoupled from address translation in order to increase the overlap between computation and communication.
    Type: Grant
    Filed: August 18, 2006
    Date of Patent: February 17, 2009
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien
  • Patent number: 7487496
    Abstract: The present invention provides for a method for computer program functional partitioning for heterogeneous multi-processing systems. At least one system parameter of a computer system comprising one or more disparate processing nodes is identified. Computer program code comprising a program to be run on the computer system is received. A whole program representation is generated based on received computer program code. At least one single-entry-single-exit (SESE) region is identified based on the whole program representation. At least one node-specific SESE region is identified based on identified SESE regions and the at least one system parameter. Each node-specific SESE region is grouped into a node-specific subroutine. Each node-specific subroutine is compiled based on a specified node characteristic. The computer program code is modified based on the node-specific subroutines and the modified computer program code is compiled.
    Type: Grant
    Filed: December 2, 2004
    Date of Patent: February 3, 2009
    Assignee: International Business Machines Corporation
    Inventors: Kathryn M. O'Brien, John Kevin Patrick O'Brien
  • Patent number: 7478376
    Abstract: The present invention provides for a method for computer program code size partitioning for multiple memory multi-processor systems. At least one system parameter of a computer system comprising one or more disparate processing nodes is identified. Computer program code comprising a program to be run on the computer system is received. A program representation based on received computer program code is generated. At least one single-entry-single-exit (SESE) region is identified based on the whole program representation. At least one SESE region of less than a certain size (store-size-specific) is identified based on identified SESE regions and the at least one system parameter. Each store-size-specific SESE region is grouped into a node-specific subroutine. The non node-specific parts of the computer program code are modified based on the partitioning into node-specific subroutines. The modified computer program code including each node-specific subroutine is compiled based on a specified node characteristic.
    Type: Grant
    Filed: December 2, 2004
    Date of Patent: January 13, 2009
    Assignee: International Business Machines Corporation
    Inventors: Kathryn M. O'Brien, John Kevin Patrick O'Brien
  • Publication number: 20080256521
    Abstract: An apparatus and method for partitioning programs between a general purpose core and one or more accelerators are provided. With the apparatus and method, a compiler front end is provided for converting a program source code in a corresponding high level programming language into an intermediate code representation. This intermediate code representation is provided to an interprocedural optimizer which determines which core processor or accelerator each portion of the program should execute on and partitions the program into sub-programs based on this set of decisions. The interprocedural optimizer may further add instructions to the partitions to coordinate and synchronize the sub-programs as required. Each sub-program is compiled on an appropriate compiler backend for the instruction set architecture of the particular core processor or accelerator selected to execute the sub-program. The compiled sub-programs and then linked to thereby generate an executable program.
    Type: Application
    Filed: May 27, 2008
    Publication date: October 16, 2008
    Applicant: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel A. Prener
  • Publication number: 20080229298
    Abstract: A compiler includes a mechanism for employing multiple synergistic processors to execute long vectors. The compiler receives a single source program. The compiler identifies vectorizable loop code in the single source program and extracts the vectorizable loop code from the single source program. The compiler then compiles the extracted vectorizable loop code for a plurality of synergistic processors. The compiler also compiles a remainder of the single source program for a principal processor to form an executable main program such that the executable main program controls operation of the executable vectorizable loop code on the plurality of synergistic processors.
    Type: Application
    Filed: March 15, 2007
    Publication date: September 18, 2008
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel Arthur Prener
  • Publication number: 20080077930
    Abstract: A process, compiler, computer program product and system for workload partitioning in a heterogeneous system. The process includes determining heterogeneous alignment constraints in the workload, partitioning a portion of tasks to a processing element sensitive to alignment constraints, and partitioning a remaining portion of tasks to a processing element not sensitive to alignment constraints.
    Type: Application
    Filed: September 26, 2006
    Publication date: March 27, 2008
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien, Tong Chen
  • Publication number: 20080052688
    Abstract: A computer implemented method, apparatus, and computer usable program code for eliminating redundant read-modify-write code sequences in non-vectorizable code. Code is received comprising a sequence of operations. The sequence of operations includes a loop. Non-vectorizable operations are identified within the loop that modifies at least one sub-part of a storage location. The non-vectorizable operations are modified to include a single store operation for the number of sub-parts of the storage location.
    Type: Application
    Filed: August 1, 2006
    Publication date: February 28, 2008
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien
  • Publication number: 20080046657
    Abstract: A system and method to efficiently pre-fetch and batch compiler-assisted software cache accesses are provided. The system and method reduce the overhead associated with software cache directory accesses. With the system and method, the local memory address of the cache line that stores the pre-fetched data is itself cached, such as in a register or well known location in local memory, so that a later data access does not need to perform address translation and software cache operations and can instead access the data directly from the software cache using the cached local memory address. This saves processor cycles that would otherwise be required to perform the address translation a second time when the data is to be used. Moreover, the system and method directly enable software cache accesses to be effectively decoupled from address translation in order to increase the overlap between computation and communication.
    Type: Application
    Filed: August 18, 2006
    Publication date: February 21, 2008
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien
  • Publication number: 20080010635
    Abstract: A compiler includes a mechanism for improving branch prediction in a processor that supports a branch hint instruction. The compiler receives a sequence of instructions, wherein the sequence of instructions comprises a loop. This loop sequence employs an hbr instruction to avoid the misprediction penalty of the taken branch to the start of the loop on each loop iteration. However, this penalty will be incurred regardless, on exiting the loop. The compiler inserts a compare and select instruction sequence which dynamically changes the input to the hbr instruction thereby avoiding this penalty when leaving the loop.
    Type: Application
    Filed: July 7, 2006
    Publication date: January 10, 2008
    Inventors: John Kevin O'Brien, Kathryn M. O'Brien
  • Publication number: 20080005473
    Abstract: A computer implemented method, data processing system, and computer usable program code are provided for configuring a cache. A compiler performs an analysis of software code to identify cacheable information in the software code that will be accessed in the cache at runtime. The properties of the cacheable information are analyzed to form a data reference analysis. Using the data reference analysis, a cache configuration is determined for caching the cacheable information during execution of the software code. Modified lookup code is inserted in the software code based on the cache configuration used to configure the cache.
    Type: Application
    Filed: June 30, 2006
    Publication date: January 3, 2008
    Inventors: Tong Chen, John Kevin Patrick O'Brien, Kathryn M. O'Brien, Byoungro So, Zehra N. Sura, Tao Zhang
  • Patent number: 7243195
    Abstract: The present invention provides for a method for computer program code optimization for a software managed cache in either a uni-processor or a multi-processor system. A single source file comprising a plurality of array references is received. The plurality of array references is analyzed to identify predictable accesses. The plurality of array references is analyzed to identify secondary predictable accesses. One or more of the plurality of array references is aggregated based on identified predictable accesses and identified secondary predictable accesses to generate aggregated references. The single source file is restructured based on the aggregated references to generate restructured code. Prefetch code is inserted in the restructured code based on the aggregated references. Software cache update code is inserted in the restructured code based on the aggregated references. Explicit cache lookup code is inserted for the remaining unpredictable accesses.
    Type: Grant
    Filed: December 2, 2004
    Date of Patent: July 10, 2007
    Assignee: International Business Machines Corporation
    Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien
  • Patent number: 7243333
    Abstract: The present invention provides a compilation system for compiling and linking an integrated executable adapted to execute on a heterogeneous parallel processor architecture. The compiler and linker compile different segments of the source code for a first and second processor architecture, and generate appropriate stub functions directed at loading code and data to remote nodes so as to cause them to perform operations described by the transmitted code on the data. The compiler and linker generate stub objects to represent remote execution capability, and stub objects encapsulate the transfers necessary to execute code in such environment.
    Type: Grant
    Filed: October 24, 2002
    Date of Patent: July 10, 2007
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Kathryn M. O'Brien, John Kevin O'Brien, Valentina Salapura
  • Patent number: 7225431
    Abstract: The present invention provides inserting and deleting a breakpoint in a parallel processing system. A breakpoint is inserted in a module loaded into the execution environment of an attached processor unit. The breakpoint can be inserted directly. Furthermore, the unloaded image of the module can also have a breakpoint associated with it. The breakpoint can be inserted directly into the module image, or a breakpoint request can be generated, and the breakpoint is inserted when the module is loaded into the execution environment of the attached processor unit.
    Type: Grant
    Filed: October 24, 2002
    Date of Patent: May 29, 2007
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Kathryn M. O'Brien, John Kevin O'Brien, Valentina Salapura
  • Patent number: 7222332
    Abstract: The present invention provides for creating and employing code and data partitions in a heterogeneous environment. This is achieved by separating source code and data into at least two partitioned sections and at least one unpartitioned section. Generally, a partitioned section is targeted for execution on an independent memory device, such as an attached processor unit. Then, at least two overlay sections are generated from at least one partition section. The plurality of partition sections are pre-bound to each other. A root module is also created, associated with both the pre-bound plurality of partitions and the overlay sections. The root module is employable to exchange the at least two overlay sections between the first and second execution environments. The pre-bound plurality of partition sections are then bound to the at least one unpartitioned section. The binding produces an integrated executable.
    Type: Grant
    Filed: October 24, 2002
    Date of Patent: May 22, 2007
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Kathryn M. O'Brien, John Kevin O'Brien, Valentina Salapura
  • Patent number: 7213123
    Abstract: The present invention provides for the employment of a dynamic debugger for a parallel processing environment. This is achieved by dynamically updating mapping information at run-time in a mapping table, wherein the mapping table is read by the dynamic debugger.
    Type: Grant
    Filed: October 24, 2002
    Date of Patent: May 1, 2007
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Kathryn M. O'Brien, John Kevin O'Brien, Valentina Salapura
  • Patent number: 7200840
    Abstract: In the present invention, global information is passed from a first execution environment to a second execution environment, wherein both the first and second processor units comprise separate memories. The global variable is transferred through the invocation of a memory flow controller by a stub function. The global descriptor has a plurality of field indicia that allow a binder to link separate object files bound to the first and second execution environments.
    Type: Grant
    Filed: October 24, 2002
    Date of Patent: April 3, 2007
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Kathryn M. O'Brien, John Kevin O'Brien, Valentina Salapura