Patents by Inventor John Kevin Patrick O'Brien

John Kevin Patrick O'Brien has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Compiler method for employing multiple autonomous synergistic processors to simultaneously operate on longer vectors of data

Patent number: 7962906

Abstract: A compiler includes a mechanism for employing multiple synergistic processors to execute long vectors. The compiler receives a single source program. The compiler identifies vectorizable loop code in the single source program and extracts the vectorizable loop code from the single source program. The compiler then compiles the extracted vectorizable loop code for a plurality of synergistic processors. The compiler also compiles a remainder of the single source program for a principal processor to form an executable main program such that the executable main program controls operation of the executable vectorizable loop code on the plurality of synergistic processors.

Type: Grant

Filed: March 15, 2007

Date of Patent: June 14, 2011

Assignee: International Business Machines Corporation

Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel Arthur Prener
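
The division of labor in the abstract above (a main program on the principal processor driving extracted loop code on several synergistic processors) can be pictured with a small Python sketch. Everything here is an assumption made for illustration: threads stand in for synergistic processors, `vector_kernel` stands in for the extracted vectorizable loop, and `split` and `run_on_workers` are hypothetical helpers. None of it is taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def vector_kernel(a, b):
    # Stand-in for the extracted vectorizable loop body: c[i] = a[i] + b[i].
    return [x + y for x, y in zip(a, b)]


def split(n, parts):
    # Yield (start, stop) ranges covering 0..n in `parts` near-equal chunks.
    base, extra = divmod(n, parts)
    start = 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        yield start, stop
        start = stop


def run_on_workers(a, b, num_workers=4):
    # Plays the role of the main program: hand one chunk of the long vector
    # to each worker, then reassemble the partial results in order.
    ranges = list(split(len(a), num_workers))
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        chunks = pool.map(
            lambda r: vector_kernel(a[r[0]:r[1]], b[r[0]:r[1]]), ranges
        )
        out = []
        for chunk in chunks:
            out.extend(chunk)
    return out
```

Calling `run_on_workers(list(range(10)), [1] * 10, num_workers=3)` splits the ten elements into chunks of four, three, and three before combining the results.
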

Insuring maximum code motion of accesses to DMA buffers

Patent number: 7870544

Abstract: A “kill” intrinsic that may be used in programs for designating specific data objects as having been “killed” by a preceding action is provided. The concept of a data object being “killed” is that the compiler is informed that no operations (e.g., loads and stores) on that data object, or its aliases, can be moved across the point in the program flow where the data object is designated as having been “killed.” The “kill” intrinsic limits the reordering capability of an optimization scheduler of a compiler with regard to operations performed on “killed” data objects. The “kill” intrinsic may be used with DMA operations. Data objects being DMA'ed from a local store of a processor may be “killed” through use of the “kill” intrinsic prior to submitting the DMA request. Data objects being DMA'ed to the local store of the processor may be “killed” after verifying the transfer completes.

Type: Grant

Filed: April 5, 2006

Date of Patent: January 11, 2011

Assignee: International Business Machines Corporation

Inventors: Daniel A. Brokenshire, John Kevin Patrick O'Brien

Compiler implemented software cache method in which non-aliased explicitly fetched data are excluded

Patent number: 7784037

Abstract: A compiler implemented software cache in which non-aliased explicitly fetched data are excluded is provided. With the mechanisms of the illustrative embodiments, a compiler uses a forward data flow analysis to prove that there is no alias between the cached data and explicitly fetched data. Explicitly fetched data that has no alias in the cached data are excluded from the software cache. Explicitly fetched data that has aliases in the cached data are allowed to be stored in the software cache. In this way, there is no runtime overhead to maintain the correctness of the two copies of data. Moreover, the number of lines of the software cache that must be protected from eviction is decreased. This leads to a decrease in the amount of computation cycles required by the cache miss handler when evicting cache lines during cache miss handling.

Type: Grant

Filed: April 14, 2006

Date of Patent: August 24, 2010

Assignee: International Business Machines Corporation

Inventors: Tong Chen, John Kevin Patrick O'Brien, Kathryn O'Brien, Byoungro So, Zehra N. Sura, Tao Zhang

Performing useful computations while waiting for a line in a system with a software implemented cache

Patent number: 7765360

Abstract: Mechanisms for performing useful computations during a software cache reload operation are provided. With the illustrative embodiments, in order to perform software caching, a compiler takes original source code, and while compiling the source code, inserts explicit cache lookup instructions into appropriate portions of the source code where cacheable variables are referenced. In addition, the compiler inserts a cache miss handler routine that is used to branch execution of the code to a cache miss handler if the cache lookup instructions result in a cache miss. The cache miss handler, prior to performing a wait operation for waiting for the data to be retrieved from the backing store, branches execution to an independent subroutine identified by a compiler. The independent subroutine is executed while the data is being retrieved from the backing store such that useful work is performed.

Type: Grant

Filed: October 1, 2008

Date of Patent: July 27, 2010

Assignee: International Business Machines Corporation

Inventors: John Kevin Patrick O'Brien, Kathryn O'Brien
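
The control flow in the abstract above (explicit lookup, branch to a miss handler, independent work before the wait) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the patented mechanism: a background thread stands in for the transfer from the backing store, and `OverlapSoftwareCache`, `LINE_SIZE`, and the `independent_work` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

LINE_SIZE = 4  # elements per cache line (hypothetical value)


class OverlapSoftwareCache:
    def __init__(self, backing_store, independent_work):
        self.backing = backing_store
        self.lines = {}  # line tag -> list of elements held locally
        self.independent_work = independent_work
        self.pool = ThreadPoolExecutor(max_workers=1)

    def _fetch_line(self, tag):
        # Stand-in for the transfer of one line from the backing store.
        start = tag * LINE_SIZE
        return self.backing[start:start + LINE_SIZE]

    def _miss_handler(self, tag):
        pending = self.pool.submit(self._fetch_line, tag)  # start the reload
        self.independent_work()                            # useful work meanwhile
        self.lines[tag] = pending.result()                 # only now wait

    def load(self, addr):
        tag, offset = divmod(addr, LINE_SIZE)
        if tag not in self.lines:  # the explicit lookup a compiler would insert
            self._miss_handler(tag)
        return self.lines[tag][offset]

    def close(self):
        self.pool.shutdown()
```

In this sketch the independent work runs once per miss and never on a hit, so two loads from the same line trigger it only once.
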

Computer Program Code Size Partitioning System for Multiple Memory Multi-Processing Systems

Publication number: 20090158019

Abstract: The present invention provides for a system for computer program code size partitioning for multiple memory multi-processor systems. At least one system parameter of a computer system comprising one or more disparate processing nodes is identified. Computer program code comprising a program to be run on the computer system is received. A program representation based on received computer program code is generated. At least one single-entry-single-exit (SESE) region is identified based on the whole program representation. At least one SESE region of less than a certain size (store-size-specific) is identified based on identified SESE regions and the at least one system parameter. Each store-size-specific SESE region is grouped into a node-specific subroutine. The non node-specific parts of the computer program code are modified based on the partitioning into node-specific subroutines. The modified computer program code including each node-specific subroutine is compiled based on a specified node characteristic.

Type: Application

Filed: December 17, 2008

Publication date: June 18, 2009

Applicant: International Business Machines Corporation

Inventors: Kathryn M. O'Brien, John Kevin Patrick O'Brien

Computer Program Functional Partitioning System for Heterogeneous Multi-processing Systems

Publication number: 20090119652

Abstract: The present invention provides for a system for computer program functional partitioning for heterogeneous multi-processing systems. At least one system parameter of a computer system comprising one or more disparate processing nodes is identified. Computer program code comprising a program to be run on the computer system is received. A whole program representation is generated based on received computer program code. At least one single-entry-single-exit (SESE) region is identified based on the whole program representation. At least one node-specific SESE region is identified based on identified SESE regions and the at least one system parameter. Each node-specific SESE region is grouped into a node-specific subroutine. Each node-specific subroutine is compiled based on a specified node characteristic. The computer program code is modified based on the node-specific subroutines and the modified computer program code is compiled.

Type: Application

Filed: January 8, 2009

Publication date: May 7, 2009

Applicant: International Business Machines Corporation

Inventors: Kathryn M. O'Brien, John Kevin Patrick O'Brien

Managing position independent code using a software framework

Patent number: 7512699

Abstract: A method for managing position independent code using a software framework is presented. A software framework provides the ability to cache multiple plug-ins which are loaded in a processor's local storage. A processor receives a command or data stream from another processor, which includes information corresponding to a particular plug-in. The processor uses the plug-in identifier to load the plug-in from shared memory into local memory before it is required in order to minimize latency. When the data stream requests the processor to use the plug-in, the processor retrieves a location offset corresponding to the plug-in and applies the plug-in to the data stream. A plug-in manager manages an entry point table that identifies memory locations corresponding to each plug-in and, therefore, plug-ins may be placed anywhere in a processor's local memory.

Type: Grant

Filed: November 12, 2004

Date of Patent: March 31, 2009

Assignee: International Business Machines Corporation

Inventors: Michael Stan Gowen, Barry L Minor, Mark Richard Nutter, John Kevin Patrick O'Brien
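
The entry point table described in the abstract above can be illustrated with a toy model. The names `PluginManager`, `preload`, and `apply` are hypothetical, Python callables stand in for position independent code images, and a list stands in for the processor's local memory; this is only a sketch of the lookup idea, not the framework itself.

```python
class PluginManager:
    def __init__(self, shared_memory):
        self.shared = shared_memory  # plug-in id -> callable (stand-in for a code image)
        self.local_store = []        # stand-in for the processor's local memory
        self.entry_points = {}       # entry point table: plug-in id -> offset

    def preload(self, plugin_id):
        # Load ahead of use; the plug-in lands at whatever offset is free.
        if plugin_id not in self.entry_points:
            self.entry_points[plugin_id] = len(self.local_store)
            self.local_store.append(self.shared[plugin_id])

    def apply(self, plugin_id, data):
        # Look up the location offset, then run the plug-in on the data stream.
        offset = self.entry_points[plugin_id]
        return self.local_store[offset](data)
```

Because callers go through the table, the same plug-in could sit at a different offset on each run without any caller changing.
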

Method for garbage collection in heterogeneous multiprocessor systems

Patent number: 7512745

Abstract: Garbage collection in heterogeneous multiprocessor systems is provided. In some illustrative embodiments, garbage collection operations are distributed across a plurality of the processors in the heterogeneous multiprocessor system. Portions of a global mark queue are assigned to processors of the heterogeneous multiprocessor system along with corresponding chunks of a shared memory. The processors perform garbage collection on their assigned portions of the global mark queue and corresponding chunks of shared memory, marking memory object references as reachable or adding memory object references to a non-local mark stack. The marked memory objects are merged with a global mark stack and memory object references in the non-local mark stack are merged with a “to be traced” portion of the global mark queue for re-checking using a garbage collection operation.

Type: Grant

Filed: April 28, 2006

Date of Patent: March 31, 2009

Assignee: International Business Machines Corporation

Inventors: Michael K. Gschwind, John Kevin Patrick O'Brien, Kathryn M. O'Brien

Performing Useful Computations While Waiting for a Line in a System with a Software Implemented Cache

Publication number: 20090055588

Abstract: Mechanisms for performing useful computations during a software cache reload operation are provided. With the illustrative embodiments, in order to perform software caching, a compiler takes original source code, and while compiling the source code, inserts explicit cache lookup instructions into appropriate portions of the source code where cacheable variables are referenced. In addition, the compiler inserts a cache miss handler routine that is used to branch execution of the code to a cache miss handler if the cache lookup instructions result in a cache miss. The cache miss handler, prior to performing a wait operation for waiting for the data to be retrieved from the backing store, branches execution to an independent subroutine identified by a compiler. The independent subroutine is executed while the data is being retrieved from the backing store such that useful work is performed.

Type: Application

Filed: October 1, 2008

Publication date: February 26, 2009

Applicant: International Business Machines Corporation

Inventors: John Kevin Patrick O'Brien, Kathryn O'Brien

Method to efficiently prefetch and batch compiler-assisted software cache accesses

Patent number: 7493452

Abstract: A method to efficiently pre-fetch and batch compiler-assisted software cache accesses is provided. The method reduces the overhead associated with software cache directory accesses. With the method, the local memory address of the cache line that stores the pre-fetched data is itself cached, such as in a register or well known location in local memory, so that a later data access does not need to perform address translation and software cache operations and can instead access the data directly from the software cache using the cached local memory address. This saves processor cycles that would otherwise be required to perform the address translation a second time when the data is to be used. Moreover, the system and method directly enable software cache accesses to be effectively decoupled from address translation in order to increase the overlap between computation and communication.

Type: Grant

Filed: August 18, 2006

Date of Patent: February 17, 2009

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Kathryn M. O'Brien
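
The idea in the abstract above, keeping the local address produced by a pre-fetch so that the later access skips the directory, might look roughly like this. The sketch assumes that lines are never evicted between pre-fetch and use, which a real software cache would have to guarantee; `PrefetchSoftwareCache`, `translate`, `read_local`, and `sum_with_cached_addresses` are hypothetical names.

```python
LINE_SIZE = 4  # elements per cache line (hypothetical value)


class PrefetchSoftwareCache:
    def __init__(self, backing):
        self.backing = backing
        self.directory = {}    # global line tag -> slot index in self.local
        self.local = []        # local memory holding the cached lines
        self.translations = 0  # counts directory accesses

    def translate(self, addr):
        # Directory lookup (filling the line on a miss); returns a local address.
        self.translations += 1
        tag, offset = divmod(addr, LINE_SIZE)
        if tag not in self.directory:
            start = tag * LINE_SIZE
            self.directory[tag] = len(self.local)
            self.local.append(self.backing[start:start + LINE_SIZE])
        return self.directory[tag], offset

    def read_local(self, local_addr):
        # Direct access through a cached local address: no directory lookup.
        slot, offset = local_addr
        return self.local[slot][offset]


def sum_with_cached_addresses(cache, addrs):
    # Batch the pre-fetches first and keep each local address (as a register would).
    handles = [cache.translate(a) for a in addrs]
    # The later uses go straight to local memory.
    return sum(cache.read_local(h) for h in handles)
```

Each address is translated exactly once here; the use phase adds no further directory accesses.
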

Computer program functional partitioning method for heterogeneous multi-processing systems

Patent number: 7487496

Abstract: The present invention provides for a method for computer program functional partitioning for heterogeneous multi-processing systems. At least one system parameter of a computer system comprising one or more disparate processing nodes is identified. Computer program code comprising a program to be run on the computer system is received. A whole program representation is generated based on received computer program code. At least one single-entry-single-exit (SESE) region is identified based on the whole program representation. At least one node-specific SESE region is identified based on identified SESE regions and the at least one system parameter. Each node-specific SESE region is grouped into a node-specific subroutine. Each node-specific subroutine is compiled based on a specified node characteristic. The computer program code is modified based on the node-specific subroutines and the modified computer program code is compiled.

Type: Grant

Filed: December 2, 2004

Date of Patent: February 3, 2009

Assignee: International Business Machines Corporation

Inventors: Kathryn M. O'Brien, John Kevin Patrick O'Brien

Computer program code size partitioning method for multiple memory multi-processing systems

Patent number: 7478376

Abstract: The present invention provides for a method for computer program code size partitioning for multiple memory multi-processor systems. At least one system parameter of a computer system comprising one or more disparate processing nodes is identified. Computer program code comprising a program to be run on the computer system is received. A program representation based on received computer program code is generated. At least one single-entry-single-exit (SESE) region is identified based on the whole program representation. At least one SESE region of less than a certain size (store-size-specific) is identified based on identified SESE regions and the at least one system parameter. Each store-size-specific SESE region is grouped into a node-specific subroutine. The non node-specific parts of the computer program code are modified based on the partitioning into node-specific subroutines. The modified computer program code including each node-specific subroutine is compiled based on a specified node characteristic.

Type: Grant

Filed: December 2, 2004

Date of Patent: January 13, 2009

Assignee: International Business Machines Corporation

Inventors: Kathryn M. O'Brien, John Kevin Patrick O'Brien

Performing useful computations while waiting for a line in a system with a software implemented cache

Patent number: 7461205

Abstract: Mechanisms for performing useful computations during a software cache reload operation are provided. With the illustrative embodiments, in order to perform software caching, a compiler takes original source code, and while compiling the source code, inserts explicit cache lookup instructions into appropriate portions of the source code where cacheable variables are referenced. In addition, the compiler inserts a cache miss handler routine that is used to branch execution of the code to a cache miss handler if the cache lookup instructions result in a cache miss. The cache miss handler, prior to performing a wait operation for waiting for the data to be retrieved from the backing store, branches execution to an independent subroutine identified by a compiler. The independent subroutine is executed while the data is being retrieved from the backing store such that useful work is performed.

Type: Grant

Filed: June 1, 2006

Date of Patent: December 2, 2008

Assignee: International Business Machines Corporation

Inventors: John Kevin Patrick O'Brien, Kathryn O'Brien

System and Method for Speculative Thread Assist in a Heterogeneous Processing Environment

Publication number: 20080282064

Abstract: A system and method for speculative assistance to a thread in a heterogeneous processing environment is provided. A first set of instructions that is suitable for speculative execution is identified in a source code representation (e.g., a source code file). The identified set of instructions is analyzed to determine its processing requirements. Based on the analysis, a processor type is identified that will be used to execute the identified first set of instructions. The processor type is selected from more than one processor type included in the heterogeneous processing environment. The heterogeneous processing environment includes more than one heterogeneous processing core in a single silicon substrate. The various processing cores can utilize different instruction set architectures (ISAs). An object code representation is then generated for the identified first set of instructions, with the object code representation being adapted to execute on the determined type of processor.

Type: Application

Filed: May 7, 2007

Publication date: November 13, 2008

Inventors: Michael Norman Day, Michael Karl Gschwind, John Kevin Patrick O'Brien, Kathryn O'Brien

Apparatus and Method for Partitioning Programs Between a General Purpose Core and One or More Accelerators

Publication number: 20080256521

Abstract: An apparatus and method for partitioning programs between a general purpose core and one or more accelerators are provided. With the apparatus and method, a compiler front end is provided for converting a program source code in a corresponding high level programming language into an intermediate code representation. This intermediate code representation is provided to an interprocedural optimizer which determines which core processor or accelerator each portion of the program should execute on and partitions the program into sub-programs based on this set of decisions. The interprocedural optimizer may further add instructions to the partitions to coordinate and synchronize the sub-programs as required. Each sub-program is compiled on an appropriate compiler backend for the instruction set architecture of the particular core processor or accelerator selected to execute the sub-program. The compiled sub-programs are then linked to thereby generate an executable program.

Type: Application

Filed: May 27, 2008

Publication date: October 16, 2008

Applicant: International Business Machines Corporation

Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel A. Prener

Compiler Implemented Software Cache Apparatus and Method in which Non-Aliased Explicitly Fetched Data are Excluded

Publication number: 20080229291

Abstract: A compiler implemented software cache apparatus and method in which non-aliased explicitly fetched data are excluded are provided. With the mechanisms of the illustrative embodiments, a compiler uses a forward data flow analysis to prove that there is no alias between the cached data and explicitly fetched data. Explicitly fetched data that has no alias in the cached data are excluded from the software cache. Explicitly fetched data that has aliases in the cached data are allowed to be stored in the software cache. In this way, there is no runtime overhead to maintain the correctness of the two copies of data. Moreover, the number of lines of the software cache that must be protected from eviction is decreased. This leads to a decrease in the amount of computation cycles required by the cache miss handler when evicting cache lines during cache miss handling.

Type: Application

Filed: May 28, 2008

Publication date: September 18, 2008

Applicant: International Business Machines Corporation

Inventors: Tong Chen, John Kevin Patrick O'Brien, Kathryn O'Brien, Byoungro So, Zehra N. Sura, Tao Zhang

Compiler Method for Employing Multiple Autonomous Synergistic Processors to Simultaneously Operate on Longer Vectors of Data

Publication number: 20080229298

Abstract: A compiler includes a mechanism for employing multiple synergistic processors to execute long vectors. The compiler receives a single source program. The compiler identifies vectorizable loop code in the single source program and extracts the vectorizable loop code from the single source program. The compiler then compiles the extracted vectorizable loop code for a plurality of synergistic processors. The compiler also compiles a remainder of the single source program for a principal processor to form an executable main program such that the executable main program controls operation of the executable vectorizable loop code on the plurality of synergistic processors.

Type: Application

Filed: March 15, 2007

Publication date: September 18, 2008

Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel Arthur Prener

Ensuring Maximum Code Motion of Accesses to DMA Buffers

Publication number: 20080229295

Abstract: A “kill” intrinsic that may be used in programs for designating specific data objects as having been “killed” by a preceding action is provided. The concept of a data object being “killed” is that the compiler is informed that no operations (e.g., loads and stores) on that data object, or its aliases, can be moved across the point in the program flow where the data object is designated as having been “killed.” The “kill” intrinsic limits the reordering capability of an optimization scheduler of a compiler with regard to operations performed on “killed” data objects. The “kill” intrinsic may be used with DMA operations. Data objects being DMA'ed from a local store of a processor may be “killed” through use of the “kill” intrinsic prior to submitting the DMA request. Data objects being DMA'ed to the local store of the processor may be “killed” after verifying the transfer completes.

Type: Application

Filed: May 29, 2008

Publication date: September 18, 2008

Applicant: International Business Machines Corporation

Inventors: Daniel A. Brokenshire, John Kevin Patrick O'Brien

Apparatus and Method for Optimizing Scalar Code Executed on a SIMD Engine by Alignment of SIMD Slots

Publication number: 20080222391

Abstract: An apparatus and method for optimizing scalar code executed on a single instruction multiple data (SIMD) engine is provided that aligns the slots of SIMD registers. With the apparatus and method, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.

Type: Application

Filed: May 27, 2008

Publication date: September 11, 2008

Applicant: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien
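
The bottom-up alignment propagation in the abstract above can be sketched on a tiny expression tree. The node classes `Leaf`, `Op`, and `Shift` are hypothetical, and the choice to always shift the right operand to match the left is an arbitrary policy assumed for illustration; the abstract only says that one operand is shifted.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    name: str
    align: int  # slot index of the scalar within its SIMD register


@dataclass
class Op:
    op: str
    left: object
    right: object


@dataclass
class Shift:
    operand: object
    from_align: int
    to_align: int


def propagate(node):
    """Return (rewritten_node, alignment), inserting Shift nodes on mismatch."""
    if isinstance(node, Leaf):
        return node, node.align
    left, left_align = propagate(node.left)
    right, right_align = propagate(node.right)
    if left_align != right_align:
        # Assumed policy: move the right operand into the left operand's slot.
        right = Shift(right, right_align, left_align)
    # Equal alignments yield the shared value; otherwise the left one wins.
    return Op(node.op, left, right), left_align


def count_shifts(node):
    # Count the Shift nodes that code generation would have to emit.
    if isinstance(node, Shift):
        return 1 + count_shifts(node.operand)
    if isinstance(node, Op):
        return count_shifts(node.left) + count_shifts(node.right)
    return 0
```

For `(a * b) + c` with `a` and `b` in slot 0 and `c` in slot 2, the multiply needs no shift and a single shift is inserted on `c` before the add.
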

Efficient Data Reorganization to Satisfy Data Alignment Constraints

Publication number: 20080201699

Abstract: Vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores is presented. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization operations generated by the alignment handling.

Type: Application

Filed: April 23, 2008

Publication date: August 21, 2008

Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu