Patents by Inventor Roch Archambault

Roch Archambault has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8490065
    Abstract: The present invention provides a computer implemented method, apparatus, and computer usable program code for compiling instructions to manage a cache system. Loop constructs are analyzed to identify data usage characteristics for cache and prefetching conditions in instructions to form identified prefetch conditions. A set of control instructions are inserted into the instructions based on the data usage characteristics and the identified prefetch conditions to form multiple modified instructions. The set of multiple modified instructions are compiled to generate code for execution to form compiled instructions. The set of control instructions in the compiled instructions form a cache management policy to control movement of data in a memory system during execution of the compiled instructions.
    Type: Grant
    Filed: October 13, 2005
    Date of Patent: July 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: Roch Archambault, Yaoqing Gao, Francis Patrick O'Connell, Robert Brett Tremaine, Michael Edward Wazlowski, Steven Wayne White, Lixin Zhang
  • Patent number: 7380086
    Abstract: An improved scalability runtime system for a global address space language running on a distributed or shared memory machine uses a directory of shared variables having a data structure for tracking shared variable information that is shared by a plurality of program threads. Allocation and de-allocation routines are used to allocate and de-allocate shared variable entries in the directory of shared variables. Different routines can be used to access different types of shared data. A control structure is used to control access to the shared data such that all threads can access the data at any time. Since all threads see the same objects, synchronization issues are eliminated. In addition, the improved efficiency of the data sharing allows the number of program threads to be vastly increased.
    Type: Grant
    Filed: December 12, 2003
    Date of Patent: May 27, 2008
    Assignee: International Business Machines Corporation
    Inventors: Roch Archambault, Anthony Simon Bolmarcich, G. Calin Cascaval, Siddhartha Chatterjee, Maria Eleftheriou, Raymond Mak
  • Publication number: 20080028381
    Abstract: An embodiment of the present invention provides an optimizer for optimizing source code to generate optimized source code having instructions for instructing a central processing unit (CPU) to iteratively compute values for a primary recurrence element. A computer programmed loop for computing the primary recurrence element and subsequent recurrence elements is an example of a case involving iteratively computing the primary recurrence element. The CPU is operatively coupled to fast operating memory (FOM) and operatively coupled to slow operating memory (SOM). SOM stores the generated optimized source code. The optimized source code includes instructions for instructing said CPU to store a computed value of the primary recurrence element in a storage location of FOM. The instructions also includes instructions to consign the computed value of the primary recurrence element from the storage location to another storage location of the FOM.
    Type: Application
    Filed: October 10, 2007
    Publication date: January 31, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: ROCH ARCHAMBAULT, ROBERT BLAINEY, CHARLES HALL, YINGWEI ZHANG
  • Publication number: 20070240137
    Abstract: A method of compiling source code. The method includes converting pointer-based access in the source code to array-based access in the source code in a first pass compilation of the source code. Information is collected for objects in the source code during the first pass compilation. Candidate objects in the source code are selected based on the collected information to form selected candidate objects. Global stride variables are created for the selected candidate objects. Memory allocation operations are updated for the selected candidate objects in a second pass compilation of the source code. Multiple-level pointer indirect references are replaced in the source code with multi-dimensional array indexed references for the selected candidate objects in the second pass compilation of the source code.
    Type: Application
    Filed: April 11, 2006
    Publication date: October 11, 2007
    Inventors: Roch Archambault, Shimin Cui, Yaoqing Gao, Raul Silvera
  • Publication number: 20070088937
    Abstract: Under the present invention, a branch target address corresponding to a target instruction to be pre-fetched is predicted based on two values. The first value is a “predictor value” that is known for the branch target address. The second value is the address of the branch instruction from which the target instruction is branched to within the program code. Once these two values are provided, they can be processed (e.g., hashed) to yield an index value, which is used to obtain a predicted branch target address from a cache. This technique is generally implemented for branch instructions such as switch statements or polymorphic calls. In the case of the former, the predictor value is a selector operand, while in the case of the latter the predictor value is a class object address (in JAVA) or a virtual function table address (in C++).
    Type: Application
    Filed: October 13, 2005
    Publication date: April 19, 2007
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Roch Archambault, R. Hay, James McInnes, Kevin Stoodley
  • Publication number: 20070089105
    Abstract: A computer implemented method, system and computer program product for accessing threadprivate memory for threadprivate variables in a parallel program during program compilation. A computer implemented method for accessing threadprivate variables in a parallel program during program compilation includes aggregating threadprivate variables in the program, replacing references of the threadprivate variables by indirect references, moving address load operations of the threadprivate variables, and replacing the address load operations of the threadprivate variables by calls to runtime routines to access the threadprivate memory. The invention enables a compiler to minimize the runtime routines call times to access the threadprivate variables, thus improving program performance.
    Type: Application
    Filed: October 13, 2005
    Publication date: April 19, 2007
    Inventors: Roch Archambault, Shimin Cui
  • Publication number: 20070088915
    Abstract: The present invention provides a computer implemented method, apparatus, and computer usable program code for compiling instructions to manage a cache system. Loop constructs are analyzed to identify data usage characteristics for cache and prefetching conditions in instructions to form identified prefetch conditions. A set of control instructions are inserted into the instructions based on the data usage characteristics and the identified prefetch conditions to form multiple modified instructions. The set of multiple modified instructions are compiled to generate code for execution to form compiled instructions. The set of control instructions in the compiled instructions form a cache management policy to control movement of data in a memory system during execution of the compiled instructions.
    Type: Application
    Filed: October 13, 2005
    Publication date: April 19, 2007
    Inventors: Roch Archambault, Yaoqing Gao, Francis O' Connell, Robert Tremaine, Michael Wazlowski, Steven White, Lixin Zhang
  • Publication number: 20060048111
    Abstract: A method, apparatus, and computer instructions for processing instructions. A data dependency graph is built. The data dependency graph is analyzed for recurrences, and unpipelined instructions that lie outside of the recurrences are expanded.
    Type: Application
    Filed: August 30, 2004
    Publication date: March 2, 2006
    Applicant: International Business Machines Corporation
    Inventors: Roch Archambault, Robert Enenkel, Robert Hay, Allan Martin, James McInnes, Ronald McIntosh, Mark Mendell
  • Publication number: 20060048103
    Abstract: Inter-procedural strength reduction is provided by a mechanism of the present invention to improve data cache performance. During a forward pass, the present invention collects information of global variables and analyzes the usage pattern of global objects to select candidate computations for optimization. During a backward pass, the present invention remaps global objects into smaller size new global objects and generates more cache efficient code by replacing candidate computations with indirect or indexed reference of smaller global objects and inserting store operations to the new global objects for each computation that references the candidate global objects.
    Type: Application
    Filed: August 30, 2004
    Publication date: March 2, 2006
    Applicant: International Business Machines Corporation
    Inventors: Roch Archambault, Shimin Cui, Yaoqing Gao, Raul Silvera
  • Publication number: 20060048118
    Abstract: A method, apparatus, and computer instructions for facilitating optimization of code. An artificial statement is placed into the code. The artificial statement encodes information. The code is optimized with a compiler. The compiler performs an action based on the artificial statement in the code.
    Type: Application
    Filed: August 30, 2004
    Publication date: March 2, 2006
    Applicant: International Business Machines Corporation
    Inventors: Roch Archambault, Peng Zhao
  • Publication number: 20060048117
    Abstract: Inter-procedural strength reduction is provided by a mechanism of the present invention to optimize software program. During a forward pass, the present invention collects information of global variables and analyzes the information to select candidate computations for optimization. During a backward pass, the present invention replaces costly computations with less costly or weaker computations using pre-computed values and inserts store operations of new global variables to pre-compute the costly computations at definition points of the global variables used in the costly computations.
    Type: Application
    Filed: August 30, 2004
    Publication date: March 2, 2006
    Applicant: International Business Machines Corporation
    Inventors: Roch Archambault, Shimin Cui, Raul Silvera
  • Publication number: 20050268039
    Abstract: A method and system for reducing or avoiding store misses with a data cache block zero (DCBZ) instruction in cooperation with the underlying hardware load stream prefetching support for helping to increase effective aggregate bandwith. The method identifies and classifies unique streams in a loop based on dependency and reuse analysis, and performs loop transformations, such as node splitting, loop distribution or stream unrolling to get the proper number of streams. Static prediction and run-time profile information are used to guide loop and stream selection. Compile-time loop cost analysis and run-time check code and versioning are used to determine the number of cache lines ahead of each reference for data cache line zeroing and to tolerate required data alignment relative to data cache lines.
    Type: Application
    Filed: May 25, 2004
    Publication date: December 1, 2005
    Applicant: International Business Machines Corporation
    Inventors: Roch Archambault, Robert Blainey, Yaoging Gao, Randall Heisch, Steven White
  • Publication number: 20050246700
    Abstract: A compiling program with cache utilization optimizations employs an inter-procedural global analysis of the data access patterns of compile units to be processed. The global analysis determines sufficient information to allow intelligent application of optimization techniques to be employed to enhance the operation and utilization of the available cache systems on target hardware.
    Type: Application
    Filed: April 30, 2004
    Publication date: November 3, 2005
    Applicant: International Business Machines Corporation
    Inventors: Roch Archambault, Robert Blainey, Yaoqing Gao
  • Publication number: 20050149903
    Abstract: An improved scalability runtime system for a global address space language running on a distributed or shared memory machine uses a directory of shared variables having a data structure for tracking shared variable information that is shared by a plurality of program threads. Allocation and de-allocation routines are used to allocate and de-allocate shared variable entries in the directory of shared variables. Different routines can be used to access different types of shared data. A control structure is used to control access to the shared data such that all threads can access the data at any time. Since all threads see the same objects, synchronization issues are eliminated. In addition, the improved efficiency of the data sharing allows the number of program threads to be vastly increased.
    Type: Application
    Filed: December 12, 2003
    Publication date: July 7, 2005
    Inventors: Roch Archambault, Anthony Bolmarcich, G. Calin Cascaval, Siddhartha Chatterjee, Maria Eleftheriou, Raymond Mak
  • Publication number: 20050138613
    Abstract: A method and system of modifying instructions forming a loop is provided. A method of modifying instructions forming a loop includes modifying instructions forming a loop including: determining static and dynamic characteristics for the instructions; selecting a modification factor for the instructions based on a number of separate equivalent sections forming a cache in a processor which is processing the instructions; and modifying the instructions to interleave the instructions in the loop according to the modification factor and the static and dynamic characteristics when the instructions satisfy a modification criteria based on the static and dynamic characteristics.
    Type: Application
    Filed: May 27, 2004
    Publication date: June 23, 2005
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Roch Archambault, Robert Blainey, Yaoqing Gao, John McCalpin, Francis O'Connell, Pascal Vezolle, Steven White
  • Publication number: 20050080981
    Abstract: A data processing system is adapted to execute at least one workshare construct in a parallel region. The data processing system uses at least one thread for executing a corresponding subsection of the workshare construct and provides control blocks for managing corresponding workshare constructs in the parallel region. A method of managing the control blocks comprises: adding an array of control blocks to a control block queue; assigning control blocks in the initialized array to corresponding workshare constructs in the parallel region until a barrier is reached; and waiting at the barrier for all threads in the parallel region to complete their corresponding subsections and then resetting the control block to the beginning of the control block queue. Also provided are a computer program product and a data processing system for implementing the method.
    Type: Application
    Filed: May 13, 2004
    Publication date: April 14, 2005
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Roch Archambault, Raul Silvera, Guansong Zhang