Patents by Inventor Alexandre E. Eichenberger

Alexandre E. Eichenberger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8266587
    Abstract: Disclosure for using SLP in processing a plurality of statements, wherein the statements are associated with an array having a number of array positions, and each statement includes one or more expressions. Expressions are gathered for each of the statements into a structure comprising a single merge stream furnished with a location for each expression. The location for a given expression is associated with one of the array positions. A plurality of expressions are selectively identified and SLP packing operations are applied to the identified expressions to merge into one or more isomorphic sub-streams. Expressions of the isomorphic sub-streams and other expressions of the single merge stream are combined into a number of input vectors that are substantially equal in length to one another. A location vector is generated that contains the respective locations for all of the expressions in the single merge stream.
    Type: Grant
    Filed: December 26, 2007
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
  • Patent number: 8255884
    Abstract: Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.
    Type: Grant
    Filed: June 6, 2008
    Date of Patent: August 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels
  • Publication number: 20120210073
    Abstract: An apparatus, method and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequent updating a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.
    Type: Application
    Filed: February 11, 2011
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Alan Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
  • Patent number: 8245208
    Abstract: Generating loop code to execute on Single-Instruction Multiple-Datapath (SIMD) architectures, where the loop operates on datatypes having different lengths, is disclosed. Further, a preferred embodiment of the present invention includes a novel technique to efficiently realign or shift arbitrary streams to an arbitrary offset, regardless whether the alignments or offsets are known at the compile time or not. This technique enables the application of advanced alignment optimizations to runtime alignment. Length conversion operations, for packing and unpacking data values, are included in the alignment handling framework. These operations are formally defined in terms of standard SIMD instructions that are readily available on various SIMD platforms. This allows sequential loop code operating on datatypes of disparate length to be transformed (“simdized”) into optimized SIMD code through a fully automated process.
    Type: Grant
    Filed: December 4, 2008
    Date of Patent: August 14, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
  • Publication number: 20120204189
    Abstract: A runtime dependence-aware scheduling of dependent iterations mechanism is provided. Computation is performed for one or more iterations of computer executable code by a main thread. Dependence information is determined for a plurality of memory accesses within the computer executable code using modified executable code using a set of dependence threads. Using the dependence information, a determination is made as to whether a subset of a set of uncompleted iterations in the plurality of iterations is capable of being executed ahead-of-time by the one or more available threads in the data processing system. If the subset of the set of uncompleted iterations in the plurality of iterations is capable of being executed ahead-of-time, the main thread is signaled to skip the subset of the set of uncompleted iterations and the set of assist threads is signaled to execute the subset of the set of uncompleted iterations.
    Type: Application
    Filed: April 10, 2012
    Publication date: August 9, 2012
    Applicant: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kathryn M. O'Brien, Xiaotong Zhuang
  • Publication number: 20120198425
    Abstract: A compiler of a single instruction multiple data (SIMD) information handling system (IHS) identifies “if-then-else” statements that offer opportunity for conditional branch conversion. The compiler converts those “if-then-else” statements into “conditional branch and prepare” statements as well as “branch return” statements. The compiler compiles source code file information containing “if-then-else” statement opportunities into compiled code, namely an executable program. The SIMD IHS employs a processor or processors to execute the executable program. During execution, the processor generates and updates SIMD lane mask information to track and manage the conditional branch loops of the executing program. The processor saves branch addresses and employs SIMD lane masks to identify conditional branch loops with different branch conditions than previous conditional branch loops. The processor may reduce SIMD IHS processing time during processing of compiled code of the original “if-then-else” statements.
    Type: Application
    Filed: January 28, 2011
    Publication date: August 2, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Brian Flachs, Dorit Nuzman, Ira Rosen, Ulrich Weigand, Ayal Zaks
  • Publication number: 20120191953
    Abstract: Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor, Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.
    Type: Application
    Filed: March 30, 2012
    Publication date: July 26, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
  • Publication number: 20120192167
    Abstract: Mechanisms for extracting data dependencies during runtime are provided. The mechanisms execute a portion of code having a loop and generate, for the loop, a first parallel execution group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The mechanisms further execute the first parallel execution group and determining, for each iteration in the subset of iterations, whether the iteration has a data dependence. Moreover, the mechanisms commit store data to system memory only for stores performed by iterations in the subset of iterations for which no data dependence is determined. Store data of stores performed by iterations in the subset of iterations for which a data dependence is determined is not committed to the system memory.
    Type: Application
    Filed: March 30, 2012
    Publication date: July 26, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
  • Publication number: 20120180031
    Abstract: Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.
    Type: Application
    Filed: March 26, 2012
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
  • Patent number: 8214831
    Abstract: A runtime dependence-aware scheduling of dependent iterations mechanism is provided. Computation is performed for one or more iterations of computer executable code by a main thread. Dependence information is determined for a plurality of memory accesses within the computer executable code using modified executable code using a set of dependence threads. Using the dependence information, a determination is made as to whether a subset of a set of uncompleted iterations in the plurality of iterations is capable of being executed ahead-of-time by the one or more available threads in the data processing system. If the subset of the set of uncompleted iterations in the plurality of iterations is capable of being executed ahead-of-time, the main thread is signaled to skip the subset of the set of uncompleted iterations and the set of assist threads is signaled to execute the subset of the set of uncompleted iterations.
    Type: Grant
    Filed: May 5, 2009
    Date of Patent: July 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kathryn M. O'Brien, Xiaotong Zhuang
  • Patent number: 8196124
    Abstract: Loop code is generated to execute on Single-Instruction Multiple-Datapath (SIMD) architectures, where the loop operates on datatypes having different lengths. Further, a preferred embodiment of the present invention includes a novel technique to efficiently realign or shift arbitrary streams to an arbitrary offset, regardless whether the alignments or offsets are known at the compile time or not. This technique enables the application of advanced alignment optimizations to runtime alignment. This allows sequential loop code operating on datatypes of disparate length to be transformed (“simdized”) into optimized SIMD code through a fully automated process.
    Type: Grant
    Filed: August 22, 2008
    Date of Patent: June 5, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
  • Patent number: 8171464
    Abstract: An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.
    Type: Grant
    Filed: May 16, 2008
    Date of Patent: May 1, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
  • Patent number: 8156310
    Abstract: One embodiment of the present method and apparatus for data stream alignment support includes retrieving a first input from a first register file, retrieving a second input from a second register file, the second register file being dedicated to a stream shift unit and performing the stream shift instruction in accordance with the first input, the second input and a third input.
    Type: Grant
    Filed: September 11, 2006
    Date of Patent: April 10, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Michael Karl Gschwind, John-David Wellman, Peng Wu
  • Patent number: 8146067
    Abstract: Vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores is presented. In the framework presented herein, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirement of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residue iteration counts, and multiple statements with arbitrary alignment combinations. Beyond generating a valid simdization, a preferred embodiment further improves the quality of the generated codes. Four stream-shift placement policies are disclosed, which minimize the number of data reorganization generated by the alignment handling.
    Type: Grant
    Filed: April 23, 2008
    Date of Patent: March 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien, Peng Wu
  • Patent number: 8136105
    Abstract: A computer program product is provided for extracting SIMD parallelism. The computer program product includes instructions for providing a stream of input code comprising basic blocks; identifying pairs of statements that are semi-isomorphic with respect to each other within a basic block; iteratively combining into packs, pairs of statements that are semi-isomorphic with respect to each other, and combining packs into combined packs; collecting packs whose statements can be scheduled together for processing; and generating SIMD instructions for each pack to provide for extracting the SIMD parallelism..
    Type: Grant
    Filed: September 29, 2006
    Date of Patent: March 13, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao
  • Publication number: 20120060015
    Abstract: Mechanisms for performing a scattered load operation are provided. With these mechanisms, an extended address is received in a cache memory of a processor. The extended address has a plurality of data element address portions that specify a plurality of data elements to be accessed using the single extended address. Each of the plurality of data element address portions is provided to corresponding data element selector logic units of the cache memory. Each data element selector logic unit in the cache memory selects a corresponding data element from a cache line buffer based on a corresponding data element address portion provided to the data element selector logic unit. Each data element selector logic unit outputs the corresponding data element for use by the processor.
    Type: Application
    Filed: September 7, 2010
    Publication date: March 8, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, Valentina Salapura
  • Publication number: 20120060016
    Abstract: Mechanisms for performing a scattered load operation are provided. With these mechanisms, a gather instruction is receive in a logic unit of a processor, the gather instruction specifying a plurality of addresses in a memory from which data is to be loaded into a target vector register of the processor. A plurality of separate load instructions for loading the data from the plurality of addresses in the memory are automatically generated within the logic unit. The plurality of separate load instructions are sent, from the logic unit, to one or more load/store units of the processor. The data corresponding to the plurality of addresses is gathered in a buffer of the processor. The logic unit then writes data stored in the buffer to the target vector register.
    Type: Application
    Filed: September 7, 2010
    Publication date: March 8, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, Valentina Salapura
  • Publication number: 20120011348
    Abstract: Mechanisms for performing a matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A pair-wise load and splat operation is performed to load a pair of scalar values of a second vector operand and replicate the pair of scalar values within a second target vector register. An operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored. This operation may be repeated for a second pair of scalar values of the second vector operand.
    Type: Application
    Filed: July 12, 2010
    Publication date: January 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Michael K. Gschwind, John A. Gunnels, Valentina Salapura
  • Patent number: 8087011
    Abstract: Mechanisms for domain stretching for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.
    Type: Grant
    Filed: September 26, 2007
    Date of Patent: December 27, 2011
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John K. P. O'Brien, Kathryn M. O'Brien, Nicolas T. Vasilache
  • Patent number: 8087010
    Abstract: Mechanisms for selective code generation optimization for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.
    Type: Grant
    Filed: September 26, 2007
    Date of Patent: December 27, 2011
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John K. P. O'Brien, Kathryn M. O'Brien, Nicolas T. Vasilache