Including Loop Patents (Class 717/160)

Including scheduling instructions (Class 717/161)

PREPARING NAVIGATION STRUCTURE FOR AN AUDIOVISUAL PRODUCT

Publication number: 20110161923

Abstract: The system includes a command set defining a plurality of navigation commands for an audiovisual reproduction apparatus and a human-oriented scripting program for automatically authoring a navigation structure for use in a stand alone audiovisual product playable in the audiovisual reproduction apparatus. The scripting program includes an iterative loop with a variable adjusted according to the iterations of the loop. The scripting program is operable to automatically, for each iteration of the loop: select from the plurality of navigation commands a navigation command defined according to the variable as adjusted for each iteration of the loop; and add the navigation command to an intermediate representation of the navigation structure. An associated method is also provided.

Type: Application

Filed: April 19, 2005

Publication date: June 30, 2011

Applicant: ZOOtech Limited

Inventor: Stuart Green
Compiler method for employing multiple autonomous synergistic processors to simultaneously operate on longer vectors of data

Patent number: 7962906

Abstract: A compiler includes a mechanism for employing multiple synergistic processors to execute long vectors. The compiler receives a single source program. The compiler identifies vectorizable loop code in the single source program and extracts the vectorizable loop code from the single source program. The compiler then compiles the extracted vectorizable loop code for a plurality of synergistic processors. The compiler also compiles a remainder of the single source program for a principal processor to form an executable main program such that the executable main program controls operation of the executable vectorizable loop code on the plurality of synergistic processors.

Type: Grant

Filed: March 15, 2007

Date of Patent: June 14, 2011

Assignee: International Business Machines Corporation

Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel Arthur Prener
Reducing number of exception checks

Patent number: 7937695

Abstract: Based on operations within an uncounted loop of source code, one or more calculations are generated for determining, at runtime, an expected number of iterations through which the uncounted loop can iterate before encountering an exception corresponding to at least one target exception check. A copy of the uncounted loop omitting each target exception check is generated. The uncounted loop, the copy of the uncounted loop, and the one or more calculations are arranged in compiled code so that at runtime program flow enters the copy of the uncounted loop. If a maximum number of iterations of the copy of the uncounted loop is reached, program flow proceeds from the copy of the uncounted loop to the uncounted loop. The maximum number of iterations is no more than the smallest member of a set consisting of the expected number of iterations for each target exception check.
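
The arrangement described in this abstract can be illustrated with a minimal sketch. It assumes a bounds check as the target exception check and a positive stride; the names `checked_loop` and `versioned_loop` are hypothetical and are not taken from the patent.

```python
def checked_loop(a, i, step, total=0):
    """Uncounted loop: stops at the first zero element, checking bounds each time."""
    while True:
        if i < 0 or i >= len(a):      # the target exception check
            raise IndexError(i)
        if a[i] == 0:
            return total
        total += a[i]
        i += step


def versioned_loop(a, start, step):
    """Same behaviour, but entered through a copy of the loop without the check.

    Assumes step > 0 (an assumption of this sketch).
    """
    # Runtime calculation: how many iterations cannot fail the bounds check.
    safe = 0
    if 0 <= start < len(a):
        safe = (len(a) - start + step - 1) // step   # ceil((len - start) / step)

    total, i = 0, start
    # Copy of the loop with the check omitted, limited to `safe` iterations.
    for _ in range(safe):
        if a[i] == 0:
            return total
        total += a[i]
        i += step
    # Limit reached: control proceeds to the original, fully checked loop.
    return checked_loop(a, i, step, total)
```

With several target checks, the iteration limit would be the smallest of the per-check estimates, as the abstract states.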

Type: Grant

Filed: April 27, 2007

Date of Patent: May 3, 2011

Assignee: International Business Machines Corporation

Inventor: Mark Graham Stoodley
Speculative computation lock coarsening through the use of localized lock reservation

Patent number: 7908256

Abstract: A computer-implementable method, system and computer-usable medium. One or more objects among a plurality of objects can be processed utilizing a data-processing apparatus/system. One or more lock reservations can be applied among a group of lock reservations over multiple sequential lock operations with respect to a particular object. Thereafter, the lock reservation can be cancelled with respect to the last monitor exit operation in order to eliminate lock operations where traditional lock coarsening cannot be applied.

Type: Grant

Filed: November 30, 2007

Date of Patent: March 15, 2011

Assignee: International Business Machines Corporation

Inventors: Nikola Grcevski, Peter Burka
PROACTIVE LOOP FUSION OF NON-ADJACENT LOOPS WITH INTERVENING CONTROL FLOW INSTRUCTIONS

Publication number: 20110047534

Abstract: A system and method for optimization of code with non-adjacent loops. A compiler builds a node tree, which is not a control flow graph, that represents parent-child relationships of nodes of a computer program. Each node represents a control flow statement or a straight-line block of statements of the computer program. If a non-adjacent loop pair of nodes satisfy predetermined conditions, the compiler may perform legal code transformations on the computer program and corresponding node transformations on the node tree. These transformations may make adjacent this pair of loop nodes. The compiler may be configured to perform legal code transformations, such as head and tail duplication, code motion, and if-merging, in order to make adjacent these two loop nodes. Then loop fusion may be performed on this loop pair in order to increase instruction level parallelism (ILP) within an optimized version of the original source code.
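
As a rough illustration only, the sketch below shows two loops separated by a branch, and a fused version obtained after moving the branch above the first loop (plain code motion). The function names are hypothetical, and the node-tree machinery of the abstract is not modelled.

```python
def unfused(a, flag):
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):            # first loop
        b[i] = a[i] * 2
    if flag:                      # intervening control flow
        scale = 3
    else:
        scale = 5
    for i in range(n):            # second loop, not adjacent to the first
        c[i] = b[i] + scale
    return b, c


def fused(a, flag):
    n = len(a)
    b = [0] * n
    c = [0] * n
    # Code motion: the branch does not depend on the first loop,
    # so it can be hoisted, which makes the two loops adjacent.
    scale = 3 if flag else 5
    for i in range(n):            # fused loop body
        b[i] = a[i] * 2
        c[i] = b[i] + scale
    return b, c
```

Fusion is legal here because iteration i of the second loop reads only the value that iteration i of the first loop wrote.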

Type: Application

Filed: August 22, 2009

Publication date: February 24, 2011

Inventors: Mei Ye, Dinesh Suresh, Dz-ching Ju, Michael Lai
Array value substitution and propagation with loop transformations through static analysis

Patent number: 7890942

Abstract: A method and system for substituting array values (i.e., expressions) in a program at compile time. An initialization of an array is identified in a loop. The initialization is an assignment of an expression (i.e., a constant or a function of an induction variable) to elements of the array. The expression is stored in a table that associates the expression with the array and indices of the array. An assignment statement is detected that is to assign at least one element of the initialized elements. The expression is retrieved from the table based on the expression being associated with the array and corresponding indices. The expression is substituted for the at least one element so that the expression is to be assigned by the assignment statement. The process of substituting array values is extended to interprocedural analysis.
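
A before-and-after sketch of the substitution, under the assumption that the array is not modified between its initialization and its use and is not needed afterwards. The function names and the expression `3 * i + 1` are illustrative, not from the patent.

```python
def original(n):
    a = [0] * n
    for i in range(n):
        a[i] = 3 * i + 1          # initialization: a function of the induction variable
    b = [0] * n
    for j in range(n):
        b[j] = a[j] * 2           # assignment that reads an initialized element
    return b


def substituted(n):
    # A compiler would have recorded "a[i] -> 3 * i + 1" in its table and
    # replaced the read of a[j] with the expression evaluated at index j.
    b = [0] * n
    for j in range(n):
        b[j] = (3 * j + 1) * 2
    return b
```

Once every read of the array is substituted, the initialization loop and the array itself become dead code in this sketch.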

Type: Grant

Filed: August 15, 2006

Date of Patent: February 15, 2011

Assignee: International Business Machines Corporation

Inventor: Rohini Nair
Code optimization based on loop structures

Patent number: 7890943

Abstract: Instructions that have no dependence constraint between them and other instructions in a loop of a critical section may be moved out of the critical section so that the size of the critical section may be reduced. A flow graph of a program including the critical section may be generated, which includes loops. The flow graph may be transformed based on which any unnecessary instructions in loops may be moved out of the critical section. Subsequently, the original flow graph of the critical section may be recovered from the transformed flow graph.

Type: Grant

Filed: March 30, 2007

Date of Patent: February 15, 2011

Assignee: Intel Corporation

Inventors: Xiaofeng Guo, Jinquan Dai, Long Li
Compilation and runtime information generation and optimization

Patent number: 7890940

Abstract: To collect frequencies with which processes of a program are executed at high speed. A compiler apparatus for optimizing a program based on frequencies with which each process is executed has a loop process detection portion for detecting a repeatedly executed loop process of the program, a loop process frequency collection portion for collecting loop process frequencies with which the loop process is executed in the program, an in-loop process frequency collection portion for collecting in-loop process frequencies with which, as against times of execution of loop process, each of a plurality of in-loop processes included in the loop process is executed, an in-loop execution information generating portion for generating in-loop execution information indicating the frequencies with which each of the plurality of in-loop processes is executed in the case where the program is executed, and an optimization portion for optimizing the program based on the in-loop execution information.

Type: Grant

Filed: January 11, 2008

Date of Patent: February 15, 2011

Assignee: International Business Machines Corporation

Inventors: Hideaki Komatsu, Toshio Suganuma, Toshiaki Yasue
VECTORIZATION OF PROGRAM CODE

Publication number: 20110029962

Abstract: A method for vectorization of a block of code is provided. The method comprises receiving a first block of code as input; and converting the first block of code into at least a second block of code and a third block of code. The first block of code accesses a first set of memory addresses that are potentially misaligned. The second block of code performs conditional leaping address incrementation to selectively access a first subset of the first set of memory addresses. The third block of code accesses a second subset of the first set of memory addresses starting from an aligned memory address, simultaneously accessing multiple memory addresses at a time. No memory address belongs to both the first subset and the second subset of memory addresses.
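
The split into disjoint subsets can be pictured with ordinary loop peeling for alignment. This is a generic technique swapped in for illustration, not the "conditional leaping address incrementation" of the abstract; list indices stand in for memory addresses and `VF` is a hypothetical vector width.

```python
VF = 4  # hypothetical vector width, in elements


def scale_scalar(src, start, end, k):
    """Reference version: one element at a time."""
    return [src[i] * k for i in range(start, end)]


def scale_peeled(src, start, end, k):
    out = []
    # Round `start` up to the next multiple of VF (the first aligned index).
    first_aligned = min(end, -(-start // VF) * VF)

    # Scalar prologue: the potentially misaligned elements.
    for i in range(start, first_aligned):
        out.append(src[i] * k)

    # Aligned body: VF elements per step, standing in for vector operations.
    i = first_aligned
    while i + VF <= end:
        chunk = src[i:i + VF]
        out.extend(x * k for x in chunk)
        i += VF

    # Scalar epilogue: leftover elements that do not fill a whole vector.
    for j in range(i, end):
        out.append(src[j] * k)
    return out
```

No index is processed by more than one of the three loops, which mirrors the abstract's requirement that the subsets do not overlap.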

Type: Application

Filed: July 28, 2009

Publication date: February 3, 2011

Applicant: International Business Machines Corporation

Inventors: Dorit Nuzman, Ira Rosen, Ayal Zaks
Method, system, and program of a compiler to parallelize source code

Patent number: 7882498

Abstract: Provided are a method, system, and program for parallelizing source code with a compiler. Source code including source code statements is received. The source code statements are processed to determine a dependency of the statements. Multiple groups of statements are determined from the determined dependency of the statements, wherein statements in one group are dependent on one another. At least one directive is inserted in the source code, wherein each directive is associated with one group of statements. Resulting threaded code is generated including the inserted at least one directive. The group of statements to which the directive in the resulting threaded code applies are processed as a separate task. Each group of statements designated by the directive to be processed as a separate task may be processed concurrently with respect to other groups of statements.

Type: Grant

Filed: March 31, 2006

Date of Patent: February 1, 2011

Assignee: Intel Corporation

Inventors: Guilherme D. Ottoni, Xinmin Tian, Hong Wang, Richard A. Hankins, Wei Li, John Shen
Method of partially copying first and last private arrays for parallelized loops based on array data flow

Patent number: 7877739

Abstract: A computer-implemented method for determining whether an array within a loop can be privatized for that loop is presented. The method calculates the array sections that require first or last privatization and copies only those sections, reducing the privatization overhead of the known solutions.

Type: Grant

Filed: October 9, 2006

Date of Patent: January 25, 2011

Assignee: International Business Machines Corporation

Inventors: Roch G. Archambault, Erik P. Charlebois, Guansong Zhang
Stack unique signatures for program procedures and methods

Patent number: 7873954

Abstract: Stack signature marking segments are inserted into re-entrant programming source code modules prior to compilation of the modules at each code module entry point and at each code module exit point, followed by producing one or more executable programs from the programming source code modules. Upon execution of instances of the executable programs, the inserted segments assign unique, non-duplicated module identifier values to the instances of the code modules, generate an instance count for each instantiation of executable code module in the stack signature for each object instance dynamically created during runtime of a re-entrant executable code module, and push onto a processing stack the module identifier values and the instance counts within stack frames allocated to each of the executable program instances.

Type: Grant

Filed: May 12, 2006

Date of Patent: January 18, 2011

Assignee: International Business Machines Corporation

Inventors: Lorin Ullmann, Allen Chester Wynn
Using transactional memory for precise exception handling in aggressive dynamic binary optimizations

Patent number: 7865885

Abstract: Dynamic optimization of application code is performed by selecting a portion of the application code as a possible transaction. A transaction has a property that when it is executed, it is either atomically committed or atomically aborted. Determining whether to convert the selected portion of the application code to a transaction includes determining whether to apply at least one of a group of code optimizations to the portion of the application code. If it is determined to apply at least one of the code optimizations of the group of optimizations to the portion of application code, then the optimization is applied to the portion of the code and the portion of the code is converted to a transaction.

Type: Grant

Filed: September 27, 2006

Date of Patent: January 4, 2011

Assignee: Intel Corporation

Inventors: Youfeng Wu, Cheng Wang, Ho-seop Kim
Blocking of nested loops having feedback or feedforward indexes

Patent number: 7865886

Abstract: A method and apparatus for blocking nested loops having feedback or feedforward indexing. An embodiment of a method includes receiving a computer code segment, the segment including a first inner loop and a second outer loop, the inner loop being within the outer loop and the inner loop having a one-dimensional iteration space that is independent of the outer loop. The first loop is indexed by a variable I over a contiguous one-dimensional iteration space and addresses one or more data arrays with a shift in the index. The method further includes dividing a two-dimensional iteration space of the first loop and the second loop into multiple contiguous windows, where the second loop uses only one window of the plurality of windows during each iteration and the plurality of windows cover the iteration space. The method includes modifying the computer code segment by adding a third outer loop outside the second loop of the segment, the third loop encompassing the first loop and the second loop.
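
A simplified sketch of the loop structure: a third loop over windows is added outside the original two. It uses a feedforward index (`a[i + 1]`) and assumes no dependence is carried across outer-loop iterations, so the patent's treatment of feedback indexing is not reproduced. All names are hypothetical.

```python
def unblocked(a, coeffs):
    n = len(a) - 1
    out = [[0] * n for _ in coeffs]
    for j, c in enumerate(coeffs):            # second (outer) loop
        for i in range(n):                    # first (inner) loop, shifted index
            out[j][i] = c * (a[i] + a[i + 1])
    return out


def blocked(a, coeffs, window=4):
    n = len(a) - 1
    out = [[0] * n for _ in coeffs]
    for lo in range(0, n, window):            # added third loop: one window at a time
        hi = min(lo + window, n)
        for j, c in enumerate(coeffs):        # second loop now sees a single window
            for i in range(lo, hi):           # first loop restricted to the window
                out[j][i] = c * (a[i] + a[i + 1])
    return out
```

The windows are contiguous and together cover the whole inner iteration space, so each window of `a` is reused across all outer iterations while it is still likely to be in cache.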

Type: Grant

Filed: November 28, 2005

Date of Patent: January 4, 2011

Assignee: Intel Corporation

Inventor: Hans-Joachim Plum
Compiler apparatus

Patent number: 7856629

Abstract: A compiler apparatus, which can perform software pipelining optimization that has a considerable effect of reducing the number of execution cycles taken to complete a loop process, converts a source program into a machine program for a processor which is capable of parallel processing. The compiler apparatus is composed of: a parsing unit operable to parse the source program and then to convert the source program into an intermediate program which is described in an intermediate language; an optimization unit operable to optimize the intermediate program; and a conversion unit operable to convert the optimized intermediate program into the machine language program, wherein the optimization unit is operable to execute software pipelining, by inserting a transfer instruction, which is used for transferring data between operands, into a loop process included in the intermediate program so that a data dependence relation is changed.

Type: Grant

Filed: May 24, 2006

Date of Patent: December 21, 2010

Assignee: Panasonic Corporation

Inventors: Shohei Michimoto, Taketo Heishi, Hajime Ogawa, Teruo Kawabata
STATIC PROGRAM REDUCTION FOR COMPLEXITY ANALYSIS

Publication number: 20100318980

Abstract: Described is an analysis tool/techniques for determining the computational complexity of a computer program, including when the program includes procedures having nested loops and/or multi-path loops. First, multi-path loops are converted into code-fragments consisting of simpler loops via a transformation called control flow refinement. Progress invariants are determined for appropriate locations in the procedure to represent relationships between a state that can arise at that program location and the previous state at that location. A bound finding mechanism (such as one based on pattern matching) is then used to compute loop bounds from progress invariants. These bounds are then composed appropriately to determine a precise bound for the enclosing procedure.

Type: Application

Filed: June 13, 2009

Publication date: December 16, 2010

Applicant: Microsoft Corporation

Inventors: Sumit Gulwani, Sagar Jain, Eric J. Koskinen
VECTOR ATOMIC MEMORY OPERATION VECTOR UPDATE SYSTEM AND METHOD

Publication number: 20100318979

Abstract: A system and method of compiling program code, wherein the program code includes an operation on an array of data elements stored in memory of a computer system. The program code is scanned for an equation which may have recurring data points. The equation is then replaced with vectorized machine executable code, wherein the machine executable code comprises a nested loop and wherein the nested loop comprises an exterior loop and a virtual interior loop. The exterior loop decomposes the equation into a plurality of loops of length N, wherein N is an integer greater than one. The virtual interior loop executes vector operations corresponding to the N length loop to form a result vector resident in memory, wherein the virtual interior loop includes a vector atomic memory operation (AMO) instruction.

Type: Application

Filed: June 12, 2009

Publication date: December 16, 2010

Applicant: Cray Inc.

Inventor: Terry D. Greyzck
Method and apparatus for software scouting regions of a program

Patent number: 7849453

Abstract: One embodiment of the present invention provides a system that generates code for software scouting the regions of a program. During operation, the system receives source code for a program. The system then compiles the source code. In the first step of the compilation process, the system identifies a first set of loops from a hierarchy of loops in the source code, wherein each loop in the first set of loops contains at least one effective prefetch candidate. Then, from the first set of loops, the system identifies a second set of loops where scout-mode prefetching is profitable. Next, for each loop in the second set of loops, the system produces executable code for a helper-thread which contains a prefetch instruction for each effective prefetch candidate. At runtime the helper-thread is executed in parallel with the main thread in advance of where the main thread is executing to prefetch data items for the main thread.

Type: Grant

Filed: November 9, 2005

Date of Patent: December 7, 2010

Assignee: Oracle America, Inc.

Inventors: Partha P. Tirumalai, Yonghong Song, Spiros Kalogeropulos
Compiler apparatus

Patent number: 7827542

Abstract: A compiler apparatus that improves the performance of loop processing. The compiler apparatus translates a C program that includes a loop into a machine language program, and includes: a movement judgment unit that judges whether or not an instruction which is positioned outside of the loop of the C program can be moved into the loop, based on a state of live ranges of variables used in the instruction; a movement execution unit that moves the instruction into the loop in the case where the movement judgment unit judges that the instruction can be moved into the loop, thereby generating an intermediate program; and a translation unit that translates the intermediate program into the machine language program.

Type: Grant

Filed: September 25, 2006

Date of Patent: November 2, 2010

Assignee: Panasonic Corporation

Inventors: Hajime Ogawa, Ryoko Miyachi, Toshiyuki Sakata
Using a concurrent partial inspector loop with speculative parallelism

Patent number: 7823141

Abstract: A method for executing a loop in an application that includes executing iterations in a first segment of the loop by a base thread, logging memory transactions that occur during execution of iterations in the first segment by a co-inspector thread to obtain a co-inspector log, executing iterations in a second segment of the loop by a co-thread to obtain temporary results, logging memory transactions that occur during execution of iterations in the second segment to obtain a co-thread log, and comparing the co-inspector log and the co-thread log to determine whether a thread interdependency exists.

Type: Grant

Filed: September 30, 2005

Date of Patent: October 26, 2010

Assignee: Oracle America, Inc.

Inventors: Phyllis E. Gustafson, Michael H. Paleczny, Christopher A. Vick, Olaf Manczak, Jay R. Freeman, Yuguang Wu
Method for loop reformulation

Patent number: 7814468

Abstract: A method for loop reformulation is provided such that a single exit ill-formed loop (SEIFL) can be reformulated into a reformulated code block that contains a transformed well-formed loop (TWFL). A SEIFL loop is a loop that can exit from the loop body of the loop. After the loop reformulation, the TWFL of the reformulated code block can only exit from the end of the loop. The reformulated code block will replace the SEIFL in the compiler's internal representation (IR) such that a more efficient executable machine code can be generated by optimizing the reformulated compiler's IR.
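
One generic way to picture the reformulation is to fold the mid-body exit into the loop condition with a flag. This is an assumption-laden sketch, not the patent's procedure; `ill_formed` and `reformulated` are hypothetical names.

```python
def ill_formed(values, limit):
    """Loop with a single exit taken from the middle of the body."""
    total, i = 0, 0
    while i < len(values):
        total += values[i]
        if total > limit:         # exit from inside the loop body
            break
        i += 1
    return total, i


def reformulated(values, limit):
    """Equivalent loop that can only exit through its loop test."""
    total, i = 0, 0
    done = len(values) == 0
    while not done:
        total += values[i]
        if total > limit:
            done = True           # former early exit, now a condition update
        else:
            i += 1
            done = i >= len(values)
    return total, i
```

A loop with a single, well-placed exit is generally easier for later passes to analyse, for example when computing a trip count.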

Type: Grant

Filed: April 20, 2005

Date of Patent: October 12, 2010

Assignee: Oracle America, Inc.

Inventors: Yonghong Song, Xiangyun Kong
LEVERAGING MULTICORE SYSTEMS WHEN COMPILING PROCEDURES

Publication number: 20100257516

Abstract: A method, apparatus and program product are provided for parallelizing analysis and optimization in a compiler. A plurality of basic blocks and a subset of data points of a computer program is prepared for processing by a main thread selected from a plurality of hardware threads. The plurality of prepared basic blocks and subset of data points are placed in a shared data structure by the main thread. A prepared basic block of the plurality of prepared basic blocks and/or a tuple associated with the subset of data points is concurrently retrieved from the shared data structure by a work thread selected from the plurality of hardware threads. A compiler analysis or optimization is performed on the prepared basic block or tuple by the work thread.

Type: Application

Filed: April 2, 2009

Publication date: October 7, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Robert R. Roediger, William J. Schmidt
System and method for performing over time statistics in an electronic spreadsheet environment

Patent number: 7810032

Abstract: A method and system for computing statistical parameters for sets of data items, by executing instructions of a computer program that is coded within a spreadsheet. Each set is generated in a time sequence that is specific to each set. For each time sequence, each data item is one data value or a pair of data values. The data items appear one-at-a-time in only one cell structure of the spreadsheet at each time in the time sequence. The one cell structure is a single cell or two cells. A loop of iterations is performed for each set. In each iteration, a command is responded to by updating the statistical parameters based on the latest data item in the one cell structure in the spreadsheet. The updated statistical parameters are stored in a parameter field of the spreadsheet assigned to each statistical parameter.

Type: Grant

Filed: September 13, 2005

Date of Patent: October 5, 2010

Assignee: International Business Machines Corporation

Inventors: Frederic Bauchot, Gerard Marmigere
Array compression method

Patent number: 7805413

Abstract: A program stored in a storage device is read. Partial compression of the elements of an array in a loop nest in the program is performed by replacing an element that is local only to the loop nest in the entire program with a scalar variable. Access to the original array is inserted into the program for a non-local element.
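
A minimal sketch of the fully local case, where every element of a temporary array is used only inside the loop and can be replaced with a scalar. The partial case, in which some elements are still needed outside the loop nest and keep their array access, is not shown. The names are hypothetical.

```python
def with_temp_array(a, b):
    n = len(a)
    tmp = [0] * n                 # array whose elements are local to the loop
    out = [0] * n
    for i in range(n):
        tmp[i] = a[i] + b[i]
        out[i] = tmp[i] * tmp[i]
    return out


def with_scalar(a, b):
    n = len(a)
    out = [0] * n
    for i in range(n):
        t = a[i] + b[i]           # scalar replaces tmp[i]
        out[i] = t * t
    return out
```

The compressed version needs no storage for `tmp` and gives the compiler a value it can keep in a register.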

Type: Grant

Filed: December 22, 2003

Date of Patent: September 28, 2010

Assignee: Fujitsu Limited

Inventor: Akira Hosoi
ONE-PASS COMPILATION OF VIRTUAL INSTRUCTIONS

Publication number: 20100235819

Abstract: In embodiments, prior to compilation into machine code, a preprocessor generates directives by processing a source code and/or bytecode representation of a program and/or selecting default directives. The preprocessor embeds the directives in a bytecode representation of the program or a separate stream associated with the bytecode representation of the program. A just-in-time compiler may compile the bytecode representation into machine code directed by the embedded directives in one pass and/or a bytecode interpreter may interpret the bytecode representation of the program. In some embodiments, a computing device generates bytecodes during execution of a program, selects default directives, and embeds the default directives in the bytecodes or a separate stream associated with the bytecodes prior to compilation of the bytecodes into machine code.

Type: Application

Filed: March 10, 2009

Publication date: September 16, 2010

Applicant: Sun Microsystems, Inc.

Inventor: John Robert Rose
Estimating a dominant resource used by a computer program

Patent number: 7797692

Abstract: A system that estimates a dominant computational resource which is used by a computer program. During operation, for each basic block in the computer program, the system determines a nesting level for the basic block. Next, the system selects basic blocks with nesting levels greater than a specified threshold. For each selected basic block, the system analyzes the basic block to estimate the dominant computational resource used by the basic block. The system then uses the estimated dominant computational resources for the selected basic blocks to estimate the dominant computational resource for the computer program.
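
The selection-and-combination steps might be sketched as follows. The block representation (a dict with a nesting level and per-resource operation counts) and the majority vote used to combine per-block estimates are assumptions of this sketch, not details given in the abstract.

```python
from collections import Counter


def dominant_resource(blocks, threshold):
    """Estimate the dominant resource of a program from its deeply nested blocks.

    `blocks` is a list of dicts such as
    {"nesting": 2, "ops": {"cpu": 10, "memory": 3}}  (hypothetical format).
    """
    votes = Counter()
    for block in blocks:
        if block["nesting"] > threshold:          # keep only deeply nested blocks
            ops = block["ops"]
            votes[max(ops, key=ops.get)] += 1     # per-block dominant resource
    if not votes:
        return None                               # no block passed the threshold
    return votes.most_common(1)[0][0]             # combine by majority vote
```

For example, with a threshold of 1, a program whose two deepest blocks are both memory-heavy is classified as memory-bound even if its top-level code is mostly arithmetic.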

Type: Grant

Filed: May 12, 2006

Date of Patent: September 14, 2010

Assignee: Google Inc.

Inventor: Grzegorz J. Czajkowski
Systems and methods for affine-partitioning programs onto multiple processing units

Patent number: 7793278

Abstract: Systems and methods perform affine partitioning on a code stream to produce code segments that may be parallelized. The code segments include copies of the original code stream with conditionals inserted that aid in parallelizing code. Each conditional is formed by determining the constraints on a processor variable determined by the affine partitioning and applying the constraints to the original code stream.

Type: Grant

Filed: September 30, 2005

Date of Patent: September 7, 2010

Assignee: Intel Corporation

Inventors: Zhao Hui Du, Shih-Wei Liao, Gansha Wu, Guei-Yuan Lueh
Method of converting computer program with loops to one without loops

Patent number: 7788659

Abstract: The present invention is a method of eliminating loops from a computer program by receiving the program, graphing its function and control, identifying its entry point, and identifying groups of loops connected to its entry point. Stop if there are no such groups. Otherwise, selecting a group of loops. Then, identifying the selected group's entry point. If the selected group includes no group of loops having a different entry point then replacing it with a recursive or non-recursive function, reconfiguring each connection entering and exiting the selected group to preserve their functionality, and returning to the fifth step. Otherwise, identifying groups of loops in the selected group connected to, but having different entry points and returning to the loop selection step.
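
The core replacement step, a loop swapped for a recursive function, can be shown on a small example. The choice of Euclid's algorithm is illustrative only, and the graph-based selection of loop groups described in the abstract is not modelled.

```python
def gcd_loop(a, b):
    """Original form: the computation is driven by a loop."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_recursive(a, b):
    """Loop-free form: each iteration becomes one recursive call."""
    if b == 0:                    # loop exit condition becomes the base case
        return a
    return gcd_recursive(b, a % b)
```

The loop-carried variables become the function's parameters, and the loop test becomes the base case.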

Type: Grant

Filed: February 27, 2007

Date of Patent: August 31, 2010

Assignee: United States of America as represented by the Director, the National Security Agency

Inventor: Francis S. Rimlinger
SYSTEM, METHODS AND APPARATUS FOR PROGRAM OPTIMIZATION FOR MULTI-THREADED PROCESSOR ARCHITECTURES

Publication number: 20100218196

Abstract: Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations and contiguity of memory accesses on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.

Type: Application

Filed: April 16, 2010

Publication date: August 26, 2010

Inventors: Allen K. Leung, Benoit Meister, Nicolas T. Vasilache, David E. Wohlford, Cedric Bastoul, Peter Szilagyi, Richard A. Lethin
CONTROL STRUCTURE REFINEMENT OF LOOPS USING STATIC ANALYSIS

Publication number: 20100205592

Abstract: A system and method for discovering a set of possible iteration sequences for a given loop in a software program is described, to transform the loop representation. In a program containing a loop, the loop is partitioned into a plurality of portions based on splitting criteria. Labels are associated with the portions, and an initial loop automaton is constructed that represents the loop iterations as a regular language over the labels corresponding to the portions in the program. Subsequences of the labels are analyzed to determine infeasibility of the subsequences permitted in the automaton. The automaton is refined by removing all infeasible subsequences to discover a set of possible iteration sequences in the loop. The resulting loop automaton is used in a subsequent program verification or analysis technique to find violations of correctness properties in programs.

Type: Application

Filed: February 8, 2010

Publication date: August 12, 2010

Applicant: NEC Laboratories America, Inc.

Inventors: Sriram Sankaranarayanan, Aarti Gupta, Gogul Balakrishnan
Method and system for performing reassociation in software loops

Patent number: 7774766

Abstract: Various embodiments of the present invention relate to methods and systems for optimizing an intermediate code in a compilation logic. The intermediate code is optimized by performing reassociation in software loops. The intermediate code includes at least one critical recurrence cycle. The performance of reassociation in software loops can reduce a critical recurrence cycle in them, which can speed up their execution. The subject method can include the determination of one or more critical recurrence cycles in a software loop. The method can also include the determination of at least one edge in a critical recurrence cycle, with respect to which reassociation can be performed, if one or more pre-determined criteria are met. The method can further include performing reassociation of a dependee and a dependent of an edge. In an embodiment, when one or more pre-determined criteria are met, the logic of the software loop is maintained after performing reassociation of the dependee and the dependent of the edge.
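
The effect of reassociation on a recurrence can be seen in a small sketch. Integer inputs are assumed so that both forms give identical results; with floating-point values, reassociation can change rounding. The function names are hypothetical.

```python
def before(a, b):
    s = 0
    for x, y in zip(a, b):
        s = (s + x) + y           # two dependent additions on the cycle through s
    return s


def after(a, b):
    s = 0
    for x, y in zip(a, b):
        t = x + y                 # independent of s, so off the recurrence cycle
        s = s + t                 # only one addition remains on the cycle
    return s
```

In the second form, the addition `x + y` for the next iteration can overlap with the update of `s`, which shortens the critical recurrence cycle.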

Type: Grant

Filed: September 29, 2005

Date of Patent: August 10, 2010

Assignee: Intel Corporation

Inventors: Kalyan Muthukumar, Daniel M Lavery
Generating efficient parallel code using partitioning, coalescing, and degenerative loop and guard removal

Patent number: 7757222

Abstract: Code is affine partitioned to generate affine partitioning mappings. Parallel code is generated based on the affine partitioning mappings. Generating the parallel code includes coalescing loops in the parallel code generated from the affine partitioning mappings to generate coalesced parallel code and optimizing the coalesced parallel code.

Type: Grant

Filed: September 30, 2005

Date of Patent: July 13, 2010

Assignee: Intel Corporation

Inventors: Shih-wei Liao, Zhao Hui Du, Bu Qi Cheng, Gansha Wu, Guei-Yuan Lueh
COMPILER APPARATUS WITH FLEXIBLE OPTIMIZATION

Publication number: 20100175056

Abstract: A compiler comprises an analysis unit that detects directives (options and pragmas) from a user to the compiler, an optimization unit that is made up of a processing unit (a global region allocation unit, a software pipelining unit, a loop unrolling unit, an “if” conversion unit, and a pair instruction generation unit) that performs individual optimization processing designated by options and pragmas from a user, following the directives and the like from the analysis unit, etc. The global region allocation unit performs optimization processing, following designation of the maximum data size of variables to be allocated to a global region, designation of variables to be allocated to the global region, and options and pragmas regarding designation of variables not to be allocated in the global region.

Type: Application

Filed: February 16, 2010

Publication date: July 8, 2010

Inventors: Hajime Ogawa, Taketo Heishi, Toshiyuki Sakata, Shuichi Takayama, Shohei Michimoto, Tomoo Hamada, Ryoko Miyachi
Methods and systems for ordering instructions using future values

Patent number: 7747993

Abstract: A method of ordering instructions. The method can include placing a first instruction that consumes a value of an object before a second instruction that produces the value of the object such that the first instruction is processed before the second instruction and a physical location is allocated to the value of the object upon processing the first instruction.

Type: Grant

Filed: December 30, 2004

Date of Patent: June 29, 2010

Assignee: Michigan Technological University

Inventor: Soner Onder
METHOD AND SYSTEM FOR INTERPROCEDURAL PREFETCHING

Publication number: 20100146495

Abstract: A computing system has an amount of shared cache, and performs runtime automatic parallelization wherein when a parallelized loop is encountered, a main thread shares the workload with at least one other non-main thread. A method for providing interprocedural prefetching includes compiling source code to produce compiled code having a main thread including a parallelized loop. Prior to the parallelized loop in the main thread, the main thread includes prefetching instructions for the at least one other non-main thread that shares the workload of the parallelized loop. As a result, the main thread prefetches data into the shared cache for use by the at least one other non-main thread.

Type: Application

Filed: December 10, 2008

Publication date: June 10, 2010

Applicant: SUN MICROSYSTEMS, INC.

Inventors: Yonghong Song, Spiros Kalogeropulos, Partha P. Tirumalai
Efficient protocol for encoding software pipelined loop when PC trace is enabled

Patent number: 7721267

Abstract: A software pipelined loop tracing method involves inhibiting an output of trace data at a start of a software pipelined loop (SPLOOP). A skip in an output trace packet is indicated if the SPLOOP is skipped, and the SPLOOP is indicated at a cycle of an epilog state in the output trace packet if the SPLOOP is not skipped. An iteration count indication SPLOOP information and a position within a SPLOOP, is maintained. A periodic SPLOOP marker (PerSP) coinciding with a sync point is output if the SPLOOP is active.

Type: Grant

Filed: May 16, 2006

Date of Patent: May 18, 2010

Assignee: Texas Instruments Incorporated

Inventor: Manisha Agarwala
Macroscalar Processor Architecture

Publication number: 20100122069

Abstract: A macroscalar processor architecture is described herein. In one embodiment, a processor receives instructions of a program loop having a vector block and a sequence block intended to be executed after the vector block, where the processor includes multiple slices and each of the slices is capable of executing an instruction of an iteration of the program loop substantially in parallel. For each iteration of the program loop, the processor executes an instruction of the sequence block using one of the slices while executing instructions of the vector block using a remainder of the slices substantially in parallel. Other methods and apparatuses are also described.

Type: Application

Filed: November 6, 2009

Publication date: May 13, 2010

Inventor: Jeffry E. Gonion
Method for predicate promotion in a software loop

Patent number: 7712091

Abstract: A method and system for optimizing the execution of a software loop is provided. The method involves the determination of an edge in a critical recurrence cycle in the software loop. The edge is a dependency link between two instructions and contains a dependee and a dependent. The dependee is an instruction that produces a result, and the dependent is an instruction that uses the result. The method further involves performing predicate promotion of at least one of the dependee and the dependent if one or more pre-determined conditions are met.

Type: Grant

Filed: September 30, 2005

Date of Patent: May 4, 2010

Assignee: Intel Corporation

Inventors: Kalyan Muthukumar, Robyn A. Sampson, Daniel Lavery
Dynamic prefetch distance calculation

Patent number: 7702856

Abstract: The prefetch distance to be used by a prefetch instruction may not always be correctly calculated using compile-time information. In one embodiment, the present invention generates prefetch distance calculation code to dynamically calculate a prefetch distance used by a prefetch instruction at run-time.

Type: Grant

Filed: November 9, 2005

Date of Patent: April 20, 2010

Assignee: Intel Corporation

Inventors: Rakesh Krishnaiyer, Somnath Ghosh, Abhay Kanhere
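
The abstract above does not state how the prefetch distance is computed, so the following Python sketch uses a common textbook heuristic as an assumption: divide the memory latency by the cost of one loop iteration, round up, and cap the result by the trip count. The function name and all three parameters are hypothetical. The point is only that these inputs may be unknown at compile time, which is why generated code would evaluate the expression at run time.

```python
import math

def prefetch_distance(latency_cycles, cycles_per_iteration, trip_count):
    """Number of iterations to prefetch ahead, from values known only at run time.

    Assumed heuristic: ceil(latency / iteration cost), capped by the trip count.
    """
    if cycles_per_iteration <= 0 or trip_count <= 0:
        return 0
    distance = math.ceil(latency_cycles / cycles_per_iteration)
    # Prefetching beyond the last iteration is wasted work, so cap the distance.
    return min(distance, trip_count - 1)

long_loop = prefetch_distance(300, 40, 1000)
short_loop = prefetch_distance(300, 40, 4)
print(long_loop, short_loop)
```

For a short loop the cap dominates, which is the kind of case where a distance fixed at compile time could be wrong.
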
Compiler apparatus with flexible optimization

Patent number: 7698696

Abstract: A compiler comprises an analysis unit that detects directives (options and pragmas) from a user to the compiler, an optimization unit that is made up of a processing unit (a global region allocation unit, a software pipelining unit, a loop unrolling unit, an “if” conversion unit, and a pair instruction generation unit) that performs individual optimization processing designated by options and pragmas from a user, following the directives and the like from the analysis unit, etc. The global region allocation unit performs optimization processing, following designation of the maximum data size of variables to be allocated to a global region, designation of variables to be allocated to the global region, and options and pragmas regarding designation of variables not to be allocated in the global region.

Type: Grant

Filed: June 30, 2003

Date of Patent: April 13, 2010

Assignee: Panasonic Corporation

Inventors: Hajime Ogawa, Taketo Heishi, Toshiyuki Sakata, Shuichi Takayama, Shohei Michimoto, Tomoo Hamada, Ryoko Miyachi
Splitting the computation space to optimize parallel code

Patent number: 7689980

Abstract: Linear transformations of statements in code are performed to generate linear expressions associated with the statements. Parallel code is generated using the linear expressions. Generating the parallel code includes splitting the computation-space of the statements into intervals and generating parallel code for the intervals.

Type: Grant

Filed: September 30, 2005

Date of Patent: March 30, 2010

Assignee: Intel Corporation

Inventors: Zhao Hui Du, Shih-wei Liao, Gansha Wu, Guei-Yuan Lueh
METHODS AND APPARATUS FOR JOINT PARALLELISM AND LOCALITY OPTIMIZATION IN SOURCE CODE COMPILATION

Publication number: 20100070956

Abstract: Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for both parallelism and locality of operations on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.

Type: Application

Filed: September 16, 2009

Publication date: March 18, 2010

Inventors: Allen Leung, Nicolas T. Vasilache, Benoit Meister, Richard A. Lethin
Fine-grained software-directed data prefetching using integrated high-level and low-level code analysis optimizations

Patent number: 7669194

Abstract: A mechanism for minimizing effective memory latency without unnecessary cost through fine-grained software-directed data prefetching using integrated high-level and low-level code analysis and optimizations is provided. The mechanism identifies and classifies streams, identifies data that is most likely to incur a cache miss, exploits effective hardware prefetching to determine the proper number of streams to be prefetched, exploits effective data prefetching on different types of streams in order to eliminate redundant prefetching and avoid cache pollution, and uses high-level transformations with integrated lower level cost analysis in the instruction scheduler to schedule prefetch instructions effectively.

Type: Grant

Filed: August 26, 2004

Date of Patent: February 23, 2010

Assignee: International Business Machines Corporation

Inventors: Roch Georges Archambault, Robert James Blainey, Yaoqing Gao, Allan Russell Martin, James Lawrence McInnes, Francis Patrick O'Connell
Program execution method using an optimizing just-in-time compiler

Patent number: 7665079

Abstract: It is one object of the present invention to provide a program execution method for performing greater optimization. A program execution apparatus according to the present invention performs a transfer from an interpreter process to a compiled code process in the course of the execution of a method. At this time, if no problem occurs when a transfer point is moved to the top of a loop, the transfer point for code is so moved. And when a transfer point is located inside a loop, a point that post-dominates the top of the loop and the transfer point is copied to a position immediately preceding the loop. Then, information for generating recalculation code is provided for the transfer point, and a recalculation is performed.

Type: Grant

Filed: November 8, 2000

Date of Patent: February 16, 2010

Assignee: International Business Machines Corporation

Inventors: Toshiaki Yasue, Kazunori Ogata, Kazuaki Ishizaki, Hideaki Komatsu
Huffman-L compiler optimized for cell-based computers or other computers having reconfigurable instruction sets

Patent number: 7665078

Abstract: A method for optimizing a code sequence by tuning the representations of an instruction set based on the frequency of operations performed by the code sequence. For example, the number of bit symbols used to represent a code sequence may be reduced using the present invention.

Type: Grant

Filed: August 21, 2003

Date of Patent: February 16, 2010

Assignee: Gateway, Inc.

Inventor: Frank Liebenow
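
The abstract above speaks of tuning instruction representations to the frequency of operations. One way to picture that, offered here only as an assumption based on the "Huffman" in the title, is a standard Huffman construction over opcode counts: frequent opcodes get shorter bit patterns. The Python sketch below computes code lengths for a hypothetical instruction trace; the opcode names, the trace, and the function name are illustrative and do not come from the patent.

```python
import heapq
from collections import Counter

def huffman_code_lengths(ops):
    """Return {opcode: bit length} for a Huffman code built from opcode frequencies."""
    freq = Counter(ops)
    if not freq:
        return {}
    if len(freq) == 1:
        return {op: 1 for op in freq}
    # Heap entries: (weight, unique tie breaker, {opcode: depth so far}).
    heap = [(w, i, {op: 0}) for i, (op, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every opcode in them one level deeper.
        merged = {op: depth + 1 for op, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical instruction trace; the opcode names are illustrative only.
trace = ["load"] * 8 + ["add"] * 4 + ["store"] * 2 + ["branch"] * 2
lengths = huffman_code_lengths(trace)
total_bits = sum(lengths[op] for op in trace)
fixed_bits = 2 * len(trace)  # four opcodes need 2 bits each in a fixed-width encoding
print(lengths, total_bits, fixed_bits)
```

Comparing `total_bits` with `fixed_bits` shows the kind of reduction in bit symbols that the abstract mentions, under the assumptions stated above.
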
Dynamically Maintaining Coherency Within Live Ranges of Direct Buffers

Publication number: 20100023700

Abstract: Reducing coherency problems in a data processing system is provided. Source code that is to be compiled is received and analyzed to identify at least one of a plurality of loops that contain a memory reference. A determination is made as to whether the memory reference is an access to a global memory that should be handled by a direct buffer. Responsive to an indication that the memory reference is an access to the global memory that should be handled by the direct buffer, the memory reference is marked for direct buffer transformation. The direct buffer transformation is then applied to the memory reference.

Type: Application

Filed: July 22, 2008

Publication date: January 28, 2010

Applicant: International Business Machines Corporation

Inventors: Tong Chen, John K. O'Brien, Tao Zhang
Efficient Software Cache Accessing With Handle Reuse

Publication number: 20100023932

Abstract: A mechanism for efficient software cache accessing with handle reuse is provided. The mechanism groups references in source code into a reference stream with the reference stream having a size equal to or less than a size of a software cache line. The source code is transformed into optimized code by modifying the source code to include code for performing at most two cache lookup operations for the reference stream to obtain two cache line handles. Moreover, the transformation involves inserting code to resolve references in the reference stream based on the two cache line handles. The optimized code may be output for generation of executable code.

Type: Application

Filed: July 22, 2008

Publication date: January 28, 2010

Applicant: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Marc Gonzalez Tallada, John K. O'Brien
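
The key observation in the handle-reuse abstract above is that a reference stream no larger than a cache line can touch at most two lines, so two lookups are enough. The Python sketch below models that with a toy software cache. The class, the 128-byte line size, and the use of a `(base, bytearray)` pair as a "handle" are assumptions for illustration and are not taken from the application.

```python
LINE_SIZE = 128  # hypothetical software cache line size in bytes

class SoftwareCache:
    """Toy software cache that counts lookups; a handle is a (base, line) pair."""
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.lookups = 0

    def lookup(self, addr):
        self.lookups += 1
        base = addr - addr % LINE_SIZE
        if base not in self.lines:
            self.lines[base] = bytearray(self.memory[base:base + LINE_SIZE])
        return base, self.lines[base]

def read_stream(cache, start, count):
    """Read `count` consecutive bytes (count <= LINE_SIZE) using at most two lookups."""
    if count > LINE_SIZE:
        raise ValueError("stream must not exceed one cache line")
    base0, line0 = cache.lookup(start)
    last = start + count - 1
    # A stream no longer than a line can straddle at most one line boundary.
    if last - last % LINE_SIZE != base0:
        base1, line1 = cache.lookup(last)
    else:
        base1, line1 = base0, line0
    out = []
    for addr in range(start, start + count):
        # Resolve each reference against one of the two handles, with no further lookups.
        if addr < base0 + LINE_SIZE:
            out.append(line0[addr - base0])
        else:
            out.append(line1[addr - base1])
    return out

memory = bytes(range(256)) * 2
cache = SoftwareCache(memory)
values = read_stream(cache, 120, 16)   # crosses the boundary at address 128
print(values, cache.lookups)
```

Without handle reuse, each of the 16 references would trigger its own lookup; here the lookup counter stays at two even though the stream crosses a line boundary.
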
Editing, creating, and verifying reorganization of flowchart, and transforming between flowchart and tree diagram

Patent number: 7647577

Abstract: Provides methods for transforming a flowchart to an equivalent tree diagram, methods for transforming an equivalent tree diagram to a flowchart, methods for verifying reorganization of a flowchart, methods for editing a flowchart, methods for creating a flowchart and a flowchart editor. A flowchart includes one or more logic structures and one or more processing activities in said one or more logic structures. The method for transforming a flowchart to an equivalent tree diagram comprises: traversing said flowchart; transforming said one or more logic structures in said flowchart to one or more branching nodes in said tree diagram; and transforming one or more processing activities in said logic structures of said flowchart to one or more leaf nodes below corresponding branching nodes in said tree diagram. Further, editing of a flowchart and verification of reorganization of a flowchart are performed by utilizing an equivalent tree diagram.

Type: Grant

Filed: May 27, 2005

Date of Patent: January 12, 2010

Assignee: International Business Machines Corporation

Inventors: Jian Wang, Jun Zhu, Sheng Ye, Jing Li, Hai Qi Liang, Ying Liu, Ying Nan Zuo
INTERFACE OPTIMIZATION IN A CLOSED SYSTEM

Publication number: 20090328020

Abstract: Interface optimization is provided using a closed system in which all the individual software components in the system are known to the compiler at a single point in time. This knowledge enables significant opportunities to optimize the implementation of interfaces on a set of implemented objects. When code is compiled, because the compiler knows the full list of interfaces and the objects which implement the interfaces, it can improve execution and working set (i.e., recently referenced pages in a program's virtual address space) when implementing the interfaces on objects. This improvement may be realized by reducing the size of interface lookup tables which map each interface to the object types which implement that particular interface.

Type: Application

Filed: June 28, 2008

Publication date: December 31, 2009

Applicant: Microsoft Corporation

Inventors: Jeffrey E. Stall, Jonathon Michael Stall
Multiversioning if statement merging and loop fusion

Publication number: 20090328021

Abstract: In one embodiment of the invention, a method fuses a first loop nested in a first IF statement with a second loop nested in a second IF statement without the use of modified and referenced (mod-ref) information to determine whether certain conditional statements in the IF statements retain variable values.

Type: Application

Filed: June 30, 2008

Publication date: December 31, 2009

Inventors: John L. Ng, Robert Cox, Dmitry V. Budanov
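
The general shape of multiversioning for IF-statement merging and loop fusion can be sketched as follows. This Python example is a simplified illustration, not the mechanism claimed in the application: it assumes the two conditions are plain boolean values that neither loop can alter and that the two arrays are distinct, so the legality question the abstract alludes to does not arise. The function names and the loop bodies are hypothetical.

```python
def run_separately(c1, c2, a, b):
    """Original form: each loop sits inside its own IF statement."""
    if c1:
        for i in range(len(a)):
            a[i] += 1
    if c2:
        for i in range(len(b)):
            b[i] *= 2

def run_multiversioned(c1, c2, a, b):
    """Use the fused version when both conditions hold and the trip counts
    match; otherwise fall back to the original code."""
    if c1 and c2 and len(a) == len(b):
        for i in range(len(a)):      # one merged IF, one fused loop
            a[i] += 1
            b[i] *= 2
    else:
        run_separately(c1, c2, a, b)

a, b = [1, 2, 3], [4, 5, 6]
run_multiversioned(True, True, a, b)
print(a, b)
```

Keeping the original code as the fallback version is what makes the transformation safe to apply when the run-time check fails.
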
