Including Scheduling Instructions Patents (Class 717/161)

Code generation in the presence of paged memory

Patent number: 8341609

Abstract: A computer is programmed to automatically identify multiple sequences of executable code such that each sequence fits within a page of memory. When the executable code comprising several sequences is loaded into the paged memory, each sequence is placed in its own page. The computer is further programmed to prepare a number of structures which identify a corresponding number of instructions that transfer control between sequences. Each structure identifies at least a control transfer instruction in one sequence and a target in another sequence. When loading the sequences into memory, the structures are used to replace destination addresses of control transfers between sequences with new addresses derived from base addresses of pages that have been allocated in memory to hold the sequences.

Type: Grant

Filed: January 26, 2007

Date of Patent: December 25, 2012

Assignee: Oracle International Corporation

Inventors: Robert H. Lee, David Unietis, Mark Jungerman
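
The load-time patching described in this abstract can be illustrated with a minimal sketch. It assumes each "structure" is a small record naming the transfer instruction and its target; the `Fixup` record, the `relocate` function, and the list-of-integers encoding of a sequence are hypothetical choices made for illustration, not details taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class Fixup:
    # Hypothetical record for one control transfer between sequences.
    src_seq: int     # index of the sequence holding the transfer instruction
    src_offset: int  # slot in that sequence holding the destination address
    dst_seq: int     # index of the sequence holding the target
    dst_offset: int  # offset of the target inside its sequence

def relocate(sequences, fixups, page_bases):
    """Return copies of the sequences with inter-sequence targets patched.

    page_bases[i] is the base address of the page allocated to sequence i.
    """
    patched = [list(seq) for seq in sequences]
    for f in fixups:
        # New address = base of the target's page + offset within that page.
        patched[f.src_seq][f.src_offset] = page_bases[f.dst_seq] + f.dst_offset
    return patched

# Two sequences, each with one placeholder (0) destination slot.
sequences = [[10, 0, 11], [20, 21, 0]]
fixups = [Fixup(0, 1, 1, 0), Fixup(1, 2, 0, 2)]
page_bases = [0x4000, 0x9000]
patched = relocate(sequences, fixups, page_bases)
```

Because every destination is derived from a page base at load time, the pages holding the sequences do not need to be contiguous.
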
System and method for topology-aware job scheduling and backfilling in an HPC environment

Patent number: 8336040

Abstract: A method for job management in an HPC environment includes determining an unallocated subset from a plurality of HPC nodes, with each of the unallocated HPC nodes comprising an integrated fabric. An HPC job is selected from a job queue and executed using at least a portion of the unallocated subset of nodes.

Type: Grant

Filed: April 15, 2004

Date of Patent: December 18, 2012

Assignee: Raytheon Company

Inventors: Shannon V. Davidson, Anthony N. Richoux
Method and system of performing thread scheduling

Patent number: 8336031

Abstract: A method and system of performing thread scheduling. At least some of the illustrative embodiments are computer-readable mediums storing a program that, when executed by a processor of a host system, causes the processor to instantiate a CPU object that represents a processor abstraction, create a CPU context object that represents a thread abstraction (wherein the CPU context object is associated to a method, and wherein the CPU context object is mapped onto the CPU object), and execute the method within the CPU object.

Type: Grant

Filed: October 31, 2007

Date of Patent: December 18, 2012

Assignee: Texas Instruments Incorporated

Inventors: Gilbert Cabillic, Jean-Philippe Lesot
Compiler and compiling method

Patent number: 8336041

Abstract: During loop unrolling, which is based on an optimized unroll count, a compiler allocates an unroll_group_number to each loop body, conferred according to the sequence in which the loop body is replicated. The allocated unroll_group_number is added to each instruction included in each loop body. A priority of an instruction is adjusted based on the allocated unroll_group_number during instruction scheduling.

Type: Grant

Filed: September 11, 2009

Date of Patent: December 18, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventor: Byung-chang Cha
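
One way to picture the priority adjustment is the sketch below. The abstract does not say how the priority is adjusted, so the linear penalty, the `GROUP_WEIGHT` constant, and the tuple layout of an instruction are all assumptions made for illustration. Dependencies between instructions are ignored here.

```python
GROUP_WEIGHT = 10  # hypothetical tuning constant, not from the patent

def adjusted_priority(base_priority, unroll_group_number):
    # Assumed rule: later replicas of the loop body get lower priority.
    return base_priority - GROUP_WEIGHT * unroll_group_number

def order_ready_list(ready):
    """Sort (name, base_priority, unroll_group_number) tuples, highest first."""
    return sorted(
        ready,
        key=lambda ins: adjusted_priority(ins[1], ins[2]),
        reverse=True,
    )

ready = [("load_1", 5, 1), ("load_0", 5, 0), ("add_0", 3, 0), ("add_1", 3, 1)]
ordered = [name for name, _, _ in order_ready_list(ready)]
```

With these assumed weights, the instructions of replica 0 are ordered ahead of those of replica 1, even where the base priorities are equal.
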
Array reference safety analysis in the presence of loops with conditional control flow

Patent number: 8327344

Abstract: Mechanisms are provided for analyzing and optimizing loops with conditional control flow in source code based on array reference safety. Mechanisms are provided for analyzing blocks of the source code to identify a conditional control flow loop having loop source code specifying a total access range for an array reference. A safe access range, of the total access range of the array reference in the loop source code, is identified over which a compiler-based optimization of the loop source code can be safely applied without introducing new exception conditions. The compiler-based optimization of the loop source code is performed based on the identified safe access range to generate optimized code. The optimized code is output for generation of executable code for execution on a processor.

Type: Grant

Filed: October 14, 2008

Date of Patent: December 4, 2012

Assignee: International Business Machines Corporation

Inventor: Michael K. Gschwind
Computation table for block computation

Patent number: 8327345

Abstract: In response to receiving pre-processed code, a compiler identifies a code section that is not a candidate for acceleration and identifies a code block specifying an iterated operation that is a candidate for acceleration. In response to identifying the code section, the compiler generates post-processed code containing one or more lower level instructions corresponding to the identified code section, and in response to identifying the code block, the compiler creates and outputs an operation data structure separate from the post-processed code that identifies the iterated operation. The compiler places a block computation command in the post-processed code that invokes processing of the operation data structure to perform the iterated operation and outputs the post-processed code.

Type: Grant

Filed: December 16, 2008

Date of Patent: December 4, 2012

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Balaram Sinharoy
Processor cooling management

Patent number: 8311683

Abstract: Illustrative embodiments provide a computer implemented method, a data processing system, and a computer program product for adjusting cooling settings. The computer implemented method comprises analyzing a set of instructions of an application to determine a number of degrees by which a set of instructions will raise a temperature of at least one processor core. The computer implemented method further calculates a cooling setting for at least one cooling system for the at least one processor core. The computer implemented method adjusts the at least one cooling system based on the cooling setting. The step of analyzing the set of instructions is performed before the set of instructions is executed on the at least one processor core. The step of adjusting the at least one cooling system is performed before the set of instructions is executed on the at least one processor core.

Type: Grant

Filed: April 29, 2009

Date of Patent: November 13, 2012

Assignee: International Business Machines Corporation

Inventors: Robert Lee Angell, David Wayne Cosby, Robert R. Friedlander, James R. Kraemer
Communication scheduling for parallel processing architectures

Patent number: 8291400

Abstract: A system comprises a plurality of computation units interconnected by an interconnection network. A method for configuring the system comprises receiving subsets of instructions corresponding to different portions of a program, each subset assigned to one of the computation units; scheduling instructions in a given subset for execution on the assigned computation unit, including scheduling communication instructions that send to or receive from a different computation unit over the interconnection network; allocating registers in a given computation unit for storing values accessed by instructions in a subset assigned to the given computation unit; and scheduling instructions after allocating registers to account for spills of values stored in allocated register to memory, preserving the order of communication instructions scheduled before allocating registers.

Type: Grant

Filed: February 7, 2008

Date of Patent: October 16, 2012

Assignee: Tilera Corporation

Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
Reconfigurable processing

Patent number: 8281297

Abstract: A method of producing a reconfigurable circuit device for running a computer program of moderate complexity such as multimedia processing. Code for the application is compiled into Control Flow Graphs representing distinct parts of the application to be run. From those Control Flow Graphs are extracted basic blocks. The basic blocks are converted to Data Flow Graphs by a compiler utility. From two or more Data Flow Graphs, a largest common subgraph is determined. The largest common subgraph is ASAP scheduled and substituted back into the Data Flow Graphs which also have been scheduled. The separate Data Flow Graphs containing the scheduled largest common subgraph are converted to data paths that are then combined to form code for operating the application. The largest common subgraph is effected in hardware that is shared among the parts of the application from which the Data Flow Graphs were developed.

Type: Grant

Filed: February 5, 2004

Date of Patent: October 2, 2012

Assignee: Arizona Board of Regents

Inventors: Aravind R. Dasu, Ali Akoglu, Arvind Sudarsanam, Sethuraman Panchanathan
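
The ASAP (as soon as possible) scheduling step mentioned in this abstract can be sketched as follows. The sketch assumes an acyclic Data Flow Graph given as a predecessor map and a latency of one step per operation; the function name and the graph encoding are illustrative, and the common-subgraph search itself is not shown.

```python
def asap_schedule(preds):
    """Map each node to its earliest step, given node -> list of predecessors.

    Assumes the graph is acyclic and every operation takes one step.
    """
    step = {}

    def visit(node):
        if node not in step:
            # A node runs one step after its latest predecessor;
            # nodes with no predecessors run at step 0.
            step[node] = 1 + max((visit(p) for p in preds[node]), default=-1)
        return step[node]

    for node in preds:
        visit(node)
    return step

dfg = {"a": [], "b": [], "mul": ["a", "b"], "add": ["mul", "a"]}
schedule = asap_schedule(dfg)
```

Here `add` lands at step 2 because it must wait for `mul`, even though its other operand `a` is ready at step 0.
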
Profiling of software and circuit designs utilizing data operation analyses

Patent number: 8276135

Abstract: The present invention is a method, system, software and data structure for profiling programs, other code, and adaptive computing integrated circuit architectures, using a plurality of data parameters such as data type, input and output data size, data source and destination locations, data pipeline length, locality of reference, distance of data movement, speed of data movement, data access frequency, number of data load/stores, memory usage, and data persistence. The profiler of the invention accepts a data set as input, and profiles a plurality of functions by measuring a plurality of data parameters for each function, during operation of the plurality of functions with the input data set, to form a plurality of measured data parameters. From the plurality of measured data parameters, the profiler generates a plurality of data parameter comparative results corresponding to the plurality of functions and the input data set.

Type: Grant

Filed: November 7, 2002

Date of Patent: September 25, 2012

Assignee: QST Holdings LLC

Inventor: Paul L. Master
Self-optimizable code for optimizing execution of tasks and allocation of memory in a data processing system

Patent number: 8266606

Abstract: A mechanism is provided for increasing efficiency of tasks by observing the performance of generally equivalent code paths during execution of the task. Embodiments involve a computer system with software, or hard-coded logic that includes reflexive code paths. The reflexive code paths may be identified by a software or hardware designer during the design of the computer system. For that particular computer system, however, one of the code paths may offer better performance characteristics so a monitor collects performance data during execution of the reflexive code paths and a code path selector selects the reflexive code with favorable performance characteristics. One embodiment improves the performance of memory allocation by selectively implementing a tunable, linear, memory allocation module in place of a default memory allocation module.

Type: Grant

Filed: May 13, 2008

Date of Patent: September 11, 2012

Assignee: International Business Machines Corporation

Inventor: Marc Alan Dickenson
Extension of swing modulo scheduling to evenly distribute uniform strongly connected components

Patent number: 8266610

Abstract: A method, apparatus, and computer instructions for scheduling instructions for execution. Identify a series of instructions in a loop, wherein the series of instructions has a cyclic data dependency. Determine whether the series of instructions is a uniform series of instructions. Schedule execution of the uniform series of instructions within the loop to optimize execution of the loop in response to the identified series of instructions being the uniform series of instructions.

Type: Grant

Filed: September 19, 2008

Date of Patent: September 11, 2012

Assignee: International Business Machines Corporation

Inventor: Allan Russell Martin
Single-chip multiprocessor with clock cycle-precise program scheduling of parallel execution

Patent number: 8261250

Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the exchange of synchronization signals, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the parallel execution of the streams is ensured using special operations that set the execution sequence of streams and stream fragments prescribed by the program algorithm.

Type: Grant

Filed: January 10, 2011

Date of Patent: September 4, 2012

Assignee: Elbrus International

Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
Method and system for automated code conversion

Patent number: 8261252

Abstract: A method and system for converting application code into optimized application code or into execution code suitable for execution on a computation engine with an architecture comprising at least a first and a second level of data memory units are disclosed. In one aspect, the method comprises obtaining application code, the application code comprising data transfer operations between the levels of memory units. The method further comprises converting at least a part of the application code. The converting of application code comprises scheduling of data transfer operations from a first level of memory units to a second level of memory units such that accesses of data accessed multiple times are brought closer together in time than in the original code.

Type: Grant

Filed: March 26, 2008

Date of Patent: September 4, 2012

Assignees: IMEC, Katholieke Universiteit Leuven

Inventors: Praveen Raghavan, Murali Jayapala, Francky Catthoor, Absar Javed, Andy Lambrechts
Distributed schemes for deploying an application in a large parallel system

Patent number: 8261249

Abstract: Embodiments of the invention provide a method for deploying and running an application on a massively parallel computer system, while minimizing the costs associated with latency, bandwidth, and limited memory resources. The executable code of a program may be divided into multiple code fragments and distributed to different compute nodes of a parallel computing system. During program execution, one compute node may fetch code fragments from other compute nodes as necessary.

Type: Grant

Filed: January 8, 2008

Date of Patent: September 4, 2012

Assignee: International Business Machines Corporation

Inventors: Charles Jens Archer, Thomas Michael Gooding, Ruth Janine Poole, Albert Sidelnik
Compiling code for parallel processing architectures based on control flow

Patent number: 8250555

Abstract: A system comprises a plurality of computation units interconnected by an interconnection network.

Type: Grant

Filed: February 7, 2008

Date of Patent: August 21, 2012

Assignee: Tilera Corporation

Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
Distributing parallelism for parallel processing architectures

Patent number: 8250556

Abstract: A system comprises a plurality of computation units interconnected by an interconnection network. A method for configuring the system comprises receiving an initial partitioning of instructions into initial subsets corresponding to different portions of a program; forming a refined partitioning of the instructions into refined subsets each including one or more of the initial subsets, including determining whether to combine a first subset and a second subset to form a third subset according to a comparison of a communication cost between the first subset and second subset and a load cost of the third subset that is based at least in part on a number of instructions issued per cycle by a computation unit; and assigning each refined subset of instructions to one of the computation units for execution on the assigned computation unit.

Type: Grant

Filed: February 7, 2008

Date of Patent: August 21, 2012

Assignee: Tilera Corporation

Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
Configuring a dependency graph for dynamic by-pass instruction scheduling

Patent number: 8250557

Abstract: There is disclosed a method and system for configuring a data dependency graph (DDG) to handle instruction scheduling in computer architectures permitting dynamic by-pass execution, and for performing dynamic by-pass scheduling utilizing such a configured DDG. In accordance with an embodiment of the invention, a heuristic function is used to obtain a ranking of nodes in the DDG after setting delays at all identified by-pass pairs of nodes in the DDG to 0. From among a list of identified by-pass pairs of nodes, a node that is identified as being the least important to schedule early is marked as “bonded” to its successor, and the corresponding delay for that identified node is set to 0. Node rankings are re-computed and the bonded by-pass pair of nodes are scheduled in consecutive execution cycles with a delay of 0 to increase the likelihood that a by-pass can be successfully taken during run-time execution.

Type: Grant

Filed: May 7, 2008

Date of Patent: August 21, 2012

Assignee: International Business Machines Corporation

Inventors: Marcel Mitran, Alexander Vasilevskiy
Optimizing Libraries for Validating C++ Programs Using Symbolic Execution

Publication number: 20120192169

Abstract: Particular embodiments optimize a C++ function comprising one or more loops for symbolic execution, comprising for each loop, if there is a branching condition within the loop, then rewrite the loop to move the branching condition outside the loop. Particular embodiments may further optimize the C++ function through simplified symbolic expressions and adding constructs forcing delayed interpretation of symbolic expressions during the symbolic execution.

Type: Application

Filed: January 20, 2011

Publication date: July 26, 2012

Applicant: FUJITSU LIMITED

Inventors: Guodong Li, Sreeranga P. Rajan, Indradeep Ghosh
Method and system for implementing parallel execution in a computing system and in a circuit simulator

Patent number: 8224636

Abstract: A method and mechanism for implementing a general purpose scripting language that supports parallel execution is described. In one approach, parallel execution is provided in a seamless and high-level approach rather than requiring or expecting a user to have low-level programming expertise with parallel processing languages/functions. Also described is a system and method for performing circuit simulation. The present approach provides methods and systems that create reusable and independent measurements for use with circuit simulators. Also disclosed are parallelizable measurements having looping constructs that can be run without interference between parallel iterations. Reusability is enhanced by having parameterized measurements. Revisions and history of the operating parameters of circuit designs subject to simulation are tracked.

Type: Grant

Filed: December 17, 2003

Date of Patent: July 17, 2012

Assignee: Cadence Design Systems, Inc.

Inventor: Kenneth S. Kundert
Sparse vectorization without hardware gather/scatter

Patent number: 8191056

Abstract: A target operation in a normalized target loop, susceptible of vectorization and which may, after compilation into a vectorized form, seek to operate on data in nonconsecutive physical memory, is identified in source code. Hardware instructions are inserted into executable code generated from the source code, directing a system that will run the executable code to create a representation of the data in consecutive physical memory. A vector loop containing the target operation is replaced, in the executable code, with a function call to a vector library to call a vector function that will operate on the representation to generate a result identical to output expected from executing the vector loop containing the target operation. On execution, a representation of data residing in nonconsecutive physical memory is created in consecutive physical memory, and the vectorized target operation is applied to the representation to process the data.

Type: Grant

Filed: October 13, 2006

Date of Patent: May 29, 2012

Assignee: International Business Machines Corporation

Inventors: Roch Georges Archambault, George Chochia, Peng Zhao
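
The pack-then-compute idea in this abstract can be sketched as below. Plain Python lists stand in for physical memory, `vector_scale` is a hypothetical stand-in for a vector library function, and copying the results back to their original positions is an assumption added so the sketch produces a complete result.

```python
def vector_scale(buf, factor):
    # Stand-in for a vector library routine that expects contiguous input.
    return [x * factor for x in buf]

def sparse_apply(data, indices, factor):
    """Scale data[i] for each i in indices by way of a contiguous copy."""
    packed = [data[i] for i in indices]    # contiguous representation
    result = vector_scale(packed, factor)  # vector function on the copy
    out = list(data)
    for i, value in zip(indices, result):  # assumed write-back step
        out[i] = value
    return out

scaled = sparse_apply([1, 2, 3, 4, 5], [0, 2, 4], 10)
```

The vector routine never sees the strided layout, so no hardware gather or scatter is needed for the computation itself.
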
Systems, methods, and computer products for compiler support for aggressive safe load speculation

Patent number: 8191057

Abstract: Systems, methods and computer products for compiler support for aggressive safe load speculation. Exemplary embodiments include a method for aggressive safe load speculation for a compiler in a computer system, the method including building a control flow graph, identifying both countable and non-countable loops, gathering a set of candidate loops for load speculation, and for each candidate loop in the set of candidate loops gathered for load speculation, computing an estimate of the iteration count, delay cycles, and code size, performing a profitability analysis and determining an unroll factor based on the delay cycles and the code size, transforming the loop by generating a prologue loop to achieve data alignment and an unrolled main loop with loop directives, indicating which loads can safely be executed speculatively and performing low-level instruction scheduling on the generated unrolled main loop.

Type: Grant

Filed: August 27, 2007

Date of Patent: May 29, 2012

Assignee: International Business Machines Corporation

Inventors: Roch G. Archambault, Geoffrey O. Blandy, Roland Froese, Yaoqing Gao, Liangxiao Hu, James L. McInnes, Raul E. Silvera
Efficient code generation using loop peeling for SIMD loop code with multiple misaligned statements

Patent number: 8171464

Abstract: An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.

Type: Grant

Filed: May 16, 2008

Date of Patent: May 1, 2012

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
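
The split into prologue, steady-state body, and epilogue can be sketched for the simplest case. This is plain index-based peeling for a single array reference, not the data reorganization graph or vector-splicing approach of the patent; the `peel` function and its half-open range convention are assumptions made for illustration.

```python
def peel(start, count, width):
    """Split iterations [start, start + count) around aligned boundaries.

    Returns three half-open ranges: prologue, aligned body, epilogue.
    width is the number of elements per SIMD vector.
    """
    end = start + count
    # First aligned index at or after start, capped at end.
    body_lo = min(end, -(-start // width) * width)
    # Last aligned index at or before end, never below body_lo.
    body_hi = max(body_lo, (end // width) * width)
    return (start, body_lo), (body_lo, body_hi), (body_hi, end)

prologue, body, epilogue = peel(3, 14, 4)
```

Only the two short peeled ranges need special handling; the body covers whole aligned vectors.
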
Systems and methods for caching compute kernels for an application running on a parallel-processing computer system

Patent number: 8146066

Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.

Type: Grant

Filed: March 5, 2007

Date of Patent: March 27, 2012

Assignee: Google Inc.

Inventors: Christopher G. Demetriou, Matthew N. Papakipos
Compiler with flexible scheduling

Patent number: 8141068

Abstract: A computer program consisting of a compiler for compiling source code programs into executable code. The compiler is suited to achieving high efficiency on a processor that can process many instructions at once but the instructions have dependency constraints and the processor has no internal mechanism for dealing with these constraints, such as the Itanium class of processors. As each instruction is considered for addition to a group of instructions for a single cycle, dependencies are checked to determine whether the entire group can be scheduled in any possible order. Once all the instructions of the group have been selected, the instructions are then reordered for placement in a reservation table. For implementation in the Itanium class of processors, detailed requirements of the processor are accommodated with a structure that can be adjusted for any processor in the class. The structure can also be adjusted for other classes of processors.

Type: Grant

Filed: June 18, 2002

Date of Patent: March 20, 2012

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Carol L. Thompson
Systems and methods for compiling an application for a parallel-processing computer system

Patent number: 8136102

Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.

Type: Grant

Filed: March 5, 2007

Date of Patent: March 13, 2012

Assignee: Google Inc.

Inventors: Matthew N. Papakipos, Brian K. Grant, Christopher G. Demetriou, Morgan S. McGuire
Systems and methods for determining compute kernels for an application in a parallel-processing computer system

Patent number: 8136104

Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.

Type: Grant

Filed: March 5, 2007

Date of Patent: March 13, 2012

Assignee: Google Inc.

Inventors: Matthew N. Papakipos, Brian K. Grant, Morgan S. McGuire, Christopher G. Demetriou
Method and apparatus for automatic second-order predictive commoning

Patent number: 8132163

Abstract: A method and apparatus for automatic second-order predictive commoning is provided by the present invention. During an analysis phase, the intermediate representation of a program code is analyzed to identify opportunities for second-order predictive commoning optimization. The analyzed information is used by the present invention to apply transformations to the program code, such that the number of memory accesses and the number of computations are reduced for loop iterations and performance of program code is improved.

Type: Grant

Filed: January 30, 2009

Date of Patent: March 6, 2012

Assignee: International Business Machines Corporation

Inventors: Arie Tal, Dina Tal
Mechanism to restrict parallelization of loops

Patent number: 8104030

Abstract: A computer implemented method, computer usable program code, and a system for parallelizing a loop. A parameter that will be used to limit parallelization of the loop is identified. The parameter specifies a minimum number of loop iterations that a thread should execute. The parameter can be adjusted based on a parallel performance factor. A parallel performance factor is a factor that influences the performance of parallel code. A number of threads from a plurality of threads is selected for processing iterations of the loop based on the parameter. The number of threads is selected prior to execution of the first iteration of the loop.

Type: Grant

Filed: December 21, 2005

Date of Patent: January 24, 2012

Assignee: International Business Machines Corporation

Inventors: Raul Esteban Silvera, Priya Unnikrishnan, Guansong Zhang
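
The thread selection described in this abstract can be sketched with simple integer arithmetic. The function name, the argument names, and the rule of always keeping at least one thread are assumptions; the abstract only states that the parameter is a minimum number of iterations per thread and that the choice is made before the first iteration runs.

```python
def select_thread_count(iterations, min_iters_per_thread, available_threads):
    """Pick how many threads to use before the loop starts executing."""
    if min_iters_per_thread <= 0:
        raise ValueError("min_iters_per_thread must be positive")
    # Most threads that can each receive the minimum amount of work.
    usable = iterations // min_iters_per_thread
    # Never exceed the pool, and always keep at least one thread.
    return max(1, min(available_threads, usable))

few = select_thread_count(50, 100, 16)
some = select_thread_count(1000, 100, 16)
capped = select_thread_count(10000, 100, 8)
```

A short loop therefore runs on a single thread, which avoids paying thread startup cost for too little work.
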
System and method for instruction latency reduction in graphics processing

Patent number: 8098251

Abstract: A system, method and apparatus are disclosed, in which an instruction scheduler of a compiler, e.g., a shader compiler, reduces instruction latency based on a determined instruction distance between dependent predecessor and successor instructions.

Type: Grant

Filed: February 22, 2008

Date of Patent: January 17, 2012

Assignee: QUALCOMM Incorporated

Inventor: Lin Chen
Implementing strong atomicity in software transactional memory

Patent number: 8099726

Abstract: A software transactional memory system is described which utilizes decomposed software transactional memory instructions as well as runtime optimizations to achieve efficient performance. The decomposed instructions allow a compiler with knowledge of the instruction semantics to perform optimizations which would be unavailable on traditional software transactional memory systems. Additionally, high-level software transactional memory optimizations are performed such as code movement around procedure calls, addition of operations to provide strong atomicity, removal of unnecessary read-to-update upgrades, and removal of operations for newly-allocated objects. During execution, multi-use header words for objects are extended to provide for per-object housekeeping, as well as fast snapshots which illustrate changes to objects. Additionally, entries to software transactional memory logs are filtered using an associative table during execution, preventing needless writes to the logs.

Type: Grant

Filed: March 23, 2006

Date of Patent: January 17, 2012

Assignee: Microsoft Corporation

Inventor: Timothy Lawrence Harris
Implementing shadow versioning to improve data dependence analysis for instruction scheduling

Patent number: 8091079

Abstract: A method for implementing shadow versioning to improve data dependence analysis for instruction scheduling in compiling code. The method includes identifying loops within the code to be compiled and, for each loop, initializing a dependence matrix, identifying shadow symbols that are accessed by the loop, examining dependencies, storing, comparing and classifying the dependence vectors, generating new shadow symbols, replacing the old shadow symbols with the new shadow symbols, generating alias relationships between the newly created shadow symbols, scheduling instructions and compiling the code.

Type: Grant

Filed: August 29, 2007

Date of Patent: January 3, 2012

Assignee: International Business Machines Corporation

Inventors: Roch G. Archambault, Yaoqing Gao, Raul E. Silvera, Peng Zhao
Method and apparatus for enabling optimistic program execution

Patent number: 8065670

Abstract: A system that reduces overly optimistic program execution. During operation, the system encounters a bounded-execution block while executing a program, wherein the bounded execution block includes a primary path and a secondary path. Next, the system executes the bounded execution block. After executing the bounded execution block, the system determines whether executing instructions on the primary path is preferable to executing instructions on the secondary path based on information gathered while executing the bounded-execution block. If not, the system dynamically modifies the instructions of the bounded-execution block so that during subsequent passes through the bounded-execution block, the instructions on the secondary path are executed instead of the instructions on the primary path.

Type: Grant

Filed: October 3, 2006

Date of Patent: November 22, 2011

Assignee: Oracle America, Inc.

Inventors: Tycho G. Nightingale, Wayne Mesard
Methods, systems, and computer products for evaluating robustness of a list scheduling framework

Patent number: 8042100

Abstract: Systems, methods, and computer products for evaluating robustness of a list scheduling framework. Exemplary embodiments include a method for evaluating the robustness of a list scheduling framework, the method including identifying a set of compiler benchmarks known to be sensitive to an instruction scheduler, running the set of benchmarks against a heuristic under test, H, and collecting an execution time Exec(H[G]), where G is a directed acyclic graph, running the set of benchmarks against a plurality of random heuristics Hrand[G]i and collecting a plurality of respective execution times Exec(Hrand[G])i, computing a robustness of the list scheduling framework, and checking the robustness against a pre-determined threshold.

Type: Grant

Filed: August 27, 2007

Date of Patent: October 18, 2011

Assignee: International Business Machines Corporation

Inventors: Marcel Mitran, Joran S. C. Siu, Alexander Vasilevskiy
Run-Time parallelization of loops in computer programs using bit vectors

Patent number: 8028281

Abstract: Parallelization of loops is performed for loops having indirect loop index variables and embedded conditional statements in the loop body. Loops having any finite number of array variables in the loop body, and any finite number of indirect loop index variables can be parallelized. There are two particular limitations of the described techniques: (i) that there are no cross-iteration dependencies in the loop other than through the indirect loop index variables; and (ii) that the loop index variables (either direct or indirect) are not redefined in the loop body.

Type: Grant

Filed: January 5, 2007

Date of Patent: September 27, 2011

Assignee: International Business Machines Corporation

Inventor: Rajendra K. Bera
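
As a rough illustration of how bit vectors can support run-time parallelization when iterations interact only through an indirect index, the sketch below groups iterations into "waves" that could run concurrently. This is a generic inspector-style scheme written under stated assumptions, not the method claimed in the patent; `partition_into_waves` and the wave representation are hypothetical.

```python
def partition_into_waves(index):
    """Group loop iterations into waves with no repeated indirect index.

    `index[i]` is the array element touched by iteration i. Two iterations
    conflict only if they touch the same element (the assumption made here),
    and conflicting iterations keep their original relative order.
    """
    wave_bits = []   # wave_bits[w]: Python int used as a bit vector of touched elements
    waves = []       # waves[w]: iteration numbers assigned to wave w
    for i, target in enumerate(index):
        bit = 1 << target
        # Walk back from the last wave to the latest one that touches `target`.
        w = len(wave_bits)
        while w > 0 and not (wave_bits[w - 1] & bit):
            w -= 1
        # w is now the first wave in which iteration i can safely run.
        if w == len(wave_bits):
            wave_bits.append(0)
            waves.append([])
        wave_bits[w] |= bit
        waves[w].append(i)
    return waves
```

Iterations within one wave touch distinct elements, so they could be dispatched in parallel; the waves themselves run one after another.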
Hardware accelerator

Patent number: 8020142

Abstract: A method for instruction processing may include adding a first operand from a first register, a second operand from a second register and a carry input bit to generate a sum and a carry out bit, loading the sum into a third register and loading the carry out bit into a most significant bit position of the third register to generate a third operand, performing a single bit shift on the third operand via a shifter unit to produce a shifted operand and loading the shifted operand into the fourth register, loading a least significant bit from the sum into the most significant bit position of the fourth register to generate a fourth operand, generating a greatest common divisor (GCD) of the first and second operands via the fourth operand and generating a public key based on, at least in part, the GCD. Many alternatives, variations and modifications are possible.

Type: Grant

Filed: December 14, 2006

Date of Patent: September 13, 2011

Assignee: Intel Corporation

Inventors: Gilbert M. Wolrich, William Hasenplaugh, Wajdi Feghali, Daniel Cutter, Vinodh Gopal, Gunnar Gaubatz
Method for the translation of programs for reconfigurable architectures

Patent number: 7996827

Abstract: A method for advantageously translating high-level language codes for data processing using a reconfigurable architecture, memories addressable internally from within said reconfigurable architecture, and memories external to said reconfigurable architecture, may include constructing a finite automaton for computation in such a way that a complex combinatory network of individual functions is formed, assigning memories to the network for storage of operands and results, and separating external memory accesses for providing a transfer of at least one of operands and results as data from an external memory to a memory addressable internally by the reconfigurable architecture.

Type: Grant

Filed: August 16, 2002

Date of Patent: August 9, 2011

Inventors: Martin Vorbach, May Frank, Armin Nückel
Compiler device, method, program and recording medium

Patent number: 7979853

Abstract: A compiler device optimizes a program by changing the order in which instructions are executed.

Type: Grant

Filed: January 24, 2008

Date of Patent: July 12, 2011

Assignee: International Business Machines Corporation

Inventors: Motohiro Kawahito, Hideaki Komatsu
Scheduling technique for software pipelining

Patent number: 7962907

Abstract: An improved scheduling technique for software pipelining is disclosed which is designed to find schedules requiring fewer processor clock cycles and reduce register pressure hot spots when scheduling multiple groups of instructions (e.g. as represented by multiple sub-graphs of a DDG) which are independent, and substantially identical. The improvement in instruction scheduling and reduction of hot spots is achieved by evenly distributing such groups of instructions around the schedule for a given loop.

Type: Grant

Filed: August 17, 2007

Date of Patent: June 14, 2011

Assignee: International Business Machines Corporation

Inventors: Allan Russell Martin, James Lawrence McInnes
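
The idea of distributing identical, independent instruction groups evenly around a loop schedule can be sketched as follows. The sketch assumes a modulo schedule with an initiation interval of `ii` cycles; the function names and the simple integer spacing rule are illustrative assumptions, not details from the patent.

```python
def spread_group_offsets(num_groups, ii):
    """Return a start cycle (modulo the initiation interval) for each group."""
    if num_groups <= 0 or ii <= 0:
        raise ValueError("num_groups and ii must be positive")
    # Integer arithmetic keeps offsets in [0, ii) and as evenly spaced as possible.
    return [(g * ii) // num_groups for g in range(num_groups)]


def slot_usage(offsets, group_cycles, ii):
    """Count how many instructions land in each cycle of the modulo schedule."""
    usage = [0] * ii
    for start in offsets:
        for c in group_cycles:      # c: cycle of an instruction relative to its group start
            usage[(start + c) % ii] += 1
    return usage
```

With four two-cycle groups and `ii = 8`, starting every group at cycle 0 piles all instructions into two cycles, while the evenly spread offsets place one instruction in each cycle, which is the kind of hot-spot reduction the abstract describes.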
Scheduling technique for software pipelining

Patent number: 7930688

Abstract: An improved scheduling technique for software pipelining is disclosed which is designed to find schedules requiring fewer processor clock cycles and reduce register pressure hot spots when scheduling multiple groups of instructions (e.g. as represented by multiple sub-graphs of a DDG) which are independent, and substantially identical. The improvement in instruction scheduling and reduction of hot spots is achieved by evenly distributing such groups of instructions around the schedule for a given loop.

Type: Grant

Filed: January 3, 2008

Date of Patent: April 19, 2011

Assignee: International Business Machines Corporation

Inventors: Allen Russell Martin, James Lawrence McInnes
Single-chip multiprocessor with clock cycle-precise program scheduling of parallel execution

Patent number: 7895587

Abstract: A single-chip multiprocessor system, and a method of operating it, based on static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and for access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing synchronization signal exchange, data exchange at the register file level, and access to the store addresses and data of other processors. The single-chip multiprocessor system uses instruction-level parallelism (ILP) to increase performance. Synchronization of the parallel execution of streams is ensured by special operations that set the order in which streams and stream fragments execute, as prescribed by the program algorithm.

Type: Grant

Filed: September 8, 2006

Date of Patent: February 22, 2011

Assignee: Elbrus International

Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
Context based event handling and execution with prioritization and interrupt management

Patent number: 7865887

Abstract: Embodiments of the present invention provide improved event handling between systems. In one embodiment, the present invention includes a software event handling method comprising receiving a first event from a first source system in a plurality of source systems, the first event including event information, accessing context information corresponding to the first event, generating a second event based on at least a portion of the event information and context information using one or more rules, assigning a priority to the second event, and sending the second event to a first target system in a plurality of target systems.

Type: Grant

Filed: November 30, 2006

Date of Patent: January 4, 2011

Assignee: SAP AG

Inventors: Matthias U. Kaiser, Keith S. Klemba, Shuyuan Chen, Frankie James
Using transactional memory for precise exception handling in aggressive dynamic binary optimizations

Patent number: 7865885

Abstract: Dynamic optimization of application code is performed by selecting a portion of the application code as a possible transaction. A transaction has a property that when it is executed, it is either atomically committed or atomically aborted. Determining whether to convert the selected portion of the application code to a transaction includes determining whether to apply at least one of a group of code optimizations to the portion of the application code. If it is determined to apply at least one of the code optimizations of the group of optimizations to the portion of application code, then the optimization is applied to the portion of the code and the portion of the code is converted to a transaction.

Type: Grant

Filed: September 27, 2006

Date of Patent: January 4, 2011

Assignee: Intel Corporation

Inventors: Youfeng Wu, Cheng Wang, Ho-seop Kim
Program conversion device and method

Patent number: 7856625

Abstract: A program conversion device for converting a program source is provided. The program conversion device comprises: a section and index acquisition device for acquiring a section code for indicating a section embedded in the program and performance index information embedded in the program in association with the section code; a task code conversion device for separating the acquired section code into task codes and adding a code to indicate the beginning of the task and a code to indicate the end of the task; and a task index attachment device for attaching a performance index, to input to the scheduler, to the task.

Type: Grant

Filed: October 21, 2005

Date of Patent: December 21, 2010

Assignee: Panasonic Corporation

Inventor: Kunihiko Hayashi
Compiler apparatus

Patent number: 7856629

Abstract: A compiler apparatus, which can perform software pipelining optimization that has a considerable effect of reducing the number of execution cycles taken to complete a loop process, converts a source program into a machine program for a processor which is capable of parallel processing. The compiler apparatus is composed of: a parsing unit operable to parse the source program and then to convert the source program into an intermediate program which is described in an intermediate language; an optimization unit operable to optimize the intermediate program; and a conversion unit operable to convert the optimized intermediate program into the machine language program, wherein the optimization unit is operable to execute software pipelining, by inserting a transfer instruction, which is used for transferring data between operands, into a loop process included in the intermediate program so that a data dependence relation is changed.

Type: Grant

Filed: May 24, 2006

Date of Patent: December 21, 2010

Assignee: Panasonic Corporation

Inventors: Shohei Michimoto, Taketo Heishi, Hajime Ogawa, Teruo Kawabata
Speculative multi-threading for instruction prefetch and/or trace pre-build

Patent number: 7814469

Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is a speculative prefetch thread to perform instruction prefetch and/or trace pre-build for the main thread.

Type: Grant

Filed: April 24, 2003

Date of Patent: October 12, 2010

Assignee: Intel Corporation

Inventors: Hong Wang, Tor M. Aamodt, Pedro Marcuello, Jared W. Stark, IV, John P. Shen, Antonio González, Per Hammarlund, Gerolf F. Hoflehner, Perry H. Wang, Steve Shih-wei Liao
Processors and compiling methods for processors

Publication number: 20100251229

Abstract: A compiling method compiles an object program to be executed by a processor having a plurality of execution units operable in parallel. In the method a first availability chain is created from a producer instruction (p1), scheduled for execution by a first one of the execution units (20: AGU), to a first consumer instruction (c1), scheduled for execution by a second one of the execution units (22: EXU) and requiring a value produced by the said producer instruction. The first availability chain comprises at least one move instruction (mv1-mv3) for moving the required value from a first point (20: ARF) accessible by the first execution unit to a second point (22: DRF) accessible by the second execution unit.

Type: Application

Filed: June 9, 2010

Publication date: September 30, 2010

Applicant: Altera Corporation

Inventors: Marcio Merino Fernandes, Raymond Malcolm Livesley
Estimating a dominant resource used by a computer program

Patent number: 7797692

Abstract: A system that estimates a dominant computational resource which is used by a computer program. During operation, for each basic block in the computer program, the system determines a nesting level for the basic block. Next, the system selects basic blocks with nesting levels greater than a specified threshold. For each selected basic block, the system analyzes the basic block to estimate the dominant computational resource used by the basic block. The system then uses the estimated dominant computational resources for the selected basic blocks to estimate the dominant computational resource for the computer program.

Type: Grant

Filed: May 12, 2006

Date of Patent: September 14, 2010

Assignee: Google Inc.

Inventor: Grzegorz J. Czajkowski
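
The estimation steps in the abstract above (select deeply nested basic blocks, estimate each block's dominant resource, then combine the estimates) can be sketched as below. The block representation, the resource tags, and the use of a simple majority vote at both levels are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def dominant_resource(blocks, threshold):
    """Estimate a program's dominant resource from its deeply nested basic blocks.

    `blocks` is a list of dicts with 'nesting' (int) and 'ops' (a list of
    resource tags such as "cpu", "mem", "io"); both keys are hypothetical.
    """
    votes = Counter()
    for block in blocks:
        if block["nesting"] <= threshold:
            continue                    # keep only blocks nested deeper than the threshold
        per_block = Counter(block["ops"])
        if per_block:
            # Assumption: a block's dominant resource is its most frequent tag.
            votes[per_block.most_common(1)[0][0]] += 1
    # Assumption: the program's dominant resource is the majority across selected blocks.
    return votes.most_common(1)[0][0] if votes else None
```

For example, with a threshold of 1, a shallow memory-heavy block is ignored and the result is decided by the blocks inside loops.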
Computer code partitioning for enhanced performance

Patent number: 7788658

Abstract: A method and system for enhancing the execution performance of program code. An analysis of the program code is used to generate code usage information for each code module. For each module, the code usage information is used to determine whether the code module should be separated from its original module container. If so, the code module is migrated to a new module container, and the code module in the original module container is replaced with a reference to the code module in the new module container.

Type: Grant

Filed: May 31, 2006

Date of Patent: August 31, 2010

Assignee: International Business Machines Corporation

Inventors: Taimur Javed, Philip Loats, William J. Tracey, II, David A. Wood, III
Method and system for performing reassociation in software loops

Patent number: 7774766

Abstract: Various embodiments of the present invention relate to methods and systems for optimizing an intermediate code in a compilation logic. The intermediate code is optimized by performing reassociation in software loops. The intermediate code includes at least one critical recurrence cycle. The performance of reassociation in software loops can reduce a critical recurrence cycle in them, which can speed up their execution. The subject method can include the determination of one or more critical recurrence cycles in a software loop. The method can also include the determination of at least one edge in a critical recurrence cycle, with respect to which reassociation can be performed, if one or more pre-determined criteria are met. The method can further include performing reassociation of a dependee and a dependent of an edge. In an embodiment, when one or more pre-determined criteria are met, the logic of the software loop is maintained after performing reassociation of the dependee and the dependent of the edge.

Type: Grant

Filed: September 29, 2005

Date of Patent: August 10, 2010

Assignee: Intel Corporation

Inventors: Kalyan Muthukumar, Daniel M Lavery
