Including Scheduling Instructions Patents (Class 717/161)
  • Patent number: 8341609
    Abstract: A computer is programmed to automatically identify multiple sequences of executable code such that each sequence fits within a page of memory. When the executable code comprising several sequences is loaded into the paged memory, each sequence is placed in its own page. The computer is further programmed to prepare a number of structures which identify a corresponding number of instructions that transfer control between sequences. Each structure identifies at least a control transfer instruction in one sequence and a target in another sequence. When loading the sequences into memory, the structures are used to replace destination addresses of control transfers between sequences with new addresses derived from base addresses of pages that have been allocated in memory to hold the sequences.
    Type: Grant
    Filed: January 26, 2007
    Date of Patent: December 25, 2012
    Assignee: Oracle International Corporation
    Inventors: Robert H. Lee, David Unietis, Mark Jungerman
  • Patent number: 8336040
    Abstract: A method for job management in an HPC environment includes determining an unallocated subset from a plurality of HPC nodes, with each of the unallocated HPC nodes comprising an integrated fabric. An HPC job is selected from a job queue and executed using at least a portion of the unallocated subset of nodes.
    Type: Grant
    Filed: April 15, 2004
    Date of Patent: December 18, 2012
    Assignee: Raytheon Company
    Inventors: Shannon V. Davidson, Anthony N. Richoux
  • Patent number: 8336031
    Abstract: A method and system of performing thread scheduling. At least some of the illustrative embodiments are computer-readable mediums storing a program that, when executed by a processor of a host system, causes the processor to instantiate a CPU object that represents a processor abstraction, create a CPU context object that represents a thread abstraction (wherein the CPU context object is associated to a method, and wherein the CPU context object is mapped onto the CPU object), and execute the method within the CPU object.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: December 18, 2012
    Assignee: Texas Instruments Incorporated
    Inventors: Gilbert Cabillic, Jean-Philippe Lesot
  • Patent number: 8336041
    Abstract: A compiler allocates an unroll_group_number conferred based on a sequence in which a loop body is replicated by loop unrolling to each loop body during loop unrolling based on the optimized number of loop unrolling. The allocated unroll_group_number is added to each instruction included in each loop body. A priority of an instruction is adjusted based on the allocated unroll_group_number during instruction scheduling.
    Type: Grant
    Filed: September 11, 2009
    Date of Patent: December 18, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Byung-chang Cha
  • Patent number: 8327344
    Abstract: Mechanisms are provided for analyzing and optimizing loops with conditional control flow in source code based on array reference safety. Mechanisms are provided for analyzing blocks of the source code to identify a conditional control flow loop having loop source code specifying a total access range for an array reference. A safe access range, of the total access range of the array reference in the loop source code, is identified over which a compiler-based optimization of the loop source code can be safely applied without introducing new exception conditions. The compiler-based optimization of the loop source code is performed based on the identified safe access range to generate optimized code. The optimized code is output for generation of executable code for execution on a processor.
    Type: Grant
    Filed: October 14, 2008
    Date of Patent: December 4, 2012
    Assignee: International Business Machines Corporation
    Inventor: Michael K. Gschwind
  • Patent number: 8327345
    Abstract: In response to receiving pre-processed code, a compiler identifies a code section that is not candidate for acceleration and identifying a code block specifying an iterated operation that is a candidate for acceleration. In response to identifying the code section, the compiler generates post-processed code containing one or more lower level instructions corresponding to the identified code section, and in response to identifying the code block, the compiler creates and outputs an operation data structure separate from the post-processed code that identifies the iterated operation. The compiler places a block computation command in the post-processed code that invokes processing of the operation data structure to perform the iterated operation and outputs the post-processed code.
    Type: Grant
    Filed: December 16, 2008
    Date of Patent: December 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Balaram Sinharoy
  • Patent number: 8311683
    Abstract: Illustrative embodiments provide a computer implemented method, a data processing system, and a computer program product for adjusting cooling settings. The computer implemented method comprises analyzing a set of instructions of an application to determine a number of degrees by which a set of instructions will raise a temperature of at least one processor core. The computer implemented method further calculates a cooling setting for at least one cooling system for the at least one processor core. The computer implemented method adjusts the at least one cooling system based on the cooling setting. The step of analyzing the set of instructions is performed before the set of instructions is executed on the at least one processor core. The step of adjusting the at least one cooling system is performed before the set of instructions is executed on the at least one processor core.
    Type: Grant
    Filed: April 29, 2009
    Date of Patent: November 13, 2012
    Assignee: International Business Machines Corporation
    Inventors: Robert Lee Angell, David Wayne Cosby, Robert R. Friedlander, James R. Kraemer
  • Patent number: 8291400
    Abstract: A system comprises a plurality of computation units interconnected by an interconnection network. A method for configuring the system comprises receiving subsets of instructions corresponding to different portions of a program, each subset assigned to one of the computation units; scheduling instructions in a given subset for execution on the assigned computation unit, including scheduling communication instructions that send to or receive from a different computation unit over the interconnection network; allocating registers in a given computation unit for storing values accessed by instructions in a subset assigned to the given computation unit; and scheduling instructions after allocating registers to account for spills of values stored in allocated register to memory, preserving the order of communication instructions scheduled before allocating registers.
    Type: Grant
    Filed: February 7, 2008
    Date of Patent: October 16, 2012
    Assignee: Tilera Corporation
    Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
  • Patent number: 8281297
    Abstract: A method of producing a reconfigurable circuit device for running a computer program of moderate complexity such as multimedia processing. Code for the application is compiled into Control Flow Graphs representing distinct parts of the application to be run. From those Control Flow Graphs are extracted basic blocks. The basic blocks are converted to Data Flow Graphs by a compiler utility. From two or more Data Flow Graphs, a largest common subgraph is determined. The largest common subgraph is ASAP scheduled and substituted back into the Data Flow Graphs which also have been scheduled. The separate Data Flow Graphs containing the scheduled largest common subgraph are converted to data paths that are then combined to form code for operating the application. The largest common subgraph is effected in hardware that is shared among the parts of the application from which the Data Flow Graphs were developed.
    Type: Grant
    Filed: February 5, 2004
    Date of Patent: October 2, 2012
    Assignee: Arizona Board of Regents
    Inventors: Aravind R. Dasu, Ali Akoglu, Arvind Sudarsanam, Sethuraman Panchanathan
  • Patent number: 8276135
    Abstract: The present invention is a method, system, software and data structure for profiling programs, other code, and adaptive computing integrated circuit architectures, using a plurality of data parameters such as data type, input and output data size, data source and destination locations, data pipeline length, locality of reference, distance of data movement, speed of data movement, data access frequency, number of data load/stores, memory usage, and data persistence. The profiler of the invention accepts a data set as input, and profiles a plurality of functions by measuring a plurality of data parameters for each function, during operation of the plurality of functions with the input data set, to form a plurality of measured data parameters. From the plurality of measured data parameters, the profiler generates a plurality of data parameter comparative results corresponding to the plurality of functions and the input data set.
    Type: Grant
    Filed: November 7, 2002
    Date of Patent: September 25, 2012
    Assignee: QST Holdings LLC
    Inventor: Paul L. Master
  • Patent number: 8266606
    Abstract: A mechanism is provided for increasing efficiency of tasks by observing the performance of generally equivalent code paths during execution of the task are disclosed. Embodiments involve a computer system with software, or hard-coded logic that includes reflexive code paths. The reflexive code paths may be identified by a software or hardware designer during the design of the computer system. For that particular computer system, however, one of the code paths may offer better performance characteristics so a monitor collects performance data during execution of the reflexive code paths and a code path selector selects the reflexive code with favorable performance characteristics. One embodiment improves the performance of memory allocation by selectively implementing a tunable, linear, memory allocation module in place of a default memory allocation module.
    Type: Grant
    Filed: May 13, 2008
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventor: Marc Alan Dickenson
  • Patent number: 8266610
    Abstract: A method, apparatus, and computer instructions for scheduling instructions for execution. Identify a series of instructions in a loop, wherein the series of instructions has a cyclic data dependency. Determine whether the series of instructions is a uniform series of instructions. Schedule execution of the uniform series of instructions within the loop to optimize execution of the loop in response to the identified series of instructions being the uniform series of instructions.
    Type: Grant
    Filed: September 19, 2008
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventor: Allan Russell Martin
  • Patent number: 8261250
    Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.
    Type: Grant
    Filed: January 10, 2011
    Date of Patent: September 4, 2012
    Assignee: Elbrus International
    Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
  • Patent number: 8261252
    Abstract: A method and system for converting application code into optimized application code or into execution code suitable for execution on a computation engine with an architecture comprising at least a first and a second level of data memory units are disclosed. In one aspect, the method comprises obtaining application code, the application code comprising data transfer operations between the levels of memory units. The method further comprises converting at least a part of the application code. The converting of application code comprises scheduling of data transfer operations from a first level of memory units to a second level of memory units such that accesses of data accessed multiple times are brought closer together in time than in the original code.
    Type: Grant
    Filed: March 26, 2008
    Date of Patent: September 4, 2012
    Assignees: IMEC, Katholieke Universiteit Leuven
    Inventors: Praveen Raghavan, Murali Jayapala, Francky Catthoor, Absar Javed, Andy Lambrechts
  • Patent number: 8261249
    Abstract: Embodiments of the invention provide a method for deploying and running an application on a massively parallel computer system, while minimizing the costs associated with latency, bandwidth, and limited memory resources. The executable code of a program may be divided into multiple code fragments and distributed to different compute nodes of a parallel computing system. During program execution, one compute node may fetch code fragments from other compute nodes as necessary.
    Type: Grant
    Filed: January 8, 2008
    Date of Patent: September 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Charles Jens Archer, Thomas Michael Gooding, Ruth Janine Poole, Albert Sidelnik
  • Patent number: 8250555
    Abstract: A system comprises a plurality of computation units interconnected by an interconnection network.
    Type: Grant
    Filed: February 7, 2008
    Date of Patent: August 21, 2012
    Assignee: Tilera Corporation
    Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
  • Patent number: 8250556
    Abstract: A system comprises a plurality of computation units interconnected by an interconnection network. A method for configuring the system comprises receiving an initial partitioning of instructions into initial subsets corresponding to different portions of a program; forming a refined partitioning of the instructions into refined subsets each including one or more of the initial subsets, including determining whether to combine a first subset and a second subset to form a third subset according to a comparison of a communication cost between the first subset and second subset and a load cost of the third subset that is based at least in part on a number of instructions issued per cycle by a computation unit; and assigning each refined subset of instructions to one of the computation units for execution on the assigned computation unit.
    Type: Grant
    Filed: February 7, 2008
    Date of Patent: August 21, 2012
    Assignee: Tilera Corporation
    Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
  • Patent number: 8250557
    Abstract: There is disclosed a method and system for configuring a data dependency graph (DDG) to handle instruction scheduling in computer architectures permitting dynamic by-pass execution, and for performing dynamic by-pass scheduling utilizing such a configured DDG. In accordance with an embodiment of the invention, a heuristic function is used to obtain a ranking of nodes in the DDG after setting delays at all identified by-pass pairs of nodes in the DDG to 0. From among a list of identified by-pass pairs of nodes, a node that is identified as being the least important to schedule early is marked as “bonded” to its successor, and the corresponding delay for that identified node is set to 0. Node rankings are re-computed and the bonded by-pass pair of nodes are scheduled in consecutive execution cycles with a delay of 0 to increase the likelihood that a by-pass can be successfully taken during run-time execution.
    Type: Grant
    Filed: May 7, 2008
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Marcel Mitran, Alexander Vasilevskiy
  • Publication number: 20120192169
    Abstract: Particular embodiments optimize a C++ function comprising one or more loops for symbolic execution, comprising for each loop, if there is a branching condition within the loop, then rewrite the loop to move the branching condition outside the loop. Particular embodiments may further optimize the C++ function through simplified symbolic expressions and adding constructs forcing delayed interpretation of symbolic expressions during the symbolic execution.
    Type: Application
    Filed: January 20, 2011
    Publication date: July 26, 2012
    Applicant: FUJITSU LIMITED
    Inventors: Guodong Li, Sreeranga P. Rajan, Indradeep Ghosh
  • Patent number: 8224636
    Abstract: A method and mechanism for implementing a general purpose scripting language that supports parallel execution is described. In one approach, parallel execution is provided in a seamless and high-level approach rather than requiring or expecting a user to have low-level programming expertise with parallel processing languages/functions. Also described is a system and method for performing circuit simulation. The present approach provides methods and systems that create reusable and independent measurements for use with circuit simulators. Also disclosed are parallelizable measurements having looping constructs that can be run without interference between parallel iterations. Reusability is enhanced by having parameterized measurements. Revisions and history of the operating parameters of circuit designs subject to simulation are tracked.
    Type: Grant
    Filed: December 17, 2003
    Date of Patent: July 17, 2012
    Assignee: Cadence Design Systems, Inc.
    Inventor: Kenneth S. Kundert
  • Patent number: 8191056
    Abstract: A target operation in a normalized target loop, susceptible of vectorization and which may, after compilation into a vectorized form, seek to operate on data in nonconsecutive physical memory, is identified in source code. Hardware instructions are inserted into executable code generated from the source code, directing a system that will run the executable code to create a representation of the data in consecutive physical memory. A vector loop containing the target operation is replaced, in the executable code, with a function call to a vector library to call a vector function that will operate on the representation to generate a result identical to output expected from executing the vector loop containing the target operation. On execution, a representation of data residing in nonconsecutive physical memory is created in consecutive physical memory, and the vectorized target operation is applied to the representation to process the data.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: May 29, 2012
    Assignee: International Business Machines Corporation
    Inventors: Roch Georges Archambault, George Chochia, Peng Zhao
  • Patent number: 8191057
    Abstract: Systems, methods and computer products for compiler support for aggressive safe load speculation. Exemplary embodiments include a method for aggressive safe load speculation for a compiler in a computer system, the method including building a control flow graph, identifying both countable and non-countable loops, gathering a set of candidate loops for load speculation, and for each candidate loop in the set of candidate loops gathered for load speculation, computing an estimate of the iteration count, delay cycles, and code size, performing a profitability analysis and determining an unroll factor based on the delay cycles and the code size, transforming the loop by generating a prologue loop to achieve data alignment and an unrolled main loop with loop directives, indicating which loads can safely be executed speculatively and performing low-level instruction scheduling on the generated unrolled main loop.
    Type: Grant
    Filed: August 27, 2007
    Date of Patent: May 29, 2012
    Assignee: International Business Machines Corporation
    Inventors: Roch G. Archambault, Geoffrey O. Blandy, Roland Froese, Yaoqing Gao, Liangxiao Hu, James L. McInnes, Raul E. Silvera
  • Patent number: 8171464
    Abstract: An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.
    Type: Grant
    Filed: May 16, 2008
    Date of Patent: May 1, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
  • Patent number: 8146066
    Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.
    Type: Grant
    Filed: March 5, 2007
    Date of Patent: March 27, 2012
    Assignee: Google Inc.
    Inventors: Christopher G. Demetriou, Matthew N. Papakipos
  • Patent number: 8141068
    Abstract: A computer program consisting of a compiler for compiling source code programs into executable code. The compiler is suited to achieving high efficiency on a processor that can process many instructions at once but the instructions have dependency constraints and the processor has no internal mechanism for dealing with these constraints, such as the Itanium class of processors. As each instruction is considered for addition to a group of instructions for a single cycle, dependencies are checked to determine whether the entire group can be scheduled in any possible order. Once all the instructions of the group have been selected, the instructions are then reordered for placement in a reservation table. For implementation in the Itanium class of processors, detailed requirements of the processor are accommodated with a structure that can be adjusted for any processor in the class. The structure can also be adjusted for other classes of processors.
    Type: Grant
    Filed: June 18, 2002
    Date of Patent: March 20, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Carol L. Thompson
  • Patent number: 8136102
    Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.
    Type: Grant
    Filed: March 5, 2007
    Date of Patent: March 13, 2012
    Assignee: Google Inc.
    Inventors: Matthew N. Papakipos, Brian K. Grant, Christopher G. Demetriou, Morgan S. McGuire
  • Patent number: 8136104
    Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.
    Type: Grant
    Filed: March 5, 2007
    Date of Patent: March 13, 2012
    Assignee: Google Inc.
    Inventors: Matthew N. Papakipos, Brian K. Grant, Morgan S. McGuire, Christopher G. Demetriou
  • Patent number: 8132163
    Abstract: A method and apparatus for automatic second-order predictive commoning is provided by the present invention. During an analysis phase, the intermediate representation of a program code is analyzed to identify opportunities for second-order predictive commoning optimization. The analyzed information is used by the present invention for apply transformations to the program code, such that the number of memory access and the number of computations are reduced for loop iterations and performance of program code is improved.
    Type: Grant
    Filed: January 30, 2009
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Arie Tal, Dina Tal
  • Patent number: 8104030
    Abstract: A computer implemented method, computer usable program code, and a system for parallelizing a loop. A parameter that will be used to limit parallelization of the loop is identified to limit parallelization of the loop. The parameter specifies a minimum number of loop iterations that a thread should execute. The parameter can be adjusted based on a parallel performance factor. A parallel performance factor is a factor that influences the performance of parallel code. A number of threads from a plurality of threads is selected for processing iterations of the loop based on the parameter. The number of threads is selected prior to execution of the first iteration of the loop.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Raul Esteban Silvera, Priya Unnikrishnan, Guansong Zhang
  • Patent number: 8098251
    Abstract: A system, method and apparatus are disclosed, in which an instruction scheduler of a compiler, e.g., a shader compiler, reduces instruction latency based on a determined instruction distance between a dependent predecessor and successor instructions.
    Type: Grant
    Filed: February 22, 2008
    Date of Patent: January 17, 2012
    Assignee: QUALCOMM Incorporated
    Inventor: Lin Chen
  • Patent number: 8099726
    Abstract: A software transactional memory system is described which utilizes decomposed software transactional memory instructions as well as runtime optimizations to achieve efficient performance. The decomposed instructions allow a compiler with knowledge of the instruction semantics to perform optimizations which would be unavailable on traditional software transactional memory systems. Additionally, high-level software transactional memory optimizations are performed such as code movement around procedure calls, addition of operations to provide strong atomicity, removal of unnecessary read-to-update upgrades, and removal of operations for newly-allocated objects. During execution, multi-use header words for objects are extended to provide for per-object housekeeping, as well as fast snapshots which illustrate changes to objects. Additionally, entries to software transactional memory logs are filtered using an associative table during execution, preventing needless writes to the logs.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: January 17, 2012
    Assignee: Microsoft Corporation
    Inventor: Timothy Lawrence Harris
  • Patent number: 8091079
    Abstract: A method for implementing shadow versioning to improve data dependence analysis for instruction scheduling in compiling code, and to identify loops within the code to be compiled, for each loop initializing a dependence a matrix, for each loop shadow identifying symbols that are accessed by the loop, examining dependencies, storing, comparing and classifying the dependence vectors, generating new shadow symbols, replacing the old shadow symbols with the new shadow symbols, generating alias relationships between the newly created shadow symbols, scheduling instructions and compiling the code.
    Type: Grant
    Filed: August 29, 2007
    Date of Patent: January 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Roch G. Archambault, Yaoqing Gao, Raul E. Silvera, Peng Zhao
  • Patent number: 8065670
    Abstract: A system that reduces overly optimistic program execution. During operation, the system encounters a bounded-execution block while executing a program, wherein the bounded execution block includes a primary path and a secondary path. Next, the system executes the bounded execution block. After executing the bounded execution block, the system determines whether executing instructions on the primary path is preferable to executing instructions on the secondary path based on information gathered while executing the bounded-execution block. If not, the system dynamically modifies the instructions of the bounded-execution block so that during subsequent passes through the bounded-execution block, the instructions on the secondary path are executed instead of the instructions on the primary path.
    Type: Grant
    Filed: October 3, 2006
    Date of Patent: November 22, 2011
    Assignee: Oracle America, Inc.
    Inventors: Tycho G. Nightingale, Wayne Mesard
  • Patent number: 8042100
    Abstract: Systems, methods, and computer products for evaluating robustness of a list scheduling framework. Exemplary embodiments include a method for evaluating the robustness of a list scheduling framework, the method including identifying a set of compiler benchmarks known to be sensitive to an instruction scheduler, running the set of benchmarks against a heuristic under test, H and collect an execution time Exec(H[G]), where G is a directed a-cyclical graph, running the set of benchmarks against a plurality of random heuristics Hrand[G]i, and collect a plurality of respective execution times Exec(Hrand[G])i, computing a robustness of the list scheduling framework, and checking robustness check it against a pre-determined threshold.
    Type: Grant
    Filed: August 27, 2007
    Date of Patent: October 18, 2011
    Assignee: International Business Machines Corporation
    Inventors: Marcel Mitran, Joran S. C. Siu, Alexander Vasilevskiy
  • Patent number: 8028281
    Abstract: Parallelization of loops is performed for loops having indirect loop index variables and embedded conditional statements in the loop body. Loops having any finite number of array variables in the loop body, and any finite number of indirect loop index variables can be parallelized. There are two particular limitations of the described techniques: (i) that there are no cross-iteration dependencies in the loop other than through the indirect loop index variables; and (ii) that the loop index variables (either direct or indirect) are not redefined in the loop body.
    Type: Grant
    Filed: January 5, 2007
    Date of Patent: September 27, 2011
    Assignee: International Business Machines Corporation
    Inventor: Rajendra K. Bera
  • Patent number: 8020142
    Abstract: A method for instruction processing may include adding a first operand from a first register, a second operand from a second register and a carry input bit to generate a sum and a carry out bit, loading the sum into a third register and loading the carry out bit into a most significant bit position of the third register to generate a third operand, performing a single bit shift on the third operand via a shifter unit to produce a shifted operand and loading the shifted operand into the fourth register, loading a least significant bit from the sum into the most significant bit position of the fourth register to generate a fourth operand, generating a greatest common divisor (GCD) of the first and second operands via the fourth operand and generating a public key based on, at least in part, the GCD. Many alternatives, variations and modifications are possible.
    Type: Grant
    Filed: December 14, 2006
    Date of Patent: September 13, 2011
    Assignee: Intel Corporation
    Inventors: Gilbert M. Wolrich, William Hasenplaugh, Wajdi Feghali, Daniel Cutter, Vinodh Gopal, Gunnar Gaubatz
  • Patent number: 7996827
    Abstract: A method for advantageously translating high-level language codes for data processing using a reconfigurable architecture, memories addressable internally from within said reconfigurable architecture, and memories external to said reconfigurable architecture, may include constructing a finite automaton for computation in such a way that a complex combinatory network of individual functions is formed, assigning memories to the network for storage of operands and results, and separating external memory accesses for providing a transfer of at least one of operands and results as data from an external memory to a memory addressable internally by the reconfigurable architecture.
    Type: Grant
    Filed: August 16, 2002
    Date of Patent: August 9, 2011
    Inventors: Martin Vorbach, May Frank, Armin Nückel
  • Patent number: 7979853
    Abstract: Compiler device optimizes a program by changing an order of executing instructions.
    Type: Grant
    Filed: January 24, 2008
    Date of Patent: July 12, 2011
    Assignee: International Business Machines Corporation
    Inventors: Motohiro Kawahito, Hideaki Komatsu
  • Patent number: 7962907
    Abstract: An improved scheduling technique for software pipelining is disclosed which is designed to find schedules requiring fewer processor clock cycles and reduce register pressure hot spots when scheduling multiple groups of instructions (e.g. as represented by multiple sub-graphs of a DDG) which are independent, and substantially identical. The improvement in instruction scheduling and reduction of hot spots is achieved by evenly distributing such groups of instructions around the schedule for a given loop.
    Type: Grant
    Filed: August 17, 2007
    Date of Patent: June 14, 2011
    Assignee: International Business Machines Corporation
    Inventors: Allan Russell Martin, James Lawrence McInnes
  • Patent number: 7930688
    Abstract: An improved scheduling technique for software pipelining is disclosed which is designed to find schedules requiring fewer processor clock cycles and reduce register pressure hot spots when scheduling multiple groups of instructions (e.g. as represented by multiple sub-graphs of a DDG) which are independent, and substantially identical. The improvement in instruction scheduling and reduction of hot spots is achieved by evenly distributing such groups of instructions around the schedule for a given loop.
    Type: Grant
    Filed: January 3, 2008
    Date of Patent: April 19, 2011
    Assignee: International Business Machines Corporation
    Inventors: Allen Russell Martin, James Lawrence McInnes
  • Patent number: 7895587
    Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.
    Type: Grant
    Filed: September 8, 2006
    Date of Patent: February 22, 2011
    Assignee: Elbrus International
    Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
  • Patent number: 7865887
    Abstract: Embodiments of the present invention provide improved event handling between systems. In one embodiment, the present invention includes software event handling method comprising receiving a first event from a first source system in a plurality of source systems, the first event including event information, accessing context information corresponding to the first event, generating a second event based on at least a portion of the event information and context information using one or more rules, assigning a priority to the second event, and sending the second event to a first target system in a plurality of target systems.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: January 4, 2011
    Assignee: SAP AG
    Inventors: Matthias U. Kaiser, Keith S. Klemba, Shuyuan Chen, Frankie James
  • Patent number: 7865885
    Abstract: Dynamic optimization of application code is performed by selecting a portion of the application code as a possible transaction. A transaction has a property that when it is executed, it is either atomically committed or atomically aborted. Determining whether to convert the selected portion of the application code to a transaction includes determining whether to apply at least one of a group of code optimizations to the portion of the application code. If it is determined to apply at least one of the code optimizations of the group of optimizations to the portion of application code, then the optimization is applied to the portion of the code and the portion of the code is converted to a transaction.
    Type: Grant
    Filed: September 27, 2006
    Date of Patent: January 4, 2011
    Assignee: Intel Corporation
    Inventors: Youfeng Wu, Cheng Wang, Ho-seop Kim
  • Patent number: 7856625
    Abstract: A program conversion device for converting a program source is provided. The program conversion device comprises: a section and index acquisition device for acquiring a section code for indicating a section embedded in the program and performance index information embedded in the program in association with the section code; a task code conversion device for separating the acquired section code into task codes and adding a code to indicate the beginning of the task and a code to indicate the end of the task; and a task index attachment device for attaching a performance index, to input to the scheduler, to the task.
    Type: Grant
    Filed: October 21, 2005
    Date of Patent: December 21, 2010
    Assignee: Panasonic Corporation
    Inventor: Kunihiko Hayashi
  • Patent number: 7856629
    Abstract: A compiler apparatus, which can perform software pipelining optimization that has a considerable effect of reducing the number of execution cycles taken to complete a loop process, converts a source program into a machine program for a processor which is capable of parallel processing. The compiler apparatus is composed of: a parsing unit operable to parse the source program and then to convert the source program into an intermediate program which is described in an intermediate language; an optimization unit operable to optimize the intermediate program; and a conversion unit operable to convert the optimized intermediate program into the machine language program, wherein the optimization unit is operable to execute software pipelining, by inserting a transfer instruction, which is used for transferring data between operands, into a loop process included in the intermediate program so that a data dependence relation is changed.
    Type: Grant
    Filed: May 24, 2006
    Date of Patent: December 21, 2010
    Assignee: Panasonic Corporation
    Inventors: Shohei Michimoto, Taketo Heishi, Hajime Ogawa, Teruo Kawabata
  • Patent number: 7814469
    Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is a speculative prefetch thread to perform instruction prefetch and/or trace pre-build for the main thread.
    Type: Grant
    Filed: April 24, 2003
    Date of Patent: October 12, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Tor M. Aamodt, Pedro Marcuello, Jared W. Stark, IV, John P. Shen, Antonio González, Per Hammarlund, Gerolf F. Hoflehner, Perry H. Wang, Steve Shih-wei Liao
  • Publication number: 20100251229
    Abstract: A compiling method compiles an object program to be executed by a processor having a plurality of execution units operable in parallel. In the method a first availability chain is created from a producer instruction (p1), scheduled for execution by a first one of the execution units (20: AGU), to a first consumer instruction (c1), scheduled for execution by a second one of the execution units (22: EXU) and requiring a value produced by the said producer instruction. The first availability chain comprises at least one move instruction (mv1-mv3) for moving the required value from a first point (20: ARF) accessible by the first execution unit to a second point (22: DRF) accessible by the second execution unit.
    Type: Application
    Filed: June 9, 2010
    Publication date: September 30, 2010
    Applicant: ALTERA CORPORATION
    Inventors: Marcio Merino Fernandes, Raymond Malcolm Livesley
  • Patent number: 7797692
    Abstract: A system that estimates a dominant computational resource which is used by a computer program. During operation, for each basic block in the computer program, the system determines a nesting level for the basic block. Next, the system selects basic blocks with nesting levels greater than a specified threshold. For each selected basic block, the system analyzes the basic block to estimate the dominant computational resource used by the basic block. The system then uses the estimated dominant computational resources for the selected basic blocks to estimate the dominant computational resource for the computer program.
    Type: Grant
    Filed: May 12, 2006
    Date of Patent: September 14, 2010
    Assignee: Google Inc.
    Inventor: Grzegorz J. Czajkowski
  • Patent number: 7788658
    Abstract: A method and system for enhancing the execution performance of program code. An analysis of the program code is used to generate code usage information for each code module. For each module, the code usage information is used to determine whether the code module should be separated from its original module container. If so, the code module is migrated to a new module container, and the code module in the original module container is replaced with a reference to the code module in the new module container.
    Type: Grant
    Filed: May 31, 2006
    Date of Patent: August 31, 2010
    Assignee: International Business Machines Corporation
    Inventors: Taimur Javed, Philip Loats, William J. Tracey, II, David A. Wood, III
  • Patent number: 7774766
    Abstract: Various embodiments of the present invention relate to methods and systems for optimizing an intermediate code in a compilation logic. The intermediate code is optimized by performing reassociation in software loops. The intermediate code includes at least one critical recurrence cycle. The performance of reassociation in software loops can reduce a critical recurrence cycle in them, which can speed up their execution. The subject method can include the determination of one or more critical recurrence cycles in a software loop. The method can also include the determination of at least one edge in a critical recurrence cycle, with respect to which reassociation can be performed, if one or more pre-determined criteria are met. The method can further include performing reassociation of a dependee and a dependent of an edge. In an embodiment, when one or more pre-determined criteria are met, the logic of the software loop is maintained after performing reassociation of the dependee and the dependent of the edge.
    Type: Grant
    Filed: September 29, 2005
    Date of Patent: August 10, 2010
    Assignee: Intel Corporation
    Inventors: Kalyan Muthukumar, Daniel M Lavery