Including Scheduling Instructions Patents (Class 717/161)
  • Patent number: 7765536
    Abstract: A Veil program analyzes the source code and data of a target program and determines how best to distribute the target program and data among the processors of a multi-processor computing system. The Veil program analyzes source code loops, data sizes and types to prepare a set of distribution attempts, whereby each distribution is run under a run-time evaluation wrapper and evaluated to determine the optimal distribution.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: July 27, 2010
    Assignee: Management Services Group, Inc.
    Inventors: Robert Stephen Gordy, Terry Spitzer
  • Patent number: 7761667
    Abstract: A mechanism is provided that identifies instructions that access storage and may be candidates for cache prefetching. The mechanism augments these instructions so that any given instance of the instruction operates in one of four modes, namely normal, unexecuted, data gathering, and validation. In the normal mode, the instruction merely performs the function specified in the software runtime environment. An instruction in unexecuted mode, upon the next execution, is placed in data gathering mode. When an instruction in the data gathering mode is encountered, the mechanism of the present invention collects data to discover potential fixed storage access patterns. When an instruction is in validation mode, the mechanism of the present invention validates the presumed fixed storage access patterns.
    Type: Grant
    Filed: August 12, 2008
    Date of Patent: July 20, 2010
    Assignee: International Business Machines Corporation
    Inventors: Christopher Michael Donawa, Allan Henry Kielstra
  • Patent number: 7752611
    Abstract: Various embodiments that may be used in performing speculative code motion for memory latency hiding are disclosed. One embodiment comprises extracting an asynchronous signal from a memory access instruction in a program to represent a latency of the memory access instruction, and generating a wait instruction to wait for the asynchronous signal.
    Type: Grant
    Filed: December 10, 2005
    Date of Patent: July 6, 2010
    Assignee: Intel Corporation
    Inventors: Long Li, Jinquan Dai, Zhiyuan Lv
  • Patent number: 7747993
    Abstract: A method of ordering instructions. The method can include placing a first instruction that consumes a value of an object before a second instruction that produces the value of the object such that the first instruction is processed before the second instruction and a physical location is allocated to the value of the object upon processing the first instruction.
    Type: Grant
    Filed: December 30, 2004
    Date of Patent: June 29, 2010
    Assignee: Michigan Technological University
    Inventor: Soner Onder
  • Patent number: 7730470
    Abstract: A system for binary code instrumentation to reduce effective memory latency comprises a processor and memory coupled to the processor. The memory comprises program instructions executable by the processor to implement a code analyzer configured to analyze an instruction stream of compiled code executable at an execution engine to identify, for a given memory reference instruction in the stream that references data at a memory address calculated during an execution of the instruction stream, an earliest point in time during the execution at which sufficient data is available at the execution engine to calculate the memory address. The code analyzer generates an indication of whether the given memory reference instruction is suitable for a prefetch operation based on a difference in time between the earliest point in time and a time at which the given memory reference instruction is executed during the execution.
    Type: Grant
    Filed: February 27, 2006
    Date of Patent: June 1, 2010
    Assignee: Oracle America, Inc.
    Inventors: Ilya A. Sharapov, Andrew J. Over
  • Patent number: 7712091
    Abstract: A method and system for optimizing the execution of a software loop is provided. The method involves the determination of an edge in a critical recurrence cycle in the software loop. The edge is a dependency link between two instructions and contains a dependee and a dependent. The dependee is an instruction that produces a result, and the dependent is an instruction that uses the result. The method further involves performing predicate promotion of at least one of the dependee and the dependent if one or more pre-determined conditions are met.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: May 4, 2010
    Assignee: Intel Corporation
    Inventors: Kalyan Muthukumar, Robyn A. Sampson, Daniel Lavery
  • Publication number: 20100107147
    Abstract: A compiler allocates an unroll_group_number conferred based on a sequence in which a loop body is replicated by loop unrolling to each loop body during loop unrolling based on the optimized number of loop unrolling. The allocated unroll_group_number is added to each instruction included in each loop body. A priority of an instruction is adjusted based on the allocated unroll_group_number during instruction scheduling.
    Type: Application
    Filed: September 11, 2009
    Publication date: April 29, 2010
    Inventor: Byung-chang Cha
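    The priority adjustment described in this abstract can be illustrated with a minimal sketch. The names `Instr`, `base_priority`, `unroll`, and the linear `penalty` are assumptions made for illustration; the abstract does not say how the priority is adjusted, only that it depends on the unroll_group_number. Data dependencies are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    base_priority: int            # hypothetical priority; higher is scheduled first
    unroll_group_number: int = 0  # order in which this copy of the body was created


def unroll(body, factor):
    """Replicate the loop body `factor` times, tagging each copy with its group number."""
    return [Instr(f"{i.name}#{g}", i.base_priority, g)
            for g in range(factor) for i in body]


def adjusted_priority(instr, penalty=1):
    # Assumption: earlier copies are favoured by subtracting a penalty per group.
    return instr.base_priority - penalty * instr.unroll_group_number


def schedule_order(instrs):
    # sorted() is stable, so instructions with equal priority keep program order.
    return sorted(instrs, key=adjusted_priority, reverse=True)


body = [Instr("load", 3), Instr("add", 2)]
order = [i.name for i in schedule_order(unroll(body, 2))]
```

    With this penalty, instructions from the first copy of the body are not overtaken by equally urgent instructions from later copies.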
  • Patent number: 7702856
    Abstract: The prefetch distance to be used by a prefetch instruction may not always be correctly calculated using compile-time information. In one embodiment, the present invention generates prefetch distance calculation code to dynamically calculate a prefetch distance used by a prefetch instruction at run-time.
    Type: Grant
    Filed: November 9, 2005
    Date of Patent: April 20, 2010
    Assignee: Intel Corporation
    Inventors: Rakesh Krishnaiyer, Somnath Ghosh, Abhay Kanhere
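    As a rough illustration of computing a prefetch distance at run time, the sketch below uses a common rule of thumb (memory latency divided by the time per loop iteration, rounded up). This formula, the function name `prefetch_distance`, and the `max_distance` cap are assumptions; the abstract does not state which calculation the generated code performs.

```python
def prefetch_distance(latency_cycles, cycles_per_iteration, max_distance=64):
    """Number of iterations ahead to prefetch, from values measured at run time."""
    if cycles_per_iteration <= 0:
        raise ValueError("cycles_per_iteration must be positive")
    # Integer ceiling division: enough iterations to cover the memory latency.
    distance = -(-latency_cycles // cycles_per_iteration)
    # Clamp to a sane range (the cap is a hypothetical tuning parameter).
    return max(1, min(distance, max_distance))
```

    A loop body that takes 25 cycles against a 200-cycle miss latency would, under this rule, prefetch 8 iterations ahead.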
  • Patent number: 7698696
    Abstract: A compiler comprises an analysis unit that detects directives (options and pragmas) from a user to the compiler, and an optimization unit made up of processing units (a global region allocation unit, a software pipelining unit, a loop unrolling unit, an “if” conversion unit, and a pair instruction generation unit) that perform the individual optimization processing designated by options and pragmas from a user, following the directives and the like from the analysis unit. The global region allocation unit performs optimization processing, following designation of the maximum data size of variables to be allocated to a global region, designation of variables to be allocated to the global region, and options and pragmas regarding designation of variables not to be allocated in the global region.
    Type: Grant
    Filed: June 30, 2003
    Date of Patent: April 13, 2010
    Assignee: Panasonic Corporation
    Inventors: Hajime Ogawa, Taketo Heishi, Toshiyuki Sakata, Shuichi Takayama, Shohei Michimoto, Tomoo Hamada, Ryoko Miyachi
  • Patent number: 7689980
    Abstract: Linear transformations of statements in code are performed to generate linear expressions associated with the statements. Parallel code is generated using the linear expressions. Generating the parallel code includes splitting the computation-space of the statements into intervals and generating parallel code for the intervals.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: March 30, 2010
    Assignee: Intel Corporation
    Inventors: Zhao Hui Du, Shih-wei Liao, Gansha Wu, Guei-Yuan Lueh
  • Patent number: 7685588
    Abstract: Embodiments of the present invention provide for platform independence, low intrusiveness, and optimal memory usage of the binary instrumentation process by means of employing one procedure (interceptor function) implemented in a high-level programming language to intercept an arbitrary number of functions or blocks of code. Each time a function or code block needs to be intercepted a new copy of the procedure from a provided memory region may be associated with the address of the function or block of code by means of a memory region descriptor and an intercepted function address table. Once activated, the interceptor function may retrieve its current address and, by searching memory region descriptors, determine the region the current address belongs to; the region's base address may then be obtained. A reference to the intercepted function address table may be fetched from the region descriptor; and an index to the intercepted function address table may be computed.
    Type: Grant
    Filed: March 28, 2005
    Date of Patent: March 23, 2010
    Assignee: Intel Corporation
    Inventors: Sergey N. Zheltov, Stanislav V. Bratanov, Dmitry Eremin
  • Publication number: 20100070956
    Abstract: Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for both parallelism and locality of operations on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
    Type: Application
    Filed: September 16, 2009
    Publication date: March 18, 2010
    Inventors: Allen Leung, Nicolas T. Vasilache, Benoit Meister, Richard A. Lethin
  • Patent number: 7681188
    Abstract: One embodiment of the present invention provides a system that facilitates locked prefetch scheduling in general cyclic regions of a computer program. The system operates by first receiving a source code for the computer program and compiling the source code into intermediate code. The system then performs a trace detection on the intermediate code. Next, the system inserts prefetch instructions and corresponding locks into the intermediate code. Finally, the system generates executable code from the intermediate code, wherein a lock for a given prefetch instruction prevents subsequent prefetches from being issued until the data value returns for the given prefetch instruction.
    Type: Grant
    Filed: April 29, 2005
    Date of Patent: March 16, 2010
    Assignee: Sun Microsystems, Inc.
    Inventors: Partha P. Tirumalai, Spiros Kalogeropulos, Yonghong Song
  • Patent number: 7673296
    Abstract: A method of scheduling optional instructions in a compiler targets a processor. The scheduling includes indicating a limit on the additional processor computations that are available for executing an optional code, generating one or more required instructions corresponding to a source code and one or more optional instructions corresponding to the optional code used with the source code and scheduling all of the one or more required instructions with as many of the one or more optional instructions as possible without exceeding the indicated limit on the additional processor computations for executing the optional code.
    Type: Grant
    Filed: July 28, 2004
    Date of Patent: March 2, 2010
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Jean-Francois Collard, Alan H. Karp
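    The selection step in this abstract (all required instructions, plus as many optional ones as fit under the limit) can be sketched as follows. The per-instruction `cost` and the cheapest-first greedy choice are assumptions; greedy selection by ascending cost does maximize the number of optional instructions kept, but the abstract does not say how the patented scheduler chooses.

```python
def select_instructions(required, optional, extra_budget):
    """
    required:     list of instruction names that must be scheduled
    optional:     list of (name, cost) pairs; cost is a hypothetical measure of
                  the additional processor computation the instruction needs
    extra_budget: indicated limit on additional computation for optional code
    Returns (instructions to schedule, budget actually spent).
    """
    chosen = []
    spent = 0
    # Cheapest first: this keeps as many optional instructions as possible.
    for name, cost in sorted(optional, key=lambda item: item[1]):
        if spent + cost > extra_budget:
            break
        chosen.append(name)
        spent += cost
    return list(required) + chosen, spent


selected, spent = select_instructions(
    ["ld", "add", "st"],
    [("check_a", 3), ("log", 1), ("check_b", 2)],
    extra_budget=3,
)
```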
  • Patent number: 7669194
    Abstract: A mechanism for minimizing effective memory latency without unnecessary cost through fine-grained software-directed data prefetching using integrated high-level and low-level code analysis and optimizations is provided. The mechanism identifies and classifies streams, identifies data that is most likely to incur a cache miss, exploits effective hardware prefetching to determine the proper number of streams to be prefetched, exploits effective data prefetching on different types of streams in order to eliminate redundant prefetching and avoid cache pollution, and uses high-level transformations with integrated lower level cost analysis in the instruction scheduler to schedule prefetch instructions effectively.
    Type: Grant
    Filed: August 26, 2004
    Date of Patent: February 23, 2010
    Assignee: International Business Machines Corporation
    Inventors: Roch Georges Archambault, Robert James Blainey, Yaoqing Gao, Allan Russell Martin, James Lawrence McInnes, Francis Patrick O'Connell
  • Patent number: 7657882
    Abstract: A dataflow instruction set architecture and execution model, referred to as WaveScalar, which is designed for scalable, low-complexity/high-performance processors, while efficiently providing traditional memory semantics through a mechanism called wave-ordered memory. Wave-ordered memory enables “real-world” programs, written in any language, to be run on the WaveScalar architecture, as well as any out-of-order execution unit. Because it is software-controlled, wave-ordered memory can be disabled to obtain greater parallelism. WaveScalar also includes a software-controlled tag management system.
    Type: Grant
    Filed: January 21, 2005
    Date of Patent: February 2, 2010
    Assignee: University of Washington
    Inventors: Mark H. Oskin, Steven J. Swanson, Susan J. Eggers
  • Patent number: 7657880
    Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is permitted to execute Store instructions. Store blocker logic operates to prevent data associated with a Store instruction in a helper thread from being committed to memory. Dependence blocker logic operates to prevent data associated with a Store instruction in a speculative helper thread from being bypassed to a Load instruction in a non-speculative thread.
    Type: Grant
    Filed: August 1, 2003
    Date of Patent: February 2, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Tor Aamodt, Per Hammarlund, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Steve Shih-wei Liao
  • Patent number: 7657883
    Abstract: A dispatch scheduler in a multithreading microprocessor is disclosed. Each of N concurrently executing threads has one of P priorities. P N-bit round-robin vectors are generated, each being a 1-bit left-rotated and subsequently sign-extended version of an N-bit 1-hot input vector indicating the last thread selected for dispatching at the priority. N P-input muxes each receive a corresponding one of the N bits of each of the P round-robin vectors and select the input specified by the thread priority. Selection logic selects an instruction for dispatching from the thread having a dispatch value greater than or equal to any of the threads left thereof in the N-bit input vectors. The dispatch value of each of the threads comprises a least-significant bit equal to the corresponding P-input mux output, a most-significant bit that is true if the instruction is dispatchable, and middle bits comprising the priority of the thread.
    Type: Grant
    Filed: March 22, 2005
    Date of Patent: February 2, 2010
    Assignee: MIPS Technologies, Inc.
    Inventor: Michael Gottlieb Jensen
  • Patent number: 7647586
    Abstract: A system and method for providing exceptional flow control in protected code through watchpoints is described. Code is generated. The generated code includes a sequence of normal operations and is subject to protection against copying during execution of the generated code. Execution points within the generated code are identified. A watchpoint corresponding to each of the execution points is set. An exception handler associated with each watchpoint is defined and includes operations exceptional to the normal operations sequence that are performed upon a triggering of each watchpoint during execution of the generated code.
    Type: Grant
    Filed: August 13, 2004
    Date of Patent: January 12, 2010
    Assignee: Sun Microsystems, Inc.
    Inventors: Dean R. E. Long, Christopher J. Plummer, Nedim Fresko
  • Patent number: 7647473
    Abstract: An instruction processing method for checking an arrangement of basic instructions in a very long instruction word (VLIW) instruction, suitable for language processing systems, an assembler and a compiler, used for processors which execute variable length VLIW instructions designed based on variable length VLIW architecture.
    Type: Grant
    Filed: January 24, 2002
    Date of Patent: January 12, 2010
    Assignee: Fujitsu Limited
    Inventors: Teruhiko Kamigata, Hideo Miyake
  • Patent number: 7631305
    Abstract: Methods and products for processing a software kernel of instructions are disclosed. The software kernel has stages representing a loop nest. The software kernel is processed by partitioning iterations of an outermost loop into groups with each group representing iterations of the outermost loop, running the software kernel and rotating a register file for each stage of the software kernel preceding an innermost loop to generate code to prepare for filling and executing instructions in software pipelines for a current group, running the software kernel for each stage of the software kernel in the innermost loop to generate code to fill the software pipelines for the current group with the register file being rotated after at least one run of the software kernel for the innermost loop, and repeatedly running the software kernel to unroll inner loops to generate code to further fill the software pipelines for the current group.
    Type: Grant
    Filed: September 20, 2004
    Date of Patent: December 8, 2009
    Assignee: University of Delaware
    Inventors: Hongbo Rong, Guang R. Gao, Alban Douillet, Ramaswamy Govindarajan
  • Patent number: 7617496
    Abstract: A macroscalar processor architecture is described herein. In one embodiment, a processor receives instructions of a program loop having a vector block and a sequence block intended to be executed after the vector block, where the processor includes multiple slices and each of the slices is capable of executing an instruction of an iteration of the program loop substantially in parallel. For each iteration of the program loop, the processor executes an instruction of the sequence block using one of the slices while executing instructions of the vector block using a remainder of the slices substantially in parallel. Other methods and apparatuses are also described.
    Type: Grant
    Filed: September 1, 2005
    Date of Patent: November 10, 2009
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 7617495
    Abstract: Disclosed are embodiments of a compiler, methods, and system for resource-aware scheduling of instructions. A list scheduling approach is augmented to take into account resource constraints when determining priority for scheduling of instructions. Other embodiments are also described and claimed.
    Type: Grant
    Filed: March 24, 2004
    Date of Patent: November 10, 2009
    Assignee: Intel Corporation
    Inventors: Kalyan Muthukumar, Daniel M. Lavery, Gerolf F. Hoflehner, Chu-cheow Lim, Jean-Francois Collard
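    To make the idea of resource-aware list scheduling concrete, here is a minimal sketch. The machine model (per-unit slots plus a total `issue_width`), the critical-path height, and the fixed `scarce_bonus` for the most heavily demanded unit are all assumptions chosen for illustration; the abstract only says that resource constraints are taken into account when determining priority. The sketch assumes an acyclic dependence graph and at least one slot per unit.

```python
def list_schedule(instrs, deps, units, issue_width, scarce_bonus=1):
    """
    instrs: {name: (unit, latency)}      deps: {name: [predecessor names]}
    units:  {unit: slots per cycle}      issue_width: total issues per cycle
    Returns {name: issue cycle}.
    """
    succs = {n: [] for n in instrs}
    for n, preds in deps.items():
        for p in preds:
            succs[p].append(n)

    height = {}

    def h(n):
        # Longest latency path from n to the end of the dependence graph.
        if n not in height:
            height[n] = instrs[n][1] + max((h(s) for s in succs[n]), default=0)
        return height[n]

    # Resource pressure: instructions competing for each slot of a unit.
    demand = {u: 0 for u in units}
    for unit, _ in instrs.values():
        demand[unit] += 1
    pressure = {u: demand[u] / units[u] for u in units}
    most = max(pressure.values())

    def priority(n):
        # Assumption: instructions on the most contended unit get a bonus.
        return h(n) + (scarce_bonus if pressure[instrs[n][0]] == most else 0)

    issue, cycle = {}, 0
    while len(issue) < len(instrs):
        free, slots = dict(units), issue_width
        ready = [n for n in instrs if n not in issue and all(
            p in issue and issue[p] + instrs[p][1] <= cycle
            for p in deps.get(n, []))]
        for n in sorted(ready, key=priority, reverse=True):
            if slots == 0:
                break
            unit = instrs[n][0]
            if free[unit] > 0:
                free[unit] -= 1
                slots -= 1
                issue[n] = cycle
        cycle += 1
    return issue


ops = {"a1": ("alu", 1), "a2": ("alu", 1), "a3": ("alu", 1),
       "m1": ("mem", 1), "m2": ("mem", 1), "m3": ("mem", 1)}
machine = {"alu": 2, "mem": 1}
aware = list_schedule(ops, {}, machine, issue_width=2)
naive = list_schedule(ops, {}, machine, issue_width=2, scarce_bonus=0)
```

    In this example the single memory unit is the bottleneck. Giving memory operations priority finishes issuing one cycle earlier than the variant with `scarce_bonus=0`, which fills the first cycle with ALU operations.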
  • Patent number: 7613599
    Abstract: An integrated design environment (IDE) is disclosed for forming virtual embedded systems. The IDE includes a design language for forming finite state machine models of hardware components that are coupled to simulators of processor cores, preferably instruction set accurate simulators. A software debugger interface permits a software application to be loaded and executed on the virtual embedded system. A virtual test bench may be coupled to the simulation to serve as a human-machine interface. In one embodiment, the IDE is provided as a web-based service for the evaluation, development and procurement phases of an embedded system project. IP components, such as processor cores, may be evaluated using a virtual embedded system. In one embodiment, a virtual embedded system is used as an executable specification for the procurement of a good or service related to an embedded system.
    Type: Grant
    Filed: June 1, 2001
    Date of Patent: November 3, 2009
    Assignee: Synopsys, Inc.
    Inventors: Stephen L Bade, Shay Ben-Chorin, Paul Caamano, Marcelo E Montoreano, Ani Taggu, Filip C Theon, Dean C Wills
  • Publication number: 20090254892
    Abstract: A compiling method for compiling software which is adapted to output an intermediate result at a given timing, the compiling method includes extracting, by a computer, a process block related to parallel processing and conditional branch from a processing sequence included in a source code of a software which is processed time-sequentially, and generating, by the computer, an execution code by restructuring the process block that is extracted.
    Type: Application
    Filed: June 10, 2009
    Publication date: October 8, 2009
    Applicant: FUJITSU LIMITED
    Inventor: Koichiro Yamashita
  • Publication number: 20090254895
    Abstract: Prefetching irregular memory references into a software controlled cache is provided. A compiler analyzes source code to identify at least one of a plurality of loops that contain an irregular memory reference. The compiler determines if the irregular memory reference within the at least one loop is a candidate for optimization. Responsive to an indication that the irregular memory reference may be optimized, the compiler determines if the irregular memory reference is valid for prefetching. Responsive to an indication that the irregular memory reference is valid for prefetching, a store statement for an address of the irregular memory reference is inserted into the at least one loop. A runtime library call is inserted into a prefetch runtime library for the irregular memory reference. Data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked.
    Type: Application
    Filed: April 4, 2008
    Publication date: October 8, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tong Chen, Marc Gonzalez Tallada, Zehra N. Sura, Tao Zhang
  • Patent number: 7590976
    Abstract: The present invention relates to a compiler program, a computer-readable storage medium storing such a compiler program, a compiling method and a compiling unit, and an object thereof is to automatically generate a reentrant object program. In order to accomplish this object, an address saving program generator 16a generates an address saving program for saving a data area address of a calling program module; an address setting program generator 16b generates an address setting program for setting a data area address of an other program module; a transferring program generator 16c generates a transferring program for the transfer from the calling program module to the other program module; an address resetting program generator 16d generates an address resetting program for reading and resetting the saved data area address; and an accessing program generator 16e generates an accessing program for accessing a data area for the other program module using a relative address from the set data area address.
    Type: Grant
    Filed: December 26, 2003
    Date of Patent: September 15, 2009
    Assignee: Panasonic Corporation
    Inventors: Masaki Kawai, Takuji Kawamoto, Shusuke Haruna, Yutaka Fujihara
  • Publication number: 20090217253
    Abstract: A computer program is speculatively parallelized with transactional memory by scoping program variables at compile time, and inserting code into the program at compile time. Determinations of the scoping can be based on whether scalar variables being scoped are involved in inter-loop non-reduction data dependencies, are used outside loops in which they were defined, and at what point in a loop a scalar variable is defined. The inserted code can include instructions for execution at a run time of the program to determine loop boundaries of the program, and issue checkpoint instructions and commit instructions that encompass transaction regions in the program. A transaction region can include an original function of the program and a spin-waiting loop with a non-transactional load, wherein the spin-waiting loop is configured to wait for a previous thread to commit before the current transaction commits.
    Type: Application
    Filed: February 22, 2008
    Publication date: August 27, 2009
    Applicant: Sun Microsystems, Inc.
    Inventors: Yonghong Song, Xiangyun Kong, Spiros Kalogeropulos, Partha P. Tirumalai
  • Patent number: 7581215
    Abstract: We present a technique to perform dependence analysis on more complex array subscripts than the linear form of the enclosing loop indices. For such complex array subscripts, we decouple the original iteration space and the dependence test iteration space and link them through index-association functions. The dependence analysis is performed in the dependence test iteration space to determine whether the dependence exists in the original iteration space. The dependence distance in the original iteration space is determined by the distance in the dependence test iteration space and the property of index-association functions. For certain non-linear expressions, we show how to transform it to a set of linear expressions equivalently. The latter can be used in dependence test with traditional techniques. We also show how our advanced dependence analysis technique can help parallelize some otherwise hard-to-parallelize loops.
    Type: Grant
    Filed: June 24, 2004
    Date of Patent: August 25, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Yonghong Song, Xiangyun Kong
  • Patent number: 7581210
    Abstract: One embodiment disclosed relates to a method of compiling a program to be executed on a target microprocessor with multiple functional units of a same type. The method includes opportunistically scheduling a redundant operation on one of the functional units that would otherwise be idle during a cycle.
    Type: Grant
    Filed: September 10, 2003
    Date of Patent: August 25, 2009
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Dale John Shidla, Andrew Harvey Barr, Ken Gary Pomaranski
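    A minimal sketch of filling otherwise idle functional units with redundant operations, assuming a schedule is a list of cycles and every unit is of the same type. The function name, the `'` suffix marking a duplicate, and the choice of which operations to duplicate are hypothetical; what is done with the redundant results is not stated in the abstract and is not modelled here.

```python
def add_redundant_ops(schedule, num_units):
    """
    schedule:  list of cycles, each a list of operation names already
               assigned to identical functional units
    num_units: number of functional units of that type
    Returns a new schedule in which idle units re-execute operations
    from the same cycle (marked with a trailing apostrophe).
    """
    filled = []
    for cycle_ops in schedule:
        if len(cycle_ops) > num_units:
            raise ValueError("cycle uses more units than the target has")
        idle = num_units - len(cycle_ops)
        # Duplicate as many of this cycle's operations as there are idle units.
        duplicates = [op + "'" for op in cycle_ops[:idle]]
        filled.append(list(cycle_ops) + duplicates)
    return filled


result = add_redundant_ops([["add1", "add2"], ["mul1"], []], num_units=3)
```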
  • Patent number: 7571435
    Abstract: A method (and structure) for executing linear algebra subroutines, includes, for an execution code controlling operation of a floating point unit (FPU) performing the linear algebra subroutine execution, unrolling instructions to preload data into a floating point register (FReg) of the FPU. The unrolling generates an instruction to load data into the FReg and the instruction is inserted into a sequence of instructions that execute the linear algebra subroutine on the FPU.
    Type: Grant
    Filed: September 29, 2003
    Date of Patent: August 4, 2009
    Assignee: International Business Machines Corporation
    Inventors: Fred Gehrung Gustavson, John A. Gunnels
  • Patent number: 7565343
    Abstract: Fixed-length data (560) contained in a database (560) are segmented into a number of pieces of data that are searchable at a time and searching is performed at high speed. As means for it, a pointer table (500), a secondary pointer table, a local table, and a fixed-length-data table are provided, and when more segmentation is required, a table having a numeric-value comparing function is further provided. As means for performing efficient configuration/management of the tables and for performing management that does not interfere with a search operation, an empty-house table (700), an empty-room table (720), a room-management table (740), and a structure-management table (760) may be provided.
    Type: Grant
    Filed: March 31, 2004
    Date of Patent: July 21, 2009
    Assignee: IPT Corporation
    Inventor: Shinpei Watanabe
  • Patent number: 7546592
    Abstract: A method, computer program product, and a data processing system for scheduling instructions in a data processing system are provided. Dependencies among a plurality of nodes are analyzed to determine if any of the plurality of nodes uses a constrained resource. Each of the plurality of nodes represents an instruction in a set of instructions. A subset of the plurality of nodes is designated as resource-constrained nodes. An attempt is made to generate a schedule with the subset of the plurality of nodes scheduled with priority with respect to any of the plurality of nodes not included in the subset.
    Type: Grant
    Filed: July 21, 2005
    Date of Patent: June 9, 2009
    Assignee: International Business Machines Corporation
    Inventor: Allan Russell Martin
  • Publication number: 20090138864
    Abstract: A method and apparatus for automatic second-order predictive commoning is provided by the present invention. During an analysis phase, the intermediate representation of a program code is analyzed to identify opportunities for second-order predictive commoning optimization. The analyzed information is used by the present invention to apply transformations to the program code, such that the number of memory accesses and the number of computations are reduced for loop iterations and performance of program code is improved.
    Type: Application
    Filed: January 30, 2009
    Publication date: May 28, 2009
    Applicant: International Business Machines Corporation
    Inventors: Arie Tal, Dina Tal
  • Patent number: 7539884
    Abstract: A method of power-gating instruction scheduling for leakage power reduction comprises receiving a program, generating a control-flow graph dividing the program into a plurality of blocks, analyzing utilization of power-gated components of a processor executing the program, generating the first power-gating instruction placement comprising power-off instructions and power-on instructions to shut down the inactive power-gated components, generating the second power-gating instruction placement by merging the power-off instructions as one compound power-off instruction and merging the power-on instructions as one compound power-on instruction, and inserting power-gating instructions into the program in accordance with the second power-gating instruction placement.
    Type: Grant
    Filed: July 27, 2006
    Date of Patent: May 26, 2009
    Assignee: Industrial Technology Research Institute
    Inventors: Yi-Ping You, Chung Wen Huang, Jeng Kuen Lee, Chi-Lung Wang, Kuo Yu Chuang
  • Patent number: 7523449
    Abstract: A method for adaptive runtime reconfiguration of a co-processor instruction set, in a computer system with at least a main processor communicatively connected to at least one reconfigurable co-processor, includes the steps of configuring the co-processor to implement an instruction set comprising one or more co-processor instructions, issuing a co-processor instruction to the co-processor, and determining whether the instruction is implemented in the co-processor. For an instruction not implemented in the co-processor instruction set, raising a stall signal to delay the main processor, determining whether there is enough space in the co-processor for the non-implemented instruction, and if there is enough space for said instruction, reconfiguring the instruction set of the co-processor by adding the non-implemented instruction to the co-processor instruction set. The stall signal is cleared and the instruction is executed.
    Type: Grant
    Filed: August 23, 2006
    Date of Patent: April 21, 2009
    Assignee: International Business Machines Corporation
    Inventors: Sameh W. Asaad, Richard Gerard Hofmann
  • Patent number: 7516481
    Abstract: A program development supporting apparatus that groups a plurality of events each executed in an information processor to divide the events into a plurality of parallel execution units to be executed in parallel with each other has a directional graph acquisition section that acquires directional graph data expressing each of the plurality of events as a vertex and a restriction on the execution order between two of the plurality of events as a directional branch, an inverse chain partial set extraction section that traces the directional branch from each event in the forward direction to extract from the directional graph data an inverse partial set that is a combination of the events having such a relationship that any one of the events cannot be reached from the other events, and a parallel execution unit assignment section that assigns the plurality of events belonging to the inverse partial set to units different from each other in the parallel execution units.
    Type: Grant
    Filed: December 2, 2004
    Date of Patent: April 7, 2009
    Assignee: International Business Machines Corporation
    Inventor: Toshiyuki Fujikura
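    Illustration (not part of the patent record): the extraction step in the abstract above can be read as finding an antichain in a directed acyclic graph, that is, a set of events in which no event is reachable from another. The sketch below is a minimal greedy version under that reading. The function names, the dictionary-based graph layout, and the greedy selection order are assumptions made for illustration and are not taken from the patent.

```python
def reachable_from(succs, start):
    """Return every event reachable from `start` by following directed edges."""
    seen = set()
    stack = list(succs.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(succs.get(node, ()))
    return seen


def extract_antichain(succs, events):
    """Greedily pick events so that no chosen event can reach another chosen one.

    The result depends on the order of `events`; it is a valid antichain,
    not necessarily the largest one.
    """
    reach = {e: reachable_from(succs, e) for e in events}
    chosen = []
    for e in events:
        if all(e not in reach[c] and c not in reach[e] for c in chosen):
            chosen.append(e)
    return chosen


def assign_units(antichain):
    """Give each mutually unordered event its own parallel execution unit."""
    return {event: unit for unit, event in enumerate(antichain)}


# Hypothetical ordering constraints: a precedes b and c, which both precede d.
succs = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
antichain = extract_antichain(succs, ["b", "c", "d", "a"])
units = assign_units(antichain)
```

    Here b and c have no ordering constraint between them, so they land in different units, while a and d are excluded because each is ordered relative to b.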
  • Patent number: 7509634
    Abstract: A translator receives a source code that is described using a process designation (such as a line-by-line process designation, a line data extraction designation, and a broadcast designation) to be performed on line data of an image on a line by line basis, parses and optimizes the source code, and then generates an SIMD macro code that is an intermediate form taking into consideration the use of an SIMD instruction set. A simplifier generates, from the SIMD macro code, a simplified SIMD macro code, namely, a composite macro code into which a series of codes having the relationship between the definition and the reference of the same virtual SIMD register is organized. A machine code generator generates, from the simplified SIMD macro code, a machine code that efficiently uses an SIMD instruction.
    Type: Grant
    Filed: November 12, 2003
    Date of Patent: March 24, 2009
    Assignee: NEC Corporation
    Inventor: Shorin Kyo
  • Patent number: 7506331
    Abstract: A method, apparatus, and computer instructions for processing instructions. A data dependency graph is built. The data dependency graph is analyzed for recurrences, and unpipelined instructions that lie outside of the recurrences are expanded.
    Type: Grant
    Filed: August 30, 2004
    Date of Patent: March 17, 2009
    Assignee: International Business Machines Corporation
    Inventors: Roch Georges Archambault, Robert Frederick Enenkel, Robert William Hay, Allan Russell Martin, James Lawrence McInnes, Ronald Ian McIntosh, Mark Peter Mendell
  • Patent number: 7506326
    Abstract: An improved method, apparatus, and computer instructions for generating instructions to process multiple similar expressions. Parameters are identified for the expressions in the original instructions, to form a set of identified parameters typically including the operations performed, the types of data used, and the data sizes. Each type of execution unit that can execute the instructions needed to process the expressions using the set of identified parameters is identified, wherein a set of identified execution unit types is formed. An execution unit type from the set of identified execution unit types is selected to meet a performance goal. The new instructions are generated for the selected execution unit type to process the expressions, and the original instructions for the expressions are discarded.
    Type: Grant
    Filed: March 7, 2005
    Date of Patent: March 17, 2009
    Assignee: International Business Machines Corporation
    Inventor: Ronald Ian McIntosh
  • Publication number: 20090064121
    Abstract: Systems, methods and computer products for implementing shadow versioning to improve data dependence analysis for instruction scheduling. Exemplary embodiments include a method to identify loops within the code to be compiled, for each loop initializing a dependence matrix, for each loop identifying shadow symbols that are accessed by the loop, examining dependencies, storing, comparing and classifying the dependence vectors, generating new shadow symbols, replacing the old shadow symbols with the new shadow symbols, generating alias relationships between the newly created shadow symbols, scheduling instructions and compiling the code.
    Type: Application
    Filed: August 29, 2007
    Publication date: March 5, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Roch G. Archambault, Yaoqing Gao, Raul E. Silvera, Peng Zhao
  • Patent number: 7493609
    Abstract: A method and apparatus for automatic second-order predictive commoning is provided by the present invention. During an analysis phase, the intermediate representation of a program code is analyzed to identify opportunities for second-order predictive commoning optimization. The analyzed information is used by the present invention to apply transformations to the program code, such that the number of memory accesses and the number of computations are reduced for loop iterations and performance of program code is improved.
    Type: Grant
    Filed: August 30, 2004
    Date of Patent: February 17, 2009
    Assignee: International Business Machines Corporation
    Inventors: Arie Tal, Dina Tal
  • Patent number: 7493611
    Abstract: A scheduling algorithm is provided for selecting the placement of instructions with internal slack into a schedule of instructions within a loop. The algorithm achieves this by pinning nodes with internal slack to corresponding nodes on the critical path of the code that have similar properties in terms of the data dependency graph, such as earliest time and latest time. The effect is that nodes with internal slack are more often optimally placed in the schedule, reducing the need for rotating registers or register copy instructions. The benefit of the present invention can primarily be seen when performing instruction scheduling or software pipelining on loop code, but can also apply to other forms of instruction scheduling when greater control of placement of nodes with internal slack is desired.
    Type: Grant
    Filed: August 30, 2004
    Date of Patent: February 17, 2009
    Assignee: International Business Machines Corporation
    Inventor: Allan Russell Martin
  • Patent number: 7487336
    Abstract: The present disclosure relates to the allocation of registers and the scheduling of instructions, and, more specifically, to the classifying of operands and allocation of registers to local operands.
    Type: Grant
    Filed: December 12, 2003
    Date of Patent: February 3, 2009
    Assignee: Intel Corporation
    Inventors: Jayashankar Bharadwaj, Tatiana Shpeisman, Ali-Reza Adl-Tabatabai
  • Patent number: 7478379
    Abstract: A technique of ordering machine instructions to reduce spill code. For each machine instruction that is ready for scheduling, an amount is determined by which the size of a committed set of machine instructions would increase upon the scheduling of the machine instruction. The machine instruction for which the determined amount is smallest is then scheduled. The currently committed instructions may be determined to be the machine instructions that are already scheduled as well as the machine instructions that are descendent from already scheduled machine instructions. The result is that new computations upon which a target processor will embark tend to be deferred. Bit vectors may be employed for efficiency during the assessment of candidate instructions that are ready for scheduling. The technique may be triggered when the risk of registers becoming overcommitted becomes high, as may occur when the number of available processor registers drops below a certain threshold.
    Type: Grant
    Filed: May 6, 2004
    Date of Patent: January 13, 2009
    Assignee: International Business Machines Corporation
    Inventors: Damien Bonaventure, James Lawrence McInnes
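    Illustration (not part of the patent record): the selection rule described in the abstract above can be sketched as a greedy list scheduler. The committed set is modeled as a bit vector covering scheduled instructions and their descendants, and each step picks the ready instruction that grows it least. The function names, the successor-list input format, and the tie-break on instruction index are assumptions made for illustration, and the patent's triggering condition based on available registers is not modeled.

```python
def descendant_masks(succs):
    """Per instruction, a bitmask of the instruction itself plus all its descendants."""
    masks = [None] * len(succs)

    def visit(u):
        if masks[u] is None:
            mask = 1 << u
            for v in succs[u]:
                mask |= visit(v)
            masks[u] = mask
        return masks[u]

    for u in range(len(succs)):
        visit(u)
    return masks


def schedule_min_commit(succs):
    """Order the instructions of a dependence DAG, always choosing the ready
    instruction whose scheduling adds the fewest new members to the committed set."""
    n = len(succs)
    masks = descendant_masks(succs)
    pending_preds = [0] * n
    for u in range(n):
        for v in succs[u]:
            pending_preds[v] += 1
    ready = [u for u in range(n) if pending_preds[u] == 0]
    committed = 0  # bit vector: scheduled instructions and their descendants
    order = []
    while ready:
        # Growth of the committed set if u were scheduled next (ties: lowest index).
        best = min(ready, key=lambda u: (bin(masks[u] & ~committed).count("1"), u))
        ready.remove(best)
        order.append(best)
        committed |= masks[best]
        for v in succs[best]:
            pending_preds[v] -= 1
            if pending_preds[v] == 0:
                ready.append(v)
    return order


# Hypothetical DAG with two independent chains: 0 -> 1 -> 2 and 3 -> 4.
order = schedule_min_commit([[1], [2], [], [4], []])
```

    In this example the shorter chain is started first and then finished before the longer chain begins, because continuing an already committed chain adds nothing to the committed set. This mirrors the abstract's remark that new computations tend to be deferred.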
  • Publication number: 20090013316
    Abstract: A method, apparatus, and computer instructions for scheduling instructions for execution. Identify a series of instructions in a loop, wherein the series of instructions has a cyclic data dependency. Determine whether the series of instructions is a uniform series of instructions. Schedule execution of the uniform series of instructions within the loop to optimize execution of the loop in response to the identified series of instructions being the uniform series of instructions.
    Type: Application
    Filed: September 19, 2008
    Publication date: January 8, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Allan Russell Martin
  • Publication number: 20080313622
    Abstract: The present invention relates to a device and a method of accessing a storage of a storage device by reading or writing data to said storage device, wherein said accessing is controlled by an external data controller via a low level in a software stack of said device, and wherein said storage device is accessed without hampering the operation of functionalities at a higher level of said software stack of said device, said device comprises an intermediate storage, and the method comprises the steps of storing commands related to said accessing of data in said intermediate storage as a command queue, executing said commands in said command queue when allowed by a command queue scheduler, said scheduler scheduling in dependence of at least one of the functionalities at said higher level of said software stack. Thereby full control is obtained on storage medium requests by the scheduler.
    Type: Application
    Filed: May 26, 2005
    Publication date: December 18, 2008
    Inventor: Jozef Pieter Van Gassel
  • Patent number: 7448031
    Abstract: Methods and apparatus to compile a software program to manage parallel μcaches are disclosed. In an example method, a compiler attempts to schedule a software program such that load instructions in a first set of load instructions have a first predetermined latency greater than the latency of the first cache. The compiler also marks a second set of load instructions with a latency less than the first predetermined latency to access the first cache. The compiler attempts to schedule the software program such that the load instructions in a third set have at least a second predetermined latency greater than the latency of the second cache. The compiler identifies a fourth set of load instructions in the scheduled software program having less than the second predetermined latency and marks the fourth set of load instructions to access the second cache.
    Type: Grant
    Filed: December 17, 2003
    Date of Patent: November 4, 2008
    Assignee: Intel Corporation
    Inventor: Youfeng Wu
  • Patent number: 7447732
    Abstract: A system, method and article of manufacture for return code management in autonomic systems and more particularly for managing execution of operations in data processing systems on the basis of return code tracking. One embodiment provides a method for managing execution of an operation in a data processing system. The method comprises tracking return codes received from previous executions of the operation in the data processing system, determining an execution behavior of the operation from the tracked return codes, and managing a subsequent execution of the operation on the basis of the determined execution behavior.
    Type: Grant
    Filed: May 23, 2003
    Date of Patent: November 4, 2008
    Assignee: International Business Machines Corporation
    Inventors: Eric L. Barsness, John M. Santosuosso
  • Patent number: 7444628
    Abstract: A method, apparatus, and computer instructions for scheduling instructions for execution. Identify a series of instructions in a loop, wherein the series of instructions has a cyclic data dependency. Determine whether the series of instructions is a uniform series of instructions. Schedule execution of the uniform series of instructions within the loop to optimize execution of the loop in response to the identified series of instructions being the uniform series of instructions.
    Type: Grant
    Filed: August 30, 2004
    Date of Patent: October 28, 2008
    Assignee: International Business Machines Corporation
    Inventor: Allan Russell Martin