Including Scheduling Instructions Patents (Class 717/161)

System and method for the distribution of a program among cooperating processors

Patent number: 7765536

Abstract: A Veil program analyzes the source code and data of a target program and determines how best to distribute the target program and data among the processors of a multi-processor computing system. The Veil program analyzes source code loops, data sizes and types to prepare a set of distribution attempts, whereby each distribution is run under a run-time evaluation wrapper and evaluated to determine the optimal distribution.

Type: Grant

Filed: December 21, 2005

Date of Patent: July 27, 2010

Assignee: Management Services Group, Inc.

Inventors: Robert Stephen Gordy, Terry Spitzer
Method, apparatus, and program to efficiently calculate cache prefetching patterns for loops

Patent number: 7761667

Abstract: A mechanism is provided that identifies instructions that access storage and may be candidates for cache prefetching. The mechanism augments these instructions so that any given instance of the instruction operates in one of four modes, namely normal, unexecuted, data gathering, and validation. In the normal mode, the instruction merely performs the function specified in the software runtime environment. An instruction in unexecuted mode, upon the next execution, is placed in data gathering mode. When an instruction in the data gathering mode is encountered, the mechanism of the present invention collects data to discover potential fixed storage access patterns. When an instruction is in validation mode, the mechanism of the present invention validates the presumed fixed storage access patterns.

Type: Grant

Filed: August 12, 2008

Date of Patent: July 20, 2010

Assignee: International Business Machines Corporation

Inventors: Christopher Michael Donawa, Allan Henry Kielstra
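
The four modes named in the abstract of patent 7761667 can be pictured as a small state machine attached to each storage-access instruction. The sketch below is only an illustration and is not taken from the patent: the class name AccessSite, the sample and check counts, the use of a constant address stride as the "fixed storage access pattern", and the return to normal mode after validation are all assumptions.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    UNEXECUTED = "unexecuted"
    DATA_GATHERING = "data gathering"
    VALIDATION = "validation"


class AccessSite:
    """One storage-access instruction tracked through the four modes (hypothetical)."""

    def __init__(self, samples_needed=3, checks_needed=2):
        self.mode = Mode.UNEXECUTED
        self.samples_needed = samples_needed  # assumed number of gathered addresses
        self.checks_needed = checks_needed    # assumed number of validation hits
        self.addresses = []
        self.stride = None
        self.checks_passed = 0
        self.prefetch_stride = None           # set only once a pattern is validated

    def execute(self, address):
        if self.mode is Mode.UNEXECUTED:
            # Per the abstract: the next execution moves the site to data gathering.
            self.mode = Mode.DATA_GATHERING
        elif self.mode is Mode.DATA_GATHERING:
            self.addresses.append(address)
            if len(self.addresses) == self.samples_needed:
                deltas = {b - a for a, b in zip(self.addresses, self.addresses[1:])}
                if len(deltas) == 1:
                    # A single constant delta is our presumed fixed pattern.
                    self.stride = deltas.pop()
                    self.mode = Mode.VALIDATION
                else:
                    self.mode = Mode.NORMAL
        elif self.mode is Mode.VALIDATION:
            expected = self.addresses[-1] + self.stride
            self.addresses.append(address)
            if address != expected:
                self.mode = Mode.NORMAL       # pattern rejected
                return
            self.checks_passed += 1
            if self.checks_passed == self.checks_needed:
                self.prefetch_stride = self.stride
                self.mode = Mode.NORMAL       # assumption: validated sites run normally
        # Mode.NORMAL: the instruction just performs its usual function.


site = AccessSite()
for addr in [100, 108, 116, 124, 132, 140]:
    site.execute(addr)
print(site.prefetch_stride)
```

With the regular address sequence shown, the site gathers three addresses, presumes a stride of 8, confirms it twice, and records it as a prefetch candidate; an irregular sequence drops straight back to normal mode.
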
Speculative code motion for memory latency hiding

Patent number: 7752611

Abstract: Various embodiments that may be used in performing speculative code motion for memory latency hiding are disclosed. One embodiment comprises extracting an asynchronous signal from a memory access instruction in a program to represent a latency of the memory access instruction, and generating a wait instruction to wait for the asynchronous signal.

Type: Grant

Filed: December 10, 2005

Date of Patent: July 6, 2010

Assignee: Intel Corporation

Inventors: Long Li, Jinquan Dai, Zhiyuan Lv
Methods and systems for ordering instructions using future values

Patent number: 7747993

Abstract: A method of ordering instructions. The method can include placing a first instruction that consumes a value of an object before a second instruction that produces the value of the object such that the first instruction is processed before the second instruction and a physical location is allocated to the value of the object upon processing the first instruction.

Type: Grant

Filed: December 30, 2004

Date of Patent: June 29, 2010

Assignee: Michigan Technological University

Inventor: Soner Onder
Binary code instrumentation to reduce effective memory latency

Patent number: 7730470

Abstract: A system for binary code instrumentation to reduce effective memory latency comprises a processor and memory coupled to the processor. The memory comprises program instructions executable by the processor to implement a code analyzer configured to analyze an instruction stream of compiled code executable at an execution engine to identify, for a given memory reference instruction in the stream that references data at a memory address calculated during an execution of the instruction stream, an earliest point in time during the execution at which sufficient data is available at the execution engine to calculate the memory address. The code analyzer generates an indication of whether the given memory reference instruction is suitable for a prefetch operation based on a difference in time between the earliest point in time and a time at which the given memory reference instruction is executed during the execution.

Type: Grant

Filed: February 27, 2006

Date of Patent: June 1, 2010

Assignee: Oracle America, Inc.

Inventors: Ilya A. Sharapov, Andrew J. Over
Method for predicate promotion in a software loop

Patent number: 7712091

Abstract: A method and system for optimizing the execution of a software loop is provided. The method involves the determination of an edge in a critical recurrence cycle in the software loop. The edge is a dependency link between two instructions and contains a dependee and a dependent. The dependee is an instruction that produces a result, and the dependent is an instruction that uses the result. The method further involves performing predicate promotion of at least one of the dependee and the dependent if one or more pre-determined conditions are met.

Type: Grant

Filed: September 30, 2005

Date of Patent: May 4, 2010

Assignee: Intel Corporation

Inventors: Kalyan Muthukumar, Robyn A. Sampson, Daniel Lavery
COMPILER AND COMPILING METHOD

Publication number: 20100107147

Abstract: A compiler allocates an unroll_group_number conferred based on a sequence in which a loop body is replicated by loop unrolling to each loop body during loop unrolling based on the optimized number of loop unrolling. The allocated unroll_group_number is added to each instruction included in each loop body. A priority of an instruction is adjusted based on the allocated unroll_group_number during instruction scheduling.

Type: Application

Filed: September 11, 2009

Publication date: April 29, 2010

Inventor: Byung-chang Cha
Dynamic prefetch distance calculation

Patent number: 7702856

Abstract: The prefetch distance to be used by a prefetch instruction may not always be correctly calculated using compile-time information. In one embodiment, the present invention generates prefetch distance calculation code to dynamically calculate a prefetch distance used by a prefetch instruction at run-time.

Type: Grant

Filed: November 9, 2005

Date of Patent: April 20, 2010

Assignee: Intel Corporation

Inventors: Rakesh Krishnaiyer, Somnath Ghosh, Abhay Kanhere
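
To make the idea in the abstract of patent 7702856 concrete, the sketch below computes a prefetch distance from values that are only known at run time. The abstract does not give a formula; the one used here (memory latency divided by the time of one loop iteration, rounded up and clamped to the trip count) is a common textbook heuristic and is an assumption, as are the function and parameter names.

```python
import math


def prefetch_distance(memory_latency_cycles, cycles_per_iteration, trip_count):
    """Return how many iterations ahead to prefetch (hypothetical heuristic)."""
    if cycles_per_iteration <= 0:
        raise ValueError("cycles_per_iteration must be positive")
    # Enough iterations ahead that the data arrives before it is used.
    distance = math.ceil(memory_latency_cycles / cycles_per_iteration)
    # Never prefetch past the end of the loop, and always at least one ahead.
    return max(1, min(distance, trip_count))


print(prefetch_distance(200, 30, 1000))
print(prefetch_distance(200, 30, 4))
```

The second call shows why a run-time calculation can differ from a compile-time guess: when the loop turns out to run only four times, the distance is capped at 4.
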
Compiler apparatus with flexible optimization

Patent number: 7698696

Abstract: A compiler comprises an analysis unit that detects directives (options and pragmas) from a user to the compiler, an optimization unit that is made up of a processing unit (a global region allocation unit, a software pipelining unit, a loop unrolling unit, an “if” conversion unit, and a pair instruction generation unit) that performs individual optimization processing designated by options and pragmas from a user, following the directives and the like from the analysis unit, etc. The global region allocation unit performs optimization processing, following designation of the maximum data size of variables to be allocated to a global region, designation of variables to be allocated to the global region, and options and pragmas regarding designation of variables not to be allocated in the global region.

Type: Grant

Filed: June 30, 2003

Date of Patent: April 13, 2010

Assignee: Panasonic Corporation

Inventors: Hajime Ogawa, Taketo Heishi, Toshiyuki Sakata, Shuichi Takayama, Shohei Michimoto, Tomoo Hamada, Ryoko Miyachi
Splitting the computation space to optimize parallel code

Patent number: 7689980

Abstract: Linear transformations of statements in code are performed to generate linear expressions associated with the statements. Parallel code is generated using the linear expressions. Generating the parallel code includes splitting the computation-space of the statements into intervals and generating parallel code for the intervals.

Type: Grant

Filed: September 30, 2005

Date of Patent: March 30, 2010

Assignee: Intel Corporation

Inventors: Zhao Hui Du, Shih-wei Liao, Gansha Wu, Guei-Yuan Lueh
Platform independent binary instrumentation and memory allocation method

Patent number: 7685588

Abstract: Embodiments of the present invention provide for platform independence, low intrusiveness, and optimal memory usage of the binary instrumentation process by means of employing one procedure (interceptor function) implemented in a high-level programming language to intercept an arbitrary number of functions or blocks of code. Each time a function or code block needs to be intercepted a new copy of the procedure from a provided memory region may be associated with the address of the function or block of code by means of a memory region descriptor and an intercepted function address table. Once activated, the interceptor function may retrieve its current address and, by searching memory region descriptors, determine the region the current address belongs to; the region's base address may then be obtained. A reference to the intercepted function address table may be fetched from the region descriptor; and an index to the intercepted function address table may be computed.

Type: Grant

Filed: March 28, 2005

Date of Patent: March 23, 2010

Assignee: Intel Corporation

Inventors: Sergey N. Zheltov, Stanislav V. Bratanov, Dmitry Eremin
METHODS AND APPARATUS FOR JOINT PARALLELISM AND LOCALITY OPTIMIZATION IN SOURCE CODE COMPILATION

Publication number: 20100070956

Abstract: Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for both parallelism and locality of operations on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.

Type: Application

Filed: September 16, 2009

Publication date: March 18, 2010

Inventors: Allen Leung, Nicolas T. Vasilache, Benoit Meister, Richard A. Lethin
Locked prefetch scheduling in general cyclic regions

Patent number: 7681188

Abstract: One embodiment of the present invention provides a system that facilitates locked prefetch scheduling in general cyclic regions of a computer program. The system operates by first receiving a source code for the computer program and compiling the source code into intermediate code. The system then performs a trace detection on the intermediate code. Next, the system inserts prefetch instructions and corresponding locks into the intermediate code. Finally, the system generates executable code from the intermediate code, wherein a lock for a given prefetch instruction prevents subsequent prefetches from being issued until the data value returns for the given prefetch instruction.

Type: Grant

Filed: April 29, 2005

Date of Patent: March 16, 2010

Assignee: Sun Microsystems, Inc.

Inventors: Partha P. Tirumalai, Spiros Kalogeropulos, Yonghong Song
Method and system for optional code scheduling

Patent number: 7673296

Abstract: A method of scheduling optional instructions in a compiler targets a processor. The scheduling includes indicating a limit on the additional processor computations that are available for executing an optional code, generating one or more required instructions corresponding to a source code and one or more optional instructions corresponding to the optional code used with the source code and scheduling all of the one or more required instructions with as many of the one or more optional instructions as possible without exceeding the indicated limit on the additional processor computations for executing the optional code.

Type: Grant

Filed: July 28, 2004

Date of Patent: March 2, 2010

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Jean-Francois Collard, Alan H. Karp
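
The scheduling rule in the abstract of patent 7673296 (all required instructions, plus as many optional ones as fit under a stated limit) can be sketched as a simple budgeted selection. This is an illustration only: the cost model, the cheapest-first ordering, and the names schedule, required, optional, and limit are assumptions rather than details from the patent.

```python
def schedule(required, optional, limit):
    """Return (scheduled instruction names, optional cost spent).

    required: list of instruction names that must always be scheduled.
    optional: list of (name, cost) pairs; cost is the extra processor work.
    limit:    budget for the total cost of optional instructions.
    """
    scheduled = list(required)
    spent = 0
    # Taking the cheapest optional instructions first fits the largest
    # number of them under a fixed budget.
    for name, cost in sorted(optional, key=lambda item: item[1]):
        if spent + cost <= limit:
            scheduled.append(name)
            spent += cost
    return scheduled, spent


plan, used = schedule(
    ["load", "add", "store"],
    [("check_bounds", 3), ("log", 2), ("profile", 4)],
    limit=5,
)
print(plan, used)
```

Here the two cheapest optional instructions fit within the limit of 5 and the third is left out, while the required instructions are scheduled regardless of the budget.
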
Fine-grained software-directed data prefetching using integrated high-level and low-level code analysis optimizations

Patent number: 7669194

Abstract: A mechanism for minimizing effective memory latency without unnecessary cost through fine-grained software-directed data prefetching using integrated high-level and low-level code analysis and optimizations is provided. The mechanism identifies and classifies streams, identifies data that is most likely to incur a cache miss, exploits effective hardware prefetching to determine the proper number of streams to be prefetched, exploits effective data prefetching on different types of streams in order to eliminate redundant prefetching and avoid cache pollution, and uses high-level transformations with integrated lower level cost analysis in the instruction scheduler to schedule prefetch instructions effectively.

Type: Grant

Filed: August 26, 2004

Date of Patent: February 23, 2010

Assignee: International Business Machines Corporation

Inventors: Roch Georges Archambault, Robert James Blainey, Yaoqing Gao, Allan Russell Martin, James Lawrence McInnes, Francis Patrick O'Connell
Wavescalar architecture having a wave order memory

Patent number: 7657882

Abstract: A dataflow instruction set architecture and execution model, referred to as WaveScalar, which is designed for scalable, low-complexity/high-performance processors, while efficiently providing traditional memory semantics through a mechanism called wave-ordered memory. Wave-ordered memory enables “real-world” programs, written in any language, to be run on the WaveScalar architecture, as well as any out-of-order execution unit. Because it is software-controlled, wave-ordered memory can be disabled to obtain greater parallelism. Wavescalar also includes a software-controlled tag management system.

Type: Grant

Filed: January 21, 2005

Date of Patent: February 2, 2010

Assignee: University of Washington

Inventors: Mark H. Oskin, Steven J. Swanson, Susan J. Eggers
Safe store for speculative helper threads

Patent number: 7657880

Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is permitted to execute Store instructions. Store blocker logic operates to prevent data associated with a Store instruction in a helper thread from being committed to memory. Dependence blocker logic operates to prevent data associated with a Store instruction in a speculative helper thread from being bypassed to a Load instruction in a non-speculative thread.

Type: Grant

Filed: August 1, 2003

Date of Patent: February 2, 2010

Assignee: Intel Corporation

Inventors: Hong Wang, Tor Aamodt, Per Hammarlund, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Steve Shih-wei Liao
Instruction dispatch scheduler employing round-robin apparatus supporting multiple thread priorities for use in multithreading microprocessor

Patent number: 7657883

Abstract: A dispatch scheduler in a multithreading microprocessor is disclosed. Each of N concurrently executing threads has one of P priorities. P N-bit round-robin vectors are generated, each being a 1-bit left-rotated and subsequently sign-extended version of an N-bit 1-hot input vector indicating the last thread selected for dispatching at the priority. N P-input muxes each receive a corresponding one of the N bits of each of the P round-robin vectors and selects the input specified by the thread priority. Selection logic selects an instruction for dispatching from the thread having a dispatch value greater than or equal to any of the threads left thereof in the N-bit input vectors. The dispatch value of each of the threads comprises a least-significant bit equal to the corresponding P-input mux output, a most-significant bit that is true if the instruction is dispatchable, and middle bits comprising the priority of the thread.

Type: Grant

Filed: March 22, 2005

Date of Patent: February 2, 2010

Assignee: MIPS Technologies, Inc.

Inventor: Michael Gottlieb Jensen
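
The bit manipulation described in the abstract of patent 7657883 can be followed more easily in software form. The sketch below models the round-robin vector and the dispatch value with Python integers; it is an illustration under assumptions, not the patented circuit. In particular it assumes that thread 0 is the rightmost bit, that a larger priority number means higher priority, and that "sign-extended" means setting every bit at or above the rotated bit. The function names are hypothetical.

```python
def round_robin_vector(last_onehot, n):
    """Rotate the 1-hot vector left by one bit, then set all bits above it."""
    mask = (1 << n) - 1
    rotated = ((last_onehot << 1) | (last_onehot >> (n - 1))) & mask
    return mask & ~(rotated - 1)


def dispatch(dispatchable, priorities, last_selected, num_priorities):
    """Pick the next thread; last_selected maps priority -> 1-hot last pick."""
    n = len(dispatchable)
    prio_bits = max(1, (num_priorities - 1).bit_length())
    best, best_value = None, -1
    for t in range(n):
        rr = round_robin_vector(last_selected[priorities[t]], n)
        # Dispatch value: MSB = dispatchable, middle = priority, LSB = round-robin bit.
        value = (
            (int(dispatchable[t]) << (prio_bits + 1))
            | (priorities[t] << 1)
            | ((rr >> t) & 1)
        )
        # Strict '>' keeps the rightmost (lowest-index) thread among equal values.
        if value > best_value:
            best, best_value = t, value
    if not dispatchable[best]:
        return None  # nothing can be dispatched this cycle
    last_selected[priorities[best]] = 1 << best
    return best


last = {0: 0b0100, 1: 0b0001}  # thread 0 was the last pick at priority 1
order = [dispatch([True] * 4, [1, 1, 0, 1], last, 2) for _ in range(3)]
print(order)
```

With threads 0, 1, and 3 at priority 1 and thread 2 at priority 0, the three calls rotate fairly among the priority-1 threads, starting just after the previously selected one; thread 2 would only be chosen if no priority-1 thread were dispatchable.
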
System and method for providing exceptional flow control in protected code through watchpoints

Patent number: 7647586

Abstract: A system and method for providing exceptional flow control in protected code through watchpoints is described. Code is generated. The generated code includes a sequence of normal operations and is subject to protection against copying during execution of the generated code. Execution points within the generated code are identified. A watchpoint corresponding to each of the execution points is set. An exception handler associated with each watchpoint is defined and includes operations exceptional to the normal operations sequence that are performed upon a triggering of each watchpoint during execution of the generated code.

Type: Grant

Filed: August 13, 2004

Date of Patent: January 12, 2010

Assignee: Sun Microsystems, Inc.

Inventors: Dean R. E. Long, Christopher J. Plummer, Nedim Fresko
Instruction processing method for verifying basic instruction arrangement in VLIW instruction for variable length VLIW processor

Patent number: 7647473

Abstract: An instruction processing method for checking an arrangement of basic instructions in a very long instruction word (VLIW) instruction, suitable for language processing systems, an assembler and a compiler, used for processors which execute variable length VLIW instructions designed based on variable length VLIW architecture.

Type: Grant

Filed: January 24, 2002

Date of Patent: January 12, 2010

Assignee: Fujitsu Limited

Inventors: Teruhiko Kamigata, Hideo Miyake
Methods and products for processing loop nests

Patent number: 7631305

Abstract: Methods and products for processing a software kernel of instructions are disclosed. The software kernel has stages representing a loop nest. The software kernel is processed by partitioning iterations of an outermost loop into groups with each group representing iterations of the outermost loop, running the software kernel and rotating a register file for each stage of the software kernel preceding an innermost loop to generate code to prepare for filling and executing instructions in software pipelines for a current group, running the software kernel for each stage of the software kernel in the innermost loop to generate code to fill the software pipelines for the current group with the register file being rotated after at least one run of the software kernel for the innermost loop, and repeatedly running the software kernel to unroll inner loops to generate code to further fill the software pipelines for the current group.

Type: Grant

Filed: September 20, 2004

Date of Patent: December 8, 2009

Assignee: University of Delaware

Inventors: Hongbo Rong, Guang R. Gao, Alban Douillet, Ramaswamy Govindarajan
Macroscalar processor architecture

Patent number: 7617496

Abstract: A macroscalar processor architecture is described herein. In one embodiment, a processor receives instructions of a program loop having a vector block and a sequence block intended to be executed after the vector block, where the processor includes multiple slices and each of the slices is capable of executing an instruction of an iteration of the program loop substantially in parallel. For each iteration of the program loop, the processor executes an instruction of the sequence block using one of the slices while executing instructions of the vector block using a remainder of the slices substantially in parallel. Other methods and apparatuses are also described.

Type: Grant

Filed: September 1, 2005

Date of Patent: November 10, 2009

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
Resource-aware scheduling for compilers

Patent number: 7617495

Abstract: Disclosed are embodiments of a compiler, methods, and system for resource-aware scheduling of instructions. A list scheduling approach is augmented to take into account resource constraints when determining priority for scheduling of instructions. Other embodiments are also described and claimed.

Type: Grant

Filed: March 24, 2004

Date of Patent: November 10, 2009

Assignee: Intel Corporation

Inventors: Kalyan Muthukumar, Daniel M. Lavery, Gerolf F. Hoflehner, Chu-cheow Lim, Jean-Francois Collard
Method and system for virtual prototyping

Patent number: 7613599

Abstract: An integrated design environment (IDE) is disclosed for forming virtual embedded systems. The IDE includes a design language for forming finite state machine models of hardware components that are coupled to simulators of processor cores, preferably instruction set accurate simulators. A software debugger interface permits a software application to be loaded and executed on the virtual embedded system. A virtual test bench may be coupled to the simulation to serve as a human-machine interface. In one embodiment, the IDE is provided as a web-based service for the evaluation, development and procurement phases of an embedded system project. IP components, such as processor cores, may be evaluated using a virtual embedded system. In one embodiment, a virtual embedded system is used as an executable specification for the procurement of a good or service related to an embedded system.

Type: Grant

Filed: June 1, 2001

Date of Patent: November 3, 2009

Assignee: Synopsys, Inc.

Inventors: Stephen L Bade, Shay Ben-Chorin, Paul Caamano, Marcelo E Montoreano, Ani Taggu, Filip C Theon, Dean C Wills
Compiling method and compiler

Publication number: 20090254892

Abstract: A compiling method for compiling software which is adapted to output an intermediate result at a given timing, the compiling method includes extracting, by a computer, a process block related to parallel processing and conditional branch from a processing sequence included in a source code of a software which is processed time-sequentially, and generating, by the computer, an execution code by restructuring the process block that is extracted.

Type: Application

Filed: June 10, 2009

Publication date: October 8, 2009

Applicant: FUJITSU LIMITED

Inventor: Koichiro Yamashita
Prefetching Irregular Data References for Software Controlled Caches

Publication number: 20090254895

Abstract: Prefetching irregular memory references into a software controlled cache is provided. A compiler analyzes source code to identify at least one of a plurality of loops that contain an irregular memory reference. The compiler determines if the irregular memory reference within the at least one loop is a candidate for optimization. Responsive to an indication that the irregular memory reference may be optimized, the compiler determines if the irregular memory reference is valid for prefetching. Responsive to an indication that the irregular memory reference is valid for prefetching, a store statement for an address of the irregular memory reference is inserted into the at least one loop. A runtime library call is inserted into a prefetch runtime library for the irregular memory reference. Data associated with the irregular memory reference is prefetched into the software controlled cache when the runtime library call is invoked.

Type: Application

Filed: April 4, 2008

Publication date: October 8, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Tong Chen, Marc Gonzalez Tallada, Zehra N. Sura, Tao Zhang
Compiler program, a computer-readable storage medium storing a compiler program, a compiling method and a compiling unit

Patent number: 7590976

Abstract: The present invention relates to a compiler program, a computer-readable storage medium storing such a compiler program, a compiling method and a compiling unit, and an object thereof is to automatically generate a reentrant object program. In order to accomplish this object, an address saving program generator 16a generates an address saving program for saving a data area address of a calling program module; an address setting program generator 16b generates an address setting program for setting a data area address of an other program module; a transferring program generator 16c generates a transferring program for the transfer from the calling program module to the other program module; an address resetting program generator 16d generates an address resetting program for reading and resetting the saved data area address; and an accessing program generator 16e generates an accessing program for accessing a data area for the other program module using a relative address from the set data area address.

Type: Grant

Filed: December 26, 2003

Date of Patent: September 15, 2009

Assignee: Panasonic Corporation

Inventors: Masaki Kawai, Takuji Kawamoto, Shusuke Haruna, Yutaka Fujihara
COMPILER FRAMEWORK FOR SPECULATIVE AUTOMATIC PARALLELIZATION WITH TRANSACTIONAL MEMORY

Publication number: 20090217253

Abstract: A computer program is speculatively parallelized with transactional memory by scoping program variables at compile time, and inserting code into the program at compile time. Determinations of the scoping can be based on whether scalar variables being scoped are involved in inter-loop non-reduction data dependencies, are used outside loops in which they were defined, and at what point in a loop a scalar variable is defined. The inserted code can include instructions for execution at a run time of the program to determine loop boundaries of the program, and issue checkpoint instructions and commit instructions that encompass transaction regions in the program. A transaction region can include an original function of the program and a spin-waiting loop with a non-transactional load, wherein the spin-waiting loop is configured to wait for a previous thread to commit before the current transaction commits.

Type: Application

Filed: February 22, 2008

Publication date: August 27, 2009

Applicant: Sun Microsystems, Inc.

Inventors: Yonghong Song, Xiangyun Kong, Spiros Kalogeropulos, Partha P. Tirumalai
Dependency analysis system and method

Patent number: 7581215

Abstract: We present a technique to perform dependence analysis on more complex array subscripts than the linear form of the enclosing loop indices. For such complex array subscripts, we decouple the original iteration space and the dependence test iteration space and link them through index-association functions. The dependence analysis is performed in the dependence test iteration space to determine whether the dependence exists in the original iteration space. The dependence distance in the original iteration space is determined by the distance in the dependence test iteration space and the property of index-association functions. For certain non-linear expressions, we show how to transform it to a set of linear expressions equivalently. The latter can be used in dependence test with traditional techniques. We also show how our advanced dependence analysis technique can help parallelize some otherwise hard-to-parallelize loops.

Type: Grant

Filed: June 24, 2004

Date of Patent: August 25, 2009

Assignee: Sun Microsystems, Inc.

Inventors: Yonghong Song, Xiangyun Kong
Compiler-scheduled CPU functional testing

Patent number: 7581210

Abstract: One embodiment disclosed relates to a method of compiling a program to be executed on a target microprocessor with multiple functional units of a same type. The method includes opportunistically scheduling a redundant operation on one of the functional units that would otherwise be idle during a cycle.

Type: Grant

Filed: September 10, 2003

Date of Patent: August 25, 2009

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Dale John Shidla, Andrew Harvey Barr, Ken Gary Pomaranski
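
The opportunistic scheduling mentioned in the abstract of patent 7581210 can be illustrated with a toy schedule in which each cycle has one slot per functional unit of the same type. The sketch below fills one idle slot per cycle with a redundant copy of an operation already issued in that cycle; the schedule representation, the "redundant:" naming, and the function name are assumptions, and the later comparison of the two results is not shown.

```python
def fill_idle_units(schedule):
    """Return a new schedule with one idle slot per cycle reused for a redundant op.

    schedule: list of cycles; each cycle is a list with one entry per
    functional unit, holding an operation name or None when the unit is idle.
    """
    filled = []
    for cycle in schedule:
        slots = list(cycle)
        busy = [op for op in slots if op is not None]
        if busy and None in slots:
            # Reuse the otherwise idle unit to repeat a real operation.
            slots[slots.index(None)] = f"redundant:{busy[0]}"
        filled.append(slots)
    return filled


result = fill_idle_units([["add1", "add2"], ["mul1", None], [None, None]])
print(result)
```

Fully occupied cycles and completely empty cycles are left unchanged, so the redundant work only uses capacity that would otherwise be wasted.
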
Method and structure for producing high performance linear algebra routines using preloading of floating point registers

Patent number: 7571435

Abstract: A method (and structure) for executing linear algebra subroutines, includes, for an execution code controlling operation of a floating point unit (FPU) performing the linear algebra subroutine execution, unrolling instructions to preload data into a floating point register (FReg) of the FPU. The unrolling generates an instruction to load data into the FReg and the instruction is inserted into a sequence of instructions that execute the linear algebra subroutine on the FPU.

Type: Grant

Filed: September 29, 2003

Date of Patent: August 4, 2009

Assignee: International Business Machines Corporation

Inventors: Fred Gehrung Gustavson, John A. Gunnels
Search apparatus and search management method for fixed-length data

Patent number: 7565343

Abstract: Fixed-length data (560) contained in a database (560) are segmented into a number of pieces of data that are searchable at a time and searching is performed at high speed. As means for it, a pointer table (500), a secondary pointer table, a local table, and a fixed-length-data table are provided, and when more segmentation is required, a table having a numeric-value comparing function is further provided. As means for performing efficient configuration/management of the tables and for performing management that does not interfere with a search operation, an empty-house table (700), an empty-room table (720), a room-management table (740), and a structure-management table (760) may be provided.

Type: Grant

Filed: March 31, 2004

Date of Patent: July 21, 2009

Assignee: IPT Corporation

Inventor: Shinpei Watanabe
System and method for optimized swing modulo scheduling based on identification of constrained resources

Patent number: 7546592

Abstract: A method, computer program product, and a data processing system for scheduling instructions in a data processing system are provided. Dependencies among a plurality of nodes are analyzed to determine if any of the plurality of nodes uses a constrained resource. Each of the plurality of nodes represents an instruction in a set of instructions. A subset of the plurality of nodes is designated as resource-constrained nodes. An attempt is made to generate a schedule with the subset of the plurality of nodes scheduled with priority with respect to any of the plurality of nodes not included in the subset.

Type: Grant

Filed: July 21, 2005

Date of Patent: June 9, 2009

Assignee: International Business Machines Corporation

Inventor: Allan Russell Martin
Method and Apparatus for Automatic Second-Order Predictive Commoning

Publication number: 20090138864

Abstract: A method and apparatus for automatic second-order predictive commoning is provided by the present invention. During an analysis phase, the intermediate representation of a program code is analyzed to identify opportunities for second-order predictive commoning optimization. The analyzed information is used by the present invention to apply transformations to the program code, such that the number of memory accesses and the number of computations are reduced for loop iterations and performance of program code is improved.

Type: Application

Filed: January 30, 2009

Publication date: May 28, 2009

Applicant: International Business Machines Corporation

Inventors: Arie Tal, Dina Tal
Power-gating instruction scheduling for power leakage reduction

Patent number: 7539884

Abstract: A method of power-gating instruction scheduling for leakage power reduction comprises receiving a program, generating a control-flow graph dividing the program into a plurality of blocks, analyzing utilization of power-gated components of a processor executing the program, generating the first power-gating instruction placement comprising power-off instructions and power-on instructions to shut down the inactive power-gated components, generating the second power-gating instruction placement by merging the power-off instructions as one compound power-off instruction and merging the power-on instructions as one compound power-on instruction, and inserting power-gating instructions into the program in accordance with the second power-gating instruction placement.

Type: Grant

Filed: July 27, 2006

Date of Patent: May 26, 2009

Assignee: Industrial Technology Research Institute

Inventors: Yi-Ping You, Chung Wen Huang, Jeng Kuen Lee, Chi-Lung Wang, Kuo Yu Chuang
System and method for adaptive run-time reconfiguration for a reconfigurable instruction set co-processor architecture

Patent number: 7523449

Abstract: A method for adaptive runtime reconfiguration of a co-processor instruction set, in a computer system with at least a main processor communicatively connected to at least one reconfigurable co-processor, includes the steps of configuring the co-processor to implement an instruction set comprising one or more co-processor instructions, issuing a co-processor instruction to the co-processor, and determining whether the instruction is implemented in the co-processor. For an instruction not implemented in the co-processor instruction set, raising a stall signal to delay the main processor, determining whether there is enough space in the co-processor for the non-implemented instruction, and if there is enough space for said instruction, reconfiguring the instruction set of the co-processor by adding the non-implemented instruction to the co-processor instruction set. The stall signal is cleared and the instruction is executed.

Type: Grant

Filed: August 23, 2006

Date of Patent: April 21, 2009

Assignee: International Business Machines Corporation

Inventors: Sameh W. Asaad, Richard Gerard Hofmann
Program development supporting apparatus, method, program and recording medium

Patent number: 7516481

Abstract: A program development supporting apparatus that groups a plurality of events each executed in an information processor to divide the events into a plurality of parallel execution units to be executed in parallel with each other has a directional graph acquisition section that acquires directional graph data expressing each of the plurality of events as a vertex and a restriction on the execution order between two of the plurality of events as a directional branch, an inverse chain partial set extraction section that traces the directional branch from each event in the forward direction to extract from the directional graph data an inverse partial set that is a combination of the events having such a relationship that any one of the events cannot be reached from the other events, and a parallel execution unit assignment section that assigns the plurality of events belonging to the inverse partial set to units different from each other in the parallel execution units.

Type: Grant

Filed: December 2, 2004

Date of Patent: April 7, 2009

Assignee: International Business Machines Corporation

Inventor: Toshiyuki Fujikura
SIMD instruction sequence generating program, SIMD instruction sequence generating method and apparatus

Patent number: 7509634

Abstract: A translator receives a source code that is described using a process designation (such as a line-by-line process designation, a line data extraction designation, and a broadcast designation) to be performed on line data of an image on a line by line basis, parses and optimizes the source code, and then generates an SIMD macro code that is an intermediate form taking into consideration the use of an SIMD instruction set. A simplifier generates, from the SIMD macro code, a simplified SIMD macro code, namely, a composite macro code into which a series of codes having the relationship between the definition and the reference of the same virtual SIMD register is organized. A machine code generator generates, from the simplified SIMD macro code, a machine code that efficiently uses an SIMD instruction.

Type: Grant

Filed: November 12, 2003

Date of Patent: March 24, 2009

Assignee: NEC Corporation

Inventor: Shorin Kyo
Method and apparatus for determining the profitability of expanding unpipelined instructions

Patent number: 7506331

Abstract: A method, apparatus, and computer instructions for processing instructions. A data dependency graph is built. The data dependency graph is analyzed for recurrences, and unpipelined instructions that lie outside of the recurrences are expanded.

Type: Grant

Filed: August 30, 2004

Date of Patent: March 17, 2009

Assignee: International Business Machines Corporation

Inventors: Roch Georges Archambault, Robert Frederick Enenkel, Robert William Hay, Allan Russell Martin, James Lawrence McInnes, Ronald Ian McIntosh, Mark Peter Mendell
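
The abstract above rests on one test: whether an unpipelined instruction lies on a recurrence (a cycle) of the data dependency graph. The sketch below shows one way to make that test, under the assumption that the graph is given as a successor mapping. The names `nodes_on_recurrences` and `expansion_candidates` are hypothetical, and the profitability analysis named in the title is not modelled.

```python
def nodes_on_recurrences(graph):
    """Return the nodes that lie on a dependence cycle.

    graph: maps a node to the list of nodes that depend on it.
    A node is on a recurrence if it can reach itself.
    """
    def reaches_self(start):
        seen, stack = set(), list(graph.get(start, ()))
        while stack:
            n = stack.pop()
            if n == start:
                return True
            if n not in seen:
                seen.add(n)
                stack.extend(graph.get(n, ()))
        return False

    return {n for n in graph if reaches_self(n)}


def expansion_candidates(graph, unpipelined):
    """Unpipelined instructions outside every recurrence, sorted by name."""
    in_cycle = nodes_on_recurrences(graph)
    return sorted(n for n in unpipelined if n not in in_cycle)


# fdiv2 feeds an accumulation that feeds it back; fdiv1 does not.
graph = {"load": ["fdiv1"], "fdiv1": ["add"], "add": ["fdiv2"], "fdiv2": ["add"]}
candidates = expansion_candidates(graph, {"fdiv1", "fdiv2"})
```

Here only `fdiv1` is reported, because `fdiv2` sits on the `add`/`fdiv2` cycle.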
Method and apparatus for choosing register classes and/or instruction categories

Patent number: 7506326

Abstract: An improved method, apparatus, and computer instructions for generating instructions to process multiple similar expressions. Parameters are identified for the expressions in the original instructions, to form a set of identified parameters typically including the operations performed, the types of data used, and the data sizes. Each type of execution unit that can execute the instructions needed to process the expressions using the set of identified parameters is identified, wherein a set of identified execution unit types is formed. An execution unit type from the set of identified execution unit types is selected to meet a performance goal. The new instructions are generated for the selected execution unit type to process the expressions, and the original instructions for the expressions are discarded.

Type: Grant

Filed: March 7, 2005

Date of Patent: March 17, 2009

Assignee: International Business Machines Corporation

Inventor: Ronald Ian McIntosh
SYSTEMS, METHODS, AND COMPUTER PRODUCTS FOR IMPLEMENTING SHADOW VERSIONING TO IMPROVE DATA DEPENDENCE ANALYSIS FOR INSTRUCTION SCHEDULING

Publication number: 20090064121

Abstract: Systems, methods and computer products for implementing shadow versioning to improve data dependence analysis for instruction scheduling. Exemplary embodiments include a method to identify loops within the code to be compiled, for each loop initializing a dependence matrix, for each loop identifying shadow symbols that are accessed by the loop, examining dependencies, storing, comparing and classifying the dependence vectors, generating new shadow symbols, replacing the old shadow symbols with the new shadow symbols, generating alias relationships between the newly created shadow symbols, scheduling instructions and compiling the code.

Type: Application

Filed: August 29, 2007

Publication date: March 5, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Roch G. Archambault, Yaoqing Gao, Raul E. Silvera, Peng Zhao
Method and apparatus for automatic second-order predictive commoning

Patent number: 7493609

Abstract: A method and apparatus for automatic second-order predictive commoning is provided by the present invention. During an analysis phase, the intermediate representation of a program code is analyzed to identify opportunities for second-order predictive commoning optimization. The analyzed information is used by the present invention to apply transformations to the program code, such that the number of memory accesses and the number of computations are reduced for loop iterations and the performance of the program code is improved.

Type: Grant

Filed: August 30, 2004

Date of Patent: February 17, 2009

Assignee: International Business Machines Corporation

Inventors: Arie Tal, Dina Tal
Pinning internal slack nodes to improve instruction scheduling

Patent number: 7493611

Abstract: A scheduling algorithm is provided for selecting the placement of instructions with internal slack into a schedule of instructions within a loop. The algorithm achieves this by pinning nodes with internal slack to corresponding nodes on the critical path of the code that have similar properties in terms of the data dependency graph, such as earliest time and latest time. The effect is that nodes with internal slack are more often optimally placed in the schedule, reducing the need for rotating registers or register copy instructions. The benefit of the present invention can primarily be seen when performing instruction scheduling or software pipelining on loop code, but can also apply to other forms of instruction scheduling when greater control of placement of nodes with internal slack is desired.

Type: Grant

Filed: August 30, 2004

Date of Patent: February 17, 2009

Assignee: International Business Machines Corporation

Inventor: Allan Russell Martin
Method for register allocation during instruction scheduling

Patent number: 7487336

Abstract: The present disclosure relates to the allocation of registers during the scheduling of instructions, and, more specifically, to the classifying of operands and allocation of registers to local operands.

Type: Grant

Filed: December 12, 2003

Date of Patent: February 3, 2009

Assignee: Intel Corporation

Inventors: Jayashankar Bharadwaj, Tatiana Shpeisman, Ali-Reza Adl-Tabatabai
Method for minimizing spill in code scheduled by a list scheduler

Patent number: 7478379

Abstract: A technique of ordering machine instructions to reduce spill code. For each machine instruction that is ready for scheduling, an amount is determined by which the size of a committed set of machine instructions would increase upon the scheduling of the machine instruction. The machine instruction for which the determined amount is smallest is then scheduled. The currently committed instructions may be determined to be the machine instructions that are already scheduled as well as the machine instructions that are descended from already scheduled machine instructions. The result is that new computations upon which a target processor will embark tend to be deferred. Bit vectors may be employed for efficiency during the assessment of candidate instructions that are ready for scheduling. The technique may be triggered when the risk of registers becoming overcommitted becomes high, as may occur when the number of available processor registers drops below a certain threshold.

Type: Grant

Filed: May 6, 2004

Date of Patent: January 13, 2009

Assignee: International Business Machines Corporation

Inventors: Damien Bonaventure, James Lawrence McInnes
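
The selection rule in the abstract above can be sketched as a greedy list scheduler. This is an illustration under stated assumptions, not the patented implementation: the dependence graph is assumed to be acyclic, Python sets stand in for the bit vectors the abstract mentions, ties are broken by name, and the names `descendants` and `schedule_min_commit` are hypothetical.

```python
def descendants(graph, node):
    """Return every node reachable from `node` in the dependence graph."""
    seen, stack = set(), list(graph.get(node, ()))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(graph.get(n, ()))
    return seen


def schedule_min_commit(graph):
    """Pick, at each step, the ready instruction that grows the committed
    set (scheduled instructions plus their descendants) the least.

    graph: maps an instruction to the instructions that depend on it.
    Assumes the graph is acyclic; ties are broken by name.
    """
    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    preds = {n: set() for n in nodes}
    for n, succs in graph.items():
        for s in succs:
            preds[s].add(n)
    # Each instruction commits itself and everything descended from it.
    reach = {n: descendants(graph, n) | {n} for n in nodes}

    order, scheduled, committed = [], set(), set()
    while len(order) < len(nodes):
        ready = sorted(n for n in nodes - scheduled if preds[n] <= scheduled)
        best = min(ready, key=lambda n: len(reach[n] - committed))
        order.append(best)
        scheduled.add(best)
        committed |= reach[best]
    return order


# Two independent chains: x1 -> x2 -> x3 and y1 -> y2.
order = schedule_min_commit({"x1": ["x2"], "x2": ["x3"], "y1": ["y2"]})
```

With the two chains above, the scheduler starts the shorter chain, finishes it, and only then embarks on the longer one, which is the deferral of new computations that the abstract describes.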
Extension of Swing Modulo Scheduling to Evenly Distribute Uniform Strongly Connected Components

Publication number: 20090013316

Abstract: A method, apparatus, and computer instructions for scheduling instructions for execution. Identify a series of instructions in a loop, wherein the series of instructions has a cyclic data dependency. Determine whether the series of instructions is a uniform series of instructions. Schedule execution of the uniform series of instructions within the loop to optimize execution of the loop in response to the identified series of instructions being the uniform series of instructions.

Type: Application

Filed: September 19, 2008

Publication date: January 8, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Allan Russell Martin
Low-Level Connectivity Admission Control for Networked Consumer Storage Devices

Publication number: 20080313622

Abstract: The present invention relates to a device and a method of accessing a storage of a storage device by reading or writing data to said storage device, wherein said accessing is controlled by an external data controller via a low level in a software stack of said device, and wherein said storage device is accessed without hampering the operation of functionalities at a higher level of said software stack of said device, said device comprises an intermediate storage, and the method comprises the steps of storing commands related to said accessing of data in said intermediate storage as a command queue, executing said commands in said command queue when allowed by a command queue scheduler, said scheduler scheduling in dependence of at least one of the functionalities at said higher level of said software stack. Thereby full control is obtained on storage medium requests by the scheduler.

Type: Application

Filed: May 26, 2005

Publication date: December 18, 2008

Inventor: Jozef Pieter Van Gassel
Methods and apparatus to compile a software program to manage parallel μcaches

Patent number: 7448031

Abstract: Methods and apparatus to compile a software program to manage parallel μcaches are disclosed. In an example method, a compiler attempts to schedule a software program such that load instructions in a first set of load instructions have a first predetermined latency greater than the latency of the first cache. The compiler also marks a second set of load instructions with a latency less than the first predetermined latency to access the first cache. The compiler attempts to schedule the software program such that the load instructions in a third set have at least a second predetermined latency greater than the latency of the second cache. The compiler identifies a fourth set of load instructions in the scheduled software program having less than the second predetermined latency and marks the fourth set of load instructions to access the second cache.

Type: Grant

Filed: December 17, 2003

Date of Patent: November 4, 2008

Assignee: Intel Corporation

Inventor: Youfeng Wu
Recoverable return code tracking and notification for autonomic systems

Patent number: 7447732

Abstract: A system, method and article of manufacture for return code management in autonomic systems, and more particularly for managing execution of operations in data processing systems on the basis of return code tracking. One embodiment provides a method for managing execution of an operation in a data processing system. The method comprises tracking return codes received from previous executions of the operation in the data processing system, determining an execution behavior of the operation from the tracked return codes, and managing a subsequent execution of the operation on the basis of the determined execution behavior.

Type: Grant

Filed: May 23, 2003

Date of Patent: November 4, 2008

Assignee: International Business Machines Corporation

Inventors: Eric L. Barsness, John M. Santosuosso
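
The three steps named in the abstract above (track return codes, derive an execution behavior, manage the next execution) can be illustrated with a small sketch. Everything specific here is an assumption made for the example and is not taken from the patent: the class name `ReturnCodeTracker`, the convention that return code 0 means success, the failure-rate threshold, and the minimum sample count.

```python
from collections import Counter, defaultdict


class ReturnCodeTracker:
    """Track return codes per operation and advise on the next execution."""

    def __init__(self, failure_threshold=0.5, min_samples=3):
        self.history = defaultdict(Counter)  # operation -> Counter of codes
        self.failure_threshold = failure_threshold
        self.min_samples = min_samples

    def record(self, operation, return_code):
        """Store the return code of one finished execution."""
        self.history[operation][return_code] += 1

    def behavior(self, operation):
        """Classify the operation from its tracked return codes."""
        counts = self.history[operation]
        total = sum(counts.values())
        if total < self.min_samples:
            return "unknown"
        failures = total - counts[0]  # assumption: 0 means success
        if failures / total > self.failure_threshold:
            return "unreliable"
        return "reliable"

    def should_run(self, operation):
        """Decide whether a subsequent execution should go ahead."""
        return self.behavior(operation) != "unreliable"


tracker = ReturnCodeTracker()
for code in (1, 1, 0):
    tracker.record("backup", code)
```

After two failures in three runs, `tracker.should_run("backup")` returns `False`, while an operation with no history is still allowed to run.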
Extension of swing modulo scheduling to evenly distribute uniform strongly connected components

Patent number: 7444628

Abstract: A method, apparatus, and computer instructions for scheduling instructions for execution. Identify a series of instructions in a loop, wherein the series of instructions has a cyclic data dependency. Determine whether the series of instructions is a uniform series of instructions. Schedule execution of the uniform series of instructions within the loop to optimize execution of the loop in response to the identified series of instructions being the uniform series of instructions.

Type: Grant

Filed: August 30, 2004

Date of Patent: October 28, 2008

Assignee: International Business Machines Corporation

Inventor: Allan Russell Martin
