Including Scheduling Instructions Patents (Class 717/161)

Compiler for X86-based many-core coprocessors

Patent number: 8918770

Abstract: A system and method for compiling includes, for a parallelizable code portion of an application stored on a computer readable storage medium, determining one or more variables that are to be transferred to and/or from a coprocessor if the parallelizable code portion were to be offloaded. A start location and an end location are determined for at least one of the one or more variables as a size in memory. The parallelizable code portion is transformed by inserting an offload construct around the parallelizable code portion and passing the one or more variables and the size as arguments of the offload construct such that the parallelizable code portion is offloaded to a coprocessor at runtime.

Type: Grant

Filed: August 24, 2012

Date of Patent: December 23, 2014

Assignee: NEC Laboratories America, Inc.

Inventors: Nishkam Ravi, Tao Bao, Ozcan Ozturk, Srimat Chakradhar
Systems and methods for using data archiving to expedite server migration

Patent number: 8898115

Abstract: A computer-implemented method for using data archiving to expedite server migration may include: 1) archiving data from at least one source computing system to an archiving system in accordance with an archiving policy, 2) altering metadata associated with the archived data on the archiving system so that the metadata references a desired target computing system instead of the source computing system, and then, upon bringing the target computing system online, 3) restoring at least a portion of the archived data from the archiving system to the target computing system. Various other methods, systems, and configured computer-readable media are also disclosed.

Type: Grant

Filed: February 12, 2013

Date of Patent: November 25, 2014

Assignee: Symantec Corporation

Inventors: Laxmikant Gunda, Praveen Rakshe
Methodology for fast detection of false sharing in threaded scientific codes

Patent number: 8898648

Abstract: A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.

Type: Grant

Filed: November 30, 2012

Date of Patent: November 25, 2014

Assignee: International Business Machines Corporation

Inventors: I-Hsin Chung, Guojing Cong, Hiroki Murata, Yasushi Negishi, Hui-Fang Wen
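
The final check described in the abstract of patent 8898648 (whether different portions of one cache line are accessed) can be illustrated with a small trace analysis. This is only a sketch, not the patented tool chain: the trace format `(thread_id, address, size, is_write)`, the 64-byte line size, and the name `find_false_sharing` are assumptions made for illustration.

```python
from collections import defaultdict

CACHE_LINE_BYTES = 64  # assumed line size; real hardware varies


def find_false_sharing(accesses, line_bytes=CACHE_LINE_BYTES):
    """Return cache-line indices where different threads touch disjoint
    byte ranges of the same line and at least one of them writes.

    accesses: iterable of (thread_id, address, size, is_write) tuples.
    """
    per_line = defaultdict(list)
    for thread_id, address, size, is_write in accesses:
        first = address // line_bytes
        last = (address + size - 1) // line_bytes
        for line in range(first, last + 1):
            # Clip the access to the part that falls inside this line.
            lo = max(address, line * line_bytes)
            hi = min(address + size, (line + 1) * line_bytes)
            per_line[line].append((thread_id, lo, hi, is_write))

    flagged = set()
    for line, records in per_line.items():
        for i, (t1, lo1, hi1, w1) in enumerate(records):
            for t2, lo2, hi2, w2 in records[i + 1:]:
                disjoint = hi1 <= lo2 or hi2 <= lo1
                if t1 != t2 and disjoint and (w1 or w2):
                    flagged.add(line)
    return sorted(flagged)


trace = [
    (0, 0, 8, True),    # thread 0 writes bytes 0..7 of line 0
    (1, 8, 8, True),    # thread 1 writes bytes 8..15 of line 0
    (0, 128, 8, True),  # line 2, only thread 0
    (1, 192, 8, True),  # line 3, only thread 1
]
print(find_false_sharing(trace))
```

Overlapping accesses by two threads are deliberately not flagged, since that is true sharing rather than false sharing.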
Method and apparatus for register spill minimization

Patent number: 8893104

Abstract: The aspects enable a computing device to allocate memory space to variables during runtime compilation of a software application. A compiler may be modified to identify operations that can be performed on either a main pipe or an alternative pipe, identify chains of related operations that can be performed on either the main pipe or the alternative pipe, identify points in the execution of code at which the number of live values will exceed the number of registers, and choose a chain of operations as a candidate to be moved to the alternative pipe in order to reduce the number of live values at identified points in the execution of code. The entire chosen chain of operations may be moved to the alternative pipe. The alternative pipe may perform the computations and return the results to the main pipe for execution.

Type: Grant

Filed: March 1, 2012

Date of Patent: November 18, 2014

Assignee: QUALCOMM Incorporated

Inventors: Christopher A. Vick, Gregory M. Wright
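
One step in the abstract of patent 8893104 is finding points at which the number of live values exceeds the number of registers. For straight-line code this can be sketched with a backward liveness scan. The instruction format `(dest, sources)` and the function name `pressure_points` are hypothetical, and the sketch covers only the identification step, not the move to an alternative pipe.

```python
def pressure_points(instrs, num_regs, live_out=()):
    """Return indices of instructions whose live-in set exceeds num_regs.

    instrs: list of (dest, sources) pairs for straight-line code.
    live_out: values still needed after the last instruction.
    """
    live = set(live_out)
    over = []
    # Walk backwards: a value is live from its last use up to its definition.
    for idx in range(len(instrs) - 1, -1, -1):
        dest, sources = instrs[idx]
        live.discard(dest)    # defined here, so not live before this point
        live.update(sources)  # used here, so live before this point
        if len(live) > num_regs:
            over.append(idx)
    return sorted(over)


code = [
    ("a", []),
    ("b", []),
    ("c", []),
    ("d", ["a", "b"]),
    ("e", ["c", "d"]),
]
print(pressure_points(code, num_regs=2, live_out={"e"}))
```

Just before instruction 3 the values `a`, `b`, and `c` are all live, which is one more than the two registers assumed here.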
Methods for generating code for an architecture encoding an extended register specification

Patent number: 8893095

Abstract: There are provided methods and computer program products for generating code for an architecture encoding an extended register specification. A method for generating code for a fixed-width instruction set includes identifying a non-contiguous register specifier. The method further includes generating a fixed-width instruction word that includes the non-contiguous register specifier.

Type: Grant

Filed: July 26, 2012

Date of Patent: November 18, 2014

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, Robert Kevin Montoye, Brett Olsson, John-David Wellman
Apparatus and method for scheduling instruction

Patent number: 8869129

Abstract: An apparatus and method for scheduling an instruction are provided. The apparatus includes an analyzer configured to analyze dependency of a plurality of recurrence loops and a scheduler configured to schedule the recurrence loops based on the analyzed dependencies. When scheduling a plurality of recurrence loops, the apparatus first schedules a dominant loop whose loop head has no dependency on another loop among the recurrence loops.

Type: Grant

Filed: November 2, 2009

Date of Patent: October 21, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventors: Tae-wook Oh, Won-sub Kim, Bernhard Egger
Systems and methods for using provenance information for data retention in stream-processing

Patent number: 8856313

Abstract: A system and method for determining data usage based on provenance information, in a stream-processing system, includes progressively setting usage information for output stream data objects (SDOs), determining input SDOs that an output SDO depends on, based on a provenance dependency function; recursively feeding back the usage information for a subset of SDOs that can be discarded; and discarding the subset of SDOs. A system and method for data retention based on usage information, in a stream-processing system, includes managing retention of SDOs by deleting SDOs that are determined to be of null usage; and enhancing retention characteristics of SDOs that are deemed to have usage.

Type: Grant

Filed: November 13, 2007

Date of Patent: October 7, 2014

Assignee: International Business Machines Corporation

Inventors: Lisa Amini, Chitra Venkatramani
System using a unique marker with each software code-block

Patent number: 8850410

Abstract: A system and method for improving software maintainability, performance, and/or security by associating a unique marker with each software code-block; the system comprising a plurality of processors, a plurality of code-blocks, and a marker associated with each code-block. The system may also include a special hardware register (code-block marker hardware register) in each processor for identifying the markers of the code-blocks executed by the processor, without changing any of the plurality of code-blocks.

Type: Grant

Filed: January 29, 2010

Date of Patent: September 30, 2014

Assignee: International Business Machines Corporation

Inventors: Ramanjaneya S. Burugula, Joefon Jann, Pratap C. Pattnaik
Compiling code for an enhanced application binary interface (ABI) with decode time instruction optimization

Patent number: 8832669

Abstract: Generating decode time instruction optimization (DTIO) object code that enables a DTIO enabled processor to optimize execution of DTIO instructions. A code sequence configured to facilitate DTIO in a DTIO enabled processor is identified by a computer. The code sequence includes an internal representation (IR) of a first instruction and an IR of a second instruction. The second instruction is dependent on the first instruction. A schedule associated with at least one of the IR of the first instruction and the IR of the second instruction is modified. The modifying includes generating a modified schedule that is configured to place the first instruction next to the second instruction. An object file is generated based on the modified schedule. The object file includes the first instruction placed next to the second instruction. The object file is emitted.

Type: Grant

Filed: August 5, 2013

Date of Patent: September 9, 2014

Assignee: International Business Machines Corporation

Inventors: Robert J. Blainey, Michael K. Gschwind, James L. McInnes, Steven J. Munroe
Processor cooling management

Patent number: 8831791

Abstract: Illustrative embodiments provide a computer implemented method, a data processing system, and a computer program product for adjusting cooling settings. The computer implemented method comprises analyzing a set of instructions of an application to determine a number of degrees by which a set of instructions will raise a temperature of at least one processor core. The computer implemented method further calculates a cooling setting for at least one cooling system for the at least one processor core. The computer implemented method adjusts the at least one cooling system based on the cooling setting. The step of analyzing the set of instructions is performed before the set of instructions is executed on the at least one processor core. The step of adjusting the at least one cooling system is performed before the set of instructions is executed on the at least one processor core.

Type: Grant

Filed: June 21, 2012

Date of Patent: September 9, 2014

Assignee: International Business Machines Corporation

Inventors: Robert L. Angell, David W. Cosby, Robert R. Friedlander, James R. Kraemer
Memory disambiguation hardware to support software binary translation

Patent number: 8826257

Abstract: A method of memory disambiguation hardware to support software binary translation is provided. This method includes unrolling a set of instructions to be executed within a processor, the set of instructions having a number of memory operations. An original relative order of memory operations is determined. Then, possible reordering problems are detected and identified in software. A reordering problem occurs when a first memory operation has been reordered ahead of a second memory operation, relative to the original order of memory operations, and aliases to it. The reordering problem is addressed and a relative order of memory operations is communicated to the processor.

Type: Grant

Filed: March 30, 2012

Date of Patent: September 2, 2014

Assignee: Intel Corporation

Inventors: Muawya M. Al-Otoom, Paul Caprioli, Abhay S. Kanhere, Arvind Krishnaswamy, Omar M. Shaikh
Behavior invariant optimization of maximum execution times for model simulation

Patent number: 8819618

Abstract: A device receives a model that includes model elements scheduled to execute in time slots on a hardware device. The device identifies time slots, of the time slots, that are unoccupied or underutilized by the model elements, and identifies a set of model elements that can be moved to the unoccupied time slots without affecting a behavior of the model. The device calculates a combined execution time of the model elements, determines whether the combined execution time of the model elements is less than or equal to a duration of a first time slot of the time slots, and schedules the model elements for execution in the first time slot when the combined execution time of the model elements is less than or equal to the duration of the first time slot.

Type: Grant

Filed: September 26, 2012

Date of Patent: August 26, 2014

Assignee: The MathWorks, Inc.

Inventors: David MacLay, Matej Urbas
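
The slot test in the abstract of patent 8819618 is a comparison between a combined execution time and a slot duration. A minimal sketch, assuming each model element has a known worst-case execution time in microseconds (the element names and the helper `schedule_into_slot` are invented for illustration):

```python
def schedule_into_slot(element_times_us, slot_duration_us):
    """Return the element names if their combined time fits the slot, else None.

    element_times_us: dict mapping element name -> execution time (microseconds).
    """
    combined = sum(element_times_us.values())
    if combined <= slot_duration_us:
        return list(element_times_us)
    return None


movable = {"gain": 200, "filter": 500}
print(schedule_into_slot(movable, slot_duration_us=1000))
print(schedule_into_slot(movable, slot_duration_us=600))
```

Integer microseconds are used so that the less-than-or-equal comparison is exact.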
Dynamic optimization of mobile services

Patent number: 8813044

Abstract: A method, system, and article of manufacture are disclosed for transforming a definition of a process for delivering a service. This service process definition is comprised of computer readable code. The method comprises the steps of expressing a given set of assumptions in a computer readable code; and transforming said process definition by using a processing unit to apply said assumptions to said process definition to change the configuration of the process definition. The process definition may be transformed by using factors relating to the specific context in or for which the process definition is executed. Also, the process definition may be transformed by identifying, in a flow diagram for the service process definition, flows to which the assumptions apply, and applying program rewriting techniques to those identified flows.

Type: Grant

Filed: September 6, 2012

Date of Patent: August 19, 2014

Assignee: International Business Machines Corporation

Inventors: David F. Bantz, Steven J. Mastrianni, James R. Moulic, Dennis G. Shea
Program generation device, program production method, and program

Patent number: 8806466

Abstract: A program generation apparatus references a source program including a loop for executing a block N times (N≥2) and having such dependence that a variable defined in a statement in the block pertaining to ith execution (1≤i<N) is referenced by a statement in the block pertaining to jth execution (i<j≤N), calculates equivalent representations of variables in the block pertaining to the ith execution and the block pertaining to any other execution than the ith execution, specifies, with respect to each representation of a target variable causing the dependence, a representation of a variable not causing the dependence that is equivalent to the representation of the target variable, and generates a program being for executing the block M times (M?N) and including a statement including the specified representation in place of each representation of the target variable.

Type: Grant

Filed: July 4, 2011

Date of Patent: August 12, 2014

Assignee: Panasonic Corporation

Inventors: Akira Tanaka, Hiroyuki Morishita, Akihiko Inoue
Dynamic optimization of mobile services

Patent number: 8769507

Abstract: A method, system, and article of manufacture are disclosed for transforming a definition of a process for delivering a service on a specified computing device. This service process definition is comprised of computer readable code. The method comprises the steps of expressing a given set of assumptions in a computer readable code; and transforming the definition by using a processing unit to apply the assumptions to the definition of the process to change the way in which the process operates. The definition of the process may be transformed by using factors relating to the specific context in or for which the definition is executed. Also, the definition may be transformed by identifying, in a flow diagram for the process, flows to which the assumptions apply, and applying program rewriting techniques to those identified flows.

Type: Grant

Filed: May 14, 2009

Date of Patent: July 1, 2014

Assignee: International Business Machines Corporation

Inventors: David F. Bantz, Steven J. Mastrianni, James R. Moulic, Dennis G. Shea
Throughput-aware software pipelining for highly multi-threaded systems

Patent number: 8752036

Abstract: Embodiments of the invention provide systems and methods for throughput-aware software pipelining in compilers to produce optimal code for single-thread and multi-thread execution on multi-threaded systems. A loop is identified within source code as a candidate for software pipelining. An attempt is made to generate pipelined code (e.g., generate an instruction schedule and a set of register assignments) for the loop in satisfaction of throughput-aware pipelining criteria, like maximum register count, minimum trip count, target core pipeline resource utilization, maximum code size, etc. If the attempt fails to generate code in satisfaction of the criteria, embodiments adjust one or more settings (e.g., by reducing scalarity or latency settings being used to generate the instruction schedule).

Type: Grant

Filed: October 31, 2011

Date of Patent: June 10, 2014

Assignee: Oracle International Corporation

Inventors: Spiros Kalogeropulos, Partha Tirumalai
Scheduler of reconfigurable array, method of scheduling commands, and computing apparatus

Patent number: 8745608

Abstract: A scheduler of a reconfigurable array, a method of scheduling commands, and a computing apparatus are provided. To perform a loop operation in a reconfigurable array, a recurrence node, a producer node, and a predecessor node are detected from a data flow graph of the loop operation such that resources are assigned to such nodes so as to increase the loop operating speed. Also, a dedicated path having a fixed delay may be added to the assigned resources.

Type: Grant

Filed: February 1, 2010

Date of Patent: June 3, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventors: Won-sub Kim, Tae-wook Oh, Bernhard Egger
Method and system for implementing parallel execution in a computing system and in a circuit simulator

Patent number: 8738348

Abstract: A method and mechanism for implementing a general purpose scripting language that supports parallel execution is described. In one approach, parallel execution is provided in a seamless and high-level approach rather than requiring or expecting a user to have low-level programming expertise with parallel processing languages/functions. Also described is a system and method for performing circuit simulation. The present approach provides methods and systems that create reusable and independent measurements for use with circuit simulators. Also disclosed are parallelizable measurements having looping constructs that can be run without interference between parallel iterations. Reusability is enhanced by having parameterized measurements. Revisions and history of the operating parameters of circuit designs subject to simulation are tracked.

Type: Grant

Filed: June 15, 2012

Date of Patent: May 27, 2014

Assignee: Cadence Design Systems, Inc.

Inventor: Kenneth S. Kundert
File cloning and de-cloning in a data storage system

Patent number: 8738570

Abstract: A file cloning mechanism allows for quickly creating copies (clones) of files within a filesystem, such as when a user makes a copy of a file. In exemplary embodiments, a clone of a source object is at least initially represented by a structure containing references to various elements of the source object (e.g., indirect onodes, direct onodes, and data blocks). Both read-only and mutable clones can be created. The source file and the clone initially share such elements and continue to share unmodified elements as changes are made to the source file or mutable clone. None of the user data blocks or the metadata blocks describing the data stream (i.e., the indirect/direct onodes) associated with the source file need to be copied at the time the clone is created. At appropriate times, cloned files may be “de-cloned.”

Type: Grant

Filed: November 21, 2011

Date of Patent: May 27, 2014

Assignee: Hitachi Data Systems Engineering UK Limited

Inventors: Daniel J. N. Picken, Neil Berrington
Pipelined loop parallelization with pre-computations

Patent number: 8726251

Abstract: Embodiments of the invention provide systems and methods for automatically parallelizing loops with non-speculative pipelined execution of chunks of iterations with pre-computation of selected values. Non-DOALL loops are identified and divided into chunks. The chunks are assigned to separate logical threads, which may be further assigned to hardware threads. As a thread performs its runtime computations, subsequent threads attempt to pre-compute their respective chunks of the loop. These pre-computations may result in a set of assumed initial values and pre-computed final variable values associated with each chunk. As subsequent pre-computed chunks are reached at runtime, those assumed initial values can be verified to determine whether to proceed with runtime computation of the chunk or to avoid runtime execution and instead use the pre-computed final variable values.

Type: Grant

Filed: March 29, 2011

Date of Patent: May 13, 2014

Assignee: Oracle International Corporation

Inventors: Spiros Kalogeropulos, Partha Pal Tirumalai
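
The verify-or-recompute idea in the abstract of patent 8726251 can be sketched for a loop with a single carried value. The sketch below runs both phases sequentially in one thread purely to show the control flow; the names `run_chunk`, `pipelined_loop`, and `guess_initial` are assumptions, and a real implementation would run the pre-computation phase on other threads.

```python
def run_chunk(initial, chunk, step):
    """Execute one chunk of iterations starting from a given carried value."""
    value = initial
    for item in chunk:
        value = step(value, item)
    return value


def pipelined_loop(items, initial, step, chunk_size, guess_initial):
    """Return (final_value, number_of_chunks_whose_precomputed_result_was_reused)."""
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]

    # Pre-computation phase: each chunk is run from an assumed initial value.
    precomputed = []
    for index, chunk in enumerate(chunks):
        assumed = initial if index == 0 else guess_initial(index)
        precomputed.append((assumed, run_chunk(assumed, chunk, step)))

    # Runtime phase: verify each assumption, reuse or recompute.
    value, reused = initial, 0
    for chunk, (assumed, final) in zip(chunks, precomputed):
        if assumed == value:
            value = final  # assumption held, skip the chunk
            reused += 1
        else:
            value = run_chunk(value, chunk, step)
    return value, reused


# Carried value: a flag that becomes 1 once any negative item is seen.
sticky = lambda flag, x: 1 if x < 0 else flag
data = [1, 2, 3, 4, -5, 6, 7, 8]
print(pipelined_loop(data, 0, sticky, chunk_size=2, guess_initial=lambda i: 0))
```

In this example the guess of 0 holds for the first three chunks and fails only for the last one, which is then recomputed from the verified value.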
Accelerating generic loop iterators using speculative execution

Patent number: 8701099

Abstract: A method, a system and a computer program product for effectively accelerating loop iterators using speculative execution of iterators. An Efficient Loop Iterator (ELI) utility detects initiation of a target program and initiates/spawns a speculative iterator thread at the start of the basic code block ahead of the code block that initiates a nested loop. The ELI utility assigns the iterator thread to a dedicated processor in a multi-processor system. The speculative thread runs/executes ahead of the execution of the nested loop and calculates indices in a corresponding multidimensional array. The iterator thread adds all the precomputed indices to a single queue. As a result, the ELI utility effectively enables a multidimensional loop to be replaced by a single dimensional loop. At the beginning of (or during) each iteration of the iterator, the ELI utility “dequeues” an entry from the queue to use the entry to access the array upon which the ELI utility iterates.

Type: Grant

Filed: November 2, 2010

Date of Patent: April 15, 2014

Assignee: International Business Machines Corporation

Inventors: Ganesh Bikshandi, Dibyendu Das, Smruti Ranjan Sarangi
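
The queue-based arrangement in the abstract of patent 8701099 can be approximated with a producer thread that enumerates the indices of a two-dimensional array ahead of a single consumer loop. This is a rough sketch using Python's standard `threading` and `queue` modules; `iterate_flat` and the sentinel `DONE` are illustrative names, and the array is assumed to be non-empty and rectangular.

```python
import queue
import threading


def iterate_flat(array):
    """Visit a rectangular 2-D list through one flat loop fed by an index queue."""
    rows, cols = len(array), len(array[0])
    indices = queue.Queue()
    DONE = object()  # sentinel marking the end of the index stream

    def producer():
        # Iterator thread: runs ahead and precomputes every (i, j) index.
        for i in range(rows):
            for j in range(cols):
                indices.put((i, j))
        indices.put(DONE)

    worker = threading.Thread(target=producer)
    worker.start()

    visited = []
    while True:  # single-dimensional consumer loop
        entry = indices.get()
        if entry is DONE:
            break
        i, j = entry
        visited.append(array[i][j])
    worker.join()
    return visited


print(iterate_flat([[1, 2, 3], [4, 5, 6]]))
```

The consumer never computes an index itself; it only dequeues entries, which is the sense in which the nested loop is replaced by a flat one.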
Scheduling of instructions

Patent number: 8689202

Abstract: A method of automatically extracting information from an architecture description. A memory resident directed acyclic graph data structure comprising nodes representing instructions and edges whose weights represent dependencies between pairs of instructions is constructed. A list of ready nodes in the directed acyclic graph is maintained. A list of nodes not yet scheduled is also maintained. It is then determined whether the next instruction to be scheduled is to be taken from the list of ready nodes or from the list of nodes not yet scheduled.

Type: Grant

Filed: March 30, 2005

Date of Patent: April 1, 2014

Assignee: Synopsys, Inc.

Inventors: Gunnar Braun, Andreas Hoffmann, Volker Grieve, Manuel Hohenauer, Rainer Leupers
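
The data structures named in the abstract of patent 8689202 (a dependency DAG with weighted edges, a ready list, and a list of unscheduled nodes) are the ingredients of textbook list scheduling. The sketch below is a generic single-issue list scheduler, not the patented selection rule; the function name, the edge format `(pred, succ) -> latency`, and the example instructions are assumptions.

```python
def list_schedule(instructions, edges):
    """Assign an issue cycle to each instruction on a single-issue machine.

    instructions: names in priority order.
    edges: dict mapping (pred, succ) -> latency in cycles; must be acyclic.
    """
    preds = {name: [] for name in instructions}
    for (pred, succ), latency in edges.items():
        preds[succ].append((pred, latency))

    unscheduled = list(instructions)
    issue = {}
    cycle = 0
    while unscheduled:
        # Ready: every predecessor has issued and its latency has elapsed.
        ready = [
            name for name in unscheduled
            if all(p in issue and issue[p] + lat <= cycle for p, lat in preds[name])
        ]
        if ready:
            chosen = ready[0]  # highest-priority ready node
            issue[chosen] = cycle
            unscheduled.remove(chosen)
        cycle += 1  # an empty ready list means a stall cycle
    return issue


program = ["load_a", "load_b", "add", "store"]
deps = {("load_a", "add"): 2, ("load_b", "add"): 2, ("add", "store"): 1}
print(list_schedule(program, deps))
```

Cycle 2 is a stall here: `add` must wait two cycles after `load_b`, which issued in cycle 1.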
Data dependence testing for loop fusion with code replication, array contraction, and loop interchange

Patent number: 8677338

Abstract: Methods and apparatus for data dependence testing for loop fusion, e.g., with code replication, array contraction, and/or loop interchange, are described. In one embodiment, a compiler may optimize code for efficient execution during run-time by testing for dependencies associated with improving memory locality through code replication in loops that enable various loop transformations. Other embodiments are also described.

Type: Grant

Filed: June 4, 2008

Date of Patent: March 18, 2014

Assignee: Intel Corporation

Inventors: John L. Ng, Rakesh Krishnaiyer, Alexander Y. Ostanevich
Dynamic optimization using a resource cost registry

Patent number: 8635606

Abstract: Technologies are generally described for runtime optimization adjusted dynamically according to changing costs of one or more system resources. Multicore systems may encounter dynamic variations in performance associated with the relative cost of related system resources. Furthermore, multicore systems can experience dramatic variations in resource availability and costs. A dynamic registry of system resource costs can be utilized to guide dynamic optimization. The relative scarcity of each resource can be updated dynamically within the registry of system resource costs. A runtime code generating loader and optimizer may be adapted to adjust optimization according to the resource cost registry. Information regarding system resource costs can support optimization tradeoffs based on resource cost functions.

Type: Grant

Filed: October 13, 2009

Date of Patent: January 21, 2014

Assignee: Empire Technology Development LLC

Inventor: Ezekiel John Joseph Kruglick
Method, medium and apparatus storing and restoring register context for fast context switching between tasks

Patent number: 8635627

Abstract: A method, medium and apparatus for storing and restoring a register context for a fast context switching between tasks is disclosed. The method, medium and apparatus may improve overall operating speed of a system by increasing the speed of context switching. The method may include adding an update code for updating information of live registers to a task file that includes a code of a task to perform a specified function, converting the task file having the update code added thereto into a run file, updating the information of the live registers with the update code during running of the task using the run file, and storing a live register context according to the updated information of the registers.

Type: Grant

Filed: December 12, 2006

Date of Patent: January 21, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventors: Jung-keun Park, Keun-soo Yim, Woon-gee Kim, Jeong-joon Yoo, Kyoung-ho Kang, Chae-seok Im, Jae-don Lee
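
The benefit claimed in the abstract of patent 8635627 comes from saving only the registers that are live at the switch point. A minimal sketch, assuming the live-register information is kept as a bitmask in which bit i set means register i is live (the bitmask encoding and both function names are assumptions):

```python
def save_context(registers, live_mask):
    """Return {register index: value} for registers whose bit is set in live_mask."""
    return {
        index: value
        for index, value in enumerate(registers)
        if (live_mask >> index) & 1
    }


def restore_context(registers, saved):
    """Write saved values back in place; registers that were not live are untouched."""
    for index, value in saved.items():
        registers[index] = value


regs = [10, 20, 30, 40]
saved = save_context(regs, live_mask=0b0101)  # r0 and r2 are live
print(saved)

fresh = [0, 0, 0, 0]
restore_context(fresh, saved)
print(fresh)
```

With two of four registers live, the saved context is half the size of a full register dump.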
Parallel dynamic optimization

Patent number: 8627300

Abstract: Technologies are generally described for parallel dynamic optimization using multicore processors. A runtime compiler may be adapted to generate multiple instances of executable code from a portable intermediate software module. The various instances of executable code may be generated with variations of optimization parameters such that the code instances each express different optimization attempts. A multicore processor may be leveraged to simultaneously execute some, or all, of the various code instances. Preferred optimization parameters may be determined from the executable code instances that may correctly complete in the least time, or may use the least amount of memory, or that may prove superior according to some other fitness metric. Preferred optimization parameters may be used to seed future optimization attempts. Output generated from the preferred instances may be used as soon as the first instance correctly completes block.

Type: Grant

Filed: October 13, 2009

Date of Patent: January 7, 2014

Assignee: Empire Technology Development LLC

Inventor: Ezekiel John Joseph Kruglick
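
The selection loop in the abstract of patent 8627300 (run several differently optimized instances at once, keep the one that completes correctly in the least time) can be mimicked with Python's standard `concurrent.futures`. This is only an analogy: the variants below are ordinary Python functions rather than compiler output, and `pick_preferred`, `timed_run`, and `is_correct` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(variant, data):
    """Run one variant and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = variant(data)
    return result, time.perf_counter() - start


def pick_preferred(variants, data, is_correct):
    """Run all variants concurrently; return the name of the fastest correct one."""
    with ThreadPoolExecutor(max_workers=len(variants)) as pool:
        futures = {name: pool.submit(timed_run, fn, data) for name, fn in variants.items()}
        outcomes = {name: future.result() for name, future in futures.items()}
    correct = {
        name: elapsed
        for name, (result, elapsed) in outcomes.items()
        if is_correct(result)
    }
    return min(correct, key=correct.get) if correct else None


def loop_sum(values):
    total = 0
    for v in values:
        total += v
    return total


variants = {"builtin": sum, "loop": loop_sum, "broken": lambda values: -1}
numbers = list(range(1000))
winner = pick_preferred(variants, numbers, is_correct=lambda r: r == 499500)
print(winner)
```

Which correct variant wins depends on timing and may differ between runs; the incorrect variant is never selected. The winning name could then seed the next round, in the spirit of the abstract's "preferred optimization parameters".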
Compiling code for an enhanced application binary interface (ABI) with decode time instruction optimization

Patent number: 8615746

Abstract: Compiling code for an enhanced application binary interface (ABI) including identifying, by a computer, a code sequence configured to perform a variable address reference table function including an access to a variable at an offset outside of a location in a variable address reference table. The code sequence includes an internal representation (IR) of a first instruction and an IR of a second instruction. The second instruction is dependent on the first instruction. A scheduler cost function associated with at least one of the IR of the first instruction and the IR of the second instruction is modified. The modifying includes generating a modified scheduler cost function that is configured to place the first instruction next to the second instruction. An object file is generated responsive to the modified scheduler cost function. The object file includes the first instruction placed next to the second instruction. The object file is emitted.

Type: Grant

Filed: April 30, 2012

Date of Patent: December 24, 2013

Assignee: International Business Machines Corporation

Inventors: Robert J. Blainey, Michael K. Gschwind, James L. McInnes, Steven J. Munroe
Compiling code for an enhanced application binary interface (ABI) with decode time instruction optimization

Patent number: 8615745

Abstract: A code sequence made up of multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or has fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.

Type: Grant

Filed: October 3, 2011

Date of Patent: December 24, 2013

Assignee: International Business Machines Corporation

Inventors: Robert J. Blainey, Michael Gschwind, James L. McInnes, Steven J. Munroe
Program converting apparatus and program conversion method

Patent number: 8612958

Abstract: A compiler, which corresponds to a recent processor having a multithread function, that enables execution of efficient instruction scheduling and allows a programmer to control the instruction scheduling includes: an instruction scheduling directive receiving unit which receives, from a programmer, a directive for specifying an instruction scheduling method; and an instruction scheduling unit which executes, conforming to one of instruction scheduling methods, instruction scheduling of rearranging intermediate codes corresponding to the source program. The instruction scheduling unit selects one of instruction scheduling methods according to the directive received by the instruction scheduling directive receiving unit, and executes instruction scheduling conforming to the selected instruction scheduling method.

Type: Grant

Filed: June 17, 2011

Date of Patent: December 17, 2013

Assignee: Panasonic Corporation

Inventors: Taketo Heishi, Shohei Michimoto, Teruo Kawabata
Wavescalar architecture having a wave order memory

Patent number: 8612955

Abstract: A dataflow instruction set architecture and execution model, referred to as WaveScalar, which is designed for scalable, low-complexity/high-performance processors, while efficiently providing traditional memory semantics through a mechanism called wave-ordered memory. Wave-ordered memory enables “real-world” programs, written in any language, to be run on the WaveScalar architecture, as well as any out-of-order execution unit. Because it is software-controlled, wave-ordered memory can be disabled to obtain greater parallelism. Wavescalar also includes a software-controlled tag management system.

Type: Grant

Filed: January 22, 2008

Date of Patent: December 17, 2013

Assignee: University of Washington

Inventors: Mark H. Oskin, Steven J. Swanson, Susan J. Eggers
Scheduling multithreaded programming instructions based on dependency graph

Patent number: 8612957

Abstract: A computer implemented method for scheduling multithreaded programming instructions based on a dependency graph, wherein the dependency graph organizes the programming instructions logically based on blocks, nodes, and super blocks, and wherein programming instructions that could be executed outside of a critical section may be executed outside of the critical section by inserting dependency relationships in the dependency graph.

Type: Grant

Filed: January 26, 2006

Date of Patent: December 17, 2013

Assignee: Intel Corporation

Inventors: Xiaofeng Guo, Jinquan Dai, Long Li
Systems and methods for compiling an application for a parallel-processing computer system

Patent number: 8584106

Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.

Type: Grant

Filed: February 9, 2012

Date of Patent: November 12, 2013

Assignee: Google Inc.

Inventors: Matthew N. Papakipos, Brian K. Grant, Christopher G. Demetriou, Morgan S. McGuire
Program code generation support device and method, program execution device and method, and program code compression processing device and method and program thereof

Patent number: 8572557

Abstract: To obtain a program code generation support device, method, and the like, capable of generating a new program code, in particular, generating in accordance with an incorporating apparatus, by performing a further optimization on a program code. The device includes storage means for storing as data, an optimization rule that is composed of a conversion condition for converting data of a program code and a conversion content thereof, and code optimization means that includes a code analysis unit for analyzing the program code, a condition search unit for searching for a part matching the conversion condition in the program code through a collation with the optimization rule stored in the storage means on the basis of the analyzed program code, and an optimization unit for generating data of a new program code by converting the part matching the conversion condition on the basis of the conversion content.

Type: Grant

Filed: August 1, 2011

Date of Patent: October 29, 2013

Assignee: Mitsubishi Electric Corporation

Inventors: Takahiro Ito, Shigeki Suzuki, Yoshiko Ochiai, Noriyuki Kushiro, Yoshiaki Koizumi
Object linkage device for linking objects in statically linked executable program file, method of linking objects, and computer readable storage medium storing program thereof

Patent number: 8561047

Abstract: When a modification is applied to a statically linked executable program file, in the executable program file, an old object is replaced with a new object by adding the new object to a bottom of already-existing objects without changing the location of the old object, and the reference relationship of symbols among objects is updated and resolved and thereby a modification is applied.

Type: Grant

Filed: January 18, 2011

Date of Patent: October 15, 2013

Assignee: Fujitsu Limited

Inventors: Masaki Kobayashi, Masanori Iwazaki
Pipelined parallelization with localized self-helper threading

Patent number: 8561046

Abstract: A system and method for automatically parallelizing a computer program for multi-threaded execution. A compiler identifies and parallelizes non-DOALL parallel regions, such as loops, within a computer program. The compiler determines enhanced helper thread instructions based upon the main body instructions of the non-DOALL region. These helper thread instructions are inserted ahead of the main body instructions within each of the plurality of threads, rather than within a single main thread. Next, synchronization instructions are inserted in one or more threads such that the main body of work of each thread is performed in a pipelined manner. The helper thread instructions within each thread may reduce the total execution time of each thread.

Type: Grant

Filed: September 14, 2009

Date of Patent: October 15, 2013

Assignee: Oracle America, Inc.

Inventors: Yonghong Song, Spiros Kalogeropulos, Partha P. Tirumalai
Performing register allocation of program variables based on priority spills and assignments

Patent number: 8555267

Abstract: A mechanism for performing register allocation based on priority spills and assignments is disclosed. A method of embodiments of the invention includes repetitively detecting fat points during a compilation process of a software program running on a virtual machine of a computer system, each fat point representing a program point having a high register pressure, where high register pressure occurs when the number of program variables live at a given program point of the software program is greater than the number of available processor registers of the computer system. The method further includes choosing a fat point with a highest register pressure, selecting a live program variable having a lowest priority at the chosen fat point, and spilling the lowest priority live program variable to memory of the computer system.

Type: Grant

Filed: March 3, 2010

Date of Patent: October 8, 2013

Assignee: Red Hat, Inc.

Inventor: Vladimir Makarov
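
The spill loop described in the abstract for patent 8555267 can be pictured with a small sketch. This is only an illustration under simplifying assumptions, not the patented method: live ranges are modeled as inclusive intervals of program points, priorities are plain numbers, and a spilled variable is assumed to stop occupying a register everywhere. The names `register_pressure` and `spill_by_priority` are hypothetical.

```python
def register_pressure(live_ranges, point):
    """Count the variables whose (inclusive) live range covers the point."""
    return sum(1 for start, end in live_ranges.values() if start <= point <= end)


def spill_by_priority(live_ranges, priority, num_registers):
    """Repeatedly spill the lowest-priority variable at the fattest point.

    live_ranges:   {var: (first_point, last_point)}  (hypothetical model)
    priority:      {var: number}, higher means more worth keeping in a register
    num_registers: number of available processor registers
    Returns the spilled variables in the order they were spilled.
    """
    live = dict(live_ranges)
    spilled = []
    if not live:
        return spilled
    first = min(start for start, _ in live.values())
    last = max(end for _, end in live.values())
    points = range(first, last + 1)

    while True:
        # A fat point is a program point whose pressure exceeds the register count.
        fat = {}
        for p in points:
            pressure = register_pressure(live, p)
            if pressure > num_registers:
                fat[p] = pressure
        if not fat:
            return spilled

        # Choose the fat point with the highest pressure (earliest point on ties).
        worst = max(fat, key=lambda p: (fat[p], -p))

        # Spill the lowest-priority variable that is live at that point.
        candidates = [v for v, (s, e) in live.items() if s <= worst <= e]
        victim = min(candidates, key=lambda v: (priority[v], v))
        spilled.append(victim)
        del live[victim]  # assumption: a spilled variable frees its register everywhere


if __name__ == "__main__":
    ranges = {"a": (0, 5), "b": (1, 3), "c": (2, 4), "d": (2, 2)}
    prio = {"a": 10, "b": 5, "c": 1, "d": 7}
    print(spill_by_priority(ranges, prio, num_registers=2))
```

A production allocator would split live ranges and reload spilled values around their uses rather than dropping the variable entirely; the sketch keeps only the detect, choose, and spill cycle.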
Loop coalescing method and loop coalescing device

Patent number: 8549507

Abstract: A loop coalescing method and a loop coalescing device are disclosed. The loop coalescing method comprises removing an inner-most loop from among nested loops, so that an outer operation provided outside of the inner-most loop is performed when a condition of a conditional statement is satisfied, generating a guard code by applying an if-conversion method to the conditional statement, and converting a guard by using an instruction calculating the guard of the guard code, the instruction calculating the guard using a register where information related to a period of time corresponding to the number of iterations of the inner-most loop is stored.

Type: Grant

Filed: August 22, 2007

Date of Patent: October 1, 2013

Assignee: Samsung Electronics Co., Ltd.

Inventors: Hee Seok Kim, Hong-Seok Kim, Chang-Woo Baek, Jeongwook Kim
Mechanism for performing instruction scheduling based on register pressure sensitivity

Patent number: 8549508

Abstract: A mechanism for performing instruction scheduling based on register pressure sensitivity is disclosed. A method of embodiments of the invention includes performing a preliminary register pressure minimization on program points during a compilation process of a software program running on a virtual machine of a computer system. The method further includes calculating a register pressure at each of the program points, detecting an instruction to be scheduled, and performing instruction scheduling of the instruction based on a current register pressure at a current scheduling point and potential register pressures at subsequent scheduling points.

Type: Grant

Filed: March 3, 2010

Date of Patent: October 1, 2013

Assignee: Red Hat, Inc.

Inventor: Vladimir Makarov
Method and apparatus for partitioning programs to balance memory latency

Patent number: 8543992

Abstract: A method of compiling code that includes partitioning instructions in the code among a plurality of processors based on memory access latency associated with the instructions is disclosed. According to one aspect of the invention, partitioning instructions includes partitioning memory access dependence chains. Other embodiments are described and claimed.

Type: Grant

Filed: December 17, 2005

Date of Patent: September 24, 2013

Assignee: Intel Corporation

Inventors: Xiaodan Jiang, Jinquan Dai
Optimization of automated system-managed storage operations

Patent number: 8516466

Abstract: Various embodiments for optimizing automated system-managed storage (SMS) operations in a computing storage environment. An execution of at least one automatic class selection (ACS) routine is monitored to determine at least one frequently used instruction. The ACS routine is modified for at least one predetermined time interval. The at least one frequently used instruction is moved to a higher execution priority of the modified ACS routine.

Type: Grant

Filed: June 30, 2010

Date of Patent: August 20, 2013

Assignee: International Business Machines Corporation

Inventors: Harold S. Huber, David C. Reed, Max D. Smith
Translation of SIMD instructions in a data processing system

Patent number: 8505002

Abstract: A data processing system is provided having a processor and analysing circuitry for identifying a SIMD instruction associated with a first SIMD instruction set and replacing it by a functionally-equivalent scalar representation and marking that functionally-equivalent scalar representation. The marked functionally-equivalent scalar representation is dynamically translated using translation circuitry upon execution of the program to generate one or more corresponding translated instructions corresponding to an instruction set architecture different from the first SIMD architecture corresponding to the identified SIMD instruction.

Type: Grant

Filed: September 27, 2007

Date of Patent: August 6, 2013

Assignees: ARM Limited, The Regents of the University of Michigan

Inventors: Sami Yehia, Krisztian Flautner, Nathan Clark, Amir Hormati, Scott Mahlke
Just-ahead-of-time compilation

Patent number: 8473935

Abstract: Pre-compiling postdominating functions. Some embodiments may be practiced in a computing environment including a runtime compilation. For example one method includes acts for compiling functions. The method includes determining that a function of an application has been called. A control flow graph is used to determine one or more postdominance relationships between the function and one or more other functions. The one or more other functions are assigned to be pre-compiled based on the postdominance relationship.

Type: Grant

Filed: April 21, 2008

Date of Patent: June 25, 2013

Assignee: Microsoft Corporation

Inventor: Matthew B. Grice
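
The postdominance idea in the abstract for patent 8473935 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a control flow graph whose nodes are function names, given as a successor map with a single exit node, and it uses the standard iterative dataflow formulation of postdominators. The names `postdominators` and `functions_to_precompile` are hypothetical.

```python
def postdominators(successors, exit_node):
    """Return {node: set of nodes that postdominate it} for a CFG.

    successors: {node: [successor nodes]}  (hypothetical graph encoding)
    A node q postdominates p if every path from p to exit_node passes through q.
    """
    nodes = set(successors) | {s for succs in successors.values() for s in succs}
    nodes.add(exit_node)

    # Start from the most optimistic answer and shrink it until it stabilizes.
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}

    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            succs = successors.get(n, [])
            if succs:
                common = set.intersection(*(pdom[s] for s in succs))
            else:
                common = set()  # dead end: only the node itself postdominates it
            updated = {n} | common
            if updated != pdom[n]:
                pdom[n] = updated
                changed = True
    return pdom


def functions_to_precompile(successors, exit_node, called):
    """Functions certain to run after `called`, hence candidates for pre-compilation."""
    pdom = postdominators(successors, exit_node)
    return sorted(pdom[called] - {called, exit_node})


if __name__ == "__main__":
    graph = {
        "main": ["parse"],
        "parse": ["fast", "slow"],  # only one of the two branches runs
        "fast": ["emit"],
        "slow": ["emit"],
        "emit": ["exit"],
    }
    print(functions_to_precompile(graph, "exit", "parse"))
```

In this example graph, `emit` is reached on every path from `parse` to the exit, so it is a candidate once `parse` has been called, while `fast` and `slow` are not because each lies on only one branch.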
Conversion of a class oriented data flow program to a structure oriented data flow program with dynamic interpretation of data types

Patent number: 8458682

Abstract: System and method for converting a class oriented data flow program to a structure oriented data flow program. A first data flow program is received, where the first data flow program is an object oriented program comprising instances of one or more classes, and wherein the first data flow program is executable to perform a first function. The first data flow program is automatically converted to a second data flow program, where the second data flow program does not include the instances of the one or more classes, and where the second data flow program is executable to perform the first function. The second data flow program is stored on a computer memory, where the second data flow program is configured to be deployed to a device, e.g., a programmable hardware element, and where the second data flow program is executable on the device to perform the first function.

Type: Grant

Filed: April 27, 2009

Date of Patent: June 4, 2013

Assignee: National Instruments Corporation

Inventors: Stephen R. Mercer, Akash B. Bhakta, Matthew E. Novacek
Improving data locality and parallelism by code replication

Patent number: 8453134

Abstract: Provided are a method, system, and article of manufacture improving data locality and parallelism by code replication and array contraction. Source code including an array of elements referenced using at least two indices is processed. The array is nested within multiple loops, wherein at least two of the loops perform iterations with respect to the indices of the array, wherein the index incremented in at least one innermost loop of the loops does not comprise a leftmost index in the array. The source code is transformed to object code by performing operations including fusing at least two innermost loops of the loops in object code generated by compiling the source code by replicating statements from at least one of the innermost loops into a fused innermost loop and performing loop interchange in the object code to have the fused innermost loop provide iterations with respect to the leftmost index in the array.

Type: Grant

Filed: June 4, 2008

Date of Patent: May 28, 2013

Assignee: Intel Corporation

Inventors: John L. Ng, Alexander Y. Ostanevich, Alexander L. Sushentsov
Computation reuse for loops with irregular accesses

Patent number: 8453135

Abstract: A compiler selects a nested loop within software code that includes an outer loop and an inner loop. The outer loop includes an outer induction variable and the inner loop includes an inner induction variable. The compiler identifies a computation included in the nested loop that generates an irregular array access, which includes an expression of both the outer induction variable and the inner induction variable. Next, the compiler identifies a redundant calculation for the computation based upon the outer induction variable and the inner induction variable, and generates a temporary variable to correspond with the redundant calculation. The compiler replaces the computation with the temporary variable in the nested loop and, in turn, compiles the nested loop with the included temporary variable.

Type: Grant

Filed: March 11, 2010

Date of Patent: May 28, 2013

Assignee: Freescale Semiconductor, Inc.

Inventor: Abderrazek Zaafrani
Systems and methods for caching compute kernels for an application running on a parallel-processing computer system

Patent number: 8448156

Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.

Type: Grant

Filed: February 27, 2012

Date of Patent: May 21, 2013

Assignee: Google Inc.

Inventors: Christopher G. Demetriou, Matthew N. Papakipos
Systems and methods for determining compute kernels for an application in a parallel-processing computer system

Patent number: 8443349

Abstract: A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.

Type: Grant

Filed: February 9, 2012

Date of Patent: May 14, 2013

Assignee: Google Inc.

Inventors: Matthew N. Papakipos, Brian K. Grant, Morgan S. McGuire, Christopher G. Demetriou
Throughput-aware software pipelining for highly multi-threaded systems

Publication number: 20130111453

Abstract: Embodiments of the invention provide systems and methods for throughput-aware software pipelining in compilers to produce optimal code for single-thread and multi-thread execution on multi-threaded systems. A loop is identified within source code as a candidate for software pipelining. An attempt is made to generate pipelined code (e.g., generate an instruction schedule and a set of register assignments) for the loop in satisfaction of throughput-aware pipelining criteria, like maximum register count, minimum trip count, target core pipeline resource utilization, maximum code size, etc. If the attempt fails to generate code in satisfaction of the criteria, embodiments adjust one or more settings (e.g., by reducing scalarity or latency settings being used to generate the instruction schedule).

Type: Application

Filed: October 31, 2011

Publication date: May 2, 2013

Applicant: Oracle International Corporation

Inventors: Spiros Kalogeropulos, Partha Tirumalai
Two-stage commit (TSC) region for dynamic binary optimization in X86

Patent number: 8418156

Abstract: Generally, the present disclosure provides systems and methods to generate a two-stage commit (TSC) region which has two separate commit stages. Frequently executed code may be identified and combined for the TSC region. Binary optimization operations may be performed on the TSC region to enable the code to run more efficiently by, for example, reordering load and store instructions. In the first stage, load operations in the region may be committed atomically and in the second stage, store operations in the region may be committed atomically.

Type: Grant

Filed: December 16, 2009

Date of Patent: April 9, 2013

Assignee: Intel Corporation

Inventors: Cheng Wang, Youfeng Wu
Compiler apparatus

Patent number: RE45199

Abstract: A compiler apparatus, which can perform software pipelining optimization that has a considerable effect of reducing the number of execution cycles taken to complete a loop process, converts a source program into a machine program for a processor which is capable of parallel processing. The compiler apparatus is composed of: a parsing unit operable to parse the source program and then to convert the source program into an intermediate program which is described in an intermediate language; an optimization unit operable to optimize the intermediate program; and a conversion unit operable to convert the optimized intermediate program into the machine language program, wherein the optimization unit is operable to execute software pipelining, by inserting a transfer instruction, which is used for transferring data between operands, into a loop process included in the intermediate program so that a data dependence relation is changed.

Type: Grant

Filed: September 14, 2012

Date of Patent: October 14, 2014

Assignee: Panasonic Corporation

Inventors: Shohei Michimoto, Taketo Heishi, Hajime Ogawa, Teruo Kawabata
