For A Parallel Or Multiprocessor System Patents (Class 717/149)

Loop compiling (Class 717/150)

COMPILE METHOD AND COMPILER

Publication number: 20100229161

Abstract: A compile technique is provided for multicore allocation, by which a desired running performance can be achieved. The compile technique adopts the steps of analyzing a taskization directive, taskizing a specified part, and assigning the task to a specified CPU. According to the program-to-tasks-decomposition compile technique, the multicore decomposition is performed by allocating tasks to CPUs individually while following a task decomposition directive of a main part designated by a user. When no direction is issued concerning a CPU to be allocated, the relation with a principal task is judged from the relation of invocation and the dependency, and then the CPU to be allocated is determined. In allocation to the CPU, an efficient multicore-task decomposition is achieved in consideration of copy and assignment of one processing to more than one CPU while figuring in the balance between processing speed and resources.

Type: Application

Filed: January 27, 2010

Publication date: September 9, 2010

Applicant: RENESAS TECHNOLOGY CORP.

Inventor: Noriyasu MORI
Systems and methods for affine-partitioning programs onto multiple processing units

Patent number: 7793278

Abstract: Systems and methods perform affine partitioning on a code stream to produce code segments that may be parallelized. The code segments include copies of the original code stream with a conditional inserted that aids in parallelizing code. The conditional is formed by determining the constraints on a processor variable determined by the affine partitioning and applying the constraints to the original code stream.

Type: Grant

Filed: September 30, 2005

Date of Patent: September 7, 2010

Assignee: Intel Corporation

Inventors: Zhao Hui Du, Shih-Wei Liao, Gansha Wu, Guei-Yuan Lueh
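
The idea in the abstract above (copies of the original code with a conditional on a processor variable) can be illustrated with a minimal sketch. It is not the patented method: the loop nest, the affine mapping `phi(i, j) = i`, and the names `sequential`, `segment`, and `partitioned` are assumptions made up for this illustration.

```python
def sequential(n, m):
    """Original code stream: each row carries a dependence along j only."""
    a = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(1, m):
            a[i][j] = a[i][j - 1] + i
    return a


def phi(i, j):
    """Hypothetical affine mapping: iteration (i, j) belongs to processor i."""
    return i


def segment(p, a):
    """Copy of the original loop nest, guarded by a conditional on p."""
    n, m = len(a), len(a[0])
    for i in range(n):
        for j in range(1, m):
            if phi(i, j) == p:  # constraint on the processor variable
                a[i][j] = a[i][j - 1] + i


def partitioned(n, m):
    """Run one segment per processor; here sequentially, for simplicity."""
    a = [[0] * m for _ in range(n)]
    for p in range(n):
        segment(p, a)
    return a
```

Because the only dependence runs along `j`, the segments touch disjoint rows and could run in any order or concurrently; a real compiler would also simplify the guard away rather than scan the full iteration space in every segment.
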
Apparatus and method for automatically parallelizing network applications through pipelining transformation

Patent number: 7793276

Abstract: In some embodiments, a method and apparatus for automatically parallelizing a sequential network application through pipeline transformation are described. In one embodiment, the method includes the configuration of a network processor into a D-stage processor pipeline. Once configured, a sequential network application program is transformed into D-pipeline stages. Once transformed, the D-pipeline stages are executed in parallel within the D-stage processor pipeline. In one embodiment, transformation of a sequential application program is performed by modeling the sequential network program as a flow network model and selecting from the flow network model a plurality of preliminary pipeline stages. Other embodiments are described and claimed.

Type: Grant

Filed: November 14, 2003

Date of Patent: September 7, 2010

Assignee: Intel Corporation

Inventors: Jinquan Dai, Luddy Harrison, Bo Huang, Cotton Seed, Long Li
System for controlling assignment of a plurality of modules of a program to available execution units based on speculative executing and granularity adjusting

Patent number: 7788672

Abstract: According to one embodiment, an information processing apparatus includes a plurality of execution modules and a scheduler which controls assignment of a plurality of basic modules to the plurality of execution modules. The scheduler includes assigning, when an available execution module which is not assigned any basic modules exists, a basic module which stands by for completion of execution of another basic module to the available execution module, measuring an execution time of processing of the basic module itself, measuring execution time of processing for assigning the basic module to the execution module, and performing granularity adjustment by linking two or more basic modules to be successively executed according to the restriction of an execution sequence so as to be assigned as one set to the execution module and redividing the linked two or more basic modules, based on the two measured execution times.

Type: Grant

Filed: April 6, 2009

Date of Patent: August 31, 2010

Assignee: Kabushiki Kaisha Toshiba

Inventor: Yasuyuki Tanaka
Compiler-based critical section amendment for a multiprocessor environment

Patent number: 7788650

Abstract: Source code includes a directive to indicate data structures of related data to a compiler. The compiler associates the related data to the same one of multiple processors in a multiprocessor environment. The compiler searches the source code for locks associated with the related data, and generates executable code that is modified with respect to locks written in the source code. The compiler may replace or remove locks written in the source code to protect access to the related data, resulting in an executable program that does not include the locks.

Type: Grant

Filed: May 10, 2005

Date of Patent: August 31, 2010

Assignee: Intel Corporation

Inventors: Erik J. Johnson, Stephen D. Goglin
System and method for efficient verification of memory consistency model compliance

Patent number: 7779393

Abstract: A system for efficiently verifying compliance with a memory consistency model includes a test module and an analysis module. The test module may coordinate an execution of a multithreaded test program on a test platform. If the test platform provides an indication of the order in which writes from multiple processing elements are performed at shared memory locations, the analysis module may use a first set of rules to verify that the results of the execution correspond to a valid ordering of events according to a memory consistency model. If the test platform does not provide an indication of write ordering, the analysis module may use a second set of rules to verify compliance with the memory consistency model.

Type: Grant

Filed: May 25, 2005

Date of Patent: August 17, 2010

Assignee: Oracle America, Inc.

Inventors: Chaiyasit Manovit, Sudheendra G. Hangal, Robert E. Cypher
Model checking with bounded context switches

Patent number: 7779382

Abstract: Validity of one or more assertions for any concurrent execution of a plurality of software instructions with at most k−1 context switches can be determined. Validity checking can account for execution of the software instructions in an unbounded stack depth scenario. A finite data domain representation can be used. The software instructions can be represented by a pushdown system. Validity checking can account for thread creation during execution of the plurality of software instructions.

Type: Grant

Filed: December 10, 2004

Date of Patent: August 17, 2010

Assignee: Microsoft Corporation

Inventors: Niels Jakob Rehof, Shaz Qadeer
GENERAL PURPOSE DISTRIBUTED DATA PARALLEL COMPUTING USING A HIGH LEVEL LANGUAGE

Publication number: 20100205588

Abstract: General-purpose distributed data-parallel computing using a high-level language is disclosed. Data parallel portions of a sequential program that is written by a developer in a high-level language are automatically translated into a distributed execution plan. The distributed execution plan is then executed on large compute clusters. Thus, the developer is allowed to write the program using familiar programming constructs in the high level language. Moreover, developers without experience with distributed compute systems are able to take advantage of such systems.

Type: Application

Filed: February 9, 2009

Publication date: August 12, 2010

Applicant: MICROSOFT CORPORATION

Inventors: Yuan Yu, Dennis Fetterly, Michael Isard, Ulfar Erlingsson, Mihai Budiu
Method and apparatus for processor code optimization using code compression

Patent number: 7774768

Abstract: An improved method of optimizing the instruction set of a digital processor using code compression. In one embodiment, the method comprises obtaining an assembly language program to be used for the optimization process; calculating the static frequency of each instruction type from the base instruction set; sorting the instruction types by frequency; determining the number and type of instructions necessary for correct program execution; creating a compressed instruction set encoding; re-evaluating the compressed instruction according to the foregoing steps; and generating an instruction set encoding for the compressed instruction set. Improved compressed instruction formats and register structures useful in a processor are also disclosed. A computer program and apparatus for synthesizing logic implementing the aforementioned data cache architecture and pipeline performance enhancements are further disclosed.

Type: Grant

Filed: May 22, 2006

Date of Patent: August 10, 2010

Assignee: ARC International, PLC

Inventor: Peter Warnes
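
The first steps listed in the abstract above (calculate the static frequency of each instruction type, then sort the types by frequency) can be sketched as follows. This is only a loose illustration: the toy assembly syntax (`;` comments, labels ending in `:`) and the helper names `static_frequencies` and `choose_compressed_set` are assumptions, not taken from the patent.

```python
from collections import Counter


def static_frequencies(asm_lines):
    """Count how often each mnemonic appears in the program text."""
    counts = Counter()
    for line in asm_lines:
        code = line.split(";", 1)[0].strip()  # drop trailing comments
        if not code or code.endswith(":"):    # skip blank lines and labels
            continue
        counts[code.split()[0].lower()] += 1
    return counts


def choose_compressed_set(counts, slots):
    """Pick the `slots` most frequent mnemonics; ties break alphabetically."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [mnemonic for mnemonic, _ in ranked[:slots]]


program = [
    "loop:",
    "  ld  r1, [r2]    ; load element",
    "  add r1, r1, 1",
    "  st  r1, [r2]",
    "  add r2, r2, 4",
    "  bne r2, r3, loop",
]
counts = static_frequencies(program)
compressed = choose_compressed_set(counts, 2)
```

The sketch counts static (textual) occurrences rather than dynamic execution counts, matching the wording of the abstract; the later steps (checking which instructions are required for correctness and producing an encoding) are not modeled.
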
Automated Partitioning of a Computation for Parallel or Other High Capability Architecture

Publication number: 20100199257

Abstract: A method and a system for transformation-based program generation using two separate specifications as input: An implementation neutral specification of the desired computation and a specification of the execution platform. The generated implementation incorporates execution platform opportunities such as parallelism. Operationally, the invention has two broad stages. First, it designs the abstract implementation in the problem domain in terms of an Intermediate Language (IL) that is unfettered by programming language restrictions and requirements. Concurrently, the design is evolved by specializing the IL to encapsulate a plurality of desired design features in the implementation such as partitioning for multicore and/or instruction level parallelism. Concurrently, constraints that stand in for implied implementation structures are added to the design and coordinated with other constraints. Second, the IL is refined into implementation code.

Type: Application

Filed: January 31, 2009

Publication date: August 5, 2010

Inventor: Ted James Biggerstaff
Inducing concurrency in software code

Patent number: 7765532

Abstract: An Induced Multi-threading (IMT) framework may be configured to induce multi-threaded execution in software code. In one embodiment, the IMT framework may include a concurrent code generator configured to receive marked code having one or more blocks of code marked for concurrent execution. Software code initially intended for sequential execution may have been automatically marked by an automated code marker and/or marked manually to generate the marked code. The concurrent code generator may be configured to generate concurrent code from the marked code. The concurrent code may include one or more tasks configured for concurrent execution in place of the one or more marked blocks of code. In one embodiment, the IMT framework may also include a scheduler configured to schedule one or more of the tasks for multi-threaded execution.

Type: Grant

Filed: October 22, 2002

Date of Patent: July 27, 2010

Assignee: Oracle America, Inc.

Inventors: Bala Dutt, Ajay Kumar, Hanumantha R. Susarla
Generating efficient parallel code using partitioning, coalescing, and degenerative loop and guard removal

Patent number: 7757222

Abstract: Code is affine partitioned to generate affine partitioning mappings. Parallel code is generated based on the affine partitioning mappings. Generating the parallel code includes coalescing loops in the parallel code generated from the affine partitioning mappings to generate coalesced parallel code and optimizing the coalesced parallel code.

Type: Grant

Filed: September 30, 2005

Date of Patent: July 13, 2010

Assignee: Intel Corporation

Inventors: Shih-wei Liao, Zhao Hui Du, Bu Qi Cheng, Gansha Wu, Guei-Yuan Lueh
SCOPE: A STRUCTURED COMPUTATIONS OPTIMIZED FOR PARALLEL EXECUTION SCRIPT LANGUAGE

Publication number: 20100175049

Abstract: Embodiments of the present invention relate to systems, methods and computer storage media for providing Structured Computations Optimized for Parallel Execution (SCOPE) that facilitate analysis of a large-scale dataset utilizing row data of those data sets. SCOPE includes, among other features, an extract command for extracting data bytes from a data stream and structuring the data bytes as data rows having strictly defined columns. SCOPE also includes a process command and a reduce command that identify data rows as inputs. The reduce command also identifies a reduce key that facilitates the reduction based on the reduce key. SCOPE additionally includes a combine command that identifies two data row sets that are to be combined based on an identified joint condition. Additionally, SCOPE includes a select command that leverages SQL and C# languages to create an expressive script that is capable of analyzing large-scale data sets in a parallel computing environment.

Type: Application

Filed: January 7, 2009

Publication date: July 8, 2010

Applicant: MICROSOFT CORPORATION

Inventors: WILLIAM D. RAMSEY, RONNIE IRA CHAIKEN, DARREN A. SHAKIB, ROBERT JOHN JENKINS, JR., SIMON J. WEAVER, JINGREN ZHOU, DANIEL DEDU-CONSTANTIN, ACHINT SRIVASTAVA
Orthogonal Integration of de-serialization into an interpretive validating XML parser

Patent number: 7752212

Abstract: A computer-implemented method of creating a schema specific parser for processing Extensible Markup Language (XML) documents can include receiving an XML schema comprising a plurality of components, determining a hierarchy of the plurality of components of the XML schema, and creating an execution plan specifying a hierarchy of XML processing instructions. Each XML processing instruction can be associated with an XML processing function of a virtual machine that performs an XML document processing task. The hierarchy of XML processing instructions can be determined according to the hierarchy of components of the XML schema. An instruction causing the virtual machine to invoke a de-serialization module that extracts at least one item of information from the XML document can be inserted into the execution plan. The execution plan can be compiled into a bytecode version of the execution plan that is interpretable by the virtual machine. The bytecode version of the execution plan can be output.

Type: Grant

Filed: June 5, 2007

Date of Patent: July 6, 2010

Assignee: International Business Machines Corporation

Inventors: Abraham Heifets, Margaret G. Kostoulas, Moshe Morris Emanuel Matsa, Eric Perkins
Method of mixed lock-free and locking synchronization

Patent number: 7747996

Abstract: A method for enabling interoperability of a locking synchronization method with a lock-free synchronization method in a multi-threaded environment is presented. The method examines a class file for mutable fields contained in critical code sections. The mutable fields are transferred to a shadow record and a pointer is substituted in the class field for each transferred mutable field. Code is altered so that the lock-free synchronization method is used if a lock is not held on the object. Atomic compare and swap operations are employed after mutable fields are updated during execution of the lock-free synchronization method.

Type: Grant

Filed: May 25, 2006

Date of Patent: June 29, 2010

Assignee: Oracle America, Inc.

Inventor: David Dice
Partitioning distributed arrays according to criterion and functions applied to the distributed arrays

Patent number: 7743087

Abstract: The present invention provides a method and system for the dynamic distribution of an array in a parallel computing environment. The present invention obtains a criterion for distributing an array and performs flexible portioning based on the obtained criterion. In some embodiments, analysis may be performed based on the criterion. The flexible portioning is then performed based on the analysis.

Type: Grant

Filed: March 22, 2006

Date of Patent: June 22, 2010

Assignee: The Math Works, Inc.

Inventors: Penelope Anderson, Cleve Moler, Sheung Hun Cheng, Patrick D. Quillen
SYSTEM AND METHOD FOR PARALLEL EXECUTION OF A PROGRAM

Publication number: 20100153937

Abstract: A computer system for executing a computer program on parallel processors, the system having a compiler for identifying within a computer program concurrency markers that indicate that code between them can be executed in parallel and should be executed with delayed side-effects; and an execution system that is operable to execute the code identified by the concurrency markers to generate a queue of side-effects and after execution of that code is completed, sequentially execute the queue of side-effects.

Type: Application

Filed: January 26, 2007

Publication date: June 17, 2010

Applicant: CODEPLAY SOFTWARE LIMITED

Inventors: Andrew Richards, Andrew Cook, Colin Riley
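
The delayed side-effects scheme in the abstract above can be sketched roughly as below. This is a sketch under stated assumptions, not the described system: the function `run_with_delayed_side_effects`, the convention that the parallel body returns a list of zero-argument callables, and the use of a thread pool are all choices made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_delayed_side_effects(work_items, body, max_workers=4):
    """Run body(item) in parallel, then replay the queued side-effects in order.

    body must not mutate shared state itself; it returns a list of
    zero-argument callables describing the side-effects it wants applied.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_item = list(pool.map(body, work_items))  # results keep input order
    queue = [effect for effects in per_item for effect in effects]
    for effect in queue:  # sequential phase: apply the side-effects one by one
        effect()
    return len(queue)


log = []


def body(x):
    y = x * x                       # side-effect-free computation
    return [lambda: log.append(y)]  # side-effect deferred to the queue


count = run_with_delayed_side_effects(range(5), body)
```

Since the side-effects run only after the parallel phase has finished, and in input order, the contents of `log` do not depend on how the worker threads were scheduled.
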
Parallelism performance analysis based on execution trace information

Patent number: 7739667

Abstract: A system for conducting performance analysis for executing tasks. The analysis involves generating a variety of trace information related to performance measures, including parallelism-related information, during execution of the task. In order to generate the trace information, target source code of interest is compiled in such a manner that executing the resulting executable code will generate execution trace information composed of a series of events. Each event stores trace information related to a variety of performance measures for the one or more processors and protection domains used. After the execution trace information has been generated, the system can use that trace information and a trace information description file to produce useful performance measure information. The trace information description file contains information that describes the types of execution events as well as the structure of the stored information.

Type: Grant

Filed: October 19, 2005

Date of Patent: June 15, 2010

Assignee: Cray Inc.

Inventors: Charles David Callahan, II, Keith Arnett Shields, Preston Pengra Briggs, III
Efficient generation of SIMD code in presence of multi-threading and other false sharing conditions and in machines having memory protection support

Patent number: 7730463

Abstract: A computer implemented method, system and computer program product for automatically generating SIMD code. The method begins by analyzing data to be accessed by a targeted loop including at least one statement, where each statement has at least one memory reference, to determine if memory accesses are safe. If memory accesses are safe, the targeted loop is simdized. If not safe, it is determined if a scheme can be applied in which safety need not be guaranteed. If such a scheme can be applied, the targeted loop is simdized according to the scheme. If such a scheme cannot be applied, it is determined if padding is appropriate. If padding is appropriate, the data is padded and the targeted loop is simdized. If padding is not appropriate, non-simdized code is generated based on the targeted loop for handling boundary conditions, the targeted loop is simdized and combined with the non-simdized code.

Type: Grant

Filed: February 21, 2006

Date of Patent: June 1, 2010

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao
Storage of project-planning data in an automation system

Patent number: 7730483

Abstract: The invention relates to a system and also method for storage of project-planning data in an automation system containing automation devices. To simplify changes within the automation system the project-planning data is stored in a generic, expandable data storage format, with parts of the project-planning data being assigned runtime data in each case, with the runtime data being assigned at least one automation device in each case, with the runtime data being executable parts of programs on the automation devices assigned to it and with the parts of the project-planning data being stored distributed in parallel to the runtime data assigned to it in each case in the automation devices assigned to the runtime data in each case.

Type: Grant

Filed: July 28, 2005

Date of Patent: June 1, 2010

Assignee: Siemens Aktiengesellschaft

Inventors: Martin Daimer, Ludwig Karl-Dietze, Andreas Macher, Siegfried Prieler
Controller equipment model systems and methods

Patent number: 7721273

Abstract: The present invention relates to a system and methodology facilitating automated manufacturing processes in an industrial controller environment. An automation system is provided for automated industrial processing. The system includes an equipment phase object that is executed by a controller engine, wherein the equipment phase object can be accessible from internal instructions within the controller and/or from external instructions directed to the controller such as from a server or another controller across a network connection. A sequencing engine operates with the equipment phase object to facilitate automated industrial processing. The sequencing engine can be adapted to various industrial standards or in accordance with other state type models.

Type: Grant

Filed: June 4, 2004

Date of Patent: May 18, 2010

Assignee: Rockwell Automation Technologies, Inc.

Inventors: Kenwood H. Hall, Stephen D. Ryan, Richard Alan Morse, Kam-Por Yuen, Raymond J. Staron, Paul R. D'Mura, James H. Jarrett, Michael D. Kalan, Robert C. Kline, Jr., Charles Martin Rischar, Christopher E. Stanek, Tao Zhao, Kenneth S. Plache, Shoshana L. Wodzisz, Jan Bezdicek, David A. Johnston, Jeffery W. Brooks
Methods for describing processor features

Patent number: 7716638

Abstract: A machine readable description of a new feature of a processor is provided by a processor vendor. Control code executing on a processor, such as a traditional operating system kernel, a partitioning kernel, or the like can be programmed to receive the description of the feature and to use information provided by the description to detect, enable and manage operation of the new feature.

Type: Grant

Filed: March 4, 2005

Date of Patent: May 11, 2010

Assignee: Microsoft Corporation

Inventor: Andrew J. Thornton
Systems and methods for parallel distributed programming

Patent number: 7712080

Abstract: The present invention relates generally to computer programming, and more particularly to systems and methods for parallel distributed programming. Generally, a parallel distributed program is configured to operate across multiple processors and multiple memories. In one aspect of the invention, a parallel distributed program includes a distributed shared variable located across the multiple memories and distributed programs capable of operating across multiple processors.

Type: Grant

Filed: May 21, 2004

Date of Patent: May 4, 2010

Assignee: The Regents of the University of California

Inventors: Lei Pan, Lubomir R. Bic, Michael B. Dillencourt
Parallel compiling with a serial scheduler

Patent number: 7712090

Abstract: Methods and apparatus, including computer program products, for generating an executable program, including receiving serial compile commands in a pseudo-compiler to compile source code modules, scheduling the serial compiler commands in parallel compilers to compile the source code modules, compiling the source code modules in the parallel compilers to generate object code modules, sending compiler completion acknowledgements to a synchronizer and linking the object code modules in linkers in response to linker initiation commands from the synchronizer.

Type: Grant

Filed: February 7, 2003

Date of Patent: May 4, 2010

Assignee: SAP Aktiengesellschaft

Inventor: Thomas Stuefe
Architecture for a computer-based development environment with self-contained components and a threading model

Patent number: 7707543

Abstract: A method, a device and a system arrangement are disclosed for generating self-contained software components having in each case synchronous and/or asynchronous interfaces with an internal threading model. The concept disclosed enables all necessary synchronization mechanisms to be provided automatically. The concept is based on an asynchronous operation manager used to divert callbacks from a called component into a calling component.

Type: Grant

Filed: November 22, 2005

Date of Patent: April 27, 2010

Assignee: Siemens Aktiengesellschaft

Inventors: Detlef Becker, Karlheinz Dorn, Vladyslav Ukis, Hans-Martin Von Stockhausen
Method for embedding object codes in source codes

Patent number: 7694289

Abstract: Methods for embedding codes executable in a first system having a first microprocessor into codes executable in a second system having a second microprocessor are described herein. In one aspect of the invention, an exemplary method includes providing first codes having a routine, the first codes being compilable to be executed in the first system, and compiling the first codes, resulting in second codes; the second codes comprising opcodes of the routine executable by the first system, which convert the second codes into third codes automatically, the third codes being compilable to be executed by the second system; this is followed by compiling the third codes, resulting in the fourth codes being executable in the second system, and linking the fourth codes, generating an executable image and executing the executable image in the second system. Other methods and apparatuses are also described.

Type: Grant

Filed: December 5, 2005

Date of Patent: April 6, 2010

Assignee: Apple Inc.

Inventor: Keith Stattenfield
Splitting the computation space to optimize parallel code

Patent number: 7689980

Abstract: Linear transformations of statements in code are performed to generate linear expressions associated with the statements. Parallel code is generated using the linear expressions. Generating the parallel code includes splitting the computation-space of the statements into intervals and generating parallel code for the intervals.

Type: Grant

Filed: September 30, 2005

Date of Patent: March 30, 2010

Assignee: Intel Corporation

Inventors: Zhao Hui Du, Shih-wei Liao, Gansha Wu, Guei-Yuan Lueh
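
One way to read "splitting the computation-space of the statements into intervals" in the abstract above is sketched here. It is an interpretation for illustration only: the two statements `S1` and `S2`, their ranges, the offset `k`, and the function names are assumptions, and the sketch assumes `0 <= k <= n`.

```python
def guarded(n, k):
    """Single loop over the whole space; each statement is guarded."""
    out = []
    for t in range(0, n + k):
        if 0 <= t < n:        # S1 is defined on [0, n)
            out.append(("S1", t))
        if k <= t < n + k:    # S2 is defined on [k, n + k)
            out.append(("S2", t - k))
    return out


def split(n, k):
    """Same schedule, with the space cut into intervals so no guard is needed."""
    out = []
    for t in range(0, k):       # interval [0, k): only S1 is active
        out.append(("S1", t))
    for t in range(k, n):       # interval [k, n): both statements are active
        out.append(("S1", t))
        out.append(("S2", t - k))
    for t in range(n, n + k):   # interval [n, n + k): only S2 is active
        out.append(("S2", t - k))
    return out
```

Within each interval the set of active statements is fixed, so the per-iteration tests in `guarded` disappear and each interval's loop can be handed to parallel code generation separately.
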
Open multi-processing reduction implementation in cell broadband engine (CBE) single source compiler

Patent number: 7689977

Abstract: The present disclosure is directed to a method for providing an OpenMP reduction implementation. The method may comprise creating an aggregate of at least one reduction variable in a parallel region or a work-sharing construct; defining a pointer variable, the pointer variable pointing to a dynamic array of the aggregate; creating an initialization routine, an outlined routine and a reduction accumulation routine; replacing the parallel region or the work-sharing construct with a runtime routine, the runtime routine taking a plurality of arguments including an address of the initialization routine, an address of the outlined routine, an address of the reduction accumulation routine, an address of the pointer variable, and a size of the aggregate; and executing the runtime routine when the at least one reduction variable is in the parallel region or the work-sharing construct.

Type: Grant

Filed: April 15, 2009

Date of Patent: March 30, 2010

Assignee: International Business Machines Corporation

Inventors: Guansong Zhang, Shimin Cui, Ettore Tiotto
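
The structure described in the abstract above (an aggregate of reduction variables, one copy per thread in a dynamic array, plus initialization, outlined, and accumulation routines handed to a runtime routine) can be mimicked in a short sketch. It is not the Cell compiler's runtime: Python functions stand in for routine addresses, a list of dictionaries stands in for the dynamic array of aggregates, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def init_routine(slot):
    """Initialize one copy of the aggregate of reduction variables."""
    slot["total"] = 0
    slot["count"] = 0


def outlined_routine(slot, chunk):
    """Body of the parallel region, updating only this thread's private copy."""
    for x in chunk:
        slot["total"] += x
        slot["count"] += 1


def accumulate_routine(dest, src):
    """Fold one private copy into the final result."""
    dest["total"] += src["total"]
    dest["count"] += src["count"]


def runtime_routine(init, outlined, accumulate, data, num_threads):
    """Stand-in for the runtime call that replaces the parallel region."""
    slots = [{} for _ in range(num_threads)]  # one aggregate per thread
    for slot in slots:
        init(slot)
    chunks = [data[t::num_threads] for t in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(outlined, slots, chunks))  # wait for all threads
    result = {}
    init(result)
    for slot in slots:
        accumulate(result, slot)
    return result


result = runtime_routine(init_routine, outlined_routine, accumulate_routine,
                         list(range(1, 11)), num_threads=3)
```

Each thread writes only to its own slot, so the parallel phase needs no locking; the accumulation runs once, after all threads have finished.
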
Method and apparatus for referencing thread local variables with stack address mapping

Patent number: 7689971

Abstract: Methods and apparatuses provide for referencing thread local variables (TLVs) with techniques such as stack address mapping. A method may involve a head pointer that points to a set of thread local variables (TLVs) of a thread. A method according to one embodiment may include an operation for storing the head pointer in a global data structure in a user space of a processing system. The head pointer may subsequently be retrieved from the global data structure and used to access one or more TLVs associated with the thread. In one embodiment, the head pointer is retrieved without executing any kernel system calls. In an example embodiment, the head pointer is stored in a global array, and a stack address for the thread is used to derive an index into the array. Other embodiments are described and claimed.

Type: Grant

Filed: August 9, 2004

Date of Patent: March 30, 2010

Assignee: Intel Corporation

Inventors: Jinzhan Peng, Xiaohua Shi, Guei-Yuan Lueh, Gansha Wu
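
The example embodiment in the abstract above (head pointers kept in a global array, indexed by a value derived from the thread's stack address) can be modeled with plain integers. This is a toy model under strong assumptions: it supposes every thread's stack is a fixed-size region laid out contiguously from a known base, and the constants and function names are invented for this illustration.

```python
STACK_BASE = 0x7000_0000   # hypothetical base of the stack area
STACK_SIZE = 0x10_0000     # hypothetical fixed stack size per thread (1 MiB)
MAX_THREADS = 8

# Global user-space array of head pointers, one entry per stack region.
head_pointers = [None] * MAX_THREADS


def index_for(stack_address):
    """Derive the array index from any address inside a thread's stack."""
    index = (stack_address - STACK_BASE) // STACK_SIZE
    if not 0 <= index < MAX_THREADS:
        raise ValueError("address is outside the managed stack area")
    return index


def register_thread(stack_address, tlvs):
    """Store the head pointer (here: a dict of TLVs) for the owning thread."""
    head_pointers[index_for(stack_address)] = tlvs


def tlv_lookup(stack_address, name):
    """Fetch a TLV using only the stack address; no system call is involved."""
    return head_pointers[index_for(stack_address)][name]


# Two different addresses inside the same (third) stack region.
register_thread(STACK_BASE + 2 * STACK_SIZE + 0x40, {"errno": 5})
value = tlv_lookup(STACK_BASE + 2 * STACK_SIZE + 0xFFF0, "errno")
```

Any address within a thread's stack maps to the same index, which is why the current stack pointer is enough to find that thread's variables.
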
Obstruction-free mechanism for atomic update of multiple non-contiguous locations in shared memory

Patent number: 7685583

Abstract: We present a technique for implementing obstruction-free atomic multi-target transactions that target special “transactionable” locations in shared memory. A programming interface for using operations based on these transactions can be structured in several ways, including as n-word compare-and-swap (NCAS) operations or as atomic sequences of single-word loads and stores (e.g., as transactional memory).

Type: Grant

Filed: July 16, 2003

Date of Patent: March 23, 2010

Assignee: Sun Microsystems, Inc.

Inventors: Mark S. Moir, Victor M. Luchangco, Maurice Herlihy
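
The n-word compare-and-swap (NCAS) interface mentioned in the abstract above has simple semantics, sketched below. Note the substitution: this sketch uses a global lock to get atomicity, so it is blocking and is not the obstruction-free technique the abstract describes; it only shows what an NCAS operation is expected to do. The name `ncas` and its signature are assumptions.

```python
import threading

_ncas_lock = threading.Lock()  # stand-in for atomicity; NOT obstruction-free


def ncas(memory, addrs, expected, new):
    """Atomically update several locations if all hold their expected values.

    Returns True and writes every new value if memory[a] == e for each pair,
    otherwise returns False and writes nothing.
    """
    with _ncas_lock:
        if any(memory[a] != e for a, e in zip(addrs, expected)):
            return False
        for a, v in zip(addrs, new):
            memory[a] = v
        return True


memory = {"a": 1, "b": 2}
first = ncas(memory, ["a", "b"], [1, 2], [10, 20])    # all values match
second = ncas(memory, ["a", "b"], [10, 99], [0, 0])   # "b" does not match
```

The all-or-nothing behavior is the point: the second call leaves both locations untouched even though its first comparison succeeds.
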
PROGRAM PARALLELIZING METHOD AND PROGRAM PARALLELIZING APPARATUS

Publication number: 20100070958

Abstract: Provided are a program parallelizing method and a program parallelizing apparatus that enable efficient generation of a parallelized program with shorter parallel execution time. An instruction is scheduled by referring to inter-instruction dependency. A dependency between an instruction in a function fp/f0 and an instruction of a function fq of its descendant is analyzed, and parallelization is performed with the analysis result. First, an instruction of a deeper function fq is relatively scheduled to analyze whether each instruction has dependency with an instruction of another function fp. When there is inter-instruction dependency, scheduling of the instruction of the function fq is performed so as to maintain the dependency and realize the shortest execution time.

Type: Application

Filed: November 15, 2007

Publication date: March 18, 2010

Inventor: Masamichi Takagi
Microprocessor instruction execution method for exploiting parallelism by time ordering operations in a single thread at compile time

Patent number: 7681016

Abstract: A low overhead mechanism for supporting speculative execution and code compression in a Very Long Instruction Word (VLIW) microprocessor. Profitable speculations can be determined statically at compile time and a low overhead hardware recovery mechanism used that does not require compensation code.

Type: Grant

Filed: June 30, 2003

Date of Patent: March 16, 2010

Assignee: Critical Blue Ltd.

Inventor: Richard Michael Taylor
Mechanism for pipelining loops with irregular loop control

Patent number: 7673294

Abstract: This invention modifies an irregular software pipelined loop conditioned upon data in a condition register in a compiler scheduled very long instruction word data processor to prevent over-execution upon loop exit. The method replaces a register modifying instruction with an instruction conditional upon the inverse condition register if possible. The method inserts a conditional register move instruction to a previously unused register within the loop if possible without disturbing the schedule. Then a restoring instruction is added after the loop. Alternatively, both these two functions can be performed by a delayed register move instruction. Instruction insertion is into a previously unused instruction slot of an execute packet. These changes can be performed manually or automatically by the compiler.

Type: Grant

Filed: January 18, 2006

Date of Patent: March 2, 2010

Assignee: Texas Instruments Incorporated

Inventors: Elana D. Granston, Jagadeesh Sankaran
System and method for compile-time non-concurrency analysis

Patent number: 7673295

Abstract: Compile-time non-concurrency analysis of parallel programs improves execution efficiency by detecting possible data race conditions within program barriers. Subroutines are modeled with control flow graphs and region trees having plural nodes related by edges that represent the hierarchical loop structure and construct relationship of statements. Phase partitioning of the control flow graph allows analysis of statement relationships with programming semantics, such as those of the OpenMP language, that define permitted operations and execution orders.

Type: Grant

Filed: April 27, 2004

Date of Patent: March 2, 2010

Assignee: Sun Microsystems, Inc.

Inventor: Yuan Lin
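
As a rough illustration of the phase partitioning mentioned in the abstract above, the sketch below splits a straight-line list of statements at barriers and treats two statements as possibly concurrent only when they fall in the same phase. This is a strong simplification: the patent works on control flow graphs and region trees, whereas the statement names, the `"barrier"` marker, and both helper functions here are hypothetical.

```python
def phases(statements):
    """Assign each statement a phase number; a "barrier" entry starts a new phase."""
    phase = 0
    assignment = {}
    for stmt in statements:
        if stmt == "barrier":
            phase += 1  # every thread must reach the barrier before continuing
        else:
            assignment[stmt] = phase
    return assignment


def may_run_concurrently(a, b, assignment):
    """Statements separated by a barrier cannot overlap in time."""
    return assignment[a] == assignment[b]


# Hypothetical parallel region with two barriers.
region = ["init_x", "write_a", "barrier", "read_a", "barrier", "print_a"]
assignment = phases(region)
```

With this partition, `write_a` and `read_a` land in different phases, so a race detector would not need to report that pair.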
Software application performance enhancement

Publication number: 20100042981

Abstract: Generating parallelized executable code from input code includes statically analyzing the input code to determine aspects of data flow and control flow of the input code; dynamically analyzing the input code to determine additional aspects of data flow and control flow of the input code; generating an intermediate representation of the input code based at least in part on the aspects of data flow and control flow of the input code identified by the static analysis and the additional aspects of data and control flow of the input code identified by the dynamic analysis; processing the intermediate representation to determine portions of the intermediate representation that are eligible for parallel execution; and generating parallelized executable code from the processed intermediate representation.

Type: Application

Filed: August 13, 2009

Publication date: February 18, 2010

Inventors: Robert Scott Dreyer, Joel Kevin Jones, Michael Douglas Sharp, Ivan Dimitrov Baev
Method and apparatus for detection and optimization of presumably parallel program regions

Publication number: 20100031241

Abstract: A method and apparatus for optimizing source code for use in a parallel computing environment by compiling an application source code, performing analysis, and optimizing the application source code. At the time of compilation, a compiler adds instrumentation to a prepared executable. An analysis program then analyzes the prepared executable and generates an analysis result. The analysis result is then used by the analysis program to optimize the application source code for parallel processing.

Type: Application

Filed: November 17, 2008

Publication date: February 4, 2010

Inventor: Leon Schwartz
Wavescalar architecture having a wave order memory

Patent number: 7657882

Abstract: A dataflow instruction set architecture and execution model, referred to as WaveScalar, which is designed for scalable, low-complexity/high-performance processors, while efficiently providing traditional memory semantics through a mechanism called wave-ordered memory. Wave-ordered memory enables “real-world” programs, written in any language, to be run on the WaveScalar architecture, as well as any out-of-order execution unit. Because it is software-controlled, wave-ordered memory can be disabled to obtain greater parallelism. WaveScalar also includes a software-controlled tag management system.

Type: Grant

Filed: January 21, 2005

Date of Patent: February 2, 2010

Assignee: University of Washington

Inventors: Mark H. Oskin, Steven J. Swanson, Susan J. Eggers
Method for processing data

Patent number: 7657877

Abstract: A method and device for translating a program to a system including at least one first processor and a reconfigurable unit. Code portions of the program which are suitable for the reconfigurable unit are determined. The remaining code of the program is extracted and/or separated for processing by the first processor.

Type: Grant

Filed: June 20, 2002

Date of Patent: February 2, 2010

Assignee: Pact XPP Technologies AG

Inventors: Martin Vorbach, Armin Nückel, Frank May, Markus Weinhardt, Joao Manuel Paiva Cardoso
Safe store for speculative helper threads

Patent number: 7657880

Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is permitted to execute Store instructions. Store blocker logic operates to prevent data associated with a Store instruction in a helper thread from being committed to memory. Dependence blocker logic operates to prevent data associated with a Store instruction in a speculative helper thread from being bypassed to a Load instruction in a non-speculative thread.

Type: Grant

Filed: August 1, 2003

Date of Patent: February 2, 2010

Assignee: Intel Corporation

Inventors: Hong Wang, Tor Aamodt, Per Hammarlund, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Steve Shih-wei Liao
CONTROLLING PARALLELIZATION OF RECURSION USING PLUGGABLE POLICIES

Publication number: 20090320005

Abstract: A parallelism policy object provides a control parallelism interface whose implementation evaluates parallelism conditions that are left unspecified in the interface. User-defined and other parallelism policy procedures can make recommendations to a worker program for transitioning between sequential program execution and parallel execution. Parallelizing assistance values obtained at runtime can be used in the parallelism conditions on which the recommendations are based. A consistent parallelization policy can be employed across a range of parallel constructs, and inside recursive procedures.

Type: Application

Filed: June 4, 2008

Publication date: December 24, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Stephen Toub, Igor Ostrovsky, Joe Duffy, Vance Morrison, Huseyin Yildiz
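
The abstract above describes a policy object that recommends when a worker should switch between sequential and parallel execution, including inside recursive procedures. One way to picture this is the sketch below, in which a recursive sum asks a policy at every level whether to fork. The `DepthPolicy` class, its `should_parallelize` method, and `tree_sum` are hypothetical names chosen for illustration and are not the interface defined in the application.

```python
import threading


class DepthPolicy:
    """Hypothetical policy: recommend parallelism only near the top of the recursion."""

    def __init__(self, max_parallel_depth):
        self.max_parallel_depth = max_parallel_depth

    def should_parallelize(self, depth, size):
        # Runtime values (recursion depth, problem size) drive the recommendation.
        return depth < self.max_parallel_depth and size > 1


def tree_sum(values, policy, depth=0):
    """Recursively sum a list, consulting the policy before forking a thread."""
    if len(values) <= 1:
        return values[0] if values else 0
    mid = len(values) // 2
    left, right = values[:mid], values[mid:]
    if policy.should_parallelize(depth, len(values)):
        result = {}

        def run_left():
            result["left"] = tree_sum(left, policy, depth + 1)

        worker = threading.Thread(target=run_left)
        worker.start()
        right_sum = tree_sum(right, policy, depth + 1)  # reuse the current thread
        worker.join()
        return result["left"] + right_sum
    # The policy recommended sequential execution at this level.
    return tree_sum(left, policy, depth + 1) + tree_sum(right, policy, depth + 1)


total = tree_sum(list(range(100)), DepthPolicy(max_parallel_depth=2))
```

Because the decision lives in the policy object rather than in `tree_sum`, a different policy (for example one based on input size alone) could be swapped in without touching the recursive code.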
SYSTEM AND METHOD FOR SCALING SIMULATIONS AND GAMES

Publication number: 20090307671

Abstract: A system and method for modeling simulation and game artificial intelligence as a data management problem. A scripting language that provides game designers and players with a data-driven artificial intelligence scheme for customizing behavior for individual agents. Query processing and indexing techniques to efficiently execute large numbers of agent scripts, thus providing a framework for games with a large number of agents.

Type: Application

Filed: June 8, 2009

Publication date: December 10, 2009

Applicant: CORNELL UNIVERSITY

Inventors: Walker White, Johannes Gehrke, Alan John Demers, Christoph Emanuel Koch
Programming Model and Software System for Exploiting Parallelism in Irregular Programs

Publication number: 20090307655

Abstract: Systems and methods for parallelizing applications that operate on irregular data structures. In an embodiment, the methods and systems enable programmers to use set iterators to express algorithms containing amorphous data parallelism. Parallelization can be achieved by speculatively executing multiple iterations of the iterator in parallel. Conflicts between speculatively executing iterations can be detected and handled using information in class libraries.

Type: Application

Filed: June 10, 2009

Publication date: December 10, 2009

Inventors: Keshav Kumar Pingali, Milind Vidyadhar Kulkarni
COMPOSABLE AND CANCELABLE DATAFLOW CONTINUATION PASSING

Publication number: 20090300591

Abstract: Parallel tasks are created, and the tasks include a first task and a second task. Each task resolves a future. At least one of three possible continuations for each of the tasks is supplied. The three continuations include a success continuation, a cancellation continuation, and a failure continuation. A value is returned as the future of the first task upon a success continuation for the first task. The value from the first task is used in the second task to compute a second future. The cancellation continuation is supplied if the task is cancelled and the failure continuation is supplied if the task does not return a value and the task is not cancelled.

Type: Application

Filed: June 2, 2008

Publication date: December 3, 2009

Applicant: MICROSOFT CORPORATION

Inventors: John Duffy, Stephen H. Toub
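
To make the three continuations in the abstract above concrete, the following sketch attaches a success, cancellation, or failure continuation to a future and exposes the outcome as a second future. It uses Python's standard `concurrent.futures.Future` rather than the system described in the application, and the `continue_with` helper and its parameter names are assumptions made for illustration.

```python
from concurrent.futures import Future


def continue_with(future, on_success, on_cancel, on_failure):
    """Attach one of three continuations to `future`; return a future for its outcome."""
    next_future = Future()

    def dispatch(done):
        try:
            if done.cancelled():
                next_future.set_result(on_cancel())
            elif done.exception() is not None:
                next_future.set_result(on_failure(done.exception()))
            else:
                # The first task's value feeds the computation of the second future.
                next_future.set_result(on_success(done.result()))
        except Exception as exc:
            # A continuation that raises fails the second future in turn.
            next_future.set_exception(exc)

    future.add_done_callback(dispatch)
    return next_future


first = Future()
second = continue_with(
    first,
    on_success=lambda value: value * 2,
    on_cancel=lambda: "cancelled",
    on_failure=lambda exc: f"failed: {exc}",
)
first.set_result(21)  # resolves `first`, which runs the success continuation
```

Since `continue_with` returns a future, calls can be chained, which is roughly what makes this style composable.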
Mechanism to optimize speculative parallel threading

Patent number: 7627864

Abstract: A method to optimize speculative parallel thread execution comprises selecting a plurality of partition candidate pairs for speculative parallel thread execution, transforming each partition candidate pair of the plurality of partition candidate pairs to improve the expected performance gain of each pair, and selecting a set of one or more transformed partition candidate pairs that do not interfere with each other and produce a maximum expected performance gain.

Type: Grant

Filed: June 27, 2005

Date of Patent: December 1, 2009

Assignee: Intel Corporation

Inventors: Zhao Hui Du, Tin-fook Ngai, Chu-cheow Lim
Parallelization scheme for generic reduction

Patent number: 7620945

Abstract: One embodiment of the present invention provides a system that supports parallelized generic reduction operations in a parallel programming language, wherein a reduction operation is an associative operation that can be divided into a group of sub-operations that can execute in parallel. During operation, the system detects generic reduction operations in source code. In doing so, the system identifies a set of reduction variables upon which the generic reduction operation will operate, along with a set of initial values for the variables. The system additionally identifies a merge operation that merges partial results from the parallel generic reduction operations into a final result. The system then compiles the program's source code into a form which facilitates executing the generic reduction operations in parallel.

Type: Grant

Filed: August 16, 2005

Date of Patent: November 17, 2009

Assignee: Sun Microsystems, Inc.

Inventors: Yonghong Song, Yuan Lin, Prashanth Narayanaswamy
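
The elements named in the abstract above (reduction variables with initial values, an associative operation, and a merge step for partial results) can be sketched as follows. This is an illustrative library-level version, not the compiler transformation the patent claims; `parallel_reduce` and its parameters are hypothetical, and in CPython the threads show the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def parallel_reduce(values, op, identity, merge, workers=4):
    """Reduce `values` with associative `op` in chunks, then merge the partial results.

    identity: initial value for every partial reduction (assumed neutral for op and merge)
    merge:    combines two partial results into one
    """
    if not values:
        return identity
    chunk = -(-len(values) // workers)  # ceiling division
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker reduces its own chunk starting from the initial value.
        partials = list(pool.map(lambda part: reduce(op, part, identity), chunks))
    # Partial results are merged into the final result.
    return reduce(merge, partials, identity)


total = parallel_reduce(
    list(range(1, 101)),
    op=lambda a, b: a + b,
    identity=0,
    merge=lambda a, b: a + b,
)
```

Splitting the work this way is only valid because the operation is associative; for a sum the merge happens to be the same function as the operation, but a generic reduction may supply a different one.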
Process for running programs with selectable instruction length processors and corresponding processor system

Patent number: 7617494

Abstract: The program to be executed is compiled by translating it into native instructions of the instruction-set architecture of the processor system, organizing the instructions deriving from the translation of the program into respective bundles in an order of successive bundles, each bundle grouping together instructions adapted to be executed in parallel by the processor system. The bundles of instructions are ordered into respective sub-bundles, said sub-bundles identifying a first set of instructions, which must be executed before the instructions belonging to the next bundle of said order, and a second set of instructions, which can be executed both before and in parallel with respect to the instructions belonging to said subsequent bundle of said order.

Type: Grant

Filed: July 1, 2003

Date of Patent: November 10, 2009

Assignee: STMicroelectronics S.r.l.

Inventors: Fabrizio Simone Rovati, Antonio Maria Borneo, Danilo Pietro Pau
Method and apparatus and determining processor utilization

Patent number: 7617488

Abstract: A method and an apparatus for determining processor utilization have been disclosed. In one embodiment, the method includes determining processor utilization in a data processing system and synchronizing execution of a number of threads in the data processing system to prevent interrupting the determining of the processor utilization. Other embodiments have been claimed and described.

Type: Grant

Filed: December 30, 2003

Date of Patent: November 10, 2009

Assignee: Intel Corporation

Inventors: Vasudevan Srinivasan, Avinash P. Chakravarthy
SYSTEM AND METHOD FOR THE DISTRIBUTION OF A PROGRAM AMONG COOPERATING PROCESSING ELEMENTS

Publication number: 20090271774

Abstract: A Veil program analyzes the source code and/or data of an existing sequential target program and determines how best to distribute the target program and data among the processing elements of a multi-processing element computing system. The Veil program analyzes source code loops, data sizes and types to prepare a set of distribution attempts, whereby each distribution is run under a run-time evaluation wrapper and evaluated to determine the optimal distribution across the available processing elements.

Type: Application

Filed: July 3, 2009

Publication date: October 29, 2009

Applicant: MANAGEMENT SERVICES GROUP, INC. d/b/a Global Technical Systems

Inventors: Robert Stephen Gordy, Terry Spitzer
Method and apparatus for sharing data between discrete clusters of processors

Patent number: 7606698

Abstract: A method and apparatus for sharing data between processors within first and second discrete clusters of processors. The method comprises supplying a first amount of data from a first data array in a first discrete cluster of processors to selector logic. A second amount of data from a second data array in a second discrete cluster of processors is also supplied to the selector logic. The first or second amount of data is then selected using the selector logic, and supplied to a shared input port on a processor in the first discrete cluster of processors. The apparatus comprises selector logic for selecting between input data supplied by a first data array and a second data array. The data arrays are located within different discrete clusters of processors. The selected data is then supplied to a shared input port on a processor.

Type: Grant

Filed: September 26, 2006

Date of Patent: October 20, 2009

Assignee: Cadence Design Systems, Inc.

Inventors: Beshara G. Elmufdi, Mitchell G. Poplack
Compiler program, a computer-readable storage medium storing a compiler program, a compiling method and a compiling unit

Patent number: 7590976

Abstract: The present invention relates to a compiler program, a computer-readable storage medium storing such a compiler program, a compiling method and a compiling unit, and an object thereof is to automatically generate a reentrant object program. In order to accomplish this object, an address saving program generator 16a generates an address saving program for saving a data area address of a calling program module; an address setting program generator 16b generates an address setting program for setting a data area address of another program module; a transferring program generator 16c generates a transferring program for the transfer from the calling program module to the other program module; an address resetting program generator 16d generates an address resetting program for reading and resetting the saved data area address; and an accessing program generator 16e generates an accessing program for accessing a data area for the other program module using a relative address from the set data area address.

Type: Grant

Filed: December 26, 2003

Date of Patent: September 15, 2009

Assignee: Panasonic Corporation

Inventors: Masaki Kawai, Takuji Kawamoto, Shusuke Haruna, Yutaka Fujihara
