For A Parallel Or Multiprocessor System Patents (Class 717/149)
  • Publication number: 20100229161
    Abstract: A compile technique is provided for multicore allocation, by which a desired running performance can be achieved. The steps of analyzing a taskization directive, taskizing a specified part, and assigning a specified CPU the task are adopted for the compile technique. According to the program-to-tasks-decomposition compile technique, the multicore decomposition is performed by allocating tasks to CPUs individually while following a task decomposition directive of a main part designated by a user. When no directive is issued concerning a CPU to be allocated, the relation with a principal task is judged from the relation of invocation and the dependency, and then the CPU to be allocated is determined. In allocation to the CPU, an efficient multicore-task decomposition is achieved in consideration of copy and assignment of one processing to more than one CPU while figuring in the balance between processing speed and resources.
    Type: Application
    Filed: January 27, 2010
    Publication date: September 9, 2010
    Applicant: RENESAS TECHNOLOGY CORP.
    Inventor: Noriyasu MORI
  • Patent number: 7793278
    Abstract: Systems and methods perform affine partitioning on a code stream to produce code segments that may be parallelized. The code segments include copies of the original code stream with a conditional inserted that aids in parallelizing code. The conditional is formed by determining the constraints on a processor variable determined by the affine partitioning and applying the constraints to the original code stream.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: September 7, 2010
    Assignee: Intel Corporation
    Inventors: Zhao Hui Du, Shih-Wei Liao, Gansha Wu, Guei-Yuan Lueh
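The guarded-copy idea in the abstract of patent 7793278 can be illustrated with a minimal sketch. It is not the patented method: the loop nest, the affine mapping `p = i`, and the function names `sequential`, `guarded_copy`, and `run_partitioned` are all assumptions chosen for illustration.

```python
def sequential(rows, cols):
    # Reference loop nest: the only dependence runs along j within a row.
    a = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(1, cols):
            a[i][j] = a[i][j - 1] + i
    return a


def guarded_copy(a, p):
    # A copy of the original loop nest with one inserted conditional.
    # The guard applies the constraint on the processor variable p that
    # follows from the assumed affine mapping p = i.
    rows, cols = len(a), len(a[0])
    for i in range(rows):
        for j in range(1, cols):
            if i == p:
                a[i][j] = a[i][j - 1] + i


def run_partitioned(rows, cols):
    # The copies run one after another here for simplicity. Each copy
    # writes a disjoint row, so they could run on separate processors.
    a = [[0] * cols for _ in range(rows)]
    for p in range(rows):
        guarded_copy(a, p)
    return a
```

A real compiler would normally simplify the guard away (for example by collapsing the `i` loop to the single iteration `i == p`); the sketch keeps it explicit to show where the constraint enters.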
  • Patent number: 7793276
    Abstract: In some embodiments, a method and apparatus for automatically parallelizing a sequential network application through pipeline transformation are described. In one embodiment, the method includes the configuration of a network processor into a D-stage processor pipeline. Once configured, a sequential network application program is transformed into D-pipeline stages. Once transformed, the D-pipeline stages are executed in parallel within the D-stage processor pipeline. In one embodiment, transformation of a sequential application program is performed by modeling the sequential network program as a flow network model and selecting from the flow network model into a plurality of preliminary pipeline stages. Other embodiments are described and claimed.
    Type: Grant
    Filed: November 14, 2003
    Date of Patent: September 7, 2010
    Assignee: Intel Corporation
    Inventors: Jinquan Dai, Luddy Harrison, Bo Huang, Cotton Seed, Long Li
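The execution model in the abstract of patent 7793276, D stages running in parallel with work handed from one stage to the next, can be sketched as below. The sketch assumes the stages are already given as plain functions; it does not attempt the flow-network transformation that produces them. The names `run_pipeline` and `_DONE` are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel that tells each stage to shut down


def run_pipeline(stages, packets):
    # One FIFO queue in front of every stage, plus one for the output.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(stage, q_in, q_out):
        while True:
            item = q_in.get()
            if item is _DONE:
                q_out.put(_DONE)  # pass the shutdown signal downstream
                return
            q_out.put(stage(item))

    threads = [
        threading.Thread(target=worker, args=(s, queues[k], queues[k + 1]))
        for k, s in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for p in packets:
        queues[0].put(p)
    queues[0].put(_DONE)

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Because each stage has a single worker and the queues are FIFO, results come out in the order the packets went in.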
  • Patent number: 7788672
    Abstract: According to one embodiment, an information processing apparatus includes a plurality of execution modules and a scheduler which controls assignment of a plurality of basic modules to the plurality of execution modules. The scheduler includes assigning, when an available execution module which is not assigned any basic modules exists, a basic module which stands by for completion of execution of another basic module to the available execution module, measuring an execution time of processing of the basic module itself, measuring execution time of processing for assigning the basic module to the execution module, and performing granularity adjustment by linking two or more basic modules to be successively executed according to the restriction of an execution sequence so as to be assigned as one set to the execution module and redividing the linked two or more basic modules, based on the two measured execution times.
    Type: Grant
    Filed: April 6, 2009
    Date of Patent: August 31, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Yasuyuki Tanaka
  • Patent number: 7788650
    Abstract: Source code includes a directive to indicate data structures of related data to a compiler. The compiler associates the related data to the same one of multiple processors in a multiprocessor environment. The compiler searches the source code for locks associated with the related data, and generates executable code that is modified with respect to locks written in the source code. The compiler may replace or remove locks written in the source code to protect access to the related data, resulting in an executable program that does not include the locks.
    Type: Grant
    Filed: May 10, 2005
    Date of Patent: August 31, 2010
    Assignee: Intel Corporation
    Inventors: Erik J. Johnson, Stephen D. Goglin
  • Patent number: 7779393
    Abstract: A system for efficiently verifying compliance with a memory consistency model includes a test module and an analysis module. The test module may coordinate an execution of a multithreaded test program on a test platform. If the test platform provides an indication of the order in which writes from multiple processing elements are performed at shared memory locations, the analysis module may use a first set of rules to verify that the results of the execution correspond to a valid ordering of events according to a memory consistency model. If the test platform does not provide an indication of write ordering, the analysis module may use a second set of rules to verify compliance with the memory consistency model.
    Type: Grant
    Filed: May 25, 2005
    Date of Patent: August 17, 2010
    Assignee: Oracle America, Inc.
    Inventors: Chaiyasit Manovit, Sudheendra G. Hangal, Robert E. Cypher
  • Patent number: 7779382
    Abstract: Validity of one or more assertions for any concurrent execution of a plurality of software instructions with at most k−1 context switches can be determined. Validity checking can account for execution of the software instructions in an unbounded stack depth scenario. A finite data domain representation can be used. The software instructions can be represented by a pushdown system. Validity checking can account for thread creation during execution of the plurality of software instructions.
    Type: Grant
    Filed: December 10, 2004
    Date of Patent: August 17, 2010
    Assignee: Microsoft Corporation
    Inventors: Niels Jakob Rehof, Shaz Qadeer
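The bound of k−1 context switches in the abstract of patent 7779382 can be made concrete with a small explicit-state sketch. It is far simpler than the pushdown-system analysis the abstract describes: it assumes each thread is a finite straight-line list of steps and enumerates schedules by brute force. The names `schedules`, `holds_within_bound`, and `incrementer` are hypothetical.

```python
def schedules(lengths, max_switches):
    """Yield every complete schedule (tuple of thread ids) that changes
    the running thread at most max_switches times."""
    def extend(prefix, remaining, switches):
        if all(r == 0 for r in remaining):
            yield tuple(prefix)
            return
        for t, r in enumerate(remaining):
            if r == 0:
                continue
            cost = 1 if prefix and prefix[-1] != t else 0
            if switches + cost > max_switches:
                continue
            remaining[t] -= 1
            prefix.append(t)
            yield from extend(prefix, remaining, switches + cost)
            prefix.pop()
            remaining[t] += 1

    yield from extend([], list(lengths), 0)


def holds_within_bound(threads, k, make_state, assertion):
    # True if the assertion holds at the end of every execution that
    # uses at most k - 1 context switches.
    for sched in schedules([len(t) for t in threads], k - 1):
        state = make_state()
        pcs = [0] * len(threads)
        for t in sched:
            threads[t][pcs[t]](state)
            pcs[t] += 1
        if not assertion(state):
            return False
    return True


def incrementer(t):
    # A non-atomic increment of x: read into a private slot, then write.
    def read(s):
        s["tmp"][t] = s["x"]

    def write(s):
        s["x"] = s["tmp"][t] + 1

    return [read, write]
```

With two `incrementer` threads and the assertion `x == 2`, the check passes for k = 2 (one switch allows only back-to-back execution) and fails for k = 3, where one thread can be preempted between its read and its write.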
  • Publication number: 20100205588
    Abstract: General-purpose distributed data-parallel computing using a high-level language is disclosed. Data parallel portions of a sequential program that is written by a developer in a high-level language are automatically translated into a distributed execution plan. The distributed execution plan is then executed on large compute clusters. Thus, the developer is allowed to write the program using familiar programming constructs in the high level language. Moreover, developers without experience with distributed compute systems are able to take advantage of such systems.
    Type: Application
    Filed: February 9, 2009
    Publication date: August 12, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Yuan Yu, Dennis Fetterly, Michael Isard, Ulfar Erlingsson, Mihai Budiu
  • Patent number: 7774768
    Abstract: An improved method of optimizing the instruction set of a digital processor using code compression. In one embodiment, the method comprises obtaining an assembly language program to be used for the optimization process; calculating the static frequency of each instruction type from the base instruction set; sorting the instruction types by frequency; determining the number and type of instructions necessary for correct program execution; creating a compressed instruction set encoding; re-evaluating the compressed instruction according to the foregoing steps; and generating an instruction set encoding for the compressed instruction set. Improved compressed instruction formats and register structures useful in a processor are also disclosed. A computer program and apparatus for synthesizing logic implementing the aforementioned data cache architecture and pipeline performance enhancements are further disclosed.
    Type: Grant
    Filed: May 22, 2006
    Date of Patent: August 10, 2010
    Assignee: ARC International, PLC
    Inventor: Peter Warnes
  • Publication number: 20100199257
    Abstract: A method and a system for transformation-based program generation using two separate specifications as input: An implementation neutral specification of the desired computation and a specification of the execution platform. The generated implementation incorporates execution platform opportunities such as parallelism. Operationally, the invention has two broad stages. First, it designs the abstract implementation in the problem domain in terms of an Intermediate Language (IL) that is unfettered by programming language restrictions and requirements. Concurrently, the design is evolved by specializing the IL to encapsulate a plurality of desired design features in the implementation such as partitioning for multicore and/or instruction level parallelism. Concurrently, constraints that stand in for implied implementation structures are added to the design and coordinated with other constraints. Second, the IL is refined into implementation code.
    Type: Application
    Filed: January 31, 2009
    Publication date: August 5, 2010
    Inventor: Ted James Biggerstaff
  • Patent number: 7765532
    Abstract: An Induced Multi-threading (IMT) framework may be configured to induce multi-threaded execution in software code. In one embodiment, the IMT framework may include a concurrent code generator configured to receive marked code having one or more blocks of code marked for concurrent execution. Software code initially intended for sequential execution may have been automatically marked by an automated code marker and/or marked manually to generate the marked code. The concurrent code generator may be configured to generate concurrent code from the marked code. The concurrent code may include one or more tasks configured for concurrent execution in place of the one or more marked blocks of code. In one embodiment, the IMT framework may also include a scheduler configured to schedule one or more of the tasks for multi-threaded execution.
    Type: Grant
    Filed: October 22, 2002
    Date of Patent: July 27, 2010
    Assignee: Oracle America, Inc.
    Inventors: Bala Dutt, Ajay Kumar, Hanumantha R. Susarla
  • Patent number: 7757222
    Abstract: Code is affine partitioned to generate affine partitioning mappings. Parallel code is generated based on the affine partitioning mappings. Generating the parallel code includes coalescing loops in the parallel code generated from the affine partitioning mappings to generate coalesced parallel code and optimizing the coalesced parallel code.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: July 13, 2010
    Assignee: Intel Corporation
    Inventors: Shih-wei Liao, Zhao Hui Du, Bu Qi Cheng, Gansha Wu, Guei-Yuan Lueh
  • Publication number: 20100175049
    Abstract: Embodiments of the present invention relate to systems, methods and computer storage media for providing Structured Computations Optimized for Parallel Execution (SCOPE) that facilitate analysis of a large-scale dataset utilizing row data of those data sets. SCOPE includes, among other features, an extract command for extracting data bytes from a data stream and structuring the data bytes as data rows having strictly defined columns. SCOPE also includes a process command and a reduce command that identify data rows as inputs. The reduce command also identifies a reduce key that facilitates the reduction based on the reduce key. SCOPE additionally includes a combine command that identifies two data row sets that are to be combined based on an identified joint condition. Additionally, SCOPE includes a select command that leverages SQL and C# languages to create an expressive script that is capable of analyzing large-scale data sets in a parallel computing environment.
    Type: Application
    Filed: January 7, 2009
    Publication date: July 8, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: WILLIAM D. RAMSEY, RONNIE IRA CHAIKEN, DARREN A. SHAKIB, ROBERT JOHN JENKINS, JR., SIMON J. WEAVER, JINGREN ZHOU, DANIEL DEDU-CONSTANTIN, ACHINT SRIVASTAVA
  • Patent number: 7752212
    Abstract: A computer-implemented method of creating a schema specific parser for processing Extensible Markup Language (XML) documents can include receiving an XML schema comprising a plurality of components, determining a hierarchy of the plurality of components of the XML schema, and creating an execution plan specifying a hierarchy of XML processing instructions. Each XML processing instruction can be associated with an XML processing function of a virtual machine that performs an XML document processing task. The hierarchy of XML processing instructions can be determined according to the hierarchy of components of the XML schema. An instruction causing the virtual machine to invoke a de-serialization module that extracts at least one item of information from the XML document can be inserted into the execution plan. The execution plan can be compiled into a bytecode version of the execution plan that is interpretable by the virtual machine. The bytecode version of the execution plan can be output.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: July 6, 2010
    Assignee: International Business Machines Corporation
    Inventors: Abraham Heifets, Margaret G. Kostoulas, Moshe Morris Emanuel Matsa, Eric Perkins
  • Patent number: 7747996
    Abstract: A method for enabling interoperability of a locking synchronization method with a lock-free synchronization method in a multi-threaded environment is presented. The method examines a class file for mutable fields contained in critical code sections. The mutable fields are transferred to a shadow record and a pointer is substituted in the class field for each transferred mutable field. Code is altered so that the lock-free synchronization method is used if a lock is not held on the object. Atomic compare and swap operations are employed after mutable fields are updated during execution of the lock-free synchronization method.
    Type: Grant
    Filed: May 25, 2006
    Date of Patent: June 29, 2010
    Assignee: Oracle America, Inc.
    Inventor: David Dice
  • Patent number: 7743087
    Abstract: The present invention provides a method and system for the dynamic distribution of an array in a parallel computing environment. The present invention obtains a criterion for distributing an array and performs flexible portioning based on the obtained criterion. In some embodiments, analysis may be performed based on the criterion. The flexible portioning is then performed based on the analysis.
    Type: Grant
    Filed: March 22, 2006
    Date of Patent: June 22, 2010
    Assignee: The Math Works, Inc.
    Inventors: Penelope Anderson, Cleve Moler, Sheung Hun Cheng, Patrick D. Quillen
  • Publication number: 20100153937
    Abstract: A computer system for executing a computer program on parallel processors, the system having a compiler for identifying within a computer program concurrency markers that indicate that code between them can be executed in parallel and should be executed with delayed side-effects; and an execution system that is operable to execute the code identified by the concurrency markers to generate a queue of side-effects and after execution of that code is completed, sequentially execute the queue of side-effects.
    Type: Application
    Filed: January 26, 2007
    Publication date: June 17, 2010
    Applicant: CODEPLAY SOFTWARE LIMITED
    Inventors: Andrew Richards, Andrew Cook, Colin Riley
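The two-phase execution in the abstract of publication 20100153937 (run the marked code in parallel while queuing side-effects, then replay the queue sequentially) can be sketched as follows. This is an illustration under assumptions, not the Codeplay system: `run_with_delayed_side_effects`, `compute`, and `apply_effect` are hypothetical names, and `compute` is assumed to touch no shared state.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_delayed_side_effects(items, compute, apply_effect):
    # Phase 1: the marked region runs in parallel. compute(item) returns
    # a list describing the side-effects it would have performed.
    with ThreadPoolExecutor() as pool:
        per_item_effects = list(pool.map(compute, items))

    # Phase 2: after all parallel work has finished, replay the queued
    # side-effects one at a time, in the original item order.
    for effects in per_item_effects:
        for effect in effects:
            apply_effect(effect)
```

`pool.map` returns results in input order, so the replay order matches a sequential run even though the computations may finish out of order.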
  • Patent number: 7739667
    Abstract: A system for conducting performance analysis for executing tasks. The analysis involves generating a variety of trace information related to performance measures, including parallelism-related information, during execution of the task. In order to generate the trace information, target source code of interest is compiled in such a manner that executing the resulting executable code will generate execution trace information composed of a series of events. Each event stores trace information related to a variety of performance measures for the one or more processors and protection domains used. After the execution trace information has been generated, the system can use that trace information and a trace information description file to produce useful performance measure information. The trace information description file contains information that describes the types of execution events as well as the structure of the stored information.
    Type: Grant
    Filed: October 19, 2005
    Date of Patent: June 15, 2010
    Assignee: Cray Inc.
    Inventors: Charles David Callahan, II, Keith Arnett Shields, Preston Pengra Briggs, III
  • Patent number: 7730463
    Abstract: A computer implemented method, system and computer program product for automatically generating SIMD code. The method begins by analyzing data to be accessed by a targeted loop including at least one statement, where each statement has at least one memory reference, to determine if memory accesses are safe. If memory accesses are safe, the targeted loop is simdized. If not safe, it is determined if a scheme can be applied in which safety need not be guaranteed. If such a scheme can be applied, the targeted loop is simdized according to the scheme. If such a scheme cannot be applied, it is determined if padding is appropriate. If padding is appropriate, the data is padded and the targeted loop is simdized. If padding is not appropriate, non-simdized code is generated based on the targeted loop for handling boundary conditions, the targeted loop is simdized and combined with the non-simdized code.
    Type: Grant
    Filed: February 21, 2006
    Date of Patent: June 1, 2010
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu, Peng Zhao
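The last fallback in the abstract of patent 7730463, a simdized loop combined with non-simdized code for the boundary, can be sketched as below. Python has no SIMD instructions, so a slice-wide operation stands in for one vector operation; `WIDTH` and `add_arrays` are hypothetical names and the vector width of 4 is an assumption.

```python
WIDTH = 4  # assumed number of elements per vector register


def add_arrays(a, b):
    n = len(a)
    out = [0] * n
    main_end = n - n % WIDTH  # largest multiple of WIDTH not above n

    # "Simdized" main loop: each iteration handles WIDTH elements and
    # stands in for a single vector add.
    for i in range(0, main_end, WIDTH):
        out[i:i + WIDTH] = [x + y for x, y in zip(a[i:i + WIDTH], b[i:i + WIDTH])]

    # Non-simdized boundary code: the leftover elements are handled one
    # at a time so that no access goes past the end of the arrays.
    for i in range(main_end, n):
        out[i] = a[i] + b[i]
    return out
```

The padding alternative mentioned in the abstract would instead extend the data to a multiple of `WIDTH`, which removes the scalar loop at the cost of extra memory.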
  • Patent number: 7730483
    Abstract: The invention relates to a system and also method for storage of project-planning data in an automation system containing automation devices. To simplify changes within the automation system the project-planning data is stored in a generic, expandable data storage format, with parts of the project-planning data being assigned runtime data in each case, with the runtime data being assigned at least one automation device in each case, with the runtime data being executable parts of programs on the automation devices assigned to it and with the parts of the project-planning data being stored distributed in parallel to the runtime data assigned to it in each case in the automation devices assigned to the runtime data in each case.
    Type: Grant
    Filed: July 28, 2005
    Date of Patent: June 1, 2010
    Assignee: Siemens Aktiengesellschaft
    Inventors: Martin Daimer, Ludwig Karl-Dietze, Andreas Macher, Siegfried Prieler
  • Patent number: 7721273
    Abstract: The present invention relates to a system and methodology facilitating automated manufacturing processes in an industrial controller environment. An automation system is provided for automated industrial processing. The system includes an equipment phase object that is executed by a controller engine, wherein the equipment phase object can be accessible from internal instructions within the controller and/or from external instructions directed to the controller such as from a server or another controller across a network connection. A sequencing engine operates with the equipment phase object to facilitate automated industrial processing. The sequencing engine can be adapted to various industrial standards or in accordance with other state type models.
    Type: Grant
    Filed: June 4, 2004
    Date of Patent: May 18, 2010
    Assignee: Rockwell Automation Technologies, Inc.
    Inventors: Kenwood H. Hall, Stephen D. Ryan, Richard Alan Morse, Kam-Por Yuen, Raymond J. Staron, Paul R. D'Mura, James H. Jarrett, Michael D. Kalan, Robert C. Kline, Jr., Charles Martin Rischar, Christopher E. Stanek, Tao Zhao, Kenneth S. Plache, Shoshana L. Wodzisz, Jan Bezdicek, David A. Johnston, Jeffery W. Brooks
  • Patent number: 7716638
    Abstract: A machine readable description of a new feature of a processor is provided by a processor vendor. Control code executing on a processor, such as a traditional operating system kernel, a partitioning kernel, or the like can be programmed to receive the description of the feature and to use information provided by the description to detect, enable and manage operation of the new feature.
    Type: Grant
    Filed: March 4, 2005
    Date of Patent: May 11, 2010
    Assignee: Microsoft Corporation
    Inventor: Andrew J. Thornton
  • Patent number: 7712080
    Abstract: The present invention relates generally to computer programming, and more particularly to systems and methods for parallel distributed programming. Generally, a parallel distributed program is configured to operate across multiple processors and multiple memories. In one aspect of the invention, a parallel distributed program includes a distributed shared variable located across the multiple memories and distributed programs capable of operating across multiple processors.
    Type: Grant
    Filed: May 21, 2004
    Date of Patent: May 4, 2010
    Assignee: The Regents of the University of California
    Inventors: Lei Pan, Lubomir R. Bic, Michael B. Dillencourt
  • Patent number: 7712090
    Abstract: Methods and apparatus, including computer program products, for generating an executable program, including receiving serial compile commands in a pseudo-compiler to compile source code modules, scheduling the serial compile commands in parallel compilers to compile the source code modules, compiling the source code modules in the parallel compilers to generate object code modules, sending compiler completion acknowledgements to a synchronizer and linking the object code modules in linkers in response to linker initiation commands from the synchronizer.
    Type: Grant
    Filed: February 7, 2003
    Date of Patent: May 4, 2010
    Assignee: SAP Aktiengesellschaft
    Inventor: Thomas Stuefe
  • Patent number: 7707543
    Abstract: A method, a device and a system arrangement are disclosed for generating self-contained software components having in each case synchronous and/or asynchronous interfaces with an internal threading model. The concept disclosed enables all necessary synchronization mechanisms to be provided automatically. The concept is based on an asynchronous operation manager used to divert callbacks from a called component into a calling component.
    Type: Grant
    Filed: November 22, 2005
    Date of Patent: April 27, 2010
    Assignee: Siemens Aktiengesellschaft
    Inventors: Detlef Becker, Karlheinz Dorn, Vladyslav Ukis, Hans-Martin Von Stockhausen
  • Patent number: 7694289
    Abstract: Methods for embedding codes executable in a first system having a first microprocessor into codes executable in a second system having a second microprocessor are described herein. In one aspect of the invention, an exemplary method includes providing first codes having a routine, the first codes being compilable to be executed in the first system, and compiling the first codes, resulting in second codes; the second codes comprising opcodes of the routine executable by the first system, which convert the second codes into third codes automatically, the third codes being compilable to be executed by the second system; this is followed by compiling the third codes, resulting in the fourth codes being executable in the second system, and linking the fourth codes, generating an executable image and executing the executable image in the second system. Other methods and apparatuses are also described.
    Type: Grant
    Filed: December 5, 2005
    Date of Patent: April 6, 2010
    Assignee: Apple Inc.
    Inventor: Keith Stattenfield
  • Patent number: 7689980
    Abstract: Linear transformations of statements in code are performed to generate linear expressions associated with the statements. Parallel code is generated using the linear expressions. Generating the parallel code includes splitting the computation-space of the statements into intervals and generating parallel code for the intervals.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: March 30, 2010
    Assignee: Intel Corporation
    Inventors: Zhao Hui Du, Shih-wei Liao, Gansha Wu, Guei-Yuan Lueh
  • Patent number: 7689977
    Abstract: The present disclosure is directed to a method for providing an OpenMP reduction implementation. The method may comprise creating an aggregate of at least one reduction variable in a parallel region or a work-sharing construct; defining a pointer variable, the pointer variable pointing to a dynamic array of the aggregate; creating an initialization routine, an outlined routine and a reduction accumulation routine; replacing the parallel region or the work-sharing construct with a runtime routine, the runtime routine taking a plurality of arguments including an address of the initialization routine, an address of the outlined routine, an address of the reduction accumulation routine, an address of the pointer variable, and a size of the aggregate; and executing the runtime routine when the at least one reduction variable is in the parallel region or the work-sharing construct.
    Type: Grant
    Filed: April 15, 2009
    Date of Patent: March 30, 2010
    Assignee: International Business Machines Corporation
    Inventors: Guansong Zhang, Shimin Cui, Ettore Tiotto
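The structure in the abstract of patent 7689977 (an aggregate of reduction variables, an initialization routine, an outlined routine, a reduction accumulation routine, and a runtime routine that receives them) can be mirrored in a short sketch. It is not an OpenMP runtime: the aggregate is modeled as a dictionary, the routines are passed as callables rather than addresses, and every name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def init_routine(aggregate):
    # Set every reduction variable in the aggregate to its identity.
    aggregate["total"] = 0
    aggregate["count"] = 0


def outlined_routine(aggregate, chunk):
    # Body of the parallel region, working on a private aggregate.
    for x in chunk:
        aggregate["total"] += x
        aggregate["count"] += 1


def accumulate_routine(dest, src):
    # Fold one private aggregate into the destination aggregate.
    dest["total"] += src["total"]
    dest["count"] += src["count"]


def runtime_routine(init, outlined, accumulate, data, num_threads):
    # Dynamic array of aggregates: one private copy per thread.
    privates = [{} for _ in range(num_threads)]
    for agg in privates:
        init(agg)

    chunks = [data[t::num_threads] for t in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() forces completion and surfaces any worker exception.
        list(pool.map(outlined, privates, chunks))

    result = {}
    init(result)
    for agg in privates:
        accumulate(result, agg)
    return result
```

Keeping the reduction variables together in one aggregate means a single accumulation pass combines all of them, rather than one pass per variable.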
  • Patent number: 7689971
    Abstract: Methods and apparatuses provide for referencing thread local variables (TLVs) with techniques such as stack address mapping. A method may involve a head pointer that points to a set of thread local variables (TLVs) of a thread. A method according to one embodiment may include an operation for storing the head pointer in a global data structure in a user space of a processing system. The head pointer may subsequently be retrieved from the global data structure and used to access one or more TLVs associated with the thread. In one embodiment, the head pointer is retrieved without executing any kernel system calls. In an example embodiment, the head pointer is stored in a global array, and a stack address for the thread is used to derive an index into the array. Other embodiments are described and claimed.
    Type: Grant
    Filed: August 9, 2004
    Date of Patent: March 30, 2010
    Assignee: Intel Corporation
    Inventors: Jinzhan Peng, Xiaohua Shi, Guei-Yuan Lueh, Gansha Wu
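The example embodiment in the abstract of patent 7689971, where a stack address is used to derive an index into a global array of head pointers, can be sketched with plain arithmetic. The sketch assumes fixed-size thread stacks laid out contiguously from a known base; `STACK_BASE`, `STACK_SIZE`, `MAX_THREADS`, and the function names are hypothetical, and integers stand in for real addresses.

```python
STACK_SIZE = 1 << 20        # assumed: every thread stack is 1 MiB
STACK_BASE = 0x7000_0000    # assumed: stacks are laid out from this address
MAX_THREADS = 64

# Global array in user space holding one TLV head pointer per thread.
head_pointers = [None] * MAX_THREADS


def slot_for(stack_address):
    # Any address inside a thread's stack maps to the same slot.
    return (stack_address - STACK_BASE) // STACK_SIZE


def register_thread(stack_address, tlvs):
    head_pointers[slot_for(stack_address)] = tlvs


def lookup_tlvs(stack_address):
    # Pure arithmetic plus one array read: no kernel call is involved.
    return head_pointers[slot_for(stack_address)]
```

In native code the stack address would come from the current stack pointer, so a thread can find its own variables without asking the operating system which thread it is.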
  • Patent number: 7685583
    Abstract: We present a technique for implementing obstruction-free atomic multi-target transactions that target special “transactionable” locations in shared memory. A programming interface for using operations based on these transactions can be structured in several ways, including as n-word compare-and-swap (NCAS) operations or as atomic sequences of single-word loads and stores (e.g., as transactional memory).
    Type: Grant
    Filed: July 16, 2003
    Date of Patent: March 23, 2010
    Assignee: Sun Microsystems, Inc.
    Inventors: Mark S. Moir, Victor M. Luchangco, Maurice Herlihy
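The interface of the n-word compare-and-swap (NCAS) operation named in the abstract of patent 7685583 can be shown with a small sketch. It demonstrates only the semantics of the call. It uses a global lock for atomicity, so it is explicitly not obstruction-free and not the patented technique; `ncas` and `_ncas_lock` are hypothetical names.

```python
import threading

_ncas_lock = threading.Lock()  # stand-in for atomicity; NOT obstruction-free


def ncas(memory, addresses, expected, new):
    """Atomically: if memory[a] equals the matching expected value for
    every address, store the matching new values and return True.
    Otherwise change nothing and return False."""
    with _ncas_lock:
        if any(memory[a] != e for a, e in zip(addresses, expected)):
            return False
        for a, v in zip(addresses, new):
            memory[a] = v
        return True
```

A caller typically reads the current values, computes the replacements, and retries the `ncas` call until it succeeds, the same pattern used with single-word compare-and-swap.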
  • Publication number: 20100070958
    Abstract: Provided are a program parallelizing method and a program parallelizing apparatus that enable efficient generation of a parallelized program with shorter parallel execution time. An instruction is scheduled by referring to inter-instruction dependency. A dependency between an instruction in a function fp/f0 and an instruction of a function fq of its descendant is analyzed, and parallelization is performed with the analysis result. First, an instruction of a deeper function fq is relatively scheduled to analyze whether each instruction has dependency with an instruction of another function fp. When there is inter-instruction dependency, scheduling of the instruction of the function fq is performed so as to maintain the dependency and realize the shortest execution time.
    Type: Application
    Filed: November 15, 2007
    Publication date: March 18, 2010
    Inventor: Masamichi Takagi
  • Patent number: 7681016
    Abstract: A low overhead mechanism for supporting speculative execution and code compression in a Very Long Instruction Word (VLIW) microprocessor. Profitable speculations can be determined statically at compile time and a low overhead hardware recovery mechanism used that does not require compensation code.
    Type: Grant
    Filed: June 30, 2003
    Date of Patent: March 16, 2010
    Assignee: Critical Blue Ltd.
    Inventor: Richard Michael Taylor
  • Patent number: 7673294
    Abstract: This invention modifies an irregular software pipelined loop conditioned upon data in a condition register in a compiler scheduled very long instruction word data processor to prevent over-execution upon loop exit. The method replaces a register modifying instruction with an instruction conditional upon the inverse condition register if possible. The method inserts a conditional register move instruction to a previously unused register within the loop if possible without disturbing the schedule. Then a restoring instruction is added after the loop. Alternatively, both these two functions can be performed by a delayed register move instruction. Instruction insertion is into a previously unused instruction slot of an execute packet. These changes can be performed manually or automatically by the compiler.
    Type: Grant
    Filed: January 18, 2006
    Date of Patent: March 2, 2010
    Assignee: Texas Instruments Incorporated
    Inventors: Elana D. Granston, Jagadeesh Sankaran
  • Patent number: 7673295
    Abstract: Compile-time non-concurrency analysis of parallel programs improves execution efficiency by detecting possible data race conditions within program barriers. Subroutines are modeled with control flow graphs and region trees having plural nodes related by edges that represent the hierarchical loop structure and construct relationship of statements. Phase partitioning of the control flow graph allows analysis of statement relationships with programming semantics, such as those of the OpenMP language, that define permitted operations and execution orders.
    Type: Grant
    Filed: April 27, 2004
    Date of Patent: March 2, 2010
    Assignee: Sun Microsystems, Inc.
    Inventor: Yuan Lin
  • Publication number: 20100042981
    Abstract: Generating parallelized executable code from input code includes statically analyzing the input code to determine aspects of data flow and control flow of the input code; dynamically analyzing the input code to determine additional aspects of data flow and control flow of the input code; generating an intermediate representation of the input code based at least in part on the aspects of data flow and control flow of the input code identified by the static analysis and the additional aspects of data and control flow of the input code identified by the dynamic analysis; processing the intermediate representation to determine portions of the intermediate representation that are eligible for parallel execution; and generating parallelized executable code from the processed intermediate representation.
    Type: Application
    Filed: August 13, 2009
    Publication date: February 18, 2010
    Inventors: Robert Scott Dreyer, Joel Kevin Jones, Michael Douglas Sharp, Ivan Dimitrov Baev
  • Publication number: 20100031241
    Abstract: A method and apparatus for optimizing source code for use in a parallel computing environment by compiling an application source code, performing analysis, and optimizing the application source code. At the time of compilation, a compiler adds instrumentation to a prepared executable. An analysis program then analyzes the prepared executable and generates an analysis result. The analysis result is then used by the analysis program to optimize the application source code for parallel processing.
    Type: Application
    Filed: November 17, 2008
    Publication date: February 4, 2010
    Inventor: Leon Schwartz
  • Patent number: 7657882
    Abstract: A dataflow instruction set architecture and execution model, referred to as WaveScalar, which is designed for scalable, low-complexity/high-performance processors, while efficiently providing traditional memory semantics through a mechanism called wave-ordered memory. Wave-ordered memory enables “real-world” programs, written in any language, to be run on the WaveScalar architecture, as well as any out-of-order execution unit. Because it is software-controlled, wave-ordered memory can be disabled to obtain greater parallelism. WaveScalar also includes a software-controlled tag management system.
    Type: Grant
    Filed: January 21, 2005
    Date of Patent: February 2, 2010
    Assignee: University of Washington
    Inventors: Mark H. Oskin, Steven J. Swanson, Susan J. Eggers
  • Patent number: 7657877
    Abstract: A method and device for translating a program to a system including at least one first processor and a reconfigurable unit. Code portions of the program which are suitable for the reconfigurable unit are determined. The remaining code of the program is extracted and/or separated for processing by the first processor.
    Type: Grant
    Filed: June 20, 2002
    Date of Patent: February 2, 2010
    Assignee: Pact XPP Technologies AG
    Inventors: Martin Vorbach, Armin Nückel, Frank May, Markus Weinhardt, Joao Manuel Paiva Cardoso
  • Patent number: 7657880
    Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is permitted to execute Store instructions. Store blocker logic operates to prevent data associated with a Store instruction in a helper thread from being committed to memory. Dependence blocker logic operates to prevent data associated with a Store instruction in a speculative helper thread from being bypassed to a Load instruction in a non-speculative thread.
    Type: Grant
    Filed: August 1, 2003
    Date of Patent: February 2, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Tor Aamodt, Per Hammarlund, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Steve Shih-wei Liao
  • Publication number: 20090320005
    Abstract: A parallelism policy object provides a control parallelism interface whose implementation evaluates parallelism conditions that are left unspecified in the interface. User-defined and other parallelism policy procedures can make recommendations to a worker program for transitioning between sequential program execution and parallel execution. Parallelizing assistance values obtained at runtime can be used in the parallelism conditions on which the recommendations are based. A consistent parallelization policy can be employed across a range of parallel constructs, and inside recursive procedures.
    Type: Application
    Filed: June 4, 2008
    Publication date: December 24, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Stephen Toub, Igor Ostrovsky, Joe Duffy, Vance Morrison, Huseyin Yildiz
  • Publication number: 20090307671
    Abstract: A system and method for modeling simulation and game artificial intelligence as a data management problem. A scripting language that provides game designers and players with a data-driven artificial intelligence scheme for customizing behavior for individual agents. Query processing and indexing techniques to efficiently execute large numbers of agent scripts, thus providing a framework for games with a large number of agents.
    Type: Application
    Filed: June 8, 2009
    Publication date: December 10, 2009
    Applicant: CORNELL UNIVERSITY
    Inventors: Walker White, Johannes Gehrke, Alan John Demers, Christoph Emanuel Koch
  • Publication number: 20090307655
    Abstract: Systems and methods for parallelizing applications that operate on irregular data structures. In an embodiment, the methods and systems enable programmers to use set iterators to express algorithms containing amorphous data parallelism. Parallelization can be achieved by speculatively executing multiple iterations of the iterator in parallel. Conflicts between speculatively executing iterations can be detected and handled using information in class libraries.
    Type: Application
    Filed: June 10, 2009
    Publication date: December 10, 2009
    Inventors: Keshav Kumar Pingali, Milind Vidyadhar Kulkarni
  • Publication number: 20090300591
    Abstract: Parallel tasks are created, and the tasks include a first task and a second task. Each task resolves a future. At least one of three possible continuations for each of the tasks is supplied. The three continuations include a success continuation, a cancellation continuation, and a failure continuation. A value is returned as the future of the first task upon a success continuation for the first task. The value from the first task is used in the second task to compute a second future. The cancellation continuation is supplied if the task is cancelled and the failure continuation is supplied if the task does not return a value and the task is not cancelled.
    Type: Application
    Filed: June 2, 2008
    Publication date: December 3, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: John Duffy, Stephen H. Toub
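The continuation scheme described in this abstract can be illustrated with a minimal Python sketch. It is not the implementation from the application: it assumes the standard-library `concurrent.futures` module, and the helper name `continue_with` and its three callback parameters are hypothetical, introduced here only for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def continue_with(future, on_success, on_cancel, on_failure):
    """Return a new Future resolved by whichever continuation applies.

    Hypothetical helper: exactly one of the three continuations runs once
    `future` completes, and its return value resolves the returned Future.
    """
    next_future = Future()

    def dispatch(done):
        try:
            if done.cancelled():
                # The task was cancelled before producing a value.
                next_future.set_result(on_cancel())
            elif done.exception() is not None:
                # The task neither returned a value nor was cancelled.
                next_future.set_result(on_failure(done.exception()))
            else:
                # The first task's value feeds the second computation.
                next_future.set_result(on_success(done.result()))
        except Exception as exc:
            # A continuation that raises fails the chained future.
            next_future.set_exception(exc)

    future.add_done_callback(dispatch)
    return next_future


with ThreadPoolExecutor(max_workers=2) as pool:
    first = pool.submit(lambda: 20)
    second = continue_with(
        first,
        on_success=lambda value: value + 22,
        on_cancel=lambda: "cancelled",
        on_failure=lambda exc: "failed",
    )
    success_value = second.result(timeout=5)

    broken = pool.submit(lambda: 1 / 0)
    failure_value = continue_with(
        broken, lambda v: v, lambda: "cancelled", lambda exc: "failed"
    ).result(timeout=5)

# A Future that is cancelled before it ever runs takes the cancellation path.
never_run = Future()
never_run.cancel()
cancel_value = continue_with(
    never_run, lambda v: v, lambda: "cancelled", lambda exc: "failed"
).result(timeout=5)
```

Here the second future is computed from the first task's value only on the success path; the failure and cancellation paths each supply their own result instead.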
  • Patent number: 7627864
    Abstract: A method to optimize speculative parallel thread execution comprises selecting a plurality of partition candidate pairs for speculative parallel thread execution, transforming each partition candidate pair of the plurality of partition candidate pairs to improve the expected performance gain of each pair, and selecting a set of one or more transformed partition candidate pairs that do not interfere with each other and produce a maximum expected performance gain.
    Type: Grant
    Filed: June 27, 2005
    Date of Patent: December 1, 2009
    Assignee: Intel Corporation
    Inventors: Zhao Hui Du, Tin-fook Ngai, Chu-cheow Lim
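The final selection step in this abstract (choosing candidates that do not interfere with each other and that maximize expected gain) can be sketched as a small exhaustive search. This is an illustrative assumption, not the patented method: the abstract does not say how the search is done, and the function name `select_partitions`, the candidate labels, the gain figures, and the conflict pairs are all hypothetical.

```python
from itertools import combinations


def select_partitions(gains, conflicts):
    """Pick the non-interfering subset of candidates with the largest total gain.

    gains:     dict mapping a candidate label to its expected performance gain
    conflicts: set of frozenset({a, b}) pairs of candidates that interfere
    Exhaustive search, so only suitable for a handful of candidates.
    """
    names = sorted(gains)
    best_set, best_gain = (), 0.0
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Skip any subset that contains an interfering pair.
            if any(frozenset(pair) in conflicts for pair in combinations(subset, 2)):
                continue
            total = sum(gains[name] for name in subset)
            if total > best_gain:
                best_set, best_gain = subset, total
    return best_set, best_gain


# Hypothetical expected gains for four already-transformed candidates.
gains = {"A": 5.0, "B": 4.0, "C": 3.0, "D": 2.5}
# Hypothetical interference: B overlaps with both A and C.
conflicts = {frozenset({"A", "B"}), frozenset({"B", "C"})}

chosen, total_gain = select_partitions(gains, conflicts)
```

With these made-up numbers, candidate B is dropped because keeping it would exclude both A and C, which together with D yield a larger total.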
  • Patent number: 7620945
    Abstract: One embodiment of the present invention provides a system that supports parallelized generic reduction operations in a parallel programming language, wherein a reduction operation is an associative operation that can be divided into a group of sub-operations that can execute in parallel. During operation, the system detects generic reduction operations in source code. In doing so, the system identifies a set of reduction variables upon which the generic reduction operation will operate, along with a set of initial values for the variables. The system additionally identifies a merge operation that merges partial results from the parallel generic reduction operations into a final result. The system then compiles the program's source code into a form which facilitates executing the generic reduction operations in parallel.
    Type: Grant
    Filed: August 16, 2005
    Date of Patent: November 17, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Yonghong Song, Yuan Lin, Prashanth Narayanaswamy
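The structure this abstract describes (reduction variables with initial values, parallel sub-operations, and a merge of partial results) can be sketched in Python. This is a minimal illustration under stated assumptions, not the compiler transformation itself: `parallel_reduce` and its parameters are hypothetical names, the initial value is assumed to be an identity for both the operation and the merge, and CPython threads are used to show the shape of the computation rather than to promise a speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def parallel_reduce(values, op, initial, merge, workers=4):
    """Reduce `values` with an associative `op`, chunk by chunk, then merge.

    `initial` must be an identity element for both `op` and `merge`,
    because every chunk and the final merge start from it.
    """
    if not values:
        return initial
    chunk = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each sub-operation reduces one chunk, starting from the initial value.
        partials = list(pool.map(lambda part: reduce(op, part, initial), chunks))
    # The merge operation combines the partial results into the final result.
    return reduce(merge, partials, initial)


total = parallel_reduce(list(range(1, 101)), operator.add, 0, operator.add)
largest = parallel_reduce([3, 9, 2, 7], max, float("-inf"), max)
```

Associativity is what makes the chunking safe: the grouping of operands changes, but their order does not, so non-commutative operations such as string concatenation also work as long as the chunks are merged in order.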
  • Patent number: 7617494
    Abstract: The program to be executed is compiled by translating it into native instructions of the instruction-set architecture of the processor system, organizing the instructions deriving from the translation of the program into respective bundles in an order of successive bundles, each bundle grouping together instructions adapted to be executed in parallel by the processor system. The bundles of instructions are ordered into respective sub-bundles, said sub-bundles identifying a first set of instructions, which must be executed before the instructions belonging to the next bundle of said order, and a second set of instructions, which can be executed both before and in parallel with respect to the instructions belonging to said subsequent bundle of said order.
    Type: Grant
    Filed: July 1, 2003
    Date of Patent: November 10, 2009
    Assignee: STMicroelectronics S.r.l.
    Inventors: Fabrizio Simone Rovati, Antonio Maria Borneo, Danilo Pietro Pau
  • Patent number: 7617488
    Abstract: A method and an apparatus for determining processor utilization have been disclosed. In one embodiment, the method includes determining processor utilization in a data processing system and synchronizing execution of a number of threads in the data processing system to prevent interrupting the determining of the processor utilization. Other embodiments have been claimed and described.
    Type: Grant
    Filed: December 30, 2003
    Date of Patent: November 10, 2009
    Assignee: Intel Corporation
    Inventors: Vasudevan Srinivasan, Avinash P. Chakravarthy
  • Publication number: 20090271774
    Abstract: A Veil program analyzes the source code and/or data of an existing sequential target program and determines how best to distribute the target program and data among the processing elements of a multi-processing element computing system. The Veil program analyzes source code loops, data sizes and types to prepare a set of distribution attempts, whereby each distribution is run under a run-time evaluation wrapper and evaluated to determine the optimal distribution across the available processing elements.
    Type: Application
    Filed: July 3, 2009
    Publication date: October 29, 2009
    Applicant: MANAGEMENT SERVICES GROUP, INC. d/b/a Global Technical Systems
    Inventors: Robert Stephen Gordy, Terry Spitzer
  • Patent number: 7606698
    Abstract: A method and apparatus for sharing data between processors within first and second discrete clusters of processors. The method comprises supplying a first amount of data from a first data array in a first discrete cluster of processors to selector logic. A second amount of data from a second data array in a second discrete cluster of processors is also supplied to the selector logic. The first or second amount of data is then selected using the selector logic, and supplied to a shared input port on a processor in the first discrete cluster of processors. The apparatus comprises selector logic for selecting between input data supplied by a first data array and a second data array. The data arrays are located within different discrete clusters of processors. The selected data is then supplied to a shared input port on a processor.
    Type: Grant
    Filed: September 26, 2006
    Date of Patent: October 20, 2009
    Assignee: Cadence Design Systems, Inc.
    Inventors: Beshara G. Elmufdi, Mitchell G. Poplack
  • Patent number: 7590976
    Abstract: The present invention relates to a compiler program, a computer-readable storage medium storing such a compiler program, a compiling method and a compiling unit, and an object thereof is to automatically generate a reentrant object program. In order to accomplish this object, an address saving program generator 16a generates an address saving program for saving a data area address of a calling program module; an address setting program generator 16b generates an address setting program for setting a data area address of an other program module; a transferring program generator 16c generates a transferring program for the transfer from the calling program module to the other program module; an address resetting program generator 16d generates an address resetting program for reading and resetting the saved data area address; and an accessing program generator 16e generates an accessing program for accessing a data area for the other program module using a relative address from the set data area address.
    Type: Grant
    Filed: December 26, 2003
    Date of Patent: September 15, 2009
    Assignee: Panasonic Corporation
    Inventors: Masaki Kawai, Takuji Kawamoto, Shusuke Haruna, Yutaka Fujihara