Including Scheduling Instructions Patents (Class 717/161)
  • Patent number: 9513976
    Abstract: A computer-implemented method and a corresponding computer system for emulation of Extended Memory Semantics (EMS) operations. The method and system include obtaining a set of computer instructions that include an EMS operation, converting the EMS operation into a corresponding atomic memory operation (AMO), and executing the AMO on at least one processor of a computer.
    Type: Grant
    Filed: December 30, 2011
    Date of Patent: December 6, 2016
    Assignee: Intel Corporation
    Inventor: Roger Golliver
  • Patent number: 9495225
    Abstract: A thread priority control mechanism is provided which uses the completion event of the preceding transaction to raise the priority of the next transaction in the order of execution when the transaction status has been changed from speculative to non-speculative. In one aspect of the present invention, a thread-level speculation mechanism is provided which has content-addressable memory, an address register and a comparator for recording transaction footprints, and a control logic circuit for supporting memory synchronization instructions. This supports hardware transaction memory in detecting transaction conflicts. This thread-level speculation mechanism includes a priority up bit for recording an attribute operand in a memory synchronization instruction, a means for generating a priority up event when a thread wake-up event has occurred and the priority up bit is 1, and a means for preventing the CAM from storing the load/store address when the instruction is a non-transaction instruction.
    Type: Grant
    Filed: October 24, 2013
    Date of Patent: November 15, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christian Jacobi, Marcel Mitran, Moriyoshi Ohara
  • Patent number: 9436582
    Abstract: Embodiments include dividing source code for an application into multiple program fragments by generating a control flow graph for the multiple program fragments. The control flow graph represents a graph structure with nodes representing the multiple program fragments and edges representing an execution order of the program fragments. Aspects include searching for a chosen assertion statement within a program fragment, wherein the chosen assertion statement must be satisfied for correct execution of the chosen program fragment. Aspects also include identifying an immediate parent program fragment for the chosen program fragment using the control flow graph and calculating an immediate parent assertion statement for the immediate parent program fragment using the chosen assertion logic statement. The immediate parent assertion statement is an over-approximate pre-condition of the chosen program fragment.
    Type: Grant
    Filed: November 18, 2015
    Date of Patent: September 6, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Viresh Paruthi, Mitra Purandare
  • Patent number: 9405596
    Abstract: Code versioning for enabling transactional memory region promotion may include receiving a portion of candidate source code; outlining the portion of candidate source code received for parallel execution; wrapping a critical region with entry and exit routines to enter into a speculation sub-process, wherein the entry and exit routines also gather conflict statistics at run time; and generating an outlined code portion comprising multiple loop versions using a processor.
    Type: Grant
    Filed: October 2, 2014
    Date of Patent: August 2, 2016
    Assignee: GlobalFoundries, Inc.
    Inventors: Hans Boettiger, Yaoqing Gao, Martin Ohmacht, Kai-Ting Amy Wang
  • Patent number: 9367658
    Abstract: Embodiments of the invention provide a method and apparatus for generating programmable logic for a hardware accelerator, the method comprising: generating a graph of nodes representing the programmable logic to be implemented in hardware; identifying nodes within the graph that affect external flow control of the programmable logic; retaining the identified nodes and removing or replacing all nodes which do not affect external flow control of the programmable logic in a modified graph; and simulating the modified graph or building a corresponding circuit of the retained nodes.
    Type: Grant
    Filed: June 22, 2011
    Date of Patent: June 14, 2016
    Assignee: Maxeler Technologies Ltd.
    Inventors: Oliver Pell, James Huggett
  • Patent number: 9355175
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing search results. In one aspect, a method includes receiving a query. A plurality of search results responsive to the query are identified. The search results are analyzed to determine that at least a first search result is associated with a first answer box topic. The search results are provided along with an answer box precursor for the first answer box topic.
    Type: Grant
    Filed: May 27, 2011
    Date of Patent: May 31, 2016
    Assignee: Google Inc.
    Inventors: Tal Cohen, Ziv Bar-Yossef, Igor Tsvetkov, Adi Mano, Oren Naim, Nitsan Oz, Nir Andelman, Pravir K. Gupta
  • Patent number: 9348600
    Abstract: A method and a system are provided for prioritizing the fetching of instructions for each of a plurality of executing instruction threads in a multi-threaded processor. Instructions come from at least one source of instructions. Each thread has a number of threads buffered for execution in an instruction buffer. A first metric for each thread is determined based on the number of instructions currently buffered. A second metric is then determined for each thread, this being an execution based metric. A priority order for the threads is determined from the first and second metrics, and an instruction is fetched from the source for the thread with the highest determined priority which is requesting an instruction.
    Type: Grant
    Filed: February 9, 2009
    Date of Patent: May 24, 2016
    Assignee: Imagination Technologies Limited
    Inventor: Andrew Webber
  • Patent number: 9317265
    Abstract: Disclosed here are methods, systems, paradigms and structures for optimizing intermediate representation (IR) of a script code for atomic execution. Atomic execution of the script is achieved by generating portions of the IR as an atomic transaction. In an atomic transaction, a series of operations either all execute, or none executes. The IR includes checkpoints that evaluate to one of two possible values. The checkpoint evaluates to a first value when there is no error during execution, and evaluates to a second value when an error occurs. The IR is optimized for atomic execution by regenerating a portion of the IR including the checkpoint and code associated with the checkpoint as a transaction. When an error occurs during the execution of the transaction, the transaction is aborted and a state of execution of the script code is reverted to a state prior to the beginning of the transaction.
    Type: Grant
    Filed: March 25, 2013
    Date of Patent: April 19, 2016
    Assignee: Facebook, Inc.
    Inventors: Ali-Reza Adl-Tabatabai, Guilherme de Lima Ottoni, Michael Paleczny
  • Patent number: 9292287
    Abstract: Provided is a loop scheduling method including scheduling a first loop using execution units, and scheduling a second loop using execution units available as a result of the scheduling of the first loop. An n-th loop (n>2) may be scheduled using a result of scheduling an (n?1)-th loop, similar to the (n?1)-th loop. The first loop may be a higher priority loop than the second loop.
    Type: Grant
    Filed: July 14, 2014
    Date of Patent: March 22, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yeon Bok Lee, Young Hwan Park, Ho Yang, Keshava Prasad
  • Patent number: 9239712
    Abstract: Apparatuses and methods may provide for determining a level of performance for processing one or more loops by a dynamic compiler and executing code optimizations to generate a pipelined schedule for the one or more loops that achieves the determined level of performance within a prescribed time period. In one example, a dependence graph may be established for the one or more loops, and each dependence graph may be partitioned into stages based on the level of performance.
    Type: Grant
    Filed: March 29, 2013
    Date of Patent: January 19, 2016
    Assignee: Intel Corporation
    Inventors: Hongbo Rong, Hyunchul Park, Youfeng Wu
  • Patent number: 9158658
    Abstract: A method, system, and computer program product for detecting merge conflicts and compilation errors in a collaborative integrated development environment are provided in the illustrative embodiments. Prior to at least one user committing a set of uncommitted changes associated with a source code to a repository, the computer receives the set of uncommitted changes associated with the source code. The computer creates at least one temporary branch corresponding to the set of uncommitted changes associated with the source code. The computer device merges the at least one temporary branch to corresponding portions of the source code. The computer determines whether a merge conflict has occurred. If the merge conflict occurred, the computer communicates a first notification to the at least one user, the first notification indicating the merge conflict.
    Type: Grant
    Filed: October 15, 2013
    Date of Patent: October 13, 2015
    Assignee: International Business Machines Corporation
    Inventors: George T. Bigwood, Jason T. McMann, Michael G. Nikitaides, Kaleb D. Walton
  • Patent number: 9152422
    Abstract: An apparatus and method for compressing trace data is provided. The apparatus includes a detection unit configured to detect trace data corresponding to one or more function units performing a substantially significant operation in a reconfigurable processor as valid trace data, and a compression unit configured to compress the valid trace data.
    Type: Grant
    Filed: July 7, 2011
    Date of Patent: October 6, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jae-Young Kim, Dong-Hoon Yoo, Yeon-Gon Cho, Hee-Jun Shim, Chang-Moo Kim
  • Patent number: 9063735
    Abstract: Provided are a reconfigurable processor, which is capable of reducing the probability of an incorrect computation by analyzing the dependence between memory access instructions and allocating the memory access instructions between a plurality of processing elements (PEs) based on the results of the analysis, and a method of controlling the reconfigurable processor. The reconfigurable processor extracts an execution trace from simulation results, and analyzes the memory dependence between instructions included in different iterations based on parts of the execution trace of memory access instructions.
    Type: Grant
    Filed: October 13, 2011
    Date of Patent: June 23, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hee-Jin Ahn, Dong-Hoon Yoo, Bernhard Egger, Min-Wook Ahn, Jin-Seok Lee, Tai-Song Jin, Won-Sub Kim
  • Patent number: 9043582
    Abstract: Systems and methods for static code scheduling are disclosed. A method can include receiving an intermediate representation of source code, building a directed acyclic graph (DAG) for the intermediate representation, and creating chains of dependent instructions from the DAG for cluster formation. The chains are merged into clusters and each node in the DAG is marked with an identifier of a cluster it is part of to generate a marked instruction DAG. Instruction DAG scheduling is then performed using information about the clusters to generate an ordered intermediate representation of the source code.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: May 26, 2015
    Assignee: Qualcomm Innovation Center, Inc.
    Inventor: Sergei Larin
  • Publication number: 20150100950
    Abstract: A method for scheduling loop processing of a reconfigurable processor includes generating a dependence graph of instructions for the loop processing; mapping a first register file of the reconfigurable processor on an arrow indicating inter-iteration dependence on the dependence graph; and searching for schedules of the instructions based on the mapping result.
    Type: Application
    Filed: October 6, 2014
    Publication date: April 9, 2015
    Inventors: Min-wook AHN, Won-sub Kim, Tai-song Jin, Seung-won Lee, Jin-seok Lee
  • Patent number: 8990547
    Abstract: Systems, methodologies, computer-readable media, and other embodiments associated with ordering instructions are described. One exemplary system embodiment can include an analysis logic configured to analyze executable instructions from an executable program. A re-write logic can be configured to re-order selected load instructions within the executable program based on latency times for the selected load instructions.
    Type: Grant
    Filed: August 23, 2005
    Date of Patent: March 24, 2015
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: James R. Callister, Richard E. Hank, Teresa L. Johnson
  • Patent number: 8984525
    Abstract: A method for job management in an HPC environment includes determining an unallocated subset from a plurality of HPC nodes, with each of the unallocated HPC nodes comprising an integrated fabric. An HPC job is selected from a job queue and executed using at least a portion of the unallocated subset of nodes.
    Type: Grant
    Filed: October 11, 2013
    Date of Patent: March 17, 2015
    Assignee: Raytheon Company
    Inventors: Shannon V. Davidson, Anthony N. Richoux
  • Patent number: 8972961
    Abstract: A processor instruction scheduler comprising an optimization engine which uses an optimization model for a processor architecture with: means to generate an optimization model for the optimization engine from a design of a processor and data representing optimization goals and constraints and a code stream, wherein the processor has at least two execution pipes and at least two registers, and wherein the design comprises data for processor instruction latency and execution pipes, and wherein the code stream comprises processor instructions with corresponding register selections; and reordering means to generate an optimized code stream from the code stream with the optimal solution provided by the optimization engine for the optimization model by reordering the code stream, such that optimum values for the optimization goals under the given constraints are achieved without affecting the operation results of the code stream.
    Type: Grant
    Filed: May 11, 2011
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Juergen Koehl, Jens Leenstra, Philipp Panitz, Hans Schlenker
  • Patent number: 8959497
    Abstract: One embodiment of the present invention sets forth a technique for partitioning a predecessor thread program into sub-programs and dynamically spawning a thread grid of the sub-programs based on the outcome of a conditional statement in the predecessor thread program. The programming instructions for the predecessor thread program are analyzed to assess the benefit of partitioning the thread program at a conditional statement into sub-programs. If the predecessor thread program is partitioned, then each branch of the conditional statement may be used to form a separate sub-program. Predicate tables are populated at the predecessor thread program run-time to establish which possible instances of the thread sub-programs should be spawned in subsequent execution phases.
    Type: Grant
    Filed: August 29, 2008
    Date of Patent: February 17, 2015
    Assignee: NVIDIA Corporation
    Inventors: John A. Stratton, David Luebke
  • Patent number: 8954946
    Abstract: A static branch prediction method and code execution method for a pipeline processor, and a code compiling method for static branch prediction, are provided herein. The static branch prediction method includes predicting a conditional branch code as taken or not-taken, adding the prediction information, converting the conditional branch code into a jump target address setting (JTS) code including target address information, branch time information, and a test code, and scheduling codes in a block. The code may be scheduled into a last slot of the block, and the JTS code may be scheduled into an empty slot after all the other codes in the block are scheduled. When the conditional branch code is predicted as taken in the prediction operation, a target address indicated by the target address information may be fetched at a cycle time indicated by the branch time information.
    Type: Grant
    Filed: January 25, 2010
    Date of Patent: February 10, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Tai-song Jin, Dong-kwan Suh, Suk-jin Kim
  • Patent number: 8949809
    Abstract: A system and associated method for automatically pipeline parallelizing a nested loop in sequential code over a predefined number of threads. Pursuant to task dependencies of the nested loop, each subloop of the nested loop are allocated to a respective thread. Combinations of stage partitions executing the nested loop are configured for parallel execution of a subloop where permitted. For each combination of stage partitions, a respective bottleneck is calculated and a combination with a minimum bottleneck is selected for parallelization.
    Type: Grant
    Filed: March 1, 2012
    Date of Patent: February 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Pradeep Varma, Manish Gupta, Monika Gupta, Naga Praveen Kumar Katta
  • Patent number: 8949820
    Abstract: A technique for streaming from a media device involves enabling a local device to function as a streaming server. An example of a method according to the technique includes inserting a removable storage device that includes programs associated with a streaming application, running one or more of the programs, ensuring that a streaming software player is installed, and executing a streaming-related activity associated with the streaming application. An example of a system according to the technique includes a means for providing a streaming application that expects content to be found on a media drive, a means for intercepting requests for content expected to be found on the media drive, and a means for honoring the requests with content from a different media location.
    Type: Grant
    Filed: November 26, 2012
    Date of Patent: February 3, 2015
    Assignee: Numecent Holdings, Inc.
    Inventors: Jeffrey de Vries, Greg Zavertnik, Ann Hubbell
  • Patent number: 8943487
    Abstract: Particular embodiments optimize a C++ function comprising one or more loops for symbolic execution, comprising for each loop, if there is a branching condition within the loop, then rewrite the loop to move the branching condition outside the loop. Particular embodiments may further optimize the C++ function through simplified symbolic expressions and adding constructs forcing delayed interpretation of symbolic expressions during the symbolic execution.
    Type: Grant
    Filed: January 20, 2011
    Date of Patent: January 27, 2015
    Assignee: Fujitsu Limited
    Inventors: Guodong Li, Sreeranga P. Rajan, Indradeep Ghosh
  • Patent number: 8935686
    Abstract: A condition detected by a virtual routine may be treated by setting an error code or raising an exception, depending on circumstances. Enhanced vtable layouts promote availability of both error-ID-based and exception-based virtual routines, while maintaining compatibility. Compilers treat virtual routines based on their circumstances. One enhanced vtable includes error-ID-based routine pointers in a COM-layout-compatible portion and exception-based routine pointers in an extension. For a virtual routine not overridden by a derived class, a compiler generates a direct call. For an object instance of a specific type, the compiler generates a direct exception-based call for the object's routine. For a factory-sourced object's routine, the compiler generates a virtual exception-based call. When the virtual routine belongs to a component having an enhanced vtable, the compiler may generate a virtual call using the exception-based routine pointer. Code wrappers between COM and native format may also be used.
    Type: Grant
    Filed: August 28, 2012
    Date of Patent: January 13, 2015
    Assignee: Microsoft Corporation
    Inventors: Deon Brewis, James Springfield, Sridhar S. Madhugiri
  • Patent number: 8935685
    Abstract: A processor instruction scheduler comprising an optimization engine which uses an optimization model for a processor architecture with: means to generate an optimization model for the optimization engine from a design of a processor and data representing optimization goals and constraints and a code stream, wherein the processor has at least two execution pipes and at least two registers, and wherein the design comprises data for processor instruction latency and execution pipes, and wherein the code stream comprises processor instructions with corresponding register selections; and reordering means to generate an optimized code stream from the code stream with the optimal solution provided by the optimization engine for the optimization model by reordering the code stream, such that optimum values for the optimization goals under the given constraints are achieved without affecting the operation results of the code stream.
    Type: Grant
    Filed: April 28, 2012
    Date of Patent: January 13, 2015
    Assignee: International Business Machines Corporation
    Inventors: Juergen Koehl, Jens Leenstra, Philipp Panitz, Hans Schlenker
  • Patent number: 8930926
    Abstract: Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations and contiguity of memory accesses on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
    Type: Grant
    Filed: April 16, 2010
    Date of Patent: January 6, 2015
    Assignee: Reservoir Labs, Inc.
    Inventors: Cedric Bastoul, Richard A. Lethin, Allen K. Leung, Benoit J. Meister, Peter Szilagyi, Nicolas T. Vasilache, David E. Wohlford
  • Patent number: 8918770
    Abstract: A system and method for compiling includes, for a parallelizable code portion of an application stored on a computer readable storage medium, determining one or more variables that are to be transferred to and/or from a coprocessor if the parallelizable code portion were to be offloaded. A start location and an end location are determined for at least one of the one or more variables as a size in memory. The parallelizable code portion is transformed by inserting an offload construct around the parallelizable code portion and passing the one or more variables and the size as arguments of the offload construct such that the parallelizable code portion is offloaded to a coprocessor at runtime.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: December 23, 2014
    Assignee: NEC Laboratories America, Inc.
    Inventors: Nishkam Ravi, Tao Bao, Ozcan Ozturk, Srimat Chakradhar
  • Patent number: 8898115
    Abstract: A computer-implemented method for using data archiving to expedite server migration may include: 1) archiving data from at least one source computing system to an archiving system in accordance with an archiving policy, 2) altering metadata associated with the archived data on the archiving system so that the metadata references a desired target computing system instead of the source computing system, and then, upon bringing the target computing system online, 3) restoring at least a portion of the archived data from the archiving system to the target computing system. Various other methods, systems, and configured computer-readable media are also disclosed.
    Type: Grant
    Filed: February 12, 2013
    Date of Patent: November 25, 2014
    Assignee: Symantec Corporation
    Inventors: Laxmikant Gunda, Praveen Rakshe
  • Patent number: 8898648
    Abstract: A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, Hiroki Murata, Yasushi Negishi, Hui-Fang Wen
  • Patent number: 8893095
    Abstract: There are provided methods and computer program products for generating code for an architecture encoding an extended register specification. A method for generating code for a fixed-width instruction set includes identifying a non-contiguous register specifier. The method further includes generating a fixed-width instruction word that includes the non-contiguous register specifier.
    Type: Grant
    Filed: July 26, 2012
    Date of Patent: November 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Robert Kevin Montoye, Brett Olsson, John-David Wellman
  • Patent number: 8893104
    Abstract: The aspects enable a computing device to allocate memory space to variables during runtime compilation of a software application. A compiler may be modified to identify operations that can be performed on either a main pipe or an alternative pipe, identify chains of related operations that can be performed on either the main pipe or the alternative pipe, identify points in the execution of code at which the number of live values will exceed the number of registers, and choosing a chain of operations as a candidate to be moved to the alternative pipe in order to reduce the number of live values at identified points in the execution of code. The entire chosen chain of operations may be moved to the alternative pipe. The alternative pipe may perform the computations and return the results to the main pipe for execution.
    Type: Grant
    Filed: March 1, 2012
    Date of Patent: November 18, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Christopher A. Vick, Gregory M. Wright
  • Patent number: 8869129
    Abstract: An apparatus and method for scheduling an instruction are provided. The apparatus includes an analyzer configured to analyze dependency of a plurality of recurrence loops and a scheduler configured to schedule the recurrence loops based the analyzed dependencies. When scheduling a plurality of recurrence loops, the apparatus first schedules a dominant loop whose loop head has no dependency on another loop among the recurrence loops.
    Type: Grant
    Filed: November 2, 2009
    Date of Patent: October 21, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Tae-wook Oh, Won-sub Kim, Bernhard Egger
  • Patent number: 8856313
    Abstract: A system and method for determining data usage based on provenance information, in a stream-processing system, includes progressively setting usage information for output stream data objects (SDOs), determining input SDOs that an output SDO depends on, based on a provenance dependency function; recursively feeding back the usage information for a subset of SDOs that can be discarded; and discarding the subset of SDOs. A system and method for data retention based on usage information, in a stream-processing system, includes managing retention of SDOs by deleting SDOs that are determined to be of null usage; and enhancing retention characteristics of SDOs that are deemed to have usage.
    Type: Grant
    Filed: November 13, 2007
    Date of Patent: October 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Lisa Amini, Chitra Venkatramani
  • Patent number: 8850410
    Abstract: A system and method for improving software maintainability, performance, and/or security by associating a unique marker to each software code-block; the system comprising of a plurality of processors, a plurality of code-blocks, and a marker associated with each code-block. The system may also include a special hardware register (code-block marker hardware register) in each processor for identifying the markers of the code-blocks executed by the processor, without changing any of the plurality of code-blocks.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: September 30, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ramanjaneya S. Burugula, Joefon Jann, Pratap C. Pattnaik
  • Patent number: 8832669
    Abstract: Generating decode time instruction optimization (DTIO) object code that enables a DTIO enabled processor to optimize execution of DTIO instructions. A code sequence configured to facilitate DTIO in a DTIO enabled processor is identified by a computer. The code sequence includes an internal representation (IR) of a first instruction and an IR of a second instruction. The second instruction is dependent on the first instruction. A schedule associated with at least one of the IR of the first instruction and the IR of the second instruction is modified. The modifying includes generating a modified schedule that is configured to place the first instruction next to the second instruction. An object file is generated based on the modified schedule. The object file includes the first instruction placed next to the second instruction. The object file is emitted.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: September 9, 2014
    Assignee: International Business Machines Corporation
    Inventors: Robert J. Blainey, Michael K. Gschwind, James L. McInnes, Steven J. Munroe
  • Patent number: 8831791
    Abstract: Illustrative embodiments provide a computer implemented method, a data processing system, and a computer program product for adjusting cooling settings. The computer implemented method comprises analyzing a set of instructions of an application to determine a number of degrees by which a set of instructions will raise a temperature of at least one processor core. The computer implemented method further calculates a cooling setting for at least one cooling system for the at least one processor core. The computer implemented method adjusts the at least one cooling system based on the cooling setting. The step of analyzing the set of instructions is performed before the set of instructions is executed on the at least one processor core. The step of adjusting the at least one cooling system is performed before the set of instructions is executed on the at least one processor core.
    Type: Grant
    Filed: June 21, 2012
    Date of Patent: September 9, 2014
    Assignee: International Business Machines Corporation
    Inventors: Robert L. Angell, David W. Cosby, Robert R. Friedlander, James R. Kraemer
  • Patent number: 8826257
    Abstract: A method of memory disambiguation hardware to support software binary translation is provided. This method includes unrolling a set of instructions to be executed within a processor, the set of instructions having a number of memory operations. An original relative order of memory operations is determined. Then, possible reordering problems are detected and identified in software. The reordering problem being when a first memory operation has been reordered prior to and aliases to a second memory operation with respect to the original order of memory operations. The reordering problem is addressed and a relative order of memory operations to the processor is communicated.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: September 2, 2014
    Assignee: Intel Corporation
    Inventors: Muawya M. Al-Otoom, Paul Caprioli, Abhay S. Kanhere, Arvind Krishnaswamy, Omar M. Shaikh
  • Patent number: 8819618
    Abstract: A device receives a model that includes model elements scheduled to execute in time slots on a hardware device. The device identifies time slots, of the time slots, that are unoccupied or underutilized by the model elements, and identifies a set of model elements that can be moved to the unoccupied time slots without affecting a behavior of the model. The device calculates a combined execution time of the model elements, determines whether the combined execution time of the model elements is less than or equal to a duration of a first time slot of the time slots, and schedules the model elements for execution in the first time slot when the combined execution time of the model elements is less than or equal to the duration of the first time slot.
    Type: Grant
    Filed: September 26, 2012
    Date of Patent: August 26, 2014
    Assignee: The MathWorks, Inc.
    Inventors: David MacLay, Matej Urbas
  • Patent number: 8813044
    Abstract: A method, system, and article of manufacture are disclosed for transforming a definition of a process for delivering a service. This service process definition is comprised of computer readable code. The method comprises the steps of expressing a given set of assumptions in a computer readable code; and transforming said process definition by using a processing unit to apply said assumptions to said process definition to change the configuration of the process definition. The process definition may be transformed by using factors relating to the specific context in or for which the process definition is executed. Also, the process definition may be transformed by identifying, in a flow diagram for the service process definition, flows to which the assumptions apply, and applying program rewriting techniques to those identified flows.
    Type: Grant
    Filed: September 6, 2012
    Date of Patent: August 19, 2014
    Assignee: International Business Machines Corporation
    Inventors: David F. Bantz, Steven J. Mastrianni, James R. Moulic, Dennis G. Shea
  • Patent number: 8806466
    Abstract: A program generation apparatus references a source program including a loop for executing a block N times (N?2) and having such dependence that a variable defined in a statement in the block pertaining to ith execution (1?i<N) is referenced by a statement in the block pertaining to jth execution (i<j?N), calculates equivalent representations of variables in the block pertaining to the ith execution and the block pertaining to any other execution than the ith execution, specifies, with respect to each representation of a target variable causing the dependence, a representation of a variable not causing the dependence that is equivalent to the representation of the target variable, and generates a program being for executing the block M times (M?N) and including a statement including the specified representation in place of each representation of the target variable.
    Type: Grant
    Filed: July 4, 2011
    Date of Patent: August 12, 2014
    Assignee: Panasonic Corporation
    Inventors: Akira Tanaka, Hiroyuki Morishita, Akihiko Inoue
  • Patent number: 8769507
    Abstract: A method, system, and article of manufacture are disclosed for transforming a definition of a process for delivering a service on a specified computing device. This service process definition is comprised of computer readable code. The method comprises the steps of expressing a given set of assumptions in a computer readable code; and transforming the definition by using a processing unit to apply the assumptions to the definition of the process to change the way in which the process operates. The definition of the process may be transformed by using factors relating to the specific context in or for which the definition is executed. Also, the definition may be transformed by identifying, in a flow diagram for the process, flows to which the assumptions apply, and applying program rewriting techniques to those identified flows.
    Type: Grant
    Filed: May 14, 2009
    Date of Patent: July 1, 2014
    Assignee: International Business Machines Corporation
    Inventors: David F. Bantz, Steven J. Mastrianni, James R. Moulic, Dennis G. Shea
  • Patent number: 8752036
    Abstract: Embodiments of the invention provide systems and methods for throughput-aware software pipelining in compilers to produce optimal code for single-thread and multi-thread execution on multi-threaded systems. A loop is identified within source code as a candidate for software pipelining. An attempt is made to generate pipelined code (e.g., generate an instruction schedule and a set of register assignments) for the loop in satisfaction of throughput-aware pipelining criteria, like maximum register count, minimum trip count, target core pipeline resource utilization, maximum code size, etc. If the attempt fails to generate code in satisfaction of the criteria, embodiments adjust one or more settings (e.g., by reducing scalarity or latency settings being used to generate the instruction schedule).
    Type: Grant
    Filed: October 31, 2011
    Date of Patent: June 10, 2014
    Assignee: Oracle International Corporation
    Inventors: Spiros Kalogeropulos, Partha Tirumalai
  • Patent number: 8745608
    Abstract: A scheduler of a reconfigurable array, a method of scheduling commands, and a computing apparatus are provided. To perform a loop operation in a reconfigurable array, a recurrence node, a producer node, and a predecessor node are detected from a data flow graph of the loop operation such that resources are assigned to such nodes so as to increase the loop operating speed. Also, a dedicated path having a fixed delay may be added to the assigned resources.
    Type: Grant
    Filed: February 1, 2010
    Date of Patent: June 3, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Won-sub Kim, Tae-wook Oh, Bernhard Egger
  • Patent number: 8738348
    Abstract: A method and mechanism for implementing a general purpose scripting language that supports parallel execution is described. In one approach, parallel execution is provided in a seamless and high-level approach rather than requiring or expecting a user to have low-level programming expertise with parallel processing languages/functions. Also described is a system and method for performing circuit simulation. The present approach provides methods and systems that create reusable and independent measurements for use with circuit simulators. Also disclosed are parallelizable measurements having looping constructs that can be run without interference between parallel iterations. Reusability is enhanced by having parameterized measurements. Revisions and history of the operating parameters of circuit designs subject to simulation are tracked.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: May 27, 2014
    Assignee: Cadence Design Systems, Inc.
    Inventor: Kenneth S. Kundert
  • Patent number: 8738570
    Abstract: A file cloning mechanism allows for quickly creating copies (clones) of files within a filesystem, such as when a user makes a copy of a file. In exemplary embodiments, a clone of a source object is at least initially represented by a structure containing references to various elements of the source object (e.g., indirect onodes, direct onodes, and data blocks). Both read-only and mutable clones can be created. The source file and the clone initially share such elements and continue to share unmodified elements as changes are made to the source file or mutable clone. None of the user data blocks or the metadata blocks describing the data stream (i.e., the indirect/direct onodes) associated with the source file need to be copied at the time the clone is created. At appropriate times, cloned files may be “de-cloned.
    Type: Grant
    Filed: November 21, 2011
    Date of Patent: May 27, 2014
    Assignee: Hitachi Data Systems Engineering UK Limited
    Inventors: Daniel J. N. Picken, Neil Berrington
  • Patent number: 8726251
    Abstract: Embodiments of the invention provide systems and methods for automatically parallelizing loops with non-speculative pipelined execution of chunks of iterations with pre-computation of selected values. Non-DOALL loops are identified and divided the loops into chunks. The chunks are assigned to separate logical threads, which may be further assigned to hardware threads. As a thread performs its runtime computations, subsequent threads attempt to pre-compute their respective chunks of the loop. These pre-computations may result in a set of assumed initial values and pre-computed final variable values associated with each chunk. As subsequent pre-computed chunks are reached at runtime, those assumed initial values can be verified to determine whether to proceed with runtime computation of the chunk or to avoid runtime execution and instead use the pre-computed final variable values.
    Type: Grant
    Filed: March 29, 2011
    Date of Patent: May 13, 2014
    Assignee: Oracle International Corporation
    Inventors: Spiros Kalogeropulos, Partha Pal Tirumalai
  • Patent number: 8701099
    Abstract: A method, a system and a computer program product for effectively accelerating loop iterators using speculative execution of iterators. An Efficient Loop Iterator (ELI) utility detects initiation of a target program and initiates/spawns a speculative iterator thread at the start of the basic code block ahead of the code block that initiates a nested loop. The ELI utility assigns the iterator thread to a dedicated processor in a multi-processor system. The speculative thread runs/executes ahead of the execution of the nested loop and calculates indices in a corresponding multidimensional array. The iterator thread adds all the precomputed indices to a single queue. As a result, the ELI utility effectively enables a multidimensional loop to be replaced by a single dimensional loop. At the beginning of (or during) each iteration of the iterator, the ELI utility “dequeues” an entry from the queue to use the entry to access the array upon which the ELI utility iterates.
    Type: Grant
    Filed: November 2, 2010
    Date of Patent: April 15, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ganesh Bikshandi, Dibyendu Das, Smruti Ranjan Sarangi
  • Patent number: 8689202
    Abstract: A method of automatically extracting information from an architecture description. A memory resident directed acyclic graph data structure comprising nodes representing instructions and edges whose weights represent dependencies between pairs of instructions is constructed. A list of ready nodes are maintained in the directed acyclic graph. A list of nodes not scheduled is maintained. And, it is determined whether the next instruction to be scheduled is to be taken from the list of ready nodes or from the list of nodes not yet scheduled.
    Type: Grant
    Filed: March 30, 2005
    Date of Patent: April 1, 2014
    Assignee: Synopsys, Inc.
    Inventors: Gunnar Braun, Andreas Hoffmann, Volker Grieve, Manuel Hohenauer, Rainer Leupers
  • Patent number: 8677338
    Abstract: Methods and apparatus to data dependence testing for loop fusion, e.g., with code replication, array contraction, and/or loop interchange, are described. In one embodiment, a compiler may optimize code for efficient execution during run-time by testing for dependencies associated with improving memory locality through code replication in loops that enable various loop transformations. Other embodiments are also described.
    Type: Grant
    Filed: June 4, 2008
    Date of Patent: March 18, 2014
    Assignee: Intel Corporation
    Inventors: John L. Ng, Rakesh Krishnaiyer, Alexander Y. Ostanevich
  • Patent number: RE45199
    Abstract: A compiler apparatus, which can perform software pipelining optimization that has a considerable effect of reducing the number of execution cycles taken to complete a loop process, converts a source program into a machine program for a processor which is capable of parallel processing. The compiler apparatus is composed of: a parsing unit operable to parse the source program and then to convert the source program into an intermediate program which is described in an intermediate language; an optimization unit operable to optimize the intermediate program; and a conversion unit operable to convert the optimized intermediate program into the machine language program, wherein the optimization unit is operable to execute software pipelining, by inserting a transfer instruction, which is used for transferring data between operands, into a loop process included in the intermediate program so that a data dependence relation is changed.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: October 14, 2014
    Assignee: Panasonic Corporation
    Inventors: Shohei Michimoto, Taketo Heishi, Hajime Ogawa, Teruo Kawabata