Abstract: Techniques for detecting recurring non-occurrences of an event. In one embodiment, techniques are provided for detecting the non-occurrence of an event within each of a series of time periods following the occurrence of another event. Language extensions are provided that enable queries to be formulated for detecting recurring non-occurrence of an event following occurrence of a triggering event.
Abstract: Data-parallel computation programs may be improved by, for example, determining the functional properties of user-defined functions (UDFs), eliminating unnecessary data-shuffling stages, and/or changing data-partition properties to cause desired data properties to appear after one or more user-defined functions are applied.
Abstract: A system and method are provided for synchronizing threads in a divergent region of code within a multi-threaded parallel processing system. The method includes, prior to any thread entering a divergent region, generating a count that represents a number of threads that will enter the divergent region. The method also includes using the count within the divergent region to synchronize the threads in the divergent region.
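The count-then-synchronize scheme in this abstract can be sketched in a few lines of Python. This is an illustrative model only (the branch condition, thread bodies, and all names are invented for the example), not the patented hardware mechanism:

```python
import threading

def run_divergent(values):
    """Simulate the scheme: before any thread enters the divergent
    region, generate a count of the threads that will enter it, then
    use that count inside the region to synchronize those threads."""
    results = []
    lock = threading.Lock()

    # Step 1: generate the count of threads that will diverge
    # (here, threads whose value is even take the divergent path).
    count = sum(1 for v in values if v % 2 == 0)
    barrier = threading.Barrier(count) if count else None

    def worker(v):
        if v % 2 == 0:          # divergent region
            barrier.wait()      # synchronize only the diverging threads
            with lock:
                results.append(v * 10)
        else:
            with lock:
                results.append(v)

    threads = [threading.Thread(target=worker, args=(v,)) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

Because the barrier size is computed before any thread enters the region, no diverging thread can deadlock waiting for a thread that never arrives.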
Abstract: A system and method for dynamically migrating stash transactions include first and second processing cores, an input/output memory management unit (IOMMU), an IOMMU mapping table, an input/output (I/O) device, a stash transaction migration management unit (STMMU), and an operating system (OS) scheduler. The first core executes a first thread associated with a frame manager. The OS scheduler migrates the first thread from the first core to the second core and generates pre-empt notifiers to indicate scheduling-out and scheduling-in of the first thread from the first core and to the second core. The STMMU uses the pre-empt notifiers to enable dynamic stash transaction migration.
Abstract: A configurable execution unit comprises operators capable of being dynamically configured by an instruction at the level of processing multi-bit operand values. The unit comprises one or more dynamically configurable operator modules, each module being connectable to receive input operands indicated in an instruction, and a programmable lookup table connectable to receive dynamic configuration information determined from an opcode portion of the instruction. Responsive to the dynamic configuration information in the instruction, the lookup table is capable of generating operator configuration settings that define an aspect of the function or behavior of a configurable operator module.
Abstract: A pipelined processing device includes: a device controller configured to receive a request to perform an operation; a plurality of subcontrollers configured to receive at least one instruction associated with the operation, each of the plurality of subcontrollers including a counter configured to generate an active time value indicating at least a portion of a time taken to process the at least one instruction; a pipeline processor configured to receive and process the at least one instruction, the pipeline processor configured to receive the active time value; and a shared pipeline storage area configured to store the active time value for each of the plurality of subcontrollers.
Type:
Grant
Filed:
June 24, 2010
Date of Patent:
March 11, 2014
Assignee:
International Business Machines Corporation
Inventors:
Ekaterina M. Ambroladze, Deanna Postles Dunn Berger, Michael Fee, Christine C. Jones, Arthur J. O'Neill, Jr., Diana Lynn Orf, Robert J. Sonnelitter, III
Abstract: A mapper unit of an out-of-order processor assigns a particular counter currently in a counter free pool to count a number of mappings of logical registers to a particular physical register from among multiple physical registers, responsive to an execution of an instruction by the mapper unit mapping at least one logical register to the particular physical register. The number of counters is less than the number of physical registers. The mapper unit, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
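A toy Python model of the counter-free-pool idea follows. The class and method names are invented for illustration; the real mechanism is hardware inside the mapper unit of an out-of-order processor:

```python
class MapperCounters:
    """Sketch of a counter free pool smaller than the physical register
    file: a counter is assigned on the first mapping to a physical
    register and returned to the pool when its count drops below 1."""

    def __init__(self, num_counters):
        self.free = list(range(num_counters))   # counter free pool
        self.assigned = {}                      # phys reg -> counter id
        self.count = {}                         # counter id -> mapping count

    def map(self, phys_reg):
        """A logical register is mapped to phys_reg."""
        if phys_reg not in self.assigned:
            counter = self.free.pop()           # take a counter from the pool
            self.assigned[phys_reg] = counter
            self.count[counter] = 0
        self.count[self.assigned[phys_reg]] += 1

    def unmap(self, phys_reg):
        """A mapping to phys_reg is released."""
        counter = self.assigned[phys_reg]
        self.count[counter] -= 1
        if self.count[counter] < 1:             # below minimum: recycle
            del self.count[counter]
            del self.assigned[phys_reg]
            self.free.append(counter)           # return counter to the pool
```

The point of the design is that only registers with live mappings consume a counter, so far fewer counters than physical registers are needed.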
Type:
Grant
Filed:
April 15, 2011
Date of Patent:
February 25, 2014
Assignee:
International Business Machines Corporation
Inventors:
Gregory W. Alexander, Brian D. Barrick, John W. Ward, III
Abstract: A method and apparatus for configuring dynamic data are provided. A compilation apparatus may select, from among a plurality of data formats supported by an execution apparatus, the data format showing optimum performance when a binary code is executed, and may generate a binary code that uses the selected data format. The execution apparatus may execute the binary code provided by the compilation apparatus.
Type:
Application
Filed:
August 8, 2013
Publication date:
February 20, 2014
Applicant:
Samsung Electronics Co., Ltd.
Inventors:
Sung Jin SON, Sang Oak WOO, Seok Yoon JUNG
Abstract: Mechanism for consistent core hang detection on a processor with multiple processor cores, each having one or more instruction execution pipelines. Each core may also include a hang detection unit with a counter unit that may generate a count value based on a clock source having a frequency that is independent of a frequency of a processor core clock. The hang detection unit may also include a detector logic unit that may determine whether a given instruction execution pipeline has ceased processing a given instruction based upon a state of the processor core and whether or not the given instruction has completed execution prior to the count value exceeding a predetermined value.
Abstract: A method and apparatus are provided for synchronizing execution of a plurality of threads on a multi-threaded processor. Each thread is provided with a number of synchronization points corresponding to points where it is advantageous or preferable that execution be synchronized with another thread. Execution of a thread is paused when it reaches a synchronization point until at least one other thread with which it is intended to be synchronized reaches a corresponding synchronization point, after which execution resumes. Where an executing thread branches over a section of code that includes a synchronization point, execution is paused at the end of the branch until the at least one other thread reaches the synchronization point at the end of the corresponding branch.
Abstract: A system and method for implementing the Elliptic Curve scalar multiplication method in cryptography, where the scalar is expressed in the Double Base Number System in decreasing order of exponents and then used to perform Elliptic curve scalar multiplication over a finite elliptic curve.
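For illustration, a double-base expansion can be computed greedily as below. Note that this simple greedy version does not enforce the patent's decreasing-exponent ordering; it only shows the number system whose terms the scalar multiplication then consumes as chains of doublings, triplings, and additions:

```python
def dbns_greedy(n):
    """Greedy double-base expansion: write n as a sum of terms of the
    form 2**a * 3**b by repeatedly subtracting the largest such term
    not exceeding the remainder."""
    terms = []
    while n > 0:
        best = 1
        b_pow = 1
        while b_pow <= n:                  # enumerate candidate 3**b
            t = b_pow
            while t * 2 <= n:              # scale by powers of two
                t *= 2
            if t > best:
                best = t
            b_pow *= 3
        terms.append(best)
        n -= best
    return terms
```

For example, 10 expands to 9 + 1 (that is, 3**2 + 2**0 * 3**0), giving a shorter chain than the binary expansion 8 + 2.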
Abstract: Provided is a method and system for dynamically parallelizing an application program. Specifically, provided is a method and system having multi-core control that may verify the number of available threads according to an application program and dynamically parallelize data based on the verified number of available threads. The method and system may divide a data block to be processed according to the application program based on a relevant data characteristic and dynamically map the threads to the divided blocks, thereby enhancing system performance.
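The divide-and-map step can be sketched with Python's standard thread pool. The chunking policy and all names here are illustrative assumptions, not taken from the patent:

```python
from concurrent.futures import ThreadPoolExecutor
import os

def parallel_map(func, data, num_threads=None):
    """Verify an available thread count, divide the data block into
    one chunk per thread, and dynamically map chunks onto threads."""
    if not data:
        return []
    # Verified number of available threads (capped by the data size).
    n = max(num_threads or min(len(data), os.cpu_count() or 1), 1)
    size = -(-len(data) // n)               # ceiling division: chunk size
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        parts = pool.map(lambda chunk: [func(x) for x in chunk], chunks)
    return [y for part in parts for y in part]
```

`pool.map` preserves chunk order, so the flattened result matches a sequential map over the original block.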
Type:
Grant
Filed:
April 27, 2010
Date of Patent:
February 11, 2014
Assignees:
Samsung Electronics Co., Ltd., University of Southern California
Inventors:
Seung Won Lee, Shi Hwa Lee, Dong-In Kang, Mikyung Kang
Abstract: Fetch operations are assigned to different threads in a multithreaded environment. A number of different sorting algorithms are provided, from which one is periodically selected based on whether the present algorithm is giving satisfactory results. The period is preferably a sub-context-switch interval. The sorting algorithms preferably include one based on software/OS priority; a second may sort according to hardware performance measurements. A two-level priority scheme is used to combine both priorities. The judgment of satisfactory performance is preferably based on the difference between the desired number and the real number of fetch operations attributed to each thread per sub-context-switch interval.
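A minimal Python sketch of the two-level priority and the satisfaction test follows. The field names and the particular hardware metric are assumptions made for the example:

```python
def pick_fetch_order(threads, use_hw_metrics):
    """Two-level priority: software/OS priority first; within equal
    OS priority, optionally break ties with a hardware performance
    measurement. Higher values fetch first."""
    if use_hw_metrics:
        key = lambda t: (t["os_prio"], t["hw_metric"])
    else:
        key = lambda t: t["os_prio"]
    return sorted(threads, key=key, reverse=True)

def algorithm_satisfactory(desired, actual, tolerance):
    """Judge the current sorting algorithm by the gap between the
    desired and the real fetch operations attributed to each thread
    per sub-context-switch interval."""
    return all(abs(d - a) <= tolerance for d, a in zip(desired, actual))
```

When `algorithm_satisfactory` returns False at the end of an interval, the scheduler would switch to the next sorting algorithm.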
Type:
Grant
Filed:
December 18, 2009
Date of Patent:
January 28, 2014
Assignee:
International Business Machines Corporation
Abstract: A hardware device for concurrently processing a fixed set of predetermined tasks associated with an algorithm which includes a number of processes, some of the processes being dependent on binary decisions, includes a plurality of task units for processing data, making decisions and/or processing data and making decisions, including source task units and destination task units. A task interconnection logic means interconnects the task units for communicating actions from a source task unit to a destination task unit. Each of the task units includes a processor for executing only a particular single task of the fixed set of predetermined tasks associated with the algorithm in response to a received request action, and a status manager for handling the actions from the source task units and building the actions to be sent to the destination task units.
Type:
Grant
Filed:
February 3, 2012
Date of Patent:
January 21, 2014
Assignee:
International Business Machines Corporation
Inventors:
Alain Benayoun, Jean-Francois Le Pennec, Patrick Michel, Claude Pin
Abstract: One embodiment of the present invention provides a method for supporting the development of a parallel/distributed application, wherein the development process comprises a design phase, an implementation phase and a test phase. A script language can be provided in the design phase for representing elements of a connectivity graph and the connectivity between them. In the implementation phase, modules can be provided for implementing functionality of the application, executors can be provided for defining a type of execution for the modules, and process-instances can be provided for distributing the application over several computing devices. In the test phase, abstraction levels can be provided for monitoring and testing the application.
Type:
Grant
Filed:
May 4, 2006
Date of Patent:
January 14, 2014
Assignee:
Honda Research Institute Europe GmbH
Inventors:
Frank Joublin, Christian Goerick, Antonello Ceravola, Mark Dunn
Abstract: A method for flattening conditional statements, the method comprises: obtaining a program code, the program code comprising a conditional control flow program construct, which conditional control flow program construct when read by a target processor, causes the target processor to select a control flow path for execution between at least a first and a second control flow paths, wherein said selection is based on an evaluation of a condition of the conditional control flow program construct; replacing the conditional control flow program construct with a transaction-based control flow program construct, which when read by the target processor is operative to cause the target processor to commence a transaction, the transaction configured to execute the first control flow path; and wherein the transaction-based control flow program construct is operative to cause the target processor to execute the conditional control flow program construct in case the transaction is rolled back.
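In software terms the transformation can be mimicked with a snapshot/rollback pair. This Python sketch (names invented; real transactions are processor-managed) speculatively runs the first path inside a "transaction" and re-executes the ordinary conditional when the transaction rolls back:

```python
class TransactionRollback(Exception):
    """Raised when the speculatively executed path is wrong."""

def flattened_if(condition, then_path, else_path, state):
    """Replace `if cond: then else: else_` with a transaction that
    speculatively executes then_path; on rollback, fall back to the
    original conditional control flow construct."""
    snapshot = dict(state)                 # begin transaction: snapshot state
    try:
        if not condition():
            raise TransactionRollback()    # abort: wrong path was speculated
        then_path(snapshot)                # run the first control flow path
        state.update(snapshot)             # commit the transaction
    except TransactionRollback:
        # Rolled back: execute the original conditional construct.
        if condition():
            then_path(state)
        else:
            else_path(state)
    return state
```

The payoff in the hardware setting is that the common path runs without a branch; the conditional is only evaluated on the rollback path.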
Type:
Application
Filed:
July 8, 2012
Publication date:
January 9, 2014
Applicant:
International Business Machines Corporation
Abstract: One embodiment of the present invention sets forth a technique for dynamically specifying a texture header and texture sampler using an index. The index corresponds to a particular register value that may be static or computed during execution of a shader program. Any texture operation instruction may specify an index value for each of the texture header and the texture sampler.
Abstract: A data processing device including a reception unit, an instruction unit and a storage unit. The reception unit receives instructions for processing at a processing execution device. The instruction unit instructs the processing execution device to cancel a power saving state of the processing execution device and execute the processing corresponding to an instruction received by the reception unit. The storage unit stores data relating to received instructions. If the processing corresponding to the received instruction is a pre-specified process, data relating to the instruction is stored by the storage unit. If the processing corresponding to the received instruction is not a pre-specified process, the instruction unit instructs the processing execution device to execute both the processing corresponding to this instruction and processing based on data relating to instructions stored in the storage unit.
Abstract: Embodiments of apparatuses and methods for processor accelerator interface virtualization are disclosed. In one embodiment, an apparatus includes instruction hardware and execution hardware. The instruction hardware is to receive instructions. One of the instruction types is an accelerator job request instruction type, which the execution hardware executes to cause the processor to submit a job request to an accelerator.
Type:
Application
Filed:
December 28, 2011
Publication date:
January 2, 2014
Inventors:
Paul M. Stillwell, JR., Omesh Tickoo, Vineet Chadha, Yong Zhang, Rameshkumar G. Illikkal, Ravishankar Iyer
Abstract: Apparatus having corresponding methods and non-transitory computer-readable media comprise a processor, wherein the processor is configured to count a number of iterations of an idle task loop executed by a processor during a first predetermined interval, determine a current load of the processor based on the number of iterations of the idle task loop executed by the processor during the first predetermined interval, determine a current operating frequency of the processor, and determine a desired operating frequency of the processor based on i) the current operating frequency of the processor and ii) the current load of the processor.
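The load and frequency computation can be sketched directly. The proportional scaling policy and the frequency floor below are assumptions for illustration; the abstract leaves the exact policy open:

```python
def processor_load(idle_iters, max_idle_iters):
    """Load from idle-task-loop iterations in the interval: if the
    idle task never ran, load is 1.0; if it ran flat out, 0.0."""
    return 1.0 - (idle_iters / max_idle_iters)

def desired_frequency(current_freq, load, floor=100_000_000):
    """Desired operating frequency from current frequency and load
    (simple proportional policy, clamped to a floor)."""
    return max(int(current_freq * load), floor)
```

Counting idle-loop iterations gives a load estimate without dedicated performance counters, which is why the scheme is attractive on small embedded cores.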
Abstract: Obfuscating a multi-threaded computer program is carried out using an instruction pipeline in a computer processor by streaming first instructions of a first thread of a multi-threaded computer application program into the pipeline, the first instructions entering the pipeline at the fetch stage, detecting a stall signal indicative of a stall condition in the pipeline, and responsively to the stall signal injecting second instructions of a second thread of the multi-threaded computer application program into the pipeline. The injected second instructions enter the pipeline at an injection stage that is disposed downstream from the fetch stage up to and including the register stage for processing therein. The stall condition exists at one of the stages that is located upstream from the injection stage.
Abstract: A novel technique for improving throughput in a multi-core system in which data is processed according to a producer-consumer relationship, eliminating latencies caused by compulsory cache misses. The producer and consumer entities run as multiple slices of execution. Each slice has an associated execution context comprising the code and data that the slice will access, and the execution contexts of the producer and consumer slices are small enough to fit in the processor caches simultaneously. When a producer entity scheduled on a first core has completed production of data elements, as constrained by the size of the cache memories, a consumer entity is scheduled on that same core to consume the produced data elements. Meanwhile, a second slice of the producer entity is moved to another core, and a second slice of a consumer entity is scheduled to consume the elements produced by the second producer slice.
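The scheduling pattern can be sketched as a timeline. The function and task names are invented; the essential property is that each consumer slice lands on the same core as its producer slice, while the next producer slice moves on:

```python
def schedule_slices(num_slices, num_cores):
    """Emit a (time step, core, task) timeline in which producer
    slice k and consumer slice k share a core (so the produced data
    is still cache-hot), and successive producer slices rotate
    across cores."""
    timeline = []
    for k in range(num_slices):
        core = k % num_cores                      # next producer slice moves on
        timeline.append((2 * k, core, f"produce[{k}]"))
        timeline.append((2 * k + 1, core, f"consume[{k}]"))
    return timeline
```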
Abstract: In one embodiment, the present invention includes an instruction decoder that can receive an incoming instruction and a path select signal and decode the incoming instruction into a first instruction code or a second instruction code responsive to the path select signal. The two different instruction codes, both representing the same incoming instruction may be used by an execution unit to perform an operation optimized for different data lengths. Other embodiments are described and claimed.
Type:
Application
Filed:
August 28, 2013
Publication date:
December 26, 2013
Inventors:
Ohad Falik, Lihu Rappoport, Ron Gabor, Yulia Kurolap, Michael Mishaeli
Abstract: A data processing system having a memory for storing instructions and several central processing units for executing instructions, each central processing unit including an adaptive power supply which provides, among other data, IR (voltage) drop information. Circuitry is provided that receives the IR drop information from the central processing units, selects a central processing unit which has the lowest IR drop and which is available to execute instructions, and dispatches instructions from the memory to the selected central processing unit.
Type:
Grant
Filed:
February 6, 2007
Date of Patent:
December 24, 2013
Assignee:
International Business Machines Corporation
Inventors:
Deepak K. Singh, Francois Ibrahim Atallah
Abstract: An instruction is provided to establish various operational parameters for an adapter. These parameters include adapter interruption parameters, input/output address translation parameters, resetting error indications, setting measurement parameters, and setting an interception control, as examples. The instruction specifies a function information block, which is a program representation of a device table entry used by the adapter, to be used in certain situations in establishing the parameters. A store instruction is also provided that stores the current contents of the function information block.
Type:
Grant
Filed:
June 23, 2010
Date of Patent:
December 24, 2013
Assignee:
International Business Machines Corporation
Inventors:
David Craddock, Mark S. Farrell, Beth A. Glendening, Thomas A. Gregg, Dan F. Greiner, Gustav E. Sittmann, III, Peter K. Szwed
Abstract: Restricted instructions are prohibited from execution within a transaction. Some classes of instructions are restricted regardless of the type of transaction, constrained or nonconstrained; some instructions are restricted only in constrained transactions; and some instructions are selectively restricted for given transactions based on controls specified on the instructions used to initiate the transactions.
Type:
Application
Filed:
June 15, 2012
Publication date:
December 19, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
Abstract: A TRANSACTION ABORT instruction is used to abort a transaction that is executing in a computing environment. The TRANSACTION ABORT instruction includes at least one field used to specify a user-defined abort code that indicates the specific reason for aborting the transaction. Based on executing the TRANSACTION ABORT instruction, a condition code is provided that indicates whether re-execution of the transaction is recommended.
Type:
Application
Filed:
June 15, 2012
Publication date:
December 19, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Dan F. Greiner, Christian Jacobi, Marcel Mitran, Timothy J. Slegel
Abstract: Embodiments relate to intra-instructional transaction abort handling. An aspect includes using an emulation routine to execute an instruction within a transaction. The instruction includes at least one unit of operation. The transaction effectively delays committing stores to memory until the transaction has completed successfully. After receiving an abort indication, emulation of the instruction is terminated prior to completing the execution of the instruction. The instruction is terminated after the emulation routine completes any previously initiated unit of operation of the instruction.
Type:
Application
Filed:
June 15, 2012
Publication date:
December 19, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Brenton F. Belmar, Mark S. Farrell, Christian Jacobi, Timothy J. Slegel
Abstract: Executing a Next Instruction Access Intent instruction by a computer. The processor obtains an access intent instruction indicating an access intent. The access intent is associated with an operand of a next sequential instruction. The access intent indicates usage of the operand by one or more instructions subsequent to the next sequential instruction. The computer executes the access intent instruction. The computer obtains the next sequential instruction. The computer executes the next sequential instruction, which comprises based on the access intent, adjusting one or more cache behaviors for the operand of the next sequential instruction.
Type:
Application
Filed:
June 15, 2012
Publication date:
December 19, 2013
Applicant:
International Business Machines Corporation
Abstract: Task specific diagnostic controls are provided to facilitate the debugging of certain types of abort conditions. The diagnostic controls may be set to cause transactions to be selectively aborted, allowing a transaction to drive its abort handler routine for testing purposes. The controls include, for instance, a transaction diagnostic scope and a transaction diagnostic control. The transaction diagnostic scope indicates when the transaction diagnostic control is to be applied, and the transaction diagnostic control indicates whether transactions are to be selectively aborted.
Type:
Application
Filed:
June 15, 2012
Publication date:
December 19, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
Abstract: In one embodiment, a processor comprises a programmable map and a circuit. The programmable map is configured to store data that identifies at least one instruction for which an architectural modification of an instruction set architecture implemented by the processor has been defined, wherein the processor does not implement the modification. The circuit is configured to detect the instruction or its memory operands and cause a transition to Known Good Code (KGC), wherein the KGC is protected from unauthorized modification and is provided from an authenticated entity. The KGC comprises code that, when executed, emulates the modification. In another embodiment, an integrated circuit comprises at least one processor core; at least one other circuit; and a KGC source configured to supply KGC to the processor core for execution. The KGC comprises interface code for the other circuit whereby an application executing on the processor core interfaces to the other circuit through the KGC.
Type:
Grant
Filed:
December 17, 2007
Date of Patent:
December 17, 2013
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Garth D. Hillman, Geoffrey Strongin, Andrew R. Rawson, Gary H. Simpson, Ralf Findeisen
Abstract: A method of sharing a plurality of registers in a shared register pool among a plurality of microprocessor threads begins with a determination that a first instruction to be executed by a microprocessor in a first microprocessor thread requires a first logical register. Next a determination is made that a second instruction to be executed by the microprocessor in a second microprocessor thread requires a second logical register. A first physical register in the shared register pool is allocated to the first microprocessor thread for execution of the first instruction and the first logical register is mapped to the first physical register. A second physical register in the shared register pool is allocated to the second microprocessor thread for execution of the second instruction. Finally, the second logical register is mapped to the second physical register.
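A small Python model of the shared-pool allocation follows (class and method names are invented; the real mapping is done by rename hardware):

```python
class SharedRegisterPool:
    """Physical registers live in one pool shared by all threads;
    each thread's logical registers are mapped onto whatever
    physical register the pool allocates."""

    def __init__(self, num_phys):
        self.free = list(range(num_phys))
        self.map = {}                        # (thread, logical) -> physical

    def allocate(self, thread, logical):
        phys = self.free.pop(0)              # allocate from the shared pool
        self.map[(thread, logical)] = phys   # map logical -> physical
        return phys

    def lookup(self, thread, logical):
        return self.map[(thread, logical)]
```

Two threads can both use "r1" as a logical name yet receive distinct physical registers, which is the point of the sharing scheme.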
Abstract: A combination of hardware and software collect profile data for asynchronous events, at code region granularity. An exemplary embodiment is directed to collecting metrics for prefetching events, which are asynchronous in nature. Instructions that belong to a code region are identified using one of several alternative techniques, causing a profile bit to be set for the instruction, as a marker. Each line of a data block that is prefetched is similarly marked. Events corresponding to the profile data being collected and resulting from instructions within the code region are then identified. Each time that one of the different types of events is identified, a corresponding counter is incremented. Following execution of the instructions within the code region, the profile data accumulated in the counters are collected, and the counters are reset for use with a new code region.
Type:
Application
Filed:
December 29, 2011
Publication date:
December 12, 2013
Inventors:
Raul Martinez, Enric Gibert Codina, Pedro Lopez, Marti Torrents Lapuerta, Polychronis Xekalakis, Georgios Tournavitis, Kyriakos A. Stavrou, Demos Pavlou, Daniel Ortega, Alejandro Martinez Vicente, Pedro Marcuello, Grigorios Magklis, Josep M. Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos Kotselidis, Fernando Latorre, Marc Lupon, Carlos Madriles
Abstract: A method for improving performance of a pipelined microprocessor by utilizing pipeline virtual registers allows for either decreased register spillage or decreased area and power consumption of a microprocessor. The microprocessor takes advantage of register bypass logic to write short-lived values to virtual registers, which are discarded instead of being written to the register bank, thus reducing register pressure by avoiding short-lived values being written to the register bank.
Abstract: There are included a communication part that communicates with another device; a command managing part that transmits commands of its own device to the other device, receives commands of the other device via the communication part, and manages both the commands of the own device and the commands of the other device; and a command processing part that, when a command selected from the commands managed by the command managing part is a command of the own device, executes processing of the corresponding function on the own device, and, when the selected command is a command of the other device, causes the other device to execute processing of the corresponding function.
Abstract: A hardware device for concurrently processing a fixed set of predetermined tasks associated with an algorithm which includes a number of processes, some of the processes being dependent on binary decisions, includes a plurality of task units for processing data, making decisions and/or processing data and making decisions, including source task units and destination task units. A task interconnection logic means interconnects the task units for communicating actions from a source task unit to a destination task unit. Each of the task units includes a processor for executing only a particular single task of the fixed set of predetermined tasks associated with the algorithm in response to a received request action, and a status manager for handling the actions from the source task units and building the actions to be sent to the destination task units.
Type:
Grant
Filed:
February 3, 2012
Date of Patent:
December 10, 2013
Assignee:
International Business Machines Corporation
Inventors:
Alain Benayoun, Jean-Francois Le Pennec, Patrick Michel, Claude Pin
Abstract: A system serialization capability is provided to facilitate processing in those environments that allow multiple processors to update the same resources. The system serialization capability is used to facilitate processing in a multi-processing environment in which guests and hosts use locks to provide serialization. The system serialization capability includes a diagnose instruction which is issued after the host acquires a lock, eliminating the need for the guest to acquire the lock.
Type:
Grant
Filed:
April 28, 2012
Date of Patent:
December 10, 2013
Assignee:
International Business Machines Corporation
Abstract: A technique to perform a fast compare-exchange operation is disclosed. More specifically, a machine-readable medium, processor, and system are described that implement a fast compare-exchange operation as well as a cache line mark operation that enables the fast compare-exchange operation.
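For readers unfamiliar with the primitive, the semantics of a compare-exchange can be modeled in a few lines (a real CPU performs this atomically in a single instruction; the list-based cell here is just a stand-in for a memory location):

```python
def compare_exchange(cell, expected, new):
    """Write `new` into the one-element list `cell` only if it still
    holds `expected`; return (success, value observed before)."""
    observed = cell[0]
    if observed == expected:
        cell[0] = new
        return True, observed
    return False, observed
```

The "fast" variant in the abstract additionally marks the cache line so a subsequent compare-exchange can skip work; that hardware detail is not modeled here.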
Type:
Grant
Filed:
December 18, 2009
Date of Patent:
December 3, 2013
Assignee:
Intel Corporation
Inventors:
Joshua B. Fryman, Andrew Thomas Forsyth, Edward Grochowski
Abstract: A clone set of General Purpose Registers (GPRs) is created to be used by a set of helper thread binaries, which is created from a set of main thread binaries. When the set of main thread binaries enters a wait state, the set of helper thread binaries uses the clone set of GPRs to continue using unused execution units within a processor core. The set of helper threads is thus able to warm up local cache memory with data that will be needed when execution of the set of main thread binaries resumes.
Type:
Grant
Filed:
February 1, 2008
Date of Patent:
December 3, 2013
Assignee:
International Business Machines Corporation
Inventors:
Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy
Abstract: A method for processing an operating sequence of instructions of a program in a processor, wherein each instruction is represented by an assigned instruction code which comprises one execution step to be processed by the processor or a plurality of execution steps to be processed successively by the processor, includes determining an actual signature value assigned to a current execution step of the execution steps of the instruction code representing the instruction of the operating sequence; determining, in a manner dependent on an address value, a desired signature value assigned to the current execution step; and if the actual signature value does not correspond to the desired signature value, omitting at least one execution step directly available for execution and/or an execution step indirectly available for execution.
Abstract: A system serialization capability is provided to facilitate processing in those environments that allow multiple processors to update the same resources. The system serialization capability is used to facilitate processing in a multi-processing environment in which guests and hosts use locks to provide serialization. The system serialization capability includes a diagnose instruction which is issued after the host acquires a lock, eliminating the need for the guest to acquire the lock.
Type:
Grant
Filed:
June 24, 2010
Date of Patent:
November 26, 2013
Assignee:
International Business Machines Corporation
Abstract: A digital signal processor includes an instruction analysis unit, a digital signal processor (DSP) core and a memory unit. The instruction analysis unit receives an instruction and determines the bit width M required for the data processing corresponding to the instruction. The DSP core performs the M-bit data processing based on the bit width M determined by the instruction analysis unit, and the memory unit stores multiple data and performs M-bit accesses based on that bit width, allowing the DSP core access. At least one available space in the memory unit is adjusted such that only the access space having the bit width M for the operation corresponding to the instruction is open in each access, thereby effectively saving power.
Type:
Grant
Filed:
November 9, 2010
Date of Patent:
November 26, 2013
Assignee:
Sentelic Corporation
Inventors:
Zhiyang Guo, Mao-Sung Wu, Chun Hsien, Tsai-Lin Lee
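A minimal sketch of the width-limited access idea, assuming a hypothetical per-opcode width table (the opcode names and table are illustrative, not from the patent):

```python
# Hypothetical table mapping each instruction to the bit width M its
# data process requires.
WIDTH_BY_OP = {"add8": 8, "add16": 16, "mac32": 32}

def open_access_window(instruction):
    """Determine the required bit width M for the instruction and
    open only an M-bit access window (modeled here as a bit mask),
    so wider memory lanes stay closed to save power."""
    m = WIDTH_BY_OP[instruction]
    mask = (1 << m) - 1  # only the low M bits are accessible
    return m, mask

print(open_access_window("add16"))  # (16, 65535)
```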
Abstract: A semiconductor chip includes a plurality of multi-core clusters each including a plurality of cores and a cluster controller unit. Each cluster controller unit is configured to control thread assignment within the multi-core cluster to which it belongs. The cluster controller unit monitors various parameters measured in the plurality of cores within the multi-core cluster to estimate the computational demand of each thread that runs in the cores. The cluster controller unit may reassign the threads within the multi-core cluster based on the estimated computational demand of the threads and transmit a signal to an upper-level software manager that controls the thread assignment across the semiconductor chip. When an acceptable solution to thread assignment cannot be achieved by shuffling of threads within the multi-core cluster, the cluster controller unit may also report inability to solve thread assignment to the upper-level software manager to request a system level solution.
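The in-cluster reassignment and escalation behavior can be sketched with a simple greedy placement. All names, the greedy policy, and the capacity model are assumptions for illustration; the patent does not specify a particular assignment algorithm.

```python
def assign_threads(demands, cores, capacity):
    """Greedily place the heaviest threads first on the least-loaded
    core; return (assignment, escalate), where escalate signals that
    no core stays within capacity and the upper-level software
    manager should be asked for a system-level solution."""
    load = {c: 0.0 for c in cores}
    assignment = {}
    for thread, demand in sorted(demands.items(), key=lambda kv: -kv[1]):
        core = min(load, key=load.get)  # least-loaded core so far
        assignment[thread] = core
        load[core] += demand
    escalate = any(l > capacity for l in load.values())
    return assignment, escalate

demands = {"t1": 0.9, "t2": 0.5, "t3": 0.4}
assignment, escalate = assign_threads(demands, ["c0", "c1"], capacity=1.0)
print(assignment, escalate)
```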
Abstract: A technique to perform three-source instructions. At least one embodiment of the invention relates to converting a three-source instruction into at least two instructions identifying no more than two source values.
Type:
Grant
Filed:
June 27, 2006
Date of Patent:
November 19, 2013
Assignee:
Intel Corporation
Inventors:
Avinash Sodani, Stephan Jourdan, Alexandre Farcy, Per Hammarlund
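A concrete instance of the conversion is splitting a multiply-add, d = a*b + c, which names three sources, into a multiply and an add that each name at most two. The mnemonics and temporary-register name below are illustrative assumptions:

```python
def split_three_source(op, a, b, c):
    """Convert a three-source instruction into a two-instruction
    sequence through a temporary register, so that no resulting
    instruction identifies more than two source values."""
    if op == "madd":                       # d = a*b + c
        return [("mul", "tmp", a, b),      # tmp = a * b
                ("add", "dst", "tmp", c)]  # dst = tmp + c
    raise ValueError("unsupported three-source op: " + op)

print(split_three_source("madd", "r1", "r2", "r3"))
```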
Abstract: A data processing apparatus includes a data engine 6 having an instruction decoder 18 for generating one or more control signals 24 for controlling processing circuitry 20 to perform data processing operations specified by the decoded program instructions. The instruction decoder 18 is responsive to a marker instruction to read a programmable flow control value from a flow control register 38. The programmable flow control value specifies the action to be taken upon completion of execution of a current sequence of program instructions. The action taken may be a jump to a target program instruction at the start of a target sequence of program instructions, or entry into an idle state awaiting initiation of a new processing task.
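A toy model of that end-of-sequence decision, with the register represented as a dictionary (the field names and encoding are illustrative assumptions):

```python
def on_sequence_complete(flow_control_register):
    """Return the action taken when the current instruction sequence
    finishes: jump to a target sequence, or enter an idle state
    awaiting a new processing task."""
    if flow_control_register.get("action") == "jump":
        return ("jump", flow_control_register["target"])
    return ("idle", None)  # await initiation of a new task

print(on_sequence_complete({"action": "jump", "target": 0x40}))  # ('jump', 64)
```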
Abstract: Queuing of received transactions that have a resource conflict is disclosed. A first node receives a first transaction from a second node, where the first transaction relates to a resource of the first node. The transaction may be a request relating to a memory line of the first node, for instance. It is determined that a second transaction that relates to this resource of the first node is already being processed by the first node. Therefore, the first transaction is enqueued in a conflict queue within the first node. The conflict queue may be a linked list, a priority queue, or another type of queue. Once the second transaction has been processed, the first transaction is restarted for processing by the first node. The first transaction is then processed by the first node.
Type:
Grant
Filed:
June 25, 2011
Date of Patent:
November 19, 2013
Assignee:
International Business Machines Corporation
Inventors:
Donald R. DeSota, Robert Joersz, Davis A. Miller, Maged M. Michael
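The arrive/enqueue/restart cycle above can be sketched with a per-resource FIFO (a simple stand-in for the linked list or priority queue the abstract allows; class and method names are illustrative):

```python
from collections import defaultdict, deque

class ConflictQueue:
    """Track one active transaction per resource; later transactions
    for the same resource wait and are restarted on completion."""

    def __init__(self):
        self.active = {}                   # resource -> active txn
        self.waiting = defaultdict(deque)  # resource -> queued txns

    def arrive(self, txn, resource):
        if resource in self.active:
            self.waiting[resource].append(txn)  # conflict: enqueue
            return "queued"
        self.active[resource] = txn
        return "processing"

    def complete(self, resource):
        """Finish the active transaction; restart the next waiter,
        if any, and return it (or None)."""
        del self.active[resource]
        if self.waiting[resource]:
            nxt = self.waiting[resource].popleft()
            self.active[resource] = nxt
            return nxt
        return None
```

For example, a second request for the same memory line is queued rather than retried, then restarted once the first completes.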
Abstract: An operation of a processor with respect to transactions is checked by simulating an execution of a test program, and updating a transaction order graph to identify a cycle. The graph is updated based on a value read during an execution of a first transaction and a second transaction that is configured to set the memory with the read value. The test program comprises information useful for identifying the second transaction.
Type:
Grant
Filed:
August 26, 2010
Date of Patent:
November 19, 2013
Assignee:
International Business Machines Corporation
Inventors:
Allon Adir, John Martin Ludden, Avi Ziv
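Identifying a cycle in the transaction order graph is a standard directed-graph check; a sketch with depth-first search (the graph encoding and names are illustrative, not from the patent):

```python
def has_cycle(edges):
    """Return True if the directed graph {node: [successors]}
    contains a cycle, via three-color depth-first search."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in edges}

    def dfs(n):
        color[n] = GRAY
        for m in edges.get(n, []):
            if color.get(m, WHITE) == GRAY:
                return True  # back edge: cycle found
            if color.get(m, WHITE) == WHITE and dfs(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in edges)

# An edge T1 -> T2 means T1 must precede T2 in any serial order;
# mutual ordering constraints form a cycle, flagging a violation.
print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))  # True
```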
Abstract: Various technologies and techniques are disclosed for switching threads within routines. A controller routine receives a request from an originating routine to execute a coroutine, and executes the coroutine on an initial thread. The controller routine receives a response back from the coroutine when the coroutine exits based upon a return statement. Upon return, the coroutine indicates a subsequent thread that the coroutine should be executed on when the coroutine is executed a subsequent time. The controller routine executes the coroutine the subsequent time on the subsequent thread. The coroutine picks up execution at a line of code following the return statement. Multiple return statements can be included in the coroutine, and the threads can be switched multiple times using this same approach. Graphical user interface logic and worker thread logic can be co-mingled into a single routine.
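The controller/coroutine handoff can be modeled with a Python generator standing in for the coroutine: each yield plays the role of a return statement that names the thread on which the next segment should run, and execution resumes at the line after it. Thread dispatch is only simulated here; all names are illustrative.

```python
def coroutine():
    # ... first segment runs on the initial thread ...
    yield "worker"  # request: run the next segment on a worker thread
    # ... second segment runs on the worker thread ...
    yield "ui"      # request: run the segment after that on the UI thread

def controller(coro):
    """Run the coroutine segment by segment, recording which thread
    each segment executed on and switching per the coroutine's
    requests."""
    schedule = []
    thread = "ui"                # initial thread
    for next_thread in coro:
        schedule.append(thread)  # this segment ran on `thread`
        thread = next_thread     # switch threads for the next segment
    return schedule

print(controller(coroutine()))  # ['ui', 'worker']
```

Because the requested thread is data returned by the coroutine, multiple return statements let the threads be switched multiple times with the same mechanism.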
Abstract: Optimizations are provided for frame management operations, including a clear operation and/or a set storage key operation, requested by pageable guests. The operations are performed, absent host intervention, on frames not resident in host memory. The operations may be specified in an instruction issued by the pageable guests.
Type:
Application
Filed:
July 15, 2013
Publication date:
November 14, 2013
Inventors:
Charles W. Gainey, JR., Dan F. Greiner, Lisa Cranton Heller, Damian L. Osisek, Gustav E. Sittmann, III