Abstract: Techniques for reducing issue-to-issue latency by reversing processing order in half-pumped single instruction multiple data (SIMD) execution units are described. In one embodiment a processor functional unit is provided comprising a frontend unit, and execution core unit, a backend unit, an execution order control signal unit, a first interconnect coupled between and output and an input of the execution core unit and a second interconnect coupled between an output of the backend unit and an input of the frontend unit. In operation, the execution order control signal unit generates a forwarding order control signal based on the parity of an applied clock signal on reception of a first vector instruction. This control signal is in turn used to selectively forward first and second portions of an execution result of the first vector instruction via the interconnects for use in the execution of a dependent second vector instruction.
Type:
Grant
Filed:
December 14, 2011
Date of Patent:
February 3, 2015
Assignee:
International Business Machines Corporation
Inventors:
Maarten J. Boersma, Markus Kaltenbach, Christophe J. Layer, Jens Leenstra, Silvia M. Mueller
Abstract: A processor includes a first and second execution unit each of which is arranged to execute multiply instructions of a first type upon fixed point operands and to execute multiply instructions of a second type upon floating point operands. A register file of the processor stores operands in registers that are each addressable by instructions for performing the first and second types of operations. An instruction decode unit is responsive to the at least one multiply instruction of the first type and the at least one multiply instruction of the second type to at the same time enable a first data path between the first set of registers and the first execution unit and to enable a second data path between a second set of registers and the second execution unit.
Type:
Grant
Filed:
September 15, 2011
Date of Patent:
November 4, 2014
Assignee:
Texas Instruments Incorporated
Inventors:
Timothy D Anderson, Duc Quang Bui, Eric Biscondi, Shriram D Moharil, Mujibur Rahman, Soujanya Narnur, Peter Richard Dent, Ashish Rai Shrivastava
Abstract: An apparatus and method for dynamically determining the execution mode of a reconfigurable array are provided. Performance information of a loop may be obtained before and/or during the execution of the loop. The performance information may be used to determine whether to operate the apparatus in a very long instruction word (VLIW) mode or in a coarse grained array (CGA) mode.
Type:
Grant
Filed:
August 25, 2011
Date of Patent:
September 30, 2014
Assignee:
Samsung Electronics Co., Ltd.
Inventors:
Bernhard Egger, Dong-Hoon Yoo, Tai-Song Jin, Won-Sub Kim, Min-Wook Ahn, Jin-Seok Lee, Hee-Jin Ahn
Abstract: A branch target address table is provided for each branch instruction having a plurality of branch targets. Each branch target address table stores a history of a plurality of branch target addresses determined in the past by executing a corresponding branch instruction. A branch target prediction unit predicts a predicted branch target address with respect to a branch instruction with reference to the history of branch target addresses stored in the branch target address table corresponding to the branch instruction. The predicted branch target address obtained as a result of the prediction is stored, for example, in a predicted branch target address storage unit in association with the branch instruction, and is referenced by an instruction fetch control unit at the time of prefetching a branch target instruction.
Abstract: A data processing system is used to evaluate a data processing function by executing a sequence of program instructions including an intermediate value generating instruction and an intermediate value consuming instruction. In dependence upon one or more input operands to the evaluation, an embedded opcode within the intermediate value passed between the intermediate value generating instruction and the intermediate value consuming instruction may be set to have a value indicating that a substitute instruction should be used in place of the intermediate value consuming instruction. The instructions may be floating point instructions, such as a floating point power instruction evaluating the data processing function ab.
Abstract: A processor has a plurality of PEs (processing elements) that operate in parallel based on operation commands and an information collection unit that collects the data of the plurality of PEs, wherein each of the plurality of PEs holds data and a condition flag, supplies the data and the condition flag to the information collection unit upon receiving an operation command, and upon receiving an update request for updating the condition flag, updates the condition flag in accordance with the update request that was received; and the information collection unit, upon receiving the data and the condition flags, selects one PE based on a predetermined order of priority from among the PEs for which the received condition flags are active and both supplies the data of the selected PE as collection result data and supplies an update request for updating the condition flag of the PE that was selected.