Conditional instruction for a single instruction, multiple data execution engine
According to some embodiments, a conditional Single Instruction, Multiple Data instruction is provided. For example, a first conditional instruction may be received at an n-channel SIMD execution engine. The first conditional instruction may be evaluated based on multiple channels of associated data, and the result of the evaluation may be stored in an n-bit conditional mask register. A second conditional instruction may then be received at the execution engine and the result may be copied from the conditional mask register to an n-bit wide, m-entry deep conditional stack.
To improve the performance of a processing system, a Single Instruction, Multiple Data (SIMD) instruction may be simultaneously executed for multiple operands of data in a single instruction period. For example, an eight-channel SIMD execution engine might simultaneously execute an instruction for eight 32-bit operands of data, each operand being mapped to a unique compute channel of the SIMD execution engine. In some cases, an instruction may be “conditional.” That is, an instruction or set of instructions might only be executed if a pre-determined condition is satisfied. Note that in the case of a SIMD execution engine, such a condition might be satisfied for some channels while not being satisfied for other channels.
BRIEF DESCRIPTION OF THE DRAWINGS
Some embodiments described herein are associated with a “processing system.” As used herein, the phrase “processing system” may refer to any device that processes data. A processing system may, for example, be associated with a graphics engine that processes graphics data and/or other types of media information. In some cases, the performance of a processing system may be improved with the use of a SIMD execution engine. For example, a SIMD execution engine might simultaneously execute a single floating point SIMD instruction for multiple channels of data (e.g., to accelerate the transformation and/or rendering of three-dimensional geometric shapes).
Note that some SIMD instructions may be conditional. Consider, for example, the following set of instructions:
Here, the first set of instructions will be executed when “condition 1” is true and the second set of instructions will be executed when “condition 1” is false. When such an instruction is simultaneously executed for multiple channels of data, however, different channels may produce different results. That is, the first set of instructions may need to be executed for some channels while the second set of instructions need to be executed for other channels.
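The per-channel divergence described above can be sketched as follows. This is only an illustration under stated assumptions: the four operand values, the condition (x > 0), and the helper name evaluate_condition are hypothetical and do not come from the embodiments.

```python
# Illustrative sketch only: operand values, the condition, and the helper
# name are assumptions chosen for this example.

def evaluate_condition(operands, predicate):
    """Evaluate one condition independently for each channel's operand."""
    return [1 if predicate(x) else 0 for x in operands]

# One operand per compute channel (four channels assumed).
operands = [5, -2, 7, -9]

# Channels for which "condition 1" (here: x > 0) is true need the first
# set of instructions.
if_channels = evaluate_condition(operands, lambda x: x > 0)

# The remaining channels need the second set of instructions.
else_channels = [1 - bit for bit in if_channels]
```

In this sketch, a single conditional instruction yields two complementary sets of channels, which is why a per-channel mask is useful.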
The engine 300 may receive and simultaneously execute instructions for four different channels of data (e.g., associated with four compute channels). Note that in some cases, fewer than four channels may be needed (e.g., when there are fewer than four valid operands). As a result, the conditional mask vector 310 may be initialized with an initialization vector indicating which channels have valid operands and which do not (e.g., operands i0 through i3, with a “1” indicating that the associated channel is currently enabled). The conditional mask vector 310 may then be used to avoid unnecessary processing (e.g., an instruction might be executed only for those operands in the conditional mask register 310 that are set to “1”). According to some embodiments, information in the conditional mask register 310 may be combined with information in other registers (e.g., via a Boolean AND operation) and the result may be stored in an overall execution mask register (which may then be used to avoid unnecessary or inappropriate processing).
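A minimal sketch of the mask initialization and the Boolean AND combination might look as follows. The function names and the contents of the second register (other_mask) are assumptions, since the description does not say which other registers are involved; leaving disabled channels unchanged is also an assumption of this sketch.

```python
# Illustrative sketch only: all names and the contents of `other_mask`
# are hypothetical.

def init_conditional_mask(valid_operands):
    """Build an initialization vector: 1 for each channel with a valid operand."""
    return [1 if valid else 0 for valid in valid_operands]

def execution_mask(conditional_mask, other_mask):
    """Combine two n-bit masks with a per-channel Boolean AND."""
    return [c & o for c, o in zip(conditional_mask, other_mask)]

def execute_masked(op, operands, mask):
    """Apply op only on channels whose mask bit is 1; other channels keep their value."""
    return [op(x) if bit else x for x, bit in zip(operands, mask)]

# Four channels, of which only i0 through i2 hold valid operands.
cond_mask = init_conditional_mask([True, True, True, False])
exec_mask = execution_mask(cond_mask, [1, 0, 1, 1])
doubled = execute_masked(lambda x: x * 2, [1, 2, 3, 4], exec_mask)
```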
When the engine 300 receives a conditional instruction (e.g., an “IF” statement), as illustrated in
When the engine 300 receives an indication that the end of instructions associated with a conditional instruction has been reached (e.g., an “END IF” statement), as illustrated in
According to some embodiments, one conditional instruction may be “nested” inside of a set of instructions associated with another conditional instruction. Consider, for example, the following set of instructions:
In this case, the first and third sets of instructions should be executed when “condition 1” is true and the second set of instructions should only be executed when both “condition 1” and “condition 2” are true.
When the engine 600 receives an indication that the end of instructions associated with the second conditional instruction has been reached (e.g., an “END IF” statement), as illustrated in
Note that the depth of the conditional stack 620 may be associated with the number of levels of conditional instruction nesting that are supported by the engine 600. According to some embodiments, the conditional stack 620 is only a single entry deep (e.g., the stack might actually be an n-operand wide register).
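The relationship between nesting levels and stack depth can be illustrated with a short sketch that computes how many stack entries a given instruction sequence would need. The representation of a program as a list of strings and the function name required_stack_depth are assumptions made for this example.

```python
# Illustrative sketch only: the program encoding and function name are
# hypothetical.

def required_stack_depth(program):
    """Return the deepest IF nesting level, i.e. the stack depth m needed."""
    depth = 0
    deepest = 0
    for instr in program:
        if instr == "IF":
            depth += 1                    # one mask is saved per open IF
            deepest = max(deepest, depth)
        elif instr == "END IF":
            if depth == 0:
                raise ValueError("END IF without a matching IF")
            depth -= 1                    # the saved mask is restored
    if depth != 0:
        raise ValueError("IF without a matching END IF")
    return deepest

nested = ["IF", "OP", "IF", "OP", "END IF", "OP", "END IF"]
flat = ["IF", "OP", "END IF", "IF", "OP", "END IF"]
nested_depth = required_stack_depth(nested)
flat_depth = required_stack_depth(flat)
```

Under these assumptions, a program without nested conditionals needs only a single entry, which is consistent with the single-entry embodiment mentioned above.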
At 1002, a conditional mask register is initialized. For example, an initialization vector might be stored in the conditional mask register based on channels that are currently enabled. According to another embodiment, the conditional mask register is simply initialized to all ones (e.g., it is assumed that all channels are always enabled).
The next SIMD instruction is retrieved at 1004. For example, a SIMD execution engine might receive an instruction from a memory unit. When the SIMD instruction is an “IF” instruction at 1006, a condition associated with the instruction is evaluated at 1008 in accordance with the conditional mask register. That is, the condition is evaluated for operands associated with channels that have a “1” in the conditional mask register. Note that in some cases, one or none of the channels might have a “1” in the conditional mask register.
At 1010, the data in the conditional mask register is transferred to the top of a conditional stack. For example, the current state of the conditional mask register may be saved to be later restored after the instructions associated with the “IF” instruction have been executed. The result of the evaluation is then stored in the conditional mask register at 1012, and the method continues at 1004 (e.g., the next SIMD instruction may be retrieved).
When the SIMD instruction was not an “IF” instruction at 1006, it is determined at 1014 whether or not the instruction is an “END IF” instruction. If not, the instruction is executed at 1018. For example, the instruction may be executed for multiple channels of data as indicated by the conditional mask register and the remaining values in the stack are moved up one position.
When it is determined that an “END IF” instruction has been encountered at 1014, the information at the top of the conditional stack is moved back into the conditional mask register at 1016.
In some cases, a conditional instruction will be associated with both (i) a first set of instructions to be executed when a condition is true and (ii) a second set of instructions to be executed when that condition is false (e.g., associated with an ELSE statement).
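The flow from 1002 through 1018 can be sketched as a small software model. This is a sketch under stated assumptions, not the hardware itself: the tuple encoding of instructions, the function name run, and the use of Python callables for conditions and operations are all hypothetical.

```python
# Illustrative software model only: the instruction encoding and all
# names are assumptions made for this sketch.

def run(program, data, enabled=None):
    """Model the IF / END IF flow with a conditional mask and a stack."""
    n = len(data)
    # 1002: initialize the mask from enabled channels, or to all ones.
    mask = list(enabled) if enabled is not None else [1] * n
    stack = []
    for instr in program:                      # 1004: retrieve next instruction
        kind = instr[0]
        if kind == "IF":                       # 1006
            predicate = instr[1]
            # 1008: evaluate only for channels with a 1 in the mask.
            result = [1 if bit and predicate(x) else 0
                      for x, bit in zip(data, mask)]
            stack.append(mask)                 # 1010: save the current mask
            mask = result                      # 1012: store the evaluation result
        elif kind == "END IF":                 # 1014
            mask = stack.pop()                 # 1016: restore the saved mask
        else:                                  # 1018: execute under the mask
            op = instr[1]
            data = [op(x) if bit else x for x, bit in zip(data, mask)]
    return data, mask

program = [
    ("IF", lambda x: x > 1),
    ("EXEC", lambda x: x * 10),
    ("IF", lambda x: x > 25),
    ("EXEC", lambda x: x + 1),
    ("END IF",),
    ("END IF",),
    ("EXEC", lambda x: x - 1),
]
final_data, final_mask = run(program, [1, 2, 3, 4])
```

In this model the nested condition is evaluated only on channels that are still enabled, so a channel disabled by the outer condition cannot be re-enabled by the inner one.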
As illustrated in
When the ELSE instruction is encountered as illustrated in
If any of the channels that were evaluated were true at 1406, a first set of instructions associated with the IF instruction may be executed at 1408 in accordance with the conditional mask register. Optionally, if none of the channels were true at 1406 these instructions may be skipped.
When an ELSE statement is encountered, the information in the conditional mask register may be combined with the information at the top of the conditional stack at 1410 via a per-channel Boolean operation such as NOT(conditional mask register) AND top-of-stack. A second set of instructions (e.g., associated with an ELSE instruction) may then be executed at 1414, and the conditional mask register may be restored from the conditional stack at 1416. Optionally, if none of the channels were true at 1412 these instructions may be skipped.
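The ELSE handling, including the optional skips, can be sketched as follows. The function names, the example data, and the two operations are assumptions; only the per-channel operation NOT(conditional mask register) AND top-of-stack is taken from the description.

```python
# Illustrative sketch only: names, data, and operations are hypothetical.

def else_mask(conditional_mask, top_of_stack):
    """Per-channel NOT(conditional mask) AND top-of-stack."""
    return [(1 - c) & t for c, t in zip(conditional_mask, top_of_stack)]

def run_if_else(data, enabled, predicate, then_op, else_op):
    """Model one IF / ELSE / END IF sequence for all channels."""
    stack = [list(enabled)]                    # save the mask before the IF
    mask = [1 if bit and predicate(x) else 0
            for x, bit in zip(data, enabled)]
    if any(mask):                              # 1406: skip if no channel is true
        data = [then_op(x) if bit else x       # 1408: first set of instructions
                for x, bit in zip(data, mask)]
    mask = else_mask(mask, stack[-1])          # 1410: combine with top of stack
    if any(mask):                              # 1412: skip if no channel is true
        data = [else_op(x) if bit else x       # 1414: second set of instructions
                for x, bit in zip(data, mask)]
    mask = stack.pop()                         # 1416: restore the saved mask
    return data, mask

result, restored = run_if_else(
    [4, -3, 8, -1], [1, 1, 1, 0],
    lambda x: x > 0, lambda x: x * 2, lambda x: -x)
```

The AND with the top of the stack is what keeps a channel that was disabled before the IF (the last channel in this example) from being enabled in the ELSE branch.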
The following illustrates various additional embodiments. These do not constitute a definition of all possible embodiments, and those skilled in the art will understand that many other embodiments are possible. Further, although the following embodiments are briefly described for clarity, those skilled in the art will understand how to make any changes, if necessary, to the above description to accommodate these and other embodiments and applications.
Although some embodiments have been described with respect to a separate conditional mask register and conditional stack, any embodiment might be associated with only a single conditional stack (e.g., and the current mask information might be associated with the top entry in the stack).
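One way to picture the single-stack variant is a stack whose top entry always holds the current mask. The class name MaskStack and its methods are hypothetical, and the AND of the condition bits with the current top entry is an assumption that mirrors evaluating a condition only for enabled channels.

```python
# Illustrative sketch only: the class and method names are hypothetical.

class MaskStack:
    """A single stack whose top entry serves as the current conditional mask."""

    def __init__(self, initial_mask):
        self.entries = [list(initial_mask)]

    @property
    def current(self):
        """The top entry acts as the conditional mask."""
        return self.entries[-1]

    def begin_if(self, condition_bits):
        """Push a new mask: the condition result limited to enabled channels."""
        self.entries.append(
            [c & b for c, b in zip(self.current, condition_bits)])

    def end_if(self):
        """Drop the top entry so the previous mask becomes current again."""
        if len(self.entries) == 1:
            raise ValueError("END IF without a matching IF")
        self.entries.pop()

masks = MaskStack([1, 1, 1, 0])
masks.begin_if([1, 0, 1, 1])
inside = list(masks.current)
masks.end_if()
outside = list(masks.current)
```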
Moreover, although different embodiments have been described, note that any combination of embodiments may be implemented (e.g., both an IF statement and an ELSE statement might include an address). Moreover, although examples have used “0” to indicate a channel that is not enabled, according to other embodiments a “1” might instead indicate that a channel is not currently enabled.
The several embodiments described herein are solely for the purpose of illustration. Persons skilled in the art will recognize from this description that other embodiments may be practiced with modifications and alterations limited only by the claims.
Claims
1. A method, comprising:
- receiving a first conditional instruction at an n-operand single instruction, multiple-data execution engine;
- evaluating the first conditional instruction based on multiple operands of associated data;
- storing the result of the evaluation in an n-bit conditional mask register;
- receiving a second conditional instruction at the execution engine; and
- copying the result from the conditional mask register to an n-bit wide, m-entry deep conditional stack.
2. The method of claim 1, further comprising:
- evaluating the second conditional instruction based on the data in the conditional mask register and multiple operands of associated data;
- storing the result of the evaluation of the second conditional instruction in the conditional mask register;
- executing instructions associated with the second conditional instruction in accordance with the data in the conditional mask register;
- moving the top of the conditional stack to the conditional mask register; and
- executing instructions associated with the first conditional instruction in accordance with the data in the conditional mask register.
3. The method of claim 1, wherein the first conditional instruction is associated with (i) a first set of instructions to be executed when a condition is true and (ii) a second set of instructions to be executed when the condition is false.
4. The method of claim 3, wherein the first conditional instruction includes an address associated with the second set of instructions, and further comprising:
- jumping to the address when said evaluating indicates that the first conditional instruction is not satisfied for any evaluated bit of associated data.
5. The method of claim 3, further comprising:
- executing the first set of instructions;
- combining the data in the conditional mask register with the data at the top of the conditional stack via a Boolean operation;
- storing the result of the combination in the conditional mask register; and
- executing the second set of instructions in accordance with the data in the conditional mask register.
6. The method of claim 1, wherein each of the n-operands of associated data is associated with a channel, and further comprising prior to receiving the first conditional instruction:
- initializing the conditional mask register based on channels to be enabled for execution.
7. The method of claim 1, wherein the conditional stack is more than one entry deep.
8. An apparatus, comprising:
- an n-bit conditional mask vector, wherein the conditional mask vector is to store results of evaluations of: (i) an “if” instruction condition and (ii) data associated with multiple channels; and
- an n-bit wide, m-entry deep conditional stack to store the information that existed in the conditional mask vector prior to the results of the evaluations.
9. The apparatus of claim 8, wherein the information is to be transferred from the conditional stack to the conditional mask vector when an associated “end if” instruction is executed.
10. The apparatus of claim 8, wherein the “if” instruction is associated with (i) a first set of instructions to be executed on operands associated with a true condition and (ii) a second set of instructions to be executed on operands associated with a false condition.
11. The apparatus of claim 10, wherein the “if” instruction includes an address associated with the second set of instructions, and that address is stored in a program counter when results are false for every channel.
12. The apparatus of claim 10, further comprising an engine to: (i) execute the first set of instructions, (ii) combine the information in the conditional mask vector with the information at the top of the conditional stack, (iii) store the result of the combination in the conditional mask vector, and (iv) execute the second set of instructions.
13. The apparatus of claim 8, wherein the conditional mask vector is to be initialized in accordance with enabled channels.
14. The apparatus of claim 8, wherein the conditional stack is 1-entry deep.
15. An article, comprising:
- a storage medium having stored thereon instructions that when executed by a machine result in the following: receiving a first conditional statement at an n-channel single instruction, multiple-data execution engine, simultaneously evaluating the first conditional statement for multiple channels of associated data, storing the result of the evaluation in an n-bit conditional mask register, receiving at the execution engine a second conditional statement, and copying the result from the conditional mask register to an n-bit wide, m-entry deep conditional stack.
16. The article of claim 15, wherein the first conditional statement: (i) is associated with a first set of statements to be executed when a condition is true, (ii) is associated with a second set of statements to be executed when the condition is false, and (iii) includes an address associated with the second set of statements, and said method further comprises:
- jumping to the address when said evaluating indicates that the first conditional statement is not true for any of the n-channels of associated data.
17. The article of claim 16, wherein said method further comprises:
- evaluating the second conditional statement based on the data in the conditional mask register and n-channels of associated data,
- storing the result of the evaluation of the second conditional statement in the conditional mask register,
- executing statements associated with the second conditional statement in accordance with the data in the conditional mask register,
- transferring the top of the conditional stack to the conditional mask register; and
- executing statements associated with the first conditional statement in accordance with the data in the conditional mask register.
18. A system, comprising:
- a processor, including: an n-bit conditional mask vector, wherein the conditional mask vector is to store a result of an evaluation of: (i) a first “if” condition and (ii) data associated with a plurality of channels, and an n-bit wide, m-entry deep conditional stack to store the result when a second “if” instruction is encountered; and
- a graphics memory unit.
19. The system of claim 18, wherein the result is to be transferred from the conditional stack to the conditional mask vector when an “end if” instruction associated with the second “if” instruction is executed.
20. The system of claim 18, further comprising an instruction memory unit.
Type: Application
Filed: Jun 29, 2004
Publication Date: Dec 29, 2005
Inventors: Michael Dwyer (El Dorado Hills, CA), Hong Jiang (San Jose, CA), Thomas Piazza (Granite Bay, CA)
Application Number: 10/879,460