Abstract: A superscalar processor includes an execution unit that executes load/store instructions and an execution unit that executes arithmetic instructions. The execution pipelines for both execution units include a decode stage, a read stage that identifies and reads source operands for the instructions, and one or more execution stages performed in the execution units. For store instructions, reading store data from a register file is deferred until the store data is required for transfer to a memory system. This allows store instructions to be decoded simultaneously with earlier instructions that generate the store data. A simple antidependency interlock uses a list of the register numbers that identify the registers holding store data for pending store instructions.
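
The antidependency interlock described above can be illustrated with a minimal software sketch. The abstract states only that the interlock uses a list of register numbers for pending store data. The stall policy shown here (a later instruction that would overwrite a listed register waits until the store has read its data) is an assumption, and the class and method names are hypothetical, chosen for illustration only.

```python
class StoreDataInterlock:
    """Illustrative model of a list-based antidependency interlock.

    Assumption: an instruction whose destination register appears in the
    pending list must wait, because a pending store has not yet read its
    data from that register.
    """

    def __init__(self):
        # Register numbers holding store data for pending store instructions.
        self._pending = []

    def issue_store(self, data_reg):
        # The store is issued without reading its data; record the register.
        self._pending.append(data_reg)

    def store_data_read(self, data_reg):
        # The store has read its data for transfer to memory; drop one entry.
        self._pending.remove(data_reg)

    def must_stall(self, dest_reg):
        # True if writing dest_reg would clobber data a pending store needs.
        return dest_reg in self._pending


interlock = StoreDataInterlock()
interlock.issue_store(5)             # store whose data lives in register 5
blocked = interlock.must_stall(5)    # a later write to register 5 must wait
free = interlock.must_stall(3)       # register 3 is not in the list
interlock.store_data_read(5)         # store data finally read
released = interlock.must_stall(5)   # register 5 may now be overwritten
```

A list rather than a set is used in this sketch so that two pending stores sharing the same data register are tracked separately; the register stays protected until both have read it.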