Abstract: A processor includes a branch execution unit that detects a dual basic-block loop, a loop type in which a second basic block jumps to the start address of a first basic block. The dual basic-block loop includes a predicted loop count that is written to an entry of a branch target buffer (BTB). Based on the loop prediction from the BTB, the two basic blocks of the loop form a loop buffer in an instruction queue of the processor, which seamlessly sends loop instructions from a plurality of iterations to the next pipeline stage.
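The detection condition and the loop-buffer replay described in this abstract can be illustrated with a minimal software sketch. It is not the claimed hardware: the `BasicBlock` fields, the function names, and the modeling of the BTB's predicted loop count as a plain integer are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BasicBlock:
    # Hypothetical model of a basic block; field names are assumptions.
    start: int               # address of the block's first instruction
    instructions: List[str]  # instructions held in the instruction queue
    target: int              # address control transfers to when the block ends


def is_dual_block_loop(first: BasicBlock, second: BasicBlock) -> bool:
    """True when `first` leads into `second` and `second` jumps back to the start of `first`."""
    return first.target == second.start and second.target == first.start


def replay_loop(first: BasicBlock, second: BasicBlock, predicted_count: int) -> List[str]:
    """Emit the instructions of both blocks once per predicted iteration,
    with no refetch between iterations (the loop-buffer behavior)."""
    stream: List[str] = []
    for _ in range(predicted_count):
        stream.extend(first.instructions)
        stream.extend(second.instructions)
    return stream


# Example: two blocks forming a loop, replayed for a predicted count of 2.
block_a = BasicBlock(start=0x100, instructions=["i0", "i1"], target=0x200)
block_b = BasicBlock(start=0x200, instructions=["i2"], target=0x100)
if is_dual_block_loop(block_a, block_b):
    issued = replay_loop(block_a, block_b, predicted_count=2)
```

In this sketch the two blocks stay resident in the queue, so successive iterations are produced back to back without consulting the fetch path again.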
Abstract: A processor includes a time counter and issues and executes instructions at a future time that is based on the time counter. The execution times are based on the fixed latency times of the instructions, with the exception of the load instruction, whose execution time is based on the data cache hit latency time. A data cache miss causes the load instruction to fetch data from the level 2 cache, and a time tracker unit adjusts the level 2 cache latency time based on a counter.
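The scheduling idea in this abstract can be sketched as a small model. The latency values, the names `result_time` and `TimeTracker`, and in particular the rule that the tracker replaces its level 2 latency estimate with the number of cycles counted during a miss are assumptions chosen for illustration. The abstract only states that the adjustment is based on a counter.

```python
from typing import Optional

# Assumed fixed latencies in cycles; the abstract gives no concrete values.
FIXED_LATENCY = {"add": 1, "mul": 3, "div": 12}
DCACHE_HIT_LATENCY = 2  # assumed data cache hit latency in cycles


def result_time(op: str, issue_time: int, l2_latency: Optional[int] = None) -> int:
    """Future time-counter value at which the result of `op` is available.

    Loads use the data cache hit latency unless an L2 latency is supplied
    (modeling a data cache miss); all other ops use a fixed latency.
    """
    if op == "load":
        latency = DCACHE_HIT_LATENCY if l2_latency is None else l2_latency
    else:
        latency = FIXED_LATENCY[op]
    return issue_time + latency


class TimeTracker:
    """Hypothetical tracker: counts cycles during an L2 access and
    adopts the counted value as the new L2 latency estimate."""

    def __init__(self, l2_latency: int) -> None:
        self.l2_latency = l2_latency
        self.counter = 0

    def start_miss(self) -> None:
        self.counter = 0

    def tick(self) -> None:
        self.counter += 1

    def data_returned(self) -> int:
        self.l2_latency = self.counter
        return self.l2_latency


# Example: a miss that takes 11 cycles updates the estimate from 8 to 11.
tracker = TimeTracker(l2_latency=8)
tracker.start_miss()
for _ in range(11):
    tracker.tick()
tracker.data_returned()
miss_load_done = result_time("load", issue_time=10, l2_latency=tracker.l2_latency)
```

Under these assumptions a load that hits issues at time 10 and completes at time 12, while a load that misses after the adjustment completes at time 21.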