Patents by Inventor Arunachalam Annamalai

Arunachalam Annamalai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Using loop exit prediction to accelerate or suppress loop mode of a processor

Patent number: 11256505

Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.

Type: Grant

Filed: February 5, 2021

Date of Patent: February 22, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis
Branch target buffer with early return prediction

Patent number: 11055098

Abstract: A processor includes a branch target buffer (BTB) having a plurality of entries whereby each entry corresponds to an associated instruction pointer value that is predicted to be a branch instruction. Each BTB entry stores a predicted branch target address for the branch instruction, and further stores information indicating whether the next branch in the block of instructions associated with the predicted branch target address is predicted to be a return instruction. In response to the BTB indicating that the next branch is predicted to be a return instruction, the processor initiates an access to a return stack that stores the return address for the predicted return instruction. By initiating access to the return stack responsive to the return prediction stored at the BTB, the processor reduces the delay in identifying the return address, thereby improving processing efficiency.

Type: Grant

Filed: July 24, 2018

Date of Patent: July 6, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Aparna Thyagarajan, Marius Evers, Arunachalam Annamalai
USING LOOP EXIT PREDICTION TO ACCELERATE OR SUPPRESS LOOP MODE OF A PROCESSOR

Publication number: 20210191722

Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.

Type: Application

Filed: February 5, 2021

Publication date: June 24, 2021

Inventors: Arunachalam ANNAMALAI, Marius EVERS, Aparna THYAGARAJAN, Anthony JARVIS
Using loop exit prediction to accelerate or suppress loop mode of a processor

Patent number: 10915322

Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.

Type: Grant

Filed: September 18, 2018

Date of Patent: February 9, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis
Low latency synchronization for operation cache and instruction cache fetching and decoding instructions

Patent number: 10896044

Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.

Type: Grant

Filed: June 21, 2018

Date of Patent: January 19, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams
USING LOOP EXIT PREDICTION TO ACCELERATE OR SUPPRESS LOOP MODE OF A PROCESSOR

Publication number: 20200089498

Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.

Type: Application

Filed: September 18, 2018

Publication date: March 19, 2020

Inventors: Arunachalam ANNAMALAI, Marius EVERS, Aparna THYAGARAJAN
BRANCH TARGET BUFFER WITH EARLY RETURN PREDICTION

Publication number: 20200034151

Abstract: A processor includes a branch target buffer (BTB) having a plurality of entries whereby each entry corresponds to an associated instruction pointer value that is predicted to be a branch instruction. Each BTB entry stores a predicted branch target address for the branch instruction, and further stores information indicating whether the next branch in the block of instructions associated with the predicted branch target address is predicted to be a return instruction. In response to the BTB indicating that the next branch is predicted to be a return instruction, the processor initiates an access to a return stack that stores the return address for the predicted return instruction. By initiating access to the return stack responsive to the return prediction stored at the BTB, the processor reduces the delay in identifying the return address, thereby improving processing efficiency.

Type: Application

Filed: July 24, 2018

Publication date: January 30, 2020

Inventors: Aparna THYAGARAJAN, Marius EVERS, Arunachalam ANNAMALAI
LOW LATENCY SYNCHRONIZATION FOR OPERATION CACHE AND INSTRUCTION CACHE FETCHING AND DECODING INSTRUCTIONS

Publication number: 20190391813

Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.

Type: Application

Filed: June 21, 2018

Publication date: December 26, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams