Patents by Inventor Marius Evers

Marius Evers has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

RETAINING CACHE ENTRIES OF A PROCESSOR CORE DURING A POWERED-DOWN STATE

Publication number: 20210056031

Abstract: A processor core associated with a first cache initiates entry into a powered-down state. In response, information representing a set of entries of the first cache are stored in a retention region that receives a retention voltage while the processor core is in a powered-down state. Information indicating one or more invalidated entries of the set of entries is also stored in the retention region. In response to the processor core initiating exit from the powered-down state, entries of the first cache are restored using the stored information representing the entries and the stored information indicating the at least one invalidated entry.

Type: Application

Filed: November 6, 2020

Publication date: February 25, 2021

Inventors: William L. WALKER, Michael L. GOLDEN, Marius EVERS
Selective use of taint protection during speculative execution

Patent number: 10929141

Abstract: A state of a first architectural register in a processing system is changed from a first state to a second state that indicates that the first architectural register is to be monitored during speculative execution. A second architectural register in the processing system is associated with a third state in response to the first architectural register being a source register for a memory load instruction that loads data from a memory into the second architectural register during speculative execution. Use of data in the second architectural register is constrained during speculative operations while the second architectural register is in the third state. In some cases, a “set taint” instruction is executed to change the state of the first architectural register from the first state to the second state.

Type: Grant

Filed: March 5, 2019

Date of Patent: February 23, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: David Kaplan, Marius Evers
Using loop exit prediction to accelerate or suppress loop mode of a processor

Patent number: 10915322

Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.

Type: Grant

Filed: September 18, 2018

Date of Patent: February 9, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis
SELECTIVELY PERFORMING AHEAD BRANCH PREDICTION BASED ON TYPES OF BRANCH INSTRUCTIONS

Publication number: 20210034370

Abstract: A set of entries in a branch prediction structure for a set of second blocks are accessed based on a first address of a first block. The set of second blocks correspond to outcomes of one or more first branch instructions in the first block. Speculative prediction of outcomes of second branch instructions in the second blocks is initiated based on the entries in the branch prediction structure. State associated with the speculative prediction is selectively flushed based on types of the branch instructions. In some cases, the branch predictor can be accessed using an address of a previous block or a current block. State associated with the speculative prediction is selectively flushed from the ahead branch prediction, and prediction of outcomes of branch instructions in one of the second blocks is selectively initiated using non-ahead accessing, based on the types of the one or more branch instructions.

Type: Application

Filed: July 31, 2020

Publication date: February 4, 2021

Inventors: Marius EVERS, Aparna THYAGARAJAN, Ashok T. VENKATACHAR
Low latency synchronization for operation cache and instruction cache fetching and decoding instructions

Patent number: 10896044

Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.

Type: Grant

Filed: June 21, 2018

Date of Patent: January 19, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams
Using return address predictor to speed up control stack return address verification

Patent number: 10768937

Abstract: Overhead associated with verifying function return addresses to protect against security exploits is reduced by taking advantage of branch prediction mechanisms for predicting return addresses. More specifically, returning from a function includes popping a return address from a data stack. Well-known security exploits overwrite the return address on the data stack to hijack control flow. In some processors, a separate data structure referred to as a control stack is used to verify the data stack. When a return instruction is executed, the processor issues an exception if the return addresses on the control stack and the data stack are not identical. This overhead can be avoided by taking advantage of the return address stack, which is a data structure used by the branch predictor to predict return addresses. In most situations, if this prediction is correct, the above check does not need to occur, thus reducing the associated overhead.

Type: Grant

Filed: July 26, 2018

Date of Patent: September 8, 2020

Assignee: Advanced Micro Devices, Inc.

Inventors: Marius Evers, David A. Kaplan, Debjit Das Sarma
PROCESSOR WITH ACCELERATED LOCK INSTRUCTION OPERATION

Publication number: 20200272463

Abstract: A processor and method for handling lock instructions identifies which of a plurality of older store instructions relative to a current lock instruction are able to be locked. The method and processor lock the identified older store instructions as an atomic group with the current lock instruction. The method and processor negatively acknowledge probes until all of the older store instructions in the atomic group have written to cache memory. In some implementations, an atomic grouping unit issues an indication to lock identified older store instructions that are retired and lockable, and in some implementations, also issues an indication to lock older stores that are determined to be lockable that are non-retired.

Type: Application

Filed: February 27, 2019

Publication date: August 27, 2020

Inventors: Scott Thomas Bingham, Marius Evers, Krishnan V. Ramani, Thomas Kunjan
Selectively performing ahead branch prediction based on types of branch instructions

Patent number: 10732979

Abstract: A set of entries in a branch prediction structure for a set of second blocks are accessed based on a first address of a first block. The set of second blocks correspond to outcomes of one or more first branch instructions in the first block. Speculative prediction of outcomes of second branch instructions in the second blocks is initiated based on the entries in the branch prediction structure. State associated with the speculative prediction is selectively flushed based on types of the branch instructions. In some cases, the branch predictor can be accessed using an address of a previous block or a current block. State associated with the speculative prediction is selectively flushed from the ahead branch prediction, and prediction of outcomes of branch instructions in one of the second blocks is selectively initiated using non-ahead accessing, based on the types of the one or more branch instructions.

Type: Grant

Filed: June 18, 2018

Date of Patent: August 4, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Marius Evers, Aparna Thyagarajan, Ashok T. Venkatachar
Stride prefetching across memory pages

Patent number: 10671535

Abstract: A prefetcher maintains the state of stored prefetch information, such as a prefetch confidence level, when a prefetch would cross a memory page boundary. The maintained prefetch information can be used both to identify whether the stride pattern for a particular sequence of demand requests persists after the memory page boundary has been crossed, and to continue to issue prefetch requests according to the identified pattern. The prefetcher therefore does not have re-identify a stride pattern each time a page boundary is crossed by a sequence of demand requests, thereby improving the efficiency and accuracy of the prefetcher.

Type: Grant

Filed: July 17, 2013

Date of Patent: June 2, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
USING LOOP EXIT PREDICTION TO ACCELERATE OR SUPPRESS LOOP MODE OF A PROCESSOR

Publication number: 20200089498

Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.

Type: Application

Filed: September 18, 2018

Publication date: March 19, 2020

Inventors: Arunachalam ANNAMALAI, Marius EVERS, Aparna THYAGARAJAN
BRANCH TARGET BUFFER WITH EARLY RETURN PREDICTION

Publication number: 20200034151

Abstract: A processor includes a branch target buffer (BTB) having a plurality of entries whereby each entry corresponds to an associated instruction pointer value that is predicted to be a branch instruction. Each BTB entry stores a predicted branch target address for the branch instruction, and further stores information indicating whether the next branch in the block of instructions associated with the predicted branch target address is predicted to be a return instruction. In response to the BTB indicating that the next branch is predicted to be a return instruction, the processor initiates an access to a return stack that stores the return address for the predicted return instruction. By initiating access to the return stack responsive to the return prediction stored at the BTB, the processor reduces the delay in identifying the return address, thereby improving processing efficiency.

Type: Application

Filed: July 24, 2018

Publication date: January 30, 2020

Inventors: Aparna THYAGARAJAN, Marius EVERS, Arunachalam ANNAMALAI
USING RETURN ADDRESS PREDICTOR TO SPEED UP CONTROL STACK RETURN ADDRESS VERIFICATION

Publication number: 20200034144

Abstract: Overhead associated with verifying function return addresses to protect against security exploits is reduced by taking advantage of branch prediction mechanisms for predicting return addresses. More specifically, returning from a function includes popping a return address from a data stack. Well-known security exploits overwrite the return address on the data stack to hijack control flow. In some processors, a separate data structure referred to as a control stack is used to verify the data stack. When a return instruction is executed, the processor issues an exception if the return addresses on the control stack and the data stack are not identical. This overhead can be avoided by taking advantage of the return address stack, which is a data structure used by the branch predictor to predict return addresses. In most situations, if this prediction is correct, the above check does not need to occur, thus reducing the associated overhead.

Type: Application

Filed: July 26, 2018

Publication date: January 30, 2020

Applicant: Advanced Micro Devices, Inc.

Inventors: Marius Evers, David A. Kaplan, Debjit Das Sarma
LOW LATENCY SYNCHRONIZATION FOR OPERATION CACHE AND INSTRUCTION CACHE FETCHING AND DECODING INSTRUCTIONS

Publication number: 20190391813

Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.

Type: Application

Filed: June 21, 2018

Publication date: December 26, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams
SELECTIVELY PERFORMING AHEAD BRANCH PREDICTION BASED ON TYPES OF BRANCH INSTRUCTIONS

Publication number: 20190384612

Abstract: A set of entries in a branch prediction structure for a set of second blocks are accessed based on a first address of a first block. The set of second blocks correspond to outcomes of one or more first branch instructions in the first block. Speculative prediction of outcomes of second branch instructions in the second blocks is initiated based on the entries in the branch prediction structure. State associated with the speculative prediction is selectively flushed based on types of the branch instructions. In some cases, the branch predictor can be accessed using an address of a previous block or a current block. State associated with the speculative prediction is selectively flushed from the ahead branch prediction, and prediction of outcomes of branch instructions in one of the second blocks is selectively initiated using non-ahead accessing, based on the types of the one or more branch instructions.

Type: Application

Filed: June 18, 2018

Publication date: December 19, 2019

Inventors: Marius EVERS, Aparna THYAGARAJAN, Ashok T. VENKATACHAR
STORING INCIDENTAL BRANCH PREDICTIONS TO REDUCE LATENCY OF MISPREDICTION RECOVERY

Publication number: 20190369999

Abstract: A branch predictor predicts a first outcome of a first branch in a first block of instructions. Fetch logic fetches instructions for speculative execution along a first path indicated by the first outcome. Information representing a remainder of the first block is stored in response to the first predicted outcome being taken. In response to the first branch instruction being not taken, the branch predictor is restarted based on the remainder block. In some cases, entries corresponding to second blocks along speculative paths from the first block are accessed using an address of the first block as an index into a branch prediction structure. Outcomes of branch instructions in the second blocks are concurrently predicted using a corresponding set of instances of branch conditional logic and the predicted outcomes are used in combination with the remainder block to restart the branch predictor in response to mispredictions.

Type: Application

Filed: June 4, 2018

Publication date: December 5, 2019

Inventors: Marius EVERS, Douglas WILLIAMS, Ashok T. VENKATACHAR, Sudherssen KALAISELVAN
TRACKING STORES AND LOADS BY BYPASSING LOAD STORE UNITS

Publication number: 20190310845

Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. Store-load pairs which have a strong history of store-to-load forwarding are identified. Once identified, the load is memory renamed to the register stored by the store. The memory dependency predictor may also be used to detect loads that are dependent on a store but cannot be renamed. In such a configuration, the dependence is signaled to the load store unit and the load store unit uses the information to issue the load after the identified store has its physical address.

Type: Application

Filed: June 24, 2019

Publication date: October 10, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Krishnan V. Ramani, Kai Troester, Frank C. Galloway, David N. Suggs, Michael D. Achenbach, Betty Ann McDaniel, Marius Evers
RETAINING CACHE ENTRIES OF A PROCESSOR CORE DURING A POWERED-DOWN STATE

Publication number: 20190129853

Abstract: A processor core associated with a first cache initiates entry into a powered-down state. In response, information representing a set of entries of the first cache are stored in a retention region that receives a retention voltage while the processor core is in a powered-down state. Information indicating one or more invalidated entries of the set of entries is also stored in the retention region. In response to the processor core initiating exit from the powered-down state, entries of the first cache are restored using the stored information representing the entries and the stored information indicating the at least one invalidated entry.

Type: Application

Filed: November 1, 2017

Publication date: May 2, 2019

Inventors: William L. WALKER, Michael L. GOLDEN, Marius EVERS
Bandwidth increase in branch prediction unit and level 1 instruction cache

Patent number: 10127044

Abstract: A processor, a device, and a non-transitory computer readable medium for performing branch prediction in a processor are presented. The processor includes a front end unit. The front end unit includes a level 1 branch target buffer (BTB), a BTB index predictor (BIP), and a level 1 hash perceptron (HP). The BTB is configured to predict a target address. The BIP is configured to generate a prediction based on a program counter and a global history, wherein the prediction includes a speculative partial target address, a global history value, a global history shift value, and a way prediction. The HP is configured to predict whether a branch instruction is taken or not taken.

Type: Grant

Filed: October 24, 2014

Date of Patent: November 13, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: Douglas Williams, Sahil Arora, Nikhil Gupta, Wei-Yu Chen, Debjit Das Sarma, Marius Evers
Method and apparatus for performing a bus lock and translation lookaside buffer invalidation

Patent number: 9916243

Abstract: A method and apparatus for performing a bus lock and a translation lookaside buffer invalidate transaction includes receiving, by a lock master, a lock request from a first processor in a system. The lock master sends a quiesce request to all processors in the system, and upon receipt of the quiesce request from the lock master, all processors cease issuing any new transactions and issue a quiesce granted transaction. Upon receipt of the quiesce granted transactions from all processors, the lock master issues a lock granted message that includes an identifier of the first processor. The first processor performs an atomic transaction sequence and sends a first lock release message to the lock master upon completion of the atomic transaction sequence. The lock master sends a second lock release message to all processors upon receiving the first lock release message from the first processor.

Type: Grant

Filed: October 23, 2014

Date of Patent: March 13, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: William L. Walker, Paul J. Moyer, Richard M. Born, Eric Morton, David Christie, Marius Evers, Scott T. Bingham
Resource management for northbridge using tokens

Patent number: 9697146

Abstract: A processor uses a token scheme to govern the maximum number of memory access requests each of a set of processor cores can have pending at a northbridge of the processor. To implement the scheme, the northbridge issues a minimum number of tokens to each of the processor cores and keeps a number of tokens in reserve. In response to determining that a given processor core is generating a high level of memory access activity the northbridge issues some of the reserve tokens to the processor core. The processor core returns the reserve tokens to the northbridge in response to determining that it is not likely to continue to generate the high number of memory access requests, so that the reserve tokens are available to issue to another processor core.

Type: Grant

Filed: December 27, 2012

Date of Patent: July 4, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Douglas R. Williams, Vydhyanathan Kalyanasundharam, Marius Evers, Michael K. Fertig

prev 1 2 3 next