Patents by Inventor Marius Evers
Marius Evers has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9529720Abstract: The present application describes embodiments of techniques for picking a data array lookup request for execution in a data array pipeline a variable number of cycles behind a corresponding tag array lookup request that is concurrently executing in a tag array pipeline. Some embodiments of a method for picking the data array lookup request include picking the data array lookup request for execution in a data array pipeline of a cache concurrently with execution of a tag array lookup request in a tag array pipeline of the cache. The data array lookup request is picked for execution in response to resources of the data array pipeline becoming available after picking the tag array lookup request for execution. Some embodiments of the method may be implemented in a cache.Type: GrantFiled: June 7, 2013Date of Patent: December 27, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Marius Evers, John Kalamatianos, Carl D. Dietz, Richard E. Klass, Ravindra N. Bhargava
-
Patent number: 9058277Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for and a method includes selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing the performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing a performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.Type: GrantFiled: November 8, 2012Date of Patent: June 16, 2015Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Sharad Dilip Bade, Alok Garg, John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
-
Publication number: 20150120976Abstract: A method and apparatus for performing a bus lock and a translation lookaside buffer invalidate transaction includes receiving, by a lock master, a lock request from a first processor in a system. The lock master sends a quiesce request to all processors in the system, and upon receipt of the quiesce request from the lock master, all processors cease issuing any new transactions and issue a quiesce granted transaction. Upon receipt of the quiesce granted transactions from all processors, the lock master issues a lock granted message that includes an identifier of the first processor. The first processor performs an atomic transaction sequence and sends a first lock release message to the lock master upon completion of the atomic transaction sequence. The lock master sends a second lock release message to all processors upon receiving the first lock release message from the first processor.Type: ApplicationFiled: October 23, 2014Publication date: April 30, 2015Applicant: ADVANCED MICRO DEVICES, INC.Inventors: William L. Walker, Paul J. Moyer, Richard M. Born, Eric Morton, David Christie, Marius Evers, Scott T. Bingham
-
Publication number: 20150121046Abstract: The present invention provides a method and apparatus for supporting embodiments of an out-of-order load to load queue structure. One embodiment of the apparatus includes a load queue for storing memory operations adapted to be executed out-of-order with respect to other memory operations. The apparatus also includes a load order queue for cacheable operations that ordered for a particular address.Type: ApplicationFiled: October 24, 2014Publication date: April 30, 2015Applicant: Advanced Micro Devices, Inc.Inventors: Thomas Kunjan, Scott T. Bingham, Marius Evers, James D. Williams
-
Publication number: 20150121050Abstract: A processor, a device, and a non-transitory computer readable medium for performing branch prediction in a processor are presented. The processor includes a front end unit. The front end unit includes a level 1 branch target buffer (BTB), a BTB index predictor (BIP), and a level 1 hash perceptron (HP). The BTB is configured to predict a target address. The BIP is configured to generate a prediction based on a program counter and a global history, wherein the prediction includes a speculative partial target address, a global history value, a global history shift value, and a way prediction. The HP is configured to predict whether a branch instruction is taken or not taken.Type: ApplicationFiled: October 24, 2014Publication date: April 30, 2015Applicant: ADVANCED MICRO DEVICES, INC.Inventors: Douglas Williams, Sahil Arora, Nikhil Gupta, Wei-Yu Chen, Debjit Das Sarma, Marius Evers
-
Publication number: 20150026414Abstract: A prefetcher maintains the state of stored prefetch information, such as a prefetch confidence level, when a prefetch would cross a memory page boundary. The maintained prefetch information can be used both to identify whether the stride pattern for a particular sequence of demand requests persists after the memory page boundary has been crossed, and to continue to issue prefetch requests according to the identified pattern. The prefetcher therefore does not have re-identify a stride pattern each time a page boundary is crossed by a sequence of demand requests, thereby improving the efficiency and accuracy of the prefetcher.Type: ApplicationFiled: July 17, 2013Publication date: January 22, 2015Applicant: Advanced Micro Devices, Inc.Inventors: John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
-
Publication number: 20140365729Abstract: The present application describes embodiments of techniques for picking a data array lookup request for execution in a data array pipeline a variable number of cycles behind a corresponding tag array lookup request that is concurrently executing in a tag array pipeline. Some embodiments of a method for picking the data array lookup request include picking the data array lookup request for execution in a data array pipeline of a cache concurrently with execution of a tag array lookup request in a tag array pipeline of the cache. The data array lookup request is picked for execution in response to resources of the data array pipeline becoming available after picking the tag array lookup request for execution. Some embodiments of the method may be implemented in a cache.Type: ApplicationFiled: June 7, 2013Publication date: December 11, 2014Inventors: Marius Evers, John Kalamatianos, Carl D. Dietz, Richard E. Klass, Ravindra N. Bhargava
-
Publication number: 20140189700Abstract: A processor uses a token scheme to govern the maximum number of memory access requests each of a set of processor cores can have pending at a northbridge of the processor. To implement the scheme, the northbridge issues a minimum number of tokens to each of the processor cores and keeps a number of tokens in reserve. In response to determining that a given processor core is generating a high level of memory access activity the northbridge issues some of the reserve tokens to the processor core. The processor core returns the reserve tokens to the northbridge in response to determining that it is not likely to continue to generate the high number of memory access requests, so that the reserve tokens are available to issue to another processor core.Type: ApplicationFiled: December 27, 2012Publication date: July 3, 2014Applicant: Advanced Micro Devices, Inc.Inventors: Douglas R. Williams, Vydhyanathan Kalyanasundharam, Marius Evers, Michael K. Fertig
-
Publication number: 20140129780Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for and a method includes selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing the performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing a performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.Type: ApplicationFiled: November 8, 2012Publication date: May 8, 2014Applicant: ADVANCED MICRO DEVICES, INC.Inventors: Sharad Dilip Bade, Alok Garg, John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
-
Publication number: 20140108740Abstract: A processing system monitors memory bandwidth available to transfer data from memory to a cache. In addition, the processing system monitors a prefetching accuracy for prefetched data. If the amount of available memory bandwidth is low and the prefetching accuracy is also low, prefetching can be throttled by reducing the amount of data prefetched. The prefetching can be throttled by changing the frequency of prefetching, prefetching depth, prefetching confidence levels, and the like.Type: ApplicationFiled: October 17, 2012Publication date: April 17, 2014Applicant: Advanced Micro Devices, Inc.Inventors: Todd Rafacz, Marius Evers, Chitresh Narasimhaiah
-
Patent number: 8086825Abstract: One or more processor cores of a multiple-core processing device each can utilize a processing pipeline having a plurality of execution units (e.g., integer execution units or floating point units) that together share a pre-execution front-end having instruction fetch, decode and dispatch resources. Further, one or more of the processor cores each can implement dispatch resources configured to dispatch multiple instructions in parallel to multiple corresponding execution units via separate dispatch buses. The dispatch resources further can opportunistically decode and dispatch instruction operations from multiple threads in parallel so as to increase the dispatch bandwidth. Moreover, some or all of the stages of the processing pipelines of one or more of the processor cores can be configured to implement independent thread selection for the corresponding stage.Type: GrantFiled: December 31, 2007Date of Patent: December 27, 2011Assignee: Advanced Micro Devices, Inc.Inventors: Gene Shen, Sean Lie, Marius Evers
-
Patent number: 7818592Abstract: A token-based power control mechanism for an apparatus including a power controller and a plurality of processing devices. The power controller may detect a power budget allotted for the apparatus. The power controller may convert the allotted power budget into a plurality of power tokens, each power token being a portion of the allotted power budget. The power controller may then assign one or more of the plurality of power tokens to each of the processing devices. The assigned power tokens may determine the power allotted for each of the processing devices. The power controller may receive one or more requests from the plurality of processing devices for one or more additional power tokens. In response to receiving the requests, the power controller may determine whether to change the distribution of power tokens among the processing devices.Type: GrantFiled: April 18, 2007Date of Patent: October 19, 2010Assignee: Globalfoundries Inc.Inventors: Stephan Meier, Marius Evers
-
Patent number: 7702888Abstract: An apparatus for executing branch predictor directed prefetch operations. During operation, a branch prediction unit may provide an address of a first instruction to the fetch unit. The fetch unit may send a fetch request for the first instruction to the instruction cache to perform a fetch operation. In response to detecting a cache miss corresponding to the first instruction, the fetch unit may execute one or more prefetch operation while the cache miss corresponding to the first instruction is being serviced. The branch prediction unit may provide an address of a predicted next instruction in the instruction stream to the fetch unit. The fetch unit may send a prefetch request for the predicted next instruction to the instruction cache to execute the prefetch operation. The fetch unit may store prefetched instruction data obtained from a next level of memory in the instruction cache or in a prefetch buffer.Type: GrantFiled: February 28, 2007Date of Patent: April 20, 2010Assignee: GlobalFoundries Inc.Inventors: Marius Evers, Trivikram Krishnamurthy
-
Publication number: 20090172362Abstract: One or more processor cores of a multiple-core processing device each can utilize a processing pipeline having a plurality of execution units (e.g., integer execution units or floating point units) that together share a pre-execution front-end having instruction fetch, decode and dispatch resources. Further, one or more of the processor cores each can implement dispatch resources configured to dispatch multiple instructions in parallel to multiple corresponding execution units via separate dispatch buses. The dispatch resources further can opportunistically decode and dispatch instruction operations from multiple threads in parallel so as to increase the dispatch bandwidth. Moreover, some or all of the stages of the processing pipelines of one or more of the processor cores can be configured to implement independent thread selection for the corresponding stage.Type: ApplicationFiled: December 31, 2007Publication date: July 2, 2009Applicant: ADVANCED MICRO DEVICES, INC.Inventors: Gene Shen, Sean Lie, Marius Evers
-
Publication number: 20080263373Abstract: A token-based power control mechanism for an apparatus including a power controller and a plurality of processing devices. The power controller may detect a power budget allotted for the apparatus. The power controller may convert the allotted power budget into a plurality of power tokens, each power token being a portion of the allotted power budget. The power controller may then assign one or more of the plurality of power tokens to each of the processing devices. The assigned power tokens may determine the power allotted for each of the processing devices. The power controller may receive one or more requests from the plurality of processing devices for one or more additional power tokens. In response to receiving the requests, the power controller may determine whether to change the distribution of power tokens among the processing devices.Type: ApplicationFiled: April 18, 2007Publication date: October 23, 2008Inventors: Stephan Meier, Marius Evers
-
Publication number: 20080209173Abstract: An apparatus for executing branch predictor directed prefetch operations. During operation, a branch prediction unit may provide an address of a first instruction to the fetch unit. The fetch unit may send a fetch request for the first instruction to the instruction cache to perform a fetch operation. In response to detecting a cache miss corresponding to the first instruction, the fetch unit may execute one or more prefetch operation while the cache miss corresponding to the first instruction is being serviced. The branch prediction unit may provide an address of a predicted next instruction in the instruction stream to the fetch unit. The fetch unit may send a prefetch request for the predicted next instruction to the instruction cache to execute the prefetch operation. The fetch unit may store prefetched instruction data obtained from a next level of memory in the instruction cache or in a prefetch buffer.Type: ApplicationFiled: February 28, 2007Publication date: August 28, 2008Inventors: Marius Evers, Trivikram Krishnamurthy
-
Patent number: 7188325Abstract: In one embodiment, a method for selecting transistor threshold voltages on an integrated circuit may include assigning a first threshold voltage to selected groups of transistors such as cell instances, for example, and determining which of the selected groups of transistors to assign a second threshold voltage, that is lower than the first threshold voltage, by iteratively performing a cost/benefit analysis. The method may further include determining which of the selected groups of transistors having a third threshold voltage to assign the first threshold voltage by iteratively performing a cost/benefit analysis. The cost/benefit analysis may include calculating a cost/benefit ratio for each group of the selected groups of transistors. In addition, the cost/benefit analysis may include calculating an upcone benefit and a downcone benefit for groups of transistors coupled to one or more inputs and outputs, respectively.Type: GrantFiled: October 4, 2004Date of Patent: March 6, 2007Assignee: Advanced Micro Devices, Inc.Inventors: Marius Evers, Jeffrey E. Trull, Alper Halbutogullari, Robert W. Williams