Patents by Inventor Ravindra N. Bhargava

Ravindra N. Bhargava has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DYNAMIC PAGE STATE AWARE SCHEDULING OF READ/WRITE BURST TRANSACTIONS

Publication number: 20190196995

Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. When a memory controller in a computing system determines a threshold number of memory access requests have not been sent to the memory device in a current mode of a read mode and a write mode, a first cost corresponding to a latency associated with sending remaining requests in either the read queue or the write queue associated with the current mode is determined. If the first cost exceeds the cost of a data bus turnaround, the cost of a data bus turnaround comprising a latency incurred when switching a transmission direction of the data bus from one direction to an opposite direction, then a second cost is determined for sending remaining memory access requests to the memory device. If the second cost does not exceed the cost of the data bus turnaround, then a time for the data bus turnaround is indicated and the current mode of the memory controller is changed.

Type: Application

Filed: December 21, 2017

Publication date: June 27, 2019

Inventors: Guanhao Shen, Ravindra N. Bhargava, Kedarnath Balakrishnan

ADAPTIVE PAGE CLOSE PREDICTION

Publication number: 20190196720

Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. In various embodiments, a computing system includes one or more computing resources and a memory controller coupled to a memory device. The memory controller determines a memory access request targets a given bank of multiple banks. An access history is updated for the given bank based on whether the memory access request hits on an open page within the given bank and a page hit rate for the given bank is determined. The memory controller sets an idle cycle limit based on the page hit rate. The idle cycle limit is a maximum amount of time the given bank will be held open before closing the given bank while the bank is idle. The idle cycle limit is based at least in part on a page hit rate for the bank.

Type: Application

Filed: December 21, 2017

Publication date: June 27, 2019

Inventors: Guanhao Shen, Ravindra N. Bhargava, James Raymond Magro, Kedarnath Balakrishnan, Kevin M. Brandl
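
As a rough illustration of the adaptive page close idea in the abstract above, the following Python sketch keeps a short per-bank access history, derives a page hit rate from it, and maps that rate to an idle cycle limit. The history length, the linear mapping, and the names `BankState`, `MIN_IDLE`, and `MAX_IDLE` are assumptions made for illustration; none of them come from the application.

```python
from collections import deque

MIN_IDLE = 8     # hypothetical lower bound, in controller cycles
MAX_IDLE = 128   # hypothetical upper bound, in controller cycles


class BankState:
    """Per-bank access history used to choose an idle cycle limit."""

    def __init__(self, history_len=16):
        # 1 = the access hit the open page, 0 = it did not.
        self.history = deque(maxlen=history_len)

    def record_access(self, page_hit):
        """Update the access history for this bank."""
        self.history.append(1 if page_hit else 0)

    def page_hit_rate(self):
        """Fraction of recorded accesses that hit the open page."""
        if not self.history:
            return 0.0
        return sum(self.history) / len(self.history)

    def idle_cycle_limit(self):
        """Assumed policy: a higher hit rate keeps an idle page open longer."""
        rate = self.page_hit_rate()
        return round(MIN_IDLE + rate * (MAX_IDLE - MIN_IDLE))
```

A real controller would presumably use fixed-width counters rather than floating point; the linear mapping is only one plausible choice.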

DYNAMIC PER-BANK AND ALL-BANK REFRESH

Publication number: 20190196987

Abstract: Systems, apparatuses, and methods for performing efficient memory accesses in a computing system are disclosed. In various embodiments, a computing system includes computing resources and a memory controller coupled to a memory device. The memory controller determines a memory request targets a given rank of multiple ranks. The memory controller determines a predicted latency for the given rank as an amount of time the pending queue in the memory controller for storing outstanding memory requests does not store any memory requests targeting the given rank. The memory controller determines the total bank latency as an amount of time for refreshing a number of banks which have not yet been refreshed in the given rank with per-bank refresh operations. If there are no pending requests targeting the given rank, each of the predicted latency and the total bank latency is used to select between per-bank and all-bank refresh operations.

Type: Application

Filed: December 21, 2017

Publication date: June 27, 2019

Inventors: Guanhao Shen, Ravindra N. Bhargava, James Raymond Magro, Kedarnath Balakrishnan, Jing Wang

REGION BASED DIRECTORY SCHEME TO ADAPT TO LARGE CACHE SIZES

Publication number: 20190188137

Abstract: Systems, apparatuses, and methods for maintaining a region-based cache directory are disclosed. A system includes multiple processing nodes, with each processing node including a cache subsystem. The system also includes a cache directory to help manage cache coherency among the different cache subsystems of the system. In order to reduce the number of entries in the cache directory, the cache directory tracks coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Accordingly, the system includes a region-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the system. The cache directory includes a reference count in each entry to track the aggregate number of cache lines that are cached per region. If a reference count of a given entry goes to zero, the cache directory reclaims the given entry.

Type: Application

Filed: December 18, 2017

Publication date: June 20, 2019

Inventors: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Amit P. Apte, Ganesh Balakrishnan, Eric Christopher Morton, Elizabeth M. Cooper, Ravindra N. Bhargava
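
The reference-counting behavior described in the abstract above can be sketched in Python as follows. This is a minimal model, not the claimed design: the region and line sizes, the dictionary-backed directory, and the names `RegionDirectory`, `line_cached`, and `line_evicted` are all hypothetical.

```python
LINE_SIZE = 64          # hypothetical: bytes per cache line
LINES_PER_REGION = 32   # hypothetical: cache lines per region


class RegionDirectory:
    """Tracks, per region, how many of its cache lines are cached anywhere."""

    def __init__(self):
        self.entries = {}   # region number -> reference count

    @staticmethod
    def region_of(address):
        """Map a byte address to the region that contains it."""
        return address // (LINE_SIZE * LINES_PER_REGION)

    def line_cached(self, address):
        """A cache subsystem cached a line: bump the region's count."""
        region = self.region_of(address)
        self.entries[region] = self.entries.get(region, 0) + 1

    def line_evicted(self, address):
        """A line left a cache: decrement, and reclaim the entry at zero."""
        region = self.region_of(address)
        count = self.entries.get(region, 0)
        if count == 0:
            return                      # untracked region: nothing to do
        if count == 1:
            del self.entries[region]    # count reached zero: reclaim entry
        else:
            self.entries[region] = count - 1
```

One entry covers every line in its region, so the directory needs far fewer entries than a per-line directory of the same reach.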

CACHE CONTROL AWARE MEMORY CONTROLLER

Publication number: 20190179760

Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. External system memory is used as a last-level cache and includes one of a variety of types of dynamic random access memory (DRAM). A memory controller generates a tag request and a separate data request based on a same, single received memory request. The sending of the tag request is prioritized over sending the data request. A partial tag comparison is performed during processing of the tag request. If a tag miss is detected for the partial tag comparison, then the data request is cancelled, and the memory request is sent to main memory. If one or more tag hits are detected for the partial tag comparison, then processing of the data request is dependent upon the result of the full tag comparison.

Type: Application

Filed: December 12, 2017

Publication date: June 13, 2019

Inventors: Ravindra N. Bhargava, Ganesh Balakrishnan
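
To make the two-stage tag check in the abstract above concrete, here is a small Python sketch. It assumes the partial tag is simply the low bits of the full tag and that a full-tag miss after a partial hit also falls back to main memory; the abstract states neither detail, and the names `partial` and `process_request` are hypothetical.

```python
PARTIAL_BITS = 8   # hypothetical width of the partial tag


def partial(tag):
    """Assumed partial tag: the low PARTIAL_BITS bits of the full tag."""
    return tag & ((1 << PARTIAL_BITS) - 1)


def process_request(request_tag, stored_tags):
    """Return the outcome of the data request for one set of stored tags."""
    # Stage 1: partial comparison, done while processing the tag request.
    candidates = [t for t in stored_tags if partial(t) == partial(request_tag)]
    if not candidates:
        # Partial miss: cancel the data request, go to main memory.
        return "cancelled_on_partial_miss"
    # Stage 2: one or more partial hits, so the full comparison decides.
    if request_tag in candidates:
        return "served_from_dram_cache"
    # Assumption: a full-tag miss is also forwarded to main memory.
    return "cancelled_on_full_miss"
```

The early exit in stage 1 is the point of the scheme: a definite miss is known before the full comparison completes.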

CACHE TO CACHE DATA TRANSFER ACCELERATION TECHNIQUES

Publication number: 20190179758

Abstract: Systems, apparatuses, and methods for accelerating cache to cache data transfers are disclosed. A system includes at least a plurality of processing nodes and prediction units, an interconnect fabric, and a memory. A first prediction unit is configured to receive memory requests generated by a first processing node as the requests traverse the interconnect fabric on the path to memory. When the first prediction unit receives a memory request, the first prediction unit generates a prediction of whether data targeted by the request is cached by another processing node. The first prediction unit is configured to cause a speculative probe to be sent to a second processing node responsive to predicting that the data targeted by the memory request is cached by the second processing node. The speculative probe accelerates the retrieval of the data from the second processing node if the prediction is correct.

Type: Application

Filed: December 12, 2017

Publication date: June 13, 2019

Inventors: Vydhyanathan Kalyanasundharam, Amit P. Apte, Ganesh Balakrishnan, Ann Ling, Ravindra N. Bhargava

SPECULATIVE HINT-TRIGGERED ACTIVATION OF PAGES IN MEMORY

Publication number: 20190155516

Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. In various embodiments, a computing system includes a computing resource and a memory controller coupled to a memory device. The computing resource selectively generates a hint that includes a target address of a memory request generated by the processor. The hint is sent outside the primary communication fabric to the memory controller. The hint conditionally triggers a data access in the memory device. When no page in a bank targeted by the hint is open, the memory controller processes the hint by opening a target page of the hint without retrieving data. The memory controller drops the hint if there are other pending requests that target the same page or the target page is already open.

Type: Application

Filed: November 20, 2017

Publication date: May 23, 2019

Inventors: Ravindra N. Bhargava, Philip S. Park, Vydhyanathan Kalyanasundharam, James Raymond Magro
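
The hint-handling rules in the abstract above can be written as a short decision function. The Python sketch below is illustrative only: the data structures and the name `handle_hint` are hypothetical, and the case where a different page is already open in the target bank is not covered by the abstract, so the sketch simply drops the hint there.

```python
def handle_hint(bank, page, open_pages, pending):
    """Decide what to do with a hint; returns "drop" or "activate".

    open_pages: dict mapping bank -> currently open page (absent if none)
    pending:    set of (bank, page) pairs targeted by queued requests
    """
    if (bank, page) in pending:
        return "drop"               # another request already targets the page
    if open_pages.get(bank) == page:
        return "drop"               # target page is already open
    if bank not in open_pages:
        open_pages[bank] = page     # open the page without retrieving data
        return "activate"
    # A different page is open in this bank. The abstract does not say
    # what happens here; dropping the hint is an assumption.
    return "drop"
```

For example, `handle_hint(0, 7, {}, set())` returns `"activate"` and records page 7 as open in bank 0, so a later request to that page can hit an open page.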

Thread selection at a processor based on branch prediction confidence

Patent number: 10223124

Abstract: A processor employs one or more branch predictors to issue branch predictions for each thread executing at an instruction pipeline. Based on the branch predictions, the processor determines a branch prediction confidence for each of the executing threads, whereby a lower confidence level indicates a smaller likelihood that the corresponding thread will actually take the predicted branch. Because speculative execution of an untaken branch wastes resources of the instruction pipeline, the processor prioritizes threads associated with a higher confidence level for selection at the stages of the instruction pipeline.

Type: Grant

Filed: January 11, 2013

Date of Patent: March 5, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Ramkumar Jayaseelan, Ravindra N. Bhargava

Variable distance bypass between tag array and data array pipelines in a cache

Patent number: 9529720

Abstract: The present application describes embodiments of techniques for picking a data array lookup request for execution in a data array pipeline a variable number of cycles behind a corresponding tag array lookup request that is concurrently executing in a tag array pipeline. Some embodiments of a method for picking the data array lookup request include picking the data array lookup request for execution in a data array pipeline of a cache concurrently with execution of a tag array lookup request in a tag array pipeline of the cache. The data array lookup request is picked for execution in response to resources of the data array pipeline becoming available after picking the tag array lookup request for execution. Some embodiments of the method may be implemented in a cache.

Type: Grant

Filed: June 7, 2013

Date of Patent: December 27, 2016

Assignee: Advanced Micro Devices, Inc.

Inventors: Marius Evers, John Kalamatianos, Carl D. Dietz, Richard E. Klass, Ravindra N. Bhargava

VARIABLE DISTANCE BYPASS BETWEEN TAG ARRAY AND DATA ARRAY PIPELINES IN A CACHE

Publication number: 20140365729

Abstract: The present application describes embodiments of techniques for picking a data array lookup request for execution in a data array pipeline a variable number of cycles behind a corresponding tag array lookup request that is concurrently executing in a tag array pipeline. Some embodiments of a method for picking the data array lookup request include picking the data array lookup request for execution in a data array pipeline of a cache concurrently with execution of a tag array lookup request in a tag array pipeline of the cache. The data array lookup request is picked for execution in response to resources of the data array pipeline becoming available after picking the tag array lookup request for execution. Some embodiments of the method may be implemented in a cache.

Type: Application

Filed: June 7, 2013

Publication date: December 11, 2014

Inventors: Marius Evers, John Kalamatianos, Carl D. Dietz, Richard E. Klass, Ravindra N. Bhargava

THREAD SELECTION AT A PROCESSOR BASED ON BRANCH PREDICTION CONFIDENCE

Publication number: 20140201507

Abstract: A processor employs one or more branch predictors to issue branch predictions for each thread executing at an instruction pipeline. Based on the branch predictions, the processor determines a branch prediction confidence for each of the executing threads, whereby a lower confidence level indicates a smaller likelihood that the corresponding thread will actually take the predicted branch. Because speculative execution of an untaken branch wastes resources of the instruction pipeline, the processor prioritizes threads associated with a higher confidence level for selection at the stages of the instruction pipeline.

Type: Application

Filed: January 11, 2013

Publication date: July 17, 2014

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Ramkumar Jayaseelan, Ravindra N. Bhargava
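
The prioritization described in the abstract above amounts to choosing the thread whose predictions are most trusted. A minimal Python sketch follows, assuming each thread's confidence is already available as a number and that ties go to the lowest thread id; both the representation and the tie-break are assumptions, and `select_thread` is a hypothetical name.

```python
def select_thread(confidences):
    """Pick the thread with the highest branch prediction confidence.

    confidences: dict mapping thread id -> confidence (higher is better).
    Ties are broken in favor of the lowest thread id (an assumption).
    """
    if not confidences:
        raise ValueError("no threads to select from")
    return min(confidences, key=lambda tid: (-confidences[tid], tid))
```

A hardware implementation would more likely compare small saturating counters at each pipeline stage, and would need some fairness mechanism so that low-confidence threads are not starved.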

Branch history with polymorphic indirect branch information

Patent number: 8782384

Abstract: A system and method for efficient improvement of branch prediction in a microprocessor with negligible impact on die-area, power consumption, and clock cycle period. It is determined if a program counter (PC) register contains a polymorphic indirect unconditional branch (PIUB) instruction. One determination may be searching a table with a portion or all of a PC of past PIUB instructions. If a hit occurs in this table, the global shift register (GSR) is updated by shifting a portion of the branch target address into the GSR, rather than updating the GSR with a taken/not-taken prediction bit. The stored value in the GSR is input into a hashing function along with the PC in order to index prediction tables such as a pattern history table (PHT), a branch target buffer (BTB), an indirect target array, or other. The updated value due to the PIUB instruction improves the accuracy of the prediction tables.

Type: Grant

Filed: December 20, 2007

Date of Patent: July 15, 2014

Assignee: Advanced Micro Devices, Inc.

Inventors: David Suggs, Ravindra N. Bhargava
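
The history update in the abstract above can be sketched in Python as follows. The register width, the number of target bits shifted in, the choice of the low target bits, and the XOR hash are all assumptions for illustration; the abstract says only that "a portion" of the target address is shifted in and that the GSR and PC feed "a hashing function". The names `update_gsr` and `table_index` are hypothetical.

```python
GSR_BITS = 16      # hypothetical global shift register width
TARGET_BITS = 4    # hypothetical number of target-address bits shifted in
GSR_MASK = (1 << GSR_BITS) - 1


def update_gsr(gsr, pc, taken, target, piub_table):
    """Return the new GSR value after a branch at `pc` resolves or predicts.

    piub_table: set of PCs known to hold polymorphic indirect
                unconditional branch (PIUB) instructions.
    """
    if pc in piub_table:
        # Table hit: shift in part of the target address, not a taken bit.
        bits = target & ((1 << TARGET_BITS) - 1)
        return ((gsr << TARGET_BITS) | bits) & GSR_MASK
    # Ordinary branch: shift in the single taken/not-taken bit.
    return ((gsr << 1) | int(taken)) & GSR_MASK


def table_index(gsr, pc, table_size):
    """Assumed hash: XOR of history and PC, folded to the table size."""
    return (gsr ^ pc) % table_size
```

Because different targets of the same indirect branch leave different bit patterns in the history, later lookups that hash the GSR with the PC land on different table entries for each target.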

Detecting branch direction and target address pattern and supplying fetch address by replay unit instead of branch prediction unit

Patent number: 8667257

Abstract: Techniques are disclosed relating to improving the performance of branch prediction in processors. In one embodiment, a processor is disclosed that includes a branch prediction unit configured to predict a sequence of instructions to be issued by the processor for execution. The processor also includes a pattern detection unit configured to detect a pattern in the predicted sequence of instructions, where the pattern includes a plurality of predicted instructions. In response to the pattern detection unit detecting the pattern, the processor is configured to switch from issuing instructions predicted by the branch prediction unit to issuing the plurality of instructions. In some embodiments, the processor includes a replay unit that is configured to replay fetch addresses to an instruction fetch unit to cause the plurality of predicted instructions to be issued.

Type: Grant

Filed: November 10, 2010

Date of Patent: March 4, 2014

Assignee: Advanced Micro Devices, Inc.

Inventors: Ravindra N. Bhargava, David Suggs, Anthony X. Jarvis

REPLAY OF DETECTED PATTERNS IN PREDICTED INSTRUCTIONS

Publication number: 20120117362

Abstract: Techniques are disclosed relating to improving the performance of branch prediction in processors. In one embodiment, a processor is disclosed that includes a branch prediction unit configured to predict a sequence of instructions to be issued by the processor for execution. The processor also includes a pattern detection unit configured to detect a pattern in the predicted sequence of instructions, where the pattern includes a plurality of predicted instructions. In response to the pattern detection unit detecting the pattern, the processor is configured to switch from issuing instructions predicted by the branch prediction unit to issuing the plurality of instructions. In some embodiments, the processor includes a replay unit that is configured to replay fetch addresses to an instruction fetch unit to cause the plurality of predicted instructions to be issued.

Type: Application

Filed: November 10, 2010

Publication date: May 10, 2012

Inventors: Ravindra N. Bhargava, David Suggs, Anthony X. Jarvis

BRANCH HISTORY WITH POLYMORPHIC INDIRECT BRANCH INFORMATION

Publication number: 20090164766

Abstract: A system and method for efficient improvement of branch prediction in a microprocessor with negligible impact on die-area, power consumption, and clock cycle period. It is determined if a program counter (PC) register contains a polymorphic indirect unconditional branch (PIUB) instruction. One determination may be searching a table with a portion or all of a PC of past PIUB instructions. If a hit occurs in this table, the global shift register (GSR) is updated by shifting a portion of the branch target address into the GSR, rather than updating the GSR with a taken/not-taken prediction bit. The stored value in the GSR is input into a hashing function along with the PC in order to index prediction tables such as a pattern history table (PHT), a branch target buffer (BTB), an indirect target array, or other. The updated value due to the PIUB instruction improves the accuracy of the prediction tables.

Type: Application

Filed: December 20, 2007

Publication date: June 25, 2009

Inventors: David Suggs, Ravindra N. Bhargava

PARALLEL PREDICTION OF MULTIPLE BRANCHES

Publication number: 20080209190

Abstract: A branch history value associated with a first branch instruction of a first set of instructions is determined. The branch history value represents a branch history of a program flow prior to the first branch instruction. A first branch prediction of the first branch instruction is determined based on the branch history value of the first branch instruction and a first identifier associated with the first branch instruction. A second branch prediction of a second branch instruction of the first set of instructions is determined based on the branch history value associated with the first branch instruction and a second identifier associated with the second branch instruction. The second branch instruction occurs subsequent to the first branch instruction in the program flow. A second set of instructions is fetched at the processing device based on at least one of the first branch prediction and the second branch prediction.

Type: Application

Filed: February 28, 2007

Publication date: August 28, 2008

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Ravindra N. Bhargava, Brian Raf
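
The key point of the abstract above is that both predictions use the same history value, the one that precedes the first branch, and differ only in the per-branch identifier. A minimal Python sketch under assumed details: the table holds 2-bit saturating counters, the index is an XOR of history and identifier, and `predict_pair` is a hypothetical name.

```python
def predict_pair(history, first_id, second_id, table):
    """Predict two branches of one fetch group from a shared history value.

    table: list of 2-bit saturating counters (0..3); 2 or 3 means "taken".
    The XOR indexing is an assumption made for this sketch.
    """
    size = len(table)
    first_taken = table[(history ^ first_id) % size] >= 2
    # The second lookup reuses the first branch's history value, so it
    # does not have to wait for the first prediction to update the history.
    second_taken = table[(history ^ second_id) % size] >= 2
    return first_taken, second_taken
```

Since neither lookup depends on the other's result, both can be issued in the same cycle.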

INSTRUCTION PIPELINE MONITORING DEVICE AND METHOD THEREOF

Publication number: 20080141002

Abstract: In accordance with a specific embodiment of the present disclosure, hardware periodically monitors a fetch cycle that fetches data associated with an address to determine performance parameters associated with the fetch cycle. Information related to the duration of a fetch cycle is maintained as well as information indicating the occurrence of various states and data values related to the fetch cycle. For example, the virtual address being processed during the fetch cycle is saved at the integrated circuit containing the fetch engine. Other performance-related parameters associated with execution of instructions at an execution engine of the pipeline are also monitored periodically. However, monitoring performance of the fetch engine is decoupled from monitoring performance-related events of the execution engine.

Type: Application

Filed: December 8, 2006

Publication date: June 12, 2008

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Ravindra N. Bhargava, Benjamin T. Sander

FETCH ENGINE MONITORING DEVICE AND METHOD THEREOF

Publication number: 20080140993

Abstract: In accordance with a specific embodiment of the present disclosure, hardware periodically monitors a fetch cycle that fetches data associated with an address to determine performance parameters associated with the fetch cycle. Information related to the duration of a fetch cycle is maintained as well as information indicating the occurrence of various states and data values related to the fetch cycle. For example, the virtual address being processed during the fetch cycle is saved at the integrated circuit containing the fetch engine. Other performance-related parameters associated with execution of instructions at an execution engine of the pipeline are also monitored periodically. However, monitoring performance of the fetch engine is decoupled from monitoring performance-related events of the execution engine.

Type: Application

Filed: December 8, 2006

Publication date: June 12, 2008

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Ravindra N. Bhargava, Benjamin T. Sander, David Neal Suggs

EXECUTION ENGINE MONITORING DEVICE AND METHOD THEREOF

Publication number: 20080141008

Abstract: In accordance with a specific embodiment of the present disclosure, hardware periodically monitors a fetch cycle that fetches data associated with an address to determine performance parameters associated with the fetch cycle. Information related to the duration of a fetch cycle is maintained as well as information indicating the occurrence of various states and data values related to the fetch cycle. For example, the virtual address being processed during the fetch cycle is saved at the integrated circuit containing the fetch engine. Other performance-related parameters associated with execution of instructions at an execution engine of the pipeline are also monitored periodically. However, monitoring performance of the fetch engine is decoupled from monitoring performance-related events of the execution engine.

Type: Application

Filed: December 8, 2006

Publication date: June 12, 2008

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Benjamin T. Sander, Michael Edward Tuuk, Ravindra N. Bhargava
