Patents by Inventor Raguram Damodaran

Raguram Damodaran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170285727
    Abstract: A scheduler and method for dynamic power reduction, e.g., in a processor core, is proposed. In conventional processor cores for example, the scheduler precharges grant lines of many instructions only to discharge a great majority of the precharged lines in one cycle. To reduce power consumption, selective precharge and/or selective evaluation are proposed. In the selective precharge, the grant lines of instructions that will evaluate to false (e.g., invalid instructions) are not precharged in a cycle. In the selective evaluation, among the precharged instructions, instructions that are not ready are not evaluated in the same cycle. In this way, power consumption is reduced by avoiding unnecessary precharge and discharge.
    Type: Application
    Filed: March 30, 2016
    Publication date: October 5, 2017
    Inventors: Milind Ram KULKARNI, Rami Mohammad A. AL SHEIKH, Raguram DAMODARAN
  • Publication number: 20170286119
    Abstract: Aspects disclosed in the detailed description include providing load address predictions using address prediction tables based on load path history in processor-based systems. In one aspect, a load address prediction engine provides a load address prediction table containing multiple load address prediction table entries. Each load address prediction table entry includes a predictor tag field and a memory address field for a load instruction. The load address prediction engine generates a table index and a predictor tag based on an identifier and a load path history for a detected load instruction. The table index is used to look up a corresponding load address prediction table entry. If the predictor tag matches the predictor tag field of the load address prediction table entry corresponding to the table index, the memory address field of the load address prediction table entry is provided as a predicted memory address for the load instruction.
    Type: Application
    Filed: March 31, 2016
    Publication date: October 5, 2017
    Inventors: Rami Mohammad Al Sheikh, Raguram Damodaran
  • Publication number: 20170277536
    Abstract: Providing references to previously decoded instructions of recently-provided instructions to be executed by a processor is disclosed herein. In one aspect, a low resource micro-operation controller is provided. Responsive to an instruction pipeline receiving an instruction address, the low resource micro-operation controller is configured to determine if the received instruction address corresponds to an instruction address in short history table. Short history table includes instruction addresses of recently-provided instructions having micro-ops in a post-decode queue. If the received instruction address corresponds to an instruction address in short history table, the low resource micro-operation controller is configured to provide reference (e.g., pointer) to the fetch stage that corresponds to an entry in the post-decode queue in which the micro-ops corresponding to the instruction address are stored.
    Type: Application
    Filed: March 24, 2016
    Publication date: September 28, 2017
    Inventors: Vignyan Reddy Kothinti Naresh, Shivam Priyadarshi, Raguram Damodaran
  • Publication number: 20170249149
    Abstract: Systems and methods for operating a processor include determining confidence levels, such as high, low, and medium confidence levels, associated with in-flight branch instructions in an instruction pipeline of the processor, based on counters used for predicting directions of the in-flight branch instructions. Numbers of in-flight branch instructions associated with each of confidence levels are determined. A weighted sum of the numbers weighted with weights corresponding to the confidence levels is calculated and the weighted sum is compared with a threshold. A throttling signal may be asserted to indicate that instructions are to be throttled in a pipeline stage of the instruction pipeline based on the comparison.
    Type: Application
    Filed: February 29, 2016
    Publication date: August 31, 2017
    Inventors: Shivam PRIYADARSHI, Rami Mohammad AL SHEIKH, Raguram DAMODARAN, Michael Scott MCILVAINE, Jeffrey Todd BRIDGES
  • Publication number: 20170192484
    Abstract: The disclosure generally relates to dynamic clock and voltage scaling (DCVS) based on program phase. For example, during each program phase, a first hardware counter may count each cycle where a dispatch stall occurs and an oldest instruction in a load queue is a last-level cache miss, a second hardware counter may count total cycles, and a third hardware counter may count committed instructions. Accordingly, a software/firmware mechanism may read the various hardware counters once the committed instruction counter reaches a threshold value and divide a value of the first hardware counter by a value of the second hardware counter to measure a stall fraction during a current program execution phase. The measured stall fraction can then be used to predict a stall fraction in a next program execution phase such that optimal voltage and frequency settings can be applied in the next phase based on the predicted stall fraction.
    Type: Application
    Filed: January 4, 2016
    Publication date: July 6, 2017
    Inventors: Shivam Priyadarshi, Anil Krishna, Raguram Damodaran, Jeffrey Todd Bridges, Ryan Wells, Norman Gargash, Rodney Wayne Smith
  • Publication number: 20170177366
    Abstract: Selective storing of previously decoded instructions of frequently-called instruction sequences in an instruction sequence buffer to be executed by a processor is disclosed. In one aspect, a selective instruction sequence buffer controller is configured to selectively store previously decoded instructions for an instruction sequence by determining if a received instruction address corresponds to an instruction sequence captured in an instruction sequence buffer. If the received instruction address corresponds to a captured instruction sequence, the selective instruction sequence buffer controller provides corresponding micro-operations stored in the instruction sequence buffer for execution. If the received instruction address does not correspond to the captured instruction sequence, the selective instruction sequence buffer controller reduces a frequency indicator of the instruction sequence.
    Type: Application
    Filed: December 22, 2015
    Publication date: June 22, 2017
    Inventors: Vignyan Reddy Kothinti Naresh, Shivam Priyadarshi, Raguram Damodaran
  • Publication number: 20170097894
    Abstract: The level one memory controller maintains a local copy of the cacheability bit of each memory attribute register. The level two memory controller is the initiator of all configuration read/write requests from the CPU. Whenever a configuration write is made to a memory attribute register, the level one memory controller updates its local copy of the memory attribute register.
    Type: Application
    Filed: October 5, 2015
    Publication date: April 6, 2017
    Inventors: Raguram Damodaran, Joseph Raymond Michael Zbiciak, Naveen Bhoria
  • Publication number: 20170091117
    Abstract: A cache fill line is received, including an index, a thread identifier, and cache fill line data. The cache is probed, using the index and a different thread identifier, for a potential duplicate cache line. The potential duplicate cache line includes cache line data and the different thread identifier. Upon the cache fill line data matching the cache line data, duplication is identified. The potential duplicate cache line is set as a shared resident cache line, and the thread share permission tag is set to a permission state.
    Type: Application
    Filed: September 25, 2015
    Publication date: March 30, 2017
    Inventors: Harold Wade CAIN, III, Derek Robert HOWER, Raguram DAMODARAN, Thomas Andrew SARTORIUS
  • Publication number: 20170090508
    Abstract: The clock frequency of a processor is reduced in response to a dispatch stall due to a cache miss. In an embodiment, the processor clock frequency is reduced for a load instruction that causes a last level cache miss, provided that the load instruction is the oldest load instruction and the number of consecutive processor cycles in which there is a dispatch stall exceeds a threshold, and provided that the total number of processor cycles since the last level cache miss does not exceed some specified number.
    Type: Application
    Filed: September 25, 2015
    Publication date: March 30, 2017
    Inventors: Shivam PRIYADARSHI, Anil KRISHNA, Raguram DAMODARAN, Jeffrey Todd BRIDGES, Thomas Philip SPEIER, Rodney Wayne SMITH, Keith Alan BOWMAN, David Joseph Winston HANSQUINE
  • Publication number: 20170090930
    Abstract: Reconfiguring execution pipelines of out-of-order (OOO) computer processors based on phase training and prediction is disclosed. In one aspect, a pipeline reconfiguration circuit is communicatively coupled to an execution pipeline providing multiple selectable pipeline configurations. The pipeline reconfiguration circuit generates a phase identifier (ID) for a phase based on a preceding phase. The phase ID is used as an index into an entry of a pipeline configuration prediction (PCP) table to determine whether training for the phase is ongoing. If so, the pipeline reconfiguration circuit performs multiple training cycles, each employing a pipeline configuration from the selectable pipeline configurations for the execution pipeline, to determine a preferred pipeline configuration for the phase. If training for the phase is complete, the pipeline reconfiguration circuit reconfigures the execution pipeline into the preferred pipeline configuration indicated by the entry before the phase is executed.
    Type: Application
    Filed: September 24, 2015
    Publication date: March 30, 2017
    Inventors: Shivam Priyadarshi, Anil Krishna, Raguram Damodaran
  • Publication number: 20170060593
    Abstract: Systems and methods relate to a hierarchical register file system including a level 1 physical register file (L1 PRF) and a backing physical register file (PRF). A subset of productions of instructions executed in an instruction pipeline of a processor which have a high likelihood of use for one or more future instructions are identified. The subset of productions are stored in the L1 PRF, while all productions are stored in the backing PRF.
    Type: Application
    Filed: September 2, 2015
    Publication date: March 2, 2017
    Inventors: Anil KRISHNA, Rodney Wayne SMITH, Sandeep Suresh NAVADA, Shivam PRIYADARSHI, Niket Kumar CHOUDHARY, Raguram DAMODARAN
  • Patent number: 9582285
    Abstract: Speculative history forwarding in overriding branch predictors, and related circuits, methods, and computer-readable media are disclosed. In one embodiment, a branch prediction circuit including a first branch predictor and a second branch predictor is provided. The first branch predictor generates a first branch prediction for a conditional branch instruction, and the first branch prediction is stored in a first branch prediction history. The first branch prediction is also speculatively forwarded to a second branch prediction history. The second branch predictor subsequently generates a second branch prediction based on the second branch prediction history, including the speculatively forwarded first branch prediction. By enabling the second branch predictor to base its branch prediction on the speculatively forwarded first branch prediction, an accuracy of the second branch predictor may be improved.
    Type: Grant
    Filed: March 24, 2014
    Date of Patent: February 28, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Rami Mohammad Al Sheikh, Raguram Damodaran
  • Patent number: 9575901
    Abstract: This invention is a cache system with a memory attribute register having plural entries. Each entry stores a write-through or a write-back indication for a corresponding memory address range. On a write to cached data the cache the cache consults the memory attribute register for the corresponding address range. Writes to addresses in regions marked as write-through always update all levels of the memory hierarchy. Writes to addresses in regions marked as write-back update only the first cache level that can service the write. The memory attribute register is preferably a memory mapped control register writable by the central processing unit.
    Type: Grant
    Filed: October 15, 2015
    Date of Patent: February 21, 2017
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Raguram Damodaran, Abhijeet Ashok Chachad, Naveen Bhoria, David Matthew Thompson
  • Publication number: 20170046159
    Abstract: Systems and methods relate to an instruction fetch unit of a processor, such as a superscalar processor. The instruction fetch unit includes a fetch bandwidth predictor (FBWP) configured to predict a number of instructions to be fetched in a fetch group of instructions in a pipeline stage of the processor. A first entry of the FBWP corresponding to the fetch group corresponds to a prediction of the number of instructions to be fetched, based on occurrence and location of a predicted taken branch instruction in the fetch group and a confidence level associated with the predicted number in the prediction field. The instruction fetch unit is configured to fetch only the predicted number of instructions, rather than the maximum number of entries that can be fetched in the pipeline stage, if the confidence level is greater than a predetermined threshold. In this manner, wasteful fetching of instructions is avoided.
    Type: Application
    Filed: August 14, 2015
    Publication date: February 16, 2017
    Inventors: Shivam PRIYADARSHI, Rami Mohammad AL SHEIKH, Raguram DAMODARAN
  • Publication number: 20170046154
    Abstract: Storing narrow produced values for instruction operands directly in a register map in an out-of-order processor (OoP) is provided. An OoP is provided that includes an instruction processing system. The instruction processing system includes a number of instruction processing stages configured to pipeline the processing and execution of instructions according to a dataflow execution. The instruction processing system also includes a register map table (RMT) configured to store address pointers mapping logical registers to physical registers in a physical register file (PRF) for storing produced data for use by consumer instructions without overwriting logical registers for later executed, out-of-order instructions. In certain aspects, the instruction processing system is configured to write back (i.e., store) narrow values produced by executed instructions directly into the RMT, as opposed to writing the narrow produced values into the PRF in a write back stage.
    Type: Application
    Filed: September 21, 2015
    Publication date: February 16, 2017
    Inventors: Anil Krishna, Rodney Wayne Smith, Sandeep Suresh Navada, Shivam Priyadarshi, Raguram Damodaran
  • Patent number: 9390011
    Abstract: A method to eliminate the delay of a block invalidate operation in a multi CPU environment by overlapping the block invalidate operation with normal CPU accesses, thus making the delay transparent. A range check is performed on each CPU access while a block invalidate operation is in progress, and an access that maps to within the address range of the block invalidate operation is treated as a cache miss to ensure that the requesting CPU will receive valid data.
    Type: Grant
    Filed: October 6, 2015
    Date of Patent: July 12, 2016
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Naveen Bhoria, Raguram Damodaran, Abhijeet Ashok Chachad
  • Patent number: 9298643
    Abstract: This invention optimizes DMA writes to directly addressable level two memory that is cached in level one and the line is valid and dirty. When the level two controller detects that a line is valid and dirty in level one, the level two memory need not update its copy of the data. Level one memory will replace the level two copy with a victim writeback at a future time. Thus the level two memory need not store write a copy. This limits the number of DMA writes to level two directly addressable memory and thus improves performance and minimizes dynamic power. This also frees the level two memory for other master/requestors.
    Type: Grant
    Filed: June 2, 2015
    Date of Patent: March 29, 2016
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jonathan (Son) Hung Tran, Raguram Damodaran, Abhijeet Ashok Chachad, Joseph Raymond Michael Zbiciak
  • Patent number: 9268708
    Abstract: This invention assures cache coherence in a multi-level cache system upon eviction of a higher level cache line. A victim buffer stored data on evicted lines. On a DMA access that may be cached in the higher level cache the lower level cache sends a snoop write. The address of this snoop write is compared with the victim buffer. On a hit in the victim buffer the write completes in the victim buffer. When the victim data passes to the next cache level it is written into a second victim buffer to be retired when the data is committed to cache. DMA write addresses are compared to addresses in this second victim buffer. On a match the write takes place in the second victim buffer. On a failure to match the controller sends a snoop write.
    Type: Grant
    Filed: March 4, 2015
    Date of Patent: February 23, 2016
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Raguram Damodaran, Abhijeet Ashok Chachad, Jonathan (Son) Hung Tran, David Matthew Thompson
  • Publication number: 20160034396
    Abstract: This invention is a cache system with a memory attribute register having plural entries. Each entry stores a write-through or a write-back indication for a corresponding memory address range. On a write to cached data the cache the cache consults the memory attribute register for the corresponding address range. Writes to addresses in regions marked as write-through always update all levels of the memory hierarchy. Writes to addresses in regions marked as write-back update only the first cache level that can service the write. The memory attribute register is preferably a memory mapped control register writable by the central processing unit.
    Type: Application
    Filed: October 15, 2015
    Publication date: February 4, 2016
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Raguram Damodaran, Abhijeet Ashok Chachad, Naveen Bhoria, David Matthew Thompson
  • Patent number: RE46193
    Abstract: An embedded megamodule and an embedded CPU enable power-saving through a combination of hardware and software. The CPU configures the power-down controller (PDC) logic within megamodule and can software trigger a low-power state of logic modules during processor IDLE periods. To wake from this power-down state, a system event is asserted to the CPU through the module interrupt controller. Thus the entry into a low-power state is software-driven during periods of inactivity and power restoration is on system activity that demands the attention of the CPU.
    Type: Grant
    Filed: June 12, 2014
    Date of Patent: November 1, 2016
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Timothy David Anderson, Lewis Nardini, Jose Luis Flores, Abhijeet Chachad, Raguram Damodaran, Joseph R. M. Zbiciak, Gary Swoboda