Patents by Inventor Rodney Wayne Smith

Rodney Wayne Smith has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200409712
    Abstract: Operand-based reach explicit dataflow processors, and related methods and computer-readable media are disclosed. The operand-based reach explicit dataflow processors support execution of a producer instruction that explicitly names a target consumer operand of a consumer instruction in a consumer operand encoding namespace of the producer instruction. The produced value from execution of the producer instruction is provided or otherwise made available as an input to the named target consumer operand of the consumer instruction as a result of processing the producer instruction. The target consumer operand is encoded in the producer instruction as an operand target distance relative to the producer instruction. Instructions in an instruction stream between the producer instruction and the targeted consumer instruction that have no operands do not consume an operand reach namespace in the producer instructions.
    Type: Application
    Filed: June 28, 2019
    Publication date: December 31, 2020
    Inventors: Robert Douglas CLANCY, Melinda Joyce BROWN, Yusuf Cagatay TEKMEN, Brian Michael STEMPEL, Michael Scott MCILVAINE, Thomas Philip SPEIER, Rodney Wayne SMITH, Gagan GUPTA, David Tennyson HARPER, III
  • Patent number: 10877768
    Abstract: Minimizing traversal of a processor reorder buffer (ROB) for register rename map table (RMT) state recovery for interrupted instruction recovery in a processor. Instructions may execute out of order in a processor. Information about the logical register-to-physical register mapping resulting from each instruction is stored in entries in program order in the ROB. When the pipeline is interrupted by an instruction that fails to execute, changing program flow, all instructions following the interrupting instruction may be flushed from the processor pipeline. It is important to return the state of the RMT to the state that existed when the interrupting instruction entered the pipeline. To recover the RMT state in response to an interrupting instruction, register mapping information in the ROB entries is traversed to either undo the younger instructions that entered the pipeline after the interrupting instruction or replay the older instructions that entered the pipeline before the interrupting instruction.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: December 29, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shivam Priyadarshi, Yusuf Cagatay Tekmen, Kiran Ravi Seth, Rodney Wayne Smith, Vignyan Reddy Kothinti Naresh
  • Patent number: 10860328
    Abstract: Providing late physical register allocation and early physical register release in out-of-order processor (OOP)-based devices implementing a checkpoint-based architecture is provided. In this regard, an OOP-based device provides a register management circuit that is configured to employ a combination of the checkpoint approach and the virtual register approach. The register management circuit includes a most recent table (MRT) for tracking mappings of logical register numbers (LRNs) to physical register numbers (PRNs), a physical register file (PRF) storing information for physical registers, a virtual register file (VRF) storing data for virtual registers, and a checkpoint queue for tracking active checkpoints (each of which is a snapshot of the MRT at a given time).
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: December 8, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Shivam Priyadarshi, Rodney Wayne Smith, Yusuf Cagatay Tekmen, Luke Yen
  • Publication number: 20200301877
    Abstract: Exemplary reach-based explicit dataflow processors and related computer-readable media and methods. The reach-based explicit dataflow processors are configured to support execution of producer instructions encoded with explicit naming of consumer instructions intended to consume the values produced by the producer instructions. The reach-based explicit dataflow processors are configured to make available produced values as inputs to explicitly named consumer instructions as a result of processing producer instructions. The reach-based explicit dataflow processors support execution of a producer instruction that explicitly names a consumer instruction based on using the producer instruction as a relative reference point from the producer instruction.
    Type: Application
    Filed: March 18, 2019
    Publication date: September 24, 2020
    Inventors: Gagan GUPTA, Michael Scott MCILVAINE, Rodney Wayne SMITH, Thomas Philip SPEIER, David Tennyson HARPER, III
  • Publication number: 20200097296
    Abstract: Providing late physical register allocation and early physical register release in out-of-order processor (OOP)-based devices implementing a checkpoint-based architecture is provided. In this regard, an OOP-based device provides a register management circuit that is configured to employ a combination of the checkpoint approach and the virtual register approach. The register management circuit includes a most recent table (MRT) for tracking mappings of logical register numbers (LRNs) to physical register numbers (PRNs), a physical register file (PRF) storing information for physical registers, a virtual register file (VRF) storing data for virtual registers, and a checkpoint queue for tracking active checkpoints (each of which is a snapshot of the MRT at a given time).
    Type: Application
    Filed: September 21, 2018
    Publication date: March 26, 2020
    Inventors: Shivam Priyadarshi, Rodney Wayne Smith, Yusuf Cagatay Tekmen, Luke Yen
  • Patent number: 10551896
    Abstract: The disclosure generally relates to dynamic clock and voltage scaling (DCVS) based on program phase. For example, during each program phase, a first hardware counter may count each cycle where a dispatch stall occurs and an oldest instruction in a load queue is a last-level cache miss, a second hardware counter may count total cycles, and a third hardware counter may count committed instructions. Accordingly, a software/firmware mechanism may read the various hardware counters once the committed instruction counter reaches a threshold value and divide a value of the first hardware counter by a value of the second hardware counter to measure a stall fraction during a current program execution phase. The measured stall fraction can then be used to predict a stall fraction in a next program execution phase such that optimal voltage and frequency settings can be applied in the next phase based on the predicted stall fraction.
    Type: Grant
    Filed: November 15, 2017
    Date of Patent: February 4, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Shivam Priyadarshi, Anil Krishna, Raguram Damodaran, Jeffrey Todd Bridges, Ryan Wells, Norman Gargash, Rodney Wayne Smith
  • Publication number: 20200004550
    Abstract: Various aspects disclosed herein relate to combining instructions to load data from or store data in memory while processing instructions in a computer processor. More particularly, at least one pattern of multiple memory access instructions that reference a common base register and do not fully utilize an available bus width may be identified in a processor pipeline. In response to determining that the multiple memory access instructions target adjacent memory or non-contiguous memory that can fit on a single cache line, the multiple memory access instructions may be replaced within the processor pipeline with one equivalent memory access instruction that utilizes more of the available bus width than either of the replaced memory access instructions.
    Type: Application
    Filed: June 29, 2018
    Publication date: January 2, 2020
    Inventors: Harsh THAKKER, Thomas Philip SPEIER, Rodney Wayne SMITH, Kevin JAGET, James Norris DIEFFENDERFER, Michael MORROW, Pritha GHOSHAL, Yusuf Cagatay TEKMEN, Brian STEMPEL, Sang Hoon LEE, Manish GARG
  • Publication number: 20190332385
    Abstract: In certain aspects of the disclosure, an apparatus comprises a first scheduling pool associated with a first minimum scheduling latency and a second scheduling pool associated with a second minimum scheduling latency, the second minimum scheduling latency greater than the first minimum scheduling latency. A common instruction picker is coupled to both the first scheduling pool and the second scheduling pool. The common instruction picker may be configured to select a first instruction from the first scheduling pool and a second instruction from the second scheduling pool, and then choose either the first instruction or second instruction for dispatch according to a picking policy.
    Type: Application
    Filed: April 26, 2018
    Publication date: October 31, 2019
    Inventors: Rodney Wayne SMITH, Raghavan MADHAVAN, Luke YEN, Shivam PRIYADARSHI, Yusuf Cagatay TEKMEN
  • Publication number: 20190294443
    Abstract: Providing early pipeline optimization of conditional instructions in processor-based systems is disclosed. In one aspect, an instruction pipeline of a processor-based system detects a mispredicted branch (i.e., following a misprediction of a condition associated with a speculatively executed conditional branch instruction), and records a current state of one or more condition flags as a condition flags snapshot. After a pipeline flush is initiated and a corrected fetch path is restarted, an instruction decode stage of the instruction pipeline uses the condition flags snapshot to apply optimizations to conditional instructions detected within the corrected fetch path. According to some aspects, the condition flags snapshot is subsequently invalidated upon encountering a condition-flag-writing instruction within the corrected fetch path.
    Type: Application
    Filed: March 20, 2018
    Publication date: September 26, 2019
    Inventors: Sandeep Suresh Navada, Michael Scott McIlvaine, Rodney Wayne Smith, Robert Douglas Clancy, Yusuf Cagatay Tekmen, Niket Choudhary, Daren Eugene Streett, Richard Doing, Ankita Upreti
  • Publication number: 20190155608
    Abstract: Aspects of the present disclosure include a method, a device, and a computer-readable medium for restarting an instruction pipeline of a processor that includes a decoupled fetcher. A method comprises detecting, in a processor, a re-fetch event, wherein the processor includes an instruction unit (IU) configured to fetch instructions from a decoupled fetcher (DCF), and simultaneously flushing the IU and the DCF in response to detecting of the re-fetch event.
    Type: Application
    Filed: November 16, 2018
    Publication date: May 23, 2019
    Inventors: Arthur PERAIS, Michael Scott MCILVAINE, Rami Mohammad A. AL SHEIKH, Robert Douglas CLANCY, Luke YEN, Rodney Wayne SMITH
  • Patent number: 10108417
    Abstract: Storing narrow produced values for instruction operands directly in a register map in an out-of-order processor (OoP) is provided. An OoP is provided that includes an instruction processing system. The instruction processing system includes a number of instruction processing stages configured to pipeline the processing and execution of instructions according to a dataflow execution. The instruction processing system also includes a register map table (RMT) configured to store address pointers mapping logical registers to physical registers in a physical register file (PRF) for storing produced data for use by consumer instructions without overwriting logical registers for later executed, out-of-order instructions. In certain aspects, the instruction processing system is configured to write back (i.e., store) narrow values produced by executed instructions directly into the RMT, as opposed to writing the narrow produced values into the PRF in a write back stage.
    Type: Grant
    Filed: September 21, 2015
    Date of Patent: October 23, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Anil Krishna, Rodney Wayne Smith, Sandeep Suresh Navada, Shivam Priyadarshi, Raguram Damodaran
  • Publication number: 20180074568
    Abstract: The disclosure generally relates to dynamic clock and voltage scaling (DCVS) based on program phase. For example, during each program phase, a first hardware counter may count each cycle where a dispatch stall occurs and an oldest instruction in a load queue is a last-level cache miss, a second hardware counter may count total cycles, and a third hardware counter may count committed instructions. Accordingly, a software/firmware mechanism may read the various hardware counters once the committed instruction counter reaches a threshold value and divide a value of the first hardware counter by a value of the second hardware counter to measure a stall fraction during a current program execution phase. The measured stall fraction can then be used to predict a stall fraction in a next program execution phase such that optimal voltage and frequency settings can be applied in the next phase based on the predicted stall fraction.
    Type: Application
    Filed: November 15, 2017
    Publication date: March 15, 2018
    Inventors: Shivam PRIYADARSHI, Anil KRISHNA, Raguram DAMODARAN, Jeffrey Todd BRIDGES, Ryan WELLS, Norman GARGASH, Rodney Wayne SMITH
  • Patent number: 9851774
    Abstract: The disclosure generally relates to dynamic clock and voltage scaling (DCVS) based on program phase. For example, during each program phase, a first hardware counter may count each cycle where a dispatch stall occurs and an oldest instruction in a load queue is a last-level cache miss, a second hardware counter may count total cycles, and a third hardware counter may count committed instructions. Accordingly, a software/firmware mechanism may read the various hardware counters once the committed instruction counter reaches a threshold value and divide a value of the first hardware counter by a value of the second hardware counter to measure a stall fraction during a current program execution phase. The measured stall fraction can then be used to predict a stall fraction in a next program execution phase such that optimal voltage and frequency settings can be applied in the next phase based on the predicted stall fraction.
    Type: Grant
    Filed: January 4, 2016
    Date of Patent: December 26, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Shivam Priyadarshi, Anil Krishna, Raguram Damodaran, Jeffrey Todd Bridges, Ryan Wells, Norman Gargash, Rodney Wayne Smith
  • Patent number: 9823929
    Abstract: A processor includes a queue for storing instructions processed within the context of a current value of a register field, where for some embodiments the instruction is undefined or defined, depending upon the register field at time of processing. After a write instruction (an instruction that writes to the register field) executes, the queue is searched for any entries that contain instructions that depend upon the executed write instruction. Each such entry stores the value of the register field at the time the instruction in the entry was processed. If such an entry is found in the queue and its stored value of the register field does not match the value that the write instruction wrote to the register field, then the processor flushes the pipeline and restarts at a state so as to correctly execute the instruction.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: November 21, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Daren Eugene Streett, Brian Michael Stempel, Thomas Philip Speier, Rodney Wayne Smith, Michael Scott McIlvaine, Kenneth Alan Dockser, James Norris Dieffenderfer
  • Publication number: 20170255569
    Abstract: Systems and methods for managing access to a cache relate to determining one or more execute permissions associated with a write-address of a write-request to the cache. The cache may be a unified cache for storing data as well as instructions. If there is a write-miss in the cache for the write-request, a cache controller may determine whether to implement a write-allocate policy or a write-no-allocate policy for servicing the write-miss, based on the one or more execute permissions. The one or more execute permissions can relate to a privilege level associated with the write-address. Execute permissions of a producing agent which generated the write-request and an execute permission of a consuming agent which can execute from the write-address may be based on the privilege levels of the producing agent and the consuming agent, respectively.
    Type: Application
    Filed: March 1, 2016
    Publication date: September 7, 2017
    Inventors: Thomas Andrew SARTORIUS, James Norris DIEFFENDERFER, Michael William MORROW, Jeffrey Todd BRIDGES, Michael Scott MCILVAINE, Rodney Wayne SMITH, Kenneth Alan DOCKSER, Thomas Philip SPEIER
  • Publication number: 20170192484
    Abstract: The disclosure generally relates to dynamic clock and voltage scaling (DCVS) based on program phase. For example, during each program phase, a first hardware counter may count each cycle where a dispatch stall occurs and an oldest instruction in a load queue is a last-level cache miss, a second hardware counter may count total cycles, and a third hardware counter may count committed instructions. Accordingly, a software/firmware mechanism may read the various hardware counters once the committed instruction counter reaches a threshold value and divide a value of the first hardware counter by a value of the second hardware counter to measure a stall fraction during a current program execution phase. The measured stall fraction can then be used to predict a stall fraction in a next program execution phase such that optimal voltage and frequency settings can be applied in the next phase based on the predicted stall fraction.
    Type: Application
    Filed: January 4, 2016
    Publication date: July 6, 2017
    Inventors: Shivam Priyadarshi, Anil Krishna, Raguram Damodaran, Jeffrey Todd Bridges, Ryan Wells, Norman Gargash, Rodney Wayne Smith
  • Publication number: 20170090508
    Abstract: The clock frequency of a processor is reduced in response to a dispatch stall due to a cache miss. In an embodiment, the processor clock frequency is reduced for a load instruction that causes a last level cache miss, provided that the load instruction is the oldest load instruction and the number of consecutive processor cycles in which there is a dispatch stall exceeds a threshold, and provided that the total number of processor cycles since the last level cache miss does not exceed some specified number.
    Type: Application
    Filed: September 25, 2015
    Publication date: March 30, 2017
    Inventors: Shivam PRIYADARSHI, Anil KRISHNA, Raguram DAMODARAN, Jeffrey Todd BRIDGES, Thomas Philip SPEIER, Rodney Wayne SMITH, Keith Alan BOWMAN, David Joseph Winston HANSQUINE
  • Publication number: 20170060593
    Abstract: Systems and methods relate to a hierarchical register file system including a level 1 physical register file (L1 PRF) and a backing physical register file (PRF). A subset of productions of instructions executed in an instruction pipeline of a processor which have a high likelihood of use for one or more future instructions are identified. The subset of productions are stored in the L1 PRF, while all productions are stored in the backing PRF.
    Type: Application
    Filed: September 2, 2015
    Publication date: March 2, 2017
    Inventors: Anil KRISHNA, Rodney Wayne SMITH, Sandeep Suresh NAVADA, Shivam PRIYADARSHI, Niket Kumar CHOUDHARY, Raguram DAMODARAN
  • Publication number: 20170046160
    Abstract: Systems and methods of handling a register file include, in a first instruction set architecture (ISA) mode assigning a first subset of tracking resources to logical registers of a first logical register subset for tracking mappings to full granularity and first lower granularity physical registers of a first physical register subset, and assigning a second subset of tracking resources to logical registers of a second logical register subset for tracking mappings to second lower granularity physical registers of the first physical register subset. The second subset of tracking resources are configured for tracking at least the logical registers of the second logical register subset mappings to physical registers of a second physical register subset in a second ISA mode, wherein the second physical register subset is available to the second ISA mode but not the first ISA mode.
    Type: Application
    Filed: March 31, 2016
    Publication date: February 16, 2017
    Inventors: Kiran Ravi SETH, Rodney Wayne SMITH, Yusuf Cagatay TEKMEN, Raghaven MADHAVAN
  • Publication number: 20170046164
    Abstract: A load instruction, for loading a register among a set of registers, is scheduled. Associated with scheduling the load instruction, a register dependency vector, corresponding to the register, is set to a state identifying the load instruction. A consumer instruction is scheduled, having a set of operand register and a target register, the register being in the set of operand registers. A target register dependency vector, corresponding to the target register is set in the memory. Based at least in part on the register being in the set of operand registers, a value of the target register dependency vector identifies the load instruction. Optionally, upon receiving a cache miss notice associated with the load instruction, the target register dependency vector is retrieved.
    Type: Application
    Filed: September 25, 2015
    Publication date: February 16, 2017
    Inventors: Raghavan MADHAVAN, Kiran RAVI SETH, Yusuf Cagatay TEKMEN, Rodney Wayne SMITH