Patents by Inventor Robert T. Golla

Robert T. Golla has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200174903
    Abstract: Techniques are disclosed relating to cache debug using control registers based on debug commands. In some embodiments, an apparatus includes a processor core, debug circuitry, and control circuitry. In some embodiments, the debug circuitry is configured to receive external debug inputs and send abstract commands to the processor core based on the external debug inputs. In some embodiments, the control circuitry is configured to, in response to an abstract command to read data from the cache: write cache address information to a first control register, assert a trigger signal to cause a read of the data from the cache to a second control register, based on the cache address information in the first control register, and send data from the second control register to the debug circuitry. In various embodiments, this may facilitate hardware cache debug using debug circuitry that also controls software debugging.
    Type: Application
    Filed: January 31, 2019
    Publication date: June 4, 2020
    Inventors: Jama I. Barreh, Robert T. Golla, Thomas M. Wicki, Matthew B. Smittle
  • Publication number: 20200174794
    Abstract: Techniques are disclosed relating to the handling of exceptions generated by illegal instructions in a processor. In an embodiment, a processor may be configured to fetch instructions defined according to an instruction set architecture (ISA). The ISA may include a set of uncompressed instructions and a set of compressed instructions. The processor may further be configured to, upon detecting a given one of the set of compressed instructions, cause a copy of the given compressed instruction to be saved and convert the given compressed instruction to a corresponding given uncompressed instruction. The processor may also be configured to detect that the given uncompressed instruction is illegal and was converted from the given compressed instruction, and based at least in part on these, cause an illegal instruction exception to be generated using the copy of the given compressed instruction.
    Type: Application
    Filed: October 23, 2019
    Publication date: June 4, 2020
    Inventors: Robert T. Golla, Matthew B. Smittle
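The exception flow in the abstract above (publication 20200174794) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `decode` function, the `IllegalInstruction` exception, the toy rule that encodings below 0x10000 are compressed, and the made-up expansion table are assumptions for illustration, not the claimed hardware or any real ISA encoding.

```python
class IllegalInstruction(Exception):
    """Raised with the encoding that should be reported to software."""

    def __init__(self, encoding: int) -> None:
        super().__init__(f"illegal instruction 0x{encoding:x}")
        self.encoding = encoding


# Made-up tables (assumptions): compressed -> uncompressed expansions,
# and the set of uncompressed encodings treated as legal.
EXPANSIONS = {0x4501: 0x00A00093, 0x0000: 0xFFFFFFFF}
LEGAL_UNCOMPRESSED = {0x00A00093}


def is_compressed(instr: int) -> bool:
    # Toy rule (assumption): anything that fits in 16 bits is compressed.
    return instr < 0x10000


def decode(instr: int) -> int:
    """Return the uncompressed instruction, or raise IllegalInstruction."""
    saved_copy = None
    if is_compressed(instr):
        saved_copy = instr  # keep a copy of the compressed form
        instr = EXPANSIONS.get(instr, 0xFFFFFFFF)  # convert to uncompressed
    if instr not in LEGAL_UNCOMPRESSED:
        # If the illegal instruction came from a compressed one, report the
        # saved compressed copy rather than the expanded form.
        raise IllegalInstruction(saved_copy if saved_copy is not None else instr)
    return instr
```

In this sketch, an illegal instruction that originated as a compressed encoding is reported using the saved compressed copy, so the handler sees what was actually fetched rather than the expanded form.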
  • Publication number: 20200174071
    Abstract: Techniques are disclosed relating to using non-debug path circuitry to perform debug commands. In some embodiments, an apparatus includes a processor core that includes path circuitry configured to access data for instructions executed by the processor core and storage elements which the path circuitry is configured to access via one or more ports. In some embodiments, the apparatus includes debug circuitry configured to receive external debug inputs and send abstract commands to the processor core based on the external debug inputs.
    Type: Application
    Filed: January 31, 2019
    Publication date: June 4, 2020
    Inventors: Deepak Panwar, Muhammad Tauseef Rab, Robert T. Golla, Matthew B. Smittle
  • Publication number: 20200004549
    Abstract: An instruction buffer for a processor configured to execute multiple threads is disclosed. The instruction buffer is configured to receive instructions from a fetch unit and provide instructions to a selection unit. The instruction buffer includes one or more memory arrays comprising a plurality of entries configured to store instructions and/or other information (e.g., program counter addresses). One or more indicators are maintained by the processor and correspond to the plurality of threads. The one or more indicators are usable such that for instructions received by the instruction buffer, one or more of the plurality of entries of a memory array can be determined as a write destination for the received instructions, and for instructions to be read from the instruction buffer (and sent to a selection unit), one or more entries can be determined as the correct source location from which to read.
    Type: Application
    Filed: July 8, 2019
    Publication date: January 2, 2020
    Inventors: Jama I. Barreh, Robert T. Golla, Manish K. Shah
  • Patent number: 10346173
    Abstract: An instruction buffer for a processor configured to execute multiple threads is disclosed. The instruction buffer is configured to receive instructions from a fetch unit and provide instructions to a selection unit. The instruction buffer includes one or more memory arrays comprising a plurality of entries configured to store instructions and/or other information (e.g., program counter addresses). One or more indicators are maintained by the processor and correspond to the plurality of threads. The one or more indicators are usable such that for instructions received by the instruction buffer, one or more of the plurality of entries of a memory array can be determined as a write destination for the received instructions, and for instructions to be read from the instruction buffer (and sent to a selection unit), one or more entries can be determined as the correct source location from which to read.
    Type: Grant
    Filed: March 7, 2011
    Date of Patent: July 9, 2019
    Assignee: Oracle International Corporation
    Inventors: Jama I. Barreh, Robert T. Golla, Manish K. Shah
  • Patent number: 9710042
    Abstract: Embodiments of the invention provide adaptive power ramp control (APRC) in microprocessors. One implementation of the APRC can compute a present core power and a present power ramp condition in the microprocessor, for example, to determine whether the present power is in a particular predefined control zone and whether the present power ramp is greater than a predefined threshold for that control zone. Those determinations can indicate a likelihood of an imminent, undesirable power ramp condition and can inform entry into a control mode. The APRC can generate an appropriate stall control signal in response to its present control mode, and the stall control signal can stall operation of at least one functional unit of the microprocessor according to a predefined stall pattern. This can effectively combat the imminent power ramp condition by reducing the power usage of the microprocessor.
    Type: Grant
    Filed: August 15, 2014
    Date of Patent: July 18, 2017
    Assignee: Oracle International Corporation
    Inventors: Haowei Zhang, Xiaoying Shen, Sebastian Turullols, Robert T. Golla
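The control-zone check in the abstract above (patent 9710042) might be sketched as follows. This is only a toy model under stated assumptions: the `ControlZone` fields, the `select_stall_pattern` function, the use of a simple power difference as the "ramp", and the bit-tuple stall patterns are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ControlZone:
    low: float                      # inclusive lower power bound of the zone
    high: float                     # exclusive upper power bound of the zone
    ramp_threshold: float           # largest tolerated power increase per sample
    stall_pattern: Tuple[int, ...]  # 1 = stall the unit that cycle, 0 = run


def select_stall_pattern(
    previous_power: float,
    present_power: float,
    zones: List[ControlZone],
) -> Optional[Tuple[int, ...]]:
    """Return the stall pattern to apply, or None if no control mode is entered."""
    ramp = present_power - previous_power  # assumed definition of the power ramp
    for zone in zones:
        in_zone = zone.low <= present_power < zone.high
        if in_zone and ramp > zone.ramp_threshold:
            return zone.stall_pattern
    return None
```

Giving higher-power zones a lower ramp threshold and a denser stall pattern is one plausible way to react more aggressively as power approaches a limit; the abstract itself does not specify the zone parameters.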
  • Patent number: 9690625
    Abstract: A system and method for managing the dynamic sharing of processor resources between threads in a multi-threaded processor are disclosed. Out-of-order allocation and deallocation may be employed to efficiently use the various resources of the processor. Each element of an allocate vector may indicate whether a corresponding resource is available for allocation. A search of the allocate vector may be performed to identify resources available for allocation. Upon allocation of a resource, a thread identifier associated with the thread to which the resource is allocated may be associated with the allocate vector entry corresponding to the allocated resource. Multiple instances of a particular resource type may be allocated or deallocated in a single processor execution cycle. Each element of a deallocate vector may indicate whether a corresponding resource is ready for deallocation. Examples of resources that may be dynamically shared between threads are reorder buffers, load buffers and store buffers.
    Type: Grant
    Filed: June 16, 2009
    Date of Patent: June 27, 2017
    Assignee: Oracle America, Inc.
    Inventor: Robert T. Golla
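The allocate and deallocate vectors in the abstract above (patent 9690625) can be modeled with a short Python sketch. The class `SharedResourcePool`, its method names, and the list-based representation are hypothetical choices for illustration; the patent describes processor hardware, not this code.

```python
from typing import List, Optional


class SharedResourcePool:
    """Toy model of a resource (e.g. reorder-buffer entries) shared by threads."""

    def __init__(self, size: int) -> None:
        # allocate_vector[i] is True when entry i is available for allocation.
        self.allocate_vector: List[bool] = [True] * size
        # deallocate_vector[i] is True when entry i is ready to be freed.
        self.deallocate_vector: List[bool] = [False] * size
        # Thread identifier recorded for each allocated entry.
        self.owner: List[Optional[int]] = [None] * size

    def allocate(self, thread_id: int, count: int = 1) -> List[int]:
        """Search the allocate vector and claim up to `count` free entries."""
        granted: List[int] = []
        for index, free in enumerate(self.allocate_vector):
            if len(granted) == count:
                break
            if free:
                self.allocate_vector[index] = False
                self.owner[index] = thread_id
                granted.append(index)
        return granted

    def mark_ready(self, index: int) -> None:
        """Flag an entry as ready for deallocation, in any order."""
        self.deallocate_vector[index] = True

    def deallocate_ready(self) -> List[int]:
        """Free every entry flagged in the deallocate vector in one step."""
        freed: List[int] = []
        for index, ready in enumerate(self.deallocate_vector):
            if ready:
                self.deallocate_vector[index] = False
                self.allocate_vector[index] = True
                self.owner[index] = None
                freed.append(index)
        return freed
```

Because freed entries return to the pool individually, a later allocation can reuse a gap in the middle of the array, which is the out-of-order behavior the abstract refers to.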
  • Patent number: 9672298
    Abstract: Techniques for executing versioned memory access instructions. In one embodiment, a processor is configured to execute versioned store instructions of a first thread within a first mode of operation. In this embodiment, in the first mode of operation, the processor is configured to retire a versioned store instruction only after a version comparison has been performed for the versioned store instruction. In this embodiment the processor is configured to suppress retirement of instructions in the first thread that are younger than an oldest versioned store instruction until the oldest versioned store instruction has retired. In some embodiments, the processor is configured to execute versioned store instructions of a given thread within a second mode of operation, in which the processor is configured to retire outstanding versioned store instructions before a version comparison has been performed.
    Type: Grant
    Filed: May 1, 2014
    Date of Patent: June 6, 2017
    Assignee: Oracle International Corporation
    Inventors: Zoran Radovic, Jared C. Smolens, Robert T. Golla, Paul J. Jordan, Mark A. Luttrell
  • Patent number: 9665375
    Abstract: Systems and methods for efficient thread arbitration in a threaded processor with dynamic resource allocation. A processor includes a resource shared by multiple threads. The resource includes an array with multiple entries, each of which may be allocated for use by any thread. Control logic detects a load miss to memory, wherein the miss is associated with a latency greater than a given threshold. The load instruction or an immediately younger instruction is selected for replay for an associated thread. A pipeline flush and replay for the associated thread begins with the selected instruction. Instructions younger than the load instruction are held at a given pipeline stage until the load instruction completes. During replay, this hold prevents resources from being allocated to the associated thread while the load instruction is being serviced.
    Type: Grant
    Filed: April 26, 2012
    Date of Patent: May 30, 2017
    Assignee: Oracle International Corporation
    Inventors: Yuan C. Chou, Robert T. Golla, Mark A. Luttrell
  • Patent number: 9304767
    Abstract: Systems and methods for providing single cycle movement of data between a floating-point register file (FRF) and a general purpose or integer register file (IRF) of a microprocessor system are provided. The system may include an integer execution unit operative to execute instructions with single cycle latency, a floating-point execution unit, a working register file (WRF), an FRF, and an IRF. To achieve the single cycle movement functionality, the integer execution unit may physically own the WRF, IRF, and FRF, and may monitor and control any dependencies between them. Thus, since the integer execution unit has direct read access to both the IRF and the FRF, data may be moved between the two register files using the single cycle operation of the integer execution unit, without the need to store and load the data from memory.
    Type: Grant
    Filed: June 2, 2009
    Date of Patent: April 5, 2016
    Assignee: Oracle America, Inc.
    Inventors: Christopher Olson, Robert T. Golla, Jeffrey S. Brooks
  • Patent number: 9286075
    Abstract: Systems and methods for efficient out-of-order dynamic deallocation of entries within a shared storage resource in a processor. A processor comprises a unified pick queue that includes an array configured to dynamically allocate any entry of a plurality of entries for a decoded and renamed instruction. This instruction may correspond to any available active threads supported by the processor. The processor includes circuitry configured to determine whether an instruction corresponding to an allocated entry of the plurality of entries is dependent on a speculative instruction and whether the instruction has a fixed instruction execution latency. In response to determining the instruction is not dependent on a speculative instruction, the instruction has a fixed instruction execution latency, and said latency has transpired, the circuitry may deallocate the instruction from the allocated entry.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: March 15, 2016
    Assignee: Oracle America, Inc.
    Inventors: Matthew B. Smittle, Robert T. Golla
  • Publication number: 20160048187
    Abstract: Embodiments of the invention provide adaptive power ramp control (APRC) in microprocessors. One implementation of the APRC can compute a present core power and a present power ramp condition in the microprocessor, for example, to determine whether the present power is in a particular predefined control zone and whether the present power ramp is greater than a predefined threshold for that control zone. Those determinations can indicate a likelihood of an imminent, undesirable power ramp condition and can inform entry into a control mode. The APRC can generate an appropriate stall control signal in response to its present control mode, and the stall control signal can stall operation of at least one functional unit of the microprocessor according to a predefined stall pattern. This can effectively combat the imminent power ramp condition by reducing the power usage of the microprocessor.
    Type: Application
    Filed: August 15, 2014
    Publication date: February 18, 2016
    Inventors: Haowei Zhang, Xiaoying Shen, Sebastian Turullols, Robert T. Golla
  • Patent number: 9262171
    Abstract: Systems and methods for identification of dependent instructions on speculative load operations in a processor. A processor allocates entries of a unified pick queue for decoded and renamed instructions. Each entry of a corresponding dependency matrix is configured to store a dependency bit for each other instruction in the pick queue. The processor speculates that loads will hit in the data cache, hit in the TLB and not have a read after write (RAW) hazard. For each unresolved load, the pick queue tracks dependent instructions via dependency vectors based upon the dependency matrix. If a load speculation is found to be incorrect, dependent instructions in the pick queue are reset to allow for subsequent picking, and dependent instructions in flight are canceled. On completion of a load miss, dependent operations are re-issued. On resolution of a TLB miss or RAW hazard, the original load is replayed and dependent operations are issued again from the pick queue.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: February 16, 2016
    Assignee: Oracle America, Inc.
    Inventors: Robert T. Golla, Matthew B. Smittle, Xiang Shan Li
  • Publication number: 20150317338
    Abstract: Techniques for executing versioned memory access instructions. In one embodiment, a processor is configured to execute versioned store instructions of a first thread within a first mode of operation. In this embodiment, in the first mode of operation, the processor is configured to retire a versioned store instruction only after a version comparison has been performed for the versioned store instruction. In this embodiment the processor is configured to suppress retirement of instructions in the first thread that are younger than an oldest versioned store instruction until the oldest versioned store instruction has retired. In some embodiments, the processor is configured to execute versioned store instructions of a given thread within a second mode of operation, in which the processor is configured to retire outstanding versioned store instructions before a version comparison has been performed.
    Type: Application
    Filed: May 1, 2014
    Publication date: November 5, 2015
    Applicant: Oracle International Corporation
    Inventors: Zoran Radovic, Jared C. Smolens, Robert T. Golla, Paul J. Jordan, Mark A. Luttrell
  • Patent number: 9122487
    Abstract: A system and method for balancing instruction loads between multiple execution units are disclosed. One or more execution units may be represented by a slot configured to accept instructions on behalf of the execution unit(s). A decode unit may assign instructions to a particular slot for subsequent scheduling for execution. Slot assignments may be made based on an instruction's type and/or on a history of previous slot assignments. A cumulative slot assignment history may be maintained in a bias counter, the value of which reflects the bias of previous slot assignments. Slot assignments may be determined based on the value of the bias counter, in order to balance the instruction load across all slots, and all execution units. The bias counter may reflect slot assignments made only within a desired historical window. A separate data structure may store data reflecting the actual slot assignments made during the desired historical window.
    Type: Grant
    Filed: June 23, 2009
    Date of Patent: September 1, 2015
    Assignee: Oracle America, Inc.
    Inventors: Robert T. Golla, Gregory F. Grohoski
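The bias-counter balancing in the abstract above (patent 9122487) can be sketched for the two-slot case. The `SlotBalancer` class, the sign convention of the counter, and the deque used as the history structure are assumptions made for this illustration only.

```python
from collections import deque
from typing import Deque, Optional


class SlotBalancer:
    """Toy two-slot balancer driven by a windowed bias counter."""

    def __init__(self, window: int = 8) -> None:
        # bias > 0: slot 0 received more of the recent assignments;
        # bias < 0: slot 1 received more.
        self.bias = 0
        # Separate record of the actual assignments inside the window.
        self.history: Deque[int] = deque()
        self.window = window

    def assign(self, required_slot: Optional[int] = None) -> int:
        """Pick a slot; `required_slot` models instruction types tied to one slot."""
        if required_slot is not None:
            slot = required_slot
        else:
            # Flexible instruction: steer it away from the favored slot.
            slot = 1 if self.bias > 0 else 0
        self._record(slot)
        return slot

    def _record(self, slot: int) -> None:
        if len(self.history) == self.window:
            # Remove the contribution of the assignment leaving the window.
            oldest = self.history.popleft()
            self.bias -= 1 if oldest == 0 else -1
        self.history.append(slot)
        self.bias += 1 if slot == 0 else -1
```

Keeping the per-assignment history alongside the counter is what lets the counter reflect only the recent window: each expiring assignment is subtracted as a new one is added.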
  • Patent number: 9058180
    Abstract: Systems and methods for efficient picking of instructions for out-of-order issue and execution in a processor. In one embodiment, a processor comprises a unified pick queue that is dynamically allocated. Each entry is configured to store age and dependency information relative to other decoded instructions. Also, each entry stores a picked field, which when asserted indicates the decoded instruction has already been picked for out-of-order issue and execution. When asserted, a trigger field indicates a result of a corresponding decoded instruction will be available a predetermined number of clock cycles afterward. A younger instruction dependent on a result of an older instruction is ready to be picked before the result of the older instruction is available. In this case, the older instruction has asserted picked and trigger fields.
    Type: Grant
    Filed: June 29, 2009
    Date of Patent: June 16, 2015
    Assignee: Oracle America, Inc.
    Inventors: Robert T. Golla, Matthew B. Smittle, Mark A. Luttrell, Xiang Shan Li
  • Patent number: 8904156
    Abstract: A multithreaded microprocessor includes an instruction fetch unit including a perceptron-based conditional branch prediction unit configured to provide, for each of one or more concurrently executing threads, a direction branch prediction. The conditional branch prediction unit includes a plurality of storages each including a plurality of entries. Each entry may be configured to store one or more prediction values. Each prediction value of a given storage may correspond to at least one conditional branch instruction in a cache line. The conditional branch prediction unit may generate a separate index value for accessing each storage by generating a first index value for accessing a first storage by combining one or more portions of a received instruction fetch address, and generating each other index value for accessing the other storages by combining the first index value with a different portion of direction branch history information.
    Type: Grant
    Filed: October 14, 2009
    Date of Patent: December 2, 2014
    Assignee: Oracle America, Inc.
    Inventors: Manish K. Shah, Gregory F. Grohoski, Robert T. Golla, Jama I. Barreh
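The index generation in the abstract above (patent 8904156) can be illustrated with a small sketch. The abstract only says the values are "combined"; using XOR, an 8-bit index, and the particular address and history bit ranges below are assumptions, as are the function names.

```python
from typing import List

INDEX_BITS = 8                       # assumed index width
INDEX_MASK = (1 << INDEX_BITS) - 1


def first_index(fetch_address: int) -> int:
    """Combine two portions of the fetch address into the first index."""
    low = (fetch_address >> 2) & INDEX_MASK
    high = (fetch_address >> (2 + INDEX_BITS)) & INDEX_MASK
    return low ^ high


def table_indices(fetch_address: int, history: int, num_tables: int) -> List[int]:
    """Return one index per storage; storage 0 uses the address-only index."""
    base = first_index(fetch_address)
    indices = [base]
    for table in range(1, num_tables):
        # Each remaining storage uses a different slice of the branch history.
        history_slice = (history >> ((table - 1) * INDEX_BITS)) & INDEX_MASK
        indices.append(base ^ history_slice)
    return indices
```

For example, `table_indices(0x1234, 0xABCD, 3)` returns `[0x89, 0x44, 0x22]`: the first index comes from the address alone, and the other two mix in the low and high bytes of the history.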
  • Patent number: 8832464
    Abstract: A processor including instruction support for implementing hash algorithms may issue, for execution, programmer-selectable hash instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include hash instructions defined within the ISA. In addition, the hash instructions may be executable by the cryptographic unit to implement a hash that is compliant with one or more respective hash algorithm specifications. In response to receiving a particular hash instruction defined within the ISA, the cryptographic unit may retrieve a set of input data blocks from a predetermined set of architectural registers of the processor, and generate a hash value of the set of input data blocks according to a hash algorithm that corresponds to the particular hash instruction.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: September 9, 2014
    Assignee: Oracle America, Inc.
    Inventors: Christopher H. Olson, Jeffrey S. Brooks, Robert T. Golla
  • Patent number: 8769246
    Abstract: In one embodiment, a multithreaded processor includes a plurality of buffers, each configured to store instructions corresponding to a respective thread. The multithreaded processor also includes a pick unit coupled to the plurality of buffers. The pick unit may pick, from at least one of the buffers in a given cycle, a valid instruction based upon a thread selection algorithm. The pick unit may further cancel, in the given cycle, the picking of the valid instruction in response to receiving a cancel indication.
    Type: Grant
    Filed: February 14, 2011
    Date of Patent: July 1, 2014
    Assignee: Open Computing Trust I & II
    Inventor: Robert T. Golla
  • Publication number: 20130297910
    Abstract: Systems and methods for efficient thread arbitration in a threaded processor with dynamic resource allocation. A processor includes a resource shared by multiple threads. The resource includes entries which may be allocated for use by any thread. Control logic detects long latency instructions. Long latency instructions have a latency greater than a given threshold. One example is a load instruction that has a read-after-write (RAW) data dependency on a store instruction that misses a last-level data cache. The long latency instruction or an immediately younger instruction is selected for replay for an associated thread. A pipeline flush and replay for the associated thread begins with the selected instruction. Instructions younger than the long latency instruction are held at a given pipeline stage until the long latency instruction completes. During replay, this hold prevents resources from being allocated to the associated thread while the long latency instruction is being serviced.
    Type: Application
    Filed: May 3, 2012
    Publication date: November 7, 2013
    Inventors: Jared C. Smolens, Robert T. Golla, Mark A. Luttrell, Paul J. Jordan