Patents by Inventor Paul J. Jordan

Paul J. Jordan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8078942
    Abstract: In one embodiment, a processor comprises a first register file configured to store speculative register state, a second register file configured to store committed register state, a check circuit and a control unit. The first register file is protected by a first error protection scheme and the second register file is protected by a second error protection scheme. A check circuit is coupled to receive a value and corresponding one or more check bits read from the first register file to be committed to the second register file in response to the processor selecting a first instruction to be committed. The check circuit is configured to detect an error in the value responsive to the value and the check bits. Coupled to the check circuit, the control unit is configured to cause reexecution of the first instruction responsive to the error detected by the check circuit.
    Type: Grant
    Filed: September 4, 2007
    Date of Patent: December 13, 2011
    Assignee: Oracle America, Inc.
    Inventors: Paul J. Jordan, Christopher H. Olson
  • Publication number: 20110296142
    Abstract: A processor including instruction support for large-operand instructions that use multiple register windows may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may also include an instruction execution unit that, during operation, receives instructions for execution from the instruction fetch unit and executes a large-operand instruction defined within the ISA, where execution of the large-operand instruction is dependent upon a plurality of registers arranged within a plurality of register windows. The processor may further include control circuitry (which may be included within the fetch unit, the execution unit, or elsewhere within the processor) that determines whether one or more of the register windows depended upon by the large-operand instruction are not present. In response to determining that one or more of these register windows are not present, the control circuitry causes them to be restored.
    Type: Application
    Filed: May 28, 2010
    Publication date: December 1, 2011
    Inventors: Christopher H. Olson, Paul J. Jordan, Jama I. Barreh
  • Patent number: 7937556
    Abstract: In one embodiment, a system comprises one or more registers configured to store a plurality of values that identify a virtual address space (collectively a tag), a translation lookaside buffer (TLB), and a control unit coupled to the TLB and the one or more registers. The control unit is configured to detect whether or not the tag has changed and in response to a change in the tag, map the changed tag to an identifier having fewer bits than the total number of bits in the tag, and provide the current identifier to the TLB. The TLB is configured to detect a hit/miss in response to the identifier. A similar method is also contemplated.
    Type: Grant
    Filed: April 30, 2008
    Date of Patent: May 3, 2011
    Assignee: Oracle America, Inc.
    Inventors: Paul J. Jordan, Manish K. Shah, Gregory F. Grohoski
  • Publication number: 20100332786
    Abstract: A system and method for invalidating obsolete virtual/real address to physical address translations may employ translation lookaside buffers to cache translations. TLB entries may be invalidated in response to changes in the virtual memory space, and thus may need to be demapped. A non-cacheable unit (NCU) residing on a processor may be configured to receive and manage a global TLB demap request from a thread executing on a core residing on the processor. The NCU may send the request to local cores and/or to NCUs of external processors in a multiprocessor system using a hardware instruction to broadcast to all cores and/or processors or to multicast to designated cores and/or processors. The NCU may track completion of the demap operation across the cores and/or processors using one or more counters, and may send an acknowledgement to the initiator of the demap request when the global demap request has been satisfied.
    Type: Application
    Filed: June 29, 2009
    Publication date: December 30, 2010
    Inventors: Gregory F. Grohoski, Paul J. Jordan, Mark A. Luttrell, Zeid Hartuon Samoail
  • Publication number: 20100332787
    Abstract: A system and method for servicing translation lookaside buffer (TLB) misses may manage separate input and output pipelines within a memory management unit. A pending request queue (PRQ) in the input pipeline may include an instruction-related portion storing entries for instruction TLB (ITLB) misses and a data-related portion storing entries for potential or actual data TLB (DTLB) misses. A DTLB PRQ entry may be allocated to each load/store instruction selected from the pick queue. The system may select an ITLB- or DTLB-related entry for servicing dependent on prior PRQ entry selection(s). A corresponding entry may be held in a translation table entry return queue (TTERQ) in the output pipeline until a matching address translation is received from system memory. PRQ and/or TTERQ entries may be deallocated when a corresponding TLB miss is serviced. PRQ and/or TTERQ entries associated with a thread may be deallocated in response to a thread flush.
    Type: Application
    Filed: June 29, 2009
    Publication date: December 30, 2010
    Inventors: Gregory F. Grohoski, Paul J. Jordan, Mark A. Luttrell, Zeid Hartuon Samoail, Robert T. Golla
  • Publication number: 20100333098
    Abstract: Various techniques for dynamically allocating instruction tags and using those tags are disclosed. These techniques may apply to processors supporting out-of-order execution and to architectures that supports multiple threads. A group of instructions may be assigned a tag value from a pool of available tag values. A tag value may be usable to determine the program order of a group of instructions relative to other instructions in a thread. After the group of instructions has been (or is about to be) committed, the tag value may be freed so that it can be re-used on a second group of instructions. Tag values are dynamically allocated between threads; accordingly, a particular tag value or range of tag values is not dedicated to a particular thread.
    Type: Application
    Filed: June 30, 2009
    Publication date: December 30, 2010
    Inventors: Paul J. Jordan, Robert T. Golla, Jama I. Barreh
  • Patent number: 7861063
    Abstract: In one embodiment, a processor comprises a fetch unit and a pick unit. The fetch unit is configured to fetch instructions for execution by the processor. The pick unit is configured to schedule instructions fetched by the fetch unit for execution in the processor. The pick unit is configured to inhibit scheduling a delayed control transfer instruction (DCTI) until a delay slot instruction of the DCTI is available for scheduling. For example, in some embodiments, the pick unit may inhibit scheduling until the delay slot instruction is written to an instruction buffer, until the delay slot instruction is fetched, etc.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: December 28, 2010
    Assignee: Oracle America, Inc.
    Inventors: Robert T. Golla, Paul J. Jordan, Jama I. Barreh
  • Publication number: 20100325188
    Abstract: A processor including instruction support for implementing large-operand multiplication may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include an instruction execution unit comprising a hardware multiplier datapath circuit, where the hardware multiplier datapath circuit is configured to multiply operands having a maximum number of bits M.
    Type: Application
    Filed: June 19, 2009
    Publication date: December 23, 2010
    Inventors: Christopher H. Olson, Jeffrey S. Brooks, Robert T. Golla, Paul J. Jordan
  • Publication number: 20100274994
    Abstract: Various techniques for mitigating dependencies between groups of instructions are disclosed. In one embodiment, such dependencies include “evil twin” conditions, in which a first floating-point instruction has as a destination a first portion of a logical floating-point register (e.g., a single-precision write), and in which a second, subsequent floating-point instruction has as a source the first portion and a second portion of the same logical floating-point register (e.g., a double-precision read). The disclosed techniques may be applicable in a multithreaded processor implementing register renaming. In one embodiment, a processor may enter an operating mode in which detection of evil twin “producers” (e.g., single-precision writes) causes the instruction sequence to be modified to break potential dependencies. Modification of the instruction sequence may continue until one or more exit criteria are reached (e.g., committing a predetermined number of single-precision writes).
    Type: Application
    Filed: April 22, 2009
    Publication date: October 28, 2010
    Inventors: Robert T. Golla, Paul J. Jordan, Jama I. Barreh, Matthew B. Smittel, Yuan C. Chou, Jared C. Smolens
  • Publication number: 20100268920
    Abstract: A mechanism for handling unfused multiply-add accrued exception bits includes a processor including a floating point unit, a storage, and exception logic. The floating-point unit may be configured to execute an unfused multiply-accumulate instruction defined with the instruction set architecture (ISA). The unfused multiply-accumulate instruction may include a multiply sub-operation and an accumulate sub-operation. The storage may be configured to maintain floating-point exception state information. The exception logic may be configured to capture the floating-point exception state after completion of the multiply sub-operation and prior to completion of the accumulate sub-operation, for example, and to update the storage to reflect the floating-point exception state.
    Type: Application
    Filed: April 16, 2009
    Publication date: October 21, 2010
    Inventors: Jeffrey S. Brooks, Paul J. Jordan, Christopher H. Olson
  • Patent number: 7779238
    Abstract: A system and method for precisely identifying an instruction causing a performance-related event is disclosed. The instruction may be detected while in a pipeline stage of a microprocessor preceding a writeback stage and the microprocessor's architectural state may not be updated until after information identifying the instruction is captured. The instruction may be flushed from the pipeline, along with other instructions from the same thread. A hardware trap may be taken when the instruction is detected and/or when an event counter overflows or is within a given range of overflowing. A software trap handler may capture and/or log information identifying the instruction, such as one or more extended address elements, before returning control and initiating a retry of the instruction. The captured and/or logged information may be stored in an event space database usable by a data space profiler to identify performance bottlenecks in the application containing the instruction.
    Type: Grant
    Filed: October 30, 2006
    Date of Patent: August 17, 2010
    Assignee: Oracle America, Inc.
    Inventors: Nicolai Kosche, Gregory F. Grohoski, Paul J. Jordan
  • Publication number: 20100169611
    Abstract: A system and method for reducing branch misprediction penalty. In response to detecting a mispredicted branch instruction, circuitry within a microprocessor identifies a predetermined condition prior to retirement of the branch instruction. Upon identifying this condition, the entire corresponding pipeline is flushed prior to retirement of the branch instruction, and instruction fetch is started at a corresponding address of an oldest instruction in the pipeline immediately prior to the flushing of the pipeline. The correct outcome is stored prior to the pipeline flush. In order to distinguish the mispredicted branch from other instructions, identification information may be stored alongside the correct outcome. One example of the predetermined condition being satisfied is in response to a timer reaching a predetermined threshold value, wherein the timer begins incrementing in response to the mispredicted branch detection and resets at retirement of the mispredicted branch.
    Type: Application
    Filed: December 30, 2008
    Publication date: July 1, 2010
    Inventors: Yuan C. Chou, Robert T. Golla, Mark A. Luttrell, Paul J. Jordan, Manish Shah
  • Patent number: 7702887
    Abstract: A method and mechanism for monitoring events in a processing system. A performance monitoring mechanism includes is configured to store a count of events in an event counter. Periodically, the count stored in the event counter is updated to a new count. If the new count equals a predetermined value, an indication that the count equals the predetermined value is conveyed. If the new count does not equal the predetermined value, but is within a given epsilon of the predetermined value and the occurrence of a corresponding event is detected, an indication that the count equals the predetermined value is conveyed. The mechanism is further configured to suppress event counts which correspond to mis-speculations.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: April 20, 2010
    Assignee: Sun Microsystems, Inc.
    Inventors: Gregory F. Grohoski, Paul J. Jordan, Yue Chang
  • Patent number: 7676655
    Abstract: A method and mechanism for controlling threads in a multithreaded multicore processor. A processor includes multiple cores, each of which are capable of executing multiple threads. A control register which is shared by each of the cores is utilized to control the status of the threads in the processing system. In one embodiment, the shared register includes a single bit for each thread in the processor. Depending upon the value written to a bit of the shared register, one of three results may occur with respect to a thread which corresponds to the bit. In one embodiment, writing a “0” to a bit of the shared register will cause a corresponding thread to be Parked. Writing a “1” to a bit of the shared register will cause a corresponding thread to either be UnParked or be Reset. Whether writing a “1” to a bit of the register causes the corresponding thread to be UnParked or Reset depends upon a state of the processor.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: March 9, 2010
    Assignee: Sun Microsystems, Inc.
    Inventor: Paul J. Jordan
  • Publication number: 20090327646
    Abstract: In one embodiment, a system comprises one or more registers configured to store a plurality of values that identify a virtual address space (collectively a tag), a translation lookaside buffer (TLB), and a control unit coupled to the TLB and the one or more registers. The control unit is configured to detect whether or not the tag has changed and in response to a change in the tag, map the changed tag to an identifier having fewer bits than the total number of bits in the tag, and provide the current identifier to the TLB. The TLB is configured to detect a hit/miss in response to the identifier. A similar method is also contemplated.
    Type: Application
    Filed: April 30, 2008
    Publication date: December 31, 2009
    Inventors: Paul J. Jordan, Manish K. Shah, Gregory F. Grohoski
  • Patent number: 7543132
    Abstract: A method and apparatus for improved performance for reloading translation look-aside buffers in multithreading, multi-core processors. TSB prediction is accomplished by hashing a plurality of data parameters and generating an index that is provided as an input to a predictor array to predict the TSB page size. In one embodiment of the invention, the predictor array comprises two-bit saturating up-down counters that are used to enhance the accuracy of the TSB prediction. The saturating up-down counters are configured to avoid making rapid changes in the TSB prediction upon detection of an error. Multiple misses occur before the prediction output is changed. The page size specified by the predictor index is searched first. Using the technique described herein, errors are minimized because the counter leads to the correct result at least half the time.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: June 2, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Greg F. Grohoski, Ashley Saulsbury, Paul J. Jordan, Manish Shah, Rabin A. Sugumar, Mark Debbage, Venkatesh Iyengar
  • Publication number: 20090063899
    Abstract: In one embodiment, a processor comprises a first register file configured to store speculative register state, a second register file configured to store committed register state, a check circuit and a control unit. The first register file is protected by a first error protection scheme and the second register file is protected by a second error protection scheme. A check circuit is coupled to receive a value and corresponding one or more check bits read from the first register file to be committed to the second register file in response to the processor selecting a first instruction to be committed. The check circuit is configured to detect an error in the value responsive to the value and the check bits. Coupled to the check circuit, the control unit is configured to cause reexecution of the first instruction responsive to the error detected by the check circuit.
    Type: Application
    Filed: September 4, 2007
    Publication date: March 5, 2009
    Inventors: Paul J. Jordan, Christopher H. Olson
  • Patent number: 7454590
    Abstract: In one embodiment, a processor comprises a plurality of processor cores and an interconnect to which the plurality of processor cores are coupled. Each of the plurality of processor cores comprises at least one translation lookaside buffer (TLB). A first processor core is configured to broadcast a demap command on the interconnect responsive to executing a demap operation. The demap command identifies one or more translations to be invalidated in the TLBs, and remaining processor cores are configured to invalidate the translations in the respective TLBs. The remaining processor cores transmit a response to the first processor core, and the first processor core is configured to delay continued processing subsequent to the demap operation until the responses are received from each of the remaining processor cores.
    Type: Grant
    Filed: September 9, 2005
    Date of Patent: November 18, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Paul J. Jordan, Manish K. Shah, Gregory F. Grohoski
  • Patent number: 7454666
    Abstract: A method for tracing of instructions executed by a processor is provided which includes providing a type of instruction to be traced and tracing at least one instruction corresponding to the type of instruction. The method further includes storing data without stopping from the tracing into a memory until the memory is full.
    Type: Grant
    Filed: April 7, 2005
    Date of Patent: November 18, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Paul J. Jordan, Joseph T. Rahmeh, Gregory F. Grohoski
  • Patent number: 7430643
    Abstract: The present invention provides a method and apparatus for increased efficiency for translation lookaside buffers by collapsing redundant translation table entries into a single translation table entry (TTE). In the present invention, each thread of a multithreaded processor is provided with multiple context registers. Each of these context registers is compared independently to the context of the TTE. If any of the contexts match (and the other match conditions are satisfied), then the translation is allowed to proceed. Two applications attempting to share one page but that still keep separate pages can then employ three total contexts. One context is for one application's private use; one of the contexts is for the other application's private use; and a third context is for the shared page. In one embodiment of the invention, two contexts are implemented per thread. However, the teachings of the present invention can be extended to a higher number of contexts per thread.
    Type: Grant
    Filed: December 30, 2004
    Date of Patent: September 30, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Paul J. Jordan, William J. Kucharski, Roman M. Zajcew, Ashley N. Saulsbury, Quinn A. Jacobson