Abstract: A multithreaded microprocessor includes a thread scheduler and voltage-frequency scheduler (VFS). The thread scheduler uses application-specified QoS requirements, which include required instruction completion rates, and instruction completion information from execution units to schedule the priorities of the threads at which the thread scheduler issues instructions to the execution units. Concurrently, the VFS uses the instruction completion and QoS information to calculate an aggregate utilization of the microprocessor by all the active threads when it is time to scale the voltage-frequency. The aggregate utilization is an effective measure of the amount of work left to be performed relative to the rate requirements. The VFS scales the voltage-frequency based on the aggregate utilization.
Abstract: A processor core and a method for distributive scoreboard scheduling in an out-of-order processor pipeline are described herein. In an embodiment, control logic appends operand availability bits to each instruction. The appended operand availability bits form a distributive scoreboard for each instruction. The appended operand availability bits are propagated together with the instruction through multiple stages of the processor pipeline. An instruction dispatch buffer stores the instruction and the operand availability bits. A dispatch controller determines when an instruction is to be issued. The determination is based, at least in part, on the operand availability bits stored in the instruction dispatch buffer.
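The distributive scoreboard idea above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented design: the names `Instr`, `DispatchBuffer`, `pending`, `complete`, and `issue` are hypothetical, and hazards other than read-after-write are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    name: str
    srcs: tuple          # source register numbers
    dest: int            # destination register number
    # One operand availability bit per source; travels with the instruction.
    ready: list = field(default_factory=list)


class DispatchBuffer:
    """Hypothetical model: each entry carries its own availability bits."""

    def __init__(self):
        self.entries = []
        self.pending = set()  # registers whose producers have not completed

    def insert(self, instr):
        # Append availability bits when the instruction enters the buffer.
        instr.ready = [s not in self.pending for s in instr.srcs]
        self.pending.add(instr.dest)
        self.entries.append(instr)

    def complete(self, reg):
        # A producer finished: set the matching bits in every waiting entry.
        self.pending.discard(reg)
        for e in self.entries:
            for i, s in enumerate(e.srcs):
                if s == reg:
                    e.ready[i] = True

    def issue(self):
        # Issue the oldest entry whose availability bits are all set.
        for e in self.entries:
            if all(e.ready):
                self.entries.remove(e)
                return e
        return None


buf = DispatchBuffer()
buf.insert(Instr("add", srcs=(1, 2), dest=3))
buf.insert(Instr("mul", srcs=(3, 1), dest=4))
first = buf.issue()      # "add" has both operands available
stalled = buf.issue()    # "mul" still waits on register 3
buf.complete(3)
second = buf.issue()     # "mul" can now issue
```

The point of the sketch is that the issue decision reads only the bits stored with each buffered instruction, rather than consulting a central scoreboard table.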
Abstract: In a data-packet processor, a configurable queuing system for packet accounting during processing has a plurality of queues arranged in one or more clusters, an identification mechanism for creating a packet identifier for arriving packets, insertion logic for inserting packet identifiers into queues and for determining into which queue to insert a packet identifier, and selection logic for selecting packet identifiers from queues to initiate processing of identified packets, downloading of completed packets, or for requeuing of the selected packet identifiers.
Type:
Grant
Filed:
March 23, 2006
Date of Patent:
May 11, 2010
Assignee:
MIPS Technologies, Inc.
Inventors:
Mario Nemirovsky, Enric Musoll, Stephen Melvin, Narendra Sankar, Nandakumar Sampath, Adolfo Nemirovsky
Abstract: A fork instruction for execution on a multithreaded microprocessor and occupying a single instruction issue slot is disclosed. The fork instruction, executing in a parent thread, includes a first operand specifying the initial instruction address of a new thread and a second operand. The microprocessor executes the fork instruction by allocating context for the new thread, copying the first operand to a program counter of the new thread context, copying the second operand to a register of the new thread context, and scheduling the new thread for execution. If no new thread context is free for allocation, the microprocessor raises an exception to the fork instruction. The fork instruction is efficient because it does not copy the parent thread general purpose registers to the new thread. The second operand is typically used as a pointer to a data structure in memory containing initial general purpose register set values for the new thread.
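The fork semantics described above can be sketched as a behavioral model. This is a minimal illustration under assumptions: the class names, the `NoFreeContextError` exception, and the choice of register 4 as the destination for the second operand are all hypothetical and do not come from the abstract.

```python
class NoFreeContextError(Exception):
    """Hypothetical stand-in for the exception raised when no context is free."""


class ThreadContext:
    def __init__(self, num_regs=32):
        self.pc = 0
        self.regs = [0] * num_regs
        self.free = True


class Core:
    def __init__(self, num_contexts):
        self.contexts = [ThreadContext() for _ in range(num_contexts)]
        self.run_queue = []

    def fork(self, start_pc, arg, arg_reg=4):
        """Allocate a context, set its PC and one register, and schedule it.

        The parent's general purpose registers are NOT copied; `arg` would
        typically point at a memory structure holding initial values.
        """
        for idx, tc in enumerate(self.contexts):
            if tc.free:
                tc.free = False
                tc.pc = start_pc          # first operand -> program counter
                tc.regs[arg_reg] = arg    # second operand -> one register
                self.run_queue.append(idx)
                return idx
        raise NoFreeContextError("no thread context available")


core = Core(num_contexts=2)
core.contexts[0].free = False             # context 0 holds the parent thread
child = core.fork(start_pc=0x8000_1000, arg=0x1000_0040)
```

Because only two values are written into the new context, the operation stays cheap regardless of how many registers a context holds.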
Abstract: A multithreading processor for concurrently executing multiple threads is provided. The processor includes an execution pipeline and a thread scheduler that dispatches instructions of the threads to the execution pipeline. The execution pipeline is configured for generating a thread context (TC) flush indicator associated with a thread context when one or more instructions of the thread context would stall in the execution pipeline. One or more instructions in the pipeline of the thread context associated with the thread context flush indicator can be flushed or nullified.
Type:
Application
Filed:
January 8, 2010
Publication date:
May 6, 2010
Applicant:
MIPS Technologies, Inc.
Inventors:
Michael Gottlieb Jensen, Darren M. Jones, Ryan C. Kinter, Sanjay Vishin
Abstract: A shared resource access control system having a gating storage responsive to a plurality of controls with each of the controls derived from an instruction context identifying the shared resource, the gating storage including a plurality of sets of access method functions with each set of access method functions including a first access method function and a second access method function with the gating storage producing a particular one access method function from a particular one set responsive to the controls; and a controller, coupled to the gating storage, for controlling access to the shared resource using the particular one access method function.
Abstract: Polynomial arithmetic instructions are provided in an instruction set architecture (ISA). A multiply-add-polynomial (MADDP) instruction and a multiply-polynomial (MULTP) instruction are provided.
Type:
Grant
Filed:
February 21, 2001
Date of Patent:
May 4, 2010
Assignee:
MIPS Technologies, Inc.
Inventors:
Morten Stribaek, Kevin D. Kissell, Pascal Paillier
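As a rough illustration of what polynomial multiply and multiply-add operations compute, the sketch below assumes binary polynomials (coefficients in GF(2)), where addition is XOR and multiplication is carry-less. The abstract does not state the coefficient field or operand widths, so those are assumptions, and the Python function names `multp` and `maddp` merely echo the instruction mnemonics.

```python
def multp(a: int, b: int) -> int:
    """Carry-less multiply: treat the bits of a and b as GF(2) polynomial coefficients."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # adding a partial product is XOR, with no carries
        a <<= 1
        b >>= 1
    return result


def maddp(acc: int, a: int, b: int) -> int:
    """Multiply-add: accumulate the polynomial product into acc using XOR."""
    return acc ^ multp(a, b)


# (x + 1) * (x + 1) = x^2 + 1 over GF(2), because the 2x term cancels.
square = multp(0b11, 0b11)
# (x^2 + 1) * (x + 1) = x^3 + x^2 + x + 1
product = multp(0b101, 0b11)
accumulated = maddp(0b1, 0b11, 0b11)
```

Arithmetic of this kind appears in CRC computation and binary-field elliptic curve cryptography, which is a common motivation for providing it in hardware.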
Abstract: A method, cache controller, and computer processor provide a parallel mapping system whereby a plurality of mappers processes several inputs simultaneously. The plurality of mappers are disposed in a pipelined processor upstream from a multiplexor. Mapping, tag comparison, and selection by the multiplexor all occur in a single pipeline stage. Data does not wait idly to be selected by the multiplexor. Instead, each instruction of a first instruction set is read in parallel into a corresponding one of the plurality of mappers. This parallel mapping system implementation reduces processor cycle time and results in improved processor efficiency.
Abstract: A processor core and method for managing branch misprediction in an out-of-order processor pipeline. In one embodiment, the pipeline of the processor core includes a front-end instruction fetch portion, a back-end instruction execution portion, and pipeline control logic. Operation of the instruction fetch portion is decoupled from operation of the instruction execution portion. Following detection of a control transfer misprediction, operation of the instruction fetch portion is halted and instructions residing in the instruction fetch portion are invalidated. When the instruction associated with the misprediction reaches a selected pipeline stage, instructions residing in the instruction execution portion of the pipeline are invalidated and the flow of instructions from the instruction fetch portion to the instruction execution portion of the processor pipeline is restarted.
Abstract: A context-selection mechanism is provided for selecting a best context from a pool of contexts for processing a data packet. The context-selection mechanism comprises an interface for communicating with a multi-streaming processor; circuitry for computing input data into a result value according to a logic rule and for selecting a context based on the computed value; and a loading mechanism for preloading the packet information into the selected context for subsequent processing. The computation of the input data functions to enable identification and selection of a best context for processing a data packet according to the logic rule at the instant time such that a multitude of subsequent context selections over a period of time acts to balance load pressure on functional units housed within the multi-streaming processor and required for packet processing. In preferred aspects, programmable singular or multiple predictive rules of logic are utilized in the selection process.
Abstract: In a multi-streaming processor, a system for fetching instructions from individual ones of multiple streams to an instruction pipeline is provided, comprising a fetch algorithm for selecting from which stream to fetch an instruction, and one or more predictors for forecasting whether a load instruction will hit or miss the cache or a branch will be taken. The prediction or predictions are used by the fetch algorithm in determining from which stream to fetch. In some cases probabilities are determined and also used in decisions, and predictors may be used at either or both of fetch and dispatch stages.
Abstract: A method and apparatus for recoding one or more instruction sets. An expand instruction and an expandable instruction are read from an instruction cache. A tag compare and way selection unit checks to verify each instruction is a desired instruction. An instruction staging unit dispatches the expand instruction to a first recoder and the expandable instruction to a second recoder of a recoding unit. At least one information bit based on the expand instruction is generated at the first recoder. The second recoder uses the at least one information bit generated at the first recoder to recode the expandable instruction, and the recoded expandable instruction is placed in an instruction buffer.
Type:
Grant
Filed:
October 31, 2003
Date of Patent:
April 27, 2010
Assignee:
MIPS Technologies, Inc.
Inventors:
Soumya Banerjee, John L. Kelley, Ryan C. Kinter
Abstract: A method of tracing processor data includes receiving a first trace stream from a first processor operating in response to a first clock and a second trace stream from a second processor operating in response to a second clock. The first trace stream is routed to a first dual-port synchronous memory in accordance with the first clock and the second trace stream is routed to a second dual-port synchronous memory in accordance with the second clock. The first trace stream and the second trace stream are delivered to a memory in accordance with a third clock.
Abstract: A configurable coprocessor interface between a central processing unit (CPU) and a coprocessor is provided. The coprocessor interface has an instruction transfer signal group for transferring different instruction types from the CPU to the coprocessor, sequentially or in parallel, a busy signal group, for allowing the coprocessor to signal the CPU that it cannot receive a transfer of one or more of the different instruction types, and an instruction order signal group for indicating to the coprocessor a relative execution order for multiple instructions that are transferred in parallel. In addition, the coprocessor interface includes separate data transfer signal groups for data being transferred from the CPU to the coprocessor, and for data being transferred from the coprocessor to the CPU, along with a data order signal group for indicating a relative order of data (if transferred out-of-order).
Type:
Grant
Filed:
February 14, 2007
Date of Patent:
April 13, 2010
Assignee:
MIPS Technologies, Inc.
Inventors:
Lawrence Henry Hudepohl, Darren Miller Jones, Radhika Thekkath, Franz Treue
Abstract: Mechanisms for dynamically configuring the resources of a virtual multiprocessor are provided. An apparatus to configure resources for virtual processing elements in a virtual multiprocessor is provided. The apparatus includes a virtual multiprocessor context, virtual processing element contexts, and configuration logic. The virtual multiprocessor context prescribes the resources and controls a configuration state of the virtual multiprocessor. The virtual processing element contexts each exclusively correspond to a virtual processing element. The virtual processing element contexts each have first logic, for prescribing whether the virtual processing element is permitted to configure the resources; and second logic, for prescribing a subset of the resources that is allocated to the virtual processing element.
Abstract: Disclosed are methods, systems, and computer program products for evaluating performance aspects of electrical circuits, and particularly digital logic circuits. An exemplary method comprises obtaining access to a simulation dump file comprising state indications of the values of a plurality of signals of an electrical circuit at a plurality of simulation time points, and receiving an evaluation task that defines an output based on one or more input signals, with each input signal being a signal for which state indications are provided in the simulation dump file. The method further comprises generating, from the simulation dump file, one or more state representations for the input signals of the evaluation task, with each state representation being representative of the state of an input signal over a period of simulation time, and generating values of the output of the evaluation task at a plurality of simulation time points from the state representations.
Abstract: An instruction dispatching apparatus in a multithreading microprocessor that concurrently executes N threads each in one of G groups each having one of P priorities. G round-robin vectors each have N bits corresponding to the threads, each being a 1-bit left-rotated and subsequently sign-extended version of an N-bit vector with a single bit true of the last thread selected for dispatching in the group. Each of N G-input muxes receives a corresponding one of the N bits of each of the round-robin vectors and selects for output one of the inputs specified by the corresponding thread's group. Selection logic selects for dispatching one of the N instructions corresponding to the thread whose dispatch value is greater than or equal to any of the N threads left thereof. Each dispatch value comprises a least-significant bit of the corresponding mux output, a most-significant dispatchable instruction bit, and middle thread group priority bits.
Type:
Grant
Filed:
July 27, 2005
Date of Patent:
March 16, 2010
Assignee:
MIPS Technologies, Inc.
Inventors:
Michael Gottlieb Jensen, Ryan C. Kinter
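The round-robin vector and dispatch value construction described in the abstract above can be sketched in software. This is an interpretation under assumptions, not the patented circuit: thread 0 is taken to be the least-significant bit, ties are broken toward the lowest thread index, the priority field is assumed to be two bits wide, and all function and parameter names are hypothetical.

```python
def round_robin_vector(last: int, n: int) -> int:
    """Rotate the one-hot 'last selected' vector left by one bit, then sign-extend."""
    mask = (1 << n) - 1
    one_hot = 1 << last
    rotated = ((one_hot << 1) | (one_hot >> (n - 1))) & mask
    # Sign-extend: set the rotated bit and every bit above it.
    return mask & ~(rotated - 1)


def select_thread(dispatchable, group_of, group_priority, last_in_group, prio_bits=2):
    """Pick the thread with the largest dispatch value (assumed tie-break: lowest index)."""
    n = len(dispatchable)
    best, best_val = None, -1
    for t in range(n):
        g = group_of[t]
        # Models the per-thread mux: take this thread's bit from its group's vector.
        rr = (round_robin_vector(last_in_group[g], n) >> t) & 1
        # Dispatch value = {dispatchable bit, group priority bits, round-robin bit}.
        val = (int(dispatchable[t]) << (prio_bits + 1)) | (group_priority[g] << 1) | rr
        if val > best_val:
            best, best_val = t, val
    return best if dispatchable[best] else None


vec = round_robin_vector(last=1, n=4)
wrapped = round_robin_vector(last=3, n=4)
next_thread = select_thread([True] * 4, [0, 0, 0, 0], [1], [1])
```

With thread 1 selected last, the vector marks threads 2 and 3, so thread 2 wins among equal-priority dispatchable threads; after thread 3, the vector is all ones and selection wraps back to thread 0.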
Abstract: A microprocessor core includes a plurality of inputs that indicate whether a corresponding plurality of independently occurring events has occurred. The inputs are non-memory address inputs. The core also includes a yield instruction in its instruction set architecture, comprising a user-visible output operand and an explicit input operand. The input operand specifies one or more of the independently occurring events. The yield instruction instructs the microprocessor core to suspend issuing for execution instructions of a program thread until at least one of the independently occurring events specified by the input operand has occurred. The program thread contains the yield instruction. The yield instruction further instructs the microprocessor core to return a value in the output operand indicating which of the independently occurring events occurred to cause the microprocessor core to resume issuing the instructions of the program thread.
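The resume condition of the yield instruction described above can be modeled as a bit-mask test. This is a minimal sketch that assumes events are encoded one per bit; the abstract does not specify the encoding, and the function name `yield_on` is hypothetical.

```python
def yield_on(event_inputs: int, wait_mask: int):
    """Return the events that end the wait, or None while the thread stays suspended.

    event_inputs: one bit per independently occurring event (assumed encoding).
    wait_mask:    the input operand, selecting which events the thread waits for.
    """
    fired = event_inputs & wait_mask
    if fired == 0:
        return None   # none of the requested events has occurred yet
    return fired      # value placed in the output operand on resume


still_waiting = yield_on(event_inputs=0b0100, wait_mask=0b0011)
resumed = yield_on(event_inputs=0b0110, wait_mask=0b0011)
```

In this model the returned value tells the resumed thread which of its requested events occurred, matching the role of the output operand in the abstract.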
Abstract: A multiprocessing system including a multithreading microprocessor and multiprocessor operating system (OS) is disclosed. The microprocessor includes a first and a second plurality of thread contexts (TCs), each TC having a program counter and a general purpose register set for executing a thread. The microprocessor also includes a first and a second shared privileged resource, shared by the first and second respective plurality of TCs rather than being replicated for each of the respective first and second plurality of TCs, and privileged to be managed only by operating system-privileged threads rather than by user-privileged threads. The OS manages the first and second shared privileged resource and schedules execution of both the operating system-privileged threads and the user-privileged threads on the plurality of TCs.
Abstract: A microprocessor includes an N-way cache and a logic block that selectively enables and disables the N-way cache for at least one clock cycle if a first register load instruction and a second register load instruction, following the first register load instruction, are detected as pointing to the same index line in which the requested data is stored. The logic block further provides a disabling signal to the N-way cache for at least one clock cycle if the first and second instructions are detected as pointing to the same cache way.