Simultaneous Parallel Fetching Or Executing Of Both Branch And Fall-through Path Patents (Class 712/235)
-
Patent number: 6714961
Abstract: The invention is directed toward a multiprocessing system having multiple processing units. For at least one of the processing units in the multiprocessing system, a first job signal is assigned to the processing unit for speculative execution of a corresponding first job, and a further job signal is assigned to the processing unit for speculative execution of a corresponding further job. The speculative execution of said further job is initiated when the processing unit has completed execution of the first job. If desirable, even more job signals may be assigned to the processing unit for speculative execution. In this way, multiple job signals are assigned to the processing units of the processing system, and the processing units are allowed to execute a plurality of jobs speculatively while waiting for commit priority.
Type: Grant
Filed: November 12, 1999
Date of Patent: March 30, 2004
Assignee: Telefonaktiebolaget LM Ericsson (publ)
Inventors: Per Anders Holmberg, Terje Egeland, Nils Ola Linnermark, Karl Oscar Joachim Strömbergson, Magnus Carlsson
-
Publication number: 20040059898
Abstract: Various methods, apparatuses, and systems in which a processor includes an issue engine and an in-order execution pipeline. The issue engine categorizes operations as at least one of either a speculative operation, which performs computations, or an architectural operation, which has the potential to fault or cause an exception. Each architectural operation issues with an associated architectural micro-operation. A first micro-operation checks whether a first speculative operation is dependent upon an intervening first architectural operation. The in-order execution pipeline executes the speculative operation, the architectural operation, and the associated architectural micro-operations.
Type: Application
Filed: September 19, 2002
Publication date: March 25, 2004
Inventors: Jeffery J. Baxter, Gary N. Hammond, Nazar A. Zaidi
-
Publication number: 20040059896
Abstract: A method and apparatus are provided for implementing two-tiered thread state multithreading support with a high clock rate. A first tier thread state storage stores a limited number of runnable thread register states. The limited number is less than a threshold value. Next thread selection logic, coupled between the first tier thread state storage and a currently executing processor state, picks a next thread to run on a processor from the limited number of runnable thread register states. A second tier thread storage facility stores a second number of thread states that is greater than the limited number of runnable thread register states. A runnable thread selection logic, coupled between the first tier thread state storage and the second tier thread storage facility, selectively exchanges thread states between the first tier limited number of runnable thread register states and the second tier thread storage facility.
Type: Application
Filed: September 19, 2002
Publication date: March 25, 2004
Applicant: International Business Machines Corporation
Inventors: Harold F. Kossman, Timothy John Mullins
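The two-tier arrangement above can be illustrated with a short behavioral sketch in Python. This is not the patent's implementation; the class, the backfill policy, and the tier size of 4 are all illustrative assumptions.

```python
from collections import deque

FIRST_TIER_LIMIT = 4  # illustrative size; the patent only requires it be below a threshold


class TwoTierThreadStore:
    """Sketch of two-tiered thread state storage: next-thread selection
    only ever looks at a small first tier of runnable register states,
    while a larger second tier holds the remaining thread states."""

    def __init__(self):
        self.first_tier = deque()   # at most FIRST_TIER_LIMIT runnable states
        self.second_tier = deque()  # overflow thread states

    def add_thread(self, state):
        if len(self.first_tier) < FIRST_TIER_LIMIT:
            self.first_tier.append(state)
        else:
            self.second_tier.append(state)

    def pick_next(self):
        # Selection logic considers only the small first tier, keeping
        # the selection path short enough for a high clock rate.
        state = self.first_tier.popleft()
        # Selectively exchange: backfill the first tier from the second.
        if self.second_tier:
            self.first_tier.append(self.second_tier.popleft())
        return state


store = TwoTierThreadStore()
for tid in range(6):
    store.add_thread(tid)
print(store.pick_next())  # 0; thread 4 is promoted from the second tier
```

The point of the split is that the timing-critical selection mux scales with the small first tier, not with the total number of supported threads.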
-
Patent number: 6691220
Abstract: A method of operation within a processor that permits load instructions following barrier instructions in an instruction sequence to be issued speculatively. The barrier instruction is executed and, while the barrier operation is pending, a load request associated with the load instruction is speculatively issued. A speculation flag is set to indicate the load instruction was speculatively issued. The flag is reset when an acknowledgment of the barrier operation is received. Data that is returned before the acknowledgment is received is temporarily held, and the data is forwarded to the register and/or execution unit of the processor only after the acknowledgment is received. If a snoop invalidate is detected for the speculatively issued load request before the barrier operation completes, the data is discarded and the load request is re-issued.
Type: Grant
Filed: June 6, 2000
Date of Patent: February 10, 2004
Assignee: International Business Machines Corporation
Inventors: Guy Lynn Guthrie, Ravi Kumar Arimilli, John Steven Dodson, Derek Edward Williams
-
Publication number: 20040024996
Abstract: A circuit and method for maintaining a correct value in a performance monitor counter within a speculative computer microprocessor is disclosed. In response to determining the beginning of speculative execution within the microprocessor, the value of the performance monitor counter is stored in a rewind register. The performance monitor counter is incremented in response to predetermined events. If the microprocessor determines the speculative execution was incorrect, the value of the rewind register is loaded into the counter, restoring the correct value for the counter.
Type: Application
Filed: July 31, 2002
Publication date: February 5, 2004
Applicants: International Business Machines Corporation, Hitachi, Ltd.
Inventors: Hung Qui Le, Alexander Erik Mericas, Robert Dominick Mirabella, Toshihiko Kurihara, Michitaka Okuno, Masahiro Tokoro
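The rewind mechanism described in this abstract is simple enough to sketch directly. The following Python class is a behavioral model only; the class and method names are illustrative, not from the patent.

```python
class PerformanceMonitorCounter:
    """Behavioral sketch of a performance monitor counter backed by a
    rewind register: the counter value is snapshotted when speculation
    begins and restored if the speculation turns out to be incorrect."""

    def __init__(self):
        self.counter = 0
        self.rewind = 0

    def begin_speculation(self):
        # Store the current counter value in the rewind register.
        self.rewind = self.counter

    def event(self):
        # Increment on each predetermined monitored event.
        self.counter += 1

    def resolve(self, speculation_correct):
        # On a misprediction, reload the counter from the rewind register.
        if not speculation_correct:
            self.counter = self.rewind


pmc = PerformanceMonitorCounter()
pmc.event()                 # committed event: counter = 1
pmc.begin_speculation()     # rewind register now holds 1
pmc.event()                 # speculative event: counter = 2
pmc.resolve(speculation_correct=False)
print(pmc.counter)          # 1: the speculatively counted event is undone
```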
-
Patent number: 6687812
Abstract: Disclosed is a parallel processing apparatus capable of reducing power consumption by efficiently executing a fork instruction for activating a plurality of processors. The parallel processing apparatus has a processor element (10) for generating (forking) a thread consisting of a plurality of instructions of an external unit. The processor element comprises a fork-instruction predicting section (14) which includes a predicting section for predicting whether or not the fork condition of a fork-conditioned fork instruction is satisfied after fetching but before executing the instruction.
Type: Grant
Filed: April 20, 2000
Date of Patent: February 3, 2004
Assignee: NEC Corporation
Inventor: Sachiko Shimada
-
Publication number: 20040019772
Abstract: A microprocessor includes a register (5), rewritable by software, that outputs a signal A for determining which of a successor instruction to be executed when a condition for a conditional branch is satisfied and another successor instruction to be executed when the condition is unsatisfied is to be introduced into a delay slot. When the microprocessor executes a conditional branch, a decode circuit (6) delivers a signal B, indicating which of the successor instruction and the other successor instruction is to be selected as the next instruction supplied to a CPU (1), to a code interface circuit (2).
Type: Application
Filed: January 22, 2003
Publication date: January 29, 2004
Inventors: Hiroshi Ueki, Masahiro Yokoyama
-
Patent number: 6675374
Abstract: A technique is provided for inserting memory prefetch instructions only at appropriate locations in program code. The instructions are inserted into the program code such that, when the code is executed, the speed and efficiency of execution of the code may be improved, cache conflicts arising from execution of the prefetch instruction may be substantially eliminated, and the number of simultaneously-executing memory prefetch operations may be limited to prevent stalling and/or overtaxing of the processor executing the code.
Type: Grant
Filed: October 12, 1999
Date of Patent: January 6, 2004
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: John Samuel Pieper, Steven Orodon Hobbs, Stephen Corridon Root
-
Patent number: 6675285
Abstract: A method and apparatus for eliminating memory contention in a computation module is presented. The method includes, for a current operation being performed by a computation engine of the computation module, processing that begins by identifying one of a plurality of threads for which the current operation is being performed. The plurality of threads constitutes an application (e.g., geometric primitive applications, video graphic applications, drawing applications, etc.). The processing continues by identifying an operation code from a set of operation codes corresponding to the one of the plurality of threads. As such, for the thread that has been identified for the current operation, one of its operation codes is identified for the current operation. The processing then continues by determining a particular location of a particular one of a plurality of data flow memory devices, based on the particular thread and the particular operation code, for storing the result of the current operation.
Type: Grant
Filed: April 21, 2000
Date of Patent: January 6, 2004
Assignee: ATI International, Srl
Inventors: Michael Andrew Mang, Michael Mantor
-
Publication number: 20040003215
Abstract: A method and apparatus for executing low power validations for high confidence predictions. More particularly, the present invention pertains to using confidence levels of speculative executions to decrease power consumption of a processor without affecting its performance. Non-critical instructions, or those instructions whose prediction, rather than verification, lies on the critical path, can thus be optimized to consume less power.
Type: Application
Filed: June 28, 2002
Publication date: January 1, 2004
Inventors: Evgeni Krimer, Bishara Shomar, Ronny Ronen, Doron Orenstein
-
Method and system dynamically presenting the branch target address in conditional branch instruction
Patent number: 6662295
Abstract: The present invention is related to branch instructions in a pipeline process of a microprocessor system. The microprocessor system executes branch prediction if a conditional branch instruction code calls for branch prediction, and, on the other hand, suspends successive instruction execution until a branch evaluation of the conditional branch instruction settles if the conditional branch instruction code does not call for branch prediction.
Type: Grant
Filed: September 10, 1998
Date of Patent: December 9, 2003
Assignee: Ricoh Company, Ltd.
Inventor: Shinichi Yamaura
-
Patent number: 6651247
Abstract: In a computer having rotating registers, a scheduler-assigner for allocating the rotating registers. The scheduler-assigner includes a software-pipelined instruction scheduler that generates a first software-pipelined instruction schedule based on an intermediate representation that has data flow information in SSA form. The scheduler-assigner also includes a rotating register allocator that designates live ranges of loop-variant variables in the first software-pipelined instruction schedule as being allocated to rotating registers, when available. The first software-pipelined instruction schedule may be a modulo schedule. When a rotating register is not available, the software-pipelined instruction scheduler may generate a second software-pipelined instruction schedule having an initiation interval greater than the initiation interval of the first software-pipelined instruction schedule.
Type: Grant
Filed: May 9, 2000
Date of Patent: November 18, 2003
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Uma Srinivasan
-
Patent number: 6643770
Abstract: A mispredicted path side memory is configured to be coupled to a stage in an instruction pipeline. As instructions advance through the pipeline, a result from the stage is stored into the mispredicted path side memory. The result is restored from the mispredicted path side memory into a pipeline stage when a branch is mispredicted.
Type: Grant
Filed: September 16, 1999
Date of Patent: November 4, 2003
Assignee: Intel Corporation
Inventor: Nicolas I. Kacevas
-
Patent number: 6636960
Abstract: Disclosed are a method and an apparatus for resteering failing speculation check instructions in the pipeline of a processor. A branch offset immediate value and an instruction pointer correspond to each failing instruction. These values are used to determine the correct target recovery address. A relative adder adds the immediate value and the instruction pointer value to arrive at the target recovery address. This is done by flushing the pipeline upon the occurrence of a failing speculation check instruction. The pipeline flush is extended to allow the instruction stream to be resteered. The immediate value and the instruction pointer are then routed through the existing data paths of the pipeline, into the relative adder, which calculates the correct address. A sequencer tracks the progression of these values through the pipeline and causes a branch at the desired time.
Type: Grant
Filed: February 16, 2000
Date of Patent: October 21, 2003
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: James Douglas Gibson, Rohit Bhatia
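The relative adder at the heart of this scheme is plain IP-relative arithmetic. A minimal sketch, with an illustrative function name and example addresses that are not from the patent:

```python
def recovery_target(instruction_pointer: int, branch_offset_immediate: int) -> int:
    """Relative adder sketch: the target recovery address for a failing
    speculation check instruction is its instruction pointer plus its
    branch offset immediate value."""
    return instruction_pointer + branch_offset_immediate


# A check instruction at 0x4000 with an immediate of 0x120 resteers
# the front end to the recovery code at 0x4120.
print(hex(recovery_target(0x4000, 0x120)))  # 0x4120
```

The notable design choice in the abstract is reusing existing pipeline data paths to carry the two operands to the adder, rather than adding a dedicated recovery-address bus.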
-
Publication number: 20030188141
Abstract: One embodiment of the present invention provides a system that facilitates interleaved execution of a head thread and a speculative thread within a single processor pipeline. The system operates by executing program instructions using the head thread, and by speculatively executing program instructions in advance of the head thread using the speculative thread, wherein the head thread and the speculative thread execute concurrently through time-multiplexed interleaving in the single processor pipeline.
Type: Application
Filed: February 12, 2003
Publication date: October 2, 2003
Inventors: Shailender Chaudhry, Marc Tremblay
-
Publication number: 20030182539
Abstract: It has been determined that, in a superscalar computer processor, executing load instructions issued along an incorrectly predicted path of a conditional branch instruction eventually reduces the number of cache misses observed on the correct branch path. Executing these wrong-path loads provides an indirect prefetching effect. If the processor has a small L1 data cache, however, this prefetching pollutes the cache, causing an overall slowdown in performance. By storing the execution results of mispredicted paths in memory, such as in a wrong path cache, the pollution is eliminated. A wrong path cache can improve processor performance up to 17% in simulations using a 32 KB data cache. A fully-associative eight-entry wrong path cache in parallel with a 4 KB direct-mapped data cache allows the execution of wrong path loads to produce an average processor speedup of 46%. The wrong path cache also results in 16% better speedup compared to the baseline processor equipped with a victim cache of the same size.
Type: Application
Filed: March 20, 2002
Publication date: September 25, 2003
Applicant: International Business Machines Corporation
Inventors: Steven R. Kunkel, David J. Lilja, Resit Sendag
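The key structure here, a small fully-associative side cache filled only by wrong-path loads, can be sketched behaviorally. The LRU replacement policy and the class interface below are illustrative assumptions, not details from the publication.

```python
from collections import OrderedDict


class WrongPathCache:
    """Sketch of a small fully-associative wrong path cache: loads
    executed down a mispredicted path fill this side structure instead
    of polluting the main L1 data cache, and later correct-path loads
    can hit in it (the indirect prefetching effect)."""

    def __init__(self, entries=8):
        self.entries = entries
        self.lines = OrderedDict()  # address -> data, in LRU order

    def fill(self, address, data):
        # Called for loads resolved on a mispredicted path.
        self.lines[address] = data
        self.lines.move_to_end(address)
        if len(self.lines) > self.entries:
            self.lines.popitem(last=False)  # evict the least recently used line

    def lookup(self, address):
        # Probed in parallel with the main data cache on correct-path loads.
        return self.lines.get(address)


wpc = WrongPathCache(entries=8)
wpc.fill(0x100, "wrong-path data")   # filled while executing a mispredicted path
print(wpc.lookup(0x100))             # a later correct-path load hits here
```

Keeping wrong-path fills out of the main cache is what separates this from simply letting wrong-path loads allocate normally: the prefetch benefit is kept while the pollution is isolated in eight entries.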
-
Publication number: 20030182542
Abstract: In accordance with one embodiment, the invention provides a method comprising monitoring a power consumption of a processor in executing a program while running in a speculative execution mode wherein instructions are speculatively executed, and turning off said speculative execution mode if said power consumption is above a predetermined threshold. According to another embodiment, the invention provides a processor comprising a speculative mode wherein instructions are speculatively executed; a non-speculative execution mode wherein instructions are executed non-speculatively; and a speculation control mechanism to selectively cause said processor to operate in said non-speculative mode based on a power consumption criterion.
Type: Application
Filed: March 20, 2002
Publication date: September 25, 2003
Inventors: Robert L. Davies, Aaron M. Tsirkel
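The control loop described above reduces to a threshold comparison on a monitored power value. A minimal behavioral sketch, where the threshold value and the one-way fallback policy are illustrative assumptions:

```python
POWER_THRESHOLD_WATTS = 30.0  # illustrative predetermined threshold, not from the publication


class SpeculationControl:
    """Sketch of the speculation control mechanism: sample power while
    running speculatively and switch to non-speculative execution when
    consumption exceeds the threshold."""

    def __init__(self):
        self.speculative_mode = True

    def sample_power(self, watts: float):
        if self.speculative_mode and watts > POWER_THRESHOLD_WATTS:
            # Turn off speculative execution to cap power consumption.
            self.speculative_mode = False


ctrl = SpeculationControl()
ctrl.sample_power(25.0)
print(ctrl.speculative_mode)  # True: under the threshold, keep speculating
ctrl.sample_power(32.5)
print(ctrl.speculative_mode)  # False: threshold exceeded, speculation disabled
```

A real implementation would also need a policy for re-enabling speculation once power drops; the abstract leaves that open, so the sketch does too.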
-
Publication number: 20030177343
Abstract: A multi-processing computer architecture and a method of operating the same are provided. The multi-processing architecture provides a main processor and multiple sub-processors cascaded together to efficiently execute loop operations. The main processor executes operations outside of a loop and controls the loop. The multiple sub-processors are operably interconnected, and are each assigned by the main processor to a given loop iteration. Each sub-processor is operable to receive one or more sub-instructions sequentially, operate on each sub-instruction and propagate the sub-instruction to a subsequent sub-processor.
Type: Application
Filed: July 24, 2002
Publication date: September 18, 2003
Applicant: Sony Computer Entertainment America Inc.
Inventor: Hidetaka Magoshi
-
Patent number: 6604191
Abstract: An instruction fetching system (and/or architecture) which may be utilized by a high-frequency short-pipeline microprocessor for efficient fetching of both in-line and target instructions. The system contains an instruction fetching unit (IFU), having control logic and associated components for controlling a specially designed instruction cache (I-cache). The I-cache is a sum-address cache, i.e., it receives two address inputs, which are combined by a decoder to provide the address of the line of instructions to be fetched. The I-cache is designed with an array of cache lines that can contain 32 instructions, and three buffers that each have a capacity of 32 instructions.
Type: Grant
Filed: February 4, 2000
Date of Patent: August 5, 2003
Assignee: International Business Machines Corporation
Inventors: Brian King Flacks, David Meltzer, Joel Abraham Silberman
-
Publication number: 20030135722
Abstract: A system, method and apparatus are provided that split a microprocessor load instruction into two parts: a speculative load instruction and a check speculative load instruction. The speculative load instruction can be moved ahead in the instruction stream by the compiler as soon as the address and result registers are available. This is true even when the data to be loaded is not actually required. This speculative load instruction will not cause a fault in the memory if the access is invalid, i.e., the load misses and a token bit is set. The check speculative load instruction will cause the speculative load instruction to be retried in the event the token bit was set equal to one. In this manner, the latency associated with branching to an interrupt routine will be eliminated a significant amount of the time. It is very possible that the reasons for invalidating the speculative load operation (e.g., the page not being present in memory) are no longer present, and the load will be allowed to complete.
Type: Application
Filed: January 10, 2002
Publication date: July 17, 2003
Applicant: International Business Machines Corporation
Inventor: Andrew Johnson
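The load/check split can be modeled in a few lines. The sketch below is a behavioral simulation only: the dict-backed "memory", the class name, and the retry semantics are illustrative assumptions, not the patent's hardware.

```python
class SpeculativeLoadUnit:
    """Sketch of the split load: the speculative load never faults but
    sets a token bit on an invalid access; the check instruction retries
    the load only when that token bit was set."""

    def __init__(self, memory):
        self.memory = memory  # address -> value, for accessible pages only

    def speculative_load(self, address):
        # Hoisted ahead by the compiler; must not fault on invalid access.
        if address in self.memory:
            return self.memory[address], 0   # value, token bit clear
        return None, 1                       # invalid: set token bit, defer the fault

    def check_speculative_load(self, address, token):
        if token == 0:
            return self.memory[address]      # speculation succeeded; nothing to do
        # Token was set: retry the load non-speculatively. By now the
        # original reason for invalidation may no longer be present.
        return self.memory.get(address)


mem = {0x10: 42}
unit = SpeculativeLoadUnit(mem)
value, token = unit.speculative_load(0x10)
print(unit.check_speculative_load(0x10, token))  # 42
```

The payoff described in the abstract is that the expensive recovery path (the retry) only runs when the token bit is actually set, so the common case pays nothing for the hoisting.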
-
Publication number: 20030126416
Abstract: Techniques for suspending execution of a thread in a multi-threaded processor. In one embodiment, a processor includes resources that can be partitioned between multiple threads. Processor logic receives an instruction in a first thread of execution and, in response to that instruction, relinquishes portions of the partitioned resources for use by other threads.
Type: Application
Filed: December 31, 2001
Publication date: July 3, 2003
Inventors: Deborah T. Marr, Dion Rodgers, David L. Hill, Shiv Kaushik, James B. Crossland, David A. Koufaty
-
Publication number: 20030126417
Abstract: A method and apparatus to execute data speculative instructions in a processor comprising at least one source register, each source register comprising a bit to indicate the validity of the data in the at least one source register. A data validity circuit is coupled to the one or more source registers to determine the validity of the data in the source registers, and to indicate the validity of the data in a destination register based upon the validity bit in the at least one source register. The processor optionally comprises a checker unit to retire those instructions from the execution unit which write valid data to the destination register, and to re-schedule those instructions for execution which write invalid data to the destination register.
Type: Application
Filed: January 2, 2002
Publication date: July 3, 2003
Inventors: Eric Sprangle, Michael J. Haertel, David J. Sager
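The validity propagation rule implied by this abstract is that a destination register is valid only when every source register it was computed from is valid. A behavioral sketch, with illustrative register names and a dict-based register file that is an assumption of the model:

```python
def execute_speculative(dest, sources, valid_bits, values, op):
    """Sketch of the data validity circuit: the destination's validity
    bit is the AND of the source registers' validity bits; a result is
    only written when all sources carried valid data."""
    valid_bits[dest] = all(valid_bits[s] for s in sources)
    if valid_bits[dest]:
        values[dest] = op(*(values[s] for s in sources))


valid = {"r1": True, "r2": False, "r3": True}
vals = {"r1": 5, "r2": 0, "r3": 7}

execute_speculative("r4", ["r1", "r3"], valid, vals, lambda a, b: a + b)
execute_speculative("r5", ["r1", "r2"], valid, vals, lambda a, b: a + b)

print(valid["r4"], vals["r4"])  # True 12: all sources valid, result retires
print(valid["r5"])              # False: r2 was invalid, so r5 must be re-scheduled
```

A checker unit in this model would retire instructions whose destination validity bit is set and replay the others, which is exactly the optional behavior the abstract describes.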
-
Patent number: 6564315
Abstract: A scheduler issues instruction operations for execution, but also retains the instruction operations. If a particular instruction operation is subsequently found to be required to execute non-speculatively, the particular instruction operation is still stored in the scheduler. Subsequent to determining that the particular operation has become non-speculative (through the issuance and execution of instruction operations prior to the particular instruction operation), the particular instruction operation may be reissued from the scheduler. The penalty for incorrect scheduling of instruction operations which are to execute non-speculatively may be reduced, as compared to purging the particular instruction operation and younger instruction operations from the pipeline and refetching the particular instruction operation. Additionally, the scheduler may maintain the dependency indications for each instruction operation which has been issued.
Type: Grant
Filed: January 3, 2000
Date of Patent: May 13, 2003
Assignee: Advanced Micro Devices, Inc.
Inventors: James B. Keller, Ramsey W. Haddad, Stephan G. Meier
-
Publication number: 20030088760
Abstract: According to one aspect of the invention, a method is provided in which store addresses of store instructions dispatched during a last predetermined number of cycles are maintained in a first data structure of a first processor. It is determined whether a load address of a first load instruction matches one of the store addresses in the first data structure. The first load instruction is replayed if the load address of the first load instruction matches one of the store addresses in the first data structure.
Type: Application
Filed: October 24, 2002
Publication date: May 8, 2003
Inventors: Muntaquim F. Chowdhury, Douglas M. Carmean
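The "first data structure" behaves like a sliding window of recent store addresses. A minimal sketch; the window size of 8 and the class interface are illustrative assumptions:

```python
from collections import deque

WINDOW_CYCLES = 8  # illustrative; the publication only says "a last predetermined number of cycles"


class StoreAddressTracker:
    """Sketch of the first data structure: a sliding window of store
    addresses dispatched in the last WINDOW_CYCLES cycles. A load whose
    address matches any entry is replayed rather than allowed to
    complete with possibly stale data."""

    def __init__(self):
        self.recent_stores = deque(maxlen=WINDOW_CYCLES)

    def dispatch_store(self, address):
        # Oldest entries age out automatically once the window is full.
        self.recent_stores.append(address)

    def must_replay_load(self, load_address):
        return load_address in self.recent_stores


tracker = StoreAddressTracker()
tracker.dispatch_store(0x200)
print(tracker.must_replay_load(0x200))  # True: potential store-to-load conflict
print(tracker.must_replay_load(0x300))  # False: no recent store to this address
```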
-
Publication number: 20030084274
Abstract: In processing an instruction request, the invention determines whether the request is speculative or not based upon a bit field within the instruction. If the request is speculative, bus congestion and/or target memory is assessed for conditions and a decision is made, based on the conditions, as to whether or not to process the request. To facilitate the invention, certain bit fields within the instruction are encoded to identify the request as speculative or not. Additional bit fields may define a priority of a speculative request to influence the decision to process as based on the conditions. CPU architectures incorporating prefetch logic may be modified to recognize instructions encoded with speculation and priority identification fields to implement the invention in existing systems. Other logic, e.g., bus controllers and switches, may similarly process speculative requests to enhance system performance.
Type: Application
Filed: October 26, 2001
Publication date: May 1, 2003
Inventors: Blaine D. Gaither, Robert J. Brooks
-
Publication number: 20030079116
Abstract: One embodiment of the present invention provides a system that predicts a result produced by a section of code in order to support speculative program execution. The system begins by executing the section of code using a head thread in order to produce a result. Before the head thread produces the result, the system generates a predicted result to be used in place of the result. Next, the system allows a speculative thread to use the predicted result in speculatively executing subsequent code that follows the section of code. After the head thread finishes executing the section of code, the system determines if a difference between the predicted result and the result generated by the head thread has affected execution of the speculative thread. If so, the system executes the subsequent code again using the result generated by the head thread. If not, the system performs a join operation to merge state associated with the speculative thread with state associated with the head thread.
Type: Application
Filed: January 16, 2001
Publication date: April 24, 2003
Inventors: Shailender Chaudhry, Marc Tremblay
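The predict-speculate-verify-join flow can be shown sequentially in a few lines. This is a behavioral sketch only: real head and speculative threads run concurrently, whereas the function below serializes them, and all names are illustrative.

```python
def run_with_value_prediction(section, subsequent, predict):
    """Sketch of value-predicted speculation: a speculative thread runs
    the subsequent code on a predicted result while the head thread
    computes the real one; on a mismatch the subsequent code is
    re-executed with the real result, otherwise the speculative work
    is joined (kept)."""
    predicted = predict()
    speculative_output = subsequent(predicted)   # speculative thread's work
    actual = section()                           # head thread finishes the section
    if predicted == actual:
        return speculative_output                # join: speculation was useful
    return subsequent(actual)                    # mispredicted: redo subsequent code


result = run_with_value_prediction(
    section=lambda: 10,          # head thread's real result
    subsequent=lambda x: x * 2,  # code that follows the section
    predict=lambda: 10,          # correct prediction: speculative work is kept
)
print(result)  # 20
```

Note the correctness condition the abstract states precisely: re-execution is needed only when the prediction error actually *affected* the speculative thread; the equality check here is a conservative stand-in for that finer-grained test.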
-
Publication number: 20030074544
Abstract: A method for conditionally performing a SIMD operation causing a predetermined number of result objects to be held in a combination of different ones of a plurality of destination stores, the method comprising receiving and decoding instruction fields to determine at least one source store, a plurality of destination stores and at least one control store, said source and destination stores being capable of holding one or a plurality of objects, each object defining a SIMD lane. Conditional execution of the operation on a per-SIMD-lane basis is controlled using a plurality of pre-set indicators of the at least one control store designated in the instruction, wherein each said pre-set indicator i controls a predetermined number of result lanes p, where p takes a value greater than or equal to two. A predetermined number of result objects are sent to the destination stores such that the predetermined number of result objects are held by a combination of different ones of the plurality of destination stores.
Type: Application
Filed: April 11, 2002
Publication date: April 17, 2003
Inventor: Sophie Wilson
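The per-lane predication rule, one indicator bit gating p >= 2 result lanes, can be sketched with plain lists standing in for SIMD registers. The merge semantics for disabled lanes (keeping the first source's value) is an assumption of this sketch, not stated in the abstract.

```python
def conditional_simd(op, src_a, src_b, control_bits, lanes_per_bit=2):
    """Sketch of conditional SIMD execution: each pre-set indicator bit
    in the control store gates a group of lanes_per_bit result lanes
    (p >= 2 in the abstract). Disabled lanes pass src_a through."""
    result = []
    for lane, (a, b) in enumerate(zip(src_a, src_b)):
        enabled = control_bits[lane // lanes_per_bit]
        result.append(op(a, b) if enabled else a)
    return result


out = conditional_simd(
    lambda a, b: a + b,
    [1, 2, 3, 4],       # source lanes
    [10, 10, 10, 10],
    [1, 0],             # indicator 0 enables lanes 0-1, indicator 1 disables lanes 2-3
)
print(out)  # [11, 12, 3, 4]
```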
-
Publication number: 20030046517
Abstract: One embodiment of the present invention provides a system to facilitate multithreading a computer processor pipeline. The system includes a pipeline that is configured to accept instructions from multiple independent threads of operation, wherein each thread of operation is unrelated to the other threads of operation. This system also includes a control mechanism that is configured to control the pipeline. This control mechanism is statically scheduled to execute multiple threads in round-robin succession. This static scheduling eliminates the need for communication between stages of the pipeline.
Type: Application
Filed: September 4, 2001
Publication date: March 6, 2003
Inventor: Gary R. Lauterbach
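Static round-robin issue is easy to show as a schedule: each cycle takes the next instruction from the next thread in fixed succession, with no per-cycle arbitration. A sketch with illustrative instruction labels:

```python
def interleave_round_robin(threads):
    """Sketch of statically scheduled round-robin issue: thread slots
    rotate in a fixed order, so picking the next thread needs no
    communication between pipeline stages. Exhausted threads simply
    leave their slot empty (a bubble, omitted here)."""
    schedule = []
    streams = [list(t) for t in threads]
    i = 0
    while any(streams):
        stream = streams[i % len(streams)]
        if stream:
            schedule.append(stream.pop(0))
        i += 1
    return schedule


order = interleave_round_robin([["A1", "A2"], ["B1", "B2"], ["C1"]])
print(order)  # ['A1', 'B1', 'C1', 'A2', 'B2']
```

Because the thread order is fixed at design time, each pipeline stage can know statically which thread's state it is operating on in any given cycle, which is the property the abstract highlights.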
-
Publication number: 20030037226
Abstract: A processor architecture includes a program counter which executes M independent program streams in time division in units of one instruction, a pipeline which is shared by each of the program streams and has N pipeline stages operable at a frequency F, and a mechanism which executes only s program streams depending on a required operation performance, where M and N are integers greater than or equal to one and having no mutual dependency, and s is an integer greater than or equal to zero and satisfying s ≦ M. An apparent number of pipeline stages viewed from each of the program streams is set to N/M, so that M parallel processors having an apparent operating frequency F/M are formed.
Type: Application
Filed: April 29, 2002
Publication date: February 20, 2003
Applicant: FUJITSU LIMITED
Inventors: Toru Tsuruta, Norichika Kumamoto, Hideki Yoshizawa
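The arithmetic in this abstract is worth making concrete: interleaving M streams through an N-stage pipeline at frequency F makes each stream behave like a processor with N/M stages at F/M. A small sketch with illustrative parameter values:

```python
def apparent_parameters(n_stages: int, m_streams: int, frequency_hz: float):
    """The abstract's arithmetic: M interleaved program streams sharing
    an N-stage pipeline at frequency F each see an apparent pipeline
    depth of N/M and an apparent operating frequency of F/M."""
    return n_stages / m_streams, frequency_hz / m_streams


# E.g. an 8-stage, 400 MHz pipeline shared by 4 streams looks to each
# stream like a 2-stage pipeline clocked at 100 MHz.
stages, freq = apparent_parameters(n_stages=8, m_streams=4, frequency_hz=400e6)
print(stages, freq)
```

The shallow apparent pipeline is the point: each stream sees fewer stages between dependent instructions, shrinking hazard and branch penalties without reducing total throughput.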
-
Patent number: 6523110
Abstract: There is provided a decoupled fetch-execute engine with static branch prediction support. A method for prefetching targets of branch instructions in a computer processing system having instruction fetch decoupled from an execution pipeline includes the step of generating a prepare-to-branch (PBR) operation. The PBR operation includes address bits corresponding to a branch paired thereto and address bits corresponding to an expected target of the branch. The execution of the PBR operation is scheduled prior to execution of the paired branch to enforce a desired latency therebetween. Upon execution of the PBR operation, it is determined whether the paired branch is available using the address bits of the PBR operation corresponding to the paired branch. When the paired branch is available, the expected branch target is fetched using the address bits of the PBR operation corresponding to the expected branch target.
Type: Grant
Filed: July 23, 1999
Date of Patent: February 18, 2003
Assignee: International Business Machines Corporation
Inventors: Arthur A. Bright, Jason E. Fritts
-
Publication number: 20030033511
Abstract: In one embodiment of the invention, a processor includes an execution pipeline to concurrently execute at least portions of threads, wherein at least one of the threads is dependent on at least another one of the threads. The processor also includes detection circuitry to detect speculation errors in the execution of the threads. In another embodiment, the processor includes thread management logic to control dynamic creation of threads from a program.
Type: Application
Filed: October 8, 2002
Publication date: February 13, 2003
Inventors: Haitham Akkary, Kingsum Chow
-
Publication number: 20030033510
Abstract: Mechanisms and techniques operate in a computerized device to enable or disable speculative execution of instructions, such as reordering of load and store instructions, in a multiprocessing computerized device. The mechanisms and techniques provide a speculative execution controller that can detect a multiaccess memory condition between the first and second processors, such as concurrent access to shared data pages via page table entries. This can be done by monitoring page table entry accesses by other processors. The speculative execution controller sets a value of a speculation indicator in the memory system based on the multiaccess memory condition. If the value of the speculation indicator indicates that speculative execution of instructions is allowed in the computerized device, the speculative execution controller allows speculative execution of instructions in at least one of the first and second processors in the computerized device.
Type: Application
Filed: January 3, 2002
Publication date: February 13, 2003
Inventor: David Dice
-
Publication number: 20030028755
Abstract: In a parallel processor system for executing, in parallel with each other, a plurality of threads which are obtained by dividing a single program among a plurality of processors, when a processor executing a master thread conducts forking of a slave thread in another processor, at every write to a general register in the master thread after forking, the fork source processor transmits the updated register value to the fork destination processor through a communication bus. The fork destination processor executes the slave thread speculatively and, upon detecting a violation of a Read After Write (RAW) dependence related to the general register, cancels the thread being executed to conduct re-execution of the thread.
Type: Application
Filed: June 7, 2002
Publication date: February 6, 2003
Applicant: NEC Corporation
Inventors: Taku Ohsawa, Satoshi Matsushita
-
Patent number: 6516462
Abstract: Compiler optimization methods and systems for preventing delays associated with a speculative load operation on data when the data is not in the data cache of a processor. A compiler optimizer analyzes various criteria to determine whether a cache miss savings transformation is useful. Depending on the results of the analysis, the load operation and/or the successor operations to the load operation are transferred into a predicated mode of operation to enhance overall system efficiency and execution speed.
Type: Grant
Filed: February 17, 2000
Date of Patent: February 4, 2003
Assignee: Elbrus International
Inventors: Sergev K. Okunev, Vladimir Y. Volkonsky
-
Publication number: 20030023838
Abstract: In lieu of branch prediction, a merged fetch-branch unit operates in parallel with the decode unit within a processor. Upon detection of a branch instruction within a group of one or more fetched instructions, any instructions preceding the branch are marked regular instructions, the branch instruction is marked as such, and any instructions following the branch are marked sequential instructions. Within two cycles, sequential instructions following the last fetched instruction are retrieved and marked, target instructions beginning at the branch target address are retrieved and marked, and the branch is resolved. Either the sequential or target instructions are then dropped depending on the branch resolution, incurring a fixed, 1-cycle branch penalty.
Type: Application
Filed: July 27, 2001
Publication date: January 30, 2003
Inventors: Faraydon O. Karim, Ramesh Chandra
-
Publication number: 20030014612Abstract: A processor improves throughput efficiency and exploits increased parallelism by introducing multithreading to an existing and mature processor core. The multithreading is implemented in two steps including vertical multithreading and horizontal multithreading. The processor core is retrofitted to support multiple machine states. System embodiments that exploit retrofitting of an existing processor core advantageously leverage hundreds of man-years of hardware and software development by extending the lifetime of a proven processor pipeline generation. A processor implements N-bit flip-flop global substitution. To implement multiple machine states, the processor converts 1-bit flip-flops in storage cells of the stalling vertical thread to an N-bit global flip-flop where N is the number of vertical threads.Type: ApplicationFiled: May 11, 1999Publication date: January 16, 2003Inventors: William N. Joy, Marc Tremblay, Gary Lauterbach, Joseph I. Chamdani
-
Publication number: 20030005262Abstract: The present invention provides a mechanism for supporting high bandwidth instruction fetching in a multi-threaded processor. A multi-threaded processor includes an instruction cache (I-cache) and a temporary instruction cache (TIC). In response to an instruction pointer (IP) of a first thread hitting in the I-cache, a first block of instructions for the thread is provided to an instruction buffer and a second block of instructions for the thread is provided to the TIC. On a subsequent clock interval, the second block of instructions is provided to the instruction buffer, and first and second blocks of instructions from a second thread are loaded into a second instruction buffer and the TIC, respectively.Type: ApplicationFiled: June 28, 2001Publication date: January 2, 2003Inventors: Sailesh Kottapalli, James S. Burns, Kenneth D. Shoemaker
-
Patent number: 6493820Abstract: In one embodiment of the invention, a processor includes an execution pipeline to concurrently execute at least portions of threads, wherein at least one of the threads is dependent on at least another one of the threads. The processor also includes detection circuitry to detect speculation errors in the execution of the threads. In another embodiment, the processor includes thread management logic to control dynamic creation of threads from a program.Type: GrantFiled: December 29, 2000Date of Patent: December 10, 2002Assignee: Intel CorporationInventors: Haitham Akkary, Kingsum Chow
-
Publication number: 20020184478Abstract: The problem of mis-match between a program counter (14) of a CPU (10) and a byte code counter (18) of an instruction path coprocessor (IPC) (16) is addressed by causing the IPC (16) to translate IPC branch instructions to CPU branch instructions, in which the CPU branch instructions implicitly indicate whether a corresponding IPC branch instruction should be taken and in which the CPU branch instruction causes the CPU (10) to set its own program counter (14) to a safe location in the IPC range to avoid overflow.Type: ApplicationFiled: April 8, 2002Publication date: December 5, 2002Inventors: Adrianus Josephus Bink, Alexander Augusteijn, Paul Ferenc Hoogendijk, Hendrikus Wilhelmus Johannes Van De Wiel
-
Publication number: 20020178349Abstract: When a processor executes a memory operation instruction by means of data-dependence speculative execution, a speculative execution result history table, which stores history information on the success/failure results of past speculative executions of memory operation instructions, is consulted to predict whether the speculative execution will succeed or fail. For the prediction, the target address of the memory operation instruction is converted by a hash function circuit into an entry number of the speculative execution result history table (allowing the existence of aliases), and the entry of the table designated by that entry number is referred to. If the prediction is “success”, the memory operation instruction is executed speculatively out of order (with regard to the data dependence relationship between the instructions).Type: ApplicationFiled: May 22, 2002Publication date: November 28, 2002Applicant: NEC CORPORATIONInventors: Atsufumi Shibayama, Satoshi Matsushita, Sunao Torii, Naoki Nishi
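The hashed history table can be sketched in a few lines, under stated assumptions: the table size, the hash "circuit", and the optimistic initial state below are all illustrative choices, not details from the patent; only the idea of hashing the target address into an entry number (aliases permitted) and consulting that entry is taken from the abstract.

```python
TABLE_SIZE = 256  # assumed size; the patent does not specify one


def hash_index(addr):
    # Stand-in for the hash function circuit: fold address bits
    # into an entry number, deliberately allowing aliases.
    return (addr ^ (addr >> 8)) % TABLE_SIZE


class SpecHistoryTable:
    """Speculative execution result history table (illustrative)."""

    def __init__(self):
        self.table = [True] * TABLE_SIZE  # optimistic: predict success

    def predict(self, addr):
        return self.table[hash_index(addr)]

    def update(self, addr, succeeded):
        self.table[hash_index(addr)] = succeeded


sht = SpecHistoryTable()
assert sht.predict(0x1000)      # initially predicts success -> speculate
sht.update(0x1000, False)       # record a misspeculation
assert not sht.predict(0x1000)  # now predicted to fail -> execute in order
```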
-
Publication number: 20020174328Abstract: A first tag is assigned to a branch instruction. Dependent on the type of branch instruction, a second tag is assigned to an instruction in the branch delay slot of the branch instruction. The second tag may equal the first tag if the branch delay slot is unconditional for that branch, and may equal a different tag if the branch delay slot is conditional for the branch. If the branch is mispredicted, the first tag is broadcast to pipeline stages that may have speculative instructions, and the first tag is compared to tags in the pipeline stages. If the tag in a pipeline stage matches the first tag, the instruction is not cancelled. If the tag mismatches, the instruction is cancelled.Type: ApplicationFiled: May 17, 2001Publication date: November 21, 2002Inventor: David A. Kruckemyer
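A hedged sketch of the recovery step described above: on a misprediction, the branch's tag is broadcast to the pipeline stages, and, per the abstract's convention, instructions whose tag matches the broadcast tag are kept while mismatching ones are cancelled. The data representation below is an assumption for illustration.

```python
def surviving_instructions(pipeline, broadcast_tag):
    """pipeline: list of (instruction, tag) pairs in pipeline stages.
    Per the abstract, a matching tag means the instruction is NOT
    cancelled; mismatching tags identify instructions to cancel."""
    return [ins for ins, tag in pipeline if tag == broadcast_tag]


pipe = [("add", 1), ("ld", 2), ("sub", 1)]
# Broadcasting tag 1 cancels the tag-2 instruction only.
assert surviving_instructions(pipe, 1) == ["add", "sub"]
```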
-
Publication number: 20020144087Abstract: An instruction-fetch architecture for a microprocessor is provided that pre-reads and pre-decodes the next instruction. If the pre-decoded instruction is found to be a conditional branch instruction, an instruction reading-amount register is set to read the two instructions following the current instruction from the program memory; otherwise only one instruction is read, so as to avoid unnecessary program-memory reads and thereby reduce power consumption.Type: ApplicationFiled: December 18, 2001Publication date: October 3, 2002Inventors: Pao-Lung Chen, Chen-Yi Lee
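The decision rule reduces to a one-line sketch: set the reading amount to 2 only when pre-decode identifies a conditional branch, so both possible next instructions are available, and to 1 otherwise. The opcode names below are hypothetical stand-ins, not the patent's encoding.

```python
def reading_amount(next_instr_opcode, conditional_branch_opcodes):
    """Value for the instruction reading-amount register: read two
    instructions after a conditional branch, one otherwise."""
    return 2 if next_instr_opcode in conditional_branch_opcodes else 1


BRANCH_OPS = {"beq", "bne", "blt"}  # hypothetical opcode names
assert reading_amount("beq", BRANCH_OPS) == 2  # both paths covered
assert reading_amount("add", BRANCH_OPS) == 1  # skip the extra read
```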
-
Publication number: 20020138717Abstract: A processor includes logic for tagging a thread identifier (TID) for usage with processor blocks that are not stalled. Pertinent non-stalling blocks include caches, translation look-aside buffers (TLB), a load buffer asynchronous interface, an external memory management unit (MMU) interface, and others. A processor includes a cache that is segregated into a plurality of N cache parts. Cache segregation avoids interference, “pollution”, or “cross-talk” between threads. One technique for cache segregation utilizes logic for storing and communicating thread identification (TID) bits. The cache utilizes cache indexing logic. For example, the TID bits can be inserted at the most significant bits of the cache index.Type: ApplicationFiled: May 23, 2002Publication date: September 26, 2002Inventors: William N. Joy, Marc Tremblay, Gary Lauterbach, Joseph I. Chamdani
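The cache-indexing technique mentioned at the end of the abstract can be sketched as follows, with assumed sizes: placing the TID bits at the most significant bits of the cache index gives each of the N threads a disjoint slice of the cache sets, so threads cannot evict each other's lines.

```python
INDEX_BITS = 8  # 256 sets total (assumed)
TID_BITS = 2    # 4 threads -> 64 sets per thread (assumed)


def cache_index(addr, tid):
    """Build a segregated cache index: TID bits occupy the most
    significant bits, so each thread indexes a private set slice."""
    low_bits = INDEX_BITS - TID_BITS
    set_within_slice = addr & ((1 << low_bits) - 1)
    return (tid << low_bits) | set_within_slice


# The same address maps to disjoint sets for different threads,
# avoiding inter-thread interference ("cross-talk").
assert cache_index(0x2A, 0) != cache_index(0x2A, 1)
assert cache_index(0x2A, 3) == (3 << 6) | 0x2A
```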
-
Patent number: 6457117Abstract: The processor is configured to predecode instruction bytes prior to their storage within an instruction cache. During the predecoding, relative branch instructions are detected. The displacement included within the relative branch instruction is added to the address corresponding to the relative branch instruction, thereby generating the target address. The processor replaces the displacement field of the relative branch instruction with an encoding of the target address, and stores the modified relative branch instruction in the instruction cache. The branch prediction mechanism may select the target address from the displacement field of the relative branch instruction instead of performing an addition to generate the target address. In one embodiment, relative branch instructions having eight-bit and 32-bit displacement fields are included in the instruction set executed by the processor.Type: GrantFiled: November 7, 2000Date of Patent: September 24, 2002Assignee: Advanced Micro Devices, Inc.Inventor: David B. Witt
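The core computation is a one-time add at predecode, sketched below under stated assumptions: the relative-to-next-instruction convention and the instruction length are illustrative choices, not details from the patent. Once the sum replaces the displacement field, the predictor can read the target directly.

```python
def predecode_relative_branch(branch_addr, displacement, instr_len=2):
    """Compute the target address stored in place of the displacement.
    Assumes the displacement is relative to the next instruction
    (an x86-style convention, chosen here for illustration)."""
    return branch_addr + instr_len + displacement


assert predecode_relative_branch(0x1000, 0x10) == 0x1012   # forward branch
assert predecode_relative_branch(0x1000, -0x20) == 0xFE2   # backward branch
```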
-
Publication number: 20020129227Abstract: A time multiplex changing function for priorities among threads is added to a multi-thread processor, and capability for large-scale out-of-order execution is achieved by confining the flows of data among threads, prescribing the execution order in the flow sequence, and executing a plurality of threads having data dependency either simultaneously or in time multiplex.Type: ApplicationFiled: December 20, 2001Publication date: September 12, 2002Inventor: Fumio Arakawa
-
Patent number: 6430682Abstract: Reliable branch predictions for real-time applications reduce both conditional branch execution time and uncertainties associated with their prediction in a computer implemented application. One method ensures that certain conditional branches are always correctly predicted, effectively converting them to jump instructions during program execution. Another method exploits the fact that some conditional branches always branch in the same direction within a task invocation, although that direction may vary across invocations. These methods improve computer processor utilization and performance.Type: GrantFiled: September 11, 1998Date of Patent: August 6, 2002Assignee: Agere Systems Guardian Corp.Inventor: Harry Dwyer, III
-
Publication number: 20020091913Abstract: Using an entry number (WRB number) of a re-order buffer 6, each of the function units such as an operation unit 3, a store unit 4, a load unit 5, etc. notifies the re-order buffer 6 of the completion of processing for the instruction stored in the corresponding entry. The load unit 5 manages the latest speculation state of each issued load instruction on the basis of a branch prediction success/failure signal output from the branch unit 2, and withholds the WRB-number notification to the re-order buffer 6 for load instructions subsequent to a mispredicted branch instruction, even when processing of the instruction has finished. Accordingly, the re-order buffer 6 can re-use the entries in which instructions subsequent to the mispredicted branch are stored.Type: ApplicationFiled: January 9, 2002Publication date: July 11, 2002Applicant: NEC CORPORATIONInventor: Masao Fukagawa
-
Publication number: 20020087849Abstract: Described is a data processing system and processor that provides full multiprocessor speculation by which all instructions subsequent to barrier operations in an instruction sequence are speculatively executed before the barrier operation completes on the system bus. The processor comprises a load/store unit (LSU) with a barrier operation (BOP) controller that permits load instructions subsequent to syncs in an instruction sequence to be speculatively issued by the LRQ prior to the return of the sync acknowledgment. Load data returned by the speculative load request is immediately forwarded to the processor's execution units for speculative execution with subsequent instructions. The returned data and results of subsequent operations are held temporarily in the rename registers. A multiprocessor speculation flag is set in the corresponding rename registers to indicate that the value is “barrier” speculative.Type: ApplicationFiled: December 28, 2000Publication date: July 4, 2002Applicant: International Business Machines CorporationInventors: Ravi Kumar Arimilli, John Steven Dodson, Guy Lynn Guthrie, Derek Edward Williams
-
Publication number: 20020078326Abstract: In one embodiment, a programmable processor is adapted to include a speculative count register. The speculative count register may be loaded with data associated with an instruction before the instruction commits. However, if the instruction is terminated before it commits, the speculative count register may be adjusted. A set of counters may monitor the difference between the speculative count register and its architectural counterpart.Type: ApplicationFiled: December 20, 2000Publication date: June 20, 2002Applicant: Intel Corporation and Analog Devices, Inc.Inventors: Charles P. Roth, Ravi P. Singh, Gregory A. Overkamp
-
Publication number: 20020073301Abstract: A method of executing microprocessor instructions and an associated microprocessor are disclosed. Initially, a conditional branch instruction is fetched from a storage unit such as an instruction cache. Branch prediction information embedded in the branch instruction is detected by a fetch unit of the microprocessor. Depending upon the state of the branch prediction information, instructions from the branch-taken path and the branch-not-taken path of the branch instruction are fetched. The branch-not-taken path instructions and the branch-taken path instructions may be speculatively executed. Upon executing the conditional branch instruction, the speculative results from the branch-taken path are discarded if the branch is not taken, and the speculative results from the branch-not-taken path are discarded if the branch is taken. The branch prediction information may include compiler generated information indicative of the context in which the conditional branch instruction is used.Type: ApplicationFiled: December 7, 2000Publication date: June 13, 2002Applicant: International Business Machines CorporationInventors: James Allan Kahle, Charles Roberts Moore
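The commit/discard step at branch resolution reduces to a simple selection, sketched here as an illustration; the function name and result representation are hypothetical.

```python
def commit_speculative_results(taken_results, not_taken_results, branch_taken):
    """At branch resolution, keep the speculative results from the path
    the branch actually took and discard the losing path's results."""
    return taken_results if branch_taken else not_taken_results


# Both paths executed speculatively; resolution keeps exactly one.
assert commit_speculative_results({"r1": 5}, {"r1": 7}, True) == {"r1": 5}
assert commit_speculative_results({"r1": 5}, {"r1": 7}, False) == {"r1": 7}
```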