Patents by Inventor Klaus J. Getzlaff
Klaus J. Getzlaff has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7493480Abstract: A two level branch history table (TLBHT) is substantially improved by providing a mechanism to prefetch entries from the very large second level branch history table (L2 BHT) into the active (very fast) first level branch history table (L1 BHT) before the processor uses them in the branch prediction process and at the same time prefetch cache misses into the instruction cache. The mechanism prefetches entries from the very large L2 BHT into the very fast L1 BHT before the processor uses them in the branch prediction process. A TLBHT is successful because it can prefetch branch entries into the L1 BHT sufficiently ahead of the time the entry is needed. This feature of the TLBHT is also used to prefetch instructions into the cache ahead of their use. In fact, the timeliness of the prefetches produced by the TLBHT can be used to remove most of the cycle time penalty incurred by cache misses.Type: GrantFiled: July 18, 2002Date of Patent: February 17, 2009Assignee: International Business Machines CorporationInventors: Philip G. Emma, Klaus J. Getzlaff, Allan M. Hartstein, Thomas Pflueger, Thomas R. Puzak, Eric Mark Schwarz, Vijayalakshmi Srinivasan
-
Patent number: 7082517Abstract: In a computer system for use as a symetrical multiprocessor, a superscalar microprocessor apparatus allows dispatching and executing multi-cycle and complex instructions Some control signals are generated in the dispatch unit and dispatched with the instruction to the Fixed Point Unit (FXU). Multiple execution pipes correspond to the instruction dispatch ports and the execution unit is a Fixed Point Unit (FXU) which contains three execution dataflow pipes (X, Y and Z) and one control pipe (R). The FXU logic then execute these instructions on the available FXU pipes. This results in optimum performance with little or no other complications. The presented technique places the flexibility of how these instructions will be executed in the FXU, where the actual execution takes place, instead of in the instruction decode or dispatch units or cracking by the compiler.Type: GrantFiled: May 12, 2003Date of Patent: July 25, 2006Assignee: International Business Machines CorporationInventors: Fadi Y. Busaba, Klaus J. Getzlaff, Christopher A. Krygowski, Timothy J. Slegel
-
Patent number: 7010676Abstract: An embodiment of the invention is a processor for processing loop branch instructions. The processor includes an instruction unit for fetching and decoding instructions including at least one loop branch instruction. A branch prediction unit predicts target instructions to be fetched and decoded by the instruction unit in response to the loop branch instruction. An execution unit executes instructions from the instruction unit and maintains a counter indicating an iteration of a loop. The execution unit includes detection logic for detecting when the counter equals a threshold and notifies the branch prediction unit when the counter equals the threshold.Type: GrantFiled: May 12, 2003Date of Patent: March 7, 2006Assignee: International Business Machines CorporationInventors: Fadi Y. Busaba, Klaus J. Getzlaff, Christopher A. Krygowski, Timothy J. Slegel
-
Publication number: 20040230772Abstract: In a computer system for use as a symetrical multiprocessor, a superscalar microprocessor apparatus allows dispatching and executing multi-cycle and complex instructions Some control signals are generated in the dispatch unit and dispatched with the instruction to the Fixed Point Unit (FXU). Multiple execution pipes correspond to the instruction dispatch ports and the execution unit is a Fixed Point Unit (FXU) which contains three execution dataflow pipes (X, Y and Z) and one control pipe (R). The FXU logic then execute these instructions on the available FXU pipes. This results in optimum performance with little or no other complications. The presented technique places the flexibility of how these instructions will be executed in the FXU, where the actual execution takes place, instead of in the instruction decode or dispatch units or cracking by the compiler.Type: ApplicationFiled: May 12, 2003Publication date: November 18, 2004Applicant: International Business Machines CorporationInventors: Fadi Y. Busaba, Klaus J. Getzlaff, Christopher A. Krygowski, Timothy J. Slegel
-
Publication number: 20040230782Abstract: An embodiment of the invention is a processor for processing loop branch instructions. The processor includes an instruction unit for fetching and decoding instructions including at least one loop branch instruction. A branch prediction unit predicts target instructions to be fetched and decoded by the instruction unit in response to the loop branch instruction. An execution unit executes instructions from the instruction unit and maintains a counter indicating an iteration of a loop. The execution unit includes detection logic for detecting when the counter equals a threshold and notifies the branch prediction unit when the counter equals the threshold.Type: ApplicationFiled: May 12, 2003Publication date: November 18, 2004Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Fadi Y. Busaba, Klaus J. Getzlaff, Christopher A. Krygowski, Timothy J. Slegel
-
Publication number: 20040139299Abstract: A method and mechanism for improving Instruction Level Parallelism (ILP) of a program and eventually improving Instructions per cycle (IPC) allows dependent instructions to be grouped and dispatched simultaneously by forwarding the oldest instruction, or source instruction, General Register (GR) data to the other dependent instructions. A source instruction of a load type loading a GR value into a GR. The dependent instructions will then select the forwarded data to perform their computation. The dependent instructions use the same GR read address as the source instruction. Another source instruction of a load type loads a memory data into a GR. The loaded memory data is forwarded or replicated on the memory read bus of the other dependent instructions. The mechanism allows Address Generator Output to be forwarded to the other dependent instructions when the source instruction is a load type loading a memory address into a GR.Type: ApplicationFiled: January 14, 2003Publication date: July 15, 2004Applicant: International Business Machines CorporationInventors: Fadi Busaba, Klaus J. Getzlaff, Bruce C. Giamei, Christopher A. Krygowski, Timothy J. Slegel
-
Publication number: 20040139300Abstract: A method and mechanism for improving Instruction Level Parallelism (ILP) of a program and eventually improving Instructions per cycle (IPC) allows dependent instructions to be grouped and dispatched simultaneously by forwarding the oldest instruction, or source instruction, result to the other dependent instructions result buses or registers thus bypassing the dependent instruction execution stage. A source instruction that performs arithmetic, logical or rotate/shift type operation on operands and updates a GR with the computed result. A load type dependent or target instruction loading a GR value into a GR will then select the forwarded result of the source instruction to its write bus for the GR update. Another target instruction of a store type stores a memory data from a GR data. The result of source instruction is also used by the dependent instruction to update storage. The mechanism allows also the dependent instruction to be a load type that loads a GR data into a Control Register (CR).Type: ApplicationFiled: January 14, 2003Publication date: July 15, 2004Applicant: International Business Machines CorporationInventors: Fadi Busaba, Klaus J. Getzlaff, Bruce C. Giamei, Christopher A. Krygowski, Timothy J. Slegel
-
Publication number: 20040015683Abstract: A two level branch history table (TLBHT) is substantially improved by providing a mechanism to prefetch entries from the very large second level branch history table (L2 BHT) into the active (very fast) first level branch history table (L1 BHT) before the processor uses them in the branch prediction process and at the same time prefetch cache misses into the instruction cache. The mechanism prefetches entries from the very large L2 BHT into the very fast L1 BHT before the processor uses them in the branch prediction process. A TLBHT is successful because it can prefetch branch entries into the L1 BHT sufficiently ahead of the time the entry is needed. This feature of the TLBHT is also used to prefetch instructions into the cache ahead of their use. In fact, the timeliness of the prefetches produced by the TLBHT can be used to remove most of the cycle time penalty incurred by cache misses.Type: ApplicationFiled: July 18, 2002Publication date: January 22, 2004Applicant: International Business Machines CorporationInventors: Philip G. Emma, Klaus J. Getzlaff, Allan M. Hartstein, Thomas Pflueger, Thomas R. Puzak, Eric Mark Schwarz, Vijayalakshmi Srinivasan
-
Patent number: 5634047Abstract: A method and system for executing branch or other instructions in a loop. A loop end condition is evaluated in a fixed point unit while floating point instructions are evaluated in a floating point unit. In a first execution of the instructions in the loop, the loop end condition is processed as in prior art. A branch target instruction is stored in a branch target register and an instruction address of the branch target instruction is stored in a branch address register. However, on subsequent execution of the instructions in the loop, the branch condition is evaluated and, if it is fulfilled, once the end of the loop is detected by comparison of the effective address of the next instruction to be executed with the contents of the branch address register, the effective address of the first instruction in the loop is passed from the branch target register to an operations register.Type: GrantFiled: June 10, 1996Date of Patent: May 27, 1997Assignee: International Business Machines CorporationInventors: Klaus J. Getzlaff, Udo Wille, Brigitte Roethe, Wilhelm Haller, Hans-Werner Tast
-
Patent number: 5594876Abstract: The invention concerns a multiprocessor system comprising processors PU0 to PUn and a common main memory. The memory is logically divided into at least two banks M0 and M1 and is interconnected with the processors by a bus 110. By means of control lines 111 to 118 a bus protocol is established so that one of said memory banks is accessed while another one of said banks is still busy.Type: GrantFiled: July 27, 1995Date of Patent: January 14, 1997Assignee: International Business Machines CorporationInventors: Klaus J. Getzlaff, Udo Wille
-
Patent number: 5311519Abstract: A multiplexer circuit which is built up from a series of smaller submultiplexers (241-247, 251-254). It selects a number of adjacent bits, bytes or words from one register and places them in the same order in a second register. The multiplexer can be used in cache memories or instruction buffers.Type: GrantFiled: April 20, 1992Date of Patent: May 10, 1994Assignee: International Business Machines CorporationInventors: Klaus J. Getzlaff, Thomas Pflueger, Hans-Werner Tast
-
Patent number: 5303365Abstract: The invention relates to a multi-chip computersystem with master-slave latches. It is known to provide all latches on all chips with two clock pulses, respectively. With the help of the latches the digital signals are pipelined through the logic gates on the chip. Due to tolerances, the edges which control the masters and the slaves have a skew. According to the invention, one of the two clock pulses is generated on the chip itself, respectively, by ANDing an auxiliary clock pulse with the other of the two clock pulses. This has the result, that the above mentioned edges of the two clock pulses occur almost at the same time with the consequence that the frequency of the clock pulses can be increased.Type: GrantFiled: June 14, 1991Date of Patent: April 12, 1994Assignee: International Business Machines CorporationInventors: Klaus J. Getzlaff, Johann Hajdu, Guenter Knauft
-
Patent number: 5138707Abstract: A method of operating a timer mechanism in a digital data processing system is described in which the contents of at least one timer register is updated by a predetermined time increment during each of successive periodic update cycles. Each update cycle includes a predetermined number of operating cycles of the data processing system. During each update cycle, the contents of an adjustment register is circularly shifted by one bit position and if the bit value at a particular position in this adjustment register has a predetermined binary value during an update cycle, then actual updating of the timer register is omitted during a related update cycle.Type: GrantFiled: September 13, 1989Date of Patent: August 11, 1992Assignee: International Business Machines CorporationInventors: Wilhelm Haller, Johann Hajdu, Klaus J. Getzlaff
-
Patent number: 5070471Abstract: A multiplier for multiplying two binary operands is presented which comprises an encoding unit, a multiplying unit composed of two multiplying arrays, and a logic unit. The encoding unit to which the second operand is applied generates factors following the Booth algorithm. The two multiplying arrays are respectively applied with the first operand as well as with factors belonging to the higher significance digits or the lower significance digits, respectively, of the second operand. In both multiplying arrays the multiplication of the factors with the first operand into a respective partial end product is simultaneously performed. Both partial end products are applied to the logic unit which generates therefrom the end product in accordance with the algorithm used at the beginning.Type: GrantFiled: February 9, 1990Date of Patent: December 3, 1991Assignee: International Business Machines Corp.Inventors: Son Dao-Trong, Klaus J. Getzlaff, Klaus Helwig
-
Patent number: 4656578Abstract: In the processing of instructions in data processing systems it is not always possible to execute these instructions without interruption since particular situations, in the following called events can occur which necessitate a short interruption for executing the operations caused by such events before continuing the interrupted instruction processing. Such repetition however is only possible when the contents of the operation register containing the instruction is frozen during the interruption. Such a situation requires two actions: the first is the execution of a forced operation to resolve the event. The second action is a repetition of the instruction and execution phase of the interrupted instruction.Type: GrantFiled: September 4, 1984Date of Patent: April 7, 1987Assignee: International Business Machines CorporationInventors: Herbert Chilinski, Klaus J. Getzlaff, Johann Hajdu, Stephan Richter
-
Patent number: 4631663Abstract: In a microprogram-controlled processor, having an additional operating mode in which particular functions can be executed under direct hardware control, a mode latch signals whether microprogram instructions or directly controlled macro instructions are to be executed. The macroprogram instructions are conventionally executed. For the execution of directly controlled macro instructions, a control storage supplies a hardware control word associated with the macro instruction to be directly executed. The hardware control word contains a mode control bit for signalling that the direct hardware control mode is to be executed. The remainder of the hardware control word contains a plurality of direct control bits, each of which directly controls a hardware function. In an alternative embodiment multiple hardware control words are employed.Type: GrantFiled: June 2, 1983Date of Patent: December 23, 1986Assignee: International Business Machines CorporationInventors: Herbert Chilinski, Klaus J. Getzlaff, Johann Hajdu, Franz J. Raeth
-
Patent number: 4400776Abstract: An improved data processor control subsystem in which a cycle counter having a plurality of cascade-connected stages also comprises one or more supplemental or dummy stages, which can be selectively inserted or removed from the chain of cascade-connected stages, to alter the number of sub-cycles in an operating cycle, thereby decreasing the complexity of associated decoding circuitry.Type: GrantFiled: September 12, 1980Date of Patent: August 23, 1983Assignee: International Business Machines CorporationInventors: Dieter Bazlen, Dietrich W. Bock, Klaus J. Getzlaff, Johann Hajdu, Helmut Painke
-
Patent number: 4398247Abstract: In executing and controlling internal data flow for a particular program, it is often necessary to delay execution of an instruction by the insertion of an appropriate number of wait cycles. Thus, it may be necessary to interrupt instruction execution, for example, to insert a given number of wait cycles for channel access to common storage of the data processing system, for reloading a data or instruction buffer, or for a like situation. In such cases, the control unit has to ignore the particular instruction awaiting execution and execute another forced operation instead.An appropriate code is provided for a NO OPERATION instruction, say all bits zero. When a forced operation is to be executed, this code can be generated with fewer logic means at the output of the instruction register. As a result, there are no control signals active at the output of the decoder. The signals indicating forced operations have to be considered by the decoder only if a control signal is to be generated.Type: GrantFiled: September 12, 1980Date of Patent: August 9, 1983Assignee: International Business Machines CorporationInventors: Dieter Bazlen, Dietrich W. Bock, Klaus J. Getzlaff, Johann Hajdu, Helmut Painke