Patents by Inventor Jochen Preiss
Jochen Preiss has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20100100713Abstract: A floating point processor unit executes a floating point compare instruction with two operands of the same or different precision by comparing the two operands in integer format, which speeds up the execution of the floating point compare instruction significantly. The floating point processor now executes the floating point compare instruction at least twice as fast or faster (e.g., two clock cycles instead of five clock cycles in the prior art) for nearly most operand cases (e.g., 99% of all cases). Only the rare corner cases require additional operations on one of the operands and thus require additional cycles of execution time because the integer compare operation will not work for these corner cases. This is due to the fact that one operand is a single precision subnormal number in an unnormalized representation (i.e., has two representations) and the other operand is in the SP subnormal range such that the integer compare operation will fail.Type: ApplicationFiled: October 22, 2008Publication date: April 22, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Maarten J. Boersma, Michael Kroener, Silvia M. Meuller, Jochen Preiss
-
Publication number: 20100095099Abstract: A system and a method for storing numbers in a register file are provided. The system and the method store single precision numbers in double precision format in a register file that is shared between floating point computational units and computational units not supporting floating point numbers.Type: ApplicationFiled: October 14, 2008Publication date: April 15, 2010Applicant: International Business Machines CorporationInventors: Maarten Boersma, Michael Kroener, Petra Leber, Silvia M. Mueller, Jochen Preiss, Kerstin Schelm
-
Patent number: 7694112Abstract: A method for executing multiple computational primitives is provided in accordance with exemplary embodiments. A first computational unit and at least a second computational unit cooperate to execute multiple computational primitives. The first computational unit independently computes other computational primitives. By virtue of arbitration for shared source operand buses or shared result buses, availability of the first and second computational units needed to execute cooperatively the multiple computational primitives is assured by a process of reservation as used for a computational primitive executed on a dedicated computational unit.Type: GrantFiled: January 31, 2008Date of Patent: April 6, 2010Assignee: International Business Machines CorporationInventors: Harry S. Barowski, J. Adam Butts, Stephen V. Kosonocky, Silvia M. Mueller, Jochen Preiss
-
Publication number: 20100063987Abstract: In a binary floating point processor, the exponents of each of the various types of operands are recoded into an internal format, by biasing the exponents with the minimum exponent value of the result precision (“Emin”), i.e., the recoded value of the exponent is the represented value of the exponent minus Emin. Emin depends only on the result precision of the instruction that is currently being executed in the binary floating point processor. The exponent computations are then performed in this new format. The underflow check for all result precisions is a check against zero and overflow checks are performed against a positive number that depends on the result precision. The exponent values are in a 2's complement representation, so the underflow check simply becomes a check of the sign bit.Type: ApplicationFiled: September 9, 2008Publication date: March 11, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Maarten J. Boersma, K. Michael Kroener, Petra Leber, Silvia M. Mueller, Jochen Preiss, Kerstin Schelm
-
Publication number: 20100063985Abstract: A floating point processor unit includes a shift amount calculation circuit within a normalizer portion of the floating point unit, wherein the shift amount calculation circuit is utilized to compute the normalizer shift amount for a log estimate instruction that runs as a pipelinable instruction.Type: ApplicationFiled: September 11, 2008Publication date: March 11, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Maarten J. Boersma, Michael Klein, Jochen Preiss, Son Dao Trong
-
Publication number: 20100057825Abstract: A method for operand width reduction is described, wherein two N-bit input operands (A, B) of a bit width of N are processed and two M-bit output operands (A?, B?) of a reduced bit width of M are generated in a way, that a post-processing comprising an M-bit adder function followed by saturation to M bits performed on said two M-bit output operands (A?, B?) provides an M-bit result equal to an M-bit result of an N-bit modulo adder function of the two N-bit input operands (A, B), followed by a saturation to M bits. Further an electronic computing circuit (1, 5) is described performing said method. Additionally a computer system comprising such an electronic computing circuit is described.Type: ApplicationFiled: February 11, 2008Publication date: March 4, 2010Applicant: International Business Machines CorporationInventors: Tobias Gemmeke, Nicolas Maeding, Jochen Preiss
-
Publication number: 20100058266Abstract: A 3-stack floorplan for a floating point unit includes: an aligner located in the center of the floating point unit; a frontend located directly above the aligner; a multiplier located directly below the frontend and next to the aligner; an adder located directly next to the multiplier and directly below the aligner; a normalizer located directly above the adder; and a rounder located directly above the normalizer.Type: ApplicationFiled: August 29, 2008Publication date: March 4, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Maarten Boersma, Michael Kroener, Petra Leber, Silvia M. Mueller, Jochen Preiss, Kerstin Schelm
-
Publication number: 20100023573Abstract: The forcing of the result or output of a rounder portion of a floating point processor occurs only in a fraction non-increment data path within the rounder and not in the fraction increment data path within the rounder. The fraction forcing is active on a corner case such as a disabled overflow exception. A disabled overflow exception may be detected by inspecting the normalized exponent. If a disabled overflow exception is detected, the round mode is selected to execute only in the non-increment data path thereby preventing the fraction increment data path from being selected.Type: ApplicationFiled: July 22, 2008Publication date: January 28, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Maarten J. Boersma, J. Adam Butts, Silvia M. Mueller, Jochen Preiss
-
Publication number: 20100017635Abstract: A method, system and computer program product for reducing power consumption when processing mathematical operations. Power may be reduced in processor hardware devices that receive one or more operands from an execution unit that executes instructions. A circuit detects when at least one operand of multiple operands is a zero operand, prior to the operand being forwarded to an execution component for completing a mathematical operation. When at least one operand is a zero operand or at least one operand is “unordered”, a flag is set that triggers a gating of a clock signal. The gating of the clock signal disables one or more processing stages and/or devices, which perform the mathematical operation. Disabling the stages and/or devices enables computing the correct result of the mathematical operation on a reduced data path. When a device(s) is disabled, the device may be powered off until the device is again required by subsequent operations.Type: ApplicationFiled: July 18, 2008Publication date: January 21, 2010Inventors: HARRY S. BAROWSKI, MAARTEN J. BOERSMA, SILVIA M. MUELLER, TIM NIGGEMEIER, JOCHEN PREISS
-
Patent number: 7639046Abstract: A method to reduce power consumption within a clock gated synchronous circuit, said synchronous circuit comprising at least two successive stages, wherein each stage if activated propagates a data signal cycle by cycle to a succeeding stage, comprising the steps of: deriving a local clock activation signal from an external clock activation signal, wherein said local clock activation signal changes its value every cycle the external clock activation signal indicates a propagation, propagating the data signal and the local clock activation signal synchronously cycle by cycle from a particular stage to a succeeding stage whenever a local clock activation signal at the particular stage by derivation from the clock activation signal or by propagation through the synchronous circuit changes its value between two successive cycles, in order to propagate the data signal and the local clock activating signal within the same clock domain through the clock gated synchronous circuit.Type: GrantFiled: September 6, 2007Date of Patent: December 29, 2009Assignee: International Business Machines CorporationInventors: Tobias Gemmeke, Jens Leenstra, Jochen Preiss
-
Publication number: 20090198758Abstract: A method for implementing sign extension within a multi-precision multiplier is described. The method includes receiving a first input within a multiplier core of the multiplier, receiving a second input within the multiplier core, and creating partial products in the multiplier core using the first and second inputs. The method also includes summing up the partial products in a partial product reduction tree in the multiplier core. The method also includes performing sign extension within the partial product reduction tree of the multiplier core by adding a value to a partial product of the partial product reduction tree. The method further includes computing an output from the partial product reduction tree, the output including a final product of the first and second inputs signed extended to a desired width.Type: ApplicationFiled: January 31, 2008Publication date: August 6, 2009Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Harry S. Barowski, Jeffrey Adam Butts, Silvia M. Mueller, Tim Niggemeier, Jochen Preiss
-
Publication number: 20090198974Abstract: A method for executing multiple computational primitives is provided in accordance with exemplary embodiments. A first computational unit and at least a second computational unit cooperate to execute multiple computational primitives. The first computational unit independently computes other computational primitives. By virtue of arbitration for shared source operand buses or shared result buses, availability of the first and second computational units needed to execute cooperatively the multiple computational primitives is assured by a process of reservation as used for a computational primitive executed on a dedicated computational unit.Type: ApplicationFiled: January 31, 2008Publication date: August 6, 2009Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Harry S. Barowski, J. Adam Butts, Stephen V. Kosonocky, Silvia M. Mueller, Jochen Preiss
-
Patent number: 7461117Abstract: The invention proposes a Floating Point Unit (1) with fused multiply add, with one addend operand (eb, fb) and two multiplicand operands (ea, fa; ec, fc), with a shift amount logic (2) which based on the exponents of the operands (ea, eb and ec) computes an alignment shift amount, with an alignment logic (3) which uses the alignment shift amount to align the fraction (fb) of the addend operand, with a multiply logic (4) which multiplies the fractions of the multiplicand operands (fa, fc), with a adder logic (5) which adds the outputs of the alignment logic (3) and the multiply logic (4), with a normalization logic (6) which normalizes the output of the adder logic (5), which is characterized in that a leading zero logic (7) is provided which computes the number of leading zeros of the fraction of the addend operand (fb), and that a compare logic (8) is provided which based on the number of leading zeros and the alignment shift amount computes select signals that indicate whether the most significant bits of tType: GrantFiled: February 11, 2005Date of Patent: December 2, 2008Assignee: International Business Machines CorporationInventors: Son Dao Trong, Juergen Haess, Christian Jacobi, Klaus Michael Kroener, Silvia Melitta Mueller, Jochen Preiss
-
Publication number: 20080276140Abstract: A semiconductor chip subdivided into power domains, at least one of the power domains is separately activated or deactivated and at least a part of the scannable storage elements are interconnected to one or more scan chains. At least one scan chain is serially subdivided into scan chain portions and the scan chain portion is arranged within one of the power domains. For at least one scan chain portion a bypass line is provided for passing by scan data and at least one select unit is provided for selecting between the bypass line and the corresponding scan chain portion in dependence of the activated or deactivated state of the corresponding power domains.Type: ApplicationFiled: March 5, 2008Publication date: November 6, 2008Inventors: Tobias Gemmeke, Christoph Jaeschke, Jens Kuenzer, Cedric Lichtenau, Thomas Pflueger, Jochen Preiss
-
Publication number: 20080169842Abstract: A design structure to reduce power consumption within a clock gated synchronous circuit, said synchronous circuit comprising at least two successive stages, wherein each stage if activated propagates a data signal cycle by cycle to a succeeding stage the two successive stages comprising at least a control register, a data register and a local clock buffer (LCB) each, wherein each stage if activated propagates a data signal stored within the data register cycle by cycle to a data register of a succeeding stage.Type: ApplicationFiled: September 6, 2007Publication date: July 17, 2008Inventors: Tobias Gemmeke, Jens Leenstra, Jochen Preiss
-
Publication number: 20080169841Abstract: A method to reduce power consumption within a clock gated synchronous circuit, said synchronous circuit comprising at least two successive stages, wherein each stage if activated propagates a data signal cycle by cycle to a succeeding stage, comprising the steps of: deriving a local clock activation signal from an external clock activation signal, wherein said local clock activation signal changes its value every cycle the external clock activation signal indicates a propagation, propagating the data signal and the local clock activation signal synchronously cycle by cycle from a particular stage to a succeeding stage whenever a local clock activation signal at the particular stage by derivation from the clock activation signal or by propagation through the synchronous circuit changes its value between two successive cycles, in order to propagate the data signal and the local clock activating signal within the same clock domain through the clock gated synchronous circuit.Type: ApplicationFiled: September 6, 2007Publication date: July 17, 2008Inventors: Tobias Gemmeke, Jens Leenstra, Jochen Preiss
-
Publication number: 20080162897Abstract: A binary logic unit to apply any Boolean operation on two input signals (va, vb) is described, wherein any Boolean operation to be applied on the input signals (va, vb) is defined by a particular combination of well defined control signals (ctl0, ctl1, ctl2, ctl3), wherein the input signals (va, vb) are used to select a control signal (ctl0, ctl1, ctl2, ctl3) as an output signal (vo) of the binary logic unit representing the result of a particular Boolean operation applied on the two input signals (va, vb). Furthermore a method to operate such a binary logic unit is described.Type: ApplicationFiled: October 16, 2007Publication date: July 3, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Tobias Gemmeke, Jochen Preiss
-
Publication number: 20070050435Abstract: The present invention relates to a circuit comprising a Leading Zero Counter (LZC) sub-circuit driving a second sub-circuit, like a shifter or arbiter. Shifter circuits or arbiter circuits operating with fewer stages than before have a smaller delay since every stage can select between more than two inputs. This reduces the overall delay of the shifter, arbiter, etc. But for state-of-the art binary LZC circuits this requires a complex recoding between LZC and shifter circuit. In order to provide an improved leading zero circuit having an output which allows a simpler control of a post-connected sub-circuit having two or more stages and having at least one stage with three or more inputs, it is proposed to provide a LZC circuitry providing an output consisting of two or more unary encoded substrings. This removes the requirement for a recoder between LZC and shifter.Type: ApplicationFiled: July 25, 2006Publication date: March 1, 2007Inventors: Christian Jacobi, Silvia Mueller, Jochen Preiss, Kai Weber
-
Publication number: 20070038693Abstract: The invention relates to a method for performing floating-point instructions within a processor of a data processing system is described, wherein an input of said floating-point instruction comprises a normal or a denormal floating-point number. Said method comprises the steps of storing said floating-point number, normalization of said floating-point number by counting the leading zeros of the mantissa, shifting the fraction part to the left by the number of leading zeros and simultaneously decrementing the exponent by one for every position that the fraction part is shifted to the left, wherein it the input is a normal floating point number the normalization is done after counting no leading zero of the mantissa, execution of a floating point instruction, wherein said normalized floating-point number is utilized as input for the floating point instruction, and storing of a floating-point result. Furthermore a processor to be used to perform said method is described.Type: ApplicationFiled: August 3, 2006Publication date: February 15, 2007Inventors: Christian Jacobi, Matthias Klein, Silvia Mueller, Matthias Pflanz, Jochen Preiss
-
Publication number: 20060184601Abstract: The invention proposes a Floating Point Unit (1) with fused multiply add, with one addend operand (eb, fb) and two multiplicand operands (ea, fa; ec, fc), with a shift amount logic (2) which based on the exponents of the operands (ea, eb and ec) computes an alignment shift amount, with an alignment logic (3) which uses the alignment shift amount to align the fraction (fb) of the addend operand, with a multiply logic (4) which multiplies the fractions of the multiplicand operands (fa, fc), with a adder logic (5) which adds the outputs of the alignment logic (3) and the multiply logic (4), with a normalization logic (6) which normalizes the output of the adder logic (5), which is characterized in that a leading zero logic (7) is provided which computes the number of leading zeros of the fraction of the addend operand (fb), and that a compare logic (8) is provided which based on the number of leading zeros and the alignment shift amount computes select signals that indicate whether the most significant bits of tType: ApplicationFiled: February 11, 2005Publication date: August 17, 2006Inventors: Son Trong, Juergen Haess, Christian Jacobi, Klaus Kroener, Silvia Mueller, Jochen Preiss