Abstract: Apparatus comprising an input connected to receive an input signal, a lookup table comprising a plurality of input entries and first and second output entries for each input entry. The look up table receives the input signal and returns a lower input entry, an upper input entry, the second output entry for the lower input entry, and the first output entry for the upper input entry. A first subtractor subtracts the lower input entry from the input signal to produce a first difference. A second subtractor subtracts the input signal from the upper input entry to produce a second difference. First and second multipliers multiply the first and second differences by the first output entry for the upper input entry and the second output entry for the lower input entry, respectively, to produce first and second products. An adder adds the first and second products to produce an output signal.
Abstract: In an embodiment, hardware implementing a transcendental or other non-linear function is based on a series expansion of the function. For example, a Taylor series expansion may be used as the basis. One or more of the initial terms of the Taylor series may be used, and may be implemented in hardware. In some embodiments, modifications to the Taylor series expansion may be used to increase the accuracy of the result. In one embodiment, a variety of bit widths for the function operands may be acceptable for use in a given implementation. A methodology for building a library of series-approximated components for use in integrated circuit design is provided which synthesizes the acceptable implementations and tests the results for accuracy. A smallest (area-wise) implementation which produces a desired level of accuracy may be selected as the library element.
Type:
Grant
Filed:
March 30, 2012
Date of Patent:
April 21, 2015
Assignee:
Apple Inc.
Inventors:
Vaughn T. Arnold, Brijesh Tripathi, Albert Kuo
Abstract: A processor for dividing by calculating repeatedly an n-bit width partial quotient includes, a dividend zero count value counter that counts a dividend zero count value, a divisor zero count value counter that counts a divisor zero count value, a correction value calculator that calculates a correction value to a loop count value, a correction loop count value calculator that calculates a correction loop count value, a dividend shift unit that shifts leftward an absolute value of the dividend by the dividend zero count value and shifts rightward the leftward-shifted absolute value of the dividend by the correction value, a divisor shift unit that shifts leftward an absolute value of the divisor by the divisor zero count value, and a division loop operation unit that divides based on an output value from the dividend shift unit, an output value from the divisor shift unit, and the correction loop count value.
Abstract: Provided is a system for generating coefficient values. The system may include a base function generator and a series of accumulators including a leading and a last accumulator. In the series of accumulators, the data output of each accumulator, except the last, may be coupled to the data input of a successive adjacent accumulator. The base function generator may be configured to output, to the leading accumulator, a series of data values that may correspond to a base function that is a specified order derivative of a filter function. Each accumulator may be configured to: add a data value currently at its data input to a currently stored data value to produce an updated data value that may correspond to a respective value of a specified order integral of the base function; store the updated data value in the accumulator; and output the updated data value at its data output.
Abstract: Methods and circuitry for evaluating reciprocal, square root, inverse square root, logarithm, and exponential functions of an input value, Y. In one embodiment, an approximate value, RA, of the reciprocal of Y is generated. One Newton-Raphson iteration is performed as a function of RA and Y, resulting in a truncated approximate value, R. R is multiplied by Y and 1 is subtracted, resulting in a reduced argument, A. A Taylor series evaluation of A is performed, resulting in an evaluated argument, B. B is multiplied by a post-processing factor for the final result.
Abstract: The invention relates to a circuit for generating a true, circuit-specific and time-invariant random binary number, having: a matrix of K?L delay elements that can be connected to each other by means of L?1 single or double commutation circuits into chains of delay elements of length L, a single or double demultiplexer connected before the matrix, a single or double multiplexer connection after the matrix, and a run time or number comparator, wherein the setting of the commutation circuits, the demultiplexer, and the multiplexer can be prescribed by a control signal, wherein the circuit comprises a channel code encoder whereby code words of a channel code can be generated and a transcriber, whereby code words of the channel code can be transcribed into the control signal of the L?1 single or double commutation circuits, and a method for generating a true, circuit-specific and time-invariant random number by means of a matrix of L?K delay elements, L?1 single or double commutation circuits, a single or double
Type:
Grant
Filed:
December 11, 2008
Date of Patent:
March 24, 2015
Assignee:
Micronas GmbH
Inventors:
Dejan Lazich, Micaela Wuensche, Sebastian Kaluza
Abstract: A computer processor including a single fused-unfused floating point multiply-add (FMA) module computes the result of the operation A*B+C for floating point numbers for fused multiply-add rounding operations and unfused multiply-add rounding operations. In one embodiment, a fused multiply-add rounding implementation is augmented with additional hardware which calculates an unfused multiply-add rounding result without adding additional pipeline stages. In one embodiment, a computation by the fused-unfused floating point multiply-add (FMA) module is initiated using a single opcode which determines whether a fused multiply-add rounding result or unfused multiply-add rounding result is generated.
Abstract: A technique is provided for performing a mixed precision estimate. A processing circuit receives an input of a first precision having a wide precision value. The processing circuit computes an output in an output exponent range corresponding to a narrow precision value based on the input having the wide precision value.
Type:
Grant
Filed:
February 9, 2012
Date of Patent:
March 17, 2015
Assignee:
International Business Machines Corporation
Inventors:
Michael K. Gschwind, Valentina Salapura
Abstract: A residue generating circuit for an execution unit that supports vector operations includes an operand register and a residue generator coupled to the operand register. The residue generator includes a first residue generation tree coupled to a first section of the operand register and a second residue generation tree coupled to a second section of the operand register. The first residue generation tree is configured to generate a first residue for first data included in the first section of the operand register. The second residue generation tree is configured to generate a second residue for second data included in a second section of the operand register. The first section of the operand register includes a different number of register bits than the second section of the operand register.
Type:
Grant
Filed:
February 6, 2012
Date of Patent:
March 17, 2015
Assignee:
International Business Machines Corporation
Abstract: Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
Type:
Grant
Filed:
May 11, 2012
Date of Patent:
March 10, 2015
Assignee:
Oracle International Corporation
Inventors:
Jeffrey S. Brooks, Christopher H. Olson
Abstract: A specialized processing block is configurable as one ternary linear decomposition or two binary linear decompositions to perform large multiplications using smaller multipliers, and includes a first number of multiplier circuits of a first size, a second number of pre-adders, and a third number of block inputs. The block inputs are connected to a first subset of the multiplier circuits, and to the pre-adders which are connected to a second subset of the multiplier circuits. There is also a fourth number of additional inputs. A plurality of shifters shift partial product outputs of each of the multipliers by various shift amounts. A joint adder structure combines the shifted partial products. Controllable elements controllably select between different configurations of inputs to the multipliers and pre-adders, controllably connect and disconnect certain ones of the shifted partial products, and selectively split the joint adder structure into two smaller adder structures.
Abstract: A computer system is operable to identify subfields that differ in two data elements using a bit matrix compare function between a first matrix filled with pattern elements and a reference pattern.
Abstract: A random number quality control circuit capable of fast control of the level of random number quality is present. When a “0” output section and a “1” output section generate random numbers by individually receiving a random number signal, a random number quality monitor monitors an unbalance between the numbers of “0”s and “1”s. If a deviation from a desired ratio is found, a drive controller controls the reception characteristics of the “0” output section and “1” output section individually so that the deviation will be compensated for. The amount of information intercepted between a sender and a receiver can be reduced by maintaining the mark ratio of shared random numbers at 50%.
Abstract: A group of numbers from which the smallest and second-smallest are to be selected are compared in a cascaded tree. Each comparison stage will select the smallest number from two numbers output by the previous stage, into which four numbers are input. The second-smallest number is one of the other three inputs to the previous stage and, as before, all bits of the second-smallest number will not be known until the smallest number is determined. However, because at each stage of the determination, the next stage is reached because the bit values being examined are the same, those bit values of the second-smallest number (and indeed of the smallest number) are known ahead of the final determination of the smallest number. Accordingly, one can begin to output bits of the second-smallest number (as well as of the smallest number) even before that final determination.
Abstract: A remainder by division of a sequence of bytes interpreted as a first number by a second number is calculated. A first remainder by division associated with a first subset of the sequence of bytes is calculated with a first processor. A second remainder by division associated with a second subset of the sequence of bytes is calculated with a second processor. The calculating of the second remainder by division may occur at least partially during the calculating of the first remainder by division. A third remainder by division is calculated based on the calculating of the first remainder by division and the calculating of the second remainder by division.
Type:
Grant
Filed:
June 13, 2012
Date of Patent:
January 13, 2015
Assignee:
International Business Machines Corporation
Inventors:
Michael Hirsch, Shmuel T. Klein, Yair Toaff
Abstract: A remainder by division of a sequence of bytes interpreted as a first number by a second number is calculated. A first remainder by division associated with a first subset of the sequence of bytes is calculated with a first processor. A second remainder by division associated with a second subset of the sequence of bytes is calculated with a second processor. The calculating of the second remainder by division may occur at least partially during the calculating of the first remainder by division. A third remainder by division is calculated based on the calculating of the first remainder by division and the calculating of the second remainder by division.
Type:
Grant
Filed:
December 15, 2010
Date of Patent:
January 6, 2015
Assignee:
International Business Machines Corporation
Inventors:
Michael Hirsch, Shmuel T. Klein, Yair Toaff
Abstract: Methods and apparatus relating to reducing power consumption in multi-precision floating point multipliers are described. In an embodiment, certain portions of a multiplier are disabled in response to two or more multiplication operations with the same data size and data type occurring back-to-back. Other embodiments are also claimed and described.
Type:
Grant
Filed:
December 14, 2010
Date of Patent:
December 23, 2014
Assignee:
Intel Corporation
Inventors:
Brent R. Boswell, Thierry Pons, Tom Aviram
Abstract: A random number generator of a processor comprises a whitener for reducing the bias in random numbers generated by the random number generator. The whitener receives a random number of a first length read by an array of latches with inputs from an array of oscillators. The whitener dynamically creates a mask of the first length based on a parity of at least one previous random number read from the array of latches during at least one cycle prior to reading the random number. The whitener applies a compare operation between the random number and the mask to generate a whitened random number of the first length, with reduced bias, without reducing randomness.
Type:
Grant
Filed:
January 16, 2013
Date of Patent:
December 23, 2014
Assignee:
International Business Machines Corporation
Abstract: A random number generator of a processor comprises a whitener for reducing the bias in random numbers generated by the random number generator. The whitener receives a random number of a first length read by an array of latches with inputs from an array of oscillators. The whitener dynamically creates a mask of the first length based on a parity of at least one previous random number read from the array of latches during at least one cycle prior to reading the random number. The whitener applies a compare operation between the random number and the mask to generate a whitened random number of the first length, with reduced bias, without reducing randomness.
Type:
Grant
Filed:
August 22, 2012
Date of Patent:
December 23, 2014
Assignee:
International Business Machines Corporation
Abstract: Disclosed is a method of seeking semianalytical solutions to multispecies transport equations coupled with sequential first-order network reactions under conditions wherein a groundwater flow velocity and a dispersion coefficient vary spatially and temporally and boundary conditions vary temporally.
Type:
Grant
Filed:
July 15, 2013
Date of Patent:
December 9, 2014
Assignee:
Korea Institute of Geoscience and Minerals Resources