Round Off Or Truncation Patents (Class 708/551)
  • Patent number: 11698772
    Abstract: An instruction is executed in round-for-reround mode wherein the permissible resultant value that is closest to and no greater in magnitude than the infinitely precise result is selected. If the selected value is not exact and the units digit of the selected value is either 0 or 5, then the digit is incremented by one and the selected value is delivered. In all other cases, the selected value is delivered.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: July 11, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric Mark Schwarz, Martin Stanley Schmookler
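The rule in this abstract can be sketched in a few lines. This is only an illustration on decimal values, not the patented instruction: the function name `round_for_reround` and the use of decimal places (rather than a hardware significand length) are assumptions made for the example.

```python
from decimal import Decimal, ROUND_DOWN


def round_for_reround(value: Decimal, digits: int) -> Decimal:
    """Round `value` to `digits` decimal places, round-for-reround style."""
    quantum = Decimal(1).scaleb(-digits)  # e.g. digits=2 -> 0.01
    # Closest permissible value that is no greater in magnitude (truncate).
    truncated = value.quantize(quantum, rounding=ROUND_DOWN)
    if truncated == value:
        return truncated  # exact result: deliver as is
    units_digit = truncated.as_tuple().digits[-1]
    if units_digit in (0, 5):
        # Inexact and units digit is 0 or 5: bump the magnitude by one unit.
        return truncated + quantum.copy_sign(value)
    return truncated


print(round_for_reround(Decimal("1.234"), 2))  # truncated, units digit 3
print(round_for_reround(Decimal("1.201"), 2))  # units digit 0 and inexact
```

The bumped digit records that the value was inexact, which is what lets a later rounding to fewer digits see that the result was not an exact 0 or 5.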
  • Patent number: 11531727
    Abstract: Some embodiments provide a method for a circuit that executes a neural network including multiple nodes. The method loads a set of weight values for a node into a set of weight value buffers, a first set of bits of each input value of a set of input values for the node into a first set of input value buffers, and a second set of bits of each of the input values into a second set of input value buffers. The method computes a first dot product of the weight values and the first set of bits of each input value and a second dot product of the weight values and the second set of bits of each input value. The method shifts the second dot product by a particular number of bits and adds the first dot product with the bit-shifted second dot product to compute a dot product for the node.
    Type: Grant
    Filed: December 6, 2018
    Date of Patent: December 20, 2022
    Assignee: PERCEIVE CORPORATION
    Inventors: Jung Ko, Kenneth Duong, Steven L. Teig
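The arithmetic identity behind this abstract can be sketched as follows. The function name `split_dot_product` and the default 4-bit split are assumptions for illustration; the abstract does not state the bit widths, and the buffers are modeled here as plain Python lists.

```python
def split_dot_product(weights, inputs, low_bits=4):
    """Dot product computed from two narrower dot products."""
    mask = (1 << low_bits) - 1
    low = [x & mask for x in inputs]        # first set of bits of each input
    high = [x >> low_bits for x in inputs]  # second set of bits of each input
    dot_low = sum(w * x for w, x in zip(weights, low))
    dot_high = sum(w * x for w, x in zip(weights, high))
    # Shift the second dot product and add it to the first.
    return dot_low + (dot_high << low_bits)


weights = [1, -2, 3]
inputs = [200, 17, 255]
print(split_dot_product(weights, inputs))
print(sum(w * x for w, x in zip(weights, inputs)))  # direct computation
```

Both printed values agree because each input equals `(high << low_bits) + low`, and the dot product is linear in the inputs.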
  • Patent number: 11416736
    Abstract: Systems and methods are related to improving throughput of neural networks in integrated circuits by combining values in operands to increase compute density. A system includes an integrated circuit (IC) having multiplier circuitry. The IC receives a first value and a second value in a first operand. The IC performs a multiplication operation, via the multiplier circuitry, on the first operand and a second operand to produce a first multiplied product based at least in part on the first value and a second multiplied product based at least in part on the second value.
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: August 16, 2022
    Assignee: Intel Corporation
    Inventors: Kevin Nealis, Randy Huang
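One way to picture combining two values in a single operand is the sketch below. It is a simplified, unsigned model under stated assumptions: the name `packed_multiply`, the 16-bit field spacing, and the requirement that the lower product fits in the field are all choices made for the example, not details from the abstract.

```python
def packed_multiply(a, b, w, shift=16):
    """Multiply two unsigned values by w using one multiplication.

    Assumes 0 <= b and b * w < 2**shift so the two products do not overlap.
    """
    if b < 0 or b * w >= (1 << shift):
        raise ValueError("lower product would overflow into the upper field")
    packed = (a << shift) | b                 # both values in one operand
    product = packed * w                      # single multiplication
    second = product & ((1 << shift) - 1)     # product based on b
    first = product >> shift                  # product based on a
    return first, second


print(packed_multiply(7, 9, 5))
```

The split works because `((a << shift) + b) * w` equals `((a * w) << shift) + b * w`, so each product occupies its own bit field as long as the lower one does not overflow.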
  • Patent number: 11182666
    Abstract: In one example, an integrated circuit includes a first circuit, a second circuit, a third circuit, and a fourth circuit. The first circuit is configured to receive an input value and generate a first intermediate value based on a first probability density distribution associated with the input value. The second circuit comprises a set of multiplexer circuits configured to select, from a first set of candidate values and based on the first intermediate value, a first product of the first intermediate value and a weight value. The third circuit is configured to generate a second intermediate value based on a sum of the first product and a second product received from another circuit. The fourth circuit is configured to generate an output value based on the second intermediate value and a second probability density distribution associated with the second intermediate value.
    Type: Grant
    Filed: November 7, 2017
    Date of Patent: November 23, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Taylor Phebus, Asif Khan
  • Patent number: 11003446
    Abstract: Adder trees may be constructed for efficient packing of arithmetic operators into an integrated circuit. The operands of the trees may be truncated to pack an integer number of nodes per logic array block. As a result, arithmetic operations may pack more efficiently onto the integrated circuit while providing increased precision and performance.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: May 11, 2021
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler, Bogdan Pasca
  • Patent number: 10761805
    Abstract: The present embodiments relate to performing reduced-precision floating-point arithmetic operations using specialized processing blocks with higher-precision floating-point arithmetic circuitry. A specialized processing block may receive four floating-point numbers that represent two single-precision floating-point numbers, each separated into an LSB portion and an MSB portion, or four half-precision floating-point numbers. A first partial product generator may generate a first partial product of first and second input signals, while a second partial product generator may generate a second partial product of third and fourth input signals.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: September 1, 2020
    Assignee: Altera Corporation
    Inventor: Martin Langhammer
  • Patent number: 9702960
    Abstract: A method for determining an FDOA of a pulsed waveform received by two sensors includes obtaining a respective plurality of in-phase and quadrature-phase (IQ) samples indicative of a pulse envelope of the received pulsed waveform. The method includes determining a TDOA responsive to a leading edge of a pulse of the pulsed waveform and obtaining a first cross correlation of IQ samples at a delay (dc) closest to the TDOA, and respective second and third cross correlations at least one additional delay (dc+1 and dc−1) on either side of the closest delay. The method includes refining the approximation of the TDOA according to an interpolation of amplitudes of the cross-correlation and determining a respective rate of change of cross-correlation phase (Δφ). The method includes approximating a straight line fit to the rates of change of cross-correlation phase (dΔφ/dt), the slope of the straight line representative of the FDOA.
    Type: Grant
    Filed: July 19, 2013
    Date of Patent: July 11, 2017
    Assignee: Raytheon Company
    Inventors: John T. Broad, Lee M. Savage
  • Patent number: 9535659
    Abstract: Embodiments relate to a hardware circuit that is operable as a fixed point adder and a checksum adder. An aspect includes a driving of a multifunction compression tree disposed on a circuit path based on a control bit to execute one of first and second schemes of vector input addition and a driving of a multifunction adder disposed on the circuit path based on the control bit to perform the one of the first and second schemes of vector input addition.
    Type: Grant
    Filed: March 10, 2016
    Date of Patent: January 3, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James R. Cuffney, John G. Rell, Jr., Eric M. Schwarz, Patrick M. West, Jr.
  • Patent number: 9436434
    Abstract: Embodiments relate to a hardware circuit that is operable as a fixed point adder and a checksum adder. An aspect includes a driving of a multifunction compression tree disposed on a circuit path based on a control bit to execute one of first and second schemes of vector input addition and a driving of a multifunction adder disposed on the circuit path based on the control bit to perform the one of the first and second schemes of vector input addition.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: September 6, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James R. Cuffney, John G. Rell, Jr., Eric M. Schwarz, Patrick M. West, Jr.
  • Patent number: 9286267
    Abstract: In one embodiment, the present invention includes a method for receiving a rounding instruction and an immediate value in a processor, determining if a rounding mode override indicator of the immediate value is active, and if so executing a rounding operation on a source operand in a floating point unit of the processor responsive to the rounding instruction and according to a rounding mode set forth in the immediate operand. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 11, 2013
    Date of Patent: March 15, 2016
    Assignee: Intel Corporation
    Inventors: Ronen Zohar, Shane Story
  • Patent number: 9244653
    Abstract: A floating point value can represent a number or something that is not a number (NaN). A floating point value that is a NaN can have a data field that stores information, such as a propagation count that indicates the number of times a NaN value has been propagated through instructions. A NaN evaluation instruction can determine whether one or more operands is a NaN operand of a particular type, and if so can generate a result that is a NaN of a different type. An exception can be generated based upon the NaN of the different type being provided as a resultant.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: January 26, 2016
    Assignee: FREESCALE SEMICONDUCTOR, INC.
    Inventor: William C. Moyer
  • Patent number: 9223751
    Abstract: In one embodiment, the present invention includes a method for receiving a rounding instruction and an immediate value in a processor, determining if a rounding mode override indicator of the immediate value is active, and if so executing a rounding operation on a source operand in a floating point unit of the processor responsive to the rounding instruction and according to a rounding mode set forth in the immediate operand. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 22, 2006
    Date of Patent: December 29, 2015
    Assignee: Intel Corporation
    Inventors: Ronen Zohar, Shane Story
  • Patent number: 9213524
    Abstract: A floating-point value can represent a number or something that is not a number (NaN). A floating-point value that is a NaN includes a portion that stores information about the source operands of the instruction.
    Type: Grant
    Filed: September 30, 2013
    Date of Patent: December 15, 2015
    Assignee: FREESCALE SEMICONDUCTOR, INC.
    Inventor: William C. Moyer
  • Patent number: 9104479
    Abstract: Processing circuitry is provided to perform an operation FRINT for rounding a floating-point value to an integral floating-point value. Control circuitry controls the processing circuitry to perform the FRINT operation in response to an FRINT instruction. The processing circuitry includes shifting circuitry for generating a rounding value by shifting a base value, adding circuitry for adding the rounding value to the significand of the floating-point value to generate a sum value, mask generating circuitry for generating a mask for clearing fractional-valued bits of the sum value, and masking circuitry for applying the mask to the sum value to generate the integral floating-point value.
    Type: Grant
    Filed: December 7, 2011
    Date of Patent: August 11, 2015
    Assignee: ARM Limited
    Inventors: David Raymond Lutz, Neil Burgess, Sabrina Marie Romero
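The shift, add, and mask sequence described above can be mimicked in software on the bit pattern of a double. This is a sketch only: the name `frint_ties_away`, the choice of round-to-nearest with ties away from zero, and the constants `BASE` and `FRACTION_MASK` are assumptions for the example and are not taken from the patent.

```python
import math
import struct

BASE = 1 << 51                 # the 0.5 bit position when the exponent is 0
FRACTION_MASK = (1 << 52) - 1  # all 52 stored significand bits


def frint_ties_away(x: float) -> float:
    """Round a double to an integral double (nearest, ties away from zero)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    exponent = ((bits >> 52) & 0x7FF) - 1023  # unbiased exponent
    if exponent >= 52:
        return x  # already integral, or infinity/NaN
    if exponent < 0:
        # |x| < 1: the result is 1 for |x| >= 0.5, otherwise 0 (sign kept).
        return math.copysign(1.0 if exponent == -1 else 0.0, x)
    rounding = BASE >> exponent        # rounding value: shifted base value
    mask = FRACTION_MASK >> exponent   # fractional-valued bits to clear
    total = bits + rounding            # a carry may ripple into the exponent
    return struct.unpack("<d", struct.pack("<Q", total & ~mask))[0]


print(frint_ties_away(2.5), frint_ties_away(-2.4), frint_ties_away(1.5))
```

Adding the rounding value to the whole bit pattern lets a significand overflow increment the exponent field without a separate normalization step.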
  • Patent number: 8972472
    Abstract: A system and method for unbiased rounding away from, or toward, zero by truncating N bits from a M bit input number to provide a M−N bit number, and adding the equivalent value of ‘½’ to the M−N bit number unless the input number is negative, or positive, respectively, and the N truncated bits represent exactly ½. The method for rounding away from zero may include outputting a (M−N) bit truncated number if the M-bit input number is negative and the sequence of N truncated bits comprises a most significant bit of 1, followed by zeros; and otherwise, computing and outputting a sum of (a) a number that has an equivalent value of one followed by (N−1) replicas of zero, the one provided by applying a logical operation on the most significant bit of the sequence of truncated bits and (b) the (M−N) bit truncated number.
    Type: Grant
    Filed: September 17, 2008
    Date of Patent: March 3, 2015
    Assignee: Densbits Technologies Ltd.
    Inventors: Ofir Avraham Kanter, Ilan Bar
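The away-from-zero case in this abstract can be sketched on Python integers, which behave like two's-complement numbers under `>>` and `&`. The function name `round_away_from_zero` is hypothetical, and the sketch assumes N is at least 1.

```python
def round_away_from_zero(x: int, n: int) -> int:
    """Drop the low n bits of x, rounding to nearest with ties away from zero."""
    truncated = x >> n              # arithmetic shift: floor(x / 2**n)
    dropped = x & ((1 << n) - 1)    # the N truncated bits
    half = 1 << (n - 1)             # bit pattern 1 followed by zeros
    if x < 0 and dropped == half:
        # Exactly one half on a negative input: the truncated value already
        # lies away from zero, so nothing is added.
        return truncated
    # Otherwise add the most significant truncated bit (the "one half").
    return truncated + (dropped >> (n - 1))


for value in (6, -6, 5, -5, 7, -7):   # value / 4 = +-1.5, +-1.25, +-1.75
    print(value, round_away_from_zero(value, 2))
```

Skipping the increment only for negative ties is what removes the bias that plain "add half and truncate" has toward positive infinity.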
  • Publication number: 20150039665
    Abstract: A processing apparatus supports a narrowing-and-rounding arithmetic operation which generates, in response to two operands each comprising at least one W-bit data element, a result value comprising at least one X-bit result data element, with each X-bit result data element representing a sum or difference of corresponding W-bit data elements of the two operands rounded to an X-bit value (W>X). The arithmetic operation is implemented using a number of N-bit additions (N<W), with carry values from a first stage of N-bit additions being added at a second stage of N-bit additions for adding a rounding value to the result of the first stage additions. This technique reduces the amount of time required for performing the narrowing-and-rounding arithmetic operation.
    Type: Application
    Filed: July 31, 2013
    Publication date: February 5, 2015
    Applicant: Arm Limited
    Inventors: Neil BURGESS, David Raymond LUTZ
  • Patent number: 8862652
    Abstract: A method is provided for deriving an RTL representation of a logic circuit performing a multiplication as a sum of addends operation with a desired rounding position. In this method, an error requirement to meet for the desired rounding position is derived. For each of the CCT and the VCT implementations, a number of columns to discard is derived, along with a constant to include in the sum of addends. For an LMS implementation, a number of columns to discard is derived. After discarding the columns and including the constants as appropriate, an RTL representation of the sum of addends operation is derived for each of the CCT, VCT and LMS implementations and a logic circuit is synthesized for each of these. The logic circuit which gives the best implementation is selected for manufacture.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: October 14, 2014
    Assignee: Imagination Technologies, Limited
    Inventor: Theo Alan Drane
  • Publication number: 20140181170
    Abstract: According to one embodiment, an arithmetic circuit includes an arithmetic unit, a rounding preprocessor, a register, and a rounding postprocessor. The arithmetic unit performs an arithmetic operation including addition and multiplication to generate a first value of (n+m) bits. The rounding preprocessor performs an OR operation on lower (m−k) bits of the first value to generate a second value of 1 bit. The register stores a third value of (n+k+1) bits obtained by concatenating upper (n+k) bits of the first value and the second value. The rounding postprocessor calculates a carry bit value of 1 bit from a most significant bit of the third value and lower (k+1) bits of the third value, and adds the carry bit value to upper n bits of the third value.
    Type: Application
    Filed: December 24, 2013
    Publication date: June 26, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Koichiro BAN
  • Patent number: 8639738
    Abstract: A low-error reduced-width multiplier is provided by the present invention. The multiplier can dynamically compensate the truncation error. The compensation value is derived by the dependencies among the multiplier partial products, and thus, can be analyzed according to the multiplication type and the multiplier input statistics.
    Type: Grant
    Filed: February 28, 2011
    Date of Patent: January 28, 2014
    Assignee: National Chiao Tung University
    Inventors: Yen-Chin Liao, Hsie-Chia Chang
  • Patent number: 8615543
    Abstract: Saturation and rounding capabilities are implemented in MAC blocks to provide rounded and saturated outputs of multipliers and of add-subtract-accumulate circuits implemented using DSP. These features support any suitable format of value representation, including the x.15 format. Circuitry within the multipliers and the add-subtract-accumulate circuits implement the rounding and saturation features of the present invention.
    Type: Grant
    Filed: June 22, 2011
    Date of Patent: December 24, 2013
    Assignee: Altera Corporation
    Inventors: Leon Zheng, Martin Langhammer, Steven Perry, Paul Metzgen, Nitin Prasad, William Hwang
  • Patent number: 8537047
    Abstract: The invention relates to the digital signal requantization, at a given quantization step size, of a first word received in a first period of time and encoded in a first number of bits, into a second word, with a quantization error equal to a third number. A sequence of third words is outputted, equal to the second word, with the sequence subdivided into a first group comprising a number of third words that is equal to the third number and a second group of third words. Before outputting them, the correction means adds a least significant bit to the third words of the first group and adds or subtracts least significant bits to or from the third words of the second group, such that the sum of the least significant bits added to and subtracted from the second group is zero.
    Type: Grant
    Filed: July 26, 2010
    Date of Patent: September 17, 2013
    Assignee: ST-Ericsson SA
    Inventor: Sébastien Cliquennois
  • Patent number: 8495124
    Abstract: A decimal multiplication mechanism for fixed and floating point computation in a computer having a coefficient mechanism without resulting leading zero detection (LZD) and process which assumes that the final product will be M+N digits in length and performs all calculations based on this assumption. Least significant digits that would be truncated are no longer stored, but retained as sticky information which is used to finalize the result product. Once the computation of the product is complete, a final check based on the examination of key bits observed during partial product accumulation is used to determine if the final product is truly M+N digits in length, or M+N−1 digits. If the latter is true, then corrective final product shifting is employed to obtain the proper result. This eliminates the need for dedicated leading zero detection hardware used to determine the number of significant digits in the final product.
    Type: Grant
    Filed: June 23, 2010
    Date of Patent: July 23, 2013
    Assignee: International Business Machines Corporation
    Inventors: Steven R. Carlough, Adam B. Collura, Michael Kroener, Silvia Melitta Mueller
  • Patent number: 8443027
    Abstract: A method, computer-readable medium, and an apparatus for implementing a floating point weighted average function. The method includes receiving an input containing 2^N input values, 2^N weights, and an opcode, where N is a positive integer number and each of the input values corresponds to one of the weights. Furthermore, the method also includes using existing dot product circuit function to generate 2^N addends by multiplying each of the input values with the corresponding weight. In addition, the method includes generating a sum value by adding the 2^N addends, where the sum value includes an exponent value, and generating the weighted average value based on the sum value by decreasing the exponent value by N. In this fashion, the same circuit area may be used to carry out both dot product and weighted average calculations, leading to greater circuit area savings and performance advantages.
    Type: Grant
    Filed: September 26, 2007
    Date of Patent: May 14, 2013
    Assignee: International Business Machines Corporation
    Inventors: Adam James Muff, Matthew Ray Tubbs
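The exponent trick in this abstract can be sketched in software with `math.ldexp`, which scales a float by a power of two. The sketch assumes the number of inputs is a power of two (2 to the N), so that decreasing the exponent by N is the same as dividing the sum by the input count; the function name `weighted_average_pow2` is hypothetical.

```python
import math


def weighted_average_pow2(values, weights):
    """Weighted average of 2**N values: dot product, then exponent minus N."""
    count = len(values)
    n = count.bit_length() - 1
    if count == 0 or count != 1 << n or len(weights) != count:
        raise ValueError("need 2**N values and the same number of weights")
    # Dot product step: multiply each value by its weight and sum the addends.
    total = math.fsum(v * w for v, w in zip(values, weights))
    # Decreasing the exponent by N divides the sum by 2**N.
    return math.ldexp(total, -n)


print(weighted_average_pow2([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]))
```

Note that, as in the abstract, the sum is divided by the number of inputs rather than by the sum of the weights.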
  • Patent number: 8407271
    Abstract: An apparatus and method for computing a rounded floating point number. A floating point unit (FPU) receives an instruction to round a floating point number to a nearest integral value and retrieves a binary source operand having an exponent of a fixed first number of bits and a mantissa of a fixed second number of bits. If the unbiased exponent value is greater than or equal to zero and less than the fixed second number, the FPU generates a mask having N consecutive ‘1’ bits beginning with the least significant bit and whose remaining bits have a value of ‘0’, where N is equal to the fixed second number minus the unbiased exponent value. The FPU computes a bitwise OR of the source operand with the mask, increments the result if the instruction is to round up, and computes a bitwise AND of the result with the inverse of the mask.
    Type: Grant
    Filed: August 28, 2009
    Date of Patent: March 26, 2013
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kevin Hurd, Daryl Lieu, Kelvin Goveas, Scott Hilker
  • Patent number: 8370226
    Abstract: A technique for performing a financial calculation is described. In this calculation technique, initial financial values are rounded based on a rounding criterion, and a total financial value is calculated by summing the rounded financial values. Based on the rounded financial values, associated rounding error values are computed. These rounding error values are then summed to determine a total error value. Moreover, the total error value is rounded based on the rounding criterion, and the resulting rounded total error value is used to correct a rounding error in the total financial value.
    Type: Grant
    Filed: April 19, 2010
    Date of Patent: February 5, 2013
    Assignee: Intuit Inc.
    Inventor: Patanjali Bhatt
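The steps in this abstract map directly onto decimal arithmetic. The sketch below assumes rounding to cents with round-half-up as the rounding criterion; the name `corrected_total` and both defaults are illustrative choices, not details from the patent.

```python
from decimal import Decimal, ROUND_HALF_UP


def corrected_total(values, quantum=Decimal("0.01"), rounding=ROUND_HALF_UP):
    """Sum rounded values, then correct the total with the rounded total error."""
    rounded = [v.quantize(quantum, rounding=rounding) for v in values]
    total = sum(rounded, Decimal(0))                   # total of rounded values
    errors = [v - r for v, r in zip(values, rounded)]  # per-value rounding error
    total_error = sum(errors, Decimal(0))
    # Round the accumulated error with the same criterion and apply it.
    return total + total_error.quantize(quantum, rounding=rounding)


shares = [Decimal("0.333")] * 3
print(corrected_total(shares))  # rounded shares alone would sum to 0.99
```

In this example each share rounds to 0.33, the accumulated error of 0.009 rounds to 0.01, and the corrected total matches the rounded exact sum.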
  • Publication number: 20130007085
    Abstract: A method is provided for deriving an RTL representation of a logic circuit performing a multiplication as a sum of addends operation with a desired rounding position. In this method, an error requirement to meet for the desired rounding position is derived. For each of the CCT and the VCT implementations, a number of columns to discard is derived, along with a constant to include in the sum of addends. For an LMS implementation, a number of columns to discard is derived. After discarding the columns and including the constants as appropriate, an RTL representation of the sum of addends operation is derived for each of the CCT, VCT and LMS implementations and a logic circuit is synthesized for each of these. The logic circuit which gives the best implementation is selected for manufacture.
    Type: Application
    Filed: June 29, 2012
    Publication date: January 3, 2013
    Applicant: Imagination Technologies, Ltd.
    Inventor: Theo Alan Drane
  • Publication number: 20120311008
    Abstract: Pricing values may be automatically computed by converting a base price with a predefined price ending based on predetermined rounding rules. A base price may be adjusted employing a rounding syntax and two pricing points, one for a rounding lower limit the other for rounding upper limit. Based on a comparison of a portion of the price computed with the rounding syntax, the adjusted (or sales) price may be computed reflecting a desired pricing strategy such as a psychological pricing strategy.
    Type: Application
    Filed: June 1, 2011
    Publication date: December 6, 2012
    Applicant: MICROSOFT CORPORATION
    Inventor: Jakob Hall
  • Patent number: 8266198
    Abstract: A specialized processing block for a programmable logic device includes circuitry for performing multiplications and sums thereof, as well as circuitry for rounding the result. The rounding circuitry can selectably perform round-to-nearest and round-to-nearest-even operations. In addition, the bit position at which rounding occurs is preferably selectable. The specialized processing block preferably also includes saturation circuitry to prevent overflows and underflows, and the bit position at which saturation occurs also preferably is selectable. The selectability of both the rounding and saturation positions provides control of the output data word width. The rounding and saturation circuitry may be selectably located in different positions based on timing needs. Similarly, rounding may be speeded up using a look-ahead mode in which both rounded and unrounded results are computed in parallel, with the rounding logic selecting between those results.
    Type: Grant
    Filed: June 5, 2006
    Date of Patent: September 11, 2012
    Assignee: Altera Corporation
    Inventors: Kwan Yee Martin Lee, Martin Langhammer, Yi-Wen Lin, Triet M. Nguyen
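Rounding at a selectable bit position followed by saturation at a selectable width can be sketched on signed integers as below. The names `round_at_bit` and `saturate` are hypothetical, and the tie behavior of the plain round-to-nearest mode (ties toward positive infinity) is an assumption, since the abstract does not specify it.

```python
def round_at_bit(x: int, pos: int, nearest_even: bool = False) -> int:
    """Drop the low `pos` bits of x, rounding to nearest."""
    if pos == 0:
        return x
    half = 1 << (pos - 1)
    result = (x + half) >> pos  # round to nearest, ties toward +infinity
    if nearest_even and (x & ((1 << pos) - 1)) == half:
        result &= ~1            # exact tie: clear bit 0 to land on the even value
    return result


def saturate(x: int, bits: int) -> int:
    """Clamp x to the range of a signed two's-complement word of `bits` bits."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(low, min(high, x))


print(round_at_bit(10, 2), round_at_bit(10, 2, nearest_even=True))  # 2.5
print(saturate(round_at_bit(1000, 2), 8))
```

Choosing `pos` and `bits` independently is what gives control over the output word width in this model.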
  • Publication number: 20120215825
    Abstract: Techniques are disclosed that involve the multiplication of values. For instance, a plurality of partial products may be calculated from a first operand and a second operand. This calculating bypasses calculating partial products having corresponding shift values that are less than a shift threshold value. These partial products are summed to produce a summed product. In turn, the summed product is truncated into a final product having a final precision. This final precision may be a shared precision employed by multiple processing units (e.g., algorithmic units in a graphics or display processing pipeline).
    Type: Application
    Filed: February 22, 2011
    Publication date: August 23, 2012
    Inventor: Abhay M. Mavalankar
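A shift-and-add model makes the bypassing step concrete. The sketch below assumes unsigned operands and simple one-bit-per-row partial products; the name `truncated_multiply` and its parameters are hypothetical.

```python
def truncated_multiply(a: int, b: int, threshold: int, drop_bits: int) -> int:
    """Shift-and-add product of unsigned a and b with low rows bypassed."""
    total = 0
    shift = 0
    while b >> shift:
        # Only rows whose shift value reaches the threshold are computed.
        if (b >> shift) & 1 and shift >= threshold:
            total += a << shift
        shift += 1
    return total >> drop_bits  # truncate to the final precision


print(truncated_multiply(13, 11, 0, 2))  # every partial product kept
print(truncated_multiply(13, 11, 1, 2))  # row with shift 0 bypassed
```

The bypassed rows only influence the low-order bits, so the error they introduce is bounded and mostly falls in the bits that truncation discards anyway.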
  • Patent number: 8214419
    Abstract: Methods and apparatus are provided for implementing an efficient saturating multiplier associated with addition and subtraction logic. The result of the multiplier is saturated before accumulating. The result of the multiplier can be stored in a result register in unsaturated form. The output of the result register can then be saturated and provided to addition and subtraction logic to allow efficient implementation of a saturating multiplier.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: July 3, 2012
    Assignee: Altera Corporation
    Inventor: Paul Metzgen
  • Patent number: 8095587
    Abstract: An arithmetic unit comprising: an encoding circuit arranged to receive first and second operands each having a bit length of m bits and to generate therefrom a number n of partial products of bit length of 2m bits or less; an addition circuit having 2m columns each having n inputs, wherein bits of said partial products are applied to said inputs for combining said partial products into a result leaving certain of said inputs unused; and a rounding bit generator connected to supply a rounding bit to at least one of said unused inputs in one of said columns at a bit position to cause said result to be rounded.
    Type: Grant
    Filed: June 30, 2006
    Date of Patent: January 10, 2012
    Assignee: STMicroelectronics (Research & Development) Ltd.
    Inventors: Tariq Kurd, Mark O. Homewood
  • Patent number: 8095586
    Abstract: Methods and arrangements to correct for double rounding errors when rounding floating point numbers to nearest away are described. Embodiments include transformations, code, state machines or other logic to perform a floating point operation on one or more floating point numbers of precision P1 in base b, producing positive result res0 of precision greater than precision P1; rounding positive result res0 to precision P1 to the nearest away, producing positive result res1; and rounding the result res1 to precision P2 to the nearest away, where P2 is narrower than P1, producing result res2. The embodiments may also include correcting res2 for double rounding errors. The correcting may include determining that res1 is midway between two consecutive floating point numbers of precision P2, the larger being res2, determining that rounding res0 to produce res1 involved rounding up, and decrementing the significand of res2 to obtain the corrected result.
    Type: Grant
    Filed: December 31, 2007
    Date of Patent: January 10, 2012
    Assignee: Intel Corporation
    Inventor: Marius Cornea-Hasegan
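The correction described above can be tried out on decimal values. This sketch simplifies the setting: precision is modeled as a number of decimal places rather than significand digits, inputs are assumed positive, and `ROUND_HALF_UP` from Python's `decimal` module stands in for round-to-nearest-away. The names `double_round_away` and `_quantum` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def _quantum(places: int) -> Decimal:
    return Decimal(1).scaleb(-places)  # e.g. places=2 -> 0.01


def double_round_away(res0: Decimal, p1: int, p2: int) -> Decimal:
    """Round positive res0 to p1 places, then p2 < p1 places, with correction."""
    res1 = res0.quantize(_quantum(p1), rounding=ROUND_HALF_UP)
    res2 = res1.quantize(_quantum(p2), rounding=ROUND_HALF_UP)
    # Double rounding error: res1 sits exactly midway between two p2-place
    # values (the larger being res2) only because the first rounding went up.
    midpoint = res2 - _quantum(p2) / 2
    if res1 == midpoint and res1 > res0:
        res2 -= _quantum(p2)  # decrement to the correctly rounded value
    return res2


print(double_round_away(Decimal("0.1249"), 3, 2))  # uncorrected would be 0.13
print(double_round_away(Decimal("0.125"), 3, 2))   # genuine tie, no correction
```

For 0.1249 the first rounding gives 0.125 and the second gives 0.13, while rounding directly to two places gives 0.12; the correction restores that value.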
  • Publication number: 20110320512
    Abstract: A decimal multiplication mechanism for fixed and floating point computation in a computer having a coefficient mechanism without resulting leading zero detection (LZD) and process which assumes that the final product will be M+N digits in length and performs all calculations based on this assumption. Least significant digits that would be truncated are no longer stored, but retained as sticky information which is used to finalize the result product. Once the computation of the product is complete, a final check based on the examination of key bits observed during partial product accumulation is used to determine if the final product is truly M+N digits in length, or M+N−1 digits. If the latter is true, then corrective final product shifting is employed to obtain the proper result. This eliminates the need for dedicated leading zero detection hardware used to determine the number of significant digits in the final product.
    Type: Application
    Filed: June 23, 2010
    Publication date: December 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Steven R. Carlough, Adam B. Collura, Michael Kroener, Silvia Melitta Mueller
  • Patent number: 8069199
    Abstract: Methods and arrangements to correct for double rounding errors when rounding floating point numbers to nearest even are described. Embodiments include transformations, code, state machines or other logic to perform a floating point operation on one or more floating point numbers of precision P1 in base b, producing positive result res0 of precision greater than precision P1; rounding positive result res0 to precision P1 to the nearest even, producing positive result res1; and rounding the result res1 to precision P2 to the nearest even, where P2 is narrower than P1, producing result res2. The embodiments may also include correcting res2 for double rounding errors. The correcting may include determining that res1 is midway between two consecutive floating point numbers of precision P1, the larger (smaller) being res2, determining that rounding res0 to produce res1 involved rounding up (down), and decrementing (incrementing) the significand of res2 to obtain the corrected result res2′.
    Type: Grant
    Filed: December 31, 2007
    Date of Patent: November 29, 2011
    Assignee: Intel Corporation
    Inventor: Marius Cornea-Hasegan
  • Publication number: 20110270901
    Abstract: An FFT algorithm that splits a large bit width waveform into two parts, making it possible to conduct the FFT with much lower logic resource consumption is disclosed. The waveform is split into its most significant bits and its least significant bits through division in the form of a bit shift. Each partial signal is then put through an FFT algorithm. The MSB FFT output is then right bit shifted. The two partial FFTs are summed to create a single output that is largely equivalent to an FFT of the original waveform. Rounding distortion is reduced by overlapping the MSB and LSB partial signals.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: SRC, INC.
    Inventors: Kristen L. Dobart, Michael T. Addario
  • Publication number: 20110225224
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Application
    Filed: May 26, 2011
    Publication date: September 15, 2011
    Applicant: ALTERA CORPORATION
    Inventors: Nikos P. Pitsianis, Gerald G. Pechanek, Ricardo E. Rodriguez
  • Patent number: 8005884
    Abstract: A system and method for efficient floating-point rounding in computer systems. A computer system may include at least one floating-point unit for floating-point arithmetic operations such as addition, subtraction, multiplication, division and square root. For the division operation, the constraints for the remainder may be relaxed in order to reduce the area for look-up tables. An extra internal precision bit may not be used. Only one quotient may be calculated, rather than two, further reducing needed hardware to perform the rounding. Comparison logic may be required that may add a couple of cycles to the rounding computation beyond the calculation of the remainder. However, the extra latency is much smaller than a second floating-point multiply accumulate latency.
    Type: Grant
    Filed: October 9, 2007
    Date of Patent: August 23, 2011
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexandru Fit-Florea, Debjit Das-Sarma
  • Patent number: 8005885
    Abstract: A processor, an instruction set architecture, an instruction, a computer readable medium and a method for implementing optimal per-instruction encoding of rounding control to emulate directed rounding are disclosed. In one embodiment, an apparatus designed to perform directed rounding includes an instruction decoder configured to decode an instruction, which includes a rounding control information to calculate a result boundary. The apparatus also includes a directed rounding emulator configured to adjust the result boundary to form an adjusted result boundary as a function of the rounding control bit. The adjusted result boundary establishes an endpoint for an interval that includes a result. In one embodiment, the directed round emulator is further configured to emulate a round-to-negative infinity rounding mode and a round-to-positive infinity rounding mode based on at least the single rounding control bit.
    Type: Grant
    Filed: October 14, 2005
    Date of Patent: August 23, 2011
    Assignee: Nvidia Corporation
    Inventor: Nicholas Patrick Wilt
  • Publication number: 20110185000
    Abstract: A low-error reduced-width multiplier is provided by the present invention. The multiplier can dynamically compensate the truncation error. The compensation value is derived by the dependencies among the multiplier partial products, and thus, can be analyzed according to the multiplication type and the multiplier input statistics.
    Type: Application
    Filed: February 28, 2011
    Publication date: July 28, 2011
    Applicant: National Chiao Tung University
    Inventors: Yen-Chin Liao, Hsie-Chia Chang
  • Patent number: 7949701
    Abstract: A method and system to perform shifting and rounding operations within a microprocessor, such as, for example, a digital signal processor, during execution of a single instruction are described. An instruction to shift and round data within a source register unit of a register file structure is received within a processing unit. The instruction includes a shifting bit value indicating the bit amount for a right shift operation and is subsequently executed to shift data within the source register unit to the right by an encoded bit value, calculated by subtracting a single bit from the shifting bit value contained within the instruction. A predetermined bit extension is further inserted within the vacated bit positions adjacent to the shifted data. Subsequently, an addition operation is performed on the shifted data and a unitary integer value is added to the shifted data to obtain resulting data.
    Type: Grant
    Filed: August 2, 2006
    Date of Patent: May 24, 2011
    Assignee: QUALCOMM Incorporated
    Inventors: Lucian Codrescu, Erich Plondke, Mao Zeng
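The shift-then-add sequence in the abstract above can be illustrated with plain integer arithmetic. This is a hedged sketch: the function name `shift_round` is hypothetical, Python's `>>` stands in for the sign-extending right shift, and the final one-bit shift is an assumption that the abstract does not state explicitly.

```python
def shift_round(value: int, shift: int) -> int:
    """Round value / 2**shift to the nearest integer, ties toward +infinity.

    Assumes shift >= 1.
    """
    # Shift by one bit less than requested, keeping one extra fraction bit.
    partial = value >> (shift - 1)  # arithmetic shift: sign bits fill the vacated positions
    # Add a unit at the extra bit, then drop that bit (assumed final step).
    return (partial + 1) >> 1


print(shift_round(5, 1))   # 3   (2.5 rounds up)
print(shift_round(7, 2))   # 2   (1.75 rounds to 2)
print(shift_round(-5, 1))  # -2  (-2.5 rounds toward +infinity)
```

The result matches the more common form `(value + (1 << (shift - 1))) >> shift`; the variant here needs only an increment by one after the first shift.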
  • Patent number: 7948267
    Abstract: A specialized processing block for a configurable integrated circuit device includes circuitry for performing multiplications and sums thereof, as well as circuitry for rounding the result. The rounding circuitry reuses an adder that is also available, in other configurations, for accumulation of the result. Rounding is performed by adding a constant to the result and then truncating at the bit position at which rounding is desired. The constant may be entered by a user, or may be derived based on a desired rounding method from mask data entered by the user to identify the rounding bit position.
    Type: Grant
    Filed: February 9, 2010
    Date of Patent: May 24, 2011
    Assignee: Altera Corporation
    Inventors: Volker Mauer, Martin Langhammer
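Rounding by adding a constant and then truncating, as the abstract above describes, can be sketched in a few lines. The function name `round_at_bit` and the mask convention (ones in the contiguous low bit positions to be discarded) are assumptions made for illustration; the abstract does not specify how the mask is encoded.

```python
def round_at_bit(value: int, mask: int) -> int:
    """Round value at the bit position just above the masked-off low bits.

    mask is assumed to be of the form 2**p - 1, for example 0b111.
    """
    # Derive the rounding constant from the mask: half the weight of the lowest kept bit.
    constant = (mask + 1) >> 1
    # Add the constant, then truncate by clearing the discarded bits.
    return (value + constant) & ~mask


print(round_at_bit(22, 0b111))  # 24  (22/8 = 2.75 rounds to 3, i.e. 24)
print(round_at_bit(19, 0b111))  # 16  (19/8 = 2.375 rounds to 2, i.e. 16)
```

The single addition is why an accumulator adder can be reused for rounding: the constant is simply one more addend.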
  • Patent number: 7912888
    Abstract: A computing device has a rounding processor that inputs therein a set of plural (K) input data IN1 through INK comprising z bits. The rounding processor selects an ensured bit field depending upon the state of usage of each of specific areas A of upper z/2 bits of the 32-bit input data IN1 through INK and rounds the corresponding input data to z/2. As a result of rounding processing, shift information SHIFT of lower (16−n) bits of each discarded non-specific area B is stored in a memory area. D10-1 through D10-K of the rounded respective 16 bits are subjected to multiplication by a multiplier. A digit adjuster shifts multiplication results to the left on the basis of the shift information SHIFT respectively stored in the memory areas to adjust digits.
    Type: Grant
    Filed: March 15, 2007
    Date of Patent: March 22, 2011
    Assignee: Oki Semiconductor Co., Ltd.
    Inventor: Wataru Uchida
  • Patent number: 7853636
    Abstract: An integrated circuit (IC) for convergent rounding including: an adder circuit configured to produce a summation; a comparison circuit configured to bitwise compare the summation with an input pattern, bitwise mask the comparison using a mask, and combine the masked comparison to produce a comparison bit; and rounding circuitry for rounding the summation based at least in part on the comparison bit.
    Type: Grant
    Filed: May 12, 2006
    Date of Patent: December 14, 2010
    Assignee: Xilinx, Inc.
    Inventors: Bernard J. New, Jennifer Wong, James M. Simkins, Alvin Y. Ching, John M. Thendean, Anna Wing Wah Wong, Vasisht Mantra Vadi
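Convergent rounding (round half to even) with a masked pattern comparison, as outlined in the abstract above, can be sketched as follows. This is an illustrative model rather than the claimed circuit: the name `convergent_round`, the all-zero pattern, and the choice to clear the result's least significant bit on a match are assumptions.

```python
def convergent_round(value: int, drop_bits: int) -> int:
    """Round value / 2**drop_bits to nearest, ties to even. Assumes drop_bits >= 1."""
    half = 1 << (drop_bits - 1)
    mask = (1 << drop_bits) - 1  # selects the bits that will be discarded
    pattern = 0                  # assumed input pattern: all zeros

    total = value + half         # adder stage: add half an output LSB

    # Bitwise compare with the pattern, mask the comparison, reduce to one bit.
    # A match means the discarded bits of the sum are all zero, i.e. an exact tie.
    match = ((total ^ pattern) & mask) == 0

    result = total >> drop_bits
    if match:
        result &= ~1             # on a tie, force the result to the even neighbor
    return result


print(convergent_round(2, 2))  # 0  (0.5 ties to even 0)
print(convergent_round(6, 2))  # 2  (1.5 ties to even 2)
print(convergent_round(7, 2))  # 2  (1.75 rounds to 2)
```

For small integers the output agrees with Python's built-in `round(value / 2**drop_bits)`, which also rounds ties to even.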
  • Publication number: 20100306292
    Abstract: A processor may have at least one multiplier unit which can be controlled to operate in a signed, an unsigned, or a mixed sign mode; a multiplier unit mode decoder coupled with the multiplier unit which receives location information of first and second operands, wherein the multiplier mode decoder controls the multiplier unit when in the mixed sign mode depending on the location information to operate in a signed mode, an unsigned mode, or a combined signed/unsigned mode.
    Type: Application
    Filed: May 7, 2010
    Publication date: December 2, 2010
    Inventors: Michael I. Catherwood, Settu Duraisamy
  • Patent number: 7822799
    Abstract: Adder/rounder circuitry for use in a programmable logic device computes a rounded sum quickly, and ideally within one clock cycle. The rounding position is selectable within a range of bit positions. In an input stage, for each bit position in that range, bits from both addends and a rounding bit are processed, while for each bit position outside that range only bits from both addends are processed. The input stage processing aligns its output in a common format for bits within and outside the range. The input processing may include 3:2 compression for bit positions within the range and 2:2 compression for bit positions outside the range, so that further processing is performed for all bit positions on a sum vector and a carry vector. Computation of the sum proceeds substantially simultaneously with and without the rounding input, and rounding logic makes a selection later in the computation.
    Type: Grant
    Filed: June 26, 2006
    Date of Patent: October 26, 2010
    Assignee: Altera Corporation
    Inventors: Martin Langhammer, Triet M. Nguyen, Yi-Wen Lin
  • Publication number: 20100260429
    Abstract: A signal processing apparatus according to an embodiment of the present invention includes: a compression processing unit that performs compression processing on n-bit data; a bit-number conversion unit that converts m-bit input image data into n-bit data (where n<m) by performing round-up or round-down processes on the lower (m−n) bits of the m-bit input image data, and feeds the obtained n-bit data to the compression processing unit; and a conversion processing control unit that selects either one of the round-up process and the round-down process to be performed on each datum of the n-bit data in accordance with a predefined rule on the basis of the position of a frame to which the datum belongs and the position in the frame at which the datum is located, and instructs the bit-number conversion unit to perform the selected round-up process or round-down process.
    Type: Application
    Filed: March 31, 2010
    Publication date: October 14, 2010
    Applicant: Sony Corporation
    Inventor: Tsutomu Ichinose
  • Patent number: 7765221
    Abstract: Methods and apparatus, including computer systems and program products, for normalizing computer-represented collections of objects. A first minimum value can be normalized based on a second minimum value of a universal set object that corresponds to the first set object. The second minimum value is both a minimum value supported by a data type (e.g., 1-byte integer) and a minimum value defined to be in the universal set object (e.g., 0 for a universal set of all natural numbers). Similarly, a first maximum value can be normalized based on a second maximum value of the universal set object where the second maximum value is both a maximum value supported by a data type and in the universal set object. Intervals can be normalized, which can involve replacing half-open intervals with equivalent half-closed intervals. Also, a consecutively ordered, uninterrupted, sequence of values of a set object can be normalized.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: July 27, 2010
    Assignee: SAP AG
    Inventor: Peter K. Zimmerer
  • Patent number: 7728624
    Abstract: An integrated circuit comprising at least one group having multiple arithmetic/logic units arranged in sub-groups. In the sub-groups at inputs of multiple arithmetic/logic units, in each case a single one of the first selection units is connected on the input side, wherein no other selection unit is connected directly on the input side of this selection unit. The first selection units are coupled to each other such that a horizontal and/or vertical logical interconnection of the arithmetic/logic units within a group, and/or a logical interconnection of arithmetic/logic units to an upstream group can be implemented. Second selection units are in each case connected on the output side of a column of arithmetic/logic units. The second selection units of a group are connected on the output side to one bus each, and a microprocessor is coupled to this bus.
    Type: Grant
    Filed: October 10, 2006
    Date of Patent: June 1, 2010
    Assignee: Micronas GmbH
    Inventor: Gert Umbach
  • Patent number: RE43145
    Abstract: A processor which executes positive conversion processing, which converts coded data into uncoded data, and saturation calculation processing, which rounds a value to an appropriate number of bits, at high speed. When a positive conversion saturation calculation instruction “MCSST D1” is decoded, the sum-product result register 6 outputs its held value to the path P1. The comparator 22 compares the magnitude of the held value of the sum-product result register 6 with the coded 32-bit integer “0x0000_00FF”. The polarity judging unit 23 judges whether the eighth bit of the value held by the sum-product result register 6 is “ON”. The multiplexer 24 outputs one of the maximum value “0x0000_00FF” generated by the constant generator 21, the zero value “0x0000_0000” generated by the zero generator 25, and the held value of the sum-product result register 6 to the data bus 18.
    Type: Grant
    Filed: December 21, 2004
    Date of Patent: January 24, 2012
    Assignee: Panasonic Corporation
    Inventors: Toru Morikawa, Nobuo Higaki, Akira Miyoshi, Keizo Sumida
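The three-way selection in the abstract above (maximum value, zero, or the held value) amounts to clamping a signed result into the unsigned 8-bit range. A minimal sketch, with the hypothetical name `saturate_to_u8` standing in for the instruction's behavior and ordinary Python integers standing in for the 32-bit register:

```python
def saturate_to_u8(acc: int) -> int:
    """Clamp a signed accumulator value into the range 0x00..0xFF."""
    if acc < 0:
        return 0x00  # negative input: select the zero generator's output
    if acc > 0xFF:
        return 0xFF  # too large: select the constant generator's maximum
    return acc       # already in range: pass the held value through


print(saturate_to_u8(-5))   # 0
print(saturate_to_u8(300))  # 255
print(saturate_to_u8(128))  # 128
```

Doing both checks in one step is what lets the sign conversion and the saturation share a single instruction.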
  • Patent number: RE43729
    Abstract: A processor which executes positive conversion processing, which converts coded data into uncoded data, and saturation calculation processing, which rounds a value to an appropriate number of bits, at high speed. When a positive conversion saturation calculation instruction “MCSST D1” is decoded, the sum-product result register 6 outputs its held value to the path P1. The comparator 22 compares the magnitude of the held value of the sum-product result register 6 with the coded 32-bit integer “0x0000_00FF”. The polarity judging unit 23 judges whether the eighth bit of the value held by the sum-product result register 6 is “ON”. The multiplexer 24 outputs one of the maximum value “0x0000_00FF” generated by the constant generator 21, the zero value “0x0000_0000” generated by the zero generator 25, and the held value of the sum-product result register 6 to the data bus 18.
    Type: Grant
    Filed: April 22, 2011
    Date of Patent: October 9, 2012
    Assignee: Panasonic Corporation
    Inventors: Toru Morikawa, Nobuo Higaki, Akira Miyoshi, Keizo Sumida