Round Off Or Truncation Patents (Class 708/497)

Exact versus inexact decimal floating-point numbers and computation system

Patent number: 12277404

Abstract: This disclosure represents an improved computer system and process to avoid the consequences of improper conversion of numbers and of rounding errors. This process makes the distinction between exact and inexact decimal floating-point numbers. If the result of a sequence of operation is exact, the user can trust that every decimal digit in the computed result is correct. On the other hand, if the input operands are inexact or the result cannot be computed exactly, a loss of significant digits occurs, and the user is warned of the loss. A novel representation is used for the inexact computed values. An estimate of the absolute error is also part of the representation.

Type: Grant

Filed: November 10, 2021

Date of Patent: April 15, 2025

Assignee: King Fahd University of Petroleum and Minerals

Inventor: Muhamed F. Mudawar
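As a rough software analogue of the exact-versus-inexact distinction in patent 12277404, Python's standard `decimal` module raises an `Inexact` flag whenever a result had to be rounded. The sketch below is not the patented representation (it carries no error estimate); the helper name `divide_with_exactness` and the default precision of 8 digits are assumptions made for illustration.

```python
from decimal import Decimal, Inexact, localcontext


def divide_with_exactness(a: str, b: str, digits: int = 8):
    """Return (quotient, is_exact) for a / b at the given decimal precision.

    Hypothetical helper: it only reports whether rounding occurred,
    using the Inexact flag of the decimal context.
    """
    with localcontext() as ctx:
        ctx.prec = digits
        ctx.clear_flags()
        quotient = Decimal(a) / Decimal(b)
        # Inexact is set when non-zero digits were discarded by rounding.
        return quotient, not ctx.flags[Inexact]
```

With this helper, `divide_with_exactness("1", "4")` reports an exact result, while `divide_with_exactness("1", "3")` reports that digits were lost.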
Rounding circuitry for floating-point mantissas

Patent number: 12020000

Abstract: Systems and methods include arithmetic circuitry that generates a floating-point mantissa and includes a propagation network that calculates the floating-point mantissa based on input bits. The systems and methods also include rounding circuitry that rounds the floating-point mantissa. The rounding circuitry includes a multiplexer at a rounding location for the floating-point mantissa that selectively inputs a first input bit of the input bits or a rounding bit. The rounding circuitry also includes an OR gate that ORs a second input bit of the input bits with the rounding bit. Moreover, the second input bit is a less significant bit than the first input bit.

Type: Grant

Filed: December 24, 2020

Date of Patent: June 25, 2024

Assignee: Intel Corporation

Inventors: Martin Langhammer, Alexander Heinecke
Processing device with vector transformation execution

Patent number: 11768685

Abstract: An integrated circuit, comprising an instruction pipeline that includes instruction fetch phase circuitry, instruction decode phase circuitry, and instruction execution circuitry. The instruction execution circuitry includes transformation circuitry for receiving an interleaved dual vector operand as an input and for outputting a first natural order vector including a first set of data values from the interleaved dual vector operand and a second natural order vector including a second set of data values from the interleaved dual vector operand.

Type: Grant

Filed: May 5, 2022

Date of Patent: September 26, 2023

Assignee: Texas Instruments Incorporated

Inventors: Mujibur Rahman, Timothy David Anderson, Joseph Zbiciak
Hardware module for converting numbers

Patent number: 11449309

Abstract: A hardware module comprising circuitry configured to: store a sequence of n bits in a register of the hardware module; generate a signed integer comprising a magnitude component and a sign bit by: if the most significant bit of the sequence of n bits is equal to one: set each of the n−1 of the most significant bits of the magnitude component to be equal to the corresponding bit of the n−1 least significant bits of the sequence of n bits; and set the sign bit to be zero; if the most significant bit of the sequence of n bits is equal to zero: set each of the n−1 of the most significant bits of the magnitude component to be equal to the inverse of the corresponding bit of the n−1 least significant bits of the sequence of n bits; and set the sign bit to be one.

Type: Grant

Filed: June 21, 2019

Date of Patent: September 20, 2022

Assignee: GRAPHCORE LIMITED

Inventors: Stephen Felix, Mrudula Gore
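The bit manipulation described in the abstract of patent 11449309 can be sketched in a few lines of Python. This is a minimal illustration that assumes the magnitude component is exactly n−1 bits wide (the abstract allows it to be wider); the function name `to_sign_magnitude` is hypothetical.

```python
def to_sign_magnitude(bits: int, n: int):
    """Map an n-bit sequence to (sign_bit, magnitude) as the abstract describes.

    Assumption: the magnitude has exactly n - 1 bits.
    """
    low_mask = (1 << (n - 1)) - 1
    low = bits & low_mask                 # the n-1 least significant bits
    msb = (bits >> (n - 1)) & 1
    if msb == 1:
        return 0, low                     # copy the low bits, sign bit zero
    return 1, (~low) & low_mask           # invert the low bits, sign bit one
```

For n = 4, the input `0b1011` yields sign 0 with magnitude 3, and the input `0b0011` yields sign 1 with magnitude 4.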
Floating-point unit stochastic rounding for accelerated deep learning

Patent number: 11449574

Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element has a respective floating-point unit enabled to perform stochastic rounding, thus in some circumstances enabling reducing systematic bias in long dependency chains of floating-point computations. The long dependency chains of floating-point computations are performed, e.g., to train a neural network or to perform inference with respect to a trained neural network.

Type: Grant

Filed: April 13, 2018

Date of Patent: September 20, 2022

Assignee: Cerebras Systems Inc.

Inventors: Sean Lie, Michael Edwin James, Michael Morrison, Gary R. Lauterbach, Srikanth Arekapudi
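Stochastic rounding, as named in the abstract of patent 11449574, rounds up with a probability equal to the discarded fraction, so that the expected value of the rounded result equals the input. The following is a generic software sketch of that idea for rounding to an integer; it says nothing about how the patented floating-point unit is built, and the name `stochastic_round` is an assumption.

```python
import math
import random


def stochastic_round(x: float, rng: random.Random) -> int:
    """Round x down or up, choosing 'up' with probability equal to its fraction."""
    lower = math.floor(x)
    fraction = x - lower                  # in [0, 1)
    # rng.random() is uniform on [0, 1), so exact integers never round up.
    return lower + (1 if rng.random() < fraction else 0)
```

Averaged over many calls, `stochastic_round(2.25, rng)` tends toward 2.25, whereas round-to-nearest would always return 2 and accumulate a systematic bias.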
Processing-in-memory (PIM) device

Patent number: 11422803

Abstract: A processing-in-memory (PIM) device includes a data storage region and a multiplication/accumulation (MAC) operator. The data storage region is configured to store first data and second data. The MAC operator is configured to perform a MAC arithmetic operation of the first data and the second data. The MAC operator includes a MAC circuit configured to perform the MAC arithmetic operation to output MAC result data and a data output unit configured to feedback bias data to the MAC circuit prior to the MAC arithmetic operation.

Type: Grant

Filed: January 8, 2021

Date of Patent: August 23, 2022

Assignee: SK hynix Inc.

Inventor: Choung Ki Song
Adaptive pacing setting for workload execution

Patent number: 11392418

Abstract: A computer system may initialize one or more workloads. The computer system may operate in a boost mode and a regular mode. The boost mode may include an adjustment of a pacing setting and an adjustment of group availability targets for executing the one or more workloads. The computer system may identify that the boost mode is enabled during a system start of the computer system. The computer system may identify that the pacing setting is operating in the regular mode. The computer system may dynamically increase the pacing setting. The increase of the pacing setting may enable an increased processor utilization of the computer system by the one or more workloads. The increased processor utilization may generate a concurrent processing of the one or more workloads. The computer system may determine an end of the boost mode and reset the pacing setting.

Type: Grant

Filed: February 21, 2020

Date of Patent: July 19, 2022

Assignee: International Business Machines Corporation

Inventors: Juergen Holtz, Qais Noorshams
Matrix processing instruction with optional up/down sampling of matrix

Patent number: 11372644

Abstract: A processor system comprises a shared memory and a processing element. The processing element includes a matrix processor unit and is in communication with the shared memory. The processing element is configured to receive a processor instruction specifying a data matrix and a matrix manipulation operation. A manipulation matrix based on the processor instruction is identified. The data matrix and the manipulation matrix are loaded into the matrix processor unit and a matrix operation is performed to determine a result matrix. The result matrix is outputted to a destination location.

Type: Grant

Filed: December 9, 2019

Date of Patent: June 28, 2022

Assignee: Meta Platforms, Inc.

Inventors: Thomas Mark Ulrich, Krishnakumar Narayanan Nair, Yuchen Hao
Optimized compute hardware for machine learning operations

Patent number: 11334796

Abstract: A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.

Type: Grant

Filed: August 3, 2020

Date of Patent: May 17, 2022

Assignee: Intel Corporation

Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
Bit string accumulation in memory array periphery

Patent number: 10942890

Abstract: Systems, apparatuses, and methods related to bit string accumulation in memory array periphery are described. Control circuitry (e.g., a processing device) may be utilized to control performance of operations using bit strings within a memory device. Results of the operations may be accumulated in circuitry peripheral to a memory array of the memory device. For instance, a method for bit string accumulation in memory array periphery can include retrieving a bit string stored in a data structure of a memory array. The bit string can represent a result of performance of an arithmetic operation or a logical operation. The method can further include storing the bit string in a plurality of sense amplifiers located in a periphery of the memory array and using the bit string as an operand in performance of at least a portion of a recursive operation.

Type: Grant

Filed: June 4, 2019

Date of Patent: March 9, 2021

Assignee: Micron Technology, Inc.

Inventor: Vijay S. Ramesh
Bit string accumulation in memory array periphery

Patent number: 10942889

Abstract: Bit string accumulation in a memory array periphery is described. Control circuitry (e.g., a processing device) may be utilized to control performance of operations using bit strings within a memory device. Results of the operations may be accumulated in circuitry peripheral to a memory array of the memory device. For instance, a plurality of sense amplifiers may be coupled to a memory array and a processing device. A quantity of sense amplifiers among the plurality of sense amplifiers can be the same as a quantity of rows or columns of the array. The processing device may be configured to cause performance of a recursive operation using one or more bit strings that are formatted according to a Type III universal number format or a posit format. The processing device may further be configured to cause resultant bit strings representing results of iterations of the recursive operation to be accumulated in the plurality of sense amplifiers.

Type: Grant

Filed: June 4, 2019

Date of Patent: March 9, 2021

Assignee: Micron Technology, Inc.

Inventor: Vijay S. Ramesh
Efficient and scalable computations with sparse tensors

Patent number: 10936569

Abstract: In a system for storing in memory a tensor that includes at least three modes, elements of the tensor are stored in a mode-based order for improving locality of references when the elements are accessed during an operation on the tensor. To facilitate efficient data reuse in a tensor transform that includes several iterations, on a tensor that includes at least three modes, a system performs a first iteration that includes a first operation on the tensor to obtain a first intermediate result, and the first intermediate result includes a first intermediate-tensor. The first intermediate result is stored in memory, and a second iteration is performed in which a second operation on the first intermediate result accessed from the memory is performed, so as to avoid a third operation, that would be required if the first intermediate result were not accessed from the memory.

Type: Grant

Filed: May 20, 2013

Date of Patent: March 2, 2021

Assignee: Reservoir Labs, Inc.

Inventors: Muthu Manikandan Baskaran, Richard A. Lethin, Benoit J. Meister, Nicolas T. Vasilache
Systems and methods for measuring error in terms of unit in last place

Patent number: 10936769

Abstract: Systems and methods evaluate simulation models and measure floating point arithmetic errors in terms of Unit in Last Place (ULP). The simulation model may include model elements that perform numerical computations using Native Floating Point (NFP) arithmetic. The model elements may be arranged to implement a procedure. A data store may include local ULP errors predetermined for the model elements. The systems and methods may retrieve the local ULP errors for the model elements included in the model, and may apply a rules-based analysis to compute an overall ULP error of the simulation model. The systems and methods may present the overall ULP computed for the model. The systems and methods may also present intermediate ULP errors determined for portions of the simulation model. Changes may be made to the model to reduce the overall ULP error.

Type: Grant

Filed: May 10, 2019

Date of Patent: March 2, 2021

Assignee: The MathWorks, Inc.

Inventors: Kiran K. Kintali, Shomit Dutta, E. Mehran Mestchian, Pieter J. Mosterman
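To make the unit used in patent 10936769 concrete, the sketch below measures the error of a single computed value in ULPs of a reference value. It covers only the basic measure, not the rules-based analysis across model elements that the abstract describes. The function name `ulp_error` is hypothetical, and `math.ulp` requires Python 3.9 or later.

```python
import math


def ulp_error(computed: float, reference: float) -> float:
    """Return |computed - reference| expressed in units in the last place of reference."""
    return abs(computed - reference) / math.ulp(reference)
```

For example, `ulp_error(0.1 + 0.2, 0.3)` evaluates to 1.0, because the binary double nearest to the sum is the neighbor of the double nearest to 0.3.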
Round for reround mode in a decimal floating point instruction

Patent number: 10782932

Abstract: A round-for-reround mode (preferably in a BID encoded Decimal format) of a floating point instruction prepares a result for later rounding to a variable number of digits by detecting that the least significant digit may be a 0, and if so changing it to 1 when the trailing digits are not all 0. A subsequent reround instruction is then able to round the result to any number of digits at least 2 fewer than the number of digits of the result. An optional embodiment saves a tag indicating the fact that the low order digit of the result is 0 or 5 if the trailing bits are non-zero in a tag field rather than modify the result. Another optional embodiment also saves a half-way-and-above indicator when the trailing digits represent a decimal with a most significant digit having a value of 5. An optional subsequent reround instruction is able to round the result to any number of digits fewer or equal to the number of digits of the result using the saved tags.

Type: Grant

Filed: August 25, 2019

Date of Patent: September 22, 2020

Assignee: International Business Machines Corporation

Inventors: Michael F. Cowlishaw, Eric M. Schwarz, Ronald M. Smith, Sr., Phil C. Yeh
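The purpose of a round-for-reround step, as in patent 10782932, is to avoid double-rounding errors when an intermediate result is later rounded to fewer digits. Python's `decimal` module offers a comparable software mode, `ROUND_05UP`, which truncates and then bumps a final digit of 0 or 5 when non-zero digits were discarded. The sketch below uses that mode as a stand-in; it is not the patented instruction, and the helper name `round_twice` is an assumption.

```python
from decimal import Context, Decimal, ROUND_05UP, ROUND_HALF_EVEN


def round_twice(value: Decimal, mid_digits: int, final_digits: int, mid_mode: str) -> Decimal:
    """Round to mid_digits with mid_mode, then to final_digits with half-even."""
    intermediate = Context(prec=mid_digits, rounding=mid_mode).plus(value)
    return Context(prec=final_digits, rounding=ROUND_HALF_EVEN).plus(intermediate)


x = Decimal("1.2450001")
direct = Context(prec=3, rounding=ROUND_HALF_EVEN).plus(x)   # single rounding: 1.25
naive = round_twice(x, 5, 3, ROUND_HALF_EVEN)                # 1.2450 then 1.24
prepared = round_twice(x, 5, 3, ROUND_05UP)                  # 1.2451 then 1.25
```

The naive double rounding loses the information that the value lay just above the halfway point, while the prepared intermediate preserves it and matches the single direct rounding.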
Leading zero anticipation

Patent number: 10606557

Abstract: A data processing apparatus is provided. Intermediate value generation circuitry generates an intermediate value from a first floating point number and a second floating point number. The intermediate value includes a number of leading 0s indicative of a prediction of a number of leading 0s in a difference between absolute values of the first floating point number and the second floating point number. The prediction differs by at most one from the number of leading 0s in the difference between absolute values of the first floating point number and the second floating point number. Count circuitry counts the number of leading 0s in said intermediate value and mask generation circuitry produces one or more masks using the intermediate value. The mask generation circuitry produces the one or more masks at the same time or before the count circuitry counts the number of leading 0s in the intermediate value.

Type: Grant

Filed: December 6, 2016

Date of Patent: March 31, 2020

Assignee: ARM Limited

Inventor: David Raymond Lutz
Testing insecure computing environments using random data sets generated from characterizations of real data sets

Patent number: 10592672

Abstract: The disclosed embodiments provide a system that facilitates testing of an insecure computing environment. During operation, the system obtains a real data set comprising a set of data strings. Next, the system determines a set of frequency distributions associated with the set of data strings. The system then generates a test data set from the real data set, wherein the test data set comprises a set of random data strings that conforms to the set of frequency distributions. Finally, the system tests the insecure computing environment using the test data set.

Type: Grant

Filed: December 21, 2016

Date of Patent: March 17, 2020

Assignee: INTUIT INC.

Inventor: Colin R. Dillard
Efficient instruction processing for sparse data

Patent number: 10592252

Abstract: Efficient instruction processing for sparse data includes extensions to a processor pipeline to identify zero-optimizable instructions that include at least one zero input operand, and bypass the execute stage of the processor pipeline, determining the result of the operation without executing the instruction. When possible, the extensions also bypass the writeback stage of the processor pipeline.

Type: Grant

Filed: December 31, 2015

Date of Patent: March 17, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: Trishul A. Chilimbi, Olatunji Ruwase, Vivek Seshadri
Decimal and binary floating point arithmetic calculations

Patent number: 10416962

Abstract: Logic is provided for performing decimal and binary floating point arithmetic calculations on first and second operands. The method includes: receiving the first and second operands in packed format; unpacking the first and second operands; swapping the first operand to a fourth operand and the second operand to a third operand, if an exponent of the first operand is less than an exponent of the second operand, otherwise storing the first operand to the third operand and the second operand to the fourth operand; aligning the third operand and the fourth operands based on the exponent difference of the third and fourth operand and a number of leading zeroes of the third operand; performing an add/subtract operation on the aligned third and fourth operands with normalizing and rounding between the operands; and packing the result obtained from the add/subtract.

Type: Grant

Filed: October 2, 2015

Date of Patent: September 17, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Steven R. Carlough, Juergen Haess, Michael Klein, Klaus M. Kroener, Petra Leber, Silvia M. Mueller, Kerstin Schelm
Read and set floating point control register instruction

Patent number: 10318240

Abstract: Setting or updating of floating point controls is managed. Floating point controls include controls used for floating point operations, such as rounding mode and/or other controls. Further, floating point controls include status associated with floating point operations, such as floating point exceptions and/or others. The management of the floating point controls includes efficiently updating the controls, while reducing costs associated therewith.

Type: Grant

Filed: November 14, 2017

Date of Patent: June 11, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Read and set floating point control register instruction

Patent number: 10310814

Abstract: Setting or updating of floating point controls is managed. Floating point controls include controls used for floating point operations, such as rounding mode and/or other controls. Further, floating point controls include status associated with floating point operations, such as floating point exceptions and/or others. The management of the floating point controls includes efficiently updating the controls, while reducing costs associated therewith.

Type: Grant

Filed: June 23, 2017

Date of Patent: June 4, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Floating point rounding processors, methods, systems, and instructions

Patent number: 10209986

Abstract: A method of an aspect includes receiving a floating point rounding instruction. The floating point rounding instruction indicates a source of one or more floating point data elements, indicates a number of fraction bits after a radix point that each of the one or more floating point data elements are to be rounded to, and indicates a destination storage location. A result is stored in the destination storage location in response to the floating point rounding instruction. The result includes one or more rounded result floating point data elements. Each of the one or more rounded result floating point data elements includes one of the floating point data elements of the source, in a corresponding position, which has been rounded to the indicated number of fraction bits. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: December 22, 2011

Date of Patent: February 19, 2019

Assignee: Intel Corporation

Inventors: Jesus Corbal San Adrian, Cristina S. Anderson, Robert Valentine, Bret Toll, Amit Gradstein, Simon Rubanovich, Benny Eitan
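Rounding a value to a given number of fraction bits after the radix point, the operation named in the abstract of patent 10209986, can be illustrated by scaling, rounding to an integer, and scaling back. This is a scalar sketch using round-half-to-even only; the name `round_to_fraction_bits` is hypothetical, and very large inputs or bit counts may overflow in this simple form.

```python
import math


def round_to_fraction_bits(x: float, fraction_bits: int) -> float:
    """Round x to a multiple of 2**-fraction_bits (ties go to the even multiple)."""
    scaled = math.ldexp(x, fraction_bits)      # x * 2**fraction_bits, exact unless overflow
    # Python's built-in round() on a float rounds half to even and returns an int.
    return math.ldexp(round(scaled), -fraction_bits)
```

For example, `round_to_fraction_bits(1.3, 2)` gives 1.25, the nearest multiple of one quarter.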
Methods, apparatus, instructions and logic to provide permute controls with leading zero count functionality

Patent number: 10162638

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: March 5, 2018

Date of Patent: December 25, 2018

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
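The per-field leading zero count at the heart of patent 10162638 (and the two related patents that follow) is easy to model in software. The sketch below counts the most significant contiguous zero bits of each field in a list standing in for a vector register; the 32-bit default field width and the name `vector_lzcnt` are assumptions, and the permute-control generation built on top of the counts is not shown.

```python
def vector_lzcnt(fields, width: int = 32):
    """Return the leading zero count of each field, treated as a width-bit value."""
    mask = (1 << width) - 1
    # bit_length() is the position of the highest set bit (0 for a zero field).
    return [width - (field & mask).bit_length() for field in fields]
```

A field of zero yields the full width, and a field with its top bit set yields zero.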
Methods, apparatus, instructions and logic to provide permute controls with leading zero count functionality

Patent number: 10162639

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: March 5, 2018

Date of Patent: December 25, 2018

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
Methods, apparatus, instructions and logic to provide permute controls with leading zero count functionality

Patent number: 10162637

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: March 5, 2018

Date of Patent: December 25, 2018

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
Rounding floating point numbers

Patent number: 10146503

Abstract: Embodiments disclosed pertain to apparatuses, systems, and methods for floating point operations. Disclosed embodiments pertain to a circuit that is capable of processing both a normal and denormal inputs and outputting normal and denormal results, and where a rounding module is used advantageously to reduce operational latency of the circuit.

Type: Grant

Filed: October 13, 2016

Date of Patent: December 4, 2018

Assignee: Imagination Technologies Limited

Inventor: Leonard Rarick
Closepath fast incremented sum in a three-path fused multiply-add design

Patent number: 10140092

Abstract: According to one general aspect, an apparatus may include a floating-point multiply-accumulate unit configured to generate a floating point result by either adding or subtracting three floating point operands: an addend, a product carry, and a product sum. The floating-point multiply-accumulate unit may include a close path adder. The close path adder may include an unincremented mantissa addition circuit configured to compute an unincremented mantissa result based upon the three floating point operands. The close path adder may also include an incremented mantissa addition circuit configured to, at least partially in parallel with the mantissa addition circuit, produce an incremented mantissa result. The close path adder may further include a selection circuit configured to produce a close path result by selecting between the unincremented mantissa result and the incremented mantissa result.

Type: Grant

Filed: February 10, 2017

Date of Patent: November 27, 2018

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Ashraf Ahmed
Overlap propagation operation

Patent number: 9928031

Abstract: Processing circuitry is provided to perform an overlap propagating operation on a first data value to generate a second data value, the first and second data values having a redundant representation representing a P-bit numeric value using an M-bit data value comprising a plurality of N-bit portions, where M>P>N. In the redundant representation, each N-bit portion other than a most significant N-bit portion includes a plurality of overlap bits having a same significance as a plurality of least significant bits of a following N-bit portion. Each N-bit portion of the second data value other than a least significant N-bit portion is generated by adding non-overlap bits of a corresponding N-bit portion of the first data value to the overlap bits of a preceding N-bit portion of the first data value. This provides a faster technique for reducing the chance of overflow during addition of the redundantly represented M-bit value.

Type: Grant

Filed: November 12, 2015

Date of Patent: March 27, 2018

Assignee: ARM LIMITED

Inventors: Neil Burgess, David Raymond Lutz, Christopher Neal Hinds
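The overlap propagating operation in patent 9928031 can be pictured with a small unsigned model. In the sketch below, each lane is an N-bit integer whose top V bits overlap the low bits of the next lane; treating the overlap bits as unsigned, and the names `propagate_overlap` and `lanes_value`, are simplifying assumptions not taken from the abstract.

```python
def lanes_value(lanes, n: int, v: int) -> int:
    """Numeric value of a redundant representation; lanes[0] is least significant."""
    stride = n - v                         # lanes are spaced by their non-overlap width
    return sum(lane << (i * stride) for i, lane in enumerate(lanes))


def propagate_overlap(lanes, n: int, v: int):
    """Add each lane's overlap bits into the next lane, leaving the value unchanged."""
    stride = n - v
    low_mask = (1 << stride) - 1
    result = []
    for i, lane in enumerate(lanes):
        is_top = i == len(lanes) - 1
        kept = lane if is_top else lane & low_mask    # top lane has no overlap bits
        carried = lanes[i - 1] >> stride if i > 0 else 0
        result.append(kept + carried)
    return result
```

After propagation the overlap bits of each lane are nearly empty again, which leaves headroom for further lane-wise additions without carries crossing lanes.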
Floating point number rounding

Patent number: 9817661

Abstract: A data processing system supports execution of program instructions having a rounding position input operand so as to generate control signals for controlling processing circuitry to process a floating point input operand with a significand value to generate an output result which depends upon a value from rounding the floating point input operand using a variable rounding point within the significand of the floating point input operand as specified by the rounding position input operand. In this way, processing operations having as inputs floating point operands and anchored number operands may be facilitated.

Type: Grant

Filed: October 7, 2015

Date of Patent: November 14, 2017

Assignee: ARM Limited

Inventors: David Raymond Lutz, Christopher Neal Hinds, Neil Burgess
Standard format intermediate result

Patent number: 9798519

Abstract: A microprocessor comprises an instruction pipeline, a shared memory, and first and second arithmetic processing units in the instruction pipeline, each capable of reading or receiving operands from and writing or providing results to the shared memory. The first arithmetic processing unit performs a first portion of a mathematical operation to produce an intermediate result vector that is not a complete, final result of the mathematical operation. The first arithmetic processing unit generates a plurality of non-architectural calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The second arithmetic processing unit performs a second portion of the mathematical operation, in accordance with the calculation control indicators, to produce a complete, final result of the mathematical operation.

Type: Grant

Filed: June 24, 2015

Date of Patent: October 24, 2017

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
Double rounded combined floating-point multiply and add

Patent number: 9778909

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Grant

Filed: October 24, 2016

Date of Patent: October 3, 2017

Assignee: Intel Corporation

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
Lane position information for processing of vector

Patent number: 9733899

Abstract: Processing circuitry performs a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector. The processing circuitry identifies lane position information for each lane of processing, the lane position information for a given lane identifying a relative position of the corresponding result data element to be generated by the given lane within a corresponding result data value spanning one or more result data elements of the result vector. The processing circuitry is configured to perform each lane of processing in dependence on the lane position information identified for that lane. This enables generation of results which are wider or narrower than the vector size supported in hardware.

Type: Grant

Filed: November 12, 2015

Date of Patent: August 15, 2017

Assignee: ARM Limited

Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
Check procedure for floating point operations

Patent number: 9678714

Abstract: Method and computer system for implementing an operation on ≥1 floating point input, in accordance with a rounding mode, e.g. using a Newton-Raphson technique. The floating point result comprises a p-bit mantissa. An unrounded proposed mantissa result is determined using the Newton-Raphson technique, wherein a p-bit rounded proposed mantissa result, t, corresponds to a rounding of the unrounded proposed mantissa result in accordance with the rounding mode, with k leading zeroes. If an increment to the (m−k)th bit of the unrounded result would affect the p-bit rounded result then the input(s) and bits of the unrounded result are used to determine a check parameter which is indicative of a relationship between an exact result and the unrounded result if the (m−k)th bit were incremented. The p-bit mantissa of the floating point result, is determined in dependence upon the check parameter, to be either t or t+1.

Type: Grant

Filed: July 11, 2014

Date of Patent: June 13, 2017

Assignee: Imagination Technologies Limited

Inventors: Manouk Manoukian, Leonard Rarick
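
The choice between t and t+1 described in the abstract of patent 9678714 can be sketched in a few lines. This is only an illustration under stated assumptions: integer floor division stands in for the Newton-Raphson estimate, the rounding mode is fixed to round-to-nearest with ties to even, and the function name `rounded_quotient` and the exact form of the check value are hypothetical rather than taken from the patent.

```python
def rounded_quotient(a: int, b: int, p: int) -> int:
    """Return (a / b) * 2**p rounded to nearest, ties to even (a >= 0, b > 0).

    Hypothetical helper: floor division plays the role of the truncated
    estimate t; a check value then decides between t and t + 1.
    """
    n = a << p                       # scale the dividend by 2**p
    t = n // b                       # truncated estimate, exact value lies in [t, t + 1)
    # Sign of (exact result) - (t + 1/2), scaled by 2*b to stay in integers.
    check = 2 * n - (2 * t + 1) * b
    if check > 0:                    # exact result is above the midpoint
        return t + 1
    if check < 0:                    # exact result is below the midpoint
        return t
    return t + (t & 1)               # exact tie: pick the even neighbour


print(rounded_quotient(2, 3, 4))     # 2/3 scaled by 16
```

The point of the sketch is that the final decision never needs the exact quotient, only the sign of one integer expression built from the inputs and the estimate.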
Fused floating point datapath with correct rounding

Patent number: 9552190

Abstract: In accordance with some embodiments, a floating point number datapath circuitry, e.g., within an integrated circuit programmable logic device is provided. The datapath circuitry may be used for computing a rounded absolute value of a mantissa of a floating point number. The floating point datapath circuitry may have only a single adder stage for computing a rounded absolute value of a mantissa of the floating point number based on one or more bits of an unrounded mantissa of the floating point number. The unrounded and rounded mantissas may include a sign bit, a sticky bit, a round bit, and/or a least significant bit, and/or other bits. The unrounded mantissa may be in a format that includes negative numbers (e.g., 2's complement) and the rounded mantissa may be in a format that may include a portion of the floating point number represented as a positive number, (e.g., signed magnitude).

Type: Grant

Filed: April 20, 2016

Date of Patent: January 24, 2017

Assignee: ALTERA CORPORATION

Inventors: Martin Langhammer, Bogdan Pasca
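
The abstract of patent 9552190 mentions a least significant bit, a round bit, and a sticky bit. As a rough software illustration of how those three bits commonly drive a round-to-nearest-even decision, consider the sketch below. It assumes an unsigned integer mantissa, and the function name `round_nearest_even` is hypothetical; it does not model the single-adder datapath or the 2's complement conversion the patent describes.

```python
def round_nearest_even(mantissa: int, extra_bits: int) -> int:
    """Drop `extra_bits` low bits of an unsigned mantissa, rounding to nearest even."""
    if extra_bits == 0:
        return mantissa
    kept = mantissa >> extra_bits
    lsb = kept & 1                                        # last bit that is kept
    round_bit = (mantissa >> (extra_bits - 1)) & 1        # first bit that is dropped
    # Sticky bit: OR of every dropped bit below the round bit.
    sticky = int(mantissa & ((1 << (extra_bits - 1)) - 1) != 0)
    increment = round_bit & (sticky | lsb)                # round up above half, or on an odd tie
    return kept + increment


print(round_nearest_even(0b10110, 2))   # 22 / 4 = 5.5, a tie
```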
Vector instruction for presenting complex conjugates of respective complex numbers

Patent number: 9411583

Abstract: An apparatus is described having a semiconductor chip that has an instruction execution pipeline. The instruction execution pipeline has an execution unit with logic circuitry to perform the following for an instruction: accept input vector elements representing real and imaginary parts of a plurality of complex numbers; and, present the complex conjugates of the complex numbers.

Type: Grant

Filed: December 22, 2011

Date of Patent: August 9, 2016

Assignee: Intel Corporation

Inventors: Suleyman Sair, Elmoustapha Ould-Ahmed-Vall
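
In software terms, the operation described in the abstract of patent 9411583 amounts to negating the imaginary part of every element. The sketch below assumes an interleaved layout (real, imaginary, real, imaginary, ...), which the abstract does not specify, and the function name `conjugate_interleaved` is hypothetical.

```python
def conjugate_interleaved(v: list[float]) -> list[float]:
    """Conjugate complex numbers stored as [re0, im0, re1, im1, ...] (assumed layout)."""
    if len(v) % 2:
        raise ValueError("expected an even number of elements")
    out = list(v)
    for i in range(1, len(out), 2):   # odd positions hold the imaginary parts
        out[i] = -out[i]
    return out


print(conjugate_interleaved([1.0, 2.0, 3.0, -4.0]))
```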
Floating-point adder circuitry

Patent number: 9405728

Abstract: An integrated circuit is provided that performs floating-point addition or subtraction operations involving at least three floating-point numbers. The floating-point numbers are pre-processed by dynamically extending the number of mantissa bits, determining the floating-point number with the biggest exponent, and shifting the mantissa of the other floating-point numbers to the right. Each extended mantissa has at least twice the number of bits of the mantissa entering the floating-point operation. The exact bit extension is dependent on the number of floating-point numbers to be added. The mantissas of all floating-point numbers with an exponent smaller than the biggest exponent are shifted to the right. The number of right shift bits is dependent on the difference between the biggest exponent and the respective floating-point exponent.

Type: Grant

Filed: September 5, 2013

Date of Patent: August 2, 2016

Assignee: Altera Corporation

Inventor: Tomasz Czajkowski
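
The pre-processing steps in the abstract of patent 9405728 (extend the mantissas, find the biggest exponent, shift the others right by the exponent difference) can be sketched as follows. The sketch assumes non-negative integer mantissas, represents each operand as a `(mantissa, exponent)` pair with value `mantissa * 2**exponent`, and extends every mantissa by a fixed `mantissa_bits` so its width doubles; the abstract says the exact extension depends on the operand count, which is not modelled here. The function name `add_aligned` is hypothetical.

```python
def add_aligned(operands: list[tuple[int, int]], mantissa_bits: int) -> tuple[int, int]:
    """Sum (mantissa, exponent) pairs after aligning them to the biggest exponent.

    Returns (total, exponent) with value total * 2**exponent.
    """
    max_exp = max(e for _, e in operands)
    ext = mantissa_bits                      # assumed extension: doubles the mantissa width
    total = 0
    for m, e in operands:
        shift = max_exp - e                  # right shift grows with the exponent gap
        total += (m << ext) >> shift         # bits shifted past the extension are dropped
    return total, max_exp - ext


total, exp = add_aligned([(3, 2), (1, 0), (1, 1)], mantissa_bits=4)
print(total * 2.0 ** exp)                    # 12 + 1 + 2
```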
Enhanced loop streaming detector to drive logic optimization

Patent number: 9354875

Abstract: An enhanced loop streaming detection mechanism is provided in a processor to reduce power consumption. The processor includes a decoder to decode instructions in a loop into micro-operations, and a loop streaming detector to detect the presence of the loop in the micro-operations. The processor also includes a loop characteristic tracker unit to identify hardware components downstream from the decoder that are not to be used by the micro-operations in the loop, and to disable the identified hardware components. The processor also includes execution circuitry to execute the micro-operations in the loop with the identified hardware components disabled.

Type: Grant

Filed: December 27, 2012

Date of Patent: May 31, 2016

Assignee: Intel Corporation

Inventors: Matthew C. Merten, Justin M. Deinlein, Yury N. Ilin, Alexandre J. Farcy, Tong Li, Srikanth T. Srinivasan
Fused floating point datapath with correct rounding

Patent number: 9348557

Abstract: In accordance with some embodiments, a floating point number datapath circuitry, e.g., within an integrated circuit programmable logic device is provided. The datapath circuitry may be used for computing a rounded absolute value of a mantissa of a floating point number. The floating point datapath circuitry may have only a single adder stage for computing a rounded absolute value of a mantissa of the floating point number based on one or more bits of an unrounded mantissa of the floating point number. The unrounded and rounded mantissas may include a sign bit, a sticky bit, a round bit, and/or a least significant bit, and/or other bits. The unrounded mantissa may be in a format that includes negative numbers (e.g., 2's complement) and the rounded mantissa may be in a format that may include a portion of the floating point number represented as a positive number, (e.g., signed magnitude).

Type: Grant

Filed: February 21, 2014

Date of Patent: May 24, 2016

Assignee: ALTERA CORPORATION

Inventors: Martin Langhammer, Bogdan Pasca
Performing quotient selection for a carry-save division operation

Patent number: 9298421

Abstract: The disclosed embodiments disclose techniques for performing quotient selection in an iterative carry-save division operation that divides a dividend, R, by a divisor, D, to produce an approximation of a quotient, Q=R/D. During a divide operation, a divider approximates Q by iteratively selecting an operation to perform for each iteration of the carry-save division operation and then performing the selected operation. The operation for each iteration is selected based on the current partial sum bits of a partial remainder in carry-save form (rs) and the current partial carry bits of a partial remainder in carry-save form (rc). More specifically, the operation is selected from a set of operations that includes: (1) a 2X* operation; (2) an S1 & 2X* operation; (3) an S2 & 2X* operation; (4) an A1 & 2X* operation; and (5) an A2 & 2X* operation.

Type: Grant

Filed: September 17, 2013

Date of Patent: March 29, 2016

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventors: Josephus C. Ebergen, Navaneeth P. Jamadagni, Ivan E. Sutherland
Exponentiation system

Patent number: 8930435

Abstract: A method for computation, including defining a sequence of n bits that encodes an exponent d, such that no more than a specified number of successive bits in the sequence are the same, initializing first and second registers using a value of a base x that is to be exponentiated, whereby the first and second registers hold respective first and second values, which are successively updated during the computation, successively, for each bit in the sequence computing a product of the first and second values, depending on whether the bit is one or zero, selecting one of the first and second registers, and storing the product in the selected one of the registers, whereby the first and second registers hold respective first and second final values upon completion of the sequence, and returning x^d based on the first and second final values. Related apparatus and methods are also described.

Type: Grant

Filed: September 21, 2010

Date of Patent: January 6, 2015

Assignee: Cisco Technology Inc.

Inventors: Yaacov Belenky, Zeev Geyzel
Circuit which performs split precision, signed/unsigned, fixed and floating point, real and complex multiplication

Patent number: 8918445

Abstract: An integrated multiplier circuit that operates on a variety of data formats including integer fixed point, signed or unsigned, real or complex, 8 bit, 16 bit or 32 bit as well as floating point data that may be single precision real, single precision complex or double precision. The circuit uses a single set of multiplier arrays to perform 16×16, 32×32 and 64×64 multiplies, 32×32 and 64×64 complex multiplies, 32×32 and 64×64 complex multiplies with one operand conjugated.

Type: Grant

Filed: September 21, 2011

Date of Patent: December 23, 2014

Assignee: Texas Instruments Incorporated

Inventors: Timothy David Anderson, Mujibur Rahman
Arithmetic circuit, arithmetic processing apparatus and method of controlling arithmetic circuit

Patent number: 8903881

Abstract: An arithmetic circuit for quantizing pre-quantized data includes a first input register to store first-format pre-quantized data that includes a mantissa and an exponent, a second input register to store a quantization target exponent, an exponent-correction-value indicating unit to indicate an exponent correction value, an exponent generating unit to generate a quantized exponent obtained by subtracting the exponent correction value from the quantization target exponent, a shift amount generating unit to generate a shift amount obtained by subtracting the exponent of the pre-quantized data and the exponent correction value from the quantization target exponent, a shift unit to generate a quantized mantissa obtained by shifting the mantissa of the pre-quantized data by the shift amount generated by the shift amount generating unit, and an output register to store quantized data that includes the quantized exponent generated by the exponent generating unit and the quantized mantissa generated by the shift unit.

Type: Grant

Filed: April 3, 2012

Date of Patent: December 2, 2014

Assignee: Fujitsu Limited

Inventors: Ryuji Kan, Hideyuki Unno, Kenichi Kitamura
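
The two subtractions named in the abstract of patent 8903881 can be written out directly. The sketch below is a loose illustration only: it assumes a radix-2 format, an unsigned integer mantissa, a right shift for positive shift amounts, and truncation of shifted-out bits, none of which the abstract states. The function name `quantize` and its default `correction=0` are hypothetical.

```python
def quantize(mantissa: int, exponent: int, target_exponent: int,
             correction: int = 0) -> tuple[int, int]:
    """Re-express mantissa * 2**exponent at the quantized exponent (assumed radix 2)."""
    # Quantized exponent: target exponent minus the correction value.
    q_exponent = target_exponent - correction
    # Shift amount: target exponent minus the input exponent and the correction value.
    shift = target_exponent - exponent - correction
    if shift >= 0:
        q_mantissa = mantissa >> shift       # low bits are truncated in this sketch
    else:
        q_mantissa = mantissa << -shift
    return q_mantissa, q_exponent


print(quantize(0b1100, 0, 2))                # 12 * 2**0 re-expressed with exponent 2
```

With these definitions the quantized exponent always equals the input exponent plus the shift amount, so the value is preserved whenever no nonzero bits are shifted out.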
Normalization of floating point operations in a programmable integrated circuit device

Patent number: 8886695

Abstract: A programmable integrated circuit device is programmed to normalize multiplication operations by examining the input or output values to determine the likelihood of overflow or underflow and then to adjust the input or output values accordingly. The examination of the inputs can include an examination of the number of adder stages feeding into the inputs, as well as a count of leading bits ahead of the first significant bit. Adjustment of an input can include shifting the mantissa by the leading bit count and adjusting the exponent accordingly, while adjustment of the output can include shifting the mantissa by the sum of the leading bit counts of the inputs and adjusting the exponent accordingly. Or the output can be examined to find its leading bit count and the output then can be adjusted by shifting the mantissa by the leading bit count and adjusting the exponent accordingly.

Type: Grant

Filed: July 10, 2012

Date of Patent: November 11, 2014

Assignee: Altera Corporation

Inventor: Martin Langhammer
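
Two of the adjustments named in the abstract of patent 8886695 (shifting an input by its leading bit count, and shifting the output by the sum of the inputs' leading bit counts) can be sketched in software. The sketch assumes unsigned mantissas that fit in `width` bits and values of the form `mantissa * 2**exponent`; the function names are hypothetical, and the adder-stage examination mentioned in the abstract is not modelled.

```python
def leading_zeros(mantissa: int, width: int) -> int:
    """Count leading zero bits of an unsigned mantissa held in `width` bits."""
    return width - mantissa.bit_length()


def normalize(mantissa: int, exponent: int, width: int) -> tuple[int, int]:
    """Shift the mantissa left by its leading bit count and adjust the exponent."""
    if mantissa == 0:
        return 0, exponent
    lead = leading_zeros(mantissa, width)
    return mantissa << lead, exponent - lead


def multiply_then_adjust(ma: int, ea: int, mb: int, eb: int,
                         width: int) -> tuple[int, int]:
    """Multiply unnormalized inputs, then shift the product by the summed counts."""
    shift = leading_zeros(ma, width) + leading_zeros(mb, width)
    return (ma * mb) << shift, ea + eb - shift


print(normalize(0b00010110, 5, 8))
print(multiply_then_adjust(0b0011, 0, 0b0010, 0, 4))
```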
Floating point multiplier circuit with optimized rounding calculation

Patent number: 8832166

Abstract: An optimized floating point multiplier rounding circuit that minimizes the increase of the critical timing path of the calculation. The values of the temporary mantissa required to make the rounding decision are calculated simultaneously by the circuit shown in the invention.

Type: Grant

Filed: September 28, 2011

Date of Patent: September 9, 2014

Assignee: Texas Instruments Incorporated

Inventor: Timothy David Anderson
System and method for testing whether a result is correctly rounded

Patent number: 8775494

Abstract: A computer-implemented method for executing a floating-point calculation where an exact value of an associated result cannot be expressed as a floating-point value is disclosed. The method involves: generating an estimate of the associated result and storing the estimate in memory; calculating an amount of error for the estimate; determining whether the amount of error is less than or equal to a threshold of error for the associated result; and if the amount of error is less than or equal to the threshold of error, then concluding that the estimate of the associated result is a correctly rounded result of the floating-point calculation; or if the amount of error is greater than the threshold of error, then testing whether the floating-point calculation constitutes an exception case.

Type: Grant

Filed: March 1, 2011

Date of Patent: July 8, 2014

Assignee: NVIDIA Corporation

Inventor: Alexandru Fit-Florea
METHOD, APPARATUS, SYSTEM FOR SINGLE-PATH FLOATING-POINT ROUNDING FLOW THAT SUPPORTS GENERATION OF NORMALS/DENORMALS AND ASSOCIATED STATUS FLAGS

Publication number: 20140181169

Abstract: A mechanism for performing single-path floating-point rounding in a floating point unit is disclosed. A system of the disclosure includes a memory and a processing device communicably coupled to the memory. In one embodiment, the processing device comprises a floating point unit (FPU) to generate a plurality of status flags for a rounded value of a finite nonzero number. The plurality of status flags are generated based on the finite nonzero number without calculating the rounded value of the finite nonzero number. The plurality of status flags comprises an overflow flag and an underflow flag. The FPU determines whether a rounded value should be calculated for the finite nonzero number based on the plurality of status flags and whether the overflow flag is asserted.

Type: Application

Filed: December 21, 2012

Publication date: June 26, 2014

Inventors: WARREN E. FERGUSON, BRIAN J. HICKMANN, THOMAS D. FLETCHER
Rounding unit for decimal floating-point division

Patent number: 8751555

Abstract: A method for performing a decimal floating-point division, including: receiving, by a decimal floating-point divider, a decimal floating-point dividend and a decimal floating-point divisor; obtaining, by the decimal floating-point divider, a preliminary quotient having a first precision level, where the preliminary quotient is calculated from the decimal floating-point dividend and the decimal-floating point divisor; receiving, by the decimal floating-point divider, a rounding mode; selecting a rounding action based on the preliminary quotient and the rounding mode; and obtaining a rounded quotient having a second precision level by rounding the preliminary quotient according to the rounding action, where the first precision level is at least one digit greater than the second precision level.

Type: Grant

Filed: July 6, 2011

Date of Patent: June 10, 2014

Assignee: SilMinds, LLC, Egypt

Inventors: Amira Mohamed, Hossam Ali Hassan Fahmy, Ramy Raafat, Yasmeen Farouk, Mostafa Elkhouly, Rodina Samy, Tarek Eldeeb
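
The two-step flow in the abstract of patent 8751555 (a preliminary quotient with at least one extra digit, then a rounding action chosen by the rounding mode) can be imitated with Python's standard `decimal` module. This is a sketch under assumptions, not the patented circuit: the helper name `divide_rounded` is hypothetical, and the use of `ROUND_05UP` for the preliminary quotient is a choice made here so that the second rounding is not misled by discarded digits.

```python
from decimal import Context, Decimal, ROUND_05UP, ROUND_HALF_EVEN


def divide_rounded(dividend: Decimal, divisor: Decimal, precision: int,
                   rounding: str = ROUND_HALF_EVEN) -> Decimal:
    """Divide with one extra digit, then round to `precision` digits."""
    # Preliminary quotient: one digit wider than the final result.
    # ROUND_05UP nudges a truncated last digit of 0 or 5 when digits were lost,
    # which keeps the later rounding step correct (assumption of this sketch).
    wide = Context(prec=precision + 1, rounding=ROUND_05UP)
    preliminary = wide.divide(dividend, divisor)
    # Rounding action: apply the requested mode at the final precision.
    narrow = Context(prec=precision, rounding=rounding)
    return narrow.plus(preliminary)


print(divide_rounded(Decimal(2), Decimal(3), 4))
```

Truncating the preliminary quotient instead would turn a value such as 1.00050001 into an apparent tie at four digits, which is the case the extra care is meant to avoid.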
Integer rounding operation

Patent number: 8732226

Abstract: Systems, methods, processors, media, and other embodiments associated with integer rounding a floating point number in one micro-operation (uop) are described. One system embodiment includes a memory to store an integer rounding floating point instruction and a processor to perform the integer rounding floating point instruction. The processor may include a floating point unit that includes circuits and/or logics that integer round the floating point number.

Type: Grant

Filed: June 6, 2006

Date of Patent: May 20, 2014

Assignee: Intel Corporation

Inventors: Mohammad Abdallah, Chad D. Hancock, Kwok W. Lui
MIXED PRECISION ESTIMATE INSTRUCTION COMPUTING NARROW PRECISION RESULT FOR WIDE PRECISION INPUTS

Publication number: 20140101216

Abstract: A technique is provided for performing a mixed precision estimate. A processing circuit receives an input of a first precision having a wide precision value. The processing circuit computes an output in an output exponent range corresponding to a narrow precision value based on the input having the wide precision value.

Type: Application

Filed: December 11, 2013

Publication date: April 10, 2014

Applicant: International Business Machines Corporation

Inventors: Michael K. Gschwind, Valentina Salapura
Decimal floating-point fused multiply-add unit

Patent number: 8694572

Abstract: A decimal floating-point Fused-Multiply-Add (FMA) unit that performs the operation of ±(A×B)±C on decimal floating-point operands. The decimal floating-point FMA unit executes the multiplication and addition operations compliant with the IEEE 754-2008 standard. Specifically, the decimal floating-point FMA includes a parallel multiplier and injects the addend after required alignment as an additional partial product in the reduction tree used in the parallel multiplier. The decimal floating-point FMA unit may be configured to perform addition-subtraction operations or multiplication operations as standalone operations.

Type: Grant

Filed: July 6, 2011

Date of Patent: April 8, 2014

Assignee: SilMinds, LLC, Egypt

Inventors: Rodina Samy, Hossam Ali Hassan Fahmy, Tarek Eldeeb, Ramy Raafat, Yasmeen Farouk, Mostafa Elkhouly, Amira Mohamed
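
The numerical effect of a fused multiply-add, as opposed to the hardware structure in the abstract of patent 8694572, can be seen with the `fma` method of Python's standard `decimal.Decimal`, which rounds only once. The helper name `fused_vs_separate` and the operand values are illustrative assumptions; the sketch says nothing about the parallel multiplier or the reduction tree.

```python
from decimal import Decimal, localcontext


def fused_vs_separate(a: Decimal, b: Decimal, c: Decimal,
                      prec: int) -> tuple[Decimal, Decimal]:
    """Compute a*b + c with one rounding (fused) and with two roundings (separate)."""
    with localcontext() as ctx:
        ctx.prec = prec
        fused = a.fma(b, c)          # product is not rounded before the addition
        separate = a * b + c         # product is rounded to `prec` digits first
    return fused, separate


fused, separate = fused_vs_separate(Decimal("1.23"), Decimal("1.23"),
                                    Decimal("-1.51"), prec=3)
print(fused, separate)
```

With three digits of precision the separately rounded product loses the digits that survive the cancellation, while the fused form keeps them.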
System and method of bypassing unrounded results in a multiply-add pipeline unit

Patent number: 8671129

Abstract: A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.

Type: Grant

Filed: March 8, 2011

Date of Patent: March 11, 2014

Assignee: Oracle International Corporation

Inventors: Jeffrey S. Brooks, Christopher H. Olson
