Floating Point Patents (Class 708/495)
  • Patent number: 10732979
    Abstract: A set of entries in a branch prediction structure for a set of second blocks are accessed based on a first address of a first block. The set of second blocks correspond to outcomes of one or more first branch instructions in the first block. Speculative prediction of outcomes of second branch instructions in the second blocks is initiated based on the entries in the branch prediction structure. State associated with the speculative prediction is selectively flushed based on types of the branch instructions. In some cases, the branch predictor can be accessed using an address of a previous block or a current block. State associated with the speculative prediction is selectively flushed from the ahead branch prediction, and prediction of outcomes of branch instructions in one of the second blocks is selectively initiated using non-ahead accessing, based on the types of the one or more branch instructions.
    Type: Grant
    Filed: June 18, 2018
    Date of Patent: August 4, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Marius Evers, Aparna Thyagarajan, Ashok T. Venkatachar
  • Patent number: 10671345
    Abstract: An integrated circuit may include normalization circuitry that can be used when converting a fixed-point number to a floating-point number. The normalization circuitry may include at least a floating-point generation circuit that receives the fixed-point number and that creates a corresponding floating-point number. The normalization circuitry may then leverage an embedded digital signal processing (DSP) block on the integrated circuit to perform an arithmetic operation by removing the leading one from the created floating-point number. The resulting number may have a fractional component and an exponent value, which can then be used to derive the final normalized value.
    Type: Grant
    Filed: February 2, 2017
    Date of Patent: June 2, 2020
    Assignee: Intel Corporation
    Inventor: Bogdan Pasca
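
A software analogue of the normalization idea in the abstract above can be sketched in Python. This is only an illustration under assumptions that the abstract does not state: an unsigned fixed-point input narrower than 52 bits, IEEE 754 binary64 as the target format, and an ordinary floating-point subtraction standing in for the DSP block. The name `fixed_to_float` is hypothetical.

```python
import math
import struct


def fixed_to_float(x: int, frac_bits: int) -> float:
    """Convert an unsigned fixed-point integer with `frac_bits` fraction bits.

    Assumption: 0 <= x < 2**52, so x fits entirely in a binary64 mantissa field.
    """
    if not 0 <= x < (1 << 52):
        raise ValueError("x must fit in 52 bits for this sketch")

    # Step 1: "create" a floating-point number by placing x in the mantissa
    # field under a fixed biased exponent of 1075 (= 1023 + 52).
    # The encoded value is exactly 2**52 + x, not yet the value we want.
    bits = (1075 << 52) | x
    created = struct.unpack("<d", struct.pack("<Q", bits))[0]

    # Step 2: remove the leading one with a single subtraction. The
    # floating-point subtractor performs the normalization shift for us.
    normalized = created - 2.0 ** 52

    # Step 3: account for the position of the binary point.
    return math.ldexp(normalized, -frac_bits)


print(fixed_to_float(0b1011, 2))  # 11 / 4
```

The point of the sketch is that no explicit leading-zero count or barrel shift appears in the code; the arithmetic unit that performs the subtraction does that work.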
  • Patent number: 10671389
    Abstract: A Vector Floating Point Test Data Class Immediate instruction is provided that determines whether one or more elements of a vector specified in the instruction are of one or more selected classes and signs. If a vector element is of a selected class and sign, an element in an operand of the instruction corresponding to the vector element is set to a first defined value, and if the vector element is not of the selected class and sign, the operand element corresponding to the vector element is set to a second defined value.
    Type: Grant
    Filed: January 21, 2019
    Date of Patent: June 2, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan D. Bradbury, Eric M. Schwarz
  • Patent number: 10628049
    Abstract: A sequencer circuit is configured to generate control signals for on-die memory control circuitry. The control signals may include memory operation pulses for implementing operations on selected non-volatile memory cells embodied within the same die as the sequencer (and other on-die memory control circuitry). The timing, configuration, and/or duration of the memory control signals are defined in configuration data, which can be modified after the design and/or fabrication of the die and/or on-die memory circuitry. As such, the timing, configuration, and/or duration of the memory control signals generated by the sequencer may be manipulated after the design and/or fabrication of the die, sequencer, and other on-die memory control circuitry.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: April 21, 2020
    Assignee: Sandisk Technologies LLC
    Inventors: Yuheng Zhang, Gordon Yee, Yibo Yin, Tz-Yi Liu
  • Patent number: 10592213
    Abstract: Techniques to preprocess tensor operations prior to code generation to optimize compilation are disclosed. A computer readable representation of a linear algebra or tensor operation is received. A code transformation software component performs transformations including output reduction and fraction removal. The result is a set of linear equations of a single variable with integer coefficients. Such a set lends itself to more efficient code generation during compilation by a code generation software component. Use cases disclosed include targeting a machine learning hardware accelerator, receiving code in the form of an intermediate language generated by a cross-compiler with multiple front ends supporting multiple programming languages, and cloud deployment and execution scenarios.
    Type: Grant
    Filed: October 18, 2017
    Date of Patent: March 17, 2020
    Assignee: Intel Corporation
    Inventors: Jeremy Bruestle, Choong Ng
  • Patent number: 10514913
    Abstract: Setting or updating of floating point controls is managed. Floating point controls include controls used for floating point operations, such as rounding mode and/or other controls. Further, floating point controls include status associated with floating point operations, such as floating point exceptions and/or others. The management of the floating point controls includes efficiently updating the controls, while reducing costs associated therewith.
    Type: Grant
    Filed: June 23, 2017
    Date of Patent: December 24, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10503477
    Abstract: The disclosure provides a very flexible mechanism for a storage controller to create RAID stripes and to re-create corrupted stripes when necessary using the erasure coding scheme. Typically, this is known as a RAID 6 implementation/feature. The erasure code calculations are generated using the Galois Multiplication hardware and the system controller can pass any polynomial into the hardware on a per stripe calculation basis. The polynomial value is passed to the hardware via an input descriptor field. The descriptor controls the entire computation process.
    Type: Grant
    Filed: December 8, 2017
    Date of Patent: December 10, 2019
    Assignee: EXTEN TECHNOLOGIES, INC.
    Inventors: Daniel B. Reents, Ashwin Kamath
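
The abstract above describes Galois multiplication in which the reduction polynomial is supplied per stripe. A minimal Python sketch of GF(2^8) multiplication with a caller-chosen polynomial follows. The default polynomial `0x11D`, the generator `2`, and the helper `raid6_pq` are assumptions chosen for illustration; the descriptor format and the hardware data path from the patent are not modelled.

```python
def gf256_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two bytes in GF(2^8), reducing by the 9-bit polynomial `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:            # degree reached 8: reduce modulo poly
            a ^= poly
        b >>= 1
    return result


def raid6_pq(data_bytes, poly: int = 0x11D):
    """Compute P (plain XOR) and Q (weighted by powers of 2) for one byte column."""
    p = q = 0
    g = 1                        # g runs through 2**0, 2**1, 2**2, ... in the field
    for d in data_bytes:
        p ^= d
        q ^= gf256_mul(g, d, poly)
        g = gf256_mul(g, 2, poly)
    return p, q


print(hex(gf256_mul(0x57, 0x83, 0x11B)))  # worked example from the AES specification
print(raid6_pq([1, 2, 3]))
```

Passing the polynomial as an argument mirrors, in software, the idea that the same multiplier can serve different erasure-code configurations.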
  • Patent number: 10459688
    Abstract: An apparatus comprises: processing circuitry to perform data processing; and an instruction decoder to control the processing circuitry to perform an anchored-data processing operation to generate an anchored-data element. The anchored-data element has an encoding including type information indicative of whether the anchored-data element represents: a portion of bits of a two's complement number, said portion of bits corresponding to a given range of significance representable using the anchored-data element; or a special value other than said portion of bits of a two's complement number.
    Type: Grant
    Filed: February 6, 2019
    Date of Patent: October 29, 2019
    Assignee: ARM Limited
    Inventors: Neil Burgess, Christopher Neal Hinds, David Raymond Lutz
  • Patent number: 10379815
    Abstract: Integrated circuits with specialized processing blocks are provided. The specialized processing blocks may include floating-point multiplier circuits that can be configured to support variable precision. A multiplier circuit may include a first carry-propagate adder (CPA), a second carry-propagate adder (CPA), and an associated rounding circuit. The first CPA may be wide enough to handle the required precision of the mantissa. In a bridged mode, the first CPA may borrow an additional bit from the second CPA while the rounding circuit will monitor the appropriate bits to select the proper multiplier output. A parallel prefix tree operable in a non-bridged mode or the bridged mode may be used to compute multiple multiplier outputs. The multiplier circuit may also include exponent and exception handling circuitry using various masks corresponding to the desired precision width.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: August 13, 2019
    Assignee: Altera Corporation
    Inventor: Martin Langhammer
  • Patent number: 10331405
    Abstract: An accurate implementation of a polynomial using floating-point or other rounded arithmetic can be generated using a plurality of hardware logic components which each implement an input polynomial such that the zeros in the input polynomial can be determined correctly. The number of different hardware logic components that are used can be reduced by analyzing the set of input polynomials and from it generating a set of polynomial components, where each polynomial in the set of input polynomials which is not also in the set of polynomial components, can be generated from a single one of the polynomial components.
    Type: Grant
    Filed: April 21, 2017
    Date of Patent: June 25, 2019
    Assignee: Imagination Technologies Limited
    Inventor: Theo Alan Drane
  • Patent number: 10297000
    Abstract: A high dynamic range image information hiding method includes embedding secret information and extracting the secret information. The step of embedding secret information includes obtaining three channel values of every pixel in an original high dynamic range image; according to every channel value and corresponding 5-bit exponent of every pixel, determining an embedding significance bit of the information to be embedded in every channel value of every pixel; embedding information into every channel value of every pixel; and obtaining a high dynamic range image embedded with the secret information.
    Type: Grant
    Filed: November 8, 2017
    Date of Patent: May 21, 2019
    Assignee: Ningbo University
    Inventors: Gangyi Jiang, Yongqiang Bai, Mei Yu, Yang Wang
  • Patent number: 10216479
    Abstract: An apparatus and method are provided for performing arithmetic operations to accumulate floating-point numbers. The apparatus comprises execution circuitry to perform arithmetic operations, and decoder circuitry to decode a sequence of instructions. A convert and accumulate instruction is provided, and the decoder circuitry is responsive to decoding the convert and accumulate instruction to generate one or more control signals to control the execution circuitry to convert at least one floating-point operand identified by the convert and accumulate instruction into a corresponding N-bit fixed-point operand having M fraction bits, where M is less than N and M is dependent on a format of the floating-point operand. The execution circuitry accumulates each corresponding N bit fixed-point operand and a P bit fixed-point operand identified by the convert and accumulate instruction in order to generate a P bit fixed-point result value, where P is greater than N and also has M fraction bits.
    Type: Grant
    Filed: December 6, 2016
    Date of Patent: February 26, 2019
    Assignee: ARM LIMITED
    Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds, Andreas Due Engh-Halstvedt
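
The convert-and-accumulate operation in the abstract above can be sketched in Python as follows. The widths N = 41, M = 24 and P = 64 are assumptions picked to suit half-precision-like inputs (magnitudes below 2^16, smallest step 2^-24); the abstract only states that M < N < P and that M depends on the operand format. All function names are hypothetical.

```python
import math


def float_to_fixed(x: float, n_bits: int = 41, m_frac: int = 24) -> int:
    """Convert x to a signed N-bit fixed-point integer with M fraction bits."""
    scaled = math.floor(math.ldexp(x, m_frac))   # drop bits below 2**-M
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not lo <= scaled <= hi:
        raise OverflowError("operand does not fit in N bits")
    return scaled


def convert_and_accumulate(acc: int, operands, n_bits: int = 41,
                           m_frac: int = 24, p_bits: int = 64) -> int:
    """Add each converted operand to a P-bit two's complement accumulator."""
    for x in operands:
        acc += float_to_fixed(x, n_bits, m_frac)
    acc &= (1 << p_bits) - 1                     # wrap to P bits
    if acc >> (p_bits - 1):                      # reinterpret as signed
        acc -= 1 << p_bits
    return acc


def acc_to_float(acc: int, m_frac: int = 24) -> float:
    """Read the accumulator back as a real value."""
    return math.ldexp(acc, -m_frac)


total = convert_and_accumulate(0, [0.5, 0.25, -0.125])
print(total, acc_to_float(total))
```

Because the accumulation is integer addition, the result does not depend on the order of the operands, which is one common motivation for accumulating in fixed point.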
  • Patent number: 10109014
    Abstract: Systems and methods involving a rating module that accesses a single, voluminous table or multiple tables stored in a searchable data store (e.g., database) to execute various queries (e.g., SQL JOIN) to search the table(s) are disclosed. The system may include an underlying linear programming platform (e.g., optimization engine and associated components) that includes an application programmer's interface (e.g., Python API) that may be used to perform optimization using illustrative optimization libraries (e.g., optimizer). The system may be communicatively coupled with a vehicle and/or other device to communicate/output ratings information to a user.
    Type: Grant
    Filed: July 30, 2015
    Date of Patent: October 23, 2018
    Assignee: Allstate Insurance Company
    Inventors: Richard D. Bischoff, Nicholas J. Reed, Timothy S. Lenahan, Eric Huls
  • Patent number: 10061561
    Abstract: A floating point adder includes leading zero anticipation circuitry to determine a number of leading zeros within a result significand value of a sum of a first floating point operand and a second floating point operand. This number of leading zeros is used to generate a mask which in turn selects input bits from a non-normalized significand produced by adding the first significand value and the second significand value. The non-normalized significand is then normalized at the same time as the output rounding bits used to round the normalized significand value are generated by rounding bit generation circuitry.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: August 28, 2018
    Assignee: ARM Limited
    Inventor: David Raymond Lutz
  • Patent number: 10031842
    Abstract: A method including measuring an initial precision measurement of at least one value of a number with a decimal point, measuring an infinite precision measurement of a value of the number with a decimal point, where a format of the number with a decimal point or of a primitive operation manipulating a number with a decimal point is first replaced with a predetermined optimal format. Additionally, manipulating, for at least one instruction, at least one number with a decimal point, including writing at least one variant performing the same function as the at least one instruction, and measuring, for each variant, at least one value of the at least one number with a decimal point obtained with the variant, and selecting the optimal variant as a function of the measured value and the initial precision and infinite precision values and replacing the at least one instruction with the selected variant.
    Type: Grant
    Filed: March 12, 2015
    Date of Patent: July 24, 2018
    Assignee: NUMALIS
    Inventors: Arnault Ioualalen, Nicolas Normand, Matthieu Martel
  • Patent number: 10033801
    Abstract: Apparatus, systems, and methods are described, including apparatus that includes one or more communication interfaces for communicating over a communication network, and a processor. The processor is configured to receive, via the communication interfaces, a plurality of numbers, and calculate a sum of the numbers that is independent of an order in which the numbers are received, by (i) converting any of the numbers that are received in a floating-point representation to a derived floating-point representation that includes a plurality of signed integer multiplicands corresponding to different respective orders of magnitude, and (ii) summing the numbers in the derived floating-point representation, by separately summing integer multiplicands that correspond to the same order of magnitude. Other embodiments are also described.
    Type: Grant
    Filed: February 11, 2016
    Date of Patent: July 24, 2018
    Assignee: Mellanox Technologies, Ltd.
    Inventor: Hillel Chapman
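
The order-independent summation described in the abstract above can be illustrated with a Python sketch. It simplifies the scheme: each finite input is placed in a single bin keyed by its order of magnitude, and Python's unbounded integers stand in for fixed-width multiplicands. The bin width of 32 bits and all function names are assumptions for this example.

```python
import math
from fractions import Fraction
from itertools import permutations

BIN_WIDTH = 32  # each bin k holds an integer multiple of 2**(k * BIN_WIDTH)


def to_bins(x: float) -> dict:
    """Represent a finite float exactly as {bin_index: signed_integer}."""
    if x == 0.0:
        return {}
    mant, exp = math.frexp(x)              # x == mant * 2**exp, 0.5 <= |mant| < 1
    m_int = int(math.ldexp(mant, 53))      # exact integer mantissa
    k, r = divmod(exp - 53, BIN_WIDTH)     # x == (m_int << r) * 2**(k * BIN_WIDTH)
    return {k: m_int << r}


def order_independent_sum(values) -> float:
    acc = {}
    for x in values:
        for k, c in to_bins(x).items():
            acc[k] = acc.get(k, 0) + c     # integer addition: exact and commutative
    exact = sum((Fraction(c) * Fraction(2) ** (k * BIN_WIDTH)
                 for k, c in acc.items()), Fraction(0))
    return float(exact)                    # a single rounding at the very end


vals = [1e16, 1.0, -1e16, 0.1, -0.1]
print({order_independent_sum(p) for p in permutations(vals)})
```

Since every intermediate step is exact integer arithmetic, any arrival order yields the same bins and therefore the same final value.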
  • Patent number: 9990203
    Abstract: Methods, devices, and systems for capturing an accuracy of an instruction executing on a processor. An instruction may be executed on the processor, and the accuracy of the instruction may be captured using a hardware counter circuit. The accuracy of the instruction may be captured by analyzing bits of at least one value of the instruction to determine a minimum or maximum precision datatype for representing the field, and determining whether to adjust a value of the hardware counter circuit accordingly. The representation may be output to a debugger or logfile for use by a developer, or may be output to a runtime or virtual machine to automatically adjust instruction precision or gating of portions of the processor datapath.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: June 5, 2018
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Leonardo de Paula Rosa Piga, Abhinandan Majumdar, Indrani Paul, Wei Huang, Manish Arora, Joseph L. Greathouse
  • Patent number: 9959091
    Abstract: A method identifies a floating point implementation of a polynomial that is accurately evaluable. The method comprises determining whether the polynomial has an allowable variety defined by a plurality of sub-varieties, and, if so, partitioning the input domain of the polynomial into a plurality of sub-domains about the sub-varieties. A floating point precision is then identified for each input to the polynomial falling within each sub-domain based on the location of the input within the sub-domain (e.g. how far away the input is from the sub-variety associated with the sub-domain). A floating point implementation for the polynomial is generated so that an input to the polynomial is evaluated using floating point components having the precision identified for the input.
    Type: Grant
    Filed: September 8, 2015
    Date of Patent: May 1, 2018
    Assignee: Imagination Technologies Limited
    Inventor: Theo Alan Drane
  • Patent number: 9940101
    Abstract: Embodiments of the present disclosure include a tininess prediction and handler engine for handling numeric underflow while streamlining the data path for handling normal range cases, thereby avoiding flushes, and reducing the complexity of a scheduler with respect to how dependent operations are handled. A preemptive tiny detection logic section can detect a potential tiny result for the function or operation that is being performed, and can produce a pessimistic tiny indicator. The tininess prediction and handler engine can further include a subnormal post-processing pipe, which can denormalize and round one or more subnormal operations while in a post-processing mode. A schedule modification logic section can reschedule in-flight operations. The schedule modification logic section can issue dependent operations optimistically assuming that a producing operation will not produce a tiny result, and so will not incur extra latency associated with fixing the tiny result in the post-processing pipe.
    Type: Grant
    Filed: March 10, 2016
    Date of Patent: April 10, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ashraf Ahmed, Nicholas Todd Humphries, Marc Augustin
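
The idea of a pessimistic tiny indicator in the abstract above can be sketched for the multiplication case in Python. The predictor looks only at operand exponents, so it may flag results that turn out to be normal but never misses a result that is actually tiny. The binary64 threshold and both function names are assumptions for this example; the patent's scheduler and post-processing pipe are not modelled, and inputs are assumed finite.

```python
import math
from fractions import Fraction

MIN_NORMAL_EXP = -1022  # smallest normal binary64 magnitude is 2**-1022


def may_be_tiny_product(a: float, b: float) -> bool:
    """Pessimistic early check using exponents only."""
    if a == 0.0 or b == 0.0:
        return False                       # an exact zero is not a tiny result
    _, ea = math.frexp(a)                  # |a| >= 2**(ea - 1)
    _, eb = math.frexp(b)                  # |b| >= 2**(eb - 1)
    # |a * b| >= 2**(ea + eb - 2); if even this lower bound is normal,
    # the product cannot be tiny.
    return ea + eb - 2 < MIN_NORMAL_EXP


def is_tiny_product(a: float, b: float) -> bool:
    """Exact reference check, used here only to validate the predictor."""
    exact = abs(Fraction(a) * Fraction(b))
    return 0 < exact < Fraction(2) ** MIN_NORMAL_EXP


print(may_be_tiny_product(1e-200, 1e-200), is_tiny_product(1e-200, 1e-200))
print(may_be_tiny_product(math.ldexp(0.9, -1021), 0.9),
      is_tiny_product(math.ldexp(0.9, -1021), 0.9))
```

The second call shows the pessimism: the indicator fires although the exact product is still in the normal range.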
  • Patent number: 9904513
    Abstract: Floating point compound equations that involve addition of at least three terms, where each term involves a multiplication, can be implemented by using a bypass to prevent small, remaining values from being lost when shifted.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: February 27, 2018
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Jorge F. Garcia Pabon, Ashutosh Garg
  • Patent number: 9880840
    Abstract: Detection of whether a result of a floating point operation is safe. Characteristics of the result are examined to determine whether the result is safe or potentially unsafe, as defined by the user. An instruction is provided to facilitate detection of safe or potentially unsafe results.
    Type: Grant
    Filed: February 28, 2014
    Date of Patent: January 30, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael F Cowlishaw, Shawn D Lundvall, Ronald M Smith, Sr., Phil C Yeh
  • Patent number: 9778906
    Abstract: An apparatus comprises processing circuitry to perform a conversion operation to convert a floating-point value to a vector comprising a plurality of data elements representing respective bit significance portions of a binary value corresponding to the floating-point value.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: October 3, 2017
    Assignee: ARM Limited
    Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
  • Patent number: 9665347
    Abstract: An apparatus comprises processing circuitry to perform a conversion operation to convert a vector comprising a plurality of data elements representing respective bit significance portions of a binary value to a scalar value comprising an alternative representation of said binary value.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: May 30, 2017
    Assignee: ARM Limited
    Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
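
The two conversion abstracts above (9778906 and 9665347) describe moving between a floating-point value and a vector whose elements hold different bit-significance portions of one binary value. A Python sketch of both directions follows. The 16-bit lanes, the four-lane vector, the `anchor` parameter giving the weight of the lowest bit, and the function names are all assumptions for this example.

```python
from fractions import Fraction


def float_to_lanes(x: float, anchor: int, lane_bits: int = 16, num_lanes: int = 4):
    """Split x into lanes of a two's complement number whose lowest bit
    has weight 2**anchor. Lane 0 is the least significant portion."""
    total_bits = lane_bits * num_lanes
    scaled = Fraction(x) / Fraction(2) ** anchor
    n = scaled.numerator // scaled.denominator     # floor: drops bits below anchor
    if not -(1 << (total_bits - 1)) <= n < (1 << (total_bits - 1)):
        raise OverflowError("value outside the representable range")
    n &= (1 << total_bits) - 1                     # two's complement encoding
    mask = (1 << lane_bits) - 1
    return [(n >> (i * lane_bits)) & mask for i in range(num_lanes)]


def lanes_to_float(lanes, anchor: int, lane_bits: int = 16) -> float:
    """Recombine the lanes and return the value as a float."""
    total_bits = lane_bits * len(lanes)
    n = sum(lane << (i * lane_bits) for i, lane in enumerate(lanes))
    if n >> (total_bits - 1):                      # sign bit set
        n -= 1 << total_bits
    return float(Fraction(n) * Fraction(2) ** anchor)


lanes = float_to_lanes(-2.75, anchor=-8)
print([hex(v) for v in lanes], lanes_to_float(lanes, anchor=-8))
```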
  • Patent number: 9569175
    Abstract: An FMA unit, for carrying out an arithmetic operation in a model computation unit of a control unit, is configured to process input of two factors and one summand in the form of floating point values, and provide a computation result of such processing as an output variable in the form of a floating point value. The FMA unit is designed to carry out a multiplication and a subsequent addition, the bit resolutions of the inputs for the factors being lower than the bit resolution of the input for the summand and the bit resolution of the output variable.
    Type: Grant
    Filed: May 21, 2014
    Date of Patent: February 14, 2017
    Assignee: ROBERT BOSCH GMBH
    Inventors: Wolfgang Fischer, Andre Guntoro
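
The mixed-resolution FMA in the abstract above can be illustrated in Python under an assumed pairing of formats: binary32 factors with a binary64 summand and result. The abstract does not name specific widths, so these are assumptions, as are the function names. With this pairing the product of the two factors is exact in binary64 (24 + 24 significant bits fit in 53), so the final addition is the only rounding step.

```python
import struct


def to_float32(x: float) -> float:
    """Round x to the nearest binary32 value, returned as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mixed_fma(a: float, b: float, c: float) -> float:
    """Compute a * b + c with binary32 factors and a binary64 summand/result."""
    a32, b32 = to_float32(a), to_float32(b)
    product = a32 * b32        # exact: at most 48 significant bits
    return product + c         # the single rounding of the fused operation


x = 1.0 + 2.0 ** -23           # exactly representable in binary32
print(mixed_fma(x, x, -1.0))   # keeps the low-order 2**-46 term of the product
```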
  • Patent number: 9489198
    Abstract: A method and apparatus for including in a processor instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location.
    Type: Grant
    Filed: February 4, 2016
    Date of Patent: November 8, 2016
    Assignee: Intel Corporation
    Inventors: Rajiv Kapoor, Ronen Zohar, Mark Buxton, Zeev Sperber, Koby Gottlieb
  • Patent number: 9471305
    Abstract: A method for graphics processing includes generating one or more transcendental instructions in a graphics processing unit (GPU). Micro-code is formed for processing the one or more transcendental instructions in the GPU. The micro-code is processed using an iterative process including cubic interpolation and an evaluation of a cubic polynomial.
    Type: Grant
    Filed: August 12, 2014
    Date of Patent: October 18, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Mitchell Alsup
  • Patent number: 9455831
    Abstract: An order-preserving encryption (OPE) method receives a plaintext (clear text) and generates a ciphertext (encrypted text) using software arbitrary-precision floating point libraries during initial recursive computation rounds. In response to the ciphertext space reducing to a breakpoint, the OPE method continues computations using a hardware floating point processor to accelerate the computation. In this manner, the OPE method enables efficient order-preserving encryption to support range queries on encrypted data.
    Type: Grant
    Filed: September 18, 2014
    Date of Patent: September 27, 2016
    Assignee: Skyhigh Networks, Inc.
    Inventor: Paul Grubbs
  • Patent number: 9448806
    Abstract: A floating-point unit and a method of identifying exception cases in a floating-point unit. In one embodiment, the floating-point unit includes: (1) a floating-point computation circuit having a normal path and an exception path and operable to execute an operation on an operand and (2) a decision circuit associated with the normal path and the exception path and configured to employ a flush-to-zero mode of the floating-point unit to determine which one of the normal path and the exception path is appropriate for carrying out the operation on the operand.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: September 20, 2016
    Assignee: Nvidia Corporation
    Inventors: Marcin Andrychowicz, Alex Fit-Florea
  • Patent number: 9444735
    Abstract: Techniques are presented herein to distribute the processing of communication to network-connected devices to routing nodes, as opposed to centralizing those operations in one device as in the traditional/classical system. Using a bitmapped Type field, advertisements and queries can be categorized. Also, by using a Subgroup field, the scope of advertisements and queries can be dynamically limited. These techniques reduce the number of matches and make the matches more relevant to the user who sent the query. Routing nodes can be any network element that routes traffic, physical or virtual (cloud-based router or switch). The intelligence to perform these techniques can be embodied as an overlay on top of a physical network.
    Type: Grant
    Filed: February 27, 2014
    Date of Patent: September 13, 2016
    Assignee: Cisco Technology, Inc.
    Inventors: Michael L. Sullenberger, Andre Karamanian
  • Patent number: 9335996
    Abstract: A mechanism for recycling error bits in a floating point unit is disclosed. A system of the disclosure includes a memory and a processing device communicably coupled to the memory. In one embodiment, the processing device comprises a floating point unit (FPU) to generate a result value from applying an operation on floating point number inputs to the FPU and generate an error value using the result value. The FPU also writes the result value to a first register of the processing device dedicated to storing results from the operation of the FPU and writes the error value to a second register of the processing device dedicated to storing errors from the operation of the FPU.
    Type: Grant
    Filed: November 14, 2012
    Date of Patent: May 10, 2016
    Assignee: Intel Corporation
    Inventors: Helia Naeimi, Ralph Nathan, Daniel Sorin, Shih-Lien L. Lu
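
The abstract above describes producing a result value together with an error value derived from it. A well-known software counterpart for addition is Knuth's TwoSum error-free transformation, sketched below in Python. It is offered as an analogy only; the patent's hardware mechanism and register layout may differ, and the function name is chosen for this example.

```python
def two_sum(a: float, b: float):
    """Return (result, error) with result = fl(a + b) and, barring overflow,
    result + error equal to a + b exactly."""
    result = a + b
    b_virtual = result - a                  # the part of b that made it into result
    error = (a - (result - b_virtual)) + (b - b_virtual)
    return result, error


s, e = two_sum(2.0 ** 53, 1.0)
print(s, e)   # the 1.0 lost by rounding is recovered in the error term
```

Keeping the error term available lets later computation compensate for rounding, which is the kind of reuse the abstract's title points to.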
  • Patent number: 9298457
    Abstract: An execution unit configured for compression and decompression of numerical data utilizing single instruction, multiple data (SIMD) instructions is described. The numerical data includes integer and floating-point samples. Compression supports three encoding modes: lossless, fixed-rate, and fixed-quality. SIMD instructions for compression operations may include attenuation, derivative calculations, bit packing to form compressed packets, header generation for the packets, and packed array output operations. SIMD instructions for decompression may include packed array input operations, header recovery, decoder control, bit unpacking, integration, and amplification. Compression and decompression may be implemented in a microprocessor, digital signal processor, field-programmable gate array, application-specific integrated circuit, system-on-chip, or graphics processor, using SIMD instructions. Compression and decompression of numerical data can reduce memory, networking, and storage bottlenecks.
    Type: Grant
    Filed: January 22, 2013
    Date of Patent: March 29, 2016
    Assignee: Altera Corporation
    Inventor: Albert W. Wegener
  • Patent number: 9141586
    Abstract: A mechanism for performing single-path floating-point rounding in a floating point unit is disclosed. A system of the disclosure includes a memory and a processing device communicably coupled to the memory. In one embodiment, the processing device comprises a floating point unit (FPU) to generate a plurality of status flags for a rounded value of a finite nonzero number. The plurality of status flags are generated based on the finite nonzero number without calculating the rounded value of the finite nonzero number. The plurality of status flags comprises an overflow flag and an underflow flag. The FPU determines whether a rounded value should be calculated for the finite nonzero number based on the plurality of status flags and whether the overflow flag is asserted.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: September 22, 2015
    Assignee: Intel Corporation
    Inventors: Warren E. Ferguson, Brian J. Hickmann, Thomas D. Fletcher
  • Patent number: 9135376
    Abstract: Various embodiments provide for the determination of a test set that satisfies a coverage model, where portions of the search space need not be searched in order to generate the test set. With various embodiments, a search space defined by a set of inputs for an electronic design and a coverage model is identified. The search space is then fractured into subspaces. Subsequently, the subspaces are solved to determine if they include at least one input sequence that satisfies the coverage constraints defined in the coverage model. The subspaces found to include at least one input sequence that satisfies these coverage constraints, are then searched for unique input sequences in order to generate a test set. Subspaces found not to include at least one input sequence that satisfies the coverage constraints may be excluded from the overall search space.
    Type: Grant
    Filed: May 1, 2013
    Date of Patent: September 15, 2015
    Assignee: Mentor Graphics Corporation
    Inventors: Sudhir D. Kadkade, Clifton A. Lyons, Jr., Kunal P. Ganeshpure
  • Patent number: 9128758
    Abstract: According to one aspect of the present disclosure, a method and technique for encoding densely packed decimals is disclosed. The method includes: executing a floating point instruction configured to perform a floating point operation on decimal data in a binary coded decimal (BCD) format; determining whether a result of the operation includes a rounded mantissa overflow; and responsive to determining that the result of the operation includes a rounded mantissa overflow, compressing a result of the operation from the BCD-formatted decimal data to decimal data in a densely packed decimal (DPD) format by shifting select bit values of the BCD formatted decimal data by one digit to select bit positions in the DPD format.
    Type: Grant
    Filed: November 15, 2011
    Date of Patent: September 8, 2015
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Kroener, Christophe J. Layer, Petra Leber, Silvia M. Mueller
  • Patent number: 9092226
    Abstract: Methods and apparatus are provided for handling floating point exceptions in a processor that executes single-instruction multiple-data (SIMD) instructions. In one example a numerical exception is identified for a SIMD floating point operation and SIMD micro-operations are initiated to generate two packed partial results of a packed result for the SIMD floating point operation. A SIMD denormalization micro-operation is initiated to combine the two packed partial results and to denormalize one or more elements of the combined packed partial results to generate a packed result for the SIMD floating point operation having one or more denormal elements. Flags are set and stored with packed partial results to identify denormal elements. In one example a SIMD normalization micro-operation is initiated to generate a normalized pseudo internal floating point representation prior to the SIMD floating point operation when it uses multiplication.
    Type: Grant
    Filed: December 14, 2011
    Date of Patent: July 28, 2015
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Shachar Finkelstein, Gregory Pribush, Amit Gradstein, Guy Bale, Thierry Pons
  • Patent number: 9076215
    Abstract: An image processor sets a first predetermined number of first blocks at first intervals in a second image, calculates a first evaluated value, selects one of the first blocks, and calculates a first parallax between the selected first block and the matching target block. An image processor sets a second predetermined number of second blocks at second intervals in a second image, calculates a second evaluated value, selects one of the second blocks, and calculates a second parallax between the selected second block and the matching target block. A controller determines, based on the first evaluated value and the second evaluated value and based on the first parallax and the second parallax, whether or not to employ one of the first parallax and the second parallax.
    Type: Grant
    Filed: November 27, 2012
    Date of Patent: July 7, 2015
    Assignee: Panasonic Intellectual Property Management Co., Ltd.
    Inventor: Motonori Ogura
  • Publication number: 20150149521
    Abstract: A hardware circuit for returning single precision denormal results to double precision. A hardware circuit component configured to count leading zeros of an unrounded single precision denormal result. A hardware circuit component configured to pre-compute a first exponent and a second exponent for the unrounded single precision denormal result. A hardware circuit component configured to perform a second normalization of the rounded single precision denormal result back to architected format.
    Type: Application
    Filed: November 26, 2013
    Publication date: May 28, 2015
    Applicant: International Business Machines Corporation
    Inventors: Maarten J. Boersma, Thomas Fuchs, Markus Kaltenbach, David Lang
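    The leading-zero count and exponent steps named in this abstract can be illustrated in software. The sketch below is an assumption-laden illustration, not the claimed circuit: it takes the bit pattern of a nonzero single precision denormal, counts the leading zeros of its fraction, derives the exponent from that count, and emits the equivalent normalized double precision bit pattern. The function name `f32_subnormal_to_f64_bits` is hypothetical.

```python
import struct


def f32_subnormal_to_f64_bits(bits32):
    """Re-encode a nonzero binary32 subnormal as a normalized binary64 pattern."""
    sign = bits32 >> 31
    exp_field = (bits32 >> 23) & 0xFF
    frac = bits32 & 0x7FFFFF
    if exp_field != 0 or frac == 0:
        raise ValueError("expects a nonzero single precision subnormal")

    # Count leading zeros within the 23-bit fraction field.
    lz = 23 - frac.bit_length()

    # Shift the leading one out of the field; it becomes the implicit bit.
    shift = lz + 1
    mant = (frac << shift) & 0x7FFFFF

    # A subnormal is frac * 2**-149, so the normalized exponent is -126 - shift.
    exponent = -126 - shift

    # Widen the 23-bit fraction to 52 bits and apply the binary64 bias (1023).
    return (sign << 63) | ((exponent + 1023) << 52) | (mant << 29)


def bits_to_double(bits64):
    """Helper for checking: reinterpret a 64-bit pattern as a Python float."""
    return struct.unpack("<d", struct.pack("<Q", bits64))[0]
```

    Because every single precision denormal is a normal number in double precision, the conversion is exact and needs no rounding; the rounding step mentioned in the abstract belongs to producing the single precision result itself and is not modeled here.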
  • Publication number: 20150149522
    Abstract: A hardware circuit for returning single precision denormal results to double precision. A hardware circuit component configured to count leading zeros of an unrounded single precision denormal result. A hardware circuit component configured to pre-compute a first exponent and a second exponent for the unrounded single precision denormal result. A hardware circuit component configured to perform a second normalization of the rounded single precision denormal result back to architected format.
    Type: Application
    Filed: January 9, 2014
    Publication date: May 28, 2015
    Applicant: International Business Machines Corporation
    Inventors: Maarten J. Boersma, Thomas Fuchs, Markus Kaltenbach, David Lang
  • Publication number: 20150095393
    Abstract: A floating-point value can represent a number or something that is not a number (NaN). A floating-point value that is a NaN includes a portion that stores information about the source operands of the instruction.
    Type: Application
    Filed: September 30, 2013
    Publication date: April 2, 2015
    Applicant: FREESCALE SEMICONDUCTOR, INC.
    Inventor: William C. Moyer
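    As a rough illustration of a NaN that carries extra information, the sketch below stores an integer in the unused fraction bits of an IEEE 754 double precision quiet NaN. The tag layout in `pack_operand_tags` (two 8-bit operand identifiers) is purely hypothetical; the abstract does not say what operand information is stored or where.

```python
import math
import struct

QNAN_BITS = 0x7FF8_0000_0000_0000   # exponent all ones, quiet bit set
PAYLOAD_MASK = (1 << 51) - 1        # fraction bits below the quiet bit


def pack_operand_tags(a_tag, b_tag):
    """Hypothetical layout: two 8-bit source-operand identifiers."""
    return ((a_tag & 0xFF) << 8) | (b_tag & 0xFF)


def make_nan(payload):
    """Build a quiet NaN whose fraction field holds `payload`."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 51 bits")
    return struct.unpack("<d", struct.pack("<Q", QNAN_BITS | payload))[0]


def nan_payload(value):
    """Read the payload back out of a NaN's bit pattern."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & PAYLOAD_MASK
```

    Whether a payload survives arithmetic is platform dependent, so this sketch only reads back a NaN that has not been passed through any floating point operation.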
  • Patent number: 8984041
    Abstract: Mechanisms are provided for performing a floating point arithmetic operation in a data processing system. A plurality of floating point operands of the floating point arithmetic operation are received and bits in a mantissa of at least one floating point operand of the plurality of floating point operands are shifted. One or more bits of the mantissa that are shifted outside a range of bits of the mantissa of at least one floating point operand are stored and a vector value is generated based on the stored one or more bits of the mantissa that are shifted outside of the range of bits of the mantissa of the at least one floating point operand. A resultant value is generated for the floating point arithmetic operation based on the vector value and the plurality of floating point operands.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: John B. Carter, Bruce G. Mealey, Karthick Rajamani, Eric E. Retter, Jeffrey A. Stuecheli
  • Publication number: 20150074162
    Abstract: Mechanisms are provided for performing a floating point arithmetic operation in a data processing system. A plurality of floating point operands of the floating point arithmetic operation are received and bits in a mantissa of at least one floating point operand of the plurality of floating point operands are shifted. One or more bits of the mantissa that are shifted outside a range of bits of the mantissa of at least one floating point operand are stored and a vector value is generated based on the stored one or more bits of the mantissa that are shifted outside of the range of bits of the mantissa of the at least one floating point operand. A resultant value is generated for the floating point arithmetic operation based on the vector value and the plurality of floating point operands.
    Type: Application
    Filed: October 28, 2014
    Publication date: March 12, 2015
    Inventors: John B. Carter, Bruce G. Mealey, Karthick Rajamani, Eric E. Retter, Jeffrey A. Stuecheli
  • Patent number: 8977669
    Abstract: To add floating point numbers in a parallel computing system, a collective logic device receives the floating point numbers from computing nodes. The collective logic device converts the floating point numbers to integer numbers. The collective logic device adds the integer numbers and generates a summation of the integer numbers. The collective logic device converts the summation to a floating point number. The collective logic device performs the receiving, the converting of the floating point numbers, the adding, the generating, and the converting of the summation in one pass. One pass indicates that the computing nodes send inputs only once to the collective logic device and receive outputs only once from the collective logic device.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: March 10, 2015
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger, Burkhard Steinmacher-Burow
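    The convert, add, convert-back flow in this abstract can be sketched in a few lines. The version below is a software illustration under stated assumptions, not the patented device: it scales every finite double to an exact integer multiple of 2**-1074 (the smallest positive double), sums the integers, and rounds once at the end. The names `to_integer` and `one_pass_sum` are hypothetical.

```python
SCALE_BITS = 1074  # every finite double is an integer multiple of 2**-1074


def to_integer(x):
    """Exactly convert a finite double to an integer count of 2**-1074 units."""
    numerator, denominator = x.as_integer_ratio()  # denominator is a power of two
    return numerator * ((1 << SCALE_BITS) // denominator)


def one_pass_sum(values):
    """Sum doubles via exact integer addition, rounding only once at the end."""
    total = sum(to_integer(v) for v in values)
    # Python's int / int true division rounds correctly to the nearest double.
    return total / (1 << SCALE_BITS)
```

    Integer addition is associative, so the result does not depend on the order in which contributions arrive, which is the property that makes a single pass sufficient. For example, `one_pass_sum([1e16, 1.0, -1e16])` keeps the `1.0` that a left-to-right floating point sum would lose. Infinities and NaNs are not handled in this sketch.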
  • Patent number: 8965944
    Abstract: Methods, apparatus and systems are disclosed for the generation of range-constrained test cases for verification of designs of arithmetic floating point units. Given three ranges of floating point numbers Rx, Ry, Rz, a floating point operation (op), and a rounding-mode (round), three floating point numbers x, y, z are generated such that x∈Rx, y∈Ry, z∈Rz, and z=round(x op y). Solutions are provided for add and subtract operations. Range constraints are imposed on the input operands and on the result operand of floating point add and subtract instructions to target corner cases when generating test cases for use in verification of floating point hardware.
    Type: Grant
    Filed: April 17, 2012
    Date of Patent: February 24, 2015
    Assignee: International Business Machines Corporation
    Inventor: Abraham Ziv
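    To make the constraint in this abstract concrete, the sketch below finds x in Rx and y in Ry whose rounded sum falls in Rz by brute-force rejection sampling. This is a deliberately naive stand-in, not the solution the patent describes, and it only covers addition under round-to-nearest-even (the rounding mode Python floats use). The function name and the `(low, high)` range representation are assumptions.

```python
import random


def generate_add_case(rx, ry, rz, rng, max_tries=100_000):
    """Return (x, y, z) with x in rx, y in ry, z = x + y in rz, or None.

    Each range is a (low, high) pair of floats, inclusive at both ends.
    """
    for _ in range(max_tries):
        x = rng.uniform(*rx)
        y = rng.uniform(*ry)
        z = x + y  # rounded to nearest-even by the hardware
        if rz[0] <= z <= rz[1]:
            return x, y, z
    return None  # no case found within the retry budget


case = generate_add_case((1.0, 2.0), (-2.0, -1.0), (0.0, 0.25), random.Random(0))
```

    Rejection sampling becomes impractical when Rz is very narrow, for instance a single denormal value, which is the kind of corner case a direct construction is meant to reach.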
  • Patent number: 8922565
    Abstract: A system, method and apparatus are disclosed, in which a processing unit is configured to perform secondary processing on graphics pipeline data outside the graphics pipeline, with the output from the secondary processing being integrated into the graphics pipeline so that it is made available to the graphics pipeline. A determination is made whether to use secondary processing, and in a case that secondary processing is to be used, a command stream, which can comprise one or more commands, is provided to the secondary processing unit, so that the unit can locate and operate on buffered graphics pipeline data. Secondary processing is managed and monitored so as to synchronize data access by the secondary processing unit with the graphics pipeline processing modules.
    Type: Grant
    Filed: November 30, 2007
    Date of Patent: December 30, 2014
    Assignee: QUALCOMM Incorporated
    Inventor: Michael D. Street
  • Publication number: 20140379772
    Abstract: A processor includes: an exponent generating unit that generates an exponent part of a coefficient represented by a floating point number format based on a first part of received input data, the coefficient being obtained when an exponential function is decomposed into a series operation and the coefficient for the series operation; a storage unit that stores a mantissa part of the coefficient; a constant generating unit that reads constant data corresponding to a second part of the input data from the storage unit; and a selecting unit that selects and outputs the constant data from the constant generating unit when an instruction to be executed is a coefficient calculation instruction for calculation of the coefficient of the exponential function.
    Type: Application
    Filed: September 8, 2014
    Publication date: December 25, 2014
    Inventor: Mikio Hondo
  • Patent number: 8918445
    Abstract: An integrated multiplier circuit that operates on a variety of data formats including integer fixed point, signed or unsigned, real or complex, 8 bit, 16 bit or 32 bit as well as floating point data that may be single precision real, single precision complex or double precision. The circuit uses a single set of multiplier arrays to perform 16×16, 32×32 and 64×64 multiplies, 32×32 and 64×64 complex multiplies, and 32×32 and 64×64 complex multiplies with one operand conjugated.
    Type: Grant
    Filed: September 21, 2011
    Date of Patent: December 23, 2014
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Mujibur Rahman
  • Publication number: 20140372493
    Abstract: A system and method for accelerating evaluation of functions. In one embodiment, a method includes receiving, by a processor, a value to be processed, and notification of a function to be applied to the value. The value is represented in a floating point format. The value is converted, by the processor, to a fixed point format. Which of Newton-Raphson and polynomial approximation is to be used to apply the function to the value in the fixed point format is determined by the processor. The function is applied to the value in the fixed point format to generate a result in the fixed point format. The result is converted to the floating point format by the processor.
    Type: Application
    Filed: June 14, 2013
    Publication date: December 18, 2014
    Inventors: Brent Everett Peterson, Nitya Ramdas, Sotirios Christodulos Tsongas, Jonathan Zack Albus, Johann Zipperer
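    The float to fixed point, evaluate, fixed point to float sequence in this abstract can be sketched for one function. The example below computes a reciprocal with Newton-Raphson iteration in fixed point arithmetic; it does not model the selection between Newton-Raphson and polynomial approximation. The 30-bit fraction width, the linear initial estimate, and all names are assumptions chosen for illustration.

```python
import math

FRAC_BITS = 30          # assumed fixed point format: 30 fractional bits
ONE = 1 << FRAC_BITS


def to_fixed(value):
    """Convert a float to the assumed fixed point format."""
    return int(round(value * ONE))


def fx_mul(a, b):
    """Multiply two fixed point numbers, truncating the extra fraction bits."""
    return (a * b) >> FRAC_BITS


def reciprocal(x, iterations=4):
    """Approximate 1/x for finite x > 0 using fixed point Newton-Raphson."""
    if not (x > 0.0) or math.isinf(x):
        raise ValueError("x must be a finite positive number")

    # Reduce to a mantissa in [0.5, 1) so the fixed point range is bounded.
    m, e = math.frexp(x)            # x == m * 2**e
    m_fx = to_fixed(m)

    # Linear initial estimate of 1/m on [0.5, 1).
    y = to_fixed(48.0 / 17.0) - fx_mul(to_fixed(32.0 / 17.0), m_fx)

    # Newton-Raphson step for f(y) = 1/y - m:  y <- y * (2 - m * y)
    two = 2 * ONE
    for _ in range(iterations):
        y = fx_mul(y, two - fx_mul(m_fx, y))

    # Convert back to floating point and undo the exponent reduction.
    return math.ldexp(y / ONE, -e)
```

    Accuracy here is limited by the 30 fractional bits rather than by the iteration count, since each Newton-Raphson step roughly doubles the number of correct bits.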
  • Patent number: 8914801
    Abstract: A set of instructions for implementation in a floating-point unit or other computer processor hardware is disclosed herein. In one embodiment, an extended-range fused multiply-add operation, a first look-up operation, and a second look-up operation are each embodied in hardware instructions configured to be operably executed in a processor. These operations are accompanied by a table which provides a set of defined values in response to various function types, supporting the computation of elementary functions such as reciprocal, square, cube, fourth roots and their reciprocals, exponential, and logarithmic functions. By allowing each of these functions to be computed with a hardware instruction, branching and predicated execution may be reduced or eliminated, while also permitting the use of distributed instructions across a number of execution units.
    Type: Grant
    Filed: May 27, 2010
    Date of Patent: December 16, 2014
    Assignee: International Business Machines Corporation
    Inventors: Christopher K. Anand, Robert F. Enenkel, Anuroop Sharma, Daniel M. Zabawa
  • Patent number: 8909690
    Abstract: Mechanisms are provided for performing a floating point arithmetic operation in a data processing system. A plurality of floating point operands of the floating point arithmetic operation are received and bits in a mantissa of at least one floating point operand of the plurality of floating point operands are shifted. One or more bits of the mantissa that are shifted outside a range of bits of the mantissa of at least one floating point operand are stored and a vector value is generated based on the stored one or more bits of the mantissa that are shifted outside of the range of bits of the mantissa of the at least one floating point operand. A resultant value is generated for the floating point arithmetic operation based on the vector value and the plurality of floating point operands.
    Type: Grant
    Filed: December 13, 2011
    Date of Patent: December 9, 2014
    Assignee: International Business Machines Corporation
    Inventors: John B. Carter, Bruce G. Mealey, Karthick Rajamani, Eric E. Retter, Jeffrey A. Stuecheli
  • Publication number: 20140351308
    Abstract: A system and method are provided for dynamically reducing power consumption of floating-point logic. A disable control signal that is based on a characteristic of a floating-point format input operand is received and a portion of a logic circuit is disabled based on the disable control signal. The logic circuit processes the floating-point format input operand to generate an output.
    Type: Application
    Filed: May 23, 2013
    Publication date: November 27, 2014
    Applicant: NVIDIA Corporation
    Inventors: David C. Tannenbaum, Srinivasan Iyer