Floating Point Patents (Class 708/495)
-
Patent number: 12217020
Abstract: In an aspect, a processor includes circuitry for iterative refinement approaches, e.g., Newton-Raphson, to evaluating functions, such as square root, reciprocal, and for division. The circuitry includes circuitry for producing an initial approximation; which can include a LookUp Table (LUT). LUT may produce an output that (with implementation-dependent processing) forms an initial approximation of a value, with a number of bits of precision. A limited-precision multiplier multiplies that initial approximation with another value; an output of the limited precision multiplier goes to a full precision multiplier circuit that performs remaining multiplications required for iteration(s) in the particular refinement process being implemented. For example, in division, the output being calculated is for a reciprocal of the divisor.
Type: Grant
Filed: December 26, 2023
Date of Patent: February 4, 2025
Assignee: Imagination Technologies Limited
Inventor: Leonard Rarick
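The refinement idea in this abstract can be illustrated in software. The following is a minimal sketch of Newton-Raphson reciprocal refinement seeded from a lookup table; it is not the patented circuit. The table size, the names `RECIP_LUT`, `reciprocal` and `divide`, and the iteration count are assumptions chosen for illustration, and the split between a limited-precision and a full-precision multiplier is not modeled.

```python
import math

LUT_BITS = 6  # hypothetical table size: 2**6 entries

# Each entry approximates 1/m at the midpoint of one slice of [1, 2).
RECIP_LUT = [1.0 / (1.0 + (i + 0.5) / 2**LUT_BITS) for i in range(2**LUT_BITS)]

def reciprocal(d, iterations=3):
    """Approximate 1/d for finite, nonzero d by Newton-Raphson refinement."""
    m, e = math.frexp(abs(d))            # abs(d) = m * 2**e, with m in [0.5, 1)
    m, e = m * 2.0, e - 1                # renormalize so that m is in [1, 2)
    index = int((m - 1.0) * 2**LUT_BITS) # top fraction bits select the entry
    x = RECIP_LUT[index]                 # initial approximation of 1/m
    for _ in range(iterations):
        x = x * (2.0 - m * x)            # each step roughly doubles the correct bits
    return math.copysign(math.ldexp(x, -e), d)

def divide(n, d):
    """Division expressed as multiplication by the refined reciprocal."""
    return n * reciprocal(d)
```

With a table accurate to about 7 bits, three iterations are enough to reach the limits of double precision in this sketch; a hardware design would trade table size against iteration count.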
-
Patent number: 12164881
Abstract: An apparatus comprises floating-point processing circuitry to perform a floating-point operation with rounding to generate a floating-point result value; and tininess detection circuitry to detect a tininess status indicating whether an outcome of the floating-point operation is tiny. A tiny outcome corresponds to a non-zero number with a magnitude smaller than a minimum non-zero magnitude representable as a normal floating-point number in a floating-point format to be used for the floating-point result value. The tininess detection circuitry comprises hardware circuit logic configured to support both before rounding tininess detection and after rounding tininess detection for detecting the tininess status.
Type: Grant
Filed: July 23, 2021
Date of Patent: December 10, 2024
Assignee: Arm Limited
Inventors: David Raymond Lutz, David M. Russinoff, Harsha Valsaraju
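The difference between the two detection modes can be shown with exact rational arithmetic. This is a sketch under stated assumptions, not the patented hardware logic: it takes the exact (unrounded) result as a `Fraction`, targets binary32, and uses round-to-nearest-even. The function names are hypothetical.

```python
from fractions import Fraction

P = 24        # significand precision of binary32, in bits
EMIN = -126   # exponent of the smallest normal binary32 number
MIN_NORMAL = Fraction(2) ** EMIN

def floor_log2(x):
    """Largest integer e with 2**e <= x, for a positive Fraction x."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    return e if Fraction(2) ** e <= x else e - 1

def round_unbounded(x):
    """Round positive x to P significant bits (ties to even), ignoring exponent limits."""
    ulp = Fraction(2) ** (floor_log2(x) - P + 1)
    return round(x / ulp) * ulp   # round() on a Fraction rounds ties to even

def tiny_before_rounding(exact):
    """Tiny if the exact result is nonzero and below the smallest normal magnitude."""
    return 0 < abs(exact) < MIN_NORMAL

def tiny_after_rounding(exact):
    """Tiny if the result, rounded as though the exponent were unbounded, is still below it."""
    return exact != 0 and round_unbounded(abs(exact)) < MIN_NORMAL
```

The two modes disagree only in a narrow band just below the smallest normal number: an exact value such as 2^-126 - 2^-151 is tiny before rounding, but rounds up to 2^-126 and so is not tiny after rounding.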
-
Patent number: 12045581
Abstract: The present disclosure relates generally to techniques for adjusting the number representation (e.g., format) of a variable before and/or after performing one or more arithmetic operations on the variable. In particular, the present disclosure relates to scaling the range of a variable to a suitable representation based on available hardware (e.g., hard logic) in an integrated circuit device. For example, an input in a first number format (e.g., bfloat16) may be scaled to a second number format (e.g., half-precision floating-point) so that circuitry implemented to receive inputs in the second number format may perform one or more arithmetic operations on the input. Further, the output produced by the circuitry may be scaled back to the first number format. Accordingly, arithmetic operations, such as a dot-product, performed in a first format may be emulated by scaling the inputs to and/or the outputs from arithmetic operations performed in another format.
Type: Grant
Filed: April 1, 2022
Date of Patent: July 23, 2024
Assignee: Intel Corporation
Inventors: Bogdan Mihai Pasca, Martin Langhammer
-
Patent number: 11853718
Abstract: In an aspect, a processor includes circuitry for iterative refinement approaches, e.g., Newton-Raphson, to evaluating functions, such as square root, reciprocal, and for division. The circuitry includes circuitry for producing an initial approximation; which can include a LookUp Table (LUT). LUT may produce an output that (with implementation-dependent processing) forms an initial approximation of a value, with a number of bits of precision. A limited-precision multiplier multiplies that initial approximation with another value; an output of the limited precision multiplier goes to a full precision multiplier circuit that performs remaining multiplications required for iteration(s) in the particular refinement process being implemented. For example, in division, the output being calculated is for a reciprocal of the divisor.
Type: Grant
Filed: January 16, 2023
Date of Patent: December 26, 2023
Assignee: Imagination Technologies Limited
Inventor: Leonard Rarick
-
Patent number: 11579844
Abstract: In an aspect, a processor includes circuitry for iterative refinement approaches, e.g., Newton-Raphson, to evaluating functions, such as square root, reciprocal, and for division. The circuitry includes circuitry for producing an initial approximation; which can include a LookUp Table (LUT). LUT may produce an output that (with implementation-dependent processing) forms an initial approximation of a value, with a number of bits of precision. A limited-precision multiplier multiplies that initial approximation with another value; an output of the limited precision multiplier goes to a full precision multiplier circuit that performs remaining multiplications required for iteration(s) in the particular refinement process being implemented. For example, in division, the output being calculated is for a reciprocal of the divisor.
Type: Grant
Filed: March 23, 2021
Date of Patent: February 14, 2023
Assignee: Imagination Technologies Limited
Inventor: Leonard Rarick
-
Patent number: 11507797
Abstract: An information processing apparatus having an input device for receiving data, an operation unit for constituting a convolutional neural network for processing data, a storage area for storing data to be used by the operation unit and an output device for outputting a result of the processing. The convolutional neural network is provided with a first intermediate layer for performing a first processing including a first inner product operation and a second intermediate layer for performing a second processing including a second inner product operation, and is configured so that the bit width of first filter data for the first inner product operation and the bit width of second filter data for the second inner product operation are different from each other.
Type: Grant
Filed: January 26, 2018
Date of Patent: November 22, 2022
Assignee: Hitachi, Ltd.
Inventors: Toru Motoya, Goichi Ono, Hidehiro Toyoda
-
Patent number: 11416457
Abstract: Bias correcting system for small number estimators. A computer system includes a distinct value estimator configured to estimate a number of distinct values in a data set. The computer system includes a bias table for the estimator. The bias table includes entries with values corresponding to biases caused by the distinct value estimator correlated to values corresponding to numbers estimated. The entries in the table are optimized by having a set of entries with an optimized number of biases in the entries. The biases in the entries are associated with predetermined confidence intervals. The system includes a bias corrector configured to correct the number of distinct values in the multiset data estimated by the distinct value estimator set using values from the bias table to produce a corrected value. The system includes a user interface coupled to the bias corrector configured to output the corrected value to a user.
Type: Grant
Filed: January 2, 2018
Date of Patent: August 16, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Arnd Christian Konig, Edgars Sedols, Parag Nandan Paul
-
Patent number: 11321087
Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element is enabled to execute instructions in accordance with an ISA. The ISA is enhanced in accordance with improvements with respect to deep learning acceleration.
Type: Grant
Filed: August 27, 2019
Date of Patent: May 3, 2022
Assignee: Cerebras Systems Inc.
Inventors: Michael Morrison, Michael Edwin James, Sean Lie, Srikanth Arekapudi, Gary R. Lauterbach
-
Patent number: 11301247
Abstract: A method includes receiving an input data at a FP arithmetic operating unit configured to perform a FP arithmetic operation on the input data. The method further includes determining whether the received input data generates a FP hardware exception responsive to the FP arithmetic operation on the input data, wherein the determining occurs prior to performing the FP arithmetic operation. The method also includes converting a value of the received input data to a modified value responsive to the determining that the received input data generates the FP hardware exception, wherein the converting eliminates generation of the FP hardware exception responsive to the FP arithmetic operation on the input data.
Type: Grant
Filed: April 30, 2020
Date of Patent: April 12, 2022
Assignee: Marvell Asia Pte Ltd
Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi
-
Patent number: 11182668
Abstract: Hardware for implementing a Deep Neural Network (DNN) having a convolution layer, the hardware comprising a plurality of convolution engines each configured to perform convolution operations by applying filters to data windows, each filter comprising a set of weights for combination with respective data values of a data window; and one or more weight buffers accessible to each of the plurality of convolution engines over an interconnect, each weight buffer being configured to provide weights of one or more filters to any of the plurality of convolution engines; wherein each of the convolution engines comprises control logic configured to request weights of a filter from the weight buffers using an identifier of that filter.
Type: Grant
Filed: November 6, 2018
Date of Patent: November 23, 2021
Assignee: Imagination Technologies Limited
Inventor: Christopher Martin
-
Patent number: 11120602
Abstract: Methods and devices for lowering precision of computations used in shader programs may include receiving program code for a shader program to use with a graphics processing unit (GPU) that supports half precision storage and arithmetic in shader programs. The methods and devices may include performing at least one pass on the program code to select a set of operations within the program code to lower a precision of a plurality of computations used by the set of operations and evaluating a risk of precision loss for lowering the precision to a half precision for each computation of the plurality of computations. The methods and devices may include generating edited program code by rewriting the computation to the half precision in response to the risk of precision loss being a precision loss threshold.
Type: Grant
Filed: June 3, 2019
Date of Patent: September 14, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ivan Nevraev, Vishal Chandra Sharma
-
Patent number: 11099815
Abstract: In described examples, an apparatus is arranged to generate a linear term, a quadratic term, and a constant term of a transcendental function with, respectively, a first circuit, a second circuit, and a third circuit in response to least significant bits of an input operand and in response to, respectively, a first, a second, and a third table value that is retrieved in response to, respectively, a first, a second, and a third index generated in response to most significant bits of the input operand. The third circuit is further arranged to generate a mantissa of an output operand in response to a sum of the linear term, the quadratic term, and the constant term.
Type: Grant
Filed: July 21, 2020
Date of Patent: August 24, 2021
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Prasanth Viswanathan Pillai, Richard Mark Poley, Venkatesh Natarajan, Alexander Tessarolo
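A table-driven quadratic approximation of this general kind can be sketched as follows. This is an illustration only: it uses a single table of Taylor coefficients indexed by the leading bits of the operand, whereas the abstract describes three tables with separately generated indices. The function `2**x`, the segment count, and all names are assumptions.

```python
import math

INDEX_BITS = 6                   # hypothetical: 64 table segments
SEGMENTS = 2 ** INDEX_BITS
LN2 = math.log(2.0)

# For each segment start x0, store f(x0), f'(x0) and f''(x0)/2 for f(x) = 2**x.
TABLE = []
for i in range(SEGMENTS):
    f0 = 2.0 ** (i / SEGMENTS)
    TABLE.append((f0, f0 * LN2, f0 * LN2 * LN2 / 2.0))

def exp2_fraction(x):
    """Approximate 2**x for x in [0, 1); the result lies in [1, 2), like a mantissa."""
    index = int(x * SEGMENTS)    # most significant bits of the operand pick the entry
    dx = x - index / SEGMENTS    # least significant bits give the offset in the segment
    c0, c1, c2 = TABLE[index]
    constant_term = c0
    linear_term = c1 * dx
    quadratic_term = c2 * dx * dx
    return constant_term + linear_term + quadratic_term
```

With 64 segments the truncation error of the quadratic stays below about 5e-7 in this sketch; more segments or a minimax fit per segment would tighten it.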
-
Patent number: 11062202
Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element has a respective floating-point unit enabled to optionally and/or selectively perform floating-point operations in accordance with a programmable exponent bias and/or various floating-point computation variations. In some circumstances, the programmable exponent bias and/or the floating-point computation variations enable neural network processing with improved accuracy, decreased training time, decreased inference latency, and/or increased energy efficiency.
Type: Grant
Filed: July 17, 2019
Date of Patent: July 13, 2021
Assignee: Cerebras Systems Inc.
Inventors: Michael Edwin James, Sean Lie, Michael Morrison, Srikanth Arekapudi, Gary R. Lauterbach
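The effect of a programmable exponent bias can be seen by decoding the same bit pattern under different biases. The sketch below assumes a 16-bit layout of 1 sign, 5 exponent and 10 fraction bits (as in IEEE 754 binary16, whose standard bias is 15); the function name is hypothetical, and infinities and NaNs are deliberately not modeled.

```python
def decode_half(bits, bias=15):
    """Decode a 16-bit pattern (1 sign, 5 exponent, 10 fraction bits) with a chosen bias.

    Assumption: the all-ones exponent is treated as an ordinary exponent,
    so infinities and NaNs are not represented in this sketch.
    """
    sign = -1.0 if bits & 0x8000 else 1.0
    exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if exponent == 0:
        # Zero or subnormal: no implicit leading one.
        return sign * fraction * 2.0 ** (1 - bias - 10)
    return sign * (1024 + fraction) * 2.0 ** (exponent - bias - 10)
```

Raising the bias by one halves every decoded value, which shifts the representable range toward smaller magnitudes without changing the number of bits.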
-
Patent number: 11023205
Abstract: Negative zero control for execution of an instruction. A process obtains an instruction to perform operation(s) using an input value. The instruction includes a negative zero control indicator indicating whether negative zero control is enabled for execution of the instruction. The process executes the instruction, the executing including performing the operation(s) using the input value to obtain a result having a sign, determining whether to control the sign of the result, the determining being based at least in part on the negative zero control indicator being set to a defined value, and performing further processing, as part the executing the instruction, based on the determining.
Type: Grant
Filed: February 15, 2019
Date of Patent: June 1, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Cedric Lichtenau, Reid Copeland, Petra Leber, Silvia M. Mueller, Jonathan D. Bradbury, Xin Guo
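As a rough software analogy, a per-call flag can decide whether a negative zero result is allowed to escape an operation. The abstract does not state what the "further processing" is; the sketch below assumes it forces the result to positive zero, and the names `execute` and `negative_zero_control` are hypothetical.

```python
import math

def is_negative_zero(x):
    """True only for the IEEE 754 value -0.0."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0

def execute(operation, value, negative_zero_control=False):
    """Apply operation to value; optionally replace a -0.0 result with +0.0.

    Assumption: 'controlling the sign' means forcing positive zero.
    """
    result = operation(value)
    if negative_zero_control and is_negative_zero(result):
        result = 0.0
    return result
```

For example, `execute(lambda v: v * 0.0, -3.0)` yields a negative zero, while the same call with `negative_zero_control=True` yields a positive zero; nonzero results pass through unchanged.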
-
Patent number: 10732979
Abstract: A set of entries in a branch prediction structure for a set of second blocks are accessed based on a first address of a first block. The set of second blocks correspond to outcomes of one or more first branch instructions in the first block. Speculative prediction of outcomes of second branch instructions in the second blocks is initiated based on the entries in the branch prediction structure. State associated with the speculative prediction is selectively flushed based on types of the branch instructions. In some cases, the branch predictor can be accessed using an address of a previous block or a current block. State associated with the speculative prediction is selectively flushed from the ahead branch prediction, and prediction of outcomes of branch instructions in one of the second blocks is selectively initiated using non-ahead accessing, based on the types of the one or more branch instructions.
Type: Grant
Filed: June 18, 2018
Date of Patent: August 4, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Marius Evers, Aparna Thyagarajan, Ashok T. Venkatachar
-
Patent number: 10671389
Abstract: A Vector Floating Point Test Data Class Immediate instruction is provided that determines whether one or more elements of a vector specified in the instruction are of one or more selected classes and signs. If a vector element is of a selected class and sign, an element in an operand of the instruction corresponding to the vector element is set to a first defined value, and if the vector element is not of the selected class and sign, the operand element corresponding to the vector element is set to a second defined value.
Type: Grant
Filed: January 21, 2019
Date of Patent: June 2, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jonathan D. Bradbury, Eric M. Schwarz
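The element-wise class-and-sign test can be sketched in software as a mask comparison. The class and sign encodings below, the all-ones and all-zeros result values, and the function names are assumptions for illustration; they are not the instruction's actual immediate encoding.

```python
import math
import sys

# Hypothetical encodings: these bit values are illustrative only.
ZERO, SUBNORMAL, NORMAL, INFINITY, NAN = 1, 2, 4, 8, 16
POSITIVE, NEGATIVE = 1, 2
ALL_ONES, ALL_ZEROS = 0xFFFFFFFFFFFFFFFF, 0   # assumed first and second defined values

def classify(x):
    """Return (class bit, sign bit) for a Python float."""
    if math.isnan(x):
        cls = NAN
    elif math.isinf(x):
        cls = INFINITY
    elif x == 0.0:
        cls = ZERO
    elif abs(x) < sys.float_info.min:
        cls = SUBNORMAL
    else:
        cls = NORMAL
    sign = NEGATIVE if math.copysign(1.0, x) < 0.0 else POSITIVE
    return cls, sign

def vector_test_data_class(vector, class_mask, sign_mask):
    """Set each output element to ALL_ONES if the input matches a selected class and sign."""
    result = []
    for x in vector:
        cls, sign = classify(x)
        selected = bool(cls & class_mask) and bool(sign & sign_mask)
        result.append(ALL_ONES if selected else ALL_ZEROS)
    return result
```

A call such as `vector_test_data_class(values, ZERO | INFINITY, NEGATIVE)` flags only the negative zeros and negative infinities in `values`, producing a mask that later vector operations could consume.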
-
Patent number: 10671345
Abstract: An integrated circuit may include normalization circuitry that can be used when converting a fixed-point number to a floating-point number. The normalization circuitry may include at least a floating-point generation circuit that receives the fixed-point number and that creates a corresponding floating-point number. The normalization circuitry may then leverage an embedded digital signal processing (DSP) block on the integrated circuit to perform an arithmetic operation by removing the leading one from the created floating-point number. The resulting number may have a fractional component and an exponent value, which can then be used to derive the final normalized value.
Type: Grant
Filed: February 2, 2017
Date of Patent: June 2, 2020
Assignee: Intel Corporation
Inventor: Bogdan Pasca
-
Patent number: 10628049
Abstract: A sequencer circuit is configured to generate control signals for on-die memory control circuitry. The control signals may include memory operation pulses for implementing operations on selected non-volatile memory cells embodied within the same die as the sequencer (and other on-die memory control circuitry). The timing, configuration, and/or duration of the memory control signals are defined in configuration data, which can be modified after the design and/or fabrication of the die and/or on-die memory circuitry. As such, the timing, configuration, and/or duration of the memory control signals generated by the sequencer may be manipulated after the design and/or fabrication of the die, sequencer, and other on-die memory control circuitry.
Type: Grant
Filed: January 12, 2018
Date of Patent: April 21, 2020
Assignee: Sandisk Technologies LLC
Inventors: Yuheng Zhang, Gordon Yee, Yibo Yin, Tz-Yi Liu Liu
-
Patent number: 10592213
Abstract: Techniques to preprocess tensor operations prior to code generation to optimize compilation are disclosed. A computer readable representation of a linear algebra or tensor operation is received. A code transformation software component performs transformations include output reduction and fraction removal. The result is a set of linear equations of a single variable with integer coefficients. Such a set lends itself to more efficient code generation during compilation by a code generation software component. Use cases disclosed include targeting a machine learning hardware accelerator, receiving code in the form of an intermediate language generated by a cross-compiler with multiple front ends supporting multiple programming languages, and cloud deployment and execution scenarios.
Type: Grant
Filed: October 18, 2017
Date of Patent: March 17, 2020
Assignee: Intel Corporation
Inventors: Jeremy Bruestle, Choong Ng
-
Patent number: 10514913
Abstract: Setting or updating of floating point controls is managed. Floating point controls include controls used for floating point operations, such as rounding mode and/or other controls. Further, floating point controls include status associated with floating point operations, such as floating point exceptions and/or others. The management of the floating point controls includes efficiently updating the controls, while reducing costs associated therewith.
Type: Grant
Filed: June 23, 2017
Date of Patent: December 24, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
-
Patent number: 10503477
Abstract: The disclosure provides a very flexible mechanism for a storage controller to create RAID stripes and to re-create corrupted stripes when necessary using the erasure coding scheme. Typically, this is known as a RAID 6 implementation/feature. The erasure code calculations are generated using the Galois Multiplication hardware and the system controller can pass any polynomial into the hardware on a per stripe calculation basis. The polynomial value is passed to the hardware via an input descriptor field. The descriptor controls the entire computation process.
Type: Grant
Filed: December 8, 2017
Date of Patent: December 10, 2019
Assignee: EXTEN TECHNOLOGIES, INC.
Inventors: Daniel B. Reents, Ashwin Kamath
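Galois-field multiplication with a caller-supplied reduction polynomial can be sketched in a few lines. This is the textbook shift-and-XOR method over GF(2^8), not the patented hardware or its descriptor format; the function name is hypothetical, and the default polynomial 0x11D is an assumption (it is a common choice in RAID 6 software, but the abstract does not name one).

```python
def gf256_multiply(a, b, polynomial=0x11D):
    """Multiply a and b in GF(2**8), reducing by the given degree-8 polynomial.

    Assumption: a and b are integers in range(256) and the polynomial
    has its x**8 bit (0x100) set.
    """
    product = 0
    while b:
        if b & 1:
            product ^= a          # addition in GF(2**8) is XOR
        a <<= 1                   # multiply a by x
        if a & 0x100:
            a ^= polynomial       # reduce modulo the field polynomial
        b >>= 1
    return product
```

Passing the polynomial as an argument mirrors the per-stripe flexibility the abstract describes: the same routine gives the AES field with `polynomial=0x11B` and a different field with the default.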
-
Patent number: 10459688
Abstract: An apparatus comprises: processing circuitry to perform data processing; and an instruction decoder to control the processing circuitry to perform an anchored-data processing operation to generate an anchored-data element. The anchored-data element has an encoding including type information indicative of whether the anchored-data element represents: a portion of bits of a two's complement number, said portion of bits corresponding to a given range of significance representable using the anchored-data element; or a special value other than said portion of bits of a two's complement number.
Type: Grant
Filed: February 6, 2019
Date of Patent: October 29, 2019
Assignee: ARM Limited
Inventors: Neil Burgess, Christopher Neal Hinds, David Raymond Lutz
-
Patent number: 10379815
Abstract: Integrated circuits with specialized processing blocks are provided. The specialized processing blocks may include floating-point multiplier circuits that can be configured to support variable precision. A multiplier circuit may include a first carry-propagate adder (CPA), a second carry-propagate adder (CPA), and an associated rounding circuit. The first CPA may be wide enough to handle the required precision of the mantissa. In a bridged mode, the first CPA may borrow an additional bit from the second CPA while the rounding circuit will monitor the appropriate bits to select the proper multiplier output. A parallel prefix tree operable in a non-bridged mode or the bridged mode may be used to compute multiple multiplier outputs. The multiplier circuit may also include exponent and exception handling circuitry using various masks corresponding to the desired precision width.
Type: Grant
Filed: July 18, 2018
Date of Patent: August 13, 2019
Assignee: Altera Corporation
Inventor: Martin Langhammer
-
Patent number: 10331405
Abstract: An accurate implementation of a polynomial using floating-point or other rounded arithmetic can be generated using a plurality of hardware logic components which each implement an input polynomial such that the zeros in the input polynomial can be determined correctly. The number of different hardware logic components that are used can be reduced by analyzing the set of input polynomials and from it generating a set of polynomial components, where each polynomial in the set of input polynomials which is not also in the set of polynomial components, can be generated from a single one of the polynomial components.
Type: Grant
Filed: April 21, 2017
Date of Patent: June 25, 2019
Assignee: Imagination Technologies Limited
Inventor: Theo Alan Drane
-
Patent number: 10297000
Abstract: A high dynamic range image information hiding method includes embedding secret information and extracting the secret information. The step of embedding secret information includes obtaining three channel values of every pixel in an original high dynamic range image; according to every channel value and corresponding 5-bit exponent of every pixel, determining an embedding significance bit of the information to be embedded in every channel value of every pixel; embedding information into every channel value of every pixel; and obtaining a high dynamic range image embedded with the secret information.
Type: Grant
Filed: November 8, 2017
Date of Patent: May 21, 2019
Assignee: Ningbo University
Inventors: Gangyi Jiang, Yongqiang Bai, Mei Yu, Yang Wang
-
Patent number: 10216479
Abstract: An apparatus and method are provided for performing arithmetic operations to accumulate floating-point numbers. The apparatus comprises execution circuitry to perform arithmetic operations, and decoder circuitry to decode a sequence of instructions. A convert and accumulate instruction is provided, and the decoder circuitry is responsive to decoding the convert and accumulate instruction to generate one or more control signals to control the execution circuitry to convert at least one floating-point operand identified by the convert and accumulate instruction into a corresponding N-bit fixed-point operand having M fraction bits, where M is less than N and M is dependent on a format of the floating-point operand. The execution circuitry accumulates each corresponding N bit fixed-point operand and a P bit fixed-point operand identified by the convert and accumulate instruction in order to generate a P bit fixed-point result value, where P is greater than N and also has M fraction bits.
Type: Grant
Filed: December 6, 2016
Date of Patent: February 26, 2019
Assignee: ARM LIMITED
Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds, Andreas Due Engh-Halstvedt
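The convert-then-accumulate idea can be imitated with Python integers standing in for fixed-point registers. This is a sketch under assumptions: the operands are taken to be values representable in binary16, so M = 24 fraction bits covers the finest bit of any operand, and the widths N and P are not enforced because Python integers are unbounded. The function names are hypothetical.

```python
M = 24   # fraction bits; assumed choice, matching the smallest binary16 subnormal 2**-24

def to_fixed(x):
    """Convert a float to a fixed-point integer with M fraction bits, exactly."""
    scaled = x * 2.0 ** M          # scaling by a power of two is exact here
    fixed = int(scaled)
    if fixed != scaled:
        raise ValueError("value needs more than M fraction bits")
    return fixed

def convert_and_accumulate(accumulator, operands):
    """Add each converted operand to a fixed-point accumulator with M fraction bits."""
    for x in operands:
        accumulator += to_fixed(x)  # integer addition is exact and associative
    return accumulator
```

Because the accumulation is integer addition, the result does not depend on operand order, and dividing the final accumulator by `2**M` recovers the exact sum.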
-
Patent number: 10109014
Abstract: Systems and methods involving a rating module that accesses a single, voluminous table or multiple tables stored in a searchable data store (e.g., database) to execute various queries (e.g., SQL JOIN) to search the table(s) is disclosed. The system may include an underlying linear programming platform (e.g., optimization engine and associated components) that includes an application programmer's interface (e.g., Python API) that may be used to perform optimization using illustrative optimization libraries (e.g., optimizer). The system may be communicatively coupled with a vehicle and/or other device to communicate/output ratings information to a user.
Type: Grant
Filed: July 30, 2015
Date of Patent: October 23, 2018
Assignee: Allstate Insurance Company
Inventors: Richard D. Bischoff, Nicholas J. Reed, Timothy S. Lenahan, Eric Huls
-
Patent number: 10061561
Abstract: A floating point adder includes leading zero anticipation circuitry to determine a number of leading zeros within a result significand value of a sum of a first floating point operand and a second floating point operand. This number of leading zeros is used to generate a mask which in turn selects input bits from a non-normalized significand produced by adding the first significand value and the second significand value. The non-normalized significand is then normalized at the same time as the output rounding bits used to round the normalized significand value are generated by rounding bit generation circuitry.
Type: Grant
Filed: September 7, 2016
Date of Patent: August 28, 2018
Assignee: ARM Limited
Inventor: David Raymond Lutz
-
Patent number: 10031842
Abstract: A method including measuring an initial precision measurement of at least one value of a number with a decimal point, measuring an infinite precision measurement of a value of the number with a decimal point, where a format of the number with a decimal point or of a primitive operation manipulating a number with a decimal point is first replaced with a predetermined optimal format. Additionally, manipulating, for at least one instruction, at least one number with a decimal point, including writing at least one variant performing the same function as the at least one instruction, and measuring, for each variant, at least one value of the at least one number with a decimal point obtained with the variant, and selecting the optimal variant as a function of the measured value and the initial precision and infinite precision values and replacing the at least one instruction with the selected variant.
Type: Grant
Filed: March 12, 2015
Date of Patent: July 24, 2018
Assignee: NUMALIS
Inventors: Arnault Ioualalen, Nicolas Normand, Matthieu Martel
-
Patent number: 10033801
Abstract: Apparatus, systems, and methods are described, including apparatus that includes one or more communication interfaces for communicating over a communication network, and a processor. The processor is configured to receive, via the communication interfaces, a plurality of numbers, and calculate a sum of the numbers that is independent of an order in which the numbers are received, by (i) converting any of the numbers that are received in a floating-point representation to a derived floating-point representation that includes a plurality of signed integer multiplicands corresponding to different respective orders of magnitude, and (ii) summing the numbers in the derived floating-point representation, by separately summing integer multiplicands that correspond to the same order of magnitude. Other embodiments are also described.
Type: Grant
Filed: February 11, 2016
Date of Patent: July 24, 2018
Assignee: Mellanox Technologies, Ltd.
Inventor: Hillel Chapman
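An order-independent sum of this general shape can be sketched by splitting each float into signed integer pieces, one per fixed-width bin of magnitude, and summing the bins separately. The bin width, the dictionary-based representation, and the function names are assumptions for illustration; the patent's actual derived representation may differ.

```python
from fractions import Fraction

LIMB_BITS = 32   # hypothetical width of each order-of-magnitude bin

def to_bins(x):
    """Split a finite float into signed integer multiplicands keyed by bin index.

    The float equals the sum of multiplicand * 2**(LIMB_BITS * index) over all bins.
    """
    numerator, denominator = x.as_integer_ratio()   # denominator is a power of two
    exponent = -(denominator.bit_length() - 1)      # x = numerator * 2**exponent
    index, shift = divmod(exponent, LIMB_BITS)      # exponent = index*LIMB_BITS + shift
    sign = -1 if numerator < 0 else 1
    magnitude = abs(numerator) << shift
    bins = {}
    while magnitude:
        bins[index] = sign * (magnitude & (2**LIMB_BITS - 1))
        magnitude >>= LIMB_BITS
        index += 1
    return bins

def reproducible_sum(values):
    """Sum floats so that the result does not depend on their order."""
    totals = {}
    for x in values:
        for index, multiplicand in to_bins(x).items():
            totals[index] = totals.get(index, 0) + multiplicand  # exact integer adds
    exact = sum(Fraction(2) ** (LIMB_BITS * i) * m for i, m in totals.items())
    return float(exact)   # a single rounding of the exact total
```

Since every per-bin addition is an exact integer addition, any arrival order produces the same bin totals, and the only rounding happens once at the end. In this sketch, summing `1e16`, `1.0` and `-1e16` returns `1.0` in any order, whereas a naive left-to-right float sum can lose the `1.0`.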
-
Patent number: 9990203
Abstract: Methods, devices, and systems for capturing an accuracy of an instruction executing on a processor. An instruction may be executed on the processor, and the accuracy of the instruction may be captured using a hardware counter circuit. The accuracy of the instruction may be captured by analyzing bits of at least one value of the instruction to determine a minimum or maximum precision datatype for representing the field, and determining whether to adjust a value of the hardware counter circuit accordingly. The representation may be output to a debugger or logfile for use by a developer, or may be output to a runtime or virtual machine to automatically adjust instruction precision or gating of portions of the processor datapath.
Type: Grant
Filed: December 28, 2015
Date of Patent: June 5, 2018
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Leonardo de Paula Rosa Piga, Abhinandan Majumdar, Indrani Paul, Wei Huang, Manish Arora, Joseph L. Greathouse
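A software stand-in for this kind of precision profiling is to test whether each value survives a round trip through a narrower format and to count the outcomes. The sketch below uses Python's `struct` formats for binary16 and binary32 and a `Counter` in place of the hardware counter circuit; the function names are hypothetical, and NaN inputs are reported as "double" because NaN never compares equal to itself.

```python
import struct
from collections import Counter

def fits(x, fmt):
    """True if x is unchanged by a round trip through the given struct float format."""
    try:
        return struct.unpack(fmt, struct.pack(fmt, x))[0] == x
    except OverflowError:      # magnitude too large for the narrower format
        return False

def minimum_precision(x):
    """Name the narrowest of half, single, double that represents x exactly."""
    if fits(x, "<e"):
        return "half"
    if fits(x, "<f"):
        return "single"
    return "double"

def profile(values):
    """Count how many values need each precision (a stand-in for hardware counters)."""
    return Counter(minimum_precision(v) for v in values)
```

A profile dominated by "half" or "single" suggests that a computation could run at lower precision without changing its values, which is the kind of signal the abstract describes feeding to a developer or runtime.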
-
Patent number: 9959091
Abstract: A method identifies a floating point implementation of a polynomial that is accurately evaluable. The method comprises determining whether the polynomial has an allowable variety defined by a plurality of sub-varieties, and, if so, partitioning the input domain of the polynomial into a plurality of sub-domains about the sub-varieties. A floating point precision is then identified for each input to the polynomial falling within each sub-domain based on the location of the input within the sub-domain (e.g. how far away the input is from the sub-variety associated with the sub-domain). A floating point implementation for the polynomial is generated so that an input to the polynomial is evaluated using floating point components having the precision identified for the input.
Type: Grant
Filed: September 8, 2015
Date of Patent: May 1, 2018
Assignee: Imagination Technologies Limited
Inventor: Theo Alan Drane
-
Patent number: 9940101
Abstract: Embodiments of the present disclosure include a tininess prediction and handler engine for handling numeric underflow while streamlining the data path for handling normal range cases, thereby avoiding flushes, and reducing the complexity of a scheduler with respect to how dependent operations are handled. A preemptive tiny detection logic section can detect a potential tiny result for the function or operation that is being performed, and can produce a pessimistic tiny indicator. The tininess prediction and handler engine can further include a subnormal post-processing pipe, which can denormalize and round one or more subnormal operations while in a post-processing mode. A schedule modification logic section can reschedule in-flight operations. The schedule modification logic section can issue dependent operations optimistically assuming that a producing operation will not produce a tiny result, and so will not incur extra latency associated with fixing the tiny result in the post-processing pipe.
Type: Grant
Filed: March 10, 2016
Date of Patent: April 10, 2018
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ashraf Ahmed, Nicholas Todd Humphries, Marc Augustin
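A pessimistic tiny indicator can be computed from operand exponents alone, before the product is formed. The sketch below does this for binary64 multiplication of finite operands; it is an illustration of the general idea, not the patented logic, and the function name and the choice of multiplication are assumptions.

```python
import math
import sys

# sys.float_info.min_exp is -1021, so the smallest normal double is 2**-1022.
MIN_NORMAL_EXP = sys.float_info.min_exp - 1

def may_be_tiny(a, b):
    """Pessimistic predictor: could a * b fall below the smallest normal magnitude?

    Assumption: a and b are finite. The predictor may report True for some
    products that turn out to be normal, but never False for a tiny product.
    """
    if a == 0.0 or b == 0.0:
        return False               # an exact zero is not a tiny result
    _, ea = math.frexp(a)          # abs(a) lies in [2**(ea-1), 2**ea)
    _, eb = math.frexp(b)
    # abs(a*b) >= 2**(ea+eb-2), so the product is certainly normal
    # whenever that lower bound is at least the smallest normal number.
    return ea + eb - 2 < MIN_NORMAL_EXP
```

Because the check needs only two small integer additions and a comparison, it can run ahead of the multiplier, leaving the slower subnormal handling for the rare cases it flags.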
-
Patent number: 9904513
Abstract: Floating point compound equations that involve addition of at least three terms, where each term involves a multiplication, can be implemented by using a bypass to prevent small, remaining values from being lost when shifted.
Type: Grant
Filed: June 25, 2015
Date of Patent: February 27, 2018
Assignee: Intel Corporation
Inventors: Subramaniam Maiyuran, Jorge F. Garcia Pabon, Ashutosh Garg
-
Patent number: 9880840
Abstract: Detection of whether a result of a floating point operation is safe. Characteristics of the result are examined to determine whether the result is safe or potentially unsafe, as defined by the user. An instruction is provided to facilitate detection of safe or potentially unsafe results.
Type: Grant
Filed: February 28, 2014
Date of Patent: January 30, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael F Cowlishaw, Shawn D Lundvall, Ronald M Smith, Sr., Phil C Yeh
-
Patent number: 9778906 Abstract: An apparatus comprises processing circuitry to perform a conversion operation to convert a floating-point value to a vector comprising a plurality of data elements representing respective bit significance portions of a binary value corresponding to the floating-point value. Type: Grant Filed: December 24, 2014 Date of Patent: October 3, 2017 Assignee: ARM Limited Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
-
Patent number: 9665347 Abstract: An apparatus comprises processing circuitry to perform a conversion operation to convert a vector comprising a plurality of data elements representing respective bit significance portions of a binary value to a scalar value comprising an alternative representation of said binary value. Type: Grant Filed: December 24, 2014 Date of Patent: May 30, 2017 Assignee: ARM Limited Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
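One way to picture a vector of "bit significance portions" is as a wide fixed-point number split into lanes, with each lane holding a different range of bit weights. The Python sketch below converts a float to such a vector and back. The lane width, lane count, least-significant-bit weight, and two's-complement encoding are all assumptions chosen for illustration; neither abstract specifies them.

```python
from fractions import Fraction

LANE_BITS = 64  # assumed lane width


def float_to_lanes(x, num_lanes=4, lsb_exponent=-64):
    """Split x into lanes, least significant first (two's complement).

    Lane i holds the bits with weights 2**(lsb_exponent + 64*i) and up.
    """
    scaled = Fraction(x) / Fraction(2) ** lsb_exponent
    if scaled.denominator != 1:
        raise ValueError("value has bits below the least significant lane")
    total_bits = LANE_BITS * num_lanes
    value = scaled.numerator
    if not -(1 << (total_bits - 1)) <= value < (1 << (total_bits - 1)):
        raise OverflowError("value does not fit in the chosen lanes")
    value &= (1 << total_bits) - 1  # two's-complement wrap for negatives
    mask = (1 << LANE_BITS) - 1
    return [(value >> (LANE_BITS * i)) & mask for i in range(num_lanes)]


def lanes_to_float(lanes, lsb_exponent=-64):
    """Inverse conversion: recombine the lanes and round once to binary64."""
    total_bits = LANE_BITS * len(lanes)
    value = sum(lane << (LANE_BITS * i) for i, lane in enumerate(lanes))
    if value >> (total_bits - 1):  # sign bit set
        value -= 1 << total_bits
    return float(Fraction(value) * Fraction(2) ** lsb_exponent)
```

With these defaults, `float_to_lanes(1.5)` places the fractional half in the lowest lane and the integer part in the next lane, and `lanes_to_float` recovers 1.5.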
-
Patent number: 9569175 Abstract: An FMA unit, for carrying out an arithmetic operation in a model computation unit of a control unit, is configured to process input of two factors and one summand in the form of floating point values, and provide a computation result of such processing as an output variable in the form of a floating point value. The FMA unit is designed to carry out a multiplication and a subsequent addition, the bit resolutions of the inputs for the factors being lower than the bit resolution of the input for the summand and the bit resolution of the output variable. Type: Grant Filed: May 21, 2014 Date of Patent: February 14, 2017 Assignee: ROBERT BOSCH GMBH Inventors: Wolfgang Fischer, Andre Guntoro
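A software analogue of this mixed-resolution arrangement can be sketched in Python, assuming binary32 factors and a binary64 summand and output. These widths are an assumption for illustration; the abstract does not name specific formats. Because two 24-bit significands multiply to at most 48 bits, the product is exact in binary64, so only the final addition rounds.

```python
import struct


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mixed_precision_fma(a, b, c):
    """Compute a*b + c with binary32 factors and a binary64 summand/result.

    Assumption: a and b are first rounded to binary32. Their product needs
    at most 48 significand bits, so it is exact in binary64, and the
    addition below is the only rounding step.
    """
    a32, b32 = to_float32(a), to_float32(b)
    product = a32 * b32  # exact in binary64
    return product + c   # single rounding


# The 2**-46 cross term survives because the product is never rounded.
x = 1 + 2**-23
print(mixed_precision_fma(x, x, -1.0) == 2**-22 + 2**-46)  # True
```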
-
Patent number: 9489198 Abstract: A method and apparatus for including in a processor instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location. Type: Grant Filed: February 4, 2016 Date of Patent: November 8, 2016 Assignee: Intel Corporation Inventors: Rajiv Kapoor, Ronen Zohar, Mark Buxton, Zeev Sperber, Koby Gottlieb
-
Patent number: 9471305 Abstract: A method for graphics processing includes generating one or more transcendental instructions in a graphics processing unit (GPU). Micro-code is formed for processing the one or more transcendental instructions in the GPU. The micro-code is processed using an iterative process including cubic interpolation and an evaluation of a cubic polynomial. Type: Grant Filed: August 12, 2014 Date of Patent: October 18, 2016 Assignee: Samsung Electronics Co., Ltd. Inventor: Mitchell Alsup
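The cubic-polynomial evaluation step mentioned here is commonly done with Horner's rule, which needs three multiplications and three additions. The Python sketch below shows only that step. The function name and the coefficient layout are assumptions, and the abstract does not say how the micro-code obtains or schedules its coefficients.

```python
def eval_cubic(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + c3*x**3 using Horner's rule.

    Assumption: coeffs is (c0, c1, c2, c3); in a table-driven scheme these
    would be looked up per interval of the input range.
    """
    c0, c1, c2, c3 = coeffs
    return c0 + x * (c1 + x * (c2 + x * c3))


print(eval_cubic((1.0, 2.0, 3.0, 4.0), 2.0))  # 49.0
```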
-
Patent number: 9455831 Abstract: An order-preserving encryption (OPE) method receives a plaintext (clear text) and generates a ciphertext (encrypted text) using software arbitrary-precision floating point libraries during initial recursive computation rounds. In response to the ciphertext space reducing to a breakpoint, the OPE method continues computations using a hardware floating point processor to accelerate the computation. In this manner, the OPE method enables efficient order-preserving encryption to support range queries on encrypted data. Type: Grant Filed: September 18, 2014 Date of Patent: September 27, 2016 Assignee: Skyhigh Networks, Inc. Inventor: Paul Grubbs
-
Patent number: 9448806 Abstract: A floating-point unit and a method of identifying exception cases in a floating-point unit. In one embodiment, the floating-point unit includes: (1) a floating-point computation circuit having a normal path and an exception path and operable to execute an operation on an operand and (2) a decision circuit associated with the normal path and the exception path and configured to employ a flush-to-zero mode of the floating-point unit to determine which one of the normal path and the exception path is appropriate for carrying out the operation on the operand. Type: Grant Filed: September 25, 2012 Date of Patent: September 20, 2016 Assignee: Nvidia Corporation Inventors: Marcin Andrychowicz, Alex Fit-Florea
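A rough software picture of such a decision, under the assumption that subnormal operands are the exception case of interest: with flush-to-zero enabled, a subnormal operand is replaced by a signed zero and stays on the normal path; otherwise it is routed to the exception path. The function names and path labels in this Python sketch are hypothetical, and the actual decision circuit may consider other cases.

```python
import math
import sys


def is_subnormal(x):
    """True for nonzero finite binary64 values below the smallest normal."""
    return x != 0.0 and math.isfinite(x) and abs(x) < sys.float_info.min


def choose_path(operand, flush_to_zero):
    """Return (path, effective_operand) for a single-operand operation.

    Assumption: only subnormal operands are treated as exception cases.
    """
    if is_subnormal(operand):
        if flush_to_zero:
            # Flush to a zero of the same sign; the normal path can handle it.
            return "normal", math.copysign(0.0, operand)
        return "exception", operand
    return "normal", operand
```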
-
Patent number: 9444735 Abstract: Techniques are presented herein to distribute the processing of communication to network-connected devices to routing nodes, as opposed to centralizing those operations in one device as in the traditional/classical system. Using a bitmapped Type field, advertisements and queries can be categorized. Also, by using a Subgroup field, the scope of advertisements and queries can be dynamically limited. These techniques reduce the number of matches and make the matches more relevant to the user who sent the query. Routing nodes can be any network element that routes traffic, physical or virtual (cloud-based router or switch). The intelligence to perform these techniques can be embodied as an overlay on top of a physical network. Type: Grant Filed: February 27, 2014 Date of Patent: September 13, 2016 Assignee: Cisco Technology, Inc. Inventors: Michael L. Sullenberger, Andre Karamanian
-
Patent number: 9335996 Abstract: A mechanism for recycling error bits in a floating point unit is disclosed. A system of the disclosure includes a memory and a processing device communicably coupled to the memory. In one embodiment, the processing device comprises a floating point unit (FPU) to generate a result value from applying an operation on floating point number inputs to the FPU and to generate an error value using the result value. The FPU also writes the result value to a first register of the processing device dedicated to storing results from the operation of the FPU and writes the error value to a second register of the processing device dedicated to storing errors from the operation of the FPU. Type: Grant Filed: November 14, 2012 Date of Patent: May 10, 2016 Assignee: Intel Corporation Inventors: Helia Naeimi, Ralph Nathan, Daniel Sorin, Shih-Lien L. Lu
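The idea of producing both a result value and an error value has a well-known software counterpart in error-free transformations. The Python sketch below uses Knuth's TwoSum algorithm for addition, which is a swapped-in technique and not the mechanism disclosed in the patent. It returns the rounded sum together with the exact rounding error, assuming round-to-nearest binary64 arithmetic and no overflow.

```python
def two_sum(a, b):
    """Knuth's TwoSum: return (s, err) with s = fl(a + b) and a + b = s + err."""
    s = a + b
    b_virtual = s - a                    # the part of b that made it into s
    a_virtual = s - b_virtual            # the part of a that made it into s
    err = (a - a_virtual) + (b - b_virtual)
    return s, err


# The 1.0 is lost from the rounded sum but recovered in the error term.
print(two_sum(1e16, 1.0))  # (1e+16, 1.0)
```

Writing `s` and `err` to separate variables mirrors, loosely, the abstract's separate result and error registers.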
-
Patent number: 9298457 Abstract: An execution unit configured for compression and decompression of numerical data utilizing single instruction, multiple data (SIMD) instructions is described. The numerical data includes integer and floating-point samples. Compression supports three encoding modes: lossless, fixed-rate, and fixed-quality. SIMD instructions for compression operations may include attenuation, derivative calculations, bit packing to form compressed packets, header generation for the packets, and packed array output operations. SIMD instructions for decompression may include packed array input operations, header recovery, decoder control, bit unpacking, integration, and amplification. Compression and decompression may be implemented in a microprocessor, digital signal processor, field-programmable gate array, application-specific integrated circuit, system-on-chip, or graphics processor, using SIMD instructions. Compression and decompression of numerical data can reduce memory, networking, and storage bottlenecks. Type: Grant Filed: January 22, 2013 Date of Patent: March 29, 2016 Assignee: Altera Corporation Inventor: Albert W. Wegener
-
Patent number: 9141586 Abstract: A mechanism for performing single-path floating-point rounding in a floating point unit is disclosed. A system of the disclosure includes a memory and a processing device communicably coupled to the memory. In one embodiment, the processing device comprises a floating point unit (FPU) to generate a plurality of status flags for a rounded value of a finite nonzero number. The plurality of status flags are generated based on the finite nonzero number without calculating the rounded value of the finite nonzero number. The plurality of status flags comprises an overflow flag and an underflow flag. The FPU determines whether a rounded value should be calculated for the finite nonzero number based on the plurality of status flags and whether the overflow flag is asserted. Type: Grant Filed: December 21, 2012 Date of Patent: September 22, 2015 Assignee: Intel Corporation Inventors: Warren E. Ferguson, Brian J. Hickmann, Thomas D. Fletcher
-
Patent number: 9135376 Abstract: Various embodiments provide for the determination of a test set that satisfies a coverage model, where portions of the search space need not be searched in order to generate the test set. With various embodiments, a search space defined by a set of inputs for an electronic design and a coverage model is identified. The search space is then fractured into subspaces. Subsequently, the subspaces are solved to determine if they include at least one input sequence that satisfies the coverage constraints defined in the coverage model. The subspaces found to include at least one input sequence that satisfies these coverage constraints are then searched for unique input sequences in order to generate a test set. Subspaces found not to include at least one input sequence that satisfies the coverage constraints may be excluded from the overall search space. Type: Grant Filed: May 1, 2013 Date of Patent: September 15, 2015 Assignee: Mentor Graphics Corporation Inventors: Sudhir D. Kadkade, Clifton A. Lyons, Jr., Kunal P. Ganeshpure
-
Patent number: 9128758 Abstract: According to one aspect of the present disclosure, a method and technique for encoding densely packed decimals is disclosed. The method includes: executing a floating point instruction configured to perform a floating point operation on decimal data in a binary coded decimal (BCD) format; determining whether a result of the operation includes a rounded mantissa overflow; and responsive to determining that the result of the operation includes a rounded mantissa overflow, compressing a result of the operation from the BCD-formatted decimal data to decimal data in a densely packed decimal (DPD) format by shifting select bit values of the BCD formatted decimal data by one digit to select bit positions in the DPD format. Type: Grant Filed: November 15, 2011 Date of Patent: September 8, 2015 Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: Michael K. Kroener, Christophe J. Layer, Petra Leber, Silvia M. Mueller
-
Patent number: 9092226 Abstract: Methods and apparatus are provided for handling floating point exceptions in a processor that executes single-instruction multiple-data (SIMD) instructions. In one example a numerical exception is identified for a SIMD floating point operation and SIMD micro-operations are initiated to generate two packed partial results of a packed result for the SIMD floating point operation. A SIMD denormalization micro-operation is initiated to combine the two packed partial results and to denormalize one or more elements of the combined packed partial results to generate a packed result for the SIMD floating point operation having one or more denormal elements. Flags are set and stored with packed partial results to identify denormal elements. In one example a SIMD normalization micro-operation is initiated to generate a normalized pseudo internal floating point representation prior to the SIMD floating point operation when it uses multiplication. Type: Grant Filed: December 14, 2011 Date of Patent: July 28, 2015 Assignee: Intel Corporation Inventors: Zeev Sperber, Shachar Finkelstein, Gregory Pribush, Amit Gradstein, Guy Bale, Thierry Pons
-
Patent number: 9076215 Abstract: An image processor sets a first predetermined number of first blocks at first intervals in a second image, calculates a first evaluated value, selects one of the first blocks, and calculates a first parallax between the selected first block and the matching target block. An image processor sets a second predetermined number of second blocks at second intervals in a second image, calculates a second evaluated value, selects one of the second blocks, and calculates a second parallax between the selected second block and the matching target block. A controller determines, based on the first evaluated value and the second evaluated value and based on the first parallax and the second parallax, whether or not to employ one of the first parallax and the second parallax. Type: Grant Filed: November 27, 2012 Date of Patent: July 7, 2015 Assignee: Panasonic Intellectual Property Management Co., Ltd. Inventor: Motonori Ogura