Reciprocal Patents (Class 708/502)

Patent number: 11943336
Abstract: A method of encrypting and decrypting multiple individual pieces or sets of data in which a computing device randomly selects a group of seeds that it then uses to generate irrational numbers. Sections of the generated irrational numbers can be used as one-time pads or keys to encrypt the corresponding data sets. Intended recipients can then reverse the process using their allowed keys to access data for which they have authorization.
Type: Grant
Filed: November 22, 2021
Date of Patent: March 26, 2024
Assignee: Theon Technology LLC
Inventor: Robert Edward Grant

Patent number: 11934797
Abstract: A processor to facilitate execution of a single-precision floating point operation on an operand is disclosed. The processor includes one or more execution units, each having a plurality of floating point units to execute one or more instructions to perform the single-precision floating point operation on the operand, including performing a floating point operation on an exponent component of the operand; and performing a floating point operation on a mantissa component of the operand, comprising dividing the mantissa component into a first subcomponent and a second subcomponent, determining a result of the floating point operation for the first subcomponent and determining a result of the floating point operation for the second subcomponent, and returning a result of the floating point operation.
Type: Grant
Filed: April 4, 2019
Date of Patent: March 19, 2024
Assignee: Intel Corporation
Inventors: Abhishek Rhisheekesan, Shashank Lakshminarayana, Subramaniam Maiyuran

Patent number: 11886505
Abstract: A function approximation system is disclosed for determining output floating point values of functions calculated using floating point numbers. Complex functions have different shapes in different subsets of their input domain, making them difficult to predict for different values of the input variable. The function approximation system comprises an execution unit configured to determine corresponding values of a given function given a floating point input to the function; a plurality of look up tables for each function type; a correction table of values which determines if corrections to the output value are required; and a table selector for finding an appropriate table for a given function.
Type: Grant
Filed: April 5, 2022
Date of Patent: January 30, 2024
Assignee: GRAPHCORE LIMITED
Inventors: Jonathan Mangnall, Stephen Felix

Patent number: 10942204
Abstract: An improved phasor estimation method for M-class phasor measurement units (PMUs) with a low computational burden is described. The method contains three steps: 1) A phasor measurement filter is designed by selecting parameters of the Taylor weighted least square method to prioritize dynamic phasor accuracy and a high level of suppression of high-frequency interferences; 2) A finite impulse response low-pass filter designed by the equal-ripple method is put forward to suppress low-frequency interferences; and 3) Phasor amplitude is corrected under off-nominal conditions.
Type: Grant
Filed: October 27, 2020
Date of Patent: March 9, 2021
Assignee: North China Electric Power University
Inventors: Hao Liu, Tianshu Bi, Sudi Xu

Patent number: 10707932
Abstract: Disclosed is a Multiple-Input Multiple-Output (MIMO) system-based signal detection method. The method includes: performing a scaling calculation on a first covariance matrix according to first main diagonal elements in the first covariance matrix to obtain a second covariance matrix; obtaining a whitening matrix according to the second covariance matrix; taking the whitening matrix, a vector of a receiving signal and a channel matrix as input parameters, inputting the parameters into a mathematical model for a whitening operation, and performing a whitening calculation to obtain an operation result; and detecting a transmit signal in a MIMO system according to the operation result to obtain a detection result. Also disclosed are a MIMO system-based signal detection device and a computer storage medium.
Type: Grant
Filed: January 15, 2018
Date of Patent: July 7, 2020
Assignee: SANECHIPS TECHNOLOGY CO., LTD.
Inventor: Xuetao Dong

Patent number: 10664237
Abstract: An apparatus and method for performing a reciprocal square root. For example, one embodiment of a processor comprises: a decoder to decode a reciprocal square root instruction to generate a decoded reciprocal square root instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal square root execution circuitry to execute the decoded reciprocal square root instruction, the reciprocal square root execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal square root execution circuitry to generate a reciprocal square root of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.
Type: Grant
Filed: December 21, 2017
Date of Patent: May 26, 2020
Assignee: Intel Corporation
Inventors: Cristina Anderson, Elmoustapha Ould-Ahmed-Vall, Marius Cornea-Hasegan, Robert Valentine, Mark Charney, Jesus Corbal, Venkateswara Madduri

Patent number: 10521194
Abstract: The present embodiments relate to integrated circuits with circuitry that efficiently performs mixed-precision floating-point arithmetic operations. Such circuitry may be implemented in specialized processing blocks. The specialized processing blocks may include configurable interconnect circuitry to support a variety of different use modes. For example, the specialized processing blocks may implement fixed-point addition, floating-point addition, fixed-point multiplication, floating-point multiplication, sum of two multiplications in a first floating-point precision, with or without casting to a second floating-point precision and the latter followed by a subsequent addition in the second floating-point precision, if desired, just to name a few.
Type: Grant
Filed: December 20, 2018
Date of Patent: December 31, 2019
Assignee: Intel Corporation
Inventor: Martin Langhammer

Patent number: 10489113
Abstract: The present disclosure provides a quick operation device for a nonlinear function, and a method therefor. The device comprises: a domain conversion part for converting an input independent variable into a corresponding value in a table lookup range; a table lookup part for looking up a slope and an intercept of the corresponding piecewise linear fitting based on the input independent variable or an independent variable processed by the domain conversion part; and a linear fitting part for obtaining a final result by linear fitting, based on the slope and the intercept that the table lookup part obtained by table lookup. The present disclosure solves the problems of slow operation speed, large area of the operation device, and high power consumption caused by the traditional method.
Type: Grant
Filed: June 17, 2016
Date of Patent: November 26, 2019
Assignee: Institute of Computing Technology, Chinese Academy of Sciences
Inventors: Shijin Zhang, Tao Luo, Shaoli Liu, Yunji Chen
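
The three parts named in this abstract (domain conversion, slope/intercept lookup, linear fitting) can be illustrated in software. The following is only a minimal sketch, not the patented device: the function names, the clamp used as the "domain conversion", the 64-segment table, and the choice of a sigmoid as the nonlinear function are all assumptions made for illustration.

```python
import math

def build_table(f, lo, hi, segments):
    """Precompute a (slope, intercept) pair for each equal-width segment of [lo, hi]."""
    width = (hi - lo) / segments
    table = []
    for i in range(segments):
        x0, x1 = lo + i * width, lo + (i + 1) * width
        slope = (f(x1) - f(x0)) / (x1 - x0)
        table.append((slope, f(x0) - slope * x0))
    return table

def approx(x, table, lo, hi):
    """Domain conversion (assumed here to be a clamp), table lookup, one multiply-add."""
    x = min(max(x, lo), hi)
    width = (hi - lo) / len(table)
    index = min(int((x - lo) / width), len(table) - 1)
    slope, intercept = table[index]
    return slope * x + intercept

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Hypothetical configuration: sigmoid on [-8, 8] with 64 segments.
LO, HI = -8.0, 8.0
TABLE = build_table(sigmoid, LO, HI, 64)
```

At run time each evaluation costs one index computation, one table read, and one multiply-add, which is the trade the abstract describes: table storage in exchange for speed.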

Patent number: 10235136
Abstract: A binary logic circuit is provided for determining a rounded value of px/q, where p and q are coprime constant integers with p < q and q ≠ 2^i, i is any integer, and x is an integer variable between 0 and integer M where M ≥ 2q, the binary logic circuit implementing in hardware the optimal solution of the multiply-add operation (ax + b)/2^k where a, b and k are fixed integers.
Type: Grant
Filed: May 26, 2017
Date of Patent: March 19, 2019
Assignee: Imagination Technologies Limited
Inventor: Thomas Rose
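
To make the idea of replacing a constant division by a multiply-add and shift concrete, here is a brute-force sketch. It is not the method claimed in the patent: the function name is hypothetical, "rounded" is assumed to mean rounding down (floor), and "optimal" is assumed to mean the smallest shift k.

```python
def find_multiply_add(p, q, max_x, max_k=32):
    """Search for integers (a, b, k), smallest k first, such that
    (a*x + b) >> k == floor(p*x / q) for every integer x in [0, max_x]."""
    for k in range(max_k + 1):
        base = (p << k) // q
        for a in (base, base + 1):          # candidates nearest to p * 2**k / q
            for b in range(1 << k):
                if all((a * x + b) >> k == (p * x) // q
                       for x in range(max_x + 1)):
                    return a, b, k
    return None

# Example: constants that divide any 8-bit value by 3 without a divider.
a, b, k = find_multiply_add(1, 3, 255)
```

The exhaustive check is only practical for small ranges; it is meant to show that such fixed constants exist, not how a hardware generator would derive them.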

Patent number: 10169874
Abstract: A target object may be identified by estimating a distribution of a plurality of orientations of a periphery of a target object, and identifying the target object based on the distribution.
Type: Grant
Filed: May 30, 2017
Date of Patent: January 1, 2019
Assignee: International Business Machines Corporation
Inventors: Hiroki Nakano, Yasushi Negishi, Masaharu Sakamato, Taro Sekiyama, Kun Zhao

Patent number: 10133553
Abstract: A reciprocal unit for computing an estimated reciprocal of a number represented by a bit string. The unit comprises a first lookup table configured to receive one or more of the bits in the bit string and to output an initial estimate of the reciprocal of the number. The unit further comprises a second lookup table configured to receive one or more of the bits in the bit string and to output the square of the initial estimate of the reciprocal of the number. The unit still further comprises a multiplier circuit configured to multiply the square of the initial estimate by the number, and an adder-subtractor circuit for subtracting the product of the multiplication from a scaled value of the initial estimate to determine a final estimate of the reciprocal of the number.
Type: Grant
Filed: February 20, 2016
Date of Patent: November 20, 2018
Assignee: The Regents of The University of Michigan
Inventors: Zhengya Zhang, Chia-Hsiang Chen
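
The two-table structure in this abstract matches one Newton-Raphson step written as 2*x0 - d*x0^2, with x0 and x0^2 both read from tables. The sketch below assumes that reading; the scale factor 2, the 6-bit index, the midpoint table entries, and all names are illustrative assumptions rather than details from the patent.

```python
INDEX_BITS = 6  # assumed table size: 2**6 entries

def _midpoint(i):
    """Centre of the i-th sub-interval of [1, 2)."""
    return 1.0 + (i + 0.5) / (1 << INDEX_BITS)

# First table: initial estimate x0 of 1/d. Second table: the square of x0.
RECIP_LUT = [1.0 / _midpoint(i) for i in range(1 << INDEX_BITS)]
SQUARE_LUT = [r * r for r in RECIP_LUT]

def reciprocal(d):
    """Estimate 1/d for 1 <= d < 2 as 2*x0 - d*x0**2."""
    if not 1.0 <= d < 2.0:
        raise ValueError("d must be normalized to [1, 2)")
    index = int((d - 1.0) * (1 << INDEX_BITS))  # leading fraction bits of d
    x0 = RECIP_LUT[index]
    x0_squared = SQUARE_LUT[index]
    # One multiplication (d * x0_squared) and one subtraction from the scaled estimate.
    return 2.0 * x0 - d * x0_squared
```

Storing the square in a second table removes one multiplication from the critical path, so the refinement needs a single multiply followed by a subtract.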

Patent number: 9645974
Abstract: The present disclosure relates to optimized matrix multiplication using vector multiplication of interleaved matrix values. Two matrices to be multiplied are organized into specially ordered vectors, which are multiplied together to produce a portion of a product matrix.
Type: Grant
Filed: March 11, 2015
Date of Patent: May 9, 2017
Assignee: Google Inc.
Inventors: Nishant Patil, Matthew Sarett, Rama Krishna Govindaraju, Benoit Steiner, Vincent O. Vanhoucke

Patent number: 9524783
Abstract: An apparatus, system, and method for controlling data transfer to an output port of a serial data link interface in a semiconductor memory is disclosed. In one example, a flash memory device may have multiple serial data links, multiple memory banks and control input ports that enable the memory device to transfer the serial data to a serial data output port of the memory device. In another example, a flash memory device may have a single serial data link, a single memory bank, a serial data input port, and a control input port for receiving output enable signals. The flash memory devices may be cascaded in a daisy-chain configuration using echo signal lines to serially communicate between memory devices.
Type: Grant
Filed: December 30, 2015
Date of Patent: December 20, 2016
Assignee: Conversant Intellectual Property Management Inc.
Inventors: Hakjune Oh, Hong Beom Pyeon, Jin-Ki Kim

Patent number: 9116747
Abstract: Systems and methods are provided for implementing a sparse deterministic direct solver. The deterministic direct solver is configured to identify at least one task for each of a plurality of dense blocks, identify operations on which the tasks are dependent, store in a first data structure an entry for each of the dense blocks identifying whether a precondition must be satisfied before tasks associated with the dense blocks can be initiated, store in a second data structure a status value for each of the dense blocks that is changeable by multiple threads, and assign the tasks to a plurality of threads, wherein the threads execute their assigned task when the status of the dense block corresponding to their assigned task indicates that the assigned task is ready to be performed and the precondition associated with the dense block has been satisfied if the precondition exists.
Type: Grant
Filed: June 20, 2012
Date of Patent: August 25, 2015
Assignee: SAS Institute Inc.
Inventor: Alexander Andrianov

Patent number: 8990278
Abstract: Methods and circuitry for evaluating reciprocal, square root, inverse square root, logarithm, and exponential functions of an input value, Y. In one embodiment, an approximate value, RA, of the reciprocal of Y is generated. One Newton-Raphson iteration is performed as a function of RA and Y, resulting in a truncated approximate value, R. R is multiplied by Y and 1 is subtracted, resulting in a reduced argument, A. A Taylor series evaluation of A is performed, resulting in an evaluated argument, B. B is multiplied by a post-processing factor for the final result.
Type: Grant
Filed: October 17, 2011
Date of Patent: March 24, 2015
Assignee: Xilinx, Inc.
Inventor: Christopher M. Clegg
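
The sequence RA, R, A, B described in this abstract can be traced for the reciprocal case, where 1/Y = R/(1 + A) and the Taylor series of 1/(1 + A) is 1 - A + A^2 - ... The sketch below is an illustration under assumptions: the seed construction, the number of series terms, and the use of R as the post-processing factor are chosen for this example and are not taken from the patent, and R is not truncated here.

```python
def reciprocal_via_reduced_argument(y, seed_bits=4, terms=3):
    """Estimate 1/y for 1 <= y < 2 using a seed, one Newton-Raphson step,
    a reduced argument, and a short Taylor series."""
    if not 1.0 <= y < 2.0:
        raise ValueError("y must be normalized to [1, 2)")
    # Crude seed RA: reciprocal of y truncated to a few fraction bits
    # (a stand-in for a small lookup table).
    scale = 1 << seed_bits
    ra = 1.0 / (int(y * scale) / scale)
    # One Newton-Raphson iteration refines RA into R.
    r = ra * (2.0 - y * ra)
    # Reduced argument A = R*Y - 1, which is close to zero.
    a = r * y - 1.0
    # Taylor series of 1/(1 + A) about A = 0, truncated to `terms` terms.
    b = sum((-a) ** n for n in range(terms))
    # Assumed post-processing factor for the reciprocal case: R itself.
    return r * b
```

Because A is already small after the Newton-Raphson step, a few series terms are enough; other functions in the abstract's list would use a different series and a different post-processing factor.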

Patent number: 8965946
Abstract: A data processing apparatus and method are provided for performing a reciprocal operation on an input value d to produce a result value X. The reciprocal operation involves iterative execution of a refinement step to converge on the result value, the refinement step performing the computation: X_i = X_{i-1} * M, where X_i is an estimate of the result value for the i-th iteration of the refinement step, and M is a value determined by a portion of the refinement step. The data processing apparatus comprises a register data store having a plurality of registers operable to store data, and processing logic operable to execute instructions to perform data processing operations on data held in the register data store.
Type: Grant
Filed: July 19, 2011
Date of Patent: February 24, 2015
Assignee: ARM Limited
Inventors: David Raymond Lutz, Christopher Neal Hinds
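
A refinement step of the form "new estimate = previous estimate times M" is the shape of a Newton-Raphson reciprocal iteration. The sketch below assumes the standard choice M = 2 - d * X for the previous estimate X; the abstract does not state how M is computed, and the function name and iteration count are hypothetical.

```python
def refine_reciprocal(d, x0, iterations=3):
    """Iteratively refine an initial estimate x0 of 1/d."""
    x = x0
    for _ in range(iterations):
        m = 2.0 - d * x   # assumed form of the value M (Newton-Raphson)
        x = x * m         # refinement step: new estimate = old estimate * M
    return x
```

With this choice the relative error is squared on every pass, so a rough starting estimate such as 0.3 for d = 3 reaches double-precision accuracy in four iterations.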

Patent number: 8838663
Abstract: A new function for calculating the reciprocal residual of a floating-point number X is defined as recip_residual(X) = 1 - X*recip(X), where recip(X) represents the reciprocal of X. The function may be implemented using a fused multiply-add unit in a processor. The reciprocal value of X, recip(X), may be obtained from a lookup table. The recip_residual function may help reduce the latency of many multiplicative functions that are based on products of multiple numbers and can be expressed in simple terms of functions on each individual number (e.g., log(U*V) = log(U) + log(V)).
Type: Grant
Filed: March 30, 2007
Date of Patent: September 16, 2014
Assignee: Intel Corporation
Inventors: Ping Tak Peter Tang, Robert Cavin
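
The log(U*V) = log(U) + log(V) remark suggests how the residual is used: since X * recip(X) = 1 - E, where E is the residual, log(X) = -log(recip(X)) + log(1 - E). The sketch below works through that identity. The low-precision `recip` stand-in for the lookup table and the function `log_via_residual` are assumptions for illustration, and plain Python arithmetic is used in place of a fused multiply-add.

```python
import math

def recip(x, bits=8):
    """Stand-in for a table lookup: 1/x rounded to a few fraction bits (assumed)."""
    scale = 1 << bits
    return round(scale / x) / scale

def recip_residual(x):
    """1 - x*recip(x); hardware would compute this with one fused multiply-add."""
    return 1.0 - x * recip(x)

def log_via_residual(x):
    """log(x) = -log(recip(x)) + log(1 - residual), for x in [1, 2)."""
    r = recip(x)
    e = recip_residual(x)
    # In a real routine -log(r) would come from a table indexed like recip,
    # and log(1 - e) from a short polynomial, because e is tiny.
    return -math.log(r) + math.log1p(-e)
```

The residual is small by construction, which is what lets the second term be evaluated with a short series.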

Patent number: 8711146
Abstract: Methods and apparatuses for constructing a multilevel solver, comprising decomposing a graph into a plurality of pieces, wherein each of the pieces has a plurality of edges and a plurality of interface nodes, and wherein the interface nodes in the graph are fewer in number than the edges in the graph; producing a local preconditioner for each of the pieces; and aggregating the local preconditioners to form a global preconditioner.
Type: Grant
Filed: November 29, 2007
Date of Patent: April 29, 2014
Assignee: Carnegie Mellon University
Inventors: Gary Lee Miller, Ioannis Koutis

Patent number: 8706789
Abstract: In one embodiment, the present invention includes a method for receiving a reciprocal instruction and an operand in a processor, accessing an entry of a lookup table based on a portion of the operand and the instruction, generating an encoder output based on a type of the reciprocal instruction and whether the reciprocal instruction is a legacy instruction, and selecting portions of the lookup table entry and input operand to be provided to a reciprocal logic unit based on the encoder output. Other embodiments are described and claimed.
Type: Grant
Filed: December 22, 2010
Date of Patent: April 22, 2014
Assignee: Intel Corporation
Inventors: Zeev Sperber, Cristina S. Anderson, Benny Eitan, Simon Rubanovich, Amit Gradstein

Patent number: 8639737
Abstract: Approximations of reciprocal square roots are provided in IEEE floating point binary format by obtaining an index from an input value, accessing a pair of table values and performing a limited number of simple and rapidly performed manipulations. The maximum relative error in the approximation thus provided is less than 0.75/2^(2k+1) as compared with a maximum relative error of 1/2^(k+2) of known methods, where 2^k is the number of table entries.
Type: Grant
Filed: March 28, 2008
Date of Patent: January 28, 2014
Assignee: International Business Machines Corporation
Inventor: James B. Shearer

Patent number: 8601047
Abstract: A decimal floating-point (DFP) adder includes a decimal leading-zero anticipator (LZA). The DFP adder receives DFP operands. Each operand includes a significand, an exponent, a sign bit and a leading zero count for the significand. The DFP adder adds or subtracts the DFP operands to obtain a DFP result. The LZA determines the leading zero count associated with the significand of the DFP result. The LZA operates at least partially in parallel with circuitry (in the DFP adder) that computes the DFP result. The LZA does not wait for that circuitry to finish computation of the DFP result. Instead it “anticipates” the number of leading zeros that the result's significand will contain.
Type: Grant
Filed: June 13, 2013
Date of Patent: December 3, 2013
Assignee: Advanced Micro Devices
Inventor: Liang-Kai Wang

Patent number: 8489663
Abstract: A decimal floating-point (DFP) adder includes a decimal leading-zero anticipator (LZA). The DFP adder receives DFP operands. Each operand includes a significand, an exponent, a sign bit and a leading zero count for the significand. The DFP adder adds or subtracts the DFP operands to obtain a DFP result. The LZA determines the leading zero count associated with the significand of the DFP result. The LZA operates at least partially in parallel with circuitry (in the DFP adder) that computes the DFP result. The LZA does not wait for that circuitry to finish computation of the DFP result. Instead it “anticipates” the number of leading zeros that the result's significand will contain.
Type: Grant
Filed: June 5, 2009
Date of Patent: July 16, 2013
Assignee: Advanced Micro Devices
Inventor: Liang-Kai Wang

Patent number: 8301680
Abstract: A method and apparatus for reducing memory required to store reciprocal approximations as specified in Institute of Electrical and Electronic Engineers (IEEE) standards such as IEEE 754 is presented. Monotonic properties of the reciprocal function are used to bound groups of values. Efficient bitvectors are used to represent information in groups resulting in a very compact table representation about four times smaller than storing all of the reciprocal approximations in a table.
Type: Grant
Filed: December 23, 2007
Date of Patent: October 30, 2012
Assignee: Intel Corporation
Inventors: Vinodh Gopal, Gilbert M. Wolrich, Wajdi K. Feghali

Publication number: 20120166509
Abstract: In one embodiment, the present invention includes a method for receiving a reciprocal instruction and an operand in a processor, accessing an entry of a lookup table based on a portion of the operand and the instruction, generating an encoder output based on a type of the reciprocal instruction and whether the reciprocal instruction is a legacy instruction, and selecting portions of the lookup table entry and input operand to be provided to a reciprocal logic unit based on the encoder output. Other embodiments are described and claimed.
Type: Application
Filed: December 22, 2010
Publication date: June 28, 2012
Inventors: Zeev Sperber, Cristina S. Anderson, Benny Eitan, Simon Rubanovich, Amit Gradstein

Publication number: 20110276614
Abstract: A data processing apparatus and method are provided for performing a reciprocal operation on an input value d to produce a result value X. The reciprocal operation involves iterative execution of a refinement step to converge on the result value, the refinement step performing the computation: X_i = X_{i-1} * M, where X_i is an estimate of the result value for the i-th iteration of the refinement step, and M is a value determined by a portion of the refinement step. The data processing apparatus comprises a register data store having a plurality of registers operable to store data, and processing logic operable to execute instructions to perform data processing operations on data held in the register data store.
Type: Application
Filed: July 19, 2011
Publication date: November 10, 2011
Applicant: ARM Limited
Inventors: David Raymond Lutz, Christopher Neal Hinds

Patent number: 8015228
Abstract: A data processing apparatus and method are provided for performing a reciprocal operation on an input value d to produce a result value X. The reciprocal operation involves iterative execution of a refinement step to converge on the result value, the refinement step performing the computation: X_i = X_{i-1} * M, where X_i is an estimate of the result value for the i-th iteration of the refinement step, and M is a value determined by a portion of the refinement step. The data processing apparatus comprises a register data store having a plurality of registers operable to store data, and processing logic operable to execute instructions to perform data processing operations on data held in the register data store.
Type: Grant
Filed: February 16, 2005
Date of Patent: September 6, 2011
Assignee: ARM Limited
Inventors: David Raymond Lutz, Christopher Neal Hinds

Patent number: 7899859
Abstract: One embodiment of the present invention provides a system that performs both error-check and exact-check operations for a Newton-Raphson divide or square-root computation. During operation, the system performs Newton-Raphson iterations followed by a multiply for a divide or a square-root operation to produce a result, which includes one or more additional bits of accuracy beyond a desired accuracy for the result. Next, the system rounds the result to the desired accuracy to produce a rounded result t. The system then analyzes the additional bits of accuracy to determine whether t is correct and whether t is exact.
Type: Grant
Filed: December 20, 2005
Date of Patent: March 1, 2011
Assignee: Oracle America, Inc.
Inventors: Allen Lyu, Leonard D. Rarick

Patent number: 7747667
Abstract: A data processing apparatus and method generate an initial estimate of a result value that would be produced by performing a reciprocal operation on an input value. The input value and the result value are either fixed point values or floating point values. The data processing apparatus comprises processing logic for executing instructions to perform data processing operations on data, and a lookup table referenced by the processing logic during generation of the initial estimate of the result value. The processing logic is responsive to an estimate instruction to reference the lookup table to generate, dependent on a modified input value that is within a predetermined range of values, a table output value. For a particular modified input value, the same table output value is generated irrespective of whether the input value is a fixed point value or a floating point value. The initial estimate of the result value is then derivable from the table output value.
Type: Grant
Filed: February 16, 2005
Date of Patent: June 29, 2010
Assignee: ARM Limited
Inventors: David Raymond Lutz, Christopher Neal Hinds, Dominic Hugo Symes, Simon Andrew Ford

Patent number: 7634527
Abstract: In a first aspect, a first method of reciprocal estimate computation using floating point pipeline logic is provided. The first method includes the steps of (1) receiving an input value having an exponent and a mantissa when represented as a floating point number on which a reciprocal estimate computation is to be performed; (2) determining whether the exponent is one of a plurality of predetermined numbers; and (3) if the exponent is one of the plurality of predetermined numbers, adjusting at least one of a plurality of modified mantissa bits (e.g., mantissa bits internal to leading zero anticipator (LZA) logic) and the exponent so as to prevent an underflow result of the reciprocal estimate computation. Numerous other aspects are provided.
Type: Grant
Filed: November 17, 2005
Date of Patent: December 15, 2009
Assignee: International Business Machines Corporation
Inventors: Sherman Matthew Dance, Andrew Patrick Freemyer, Matthew Ray Tubbs

Publication number: 20090164543
Abstract: A method and apparatus for reducing memory required to store reciprocal approximations as specified in Institute of Electrical and Electronic Engineers (IEEE) standards such as IEEE 754 is presented. Monotonic properties of the reciprocal function are used to bound groups of values. Efficient bitvectors are used to represent information in groups resulting in a very compact table representation about four times smaller than storing all of the reciprocal approximations in a table.
Type: Application
Filed: December 23, 2007
Publication date: June 25, 2009
Inventors: Vinodh Gopal, Gilbert M. Wolrich, Wajdi K. Feghali

Publication number: 20080301213
Abstract: A division method includes determining a precision indicator for the division operation that indicates whether the quotient should be a single precision, double precision, or extended precision floating-point number. The division is performed at a rectangular multiplier using the Goldschmidt or Newton-Raphson algorithm. Each algorithm calculates one or more intermediate values in order to determine the quotient. For example, the Goldschmidt algorithm calculates a complement of a product of the dividend and an estimate of the reciprocal of the divisor. The quotient is determined based on a portion of one or more of these intermediate values. Because only a portion of the intermediate value is used, the division can be performed efficiently at the rectangular multiplier, and therefore the quotient can be determined more quickly and still achieve the desired level of precision.
Type: Application
Filed: June 1, 2007
Publication date: December 4, 2008
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: Michael J. Schulte, Carl E. Lemonds, Jr., Dimitri Tan
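
For readers unfamiliar with the Goldschmidt algorithm named here, the following sketch shows its textbook form in floating point. It does not model the rectangular multiplier or the truncated intermediate values described in the abstract. The linear seed, the iteration counts per precision, and all names are assumptions chosen for this example.

```python
# Hypothetical mapping from a precision indicator to an iteration count.
# With the seed below (error at most 1/17), the error after n iterations
# is at most (1/17) ** (2 ** n).
ITERATIONS = {"single": 3, "double": 4}

def goldschmidt_divide(n, d, precision="double"):
    """Compute n / d for a divisor normalized to 1 <= d < 2."""
    if not 1.0 <= d < 2.0:
        raise ValueError("d must be normalized to [1, 2)")
    r = 24.0 / 17.0 - (8.0 / 17.0) * d   # linear estimate of 1/d on [1, 2)
    q = n * r                             # running quotient
    den = d * r                           # running denominator, close to 1
    for _ in range(ITERATIONS[precision]):
        f = 2.0 - den                     # complement factor
        q *= f                            # numerator and denominator are scaled
        den *= f                          # by the same factor each pass
    return q
```

The two multiplications inside the loop are independent of each other, which is why Goldschmidt division pipelines well on a hardware multiplier.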

Publication number: 20080243985
Abstract: A new function for calculating the reciprocal residual of a floating-point number X is defined as recip_residual(X) = 1 - X*recip(X), where recip(X) represents the reciprocal of X. The function may be implemented using a fused multiply-add unit in a processor. The reciprocal value of X, recip(X), may be obtained from a lookup table. The recip_residual function may help reduce the latency of many multiplicative functions that are based on products of multiple numbers and can be expressed in simple terms of functions on each individual number (e.g., log(U*V) = log(U) + log(V)).
Type: Application
Filed: March 30, 2007
Publication date: October 2, 2008
Inventors: Ping Tak Peter Tang, Robert Cavin

Publication number: 20080208945
Abstract: Approximations of reciprocal square roots are provided in IEEE floating point binary format by obtaining an index from an input value, accessing a pair of table values and performing a limited number of simple and rapidly performed manipulations. The maximum relative error in the approximation thus provided is less than 0.75/2^(2k+1) as compared with a maximum relative error of 1/2^(k+2) of known methods, where 2^k is the number of table entries.
Type: Application
Filed: March 28, 2008
Publication date: August 28, 2008
Inventor: James B. SHEARER

Patent number: 7406589
Abstract: High-precision floating-point function estimates are split in two instructions each: a low precision table lookup instruction and a linear interpolation instruction. Estimates of different functions can be implemented using this scheme: A separate table-lookup instruction is provided for each different function, while only a single interpolation instruction is needed, since the single interpolation instruction can perform the interpolation step for any of the functions to be estimated. Thus, significantly less overhead is incurred than would be incurred with specialized hardware, while still maintaining a uniform FPU latency, which allows for much simpler control logic.
Type: Grant
Filed: May 12, 2005
Date of Patent: July 29, 2008
Assignee: International Business Machines Corporation
Inventors: Sang Hoo Dhong, Gordon Clyde Fossum, Harm Peter Hofstee, Brad William Michael, Silvia Melitta Mueller, Hwa-Joon Oh

Patent number: 7124161
Abstract: Efficient implementation of arithmetic circuits in programmable logic devices by using Look-Up Tables (LUTs) to store precalculated values. A table lookup operation is performed in place of complex arithmetic operations. In this way, at the expense of a few LUTs, many logic elements can be saved. This approach is particularly applicable to circuits for calculating reciprocal values and circuits for performing the normalized LMS algorithm.
Type: Grant
Filed: October 31, 2005
Date of Patent: October 17, 2006
Assignee: Altera Corporation
Inventors: Chang Choo, Asher Hazanchuk

Patent number: 7117238
Abstract: A pipelined circuit configured to generate a Taylor's series approximation of at least one function, preferably at least one of the reciprocal and the reciprocal square root, of an input value. The circuit is preloaded with or configured to generate a predetermined set of Taylor's series coefficients for each segment of the input value range. Other aspects of the invention are methods for determining preferred parameters for elements of such a circuit, a circuit designed in accordance with such a method, and a system (e.g., a pipelined graphics processor) for and method of pipelined graphics data processing using any embodiment of the circuit. The preferred parameters are determined by minimizing the circuit's size subject to constraints on input and output value format and output accuracy, assuming a specific function to be approximated and a specific degree for the approximation but allowing variation of parameters such as coefficient width and number of input value range segments.
Type: Grant
Filed: September 19, 2002
Date of Patent: October 3, 2006
Assignee: NVIDIA Corporation
Inventors: Nicholas J. Foskett, Robert J. Prevett, Jr., Sean Treichler

Patent number: 7080112
Abstract: A method and apparatus allow the quick computation of an estimate of the reciprocal of a floating point number in IEEE format. A table with 2^k entries allows the computation of an estimate with 2×k+3 good bits. x is a floating point number in IEEE format for which a reciprocal approximation is to be computed. x̂ is a floating point number in IEEE format derived from x by leaving the sign bit unchanged, complementing the exponent bits, leaving the first k fraction bits unchanged, and complementing the remaining fraction bits. t is another floating point number in IEEE format found by using the first k bits of the fraction of x as an index into a table with 2^k entries. The product x̂×t computed with IEEE floating point arithmetic is an estimate of the reciprocal of x with 2×k+3 good bits (i.e., relative error less than 2^-(2×k+3)).
Type: Grant
Filed: November 13, 2002
Date of Patent: July 18, 2006
Assignee: International Business Machines Corporation
Inventor: James Bergheim Shearer

Patent number: 7027597
Abstract: A precomputation and dual-pass modular operation approach to implement encryption protocols efficiently in electronic integrated circuits is disclosed. An encrypted electronic message is received and another electronic message generated based on the encryption protocol. Two passes of Montgomery's method are used for a modular operation that is associated with the encryption protocol along with precomputation of a constant based on a modulus. The modular operation may be a modular multiplication or a modular exponentiation. Modular arithmetic may be performed using the residue number system (RNS) and two RNS bases with conversions between the two RNS bases. A minimal number of register files are used for the computations along with an array of multiplier circuits and an array of modular reduction circuits. The approach described allows for high throughput for large encryption keys with a relatively small number of logical gates.
Type: Grant
Filed: September 18, 2001
Date of Patent: April 11, 2006
Assignee: Cisco Technologies, Inc.
Inventors: Mihailo M. Stojancic, Mahesh S. Maddury, Kenneth J. Tomei
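
One common way to combine "two passes of Montgomery's method" with "a constant based on a modulus" is to precompute R^2 mod n and run Montgomery reduction twice, which yields a plain modular product. The sketch below shows that textbook construction on ordinary integers. It is an assumption that this corresponds to the patented approach, it does not use the residue number system, and all function names are hypothetical. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_setup(n, r_bits):
    """Precompute constants for an odd modulus n < 2**r_bits."""
    if n % 2 == 0 or n >= (1 << r_bits):
        raise ValueError("n must be odd and smaller than R = 2**r_bits")
    r = 1 << r_bits
    n_prime = (-pow(n, -1, r)) % r   # n * n_prime == -1 (mod R)
    r2 = (r * r) % n                 # constant derived from the modulus
    return n_prime, r2

def redc(t, n, r_bits, n_prime):
    """Montgomery reduction: returns t * R**-1 mod n for 0 <= t < n * R."""
    mask = (1 << r_bits) - 1
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> r_bits        # exact division by R
    return u - n if u >= n else u

def mod_mul(a, b, n, r_bits):
    """a * b mod n using two Montgomery passes and the precomputed R**2 mod n."""
    n_prime, r2 = montgomery_setup(n, r_bits)
    first = redc(a * b, n, r_bits, n_prime)        # a*b*R**-1 mod n
    return redc(first * r2, n, r_bits, n_prime)    # (a*b*R**-1)*R**2*R**-1 = a*b mod n
```

In practice the setup constants would be computed once per modulus and reused; they are recomputed inside `mod_mul` here only to keep the example short.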

Patent number: 7027598
Abstract: A precomputation and dual-pass modular operation approach to implement encryption protocols efficiently in electronic integrated circuits is disclosed. An encrypted electronic message is received and another electronic message generated based on the encryption protocol. Two passes of Montgomery's method are used for a modular operation that is associated with the encryption protocol along with precomputation of a constant based on a modulus. The modular operation may be a modular multiplication or a modular exponentiation. Modular arithmetic may be performed using the residue number system (RNS) and two RNS bases with conversions between the two RNS bases. A minimal number of register files are used for the computations along with an array of multiplier circuits and an array of modular reduction circuits. The approach described allows for high throughput for large encryption keys with a relatively small number of logical gates.
Type: Grant
Filed: September 19, 2001
Date of Patent: April 11, 2006
Assignee: Cisco Technology, Inc.
Inventors: Mihailo M. Stojancic, Mahesh S. Maddury, Kenneth J. Tomei

Patent number: 6963895Abstract: Methods and systems are provided for fast computation of reciprocal square root for floating-point numbers. A piecewise linear approximation of the result mantissa is computed in two cycles and used as the input to an iteration sequence that converges cubically. Three iterations produce a result with accuracy sufficient for computer graphic applications. The initial estimate and input operand are scaled to minimize final adjustments to the result mantissa and final exponent adjustments required by the algorithm are performed concurrently with any adjustment required by rounding. A pipelined implementation of the algorithm produces a result with a latency of 24 and a repeat rate of 21 clock cycles.Type: GrantFiled: May 1, 2000Date of Patent: November 8, 2005Assignee: Raza Microelectronics, Inc.Inventor: Mark H. Comstock

Patent number: 6952710Abstract: The present invention provides apparatus, methods, and computer program products for noniterative division and noniterative reciprocal generation. In one embodiment, the present invention uses a logic network that determines the bits of the quotient of a divisor and dividend by using a noniterative, (i.e., non trial and error), method. Further, in another embodiment, the present invention may determine the reciprocal of a number M by separating the number M into at least two numbers X, Y . . . Z so that M=X+Y+ . . . +Z. The reciprocal of M is computed according to an equation 1/M=F(X,Y . . . Z) or an approximation 1/M≈G(X,Y . . . Z), where the approximation gives the correct value of the inverse of M to a predetermined accuracy. In some embodiments, the apparatus uses an equation that exactly describes the reciprocal or instead, it may include one or more memories for storing lookup tables containing precalculated parts of the equation.Type: GrantFiled: June 11, 2001Date of Patent: October 4, 2005Inventors: Walter Eugene Pelton, K. Walt Herridge

Patent number: 6941334Abstract: A floating point unit includes a multiplier, an approximation circuit, and a control circuit coupled to the multiplier and the approximation circuit. The approximation circuit is configured to generate an approximation of a difference of the first result from the multiplier and a constant. The control circuit is configured to approximate a function specified by a floating point instruction provided to the floating point unit for execution using an approximation algorithm. The approximation algorithm comprises at least two iterations through the multiplier and optionally the approximation circuit. The control circuit is configured to correct the approximation from the approximation circuit from a first iteration of the approximation algorithm during a second iteration of the approximation algorithm by supplying a correction vector to the multiplier during the second iteration. The multiplier is configured to incorporate the correction vector into the first result during the second iteration.Type: GrantFiled: February 1, 2002Date of Patent: September 6, 2005Assignee: Broadcom CorporationInventors: Robert Rogenmoser, Michael C. Kim

Patent number: 6912559Abstract: The accuracy of approximating the reciprocal and the reciprocal square root of a number (N) is improved. Approximating the reciprocal of N includes: (a) estimating the reciprocal of N to produce an estimate (Xi); (b) determining a first intermediate result (IR1) according to the equation: IR1=1-N*Xi; (c) multiplying IR1 by Xi to produce a second intermediate result (IR2); and (d) adding Xi to IR2 to produce an approximation of the reciprocal of N. Approximating the reciprocal square root includes: (a) estimating the reciprocal square root of N to produce Xi; (b) multiplying Xi by N to produce IR1; (c) determining IR2 according to the equation: IR2=(1-Xi*IR1)/2; (d) multiplying IR2 by Xi to produce a third intermediate result (IR3); and (e) adding IR3 to Xi to produce an approximation of the reciprocal square root of the number.Type: GrantFiled: July 30, 1999Date of Patent: June 28, 2005Assignee: MIPS Technologies, Inc.Inventors: Yingwai Ho, Michael J. Schulte, John L. Kelley
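
The two refinement sequences in the abstract above can be written out step by step. The Python sketch below follows the lettered steps, taking IR1 = 1 - N*Xi for the reciprocal and IR2 = (1 - Xi*IR1)/2 for the reciprocal square root; the function names are hypothetical, and ordinary double-precision arithmetic stands in for whatever hardware data path the patent describes.

```python
def refine_reciprocal(n: float, xi: float) -> float:
    """One refinement of an estimate xi of 1/n."""
    ir1 = 1.0 - n * xi  # step (b): first intermediate result
    ir2 = ir1 * xi      # step (c): second intermediate result
    return xi + ir2     # step (d): improved approximation of 1/n


def refine_rsqrt(n: float, xi: float) -> float:
    """One refinement of an estimate xi of 1/sqrt(n)."""
    ir1 = xi * n                  # step (b)
    ir2 = (1.0 - xi * ir1) / 2.0  # step (c)
    ir3 = ir2 * xi                # step (d)
    return xi + ir3               # step (e): improved approximation


if __name__ == "__main__":
    print(refine_reciprocal(3.0, 0.3))  # closer to 1/3 than 0.3 is
    print(refine_rsqrt(4.0, 0.45))      # closer to 0.5 than 0.45 is
```

Both sequences are algebraically one Newton-Raphson step, so each application roughly squares the relative error of a reasonable starting estimate.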

Patent number: 6769006Abstract: A method and apparatus for the calculation of the reciprocal of a normalized mantissa M for a floating-point input number D. A formula for determining the minimum size for the lookup table in accordance with the required precision is provided, as well as formulas for calculating lookup table entries. The lookup table stores the initiation approximations and the correction coefficients, which are addressed by the corresponding number of the mantissa's most significant bits and used to obtain the initial approximation of the reciprocal by means of linear interpolation requiring one subtraction operation and one multiplication operation. The result of the linear interpolation may be fed to a Newton-Raphson iteration device requiring, for each iteration, two multiplication operations and one two's complement operation, thereby doubling the precision of the reciprocal.Type: GrantFiled: February 14, 2001Date of Patent: July 27, 2004Assignee: Sicon Video CorporationInventors: Alexei Krouglov, Jie Zhou, Daniel Gudmunson
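
A minimal sketch of the general scheme described above: a table indexed by the mantissa's leading bits, a linear interpolation step, then Newton-Raphson refinement. The patent's own formulas for table size and table entries are not reproduced in the abstract, so the entries here (the reciprocal at each interval's left endpoint and the chord slope) are an assumption, as are the index width `K` and all function names.

```python
K = 4  # hypothetical number of mantissa bits used as the table index


def build_table(k: int = K):
    """Return (initial value, correction coefficient) pairs for M in [1, 2)."""
    step = 1.0 / (1 << k)
    table = []
    for i in range(1 << k):
        lo = 1.0 + i * step
        hi = lo + step
        y_lo = 1.0 / lo
        slope = (y_lo - 1.0 / hi) / step  # assumed entry: chord slope, positive
        table.append((y_lo, slope))
    return table


def reciprocal_mantissa(m: float, table, k: int = K, iterations: int = 1) -> float:
    """Approximate 1/m for a normalized mantissa 1 <= m < 2."""
    if not 1.0 <= m < 2.0:
        raise ValueError("mantissa must be in [1, 2)")
    idx = int((m - 1.0) * (1 << k))          # leading k fraction bits
    y0, slope = table[idx]
    frac = m - (1.0 + idx / (1 << k))        # remaining low-order part
    y = y0 - slope * frac                    # one multiplication, one subtraction
    for _ in range(iterations):
        y = y * (2.0 - m * y)                # two multiplications per iteration
    return y


if __name__ == "__main__":
    tbl = build_table()
    print(reciprocal_mantissa(1.3, tbl), 1.0 / 1.3)
```

In fixed-point hardware the term `2.0 - m * y` would typically be formed by a two's complement of the product, which matches the operation count in the abstract; here it is an ordinary floating-point subtraction.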

Publication number: 20040139137Abstract: Generally, a method and apparatus are provided for computing a matrix inverse square root of a given positive-definite Hermitian matrix, K. The disclosed technique for computing an inverse square root of a matrix may be implemented, for example, by the noise whitener of a MIMO receiver. Conventional noise whitening algorithms whiten a nonwhite vector, X, by applying a matrix, Q, to X, such that the resulting vector, Y, equal to Q·X, is a white vector. Thus, the noise whitening algorithms attempt to identify a matrix, Q, that when multiplied by the nonwhite vector, will convert the vector to a white vector. The disclosed iterative algorithm determines the matrix, Q, given the covariance matrix, K. The disclosed matrix inverse square root determination process initially establishes an initial matrix, Q0, by multiplying an identity matrix by a scalar value and then continues to iterate and compute another value of the matrix, Qn+1, until a convergence threshold is satisfied.Type: ApplicationFiled: January 10, 2003Publication date: July 15, 2004Inventors: Laurence Eugene Mailaender, Jack Salz, Sivarama Krishnan Venkatesan

Publication number: 20040093367Abstract: A method and apparatus allows the quick computation of an estimate of the reciprocal of a floating point number in IEEE format. A table with 2^k entries allows the computation of an estimate with 2×k+3 good bits. x is a floating point number in IEEE format for which a reciprocal approximation is to be computed. x̂ is a floating point number in IEEE format derived from x by leaving the sign bit unchanged, complementing the exponent bits, leaving the first k fraction bits unchanged, and complementing the remaining fraction bits. t is another floating point number in IEEE format found by using the first k bits of the fraction of x as an index into a table with 2^k entries. The product x̂×t computed with IEEE floating point arithmetic is an estimate of the reciprocal of x with 2×k+3 good bits (i.e., relative error less than 2^−(2×k+3)).Type: ApplicationFiled: November 13, 2002Publication date: May 13, 2004Inventor: James Bergheim Shearer

Patent number: 6732134Abstract: Operations that involve denormalized numbers are handled by restructuring the input values for an operation as normalized numbers, and performing calculations on the normalized numbers. As a first step in the process of performing an operation, a determination is made whether input values for the operation contain one or more denormalized numbers. For certain types of operations, a determination is made whether the input values are such that the output value from the operation will be a denormalized number. For each operation in which either the input values or output values comprise a denormalized number, the input values are scaled to produce values that are not denormalized. Once the appropriate factoring has been carried out, the requested operation is performed, using normalized numbers, to produce an intermediate result which is then adjusted to account for the initial scaling.Type: GrantFiled: September 11, 2000Date of Patent: May 4, 2004Assignee: Apple Computer, Inc.Inventors: Alexander Rosenberg, Ali Sazegari
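
The scale, operate, then adjust pattern in the abstract above can be illustrated for a single multiplication. This Python sketch is an assumption-laden illustration rather than the patented method: the scale exponent `SCALE_EXP`, the function names, and the choice of multiplication as the operation are all hypothetical, and Python's own floats already handle denormals, so the scaling is shown only to make the steps visible.

```python
import math
import sys

SCALE_EXP = 64  # hypothetical power-of-two scale applied to denormal inputs


def is_denormal(x: float) -> bool:
    """True for nonzero values below the smallest normalized double."""
    return x != 0.0 and abs(x) < sys.float_info.min


def scaled_multiply(x: float, y: float) -> float:
    """Multiply x by y, pre-scaling x when it is denormal."""
    if not is_denormal(x):
        return x * y
    x_scaled = math.ldexp(x, SCALE_EXP)          # exact: now a normalized number
    intermediate = x_scaled * y                  # operation on normalized inputs
    return math.ldexp(intermediate, -SCALE_EXP)  # undo the initial scaling


if __name__ == "__main__":
    tiny = 3 * 5e-324  # a denormal double
    print(scaled_multiply(tiny, 1e300), tiny * 1e300)
```

Scaling by a power of two changes only the exponent, so when the final result is a normalized number the adjusted value matches a direct multiplication.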

Polynomial inverse computing apparatus, multiplier apparatus and polynomial inverse computing method
Publication number: 20040064495Abstract: A polynomial inverse computing apparatus comprises first to sixth registers, a left shift unit, first and second exclusive-OR units, a doubling computing unit which executes doubling computation in an extension field with characteristic 2, a halving computing unit which executes halving computation in the extension field of characteristic 2, a determination unit which determines whether or not a content of each register is 0, a decrement unit which decrements the content of each register, an increment unit which increments the content of each register.Type: ApplicationFiled: September 15, 2003Publication date: April 1, 2004Inventors: Hideo Shimizu, Atsushi Shimbo

Patent number: 6665693Abstract: A digital signal system (10, 100) for determining an approximate reciprocal of a value of x. The system includes an input (12) for receiving a signal, and circuitry (18) for measuring an attribute of the signal. The measured attribute relates at least in part to the value of x. The system further includes circuitry (104) for identifying a bounded region within which x falls. The bounded region is one of a plurality of bounded regions, and each bounded region has a corresponding slope value and first and second endpoints. The system further includes circuitry (106, 108, 110) for determining the approximate reciprocal by adjusting a reciprocal value at one of the first and second endpoints by a measure equal to a distance of the value of x from the one of the first and second endpoints times the slope value corresponding to the bounded region within which x is identified as falling.Type: GrantFiled: November 22, 1999Date of Patent: December 16, 2003Assignee: Texas Instruments IncorporatedInventor: Rustin W. Allred
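
The region-based approximation described above amounts to piecewise linear interpolation of 1/x. In the Python sketch below the region endpoints in `BOUNDS` are hypothetical (the abstract does not give them), the slope of each region is taken as the chord slope between its endpoints, and the adjustment is always made from the first endpoint; the signal-measurement circuitry is not modeled.

```python
from bisect import bisect_right

# Hypothetical region endpoints; each adjacent pair bounds one region.
BOUNDS = [1.0, 2.0, 4.0, 8.0, 16.0]

# For each region: (first endpoint, reciprocal at that endpoint, slope).
REGIONS = [
    (a, 1.0 / a, (1.0 / b - 1.0 / a) / (b - a))
    for a, b in zip(BOUNDS, BOUNDS[1:])
]


def approx_reciprocal(x: float) -> float:
    """Approximate 1/x for x within the covered range."""
    if not BOUNDS[0] <= x <= BOUNDS[-1]:
        raise ValueError("x outside the covered range")
    # Identify the bounded region within which x falls.
    i = min(bisect_right(BOUNDS, x) - 1, len(REGIONS) - 1)
    a, recip_a, slope = REGIONS[i]
    # Adjust the endpoint reciprocal by distance times slope.
    return recip_a + slope * (x - a)


if __name__ == "__main__":
    print(approx_reciprocal(3.0), 1.0 / 3.0)
```

Power-of-two endpoints keep the relative error the same in every region; narrower regions would reduce it at the cost of a larger table.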

Patent number: 6654777Abstract: A floating point inverse square root circuit is disclosed. The circuit is configured to receive a floating point value comprised of a sign bit, an exponent field, and a mantissa field. The inverse square root circuit includes a lookup table configured to receive at least a portion of the floating point value and further configured to generate an initial approximation (x0) of the inverse square root of the floating point value from the received portion of the floating point value. The inverse square root circuit further includes a first estimation circuit that receives the initial approximation from the lookup table and at least a portion of a value L derived from the floating point value mantissa field (M) and further configured to produce a first approximation (x1) of the floating point value's inverse square root based upon L and x0 where x1 is a more accurate estimate of the inverse square root than x0.Type: GrantFiled: July 27, 2000Date of Patent: November 25, 2003Assignee: International Business Machines CorporationInventors: Gordon Clyde Fossum, Thomas Winters Fox