Multiplication Followed By Addition Patents (Class 708/501)
-
Patent number: 6615341Abstract: A digital signal processor (DSP) employs a variable-length instruction set. A portion of the variable-length instructions may be stored in adjacent locations within memory space with the beginning and ending of instructions occurring across memory word boundaries. The instructions may contain variable numbers of instruction fragments. Each instruction fragment causes a particular operation, or operations, to be performed allowing multiple operations during each clock cycle. The DSP includes multiple data buses, and in particular three data buses. The DSP may also use a register bank that has registers accessible by at least two processing units, allowing multiple operations to be performed on a particular set of data by the multiple processing units, without reading and writing the data to and from a memory. an instruction fetch unit that receives instructions of variable length stored in an instruction memory. An instruction memory may advantageously be separate from the three data memories.Type: GrantFiled: June 5, 2001Date of Patent: September 2, 2003Assignee: Qualcomm, Inc.Inventors: Gilbert C. Sih, Qiuzhen Zou, Inyup Kang, Quaeed Motiwala, Deepu John, Li Zhang, Haitao Zhang, Way-Shing Lee, Charles E. Sakamaki, Prashant A. Kantak, Sanjay K. Jha, Jian Lin
-
Patent number: 6606700Abstract: The invention is a digital signal processor architecture that is designed to speed up frequently-used signal processing computations, such as FIR filters, correlations, FFTs, and DFTs. The architecture uses a coupled dual-MAC architecture (MAC1), (MAC2) and attaches a dual-MAC coprocessor (MAC3), (MAC4) onto it in a unique way to achieve a significant increase in processing capability.Type: GrantFiled: February 26, 2000Date of Patent: August 12, 2003Assignee: Qualcomm, IncorporatedInventors: Gilbert C. Sih, Hemant Kumar, Way-Shing Lee
-
Publication number: 20030126174Abstract: To perform a product-sum operation by adding third data to a product of first data and second data, a floating point multiplier first multiplies the first data by the second data, and a bit string representing a fixed-point part in the multiplication result is divided into a portion representing more significant digits in the fixed-point part and a portion representing less significant digits in the fixed-point part. Then, a floating point adder first adds less significant multiplication result data having a bit string representing the less significant digits as a fixed-point part to the third data, and then adds the addition result to more significant multiplication result data having a bit string representing the more significant digits as a fixed-point part. A rounding process is performed on the two addition results to obtain a result of the product-sum operation.Type: ApplicationFiled: March 29, 2002Publication date: July 3, 2003Applicant: Fujitsu LimitedInventor: Shiro Kawata
-
Patent number: 6584482Abstract: A multiplier array processing system which improves the utilization of the multiplier and adder array for lower-precision arithmetic is described. New instructions are defined which provide for the deployment of additional multiply and add operations as a result of a single instruction, and for the deployment of greater multiply and add operands as the symbol size is decreased.Type: GrantFiled: August 19, 1999Date of Patent: June 24, 2003Assignee: Microunity Systems Engineering, Inc.Inventors: Craig C. Hansen, Henry Massalin
-
Patent number: 6571266Abstract: A floating-point multiply accumulate method acquiring a final mantissa result comprises comparing exponents of (A*B) and C. Transferring part of the C mantissa to a CHI register. Shifting any part of the C mantissa which overlaps the range of the (A*B) mantissa to align the bits of the (A*B) and C mantissas. Adding the shifted part of the C mantissa to the (A*B) mantissa. Shifting least significant bits corresponding to a number of bits transferred to the CHI register out of the Temp. Result. Mask merging bits of the C mantissa which were transferred to the CHI register with most significant bit positions of the shifted Temp. Result. Rounding this mantissa result to the first precision and acquiring L from an Lbit value of the CHI register or an Lbit value of the Temp. Result based on the bit value of the merge mask corresponding to the Lbit position.Type: GrantFiled: February 21, 2000Date of Patent: May 27, 2003Assignee: Hewlett-Packard Development Company, L.P.Inventor: Stephen L Bass
-
Patent number: 6553120Abstract: Method for the cryptography of data recorded on a medium usable by a computing unit in which the computing unit processes an input information x using a key for supplying an information F(x) encoded by a function F. The function uses a decorrelation module Mk such that F(x)=[F′(Mk)](x), in which K is a random key and F′ a cryptographic function. This Abstract is neither intended to define the invention disclosed in this specification nor intended to limit, in any manner, the scope of the invention.Type: GrantFiled: June 28, 1999Date of Patent: April 22, 2003Assignee: Centre National de la Recherche ScientifiqueInventor: Serge Vaudenay
-
Patent number: 6542915Abstract: Presented is a “high-order” Leading Zeros Anticipator or LZA circuit and specifically a five-input LZA. The prior-art two-input LZA circuit is part of almost all high-performance floating-point units or FPUs. The advantages of a high-order LZA (such as five-input) is that the LZA function may be started and finished sooner in the floating point pipeline, and therefore allows more time for other functions in the pipeline. Therefore, a high-order LZA, such as five-input LZA, may be faster than the prior art two-input LZA designs. Thus, speeding up the LZA function in a floating point pipeline may significantly increase the speed in which the overall floating-point unit may operate as compared to the prior-art two input LZA designs and may additionally inspire new floating-point michroarchitectures which may yield further performance gains.Type: GrantFiled: June 17, 1999Date of Patent: April 1, 2003Assignee: International Business Machines CorporationInventors: Michael Thomas Dibrino, Faraydon Osman Karim
-
Patent number: 6542916Abstract: A data processing apparatus and method is provided for applying a floating-point multiply-accumulate operation to first, second and third operands. The apparatus comprises a multiplier for multiplying the second and third operands and applying rounding to produce a rounded multiplication result, and an adder for adding the rounded multiplication result to the first operand to generate a final result and for applying rounding to generate a rounded final result. Further, control logic is provided which is responsive to a first single instruction to control the multiplier and adder to cause the rounded final result generated by the adder to be equivalent to the subtraction of the rounded multiplication result from the first operand.Type: GrantFiled: July 28, 1999Date of Patent: April 1, 2003Assignee: Arm LimitedInventors: Christopher Neal Hinds, David Vivian Jaggar, David James Seal
-
Publication number: 20030041082Abstract: A circuit (10) for multiplying two floating point operands (A and C) while adding or subtracting a third floating point operand (B) removes latency associated with normalization and rounding from a critical speed path for dependent calculations. An intermediate representation of a product and a third operand are selectively shifted to facilitate use of prior unnormalized dependent resultants. Logic circuitry (24, 42) implements a truth table for determining when and how much shifting should be made to intermediate values based upon the a resultant of a previous calculation, upon exponents of current operands and an exponent of a previous resultant operand. Normalization and rounding may be subsequently implemented, but at a time when a new cycle operation is not dependent on such operations even if data dependencies exist.Type: ApplicationFiled: August 24, 2001Publication date: February 27, 2003Inventor: Michael Dibrino
-
Publication number: 20030018676Abstract: A scalable engine having multiple datapaths, each of which is a unique multi-function floating point pipeline capable of performing a four component dot product on data in a single pass through the datapath, which allows matrix transformations to be computed in an efficient manner, with a high data throughput and without substantially increasing the cost and amount of hardware required to implement the pipeline.Type: ApplicationFiled: March 15, 2001Publication date: January 23, 2003Inventor: Steven Shaw
-
Publication number: 20020194239Abstract: A multiply-accumulate circuit includes a compressor tree to generate a product with a binary exponent and a mantissa in carry-save format. The product is converted into a number having a three bit exponent and a fifty-seven bit mantissa in carry-save format for accumulation. An adder circuit accumulates the converted products in carry-save format. Because the products being summed are in carry-save format, post-normalization is avoided within the adder feedback loop. The adder operates on floating point number representations having exponents with a least significant bit weight of thirty-two, and exponent comparisons within the adder exponent path are limited in size. Variable shifters are avoided in the adder mantissa path. A single mantissa shift of thirty-two bits is provided by a conditional shifter.Type: ApplicationFiled: June 4, 2001Publication date: December 19, 2002Applicant: Intel CorporationInventor: Amaresh Pangal
-
Publication number: 20020194240Abstract: A multiply-accumulate circuit includes a compressor tree to generate a product with a binary exponent and a mantissa in carry-save format. The product is converted into a number having a three bit exponent and a fifty-seven bit mantissa in carry-save format for accumulation. An adder circuit accumulates the converted products in carry-save format. Because the products being summed are in carry-save format, post-normalization is avoided within the adder feedback loop. The adder operates on floating point number representations having exponents with a least significant bit weight of thirty-two, and exponent comparisons within the adder exponent path are limited in size. Variable shifters are avoided in the adder mantissa path. A single mantissa shift of thirty-two bits is provided by a conditional shifter.Type: ApplicationFiled: June 4, 2001Publication date: December 19, 2002Applicant: Intel CorporationInventors: Amaresh Pangal, Dinesh Somasekhar, Shekhar Y. Borkar, Sriram R. Vangal
-
Patent number: 6493817Abstract: The present invention provides a method and apparatus for performing floating-point operations. The apparatus of the present invention comprises a floating point unit which comprises standard multiply accumulate units (MACs) which are capable of performing multiply accumulate operations on a plurality of data type formats. The standard MACs are configured to operate on traditional data type formats and on single instruction multiple data (SIMD) type formats. Therefore, dedicated SIMD MAC units are not needed, thus allowing a significant savings in die area to be realized. When a SIMD instruction is to be operated on by one of the MAC units, the data is presented to the upper and lower MAC units as 64-bit words. Each MAC unit also receives one or more bits which cause the MAC units to each select either the upper or lower halves of the 64-bit words. Each MAC unit then operates on its respective 32-bit words.Type: GrantFiled: May 21, 1999Date of Patent: December 10, 2002Assignee: Hewlett-Packard CompanyInventor: Preston J. Renstrom
-
Publication number: 20020184284Abstract: A digital electronic circuit for performing single precision floating point arithmetic involving multiple operations of multiplication, and addition or subtraction. Multiple operations may occur within each time unit of operation.Type: ApplicationFiled: June 10, 2002Publication date: December 5, 2002Applicant: HYNIX SEMICONDUCTOR INC.Inventor: Earle W. Jennings
-
Patent number: 6480872Abstract: A method and a device including, in one embodiment, a multiply array and at least one adder to perform a floating-point multiplication followed by an addition when operands are in floating-point format. The device is also configured to perform an integer multiplication followed by an accumulation when operands are in integer format. The device is further configured to perform a floating-point multiply-add or an integer multiply-accumulation in response to control signals. In another embodiment, the device contains an adder and the adder is capable of performing a floating-point addition and an integer accumulation. The adder is configured to be extra wide to reduce operand misalignment. Moreover, the device stalls the process in response to operand misalignment.Type: GrantFiled: January 21, 1999Date of Patent: November 12, 2002Assignee: SandCraft, Inc.Inventor: Jack H. Choquette
-
Patent number: 6446195Abstract: An instruction set architecture (ISA) for application specific signal processor (ASSP) is tailored to digital signal processing applications. The instruction set architecture implemented with the ASSP, is adapted to DSP algorithmic structures. The instruction word of the ISA is typically 20 bits but can be expanded to 40-bits to control two instructions to be executed in series or parallel. All DSP instructions of the ISA are dyadic DSP instructions performing two operations with one instruction in one cycle. The DSP instructions or operations in the preferred embodiment include a multiply instruction (MULT), an addition instruction (ADD), a minimize/maximize instruction (MIN/MAX) also referred to as an extrema instruction, and a no operation instruction (NOP) each having an associated operation code (“opcode”). The present invention efficiently executes DSP instructions by means of the instruction set architecture and the hardware architecture of the application specific signal processor.Type: GrantFiled: January 31, 2000Date of Patent: September 3, 2002Assignee: Intel CorporationInventors: Kumar Ganapathy, Ruban Kanapathipillai
-
Publication number: 20020107900Abstract: A processor for performing a multiply-add instruction on a multiplicand A, a multiplier B, and an addend C, to calculate a result D. The operands are double-precision floating point numbers and the result D is a canonical-form extended-precision floating point number having a high order component and a low order component. The processor is a fused multiply-add processor with a multiplier, an adder, a normalizer and a rounder. The post-adder data path, the normalizer and the rounder each have a data width sufficient to represent post-adder intermediate results to permit the high and low order words of a correctly-rounded result D to be computed. The mantissas of the extended-precision result D are provided such that the high order word mantissa is stored to double precision registers.Type: ApplicationFiled: July 31, 2001Publication date: August 8, 2002Applicant: International Business Machines CorporationInventors: Robert F. Enenkel, Fred G. Gustavson, Bruce M. Fleischer, Jose E. Moreira
-
Patent number: 6427203Abstract: An improved digital signal processor, in which arithmetic multiply-add instructions are performed faster with substantial accuracy. The digital signal processor performs multiply-add instructions with look-ahead rounding, so that rounding after repeated arithmetic operations proceeds much more rapidly. The digital signal processor is also augmented with additional instruction formats which are particularly useful for digital signal processing. A first additional instruction format allows the digital signal processor to incorporate a small constant immediately into an instruction, such as to add a small constant value to a register value, or to multiply a register by a small constant value; this allows the digital signal processor to conduct the arithmetic operation with only one memory lookup instead of two.Type: GrantFiled: August 22, 2000Date of Patent: July 30, 2002Assignee: Sigma Designs, Inc.Inventor: Yann Le Cornec
-
Patent number: 6425074Abstract: A microprocessor configured to rapidly execute floating point store status word (FSTSW) type instructions that are immediately preceded by floating point compare (FCOM) type instructions is disclosed. FCOM-type instructions are modified to store their results to an architectural floating point status word and a temporary destination register. If an FSTSW-type instruction is detected immediately following an FCOM-type instruction, then the FSTSW-type instruction is transformed into a special fast floating point store status word (FSTSWEF) instruction. Unlike the FSTSW-type instruction, which is serializing and negatively impacts performance, the FSTSWEF instruction is not serializing and allows execution to continue without undue serialization. A computer system and method for rapidly executing FSTSW instructions immediately preceded by FCOM-type instructions are also disclosed.Type: GrantFiled: September 10, 1999Date of Patent: July 23, 2002Assignee: Advanced Micro Devices, Inc.Inventors: Stephan G. Meier, Norbert Juffa, Frederick D. Weber, Stuart F. Oberman
-
Patent number: 6393452Abstract: The present invention provides a method and apparatus for performing load bypasses with data conversion in a floating-point unit. This process of reading instructions and data out of a cache memory component, of decoding the instructions, of performing the a memory-format-to-register format conversion and of writing the converted data to the register file block of a floating-point unit is known as a load operation. A load operation occurs over many cycles. In accordance with the present invention, the number of cycles required to perform a load operation has been shortened, thereby dramatically increasing the overall throughput of the floating-point unit. In accordance with the present invention, the floating-point unit performs a load bypass with conversion, which significantly shortens the load operation. Data received by the floating-point unit must be converted from a memory format into a register format.Type: GrantFiled: May 21, 1999Date of Patent: May 21, 2002Assignee: Hewlett-Packard CompanyInventor: Preston J Renstrom
-
Patent number: 6381624Abstract: A Multiply Accumulate unit, which may be an FMAC for IEEE 754 format numbers, finds A*B±C faster if the multiplier is allowed to assume that it's A and B inputs are always positive, so that it never has to provide a complemented output, and if the C input for the accumulation with the product is also assumed to be positive. The sign magnitude notation of the IEEE 754 format is temporarily exchanged for a positive two's complement notation of the assumed positive values. Notice is taken of the actual signs, and when there is a difference to be formed, either because of addition between numbers having opposite signs, or because of a subtraction between numbers having the same sign, one of the numbers need to be negated (complemented) prior to the addition of C and the product AB. That number can always be C, provided that correct compensatory negation is available after the addition.Type: GrantFiled: April 29, 1999Date of Patent: April 30, 2002Assignee: Hewlett-Packard CompanyInventors: Glenn T Colon-Bonet, Paul Robert Thayer
-
Patent number: 6363476Abstract: A floating point multiply-add operating device in which a critical path in an addition process of continuous multiply-add operations of floating point numbers is shortened to improve operation efficiency is disclosed. This operating device includes: an exponent operating section for comparing an exponent of a floating point number of an operation result of a preceding multiply-add operation n with an exponent of a multiplication result of a subsequent multiply-add operation (n+1), and calculating an alignment shift count of the multiply-add operation (n+1) by the comparison result; and a mantissa operating section for aligning one mantissa of mantissas of two operands according to the alignment shift count inputted from the exponent operating section, calculating a sum of an aligned mantissa of the operand and the mantissa of the other operand, and normalizing a calculated addition result of the mantissas as needed, thereby calculating a mantissa of the multiply-add operation (n+1).Type: GrantFiled: August 11, 1999Date of Patent: March 26, 2002Assignee: Kabushiki Kaisha ToshibaInventor: Nobuhiro Ide
-
Patent number: 6360189Abstract: A data processing apparatus and method is provided for performing a multiply-accumulate operation A+(B*C) in response to a single instruction identifying said multiply-accumulate operation. The data processing operation comprises a multiplier for multiplying values B and C to generate an unrounded multiplication result, the multiplier further being arranged to generate first data required for rounding determination, and an adder for adding the unrounded multiplication result to a value A to generate an unrounded multiply-accumulate result, the adder further being arranged to generate second data required for rounding determination. Determination logic is then provided for using the first and second data to determine one or more rounding values required to produce a final multiply-accumulate result equivalent to the execution of a separate multiply instruction incorporating rounding, followed by a separate add instruction incorporating rounding.Type: GrantFiled: August 31, 1998Date of Patent: March 19, 2002Assignee: ARM LimitedInventors: Christopher Neal Hinds, David Vivian Jaggar, David Terrence Matheny
-
Publication number: 20020002573Abstract: A reconfigurable processor includes at least three (3) MacroSequencers (10)-(16) which are configured in an array. Each of the MacroSequencers is operable to receive on a separate one of four buses (18) an input from the other three MacroSequencers and from itself in a feedback manner. In addition, a control bus (20) is operable to provide control signals to all of the MacroSequencers for the purpose of controlling the instruction sequence associated therewith and also for inputting instructions thereto. Each of the MacroSequencers includes a plurality of executable units having inputs and outputs and each for providing an associated execution algorithm. The outputs of the execution units are input to an output selector which selects the outputs for outputs on at least one external output and on at least one feedback path. An input selector (66) is provided having an input for receiving at least one external output and at least the feedback path.Type: ApplicationFiled: March 1, 2001Publication date: January 3, 2002Applicant: Infinite Technology Corporation.Inventors: George Landers, Earle Jennings, Tim B. Smith, Glen Haas
-
Patent number: 6330631Abstract: A bus bridge for a computer system for bridging first and second buses includes a shift and accumulate unit. The shift and accumulate unit includes a shifter having an input connected to receive bytes from one of the first and second buses and an output providing a selectable shift to the received bytes. The shift and accumulate unit also includes an accumulator having an input connected to receive the output of the shifter and providing accumulation of selectable bits of the shifted bytes, the accumulator having an output for supplying realigned bytes to be passed to the other of the first and second buses. The combination of the shifter and the accumulator permits a desired amount of shift to be combined with the accumulation of selected bits or bytes to realign sets of bytes from one bus and to form sets of bytes for the other bus. Burst transfer is also possible by operating the shift and accumulate unit to operate in successive cycles for successive sets of input bytes from one of the buses.Type: GrantFiled: February 3, 1999Date of Patent: December 11, 2001Assignee: Sun Microsystems, Inc.Inventor: Andrew Crosland
-
Patent number: 6327605Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.Type: GrantFiled: March 19, 2001Date of Patent: December 4, 2001Assignee: Hitachi, Ltd.Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
-
Patent number: 6317764Abstract: The invention provides a method and system for computing transcendental functions quickly: (1) the multiply ALU is enhanced to add a term to the product, (2) rounding operations for intermediate multiplies are skipped, and (3) the Taylor series is separated into two partial series which are performed in parallel. Transcendental functions with ten terms (e.g., SIN or COS), are thus performed in about ten clock times.Type: GrantFiled: March 12, 1999Date of Patent: November 13, 2001Assignee: STMicroelectronics, Inc.Inventor: Leonard D. Rarick
-
Patent number: 6298366Abstract: A reconfigurable co-processor adapted for multiple multiply-accumulate operations includes plural pairs of multipliers, plural first adders receiving respective product outputs from a pairs of multipliers, and at least one second adder receiving sum outputs from a corresponding pair of first adders. The co-processor includes sign extend circuits at the output of each multiplier. One multiplier of each pair has a fixed left shift circuit that left shifts the product output a predetermined number of bits. The other multiplier in each pair includes a right shift circuit that right shifts the product output the number of bits. Multiplexers at the output of the first multiplier in each pair select the sign extended or the left shifted products. Multiplexers at the output of the second multiplier in each pair select the product, the right shifted product or pass through the inputs. The sign extend circuit for the second multiplier follows the multiplexer.Type: GrantFiled: February 4, 1999Date of Patent: October 2, 2001Assignee: Texas Instruments IncorporatedInventors: Alan Gatherer, Carl E. Lemonds, Jr., Dale E. Hocevar, Ching-Yu Hung
-
Patent number: 6292886Abstract: A system for processing SIMD operands in a packed data format includes a scalar FMAC and a vector FMAC coupled to a register file through an operand delivery module. For vector operations, the operand delivery module bit steers a SIMD operand of the packed operand into an unpacked operand for processing by the first execution unit. Another SIMD operand is processed by the vector execution unit.Type: GrantFiled: October 12, 1998Date of Patent: September 18, 2001Assignee: Intel CorporationInventors: Sivakumar Makineni, Sunnhyuk Kimn, Gautam B. Doshi, Roger A. Golliver
-
Patent number: 6282557Abstract: A low latency fused multiply-adder for adding a product of a first binary number and a second binary number to a third binary number is disclosed. The low latency fused multiply-adder includes a partial product generation module, a partial product reduction module, and a carry propagate adder. The partial product generation module generates a set of partial products from the first binary number and the second binary number. Coupled to the partial product generation module, the partial product reduction module combines the set of partial products with the third binary number to produce a redundant Sum and a redundant Carry. Finally, the carry propagate adder adds the redundant Sum and the redundant Carry to yield a Sum Total.Type: GrantFiled: December 8, 1998Date of Patent: August 28, 2001Assignee: International Business Machines CorporationInventors: Sang Hoo Dhong, Hung Cai Ngo, Kevin John Nowka
-
Patent number: 6275838Abstract: An enhanced floating point unit that supports floating point, integer, and graphics operations by combining the units into a single functional unit is disclosed. The enhanced floating point unit comprises a register file coupled to a plurality of bypass multiplexers. Coupled to the bypass multiplexers are an aligner and a multiplier. And, coupled to the multiplier is an adder that further couples to a normalizer/rounder unit. The normalizer/rounder unit may comprise a normalizer and a rounder coupled in series and or a parallel normalizer/rounder. The enhanced floating point unit supports both integer operations and graphics operations with one functional unit.Type: GrantFiled: October 28, 1998Date of Patent: August 14, 2001Assignee: Intrinsity, Inc.Inventors: James S. Blomgren, Terence M. Potter, Jeffrey S. Brooks
-
Publication number: 20010011291Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.Type: ApplicationFiled: March 19, 2001Publication date: August 2, 2001Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
-
Patent number: 6256655Abstract: A method and system for performing floating point operations in unnormalized format using a floating point accumulator. The present invention provides a set of floating point instructions and a floating point accumulator which stores the results of the operations in unnormalized format. Since the present invention operates on and stores floating point numbers in unnormalized format, the normalization step in the implementation of the floating point operations, which is typically required in the prior art, is readily eliminated. The present invention thus provides significant improvements in both time and space efficiency over prior art implementations of floating point operations. In digital signal processing applications, where floating point operations are used extensively for sums of products calculations, the performance improvements afforded by the present invention are further magnified by the elimination of normalization in each of numerous iterations of multiply and add instructions.Type: GrantFiled: September 14, 1998Date of Patent: July 3, 2001Assignee: Silicon Graphics, Inc.Inventors: Gulbin Ezer, Sudhaker Rao, Timothy J. van Hook
-
Patent number: 6243732Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.Type: GrantFiled: January 7, 2000Date of Patent: June 5, 2001Assignee: Hitachi, Ltd.Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
-
Publication number: 20010002484Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations. A number of specialized graphics instructions and accompanying hardware for executing them are disclosed to optimize the execution of graphics instruction with minimal additional hardware for a general purpose CPU.Type: ApplicationFiled: January 4, 2001Publication date: May 31, 2001Applicant: Sun Microsystems, IncInventor: Robert Yung
-
Patent number: 6226737Abstract: An apparatus and method for performing single precision multiplication in a microprocessor are provided. The apparatus includes translation logic and extended precision floating point execution logic. The translation logic decodes a single precision multiply instruction into an associated micro instruction sequence directing the microprocessor to fetch a single precision operand from memory and convert it to extended precision format. In addition, the associated micro instruction sequence directs floating point execution logic employing a dual pass multiplication unit to skip a pass associated with computing an insignificant partial product. This insignificant partial product would otherwise result from multiplication of a multiplicand by zeros which are appended to the significand of the fetched operand when it is converted to extended precision format.Type: GrantFiled: July 15, 1998Date of Patent: May 1, 2001Assignee: IP-First, L.L.C.Inventors: Timothy A. Elliott, G. Glenn Henry
-
Patent number: 6182104Abstract: A co-processor (44) executes a mathematical algorithm that computes modular exponentiation equations for encrypting or decrypting data. A pipelined multiplier (56) receives sixteen bit data values stored in an A/B RAM (72) and generates a partial product. The generated partial product is summed in a summer (58) with a previous partial product stored in a product RAM (64). A modulo reducer (60) causes a binary data value N to be aligned and added to the summed value when a particular data bit location of the summed value has a logic one value. An N RAM (70) stores the data value N that is added in a modulo reducer (60) to the summed value. The co-processor (44) computes the Foster-Montgomery Reduction Algorithm and reduces the value of (A*B mod N) without having to first compute the value of &mgr; as is required in the Montgomery Reduction Algorithm.Type: GrantFiled: July 22, 1998Date of Patent: January 30, 2001Assignee: Motorola, Inc.Inventors: Robert I. Foster, John Michael Buss, Rodney C. Tesch, James Douglas Dworkin, Michael J. Torla
-
Patent number: 6138136Abstract: A signal processor includes at least one data source (3), a plurality of input registers (11, 12, 13, 14, . . . ) whose inputs are coupled to the data source by data buses (9, 10), a plurality of multipliers (19, 20; 71, 72 . . . ) for multiplying data buffered in the input registers, and a processing arrangement spread over a plurality of data processor branches (4-0, 4-1, . . . , 4-N) for processing products (p0, p1, . . . ), generated by the multipliers by arithmetic and/or logic operations. For achieving enhanced flexibility of the signal processor and increasing the number of possible applications, multiplexers (15, 16, 17, 18; 70) are provided which are used for coupling the multipliers to a respective part of the input registers in dependence on control signals (I, II, III, IV). Such a signal processor is preferably used in mobile radio technology. Further fields of application are, for example, audio, video, medical and automotive technology, ISDN systems, and digital radio.Type: GrantFiled: October 23, 1998Date of Patent: October 24, 2000Assignee: U.S. Philips CorporationInventors: Harald Bauer, Dietmar Lorenz, Peter Meyer, Roberto Woudsma
-
Patent number: 6115730Abstract: A preloadable floating point unit includes first and second preload registers that hold a next operand and a next top of array (TOA) for use with a next FPU instruction held in an instruction queue pending completion of the current FPU instruction.Type: GrantFiled: November 17, 1997Date of Patent: September 5, 2000Assignee: VIA-Cyrix, Inc.Inventors: Atul Dhablania, Willard S. Briggs
-
Patent number: 6115729Abstract: A floating point unit 10 provides a multiply-accumulate operation to determine a result B+(A*C). The multiplier 20 takes several processing cycles to determine the product (A*C). Whilst the multiplier 20 and its subsequent carry-save-adder 26 operate, an aligned value B' of the addend B is generated by an alignment-shifter 34. The aligned-addend B' may only partially overlap with the product (A*C) to which it is to be added using an adder 44. Any high-order-portion HOP of the aligned-addend B' that does not overlap with the product (A*C) must be subsequently concatenated with the output of the adder 44 that sums the product (A*C) with the overlapping portion of the aligned-addend B'. If the sum performed by the adder 44 generates a carry then it is an incremented version IHOP of the high-order-portion that should be concatenated with the output of the adder 44.Type: GrantFiled: August 20, 1998Date of Patent: September 5, 2000Assignee: Arm LimitedInventors: David Terrence Matheny, David Vivian Jaggar
-
Patent number: 6078939Abstract: A computer and a method of using the computer to separate a floating-point number into high and low parts and for evaluating a dominant arithmetic object and a remainder object. The dominant object is associated with the first arithmetic object by using the high parts of the floating-point number. The evaluation of a remainder arithmetic object associates the first arithmetic object with the high and low parts of the floating-point numbers. A sum of the dominant and remainder arithmetic objects returns a value corresponding to the first arithmetic object.Type: GrantFiled: September 30, 1997Date of Patent: June 20, 2000Assignee: Intel CorporationInventors: Shane A. Story, Ping Tak Peter Tang
-
Patent number: 6078938Abstract: A system and method of using a computer processor (34) to generate a solution to a linear system of equations is provided. The computer processor (34) executes a Jacobi iterative technique to produce outputs representing the solution. Multiplication operations required by the iterative technique are performed using logarithmic arithmetic. With logarithmic arithmetic, a multiplication operation is accomplished using addition. For a given n.times.n matrix A, the computer processor (34) can compute an inverse matrix A.sup.-1 by repeatedly executing the iterative technique to solve n linear systems.Type: GrantFiled: May 29, 1996Date of Patent: June 20, 2000Assignee: Motorola, Inc.Inventors: ShaoWei Pan, Srinivas L. Panchumarthi, Ramamoorthy Srinath, Shay-Ping T. Wang
-
Patent number: 6064740Abstract: Circuitry which performs modular mathematics to solve the equation C=M.sup.k mod n and n is performed in a manner to mask the exponent k's signature from timing or power monitoring attacks. The modular exponentation function is performed in a normalized manner such that binary ones and zeros in the exponent are calculated by being modulo-squared and modulo-multiplied.Type: GrantFiled: November 12, 1997Date of Patent: May 16, 2000Inventors: Andreas Curiger, Wendell Little
-
Patent number: 6061707Abstract: An apparatus for generating an end-around carry to an end-around carry adder in a floating-point pipeline within a computer system is disclosed. The apparatus for generating an end-around carry includes a shift-comparison logic circuit, a sign-comparison circuit, and a logic gate. The shift-comparison logic circuit produces a shift-count signal and the sign-comparison logic circuit produces an effective operation signal. Coupled to the shift-comparison logic circuit and the sign-comparison logic circuit, the logic gate combines the shift-count signal and the effective operation signal with a carry-out signal generated by an end-around carry adder to provide an end-around carry signal for the end-around carry adder.Type: GrantFiled: January 16, 1998Date of Patent: May 9, 2000Assignee: International Business Machines CorporationInventors: Michael Thomas Dibrino, Faraydon Osman Karim
-
Patent number: 6038582Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.Type: GrantFiled: October 15, 1997Date of Patent: March 14, 2000Assignee: Hitachi, Ltd.Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
-
Patent number: 6026422Abstract: Electron repulsion integrals are classified according to atomic nucleus coordinates, etc., coefficients are generated and are stored in a data memory, multiplication with addition operation is executed according to a product sum procedure of auxiliary integrals of recursive order 1 or less, and the result is stored in the data memory. Next, density matrix element is stored in the data memory, a multiplication with addition operation procedure of an electron repulsion integral of recursive order 2 not containing any procedure of recursive order 1 or less is generated, and an instruction memory is updated. Multiplication with addition operation is executed while data is read from the data memory, and the result is stored in the data memory. At the termination of the product sum procedure, calculation of electron repulsion integral gRstu is complete and the Fock matrix element value is updated.Type: GrantFiled: February 27, 1998Date of Patent: February 15, 2000Assignee: Fuji Xerox Co., Ltd.Inventors: So Yamada, Shinjiro Inabata, Nobuaki Miyakawa
-
Patent number: 5996066Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations. A number of specialized graphics instructions and accompanying hardware for executing them are disclosed to optimize the execution of graphics instruction with minimal additional hardware for a general purpose CPU.Type: GrantFiled: October 10, 1996Date of Patent: November 30, 1999Assignee: Sun Microsystems, Inc.Inventor: Robert Yung
-
Patent number: 5928316Abstract: A fused floating point multiply-and-accumulate unit includes a multiplier which uses a modified Booth's algorithm to generate a sum and a carry representing a product of mantissas. An artifact of this algorithm is that the sum or carry may represent a negative value even though both mantissas are positive. The negative value may have a sign bit from sign extension or sign encoding of partial products in the multiplier. An artifact of the signed bit is a false carry out that results from canceling the sign bit. A 3-input adder simultaneously combines the sum and carry from the multiplier and performs the accumulation. The adder includes carry correction logic to suppress false carries and prevents a false carry from affecting more significant bits of the value being accumulated.Type: GrantFiled: November 18, 1996Date of Patent: July 27, 1999Assignee: Samsung Electronics Co., Ltd.Inventors: Roney S. Wong, Shao-Kun Jiang