Multiplication Followed By Addition Patents (Class 708/501)
  • Patent number: 6615341
    Abstract: A digital signal processor (DSP) employs a variable-length instruction set. A portion of the variable-length instructions may be stored in adjacent locations within memory space with the beginning and ending of instructions occurring across memory word boundaries. The instructions may contain variable numbers of instruction fragments. Each instruction fragment causes a particular operation, or operations, to be performed allowing multiple operations during each clock cycle. The DSP includes multiple data buses, and in particular three data buses. The DSP may also use a register bank that has registers accessible by at least two processing units, allowing multiple operations to be performed on a particular set of data by the multiple processing units, without reading and writing the data to and from a memory. The DSP may also include an instruction fetch unit that receives instructions of variable length stored in an instruction memory. An instruction memory may advantageously be separate from the three data memories.
    Type: Grant
    Filed: June 5, 2001
    Date of Patent: September 2, 2003
    Assignee: Qualcomm, Inc.
    Inventors: Gilbert C. Sih, Qiuzhen Zou, Inyup Kang, Quaeed Motiwala, Deepu John, Li Zhang, Haitao Zhang, Way-Shing Lee, Charles E. Sakamaki, Prashant A. Kantak, Sanjay K. Jha, Jian Lin
  • Patent number: 6606700
    Abstract: The invention is a digital signal processor architecture that is designed to speed up frequently-used signal processing computations, such as FIR filters, correlations, FFTs, and DFTs. The architecture uses a coupled dual-MAC architecture (MAC1), (MAC2) and attaches a dual-MAC coprocessor (MAC3), (MAC4) onto it in a unique way to achieve a significant increase in processing capability.
    Type: Grant
    Filed: February 26, 2000
    Date of Patent: August 12, 2003
    Assignee: Qualcomm, Incorporated
    Inventors: Gilbert C. Sih, Hemant Kumar, Way-Shing Lee
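    As a rough illustration of why a second multiply-accumulate (MAC) unit helps with FIR filters, the sketch below computes two adjacent output samples per pass, with each coefficient fetched once and used by both accumulators. The function name `fir_dual_mac` and the "valid" output convention are assumptions made for this example; it does not reproduce the coupling or coprocessor arrangement claimed in the patent.

```python
def fir_dual_mac(x, h):
    """FIR filter, valid outputs only: y[n] = sum_k h[k] * x[n + len(h) - 1 - k].

    Two accumulators stand in for two MAC units; each coefficient is read
    once and used for two adjacent output samples.
    """
    taps = len(h)
    n_out = len(x) - taps + 1
    y = []
    for n in range(0, n_out, 2):
        acc1 = 0  # hypothetical MAC unit 1: output sample n
        acc2 = 0  # hypothetical MAC unit 2: output sample n + 1
        second = n + 1 < n_out  # false only for a trailing odd sample
        for k in range(taps):
            coeff = h[k]  # single coefficient fetch shared by both units
            acc1 += coeff * x[n + taps - 1 - k]
            if second:
                acc2 += coeff * x[n + taps - k]
        y.append(acc1)
        if second:
            y.append(acc2)
    return y


print(fir_dual_mac([1, 2, 3, 4, 5], [1, 2]))
```

    Sharing the coefficient fetch halves the number of coefficient reads per output sample, which is one common motivation for pairing MAC units.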
  • Publication number: 20030126174
    Abstract: To perform a product-sum operation by adding third data to a product of first data and second data, a floating point multiplier first multiplies the first data by the second data, and a bit string representing a fixed-point part in the multiplication result is divided into a portion representing more significant digits in the fixed-point part and a portion representing less significant digits in the fixed-point part. Then, a floating point adder first adds less significant multiplication result data having a bit string representing the less significant digits as a fixed-point part to the third data, and then adds the addition result to more significant multiplication result data having a bit string representing the more significant digits as a fixed-point part. A rounding process is performed on the two addition results to obtain a result of the product-sum operation.
    Type: Application
    Filed: March 29, 2002
    Publication date: July 3, 2003
    Applicant: Fujitsu Limited
    Inventor: Shiro Kawata
  • Patent number: 6584482
    Abstract: A multiplier array processing system which improves the utilization of the multiplier and adder array for lower-precision arithmetic is described. New instructions are defined which provide for the deployment of additional multiply and add operations as a result of a single instruction, and for the deployment of greater multiply and add operands as the symbol size is decreased.
    Type: Grant
    Filed: August 19, 1999
    Date of Patent: June 24, 2003
    Assignee: Microunity Systems Engineering, Inc.
    Inventors: Craig C. Hansen, Henry Massalin
  • Patent number: 6571266
    Abstract: A floating-point multiply accumulate method acquiring a final mantissa result comprises comparing exponents of (A*B) and C. Transferring part of the C mantissa to a CHI register. Shifting any part of the C mantissa which overlaps the range of the (A*B) mantissa to align the bits of the (A*B) and C mantissas. Adding the shifted part of the C mantissa to the (A*B) mantissa. Shifting least significant bits corresponding to a number of bits transferred to the CHI register out of the Temp. Result. Mask merging bits of the C mantissa which were transferred to the CHI register with most significant bit positions of the shifted Temp. Result. Rounding this mantissa result to the first precision and acquiring L from an Lbit value of the CHI register or an Lbit value of the Temp. Result based on the bit value of the merge mask corresponding to the Lbit position.
    Type: Grant
    Filed: February 21, 2000
    Date of Patent: May 27, 2003
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Stephen L Bass
  • Patent number: 6553120
    Abstract: Method for the cryptography of data recorded on a medium usable by a computing unit in which the computing unit processes an input information x using a key for supplying an information F(x) encoded by a function F. The function uses a decorrelation module Mk such that F(x)=[F′(Mk)](x), in which K is a random key and F′ a cryptographic function. This Abstract is neither intended to define the invention disclosed in this specification nor intended to limit, in any manner, the scope of the invention.
    Type: Grant
    Filed: June 28, 1999
    Date of Patent: April 22, 2003
    Assignee: Centre National de la Recherche Scientifique
    Inventor: Serge Vaudenay
  • Patent number: 6542915
    Abstract: Presented is a “high-order” Leading Zeros Anticipator or LZA circuit and specifically a five-input LZA. The prior-art two-input LZA circuit is part of almost all high-performance floating-point units or FPUs. The advantage of a high-order LZA (such as five-input) is that the LZA function may be started and finished sooner in the floating point pipeline, and therefore allows more time for other functions in the pipeline. Therefore, a high-order LZA, such as five-input LZA, may be faster than the prior art two-input LZA designs. Thus, speeding up the LZA function in a floating point pipeline may significantly increase the speed at which the overall floating-point unit may operate as compared to the prior-art two input LZA designs and may additionally inspire new floating-point microarchitectures which may yield further performance gains.
    Type: Grant
    Filed: June 17, 1999
    Date of Patent: April 1, 2003
    Assignee: International Business Machines Corporation
    Inventors: Michael Thomas Dibrino, Faraydon Osman Karim
  • Patent number: 6542916
    Abstract: A data processing apparatus and method is provided for applying a floating-point multiply-accumulate operation to first, second and third operands. The apparatus comprises a multiplier for multiplying the second and third operands and applying rounding to produce a rounded multiplication result, and an adder for adding the rounded multiplication result to the first operand to generate a final result and for applying rounding to generate a rounded final result. Further, control logic is provided which is responsive to a first single instruction to control the multiplier and adder to cause the rounded final result generated by the adder to be equivalent to the subtraction of the rounded multiplication result from the first operand.
    Type: Grant
    Filed: July 28, 1999
    Date of Patent: April 1, 2003
    Assignee: Arm Limited
    Inventors: Christopher Neal Hinds, David Vivian Jaggar, David James Seal
  • Publication number: 20030041082
    Abstract: A circuit (10) for multiplying two floating point operands (A and C) while adding or subtracting a third floating point operand (B) removes latency associated with normalization and rounding from a critical speed path for dependent calculations. An intermediate representation of a product and a third operand are selectively shifted to facilitate use of prior unnormalized dependent resultants. Logic circuitry (24, 42) implements a truth table for determining when and how much shifting should be made to intermediate values based upon a resultant of a previous calculation, upon exponents of current operands and an exponent of a previous resultant operand. Normalization and rounding may be subsequently implemented, but at a time when a new cycle operation is not dependent on such operations even if data dependencies exist.
    Type: Application
    Filed: August 24, 2001
    Publication date: February 27, 2003
    Inventor: Michael Dibrino
  • Publication number: 20030018676
    Abstract: A scalable engine having multiple datapaths, each of which is a unique multi-function floating point pipeline capable of performing a four component dot product on data in a single pass through the datapath, which allows matrix transformations to be computed in an efficient manner, with a high data throughput and without substantially increasing the cost and amount of hardware required to implement the pipeline.
    Type: Application
    Filed: March 15, 2001
    Publication date: January 23, 2003
    Inventor: Steven Shaw
  • Publication number: 20020194239
    Abstract: A multiply-accumulate circuit includes a compressor tree to generate a product with a binary exponent and a mantissa in carry-save format. The product is converted into a number having a three bit exponent and a fifty-seven bit mantissa in carry-save format for accumulation. An adder circuit accumulates the converted products in carry-save format. Because the products being summed are in carry-save format, post-normalization is avoided within the adder feedback loop. The adder operates on floating point number representations having exponents with a least significant bit weight of thirty-two, and exponent comparisons within the adder exponent path are limited in size. Variable shifters are avoided in the adder mantissa path. A single mantissa shift of thirty-two bits is provided by a conditional shifter.
    Type: Application
    Filed: June 4, 2001
    Publication date: December 19, 2002
    Applicant: Intel Corporation
    Inventor: Amaresh Pangal
  • Publication number: 20020194240
    Abstract: A multiply-accumulate circuit includes a compressor tree to generate a product with a binary exponent and a mantissa in carry-save format. The product is converted into a number having a three bit exponent and a fifty-seven bit mantissa in carry-save format for accumulation. An adder circuit accumulates the converted products in carry-save format. Because the products being summed are in carry-save format, post-normalization is avoided within the adder feedback loop. The adder operates on floating point number representations having exponents with a least significant bit weight of thirty-two, and exponent comparisons within the adder exponent path are limited in size. Variable shifters are avoided in the adder mantissa path. A single mantissa shift of thirty-two bits is provided by a conditional shifter.
    Type: Application
    Filed: June 4, 2001
    Publication date: December 19, 2002
    Applicant: Intel Corporation
    Inventors: Amaresh Pangal, Dinesh Somasekhar, Shekhar Y. Borkar, Sriram R. Vangal
  • Patent number: 6493817
    Abstract: The present invention provides a method and apparatus for performing floating-point operations. The apparatus of the present invention comprises a floating point unit which comprises standard multiply accumulate units (MACs) which are capable of performing multiply accumulate operations on a plurality of data type formats. The standard MACs are configured to operate on traditional data type formats and on single instruction multiple data (SIMD) type formats. Therefore, dedicated SIMD MAC units are not needed, thus allowing a significant savings in die area to be realized. When a SIMD instruction is to be operated on by one of the MAC units, the data is presented to the upper and lower MAC units as 64-bit words. Each MAC unit also receives one or more bits which cause the MAC units to each select either the upper or lower halves of the 64-bit words. Each MAC unit then operates on its respective 32-bit words.
    Type: Grant
    Filed: May 21, 1999
    Date of Patent: December 10, 2002
    Assignee: Hewlett-Packard Company
    Inventor: Preston J. Renstrom
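    The half-selection scheme described in this abstract can be sketched in software. In the example below, every MAC unit receives the same 64-bit words and a select bit picks the upper or lower 32-bit half to operate on. All names (`pack_pair`, `mac_unit`, `simd_mac`) are hypothetical, and rounding the result to single precision through `struct` is an assumption of this sketch rather than a detail taken from the patent.

```python
import struct


def f32_to_bits(value):
    """Round a Python float to IEEE 754 single precision and return its 32 bits."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_f32(bits):
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def pack_pair(upper, lower):
    """Pack two single-precision values into one 64-bit word."""
    return (f32_to_bits(upper) << 32) | f32_to_bits(lower)


def mac_unit(a64, b64, c64, select_upper):
    """One MAC unit: take the selected 32-bit half of each word, return a*b + c."""
    shift = 32 if select_upper else 0
    a, b, c = (bits_to_f32((w >> shift) & 0xFFFFFFFF) for w in (a64, b64, c64))
    # The sum is formed in double precision, then rounded to single precision.
    return f32_to_bits(a * b + c)


def simd_mac(a64, b64, c64):
    """Both units see the same 64-bit words; the select bit differs per unit."""
    upper = mac_unit(a64, b64, c64, select_upper=True)
    lower = mac_unit(a64, b64, c64, select_upper=False)
    return (upper << 32) | lower


result = simd_mac(pack_pair(2.0, 3.0), pack_pair(4.0, 5.0), pack_pair(1.0, 1.0))
print(hex(result))
```

    Here the upper lane computes 2*4+1 and the lower lane computes 3*5+1, so the same general-purpose units serve both the traditional and the SIMD data formats.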
  • Publication number: 20020184284
    Abstract: A digital electronic circuit for performing single precision floating point arithmetic involving multiple operations of multiplication, and addition or subtraction. Multiple operations may occur within each time unit of operation.
    Type: Application
    Filed: June 10, 2002
    Publication date: December 5, 2002
    Applicant: HYNIX SEMICONDUCTOR INC.
    Inventor: Earle W. Jennings
  • Patent number: 6480872
    Abstract: A method and a device including, in one embodiment, a multiply array and at least one adder to perform a floating-point multiplication followed by an addition when operands are in floating-point format. The device is also configured to perform an integer multiplication followed by an accumulation when operands are in integer format. The device is further configured to perform a floating-point multiply-add or an integer multiply-accumulation in response to control signals. In another embodiment, the device contains an adder and the adder is capable of performing a floating-point addition and an integer accumulation. The adder is configured to be extra wide to reduce operand misalignment. Moreover, the device stalls the process in response to operand misalignment.
    Type: Grant
    Filed: January 21, 1999
    Date of Patent: November 12, 2002
    Assignee: SandCraft, Inc.
    Inventor: Jack H. Choquette
  • Patent number: 6446195
    Abstract: An instruction set architecture (ISA) for application specific signal processor (ASSP) is tailored to digital signal processing applications. The instruction set architecture implemented with the ASSP, is adapted to DSP algorithmic structures. The instruction word of the ISA is typically 20 bits but can be expanded to 40-bits to control two instructions to be executed in series or parallel. All DSP instructions of the ISA are dyadic DSP instructions performing two operations with one instruction in one cycle. The DSP instructions or operations in the preferred embodiment include a multiply instruction (MULT), an addition instruction (ADD), a minimize/maximize instruction (MIN/MAX) also referred to as an extrema instruction, and a no operation instruction (NOP) each having an associated operation code (“opcode”). The present invention efficiently executes DSP instructions by means of the instruction set architecture and the hardware architecture of the application specific signal processor.
    Type: Grant
    Filed: January 31, 2000
    Date of Patent: September 3, 2002
    Assignee: Intel Corporation
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Publication number: 20020107900
    Abstract: A processor for performing a multiply-add instruction on a multiplicand A, a multiplier B, and an addend C, to calculate a result D. The operands are double-precision floating point numbers and the result D is a canonical-form extended-precision floating point number having a high order component and a low order component. The processor is a fused multiply-add processor with a multiplier, an adder, a normalizer and a rounder. The post-adder data path, the normalizer and the rounder each have a data width sufficient to represent post-adder intermediate results to permit the high and low order words of a correctly-rounded result D to be computed. The mantissas of the extended-precision result D are provided such that the high order word mantissa is stored to double precision registers.
    Type: Application
    Filed: July 31, 2001
    Publication date: August 8, 2002
    Applicant: International Business Machines Corporation
    Inventors: Robert F. Enenkel, Fred G. Gustavson, Bruce M. Fleischer, Jose E. Moreira
  • Patent number: 6427203
    Abstract: An improved digital signal processor, in which arithmetic multiply-add instructions are performed faster with substantial accuracy. The digital signal processor performs multiply-add instructions with look-ahead rounding, so that rounding after repeated arithmetic operations proceeds much more rapidly. The digital signal processor is also augmented with additional instruction formats which are particularly useful for digital signal processing. A first additional instruction format allows the digital signal processor to incorporate a small constant immediately into an instruction, such as to add a small constant value to a register value, or to multiply a register by a small constant value; this allows the digital signal processor to conduct the arithmetic operation with only one memory lookup instead of two.
    Type: Grant
    Filed: August 22, 2000
    Date of Patent: July 30, 2002
    Assignee: Sigma Designs, Inc.
    Inventor: Yann Le Cornec
  • Patent number: 6425074
    Abstract: A microprocessor configured to rapidly execute floating point store status word (FSTSW) type instructions that are immediately preceded by floating point compare (FCOM) type instructions is disclosed. FCOM-type instructions are modified to store their results to an architectural floating point status word and a temporary destination register. If an FSTSW-type instruction is detected immediately following an FCOM-type instruction, then the FSTSW-type instruction is transformed into a special fast floating point store status word (FSTSWEF) instruction. Unlike the FSTSW-type instruction, which is serializing and negatively impacts performance, the FSTSWEF instruction is not serializing and allows execution to continue without undue serialization. A computer system and method for rapidly executing FSTSW instructions immediately preceded by FCOM-type instructions are also disclosed.
    Type: Grant
    Filed: September 10, 1999
    Date of Patent: July 23, 2002
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Stephan G. Meier, Norbert Juffa, Frederick D. Weber, Stuart F. Oberman
  • Patent number: 6393452
    Abstract: The present invention provides a method and apparatus for performing load bypasses with data conversion in a floating-point unit. This process of reading instructions and data out of a cache memory component, of decoding the instructions, of performing a memory-format-to-register-format conversion and of writing the converted data to the register file block of a floating-point unit is known as a load operation. A load operation occurs over many cycles. In accordance with the present invention, the number of cycles required to perform a load operation has been shortened, thereby dramatically increasing the overall throughput of the floating-point unit. In accordance with the present invention, the floating-point unit performs a load bypass with conversion, which significantly shortens the load operation. Data received by the floating-point unit must be converted from a memory format into a register format.
    Type: Grant
    Filed: May 21, 1999
    Date of Patent: May 21, 2002
    Assignee: Hewlett-Packard Company
    Inventor: Preston J Renstrom
  • Patent number: 6381624
    Abstract: A Multiply Accumulate unit, which may be an FMAC for IEEE 754 format numbers, finds A*B±C faster if the multiplier is allowed to assume that its A and B inputs are always positive, so that it never has to provide a complemented output, and if the C input for the accumulation with the product is also assumed to be positive. The sign magnitude notation of the IEEE 754 format is temporarily exchanged for a positive two's complement notation of the assumed positive values. Notice is taken of the actual signs, and when there is a difference to be formed, either because of addition between numbers having opposite signs, or because of a subtraction between numbers having the same sign, one of the numbers needs to be negated (complemented) prior to the addition of C and the product AB. That number can always be C, provided that correct compensatory negation is available after the addition.
    Type: Grant
    Filed: April 29, 1999
    Date of Patent: April 30, 2002
    Assignee: Hewlett-Packard Company
    Inventors: Glenn T Colon-Bonet, Paul Robert Thayer
  • Patent number: 6363476
    Abstract: A floating point multiply-add operating device in which a critical path in an addition process of continuous multiply-add operations of floating point numbers is shortened to improve operation efficiency is disclosed. This operating device includes: an exponent operating section for comparing an exponent of a floating point number of an operation result of a preceding multiply-add operation n with an exponent of a multiplication result of a subsequent multiply-add operation (n+1), and calculating an alignment shift count of the multiply-add operation (n+1) by the comparison result; and a mantissa operating section for aligning one mantissa of mantissas of two operands according to the alignment shift count inputted from the exponent operating section, calculating a sum of an aligned mantissa of the operand and the mantissa of the other operand, and normalizing a calculated addition result of the mantissas as needed, thereby calculating a mantissa of the multiply-add operation (n+1).
    Type: Grant
    Filed: August 11, 1999
    Date of Patent: March 26, 2002
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Nobuhiro Ide
  • Patent number: 6360189
    Abstract: A data processing apparatus and method is provided for performing a multiply-accumulate operation A+(B*C) in response to a single instruction identifying said multiply-accumulate operation. The data processing operation comprises a multiplier for multiplying values B and C to generate an unrounded multiplication result, the multiplier further being arranged to generate first data required for rounding determination, and an adder for adding the unrounded multiplication result to a value A to generate an unrounded multiply-accumulate result, the adder further being arranged to generate second data required for rounding determination. Determination logic is then provided for using the first and second data to determine one or more rounding values required to produce a final multiply-accumulate result equivalent to the execution of a separate multiply instruction incorporating rounding, followed by a separate add instruction incorporating rounding.
    Type: Grant
    Filed: August 31, 1998
    Date of Patent: March 19, 2002
    Assignee: ARM Limited
    Inventors: Christopher Neal Hinds, David Vivian Jaggar, David Terrence Matheny
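    The distinction this abstract draws, between a result equivalent to a rounded multiply followed by a rounded add and a result rounded only once, can be seen numerically. The sketch below is only an illustration in double precision: `mac_two_roundings` and `mac_single_rounding` are hypothetical names, and exact rational arithmetic stands in for a fused datapath.

```python
from fractions import Fraction


def mac_two_roundings(a, b, c):
    """A + (B*C) as a separate multiply and add, each rounded to double."""
    product = b * c      # first rounding
    return a + product   # second rounding


def mac_single_rounding(a, b, c):
    """A + (B*C) computed exactly, then rounded to double once (fused style)."""
    exact = Fraction(a) + Fraction(b) * Fraction(c)
    return float(exact)  # int/int division inside float() is correctly rounded


# Operands chosen so that B*C = 1 - 2**-60, which rounds to 1.0 in double.
a = -1.0
b = 1.0 + 2.0 ** -30
c = 1.0 - 2.0 ** -30

print(mac_two_roundings(a, b, c))    # 0.0
print(mac_single_rounding(a, b, c))  # -2**-60
```

    With these operands the two-rounding result is 0.0 while the single-rounding result is -2^-60, so hardware that must match the separate-instruction behaviour has to reproduce the intermediate rounding, as the abstract describes.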
  • Publication number: 20020002573
    Abstract: A reconfigurable processor includes at least three (3) MacroSequencers (10)-(16) which are configured in an array. Each of the MacroSequencers is operable to receive on a separate one of four buses (18) an input from the other three MacroSequencers and from itself in a feedback manner. In addition, a control bus (20) is operable to provide control signals to all of the MacroSequencers for the purpose of controlling the instruction sequence associated therewith and also for inputting instructions thereto. Each of the MacroSequencers includes a plurality of executable units having inputs and outputs and each for providing an associated execution algorithm. The outputs of the execution units are input to an output selector which selects the outputs for outputs on at least one external output and on at least one feedback path. An input selector (66) is provided having an input for receiving at least one external output and at least the feedback path.
    Type: Application
    Filed: March 1, 2001
    Publication date: January 3, 2002
    Applicant: Infinite Technology Corporation
    Inventors: George Landers, Earle Jennings, Tim B. Smith, Glen Haas
  • Patent number: 6330631
    Abstract: A bus bridge for a computer system for bridging first and second buses includes a shift and accumulate unit. The shift and accumulate unit includes a shifter having an input connected to receive bytes from one of the first and second buses and an output providing a selectable shift to the received bytes. The shift and accumulate unit also includes an accumulator having an input connected to receive the output of the shifter and providing accumulation of selectable bits of the shifted bytes, the accumulator having an output for supplying realigned bytes to be passed to the other of the first and second buses. The combination of the shifter and the accumulator permits a desired amount of shift to be combined with the accumulation of selected bits or bytes to realign sets of bytes from one bus and to form sets of bytes for the other bus. Burst transfer is also possible by operating the shift and accumulate unit to operate in successive cycles for successive sets of input bytes from one of the buses.
    Type: Grant
    Filed: February 3, 1999
    Date of Patent: December 11, 2001
    Assignee: Sun Microsystems, Inc.
    Inventor: Andrew Crosland
  • Patent number: 6327605
    Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.
    Type: Grant
    Filed: March 19, 2001
    Date of Patent: December 4, 2001
    Assignee: Hitachi, Ltd.
    Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
  • Patent number: 6317764
    Abstract: The invention provides a method and system for computing transcendental functions quickly: (1) the multiply ALU is enhanced to add a term to the product, (2) rounding operations for intermediate multiplies are skipped, and (3) the Taylor series is separated into two partial series which are performed in parallel. Transcendental functions with ten terms (e.g., SIN or COS), are thus performed in about ten clock times.
    Type: Grant
    Filed: March 12, 1999
    Date of Patent: November 13, 2001
    Assignee: STMicroelectronics, Inc.
    Inventor: Leonard D. Rarick
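    A minimal sketch of the series-splitting idea, assuming a ten-term Taylor polynomial for SIN: the polynomial in u = x² is separated into its even-indexed and odd-indexed terms, each partial series is evaluated by Horner steps of the form acc*v + c (a multiply followed by an add), and the two halves are combined at the end. The names `horner` and `sin_taylor_split` are illustrative, the two partial series run sequentially here rather than in parallel, and ordinary double-precision rounding applies at every step, unlike the skipped intermediate rounding in the abstract.

```python
import math


def horner(coeffs, v):
    """Evaluate sum(coeffs[j] * v**j) using one multiply-add per coefficient."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * v + c  # multiply, then add a term to the product
    return acc


def sin_taylor_split(x, terms=10):
    """sin(x) ~ x * P(u), u = x*x, with P split into two independent partial series."""
    coeffs = [(-1.0) ** k / math.factorial(2 * k + 1) for k in range(terms)]
    u = x * x
    v = u * u
    even = horner(coeffs[0::2], v)  # c0 + c2*v + c4*v**2 + ...
    odd = horner(coeffs[1::2], v)   # c1 + c3*v + c5*v**2 + ...
    # P(u) = even(v) + u * odd(v); the two partial series share no data.
    return x * (even + u * odd)


print(sin_taylor_split(0.5), math.sin(0.5))
```

    Because the two partial series are independent, hardware with two multiply-add paths could evaluate them side by side, roughly halving the length of the dependent chain.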
  • Patent number: 6298366
    Abstract: A reconfigurable co-processor adapted for multiple multiply-accumulate operations includes plural pairs of multipliers, plural first adders receiving respective product outputs from a pair of multipliers, and at least one second adder receiving sum outputs from a corresponding pair of first adders. The co-processor includes sign extend circuits at the output of each multiplier. One multiplier of each pair has a fixed left shift circuit that left shifts the product output a predetermined number of bits. The other multiplier in each pair includes a right shift circuit that right shifts the product output the number of bits. Multiplexers at the output of the first multiplier in each pair select the sign extended or the left shifted products. Multiplexers at the output of the second multiplier in each pair select the product, the right shifted product or pass through the inputs. The sign extend circuit for the second multiplier follows the multiplexer.
    Type: Grant
    Filed: February 4, 1999
    Date of Patent: October 2, 2001
    Assignee: Texas Instruments Incorporated
    Inventors: Alan Gatherer, Carl E. Lemonds, Jr., Dale E. Hocevar, Ching-Yu Hung
  • Patent number: 6292886
    Abstract: A system for processing SIMD operands in a packed data format includes a scalar FMAC and a vector FMAC coupled to a register file through an operand delivery module. For vector operations, the operand delivery module bit steers a SIMD operand of the packed operand into an unpacked operand for processing by the first execution unit. Another SIMD operand is processed by the vector execution unit.
    Type: Grant
    Filed: October 12, 1998
    Date of Patent: September 18, 2001
    Assignee: Intel Corporation
    Inventors: Sivakumar Makineni, Sunnhyuk Kimn, Gautam B. Doshi, Roger A. Golliver
  • Patent number: 6282557
    Abstract: A low latency fused multiply-adder for adding a product of a first binary number and a second binary number to a third binary number is disclosed. The low latency fused multiply-adder includes a partial product generation module, a partial product reduction module, and a carry propagate adder. The partial product generation module generates a set of partial products from the first binary number and the second binary number. Coupled to the partial product generation module, the partial product reduction module combines the set of partial products with the third binary number to produce a redundant Sum and a redundant Carry. Finally, the carry propagate adder adds the redundant Sum and the redundant Carry to yield a Sum Total.
    Type: Grant
    Filed: December 8, 1998
    Date of Patent: August 28, 2001
    Assignee: International Business Machines Corporation
    Inventors: Sang Hoo Dhong, Hung Cai Ngo, Kevin John Nowka
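    The three stages named in this abstract can be mimicked on non-negative integers: generate one partial product per set bit of the multiplier, fold the addend into the reduction as an extra row, compress rows with 3:2 carry-save adders until only a redundant Sum and Carry remain, then perform one carry-propagate addition. This is a sketch under those assumptions; the function names are hypothetical and the reduction order does not model the patent's actual tree.

```python
def carry_save_add(x, y, z):
    """3:2 compressor: reduce three rows to a sum row and a carry row."""
    s = x ^ y ^ z                                # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1       # majority bits, shifted left
    return s, c


def fused_multiply_add(a, b, c):
    """Compute a*b + c for non-negative integers using carry-save reduction."""
    # Partial product generation: one shifted copy of a per set bit of b.
    rows = [a << i for i in range(b.bit_length()) if (b >> i) & 1]
    # The addend joins the reduction as just another row.
    rows.append(c)
    # Partial product reduction: each step turns three rows into two.
    while len(rows) > 2:
        x, y, z = rows.pop(), rows.pop(), rows.pop()
        rows.extend(carry_save_add(x, y, z))
    # Carry-propagate adder: the only full-width addition in the flow.
    return sum(rows)


print(fused_multiply_add(13, 11, 7))  # 150
```

    Folding the addend into the reduction means no separate full-width add is needed after the multiply, which is where the latency saving of a fused design comes from.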
  • Patent number: 6275838
    Abstract: An enhanced floating point unit that supports floating point, integer, and graphics operations by combining the units into a single functional unit is disclosed. The enhanced floating point unit comprises a register file coupled to a plurality of bypass multiplexers. Coupled to the bypass multiplexers are an aligner and a multiplier. And, coupled to the multiplier is an adder that further couples to a normalizer/rounder unit. The normalizer/rounder unit may comprise a normalizer and a rounder coupled in series and/or a parallel normalizer/rounder. The enhanced floating point unit supports both integer operations and graphics operations with one functional unit.
    Type: Grant
    Filed: October 28, 1998
    Date of Patent: August 14, 2001
    Assignee: Intrinsity, Inc.
    Inventors: James S. Blomgren, Terence M. Potter, Jeffrey S. Brooks
  • Publication number: 20010011291
    Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.
    Type: Application
    Filed: March 19, 2001
    Publication date: August 2, 2001
    Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
  • Patent number: 6256655
    Abstract: A method and system for performing floating point operations in unnormalized format using a floating point accumulator. The present invention provides a set of floating point instructions and a floating point accumulator which stores the results of the operations in unnormalized format. Since the present invention operates on and stores floating point numbers in unnormalized format, the normalization step in the implementation of the floating point operations, which is typically required in the prior art, is readily eliminated. The present invention thus provides significant improvements in both time and space efficiency over prior art implementations of floating point operations. In digital signal processing applications, where floating point operations are used extensively for sums of products calculations, the performance improvements afforded by the present invention are further magnified by the elimination of normalization in each of numerous iterations of multiply and add instructions.
    Type: Grant
    Filed: September 14, 1998
    Date of Patent: July 3, 2001
    Assignee: Silicon Graphics, Inc.
    Inventors: Gulbin Ezer, Sudhaker Rao, Timothy J. van Hook
  • Patent number: 6243732
    Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit. The arithmetic portion includes a plurality of multipliers, each supplied with the mantissa parts of floating point numbers from a different group of data input signal lines and multiplying the supplied mantissa parts together; an aligner that receives the outputs of the respective multipliers and performs alignment shifts; an exponent processing portion that generates the alignment shift amounts for the aligner and an exponent before normalization on the basis of the exponent parts of the floating point numbers; and a multi-input adder. This arrangement reduces the scale of the circuit and performs inner product operations and the like on floating point numbers at high speed and with high accuracy.
    Type: Grant
    Filed: January 7, 2000
    Date of Patent: June 5, 2001
    Assignee: Hitachi, Ltd.
    Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
  • Publication number: 20010002484
    Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations. A number of specialized graphics instructions and accompanying hardware for executing them are disclosed to optimize the execution of graphics instructions with minimal additional hardware for a general purpose CPU.
    Type: Application
    Filed: January 4, 2001
    Publication date: May 31, 2001
    Applicant: Sun Microsystems, Inc.
    Inventor: Robert Yung
  • Patent number: 6226737
    Abstract: An apparatus and method for performing single precision multiplication in a microprocessor are provided. The apparatus includes translation logic and extended precision floating point execution logic. The translation logic decodes a single precision multiply instruction into an associated micro instruction sequence directing the microprocessor to fetch a single precision operand from memory and convert it to extended precision format. In addition, the associated micro instruction sequence directs floating point execution logic employing a dual pass multiplication unit to skip a pass associated with computing an insignificant partial product. This insignificant partial product would otherwise result from multiplication of a multiplicand by zeros which are appended to the significand of the fetched operand when it is converted to extended precision format.
    Type: Grant
    Filed: July 15, 1998
    Date of Patent: May 1, 2001
    Assignee: IP-First, L.L.C.
    Inventors: Timothy A. Elliott, G. Glenn Henry
  • Patent number: 6182104
    Abstract: A co-processor (44) executes a mathematical algorithm that computes modular exponentiation equations for encrypting or decrypting data. A pipelined multiplier (56) receives sixteen bit data values stored in an A/B RAM (72) and generates a partial product. The generated partial product is summed in a summer (58) with a previous partial product stored in a product RAM (64). A modulo reducer (60) causes a binary data value N to be aligned and added to the summed value when a particular data bit location of the summed value has a logic one value. An N RAM (70) stores the data value N that is added in a modulo reducer (60) to the summed value. The co-processor (44) computes the Foster-Montgomery Reduction Algorithm and reduces the value of (A*B mod N) without having to first compute the value of μ as is required in the Montgomery Reduction Algorithm.
    Type: Grant
    Filed: July 22, 1998
    Date of Patent: January 30, 2001
    Assignee: Motorola, Inc.
    Inventors: Robert I. Foster, John Michael Buss, Rodney C. Tesch, James Douglas Dworkin, Michael J. Torla
  • Patent number: 6138136
    Abstract: A signal processor includes at least one data source (3), a plurality of input registers (11, 12, 13, 14, . . . ) whose inputs are coupled to the data source by data buses (9, 10), a plurality of multipliers (19, 20; 71, 72 . . . ) for multiplying data buffered in the input registers, and a processing arrangement spread over a plurality of data processor branches (4-0, 4-1, . . . , 4-N) for processing products (p0, p1, . . . ), generated by the multipliers by arithmetic and/or logic operations. For achieving enhanced flexibility of the signal processor and increasing the number of possible applications, multiplexers (15, 16, 17, 18; 70) are provided which are used for coupling the multipliers to a respective part of the input registers in dependence on control signals (I, II, III, IV). Such a signal processor is preferably used in mobile radio technology. Further fields of application are, for example, audio, video, medical and automotive technology, ISDN systems, and digital radio.
    Type: Grant
    Filed: October 23, 1998
    Date of Patent: October 24, 2000
    Assignee: U.S. Philips Corporation
    Inventors: Harald Bauer, Dietmar Lorenz, Peter Meyer, Roberto Woudsma
  • Patent number: 6115730
    Abstract: A preloadable floating point unit includes first and second preload registers that hold a next operand and a next top of array (TOA) for use with a next FPU instruction held in an instruction queue pending completion of the current FPU instruction.
    Type: Grant
    Filed: November 17, 1997
    Date of Patent: September 5, 2000
    Assignee: VIA-Cyrix, Inc.
    Inventors: Atul Dhablania, Willard S. Briggs
  • Patent number: 6115729
    Abstract: A floating point unit 10 provides a multiply-accumulate operation to determine a result B+(A*C). The multiplier 20 takes several processing cycles to determine the product (A*C). Whilst the multiplier 20 and its subsequent carry-save-adder 26 operate, an aligned value B' of the addend B is generated by an alignment-shifter 34. The aligned-addend B' may only partially overlap with the product (A*C) to which it is to be added using an adder 44. Any high-order-portion HOP of the aligned-addend B' that does not overlap with the product (A*C) must be subsequently concatenated with the output of the adder 44 that sums the product (A*C) with the overlapping portion of the aligned-addend B'. If the sum performed by the adder 44 generates a carry then it is an incremented version IHOP of the high-order-portion that should be concatenated with the output of the adder 44.
    Type: Grant
    Filed: August 20, 1998
    Date of Patent: September 5, 2000
    Assignee: Arm Limited
    Inventors: David Terrence Matheny, David Vivian Jaggar
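The high-order-portion handling described in abstract 6115729 can be illustrated with a small integer model. This is only a sketch under stated assumptions, not the patented circuit: the function name `add_with_hop`, the `width` parameter, and the use of unsigned Python integers in place of aligned mantissas are all hypothetical choices made for illustration.

```python
def add_with_hop(product, aligned_addend, width):
    """Add a product to an aligned addend that only partially overlaps it.

    Assumption for this sketch: `product` fits in `width` bits, so the
    overlapping add can generate at most a single carry.
    """
    mask = (1 << width) - 1
    overlap = aligned_addend & mask   # part of the addend that overlaps the product
    hop = aligned_addend >> width     # high-order portion (HOP), above the product
    total = product + overlap         # the only full-width addition required
    carry = total >> width            # 1 if the overlapping add carried out
    # Select the incremented high-order portion (IHOP) only when a carry occurred.
    high = hop + 1 if carry else hop
    # Concatenate the selected high-order portion with the adder output.
    return (high << width) | (total & mask)
```

The point of the arrangement is that the adder only needs to be as wide as the overlap; the high-order portion is chosen between two precomputed candidates rather than being added.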
  • Patent number: 6078939
    Abstract: A computer and a method of using the computer to separate a floating-point number into high and low parts and for evaluating a dominant arithmetic object and a remainder object. The dominant object is associated with the first arithmetic object by using the high parts of the floating-point number. The evaluation of a remainder arithmetic object associates the first arithmetic object with the high and low parts of the floating-point numbers. A sum of the dominant and remainder arithmetic objects returns a value corresponding to the first arithmetic object.
    Type: Grant
    Filed: September 30, 1997
    Date of Patent: June 20, 2000
    Assignee: Intel Corporation
    Inventors: Shane A. Story, Ping Tak Peter Tang
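One well-known way to separate a double-precision number into high and low parts is Veltkamp splitting, shown below for a product. This is a sketch of that general technique and is not claimed to be the method of patent 6078939; the function names and the choice of a product as the "arithmetic object" are assumptions made for illustration, and the sketch ignores overflow and underflow.

```python
def split(x):
    """Veltkamp split of a double into high and low parts with x == hi + lo."""
    c = 134217729.0 * x      # 2**27 + 1, the usual constant for 53-bit doubles
    hi = c - (c - x)         # upper bits of x
    lo = x - hi              # remaining lower bits, computed exactly
    return hi, lo


def product_as_dominant_plus_remainder(x, y):
    """Express x * y as a dominant term plus a smaller remainder term."""
    xh, xl = split(x)
    yh, yl = split(y)
    dominant = xh * yh                        # uses only the high parts
    remainder = xh * yl + xl * yh + xl * yl   # uses both high and low parts
    return dominant, remainder
```

Summing the two returned values gives an approximation of `x * y`; keeping them separate lets a caller accumulate the small remainder terms apart from the dominant ones.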
  • Patent number: 6078938
    Abstract: A system and method of using a computer processor (34) to generate a solution to a linear system of equations is provided. The computer processor (34) executes a Jacobi iterative technique to produce outputs representing the solution. Multiplication operations required by the iterative technique are performed using logarithmic arithmetic. With logarithmic arithmetic, a multiplication operation is accomplished using addition. For a given n×n matrix A, the computer processor (34) can compute an inverse matrix A⁻¹ by repeatedly executing the iterative technique to solve n linear systems.
    Type: Grant
    Filed: May 29, 1996
    Date of Patent: June 20, 2000
    Assignee: Motorola, Inc.
    Inventors: ShaoWei Pan, Srinivas L. Panchumarthi, Ramamoorthy Srinath, Shay-Ping T. Wang
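The idea in abstract 6078938, a Jacobi iteration whose multiplications are carried out as additions of logarithms, can be sketched in software as follows. The function names, the base-2 logarithm, the fixed iteration count, and the zero starting vector are assumptions for illustration; the patent's hardware number format is not reproduced here.

```python
import math


def log_multiply(a, b):
    """Multiply two reals by adding their base-2 logarithms."""
    if a == 0.0 or b == 0.0:
        return 0.0                       # log of zero is undefined; handle it directly
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    return sign * 2.0 ** (math.log2(abs(a)) + math.log2(abs(b)))


def jacobi(A, b, iterations=50):
    """Jacobi iteration for A x = b, with every multiplication done via logarithms."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # reciprocals computed once
    x = [0.0] * n
    for _ in range(iterations):
        # The whole new vector is built from the previous x (Jacobi, not Gauss-Seidel).
        x = [
            log_multiply(
                inv_diag[i],
                b[i] - sum(log_multiply(A[i][j], x[j]) for j in range(n) if j != i),
            )
            for i in range(n)
        ]
    return x
```

For example, `jacobi([[4.0, 1.0], [2.0, 5.0]], [9.0, 12.0])` approaches the solution x = 11/6, y = 5/3. Convergence is only expected for suitable matrices, such as strictly diagonally dominant ones.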
  • Patent number: 6064740
    Abstract: Circuitry performs modular mathematics to solve the equation C = M^k mod n in a manner that masks the exponent k's signature from timing or power monitoring attacks. The modular exponentiation function is performed in a normalized manner such that binary ones and zeros in the exponent are both calculated by being modulo-squared and modulo-multiplied.
    Type: Grant
    Filed: November 12, 1997
    Date of Patent: May 16, 2000
    Inventors: Andreas Curiger, Wendell Little
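The balancing idea in abstract 6064740, where every exponent bit triggers both a modular square and a modular multiply, resembles the square-and-multiply-always technique sketched below. This is an illustration of that general technique and not the patented circuitry; the function name is hypothetical, and Python integer arithmetic is not constant-time, so the sketch shows only the operation sequence.

```python
def modexp_always_multiply(m, k, n):
    """Compute m**k mod n with one square and one multiply per exponent bit."""
    result = 1 % n
    for bit in bin(k)[2:]:                   # scan exponent bits, most significant first
        squared = (result * result) % n      # performed for every bit
        multiplied = (squared * m) % n       # also performed for every bit
        # Keep the multiplied value for a 1 bit, discard it for a 0 bit.
        result = multiplied if bit == "1" else squared
    return result
```

Because the same two operations run whether the bit is 0 or 1, the sequence of operations no longer reveals the exponent, which is the property the abstract is concerned with.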
  • Patent number: 6061707
    Abstract: An apparatus for generating an end-around carry to an end-around carry adder in a floating-point pipeline within a computer system is disclosed. The apparatus for generating an end-around carry includes a shift-comparison logic circuit, a sign-comparison circuit, and a logic gate. The shift-comparison logic circuit produces a shift-count signal and the sign-comparison logic circuit produces an effective operation signal. Coupled to the shift-comparison logic circuit and the sign-comparison logic circuit, the logic gate combines the shift-count signal and the effective operation signal with a carry-out signal generated by an end-around carry adder to provide an end-around carry signal for the end-around carry adder.
    Type: Grant
    Filed: January 16, 1998
    Date of Patent: May 9, 2000
    Assignee: International Business Machines Corporation
    Inventors: Michael Thomas Dibrino, Faraydon Osman Karim
  • Patent number: 6038582
    Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit. The arithmetic portion includes a plurality of multipliers, each supplied with the mantissa parts of floating point numbers from a different group of data input signal lines and multiplying the supplied mantissa parts together; an aligner that receives the outputs of the respective multipliers and performs alignment shifts; an exponent processing portion that generates the alignment shift amounts for the aligner and an exponent before normalization on the basis of the exponent parts of the floating point numbers; and a multi-input adder. This arrangement reduces the scale of the circuit and performs inner product operations and the like on floating point numbers at high speed and with high accuracy.
    Type: Grant
    Filed: October 15, 1997
    Date of Patent: March 14, 2000
    Assignee: Hitachi, Ltd.
    Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
  • Patent number: 6026422
    Abstract: Electron repulsion integrals are classified according to atomic nucleus coordinates, etc., coefficients are generated and are stored in a data memory, multiplication with addition operation is executed according to a product sum procedure of auxiliary integrals of recursive order 1 or less, and the result is stored in the data memory. Next, density matrix element is stored in the data memory, a multiplication with addition operation procedure of an electron repulsion integral of recursive order 2 not containing any procedure of recursive order 1 or less is generated, and an instruction memory is updated. Multiplication with addition operation is executed while data is read from the data memory, and the result is stored in the data memory. At the termination of the product sum procedure, calculation of electron repulsion integral gRstu is complete and the Fock matrix element value is updated.
    Type: Grant
    Filed: February 27, 1998
    Date of Patent: February 15, 2000
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: So Yamada, Shinjiro Inabata, Nobuaki Miyakawa
  • Patent number: 5996066
    Abstract: An optimized, superscalar microprocessor architecture for supporting graphics operations in addition to the standard microprocessor integer and floating point operations. A number of specialized graphics instructions and accompanying hardware for executing them are disclosed to optimize the execution of graphics instructions with minimal additional hardware for a general purpose CPU.
    Type: Grant
    Filed: October 10, 1996
    Date of Patent: November 30, 1999
    Assignee: Sun Microsystems, Inc.
    Inventor: Robert Yung
  • Patent number: 5928316
    Abstract: A fused floating point multiply-and-accumulate unit includes a multiplier which uses a modified Booth's algorithm to generate a sum and a carry representing a product of mantissas. An artifact of this algorithm is that the sum or carry may represent a negative value even though both mantissas are positive. The negative value may have a sign bit from sign extension or sign encoding of partial products in the multiplier. An artifact of the signed bit is a false carry out that results from canceling the sign bit. A 3-input adder simultaneously combines the sum and carry from the multiplier and performs the accumulation. The adder includes carry correction logic to suppress false carries and prevents a false carry from affecting more significant bits of the value being accumulated.
    Type: Grant
    Filed: November 18, 1996
    Date of Patent: July 27, 1999
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Roney S. Wong, Shao-Kun Jiang