Multiplication Followed By Addition (i.e., X*y+z) Patents (Class 708/523)

Enhanced multiplier-accumulator logic for a programmable logic device

Patent number: 8090758

Abstract: A multiplier-accumulator includes a pre-adder, a multiplier, an accumulator, multiplexing logic, and control logic. The pre-adder is configured to sum a first input and a second input to produce a pre-sum output. The multiplier is configured to multiply a third input and the pre-sum output to produce a product output. The accumulator is configured to sum a pair of accumulator inputs to produce a sum output. The multiplexer is configured to select the pair of accumulator inputs from a plurality of multiplexer inputs, where the plurality of multiplexer inputs includes the product output and the sum output. The control logic is configured to control operation of the pre-adder, the accumulator, and the multiplexer logic. In an example, each of the first input, the second input, the third input, and the sum output is coupled to programmable interconnect of a programmable logic device.

Type: Grant

Filed: December 14, 2006

Date of Patent: January 3, 2012

Assignee: Xilinx, Inc.

Inventors: Schuyler E. Shimanek, William E. Allaire, Steven J. Zack
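
As a rough behavioral illustration of the datapath described in the abstract above, the following Python sketch models a pre-adder feeding a multiplier whose product is accumulated. The class name, the method name, and the `load` flag standing in for the multiplexing and control logic are assumptions made for this illustration; they are not taken from the patent.

```python
class PreAdderMac:
    """Behavioral model of acc <- acc + (a + b) * c (hypothetical names)."""

    def __init__(self):
        self.acc = 0  # sum output, fed back as one accumulator input

    def step(self, a, b, c, load=False):
        pre_sum = a + b        # pre-adder: first input + second input
        product = pre_sum * c  # multiplier: third input * pre-sum
        # The multiplexer chooses the accumulator inputs: the product alone
        # (load) or the product together with the previous sum output.
        self.acc = product if load else self.acc + product
        return self.acc


mac = PreAdderMac()
mac.step(1, 2, 3, load=True)  # (1 + 2) * 3
result = mac.step(4, 5, 2)    # previous sum + (4 + 5) * 2
```
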
Decimal Floating Point Mechanism and Process of Multiplication without Resultant Leading Zero Detection

Publication number: 20110320512

Abstract: A decimal multiplication mechanism for fixed and floating point computation in a computer having a coefficient mechanism without resulting leading zero detection (LZD) and process which assumes that the final product will be M+N digits in length and performs all calculations based on this assumption. Least significant digits that would be truncated are no longer stored, but retained as sticky information which is used to finalize the result product. Once the computation of the product is complete, a final check based on the examination of key bits observed during partial product accumulation is used to determine if the final product is truly M+N digits in length, or M+N−1 digits. If the latter is true, then corrective final product shifting is employed to obtain the proper result. This eliminates the need for dedicated leading zero detection hardware used to determine the number of significant digits in the final product.

Type: Application

Filed: June 23, 2010

Publication date: December 29, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Steven R. Carlough, Adam B. Collura, Michael Kroener, Silvia Melitta Mueller
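
The length assumption in the abstract above rests on a simple arithmetic fact: the product of an M-digit and an N-digit coefficient (both with nonzero leading digits) has either M+N or M+N−1 digits. The sketch below only checks that fact in software after the product is known; the function names are hypothetical, and it does not model the key-bit examination the abstract describes.

```python
def digit_length(value: int) -> int:
    """Number of decimal digits of a positive integer."""
    return len(str(value))


def needs_corrective_shift(a: int, b: int) -> bool:
    """True when the product has M+N-1 digits rather than the assumed M+N."""
    m, n = digit_length(a), digit_length(b)
    length = digit_length(a * b)
    # For positive operands without leading zeros these are the only cases.
    assert length in (m + n, m + n - 1)
    return length == m + n - 1


full_length = needs_corrective_shift(99, 99)   # 9801 has 2 + 2 digits
short_length = needs_corrective_shift(10, 10)  # 100 has 2 + 2 - 1 digits
```
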
Bridge fused multiply-adder circuit

Patent number: 8078660

Abstract: A bridge fused multiply-adder is disclosed. The fused multiply-adder is for the single instruction execution of (A×B)+C. The bridge fused multiply-add unit adds this functionality to existing floating-point co-processor units by including a fused multiply-add hardware “bridge” between an existing floating-point adder and a floating-point multiplier unit. This fused multiply-add functionality is added to existing two-operand architecture designs without degrading the performance or parallel pipe execution of floating-point adder and floating-point multiplier instructions.

Type: Grant

Filed: April 9, 2008

Date of Patent: December 13, 2011

Assignee: The Board of Regents, University of Texas System

Inventors: Eric Quinnell, Earl E. Swartzlander, Jr., Carl Lemonds
Multiple-word multiplication-accumulation circuit and montgomery modular multiplication-accumulation circuit

Patent number: 8078661

Abstract: A multiple-word multiplication-accumulation circuit suitable for use with a single-port memory. The circuit is composed of a multiplication-accumulation (MAC) operator and surrounding registers. The MAC operator has multiplicand and multiplier input ports with different bit widths to calculate a sum of products of multiple-word data read out of a memory. The registers serve as buffer storage of multiple-word data to be supplied to individual input ports of the MAC operator. The amount of data supplied to the MAC operator in each clock cycle is adjusted such that total amount of data consumed and produced by the MAC operator in one clock cycle will be equal to or smaller than the maximum amount of data that the memory can transfer in one clock cycle. This feature enables the use of a bandwidth-limited single-port memory, without causing adverse effect on the efficiency of MAC operator usage.

Type: Grant

Filed: July 26, 2004

Date of Patent: December 13, 2011

Assignee: Fujitsu Semiconductor Limited

Inventors: Kenji Mukaida, Masahiko Takenaka, Naoya Torii, Shoichi Masui
Datapipe synchronization device

Patent number: 8065356

Abstract: A programmable element for data processing comprises a crosspoint switch (318), a mathematical operation module (320), and a plurality of data hold modules (604,606). Each of the data hold modules (604,606) receives data from the crosspoint switch (318) and communicates the data to an input of the mathematical operation module (320) such that data arrives at the inputs of the mathematical operation module (320) substantially simultaneously. A first data hold module (604) communicates a first data valid signal to a second data hold module (606) upon receipt of first valid data, and the second data hold module communicates a second data valid signal to the first data hold module upon receipt of second valid data.

Type: Grant

Filed: December 20, 2006

Date of Patent: November 22, 2011

Assignee: L3 Communications Integrated Systems, L.P.

Inventors: Jerry William Yancey, Yea Zong Kuo
DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING A RECIPROCAL OPERATION ON AN INPUT VALUE TO PRODUCE A RESULT VALUE

Publication number: 20110276614

Abstract: A data processing apparatus and method are provided for performing a reciprocal operation on an input value d to produce a result value X. The reciprocal operation involves iterative execution of a refinement step to converge on the result value, the refinement step performing the computation: X_i = X_{i-1} * M, where X_i is an estimate of the result value for the i-th iteration of the refinement step, and M is a value determined by a portion of the refinement step. The data processing apparatus comprises a register data store having a plurality of registers operable to store data, and processing logic operable to execute instructions to perform data processing operations on data held in the register data store.

Type: Application

Filed: July 19, 2011

Publication date: November 10, 2011

Applicant: ARM Limited

Inventors: David Raymond Lutz, Christopher Neal Hinds
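
The abstract above does not say how M is formed. A common choice for reciprocal refinement is the Newton-Raphson step, where M = 2 − d·X_{i-1}; the sketch below assumes that form purely for illustration, and the function name, starting estimate, and iteration count are likewise assumptions.

```python
def reciprocal(d: float, x0: float, iterations: int = 5) -> float:
    """Refine an initial estimate x0 of 1/d."""
    x = x0
    for _ in range(iterations):
        m = 2.0 - d * x  # assumed Newton-Raphson form of M
        x = x * m        # refinement step: X_i = X_{i-1} * M
    return x


approx = reciprocal(3.0, 0.3)
```

With this choice of M the error roughly squares on each iteration, so a starting estimate within about 10% of the true value settles to double precision in a handful of steps.
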
Dual multiply-accumulator operation optimized for even and odd multisample calculations

Patent number: 8051121

Abstract: According to some embodiments, a dual multiply-accumulate operation optimized for even and odd multisample calculations is disclosed.

Type: Grant

Filed: March 4, 2008

Date of Patent: November 1, 2011

Assignee: Marvell International Ltd.

Inventors: Bradley C. Aldrich, Nigel C. Paver, William T. Maghielse
HIGH RADIX DIGITAL MULTIPLIER

Publication number: 20110264719

Abstract: The present invention relates to power and hardware efficient digital multipliers configured to multiply an N-bit multiplicand with an M-bit multiplier. The digital multipliers comprise efficient partial product generation through sharing of at least one partial product result.

Type: Application

Filed: September 23, 2009

Publication date: October 27, 2011

Applicant: AUDIOASICS A/S

Inventor: Mikael Mortensen
Fused multiply-add rounding and unfused multiply-add rounding in a single multiply-add module

Patent number: 8046399

Abstract: A computer processor including a single fused-unfused floating point multiply-add (FMA) module computes the result of the operation A*B+C for floating point numbers for fused multiply-add rounding operations and unfused multiply-add rounding operations. In one embodiment, a fused multiply-add rounding implementation is augmented with additional hardware which calculates an unfused multiply-add rounding result without adding additional pipeline stages. In one embodiment, a computation by the fused-unfused floating point multiply-add (FMA) module is initiated using a single opcode which determines whether a fused multiply-add rounding result or unfused multiply-add rounding result is generated.

Type: Grant

Filed: January 25, 2008

Date of Patent: October 25, 2011

Assignee: Oracle America, Inc.

Inventors: Murali K. Inaganti, Leonard D. Rarick
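
The difference between the two rounding behaviors named in the abstract above can be shown numerically. In this sketch the unfused result rounds the product and then the sum, while the fused result is emulated by computing A*B+C exactly with rational arithmetic and rounding once. The function names are hypothetical, the inputs are assumed to be finite doubles, and the emulation says nothing about how the hardware module is built.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Round after the multiply, then again after the add."""
    return (a * b) + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a single rounding of the exact value a*b + c."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # one correctly rounded conversion


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
unfused = unfused_multiply_add(a, b, c)  # product rounds to 1.0 first
fused = fused_multiply_add(a, b, c)      # keeps the -2**-60 term
```
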
Specialized processing block for programmable logic device

Patent number: 8041759

Abstract: A specialized processing block for a programmable logic device incorporates a fundamental processing unit that performs a sum of two multiplications, adding the partial products of both multiplications without computing the individual multiplications. Such fundamental processing units consume less area than conventional separate multipliers and adders. The specialized processing block further has input and output stages, as well as a loopback function, to allow the block to be configured for various digital signal processing operations, including finite impulse response (FIR) filters and infinite impulse response (IIR) filters. By using the programmable connections, and in some cases the programmable resources of the programmable logic device, and by running portions of the specialized processing block at higher clock speeds than the remainder of the programmable logic device, more complex FIR and IIR filters can be implemented.

Type: Grant

Filed: June 5, 2006

Date of Patent: October 18, 2011

Assignee: Altera Corporation

Inventors: Martin Langhammer, Kwan Yee Martin Lee, Orang Azgomi, Keone Streicher, Robert L. Pelt
Channel allocation method for allocating channels to terminal apparatuses to be communicated and base station apparatus utilizing the same

Patent number: 8036165

Abstract: The quality of signals during SDMA is raised. In an uplink, a signal processing unit receives signals respectively from a plurality of terminal apparatuses which have been multiple-accessed by division of time. It derives receiving channel characteristics corresponding to the plurality of terminal apparatuses, respectively, for each time slot. In a downlink, the signal processing unit derives transmitting channel characteristics from the receiving channel characteristics derived and, based on the transmitting channel characteristics derived, it transmits signals respectively to the plurality of terminal apparatuses to which SDMA has been performed.

Type: Grant

Filed: May 16, 2005

Date of Patent: October 11, 2011

Assignee: Kyocera Corporation

Inventors: Takeo Miyata, Katsutoshi Kawai
Apparatus and method for performing efficient multiply-accumulate operations in microprocessors

Patent number: 8015229

Abstract: An apparatus for performing multiply-accumulate operations in a microprocessor comprising operand input registers for receiving data to be operated on, an adder and a multiplier for performing operations on the data, a result output port for presenting results to the microprocessor, a multiplexer for storing results, an accumulator cache for storing an accumulator value internal to the apparatus, and control circuitry for controlling the operation of the apparatus.

Type: Grant

Filed: June 1, 2005

Date of Patent: September 6, 2011

Assignee: Atmel Corporation

Inventors: Øyvind Strøm, Erik Knutsen Renno
Modulus scaling for elliptic-curve cryptography

Patent number: 8005210

Abstract: Modulus scaling applied to reduction techniques decreases the time to perform modular arithmetic operations by avoiding shifting and multiplication operations. Modulus scaling may be applied to both integer and binary fields, and the scaling multiplier factor is chosen based on a selected reduction technique for the modular arithmetic operation.

Type: Grant

Filed: June 30, 2007

Date of Patent: August 23, 2011

Assignee: Intel Corporation

Inventors: Erdinc Ozturk, Vinodh Gopal, Gilbert Wolrich, Wajdi K. Feghali
Method and software for partitioned group element selection operation

Patent number: 8001360

Abstract: A system and software for improving the performance of processors by incorporating an execution unit operable to decode and execute single instructions specifying a data selection operand and a first and a second register providing a plurality of data elements, the data selection operand comprising a plurality of fields each selecting one of the plurality of data elements, the execution unit operable to provide the data element selected by each field of the data selection operand to a predetermined position in a catenated result.

Type: Grant

Filed: January 16, 2004

Date of Patent: August 16, 2011

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris
Method for carry estimation of reduced-width multipliers

Publication number: 20110185000

Abstract: A low-error reduced-width multiplier is provided by the present invention. The multiplier can dynamically compensate the truncation error. The compensation value is derived by the dependencies among the multiplier partial products, and thus, can be analyzed according to the multiplication type and the multiplier input statistics.

Type: Application

Filed: February 28, 2011

Publication date: July 28, 2011

Applicant: National Chiao Tung University

Inventors: Yen-Chin Liao, Hsie-Chia Chang
Efficient elliptic-curve cryptography based on primality of the order of the ECC-group

Patent number: 7986779

Abstract: Time to perform scalar point multiplication used for ECC is reduced by minimizing the number of shifting operations. These operations are minimized by applying modulus scaling by performing selective comparisons of points at intermediate computations based on primality of the order of an ECC group.

Type: Grant

Filed: June 30, 2007

Date of Patent: July 26, 2011

Assignee: Intel Corporation

Inventors: Erdinc Ozturk, Vinodh Gopal, Gilbert Wolrich, Wajdi K. Feghali
Method and apparatus for implementing a multiplier utilizing digital signal processor block memory extension

Patent number: 7987222

Abstract: A method for performing multiplication on a field programmable gate array includes generating a product by multiplying a first plurality of bits from a first number and a first plurality of bits from a second number. A stored value designated as a product of a second plurality of bits from the first number and a second plurality of bits from the second number is retrieved. The product is scaled with respect to a position of the first plurality of bits from the first number and a position of the first plurality of bits from the second number. The stored value is scaled with respect to a position of the second plurality of bits from the first number and a position of the second plurality of bits from the second number. The scaled product and the scaled stored value are summed.

Type: Grant

Filed: April 22, 2004

Date of Patent: July 26, 2011

Assignee: Altera Corporation

Inventors: Asher Hazanchuk, Benjamin Esposito
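
The scale-and-sum idea in the abstract above can be sketched as a partial-product decomposition in which one partial product is read from a table instead of being computed. The split width `K`, the table `LOW_PRODUCTS` (standing in for block memory), and the choice to store the low-by-low products are all assumptions for this illustration, as is the use of four partial products for a general two-way split.

```python
K = 4  # assumed width of the low part, in bits
MASK = (1 << K) - 1

# Precomputed products of every pair of K-bit values (stand-in for memory).
LOW_PRODUCTS = [[x * y for y in range(1 << K)] for x in range(1 << K)]


def split_multiply(a: int, b: int) -> int:
    """Multiply non-negative integers from scaled partial products."""
    a_hi, a_lo = a >> K, a & MASK
    b_hi, b_lo = b >> K, b & MASK
    stored = LOW_PRODUCTS[a_lo][b_lo]  # retrieved rather than multiplied
    high = (a_hi * b_hi) << (2 * K)    # scaled by both bit positions
    cross = (a_hi * b_lo + a_lo * b_hi) << K
    return high + cross + stored


product = split_multiply(0xAB, 0xCD)
```
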
Multithreaded programmable processor and system with partitioned operations

Patent number: 7987344

Abstract: A programmable processor and method for improving the performance of processors by incorporating an execution unit configurable to execute a plurality of instruction streams from the plurality of threads, wherein each instruction stream includes a group instruction that operates on a plurality of data elements in partitioned fields of at least one of the registers to produce a catenated result.

Type: Grant

Filed: January 16, 2004

Date of Patent: July 26, 2011

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris
Bus-based logic blocks with optional constant input

Patent number: 7982496

Abstract: A bus-based logic block for an integrated circuit includes a provision for placing an arbitrary constant onto a data bus in the logic block. An exemplary logic block has multi-bit first and second inputs and a multi-bit output. The logic block includes a multi-bit multiplexer circuit, a multi-bit programmable logic circuit, and a constant generator circuit. The multiplexer circuit has a multi-bit first input coupled to a multi-bit first input of the logic block, a multi-bit second input, and a multi-bit output. The programmable logic circuit has a multi-bit first input coupled to the output of the multiplexer circuit, and a multi-bit output. The constant generator circuit has a multi-bit output coupled to the second input of the multiplexer circuit. Each bit of the logic block may be commonly controlled with all other bits of the logic block.

Type: Grant

Filed: April 2, 2009

Date of Patent: July 19, 2011

Assignee: Xilinx, Inc.

Inventor: Steven P. Young
Scale-invariant barrett reduction for elliptic-curve cyrptography

Patent number: 7978846

Abstract: The computation time to perform scalar point multiplication in an Elliptic Curve Group is reduced by modifying the Barrett Reduction technique. Computations are performed using an N-bit scaled modulus based on a modulus m having k bits to provide a scaled result, with N being greater than k. The N-bit scaled result is reduced to a k-bit result using a pre-computed N-bit scaled reduction parameter in an optimal manner avoiding shifting/aligning operations for any arbitrary values of k, N.

Type: Grant

Filed: June 30, 2007

Date of Patent: July 12, 2011

Assignee: Intel Corporation

Inventors: Erdinc Ozturk, Vinodh Gopal, Gilbert Wolrich, Wajdi K. Feghali
LARGE MULTIPLIER FOR PROGRAMMABLE LOGIC DEVICE

Publication number: 20110161389

Abstract: A plurality of specialized processing blocks in a programmable logic device, including multipliers and circuitry for adding results of those multipliers, can be configured as a larger multiplier by adding to the specialized processing blocks selectable circuitry for shifting multiplier results before adding. In one embodiment, this allows all but the final addition to take place in specialized processing blocks, with the final addition occurring in programmable logic. In another embodiment, additional compression and adding circuitry allows even the final addition to occur in the specialized processing blocks.

Type: Application

Filed: March 8, 2011

Publication date: June 30, 2011

Applicant: ALTERA CORPORATION

Inventors: Martin Langhammer, Kumara Tharmalingam
MULTIPLYING AND ADDING MATRICES

Publication number: 20110153707

Abstract: An apparatus and method are described for multiplying and adding matrices. For example, one embodiment of a method comprises decoding by a decoder in a processor device, a single instruction specifying an m-by-m matrix operation for a set of vectors, wherein each vector represents an m-by-m matrix of data elements and m is greater than one; issuing the single instruction for execution by an execution unit in the processor device; and responsive to the execution of the single instruction, generating a resultant vector, wherein the resultant vector represents an m-by-m matrix of data elements.

Type: Application

Filed: December 10, 2010

Publication date: June 23, 2011

Inventors: Boris Ginzburg, Simon Rubanovich, Benny Eitan
Method, system and medium for controlling manufacture process having multivariate input parameters

Patent number: 7966087

Abstract: A method, system, and medium of modeling and/or for controlling a manufacturing process is disclosed. In particular, a method according to embodiments of the present invention includes calculating a set of predicted output values, and obtaining a prediction model based on a set of input parameters, the set of predicted output values, and empirical output values. Each input parameter causes a change in at least two outputs. The method also includes optimizing the prediction model by minimizing differences between the set of predicted output values and the empirical output values, and adjusting the set of input parameters to obtain a set of desired output values to control the manufacturing apparatus. Obtaining the prediction model includes transforming the set of input parameters into transformed input values using a transformation function of multiple coefficient values, and calculating the predicted output values using the transformed input values.

Type: Grant

Filed: July 31, 2007

Date of Patent: June 21, 2011

Assignee: Applied Materials, Inc.

Inventors: Yuri Kokotov, Efim Entin, Jacques Seror, Yossi Fisher, Shalomo Sarel, Arulkumar P. Shanmugasundram, Alexander T. Schwarm, Young Jeen Paik
Arithmetic method and device of reconfigurable processor

Patent number: 7958179

Abstract: Provided are an arithmetic method and device of a reconfigurable processor. The arithmetic device includes: an Arithmetic Logic Unit (ALU) for performing an addition and subtraction operation and a logic operation of a binary signal; a multiplier for performing a multiplication operation of the binary signal; a shifter for changing an arrangement of the binary signal; a first operand selector and a second operand selector each for selecting one of values output from the ALU, the multiplier, and the shifter; and an adder for adding the values selected by the first operand selector and the second operand selector.

Type: Grant

Filed: October 30, 2007

Date of Patent: June 7, 2011

Assignee: Electronics and Telecommunications Research Institute

Inventors: Chun Gi Lyuh, Soon Il Yeo, Tae Moon Roh, Jong Dae Kim
X87 fused multiply-add instruction

Patent number: 7917568

Abstract: An x87 fused multiply-add (FMA) instruction in the instruction set of an x86 architecture microprocessor is disclosed. The FMA instruction implicitly specifies the two factor operands as the top two operands of the x87 FPU register stack and explicitly specifies the third addend operand as a third x87 FPU register stack register. The microprocessor multiplies the first two operands and adds the product to the third operand to generate a result. The result is stored into the third register and the first two operands are popped off the stack. In an alternate embodiment, the third operand is also implicitly specified as being stored in the register that is two registers below the top of stack register; the result is also stored therein. The instruction opcode value is in the x87 opcode range.

Type: Grant

Filed: July 23, 2007

Date of Patent: March 29, 2011

Assignee: VIA Technologies, Inc.

Inventors: G. Glenn Henry, Timothy A. Elliott, Terry Parks
Mode-based multiply-add recoding for denormal operands

Patent number: 7912887

Abstract: In a denormal support mode, the normalization circuit of a floating-point adder is used to normalize or denormalize the output of a floating-point multiplier. Each floating-point multiply instruction is speculatively converted to a multiply-add instruction, with the addend forced to zero. This preserves the value of the product, while normalizing or denormalizing the product using the floating-point adder's normalization circuit. When the operands to the multiply operation are available, they are inspected. If the operands will not generate an unnormal intermediate product or a denormal final product, the add operation is suppressed, such as by operand-forwarding. Additionally, each non-fused floating-point multiply-add instruction is replaced with a multiply-add instruction having a zero addend, and a floating-point add instruction having the addend of the original multiply-add instruction is inserted into the instruction stream.

Type: Grant

Filed: May 10, 2006

Date of Patent: March 22, 2011

Assignee: QUALCOMM Incorporated

Inventors: Kenneth Alan Dockser, Pathik Sunil Lall
Function Generator

Publication number: 20110055303

Abstract: One embodiment relates to a method for generating a periodic function in response to an argument in a digital signal processing system, where the periodic function can be represented as functions of two or more components of the argument. The method may include: obtaining a first operand from one of two or more lookup tables in response to a first component of the argument; obtaining a second operand from one of the lookup tables in response to a second component of the argument; conditionally mirroring the first and second operands in response to a quadrant of the argument; and calculating a value of the periodic function in response to the operands with a linear algebra unit without using conditional code execution.

Type: Application

Filed: March 15, 2010

Publication date: March 3, 2011

Applicant: AZURAY TECHNOLOGIES, INC.

Inventor: Keith Slavin
Method And System For Multi-Precision Computation

Publication number: 20110055308

Abstract: Systems and methods for multi-precision computation are disclosed. One embodiment of the present invention includes a plurality of multiply-add units (MADDs) configured to perform one or more single precision operations and an arrangement generator to generate one or more mantissa arrangements using a plurality of double precision numbers. Each MADD is configured to receive and load said mantissa arrangements from the arrangement generator. The MADDs compute a result of a multi-precision computation using the mantissa arrangements. In an embodiment, the MADDs are configured to simultaneously perform operations that include single-precision operations, double-precision additions, and double-precision multiply-and-add operations.

Type: Application

Filed: June 10, 2010

Publication date: March 3, 2011

Applicant: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Daniel B. Clifton, Christopher Spencer
Microprocessor with rounding dot product instruction

Patent number: 7890566

Abstract: A functional unit in a digital system is provided with a rounding DOT product instruction, wherein a product of first pair of elements is combined with a product of second pair of elements, the combined product is rounded, and the final result is stored in a destination. Rounding is performed by adding a rounding value to form an intermediate result, and then shifting the intermediate result right. A combined result is rounded to a fixed length shorter than the combined product. The products are combined by either addition or subtraction. An overflow resulting from the combination or from rounding is not reported.

Type: Grant

Filed: October 31, 2000

Date of Patent: February 15, 2011

Assignee: Texas Instruments Incorporated

Inventor: Joseph R. Zbiciak
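
The rounding scheme in the abstract above (add a rounding value, then shift right) can be sketched in a few lines. The function name and the choice of half the discarded range as the rounding value are assumptions for illustration; element widths and the unreported-overflow behavior are not modeled.

```python
def rounding_dot(a, b, shift: int, subtract: bool = False) -> int:
    """Combine two products, add a rounding value, then shift right."""
    first, second = a[0] * b[0], a[1] * b[1]
    combined = first - second if subtract else first + second
    rounding_value = 1 << (shift - 1)  # assumes shift >= 1
    return (combined + rounding_value) >> shift


rounded_up = rounding_dot((4, 4), (5, 5), shift=4)    # 40/16 = 2.5
rounded_down = rounding_dot((3, 4), (5, 6), shift=4)  # 39/16 = 2.4375
```
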
LOW POWER FIR FILTER IN MULTI-MAC ARCHITECTURE

Publication number: 20110029589

Abstract: Embodiments of the invention are directed to a system and method that enable relatively low power dissipation by scheduling operations of a chain of two or more multiply-accumulator units, delivering an output result of a first multiply accumulator of the chain as an input to a second, subsequent multiply accumulator of the chain.

Type: Application

Filed: July 30, 2009

Publication date: February 3, 2011

Inventor: Jeffrey Allan (Alon) JACOB (YAAKOV)
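
The chaining described in the abstract above, where each multiply-accumulator passes its result to the next, maps naturally onto an FIR filter with one tap per unit. The sketch below shows only that dataflow; the function name is hypothetical, samples before the start of the input are assumed to be zero, and the power-aware scheduling itself is not modeled.

```python
def fir_mac_chain(samples, taps):
    """FIR filter computed as a chain of multiply-accumulate stages."""
    outputs = []
    for n in range(len(samples)):
        partial = 0  # input to the first MAC in the chain
        for k, tap in enumerate(taps):
            x = samples[n - k] if n - k >= 0 else 0
            # MAC k adds its product to the previous MAC's output.
            partial = partial + tap * x
        outputs.append(partial)
    return outputs


filtered = fir_mac_chain([1, 2, 3], [1, 1])
```
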
Digital signal processors with configurable dual-MAC and dual-ALU

Patent number: 7873815

Abstract: DSP architectures having improved performance are described. In an exemplary architecture, a DSP includes two MAC units and two ALUs, where one of the ALUs replaces an adder for one of the two MAC units. This DSP may be configured to operate in a dual-MAC/single-ALU configuration, a single-MAC/dual-ALU configuration, or a dual-MAC/dual-ALU configuration. This flexibility allows the DSP to handle various types of signal processing operations and improves utilization of the available hardware. The DSP architectures further include pipeline registers that break up critical paths and allow operations at a higher clock speed for greater throughput.

Type: Grant

Filed: March 4, 2004

Date of Patent: January 18, 2011

Assignee: QUALCOMM Incorporated

Inventors: Gilbert C. Sih, De D. Hsu, Way-Shing Lee, Xufeng Chen
ARITHMETIC PROCESSING UNIT THAT PERFORMS MULTIPLY AND MULTIPLY-ADD OPERATIONS WITH SATURATION AND METHOD THEREFOR

Publication number: 20100306301

Abstract: Sum and carry signals are formed representing a product of a first and a second operand. A bias signal is formed having a value determined by a sign of a product of the first and the second operand. An output signal is provided based on an addition of the sum signal, the carry signal, a sign-extended addend, and the bias signal. A portion of the output signal, a saturated minimum value, or a saturated maximum value, is selected as a final result based on the sign of the product and a sign of the output signal.

Type: Application

Filed: May 27, 2009

Publication date: December 2, 2010

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Kevin A. Hurd, Scott A. Hilker
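
The end result of a saturating multiply-add can be illustrated by clamping the exact result to a signed range. This sketch shows only that outcome; the function name and the 16-bit default width are assumptions, and the sum/carry and bias-signal mechanism in the abstract above is not modeled.

```python
def saturating_multiply_add(a: int, b: int, c: int, bits: int = 16) -> int:
    """Return a*b + c clamped to the signed range of the given width."""
    minimum = -(1 << (bits - 1))    # saturated minimum value
    maximum = (1 << (bits - 1)) - 1  # saturated maximum value
    return max(minimum, min(maximum, a * b + c))


in_range = saturating_multiply_add(3, 4, 5)
too_large = saturating_multiply_add(300, 200, 0)
too_small = saturating_multiply_add(-300, 200, 0)
```
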
Method and device for performing a cryptographic operation

Patent number: 7822199

Abstract: A method and device for performing a cryptographic operation by a device controlled by a security application executed outside thereof in which a cryptographic value (y) is produced by a calculation comprising at least one multiplication between first and second factors containing a security key (s) associated with the device and a challenge number (c) provided by the security application. The first multiplication factor comprises a determined number of bits (L) in a binary representation and the second factor is constrained in such a way that it comprises, in a binary representation, several bits at 1 with a sequence of at least L−1 bits at 0 between each pair of consecutive bits at 1, while the multiplication is carried out by assembling the binary versions of the first factor shifted according to positions of the bits at 1 of the second factor, respectively.

Type: Grant

Filed: February 24, 2005

Date of Patent: October 26, 2010

Assignee: France Telecom

Inventors: Marc Girault, David Lefranc
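
The constraint in the abstract above means the shifted copies of the first factor never overlap, so the product can be assembled without any carry propagation. The sketch below demonstrates this with a bitwise OR; the function name and the way the second factor is passed (as a list of positions of its 1 bits) are assumptions for illustration.

```python
def assemble_product(s: int, one_positions, length: int) -> int:
    """Multiply s by a sparse factor given by the positions of its 1 bits."""
    assert s.bit_length() <= length
    ordered = sorted(one_positions)
    # At least length-1 zero bits between consecutive ones means the
    # positions differ by at least `length`.
    assert all(q - p >= length for p, q in zip(ordered, ordered[1:]))
    product = 0
    for position in ordered:
        product |= s << position  # copies do not overlap, so OR suffices
    return product


s = 0b1011
positions = [0, 4, 9]
c = sum(1 << p for p in positions)
assembled = assemble_product(s, positions, length=4)
```
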
Method and apparatus for dynamically fusing instructions at execution time in a processor of an information handling system

Patent number: 7818550

Abstract: One embodiment of a processor includes a fetch stage, decoder stage, execution stage and completion stage. The execution stage includes a primary execution stage for handling low latency instructions and a secondary execution stage for handling higher latency instructions. A detector determines if an instruction is a high latency instruction or a low latency instruction. If the detector also finds that a particular low latency instruction is dependent on, and destructive of, a corresponding high latency instruction, then the secondary execution stage dynamically fuses the execution of the low latency instruction together with the execution of the high latency instruction. Otherwise, the primary execution stage handles the execution of the low latency instruction.

Type: Grant

Filed: July 23, 2007

Date of Patent: October 19, 2010

Assignee: International Business Machines Corporation

Inventor: Michael Thomas Vaden
SYSTOLIC ARRAY AND CALCULATION METHOD

Publication number: 20100250640

Abstract: A linear systolic array is added to the lower side of a trapezoid systolic array created by combining a triangular systolic array and a square systolic array. In order to make the connection among the cells fixed, the intermediate result output from each row of the trapezoid systolic array to a lower row is shifted in phase with respect to the intermediate result of the complex MFA algorithm, the phase shift is absorbed by the next row, and the phase shift in the intermediate result output from the last row of the trapezoid systolic array is corrected by the linear systolic array. Each cell is implemented by a CORDIC circuit that processes vector angle computation, vector rotation, division, and multiply-and-accumulate with a constant delay.

Type: Application

Filed: November 21, 2008

Publication date: September 30, 2010

Inventor: Katsutoshi Seki
Scalable Montgomery Multiplication Architecture

Publication number: 20100235414

Abstract: A Montgomery multiplication device calculates a Montgomery product of an operand X and an operand Y with respect to a modulus M and includes a plurality of processing elements. In a first clock cycle, two intermediate partial sums are created by obtaining an input of length w-1 from a preceding processing element as w-1 least significant bits. The most significant bit is configured as either zero or one. Then, two partial sums are calculated using a word of the operand Y, a word of the modulus M, a bit of the operand X, and the two intermediate partial sums. In a second clock cycle, a selection bit is obtained from a subsequent processing element and one of the two partial sums is selected based on the value of the selection bit. Then, the selected partial sum is used for calculation of a word of the Montgomery product.

Type: Application

Filed: March 1, 2010

Publication date: September 16, 2010

Inventors: Miaoqing Huang, Krzysztof Gaj
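
For orientation, the Montgomery product named in the abstract above can be sketched with the textbook bit-serial (radix-2) algorithm. This is a minimal sketch only: it does not model the word-level processing elements or the two speculative partial sums the application describes, and the function name and parameters are hypothetical.

```python
def montgomery_product(x, y, m, n_bits):
    """Return x * y * 2**(-n_bits) mod m.

    Assumes m is odd, m < 2**n_bits, and 0 <= x, y < m.
    """
    s = 0
    for i in range(n_bits):
        x_i = (x >> i) & 1      # next bit of operand X, least significant first
        s += x_i * y            # conditionally add operand Y
        if s & 1:               # make the running sum even by adding the modulus
            s += m
        s >>= 1                 # exact division by two
    if s >= m:                  # single final correction, since s < 2*m here
        s -= m
    return s
```

Multiplying the returned value by 2**n_bits and reducing mod m gives back x*y mod m, which is a convenient check on the result.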
Microcontroller with low-cost digital signal processing extensions

Patent number: 7797516

Abstract: A set of low-cost microcontroller extensions facilitates Digital Signal Processing (DSP) applications by incorporating a Multiply-Accumulate (MAC) unit in a Central Processing Unit (CPU) of the microcontroller which is responsive to the extensions.

Type: Grant

Filed: March 16, 2007

Date of Patent: September 14, 2010

Assignee: ATMEL Corporation

Inventors: Benjamin Francis Froemming, Emil Lambrache
DATA PROCESSING DEVICE

Publication number: 20100211622

Abstract: In a determination as to similarity on parts of a piece of data, high-speed processing is performed without the need for a database. Division signal lines (L1 to Lk) that transmit signals corresponding to division data are used.

Type: Application

Filed: September 25, 2008

Publication date: August 19, 2010

Inventor: Akiyoshi Oguro
ARITHMETIC CIRCUIT FOR MONTGOMERY MULTIPLICATION AND ENCRYPTION CIRCUIT

Publication number: 20100183145

Abstract: An arithmetic circuit capable of Montgomery multiplication using only a one-port RAM is disclosed. In a first read process, b[i] is read from a memory M2 of a sync one-port RAM for storing a[s-1:0] and b[s-1:0] and stored in a register R1. In a second read process, a[j] is read from the memory M2, t[j] from a memory M1 of a sync one-port RAM for storing t[s-1:0], b[i] from the register R1, and a value RC from a register R2, and input to a sum-of-products calculation circuit 10 for calculating t[j]+a[j]*b[j]+RC. In a write process, the calculation result data FH is written in the register R2, and the calculation result data FL in the memory M1 as t[j]. A first subloop process for repeating the second read process, the sum-of-products calculation process and the write process is executed after the first read process.

Type: Application

Filed: January 12, 2010

Publication date: July 22, 2010

Inventor: Shigeo OHYAMA
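
The first subloop in the abstract above (read, sum of products, write) can be sketched as a word-by-word multiply-accumulate with a carry word. This is a rough sketch under stated assumptions: the 16-bit word width and the names `mac_row` and `b_i` are hypothetical, the multiplier is taken to be the word b[i] held in register R1 (the abstract prints the product as a[j]*b[j]), and the Montgomery reduction step is not modeled.

```python
WORD_BITS = 16                      # assumed word width, not given in the abstract
MASK = (1 << WORD_BITS) - 1

def mac_row(t, a, b_i):
    """Add a * b_i into the word array t in place and return the final carry.

    t and a are lists of words, least significant word first.
    """
    rc = 0                           # models register R2 (carry word RC)
    for j in range(len(a)):
        f = t[j] + a[j] * b_i + rc   # sum-of-products step
        t[j] = f & MASK              # low half FL is written back as t[j]
        rc = f >> WORD_BITS          # high half FH becomes the next RC
    return rc
```

The sum never exceeds two words, because (2^w - 1) + (2^w - 1)^2 + (2^w - 1) = 2^(2w) - 1, so FH always fits in a single word.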
FLEXIBLE ACCUMULATOR IN DIGITAL SIGNAL PROCESSING CIRCUITRY

Publication number: 20100169404

Abstract: A multiplier-accumulator (MAC) block can be programmed to operate in one or more modes. When the MAC block implements at least one multiply-and-accumulate operation, the accumulator value can be zeroed without introducing clock latency or initialized in one clock cycle. To zero the accumulator value, the most significant bits (MSBs) of data representing zero can be input to the MAC block and sent directly to the add-subtract-accumulate unit. Alternatively, dedicated configuration bits can be set to clear the contents of a pipeline register for input to the add-subtract-accumulate unit.

Type: Application

Filed: January 7, 2010

Publication date: July 1, 2010

Inventors: Leon Zheng, Martin Langhammer, Nitin Prasad, Greg Starr, Chiao Kai Hwang, Kumara Tharmalingam
Circuit architecture for an integrated circuit

Patent number: 7728624

Abstract: An integrated circuit comprising at least one group having multiple arithmetic/logic units arranged in sub-groups. In the sub-groups at inputs of multiple arithmetic/logic units, in each case a single one of the first selection units is connected on the input side, wherein no other selection unit is connected directly on the input side of this selection unit. The first selection units are coupled to each other such that a horizontal and/or vertical logical interconnection of the arithmetic/logic units within a group, and/or a logical interconnection of arithmetic/logic units to an upstream group can be implemented. Second selection units are in each case connected on the output side of a column of arithmetic/logic units. The second selection units of a group are connected on the output side to one bus each, and a microprocessor is coupled to this bus.

Type: Grant

Filed: October 10, 2006

Date of Patent: June 1, 2010

Assignee: Micronas GmbH

Inventor: Gert Umbach
Multiply-accumulate unit and method of operation

Patent number: 7730118

Abstract: An arithmetic unit for selectively implementing one of a multiply and multiply-accumulate instruction, including a multiplier, addition circuitry, a result register, and accumulator circuitry. The multiplier arranged to receive first and second operands and operable to generate multiplication terms. The addition circuitry for receiving multiplication terms from the multiplier and operable to combine them to generate a multiplication result. The result register for receiving the multiplication result from the adder. The accumulator circuitry connected to receive a value stored in the result register and an accumulate control signal which determines whether the arithmetic unit implements a multiply or a multiply-accumulate instruction.

Type: Grant

Filed: April 7, 2006

Date of Patent: June 1, 2010

Assignee: STMicroelectronics (Research & Development) Limited

Inventor: Tariq Kurd
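
The selection between multiply and multiply-accumulate described in the abstract above can be sketched behaviorally. This is an illustrative model only: the class and method names are hypothetical, and the multiplier's partial-product terms and the addition circuitry are collapsed into a single product.

```python
class MultiplyAccumulateUnit:
    """Behavioral model of a unit that multiplies or multiply-accumulates."""

    def __init__(self):
        self.result = 0  # models the result register

    def execute(self, a, b, accumulate):
        """Run one instruction; `accumulate` models the accumulate control signal."""
        product = a * b
        if accumulate:
            # Multiply-accumulate: add the product to the stored result.
            self.result = self.result + product
        else:
            # Plain multiply: the product replaces the stored result.
            self.result = product
        return self.result
```

For example, a multiply of 2 and 3 followed by a multiply-accumulate of 4 and 5 leaves 26 in the result register.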
Method and apparatus for computing matrix transformations

Patent number: 7725521

Abstract: A method and apparatus for performing matrix transformations including multiply-add operations and byte shuffle operations on packed data in a processor. In one embodiment, two rows of content byte elements are shuffled to generate a first and second packed data respectively including elements of a first two columns and of a second two columns. A third packed data including sums of products is generated from the first packed data and elements from two rows of a matrix by a multiply-add instruction. A fourth packed data including sums of products is generated from the second packed data and elements from two more rows of the matrix by another multiply-add instruction. Corresponding sums of products of the third and fourth packed data are then summed to generate two rows of a product matrix. Elements of the product matrix may be generated in an order that further facilitates a second matrix multiplication.

Type: Grant

Filed: October 10, 2003

Date of Patent: May 25, 2010

Assignee: Intel Corporation

Inventors: Yen-Kuang Chen, Eric Q. Li, William W. Macy, Jr., Minerva M. Yeung
Method and apparatus for providing a processor based nested form polynomial engine

Patent number: 7716268

Abstract: A method and apparatus for providing a processor based nested form polynomial engine are disclosed. A concise instruction format is provided to significantly decrease memory required and allow for instruction pipelining without branch penalty using a nested form polynomial engine. The instruction causes a processor to set coefficient and data address pointers for evaluating a polynomial, to load a coefficient and data operand into a coefficient register and a data register, respectively, to multiply the contents of the coefficient register and data register to produce a product, to add a next coefficient operand to the product to produce a sum, to provide the sum to an accumulator and to repeat the loading, multiplying, adding and providing until evaluation of the polynomial is complete.

Type: Grant

Filed: March 4, 2005

Date of Patent: May 11, 2010

Assignee: Hitachi Global Storage Technologies Netherlands B.V.

Inventors: Jeffrey J. Dobbek, Kirk Hwang
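
The multiply, add, and repeat loop in the abstract above corresponds to nested-form (Horner) evaluation of a polynomial. A minimal sketch follows, assuming coefficients are supplied from the highest degree down; the function name is hypothetical and the address pointers and registers of the patented engine are not modeled.

```python
def horner(coefficients, x):
    """Evaluate a polynomial in nested form.

    `coefficients` is ordered from the highest-degree term to the constant term.
    """
    accumulator = 0
    for coefficient in coefficients:
        # One multiplication followed by one addition per coefficient.
        accumulator = accumulator * x + coefficient
    return accumulator
```

With coefficients [2, 3, 4] and x = 5 this computes (2*5 + 3)*5 + 4 = 69, using one multiply and one add per coefficient and no explicit powers of x.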
Method and system for performing parallel integer multiply accumulate operations on packed data

Patent number: 7716269

Abstract: A multiply accumulate unit (“MAC”) that performs operations on packed integer data. In one embodiment, the MAC receives two 32-bit data words which, depending on the specified mode of operation, each contain either four 8-bit operands, two 16-bit operands, or one 32-bit operand. Depending on the mode of operation, the MAC performs either sixteen 8×8 operations, four 16×16 operations, or one 32×32 operation. Results may be individually retrieved from registers and the corresponding accumulator cleared after the read cycle. In addition, the accumulators may be globally initialized. Two results from the 8×8 operations may be packed into a single 32-bit register. The MAC may also shift and saturate the products as required.

Type: Grant

Filed: June 16, 2005

Date of Patent: May 11, 2010

Assignee: Cradle Technologies

Inventors: Moshe B. Simon, Erik P. Machnicki, David A. Harrison, Rakesh K. Singh
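
The operation counts in the abstract above (sixteen 8×8, four 16×16, or one 32×32) match forming every lane-by-lane product of the two packed words. The sketch below illustrates that reading under several assumptions: lanes are treated as unsigned, the accumulator ordering is arbitrary, shifting and saturation are omitted, and the names `unpack` and `packed_mac` are hypothetical.

```python
def unpack(word, lane_bits):
    """Split a 32-bit word into unsigned lanes, least significant lane first."""
    mask = (1 << lane_bits) - 1
    return [(word >> shift) & mask for shift in range(0, 32, lane_bits)]

def packed_mac(acc, word_a, word_b, lane_bits):
    """Accumulate every lane-by-lane product of two packed 32-bit words.

    `acc` must hold (32 // lane_bits) ** 2 entries and is updated in place.
    """
    lanes_a = unpack(word_a, lane_bits)
    lanes_b = unpack(word_b, lane_bits)
    k = 0
    for a in lanes_a:
        for b in lanes_b:
            acc[k] += a * b      # one multiply-accumulate per lane pair
            k += 1
    return acc
```

Calling `packed_mac([0] * 16, word_a, word_b, 8)` performs sixteen 8×8 operations; a lane width of 16 needs four accumulators, and a lane width of 32 needs one.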
Common shift-amount calculation for binary and hex floating point

Patent number: 7716266

Abstract: A method and system for performing a binary mode and hexadecimal mode Multiply-Add floating point operation in a floating point arithmetic unit according to a formula A*C+B, wherein A, B and C operands each have a fraction and an exponent part expA, expB and expC and the exponent of the product A*C is calculated and compared to the exponent of the addend under inclusion of an exponent bias value dedicated to use unsigned biased exponents, wherein the comparison yields a shift amount used for aligning the addend with the product operand, wherein a shift amount calculation provides a common value CV for both binary and hexadecimal according to the formula (expA+expC-expB+CV).

Type: Grant

Filed: January 26, 2006

Date of Patent: May 11, 2010

Assignee: International Business Machines Corporation

Inventors: Son Dao Trong, Juergen Haess, Klaus Michael Kroener, Eric M. Schwarz
Programmable logic device with specialized functional block

Patent number: 7698358

Abstract: In a programmable logic device having a specialized functional block incorporating multipliers and adders, multiplication operations that do not fit neatly into the available multipliers are performed partially in the multipliers of the specialized functional block and partially in multipliers configured in programmable logic of the programmable logic device. Unused resources of the specialized functional block, including adders, may be used to add together the partial products produced inside and outside the specialized functional block. Some adders configured in programmable logic of the programmable logic device also may be used for that purpose.

Type: Grant

Filed: December 24, 2003

Date of Patent: April 13, 2010

Assignee: Altera Corporation

Inventors: Martin Langhammer, Leon Zheng, Chiao Kai Hwang, Gregory Starr
Enhanced floating-point unit for extended functions

Patent number: 7676535

Abstract: An embodiment of the present invention is a technique to perform floating-point operations. A floating-point (FP) squarer squares a first argument to produce an intermediate argument. The first and intermediate arguments have first and intermediate mantissas and exponents. A FP multiply-add (MAD) unit performs a multiply-and-add operation on the intermediate argument, a second argument, and a third argument to produce a result having a result mantissa and a result exponent. The second and third arguments have second and third mantissas and exponents, respectively.

Type: Grant

Filed: September 28, 2005

Date of Patent: March 9, 2010

Assignee: Intel Corporation

Inventors: David D. Donofrio, Xuye Li
METHOD FOR COMPUTERIZED ARITHMETIC OPERATIONS

Publication number: 20100023569

Abstract: A method of computing arithmetic operations more efficiently than the conventional Arithmetic Logic Unit (ALU) is disclosed. By encoding both operands from Binary Coded Decimal (BCD) codes (0000 to 1001) into decimal digits (0 to 9), inputting them in the GerTh's™ look-up tables, which are made of an array of AND gates, the invention finds the answer more efficiently. This method finds the result in fewer steps than a traditional ALU by reducing the repetitive calculation steps and logic gates required. And this new method makes the unsolvable computerized binary floating-point multiplications and divisions back to the solvable GerTh's computerized decimal digits' (0-9) elementary arithmetic operations.

Type: Application

Filed: July 22, 2008

Publication date: January 28, 2010

Applicant: DAW SHIEN SCIENTIFIC RESEARCH & DEVELOPMENT, INC.

Inventors: James Shihfu Shiao, Albert Shihyung Shiao
Method and Apparatus for Efficient Integer Transform

Publication number: 20100011042

Abstract: A method and apparatus for including in a processor instructions for performing integer transforms including multiply-add operations and horizontal-add operations on packed data. In one embodiment, a processor is coupled to a memory that stores a first packed byte data and a second packed byte data. The processor performs operations on said first packed byte data and said second packed byte data to generate a third packed data in response to receiving a multiply-add instruction. A plurality of the 16-bit data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed byte data. The processor adds together at least a first and a second 16-bit data element of the third packed data in response to receiving a horizontal-add instruction to generate a 16-bit result as one of a plurality of data elements of a fourth packed data.

Type: Application

Filed: September 15, 2009

Publication date: January 14, 2010

Inventors: Eric Debes, William W. Macy, Jonathan J. Tyler
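
The multiply-add and horizontal-add steps in the abstract above can be sketched on plain lists. This sketch makes assumptions the abstract does not state: products of adjacent element pairs are summed, as in common SIMD multiply-add instructions, and 16-bit wraparound or saturation is ignored. The function names are hypothetical.

```python
def multiply_add(a_bytes, b_bytes):
    """Multiply corresponding elements, then sum each adjacent pair of products."""
    products = [a * b for a, b in zip(a_bytes, b_bytes)]
    return [products[i] + products[i + 1] for i in range(0, len(products), 2)]

def horizontal_add(values):
    """Add each adjacent pair of elements within one packed value."""
    return [values[i] + values[i + 1] for i in range(0, len(values), 2)]
```

For inputs [1, 2, 3, 4] and [5, 6, 7, 8], `multiply_add` yields [17, 53] and `horizontal_add` of that result yields [70], the full dot product of the two inputs.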
