Pipeline Patents (Class 708/631)
  • Patent number: 11816448
    Abstract: An ALU is capable of generating a multiply accumulation by compressing like-magnitude partial products. Given N pairs of multiplier and multiplicand, Booth encoding is used to encode the multipliers into M digits, and M partial products are produced for each pair, with each partial product in a smaller precision than a final product. The partial products resulting from the same encoded multiplier digit position are summed across all the multiplies to produce a summed partial product. In this manner, the partial product summation operations can be advantageously performed in the smaller precision. The M summed partial products are then summed together with an aggregated fixup vector for sign extension. If the N multipliers are equal to a constant, a preliminary fixup vector can be generated based on a predetermined value with adjustment on particular bits, where the predetermined value is determined by the signs of the encoded multiplier digits.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: November 14, 2023
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: David Carlson
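
The summation order described in the abstract of 11816448 can be illustrated with a minimal Python sketch. It assumes radix-4 Booth encoding and signed 8-bit multipliers; the abstract fixes neither. The function names are hypothetical, and because Python integers are unbounded the sign-extension fixup vector is not modeled.

```python
def booth_radix4_digits(multiplier: int, bits: int = 8) -> list[int]:
    """Radix-4 Booth digits (each in -2..2), least significant first."""
    if bits % 2:
        raise ValueError("bits must be even")
    if not -(1 << (bits - 1)) <= multiplier < (1 << (bits - 1)):
        raise ValueError("multiplier does not fit in the given width")
    pattern = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_mac(pairs: list[tuple[int, int]], bits: int = 8) -> int:
    """Sum of multiplier * multiplicand over all pairs.

    Partial products from the same digit position are added across all
    pairs first; only the M column sums are shifted and combined.
    """
    column_sums = [0] * (bits // 2)
    for multiplier, multiplicand in pairs:
        for pos, digit in enumerate(booth_radix4_digits(multiplier, bits)):
            column_sums[pos] += digit * multiplicand  # narrow partial product
    return sum(s << (2 * pos) for pos, s in enumerate(column_sums))
```

In this sketch each column sum only has to hold N narrow partial products, which is the point the abstract makes about working in the smaller precision.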
  • Patent number: 9607586
    Abstract: Techniques are disclosed relating to asymmetric circuits. In some embodiments, a storage element is configured to maintain a first input value as an input to an asymmetric circuit during a time interval. For example, in one embodiment, the time interval may correspond to a frame of video data and the storage element may be configured to store a filter coefficient for the frame of video data. In some embodiments, the storage element may be configured to store the value as a constant for multiple operations by the asymmetric circuit. In some embodiments, the asymmetric circuit is configured to generate a plurality of output values based on the first input value and respective ones of a set of second input values. In some embodiments, the asymmetric circuit is leakage power asymmetric and/or critical path asymmetric. This may increase performance and/or reduce power consumption.
    Type: Grant
    Filed: February 18, 2014
    Date of Patent: March 28, 2017
    Assignee: Apple Inc.
    Inventors: Michael L. Liu, Liang Deng
  • Patent number: 8694573
    Abstract: A method for determining a quotient value from a dividend value and a divisor value in a digital processing circuit is provided. The method includes computing a reciprocal value of the divisor value and multiplying the reciprocal value by the dividend value to obtain a reciprocal product, the reciprocal product having an integer part. The method also includes computing an intermediate remainder value by computing a product of the integer part and the divisor value, and subtracting the resulting product from the dividend value and determining the quotient value based upon the intermediate remainder value.
    Type: Grant
    Filed: December 24, 2009
    Date of Patent: April 8, 2014
    Assignee: Jadavpur University
    Inventors: Debotosh Bhattacharjee, Santanu Halder
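
As a rough illustration of the steps in the abstract of 8694573, the sketch below divides non-negative integers using a fixed-point reciprocal. The 32-bit fractional width, the function name, and the final correction loop are assumptions made for the example; the abstract says only that the quotient is determined from the intermediate remainder.

```python
def divide_via_reciprocal(dividend: int, divisor: int, frac_bits: int = 32) -> tuple[int, int]:
    """Return (quotient, remainder) for dividend >= 0 and divisor > 0."""
    if dividend < 0 or divisor <= 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")
    reciprocal = (1 << frac_bits) // divisor          # fixed-point 1/divisor, rounded down
    quotient = (dividend * reciprocal) >> frac_bits   # integer part of the reciprocal product
    remainder = dividend - quotient * divisor         # intermediate remainder
    # The truncated reciprocal can only underestimate the quotient,
    # so the remainder tells us how much to add back (assumed correction step).
    while remainder >= divisor:
        quotient += 1
        remainder -= divisor
    return quotient, remainder
```

For example, 49 divided by 7 first yields an integer part of 6 with an intermediate remainder of 7, and the correction step brings the result to (7, 0).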
  • Patent number: 8161308
    Abstract: A circuit includes: an input buffer for storing input data; a plurality of processing sections connected in series including a head processing section and a tail-end processing section to sequentially process the input data; and a power supply controller for controlling power supply to each of the plurality of processing sections depending on a lapse of time during which no input data is stored in the input buffer.
    Type: Grant
    Filed: March 27, 2009
    Date of Patent: April 17, 2012
    Assignee: NEC Corporation
    Inventor: Hidenori Hisamatsu
  • Patent number: 8073892
    Abstract: In general, in one aspect, the disclosure describes a multiplier that includes a set of multiple multipliers configured in parallel where the set of multiple multipliers have access to a first operand and a second operand to multiply, the first operand having multiple segments and the second operand having multiple segments. The multiplier also includes logic to repeatedly supply a single segment of the second operand to each multiplier of the set of multiple multipliers and to supply multiple respective segments of the first operand to the respective ones of the set of multiple multipliers until each segment of the second operand has been supplied with each segment of the first operand. The logic shifts the output of different ones of the set of multiple multipliers based, at least in part, on the position of the respective segments within the first operand. The multiplier also includes an accumulator coupled to the logic.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: December 6, 2011
    Assignee: Intel Corporation
    Inventors: Wajdi K. Feghali, William C. Hasenplaugh, Gilbert M. Wolrich, Daniel R. Cutter, Vinodh Gopal, Gunnar Gaubatz
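
The segment scheduling in the abstract of 8073892 can be sketched in Python as follows. The 16-bit segment width, the segment counts, and the function name are illustrative assumptions; the inner sum stands in for the set of parallel multipliers, which software evaluates one after another.

```python
def segmented_multiply(a: int, b: int, seg_bits: int = 16,
                       a_segs: int = 4, b_segs: int = 4) -> int:
    """Multiply unsigned a and b by pairing every segment of b with every segment of a."""
    if not 0 <= a < (1 << (seg_bits * a_segs)):
        raise ValueError("a does not fit in the given segments")
    if not 0 <= b < (1 << (seg_bits * b_segs)):
        raise ValueError("b does not fit in the given segments")
    mask = (1 << seg_bits) - 1
    a_parts = [(a >> (i * seg_bits)) & mask for i in range(a_segs)]
    accumulator = 0
    for j in range(b_segs):
        b_seg = (b >> (j * seg_bits)) & mask  # one segment of b, shared by all multipliers
        # Each "multiplier" gets a different segment of a; its output is
        # shifted according to that segment's position within a.
        row = sum((a_part * b_seg) << (i * seg_bits)
                  for i, a_part in enumerate(a_parts))
        accumulator += row << (j * seg_bits)
    return accumulator
```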
  • Publication number: 20110106872
    Abstract: An area efficient multiplier having high performance at modest clock speeds is presented. The performance of the multiplier is based on optimal choice of a number of levels of Karatsuba decomposition. The multiplier may be used to perform efficient modular reduction of large numbers greater than the size of the multiplier.
    Type: Application
    Filed: June 6, 2008
    Publication date: May 5, 2011
    Inventors: William Hasenplaugh, Gilbert Wolrich, Vinodh Gopal, Gunnar Gaubatz, Erdinc Ozturk, Wajdi Feghali
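
The abstract of 20110106872 ties performance to the number of levels of Karatsuba decomposition. The sketch below shows what a level count means in software terms; it is not the claimed circuit. The function name and the `bits` and `levels` parameters are assumptions, and the plain `x * y` at level zero stands in for whatever base multiplier the hardware provides.

```python
def karatsuba(x: int, y: int, bits: int, levels: int) -> int:
    """Multiply unsigned x and y (nominally `bits` wide) with `levels` Karatsuba splits."""
    if levels == 0 or bits <= 1:
        return x * y  # base multiplier
    half = bits // 2
    mask = (1 << half) - 1
    x_lo, x_hi = x & mask, x >> half
    y_lo, y_hi = y & mask, y >> half
    lo = karatsuba(x_lo, y_lo, half, levels - 1)
    hi = karatsuba(x_hi, y_hi, bits - half, levels - 1)
    # One extra multiply replaces the two cross products; the sums may be one bit wider.
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, bits - half + 1, levels - 1) - lo - hi
    return (hi << (2 * half)) + (cross << half) + lo
```

Each additional level trades one base multiplication for extra additions and subtractions, so the best level count depends on the relative cost of the two.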
  • Patent number: 7917793
    Abstract: The present invention uses a swing structure to avoid using a clock period at a non-efficient execution time. The execution time is precisely controlled to enhance a performance of a processor using a low voltage. Thus, synchronization problems in a chip under different environments are solved for high reliability.
    Type: Grant
    Filed: February 11, 2008
    Date of Patent: March 29, 2011
    Assignee: National Chung Cheng University
    Inventors: Shu-Hsuan Chou, Yi-Chao Chan, Ming-Ku Chang, Tien-Fu Chen
  • Patent number: 7774400
    Abstract: The present invention relates to a method for performing calculation operations using a pipelined calculation device comprising a group of at least two pipeline stages. The pipeline stages comprise at least one data interface for input of data and at least one data interface for output of data. In the method, data for performing calculation operations is input to the device. Selective data processing is performed in the calculation device, wherein between at least one input data interface and at least one output data interface a selection is performed to connect at least one input data interface to at least one output data interface for routing data between at least one input data interface and at least one output data interface and for processing data according to the selection. The invention further relates to a system and a device in which the method is utilized.
    Type: Grant
    Filed: November 6, 2003
    Date of Patent: August 10, 2010
    Assignee: Nokia Corporation
    Inventors: David Guevorkian, Aki Launiainen, Petri Liuha
  • Patent number: 7769099
    Abstract: The invention relates to techniques for implementing high-speed precoders, such as Tomlinson-Harashima (TH) precoders. In one aspect of the invention, look-ahead techniques are utilized to pipeline a TH precoder, resulting in a high-speed TH precoder. These techniques may be applied to pipeline various types of TH precoders, such as Finite Impulse Response (FIR) precoders and Infinite Impulse Response (IIR) precoders. In another aspect of the invention, parallel processing multiple non-pipelined TH precoders results in a high-speed parallel TH precoder design. Utilization of high-speed TH precoders may enable network providers to, for example, operate 10 Gigabit Ethernet with copper cable rather than fiber optic cable.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: August 3, 2010
    Assignee: Leanics Corporation
    Inventors: Keshab K. Parhi, Yongru Gu
  • Patent number: 7536430
    Abstract: A method for performing calculation operations uses a pipelined calculation device comprising a group of at least two pipeline stages, at least one data interface for input of data, and at least one data interface for output of data. The pipeline stages include at least one data interface for input of data and at least one data interface for output of data. Data for performing a first and a second calculation operation is input to the device. In the first calculation operation, output data of at least one pipeline stage is stored into a memory. In the second calculation operation the stored data is used as input data to a pipeline stage. The invention further relates to a system and a device, in which the method is utilized.
    Type: Grant
    Filed: November 6, 2003
    Date of Patent: May 19, 2009
    Assignee: Nokia Corporation
    Inventors: David Guevokian, Aki Launiainen, Petri Liuha
  • Patent number: 7334011
    Abstract: In a method for performing a multiplication operation between a first operand and a second operand the multiplication operation is divided into at least two suboperations. At least one of the suboperations is performed in a time-interlaced manner, wherein the at least one suboperation is further divided into partial suboperations so that each partial suboperation is initiated at a different time.
    Type: Grant
    Filed: November 6, 2003
    Date of Patent: February 19, 2008
    Assignee: Nokia Corporation
    Inventors: David Guevokian, Aki Launiainen, Petri Liuha
  • Publication number: 20040186872
    Abstract: The present invention provides a circuit for a programmable transitive processing unit for performing complex functions, such as multiplication, pipelining of one or more values, and/or shift operations, wherein the circuit can be configured to be a constituent of an array of other similar circuits to form, for example, a larger multiplier.
    Type: Application
    Filed: March 21, 2003
    Publication date: September 23, 2004
    Inventor: Charle' R. Rupp
  • Publication number: 20040133618
    Abstract: In a method for performing a multiplication operation between a first operand and a second operand the multiplication operation is divided into at least two suboperations. At least one of the suboperations is performed in a time-interlaced manner, wherein the at least one suboperation is further divided into partial suboperations so that each partial suboperation is initiated at a different time.
    Type: Application
    Filed: November 6, 2003
    Publication date: July 8, 2004
    Applicant: Nokia Corporation
    Inventors: David Guevokian, Aki Launiainen, Petri Liuha
  • Publication number: 20030220957
    Abstract: A digital Parallel Multiplier having a Partial Product Generator (3), a First Stage Adder Circuit (71) and a Final Stage Adder Circuit (72), wherein the spurious switching in the First Stage Adder Circuit (71) may be substantially reduced by synchronizing the input signals to the Adders in First Stage Adder Circuit (71). The reduced spurious switching reduces the power dissipation of the Multiplier. The timing of the input signals is synchronized by means of the Latch Adders (41) having a Latch that is an integral part of an Adder. Consequently, the power dissipation and hardware overheads of the Latch Adders (41) are low. The Latch Adders (41) may be controlled by Control Signals (44), which may be generated by Control Circuits (61). The application of the Latch Adders (41) may be applied to the Final Stage Adder Circuit (72) to further reduce spurious switching and thereby further reduce the power dissipation.
    Type: Application
    Filed: May 14, 2003
    Publication date: November 27, 2003
    Inventors: Joseph Sylvester Chang, Bah Hwee Gwee, Kwen Siong Chong
  • Patent number: 6604120
    Abstract: A digital parallel multiplier has encoders for each segmented bit pair of the multiplier input data which select one of 4 coefficients, based on the sum of the bit pair, that are then applied to the multiplicand input data. The addition of the rows of the scaled multiplicand input data is performed with adders with two data inputs (plus carry-in). These adders are cascaded such that normally invalid data ripples through the adder before the final result is achieved. By controlling the time power is applied to the adders, most of the intermediate states are eliminated.
    Type: Grant
    Filed: September 4, 1997
    Date of Patent: August 5, 2003
    Assignee: Cirrus Logic, Inc.
    Inventor: Edwin De Angel
  • Patent number: 6353843
    Abstract: A partitioned multiplier circuit which is designed for high speed operations. The multiplier of the present invention can perform one 32×32 bit multiplication, two 16×16 bit multiplications (simultaneously) or four 8×8 bit multiplications (simultaneously) depending on input partitioning signals. The time required to perform either the 32×32 bit or the 16×16 bit or the 8×8 bit multiplications is constant. Therefore, multiplication results are available with a constant latency regardless of operand bit-size. In one embodiment, the latency is two clock cycles but the multiplier circuit has a throughput of one clock cycle due to pipelining. The input operands can be signed or unsigned. The hardware is partitioned without any significant increase in the delay or area and the multiplier can provide six different modes of operation.
    Type: Grant
    Filed: October 8, 1999
    Date of Patent: March 5, 2002
    Assignees: Sony Corporation of Japan, Sony Electronics, Inc.
    Inventors: Farzad Chehrazi, Vojin G. Oklobdzija, Aamir A. Farooqui
  • Patent number: 6175912
    Abstract: A processor architecture having an accumulator register file with multiple shared read and/or write ports. Depending on the instruction, each port can be used to communicate with a different data source or destination.
    Type: Grant
    Filed: November 14, 1997
    Date of Patent: January 16, 2001
    Assignee: Lucent Technologies, Inc.
    Inventors: Mazhar M. Alidina, Bin Fu
  • Patent number: 6122320
    Abstract: The circuit for motion estimation in digitised video sequence encoders comprises at least an integrated circuit component (IM, IM1 . . . IMn) which is arranged to perform either the function of determining motion vectors and associated costs for different prediction modes, or the function of vector refinement, possibly in addition to prediction mode selection. The circuit (IM) is based on the use of two operating units (M1, M2) which are arranged to concurrently process in different ways different pixel groups according to a MIMD technique. Preferably, when the circuit performs motion vector determination, the operating units (M1, M2) are programmed to execute a genetic algorithm exploiting an initial vector population taking into account the temporal and spatial correlations in the picture.
    Type: Grant
    Filed: November 13, 1998
    Date of Patent: September 19, 2000
    Assignee: CSELT-Centro Studi e Laboratori Telecomunicazioni S.p.A.
    Inventors: Fabio Bellifemine, Gianmario Bollano, Andrea Finotello, Marco Gandini, Pierangelo Garino, Mauro Marchisio, Alessandro Torielli, Didier Nicoulaz, Stephanie Dogimont, Martin Gumm, Marco Mattavelli, Frederich Mombers
  • Patent number: 6085214
    Abstract: A digital parallel multiplier having encoders for each segmented bit pair of the multiplier input data and which selects one of 4 coefficients, based on the sum of the bit pair, that are then applied to the multiplicand input data. When a 3X coefficient of the multiplicand input data is to be generated, a -1 coefficient is output by the encoder requiring the 3X coefficient, and a 1 is added to the sum of the next most significant bit pair.
    Type: Grant
    Filed: September 4, 1997
    Date of Patent: July 4, 2000
    Assignee: Cirrus Logic, Inc.
    Inventor: Edwin De Angel
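
The bit-pair recoding in the abstract of 6085214 can be sketched as below for an unsigned multiplier. The function names, the 8-bit default width, the handling of a pair whose sum reaches 4 after an incoming carry, and the extra top digit for the final carry are assumptions made to keep the example complete.

```python
def recode_bit_pairs(multiplier: int, bits: int = 8) -> list[int]:
    """Recode an unsigned multiplier into digits from {-1, 0, 1, 2}, least significant first."""
    if bits % 2 or not 0 <= multiplier < (1 << bits):
        raise ValueError("expects an even width and an unsigned value that fits in it")
    digits = []
    carry = 0
    for i in range(0, bits, 2):
        pair_sum = ((multiplier >> i) & 0b11) + carry  # 0..4
        if pair_sum >= 3:
            digits.append(pair_sum - 4)  # 3X becomes -1 (and 4 becomes 0) ...
            carry = 1                    # ... with 1 added to the next bit pair
        else:
            digits.append(pair_sum)
            carry = 0
    digits.append(carry)  # assumed: a final carry becomes one more digit
    return digits


def multiply_recoded(multiplier: int, multiplicand: int, bits: int = 8) -> int:
    """Apply each recoded coefficient to the multiplicand and add the shifted rows."""
    return sum((digit * multiplicand) << (2 * k)
               for k, digit in enumerate(recode_bit_pairs(multiplier, bits)))
```

Because 3X never has to be formed, every row is 0, plus or minus the multiplicand, or a one-bit shift of it.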
  • Patent number: 6052706
    Abstract: In accordance with the present invention a circuit for performing an iterative process on a data stream is provided. The iterative process includes pipeline stages which operate on a portion of the data stream to produce an output which is an input to a succeeding stage. At least one of the pipeline stages includes a means for recirculating an output from the pipeline stage as an input to the pipeline stage for a predetermined number of times before passing the output to a succeeding stage. The predetermined number of times represents a clock period that includes more than one assertion of a clock signal. With such an arrangement, a circuit which performs a process, such as multiplication and division, in accordance with a particular bandwidth requirement requires less hardware than in other circuits performing the same process.
    Type: Grant
    Filed: November 25, 1997
    Date of Patent: April 18, 2000
    Assignee: Digital Equipment Corporation
    Inventors: William R. Wheeler, Matthew J. Adiletta