Pipeline Patents (Class 708/631)
-
Patent number: 12019559
Abstract: Various configurations of processors are provided. In a configuration, the processor comprises first and second multiplication units. Each of these multiplication units includes carry-save adder circuitry with respective outputs and partial product alignment multiplexing logic coupled to the outputs of the associated carry-save adder circuitry. The processor further comprises communication paths coupled between the outputs of the carry-save adder circuitry of the first multiplication unit and the partial product alignment multiplexing logic of the second multiplication unit. In other configurations, each of the first and second multiplication units may include one or more instances of masking logic, one or more instances of a multiplier array coupled to the associated instance(s) of masking logic, and one or more instances of a multiplexer set coupled to the associated instance(s) of multiplier array(s).
Type: Grant
Filed: July 6, 2023
Date of Patent: June 25, 2024
Assignee: Texas Instruments Incorporated
Inventors: Timothy David Anderson, Mujibur Rahman
-
Patent number: 11816448
Abstract: An ALU is capable of generating a multiply accumulation by compressing like-magnitude partial products. Given N pairs of multiplier and multiplicand, Booth encoding is used to encode the multipliers into M digits, and M partial products are produced for each pair, with each partial product in a smaller precision than a final product. The partial products resulting from the same encoded multiplier digit position are summed across all the multiplies to produce a summed partial product. In this manner, the partial product summation operations can be advantageously performed in the smaller precision. The M summed partial products are then summed together with an aggregated fixup vector for sign extension. If the N multipliers are equal to a constant, a preliminary fixup vector can be generated based on a predetermined value with adjustment on particular bits, where the predetermined value is determined by the signs of the encoded multiplier digits.
Type: Grant
Filed: January 27, 2021
Date of Patent: November 14, 2023
Assignee: Marvell Asia Pte, Ltd.
Inventor: David Carlson
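The digit-position summation described in this abstract can be illustrated with a minimal Python sketch. It assumes radix-4 Booth encoding (the abstract does not name the radix), and the function names `booth_radix4_digits` and `mac_by_digit_position` are hypothetical. Because Python integers are unbounded, the sign-extension fixup vector is not modeled here; the sketch only shows that summing like-magnitude partial products first and shifting afterwards gives the same dot product.

```python
def booth_radix4_digits(y: int, width: int) -> list[int]:
    """Radix-4 Booth digits (each in -2..2) of a signed `width`-bit integer, LSB first."""
    assert width % 2 == 0
    assert -(1 << (width - 1)) <= y < (1 << (width - 1))
    u = y & ((1 << width) - 1)  # two's-complement bit pattern of y

    def bit(i: int) -> int:
        # Bit i of the pattern; the implicit bit below the LSB is 0.
        return 0 if i < 0 else (u >> i) & 1

    # Digit i covers bits 2i+1, 2i and the overlap bit 2i-1.
    return [bit(2 * i) + bit(2 * i - 1) - 2 * bit(2 * i + 1)
            for i in range(width // 2)]


def mac_by_digit_position(xs: list[int], ys: list[int], width: int = 8) -> int:
    """Dot product of xs and ys, summing partial products per Booth digit position."""
    digits = [booth_radix4_digits(y, width) for y in ys]
    m = width // 2
    # One small-precision sum per digit position, taken across all N multiplies.
    summed = [sum(d[i] * x for d, x in zip(digits, xs)) for i in range(m)]
    # Only the M summed partial products are shifted to full weight and added.
    return sum(s << (2 * i) for i, s in enumerate(summed))
```

For example, `mac_by_digit_position([3, -5, 7], [-3, 100, -128])` agrees with the directly computed sum of products.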
-
Patent number: 9607586
Abstract: Techniques are disclosed relating to asymmetric circuits. In some embodiments, a storage element is configured to maintain a first input value as an input to an asymmetric circuit during a time interval. For example, in one embodiment, the time interval may correspond to a frame of video data and the storage element may be configured to store a filter coefficient for the frame of video data. In some embodiments, the storage element may be configured to store the value as a constant for multiple operations by the asymmetric circuit. In some embodiments, the asymmetric circuit is configured to generate a plurality of output values based on the first input value and respective ones of a set of second input values. In some embodiments, the asymmetric circuit is leakage power asymmetric and/or critical path asymmetric. This may increase performance and/or reduce power consumption.
Type: Grant
Filed: February 18, 2014
Date of Patent: March 28, 2017
Assignee: Apple Inc.
Inventors: Michael L. Liu, Liang Deng
-
Patent number: 8694573
Abstract: A method for determining a quotient value from a dividend value and a divisor value in a digital processing circuit is provided. The method includes computing a reciprocal value of the divisor value and multiplying the reciprocal value by the dividend value to obtain a reciprocal product, the reciprocal product having an integer part. The method also includes computing an intermediate remainder value by computing a product of the integer part and the divisor value, and subtracting the resulting product from the dividend value and determining the quotient value based upon the intermediate remainder value.
Type: Grant
Filed: December 24, 2009
Date of Patent: April 8, 2014
Assignee: Jadavpur University
Inventors: Debotosh Bhattacharjee, Santanu Halder
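The steps in this abstract (reciprocal, reciprocal product, intermediate remainder, quotient correction) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented circuit: the truncated fixed-point reciprocal, the `frac_bits` parameter, the restriction to non-negative dividends, and the name `divide_via_reciprocal` are all hypothetical choices made for the example.

```python
def divide_via_reciprocal(dividend: int, divisor: int, frac_bits: int = 32) -> int:
    """Integer quotient obtained from a reciprocal estimate plus a remainder check."""
    assert dividend >= 0 and divisor > 0
    # Truncated fixed-point approximation of 1/divisor (assumed representation).
    recip = (1 << frac_bits) // divisor
    # Integer part of the reciprocal product dividend * (1/divisor).
    q = (dividend * recip) >> frac_bits
    # Intermediate remainder: dividend minus (integer part * divisor).
    r = dividend - q * divisor
    # A truncated reciprocal never overshoots, so the estimate can only be
    # too small; the remainder tells us how far to raise the quotient.
    while r >= divisor:
        q += 1
        r -= divisor
    return q
```

With enough fractional bits relative to the dividend, the correction loop runs at most once; the loop form simply keeps the sketch correct for any operand size.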
-
Patent number: 8161308
Abstract: A circuit includes: an input buffer for storing input data; a plurality of processing sections connected in series including a head processing section and a tail-end processing section to sequentially process the input data; and a power supply controller for controlling power supply to each of the plurality of processing sections depending on a lapse of time during which no input data is stored in the input buffer.
Type: Grant
Filed: March 27, 2009
Date of Patent: April 17, 2012
Assignee: NEC Corporation
Inventor: Hidenori Hisamatsu
-
Patent number: 8073892
Abstract: In general, in one aspect, the disclosure describes a multiplier that includes a set of multiple multipliers configured in parallel where the set of multiple multipliers have access to a first operand and a second operand to multiply, the first operand having multiple segments and the second operand having multiple segments. The multiplier also includes logic to repeatedly supply a single segment of the second operand to each multiplier of the set of multiple multipliers and to supply multiple respective segments of the first operand to the respective ones of the set of multiple multipliers until each segment of the second operand has been supplied with each segment of the first operand. The logic shifts the output of different ones of the set of multiple multipliers based, at least in part, on the position of the respective segments within the first operand. The multiplier also includes an accumulator coupled to the logic.
Type: Grant
Filed: December 30, 2005
Date of Patent: December 6, 2011
Assignee: Intel Corporation
Inventors: Wajdi K. Feghali, William C. Hasenplaugh, Gilbert M. Wolrich, Daniel R. Cutter, Vinodh Gopal, Gunnar Gaubatz
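A rough software model of the segment-by-segment scheme in this abstract is sketched below in Python. The segment width, the segment count, unsigned operands, and the names `split_segments` and `segmented_multiply` are assumptions made for illustration; the hardware multipliers that would run in parallel are modeled by a list comprehension.

```python
def split_segments(value: int, seg_bits: int, count: int) -> list[int]:
    """Split an unsigned integer into `count` segments of `seg_bits` bits, LSB first."""
    mask = (1 << seg_bits) - 1
    return [(value >> (i * seg_bits)) & mask for i in range(count)]


def segmented_multiply(a: int, b: int, seg_bits: int = 16, segments: int = 4) -> int:
    """Multiply two unsigned operands one second-operand segment at a time."""
    assert 0 <= a < (1 << (seg_bits * segments))
    assert 0 <= b < (1 << (seg_bits * segments))
    a_segs = split_segments(a, seg_bits, segments)
    b_segs = split_segments(b, seg_bits, segments)
    acc = 0  # the accumulator
    for j, b_seg in enumerate(b_segs):
        # Every multiplier in the set receives the same segment of the second
        # operand and its own segment of the first operand.
        products = [a_seg * b_seg for a_seg in a_segs]
        # Each output is shifted by the position of its first-operand segment.
        row = sum(p << (i * seg_bits) for i, p in enumerate(products))
        # The pass as a whole is weighted by the second-operand segment position.
        acc += row << (j * seg_bits)
    return acc
```

With the defaults this models four 16x16 multipliers making four passes to form a 64x64-bit product.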
-
Publication number: 20110106872
Abstract: An area efficient multiplier having high performance at modest clock speeds is presented. The performance of the multiplier is based on optimal choice of a number of levels of Karatsuba decomposition. The multiplier may be used to perform efficient modular reduction of large numbers greater than the size of the multiplier.
Type: Application
Filed: June 6, 2008
Publication date: May 5, 2011
Inventors: William Hasenplaugh, Gilbert Wolrich, Vinodh Gopal, Gunnar Gaubatz, Erdinc Ozturk, Wajdi Feghali
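To make "number of levels of Karatsuba decomposition" concrete, here is a small Python sketch of Karatsuba multiplication with an explicit level count. It is the textbook algorithm, not the multiplier claimed in the application; the `bits` and `levels` parameters and the function name `karatsuba` are assumptions chosen for the example, and operands are taken to be non-negative.

```python
def karatsuba(x: int, y: int, bits: int, levels: int) -> int:
    """Multiply non-negative x and y using `levels` levels of Karatsuba splitting."""
    if levels == 0 or bits <= 1:
        return x * y  # base case: stands in for a plain hardware multiplier
    half = bits // 2
    mask = (1 << half) - 1
    x0, x1 = x & mask, x >> half  # low and high halves of x
    y0, y1 = y & mask, y >> half  # low and high halves of y
    low = karatsuba(x0, y0, half, levels - 1)
    high = karatsuba(x1, y1, bits - half, levels - 1)
    # Three multiplies instead of four: the cross terms come from one
    # product of the half sums, minus the two products already computed.
    mid = karatsuba(x0 + x1, y0 + y1, bits - half + 1, levels - 1) - low - high
    return (high << (2 * half)) + (mid << half) + low
```

Each extra level replaces one multiplication with three narrower ones plus additions, which is the trade-off the abstract refers to when it speaks of choosing the number of levels.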
-
Patent number: 7917793
Abstract: The present invention uses a swing structure to avoid using a clock period at a non-efficient execution time. The execution time is precisely controlled to enhance a performance of a processor using a low voltage. Thus, synchronization problems in a chip under different environments are solved for high reliability.
Type: Grant
Filed: February 11, 2008
Date of Patent: March 29, 2011
Assignee: National Chung Cheng University
Inventors: Shu-Hsuan Chou, Yi-Chao Chan, Ming-Ku Chang, Tien-Fu Chen
-
Patent number: 7774400
Abstract: The present invention relates to a method for performing calculation operations using a pipelined calculation device comprising a group of at least two pipeline stages. The pipeline stages comprise at least one data interface for input of data and at least one data interface for output of data. In the method, data for performing calculation operations is input to the device. Selective data processing is performed in the calculation device, wherein between at least one input data interface and at least one output data interface a selection is performed to connect at least one input data interface to at least one output data interface for routing data between at least one input data interface and at least one output data interface and for processing data according to the selection. The invention further relates to a system and a device in which the method is utilized.
Type: Grant
Filed: November 6, 2003
Date of Patent: August 10, 2010
Assignee: Nokia Corporation
Inventors: David Guevorkian, Aki Launiainen, Petri Liuha
-
Patent number: 7769099
Abstract: The invention relates to techniques for implementing high-speed precoders, such as Tomlinson-Harashima (TH) precoders. In one aspect of the invention, look-ahead techniques are utilized to pipeline a TH precoder, resulting in a high-speed TH precoder. These techniques may be applied to pipeline various types of TH precoders, such as Finite Impulse Response (FIR) precoders and Infinite Impulse Response (IIR) precoders. In another aspect of the invention, parallel processing multiple non-pipelined TH precoders results in a high-speed parallel TH precoder design. Utilization of high-speed TH precoders may enable network providers to, for example, operate 10 Gigabit Ethernet with copper cable rather than fiber optic cable.
Type: Grant
Filed: September 13, 2005
Date of Patent: August 3, 2010
Assignee: Leanics Corporation
Inventors: Keshab K. Parhi, Yongru Gu
-
Patent number: 7536430
Abstract: A method for performing calculation operations uses a pipelined calculation device comprising a group of at least two pipeline stages, at least one data interface for input of data, and at least one data interface for output of data. The pipeline stages include at least one data interface for input of data and at least one data interface for output of data. Data for performing a first and a second calculation operation is input to the device. In the first calculation operation, output data of at least one pipeline stage is stored into a memory. In the second calculation operation the stored data is used as input data to a pipeline stage. The invention further relates to a system and a device, in which the method is utilized.
Type: Grant
Filed: November 6, 2003
Date of Patent: May 19, 2009
Assignee: Nokia Corporation
Inventors: David Guevokian, Aki Launiainen, Petri Liuha
-
Patent number: 7334011
Abstract: In a method for performing a multiplication operation between a first operand and a second operand, the multiplication operation is divided into at least two suboperations. At least one of the suboperations is performed in a time-interlaced manner, wherein the at least one suboperation is further divided into partial suboperations so that each partial suboperation is initiated at a different time.
Type: Grant
Filed: November 6, 2003
Date of Patent: February 19, 2008
Assignee: Nokia Corporation
Inventors: David Guevokian, Aki Launiainen, Petri Liuha
-
Publication number: 20040186872
Abstract: The present invention provides a circuit for a programmable transitive processing unit for performing complex functions, such as multiplication, pipelining of one or more values, and/or shift operations, wherein the circuit can be configured to be a constituent of an array of other similar circuits to form, for example, a larger multiplier.
Type: Application
Filed: March 21, 2003
Publication date: September 23, 2004
Inventor: Charle' R. Rupp
-
Publication number: 20040133618
Abstract: In a method for performing a multiplication operation between a first operand and a second operand, the multiplication operation is divided into at least two suboperations. At least one of the suboperations is performed in a time-interlaced manner, wherein the at least one suboperation is further divided into partial suboperations so that each partial suboperation is initiated at a different time.
Type: Application
Filed: November 6, 2003
Publication date: July 8, 2004
Applicant: Nokia Corporation
Inventors: David Guevokian, Aki Launiainen, Petri Liuha
-
Publication number: 20030220957
Abstract: A digital Parallel Multiplier having a Partial Product Generator (3), a First Stage Adder Circuit (71) and a Final Stage Adder Circuit (72), wherein the spurious switching in the First Stage Adder Circuit (71) may be substantially reduced by synchronizing the input signals to the Adders in First Stage Adder Circuit (71). The reduced spurious switching reduces the power dissipation of the Multiplier. The timing of the input signals is synchronized by means of the Latch Adders (41) having a Latch that is an integral part of an Adder. Consequently, the power dissipation and hardware overheads of the Latch Adders (41) are low. The Latch Adders (41) may be controlled by Control Signals (44), which may be generated by Control Circuits (61). The Latch Adders (41) may also be applied to the Final Stage Adder Circuit (72) to further reduce spurious switching and thereby further reduce the power dissipation.
Type: Application
Filed: May 14, 2003
Publication date: November 27, 2003
Inventors: Joseph Sylvester Chang, Bah Hwee Gwee, Kwen Siong Chong
-
Patent number: 6604120
Abstract: A digital parallel multiplier has encoders for each segmented bit pair of the multiplier input data which select one of 4 coefficients, based on the sum of the bit pair, that are then applied to the multiplicand input data. The addition of the rows of the scaled multiplicand input data is performed with adders with two data inputs (plus carry-in). These adders are cascaded such that normally invalid data ripples through the adder before the final result is achieved. By controlling the time power is applied to the adders, most of the intermediate states are eliminated.
Type: Grant
Filed: September 4, 1997
Date of Patent: August 5, 2003
Assignee: Cirrus Logic, Inc.
Inventor: Edwin De Angel
-
Patent number: 6353843
Abstract: A partitioned multiplier circuit which is designed for high speed operations. The multiplier of the present invention can perform one 32×32 bit multiplication, two 16×16 bit multiplications (simultaneously) or four 8×8 bit multiplications (simultaneously) depending on input partitioning signals. The time required to perform either the 32×32 bit or the 16×16 bit or the 8×8 bit multiplications is constant. Therefore, multiplication results are available with a constant latency regardless of operand bit-size. In one embodiment, the latency is two clock cycles but the multiplier circuit has a throughput of one clock cycle due to pipelining. The input operands can be signed or unsigned. The hardware is partitioned without any significant increase in the delay or area and the multiplier can provide six different modes of operation.
Type: Grant
Filed: October 8, 1999
Date of Patent: March 5, 2002
Assignees: Sony Corporation of Japan, Sony Electronics, Inc.
Inventors: Farzad Chehrazi, Vojin G. Oklobdzija, Aamir A. Farooqui
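The three partitioning modes named in this abstract can be modeled behaviorally in a few lines of Python. This sketch covers only unsigned operands (the abstract also allows signed inputs, giving six modes in total) and says nothing about how the hardware shares its adder tree; the `lane_bits` parameter and the name `partitioned_multiply` are hypothetical.

```python
def partitioned_multiply(a: int, b: int, lane_bits: int) -> list[int]:
    """Lane-wise unsigned products of two 32-bit words, least significant lane first.

    lane_bits=32 gives one 32x32 product, 16 gives two 16x16 products,
    and 8 gives four 8x8 products.
    """
    assert lane_bits in (8, 16, 32)
    assert 0 <= a < (1 << 32) and 0 <= b < (1 << 32)
    mask = (1 << lane_bits) - 1
    # Each lane multiplies the matching fields of a and b independently.
    return [((a >> s) & mask) * ((b >> s) & mask) for s in range(0, 32, lane_bits)]
```

For instance, `partitioned_multiply(0x00030002, 0x00050004, 16)` treats each word as two 16-bit lanes and returns the low-lane product followed by the high-lane product.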
-
Patent number: 6175912
Abstract: A processor architecture having an accumulator register file with multiple shared read and/or write ports. Depending on the instruction, each port can be used to communicate with a different data source or destination.
Type: Grant
Filed: November 14, 1997
Date of Patent: January 16, 2001
Assignee: Lucent Technologies, Inc.
Inventors: Mazhar M. Alidina, Bin Fu
-
Patent number: 6122320
Abstract: The circuit for motion estimation in digitised video sequence encoders comprises at least an integrated circuit component (IM, IM1 . . . IMn) which is arranged to perform either the function of determining motion vectors and associated costs for different prediction modes, or the function of vector refinement, possibly in addition to prediction mode selection. The circuit (IM) is based on the use of two operating units (M1, M2) which are arranged to concurrently process in different ways different pixel groups according to a MIMD technique. Preferably, when the circuit performs motion vector determination, the operating units (M1, M2) are programmed to execute a genetic algorithm exploiting an initial vector population taking into account the temporal and spatial correlations in the picture.
Type: Grant
Filed: November 13, 1998
Date of Patent: September 19, 2000
Assignee: CSELT-Centro Studi e Laboratori Telecomunicazioni S.p.A.
Inventors: Fabio Bellifemine, Gianmario Bollano, Andrea Finotello, Marco Gandini, Pierangelo Garino, Mauro Marchisio, Alessandro Torielli, Didier Nicoulaz, Stephanie Dogimont, Martin Gumm, Marco Mattavelli, Frederich Mombers
-
Patent number: 6085214
Abstract: A digital parallel multiplier having encoders for each segmented bit pair of the multiplier input data and which selects one of 4 coefficients, based on the sum of the bit pair, that are then applied to the multiplicand input data. When a 3X coefficient of the multiplicand input data is to be generated, a -1 coefficient is output by the encoder requiring the 3X coefficient, and a 1 is added to the sum of the next most significant bit pair.
Type: Grant
Filed: September 4, 1997
Date of Patent: July 4, 2000
Assignee: Cirrus Logic, Inc.
Inventor: Edwin De Angel
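The recoding rule in this abstract (emit -1 instead of 3X and add 1 to the next more significant bit pair) can be checked with a short Python sketch. It assumes an unsigned multiplier, treats the four coefficients as -1, 0, 1 and 2, and appends the final carry as an extra digit; those choices, and the names `recode_bit_pairs` and `multiply_recoded`, are assumptions made for the example rather than details taken from the patent.

```python
def recode_bit_pairs(multiplier: int, width: int) -> list[int]:
    """Recode an unsigned multiplier into radix-4 digits from {-1, 0, 1, 2}, LSB first."""
    assert width % 2 == 0 and 0 <= multiplier < (1 << width)
    digits = []
    carry = 0
    for i in range(0, width, 2):
        pair = ((multiplier >> i) & 0b11) + carry  # bit-pair value plus incoming 1
        if pair >= 3:
            # 3 becomes -1 and 4 becomes 0; either way 1 moves to the next pair.
            digits.append(pair - 4)
            carry = 1
        else:
            digits.append(pair)
            carry = 0
    digits.append(carry)  # leftover carry becomes one more digit (0 or 1)
    return digits


def multiply_recoded(multiplicand: int, multiplier: int, width: int = 8) -> int:
    """Multiply by summing coefficient * multiplicand rows, each shifted by two bits."""
    return sum((d * multiplicand) << (2 * i)
               for i, d in enumerate(recode_bit_pairs(multiplier, width)))
```

No row ever needs 3 times the multiplicand, so each row is formed by a negation, a pass-through, or a one-bit shift.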
-
Patent number: 6052706
Abstract: In accordance with the present invention a circuit for performing an iterative process on a data stream is provided. The iterative process includes pipeline stages which operate on a portion of the data stream to produce an output which is an input to a succeeding stage. At least one of the pipeline stages includes a means for recirculating an output from the pipeline stage as an input to the pipeline stage for a predetermined number of times before passing the output to a succeeding stage. The predetermined number of times represents a clock period that includes more than one assertion of a clock signal. With such an arrangement, a circuit which performs a process, such as multiplication and division, in accordance with a particular bandwidth requirement requires less hardware than in other circuits performing the same process.
Type: Grant
Filed: November 25, 1997
Date of Patent: April 18, 2000
Assignee: Digital Equipment Corporation
Inventors: William R. Wheeler, Matthew J. Adiletta