Pipeline Patents (Class 708/233)
-
Patent number: 11520582
Abstract: Examples of a carry chain for performing an operation on operands each including elements of a selectable size are provided. Advantageously, the carry chain adapts to elements of different sizes. The carry chain determines a mask based on a selected size of an element. The carry chain selects, based on the mask, whether to carry a partial result of an operation performed on corresponding first portions of a first operand and a second operand into a next operation. The next operation is performed on corresponding second portions of the first operand and the second operand, and, based on the selection, the partial result of the operation. The carry chain stores, in a memory, a result formed from outputs of the operation and the next operation.
Type: Grant
Filed: October 13, 2020
Date of Patent: December 6, 2022
Assignee: Marvell Asia Pte, Ltd.
Inventor: David Kravitz
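The masked carry selection described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `masked_carry_add`, the 8-bit chunk width, and the 64-bit operand width are all hypothetical choices made for illustration.

```python
def masked_carry_add(a, b, elem_bits, total_bits=64, chunk_bits=8):
    """Add packed operands chunk by chunk; a mask decides where carries may cross."""
    if elem_bits % chunk_bits or total_bits % elem_bits:
        raise ValueError("element size must tile the operand in whole chunks")
    n_chunks = total_bits // chunk_bits
    chunks_per_elem = elem_bits // chunk_bits
    # carry_mask[i] is True when the carry out of chunk i may enter chunk i + 1,
    # i.e. when chunk i is not the top chunk of its element.
    carry_mask = [(i + 1) % chunks_per_elem != 0 for i in range(n_chunks)]
    chunk_mask = (1 << chunk_bits) - 1
    result, carry = 0, 0
    for i in range(n_chunks):
        shift = i * chunk_bits
        partial = ((a >> shift) & chunk_mask) + ((b >> shift) & chunk_mask) + carry
        result |= (partial & chunk_mask) << shift
        # Forward the partial result's carry only where the mask allows it.
        carry = (partial >> chunk_bits) if carry_mask[i] else 0
    return result
```

With 8-bit elements, `masked_carry_add(0x00FF, 0x0001, 8, 16)` keeps the overflow inside the low lane and returns `0x0000`; with a single 16-bit element the same inputs return `0x0100`, because the mask lets the carry through.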
-
Patent number: 11372711
Abstract: Apparatus and Method for Fault Handling of an Offload Transaction. For example, one embodiment of a processor comprises: a plurality of cores; an interconnect coupling the plurality of cores; and offload circuitry to transfer work from a first core of the plurality of cores to a second core of the plurality of cores without operating system (OS) intervention, the work comprising a plurality of instructions; the second core comprising first fault management logic to determine an action to take responsive to a fault condition, wherein responsive to detecting a first type of fault condition, the first fault management logic is to cause the first core to be notified of the fault condition, the first core comprising second fault management logic to attempt to resolve the fault condition.
Type: Grant
Filed: June 29, 2019
Date of Patent: June 28, 2022
Assignee: INTEL CORPORATION
Inventor: ElMoustapha Ould-Ahmed-Vall
-
Patent number: 11182166
Abstract: According to one general aspect, an apparatus may include a branch prediction circuit configured to predict if a branch instruction will be taken or not. The apparatus may include a branch target buffer circuit configured to store a memory segment empty flag that indicates whether or not the memory segment after a target address includes at least one other branch instruction, wherein the memory segment empty flag was created during a commit stage of a prior occurrence of the branch instruction. The branch prediction circuit may be configured to skip over the memory segment if the memory segment empty flag indicates a lack of other branch instruction(s).
Type: Grant
Filed: September 4, 2019
Date of Patent: November 23, 2021
Inventors: Madhu Saravana Sibi Govindan, Fuzhou Zou, Anhdung Ngo, Wichaya Top Changwatchai, Monika Tkaczyk, Gerald David Zuraski, Jr.
-
Patent number: 10725742
Abstract: In described examples, an apparatus is arranged to generate a linear term, a quadratic term, and a constant term of a transcendental function with, respectively, a first circuit, a second circuit, and a third circuit in response to least significant bits of an input operand and in response to, respectively, a first, a second, and a third table value that is retrieved in response to, respectively, a first, a second, and a third index generated in response to most significant bits of the input operand. The third circuit is further arranged to generate a mantissa of an output operand in response to a sum of the linear term, the quadratic term, and the constant term.
Type: Grant
Filed: June 5, 2018
Date of Patent: July 28, 2020
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Prasanth Viswanathan Pillai, Richard Mark Poley, Venkatesh Natarajan, Alexander Tessarolo
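The split between table lookups driven by the most significant bits and polynomial terms driven by the least significant bits can be sketched in software. The following is a minimal illustration, assuming a single shared index for all three tables, Taylor coefficients taken at each segment midpoint, and the function 2**x on [0, 1); the names `build_tables` and `exp2_fraction` and the bit widths are hypothetical and are not taken from the patent.

```python
import math

INDEX_BITS = 6    # hypothetical: MSBs used as the table index
FRAC_BITS = 16    # hypothetical: total fraction bits of the input operand

def build_tables():
    """Per-segment Taylor coefficients of 2**x about each segment midpoint."""
    n = 1 << INDEX_BITS
    ln2 = math.log(2.0)
    mids = [(i + 0.5) / n for i in range(n)]
    const = [2.0 ** m for m in mids]
    lin = [ln2 * 2.0 ** m for m in mids]
    quad = [0.5 * ln2 * ln2 * 2.0 ** m for m in mids]
    return const, lin, quad

def exp2_fraction(x_fixed, tables):
    """Approximate 2**(x_fixed / 2**FRAC_BITS) for 0 <= x_fixed < 2**FRAC_BITS."""
    const, lin, quad = tables
    low_bits = FRAC_BITS - INDEX_BITS
    index = x_fixed >> low_bits                # MSBs select the table entries
    lsbs = x_fixed & ((1 << low_bits) - 1)     # LSBs drive the polynomial terms
    # Signed offset of the input from the segment midpoint.
    dx = lsbs / (1 << FRAC_BITS) - 0.5 / (1 << INDEX_BITS)
    # Sum of the constant, linear, and quadratic terms.
    return const[index] + lin[index] * dx + quad[index] * dx * dx
```

With 64 segments, the cubic remainder of the Taylor expansion keeps the error of this sketch below about 1e-7 over the whole input range.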
-
Patent number: 9690590
Abstract: Executing instructions in a processor includes: selecting one or more instructions to be issued together in the same clock cycle of the processor from among a plurality of instructions, the selected one or more instructions occurring consecutively according to a program order; and executing instructions that have been issued, through multiple execution stages of a pipeline of the processor. The executing includes: determining a delay assigned to a first instruction, and sending a result of a first operation performed by the first instruction in a first execution stage to a second execution stage, where the number of execution stages between the first execution stage and the second execution stage is based on the determined delay.
Type: Grant
Filed: October 15, 2014
Date of Patent: June 27, 2017
Assignee: CAVIUM, INC.
Inventor: David Albert Carlson
-
Patent number: 9672161
Abstract: The described embodiments include a cache controller that configures a cache management mechanism. In the described embodiments, the cache controller is configured to monitor at least one structure associated with a cache to determine at least one cache block that may be accessed during a future access in the cache. Based on the determination of the at least one cache block that may be accessed during a future access in the cache, the cache controller configures the cache management mechanism.
Type: Grant
Filed: December 9, 2012
Date of Patent: June 6, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Gabriel H. Loh, Yasuko Eckert
-
Patent number: 9606925
Abstract: In one embodiment, a processor includes a caching home agent (CHA) coupled to a core and a cache memory and includes a cache controller having a cache pipeline and a home agent having a home agent pipeline. The CHA may: receive, in the home agent pipeline, information from an external agent responsive to a miss for data in the cache memory; issue a global ordering signal from the home agent pipeline to a requester of the data to inform the requester of receipt of the data; and report issuance of the global ordering signal to the cache pipeline, to prevent the cache pipeline from issuance of a global ordering signal to the requester. Other embodiments are described and claimed.
Type: Grant
Filed: March 26, 2015
Date of Patent: March 28, 2017
Assignee: Intel Corporation
Inventors: Bahaa Fahim, Yen-Cheng Liu, Vedaraman Geetha, Jeffrey D. Chamberlain, Min Huang
-
Patent number: 9329835
Abstract: Systems and methods are provided for performing mathematical functions. An example system includes an instruction decoder configured to decode instructions for performing a mathematical function, an arithmetic logic unit having an alterable configuration to perform a combination of arithmetic operations, and a control unit configured to, based on the instructions decoded by the instruction decoder, output one or more control signals to the arithmetic logic unit. In response to the arithmetic logic unit receiving the one or more control signals, the configuration of the arithmetic logic unit is configured to be altered in accordance with the one or more control signals such that the combination of arithmetic operations to be performed by the arithmetic logic unit is substantially equivalent to the mathematical function.
Type: Grant
Filed: September 27, 2012
Date of Patent: May 3, 2016
Assignee: MARVELL INTERNATIONAL LTD.
Inventor: Kapil Jain
-
Publication number: 20150012578
Abstract: Mathematical functions are computed in a single pipeline performing a polynomial approximation (e.g. a quadratic approximation, or the like); and one or more data tables corresponding to at least one of the RCP, SQRT, EXP or LOG functions operable to be coupled to the single pipeline according to one or more opcodes; wherein the single pipeline is operable for computing at least one of RCP, SQRT, EXP or LOG functions according to the one or more opcodes. SIN and COS are also computed using the pipeline according to the approximation ((−1)^IntX)*Sin(π*Min(FracX, 1.0−FracX))/Min(FracX, 1.0−FracX). A pipeline portion approximates Sin(π*FracX) using tables and interpolation and a subsequent stage multiplies this approximation by FracX. For input arguments of x close to 1.0, LOG2(x−1)/(x−1) is computed using a first pipeline portion using tables and interpolation and subsequently multiplied by (x−1). A DIV operation may also be performed with input arguments scaled up to avoid underflow as needed.
Type: Application
Filed: September 15, 2014
Publication date: January 8, 2015
Inventors: Mike M. Cai, Lefan Zhong
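The range reduction behind the SIN approximation can be checked numerically. The sketch below assumes the symbols lost in extraction are a minus sign, an exponent, and π, so that sin(πx) = (−1)^IntX · sin(π · min(FracX, 1 − FracX)). The function name `sin_pi` is hypothetical, and `math.sin` stands in for the table-and-interpolation stage, which is not modeled here.

```python
import math

def sin_pi(x):
    """Evaluate sin(pi * x) by folding x into [0, 0.5] before the core evaluation."""
    int_x = math.floor(x)                  # IntX: integer part
    frac_x = x - int_x                     # FracX: fractional part in [0, 1)
    folded = min(frac_x, 1.0 - frac_x)     # symmetry of sin(pi * f) about f = 0.5
    sign = -1.0 if int_x % 2 else 1.0      # (-1) ** IntX
    # math.sin is a stand-in for the tabulated core approximation.
    return sign * math.sin(math.pi * folded)
```

Folding the argument this way means the tables only need to cover half a period, at the cost of one comparison and a sign flip.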
-
Patent number: 8843802
Abstract: The present invention relates to a coding method and coding device that allow Rate-Compatible LDPC (low-density parity-check) codes to have favorable BER performance both with a low code rate and with a high code rate. In coding of LDPC codes that have plural code rates and all of whose parity check matrices are composed of plural cyclic matrices, a coder 121 performs the coding in such a way that 1<w0 and w1<w0 are satisfied when the maximum column weight of the cyclic matrices in the check matrix of a certain code whose code rate is not the minimum value among the LDPC codes is defined as w0 and the maximum column weight of the cyclic matrices in the check matrix of a code having a code rate lower than that of the certain code is defined as w1.
Type: Grant
Filed: September 13, 2013
Date of Patent: September 23, 2014
Assignee: Sony Corporation
Inventor: Makoto Noda
-
Patent number: 8798386
Abstract: Methods and systems for processing image data on a per tile basis in an image sensor pipeline (ISP) are disclosed and may include communicating, to one or more processing modules via control logic circuits integrated in the ISP, corresponding configuration parameters that are associated with each of a plurality of data tiles comprising an image. The ISP may be integrated in a video processing core. The plurality of data tiles may vary in size. A processing complete signal may be communicated to the control logic circuits when the processing of each of the data tiles is complete prior to configuring a subsequent processing module. The processing may comprise one or more of: lens shading correction, statistics, distortion correction, demosaicing, denoising, defective pixel correction, color correction, and resizing. Each of the data tiles may overlap with adjacent data tiles, and at least a portion of them may be processed concurrently.
Type: Grant
Filed: July 13, 2010
Date of Patent: August 5, 2014
Assignee: Broadcom Corporation
Inventors: Adrian Lees, David Plowman
-
Patent number: 8739101
Abstract: A method of configuring a hardware design for a pipelined parallel stream processor includes obtaining a scheduled graph representing a processing operation in the time domain as a function of clock cycles. The graph includes a data path to be implemented in hardware as part of the stream processor, an input, an output, and parallel branches to enable data values to be streamed therethrough from the input to the output as a function of increasing clock cycle. The data path is partitioned into a plurality of discrete regions, each region operating on a different clock phase and having discrete control logic elements. Phase transition registers to align data separated by a boundary between regions having different clock phases are introduced into the data path at the boundary. The graph and control logic elements define a hardware design for the pipelined parallel stream processor.
Type: Grant
Filed: November 21, 2012
Date of Patent: May 27, 2014
Assignee: Maxeler Technologies Ltd.
Inventor: Robert Gwilym Dimond
-
Patent number: 8543887
Abstract: The present invention relates to a coding method and coding device that allow Rate-Compatible LDPC (low-density parity-check) codes to have favorable BER performance both with a low code rate and with a high code rate. In coding of LDPC codes that have plural code rates and all of whose parity check matrices are composed of plural cyclic matrices, a coder 121 performs the coding in such a way that 1<w0 and w1<w0 are satisfied when the maximum column weight of the cyclic matrices in the check matrix of a certain code whose code rate is not the minimum value among the LDPC codes is defined as w0 and the maximum column weight of the cyclic matrices in the check matrix of a code having a code rate lower than that of the certain code is defined as w1.
Type: Grant
Filed: July 10, 2008
Date of Patent: September 24, 2013
Assignee: Sony Corporation
Inventor: Makoto Noda
-
Patent number: 8516025
Abstract: A system includes a plurality of datapaths, each having structural arithmetic elements to perform various arithmetic operations based, at least in part, on configuration data. The system also includes a configuration memory coupled to the datapaths, the configuration memory to provide the configuration data to the datapaths, which causes the datapaths to collaborate when performing the arithmetic operations.
Type: Grant
Filed: April 16, 2008
Date of Patent: August 20, 2013
Assignee: Cypress Semiconductor Corporation
Inventors: Warren Synder, Bert Sullam
-
Patent number: 8494155
Abstract: An encryption device can include a tweaking value manager that is configured to generate an array of tweaking values corresponding to the array of data blocks based on a tweaking encryption key, a first encryption unit that is configured to encrypt a first portion of the array of data blocks into a first portion of encrypted data blocks based on corresponding tweaking values and a data encryption key, a second encryption unit that is configured to encrypt a second portion of the array of data blocks into a second portion of encrypted data blocks based on corresponding tweaking values and the data encryption key, and a data block combiner that is configured to combine the first portion of encrypted data blocks and the second portion of encrypted data blocks into an array of encrypted data blocks.
Type: Grant
Filed: October 7, 2011
Date of Patent: July 23, 2013
Assignee: Marvell International Ltd.
Inventors: Tze Lei Poo, Siu-Hung Fred Au, Gregory Burd, David Geddes, Heng Tang
-
Patent number: 8433736
Abstract: A Montgomery multiplication device calculates a Montgomery product of an operand X and an operand Y with respect to a modulus M and includes a plurality of processing elements. In a first clock cycle, two intermediate partial sums are created by obtaining an input of length w−1 from a preceding processing element as w−1 least significant bits. The most significant bit is configured as either zero or one. Then, two partial sums are calculated using a word of the operand Y, a word of the modulus M, a bit of the operand X, and the two intermediate partial sums. In a second clock cycle, a selection bit is obtained from a subsequent processing element and one of the two partial sums is selected based on the value of the selection bit. Then, the selected partial sum is used for calculation of a word of the Montgomery product.
Type: Grant
Filed: March 1, 2010
Date of Patent: April 30, 2013
Assignee: George Mason Intellectual Properties, Inc.
Inventors: Miaoqing Huang, Krzysztof Gaj
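For readers unfamiliar with the Montgomery product itself, the textbook bit-serial (radix-2) algorithm is sketched below. It is not the word-based, two-partial-sum processing-element architecture the abstract describes; it only shows what quantity such hardware computes. The function name `montgomery_product` and the parameter `n_bits` are hypothetical.

```python
def montgomery_product(x, y, m, n_bits):
    """Return x * y * 2**(-n_bits) mod m, for odd m and 0 <= x, y < m < 2**n_bits."""
    if m % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    for i in range(n_bits):
        # Consume one bit of X per iteration, least significant bit first.
        if (x >> i) & 1:
            s += y
        # Add the modulus when needed so the halving below is exact.
        if s & 1:
            s += m
        s >>= 1
    # The loop keeps s below 2 * m, so one conditional subtraction suffices.
    return s - m if s >= m else s
```

The result `p` satisfies `(p * 2**n_bits) % m == (x * y) % m`, which is how the test for this sketch checks it without computing a modular inverse.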
-
Patent number: 8352530
Abstract: A residue generator for calculation and correction of a residue value. The residue generator includes a residue-generation tree connected with an operand register at an input of the residue generator including a plurality of register-bits receiving and carrying bits of numerical data.
Type: Grant
Filed: December 8, 2008
Date of Patent: January 8, 2013
Assignee: International Business Machines Corporation
Inventors: Son T. Dao, Juergen G. Haess, Michael Klein, Michael K. Kroener
-
Patent number: 8346831
Abstract: Mathematical functions are computed using a single hardware pipeline that performs polynomial approximation of second degree or higher. The single hardware pipeline includes multiple stages. Several data tables are used in the computations. The data tables are associated with a reciprocal, square root, exponential, or logarithm function. The data tables include data associated with implementing the associated function. The single hardware pipeline computes at least one of the functions associated with the data tables.
Type: Grant
Filed: July 25, 2006
Date of Patent: January 1, 2013
Assignee: Vivante Corporation
Inventors: Mike M. Cai, Lefan Zhong
-
Low density parity code (LDPC) decoding for memory with multiple log likelihood ratio (LLR) decoders
Patent number: 8301979
Abstract: Data stored in memory is decoded using iterative probabilistic decoding and multiple decoders. A first decoder attempts to decode a representation of a codeword. If the attempt is unsuccessful, a second decoder attempts to decode the representation of a codeword. The second decoder may have a lower resolution than the first decoder. Probability values such as logarithmic likelihood ratio (LLR) values may be clipped in the second decoder. This approach can overcome trapping sets while exhibiting low complexity and high performance. Further, it can be implemented on existing decoders such as those used in current memory devices.
Type: Grant
Filed: October 7, 2009
Date of Patent: October 30, 2012
Assignee: SanDisk IL Ltd.
Inventors: Eran Sharon, Idan Alrod, Ariel Navon, Opher Lieber
-
Patent number: 8161308
Abstract: A circuit includes: an input buffer for storing input data; a plurality of processing sections connected in series including a head processing section and a tail-end processing section to sequentially process the input data; and a power supply controller for controlling power supply to each of the plurality of processing sections depending on a lapse of time during which no input data is stored in the input buffer.
Type: Grant
Filed: March 27, 2009
Date of Patent: April 17, 2012
Assignee: NEC Corporation
Inventor: Hidenori Hisamatsu
-
Publication number: 20120066279
Abstract: An apparatus having two or more parallel carry chain structures, each of the carry chain structures comprising a series of logical structures, where at least one of the logical structures within each of the carry chain structures has an associated input node, output node and carry node. The input node corresponds to a function input term, the output node corresponds to an output term of the function and the carry node corresponds to a carry value to a following logical structure in the series of logical structures.
Type: Application
Filed: November 21, 2011
Publication date: March 15, 2012
Inventor: Ken S. McElvain
-
Patent number: 8112438
Abstract: A first transmitting unit transmits a processing command to a plurality of parallelized database servers. A second transmitting unit integrates data sets transmitted from the database servers in response to the processing command, and transmits an integrated data set to a client. An integrating unit integrates data sets buffered in a buffer unit. A determining unit determines a transmission start or a transmission suspend of the data sets based on a data size in the buffer unit. A third transmitting unit transmits a control command for the transmission start or the transmission suspend to the database servers based on a result of determination by the determining unit.
Type: Grant
Filed: September 22, 2008
Date of Patent: February 7, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventor: Masakazu Hattori
-
Patent number: 8094768
Abstract: The present invention discloses a novel multi-channel timing recovery scheme that utilizes a shared CORDIC to accurately compute the phase for each tone. Then a hardware-based linear combiner module is used to reconstruct the best phase estimate from multiple phase measurements. The firmware monitors the noise variance for the pilot tones and determines the corresponding weight for each tone to ensure that the minimum phase jitter noise is achieved through the linear combiner. Then a hardware-based second-order timing recovery control loop generates the frequency reference signal for VCXO or DCXO. A single sequentially controlled multiplier is used for all multiplications in the control loop.
Type: Grant
Filed: December 21, 2006
Date of Patent: January 10, 2012
Assignee: Triductor Technology (Suzhou) Inc.
Inventor: Yaolong Tan
-
Patent number: 8055888
Abstract: A data processing apparatus is disclosed that comprises a pipelined processor, said pipelined processor comprising a processing pipeline for processing instructions in a plurality of stages, at least some of said plurality of stages each comprising storage elements for storing an instruction or decoded instruction being processed in said stage, said storage elements in at least one of said stages comprising settable elements, each of said settable elements being adapted to store a predetermined value in response to a wake up event, said settable elements being arranged such that in response to said wake up event said values stored in said settable elements form an instruction or decoded instruction.
Type: Grant
Filed: February 28, 2008
Date of Patent: November 8, 2011
Assignee: ARM Limited
Inventors: Chiloda Ashan Senerath Pathirane, David Michael Gilday
-
Patent number: 8041153
Abstract: A processing device has plural processing modules executing a processing; and plural connectors each having a linking section, an associating section, and a controller. The linking section is able to link with at least one other connector at an input side or an output side. The associating section associates the connector with one of the processing modules. In accordance with a linked state, the controller controls the processing module associated by the associating section.
Type: Grant
Filed: January 14, 2011
Date of Patent: October 18, 2011
Assignees: Fuji Xerox Co., Ltd., FUJIFILM Corporation
Inventors: Yasuhiko Kaneko, Junichi Kaneko, Satoshi Yamamoto, Michitaka Hariya, Takashi Nagao, Yukio Kumazawa, Noriaki Seki
-
Patent number: 8036377
Abstract: The disclosure provides a hardware architecture for an encryption and decryption device. The hardware architecture can improve the encryption and decryption data rate by using parallel processing and pipeline operation. Further, the hardware architecture can save footprint by sharing hardware components. Additionally, the hardware architecture can be associated with a memory to protect the information stored at the memory.
Type: Grant
Filed: December 12, 2007
Date of Patent: October 11, 2011
Assignee: Marvell International Ltd.
Inventors: Tze Lei Poo, Siu-Hung Fred Au, Gregory Burd, David Geddes, Heng Tang
-
Publication number: 20110202586
Abstract: Some embodiments provide a configurable integrated circuit (“IC”) that includes several configurable tiles arranged in a tile arrangement. Each configurable tile has a set of configurable logic circuits and a set of configurable routing circuits for routing signals between configurable logic circuits. In some embodiments, at least a first logic circuit of a first tile has at least one direct connection with a second circuit of a second tile that does not neighbor the first tile and that is not aligned horizontally or vertically with the first tile in the tile arrangement. Also, in some embodiments, each particular tile further has a set of configurable input-select circuits for receiving inputs and configurably supplying a sub-set of the received inputs to the configurable logic circuits in the particular tile.
Type: Application
Filed: November 18, 2010
Publication date: August 18, 2011
Inventors: Steven Teig, Jason Redgrave
-
Patent number: 7934031
Abstract: An asynchronous logic family of circuits which communicate on delay-insensitive flow-controlled channels with 4-phase handshakes and 1 of N encoding, compute output data directly from input data using domino logic, and use the state-holding ability of the domino logic to implement pipelining without additional latches.
Type: Grant
Filed: May 11, 2006
Date of Patent: April 26, 2011
Assignee: California Institute of Technology
Inventors: Andrew M. Lines, Alain J. Martin, Uri Cummings
-
Patent number: 7916974
Abstract: A processing device has plural processing modules executing a processing; and plural connectors each having a linking section, an associating section, and a controller. The linking section is able to link with at least one other connector at an input side or an output side. The associating section associates the connector with one of the processing modules. In accordance with a linked state, the controller controls the processing module associated by the associating section.
Type: Grant
Filed: June 5, 2006
Date of Patent: March 29, 2011
Assignees: Fuji Xerox Co., Ltd., Fujifilm Corporation
Inventors: Yasuhiko Kaneko, Junichi Kaneko, Satoshi Yamamoto, Michitaka Hariya, Takashi Nagao, Yukio Kumazawa, Noriaki Seki
-
Patent number: 7890559
Abstract: A data processing system, which is particularly useful for carrying out modular multiplication, especially for cryptographic purposes, comprises a plurality of independent, serially connected processing elements which are provided with data in a cyclical fashion via a control mechanism that is capable of transferring data from a set of registers to earlier ones in the series of the serially connected processing elements, at the end of a predetermined number of cycles.
Type: Grant
Filed: December 22, 2006
Date of Patent: February 15, 2011
Assignee: International Business Machines Corporation
Inventors: Camil Fayad, John K. Li, Siegfried K. H. Sutter, Tamas Visegrady
-
Patent number: 7834785
Abstract: An encoding device and method, of CABAC type, for an initial stream of binary digital information intended to generate an outgoing stream to form video images, after decoding, the method including the following steps: bit-by-bit analysis of the successive series of bits of the initial binary stream so as to deduce therefrom, for each bit, an interval representing the probability of occurrence associated with this bit, this interval being defined by its size CIR and its lower bound CIL, analysis of this interval so as to ensure, if necessary, a renormalization thereof. The renormalization is non-iterative and for each bit of the initial stream is compliant with the appended figure in which: M is the length of the sequence S of high-order bits common to CIL and CIR, N is the integer number such that CIR·2^(N−1) < 0.25 ≤ CIR·2^N, BO is the number of bits waiting to be inserted.
Type: Grant
Filed: June 27, 2007
Date of Patent: November 16, 2010
Assignee: Assistance Technique et Etude de Materiels Electroniques - ATEME
Inventor: Tchi Southivong
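The difference between iterative and non-iterative renormalization can be shown with a small sketch. It assumes the garbled inequality reads CIR·2^(N−1) < 0.25 ≤ CIR·2^N, and it models CIR as a hypothetical 9-bit integer register in which 0.25 is scaled to 256; neither the scaling nor the function names come from the patent, and the handling of CIL, M, and BO from the appended figure is not modeled.

```python
QUARTER = 256  # hypothetical scaling: 0.25 of the interval in a 9-bit range register

def renorm_shift_iterative(cir):
    """Count doublings one at a time until the range reaches the quarter mark."""
    n = 0
    while cir < QUARTER:
        cir <<= 1
        n += 1
    return n

def renorm_shift_direct(cir):
    """Compute the same shift count in one step from the position of the top set bit."""
    return max(0, QUARTER.bit_length() - cir.bit_length())
```

In hardware, the direct form corresponds to a leading-zero count followed by a single barrel shift, which avoids a data-dependent loop in the pipeline.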
-
Patent number: 7818357
Abstract: A CORDIC processor is configured to perform orthogonal or oblique CORDIC projections in order to cancel interference in a received signal. The CORDIC projection can be used to rotate an interference signal vector so that its only non-zero component is in the last Euclidean coordinate of the representative vector. A measurement vector is then subject to the same rotations as the interference vector. As a result of the rotation on the measurement vector, all components of the measurement vector parallel to the interference vector will be resolved onto the same coordinate as the rotated interference vector. The parallel components of the symbol vector can be cancelled by zeroing that coordinate, and the modified measurement vector can then be rotated back to its original coordinates, to produce an orthogonally projected version of the original measurement vector. Typically, the projection is onto a subspace that is orthogonal or oblique to an interference subspace, which may be one-dimensional.
Type: Grant
Filed: November 23, 2005
Date of Patent: October 19, 2010
Assignee: Rambus Inc.
Inventor: Leo Bredehoft
-
End-to-end residue-based protection of an execution pipeline that supports floating point operations
Patent number: 7769795
Abstract: An end-to-end residue-based protection scheme protects multiple units/blocks of a floating point execution pipeline without the complexity and cost of having multiple protection schemes for the execution pipeline. Protecting an execution pipeline that supports floating point operations includes factoring in component operations, such as normalization and rounding, into a residue generated for a result. In addition, residues of operands are distilled to extract their corresponding mantissa residues, thus allowing the floating point operations (e.g., multiplication, addition, etc.) to be applied to the mantissa residues.
Type: Grant
Filed: June 3, 2005
Date of Patent: August 3, 2010
Assignee: Oracle America, Inc.
Inventor: Sorin Iacobovici
-
Patent number: 7769099
Abstract: The invention relates to techniques for implementing high-speed precoders, such as Tomlinson-Harashima (TH) precoders. In one aspect of the invention, look-ahead techniques are utilized to pipeline a TH precoder, resulting in a high-speed TH precoder. These techniques may be applied to pipeline various types of TH precoders, such as Finite Impulse Response (FIR) precoders and Infinite Impulse Response (IIR) precoders. In another aspect of the invention, parallel processing multiple non-pipelined TH precoders results in a high-speed parallel TH precoder design. Utilization of high-speed TH precoders may enable network providers to, for example, operate 10 Gigabit Ethernet with copper cable rather than fiber optic cable.
Type: Grant
Filed: September 13, 2005
Date of Patent: August 3, 2010
Assignee: Leanics Corporation
Inventors: Keshab K. Parhi, Yongru Gu
-
Publication number: 20100191786
Abstract: A digital signal processing block with a preadder stage for an integrated circuit is described. The digital signal processing block includes a preadder stage and a control bus. The control bus is coupled to the preadder stage for dynamically controlling operation of the preadder stage. The preadder stage includes: a first input port of a first multiplexer coupled to the control bus; a second input port of a first logic gate coupled to the control bus; a third input port of a second logic gate coupled to the control bus; and a fourth input port of an adder/subtractor coupled to the control bus.
Type: Application
Filed: January 27, 2009
Publication date: July 29, 2010
Applicant: XILINX, INC.
Inventors: James M. Simkins, Alvin Y. Ching, John M. Thendean, Vasisht M. Vadi, Chi Fung Poon, Muhammad Asim Rab
-
Patent number: 7747020
Abstract: Performing a hash algorithm in a processor architecture to alleviate performance bottlenecks and improve overall algorithm performance. In one embodiment of the invention, the hash algorithm is pipelined within the processor architecture.
Type: Grant
Filed: December 4, 2003
Date of Patent: June 29, 2010
Assignee: Intel Corporation
Inventor: Wajdi K. Feghali
-
Patent number: 7634631
Abstract: A method for updating a current network flow statistic stored in a memory device, comprising: storing a first statistic and a first address corresponding to a location in the memory device in a first stage of a multiple stage delay pipeline; shifting the first statistic and the first address to successive stages of the pipeline during successive clock cycles; at a middle stage of the pipeline, sending a read signal to the memory device to read the current statistic from the location; at a last stage of the pipeline, receiving the current statistic from the memory device in response to the read signal, adding the first statistic to the current statistic to generate an updated statistic, and sending a write signal to the memory device to write the updated statistic to the location; and, if a second statistic for the first address is stored in the first stage of the pipeline while the first statistic is stored in any but the first and last stages of the pipeline, replacing the first statistic with a sum of the f
Type: Grant
Filed: July 10, 2006
Date of Patent: December 15, 2009
Assignee: Alcatel Lucent
Inventor: Hayrettin Buyuktepe
-
Patent number: 7558972
Abstract: A data processing apparatus comprises a plurality of calculating units connected to each other in series, a plurality of memories connected in between the plurality of calculating units, and a control unit operable to determine a calculating unit, which performs calculation in a unit cycle, among the plurality of the calculating units. It is possible to reduce unnecessary power consumption in the data processing apparatus while completing processing in permissible processing time set by an application.
Type: Grant
Filed: January 24, 2006
Date of Patent: July 7, 2009
Assignee: Panasonic Corporation
Inventors: Masashi Hoshino, Masahiro Ohashi
-
Patent number: 7555692Abstract: A processor that protects an execution pipeline includes a residue-based error detection infrastructure including a first logic for computing a first residue of a result of an executed instruction instance, and a second logic for computing a second residue of the result. The second logic applies arithmetic operations of the executed instruction instance to residues of operands of the instruction instance. The execution pipeline includes registers and one or more arithmetic execution units. A method of protecting an execution pipeline includes performing one or more operations of an instruction instance on residues of operands of the instruction instance, computing a first residue of a result of the operations on the operand residues, computing a second residue from a result of executing the instruction instance, and checking the first residue against the second residue to determine whether errors were introduced while the instruction instance was resident in the execution pipeline.Type: GrantFiled: May 24, 2005Date of Patent: June 30, 2009Assignee: Sun Microsystems, Inc.Inventor: Sorin Iacobovici
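
The residue check described in this abstract can be illustrated with a small software sketch. It is not the patented circuit: the modulus 3, the function names, and the `fault_mask` parameter used to simulate a corrupted result are all assumptions made for illustration. One residue is taken from the actual result, the other is obtained by applying the same operation to the operand residues, and a mismatch signals an error.

```python
from typing import Tuple

MODULUS = 3  # assumed check modulus; the abstract does not name one


def residue(value: int) -> int:
    """Residue of a non-negative integer under the assumed modulus."""
    return value % MODULUS


def checked_add(a: int, b: int, fault_mask: int = 0) -> Tuple[int, bool]:
    """Add a and b, then compare two independently derived residues.

    fault_mask is a hypothetical knob: its set bits are flipped in the
    result to imitate an error introduced inside the execution pipeline.
    """
    result = (a + b) ^ fault_mask

    # Residue computed from the (possibly corrupted) result.
    from_result = residue(result)

    # The same addition applied to the operand residues.
    from_operands = residue(residue(a) + residue(b))

    return result, from_result == from_operands


print(checked_add(20, 22))                # fault-free addition
print(checked_add(20, 22, fault_mask=1))  # one flipped bit is flagged
```

A check of this kind misses any error that changes the result by a multiple of the modulus, which is one reason the choice of modulus matters in practice.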
-
Patent number: 7533294Abstract: A functional coverage based test generation technique for pipelined architectures is presented. A general graph-theoretic model is developed that can capture the structure and behavior (instruction-set) of a wide variety of pipelined processors. A functional fault model is developed and used to define the functional coverage for pipelined architectures. Test generation procedures are developed that accept the graph model of the architecture as input and generate test programs to detect all the faults in the functional fault model. A graph model of the pipelined processor is automatically generated from the specification using functional abstraction. Functional test programs are generated based on the coverage of the pipeline behavior. Module level property checking is used to reduce test generation time.Type: GrantFiled: September 9, 2005Date of Patent: May 12, 2009Assignee: The Regents of the University of CaliforniaInventors: Prabhat Mishra, Nikil Dutt
-
Publication number: 20090100313Abstract: Disclosed is a pipelined iterative process and system. Data is received at an input port and is processed in a symbolwise fashion. Processing of each symbol is performed other than relying on completing the processing of an immediately preceding symbol such that operation of the system or process is independent of an order of the input symbols.Type: ApplicationFiled: October 14, 2008Publication date: April 16, 2009Applicant: The Royal Institution for the Advancement of Learning/McGill UniversityInventors: Warren J. GROSS, Shie MANNOR, Saeed SHARIFI TEHRANI
-
Patent number: 7463678Abstract: A circuit and method are provided for reducing the effect of having potentially different sizes for an Inverse Discrete Fourier Transform (IDFT) at a transmitter and a Discrete Fourier Transform (DFT) at a receiver in a telecommunications system without requiring a change in the DFT's size. The method includes the following steps. The first step includes determining whether the IDFT size is greater than, equal to, or less than the DFT size. The second step includes selecting a target impulse response length from a predefined set of impulse response lengths in accordance with a result of the previous step. The third step includes training an equalizer at the receiver to the target impulse response length. The circuit comprises hardware and software for implementing the method.Type: GrantFiled: March 10, 2003Date of Patent: December 9, 2008Assignee: CIENA CorporationInventors: Alberto Ginesi, Song Zhang, Andrew Deczky, Duncan Baird, Christian Bourget
-
Publication number: 20080294706Abstract: A digital adder circuit comprising a plurality of logical stages in the carry logic of said adder circuit, for generating and propagating predetermined groups of operand bits, each stage implementing a predetermined logic function and processing input variables from a preceding stage and outputting result values to a succeeding stage static and dynamic logic in the carry network of a 4-bit adder, and with output from the first stage fed directly as an input (60, 62) to the third stage of the carry network. Preferably, stages having normally relatively high switching activities are implemented in static logic. Preferably, the first stage of its carry network is implemented in a static logic, and the rest of the stages in dynamic logic.Type: ApplicationFiled: April 9, 2008Publication date: November 27, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Wilhelm Haller, Rolf Sautter, Christoph Wandel, Ulrich Weiss
-
Patent number: 7274369Abstract: Digital Image compositing using a programmable graphics processor is described. The programmable graphics processor supports high-precision data formats and can be programmed to complete a plurality of compositing operations in a single pass through a fragment processing pipeline within the programmable graphics processor. Source images for one or more compositing operations are stored in graphics memory, and a resulting composited image is output or stored in graphics memory. More-complex compositing operations, such as blur, warping, morphing, and the like, can be completed in multiple passes through the fragment processing pipeline. A composited image produced during a pass through the fragment processing pipeline is stored in graphics memory and is available as a source image for a subsequent pass.Type: GrantFiled: June 9, 2005Date of Patent: September 25, 2007Assignee: NVIDIA CorporationInventors: Rui M. Bastos, Daniel Elliott Wexler, Larry Gritz, Jonathan Rice, Harold Robert Feldman Zatz, Matthew N. Papakipos, David Kirk
-
Patent number: 7275125Abstract: A circuit and method to provide pipeline bit handling across a bus bridge between two different buses. In a preferred embodiment, the pipeline bit handling circuit provides rule enforcement for a P-bit address modifier across a bus bridge between two different buses with different rules for the P-bit address modifier. In a bus domain where pipeline transactions are allowed if the P-Bit is asserted and are not allowed if the P-Bit is not asserted, embodiments herein allow a master bus device to ensure that all bus devices will see a P=0 command with a defined minimum spacing to any other P=0 command. The required separation for P=0 commands is maintained within the bus bridge. In the preferred embodiments, the separation between P=0 commands is maintained by immediately retrying P=0 commands rather than spacing snoop requests.Type: GrantFiled: February 24, 2005Date of Patent: September 25, 2007Assignee: International Business Machines CorporationInventors: Robert Allen Drehmel, Clarence Rosser Ogilvie, Charles S. Woodruff
-
Patent number: 7240082Abstract: A method for improving the processing efficiency of a pipeline architecture in a processor. The processor has a first functional unit; a second functional unit; and a control unit electrically connected to the first and the second functional units for generating a plurality of control signals to control the first and the second functional units. The method includes the following steps: (a) executing a first calculation task with the first functional unit or the second functional unit; (b) determining an executing time period of a second calculation task with the control unit according to the functional unit executing the first calculation task, an executing time period of the first calculation task, and whether the second calculation task depends upon a result of the first calculation task; and (c) executing the second calculation task with the first functional unit according to the executing time period of the second calculation task determined in step (b).Type: GrantFiled: July 7, 2003Date of Patent: July 3, 2007Assignee: Faraday Technology Corp.Inventor: Shan-Chyun Ku
-
Patent number: 7171535Abstract: A general-purpose serial operation pipeline realizes a complicated processing flow with an extemporaneous and explosive amount of operations with respect to various data sizes. A plurality of arithmetic-logic circuits (SALCs) that are controlled individually, and that can be operated together with another arithmetic-logic circuit (SALC) are connected in a cascade manner to form a serial operation pipeline. At least one of the plural SALCs includes a line for outputting data from an upstream SALC to a downstream SALC, a line for feeding back reverse data from the downstream SALC to the upstream SALC, and latch circuits for latching the data on the respective lines, thereby being capable of feeding back data from an arbitrary SALC to another SALC.Type: GrantFiled: April 1, 2003Date of Patent: January 30, 2007Assignee: Sony Computer Entertainment Inc.Inventor: Junichi Naoi
-
Patent number: 7107305Abstract: A tightly coupled dual 16-bit multiply-accumulate (MAC) unit for performing single-instruction/multiple-data (SIMD) operations may forward an intermediate result to another operation in a pipeline to resolve an accumulating dependency penalty. The MAC unit may also be used to perform 32-bit×32-bit operations.Type: GrantFiled: October 5, 2001Date of Patent: September 12, 2006Assignee: Intel CorporationInventors: Deli Deng, Anthony Jebson, Yuyun Liao, Nigel C. Paver, Steve J. Strazdus
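
The abstract does not say how the dual 16-bit unit performs 32-bit×32-bit operations. As a hedged illustration only, the sketch below uses the standard schoolbook decomposition of an unsigned 32-bit product into four 16-bit partial products; the function names are hypothetical, and neither the SIMD mode nor the result forwarding is modeled.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mul16(x: int, y: int) -> int:
    """One unsigned 16-bit x 16-bit multiply, the only multiplier assumed here."""
    assert 0 <= x <= MASK16 and 0 <= y <= MASK16
    return x * y


def mul32_from_16(a: int, b: int) -> int:
    """Unsigned 32-bit x 32-bit product built from four 16-bit multiplies."""
    assert 0 <= a <= MASK32 and 0 <= b <= MASK32
    a_hi, a_lo = a >> 16, a & MASK16
    b_hi, b_lo = b >> 16, b & MASK16

    low = mul16(a_lo, b_lo)                         # weight 2**0
    cross = mul16(a_hi, b_lo) + mul16(a_lo, b_hi)   # weight 2**16
    high = mul16(a_hi, b_hi)                        # weight 2**32

    return (high << 32) + (cross << 16) + low


print(hex(mul32_from_16(0x12345678, 0x9ABCDEF0)))
```

Four partial products map naturally onto two passes through a unit that holds two 16-bit multipliers, though whether the patented design schedules them this way is not stated in the abstract.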
-
Patent number: 7047317Abstract: A high performance network address processor is provided comprising a longest prefix match lookup engine for receiving a request for data from a designated network destination address. An associated data engine is also provided to couple to the longest prefix match lookup engine for receiving a longest prefix match lookup engine output address from the longest prefix match lookup engine and providing a network address processor data output corresponding to the designated network destination address requested. The high performance network address processor longest prefix match lookup engine comprises a plurality of pipelined lookup tables. Each table provides an index to a given row within the next higher stage lookup table, wherein the last stage, or the last table, in the set of tables comprises an associated data pointer provided as input to the associated data engine.Type: GrantFiled: June 14, 2000Date of Patent: May 16, 2006Assignee: Altera CorporationInventors: Jonathan Lockwood Huie, James Michael O'Connor
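
A staged lookup of the kind this abstract describes can be sketched in software as follows. Everything concrete here is an assumption for illustration: 8-bit toy addresses, two stages of 4 bits each, the sample route table, and the decision to precompute every last-stage entry and share identical rows. Stage 0 maps the high bits of the address to a row index in stage 1, and the stage 1 entry holds the associated data pointer.

```python
ADDR_BITS = 8   # toy address width (assumption)
STRIDE = 4      # bits consumed per stage (assumption)

# Hypothetical routes: (prefix value, prefix length) -> associated data pointer
ROUTES = {(0, 0): "default", (0b1010, 4): "A", (0b101011, 6): "B"}


def naive_lpm(routes, addr):
    """Reference longest-prefix match by scanning every route."""
    best_len, best_ptr = -1, None
    for (value, length), ptr in routes.items():
        if length > best_len and (addr >> (ADDR_BITS - length)) == value:
            best_len, best_ptr = length, ptr
    return best_ptr


def build_tables(routes):
    """Build stage 0 (row indices) and stage 1 (rows of data pointers)."""
    stage0, stage1, row_ids = [], [], {}
    for hi in range(1 << STRIDE):
        row = tuple(naive_lpm(routes, (hi << STRIDE) | lo)
                    for lo in range(1 << STRIDE))
        if row not in row_ids:          # identical rows are stored once
            row_ids[row] = len(stage1)
            stage1.append(row)
        stage0.append(row_ids[row])
    return stage0, stage1


def lookup(stage0, stage1, addr):
    """Two dependent table reads, one per pipeline stage."""
    row = stage0[addr >> STRIDE]                      # stage 0: high bits -> row
    return stage1[row][addr & ((1 << STRIDE) - 1)]    # stage 1: low bits -> pointer


STAGE0, STAGE1 = build_tables(ROUTES)
print(lookup(STAGE0, STAGE1, 0b10101100))  # falls under the 6-bit prefix
print(lookup(STAGE0, STAGE1, 0b10100000))  # only the 4-bit prefix applies
```

Because each stage performs a single indexed read, a hardware version could accept a new address every cycle while earlier lookups are still in later stages.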
-
Patent number: 7043710Abstract: A system and method for early evaluation in micropipeline processors to improve performance is provided. The present invention presents a design methodology where a micropipeline processor block (e.g., a binary full adder) is capable of computing a result based on the arrival of only a subset of inputs. In general, early evaluation allows micropipeline processor blocks to operate in parallel, where they might otherwise operate sequentially because of data arrival dependencies; thereby improving performance of the micropipeline processors.Type: GrantFiled: February 10, 2004Date of Patent: May 9, 2006Assignee: Mississippi State UniversityInventors: Robert B. Reese, Mitchell A. Thornton
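
The binary full adder named in this abstract makes a convenient illustration of early evaluation. When the two addend bits are equal, the carry-out is already decided and does not need the carry-in. The sketch below is a plain functional model with hypothetical function names, not the micropipeline handshaking itself.

```python
from typing import Optional, Tuple


def early_carry(a: int, b: int) -> Optional[int]:
    """Return the carry-out if a and b alone decide it, otherwise None."""
    if a == b:
        return a      # both 0: carry is killed; both 1: carry is generated
    return None       # a != b: the carry-in propagates, so it must arrive first


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """Reference full adder that waits for all three inputs."""
    total = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return total, cout


# The early answer, when available, always agrees with the full computation.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            early = early_carry(a, b)
            if early is not None:
                assert early == full_adder(a, b, cin)[1]
```

In this model, half of the eight input combinations allow the carry to be produced without waiting for the carry-in, which is where the potential parallelism comes from.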