Scoreboarding, Reservation Station, Or Aliasing Patents (Class 712/217)
-
Publication number: 20140013085
Abstract: A system and method for reducing latency and power of register renaming. A free list in a processor includes multiple banks for indicating availability of register identifiers used for register renaming. A register rename unit receives one or more destination architectural registers to rename with physical register identifiers. Responsive to determining the multiple banks within the free list are unbalanced with available physical register identifiers, one or more returning physical register identifiers are assigned to the destination architectural registers before assigning any physical register identifiers from any bank of the multiple banks with a lowest number of available physical register identifiers. A returning physical register identifier is a physical register identifier that is available again for assignment to a destination architectural register but not yet indicated in the free list as available.
Type: Application
Filed: July 3, 2012
Publication date: January 9, 2014
Inventors: Suparn Vats, John H. Mylius, Abhijit Radhakrishnan
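The allocation policy in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the class name `BankedFreeList`, the imbalance `threshold`, and the choice to fall back to the fullest bank are all assumptions made for the example.

```python
from collections import deque


class BankedFreeList:
    """Toy model of a banked free list with a side queue of returning registers."""

    def __init__(self, banks):
        # banks: list of lists of free physical register ids (hypothetical layout)
        self.banks = [deque(b) for b in banks]
        # ids that are free again but not yet recorded in any bank
        self.returning = deque()

    def release(self, preg):
        """A physical register becomes available again (not yet in the free list)."""
        self.returning.append(preg)

    def unbalanced(self, threshold=1):
        """Assumed imbalance test: bank occupancy differs by more than threshold."""
        sizes = [len(b) for b in self.banks]
        return max(sizes) - min(sizes) > threshold

    def allocate(self):
        # When the banks are unbalanced, prefer a returning id so the
        # emptiest bank is not drained further.
        if self.returning and self.unbalanced():
            return self.returning.popleft()
        # Otherwise take from the fullest bank (an assumption of this sketch).
        bank = max(self.banks, key=len)
        if bank:
            return bank.popleft()
        if self.returning:
            return self.returning.popleft()
        raise RuntimeError("no free physical registers")
```

With banks `[[1, 2, 3], [4]]` and register 9 released, the first allocation returns 9 because the banks differ by more than the assumed threshold; later allocations draw from the fuller bank.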
-
Publication number: 20130339671
Abstract: A system and method for reducing the latency of load operations. A register rename unit within a processor determines whether a decoded load instruction is eligible for conversion to a zero-cycle load operation. If so, control logic assigns a physical register identifier associated with a source operand of an older dependent store instruction to the destination operand of the load instruction. Additionally, the register rename unit marks the load instruction to prevent it from reading data associated with the source operand of the store instruction from memory. Due to the duplicate renaming, this data may be forwarded from a physical register file to instructions that are younger and dependent on the load instruction.
Type: Application
Filed: June 14, 2012
Publication date: December 19, 2013
Inventors: Gerard R. Williams, III, John H. Mylius, Conrade Blasco-Allue
-
Patent number: 8612728
Abstract: A pipelined computer processor is presented that reduces data hazards such that high processor utilization is attained. The processor restructures a set of instructions to operate concurrently on multiple pieces of data in multiple passes. One subset of instructions operates on one piece of data while different subsets of instructions operate concurrently on different pieces of data. A validity pipeline tracks the priming and draining of the pipeline processor to ensure that only valid data is written to registers or memory. Pass-dependent addressing is provided to correctly address registers and memory for different pieces of data.
Type: Grant
Filed: August 8, 2011
Date of Patent: December 17, 2013
Assignee: Micron Technology, Inc.
Inventors: Neal Andrew Crook, Alan T. Wootton, James Peterson
-
Patent number: 8601177
Abstract: A method may include distributing ranges of addresses in a memory among a first set of functions in a first pipeline. The first set of the functions in the first pipeline may operate on data using the ranges of addresses. Different ranges of addresses in the memory may be redistributed among a second set of functions in a second pipeline without waiting for the first set of functions to be flushed of data.
Type: Grant
Filed: June 27, 2012
Date of Patent: December 3, 2013
Assignee: Intel Corporation
Inventor: Thomas A. Piazza
-
Patent number: 8583900
Abstract: A register renaming table recovery method for use in a processor includes the following steps. Firstly, a flushing operation is performed on a renaming-history table according to a flushed ID. Then, a first renamed ID corresponding to a first register is acquired from an unflushed row of the renaming-history table that is immediately adjacent to the flushed ID. If the first renamed ID is occupied, a register renaming table is updated to rename the first register according to the first renamed ID. Whereas, if the first renamed ID is not occupied, the register renaming table is updated to keep the first register unrenamed.
Type: Grant
Filed: November 17, 2009
Date of Patent: November 12, 2013
Assignee: RDC Semiconductor Co., Ltd.
Inventors: Chien-Nan I, Chun-Wang Wei
-
Patent number: 8583901
Abstract: Embodiments of a processor architecture utilizing multi-bank implementation of physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and a renaming logic. The physical register mapping table has a plurality of entries each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections each of which having respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of physical register mapping table are also provided.
Type: Grant
Filed: February 4, 2010
Date of Patent: November 12, 2013
Assignee: STMicroelectronics (Beijing) R&D Co. Ltd.
Inventors: Peng Fei Zhu, Hong-Xia Sun, Yong Qiang Wu
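One way to picture the multi-bank search is the sketch below. It assumes the "first state" means a free register, that the table divides evenly into banks, and that each bank contributes at most one register per rename group; none of these details come from the abstract, and `rename_group` is a hypothetical name.

```python
FREE, BUSY = 0, 1  # the "first state" is modeled as FREE (assumption)


def rename_group(states, num_banks, arch_regs):
    """Map each architectural register to a free physical register.

    states is the mapping table: states[p] is the state of physical register p.
    The table is split into num_banks equal, non-overlapping sections.
    """
    bank_size = len(states) // num_banks
    hits = []
    # Hardware would search every bank at the same time; this loop
    # models one independent search per bank.
    for bank in range(num_banks):
        start = bank * bank_size
        hit = next(
            (p for p in range(start, start + bank_size) if states[p] == FREE),
            None,
        )
        if hit is not None:
            hits.append(hit)
    if len(hits) < len(arch_regs):
        raise RuntimeError("not enough free physical registers this cycle")
    mapping = dict(zip(arch_regs, hits))
    for p in mapping.values():
        states[p] = BUSY  # the chosen registers leave the first state
    return mapping
```

Because each bank is searched independently, two architectural registers renamed in the same group receive physical registers from different banks in this model.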
-
Patent number: 8578136
Abstract: An apparatus and method are provided for performing register renaming, whereby architectural registers from a set of architectural registers are mapped to physical registers from a set of physical registers. Available register identifying circuitry is provided which is responsive to a current state of the apparatus to identify which physical registers form a pool of physical registers available to be mapped by register renaming circuitry to an architectural register specified by an instruction to be executed. Configuration storage stores configuration data whose value is modified during operation of the processing circuitry, such that when the configuration data has a first value, the configuration data identifies at least one architectural register of the architectural register set which does not require mapping to a physical register by the register renaming circuitry.
Type: Grant
Filed: June 15, 2010
Date of Patent: November 5, 2013
Assignee: ARM Limited
Inventors: Frederic Claude Marie Piry, Louis-Marie Vincent Mouton, Luca Scalabrino, Richard Roy Grisenthwaite, David Hennah Mansell
-
Publication number: 20130283014
Abstract: Embodiments of apparatus, computer-implemented methods, systems, and computer-readable media are described herein for expediting execution time memory alias checking. A sequence of instructions targeted for execution on an execution processor may be received or retrieved. The execution processor may include a plurality of alias registers and circuitry configured to check entries in the alias register for memory aliasing. One or more optimizations may be performed on the received or retrieved sequence of instructions to optimize execution performance of the received or retrieved sequence of instructions. This may include a reorder of a plurality of memory instructions in the received or retrieved sequence of instructions. After the optimization, one or more move instructions may be inserted in the optimized sequence of instructions to move one or more entries among the alias registers during execution, to expedite alias checking at execution time. Other embodiments may be described and/or claimed.
Type: Application
Filed: September 27, 2011
Publication date: October 24, 2013
Inventors: Cheng Wang, Youfeng Wu
-
Publication number: 20130262823
Abstract: A computer system for optimizing instructions includes a processor including an instruction execution unit configured to execute instructions and an instruction optimization unit configured to optimize instructions and memory to store machine instructions to be executed by the instruction execution unit. The computer system is configured to perform a method including analyzing machine instructions from among a stream of instructions to be executed by the instruction execution unit, the machine instructions including a memory load instruction and a data processing instruction to perform a data processing function based on the memory load instruction, identifying the machine instructions as being eligible for optimization, merging the machine instructions into a single optimized internal instruction, and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.
Type: Application
Filed: March 28, 2012
Publication date: October 3, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
-
Patent number: 8544019
Abstract: In some embodiments, a method includes receiving a request to generate a thread and supplying a request to a queue in response at least to the received request. The method may further include fetching a plurality of instructions in response at least in part to the request supplied to the queue and executing at least one of the plurality of instructions. In some embodiments, an apparatus includes a storage medium having stored therein instructions that when executed by a machine result in the method. In some embodiments, an apparatus includes circuitry to receive a request to generate a thread and to queue a request to generate a thread in response at least to the received request. In some embodiments, a system includes circuitry to receive a request to generate a thread and to queue a request to generate a thread in response at least to the received request, and a memory unit to store at least one instruction for the thread.
Type: Grant
Filed: May 26, 2011
Date of Patent: September 24, 2013
Assignee: Intel Corporation
Inventors: Hong Jiang, Thomas A. Piazza, Brian D. Rauchfuss, Sreedevi Chalasani, Steven J. Spangler
-
Patent number: 8528000
Abstract: The execution environment provides for scalability where components will execute in parallel and exploit various patterns of parallelism. Dataflow applications are represented by reusable dataflow graphs called map components, while the executable version is called a prepared map. Using runtime properties the prepared map is executed in parallel with a thread allocated to each map process. The execution environment not only monitors threads, detects and corrects deadlocks, logs and controls program exceptions, but also data input and output ports of the map components are processed in parallel to take advantage of data partitioning schemes. Port implementation supports multi-state null value tokens to more accurately report exceptions. Data tokens are batched to minimize synchronization and transportation overhead and thread contention.
Type: Grant
Filed: May 6, 2010
Date of Patent: September 3, 2013
Assignee: Pervasive Software, Inc.
Inventors: Larry Lee Schumacher, Agustin Gonzales-Tuchmann, Laurence Tobin Yogman, Paul C. Dingman
-
Patent number: 8521982
Abstract: A system and method for tracking core load requests and providing arbitration and ordering of requests. When a core interface unit (CIU) receives a load operation from the processor core, a new entry is allocated in a queue of the CIU. In response to allocating the new entry in the queue, the CIU detects contention between the load request and another memory access request. In response to detecting contention, the load request may be suspended until the contention is resolved. Received load requests may be stored in the queue and tracked using a least recently used (LRU) mechanism. The load request may then be processed when the load request resides in a least recently used entry in the load request queue. The CIU may also suspend issuing an instruction unless a read claim (RC) machine is available. In another embodiment, the CIU may issue stored load requests in a specific priority order.
Type: Grant
Filed: April 15, 2009
Date of Patent: August 27, 2013
Assignee: International Business Machines Corporation
Inventors: Robert A. Cargnoni, Guy L. Guthrie, Thomas L. Jeremiah, Stephen J. Powell, William J. Starke, Jeffrey A. Steucheli
-
Publication number: 20130173886
Abstract: Systems and methods for tracking data hazards in a processor. The processor comprises a pipelined architecture configured to execute a first instruction and a second instruction, wherein the second instruction is older than the first instruction. At least one of the first and second instructions comprises at least one operand expressed as a range of registers. Hazard detection logic is configured to compare the first instruction and the second instruction to determine if there is a data hazard, prior to expanding the second instruction.
Type: Application
Filed: January 4, 2012
Publication date: July 4, 2013
Applicant: QUALCOMM INCORPORATED
Inventors: Kenneth Alan Dockser, Yusuf Cagatay Tekmen
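Comparing operands as ranges, without first expanding them into individual registers, amounts to an interval-overlap test. The sketch below assumes inclusive `(first, last)` register-number pairs and checks only read-after-write hazards; the function names are illustrative, not taken from the application.

```python
def ranges_overlap(a, b):
    """True if inclusive register ranges a=(first, last) and b=(first, last) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def has_raw_hazard(older_dest, younger_sources):
    """Read-after-write check: does the younger instruction read any register
    that the older instruction writes? Ranges are compared directly, so the
    older instruction never has to be expanded into per-register operations."""
    return any(ranges_overlap(older_dest, src) for src in younger_sources)
```

For example, an older instruction writing registers 4 through 7 conflicts with a younger instruction that reads register 6, but not with one that reads only registers 0 through 3 and 8 through 9.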
-
Publication number: 20130151819
Abstract: A data processing apparatus with a processing pipeline, the pipeline including exception control circuitry and error detection circuitry. An exception storage unit is configured to maintain an age-ordered list of entries corresponding to instructions issued to the processing pipeline for execution. The unit is configured to store, in association with each entry, an exception indicator indicating whether the instruction is an exception instruction and whether it has generated an exception and an error indicator indicating whether the instruction has generated an error. The apparatus is configured to indicate to the exception storage unit that an instruction is resolved when processing of the instruction has reached a stage such that it is known whether the instruction will generate an error and whether the instruction will generate an exception; and the exception control circuitry is configured to sequentially retire oldest resolved entries from the list in the exception storage unit.
Type: Application
Filed: December 7, 2011
Publication date: June 13, 2013
Applicant: ARM LIMITED
Inventors: Frederic Claude Marie PIRY, Luca SCALABRINO, Guillaume SCHON, Melanie Emanuelle Lucie TEYSSIER
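The age-ordered list with in-order retirement of resolved entries behaves much like a small reorder queue. A minimal sketch follows, assuming instructions are identified by integer tags; `Entry` and `ExceptionStore` are hypothetical names chosen for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    tag: int                 # identifies the issued instruction (assumed)
    resolved: bool = False   # outcome known: exception/error status is final
    exception: bool = False  # exception indicator
    error: bool = False      # error indicator


class ExceptionStore:
    """Age-ordered list: oldest entry at the left end of the deque."""

    def __init__(self):
        self.entries = deque()

    def issue(self, tag):
        self.entries.append(Entry(tag))

    def resolve(self, tag, exception=False, error=False):
        for e in self.entries:
            if e.tag == tag:
                e.resolved, e.exception, e.error = True, exception, error
                return
        raise KeyError(tag)

    def retire(self):
        """Retire entries from the oldest end while they are resolved."""
        retired = []
        while self.entries and self.entries[0].resolved:
            retired.append(self.entries.popleft())
        return retired
```

A younger instruction that resolves early stays in the list until every older entry has resolved, so retirement order always matches issue order.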
-
Patent number: 8464029
Abstract: An out-of-order execution microprocessor for reducing load instruction replay likelihood due to store collisions. A register alias table (RAT) is coupled to first and second queues of entries and generates dependencies used to determine when instructions may execute out of order. The RAT allocates an entry of the first queue and populates the allocated entry with an instruction pointer of a load instruction, when it determines that the load instruction must be replayed. The RAT allocates an entry of the second queue when it encounters a store instruction and populates the allocated entry with a dependency that identifies an instruction upon which the store instruction depends for its data. The RAT causes a subsequent instance of the load instruction to share the dependency when it encounters the subsequent instance of the load instruction and determines that its instruction pointer matches the instruction pointer of an entry of the first queue.
Type: Grant
Filed: October 23, 2009
Date of Patent: June 11, 2013
Assignee: VIA Technologies, Inc.
Inventors: Matthew Daniel Day, Rodney E. Hooker
-
Publication number: 20130145127
Abstract: A data processing system is provided in which destination operands to be stored within architectural registers are constrained to have zero values added as prefixes in order that the architectural register value has a fixed bit width irrespective of the bit width of the destination operand being written thereto. Instead of adding these zero values everywhere in the data path, they are instead represented by zero flags in at least the physical registers utilised for register renaming operations and in the result queue prior to results being written to the architectural register file. This saves circuitry resources and reduces energy consumption.
Type: Application
Filed: December 6, 2011
Publication date: June 6, 2013
Applicant: ARM LIMITED
Inventors: James Nolan Hardage, Glen Andrew Harris, Mark Carpenter Glass
-
Publication number: 20130145131
Abstract: Architectures and methods for viewing data in multiple formats within a register file. Various disclosed embodiments allow a plurality of consecutive registers within one register file to appear to be temporarily transposed by one instruction, such that each transposed register contains one byte or word from multiple consecutive registers. A program can arbitrarily reorganize the bytes within a register by swapping the value stored in any byte within the register with the value stored in any other byte within the same register. Indirect register access is also provided, without additional scoreboarding hardware, as an apparent move from one register to another. The functionality of a hardware data FIFO at the I/O is also provided, without the power consumption of register-to-register transfers. However, the size of the FIFO can be changed under program control.
Type: Application
Filed: October 17, 2012
Publication date: June 6, 2013
Applicant: 3Dlabs Inc., Ltd.
Inventor: 3Dlabs Inc., Ltd.
-
Publication number: 20130145130
Abstract: The data processing apparatus (and method) has processing circuitry for performing data processing operations in response to data processing instructions, the data processing instructions referencing logical registers. A set of physical registers are provided for storing data values for access by the processing circuitry when performing the data processing operations. Register renaming storage stores a one-to-one mapping between the logical registers and the physical registers, with the register renaming storage being accessed by the processing circuitry when performing the data processing operations in order to map the referenced logical registers to corresponding physical registers. Update circuitry is arranged to identify the physical registers corresponding to those multiple logical registers in the register renaming storage. Altered one-to-one mapping between multiple logical registers and identified physical registers is employed when performing the current data processing operation.
Type: Application
Filed: December 2, 2011
Publication date: June 6, 2013
Inventors: Jean-Baptiste BRELOT, Cédric Denis Robert Airaud
-
Patent number: 8386754
Abstract: An out-of-order renaming processor is provided with a register file within which aliasing between registers of different sizes may occur. In this way a program instruction having a source register of a double precision size may alias with two single precision registers being used as destinations of one or more preceding program instructions. In order to track this data dependency the double precision register may be remapped into a micro-operation specifying two single precision registers as its source register. In this way, scheduling circuitry may use its existing hazard detection and management mechanisms to handle potential data hazards and dependencies. Not all program instructions having such data hazards between registers of different sizes are handled by this source register remapping. For these other program instructions a slower mechanism for dealing with the data dependency hazard is provided.
Type: Grant
Filed: June 24, 2009
Date of Patent: February 26, 2013
Assignee: ARM Limited
Inventors: Conrado Blasco Allue, David James Williamson, James Nolan Hardage, Glen Andrew Harris, Robert Gregory McDonald
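The source-register remapping can be illustrated with a short sketch. It assumes that double-precision register Dn overlays single-precision registers S(2n) and S(2n+1), and it uses register names as plain strings; the abstract does not specify the aliasing layout, so both are assumptions of this example.

```python
def expand_sources(sources):
    """Rewrite double-precision source registers as pairs of single-precision
    registers so an ordinary per-register hazard check sees the aliasing.

    Assumed layout: Dn aliases S(2n) and S(2n+1).
    """
    expanded = []
    for reg in sources:
        if reg.startswith("D"):
            n = int(reg[1:])
            expanded += [f"S{2 * n}", f"S{2 * n + 1}"]
        else:
            expanded.append(reg)
    return expanded
```

After the rewrite, an instruction reading D3 is tracked as reading S6 and S7, so a preceding write to either single-precision register appears as a normal dependency.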
-
Patent number: 8370581
Abstract: According to one embodiment of the invention, a method comprises measuring memory access latency for a prefetch cycle associated with a transmission of data from a memory device to a destination device such as a storage device. Thereafter, the prefetch rate is dynamically adjusted based on the measured memory access latency.
Type: Grant
Filed: June 30, 2005
Date of Patent: February 5, 2013
Assignee: Intel Corporation
Inventors: Victor Lau, Pak-lung Seto, Eric J. DeHaemer
-
Patent number: 8364936
Abstract: In an embodiment, a scheduler implements a first dependency array that tracks dependencies on instruction operations (ops) within a distance N of a given op and which are short execution latency ops. Other dependencies are tracked in a second dependency array. The first dependency array may evaluate quickly, to support back-to-back issuance of short execution latency ops and their dependent ops. The second array may evaluate more slowly than the first dependency array.
Type: Grant
Filed: July 25, 2012
Date of Patent: January 29, 2013
Assignee: Apple Inc.
Inventors: Andrew J. Beaumont-Smith, Honkai Tam, Daniel C. Murray, John H. Mylius, Peter J. Bannon, Pradeep Kanapathipillai
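The split between the two dependency arrays can be modeled as a simple classification of each producer op. The sketch below assumes ops are numbered in program order and that "distance" is the difference between op indices; `split_dependencies` and its parameters are hypothetical names.

```python
def split_dependencies(op_index, producers, short_latency, n):
    """Partition an op's producers between a fast and a slow dependency array.

    producers:     indices of older ops this op depends on
    short_latency: short_latency[i] is True if op i has short execution latency
    n:             maximum distance tracked by the fast array
    """
    fast, slow = [], []
    for p in producers:
        if op_index - p <= n and short_latency[p]:
            fast.append(p)  # nearby, short-latency producer
        else:
            slow.append(p)  # everything else
    return fast, slow
```

Only producers that are both close and short-latency land in the fast array, which keeps that array small enough to evaluate quickly in this model.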
-
Patent number: 8347068
Abstract: A multi-mode register rename mechanism which allows a simultaneous multi-threaded processor to support full out-of-order thread execution when the number of threads is low and in-order thread execution when the number of threads increases. Responsive to changing an execution mode of a processor to operate in in-order thread execution mode, the illustrative embodiments switch a physical register in the data processing system to an architected facility, thereby forming a switched physical register. When an instruction is issued to an execution unit, wherein the issued instruction comprises a thread bit, the thread bit is examined to determine if the instruction accesses an architected facility. If the issued instruction accesses an architected facility, the instruction is executed, and the results of the executed instruction are written to the switched physical register.
Type: Grant
Filed: April 4, 2007
Date of Patent: January 1, 2013
Assignee: International Business Machines Corporation
Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy
-
Patent number: 8346760
Abstract: In one embodiment, the invention provides a method comprising determining metadata encoded in instructions of a stored program; and executing the stored program based on the metadata.
Type: Grant
Filed: May 20, 2009
Date of Patent: January 1, 2013
Assignee: Intel Corporation
Inventors: Hong Wang, John Shen, Ali-Reza Adl-Tabatabai, Anwar Ghuloum
-
Patent number: 8335912
Abstract: Techniques and structures are described which allow the detection of certain dependency conditions, including evil twin conditions, during the execution of computer instructions. Information used to detect dependencies may be stored in a logical map table, which may include a content-addressable memory. The logical map table may maintain a logical register to physical register mapping, including entries dedicated to physical registers available as rename registers. In one embodiment, each entry in the logical map table includes a first value usable to indicate whether only a portion of the physical register is valid and whether the physical register includes the most recent update to the logical register being renamed. Use of this first value may allow precise detection of dependency conditions, including evil twin conditions, upon an instruction reading from at least two portions of a logical register having an entry in the logical map table whose first value is set.
Type: Grant
Filed: April 22, 2009
Date of Patent: December 18, 2012
Assignee: Oracle America, Inc.
Inventors: Robert T. Golla, Jama I. Barreh, Jeffrey S. Brooks, Howard L. Levy
-
Publication number: 20120260072
Abstract: A system may comprise an optimizer/scheduler to schedule a set of instructions, compute a data dependence, a checking constraint and/or an anti-checking constraint for the set of scheduled instructions, and allocate alias registers for the set of scheduled instructions based on the data dependence, the checking constraint and/or the anti-checking constraint. In one embodiment, the optimizer is to release unused registers to reduce the alias registers used to protect the scheduled instructions. The optimizer is further to insert a dummy instruction after a fused instruction to break cycles in the checking and anti-checking constraints.
Type: Application
Filed: April 7, 2011
Publication date: October 11, 2012
Inventors: Cheng Wang, Youfeng Wu
-
Patent number: 8271766
Abstract: An information processing device including registers (105) for holding data and an operation device (102) for executing arithmetic and logic operations on input/output data held in the register. The information processing device can issue an inter-register copy instruction for instructing data held in one register to be copied to another register. The information processing device further includes a copy information holding device (113) for reserving for execution of a data copy operation by the inter-register copy instruction from a control unit (108) so as to execute the actual copy operation simultaneously with the succeeding instruction to hide the execution time of the copy operation. Thus, in the inter-register copy instruction execution phase, a reservation for a data copy operation is stored in the copy information holding device so that the execution phase is completed without performing the actual data copy operation.
Type: Grant
Filed: May 18, 2006
Date of Patent: September 18, 2012
Assignee: NEC Corporation
Inventor: Noritaka Hoshi
-
Patent number: 8255671
Abstract: In an embodiment, a scheduler implements a first dependency array that tracks dependencies on instruction operations (ops) within a distance N of a given op and which are short execution latency ops. Other dependencies are tracked in a second dependency array. The first dependency array may evaluate quickly, to support back-to-back issuance of short execution latency ops and their dependent ops. The second array may evaluate more slowly than the first dependency array.
Type: Grant
Filed: December 18, 2008
Date of Patent: August 28, 2012
Assignee: Apple Inc.
Inventors: Andrew J. Beaumont-Smith, Honkai Tam, Daniel C. Murray, John H. Mylius, Peter J. Bannon, Pradeep Kanapathipillai
-
Patent number: 8250345
Abstract: A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design that includes a multi-threaded processor that executes an instruction of a process of an executing program is provided. The multi-threaded processor includes at least a first and a second thread. First and second sets of source registers are respectively allocated to the first and second threads, and first and second sets of destination registers are respectively allocated to the first and second threads. A resource prefix configuration register includes mappings between each of the source and destination registers and the threads. The multi-threaded processor, during execution of the instruction by one of the first or the second threads of execution, accesses the source and destination registers based on the mapping, wherein at least one of the accessed registers is allocated to the other of the first or the second thread of execution.
Type: Grant
Filed: April 28, 2008
Date of Patent: August 21, 2012
Assignee: International Business Machines Corporation
Inventor: Anthony J. Bybell
-
Patent number: 8250346
Abstract: A processor 2 supporting register renaming has a rename table 20 in which the flag register has multiple tag values associated therewith. These tag values indicate which virtual register corresponds to a destination flag register of the oldest instruction which wrote a still up-to-date value of a subset of the flags.
Type: Grant
Filed: June 4, 2009
Date of Patent: August 21, 2012
Assignee: ARM Limited
Inventor: James Nolan Hardage
-
Patent number: 8245016
Abstract: A system includes a multi-threaded processor that executes an instruction of a process of an executing program. The multi-threaded processor includes at least a first and a second thread. First and second sets of source registers are respectively allocated to the first and second threads, and first and second sets of destination registers are respectively allocated to the first and second threads. A resource prefix configuration register includes mappings between each of the source and destination registers and the threads. The multi-threaded processor, during execution of the instruction by one of the first or the second threads of execution, accesses the source and destination registers based on the mapping, wherein at least one of the accessed registers is allocated to the other of the first or the second thread of execution.
Type: Grant
Filed: September 28, 2007
Date of Patent: August 14, 2012
Assignee: International Business Machines Corporation
Inventor: Anthony J. Bybell
-
Patent number: 8239660
Abstract: A high speed processor. The processor includes terminals that each execute a subset of the instruction set. In at least one of the terminals, the instructions are executed in an order determined by data flow. Instructions are loaded into the terminal in pages. A notation is made when an operand for an instruction is generated by another instruction. When operands for an instruction are available, that instruction is a “ready” instruction. A ready instruction is selected in each cycle and executed. To allow data to be transmitted between terminals, each terminal is provided with a receive station, such that data generated in one terminal may be transmitted to another terminal for use as an operand in that terminal. In one embodiment, one terminal is an arithmetic terminal, executing arithmetic operations such as addition, multiplication and division. The processor has a second terminal, which contains functional logic to execute all other instructions in the instruction set.
Type: Grant
Filed: March 26, 2010
Date of Patent: August 7, 2012
Assignee: STMicroelectronics Inc.
Inventor: Stefano Cervini
-
Patent number: 8225012
Abstract: A method may include distributing ranges of addresses in a memory among a first set of functions in a first pipeline. The first set of the functions in the first pipeline may operate on data using the ranges of addresses. Different ranges of addresses in the memory may be redistributed among a second set of functions in a second pipeline without waiting for the first set of functions to be flushed of data.
Type: Grant
Filed: September 3, 2009
Date of Patent: July 17, 2012
Assignee: Intel Corporation
Inventor: Thomas A. Piazza
-
Patent number: 8225076
Abstract: A scoreboard memory for a processing unit has separate memory regions allocated to each of the multiple threads to be processed. For each thread, the scoreboard memory stores register identifiers of registers that have pending writes. When an instruction is added to an instruction buffer, the register identifiers of the registers specified in the instruction are compared with the register identifiers stored in the scoreboard memory for that instruction's thread, and a multi-bit value representing the comparison result is generated. The multi-bit value is stored with the instruction in the instruction buffer and may be updated as instructions belonging to the same thread complete their execution. Before the instruction is issued for execution, this multi-bit value is checked. If this multi-bit value indicates that none of the registers specified in the instruction have pending writes, the instruction is allowed to issue for execution.
Type: Grant
Filed: September 18, 2008
Date of Patent: July 17, 2012
Assignee: NVIDIA Corporation
Inventors: Brett W. Coon, Peter C. Mills, Stuart F. Oberman, Ming Y. Siu
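The per-thread scoreboard and its multi-bit comparison result can be sketched as follows. The example assumes one bit per operand position in the mask and models each thread's scoreboard region as a Python set; the `Scoreboard` class and its method names are illustrative only.

```python
class Scoreboard:
    """Toy per-thread scoreboard: one set of pending-write registers per thread."""

    def __init__(self, num_threads):
        self.pending = [set() for _ in range(num_threads)]

    def mark_pending(self, thread, reg):
        """Record that an in-flight instruction of this thread will write reg."""
        self.pending[thread].add(reg)

    def check(self, thread, regs):
        """Build the multi-bit value: bit i is set if regs[i] has a pending write."""
        mask = 0
        for i, reg in enumerate(regs):
            if reg in self.pending[thread]:
                mask |= 1 << i
        return mask

    def complete(self, thread, reg, regs, mask):
        """A write to reg finished: clear it from the scoreboard and from a
        buffered instruction's mask. Returns the updated mask."""
        self.pending[thread].discard(reg)
        for i, r in enumerate(regs):
            if r == reg:
                mask &= ~(1 << i)
        return mask
```

An instruction may issue once its stored mask reaches zero. Because the regions are separate, a pending write in one thread never blocks an instruction of another thread that happens to use the same register number.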
-
Patent number: 8179540
Abstract: An image forming apparatus is provided that holds counter information obtained by integrating a consumption of a consumable that depends on usage of service provided by the image forming apparatus. A log corresponding to the usage of the service is set in job log information with a synchronization flag set off. The log in the job log information, for which the synchronization flag is set off, is set on. The counter information and the job log information are output after the synchronization flag for the log having the synchronization flag set off has been set on.
Type: Grant
Filed: October 29, 2008
Date of Patent: May 15, 2012
Assignee: Canon Kabushiki Kaisha
Inventors: Junichi Hiruma, Nobuyuki Tonegawa
-
Patent number: 8179896Abstract: A network processor of an embodiment includes a packet classification engine, a processing pipeline, and a controller. The packet classification engine allows for classifying each of a plurality of packets according to packet type. The processing pipeline has a plurality of stages for processing each of the plurality of packets in a pipelined manner, where each stage includes one or more processors. The controller allows for providing the plurality of packets to the processing pipeline in an order that is based at least partially on: (i) packet types of the plurality of packets as classified by the packet classification engine and (ii) estimates of processing times for processing packets of the packet types at each stage of the plurality of stages of the processing pipeline. A method in a network processor allows for prefetching instructions into a cache for processing a packet based on a packet type of the packet.Type: GrantFiled: November 7, 2007Date of Patent: May 15, 2012Inventor: Justin Mark Sobaje
-
Patent number: 8171264Abstract: A sub-unit judges whether an instruction received from an external unit is executable. If the instruction is judged to be executable, the sub-unit executes it. If, on the other hand, the instruction is judged to be unexecutable, the sub-unit notifies the external unit of an executable plan.Type: GrantFiled: August 22, 2007Date of Patent: May 1, 2012Assignee: Mitsubishi Electric CorporationInventor: Noboru Miyamoto
-
Patent number: 8151097Abstract: When two threads (strands), for example, are executed in parallel in a processor in a simultaneous multi-thread (SMT) system, entries of a branch reservation station of an instruction control device are separately used in a strand 0 group and a strand 1 group. The data of the strand 0 and the data of the strand 1 are allocated to the respective entries by switching a select circuit. When an entry is released from the branch reservation station, the select circuit switches the strands so that a branch instruction in one strand can be released in order, thereby releasing the entry.Type: GrantFiled: December 4, 2009Date of Patent: April 3, 2012Assignee: Fujitsu LimitedInventor: Ryuichi Sunayama
-
Patent number: 8135942Abstract: A method receives a complex instruction comprising a first portion and a second portion. The method sets a single issue queue slot and allocates an execution unit for the complex instruction, and identifies dependencies in the first and second portions. The method sets a dependency matrix slot and a consumers table slot for the first and second portions. In the event the first portion dependencies have been satisfied, the method issues the first portion and then issues the second portion from the single issue queue slot. In the event the second portion dependencies have not been satisfied, the method places the second portion into a side issue queue. The method issues the second portion when the side issue queue indicates that the second portion is eligible for issue.Type: GrantFiled: August 28, 2008Date of Patent: March 13, 2012Assignee: International Business Machines CorporationInventors: Christopher M. Abernathy, Mary D. Brown, Todd A. Venton, John B. Griswell, Jr.
-
Patent number: 8127116Abstract: A processor having a dependency matrix comprises a first array comprising a plurality of first cells. A second array couples to the first array and comprises a plurality of second cells. A first write port couples to the first array and the second array and writes to the first array and the second array. A first read port couples to the first array and the second array and reads from the first array and the second array. A second read port couples to the first array and reads from the first array. A second write port couples to the second read port, reads from the second read port and writes to the second array.Type: GrantFiled: April 3, 2009Date of Patent: February 28, 2012Assignee: International Business Machines CorporationInventors: Saiful Islam, Mary D. Brown, Bjorn P. Christensen, Sam G. Chu, Robert A. Cordes, Maureen A. Delaney, Jafar Nahidi, Joel A. Silberman
-
Patent number: 8127083Abstract: A method and circuit for eliminating silent store invalidation propagation in shared memory cache coherency protocols, and a design structure on which the subject circuit resides are provided. A received data value is compared with a stored cache data value. When the received data value matches the stored cache data value, a first squash signal is generated. A received write address is compared with a reservation address. When the received write address matches the reservation address, a reservation signal is generated and inverted. The first squash signal and the inverted reservation signal are combined to selectively produce a silent store squash signal. The silent store squash signal cancels sending an invalidation signal.Type: GrantFiled: February 19, 2008Date of Patent: February 28, 2012Assignee: International Business Machines CorporationInventors: Christopher J. Kundinger, Nicholas D. Lindberg, Eric J. Stec
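The signal logic in this abstract reduces to a small Boolean expression. The sketch below models it in software under that reading; the function and parameter names are hypothetical, and the actual circuit is hardware rather than code.

```python
def silent_store_squash(received_data, cached_data, write_addr, reservation_addr):
    """Return True when the invalidation for this store can be cancelled.

    Hypothetical software model of the two comparators and the combining gate.
    """
    # First squash signal: the store would not change the cached value.
    first_squash = received_data == cached_data
    # Reservation signal: the write targets the reserved address.
    reservation = write_addr == reservation_addr
    # Combine the squash signal with the inverted reservation signal.
    return first_squash and not reservation


def should_send_invalidation(received_data, cached_data, write_addr, reservation_addr):
    """The silent store squash signal cancels sending the invalidation."""
    return not silent_store_squash(
        received_data, cached_data, write_addr, reservation_addr
    )
```

In this model, a store that rewrites the same value is squashed only when it does not hit the reservation address, so a write to a reserved line still propagates its invalidation.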
-
Patent number: 8122239Abstract: Method and apparatus for initializing a system configured in a programmable logic device (PLD) is described. In some examples, the method includes: initializing memory elements in the system with first data; executing a first iteration of the system to process the first data; partially reconfiguring the PLD, during execution of the first iteration, to initialize shadow memory elements in the PLD with second data, the shadow memory elements respectively shadowing the memory elements in the system; transferring the second data from the shadow memory elements to the memory elements; and executing a second iteration of the system to process the second data.Type: GrantFiled: September 11, 2008Date of Patent: February 21, 2012Assignee: Xilinx, Inc.Inventors: Philip B. James-Roxby, Stephen A. Neuendorffer, Henry E. Styles
-
Patent number: 8082421Abstract: A program instruction rearrangement method calculates the dependency depth of each instruction of a program from the dependencies between instructions, which are determined by register access order, and rearranges the instructions based on the dependency depth. Additionally, the dependency between instructions can be utilized to locate and remove redundant instructions.Type: GrantFiled: September 4, 2007Date of Patent: December 20, 2011Assignee: Via Technologies, Inc.Inventor: Yi-Peng Chen
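One way to picture depth-based rearrangement is sketched below. It is an assumption-laden illustration rather than the patented method: each instruction is modeled as a hypothetical `(dest, sources)` pair that writes exactly one register, and read-after-write, write-after-write, and write-after-read hazards all count as dependencies.

```python
def dependency_depths(program):
    """Depth 0 means no dependency on any earlier instruction.

    `program` is a list of (dest, sources) pairs; this encoding is an
    assumption made for the example.
    """
    depths = []
    for i, (dest, srcs) in enumerate(program):
        depth = 0
        for j in range(i):
            earlier_dest, earlier_srcs = program[j]
            raw = earlier_dest in srcs   # reads a value the earlier one writes
            waw = earlier_dest == dest   # both write the same register
            war = dest in earlier_srcs   # overwrites a value the earlier one reads
            if raw or waw or war:
                depth = max(depth, depths[j] + 1)
        depths.append(depth)
    return depths


def rearrange(program):
    """Stable-sort instructions by depth; dependent pairs keep their order."""
    depths = dependency_depths(program)
    order = sorted(range(len(program)), key=lambda i: depths[i])
    return [program[i] for i in order]
```

Since an instruction's depth is always strictly greater than the depth of anything it depends on, sorting by depth never moves it ahead of a producer, while independent instructions of equal depth end up adjacent.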
-
Patent number: 8078844Abstract: A system, method, and computer program product are provided for removing a register of a processor from an active state. In operation, an aspect of a portion of a processor capable of simultaneously processing a plurality of threads is identified. Additionally, a register of the processor is conditionally removed from an active state, based on the aspect.Type: GrantFiled: December 9, 2008Date of Patent: December 13, 2011Assignee: NVIDIA CorporationInventors: David Tarjan, Kevin Skadron
-
Patent number: 8060388Abstract: Provided are techniques for allowing consumers to reserve a resource, in which potential consumers have a choice among a number of different reservation contracts for reserving a resource to be provided at a future time. Each reservation contract allows a corresponding contracting customer to elect whether to receive the resource and requires the contracting customer to make a first payment in aggregate if the resource ultimately is elected and to make a second payment in aggregate if the resource ultimately is not elected, with the first payment being higher than the second payment, and with both being nonzero.Type: GrantFiled: February 10, 2006Date of Patent: November 15, 2011Assignee: Hewlett-Packard Development Company, L.P.Inventors: Bernardo Huberman, Li Zhang, Fang Wu
-
Patent number: 7991980Abstract: A scalable processing system includes a memory device having a plurality of executable program instructions, wherein each of the executable program instructions includes a timetag data field indicative of the nominal sequential order of the associated executable program instructions. The system also includes a plurality of processing elements, which are configured and arranged to receive executable program instructions from the memory device, wherein each of the processing elements executes executable instructions having the highest priority as indicated by the state of the timetag data field.Type: GrantFiled: October 20, 2008Date of Patent: August 2, 2011Assignee: The Board of Governors for Higher Education, State of Rhode Island and Providence PlantationsInventors: Augustus K. Uht, David Morano, David Kaeli
-
Patent number: 7984271Abstract: A processor device having a reservation station (RS) is concerned. In case the processor device has plural RS, the RS is associated with an arithmetic pipeline, and two RS make a pair. When one RS of the pair cannot dispatch an instruction to an associated arithmetic pipeline, the other RS dispatches the instruction to that arithmetic pipeline, or delivers its held instruction to the one RS. In case one RS is equipped, plural entries in the RS are divided into groups, and by dynamically changing this grouping according to the dispatch frequency of the instruction to the arithmetic pipelines or the held state of the instructions, the arithmetic pipelines are efficiently utilized. Incidentally, depending on the grouping of the plural entries in the RS, a configuration as if the plural RS were allocated to each arithmetic pipeline may be realized.Type: GrantFiled: October 19, 2007Date of Patent: July 19, 2011Assignee: Fujitsu LimitedInventors: Mariko Sakamoto, Toshio Yoshida
-
Patent number: 7979678Abstract: A system and method for performing register renaming of source registers in a processor having a variable advance instruction window for storing a group of instructions to be executed by the processor, wherein a new instruction is added to the variable advance instruction window when a location becomes available. A tag is assigned to each instruction in the variable advance instruction window. The tag of each instruction to leave the window is assigned to the next new instruction to be added to it. The results of instructions executed by the processor are stored in a temp buffer according to their corresponding tags to avoid output and anti-dependencies. The temp buffer therefore permits the processor to execute instructions out of order and in parallel. Data dependency checks for input dependencies are performed only for each new instruction added to the variable advance instruction window and register renaming is performed to avoid input dependencies.Type: GrantFiled: May 26, 2009Date of Patent: July 12, 2011Assignee: Seiko Epson CorporationInventors: Trevor A. Deosaran, Sanjiv Garg, Kevin R. Iadonato
-
Patent number: 7979677Abstract: A method and device for adaptively allocating reservation station entries to an instruction set with variable operands in a microprocessor. The device includes logic for determining free reservation station queue positions in a reservation station. The device allocates an issue queue to an instruction and writes the instruction into the issue queue as an issue queue entry. The device reads an operand corresponding to the instruction from a general purpose register and writes the operand into a reservation station using one of the free reservation station queue positions as a write address. The device writes each reservation station queue position corresponding to said instruction into said issue queue entry. When the instruction is ready for issue to an execution unit, the device reads out the instruction from the issue queue entry, along with the reservation station queue positions, to the execution unit.Type: GrantFiled: August 3, 2007Date of Patent: July 12, 2011Assignee: International Business Machines CorporationInventor: Dung Q. Nguyen
-
Patent number: 7975272Abstract: In some embodiments, a method includes receiving a request to generate a thread and supplying a request to a queue in response at least to the received request. The method may further include fetching a plurality of instructions in response at least in part to the request supplied to the queue and executing at least one of the plurality of instructions. In some embodiments, an apparatus includes a storage medium having stored therein instructions that when executed by a machine result in the method. In some embodiments, an apparatus includes circuitry to receive a request to generate a thread and to queue a request to generate a thread in response at least to the received request. In some embodiments, a system includes circuitry to receive a request to generate a thread and to queue a request to generate a thread in response at least to the received request, and a memory unit to store at least one instruction for the thread.Type: GrantFiled: December 30, 2006Date of Patent: July 5, 2011Assignee: Intel CorporationInventors: Hong Jiang, Thomas A. Piazza, Brian D. Rauchfuss, Sreedevi Chalasani, Steven J. Spangler
-
Patent number: RE43825Abstract: A system and method forward data between processing elements. A first processing element includes an address register that stores a first memory address. A forwarding storage element is coupled to the first processing element. A second processing element, coupled to the forwarding storage element, transmits a second memory address to the forwarding storage element. The forwarding storage transmits the second memory address to the first processing element, and the first processing element compares the second memory address with the first memory address.Type: GrantFiled: November 19, 2007Date of Patent: November 20, 2012Assignee: The United States of America as Represented by the Secretary of the NavyInventors: Joel Zvi Apisdorf, Sam Brandon Sandbote, Michael Daniel Poole