Simultaneous Issuance Of Multiple Instructions Patents (Class 712/215)
-
Patent number: 7107433Abstract: A mechanism for resource allocation in a processor, a method of allocating resources in a processor and a digital signal processor incorporating the mechanism or the method. In one embodiment, the mechanism includes: (1) categorization logic, associated with an earlier pipeline stage, that generates instruction type information for instructions to be executed in the processor and (2) priority logic, associated with a later pipeline stage, that allocates functional units of the processor to execution of the instructions based on the instruction type information.Type: GrantFiled: October 26, 2001Date of Patent: September 12, 2006Assignee: LSI Logic CorporationInventor: Hung T. Nguyen
-
Patent number: 7103755Abstract: An apparatus for efficient parallel executing instruction avoiding the usage of cross bypasses, the apparatus including an instruction buffer for storing instructions, of decoders for decoding, in parallel, the instructions which simultaneously issue from the instruction buffer, executing units for executing the instructions decoded in the decoders, and an instruction-issuing controlling means for controlling the issuing of the instructions in such a way that, when the instructions are executed, one of the plural executing units executes instructions more frequently than the rest of the plural executing units. The apparatus is preferably incorporated in an information processor to superscalar or out-of-order instruction execution.Type: GrantFiled: January 10, 2003Date of Patent: September 5, 2006Assignee: Fujitsu LimitedInventors: Susumu Akiu, Masaki Ukai, Toshio Yoshida
-
Patent number: 7096343Abstract: A method and apparatus are disclosed for allocating functional units in a multithreaded very large instruction word (VLIW) processor. The present invention combines the techniques of conventional very long instruction word architectures and conventional multithreaded architectures to reduce execution time within an individual program, as well as across a workload. The present invention utilizes instruction packet splitting to recover some efficiency lost with conventional multithreaded architectures. Instruction packet splitting allows an instruction bundle to be partially issued in one cycle, with the remainder of the bundle issued during a subsequent cycle. The allocation hardware assigns as many instructions from each packet as will fit on the available functional units, rather than allocating all instructions in an instruction packet at one time. Those instructions that cannot be allocated to a functional unit are retained in a ready-to-run register.Type: GrantFiled: March 30, 2000Date of Patent: August 22, 2006Assignee: Agere Systems Inc.Inventors: Alan David Berenbaum, Nevin Heintze, Tor E. Jeremiassen, Stefanos Kaxiras
-
Patent number: 7080241Abstract: An apparatus and method for self-initiated instruction issuing are implemented. In a central processing unit (CPU) having a pipelined architecture, instructions are queued for issuing to the execution unit which will execute them. Instructions are issued each cycle, and an instruction should be selectable for issuing as soon as its source operands are available. An instruction in the issue queue having source operands depending on other, target, instructions to determine their value are signaled to the target instruction by a link mask in the queue entry corresponding to the target instruction. A bit in the link mask identifies the queue entry corresponding to the dependent instruction. When the target instruction issues to the execution unit, a bit is set in a predetermined portion of the queue entry containing the dependent instruction. The portion of the queue entry is associated with the source operand depending on the issuing instruction.Type: GrantFiled: July 11, 2001Date of Patent: July 18, 2006Assignee: International Business Machines CorporationInventors: Hung Qui Le, Hoichi Cheong
-
Patent number: 7080234Abstract: According to the invention, a processing core (12) comprising a processing pipeline (100) having N-number of processing paths (56), each of which process instructions (54) on M-bit data words. In addition, the processing core (12) includes one or more register files (60), each preferably having Q-number of registers which are M-bits wide. Preferably, one of the Q-number of registers in at least one of the register files (60) is a program counter register dedicated to hold a program counter, and one of the Q-number of registers in at least one of the register files is a zero register dedicated to hold a zero value. In this manner, program jumps can be executed by adding values to the program counter in the program counter register, and memory address values can be calculated by adding values to the program counter stored in the program counter register or to the zero value stored in the zero register.Type: GrantFiled: March 8, 2001Date of Patent: July 18, 2006Assignee: Sun Microsystems, Inc.Inventors: Ashley Saulsbury, Nyles Nettleton, Michael Parkin, David R. Emberson
-
Patent number: 7057763Abstract: Executing a plurality of print jobs in a job queue by a computing architecture, each print job including at least one instruction for processing print data, including accessing one of the print jobs from the job queue, determining whether the accessed print job represents an image, selecting a high resolution mode in the case that the accessed print job represents an image, selecting a low resolution mode in the case that the accessed print job does not represent an image, and executing the accessed print job in accordance with the selected resolution mode.Type: GrantFiled: December 12, 2001Date of Patent: June 6, 2006Assignee: Canon Kabushiki KaishaInventor: Sadahiro Tanaka
-
Patent number: 7047395Abstract: A distributed system is provided for apportioning an instruction stream into multiple segments for processing in multiple parallel processing units, and for merging the processed segments into a single processed instruction stream having the same sequential relative order as the original instruction stream. Tags may be attached to each segment after apportioning to indicate the order in which the various segments are to be merged. In one embodiment, the end of each segment includes a tag indicating the unit to which the next instruction in the original instruction sequence is directed.Type: GrantFiled: November 13, 2001Date of Patent: May 16, 2006Assignee: Intel CorporationInventors: Roni Rosner, Micha G. Moffie, Abraham Mendelson
-
Patent number: 7043624Abstract: A tag monitoring system for assigning tags to instructions. A source supplies instructions to be executed by a functional unit. A register file stores information required for the execution of each instruction. A queue having a plurality of slots containing tags which are used for tagging the instructions. The tags are arranged in the queue in an order specified by the program order of their corresponding instructions. A control unit monitors the completion of executed instructions and advances the tags in the queue upon completion of an executed instruction. The register file stores an instruction's information at a location in the register file defined by the tag assigned to that instruction. The register file also contains a plurality of read address enable ports and corresponding read output ports. Each of the slots from the queue is coupled to a corresponding one of the read address enable ports. Thus, the information for each instruction can be read out of the register file in program order.Type: GrantFiled: May 18, 2004Date of Patent: May 9, 2006Assignee: Seiko Epson CorporationInventors: Kevin R. Iadonato, Trevor A. Deosaran, Sanjiv Garg
-
Patent number: 7039790Abstract: A data processing system with a microprocessor. The microprocessor has an instruction execution pipeline including fetch and decode stages and several functional execution units. Fetch packets contain a plurality of instruction words. Execute packets include a plurality of instruction words that can be executed in parallel by two or more execution units. An execution packet can span two or more fetch packets. A predetermined bit in each instruction marks whether the next instruction is executed in parallel with the current instruction. Instructions in an execute packet are dispatched to appropriate functional execution units based on instruction type. Upon a branch into an execute packet instructions at memory addresses before the branch location are not executed in parallel with instructions following the branch location.Type: GrantFiled: October 31, 2000Date of Patent: May 2, 2006Assignee: Texas Instruments IncorporatedInventors: Laurence R. Simar, Jr., Richard A. Brown
-
Patent number: 7039791Abstract: A computing system as described in which individual instructions are executable in parallel by processing pipelines, and instructions to be executed in parallel by different pipelines are supplied to the pipelines simultaneously. The system includes storage for storing an arbitrary number of the instructions to be executed. The instructions to be executed are tagged with pipeline identification tags indicative of the pipeline to which they should be dispatched. The pipeline identification tags are supplied to a system which controls a crossbar switch, enabling the tags to be used to control the switch and supply the appropriate instructions simultaneously to the differing pipelines.Type: GrantFiled: July 3, 2002Date of Patent: May 2, 2006Assignee: Intergraph CorporationInventors: Howard G. Sachs, Siamak Arya
-
Patent number: 7035998Abstract: A pipelined multistreaming processor has an instruction source, a first cluster of a plurality of streams fetching instructions from the instruction source, a second cluster of a plurality of streams fetching instructions from the instruction source, dedicated instruction queues for individual streams in each cluster, a first dedicated dispatch stage in the first cluster for dispatching instructions to execution units, and a second dedicated dispatch stage in the second cluster for selecting and dispatching instructions to execution units. The processor is characterized in that the clusters operate independently, with the dedicated dispatch stage taking instructions only from the instruction queues in the individual clusters to which the dispatch stages are dedicated. In preferred embodiments there are dedicated fetch and dispatch stages for streams in the clusters, and dedicated execution units to which instructions may be dispatched.Type: GrantFiled: November 3, 2000Date of Patent: April 25, 2006Assignee: MIPS Technologies, Inc.Inventors: Mario Nemirovsky, Stephen W. Melvin, Nandakumar Sampath, Enrique Musoll, Hector Urdaneta
-
Patent number: 7020765Abstract: A processor is disclosed including several features allowing the processor to simultaneously execute instructions of multiple conditional execution instruction groups. Each conditional execution instruction group includes a conditional execution instruction and a code block specified by the conditional execution instruction. In one embodiment, the processor includes multiple state machines simultaneously assignable to a corresponding number of conditional execution instruction groups. In another embodiment, the processor includes multiple registers for storing marking data pertaining to a number of instructions in each of multiple execution pipeline stages. In another embodiment, the processor includes multiple attribute queues simultaneously assignable to a corresponding number of conditional execution instruction groups. In another embodiment, the processor includes write enable logic and an execution unit.Type: GrantFiled: September 27, 2002Date of Patent: March 28, 2006Assignee: LSI Logic CorporationInventors: Hung Nguyen, Shannon Wichman
-
Patent number: 7013456Abstract: A method and a computer for performance of the method. While executing a program on a computer, profileable events occurring in the instruction pipeline are detected. The instruction pipeline is directed to record profile information describing the profileable events essentially concurrently with the occurrence of the profileable events. The detecting and recording occur under control of hardware of the computer without software intervention.Type: GrantFiled: June 16, 1999Date of Patent: March 14, 2006Assignee: ATI International SRLInventors: Korbin S. Van Dyke, Paul H. Hohensee, David L. Reese, John S. Yates, Jr., T. R. Ramesh, Shalesh Thusoo, Gurjeet Singh Saund, Stephen C. Purcell, Niteen Aravind Patkar
-
Patent number: 7003650Abstract: A method and apparatus for solving the output dependence problem in an explicit parallelism architecture microprocessor with consideration for implementation of the precise exception. In case of an output dependence hazard, the issue into bypass of a result of the earlier issued operation having an output hazard is cancelled. Latencies of short instructions are aligned by including additional stages on the way of writing the results into the register file in shorter executive units, which allows to save the issue order while writing the results into the register file. For long and unpredictable latencies of the instructions, writing of the result of the earlier issued operation having an output dependence hazard into the register file is cancelled after checking for no precise exception condition. All additional stages are connected to the bypass not to increase the result access time in case of this result use in the following operations.Type: GrantFiled: December 11, 2001Date of Patent: February 21, 2006Assignee: Elbrus InternationalInventors: Boris A. Babaian, Valeri G. Gorokhov, Feodor A. Gruzdov, Yuli K. Sakhin, Vladimir V. Rudometov, Valdimir Y. Volkonsky
-
Patent number: 7000095Abstract: A method and apparatus for overlaying hazard clearing with a jump instruction within a pipeline microprocessor is described. The apparatus includes hazard logic to detect when a jump instruction specifies that hazards are to be cleared as part of a jump operation. If hazards are to be cleared, the hazard logic disables branch prediction for the jump instruction, thereby causing the jump instruction to proceed down the pipeline until it is finally resolved, and flushing the pipeline behind the jump instruction. Disabling of branch prediction for the jump instruction effectively clears all execution and/or instruction hazards that preceded the jump instruction. Alternatively, hazard logic causes issue control logic to stall the jump instruction for n-cycles until all hazards are cleared. State tracking logic may be provided to determine whether any instructions are executing in the pipeline that create hazards. If so, hazard logic performs normally.Type: GrantFiled: September 6, 2002Date of Patent: February 14, 2006Assignee: MIPS Technologies, Inc.Inventors: Niels Gram Jeppesen, G. Michael Uhler
-
Patent number: 6983358Abstract: An apparatus and method in a microprocessor having two unaligned functional unit pipelines which enables an instruction queue for the second pipeline to be placed at an intermediate pipeline stage rather than after the stage in the first pipeline that retires instructions. The apparatus maintains coherency between the status of each instruction in the queue relative to its status in the first pipeline. The status comprises an age of the instruction and a valid bit. The age specifies the stage in the first pipeline in which the instruction resides. The apparatus includes logic for updating the age and valid bit based on whether the first pipeline is stalled, on valid bits from the first pipeline, and on whether the queue is downshifting. The microprocessor selectively updates its user-visible state with the instruction execution results from the second functional unit based on the instruction age and valid bit.Type: GrantFiled: October 23, 2002Date of Patent: January 3, 2006Assignee: IP-First, LLCInventor: Tom Elmer
-
Patent number: 6981127Abstract: A method and apparatus for providing a plurality of aligned instructions from an instruction stream provided by a memory unit for execution within a pipelined microprocessor is described. The microprocessor comprises a prefetch buffer, whereby the prefetch buffer stores prefetched instructions and additional information about the validity and size of the prefetch buffer. The method and apparatus use the prefetch buffer to buffer a part of an instruction stream. The actually aligned instruction stream is issued from the prefetch buffer or directly by instructions fetched from the memory, or from a combination of prefetched instructions and actually fetched instructions.Type: GrantFiled: May 26, 1999Date of Patent: December 27, 2005Assignee: Infineon Technologies North America Corp.Inventors: Balraj Singh, Venkat Mattela
-
Patent number: 6976150Abstract: A scalable processing system includes a memory device having a plurality of executable program instructions, wherein each of the executable program instructions includes a timetag data field indicative of the nominal sequential order of the associated executable program instructions. The system also includes a plurality of processing elements, which are configured and arranged to recieve executable program instructions from the memory device, wherein each of the processing elements executes executable instructions having the highest priority as indicated by the state of the timetag data field.Type: GrantFiled: April 6, 2001Date of Patent: December 13, 2005Assignee: The Board of Governors for Higher Education, State of Rhode Island and Providence PlantationsInventors: Augustus K. Uht, David Morano, David Kaeli
-
Patent number: 6968444Abstract: A microprocessor employing a fixed position dispatch unit. The microprocessor includes a plurality of execution units each corresponding to an issue position and configured to execute a common subset of instructions. At least a first one of the execution units includes extended logic for executing a designated instruction that others of the execution units may be incapable of executing. The microprocessor also includes a plurality of decoders coupled to the plurality of execution units. The plurality of decoders may provide positional information to cause the designated instruction to be routed to the first execution unit. Further, the microprocessor includes a dispatch control unit configured to dispatch during a dispatch cycle, the designated instruction for execution by the first execution unit based upon the positional information. The dispatch control unit may further dispatch one or more instructions within the common subset of instructions during the same dispatch cycle.Type: GrantFiled: November 4, 2002Date of Patent: November 22, 2005Assignee: Advanced Micro Devices, Inc.Inventors: David E. Kroesche, Michael T. Clark
-
Patent number: 6966056Abstract: A processor that has a plurality of instruction slots each of which stores an instruction to be executed in parallel. One of the plurality of instruction slots is a first instruction slot and another a second instruction slot. A special instruction stored in the first instruction slot is executed by a first functional unit that executes instructions stored in the first instruction slot, and a second functional unit that executes instructions stored in the second instruction slot. An instruction stored in the second instruction slot is executed in parallel by a third functional unit that executes instructions stored in the second instruction slot.Type: GrantFiled: March 14, 2001Date of Patent: November 15, 2005Assignee: Matsushita Electric Industrial Co., Ltd.Inventor: Kenichi Kawaguchi
-
Patent number: 6959372Abstract: A parallel processing architecture comprising a cluster of embedded processors that share a common code distribution bus. Pages or blocks of code are concurrently loaded into respective program memories of some or all of these processors (typically all processors assigned to a particular task) over the code distribution bus, and are executed in parallel by these processors. A task control processor determines when all of the processors assigned to a particular task have finished executing the current code page, and then loads a new code page (e.g., the next sequential code page within a task) into the program memories of these processors for execution. The processors within the cluster preferably share a common memory (1 per cluster) that is used to receive data inputs from, and to provide data outputs to, a higher level processor. Multiple interconnected clusters may be integrated within a common integrated circuit device.Type: GrantFiled: February 18, 2003Date of Patent: October 25, 2005Assignee: Cogent Chipware Inc.Inventors: Richard F. Hobson, Bill Ressl, Allan R. Dyck
-
Patent number: 6950925Abstract: A microprocessor may include several execution units and a scheduler coupled to issue operations to at least one of the execution units. The scheduler may include several entries. A first entry may be allocated to a first operation. The first entry includes a source status indication for each of the first operation's operands. Each source status indication indicates whether a value of a respective one of the first operation's operands is speculative. The scheduler is configured to update one of the first entry's source status indications to indicate that a value of a respective one of the first operation's operands is non-speculative in response to receiving an indication that a value of a result of a second operation is non-speculative.Type: GrantFiled: August 28, 2002Date of Patent: September 27, 2005Assignee: Advanced Micro Devices, Inc.Inventors: Benjamin T. Sander, Mitchell Alsup, Michael Filippo
-
Patent number: 6944750Abstract: The present invention is directed to a system and method for implementing a pre-steered instruction cache. The hardware logic that normally steers instructions to specific execution units just prior to execution is moved before the pre-steered instruction cache, so that instructions are pre-steered into that cache. In other words, the instructions can be moved into the instruction cache in such a manner that they are organized in the cache depending on the execution unit(s) to which they will be transmitted. This is done so that an instruction can leave the pre-steered instruction cache and enter the execution unit that can execute it with either minimum or no steering logic involvement. The cache lines of the pre-steered instruction cache are organized into bins such that each bin corresponds to either a single execution unit or a cluster of execution units.Type: GrantFiled: March 31, 2000Date of Patent: September 13, 2005Assignee: Intel CorporationInventor: Gad S. Sheaffer
-
Patent number: 6941447Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.Type: GrantFiled: November 5, 2003Date of Patent: September 6, 2005Assignee: Seiko Epson CorporationInventors: Le-Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
-
Patent number: 6934829Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.Type: GrantFiled: October 31, 2003Date of Patent: August 23, 2005Assignee: Seiko Epson CorporationInventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
-
Patent number: 6920544Abstract: A processor includes a memory unit in which instructions having their constituent bytes stored in ascending address order alternate with instructions having their constituent bytes stored in descending address order. A single address pointer is used to read one instruction by reading up, and another instruction by reading down. The amount of address information needed for program execution is thereby reduced, as one address pointer suffices for two instructions. The address pointer may be provided by a branch instruction that also indicates whether to read up or down. An up-counter and a down-counter may be provided as address counters, enabling the two instructions to be read and executed concurrently. Four address counters may be provided, enabling a branch instruction to designate the execution of from one to four consecutive instructions.Type: GrantFiled: March 25, 2003Date of Patent: July 19, 2005Assignee: Oki Electric Industry Co., Ltd.Inventor: Mototsugu Watanabe
-
Patent number: 6915412Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.Type: GrantFiled: October 30, 2002Date of Patent: July 5, 2005Assignee: Seiko Epson CorporationInventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
-
Patent number: 6901507Abstract: A programmable processing system that executes multiple instruction contexts includes an instruction memory for storing instructions that are executed by the system, fetch logic for determining an address of an instruction, with the fetch logic including scheduling logic that schedules execution of the instruction contexts based on condition signals indicating an availability of a hardware resource, with the condition signals being divided into groups of condition signals, which are sampled in turn by the scheduling logic to provide a plurality of scan sets of sampled conditions.Type: GrantFiled: November 19, 2001Date of Patent: May 31, 2005Assignee: Intel CorporationInventor: John A. Wishneusky
-
Patent number: 6898692Abstract: A method of processing data relating to graphical primitives to be displayed on a display device using region-based SIMD multiprocessor architecture, has the shading and blending operations deferred until rasterization of the available graphical primitive data is completed.Type: GrantFiled: June 28, 2000Date of Patent: May 24, 2005Assignee: ClearSpeed Technology plcInventors: Ken Cameron, Eamon O'Dea
-
Patent number: 6895497Abstract: A multiple dispatch processor has several instruction fetch units, each for providing a stream of instructions to an instruction decode and dispatch unit. The processor also has an resource allocation unit, and multiple resources such as combined integer and address execution pipelines and floating point execution pipelines. Each instruction decode and dispatch unit requests resources needed to perform an instruction of the resource allocation unit, which arbitrates among the multiple instruction decode and dispatch units.Type: GrantFiled: March 6, 2002Date of Patent: May 17, 2005Assignee: Hewlett-Packard Development Company, L.P.Inventors: Eric S. Fetzer, Wayne Kever, Eric DeLano
-
Patent number: 6892294Abstract: A find-instructions-and-allocate-ports (FIAP) circuit and method are provided for quickly and efficiently locating one or more instructions that are ready for execution during a launch cycle in an out of order processor and allocating one or more ports associated with one or more execution resources to such ready instructions during the launch cycle. In architecture, the processor includes an instruction reordering mechanism, for example, a queue, having a plurality of slots for temporarily storing a plurality of respective instructions. Instructions can be executed in an out of order sequence from the queue. Each slot is provided with the FIAP circuit for causing and preventing launching, when appropriate, of their respective instruction. A plurality of signals is propagated successively through the FIAP circuits of the queue that causes the queue to launch a predefined plurality of the instructions, which corresponds to a predefined plurality of ports associated with the one or more execution resources.Type: GrantFiled: February 3, 2000Date of Patent: May 10, 2005Assignee: Hewlett-Packard Development Company, L.P.Inventor: Samuel D Naffziger
-
Patent number: 6892293Abstract: A computing system as described in which individual instructions are executable in parallel by processing pipelines, and instructions to be executed in parallel by different pipelines are supplied to the pipelines simultaneously. The system includes storage for storing an arbitrary number of the instructions to be executed. The instructions to be executed are tagged with pipeline identification tags indicative of the pipeline to which they should be dispatched. The pipeline identification tags are supplied to a system which controls a crossbar switch, enabling the tags to be used to control the switch and supply the appropriate instructions simultaneously to the differing pipelines.Type: GrantFiled: April 9, 1998Date of Patent: May 10, 2005Assignee: Intergraph CorporationInventors: Howard G. Sachs, Siamak Arya
-
Patent number: 6886091Abstract: Super functional units are used to execute not only single super-instructions that take more than one issue slot, but also a number of equivalent regular VLIW instructions. Accordingly, the same hardware can thus be used to execute either a superoperation or a combination of regular operations, potentially combined with other smaller superoperations. Using super functional units in this way promotes efficient use of computing resources by making computing resources that might otherwise be used unnecessarily by superoperations available for use by single-slot instructions or by smaller superoperations. In some embodiments, a compiler analyzes program and other data to identify superoperations that can be reduced to equivalent single-slot instructions. The compiler maps these operations to a single slot of a super functional unit, reducing the computing resources occupied by the operation.Type: GrantFiled: June 29, 2001Date of Patent: April 26, 2005Assignee: Koninklijke Philips Electronics N.V.Inventors: Kornelis A. Vissers, Marcel J. A. Tromp, Jos van Eijndhoven
-
Patent number: 6883088Abstract: The ManArray processor is a scalable indirect VLIW array processor that defines two preferred architectures for indirect VLIW memories. One approach treats the VIM as one composite block of memory using one common address interface to access any VLIW stored in the VIM. The second approach treats the VIM as made up of multiple smaller VIMs each individually associated with the functional units and each individually addressable for loading and reading during XV execution. The VIM memories, contained in each processing element (PE), are accessible by the same type of LV and XV Short Instruction Words (SIWs) as in a single processor instantiation of the indirect VLIW architecture. In the ManArray architecture, the control processor, also called a sequence processor (SP), fetches the instructions from the SIW memory and dispatches them to itself and the PEs. By using the LV instruction, VLIWs can be loaded into VIMs in the SP and the PEs.Type: GrantFiled: January 15, 2004Date of Patent: April 19, 2005Assignee: PTS CorporationInventors: Edwin Frank Barry, Gerald G. Pechanek
-
Patent number: 6874078Abstract: A highly parallel data processing system includes an array of n processing elements (PEs) and a controller sequence processor (SP) wherein at least one PE is combined with the controller SP to create a Dynamic Merged Processor (DP) which supports two modes of operation. In its first mode of operation, the DP acts as one of the PEs in the array and participates in the execution of single-instruction-multiple-data (SIMD) instructions. In the second mode of operation, the DP acts as the controlling element for the array of PEs and executes non-array instructions. To support these two modes of operation, the DP includes a plurality of execution units and two general-purpose register files. The execution units are “shared” in that they can execute instructions in either mode of operation. With very long instruction word (VLIW) capability, both modes of operation can be in effect on a cycle by cycle basis for every VLIW executed.Type: GrantFiled: July 15, 2003Date of Patent: March 29, 2005Assignee: PTS CorporationInventors: Gerald G. Pechanek, Juan G. Revilla
-
Patent number: 6865662Abstract: A VLIW processor for executing a sequence of very long instruction words having a plurality of operations to be executed in parallel. The VLIW processor has a plurality of functional units for parallel execution of the operations specified by the VLIW, an instruction register for holding the VLIW, and a condition flag for indicating the results of a comparison operation. The VLIW includes a conditional head and a plurality of slots, each slot including an operational code and any related operands. The conditional head has a plurality of conditional indicators, each conditional indicator uniquely corresponding to one operation and specifying a condition in which the operation is to be executed if the indicated condition exists. A control circuit is connected to the instruction register and the functional units to deliver the operation from the instruction register to the corresponding functional unit for execution when the condition exists.Type: GrantFiled: August 8, 2002Date of Patent: March 8, 2005Assignee: Faraday Technology Corp.Inventor: Yu-Min Wang
-
Patent number: 6854049Abstract: A processing unit is associated with a first FIFO-type memory and with a second FIFO-type memory. Each instruction for loading memory stored data into a register within the processing unit is stored in the first FIFO-type memory, and other operative instructions are stored in the second FIFO-type memory. An operative instruction involving the register is removed from the second FIFO-type memory if no loading instruction which is earlier in time, and intended to modify a value of the register associated with this operative instruction is present in the first FIFO-type memory. In the presence of such an earlier loading instruction, the operative instruction is removed from the second FIFO-type memory only after the loading instruction has been removed from the first FIFO-type memory.Type: GrantFiled: February 26, 2002Date of Patent: February 8, 2005Assignee: STMicroelectronics SAInventor: Andrew Cofler
-
Patent number: 6842848Abstract: Techniques for token triggered multithreading in a multithreaded processor are disclosed. An instruction issuance sequence for a plurality of threads of the multithreaded processor is controlled by associating with each of the threads at least one register which stores a value identifying a next thread to be permitted to issue one or more instructions, and utilizing the stored value to control the instruction issuance sequence. For example, each of a plurality of hardware thread units of the multithreaded processor may include a corresponding local register updatable by that hardware thread unit, with the local register for a given one of the hardware thread units storing a value identifying the next thread to be permitted to issue one or more instructions after the given hardware thread unit has issued one or more instructions. A global register arrangement may also or alternatively be used.Type: GrantFiled: October 11, 2002Date of Patent: January 11, 2005Assignee: Sandbridge Technologies, Inc.Inventors: Erdem Hokenek, Mayan Moudgill, C. John Glossner
-
Publication number: 20040268091Abstract: Processor (100) has a plurality of registers (120) for storing instructions for execution by the plurality of execution units (160). The plurality of registers (120) are coupled to the plurality of execution units (160) via distribution means (140). Distribution means (140) have a plurality of dispatch units (144) coupled to the plurality of execution units (160) and a reroutable network, e.g. a data communication bus (142), coupling the plurality of execution units (120) to the plurality of dispatch units (144). The data communication bus (142) is controlled by control unit (148). Dispatch units (144) are arranged to detect dedicated instructions in the instruction flow, which signal the beginning of an inactive period of an execution unit (160a, 160b, 160c, 160d) in the plurality of execution units (160).Type: ApplicationFiled: May 24, 2004Publication date: December 30, 2004Inventor: Francesco Pessolano
-
Patent number: 6836828Abstract: The present invention provides an instruction cache apparatus and method using the instruction read buffer. The apparatus comprises an instruction hit analysis unit, an instruction read buffer, a first cache instruction word memory, a second cache instruction word memory, a first multiplexer and a second multiplexer. The instruction hit analysis unit receives a programmable counter output signal, compares this with a plurality of tags, and after the analysis, outputs the instruction hit signal of the instruction read buffer and the instruction hit signal of the first cache instruction word memory. The second multiplexer reads the expected instruction word from one of either the first cache instruction word memory, the second cache instruction word memory or the first multiplexer according to the instruction hit signal of the instruction read buffer and the instruction hit signal of the first cache instruction word memory.Type: GrantFiled: April 3, 2002Date of Patent: December 28, 2004Assignee: Faraday Technology Corp.Inventor: Min-Cheng Kao
-
Patent number: 6834336Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.Type: GrantFiled: May 24, 2002Date of Patent: December 21, 2004Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Shuichi Takayama, Nobuo Higaki
-
Publication number: 20040230773Abstract: In a computer system, a method and apparatus for dispatching and executing multi-cycle and complex instructions. The method results in maximum performance for such without impacting other areas in the processor such as decode, grouping or dispatch units. This invention allows multi-cycle and complex instructions to be dispatched to one port but executed in multiple execution pipes without cracking the instruction and without limiting it to a single execution pipe. Some control signals are generated in the dispatch unit and dispatched with the instruction to the Fixed Point Unit (FXU). The FXU logic then execute these instructions on the available FXU pipes. This method results in optimum performance with little or no other complications. The presented technique places the flexibility of how these instructions will be executed in the FXU, where the actual execution takes place, instead of in the instruction decode or dispatch units or cracking by the compiler.Type: ApplicationFiled: May 12, 2003Publication date: November 18, 2004Applicant: International Business Machines CorporationInventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, John G. Rell, Timothy J. Slegel
-
Publication number: 20040230772Abstract: In a computer system for use as a symetrical multiprocessor, a superscalar microprocessor apparatus allows dispatching and executing multi-cycle and complex instructions Some control signals are generated in the dispatch unit and dispatched with the instruction to the Fixed Point Unit (FXU). Multiple execution pipes correspond to the instruction dispatch ports and the execution unit is a Fixed Point Unit (FXU) which contains three execution dataflow pipes (X, Y and Z) and one control pipe (R). The FXU logic then execute these instructions on the available FXU pipes. This results in optimum performance with little or no other complications. The presented technique places the flexibility of how these instructions will be executed in the FXU, where the actual execution takes place, instead of in the instruction decode or dispatch units or cracking by the compiler.Type: ApplicationFiled: May 12, 2003Publication date: November 18, 2004Applicant: International Business Machines CorporationInventors: Fadi Y. Busaba, Klaus J. Getzlaff, Christopher A. Krygowski, Timothy J. Slegel
-
Patent number: 6820188Abstract: A circuit is provided to provide instruction streams to a processing device: embodiments of the circuit are appropriate for use with RISC CPUs, whereas other embodiments are useable with other processing devices, such as small processing devices used in a field programmable array. The circuit receives an external instruction stream which provides a first set of instruction values, and has a memory which contains a second set of instruction values. Two or more outputs provide instruction streams to the processing device. The circuit has a control input in the form of a mask which causes a selection means to allocate bits from the first and second sets of instruction values to different instruction streams to the processing device.Type: GrantFiled: January 6, 2003Date of Patent: November 16, 2004Assignee: Elixent LimitedInventors: Anthony Stansfield, Alan David Marshall, Jean Vuillemin
-
Patent number: 6820190Abstract: The present invention is a method for processing instructions by decomposing a macroinstruction into at least two microinstructions, executing the microinstructions in parallel, and linking the microinstructions such that they appear as though they were executed as a single functional unit. The present invention operates by determining whether certain exceptions occur in either of the functional units, according to SSE rules for exceptions. If an exception does occur in any of the linked microinstructions, then the execution of each of those microinstructions is canceled. This avoids the necessity of a back-off or undo mechanism.Type: GrantFiled: February 2, 2000Date of Patent: November 16, 2004Assignee: Hewlett-Packard Development Company, L.P.Inventors: Patrick Knebel, Kevin David Safford
-
Patent number: 6813704Abstract: For use in an instruction queue having a plurality of instruction slots, a mechanism for queueing and retiring instructions. In one embodiment, the mechanism includes a plurality of tag fields corresponding to the plurality of instruction slots, and control logic, coupled to the tag fields, that assigns tags to the tag fields to denote an order of instructions in the instruction slots. In addition, the mechanism includes a tag multiplexer, coupled to the control logic, that changes the order by reassigning only the tags.Type: GrantFiled: December 20, 2001Date of Patent: November 2, 2004Assignee: LSI Logic CorporationInventor: Hung T. Nguyen
-
Patent number: 6807623Abstract: A data processing control system has a controller wherein the controller (1) sends every received instruction to the plurality of data processing devices until the number of instructions being executed or waiting to be executed by the plurality of data processing devices reaches a predetermined number, (2) does not send any received instructions to the plurality of data processing devices but holds the received instructions in a queue once the number of instructions being executed or waiting to be executed by the plurality of data processing devices has reached the predetermined number, and (3) when the number of instructions being executed or waiting to be executed by the plurality of data processing devices has become zero by completing the execution thereof, starts sending the queued instructions in sequence to the plurality of data processing devices, and continues to send the queued instructions or every newly received instruction to the plurality of data processing devices until the number of instructioType: GrantFiled: July 26, 2001Date of Patent: October 19, 2004Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Manabu Migita, Junji Nishikawa, Ichiro Okabayashi, Shinji Furuya
-
Patent number: 6807624Abstract: The present invention relates to an instruction control device, and more particularly to an instruction control device which has achieved a high speed operational processing, such as a reduction in the amount of components, so as to enable an out-of-order instruction execution to thereby execute an instruction processing at high speed in an information processor.Type: GrantFiled: December 16, 1999Date of Patent: October 19, 2004Assignee: Fujitsu LimitedInventor: Takeo Asakawa
-
Patent number: 6804770Abstract: A hazard prediction array consists of an array of saturating counters. The array is indexed through a portion of the instruction address. At issue, the hazard prediction array is referenced and a prediction is made as to whether the current instruction or group of instructions is likely to encounter a flush. If the prediction is that it will flush, the instruction is not issued until it is the next instruction to complete. If the prediction is that the instruction will not flush, it is issued as normal. At completion time, the prediction array is updated with the actual flush behavior. When an instruction is predicted to flush and, thus, not issued until it is the next to complete, the predictor may be updated as if the instruction did not flush.Type: GrantFiled: March 22, 2001Date of Patent: October 12, 2004Assignee: International Business Machines CorporationInventors: Douglas Robert Logan, Alexander Erik Mericas, William John Starke
-
Patent number: 6804759Abstract: In a computer processor, a low-order portion of a virtual address for a pipelined operation is compared directly with the corresponding low-order portions of addresses of operations below it in the pipeline to detect an address conflict, without first translating the address. Preferably, if a match is found, it is assumed that an address conflict exists, and the pipeline is stalled one or more cycles to maintain data integrity in the event of an actual address conflict. Preferably, the CPU has caches which are addressed using real addresses, and a translation lookaside buffer (TLB) for determining the high-order portion of a real address. The comparison of low-order address portions provides conflict detection before the TLB can translate a real address of an instruction.Type: GrantFiled: March 14, 2002Date of Patent: October 12, 2004Assignee: International Business Machines CorporationInventor: David Arnold Luick