Simultaneous Issuance Of Multiple Instructions Patents (Class 712/215)
-
Patent number: 7418575Abstract: A system for adding reconfigurable computational instructions to a computer, the system comprising a processor operable to execute a set of instructions of a computer program comprising a set of computational instructions and long instruction word instructions with at least one of the long instruction word instructions comprising an instruction extension, an extension adapter coupled to the processor and operable to detect the execution of the instruction extension, and programmable logic coupled to the extension adapter and operable to receive configuration data for defining the instruction extension and execute the instruction extension.Type: GrantFiled: May 12, 2005Date of Patent: August 26, 2008Assignee: Stretch, Inc.Inventors: Ricardo E. Gonzalez, Scott Johnson, Derek Taylor
-
Publication number: 20080195846Abstract: In one embodiment, a processor comprises an instruction buffer and a pick unit. The instruction buffer is coupled to receive instructions fetched from an instruction cache. The pick unit is configured to select up to N instructions from the instruction buffer for concurrent transmission to respective slots of a plurality of slots, where N is an integer greater than one. Additionally, the pick unit is configured to transmit an oldest instruction of the selected instructions to any of the plurality of slots even if a number of the selected instructions is greater than one. The pick unit is configured to concurrently transmit other ones of the selected instructions to other slots of the plurality of slots based on the slot to which the oldest instruction is transmitted. Some embodiments comprise a computer system including the processor and a communication device configured to communicate with another computer system.Type: ApplicationFiled: February 13, 2007Publication date: August 14, 2008Inventors: Gene W. Shen, Sean Lie
-
Patent number: 7409530Abstract: A VLIW instruction format is introduced having a set of control bits which identify subinstruction sharing conditions. At compilation the VLIW instruction is analyzed to identify subinstruction sharing opportunities. Such opportunities are encoded in the control bits of the instruction. Before the instruction is moved into the instruction cache, the instruction is compressed into the new format to delete select redundant occurrences of a subinstruction. Specifically, where a subinstruction is to be shared by corresponding functional processing units of respective clusters, the subinstruction need only appear in the instruction once. The redundant appearance is deleted. The control bits are decoded at instruction parsing time to route a shared subinstruction to the associated functional processing units.Type: GrantFiled: December 17, 2004Date of Patent: August 5, 2008Assignee: University of WashingtonInventors: Donglok Kim, Stefan G. Berg, Weiyun Sun, Yongmin Kim
-
Patent number: 7406586Abstract: A pipelined multistreaming processor has an instruction source, a plurality of streams fetching instructions from the instruction source, a dispatch stage for selecting and dispatching instructions to a set of execution units, a set of instruction queues having one queue associated with each stream in the plurality of streams, and located in the pipeline between the instruction cache and the dispatch stage, and a select system for selecting streams in each cycle to fetch instructions from the instruction cache. The processor is characterized in that the select system selects one or more streams in each cycle for which to fetch instructions from the instruction cache, and in that the number of streams selected for which to fetch instructions in each cycle is fewer than the number of streams in the plurality of streams.Type: GrantFiled: October 6, 2006Date of Patent: July 29, 2008Assignee: MIPS Technologies, Inc.Inventors: Mario Nemirovsky, Adolfo Nemirovsky, Narendra Sankar, Enrique Musoll
-
Patent number: 7401207Abstract: Each instruction thread in a SMT processor is associated with a software assigned base input processing priority. Unless some predefined event or circumstance occurs with an instruction being processed or to be processed, the base input processing priorities of the respective threads are used to determine the interleave frequency between the threads according to some instruction interleave rule. However, upon the occurrence of some predefined event or circumstance in the processor related to a particular instruction thread, the base input processing priority of one or more instruction threads is adjusted to produce one more adjusted priority values. The instruction interleave rule is then enforced according to the adjusted priority value or values together with any base input processing priority values that have not been subject to adjustment.Type: GrantFiled: April 25, 2003Date of Patent: July 15, 2008Assignee: International Business Machines CorporationInventors: Ronald Nick Kalla, Minh Michelle Quy Pham, Balaram Sinharoy, John Wesley Ward, III
-
Patent number: 7401204Abstract: A parallel processor performs efficient parallel processing of one or more basic instructions contained in each of a plurality of instruction words delimited by instruction delimiting information. The processor includes: a plurality of instruction execution units performing processes in accordance with corresponding, supplied basic instructions in parallel; an instruction fetch unit fetching the instruction words one by one in accordance with the instruction delimiting information; and an instruction issue unit recognizing and, in accordance therewith, selecting each of the basic instructions contained in each of the instruction words fetched by the instruction fetch unit to a corresponding instruction execution unit to execute the basic instruction.Type: GrantFiled: September 1, 2000Date of Patent: July 15, 2008Assignee: Fujitsu LimitedInventors: Hideo Miyake, Atsuhiro Suga, Yasuki Nakamura, Yoshimasa Takebe
-
Patent number: 7398374Abstract: The invention provides a processor that processes bundles of instructions preferentially through clusters or execution units according to thread characteristics. The cluster architectures of the invention preferably include capability to process “multi-threaded” instructions. Selectively, the architecture either (a) processes singly-threaded instructions through a single cluster to avoid bypassing and to increase throughput, or (b) processes singly-threaded instructions through multiple processes to increase “per thread” performance. The architecture may be “configurable” to operate in one of two modes: in a “wide” mode of operation, the processor's internal clusters collectively process bundled instructions of one thread of a program at the same time; in a “throughput” mode of operation, those clusters independently process instruction bundles of separate program threads. Clusters are often implemented on a common die, with a core and register file per cluster.Type: GrantFiled: February 27, 2002Date of Patent: July 8, 2008Assignee: Hewlett-Packard Development Company, L.P.Inventor: Eric DeLano
-
Patent number: 7395532Abstract: Programs having a given instruction-set architecture are executed on a multiprocessor system comprising a plurality of processors, for example of a VLIW type, each of said processors being able to execute, at each processing cycle, a respective maximum number of instructions. The instructions are compiled as instruction words of given length executable on a first processor. At least some of the instruction words of given length are converted into modified-instruction words executable on a second processor. The operation of modifying comprises in turn at least one operation chosen in the group consisting of: splitting the instruction words into modified-instruction words; and entering no-operation instructions in the modified-instruction words.Type: GrantFiled: July 1, 2003Date of Patent: July 1, 2008Assignee: STMicroelectronics S.r.l.Inventors: Antonio Maria Borneo, Fabrizio Simone Rovati, Danilo Pietro Pau
-
Patent number: 7395414Abstract: A method and apparatus for steering instructions dynamically, at issue time, so as to maximize the efficiency of use of execution units being shared by multiple threads being processed by an SMT processor. Resource vectors are used at issue time to redirect instructions, from threads being processed simultaneously, to shared resources for which the multiple threads are competing. The existing resource vectors for instructions that are queued for issuance are analyzed and, where appropriate, dynamically recalculated and modified for maximum efficiency.Type: GrantFiled: February 11, 2005Date of Patent: July 1, 2008Assignee: International Business Machines CorporationInventors: Hung Q. Le, Dung Q. Nguyen, Brian W. Thompto, Raymond C. Yeung
-
Patent number: 7395413Abstract: A processor (e.g., a co-processor) capable of executing instructions sequentially, comprises at least two functional hardware resources. When two instructions that are consecutive in program order and are executed on two separate functional hardware resources, the execution of the two instructions may be parallelized if the two instructions are within a hardware loop. The processor thus, may implement a multiply and accumulate process in an efficient manner by performing the multiply instructions concurrently with the add instructions (which require fewer cycles to complete than the multiply instructions).Type: GrantFiled: July 31, 2003Date of Patent: July 1, 2008Assignee: Texas Instruments IncorporatedInventor: Gerard Chauvel
-
Patent number: 7395408Abstract: The parallel execution processor 100 fetches a piece of instruction data. When the piece of instruction data includes only one instruction, the instruction decoding unit 120 assigns the one instruction to all the PEs. When the piece of instruction data includes two instructions, the instruction decoding unit 120 forms all the PEs into two groups, so as to assign one instruction to each group. By making it possible to execute, in parallel, not only one type of instruction but also instructions that are different from each other, it is possible to improve the utilization efficiency of the parallel execution processor 100.Type: GrantFiled: October 16, 2003Date of Patent: July 1, 2008Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Takeshi Tanaka, Satoshi Takashima, Hideshi Nishida, Kozo Kimura, Tokuzo Kiyohara
-
Publication number: 20080155496Abstract: A program for execution by a computer that includes a plurality of processor elements, the program comprising: a parallel execution program part to assign the plurality of processor elements one-to-one to a plurality of program parts so that the plurality of program parts are executed in parallel with each other; an execution history obtaining part to obtain and hold an execution history of each of the plurality of program parts; a parallel execution judgment part to judge whether or not to execute the plurality of program parts in parallel with each other, in accordance with the obtained execution history; and a processor element assignment control part to perform a control to determine whether to assign the plurality of processor elements to the plurality of program parts, depending on a result of the judgment made by the parallel execution judgment part.Type: ApplicationFiled: December 17, 2007Publication date: June 26, 2008Inventors: Fumihiro Hatano, Akira Tanaka
-
Publication number: 20080133890Abstract: A method and apparatus for steering instructions dynamically, at issue time, so as to maximize the efficiency of use of execution units being shared by multiple threads being processed by an SMT processor. Resource vectors are used at issue time to redirect instructions, from threads being processed simultaneously, to shared resources for which the multiple threads are competing. The existing resource vectors for instructions that are queued for issuance are analyzed and, where appropriate, dynamically recalculated and modified for maximum efficiency.Type: ApplicationFiled: January 14, 2008Publication date: June 5, 2008Applicant: International Business Machines CorporationInventors: Hung Q. Le, Dung Q. Nguyen, Brian W. Thompto, Raymond C. Yeung
-
Method and apparatus for back to back issue of dependent instructions in an out of order issue queue
Patent number: 7380104Abstract: A method is provided for evaluating two or more instructions in an out of order issue queue during a particular cycle of the queue, to select an instruction for issue during the next following cycle. If an instruction was previously designated to issue during the particular cycle, one or more instructions in the queue are evaluated to determine if any of them are dependent on the designated instruction. For the evaluation, each instruction placed into the queue is accompanied by corresponding logic elements that provide destination to source compares for the instruction. In an embodiment comprising a method, the oldest ready instruction in the queue during a particular cycle is identified.Type: GrantFiled: April 25, 2006Date of Patent: May 27, 2008Assignee: International Business Machines CorporationInventors: William Elton Burky, Raymond Cheung Yeung -
Patent number: 7380107Abstract: Multi-processor systems and methods are disclosed that employ speculative source requests to obtain speculative data fills in response to a cache miss. In one embodiment, a source processor generates a speculative source request and a system source request in response to a cache miss. At least one processor provides a speculative data fill to a source processor in response to the speculative source request. The processor system provides a coherent data fill to the processor in response to the system source request.Type: GrantFiled: January 13, 2004Date of Patent: May 27, 2008Assignee: Hewlett-Packard Development Company, L.P.Inventors: Simon C. Steely, Jr., Gregory Edward Tierney, Stephen R. Van Doren
-
Patent number: 7366884Abstract: A context switching system for a multi-thread execution pipeline loop having a pipeline latency and a method of operation thereof. In one embodiment, the context switching system includes a context switch requesting subsystem configured to: (1) detect a device request from a thread executing within the multi-thread execution pipeline loop for access to a device having a fulfillment latency exceeding the pipeline latency, and (2) generate a context switch request for the thread. The context switching system further includes a context controller subsystem configured to receive the context switch request and prevent the thread from executing until the device request is fulfilled.Type: GrantFiled: February 25, 2002Date of Patent: April 29, 2008Assignee: Agere Systems Inc.Inventors: Victor A. Bennett, Sean W. McGee
-
Patent number: 7360062Abstract: The selection between instruction threads in a SMT processor for the purpose of interleaving instructions from the different instruction threads may be modified to accommodate certain processor events or conditions. During each processor clock cycle, an interleave rule enforcement component produces at least one base instruction thread selection signal that indicates a particular one of the instruction threads for passing an instruction from that particular thread into a stream of interleaved instructions. Thread selection modification is provided by an interleave modification component that generates a final thread selection signal based upon the base thread selection signal and a feedback signal derived from one or more conditions or events in the various processor elements.Type: GrantFiled: April 25, 2003Date of Patent: April 15, 2008Assignee: International Business Machines CorporationInventors: Ronald Nick Kalla, Minh Michelle Quy Pham, Balaram Sinharoy, John Wesley Ward, III
-
Patent number: 7353364Abstract: An apparatus and method for sharing a functional unit. In one embodiment, a processor may include instruction fetch logic configured to issue instructions, and a first functional unit configured to execute instructions issued from the instruction fetch logic and to execute operations issued from a second functional unit, where the operations are issued asynchronously with respect to the instructions. The second functional unit may be configured to provide one or more operands corresponding to a given operation to the first functional unit. The first functional unit may include temporary result storage configured to store a result of the given operation while the first functional unit executes a given instruction issued from the instruction fetch logic, and the first functional unit may be further configured to use the stored result as an operand of an operation issued subsequently to the given operation.Type: GrantFiled: June 30, 2004Date of Patent: April 1, 2008Assignee: Sun Microsystems, Inc.Inventors: Jike Chong, Christopher Olson, Gregory F. Grohoski
-
Publication number: 20080072012Abstract: An operation system and method of processing a user-defined extended operation are provided. The method includes using a software pipelining technology by enabling a processor to process a user-defined extended operation. An operation process system includes a plurality of functional units which are operable to process a primitive operation and a processor which is operable to process an extended operation according to a control of each of the functional units.Type: ApplicationFiled: December 27, 2006Publication date: March 20, 2008Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventor: Hee Seok Kim
-
Patent number: 7343479Abstract: The present invention is a method for implementing two architectures on a single chip. The method uses a fetch engine to retrieve instructions. If the instructions are macroinstructions, then it decodes the macroinstructions into microinstructions, and then bundles those microinstructions using a bundler, within an emulation engine. The bundles are issued in parallel and dispatched to the execution engine and contain pre-decode bits so that the execution engine treats them as microinstructions. Before being transferred to the execution engine, the instructions may be held in a buffer. The method also selects between bundled microinstructions from the emulation engine and native microinstructions coming directly from the fetch engine, by using a multiplexer or other means. Both native microinstructions and bundled microinstructions may be held in the buffer. The method also sends additional information to the execution engine.Type: GrantFiled: June 25, 2003Date of Patent: March 11, 2008Assignee: Hewlett-Packard Development Company, L.P.Inventors: Patrick Knebel, Kevin David Safford, Donald Charles Soltis, Jr., Joel D Lamb, Stephen R. Undy, Russell C Brockmann
-
Publication number: 20080059467Abstract: A method and system of processing multimedia data is provided. The method includes associating a constant identifier with a current block of the multimedia data. A frame of blocks of the multimedia data, including streaming video data, can be sorted based on the identifier. The identifier of the current block can be compared with the sorted frame of blocks and a compare condition can comprise matching a constant component of the compared blocks. A plurality of fine-grained instructions of a searching algorithm can be used in the comparing of the blocks. The plurality of fined-grained instructions can be stored in a data parallel system. Motion vectors can be generated for the frame of blocks. The generated motion vectors can also be sorted following generation of the motion vectors. A current picture can be reconfigured according to the generated motion vectors for the frame of blocks of the multimedia data.Type: ApplicationFiled: September 4, 2007Publication date: March 6, 2008Inventor: Lazar Bivolarski
-
Patent number: 7340591Abstract: A number of architectural and implementation approaches are described for using extra path (Epath) storage that operate in conjunction with a compute register file to obtain increased instruction level parallelism that more flexibly addresses the requirements of high performance algorithms. A processor that supports a single load data to a register file operation can be doubled in load capability through the use of an extra path storage, an additional independently addressable data memory path, and instruction decode information that specifies two independently load data operations. By allowing the extra path storage to be accessible by arithmetic facilities, the increased data bandwidth can be fully utilized.Type: GrantFiled: October 28, 2004Date of Patent: March 4, 2008Assignee: Altera CorporationInventors: Gerald George Pechanek, Patrick R. Marchand, Larry D. Larsen
-
Patent number: 7337304Abstract: When all of a plurality of instructions are symmetry instructions, a symmetry instruction issuing unit issues the symmetry instructions to a plurality of reservation stations provided for every different arithmetic operating units until they become full. If it is determined that there is an asymmetry instruction among the plurality of instructions and the residual instructions are the symmetry instructions, an asymmetry instruction issuing unit 56 develops the asymmetry instruction into a multiflow of a previous flow and a following flow, issues the asymmetry instruction to the reservation station provided in correspondence to the specific arithmetic operating unit, and issues the residual symmetry instructions to the plurality of reservation stations provided for every different arithmetic operating units in an issuing cycle different from that of the asymmetry instruction until they become full.Type: GrantFiled: January 23, 2003Date of Patent: February 26, 2008Assignee: Fujitsu LimitedInventor: Toshio Yoshida
-
Publication number: 20080028192Abstract: The present invention provides a data processing apparatus includes a plurality of register units and an operation unit. Each of the plurality of register units includes a register divided into a plurality of blocks, each of the plurality of blocks capable of holding a block data being at least 1 bit length. The operation unit sequentially reads the plurality of block data from at least one of the plurality of register units, performs predetermined operation, and outputs an operation result in units of blocks. At least one of the plurality of register units inputs a data having a plurality of block data in units of blocks and outputs the data to the operation unit in units of blocks before filling the register with full of the input data.Type: ApplicationFiled: July 30, 2007Publication date: January 31, 2008Applicant: NEC ELECTRONICS CORPORATIONInventor: Hideki Sugimoto
-
Patent number: 7308563Abstract: A method and apparatus for dual-target register allocation is described, intended to enable the efficient mapping/renaming of registers associated with instructions within a pipelined microprocessor architecture.Type: GrantFiled: September 28, 2001Date of Patent: December 11, 2007Assignee: Intel CorporationInventor: Nicholas Samra
-
Patent number: 7278010Abstract: An instruction execution apparatus comprising a register storing a copy of contents of a maximum number of entries that are executable simultaneously in one cycle with the entry storing the oldest unreleased instruction at a head among all entries in an instruction storage device after execution of the instructions, a completion condition determination section 44 for determining whether the instructions stored in the entries of the register are completed in the cycle for determining completion conditions of the entries in the instruction storage device, and an entry release section 44 for releasing only the entries that are determined to be completed by the completion condition determination section among all entries in the instruction storage device, which allows the entries in the CSE to be released smoothly even though the number of entries in a commitment stack entry, or clock frequency, is increased.Type: GrantFiled: December 31, 2002Date of Patent: October 2, 2007Assignee: Fujitsu LimitedInventors: Yasunobu Akizuki, Aiichiro Inoue
-
Patent number: 7269714Abstract: A processor is described which includes a first pipeline, a second pipeline, and a control circuit. The first pipeline includes a first stage at which instruction results are committed to architected state. The first stage is separated from an issue stage of the first pipeline by a first number of stages. The second pipeline includes a second stage at which an exception is reportable, wherein the second stage is separated from the issue stage of the second pipeline by a second number of stages which is greater than the first number. The control circuit is configured to inhibit co-issuance of a first instruction to the first pipeline and a second instruction to the second pipeline if the first instruction is subsequent to the second instruction in program order.Type: GrantFiled: February 4, 2002Date of Patent: September 11, 2007Assignee: Broadcom CorporationInventors: Tse-Yu Yeh, David A. Kruckemyer, Robert Rogenmoser
-
Patent number: 7269715Abstract: An improved method, apparatus, and computer instructions for grouping instructions processed in equal sized sets. A current set of instructions is received in an instruction cache for dispatching. A determination is made as to whether any instructions in the current set of instructions are part of a group including a prior set of instructions received in the instruction cache including using a history data structure, wherein the history data structure contains data regarding instructions in the prior set of instructions. Any instructions are grouped into the group with the instruction in response to a determination that the any instructions are part of the group. Instructions in the group units are dispatched to execution using the history data structure, wherein invalid instruction dispatch groupings are avoided.Type: GrantFiled: February 3, 2005Date of Patent: September 11, 2007Assignee: International Business Machines CorporationInventors: Hung Qui Le, David Stephen Levitan, John Wesley Ward, III
-
Patent number: 7266674Abstract: Detecting a stall condition associated with processor instructions within one or more threads and generating a no-dispatch condition. The stall condition can be detected by hardware and/or software before and/or during processor instruction execution. The no-dispatch condition can be associated with a number of processing cycles and an instruction from a particular thread. As a result of generating the no-dispatch condition, processor instructions from other threads may be dispatched into the execution slot of an available execution pipeline. After a period of time, the instruction associated with the stall can be fetched and executed.Type: GrantFiled: February 24, 2005Date of Patent: September 4, 2007Assignee: Microsoft CorporationInventor: Susan E. Carrie
-
Patent number: 7257698Abstract: An instruction buffer of the present invention includes a sequence of instructions arranged in an order determined beforehand, and a buffer including entries arranged in a preselected order for storing the sequence of instructions. Any one of the instructions stored in any one of the entries designated by a low entry number is prior, in order, to another instruction stored in another entry designated by a high entry number.Type: GrantFiled: May 23, 2001Date of Patent: August 14, 2007Assignee: NEC CorporationInventor: Mitsuharu Kawaguchi
-
Patent number: 7254689Abstract: In an embodiment of the present invention, the computational efficiency of decoding of block-sorted compressed data is improved by ensuring that more than one set of operations corresponding to a plurality of paths through a mapping array T are being handled by a processor. This sequence of operations, including instructions from the plurality of sets of operations, ensures that there is another operation in the pipeline if a cache miss on any given lookup operation in the mapping array results in a slower main memory access. In this way, the processor utilization is improved. While the sets of operations in the sequence of operations are independent of another other, there will be an overlap of a plurality of the main memory access operations due to the long time required for main memory access.Type: GrantFiled: July 15, 2004Date of Patent: August 7, 2007Assignee: Google Inc.Inventors: Sean M. Dorward, Sean Quinlan, Michael Burrows
-
Patent number: 7254667Abstract: A data processor core 10 comprising a memory access interface portion 30 operable to perform data transfer operations between an external data source and at least one memory associated with said data processor core and a data processing portion 12 operable to perform further data processing operations in response to receipt of said processor clock signal CLK. The two portions of the core being operable to be independently enabled such that one portion may be active while the other is inactive.Type: GrantFiled: April 2, 2004Date of Patent: August 7, 2007Assignee: Arm LimitedInventors: Tan Ba Tran, Richard Roy Grisenthwaite, Gerard Richard Williams
-
Patent number: 7240144Abstract: A data processor core 10 comprising: a memory access interface portion 30 operable to perform data transfer operations between an external data source and at least one memory 120 associated with said data processor core; a data processing portion 12 operable to perform data processing operations; a read/write port 40 operable to transfer data from said processor core to at least two buses 75A, 75B said at least two buses being operable to provide data communication between said processor core 10 and said at least one memory 120, said at least one memory 120 comprising at least two portions 120A, 120B, each of said at least two buses 75A, 75B being operable to provide data access to respective ones of said at least two portions 120A, 120B; arbitration logic 110 associated with said read/write port 40; wherein said arbitration logic is operable to route a data access request requesting access of data in one portion of said at least one memory received from said memory access interface to one of said at least twType: GrantFiled: April 2, 2004Date of Patent: July 3, 2007Assignee: Arm LimitedInventors: Tan Ba Tran, Gerard Richard Williams, David Terrence Matheny, David Walter Flynn
-
Patent number: 7237094Abstract: A more efficient method of handling instructions in a computer processor, by associating resource fields with respective program instructions wherein the resource fields indicate which of the processor hardware resources are required to carry out the program instructions, calculating resource requirements for merging two or more program instructions based on their resource fields, and determining resource availability for simultaneously executing the merged program instructions based on the calculated resource requirements. Resource vectors indicative of the required resource may be encoded into the resource fields, and the resource fields decoded at a later stage to derive the resource vectors. The resource fields can be stored in the instruction cache associated with the respective program instructions. The processor may operate in a simultaneous multithreading mode with different program instructions being part of different hardware threads.Type: GrantFiled: October 14, 2004Date of Patent: June 26, 2007Assignee: International Business Machines CorporationInventors: Brian William Curran, Brian R. Konigsburg, Hung Qui Le, David Arnold Luick, Dung Quoc Nguyen
-
Patent number: 7237095Abstract: A method and mechanism for managing shifts in a shifting queue. A reservation station in a processing device includes a queue of shifting entries. On a given cycle, zero, one, or two instructions may be dispatched and stored in the queue. Depending upon the dispatch conditions and the state of the queue, existing entries within the queue may be shifted to make room for the newly dispatched instruction(s) at the top of the queue. Shift vectors are generated which identify entries of the queue which are to be shifted and by how much. A queue management approach is adopted in which three rules are generally followed: (i) Only shift entries that must shift due to dispatch pressure from above; (ii) If an entry must be shifted elsewhere, shift it as far down the array as the particular implementation allows; and (iii) Don't allow the previous conditions to force additional entries to shift that are not required to shift by dispatch pressure.Type: GrantFiled: August 4, 2005Date of Patent: June 26, 2007Assignee: Advanced Micro Devices, Inc.Inventor: Daniel B. Hopper
-
Patent number: 7234042Abstract: An instruction set for a computer is described which includes instructions having a common predetermined bit length. That predetermined bit length can define a single operation or two independent operations. The instruction includes designated bits at predetermined bit locations which identify whether the instruction is a long instruction or a dual operation instruction.Type: GrantFiled: September 13, 1999Date of Patent: June 19, 2007Assignee: Broadcom CorporationInventor: Sophie Wilson
-
Patent number: 7231510Abstract: A mechanism for, and method of, processing multiply-accumulate instructions with out-of-order completion in a pipeline, for use in a processor having an at least four-wide instruction issue architecture, and a digital signal processor (DSP) incorporating the mechanism or the method. In one embodiment, the mechanism including: (1) a multiply-accumulate unit (MAC) having an initial multiply stage and a subsequent accumulate stage and (2) out-of-order completion logic, associated with the MAC, that causes interim results produced by the multiply stage to be stored when the accumulate stage is unavailable and allows younger instructions to complete before the multiply-accumulate instructions.Type: GrantFiled: November 13, 2001Date of Patent: June 12, 2007Assignee: VeriSilicon Holdings (Cayman Islands) Co. Ltd.Inventors: Hung T. Nguyen, Shannon A. Wichman
-
Patent number: 7181590Abstract: A method and system for allowing a multi-threaded processor to share pages across different threads in a pre-validated cache using a translation look-aside buffer is disclosed. The multi-threaded processor searches a translation look-aside buffer in an attempt to match a virtual memory address. If no matching valid virtual memory address is found, a new translation is retrieved and the translation look-aside buffer is searched for a matching physical memory address. If a matching physical memory address is found, the old translation is overwritten with a new translation. The multi-threaded processor may execute switch on event multi-threading or simultaneous multi-threading. If simultaneous multi-threading is executed, then access rights for each thread is associated with the translation.Type: GrantFiled: August 28, 2003Date of Patent: February 20, 2007Assignee: Intel CorporationInventors: Sailesh Kottapalli, Nadeem H. Firasta
-
Patent number: 7178008Abstract: A parallel processor has a plurality of operation units that execute operation instructions, and a multi-bank register file in which a plurality of banks each having a plurality of registers are formed. Each of machine instructions, which are input simultaneously, is split into a plurality of nano-instructions each of which includes at least one of an access instruction and operation instruction. The output clock cycles of operation instructions with respect to the operation units are arbitrated. Furthermore, the output clock cycles of access instructions to the multi-bank register file are arbitrated so as to prevent access instructions from contending in an identical bank in the multi-bank register file.Type: GrantFiled: February 18, 2003Date of Patent: February 13, 2007Assignee: Semiconductor Technology Academic Research CenterInventors: Tetsuo Hironaka, Mattausch Hans Juergen, Takeshi Hiramatsu
-
Patent number: 7162718Abstract: An asynchronous execution process to allow a compiler or interpreter to recognize code elements that may be executed out of order and to create a light weight thread for execution of the code element. This light weight thread may be executed on another processor in a multiprocessing environment. An “async” keyword is included in a language to indicate that a statement may be executed asynchronously with respect to the other statements at the same nesting level. The “async” keyword may also be used to modify the declaration of a function to indicate that it is safe to run the affected method out of order with other statements in a block. An “async_end” keyword is included in a language to indicate that asynchronous execution of a statement, block of code, or method must be complete before the next statement, block of code, or method may be executed.Type: GrantFiled: December 12, 2000Date of Patent: January 9, 2007Assignee: International Business Machines CorporationInventors: Michael Wayne Brown, Scott E. Garfinkle, Michael A. Paolini, David Mark Wendt
-
Patent number: 7149881Abstract: A method and apparatus for improving dispersal performance of instruction threads is described. In one embodiment, the dispersal logic determines whether the instructions supplied to it include any NOP instructions. When a NOP instruction is detected, the dispersal logic places the NOP into a no-op port for execution. All other instructions are distributed to the proper execution pipes in a normal manner. Because the NOP instructions do not use the execution resources of other instructions, all instruction threads can be executed in one cycle.Type: GrantFiled: March 19, 2004Date of Patent: December 12, 2006Assignee: Intel CorporationInventors: Sailesh Kottapalli, Udo Walterscheidt, Andrew Sun, Thomas Yeh, Kinkee Sit
-
Patent number: 7143268Abstract: A data processor includes execution clusters, an instruction cache, an instruction issue unit, and alignment and dispersal circuitry. Each execution cluster includes an instruction execution pipeline having a number of processing stages, and each execution pipeline is a number of lanes wide. The processing stages execute instruction bundles, where each instruction bundle has one or more syllables. Each lane is capable of receiving one of the syllables of an instruction bundle. The instruction cache includes a number of cache lines. The instruction issue unit receives fetched cache lines and issues complete instruction bundles toward the execution clusters. The alignment and dispersal circuitry receives the complete instruction bundles from the instruction issue unit and routes each received complete instruction bundle to a correct one of the execution clusters. The complete instruction bundles are routed as a function of at least one address bit associated with each complete instruction bundle.Type: GrantFiled: December 29, 2000Date of Patent: November 28, 2006Assignees: STMicroelectronics, Inc., Hewlett-Packard Development Co., L.P.Inventors: Paolo Faraboschi, Anthony X. Jarvis, Mark Owen Homewood, Geoffrey M. Brown, Gary L. Vondran
-
Patent number: 7137109Abstract: In one embodiment, the invention may comprise a computer-implemented system for managing access to a controlled space in a simulator environment, comprising: means for requiring initialization of a simulated hardware control object by a user code application operable to run on a simulated target platform in the simulator environment, wherein the simulated hardware control object is associated with at least a partition of the controlled space that is simulated by an architectural simulator in the simulator environment; and means for verifying if the simulated hardware control object associated with the partition has been initialized by the user code application when the user code application issues a transaction that attempts to access the partition.Type: GrantFiled: December 17, 2002Date of Patent: November 14, 2006Assignee: Hewlett-Packard Development Company, L.P.Inventor: Richard Shortz
-
Patent number: 7134002Abstract: An multi-threading processor is provided. The multi-threading processor includes a first instruction fetch unit and a second instruction fetch unit. A multi-thread scheduler unit is coupled to the first instruction fetch unit and the second instruction fetch unit. An execution unit, which executes a first active thread and a second active thread is coupled to the scheduler unit. The multi-threading processor also includes a register file coupled to the execution unit. The register file switches one of the first active thread and the second active threads with a first inactive thread.Type: GrantFiled: August 29, 2001Date of Patent: November 7, 2006Assignee: Intel CorporationInventor: Ken Shoemaker
-
Patent number: 7127530Abstract: In order to reduce load placed on a CPU (central processing unit) in providing SBP-2 (serial bus protocol 2) initiator capability, provided are a sequence control circuit activated by the CPU for controlling a command issue sequence, a packet processing circuit for assembling operation request blocks (ORB) into a transmission packet and extracting a status from a received packet; buffer for storing a command ORB provided by the CPU; a buffer for storing a management ORB provided by the CPU; a buffer for storing a status received for an issued management ORB and providing the status to the CPU; and a buffer for command for storing a status received for an issued command ORB and providing the status to the CPU.Type: GrantFiled: April 18, 2002Date of Patent: October 24, 2006Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Isamu Ishimura, Yoshihiro Tabira
-
Patent number: 7120913Abstract: A processing execution apparatus has an update management program and a reference limiting program. The update management program manages whether or not A- to C-data update programs are performing data update, and shows an update status indicating that update is being performed or a standby status indicating that update is not being performed. Then, the reference limiting program refers to the status of the update management program. If it is in the update status, the program does not notify an X-program including data reference processing of the received event. That is, execution of the X-program is controlled by notification/non-notification of the event. However, regarding a Y-program not including data reference processing, it directly operates in accordance with event.Type: GrantFiled: April 4, 2002Date of Patent: October 10, 2006Assignee: DENSO CorporationInventor: Yoshihiro Kawase
-
Patent number: 7117343Abstract: A program-controlled unit has a plurality of instruction-execution units for simultaneously executing successive instructions of a program that is to be executed. The program-controlled unit allows the number of access operations to a program memory storing the program that is to be executed to be reduced. The program-controlled unit has an assignment device which operates such that only the instructions for those instruction-execution units which are actually required for the execution of the program are stored in the program memory in which the program to be executed by the program-controlled unit is stored. The program includes a sequence of instructions which can be executed simultaneously. The assignment device allocates instructions that can be executed simultaneously to desired instruction-execution units for simultaneous execution, independent of each instruction's position within the sequence.Type: GrantFiled: September 4, 2001Date of Patent: October 3, 2006Assignee: Infineon Technologies AGInventors: Raimund Leitner, Christian Panis
-
Patent number: 7114060Abstract: One embodiment of the present invention provides a system that facilitates deferring execution of instructions with unresolved data dependencies as they are issued for execution in program order. During a normal execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system generates a checkpoint that can subsequently be used to return execution of the program to the point of the instruction. Next, the system executes subsequent instructions in an execute-ahead mode, wherein instructions that cannot be executed because of an unresolved data dependency are deferred, and wherein other non-deferred instructions are executed in program order.Type: GrantFiled: October 14, 2003Date of Patent: September 26, 2006Assignee: Sun Microsystems, Inc.Inventors: Shailender Chaudhry, Marc Tremblay
-
Patent number: 7114058Abstract: Methods and apparatuses for dispatching instructions executed by at least one functional unit of a data processor, each one of the instructions having a corresponding priority number, in a data processing system having at least one host processor with host processor cache and host memory are described herein. In one aspect of the invention, an exemplary method includes receiving a next instruction from an instruction stream, examining a current instruction group to determine if the current instruction group is completed, adding the next instruction to the current instruction group if the current instruction group is not completed, and dispatching the current instruction group if the current instruction group is completed.Type: GrantFiled: December 31, 2001Date of Patent: September 26, 2006Assignee: Apple Computer, Inc.Inventors: Sushma Shrikant Trivedi, Joseph P. Bratt, Jack Benkual, Ronald Ray Hochsprung, Derek Fujio Iwamoto
-
Patent number: 7107478Abstract: A data-processing system includes a data device for selectively storing data and an engine having access to the memory device, the engine supporting a plurality of machine executable programs. A controller is utilized which selectively outputs one of a plurality of instructions to the engine for driving the execution of the programs enabled by the engine, while a clock device is utilized for outputting a synchronizing clock signal comprised of a predetermined number of clock cycles per second. The clock device outputs the synchronizing clock signal to the data device, the engine and the controller. The controller outputs one of the instructions to the engine for execution of one of the programs, while also executing an operation within itself, all within a single clock cycle.Type: GrantFiled: December 4, 2003Date of Patent: September 12, 2006Assignee: Connex Technology, Inc.Inventors: Dan Tomescu, Gheorghe Stefan