Simultaneous Issuance Of Multiple Instructions Patents (Class 712/215)
  • Patent number: 7418575
    Abstract: A system for adding reconfigurable computational instructions to a computer, the system comprising a processor operable to execute a set of instructions of a computer program comprising a set of computational instructions and long instruction word instructions with at least one of the long instruction word instructions comprising an instruction extension, an extension adapter coupled to the processor and operable to detect the execution of the instruction extension, and programmable logic coupled to the extension adapter and operable to receive configuration data for defining the instruction extension and execute the instruction extension.
    Type: Grant
    Filed: May 12, 2005
    Date of Patent: August 26, 2008
    Assignee: Stretch, Inc.
    Inventors: Ricardo E. Gonzalez, Scott Johnson, Derek Taylor
  • Publication number: 20080195846
    Abstract: In one embodiment, a processor comprises an instruction buffer and a pick unit. The instruction buffer is coupled to receive instructions fetched from an instruction cache. The pick unit is configured to select up to N instructions from the instruction buffer for concurrent transmission to respective slots of a plurality of slots, where N is an integer greater than one. Additionally, the pick unit is configured to transmit an oldest instruction of the selected instructions to any of the plurality of slots even if a number of the selected instructions is greater than one. The pick unit is configured to concurrently transmit other ones of the selected instructions to other slots of the plurality of slots based on the slot to which the oldest instruction is transmitted. Some embodiments comprise a computer system including the processor and a communication device configured to communicate with another computer system.
    Type: Application
    Filed: February 13, 2007
    Publication date: August 14, 2008
    Inventors: Gene W. Shen, Sean Lie
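The slot routing described in publication 20080195846 can be illustrated with a small sketch. The abstract says only that the oldest picked instruction may go to any slot and that the others are placed based on that choice; the wrap-around placement rule and the names `pick_and_route` and `first_slot` are assumptions made for illustration.

```python
from typing import List, Optional


def pick_and_route(buffer: List[str], n_slots: int, first_slot: int) -> List[Optional[str]]:
    """Pick up to n_slots instructions (buffer is ordered oldest first) and place them.

    The oldest instruction goes to first_slot; younger ones follow in
    consecutive slots, wrapping around (hypothetical placement rule).
    """
    if not 0 <= first_slot < n_slots:
        raise ValueError("first_slot out of range")
    picked = buffer[:n_slots]
    slots: List[Optional[str]] = [None] * n_slots
    for age, insn in enumerate(picked):
        slots[(first_slot + age) % n_slots] = insn
    return slots
```

For example, with four slots and the oldest instruction sent to slot 2, three picked instructions land in slots 2, 3, and 0.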
  • Patent number: 7409530
    Abstract: A VLIW instruction format is introduced having a set of control bits which identify subinstruction sharing conditions. At compilation the VLIW instruction is analyzed to identify subinstruction sharing opportunities. Such opportunities are encoded in the control bits of the instruction. Before the instruction is moved into the instruction cache, the instruction is compressed into the new format to delete select redundant occurrences of a subinstruction. Specifically, where a subinstruction is to be shared by corresponding functional processing units of respective clusters, the subinstruction need only appear in the instruction once. The redundant appearance is deleted. The control bits are decoded at instruction parsing time to route a shared subinstruction to the associated functional processing units.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: August 5, 2008
    Assignee: University of Washington
    Inventors: Donglok Kim, Stefan G. Berg, Weiyun Sun, Yongmin Kim
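The subinstruction sharing in patent 7409530 can be sketched as a compress/expand pair. This is a minimal model, assuming one control bit per functional-unit position and a subinstruction that is shared only when every cluster carries the same one; the patent's actual bit layout is not given in the abstract, and the names `compress` and `expand` are hypothetical.

```python
from typing import List, Tuple

Word = List[List[str]]  # word[cluster][unit] -> subinstruction mnemonic


def compress(word: Word) -> Tuple[List[int], List[str]]:
    """Drop redundant copies of subinstructions shared by all clusters."""
    control: List[int] = []
    payload: List[str] = []
    for unit in range(len(word[0])):
        column = [cluster[unit] for cluster in word]
        shared = all(op == column[0] for op in column)
        control.append(1 if shared else 0)
        # A shared subinstruction is stored once; otherwise once per cluster.
        payload.extend(column[:1] if shared else column)
    return control, payload


def expand(control: List[int], payload: List[str], n_clusters: int) -> Word:
    """Decode the control bits and route each subinstruction to its clusters."""
    word: Word = [[] for _ in range(n_clusters)]
    pos = 0
    for bit in control:
        for cluster in word:
            cluster.append(payload[pos])
            if not bit:
                pos += 1  # each cluster has its own copy
        if bit:
            pos += 1      # one shared copy served every cluster
    return word
```

Expanding the compressed form with the same cluster count reproduces the original word.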
  • Patent number: 7406586
    Abstract: A pipelined multistreaming processor has an instruction source, a plurality of streams fetching instructions from the instruction source, a dispatch stage for selecting and dispatching instructions to a set of execution units, a set of instruction queues having one queue associated with each stream in the plurality of streams, and located in the pipeline between the instruction cache and the dispatch stage, and a select system for selecting streams in each cycle to fetch instructions from the instruction cache. The processor is characterized in that the select system selects one or more streams in each cycle for which to fetch instructions from the instruction cache, and in that the number of streams selected for which to fetch instructions in each cycle is fewer than the number of streams in the plurality of streams.
    Type: Grant
    Filed: October 6, 2006
    Date of Patent: July 29, 2008
    Assignee: MIPS Technologies, Inc.
    Inventors: Mario Nemirovsky, Adolfo Nemirovsky, Narendra Sankar, Enrique Musoll
  • Patent number: 7401207
    Abstract: Each instruction thread in a SMT processor is associated with a software assigned base input processing priority. Unless some predefined event or circumstance occurs with an instruction being processed or to be processed, the base input processing priorities of the respective threads are used to determine the interleave frequency between the threads according to some instruction interleave rule. However, upon the occurrence of some predefined event or circumstance in the processor related to a particular instruction thread, the base input processing priority of one or more instruction threads is adjusted to produce one or more adjusted priority values. The instruction interleave rule is then enforced according to the adjusted priority value or values together with any base input processing priority values that have not been subject to adjustment.
    Type: Grant
    Filed: April 25, 2003
    Date of Patent: July 15, 2008
    Assignee: International Business Machines Corporation
    Inventors: Ronald Nick Kalla, Minh Michelle Quy Pham, Balaram Sinharoy, John Wesley Ward, III
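The priority adjustment in patent 7401207 can be sketched as follows. The abstract leaves the interleave rule unspecified, so this sketch assumes a weighted round robin in which a thread's share of cycles is proportional to its priority; `effective_priorities` and `interleave` are hypothetical names.

```python
from typing import Dict, List


def effective_priorities(base: Dict[str, int], adjustments: Dict[str, int]) -> Dict[str, int]:
    """Apply event-driven adjustments; unadjusted threads keep their base value."""
    return {t: max(0, p + adjustments.get(t, 0)) for t, p in base.items()}


def interleave(priorities: Dict[str, int], cycles: int) -> List[str]:
    """Assumed interleave rule: smooth weighted round robin over the priorities."""
    credit = {t: 0 for t in priorities}
    total = sum(priorities.values())
    schedule: List[str] = []
    for _ in range(cycles):
        for t, p in priorities.items():
            credit[t] += p
        chosen = max(credit, key=lambda t: credit[t])
        credit[chosen] -= total
        schedule.append(chosen)
    return schedule
```

With base priorities 3 and 1, thread T0 gets three of every four cycles; lowering T0 by 2 after an event evens the split.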
  • Patent number: 7401204
    Abstract: A parallel processor performs efficient parallel processing of one or more basic instructions contained in each of a plurality of instruction words delimited by instruction delimiting information. The processor includes: a plurality of instruction execution units performing processes in accordance with corresponding, supplied basic instructions in parallel; an instruction fetch unit fetching the instruction words one by one in accordance with the instruction delimiting information; and an instruction issue unit recognizing and, in accordance therewith, selecting each of the basic instructions contained in each of the instruction words fetched by the instruction fetch unit to a corresponding instruction execution unit to execute the basic instruction.
    Type: Grant
    Filed: September 1, 2000
    Date of Patent: July 15, 2008
    Assignee: Fujitsu Limited
    Inventors: Hideo Miyake, Atsuhiro Suga, Yasuki Nakamura, Yoshimasa Takebe
  • Patent number: 7398374
    Abstract: The invention provides a processor that processes bundles of instructions preferentially through clusters or execution units according to thread characteristics. The cluster architectures of the invention preferably include capability to process “multi-threaded” instructions. Selectively, the architecture either (a) processes singly-threaded instructions through a single cluster to avoid bypassing and to increase throughput, or (b) processes singly-threaded instructions through multiple processes to increase “per thread” performance. The architecture may be “configurable” to operate in one of two modes: in a “wide” mode of operation, the processor's internal clusters collectively process bundled instructions of one thread of a program at the same time; in a “throughput” mode of operation, those clusters independently process instruction bundles of separate program threads. Clusters are often implemented on a common die, with a core and register file per cluster.
    Type: Grant
    Filed: February 27, 2002
    Date of Patent: July 8, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Eric DeLano
  • Patent number: 7395532
    Abstract: Programs having a given instruction-set architecture are executed on a multiprocessor system comprising a plurality of processors, for example of a VLIW type, each of said processors being able to execute, at each processing cycle, a respective maximum number of instructions. The instructions are compiled as instruction words of given length executable on a first processor. At least some of the instruction words of given length are converted into modified-instruction words executable on a second processor. The operation of modifying comprises in turn at least one operation chosen in the group consisting of: splitting the instruction words into modified-instruction words; and entering no-operation instructions in the modified-instruction words.
    Type: Grant
    Filed: July 1, 2003
    Date of Patent: July 1, 2008
    Assignee: STMicroelectronics S.r.l.
    Inventors: Antonio Maria Borneo, Fabrizio Simone Rovati, Danilo Pietro Pau
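The two conversion operations named in patent 7395532, splitting instruction words and entering no-operation instructions, can be sketched for a word modeled as a list of operations. The sketch assumes the operations in a word may be issued over successive cycles without changing the result; `split_word` and `NOP` are hypothetical names.

```python
from typing import List

NOP = "nop"


def split_word(word: List[str], target_width: int) -> List[List[str]]:
    """Convert one instruction word into words of target_width operations.

    Wider words are split; any short final word is padded with no-ops.
    """
    if target_width <= 0:
        raise ValueError("target_width must be positive")
    out: List[List[str]] = []
    for start in range(0, len(word), target_width):
        chunk = word[start:start + target_width]
        chunk += [NOP] * (target_width - len(chunk))
        out.append(chunk)
    return out
```

The same function covers both directions: a word compiled for a wider processor is split, and a word compiled for a narrower one is padded.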
  • Patent number: 7395414
    Abstract: A method and apparatus for steering instructions dynamically, at issue time, so as to maximize the efficiency of use of execution units being shared by multiple threads being processed by an SMT processor. Resource vectors are used at issue time to redirect instructions, from threads being processed simultaneously, to shared resources for which the multiple threads are competing. The existing resource vectors for instructions that are queued for issuance are analyzed and, where appropriate, dynamically recalculated and modified for maximum efficiency.
    Type: Grant
    Filed: February 11, 2005
    Date of Patent: July 1, 2008
    Assignee: International Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen, Brian W. Thompto, Raymond C. Yeung
  • Patent number: 7395413
    Abstract: A processor (e.g., a co-processor) capable of executing instructions sequentially comprises at least two functional hardware resources. When two instructions are consecutive in program order and are executed on two separate functional hardware resources, the execution of the two instructions may be parallelized if the two instructions are within a hardware loop. The processor may thus implement a multiply and accumulate process in an efficient manner by performing the multiply instructions concurrently with the add instructions (which require fewer cycles to complete than the multiply instructions).
    Type: Grant
    Filed: July 31, 2003
    Date of Patent: July 1, 2008
    Assignee: Texas Instruments Incorporated
    Inventor: Gerard Chauvel
  • Patent number: 7395408
    Abstract: The parallel execution processor 100 fetches a piece of instruction data. When the piece of instruction data includes only one instruction, the instruction decoding unit 120 assigns the one instruction to all the PEs. When the piece of instruction data includes two instructions, the instruction decoding unit 120 forms all the PEs into two groups, so as to assign one instruction to each group. By making it possible to execute, in parallel, not only one type of instruction but also instructions that are different from each other, it is possible to improve the utilization efficiency of the parallel execution processor 100.
    Type: Grant
    Filed: October 16, 2003
    Date of Patent: July 1, 2008
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Takeshi Tanaka, Satoshi Takashima, Hideshi Nishida, Kozo Kimura, Tokuzo Kiyohara
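The assignment rule in patent 7395408 can be sketched as a function from a piece of instruction data to a per-PE instruction list. The abstract does not say how the PEs are divided into two groups, so the split into a lower and an upper half is an assumption, as is the name `assign_to_pes`.

```python
from typing import List


def assign_to_pes(instructions: List[str], n_pes: int) -> List[str]:
    """Return the instruction each processing element (PE) executes."""
    if len(instructions) == 1:
        # One instruction: every PE executes it.
        return [instructions[0]] * n_pes
    if len(instructions) == 2:
        # Two instructions: form two groups (assumed here: lower and upper half).
        half = n_pes // 2
        return [instructions[0]] * half + [instructions[1]] * (n_pes - half)
    raise ValueError("expected one or two instructions")
```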
  • Publication number: 20080155496
    Abstract: A program for execution by a computer that includes a plurality of processor elements, the program comprising: a parallel execution program part to assign the plurality of processor elements one-to-one to a plurality of program parts so that the plurality of program parts are executed in parallel with each other; an execution history obtaining part to obtain and hold an execution history of each of the plurality of program parts; a parallel execution judgment part to judge whether or not to execute the plurality of program parts in parallel with each other, in accordance with the obtained execution history; and a processor element assignment control part to perform a control to determine whether to assign the plurality of processor elements to the plurality of program parts, depending on a result of the judgment made by the parallel execution judgment part.
    Type: Application
    Filed: December 17, 2007
    Publication date: June 26, 2008
    Inventors: Fumihiro Hatano, Akira Tanaka
  • Publication number: 20080133890
    Abstract: A method and apparatus for steering instructions dynamically, at issue time, so as to maximize the efficiency of use of execution units being shared by multiple threads being processed by an SMT processor. Resource vectors are used at issue time to redirect instructions, from threads being processed simultaneously, to shared resources for which the multiple threads are competing. The existing resource vectors for instructions that are queued for issuance are analyzed and, where appropriate, dynamically recalculated and modified for maximum efficiency.
    Type: Application
    Filed: January 14, 2008
    Publication date: June 5, 2008
    Applicant: International Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen, Brian W. Thompto, Raymond C. Yeung
  • Patent number: 7380104
    Abstract: A method is provided for evaluating two or more instructions in an out of order issue queue during a particular cycle of the queue, to select an instruction for issue during the next following cycle. If an instruction was previously designated to issue during the particular cycle, one or more instructions in the queue are evaluated to determine if any of them are dependent on the designated instruction. For the evaluation, each instruction placed into the queue is accompanied by corresponding logic elements that provide destination to source compares for the instruction. In an embodiment comprising a method, the oldest ready instruction in the queue during a particular cycle is identified.
    Type: Grant
    Filed: April 25, 2006
    Date of Patent: May 27, 2008
    Assignee: International Business Machines Corporation
    Inventors: William Elton Burky, Raymond Cheung Yeung
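The two checks described in patent 7380104, destination-to-source compares and selection of the oldest ready instruction, can be sketched in software. The `Entry` fields and function names are hypothetical; the patent describes logic elements attached to each queue entry rather than a software loop.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Entry:
    age: int           # smaller value = older instruction
    dest: str          # destination register
    srcs: Set[str]     # source registers
    pending: Set[str]  # sources whose producers have not issued yet


def dependents(queue: List[Entry], issuing: Entry) -> List[Entry]:
    """Destination-to-source compare: entries that read the issuing entry's result."""
    return [e for e in queue if issuing.dest in e.srcs]


def select_oldest_ready(queue: List[Entry]) -> Optional[Entry]:
    """Pick the oldest entry with no outstanding source operands, if any."""
    ready = [e for e in queue if not e.pending]
    return min(ready, key=lambda e: e.age) if ready else None
```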
  • Patent number: 7380107
    Abstract: Multi-processor systems and methods are disclosed that employ speculative source requests to obtain speculative data fills in response to a cache miss. In one embodiment, a source processor generates a speculative source request and a system source request in response to a cache miss. At least one processor provides a speculative data fill to a source processor in response to the speculative source request. The processor system provides a coherent data fill to the processor in response to the system source request.
    Type: Grant
    Filed: January 13, 2004
    Date of Patent: May 27, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Simon C. Steely, Jr., Gregory Edward Tierney, Stephen R. Van Doren
  • Patent number: 7366884
    Abstract: A context switching system for a multi-thread execution pipeline loop having a pipeline latency and a method of operation thereof. In one embodiment, the context switching system includes a context switch requesting subsystem configured to: (1) detect a device request from a thread executing within the multi-thread execution pipeline loop for access to a device having a fulfillment latency exceeding the pipeline latency, and (2) generate a context switch request for the thread. The context switching system further includes a context controller subsystem configured to receive the context switch request and prevent the thread from executing until the device request is fulfilled.
    Type: Grant
    Filed: February 25, 2002
    Date of Patent: April 29, 2008
    Assignee: Agere Systems Inc.
    Inventors: Victor A. Bennett, Sean W. McGee
  • Patent number: 7360062
    Abstract: The selection between instruction threads in a SMT processor for the purpose of interleaving instructions from the different instruction threads may be modified to accommodate certain processor events or conditions. During each processor clock cycle, an interleave rule enforcement component produces at least one base instruction thread selection signal that indicates a particular one of the instruction threads for passing an instruction from that particular thread into a stream of interleaved instructions. Thread selection modification is provided by an interleave modification component that generates a final thread selection signal based upon the base thread selection signal and a feedback signal derived from one or more conditions or events in the various processor elements.
    Type: Grant
    Filed: April 25, 2003
    Date of Patent: April 15, 2008
    Assignee: International Business Machines Corporation
    Inventors: Ronald Nick Kalla, Minh Michelle Quy Pham, Balaram Sinharoy, John Wesley Ward, III
  • Patent number: 7353364
    Abstract: An apparatus and method for sharing a functional unit. In one embodiment, a processor may include instruction fetch logic configured to issue instructions, and a first functional unit configured to execute instructions issued from the instruction fetch logic and to execute operations issued from a second functional unit, where the operations are issued asynchronously with respect to the instructions. The second functional unit may be configured to provide one or more operands corresponding to a given operation to the first functional unit. The first functional unit may include temporary result storage configured to store a result of the given operation while the first functional unit executes a given instruction issued from the instruction fetch logic, and the first functional unit may be further configured to use the stored result as an operand of an operation issued subsequently to the given operation.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: April 1, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Jike Chong, Christopher Olson, Gregory F. Grohoski
  • Publication number: 20080072012
    Abstract: An operation system and method of processing a user-defined extended operation are provided. The method includes using a software pipelining technology by enabling a processor to process a user-defined extended operation. An operation process system includes a plurality of functional units which are operable to process a primitive operation and a processor which is operable to process an extended operation according to a control of each of the functional units.
    Type: Application
    Filed: December 27, 2006
    Publication date: March 20, 2008
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Hee Seok Kim
  • Patent number: 7343479
    Abstract: The present invention is a method for implementing two architectures on a single chip. The method uses a fetch engine to retrieve instructions. If the instructions are macroinstructions, then it decodes the macroinstructions into microinstructions, and then bundles those microinstructions using a bundler, within an emulation engine. The bundles are issued in parallel and dispatched to the execution engine and contain pre-decode bits so that the execution engine treats them as microinstructions. Before being transferred to the execution engine, the instructions may be held in a buffer. The method also selects between bundled microinstructions from the emulation engine and native microinstructions coming directly from the fetch engine, by using a multiplexer or other means. Both native microinstructions and bundled microinstructions may be held in the buffer. The method also sends additional information to the execution engine.
    Type: Grant
    Filed: June 25, 2003
    Date of Patent: March 11, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Patrick Knebel, Kevin David Safford, Donald Charles Soltis, Jr., Joel D Lamb, Stephen R. Undy, Russell C Brockmann
  • Publication number: 20080059467
    Abstract: A method and system of processing multimedia data is provided. The method includes associating a constant identifier with a current block of the multimedia data. A frame of blocks of the multimedia data, including streaming video data, can be sorted based on the identifier. The identifier of the current block can be compared with the sorted frame of blocks and a compare condition can comprise matching a constant component of the compared blocks. A plurality of fine-grained instructions of a searching algorithm can be used in the comparing of the blocks. The plurality of fine-grained instructions can be stored in a data parallel system. Motion vectors can be generated for the frame of blocks. The generated motion vectors can also be sorted following generation of the motion vectors. A current picture can be reconfigured according to the generated motion vectors for the frame of blocks of the multimedia data.
    Type: Application
    Filed: September 4, 2007
    Publication date: March 6, 2008
    Inventor: Lazar Bivolarski
  • Patent number: 7340591
    Abstract: A number of architectural and implementation approaches are described for using extra path (Epath) storage that operates in conjunction with a compute register file to obtain increased instruction level parallelism that more flexibly addresses the requirements of high performance algorithms. A processor that supports a single load data to a register file operation can be doubled in load capability through the use of an extra path storage, an additional independently addressable data memory path, and instruction decode information that specifies two independent load data operations. By allowing the extra path storage to be accessible by arithmetic facilities, the increased data bandwidth can be fully utilized.
    Type: Grant
    Filed: October 28, 2004
    Date of Patent: March 4, 2008
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Patrick R. Marchand, Larry D. Larsen
  • Patent number: 7337304
    Abstract: When all of a plurality of instructions are symmetry instructions, a symmetry instruction issuing unit issues the symmetry instructions to a plurality of reservation stations provided for every different arithmetic operating units until they become full. If it is determined that there is an asymmetry instruction among the plurality of instructions and the residual instructions are the symmetry instructions, an asymmetry instruction issuing unit 56 develops the asymmetry instruction into a multiflow of a previous flow and a following flow, issues the asymmetry instruction to the reservation station provided in correspondence to the specific arithmetic operating unit, and issues the residual symmetry instructions to the plurality of reservation stations provided for every different arithmetic operating units in an issuing cycle different from that of the asymmetry instruction until they become full.
    Type: Grant
    Filed: January 23, 2003
    Date of Patent: February 26, 2008
    Assignee: Fujitsu Limited
    Inventor: Toshio Yoshida
  • Publication number: 20080028192
    Abstract: The present invention provides a data processing apparatus that includes a plurality of register units and an operation unit. Each of the plurality of register units includes a register divided into a plurality of blocks, each of the plurality of blocks capable of holding block data at least 1 bit in length. The operation unit sequentially reads the plurality of block data from at least one of the plurality of register units, performs a predetermined operation, and outputs an operation result in units of blocks. At least one of the plurality of register units inputs data having a plurality of block data in units of blocks and outputs the data to the operation unit in units of blocks before the register is completely filled with the input data.
    Type: Application
    Filed: July 30, 2007
    Publication date: January 31, 2008
    Applicant: NEC ELECTRONICS CORPORATION
    Inventor: Hideki Sugimoto
  • Patent number: 7308563
    Abstract: A method and apparatus for dual-target register allocation is described, intended to enable the efficient mapping/renaming of registers associated with instructions within a pipelined microprocessor architecture.
    Type: Grant
    Filed: September 28, 2001
    Date of Patent: December 11, 2007
    Assignee: Intel Corporation
    Inventor: Nicholas Samra
  • Patent number: 7278010
    Abstract: An instruction execution apparatus comprising a register storing a copy of contents of a maximum number of entries that are executable simultaneously in one cycle with the entry storing the oldest unreleased instruction at a head among all entries in an instruction storage device after execution of the instructions, a completion condition determination section 44 for determining whether the instructions stored in the entries of the register are completed in the cycle for determining completion conditions of the entries in the instruction storage device, and an entry release section 44 for releasing only the entries that are determined to be completed by the completion condition determination section among all entries in the instruction storage device, which allows the entries in the CSE to be released smoothly even though the number of entries in a commitment stack entry, or clock frequency, is increased.
    Type: Grant
    Filed: December 31, 2002
    Date of Patent: October 2, 2007
    Assignee: Fujitsu Limited
    Inventors: Yasunobu Akizuki, Aiichiro Inoue
  • Patent number: 7269714
    Abstract: A processor is described which includes a first pipeline, a second pipeline, and a control circuit. The first pipeline includes a first stage at which instruction results are committed to architected state. The first stage is separated from an issue stage of the first pipeline by a first number of stages. The second pipeline includes a second stage at which an exception is reportable, wherein the second stage is separated from the issue stage of the second pipeline by a second number of stages which is greater than the first number. The control circuit is configured to inhibit co-issuance of a first instruction to the first pipeline and a second instruction to the second pipeline if the first instruction is subsequent to the second instruction in program order.
    Type: Grant
    Filed: February 4, 2002
    Date of Patent: September 11, 2007
    Assignee: Broadcom Corporation
    Inventors: Tse-Yu Yeh, David A. Kruckemyer, Robert Rogenmoser
  • Patent number: 7269715
    Abstract: An improved method, apparatus, and computer instructions for grouping instructions processed in equal sized sets. A current set of instructions is received in an instruction cache for dispatching. A determination is made as to whether any instructions in the current set of instructions are part of a group including a prior set of instructions received in the instruction cache including using a history data structure, wherein the history data structure contains data regarding instructions in the prior set of instructions. Any such instructions are grouped into the group in response to a determination that those instructions are part of the group. Instructions in the group are dispatched to execution units using the history data structure, wherein invalid instruction dispatch groupings are avoided.
    Type: Grant
    Filed: February 3, 2005
    Date of Patent: September 11, 2007
    Assignee: International Business Machines Corporation
    Inventors: Hung Qui Le, David Stephen Levitan, John Wesley Ward, III
  • Patent number: 7266674
    Abstract: Detecting a stall condition associated with processor instructions within one or more threads and generating a no-dispatch condition. The stall condition can be detected by hardware and/or software before and/or during processor instruction execution. The no-dispatch condition can be associated with a number of processing cycles and an instruction from a particular thread. As a result of generating the no-dispatch condition, processor instructions from other threads may be dispatched into the execution slot of an available execution pipeline. After a period of time, the instruction associated with the stall can be fetched and executed.
    Type: Grant
    Filed: February 24, 2005
    Date of Patent: September 4, 2007
    Assignee: Microsoft Corporation
    Inventor: Susan E. Carrie
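The no-dispatch condition in patent 7266674 can be sketched as a per-thread cycle counter consulted by the dispatcher. The round-robin choice among eligible threads and the name `dispatch_schedule` are assumptions; the abstract states only that instructions from other threads may fill the execution slot while a thread is held back.

```python
from typing import Dict, List, Optional


def dispatch_schedule(threads: List[str], no_dispatch: Dict[str, int], cycles: int) -> List[Optional[str]]:
    """Return which thread dispatches in each cycle.

    no_dispatch maps a stalled thread to the number of cycles it must sit out.
    Eligible threads are served round robin (an assumption of this sketch).
    """
    blocked = dict(no_dispatch)
    schedule: List[Optional[str]] = []
    nxt = 0
    for _ in range(cycles):
        chosen = None
        for offset in range(len(threads)):
            t = threads[(nxt + offset) % len(threads)]
            if blocked.get(t, 0) == 0:
                chosen = t
                nxt = (nxt + offset + 1) % len(threads)
                break
        schedule.append(chosen)
        # Count down every active no-dispatch condition by one cycle.
        for t in blocked:
            if blocked[t] > 0:
                blocked[t] -= 1
    return schedule
```

With threads A and B and a two-cycle no-dispatch condition on A, thread B fills the first two slots and A resumes in the third cycle.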
  • Patent number: 7257698
    Abstract: An instruction buffer of the present invention includes a sequence of instructions arranged in an order determined beforehand, and a buffer including entries arranged in a preselected order for storing the sequence of instructions. Any one of the instructions stored in any one of the entries designated by a low entry number is prior, in order, to another instruction stored in another entry designated by a high entry number.
    Type: Grant
    Filed: May 23, 2001
    Date of Patent: August 14, 2007
    Assignee: NEC Corporation
    Inventor: Mitsuharu Kawaguchi
  • Patent number: 7254689
    Abstract: In an embodiment of the present invention, the computational efficiency of decoding of block-sorted compressed data is improved by ensuring that more than one set of operations corresponding to a plurality of paths through a mapping array T are being handled by a processor. This sequence of operations, including instructions from the plurality of sets of operations, ensures that there is another operation in the pipeline if a cache miss on any given lookup operation in the mapping array results in a slower main memory access. In this way, the processor utilization is improved. While the sets of operations in the sequence of operations are independent of one another, there will be an overlap of a plurality of the main memory access operations due to the long time required for main memory access.
    Type: Grant
    Filed: July 15, 2004
    Date of Patent: August 7, 2007
    Assignee: Google Inc.
    Inventors: Sean M. Dorward, Sean Quinlan, Michael Burrows
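The multi-path decoding in patent 7254689 can be sketched with a Burrows-Wheeler transform whose encoder records one starting row per path, so the decoder can follow several independent chains through the mapping array T in interleaved order. This is an illustrative sketch only: the checkpoint format, the naive rotation sort, and all function names are assumptions, the input must not contain the "\0" sentinel, and Python will not show the cache-miss overlap that motivates the technique.

```python
from typing import Dict, List, Tuple


def bwt_with_checkpoints(text: str, paths: int) -> Tuple[str, List[Tuple[int, int]]]:
    """Naive BWT. Returns the last column and a (start_row, length) pair per path."""
    s = text + "\0"  # unique sentinel keeps all rotations distinct
    n = len(s)
    order = sorted(range(n), key=lambda j: s[j:] + s[:j])
    row_of = {start: row for row, start in enumerate(order)}
    last = "".join(s[(j - 1) % n] for j in order)
    bounds = [n * k // paths for k in range(paths + 1)]
    checkpoints = [(row_of[bounds[k + 1] % n], bounds[k + 1] - bounds[k])
                   for k in range(paths)]
    return last, checkpoints


def build_mapping(last: str) -> List[int]:
    """Mapping array T: T[row] is the row whose rotation starts one character earlier."""
    counts: Dict[str, int] = {}
    for ch in last:
        counts[ch] = counts.get(ch, 0) + 1
    first_row: Dict[str, int] = {}
    total = 0
    for ch in sorted(counts):
        first_row[ch] = total
        total += counts[ch]
    seen: Dict[str, int] = {}
    mapping: List[int] = []
    for ch in last:
        mapping.append(first_row[ch] + seen.get(ch, 0))
        seen[ch] = seen.get(ch, 0) + 1
    return mapping


def decode_interleaved(last: str, checkpoints: List[Tuple[int, int]]) -> str:
    """Advance every path by one lookup per round instead of finishing one path first."""
    T = build_mapping(last)
    rows = [row for row, _ in checkpoints]
    chunks: List[List[str]] = [[] for _ in checkpoints]
    for step in range(max(length for _, length in checkpoints)):
        for k, (_, length) in enumerate(checkpoints):
            if step < length:
                chunks[k].append(last[rows[k]])  # independent of the other paths
                rows[k] = T[rows[k]]
    s = "".join("".join(reversed(c)) for c in chunks)
    return s[:-1]  # drop the sentinel
```

Each path emits its segment back to front, so the segments are reversed and concatenated at the end. Because no path reads another path's state, the lookups in one round have no data dependence on each other.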
  • Patent number: 7254667
    Abstract: A data processor core 10 comprising a memory access interface portion 30 operable to perform data transfer operations between an external data source and at least one memory associated with said data processor core and a data processing portion 12 operable to perform further data processing operations in response to receipt of said processor clock signal CLK. The two portions of the core being operable to be independently enabled such that one portion may be active while the other is inactive.
    Type: Grant
    Filed: April 2, 2004
    Date of Patent: August 7, 2007
    Assignee: Arm Limited
    Inventors: Tan Ba Tran, Richard Roy Grisenthwaite, Gerard Richard Williams
  • Patent number: 7240144
    Abstract: A data processor core 10 comprising: a memory access interface portion 30 operable to perform data transfer operations between an external data source and at least one memory 120 associated with said data processor core; a data processing portion 12 operable to perform data processing operations; a read/write port 40 operable to transfer data from said processor core to at least two buses 75A, 75B said at least two buses being operable to provide data communication between said processor core 10 and said at least one memory 120, said at least one memory 120 comprising at least two portions 120A, 120B, each of said at least two buses 75A, 75B being operable to provide data access to respective ones of said at least two portions 120A, 120B; arbitration logic 110 associated with said read/write port 40; wherein said arbitration logic is operable to route a data access request requesting access of data in one portion of said at least one memory received from said memory access interface to one of said at least tw
    Type: Grant
    Filed: April 2, 2004
    Date of Patent: July 3, 2007
    Assignee: Arm Limited
    Inventors: Tan Ba Tran, Gerard Richard Williams, David Terrence Matheny, David Walter Flynn
  • Patent number: 7237094
    Abstract: A more efficient method of handling instructions in a computer processor, by associating resource fields with respective program instructions wherein the resource fields indicate which of the processor hardware resources are required to carry out the program instructions, calculating resource requirements for merging two or more program instructions based on their resource fields, and determining resource availability for simultaneously executing the merged program instructions based on the calculated resource requirements. Resource vectors indicative of the required resource may be encoded into the resource fields, and the resource fields decoded at a later stage to derive the resource vectors. The resource fields can be stored in the instruction cache associated with the respective program instructions. The processor may operate in a simultaneous multithreading mode with different program instructions being part of different hardware threads.
    Type: Grant
    Filed: October 14, 2004
    Date of Patent: June 26, 2007
    Assignee: International Business Machines Corporation
    Inventors: Brian William Curran, Brian R. Konigsburg, Hung Qui Le, David Arnold Luick, Dung Quoc Nguyen
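
The resource-field check described in this abstract can be illustrated with a small sketch. It assumes each resource vector is a bitmask in which one bit stands for one exclusive hardware resource; the resource names, the bit layout, and both function names are hypothetical and are not taken from the patent.

```python
# Hypothetical resource bits; the abstract does not name specific resources.
ALU0, ALU1, LOAD_STORE, FPU = 1, 2, 4, 8


def merged_requirements(*resource_vectors):
    """Combine resource vectors; report whether any resource is requested twice."""
    combined = 0
    conflict = False
    for vec in resource_vectors:
        if combined & vec:
            # Two instructions ask for the same exclusive resource.
            conflict = True
        combined |= vec
    return combined, conflict


def can_issue_together(available, *resource_vectors):
    """True if the merged instructions fit within the available resources."""
    combined, conflict = merged_requirements(*resource_vectors)
    return not conflict and (combined & ~available) == 0
```

Under these assumptions, `can_issue_together(ALU0 | LOAD_STORE, ALU0, LOAD_STORE)` is true, while two instructions that both need `ALU0` cannot be merged.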
  • Patent number: 7237095
    Abstract: A method and mechanism for managing shifts in a shifting queue. A reservation station in a processing device includes a queue of shifting entries. On a given cycle, zero, one, or two instructions may be dispatched and stored in the queue. Depending upon the dispatch conditions and the state of the queue, existing entries within the queue may be shifted to make room for the newly dispatched instruction(s) at the top of the queue. Shift vectors are generated which identify entries of the queue which are to be shifted and by how much. A queue management approach is adopted in which three rules are generally followed: (i) Only shift entries that must shift due to dispatch pressure from above; (ii) If an entry must be shifted elsewhere, shift it as far down the array as the particular implementation allows; and (iii) Don't allow the previous conditions to force additional entries to shift that are not required to shift by dispatch pressure.
    Type: Grant
    Filed: August 4, 2005
    Date of Patent: June 26, 2007
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Daniel B. Hopper
  • Patent number: 7234042
    Abstract: An instruction set for a computer is described which includes instructions having a common predetermined bit length. That predetermined bit length can define a single operation or two independent operations. The instruction includes designated bits at predetermined bit locations which identify whether the instruction is a long instruction or a dual operation instruction.
    Type: Grant
    Filed: September 13, 1999
    Date of Patent: June 19, 2007
    Assignee: Broadcom Corporation
    Inventor: Sophie Wilson
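
As a rough illustration of the encoding idea in this abstract, the sketch below decodes a fixed-length word as either one long instruction or two independent operations. The 64-bit width, the use of bit 63 as the single designated bit, and the 31-bit operation fields are assumptions made for the example; the abstract does not give these values.

```python
WORD_BITS = 64                    # assumed common instruction length
LONG_FLAG = 1 << (WORD_BITS - 1)  # assumed designated bit: set means "long"
OP_BITS = 31                      # assumed width of each operation in a dual word
OP_MASK = (1 << OP_BITS) - 1


def decode(word):
    """Return ("long", payload) or ("dual", first_op, second_op)."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word does not fit in the instruction length")
    if word & LONG_FLAG:
        # Everything below the designated bit belongs to one long instruction.
        return ("long", word & (LONG_FLAG - 1))
    # Otherwise the word carries two independent operations.
    return ("dual", word & OP_MASK, (word >> OP_BITS) & OP_MASK)
```

The point of the design is that the fetch width never changes: the decoder inspects the designated bit and then interprets the same word in one of two ways.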
  • Patent number: 7231510
    Abstract: A mechanism for, and method of, processing multiply-accumulate instructions with out-of-order completion in a pipeline, for use in a processor having an at least four-wide instruction issue architecture, and a digital signal processor (DSP) incorporating the mechanism or the method. In one embodiment, the mechanism including: (1) a multiply-accumulate unit (MAC) having an initial multiply stage and a subsequent accumulate stage and (2) out-of-order completion logic, associated with the MAC, that causes interim results produced by the multiply stage to be stored when the accumulate stage is unavailable and allows younger instructions to complete before the multiply-accumulate instructions.
    Type: Grant
    Filed: November 13, 2001
    Date of Patent: June 12, 2007
    Assignee: VeriSilicon Holdings (Cayman Islands) Co. Ltd.
    Inventors: Hung T. Nguyen, Shannon A. Wichman
  • Patent number: 7181590
    Abstract: A method and system for allowing a multi-threaded processor to share pages across different threads in a pre-validated cache using a translation look-aside buffer is disclosed. The multi-threaded processor searches a translation look-aside buffer in an attempt to match a virtual memory address. If no matching valid virtual memory address is found, a new translation is retrieved and the translation look-aside buffer is searched for a matching physical memory address. If a matching physical memory address is found, the old translation is overwritten with a new translation. The multi-threaded processor may execute switch on event multi-threading or simultaneous multi-threading. If simultaneous multi-threading is executed, then access rights for each thread are associated with the translation.
    Type: Grant
    Filed: August 28, 2003
    Date of Patent: February 20, 2007
    Assignee: Intel Corporation
    Inventors: Sailesh Kottapalli, Nadeem H. Firasta
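
The lookup sequence in this abstract can be sketched as follows. The list-of-dicts TLB, the `page_table` mapping, and the function name are hypothetical. Appending a fresh entry when no physical match exists is also an assumption, since the abstract only describes the overwrite case.

```python
def tlb_access(tlb, page_table, virtual_page):
    """Look up a virtual page; return "hit", "overwritten", or "inserted"."""
    # Step 1: search for a matching virtual address.
    for entry in tlb:
        if entry["virtual"] == virtual_page:
            return "hit"
    # Step 2: on a miss, retrieve the new translation.
    physical = page_table[virtual_page]
    # Step 3: search for an entry that already maps the same physical page.
    for entry in tlb:
        if entry["physical"] == physical:
            entry["virtual"] = virtual_page  # overwrite the old translation
            return "overwritten"
    # Assumed behavior when no physical match exists: add a new entry.
    tlb.append({"virtual": virtual_page, "physical": physical})
    return "inserted"
```

Keeping at most one entry per physical page is what lets threads that map the same page under different virtual addresses share it.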
  • Patent number: 7178008
    Abstract: A parallel processor has a plurality of operation units that execute operation instructions, and a multi-bank register file in which a plurality of banks each having a plurality of registers are formed. Each of machine instructions, which are input simultaneously, is split into a plurality of nano-instructions each of which includes at least one of an access instruction and operation instruction. The output clock cycles of operation instructions with respect to the operation units are arbitrated. Furthermore, the output clock cycles of access instructions to the multi-bank register file are arbitrated so as to prevent access instructions from contending in an identical bank in the multi-bank register file.
    Type: Grant
    Filed: February 18, 2003
    Date of Patent: February 13, 2007
    Assignee: Semiconductor Technology Academic Research Center
    Inventors: Tetsuo Hironaka, Mattausch Hans Juergen, Takeshi Hiramatsu
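
The bank arbitration in this abstract can be illustrated with a simple greedy scheduler. It assumes that a register's bank is its number modulo the bank count and that each bank serves one access per clock cycle; both rules, and the function name, are assumptions for the example rather than details from the patent.

```python
def arbitrate_bank_accesses(registers, num_banks=4):
    """Assign register accesses to cycles so no cycle touches a bank twice."""
    cycles = []  # each cycle maps bank index -> register accessed in that cycle
    for reg in registers:
        bank = reg % num_banks  # assumed bank mapping
        for slot in cycles:
            if bank not in slot:
                slot[bank] = reg  # earliest cycle in which this bank is free
                break
        else:
            cycles.append({bank: reg})  # every existing cycle conflicts
    return [sorted(slot.values()) for slot in cycles]
```

Note that this sketch lets a later access fill a free bank in an earlier cycle, which a real arbiter may or may not permit.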
  • Patent number: 7162718
    Abstract: An asynchronous execution process to allow a compiler or interpreter to recognize code elements that may be executed out of order and to create a light weight thread for execution of the code element. This light weight thread may be executed on another processor in a multiprocessing environment. An “async” keyword is included in a language to indicate that a statement may be executed asynchronously with respect to the other statements at the same nesting level. The “async” keyword may also be used to modify the declaration of a function to indicate that it is safe to run the affected method out of order with other statements in a block. An “async_end” keyword is included in a language to indicate that asynchronous execution of a statement, block of code, or method must be complete before the next statement, block of code, or method may be executed.
    Type: Grant
    Filed: December 12, 2000
    Date of Patent: January 9, 2007
    Assignee: International Business Machines Corporation
    Inventors: Michael Wayne Brown, Scott E. Garfinkle, Michael A. Paolini, David Mark Wendt
  • Patent number: 7149881
    Abstract: A method and apparatus for improving dispersal performance of instruction threads is described. In one embodiment, the dispersal logic determines whether the instructions supplied to it include any NOP instructions. When a NOP instruction is detected, the dispersal logic places the NOP into a no-op port for execution. All other instructions are distributed to the proper execution pipes in a normal manner. Because the NOP instructions do not use the execution resources of other instructions, all instruction threads can be executed in one cycle.
    Type: Grant
    Filed: March 19, 2004
    Date of Patent: December 12, 2006
    Assignee: Intel Corporation
    Inventors: Sailesh Kottapalli, Udo Walterscheidt, Andrew Sun, Thomas Yeh, Kinkee Sit
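
A minimal sketch of the dispersal step in this abstract follows. The opcode names, the `PIPE_FOR_OPCODE` table, and the return shape are hypothetical; only the idea of steering NOPs to a dedicated no-op port comes from the abstract.

```python
# Hypothetical mapping from opcode to execution pipe type.
PIPE_FOR_OPCODE = {"add": "int", "ld": "mem", "fmul": "fp"}


def disperse(instructions):
    """Split instruction indices into the no-op port and the execution pipes."""
    nop_port = []
    pipes = {}
    for index, opcode in enumerate(instructions):
        if opcode == "nop":
            # NOPs consume no execution pipe; they go to the no-op port.
            nop_port.append(index)
        else:
            pipes.setdefault(PIPE_FOR_OPCODE[opcode], []).append(index)
    return nop_port, pipes
```

Because NOPs never occupy a pipe slot, the remaining instructions compete only with each other for execution resources.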
  • Patent number: 7143268
    Abstract: A data processor includes execution clusters, an instruction cache, an instruction issue unit, and alignment and dispersal circuitry. Each execution cluster includes an instruction execution pipeline having a number of processing stages, and each execution pipeline is a number of lanes wide. The processing stages execute instruction bundles, where each instruction bundle has one or more syllables. Each lane is capable of receiving one of the syllables of an instruction bundle. The instruction cache includes a number of cache lines. The instruction issue unit receives fetched cache lines and issues complete instruction bundles toward the execution clusters. The alignment and dispersal circuitry receives the complete instruction bundles from the instruction issue unit and routes each received complete instruction bundle to a correct one of the execution clusters. The complete instruction bundles are routed as a function of at least one address bit associated with each complete instruction bundle.
    Type: Grant
    Filed: December 29, 2000
    Date of Patent: November 28, 2006
    Assignees: STMicroelectronics, Inc., Hewlett-Packard Development Co., L.P.
    Inventors: Paolo Faraboschi, Anthony X. Jarvis, Mark Owen Homewood, Geoffrey M. Brown, Gary L. Vondran
  • Patent number: 7137109
    Abstract: In one embodiment, the invention may comprise a computer-implemented system for managing access to a controlled space in a simulator environment, comprising: means for requiring initialization of a simulated hardware control object by a user code application operable to run on a simulated target platform in the simulator environment, wherein the simulated hardware control object is associated with at least a partition of the controlled space that is simulated by an architectural simulator in the simulator environment; and means for verifying if the simulated hardware control object associated with the partition has been initialized by the user code application when the user code application issues a transaction that attempts to access the partition.
    Type: Grant
    Filed: December 17, 2002
    Date of Patent: November 14, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Richard Shortz
  • Patent number: 7134002
    Abstract: A multi-threading processor is provided. The multi-threading processor includes a first instruction fetch unit and a second instruction fetch unit. A multi-thread scheduler unit is coupled to the first instruction fetch unit and the second instruction fetch unit. An execution unit, which executes a first active thread and a second active thread, is coupled to the scheduler unit. The multi-threading processor also includes a register file coupled to the execution unit. The register file switches one of the first active thread and the second active thread with a first inactive thread.
    Type: Grant
    Filed: August 29, 2001
    Date of Patent: November 7, 2006
    Assignee: Intel Corporation
    Inventor: Ken Shoemaker
  • Patent number: 7127530
    Abstract: In order to reduce load placed on a CPU (central processing unit) in providing SBP-2 (serial bus protocol 2) initiator capability, provided are a sequence control circuit activated by the CPU for controlling a command issue sequence; a packet processing circuit for assembling operation request blocks (ORB) into a transmission packet and extracting a status from a received packet; a buffer for storing a command ORB provided by the CPU; a buffer for storing a management ORB provided by the CPU; a buffer for storing a status received for an issued management ORB and providing the status to the CPU; and a buffer for storing a status received for an issued command ORB and providing the status to the CPU.
    Type: Grant
    Filed: April 18, 2002
    Date of Patent: October 24, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Isamu Ishimura, Yoshihiro Tabira
  • Patent number: 7120913
    Abstract: A processing execution apparatus has an update management program and a reference limiting program. The update management program manages whether or not A- to C-data update programs are performing data update, and shows an update status indicating that update is being performed or a standby status indicating that update is not being performed. Then, the reference limiting program refers to the status of the update management program. If it is in the update status, the program does not notify an X-program including data reference processing of the received event. That is, execution of the X-program is controlled by notification/non-notification of the event. However, a Y-program not including data reference processing operates directly in accordance with the event.
    Type: Grant
    Filed: April 4, 2002
    Date of Patent: October 10, 2006
    Assignee: DENSO Corporation
    Inventor: Yoshihiro Kawase
  • Patent number: 7117343
    Abstract: A program-controlled unit has a plurality of instruction-execution units for simultaneously executing successive instructions of a program that is to be executed. The program-controlled unit allows the number of access operations to a program memory storing the program that is to be executed to be reduced. The program-controlled unit has an assignment device which operates such that only the instructions for those instruction-execution units which are actually required for the execution of the program are stored in the program memory in which the program to be executed by the program-controlled unit is stored. The program includes a sequence of instructions which can be executed simultaneously. The assignment device allocates instructions that can be executed simultaneously to desired instruction-execution units for simultaneous execution, independent of each instruction's position within the sequence.
    Type: Grant
    Filed: September 4, 2001
    Date of Patent: October 3, 2006
    Assignee: Infineon Technologies AG
    Inventors: Raimund Leitner, Christian Panis
  • Patent number: 7114060
    Abstract: One embodiment of the present invention provides a system that facilitates deferring execution of instructions with unresolved data dependencies as they are issued for execution in program order. During a normal execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system generates a checkpoint that can subsequently be used to return execution of the program to the point of the instruction. Next, the system executes subsequent instructions in an execute-ahead mode, wherein instructions that cannot be executed because of an unresolved data dependency are deferred, and wherein other non-deferred instructions are executed in program order.
    Type: Grant
    Filed: October 14, 2003
    Date of Patent: September 26, 2006
    Assignee: Sun Microsystems, Inc.
    Inventors: Shailender Chaudhry, Marc Tremblay
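
The execute-ahead mode in this abstract can be sketched under a few stated assumptions: instructions are `(dest, sources)` tuples, the checkpoint is the index of the first instruction that hits an unresolved dependency, and the destination of a deferred instruction is itself treated as unresolved. The last rule is an assumption, since the abstract does not say how dependencies propagate.

```python
def execute_ahead(program, unresolved):
    """Return (checkpoint, executed, deferred) as instruction indices."""
    not_ready = set(unresolved)  # registers whose values are not yet available
    checkpoint = None
    executed, deferred = [], []
    for pc, (dest, sources) in enumerate(program):
        if any(src in not_ready for src in sources):
            if checkpoint is None:
                checkpoint = pc  # point to which execution can later return
            deferred.append(pc)
            not_ready.add(dest)  # assumed: results of deferred work are unresolved
        else:
            executed.append(pc)  # non-deferred instructions run in program order
            not_ready.discard(dest)
    return checkpoint, executed, deferred
```

For example, if the second instruction reads an unresolved register and the third reads the second's result, both are deferred while independent later instructions still execute.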
  • Patent number: 7114058
    Abstract: Methods and apparatuses for dispatching instructions executed by at least one functional unit of a data processor, each one of the instructions having a corresponding priority number, in a data processing system having at least one host processor with host processor cache and host memory are described herein. In one aspect of the invention, an exemplary method includes receiving a next instruction from an instruction stream, examining a current instruction group to determine if the current instruction group is completed, adding the next instruction to the current instruction group if the current instruction group is not completed, and dispatching the current instruction group if the current instruction group is completed.
    Type: Grant
    Filed: December 31, 2001
    Date of Patent: September 26, 2006
    Assignee: Apple Computer, Inc.
    Inventors: Sushma Shrikant Trivedi, Joseph P. Bratt, Jack Benkual, Ronald Ray Hochsprung, Derek Fujio Iwamoto
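
The receive, examine, add, and dispatch loop in this abstract can be sketched as below. The abstract does not say what makes a group "completed", so the sketch assumes a group is complete once it holds `group_size` instructions; that rule, the final flush of a partial group, and the function name are all hypothetical. Priority numbers are omitted.

```python
def dispatch_groups(stream, group_size=4):
    """Collect instructions into groups and return the groups in dispatch order."""
    dispatched = []
    current = []
    for instr in stream:
        # Examine the current group before placing the next instruction.
        if len(current) >= group_size:
            dispatched.append(current)  # group completed: dispatch it
            current = []
        current.append(instr)  # add to the current (possibly new) group
    if current:
        dispatched.append(current)  # assumed: flush the last partial group
    return dispatched
```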
  • Patent number: 7107478
    Abstract: A data-processing system includes a data device for selectively storing data and an engine having access to the memory device, the engine supporting a plurality of machine executable programs. A controller is utilized which selectively outputs one of a plurality of instructions to the engine for driving the execution of the programs enabled by the engine, while a clock device is utilized for outputting a synchronizing clock signal comprised of a predetermined number of clock cycles per second. The clock device outputs the synchronizing clock signal to the data device, the engine and the controller. The controller outputs one of the instructions to the engine for execution of one of the programs, while also executing an operation within itself, all within a single clock cycle.
    Type: Grant
    Filed: December 4, 2003
    Date of Patent: September 12, 2006
    Assignee: Connex Technology, Inc.
    Inventors: Dan Tomescu, Gheorghe Stefan