Instruction Issuing Patents (Class 712/214)
  • Publication number: 20090254784
    Abstract: A semiconductor memory device comprises a RAM (Random Access Memory), an ODT (On-Die Termination) circuit and a JTAG (Joint Test Action Group) circuit. The RAM is connected to a data input-output port. The ODT circuit is provided between the data input-output port and a termination port. The JTAG circuit controls the ODT circuit in response to an instruction such that the data input-output port and the termination port are electrically connected with each other.
    Type: Application
    Filed: April 8, 2009
    Publication date: October 8, 2009
    Applicant: NEC ELECTRONICS CORPORATION
    Inventors: MASATOSHI SONODA, YUUJIROU SHIMIZU, HIDEAKI ARIMA
  • Publication number: 20090249034
    Abstract: A processor performs instruction execution regardless of a program order. An execution unit executes an instruction, and transmits end information of the instruction whose execution has ended. A retire unit receives the end information, rearranges a result of the instruction whose execution has ended in a program order to determine the instruction execution, and transmits completed instruction information which reports that the instruction execution has been determined. A signature generation unit receives the completed instruction information from the retire unit, and generates a signature using the completed instruction information.
    Type: Application
    Filed: February 23, 2009
    Publication date: October 1, 2009
    Applicant: Fujitsu Limited
    Inventor: Mitsuru SATO
  • Publication number: 20090249026
    Abstract: In one embodiment, a processor may include a vector unit to perform operations on multiple data elements responsive to a single instruction, and a control unit coupled to the vector unit to provide the data elements to the vector unit, where the control unit is to enable an atomic vector operation to be performed on at least some of the data elements responsive to a first vector instruction to be executed under a first mask and a second vector instruction to be executed under a second mask. Other embodiments are described and claimed.
    Type: Application
    Filed: March 28, 2008
    Publication date: October 1, 2009
    Inventors: Mikhail Smelyanskiy, Sanjeev Kumar, Daehyun Kim, Jatin Chhugani, Changkyu Kim, Christopher J. Hughes, Victor W. Lee, Anthony D. Nguyen, Yen-Kuang Chen
  • Publication number: 20090240919
    Abstract: A pipelined processor including an architecture for address generation interlocking, the processor including: an instruction grouping unit to detect a read-after-write dependency and to resolve instruction interdependency; an instruction dispatch unit (IDU) including address generation interlock (AGI) and operand fetching logic for dispatching an instruction to at least one of a load store unit and an execution unit; wherein the load store unit is configured with access to a data cache and to return fetched data to the execution unit; wherein the execution unit is configured to write data into a general purpose register bank; and wherein the architecture provides support for bypassing of results of a load multiple instruction for address generation while such instruction is executing in the execution unit before the general purpose register bank is written. A method and a computer system are also provided.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Khary J. Alexander, Fadi Y. Busaba, Vimal M. Kapadia, Chung-Lung Kevin Shum
  • Patent number: 7594097
    Abstract: A method and apparatus are provided for controlling instructions provided by a microprocessor output port to other execution units. A microprocessor pipeline of instructions is provided for each execution unit. These are scheduled via the microprocessor unit. For each execution unit, a determination is made as to whether or not the execution unit can receive further instructions. If it cannot, its associated pipeline is said to be stalled and instructions are deleted from the microprocessor pipeline. Its thread can then be restarted at a later time with the instruction corresponding to the instruction which was unable to execute.
    Type: Grant
    Filed: July 15, 2005
    Date of Patent: September 22, 2009
    Assignee: Imagination Technologies Limited
    Inventor: Andrew Webber
  • Patent number: 7590824
    Abstract: Techniques for processing transmissions in a communications (e.g., CDMA) system. A method and system for issuing and executing mixed architecture instructions in a multiple-issue digital signal processor receives in a mixed instruction listing a plurality of digital signal processor instructions. The plurality of digital signal processor instructions includes a plurality of parallel executable instructions (e.g., VLIW instructions or instruction packets) mixed among a plurality of series executable instructions (e.g., superscalar instructions). The series executable instructions are associated by various instruction dependencies. The method and system further identify in the mixed instruction listing the plurality of parallel executable instructions. Once identified, the parallel executable instructions are first executed in parallel irrespective of any such instruction's relative order in the mixed instruction listing.
    Type: Grant
    Filed: March 29, 2005
    Date of Patent: September 15, 2009
    Assignee: QUALCOMM Incorporated
    Inventors: Muhammad Ahmed, Erich Plondke, Lucian Codrescu, William C. Anderson
  • Publication number: 20090228683
    Abstract: A controller operable to control an array of processing elements comprises a retrieval unit operable to retrieve instruction items for each of a plurality of instruction streams, each instruction stream having a plurality of instruction items, a combining unit operable to combine the plurality of instruction streams into a serial instruction stream, and a distribution unit operable to distribute the serial instruction stream to an array of processing elements.
    Type: Application
    Filed: March 13, 2009
    Publication date: September 10, 2009
    Applicant: ClearSpeed Technology plc
    Inventors: Dave Stuttard, Dave Williams, Eamon O'Dea, Gordon Faulds, John Rhoades, Ken Cameron, Phil Atkin, Paul Winser, Russell David, Ray McConnell, Tim Day, Trey Greer
  • Patent number: 7587717
    Abstract: A computing system having expansion modules. One of the expansion modules is identified as a master module. The other modules act as slaves to the master module. The central processing unit routes a task to either the master module for portioning out or to all of the expansion modules. The master module then receives completion signals from all of the active slave modules and then provides only one interrupt to the central processing unit for that task.
    Type: Grant
    Filed: November 13, 2006
    Date of Patent: September 8, 2009
    Assignee: Intel Corporation
    Inventors: John I. Garney, Robert J. Royer, Jr.
  • Publication number: 20090217005
    Abstract: A method for selectively accelerating early instruction processing including receiving an instruction data that is normally processed in an execution stage of a processor pipeline, wherein a configuration of the instruction data allows a processing of the instruction data to be accelerated from the execution stage to an address generation stage that occurs earlier in the processor pipeline than the execution stage, determining whether the instruction data can be dispatched to the address generation stage to be processed without being delayed due to an unavailability of a processing resource needed for the processing of the instruction data in the address generation stage, dispatching the instruction data to be processed in the address generation stage if it can be dispatched without being delayed due to the unavailability of the processing resource, and dispatching the instruction data to be processed in the execution stage if it can not be dispatched without being delayed due to the unavailability of the pro
    Type: Application
    Filed: February 26, 2008
    Publication date: August 27, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Khary J. Alexander, Fadi Y. Busaba, Bruce C. Giamei, David S. Hutton, Chung-Lung K. Shum
  • Publication number: 20090217006
    Abstract: A heuristic backtracer is described. In one embodiment, a scanner scans a stack of an application for a pointer to a word of machine code of the application. A preceding byte locator identifies one or more bytes immediately preceding the pointed-to machine code. A parser parses the one or more bytes immediately preceding the pointed-to machine code for a call instruction. A return address identifier identifies the pointer as a return address when the one or more bytes constitute the call instruction.
    Type: Application
    Filed: February 27, 2008
    Publication date: August 27, 2009
    Inventor: Soren Sandmann Pedersen
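The scan described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes little-endian stack words, recognizes only the 5-byte x86 `call rel32` encoding (opcode `0xE8`), and the names `find_return_addresses`, `code_base`, and `word_size` are hypothetical.

```python
import struct

CALL_REL32 = 0xE8   # x86 near call with a 32-bit relative offset
CALL_LENGTH = 5     # opcode byte plus four offset bytes


def find_return_addresses(stack: bytes, code: bytes, code_base: int,
                          word_size: int = 8) -> list[int]:
    """Return stack words that point just past a `call rel32` in `code`.

    Assumption: `code` is the machine code loaded at address `code_base`,
    and `stack` is a raw little-endian dump of the stack.
    """
    fmt = "<Q" if word_size == 8 else "<I"
    hits = []
    for offset in range(0, len(stack) - word_size + 1, word_size):
        (value,) = struct.unpack_from(fmt, stack, offset)
        index = value - code_base  # position of the pointed-to byte in `code`
        # The word must point into the code, with room for a call before it.
        if not (CALL_LENGTH <= index < len(code)):
            continue
        # Heuristic: the bytes immediately preceding look like a call.
        if code[index - CALL_LENGTH] == CALL_REL32:
            hits.append(value)
    return hits
```

Because the check is heuristic, a stack word that happens to follow a `0xE8` byte is reported even if it is not a genuine return address.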
  • Publication number: 20090210664
    Abstract: The present invention provides a system and method for a group priority issue schema for a cascaded pipeline. The system includes a cascaded delayed execution pipeline unit having four or more execution pipelines that execute instructions in a common issue group in a delayed manner relative to each other. The system further includes circuitry configured to: (1) receive an issue group of instructions; (2) schedule the instructions in the program order received; and (3) execute the issue group of instructions in the cascaded delayed execution pipeline unit. The present invention can also be viewed as providing methods for providing a group priority issue schema for a cascaded pipeline. The method includes: (1) receiving an issue group of instructions; (2) scheduling the instructions in the program order received; and (3) executing the issue group of instructions in the cascaded delayed execution pipeline unit.
    Type: Application
    Filed: February 15, 2008
    Publication date: August 20, 2009
    Inventor: David A. Luick
  • Publication number: 20090182986
    Abstract: A circuit arrangement and method utilize an issue rate-based predictive thermal management technique in a microprocessor or other integrated circuit that tracks the rate in which instructions are issued to one or more execution units in the processing unit, and selectively delays the issuance of subsequent instructions to the execution unit(s) based upon the tracked issue rate to predictively control the thermal output of the integrated circuit.
    Type: Application
    Filed: January 16, 2008
    Publication date: July 16, 2009
    Inventors: Stephen Joseph Schwinn, Matthew Ray Tubbs, Charles David Wait
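The issue-rate throttling idea in the abstract above can be sketched in software. This is a simplified model under stated assumptions: the rate is tracked over a fixed sliding window of cycles, and the class name `IssueRateThrottle` and the parameters `window` and `max_issues` are hypothetical, not taken from the application.

```python
from collections import deque


class IssueRateThrottle:
    """Delay instruction issue when the recent issue rate is too high."""

    def __init__(self, window: int, max_issues: int):
        self.max_issues = max_issues
        # One entry per recent cycle: 1 if an instruction issued, else 0.
        self.history = deque(maxlen=window)

    def try_issue(self, wants_issue: bool) -> bool:
        """Advance one cycle; return True if the instruction may issue."""
        recent = sum(self.history)  # issues within the tracked window
        issued = wants_issue and recent < self.max_issues
        self.history.append(1 if issued else 0)
        return issued
```

With `window=4` and `max_issues=2`, a unit that requests issue every cycle is allowed two issues and then held back until the earlier issues age out of the window.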
  • Patent number: 7562206
    Abstract: Microarchitecture policies and structures to predict execution clusters and facilitate inter-cluster communication are disclosed. In disclosed embodiments, sequentially ordered instructions are decoded into micro-operations. Execution of one set of micro-operations is predicted to involve execution resources to perform memory access operations and inter-cluster communication, but not to perform branching operations. Execution of a second set of micro-operations is predicted to involve execution resources to perform branching operations but not to perform memory access operations. The micro-operations are partitioned for execution in accordance with these predictions, the first set of micro-operations to a first cluster of execution resources and the second set of micro-operations to a second cluster of execution resources. The first and second sets of micro-operations are executed out of sequential order and are retired to represent their sequential instruction ordering.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: July 14, 2009
    Assignee: Intel Corporation
    Inventors: Avinash Sodani, Alexandre J. Farcy, Stephan J. Jourdan, Per Hammarlund, Mark C. Davis
  • Publication number: 20090177867
    Abstract: A digital signal processor includes a control block configured to issue instructions based on a stored program, and a compute array including two or more compute engines configured such that each of the issued instructions executes in successive compute engines of at least a subset of the compute engines at successive times. The digital signal processor may be utilized with a control processor or as a stand-alone processor. The compute array may be configured such that each of the issued instructions flows through successive compute engines of at least a subset of the compute engines at successive times.
    Type: Application
    Filed: January 9, 2008
    Publication date: July 9, 2009
    Applicant: Analog Devices, Inc.
    Inventor: Douglas Garde
  • Patent number: 7552313
    Abstract: A VLIW digital signal processor is composed of a program memory including first to n-th banks, first to n-th address counters, a fetch block, and an instruction executing section. The first to n-th banks store therein first to n-th programs, respectively. The first to n-th address counters respectively indicate addresses at which the next instructions to be executed, selected out of VLIW instructions within said first to n-th programs, are stored in said first to n-th banks. The fetch block is configured to fetch said next instructions from said addresses, respectively, and to generate a resultant VLIW instruction from said next instructions. The instruction executing section is configured to receive said resultant VLIW instruction, and to execute said resultant VLIW instruction in a single instruction executing cycle.
    Type: Grant
    Filed: December 13, 2004
    Date of Patent: June 23, 2009
    Assignee: NEC Electronics Corporation
    Inventor: Kazuhiko Tabei
  • Patent number: 7552434
    Abstract: An embodiment of a method of performing a kernel level task upon initial execution of a child process at a user level begins with setting an instruction pointer for an initial child process instruction to an instruction to enter a kernel level. The method continues with beginning the child process which places a return value in a register for the child process and which causes the child process to enter the kernel level. The method concludes with executing a system call having a system call number of the return value. The system call comprises the kernel level task.
    Type: Grant
    Filed: April 30, 2004
    Date of Patent: June 23, 2009
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Yoshio Frank Turner, Dinesh Kumar Subhraveti, Gopalakrishnan Janakiraman, Jose Renato Santos
  • Publication number: 20090144526
    Abstract: A method of accessing a device is provided. A command is received from an agent, over a network, for executing at least one instruction for accessing the device. Information is sent to the agent, over the network, regarding the execution of the at least one instruction.
    Type: Application
    Filed: November 30, 2007
    Publication date: June 4, 2009
    Applicant: INFINEON TECHNOLOGIES AG
    Inventors: Jurijus Cizas, Mark Stafford
  • Patent number: 7533248
    Abstract: A multithreaded processor including a shared functional unit. In one embodiment, the multithreaded processor includes a functional unit coupled to a multithreaded instruction source that may request access to use the functional unit. The multithreaded processor may also include a processing unit that is coupled to request access to use the functional unit. The functional unit may be configured to execute one of an instruction provided by the multithreaded instruction source and an operation provided by the processing unit in a given cycle dependent upon which of the multithreaded instruction source and the processing unit has a higher priority.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: May 12, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Robert T. Golla, Gregory F. Grohoski
  • Publication number: 20090119490
    Abstract: An instruction scheduling method and a processor using an instruction scheduling method are provided. The instruction scheduling method includes selecting a first instruction that has a highest priority from a plurality of instructions, allocating the selected first instruction and a first time slot to one of the functional units, and allocating a second instruction and a second time slot to one of the functional units, wherein the second instruction is dependent on the first instruction.
    Type: Application
    Filed: March 20, 2008
    Publication date: May 7, 2009
    Inventors: Taewook Oh, Hong-Seok Kim, Scott Mahlke, Hyun Chul Park
  • Publication number: 20090113179
    Abstract: The present invention provides an operational processing apparatus which can guarantee a period for executing instructions in the shortest cycle when the operational processing apparatus synchronizes with a hardware accelerator. A processor in the present invention simultaneously issues and executes instructions including instruction groups having a simultaneously issuable instruction. The processor executes a program including a specific instruction. The specific instruction instructs to exclude an instruction subsequent to the specific instruction out of the instruction groups including the specific instruction, and to suspend issuing the instruction subsequent to the specific instruction only during a predetermined period immediately after the specific instruction is issued.
    Type: Application
    Filed: October 28, 2008
    Publication date: April 30, 2009
    Applicant: PANASONIC CORPORATION
    Inventors: Masahide KAKEDA, Shinji OZAKI, Takao YAMAMOTO
  • Patent number: 7523295
    Abstract: An interleaved multithreading pipeline operating method comprises reading an instruction packet containing at least two instructions, steering a first instruction of the instruction packet to a first execution unit for execution and generating a first result, steering a second instruction of the instruction packet to a second execution unit for execution using the first result and generating a second result, and storing the second result.
    Type: Grant
    Filed: March 21, 2005
    Date of Patent: April 21, 2009
    Assignee: QUALCOMM Incorporated
    Inventors: Lucian Codrescu, Erich Plondke, Muhammad Ahmed, Sujat Jamil, William C. Anderson
  • Patent number: 7523297
    Abstract: Methods and circuitry for processing a shadow scan instruction in a multi-threaded microprocessing environment include a bit sequence having a thread identifier, core identifiers and a shadow scan instruction. The core identifiers are assigned a state to identify microprocessor cores of a multi-core structure and are processed combinationally to determine if the shadow scan instruction is to be processed through a thread of the identified core. The processing of the shadow scan instruction through the thread of each of the identified cores is accomplished by a single load operation of the shadow scan instruction into the JTAG TAP controller.
    Type: Grant
    Filed: September 16, 2005
    Date of Patent: April 21, 2009
    Assignee: Sun Microsystems, Inc.
    Inventor: Roger C. Mistely
  • Patent number: 7509482
    Abstract: A memory device stores entries waiting to be processed. Row numbers of matrix information correspond to storage positions within the memory device, column numbers correspond to positions within the order of the entries, and every matrix element corresponding to the storage position and the position within the order of the entry stored in this storage position has a predetermined value. An operation between the first vector information indicating storage positions of processable entries and each column of the matrix information is performed and the second vector information indicating positions within the order of the processable entries is generated. Then, a position to be processed is selected from among the positions of processable entries indicated by the second vector information, an element having the predetermined value in the column corresponding to the selected position is obtained, and an entry in the storage position corresponding to the element is processed.
    Type: Grant
    Filed: June 16, 2006
    Date of Patent: March 24, 2009
    Assignee: Fujitsu Limited
    Inventors: Takuji Takahashi, Masahiro Kuramoto
  • Patent number: 7509483
    Abstract: A computing architecture and software techniques are described which modify the basic sequential instruction fetching mechanism of a processor by separating a program's control flow from its functional execution flow. A compiled sequential HLL program's static control structures are analyzed and a separate program based on its own unique instructions is created that primarily generates addresses for the selection of functional execution instructions. The original program is now represented by an instruction fetch program and a set of function/logic execution instructions. This basic split allows multiple instruction addresses to be generated in parallel to access multiple instruction memories. These multiple instruction memories contain only the function/logic instructions of the program and no control structure operations such as branches or calls. All the original program's control instructions are split from the original program and used to create the instruction addressing program.
    Type: Grant
    Filed: February 22, 2007
    Date of Patent: March 24, 2009
    Assignee: Renesky Tap III, Limited Liability Company
    Inventor: Gerald George Pechanek
  • Publication number: 20090070558
    Abstract: The present invention provides a probe system and method for multithreaded user-space programs. The system includes an instrumentation module that enables single stepping out of line processing for multithreaded programs, an establish probepoint module that divides up an area of the probed program's memory into a plurality of instruction slots, an ensure slot assigned module that ensures that an instruction slot is assigned to a probepoint, a slot acquisition module that acquires the instruction slot for the probepoint, stealing a slot from another probepoint as needed, and a free slot module that relinquishes the instruction slot owned by the probepoint when the probepoint is being unregistered.
    Type: Application
    Filed: September 11, 2007
    Publication date: March 12, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Prasadarao Akulavenkatavara, Gerrit Huizenga, James A. Keniston
  • Patent number: 7502912
    Abstract: A method and apparatus for rescheduling operations in a processor. More particularly, the present invention relates to optimally using a scheduler resource in a processor by analyzing, predicting, and sorting the write order of instructions into the scheduler so that the duration the instructions sit idle in the scheduler is minimized. The analysis, prediction, and sorting may be done between an instruction queue and a scheduler by using delay units. The prediction can be based on history (latency, dependency, and resource) or on a general prediction scheme.
    Type: Grant
    Filed: December 30, 2003
    Date of Patent: March 10, 2009
    Assignee: Intel Corporation
    Inventors: Avinash Sodani, Per H. Hammarlund, Stephan J. Jourdan
  • Patent number: 7502914
    Abstract: In one embodiment, a processor comprises one or more execution resources configured to execute instruction operations and a scheduler coupled to the execution resources. The scheduler is configured to maintain an ancestor tracking vector (ATV) corresponding to each given instruction operation in the scheduler, wherein the ATV identifies instruction operations which can cause the given instruction operation to replay. The scheduler is configured to set the ATV of the given instruction operation to a null value in response to the given instruction operation being dispatched to the scheduler, and is configured to create the ATV of the given instruction operation dynamically as source operands of the given instruction operation are resolved.
    Type: Grant
    Filed: July 31, 2006
    Date of Patent: March 10, 2009
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Ashutosh S. Dhodapkar
  • Patent number: 7502913
    Abstract: Systems and methods for switch prefetch in multicore computer chips can allow a programmer to tailor operations of a computer program to available data. Control-flow decisions can be made by the program based on the availability of data in a cache. For example, a new instruction in a processor instruction set can receive a list comprising pairs of data addresses and code addresses. The processor can look for data items corresponding to the listed data addresses, and find the first available data item in the cache. When a cached data item is found, control is transferred to the code address supplied in the table. If no data is in the cache, then the processor can stall until the most quickly fetched data item is available.
    Type: Grant
    Filed: June 16, 2006
    Date of Patent: March 10, 2009
    Assignee: Microsoft Corporation
    Inventor: Paul R. Barham
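The control-flow decision described in the abstract above can be modeled in a few lines. This is a sketch under stated assumptions: the cache is a plain set of resident addresses, the function name `switch_prefetch` is hypothetical, and the `fetch_latency` mapping is an illustrative stand-in for whatever determines which item arrives first.

```python
def switch_prefetch(table, cache, fetch_latency):
    """Pick the code address to jump to, based on which data is cached.

    table:          list of (data_addr, code_addr) pairs, in priority order
    cache:          set of data addresses currently resident (assumed model)
    fetch_latency:  dict mapping data_addr to fetch time (assumed model)
    """
    # Transfer control to the code paired with the first cached data item.
    for data_addr, code_addr in table:
        if data_addr in cache:
            return code_addr
    # Nothing is cached: stall until the most quickly fetched item arrives.
    data_addr, code_addr = min(table, key=lambda pair: fetch_latency[pair[0]])
    cache.add(data_addr)
    return code_addr
```

For example, with a table listing addresses `0x10`, `0x20`, and `0x30` and only `0x20` and `0x30` cached, control goes to the code paired with `0x20`.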
  • Patent number: 7500086
    Abstract: One embodiment of the present invention supports execution of a start transactional execution (STE) instruction, which marks the beginning of a block of instructions to be executed transactionally. Upon encountering the STE instruction during execution of a program, the system commences transactional execution of the block of instructions following the STE instruction. Changes made during this transactional execution are not committed to the architectural state of the processor until the transactional execution successfully completes.
    Type: Grant
    Filed: December 6, 2005
    Date of Patent: March 3, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Marc Tremblay, Shailender Chaudhry, Quinn A. Jacobson
  • Publication number: 20090055626
    Abstract: A method of sharing a coarse grained array and a processor using the method are provided. A processor includes a first processor core including a plurality of first functional units which execute a first instruction set, a second processor core including a plurality of second functional units which execute a second instruction set, and a coarse grained array including a plurality of third functional units which execute a portion of instructions of the first instruction set and/or the second instruction set, instead of the first processor core and/or the second processor core.
    Type: Application
    Filed: February 18, 2008
    Publication date: February 26, 2009
    Inventors: Yeon Gon CHO, Suk Jin Kim, Sang Suk Lee, Junhee Kim, Jeongwook Kim
  • Patent number: 7496490
    Abstract: Core model processing of a processor model PE1 and a processor model PE2 is serialized. Therefore, processing time for the inter-core-model communication is required between the core model processing of a first processor model and the core model processing of a second processor model. The inter-core-model communication processing is performed such that the inter-core-model communication required for the simulation processing of a multi-processor model is performed in parallel with the core model processing.
    Type: Grant
    Filed: February 28, 2006
    Date of Patent: February 24, 2009
    Assignee: Fujitsu Microelectronics Limited
    Inventors: Masato Tatsuoka, Atsushi Ike
  • Patent number: 7496899
    Abstract: Techniques for preventing the loss of trace information being transmitted via trace infrastructure are disclosed. A data processing apparatus for processing instructions is provided.
    Type: Grant
    Filed: August 17, 2005
    Date of Patent: February 24, 2009
    Assignee: ARM Limited
    Inventors: Stephen John Hill, Glen Andrew Harris, David James Williamson
  • Publication number: 20090049278
    Abstract: A multiprocessor data processing system (MDPS) with a weakly-ordered architecture providing processing logic for substantially eliminating issuing sync instructions after every store instruction of a well-behaved application. Instructions of a well-behaved application are translated and executed by a weakly-ordered processor. The processing logic includes a lock address tracking utility (LATU), which provides an algorithm and a table of lock addresses, within which each lock address is stored when the lock is acquired by the weakly-ordered processor. When a store instruction is encountered in the instruction stream, the LATU compares the target address of the store instruction against the table of lock addresses. If the target address matches one of the lock addresses, indicating that the store instruction is the corresponding unlock instruction (or lock release instruction), a sync instruction is issued ahead of the store operation.
    Type: Application
    Filed: October 28, 2008
    Publication date: February 19, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: ANDREW DUNSHEA, SATYA PRAKASH SHARMA, MYSORE SATHYANARAYANA SRINIVAS
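The lock-address comparison described in the abstract above can be illustrated with a small model. This is a sketch, not the application's logic: the class name `LockAddressTracker` and its methods are hypothetical, and removing the address from the table on release is an assumption the abstract does not state.

```python
class LockAddressTracker:
    """Emit a sync only before stores that release a tracked lock."""

    def __init__(self):
        self.lock_addresses = set()  # table of currently held lock addresses

    def on_lock_acquired(self, addr: int) -> None:
        self.lock_addresses.add(addr)

    def on_store(self, addr: int) -> list:
        """Return the operations to emit for a store to `addr`."""
        if addr in self.lock_addresses:
            # The store is treated as the unlock: sync goes ahead of it.
            self.lock_addresses.discard(addr)  # assumption: drop on release
            return ["sync", ("store", addr)]
        # Ordinary store: no sync is issued.
        return [("store", addr)]
```

Only the store that targets a held lock address pays for the sync; all other stores pass through unchanged.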
  • Patent number: 7493475
    Abstract: An improved superscalar processor. The processor includes multiple lanes, allowing multiple instructions in a bundle to be executed in parallel. In vector mode, the parallel lanes may be used to execute multiple instances of a bundle, representing multiple iterations of the bundle in a vector run. Scheduling logic determines whether, for each bundle, multiple instances can be executed in parallel. If multiple instances can be executed in parallel, coupling circuitry couples an instance of the bundle from one lane into one or more other lanes. In each lane, register addresses are renamed to ensure proper execution of the bundles in the vector run. Additionally, the processor may include a register bank separate from the architectural register file. Renaming logic can generate addresses to this separate register bank that are longer than used to address architectural registers, allowing longer vectors and more efficient processor operation.
    Type: Grant
    Filed: November 15, 2006
    Date of Patent: February 17, 2009
    Assignee: STMicroelectronics, Inc.
    Inventor: Osvaldo M. Colavin
  • Patent number: 7493469
    Abstract: From an application program described in the form of a flow graph, input and output arcs are extracted. Packet rates on the input and output arcs are extracted, and it is determined whether the packet rates of the input arc and the output arc are lower than an upper-limit value of a pipeline transfer rate of a processor element. Based on the determination result, it is determined whether it is possible to execute the described flow graph program in the processor element. Performance evaluation of a program to be executed by a data driven processor based on an asynchronous pipeline transfer control can be carried out with ease and in a short time.
    Type: Grant
    Filed: March 14, 2005
    Date of Patent: February 17, 2009
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Ricardo T. Shichiku, Shinichi Yoshida
  • Patent number: 7490224
    Abstract: Tracking the order of issued instructions using a counter is presented. In one embodiment, a saturating, decrementing counter is used. The counter is initialized to a value that corresponds to the processor's commit point. Instructions are issued from a first issue queue to one or more execution units and one or more second issue queues. After being issued by the first issue queue, the counter associated with each instruction is decremented during each instruction cycle until the instruction is executed by one of the execution units. Once the counter reaches zero, the instruction will be completed by the execution unit. If a flush condition occurs, instructions with counters equal to zero are maintained (i.e., not flushed or invalidated), while other instructions in the pipeline are invalidated based upon their counter values.
    Type: Grant
    Filed: October 7, 2005
    Date of Patent: February 10, 2009
    Assignee: International Business Machines Corporation
    Inventors: Christopher Michael Abernathy, Jonathan James DeMent, Ronald Hall, Robert Alan Philhower, David Shippy
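    Illustration (not part of the patent record): the counter scheme in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions: the names `COMMIT_POINT`, `InFlight`, `flush` and `run_demo` are hypothetical, one instruction is issued per cycle, and the flush rule is simplified to "invalidate everything whose counter is still above zero".

```python
from dataclasses import dataclass
from typing import Dict, List

COMMIT_POINT = 3  # assumed: number of cycles from issue to the commit point


@dataclass
class InFlight:
    name: str
    counter: int = COMMIT_POINT  # initialized to the commit-point value
    valid: bool = True

    def tick(self) -> None:
        # Saturating decrement: the counter stops at zero.
        self.counter = max(0, self.counter - 1)


def flush(pipeline: List[InFlight]) -> None:
    # Instructions whose counter has reached zero are kept;
    # all other in-flight instructions are invalidated (simplified rule).
    for insn in pipeline:
        if insn.counter > 0:
            insn.valid = False


def run_demo() -> Dict[str, bool]:
    pipeline: List[InFlight] = []
    # Issue one instruction per cycle; older instructions age by one tick each cycle.
    for name in ["a", "b", "c", "d"]:
        for insn in pipeline:
            insn.tick()
        pipeline.append(InFlight(name))
    flush(pipeline)
    return {insn.name: insn.valid for insn in pipeline}
```

    In this sketch only the oldest instruction has counted down to zero when the flush arrives, so it alone survives.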
  • Patent number: 7490219
    Abstract: In a pipeline processor, the invention aims to make a busy judgment of a register without fail, at a stage in which a request is retained halfway in the pipeline register, without increasing the hardware resources for storing a request into the register provided at the final stage of the pipeline register. To this end, it provides a judgment section, interposed in the pipeline register, that judges whether a request is valid; a first counter that counts the number of valid requests in the registers between the judgment section and a request queue; and a busy judgment section that judges whether the request queue is in a busy state based on the number of valid requests counted by the first counter. The judgment section makes its judgment based on the result of the busy state judgment by the busy judgment section.
    Type: Grant
    Filed: June 22, 2005
    Date of Patent: February 10, 2009
    Assignee: Fujitsu Limited
    Inventors: Takao Matsui, Yuka Hosokawa, Makoto Hataida, Toshikazu Ueki, Seishi Okada
  • Publication number: 20090037697
    Abstract: A system and method for data forwarding from a store instruction to a load instruction during out-of-order execution, when the load instruction address matches against multiple older uncommitted store addresses or if the forwarding fails during the first pass for any other reason. In a first pass, the youngest store instruction in program order of all store instructions older than a load instruction is found and an indication to the store buffer entry holding information of the youngest store instruction is recorded. In a second pass, the recorded indication is used to index the store buffer and the store bypass data is forwarded to the load instruction. Simultaneously, it is verified that no new store, younger than the previously identified store and older than the load, has been issued due to out-of-order execution.
    Type: Application
    Filed: August 1, 2007
    Publication date: February 5, 2009
    Inventors: Krishnan Ramani, Gary Lauterbach
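    Illustration (not part of the patent record): the two-pass forwarding described above might look like the sketch below. The store buffer is modeled as a plain list, and `StoreEntry`, `first_pass` and `second_pass` are hypothetical names; the abstract does not specify how program order is encoded, so a sequence number (smaller means older) is assumed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreEntry:
    seq: int   # assumed program-order sequence number (smaller = older)
    addr: int
    data: int


def first_pass(store_buffer: List[StoreEntry], load_seq: int, load_addr: int) -> Optional[int]:
    # Find the youngest matching store that is older than the load
    # and record the index of its store buffer entry.
    best = None
    for idx, st in enumerate(store_buffer):
        if st.addr == load_addr and st.seq < load_seq:
            if best is None or st.seq > store_buffer[best].seq:
                best = idx
    return best


def second_pass(store_buffer: List[StoreEntry], recorded_idx: int,
                load_seq: int, load_addr: int) -> Optional[int]:
    # Index the store buffer with the recorded entry and forward its data,
    # while checking that no matching store between that entry and the load
    # has shown up in the meantime.
    picked = store_buffer[recorded_idx]
    for st in store_buffer:
        if st.addr == load_addr and picked.seq < st.seq < load_seq:
            return None  # forwarding from the recorded entry would be stale
    return picked.data
```

    For example, with stores at sequence numbers 1 and 5 to the same address and a load at sequence number 9, the first pass records the entry for store 5. If a store with sequence number 7 to that address is issued late, the second pass reports a failure instead of forwarding.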
  • Patent number: 7487335
    Abstract: One embodiment of the present invention provides a system that facilitates deferring execution of instructions with unresolved data dependencies as they are issued for execution in program order. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system generates a checkpoint, which includes a checkpointed version of the register file. Next, the system defers the instruction, which involves storing the instruction along with any resolved source operands for the instruction into a deferred buffer. The system then executes subsequent instructions in an execute-ahead mode which operates on a future version of the register file, wherein instructions that cannot be executed because of unresolved data dependencies are deferred, and wherein other non-deferred instructions are executed in program order.
    Type: Grant
    Filed: October 14, 2005
    Date of Patent: February 3, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Shailender Chaudhry, Syed I. Haq, Mohammed M. Rahman, Khanh Luu
  • Publication number: 20090024837
    Abstract: Described are a system and method for language specification. The device may include (a) a processor running an operating system and (b) an image capturing device scanning an image. The operating system is configured to display a user interface. The image includes data that is transmitted to the operating system to execute a language installation protocol. The protocol indicates a display language for the user interface.
    Type: Application
    Filed: July 17, 2007
    Publication date: January 22, 2009
    Inventor: Joel Brand
  • Publication number: 20090024838
    Abstract: A mechanism for suppressing instruction replay includes a processor having one or more execution units and a scheduler that issues instruction operations for execution by the one or more execution units. The scheduler may also cause instruction operations that are determined to be incorrectly executed to be replayed, or reissued. In addition, a prediction unit within the processor may predict whether a given instruction operation will replay and provide an indication that the given instruction operation will replay. The processor also includes a decode unit that may decode instructions and in response to detecting the indication, may flag the given instruction operation. The scheduler may further inhibit issue of the flagged instruction operation until a status associated with the flagged instruction is good.
    Type: Application
    Filed: July 20, 2007
    Publication date: January 22, 2009
    Inventors: Ashutosh S. Dhodapkar, Michael G. Butler, Gene W. Shen
  • Patent number: 7480771
    Abstract: We propose a class of mechanisms to support a new style of synchronization that offers simple and efficient solutions to several existing problems for which existing solutions are complicated, expensive, and/or otherwise inadequate. In general, the proposed mechanisms allow a program to read from a first memory location (called the “flagged” location), and to then continue execution, storing values to zero or more other memory locations such that these stores take effect (i.e., become visible in the memory system) only while the flagged memory location does not change. In some embodiments, the mechanisms further allow the program to determine when the first memory location has changed. We call the proposed mechanisms conditional multi-store synchronization mechanisms.
    Type: Grant
    Filed: August 17, 2006
    Date of Patent: January 20, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Mark S. Moir, Robert E. Cypher, Paul N. Loewenstein
  • Publication number: 20090019262
    Abstract: An electronic circuit (4000) includes a bias value generator circuit (3900) operable to supply a varying bias value in a programmable range, and an instruction circuit (3625, 4010) responsive to a first instruction to program the range of said bias value generator circuit (3900) and further responsive to a second instruction having an operand to repeatedly issue said second instruction with said operand varied in an operand value range determined as a function of the varying bias value.
    Type: Application
    Filed: May 22, 2008
    Publication date: January 15, 2009
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Kenichi Tashiro, Hiroyuki Mizuno, Yuji Umemoto
  • Patent number: 7478225
    Abstract: An apparatus and method to support pipelining of variable-latency instructions in a multithreaded processor. In one embodiment, a processor may include instruction fetch logic configured to issue a first and a second instruction from different ones of a plurality of threads during successive cycles. The processor may also include first and second execution units respectively configured to execute shorter-latency and longer-latency instructions and to respectively write shorter-latency or longer-latency instruction results to a result write port during a first or second writeback stage. The first writeback stage may occur a fewer number of cycles after instruction issue than the second writeback stage. The instruction fetch logic may be further configured to guarantee result write port access by the second execution unit during the second writeback stage by preventing the shorter-latency instruction from issuing during a cycle for which the first writeback stage collides with the second writeback stage.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: January 13, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Jeffrey S. Brooks, Christopher H. Olson, Robert T. Golla
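    Illustration (not part of the patent record): one way to picture the write-port guarantee above is a set of writeback cycles reserved by longer-latency instructions. This is a single-stream simplification that ignores the multithreading aspect; the latencies and the names `SHORT_LATENCY`, `LONG_LATENCY` and `issue` are assumptions made for the example.

```python
from typing import List, Tuple

SHORT_LATENCY = 2  # assumed cycles from issue to the first writeback stage
LONG_LATENCY = 5   # assumed cycles from issue to the second writeback stage


def issue(kinds: List[str]) -> List[Tuple[int, str]]:
    """Return (issue_cycle, kind) pairs for a stream of 'short'/'long' instructions."""
    reserved = set()  # writeback cycles already claimed by longer-latency instructions
    issued = []
    cycle = 0
    for kind in kinds:
        if kind == "long":
            reserved.add(cycle + LONG_LATENCY)
        else:
            # Hold the shorter-latency instruction back while its writeback
            # would collide with a reserved longer-latency writeback.
            while cycle + SHORT_LATENCY in reserved:
                cycle += 1
        issued.append((cycle, kind))
        cycle += 1
    return issued
```

    With these latencies, a short instruction that would issue three cycles after a long one is delayed by one cycle, so the two never reach the result write port in the same cycle.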
  • Patent number: 7475225
    Abstract: Microarchitecture policies and structures partition execution resource clusters. In disclosed microarchitecture embodiments, micro-operations representing a sequential instruction ordering are partitioned into two sets. To one set of micro-operations execution resources are allocated from a cluster of execution resources that can perform memory access operations but not branching operations. To the other set of micro-operations execution resources are allocated from a cluster of execution resources that can perform branching operations but not memory access operations. The first and second sets of micro-operations may be executed out of sequential order but are retired to represent their sequential instruction ordering.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: January 6, 2009
    Assignee: Intel Corporation
    Inventors: Stephan J. Jourdan, Avinash Sodani, Alexandre J. Farcy, Per Hammarlund, Sebastien Hily, Mark C. Davis
  • Publication number: 20080320282
    Abstract: Methods and systems are described for providing transaction support for executable program components. In one embodiment, transaction information is associated with an instruction included in an executable addressable entity included in an executable program component generated from source code written in a programming language, wherein the transaction information is independent of the source code and the programming language. Further, an access to the instruction is detected for executing by a processor. A transaction operation to perform in association with the executing of the instruction is determined based on the transaction information associated with the instruction. The transaction operation is performed in association with the executing of the instruction, wherein the transaction operation is performed by a program component other than the executable program component including the executable addressable entity.
    Type: Application
    Filed: June 22, 2007
    Publication date: December 25, 2008
    Inventor: Robert P. Morris
  • Publication number: 20080313432
    Abstract: A method and apparatus improves the block allocation time in a parallel computer system. A pre-load controller pre-loads blocks of hardware in a supercomputer cluster in anticipation of demand from a user application. In the preferred embodiments the pre-load controller determines when to pre-load the compute nodes and the block size to allocate the nodes based on pre-set parameters and previous use of the computer system. Further, in preferred embodiments each block of compute nodes in the parallel computer system has a stored hardware status to indicate whether the block is being pre-loaded, or already has been pre-loaded. In preferred embodiments, the hardware status is stored in a database connected to the computer's control system. In other embodiments, the compute nodes are remote computers in a distributed computer system.
    Type: Application
    Filed: August 14, 2008
    Publication date: December 18, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jay Symmes Bryant, Daniel Paul Kolz, Dharmesh J. Patel
  • Patent number: 7464253
    Abstract: A method and apparatus for improving the operation of an out-of-order computer processor by utilizing and managing instruction wakeup using pointers with an instruction queue payload random-access memory, a mapping table, and a multiple wake-up table. Instructions allocated to the instruction queue are identified by association with a physical destination register used to index in the mapping table to provide dependent instruction information for instruction wakeup for scalable instruction queue design, reduced power consumption, and fast branch mis-prediction recovery, without the use of content-addressable memory cells.
    Type: Grant
    Filed: October 2, 2006
    Date of Patent: December 9, 2008
    Assignee: The Regents of the University of California
    Inventors: Alexander V. Veidenbaum, Marco Antonio Ramirez Salinas, Adrian Cristal Kestelman, Mateo Valero Cortes
  • Publication number: 20080301694
    Abstract: Within a data processing system, one or more register files are assigned to respective states of a graph for each of a plurality of clock cycles. A plurality of edges are inserted to form connections between the states of the graph, with respective weights being assigned to each of the edges. A best route through the graph is then determined based, at least in part, on the weights assigned to the edges.
    Type: Application
    Filed: August 15, 2008
    Publication date: December 4, 2008
    Inventor: Peter Mattson
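    Illustration (not part of the patent record): the abstract above does not say how the best route is computed or what "best" means. The sketch below assumes the lowest total edge weight and uses a simple dynamic program over clock cycles; `best_route`, `demo_weight` and the weights themselves are hypothetical.

```python
from typing import Callable, List, Tuple


def best_route(num_cycles: int, register_files: List[str],
               edge_weight: Callable[[int, str, str], int]) -> Tuple[List[str], int]:
    # cost[rf] = lowest total weight of any route that ends in rf at the current cycle
    cost = {rf: 0 for rf in register_files}
    back = []  # per cycle: best predecessor for each register file
    for cycle in range(1, num_cycles):
        new_cost, choice = {}, {}
        for dst in register_files:
            src = min(register_files, key=lambda s: cost[s] + edge_weight(cycle, s, dst))
            new_cost[dst] = cost[src] + edge_weight(cycle, src, dst)
            choice[dst] = src
        cost = new_cost
        back.append(choice)
    # Walk the recorded predecessors backwards from the cheapest final state.
    end = min(register_files, key=lambda rf: cost[rf])
    route = [end]
    for choice in reversed(back):
        route.append(choice[route[-1]])
    route.reverse()
    return route, cost[end]


def demo_weight(cycle: int, src: str, dst: str) -> int:
    # Assumed weights: switching register files costs 2; file "B" is
    # expensive in cycle 1 and file "A" is expensive in cycle 2.
    move = 0 if src == dst else 2
    busy = 5 if (cycle, dst) in {(1, "B"), (2, "A")} else 0
    return move + busy
```

    Calling `best_route(4, ["A", "B"], demo_weight)` yields a route that stays in file "A" for the first two cycles and then moves to "B", paying only the single switching cost.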
  • Publication number: 20080301411
    Abstract: A method of operating an arithmetic logic unit (ALU) by inverting a result of an operation to be executed during a current cycle in response to control signals from instruction decode logic which indicate that a later operation will require a complement of the result, wherein the result is inverted during the current cycle. The later operation may be a subtraction operation that immediately follows the first operation. The later instruction is decoded prior to the current cycle to control the inversion in the ALU. The ALU includes an adder, a rotator, and a data manipulation unit which invert the result during the current cycle in response to an invert control signal. The second operation subtracts the result during a subsequent cycle in which a carry control signal to the adder is enabled, and the rotator and the data manipulation unit are disabled. The ALU may be used in an execution unit of a microprocessor, such as a fixed-point unit.
    Type: Application
    Filed: August 12, 2008
    Publication date: December 4, 2008
    Inventors: Brian William Curran, Ashutosh Goyal, Michael Thomas Vaden, David Allan Webber
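    Illustration (not part of the patent record): the early inversion described above relies on the two's-complement identity a - r = a + ~r + 1. The sketch below assumes a 32-bit datapath and an addition as the first operation; `MASK`, `alu_add`, `first_op_inverted` and `subtract_using_inverted` are hypothetical names.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit datapath


def alu_add(a: int, b: int, carry_in: int = 0) -> int:
    # Adder with an explicit carry input, truncated to the datapath width.
    return (a + b + carry_in) & MASK


def first_op_inverted(x: int, y: int) -> int:
    # First operation (an add in this sketch) with the invert control asserted:
    # the complement of the result is produced in the same cycle.
    return ~alu_add(x, y) & MASK


def subtract_using_inverted(a: int, inverted_result: int) -> int:
    # a - r == a + ~r + 1, so the later subtraction becomes a plain add
    # with the carry input enabled.
    return alu_add(a, inverted_result, carry_in=1)
```

    Here `subtract_using_inverted(100, first_op_inverted(7, 5))` computes 100 - (7 + 5) without a separate inversion step in the second cycle.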