Instruction Issuing Patents (Class 712/214)
  • Patent number: 8505015
    Abstract: A “group work sorting” technique is used in a parallel computing system that executes multiple items of work across multiple parallel processing units, where each parallel processing unit processes one or more of the work items according to their positions in a prioritized work queue that corresponds to the parallel processing unit. When implementing the technique, one or more of the parallel processing units receives a new work item to be placed into a first work queue that corresponds to the parallel processing unit and receives data that indicates where one or more other parallel processing units would prefer to place the new work item in the prioritized work queues that correspond to the other parallel processing units. The parallel processing unit uses the received data as a guide in placing the new work item into the first work queue.
    Type: Grant
    Filed: October 29, 2009
    Date of Patent: August 6, 2013
    Assignee: Teradata US, Inc.
    Inventor: Curtis Stehley
  • Publication number: 20130191616
    Abstract: In a vector processing device, a data dependence detecting unit detects a data dependence relation between a preceding instruction and a succeeding instruction which are inputted from an instruction buffer, and an instruction issuance control unit controls issuance of an instruction based on a detection result thereof. When there is a data dependence relation between the preceding instruction and the succeeding instruction, the instruction issuance control unit generates a new instruction equivalent to processing related to a vector register including the data dependence relation with the succeeding instruction in processing executed by the preceding instruction and issues the new instruction between the preceding instruction and the succeeding instruction, and thereby a data hazard can be avoided between the preceding instruction and the succeeding instruction without making a stall occur.
    Type: Application
    Filed: December 13, 2012
    Publication date: July 25, 2013
    Applicant: FUJITSU SEMICONDUCTOR LIMITED
    Inventor: FUJITSU SEMICONDUCTOR LIMITED
  • Publication number: 20130185542
    Abstract: An external Auxiliary Execution Unit (AXU) interface is provided between a processing core disposed in a first programmable chip and an off-chip AXU disposed in a second programmable chip to integrate the AXU with an issue unit, a fixed point execution unit, and optionally other functional units in the processing core. The external AXU interface enables the issue unit to issue instructions to the AXU in much the same manner as the issue unit would be able to issue instructions to an AXU that was disposed on the same chip. By doing so, the AXU on the second programmable chip can be designed, tested and verified independent of the processing core on the first programmable chip, thereby enabling a common processing core, which has been designed, tested, and verified, to be used in connection with multiple different AXU designs.
    Type: Application
    Filed: January 18, 2012
    Publication date: July 18, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Corey V. Swenson
  • Patent number: 8489863
    Abstract: An information handling system includes a processor with an instruction issue queue (IQ) that may perform age tracking operations. The issue queue IQ maintains or stores instructions that may issue out-of-order in an internal data store (IDS). The IDS organizes instructions in a queue position (QPOS) addressing arrangement. An age matrix of the IQ maintains a record of relative instruction aging for those instructions within the IDS. The age matrix updates latches or other memory cell data to reflect the changes in IDS instruction ages during a dispatch operation into the IQ. During dispatch of one or more instructions, the age matrix may update only those latches that require data change to reflect changing IDS instruction ages. The age matrix employs row and column data and clock controls to individually update those latches requiring update.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: July 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: James Wilson Bishop, Mary Douglass Brown, Jeffrey Carl Brownscheidle, Robert Allen Cordes, Maureen Anne Delaney, Jafar Nahidi, Dung Quoc Nguyen, Joel Abraham Silberman
  • Publication number: 20130166885
    Abstract: When an instruction is executed on an integrated circuit (IC), an activity level and temperature are measured. A relationship between the activity level and temperature is determined, to estimate the temperature from the activity level. The activity level is monitored and is input to a scheduler, which estimates the IC temperature based on the activity level. The scheduler distributes work taking into account the temperature of various IC regions and may include distributing work to the IC region that has a lowest estimated temperature or relatively lower estimated temperature (e.g., lower than the average IC or IC region temperature). When the utilization level of one or more IC regions is high, the scheduler is configured to reduce the clock speed or the voltage of the one or more IC regions, or flag the regions as being unavailable for additional workload.
    Type: Application
    Filed: June 22, 2012
    Publication date: June 27, 2013
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Karthik Ramani, Stephen Presant, John Brothers
  • Publication number: 20130145127
    Abstract: A data processing system is provided in which destination operands to be stored within architectural registers are constrained to have zero values added as prefixes in order that the architectural register value has a fixed bit width irrespective of the bit width of the destination operand being written thereto. Instead of adding these zero values everywhere in the data path, they are instead represented by zero flags in at least the physical registers utilised for register renaming operations and in the result queue prior to results being written to the architectural register file. This saves circuitry resources and reduces energy consumption.
    Type: Application
    Filed: December 6, 2011
    Publication date: June 6, 2013
    Applicant: ARM LIMITED
    Inventors: James Nolan Hardage, Glen Andrew Harris, Mark Carpenter Glass
  • Patent number: 8458716
    Abstract: Methods, systems, and computer program products for operating an enterprise resource planning system. The method includes running a placeholder job in said enterprise resource planning system in response to a request from at least one client application for notification of at least one background processing event, wherein the placeholder job is executed in response to the at least one background processing event.
    Type: Grant
    Filed: December 31, 2008
    Date of Patent: June 4, 2013
    Assignee: International Business Machines Corporation
    Inventors: Ralf Altrichter, Gerd Kehrer, Martin Raitza
  • Patent number: 8458443
    Abstract: A processor may include a plurality of processing units for processing instructions, where each processing unit is associated with a discrete instruction queue. Data is read from a data queue selected by each instruction, and a sequencer manages distribution of instructions to the plurality of discrete instruction queues.
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: June 4, 2013
    Assignee: SMSC Holdings S.A.R.L.
    Inventors: Matthias Tramm, Manfred Stadler, Christian Hitz
  • Publication number: 20130138925
    Abstract: A method and circuit arrangement speculatively preprocess data stored in a register file during otherwise unused cycles in an execution unit, e.g., to prenormalize denormal floating point values stored in a floating point register file, to decompress compressed values stored in a register file, to decrypt encrypted values stored in a register file, or to otherwise preprocess data that is stored in an unprocessed form in a register file.
    Type: Application
    Filed: November 30, 2011
    Publication date: May 30, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
  • Publication number: 20130117541
    Abstract: One embodiment of the present invention sets forth a technique for speculatively issuing instructions to allow a processing pipeline to continue to process some instructions during rollback of other instructions. A scheduler circuit issues instructions for execution assuming that, several cycles later, when the instructions reach multithreaded execution units, that dependencies between the instructions will be resolved, resources will be available, operand data will be available, and other conditions will not prevent execution of the instructions. When a rollback condition exists at the point of execution for an instruction for a particular thread group, the instruction is not dispatched to the multithreaded execution units. However, other instructions issued by the scheduler circuit for execution by different thread groups, and for which a rollback condition does not exist, are executed by the multithreaded execution units.
    Type: Application
    Filed: November 4, 2011
    Publication date: May 9, 2013
    Inventors: Jack Hilaire CHOQUETTE, Olivier Giroux, Robert J. Stoll, Xiaogang Qiu
  • Publication number: 20130111191
    Abstract: A system and method for reducing power consumption through issue throttling of selected problematic instructions. A power throttle unit within a processor maintains instruction issue counts for associated instruction types. The instruction types may be a subset of supported instruction types executed by an execution core within the processor. The instruction types may be chosen based on high power consumption estimates for processing instructions of these types. The power throttle unit may determine a given instruction issue count exceeds a given threshold. In response, the power throttle unit may select given instruction types to limit a respective issue rate. The power throttle unit may choose an issue rate for each one of the selected given instruction types and limit an associated issue rate to a chosen issue rate. The selection of given instruction types and associated issue rate limits is programmable.
    Type: Application
    Filed: October 31, 2011
    Publication date: May 2, 2013
    Inventors: Daniel C. Murray, Andrew J. Beaumont-Smith, John H. Mylius, Peter J. Bannon, Toshi Takayanagi, Jung Wook Cho
  • Patent number: 8433855
    Abstract: Embodiments of the invention include a method of synchronizing translation changes in a processor including a translation lookaside buffer, the method including setting a control bit to enable blocking of all fetch requests that miss the translation lookaside buffer without changing a translation state of the current process; if there is at least one pending translation, then waiting for completion of the at least one pending translation; and resetting the control bit. A processor and a computer program product are provided.
    Type: Grant
    Filed: February 15, 2008
    Date of Patent: April 30, 2013
    Assignee: International Business Machines Corporation
    Inventors: Gregory W. Alexander, Lisa C. Heller, Chung-Lung Kevin Shum
  • Patent number: 8397052
    Abstract: Mechanisms are provided for controlling version pressure on a speculative versioning cache. Raw version pressure data is collected based on one or more threads accessing cache lines of the speculative versioning cache. One or more statistical measures of version pressure are generated based on the collected raw version pressure data. A determination is made as to whether one or more modifications to an operation of a data processing system are to be performed based on the one or more statistical measures of version pressure, the one or more modifications affecting version pressure exerted on the speculative versioning cache. An operation of the data processing system is modified based on the one or more determined modifications, in response to a determination that one or more modifications to the operation of the data processing system are to be performed, to affect the version pressure exerted on the speculative versioning cache.
    Type: Grant
    Filed: August 19, 2009
    Date of Patent: March 12, 2013
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Alan Gara, Kathryn M. O'Brien, Martin Ohmacht, Xiaotong Zhuang
  • Patent number: 8397238
    Abstract: Methods, apparatuses, and computer-readable storage media are disclosed for reducing power by reducing hardware-thread toggling in a multi-threaded processor. In a particular embodiment, a method allocates software threads to hardware threads. A number of software threads to be allocated is identified. It is determined when the number of software threads is less than a number of hardware threads. When the number of software threads is less than the number of hardware threads, at least two of the software threads are allocated to non-sequential hardware threads. A clock signal to be applied to the hardware threads is adjusted responsive to the non-sequential hardware threads allocated.
    Type: Grant
    Filed: December 8, 2009
    Date of Patent: March 12, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Suresh K. Venkumahanti, Martin Saint-Laurent, Lucian Codrescu, Baker S. Mohammad
  • Patent number: 8384956
    Abstract: An image processing method includes: selecting an image processing module for each attribute associated with a block of image data in accordance with the content of image processing made to correspond to the attribute; generating an image processing flow for each block by use of the selected image processing module; and determining whether the image processing flow can be constructed in an image processing area. When it is determined that a processing area of the image processing flow cannot be constructed in the image processing area, the method selects an image processing flow having an image processing module which is not contained in the other image processing flows from among a plurality of the image processing flows the blocks, the selected image processing flow being constructed in the image processing area such that the processing area of the image processing flow is included in the image processing area.
    Type: Grant
    Filed: August 20, 2009
    Date of Patent: February 26, 2013
    Assignee: Canon Kabushiki Kaisha
    Inventor: Toshimitsu Nakano
  • Publication number: 20130046954
    Abstract: Disclosed is an architecture, system and method for performing multi-thread DFA descents on a single input stream. An executer performs DFA transitions from a plurality of threads each starting at a different point in an input stream. A plurality of executers may operate in parallel to each other and a plurality of thread contexts operate concurrently within each executer to maintain the context of each thread which is state transitioning. A scheduler in each executer arbitrates instructions for the thread into an at least one pipeline where the instructions are executed. Tokens may be output from each of the plurality of executers to a token processor which sorts and filters the tokens into dispatch order.
    Type: Application
    Filed: January 18, 2012
    Publication date: February 21, 2013
    Inventors: Michael Ruehle, Umesh Ramkrishnarao Kasture, Vinay Janardan Naik, Nayan Amrutlal Suthar, Robert J. McMillen
  • Patent number: 8380964
    Abstract: An information handling system includes a processor with an instruction issue queue (IQ) that may perform age tracking operations. The issue queue IQ maintains or stores instructions that may issue out-of-order in an internal data store IDS. The IDS organizes instructions in a queue position (QPOS) addressing arrangement. An age matrix of the IQ maintains a record of relative instruction aging for those instructions within the IDS. The age matrix updates latches or other memory cell data to reflect the changes in IDS instruction ages during a dispatch operation into the IQ. During dispatch of one or more instructions, the age matrix may update only those latches that require data change to reflect changing IDS instruction ages. The age matrix employs row and column data and clock controls to individually update those latches requiring update.
    Type: Grant
    Filed: April 3, 2009
    Date of Patent: February 19, 2013
    Assignee: International Business Machines Corporation
    Inventors: James Wilson Bishop, Mary Douglass Brown, Jeffrey Carl Brownscheidle, Robert Allen Cordes, Maureen Anne Delaney, Jafar Nahidi, Dung Quoc Nguyen, Joel Abraham Silberman
  • Publication number: 20130042090
    Abstract: One embodiment of the present invention sets forth a technique for optimizing parallel thread execution in a temporal single-instruction multiple thread (SIMT) architecture. When the threads in a parallel thread group execute temporally on a common processing pipeline rather than spatially on parallel processing pipelines, execution cycles may be reduced when some threads in the parallel thread group are inactive due to divergence. Similarly, an instruction can be dispatched for execution by only one thread in the parallel thread group when the threads in the parallel thread group are executing a scalar instruction. Reducing the number of threads that execute an instruction removes unnecessary or redundant operations for execution by the processing pipelines. Information about scalar operands and operations and divergence of the threads is used in the instruction dispatch logic to eliminate unnecessary or redundant activity in the processing pipelines.
    Type: Application
    Filed: August 12, 2011
    Publication date: February 14, 2013
    Inventor: Ronny M. KRASHINSKY
  • Publication number: 20130024666
    Abstract: A method of scheduling a plurality of instructions for a processor comprises the steps of: establishing a functional unit resource table comprising a plurality of columns, each of which corresponds to one of a plurality of operation cycles of the processor and comprises a plurality of fields, each of which indicates a functional unit of the processor; establishing a ping-pong resource table comprising a plurality of columns, each of which corresponds to one of the plurality of operation cycles of the processor and comprises a plurality of fields, each of which indicates a read port or a write port of a register bank of the processor; and allotting the plurality of instructions to the plurality of operation cycles of the processor and registering the functional units and the ports of the register banks corresponding to the allotted instructions on the functional unit resource table and the ping-pong resource table.
    Type: Application
    Filed: July 18, 2011
    Publication date: January 24, 2013
    Applicant: NATIONAL TSING HUA UNIVERSITY
    Inventors: JENQ KUEN LEE, YU TE LIN, CHUNG JU WU
  • Patent number: 8359589
    Abstract: A set of helper thread binaries is created to retrieve data used by a set of main thread binaries. If executing a portion of the set of helper thread binaries results in the retrieval of data needed by the set of main thread binaries, then that retrieved data is utilized by the set of main thread binaries.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: January 22, 2013
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Juan C. Rubio, Balaram Sinharoy
  • Publication number: 20130013897
    Abstract: A method provides efficient dispatch/completion of an N Dimensional (ND) Range command in a data processing system (DPS). The method comprises: a compiler generating one or more commands from received program instructions; ND Range work processing (WP) logic determining when a command generated by the compiler will be implemented over an ND configuration of operands, where N is greater than one (1); automatically decomposing the ND configuration of operands into a one (1) dimension (1D) work element comprising P sequentially ordered work items that each represent one of the operands; placing the 1D work element within a command queue of the DPS; enabling sequential dispatching of 1D work items in ordered sequence from to one or more processing units; and generating an ND Range output by mapping the 1D work output result to an ND position corresponding to an original location of the operand represented by the 1D work item.
    Type: Application
    Filed: September 15, 2012
    Publication date: January 10, 2013
    Applicant: IBM CORPORATION
    Inventors: Gregory H. Bellows, Brian H. Horton, Joaquin Madruga, Barry L. Minor
  • Patent number: 8347309
    Abstract: Systems and methods for efficient thread arbitration in a processor. A processor comprises a multi-threaded resource. The resource may include an array of entries which may be allocated by threads. A thread arbitration table corresponding to a given thread stores a high and a low threshold value in each table entry. A thread history shift register (HSR) indexes the table, wherein each bit of the HSR indicates whether the given thread is a thread hog. When the given thread has more allocated entries in the array than the high threshold of the table entry, the given thread is stalled from further allocating array entries. Similarly, when the given thread has fewer allocated entries in the array than the low threshold of the selected table entry, the given thread is permitted to allocate entries. In this manner, threads that hog dynamic resources can be mitigated such that more resources are available to other threads that are not thread hogs.
    Type: Grant
    Filed: July 29, 2009
    Date of Patent: January 1, 2013
    Assignee: Oracle America, Inc.
    Inventors: Jared C. Smolens, Robert T. Golla, Matthew B. Smittle
  • Publication number: 20120331273
    Abstract: A method of reducing a set of instructions for execution on a processor, the method comprising: extracting information from a first instruction of the set of instructions; identifying unencoded space in one or more further instructions of the set of instructions; replacing the unencoded space of the one or more further instructions with the extracted information of the first instruction so as to form one or more amalgamated instructions; and removing the first instruction from the set of instructions.
    Type: Application
    Filed: December 29, 2011
    Publication date: December 27, 2012
    Applicant: CAMBRIDGE SILICON RADIO LIMITED
    Inventors: Peter Smith, David Richard Hargreaves
  • Patent number: 8327118
    Abstract: A processor 2 is responsive to a stream of program instructions to issue program instructions under control of scheduling circuitry 6 to respective execution units 24 for execution. The execution units 24 can include error detecting circuitry 32 for detecting a change in an output signal which occurs after the output signal has latched and during an error detecting period following the latching of the output signal. The scheduling circuitry 6 is arranged so as to suppress issue of program instructions to an execution unit 24 having such error detecting circuitry 32 on consecutive processing cycles.
    Type: Grant
    Filed: July 21, 2009
    Date of Patent: December 4, 2012
    Assignee: ARM Limited
    Inventors: David Michael Bull, Emre Ozer, Shidhartha Das
  • Publication number: 20120303937
    Abstract: A computer system used to execute an application includes a motion sensing unit, a processor and an instruction transfer unit. The motion sensing unit senses a gesture of a human body and generates an input instruction based on the gesture. The processor executes the application (or a game). The instruction transfer unit is connected with the motion sensing unit and the processor and serves as a communication interface between the motion sensing unit and the application. The instruction transfer unit transfers the input instruction to a control command, and the processor controls and executes the application in accordance with the control command.
    Type: Application
    Filed: May 24, 2012
    Publication date: November 29, 2012
    Inventors: Chia-I CHU, Cheng-Hsein Yang
  • Publication number: 20120272045
    Abstract: A control method and a system for dispatching the execution sequence of the processes in a multiprocessors system so as to dispatch an operation sequence for executing different operation programs by a monitoring processor and a plurality of target processors. The monitoring processor obtains operation status of other processors from a buffer; the monitoring processor selects at least one target processor according to the operation status; the monitoring processor assigns the target processor to execute a corresponding slave operation program, and modifies the operation status of the target processors in the buffer module; and the monitoring processor repeats the setting the operation status and assigning other target processors to execute corresponding operation programs, till a master operation program is completed.
    Type: Application
    Filed: September 1, 2011
    Publication date: October 25, 2012
    Applicant: FEATURE INTEGRATION TECHNOLOGY INC.
    Inventor: Shih-Jen Chuang
  • Patent number: 8296549
    Abstract: The invention discloses an overlapping command committing method of dynamic cycle pipeline, for a chip having pipeline structure, the method comprising the following steps: reading the command from command buffer, decoding the command, judging whether operator is reasonable or not, if a illegal command, then deleting, otherwise preprocessing the operator of command, preparing the initial operator of each pipeline, and observing the status of pipeline, waiting for pipeline command exiting signal, and judging whether there is command relevance or not, if not, then committing a new command to pipeline when the command exiting a last cycle of pipeline. Overlapping command committing method of the invention can avoid appearing of bubble, improve parallelism of pipeline performing unit, and thus shorten the processing period of command in chip, let the chip process more command in unit time.
    Type: Grant
    Filed: June 22, 2004
    Date of Patent: October 23, 2012
    Assignee: ZTE Corporation
    Inventors: Zong Zhao, Min Ren, Hu Chen
  • Patent number: 8291421
    Abstract: A system and method are provided for determining processor usable idle time in a system employing a software instruction processor. The method establishes an idle task with a lowest processor priority for a processor executing application software instructions, and uses the processor to execute an idle task. The method ceases to execute the idle task in response to the processor executing application software instructions. The amount of periodic idle task execution is determined and stored in a tangible memory medium. For example, idle time amounts can be determined per a unit of time, i.e. a percentage per second. In one aspect, the method generates an idle task report. The report can be a periodic report expressing the duration of idle task execution per time period, or a course of execution report expressing idle task start times, idle task stop times, and durations between the corresponding start and stop times.
    Type: Grant
    Filed: November 19, 2008
    Date of Patent: October 16, 2012
    Assignee: Sharp Laboratories of America, Inc.
    Inventors: Tommy Lee Oswald, John C. Thomas, James E. Owen
  • Patent number: 8285974
    Abstract: An apparatus for queue allocation. An embodiment of the apparatus includes a dispatch order data structure, a bit vector, and a queue controller. The dispatch order data structure corresponds to a queue. The dispatch order data structure stores a plurality of dispatch indicators associated with a plurality of pairs of entries of the queue to indicate a write order of the entries in the queue. The bit vector stores a plurality of mask values corresponding to the dispatch indicators of the dispatch order data structure. The queue controller interfaces with the queue and the dispatch order data structure. The queue controller excludes at least some of the entries from a queue operation based on the mask values of the bit vector.
    Type: Grant
    Filed: July 30, 2007
    Date of Patent: October 9, 2012
    Assignee: NetLogic Microsystems, Inc.
    Inventors: Gaurav Singh, Srivatsan Srinivasan, Lintsung Wong
  • Publication number: 20120246448
    Abstract: A system for executing instructions using a plurality of memory fragments for a processor. The system includes a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks. The system further includes a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors. A plurality memory fragments are coupled to the partitionable engines for providing data storage.
    Type: Application
    Filed: March 23, 2012
    Publication date: September 27, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Publication number: 20120246447
    Abstract: According to one embodiment of the present disclosure, an approach is provided in which a thread is selected from multiple active threads, along with a corresponding weighting value. Computational logic determines whether one of the multiple threads is dispatching an instruction and, if so, computes a dispatch weighting value using the selected weighting value and a dispatch factor that indicates a weighting adjustment of the selected weighting value. In turn, a resource utilization value of the selected thread is computed using the dispatch weighting value.
    Type: Application
    Filed: March 27, 2011
    Publication date: September 27, 2012
    Applicant: International Business Machines Corporation
    Inventors: James Wilson Bishop, Michael J. Genden, Steven Bradford Herndon, Philip Lee Vitale
  • Patent number: 8275975
    Abstract: The invention proposes a simple method for controlling distributed functional units (FU) in a system. It offloads the main system processor from intermediate status monitoring. The sequencer controlled system comprises a plurality of functional units, a processor operatively coupled to the plurality of functional units through a bus, a sequencer having a set of registers, and an interrupt source register configured for interrupt polling. The registers are configured to control the timing of at least one operation of the functional units with stored instructions for each of the functional units. The processor sets up at least some of the registers through the bus for the initial configuration and the sequencer is activated by the processor.
    Type: Grant
    Filed: January 25, 2008
    Date of Patent: September 25, 2012
    Assignee: Mtekvision Co., Ltd.
    Inventors: Ali Osman Ors, Daniel Laroche, Jean-François Deschênes
  • Patent number: 8271765
    Abstract: The illustrative embodiments described herein provide a computer-implemented method, apparatus, and a system for managing instructions. A load/store unit receives a first instruction at a port. The load/store unit rejects the first instruction in response to determining that the first instruction has a first reject condition. Then, the instruction sequencing unit activates a first bit in response to the load/store unit rejection the first instruction. The instruction sequencing unit blocks the first instruction from reissue while the first bit is activated. The processor unit determines a class of rejection of the first instruction. The instruction sequencing unit starts a timer. The length of the timer is based on the class of rejection of the first instruction. The instruction sequencing unit resets the first bit in response to the timer expiring. The instruction sequencing unit allows the first instruction to become eligible for reissue in response to resetting the first bit.
    Type: Grant
    Filed: April 8, 2009
    Date of Patent: September 18, 2012
    Assignee: International Business Machines Corporation
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Michael Stephen Floyd, Dung Quoc Nguyen, Bruce Joseph Ronchetti
  • Patent number: 8255670
    Abstract: In one embodiment, a processor comprises a scheduler configured to issue a first instruction operation to be executed and an execution core coupled to the scheduler. Configured to execute the first instruction operation, the execution core comprises a plurality of replay sources configured to cause a replay of the first instruction operation responsive to detecting at least one of a plurality of replay cases. The scheduler is configured to inhibit issuance of the first instruction operation subsequent to the replay for a subset of the plurality of replay cases. The scheduler is coupled to receive an acknowledgement indication corresponding to each of the plurality of replay cases in the subset, and is configured to inhibit issuance of the first instruction operation until the acknowledgement indication is asserted that corresponds to an identified replay case of the subset.
    Type: Grant
    Filed: November 17, 2009
    Date of Patent: August 28, 2012
    Assignee: Apple Inc.
    Inventors: Po-Yung Chang, Wei-Han Lien, Jesse Pan, Ramesh Gunna, Tse-Yu Yeh, James B. Keller
  • Patent number: 8250226
    Abstract: In one embodiment, a method for generating one or more synthetic transactions with one or more web service operations includes accessing a Web Services Description Language (WSDL) file describing a web service and, according to the WSDL file, generating a symbol table for describing a client for generating one or more synthetic transactions with the web service. The method also includes receiving input from a user specifying one or more operations of the web service for invocation, an order for invoking the operations of the web service, and one or more values of one or more parameters of the operations. The method also includes incorporating the input from the user into the symbol table and generating the client according to the symbol table.
    Type: Grant
    Filed: July 21, 2005
    Date of Patent: August 21, 2012
    Assignee: CA, Inc.
    Inventors: Roger C. Saunders, William S. R. Thain
  • Patent number: 8245065
    Abstract: A method of power gating a microprocessor having an instruction scheduling unit for receiving issued instructions from an instruction decode unit; an execution unit coupled to receive and send signals from and to the instruction scheduling unit; and a state machine located within the execution unit, the method comprises: obtaining a number of instructions per cycle being issued to the instruction scheduling unit; determining, subsequent to obtaining the number of instructions per cycle, if the number of instruction per cycle being issued to the instruction scheduling unit is less than a threshold level, and then determining if at least two of the instructions being issued to the instruction scheduling unit are independent of each other only when the instructions per cycle is less than the threshold level; determining when at least two of the instructions being issued to the instruction scheduling unit are independent of each other; and power gating the microprocessor to gate off power to idle macros with a si
    Type: Grant
    Filed: March 4, 2009
    Date of Patent: August 14, 2012
    Assignee: International Business Machines Corporation
    Inventors: Tim Niggemeier, Harry Barowski, Maarten Boersma, Gunnar Spiess
  • Patent number: 8245015
    Abstract: A processor includes a plurality of executing sections configured to simultaneously execute instructions for a plurality of threads, an instruction issuing section configured to issue instructions to the plurality of executing sections, and an instruction sync monitoring section configured to, when an instruction-synchronizing instruction is issued to one or more of the plurality of executing sections from the instruction issuing section, monitor completion of execution of the instruction-synchronizing instruction for each of the executing sections, to which the instruction-synchronizing instruction has been issued, thus detecting completion of execution of preceding instructions for the thread to which the instruction-synchronizing instruction belongs.
    Type: Grant
    Filed: July 7, 2009
    Date of Patent: August 14, 2012
    Assignee: Sony Corporation
    Inventor: Masaaki Ishii
  • Patent number: 8245232
    Abstract: Systems and methodologies for stall-time fair memory access scheduling for shared memory systems are provided herein. A stall-time fairness policy can be applied in accordance with various aspects described herein to schedule memory requests from threads sharing a memory system. To this end, a Stall-Time Fair Memory scheduler (STFM) algorithm can be utilized, wherein memory-related slowdown experienced by a group of threads due to interference from other threads is equalized. Additionally and/or alternatively, a traditional scheduling policy such as first-ready first-come-first-serve (FR-FCFS) can be utilized in combination with a cap on column-over-row reordering of memory requests, thereby reducing the amount of stall-time unfairness imposed by such traditional scheduling policies. Further, various aspects described herein can perform memory scheduling based on thread weights and/or other parameters, which can be configured in hardware and/or software.
    Type: Grant
    Filed: March 5, 2008
    Date of Patent: August 14, 2012
    Assignee: Microsoft Corporation
    Inventors: Onur Mutlu, Thomas Moscibroda
  • Patent number: 8245014
    Abstract: The present invention provides a network multithreaded processor, such as a network processor, including a thread interleaver that implements fine-grained thread decisions to avoid underutilization of instruction execution resources in spite of large communication latencies. In an upper pipeline, an instruction unit determines an-instruction fetch sequence responsive to an instruction queue depth on a per thread basis. In a lower pipeline, a thread interleaver determines a thread interleave sequence responsive to thread conditions including thread latency conditions. The thread interleaver selects threads using a two-level round robin arbitration. Thread latency signals are active responsive to thread latencies such as thread stalls, cache misses, and interlocks. During the subsequent one or more clock cycles, the thread is ineligible for arbitration. In one embodiment, other thread conditions affect selection decisions such as local priority, global stalls, and late stalls.
    Type: Grant
    Filed: April 14, 2008
    Date of Patent: August 14, 2012
    Assignee: Cisco Technology, Inc.
    Inventors: Donald E Steiss, Earl T Cohen, John J Williams
  • Publication number: 20120204008
    Abstract: Methods and apparatus for processing instructions by elaboration of instructions prior to issuing the instructions for execution are described. An instruction is received at a hybrid instruction queue comprised of a first queue and a second queue. When the second queue has available space, the instruction is elaborated to expand one or more bit fields to reduce decoding complexity when the elaborated instruction is issued, wherein the elaborated instruction is stored in the second queue. When the second queue does not have available space, the instruction is stored in an unelaborated form in a first queue. The first queue is configured as an exemplary in-order queue and the second queue is configured as an exemplary out-of-order queue.
    Type: Application
    Filed: February 1, 2012
    Publication date: August 9, 2012
    Applicant: QUALCOMM INCORPORATED
    Inventors: Kenneth Alan Dockser, Yusuf Cagatay Tekmen
  • Publication number: 20120204009
    Abstract: A processor includes an instruction fetch unit, an issue queue coupled to the instruction fetch unit, an execution unit coupled to the issue queue, and a multi-level register file including a first level register file having lower access latency and a second level register file having higher access latency. Each of the first and second level register files includes a plurality of physical registers for holding operands that is concurrently shared by a plurality of threads. The processor further includes a mapper that, at dispatch of an instruction specifying a source logical register from the instruction fetch unit to the issue queue, initiates a swap of a first operand associated with the source logical register that is in the second level register file with a second operand held in the first level register file. The issue queue, following the swap, issues the instruction to the execution unit for execution.
    Type: Application
    Filed: April 16, 2012
    Publication date: August 9, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: CHRISTOPHER M. ABERNATHY, MARY D. BROWN, HUNG Q. LE, DUNG Q. NGUYEN
  • Patent number: 8239661
    Abstract: A method for double-issue complex instructions receives a complex instruction comprising a first portion and a second portion. The method sets a single issue queue slot and allocates an execution unit for the complex instruction, and identifies dependencies in the first and second portions. The method sets a dependency matrix slot and a consumers table slot for the first and section portion. In the event the first portion dependencies have been satisfied, the method issues the first portion and then issues the second portion from the single issue queue slot. In the event the second portion dependencies have not been satisfied, the method cancels the second portion issue.
    Type: Grant
    Filed: August 28, 2008
    Date of Patent: August 7, 2012
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Todd A. Venton
  • Publication number: 20120191950
    Abstract: The described embodiments provide a processor that executes vector instructions. In the described embodiments, while dispatching instructions at runtime, the processor encounters a predicate-generating instruction. Upon determining that a result of the predicate-generating instruction is predictable, the processor dispatches a prediction micro-operation associated with the predicate-generating instruction, wherein the prediction micro-operation generates a predicted result vector for the predicate-generating instruction. The processor then executes the prediction micro-operation to generate the predicted result vector.
    Type: Application
    Filed: May 12, 2011
    Publication date: July 26, 2012
    Applicant: APPLE INC.
    Inventor: Jeffry E. Gonion
  • Publication number: 20120191951
    Abstract: A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older access instructions are gradually retired. When all older memory accesses are retired, the fencing instruction is dispatched from the buffer.
    Type: Application
    Filed: April 5, 2012
    Publication date: July 26, 2012
    Inventors: Salvador Palanca, Stephen A. Fischer, Subramaniam Maiyuran, Shekoufeh Oawami
  • Patent number: 8230410
    Abstract: An enhanced mechanism for parallel execution of computer programs utilizes a bidding model to allocate additional registers and execution units for stretches of code identified as opportunities for microparallelization. A microparallel processor architecture apparatus permits software (e.g. compiler) to implement short-term parallel execution of stretches of code identified as such before execution. In one embodiment, an additional paired unit, if available, is allocated for execution of an identified stretch of code. Each additional paired unit includes an execution unit and a half set of registers. This apparatus is available for compilers or assembler language coders to use and allows software to unlock parallel execution capabilities that are present in existing computer programs but heretofore were executed sequentially for lack of a suitable apparatus.
    Type: Grant
    Filed: October 26, 2009
    Date of Patent: July 24, 2012
    Assignee: International Business Machines Corporation
    Inventor: Larry W. Loen
  • Patent number: 8225012
    Abstract: A method may include distributing ranges of addresses in a memory among a first set of functions in a first pipeline. The first set of the functions in the first pipeline may operate on data using the ranges of addresses. Different ranges of addresses in the memory may be redistributed among a second set of functions in a second pipeline without waiting for the first set of functions to be flushed of data.
    Type: Grant
    Filed: September 3, 2009
    Date of Patent: July 17, 2012
    Assignee: Intel Corporation
    Inventor: Thomas A. Piazza
  • Patent number: 8219784
    Abstract: A computer-implemented method and apparatus for managing an out of order dispatched instruction queue in a microprocessor. In one embodiment, the method and apparatus include assigning a group identification number and a target identification number to an instruction in an instruction stream. The group identification number and the target identification number are labeled inside an instruction fetcher unit. The group identification number and the target identification number are pre-decoded. The instruction is sent to an instruction queue. The instruction is re-ordered in the instruction stream after executing the instruction utilizing information from the pre-decoding of the group identification number and the target identification number.
    Type: Grant
    Filed: December 9, 2008
    Date of Patent: July 10, 2012
    Assignee: International Business Machines Corporation
    Inventors: Oliver Keren Ban, Xiangang Cheng, Liang Huang Lee, Katherine June Pearsall
  • Patent number: 8219996
    Abstract: A computer processor includes a fairness monitor for monitoring allocations of a processor resource to requestors. If unfairness is determined, a resource allocator is biased to offset said unfairness.
    Type: Grant
    Filed: May 9, 2007
    Date of Patent: July 10, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Dale C. Morris
  • Patent number: 8200913
    Abstract: An information processing system includes a plurality of PMM and data transmission paths for connection between the PMM and transmitting a value of a PMM to another PMM. A memory of each PMM holds a list of values of first items arranged in the ascending order or descending order without overlap and/or a list of values of the second item to be shared. A memory module of each PMM transmits a value contained in the value list to another PMM, receives a value contained in the value list from the another PMM, references the value list of the first item and the value list of the second item of the another PMM, and generates a list of common values considering the values contained in the value lists of the first item and the second item of all the other PMM.
    Type: Grant
    Filed: January 25, 2005
    Date of Patent: June 12, 2012
    Assignee: Turbo Data Laboratories, Inc.
    Inventor: Shinji Furusho
  • Patent number: 8185722
    Abstract: The invention provides a processor comprising an execution unit and a thread scheduler configured to schedule a plurality of threads for execution by the execution unit in dependence on a respective status for each thread. The execution unit is configured to execute thread scheduling instructions which manage said statuses, the thread scheduling instructions including at least: a thread event enable instruction which sets a status to event-enabled to allow a thread to accept events, a wait instruction which sets the status to suspended pending at least one event upon which continued execution of the thread depends, and a thread event disable instruction which sets the status to event-disabled to stop the thread from accepting events. The continued execution comprises retrieval of a continuation point vector for the thread.
    Type: Grant
    Filed: March 14, 2007
    Date of Patent: May 22, 2012
    Assignee: XMOS Ltd.
    Inventor: Michael David May