Processing Control Patents (Class 712/220)
  • Patent number: 7996659
    Abstract: An apparatus comprises register means for storing a return context upon initiation of a supervisor call instruction and restoring means to restore a privilege level and status register upon execution of a supervisor return instruction. The supervisor call instruction can be called from all contexts.
    Type: Grant
    Filed: June 6, 2005
    Date of Patent: August 9, 2011
    Assignee: Atmel Corporation
    Inventors: Erik K. Renno, Oyvind Strom, Andreas Engh-Halstvedt, Havard Skinnemoen
  • Patent number: 7996656
    Abstract: In one embodiment, the present invention includes a pipeline to execute instructions out-of-order, where the pipeline has front-end stages, execution units, and back-end stages, and the execution units are coupled between dispatch ports of the front-end stages and writeback ports of the back-end stages. Further, a reconfigurable logic is coupled between one of the dispatch ports and one of the writeback ports to perform specialized operations or handle instructions that are not part of an instruction set architecture (ISA) used by the pipeline. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 25, 2007
    Date of Patent: August 9, 2011
    Assignee: Intel Corporation
    Inventor: Andrew F. Glew
  • Patent number: 7996657
    Abstract: A reconfigurable computing circuit for reducing the amount of dummy data to be stored in data registers, which is required when the wiring is shared by the configuration information bus and scan chain. When data is to be stored in data registers and configuration registers constituting the scan chain in reconfig computing block 2010, reg setting data selecting unit 3400 selects either a value stored in reg setting data storage unit 3000 or an initial value output from data reg data generating unit 4000, based on the information stored in reg type managing unit 1100 that indicates the types of registers and the connection order of the registers in the scan chain, and outputs the selected value in sequence to the scan chain under control of scan/reconfig control unit 1000. Each register in the scan chain then shifts data stored therein to the next register in the scan chain in sequence.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: August 9, 2011
    Assignee: Panasonic Corporation
    Inventors: Masaki Maeda, Takahiro Ichinomiya
  • Publication number: 20110191754
    Abstract: A system and method for improving software maintainability, performance, and/or security by associating a unique marker to each software code-block; the system comprising of a plurality of processors, a plurality of code-blocks, and a marker associated with each code-block. The system may also include a special hardware register (code-block marker hardware register) in each processor for identifying the markers of the code-blocks executed by the processor, without changing any of the plurality of code-blocks.
    Type: Application
    Filed: January 29, 2010
    Publication date: August 4, 2011
    Applicant: International Business Machines Corporation
    Inventors: Ramanjaneya S. Burugula, Joefon Jann, Pratap C. Pattnaik
  • Patent number: 7991909
    Abstract: Method and apparatus for communication between a processor and processing elements in an integrated circuit (e.g., a programmable logic device is described. In an example, a first lookup table is configured to store first information representing which of the processing elements is capable of performing which of a plurality of instructions. A second lookup table is configured to store second information representing which of the plurality of instructions is being serviced by which of the processing elements. Control logic is coupled to the processor, the first lookup table, and the second lookup table. The control logic is configured to communicate data from the processor to the processing elements based on the first information, and communicate data from the processing elements to the processor based on the second information.
    Type: Grant
    Filed: March 27, 2007
    Date of Patent: August 2, 2011
    Assignee: Xilinx, Inc.
    Inventors: Paul R. Schumacher, Daniel L McMurtrey, Shengqi Yang
  • Publication number: 20110179253
    Abstract: A computer implemented method for handling events in a multi-core processing environment is provided. The method comprises handling an event by a second application running on a second core, in response to determining that the event is initiated by a first application running on a first core; and running a third application on the first core, while the first application is waiting for the event to be handled by the second application.
    Type: Application
    Filed: January 21, 2010
    Publication date: July 21, 2011
    Applicant: International Business Machines Corporation
    Inventors: Shmuel Ben Yehuda, Abel Gordon, Orit (Luba) Wasserman, Ben-Ami Yassour
  • Publication number: 20110179257
    Abstract: A data processing device including a reception unit, an instruction unit and a storage unit. The reception unit receives instructions for processing at a processing execution device. The instruction unit instructs the processing execution device to cancel a power saving state of the processing execution device and execute the processing corresponding to an instruction received by the reception unit. The storage unit stores data relating to received instructions. If the processing corresponding to the received instruction is a pre-specified process, data relating to the instruction is stored by the storage unit. If the processing corresponding to the received instruction is not a pre-specified process, the instruction unit instructs the processing execution device to execute both the processing corresponding to this instruction and processing based on data relating to instructions stored in the storage unit.
    Type: Application
    Filed: September 3, 2010
    Publication date: July 21, 2011
    Applicant: FUJI XEROX CO., LTD.
    Inventor: Kazumoto SHINODA
  • Patent number: 7984274
    Abstract: In one embodiment, a processor comprises a prediction circuit and another circuit coupled to the prediction circuit. The prediction circuit is configured to predict whether or not a first load instruction will experience a partial store to load forward (PSTLF) event during execution. A PSTLF event occurs if a plurality of bytes, accessed responsive to the first load instruction during execution, include at least a first byte updated responsive to a previous uncommitted store operation and also include at least a second byte not updated responsive to the previous uncommitted store operation. Coupled to receive the first load instruction, the circuit is configured to generate one or more load operations responsive to the first load instruction. The load operations are to be executed in the processor to execute the first load instruction, and a number of the load operations is dependent on the prediction by the prediction circuit.
    Type: Grant
    Filed: June 18, 2009
    Date of Patent: July 19, 2011
    Assignee: Apple Inc.
    Inventors: Sudarshan Kadambi, Po-Yung Chang, Eric Hao
  • Publication number: 20110173420
    Abstract: A system for enhancing performance of a computer includes a computer system having a data storage device. The computer system includes a program stored in the data storage device and steps of the program are executed by a processor. An external unit is external to the processor for monitoring specified computer resources. The external unit is configured to detect a specified condition using the processor. The processor including one or more threads. The thread resumes an active state from a pause state using the external unit when the specified condition is detected by the external unit.
    Type: Application
    Filed: January 8, 2010
    Publication date: July 14, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dong Chen, Mark Giampapa, Philip Heidelberger, Martin Ohmacht, David L. Satterfield, Burkhard Steinmacher-Burow, Krishnan Sugavanam
  • Publication number: 20110173419
    Abstract: A wake-and-go mechanism is provided for a microprocessor. The wake-and-go mechanism looks ahead in the instruction stream of a thread for programming idioms that indicates that the thread is waiting for an event. If a look-ahead polling operation succeeds, the look-ahead wake-and-go engine may record an instruction address for the corresponding idiom so that the wake-and-go mechanism may have the thread perform speculative execution at a time when the thread is waiting for an event. During execution, when the wake-and-go mechanism recognizes a programming idiom, the wake-and-go mechanism may store the thread state in the thread state storage. Instead of putting thread to sleep, the wake-and-go mechanism may perform speculative execution.
    Type: Application
    Filed: February 1, 2008
    Publication date: July 14, 2011
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20110173417
    Abstract: A wake-and-go mechanism may be a programming idiom accelerator. As a processor fetches instructions, the programming idiom accelerator may look ahead to determine whether a programming idiom is coming up in the instruction stream. If the programming idiom accelerator recognizes a programming idiom, the programming idiom accelerator may perform an action to accelerate execution of the programming idiom. In the case of a wake-and-go programming idiom, the programming idiom accelerator may record an entry in a wake-and-go array, for example.
    Type: Application
    Filed: February 1, 2008
    Publication date: July 14, 2011
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20110173422
    Abstract: A system and method for enhancing performance of a computer which includes a computer system including a data storage device. The computer system includes a program stored in the data storage device and steps of the program are executed by a processer. The processor processes instructions from the program. A wait state in the processor waits for receiving specified data. A thread in the processor has a pause state wherein the processor waits for specified data. A pin in the processor initiates a return to an active state from the pause state for the thread. A logic circuit is external to the processor, and the logic circuit is configured to detect a specified condition. The pin initiates a return to the active state of the thread when the specified condition is detected using the logic circuit.
    Type: Application
    Filed: January 8, 2010
    Publication date: July 14, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dong Chen, Mark Giampapa, Philip Heidelberger, Martin Ohmacht, David L. Satterfield, Burkhard Steinmacher-Burow, Krishnan Sugavanam
  • Patent number: 7979679
    Abstract: A computer system is disclosed capable of conditionally carrying out an operation defined in a computer instruction. The computer instruction is implemented on so-called packed operands, that is operands containing a plurality of packed objects in respective lanes. An operation defined in the computer instruction is conditionally carried out in dependence on stored condition values which determine for each lane whether or not the operation is to be executed on objects in that lane. An execution unit for a computer system, a computer system and a method of executing instructions are defined.
    Type: Grant
    Filed: March 13, 2006
    Date of Patent: July 12, 2011
    Assignee: Broadcom Corporation
    Inventor: Sophie Wilson
  • Patent number: 7979642
    Abstract: A data processing apparatus is provided comprising processing circuitry for executing multiple program threads. At least one storage unit is shared between the multiple program threads and comprises multiple entries, each entry for storing a storage item either associated with a high priority program thread or a lower priority program thread. A history storage for retaining a history field for each of a plurality of blocks of the storage unit is also provided. On detection of a high priority storage item being evicted from the storage unit as a result of allocation to that entry of a lower priority storage item, the history field for the block containing that entry is populated with an indication of the evicted high priority storage item.
    Type: Grant
    Filed: September 11, 2008
    Date of Patent: July 12, 2011
    Assignee: ARM Limited
    Inventors: David Michael Bull, Emre Özer
  • Patent number: 7979680
    Abstract: A processor system may implement multiple contexts on one or more processors having a local memory. Code and/or data for first and second contexts may be respectively stored simultaneously in first and second regions of a processor's local memory, storing code and/or data for a second context in a second region of the local memory, the secondary processor may execute the first context while the second context waits. Code and/or data for the first context may be transferred from the first region to the second and code and/or data for the second context may be transferred from the second region to the first, and the processor may execute the second context during a pause or stoppage of execution of the first context. Alternatively, the code and/or data for the second context may be transferred to another processor's local memory.
    Type: Grant
    Filed: December 3, 2009
    Date of Patent: July 12, 2011
    Assignee: Sony Computer Entertainment Inc.
    Inventors: John P. Bates, Attila Vass
  • Publication number: 20110167245
    Abstract: There is provided a multi-core system that provides automated task list generation, parallelism templates, and memory management. By constructing, profiling, and analyzing a sequential list of functions to be executed in a parallel fashion, corresponding parallel execution templates may be stored for future lookup in a database. A processor may then select a subset of functions from the sequential list of functions based on input data, select a template from the template database based on particular matching criteria such as high-level task parameters, finalize the template by resolving pointers and adding or removing transaction control blocks, and forward the resulting optimized task list to a scheduler for distribution to multiple slave processing cores. The processor may also analyze data dependencies between tasks to consolidate tasks working on the same data to a single core, thereby implementing memory management and efficient memory locality.
    Type: Application
    Filed: January 6, 2010
    Publication date: July 7, 2011
    Applicant: MINDSPEED TECHNOLOGIES, INC.
    Inventors: Nick J. Lavrov, Nour Toukmaji
  • Patent number: 7975129
    Abstract: Controlling a reorder buffer (ROB) to selectively perform functional hardware lock disabling (HLD) is described. One apparatus embodiment includes a unit to enable an ROB to selectively disable a lock upon Identifying a lock acquire operation (LAO) associated with a critical section (CS) entry point, a unit to selectively retire the LAO, a unit to cause the ROB to selectively disable the lock, and a unit to snoop a buffer. The apparatus may, based on the snooping, selectively abort a transaction associated with the CS.
    Type: Grant
    Filed: September 18, 2009
    Date of Patent: July 5, 2011
    Assignee: Intel Corporation
    Inventors: Shlomo Raikin, Gad Sheaffer, Doron Orenstlen
  • Patent number: 7973804
    Abstract: A circuit arrangement and method support a multithreaded rendering architecture capable of dynamically routing pixel fragments from a pixel fragment generator to any pixel shader from among a pool of pixel shaders. The pixel fragment generator is therefore not tied to a specific pixel shader, but is instead able to utilize multiple pixel shaders in a pool of pixel shaders to minimize bottlenecks and improve overall hardware utilization and performance during image processing.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: July 5, 2011
    Assignee: International Business Machines Corporation
    Inventors: Eric Oliver Mejdrich, Paul Emery Schardt, Robert Allen Shearer
  • Patent number: 7975126
    Abstract: Described is microprocessor architecture that includes at least one reconfigurable execution path (e.g., implemented via FPGAs or CPLDs). When an instruction is fetched, a mechanism determines whether the reconfigurable execution path (and/or which path) will handle that instruction. A content addressable memory may be used to determine the execution path when fed the instruction's operational code, or an arbiter and multiplexer may resolve conflicts if multiple instruction decode blocks recognize the same instruction. The execution path may be dynamically reconfigured, activated or deactivated as needed, such as to extend an instruction set, to optimize instructions for a particular application program, to implement a peripheral device, to provide parallel computing, and/or based on power consumption and/or processing power needs. Security may be provided by having the reconfigurable execution path loaded from an extension file that is associated with metadata, including security information.
    Type: Grant
    Filed: March 19, 2009
    Date of Patent: July 5, 2011
    Assignee: Microsoft Corporation
    Inventors: Richard Neil Pittman, Alessandro Forin, Nathaniel L. Lynch
  • Publication number: 20110161635
    Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
    Type: Application
    Filed: December 26, 2009
    Publication date: June 30, 2011
    Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean P. Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
  • Publication number: 20110161636
    Abstract: Provided are a method of managing power of a multi-core processor, a recording medium storing a program for performing the method, and a multi-core processor system. The method of managing power of a multi-core processor having at least one core includes determining a parallel-processing section on the basis of information included in a parallel-processing program, collecting information for determining a clock frequency of the core in the determined parallel-processing section according to each core, and then determining the clock frequency of the core on the basis of the collected information. Accordingly, it is possible to minimize power consumption while ensuring quality of service (QoS).
    Type: Application
    Filed: April 8, 2010
    Publication date: June 30, 2011
    Applicant: POSTECH ACADEMY - INDUSTRY FOUNDATION
    Inventors: Ki-Seok Chung, Young-Si Hwang
  • Publication number: 20110161633
    Abstract: Various embodiments of the present invention provide systems and methods for monitoring out of order data decoding. For example, a method for monitoring out of order data processing is provided that includes receiving a plurality of data sets that is associated with a plurality of identifiers with each of the plurality of identifiers indicates a respective one of the plurality of data sets; storing each of the plurality of identifiers in a FIFO memory in an order that the corresponding data sets of the plurality of data sets was received; processing the plurality of data sets such that at least one of the plurality of data sets is provided as an output data set; accessing the next available identifier from the FIFO memory; and asserting an out of order signal when the next available identifier is not the same as the identifier associated with the output data set.
    Type: Application
    Filed: December 31, 2009
    Publication date: June 30, 2011
    Inventors: Changyou Xu, Shaohua Yang, Kapil Gaba
  • Publication number: 20110161734
    Abstract: Disclosed are a method, a system and a computer program product of operating a data processing system that can include or be coupled to multiple processor cores. In one or more embodiments, an error can be determined while two or more processor cores are processing a first group of two or more work items, and the error can be signaled to an application. The application can determine a state of progress of processing the two or more work items and at least one dependency from the state of progress. In one or more embodiments, a second group of two or more work items that are scheduled for processing can be unscheduled, in response to determining the error. In one or more embodiments, the application can process at least one work item that caused the error, and the second group of two or more work items can be rescheduled for processing.
    Type: Application
    Filed: December 31, 2009
    Publication date: June 30, 2011
    Applicant: IBM CORPORATION
    Inventors: Benjamin G. Alexander, Gregory H. Bellows, Joaquin Madruga, Barry L. Minor
  • Publication number: 20110161637
    Abstract: An apparatus and method for parallel processing in consideration of degree of parallelism are provided. One of a task parallelism and a data parallelism is dynamically selected while a job is processed. In response to a task parallelism being selected, a sequential version code is allocated to a core or processor for processing a job. In response to a data parallelism being selected, a parallel version code is allocated to a core a processor for processing a job.
    Type: Application
    Filed: July 29, 2010
    Publication date: June 30, 2011
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kue-Hwan SIHN, Hee-Jin Chung, Dong-Gun Kim
  • Publication number: 20110161630
    Abstract: An apparatus and method is described herein for replacing faulty core components. General purpose hardware is provided to replace core pipeline components, such as execution units. In the embodiment of execution unit replacement, a proxy unit is provided, such that mapping logic is able to map instruction/operations, which correspond to faulty execution units, to the proxy unit. As a result, the proxy unit is able to receive the operations, send them to general purpose hardware for execution, and subsequently write-back the execution results to a register file; it essentially replaces the defective execution unit allowing a processor with defective units to be sold or continue operation.
    Type: Application
    Filed: December 28, 2009
    Publication date: June 30, 2011
    Inventors: Steven E. Raasch, Michael D. Powell, Shubhendu S. Mukherjee, Arijit Biswas
  • Patent number: 7969187
    Abstract: A hardware interface in an integrated circuit is disclosed. The hardware interface comprises data storage coupled to store and provide data; a data shifter coupled to the data storage to at least bit shift the data obtained from the data storage; and a control circuit coupled to the data storage and the data shifter for controlling a transfer of the data from the data storage and the data shifter. The control circuit comprises a state machine for controlling operation of the data storage and the data shifter; and the state machine is programmable responsive to code executable by a processor coupled to an auxiliary processing unit to adapt to the auxiliary processing unit.
    Type: Grant
    Filed: August 6, 2010
    Date of Patent: June 28, 2011
    Assignee: Xilinx, Inc.
    Inventors: Stephen A. Neuendorffer, Paul M. Hartke, Paul R. Schumacher
  • Patent number: 7971036
    Abstract: A multi-node video signal processor (VSPN) is describes that tightly couples multiple multi-cycle state machines (hardware assist units) to each processor and each memory in each node of an N node scalable array processor. VSPN memory hardware assist instructions are used to initiate multi-cycle state machine functions, to pass parameters to the multi-cycle state machines, to fetch operands from a node's memory, and to control the transfer of results from the multi-cycle state machines.
    Type: Grant
    Filed: April 18, 2007
    Date of Patent: June 28, 2011
    Assignee: Altera Corp.
    Inventors: Gerald George Pechanek, Mihailo M. Stojancic
  • Patent number: 7971035
    Abstract: A data processing system having a memory for storing instructions and several central processing units for executing instructions, each central processing unit includes an adaptive power supply which provides, among other data, temperature information. Circuitry is provided that receives the temperature information from the many central processing units, selects a central processing unit which has the lowest temperature and which is available to execute instructions and dispatches instructions to the selected central processing from the memory.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: June 28, 2011
    Assignee: International Business Machines Corporation
    Inventors: Deepak K. Singh, Francois Ibrahim Atallah
  • Publication number: 20110153992
    Abstract: Example methods and apparatus to manage object locks are disclosed. A disclosed example method includes receiving an object lock request from a processor, the lock request associated with object lock code to lock an object, and generating object lock-bypass code based on a type of the processor, the object lock-bypass code to execute in a managed runtime in response to receiving the object lock request. The example method also includes identifying a type of instruction set architecture (ISA) associated with the processor, invoking a checkpoint instruction for the processor based on the identified ISA, suspending the object lock code from executing and executing target code when the object is uncontended, and allowing the object lock code to execute when the object is contended.
    Type: Application
    Filed: December 23, 2009
    Publication date: June 23, 2011
    Inventors: Suresh Srinivas, Stephen H. Dohrmann, Mingqiu Sun, Uma Srinivasan, Ravi Rajwar, Konrad K. Lai
  • Publication number: 20110153987
    Abstract: A multi-core processor system supporting simultaneous thread sharing across execution resources of multiple processor cores is provided. The multi-core processor system includes a first processor core with a first instruction queue and dispatch logic in communication with a first execution resource of the first processor core. The multi-core processor system also includes a second processor core with a second instruction queue and dispatch logic in communication with a second execution resource of the second processor core. A high-speed execution resource bus couples the first and second processor cores. The first instruction queue and dispatch logic is configured to issue a first instruction of a thread to the first execution resource and issue a second instruction of the thread over the high-speed execution resource bus to the second execution resource for simultaneous execution of the first and second instruction of the thread on the first and second processor cores.
    Type: Application
    Filed: December 17, 2009
    Publication date: June 23, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shawn M. Luke, John Sargis, JR., Daneyand J. Singley
  • Publication number: 20110153991
    Abstract: A system and method for issuing a processor instruction to multiple processing sections arranged in an out-of-order processing pipeline architecture. The multiple processing sections include a first execution unit with a pipeline length and a second execution unit operating upon data produced by the first execution unit. An instruction issue unit accepts a complex instruction that is cracked into respective micro-ops for the first execution unit and the second execution unit. The instruction issue unit issues the first micro-op to the first execution unit to produce intermediate data. The instruction issue unit then delays for a time period corresponding to the processing pipeline length of the first execution unit. After the delay, a second micro-op is issued to the second execution unit.
    Type: Application
    Filed: December 23, 2009
    Publication date: June 23, 2011
    Applicant: International Business Machines Corporation
    Inventors: FADI BUSABA, Brian Curran, Lee Eisen, Christian Jacobi, David A. Schroter, Eric Schwarz
  • Publication number: 20110153982
    Abstract: Systems and methods are disclosed for collecting data from cores of a multi-core processor using collection packets. A collection packet can traverse through cores of the multi-core processor while accumulating requested data. Upon completing the accumulation of the requested data from all required cores, the collection packet can be transmitted to a system operator for system maintenance and/or monitoring.
    Type: Application
    Filed: December 21, 2009
    Publication date: June 23, 2011
    Applicant: BBN TECHNOLOGIES CORP.
    Inventor: Craig Partridge
  • Publication number: 20110153997
    Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and machine-readable medium storing such an instruction are also disclosed.
    Type: Application
    Filed: December 22, 2009
    Publication date: June 23, 2011
    Inventors: Maxim Loktyukhin, Eric W. Mahurin, Bret L. Toll, Martin G. Dixon, Sean P. Mirkes, David L. Kreitzer, El Moustapha Ould-Ahmed-Vall, Vinodh Gopal
  • Patent number: 7966482
    Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to pack the packed data responsive to a pack instruction received by the decoder. A first packed data element and a second packed data element are received from the first source register. A third packed data element and a fourth packed data element are received from the second source register. The circuit packs packing a portion of each of the packed data elements into a destination register resulting with the portion from second packed data element adjacent to the portion from the first packed data element, and the portion from the fourth packed data element adjacent to the portion from the third packed data element.
    Type: Grant
    Filed: June 12, 2006
    Date of Patent: June 21, 2011
    Assignee: Intel Corporation
    Inventors: Alexander Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
  • Patent number: 7962718
    Abstract: A permutation instruction generates vector elements for a destination register using identified source and destination registers. A plurality of partial table lookups corresponding to an extended table produces a plurality of intermediate results. At least one source register stores a plurality of index values corresponding to the extended table. Out-of-range index values are values that are not contained in at least one additional source register and result in a predetermined constant value being stored into a predetermined vector element of the destination register. The index values are adjusted between the partial table lookups. A final result is formed by performing a logic function with the plurality of intermediate results. The final result is thereby formed without a full table lookup of each element of the final result.
    Type: Grant
    Filed: October 12, 2007
    Date of Patent: June 14, 2011
    Assignee: Freescale Semiconductor, Inc.
    Inventor: William C. Moyer
  • Patent number: 7962727
    Abstract: System and method for decompressing data. A compressed data stream including contiguous variable length blocks is received, each block including multiple contiguous variable length data fields and a tag portion that includes multiple contiguous tag fields corresponding respectively to the data fields. Each tag field stores a tag value specifying a size of a respective field in the block. A current variable length block is stored. A single machine instruction of a processor is executed that analyzes the tag portion of the current block, and creates a control pattern, storing the control pattern in a first register of the processor. The control pattern is configured to unpack the variable length data fields of the current variable length block into corresponding uniform data fields. The contiguous variable length data fields of the current variable length block are decompressed using the control pattern, thereby decompressing the compressed data stream.
    Type: Grant
    Filed: December 5, 2008
    Date of Patent: June 14, 2011
    Assignee: GLOBALFOUNDRIES Inc.
    Inventor: Michael Frank
  • Patent number: 7962728
    Abstract: The data processor executes an instruction having a direction for write to a reference register of other instruction flow and an instruction having a direction for reference register invalidation. The data processor is arranged as a data processor having typical functions as an integrated whole of processors (CPU1 and CPU2) which execute simple instruction flows. When executing the instruction having a direction for write to a reference register of other instruction flow, the processor confirms whether a write register is invalid. The processor waits for the register to be made invalid, if the register is not invalid, and performs write if the register is invalid. After having executed the instruction having a direction for reference register invalidation, the processor invalidates the register to which a reference has been made. When the reference register is invalid, execution of the referring instruction is suspended until it is made valid.
    Type: Grant
    Filed: September 14, 2009
    Date of Patent: June 14, 2011
    Assignee: Hitachi, Ltd.
    Inventor: Fumio Arakawa
  • Patent number: 7962726
    Abstract: A pipelined microprocessor configured for long operand instructions is disclosed. The microprocessor includes a memory unit and a load-store unit. The load store unit is coupled to the memory unit and includes a data formatter receiving information from the memory unit and including an operand selector and a shift register portion. The microprocessor also includes an execution unit coupled to the load-store unit and receiving operand information there from. The execution unit includes output latches coupled to a storage location within the execution unit for storing output information from the execution unit.
    Type: Grant
    Filed: March 19, 2008
    Date of Patent: June 14, 2011
    Assignee: International Business Machines Corporation
    Inventors: Edward T. Malley, Khary J. Alexander, Fadi Y. Busaba, Vimal M. Kapadia, Jeffrey S. Plate, John G. Rell, Jr., Chung-Lung Kevin Shum
  • Patent number: 7962732
    Abstract: An instruction processing apparatus includes a thread execution processing section executing threads each including plural instructions, a register file including a register window having plural registers, a current window pointer indicating a position of the register where the register window is possible to be inputted and outputted, a current register reading data held by the register window designated by the current window pointer to hold the data and a replacement buffer holding data transferred from the register file to the current register, a first transfer path transferring data in a register file to one of the replacement buffer, a second data transfer transferring data in a replacement buffer to one of the current registers, a calculation section executing a switching instruction of the register window, and a control section controlling, if the calculation section executes the switching instruction, the first data transfer path and the second data transfer path.
    Type: Grant
    Filed: December 11, 2009
    Date of Patent: June 14, 2011
    Assignee: Fujitsu Limited
    Inventor: Toshio Yoshida
  • Publication number: 20110138154
    Abstract: Described are embodiments of an invention for optimizing a computing environment that performs data management operations such as encryption, deduplication and compression. The computing environment includes data components and a management system. The data components operate on data during the lifecycle of the data. The management system identifies all the data components in a data path, how the data components are interconnected, the data management operations performed at each data component, and how many data management operations of each type are performed at each data component. Further, the management system builds a data structure to represent the flow of data through the data path and analyzes the data structure in view of policy.
    Type: Application
    Filed: December 8, 2009
    Publication date: June 9, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gregory John Tevis, David Gregory Van Hise
  • Patent number: 7958342
    Abstract: A Nyquist sampling frequency is determined for performance counter events to be measured. Based on the Nyquist sampling frequencies, a schedule for measuring the performance counter events is determined. The performance counter event measurements are then conducted in accordance with the schedule, whereby the measurements yield a set of sample data for each performance counter event. A signal reconstruction algorithm is applied to the set of sample data for each performance counter event to reconstruct an essentially complete signal for each performance counter event. The essentially complete signal for each performance counter event is then used to improve either a design or a utilization of either a microprocessor or an application to be executed on the microprocessor.
    Type: Grant
    Filed: January 24, 2007
    Date of Patent: June 7, 2011
    Assignee: Oracle America, Inc.
    Inventors: Robert M. Lane, Kenneth Tracton, Zenon Fortuna
  • Publication number: 20110131396
    Abstract: One aspect of the present invention provides processor comprising: an execution unit arranged to execute a sequence of instructions each comprising a respective opcode; and a counter coupled to the execution unit and arranged to generate a periodically updated counter value during execution. The execution unit comprises logic configured to identify an opcode representing a trap-if-late instruction in said sequence, and in response to execute the trap-if-late instruction by comparing a target value to the counter value and generating an exception on condition that the counter value represents a time that is late relative to said target value. Another aspect provides a compiler for inserting trap-if-late instructions based on timing constraints in higher-level code.
    Type: Application
    Filed: December 1, 2009
    Publication date: June 2, 2011
    Applicant: XMOS Limited
    Inventors: Michael David May, Hendrik Lambertus Muller
  • Publication number: 20110131425
    Abstract: Embodiments of the invention broadly contemplate systems, methods, apparatuses and program products providing a power management technique for an HPC cluster with performance improvements for parallel applications. According to various embodiments of the invention, power usage of an HPC cluster is reduced by boosting the performance of one or more select nodes within the cluster so that the one or more nodes take less time to complete. Embodiments of the invention accomplish this by selectively identifying the appropriate node(s) (or core(s) within the appropriate node(s)) in the cluster and increasing the computing capacity of the selected node(s) (or core(s) within the appropriate node(s)).
    Type: Application
    Filed: November 30, 2009
    Publication date: June 2, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradipta K. Banerjee, Anbazhagan Mani, Rajan Ravindran, Vaidyanathan Srinivasan
  • Patent number: 7953957
    Abstract: Methods, apparatus, and products for distributing parallel algorithms of a parallel application among compute nodes of an operational group in a parallel computer are disclosed that include establishing a hardware profile, the hardware profile describing thermal characteristics of each compute node in the operational group; establishing a hardware independent application profile, the application profile describing thermal characteristics of each parallel algorithm of the parallel application; and mapping, in dependence upon the hardware profile and application profile, each parallel algorithm of the parallel application to a compute node in the operational group.
    Type: Grant
    Filed: February 11, 2008
    Date of Patent: May 31, 2011
    Assignee: International Business Machines Corporation
    Inventors: Thomas M. Gooding, Brant L. Knudson, Cory Lappi, Ruth J. Poole, Andrew T. Tauferner
  • Patent number: 7953962
    Abstract: A multiprocessor system according to an embodiment comprises a plurality of processors, an execution control unit to control processing by the plurality of processors and data transfer between the plurality of processors; and an internal data storage unit to store data dependence information indicating status of the data transfer. If control flow of processing by a processor is fixed after a preceding data transfer is registered for execution and another data transfer to a similar destination as the preceding data transfer is necessary, the execution control unit cancels the preceding data transfer based on the data dependence information.
    Type: Grant
    Filed: September 18, 2009
    Date of Patent: May 31, 2011
    Assignee: Fujitsu Limited
    Inventors: Yasuki Nakamura, Takahisa Suzuki, Makiko Ito, Hideo Miyake
  • Patent number: 7953933
    Abstract: An instruction processing circuit includes an instruction cache, a decoder configured to receive at least one of the instructions and to generate, based thereon, a decoder sequence of at least one operation. The circuit includes a basic block cache that includes a basic block sequence of at least one of the operations. The basic block sequence is derived from at least one of the decoder sequences and includes at most one conditional control transfer operation. The circuit includes a multi-block cache that includes a multi-block sequence consisting of at least one of the operations derived from two or more smaller op sequences. A sequencer is configured to generate a prediction for the result of a conditional control transfer operation, select the next sequence of operations, and provide an indication of the next sequence to the instructions cache, the basic block cache, and the multi-block cache.
    Type: Grant
    Filed: July 23, 2007
    Date of Patent: May 31, 2011
    Assignee: Oracle America, Inc.
    Inventors: Richard Win Thaik, John Gregory Favor, Joseph Byron Rowlands, Leonard Eric Shar
  • Patent number: 7953961
    Abstract: An instruction processing circuit for a processor includes a decoder circuit, a cache circuit, a sequencer circuit operable to select a next sequence of operations, and an operations fetch circuit operable to convey the next sequence of operations to an execution circuit, receive an indication that a sequencing action of the sequencer circuit is sequencing ahead of the execution circuit, and switch, based on the indication, a source of the operations fetch circuit between the cache circuit and the decoder circuit.
    Type: Grant
    Filed: July 23, 2007
    Date of Patent: May 31, 2011
    Assignee: Oracle America, Inc.
    Inventors: Richard Win Thaik, John Gregory Favor, Joseph Byron Rowlands, Leonard Eric Shar
  • Patent number: 7949859
    Abstract: A method and processor for avoiding check stops in speculative accesses. An execution unit, e.g., load/store unit, may be coupled to a queue configured to store instructions. A register, coupled to the execution unit, may be configured to store a value corresponding to an address in physical memory. When the processor is operating in real mode, the execution unit may retrieve the value stored in the register. Upon the execution unit receiving a speculative instruction, e.g., speculative load instruction, from the queue, a determination may be made as to whether the address of the speculative instruction is at or below the retrieved value. If the address of the speculative instruction is at or below this value, then the execution unit may safely speculatively execute this instruction while avoiding a check stop since all the addresses at or below this value are known to exist in physical memory.
    Type: Grant
    Filed: March 6, 2008
    Date of Patent: May 24, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ronald N. Kalla, Cathy May, Balaram Sinharoy, Edward John Silha, Shih-Hsiung S. Tung
  • Patent number: 7948496
    Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: May 24, 2011
    Assignee: MicroUnity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris, Alexia Massalin
  • Publication number: 20110119469
    Abstract: In a multiprocessor system with threads running in parallel, workload balancing is facilitated by recognizing a plurality of levels of sub-tasks of a memory synchronization instruction and selectively choosing for at least one thread to do less than all of levels of these sub-tasks in response to the memory synchronization instruction. Which thread waits to synchronize can be impacted by this choice. The programmer can cause a thread expected to be a bottleneck to wait less than other threads. Where one thread is a producer and another thread is a consumer, types of memory synchronization can be adapted to these roles.
    Type: Application
    Filed: June 8, 2010
    Publication date: May 19, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Martin Ohmacht