Context Preserving (e.g., Context Swapping, Checkpointing, Register Windowing) Patents (Class 712/228)
  • Patent number: 8826073
    Abstract: A three-dimensional (3-D) processor system includes a first processor chip and a second processor chip in a stacked configuration. The first processor chip includes a first processor having a first set of state registers. The second processor chip includes a second processor having a second set of state registers that corresponds to the first set of state registers. The first and second processors are connected through vertical connections between the first and second processor chips. A mode control circuit operates the processor system in one of a plurality of operating modes. In one mode of operation, the first processor is active and the second processor is inactive, and the first processor operates at a speed greater than a maximum safe speed of the first processor, and the first processor uses the second set of state registers of the second processor to checkpoint a state of the first processor.
    Type: Grant
    Filed: September 4, 2012
    Date of Patent: September 2, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alper Buyuktosunoglu, Philip G. Emma, Allan M. Hartstein, Michael B. Healy, Krishnan K. Kailas
  • Publication number: 20140244985
    Abstract: Intelligent context management for thread switching is achieved by determining that a register bank has not been used by a thread for a predetermined number of dispatches, and responsively disabling the register bank for use by that thread. A counter is incremented each time the thread is dispatched but the register bank goes unused. Usage or non-usage of the register bank is inferred by comparing a previous checksum for the register bank to a current checksum. If the previous and current checksums match, the system concludes that the register bank has not been used. If a thread attempts to access a disabled bank, the processor takes an interrupt, enables the bank, and resets the corresponding counter. For a system utilizing transactional memory, it is preferable to enable all of the register banks when thread processing begins to avoid aborted transactions from register banks disabled by lazy context management techniques.
    Type: Application
    Filed: February 28, 2013
    Publication date: August 28, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Randal C. Swanberg
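    Illustration (not taken from the publication): the checksum-and-counter idea in the abstract above can be sketched roughly as follows. The class name, the use of CRC-32 as the checksum, the threshold parameter, and the choice to reset the counter when the contents change are all assumptions made for this example.

```python
import zlib


class BankTracker:
    """Tracks one register bank for one thread (hypothetical model)."""

    def __init__(self, threshold):
        self.threshold = threshold  # unused dispatches tolerated before disabling
        self.unused_count = 0
        self.enabled = True
        self.prev_checksum = None

    def on_dispatch_end(self, bank_bytes):
        """Call when the thread is switched out; bank_bytes is the bank's contents."""
        checksum = zlib.crc32(bank_bytes)
        if checksum == self.prev_checksum:
            self.unused_count += 1  # checksums match: infer the bank went unused
        else:
            self.unused_count = 0   # assumption: a change counts as use
        self.prev_checksum = checksum
        if self.unused_count >= self.threshold:
            self.enabled = False    # stop saving/restoring this bank for the thread

    def on_access_fault(self):
        """Models the interrupt taken when the thread touches a disabled bank."""
        self.enabled = True
        self.unused_count = 0
```

    A matching checksum only suggests the bank was unused: a thread that writes the same values back would look identical, which is why the sketch keeps the fault path that re-enables the bank.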
  • Patent number: 8806180
    Abstract: A scheduler in a process of a computer system detects a task with an associated execution context that has not been previously invoked by the scheduler. The scheduler executes the task on a processing resource without performing a context switch if the processing resource executed a previous task to completion. The scheduler stores the execution context originally associated with the task for later use.
    Type: Grant
    Filed: May 1, 2008
    Date of Patent: August 12, 2014
    Assignee: Microsoft Corporation
    Inventors: Paul F. Ringseth, Genevieve Fernandes
  • Patent number: 8806181
    Abstract: According to some embodiments, an apparatus having corresponding methods includes a storage module configured to store data and instructions; a first processor pipeline configured to process the data and instructions when the first processor pipeline is selected; a second processor pipeline configured to process the data and instructions when the second processor pipeline is selected; and a selection module configured to select either the first processor pipeline or the second processor pipeline.
    Type: Grant
    Filed: May 1, 2009
    Date of Patent: August 12, 2014
    Assignee: Marvell International Ltd.
    Inventors: R. Frank O'Bleness, Sujat Jamil, Timothy S. Beatty, Franco Ricci, Tom Hameenanttila, Hong-Yi Chen
  • Patent number: 8799710
    Abstract: A three-dimensional (3-D) processor system includes a first processor chip and a second processor chip in a stacked configuration. The first processor chip includes a first processor having a first set of state registers. The second processor chip includes a second processor having a second set of state registers that corresponds to the first set of state registers. The first and second processors are connected through vertical connections between the first and second processor chips. A mode control circuit operates the processor system in one of a plurality of operating modes. In one mode of operation, the first processor is active and the second processor is inactive, and the first processor operates at a speed greater than a maximum safe speed of the first processor, and the first processor uses the second set of state registers of the second processor to checkpoint a state of the first processor.
    Type: Grant
    Filed: June 28, 2012
    Date of Patent: August 5, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alper Buyuktosunoglu, Philip G. Emma, Allan M. Hartstein, Michael B. Healy, Krishnan K. Kailas
  • Patent number: 8793474
    Abstract: A first hardware thread executes a software program instruction, which instructs the first hardware thread to initiate a second hardware thread. As such, the first hardware thread identifies one or more register values accessible by the first hardware thread. Next, the first hardware thread copies the identified register values to one or more registers accessible by the second hardware thread. In turn, the second hardware thread accesses the copied register values included in the accessible registers and executes software code accordingly.
    Type: Grant
    Filed: September 20, 2010
    Date of Patent: July 29, 2014
    Assignee: International Business Machines Corporation
    Inventors: Giles Roger Frazier, Ronald P. Hall
  • Patent number: 8793433
    Abstract: A processor contains multiple levels of registers having different access latency. A relatively smaller set of registers is contained in a relatively faster higher level register bank, and a larger, more complete set of the registers is contained in a relatively slower lower level register bank. Physically, the higher level register bank is placed closer to functional logic which receives inputs from the registers. Selection logic enables selecting output of either register bank for input to processor execution logic. Preferably, the lower level bank includes a complete set of all processor registers, and the higher level bank includes a smaller subset of the registers, duplicating information in the lower level bank. The higher level bank is preferably accessible in a single clock cycle.
    Type: Grant
    Filed: August 8, 2007
    Date of Patent: July 29, 2014
    Assignee: International Business Machines Corporation
    Inventors: Nathan Samuel Nunamaker, Jack Chris Randolph, Kenichi Tsuchiya
  • Publication number: 20140208083
    Abstract: A data slot may be reserved for a first thread selected from a plurality of threads executed by a computer system. A memory of the computer system may comprise a plurality of log files and a next free data slot pointer. Each log file may comprise a plurality of data slots and each of the data slots may be of a common size. Reserving the data slot for the first thread may comprise attempting to perform a first atomic operation to write to a first data slot pointed to by a current value of the next free data slot pointer an indication that the first data slot is filled. If the first atomic operation is successful, the computer system may update the next free data slot pointer to point to a second data slot positioned sequentially after the first data slot. If the first atomic operation is unsuccessful, the computer system may analyze the second data slot.
    Type: Application
    Filed: January 18, 2013
    Publication date: July 24, 2014
    Applicant: MORGAN STANLEY
    Inventor: Graeme Burnett
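    Illustration (not taken from the publication): a minimal sketch of reserving a slot with a compare-and-swap style operation, as the abstract above describes. Python has no hardware atomic, so a lock stands in for it here; SlotLog, reserve, and the EMPTY/FILLED flags are hypothetical names.

```python
import threading

EMPTY, FILLED = 0, 1


class SlotLog:
    """One log file of fixed-size slots plus a next-free pointer (hypothetical)."""

    def __init__(self, n_slots):
        self.flags = [EMPTY] * n_slots
        self.next_free = 0
        self._lock = threading.Lock()  # stands in for a hardware atomic operation

    def _cas(self, index, expected, new):
        """Compare-and-swap on one slot flag; returns True on success."""
        with self._lock:
            if self.flags[index] == expected:
                self.flags[index] = new
                return True
            return False

    def reserve(self):
        """Return the index of a newly reserved slot, or None if the log is full."""
        i = self.next_free
        while i < len(self.flags):
            if self._cas(i, EMPTY, FILLED):
                # The pointer is only a hint: a stale value costs extra probes,
                # but the CAS still prevents two threads from taking one slot.
                self.next_free = i + 1
                return i
            i += 1  # CAS failed: another thread owns this slot, try the next one
        return None
```

    The sketch handles a single log file only; moving on to another log file when one fills up is left out.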
  • Patent number: 8789042
    Abstract: A processor includes guest mode control registers supporting guest mode operating behavior defined by guest context specified in the guest mode control registers. Root mode control registers support root mode operating behavior defined by root context specified in the root mode control registers. The guest context and the root context are simultaneously active to support virtualization of hardware resources such that multiple operating systems supporting multiple applications are executed by the hardware resources.
    Type: Grant
    Filed: September 27, 2010
    Date of Patent: July 22, 2014
    Assignee: MIPS Technologies, Inc.
    Inventor: James Robert Howard Hakewill
  • Patent number: 8776079
    Abstract: A task processor includes a CPU, a save circuit, and a task control circuit. A task control circuit is provided with a task selection circuit and state storage units associated with respective tasks. When executing a predetermined system call instruction, the CPU notifies the task control circuit accordingly. When informed of the execution of a system call instruction, the task control circuit selects a task to be subsequently executed in accordance with an output from the selection circuit. When an interrupt circuit receives a high-speed interrupt request signal, the task switching circuit controls the state transition of a task by executing an interrupt handling instruction designated by the interrupt circuit.
    Type: Grant
    Filed: November 20, 2012
    Date of Patent: July 8, 2014
    Assignee: Kernelon Silicon Inc.
    Inventor: Naotaka Maruyama
  • Publication number: 20140189329
    Abstract: Techniques are provided for handling a trap encountered in a thread that is part of a thread array that is being executed in a plurality of execution units. In these techniques, a data structure with an identifier associated with the thread is updated to indicate that the trap occurred during the execution of the thread array. Also in these techniques, the execution units execute a trap handling routine that includes a context switch. The execution units perform this context switch for at least one of the execution units as part of the trap handling routine while allowing the remaining execution units to exit the trap handling routine before the context switch. One advantage of the disclosed techniques is that the trap handling routine operates efficiently in parallel processors.
    Type: Application
    Filed: December 27, 2012
    Publication date: July 3, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Gerald F. LUIZ, Philip Alexander CUADRA, Luke DURANT, Shirish GADRE, Robert OHANNESSIAN, Lacky V. SHAH, Nicholas WANG, Arthur DANSKIN
  • Publication number: 20140189328
    Abstract: A computer processor, a computer system and a corresponding method involve a reservation station that stores instructions which are not ready for execution. The reservation station includes a storage area that is divided into bundles of entries. Each bundle is switchable between an open state in which instructions can be written into the bundle and a closed state in which instructions cannot be written into the bundle. A controller selects which bundles are open based on occupancy levels of the bundles.
    Type: Application
    Filed: December 27, 2012
    Publication date: July 3, 2014
    Inventors: Tomer WEINER, Zeev SPERBER, Sagi LAHAV, Guy PATKIN, Gavri BERGER, Itamar FELDMAN, Ofer LEVY, Sara YAKOEL, Adi YOAZ
  • Patent number: 8762692
    Abstract: Methods and apparatuses for reducing power consumption of processor switch operations are disclosed. One or more embodiments may comprise specifying a subset of registers or state storage elements to be involved in a register or state storage operation, performing the register or state storage operation, and performing a switch operation. The embodiments may minimize the number of registers or state storage elements involved with the standby operation by specifying only the subset of registers or state storage elements, which may involve considerably fewer than the total number of registers or state storage elements of the processor. The switch operation may be a switch from one mode to another, such as a transition to or from a sleep mode, a context switch, or the execution of various types of instructions.
    Type: Grant
    Filed: September 27, 2007
    Date of Patent: June 24, 2014
    Assignee: Intel Corporation
    Inventors: Ethan Schuchman, Hong Wang, Chris Weaver, Belliappa M Kuttanna, Asit Mallick, Vivek K De, Per Hammarlund
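    Illustration (not taken from the patent): saving only a specified subset of registers, as the abstract above describes, can be sketched with a bitmask. The function names and the mask encoding (bit i selects register i) are assumptions for this example.

```python
def save_subset(registers, mask):
    """Copy out only the registers whose bit is set in mask (bit i -> register i)."""
    return {i: registers[i] for i in range(len(registers)) if (mask >> i) & 1}


def restore_subset(registers, saved):
    """Write the saved values back in place; unselected registers are untouched."""
    for index, value in saved.items():
        registers[index] = value
```

    For example, with four registers and mask 0b0101, only registers 0 and 2 are copied, so the save touches half of the state.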
  • Patent number: 8751833
    Abstract: A data processing apparatus is provided comprising first processing circuitry, second processing circuitry and shared processing circuitry. The first processing circuitry and second processing circuitry are configured to operate in different first and second power domains respectively and the shared processing circuitry is configured to operate in a shared power domain. The data processing apparatus forms a uni-processing environment for executing a single instruction stream in which either the first processing circuitry and the shared processing circuitry operate together to execute the instruction stream or the second processing circuitry and the shared processing circuitry operate together to execute the single instruction stream. Execution flow transfer circuitry is provided for transferring at least one bit of processing-state restoration information between the two hybrid processing units.
    Type: Grant
    Filed: April 30, 2010
    Date of Patent: June 10, 2014
    Assignee: ARM Limited
    Inventor: Stephen John Hill
  • Publication number: 20140156976
    Abstract: Techniques and mechanisms for a processor to determine whether a commit action is to be performed. In an embodiment, a processor performs operations to determine whether a commit instruction is for contingent performance of a commit action. In another embodiment, one or more conditions of processor state are evaluated in response to determining that the commit instruction is for contingent performance of the commit action, where the evaluation is performed to determine whether the commit action indicated by the commit instruction is to be performed.
    Type: Application
    Filed: December 22, 2011
    Publication date: June 5, 2014
    Inventors: Enric Gibert Codina, Josep M. Codina, Fernando Latorre, Pedro Marcuello, Pedro Lopez, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos E. Kotselidis, Marc Lupon, Carlos Madriles Gimeno, Grigorios Magklis, Alejandro Martinez Vicente, Raul Martinez, Daniel Ortega, Demos Pavlou, Kyriakos A. Stavrou, Georgios Tournavitis, Polychronis Xekalakis
  • Patent number: 8719827
    Abstract: A processor for sequentially executing a plurality of programs using a plurality of register value groups stored in a memory that correspond one-to-one with the programs.
    Type: Grant
    Filed: July 11, 2011
    Date of Patent: May 6, 2014
    Assignee: Panasonic Corporation
    Inventors: Kazushi Kurata, Tetsuya Tanaka, Nobuo Higaki, Kunihiko Hayashi, Hiroshi Kadota, Tokuzo Kiyohara, Kozo Kimura, Hideshi Nishida, Kazuya Furukawa, Shigeki Fujii, Toshio Sugimura
  • Publication number: 20140122845
    Abstract: In one embodiment, the present invention includes a processor having a core to execute instructions. This core can include various structures and logic that enable instructions of different atomic regions to be executed in an overlapping manner. To this end, the core can include a register file having registers to store data for use in execution of the instructions, and multiple shadow register files each to store a register checkpoint on initiation of a given atomic region. In this way, overlapping execution of atomic regions identified by a programmer or compiler can occur. Other embodiments are described and claimed.
    Type: Application
    Filed: December 30, 2011
    Publication date: May 1, 2014
    Inventors: Jaewoong Chung, Cheng Wang, Youfeng Wu
  • Publication number: 20140122844
    Abstract: Intelligent context management for thread switching is achieved by determining that a register bank has not been used by a thread for a predetermined number of dispatches, and responsively disabling the register bank for use by that thread. A counter is incremented each time the thread is dispatched but the register bank goes unused. Usage or non-usage of the register bank is inferred by comparing a previous checksum for the register bank to a current checksum. If the previous and current checksums match, the system concludes that the register bank has not been used. If a thread attempts to access a disabled bank, the processor takes an interrupt, enables the bank, and resets the corresponding counter. For a system utilizing transactional memory, it is preferable to enable all of the register banks when thread processing begins to avoid aborted transactions from register banks disabled by lazy context management techniques.
    Type: Application
    Filed: November 1, 2012
    Publication date: May 1, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Randal C. Swanberg
  • Patent number: 8707062
    Abstract: For one disclosed embodiment, a processor comprises a first processor core, a second processor core, and a cache memory. The first processor core is to save a state of the first processor core and to enter a mode in which the first processor core is powered off. The second processor core is to save a state of the second processor core and to enter a mode in which the second processor core is powered off. The cache memory is to be powered when the first processor core is powered off. The first processor core is to restore the saved state of the first processor core in response to the first processor core transitioning to a mode in which the first processor core is powered. The second processor core is to restore the saved state of the second processor core in response to the second processor core transitioning to a mode in which the second processor core is powered. Other embodiments are also disclosed.
    Type: Grant
    Filed: February 16, 2010
    Date of Patent: April 22, 2014
    Assignee: Intel Corporation
    Inventors: Sanjeev Jahagirdar, Varghese George, John B. Conrad, Robert Milstrey, Stephen A. Fischer, Alon Naveh, Shai Rotem
  • Patent number: 8700883
    Abstract: A memory access technique that provides for overriding a translation lookaside buffer and page table data structure, in accordance with one embodiment of the present invention, includes selectively translating a virtual address directly to a physical address utilizing an adjustment in a context specifier, or translating the virtual address to the physical address utilizing a translation lookaside buffer or page table data structure.
    Type: Grant
    Filed: October 24, 2006
    Date of Patent: April 15, 2014
    Assignee: Nvidia Corporation
    Inventors: David B. Glasco, John S. Montrym
  • Patent number: 8694758
    Abstract: When legacy instructions, that can only operate on smaller registers, are mixed with new instructions in a processor with larger registers, special handling and architecture are used to prevent the legacy instructions from causing problems with the data in the upper portion of the registers, i.e., the portion that they cannot directly access. In some embodiments, the upper portion of the registers is saved to temporary storage while the legacy instructions are operating, and restored to the upper portion of the registers when the new instructions are operating. A special instruction may also be used to disable this save/restore operation if the new instructions are not going to use the upper part of the registers.
    Type: Grant
    Filed: December 27, 2007
    Date of Patent: April 8, 2014
    Assignee: Intel Corporation
    Inventors: Doron Orenstien, Zeev Sperber, Robert Valentine, Benny Eitan
  • Publication number: 20140095847
    Abstract: A processor uses multiple banks of an extended register set to store the contexts of multiple user-level threads. A current bank register provides a pointer to the bank that is currently active. A first thread saves its context (first context) in a first bank of the extended register set and a second thread saves its context (second context) in a second bank of the extended register set. When the processor receives an instruction for exchanging contexts between the first thread and the second thread, the processor changes the pointer from the first bank to the second bank, and executes the second thread using the second context stored in the second bank.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventor: Doron Orenstein
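    Illustration (not taken from the publication): the pointer-based exchange in the abstract above can be modeled as below. BankedRegisterFile and its method names are hypothetical; the point of the sketch is that switching threads changes one pointer and copies no register values.

```python
class BankedRegisterFile:
    """Extended register set split into banks (hypothetical model)."""

    def __init__(self, n_banks, regs_per_bank):
        self.banks = [[0] * regs_per_bank for _ in range(n_banks)]
        self.current_bank = 0  # models the "current bank register" pointer

    def read(self, reg):
        return self.banks[self.current_bank][reg]

    def write(self, reg, value):
        self.banks[self.current_bank][reg] = value

    def exchange_context(self, target_bank):
        """Switch threads by repointing; no register values are copied."""
        self.current_bank = target_bank
```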
  • Publication number: 20140095848
    Abstract: Operand liveness state information is maintained during context switches for current architected operands of executing programs, the current operand state information indicating whether corresponding current operands are any one of enabled or disabled for use by a first program module, the first program module comprising machine instructions of an instruction set architecture (ISA) for disabling current architected operands, wherein a current operand is accessed by a machine instruction of said first program module, the accessing comprising using the current operand state information to determine whether a previously stored current operand value is accessible by the first program module.
    Type: Application
    Filed: December 9, 2013
    Publication date: April 3, 2014
    Applicant: International Business Machines Corporation
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 8689221
    Abstract: In an embodiment, asynchronous conflict events are received during a previous rollback period. Each of the asynchronous conflict events represent conflicts encountered by speculative execution of a first plurality of work units and may be received out-of-order. During a current rollback period, a first work unit is determined whose speculative execution raised one of the asynchronous conflict events, and the first work unit is older than all other of the first plurality of work units. A second plurality of work units are determined, whose ages are equal to or older than the first work unit, wherein each of the second plurality of work units are assigned to respective executing threads. Rollbacks of the second plurality of work units are performed. After the rollbacks of the second plurality of work units are performed, speculative executions of the second plurality of work units are initiated in age order, from oldest to youngest.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: April 1, 2014
    Assignee: International Business Machines Corporation
    Inventors: Thomas M. Gooding, John K. O'Brien, Kai-Ting Amy Wang, Xiaotong Zhuang
  • Patent number: 8688963
    Abstract: The embodiments described in the instant application provide a system for generating checkpoints. In the described embodiments, while speculatively executing instructions with one or more checkpoints in use, upon detecting an occurrence of a predetermined operating condition or encountering a predetermined type of instruction, the system is configured to determine whether an additional checkpoint is to be generated by computing a factor based on one or more operating conditions of the processor. When the factor is greater than a predetermined value, the processor is configured to generate the additional checkpoint.
    Type: Grant
    Filed: April 22, 2010
    Date of Patent: April 1, 2014
    Assignee: Oracle International Corporation
    Inventors: Shailender Chaudhry, Martin R. Karlsson, Sherman H. Yip
  • Patent number: 8683184
    Abstract: A method for implementing multi context execution on a video processor having a scalar execution unit and a vector execution unit. The method includes allocating a first task to a vector execution unit and allocating a second task to the vector execution unit. The first task is from a first context and the second task is from a second context. The method further includes interleaving a plurality of work packages comprising the first task and the second task to generate a combined work package stream. The combined work package stream is subsequently executed on the vector execution unit.
    Type: Grant
    Filed: November 4, 2005
    Date of Patent: March 25, 2014
    Assignee: Nvidia Corporation
    Inventors: Stephen D. Lew, Ashish Karandikar, Shirish Gadre, Franciscus W. Sijstermans
  • Patent number: 8677163
    Abstract: Embodiments of an invention related to context state management based on processor features are disclosed. In one embodiment, a processor includes instruction logic and state management logic. The instruction logic is to receive a state management instruction having a parameter to identify a subset of the features supported by the processor. The state management logic is to perform a state management operation specified by the state management instruction.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: March 18, 2014
    Assignee: Intel Corporation
    Inventors: Don Van Dyke, Michael Mishaeli, Ittai Anati, Baiju V. Patel, Will Deutsch, Rajesh R. Sha, Gilbert Neiger, James B. Crossland, Chris J. Newburn, Bryant E. Bigbee, Muhammad Faisal Azeem, John L. Reid, Dion Rodgers
  • Patent number: 8677105
    Abstract: A unified architecture for dynamic generation, execution, synchronization and parallelization of complex instruction formats includes a virtual register file, register cache and register file hierarchy. A self-generating and synchronizing dynamic and static threading architecture provides efficient context switching.
    Type: Grant
    Filed: November 14, 2007
    Date of Patent: March 18, 2014
    Assignee: Soft Machines, Inc.
    Inventor: Mohammad A. Abdallah
  • Patent number: 8671232
    Abstract: A system and method for dynamically migrating stash transactions include first and second processing cores, an input/output memory management unit (IOMMU), an IOMMU mapping table, an input/output (I/O) device, a stash transaction migration management unit (STMMU), and an operating system (OS) scheduler. The first core executes a first thread associated with a frame manager. The OS scheduler migrates the first thread from the first core to the second core and generates pre-empt notifiers to indicate scheduling-out and scheduling-in of the first thread from the first core and to the second core. The STMMU uses the pre-empt notifiers to enable dynamic stash transaction migration.
    Type: Grant
    Filed: March 7, 2013
    Date of Patent: March 11, 2014
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Vakul Garg, Varun Sethi
  • Patent number: 8667256
    Abstract: One embodiment of a computing system configured to manage divergent threads in a thread group includes a stack configured to store at least one token and a multithreaded processing unit. The multithreaded processing unit is configured to perform the steps of fetching a program instruction, determining that the program instruction is a branch instruction, determining that the program instruction is not a return or break instruction, determining whether the program instruction includes a set-synchronization bit, and updating an active program counter, where the manner in which the active program counter is updated depends on a branch instruction type.
    Type: Grant
    Filed: June 1, 2009
    Date of Patent: March 4, 2014
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm
  • Patent number: 8661232
    Abstract: In a data processing apparatus 1 having registers 6, when a state saving trigger event occurs while a result value of a data processing operation is still to be written to a destination register then saving and restoring control circuitry 12 selects a state saving sequence defining a temporal order for saving register values to a backup data store 10. The sequence is selected to provide the destination register with a position within the sequence corresponding to a time after the result value has been written to the destination register. The register values are then saved to the backup data store 10 in the order of the selected state saving sequence. A similar technique can be used when a state restoring trigger event triggers loading of the data values from the backup data store 10 to the registers 6.
    Type: Grant
    Filed: September 16, 2010
    Date of Patent: February 25, 2014
    Assignee: ARM Limited
    Inventors: Antony John Penton, Simon Axford
  • Patent number: 8656399
    Abstract: A computer-implemented method of performing runtime analysis on and control of a multithreaded computer program. One embodiment of the present invention can include identifying threads of a computer program to be analyzed. Under control of a supervisor thread, a plurality of the identified threads can be folded together to be executed as a folded thread. The execution of the folded thread can be monitored to determine a status of the identified threads. An indicator corresponding to the determined status of the identified threads can be presented in a user interface that is presented on a display.
    Type: Grant
    Filed: March 23, 2012
    Date of Patent: February 18, 2014
    Assignee: International Business Machines Corporation
    Inventor: Kirk J. Krauss
  • Publication number: 20140040595
    Abstract: A processor may efficiently implement register renaming and checkpoint repair even in instruction set architectures with large numbers of wide (bit-width) registers by (i) renaming all destination operand register targets, (ii) implementing free list and architectural-to-physical mapping table as a combined array storage with unitary (or common) read, write and checkpoint pointer indexing and (iii) storing checkpoints as snapshots of the mapping table, rather than of actual register contents. In this way, uniformity (and timing simplicity) of the decode pipeline may be accentuated and architectural-to-physical mappings (or allocable mappings) may be efficiently shuttled between free-list, reorder buffer and mapping table stores in correspondence with instruction dispatch and completion as well as checkpoint creation, retirement and restoration.
    Type: Application
    Filed: August 1, 2012
    Publication date: February 6, 2014
    Applicant: FREESCALE SEMICONDUCTOR, INC.
    Inventor: Thang M. Tran
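    Illustration (not taken from the publication): a minimal sketch of point (iii) in the abstract above, where a checkpoint is a snapshot of the architectural-to-physical mapping rather than of register contents. RenameMap and its methods are hypothetical names. The combined array with shared pointer indexing from point (ii) is not modeled; the free list is kept as a separate Python list for clarity.

```python
class RenameMap:
    """Architectural-to-physical register mapping with snapshots (hypothetical)."""

    def __init__(self, n_arch, n_phys):
        self.table = list(range(n_arch))         # arch reg index -> phys reg
        self.free = list(range(n_arch, n_phys))  # unallocated physical regs
        self.checkpoints = []

    def rename_dest(self, arch_reg):
        """Give a destination register a fresh physical register."""
        phys = self.free.pop(0)
        self.table[arch_reg] = phys
        return phys

    def checkpoint(self):
        """Snapshot the mapping and free list only; register values are not copied."""
        self.checkpoints.append((list(self.table), list(self.free)))

    def restore(self):
        """Roll back to the most recent checkpoint."""
        self.table, self.free = self.checkpoints.pop()
```

    The snapshot size depends on the number of registers, not on their bit width, which is the property the abstract points to for wide registers.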
  • Publication number: 20140032885
    Abstract: Example methods and apparatus to manage partial commit-checkpoints are disclosed. A disclosed example method includes identifying a commit instruction associated with a region of instructions executed by a processor, identifying candidate instructions from the region of instructions, and generating a processor partial commit-checkpoint to save a current state of the processor, the checkpoint based on calculated register values associated with live instructions, and including instruction reference addresses to link the candidate instructions.
    Type: Application
    Filed: September 30, 2013
    Publication date: January 30, 2014
    Inventors: Edson Borin, Youfeng Wu
  • Publication number: 20140032884
    Abstract: Reclaiming checkpoints in a system in an order that differs from the order when the checkpoints are created. Reclaiming the checkpoints includes: creating one or more checkpoints, each of which has an initial state using system resources and holding the checkpoint's state; identifying the completion of all the instructions associated with the checkpoint; reassigning all the instructions associated with the identified checkpoint to an immediately preceding checkpoint; and freeing the resources associated with the identified checkpoint. The checkpoint is created when the instruction that is checked is a conditional branch having a direction that cannot be predicted with a predetermined confidence level.
    Type: Application
    Filed: July 26, 2012
    Publication date: January 30, 2014
    Applicant: International Business Machines Corporation
    Inventors: Anil Krishna, Ganesh Balakrishnan, Gordon B. Bell
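    The reclamation steps in this abstract (detect that all of a checkpoint's instructions have completed, reassign them to the immediately preceding checkpoint, free the checkpoint) can be sketched roughly as follows. The classes `Checkpoint` and `CheckpointTracker` and their methods are hypothetical names, not taken from the application. The sketch assumes at least one checkpoint exists before instructions are dispatched, and it only reclaims a checkpoint that has a predecessor.

```python
class Checkpoint:
    def __init__(self, cp_id):
        self.cp_id = cp_id
        self.instrs = set()   # instructions that would roll back to this checkpoint
        self.pending = set()  # subset of instrs that has not completed yet


class CheckpointTracker:
    """Toy model of reclaiming checkpoints out of creation order (assumed design)."""

    def __init__(self):
        self.active = []      # live checkpoints, oldest first
        self._next_id = 0

    def create(self):
        cp = Checkpoint(self._next_id)
        self._next_id += 1
        self.active.append(cp)
        return cp.cp_id

    def dispatch(self, instr):
        """Associate a new instruction with the youngest live checkpoint."""
        cp = self.active[-1]
        cp.instrs.add(instr)
        cp.pending.add(instr)

    def complete(self, instr):
        """Mark an instruction complete; reclaim its checkpoint if nothing is pending."""
        for index, cp in enumerate(self.active):
            if instr in cp.pending:
                cp.pending.discard(instr)
                if not cp.pending and index > 0:
                    self._reclaim(index)
                return

    def _reclaim(self, index):
        # Reassign the instructions to the immediately preceding checkpoint,
        # then drop the checkpoint so its resources can be reused.
        cp = self.active.pop(index)
        self.active[index - 1].instrs |= cp.instrs
```

    With three checkpoints 0, 1 and 2, completing every instruction of checkpoint 1 first frees it while 0 and 2 stay live, which is the out-of-order behavior the abstract describes.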
  • Patent number: 8640008
    Abstract: A data processing apparatus has error detection units each configured to generate an error signal if a first and second sample of a signal associated with execution of an instruction differ. Error value generation circuitry generates an error value showing if any of the error detection units have generated the error signal. Error value stabilisation circuitry performs a stabilisation procedure comprising re-sampling the error value to remove metastability. Error recovery circuitry initiates re-execution of the instruction if the error value is asserted. Count circuitry holds a counter value in association with the error value, the counter value set to a predetermined value when the error value is generated and decremented each time the error value is re-sampled prior to reaching the error value stabilisation circuitry. The error value bypasses the stabilisation procedure if the counter value is zero before the error value reaches the error value stabilisation circuitry.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: January 28, 2014
    Assignee: ARM Limited
    Inventors: Guillaume Schon, Luca Scalabrino, Frederic Claude Marie Piry, David Michael Bull
  • Patent number: 8635627
    Abstract: A method, medium and apparatus for storing and restoring a register context for a fast context switching between tasks is disclosed. The method, medium and apparatus may improve overall operating speed of a system by increasing the speed of context switching. The method may include adding an update code for updating information of live registers to a task file that includes a code of a task to perform a specified function, converting the task file having the update code added thereto into a run file, updating the information of the live registers with the update code during running of the task using the run file, and storing a live register context according to the updated information of the registers.
    Type: Grant
    Filed: December 12, 2006
    Date of Patent: January 21, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung-keun Park, Keun-soo Yim, Woon-gee Kim, Jeong-joon Yoo, Kyoung-ho Kang, Chae-seok Im, Jae-don Lee
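    Saving only the live registers at a context switch might look like the following sketch. It assumes the update code inserted into the task maintains a bitmask of live registers; the bitmask representation and the function names `save_live_context` and `restore_live_context` are illustrative assumptions, not details from the patent.

```python
def save_live_context(registers, live_mask):
    """Return {index: value} for each register whose bit is set in live_mask.

    live_mask is assumed to be kept up to date by update code in the task.
    """
    return {i: value for i, value in enumerate(registers) if (live_mask >> i) & 1}


def restore_live_context(registers, saved):
    """Write the saved live register values back in place."""
    for i, value in saved.items():
        registers[i] = value
```

    Registers that are dead at the switch point are never copied, so the saved context shrinks with the number of live registers.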
  • Publication number: 20140019735
    Abstract: A computer architecture allows for simplified exception handling by restarting the program after exceptions at the beginning of idempotent regions, the idempotent regions allowing re-execution without the need for restoring complex state information from checkpoints. Recovery from mis-speculation may be provided by a similar mechanism but using smaller idempotent regions reflecting a more frequent occurrence of mis-speculation. A compiler generating different idempotent regions for speculation and exception handling is also disclosed.
    Type: Application
    Filed: July 13, 2012
    Publication date: January 16, 2014
    Inventors: Jaikrishnan Menon, Marc Asher De Kruijf, Karthikeyan Sankaralingam
  • Patent number: 8631261
    Abstract: Embodiments of an invention related to context state management based on processor features are disclosed. In one embodiment, a processor includes instruction logic and state management logic. The instruction logic is to receive a state management instruction having a parameter to identify a subset of the features supported by the processor. The state management logic is to perform a state management operation specified by the state management instruction.
    Type: Grant
    Filed: December 31, 2007
    Date of Patent: January 14, 2014
    Assignee: Intel Corporation
    Inventors: Don A. Van Dyke, Michael Mishaeli, Ittai Anati, Baiju V. Patel, Will Deutsch, Rajesh Shah, Gilbert Neiger, James B. Crossland, Chris J. Newburn, Bryant E. Bigbee, Muhammad Faisal Azeem, John L. Reid, Dion Rodgers
  • Patent number: 8631223
    Abstract: A processor includes an instruction sequencing unit, execution unit, and multi-level register file including a first level register file having a lower access latency and a second level register file having a higher access latency. Responsive to the processor processing a second instruction in a transactional code section to obtain as an execution result a second register value of the logical register, the mapper moves a first register value of the logical register to the second level register file, places the second register value in the first level register file, marks the second register value as speculative, and replaces a first mapping for the logical register with a second mapping. Responsive to unsuccessful termination of the transactional code section, the mapper designates the second register value in the first level register file as invalid so that the first register value in the second level register file becomes the working value.
    Type: Grant
    Filed: May 12, 2010
    Date of Patent: January 14, 2014
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Hung Q. Le, Dung Q. Nguyen
  • Patent number: 8626821
    Abstract: A device for limiting access to information corresponding to a context. The device has context logic configured to determine context and a trusted computing platform coupled to the context logic. The trusted computing platform has a computer readable medium and a filter coupled to the computer readable medium. The filter is configured to receive context from the context logic and retrieve context information from the computer readable medium associated with the context. Thus, the trusted platform limits access to the context information in the database unless context associated with the context information is received.
    Type: Grant
    Filed: December 27, 2004
    Date of Patent: January 7, 2014
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Cyril Brignone, Salil Pradhan
  • Publication number: 20140006758
    Abstract: A processor saves micro-architectural contexts to increase the efficiency of code execution and power management. A save instruction is executed to store a micro-architectural state and an architectural state of a processor in a common buffer of a memory upon a context switch that suspends the execution of a process. The micro-architectural state contains performance data resulting from the execution of the process. A restore instruction is executed to retrieve the micro-architectural state and the architectural state from the common buffer upon a resumed execution of the process. Power management hardware then uses the micro-architectural state as an intermediate starting point for the resumed execution.
    Type: Application
    Filed: June 29, 2012
    Publication date: January 2, 2014
    Applicant: Intel Corporation
    Inventors: Efraim Rotem, Eliezer Weissmann, Michael Mishaeli, Boris Ginzburg, Alon Naveh
  • Patent number: 8621238
    Abstract: An apparatus, method and program product are provided for securing a computer system. A digital signature of an application is checked, which is loaded into a memory of the computer system configured to contain memory pages. In response to finding a valid digital signature, memory pages containing instructions of the application are set as executable and memory pages other than those containing instructions of the application are set as non-executable. Instructions in executable memory pages are executed. Instructions in non-executable memory pages are prevented from being executed. A page fault is generated in response to an attempt to execute an instruction in a non-executable memory page. In response to the page fault, an exception list of a sequence of instructions is checked for the attempted instruction in the non-executable memory page and if on the list, the page is set to executable and the attempted instruction executed.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: December 31, 2013
    Assignee: The United States of America as represented by the Secretary of the Air Force
    Inventor: William B Kimball
  • Patent number: 8601235
    Abstract: A shared memory management system and method are described. In one embodiment, a memory management system includes a memory management unit for concurrently managing memory access requests from a plurality of engines. The shared memory management system independently controls access to the context memory without interference from other engine activities. In one exemplary implementation, the memory management unit tracks an identifier for each of the plurality of engines making a memory access request. The memory management unit associates each of the plurality of engines with particular translation information respectively. This translation information is specified by a block bind operation. In one embodiment the translation information is stored in a portion of instance memory. A memory management unit can be non-blocking and can also permit a hit under miss.
    Type: Grant
    Filed: December 30, 2009
    Date of Patent: December 3, 2013
    Assignee: Nvidia Corporation
    Inventors: David B. Glasco, John S. Montrym, Lingfeng Yuan
  • Patent number: 8589925
    Abstract: Various technologies and techniques are disclosed for switching threads within routines. A controller routine receives a request from an originating routine to execute a coroutine, and executes the coroutine on an initial thread. The controller routine receives a response back from the coroutine when the coroutine exits based upon a return statement. Upon return, the coroutine indicates a subsequent thread that the coroutine should be executed on when the coroutine is executed a subsequent time. The controller routine executes the coroutine the subsequent time on the subsequent thread. The coroutine picks up execution at a line of code following the return statement. Multiple return statements can be included in the coroutine, and the threads can be switched multiple times using this same approach. Graphical user interface logic and worker thread logic can be co-mingled into a single routine.
    Type: Grant
    Filed: October 25, 2007
    Date of Patent: November 19, 2013
    Assignee: Microsoft Corporation
    Inventor: Krzysztof Cwalina
  • Patent number: 8578139
    Abstract: A data processing apparatus and method of data processing are provided. The data processing apparatus comprises execution circuitry configured to execute a sequence of program instructions. Checkpoint circuitry is configured to identify an instance of a predetermined type of instruction in the sequence of program instructions and to store checkpoint information associated with that instance. The checkpoint information identifies a state of the data processing apparatus prior to execution of that instance of the predetermined type of instruction, wherein the predetermined type of instruction has an expected long completion latency.
    Type: Grant
    Filed: August 5, 2010
    Date of Patent: November 5, 2013
    Assignee: ARM Limited
    Inventors: Nicolas Chaussade, Florent Begon, Mélanie Emanuelle Lucie Teyssier, Rémi Teyssier, Jocelyn Francois Orion Jaubert
  • Patent number: 8578138
    Abstract: In one embodiment, the present invention includes a processor that has an on-die storage such as a static random access memory to store an architectural state of one or more threads that are swapped out of architectural state storage of the processor on entry to a system management mode (SMM). In this way communication of this state information to a system management memory can be avoided, reducing latency associated with entry into SMM. Embodiments may also enable the processor to update a status of executing agents that are either in a long instruction flow or in a system management interrupt (SMI) blocked state, in order to provide an indication to agents inside the SMM. Other embodiments are described and claimed.
    Type: Grant
    Filed: August 31, 2009
    Date of Patent: November 5, 2013
    Assignee: Intel Corporation
    Inventors: Mahesh S. Natu, Thanunathan Rangarajan, Gautam B. Doshi, Shammanna M. Datta, Baskaran Ganesan, Mohan J. Kumar, Rajesh S. Parthasarathy, Frank Binns, Rajesh Nagaraja Murthy, Robert C. Swanson
  • Publication number: 20130290688
    Abstract: Embodiments of the present invention provide for concurrent instruction execution in heterogeneous computer systems by forming a parallel execution context whenever a first software thread encounters a parallel execution construct. The parallel execution context may comprise a reference to instructions to be executed concurrently, a reference to data said instructions may depend on, and a parallelism level indicator whose value specifies the number of times said instructions are to be executed. The first software thread may then signal to other software threads to begin concurrent execution of instructions referenced in said context. Each software thread may then decrease the parallelism level indicator and copy data referenced in the parallel execution context to said thread's private memory location and modify said data to accommodate for the new location. Software threads may be executed by a processor and operate on behalf of other processing devices or remote computer systems.
    Type: Application
    Filed: April 22, 2013
    Publication date: October 31, 2013
    Inventor: Stanislav Victorovich Bratanov
  • Patent number: 8572355
    Abstract: One embodiment of the present invention sets forth a method for executing a non-local return instruction in a parallel thread processor. The method comprises the steps of receiving, within the thread group, a first long jump instruction and, in response, popping a first token from the execution stack. The method also comprises determining whether the first token is a first long jump token that was pushed onto the execution stack when a first push instruction associated with the first long jump instruction was executed, and when the first token is the first long jump token, jumping to the second instruction based on the address specified by the first long jump token, or, when the first token is not the first long jump token, disabling the active thread until the first long jump token is popped from the execution stack.
    Type: Grant
    Filed: September 13, 2010
    Date of Patent: October 29, 2013
    Assignee: Nvidia Corporation
    Inventors: Guillermo Juan Rozas, Brett W. Coon
  • Patent number: 8566841
    Abstract: Processing data communications events in a parallel active messaging interface (‘PAMI’) of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for the context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.
    Type: Grant
    Filed: November 10, 2010
    Date of Patent: October 22, 2013
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith