Context Preserving (e.g., Context Swapping, Checkpointing, Register Windowing) Patents (Class 712/228)
-
Patent number: 9396012
Abstract: An apparatus includes a processor and a guest operating system. In response to receiving a request to create a task, the guest operating system requests a hypervisor to create a virtual processor to execute the requested task. The virtual processor is schedulable on the processor.
Type: Grant
Filed: March 14, 2013
Date of Patent: July 19, 2016
Assignee: Qualcomm Incorporated
Inventors: Erich James Plondke, Lucian Codrescu
-
Patent number: 9384036
Abstract: A method includes performing one or more operations as requested by a thread executing on a processor, the thread having a thread context; receiving a park request from the thread, the park request received following a request from the thread for a low latency resource, wherein the cache response time is less than or equal to a resource response threshold so as to allow the thread context to be stored and retrieved from the cache in less time than the portion of time it takes to complete the request for the low latency resource; storing the thread context in the cache; detecting that the resume condition has occurred; retrieving the thread context from the cache; and resuming execution of the thread.
Type: Grant
Filed: October 21, 2013
Date of Patent: July 5, 2016
Assignee: Google Inc.
Inventors: Luiz Andre Barroso, James Laudon, Michael R. Marty
-
Patent number: 9372718
Abstract: A system and method for executing a transaction in a transactional memory system is disclosed. The system includes a processor of a plurality of processors coupled to shared memory, wherein the processor is configured to execute a section of code, including a plurality of memory access operations to the shared memory, as an atomic transaction relative to the execution of the plurality of processors. According to embodiments, the processor is configured to determine whether the memory access operations include any of a set of disallowed instructions, wherein the set includes one or more instructions that operate differently in a virtualized computing environment than in a native computing environment. If any of the memory access operations are ones of the disallowed instructions, then the processor aborts the transaction.
Type: Grant
Filed: July 28, 2009
Date of Patent: June 21, 2016
Assignee: Advanced Micro Devices, Inc.
Inventors: David S. Christie, Michael P. Hohmuth, Stephan Diestelhorst
-
Patent number: 9354926
Abstract: Various systems, processes, and products may be used to manage a processor. In particular implementations, managing a processor may include the ability to determine whether a thread is pausing for a short period of time and place a wait event for the thread in a queue based on a short thread pause occurring. Managing a processor may also include the ability to activate a delay thread that determines whether a wait time associated with the pause has expired and remove the wait event from the queue based on the wait time having expired.
Type: Grant
Filed: March 22, 2011
Date of Patent: May 31, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bernard A. King-Smith, Bret R. Olszewski, Stephen Rees, Basu Vaidyanathan
-
Patent number: 9342688
Abstract: Disclosed is a method for inheriting a non-secure thread context. In the method, a first secure monitor call associated with a first non-secure thread of a non-secure environment of a processing system is received. A first secure thread is created, in response to the first secure monitor call, that inherits a first interrupt state of the first non-secure thread.
Type: Grant
Filed: March 7, 2013
Date of Patent: May 17, 2016
Assignee: QUALCOMM Incorporated
Inventors: Samar Asbe, Tero M. Kukola, Paul Richard Ellis, Qazi Y. Bashir, Suresh Bollapragada
-
Patent number: 9323315
Abstract: A system and method for power management by performing clock-gating at a clock source. In the method a critical stall condition is detected within a clocked component of a core of a processing unit. The core includes one or more clocked components synchronized in operation by a clock signal distributed by a clock grid. The clock grid is clock-gated to suspend distribution of the clock signal to the core during the critical stall condition.
Type: Grant
Filed: August 15, 2012
Date of Patent: April 26, 2016
Assignee: NVIDIA CORPORATION
Inventor: Guillermo Juan Rozas
-
Patent number: 9304540
Abstract: Embodiments are described for handling the launching of applications in a multi-screen device. In embodiments, a first touch sensitive display of a first screen receives input to launch an application. In response, the application is launched and a window of the first application is displayed on the first display. A second touch sensitive display of a second screen receives input to launch a second application. In response, the second application is launched and a second window of the second application is displayed on the second display. In embodiments, when an application is launched, it displays the view of the application (whether on the first touch sensitive display or the second touch sensitive display) that was displayed when the application was last closed.
Type: Grant
Filed: September 29, 2011
Date of Patent: April 5, 2016
Assignee: Z124
Inventor: Ron Cassar
-
Patent number: 9239801
Abstract: An example processing system may comprise: a lower stack bound register configured to store a first memory address, the first memory address identifying a lower bound of a memory addressable via a stack segment; an upper stack bound register configured to store a second memory address, the second memory address identifying an upper bound of the memory addressable via the stack segment; and a stack bounds checking logic configured to detect unauthorized stack pivoting, by comparing a memory address being accessed via the stack segment with at least one of the first memory address and the second memory address.
Type: Grant
Filed: June 5, 2013
Date of Patent: January 19, 2016
Assignee: Intel Corporation
Inventors: Baiju V. Patel, Xiaoning Li, H P. Anvin, Asit K. Mallick, Gilbert Neiger, James B. Crossland, Toby Opferman, Atul A. Khare, Jason W. Brandt, James S. Coke, Brian L. Vajda
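Illustration: the comparison described in the abstract of 9239801 can be sketched in software as below. This is only a minimal model under stated assumptions, not the patented hardware: the names StackBounds, check_stack_access and access_stack are hypothetical, both bounds are treated as inclusive, and a Python exception stands in for whatever fault the real logic would raise.

```python
from dataclasses import dataclass


@dataclass
class StackBounds:
    """Hypothetical stand-in for the lower and upper stack bound registers."""
    lower: int  # first memory address: lower bound of the stack region
    upper: int  # second memory address: upper bound of the stack region


def check_stack_access(bounds: StackBounds, address: int) -> bool:
    """Return True when the accessed address lies within [lower, upper]."""
    return bounds.lower <= address <= bounds.upper


def access_stack(bounds: StackBounds, address: int) -> int:
    """Allow the access, or raise to model a fault on a suspected stack pivot."""
    if not check_stack_access(bounds, address):
        raise MemoryError(f"address {address:#x} is outside the stack bounds")
    return address
```

With bounds of 0x1000 and 0x1FFF, an access at 0x1800 passes the check, while an access at 0x2000 is rejected.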
-
Patent number: 9207919
Abstract: A system, method, and computer program product are provided for. The method includes the steps of executing a block of translated binary instructions by multiple threads and gathering profiling data during execution of the block of translated binary instructions. The multiple threads are then synchronized at a barrier instruction associated with the block of translated binary instructions and the block of translated binary instructions is replaced with optimized binary instructions, where the optimized binary instructions are produced based on the profiling data.
Type: Grant
Filed: January 17, 2014
Date of Patent: December 8, 2015
Assignee: NVIDIA Corporation
Inventor: Gregory Frederick Diamos
-
Patent number: 9208087
Abstract: The present invention discloses a data caching method and apparatus, and relates to the field of network applications. The method includes: receiving a first data request; writing target data in the first data request into an on-chip cache, and counting a storage time of the target data in the on-chip cache; enabling a delay expiry identifier of the target data when the storage time of the target data in the on-chip cache reaches a preset delay time; and releasing the target data when the delay expiry identifier of the target data is in an enabled state and processing of the target data is complete.
Type: Grant
Filed: November 28, 2012
Date of Patent: December 8, 2015
Assignee: Huawei Technologies Co., Ltd.
Inventors: Zixue Bi, Hua Wei, Chunlei Fan
-
Patent number: 9207995
Abstract: Various systems and processes may be used to speed up multi-threaded execution. In certain implementations, a system and process may include the ability to write results of a first group of execution units associated with a first register file into the first register file using a first write port of the first register file and write results of a second group of execution units associated with a second register file into the second register file using a first write port of the second register file. The system, apparatus, and process may also include the ability to connect, in a shared register file mode, results of the second group of execution units to a second write port of the first register file and connect, in a split register file mode, results of a part of the first group of execution units to the second write port of the first register file.
Type: Grant
Filed: June 27, 2011
Date of Patent: December 8, 2015
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Maarten J. Boersma, Markus Kaltenbach, Jens Leenstra, Tim Niggemeier, Philipp Oehler, Philipp Panitz
-
Patent number: 9195439
Abstract: Exemplary embodiments support multi-threaded subgraph execution control within a graphical modeling or graphical programming environment. In an embodiment, a subgraph may be identified as a subset of blocks within a graphical model, or graphical program, or both. A subgraph initiator may explicitly execute the subgraph while maintaining data dependencies within the subgraph. Explicit signatures may be defined for the subgraph initiator and the subgraph either graphically or textually. Execution control may be branched wherein the data dependencies within the subgraph are maintained. Execution control may be joined together wherein the data dependencies within the subgraph are maintained. Exemplary embodiments may allow subgraphs to execute on different threads within a graphical modeling or programming environment.
Type: Grant
Filed: August 27, 2013
Date of Patent: November 24, 2015
Assignee: The MathWorks, Inc.
Inventors: John Edward Ciolfi, Ramamurthy Mani, Qu Zhang
-
Patent number: 9164854
Abstract: Embodiments relate to thread sparing between cores in a processor. An aspect includes determining that a number of recovery attempts made by a first thread on the first core has exceeded a recovery attempt threshold, and sending a request to transfer the first thread. Another aspect includes selecting a second core from a plurality of cores to receive the first thread from the first core, wherein the second core is selected based on the second core having an idle thread. Another aspect includes transferring a last good architected state of the first thread from the first core to the second core. Another aspect includes loading the last good architected state of the first thread by the idle thread on the second core. Yet another aspect includes resuming execution of the first thread on the second core from the last good architected state of the first thread by the idle thread.
Type: Grant
Filed: March 4, 2013
Date of Patent: October 20, 2015
Assignee: International Business Machines Corporation
Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
-
Patent number: 9164764
Abstract: Methods and apparatuses for reducing power consumption of processor switch operations are disclosed. One or more embodiments may comprise specifying a subset of registers or state storage elements to be involved in a register or state storage operation, performing the register or state storage operation, and performing a switch operation. The embodiments may minimize the number of registers or state storage elements involved with the standby operation by specifying only the subset of registers or state storage elements, which may involve considerably fewer than the total number of registers or state storage elements of the processor. The switch operation may be a switch from one mode to another, such as a transition to or from a sleep mode, a context switch, or the execution of various types of instructions.
Type: Grant
Filed: May 23, 2014
Date of Patent: October 20, 2015
Assignee: Intel Corporation
Inventors: Ethan Schuchman, Hong Wang, Chris Weaver, Belliappa M. Kuttanna, Asit Mallick, Vivek K. De, Per Hammarlund
-
Patent number: 9164900
Abstract: A method for expanding preload capabilities of a memory to encompass a register file is provided. The method comprises predicting an address of a memory location containing data to be accessed by a first memory operation instruction that has not yet executed, prior to the first memory operation instruction executing, moving the data in the memory location to an unassigned register file entry, and causing a renaming register to assign the register file entry to an architectural register. Responsive to the renaming register assigning the register file entry to the architectural register, the method further comprises permitting a second instruction to execute using the data moved to the register file, wherein the second instruction is dependent on the first memory operation instruction.
Type: Grant
Filed: May 14, 2013
Date of Patent: October 20, 2015
Assignee: MARVELL INTERNATIONAL LTD.
Inventor: Kim Schuttenberg
-
Patent number: 9129062
Abstract: Systems and methods for instrumenting code are disclosed. The entry to a subroutine is trapped and the subroutine's return address is mutated to create an invalid instruction pointer. The mutated return address is stored in the architecture reserved space for the return address. An exception handler is executed that has been instrumented to handle the fault caused by the mutated return address such that the exit from the subroutine is instrumented.
Type: Grant
Filed: May 20, 2010
Date of Patent: September 8, 2015
Assignee: VMware, Inc.
Inventors: Keith Adams, Eli Daniel Collins
-
Patent number: 9122610
Abstract: The present invention is a microprocessor architecture for efficiently running an operating system. The improved architecture provides higher performance, improved operating system efficiency, enhanced security, and reduced power consumption.
Type: Grant
Filed: September 17, 2013
Date of Patent: September 1, 2015
Assignee: The United States of America as represented by the Secretary of the Army
Inventors: Patrick Jungwirth, Patrick La Fratta
-
Patent number: 9122649
Abstract: A method and computing system for handling a page fault while executing a cross-platform system call with a shared page cache. A first kernel running in a first computer system receives a request for a faulted page associated with raw data from a second kernel running in a second computer system. In response to the request for the faulted page: (i) a first validity flag is updated to denote that the faulted page is unavailable to the first computer system in a first copy of the shared page cache and (ii) the faulted page is transmitted to the second kernel for insertion of the faulted page in a second copy of the shared page cache and for updating a second validity flag to denote that the faulted page is available to the second computer system in the second copy of the shared page cache.
Type: Grant
Filed: October 8, 2013
Date of Patent: September 1, 2015
Assignee: International Business Machines Corporation
Inventor: Utz Bacher
-
Patent number: 9086721
Abstract: Methods, reservation stations and processors for allocating resources to a plurality of threads based on the extent to which the instructions associated with each of the threads are speculative. The method comprises receiving a speculation metric for each thread at a reservation station. Each speculation metric represents the extent to which the instructions associated with a particular thread are speculative. The more speculative an instruction, the more likely the instruction has been incorrectly predicted by a branch predictor. The reservation station then allocates functional unit resources (e.g. pipelines) to the threads based on the speculation metrics and selects a number of instructions from one or more of the threads based on the allocation. The selected instructions are then issued to the functional unit resources.
Type: Grant
Filed: January 17, 2014
Date of Patent: July 21, 2015
Assignee: Imagination Technologies Limited
Inventors: Hugh Jackson, Paul Rowland
-
Patent number: 9083641
Abstract: A network processing device having multiple processing engines capable of providing multi-context parallel processing is disclosed. The device includes a receiver and a packet processor, wherein the receiver is capable of receiving packets at a predefined packet flow rate. The packet processor, in one embodiment, includes multiple processing engines, wherein each processing engine is further configured to include multiple context processing components. The context processing components are used to provide multi-context parallel processing to increase throughput.
Type: Grant
Filed: November 4, 2011
Date of Patent: July 14, 2015
Assignee: Tellabs Operations, Inc.
Inventors: Naveen K. Jain, Venkata Rangavajjhala
-
Patent number: 9081928
Abstract: A computer-implemented method of automatically generating an embedded system on the basis of an original computer program, comprising analyzing the original computer program, comprising a step of compiling the original computer program into an executable to obtain data flow graphs with static data dependencies and a step of executing the executable using test data to provide dynamic data dependencies as communication patterns between load and store operations of the original computer program, and a step of transforming the original computer program into an intermediary computer program that exhibits multi-threaded parallelism with inter-thread communication, which comprises identifying at least one static and/or dynamic data dependency that crosses a thread boundary and converting said data dependency into a buffered communication channel with read/write access.
Type: Grant
Filed: June 1, 2010
Date of Patent: July 14, 2015
Assignee: Vector Fabrics, B.V.
Inventors: Jos Van Eijndhoven, Tommy Kamps, Maurice Kastelijn, Martijn Rutten, Paul Stravers
-
Patent number: 9064437
Abstract: Memory-based semaphores are described that are useful for synchronizing operations between different processing engines. In one example, operations include executing a context at a producer engine, the executing including updating a memory register, and sending a signal from the producer engine to a consumer engine that the memory register has been updated, the signal including a Context ID to identify a context to be executed by the consumer engine to update the register.
Type: Grant
Filed: December 7, 2012
Date of Patent: June 23, 2015
Assignee: Intel Corporation
Inventors: Hema Chand Nalluri, Aditya Navale
-
Patent number: 9063906
Abstract: Embodiments relate to thread sparing between cores in a processor. An aspect includes determining that a number of recovery attempts made by a first thread on the first core has exceeded a recovery attempt threshold, and sending a request to transfer the first thread. Another aspect includes selecting a second core from a plurality of cores to receive the first thread from the first core, wherein the second core is selected based on the second core having an idle thread. Another aspect includes transferring a last good architected state of the first thread from the first core to the second core. Another aspect includes loading the last good architected state of the first thread by the idle thread on the second core. Yet another aspect includes resuming execution of the first thread on the second core from the last good architected state of the first thread by the idle thread.
Type: Grant
Filed: September 27, 2012
Date of Patent: June 23, 2015
Assignee: International Business Machines Corporation
Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
-
Patent number: 9052835
Abstract: An abort function for storage devices sets a “poison bit” flag in the command to be deleted while the command resides on a submission queue prior to being fetched by the SSD controller. In response to the set “poison bit” flag, a storage device controller aborts execution of the I/O command and returns an abort successful status reply to the completion queue.
Type: Grant
Filed: December 20, 2013
Date of Patent: June 9, 2015
Assignee: HGST NETHERLANDS B.V.
Inventors: David Lee Darrington, Dylan Mark Dewitt, Adam Michael Espeseth, Lee Anton Sendelbach
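Illustration: the poison-bit abort flow in the abstract of 9052835 might be modeled roughly as follows. This is a sketch under assumptions, not the patented design: Command, abort_command, controller_fetch_and_run and the status strings are hypothetical names, and in-memory deques stand in for the submission and completion queues.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Tuple


@dataclass
class Command:
    """Hypothetical I/O command carrying a poison-bit flag."""
    command_id: int
    poisoned: bool = False


def abort_command(submission_queue: Deque[Command], command_id: int) -> bool:
    """Set the poison bit on a command that is still waiting in the queue."""
    for cmd in submission_queue:
        if cmd.command_id == command_id:
            cmd.poisoned = True
            return True
    return False  # already fetched by the controller, or never queued


def controller_fetch_and_run(
    submission_queue: Deque[Command],
    completion_queue: Deque[Tuple[int, str]],
) -> None:
    """Fetch each command; skip execution when its poison bit is set."""
    while submission_queue:
        cmd = submission_queue.popleft()
        status = "ABORT_SUCCESSFUL" if cmd.poisoned else "SUCCESS"
        completion_queue.append((cmd.command_id, status))
```

The point of the sketch is that the host never removes the command from the queue; it only flags it, and the controller reports the abort when it reaches the flagged entry.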
-
Patent number: 9032190
Abstract: A leading thread and a trailing thread are executed in parallel. Assuming that no transient fault occurs in each section, a system is speculatively executed in the section, with the leading thread and the trailing thread preferably being assigned to two different cores. At this time, the leading thread and the trailing thread are simultaneously executed, performing a buffering operation on a thread local area without performing a write operation on a shared memory. When the respective execution results of the two threads match each other, the content buffered to the thread local area is committed and written to the shared memory. When the respective execution results of the two threads do not match each other, the leading thread and the trailing thread are rolled back to a preceding commit point and re-executed.
Type: Grant
Filed: August 20, 2010
Date of Patent: May 12, 2015
Assignee: International Business Machines Corporation
Inventors: Toshihiko Koju, Takuya Nakaike
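Illustration: the compare, commit, and roll-back cycle in the abstract of 9032190 can be sketched as below. The sketch makes several assumptions: run_section_redundantly and its parameters are hypothetical names, the two executions run one after the other rather than on two cores, and each returned dictionary stands in for a thread-local write buffer.

```python
from typing import Callable, Dict

# A section reads a private view of shared memory and returns its buffered writes.
Section = Callable[[Dict[str, int]], Dict[str, int]]


def run_section_redundantly(
    shared: Dict[str, int],
    leading: Section,
    trailing: Section,
    max_retries: int = 3,
) -> bool:
    """Execute a section twice and commit only when both results match."""
    for _ in range(max_retries + 1):
        commit_point = dict(shared)  # state to fall back to on a mismatch
        leading_writes = leading(dict(commit_point))
        trailing_writes = trailing(dict(commit_point))
        if leading_writes == trailing_writes:
            shared.update(leading_writes)  # commit buffered writes
            return True
        # Mismatch: nothing was written to shared memory, so re-executing
        # from the same commit point is the roll-back.
    return False
```

Because neither execution writes to shared memory before the comparison, a mismatch needs no undo step; the section is simply run again from the unchanged commit point.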
-
Patent number: 9021239
Abstract: The disclosure relates to the implementation of multi-tasking on a digital signal processor. Blocking functions are arranged such that they do not make use of a processor's hardware stack. Respective function calls are replaced with a piece of inline assembly code, which instead performs a branch to the correct routine for carrying out said function. If a blocking condition of the blocking function is encountered, a task switch can be done to resume another task. Whilst the hardware stack is not used when a task switch might have to occur, mixed-up contents of the hardware stack among function calls performed by different tasks are avoided.
Type: Grant
Filed: April 7, 2006
Date of Patent: April 28, 2015
Assignee: NXP, B.V.
Inventor: Tomas Henriksson
-
Publication number: 20150113255
Abstract: A computing platform may include heterogeneous processors (e.g., a CPU and a GPU) to support sharing of virtual functions between such processors. In one embodiment, a CPU-side vtable pointer used to access a shared object from the CPU may be used to determine a GPU vtable if a GPU-side table exists. In another embodiment, a shared non-coherent region, which may not maintain data consistency, may be created within the shared virtual memory. The CPU-side and GPU-side data stored within the shared non-coherent region may have the same address as seen from the CPU and the GPU side. However, the contents of the CPU-side data may be different from that of the GPU-side data as shared virtual memory may not maintain coherency during run-time. In one embodiment, the vptr may be modified to point to the CPU vtable and GPU vtable stored in the shared virtual memory.
Type: Application
Filed: December 12, 2014
Publication date: April 23, 2015
Inventors: Shoumeng Yan, Xiaocheng Zhou, Hu Chen, Ying Gao, Sai Luo, Bratin Saha
-
Patent number: 9015720
Abstract: A system and method to optimize processor performance and minimize average thread latency by selectively loading a cache when a program state, resources required for execution of a program or the program itself change, is described. An embodiment of the invention supports a “cache priming program” that is selectively executed for a first thread/program/sub-routine of each process. Such a program is optimized for situations when instructions and other program data are not yet resident in cache(s), and/or whenever resources required for program execution or the program itself changes. By pre-loading the cache with two resources required for two instructions for only a first thread, average thread latency is reduced because the resources are already present in the cache.
Type: Grant
Filed: January 6, 2009
Date of Patent: April 21, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: Andrew Brown, Brian Emberling
-
Patent number: 9015450
Abstract: Embodiments of a processor architecture efficiently implement shadow registers in hardware. A register system in a processor includes a set of physical data registers coupled to register renaming logic. The register renaming logic stores data in and retrieves data from the set of physical registers when the processor is in a first processor state. The register renaming logic identifies ones of the set of physical registers that have a first operational state as a first group of registers and identifies the remaining ones of the set of physical registers as a second group of registers in response to an indication that the processor is to enter a second processor state from the first processor state. The register renaming logic stores data in and retrieves data from the second group of registers but not the first group of registers when the processor is in the second processor state.
Type: Grant
Filed: January 20, 2010
Date of Patent: April 21, 2015
Assignee: STMicroelectronics (Beijing) R&D Co. Ltd.
Inventors: Hong-Xia Sun, Peng Fei Zhu, Yong Qiang Wu
-
Patent number: 9009312
Abstract: Controlling access to a resource in a distributed computing system that includes nodes having a status field, a next field, a source data buffer, and that are characterized by a unique node identifier, where controlling access includes receiving a request for access to the resource implemented as an active message that includes the requesting node's unique node identifier, the value stored in the requesting node's source data buffer, and an instruction to perform a reduction operation with the value stored in the requesting node's source data buffer and the value stored in the receiving node's source data buffer; returning the requesting node's unique node identifier as a result of the reduction operation; and updating the status and next fields to identify the requesting node as a next node to have sole access to the resource.
Type: Grant
Filed: November 2, 2012
Date of Patent: April 14, 2015
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, James E. Carey, Matthew W. Markland, Philip J. Sanders
-
Publication number: 20150095627
Abstract: In response to detecting that one or more conditions are met, a checkpoint of a current state of a thread may be created. One or more incomplete instructions may be moved from a first level of a re-order buffer to a second level of the re-order buffer. Each incomplete instruction may be currently executing or awaiting execution.
Type: Application
Filed: September 27, 2013
Publication date: April 2, 2015
Inventors: Mark J. DECHENE, Srikanth T. SRINIVASAN, Matthew C. MERTEN, Tong LI, Christine E. WANG
-
Patent number: 8972705
Abstract: A constant data accessing system having a constant pool comprises a computer processor having a constant pool base register, a compiler having a constant pool handler, and an instruction set module having a constant pool instruction set unit. The constant pool base register is configured to store a value of constant pool base address of one or a plurality of subroutines which have constants to be accessed.
Type: Grant
Filed: November 16, 2011
Date of Patent: March 3, 2015
Assignee: Andes Technology Corporation
Inventors: Wei-Hao Chiao, Haw-Luen Tsai, Chen-Wei Chang, Hong-Men Su
-
Patent number: 8959317
Abstract: A microcomputer includes: a plurality of register lists having a plurality of register patterns, respectively, wherein each of the plurality of register patterns designates registers, data of which are to be saved in a data memory; an instruction fetch control circuit configured to fetch instruction code from an instruction memory in response to an interrupt request issued based on occurrence of an interrupt factor; and a register data saving control circuit configured to acquire one register pattern from one of the plurality of register lists in response to the interrupt request, and issue a microinstruction based on the acquired register pattern in response to the interrupt request. An instruction executing section is configured to execute the microinstruction prior to the fetched instruction code, to save the data of registers designated based on the acquired register pattern in the data memory.
Type: Grant
Filed: April 12, 2011
Date of Patent: February 17, 2015
Assignee: Renesas Electronics Corporation
Inventor: Hideki Matsuyama
-
Publication number: 20150039869
Abstract: In one embodiment, the present invention includes a method for receiving control in a kernel mode via a ring transition from a user thread during execution of an unbounded transactional memory (UTM) transaction, updating a state of a transaction status register (TSR) associated with the user thread and storing the TSR with a context of the user thread, and later restoring the context during a transition from the kernel mode to the user thread. In this way, the UTM transaction may continue on resumption of the user thread. Other embodiments are described and claimed.
Type: Application
Filed: August 1, 2013
Publication date: February 5, 2015
Inventors: Koichi Yamada, GAD SHEAFFER, JAN GRAY, LANDY WANG, MARTIN TAILLEFER, ARUN KISHAN, ALI-REZA ADL-TABATABAI, DAVID CALLAHAN
-
Patent number: 8949583
Abstract: Executing a set of one or more instructions is disclosed. A set of one or more register states is saved in a software data structure. The set of instructions is speculatively executed. At least one store made to a memory location during the speculative execution is not committed until the speculative execution is successfully completed. If an abort indication is received, the state of one or more registers is restored.
Type: Grant
Filed: April 30, 2007
Date of Patent: February 3, 2015
Assignee: Azul Systems, Inc.
Inventors: Gil Tene, Ivan Posva, Michael A. Wolf, Daniel Dwight Grove, Tom Kraljevic
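Illustration: the save, speculate, and commit-or-restore sequence in the abstract of 8949583 can be sketched as follows. This is a toy model under assumptions rather than the patented mechanism: Machine, speculate and AbortSpeculation are hypothetical names, a dictionary copy plays the role of the software data structure holding register state, and a second dictionary buffers uncommitted stores.

```python
from typing import Callable, Dict


class AbortSpeculation(Exception):
    """Hypothetical abort indication raised during speculative execution."""


class Machine:
    """Toy machine with a register file and a memory."""

    def __init__(self) -> None:
        self.registers: Dict[str, int] = {"r0": 0, "r1": 0}
        self.memory: Dict[int, int] = {}


Body = Callable[[Dict[str, int], Dict[int, int]], None]


def speculate(machine: Machine, body: Body) -> bool:
    """Run body speculatively; commit its stores only if it completes."""
    saved_registers = dict(machine.registers)  # checkpoint in software
    store_buffer: Dict[int, int] = {}  # stores held back until success
    try:
        body(machine.registers, store_buffer)
    except AbortSpeculation:
        machine.registers.clear()
        machine.registers.update(saved_registers)  # restore register state
        return False
    machine.memory.update(store_buffer)  # commit buffered stores
    return True
```

On an abort, the buffered stores are simply discarded, so memory never observes a partial result.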
-
Patent number: 8949475
Abstract: A method includes pre-configuring a hardware-implemented front-end of a storage device with multiple contexts of respective connections conducted between one or more hosts and the storage device. Storage commands, which are received in the storage device and are associated with the connections having the pre-configured contexts, are executed in a memory of the storage device using the hardware-implemented front-end. Upon identifying a storage command associated with a context that is not pre-configured in the hardware-implemented front-end, software of the storage device is triggered to configure the context in the hardware-implemented front-end, and the storage command is then executed using the hardware-implemented front-end in accordance with the context configured by the software.
Type: Grant
Filed: March 18, 2014
Date of Patent: February 3, 2015
Assignee: Apple Inc.
Inventor: Arie Peled
-
Publication number: 20150026441
Abstract: A method and system of inserting marker values used to correlate trace data as between processor cores. At least some of the illustrative embodiments are integrated circuit devices comprising a first processor core, a first data collection portion coupled to the first processor core and configured to gather data comprising addresses of instructions executed by the first processor core, a second processor core communicatively coupled to the first processor core, and a second data collection portion coupled to the first processor core and configured to gather data comprising addresses of instructions executed by the second processor core. The integrated circuit device is configured to insert marker values into the data of the first and second processor cores which allow correlation of the data such that contemporaneously executed instructions are identifiable.
Type: Application
Filed: October 3, 2014
Publication date: January 22, 2015
Applicant: Texas Instruments Incorporated
Inventors: Oliver P. Sohm, Brian Cruickshank, Manisha Agarwala, Gary L. Swoboda
-
Publication number: 20150006864Abstract: The present embodiments provide a system that facilitates lazy register window fills in a processor. During program execution, when the system encounters a restore instruction for a register window, the system determines if the restore instruction causes an underflow condition that requires the register window to be filled from a stack in memory. If so, the system completes the restore instruction by updating state information for the register window to indicate that the restore instruction is complete without actually filling the individual registers that comprise the register window from the stack. During subsequent program execution, the system lazily fills registers in the register window from the stack as the registers are accessed by the program.Type: ApplicationFiled: July 1, 2013Publication date: January 1, 2015Inventor: Yuan C. Chou
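The lazy fill described in this abstract can be illustrated with a minimal sketch. Everything named here (the LazyRegisterWindow class, the per-register valid bits, the fills counter, and the rule that a write skips the fill) is a hypothetical modelling choice for illustration, not taken from the application.

```python
class LazyRegisterWindow:
    """Toy model of one register window backed by a stack frame in memory."""

    def __init__(self, size=8):
        self.regs = [0] * size
        self.valid = [True] * size   # assumed per-register "already filled" bits
        self.backing = None          # stack frame holding the spilled values
        self.fills = 0               # number of registers actually loaded

    def restore_with_underflow(self, stack_frame):
        # Complete the restore by updating state only; no register is loaded here.
        self.backing = list(stack_frame)
        self.valid = [False] * len(self.regs)

    def read(self, index):
        # Fill a register from the stack the first time the program reads it.
        if not self.valid[index]:
            self.regs[index] = self.backing[index]
            self.valid[index] = True
            self.fills += 1
        return self.regs[index]

    def write(self, index, value):
        # Assumption: a write replaces the whole register, so no fill is needed.
        self.regs[index] = value
        self.valid[index] = True


window = LazyRegisterWindow(size=8)
window.restore_with_underflow([10, 11, 12, 13, 14, 15, 16, 17])
first = window.read(2)        # triggers one fill from the stack frame
again = window.read(2)        # already valid, no second fill
window.write(5, 99)           # overwritten before ever being filled
after_write = window.read(5)
```

In this sketch only one of the eight registers is ever loaded from the stack, which is the saving the abstract attributes to filling registers as they are accessed.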
-
Patent number: 8898440Abstract: A request control device, request control method, and a multiprocessor cooperation architecture. The request control device is connected to a request storage module and includes a comparing means and an identifier setting means. The comparing means is configured to determine if an incoming first queue unit corresponds to the same message as a queue unit that already exists in the request storage module. The identifier setting means is configured to set a save identifier of the queue unit that already exists in the request storage module to indicate not to save a state associated with the message if the first queue unit corresponds to the same message as the existing queue unit. According to the technical solution of the invention, memory accesses caused by saving and loading the states are reduced, thereby increasing the processing speed of the processor.Type: GrantFiled: August 18, 2010Date of Patent: November 25, 2014Assignee: International Business Machines CorporationInventors: Xiao Tao Chang, Wei Liu, Kun Wang, Hong Bo Zeng
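A rough sketch of the comparison and save-identifier step described in this abstract follows. The QueueUnit type, the submit function, and the choice to compare the incoming unit against every unit still in the store are assumptions made for illustration; the abstract does not say which existing units are compared.

```python
from dataclasses import dataclass


@dataclass
class QueueUnit:
    message_id: int
    save_state: bool = True  # hypothetical "save identifier": True means save state


def submit(request_store, incoming):
    """Add a unit to the store, clearing the save flag on matching earlier units."""
    for unit in request_store:
        # Comparing step: does an existing unit belong to the same message?
        if unit.message_id == incoming.message_id:
            # Identifier-setting step: mark the existing unit "do not save state".
            unit.save_state = False
    request_store.append(incoming)


store = []
submit(store, QueueUnit(message_id=7))
submit(store, QueueUnit(message_id=9))
submit(store, QueueUnit(message_id=7))   # same message as the first unit
flags = [(u.message_id, u.save_state) for u in store]
```

After the third call, the first unit for message 7 no longer requests a state save, while the newest unit for message 7 and the unit for message 9 still do.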
-
Patent number: 8898438Abstract: The invention provides a processor comprising an execution unit for executing multiple threads, each thread comprising a sequence of instructions and each thread being designated to handle activity from at least one specified source. The processor also comprises a thread scheduler for scheduling a plurality of threads to be executed by the execution unit, said scheduling being based on the respective activity handled by the threads; and a plurality of sets of registers connected to the execution unit. Each set of registers is arranged to store information representing a respective one of the plurality of threads, at least a part of the information being accessible by the execution unit for use in executing the respective thread when scheduled.Type: GrantFiled: March 14, 2007Date of Patent: November 25, 2014Assignee: XMOS Ltd.Inventor: Michael David May
-
Patent number: 8893094Abstract: Hardware compilation and/or translation with fault detection and roll back functionality are disclosed. Compilation and/or translation logic receives programs encoded in one language, and encodes the programs into a second language including instructions to support processor features not encoded into the original language encoding of the programs. In one embodiment, an execution unit executes instructions of the second language including an operation-check instruction to perform a first operation and record the first operation result for a comparison, and an operation-test instruction to perform a second operation and a fault detection operation by comparing the second operation result to the recorded first operation result.Type: GrantFiled: December 30, 2011Date of Patent: November 18, 2014Assignee: Intel CorporationInventors: Nicholas Cheng Hwa Chee, Tryggve Fossum, William C. Hasenplaugh
-
Publication number: 20140325193Abstract: Techniques for dynamic instrumentation are provided. A method for instrumentation preparation may include obtaining address data of an original instruction in an original instruction stream, obtaining kernel mode data comprising a kernel breakpoint handler, obtaining user mode data comprising a user breakpoint handler, allocating a page of a process address space, creating a trampoline, associating the trampoline with a breakpoint instruction, and replacing the original instruction with the breakpoint instruction. A method for instrumentation may include detecting the breakpoint instruction, calling the kernel breakpoint handler, modifying an instruction pointer via the kernel breakpoint handler such that the instruction pointer points to the trampoline, and executing the trampoline. The system for instrumentation may include a breakpoint setup module and a breakpoint execution module for respectively setting up and completing instrumentation involving the trampoline.Type: ApplicationFiled: July 3, 2014Publication date: October 30, 2014Inventors: BALBIR SINGH, MANEESH SONI
-
Publication number: 20140325192Abstract: A method and a device that includes a set of multiple pipeline stages, wherein the set of multiple pipeline stages is arranged to execute a first thread of instructions; multiple memristor based registers that are arranged to store a state of another thread of instructions that differs from the first thread of instructions; and a control circuit that is arranged to control a thread switch between the first thread of instructions and the other thread of instructions by controlling a storage of a state of the first thread of instructions at the multiple memristor based registers and by controlling a provision of the state of the other thread of instructions by the set of multiple pipeline stages; wherein the set of multiple pipeline stages is arranged to execute the other thread of instructions upon a reception of the state of the other thread of instructions.Type: ApplicationFiled: March 19, 2014Publication date: October 30, 2014Applicant: Technion Research and Development Foundation LTD.Inventors: Avinoam Kolodny, Uri Weiser, Shahar Kvatinsky
-
Patent number: 8868887Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.Type: GrantFiled: November 5, 2004Date of Patent: October 21, 2014Assignee: Intel CorporationInventors: Hong Wang, Per Hammarlund, Xiang Zou, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Piyush Desai
-
Patent number: 8856261Abstract: A system, method and computer program product for supporting system initiated checkpoints in parallel computing systems. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity.Type: GrantFiled: December 28, 2012Date of Patent: October 7, 2014Assignee: International Business Machines CorporationInventors: Dong Chen, Philip Heidelberger
-
Patent number: 8850169Abstract: A system, apparatus and method for multithread handling on a multithread processing device are described herein. Embodiments of the present invention provide a multithread processing device for multithread handling including a plurality of registers operatively coupled to an instruction dispatch block, including thread-control registers for selectively disabling threads. In various embodiments, the multithread processing device may include a thread-operation register for selectively providing a lock to a first thread to prevent a second thread from disabling the first thread while the first thread has the lock. In still further embodiments, the multithread processing device may be configured to atomically disable and release a lock held by a thread. Other embodiments may be described and claimed.Type: GrantFiled: July 1, 2013Date of Patent: September 30, 2014Assignee: Marvell International Ltd.Inventors: Jack Kang, Hsi-Cheng Chu, Yu-Chi Chuang
-
Patent number: 8850168Abstract: A processor apparatus according to the present invention is a processor apparatus which shares hardware resources between a plurality of processors, and includes: a first determination unit which determines whether or not a register in each of the hardware resources holds extension context data of a program that is currently executed; a second determination unit which determines to which processor the extension context data in the hardware resource corresponds; a first transfer unit which saves and restores the extension context data between programs in the processor; and a second transfer unit which saves and restores the extension context data between programs between different processors.Type: GrantFiled: August 23, 2011Date of Patent: September 30, 2014Assignee: Panasonic CorporationInventors: Takao Yamamoto, Shinji Ozaki, Masahide Kakeda, Masaitsu Nakajima
-
Patent number: 8850436Abstract: One embodiment of the present invention sets forth a technique for performing a method for synchronizing divergent executing threads. The method includes receiving a plurality of instructions that includes at least one set-synchronization instruction and at least one instruction that includes a synchronization command, and determining an active mask that indicates which threads in a plurality of threads are active and which threads in the plurality of threads are disabled. For each instruction included in the plurality of instructions, the instruction is transmitted to each of the active threads included in the plurality of threads. If the instruction is a set-synchronization instruction, then a synchronization token, the active mask and the synchronization point are each pushed onto a stack.Type: GrantFiled: September 28, 2010Date of Patent: September 30, 2014Assignee: NVIDIA CorporationInventors: Brian Fahs, Ming Y. Siu, Robert Steven Glanville
-
Publication number: 20140281437Abstract: Robust system call and system return instructions are executed by a processor to transfer control between a requester and an operating system kernel. The processor includes execution circuitry and registers that store pointers to data structures in memory. The execution circuitry receives a system call instruction from a requester to transfer control from a first privilege level of the requester to a second privilege level of an operating system kernel. In response, the execution circuitry swaps the data structures that are pointed to by the registers between the requester and the operating system kernel in one atomic transition.Type: ApplicationFiled: March 15, 2013Publication date: September 18, 2014Inventors: Baiju V. Patel, James B. Crossland, Atul A. Khare, Toby Opferman
-
Patent number: 8832475Abstract: A system includes a context file to store multiple contexts corresponding to different power modes of an electronic system, and a domain control device to generate control signals based, at least in part, on a context from the context file. The electronic system is configured to transition to a power mode corresponding to the context responsive to the control signals.Type: GrantFiled: May 10, 2010Date of Patent: September 9, 2014Assignee: Cypress Semiconductor CorporationInventor: Michael Sheets