Context Preserving (e.g., Context Swapping, Checkpointing, Register Windowing Patents (Class 712/228)
  • Patent number: 10380034
    Abstract: Improving operation of a processing unit to access data within a cache system. A first fetch request and one or more subsequent fetch requests are accessed in an instruction stream. An address of data sought by the first fetch requested is obtained. At least a portion of the address of data sought by the first fetch request in inserted in each of the one or more subsequent fetch requests. The portion of the address inserted in each of the one or more subsequent fetch requests is utilized to retrieve the data sought by the first fetch request first in order from the cache system.
    Type: Grant
    Filed: July 14, 2017
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Markus Kaltenbach, Ulrich Mayer, Siegmund Schlechter, Maxim Scholl
  • Patent number: 10354706
    Abstract: For an integrated circuit (IC) that is designed to execute user defined operations after initialization, a sequencing circuitry in the IC that delays the start of the user design execution until a set of initial condition has been computed and propagated is provided. The sequencing holds the first group of circuits at an initial state while a second group of circuits computes and propagates a set of initial conditions based at least partly on the initial state of the first group of circuits. The circuits in the first group when being held disregard their inputs and do not change their outputs. The first group of circuits is released from its initial state after the second group of circuits has completed computation and propagation of the set of initial conditions. The circuits in the first group when released are freed to store or clock-in new inputs and produce new outputs in order to perform the user defined operations in conjunction with the second group of circuits.
    Type: Grant
    Filed: November 23, 2015
    Date of Patent: July 16, 2019
    Assignee: Altera Corporation
    Inventors: Christopher D. Ebeling, Trevis Chandler
  • Patent number: 10325844
    Abstract: A computer-implemented method includes, in a code transformation system, identifying save-to-return code instructions, function call code instructions, comparison code instructions, and exceptional code instructions. The function call code instructions are associated with the save-to-return code instructions. The comparison code instructions are associated with the save-to-return code instructions. The exceptional code instructions are associated with the comparison code instructions. A predefined proximity range based on a predefined proximity value as well as a proximity eligibility indicator are determined. The proximity eligibility indicator denotes whether the save-to-return code instructions and the comparison code instructions are within the predefined proximity range.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: June 18, 2019
    Assignee: International Business Machines Corporation
    Inventors: Iain A. Ireland, Allan H. Kielstra, Muntasir A. Mallick
  • Patent number: 10295980
    Abstract: An apparatus, method for operating an automation device which includes a processor for directly executing function modules, where a function module selected as a component of an automation solution via a development environment, is converted into a code block via the development environment, where the code block includes a type identifier corresponding to the type of the particular function module, where inputs and outputs of the particular function module are mapped to simultaneously usable processor registers in the code block, a plurality of code blocks is processed by reading-in a particular code block by the processor and subsequently executing the code block to execute the automation solution, and where execution of the code block includes selecting a function unit from a plurality of function units based on the type identifier of the code block and activation of the selected function unit with the processor registers specified in the code block.
    Type: Grant
    Filed: March 26, 2015
    Date of Patent: May 21, 2019
    Assignee: Siemens Aktiengesellschaft
    Inventor: Eberhard Schlarb
  • Patent number: 10275250
    Abstract: An apparatus comprises processing circuitry for executing instructions of two or more threads of processing, hardware registers to store context data for the two or more threads concurrently, and commit circuitry to commit results of executed instructions of the threads, where for each thread the commit circuitry commits the instructions of that thread in program order. At least one defer buffer is provided to buffer at least one blocked instruction for which execution by the processing circuitry is complete but execution of an earlier instruction of the same thread in the program order is incomplete. This can help to resolve inter-thread blocking and hence improve performance.
    Type: Grant
    Filed: March 6, 2017
    Date of Patent: April 30, 2019
    Assignee: ARM Limited
    Inventors: Jose Alberto Joao, Ziqiang Huang, Alejandro Rico Carro
  • Patent number: 10248908
    Abstract: Methods, systems, and apparatus for accessing a N-dimensional tensor are described. In some implementations, a method includes, for each of one or more first iterations of a first nested loop, performing iterations of a second nested loop that is nested within the first nested loop until a first loop bound for the second nested loop is reached. A number of iterations of the second nested loop for the one or more first iterations of the first nested loop is limited by the first loop bound in response to the second nested loop having a total number of iterations that exceeds a value of a hardware property of the computing system. After a penultimate iteration of the first nested loop has completed, one or more iterations of the second nested loop are performed for a final iteration of the first nested loop until an alternative loop bound is reached.
    Type: Grant
    Filed: June 19, 2017
    Date of Patent: April 2, 2019
    Assignee: Google LLC
    Inventors: Olivier Temam, Harshit Khaitan, Ravi Narayanaswami, Dong Hyuk Woo
  • Patent number: 10180789
    Abstract: Systems, apparatuses, and methods for implementing software control of state sets are disclosed. In one embodiment, a processor includes at least an execution unit and a plurality of state registers. The processor is configured to detect a command to allocate a first state set for storing a first state, wherein the command is generated by software, and wherein the first state specifies values for the plurality of state registers. The command is executed on the execution unit while the processor is in a second state, wherein the second state is different from the first state. The first state set of the processor is allocated with the first state responsive to executing the command on the execution unit. The processor is configured to allocate the first state set for the first state prior to the processor entering the first state.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: January 15, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Rex Eldon McCrary, Michael J. Mantor, Alexander Fuad Ashkar, Harry J. Wise
  • Patent number: 10176002
    Abstract: Methods and apparatuses for performing a quiesce operation in a multithread environment is provided. A processor receives a first thread quiesce request from a first thread executing on the processor. A processor sends a first processor quiesce request to a system controller to initiate a quiesce operation. A processor performs one or more operations of the first thread based, at least in part, on receiving a response from the system controller.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: January 8, 2019
    Assignee: International Business Machines Corporation
    Inventors: Michael Fee, Ute Gaertner, Lisa C. Heller, Thomas Koehler, Frank Lehnert, Jennifer A. Navarro
  • Patent number: 10168941
    Abstract: Methods, systems, and computer program products for historical state snapshot construction over temporally evolving data are provided herein. A computer-implemented method includes classifying each of multiple temporally evolving data entities into one of multiple categories based on one or more parameters; partitioning the multiple temporally evolving data entities into multiple partitions based at least on (i) said classifying and (ii) the update frequency of each of the multiple temporally evolving data entities; implementing multiple checkpoints at a distinct temporal interval for each of the multiple partitions; and creating a snapshot of the multiple temporally evolving data entities at a selected past point of time (i) based on said implementing and (ii) in response to a query pertaining to a historical state of one or more of the multiple temporally evolving data entities.
    Type: Grant
    Filed: February 19, 2016
    Date of Patent: January 1, 2019
    Assignee: International Business Machines Corporation
    Inventors: Srikanta B. Jagannath, Sriram Lakshminarasimhan, Sameep Mehta, Animesh Nandi, Narendran Sachindran
  • Patent number: 10162687
    Abstract: A processor of an aspect includes at least one lower processing capability and lower power consumption physical compute element and at least one higher processing capability and higher power consumption physical compute element. Migration performance benefit evaluation logic is to evaluate a performance benefit of a migration of a workload from the at least one lower processing capability compute element to the at least one higher processing capability compute element, and to determine whether or not to allow the migration based on the evaluated performance benefit. Available energy and thermal budget evaluation logic is to evaluate available energy and thermal budgets and to determine to allow the migration if the migration fits within the available energy and thermal budgets. Workload migration logic is to perform the migration when allowed by both the migration performance benefit evaluation logic and the available energy and thermal budget evaluation logic.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: December 25, 2018
    Assignee: Intel Corporation
    Inventors: Eugene Gorbatov, Alon Naveh, Inder M. Sodhi, Ganapati N. Srinivasa, Eliezer Weissmann, Guarav Khanna, Mishali Naik, Russell J. Fenger, Andrew D. Henroid, Dheeraj R. Subbareddy, David A. Koufaty, Paolo Narvaez
  • Patent number: 10146548
    Abstract: A method for populating a source view data structure by using register template snapshots. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the blocks of instructions; populating a source view data structure, wherein the source view data structure stores sources corresponding to the instruction blocks as recorded by the plurality of register templates; and determining which of the plurality of instruction blocks are ready for dispatch by using the populated source view data structure.
    Type: Grant
    Filed: January 17, 2017
    Date of Patent: December 4, 2018
    Assignee: Intel Corporation
    Inventor: Mohammad Abdallah
  • Patent number: 10127076
    Abstract: A method includes performing one or more operations as requested by a thread executing on a processor, the thread having a thread context; receiving a park request from the thread, the park request received following a request from the thread for a low latency resource, wherein the cache response time is less than or equal to a resource response threshold so as to allow the thread context to be stored and retrieved from the cache in less time than the portion of time it takes to complete the request for the low latency resource; storing the thread context in the cache; detecting that the resume condition has occurred; retrieving the thread context from the cache; and resuming execution of the thread.
    Type: Grant
    Filed: June 6, 2016
    Date of Patent: November 13, 2018
    Assignee: Google LLC
    Inventors: Luiz Andre Barroso, James Laudon, Michael R. Marty
  • Patent number: 10115222
    Abstract: A graphics processing unit comprises a programmable execution unit executing graphics processing programs for execution threads to perform graphics processing operations, a local register memory comprising one or more registers, where registers of the register memory are assignable to store data associated with an individual execution thread that is being executed by the execution unit, and where the register(s) assigned to an individual execution thread are accessible only to that associated individual execution thread, and a further local memory that is operable to store data for use in common by plural execution threads, where the data stored in the further local memory is accessible to plural execution threads as they execute. The programmable execution unit is operable to selectively store output data for an execution thread in a register(s) of the local register memory assigned to the execution thread, and the further local memory.
    Type: Grant
    Filed: January 9, 2017
    Date of Patent: October 30, 2018
    Assignee: Arm Limited
    Inventors: Sean Tristram LeGuay Ellis, Thomas James Cooksey, Robert Martin Elliott
  • Patent number: 10095525
    Abstract: A processor may include a reorder buffer, reservation stations, and execution units. The reorder buffer may be a circular buffer with a head pointer and a tail pointer, configured to assign indexes to instructions. Reservation stations may be configured to host instructions with the assigned indexes, while waiting to be issued to the execution units. Responsive to exception event, reservation stations may be configured to flush instructions that are younger, in program order, than the instruction executed with exception. Execution units may provide the reorder buffer index EX of the instruction executed with exception. The reorder buffer may provide the reorder buffer index TP stored in the tail pointer. Reservation stations may be configured to flush instructions with assigned indexes in the wrapped-around increasing interval from the index EX to the index TP.
    Type: Grant
    Filed: November 24, 2017
    Date of Patent: October 9, 2018
    Inventor: Dejan Spasov
  • Patent number: 10067782
    Abstract: Various aspects are disclosed herein for attenuating spin waiting in a virtual machine environment comprising a plurality of virtual machines and virtual processors. Selected virtual processors can be given time slice extensions in order to prevent such virtual processors from becoming de-scheduled (and hence causing other virtual processors to have to spin wait). Selected virtual processors can also be expressly scheduled so that they can be given higher priority to resources, resulting in reduced spin waits for other virtual processors waiting on such selected virtual processors. Finally, various spin wait detection techniques can be incorporated into the time slice extension and express scheduling mechanisms, in order to identify potential and existing spin waiting scenarios.
    Type: Grant
    Filed: November 18, 2015
    Date of Patent: September 4, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yau Ning Chin, John Te-Jui Sheu, Arun Kishan, Thomas Fahrig, Rene Antonio Vega
  • Patent number: 10025608
    Abstract: Methods and apparatuses for performing a quiesce operation in a multithread environment is provided. A processor receives a first thread quiesce request from a first thread executing on the processor. A processor sends a first processor quiesce request to a system controller to initiate a quiesce operation. A processor performs one or more operations of the first thread based, at least in part, on receiving a response from the system controller.
    Type: Grant
    Filed: November 17, 2014
    Date of Patent: July 17, 2018
    Assignee: International Business Machines Corporation
    Inventors: Michael Fee, Ute Gaertner, Lisa C. Heller, Thomas Koehler, Frank Lehnert, Jennifer A. Navarro
  • Patent number: 10019283
    Abstract: A processing device includes a first memory that includes a context buffer. The processing device also includes a processor core to execute threads based on context information stored in registers of the processor core and a memory controller to selectively move a subset of the context information between the context buffer and the registers based on one or more latencies of the threads.
    Type: Grant
    Filed: June 22, 2015
    Date of Patent: July 10, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Dmitri Yudanov, Sergey Blagodurov, Arkaprava Basu, Sooraj Puthoor, Joseph L. Greathouse
  • Patent number: 9977968
    Abstract: A method and system for identifying content relevance comprises acquiring video data, mapping the acquired video data to a feature space to obtain a feature representation of the video data, assigning the acquired video data to at least one action class based on the feature representation of the video data, and determining a relevance of the acquired video data.
    Type: Grant
    Filed: March 4, 2016
    Date of Patent: May 22, 2018
    Assignee: Xerox Corporation
    Inventors: Edgar A. Bernal, Qun Li, Yun Zhang, Jayant Kumar, Raja Bala
  • Patent number: 9959238
    Abstract: Message passing is provided among a plurality of interdependent parallel processes using a shared memory. Inter-process communication among a plurality of interdependent processes executing on a plurality of compute nodes is performed by obtaining a message from a first process for a second process; and storing the message in a memory location of a Peripheral Component Interconnect Express (PCIE)-linked storage device, wherein the second process reads the memory location to obtain the message. The message is optionally persistently stored in the PCIE-linked storage device for an asynchronous checkpoint until the message is no longer required for an asynchronous restart.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: May 1, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: John M. Bent, Sorin Faibish, James M. Pedone, Jr.
  • Patent number: 9946547
    Abstract: A load/store unit for a processor, and applications thereof. In an embodiment, the load/store unit includes a load/store queue configured to store information and data associated with a particular class of instructions. Data stored in the load/store queue can be bypassed to dependent instructions. When an instruction belonging to the particular class of instructions graduates and the instruction is associated with a cache miss, control logic causes a pointer to be stored in a load/store graduation buffer that points to an entry in the load/store queue associated with the instruction. The load/store graduation buffer ensures that graduated instructions access a shared resource of the load/store unit in program order.
    Type: Grant
    Filed: September 29, 2006
    Date of Patent: April 17, 2018
    Assignee: ARM Finance Overseas Limited
    Inventors: Meng-Bing Yu, Era K. Nangia, Michael Ni
  • Patent number: 9940168
    Abstract: Methods and systems that reduce the number of instance of a shared resource needed for a processor to perform an operation and/or execute a process without impacting function are provided. a method of processing in a processor is provided. Aspects include determining that an operation to be performed by the processor will require the use of a shared resource. A command can be issued to cause a second operation to not use the shared resources N cycles later. The shared resource can then be used for a first aspect of the operation at cycle X and then used for a second aspect of the operation at cycle X+N. The second operation may be rescheduled according to embodiments.
    Type: Grant
    Filed: January 12, 2017
    Date of Patent: April 10, 2018
    Assignee: MIPS Tech, LLC
    Inventor: Debasish Chandra
  • Patent number: 9940139
    Abstract: A split level history buffer in a central processing unit is provided. A history buffer is split into a first portion and a second portion. An instruction fetch unit fetches and tags instructions. A register file stores tagged instructions. An execution unit generates results for tagged instructions. A first instruction is fetched, tagged, and stored in an entry of the register file. A second instruction is fetched and tagged, and then evicts the first instruction from the register file, such that the second instruction is stored in the entry of the register file. Subsequently, the first instruction is stored in an entry in the first portion of the history buffer. After a result for the first instruction is generated, the first instruction is moved from the first portion of the history buffer to the second portion of the history buffer.
    Type: Grant
    Filed: September 20, 2016
    Date of Patent: April 10, 2018
    Assignee: Internaitonal Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen, David R. Terry
  • Patent number: 9891925
    Abstract: An allocation system and a method for allocating an architectural register in a system having one or more mapping tables. When the allocation system detects a plurality of available architectural registers to an allocation target virtual register, it identifies adjacent instructions to all instructions having the allocation target virtual register in its destination operand, counts the number of uses of the architectural register appearing in the destination operand for each architectural register, summing the number of uses for each architectural register for each entry group in one or more mapping tables having the same assignment rule for correlations with the architectural registers, calculating the total of the numbers of uses of entries for each entry group, and allocating the architectural register to the allocation target virtual register such that the total of the numbers of uses of entries for each entry group approaches uniformity.
    Type: Grant
    Filed: October 5, 2016
    Date of Patent: February 13, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Kazuaki Ishizaki
  • Patent number: 9886077
    Abstract: Various systems, processes, and products may be used to manage a processor. In particular implementations, managing a processor may include the ability to determine whether a thread is pausing for a short period of time and place a wait event for the thread in a queue based on a short thread pause occurring. Managing a processor may also include the ability to activate a delay thread that determines whether a wait time associated with the pause has expired and remove the wait event from the queue based on the wait time having expired.
    Type: Grant
    Filed: May 9, 2016
    Date of Patent: February 6, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bernard A. King-Smith, Bret R. Olszewski, Stephen Rees, Basu Vaidyanathan
  • Patent number: 9886416
    Abstract: A matrix of execution blocks form a set of rows and columns. The rows support parallel execution of instructions and the columns support execution of dependent instructions. The matrix of execution blocks process a single block of instructions specifying parallel and dependent instructions.
    Type: Grant
    Filed: June 8, 2015
    Date of Patent: February 6, 2018
    Assignee: INTEL CORPORATION
    Inventor: Mohammad A. Abdallah
  • Patent number: 9875204
    Abstract: A processing node of a server rack includes a processor to generate processing node management requests and to process responses to the node management requests, and a communication module to receive the processing node management requests, to transmit over a communication link to a management controller of the server rack external to the processing node a processing node management request, to receive over the communication link from the management controller processing node management information, and to transmit the processing node management information to the processor.
    Type: Grant
    Filed: May 17, 2013
    Date of Patent: January 23, 2018
    Assignee: Dell Products, LP
    Inventors: Robert W. Hormuth, Robert L. Winter, Shawn J. Dube, Bradley J. Booth, Geng Lin, Jimmy Pike
  • Patent number: 9864708
    Abstract: In a computer system operable at multiple hierarchical privilege levels, a “wait-for-event” (WFE) communication channel between components operating at different privilege levels is established. Initially, a central processing unit (CPU) is configured to “trap” WFE instructions issued by a client, such as an operating system, operating at one privilege level to an agent, such as a hypervisor, operating at a more privileged level. After storing a predefined special sequence in a storage component (e.g., a register), the client executes a WFE instruction. As part of trapping the WFE instruction, the agent reads and interprets the special sequence from the storage component and may respond to the special sequence by storing another special sequence in a storage component that is accessible to the client. Advantageously, a client may leverage this WFE communication channel to safely and reliably detect whether an agent is present.
    Type: Grant
    Filed: December 16, 2014
    Date of Patent: January 9, 2018
    Assignee: VMware, Inc.
    Inventors: Andrei Warkentin, Harvey Tuch
  • Patent number: 9858151
    Abstract: A computer-implemented method according to one embodiment includes establishing a predetermined checkpoint and storing duplicate read data in association with the predetermined checkpoint during a running of an application that is processing at least one data set, identifying a failure of the application, restarting the application in response to the failure, and enabling a replay of the processing of the at least one data set by the restarted application, utilizing the predetermined checkpoint and the duplicate read data.
    Type: Grant
    Filed: October 3, 2016
    Date of Patent: January 2, 2018
    Assignee: International Business Machines Corporation
    Inventors: Donna N. Dillenberger, David C. Frank, Terri A. Menendez, Gary S. Puchkoff, Wayne E. Rhoten
  • Patent number: 9830187
    Abstract: In one embodiment, an application programming interface (API) is defined that enables a thread scheduler to communicate thread information to the CPU performance controller when dispatching a thread to a processor or processor core. When dispatching a thread, the scheduler may communicate thread information including thread state information, a general “importance” of the thread as defined by a priority level and/or quality of service (QoS) classification, a measurement of the scheduler dispatch latency for the thread, or architectural information regarding the instructions within the thread, such as whether the thread is contains 64-bit or 32-bit instructions. The performance controller can use the information provided by the scheduler to make performance control decisions for the processor cores within the system.
    Type: Grant
    Filed: June 5, 2015
    Date of Patent: November 28, 2017
    Assignee: Apple Inc.
    Inventors: Russell A. Blaine, Daniel A. Chimene, Shantonu Sen, John Dorsey, Bryan Hinch, Cyril De La Cropte De Chanterac, Olivier Cozelle
  • Patent number: 9817664
    Abstract: Techniques are disclosed relating to register caching techniques for thread switches. In one embodiment, an apparatus includes a register file and caching circuitry. In this embodiment, the register file includes a plurality of registers and the caching circuitry is configured to store information that indicates threads that correspond to data stored in respective ones of the plurality of registers. In this embodiment, the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers. In some embodiments, the disclosed techniques may reduce context switch latency, reduce pressure on a data cache, and/or allow smaller slices of thread execution, for example.
    Type: Grant
    Filed: February 19, 2015
    Date of Patent: November 14, 2017
    Assignee: Apple Inc.
    Inventors: Shachar Ron, Bernard J. Semeria
  • Patent number: 9785538
    Abstract: Arbitrary instruction execution from context memory. In some embodiments, an integrated circuit includes a processor core; a context management circuit coupled to the processor core; and a debug support circuit coupled to the context management circuit, where: the context management circuit is configured to halt a thread running on the processor core and save a halted thread context for that thread into a context memory distinct from the processor core, where the halted thread context comprises a fetched instruction as the next instruction in the execution pipeline; the debug support circuit is configured instruct the context management circuit to modify the halted thread context in the context memory by replacing the fetched instruction with an arbitrary instruction; and the context management circuit is further configured to cause the thread to resume using the modified thread context to execute the arbitrary instruction.
    Type: Grant
    Filed: September 1, 2015
    Date of Patent: October 10, 2017
    Assignee: NXP USA, Inc.
    Inventors: Celso Fernando Veras Brites, Alex Rocha Prado
  • Patent number: 9740496
    Abstract: A processor and a method implemented by the processor to obtain computation results are described. The processor includes a unified reuse table embedded in a processor pipeline, the unified reuse table including a plurality of entries, each entry of the plurality of entries corresponding with a computation instruction or a set of computation instructions. The processor also includes a functional unit to perform a computation based on a corresponding instruction.
    Type: Grant
    Filed: September 6, 2013
    Date of Patent: August 22, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Xiaochen Guo, Hillery C. Hunter, Jude A. Rivers, Vijayalakshmi Srinivasan
  • Patent number: 9740497
    Abstract: A processor and a method implemented by the processor to obtain computation results are described. The processor includes a unified reuse table embedded in a processor pipeline, the unified reuse table including a plurality of entries, each entry of the plurality of entries corresponding with a computation instruction or a set of computation instructions. The processor also includes a functional unit to perform a computation based on a corresponding instruction.
    Type: Grant
    Filed: October 15, 2013
    Date of Patent: August 22, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Xiaochen Guo, Hillery C. Hunter, Jude A. Rivers, Vijayalakshmi Srinivasan
  • Patent number: 9727421
    Abstract: Technologies for environment checkpointing include an orchestration node communicatively coupled to one or more working computing nodes. The orchestration node is configured to administer an environment checkpointing event by transmitting a checkpoint initialization signal to each of the one or more working computing nodes that have been registered with the orchestration node. Each working computing node is configured to pause and buffer any presently executing applications, save checkpointing data (an execution state of each of the one or more applications) and transmit the checkpointing data to the orchestration node. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 24, 2015
    Date of Patent: August 8, 2017
    Assignee: Intel Corporation
    Inventors: Igor Ljubuncic, Ravi A. Giri
  • Patent number: 9652282
    Abstract: One embodiment of the present invention sets forth a technique for instruction level execution preemption. Preempting at the instruction level does not require any draining of the processing pipeline. No new instructions are issued and the context state is unloaded from the processing pipeline. Any in-flight instructions that follow the preemption command in the processing pipeline are captured and stored in a processing task buffer to be reissued when the preempted program is resumed. The processing task buffer is designated as a high priority task to ensure the preempted instructions are reissued before any new instructions for the preempted context when execution of the preempted context is restored.
    Type: Grant
    Filed: November 8, 2011
    Date of Patent: May 16, 2017
    Assignee: NVIDIA Corporation
    Inventors: Philip Alexander Cuadra, Christopher Lamb, Lacky V. Shah
  • Patent number: 9646154
    Abstract: Return oriented programming (ROP) attack prevention techniques are described. In one or more examples, a method is described of protecting against return oriented programming attacks. The method includes initiating a compute signature hardware instruction of a computing device to compute a signature for a return address and the associated location on the stack the return address is stored and causing storage of the computed signature along with the return address in the stack. The method also includes enforcing that before executing the return instruction using the return address on the stack, initiating a verify signature hardware instruction of the computing device to verify the signature matches the target return address on the stack and responding to successful verification of the signature through execution of the verify signature hardware instruction by the computing device, executing the return instruction to the return address.
    Type: Grant
    Filed: January 20, 2015
    Date of Patent: May 9, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ling Tony Chen, Jonathan E. Lange, Greg M. Zaverucha
  • Patent number: 9626194
    Abstract: Method, apparatus, and system embodiments to assign priority to a thread when the thread is otherwise unable to proceed with instruction retirement. For at least one embodiment, the thread is one of a plurality of active threads in a multiprocessor system that includes memory livelock breaker logic and/or starvation avoidance logic. Other embodiments are also described and claimed.
    Type: Grant
    Filed: September 24, 2012
    Date of Patent: April 18, 2017
    Assignee: Intel Corporation
    Inventors: David W. Burns, K. S. Venkatraman
  • Patent number: 9606834
    Abstract: Methods, reservation stations and processors for allocating resources to a plurality of threads based on the extent to which the instructions associated with each of the threads are speculative. The method comprises receiving a speculation metric for each thread at a reservation station. Each speculation metric represents the extent to which the instructions associated with a particular thread are speculative. The more speculative an instruction, the more likely the instruction has been incorrectly predicted by a branch predictor. The reservation station then allocates functional unit resources (e.g. pipelines) to the threads based on the speculation metrics and selects a number of instructions from one or more of the threads based on the allocation. The selected instructions are then issued to the functional unit resources.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: March 28, 2017
    Assignee: Imagination Technologies Limited
    Inventors: Hugh Jackson, Paul Rowland
  • Patent number: 9589311
    Abstract: Techniques to saturate a graphics processing unit (GPU) with independent threads from multiple kernels are described. An apparatus may include a graphics processing unit driver for a graphics processing unit having a first partition including a first plurality of execution units and a second partition including a second plurality of execution units, the graphics processing unit driver to dispatch one or more threads of a first kernel to the first partition and to dispatch one or more threads of a second kernel to the second partition to increase a utilization of the plurality of execution units and avoid hardware resource competition.
    Type: Grant
    Filed: December 18, 2013
    Date of Patent: March 7, 2017
    Assignee: INTEL CORPORATION
    Inventors: Julia A. Gould, Haihua Wu
  • Patent number: 9588845
    Abstract: A processor includes a storage configured to receive a snapshot of a state of the processor prior to performing a set of computations in an approximating manner. The processor also includes an indicator that indicates an amount of error accumulated while the set of computations is performed in the approximating manner. When the processor detects that the amount of error accumulated has exceeded an error bound, the processor is configured to restore the state of the processor to the snapshot from the storage.
    Type: Grant
    Filed: October 23, 2014
    Date of Patent: March 7, 2017
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
  • Patent number: 9569219
    Abstract: A method for assisting operations of a processor core coupled to a first memory and a second memory includes: examining instructions being filled from the first memory to the second memory to extract instruction information containing at least branch information of the instructions, and creating a plurality of tracks based on the extracted instruction information. Further, the method includes filling one or more instructions from the first memory to the second memory based on one or more tracks from the plurality of tracks before the processor core starts executing the instructions, such that the processor core fetches the instructions from the second memory for execution. Filling the instructions further includes pre-fetching from the first memory to the second memory instruction segments containing the instructions corresponding to at least two levels of branch target instructions based on the one or more tracks.
    Type: Grant
    Filed: November 15, 2012
    Date of Patent: February 14, 2017
    Assignee: SHANGHAI XINHAO MICROELECTRONICS CO. LTD.
    Inventor: Chenghao Kenneth Lin
  • Patent number: 9569216
    Abstract: A method for populating a source view data structure by using register template snapshots. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the blocks of instructions; populating a source view data structure, wherein the source view data structure stores sources corresponding to the instruction blocks as recorded by the plurality of register templates; and determining which of the plurality of instruction blocks are ready for dispatch by using the populated source view data structure.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: February 14, 2017
    Assignee: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Patent number: 9563500
    Abstract: A sequence code verification system can be designed to include a data reader, a validity engine, and an error notifier. The data reader can read sequence codes from consecutive logical blocks. The validity engine can invalidate write operations in response to checking data validity by applying comparison operations to sequence codes and block offsets of batch write operations. The error notifier can notify a user of an error for each invalidated write operation batch. The system can validate data written to logical blocks on a storage subsystem adapted so that, during write operations, an additional sequence code is written to each logical block of data. The sequence code can remain constant for each write operation batch and the sequence code can be incremented for each new write operation batch.
    Type: Grant
    Filed: March 14, 2016
    Date of Patent: February 7, 2017
    Assignee: International Business Machines Corporation
    Inventors: Huw Francis, David A. Sinclair
  • Patent number: 9552192
    Abstract: The disclosed embodiments provide a system that facilitates execution of a software program. During operation, the system determines a structure of a software program and an execution context for the software program from a set of possible execution contexts for the software program. Next, the system generates memory layouts for a set of object instances in the software program at least in part by applying the execution context to the structure independently of a local execution context on the computer system. The system then stores the memory layouts in association with the software program.
    Type: Grant
    Filed: November 5, 2014
    Date of Patent: January 24, 2017
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Jean-Francois Denise, Steven J. Drach, Charles J. Hunt
  • Patent number: 9542185
    Abstract: An allocation system and a method for allocating an architectural register in a system having one or more mapping tables. When the allocation system detects a plurality of available architectural registers to an allocation target virtual register, it identifies adjacent instructions to all instructions having the allocation target virtual register in its destination operand, counts the number of uses of the architectural register appearing in the destination operand for each architectural register, summing the number of uses for each architectural register for each entry group in one or more mapping tables having the same assignment rule for correlations with the architectural registers, calculating the total of the numbers of uses of entries for each entry group, and allocating the architectural register to the allocation target virtual register such that the total of the numbers of uses of entries for each entry group approaches uniformity.
    Type: Grant
    Filed: July 2, 2014
    Date of Patent: January 10, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Kazuaki Ishizaki
  • Patent number: 9535772
    Abstract: In a computer system operable at multiple hierarchical privilege levels, a “wait-for-event” (WFE) communication channel between components operating at different privilege levels is established. Initially, a central processing unit (CPU) is configured to to “trap” WFE instructions issued by a client, such as an operating system, operating at one privilege level to an agent, such as a hypervisor, operating at a more privileged level. After storing a predefined special sequence in a storage component (e.g., a register), the client executes a WFE instruction. As part of trapping the WFE instruction, the agent reads and interprets the special sequence from the storage component and may respond to the special sequence by storing another special sequence in a storage component that is accessible to the client. Advantageously, the client may leverage this WFE communication channel to establish low-overhead watchdog functionality for the client.
    Type: Grant
    Filed: December 16, 2014
    Date of Patent: January 3, 2017
    Assignee: VMware, Inc.
    Inventors: Andrei Warkentin, Harvey Tuch
  • Patent number: 9529654
    Abstract: A recoverable and fault-tolerant CPU core and a control method thereof are provided. The recoverable and fault-tolerant CPU core includes first, second, and third arithmetic logic circuits configured to perform a calculation requested by the same instruction, a first selector configured to compare calculation values output from the first, second, and third arithmetic logic circuits by the same instruction, determine as a normal state when two or more of the calculation values are the same, and if not, determine as a fault state, and a register file configured to record the calculation value having the same value, when determining as the normal state in the first selector.
    Type: Grant
    Filed: November 19, 2014
    Date of Patent: December 27, 2016
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Young Su Kwon, Jin Ho Han, Kyung Jin Byun
  • Patent number: 9519507
    Abstract: Systems and methods for managing context switches among threads in a processing system. A processor may perform a context switch between threads using separate context registers. A context switch allows a processor to switch from processing a thread that is waiting for data to one that is ready for additional processing. The processor includes control registers with entries which may indicate that an associated context is waiting for data from an external source.
    Type: Grant
    Filed: May 1, 2015
    Date of Patent: December 13, 2016
    Assignee: ARM Finance Overseas Limited
    Inventors: Robert Gelinas, W. Patrick Hays, Sol Katzman, William J. Dally
  • Patent number: 9489208
    Abstract: A semiconductor device comprising a processor having a pipelined architecture and a pipeline flattener and a method for operating a pipeline flattener in a semiconductor device are provided. The processor comprises a pipeline having a plurality of pipeline stages and a plurality of pipeline registers that are coupled between the pipeline stages. The pipeline flattener comprises a plurality of trigger registers for storing a trigger, wherein the trigger registers are coupled between the pipeline stages.
    Type: Grant
    Filed: May 16, 2012
    Date of Patent: November 8, 2016
    Assignee: TEXAS INSTRUMENTS DEUTSCHLAND GMBH
    Inventors: Markus Koesler, Johann Zipperer, Christian Wiencke, Wolfgang Lutsch
  • Patent number: 9483324
    Abstract: Provided is to a program conversion device which can use processor resources of a system to the utmost and enhance performance ability. The program conversion device includes: specific process determining unit which determines a range of a partial program to perform a specific process in a target program which includes a first execution scheme specifying program which can be executed in parallel with a first ratio of being a usage ratio of a first usage quantity with respect to a first resource of a first processor and a second usage quantity with respect to a second resource of a second processor; and process converting unit which converts the partial program into a second execution scheme specifying program which can be executed in parallel with a second ratio of being the usage ratio different from the first ratio.
    Type: Grant
    Filed: June 12, 2013
    Date of Patent: November 1, 2016
    Assignee: NEC CORPORATION
    Inventor: Takamichi Miyamoto