Context Preserving (e.g., Context Swapping, Checkpointing, Register Windowing) Patents (Class 712/228)
  • Patent number: 10176002
    Abstract: Methods and apparatuses for performing a quiesce operation in a multithread environment are provided. A processor receives a first thread quiesce request from a first thread executing on the processor. A processor sends a first processor quiesce request to a system controller to initiate a quiesce operation. A processor performs one or more operations of the first thread based, at least in part, on receiving a response from the system controller.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: January 8, 2019
    Assignee: International Business Machines Corporation
    Inventors: Michael Fee, Ute Gaertner, Lisa C. Heller, Thomas Koehler, Frank Lehnert, Jennifer A. Navarro
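    Sketch: The quiesce handshake this abstract describes (thread request, processor request to the system controller, then proceed on the controller's response) can be illustrated with a minimal C sketch; the types and the system_controller_quiesce function are hypothetical stand-ins, not interfaces defined by the patent.
      #include <stdbool.h>
      #include <stdio.h>

      typedef struct { int thread_id; bool quiesce_pending; } thread_ctx;

      /* Stand-in for the system controller: acknowledges a processor-level
       * quiesce request once the rest of the system reaches a safe point. */
      static bool system_controller_quiesce(int processor_id) {
          printf("controller: quiescing on behalf of processor %d\n", processor_id);
          return true;                                   /* response/ack */
      }

      /* Processor-side handling of a thread quiesce request. */
      static void handle_thread_quiesce(thread_ctx *t, int processor_id) {
          t->quiesce_pending = true;                     /* 1. thread asks to quiesce */
          if (system_controller_quiesce(processor_id)) { /* 2. forward to controller  */
              printf("thread %d: performing quiesced operations\n", t->thread_id);
              t->quiesce_pending = false;                /* 3. proceed after response */
          }
      }

      int main(void) {
          thread_ctx t = { .thread_id = 0, .quiesce_pending = false };
          handle_thread_quiesce(&t, 1);
          return 0;
      }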
  • Patent number: 10168941
    Abstract: Methods, systems, and computer program products for historical state snapshot construction over temporally evolving data are provided herein. A computer-implemented method includes classifying each of multiple temporally evolving data entities into one of multiple categories based on one or more parameters; partitioning the multiple temporally evolving data entities into multiple partitions based at least on (i) said classifying and (ii) the update frequency of each of the multiple temporally evolving data entities; implementing multiple checkpoints at a distinct temporal interval for each of the multiple partitions; and creating a snapshot of the multiple temporally evolving data entities at a selected past point of time (i) based on said implementing and (ii) in response to a query pertaining to a historical state of one or more of the multiple temporally evolving data entities.
    Type: Grant
    Filed: February 19, 2016
    Date of Patent: January 1, 2019
    Assignee: International Business Machines Corporation
    Inventors: Srikanta B. Jagannath, Sriram Lakshminarasimhan, Sameep Mehta, Animesh Nandi, Narendran Sachindran
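    Sketch: A minimal C sketch of the partitioning idea in 10168941: each partition of temporally evolving entities gets its own checkpoint interval keyed to update frequency, and a historical query maps to the latest checkpoint at or before the requested time, from which changes are replayed. The partition count and intervals are illustrative assumptions.
      #include <stdio.h>

      #define PARTITIONS 3

      /* Checkpoint interval (time units) per partition: frequently updated
       * ("hot") partitions are checkpointed more often than cold ones. */
      static const int interval[PARTITIONS] = { 1, 5, 20 };

      /* Latest checkpoint at or before time t for a given partition. */
      static int checkpoint_before(int partition, int t) {
          return (t / interval[partition]) * interval[partition];
      }

      int main(void) {
          int t = 17;   /* query: reconstruct the historical state at time 17 */
          for (int p = 0; p < PARTITIONS; p++)
              printf("partition %d: start from checkpoint at t=%d, replay to t=%d\n",
                     p, checkpoint_before(p, t), t);
          return 0;
      }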
  • Patent number: 10162687
    Abstract: A processor of an aspect includes at least one lower processing capability and lower power consumption physical compute element and at least one higher processing capability and higher power consumption physical compute element. Migration performance benefit evaluation logic is to evaluate a performance benefit of a migration of a workload from the at least one lower processing capability compute element to the at least one higher processing capability compute element, and to determine whether or not to allow the migration based on the evaluated performance benefit. Available energy and thermal budget evaluation logic is to evaluate available energy and thermal budgets and to determine to allow the migration if the migration fits within the available energy and thermal budgets. Workload migration logic is to perform the migration when allowed by both the migration performance benefit evaluation logic and the available energy and thermal budget evaluation logic.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: December 25, 2018
    Assignee: Intel Corporation
    Inventors: Eugene Gorbatov, Alon Naveh, Inder M. Sodhi, Ganapati N. Srinivasa, Eliezer Weissmann, Guarav Khanna, Mishali Naik, Russell J. Fenger, Andrew D. Henroid, Dheeraj R. Subbareddy, David A. Koufaty, Paolo Narvaez
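    Sketch: The two gates in 10162687 (a performance-benefit check and an energy/thermal-budget check, both of which must pass before the workload migrates to the higher-capability core) can be summarized in C as below; the speedup threshold, power figures, and field names are assumptions for illustration only.
      #include <stdbool.h>
      #include <stdio.h>

      typedef struct {
          double estimated_speedup;   /* expected benefit on the big core */
          double extra_power_w;       /* added power draw if migrated     */
      } workload;

      static bool allow_migration(const workload *w,
                                  double energy_budget_w, double thermal_headroom_w) {
          bool perf_ok   = w->estimated_speedup > 1.2;             /* benefit gate */
          bool budget_ok = w->extra_power_w <= energy_budget_w &&  /* budget gate  */
                           w->extra_power_w <= thermal_headroom_w;
          return perf_ok && budget_ok;               /* both gates must allow it   */
      }

      int main(void) {
          workload w = { .estimated_speedup = 1.6, .extra_power_w = 2.5 };
          printf("migrate: %s\n", allow_migration(&w, 3.0, 4.0) ? "yes" : "no");
          return 0;
      }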
  • Patent number: 10146548
    Abstract: A method for populating a source view data structure by using register template snapshots. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the blocks of instructions; populating a source view data structure, wherein the source view data structure stores sources corresponding to the instruction blocks as recorded by the plurality of register templates; and determining which of the plurality of instruction blocks are ready for dispatch by using the populated source view data structure.
    Type: Grant
    Filed: January 17, 2017
    Date of Patent: December 4, 2018
    Assignee: Intel Corporation
    Inventor: Mohammad Abdallah
  • Patent number: 10127076
    Abstract: A method includes performing one or more operations as requested by a thread executing on a processor, the thread having a thread context; receiving a park request from the thread, the park request received following a request from the thread for a low latency resource, wherein a cache response time is less than or equal to a resource response threshold so as to allow the thread context to be stored in and retrieved from the cache in less time than it takes to complete the request for the low latency resource; storing the thread context in the cache; detecting that a resume condition has occurred; retrieving the thread context from the cache; and resuming execution of the thread.
    Type: Grant
    Filed: June 6, 2016
    Date of Patent: November 13, 2018
    Assignee: Google LLC
    Inventors: Luiz Andre Barroso, James Laudon, Michael R. Marty
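    Sketch: A minimal C sketch of the park/resume flow in 10127076, with a plain buffer standing in for the cache that holds the parked thread context while the low latency request is outstanding; all structures and names are hypothetical.
      #include <stdbool.h>
      #include <stdio.h>
      #include <string.h>

      typedef struct { unsigned long regs[16]; unsigned long pc; } thread_context;

      static thread_context parked_slot;      /* stand-in for the cached context */
      static bool resume_condition = false;   /* e.g. resource response arrived  */

      static void park(const thread_context *live) {
          memcpy(&parked_slot, live, sizeof *live);   /* store context in "cache" */
      }

      static int try_resume(thread_context *live) {
          if (!resume_condition)
              return 0;                               /* still parked             */
          memcpy(live, &parked_slot, sizeof *live);   /* retrieve context         */
          return 1;
      }

      int main(void) {
          thread_context ctx = { .pc = 0x1000 };
          park(&ctx);                 /* thread issues the park request */
          resume_condition = true;    /* low latency resource completes */
          if (try_resume(&ctx))
              printf("resumed at pc=0x%lx\n", ctx.pc);
          return 0;
      }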
  • Patent number: 10115222
    Abstract: A graphics processing unit comprises a programmable execution unit executing graphics processing programs for execution threads to perform graphics processing operations, a local register memory comprising one or more registers, where registers of the register memory are assignable to store data associated with an individual execution thread that is being executed by the execution unit, and where the register(s) assigned to an individual execution thread are accessible only to that associated individual execution thread, and a further local memory that is operable to store data for use in common by plural execution threads, where the data stored in the further local memory is accessible to plural execution threads as they execute. The programmable execution unit is operable to selectively store output data for an execution thread in a register(s) of the local register memory assigned to the execution thread, and the further local memory.
    Type: Grant
    Filed: January 9, 2017
    Date of Patent: October 30, 2018
    Assignee: Arm Limited
    Inventors: Sean Tristram LeGuay Ellis, Thomas James Cooksey, Robert Martin Elliott
  • Patent number: 10095525
    Abstract: A processor may include a reorder buffer, reservation stations, and execution units. The reorder buffer may be a circular buffer with a head pointer and a tail pointer, configured to assign indexes to instructions. Reservation stations may be configured to host instructions with the assigned indexes, while waiting to be issued to the execution units. Responsive to an exception event, reservation stations may be configured to flush instructions that are younger, in program order, than the instruction executed with the exception. Execution units may provide the reorder buffer index EX of the instruction executed with the exception. The reorder buffer may provide the reorder buffer index TP stored in the tail pointer. Reservation stations may be configured to flush instructions with assigned indexes in the wrapped-around increasing interval from the index EX to the index TP.
    Type: Grant
    Filed: November 24, 2017
    Date of Patent: October 9, 2018
    Inventor: Dejan Spasov
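    Sketch: The flush rule in 10095525 reduces to a circular-interval test: an entry is flushed if its index falls in the wrapped-around interval from the excepting instruction's index EX to the tail pointer TP. A minimal C sketch, assuming a power-of-two reorder buffer and treating EX itself as excluded from the flush.
      #include <stdbool.h>
      #include <stdio.h>

      #define ROB_SIZE 64   /* power of two keeps the modular arithmetic cheap */

      /* True if idx is younger than EX, i.e. inside the circular interval (EX, TP]. */
      static bool must_flush(unsigned idx, unsigned ex, unsigned tp) {
          unsigned age_idx = (idx - ex) & (ROB_SIZE - 1);   /* distance from EX */
          unsigned age_tp  = (tp  - ex) & (ROB_SIZE - 1);
          return age_idx != 0 && age_idx <= age_tp;
      }

      int main(void) {
          unsigned ex = 60, tp = 5;   /* interval wraps around from 63 to 0 */
          for (unsigned idx = 58; idx != 8; idx = (idx + 1) & (ROB_SIZE - 1))
              printf("idx %2u: %s\n", idx, must_flush(idx, ex, tp) ? "flush" : "keep");
          return 0;
      }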
  • Patent number: 10067782
    Abstract: Various aspects are disclosed herein for attenuating spin waiting in a virtual machine environment comprising a plurality of virtual machines and virtual processors. Selected virtual processors can be given time slice extensions in order to prevent such virtual processors from becoming de-scheduled (and hence causing other virtual processors to have to spin wait). Selected virtual processors can also be expressly scheduled so that they can be given higher priority to resources, resulting in reduced spin waits for other virtual processors waiting on such selected virtual processors. Finally, various spin wait detection techniques can be incorporated into the time slice extension and express scheduling mechanisms, in order to identify potential and existing spin waiting scenarios.
    Type: Grant
    Filed: November 18, 2015
    Date of Patent: September 4, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yau Ning Chin, John Te-Jui Sheu, Arun Kishan, Thomas Fahrig, Rene Antonio Vega
  • Patent number: 10025608
    Abstract: Methods and apparatuses for performing a quiesce operation in a multithread environment are provided. A processor receives a first thread quiesce request from a first thread executing on the processor. A processor sends a first processor quiesce request to a system controller to initiate a quiesce operation. A processor performs one or more operations of the first thread based, at least in part, on receiving a response from the system controller.
    Type: Grant
    Filed: November 17, 2014
    Date of Patent: July 17, 2018
    Assignee: International Business Machines Corporation
    Inventors: Michael Fee, Ute Gaertner, Lisa C. Heller, Thomas Koehler, Frank Lehnert, Jennifer A. Navarro
  • Patent number: 10019283
    Abstract: A processing device includes a first memory that includes a context buffer. The processing device also includes a processor core to execute threads based on context information stored in registers of the processor core and a memory controller to selectively move a subset of the context information between the context buffer and the registers based on one or more latencies of the threads.
    Type: Grant
    Filed: June 22, 2015
    Date of Patent: July 10, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Dmitri Yudanov, Sergey Blagodurov, Arkaprava Basu, Sooraj Puthoor, Joseph L. Greathouse
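    Sketch: A minimal C sketch of the latency-driven context movement described in 10019283: when a thread is expected to stall for a long time, part of its register context is moved out of the core into a context buffer, and it is brought back before the thread runs again. The threshold, buffer layout, and names are assumptions.
      #include <stdio.h>

      #define CTX_REGS 32
      #define SPILL_THRESHOLD_CYCLES 200

      typedef struct { long regs[CTX_REGS]; int in_core; } context;

      static context context_buffer[8];   /* stand-in for the memory-side buffer */

      static void maybe_spill(context *c, int slot, long expected_stall_cycles) {
          if (expected_stall_cycles > SPILL_THRESHOLD_CYCLES) {
              context_buffer[slot] = *c;  /* move context out of core registers */
              c->in_core = 0;
          }
      }

      static void fill_on_wakeup(context *c, int slot) {
          if (!c->in_core) {              /* bring context back before the thread runs */
              *c = context_buffer[slot];
              c->in_core = 1;
          }
      }

      int main(void) {
          context c = { .in_core = 1 };
          maybe_spill(&c, 0, 5000);       /* a long-latency stall is predicted */
          fill_on_wakeup(&c, 0);          /* thread wakes up                   */
          printf("context back in core: %d\n", c.in_core);
          return 0;
      }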
  • Patent number: 9977968
    Abstract: A method and system for identifying content relevance comprises acquiring video data, mapping the acquired video data to a feature space to obtain a feature representation of the video data, assigning the acquired video data to at least one action class based on the feature representation of the video data, and determining a relevance of the acquired video data.
    Type: Grant
    Filed: March 4, 2016
    Date of Patent: May 22, 2018
    Assignee: Xerox Corporation
    Inventors: Edgar A. Bernal, Qun Li, Yun Zhang, Jayant Kumar, Raja Bala
  • Patent number: 9959238
    Abstract: Message passing is provided among a plurality of interdependent parallel processes using a shared memory. Inter-process communication among a plurality of interdependent processes executing on a plurality of compute nodes is performed by obtaining a message from a first process for a second process; and storing the message in a memory location of a Peripheral Component Interconnect Express (PCIE)-linked storage device, wherein the second process reads the memory location to obtain the message. The message is optionally persistently stored in the PCIE-linked storage device for an asynchronous checkpoint until the message is no longer required for an asynchronous restart.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: May 1, 2018
    Assignee: EMC IP Holding Company LLC
    Inventors: John M. Bent, Sorin Faibish, James M. Pedone, Jr.
  • Patent number: 9946547
    Abstract: A load/store unit for a processor, and applications thereof. In an embodiment, the load/store unit includes a load/store queue configured to store information and data associated with a particular class of instructions. Data stored in the load/store queue can be bypassed to dependent instructions. When an instruction belonging to the particular class of instructions graduates and the instruction is associated with a cache miss, control logic causes a pointer to be stored in a load/store graduation buffer that points to an entry in the load/store queue associated with the instruction. The load/store graduation buffer ensures that graduated instructions access a shared resource of the load/store unit in program order.
    Type: Grant
    Filed: September 29, 2006
    Date of Patent: April 17, 2018
    Assignee: ARM Finance Overseas Limited
    Inventors: Meng-Bing Yu, Era K. Nangia, Michael Ni
  • Patent number: 9940139
    Abstract: A split level history buffer in a central processing unit is provided. A history buffer is split into a first portion and a second portion. An instruction fetch unit fetches and tags instructions. A register file stores tagged instructions. An execution unit generates results for tagged instructions. A first instruction is fetched, tagged, and stored in an entry of the register file. A second instruction is fetched and tagged, and then evicts the first instruction from the register file, such that the second instruction is stored in the entry of the register file. Subsequently, the first instruction is stored in an entry in the first portion of the history buffer. After a result for the first instruction is generated, the first instruction is moved from the first portion of the history buffer to the second portion of the history buffer.
    Type: Grant
    Filed: September 20, 2016
    Date of Patent: April 10, 2018
    Assignee: International Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen, David R. Terry
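    Sketch: A minimal C sketch of the split history buffer in 9940139: an entry evicted from the register file first waits in the first portion until its result is produced, then moves to the second portion. Sizes, fields, and the compaction scheme are hypothetical.
      #include <stdbool.h>
      #include <stdio.h>

      typedef struct { int tag; long value; bool has_result; } hb_entry;

      static hb_entry level1[4];  /* portion 1: still waiting on a result */
      static hb_entry level2[8];  /* portion 2: result has been generated */
      static int n1, n2;

      static void evict_to_history(int tag) {            /* evicted from the register file */
          level1[n1++] = (hb_entry){ .tag = tag, .has_result = false };
      }

      static void result_written(int tag, long value) {  /* execution finishes */
          for (int i = 0; i < n1; i++) {
              if (level1[i].tag != tag) continue;
              level1[i].value = value;
              level1[i].has_result = true;
              level2[n2++] = level1[i];                  /* promote to portion 2 */
              level1[i] = level1[--n1];                  /* compact portion 1    */
              break;
          }
      }

      int main(void) {
          evict_to_history(7);     /* first instruction evicted by the second one */
          result_written(7, 42);   /* its result arrives later                    */
          printf("portion1=%d portion2=%d value=%ld\n", n1, n2, level2[0].value);
          return 0;
      }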
  • Patent number: 9940168
    Abstract: Methods and systems that reduce the number of instances of a shared resource needed for a processor to perform an operation and/or execute a process without impacting function are provided. A method of processing in a processor is provided. Aspects include determining that an operation to be performed by the processor will require the use of a shared resource. A command can be issued to cause a second operation to not use the shared resource N cycles later. The shared resource can then be used for a first aspect of the operation at cycle X and then used for a second aspect of the operation at cycle X+N. The second operation may be rescheduled according to embodiments.
    Type: Grant
    Filed: January 12, 2017
    Date of Patent: April 10, 2018
    Assignee: MIPS Tech, LLC
    Inventor: Debasish Chandra
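    Sketch: The cycle-sharing scheme in 9940168 can be sketched as a small reservation table: an operation claims the single shared resource at cycle X and at cycle X+N up front, and any second operation that would collide is told to reschedule. The table size and bookkeeping are illustrative assumptions.
      #include <stdio.h>

      #define HORIZON 16
      static int reserved[HORIZON];   /* 1 = shared resource already busy that cycle */

      static int reserve(int cycle) {
          int c = cycle % HORIZON;
          return reserved[c] ? 0 : (reserved[c] = 1);
      }

      static void issue_op(int x, int n) {
          if (!reserve(x) || !reserve(x + n)) {   /* claim cycles X and X+N */
              printf("conflict: operation at cycle %d must be rescheduled\n", x);
              return;
          }
          printf("operation uses the shared resource at cycles %d and %d\n", x, x + n);
      }

      int main(void) {
          issue_op(2, 3);                         /* reserves cycles 2 and 5      */
          issue_op(5, 3);                         /* cycle 5 is taken: reschedule */
          return 0;
      }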
  • Patent number: 9891925
    Abstract: An allocation system and a method for allocating an architectural register in a system having one or more mapping tables. When the allocation system detects a plurality of available architectural registers for an allocation target virtual register, it identifies adjacent instructions to all instructions having the allocation target virtual register in its destination operand, counts the number of uses of the architectural register appearing in the destination operand for each architectural register, sums the number of uses for each architectural register for each entry group in one or more mapping tables having the same assignment rule for correlations with the architectural registers, calculates the total of the numbers of uses of entries for each entry group, and allocates the architectural register to the allocation target virtual register such that the total of the numbers of uses of entries for each entry group approaches uniformity.
    Type: Grant
    Filed: October 5, 2016
    Date of Patent: February 13, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Kazuaki Ishizaki
  • Patent number: 9886077
    Abstract: Various systems, processes, and products may be used to manage a processor. In particular implementations, managing a processor may include the ability to determine whether a thread is pausing for a short period of time and place a wait event for the thread in a queue based on a short thread pause occurring. Managing a processor may also include the ability to activate a delay thread that determines whether a wait time associated with the pause has expired and remove the wait event from the queue based on the wait time having expired.
    Type: Grant
    Filed: May 9, 2016
    Date of Patent: February 6, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bernard A. King-Smith, Bret R. Olszewski, Stephen Rees, Basu Vaidyanathan
  • Patent number: 9886416
    Abstract: A matrix of execution blocks forms a set of rows and columns. The rows support parallel execution of instructions and the columns support execution of dependent instructions. The matrix of execution blocks processes a single block of instructions specifying parallel and dependent instructions.
    Type: Grant
    Filed: June 8, 2015
    Date of Patent: February 6, 2018
    Assignee: INTEL CORPORATION
    Inventor: Mohammad A. Abdallah
  • Patent number: 9875204
    Abstract: A processing node of a server rack includes a processor to generate processing node management requests and to process responses to the node management requests, and a communication module to receive the processing node management requests, to transmit over a communication link to a management controller of the server rack external to the processing node a processing node management request, to receive over the communication link from the management controller processing node management information, and to transmit the processing node management information to the processor.
    Type: Grant
    Filed: May 17, 2013
    Date of Patent: January 23, 2018
    Assignee: Dell Products, LP
    Inventors: Robert W. Hormuth, Robert L. Winter, Shawn J. Dube, Bradley J. Booth, Geng Lin, Jimmy Pike
  • Patent number: 9864708
    Abstract: In a computer system operable at multiple hierarchical privilege levels, a “wait-for-event” (WFE) communication channel between components operating at different privilege levels is established. Initially, a central processing unit (CPU) is configured to “trap” WFE instructions issued by a client, such as an operating system, operating at one privilege level to an agent, such as a hypervisor, operating at a more privileged level. After storing a predefined special sequence in a storage component (e.g., a register), the client executes a WFE instruction. As part of trapping the WFE instruction, the agent reads and interprets the special sequence from the storage component and may respond to the special sequence by storing another special sequence in a storage component that is accessible to the client. Advantageously, a client may leverage this WFE communication channel to safely and reliably detect whether an agent is present.
    Type: Grant
    Filed: December 16, 2014
    Date of Patent: January 9, 2018
    Assignee: VMware, Inc.
    Inventors: Andrei Warkentin, Harvey Tuch
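    Sketch: A minimal C sketch of the WFE side channel in 9864708: the client writes a special value to an agreed storage location, executes WFE, the trapping agent reads and answers it, and the client infers the agent's presence from the reply. The magic values and the direct function call standing in for the trap are assumptions.
      #include <stdint.h>
      #include <stdio.h>

      #define CLIENT_MAGIC 0x57464531u   /* probe written by the client (guest)     */
      #define AGENT_REPLY  0x41434b31u   /* reply written by the agent (hypervisor) */

      static uint32_t shared_reg;        /* stand-in for the storage component      */

      /* Agent side: the handler that runs when the WFE instruction traps. */
      static void agent_trap_on_wfe(void) {
          if (shared_reg == CLIENT_MAGIC)
              shared_reg = AGENT_REPLY;
      }

      /* Client side: detect whether an agent is present at a higher privilege level. */
      static int agent_present(void) {
          shared_reg = CLIENT_MAGIC;     /* store the predefined special sequence */
          agent_trap_on_wfe();           /* a real "wfe" would trap to the agent  */
          return shared_reg == AGENT_REPLY;
      }

      int main(void) {
          printf("hypervisor detected: %s\n", agent_present() ? "yes" : "no");
          return 0;
      }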
  • Patent number: 9858151
    Abstract: A computer-implemented method according to one embodiment includes establishing a predetermined checkpoint and storing duplicate read data in association with the predetermined checkpoint during a running of an application that is processing at least one data set, identifying a failure of the application, restarting the application in response to the failure, and enabling a replay of the processing of the at least one data set by the restarted application, utilizing the predetermined checkpoint and the duplicate read data.
    Type: Grant
    Filed: October 3, 2016
    Date of Patent: January 2, 2018
    Assignee: International Business Machines Corporation
    Inventors: Donna N. Dillenberger, David C. Frank, Terri A. Menendez, Gary S. Puchkoff, Wayne E. Rhoten
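    Sketch: A minimal C sketch of checkpoint-plus-duplicate-read-data replay as in 9858151: every record read after the checkpoint is also logged, so a restarted application can replay exactly the same inputs. The record format, log size, and names are assumptions.
      #include <stdio.h>
      #include <string.h>

      #define MAX_RECS 16
      static char replay_log[MAX_RECS][32];   /* duplicate read data for the checkpoint */
      static int  logged;

      static void read_record(const char *rec) {
          strncpy(replay_log[logged], rec, sizeof replay_log[0] - 1);  /* duplicate the read */
          logged++;
          /* ... normal processing of rec would happen here ... */
      }

      static void replay_after_restart(void) {
          for (int i = 0; i < logged; i++)    /* feed the identical inputs back in */
              printf("replaying: %s\n", replay_log[i]);
      }

      int main(void) {
          read_record("order#1");             /* processed before the failure          */
          read_record("order#2");
          replay_after_restart();             /* application restarted from checkpoint */
          return 0;
      }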
  • Patent number: 9830187
    Abstract: In one embodiment, an application programming interface (API) is defined that enables a thread scheduler to communicate thread information to the CPU performance controller when dispatching a thread to a processor or processor core. When dispatching a thread, the scheduler may communicate thread information including thread state information, a general “importance” of the thread as defined by a priority level and/or quality of service (QoS) classification, a measurement of the scheduler dispatch latency for the thread, or architectural information regarding the instructions within the thread, such as whether the thread contains 64-bit or 32-bit instructions. The performance controller can use the information provided by the scheduler to make performance control decisions for the processor cores within the system.
    Type: Grant
    Filed: June 5, 2015
    Date of Patent: November 28, 2017
    Assignee: Apple Inc.
    Inventors: Russell A. Blaine, Daniel A. Chimene, Shantonu Sen, John Dorsey, Bryan Hinch, Cyril De La Cropte De Chanterac, Olivier Cozelle
  • Patent number: 9817664
    Abstract: Techniques are disclosed relating to register caching techniques for thread switches. In one embodiment, an apparatus includes a register file and caching circuitry. In this embodiment, the register file includes a plurality of registers and the caching circuitry is configured to store information that indicates threads that correspond to data stored in respective ones of the plurality of registers. In this embodiment, the apparatus is configured to store, at a point in time at which a first register of the plurality of registers includes first valid data corresponding to a first thread, second valid data corresponding to a second thread in a second register of the plurality of registers. In some embodiments, the disclosed techniques may reduce context switch latency, reduce pressure on a data cache, and/or allow smaller slices of thread execution, for example.
    Type: Grant
    Filed: February 19, 2015
    Date of Patent: November 14, 2017
    Assignee: Apple Inc.
    Inventors: Shachar Ron, Bernard J. Semeria
  • Patent number: 9785538
    Abstract: Arbitrary instruction execution from context memory. In some embodiments, an integrated circuit includes a processor core; a context management circuit coupled to the processor core; and a debug support circuit coupled to the context management circuit, where: the context management circuit is configured to halt a thread running on the processor core and save a halted thread context for that thread into a context memory distinct from the processor core, where the halted thread context comprises a fetched instruction as the next instruction in the execution pipeline; the debug support circuit is configured to instruct the context management circuit to modify the halted thread context in the context memory by replacing the fetched instruction with an arbitrary instruction; and the context management circuit is further configured to cause the thread to resume using the modified thread context to execute the arbitrary instruction.
    Type: Grant
    Filed: September 1, 2015
    Date of Patent: October 10, 2017
    Assignee: NXP USA, Inc.
    Inventors: Celso Fernando Veras Brites, Alex Rocha Prado
  • Patent number: 9740496
    Abstract: A processor and a method implemented by the processor to obtain computation results are described. The processor includes a unified reuse table embedded in a processor pipeline, the unified reuse table including a plurality of entries, each entry of the plurality of entries corresponding with a computation instruction or a set of computation instructions. The processor also includes a functional unit to perform a computation based on a corresponding instruction.
    Type: Grant
    Filed: September 6, 2013
    Date of Patent: August 22, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Xiaochen Guo, Hillery C. Hunter, Jude A. Rivers, Vijayalakshmi Srinivasan
  • Patent number: 9740497
    Abstract: A processor and a method implemented by the processor to obtain computation results are described. The processor includes a unified reuse table embedded in a processor pipeline, the unified reuse table including a plurality of entries, each entry of the plurality of entries corresponding with a computation instruction or a set of computation instructions. The processor also includes a functional unit to perform a computation based on a corresponding instruction.
    Type: Grant
    Filed: October 15, 2013
    Date of Patent: August 22, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Xiaochen Guo, Hillery C. Hunter, Jude A. Rivers, Vijayalakshmi Srinivasan
  • Patent number: 9727421
    Abstract: Technologies for environment checkpointing include an orchestration node communicatively coupled to one or more working computing nodes. The orchestration node is configured to administer an environment checkpointing event by transmitting a checkpoint initialization signal to each of the one or more working computing nodes that have been registered with the orchestration node. Each working computing node is configured to pause and buffer any presently executing applications, save checkpointing data (an execution state of each of the one or more applications) and transmit the checkpointing data to the orchestration node. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 24, 2015
    Date of Patent: August 8, 2017
    Assignee: Intel Corporation
    Inventors: Igor Ljubuncic, Ravi A. Giri
  • Patent number: 9652282
    Abstract: One embodiment of the present invention sets forth a technique for instruction level execution preemption. Preempting at the instruction level does not require any draining of the processing pipeline. No new instructions are issued and the context state is unloaded from the processing pipeline. Any in-flight instructions that follow the preemption command in the processing pipeline are captured and stored in a processing task buffer to be reissued when the preempted program is resumed. The processing task buffer is designated as a high priority task to ensure the preempted instructions are reissued before any new instructions for the preempted context when execution of the preempted context is restored.
    Type: Grant
    Filed: November 8, 2011
    Date of Patent: May 16, 2017
    Assignee: NVIDIA Corporation
    Inventors: Philip Alexander Cuadra, Christopher Lamb, Lacky V. Shah
  • Patent number: 9646154
    Abstract: Return oriented programming (ROP) attack prevention techniques are described. In one or more examples, a method of protecting against return oriented programming attacks is described. The method includes initiating a compute signature hardware instruction of a computing device to compute a signature over a return address and the associated stack location at which the return address is stored, and causing storage of the computed signature along with the return address in the stack. The method also includes enforcing that, before a return instruction is executed using the return address on the stack, a verify signature hardware instruction of the computing device is initiated to verify that the signature matches the return address on the stack, and, responsive to successful verification of the signature through execution of the verify signature hardware instruction, executing the return instruction to the return address.
    Type: Grant
    Filed: January 20, 2015
    Date of Patent: May 9, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ling Tony Chen, Jonathan E. Lange, Greg M. Zaverucha
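    Sketch: The signed-return-address idea in 9646154 can be illustrated in C: sign the return address together with the stack slot that holds it, store the signature beside it, and verify before returning. The patent uses dedicated compute/verify hardware instructions; the keyed mixing function and key here are toy stand-ins.
      #include <stdint.h>
      #include <stdio.h>

      static const uint64_t key = 0x9e3779b97f4a7c15ull;   /* per-process secret */

      /* Toy keyed hash over the return address and its stack location. */
      static uint64_t sign(uint64_t ret_addr, uint64_t slot_addr) {
          uint64_t x = ret_addr ^ (slot_addr * key);
          x ^= x >> 33; x *= key; x ^= x >> 29;
          return x;
      }

      int main(void) {
          uint64_t stack_slot[2];                           /* [0]=return, [1]=signature */
          stack_slot[0] = 0x400812;                         /* "push" the return address */
          stack_slot[1] = sign(stack_slot[0], (uint64_t)(uintptr_t)&stack_slot[0]);

          /* An attacker overwriting stack_slot[0] would invalidate the signature. */

          if (sign(stack_slot[0], (uint64_t)(uintptr_t)&stack_slot[0]) == stack_slot[1])
              printf("signature ok: return to %#llx\n", (unsigned long long)stack_slot[0]);
          else
              printf("signature mismatch: ROP attempt detected\n");
          return 0;
      }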
  • Patent number: 9626194
    Abstract: Method, apparatus, and system embodiments to assign priority to a thread when the thread is otherwise unable to proceed with instruction retirement. For at least one embodiment, the thread is one of a plurality of active threads in a multiprocessor system that includes memory livelock breaker logic and/or starvation avoidance logic. Other embodiments are also described and claimed.
    Type: Grant
    Filed: September 24, 2012
    Date of Patent: April 18, 2017
    Assignee: Intel Corporation
    Inventors: David W. Burns, K. S. Venkatraman
  • Patent number: 9606834
    Abstract: Methods, reservation stations and processors for allocating resources to a plurality of threads based on the extent to which the instructions associated with each of the threads are speculative. The method comprises receiving a speculation metric for each thread at a reservation station. Each speculation metric represents the extent to which the instructions associated with a particular thread are speculative. The more speculative an instruction, the more likely the instruction has been incorrectly predicted by a branch predictor. The reservation station then allocates functional unit resources (e.g. pipelines) to the threads based on the speculation metrics and selects a number of instructions from one or more of the threads based on the allocation. The selected instructions are then issued to the functional unit resources.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: March 28, 2017
    Assignee: Imagination Technologies Limited
    Inventors: Hugh Jackson, Paul Rowland
  • Patent number: 9589311
    Abstract: Techniques to saturate a graphics processing unit (GPU) with independent threads from multiple kernels are described. An apparatus may include a graphics processing unit driver for a graphics processing unit having a first partition including a first plurality of execution units and a second partition including a second plurality of execution units, the graphics processing unit driver to dispatch one or more threads of a first kernel to the first partition and to dispatch one or more threads of a second kernel to the second partition to increase a utilization of the plurality of execution units and avoid hardware resource competition.
    Type: Grant
    Filed: December 18, 2013
    Date of Patent: March 7, 2017
    Assignee: INTEL CORPORATION
    Inventors: Julia A. Gould, Haihua Wu
  • Patent number: 9588845
    Abstract: A processor includes a storage configured to receive a snapshot of a state of the processor prior to performing a set of computations in an approximating manner. The processor also includes an indicator that indicates an amount of error accumulated while the set of computations is performed in the approximating manner. When the processor detects that the amount of error accumulated has exceeded an error bound, the processor is configured to restore the state of the processor to the snapshot from the storage.
    Type: Grant
    Filed: October 23, 2014
    Date of Patent: March 7, 2017
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
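    Sketch: A minimal C sketch of bounded approximation as in 9588845: snapshot the state before computing in an approximating manner, accumulate an error estimate as the approximate steps run, and restore the snapshot once the accumulated error exceeds the bound. The error model and bound are illustrative.
      #include <math.h>
      #include <stdio.h>

      #define ERROR_BOUND 0.001

      int main(void) {
          double state = 1.0;
          double snapshot = state;            /* snapshot taken before approximating */
          double accumulated_error = 0.0;

          for (int i = 1; i <= 8; i++) {
              double exact  = state * 1.01;           /* what full precision yields */
              double approx = state + state * 0.0098; /* cheaper approximate step   */
              accumulated_error += fabs(exact - approx);
              state = approx;

              if (accumulated_error > ERROR_BOUND) {  /* error bound exceeded:  */
                  state = snapshot;                   /* restore the saved state */
                  printf("rolled back at step %d\n", i);
                  break;
              }
          }
          printf("state=%.4f accumulated_error=%.4f\n", state, accumulated_error);
          return 0;
      }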
  • Patent number: 9569219
    Abstract: A method for assisting operations of a processor core coupled to a first memory and a second memory includes: examining instructions being filled from the first memory to the second memory to extract instruction information containing at least branch information of the instructions, and creating a plurality of tracks based on the extracted instruction information. Further, the method includes filling one or more instructions from the first memory to the second memory based on one or more tracks from the plurality of tracks before the processor core starts executing the instructions, such that the processor core fetches the instructions from the second memory for execution. Filling the instructions further includes pre-fetching from the first memory to the second memory instruction segments containing the instructions corresponding to at least two levels of branch target instructions based on the one or more tracks.
    Type: Grant
    Filed: November 15, 2012
    Date of Patent: February 14, 2017
    Assignee: SHANGHAI XINHAO MICROELECTRONICS CO. LTD.
    Inventor: Chenghao Kenneth Lin
  • Patent number: 9569216
    Abstract: A method for populating a source view data structure by using register template snapshots. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of register templates to track instruction destinations and instruction sources by populating the register template with block numbers corresponding to the instruction blocks, wherein the block numbers corresponding to the instruction blocks indicate interdependencies among the blocks of instructions; populating a source view data structure, wherein the source view data structure stores sources corresponding to the instruction blocks as recorded by the plurality of register templates; and determining which of the plurality of instruction blocks are ready for dispatch by using the populated source view data structure.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: February 14, 2017
    Assignee: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Patent number: 9563500
    Abstract: A sequence code verification system can be designed to include a data reader, a validity engine, and an error notifier. The data reader can read sequence codes from consecutive logical blocks. The validity engine can invalidate write operations in response to checking data validity by applying comparison operations to sequence codes and block offsets of batch write operations. The error notifier can notify a user of an error for each invalidated write operation batch. The system can validate data written to logical blocks on a storage subsystem adapted so that, during write operations, an additional sequence code is written to each logical block of data. The sequence code can remain constant for each write operation batch and the sequence code can be incremented for each new write operation batch.
    Type: Grant
    Filed: March 14, 2016
    Date of Patent: February 7, 2017
    Assignee: International Business Machines Corporation
    Inventors: Huw Francis, David A. Sinclair
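    Sketch: A minimal C sketch of the batch validity check in 9563500: every logical block written in one batch carries the same sequence code and consecutive block offsets, and the code is incremented for the next batch, so a mismatch marks an invalid (torn or stale) write. The metadata layout is an assumption.
      #include <stdbool.h>
      #include <stdio.h>

      typedef struct { unsigned seq; int offset; } block_meta;

      /* A batch is valid if all blocks share one sequence code and the
       * block offsets are consecutive. */
      static bool batch_valid(const block_meta *b, int n) {
          for (int i = 1; i < n; i++)
              if (b[i].seq != b[0].seq || b[i].offset != b[i - 1].offset + 1)
                  return false;
          return true;
      }

      int main(void) {
          block_meta good[] = { {7, 0}, {7, 1}, {7, 2} };   /* one batch, sequence 7 */
          block_meta torn[] = { {7, 0}, {6, 1}, {7, 2} };   /* stale middle block    */
          printf("good batch: %s\n", batch_valid(good, 3) ? "valid" : "invalid");
          printf("torn batch: %s\n", batch_valid(torn, 3) ? "valid" : "invalid");
          return 0;
      }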
  • Patent number: 9552192
    Abstract: The disclosed embodiments provide a system that facilitates execution of a software program. During operation, the system determines a structure of a software program and an execution context for the software program from a set of possible execution contexts for the software program. Next, the system generates memory layouts for a set of object instances in the software program at least in part by applying the execution context to the structure independently of a local execution context on the computer system. The system then stores the memory layouts in association with the software program.
    Type: Grant
    Filed: November 5, 2014
    Date of Patent: January 24, 2017
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Jean-Francois Denise, Steven J. Drach, Charles J. Hunt
  • Patent number: 9542185
    Abstract: An allocation system and a method for allocating an architectural register in a system having one or more mapping tables. When the allocation system detects a plurality of available architectural registers for an allocation target virtual register, it identifies adjacent instructions to all instructions having the allocation target virtual register in its destination operand, counts the number of uses of the architectural register appearing in the destination operand for each architectural register, sums the number of uses for each architectural register for each entry group in one or more mapping tables having the same assignment rule for correlations with the architectural registers, calculates the total of the numbers of uses of entries for each entry group, and allocates the architectural register to the allocation target virtual register such that the total of the numbers of uses of entries for each entry group approaches uniformity.
    Type: Grant
    Filed: July 2, 2014
    Date of Patent: January 10, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Kazuaki Ishizaki
  • Patent number: 9535772
    Abstract: In a computer system operable at multiple hierarchical privilege levels, a “wait-for-event” (WFE) communication channel between components operating at different privilege levels is established. Initially, a central processing unit (CPU) is configured to “trap” WFE instructions issued by a client, such as an operating system, operating at one privilege level to an agent, such as a hypervisor, operating at a more privileged level. After storing a predefined special sequence in a storage component (e.g., a register), the client executes a WFE instruction. As part of trapping the WFE instruction, the agent reads and interprets the special sequence from the storage component and may respond to the special sequence by storing another special sequence in a storage component that is accessible to the client. Advantageously, the client may leverage this WFE communication channel to establish low-overhead watchdog functionality for the client.
    Type: Grant
    Filed: December 16, 2014
    Date of Patent: January 3, 2017
    Assignee: VMware, Inc.
    Inventors: Andrei Warkentin, Harvey Tuch
  • Patent number: 9529654
    Abstract: A recoverable and fault-tolerant CPU core and a control method thereof are provided. The recoverable and fault-tolerant CPU core includes first, second, and third arithmetic logic circuits configured to perform a calculation requested by the same instruction, a first selector configured to compare calculation values output from the first, second, and third arithmetic logic circuits by the same instruction, determine as a normal state when two or more of the calculation values are the same, and if not, determine as a fault state, and a register file configured to record the calculation value having the same value, when determining as the normal state in the first selector.
    Type: Grant
    Filed: November 19, 2014
    Date of Patent: December 27, 2016
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Young Su Kwon, Jin Ho Han, Kyung Jin Byun
  • Patent number: 9519507
    Abstract: Systems and methods for managing context switches among threads in a processing system. A processor may perform a context switch between threads using separate context registers. A context switch allows a processor to switch from processing a thread that is waiting for data to one that is ready for additional processing. The processor includes control registers with entries which may indicate that an associated context is waiting for data from an external source.
    Type: Grant
    Filed: May 1, 2015
    Date of Patent: December 13, 2016
    Assignee: ARM Finance Overseas Limited
    Inventors: Robert Gelinas, W. Patrick Hays, Sol Katzman, William J. Dally
  • Patent number: 9489208
    Abstract: A semiconductor device comprising a processor having a pipelined architecture and a pipeline flattener and a method for operating a pipeline flattener in a semiconductor device are provided. The processor comprises a pipeline having a plurality of pipeline stages and a plurality of pipeline registers that are coupled between the pipeline stages. The pipeline flattener comprises a plurality of trigger registers for storing a trigger, wherein the trigger registers are coupled between the pipeline stages.
    Type: Grant
    Filed: May 16, 2012
    Date of Patent: November 8, 2016
    Assignee: TEXAS INSTRUMENTS DEUTSCHLAND GMBH
    Inventors: Markus Koesler, Johann Zipperer, Christian Wiencke, Wolfgang Lutsch
  • Patent number: 9483324
    Abstract: Provided is a program conversion device which can use the processor resources of a system to the utmost and enhance performance. The program conversion device includes: a specific process determining unit which determines a range of a partial program to perform a specific process in a target program which includes a first execution scheme specifying program which can be executed in parallel with a first ratio, being a usage ratio of a first usage quantity with respect to a first resource of a first processor and a second usage quantity with respect to a second resource of a second processor; and a process converting unit which converts the partial program into a second execution scheme specifying program which can be executed in parallel with a second ratio, being a usage ratio different from the first ratio.
    Type: Grant
    Filed: June 12, 2013
    Date of Patent: November 1, 2016
    Assignee: NEC CORPORATION
    Inventor: Takamichi Miyamoto
  • Patent number: 9459869
    Abstract: Instructions may require one or more operands to be executed, which may be provided from a register file. In the context of a GPU, however, a register file may be a relatively large structure, and reading from the register file may be energy and/or time intensive. An operand cache may store a subset of operands, and may use less power and have quicker access times than the register file. In some embodiments, intelligent operand prefetching may speed execution by reducing memory bank conflicts (e.g., conflicts within a register file containing multiple memory banks). An unused operand slot for one instruction (e.g., an instruction that does not require the maximum number of source operands allowed by an instruction set architecture) may be used to prefetch an operand for another instruction in one embodiment. Prefetched operands may be stored in an operand cache, and prefetching may occur based on software-provided information.
    Type: Grant
    Filed: August 20, 2013
    Date of Patent: October 4, 2016
    Assignee: Apple Inc.
    Inventors: Timothy A. Olson, Terence M. Potter, James S. Blomgren, Andrew M. Havlir
  • Patent number: 9442861
    Abstract: Apparatuses, systems, and a method for providing a processor architecture with data prefetching are described. In one embodiment, a system includes one or more processing units that include a first type of in-order pipeline to receive at least one data prefetch instruction. The one or more processing units include a second type of in-order pipeline having issues slots to receive instructions and a data prefetch queue to receive the at least one data prefetch instruction. The data prefetch queue may issue the at least one data prefetch instruction to the second type of in-order pipeline based upon one or more factors (e.g., at least one execution slot of the second type of in-order pipeline being available, priority of the data prefetch instruction).
    Type: Grant
    Filed: December 20, 2011
    Date of Patent: September 13, 2016
    Assignee: Intel Corporation
    Inventor: James Earl McCormick, Jr.
  • Patent number: 9442772
    Abstract: A global interconnect system. The global interconnect system includes a plurality of resources having data for supporting the execution of multiple code sequences and a plurality of engines for implementing the execution of the multiple code sequences. A plurality of resource consumers are within each of the plurality of engines. A global interconnect structure is coupled to the plurality of resource consumers and coupled to the plurality of resources to enable data access and execution of the multiple code sequences, wherein the resource consumers access the resources through a per cycle utilization of the global interconnect structure.
    Type: Grant
    Filed: May 18, 2012
    Date of Patent: September 13, 2016
    Assignee: SOFT MACHINES INC.
    Inventor: Mohammad Abdallah
  • Patent number: 9424086
    Abstract: A system comprises a scheduling unit for scheduling jobs to resources, and a library unit comprising a machine map of the system and a global status map of interconnections of resources. A monitoring unit generates status information signals for the resources. The library unit receives the signals and determines a free map of resources to execute the job to be scheduled, the free map indicating the interconnection of resources to which the job in a current scheduling cycle can be scheduled and determined by removing from the machine map resources which fall within the global status map and re-introducing resources in the global status map which the scheduling unit has indicated the job being scheduled can be scheduled to. The monitoring unit dispatches a job to the resources in the free map which match the resource mapping requirements of the job and fall within the free map.
    Type: Grant
    Filed: September 18, 2013
    Date of Patent: August 23, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Igor Shpigelman
  • Patent number: 9411599
    Abstract: Data operand fetching control includes a computer processor that includes a control unit for determining memory access operations. The control unit is configured to perform a method. The method includes calculating a summation weight value for each instruction in a pipeline, the summation weight value calculated as a function of branch uncertainty and a pendency in which the instruction resides in the pipeline relative to other instructions in the pipeline. The method also includes mapping the summation weight value of a selected instruction that is attempting to access system memory to a memory access control, each memory access control specifying a manner of handling data fetching operations. The method further includes performing a memory access operation for the selected instruction based upon the mapping.
    Type: Grant
    Filed: June 24, 2010
    Date of Patent: August 9, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christian Jacobi, Barry W. Krumm, Brian R. Prasky, Martin Recktenwald, Chung-Lung K. Shum, Charles F. Webb, Joshua M. Weinberg
  • Patent number: 9411642
    Abstract: When a computing system is running at a lower clock rate, in response to an event that triggers the computing system to increase the clock rate, a list of threads pending execution by the computing system is accessed. The list includes a thread that, when executed, causes the clock rate to increase. That thread is selected and executed before any other thread in the list is executed.
    Type: Grant
    Filed: January 17, 2014
    Date of Patent: August 9, 2016
    Assignee: NVIDIA CORPORATION
    Inventors: Yogish Sadashiv Kulkarni, Li Li, Vikas Ashok Jain
  • Patent number: 9396020
    Abstract: An apparatus is described having multiple cores, each core having: a) an accelerator; and, b) a general purpose CPU coupled to the accelerator. The general purpose CPU has functional unit logic circuitry to execute an instruction that returns an amount of storage space to store context information of the accelerator.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: July 19, 2016
    Assignee: Intel Corporation
    Inventors: Boris Ginzburg, Ronny Ronen, Eliezer Weissmann, Karthikeyan Vaithianathan, Ehud Cohen