Simultaneous Issuance Of Multiple Instructions Patents (Class 712/215)
  • Patent number: 11132228
    Abstract: A computing device and a method of allocating vector register files in a simultaneously-multithreaded (SMT) processor core are provided. A request for a first number (M) of vector register files is received from a borrower thread of the processor core. One or more available donor threads of the processor core are identified. A second number (N) of the vector register files, of the identified one or more available donor threads, are assigned to the borrower thread, where N ≤ M. The borrower thread is parameterized to create a virtualized vector register file for the borrower thread, based on a width of the N vector register files of the identified one or more donor threads.
    Type: Grant
    Filed: March 21, 2018
    Date of Patent: September 28, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mauricio Serrano, Giles Frazier, Silvia Melitta Mueller
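The borrowing scheme in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the donor table layout, the rule that each donor contributes one register file, and the choice to sum donor widths into the virtual width are all hypothetical and are not taken from the patent.

```python
def borrow_register_files(requested_m, donors):
    """Grant up to requested_m register files from idle donor threads.

    donors maps a thread id to (is_available, width_bits). Assumption:
    each available donor can lend exactly one vector register file.
    Returns (granted_thread_ids, virtual_width_bits).
    """
    # Identify the donor threads that are currently available.
    available = [(tid, width) for tid, (idle, width) in donors.items() if idle]
    # Assign N files, where N never exceeds the requested M.
    granted = available[:requested_m]
    # Assumption: the virtual register file width is the sum of donor widths.
    virtual_width = sum(width for _, width in granted)
    return [tid for tid, _ in granted], virtual_width


donors = {"t1": (True, 128), "t2": (False, 128), "t3": (True, 128)}
ids, width = borrow_register_files(3, donors)
print(ids, width)
```

Here the borrower asks for three files but only two donors are available, so it is granted two and sized accordingly.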
  • Patent number: 11132486
    Abstract: Systems and methods are provided that include a standard cell with multiple input and output storage elements, such as flip flops, latches, etc., with some combination logic interconnected between them. In embodiments, the slave latches on input flip flops are replaced with a smaller number of latches at a downstream node(s) of the combination logic, resulting in improved performance, area and power, while maintaining functionality at the interface pins of the standard cell. The process of inferring such a standard cell from a behavioral description, such as RTL, of a design or remapping equivalent sub-circuits from a netlist to such a standard cell is also described.
    Type: Grant
    Filed: May 21, 2020
    Date of Patent: September 28, 2021
    Assignee: Taiwan Semiconductor Manufacturing Company, Ltd.
    Inventors: Guru Prasad, Sachin Kumar
  • Patent number: 11126438
    Abstract: In one embodiment, a reservation station of a processor includes: a plurality of first lanes having a plurality of entries to store information for instructions having in-order dependencies; a variable latency tracking table including a second plurality of entries to store information for instructions having a variable latency; and a scheduler circuit to access a head entry of the plurality of first lanes to schedule, for execution on at least one execution unit, at least one instruction from the head entry of at least one of the plurality of first lanes. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: September 21, 2021
    Assignee: Intel Corporation
    Inventors: Srikanth Srinivasan, Thomas Mullins, Ammon Christiansen, James Hadley, Robert S. Chappell, Sean Mirkes
  • Patent number: 11106363
    Abstract: A nonvolatile memory device includes a nonvolatile memory, a volatile memory being a cache memory of the nonvolatile memory, and a first controller configured to control the nonvolatile memory. The nonvolatile memory device further includes a second controller configured to receive a device write command and an address, and transmit, to the volatile memory through a first bus, a first read command and the address and a first write command and the address sequentially, and transmit a second write command and the address to the first controller through a second bus, in response to the reception of the device write command and the address.
    Type: Grant
    Filed: May 17, 2019
    Date of Patent: August 31, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Youngjin Cho, Sungyong Seo, Sun-Young Lim, Uksong Kang, Chankyung Kim, Duckhyun Chang, JinHyeok Choi
  • Patent number: 11055099
    Abstract: A method and system for disassembling, assembling, and delivering branch look-ahead (BLA) instructions are designed to improve the speed of branch prediction and instruction fetch in microprocessor systems by reducing the number of clock cycles required to deliver branch instructions to a branch predictor located inside the microprocessors. The invention is also designed for reducing run-length of the instructions found between branch instructions by disassembling the instructions in a basic block as a BLA instruction and a single or plurality of non-BLA instructions from the software/assembly program. The invention is also designed for dynamically reassembling the BLA and the non-BLA instructions and delivering them to a single or plurality of microprocessors in a compatible sequence. In particular, the reassembled instructions are concurrently delivered to a single or plurality of microprocessors in a timely and precise manner while providing compatibility of the software/assembly program.
    Type: Grant
    Filed: February 17, 2019
    Date of Patent: July 6, 2021
    Inventor: Yong-Kyu Jung
  • Patent number: 11048515
    Abstract: Disclosed herein are systems and methods for instruction tightly-coupled memory (iTIM) and instruction cache (iCache) access prediction. A processor may use a predictor to enable access to the iTIM or the iCache and a particular way (a memory structure) based on a location state and program counter value. The predictor may determine whether to stay in an enabled memory structure, move to and enable a different memory structure, or move to and enable both memory structures. Stay and move predictions may be based on whether a memory structure boundary crossing has occurred due to sequential instruction processing, branch or jump instruction processing, branch resolution, and cache miss processing. The program counter and a location state indicator may use feedback and be updated each instruction-fetch cycle to determine which memory structure(s) needs to be enabled for the next instruction fetch.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: June 29, 2021
    Assignee: SiFive, Inc.
    Inventors: Krste Asanovic, Andrew Waterman
  • Patent number: 10990745
    Abstract: An integrated circuit includes a first flip-flop and a second flip-flop. The first flip-flop has a first driving capability. The second flip-flop has a second driving capability different from the first driving capability. The first flip-flop and the second flip-flop are part of a multibit flip-flop configured to share at least a first clock pin. The first clock pin is configured to receive a first clock signal.
    Type: Grant
    Filed: September 3, 2019
    Date of Patent: April 27, 2021
    Assignee: TAIWAN SEMICONDUCTOR MANUFACTURING COMPANY LTD.
    Inventors: Sheng-Hsiung Chen, Shao-Huan Wang, Wen-Hao Chen, Chun-Yao Ku, Hung-Chih Ou
  • Patent number: 10956160
    Abstract: A processor and method are described for a multi-level reservation station.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: March 23, 2021
    Assignee: Intel Corporation
    Inventors: Mark Dechene, Srikanth Srinivasan, Matthew Merten, Ammon Christiansen
  • Patent number: 10942745
    Abstract: Fast issuance and execution of a multi-width instruction across multiple slices in a parallel slice processor core is supported in part through the use of an early notification signal passed between issue logic associated with multiple slices handling that multi-width instruction coupled with an issuance of a different instruction by the originating issue logic for the early notification signal.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: March 9, 2021
    Assignee: International Business Machines Corporation
    Inventors: Salma Ayub, Jeffrey C. Brownscheidle, Sundeep Chadha, Dung Q. Nguyen, Tu-An T. Nguyen, Salim A. Shah, Brian W. Thompto
  • Patent number: 10936321
    Abstract: An approach is disclosed that in one or more embodiments includes receiving an indicator to issue an out-of-order instruction or a type of out-of-order instruction in-order; receiving a first instruction; determining whether the first instruction corresponds to the indicated out-of-order instruction or the type of out-of-order instruction; writing, in response to determining that the first instruction corresponds to the indicated out-of-order instruction or the type of out-of-order instruction, an instruction identifier and a dependent instruction opcode into a first queue and an issue queue of the processor; receiving at least one subsequent instruction; determining whether an instruction opcode of the subsequent instructions matches the dependent instruction opcode of the first instruction; and writing, in response to determining the instruction opcode of the subsequent instruction matches the dependent instruction opcode of the instruction, a dependent instruction identifier for the subsequent instruc
    Type: Grant
    Filed: February 1, 2019
    Date of Patent: March 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kurt A. Feiste, Joshua W. Bowman, Christopher M. Mueller, Dung Q. Nguyen, Deepak K. Singh, Brian W. Thompto
  • Patent number: 10929140
    Abstract: Aspects of the invention include tracking dependencies between instructions in an issue queue. The tracking includes, for each instruction in the issue queue, identifying whether the instruction is dependent on each of a threshold number of instructions added to the issue queue prior to the instruction. The tracking also includes identifying whether the instruction is dependent on one or more other instructions in a group of instructions in the issue queue that were added to the issue queue prior to the instruction and that are not included in the threshold number of instructions that are tracked individually. A dependency between the instruction and the one or more other instructions in the group of instructions is tracked using a single summary bit that is set to indicate that a dependency exists between the instruction and the group of instructions. Instructions are issued from the issue queue based at least in part on the tracking.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: February 23, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joel A. Silberman, Balaram Sinharoy
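The summary-bit idea in this abstract can be illustrated with a small sketch. It is a simplification under stated assumptions: the function and parameter names are hypothetical, the individually tracked instructions are taken to be the most recently added ones, and a dependency is modeled only as a source register matching an earlier destination register.

```python
def track_dependencies(prior_dests, sources, threshold):
    """Return (individual_bits, summary_bit) for a newly added instruction.

    prior_dests: destination register of each instruction already in the
    issue queue, oldest first. sources: set of source registers read by
    the new instruction. Assumption: the `threshold` most recent prior
    instructions are tracked one bit each; all older ones form a group.
    """
    recent = prior_dests[-threshold:] if threshold > 0 else []
    group = prior_dests[:len(prior_dests) - len(recent)]
    # One dependency bit per individually tracked instruction.
    individual_bits = [dest in sources for dest in recent]
    # A single summary bit covers every instruction in the older group.
    summary_bit = any(dest in sources for dest in group)
    return individual_bits, summary_bit


bits, summary = track_dependencies(["r1", "r2", "r3", "r4"], {"r1", "r4"}, 2)
print(bits, summary)
```

The trade-off is that the summary bit records that some dependency on the group exists without identifying which instruction in the group causes it, so storage stays fixed regardless of queue depth.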
  • Patent number: 10884743
    Abstract: A method of activating scheduling instructions within a parallel processing unit is described. The method comprises decoding, in an instruction decoder, an instruction in a scheduled task in an active state and checking, by an instruction controller, if a swap flag is set in the decoded instruction. If the swap flag in the decoded instruction is set, a scheduler is triggered to de-activate the scheduled task by changing the scheduled task from the active state to a non-active state.
    Type: Grant
    Filed: June 18, 2018
    Date of Patent: January 5, 2021
    Assignee: Imagination Technologies Limited
    Inventors: Simon Nield, Yoong-Chert Foo, Adam de Grasse, Luca Iuliano
  • Patent number: 10860327
    Abstract: A method for scheduling micro-instructions, performed by a qualifier, is provided. The method includes the following steps: detecting a load write-back signal broadcasted by a load execution unit; determining whether to trigger a load-detection counting logic according to content of the load write-back signal; determining whether an execution status of a load micro-instruction is cache hit when the triggered load-detection counting logic reaches a predetermined value; and driving a release circuit to remove the first micro-instruction in a reservation station queue when the execution status of the load micro-instruction is cache hit and the first micro-instruction has been dispatched to an arithmetic and logic unit for execution.
    Type: Grant
    Filed: October 2, 2018
    Date of Patent: December 8, 2020
    Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
    Inventor: Xiaolong Fei
  • Patent number: 10853076
    Abstract: An apparatus is provided to perform branch prediction in respect of a plurality of instructions divided into a plurality of blocks. Receiving circuitry receives references to at least two blocks in the plurality of blocks. Branch prediction circuitry performs at least two branch predictions at a time. The branch predictions are performed in respect of the at least two blocks and the at least two blocks are non-contiguous.
    Type: Grant
    Filed: February 21, 2018
    Date of Patent: December 1, 2020
    Assignee: Arm Limited
    Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Eddy Lapeyre, Luc Orion
  • Patent number: 10831496
    Abstract: The present disclosure relates to a method to execute successive dependent instructions from an instruction stream in a processor. The method may include identifying a first instruction and a second instruction. A given operand of the second instruction is an output of the first instruction of the pair. The first instruction is older than the second instruction. The method may include loading the operands of the first instruction and the second instruction. The method may include executing the first instruction and the second instruction.
    Type: Grant
    Filed: February 28, 2019
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Maarten J. Boersma, Michael Klaus Kroener, Niels Fricke, Razvan Peter Figuli, Nandor Szirmak, Dung Q. Nguyen
  • Patent number: 10831232
    Abstract: A computer architecture suitable for out-of-order processors manages the problem of timing slack, in which an instruction completes before the end of its clock cycle, by recycling that slack to allow the next succeeding instruction to begin execution earlier. This recycling mechanism is enabled through the use of transparent gating between execution units, which allows data transfer before clock cycle boundaries and, in some cases, by aggressively issuing child instructions contemporaneously with their parent instruction after a grandparent instruction is issued.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: November 10, 2020
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Gokul Subramanian Ravi, Mikko H. Lipasti
  • Patent number: 10747541
    Abstract: Instructions are executed in a pipeline. Storage accessible to the pipeline stores branch prediction information characterizing results of branch instructions previously executed. A predicted branch result is provided, for at least some branch instructions, based on a selected predictor of multiple predictors. An actual branch result is provided based on an executed branch instruction, and the branch prediction information is updated based on the actual branch result. The predictors include: a first predictor that determines the predicted branch result based on at least a portion of the branch prediction information; and a second predictor that determines the predicted branch result independently from the branch prediction information.
    Type: Grant
    Filed: January 25, 2018
    Date of Patent: August 18, 2020
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Shubhendu Sekhar Mukherjee, David Kravitz, Edward J. McLellan
  • Patent number: 10732976
    Abstract: A processor includes an instruction pipeline. The pipeline can be operated alternatively in a multi-thread mode and in a single-thread mode. In the multi-thread mode, the instruction pipeline processes multiple threads in an interleaved or simultaneous manner. In the single-thread mode, the pipeline processes a single thread. The instruction pipeline comprises multiple functional units, each of which is reserved for one thread among the multiple threads when the pipeline is in the multi-thread mode and reserved for one context layer among multiple context layers when the instruction pipeline is in the single-thread mode.
    Type: Grant
    Filed: January 10, 2013
    Date of Patent: August 4, 2020
    Assignee: NXP USA, Inc.
    Inventors: Alistair Robertson, Jeffrey W. Scott
  • Patent number: 10719325
    Abstract: Very long instruction word (VLIW) instruction processing using a reduced-width processor is disclosed. In a particular embodiment, a VLIW processor includes a control circuit configured to receive a VLIW packet that includes a first number of instructions and to distribute the instructions to a second number of instruction execution paths. The first number is greater than the second number. The VLIW processor also includes physical registers configured to store results of executing the instructions and a register renaming circuit that is coupled to the control circuit.
    Type: Grant
    Filed: November 7, 2017
    Date of Patent: July 21, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Peter Sassone, Christopher Koob, Suresh Kumar Venkumahanti
  • Patent number: 10705851
    Abstract: A method for scheduling micro-instructions, performed by a first qualifier, is provided. The method includes the following steps: detecting a write-back signal broadcasted by a second qualifier; determining whether a value of a first load-detection counting logic is to be synchronized with a value of a second load-detection counting logic carried by the write-back signal according to content of the write-back signal; determining whether execution statuses of all load micro-instructions are cache hit when the synchronized value of the first load-detection counting logic reaches a predetermined value; and driving a release circuit to remove a micro-instruction in a reservation station queue when the execution statuses of the all load micro-instructions are cache hit and the micro-instruction has been dispatched to an arithmetic and logic unit for execution.
    Type: Grant
    Filed: October 2, 2018
    Date of Patent: July 7, 2020
    Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
    Inventor: Xiaolong Fei
  • Patent number: 10698729
    Abstract: The invention relates to a method for organizing tasks in at least some nodes of a computer cluster, comprising: first, launching two containers on each of said nodes, a standard container and a priority container; next, for all or part of said nodes with two containers, at each node: while no priority task occurs, assigning one or more available resources of the node to its standard container in order to execute a standard task, its priority container not executing any task; when a priority task occurs, dynamically switching only a portion of the resources from its standard container to its priority container, such that the priority task is executed in the priority container with the switched portion of the resources, and the standard task continues to be executed, without being halted, in the standard container with the non-switched portion of the resources.
    Type: Grant
    Filed: December 16, 2015
    Date of Patent: June 30, 2020
    Assignee: BULL SAS
    Inventors: Yann Maupu, Thomas Cadeau, Matthieu Daniel
  • Patent number: 10678542
    Abstract: Systems, apparatuses, and methods for implementing a non-shifting reservation station. A dispatch unit may write an operation into any entry of a reservation station. The reservation station may include an age matrix for determining the relative ages of the operations stored in the entries of the reservation station. The reservation station may include selection logic which is configured to pick the oldest ready operation from the reservation station based on the values stored in the age matrix. The selection logic may utilize control logic to mask off columns of an age matrix corresponding to non-ready operations so as to determine which operation is the oldest ready operation in the reservation station. Also, the reservation station may be configured to dequeue operations early when these operations do not have a load dependency.
    Type: Grant
    Filed: July 24, 2015
    Date of Patent: June 9, 2020
    Assignee: Apple Inc.
    Inventors: Ian D. Kountanis, Mahesh K. Reddy
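An age matrix of the kind this abstract describes can be modeled in software as follows. This is a minimal sketch, not the patented circuit: the class and method names are hypothetical, and the convention that `older[i][j]` means "entry j is older than entry i" is an assumption chosen for the illustration.

```python
class AgeMatrixStation:
    """Software model of a non-shifting reservation station (hypothetical)."""

    def __init__(self, size):
        self.size = size
        self.valid = [False] * size
        self.ready = [False] * size
        self.ops = [None] * size
        # older[i][j] is True when entry j holds an operation older than entry i.
        self.older = [[False] * size for _ in range(size)]

    def dispatch(self, slot, op, ready=False):
        """Write an operation into any free entry without shifting others."""
        if self.valid[slot]:
            raise ValueError("slot occupied")
        for j in range(self.size):
            self.older[slot][j] = self.valid[j]  # all valid entries are older
            self.older[j][slot] = False          # the newcomer is older than none
        self.valid[slot], self.ready[slot], self.ops[slot] = True, ready, op

    def pick_oldest_ready(self):
        """Return the slot of the oldest ready operation, or None."""
        for i in range(self.size):
            if not (self.valid[i] and self.ready[i]):
                continue
            # Columns of non-ready entries are masked off: only ready,
            # valid entries can block entry i from being selected.
            if not any(self.older[i][j] and self.valid[j] and self.ready[j]
                       for j in range(self.size)):
                return i
        return None

    def issue(self, slot):
        """Dequeue the operation in slot and clear its column."""
        op = self.ops[slot]
        self.valid[slot], self.ready[slot], self.ops[slot] = False, False, None
        for i in range(self.size):
            self.older[i][slot] = False
        return op


rs = AgeMatrixStation(4)
rs.dispatch(2, "load")             # oldest, not yet ready
rs.dispatch(0, "add", ready=True)
rs.dispatch(3, "mul", ready=True)
print(rs.pick_oldest_ready())      # slot holding "add"
```

Because age is stored in the matrix rather than implied by position, an operation can occupy any free entry and entries never need to shift when one is dequeued.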
  • Patent number: 10620960
    Abstract: An apparatus and method are provided for performing branch prediction. The apparatus has processing circuitry for executing instructions out-of-order with respect to original program order, and event counting prediction circuitry for maintaining event count values for branch instructions, for use in making branch outcome predictions for those branch instructions. Further, checkpointing storage stores state information of the apparatus at a plurality of checkpoints to enable the state information to be restored for a determined one of those checkpoints in response to a flush event. The event counting prediction circuitry has training storage with a first number of training entries, each training entry being associated with a branch instruction.
    Type: Grant
    Filed: August 20, 2018
    Date of Patent: April 14, 2020
    Assignee: Arm Limited
    Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Vincenzo Consales
  • Patent number: 10613993
    Abstract: Program code intended to be copied into the cache memory of a microprocessor is transferred encrypted between the random-access memory and the processor, and the decryption is carried out at the level of the cache memory. A checksum may be inserted into the cache lines in order to allow integrity verification, and this checksum is then replaced with a specific instruction before delivery of an instruction word to the central unit of the microprocessor.
    Type: Grant
    Filed: January 30, 2015
    Date of Patent: April 7, 2020
    Assignee: STMICROELECTRONICS SA
    Inventor: Bruno Fel
  • Patent number: 10606590
    Abstract: Technical solutions are described for out-of-order (OoO) execution of one or more instructions by a processing unit. An example method includes looking up, by a load-store unit (LSU), an entry in an effective address directory (EAD) for an effective address (EA) of an operand of an instruction to be launched. Further, the method includes, in response to the EA being present in the EAD, launching, by the LSU, the instruction with a real address (RA) from the EAD, and in response to the EA not being present in the EAD, looking up, by the LSU, the EA in an effective real table (ERT) entry, and launching the instruction with the RA from the ERT entry. Further, in response to the ERT entry being removed, the ERT entry including an ERT index and a mapping between the EA and the RA, removing the entry of the EA from the EAD.
    Type: Grant
    Filed: October 6, 2017
    Date of Patent: March 31, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bryan Lloyd, Balaram Sinharoy
  • Patent number: 10606593
    Abstract: Technical solutions are described for out-of-order (OoO) execution of one or more instructions by a processing unit. An example method includes looking up, by a load-store unit (LSU), an entry in an effective address directory (EAD) for an effective address (EA) of an operand of an instruction to be launched. Further, the method includes, in response to the EA being present in the EAD, launching, by the LSU, the instruction with a real address (RA) from the EAD, and in response to the EA not being present in the EAD, looking up, by the LSU, the EA in an effective real table (ERT) entry, and launching the instruction with the RA from the ERT entry. Further, in response to the ERT entry being removed, the ERT entry including an ERT index and a mapping between the EA and the RA, removing the entry of the EA from the EAD.
    Type: Grant
    Filed: November 29, 2017
    Date of Patent: March 31, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bryan Lloyd, Balaram Sinharoy
  • Patent number: 10592298
    Abstract: A system and method for processing a data packet. The method comprises initiating processing of a received plurality of data packets by CPU cores; tracking, by a scale management routine, processing queues for the CPU cores and their load. In response to an average size of a processing queue being lower than a first pre-determined queue threshold, and a CPU core load being lower than a first pre-determined load threshold, preventing adding new data packets to the processing queue, and monitoring emptying of processing queues for each processing CPU core. In response to an average size of a processing queue or a CPU core load being above a second pre-determined queue threshold or a second pre-determined load threshold, transmitting all data from processing queues for each processing CPU core to a memory buffer, increasing the number of processing cores by one; and initiating data packet processing.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: March 17, 2020
    Assignee: NFWARE, INC.
    Inventors: Alexander Britkin, Viacheslav Morozov, Igor Pavlov
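The threshold logic in this abstract can be sketched as a small decision function plus a scale-up step. The names, the threshold tuple layout, and the round-robin redistribution of buffered packets are assumptions made for illustration; the abstract does not specify them.

```python
def scaling_action(avg_queue, core_load, low, high):
    """Return 'drain', 'scale_up', or 'hold'.

    low and high are (queue_threshold, load_threshold) pairs (hypothetical).
    """
    low_queue, low_load = low
    high_queue, high_load = high
    # Both metrics below the first thresholds: stop feeding the queue.
    if avg_queue < low_queue and core_load < low_load:
        return "drain"
    # Either metric above the second thresholds: add a core.
    if avg_queue > high_queue or core_load > high_load:
        return "scale_up"
    return "hold"


def apply_scale_up(core_queues):
    """Move all queued packets to a buffer, add one core, and restart."""
    buffer = [pkt for queue in core_queues for pkt in queue]
    new_count = len(core_queues) + 1
    new_queues = [[] for _ in range(new_count)]
    # Assumption: buffered packets are redistributed round-robin.
    for index, pkt in enumerate(buffer):
        new_queues[index % new_count].append(pkt)
    return new_queues


print(scaling_action(60, 0.5, (5, 0.3), (50, 0.9)))
print(apply_scale_up([[1, 2, 3], [4, 5]]))
```

Note the asymmetry: draining requires both metrics to be low, while scaling up is triggered by either metric being high.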
  • Patent number: 10585669
    Abstract: A system and method suppress occurrence of stalling caused by data dependency other than register dependency in an out-of-order processor. A stall reducing method includes a handler for detecting a stall occurring during execution of execution code using a performance monitoring unit, and for identifying, based on dependencies, a second instruction on which a first instruction is data dependent, the stall being based on this dependency. A profiler registers the second instruction as profile information. An optimization module inserts a thread yield instruction in an appropriate position inside execution code or an original code file based on the profile information, and outputs optimized execution code.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: March 10, 2020
    Assignee: International Business Machines Corporation
    Inventor: Takeshi Ogasawara
  • Patent number: 10564976
    Abstract: Aspects of the invention include tracking dependencies between instructions in an issue queue. The tracking includes, for each instruction in the issue queue, identifying whether the instruction is dependent on each of a threshold number of instructions added to the issue queue prior to the instruction. The tracking also includes identifying whether the instruction is dependent on one or more other instructions added to the issue queue prior to the instruction that are not included in the each of the threshold number of instructions. A dependency between the instruction and each of the other instructions is tracked as a plurality of groups by indicating that a dependency exists between the instruction and one of the groups based on identifying a dependency between the instruction and at least one instruction in the group. Instructions are issued from the issue queue based at least in part on the tracking.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: February 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joel A. Silberman, Balaram Sinharoy
  • Patent number: 10558460
    Abstract: Systems and techniques are disclosed for general purpose register dynamic allocation based on latency associated with instructions in processor threads. A streaming processor can include general purpose registers configured to store data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocation based on execution latencies of instructions included in the threads.
    Type: Grant
    Filed: December 14, 2016
    Date of Patent: February 11, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Yun Du, Liang Han, Lin Chen, Chihong Zhang, Hongjiang Shang, Jing Wu, Zilin Ying, Chun Yu, Guofang Jiao, Andrew Gruber, Eric Demers
  • Patent number: 10552130
    Abstract: A method of providing by a code optimization service an optimized version of a code unit to a managed runtime environment is disclosed. Information related to one or more runtime conditions associated with the managed runtime environment that is executing in a different process than that of the code optimization service is obtained, wherein the one or more runtime conditions are subject to change during the execution of the code unit. The optimized version of the code unit and a corresponding set of one or more speculative assumptions are provided to the managed runtime environment, wherein the optimized version of the code unit produces the same logical results as the code unit unless at least one of the set of one or more speculative assumptions is not true, wherein the set of one or more speculative assumptions are based on the information related to the one or more runtime conditions.
    Type: Grant
    Filed: June 8, 2018
    Date of Patent: February 4, 2020
    Assignee: Azul Systems, Inc.
    Inventors: Gil Tene, Philip Reames
  • Patent number: 10445097
    Abstract: Apparatus and methods are disclosed for decoding targets from an instruction and transmitting data to those targets in accordance with a current instruction. Multimodal target hardware is used in conjunction with one or more of the routers so as to route data to an appropriate target. The data can be one or more operands or a predicate and the targets can include operand buffers, broadcast channels, and general registers. In this way, operands, for example, can be directed for use with multiple subsequent instructions, and there are multiple modes for distributing the operands to the multiple instructions.
    Type: Grant
    Filed: March 17, 2016
    Date of Patent: October 15, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Douglas C. Burger, Aaron L. Smith
  • Patent number: 10437603
    Abstract: The disclosed inventions include a processor apparatus and method that enable a general purpose processor to achieve twice the operating frequency of typical processor implementations with a modest increase in area and a modest increase in energy per operation. The invention relies upon exploiting multiple independent streams of execution. Low area and low energy memory arrays used for register files operate at a modest frequency. Instructions can be issued at a rate higher than this frequency by including logic that guarantees that instructions from the same thread are spaced wider than the time to access the register file. The result of the invention is the ability to overlap long latency structures, which allows using lower energy structures, thereby reducing energy per operation.
    Type: Grant
    Filed: February 20, 2018
    Date of Patent: October 8, 2019
    Inventor: Kevin Sean Halle
  • Patent number: 10379866
    Abstract: An electronic apparatus generating compiled data used in a very long instruction word (VLIW) processor including a plurality of function units is provided. The electronic apparatus includes a storage and a processor configured to control the storage to store the compiled data in which a plurality of VLIW instructions are compiled, identify a VLIW instruction from the compiled data; and update, if a multi-cycle no operation (nop) instruction for the plurality of function units is identified within a cycle corresponding to a latency of the identified VLIW instruction and if an end cycle of another VLIW instruction is within the cycle corresponding to the latency of the identified VLIW instruction, the compiled data by including information on a cycle difference between an end cycle of the identified VLIW instruction and the end cycle of the another VLIW instruction in the multi-cycle nop instruction.
    Type: Grant
    Filed: July 19, 2017
    Date of Patent: August 13, 2019
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Jong-hun Lee, Jae-un Park, Si-hoon Song, Myung-sun Kim
  • Patent number: 10346174
    Abstract: Operation of a multi-slice processor that includes a plurality of execution slices and a plurality of load/store slices, where the multi-slice processor is configured to dynamically cancel partial load operations by, among other steps, receiving a load instruction requesting multiple portions of data; determining that a load of one portion of the requested multiple portions is unavailable to be issued; and responsive to determining that the load of the one portion of the requested multiple portions is unavailable to be issued, delaying issuance of the load instruction.
    Type: Grant
    Filed: March 24, 2016
    Date of Patent: July 9, 2019
    Assignee: International Business Machines Corporation
    Inventors: Elizabeth A. McGlone, Jennifer L. Molnar
  • Patent number: 10346165
    Abstract: A load/store unit including a memory queue configured to store a plurality of memory instructions and state information indicating whether each memory instruction of the plurality of memory instructions can be performed independently, with, separately, or after older pending instructions; and a state-selection circuit configured to set the state information of each memory instruction of the plurality of memory instructions in view of an older pending instruction in the memory queue.
    Type: Grant
    Filed: October 31, 2014
    Date of Patent: July 9, 2019
    Assignee: Avago Technologies International Sales Pte. Limited
    Inventor: Tariq Kurd
  • Patent number: 10331455
    Abstract: An electronic apparatus generating compiled data used in a very long instruction word (VLIW) processor including a plurality of function units is provided. The electronic apparatus includes a storage and a processor configured to control the storage to store the compiled data in which a plurality of VLIW instructions are compiled, identify a VLIW instruction from the compiled data; and update, if a multi-cycle no operation (nop) instruction for the plurality of function units is identified within a cycle corresponding to a latency of the identified VLIW instruction and if an end cycle of another VLIW instruction is within the cycle corresponding to the latency of the identified VLIW instruction, the compiled data by including information on a cycle difference between an end cycle of the identified VLIW instruction and the end cycle of the another VLIW instruction in the multi-cycle nop instruction.
    Type: Grant
    Filed: July 19, 2017
    Date of Patent: June 25, 2019
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Jong-hun Lee, Jae-un Park, Si-hoon Song, Myung-sun Kim
  • Patent number: 10324724
    Abstract: Methods and apparatuses relating to a fusion manager to fuse instructions are described. In one embodiment, a hardware processor includes a hardware binary translator to translate an instruction stream into a translated instruction stream, a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, a hardware decode unit to decode the single fused instruction into a decoded, single fused instruction, and a hardware execution unit to execute the decoded, single fused instruction.
    Type: Grant
    Filed: December 16, 2015
    Date of Patent: June 18, 2019
    Assignee: Intel Corporation
    Inventors: Patrick P. Lai, Tyler N. Sondag, Sebastian Winkel, Polychronis Xekalakis, Ethan Schuchman, Jayesh Iyer
  • Patent number: 10282207
    Abstract: Operation of a multi-slice processor that includes execution slices and load/store slices coupled via a results bus includes: receiving, by an execution slice, a producer instruction, including: storing, in an entry of an issue queue, the producer instruction; and storing, in a register, an issue queue entry identifier representing the entry of the issue queue in which the producer instruction is stored; receiving, by the execution slice, a source instruction, the source instruction dependent upon the result of the producer instruction, including: storing, in another entry of the issue queue, the source instruction and the issue queue entry identifier of the producer instruction; determining in dependence upon the issue queue entry identifier of the producer instruction that the producer instruction has issued from the issue queue; and responsive to the determination that the producer instruction has issued from the issue queue, issuing the source instruction from the issue queue.
    Type: Grant
    Filed: February 18, 2016
    Date of Patent: May 7, 2019
    Assignee: International Business Machines Corporation
    Inventors: Brian D. Barrick, Sundeep Chadha, Michael J. Genden, Jerry Y. Lu, Dung Q. Nguyen, Nasrin Sultana, David R. Terry, David S. Walder
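The dependency tracking described in the abstract above might be modeled roughly as follows. This is a simplified sketch, not the claimed hardware: the class `IssueQueue` and its methods are hypothetical names, and the register that holds the producer's issue queue entry identifier is reduced to an argument passed when the dependent instruction is inserted.

```python
class IssueQueue:
    """Toy issue queue: a dependent instruction records the entry id of
    its producer and may issue only after that producer has issued."""

    def __init__(self):
        self.entries = {}    # entry id -> (instruction name, producer entry id or None)
        self.issued = set()  # entry ids that have already issued
        self.next_id = 0

    def insert(self, name, producer_entry=None):
        """Store an instruction and return its issue queue entry id."""
        entry_id = self.next_id
        self.next_id += 1
        self.entries[entry_id] = (name, producer_entry)
        return entry_id

    def can_issue(self, entry_id):
        """An entry is issuable if it has no producer or its producer has issued."""
        _, producer = self.entries[entry_id]
        return producer is None or producer in self.issued

    def issue(self, entry_id):
        """Issue the entry if allowed; return True on success."""
        if not self.can_issue(entry_id):
            return False
        self.issued.add(entry_id)
        return True
```

For example, after `p = q.insert("load")` and `c = q.insert("add", producer_entry=p)`, the call `q.issue(c)` is refused until `q.issue(p)` has succeeded.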
  • Patent number: 10269088
    Abstract: A mechanism is described for facilitating thread execution arbitration for thread scheduling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes assigning priority levels to threads based on stall signals communicated from the one or more shared function units to one or more execution units of a processor including a graphics processor, and selecting a first thread to be scheduled and a second thread to be ignored based on the stall signals.
    Type: Grant
    Filed: April 21, 2017
    Date of Patent: April 23, 2019
    Assignee: INTEL CORPORATION
    Inventors: Joydeep Ray, Abhishek R. Appu, Subramaniam M. Maiyuran, Eric J. Hoekstra, Prasoonkumar Surti, Balaji Vembu, Altug Koker
  • Patent number: 10268482
    Abstract: Operation of a multi-slice processor that includes execution slices and load/store slices coupled via a results bus includes: receiving, by an execution slice, a producer instruction, including: storing, in an entry of an issue queue, the producer instruction; and storing, in a register, an issue queue entry identifier representing the entry of the issue queue in which the producer instruction is stored; receiving, by the execution slice, a source instruction, the source instruction dependent upon the result of the producer instruction, including: storing, in another entry of the issue queue, the source instruction and the issue queue entry identifier of the producer instruction; determining in dependence upon the issue queue entry identifier of the producer instruction that the producer instruction has issued from the issue queue; and responsive to the determination that the producer instruction has issued from the issue queue, issuing the source instruction from the issue queue.
    Type: Grant
    Filed: December 15, 2015
    Date of Patent: April 23, 2019
    Assignee: International Business Machines Corporation
    Inventors: Brian D. Barrick, Sundeep Chadha, Michael J. Genden, Jerry Y. Lu, Dung Q. Nguyen, Nasrin Sultana, David R. Terry, David S. Walder
  • Patent number: 10248421
    Abstract: Operation of a multi-slice processor that includes execution slices and load/store slices coupled via a results bus, including: for a target instruction targeting a logical register, determining whether an entry in a general purpose register representing the logical register is pending a flush; if the entry in the general purpose register representing the logical register is pending a flush: cancelling the flush in the entry of the general purpose register; storing the target instruction in the entry of the general purpose register representing the logical register, and if an entry in a history buffer targeting the logical register is pending a restore, cancelling the restore for the entry of the history buffer.
    Type: Grant
    Filed: February 16, 2016
    Date of Patent: April 2, 2019
    Assignee: International Business Machines Corporation
    Inventors: Salma Ayub, Brian D. Barrick, Joshua W. Bowman, Sundeep Chadha, Cliff Kucharski, Dung Q. Nguyen, David R. Terry, Jing Zhang
  • Patent number: 10241790
    Abstract: Operation of a multi-slice processor that includes execution slices and load/store slices coupled via a results bus, including: for a target instruction targeting a logical register, determining whether an entry in a general purpose register representing the logical register is pending a flush; if the entry in the general purpose register representing the logical register is pending a flush: cancelling the flush in the entry of the general purpose register; storing the target instruction in the entry of the general purpose register representing the logical register, and if an entry in a history buffer targeting the logical register is pending a restore, cancelling the restore for the entry of the history buffer.
    Type: Grant
    Filed: December 15, 2015
    Date of Patent: March 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Salma Ayub, Brian D. Barrick, Joshua W. Bowman, Sundeep Chadha, Cliff Kucharski, Dung Q. Nguyen, David R. Terry, Jing Zhang
  • Patent number: 10241557
    Abstract: A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
    Type: Grant
    Filed: December 12, 2013
    Date of Patent: March 26, 2019
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Ronald P Hall, Ramesh B Gunna, Ian D Kountanis, Shyam Sundar, André Seznec
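The gating decision in the abstract above can be sketched in a few lines. The names `NextFetchPredictor`, `train`, and `bp_array_enabled` are hypothetical, as is the choice to keep the branch predictor array enabled for fetch groups the table has not seen; the abstract does not specify that case.

```python
class NextFetchPredictor:
    """Toy table mapping a fetch group address to a flag that records
    whether the group contains a conditional branch."""

    def __init__(self):
        self.has_cond_branch = {}

    def train(self, group_addr, has_conditional_branch):
        """Record whether the fetch group at group_addr has a conditional branch."""
        self.has_cond_branch[group_addr] = bool(has_conditional_branch)

    def bp_array_enabled(self, next_group_addr):
        """Decide whether the branch predictor array should be powered
        for the next cycle. Unknown groups default to enabled (assumption)."""
        return self.has_cond_branch.get(next_group_addr, True)
```

A group trained as branch-free returns `False`, which models disabling the array for that cycle; everything else leaves it enabled.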
  • Patent number: 10235232
    Abstract: A processor includes an indicator configured to indicate a first mode or a second mode and a functional unit configured to perform computations with a full degree of accuracy when the indicator indicates the first mode and to perform computations with less than the full degree of accuracy when the indicator indicates the second mode.
    Type: Grant
    Filed: October 23, 2014
    Date of Patent: March 19, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD
    Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
  • Patent number: 10235181
    Abstract: An out-of-order (OOO) processor includes ready logic that provides a signal indicating an instruction is ready when all operands for the instruction are ready, or when all operands are either ready or are marked back-to-back to a current instruction. By marking a second instruction that consumes an operand as ready when it is back-to-back with a first instruction that produces the operand, but the first instruction has not yet produced the operand, latency due to missed cycles in executing back-to-back instructions is minimized.
    Type: Grant
    Filed: February 3, 2017
    Date of Patent: March 19, 2019
    Assignee: International Business Machines Corporation
    Inventor: Brian W. Thompto
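The ready condition stated in the abstract above reduces to a simple predicate. The sketch below is illustrative only; the function name `is_ready` and the representation of operands as register-name strings held in sets are assumptions.

```python
def is_ready(operands, ready, back_to_back):
    """Return True when every source operand is either already ready or
    marked back-to-back with the instruction that produces it.

    operands:     iterable of source operand names
    ready:        set of operand names whose values are available
    back_to_back: set of operand names marked back-to-back to the
                  currently issuing producer
    """
    return all(op in ready or op in back_to_back for op in operands)
```

An instruction reading `r1` and `r2` is reported ready when `r1` is available and `r2` is marked back-to-back, even though `r2` has not been produced yet.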
  • Patent number: 10228982
    Abstract: A mechanism is provided for allocating a hyper-threaded processor to nodes of multi-tenant distributed software systems. Responsive to receiving a request to provision a node of the multi-tenant distributed software system on the host data processing system, a cluster to which the node belongs is identified. Responsive to the node being a second type of node, responsive to determining that another second type of node in the same cluster has been provisioned on the host data processing system, and responsive to the number of unallocated VPs on different physical processors from that of the other second type of node being greater than or equal to the requested number of VPs for the second type of node, the requested number of VPs for the second type of node is allocated each to a different physical processor from that of the other second type of node.
    Type: Grant
    Filed: January 25, 2018
    Date of Patent: March 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Rachit Arora, Dharmesh K. Jain, Padmanabhan Krishnan, Shrinivas S. Kulkarni, Subin Shekhar
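One way to picture the allocation condition in the abstract above is the sketch below. It covers only the final step (placing a second-type node's virtual processors, or VPs, on physical processors not used by the other second-type node in the same cluster). The function name `allocate_vps`, the dictionary layout, and the lowest-numbered-first selection order are assumptions, not details from the patent.

```python
def allocate_vps(free_vps, peer_processors, requested):
    """Pick `requested` unallocated VPs that sit on physical processors
    different from those used by the peer node.

    free_vps:        dict mapping physical processor id -> list of free VP ids
    peer_processors: set of physical processor ids used by the peer node
    requested:       number of VPs to allocate

    Returns a list of (processor id, VP id) pairs, or None if too few
    eligible VPs are available.
    """
    candidates = [
        (proc, vp)
        for proc, vps in sorted(free_vps.items())
        if proc not in peer_processors
        for vp in vps
    ]
    if len(candidates) < requested:
        return None
    return candidates[:requested]
```

The check `len(candidates) < requested` mirrors the abstract's condition that the number of unallocated VPs on different physical processors be greater than or equal to the requested number.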
  • Patent number: 10229066
    Abstract: A data processing apparatus is provided including queue circuitry to respond to control signals each associated with a memory access instruction, and to queue a plurality of requests for data, each associated with a reference to a storage location. Resolution circuitry acquires a request for data, and issues the request for data, the resolution circuitry having a resolution circuitry limit. When a current capacity of the resolution circuitry is below the resolution circuitry limit, the resolution circuitry acquires the request for data by receiving the request for data from the queue circuitry, stores the request for data in association with the storage location, issues the request for data, and causes a result of issuing the request for data to be provided to said storage location.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: March 12, 2019
    Assignee: ARM Limited
    Inventors: Miles Robert Dooley, Matthew Andrew Rafacz, Huzefa Moiz Sanjeliwala, Michael Filippo
  • Patent number: 10223126
    Abstract: An out-of-order (OOO) processor includes ready logic that provides a signal indicating an instruction is ready when all operands for the instruction are ready, or when all operands are either ready or are marked back-to-back to a current instruction. By marking a second instruction that consumes an operand as ready when it is back-to-back with a first instruction that produces the operand, but the first instruction has not yet produced the operand, latency due to missed cycles in executing back-to-back instructions is minimized.
    Type: Grant
    Filed: January 6, 2017
    Date of Patent: March 5, 2019
    Assignee: International Business Machines Corporation
    Inventor: Brian W. Thompto
  • Patent number: 10216547
    Abstract: A mechanism is provided for allocating a hyper-threaded processor to nodes of multi-tenant distributed software systems. Responsive to receiving a request to provision a node of the multi-tenant distributed software system on the host data processing system, a cluster to which the node belongs is identified. Responsive to the node being a second type of node, responsive to determining that another second type of node in the same cluster has been provisioned on the host data processing system, and responsive to the number of unallocated VPs on different physical processors from that of the other second type of node being greater than or equal to the requested number of VPs for the second type of node, the requested number of VPs for the second type of node is allocated each to a different physical processor from that of the other second type of node.
    Type: Grant
    Filed: November 22, 2016
    Date of Patent: February 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Rachit Arora, Dharmesh K. Jain, Padmanabhan Krishnan, Shrinivas S. Kulkarni, Subin Shekhar