Superscalar Patents (Class 712/23)
  • Patent number: 11934934
    Abstract: An apparatus to facilitate optimization of a convolutional neural network (CNN) is disclosed. The apparatus includes optimization logic to receive a CNN model having a list of instructions and including pruning logic to optimize the list of instructions by eliminating branches in the list of instructions that comprise a weight value of 0.
    Type: Grant
    Filed: April 17, 2017
    Date of Patent: March 19, 2024
    Assignee: Intel Corporation
    Inventors: Liwei Ma, Elmoustapha Ould- Ahmed-Vall, Barath Lakshmanan, Ben J. Ashbaugh, Jingyi Jin, Jeremy Bottleson, Mike B. Macpherson, Kevin Nealis, Dhawal Srivastava, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Altug Koker, Abhishek R. Appu
  • Patent number: 11868818
    Abstract: Techniques for selectively executing a lock instruction speculatively or non-speculatively based on lock address prediction and/or temporal lock prediction. including methods an devices for locking an entry in a memory device. In some techniques, a lock instruction executed by a thread for a particular memory entry of a memory device is detected. Whether contention occurred for the particular memory entry during an earlier speculative lock is detected on a condition that the lock instruction comprises a speculative lock instruction. The lock is executed non-speculatively if contention occurred for the particular memory entry during an earlier speculative lock. The lock is executed speculatively if contention did not occur for the particular memory entry during an earlier speculative lock.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: January 9, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gregory W. Smaus, John M. King, Matthew A. Rafacz, Matthew M. Crum
  • Patent number: 11847463
    Abstract: A processor includes a load/store unit and an execution pipeline to execute an instruction that represents a single-instruction-multiple-data (SIMD) operation, and which references a memory block storing operand data for one or more lanes of a plurality of lanes and a mask vector indicating which lanes of a plurality of lanes are enabled and which are disabled for the operation. The execution pipeline executes an instruction in a first execution mode unless a memory fault is generated during execution of the instruction in the first execution mode. In response to the memory fault, the execution pipeline re-executes the instruction in a second execution mode. In the first execution mode, a single load operation is attempted to access the memory block via the load/store unit. In the second execution mode, a separate load operation is performed by the load/store unit for each enabled lane of the plurality of lanes prior to executing the SIMD operation.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: December 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kai Troester, Scott Thomas Bingham, John M. King, Michael Estlick, Erik Swanson, Robert Weidner
  • Patent number: 11789735
    Abstract: In an embodiment, the present invention includes a processor having an execution logic to execute instructions and a control transfer termination (CTT) logic coupled to the execution logic. This logic is to cause a CTT fault to be raised if a target instruction of a control transfer instruction is not a CTT instruction. Other embodiments are described and claimed.
    Type: Grant
    Filed: May 25, 2021
    Date of Patent: October 17, 2023
    Assignee: Intel Corporation
    Inventors: Vedvyas Shanbhogue, Jason W. Brandt, Uday R. Savagaonkar, Ravi L. Sahita
  • Patent number: 11775185
    Abstract: A memory device includes a plurality of memory dies, each memory die of the plurality of memory dies comprising a memory array and control logic. The control logic comprises a plurality of processing threads to execute memory access operations on the memory array concurrently, a thread selection component to identify one or more processing threads of the plurality of processing threads for a power management cycle of the associated memory die and a power management component to determine an amount of power associated with the one or more processing threads and request the amount of power during the power management cycle.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: October 3, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Luca Nubile, Ali Mohammadzadeh, Biagio Iorio, Walter Di Francesco, Yuanhang Cao, Luca De Santis, Fumin Gu
  • Patent number: 11734007
    Abstract: A system parses a very long instruction word (VLIW) to obtain an execution parameter. The system obtains a first sliding window width count, a first sliding window height count, a first feature map width count, and a first feature map height count that correspond to first target data. In accordance with a determination that the first sliding window width count falls within the sliding window width range, the first sliding window height count falls within the sliding window height range, (the first feature map width count falls within the feature map width range, and the first feature map height count falls within the feature map height range, the system determines an offset of the first target data. The system also obtains a starting address of the first target data, and adds the starting address to the offset to obtain a first target address of the first target data.
    Type: Grant
    Filed: April 26, 2022
    Date of Patent: August 22, 2023
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xiaoyu Yu, Dewei Chen, Heng Zhang, Yan Xiong, Jianlin Gao
  • Patent number: 11687346
    Abstract: The present invention relates to a processor having a trace cache and a plurality of ALUs arranged in a matrix, comprising an analyser unit located between the trace cache and the ALUs, wherein the analyser unit analyses the code in the trace cache, detects loops, transforms the code, and issues to the ALUs sections of the code combined to blocks for joint execution for a plurality of clock cycles.
    Type: Grant
    Filed: October 14, 2020
    Date of Patent: June 27, 2023
    Assignee: Hyperion Core, Inc.
    Inventor: Martin Vorbach
  • Patent number: 11567554
    Abstract: A pipeline includes a first portion configured to process a first subset of bits of an instruction and a second portion configured to process a second subset of the bits of the instruction. A first clock mesh is configured to provide a first clock signal to the first portion of the pipeline. A second clock mesh is configured to provide a second clock signal to the second portion of the pipeline. The first and second clock meshes selectively provide the first and second clock signals based on characteristics of in-flight instructions that have been dispatched to the pipeline but not yet retired. In some cases, a physical register file is configured to store values of bits representative of instructions. Only the first subset is stored in the physical register file in response to the value of the zero high bit indicating that the second subset is equal to zero.
    Type: Grant
    Filed: December 11, 2017
    Date of Patent: January 31, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jay Fleischman, Michael Estlick, Michael Christopher Sedmak, Erik Swanson, Sneha V. Desai
  • Patent number: 11507379
    Abstract: A front-end portion of a pipeline includes a stage that speculatively issues at least some instructions out-of-order. A back-end portion of the pipeline includes one or more stages that access a processor memory system. In the front-end (back-end), execution of instructions is managed based on information available in the front-end (back-end). Managing execution of a first memory barrier instruction includes preventing speculative out-of-order issuance of store instructions. The back-end control circuitry provides information accessible to the front-end control circuitry indicating that one or more particular memory instructions have completed handling by the processor memory system.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: November 22, 2022
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Shubhendu Sekhar Mukherjee, Michael Bertone, David Albert Carlson
  • Patent number: 11416255
    Abstract: An instruction execution method suitable for being executed by a processor is provided. The first processor comprises a register alias table (RAT) and a reservation station. The instruction execution method includes: a register alias table receives a first micro-instruction and a second micro-instruction and issues the first micro-instruction and the second micro-instruction to the reservation station; and the reservation station assigns one of a plurality of execution units to execute the first micro-instruction, according to the first specific message of the first micro-instruction; and the reservation station assigns one of the execution units to execute the second micro-instruction, according to the second specific message of the second micro-instruction.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: August 16, 2022
    Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
    Inventors: Penghao Zou, Chen-Chen Song, Kang-Kang Zhang, Jianbin Wang
  • Patent number: 11314516
    Abstract: Systems and methods of selecting a collection of compatible issue-ready instructions for parallel execution by functional units in a superscalar processor in a single clock cycle. All possible instructions (opcodes) to be executed by the functional units are pre-arranged into several scenarios based on potential resource conflicts among the instructions. Each scenario includes multiple groups of predefined instructions. During operation, concurrently for all the groups, an issue-ready instruction is identified with reference to each group based on group-specific selection policies. Further, based on the identified instructions, predefined policies are applied to select one or more scenarios and select among the picks of the selected scenarios. As a result, the output instructions of the selected scenarios are issued for parallel execution by the functional units.
    Type: Grant
    Filed: January 19, 2018
    Date of Patent: April 26, 2022
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: David Carlson
  • Patent number: 11309894
    Abstract: A programmable integrated circuit (“PIC”) device includes configurable logic blocks (“LBs”), routing connections, and configuration memory for performing user defined programmed logic functions. Each configurable LB, in one example, includes a set of lookup tables (“LUTs”) and associated registers. The LUTs, for example, are configured to generate one or more output signals in accordance with a set of input signals. The registers are arranged so that each register corresponds to one LUT. In one embodiment, a group of registers, instead of assigning to a group of LUTs across multiple configurable LBs, is allocated or configured as embedded signature registers in PSD. For example, a first register which corresponds or physically situated in the vicinity of first LUT can be designated as an embedded signature register for storing a fixed value or signature information for facilitating device or IC identification.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: April 19, 2022
    Assignee: GOWIN SEMICONDUCTOR CORPORATION
    Inventor: Jinghui Zhu
  • Patent number: 11243905
    Abstract: A data path block circuit is disclosed. The data path block circuit includes a data path circuit having logic circuits, each configured to perform a data path operation to generate a result based on first and second operands. The data path block circuit also includes a first operand multiplexer, having inputs, each connected to one of a first register file, including a quantity of read and write ports, and a second register file, including a different quantity of read and write ports. The data path block circuit also includes a second operand multiplexer, having inputs, each connected to one of the first register file and the second register file. At least one of the first and second operand multiplexers includes a data input connected to the first register file. At least one of the first and second operand multiplexers includes a data input connected to the second register file.
    Type: Grant
    Filed: July 28, 2020
    Date of Patent: February 8, 2022
    Assignee: SHENZHEN GOODIX TECHNOLOGY CO., LTD.
    Inventor: Jaehoon Heo
  • Patent number: 11243778
    Abstract: The present disclosure relates to instruction dispatch mechanisms for superscalar processors having a plurality of functional units for executing operations simultaneously. Each particular functional unit of the plurality of functional units may be configured to output a capability vector indicating a set of operations that the particular functional unit is currently available to perform. As instructions are received in an issue queue, the functional unit to execute the instruction is selected by comparing capabilities required by the instruction to the asserted capabilities of each of the functional units. A functional unit may reset or de-assert a particular functionality while performing an operation and then re-assert the capability when the instruction is completed. A result of the operation may be stored in a skid buffer for at least as long as the chain execution time in order to avoid resource hazards are a write port of the vector register file.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: February 8, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Skand Hurkat, Jeremy Fowers
  • Patent number: 11175915
    Abstract: Systems and methods related to implementing vector registers in memory. A memory system for implementing vector registers in memory can include an array of memory cells, where a plurality of rows in the array serve as a plurality of vector registers as defined by an instruction set architecture. The memory system for implementing vector registers in memory can also include a processing resource configured to, responsive to receiving a command to perform a particular vector operation on a particular vector register, access a particular row of the array serving as the particular register to perform the vector operation.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: November 16, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Timothy P Finkbeiner, Troy D. Larsen
  • Patent number: 11169807
    Abstract: A processor comprising a processor pipeline comprising one or more execution units configured to execute branch instructions, a branch predictor associated with the processor pipeline and configured to predict a branch instruction outcome, a branch classification unit associated with the processor pipeline and the branch prediction unit. The branch classification unit is configured to, in response to detecting a branch instruction, classify the branch instruction as at least one of the following: a simple branch or a hard-to-predict (HTP) branch, wherein a threshold used for the classification is dynamically adjusted based on a workload of the processor.
    Type: Grant
    Filed: February 11, 2020
    Date of Patent: November 9, 2021
    Assignee: International Business Machines Corporation
    Inventors: Puneeth A. H. Bhat, Satish Kumar Sadasivam, Shruti Saxena
  • Patent number: 11163568
    Abstract: An approach is provided in which a system writes a set of data into a register file entry that includes a first memory array and a second memory array. The register file entry also includes a set of first write ports corresponding to the first memory array and a set of second write ports corresponding to the second memory array. The system configures a selection bit based on determining that a selected one of the set of first write ports is utilized to store the set of data in the first memory array. In turn, the system reads the set of data out of the first memory array based on the configured selection bit.
    Type: Grant
    Filed: September 6, 2018
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Saiful Islam, Sam G. Chu, Dung Q. Nguyen, Binglong Zhang, Howard Levy, David R. Terry, Steven J. Battle
  • Patent number: 11144364
    Abstract: Recovering microprocessor logical register values by: partitioning a register mapper by logical register type; providing a plurality of recovery ports; assigning a logical register type to a recovery port; receiving a restore required instruction; and mapping SRB (save and restore buffer) values to the register mapper by logical register type.
    Type: Grant
    Filed: January 25, 2019
    Date of Patent: October 12, 2021
    Assignee: International Business Machines Corporation
    Inventors: Steven J. Battle, Brandon R. Goddard, Dung Q. Nguyen, Joshua W. Bowman, Brian D. Barrick, Susan E. Eisen, David S. Walder, Cliff Kucharski
  • Patent number: 11093246
    Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor, a register file associated with the at least one processor, the register file having a plurality of entries for storing data and sliced into a plurality of register banks, each register bank having a portion of the plurality of entries for storing data, one or more write ports to write data to the register file entries, and a plurality of read ports to read data from the register file entries; one or more read multiplexors associated with one or more read ports of each register bank and configured to receive data from the respective register banks; and one or more write multiplexors associated with one or more of the register banks.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: August 17, 2021
    Assignee: International Business Machines Corporation
    Inventors: Maarten J. Boersma, Niels Fricke, Michael Klaus Kroener, Hung Q. Le, Dung Q. Nguyen, Brian W. Thompto
  • Patent number: 11093249
    Abstract: In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: August 17, 2021
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Brett S. Feero, David Williamson, Ian D. Kountanis, Shih-Chieh Wen
  • Patent number: 11055000
    Abstract: The present disclosure includes apparatuses and methods for counter update operations. An example apparatus comprises a memory including a managed unit that includes a plurality of first groups of memory cells and a second group of memory cells, in which respective counters associated with the managed unit are stored on the second group of memory cells. The example apparatus further includes a controller. The controller includes a core configured to route a memory operation request received from a host and a datapath coupled to the core and the memory. The datapath may be configured to issue, responsive to a receipt of the memory operation request routed from the core, a plurality of commands associated with the routed memory operation request to the memory to perform corresponding memory operations on the plurality of first groups of memory cells. The respective counters may be updated independently of the plurality of commands.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: July 6, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Robert N. Hasbun, Daniele Balluchi
  • Patent number: 11042380
    Abstract: A plurality of instructions to be executed in an order of being issued without an appointment of a waiting time or a starting moment are designed to be executed after a certain waiting time; instructions to be executed in an order of being issued without designation of starting moment or waiting time are provided with starting moment or waiting time information so that the instructions can be executed in an order designated by the time information.
    Type: Grant
    Filed: March 2, 2009
    Date of Patent: June 22, 2021
    Assignee: KAWAI MUSICAL INSTRUMENTS MFG. CO., LTD.
    Inventor: Yasushi Sato
  • Patent number: 11003454
    Abstract: Apparatuses for data processing and methods of data processing are provided. A data processing apparatus performs data processing operations in response to a sequence of instructions including performing speculative execution of at least some of the sequence of instructions. In response to a branch instruction the data processing apparatus predicts whether or not the branch is taken or not taken further speculative instruction execution is based on that prediction. A path speculation cost is calculated in dependence on a number of recently flushed instructions and a rate at which speculatively executed instructions are issued may be modified based on the path speculation cost.
    Type: Grant
    Filed: July 17, 2019
    Date of Patent: May 11, 2021
    Assignee: Arm Limited
    Inventors: Michael Brian Schinzler, Michael Filippo, Yasuo Ishii
  • Patent number: 10956167
    Abstract: An instruction fusion system in which instructions are tagged with extra bits to specify the conditions by which the instructions can be fused is provided. A computing device receives a first instruction to be executed at a processor. The computing device receives a first fusion tag that corresponds to the first instruction, the first fusion tag specifying a condition for fusing the first instruction with another instruction. The computing device determines whether the first instruction is allowed to fuse with a second instruction based on the first fusion tag. When the first instruction is allowed to fuse with the second instruction, the computing device generates a fused instruction based on the first instruction and the second instruction. The computing device executes the fused instruction at the processor.
    Type: Grant
    Filed: June 6, 2019
    Date of Patent: March 23, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jessica Hui-Chun Tseng, Manoj Kumar, Kattamuri Ekanadham, Jose E. Moreira, Pratap C. Pattnaik
  • Patent number: 10929535
    Abstract: The present disclosure is directed to systems and methods for mitigating or eliminating the effectiveness of a side channel attack, such as a Meltdown or Spectre type attack by selectively introducing a variable, but controlled, quantity of uncertainty into the externally accessible system parameters visible and useful to the attacker. The systems and methods described herein provide perturbation circuitry that includes perturbation selector circuitry and perturbation block circuitry. The perturbation selector circuitry detects a potential attack by monitoring the performance/timing data generated by the processor. Upon detecting an attack, the perturbation selector circuitry determines a variable quantity of uncertainty to introduce to the externally accessible system data. The perturbation block circuitry adds the determined uncertainty into the externally accessible system data. The added uncertainty may be based on the frequency or interval of the event occurrences indicative of an attack.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: February 23, 2021
    Assignee: Intel Corporation
    Inventors: Vadim Sukhomlinov, Kshitij Doshi, Francesc Guim, Alex Nayshtut
  • Patent number: 10915327
    Abstract: Aspects of the present disclosure relate to an apparatus comprising a plurality of clusters, each cluster having a plurality of execution units to execute instructions. The apparatus comprises dispatch circuitry to determine, for each instruction to be executed, a chosen cluster from amongst the plurality of clusters to which to dispatch that instruction for execution. This determination is performed by selecting between a default dispatch policy wherein said chosen cluster is a cluster to which an earlier instruction to generate at least one source operand of said instruction was dispatched for execution, and an alternative dispatch policy for selecting said chosen cluster. Said selecting is based on a selection parameter. The dispatch circuitry is further configured to dispatch said instruction to the chosen cluster for execution.
    Type: Grant
    Filed: December 14, 2018
    Date of Patent: February 9, 2021
    Assignee: Arm Limited
    Inventors: Luca Nassi, Remi Marius Teyssier, François Donati, Damian Maiorano
  • Patent number: 10901745
    Abstract: A processor unit for processing storage instructions. The processor unit comprises a detection logic unit configured to identify at least two storage instructions for moving addressable words between registers of the processor unit and neighboring storage locations. The processor unit further comprises a combination unit configured to combine the identified instructions into a single combined instruction; and a data movement unit configured to move the words using the combined instruction.
    Type: Grant
    Filed: July 10, 2018
    Date of Patent: January 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Cedric Lichtenau, Peter Altevogt, Thomas Pflueger
  • Patent number: 10768939
    Abstract: A load/store unit for a processor, and applications thereof. In an embodiment, the load/store unit includes a load/store queue configured to store information and data associated with a particular class of instructions. Data stored in the load/store queue can be bypassed to dependent instructions. When an instruction belonging to The particular class of instructions graduates and the instruction is associated with a cache miss, control logic causes a pointer to be stored in a load/store graduation buffer that points to an entry in the load/store queue associated with the instruction. The load/store graduation buffer ensures that graduated instructions access a shared resource of the load/store unit in program order.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: September 8, 2020
    Assignee: ARM Finance Overseas Limited
    Inventors: Meng-Bing Yu, Era K. Nangia, Michael Ni
  • Patent number: 10725783
    Abstract: According to one or more embodiments, an example computer-implemented method for executing one or more out-of-order instructions by a processing unit, includes decoding an instruction to be executed, and based on a determination that the instruction is a store instruction, identifying a split load-hit-store (LHS) table for the store instruction, wherein a LHS table of the processing unit includes multiple split LHS tables. Identifying the split LHS table includes determining, for the store instruction, a first split LHS table by performing a mod operation using one or more operands from the store instruction, and adding one or more parameters of the store instruction in the first split LHS table by generating an ITAG for the store instruction. The method further includes dispatching the store instruction for execution to an issue queue with the ITAG.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: July 28, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ehsan Fatehi, Richard J. Eickemeyer, Edmund J. Gieske
  • Patent number: 10691457
    Abstract: An example apparatus includes a plurality of execution units, a physical register file that includes a plurality of physical registers, an instruction buffer, and a scheduling circuit. The instruction buffer may receive a group of instructions to be performed by the plurality of execution units. The scheduling circuit may allocate a physical register of the plurality of physical registers in the physical register file to store an operand of a particular instruction of the group of instructions. The scheduling circuit may also, in response to determining that a result of the particular instruction is used as an operand for a different instruction of the group of instructions, assign a tag to the particular instruction and to the different instruction to indicate that the result of the particular instruction will be sent to the different instruction without using the physical register file.
    Type: Grant
    Filed: December 13, 2017
    Date of Patent: June 23, 2020
    Assignee: Apple Inc.
    Inventors: Ian Kountanis, Muawya Al-Otoom
  • Patent number: 10621082
    Abstract: An information processing apparatus includes a receiving unit that receives data from the outside, a first memory space to which data is written from the receiving unit, a second memory space to which a flag for synchronization is written, and an arithmetic unit. The arithmetic unit includes a synchronization control unit that instructs the receiving unit to synchronize the first memory space and the second memory space. The receiving unit includes a synchronization command issuing unit that issues a synchronization command to the first memory space and the second memory space, and a synchronization command receiving unit that receives a response indicating that data writing is guaranteed from the first memory space and a response indicating that flag writing is guaranteed from the second memory space, and responds to the arithmetic unit that synchronization is completed when writing to the first memory space and the second memory space is guaranteed.
    Type: Grant
    Filed: January 11, 2018
    Date of Patent: April 14, 2020
    Assignee: NEC CORPORATION
    Inventor: Eiichiro Kawaguchi
  • Patent number: 10572264
    Abstract: Aspects of the invention include detecting, in an out-of-order (OoO) processor, that all instructions in a first group of in-flight instructions have a status of finished. The first group of in-flight instructions is the oldest group in an entry of a global completion table (GCT). It is determined that the entry in the GCT is a merged entry that is associated with both the first group of in-flight instructions and a second group of in-flight instructions dispatched immediately subsequent to the first group of in-flight instructions. The first group of in-flight instructions and the second group of in-flight instructions are completed in a single processor cycle. The completing is based at least in part on detecting that all instructions in the first group of in-flight instructions have a status of finished. The completing includes requesting release of resources utilized by both the first and second groups of in-flight instructions.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: February 25, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joel A. Silberman, Balaram Sinharoy
  • Patent number: 10564979
    Abstract: Aspects of the invention include detecting that all instructions in a first group of in-flight instructions have a status of finished. The first group of in-flight instructions is associated with a first allocated entry in a global completion table (GCT) which tracks a dispatch order and status of groups of in-flight instructions. The GCT includes a plurality of allocated entries including the first allocated entry and a second allocated entry. A second group of in-flight instructions dispatched immediately prior to the first group is associated with the second allocated entry in the GCT. Based at least in part on the detecting, the first allocated entry is merged into the second allocated entry to create a single merged second allocated entry in the GCT that includes completion information for both the first group of in-flight instructions and the second group of in-flight instructions. The first allocated entry is then deallocated.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: February 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joel A. Silberman, Balaram Sinharoy
  • Patent number: 10552162
    Abstract: Variable latency flush filtering including receiving a first flush instruction tag (ITAG) and a second flush ITAG, wherein the first flush ITAG and the second flush ITAG are instructions to invalidate internal operation results after an internal operation identified by the first flush ITAG and the second flush ITAG; determining that the second flush ITAG is before the first flush ITAG by comparing the first flush ITAG and the second flush ITAG; determining that the first flush ITAG requires adjustment; and delaying the flush to a subsequent cycle in response to determining that the second flush ITAG is before the first flush ITAG and determining that the first flush ITAG requires adjustment.
    Type: Grant
    Filed: January 22, 2018
    Date of Patent: February 4, 2020
    Assignee: International Business Machines Corporation
    Inventors: Glenn O. Kincaid, David S. Levitan, Albert J. Van Norstrand, Jr.
  • Patent number: 10552165
    Abstract: Within a processor, speculative finishes of load instructions only are tracked in a speculative finish table by maintaining an oldest load instruction of a thread in the speculative finish table after data is loaded for the oldest load instruction, wherein a particular queue index tag assigned to the oldest load instruction by an execution unit points to a particular entry in the speculative finish table, wherein the oldest load instruction is waiting to be finished dependent upon an error check code result. Responsive to a flow unit receiving the particular queue index tag with an indicator that the error check code result for data retrieved for the oldest load instruction is good, finishing the oldest load instruction in the particular entry pointed to by the queue index tag and writing an instruction tag stored in the entry for the oldest load instruction out of the speculative finish table for completion.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: February 4, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Susan E. Eisen, David A. Hrusecky, Christopher M. Mueller, Dung Q. Nguyen, A. James Van Norstrand, Jr., Kenneth L. Ward
  • Patent number: 10534608
    Abstract: A central processing unit system includes: a pipeline configured to receive an instruction; and a register file partitioned into one or more subarrays where (i) the register file includes one or more computation elements and (ii) the one or more computation elements are directly connected to one or more subarrays.
    Type: Grant
    Filed: August 17, 2011
    Date of Patent: January 14, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Jeffrey Haskell Derby, Michele Martino Franceschini, Robert Kevin Montoye, Augusto J. Vega
  • Patent number: 10452434
    Abstract: Systems, apparatuses, and methods for efficiently scheduling processor instructions for execution. The reservation station in a processor stores instructions in each of a primary buffer and a secondary buffer. Control logic selects a first number of instructions with ready source operands in the primary buffer and a second number of instructions with ready source operands in the secondary buffer. If a third number of instructions to issue from the reservation station is greater than the first number of instructions, then the reservation station issues one or more instructions of the second number of instructions from the secondary buffer to the one or more execution units. Control logic selects a fourth number of instructions in the secondary buffer to transfer to the primary buffer, and cancels the transfer of a given instruction in response to determining the given instruction has issued to the one or more execution units.
    Type: Grant
    Filed: September 11, 2017
    Date of Patent: October 22, 2019
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Sean M. Reynolds
  • Patent number: 10452564
    Abstract: Format preserving encryption of object code is disclosed. One example is a system including at least one processor and a memory storing instructions executable by the at least one processor to identify object code to be secured, where the object code comprises a list of instructions, each instruction comprising an opcode and zero or more parameters. A format preserving encryption (FPE) is applied to the received object code, where the FPE is applied separately to a sub-plurality of instructions in the list of instructions, to generate an encrypted object code comprising a sub-plurality of encrypted instructions. An encrypted object code is provided to a service provider, where the encrypted object code comprises the sub-plurality of encrypted instructions, and any unencrypted portions of the object code.
    Type: Grant
    Filed: April 25, 2017
    Date of Patent: October 22, 2019
    Assignee: ENTIT SOFTWARE LLC
    Inventors: Luther Martin, Timothy Roake
  • Patent number: 10423423
    Abstract: Within a processor, speculative finishes of load instructions only are tracked in a speculative finish table by maintaining an oldest load instruction of a thread in the speculative finish table after data is loaded for the oldest load instruction, wherein a particular queue index tag assigned to the oldest load instruction by an execution unit points to a particular entry in the speculative finish table, wherein the oldest load instruction is waiting to be finished dependent upon an error check code result. Responsive to a flow unit receiving the particular queue index tag with an indicator that the error check code result for data retrieved for the oldest load instruction is good, finishing the oldest load instruction in the particular entry pointed to by the queue index tag and writing an instruction tag stored in the entry for the oldest load instruction out of the speculative finish table for completion.
    Type: Grant
    Filed: September 29, 2015
    Date of Patent: September 24, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Susan E. Eisen, David A. Hrusecky, Christopher M. Mueller, Dung Q. Nguyen, A. James Van Norstrand, Jr., Kenneth L. Ward
  • Patent number: 10379954
    Abstract: The present invention provides a method and an apparatus for cache management of transaction processing in persistent memory.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: August 13, 2019
    Assignee: TSINGHUA UNIVERSITY
    Inventors: Jiwu Shu, Youyou Lu
  • Patent number: 10318260
    Abstract: A method and system of compiling and linking source stream programs for efficient use of multi-node devices. The system includes a compiler, a linker, a loader and a runtime component. The process converts a source code stream program to a compiled object code that is used with a programmable node based computing device having a plurality of processing nodes coupled to each other. The programming modules include stream statements for input values and output values in the form of sources and destinations for at least one of the plurality of processing nodes and stream statements that determine the streaming flow of values for the at least one of the plurality of processing nodes. The compiler converts the source code stream based program to object modules, object module instances and executables. The linker matches the object module instances to at least one of the multiple cores.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: June 11, 2019
    Assignee: Cornami, Inc.
    Inventors: Frederick Furtek, Paul Master
  • Patent number: 10291391
    Abstract: A method to protect computational, in particular cryptographic, devices having multi-core processors from DPA and DFA attacks is disclosed herein. The method implies: Defining a library of execution units functionally grouped into business function related units, security function related units and scheduler function related units; Designating at random one among the plurality of processing cores on the computational device to as a master core for execution of the scheduler function related execution units; and Causing, under control of the scheduler, execution of the library of execution units, so as to result in a randomized execution flow capable of resisting security threats initiated on the computational device.
    Type: Grant
    Filed: June 4, 2014
    Date of Patent: May 14, 2019
    Assignee: GIESECKE+DEVRIENT MOBILE SECURITY GMBH
    Inventors: Sai Yanamandra, Vineet Kulkarni, Shrikanthrao Kulkarni
  • Patent number: 10180856
    Abstract: A method for performing dynamic port remapping during instruction scheduling in an out of order microprocessor is disclosed. The method comprises selecting and dispatching a plurality of instructions from a plurality of select ports in a scheduler module in first clock cycle. Next, it comprises determining if a first physical register file unit has capacity to support instructions dispatched in the first clock cycle. Further, it comprises supplying a response back to logic circuitry between the plurality of select ports and a plurality of execution ports, wherein the logic circuitry is operable to re-map select ports in the scheduler module to execution ports based on the response. Finally, responsive to a determination that the first physical register file unit is full, the method comprises re-mapping at least one select port connecting with an execution unit in the first physical register file unit to a second physical register file unit.
    Type: Grant
    Filed: July 25, 2016
    Date of Patent: January 15, 2019
    Assignee: INTEL CORPORATION
    Inventor: Nelson N. Chan
  • Patent number: 10108423
    Abstract: An approach is provided in which a mapper control unit matches a result instruction tag corresponding to an executed instruction to a history buffer entry's instruction tag. The matched history buffer entry includes multiple history buffer field sets that each include a field set state indicator. The mapper control unit identifies a subset of the history buffer field sets having a valid field set state indicator and stores result data corresponding to the result instruction tag in the identified subset of history buffer field sets. In turn, the mapper control unit restores a subset of a register's fields utilizing content from the subset of history buffer field sets.
    Type: Grant
    Filed: March 25, 2015
    Date of Patent: October 23, 2018
    Assignee: International Business Machines Corporation
    Inventors: Michael J. Genden, Dung Q. Nguyen, Kenneth L. Ward
  • Patent number: 10067788
    Abstract: A computing system can provide user interfaces and back-end operations to facilitate review and invalidation of executed jobs. The system can provide an interface that allows the operator to review quality-control information about a completed job. Once the operator identifies a job as invalid, the operator can be presented with further options, such as whether to invalidate only the reviewed job or the job and all its descendants. The operator can also review antecedent jobs to an invalid job (e.g., in order to trace the root of the problem) and can selectively invalidate antecedent jobs.
    Type: Grant
    Filed: September 2, 2016
    Date of Patent: September 4, 2018
    Assignee: Dropbox, Inc.
    Inventors: Shaunak Kishore, Karl Dray
  • Patent number: 10007522
    Abstract: A branch instruction and a corresponding branch instruction address are received at a data processing system. A first value is received and is compared to a portion of the branch instruction address. An entry at a branch target buffer corresponding to the branch instruction is selectively allocated based on a result of the comparing.
    Type: Grant
    Filed: May 20, 2014
    Date of Patent: June 26, 2018
    Assignee: NXP USA, Inc.
    Inventors: Jeffrey W. Scott, William C. Moyer
  • Patent number: 9977674
    Abstract: Disclosed are an apparatus, system, and method for implementing predicated instructions using micro-operations. A micro-code engine receives an instruction, decomposes the instruction, and generates a plurality of micro-operations to implement the instruction. Each of the decomposed micro-operations indicates a single destination register. For predicated instructions, the decomposed micro-operations include “conditional move” micro-operations to select between two potential output values. Except in the case that one of the potential output values is a constant, the decomposed micro-operations for a predicated instruction also include an append instruction that saves the incoming value of a destination register in a temporary variable. For at least one embodiment, the qualifying predicate for a predicated instruction is appended to the incoming value stored in the temporary register.
    Type: Grant
    Filed: October 14, 2003
    Date of Patent: May 22, 2018
    Assignee: Intel Corporation
    Inventors: Jeffrey P. Rupley, II, Edward A. Brekelbaum, Edward T. Grochowski, Bryan P. Black
  • Patent number: 9977596
    Abstract: The speed at which files can be accessed from a remote location is increased by predicting the file access pattern based on a predictive model. The file access pattern describes the order in which blocks of data for a given file type are read by a given application. From aggregated data across many file accesses, one or more predictive models of access patterns can be built. A predictive model takes as input the application requesting the file access and the file type being requested, and outputs information describing an order of data blocks for transmitting the file to the requesting application. Accordingly, when a server receives a request for a file from an application, the server uses the predictive model to determine the order that the application is most likely to use the data blocks of the file. The data is then transmitted in that order to the client device.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: May 22, 2018
    Assignee: Dropbox, Inc.
    Inventors: Rian Hunter, Jeffrey Bartelma
  • Patent number: 9965274
    Abstract: A computer processor is provided with a plurality of functional units that performs operations specified by the at least one instruction over the multiple machine cycles, wherein the operations produce result operands. The processor also includes circuitry that generates result tags dynamically according to the number of operations that produce result operands in a given machine cycle. A bypass network is configured to provide data paths for transfer of operand data between the plurality of functional units according to the result tags.
    Type: Grant
    Filed: October 15, 2014
    Date of Patent: May 8, 2018
    Assignee: Mill Computing, Inc.
    Inventors: Arthur David Kahlich, Roger Rawson Godard
  • Patent number: 9952870
    Abstract: An apparatus and method for filtering biased conditional branches in a branch predictor in favor of non-biased conditional branches are disclosed. Biased conditional branches, which are consistently skewed toward one direction or outcome, are filtered such that an increased number of non-biased conditional branches which resolve in both directions may be considered. As a result, more useful branches may be captured over larger distances, thereby providing correlations deeper in a global history. In addition, by tracking only the latest occurrences of non-biased conditional branches using a recency stack structure, even more distant branch correlations may be made.
    Type: Grant
    Filed: June 13, 2014
    Date of Patent: April 24, 2018
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Mikko Lipasti, Dibakar Gope