Superscalar Patents (Class 712/23)
  • Patent number: 11144364
    Abstract: Recovering microprocessor logical register values by: partitioning a register mapper by logical register type; providing a plurality of recovery ports; assigning a logical register type to a recovery port; receiving a restore required instruction; and mapping SRB (save and restore buffer) values to the register mapper by logical register type.
    Type: Grant
    Filed: January 25, 2019
    Date of Patent: October 12, 2021
    Assignee: International Business Machines Corporation
    Inventors: Steven J. Battle, Brandon R. Goddard, Dung Q. Nguyen, Joshua W. Bowman, Brian D. Barrick, Susan E. Eisen, David S. Walder, Cliff Kucharski
  • Patent number: 11093246
    Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor, a register file associated with the at least one processor, the register file having a plurality of entries for storing data and sliced into a plurality of register banks, each register bank having a portion of the plurality of entries for storing data, one or more write ports to write data to the register file entries, and a plurality of read ports to read data from the register file entries; one or more read multiplexors associated with one or more read ports of each register bank and configured to receive data from the respective register banks; and one or more write multiplexors associated with one or more of the register banks.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: August 17, 2021
    Assignee: International Business Machines Corporation
    Inventors: Maarten J. Boersma, Niels Fricke, Michael Klaus Kroener, Hung Q. Le, Dung Q. Nguyen, Brian W. Thompto
  • Patent number: 11093249
    Abstract: In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: August 17, 2021
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Brett S. Feero, David Williamson, Ian D. Kountanis, Shih-Chieh Wen
  • Patent number: 11055000
    Abstract: The present disclosure includes apparatuses and methods for counter update operations. An example apparatus comprises a memory including a managed unit that includes a plurality of first groups of memory cells and a second group of memory cells, in which respective counters associated with the managed unit are stored on the second group of memory cells. The example apparatus further includes a controller. The controller includes a core configured to route a memory operation request received from a host and a datapath coupled to the core and the memory. The datapath may be configured to issue, responsive to a receipt of the memory operation request routed from the core, a plurality of commands associated with the routed memory operation request to the memory to perform corresponding memory operations on the plurality of first groups of memory cells. The respective counters may be updated independently of the plurality of commands.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: July 6, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Robert N. Hasbun, Daniele Balluchi
  • Patent number: 11042380
    Abstract: A plurality of instructions to be executed in an order of being issued without an appointment of a waiting time or a starting moment are designed to be executed after a certain waiting time; instructions to be executed in an order of being issued without designation of starting moment or waiting time are provided with starting moment or waiting time information so that the instructions can be executed in an order designated by the time information.
    Type: Grant
    Filed: March 2, 2009
    Date of Patent: June 22, 2021
    Assignee: KAWAI MUSICAL INSTRUMENTS MFG. CO., LTD.
    Inventor: Yasushi Sato
  • Patent number: 11003454
    Abstract: Apparatuses for data processing and methods of data processing are provided. A data processing apparatus performs data processing operations in response to a sequence of instructions including performing speculative execution of at least some of the sequence of instructions. In response to a branch instruction the data processing apparatus predicts whether or not the branch is taken or not taken further speculative instruction execution is based on that prediction. A path speculation cost is calculated in dependence on a number of recently flushed instructions and a rate at which speculatively executed instructions are issued may be modified based on the path speculation cost.
    Type: Grant
    Filed: July 17, 2019
    Date of Patent: May 11, 2021
    Assignee: Arm Limited
    Inventors: Michael Brian Schinzler, Michael Filippo, Yasuo Ishii
  • Patent number: 10956167
    Abstract: An instruction fusion system in which instructions are tagged with extra bits to specify the conditions by which the instructions can be fused is provided. A computing device receives a first instruction to be executed at a processor. The computing device receives a first fusion tag that corresponds to the first instruction, the first fusion tag specifying a condition for fusing the first instruction with another instruction. The computing device determines whether the first instruction is allowed to fuse with a second instruction based on the first fusion tag. When the first instruction is allowed to fuse with the second instruction, the computing device generates a fused instruction based on the first instruction and the second instruction. The computing device executes the fused instruction at the processor.
    Type: Grant
    Filed: June 6, 2019
    Date of Patent: March 23, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jessica Hui-Chun Tseng, Manoj Kumar, Kattamuri Ekanadham, Jose E. Moreira, Pratap C. Pattnaik
  • Patent number: 10929535
    Abstract: The present disclosure is directed to systems and methods for mitigating or eliminating the effectiveness of a side channel attack, such as a Meltdown or Spectre type attack by selectively introducing a variable, but controlled, quantity of uncertainty into the externally accessible system parameters visible and useful to the attacker. The systems and methods described herein provide perturbation circuitry that includes perturbation selector circuitry and perturbation block circuitry. The perturbation selector circuitry detects a potential attack by monitoring the performance/timing data generated by the processor. Upon detecting an attack, the perturbation selector circuitry determines a variable quantity of uncertainty to introduce to the externally accessible system data. The perturbation block circuitry adds the determined uncertainty into the externally accessible system data. The added uncertainty may be based on the frequency or interval of the event occurrences indicative of an attack.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: February 23, 2021
    Assignee: Intel Corporation
    Inventors: Vadim Sukhomlinov, Kshitij Doshi, Francesc Guim, Alex Nayshtut
  • Patent number: 10915327
    Abstract: Aspects of the present disclosure relate to an apparatus comprising a plurality of clusters, each cluster having a plurality of execution units to execute instructions. The apparatus comprises dispatch circuitry to determine, for each instruction to be executed, a chosen cluster from amongst the plurality of clusters to which to dispatch that instruction for execution. This determination is performed by selecting between a default dispatch policy wherein said chosen cluster is a cluster to which an earlier instruction to generate at least one source operand of said instruction was dispatched for execution, and an alternative dispatch policy for selecting said chosen cluster. Said selecting is based on a selection parameter. The dispatch circuitry is further configured to dispatch said instruction to the chosen cluster for execution.
    Type: Grant
    Filed: December 14, 2018
    Date of Patent: February 9, 2021
    Assignee: Arm Limited
    Inventors: Luca Nassi, Remi Marius Teyssier, François Donati, Damian Maiorano
  • Patent number: 10901745
    Abstract: A processor unit for processing storage instructions. The processor unit comprises a detection logic unit configured to identify at least two storage instructions for moving addressable words between registers of the processor unit and neighboring storage locations. The processor unit further comprises a combination unit configured to combine the identified instructions into a single combined instruction; and a data movement unit configured to move the words using the combined instruction.
    Type: Grant
    Filed: July 10, 2018
    Date of Patent: January 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Cedric Lichtenau, Peter Altevogt, Thomas Pflueger
  • Patent number: 10768939
    Abstract: A load/store unit for a processor, and applications thereof. In an embodiment, the load/store unit includes a load/store queue configured to store information and data associated with a particular class of instructions. Data stored in the load/store queue can be bypassed to dependent instructions. When an instruction belonging to The particular class of instructions graduates and the instruction is associated with a cache miss, control logic causes a pointer to be stored in a load/store graduation buffer that points to an entry in the load/store queue associated with the instruction. The load/store graduation buffer ensures that graduated instructions access a shared resource of the load/store unit in program order.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: September 8, 2020
    Assignee: ARM Finance Overseas Limited
    Inventors: Meng-Bing Yu, Era K. Nangia, Michael Ni
  • Patent number: 10725783
    Abstract: According to one or more embodiments, an example computer-implemented method for executing one or more out-of-order instructions by a processing unit, includes decoding an instruction to be executed, and based on a determination that the instruction is a store instruction, identifying a split load-hit-store (LHS) table for the store instruction, wherein a LHS table of the processing unit includes multiple split LHS tables. Identifying the split LHS table includes determining, for the store instruction, a first split LHS table by performing a mod operation using one or more operands from the store instruction, and adding one or more parameters of the store instruction in the first split LHS table by generating an ITAG for the store instruction. The method further includes dispatching the store instruction for execution to an issue queue with the ITAG.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: July 28, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ehsan Fatehi, Richard J. Eickemeyer, Edmund J. Gieske
  • Patent number: 10691457
    Abstract: An example apparatus includes a plurality of execution units, a physical register file that includes a plurality of physical registers, an instruction buffer, and a scheduling circuit. The instruction buffer may receive a group of instructions to be performed by the plurality of execution units. The scheduling circuit may allocate a physical register of the plurality of physical registers in the physical register file to store an operand of a particular instruction of the group of instructions. The scheduling circuit may also, in response to determining that a result of the particular instruction is used as an operand for a different instruction of the group of instructions, assign a tag to the particular instruction and to the different instruction to indicate that the result of the particular instruction will be sent to the different instruction without using the physical register file.
    Type: Grant
    Filed: December 13, 2017
    Date of Patent: June 23, 2020
    Assignee: Apple Inc.
    Inventors: Ian Kountanis, Muawya Al-Otoom
  • Patent number: 10621082
    Abstract: An information processing apparatus includes a receiving unit that receives data from the outside, a first memory space to which data is written from the receiving unit, a second memory space to which a flag for synchronization is written, and an arithmetic unit. The arithmetic unit includes a synchronization control unit that instructs the receiving unit to synchronize the first memory space and the second memory space. The receiving unit includes a synchronization command issuing unit that issues a synchronization command to the first memory space and the second memory space, and a synchronization command receiving unit that receives a response indicating that data writing is guaranteed from the first memory space and a response indicating that flag writing is guaranteed from the second memory space, and responds to the arithmetic unit that synchronization is completed when writing to the first memory space and the second memory space is guaranteed.
    Type: Grant
    Filed: January 11, 2018
    Date of Patent: April 14, 2020
    Assignee: NEC CORPORATION
    Inventor: Eiichiro Kawaguchi
  • Patent number: 10572264
    Abstract: Aspects of the invention include detecting, in an out-of-order (OoO) processor, that all instructions in a first group of in-flight instructions have a status of finished. The first group of in-flight instructions is the oldest group in an entry of a global completion table (GCT). It is determined that the entry in the GCT is a merged entry that is associated with both the first group of in-flight instructions and a second group of in-flight instructions dispatched immediately subsequent to the first group of in-flight instructions. The first group of in-flight instructions and the second group of in-flight instructions are completed in a single processor cycle. The completing is based at least in part on detecting that all instructions in the first group of in-flight instructions have a status of finished. The completing includes requesting release of resources utilized by both the first and second groups of in-flight instructions.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: February 25, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joel A. Silberman, Balaram Sinharoy
  • Patent number: 10564979
    Abstract: Aspects of the invention include detecting that all instructions in a first group of in-flight instructions have a status of finished. The first group of in-flight instructions is associated with a first allocated entry in a global completion table (GCT) which tracks a dispatch order and status of groups of in-flight instructions. The GCT includes a plurality of allocated entries including the first allocated entry and a second allocated entry. A second group of in-flight instructions dispatched immediately prior to the first group is associated with the second allocated entry in the GCT. Based at least in part on the detecting, the first allocated entry is merged into the second allocated entry to create a single merged second allocated entry in the GCT that includes completion information for both the first group of in-flight instructions and the second group of in-flight instructions. The first allocated entry is then deallocated.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: February 18, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joel A. Silberman, Balaram Sinharoy
  • Patent number: 10552162
    Abstract: Variable latency flush filtering including receiving a first flush instruction tag (ITAG) and a second flush ITAG, wherein the first flush ITAG and the second flush ITAG are instructions to invalidate internal operation results after an internal operation identified by the first flush ITAG and the second flush ITAG; determining that the second flush ITAG is before the first flush ITAG by comparing the first flush ITAG and the second flush ITAG; determining that the first flush ITAG requires adjustment; and delaying the flush to a subsequent cycle in response to determining that the second flush ITAG is before the first flush ITAG and determining that the first flush ITAG requires adjustment.
    Type: Grant
    Filed: January 22, 2018
    Date of Patent: February 4, 2020
    Assignee: International Business Machines Corporation
    Inventors: Glenn O. Kincaid, David S. Levitan, Albert J. Van Norstrand, Jr.
  • Patent number: 10552165
    Abstract: Within a processor, speculative finishes of load instructions only are tracked in a speculative finish table by maintaining an oldest load instruction of a thread in the speculative finish table after data is loaded for the oldest load instruction, wherein a particular queue index tag assigned to the oldest load instruction by an execution unit points to a particular entry in the speculative finish table, wherein the oldest load instruction is waiting to be finished dependent upon an error check code result. Responsive to a flow unit receiving the particular queue index tag with an indicator that the error check code result for data retrieved for the oldest load instruction is good, finishing the oldest load instruction in the particular entry pointed to by the queue index tag and writing an instruction tag stored in the entry for the oldest load instruction out of the speculative finish table for completion.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: February 4, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Susan E. Eisen, David A. Hrusecky, Christopher M. Mueller, Dung Q. Nguyen, A. James Van Norstrand, Jr., Kenneth L. Ward
  • Patent number: 10534608
    Abstract: A central processing unit system includes: a pipeline configured to receive an instruction; and a register file partitioned into one or more subarrays where (i) the register file includes one or more computation elements and (ii) the one or more computation elements are directly connected to one or more subarrays.
    Type: Grant
    Filed: August 17, 2011
    Date of Patent: January 14, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Jeffrey Haskell Derby, Michele Martino Franceschini, Robert Kevin Montoye, Augusto J. Vega
  • Patent number: 10452564
    Abstract: Format preserving encryption of object code is disclosed. One example is a system including at least one processor and a memory storing instructions executable by the at least one processor to identify object code to be secured, where the object code comprises a list of instructions, each instruction comprising an opcode and zero or more parameters. A format preserving encryption (FPE) is applied to the received object code, where the FPE is applied separately to a sub-plurality of instructions in the list of instructions, to generate an encrypted object code comprising a sub-plurality of encrypted instructions. An encrypted object code is provided to a service provider, where the encrypted object code comprises the sub-plurality of encrypted instructions, and any unencrypted portions of the object code.
    Type: Grant
    Filed: April 25, 2017
    Date of Patent: October 22, 2019
    Assignee: ENTIT SOFTWARE LLC
    Inventors: Luther Martin, Timothy Roake
  • Patent number: 10452434
    Abstract: Systems, apparatuses, and methods for efficiently scheduling processor instructions for execution. The reservation station in a processor stores instructions in each of a primary buffer and a secondary buffer. Control logic selects a first number of instructions with ready source operands in the primary buffer and a second number of instructions with ready source operands in the secondary buffer. If a third number of instructions to issue from the reservation station is greater than the first number of instructions, then the reservation station issues one or more instructions of the second number of instructions from the secondary buffer to the one or more execution units. Control logic selects a fourth number of instructions in the secondary buffer to transfer to the primary buffer, and cancels the transfer of a given instruction in response to determining the given instruction has issued to the one or more execution units.
    Type: Grant
    Filed: September 11, 2017
    Date of Patent: October 22, 2019
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Sean M. Reynolds
  • Patent number: 10423423
    Abstract: Within a processor, speculative finishes of load instructions only are tracked in a speculative finish table by maintaining an oldest load instruction of a thread in the speculative finish table after data is loaded for the oldest load instruction, wherein a particular queue index tag assigned to the oldest load instruction by an execution unit points to a particular entry in the speculative finish table, wherein the oldest load instruction is waiting to be finished dependent upon an error check code result. Responsive to a flow unit receiving the particular queue index tag with an indicator that the error check code result for data retrieved for the oldest load instruction is good, finishing the oldest load instruction in the particular entry pointed to by the queue index tag and writing an instruction tag stored in the entry for the oldest load instruction out of the speculative finish table for completion.
    Type: Grant
    Filed: September 29, 2015
    Date of Patent: September 24, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Susan E. Eisen, David A. Hrusecky, Christopher M. Mueller, Dung Q. Nguyen, A. James Van Norstrand, Jr., Kenneth L. Ward
  • Patent number: 10379954
    Abstract: The present invention provides a method and an apparatus for cache management of transaction processing in persistent memory.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: August 13, 2019
    Assignee: TSINGHUA UNIVERSITY
    Inventors: Jiwu Shu, Youyou Lu
  • Patent number: 10318260
    Abstract: A method and system of compiling and linking source stream programs for efficient use of multi-node devices. The system includes a compiler, a linker, a loader and a runtime component. The process converts a source code stream program to a compiled object code that is used with a programmable node based computing device having a plurality of processing nodes coupled to each other. The programming modules include stream statements for input values and output values in the form of sources and destinations for at least one of the plurality of processing nodes and stream statements that determine the streaming flow of values for the at least one of the plurality of processing nodes. The compiler converts the source code stream based program to object modules, object module instances and executables. The linker matches the object module instances to at least one of the multiple cores.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: June 11, 2019
    Assignee: Cornami, Inc.
    Inventors: Frederick Furtek, Paul Master
  • Patent number: 10291391
    Abstract: A method to protect computational, in particular cryptographic, devices having multi-core processors from DPA and DFA attacks is disclosed herein. The method implies: Defining a library of execution units functionally grouped into business function related units, security function related units and scheduler function related units; Designating at random one among the plurality of processing cores on the computational device to as a master core for execution of the scheduler function related execution units; and Causing, under control of the scheduler, execution of the library of execution units, so as to result in a randomized execution flow capable of resisting security threats initiated on the computational device.
    Type: Grant
    Filed: June 4, 2014
    Date of Patent: May 14, 2019
    Assignee: GIESECKE+DEVRIENT MOBILE SECURITY GMBH
    Inventors: Sai Yanamandra, Vineet Kulkarni, Shrikanthrao Kulkarni
  • Patent number: 10180856
    Abstract: A method for performing dynamic port remapping during instruction scheduling in an out of order microprocessor is disclosed. The method comprises selecting and dispatching a plurality of instructions from a plurality of select ports in a scheduler module in first clock cycle. Next, it comprises determining if a first physical register file unit has capacity to support instructions dispatched in the first clock cycle. Further, it comprises supplying a response back to logic circuitry between the plurality of select ports and a plurality of execution ports, wherein the logic circuitry is operable to re-map select ports in the scheduler module to execution ports based on the response. Finally, responsive to a determination that the first physical register file unit is full, the method comprises re-mapping at least one select port connecting with an execution unit in the first physical register file unit to a second physical register file unit.
    Type: Grant
    Filed: July 25, 2016
    Date of Patent: January 15, 2019
    Assignee: INTEL CORPORATION
    Inventor: Nelson N. Chan
  • Patent number: 10108423
    Abstract: An approach is provided in which a mapper control unit matches a result instruction tag corresponding to an executed instruction to a history buffer entry's instruction tag. The matched history buffer entry includes multiple history buffer field sets that each include a field set state indicator. The mapper control unit identifies a subset of the history buffer field sets having a valid field set state indicator and stores result data corresponding to the result instruction tag in the identified subset of history buffer field sets. In turn, the mapper control unit restores a subset of a register's fields utilizing content from the subset of history buffer field sets.
    Type: Grant
    Filed: March 25, 2015
    Date of Patent: October 23, 2018
    Assignee: International Business Machines Corporation
    Inventors: Michael J. Genden, Dung Q. Nguyen, Kenneth L. Ward
  • Patent number: 10067788
    Abstract: A computing system can provide user interfaces and back-end operations to facilitate review and invalidation of executed jobs. The system can provide an interface that allows the operator to review quality-control information about a completed job. Once the operator identifies a job as invalid, the operator can be presented with further options, such as whether to invalidate only the reviewed job or the job and all its descendants. The operator can also review antecedent jobs to an invalid job (e.g., in order to trace the root of the problem) and can selectively invalidate antecedent jobs.
    Type: Grant
    Filed: September 2, 2016
    Date of Patent: September 4, 2018
    Assignee: Dropbox, Inc.
    Inventors: Shaunak Kishore, Karl Dray
  • Patent number: 10007522
    Abstract: A branch instruction and a corresponding branch instruction address are received at a data processing system. A first value is received and is compared to a portion of the branch instruction address. An entry at a branch target buffer corresponding to the branch instruction is selectively allocated based on a result of the comparing.
    Type: Grant
    Filed: May 20, 2014
    Date of Patent: June 26, 2018
    Assignee: NXP USA, Inc.
    Inventors: Jeffrey W. Scott, William C. Moyer
  • Patent number: 9977596
    Abstract: The speed at which files can be accessed from a remote location is increased by predicting the file access pattern based on a predictive model. The file access pattern describes the order in which blocks of data for a given file type are read by a given application. From aggregated data across many file accesses, one or more predictive models of access patterns can be built. A predictive model takes as input the application requesting the file access and the file type being requested, and outputs information describing an order of data blocks for transmitting the file to the requesting application. Accordingly, when a server receives a request for a file from an application, the server uses the predictive model to determine the order that the application is most likely to use the data blocks of the file. The data is then transmitted in that order to the client device.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: May 22, 2018
    Assignee: Dropbox, Inc.
    Inventors: Rian Hunter, Jeffrey Bartelma
  • Patent number: 9977674
    Abstract: Disclosed are an apparatus, system, and method for implementing predicated instructions using micro-operations. A micro-code engine receives an instruction, decomposes the instruction, and generates a plurality of micro-operations to implement the instruction. Each of the decomposed micro-operations indicates a single destination register. For predicated instructions, the decomposed micro-operations include “conditional move” micro-operations to select between two potential output values. Except in the case that one of the potential output values is a constant, the decomposed micro-operations for a predicated instruction also include an append instruction that saves the incoming value of a destination register in a temporary variable. For at least one embodiment, the qualifying predicate for a predicated instruction is appended to the incoming value stored in the temporary register.
    Type: Grant
    Filed: October 14, 2003
    Date of Patent: May 22, 2018
    Assignee: Intel Corporation
    Inventors: Jeffrey P. Rupley, II, Edward A. Brekelbaum, Edward T. Grochowski, Bryan P. Black
  • Patent number: 9965274
    Abstract: A computer processor is provided with a plurality of functional units that performs operations specified by the at least one instruction over the multiple machine cycles, wherein the operations produce result operands. The processor also includes circuitry that generates result tags dynamically according to the number of operations that produce result operands in a given machine cycle. A bypass network is configured to provide data paths for transfer of operand data between the plurality of functional units according to the result tags.
    Type: Grant
    Filed: October 15, 2014
    Date of Patent: May 8, 2018
    Assignee: Mill Computing, Inc.
    Inventors: Arthur David Kahlich, Roger Rawson Godard
  • Patent number: 9952870
    Abstract: An apparatus and method for filtering biased conditional branches in a branch predictor in favor of non-biased conditional branches are disclosed. Biased conditional branches, which are consistently skewed toward one direction or outcome, are filtered such that an increased number of non-biased conditional branches which resolve in both directions may be considered. As a result, more useful branches may be captured over larger distances, thereby providing correlations deeper in a global history. In addition, by tracking only the latest occurrences of non-biased conditional branches using a recency stack structure, even more distant branch correlations may be made.
    Type: Grant
    Filed: June 13, 2014
    Date of Patent: April 24, 2018
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Mikko Lipasti, Dibakar Gope
  • Patent number: 9942272
    Abstract: Processing streaming data in accordance with policies that group data by source, enforce a maximum permissible late arrival value for streaming data, a maximum permissible early arrival for data and/or a maximum degree to which data can be out of order and still be compliant with the out of order policy is described. The correct starting point for reading a data stream so as to produce correct output from a given output start time can be enabled using the early arrival policy. Using combinations of policies, output can be generated promptly (with low latency). When input from a given source is not disrupted, output can be generated with low latency. Output can be generated even when the input stops by applying a late arrival policy.
    Type: Grant
    Filed: June 5, 2015
    Date of Patent: April 10, 2018
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.
    Inventors: Zhong Chen, Lev Novik, Boris Shulman, Clemens A. Szyperski
  • Patent number: 9864398
    Abstract: A method for transmitting a plurality of data bits and a clock signal on a return to zero (RZ) signal includes: transmitting a first voltage that is greater than a first threshold, the first voltage being decodable to first order of data bits; transmitting a second voltage that is between a second threshold and the first threshold, the second voltage being decodable to a second order of data bits; transmitting a third voltage that is between a third threshold and a fourth threshold, the third voltage being decodable to a third order of data bits; transmitting a fourth voltage that is greater in magnitude than the fourth threshold, the fourth voltage being decodable to a fourth order of data bits; and transitioning the clock signal in response to the RZ signal being between the second threshold and the third threshold.
    Type: Grant
    Filed: December 30, 2015
    Date of Patent: January 9, 2018
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Robert Floyd Payne
  • Patent number: 9851977
    Abstract: A method for executing instructions on a single-program, multiple-data processor system having a fixed number of execution lanes, including: scheduling a primary instruction for execution with a first wave of multiple data; assigning the first wave to a corresponding primary subset of the execution lanes; scheduling a secondary instruction having a second wave of multiple data, such that the second wave fits in lanes that are unused by the primary subset of lanes; assigning the second wave to a corresponding secondary subset of the lanes; fetching the primary and secondary instructions; configuring the execution lanes such that the primary subset is responsive to the primary instruction and the secondary subset is simultaneously responsive to the secondary instruction; and simultaneously executing the primary and secondary instructions in the execution lanes.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: December 26, 2017
    Assignee: KALRAY
    Inventors: Nicolas Brunie, Sylvain Collange
  • Patent number: 9798590
    Abstract: A method and apparatus for post-retire transaction access tracking is herein described. Load and store buffers are capable of storing senior entries. In the load buffer a first access is scheduled based on a load buffer entry. Tracking information associated with the load is stored in a filter field in the load buffer entry. Upon retirement, the load buffer entry is marked as a senior load entry. A scheduler schedules a post-retire access to update transaction tracking information, if the filter field does not represent that the tracking information has already been updated during a pendency of the transaction. Before evicting a line in a cache, the load buffer is snooped to ensure no load accessed the line to be evicted.
    Type: Grant
    Filed: September 7, 2006
    Date of Patent: October 24, 2017
    Assignee: Intel Corporation
    Inventors: Haitham Akkary, Ravi Rajwar, Srikanth T. Srinivasan
  • Patent number: 9766895
    Abstract: A computing device determines that a current software thread of a plurality of software threads having an issuing sequence does not have a first instruction waiting to be issued to a hardware thread during a clock cycle. The computing device identifies one or more alternative software threads in the issuing sequence having instructions waiting to be issued. The computing device selects, during the clock cycle by the computing device, a second instruction from a second software thread among the one or more alternative software threads in view of determining that the second instruction has no dependencies with any other instructions among the instructions waiting to be issued. Dependencies are identified by the computing device in view of the values of a chaining bit extracted from each of the instructions waiting to be issued. The computing device issues the second instruction to the hardware thread.
    Type: Grant
    Filed: November 12, 2014
    Date of Patent: September 19, 2017
    Assignee: Optimum Semiconductor Technologies, Inc.
    Inventors: Shenghong Wang, C. John Glossner, Gary J. Nacer
  • Patent number: 9766894
    Abstract: A chaining bit decoder of a computer processor receives an instruction stream. The chaining bit decoder selects a group of instructions from the instruction stream. The chaining bit decoder extracts a designated bit from each instruction of the instruction stream to produce a sequence of chaining bits. The chaining bit decoder decodes the sequence of chaining bits. The chaining bit decoder identifies zero or more instruction stream dependencies among the selected group of instructions in view of the decoded sequence of chaining bits. The chaining bit decoder outputs control signals to cause one or more pipelines stages of the processor to execute the selected group of instructions in view of the identified zero or more instruction stream dependencies among the group sequence of instructions.
    Type: Grant
    Filed: November 12, 2014
    Date of Patent: September 19, 2017
    Assignee: Optimum Semiconductor Technologies, Inc.
    Inventors: C. John Glossner, Gary J. Nacer, Murugappan Senthilvelan, Vitaly Kalashnikov, Arthur J. Hoane, Paul D'Arcy, Sabin D. Iancu, Shenghong Wang
  • Patent number: 9747224
    Abstract: Provided is a method of managing a register port, the method including performing scheduling on register ports that are used during a plurality of cycles to enable performing of a calculation; encoding data of the register ports according to results of the scheduling, the encoding of the data including, with respect to data of one of the register ports that does not have a schedule during one of the plurality of cycles, equally encoding the data of the one register port during the one cycle with data of an adjacent cycle of the one register port, the adjacent cycle being adjacent to the one cycle; and transmitting results of the encoding to a device that includes the register ports.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: August 29, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Tai-Song Jin, Jae-Un Park, Do-hyung Kim, Seung-won Lee
  • Patent number: 9740494
    Abstract: Instruction issue circuits are disclosed that are configured to issue multiple instructions within a superscalar pipeline of a microprocessor. The instruction issue circuit includes an instruction queue that stores instructions. A ready generation circuit is operably associated with the instruction queue and generates ready signals that indicate which instructions in the instruction queue are ready for execution. To simplify the instruction issue circuit, the instruction issue circuit has group blocks. Each group block receives a different group of the ready signals corresponding to a different group of the instructions. Each group block generates a group output indicating a group set within the corresponding group of the instructions that has a highest instruction execution priority and are ready for execution. By splitting the ready signals into groups, the groups of ready signals can be processed in parallel thereby reducing both the resulting delay and complexity of the instruction issue circuit.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: August 22, 2017
    Assignee: Arizona Board of Regents for and on behalf of Arizona State University
    Inventors: Lawrence T. Clark, Siddhesh Mhambrey, Satendra Kumar Maurya
  • Patent number: 9720675
    Abstract: Illustrated is a system and method to receive an instruction to access a set of identified data structures, where each identified data structure is associated with a version data structure that includes annotations of particular dependencies amongst at least two members of the set of identical data structures. The system and method further comprising determining, based upon the dependencies, that a version mismatch exists between the at least two members of the set of identified data structures, the dependencies used to identify a most recent version and a locally cached version of the at least two members. The system and method further comprising delaying execution of the instruction until the version mismatch between the at least two members of the set of identified data structures is resolved through an upgrade of a version of one of the at least two members of the set of identified data structures.
    Type: Grant
    Filed: October 27, 2010
    Date of Patent: August 1, 2017
    Assignee: Hewlett Packard Enterprise Development LP
    Inventor: Antonio Lain
  • Patent number: 9652371
    Abstract: A circular queue implementing a scheme for prioritized reads is disclosed. In one embodiment, a circular queue (or buffer) includes a number of storage locations each configured to store a data value. A multiplexer tree is coupled between the storage locations and a read port. A priority circuit is configured to generate and provide selection signals to each multiplexer of the multiplexer tree, based on a priority scheme. Based on the states of the selection signals, one of the storage locations is coupled to the read port via the multiplexers of the multiplexer tree.
    Type: Grant
    Filed: February 18, 2015
    Date of Patent: May 16, 2017
    Assignee: Apple Inc.
    Inventors: Rajat Goel, Hari S. Kannan, Khurram Z. Malik
  • Patent number: 9645802
    Abstract: A device compiler and linker is configured to group instructions into different strands for execution by different threads based on the dependence of those instructions on other, long-latency instructions. A thread may execute a strand that includes long-latency instructions, and then hardware resources previously allocated for the execution of that thread may be de-allocated from the thread and re-allocated to another thread. The other thread may then execute another strand while the long-latency instructions are in flight. With this approach, the other thread is not required to wait for the long-latency instructions to complete before acquiring hardware resources and initiating execution of the other strand, thereby eliminating at least a portion of the time that the other thread would otherwise spend waiting.
    Type: Grant
    Filed: August 7, 2013
    Date of Patent: May 9, 2017
    Assignee: NVIDIA Corporation
    Inventors: Mojtaba Mehrara, Michael Garland, Gregory Diamos
  • Patent number: 9632977
    Abstract: A data processor includes a packet selector. The packet selector creates an ordered list of packets, each packet corresponding to a respective communication flow, determines whether each packet in the ordered list of packets is eligible for transfer to a prefetch unit based on whether a preceding packet in the same communication flow has been transferred to the prefetch unit, and sets a selection priority for each packet based on start time constraints for the respective communication flow, and based on a processing status of a preceding packet in the communication flow.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: April 25, 2017
    Assignee: NXP USA, Inc.
    Inventors: Timothy G. Boland, Anne C. Harris, Steven D. Millman
  • Patent number: 9626190
    Abstract: The present invention provides a method and apparatus for floating-point register caching. One embodiment of the method includes mapping a first set of architected registers defined by a first instruction set to a memory outside of a plurality of physical registers. The plurality of physical registers are configured to map to the first set, a second set of architected registers defined by a second construction set, and a set of rename registers. This embodiment of the method also includes adding the physical registers corresponding to the first set of architected registers to the set of rename registers.
    Type: Grant
    Filed: October 7, 2010
    Date of Patent: April 18, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Jeff Rupley
  • Patent number: 9612835
    Abstract: A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older access instructions are gradually retired. When all older memory accesses are retired, the fencing instruction is dispatched from the buffer.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: April 4, 2017
    Assignee: Intel Corporation
    Inventors: Salvador Palanca, Stephen A. Fischer, Subramaniam Maiyuran, Shekoufeh Qawami
  • Patent number: 9582283
    Abstract: A microcontroller includes a program memory, data memory, central processing unit, at least one register module, a memory management unit, and a transport network. Instructions are executed in one clock cycle via an instruction word. The instruction word indicates the source module from which data is to be retrieved and the destination module to which data is to be stored. The address/data capability of an instruction word may be extended via a prefix module. If an operation is performed on the data, the source module or the destination module may perform the operation during the same clock cycle in which the data is transferred.
    Type: Grant
    Filed: May 18, 2009
    Date of Patent: February 28, 2017
    Assignee: Maxim Integrated Products, Inc.
    Inventors: Jeffrey D. Owens, Edward Tangkwai Ma, Donald W. Loomis, III, Thomas Augustus Chenot
  • Patent number: 9535701
    Abstract: A pipelined processor selects an instruction fetch mode from a number of fetch modes including an executed branch fetch mode, a predicted fetch mode, and a sequential fetch mode. Each branch instruction is associated with branch delay slots, the size of which can be greater than or equal to zero, and can vary from one branch instance to another. Branch prediction is used to fetch instructions, with the source of information for predictions deriving from a last instruction in the branch delay slots. When a prediction error occurs, the executed branch fetch mode uses an address from branch instruction evaluation to fetch a next instruction.
    Type: Grant
    Filed: January 29, 2014
    Date of Patent: January 3, 2017
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (publ)
    Inventors: Erik Rijshouwer, Ricky Nas
  • Patent number: 9529596
    Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, and apparatuses for scheduling instructions in a multi-strand out-of-order processor. For example, an apparatus for scheduling instructions in a multi-strand out-of-order processor includes an out-of-order instruction fetch unit to retrieve a plurality of interdependent instructions for execution from a multi-strand representation of a sequential program listing; an instruction scheduling unit to schedule the execution of the plurality of interdependent instructions based at least in part on operand synchronization bits encoded within each of the plurality of interdependent instructions; and a plurality of execution units to execute at least a subset of the plurality of interdependent instructions in parallel.
    Type: Grant
    Filed: July 1, 2011
    Date of Patent: December 27, 2016
    Assignee: Intel Corporation
    Inventors: Boris A. Babayan, Vladimir M. Pentkovski, Alexander V. Butuzov, Sergey Y. Shishlov, Alexey Y. Sivtsov, Nikolay E. Kosarev