Patents Examined by Courtney P Carmichael-Moody
  • Patent number: 11379240
    Abstract: In an embodiment, an indirect branch predictor generates indirect branch predictions based on one or more register values. The register values may be the contents of registers on which the indirect branch instruction is directly or indirectly dependent for generating the branch target address, for example. In an embodiment, at least one of the registers may be a source for a load instruction, and the indirect branch may be dependent (directly or indirectly) on the target of the load. In an embodiment, the indirect branch predictor may be one of at least two indirect branch predictors in a processor. The other indirect branch predictor may be based on a fetch address, or PC, associated with the indirect branch instruction. The other indirect branch predictor may generate a first predicted target address, and the indirect branch predictor may generate a second predicted target address for the same indirect branch instruction.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: July 5, 2022
    Assignee: Apple Inc.
    Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Conrado Blasco, Haoyan Jia, Amit Kumar
  • Patent number: 11379233
    Abstract: In an apparatus with transactional memory support circuitry, for a first type of transaction started using a first type of transaction start instruction, commitment of results of instructions executed speculatively following the first type of transaction start instruction is prevented until a transaction end instruction is reached. An abort is triggered when a conflict is detected between an address of a memory access from another thread and the addresses tracked for the transaction. For a second type of transaction started using a second type of transaction start instruction, an address of the read operation is marked as trackable whilst an address of a write operation is omitted from being marked as trackable. This allows an apparatus that supports transactional memory to also be used for multi-word address watching.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: July 5, 2022
    Assignee: Arm Limited
    Inventors: Matthew James Horsnell, Richard Roy Grisenthwaite
  • Patent number: 11360773
    Abstract: Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-fetching is disclosed. An instruction processing circuit is configured to detect fetched performance degrading instructions (PDIs) in a pre-execution stage in an instruction pipeline that may cause a precise interrupt that would cause flushing of the instruction pipeline. In response to detecting a PDI in an instruction pipeline, the instruction processing circuit is configured to capture the fetched PDI and/or its successor, younger fetched instructions that are processed in the instruction pipeline behind the PDI, in a pipeline refill circuit.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: June 14, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Rami Mohammad Al Sheikh, Michael Scott McIlvaine
  • Patent number: 11360774
    Abstract: In one embodiment, a branch processing method, comprising: assigning plural branch instructions for a given clock cycle to primary branch information and secondary branch information; routing the primary branch information along a first path having adder logic and the secondary branch information along a second path having no adder logic; and writing the primary branch information including a displacement branch target address to a branch order table (BOT) and the secondary branch information without a target address to the BOT.
    Type: Grant
    Filed: October 23, 2020
    Date of Patent: June 14, 2022
    Assignee: CENTAUR TECHNOLOGY, INC.
    Inventors: Thomas C. McDonald, John Duncan
  • Patent number: 11341087
    Abstract: A heterogeneous multi-core integrated circuit comprising two or more processors, at least one of the processors being a general purpose CPU and at least one of the processors being a specialized hardware processing engine, the processors being connected by a processor local bus on the integrated circuit, wherein the general purpose CPU is configured to generate a first instruction for an atomic operation to be performed by a second processor, different from the general purpose CPU, the first instruction comprising an address of the second processor and a first command indicating a first action to be executed by the second processor, and transmit the first instruction to the second processor over the processor local bus. The first command may include the first action, or may be a descriptor of the first action or a pointer to where the first action may be found in a memory.
    Type: Grant
    Filed: May 24, 2016
    Date of Patent: May 24, 2022
    Assignee: DISPLAYLINK (UK) LIMITED
    Inventors: Robin Alexander Cawley, Colin Skinner, Eric Kenneth Hamaker
  • Patent number: 11314516
    Abstract: Systems and methods of selecting a collection of compatible issue-ready instructions for parallel execution by functional units in a superscalar processor in a single clock cycle. All possible instructions (opcodes) to be executed by the functional units are pre-arranged into several scenarios based on potential resource conflicts among the instructions. Each scenario includes multiple groups of predefined instructions. During operation, concurrently for all the groups, an issue-ready instruction is identified with reference to each group based on group-specific selection policies. Further, based on the identified instructions, predefined policies are applied to select one or more scenarios and select among the picks of the selected scenarios. As a result, the output instructions of the selected scenarios are issued for parallel execution by the functional units.
    Type: Grant
    Filed: January 19, 2018
    Date of Patent: April 26, 2022
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: David Carlson
  • Patent number: 11314518
    Abstract: A method of monitoring execution in an execution environment of an operation, for example a cryptographic operation, comprising a sequence of instructions, is disclosed. Instructions sent in the sequence from a main processor to one or more auxiliary processors, for example cryptographic processors, to execute the operation are monitored and the sequence of instructions is verified using verification information. The method comprises enabling output from the execution environment of a result of the operation in response to a successful verification of the sequence, or generating a verification failure signal in response to a failed verification of the sequence.
    Type: Grant
    Filed: August 2, 2017
    Date of Patent: April 26, 2022
    Assignee: Nagravision S.A.
    Inventors: Marco Macchetti, Nicolas Fischer, Jerome Perrine
  • Patent number: 11301251
    Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.
    Type: Grant
    Filed: April 23, 2020
    Date of Patent: April 12, 2022
    Assignee: SiFive, Inc.
    Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman
  • Patent number: 11301252
    Abstract: A data processing apparatus is provided comprising: a plurality of input lanes and a plurality of corresponding output lanes. Processing circuitry executes a first vector instruction and a second vector instruction. The first vector instruction specifies a target of output data from the corresponding output lanes that is specified as a source of input data to the input lanes by the second vector instruction. Mask circuitry stores a first mask that defines a first set of the output lanes that are valid for the first vector instruction, and stores a second mask that defines a second set of the output lanes that are valid for the second vector instruction. The first set and the second set are mutually exclusive. Issue circuitry begins processing of the second vector instruction at a lane index prior to completion of the first vector instruction at the lane index.
    Type: Grant
    Filed: January 15, 2020
    Date of Patent: April 12, 2022
    Assignee: Arm Limited
    Inventor: Kim Richard Schuttenberg
  • Patent number: 11294681
    Abstract: An integrated circuit comprising instruction processing circuitry for processing a plurality of program instructions and instruction prediction circuitry. The instruction prediction circuitry comprises circuitry for detecting successive occurrences of a same program loop sequence of program instructions. The instruction prediction circuitry also comprises circuitry for predicting a number of iterations of the same program loop sequence of program instructions, in response to detecting, by the circuitry for detecting, that a second occurrence of the same program loop sequence of program instructions comprises a same number of iterations as a first occurrence of the same program loop sequence of program instructions.
    Type: Grant
    Filed: May 31, 2020
    Date of Patent: April 5, 2022
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Kai Chirca, Paul Daniel Gauvreau, David Edward Smith, Jr.
  • Patent number: 11294680
    Abstract: A microprocessor implemented method is disclosed. The method includes mapping a plurality of instructions in a guest address space to corresponding instructions in a native address space. The method further includes, for each of one or more guest branch instructions in said native address space fetched during execution, performing the following: determining a youngest prior guest branch target stored in a guest branch target register, determining a branch target for a respective guest branch instruction by adding an offset value for said respective guest branch instruction to said youngest prior guest branch target, where said offset value is adjusted to account for a difference in address in said guest address space between an instruction at a beginning of a guest instruction block and a branch instruction in said guest instruction block. The method further includes creating an entry in said guest branch target register for said branch target.
    Type: Grant
    Filed: October 31, 2019
    Date of Patent: April 5, 2022
    Assignee: Intel Corporation
    Inventor: Mohammad A. Abdallah
  • Patent number: 11294687
    Abstract: A data bus includes process elements and a linear main pipeline. Each process element is coupled to a linear pipeline having M stages arranged in series, each of the M stages including a buffer element configured to buffer a data bit sequence and to forward the buffered data bit sequence from a first of the buffer elements to a last of the buffer elements. The linear main pipeline includes N pipeline stage elements arranged in series. Each pipeline stage element is connected to the last buffer element of a respective linear pipeline and configured to read-out one or more of the buffered data bit sequences and to forward the read-out data bit sequences from one of N pipeline stage elements to a next of the N pipeline stage elements.
    Type: Grant
    Filed: May 22, 2020
    Date of Patent: April 5, 2022
    Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
    Inventors: Jens Döge, Christoph Hoppe, Peter Reichel
  • Patent number: 11294787
    Abstract: An apparatus and method are provided to control assertion of a trigger signal to processing circuitry. The apparatus has evaluation circuitry to receive program instruction execution information indicative of a program instruction executed by the processing circuitry, which is arranged to perform an evaluation operation to determine with reference to evaluation information whether the program instruction execution information indicates presence of a trigger condition. Trigger signal generation circuitry is used to assert a trigger signal to the processing circuitry in dependence on whether the trigger condition is determined to be present. Further, filter circuitry is arranged to receive event information indicative of at least one event occurring within the processing circuitry, and is arranged to determine with reference to filter control information and that event information whether a qualifying condition is present.
    Type: Grant
    Filed: August 10, 2017
    Date of Patent: April 5, 2022
    Assignee: ARM Limited
    Inventors: François Christopher Jacques Botman, Thomas Christopher Grocutt, John Michael Horley, Michael John Williams
  • Patent number: 11294670
    Abstract: Embodiments detailed herein relate to reduction operations on a plurality of data element values. In one embodiment, a processor comprises decoding circuitry to decode an instruction and execution circuitry to execute the decoded instruction. The instruction specifies a first input register containing a plurality of data element values, a first index register containing a plurality of indices, and an output register, where each index of the plurality of indices maps to one unique data element position of the first input register. The execution includes to identify data element values that are associated with one another based on the indices, perform one or more reduction operations on the associated data element values based on the identification, and store results of the one or more reduction operations in the output register.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: April 5, 2022
    Assignee: INTEL CORPORATION
    Inventors: Christopher J. Hughes, Jonathan D. Pearce, Guei-Yuan Lueh, ElMoustapha Ould-Ahmed-Vall, Jorge E. Parra, Prasoonkumar Surti, Krishna N. Vinod, Ronen Zohar
  • Patent number: 11294682
    Abstract: A program is executed using a call stack and shadow stack. The call stack includes frames having respective return addresses. The frames may also store variables and/or parameters. The shadow stack stores duplicates of the return addresses in the call stack. The call stack and the shadow stack are maintained by, (i) each time a function is called, adding a corresponding stack frame to the call stack and adding a corresponding return address to the shadow stack, and (ii) each time a function is exited, removing a corresponding frame from the call stack and removing a corresponding return address from the shadow stack. A backtrace of the program's current call chain is generated by accessing the return addresses in the shadow stack. The outputted backtrace includes the return addresses from the shadow stack and/or information about the traced functions that is derived from the shadow stack's return addresses.
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: April 5, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ben Niu, Gregory John Colombo, Weidong Cui, Jason Lin, Kenneth Dean Johnson
  • Patent number: 11294684
    Abstract: In an embodiment, an indirect branch predictor generates indirect branch predictions for indirect branch instructions. For relatively static branch instructions, the indirect branch predictor may be configured to use a PC corresponding to the indirect branch instruction to generate a target prediction. The indirect branch predictor may be configured to identify at least one dynamic indirect branch instruction and may use a different PC than the PC corresponding to the indirect branch instruction to generate the target prediction (e.g. the most recent previous PC associated with a taken branch (“the previous taken PC”)). For some dynamic indirect branch instructions, the previous taken PC may disambiguate different target addresses (e.g. there may be a correlation between the previous taken PC and the target address of the indirect branch instruction).
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: April 5, 2022
    Assignee: Apple Inc.
    Inventor: Ian D. Kountanis
  • Patent number: 11294688
    Abstract: A parallel-processing computer system is provided for parallel processing of data packets conveyed in a communication network. The system comprises: a memory; a plurality of processing elements; and a program stored at the memory for execution by the plurality of processing elements.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: April 5, 2022
    Assignees: DRIVENETS LTD., AT&T SERVICES, INC.
    Inventors: Amir Krayden, Yuval Moshe, Anton Gartsbein, Gal Zolkover, Or Sadeh, Ori Zakin, Yuval Lev
  • Patent number: 11294690
    Abstract: Single Program, Multiple Data (SPMD) parallel processing of SPMD instructions can be generated among processors assigned to a task in a plurality of threads. The SPMD parallel processing can be increased in speed by performing predicated looping with the SPMD instructions in an activated SPMD mode of operation over a non-SPMD mode. Execution of overhead instructions is removed from the SPMD instructions associated with a thread in order to only execute the loop body of a loop associated with a data element of a data set in an enhanced Zero Loop Overhead (ZOL) device.
    Type: Grant
    Filed: January 29, 2020
    Date of Patent: April 5, 2022
    Assignee: Infineon Technologies AG
    Inventor: Prakash Balasubramanian
  • Patent number: 11294671
    Abstract: Disclosed embodiments relate to systems and methods for performing duplicate detection instructions on two-dimensional (2D) data. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode and locations of a source matrix comprising M×N elements and a destination, the opcode to indicate execution circuitry is to use a plurality of comparators to discover duplicates in the source matrix, and store indications of locations of discovered duplicates in the destination. The execution circuitry to execute the decoded instruction as per the opcode.
    Type: Grant
    Filed: December 26, 2018
    Date of Patent: April 5, 2022
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Michael Espig, Dan Baum, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 11294685
    Abstract: Methods and systems for creating a sequence of fused instructions. An instruction stream is obtained, and a window of instructions from the instruction stream is examined and one or more groups of instructions that satisfy one or more fusion rules are identified. One or more of the groups of instructions that satisfy the one or more fusion rules are fused and a maximal length data dependence chain in the instruction stream is analyzed by analyzing every node in a dependence graph in a selected window of instructions. Fusion of an instruction group is prevented based on the maximal length data dependence chain.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: April 5, 2022
    Assignee: International Business Machines Corporation
    Inventors: Jessica Hui-Chun Tseng, Manoj Kumar, Kattamuri Ekanadham, Jose E. Moreira, Pratap C. Pattnaik
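
As an illustration of one of the abstracts above, the call stack and shadow stack scheme of patent 11294682 can be sketched as a toy model. This is only a minimal sketch based on the abstract's wording, not the patented implementation: the class name `ShadowStackMachine`, its methods, the example addresses, and the `symbols` lookup table are all hypothetical.

```python
class ShadowStackMachine:
    """Toy model: a call stack of frames plus a shadow stack of return addresses."""

    def __init__(self):
        self.call_stack = []    # frames holding a return address and local variables
        self.shadow_stack = []  # duplicates of the return addresses only

    def call(self, return_address, local_vars=None):
        # On function entry: push a frame and mirror its return address.
        frame = {"ret": return_address, "locals": dict(local_vars or {})}
        self.call_stack.append(frame)
        self.shadow_stack.append(return_address)

    def ret(self):
        # On function exit: pop the frame and the mirrored return address together.
        self.call_stack.pop()
        return self.shadow_stack.pop()

    def backtrace(self, symbols=None):
        # Build the backtrace from the shadow stack alone, innermost call first.
        # `symbols` is a hypothetical address-to-name table used for illustration.
        symbols = symbols or {}
        return [(addr, symbols.get(addr, "?")) for addr in reversed(self.shadow_stack)]


# Example: three nested calls, then a backtrace derived from the shadow stack.
machine = ShadowStackMachine()
machine.call(0x1000)
machine.call(0x2000, {"x": 1})
machine.call(0x3000)
trace = machine.backtrace({0x1000: "main", 0x2000: "parse"})
```

The backtrace here never reads the call-stack frames, which mirrors the abstract's point that the current call chain can be recovered from the shadow stack's return addresses.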