Patents Examined by Courtney P Carmichael-Moody
-
Patent number: 11379240Abstract: In an embodiment, an indirect branch predictor generates indirect branch predictions based on one or more register values. The register values may be the contents of registers on which the indirect branch instruction is directly or indirectly dependent for generating the branch target address, for example. In an embodiment, at least one of the registers may be a source for a load instruction, and the indirect branch may be dependent (directly or indirectly) on the target of the load. In an embodiment, the indirect branch predictor may be one of at least two indirect branch predictors in a processor. The other indirect branch predictor may be based on a fetch address, or PC, associated with the indirect branch instruction. The other indirect branch predictor may generate a first predicted target address, and the indirect branch predictor may generate a second predicted target address for the same indirect branch instruction.Type: GrantFiled: January 31, 2020Date of Patent: July 5, 2022Assignee: Apple Inc.Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Conrado Blasco, Haoyan Jia, Amit Kumar
-
Patent number: 11379233Abstract: In an apparatus with transactional memory support circuitry, for a first type of transaction started using a first type of transaction start instruction, commitment of results of instructions executed speculatively following the first type of transaction start instruction are prevented until a transaction end instruction is reached. An abort is triggered when a conflict is detected between an address of a memory access from another thread and the addresses tracked for the transaction. For a second type of transaction started using a second type of transaction start instruction, an address of the read operation is marked as trackable whilst an address of a write operation is omitted from being marked as trackable. This allows an apparatus that supports transactional memory to also be used for multi-word address watching.Type: GrantFiled: October 17, 2019Date of Patent: July 5, 2022Assignee: Arm LimitedInventors: Matthew James Horsnell, Richard Roy Grisenthwaite
-
Patent number: 11360773Abstract: Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-fetching is disclosed. An instruction processing circuit is configured to detect fetched performance degrading instructions (Pals) in a pre-execution stage in an instruction pipeline that may cause a precise interrupt that would cause flushing of the instruction pipeline. In response to detecting a PDI in an instruction pipeline, the instruction processing circuit is configured to capture the fetched PDI and/or its successor, younger fetched instructions that are processed in the instruction pipeline behind the PDI, in a pipeline refill circuit.Type: GrantFiled: June 22, 2020Date of Patent: June 14, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Rami Mohammad Al Sheikh, Michael Scott McIlvaine
-
Patent number: 11360774Abstract: In one embodiment, a branch processing method, comprising: assigning plural branch instructions for a given clock cycle to primary branch information and secondary branch information; routing the primary branch information along a first path having adder logic and the secondary branch information along a second path having no adder logic; and writing the primary branch information including a displacement branch target address to a branch order table (BOT) and the secondary branch information without a target address to the BOT.Type: GrantFiled: October 23, 2020Date of Patent: June 14, 2022Assignee: CENTAUR TECHNOLOGY, INC.Inventors: Thomas C. McDonald, John Duncan
-
Patent number: 11341087Abstract: A heterogeneous multi-core integrated circuit comprising two or more processors, at least one of the processors being a general purpose CPU and at least one of the processors being a specialized hardware processing engine, the processors being connected by a processor local bus on the integrated circuit, wherein the general purpose CPU is configured to generate a first instruction for an atomic operation to be performed by a second processor, different from the general purpose CPU, the first instruction comprising an address of the second processor and a first command indicating a first action to be executed by the second processor, and transmit the first instruction to the second processor over the processor local bus. The first command may include the first action, or may be a descriptor of the first action or a pointer to where the first action may be found in a memory.Type: GrantFiled: May 24, 2016Date of Patent: May 24, 2022Assignee: DISPLAYLINK (UK) LIMITEDInventors: Robin Alexander Cawley, Colin Skinner, Eric Kenneth Hamaker
-
Patent number: 11314516Abstract: Systems and methods of selecting a collection of compatible issue-ready instructions for parallel execution by functional units in a superscalar processor in a single clock cycle. All possible instructions (opcodes) to be executed by the functional units are pre-arranged into several scenarios based on potential resource conflicts among the instructions. Each scenario includes multiple groups of predefined instructions. During operation, concurrently for all the groups, an issue-ready instruction is identified with reference to each group based on group-specific selection policies. Further, based on the identified instructions, predefined policies are applied to select one or more scenarios and select among the picks of the selected scenarios. As a result, the output instructions of the selected scenarios are issued for parallel execution by the functional units.Type: GrantFiled: January 19, 2018Date of Patent: April 26, 2022Assignee: Marvell Asia Pte, Ltd.Inventor: David Carlson
-
Patent number: 11314518Abstract: A method of monitoring execution in an execution environment of an operation, for example a cryptographic operation, comprising a sequence of instructions, is disclosed. Instructions sent in the sequence from a main processor to one or more auxiliary processors, for example cryptographic processors, to execute the operation are monitored and the sequence of instructions is verified using verification information. The method comprises enabling output from the execution environment of a result of the operation in response to a successful verification of the sequence, or generating a verification failure signal in response to a failed verification of the sequence.Type: GrantFiled: August 2, 2017Date of Patent: April 26, 2022Assignee: Nagravision S.A.Inventors: Marco Macchetti, Nicolas Fischer, Jerome Perrine
-
Patent number: 11301251Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.Type: GrantFiled: April 23, 2020Date of Patent: April 12, 2022Assignee: SiFive, Inc.Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman
-
Patent number: 11301252Abstract: A data processing apparatus is provided comprising: a plurality of input lanes and a plurality of corresponding output lanes. Processing circuitry executes a first vector instruction and a second vector instruction. The first vector instruction specifies a target of output data from the corresponding output lanes that is specified as a source of input data to the input lanes by the second vector instruction. Mask circuitry stores a first mask that defines a first set of the output lanes that are valid for the first vector instruction, and stores a second mask that defines a second set of the output lanes that are valid for the second vector instruction. The first set and the second set are mutually exclusive. Issue circuitry begins processing of the second vector instruction at a lane index prior to completion of the first vector instruction at the lane index.Type: GrantFiled: January 15, 2020Date of Patent: April 12, 2022Assignee: Arm LimitedInventor: Kim Richard Schuttenberg
-
Patent number: 11294681Abstract: An integrated circuit comprising instruction processing circuitry for processing a plurality of program instructions and instruction prediction circuitry. The instruction prediction circuitry comprises circuitry for detecting successive occurrences of a same program loop sequence of program instructions. The instruction prediction circuitry also comprises circuitry for predicting a number of iterations of the same program loop sequence of program instructions, in response to detecting, by the circuitry for detecting, that a second occurrence of the same program loop sequence of program instructions comprises a same number of iterations as a first occurrence of the same program loop sequence of program instructions.Type: GrantFiled: May 31, 2020Date of Patent: April 5, 2022Assignee: TEXAS INSTRUMENTS INCORPORATEDInventors: Kai Chirca, Paul Daniel Gauvreau, David Edward Smith, Jr.
-
Patent number: 11294680Abstract: A microprocessor implemented method is disclosed. The method includes mapping a plurality of instructions in a guest address space to corresponding instructions in a native address space. The method further includes, for each of one or more guest branch instructions in said native address space fetched during execution, performing the following: determining a youngest prior guest branch target stored in a guest branch target register, determining a branch target for a respective guest branch instruction by adding an offset value for said respective guest branch instruction to said youngest prior guest branch target, where said offset value is adjusted to account for a difference in address in said guest address space between an instruction at a beginning of a guest instruction block and a branch instruction in said guest instruction block. The method further includes creating an entry in said guest branch target register for said branch target.Type: GrantFiled: October 31, 2019Date of Patent: April 5, 2022Assignee: Intel CorporationInventor: Mohammad A. Abdallah
-
Patent number: 11294687Abstract: A data bus includes process elements and a linear main pipeline. Each process element is coupled to a linear pipeline having M stages arranged in series, each of the M stages including a buffer element configured to buffer a data bit sequence and to forward the buffered data bit sequence from a first of the buffer elements to a last of the buffer elements. The linear main pipeline includes N pipeline stage elements arranged in series. Each pipeline stage element is connected to the last buffer element of a respective linear pipeline and configured to read-out one or more of the buffered data bit sequences and to forward the read-out data bit sequences from one of N pipeline stag elements to a next of the N pipeline stage elements.Type: GrantFiled: May 22, 2020Date of Patent: April 5, 2022Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.Inventors: Jens Döge, Christoph Hoppe, Peter Reichel
-
Patent number: 11294787Abstract: An apparatus and method are provided to control assertion of a trigger signal to processing circuitry. The apparatus has evaluation circuitry to receive program instruction execution information indicative of a program instruction executed by the processing circuitry, which is arranged to perform an evaluation operation to determine with reference to evaluation information whether the program instruction execution information indicates presence of a trigger condition. Trigger signal generation circuitry is used to assert a trigger signal to the processing circuitry in dependence on whether the trigger condition is determined to be present. Further, filter circuitry is arranged to receive event information indicative of at least one event occurring within the processing circuitry, and is arranged to determine with reference to filter control information and that event information whether a qualifying condition is present.Type: GrantFiled: August 10, 2017Date of Patent: April 5, 2022Assignee: ARM LimitedInventors: François Christopher Jacques Botman, Thomas Christopher Grocutt, John Michael Horley, Michael John Williams
-
Patent number: 11294670Abstract: Embodiments detailed herein relate to reduction operations on a plurality of data element values. In one embodiment, a process comprises decoding circuitry to decode an instruction and execution circuitry to execute the decoded instruction. The instruction specifies a first input register containing a plurality of data element values, a first index register containing a plurality of indices, and an output register, where each index of the plurality of indices maps to one unique data element position of the first input register. The execution includes to identify data element values that are associated with one another based on the indices, perform one or more reduction operations on the associated data element values based on the identification, and store results of the one or more reduction operations in the output register.Type: GrantFiled: March 27, 2019Date of Patent: April 5, 2022Assignee: INTEL CORPORATIONInventors: Christopher J. Hughes, Jonathan D. Pearce, Guei-Yuan Lueh, ElMoustapha Ould-Ahmed-Vall, Jorge E. Parra, Prasoonkumar Surti, Krishna N. Vinod, Ronen Zohar
-
Patent number: 11294682Abstract: A program is executed using a call stack and shadow stack. The call stack includes frames having respective return addresses. The frames may also store variables and/or parameters. The shadow stack stores duplicates of the return addresses in the call stack. The call stack and the shadow stack are maintained by, (i) each time a function is called, adding a corresponding stack frame to the call stack and adding a corresponding return address to the shadow stack, and (ii) each time a function is exited, removing a corresponding frame from the call stack and removing a corresponding return address from the shadow stack. A backtrace of the program's current call chain is generated by accessing the return addresses in the shadow stack. The outputted backtrace includes the return addresses from the shadow stack and/or information about the traced functions that is derived from the shadow stack's return addresses.Type: GrantFiled: May 20, 2019Date of Patent: April 5, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Ben Niu, Gregory John Colombo, Weidong Cui, Jason Lin, Kenneth Dean Johnson
-
Patent number: 11294684Abstract: In an embodiment, an indirect branch predictor generates indirect branch predictions for indirect branch instructions. For relatively static branch instructions, the indirect branch predictor may be configured to use a PC corresponding to the indirect branch instruction to generate a target prediction. The indirect branch predictor may be configured to identify at least one dynamic indirect branch instruction and may use a different PC than the PC corresponding to the indirect branch instruction to generate the target prediction (e.g. the most recent previous PC associated with a taken branch (“the previous taken PC”). For some dynamic indirect branch instructions, the previous taken PC may disambiguate different target addresses (e.g. there may be a correlation between the previous taken PC and the target address of the indirect branch instruction).Type: GrantFiled: January 31, 2020Date of Patent: April 5, 2022Assignee: Apple Inc.Inventor: Ian D. Kountanis
-
Patent number: 11294688Abstract: A parallel-processing computer system is provided for parallel processing of data packets conveyed in a communication network. The system comprises: a memory; a plurality of processing elements; and a program stored at the memory for execution by the plurality of processing elements.Type: GrantFiled: June 1, 2018Date of Patent: April 5, 2022Assignees: DRIVENETS LTD., AT&T SERVICES, INC.Inventors: Amir Krayden, Yuval Moshe, Anton Gartsbein, Gal Zolkover, Or Sadeh, Ori Zakin, Yuval Lev
-
Patent number: 11294690Abstract: Single Program, Multiple Data (SPMD) parallel processing of SPMD instructions can be generated among processors assigned to a task in a plurality of threads. The SPMD parallel processing can be increased in speed by performing predicated looping with the SPMD instructions in an activated SPMD mode of operation over a non-SPMD mode. Execution of overhead instructions is removed from the SPMD instructions associated with a thread in order to only execute the loop body of a loop associated with a data element of a data set in an enhanced Zero Loop Overhead (ZOL) device.Type: GrantFiled: January 29, 2020Date of Patent: April 5, 2022Assignee: Infineon Technologies AGInventor: Prakash Balasubramanian
-
Patent number: 11294671Abstract: Disclosed embodiments relate to systems and methods for performing duplicate detection instructions on two-dimensional (2D) data. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode and locations of a source matrix comprising M×N elements and a destination, the opcode to indicate execution circuitry is to use a plurality of comparators to discover duplicates in the source matrix, and store indications of locations of discovered duplicates in the destination. The execution circuitry to execute the decoded instruction as per the opcode.Type: GrantFiled: December 26, 2018Date of Patent: April 5, 2022Assignee: Intel CorporationInventors: Christopher J. Hughes, Michael Espig, Dan Baum, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall
-
Patent number: 11294685Abstract: Method and systems for creating a sequence of fused instructions. An instruction stream is obtained, and a window of instructions from the instruction stream is examined and one or more groups of instructions that satisfy one or more fusion rules are identified. One or more of the groups of instructions that satisfy the one or more fusion rules are fused and a maximal length data dependence chain in the instruction stream is analyzed by analyzing every node in a dependence graph in a selected window of instructions. Fusion of an instruction group is prevented based on the maximal length data dependence chain.Type: GrantFiled: June 4, 2019Date of Patent: April 5, 2022Assignee: International Business Machines CorporationInventors: Jessica Hui-Chun Tseng, Manoj Kumar, Kattamuri Ekanadham, Jose E. Moreira, Pratap C. Pattnaik