Patents Examined by Courtney P Carmichael-Moody
  • Patent number: 12020032
    Abstract: A prediction unit includes a single-cycle predictor (SCP) configured to provide a series of outputs associated with a respective series of fetch blocks on a first respective series of clock cycles and a fetch block prediction unit (FBPU) configured to use the series of SCP outputs to provide, on a second respective series of clock cycles, a respective series of fetch block descriptors that describe the respective series of fetch blocks. The fetch block descriptors are useable by an instruction fetch unit to fetch the series of fetch blocks from an instruction cache. The second respective series of clock cycles follows the first respective series of clock cycles in a pipelined fashion by a latency of the FBPU.
    Type: Grant
    Filed: August 2, 2022
    Date of Patent: June 25, 2024
    Assignee: Ventana Micro Systems Inc.
    Inventors: John G. Favor, Michael N. Michael
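    A minimal Python sketch of the two-stage prediction flow this abstract describes: a single-cycle predictor (SCP) emits one output per clock, and a fetch block prediction unit (FBPU) consumes each output a fixed number of cycles later to produce a fetch block descriptor. The class names, the FBPU_LATENCY value, and the fixed 32-byte block size are illustrative assumptions, not details taken from the patent.

    ```python
    from collections import deque
    from dataclasses import dataclass

    FBPU_LATENCY = 2  # assumed FBPU pipeline latency, in clock cycles

    @dataclass
    class ScpOutput:
        cycle: int          # cycle on which the SCP produced this output
        predicted_pc: int   # hypothetical per-cycle prediction fed to the FBPU

    @dataclass
    class FetchBlockDescriptor:
        start_pc: int       # where the fetch block begins
        length: int         # assumed fixed-size fetch block for this sketch
        ready_cycle: int    # cycle on which the descriptor becomes available

    def run_prediction_pipeline(start_pc: int, cycles: int):
        """Model the SCP feeding the FBPU in a pipelined fashion."""
        scp_queue = deque()
        descriptors = []
        pc = start_pc
        for cycle in range(cycles):
            # SCP: one output per clock cycle (single-cycle predictor).
            scp_queue.append(ScpOutput(cycle=cycle, predicted_pc=pc))
            pc += 32  # assume sequential 32-byte fetch blocks for illustration
            # FBPU: consumes an SCP output FBPU_LATENCY cycles after it appears.
            if scp_queue and cycle - scp_queue[0].cycle >= FBPU_LATENCY:
                out = scp_queue.popleft()
                descriptors.append(FetchBlockDescriptor(
                    start_pc=out.predicted_pc, length=32, ready_cycle=cycle))
        return descriptors

    if __name__ == "__main__":
        for d in run_prediction_pipeline(0x1000, cycles=6):
            print(hex(d.start_pc), d.ready_cycle)
    ```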
  • Patent number: 12014178
    Abstract: An instruction fetch pipeline includes first, second, and third sub-pipelines that respectively include: a TLB that receives a fetch virtual address, a tag random access memory (RAM) of a physically-indexed physically-tagged set associative instruction cache that receives a predicted set index, and a data RAM that receives the predicted set index and a predicted way number that specifies a way of the entry from which a block of instructions was previously fetched. The predicted set index specifies the instruction cache set that includes the entry. The three sub-pipelines respectively initiate in parallel: a TLB access using the fetch virtual address to obtain a translation thereof into a fetch physical address that includes a tag, a tag RAM access using the predicted set index to read a set of tags, and a data RAM access using the predicted set index and the predicted way number to fetch the block of instructions.
    Type: Grant
    Filed: June 8, 2022
    Date of Patent: June 18, 2024
    Assignee: Ventana Micro Systems Inc.
    Inventors: John G. Favor, Michael N. Michael, Vihar Soneji
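    A small Python sketch of the parallel-access idea in this abstract: with a predicted set index and way, the TLB lookup, tag RAM read, and data RAM read can all start in the same cycle, and the later tag comparison simply confirms or rejects the prediction. Structure sizes, the page-number tag, and the flat dict TLB are assumptions for illustration.

    ```python
    NUM_SETS, NUM_WAYS, LINE_BYTES = 64, 4, 64   # illustrative sizes only
    tlb = {0x4000: 0x9000}                        # virtual page -> physical page
    tag_ram = [[0] * NUM_WAYS for _ in range(NUM_SETS)]
    data_ram = [[b""] * NUM_WAYS for _ in range(NUM_SETS)]

    def fetch(vaddr: int, predicted_set: int, predicted_way: int):
        """Start the three accesses 'in parallel'; then verify the prediction."""
        # Sub-pipeline 1: TLB access with the fetch virtual address.
        paddr = tlb[vaddr & ~0xFFF] | (vaddr & 0xFFF)
        # Sub-pipeline 2: tag RAM access with the predicted set index.
        tags = tag_ram[predicted_set]
        # Sub-pipeline 3: data RAM access with predicted set index and way.
        speculative_data = data_ram[predicted_set][predicted_way]
        # Late check: does the physical tag match the predicted way's tag?
        phys_tag = paddr >> 12        # assume the tag is the physical page number
        if tags[predicted_way] == phys_tag:
            return speculative_data   # prediction correct: data already in hand
        return None                   # mispredict: caller must replay the fetch

    if __name__ == "__main__":
        s, w = 5, 2
        tag_ram[s][w] = 0x9                      # physical page 0x9000 >> 12
        data_ram[s][w] = b"\x13" * LINE_BYTES
        print(fetch(0x4000, predicted_set=s, predicted_way=w) is not None)
    ```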
  • Patent number: 12001845
    Abstract: An apparatus comprises first instruction execution circuitry, second instruction execution circuitry, and a decoupled access buffer. Instructions of an ordered sequence of instructions are issued to one of the first and second instruction execution circuitry for execution in dependence on whether the instruction has a first type label or a second type label. An instruction with the first type label is an access-related instruction which determines at least one characteristic of a load operation to retrieve a data value from a memory address. Instruction execution by the first instruction execution circuitry of instructions having the first type label is prioritised over instruction execution by the second instruction execution circuitry of instructions having the second type label. Data values retrieved from memory as a result of execution of the first type instructions are stored in the decoupled access buffer.
    Type: Grant
    Filed: October 15, 2020
    Date of Patent: June 4, 2024
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Stefanos Kaxiras
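    A minimal Python sketch of the decoupled access/execute split described above: access-labelled instructions run ahead and deposit loaded values into a decoupled access buffer, from which execute-labelled instructions later consume them. The labels, FIFO buffer, and two-pass scheduling are illustrative assumptions about one way the prioritisation could behave.

    ```python
    from collections import deque

    ACCESS, EXECUTE = "access", "execute"   # the two type labels from the abstract

    def run(program, memory):
        """Two-pass model: access instructions are prioritised and their loaded
        values are parked in a decoupled access buffer for the execute stream."""
        decoupled_access_buffer = deque()
        results = []
        # Pass 1: access-related instructions (first type label) run first.
        for label, op in program:
            if label == ACCESS:
                addr = op()                              # op yields a load address
                decoupled_access_buffer.append(memory[addr])
        # Pass 2: execute instructions consume buffered values in program order.
        for label, op in program:
            if label == EXECUTE:
                results.append(op(decoupled_access_buffer.popleft()))
        return results

    if __name__ == "__main__":
        mem = {0x10: 7, 0x14: 35}
        prog = [
            (ACCESS,  lambda: 0x10),          # load from 0x10
            (ACCESS,  lambda: 0x14),          # load from 0x14
            (EXECUTE, lambda v: v * 2),       # uses first loaded value
            (EXECUTE, lambda v: v + 1),       # uses second loaded value
        ]
        print(run(prog, mem))                 # -> [14, 36]
    ```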
  • Patent number: 11983507
    Abstract: A differential multiplier-accumulator accepts A and B digital inputs plus a sign bit and generates a dot product P by applying the bits of the A input and the bits of the B input to respective positive and negative unit elements comprised of groups of AND gates coupled to charge transfer lines through a capacitor Cu. One of the positive and negative unit elements is enabled by the sign bit; the enabled unit element receives one bit of the B input applied to all of its AND gates, and each positive and negative unit element has the bits of A applied to the associated AND gate inputs, which charge the charge transfer lines. The charge transfer lines are coupled to binary-weighted charge-summing capacitors and to an analog-to-digital converter to generate a digital output product.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: May 14, 2024
    Assignee: Ceremorphic, Inc.
    Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong
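    A behavioural Python model of the signed charge-domain dot product this abstract describes: each B bit gates the A bits through AND gates onto positive or negative charge-transfer lines (selected per element by the sign bit), and the binary-weighted summation plus analog-to-digital conversion is idealised as exact arithmetic. The 4-bit operand width and the ideal ADC are assumptions for illustration, not the patent's circuit parameters.

    ```python
    def signed_charge_domain_mac(a_values, b_values, signs, bits=4):
        """Behavioural model: AND-gate products accumulate unit charge on a
        positive or negative transfer line (chosen per element by the sign bit);
        the binary-weighted summation + ADC is idealised as exact arithmetic."""
        pos_charge = [0] * (2 * bits)   # charge per binary weight, positive lines
        neg_charge = [0] * (2 * bits)   # charge per binary weight, negative lines
        for a, b, sign in zip(a_values, b_values, signs):
            lines = neg_charge if sign else pos_charge  # sign bit enables one side
            for i in range(bits):          # each bit of B drives one unit element
                if (b >> i) & 1:
                    for j in range(bits):  # AND gates: one per bit of A
                        lines[i + j] += (a >> j) & 1    # unit charge through Cu
        # Binary-weighted charge-summing capacitors + ADC, idealised.
        return sum((p - n) << w
                   for w, (p, n) in enumerate(zip(pos_charge, neg_charge)))

    if __name__ == "__main__":
        a, b, s = [3, 5], [2, 7], [0, 1]   # computes 3*2 - 5*7 = -29
        print(signed_charge_domain_mac(a, b, s))
    ```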
  • Patent number: 11977893
    Abstract: An instruction fetch pipeline includes first, second, and third sub-pipelines that respectively include: a TLB that receives a fetch virtual address, a tag random access memory (RAM) of a physically-indexed physically-tagged set associative instruction cache that receives a predicted set index, and a data RAM that receives the predicted set index and a predicted way number that specifies a way of the entry from which a block of instructions was previously fetched. The predicted set index specifies the instruction cache set that includes the entry. The three sub-pipelines respectively initiate in parallel: a TLB access using the fetch virtual address to obtain a translation thereof into a fetch physical address that includes a tag, a tag RAM access using the predicted set index to read a set of tags, and a data RAM access using the predicted set index and the predicted way number to fetch the block of instructions.
    Type: Grant
    Filed: June 8, 2022
    Date of Patent: May 7, 2024
    Assignee: Ventana Micro Systems Inc.
    Inventors: John G. Favor, Michael N. Michael, Vihar Soneji
  • Patent number: 11977936
    Abstract: A differential multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B input to respective positive and negative unit elements comprised of groups of AND gates coupled to charge transfer lines through a capacitor Cu. Each positive and negative unit element receives one bit of the B input applied to all of its AND gates and has the bits of A applied to the associated AND gate inputs. The AND gates are coupled to the charge transfer lines through a capacitor Cu, and the charge transfer lines couple to binary-weighted charge-summing capacitors and to an analog-to-digital converter to generate a digital output product. The charge transfer lines may span multiple unit elements.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: May 7, 2024
    Assignee: Ceremorphic, Inc.
    Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong
  • Patent number: 11972264
    Abstract: Processing circuitry performs processing operations in response to micro-operations. Front end circuitry supplies the micro-operations to be processed by the processing circuitry. Prediction circuitry generates a prediction of a number of loop iterations for which one or more micro-operations per loop iteration are to be supplied by the front end circuitry, where an actual number of loop iterations to be processed by the processing circuitry is resolvable by the processing circuitry based on at least one operand corresponding to a first loop iteration to be processed by the processing circuitry. The front end circuitry varies, based on a level of confidence in the prediction of the number of loop iterations, a supply rate with which the one or more micro-operations for at least a subset of the loop iterations are supplied to the processing circuitry.
    Type: Grant
    Filed: June 13, 2022
    Date of Patent: April 30, 2024
    Inventors: Guillaume Bolbenes, Thibaut Elie Lanois, Houdhaifa Bouzguarrou, Luca Nassi
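    A small Python sketch of the idea in this abstract: the front end predicts a loop's iteration count and varies the rate at which per-iteration micro-operations are supplied according to confidence in that prediction. The confidence threshold, batch sizes, and micro-ops per iteration are illustrative assumptions.

    ```python
    def supply_micro_ops(predicted_iters: int, confidence: int,
                         high_confidence: int = 3, uops_per_iter: int = 2):
        """Yield (iteration, uop) pairs with a supply rate that depends on
        confidence in the iteration-count prediction: run ahead at full rate
        when confident, one iteration at a time otherwise (awaiting resolution
        of the actual count by the back end, which is not modelled here)."""
        batch = predicted_iters if confidence >= high_confidence else 1
        supplied = 0
        while supplied < predicted_iters:
            for it in range(supplied, min(supplied + batch, predicted_iters)):
                for u in range(uops_per_iter):
                    yield (it, u)
            supplied += batch

    if __name__ == "__main__":
        print(len(list(supply_micro_ops(4, confidence=3))))  # 8 uops, full rate
        print(len(list(supply_micro_ops(4, confidence=1))))  # same uops, trickled
    ```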
  • Patent number: 11960893
    Abstract: A method, programming product, and/or system for prefetching instructions includes an instruction prefetch table that has a plurality of entries, each entry for storing a first portion of an indirect branch instruction address and a target address, wherein the indirect branch instruction has multiple target addresses and the instruction prefetch table is accessed by an index obtained by hashing a second portion of bits of the indirect branch instruction address with an information vector of the indirect branch instruction. A further embodiment includes a first prefetch table for uni-target branch instructions and a second prefetch table for multi-target branch instructions. In operation it is determined whether a branch instruction hits in one of the multiple prefetch tables; a target address for the branch instruction is read from the respective prefetch table in which the branch instruction hit; and the branch instruction is prefetched to an instruction cache.
    Type: Grant
    Filed: December 29, 2021
    Date of Patent: April 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Naga P. Gorti, Mohit Karve
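    A minimal Python sketch of the table organisation this abstract describes: a multi-target prefetch table indexed by hashing a portion of the indirect branch address with an information vector (treated here as a path-history value, which is an assumption), alongside a uni-target table for single-target branches; whichever table hits supplies the target to prefetch into the instruction cache. Table size, tag width, and the XOR hash are illustrative.

    ```python
    TABLE_SIZE = 256

    uni_target_table = {}     # branch address -> single target
    multi_target_table = {}   # hashed index -> (partial tag, target)

    def multi_index(branch_addr: int, info_vector: int) -> int:
        """Hash a portion of the branch address with the information vector."""
        return ((branch_addr >> 2) ^ info_vector) % TABLE_SIZE

    def train(branch_addr: int, info_vector: int, target: int, multi_target: bool):
        if multi_target:
            idx = multi_index(branch_addr, info_vector)
            multi_target_table[idx] = (branch_addr >> 12, target)  # partial tag
        else:
            uni_target_table[branch_addr] = target

    def prefetch_target(branch_addr: int, info_vector: int):
        """Probe both tables; a hit returns the target to prefetch into the I-cache."""
        entry = multi_target_table.get(multi_index(branch_addr, info_vector))
        if entry and entry[0] == branch_addr >> 12:
            return entry[1]
        return uni_target_table.get(branch_addr)    # None means no prefetch

    if __name__ == "__main__":
        train(0x40_1000, info_vector=0b1011, target=0x40_2000, multi_target=True)
        print(hex(prefetch_target(0x40_1000, info_vector=0b1011)))  # 0x402000
        print(prefetch_target(0x40_1000, info_vector=0b0000))       # miss -> None
    ```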
  • Patent number: 11941401
    Abstract: An apparatus includes a processor circuit that includes a return address stack circuit, a return prediction circuit, and a fetch control circuit. The return prediction circuit is configured to store, for previously accessed return addresses, fetch parameters for next fetch addresses. The fetch control circuit is configured to, in response to a fetch of a call instruction, push a return address onto the return address stack circuit. In response to a fetch of a return instruction that corresponds to the call instruction, the fetch control circuit is further configured to retrieve the return address from the return address stack circuit, and to create, using the return address and fetch parameters retrieved from the return prediction circuit, a next fetch request to retrieve instructions subsequent to the return instruction.
    Type: Grant
    Filed: June 9, 2022
    Date of Patent: March 26, 2024
    Assignee: Apple Inc.
    Inventors: Pruthivi Vuyyuru, Ian D. Kountanis
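    A small Python sketch of the mechanism described above: calls push return addresses onto a return address stack; on a return, the popped address is combined with per-address fetch parameters cached in a return prediction table to form the next fetch request. The FetchRequest fields and the parameter table contents are illustrative assumptions.

    ```python
    from dataclasses import dataclass

    @dataclass
    class FetchRequest:
        address: int
        width: int        # assumed fetch parameter: how many bytes to fetch next
        sequential: bool  # assumed fetch parameter: no early branch expected

    return_address_stack = []            # return address stack circuit
    return_prediction = {}               # return address -> fetch parameters

    def on_call_fetch(call_pc: int, call_size: int = 4):
        """Push the address of the instruction after the call."""
        return_address_stack.append(call_pc + call_size)

    def on_return_fetch(default_width: int = 32) -> FetchRequest:
        """Pop the predicted return address and build the next fetch request
        using any fetch parameters remembered for that address."""
        ret_addr = return_address_stack.pop()
        params = return_prediction.get(ret_addr, {"width": default_width,
                                                  "sequential": True})
        return FetchRequest(address=ret_addr, **params)

    if __name__ == "__main__":
        return_prediction[0x1004] = {"width": 16, "sequential": False}
        on_call_fetch(0x1000)
        print(on_return_fetch())   # FetchRequest for 0x1004 with cached parameters
    ```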
  • Patent number: 11922168
    Abstract: A program is executed using a call stack and shadow stack. The call stack includes frames having respective return addresses. The frames may also store variables and/or parameters. The shadow stack stores duplicates of the return addresses in the call stack. The call stack and the shadow stack are maintained by, (i) each time a function is called, adding a corresponding stack frame to the call stack and adding a corresponding return address to the shadow stack, and (ii) each time a function is exited, removing a corresponding frame from the call stack and removing a corresponding return address from the shadow stack. A backtrace of the program's current call chain is generated by accessing the return addresses in the shadow stack. The outputted backtrace includes the return addresses from the shadow stack and/or information about the traced functions that is derived from the shadow stack's return addresses.
    Type: Grant
    Filed: March 23, 2022
    Date of Patent: March 5, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ben Niu, Gregory John Colombo, Weidong Cui, Jason Lin, Kenneth Dean Johnson
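    A Python model of the shadow-stack backtrace idea in this abstract: every call pushes a frame on the call stack and a duplicate return address on the shadow stack, every exit pops both, and the backtrace is produced purely by walking the shadow stack's return addresses. The frame layout and symbol table are illustrative assumptions; the patent concerns native stacks, not Python's.

    ```python
    call_stack = []     # frames: return address plus locals/parameters
    shadow_stack = []   # duplicates of the return addresses only

    symbol_table = {0x1100: "main+0x10", 0x2240: "parse+0x40"}  # assumed symbols

    def enter_function(return_address: int, local_vars=None):
        call_stack.append({"ret": return_address, "locals": local_vars or {}})
        shadow_stack.append(return_address)          # keep the duplicate in sync

    def exit_function():
        call_stack.pop()
        shadow_stack.pop()

    def backtrace():
        """Build the current call chain from the shadow stack alone, so corrupted
        call-stack frames cannot derail the walk."""
        return [f"{addr:#x} {symbol_table.get(addr, '<unknown>')}"
                for addr in reversed(shadow_stack)]

    if __name__ == "__main__":
        enter_function(0x1100, {"argc": 1})   # main called something
        enter_function(0x2240)                # which called something else
        print("\n".join(backtrace()))
        exit_function()
        exit_function()
    ```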
  • Patent number: 11915006
    Abstract: A method, system and device for pipeline processing of instructions and a computer storage medium. The method comprises: acquiring a target instruction set (S101); acquiring a target prediction result, wherein the target prediction result is a result obtained by predicting a jump mode of the target instruction set (S102); performing pipeline processing on the target instruction set according to the target prediction result (S103); determining if a pipeline flushing request is received (S104); and if so, correspondingly saving the target instruction set and a corresponding pipeline processing result, so as to perform pipeline processing on the target instruction set again on the basis of the pipeline processing result (S105).
    Type: Grant
    Filed: November 28, 2019
    Date of Patent: February 27, 2024
    Assignee: INSPUR SUZHOU INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Yulong Zhou, Tongqiang Liu, Xiaofeng Zou
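    A compact Python sketch of the save-and-replay step (S104/S105) in this abstract: when a pipeline flush request arrives, the in-flight instruction set and its partial pipeline result are saved together so a later pass can resume from that result rather than starting over. The data structures and the string-tagged "result" are illustrative only.

    ```python
    saved_work = {}   # instruction-set id -> partially completed pipeline result

    def pipeline_process(set_id, instructions, prediction, flush_requested):
        """Process an instruction set under a jump-mode prediction; on a flush
        request, save the set together with its partial result (S105) so it can
        be pipeline-processed again on the basis of that result."""
        partial_result = saved_work.pop(set_id, [])   # resume if previously saved
        for insn in instructions[len(partial_result):]:
            partial_result.append(f"{insn}@{prediction}")
            if flush_requested:                        # S104: flush request seen
                saved_work[set_id] = partial_result    # S105: save set + result
                return None
        return partial_result                          # completed normally

    if __name__ == "__main__":
        prog = ["i0", "i1", "i2"]
        pipeline_process("setA", prog, prediction="taken", flush_requested=True)
        print(pipeline_process("setA", prog, prediction="not-taken",
                               flush_requested=False))
    ```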
  • Patent number: 11915002
    Abstract: Providing extended branch target buffer (BTB) entries for storing trunk branch metadata and leaf branch metadata is disclosed herein. In one aspect, a processor comprises a BTB circuit comprising a BTB comprising a plurality of extended BTB entries. The BTB circuit is configured to store trunk branch metadata for a first branch instruction in an extended BTB entry of the plurality of extended BTB entries, wherein the extended BTB entry corresponds to a first aligned memory block containing an address of the first branch instruction. The BTB circuit is also configured to store leaf branch metadata for a second branch instruction in the extended BTB entry in association with the trunk branch metadata, wherein an address of the second branch instruction is subsequent to a target address of the first branch instruction within a second aligned memory block.
    Type: Grant
    Filed: June 24, 2022
    Date of Patent: February 27, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Saransh Jain, Rami Mohammad Al Sheikh, Daren Eugene Streett, Michael Scott McIlvaine
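    A small Python sketch of the extended-BTB layout described above: one entry, indexed by the aligned memory block containing the trunk branch, holds the trunk branch metadata and, in the same entry, leaf metadata for a branch that lies after the trunk's target within another aligned block. The 64-byte block size and metadata fields are illustrative assumptions.

    ```python
    BLOCK_BYTES = 64                      # assumed aligned memory block size

    def block_of(addr: int) -> int:
        return addr // BLOCK_BYTES

    btb = {}   # aligned block of trunk branch -> extended BTB entry

    def install_trunk(branch_addr: int, target: int):
        btb[block_of(branch_addr)] = {
            "trunk": {"addr": branch_addr, "target": target},
            "leaf": None,                 # filled in later, in the same entry
        }

    def install_leaf(trunk_addr: int, leaf_addr: int, leaf_target: int):
        """Attach leaf metadata to the trunk's entry: the leaf branch sits after
        the trunk's target, within the target's aligned block."""
        entry = btb[block_of(trunk_addr)]
        assert block_of(leaf_addr) == block_of(entry["trunk"]["target"])
        entry["leaf"] = {"addr": leaf_addr, "target": leaf_target}

    def predict(fetch_addr: int):
        """One lookup on the trunk's block can yield both the trunk and the
        follow-on leaf prediction, saving a second BTB access."""
        entry = btb.get(block_of(fetch_addr))
        if not entry:
            return []
        return [entry["trunk"]] + ([entry["leaf"]] if entry["leaf"] else [])

    if __name__ == "__main__":
        install_trunk(0x1010, target=0x2008)
        install_leaf(0x1010, leaf_addr=0x2030, leaf_target=0x3000)
        print(predict(0x1000))
    ```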
  • Patent number: 11900124
    Abstract: Various embodiments are disclosed of a multiprocessor system with processing elements optimized for high performance and low power dissipation and an associated method of programming the processing elements. Each processing element may comprise a fetch unit, a plurality of address generator units, and a plurality of pipelined datapaths. The fetch unit may be configured to receive a multi-part instruction, wherein the multi-part instruction includes a plurality of fields. First and second address generator units may generate, based on different fields of the multi-part instruction, addresses from which to retrieve first and second data for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction. The execution units may perform operations using a single pipeline or multiple pipelines based on third and fourth fields of the multi-part instruction.
    Type: Grant
    Filed: January 3, 2023
    Date of Patent: February 13, 2024
    Assignee: Coherent Logix, Incorporated
    Inventors: Michael B Doerr, Carl S. Dobbs, Michael B. Solka, Michael R. Trocino, Kenneth R. Faulkner, Keith M. Bindloss, Sumeer Arya, John Mark Beardslee, David A. Gibson
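    A Python sketch of processing one multi-part instruction as described above: separate fields of a single fetched instruction drive two address generator units and select the operation and pipeline usage. The field names, base+offset address generation, and the operation set are illustrative assumptions.

    ```python
    from dataclasses import dataclass

    @dataclass
    class MultiPartInstruction:
        agu0_field: dict    # e.g. {"base": ..., "offset": ...} for source A address
        agu1_field: dict    # address generation field for source B
        exec_field: str     # third field: selects the operation
        pipe_field: int     # fourth field: how many datapaths/pipelines to use

    def address_generator(field: dict) -> int:
        return field["base"] + field["offset"]   # simple base+offset AGU model

    def execute(insn: MultiPartInstruction, memory: dict):
        # The fetch unit has delivered the multi-part instruction; the two AGUs
        # work from different fields to produce the two operand addresses.
        a = memory[address_generator(insn.agu0_field)]
        b = memory[address_generator(insn.agu1_field)]
        op = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}[insn.exec_field]
        # pipe_field chooses single vs. multiple pipelines; modelled as replication.
        return [op(a, b)] * insn.pipe_field

    if __name__ == "__main__":
        mem = {0x100: 6, 0x204: 7}
        insn = MultiPartInstruction({"base": 0x100, "offset": 0},
                                    {"base": 0x200, "offset": 4}, "mul", 2)
        print(execute(insn, mem))   # -> [42, 42]
    ```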
  • Patent number: 11900120
    Abstract: Systems and methods of selecting a collection of compatible issue-ready instructions for parallel execution by functional units in a superscalar processor in a single clock cycle. All possible instructions (opcodes) to be executed by the functional units are pre-arranged into several scenarios based on potential resource conflicts among the instructions. Each scenario includes multiple groups of predefined instructions. During operation, concurrently for all the groups, an issue-ready instruction is identified with reference to each group based on group-specific selection policies. Further, based on the identified instructions, predefined policies are applied to select one or more scenarios and select among the picks of the selected scenarios. As a result, the output instructions of the selected scenarios are issued for parallel execution by the functional units.
    Type: Grant
    Filed: March 24, 2022
    Date of Patent: February 13, 2024
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: David Carlson
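    A small Python sketch of the scenario-based selection described above: opcodes are pre-arranged into scenarios, each a set of groups that cannot conflict; one issue-ready instruction is picked per group concurrently, and a policy then chooses among the scenarios' picks. The scenario contents and the "most instructions covered" policy are illustrative assumptions.

    ```python
    # Scenarios pre-arrange opcodes into conflict-free groups (illustrative only).
    SCENARIOS = {
        "alu_heavy": [["add", "sub"], ["and", "or"]],
        "mem_mix":   [["load", "store"], ["add", "sub"]],
    }

    def pick_from_group(group, ready):
        """Group-specific policy: oldest ready instruction whose opcode is in the group."""
        for idx, opcode in ready:                 # ready list is age-ordered
            if opcode in group:
                return (idx, opcode)
        return None

    def select_issue_bundle(ready):
        """Concurrently pick per group for every scenario, then choose the scenario
        whose picks cover the most instructions (the assumed scenario policy)."""
        best = []
        for groups in SCENARIOS.values():
            picks = [p for g in groups if (p := pick_from_group(g, ready))]
            if len(picks) > len(best):
                best = picks
        return best

    if __name__ == "__main__":
        ready = [(0, "load"), (1, "add"), (2, "or")]   # (age index, opcode)
        print(select_issue_bundle(ready))              # -> [(1, 'add'), (2, 'or')]
    ```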
  • Patent number: 11886875
    Abstract: Disclosed embodiments relate to systems and methods for performing nibble-sized operations on matrix elements. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction, the fetched instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode to indicate the processor is to, for each pair of corresponding elements of the first and second source matrices, logically partition each element into nibble-sized partitions, perform an operation indicated by the instruction on each partition, and store execution results to a corresponding nibble-sized partition of a corresponding element of the destination matrix. The exemplary processor includes execution circuitry to execute the decoded instruction as per the opcode.
    Type: Grant
    Filed: December 26, 2018
    Date of Patent: January 30, 2024
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Jonathan D. Pearce, Dan Baum, Guei-Yuan Lueh, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
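    A Python sketch of the nibble-partitioned element operation this abstract describes: for each pair of corresponding matrix elements, every 4-bit partition is operated on independently and the result is written into the matching nibble of the destination element. The 32-bit element width and the choice of wrap-around addition as the operation are illustrative assumptions.

    ```python
    def nibble_op(a: int, b: int, elem_bits: int = 32, op=lambda x, y: (x + y) & 0xF):
        """Apply op to each corresponding 4-bit partition of a and b, placing the
        result in the same nibble position of the destination element."""
        result = 0
        for shift in range(0, elem_bits, 4):
            na = (a >> shift) & 0xF
            nb = (b >> shift) & 0xF
            result |= op(na, nb) << shift
        return result

    def matrix_nibble_op(src1, src2, **kw):
        """Element-wise over two equally shaped matrices (lists of lists)."""
        return [[nibble_op(a, b, **kw) for a, b in zip(r1, r2)]
                for r1, r2 in zip(src1, src2)]

    if __name__ == "__main__":
        m1 = [[0x12345678, 0x0000000F]]
        m2 = [[0x11111111, 0x00000001]]
        print([hex(v) for v in matrix_nibble_op(m1, m2)[0]])
        # nibble-wise add with wrap-around: 0x23456789 and 0x0
    ```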
  • Patent number: 11880685
    Abstract: An instruction fetch pipeline includes first, second, and third sub-pipelines that respectively include: a TLB that receives a fetch virtual address, a tag random access memory (RAM) of a physically-indexed physically-tagged set associative instruction cache that receives a predicted set index, and a data RAM that receives the predicted set index and a predicted way number that specifies a way of the entry from which a block of instructions was previously fetched. The predicted set index specifies the instruction cache set that includes the entry. The three sub-pipelines respectively initiate in parallel: a TLB access using the fetch virtual address to obtain a translation thereof into a fetch physical address that includes a tag, a tag RAM access using the predicted set index to read a set of tags, and a data RAM access using the predicted set index and the predicted way number to fetch the block of instructions.
    Type: Grant
    Filed: June 8, 2022
    Date of Patent: January 23, 2024
    Assignee: Ventana Micro Systems Inc.
    Inventors: John G. Favor, Michael N. Michael, Vihar Soneji
  • Patent number: 11875155
    Abstract: An integrated circuit comprising instruction processing circuitry for processing a plurality of program instructions and instruction prediction circuitry. The instruction prediction circuitry comprises circuitry for detecting successive occurrences of a same program loop sequence of program instructions. The instruction prediction circuitry also comprises circuitry for predicting a number of iterations of the same program loop sequence of program instructions, in response to detecting, by the circuitry for detecting, that a second occurrence of the same program loop sequence of program instructions comprises a same number of iterations as a first occurrence of the same program loop sequence of program instructions.
    Type: Grant
    Filed: January 19, 2022
    Date of Patent: January 16, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Kai Chirca, Paul Daniel Gauvreau, David Edward Smith, Jr.
  • Patent number: 11861368
    Abstract: A first type of prediction, for controlling execution of at least one instruction by processing circuitry, is based at least on a first prediction table storing prediction information looked up based on at least a first portion of branch history information stored in branch history storage corresponding to a first predetermined number of branches. In response to detecting an execution state switch of the processing circuitry from a first execution state to a second, more privileged, execution state, use of the first prediction table for determining the first type of prediction is disabled. In response to detecting that a number of branches causing an update to the branch history storage since the execution state switch is greater than or equal to the first predetermined number, use of the first prediction table in determining the first type of prediction is re-enabled.
    Type: Grant
    Filed: May 24, 2022
    Date of Patent: January 2, 2024
    Assignee: Arm Limited
    Inventors: Houdhaifa Bouzguarrou, Michael Brian Schinzler, Yasuo Ishii, Jatin Bhartia, Sumanth Chengad Raghu
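    A Python sketch of the gating described above: when execution switches to a more privileged state, the history-based first prediction table is disabled, and it is re-enabled only once at least N branches (N being the history length the table consumes) have updated the branch history since the switch. The counter handling is an illustrative assumption.

    ```python
    class HistoryGatedPredictor:
        """The first prediction table uses the most recent N branches of history;
        it is distrusted until the stored history no longer spans the older, less
        privileged execution state."""

        def __init__(self, history_length: int):
            self.n = history_length        # first predetermined number of branches
            self.table_enabled = True
            self.branches_since_switch = None   # None: no pending re-enable

        def on_execution_state_switch_to_privileged(self):
            self.table_enabled = False
            self.branches_since_switch = 0

        def on_branch_history_update(self):
            if self.branches_since_switch is not None:
                self.branches_since_switch += 1
                if self.branches_since_switch >= self.n:
                    self.table_enabled = True   # history is now all post-switch
                    self.branches_since_switch = None

        def may_use_first_table(self) -> bool:
            return self.table_enabled

    if __name__ == "__main__":
        p = HistoryGatedPredictor(history_length=3)
        p.on_execution_state_switch_to_privileged()
        for _ in range(3):
            print(p.may_use_first_table(), end=" ")   # False False False
            p.on_branch_history_update()
        print(p.may_use_first_table())                # True
    ```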
  • Patent number: 11853763
    Abstract: A new device executing an application on a new central processing unit (CPU), determines whether the application is for a legacy device having a legacy CPU. When the new device determines that the application is for the legacy device, it executes the application on the new CPU with selected available resources of the new device restricted to approximate or match a processing behavior of the legacy CPU, e.g., by reducing a usable portion of a return address stack of the new CPU and thereby reducing a number of calls and associated returns that can be tracked.
    Type: Grant
    Filed: June 29, 2022
    Date of Patent: December 26, 2023
    Assignee: SONY INTERACTIVE ENTERTAINMENT LLC
    Inventors: Mark Evan Cerny, David Simpson
  • Patent number: 11853765
    Abstract: The disclosure includes a method of authenticating a processor that includes an arithmetic and logic unit. At least one decoded operand of at least a portion of a to-be-executed opcode is received on a first terminal of the arithmetic and logic unit. A signed instruction is received on a second terminal of the arithmetic and logic unit. The signed instruction combines a decoded instruction of the to-be-executed opcode and a previous calculation result of the arithmetic and logic unit.
    Type: Grant
    Filed: April 14, 2022
    Date of Patent: December 26, 2023
    Assignees: STMICROELECTRONICS (ROUSSET) SAS, PROTON WORLD INTERNATIONAL N.V.
    Inventors: Michael Peeters, Fabrice Marinet
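    A behavioural Python sketch of the chaining idea in this abstract: each signed instruction presented to the ALU binds the decoded instruction to the previous calculation result, so tampering with the instruction stream breaks the chain. The HMAC construction and shared key are illustrative assumptions standing in for the patent's signing scheme, which is not specified in the abstract.

    ```python
    import hashlib
    import hmac

    KEY = b"device-secret"     # assumed shared secret for this sketch

    def sign_instruction(decoded_instruction: bytes, previous_result: int) -> bytes:
        """Combine the decoded instruction with the previous ALU result."""
        msg = decoded_instruction + previous_result.to_bytes(8, "little", signed=True)
        return hmac.new(KEY, msg, hashlib.sha256).digest()

    class AuthenticatedALU:
        def __init__(self):
            self.previous_result = 0

        def execute(self, operands, decoded_instruction: bytes, signed_instruction: bytes):
            # Second terminal: verify the signed instruction against the chain.
            expected = sign_instruction(decoded_instruction, self.previous_result)
            if not hmac.compare_digest(expected, signed_instruction):
                raise RuntimeError("instruction stream failed authentication")
            # First terminal: the decoded operands; only 'add' is modelled here.
            self.previous_result = sum(operands)
            return self.previous_result

    if __name__ == "__main__":
        alu = AuthenticatedALU()
        sig = sign_instruction(b"add", alu.previous_result)
        print(alu.execute([2, 3], b"add", sig))          # 5, chain advances
        sig = sign_instruction(b"add", alu.previous_result)
        print(alu.execute([5, 7], b"add", sig))          # 12
    ```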