Patents Examined by Courtney P Carmichael-Moody
-
Patent number: 12020032
Abstract: A prediction unit includes a single-cycle predictor (SCP) configured to provide a series of outputs associated with a respective series of fetch blocks on a first respective series of clock cycles and a fetch block prediction unit (FBPU) configured to use the series of SCP outputs to provide, on a second respective series of clock cycles, a respective series of fetch block descriptors that describe the respective series of fetch blocks. The fetch block descriptors are useable by an instruction fetch unit to fetch the series of fetch blocks from an instruction cache. The second respective series of clock cycles follows the first respective series of clock cycles in a pipelined fashion by a latency of the FBPU.
Type: Grant
Filed: August 2, 2022
Date of Patent: June 25, 2024
Assignee: Ventana Micro Systems Inc.
Inventors: John G. Favor, Michael N. Michael
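The timing relationship described above (the FBPU's descriptor stream trailing the SCP's output stream by a fixed pipeline latency) can be illustrated with a toy model. The 3-cycle latency, the block count, and all names below are illustrative assumptions, not details from the patent.

```c
/* Toy timing model: the SCP emits one output per clock for consecutive
 * fetch blocks, and the FBPU turns each output into a fetch block
 * descriptor a fixed number of cycles later, so descriptors also stream
 * out one per clock, offset by the FBPU latency. */
#include <stdio.h>

#define FBPU_LATENCY 3   /* assumed FBPU pipeline depth */
#define NUM_BLOCKS   5

int main(void) {
    for (int block = 0; block < NUM_BLOCKS; block++) {
        int scp_cycle  = block;                 /* SCP output for this block    */
        int desc_cycle = block + FBPU_LATENCY;  /* descriptor ready this cycle  */
        printf("block %d: SCP output on cycle %d, descriptor on cycle %d\n",
               block, scp_cycle, desc_cycle);
    }
    return 0;
}
```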
-
Patent number: 12014178
Abstract: An instruction fetch pipeline includes first, second, and third sub-pipelines that respectively include: a TLB that receives a fetch virtual address, a tag random access memory (RAM) of a physically-indexed physically-tagged set associative instruction cache that receives a predicted set index, and a data RAM that receives the predicted set index and a predicted way number that specifies a way of the entry from which a block of instructions was previously fetched. The predicted set index specifies the instruction cache set that includes the entry. The three sub-pipelines respectively initiate in parallel: a TLB access using the fetch virtual address to obtain a translation thereof into a fetch physical address that includes a tag, a tag RAM access using the predicted set index to read a set of tags, and a data RAM access using the predicted set index and the predicted way number to fetch the block of instructions.
Type: Grant
Filed: June 8, 2022
Date of Patent: June 18, 2024
Assignee: Ventana Micro Systems Inc.
Inventors: John G. Favor, Michael N. Michael, Vihar Soneji
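As a rough software analogue of the parallel accesses described above, the sketch below starts the address translation, the tag read, and the speculative data read from the same inputs and uses the tag compare to confirm the predicted set and way. Structure sizes, the dummy translation, and all field names are assumptions for illustration, not the patented design.

```c
/* Sketch: with a predicted set index and way, the TLB lookup, tag-RAM
 * read, and data-RAM read can all start at once; the tag compare at the
 * end confirms (or rejects) the speculatively fetched block. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_SETS  64
#define NUM_WAYS  4
#define BLOCK_LEN 64

typedef struct {
    uint64_t tags[NUM_SETS][NUM_WAYS];
    uint8_t  data[NUM_SETS][NUM_WAYS][BLOCK_LEN];
} icache_t;

/* Stand-in for the TLB sub-pipeline: translate the virtual fetch address. */
static uint64_t tlb_translate(uint64_t vaddr) {
    return vaddr ^ 0x100000ull;   /* dummy translation for the sketch */
}

/* All three "sub-pipelines" start from the same inputs; returns true when
 * the tag compare shows the speculatively read block is the right one.   */
bool fetch_block(const icache_t *ic, uint64_t fetch_vaddr,
                 unsigned pred_set, unsigned pred_way,
                 uint8_t out[BLOCK_LEN]) {
    uint64_t paddr     = tlb_translate(fetch_vaddr);       /* sub-pipeline 1 */
    uint64_t fetch_tag = paddr >> 12;                      /* tag portion    */
    uint64_t read_tag  = ic->tags[pred_set][pred_way];     /* sub-pipeline 2 */
    memcpy(out, ic->data[pred_set][pred_way], BLOCK_LEN);  /* sub-pipeline 3 */

    return read_tag == fetch_tag;
}
```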
-
Patent number: 12001845
Abstract: An apparatus comprises first instruction execution circuitry, second instruction execution circuitry, and a decoupled access buffer. Instructions of an ordered sequence of instructions are issued to one of the first and second instruction execution circuitry for execution in dependence on whether the instruction has a first type label or a second type label. An instruction with the first type label is an access-related instruction which determines at least one characteristic of a load operation to retrieve a data value from a memory address. Instruction execution by the first instruction execution circuitry of instructions having the first type label is prioritised over instruction execution by the second instruction execution circuitry of instructions having the second type label. Data values retrieved from memory as a result of execution of the first type instructions are stored in the decoupled access buffer.
Type: Grant
Filed: October 15, 2020
Date of Patent: June 4, 2024
Assignee: Arm Limited
Inventors: Mbou Eyole, Stefanos Kaxiras
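A minimal sketch of the access/execute split, assuming a FIFO stands in for the decoupled access buffer and a simple two-way pick stands in for the issue logic that prioritises access-labelled instructions. Everything below is illustrative; it models the idea in software rather than reproducing Arm's circuitry.

```c
/* Each instruction carries a type label; access-labelled instructions are
 * preferred at issue so loads run ahead, and their results are parked in
 * a small decoupled access buffer until the execute side consumes them. */
#include <stdbool.h>
#include <stdint.h>

typedef enum { LABEL_ACCESS, LABEL_EXECUTE } insn_label_t;

typedef struct {
    insn_label_t label;
    uint64_t     addr;    /* only meaningful for access-labelled loads */
} insn_t;

#define DAB_SIZE 16

/* Decoupled access buffer: FIFO of values returned from memory. */
typedef struct {
    uint64_t values[DAB_SIZE];
    unsigned head, tail, count;
} dab_t;

bool dab_push(dab_t *b, uint64_t v) {
    if (b->count == DAB_SIZE) return false;
    b->values[b->tail] = v;
    b->tail = (b->tail + 1) % DAB_SIZE;
    b->count++;
    return true;
}

bool dab_pop(dab_t *b, uint64_t *v) {
    if (b->count == 0) return false;
    *v = b->values[b->head];
    b->head = (b->head + 1) % DAB_SIZE;
    b->count--;
    return true;
}

/* Issue policy sketch: among two ready instructions, always prefer the
 * access-labelled one. */
const insn_t *pick_next(const insn_t *a, const insn_t *b) {
    if (a && a->label == LABEL_ACCESS) return a;
    if (b && b->label == LABEL_ACCESS) return b;
    return a ? a : b;
}
```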
-
Patent number: 11983507
Abstract: A differential multiplier-accumulator accepts A and B digital inputs plus a sign bit and generates a dot product P by applying the bits of the A input and the bits of the B input to respective positive and negative unit elements comprised of groups of AND gates coupled to charge transfer lines through a capacitor Cu. One of the positive and negative unit elements is enabled by the sign bit; the enabled unit element receives one bit of the B input applied to all of its AND gates, and each positive and negative unit element has the bits of A applied to each associated AND gate input. The AND gate outputs couple charge to the charge transfer lines, which are coupled to binary-weighted charge summing capacitors and to an analog-to-digital converter to generate a digital output product.
Type: Grant
Filed: December 31, 2020
Date of Patent: May 14, 2024
Assignee: Ceremorphic, Inc.
Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong
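A purely arithmetic analogue of the differential scheme may help: each product term is steered to a positive or negative accumulation path by its sign bit, and the output is the difference of the two paths, which is what the binary-weighted capacitors and ADC effectively compute from accumulated charge. The C below is a sketch of that arithmetic, not the analog circuit.

```c
/* Each term of the dot product goes to a "positive" or "negative"
 * accumulation path according to its sign bit; the result is the
 * difference of the two paths (a differential readout). */
#include <stdint.h>
#include <stdio.h>

int64_t differential_dot(const uint8_t *a, const uint8_t *b,
                         const uint8_t *sign, int n) {
    int64_t pos = 0, neg = 0;                  /* the two charge paths */
    for (int i = 0; i < n; i++) {
        int64_t term = (int64_t)a[i] * b[i];   /* AND-gate/capacitor array */
        if (sign[i])                           /* sign bit enables one path */
            neg += term;
        else
            pos += term;
    }
    return pos - neg;                          /* differential readout */
}

int main(void) {
    uint8_t a[]    = { 3, 5, 2 };
    uint8_t b[]    = { 4, 1, 6 };
    uint8_t sign[] = { 0, 1, 0 };              /* second term is negative */
    printf("P = %lld\n", (long long)differential_dot(a, b, sign, 3));
    return 0;
}
```

With the inputs shown, the positive path accumulates 24 and the negative path 5, so the differential readout yields P = 19.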
-
Patent number: 11977893
Abstract: An instruction fetch pipeline includes first, second, and third sub-pipelines that respectively include: a TLB that receives a fetch virtual address, a tag random access memory (RAM) of a physically-indexed physically-tagged set associative instruction cache that receives a predicted set index, and a data RAM that receives the predicted set index and a predicted way number that specifies a way of the entry from which a block of instructions was previously fetched. The predicted set index specifies the instruction cache set that includes the entry. The three sub-pipelines respectively initiate in parallel: a TLB access using the fetch virtual address to obtain a translation thereof into a fetch physical address that includes a tag, a tag RAM access using the predicted set index to read a set of tags, and a data RAM access using the predicted set index and the predicted way number to fetch the block of instructions.
Type: Grant
Filed: June 8, 2022
Date of Patent: May 7, 2024
Assignee: Ventana Micro Systems Inc.
Inventors: John G. Favor, Michael N. Michael, Vihar Soneji
-
Patent number: 11977936
Abstract: A differential multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B input to respective positive and negative unit elements comprised of groups of AND gates coupled to charge transfer lines through a capacitor Cu. Each positive and negative unit element receives one bit of the B input applied to all of the AND gates of the unit element, and each positive and negative unit element has the bits of A applied to each associated AND gate input. The AND gates are coupled to charge transfer lines through a capacitor Cu, and the charge transfer lines couple to binary-weighted charge summing capacitors and to an analog-to-digital converter to generate a digital output product. The charge transfer lines may span multiple unit elements.
Type: Grant
Filed: December 31, 2020
Date of Patent: May 7, 2024
Assignee: Ceremorphic, Inc.
Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong
-
Patent number: 11972264
Abstract: Processing circuitry performs processing operations in response to micro-operations. Front end circuitry supplies the micro-operations to be processed by the processing circuitry. Prediction circuitry generates a prediction of a number of loop iterations for which one or more micro-operations per loop iteration are to be supplied by the front end circuitry, where an actual number of loop iterations to be processed by the processing circuitry is resolvable by the processing circuitry based on at least one operand corresponding to a first loop iteration to be processed by the processing circuitry. The front end circuitry varies, based on a level of confidence in the prediction of the number of loop iterations, a supply rate with which the one or more micro-operations for at least a subset of the loop iterations are supplied to the processing circuitry.
Type: Grant
Filed: June 13, 2022
Date of Patent: April 30, 2024
Inventors: Guillaume Bolbenes, Thibaut Elie Lanois, Houdhaifa Bouzguarrou, Luca Nassi
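A sketch of the throttling policy the abstract describes, assuming a simple confidence score and three illustrative supply rates; the real mechanism and thresholds are not specified here.

```c
/* The front end keeps issuing micro-ops for predicted loop iterations,
 * but at a rate that depends on how much the iteration-count prediction
 * is trusted, so a wrong low-confidence prediction wastes less work. */
#include <stdio.h>

/* micro-ops supplied per cycle for the remaining predicted iterations */
static int supply_rate(int confidence /* 0..100 */) {
    if (confidence >= 90) return 4;   /* full rate: prediction trusted    */
    if (confidence >= 50) return 2;   /* reduced rate                     */
    return 1;                         /* trickle until the count resolves */
}

int main(void) {
    for (int c = 0; c <= 100; c += 25)
        printf("confidence %3d -> %d uop(s)/cycle\n", c, supply_rate(c));
    return 0;
}
```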
-
Patent number: 11960893
Abstract: A method, programming product, and/or system for prefetching instructions includes an instruction prefetch table that has a plurality of entries, each entry for storing a first portion of an indirect branch instruction address and a target address, wherein the indirect branch instruction has multiple target addresses and the instruction prefetch table is accessed by an index obtained by hashing a second portion of bits of the indirect branch instruction address with an information vector of the indirect branch instruction. A further embodiment includes a first prefetch table for uni-target branch instructions and a second prefetch table for multi-target branch instructions. In operation it is determined whether a branch instruction hits in one of the multiple prefetch tables; a target address for the branch instruction is read from the respective prefetch table in which the branch instruction hit; and the branch instruction is prefetched to an instruction cache.
Type: Grant
Filed: December 29, 2021
Date of Patent: April 16, 2024
Assignee: International Business Machines Corporation
Inventors: Naga P. Gorti, Mohit Karve
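The indexing scheme can be sketched as follows: hash a portion of the indirect branch's address with its information vector to pick an entry, and tag the entry with another portion of the address. The XOR-fold hash, table size, and bit ranges below are assumptions for illustration.

```c
/* Multi-target indirect branches are looked up with an index formed by
 * hashing address bits with a per-branch information vector (e.g. recent
 * path history), so different dynamic paths map to different entries. */
#include <stdint.h>

#define TABLE_ENTRIES 1024               /* power of two for masking */

typedef struct {
    uint64_t branch_tag;     /* first portion of the branch address */
    uint64_t target;         /* prefetch target address             */
} prefetch_entry_t;

static prefetch_entry_t table[TABLE_ENTRIES];

static unsigned table_index(uint64_t branch_addr, uint64_t info_vector) {
    uint64_t addr_bits = branch_addr >> 2;     /* second portion of address */
    uint64_t h = addr_bits ^ info_vector;      /* hash the two together     */
    h ^= h >> 10;                              /* fold in upper bits        */
    return (unsigned)(h & (TABLE_ENTRIES - 1));
}

void install(uint64_t branch_addr, uint64_t info_vector, uint64_t target) {
    prefetch_entry_t *e = &table[table_index(branch_addr, info_vector)];
    e->branch_tag = branch_addr >> 20;
    e->target     = target;
}

/* Returns 1 and the prefetch target on a hit, 0 on a miss. */
int lookup(uint64_t branch_addr, uint64_t info_vector, uint64_t *target) {
    prefetch_entry_t *e = &table[table_index(branch_addr, info_vector)];
    if (e->branch_tag != (branch_addr >> 20)) return 0;   /* tag mismatch */
    *target = e->target;
    return 1;
}
```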
-
Patent number: 11941401
Abstract: An apparatus includes a processor circuit that includes a return address stack circuit, a return prediction circuit, and a fetch control circuit. The return prediction circuit is configured to store, for previously accessed return addresses, fetch parameters for next fetch addresses. The fetch control circuit is configured to, in response to a fetch of a call instruction, push a return address onto the return address stack circuit. In response to a fetch of a return instruction that corresponds to the call instruction, the fetch control circuit is further configured to retrieve the return address from the return address stack circuit and to create, using the return address and fetch parameters retrieved from the return prediction circuit, a next fetch request to retrieve instructions subsequent to the return instruction.
Type: Grant
Filed: June 9, 2022
Date of Patent: March 26, 2024
Assignee: Apple Inc.
Inventors: Pruthivi Vuyyuru, Ian D. Kountanis
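A small sketch of how the two structures might cooperate, assuming the stored "fetch parameter" is a fetch width: calls push the return address, returns pop it, and the return prediction table supplies the parameter used to build the next fetch request. Sizes and the direct-mapped update are illustrative.

```c
/* Return address stack plus a small table of fetch parameters previously
 * seen for a return address; both feed the fetch request built after a
 * return instruction is fetched. */
#include <stdint.h>

#define RAS_DEPTH   32
#define RPT_ENTRIES 64

static uint64_t ras[RAS_DEPTH];
static int      ras_top = -1;

typedef struct { uint64_t ret_addr; unsigned fetch_width; } rpt_entry_t;
static rpt_entry_t rpt[RPT_ENTRIES];              /* return prediction table */

void on_call_fetched(uint64_t return_addr) {
    if (ras_top + 1 < RAS_DEPTH) ras[++ras_top] = return_addr;
}

/* Remember the fetch parameters observed for this return address. */
void record_fetch_params(uint64_t return_addr, unsigned fetch_width) {
    rpt[return_addr % RPT_ENTRIES] = (rpt_entry_t){ return_addr, fetch_width };
}

/* Build the next fetch request for the instructions after the return. */
void on_return_fetched(uint64_t *next_fetch_addr, unsigned *next_fetch_width) {
    uint64_t ret = (ras_top >= 0) ? ras[ras_top--] : 0;
    rpt_entry_t e = rpt[ret % RPT_ENTRIES];
    *next_fetch_addr  = ret;
    *next_fetch_width = (e.ret_addr == ret && e.fetch_width != 0)
                        ? e.fetch_width
                        : 16;                     /* default on a miss */
}
```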
-
Patent number: 11922168
Abstract: A program is executed using a call stack and shadow stack. The call stack includes frames having respective return addresses. The frames may also store variables and/or parameters. The shadow stack stores duplicates of the return addresses in the call stack. The call stack and the shadow stack are maintained by (i) each time a function is called, adding a corresponding stack frame to the call stack and adding a corresponding return address to the shadow stack, and (ii) each time a function is exited, removing a corresponding frame from the call stack and removing a corresponding return address from the shadow stack. A backtrace of the program's current call chain is generated by accessing the return addresses in the shadow stack. The outputted backtrace includes the return addresses from the shadow stack and/or information about the traced functions that is derived from the shadow stack's return addresses.
Type: Grant
Filed: March 23, 2022
Date of Patent: March 5, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ben Niu, Gregory John Colombo, Weidong Cui, Jason Lin, Kenneth Dean Johnson
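The bookkeeping in the abstract maps naturally onto a small software model: push on call, pop on return, and walk the shadow stack to print a backtrace. This is a hand-maintained sketch of the idea, not Microsoft's implementation.

```c
/* Every call pushes the return address onto a shadow stack in parallel
 * with the call-stack frame, every return pops it, and a backtrace is
 * produced by walking the shadow stack instead of unwinding frames. */
#include <stdint.h>
#include <stdio.h>

#define SHADOW_DEPTH 128

static uintptr_t shadow_stack[SHADOW_DEPTH];
static int       shadow_top;                  /* number of live entries */

void shadow_on_call(uintptr_t return_addr) {
    if (shadow_top < SHADOW_DEPTH)
        shadow_stack[shadow_top++] = return_addr;
}

void shadow_on_return(void) {
    if (shadow_top > 0)
        shadow_top--;
}

/* Backtrace of the current call chain: the saved return addresses,
 * innermost call first, read straight out of the shadow stack. */
void shadow_backtrace(void) {
    for (int i = shadow_top - 1; i >= 0; i--)
        printf("#%d  return address %#lx\n",
               shadow_top - 1 - i, (unsigned long)shadow_stack[i]);
}
```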
-
Patent number: 11915006
Abstract: A method, system and device for pipeline processing of instructions and a computer storage medium. The method comprises: acquiring a target instruction set (S101); acquiring a target prediction result, wherein the target prediction result is a result obtained by predicting a jump mode of the target instruction set (S102); performing pipeline processing on the target instruction set according to the target prediction result (S103); determining if a pipeline flushing request is received (S104); and if so, correspondingly saving the target instruction set and a corresponding pipeline processing result, so as to perform pipeline processing on the target instruction set again on the basis of the pipeline processing result (S105).
Type: Grant
Filed: November 28, 2019
Date of Patent: February 27, 2024
Assignee: INSPUR SUZHOU INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Yulong Zhou, Tongqiang Liu, Xiaofeng Zou
-
Patent number: 11915002
Abstract: Providing extended branch target buffer (BTB) entries for storing trunk branch metadata and leaf branch metadata is disclosed herein. In one aspect, a processor comprises a BTB circuit comprising a BTB comprising a plurality of extended BTB entries. The BTB circuit is configured to store trunk branch metadata for a first branch instruction in an extended BTB entry of the plurality of extended BTB entries, wherein the extended BTB entry corresponds to a first aligned memory block containing an address of the first branch instruction. The BTB circuit is also configured to store leaf branch metadata for a second branch instruction in the extended BTB entry in association with the trunk branch metadata, wherein an address of the second branch instruction is subsequent to a target address of the first branch instruction within a second aligned memory block.
Type: Grant
Filed: June 24, 2022
Date of Patent: February 27, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Saransh Jain, Rami Mohammad Al Sheikh, Daren Eugene Streett, Michael Scott McIlvaine
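One way to picture an extended BTB entry is as a normal entry plus a second, piggybacked metadata slot for the branch that follows the first branch's target. The struct below is an illustrative guess at such a layout, not the patented format.

```c
/* An "extended" entry holds the usual (trunk) branch metadata for a branch
 * in the aligned block the entry covers, plus (leaf) metadata for a second
 * branch located just past the trunk branch's target, so one lookup can
 * feed two successive predictions. */
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t branch_addr;    /* address of the branch instruction */
    uint64_t target_addr;    /* its predicted target              */
    uint8_t  kind;           /* conditional, call, return, ...    */
} branch_meta_t;

typedef struct {
    uint64_t      block_addr;   /* first aligned memory block covered        */
    bool          valid;
    branch_meta_t trunk;        /* branch inside that block                  */
    bool          leaf_valid;
    branch_meta_t leaf;         /* branch following trunk.target_addr in the
                                   second aligned block                       */
} extended_btb_entry_t;

/* One lookup on the trunk's block can yield predictions for both branches. */
int predict_pair(const extended_btb_entry_t *e, uint64_t fetch_block,
                 branch_meta_t out[2]) {
    int n = 0;
    if (e->valid && e->block_addr == fetch_block) {
        out[n++] = e->trunk;
        if (e->leaf_valid)
            out[n++] = e->leaf;
    }
    return n;
}
```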
-
Patent number: 11900124
Abstract: Various embodiments are disclosed of a multiprocessor system with processing elements optimized for high performance and low power dissipation and an associated method of programming the processing elements. Each processing element may comprise a fetch unit and a plurality of address generator units and a plurality of pipelined datapaths. The fetch unit may be configured to receive a multi-part instruction, wherein the multi-part instruction includes a plurality of fields. First and second address generator units may generate, based on different fields of the multi-part instruction, addresses from which to retrieve first and second data for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction. The execution units may perform operations using a single pipeline or multiple pipelines based on third and fourth fields of the multi-part instruction.
Type: Grant
Filed: January 3, 2023
Date of Patent: February 13, 2024
Assignee: Coherent Logix, Incorporated
Inventors: Michael B Doerr, Carl S. Dobbs, Michael B. Solka, Michael R. Trocino, Kenneth R. Faulkner, Keith M. Bindloss, Sumeer Arya, John Mark Beardslee, David A. Gibson
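A sketch of a multi-part instruction as a struct whose fields feed two independent address generators and select the pipeline and operation; the field widths and the base-plus-offset address generation are assumptions made for illustration.

```c
/* One fetched multi-part instruction carries separate fields: two drive
 * independent address generators for the two data fetches, the others
 * steer the execution pipelines and select the operation. */
#include <stdint.h>

typedef struct {
    uint16_t agu0_field;    /* used by address generator 0 (first operand)  */
    uint16_t agu1_field;    /* used by address generator 1 (second operand) */
    uint8_t  pipe_sel;      /* third field: single- vs multi-pipeline mode  */
    uint8_t  op_field;      /* fourth field: operation for the datapaths    */
} multipart_insn_t;

/* Each AGU turns its own field into an operand address independently,
 * here modelled as a base register plus the field as an offset. */
uint32_t agu_generate(uint32_t base, uint16_t field) {
    return base + field;
}

void decode_multipart(const multipart_insn_t *insn,
                      uint32_t base0, uint32_t base1,
                      uint32_t *addr0, uint32_t *addr1) {
    *addr0 = agu_generate(base0, insn->agu0_field);   /* first data fetch  */
    *addr1 = agu_generate(base1, insn->agu1_field);   /* second data fetch */
}
```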
-
Patent number: 11900120
Abstract: Systems and methods of selecting a collection of compatible issue-ready instructions for parallel execution by functional units in a superscalar processor in a single clock cycle. All possible instructions (opcodes) to be executed by the functional units are pre-arranged into several scenarios based on potential resource conflicts among the instructions. Each scenario includes multiple groups of predefined instructions. During operation, concurrently for all the groups, an issue-ready instruction is identified with reference to each group based on group-specific selection policies. Further, based on the identified instructions, predefined policies are applied to select one or more scenarios and select among the picks of the selected scenarios. As a result, the output instructions of the selected scenarios are issued for parallel execution by the functional units.
Type: Grant
Filed: March 24, 2022
Date of Patent: February 13, 2024
Assignee: Marvell Asia Pte, Ltd.
Inventor: David Carlson
-
Patent number: 11886875
Abstract: Disclosed embodiments relate to systems and methods for performing nibble-sized operations on matrix elements. In one example, a processor includes fetch circuitry to fetch an instruction and decode circuitry to decode the fetched instruction, the fetched instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode to indicate the processor is to, for each pair of corresponding elements of the first and second source matrices, logically partition each element into nibble-sized partitions, perform an operation indicated by the instruction on each partition, and store execution results to a corresponding nibble-sized partition of a corresponding element of the destination matrix. The exemplary processor includes execution circuitry to execute the decoded instruction as per the opcode.
Type: Grant
Filed: December 26, 2018
Date of Patent: January 30, 2024
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Jonathan D. Pearce, Dan Baum, Guei-Yuan Lueh, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
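The nibble partitioning is easy to show concretely. The sketch below uses addition as a stand-in for whatever operation the opcode selects, splitting each 32-bit element into eight independent 4-bit lanes with no carry between lanes; the element width and the choice of operation are assumptions.

```c
/* Per-nibble add of two 32-bit elements: eight independent 4-bit lanes,
 * each result written to the matching nibble of the destination. */
#include <stdint.h>
#include <stdio.h>

uint32_t nibble_add(uint32_t a, uint32_t b) {
    uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 4) {
        uint32_t na = (a >> shift) & 0xF;
        uint32_t nb = (b >> shift) & 0xF;
        result |= ((na + nb) & 0xF) << shift;   /* wrap within the nibble */
    }
    return result;
}

int main(void) {
    /* 0xF + 0x1 wraps to 0x0 in its lane; 0x2 + 0x3 gives 0x5 in its lane. */
    printf("%08x\n", (unsigned)nibble_add(0x0000002F, 0x00000031));  /* 00000050 */
    return 0;
}
```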
-
Patent number: 11880685
Abstract: An instruction fetch pipeline includes first, second, and third sub-pipelines that respectively include: a TLB that receives a fetch virtual address, a tag random access memory (RAM) of a physically-indexed physically-tagged set associative instruction cache that receives a predicted set index, and a data RAM that receives the predicted set index and a predicted way number that specifies a way of the entry from which a block of instructions was previously fetched. The predicted set index specifies the instruction cache set that includes the entry. The three sub-pipelines respectively initiate in parallel: a TLB access using the fetch virtual address to obtain a translation thereof into a fetch physical address that includes a tag, a tag RAM access using the predicted set index to read a set of tags, and a data RAM access using the predicted set index and the predicted way number to fetch the block of instructions.
Type: Grant
Filed: June 8, 2022
Date of Patent: January 23, 2024
Assignee: Ventana Micro Systems Inc.
Inventors: John G. Favor, Michael N. Michael, Vihar Soneji
-
Patent number: 11875155
Abstract: An integrated circuit comprising instruction processing circuitry for processing a plurality of program instructions and instruction prediction circuitry. The instruction prediction circuitry comprises circuitry for detecting successive occurrences of a same program loop sequence of program instructions. The instruction prediction circuitry also comprises circuitry for predicting a number of iterations of the same program loop sequence of program instructions, in response to detecting, by the circuitry for detecting, that a second occurrence of the same program loop sequence of program instructions comprises a same number of iterations as a first occurrence of the same program loop sequence of program instructions.
Type: Grant
Filed: January 19, 2022
Date of Patent: January 16, 2024
Assignee: Texas Instruments Incorporated
Inventors: Kai Chirca, Paul Daniel Gauvreau, David Edward Smith, Jr.
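A simplified model of the detect-then-predict behaviour: record the iteration count the first time a loop sequence completes, and only start predicting once a second occurrence matches it. The single-entry table and the two-match rule are illustrative simplifications, not the patented circuitry.

```c
/* Record the iteration count per loop sequence; predict the count only
 * after two successive occurrences agreed on it. */
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t loop_pc;        /* identifies the loop sequence           */
    uint32_t iter_count;     /* iterations seen on the last occurrence */
    uint8_t  matches;        /* occurrences that agreed on the count   */
} loop_entry_t;

static loop_entry_t entry;

/* Called when an occurrence of a loop finishes with a known count. */
void loop_observed(uint64_t loop_pc, uint32_t iterations) {
    if (entry.loop_pc == loop_pc && entry.iter_count == iterations) {
        if (entry.matches < 255) entry.matches++;
    } else {
        entry.loop_pc    = loop_pc;
        entry.iter_count = iterations;
        entry.matches    = 1;
    }
}

/* Predict only after two occurrences with the same iteration count. */
bool loop_predict(uint64_t loop_pc, uint32_t *predicted_iterations) {
    if (entry.loop_pc == loop_pc && entry.matches >= 2) {
        *predicted_iterations = entry.iter_count;
        return true;
    }
    return false;
}
```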
-
Patent number: 11861368
Abstract: A first type of prediction, for controlling execution of at least one instruction by processing circuitry, is based at least on a first prediction table storing prediction information looked up based on at least a first portion of branch history information stored in branch history storage corresponding to a first predetermined number of branches. In response to detecting an execution state switch of the processing circuitry from a first execution state to a second, more privileged, execution state, use of the first prediction table for determining the first type of prediction is disabled. In response to detecting that a number of branches causing an update to the branch history storage since the execution state switch is greater than or equal to the first predetermined number, use of the first prediction table in determining the first type of prediction is re-enabled.
Type: Grant
Filed: May 24, 2022
Date of Patent: January 2, 2024
Assignee: Arm Limited
Inventors: Houdhaifa Bouzguarrou, Michael Brian Schinzler, Yasuo Ishii, Jatin Bhartia, Sumanth Chengad Raghu
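The enable/disable rule reduces to a counter, as sketched below with an assumed 64-branch history length: the long-history table is switched off at the privilege switch and switched back on once that many branches have updated the history, at which point none of the history it indexes with was produced in the less privileged state.

```c
/* Disable the history-indexed prediction table at a switch into the more
 * privileged state; re-enable it once the branch history register has been
 * updated by at least as many branches as the table's history length. */
#include <stdbool.h>

#define TABLE1_HISTORY_LEN 64   /* branches of history table 1 is indexed with */

static bool table1_enabled = true;
static int  branches_since_switch;

void on_switch_to_privileged_state(void) {
    table1_enabled = false;          /* stop using the long-history table */
    branches_since_switch = 0;
}

void on_branch_history_update(void) {
    if (!table1_enabled && ++branches_since_switch >= TABLE1_HISTORY_LEN)
        table1_enabled = true;       /* history no longer spans the switch */
}

bool use_table1_for_prediction(void) {
    return table1_enabled;
}
```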
-
Patent number: 11853763
Abstract: A new device executing an application on a new central processing unit (CPU) determines whether the application is for a legacy device having a legacy CPU. When the new device determines that the application is for the legacy device, it executes the application on the new CPU with selected available resources of the new device restricted to approximate or match a processing behavior of the legacy CPU, e.g., by reducing a usable portion of a return address stack of the new CPU and thereby reducing a number of calls and associated returns that can be tracked.
Type: Grant
Filed: June 29, 2022
Date of Patent: December 26, 2023
Assignee: SONY INTERACTIVE ENTERTAINMENT LLC
Inventors: Mark Evan Cerny, David Simpson
-
Patent number: 11853765
Abstract: The disclosure includes a method of authenticating a processor that includes an arithmetic and logic unit. At least one decoded operand of at least a portion of a to-be-executed opcode is received on a first terminal of the arithmetic and logic unit. A signed instruction is received on a second terminal of the arithmetic and logic unit. The signed instruction combines a decoded instruction of the to-be-executed opcode and a previous calculation result of the arithmetic and logic unit.
Type: Grant
Filed: April 14, 2022
Date of Patent: December 26, 2023
Assignees: STMICROELECTRONICS (ROUSSET) SAS, PROTON WORLD INTERNATIONAL N.V.
Inventors: Michael Peeters, Fabrice Marinet
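The chaining property, where each signed instruction depends on both the decoded instruction and the previous ALU result, can be sketched with a placeholder mixing function; the XOR-and-rotate below is purely illustrative and is not the signing scheme of the patent.

```c
/* The value checked alongside each decoded instruction also folds in the
 * previous ALU result, so a verifier replaying the same instruction stream
 * with the same data can detect a skipped or tampered step. */
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned r) {
    return (x << r) | (x >> (32 - r));
}

/* "Signed instruction" presented to the ALU: decoded opcode bits mixed
 * with the result the ALU produced on the previous step. */
uint32_t sign_instruction(uint32_t decoded_insn, uint32_t prev_alu_result) {
    return rotl32(decoded_insn ^ prev_alu_result, 7);
}

/* The checker recomputes the signature from its own copy of the decoded
 * instruction and the previous result, and compares. */
int signature_matches(uint32_t signed_insn, uint32_t decoded_insn,
                      uint32_t prev_alu_result) {
    return signed_insn == sign_instruction(decoded_insn, prev_alu_result);
}
```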