Patents by Inventor Lihu Rappoport

Lihu Rappoport has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230342156
    Abstract: Methods and apparatuses relating to mitigations for speculative execution side channels are described. Speculative execution hardware and environments that utilize the mitigations are also described. For example, three indirect branch control mechanisms and their associated hardware are discussed herein: (i) indirect branch restricted speculation (IBRS) to restrict speculation of indirect branches, (ii) single thread indirect branch predictors (STIBP) to prevent indirect branch predictions from being controlled by a sibling thread, and (iii) indirect branch predictor barrier (IBPB) to prevent indirect branch predictions after the barrier from being controlled by software executed before the barrier.
    Type: Application
    Filed: April 24, 2023
    Publication date: October 26, 2023
    Inventors: Jason W. Brandt, Deepak K. Gupta, Rodrigo Branco, Joseph Nuzman, Robert S. Chappell, Sergiu Ghetie, Wojciech Powiertowski, Jared W. Stark, IV, Ariel Sabba, Scott J. Cape, Hisham Shafi, Lihu Rappoport, Yair Berger, Scott P. Bobholz, Gilad Holzstein, Sagar V. Dalvi, Yogesh Bijlani
  • Publication number: 20230315453
    Abstract: An instruction pipeline includes a circuit that can generate a hardware event to indicate conditional branches, including the direction of taken branches. The circuit can generate a forward conditional branch indicator for an opcode when a conditional branch is taken to a forward location from the opcode. The instruction pipeline includes a counter to increment in response to the forward conditional branch indicator, which will indicate a frequency of forward conditional branches for the opcode.
    Type: Application
    Filed: April 1, 2022
    Publication date: October 5, 2023
    Inventors: Ahmad YASIN, Lihu RAPPOPORT, Nir TELL, Rami BUSOOL, Eyal HADAS, Michael CHYNOWETH, Joseph OLIVAS, Christopher M. CHRULSKI
  • Publication number: 20230205699
    Abstract: An apparatus includes memory circuitry including a first data structure and prefetch circuitry that is coupled to the memory circuitry. The prefetch circuitry is to store, in the first data structure, a first subregion entry corresponding to a first subregion of a memory region allocated to a program. The first subregion entry is to include a plurality of delta values. A first delta value of the plurality of delta values represents a first distance between two cache lines associated with consecutive memory accesses within a second subregion of the memory region. The prefetch circuitry is further to detect a first memory access of a first cache line in the first subregion, identify prefetch candidates based on the first cache line and the plurality of delta values, and issue at least one prefetch request based on at least two of the prefetch candidates to be prefetched into a cache.
    Type: Application
    Filed: December 24, 2021
    Publication date: June 29, 2023
    Applicant: Intel Corporation
    Inventors: Swaraj Sha, Anant Vithal Nori, Sreenivas Subramoney, Stanislav Shwartsman, Pavel I. Kryukov, Lihu Rappoport
  • Publication number: 20230195469
    Abstract: Techniques and mechanisms for a processor to determine an execution of instructions based on a prediction of a taken branch. In an embodiment, a first prediction unit generates each of multiple branch predictions in one cycle of successive branch prediction cycles. An indication of the branch predictions is provided to an execution pipeline, which prepares to execute an instruction based on the indication. Where a first one of the branch predictions is determined to be of a low confidence type, said first branch prediction is further indicated to a second prediction unit, which performs a second branch prediction based on the same branch instruction for which the first branch prediction was made. In another embodiment, the second prediction unit signals that a state of the execution pipeline is to be cleared, based on a determination that the first and second branch predictions are inconsistent with each other.
    Type: Application
    Filed: December 21, 2021
    Publication date: June 22, 2023
    Applicant: Intel Corporation
    Inventors: Sumeet Bandishte, Jayesh Gaur, Franck Sala, Alexey Yurievich Sivtsov, Jared Warner Stark, IV, Lihu Rappoport, Sreenivas Subramoney
  • Publication number: 20230195465
    Abstract: Techniques and mechanisms for efficiently making value prediction information available for use by in a processor. In an embodiment, the instruction execution is to include a loading of some data to a first location (e.g., a first register). A decoder of the processor accesses reference information which indicates that the execution is to comprise multiple micro-operations (?ops) including a LoadCheck ?op and a Move ?op. The LoadCheck ?op loads a first value to the first location, and checks whether the loaded first value is the same as a previously-determined second value which represents a prediction of what the first value would be. The Move ?op moves the second value to the first location. In another embodiment, the Move ?op is scheduled for execution out-of-order with respect to the LoadCheck ?op, resulting in an early availability of the second value for access in a register file by another ?op.
    Type: Application
    Filed: December 21, 2021
    Publication date: June 22, 2023
    Applicant: Intel Corporation
    Inventors: Stanislav Shwartsman, Elad Shtiegmann, Sumeet Bandishte, Lihu Rappoport, Zeev Sperber, Jayesh Gaur
  • Publication number: 20230185718
    Abstract: Methods and apparatus relating to de-prioritizing speculative code lines in on-chip caches are described. In an embodiment, logic circuitry determines whether a storage structure includes a reference to a code miss request prior to transmission of the code miss request to a shared cache. The logic circuitry causes de-prioritization of a code line, corresponding to the code miss request, in the shared cache in response to an absence of the reference in the storage structure. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: December 14, 2021
    Publication date: June 15, 2023
    Applicant: Intel Corporation
    Inventors: Anant Vithal Nori, Prathmesh Kallurkar, Niranjan Kumar Soundararajan, Sreenivas Subramoney, Lihu Rappoport, Hanna Alam, Adrian Moga, Ronak Singhal
  • Patent number: 11656971
    Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: May 23, 2023
    Assignee: Intel Corporation
    Inventors: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney
  • Patent number: 11645078
    Abstract: Systems, methods, and apparatuses relating to hardware for auto-predication of critical branches. In one embodiment, a processor core includes a decoder to decode instructions into decoded instructions, an execution unit to execute the decoded instructions, a branch predictor circuit to predict a future outcome of a branch instruction, and a branch predication manager circuit to disable use of the predicted future outcome for a conditional critical branch comprising the branch instruction.
    Type: Grant
    Filed: December 28, 2019
    Date of Patent: May 9, 2023
    Assignee: Intel Corporation
    Inventors: Adarsh Chauhan, Franck Sala, Jayesh Gaur, Zeev Sperber, Lihu Rappoport, Adi Yoaz, Sreenivas Subramoney
  • Patent number: 11635965
    Abstract: Methods and apparatuses relating to mitigations for speculative execution side channels are described. Speculative execution hardware and environments that utilize the mitigations are also described. For example, three indirect branch control mechanisms and their associated hardware are discussed herein: (i) indirect branch restricted speculation (IBRS) to restrict speculation of indirect branches, (ii) single thread indirect branch predictors (STIBP) to prevent indirect branch predictions from being controlled by a sibling thread, and (iii) indirect branch predictor barrier (IBPB) to prevent indirect branch predictions after the barrier from being controlled by software executed before the barrier.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: April 25, 2023
    Assignee: Intel Corporation
    Inventors: Jason W. Brandt, Deepak K. Gupta, Rodrigo Branco, Joseph Nuzman, Robert S. Chappell, Sergiu D. Ghetie, Wojciech Powiertowski, Jared W. Stark, IV, Ariel Sabba, Scott J. Cape, Hisham Shafi, Lihu Rappoport, Yair Berger, Scott P. Bobholz, Gilad Holzstein, Sagar V. Dalvi, Yogesh Bijlani
  • Publication number: 20220237123
    Abstract: Embodiments of an invention a processor architecture are disclosed. In an embodiment, a processor includes a decoder, an execution unit, a coherent cache, and an interconnect. The decoder is to decode an instruction to zero a cache line. The execution unit is to issue a write command to initiate a cache line sized write of zeros. The coherent cache is to receive the write command, to determine whether there is a hit in the coherent cache and whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros, and to issue the write command toward the interconnect. The interconnect is to, responsive to receipt of the write command, issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit.
    Type: Application
    Filed: April 4, 2022
    Publication date: July 28, 2022
    Applicant: Intel Corporation
    Inventors: Jason W. Brandt, Robert S. Chappell, Jesus Corbal, Edward T. Grochowski, Stephen H. Gunther, Buford M. Guy, Thomas R. Huff, Christopher J. Hughes, Elmoustapha Ould-Ahmed-Vall, Ronak Singhal, Seyed Yahya Sotoudeh, Bret L. Toll, Lihu Rappoport, David B. Papworth, James D. Allen
  • Publication number: 20220206925
    Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state.
    Type: Application
    Filed: January 24, 2022
    Publication date: June 30, 2022
    Inventors: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney
  • Patent number: 11294809
    Abstract: Embodiments of an invention a processor architecture are disclosed. In an embodiment, a processor includes a decoder, an execution unit, a coherent cache, and an interconnect. The decoder is to decode an instruction to zero a cache line. The execution unit is to issue a write command to initiate a cache line sized write of zeros. The coherent cache is to receive the write command, to determine whether there is a hit in the coherent cache and whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros, and to issue the write command toward the interconnect. The interconnect is to, responsive to receipt of the write command, issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit.
    Type: Grant
    Filed: August 28, 2018
    Date of Patent: April 5, 2022
    Assignee: Intel Corporation
    Inventors: Jason W. Brandt, Robert S. Chappell, Jesus Corbal, Edward T. Grochowski, Stephen H. Gunther, Buford M. Guy, Thomas R. Huff, Christopher J. Hughes, Elmoustapha Ould-Ahmed-Vall, Ronak Singhal, Seyed Yahya Sotoudeh, Bret L. Toll, Lihu Rappoport, David Papworth, James D. Allen
  • Patent number: 11256599
    Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state.
    Type: Grant
    Filed: December 21, 2020
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventors: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney
  • Publication number: 20210342134
    Abstract: Embodiments of apparatuses, methods, and systems for code prefetching are described. In an embodiment, an apparatus includes an instruction decoder, load circuitry, and execution circuitry. The instruction decoder is to decode a code prefetch instruction. The code prefetch instruction is to specify a first instruction to be prefetched. The load circuitry to prefetch the first instruction in response to the decoded code prefetch instruction. The execution circuitry is to execute the first instruction at a fetch stage of a pipeline.
    Type: Application
    Filed: September 26, 2020
    Publication date: November 4, 2021
    Applicant: Intel Corporation
    Inventors: Ahmad Yasin, Lihu Rappoport, Jared W. Stark, Jeffrey Baxter, Israel Diamand, Pavel Fridman, Ibrahim Hur, Nir Tell
  • Patent number: 11150979
    Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.
    Type: Grant
    Filed: August 13, 2019
    Date of Patent: October 19, 2021
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Stanislav Shwartsman, Jared W. Stark, IV, Lihu Rappoport, Igor Yanover, George Leifman
  • Patent number: 11086627
    Abstract: A system is provided that includes an instruction buffer that stores bytes representative of one or more macroinstructions and instruction length decoder circuitry. The instruction length decoder circuitry includes a non-sequential first multiplexer circuitry having first input lines receiving a first input data representative of a speculative length of a first macroinstruction of the macroinstructions, and first selector that selects from the first input lines via a one-hot selector vector. The instruction length decoder circuitry also includes a first output line communicatively coupled to second selector, wherein the first output line causes the selector to select from a second input data representative of a first location of a first ending byte for the first macroinstruction with respect to a value x. The first multiplexer circuitry and the second selector may output start and end byte locations for the macroinstructions.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: August 10, 2021
    Assignee: Intel Corporation
    Inventors: Nir Tell, Shahar Sandor, Amotz Yagev, Michael Hermony, Sagie Yakov Goldenberg, Lihu Rappoport
  • Publication number: 20210200538
    Abstract: Disclosed embodiments relate to systems and methods to dually write micro-ops to a micro-op queue. A processor includes a micro-op cache communicatively coupled, via a first write port, to a micro-op queue, and a legacy fetch and decode pipeline communicatively coupled, via a second write port, to the micro-op queue, the processor to determine whether the micro-op cache stores a thread, the thread comprising a micro-op to be written to the micro-op queue, determine whether the legacy fetch and decode pipeline stores the thread if the micro-op cache does not store the thread, and write, via the micro-op queue, the micro-op from the thread to the micro-op queue responsive to the determination of whether the micro-op cache or the legacy fetch and decode pipeline stores the thread.
    Type: Application
    Filed: December 28, 2019
    Publication date: July 1, 2021
    Inventors: Franck SALA, Lihu RAPPOPORT
  • Publication number: 20210200550
    Abstract: Disclosed embodiments relate to systems and methods structured to predict a loop exit. In one example, a processor includes a branch prediction unit to determine a loop exit predictor start corresponding to a finite consistent loop, and an instruction decoder queue to: receive an iteration of the finite consistent loop corresponding to a loop exit predictor and an iteration count, replay one or more instructions of the iteration based on the iteration count, and switch to post-loop instructions responsive to a determination that a number of iterations of the finite consistent loop is equal to the iteration count.
    Type: Application
    Filed: December 28, 2019
    Publication date: July 1, 2021
    Inventors: Alexey Yurievich SIVTSOV, Franck SALA, Jared Warner STARK, IV, Lihu RAPPOPORT
  • Publication number: 20210109839
    Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state.
    Type: Application
    Filed: December 21, 2020
    Publication date: April 15, 2021
    Inventors: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney
  • Publication number: 20210096867
    Abstract: A system is provided that includes an instruction buffer that stores bytes representative of one or more macroinstructions and instruction length decoder circuitry. The instruction length decoder circuitry includes a non-sequential first multiplexer circuitry having first input lines receiving a first input data representative of a speculative length of a first macroinstruction of the macroinstructions, and first selector that selects from the first input lines via a one-hot selector vector. The instruction length decoder circuitry also includes a first output line communicatively coupled to second selector, wherein the first output line causes the selector to select from a second input data representative of a first location of a first ending byte for the first macroinstruction with respect to a value x. The first multiplexer circuitry and the second selector may output start and end byte locations for the macroinstructions.
    Type: Application
    Filed: September 27, 2019
    Publication date: April 1, 2021
    Inventors: Nir Tell, Shahar Sandor, Amotz Yagev, Michael Hermnoy, Sagie Yakov Goldenberg, Lihu Rappoport