Patents by Inventor Gadi Haber

Gadi Haber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11579881
    Abstract: Disclosed embodiments relate to instructions for vector operations with immediate values. In one example, a system includes a memory and a processor that includes fetch circuitry to fetch the instruction from a code storage, the instruction including an opcode, a destination identifier to specify a destination vector register, a first immediate, and a write mask identifier to specify a write mask register, the write mask register including at least one bit corresponding to each destination vector register element, the at least one bit to specify whether the destination vector register element is masked or unmasked, decode circuitry to decode the fetched instruction, and execution circuitry to execute the decoded instruction, to, use the write mask register to determine unmasked elements of the destination vector register, and, when the opcode specifies to broadcast, broadcast the first immediate to one or more unmasked vector elements of the destination vector register.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: February 14, 2023
    Assignee: Intel Corporation
    Inventors: Gadi Haber, Robert Valentine, Ayal Zaks, Jesus Corbal San Adrian
  • Patent number: 11188342
    Abstract: An apparatus and method for a speculative conditional move instruction. A processor comprising: a decoder to decode a first speculative conditional move instruction; a prediction storage to store prediction data related to previously executed speculative conditional move instructions; and execution circuitry to read first prediction data associated with the speculative conditional move instruction and to execute the speculative conditional move instruction either speculatively or non-speculatively based on the first prediction data.
    Type: Grant
    Filed: April 1, 2020
    Date of Patent: November 30, 2021
    Assignee: Intel Corporation
    Inventors: Amjad Aboud, Gadi Haber, Jared Warner Stark, IV
  • Publication number: 20200225959
    Abstract: An apparatus and method for a speculative conditional move instruction. A processor comprising: a decoder to decode a first speculative conditional move instruction; a prediction storage to store prediction data related to previously executed speculative conditional move instructions; and execution circuitry to read first prediction data associated with the speculative conditional move instruction and to execute the speculative conditional move instruction either speculatively or non-speculatively based on the first prediction data.
    Type: Application
    Filed: April 1, 2020
    Publication date: July 16, 2020
    Applicant: Intel Corporation
    Inventors: AMJAD ABOUD, GADI HABER, JARED WARNER STARK, IV
  • Patent number: 10620961
    Abstract: An apparatus and method for a speculative conditional move instruction. A processor comprising: a decoder to decode a first speculative conditional move instruction; a prediction storage to store prediction data related to previously executed speculative conditional move instructions; and execution circuitry to read first prediction data associated with the speculative conditional move instruction and to execute the speculative conditional move instruction either speculatively or non-speculatively based on the first prediction data.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: April 14, 2020
    Assignee: Intel Corporation
    Inventors: Amjad Aboud, Gadi Haber, Jared Warner Stark, IV
  • Publication number: 20190303163
    Abstract: An apparatus and method for a speculative conditional move instruction. A processor comprising: a decoder to decode a first speculative conditional move instruction; a prediction storage to store prediction data related to previously executed speculative conditional move instructions; and execution circuitry to read first prediction data associated with the speculative conditional move instruction and to execute the speculative conditional move instruction either speculatively or non-speculatively based on the first prediction data.
    Type: Application
    Filed: March 30, 2018
    Publication date: October 3, 2019
    Inventors: AMJAD ABOUD, GADI HABER, JARED Warner STARK, IV
  • Patent number: 10209764
    Abstract: Embodiments described herein relate to improving processor power-performance using a binary analyzer routine. In one example, a processor includes a memory interface to couple to a memory, at least one hardware accelerator circuit, and an execution pipeline including at least fetch, decode, and execute stages, wherein the processor, in response to a hot-spot hardware event indicating presence of a hot-spot sequence, is to switch context to a binary analyzer routine stored in the memory, the binary analyzer routine including instructions that, when fetched, decoded, and executed by the processor, cause the processor to analyze a region in the memory containing the hot-spot sequence, analyze hardware metrics relating to execution of the hot-spot sequence, and generate, based on the analyses, a recommendation for the at least one hardware accelerator circuit to improve at least one of power consumption and performance.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: February 19, 2019
    Assignee: Intel Corporation
    Inventors: Konstantin Levit-Gurevich, Gadi Haber
  • Publication number: 20190004801
    Abstract: Disclosed embodiments relate to instructions for vector operations with immediate values. In one example, a system includes a memory and a processor that includes fetch circuitry to fetch the instruction from a code storage, the instruction including an opcode, a destination identifier to specify a destination vector register, a first immediate, and a write mask identifier to specify a write mask register, the write mask register including at least one bit corresponding to each destination vector register element, the at least one bit to specify whether the destination vector register element is masked or unmasked, decode circuitry to decode the fetched instruction, and execution circuitry to execute the decoded instruction, to, use the write mask register to determine unmasked elements of the destination vector register, and, when the opcode specifies to broadcast, broadcast the first immediate to one or more unmasked vector elements of the destination vector register.
    Type: Application
    Filed: June 29, 2017
    Publication date: January 3, 2019
    Inventors: Gadi Haber, Robert Valentine, Ayal Zaks, Jesus Corbal San Adrian
  • Publication number: 20180173534
    Abstract: A processor may include a decoder to decode a first instance of a branch instruction for which the resolved branch direction is data dependent and add results of the decoding to a stream of decoded instructions for execution. The processor may include a code generator to inject, into the stream of decoded instructions, branch resolution code to resolve the branch condition for a second instance of the branch instruction following the first instance at a predetermined look-ahead distance. The processor may include an execution unit to execute the branch resolution code, storing an indication of the resolved branch direction for the second instance in an entry of a prediction queue for the branch instruction. The processor may include a branch predictor to receive the second instance of the branch instruction, and output the resolved branch direction as the predicted branch direction for the second instance of the branch instruction.
    Type: Application
    Filed: December 20, 2016
    Publication date: June 21, 2018
    Inventors: Leeor Peled, Gadi Haber, Yiannakis Sazeides
  • Publication number: 20180173291
    Abstract: Embodiments described herein relate to improving processor power-performance using a binary analyzer routine. In one example, a processor includes a memory interface to couple to a memory, at least one hardware accelerator circuit, and an execution pipeline including at least fetch, decode, and execute stages, wherein the processor, in response to a hot-spot hardware event indicating presence of a hot-spot sequence, is to switch context to a binary analyzer routine stored in the memory, the binary analyzer routine including instructions that, when fetched, decoded, and executed by the processor, cause the processor to analyze a region in the memory containing the hot-spot sequence, analyze hardware metrics relating to execution of the hot-spot sequence, and generate, based on the analyses, a recommendation for the at least one hardware accelerator circuit to improve at least one of power consumption and performance.
    Type: Application
    Filed: December 20, 2016
    Publication date: June 21, 2018
    Inventors: Konstantin Levit-Gurevich, Gadi Haber
  • Publication number: 20170132007
    Abstract: Methods, apparatus and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines if opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region that includes encapsulations of the translations of invalid opcodes and state recoveries of the machine states is generated and saved to a translation cache memory.
    Type: Application
    Filed: January 23, 2017
    Publication date: May 11, 2017
    Inventors: Gadi Haber, Konstantin Kostya Levit-Gurevich, Esfir Natanzon, Boris Ginzburg, Aya Elhanan, Moshe Maury Bach, Igor Breger
  • Patent number: 9552207
    Abstract: Methods, apparatus and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines if opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region that includes encapsulations of the translations of invalid opcodes and state recoveries of the machine states is generated and saved to a translation cache memory.
    Type: Grant
    Filed: February 2, 2016
    Date of Patent: January 24, 2017
    Assignee: Intel Corporation
    Inventors: Gadi Haber, Konstantin Kostya Levit-Gurevich, Esfir Natanzon, Boris Ginzburg, Aya Elhanan, Moshe Maury Bach, Igor Breger
  • Publication number: 20160378480
    Abstract: Embodiments for systems, methods, and apparatuses for improving performance of status dependent computations are detailed. In an embodiment, an hardware apparatus comprises decoder hardware to decode an instruction, operand retrieval hardware to retrieve data from at least one source operand associated with the instruction decoded by the decoder hardware, and execution hardware to execute the decoded instruction to generate a result including at least one status bit and to cause the result and at least one status bit to be stored in a single destination physical storage location, wherein the at least one status bit and result are accessible through a read of the single register.
    Type: Application
    Filed: June 27, 2015
    Publication date: December 29, 2016
    Inventors: Pavel G. Matveyev, Dmitry M. Maslennikov, Paul Caprioli, Gadi Haber
  • Publication number: 20160154651
    Abstract: Methods, apparatus and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines if opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region that includes encapsulations of the translations of invalid opcodes and state recoveries of the machine states is generated and saved to a translation cache memory.
    Type: Application
    Filed: February 2, 2016
    Publication date: June 2, 2016
    Inventors: Gadi Haber, Konstantin Kostya Levit-Gurevich, Esfir Natanzon, Boris Ginzburg, Aya Elhanan, Moshe Maury Bach, Igor Breger
  • Patent number: 9348594
    Abstract: An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code executing on the ASMP is analyzed by a binary analysis unit to determine what functions are called by the program code and select which of the cores are to execute the program code, or a code segment thereof. Selection may be made to provide for native execution of the program code, to minimize power consumption, and so forth. Control operations based on this selection may then be inserted into the program code, forming instrumented program code. The instrumented program code is then executed by the ASMP.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: May 24, 2016
    Assignee: Intel Corporation
    Inventors: Koichi Yamada, Boris Ginzburg, Wei Li, Ronny Ronen, Esfir Natanzon, Konstantin Levit-Gurevich, Gadi Haber, Alon Naveh, Eliezer Weissmann, Michael Mishaeli
  • Patent number: 9250906
    Abstract: Methods, apparatus and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines if opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region that includes encapsulations of the translations of invalid opcodes and state recoveries of the machine states is generated and saved to a translation cache memory.
    Type: Grant
    Filed: August 30, 2015
    Date of Patent: February 2, 2016
    Assignee: Intel Corporation
    Inventors: Gadi Haber, Konstantin Kostya Levit-Gurevich, Esfir Natanzon, Boris Ginzburg, Aya Elhanan, Moshe Maury Bach, Igor Breger
  • Publication number: 20150370567
    Abstract: Methods, apparatus and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines if opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region that includes encapsulations of the translations of invalid opcodes and state recoveries of the machine states is generated and saved to a translation cache memory.
    Type: Application
    Filed: August 30, 2015
    Publication date: December 24, 2015
    Inventors: Gadi Haber, Konstantin Kostya Levit-Gurevich, Esfir Natanzon, Boris Ginzburg, Aya Elhanan, Moshe Maury Bach, Igor Breger
  • Patent number: 9141361
    Abstract: Methods, apparatus and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines if opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region that includes encapsulations of the translations of invalid opcodes and state recoveries of the machine states is generated and saved to a translation cache memory.
    Type: Grant
    Filed: September 30, 2012
    Date of Patent: September 22, 2015
    Assignee: Intel Corporation
    Inventors: Gadi Haber, Konstantin Kostya Levit-Gurevich, Esfir Natanzon, Boris Ginzburg, Aya Elhanan, Moshe Maury Bach, Igor Breger
  • Publication number: 20140095832
    Abstract: Methods, apparatus and systems for virtualization of a native instruction set are disclosed. Embodiments include a processor core executing the native instructions and a second core, or alternatively only the second processor core consuming less power while executing a second instruction set that excludes portions of the native instruction set. The second core's decoder detects invalid opcodes of the second instruction set. A microcode layer disassembler determines if opcodes should be translated. A translation runtime environment identifies an executable region containing an invalid opcode, other invalid opcodes and interjacent valid opcodes of the second instruction set. An analysis unit determines an initial machine state prior to execution of the invalid opcode. A partial translation of the executable region that includes encapsulations of the translations of invalid opcodes and state recoveries of the machine states is generated and saved to a translation cache memory.
    Type: Application
    Filed: September 30, 2012
    Publication date: April 3, 2014
    Inventors: Gadi Haber, Konstantin Kostya Levit-Gurevich, Esfir Natanzon, Boris Ginzburg, Aya Elhanan, Moshe Maury Bach, Igor Breger
  • Publication number: 20140019723
    Abstract: An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code for execution on the ASMP is analyzed and a determination is made as to whether to allow the program code, or a code segment thereof to execute on a first core natively or to use binary translation on the code and execute the translated code on a second core which consumes less power than the first core during execution.
    Type: Application
    Filed: December 28, 2011
    Publication date: January 16, 2014
    Inventors: Koichi Yamada, Ronny Ronen, Wei Li, Boris Ginzburg, Gadi Haber, Konstantin Levit-Gurevich, Esfir Natanzon, Alon Naveh, Eliezer Weissmann, Michael Mishaeli
  • Publication number: 20130268742
    Abstract: An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code executing on the ASMP is analyzed by a binary analysis unit to determine what functions are called by the program code and select which of the cores are to execute the program code, or a code segment thereof. Selection may be made to provide for native execution of the program code, to minimize power consumption, and so forth. Control operations based on this selection may then be inserted into the program code, forming instrumented program code. The instrumented program code is then executed by the ASMP.
    Type: Application
    Filed: December 29, 2011
    Publication date: October 10, 2013
    Inventors: Koichi Yamada, Boris Ginzburg, Wei Li, Ronny Ronen, Esfir Natanzon, Konstantin Levit-Gurevich, Gadi Haber, Alon Naveh, Eliezer Weissmann, Michael Mishaeli