Patents by Inventor Zeev Sperber

Zeev Sperber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210406018
    Abstract: Systems, methods, and apparatuses relating to one or more instructions that utilize direct paths for loading data into a tile from a vector register and/or storing data from a tile into a vector register are described.
    Type: Application
    Filed: June 27, 2020
    Publication date: December 30, 2021
    Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Yaroslav Pollak, Gideon Stupp, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark Charney, Christopher Hughes, Alexander Heinecke
  • Publication number: 20210406012
    Abstract: Embodiments for loading and storing matrix data with datatype conversion are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a first destination matrix location, and a first source operand field to specify a first source matrix location. The execution circuitry is to, in response to the decoded instruction, convert data elements from a plurality of source element locations of a first source matrix specified by the first source matrix location from a first datatype to a second datatype to generate a plurality of converted data elements and to store each of the plurality of converted data elements in one of a plurality of destination element locations in a first destination matrix specified by the first destination matrix location.
    Type: Application
    Filed: June 27, 2020
    Publication date: December 30, 2021
    Applicant: Intel Corporation
    Inventors: Menachem Adelman, Robert Valentine, Gideon Stupp, Yaroslav Pollak, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark J. Charney, Christopher J. Hughes, Alexander F. Heinecke, Evangelos Georganas
  • Publication number: 20210405974
    Abstract: Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.
    Type: Application
    Filed: June 27, 2020
    Publication date: December 30, 2021
    Applicant: Intel Corporation
    Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark J. Charney, Christopher J. Hughes, Alexander F. Heinecke, Evangelos Georganas, Binh Pham
  • Publication number: 20210406011
    Abstract: Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand.
    Type: Application
    Filed: September 7, 2021
    Publication date: December 30, 2021
    Inventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein
  • Patent number: 11200055
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: December 14, 2021
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Simon Rubanovich
  • Patent number: 11188335
    Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described.
    Type: Grant
    Filed: November 2, 2020
    Date of Patent: November 30, 2021
    Assignee: Intel Corporation
    Inventors: Regev Shemy, Zeev Sperber, Wajdi Feghali, Vinodh Gopal, Amit Gradstein, Simon Rubanovich, Sean Gulley, Ilya Albrekht, Jacob Doweck, Jose Yallouz, Ittai Anati
  • Publication number: 20210357216
    Abstract: In one embodiment, a processor includes a fetch logic to fetch instructions, a decode logic to decode the fetched instructions, and an execution logic to execute at least some of the instructions. The decode logic may determine whether a flag portion of a first instruction to be folded is to be performed, and if not, accumulate a first immediate value of the first instruction with a folded immediate value obtained from an entry of an immediate buffer. Other embodiments are described and claimed.
    Type: Application
    Filed: June 1, 2021
    Publication date: November 18, 2021
    Inventors: Zeev Sperber, Tomer Weiner, Amit Gradstein, Simon Rubanovich, Alex Gerber, Itai Ravid
  • Patent number: 11175891
    Abstract: Disclosed embodiments relate to performing floating-point addition with selected rounding. In one example, a processor includes circuitry to decode and execute an instruction specifying locations of first and second floating-point (FP) sources, and an opcode indicating the processor is to: bring the FP sources into alignment by shifting a mantissa of the smaller source FP operand to the right by a difference between their exponents, generating rounding controls based on any bits that escape; simultaneously generate a sum of the FP sources and of the FP sources plus one, the sums having a fuzzy-Jbit format having an additional Jbit into which a carry-out, if any, select one of the sums based on the rounding controls, and generate a result comprising a mantissa-wide number of most-significant bits of the selected sum, starting with the most significant non-zero Jbit.
    Type: Grant
    Filed: March 30, 2019
    Date of Patent: November 16, 2021
    Assignee: Intel Corporation
    Inventors: Simon Rubanovich, Amit Gradstein, Zeev Sperber, Mrinmay Dutta
  • Publication number: 20210349720
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.
    Type: Application
    Filed: July 22, 2021
    Publication date: November 11, 2021
    Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
  • Patent number: 11169802
    Abstract: In some embodiments, packed data elements of first and second packed data source operands are of a first, different size than a second size of packed data elements of a third packed data operand. Execution circuitry executes decoded single instruction to perform, for each packed data element position of a destination operand, a multiplication of a M N-sized packed data elements from the first and second packed data sources that correspond to a packed data element position of the third packed data source, add of results from these multiplications to a full-sized packed data element of a packed data element position of the third packed data source, and storage of the addition result in a packed data element position destination corresponding to the packed data element position of the third packed data source, wherein M is equal to the full-sized packed data element divided by N.
    Type: Grant
    Filed: October 20, 2016
    Date of Patent: November 9, 2021
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein
  • Patent number: 11163565
    Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a double word with saturation; computing a dot product of bytes and accumulating in to a dword with saturation, where the input bytes can be signed or unsigned and the dword accumulation has output saturation; etc.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: November 2, 2021
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Menachem Adelman, Barukh Ziv, Alexander Heinecke, Simon Rubanovich
  • Patent number: 11150979
    Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.
    Type: Grant
    Filed: August 13, 2019
    Date of Patent: October 19, 2021
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Stanislav Shwartsman, Jared W. Stark, IV, Lihu Rappoport, Igor Yanover, George Leifman
  • Publication number: 20210318932
    Abstract: An apparatus and method are described for detecting and correcting data fetch errors within a processor core. For example, one embodiment of an instruction processing apparatus for detecting and recovering from data fetch errors comprises: at least one processor core having a plurality of instruction processing stages including a data fetch stage and a retirement stage; and error processing logic in communication with the processing stages to perform the operations of: detecting an error associated with data in response to a data fetch operation performed by the data fetch stage; and responsively performing one or more operations to ensure that the error does not corrupt an architectural state of the processor core within the retirement stage.
    Type: Application
    Filed: June 23, 2021
    Publication date: October 14, 2021
    Applicant: Intel Corporation
    Inventors: Theodros Yigzaw, Geeyarpuram N. Santhanakrishnan, Ganapati N. Srinivasa, Jose A. Vargas, Hisham Shafi, Michael Mishaeli, Ehud Cohen, Zeev Sperber, Shlomo Raikin, Mohan J. Kumar, Julius Y. Mandelblat
  • Publication number: 20210303309
    Abstract: In one embodiment, a processor includes a fetch logic to fetch instructions, a decode logic to decode the instructions, an execution logic to execute at least some of the instructions, and a reconstruction logic. The decode logic may identify a first instruction having a first immediate value, accumulate the first immediate value with a folded immediate value associated with a first operand of the first instruction, and prevent the first instruction from provision to the execution logic, such that the first instruction is not to be executed within the execution logic. The reconstruction logic may reconstruct one or more flags associated with a result of the first instruction. Other embodiments are described and claimed.
    Type: Application
    Filed: March 27, 2020
    Publication date: September 30, 2021
    Applicant: Intel Corporation
    Inventors: Zeev Sperber, Amit Gradstein, Simon Rubanovich
  • Publication number: 20210286620
    Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.
    Type: Application
    Filed: March 29, 2021
    Publication date: September 16, 2021
    Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
  • Publication number: 20210279038
    Abstract: Disclosed embodiments relate to performing floating-point (FP) arithmetic. In one example, a processor is to decode an instruction specifying locations of first, second, and third floating-point (FP) operands and an opcode calling for accumulating a FP product of the first and second FP operands with the third FP operand, and execution circuitry to, in a first cycle, generate the FP product having a Fuzzy-Jbit format comprising a sign bit, a 9-bit exponent, and a 25-bit mantissa having two possible positions for a JBit and, in a second cycle, to accumulate the FP product with the third FP operand, while concurrently, based on Jbit positions of the FP product and the third FP operand, determining an exponent adjustment and a mantissa shift control of a result of the accumulation, wherein performing the exponent adjustment concurrently enhances an ability to perform the accumulation in one cycle.
    Type: Application
    Filed: May 25, 2021
    Publication date: September 9, 2021
    Inventors: Amit GRADSTEIN, Simon RUBANOVICH, Zeev SPERBER
  • Publication number: 20210263743
    Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four non-negative integers in numerical order with all integers in consecutive positions differing by a constant stride of at least two. In an aspect, storing the result including the sequence of the at least four integers is performed without calculating the at least four integers using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Application
    Filed: December 14, 2020
    Publication date: August 26, 2021
    Inventors: Elmoustapha Ould-Ahmed-Vall, Seth Abraham, Robert Valentine, Zeev Sperber, Amit Gradstein
  • Patent number: 11093247
    Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: August 17, 2021
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
  • Patent number: 11086623
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: August 10, 2021
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Stanislav Shwartsman, Dan Baum, Igor Yanover, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Jesus Corbal, Yuri Gebil, Simon Rubanovich
  • Patent number: 11080048
    Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Menachem Adelman, Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Dan Baum, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade