Patents by Inventor Zeev Sperber

Zeev Sperber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220326948
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Application
    Filed: June 28, 2022
    Publication date: October 13, 2022
    Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
  • Publication number: 20220326949
    Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Application
    Filed: June 21, 2022
    Publication date: October 13, 2022
    Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
  • Patent number: 11455167
    Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: September 27, 2022
    Assignee: Intel Coporation
    Inventors: Raanan Sade, Thierry Pons, Amit Gradstein, Zeev Sperber, Mark J. Charney, Robert Valentine, Eyal Oz-Sinay
  • Publication number: 20220300286
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, performing a matrix operation of zeroing a matrix in response to a single instruction. For example, a processor detailed which includes decode circuitry to decode an instruction having fields for an opcode and a source/destination matrix operand identifier; and execution circuitry to execute the decoded instruction to zero each data element of the identified source/destination matrix.
    Type: Application
    Filed: June 6, 2022
    Publication date: September 22, 2022
    Inventors: Robert VALENTINE, Menachem ADELMAN, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Stanislav SHWARTSMAN
  • Publication number: 20220291927
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.
    Type: Application
    Filed: March 28, 2022
    Publication date: September 15, 2022
    Inventors: Robert VALENTINE, Menachem ADELMAN, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Dan BAUM, Yuri GEBIL
  • Publication number: 20220291926
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.
    Type: Application
    Filed: March 28, 2022
    Publication date: September 15, 2022
    Inventors: Robert VALENTINE, Menachem ADELMAN, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Dan BAUM, Yuri GEBIL
  • Patent number: 11403071
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transpose rectangular tiles. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first destination, second destination, first source, and second source matrices, the specified opcode to cause the processor to process each of the specified source and destination matrices as a rectangular matrix, decode circuitry to decode the fetched rectangular matrix transpose instruction, and execution circuitry to respond to the decoded rectangular matrix transpose instruction by transposing each row of elements of the specified first source matrix into a corresponding column of the specified first destination matrix and transposing each row of elements of the specified second source matrix into a corresponding column of the specified second destination matrix.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: August 2, 2022
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Robert Valentine, Mark J. Charney, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Bret Toll, Jesus Corbal, Christopher J. Hughes, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall
  • Publication number: 20220236989
    Abstract: Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.
    Type: Application
    Filed: January 28, 2022
    Publication date: July 28, 2022
    Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Dan BAUM, Alexander HEINECKE, Elmoustapha OULD-AHMED-VALL
  • Publication number: 20220206805
    Abstract: Techniques for converting FP16 data elements to BF8 data elements using a single instruction are described. An exemplary apparatus includes decoder circuitry to decode a single instruction, the single instruction to include a one or more fields to identify a source operand, one or more fields to identify a destination operand, and one or more fields for an opcode, the opcode to indicate that execution circuitry is to convert packed half-precision floating-point data from the identified source to packed bfloat8 data and store the packed bfloat8 data into corresponding data element positions of the identified destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision floating-point data from the identified source to packed bfloat8 data and store the packed bfloat8 data into corresponding data element positions.
    Type: Application
    Filed: December 26, 2020
    Publication date: June 30, 2022
    Inventors: Alexander Heinecke, Naveen Mellempudi, Robert Valentine, Mark Charney, Christopher Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich
  • Publication number: 20220206743
    Abstract: Techniques for converting FP16 to BF8 using bias are described.
    Type: Application
    Filed: December 26, 2020
    Publication date: June 30, 2022
    Inventors: Alexander Heinecke, Naveen Mellempudi, Robert Valentine, Mark Charney, Christopher Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich
  • Publication number: 20220206925
    Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state.
    Type: Application
    Filed: January 24, 2022
    Publication date: June 30, 2022
    Inventors: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney
  • Publication number: 20220206801
    Abstract: Systems, methods, and apparatuses relating to 8-bit floating-point matrix dot product instructions are described.
    Type: Application
    Filed: December 26, 2020
    Publication date: June 30, 2022
    Applicant: Intel Corporation
    Inventors: Naveen Mellempudi, Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich
  • Patent number: 11372643
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: June 28, 2022
    Assignee: Intel Corporation
    Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
  • Patent number: 11366663
    Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: June 21, 2022
    Assignee: Intel Corporation
    Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
  • Publication number: 20220188114
    Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described.
    Type: Application
    Filed: March 7, 2022
    Publication date: June 16, 2022
    Inventors: Regev Shemy, Zeev Sperber, Wajdi Feghali, Vinodh Gopal, Amit Gradstein, Simon Rubanovich, Sean Gulley, Ilya Albrekht, Jacob Doweck, Jose Yallouz, Ittai Anati
  • Patent number: 11360770
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, performing a matrix operation of zeroing a matrix in response to a single instruction. For example, a processor detailed which includes decode circuitry to decode an instruction having fields for an opcode and a source/destination matrix operand identifier; and execution circuitry to execute the decoded instruction to zero each data element of the identified source/destination matrix.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: June 14, 2022
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Menachem Adelman, Zeev Sperber, Mark J. Charney, Bret L. Toll, Jesus Corbal, Alexander F. Heinecke, Barukh Ziv, Elmoustapha Ould-Ahmed-Vall, Stanislav Shwartsman
  • Patent number: 11354124
    Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.
    Type: Grant
    Filed: August 3, 2017
    Date of Patent: June 7, 2022
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
  • Publication number: 20220171623
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.
    Type: Application
    Filed: December 10, 2021
    Publication date: June 2, 2022
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Barukh ZIV, Alexander HEINECKE, Milind GIRKAR, Simon RUBANOVICH
  • Patent number: 11347502
    Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: May 31, 2022
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
  • Publication number: 20220147356
    Abstract: Systems, methods, and apparatuses relating to performing hashing operations on packed data elements are described.
    Type: Application
    Filed: November 29, 2021
    Publication date: May 12, 2022
    Inventors: Regev Shemy, Zeev Sperber, Wajdi Feghali, Vinodh Gopal, Amit Gradstein, Simon Rubanovich, Sean Gulley, Ilya Albrekht, Jacob Doweck, Jose Yallouz, Ittai Anati