Patents by Inventor Alexander Heinecke

Alexander Heinecke has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240045681
    Abstract: Techniques for comparing FP8 data elements are described. An exemplary FP8 comparison instruction includes fields for an opcode, an identification of a location of a first packed data source operand, and an identification of a location of a second packed data source operand, wherein the opcode is to indicate that execution circuitry is to perform, for a particular data element position of the packed data source operands, a comparison of a data element at that position, and update a flags register based on the comparison.
    Type: Application
    Filed: October 1, 2022
    Publication date: February 8, 2024
    Inventors: Alexander Heinecke, Menachem Adelman, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber
  • Publication number: 20240045683
    Abstract: Techniques for performing square root or reciprocal square root calculations on FP8 data elements in response to an instruction are described. An example of an instruction is one that includes fields for an opcode, an identification of a location of a packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a calculation of a square root value of a FP8 data element in that position and store a result of each square root into a corresponding data element position of the packed data destination operand.
    Type: Application
    Filed: October 1, 2022
    Publication date: February 8, 2024
    Inventors: Alexander Heinecke, Menachem Adelman, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber
  • Publication number: 20240045682
    Abstract: Techniques for scale and reduction of FP8 data elements are described. An exemplary instruction includes fields for an having fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operands, a floating point scale operation of a FP8 data element of the first packed data source by multiplying the data element by a power of 2 value, wherein a value of the exponent of the power of 2 value is a floor value of a FP8 data element of the second packed data source, and store a result of the floating point scale operation into a corresponding data element position of the packed data destination operand.
    Type: Application
    Filed: October 1, 2022
    Publication date: February 8, 2024
    Inventors: Alexander Heinecke, Menachem Adelman, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber
  • Publication number: 20240045677
    Abstract: Techniques for converting FP16 or FP32 data elements to FP8 data elements using a single instruction are described. An exemplary apparatus includes decoder circuitry to decode a single instruction, the single instruction to include a one or more fields to identify a source operand, one or more fields to identify a destination operand, and one or more fields for an opcode, the opcode to indicate that execution circuitry is to convert packed half-precision floating-point data or single-precision floating point data from the identified source to packed FP8 data and store the packed bfloat8 data into corresponding data element positions of the identified destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision floating-point data or single-precision floating point data from the identified source to packed bfloat8 data and store the packed bfloat8 data into corresponding data element positions.
    Type: Application
    Filed: October 1, 2022
    Publication date: February 8, 2024
    Inventors: Alexander Heinecke, Menachem Adelman, Mark Charney, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber, Robert Valentine
  • Publication number: 20240045684
    Abstract: Techniques for converting FP16 to BF8 using bias are described.
    Type: Application
    Filed: October 1, 2022
    Publication date: February 8, 2024
    Inventors: Alexander Heinecke, Menachem Adelman, Mark Charney, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber, Robert Valentine
  • Publication number: 20240045686
    Abstract: Techniques for converting FP8 data elements to FP16 or FP32 data elements using a single instruction are described. An example apparatus includes decoder circuitry to decode a single instruction, the single instruction to indicate that execution circuitry is to convert packed FP8 data from the identified source to packed half-precision floating-point data or single-precision floating point data and store the packed half-precision floating-point data or single-precision floating point data into corresponding data element positions of the identified destination operand.
    Type: Application
    Filed: October 1, 2022
    Publication date: February 8, 2024
    Inventors: Alexander Heinecke, Menachem Adelman, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber
  • Publication number: 20240045688
    Abstract: Techniques for performing FP8 FMA in response to an instruction are described. In some examples, an instruction has fields for an opcode, an identification of location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, an identification of a location of a third packed data source operand, and an identification of location of a packed data source/destination operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a FP8 value fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand, wherein the FP8 value has an 8-bit floating point format that comprises one bit for a sign, at least 4 bits for an exponent, and at least two bits for a fraction.
    Type: Application
    Filed: October 1, 2022
    Publication date: February 8, 2024
    Inventors: Alexander Heinecke, Menachem Adelman, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber
  • Patent number: 11868770
    Abstract: Embodiments detailed herein relate to arithmetic operations of float-point values. An exemplary processor includes decoding circuitry to decode an instruction, where the instruction specifies locations of a plurality of operands, values of which being in a floating-point format. The exemplary processor further includes execution circuitry to execute the decoded instruction, where the execution includes to: convert the values for each operand, each value being converted into a plurality of lower precision values, where an exponent is to be stored for each operand; perform arithmetic operations among lower precision values converted from values for the plurality of the operands; and generate a floating-point value by converting a resulting value from the arithmetic operations into the floating-point format and store the floating-point value.
    Type: Grant
    Filed: December 29, 2022
    Date of Patent: January 9, 2024
    Assignee: INTEL CORPORATION
    Inventors: Gregory Henry, Alexander Heinecke
  • Publication number: 20230418750
    Abstract: Techniques for hierarchical core valid tracking are described. An example apparatus comprises a cache to store information accessible by two or more cores, and circuitry coupled to the cache to maintain coherence of the information stored in the cache and to hierarchically track respective associations of the information stored in the cache with the two or more cores, where a lowest hierarchical level of the hierarchically tracked associations is to indicate a logical core identifier of a particular core of the two or more cores. Other examples are disclosed and claimed.
    Type: Application
    Filed: June 28, 2022
    Publication date: December 28, 2023
    Applicant: Intel Corporation
    Inventors: Yedidya Hilewitz, Monam Agarwal, Yen-Cheng Liu, Alexander Heinecke
  • Publication number: 20230409326
    Abstract: Techniques and mechanisms for processor circuitry to execute a load and expand instruction of an instruction set to generate decompressed matrix data. In an embodiment, the instruction comprises a source operand which indicates a location from which compressed matrix data, and corresponding metadata, are to be accessed. A destination operand of the instruction indicates a location which is to receive decompressed metadata, which is generated, during execution of the instruction, based on the compressed matrix data and the corresponding metadata. The metadata comprises compression mask information which identifies which elements of the matrix have been masked from the compressed matrix data. In another embodiment, the instruction further comprises a count operand which identifies a total number of the unmasked matrix elements which are represented in the compressed matrix data.
    Type: Application
    Filed: June 15, 2022
    Publication date: December 21, 2023
    Applicant: Intel Corporation
    Inventors: Menachem Adelman, Amit Gradstein, Simon Rubanovich, Barukh Ziv, Uri Sherman, Dana Rip, Shahar Mizrahi, Dan Baum, Rinat Rappoport, Nilesh Jain, Zeev Sperber, Gideon Stupp, Alexander Heinecke, Christopher Hughes, Evangelos Georganas
  • Publication number: 20230385059
    Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).
    Type: Application
    Filed: August 14, 2023
    Publication date: November 30, 2023
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 11816483
    Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address, and execution circuitry to execute the decoded instruction to store configuration information about usage of storage for two-dimensional data structures at the memory address.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: November 14, 2023
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
  • Patent number: 11809869
    Abstract: Embodiments detailed herein relate to systems and methods to store a tile register pair to memory. In one example, a processor includes: decode circuitry to decode a store matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded store matrix pair instruction to store every element of left and right tiles of the identified source matrix to corresponding element positions of left and right tiles of the identified destination matrix, respectively, wherein the executing stores a chunk of C elements of one row of the identified source matrix at a time.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: November 7, 2023
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
  • Patent number: 11789729
    Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: October 17, 2023
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 11768681
    Abstract: An apparatus and method for performing multiply-accumulate operations.
    Type: Grant
    Filed: January 24, 2018
    Date of Patent: September 26, 2023
    Assignee: Intel Corporation
    Inventors: Alexander Heinecke, Dipankar Das, Robert Valentine, Mark Charney
  • Publication number: 20230256961
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed herein that mitigate hard-braking events. An example apparatus at least one memory; instructions; and processor circuitry to execute the instructions to: determine a danger level associated with an object, the danger level indicative of a first measure of damage corresponding to a trajectory of the object compared to a trajectory of a vehicle; determine, based on the first danger level, a danger measure based on at least one of a position of the object, a velocity of the object, an acceleration of the object, a direction of travel of the object, a weight or mass of the object; and generate instructions to transmit to a steering system or a braking system of the vehicle based on the determination.
    Type: Application
    Filed: February 2, 2023
    Publication date: August 17, 2023
    Inventors: Alexander Heinecke, Sara Baghsorkhi, Justin Gottschlich, Mohammad Mejbah Ul Alam, Shengtian Zhou, Jeffrey Ota
  • Publication number: 20230229446
    Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.
    Type: Application
    Filed: March 20, 2023
    Publication date: July 20, 2023
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
  • Publication number: 20230214215
    Abstract: Embodiments detailed herein relate to arithmetic operations of float-point values. An exemplary processor includes decoding circuitry to decode an instruction, where the instruction specifies locations of a plurality of operands, values of which being in a floating-point format. The exemplary processor further includes execution circuitry to execute the decoded instruction, where the execution includes to: convert the values for each operand, each value being converted into a plurality of lower precision values, where an exponent is to be stored for each operand; perform arithmetic operations among lower precision values converted from values for the plurality of the operands; and generate a floating-point value by converting a resulting value from the arithmetic operations into the floating-point format and store the floating-point value.
    Type: Application
    Filed: December 29, 2022
    Publication date: July 6, 2023
    Inventors: Gregory HENRY, Alexander HEINECKE
  • Patent number: 11669326
    Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a quadword data elements of a matrix pair. Additionally, in some instances, non-accumulating quadword data elements of the matrix pair are set to zero.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: June 6, 2023
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
  • Patent number: 11669586
    Abstract: The present disclosure relates to an apparatus that includes decoding circuitry that decodes a single instruction. The single instruction includes an identifier of a first source operand, an identifier of a second source operand, an identifier of a destination, and an opcode indicative of execution circuitry is to multiply from the identified first source operand and the identified second source operand and store a result in the identified destination. Additionally, the apparatus includes execution circuitry to execute the single decoded instruction to calculate a dot product by calculating a plurality of products using data elements of the identified first and second operands using values less precise than the identified first and second source operands, summing the calculated products, and storing the summed products in the destination.
    Type: Grant
    Filed: February 25, 2022
    Date of Patent: June 6, 2023
    Assignee: Intel Corporation
    Inventors: Gregory Henry, Alexander Heinecke