Patents by Inventor Raanan Sade

Raanan Sade has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11036501
    Abstract: An apparatus and method for executing an atomic test and update instruction. For example, one embodiment of a processor comprises: a decoder to decode an atomic test and update (ATU) instruction having a first operand specifying a first value in a first storage location, a second operand specifying a second value in a second storage location, a third operand specifying a third value in a third storage location, and an opcode specifying a condition to be tested relative to the first and second values; and execution circuitry to perform a load lock operation to load the first value from the first storage location, the load lock operation to prevent access by another instruction before a result of the ATU instruction is stored, the execution circuitry to test a condition related the first value and the second value, wherein if the condition is met then the execution circuitry is to add the first value and the third value to generate a sum and to store the sum to the first storage location.
    Type: Grant
    Filed: December 23, 2018
    Date of Patent: June 15, 2021
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Joseph Nuzman, Hubert Nueckel
  • Patent number: 11023235
    Abstract: Embodiments detailed herein relate to systems and methods to zero a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zeroing instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zeroing instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: June 1, 2021
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Eyal Hadas
  • Patent number: 11023382
    Abstract: Implementations of using tiles for caching are detailed In some implementations, an instruction execution circuitry executes one or more instructions, a register state cache coupled to the instruction execution circuitry holds thread register state in a plurality of registers, and backing storage pointer storage stores a backing storage pointer, wherein the backing storage pointer is to reference a state backing storage area in external memory to store the thread register state stored in the register state cache.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: June 1, 2021
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Jason Brandt, Mark J. Charney, Joseph Nuzman, Leena Puthiyedath, Rinat Rappoport, Vivekananthan Sanjeepan, Robert Valentine
  • Publication number: 20210157589
    Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Application
    Filed: February 4, 2021
    Publication date: May 27, 2021
    Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
  • Publication number: 20210141683
    Abstract: Methods and apparatuses relating to memory corruption detection are described. In one embodiment, a hardware processor includes an execution unit to execute an instruction to request access to a block of a memory through a pointer to the block of the memory, and a memory management unit to allow access to the block of the memory when a memory corruption detection value in the pointer is validated with a memory corruption detection value in the memory for the block, wherein a position of the memory corruption detection value in the pointer is selectable between a first location and a second, different location.
    Type: Application
    Filed: September 14, 2020
    Publication date: May 13, 2021
    Inventors: Tomer Stark, Ron Gabor, Joseph Nuzman, Raanan Sade, Bryant E. Bigbee
  • Publication number: 20210124580
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Application
    Filed: December 23, 2020
    Publication date: April 29, 2021
    Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
  • Publication number: 20210124581
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Application
    Filed: December 23, 2020
    Publication date: April 29, 2021
    Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
  • Patent number: 10990396
    Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: April 27, 2021
    Assignee: Intel Corporation
    Inventors: Bret Toll, Christopher J. Hughes, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
  • Publication number: 20210117194
    Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Application
    Filed: December 23, 2020
    Publication date: April 22, 2021
    Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
  • Patent number: 10970076
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying ternary tile operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction specifying a ternary tile operation, and locations of destination and first, second, and third source matrices, each of the matrices having M rows by N columns; and execution circuitry to respond to the decoded instruction by, for each equal-sized group of K elements of the specified first, second, and third source matrices, generate K results by performing the ternary tile operation in parallel on K corresponding elements of the specified first, second, and third source matrices, and store each of the K results to a corresponding element of the specified destination matrix, wherein corresponding elements of the specified source and destination matrices occupy a same relative position within their associated matrix.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: April 6, 2021
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Bret Toll, Dan Baum, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
  • Patent number: 10970072
    Abstract: Disclosed embodiments relate to transposing vectors while loading from memory. In one example, a processor includes a register file, a memory interface, fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode, a destination vector register, and a source vector having N groups of elements, N being a positive integer, the opcode to indicate the processor is to fetch the source vector, generate write data comprising one or more N-tuples, each N-tuple comprising corresponding elements from each of the N groups of elements, and write the write data to the destination vector register, and execution circuitry to execute the decoded instruction as per the opcode, the execution circuitry has a shuffle pipeline disposed between the memory and the register file, the shuffle pipeline to fetch, decode, and execute further instances of the instruction at one instruction per clock cycle.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: April 6, 2021
    Assignee: Intel Corporation
    Inventors: Alexander F. Heinecke, Evangelos Georganas, Christopher J. Hughes, Raanan Sade, Robert Valentine
  • Publication number: 20210096860
    Abstract: An apparatus and method for pairing store operations. For example, one embodiment of a processor comprises: a grouping eligibility checker to evaluate a plurality of store instructions based on a set of grouping rules to determine whether two or more of the plurality of store instructions are eligible for grouping; and a dispatcher to simultaneously dispatch a first group of store instructions of the plurality of store instructions determined to be eligible for grouping by the grouping eligibility checker.
    Type: Application
    Filed: March 28, 2020
    Publication date: April 1, 2021
    Inventors: Raanan SADE, Igor YANOVER, Stanislav SHWARTSMAN, Muhammad TAHER, David ZYSMAN, Liron ZUR, Yiftach GILAD
  • Publication number: 20210096822
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transpose rectangular tiles. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first destination, second destination, first source, and second source matrices, the specified opcode to cause the processor to process each of the specified source and destination matrices as a rectangular matrix, decode circuitry to decode the fetched rectangular matrix transpose instruction, and execution circuitry to respond to the decoded rectangular matrix transpose instruction by transposing each row of elements of the specified first source matrix into a corresponding column of the specified first destination matrix and transposing each row of elements of the specified second source matrix into a corresponding column of the specified second destination matrix.
    Type: Application
    Filed: December 14, 2020
    Publication date: April 1, 2021
    Inventors: Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Simon RUBANOVICH, Amit GRADSTEIN, Zeev SPERBER, Bret TOLL, Jesus CORBAL, Christopher J. HUGHES, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL
  • Patent number: 10963256
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: March 30, 2021
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Robert Valentine, Bret Toll, Christopher J. Hughes, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
  • Patent number: 10963246
    Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: March 30, 2021
    Assignee: Intel Corporation
    Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
  • Patent number: 10929143
    Abstract: An apparatus and method for efficient matrix alignment in a systolic array.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: February 23, 2021
    Assignee: Intel Corporation
    Inventors: Mike Espig, Bret Toll, Raanan Sade, Bob Valentine, Alexander Heinecke, Christopher J. Hughes
  • Publication number: 20210049102
    Abstract: Method and system for performing data movement operations is described herein. One embodiment of a method includes: storing data for a first memory address in a cache line of a memory of a first processing unit, the cache line associated with a coherency state indicating that the memory has sole ownership of the cache line; decoding an instruction for execution by a second processing unit, the instruction comprising a source data operand specifying the first memory address and a destination operand specifying a memory location in the second processing unit; and responsive to executing the decoded instruction, copying data from the cache line of the memory of the first processing unit as identified by the first memory address, to the memory location of the second processing unit, wherein responsive to the copy, the cache line is to remain in the memory and the coherency state is to remain unchanged.
    Type: Application
    Filed: March 30, 2020
    Publication date: February 18, 2021
    Applicant: Intel Corporation
    Inventors: Anil Vasudevan, Venkata Krishnan, Andrew J. Herdrich, Ren Wang, Robert G. Blankenship, Vedaraman Geetha, Shrikant M. Shah, Marshall A. Millier, Raanan Sade, Binh Q. Pham, Olivier Serres, Chyi-Chang Miao, Christopher B. Wilkerson
  • Patent number: 10896043
    Abstract: Disclosed embodiments relate to instructions for fast element unpacking. In one example, a processor includes fetch circuitry to fetch an instruction whose format includes fields to specify an opcode and locations of an Array-of-Structures (AOS) source matrix and one or more Structure of Arrays (SOA) destination matrices, wherein: the specified opcode calls for unpacking elements of the specified AOS source matrix into the specified Structure of Arrays (SOA) destination matrices, the AOS source matrix is to contain N structures each containing K elements of different types, with same-typed elements in consecutive structures separated by a stride, the SOA destination matrices together contain K segregated groups, each containing N same-typed elements, decode circuitry to decode the fetched instruction, and execution circuitry, responsive to the decoded instruction, to unpack each element of the specified AOS matrix into one of the K element types of the one or more SOA matrices.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: January 19, 2021
    Assignee: Intel Corporation
    Inventors: Bret Toll, Alexander F. Heinecke, Christopher J. Hughes, Ronen Zohar, Michael Espig, Dan Baum, Raanan Sade, Robert Valentine, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 10866786
    Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transpose rectangular tiles. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first destination, second destination, first source, and second source matrices, the specified opcode to cause the processor to process each of the specified source and destination matrices as a rectangular matrix, decode circuitry to decode the fetched rectangular matrix transpose instruction, and execution circuitry to respond to the decoded rectangular matrix transpose instruction by transposing each row of elements of the specified first source matrix into a corresponding column of the specified first destination matrix and transposing each row of elements of the specified second source matrix into a corresponding column of the specified second destination matrix.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: December 15, 2020
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Robert Valentine, Mark J. Charney, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Bret Toll, Jesus Corbal, Christopher J. Hughes, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall
  • Publication number: 20200379917
    Abstract: A processing system includes a processing core to execute a virtual machine (VM) comprising a guest operating system (OS) and a memory management unit, communicatively coupled to the processing core, comprising a storage device to store an extended page table entry (EPTE) comprising a mapping from a guest physical address (GPA) associated with the guest OS to an identifier of a memory frame, a first plurality of access right flags associated with accessing the memory frame in a first page mode referenced by an attribute of a memory page identified by the GPA, and a second plurality of access right flags associated with accessing the memory frame in a second page mode referenced by the attribute of the memory page identified by the GPA.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 3, 2020
    Inventors: Gilbert Neiger, Baiju V. Patel, Gur Hildesheim, Ron Rais, Andrew V. Anderson, Jason W. Brandt, David M. Durham, Barry E. Huntley, Raanan Sade, Ravi L. Sahita, Vedvyas Shanbhogue, Arumugam Thiyagarajah