Patents by Inventor Raanan Sade

Raanan Sade has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Apparatus and method for a range comparison, exchange, and add

Patent number: 11036501

Abstract: An apparatus and method for executing an atomic test and update instruction. For example, one embodiment of a processor comprises: a decoder to decode an atomic test and update (ATU) instruction having a first operand specifying a first value in a first storage location, a second operand specifying a second value in a second storage location, a third operand specifying a third value in a third storage location, and an opcode specifying a condition to be tested relative to the first and second values; and execution circuitry to perform a load lock operation to load the first value from the first storage location, the load lock operation to prevent access by another instruction before a result of the ATU instruction is stored, the execution circuitry to test a condition related to the first value and the second value, wherein if the condition is met then the execution circuitry is to add the first value and the third value to generate a sum and to store the sum to the first storage location.

Type: Grant

Filed: December 23, 2018

Date of Patent: June 15, 2021

Assignee: Intel Corporation

Inventors: Raanan Sade, Joseph Nuzman, Hubert Nueckel
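
Illustration (not part of the patent record): the test-then-add behavior described in the abstract of patent 11036501 can be sketched in software roughly as follows. This is only a minimal sketch under stated assumptions: a Python lock stands in for the hardware load lock, and the names Cell and atomic_test_and_add, the default "less than" condition, and the boolean return value are all hypothetical and do not come from the patent.

```python
import operator
import threading


class Cell:
    """Hypothetical storage location; the lock stands in for the load lock."""

    def __init__(self, value):
        self.value = value
        self.lock = threading.Lock()


def atomic_test_and_add(cell, second, third, condition=operator.lt):
    """Test condition(first, second); if it holds, store first + third.

    Returns True when the update was performed (an assumption made for
    this sketch; the abstract does not say what the instruction returns).
    """
    with cell.lock:  # no other access until the result is stored
        first = cell.value
        if condition(first, second):
            cell.value = first + third
            return True
        return False


counter = Cell(5)
atomic_test_and_add(counter, 10, 3)  # 5 < 10, so the cell becomes 8
atomic_test_and_add(counter, 8, 3)   # 8 < 8 is false, so the cell stays 8
```

Passing a different comparison, such as operator.ge, plays the role of the opcode selecting which condition is tested.
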
Systems and methods to zero a tile register pair

Patent number: 11023235

Abstract: Embodiments detailed herein relate to systems and methods to zero a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zeroing instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zeroing instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.

Type: Grant

Filed: December 29, 2017

Date of Patent: June 1, 2021

Assignee: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Eyal Hadas
Systems, methods, and apparatuses utilizing CPU storage with a memory reference

Patent number: 11023382

Abstract: Implementations of using tiles for caching are detailed. In some implementations, an instruction execution circuitry executes one or more instructions, a register state cache coupled to the instruction execution circuitry holds thread register state in a plurality of registers, and backing storage pointer storage stores a backing storage pointer, wherein the backing storage pointer is to reference a state backing storage area in external memory to store the thread register state stored in the register state cache.

Type: Grant

Filed: December 22, 2017

Date of Patent: June 1, 2021

Assignee: Intel Corporation

Inventors: Raanan Sade, Jason Brandt, Mark J. Charney, Joseph Nuzman, Leena Puthiyedath, Rinat Rappoport, Vivekananthan Sanjeepan, Robert Valentine
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT VECTOR DOT PRODUCT INSTRUCTIONS

Publication number: 20210157589

Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

Type: Application

Filed: February 4, 2021

Publication date: May 27, 2021

Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
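
Illustration (not part of the patent record): the multiply-and-accumulate step described in the abstract of publication 20210157589 can be sketched as below. Several things here are assumptions rather than details from the abstract: the 16-bit format is taken to be a bfloat16-style layout (the upper 16 bits of a single-precision value), two 16-bit pairs are assumed per destination element, and the helper names are hypothetical. Python accumulates in double precision, so rounding may differ from a single-precision accumulator.

```python
import struct


def to_bf16_bits(x):
    """Keep the upper 16 bits of the float32 encoding (assumed layout)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def from_bf16_bits(h):
    """Expand 16 bits back to a Python float by zero-filling the low half."""
    (x,) = struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))
    return x


def dot_accumulate(dest, src1, src2, pairs_per_element=2):
    """For each destination element, add the products of its 16-bit pairs."""
    if len(src1) != len(src2) or len(src1) != pairs_per_element * len(dest):
        raise ValueError("source and destination lengths do not match")
    result = []
    for i, acc in enumerate(dest):
        for j in range(pairs_per_element):
            k = i * pairs_per_element + j
            acc += from_bf16_bits(src1[k]) * from_bf16_bits(src2[k])
        result.append(acc)
    return result


a = [to_bf16_bits(v) for v in (1.0, 2.0, 3.0, 4.0)]
b = [to_bf16_bits(v) for v in (0.5, 0.5, 2.0, 1.0)]
print(dot_accumulate([1.0, 0.0], a, b))
```
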
HARDWARE APPARATUSES AND METHODS FOR MEMORY CORRUPTION DETECTION

Publication number: 20210141683

Abstract: Methods and apparatuses relating to memory corruption detection are described. In one embodiment, a hardware processor includes an execution unit to execute an instruction to request access to a block of a memory through a pointer to the block of the memory, and a memory management unit to allow access to the block of the memory when a memory corruption detection value in the pointer is validated with a memory corruption detection value in the memory for the block, wherein a position of the memory corruption detection value in the pointer is selectable between a first location and a second, different location.

Type: Application

Filed: September 14, 2020

Publication date: May 13, 2021

Inventors: Tomer Stark, Ron Gabor, Joseph Nuzman, Raanan Sade, Bryant E. Bigbee
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO CONVERT TO 16-BIT FLOATING-POINT FORMAT

Publication number: 20210124580

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

Type: Application

Filed: December 23, 2020

Publication date: April 29, 2021

Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
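
Illustration (not part of the patent record): the conversion "to include truncation and rounding" described in the abstract of publication 20210124580 might look like the sketch below. It assumes a bfloat16-style target (the upper 16 bits of a single-precision value) and round-to-nearest-even; the abstract specifies neither, and the function name and the NaN encoding used are hypothetical.

```python
import struct


def float32_to_bf16_bits(x):
    """Convert one value to 16 bits, rounding to nearest, ties to even."""
    if x != x:  # NaN: return a quiet NaN pattern (choice made for this sketch)
        return 0x7FC0
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits.
    bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + bias) >> 16) & 0xFFFF


def convert_vector(source):
    """Convert each element and store it at the corresponding position."""
    return [float32_to_bf16_bits(v) for v in source]


print([hex(h) for h in convert_vector([1.0, -2.0, 1.00390625, 1.01171875])])
```

The last two inputs sit exactly halfway between two representable 16-bit values, so they show the tie rule: the first rounds down to 0x3f80 and the second rounds up to 0x3f82.
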
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO CONVERT TO 16-BIT FLOATING-POINT FORMAT

Publication number: 20210124581

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

Type: Application

Filed: December 23, 2020

Publication date: April 29, 2021

Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
Systems for performing instructions to quickly convert and use tiles as 1D vectors

Patent number: 10990396

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Grant

Filed: September 27, 2018

Date of Patent: April 27, 2021

Assignee: Intel Corporation

Inventors: Bret Toll, Christopher J. Hughes, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT VECTOR DOT PRODUCT INSTRUCTIONS

Publication number: 20210117194

Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

Type: Application

Filed: December 23, 2020

Publication date: April 22, 2021

Inventors: Alexander F. HEINECKE, Robert VALENTINE, Mark J. CHARNEY, Raanan SADE, Menachem ADELMAN, Zeev SPERBER, Amit GRADSTEIN, Simon RUBANOVICH
Systems and methods for performing instructions specifying ternary tile logic operations

Patent number: 10970076

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying ternary tile operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction specifying a ternary tile operation, and locations of destination and first, second, and third source matrices, each of the matrices having M rows by N columns; and execution circuitry to respond to the decoded instruction by, for each equal-sized group of K elements of the specified first, second, and third source matrices, generate K results by performing the ternary tile operation in parallel on K corresponding elements of the specified first, second, and third source matrices, and store each of the K results to a corresponding element of the specified destination matrix, wherein corresponding elements of the specified source and destination matrices occupy a same relative position within their associated matrix.

Type: Grant

Filed: September 14, 2018

Date of Patent: April 6, 2021

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Bret Toll, Dan Baum, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
Systems and methods to transpose vectors on-the-fly while loading from memory

Patent number: 10970072

Abstract: Disclosed embodiments relate to transposing vectors while loading from memory. In one example, a processor includes a register file, a memory interface, fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode, a destination vector register, and a source vector having N groups of elements, N being a positive integer, the opcode to indicate the processor is to fetch the source vector, generate write data comprising one or more N-tuples, each N-tuple comprising corresponding elements from each of the N groups of elements, and write the write data to the destination vector register, and execution circuitry to execute the decoded instruction as per the opcode, the execution circuitry has a shuffle pipeline disposed between the memory and the register file, the shuffle pipeline to fetch, decode, and execute further instances of the instruction at one instruction per clock cycle.

Type: Grant

Filed: December 21, 2018

Date of Patent: April 6, 2021

Assignee: Intel Corporation

Inventors: Alexander F. Heinecke, Evangelos Georganas, Christopher J. Hughes, Raanan Sade, Robert Valentine
Apparatus and Method for Store Pairing with Reduced Hardware Requirements

Publication number: 20210096860

Abstract: An apparatus and method for pairing store operations. For example, one embodiment of a processor comprises: a grouping eligibility checker to evaluate a plurality of store instructions based on a set of grouping rules to determine whether two or more of the plurality of store instructions are eligible for grouping; and a dispatcher to simultaneously dispatch a first group of store instructions of the plurality of store instructions determined to be eligible for grouping by the grouping eligibility checker.

Type: Application

Filed: March 28, 2020

Publication date: April 1, 2021

Inventors: Raanan SADE, Igor YANOVER, Stanislav SHWARTSMAN, Muhammad TAHER, David ZYSMAN, Liron ZUR, Yiftach GILAD
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSPOSE RECTANGULAR TILES

Publication number: 20210096822

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transpose rectangular tiles. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first destination, second destination, first source, and second source matrices, the specified opcode to cause the processor to process each of the specified source and destination matrices as a rectangular matrix, decode circuitry to decode the fetched rectangular matrix transpose instruction, and execution circuitry to respond to the decoded rectangular matrix transpose instruction by transposing each row of elements of the specified first source matrix into a corresponding column of the specified first destination matrix and transposing each row of elements of the specified second source matrix into a corresponding column of the specified second destination matrix.

Type: Application

Filed: December 14, 2020

Publication date: April 1, 2021

Inventors: Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Simon RUBANOVICH, Amit GRADSTEIN, Zeev SPERBER, Bret TOLL, Jesus CORBAL, Christopher J. HUGHES, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL
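
Illustration (not part of the patent record): the row-to-column movement described in the abstract of publication 20210096822 can be sketched as below, with nested lists standing in for tiles. The function names are hypothetical, and the sketch says nothing about how the hardware performs the operation.

```python
def transpose(matrix):
    """Turn each row of a rectangular matrix into the corresponding column."""
    return [list(column) for column in zip(*matrix)]


def transpose_pair(src1, src2):
    """Transpose two source matrices into two destination matrices."""
    return transpose(src1), transpose(src2)


dst1, dst2 = transpose_pair([[1, 2, 3], [4, 5, 6]], [[7, 8]])
print(dst1)
print(dst2)
```
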
Systems and methods for performing instructions to transform matrices into row-interleaved format

Patent number: 10963256

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.

Type: Grant

Filed: September 28, 2018

Date of Patent: March 30, 2021

Assignee: Intel Corporation

Inventors: Raanan Sade, Robert Valentine, Bret Toll, Christopher J. Hughes, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
Systems and methods for performing 16-bit floating-point matrix dot product instructions

Patent number: 10963246

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify an M by N destination matrix, a first source identifier to identify an M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.

Type: Grant

Filed: November 9, 2018

Date of Patent: March 30, 2021

Assignee: Intel Corporation

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
Method and apparatus for efficient matrix alignment in a systolic array

Patent number: 10929143

Abstract: An apparatus and method for efficient matrix alignment in a systolic array.

Type: Grant

Filed: September 28, 2018

Date of Patent: February 23, 2021

Assignee: Intel Corporation

Inventors: Mike Espig, Bret Toll, Raanan Sade, Bob Valentine, Alexander Heinecke, Christopher J. Hughes
METHOD AND SYSTEM FOR PERFORMING DATA MOVEMENT OPERATIONS WITH READ SNAPSHOT AND IN PLACE WRITE UPDATE

Publication number: 20210049102

Abstract: Method and system for performing data movement operations is described herein. One embodiment of a method includes: storing data for a first memory address in a cache line of a memory of a first processing unit, the cache line associated with a coherency state indicating that the memory has sole ownership of the cache line; decoding an instruction for execution by a second processing unit, the instruction comprising a source data operand specifying the first memory address and a destination operand specifying a memory location in the second processing unit; and responsive to executing the decoded instruction, copying data from the cache line of the memory of the first processing unit as identified by the first memory address, to the memory location of the second processing unit, wherein responsive to the copy, the cache line is to remain in the memory and the coherency state is to remain unchanged.

Type: Application

Filed: March 30, 2020

Publication date: February 18, 2021

Applicant: Intel Corporation

Inventors: Anil Vasudevan, Venkata Krishnan, Andrew J. Herdrich, Ren Wang, Robert G. Blankenship, Vedaraman Geetha, Shrikant M. Shah, Marshall A. Millier, Raanan Sade, Binh Q. Pham, Olivier Serres, Chyi-Chang Miao, Christopher B. Wilkerson
Systems for performing instructions for fast element unpacking into 2-dimensional registers

Patent number: 10896043

Abstract: Disclosed embodiments relate to instructions for fast element unpacking. In one example, a processor includes fetch circuitry to fetch an instruction whose format includes fields to specify an opcode and locations of an Array-of-Structures (AOS) source matrix and one or more Structure of Arrays (SOA) destination matrices, wherein: the specified opcode calls for unpacking elements of the specified AOS source matrix into the specified Structure of Arrays (SOA) destination matrices, the AOS source matrix is to contain N structures each containing K elements of different types, with same-typed elements in consecutive structures separated by a stride, the SOA destination matrices together contain K segregated groups, each containing N same-typed elements, decode circuitry to decode the fetched instruction, and execution circuitry, responsive to the decoded instruction, to unpack each element of the specified AOS matrix into one of the K element types of the one or more SOA matrices.

Type: Grant

Filed: September 28, 2018

Date of Patent: January 19, 2021

Assignee: Intel Corporation

Inventors: Bret Toll, Alexander F. Heinecke, Christopher J. Hughes, Ronen Zohar, Michael Espig, Dan Baum, Raanan Sade, Robert Valentine, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall
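
Illustration (not part of the patent record): the Array-of-Structures to Structure-of-Arrays unpacking described in the abstract of patent 10896043 can be sketched as below. The sketch assumes the source is a flat sequence of N structures with K elements each, so that same-typed elements are exactly K positions apart; the function name is hypothetical.

```python
def unpack_aos_to_soa(aos, k):
    """Split N interleaved K-element structures into K groups of N elements."""
    if k <= 0 or len(aos) % k != 0:
        raise ValueError("source length must be a positive multiple of k")
    # Element `field` of every structure sits at stride k in the flat source.
    return [list(aos[field::k]) for field in range(k)]


# Three (x, y) structures become one group of x values and one of y values.
print(unpack_aos_to_soa([1, "a", 2, "b", 3, "c"], 2))
```
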
Systems and methods for performing instructions to transpose rectangular tiles

Patent number: 10866786

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transpose rectangular tiles. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first destination, second destination, first source, and second source matrices, the specified opcode to cause the processor to process each of the specified source and destination matrices as a rectangular matrix, decode circuitry to decode the fetched rectangular matrix transpose instruction, and execution circuitry to respond to the decoded rectangular matrix transpose instruction by transposing each row of elements of the specified first source matrix into a corresponding column of the specified first destination matrix and transposing each row of elements of the specified second source matrix into a corresponding column of the specified second destination matrix.

Type: Grant

Filed: September 27, 2018

Date of Patent: December 15, 2020

Assignee: Intel Corporation

Inventors: Raanan Sade, Robert Valentine, Mark J. Charney, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Bret Toll, Jesus Corbal, Christopher J. Hughes, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall
DEFINING VIRTUALIZED PAGE ATTRIBUTES BASED ON GUEST PAGE ATTRIBUTES

Publication number: 20200379917

Abstract: A processing system includes a processing core to execute a virtual machine (VM) comprising a guest operating system (OS) and a memory management unit, communicatively coupled to the processing core, comprising a storage device to store an extended page table entry (EPTE) comprising a mapping from a guest physical address (GPA) associated with the guest OS to an identifier of a memory frame, a first plurality of access right flags associated with accessing the memory frame in a first page mode referenced by an attribute of a memory page identified by the GPA, and a second plurality of access right flags associated with accessing the memory frame in a second page mode referenced by the attribute of the memory page identified by the GPA.

Type: Application

Filed: June 12, 2020

Publication date: December 3, 2020

Inventors: Gilbert Neiger, Baiju V. Patel, Gur Hildesheim, Ron Rais, Andrew V. Anderson, Jason W. Brandt, David M. Durham, Barry E. Huntley, Raanan Sade, Ravi L. Sahita, Vedvyas Shanbhogue, Arumugam Thiyagarajah
