Patents by Inventor Raanan Sade

Raanan Sade has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEMS AND METHODS TO STORE A TILE REGISTER PAIR TO MEMORY

Publication number: 20190042255

Abstract: Embodiments detailed herein relate to systems and methods to store a tile register pair to memory. In one example, a processor includes: decode circuitry to decode a store matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded store matrix pair instruction to store every element of left and right tiles of the identified source matrix to corresponding element positions of left and right tiles of the identified destination matrix, respectively, wherein the executing stores a chunk of C elements of one row of the identified source matrix at a time.

Type: Application

Filed: December 29, 2017

Publication date: February 7, 2019

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
SYSTEMS FOR PERFORMING INSTRUCTIONS FOR FAST ELEMENT UNPACKING INTO 2-DIMENSIONAL REGISTERS

Publication number: 20190042245

Abstract: Disclosed embodiments relate to instructions for fast element unpacking. In one example, a processor includes fetch circuitry to fetch an instruction whose format includes fields to specify an opcode and locations of an Array-of-Structures (AOS) source matrix and one or more Structure of Arrays (SOA) destination matrices, wherein: the specified opcode calls for unpacking elements of the specified AOS source matrix into the specified Structure of Arrays (SOA) destination matrices, the AOS source matrix is to contain N structures each containing K elements of different types, with same-typed elements in consecutive structures separated by a stride, the SOA destination matrices together contain K segregated groups, each containing N same-typed elements, decode circuitry to decode the fetched instruction, and execution circuitry, responsive to the decoded instruction, to unpack each element of the specified AOS matrix into one of the K element types of the one or more SOA matrices.

Type: Application

Filed: September 28, 2018

Publication date: February 7, 2019

Inventors: Bret TOLL, Alexander F. HEINECKE, Christopher J. HUGHES, Ronen ZOHAR, Michael ESPIG, Dan BAUM, Raanan SADE, Robert VALENTINE
METHOD AND APPARATUS FOR EFFICIENT MATRIX ALIGNMENT IN A SYSTOLIC ARRAY

Publication number: 20190042262

Abstract: An apparatus and method for efficient matrix alignment in a systolic array.

Type: Application

Filed: September 28, 2018

Publication date: February 7, 2019

Inventors: Michael Espig, Bret Toll, Raanan Sade, Robert Valentine, Alexander Heinecke
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS SPECIFYING TERNARY TILE LOGIC OPERATIONS

Publication number: 20190042260

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying ternary tile operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction specifying a ternary tile operation, and locations of destination and first, second, and third source matrices, each of the matrices having M rows by N columns; and execution circuitry to respond to the decoded instruction by, for each equal-sized group of K elements of the specified first, second, and third source matrices, generate K results by performing the ternary tile operation in parallel on K corresponding elements of the specified first, second, and third source matrices, and store each of the K results to a corresponding element of the specified destination matrix, wherein corresponding elements of the specified source and destination matrices occupy a same relative position within their associated matrix.

Type: Application

Filed: September 14, 2018

Publication date: February 7, 2019

Inventors: Elmoustapha OULD-AHMED-VALL, Christopher J. HUGHES, Bret TOLL, Dan BAUM, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
SYSTEMS AND METHODS FOR PERFORMING HORIZONTAL TILE OPERATIONS

Publication number: 20190042261

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying horizontal tile operations. In one example, a processor includes fetch circuitry to fetch an instruction specifying a horizontal tile operation, a location of a M by N source matrix comprising K groups of elements, and locations of K destinations, wherein each of the K groups of elements comprises the same number of elements, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction by generating K results, each result being generated by performing the specified horizontal tile operation across every element of a corresponding group of the K groups, and writing each generated result to a corresponding location of the K specified destination locations.

Type: Application

Filed: September 14, 2018

Publication date: February 7, 2019

Inventors: Christopher J. HUGHES, Bret TOLL, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
SYSTEMS AND METHODS FOR COMPUTING DOT PRODUCTS OF NIBBLES IN TWO TILE OPERANDS

Publication number: 20190042235

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).

Type: Application

Filed: December 29, 2017

Publication date: February 7, 2019

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
SYSTEMS AND METHODS TO ZERO A TILE REGISTER PAIR

Publication number: 20190042256

Abstract: Embodiments detailed herein relate to systems and methods to zero a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zeroing instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zeroing instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.

Type: Application

Filed: December 29, 2017

Publication date: February 7, 2019

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Eyal Hadas
SYSTEMS, METHODS, AND APPARATUSES UTILIZING CPU STORAGE WITH A MEMORY REFERENCE

Publication number: 20190042448

Abstract: Implementations of using tiles for caching are detailed In some implementations, an instruction execution circuitry executes one or more instructions, a register state cache coupled to the instruction execution circuitry holds thread register state in a plurality of registers, and backing storage pointer storage stores a backing storage pointer, wherein the backing storage pointer is to reference a state backing storage area in external memory to store the thread register state stored in the register state cache.

Type: Application

Filed: December 22, 2017

Publication date: February 7, 2019

Inventors: Raanan SADE, Jason BRANDT, Mark J. CHARNEY, Joseph NUZMAN, Leena PUTHIYEDATH, Rinat RAPPOPORT, Vivekananthan SANJEEPAN, Robert VALENTINE
SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX OPERATIONS

Publication number: 20190042540

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address, and execution circuitry to execute the decoded instruction to store configuration information about usage of storage for two-dimensional data structures at the memory address.

Type: Application

Filed: December 29, 2017

Publication date: February 7, 2019

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSPOSE RECTANGULAR TILES

Publication number: 20190042202

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transpose rectangular tiles. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first destination, second destination, first source, and second source matrices, the specified opcode to cause the processor to process each of the specified source and destination matrices as a rectangular matrix, decode circuitry to decode the fetched rectangular matrix transpose instruction, and execution circuitry to respond to the decoded rectangular matrix transpose instruction by transposing each row of elements of the specified first source matrix into a corresponding column of the specified first destination matrix and transposing each row of elements of the specified second source matrix into a corresponding column of the specified second destination matrix.

Type: Application

Filed: September 27, 2018

Publication date: February 7, 2019

Inventors: Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Simon RUBANOVICH, Amit GRADSTEIN, Zeev SPERBER, Bret TOLL, Jesus CORBAL, Christopher J. HUGHES, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL
SYSTEMS, METHODS, AND APPARATUSES FOR DOT PRODUCT OPERATIONS

Publication number: 20190042541

Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a quadword data elements of a matrix pair. Additionally, in some instances, non-accumulating quadword data elements of the matrix pair are set to zero.

Type: Application

Filed: December 29, 2017

Publication date: February 7, 2019

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
METHOD AND SYSTEM FOR PERFORMING DATA MOVEMENT OPERATIONS WITH READ SNAPSHOT AND IN PLACE WRITE UPDATE

Publication number: 20190004958

Abstract: Method and system for performing data movement operations is described herein. One embodiment of a method includes: storing data for a first memory address in a cache line of a memory of a first processing unit, the cache line associated with a coherency state indicating that the memory has sole ownership of the cache line; decoding an instruction for execution by a second processing unit, the instruction comprising a source data operand specifying the first memory address and a destination operand specifying a memory location in the second processing unit; and responsive to executing the decoded instruction, copying data from the cache line of the memory of the first processing unit as identified by the first memory address, to the memory location of the second processing unit, wherein responsive to the copy, the cache line is to remain in the memory and the coherency state is to remain unchanged.

Type: Application

Filed: June 30, 2017

Publication date: January 3, 2019

Inventors: Anil Vasudevan, Venkata Krishnan, Andrew J. Herdrich, Ren Wang, Robert G. Blankenship, Vedaraman Geetha, Shrikant M. Shah, Marshall A. Millier, Raanan Sade, Binh Q. Pham, Olivier Serres, Chyi-Chang Miao, Christopher B. Wilkerson
Hardware apparatuses and methods for memory corruption detection

Patent number: 10162694

Abstract: Methods and apparatuses relating to memory corruption detection are described. In one embodiment, a hardware processor includes an execution unit to execute an instruction to request access to a block of a memory through a pointer to the block of the memory, and a memory management unit to allow access to the block of the memory when a memory corruption detection value in the pointer is validated with a memory corruption detection value in the memory for the block, wherein a position of the memory corruption detection value in the pointer is selectable between a first location and a second, different location.

Type: Grant

Filed: December 21, 2015

Date of Patent: December 25, 2018

Assignee: Intel Corporation

Inventors: Tomer Stark, Ron Gabor, Joseph Nuzman, Raanan Sade, Bryant E. Bigbee
Suspendable load address tracking inside transactions

Patent number: 10146538

Abstract: Suspendable load address tracking inside transactions is disclosed. An example processing device of implementations of the disclosure includes a transactional memory (TM) read set tracking component circuitry to identify a suspend read tracking instruction within a transaction executed by the processing device, mark load instructions occurring in the transaction subsequent to the identified suspend read tracking instruction with a suspend attribute, wherein the addresses corresponding to the marked load instructions are excluded from a read set maintained for the transaction, identify a resume read tracking instruction within the transaction, and stop marking the load instructions occurring subsequent to the identified resume read tracking instruction with the suspend attribute.

Type: Grant

Filed: September 30, 2016

Date of Patent: December 4, 2018

Assignee: Intel Corporation

Inventors: Raanan Sade, Roman Dementiev, Ravi Rajwar, Ady Tal, Alex Gerber
LINEAR MEMORY ADDRESS TRANSFORMATION AND MANAGEMENT

Publication number: 20180210842

Abstract: A processing device including a linear address transformation circuit to determine that a metadata value stored in a portion of a linear address falls within a pre-defined metadata range. The metadata value corresponds to a plurality of metadata bits. The linear address transformation circuit to replace each of the plurality of the metadata bits with a constant value.

Type: Application

Filed: January 26, 2017

Publication date: July 26, 2018

Inventors: Joseph Nuzman, Raanan Sade, Igor Yanover, Ron Gabor, Amit Gradstein
METHOD AND APPARATUS FOR SUPPORTING QUASI-POSTED LOADS

Publication number: 20180181393

Abstract: A processor includes a decoder, a data return buffer, and an execution unit. The decoder is to decode an instruction for a non-posted load into a decoded instruction for loading data from memory mapped input/output. The execution unit is for executing the decoded instruction. The execution is to start a timer, determine whether the timer exceeds a timeout threshold, allocate an entry in the data return buffer for the load, and determine whether an event arrived. The timer is to measure an amount of time taken to return the non-posted load instruction. The determination whether an event arrived is made in response to at least one of the allocation of the entry for the load, or a determination that the timer exceeds the timeout threshold.

Type: Application

Filed: December 22, 2016

Publication date: June 28, 2018

Inventors: Ido Ouziel, Raanan Sade, Jacob Doweck
Apparatus and method for memory-hierarchy aware producer-consumer instruction

Patent number: 9990287

Abstract: An apparatus and method are described for efficiently transferring data from a core of a central processing unit (CPU) to a graphics processing unit (GPU). For example, one embodiment of a method comprises: writing data to a buffer within the core of the CPU until a designated amount of data has been written; upon detecting that the designated amount of data has been written, responsively generating an eviction cycle, the eviction cycle causing the data to be transferred from the buffer to a cache accessible by both the core and the GPU; setting an indication to indicate to the GPU that data is available in the cache; and upon the GPU detecting the indication, providing the data to the GPU from the cache upon receipt of a read signal from the GPU.

Type: Grant

Filed: December 21, 2011

Date of Patent: June 5, 2018

Assignee: Intel Corporation

Inventors: Shlomo Raikin, Raanan Sade, Robert Valentine, Julius Yuli Mandelblat, Ron Shalev, Larisa Novakovsky
SUSPENDABLE LOAD ADDRESS TRACKING INSIDE TRANSACTIONS

Publication number: 20180095759

Abstract: Suspendable load address tracking inside transactions is disclosed. An example processing device of implementations of the disclosure includes a transactional memory (TM) read set tracking component circuitry to identify a suspend read tracking instruction within a transaction executed by the processing device, mark load instructions occurring in the transaction subsequent to the identified suspend read tracking instruction with a suspend attribute, wherein the addresses corresponding to the marked load instructions are excluded from a read set maintained for the transaction, identify a resume read tracking instruction within the transaction, and stop marking the load instructions occurring subsequent to the identified resume read tracking instruction with the suspend attribute.

Type: Application

Filed: September 30, 2016

Publication date: April 5, 2018

Inventors: Raanan Sade, Roman Dementiev, Ravi Rajwar, Ady Tal, Alex Gerber
DEFINING VIRTUALIZED PAGE ATTRIBUTES BASED ON GUEST PAGE ATTRIBUTES

Publication number: 20180074969

Abstract: A processing system includes a processing core to execute a virtual machine (VM) comprising a guest operating system (OS) and a memory management unit, communicatively coupled to the processing core, comprising a storage device to store an extended page table entry (EPTE) comprising a mapping from a guest physical address (GPA) associated with the guest OS to an identifier of a memory frame, a first plurality of access right flags associated with accessing the memory frame in a first page mode referenced by an attribute of a memory page identified by the GPA, and a second plurality of access right flags associated with accessing the memory frame in a second page mode referenced by the attribute of the memory page identified by the GPA.

Type: Application

Filed: September 9, 2016

Publication date: March 15, 2018

Inventors: Gilbert Neiger, Baiju V. Patel, Gur Hildesheim, Ron Rais, Andrew V. Anderson, Jason W. Brandt, David M. Durham, Barry E. Huntley, Raanan Sade, Ravi L. Sahita, Vedvyas Shanbhogue, Arumugam Thiyagarajah
INCREASING INVALID TO MODIFIED PROTOCOL OCCURRENCES IN A COMPUTING SYSTEM

Publication number: 20180024925

Abstract: An example system on a chip (SoC) includes a processor, a cache, and a main memory. The SoC can include a first memory to store data in a memory line, wherein the memory line is set to an invalid state. The processor can include a processor coupled to the first memory. The processor can determine that a data size of a first data set received from an application is within a data size range. The processor can determine that an aggregate data size of the first data set and a second data set received from the application is at least a same data size as data size of the memory line. The processor can perform an invalid-to-modify (I2M) operation to change the memory line from the invalid state to a modified state. The processor can write the first data set and the second data set to the memory line.

Type: Application

Filed: July 20, 2016

Publication date: January 25, 2018

Inventors: Raanan Sade, Joseph Nuzman, Stanislav Shwartsman, Igor Yanover, Liron Zur

prev … 4 5 6 7 8 9 10 next