Patents by Inventor Robert Valentine

Robert Valentine has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Instruction execution that broadcasts and masks data values at different levels of granularity

Patent number: 11301581

Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure.

Type: Grant

Filed: December 30, 2019

Date of Patent: April 12, 2022

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney
Instruction execution that broadcasts and masks data values at different levels of granularity

Patent number: 11301580

Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure.

Type: Grant

Filed: December 30, 2019

Date of Patent: April 12, 2022

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney
SYSTEMS, APPARATUSES, AND METHODS FOR DUAL COMPLEX MULTIPLY ADD OF SIGNED WORDS

Publication number: 20220107804

Abstract: Embodiments of systems, apparatuses, and methods for dual complex number multiplication and addition in a processor are described. For example, execution circuitry executes a decoded instruction to multiplex data values from positions in source operands to a multiplier, the source operands including pairs complex numbers, calculate a real part of a product of each pair of complex numbers, add the real part of the product of a first pair of complex numbers to the real part of the product of a second pair of complex numbers to calculate a first real result, and add the real part of the product of a third pair of complex numbers to the real part of the product of a fourth pair of complex numbers to calculate a second real result, and store the results to corresponding positions in the destination operand.

Type: Application

Filed: October 25, 2021

Publication date: April 7, 2022

Inventors: Elmoustapha OULD-AHMED-VALL, Venkateswara R. MADDURI, Mark J. CHARNEY, Robert VALENTINE
Apparatus and method for multiplication and accumulation of complex values

Patent number: 11294679

Abstract: An apparatus and method for multiplying packed signed words. For example, one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed signed words; a second source register to store a second plurality of packed signed words; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply each of a plurality of packed signed words from the first source register with corresponding packed signed words from the second source register to generate a plurality of products responsive to the decoded instruction, adder circuitry to add the products to generate a first result, and accumulation circuitry to combine the first result with an accumulated result to generate a final result comprising a third plurality of packed signed words, and to write the third plurality of packed signed words or a maximum value to a destination register.

Type: Grant

Filed: June 30, 2017

Date of Patent: April 5, 2022

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Venkateswara R. Madduri, Robert Valentine
Systems and methods for performing duplicate detection instructions on 2D data

Patent number: 11294671

Abstract: Disclosed embodiments relate to systems and methods for performing duplicate detection instructions on two-dimensional (2D) data. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode and locations of a source matrix comprising M×N elements and a destination, the opcode to indicate execution circuitry is to use a plurality of comparators to discover duplicates in the source matrix, and store indications of locations of discovered duplicates in the destination. The execution circuitry to execute the decoded instruction as per the opcode.

Type: Grant

Filed: December 26, 2018

Date of Patent: April 5, 2022

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Michael Espig, Dan Baum, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS TO CONVERT 16-BIT FLOATING-POINT FORMATS

Publication number: 20220100507

Abstract: Systems, methods, and apparatuses relating to instructions to convert 16-bit floating-point formats are described. In one embodiment, a processor includes fetch circuitry to fetch a single instruction having fields to specify an opcode and locations of a source vector comprising N plurality of 16-bit half-precision floating-point elements, and a destination vector to store N plurality of 16-bit bfloat floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the source vector from 16-bit half-precision floating-point format to 16-bit bfloat floating-point format and store each converted element into a corresponding location of the destination vector, decode circuitry to decode the fetched single instruction into a decoded single instruction, and the execution circuitry to respond to the decoded single instruction as specified by the opcode.

Type: Application

Filed: December 24, 2020

Publication date: March 31, 2022

Inventors: ALEXANDER F. HEINECKE, ROBERT VALENTINE, MARK J. CHARNEY, MENACHEM ADELMAN, CHRISTOPHER J. HUGHES, EVANGELOS GEORGANAS, ZEEV SPERBER, AMIT GRADSTEIN, SIMON RUBANOVICH
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS

Publication number: 20220100515

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Application

Filed: December 13, 2021

Publication date: March 31, 2022

Inventors: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR LOADING DATA AND PADDING INTO A TILE OF A MATRIX OPERATIONS ACCELERATOR

Publication number: 20220100513

Abstract: Systems, methods, and apparatuses relating to one or more instructions that load data into a tile register and pad a row (or column) with a pad value from a padding circuit are described.

Type: Application

Filed: December 24, 2020

Publication date: March 31, 2022

Inventors: CHRISTOPHER J. HUGHES, ALEXANDER HEINECKE, ROBERT VALENTINE, MENACHEM ADELMAN, EVANGELOS GEORGANAS, MARK CHARNEY
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR 16-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS

Publication number: 20220100502

Abstract: Systems, methods, and apparatuses relating to 16-bit floating-point matrix dot product instructions are described.

Type: Application

Filed: December 24, 2020

Publication date: March 31, 2022

Inventors: ALEXANDER F. HEINECKE, ROBERT VALENTINE, MARK J. CHARNEY, MENACHEM ADELMAN, CHRISTOPHER J. HUGHES, EVANGELOS GEORGANAS, ZEEV SPERBER, AMIT GRADSTEIN, SIMON RUBANOVICH
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS

Publication number: 20220100505

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

Type: Application

Filed: December 13, 2021

Publication date: March 31, 2022

Inventors: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
Systems, methods, and apparatuses for tile store

Patent number: 11288069

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.

Type: Grant

Filed: July 1, 2017

Date of Patent: March 29, 2022

Assignee: Intel Corporation

Inventors: Robert Valentine, Menachem Adelman, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Dan Baum, Yuri Gebil
Systems, methods, and apparatus for matrix move

Patent number: 11288068

Abstract: Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.

Type: Grant

Filed: July 1, 2017

Date of Patent: March 29, 2022

Assignee: Intel Corporation

Inventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Jesus Corbal, Dan Baum, Alexander Heinecke, Elmoustapha Ould-Ahmed-Vall
SYSTEMS AND METHODS TO LOAD A TILE REGISTER PAIR

Publication number: 20220091848

Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.

Type: Application

Filed: August 10, 2021

Publication date: March 24, 2022

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
Context save with variable save state size

Patent number: 11275588

Abstract: Embodiments of an apparatus comprising a decoder to decode an instruction having fields for an opcode and a destination operand and execution circuitry to execute the decoded instruction to perform a save of processor state components to an area located at a destination memory address specified by the destination operand, wherein a size of the area is defined by at least one indication of an execution of an instruction operating on a specified group of processor states are described.

Type: Grant

Filed: July 1, 2017

Date of Patent: March 15, 2022

Assignee: Intel Corporation

Inventors: Robert Valentine, Mark J. Charney, Rinat Rappoport, Vivekananthan Sanjeepan
Apparatus and method of improved insert instructions

Patent number: 11275583

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.

Type: Grant

Filed: August 3, 2017

Date of Patent: March 15, 2022

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Systems and methods for performing 16-bit floating-point vector dot product instructions

Patent number: 11263009

Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

Type: Grant

Filed: February 4, 2021

Date of Patent: March 1, 2022

Assignee: Intel Corporation

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
Systems, methods, and apparatuses for tile broadcast

Patent number: 11263008

Abstract: Embodiments detailed herein relate to matrix operations. In particular, embodiment of broadcasting elements are described. For example, some embodiments describe broadcasting a scalar to all configured data element positons of a destination matrix (tile). For example, some embodiments describe broadcasting a row to all configured data element positons of a destination matrix (tile). For example, some embodiments describe broadcasting a column to all configured data element positons of a destination matrix (tile).

Type: Grant

Filed: July 1, 2017

Date of Patent: March 1, 2022

Assignee: Intel Corporation

Inventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Jesus Corbal, Alexander Heinecke, Barukh Ziv, Dan Baum, Elmoustapha Ould-Ahmed-Vall, Stanislav Shwartsman
SYSTEMS, METHODS, AND APPARATUSES FOR DOT PRODUCTION OPERATIONS

Publication number: 20220058021

Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a double word with saturation; computing a dot product of bytes and accumulating in to a dword with saturation, where the input bytes can be signed or unsigned and the dword accumulation has output saturation; etc.

Type: Application

Filed: November 1, 2021

Publication date: February 24, 2022

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Menachem ADELMAN, Barukh ZIV, Alexander HEINECKE, Simon RUBANOVICH
Apparatus and method for complex by complex conjugate multiplication

Patent number: 11256504

Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction.

Type: Grant

Filed: September 29, 2017

Date of Patent: February 22, 2022

Assignee: Intel Corporation

Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
Systems, Apparatuses, And Methods For Fused Multiply Add

Publication number: 20220050678

Abstract: Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand.

Type: Application

Filed: September 3, 2021

Publication date: February 17, 2022

Inventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein

prev … 4 5 6 7 8 9 10 11 12 … next