Patents by Inventor Bret L. Toll

Bret L. Toll has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Rotate instructions that complete execution either without writing or reading flags

Patent number: 11106461

Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.

Type: Grant

Filed: March 29, 2018

Date of Patent: August 31, 2021

Assignee: Intel Corporation

Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean P. Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
Systems, methods, and apparatuses for tile matrix multiplication and accumulation

Patent number: 11086623

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

Type: Grant

Filed: July 1, 2017

Date of Patent: August 10, 2021

Assignee: Intel Corporation

Inventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Stanislav Shwartsman, Dan Baum, Igor Yanover, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Jesus Corbal, Yuri Gebil, Simon Rubanovich
Systems, methods, and apparatus for tile configuration

Patent number: 11080048

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.

Type: Grant

Filed: July 1, 2017

Date of Patent: August 3, 2021

Assignee: Intel Corporation

Inventors: Menachem Adelman, Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Dan Baum, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade
PACKED DATA ELEMENT PREDICATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20210216325

Abstract: A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.

Type: Application

Filed: March 29, 2021

Publication date: July 15, 2021

Inventors: Bret L. TOLL, Buford M. GUY, Ronak SINGHAL, Mishali NAIK
Compressed instruction format

Patent number: 11048507

Abstract: A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size.

Type: Grant

Filed: October 9, 2018

Date of Patent: June 29, 2021

Assignee: Intel Corporation

Inventors: Robert Valentine, Doron Orenstein, Bret L. Toll
SYSTEMS, METHODS, AND APPARATUSES FOR DOT PRODUCTION OPERATIONS

Publication number: 20210132943

Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a double word with saturation; computing a dot product of bytes and accumulating in to a dword with saturation, where the input bytes can be signed or unsigned and the dword accumulation has output saturation; etc.

Type: Application

Filed: July 1, 2017

Publication date: May 6, 2021

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Menachem ADELMAN, Barukh ZIV, Alexander HEINECKE, Simon RUBANOVICH
Packed data element predication processors, methods, systems, and instructions

Patent number: 10963257

Abstract: A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.

Type: Grant

Filed: September 28, 2019

Date of Patent: March 30, 2021

Assignee: Intel Corporation

Inventors: Bret L. Toll, Buford M. Guy, Ronak Singhal, Mishali Naik
DEEP LEARNING IMPLEMENTATIONS USING SYSTOLIC ARRAYS AND FUSED OPERATIONS

Publication number: 20210089316

Abstract: Disclosed embodiments relate to deep learning implementations using systolic arrays and fused operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of a destination and N source matrices, the opcode indicating the processor is to load the N source matrices from memory, perform N convolutions on the N source matrices to generate N feature maps, and store results of the N convolutions in registers to be passed to an activation layer, wherein the processor is to perform the N convolutions and the activation layer with at most one memory load of each of the N source matrices. The processor further includes scheduling circuitry to schedule execution of the instruction and execution circuitry to execute the instruction as per the opcode.

Type: Application

Filed: September 25, 2019

Publication date: March 25, 2021

Applicant: Intel Corporation

Inventors: William RASH, Subramaniam MAIYURAN, Varghese GEORGE, Bret L. TOLL, Rajesh SANKARAN, Robert S. CHAPPELL, Supratim PAL, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL, Gang CHEN
Instruction execution that broadcasts and masks data values at different levels of granularity

Patent number: 10909259

Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure.

Type: Grant

Filed: September 25, 2018

Date of Patent: February 2, 2021

Assignee: INTEL CORPORATION

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney
Systems, methods, and apparatuses for tile diagonal

Patent number: 10877756

Abstract: Embodiments detailed herein relate to matrix operations. In particular, tile diagonal support is described. For example, a processor is detailed having decode circuitry to decode an instruction having fields for an opcode, a source operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to write the identified source operand to each element along a main diagonal of the identified destination matrix operand.

Type: Grant

Filed: July 1, 2017

Date of Patent: December 29, 2020

Assignee: Intel Corporation

Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Alexander Heinecke
VECTOR FRIENDLY INSTRUCTION FORMAT AND EXECUTION THEREOF

Publication number: 20200394042

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

Type: Application

Filed: August 27, 2020

Publication date: December 17, 2020

Applicant: Intel Corporation

Inventors: Robert C. VALENTINE, Jesus Corbal SAN ADRIAN, Roger Espasa SANS, Robert D. CAVIN, Bret L. TOLL, Santiago Galan DURAN, Jeffrey G. WIEDEMEIER, Sridhar SAMUDRALA, Milind Baburao GIRKAR, Edward Thomas GROCHOWSKI, Jonathan Cannon HALL, Dennis R. BRADFORD, Elmoustapha OULD-AHMED-VALL, James C. ABEL, Mark CHARNEY, Seth ABRAHAM, Suleyman SAIR, Andrew Thomas FORSYTH, Lisa WU, Charles YOUNT
Vector friendly instruction format and execution thereof

Patent number: 10795680

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

Type: Grant

Filed: February 28, 2019

Date of Patent: October 6, 2020

Assignee: Intel Corporation

Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount
SYSTEMS, METHODS, AND APPARATUSES FOR TILE BROADCAST

Publication number: 20200249947

Abstract: Embodiments detailed herein relate to matrix operations. In particular, embodiment of broadcasting elements are described. For example, some embodiments describe broadcasting a scalar to all configured data element positons of a destination matrix (tile). For example, some embodiments describe broadcasting a row to all configured data element positons of a destination matrix (tile). For example, some embodiments describe broadcasting a column to all configured data element positons of a destination matrix (tile).

Type: Application

Filed: July 1, 2017

Publication date: August 6, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Alexander HEINECKE, Barukh ZIV, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Stanislav SHWARTSMAN
SYSTEMS, METHODS, AND APPARATUSES FOR TILE LOAD

Publication number: 20200249949

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand to memory.

Type: Application

Filed: July 1, 2017

Publication date: August 6, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Menachem ADELMAN, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus Corbal, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Yuri GEBIL
SYSTEMS, METHODS, AND APPARATUS FOR TILE CONFIGURATION

Publication number: 20200241877

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.

Type: Application

Filed: July 1, 2017

Publication date: July 30, 2020

Applicant: Intel Corporation

Inventors: Menachem ADELMAN, Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus CORBAL, Dan BAUM, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL, Yuri GEBIL, Raanan SADE
SYSTEMS, METHODS, AND APPARATUSES FOR ZEROING A MATRIX

Publication number: 20200241873

Abstract: Embodiments detailed herein relate to matrix operations. In particular, performing a matrix operation of zeroing a matrix in response to a single instruction. For example, a processor detailed which includes decode circuitry to decode an instruction having fields for an opcode and a source/destination matrix operand identifier; and execution circuitry to execute the decoded instruction to zero each data element of the identified source/destination matrix.

Type: Application

Filed: July 1, 2017

Publication date: July 30, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Menachem ADELMAN, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Stanislav SHWARTSMAN
SYSTEMS, METHODS, AND APPARATUSES FOR TILE MATRIX MULTIPLICATION AND ACCUMULATION

Publication number: 20200233667

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

Type: Application

Filed: July 1, 2017

Publication date: July 23, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
SYSTEMS, METHODS, AND APPARATUS FOR MATRIX MOVE

Publication number: 20200233665

Abstract: Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.

Type: Application

Filed: July 1, 2017

Publication date: July 23, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Dan BAUM, Alexander HEINECKE, Elmoustapha OULD-AHMED-VALL
SYSTEMS, METHODS, AND APPARATUSES FOR TILE STORE

Publication number: 20200233666

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.

Type: Application

Filed: July 1, 2017

Publication date: July 23, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Menachem ADELMAN, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Dan BAUM, Yuri GEBIL
EFFICIENT RANGE-BASED MEMORY WRITEBACK TO IMPROVE HOST TO DEVICE COMMUNICATION FOR OPTIMAL POWER AND PERFORMANCE

Publication number: 20200233664

Abstract: Method and apparatus for efficient range-based memory writeback is described herein. One embodiment of an apparatus includes a system memory, a plurality of hardware processor cores each of which includes a first cache, a decoder circuitry to decode an instruction having fields for a first memory address and a range indicator, and an execution circuitry to execute the decoded instruction. Together, the first memory address and the range indicator define a contiguous region in the system memory that includes one or more cache lines. An execution of the decoded instruction causes any instances of the one or more cache lines in the first cache to be invalidated. Additionally, any invalidated instances of the one or more cache lines that are dirty are to be stored to system memory.

Type: Application

Filed: December 17, 2019

Publication date: July 23, 2020

Applicant: Intel Corporation

Inventors: Ren Wang, Chunhui Zhang, Qixiong J. Bian, Bret L. Toll, Jason W. Brandt

prev 1 2 3 4 5 6 7 … next