Patents by Inventor Jesus Corbal
Jesus Corbal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250117222Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.Type: ApplicationFiled: October 29, 2024Publication date: April 10, 2025Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
-
Publication number: 20250117221Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.Type: ApplicationFiled: October 18, 2024Publication date: April 10, 2025Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Barukh ZIV, Alexander HEINECKE, Milind GIRKAR, Menachem ADELMAN, Simon RUBANOVICH
-
Patent number: 12260213Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.Type: GrantFiled: December 10, 2021Date of Patent: March 25, 2025Assignee: Intel CorporationInventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Simon Rubanovich
-
Patent number: 12236242Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.Type: GrantFiled: March 20, 2023Date of Patent: February 25, 2025Assignee: Intel CorporationInventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
-
Publication number: 20250060963Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.Type: ApplicationFiled: August 26, 2024Publication date: February 20, 2025Inventors: JESUS CORBAL, ROBERT VALENTINE, ROMAN S. DUBTSOV, NIKITA A. SHUSTROV, MARK J. CHARNEY, DENNIS R. BRADFORD, MILIND B. GIRKAR, EDWARD T. GROCHOWSKI, THOMAS D. FLETCHER, WARREN E. FERGUSON
-
Patent number: 12197617Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure.Type: GrantFiled: July 21, 2023Date of Patent: January 14, 2025Assignee: Intel CorporationInventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L Toll, Mark J. Charney
-
Publication number: 20250004763Abstract: Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand.Type: ApplicationFiled: September 16, 2024Publication date: January 2, 2025Inventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein
-
Publication number: 20250004716Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand to memory.Type: ApplicationFiled: May 3, 2024Publication date: January 2, 2025Inventors: Robert VALENTINE, Menachem ADELMAN, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus Corbal, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Yuri GEBIL, Raanan SADE
-
Patent number: 12182571Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand to memory.Type: GrantFiled: January 23, 2023Date of Patent: December 31, 2024Assignee: Intel CorporationInventors: Robert Valentine, Menachem Adelman, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Dan Baum, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade
-
Patent number: 12182568Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).Type: GrantFiled: August 14, 2023Date of Patent: December 31, 2024Assignee: Intel CorporationInventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall
-
Publication number: 20240427600Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.Type: ApplicationFiled: September 6, 2024Publication date: December 26, 2024Applicant: Intel CorporationInventors: Robert C. VALENTINE, Jesus Corbal SAN ADRIAN, Roger Espasa SANS, Robert D. CAVIN, Bret L. TOLL, Santiago Galan DURAN, Jeffrey G. WIEDEMEIER, Sridhar SAMUDRALA, Milind Baburao GIRKAR, Edward Thomas GROCHOWSKI, Jonathan Cannon HALL, Dennis R. BRADFORD, Elmoustapha OULD-AHMED-VALL, James C ABEL, Mark CHARNEY, Seth ABRAHAM, Suleyman SAIR, Andrew Thomas FORSYTH, Lisa WU, Charles YOUNT
-
Patent number: 12147804Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.Type: GrantFiled: July 22, 2021Date of Patent: November 19, 2024Assignee: Intel CorporationInventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Stanislav Shwartsman, Dan Baum, Igor Yanover, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Jesus Corbal, Yuri Gebil, Simon Rubanovich
-
Patent number: 12135981Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.Type: GrantFiled: June 9, 2023Date of Patent: November 5, 2024Assignee: Intel CorporationInventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
-
Patent number: 12130740Abstract: Embodiments of an invention a processor architecture are disclosed. In an embodiment, a processor includes a decoder, an execution unit, a coherent cache, and an interconnect. The decoder is to decode an instruction to zero a cache line. The execution unit is to issue a write command to initiate a cache line sized write of zeros. The coherent cache is to receive the write command, to determine whether there is a hit in the coherent cache and whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros, and to issue the write command toward the interconnect. The interconnect is to, responsive to receipt of the write command, issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit.Type: GrantFiled: April 4, 2022Date of Patent: October 29, 2024Assignee: Intel CorporationInventors: Jason W. Brandt, Robert S. Chappell, Jesus Corbal, Edward T. Grochowski, Stephen H. Gunther, Buford M. Guy, Thomas R. Huff, Christopher J. Hughes, Elmoustapha Ould-Ahmed-Vall, Ronak Singhal, Seyed Yahya Sotoudeh, Bret L. Toll, Lihu Rappoport, David B. Papworth, James D. Allen
-
Patent number: 12124846Abstract: Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand.Type: GrantFiled: August 28, 2023Date of Patent: October 22, 2024Assignee: Intel CorporationInventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein
-
Patent number: 12124847Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.Type: GrantFiled: July 1, 2017Date of Patent: October 22, 2024Assignee: Intel CorporationInventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Menachem Adelman, Simon Rubanovich
-
Patent number: 12106100Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.Type: GrantFiled: July 1, 2017Date of Patent: October 1, 2024Assignee: Intel CorporationInventors: Robert Valentine, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Dan Baum, Zeev Sperber, Jesus Corbal, Bret L. Toll, Raanan Sade, Igor Yanover, Yuri Gebil, Rinat Rappoport, Stanislav Shwartsman, Menachem Adelman, Simon Rubanovich
-
Publication number: 20240320001Abstract: Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.Type: ApplicationFiled: May 14, 2024Publication date: September 26, 2024Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Dan BAUM, Alexander HEINECKE, Elmoustapha OULD-AHMED-VALL
-
Patent number: 12086594Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.Type: GrantFiled: August 28, 2023Date of Patent: September 10, 2024Assignee: Intel CorporationInventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount
-
Patent number: 12086080Abstract: Systems, methods, and apparatuses relating to a configurable accelerator having dataflow execution circuits are described.Type: GrantFiled: September 26, 2020Date of Patent: September 10, 2024Assignee: Intel CorporationInventors: George Chrysos, Bhargavi Narayanasetty, Jesus Corbal, Ching-Kai Liang, Chinmay Ashok, Francis Tseng