Patents by Inventor Dan Baum

Dan Baum has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200310800
    Abstract: Methods and apparatus for approximation using polynomial functions are disclosed. In one embodiment, a processor comprises decoding and execution circuitry. The decoding circuitry is to decode an instruction, where the instruction comprises a first operand specifying an output location and a second operand specifying a plurality of data element values to be computed. The execution circuitry is to execute the decoded instruction. The execution includes to compute a result for each of the plurality of data element values using a polynomial function to approximate a complex function, where the computation uses coefficients stored in a lookup location for the complex function, and where data element values within different data element value ranges use different sets of coefficients. The execution further includes to store results of the computation in the output location.
    Type: Application
    Filed: March 27, 2019
    Publication date: October 1, 2020
    Inventors: Jorge PARRA, Dan BAUM, Robert CHAPPELL, Michael ESPIG, Varghese GEORGE, Alexander HEINECKE, Christopher HUGHES, Subramaniam MAIYURAN, Elmoustapha OULD-AHMED-VALL, Prasoonkumar SURTI, Ronen ZOHAR
  • Publication number: 20200272466
    Abstract: An apparatus and method for processing efficient multicast operation.
    Type: Application
    Filed: May 13, 2020
    Publication date: August 27, 2020
    Inventors: Christopher J. HUGHES, Dan BAUM
  • Publication number: 20200249949
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand.
    Type: Application
    Filed: July 1, 2017
    Publication date: August 6, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Menachem ADELMAN, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Yuri GEBIL
  • Publication number: 20200249947
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, embodiments of broadcasting elements are described. For example, some embodiments describe broadcasting a scalar to all configured data element positions of a destination matrix (tile). For example, some embodiments describe broadcasting a row to all configured data element positions of a destination matrix (tile). For example, some embodiments describe broadcasting a column to all configured data element positions of a destination matrix (tile).
    Type: Application
    Filed: July 1, 2017
    Publication date: August 6, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Alexander HEINECKE, Barukh ZIV, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Stanislav SHWARTSMAN
  • Publication number: 20200241877
    Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers, are discussed.
    Type: Application
    Filed: July 1, 2017
    Publication date: July 30, 2020
    Applicant: Intel Corporation
    Inventors: Menachem ADELMAN, Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus CORBAL, Dan BAUM, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL, Yuri GEBIL, Raanan SADE
  • Publication number: 20200233666
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.
    Type: Application
    Filed: July 1, 2017
    Publication date: July 23, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Menachem ADELMAN, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Dan BAUM, Yuri GEBIL
  • Publication number: 20200233667
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.
    Type: Application
    Filed: July 1, 2017
    Publication date: July 23, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
  • Publication number: 20200233665
    Abstract: Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.
    Type: Application
    Filed: July 1, 2017
    Publication date: July 23, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Dan BAUM, Alexander HEINECKE, Elmoustapha OULD-AHMED-VALL
  • Patent number: 10719323
    Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: July 21, 2020
    Assignee: Intel Corporation
    Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
  • Publication number: 20200210182
    Abstract: Disclosed embodiments relate to systems and methods for performing duplicate detection instructions on two-dimensional (2D) data. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode and locations of a source matrix comprising M×N elements and a destination, the opcode to indicate execution circuitry is to use a plurality of comparators to discover duplicates in the source matrix, and store indications of locations of discovered duplicates in the destination. The execution circuitry to execute the decoded instruction as per the opcode.
    Type: Application
    Filed: December 26, 2018
    Publication date: July 2, 2020
    Inventors: Christopher J. HUGHES, Michael ESPIG, Dan BAUM, Robert VALENTINE, Bret TOLL, Elmoustapha OULD-AHMED-VALL
  • Publication number: 20200210173
    Abstract: Disclosed embodiments relate to systems and methods for performing nibble-sized operations on matrix elements. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction, the fetched instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode to indicate the processor is to, for each pair of corresponding elements of the first and second source matrices, logically partition each element into nibble-sized partitions, perform an operation indicated by the instruction on each partition, and store execution results to a corresponding nibble-sized partition of a corresponding element of the destination matrix. The exemplary processor includes execution circuitry to execute the decoded instruction as per the opcode.
    Type: Application
    Filed: December 26, 2018
    Publication date: July 2, 2020
    Inventors: Elmoustapha OULD-AHMED-VALL, Jonathan D. PEARCE, Dan BAUM, Guei-Yuan LUEH, Michael ESPIG, Christopher J. HUGHES, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
  • Publication number: 20200210188
    Abstract: Disclosed embodiments relate to systems and methods for performing matrix row-wise and column-wise permute instructions. In one example, a processor includes fetch circuitry to fetch an instruction, decoding, using decode circuitry, the fetched instruction having fields to specify an opcode and locations of a source matrix and a destination matrix, the opcode indicating the processor is to perform a permutation by copying, into each of a plurality of equal-sized logical partitions of the destination matrix, a selected logical partition of a same size from the source matrix, the selection being indicated by a permute control, and execution circuitry to execute the decoded instruction as per the opcode.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 2, 2020
    Inventors: Elmoustapha OULD-AHMED-VALL, Jonathan D. PEARCE, Dan BAUM, Guei-Yuan LUEH, Michael ESPIG, Christopher J. HUGHES, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
  • Publication number: 20200210517
    Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE is further to store an NZ element for use in a subsequent multiplication.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 2, 2020
    Inventors: Dan BAUM, Chen KOREN, Elmoustapha OULD-AHMED-VALL, Michael ESPIG, Christopher J. HUGHES, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
  • Patent number: 10664273
    Abstract: An apparatus and method for processing efficient multicast operation.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: May 26, 2020
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Dan Baum
  • Publication number: 20200104135
    Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
    Type: Application
    Filed: September 27, 2018
    Publication date: April 2, 2020
    Inventors: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE
  • Publication number: 20200097291
    Abstract: An apparatus and method for tile-based gather and scatter operations. For example, one embodiment of a processor comprises: a destination tile register to store a 2-D arrangement of data elements; a first source tile register to store indices associated with the data elements; instruction fetch circuitry to fetch a tile gather instruction comprising operands identifying the first source tile register and the destination tile register; a decoder to decode the tile gather instruction; and execution circuitry to determine a plurality of system memory addresses based on the indices from the first source tile register and to load the data elements from the system memory addresses to the destination tile register.
    Type: Application
    Filed: September 24, 2018
    Publication date: March 26, 2020
    Inventors: Christopher J. HUGHES, Bret TOLL, Alexander HEINECKE, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark CHARNEY
  • Publication number: 20200097298
    Abstract: An apparatus and method for processing array of structures (AoS) and structure of arrays (SoA) data. For example, one embodiment of a processor comprises: a destination tile register to store data elements in a structure of arrays (SoA) format; a first source tile register to store indices associated with the data elements; instruction fetch circuitry to fetch an array of structures (AoS) gather instruction comprising operands identifying the first source tile register and the destination tile register; a decoder to decode the AoS gather instruction; and execution circuitry to determine a plurality of system memory addresses based on the indices from the first source tile register, to read data elements from the system memory addresses in an AoS format, and to load the data elements to the destination tile register in an SoA format.
    Type: Application
    Filed: September 24, 2018
    Publication date: March 26, 2020
    Inventors: Christopher J. HUGHES, Bret TOLL, Alexander HEINECKE, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark CHARNEY
  • Publication number: 20200065352
    Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers, are discussed.
    Type: Application
    Filed: July 1, 2017
    Publication date: February 27, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Mark J. CHARNEY, Elmoustapha OULD-AHMED-VALL, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Bret L. TOLL, Raanan SADE, Igor YANOVER, Yuri GEBIL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Menachem ADELMAN, Simon RUBANOVICH
  • Publication number: 20200050452
    Abstract: Disclosed embodiments relate to apparatuses, systems, and methods for performing sort indexing and/or permutation using an index. An exemplary apparatus includes decode circuitry to decode an instruction, the instruction to include a first field to identify a location of a source vector, a second field to identify a location of a destination vector, and an opcode to indicate to execution circuitry to execute the decoded instruction to sort values of the source vector and store a result of the sort in the destination vector by generating, per each element of the source vector, an index value using one or more comparisons of the element itself to other data elements of the source vector, and permuting the values of the elements of the source vector based upon the index values for the elements; and execution circuitry to execute the decoded instruction as indicated by the opcode.
    Type: Application
    Filed: March 27, 2019
    Publication date: February 13, 2020
    Inventors: Dan BAUM, Ronen ZOHAR, Asit MISHRA, Prasoonkumar SURTI, Elmoustapha OULD-AHMED-VALL, Christopher HUGHES, Alexander HEINECKE
  • Patent number: 10509846
    Abstract: An accelerator for increasing the processing speed of a processor. The accelerator operates in two distinct modes. In a first mode for dense layer processing, row data sets and column data sets are sent to a multiplier for multiplication. In a second mode for sparse layer processing compressed row data sets are received by a row multiplexer and compressed column data sets are received by a column multiplexer. Each multiplexer is configured to compare the indexes of data sets with one another to determine matching indexes. When indexes match, the matching data sets are selected and sent to the multiplier for multiplication. When indexes do not match, data sets are stored in memory devices for subsequent cycles.
    Type: Grant
    Filed: December 13, 2017
    Date of Patent: December 17, 2019
    Assignee: Intel Corporation
    Inventors: Chen Koren, Dan Baum
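
The two modes described in the abstract of patent 10509846 can be illustrated with a small software analogy. This is only a sketch under stated assumptions, not the patented hardware: the function names (`compress`, `dense_dot`, `sparse_dot`) and the (index, value) pair layout for compressed data sets are hypothetical, and a two-pointer scan stands in for the row and column multiplexers that compare indexes and hold back non-matching data sets for later cycles.

```python
def compress(vec):
    """Hypothetical compressed form: (index, value) pairs for non-zero entries."""
    return [(k, v) for k, v in enumerate(vec) if v != 0]


def dense_dot(row, col):
    """Dense mode: every row element is multiplied with its column element."""
    return sum(r * c for r, c in zip(row, col))


def sparse_dot(row_c, col_c):
    """Sparse mode: multiply only entries whose indexes match.

    row_c and col_c are compressed data sets sorted by index. An entry
    whose index has no match yet is kept (its pointer does not advance)
    and compared again on the next step, loosely mirroring data sets
    that are stored for subsequent cycles.
    """
    i = j = 0
    total = 0
    while i < len(row_c) and j < len(col_c):
        row_idx, row_val = row_c[i]
        col_idx, col_val = col_c[j]
        if row_idx == col_idx:
            total += row_val * col_val  # matching indexes go to the multiplier
            i += 1
            j += 1
        elif row_idx < col_idx:
            i += 1  # row entry has no partner; column entry is held
        else:
            j += 1  # column entry has no partner; row entry is held
    return total


row = [0, 2, 0, 3]
col = [4, 5, 0, 6]
print(dense_dot(row, col))
print(sparse_dot(compress(row), compress(col)))
```

Both calls compute the same dot product; the sparse path simply skips the multiplications that would involve a zero operand.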