Patents Assigned to Habana Labs Ltd.
  • Patent number: 11847491
    Abstract: An apparatus for Machine Learning (ML) processing includes computational engines and a Central Processing Unit (CPU). The CPU is configured to receive a work plan for processing one or more samples in accordance with a ML model represented by a corresponding ML graph. The work plan specifies jobs required for executing at least a subgraph of the ML graph by the computational engines; the subgraph includes multiple inputs and is executable independently of other parts of the ML graph when the inputs are valid. The CPU is further configured to pre-process only a partial subset of the jobs in the work plan corresponding to the subgraph, producing a group of pre-processed jobs that are required for executing part of the subgraph based on the one or more samples, and to submit the pre-processed jobs in the group to the computational engines for execution.
    Type: Grant
    Filed: April 22, 2021
    Date of Patent: December 19, 2023
    Assignee: HABANA LABS LTD.
    Inventors: Oren Kaidar, Oded Gabbay
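To make the partial pre-processing idea concrete, here is a minimal Python sketch: only jobs whose inputs are already valid are pre-processed, up to a budget, and that group is submitted while the rest of the work plan waits. The Job structure, the budget parameter, and the descriptor contents are invented for illustration and are not the patented implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    deps: list                                      # jobs or graph inputs this job consumes
    descriptor: dict = field(default_factory=dict)  # filled in during pre-processing

def preprocess_partial(work_plan, valid_inputs, budget):
    """Pre-process only a partial subset of the jobs whose dependencies are already
    satisfied, up to a given budget, and return that group for submission."""
    done = set(valid_inputs)
    group = []
    for job in work_plan:
        if len(group) == budget:
            break                                   # remaining jobs are handled in a later pass
        if all(d in done for d in job.deps):
            job.descriptor = {"engine_args": f"prepared:{job.name}"}  # stand-in for real pre-processing
            done.add(job.name)
            group.append(job)
    return group

def submit(jobs):
    for job in jobs:                                # stand-in for handing descriptors to the engines
        print("submitting", job.name, job.descriptor)

# Example: a tiny subgraph with two valid inputs and four jobs
plan = [Job("conv0", ["in0"]), Job("conv1", ["in1"]),
        Job("add", ["conv0", "conv1"]), Job("relu", ["add"])]
submit(preprocess_partial(plan, valid_inputs={"in0", "in1"}, budget=2))
```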
  • Patent number: 11714653
    Abstract: A method for computing includes defining a processing pipeline, including at least a first stage in which producer processors compute and output data to respective locations in a buffer and a second processing stage in which one or more consumer processors read the data from the buffer and apply a computational task to the data read from the buffer. The computational task is broken into multiple, independent work units, for application by the consumer processors to respective ranges of the data in the buffer, and respective indexes are assigned to the work units in a predefined index space. A mapping is generated between the index space and the addresses in the buffer, and execution of the work units is scheduled such that at least one of the work units can begin execution before all the producer processors have completed the first processing stage.
    Type: Grant
    Filed: February 15, 2021
    Date of Patent: August 1, 2023
    Assignee: HABANA LABS LTD.
    Inventors: Tzachi Cohen, Michael Zuckerman, Doron Singer, Ron Shalev, Amos Goldman
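A minimal sketch (not the patented mechanism) of the index-space idea: each work unit's index maps to an address range in the shared buffer, and a work unit may begin as soon as the producers covering its range have written their data, before the whole first stage completes. The linear mapping and the readiness check are assumptions made for illustration.

```python
def work_unit_range(index, unit_size, base=0):
    """Map a work-unit index in a 1-D index space to an address range in the buffer."""
    start = base + index * unit_size
    return start, start + unit_size

def ready_units(num_units, unit_size, written_upto):
    """Work units whose entire input range has already been produced.
    `written_upto` is the highest buffer address the producers have filled so far."""
    return [i for i in range(num_units)
            if work_unit_range(i, unit_size)[1] <= written_upto]

# Producers have filled the first 3 KB of the buffer; 1 KB work units 0..2 can start now
print(ready_units(num_units=8, unit_size=1024, written_upto=3 * 1024))
```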
  • Patent number: 11532338
    Abstract: An electronic circuit includes a memory buffer and control logic. The memory buffer is configured to transfer data from a first domain to a second domain of the circuit, the first and the second domains operating in synchronization with respective clock signals. The control logic is configured to maintain a write indicator in the first domain indicative of a next write position in the memory buffer for storing data, to maintain a read indicator in the second domain indicative of a next read position in the memory buffer for retrieving the stored data, to generate in the second domain, based on the write and the read indicators, a first signal that is indicative of whether the memory buffer has data for reading or has become empty, and to retain the first signal in a state that indicates that the memory buffer has become empty, until writing to the memory buffer resumes.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: December 20, 2022
    Assignee: HABANA LABS LTD.
    Inventors: Ehud Eliaz, Yamin Mokatren
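Below is a behavioral Python model (not RTL) of the empty-indication behavior described in the abstract: once the buffer drains, the empty signal stays asserted until a new write arrives. Clock-domain synchronization and overflow handling are deliberately omitted, and the class name and pointer scheme are illustrative choices.

```python
class CdcFifoModel:
    """Behavioral model of a dual-domain buffer whose 'empty' indication is retained
    after the buffer drains, until writing resumes. Overflow checks omitted."""
    def __init__(self, depth):
        self.depth = depth
        self.wr = 0            # write indicator, maintained in the first (write) domain
        self.rd = 0            # read indicator, maintained in the second (read) domain
        self.empty = True      # first signal, generated in the read domain

    def write(self, _data):
        self.wr = (self.wr + 1) % (2 * self.depth)
        self.empty = False     # writing resumed, so the empty indication is released

    def read(self):
        if self.wr == self.rd:
            return None        # nothing to read; empty stays asserted
        self.rd = (self.rd + 1) % (2 * self.depth)
        if self.rd == self.wr:
            self.empty = True  # buffer just drained; retained until the next write
        return "data"

fifo = CdcFifoModel(depth=4)
fifo.write("x"); fifo.read()
print(fifo.empty)              # True, and it stays True until write() is called again
```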
  • Patent number: 11468147
    Abstract: A computational apparatus for implementing a neural network model having multiple neurons that evaluate an activation function includes a memory and circuitry. The memory is configured to hold values of a difference-function, each value being a respective difference between the activation function and a predefined baseline function. The circuitry is configured to evaluate the neural network model, including, for at least one of the neurons and a given argument of the activation function: evaluate the baseline function at the argument, retrieve from the memory one or more values of the difference-function responsively to the argument, and evaluate the activation function at the argument based on the baseline function at the argument and on the one or more values of the difference-function.
    Type: Grant
    Filed: February 24, 2020
    Date of Patent: October 11, 2022
    Assignee: HABANA LABS LTD.
    Inventors: Elad Hofer, Sergei Gofman, Shlomo Raikin
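A small NumPy sketch of the baseline-plus-difference idea: a sigmoid is evaluated as a cheap piecewise-linear baseline plus a stored difference value. The choice of sigmoid, the "hard sigmoid" baseline, the table resolution, and the nearest-neighbor lookup are all illustrative assumptions, not the patented design.

```python
import numpy as np

def baseline(x):
    # Cheap piecewise-linear stand-in ("hard sigmoid"); the real baseline is not specified here
    return np.clip(0.25 * x + 0.5, 0.0, 1.0)

# Difference-function table: activation minus baseline, sampled on a coarse grid
grid = np.linspace(-8.0, 8.0, 257)
diff_table = 1.0 / (1.0 + np.exp(-grid)) - baseline(grid)

def activation(x):
    """sigmoid(x) ~= baseline(x) + nearest stored difference value."""
    idx = np.clip(np.round((x - grid[0]) / (grid[1] - grid[0])).astype(int), 0, len(grid) - 1)
    return baseline(x) + diff_table[idx]

x = np.linspace(-6, 6, 7)
print(np.max(np.abs(activation(x) - 1.0 / (1.0 + np.exp(-x)))))  # small approximation error
```

The difference-function is much flatter than the activation itself, so a coarse table suffices where a direct lookup of the activation would need far more entries.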
  • Patent number: 11467827
    Abstract: A method for computing includes providing software source code defining a processing pipeline including multiple, sequential stages of parallel computations, in which a plurality of processors apply a computational task to data read from a buffer. A static code analysis is applied to the software source code so as to break the computational task into multiple, independent work units, and to define an index space in which the work units are identified by respective indexes. Based on the static code analysis, mapping parameters that define a mapping between the index space and addresses in the buffer are computed, with the mapping indicating the respective ranges of the data to which the work units are to be applied. The source code is compiled so that the processors execute the work units identified by the respective indexes while accessing the data in the buffer in accordance with the mapping.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: October 11, 2022
    Assignee: HABANA LABS LTD.
    Inventors: Michael Zuckerman, Tzachi Cohen, Doron Singer, Ron Shalev, Amos Goldman
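The sketch below illustrates what such compile-time mapping parameters could look like once derived: an affine (base, stride, size) triple that ties each work-unit index to the buffer addresses it reads. The toy derivation and all names are assumptions; the actual static analysis operates on real source code.

```python
def derive_mapping(elem_size, elems_per_work_unit, base_addr):
    """Toy stand-in for the compile-time step: derive affine mapping parameters
    (base, stride, size) relating work-unit indexes to buffer address ranges."""
    stride = elem_size * elems_per_work_unit
    return {"base": base_addr, "stride": stride, "size": stride}

def address_range(m, index):
    start = m["base"] + index * m["stride"]
    return start, start + m["size"]

# A compiler following this scheme would emit the parameters alongside the compiled work units
m = derive_mapping(elem_size=4, elems_per_work_unit=64, base_addr=0x1000)
print(address_range(m, 0), address_range(m, 5))
```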
  • Publication number: 20220201075
    Abstract: Systems, methods, and computer-readable media that store instructions for remote direct memory access (RDMA) transfers.
    Type: Application
    Filed: December 17, 2020
    Publication date: June 23, 2022
    Applicant: HABANA LABS LTD.
    Inventors: Itay Zur, Ira Joffe, Guy Hershtig, Amit Pessach, Yanai Pomeranz
  • Patent number: 11321092
    Abstract: A processor includes an internal memory and processing circuitry. The internal memory is configured to store a definition of a multi-dimensional array stored in an external memory, and indices that specify elements of the multi-dimensional array in terms of multi-dimensional coordinates of the elements within the array. The processing circuitry is configured to execute instructions in accordance with an Instruction Set Architecture (ISA) defined for the processor. At least some of the instructions in the ISA access the multi-dimensional array by operating on the multi-dimensional coordinates specified in the indices.
    Type: Grant
    Filed: October 25, 2018
    Date of Patent: May 3, 2022
    Assignee: HABANA LABS LTD.
    Inventors: Shlomo Raikin, Sergei Gofman, Ran Halutz, Evgeny Spektor, Amos Goldman, Ron Shalev
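To illustrate coordinate-based array access, here is a short sketch in which an array descriptor holds dimensions and strides, and an access specifies multi-dimensional coordinates that are translated (and clamped) to a linear external-memory address. The descriptor fields and the clamping policy are assumptions made for illustration, not the ISA's actual definition.

```python
from dataclasses import dataclass

@dataclass
class TensorDescriptor:
    base: int            # address of the array in external memory
    dims: tuple          # size of each dimension
    strides: tuple       # element stride of each dimension
    elem_size: int = 4   # bytes per element

def coord_to_address(desc, coords):
    """Translate multi-dimensional coordinates into a linear external-memory address,
    clamping out-of-range coordinates (one plausible policy, chosen for illustration)."""
    offset = 0
    for c, dim, stride in zip(coords, desc.dims, desc.strides):
        offset += max(0, min(c, dim - 1)) * stride
    return desc.base + offset * desc.elem_size

# A 4x8x16 tensor; an "instruction" addresses element (2, 5, 7) by its coordinates
desc = TensorDescriptor(base=0x8000, dims=(4, 8, 16), strides=(128, 16, 1))
print(hex(coord_to_address(desc, (2, 5, 7))))
```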
  • Publication number: 20220060423
    Abstract: Systems, methods, and computer-readable media that store instructions for remote direct memory access (RDMA) congestion control.
    Type: Application
    Filed: August 23, 2020
    Publication date: February 24, 2022
    Applicant: HABANA LABS LTD.
    Inventors: Itay Zur, Ira Joffe, Shlomo Raikin
  • Patent number: 11249724
    Abstract: A computational apparatus includes a memory unit and Read-Modify-Write (RMW) logic. The memory unit is configured to hold a data value. The RMW logic, which is coupled to the memory unit, is configured to perform an atomic RMW operation on the data value stored in the memory unit.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: February 15, 2022
    Assignee: HABANA LABS LTD.
    Inventors: Shlomo Raikin, Ron Shalev, Sergei Gofman, Ran Halutz, Nadav Klein
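A behavioral Python sketch of an atomic read-modify-write on a stored value. A lock stands in for the hardware guarantee that the read, the modification, and the write-back form one indivisible step; the class and method names are invented for illustration.

```python
import threading

class RmwMemoryModel:
    """Behavioral model of a memory word with attached read-modify-write logic."""
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def atomic_rmw(self, modify):
        with self._lock:                  # read, modify, and write back as one indivisible step
            old = self._value
            self._value = modify(old)
            return old                    # RMW operations conventionally return the old value

mem = RmwMemoryModel(0)
threads = [threading.Thread(target=lambda: [mem.atomic_rmw(lambda v: v + 1) for _ in range(1000)])
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(mem._value)   # always 4000, because each increment is an atomic RMW
```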
  • Patent number: 11240162
    Abstract: Systems, methods, and computer-readable media that store instructions for remote direct memory access (RDMA) congestion control.
    Type: Grant
    Filed: August 23, 2020
    Date of Patent: February 1, 2022
    Assignee: HABANA LABS LTD.
    Inventors: Itay Zur, Ira Joffe, Shlomo Raikin
  • Patent number: 10915297
    Abstract: Computational apparatus includes a systolic array of processing elements. In each of a sequence of processing cycles, the processing elements in a first row of the array each receive a respective first plurality of first operands, while the processing elements in a first column of the array each receive a respective second plurality of second operands. Each processing element, except in the first row and first column, receives the respective first and second pluralities of the operands from adjacent processing elements in a preceding row and column of the array. Each processing element multiplies pairs of the first and second operands together to generate multiple respective products, and accumulates the products in accumulators. Synchronization logic loads a succession of first and second vectors of the operands into the array, and upon completion of processing triggers the processing elements to transfer respective data values from the accumulators out of the array.
    Type: Grant
    Filed: November 12, 2018
    Date of Patent: February 9, 2021
    Assignee: HABANA LABS LTD.
    Inventors: Ran Halutz, Tomer Rothschild, Ron Shalev
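The following NumPy sketch is a cycle-level functional model of the dataflow the abstract describes: first-row and first-column elements receive fresh, skewed operands each cycle, interior elements take operands from their upper and left neighbors, every element multiplies and accumulates, and the accumulators are drained at the end. It is a software illustration of an output-stationary systolic array, with the synchronization logic reduced to a fixed cycle count.

```python
import numpy as np

def systolic_matmul(A, B):
    """Cycle-level functional model of an output-stationary systolic array."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    acc = np.zeros((M, N))        # per-PE accumulators
    a_reg = np.zeros((M, N))      # operand registers passed right along rows
    b_reg = np.zeros((M, N))      # operand registers passed down along columns
    for t in range(M + N + K - 2):              # cycles until the last operands reach the far corner
        a_reg = np.roll(a_reg, 1, axis=1)       # interior PEs read from the preceding column
        b_reg = np.roll(b_reg, 1, axis=0)       # interior PEs read from the preceding row
        for i in range(M):                      # feed the first column with A, skewed by row index
            k = t - i
            a_reg[i, 0] = A[i, k] if 0 <= k < K else 0.0
        for j in range(N):                      # feed the first row with B, skewed by column index
            k = t - j
            b_reg[0, j] = B[k, j] if 0 <= k < K else 0.0
        acc += a_reg * b_reg                    # every PE multiplies and accumulates concurrently
    return acc                                  # synchronization logic would now drain the accumulators

A, B = np.random.rand(3, 4), np.random.rand(4, 5)
print(np.allclose(systolic_matmul(A, B), A @ B))   # True
```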
  • Patent number: 10915494
    Abstract: A vector processor includes a coefficient memory and a processor. The processor has an Instruction Set Architecture (ISA), which includes an instruction that approximates a mathematical function by a polynomial. The processor is configured to approximate the mathematical function at a given argument, by reading one or more coefficients of the polynomial from the coefficient memory and evaluating the polynomial at the argument using the coefficients.
    Type: Grant
    Filed: November 11, 2018
    Date of Patent: February 9, 2021
    Assignee: HABANA LABS LTD.
    Inventors: Ron Shalev, Evgeny Spektor, Sergei Gofman, Ran Halutz, Shlomo Raikin, Hilla Ben Yaacov
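A short sketch of what such an instruction computes: index a coefficient memory by the argument, then evaluate the stored polynomial at the argument. The choice of exp on [0, 1), the per-interval quadratic fit, and the table layout are illustrative assumptions; only the read-coefficients-then-evaluate flow reflects the abstract.

```python
import numpy as np

# "Coefficient memory": per-interval quadratic coefficients approximating exp(x) on [0, 1).
# Fitting with polyfit here stands in for whatever offline process fills the real memory.
INTERVALS = 16
coeff_mem = []
for i in range(INTERVALS):
    xs = np.linspace(i / INTERVALS, (i + 1) / INTERVALS, 32)
    coeff_mem.append(np.polyfit(xs, np.exp(xs), deg=2))   # highest power first

def approx_exp(x):
    """Model of the instruction: select the coefficients for the argument's interval,
    then evaluate the polynomial at the argument (Horner's rule via polyval)."""
    idx = min(int(x * INTERVALS), INTERVALS - 1)
    return np.polyval(coeff_mem[idx], x)

x = 0.37
print(approx_exp(x), np.exp(x))   # the two values agree to several decimal places
```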
  • Patent number: 10853070
    Abstract: A processor includes a processing engine, an address queue, an address generation unit, and logic circuitry. The processing engine is configured to process instructions that access data in an external memory. The address generation unit is configured to generate respective addresses for the instructions to be processed by the processing engine, to provide the addresses to the processing engine, and to write the addresses to the address queue. The logic circuitry is configured to access the external memory on behalf of the processing engine while compensating for variations in access latency to the external memory, by reading the addresses from the address queue, and executing the instructions in the external memory in accordance with the addresses read from the address queue.
    Type: Grant
    Filed: October 3, 2018
    Date of Patent: December 1, 2020
    Assignee: HABANA LABS LTD.
    Inventors: Ron Shalev, Evgeny Spektor, Ran Halutz
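A behavioral sketch of the decoupling described above: the address generation unit runs ahead and pushes addresses into a queue, and the memory-access logic drains the queue at whatever rate the external memory sustains, insulating the processing engine from variable access latency. The class, the fixed stride, and the per-cycle request limit are invented for illustration.

```python
from collections import deque

class AddressQueueModel:
    """Addresses are generated ahead of time and queued; the memory-access logic
    consumes them whenever the external memory can accept another request."""
    def __init__(self):
        self.addr_queue = deque()

    def generate(self, base, count, stride=4):
        for i in range(count):                    # address generation runs ahead of the accesses
            self.addr_queue.append(base + i * stride)

    def service_memory(self, max_requests):
        """Issue up to `max_requests` accesses this cycle (models variable memory bandwidth)."""
        issued = []
        while self.addr_queue and len(issued) < max_requests:
            issued.append(self.addr_queue.popleft())
        return issued

q = AddressQueueModel()
q.generate(base=0x2000, count=6)
print([hex(a) for a in q.service_memory(4)])   # this cycle the memory accepted 4 requests
print([hex(a) for a in q.service_memory(4)])   # the remaining 2 drain later, engine unaffected
```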
  • Patent number: 10853448
    Abstract: Computational apparatus includes a memory, which is configured to contain multiple matrices of input data values. An array of processing elements is configured to perform multiplications of respective first and second input operands and to accumulate products of the multiplication to generate respective output values. Data access logic is configured to select from the memory a plurality of mutually-disjoint first matrices and a second matrix, and to distribute to the processing elements the input data values in a sequence that is interleaved among the first matrices, along with corresponding input data values from the second matrix, so as to cause the processing elements to compute, in the interleaved sequence, respective convolutions of each of the first matrices with the second matrix.
    Type: Grant
    Filed: September 11, 2017
    Date of Patent: December 1, 2020
    Assignee: HABANA LABS LTD.
    Inventors: Ron Shalev, Tomer Rothschild
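The NumPy sketch below illustrates the interleaved distribution order: several disjoint input matrices share one kernel, and the output elements of their convolutions are produced in a sequence that alternates among the matrices. The correlation form (kernel not flipped), the equal input sizes, and valid-mode output are simplifying assumptions for illustration only.

```python
import numpy as np

def interleaved_convolutions(first_mats, kernel):
    """Convolve each (mutually disjoint, equal-sized) first matrix with the shared second
    matrix, visiting the matrices in an interleaved order, one output element at a time."""
    kh, kw = kernel.shape
    outs = [np.zeros((m.shape[0] - kh + 1, m.shape[1] - kw + 1)) for m in first_mats]
    oh, ow = outs[0].shape
    for i in range(oh):
        for j in range(ow):
            for idx, m in enumerate(first_mats):        # interleave across the input matrices
                outs[idx][i, j] = np.sum(m[i:i + kh, j:j + kw] * kernel)
    return outs

mats = [np.random.rand(6, 6) for _ in range(3)]   # three disjoint first matrices
kernel = np.random.rand(3, 3)                      # the shared second matrix
outs = interleaved_convolutions(mats, kernel)
print(outs[0].shape)                               # (4, 4) valid-mode result per input matrix
```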
  • Patent number: 10713214
    Abstract: Computational apparatus includes a systolic array of processing elements, each including a multiplier and first and second accumulators. In each of a sequence of processing cycles, the processing elements perform the following steps concurrently: Each processing element, except in the first row and first column of the array, receives first and second operands from adjacent processing elements in a preceding row and column of the array, respectively, multiplies the first and second operands together to generate a product, and accumulates the product in the first accumulator. In addition, each processing element passes a stored output data value from the second accumulator to a succeeding processing element along a respective column of the array, receives a new output data value from a preceding processing element along the respective column, and stores the new output data value in the second accumulator.
    Type: Grant
    Filed: September 20, 2018
    Date of Patent: July 14, 2020
    Assignee: HABANA LABS LTD.
    Inventors: Ron Shalev, Ran Halutz
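The essential idea here is double buffering inside each processing element: one accumulator gathers the products of the computation in flight while the other shifts an already-finished result down the column. The sketch models a single column for one cycle; the class and function names are invented and the timing is simplified.

```python
class PE:
    """One processing element with two accumulators: acc1 gathers products of the
    current pass while acc2 carries a previously computed output down the column."""
    def __init__(self):
        self.acc1 = 0.0
        self.acc2 = 0.0

def column_cycle(pes, a_ops, b_ops, value_from_above):
    """One cycle of a single column: every PE multiply-accumulates into acc1 while
    the acc2 values shift one PE down, so results drain during computation."""
    shifted_out = pes[-1].acc2                       # value leaving the bottom of the column
    for idx in range(len(pes) - 1, 0, -1):
        pes[idx].acc2 = pes[idx - 1].acc2            # receive from the preceding PE in the column
    pes[0].acc2 = value_from_above
    for pe, a, b in zip(pes, a_ops, b_ops):
        pe.acc1 += a * b                             # compute proceeds concurrently with the drain
    return shifted_out

pes = [PE() for _ in range(4)]
pes[2].acc2 = 42.0                                   # a result from the previous pass, still draining
out = column_cycle(pes, a_ops=[1, 2, 3, 4], b_ops=[5, 6, 7, 8], value_from_above=0.0)
print(out, [pe.acc1 for pe in pes], [pe.acc2 for pe in pes])
```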
  • Patent number: 10491241
    Abstract: An apparatus includes an input interface and compression circuitry. The input interface is configured to receive input source data. The compression circuitry is configured to set a symbol anchor value having the highest occurrence probability among the symbol values in the input source data, to generate a bit-map by (i) for every symbol in the input source data whose symbol value is the anchor value, setting a respective bit in the bit-map to a first binary value, and (ii) for every symbol in the source data whose symbol value differs from the anchor value, setting the respective bit in the bit-map to a second binary value, and to generate compressed data including (i) the bit-map and (ii) the symbols whose symbol values differ from the symbol anchor value.
    Type: Grant
    Filed: July 1, 2018
    Date of Patent: November 26, 2019
    Assignee: Habana Labs Ltd.
    Inventor: Shlomo Raikin
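A Python sketch of the compression scheme the abstract describes: pick the most frequent symbol as the anchor, emit a bit-map marking the anchor positions, and keep only the non-anchor symbols. The bit polarity (1 for anchor positions) and the in-memory representation are illustrative choices rather than the patented format.

```python
from collections import Counter

def compress(symbols):
    """Anchor-value compression: the bit-map marks which positions hold the anchor
    (most frequent) value; only the other symbols are kept explicitly."""
    anchor = Counter(symbols).most_common(1)[0][0]
    bitmap = [1 if s == anchor else 0 for s in symbols]     # 1 = "this position is the anchor"
    residuals = [s for s in symbols if s != anchor]
    return anchor, bitmap, residuals

def decompress(anchor, bitmap, residuals):
    it = iter(residuals)
    return [anchor if bit else next(it) for bit in bitmap]

data = [0, 0, 7, 0, 3, 0, 0, 9]
anchor, bitmap, residuals = compress(data)
print(anchor, bitmap, residuals)                       # 0, [1, 1, 0, 1, 0, 1, 1, 0], [7, 3, 9]
print(decompress(anchor, bitmap, residuals) == data)   # True
```

The scheme pays off when one symbol value (for example zero, as in sparse activations) dominates the data, since each anchor occurrence costs a single bit.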
  • Patent number: 10489479
    Abstract: Computational apparatus includes a memory, which contains first and second input matrices of input data values, having at least three dimensions including respective heights and widths in a predefined sampling space and a common depth in a feature dimension, orthogonal to the sampling space. Each element of an array of processing elements performs a multiplication of respective first and second input operands and accumulates products of the multiplication to generate a respective output value. Data access logic extracts first and second pluralities of vectors of the input data values extending in the feature dimension from the first and second input matrices, respectively, and distributes the input data values from the extracted vectors in sequence to the processing elements so as to cause the processing elements to compute a convolution of first and second two-dimensional matrices composed respectively of the first and second pluralities of vectors.
    Type: Grant
    Filed: September 11, 2017
    Date of Patent: November 26, 2019
    Assignee: Habana Labs Ltd.
    Inventors: Ron Shalev, Sergei Gofman, Amos Goldman, Tomer Rothschild
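A NumPy sketch of the access pattern described above: vectors along the feature (depth) dimension are extracted from the two inputs and laid out as two 2-D matrices, so the convolution reduces to a matrix product over the shared depth. A 1x1 spatial footprint is assumed for brevity, and all shapes are illustrative.

```python
import numpy as np

H, W, C, K = 4, 5, 8, 3              # spatial height/width, common feature depth, output features
x = np.random.rand(H, W, C)          # first input matrix  (height x width x depth)
w = np.random.rand(K, C)             # second input: K filter vectors along the feature dimension

# Extract the feature-dimension vectors: one C-long vector per spatial position
x_vectors = x.reshape(H * W, C)      # first plurality of vectors, arranged as a 2-D matrix

# The convolution over the feature dimension is then a product of the two 2-D matrices
y = (x_vectors @ w.T).reshape(H, W, K)   # each output vector mixes the C input features

# Check against a direct per-position computation
ref = np.einsum('hwc,kc->hwk', x, w)
print(np.allclose(y, ref))           # True
```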
  • Patent number: 10491239
    Abstract: A computational device includes an input memory, which receives a first array of input numbers having a first precision represented by N bits. An output memory stores a second array of output numbers having a second precision represented by M bits, M<N. Quantization logic reads the input numbers from the input memory, extracts from each input number a set of M bits, at a bit offset within the input number that is indicated by a quantization factor, and writes a corresponding output number based on the extracted set of bits to the second array in the output memory. A quantization controller sets the quantization factor so as to optimally fit an available range of the output numbers in the second array to an actual range of the input numbers in the first array in extraction of the M bits from the input numbers.
    Type: Grant
    Filed: January 30, 2018
    Date of Patent: November 26, 2019
    Assignee: Habana Labs Ltd.
    Inventor: Itay Hubara
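A Python sketch of the bit-extraction quantization described above: choose a bit offset (the quantization factor) so that the M extracted bits cover the actual range of the N-bit inputs, then take an M-bit field from each input. Signs, rounding, and saturation are simplified away, and the function names are invented for illustration.

```python
import numpy as np

def quantize(inputs, m_bits=8):
    """Extract an M-bit field from each input at a bit offset chosen so that the available
    M-bit range covers the actual range of the inputs (magnitudes only, for brevity)."""
    actual_max = int(np.max(np.abs(inputs)))
    used_bits = max(actual_max.bit_length(), m_bits)
    offset = used_bits - m_bits                             # the quantization factor: where to cut
    outputs = (np.abs(inputs) >> offset).astype(np.int64)   # keep the top M significant bits in use
    return outputs, offset

def dequantize(outputs, offset):
    return outputs << offset                                # approximate reconstruction

vals = np.array([12345, 987654, 3, 250000], dtype=np.int64)
q, offset = quantize(vals)
print(offset, q, dequantize(q, offset))                     # coarse values spanning the 8-bit range
```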