Patents Assigned to Recogni Inc.
  • Patent number: 12293163
    Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
    Type: Grant
    Filed: May 31, 2024
    Date of Patent: May 6, 2025
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
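
As a rough illustration of the two-stage accumulation described in the abstract of patent 12293163, here is a minimal Python sketch. The function name `two_stage_mac`, the `period` parameter (standing in for the user-adjustable number of values per accumulation period), and the final flush of a partial period are assumptions made for this example. Python integers are unbounded, so the bit widths of the two stages are not modeled.

```python
def two_stage_mac(a, b, period):
    """Sum the products a[i] * b[i] using two accumulator stages.

    `period` is a hypothetical name for the number of products the first
    accumulator collects before the second accumulator absorbs its total.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    first = 0    # first accumulator: running total of recent products
    second = 0   # second accumulator: periodic sum of first-stage totals
    count = 0
    for x, y in zip(a, b):
        first += x * y
        count += 1
        if count == period:
            second += first   # second stage accumulates the running total
            first = 0         # first stage is initialized for a new period
            count = 0
    # Assumption: a partial final period is flushed into the second stage.
    second += first
    return second


print(two_stage_mac([1, 2, 3, 4, 5], [2, 2, 2, 2, 2], period=2))
```

The result matches a plain sum of products; the split only changes where the partial totals are held.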
  • Patent number: 12271624
    Abstract: A memory system comprises a plurality of memory sub-systems, each with a memory bank and other circuit components. For each of the memory sub-systems, a first buffer receives and stores a read-modify-write request (with a read address, a write address and a first operand), a second operand is read from the memory bank at the location specified by the read address, a combiner circuit combines the first operand with the second operand, an activation circuit transforms the output of the combiner circuit, and the output of the activation circuit is stored in the memory bank at the location specified by the write address. The first operand and the write address may be stored in a second buffer while the second operand is read from the memory bank. Further, the output of the activation circuit may be first stored in the first buffer before being stored in the memory bank.
    Type: Grant
    Filed: March 13, 2023
    Date of Patent: April 8, 2025
    Assignee: Recogni Inc.
    Inventors: Gary S. Goldman, Ashwin Radhakrishnan
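
The read-modify-write flow in the abstract of patent 12271624 can be sketched for one memory sub-system as follows. This is a software model under stated assumptions, not the patented circuit: the class name `MemorySubsystem`, addition as the combiner, and ReLU as the activation are all chosen for illustration, since the abstract does not name specific operations.

```python
from collections import deque


class MemorySubsystem:
    """Software model of one memory bank with read-modify-write requests."""

    def __init__(self, size,
                 combine=lambda x, y: x + y,         # assumed combiner: addition
                 activation=lambda v: max(v, 0)):    # assumed activation: ReLU
        self.bank = [0] * size
        self.first_buffer = deque()   # holds pending requests
        self.combine = combine
        self.activation = activation

    def submit(self, read_addr, write_addr, operand):
        """Queue a request: (read address, write address, first operand)."""
        self.first_buffer.append((read_addr, write_addr, operand))

    def step(self):
        """Process one queued request; return False if none is pending."""
        if not self.first_buffer:
            return False
        read_addr, write_addr, first_operand = self.first_buffer.popleft()
        # Second buffer: keep the operand and write address during the read.
        second_buffer = (first_operand, write_addr)
        second_operand = self.bank[read_addr]
        first_operand, write_addr = second_buffer
        result = self.activation(self.combine(first_operand, second_operand))
        self.bank[write_addr] = result
        return True


mem = MemorySubsystem(size=4)
mem.bank[0] = 5
mem.submit(read_addr=0, write_addr=1, operand=-8)   # relu(-8 + 5) = 0
mem.submit(read_addr=0, write_addr=2, operand=3)    # relu(3 + 5) = 8
while mem.step():
    pass
print(mem.bank)
```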
  • Patent number: 12165041
    Abstract: In a low power hardware architecture for handling accumulation overflows in a convolver unit, an accumulator of the convolver unit computes a running total by successively summing dot products from a dot product computation module during an accumulation cycle. In response to the running total overflowing the maximum or minimum value of a data storage element, the accumulator transmits an overflow indicator to a controller and sets its output equal to a positive or negative overflow value. In turn, the controller disables the dot product computation module by clock gating, clamping one of its inputs to zero and/or holding its inputs to constant values. At the end of the accumulation cycle, the output of the accumulator is sampled. In response to a clear signal being asserted, the dot product computation module is enabled, and the running total is set to zero for the start of the next accumulation cycle.
    Type: Grant
    Filed: June 9, 2022
    Date of Patent: December 10, 2024
    Assignee: Recogni Inc.
    Inventors: Shabarivas Abhiram, Gary S. Goldman, Jian hui Huang, Eugene M. Feinberg
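
The overflow handling described in the abstract of patent 12165041 can be approximated in software as below. The sketch assumes a signed two's-complement storage element of `bits` bits and uses its largest and smallest representable values as the positive and negative overflow values; the abstract does not specify these. Stopping the loop stands in for disabling the dot product computation module, and the function name is hypothetical.

```python
def accumulate_with_saturation(dot_products, bits=16):
    """Sum dot products for one accumulation cycle, saturating on overflow.

    Returns (output, overflowed, evaluated), where `evaluated` counts how
    many dot products were consumed before the upstream module was disabled.
    """
    max_val = (1 << (bits - 1)) - 1    # assumed positive overflow value
    min_val = -(1 << (bits - 1))       # assumed negative overflow value
    total = 0
    overflowed = False
    evaluated = 0
    for dp in dot_products:
        evaluated += 1
        total += dp
        if total > max_val:
            total, overflowed = max_val, True
        elif total < min_val:
            total, overflowed = min_val, True
        if overflowed:
            # Models the controller disabling the dot product module:
            # no further dot products are consumed in this cycle.
            break
    return total, overflowed, evaluated


print(accumulate_with_saturation([100, 100, -500], bits=8))
```

Calling the function again for the next list of dot products plays the role of the clear signal, since the running total starts from zero on each call.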
  • Patent number: 12141685
    Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n<m); and a quantized representation of a 3×3 kernel with p-bit parameter values may include 9 q-bit mantissa values and one exponent shared between the q-bit mantissa values (q<p). Convolution of the kernel with the activation data may include computing a dot product of the 9 n-bit mantissa values with the 9 q-bit mantissa values, and summing the shared exponents. In a CNN with multiple kernels, multiple computing units (each corresponding to one of the kernels) may receive the quantized representation of the 3×3 array of m-bit activation values from the same quantization-alignment module.
    Type: Grant
    Filed: January 11, 2024
    Date of Patent: November 12, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
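
To make the shared-exponent arithmetic in the abstract of patent 12141685 concrete, the sketch below quantizes nine integer values to signed mantissas that share one exponent, then convolves by taking the dot product of the mantissas and summing the two exponents. The exponent selection rule (smallest right shift that makes every value fit) and truncation by arithmetic shift are assumptions; the abstract does not describe how the exponent is chosen or how values are rounded.

```python
def quantize_shared_exponent(values, mantissa_bits):
    """Return (mantissas, exponent) with every mantissa fitting in
    `mantissa_bits` signed bits and value ~= mantissa * 2**exponent."""
    if mantissa_bits < 1:
        raise ValueError("mantissa_bits must be at least 1")
    hi = (1 << (mantissa_bits - 1)) - 1
    lo = -(1 << (mantissa_bits - 1))
    exponent = 0
    # Assumed rule: grow the shared exponent until all shifted values fit.
    while any(not (lo <= (v >> exponent) <= hi) for v in values):
        exponent += 1
    return [v >> exponent for v in values], exponent


def quantized_dot(activations, weights, n, q):
    """Dot product of n-bit and q-bit mantissas plus the summed exponents."""
    act_m, act_e = quantize_shared_exponent(activations, n)
    wgt_m, wgt_e = quantize_shared_exponent(weights, q)
    mantissa_dot = sum(a * w for a, w in zip(act_m, wgt_m))
    return mantissa_dot, act_e + wgt_e


acts = [0, 16, 32, 48, 64, 80, 96, 112, 127]   # flattened 3x3 activations
kernel = [1] * 9                               # flattened 3x3 kernel
dot, exp = quantized_dot(acts, kernel, n=4, q=4)
print(dot, exp, dot << exp, sum(acts))         # approximate vs. exact result
```

Because the quantized activations depend only on the input data, the same `(mantissas, exponent)` pair could be handed to several kernels, which is the sharing the last sentence of the abstract describes.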
  • Patent number: 12045309
    Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
    Type: Grant
    Filed: November 29, 2023
    Date of Patent: July 23, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
  • Patent number: 12039290
    Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
    Type: Grant
    Filed: January 9, 2024
    Date of Patent: July 16, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
  • Patent number: 12026478
    Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
    Type: Grant
    Filed: January 9, 2024
    Date of Patent: July 2, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
  • Patent number: 12008069
    Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
    Type: Grant
    Filed: November 29, 2023
    Date of Patent: June 11, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
  • Patent number: 12007937
    Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
    Type: Grant
    Filed: November 29, 2023
    Date of Patent: June 11, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
  • Patent number: 11915126
    Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n<m); and a quantized representation of a 3×3 kernel with p-bit parameter values may include 9 q-bit mantissa values and one exponent shared between the q-bit mantissa values (q<p). Convolution of the kernel with the activation data may include computing a dot product of the 9 n-bit mantissa values with the 9 q-bit mantissa values, and summing the shared exponents. In a CNN with multiple kernels, multiple computing units (each corresponding to one of the kernels) may receive the quantized representation of the 3×3 array of m-bit activation values from the same quantization-alignment module.
    Type: Grant
    Filed: September 4, 2020
    Date of Patent: February 27, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
  • Patent number: 11762946
    Abstract: Convolution with a 5×5 kernel involves computing the dot product of a 5×5 data block with a 5×5 kernel. Instead of computing this dot product as a single sum of 25 products, the dot product is computed as a sum of four partial sums, where each partial sum is computed as a dot product of a 3×3 data block with a 3×3 kernel. The four partial sums may be computed by a single 3×3 convolver unit over four time periods. During each time period, at least some of the weights received by the 3×3 convolver unit may correspond to a quadrant of weights from the 5×5 kernel. A shifter circuit provides shifted columns (left or right shifted) of the input data to the 3×3 convolver unit, allowing the 3×3 convolver unit access to the 3×3 data block that spatially corresponds to a particular quadrant of weights from the 5×5 kernel.
    Type: Grant
    Filed: September 23, 2022
    Date of Patent: September 19, 2023
    Assignee: Recogni Inc.
    Inventors: Gary S. Goldman, Shabarivas Abhiram
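
The decomposition in the abstract of patent 11762946 can be checked numerically. The sketch below splits the 5×5 kernel into four quadrants at row and column offsets 0 and 3, zero-pads each quadrant to 3×3, and sums four 3×3 dot products. The particular split and the zero padding are assumptions for illustration; window selection via `window3x3` stands in for the shifter circuit, which is not modeled.

```python
def dot3x3(block, kernel):
    """Dot product of two 3x3 grids (what one 3x3 convolver unit computes)."""
    return sum(block[r][c] * kernel[r][c] for r in range(3) for c in range(3))


def window3x3(grid, r0, c0):
    """3x3 window of `grid` starting at (r0, c0); zero outside the grid."""
    rows, cols = len(grid), len(grid[0])
    return [[grid[r][c] if r < rows and c < cols else 0
             for c in range(c0, c0 + 3)]
            for r in range(r0, r0 + 3)]


def conv5x5_via_3x3(data, kernel):
    """5x5 dot product computed as four 3x3 partial sums, one per quadrant."""
    return sum(dot3x3(window3x3(data, r0, c0), window3x3(kernel, r0, c0))
               for r0 in (0, 3) for c0 in (0, 3))


data = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[(r + 2 * c) % 4 - 1 for c in range(5)] for r in range(5)]
direct = sum(data[r][c] * kernel[r][c] for r in range(5) for c in range(5))
print(conv5x5_via_3x3(data, kernel) == direct)
```

In hardware the four partial sums would be produced over four time periods by the same unit; here the generator expression simply evaluates them one after another.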
  • Patent number: 11694068
    Abstract: A convolutional engine is configured to process input data that is organized into horizontal stripes. The number of accumulators present in each convolver unit of the convolutional engine may equal a total number of rows of data in each of the horizontal stripes.
    Type: Grant
    Filed: July 8, 2022
    Date of Patent: July 4, 2023
    Assignee: Recogni Inc.
    Inventor: Eugene M. Feinberg
  • Patent number: 11694069
    Abstract: Contiguous columns of a convolutional engine are partitioned into two or more groups. Each group of columns may be used to process input data. Filter weights assigned to one group may be distinct from filter weights assigned to another group.
    Type: Grant
    Filed: July 8, 2022
    Date of Patent: July 4, 2023
    Assignee: Recogni Inc.
    Inventor: Eugene M. Feinberg
  • Patent number: 11645355
    Abstract: A system for evaluating a piecewise linear function includes a first look-up table with N entries, and a second look-up table with M entries, with M being less than N. Each of the N entries contains parameters that define a corresponding linear segment of the piecewise linear function. The system further includes a controller configured to store a subset of the N entries from the first look-up table in the second look-up table. The system further includes a classifier for receiving an input value and classifying the input value in one of a plurality of segments of a number line. A total number of the segments is equal to M, and the segments are non-overlapping and contiguous. The system further includes a multiplexor for selecting one of the M entries of the second look-up table based on the classification of the input value into one of the plurality of segments.
    Type: Grant
    Filed: December 30, 2022
    Date of Patent: May 9, 2023
    Assignee: Recogni Inc.
    Inventors: Gilles J. C. A. Backhus, Gary S. Goldman
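
The two-table lookup in the abstract of patent 11645355 might be modeled as follows. Each entry is assumed to be a `(slope, intercept)` pair, and the classifier is assumed to compare the input against M-1 sorted breakpoints; the abstract says only that entries hold parameters defining a linear segment and that the M segments are non-overlapping and contiguous. The class and method names are hypothetical.

```python
from bisect import bisect_right


class PiecewiseLinear:
    """Evaluate a piecewise linear function from a small active look-up table."""

    def __init__(self, first_table):
        self.first_table = list(first_table)   # N entries: (slope, intercept)
        self.second_table = []                 # M entries copied from the first
        self.breakpoints = []                  # M-1 boundaries between segments

    def load(self, indices, breakpoints):
        """Controller step: copy a subset of the N entries (M < N)."""
        if not 0 < len(indices) < len(self.first_table):
            raise ValueError("need 0 < M < N entries")
        if len(breakpoints) != len(indices) - 1:
            raise ValueError("M segments need M-1 breakpoints")
        self.second_table = [self.first_table[i] for i in indices]
        self.breakpoints = sorted(breakpoints)

    def evaluate(self, x):
        segment = bisect_right(self.breakpoints, x)    # classifier
        slope, intercept = self.second_table[segment]  # multiplexor
        return slope * x + intercept


pwl = PiecewiseLinear([(0, 0), (1, 0), (2, -1), (0, 6)])
pwl.load(indices=[0, 1], breakpoints=[0])   # two segments forming a ReLU
print(pwl.evaluate(-3), pwl.evaluate(2))
```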
  • Patent number: 11645504
    Abstract: A convolutional engine is configured to process input data that is organized into vertical stripes.
    Type: Grant
    Filed: July 8, 2022
    Date of Patent: May 9, 2023
    Assignee: Recogni Inc.
    Inventor: Eugene M. Feinberg
  • Patent number: 11630605
    Abstract: A memory system comprises a plurality of memory sub-systems, each with a memory bank and other circuit components. For each of the memory sub-systems, a first buffer receives and stores a read-modify-write request (with a read address, a write address and a first operand), a second operand is read from the memory bank at the location specified by the read address, a combiner circuit combines the first operand with the second operand, an activation circuit transforms the output of the combiner circuit, and the output of the activation circuit is stored in the memory bank at the location specified by the write address. The first operand and the write address may be stored in a second buffer while the second operand is read from the memory bank. Further, the output of the activation circuit may be first stored in the first buffer before being stored in the memory bank.
    Type: Grant
    Filed: August 10, 2022
    Date of Patent: April 18, 2023
    Assignee: Recogni Inc.
    Inventors: Gary S. Goldman, Ashwin Radhakrishnan
  • Patent number: 11593630
    Abstract: A hardware architecture for implementing a convolutional neural network. Certain ones of the convolver units may be controlled to be active and others may be controlled to be non-active by a controller in order to perform convolution with a striding of greater than or equal to two.
    Type: Grant
    Filed: July 8, 2022
    Date of Patent: February 28, 2023
    Assignee: Recogni Inc.
    Inventor: Eugene M. Feinberg
  • Patent number: 11580372
    Abstract: A hardware architecture for implementing a convolutional neural network.
    Type: Grant
    Filed: July 8, 2022
    Date of Patent: February 14, 2023
    Assignee: Recogni Inc.
    Inventor: Eugene M. Feinberg
  • Patent number: 11468302
    Abstract: A hardware architecture for implementing a convolutional neural network.
    Type: Grant
    Filed: February 12, 2019
    Date of Patent: October 11, 2022
    Assignee: Recogni Inc.
    Inventor: Eugene M. Feinberg
  • Patent number: 11468316
    Abstract: A method for instantiating a convolutional neural network on a computing system. The convolutional neural network includes a plurality of layers, and instantiating the convolutional neural network includes training the convolutional neural network using a first loss function until a first classification accuracy is reached, clustering a set of F×K kernels of the first layer into a set of C clusters, training the convolutional neural network using a second loss function until a second classification accuracy is reached, creating a dictionary which maps each of a number of centroids to a corresponding centroid identifier, quantizing and compressing F filters of the first layer, storing F quantized and compressed filters of the first layer in a memory of the computing system, storing F biases of the first layer in the memory, and classifying data received by the convolutional neural network.
    Type: Grant
    Filed: February 12, 2019
    Date of Patent: October 11, 2022
    Assignee: Recogni Inc.
    Inventors: Gilles J. C. A. Backhus, Eugene M. Feinberg
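
One step from the abstract of patent 11468316, the dictionary that maps centroids to centroid identifiers, lends itself to a short sketch. The code below assumes the centroids are already available (training and clustering are omitted), treats each kernel as a flat tuple of numbers, and encodes every kernel of every filter as the identifier of its nearest centroid by squared Euclidean distance. The distance measure and the function names are assumptions, not details from the abstract.

```python
def nearest_centroid(kernel, centroids):
    """Return the centroid closest to `kernel` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda c: sq_dist(kernel, c))


def compress_filters(filters, centroids):
    """Build a centroid -> identifier dictionary and encode each filter
    (a list of K kernels) as a list of K centroid identifiers."""
    dictionary = {tuple(c): cid for cid, c in enumerate(centroids)}
    encoded = [[dictionary[tuple(nearest_centroid(k, centroids))] for k in f]
               for f in filters]
    return dictionary, encoded


centroids = [(0, 0, 0, 0), (1, 1, 1, 1)]          # C = 2 centroids
filters = [                                       # F = 2 filters, K = 2 kernels
    [(0.1, 0, 0, 0), (0.9, 1, 1, 1)],
    [(1, 1, 1, 0.8), (1, 1, 1, 1)],
]
dictionary, encoded = compress_filters(filters, centroids)
print(encoded)
```

Storing the small identifiers in place of full kernels is what makes the encoded filters compact; the dictionary is needed only to recover the centroid values.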