Patents Assigned to Recogni Inc.
-
Patent number: 12293163
Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
Type: Grant
Filed: May 31, 2024
Date of Patent: May 6, 2025
Assignee: Recogni Inc.
Inventors: Jian hui Huang, Gary S. Goldman
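The two-stage accumulation described in the abstract above can be illustrated with a minimal software sketch. This is not the patented circuit: the function name `two_stage_accumulate` and the `period` parameter (standing in for the user-adjustable number of values per accumulation period) are hypothetical, and Python integers are unbounded, so the bit-width aspects are not modeled.

```python
def two_stage_accumulate(a, b, period):
    """Sum the products a[i] * b[i] using two accumulator stages.

    `period` is a hypothetical stand-in for the user-adjustable number of
    products the first accumulator collects before it is handed off.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    first = 0   # first-stage accumulator: sums individual products
    second = 0  # second-stage accumulator: sums first-stage running totals
    count = 0
    for x, y in zip(a, b):
        first += x * y
        count += 1
        if count == period:
            second += first  # second stage absorbs the running total
            first = 0        # first stage is initialized for a new period
            count = 0
    # Flush a partially filled final period.
    return second + first
```

For any valid `period`, the result equals the plain sum of products; only the grouping of the additions changes, which is what would let a first stage stay narrower than the second in hardware.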
-
Patent number: 12271624
Abstract: A memory system comprises a plurality of memory sub-systems, each with a memory bank and other circuit components. For each of the memory sub-systems, a first buffer receives and stores a read-modify-write request (with a read address, a write address and a first operand), a second operand is read from the memory bank at the location specified by the read address, a combiner circuit combines the first operand with the second operand, an activation circuit transforms the output of the combiner circuit, and the output of the activation circuit is stored in the memory bank at the location specified by the write address. The first operand and the write address may be stored in a second buffer while the second operand is read from the memory bank. Further, the output of the activation circuit may be first stored in the first buffer before being stored in the memory bank.
Type: Grant
Filed: March 13, 2023
Date of Patent: April 8, 2025
Assignee: Recogni Inc.
Inventors: Gary S. Goldman, Ashwin Radhakrishnan
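The data flow of a single read-modify-write request, as described in the abstract above, can be sketched in a few lines. The abstract does not say which operations the combiner and activation circuits perform; addition and a ReLU-style clamp are assumptions chosen for illustration, and the function name and the list-based memory bank are hypothetical. Buffering and timing are not modeled.

```python
def read_modify_write(bank, read_addr, write_addr, first_operand,
                      combine=lambda a, b: a + b,      # assumed combiner: addition
                      activate=lambda v: max(v, 0)):   # assumed activation: ReLU
    """Apply one read-modify-write request to a memory bank (a Python list)."""
    second_operand = bank[read_addr]                   # read from the bank
    combined = combine(first_operand, second_operand)  # combiner circuit
    bank[write_addr] = activate(combined)              # activation, then write back
    return bank[write_addr]


bank = [1, -5, 0]
read_modify_write(bank, read_addr=0, write_addr=0, first_operand=4)
# bank[0] now holds max(4 + 1, 0), i.e. 5
```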
-
Patent number: 12165041
Abstract: In a low power hardware architecture for handling accumulation overflows in a convolver unit, an accumulator of the convolver unit computes a running total by successively summing dot products from a dot product computation module during an accumulation cycle. In response to the running total overflowing the maximum or minimum value of a data storage element, the accumulator transmits an overflow indicator to a controller and sets its output equal to a positive or negative overflow value. In turn, the controller disables the dot product computation module by clock gating, clamping one of its inputs to zero and/or holding its inputs to constant values. At the end of the accumulation cycle, the output of the accumulator is sampled. In response to a clear signal being asserted, the dot product computation module is enabled, and the running total is set to zero for the start of the next accumulation cycle.
Type: Grant
Filed: June 9, 2022
Date of Patent: December 10, 2024
Assignee: Recogni Inc.
Inventors: Shabarivas Abhiram, Gary S. Goldman, Jian hui Huang, Eugene M. Feinberg
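The overflow handling in the abstract above can be approximated by a behavioral sketch of one accumulation cycle. The function name, the `bits` parameter, and the use of a signed two's-complement range as the overflow limits are assumptions. In hardware the dot product module would be disabled after an overflow; here the remaining inputs are simply skipped and counted.

```python
def saturating_accumulate(dot_products, bits=16):
    """Model one accumulation cycle of a saturating accumulator.

    Returns (output, overflowed, skipped), where `skipped` counts the dot
    products that arrive after the overflow and are never added.
    """
    max_val = (1 << (bits - 1)) - 1   # assumed positive overflow value
    min_val = -(1 << (bits - 1))      # assumed negative overflow value
    total = 0
    overflowed = False
    skipped = 0
    for dp in dot_products:
        if overflowed:
            skipped += 1              # stands in for the disabled (gated) module
            continue
        total += dp
        if total > max_val:
            total, overflowed = max_val, True
        elif total < min_val:
            total, overflowed = min_val, True
    return total, overflowed, skipped
```

Once the running total saturates, the output is pinned at the overflow value for the rest of the cycle, so later dot products cannot change the sampled result; that is why skipping them loses nothing.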
-
Patent number: 12141685
Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n<m); and a quantized representation of a 3×3 kernel with p-bit parameter values may include 9 q-bit mantissa values and one exponent shared between the q-bit mantissa values (q<p). Convolution of the kernel with the activation data may include computing a dot product of the 9 n-bit mantissa values with the 9 q-bit mantissa values, and summing the shared exponents. In a CNN with multiple kernels, multiple computing units (each corresponding to one of the kernels) may receive the quantized representation of the 3×3 array of m-bit activation values from the same quantization-alignment module.
Type: Grant
Filed: January 11, 2024
Date of Patent: November 12, 2024
Assignee: Recogni Inc.
Inventors: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
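The shared-exponent scheme in the abstract above resembles block floating point, and a small sketch may make the arithmetic concrete. The following assumes signed integer inputs and truncation by right shift; the function names and the rounding choice are hypothetical and are not taken from the patent.

```python
def quantize_block(values, n_bits):
    """Return (mantissas, exponent) with values[i] roughly mantissas[i] * 2**exponent.

    All mantissas fit in a signed n_bits integer and share one exponent.
    """
    limit = (1 << (n_bits - 1)) - 1          # largest positive mantissa
    peak = max(abs(v) for v in values)
    exponent = 0
    while (peak >> exponent) > limit:        # smallest shift that fits the peak
        exponent += 1
    mantissas = [v >> exponent for v in values]  # arithmetic shift (floors)
    return mantissas, exponent


def quantized_dot(activations, weights, n_bits, q_bits):
    """Dot product of two quantized blocks: multiply mantissas, add exponents."""
    ma, ea = quantize_block(activations, n_bits)
    mw, ew = quantize_block(weights, q_bits)
    mantissa_dot = sum(x * y for x, y in zip(ma, mw))
    return mantissa_dot, ea + ew             # value is mantissa_dot * 2**(ea + ew)


acts = [256, 512, 0, 0, 0, 0, 0, 0, 0]       # a flattened 3x3 activation block
kernel = [1] * 9                             # a flattened 3x3 kernel
dot, exp = quantized_dot(acts, kernel, n_bits=4, q_bits=4)
# dot << exp reproduces 768 exactly here because no low-order bits were lost
```

In general the shift discards low-order bits, so the result is an approximation; the example values were picked so that the truncation is lossless.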
-
Patent number: 12045309
Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
Type: Grant
Filed: November 29, 2023
Date of Patent: July 23, 2024
Assignee: Recogni Inc.
Inventors: Jian hui Huang, Gary S. Goldman
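Two ideas from the abstract above lend themselves to a short illustration: a processing element that accumulates per-cycle dot products over time, and the fact that 1×1 convolution reduces to matrix multiplication. The sketch below is a software analogy only; the `lanes` parameter (the vector length handled per modeled cycle) and all function names are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors (one modeled clock cycle)."""
    return sum(a * b for a, b in zip(u, v))


def pe_accumulate(u, v, lanes):
    """Model one processing element: one `lanes`-wide dot product per cycle,
    accumulated over successive cycles."""
    acc = 0
    for i in range(0, len(u), lanes):
        acc += dot(u[i:i + lanes], v[i:i + lanes])
    return acc


def conv1x1(pixels, filters, lanes=2):
    """1x1 convolution: each output is the dot product of one pixel's channel
    vector with one filter's weight vector, i.e. pixels times filters transposed."""
    return [[pe_accumulate(p, f, lanes) for f in filters] for p in pixels]


pixels = [[1, 2], [3, 4]]    # two pixels, two input channels each
filters = [[1, 0], [1, 1]]   # two output channels
outputs = conv1x1(pixels, filters)
```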
-
Patent number: 12039290
Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
Type: Grant
Filed: January 9, 2024
Date of Patent: July 16, 2024
Assignee: Recogni Inc.
Inventors: Jian hui Huang, Gary S. Goldman
-
Patent number: 12026478
Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
Type: Grant
Filed: January 9, 2024
Date of Patent: July 2, 2024
Assignee: Recogni Inc.
Inventors: Jian hui Huang, Gary S. Goldman
-
Patent number: 12008069
Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
Type: Grant
Filed: November 29, 2023
Date of Patent: June 11, 2024
Assignee: Recogni Inc.
Inventors: Jian hui Huang, Gary S. Goldman
-
Patent number: 12007937
Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
Type: Grant
Filed: November 29, 2023
Date of Patent: June 11, 2024
Assignee: Recogni Inc.
Inventors: Jian hui Huang, Gary S. Goldman
-
Patent number: 11915126
Abstract: Dynamic data quantization may be applied to minimize the power consumption of a system that implements a convolutional neural network (CNN). Under such a quantization scheme, a quantized representation of a 3×3 array of m-bit activation values may include 9 n-bit mantissa values and one exponent shared between the n-bit mantissa values (n<m); and a quantized representation of a 3×3 kernel with p-bit parameter values may include 9 q-bit mantissa values and one exponent shared between the q-bit mantissa values (q<p). Convolution of the kernel with the activation data may include computing a dot product of the 9 n-bit mantissa values with the 9 q-bit mantissa values, and summing the shared exponents. In a CNN with multiple kernels, multiple computing units (each corresponding to one of the kernels) may receive the quantized representation of the 3×3 array of m-bit activation values from the same quantization-alignment module.
Type: Grant
Filed: September 4, 2020
Date of Patent: February 27, 2024
Assignee: Recogni Inc.
Inventors: Jian hui Huang, James Michael Bodwin, Pradeep R. Joginipally, Shabarivas Abhiram, Gary S. Goldman, Martin Stefan Patz, Eugene M. Feinberg, Berend Ozceri
-
Patent number: 11762946
Abstract: Convolution with a 5×5 kernel involves computing the dot product of a 5×5 data block with a 5×5 kernel. Instead of computing this dot product as a single sum of 25 products, the dot product is computed as a sum of four partial sums, where each partial sum is computed as a dot product of a 3×3 data block with a 3×3 kernel. The four partial sums may be computed by a single 3×3 convolver unit over four time periods. During each time period, at least some of the weights received by the 3×3 convolver unit may correspond to a quadrant of weights from the 5×5 kernel. A shifter circuit provides shifted columns (left or right shifted) of the input data to the 3×3 convolver unit, allowing the 3×3 convolver unit access to the 3×3 data block that spatially corresponds to a particular quadrant of weights from the 5×5 kernel.
Type: Grant
Filed: September 23, 2022
Date of Patent: September 19, 2023
Assignee: Recogni Inc.
Inventors: Gary S. Goldman, Shabarivas Abhiram
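The decomposition in the abstract above can be checked numerically with a small sketch. One way to obtain four 3×3 quadrants from a 5×5 kernel is to zero-pad both the kernel and the data block to 6×6; that padding choice, along with every function name below, is an assumption made for illustration and is not claimed to match the patent's shifter-based data path.

```python
def dot3x3(block, kernel):
    """Dot product of a 3x3 data block with a 3x3 kernel (one convolver pass)."""
    return sum(block[r][c] * kernel[r][c] for r in range(3) for c in range(3))


def pad_to_6x6(m):
    """Zero-pad a 5x5 matrix with one extra row and column (assumed scheme)."""
    return [[m[r][c] if r < 5 and c < 5 else 0 for c in range(6)]
            for r in range(6)]


def sub3x3(m, r0, c0):
    """Extract the 3x3 sub-block whose top-left corner is (r0, c0)."""
    return [row[c0:c0 + 3] for row in m[r0:r0 + 3]]


def conv5x5_via_3x3(data, kernel):
    """5x5 dot product computed as four 3x3 partial sums, one per quadrant."""
    d, k = pad_to_6x6(data), pad_to_6x6(kernel)
    total = 0
    for r0 in (0, 3):          # each iteration models one of four time periods
        for c0 in (0, 3):
            total += dot3x3(sub3x3(d, r0, c0), sub3x3(k, r0, c0))
    return total
```

Because the padded weights are zero, the values in the padded data positions never contribute, so the four partial sums add up to the same result as the direct sum of 25 products.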
-
Patent number: 11694068
Abstract: A convolutional engine is configured to process input data that is organized into horizontal stripes. The number of accumulators present in each convolver unit of the convolutional engine may equal a total number of rows of data in each of the horizontal stripes.
Type: Grant
Filed: July 8, 2022
Date of Patent: July 4, 2023
Assignee: Recogni Inc.
Inventor: Eugene M. Feinberg
-
Patent number: 11694069
Abstract: Contiguous columns of a convolutional engine are partitioned into two or more groups. Each group of columns may be used to process input data. Filter weights assigned to one group may be distinct from filter weights assigned to another group.
Type: Grant
Filed: July 8, 2022
Date of Patent: July 4, 2023
Assignee: Recogni Inc.
Inventor: Eugene M. Feinberg
-
Patent number: 11645355
Abstract: A system for evaluating a piecewise linear function includes a first look-up table with N entries, and a second look-up table with M entries, with M being less than N. Each of the N entries contains parameters that define a corresponding linear segment of the piecewise linear function. The system further includes a controller configured to store a subset of the N entries from the first look-up table in the second look-up table. The system further includes a classifier for receiving an input value and classifying the input value in one of a plurality of segments of a number line. A total number of the segments is equal to M, and the segments are non-overlapping and contiguous. The system further includes a multiplexor for selecting one of the M entries of the second look-up table based on the classification of the input value into one of the plurality of segments.
Type: Grant
Filed: December 30, 2022
Date of Patent: May 9, 2023
Assignee: Recogni Inc.
Inventors: Gilles J. C. A. Backhus, Gary S. Goldman
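The two-table arrangement in the abstract above can be mimicked in software. The sketch below assumes each table entry is a `(slope, intercept)` pair and that the M segments are delimited by M - 1 sorted breakpoints; the abstract only says that entries hold "parameters" defining a linear segment, so both choices, and all names, are hypothetical.

```python
from bisect import bisect_right


def load_second_table(first_table, selected):
    """Controller step: copy a subset of the N first-table entries (M < N)."""
    if len(selected) >= len(first_table):
        raise ValueError("second table must have fewer entries than the first")
    return [first_table[i] for i in selected]


def evaluate(second_table, boundaries, x):
    """Classify x into one of M contiguous segments, select that entry, and
    evaluate the assumed linear form slope * x + intercept."""
    if len(boundaries) != len(second_table) - 1:
        raise ValueError("M segments require M - 1 boundaries")
    segment = bisect_right(boundaries, x)        # classifier
    slope, intercept = second_table[segment]     # multiplexor selection
    return slope * x + intercept


first_table = [(0, 0), (1, 0), (2, -3), (0, 6)]  # N = 4 assumed segments
second_table = load_second_table(first_table, selected=[0, 1])  # M = 2
y_neg = evaluate(second_table, boundaries=[0], x=-2)  # falls in segment 0
y_pos = evaluate(second_table, boundaries=[0], x=3)   # falls in segment 1
```

With entries `(0, 0)` and `(1, 0)` and a single breakpoint at 0, the second table evaluates a ReLU-shaped function; loading a different subset would retarget the same classifier and selection logic to another part of the curve.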
-
Patent number: 11645504
Abstract: A convolutional engine is configured to process input data that is organized into vertical stripes.
Type: Grant
Filed: July 8, 2022
Date of Patent: May 9, 2023
Assignee: Recogni Inc.
Inventor: Eugene M. Feinberg
-
Patent number: 11630605
Abstract: A memory system comprises a plurality of memory sub-systems, each with a memory bank and other circuit components. For each of the memory sub-systems, a first buffer receives and stores a read-modify-write request (with a read address, a write address and a first operand), a second operand is read from the memory bank at the location specified by the read address, a combiner circuit combines the first operand with the second operand, an activation circuit transforms the output of the combiner circuit, and the output of the activation circuit is stored in the memory bank at the location specified by the write address. The first operand and the write address may be stored in a second buffer while the second operand is read from the memory bank. Further, the output of the activation circuit may be first stored in the first buffer before being stored in the memory bank.
Type: Grant
Filed: August 10, 2022
Date of Patent: April 18, 2023
Assignee: Recogni Inc.
Inventors: Gary S. Goldman, Ashwin Radhakrishnan
-
Patent number: 11593630
Abstract: A hardware architecture for implementing a convolutional neural network. Certain ones of the convolver units may be controlled to be active and others may be controlled to be non-active by a controller in order to perform convolution with a striding of greater than or equal to two.
Type: Grant
Filed: July 8, 2022
Date of Patent: February 28, 2023
Assignee: Recogni Inc.
Inventor: Eugene M. Feinberg
-
Patent number: 11580372
Abstract: A hardware architecture for implementing a convolutional neural network.
Type: Grant
Filed: July 8, 2022
Date of Patent: February 14, 2023
Assignee: Recogni Inc.
Inventor: Eugene M. Feinberg
-
Patent number: 11468302
Abstract: A hardware architecture for implementing a convolutional neural network.
Type: Grant
Filed: February 12, 2019
Date of Patent: October 11, 2022
Assignee: Recogni Inc.
Inventor: Eugene M. Feinberg
-
Patent number: 11468316
Abstract: A method for instantiating a convolutional neural network on a computing system. The convolutional neural network includes a plurality of layers, and instantiating the convolutional neural network includes training the convolutional neural network using a first loss function until a first classification accuracy is reached, clustering a set of F×K kernels of the first layer into a set of C clusters, training the convolutional neural network using a second loss function until a second classification accuracy is reached, creating a dictionary which maps each of a number of centroids to a corresponding centroid identifier, quantizing and compressing F filters of the first layer, storing F quantized and compressed filters of the first layer in a memory of the computing system, storing F biases of the first layer in the memory, and classifying data received by the convolutional neural network.
Type: Grant
Filed: February 12, 2019
Date of Patent: October 11, 2022
Assignee: Recogni Inc.
Inventors: Gilles J. C. A. Backhus, Eugene M. Feinberg