Patents by Inventor Oleg Khavin
Oleg Khavin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12165042
Abstract: Neural network hardware acceleration data parallelism is performed by an integrated circuit including a plurality of memory banks, each memory bank among the plurality of memory banks configured to store values and to transmit stored values, a plurality of computation units, each computation unit among the plurality of computation units including one of a channel pipeline and a multiply-and-accumulate (MAC) element configured to perform a mathematical operation on an input data value and a weight value to produce a resultant data value, and a computation controller configured to cause a value transmission to be received by more than one computation unit or memory bank.
Type: Grant
Filed: April 13, 2023
Date of Patent: December 10, 2024
Assignee: EDGECORTIX INC.
Inventors: Nikolay Nez, Oleg Khavin, Tanvir Ahmed, Jens Huthmann, Sakyasingha Dasgupta
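As a rough illustration of the data-parallel broadcast this abstract describes, the Python sketch below models a computation controller that delivers a single value transmission to several multiply-and-accumulate (MAC) units at once; the class and method names are illustrative assumptions, not terms from the patent.

```python
# Hypothetical sketch: one value transmission is received by more than one
# computation unit, each performing a MAC on its own input data value.

class MacUnit:
    """Minimal multiply-and-accumulate element with a local accumulator."""

    def __init__(self):
        self.accumulator = 0

    def mac(self, input_value, weight_value):
        self.accumulator += input_value * weight_value


class ComputationController:
    """Broadcasts a single stored value to more than one computation unit."""

    def __init__(self, units):
        self.units = units

    def broadcast_weight(self, weight_value, input_values):
        # The same weight transmission reaches every unit; each unit pairs it
        # with its own input data value.
        for unit, input_value in zip(self.units, input_values):
            unit.mac(input_value, weight_value)


units = [MacUnit() for _ in range(4)]
controller = ComputationController(units)
controller.broadcast_weight(weight_value=3, input_values=[1, 2, 3, 4])
print([u.accumulator for u in units])  # [3, 6, 9, 12]
```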
-
Publication number: 20240386072
Abstract: Shifter implemented circulant permutation matrix operations are realized by an integrated circuit including a forward shifter configured to shift forward each sequential value of a target segment by a shift amount to produce a forward-shifted partial segment, a reverse shifter configured to shift in reverse each sequential value of the target segment by a reverse shift value equal to a segment length minus the shift amount to produce a reverse-shifted partial target segment, a combiner configured to combine the forward-shifted partial target segment with the reverse-shifted partial target segment according to the shift amount and the segment length to produce a shifted target segment, and a mask selector configured to select at least one of a merge mask corresponding to the shift amount and the segment length and a filter mask corresponding to the segment length.
Type: Application
Filed: May 8, 2024
Publication date: November 21, 2024
Inventors: Kunihiko IETOMI, Nikolay NEZ, Oleg KHAVIN, Sakyasingha DASGUPTA
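The sketch below is a minimal software model of this idea, under the assumption that the forward and reverse shifters are non-circular shifts whose partial segments are merged with a mask to realize a full rotation; all function names are invented for illustration.

```python
# Illustrative model: a forward shift by `shift` plus a reverse shift by
# (segment_length - shift), combined with a merge mask, produce the rotated
# (circulant-permuted) segment.

def forward_shift(segment, shift):
    # Shift each value forward by `shift`; vacated positions stay None.
    n = len(segment)
    return [None] * shift + segment[: n - shift]

def reverse_shift(segment, shift, segment_length):
    # Shift each value in reverse by (segment_length - shift).
    reverse_amount = segment_length - shift
    return segment[reverse_amount:] + [None] * reverse_amount

def merge_mask(shift, segment_length):
    # True where the forward-shifted partial segment supplies the value.
    return [i >= shift for i in range(segment_length)]

def circulant_shift(segment, shift):
    n = len(segment)
    fwd = forward_shift(segment, shift)
    rev = reverse_shift(segment, shift, n)
    mask = merge_mask(shift, n)
    # Combine the two partial segments according to the mask.
    return [f if m else r for f, r, m in zip(fwd, rev, mask)]

print(circulant_shift([0, 1, 2, 3, 4], shift=2))  # [3, 4, 0, 1, 2]
```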
-
Publication number: 20240385761
Abstract: Integrated circuit data stream processing utilizing paged buffering is performed by an integrated circuit that includes an upstream random access memory, a local read counter, a downstream random access memory, a local write counter, and a processor. The local read counter is configured to store a value indicating whether any accessible blocks of data are stored in the upstream memory. The local write counter is configured to store a value indicating whether any downstream pages of the downstream memory are available for recording. The processor is configured to adjust the local read counter to indicate a page release, adjust the local write counter to indicate a page occupy, read a first block of data recorded to the upstream memory, process the first block of data to produce a second block of data, and record the second block of data to the downstream memory.
Type: Application
Filed: February 28, 2024
Publication date: November 21, 2024
Inventors: Kunihiko IETOMI, Nikolay NEZ, Oleg KHAVIN, Sakyasingha DASGUPTA
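A minimal software analogue of this paged handshake follows; it is an assumed simplification (plain Python queues and integer counters stand in for the memories and hardware counters) rather than the circuit itself.

```python
# Sketch of paged buffering: a local read counter tracks accessible upstream
# blocks, a local write counter tracks free downstream pages, and the
# processor releases/occupies pages as it streams a block through.

from collections import deque

class PagedStreamStage:
    def __init__(self, downstream_pages):
        self.upstream = deque()                  # upstream random access memory
        self.downstream = deque()                # downstream random access memory
        self.read_counter = 0                    # accessible upstream blocks
        self.write_counter = downstream_pages    # free downstream pages

    def upstream_write(self, block):
        self.upstream.append(block)
        self.read_counter += 1

    def process_one(self, transform):
        # Proceed only when there is data to read and a free page to write.
        if self.read_counter == 0 or self.write_counter == 0:
            return False
        self.read_counter -= 1                   # indicate a page release
        self.write_counter -= 1                  # indicate a page occupy
        first_block = self.upstream.popleft()
        second_block = transform(first_block)    # process the first block
        self.downstream.append(second_block)     # record the second block
        return True

stage = PagedStreamStage(downstream_pages=2)
stage.upstream_write([1, 2, 3])
stage.process_one(lambda block: [x * 2 for x in block])
print(list(stage.downstream))  # [[2, 4, 6]]
```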
-
Publication number: 20240388310
Abstract: Low-Density Parity-Check (LDPC) data decoding using iteration-variable accuracy is performed by segmenting a LDPC encoded data block of probability values by dividing the LDPC encoded data block into a plurality of data probability value segments and a plurality of parity probability value segments, each probability value of the LDPC encoded data block representing a likelihood between binary values, decoding the LDPC encoded data block by adjusting, according to an iteration-variable accuracy parameter, the probability values of the LDPC encoded data block based on a parity-check matrix, the parity-check matrix defining correspondence among data probability value segments and parity probability value segments, and concatenating likely binary values that satisfy the parity-check matrix associated with the probability values of each data probability value segment to form a decoded data block. The iteration-variable accuracy parameter represents a tradeoff between accuracy and computational efficiency.
Type: Application
Filed: January 22, 2024
Publication date: November 21, 2024
Inventors: Kunihiko IETOMI, Nikolay NEZ, Oleg KHAVIN, Sakyasingha DASGUPTA
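The following is a heavily simplified sketch, not a working LDPC decoder: it only shows two ideas from the abstract, splitting a block of probability values into data and parity segments, and letting an iteration-variable accuracy parameter (here assumed to be a fractional-bit count) coarsen the arithmetic at each iteration.

```python
# Simplified sketch of segmentation plus iteration-variable accuracy.
# The actual belief-propagation updates against the parity-check matrix
# are omitted and only indicated by a comment.

def segment_block(llr_block, segment_length, num_data_segments):
    segments = [llr_block[i:i + segment_length]
                for i in range(0, len(llr_block), segment_length)]
    return segments[:num_data_segments], segments[num_data_segments:]

def quantize(value, fractional_bits):
    # Fewer fractional bits -> cheaper arithmetic, lower accuracy.
    scale = 1 << fractional_bits
    return round(value * scale) / scale

def decode_sketch(llr_block, segment_length, num_data_segments, accuracy_schedule):
    data_segments, parity_segments = segment_block(
        llr_block, segment_length, num_data_segments)
    for fractional_bits in accuracy_schedule:      # iteration-variable accuracy
        data_segments = [[quantize(v, fractional_bits) for v in seg]
                         for seg in data_segments]
        # ... parity-check based adjustments of the probability values
        #     would happen here in a real decoder ...
    # Concatenate likely binary values from the data segments.
    return [1 if v < 0 else 0 for seg in data_segments for v in seg]

bits = decode_sketch([0.9, -1.2, 0.3, -0.4, 1.1, -0.7],
                     segment_length=2, num_data_segments=2,
                     accuracy_schedule=[6, 4, 2])
print(bits)  # [0, 1, 0, 1]
```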
-
Publication number: 20240201987
Abstract: Neural network hardware acceleration is performed by an integrated circuit including sequentially connected computation modules. Each computation module includes a processor and an adder. The processor includes circuitry configured to receive an input data value and a weight value, and perform a mathematical operation on the input data value and the weight value to produce a resultant data value. The adder includes circuitry configured to receive the resultant data value directly from the processor, receive one of a preceding resultant data value and a preceding sum value directly from a preceding adder of a preceding computation module, add the resultant data value to the one of the preceding resultant data value and the preceding sum value to produce a sum value, and transmit one of the resultant data value and the sum value to the memory or directly to a subsequent adder of a subsequent computation module.
Type: Application
Filed: December 16, 2022
Publication date: June 20, 2024
Inventors: Oleg KHAVIN, Nikolay NEZ, Sakyasingha DASGUPTA
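A software-only model of the chained computation modules is sketched below; the multiplication standing in for the processor's mathematical operation and the class names are illustrative assumptions.

```python
# Each module multiplies an input value by a weight (processor) and its adder
# folds the result into the running sum passed along from the preceding module.

class ComputationModule:
    def __init__(self, weight_value):
        self.weight_value = weight_value

    def process(self, input_value, preceding_sum=0):
        resultant = input_value * self.weight_value   # processor operation
        return preceding_sum + resultant              # adder of this module

def chain_inference(modules, input_values):
    running_sum = 0
    for module, input_value in zip(modules, input_values):
        # The adder receives the preceding sum directly from the preceding
        # module rather than through memory.
        running_sum = module.process(input_value, running_sum)
    return running_sum

modules = [ComputationModule(w) for w in (2, 3, 4)]
print(chain_inference(modules, [1, 1, 1]))  # 2 + 3 + 4 = 9
```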
-
Publication number: 20240169192
Abstract: Neural network inference may be performed by obtaining a neural network and a configuration of an integrated circuit, the integrated circuit including a plurality of convolution modules, a plurality of adder modules, an accumulation memory, and a convolution output interconnect control module, determining at least one convolution output connection scheme whereby each convolution module has no more than one open direct connection through a plurality of convolution output interconnects to the accumulation memory or one of the plurality of adder modules, and generating integrated circuit instructions for the integrated circuit to perform inference of the neural network, the instructions including an instruction for the convolution output interconnect control module to configure the plurality of convolution output interconnects according to the at least one convolution output connection scheme.
Type: Application
Filed: December 22, 2023
Publication date: May 23, 2024
Inventors: Nikolay NEZ, Hamid Reza ZOHOURI, Oleg KHAVIN, Antonio Tomas Nevado VILCHEZ, Sakyasingha DASGUPTA
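The compiler-side flavor of this abstract can be pictured with the hypothetical sketch below: each convolution module is assigned at most one open direct connection, to an adder module or to the accumulation memory, and simple interconnect instructions are emitted. The grouping policy and instruction format are assumptions made only for illustration.

```python
# Hypothetical connection-scheme construction and instruction generation.

def build_connection_scheme(num_conv_modules, adder_fan_in):
    scheme = {}
    for conv_index in range(num_conv_modules):
        if adder_fan_in > 1:
            # Group conv outputs onto adders, adder_fan_in outputs per adder.
            scheme[f"conv{conv_index}"] = f"adder{conv_index // adder_fan_in}"
        else:
            # Otherwise connect each conv module directly to accumulation memory.
            scheme[f"conv{conv_index}"] = "accumulation_memory"
    return scheme

def interconnect_instructions(scheme):
    # One "open this interconnect" instruction per convolution module.
    return [("OPEN", source, destination) for source, destination in scheme.items()]

scheme = build_connection_scheme(num_conv_modules=4, adder_fan_in=2)
for instruction in interconnect_instructions(scheme):
    print(instruction)
# ('OPEN', 'conv0', 'adder0') ... ('OPEN', 'conv3', 'adder1')
```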
-
Patent number: 11893475
Abstract: Neural network inference may be performed by configuration of a device including an accumulation memory, a plurality of convolution modules configured to perform mathematical operations on input values, a plurality of adder modules configured to sum values output from the plurality of convolution modules, and a plurality of convolution output interconnects connecting the plurality of convolution modules, the plurality of adder modules, and the accumulation memory. The accumulation memory is an accumulation memory allocation of a writable memory block having a reconfigurable bank width, and each bank of the accumulation memory allocation is a virtual combination of consecutive banks of the writable memory block.
Type: Grant
Filed: October 11, 2021
Date of Patent: February 6, 2024
Assignee: EDGECORTIX INC.
Inventors: Nikolay Nez, Hamid Reza Zohouri, Oleg Khavin, Antonio Tomas Nevado Vilchez, Sakyasingha Dasgupta
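The reconfigurable bank width can be pictured with the small model below, in which each virtual accumulation-memory bank is an assumed combination of a fixed number of consecutive physical banks; the layout and names are illustrative only.

```python
# Model of a virtual bank built from consecutive physical banks of a
# writable memory block; `banks_per_virtual` sets the virtual bank width.

class WritableMemoryBlock:
    def __init__(self, num_physical_banks, bank_depth):
        self.banks = [[0] * bank_depth for _ in range(num_physical_banks)]

class AccumulationMemoryAllocation:
    def __init__(self, block, banks_per_virtual):
        self.block = block
        self.banks_per_virtual = banks_per_virtual

    def write(self, virtual_bank, address, values):
        # Spread the wide word across consecutive physical banks.
        first_physical = virtual_bank * self.banks_per_virtual
        for offset, value in enumerate(values):
            self.block.banks[first_physical + offset][address] = value

    def read(self, virtual_bank, address):
        first_physical = virtual_bank * self.banks_per_virtual
        return [self.block.banks[first_physical + offset][address]
                for offset in range(self.banks_per_virtual)]

block = WritableMemoryBlock(num_physical_banks=8, bank_depth=4)
accum = AccumulationMemoryAllocation(block, banks_per_virtual=2)
accum.write(virtual_bank=1, address=0, values=[10, 20])
print(accum.read(virtual_bank=1, address=0))  # [10, 20]
```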
-
Publication number: 20230252275
Abstract: Neural network hardware acceleration data parallelism is performed by an integrated circuit including a plurality of memory banks, each memory bank among the plurality of memory banks configured to store values and to transmit stored values, a plurality of computation units, each computation unit among the plurality of computation units including one of a channel pipeline and a multiply-and-accumulate (MAC) element configured to perform a mathematical operation on an input data value and a weight value to produce a resultant data value, and a computation controller configured to cause a value transmission to be received by more than one computation unit or memory bank.
Type: Application
Filed: April 13, 2023
Publication date: August 10, 2023
Inventors: Nikolay NEZ, Oleg KHAVIN, Tanvir AHMED, Jens HUTHMANN, Sakyasingha DASGUPTA
-
Patent number: 11657260
Abstract: Neural network hardware acceleration data parallelism is performed by an integrated circuit including a plurality of memory banks, each memory bank among the plurality of memory banks configured to store values and to transmit stored values, a plurality of computation units, each computation unit among the plurality of computation units including a processor including circuitry configured to perform a mathematical operation on an input data value and a weight value to produce a resultant data value, and a computation controller configured to cause a value transmission to be received by more than one computation unit or memory bank.
Type: Grant
Filed: October 26, 2021
Date of Patent: May 23, 2023
Assignee: EDGECORTIX PTE. LTD.
Inventors: Nikolay Nez, Oleg Khavin, Tanvir Ahmed, Jens Huthmann, Sakyasingha Dasgupta
-
Publication number: 20230128600
Abstract: Neural network hardware acceleration data parallelism is performed by an integrated circuit including a plurality of memory banks, each memory bank among the plurality of memory banks configured to store values and to transmit stored values, a plurality of computation units, each computation unit among the plurality of computation units including a processor including circuitry configured to perform a mathematical operation on an input data value and a weight value to produce a resultant data value, and a computation controller configured to cause a value transmission to be received by more than one computation unit or memory bank.
Type: Application
Filed: October 26, 2021
Publication date: April 27, 2023
Inventors: Nikolay NEZ, Oleg KHAVIN, Tanvir AHMED, Jens HUTHMANN, Sakyasingha DASGUPTA
-
Publication number: 20220215236
Abstract: Neural network inference may be performed by configuration of a device including an accumulation memory, a plurality of convolution modules configured to perform mathematical operations on input values, a plurality of adder modules configured to sum values output from the plurality of convolution modules, and a plurality of convolution output interconnects connecting the plurality of convolution modules, the plurality of adder modules, and the accumulation memory. The accumulation memory is an accumulation memory allocation of a writable memory block having a reconfigurable bank width, and each bank of the accumulation memory allocation is a virtual combination of consecutive banks of the writable memory block.
Type: Application
Filed: October 11, 2021
Publication date: July 7, 2022
Inventors: Nikolay NEZ, Hamid Reza ZOHOURI, Oleg KHAVIN, Antonio Tomas Nevado VILCHEZ, Sakyasingha DASGUPTA
-
Publication number: 20220027716
Abstract: Neural network inference may be performed by an apparatus or integrated circuit configured to perform mathematical operations on activation data stored in an activation data memory and weight values stored in a weight memory, to store values resulting from the mathematical operations onto an accumulation memory, to perform activation operations on the values stored in the accumulation memory, to store resulting activation data onto the activation data memory, and to perform inference of a neural network by feeding and synchronizing instructions from an external memory.
Type: Application
Filed: October 4, 2021
Publication date: January 27, 2022
Inventors: Nikolay Nez, Antonio Tomas Nevado Vilchez, Hamid Reza Zohouri, Mikhail Volkov, Oleg Khavin, Sakyasingha Dasgupta
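A simplified software analogue of the data flow in this abstract follows; Python dictionaries and a plain list stand in for the memories and the externally fed instruction stream, and the layer names are invented for illustration.

```python
# Weights and activations are read from their memories, partial results land
# in an accumulation memory, an activation operation is applied, and the
# result is written back to the activation data memory.

activation_memory = {"layer0": [1.0, -2.0, 3.0]}
weight_memory = {"layer1": [0.5, 0.25, -1.0]}
accumulation_memory = {}

def relu(x):
    return x if x > 0.0 else 0.0

def run_layer(input_key, weight_key, output_key):
    activations = activation_memory[input_key]
    weights = weight_memory[weight_key]
    # Mathematical operations on activation data and weight values.
    accumulation_memory[output_key] = [a * w for a, w in zip(activations, weights)]
    # Activation operation on the accumulated values, stored back as activations.
    activation_memory[output_key] = [relu(v) for v in accumulation_memory[output_key]]

# Instructions would be fed and synchronized from an external memory;
# here a plain list stands in for that instruction stream.
instruction_stream = [("layer0", "layer1", "layer1_out")]
for instruction in instruction_stream:
    run_layer(*instruction)
print(activation_memory["layer1_out"])  # [0.5, 0.0, 0.0]
```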
-
Patent number: 11188300
Abstract: Preparation and execution of quantized scaling may be performed by operations including obtaining an original array and a scaling factor representing a ratio of a size of the original array to a size of a scaled array, determining, for each column of the scaled array, a horizontal coordinate of each of two nearest elements in the horizontal dimension of the original array, and, for each row of the scaled array, a vertical coordinate of each of two nearest elements in the vertical dimension of the original array, calculating, for each row of the scaled array and each column of the scaled array, a linear interpolation coefficient, converting each value of the original array from a floating point number into a quantized number, converting each linear interpolation coefficient from a floating point number into a fixed point number, storing, in a memory, the horizontal coordinates and vertical coordinates as integers, the values as quantized numbers, and the linear interpolation coefficients as fixed point numbers
Type: Grant
Filed: June 18, 2021
Date of Patent: November 30, 2021
Assignee: EDGECORTIX PTE. LTD.
Inventors: Oleg Khavin, Nikolay Nez, Sakyasingha Dasgupta, Antonio Tomas Nevado Vilchez
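An illustrative sketch of the preparation steps follows: for each output row and column it finds the two nearest source coordinates and a fixed-point linear interpolation coefficient, and it quantizes the source values. The quantization scale, zero point, and fractional bit width are assumptions, not values from the patent.

```python
# Preparation of quantized bilinear scaling: coordinates as integers,
# coefficients as fixed-point numbers, array values as quantized numbers.

def nearest_coordinates_and_coefficient(out_index, scaling_factor, max_index):
    source_position = out_index * scaling_factor
    low = min(int(source_position), max_index)
    high = min(low + 1, max_index)
    coefficient = source_position - low        # linear interpolation coefficient in [0, 1)
    return low, high, coefficient

def to_fixed_point(value, fractional_bits=8):
    return int(round(value * (1 << fractional_bits)))

def to_quantized(value, scale=0.1, zero_point=0):
    return int(round(value / scale)) + zero_point

original = [[0.0, 1.0], [2.0, 3.0]]
scaled_height, scaled_width = 4, 4
row_scale = len(original) / scaled_height       # original size / scaled size
col_scale = len(original[0]) / scaled_width
rows = [nearest_coordinates_and_coefficient(r, row_scale, len(original) - 1)
        for r in range(scaled_height)]
cols = [nearest_coordinates_and_coefficient(c, col_scale, len(original[0]) - 1)
        for c in range(scaled_width)]
stored = {
    "vertical_coordinates": [(low, high) for low, high, _ in rows],
    "horizontal_coordinates": [(low, high) for low, high, _ in cols],
    "row_coefficients": [to_fixed_point(coeff) for *_, coeff in rows],
    "col_coefficients": [to_fixed_point(coeff) for *_, coeff in cols],
    "quantized_values": [[to_quantized(v) for v in row] for row in original],
}
print(stored)
```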
-
Publication number: 20210357732
Abstract: Neural network accelerator hardware-specific division of inference may be performed by operations including obtaining a computational graph and a hardware chip configuration. The operations also include dividing inference of the plurality of layers into a plurality of groups. Each group includes a number of sequential layers based on an estimate of duration and energy consumption by the hardware chip to perform inference of the neural network by performing the mathematical operations on activation data, sequentially by layer, of corresponding portions of layers of each group. The operations further include generating instructions for the hardware chip to perform inference of the neural network, sequentially by group, of the plurality of groups.
Type: Application
Filed: February 26, 2021
Publication date: November 18, 2021
Inventors: Nikolay NEZ, Antonio Tomas Nevado VILCHEZ, Hamid Reza ZOHOURI, Mikhail VOLKOV, Oleg KHAVIN, Sakyasingha DASGUPTA
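The grouping idea can be pictured with the toy sketch below: layers are walked in order and a new group starts whenever an estimated cost exceeds a budget. The per-layer cost values and the budget are stand-ins for the duration and energy model of the hardware chip, which the abstract does not spell out.

```python
# Divide sequential layers into groups by a cost budget, then emit one
# instruction per group so inference runs sequentially by group.

def divide_into_groups(layer_costs, group_budget):
    groups, current_group, current_cost = [], [], 0
    for layer_index, cost in enumerate(layer_costs):
        if current_group and current_cost + cost > group_budget:
            groups.append(current_group)
            current_group, current_cost = [], 0
        current_group.append(layer_index)
        current_cost += cost
    if current_group:
        groups.append(current_group)
    return groups

def generate_instructions(groups):
    # Within a group, inference proceeds sequentially by layer.
    return [("RUN_GROUP", group) for group in groups]

groups = divide_into_groups([3, 2, 4, 1, 5], group_budget=6)
print(groups)                          # [[0, 1], [2, 3], [4]]
print(generate_instructions(groups))
```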
-
Patent number: 11176449
Abstract: Neural network accelerator hardware-specific division of inference may be performed by operations including obtaining a computational graph and a hardware chip configuration. The operations also include dividing inference of the plurality of layers into a plurality of groups. Each group includes a number of sequential layers based on an estimate of duration and energy consumption by the hardware chip to perform inference of the neural network by performing the mathematical operations on activation data, sequentially by layer, of corresponding portions of layers of each group. The operations further include generating instructions for the hardware chip to perform inference of the neural network, sequentially by group, of the plurality of groups.
Type: Grant
Filed: February 26, 2021
Date of Patent: November 16, 2021
Assignee: EDGECORTIX PTE. LTD.
Inventors: Nikolay Nez, Antonio Tomas Nevado Vilchez, Hamid Reza Zohouri, Mikhail Volkov, Oleg Khavin, Sakyasingha Dasgupta
-
Patent number: 11144822
Abstract: Neural network inference may be performed by configuration of a device including a plurality of convolution modules, a plurality of adder modules, an accumulation memory, and a convolution output interconnect control module configured to open and close convolution output interconnects among a plurality of convolution output interconnects connecting the plurality of convolution modules, the plurality of adder modules, and the accumulation memory. Inference may be performed while the device is configured according to at least one convolution output connection scheme whereby each convolution module has no more than one open direct connection through the plurality of convolution output interconnects to the accumulation memory or one of the plurality of adder modules. The device includes a convolution output interconnect control module to configure the plurality of convolution output interconnects according to the at least one convolution output connection scheme.
Type: Grant
Filed: January 4, 2021
Date of Patent: October 12, 2021
Assignee: EDGECORTIX PTE. LTD.
Inventors: Nikolay Nez, Hamid Reza Zohouri, Oleg Khavin, Antonio Tomas Nevado Vilchez, Sakyasingha Dasgupta