Patents by Inventor Asher Hazanchuk

Asher Hazanchuk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

NEURAL NETWORKS PROCESSING UNITS PERFORMANCE OPTIMIZATION

Publication number: 20250181905

Abstract: In an example embodiment, a scalable deep neural networks (DNN) accelerator (sDNA) is provided that includes multiple neural networks processing units (NPUs) that are interconnected to provide a flexible DNN that is programmable and scalable. Each NPU includes one or more pruned weights memories and one or more compressed activation memories. Each NPU may include multiple A_LUT memories that are used as multipliers accelerator and a W_LUT memory that are used all together to reduce a number of DNN multiplications. The sDNA may include one or more W map memories and A map memories that provide a sDNA algorithm inputs data that enable skipping of zero weights and zero activations. The sDNA architecture can be generalized into sDNA parallel mode to achieve higher memory bandwidth and throughput. In an embodiment, the sDNA architecture is power-efficient, silicon size-efficient and cost-efficient.

Type: Application

Filed: February 7, 2025

Publication date: June 5, 2025

Inventor: Asher Hazanchuk

NEURAL NETWORKS PROCESSING UNITS FOLDING

Publication number: 20230394278

Abstract: In an example, a method is disclosed of folding each group of neighbor pixels (memory bins) of activations into a same pixel memory bin or a group of 3*3 neighboring pixels memory bins that are all accessible from a middle point processing unit to localize and standardize different convolution operations that are required or other operations such as max pooling or average pooling. The method includes folding together neighboring pixel activations. The method includes storing all the folded activations at the same pixel memory bin so that a local processing unit is able to access all required activations by accessing local memory or 3*3 neighboring pixel memory bins only.

Type: Application

Filed: August 22, 2023

Publication date: December 7, 2023

Inventors: Asher Hazanchuk, Yaron Raz
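
One way to picture the folding described in this abstract is a rearrangement that gathers each group of neighboring pixels into a single memory bin, so that an operation such as max pooling only reads one bin. The sketch below is an illustration under stated assumptions, not the method claimed in the application: the 2x2 group size and the function names `fold_2x2` and `max_pool_folded` are hypothetical.

```python
def fold_2x2(activations):
    """Fold each 2x2 group of neighboring pixels into one bin (a list of 4 values).

    Assumption for this sketch: a 2D grid with even height and width.
    """
    rows, cols = len(activations), len(activations[0])
    assert rows % 2 == 0 and cols % 2 == 0, "sketch assumes even dimensions"
    return [
        [
            [
                activations[r][c], activations[r][c + 1],
                activations[r + 1][c], activations[r + 1][c + 1],
            ]
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def max_pool_folded(folded):
    """2x2 max pooling becomes a purely local operation on each folded bin."""
    return [[max(bin_) for bin_ in row] for row in folded]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
folded = fold_2x2(image)
pooled = max_pool_folded(folded)
```

After folding, each processing step touches only its own bin, which is the locality property the abstract emphasizes.
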
NEURAL NETWORKS PROCESSING UNITS PAIRING, SYMMETRY, AND STOP-ON-MINUS

Publication number: 20230368001

Abstract: In an example, a method of pairing and adding together pairs of activations that need to be multiplied by the same weight includes identifying pairs of activations to be multiplied by a corresponding common weight. The method includes adding together activations in each pair of activations to be multiplied by the corresponding common weight to generate a corresponding summed activation. The method includes multiplying the corresponding summed activation by the corresponding common weight.

Type: Application

Filed: July 26, 2023

Publication date: November 16, 2023

Inventors: Asher Hazanchuk, Yaron Raz
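
The pairing idea in this abstract relies on the identity a1*w + a2*w = (a1 + a2)*w. A minimal sketch follows, assuming a symmetric weight vector so that the first and last activations share a weight, the second and second-to-last share a weight, and so on. The symmetry assumption and the name `paired_dot` are illustrative choices, not details taken from the application.

```python
def paired_dot(activations, weights):
    """Dot product that adds paired activations before multiplying.

    Assumption for this sketch: weights[i] == weights[-1 - i] (symmetric).
    Returns (result, number_of_multiplications).
    """
    n = len(weights)
    assert len(activations) == n
    assert all(weights[i] == weights[n - 1 - i] for i in range(n // 2)), \
        "sketch assumes symmetric weights"

    total = 0
    multiplications = 0
    for i in range(n // 2):
        summed = activations[i] + activations[n - 1 - i]  # pair shares one weight
        total += summed * weights[i]
        multiplications += 1
    if n % 2:  # unpaired middle element
        total += activations[n // 2] * weights[n // 2]
        multiplications += 1
    return total, multiplications


acts = [1, 2, 3, 4, 5]
wts = [2, 3, 7, 3, 2]
result, mults = paired_dot(acts, wts)
```

For these five inputs the paired form needs three multiplications where a direct dot product needs five.
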
NEURAL NETWORKS PROCESSING UNITS ACTIVATION SPARSITY REMOVAL

Publication number: 20230325648

Abstract: In an example, a method of activation sparsity removal includes implementing a non-zero Activation jump algorithm. Alternatively, the method includes using multiple first in first out (FIFO) memories to store non-zero activations for each vector multiplication.

Type: Application

Filed: June 2, 2023

Publication date: October 12, 2023

Inventors: Asher Hazanchuk, Yaron Raz
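
The FIFO alternative mentioned in this abstract can be sketched as follows: only non-zero activations are queued, so the multiply-accumulate loop never spends a step on a zero. This is a software illustration only. The names `compress_activations` and `sparse_dot` are hypothetical, storing the index alongside each value is an assumption, and the non-zero activation jump algorithm is not shown.

```python
from collections import deque


def compress_activations(activations):
    """Queue (index, value) for each non-zero activation, in original order."""
    return deque((i, a) for i, a in enumerate(activations) if a != 0)


def sparse_dot(activations, weights):
    """Vector multiplication that only processes queued non-zero activations.

    Returns (result, number_of_multiply_accumulate_steps).
    """
    fifo = compress_activations(activations)
    total = 0
    macs = 0
    while fifo:
        index, value = fifo.popleft()
        total += value * weights[index]
        macs += 1
    return total, macs


acts = [0, 3, 0, 0, 2, 0]
wts = [5, 1, 4, 2, 6, 9]
result, macs = sparse_dot(acts, wts)
```
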
NEURAL NETWORKS PROCESSING UNITS REDUNDANCY REMOVAL

Publication number: 20230316059

Abstract: In an example, a method of removing redundancy of multiplications of a same weight with different activations before adding all multiplication results together includes adding all activations that need to be multiplied with a common weight to generate a sum of activations. The method includes multiplying the sum of activations with the common weight.

Type: Application

Filed: June 2, 2023

Publication date: October 5, 2023

Inventors: Asher Hazanchuk, Yaron Raz
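
The redundancy removal described in this abstract can be illustrated by grouping activations by the weight they share, summing each group, and multiplying once per distinct weight. The sketch below assumes integer (for example, quantized) weights so that equal weights can be used as dictionary keys; the name `grouped_dot` is hypothetical.

```python
from collections import defaultdict


def grouped_dot(activations, weights):
    """Dot product with one multiplication per distinct weight value.

    Returns (result, number_of_multiplications).
    """
    sums = defaultdict(int)
    for a, w in zip(activations, weights):
        sums[w] += a  # add all activations that share this weight
    total = sum(w * s for w, s in sums.items())
    return total, len(sums)


acts = [1, 2, 3, 4, 5, 6]
wts = [2, -1, 2, 3, -1, 2]
result, mults = grouped_dot(acts, wts)
```

Here six multiplications collapse to three because the weights take only three distinct values.
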
NEURAL NETWORKS PROCESSING UNITS PERFORMANCE OPTIMIZATION PARALLEL MODE

Publication number: 20230306235

Abstract: In an example, a scalable deep neural networks (DNN) accelerator (sDNA) includes multiple address generators, an activation memory matrix (AMM), and multiple network processing units (NPUs). The AMM is coupled to outputs of the address generators. The NPUs are coupled to outputs of the AMM. Each NPU includes one of: an activation sparsity removal (ASR) block coupled to the AMM; a redundancy removal (RR) block coupled to the AMM; or both an ASR block coupled to the AMM and an RR block coupled to an output of the ASR block. Each NPU additionally includes a multiply accumulator (MAC) block coupled to the output of the ASR block or an output of the RR block and a non-linear unit coupled to an output of the MAC block.

Type: Application

Filed: June 2, 2023

Publication date: September 28, 2023

Inventors: Asher Hazanchuk, Yaron Raz

NEURAL NETWORKS PROCESSING UNITS WEIGHT SPARSITY REMOVAL

Publication number: 20230306243

Abstract: In an example, a method of reducing scalable deep neural networks (DNN) accelerator (sDNA) power consumption and silicon area includes generating a list of addresses in activation memory matrixes (AMM). Each address in the list of addresses points to an activations row that needs to be multiplied by a given non-zero weight for different vector multiplication calculations. The method includes storing in the AMM rows of activations, each row of activations including corresponding activations to be multiplied with a same non-zero weight. The method includes implementing vector multiplication on the rows of activations and non-zero weights, including removing weight sparsity from the AMM.

Type: Application

Filed: June 2, 2023

Publication date: September 28, 2023

Inventors: Asher Hazanchuk, Yaron Raz
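
One possible reading of this abstract, sketched for a 1D convolution: an address list is built for the non-zero weights only, each address selects a row of activations that all share that weight, and each element of the row contributes to a different output. The mapping of "address" to a slice offset and the name `sparse_weight_conv1d` are assumptions made for illustration.

```python
def sparse_weight_conv1d(x, w):
    """Compute out[j] = sum_k w[k] * x[j + k], visiting only non-zero weights.

    Returns (outputs, number_of_weight_rows_processed).
    """
    n_out = len(x) - len(w) + 1
    # Address list: offsets of the non-zero weights only (assumed layout).
    addresses = [k for k, wk in enumerate(w) if wk != 0]

    out = [0] * n_out
    for k in addresses:
        row = x[k:k + n_out]  # activations that all share weight w[k]
        for j, a in enumerate(row):
            out[j] += w[k] * a  # each element feeds a different output
    return out, len(addresses)


x = [1, 2, 3, 4, 5]
w = [2, 0, -1]
out, rows_used = sparse_weight_conv1d(x, w)
```
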
NEURAL NETWORKS PROCESSING UNITS PERFORMANCE OPTIMIZATION

Publication number: 20220188611

Abstract: In an example embodiment, a scalable deep neural networks (DNN) accelerator (sDNA) is provided that includes multiple neural networks processing units (NPUs) that are interconnected to provide a flexible DNN that is programmable and scalable. Each NPU includes one or more pruned weights memories and one or more compressed activation memories. Each NPU may include multiple A_LUT memories that are used as multipliers accelerator and a W_LUT memory that are used all together to reduce a number of DNN multiplications. The sDNA may include one or more W map memories and A map memories that provide a sDNA algorithm inputs data that enable skipping of zero weights and zero activations. The sDNA architecture can be generalized into sDNA parallel mode to achieve higher memory bandwidth and throughput. In an embodiment, the sDNA architecture is power-efficient, silicon size-efficient and cost-efficient.

Type: Application

Filed: December 3, 2021

Publication date: June 16, 2022

Inventor: Asher Hazanchuk

Flexible hardware programmable scalable parallel processor

Patent number: 9535705

Abstract: In a typical embodiment, a parallel processor is provided that includes: a plurality of parallel processing units that are interconnected to provide a flexible hardware programmable, scalable and re-configurable parallel processor that executes different functions in a parallel processor space domain instead of a processor (serial processor) time domain. Each parallel processing unit includes a flexible processing engine with its inputs and outputs connected to MDDP-RAM blocks. The MDDP-RAM blocks provide the processing engine with different channels' data and coefficients. The processing engine and the MDDP-RAM blocks are controlled by a system processor (or other control scheme hardware) via the parameter blocks to enable high hardware flexibility and software programmability.

Type: Grant

Filed: August 13, 2014

Date of Patent: January 3, 2017

Inventor: Asher Hazanchuk

Programmable logic device data rate booster for digital signal processing

Patent number: 8977885

Abstract: A programmable logic device is provided that includes: a programmable interconnect adapted to route input signals through the device at a system clock rate; and a digital signal processor (DSP) block coupled to the interconnect, the DSP block including: a plurality of input ports; an input register coupled to the multiple input ports and adapted to sequentially register samples of the input signals from the interconnect received at the input ports at a multiple of the system clock rate; and a multiplier adapted to multiply the registered samples at the multiple of the system clock rate to produce an output signal.

Type: Grant

Filed: March 5, 2012

Date of Patent: March 10, 2015

Assignee: Lattice Semiconductor Corporation

Inventor: Asher Hazanchuk

Digital signal processing block architecture for programmable logic device

Patent number: 8463832

Abstract: Various implementations of a digital signal processing (DSP) block architecture of a programmable logic device (PLD) and related methods are provided. In one example, a PLD includes a dedicated DSP block. The DSP block includes a first multiplier adapted to multiply a first plurality of input signals to provide a first plurality of product signals. The DSP block also includes a second multiplier adapted to multiply a second plurality of input signals to provide a second plurality of product signals. The DSP block further includes an arithmetic logic unit (ALU) adapted to operate on the first product signals and the second product signals received at first and second operand inputs, respectively, of the ALU to provide a plurality of output signals.

Type: Grant

Filed: June 25, 2008

Date of Patent: June 11, 2013

Assignee: Lattice Semiconductor Corporation

Inventors: Asher Hazanchuk, Ian Ing, Satwant Singh

Method and apparatus for implementing a multiplier utilizing digital signal processor block memory extension

Patent number: 7987222

Abstract: A method for performing multiplication on a field programmable gate array includes generating a product by multiplying a first plurality of bits from a first number and a first plurality of bits from a second number. A stored value designated as a product of a second plurality of bits from the first number and a second plurality of bits from the second number is retrieved. The product is scaled with respect to a position of the first plurality of bits from the first number and a position of the first plurality of bits from the second number. The stored value is scaled with respect to a position of the second plurality of bits from the first number and a position of the second plurality of bits from the second number. The scaled product and the scaled stored value are summed.

Type: Grant

Filed: April 22, 2004

Date of Patent: July 26, 2011

Assignee: Altera Corporation

Inventors: Asher Hazanchuk, Benjamin Esposito
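
The arithmetic behind this abstract is the partial-product decomposition of a multiplication: each operand is split into bit segments, segment products are scaled by the positions of their segments, and the scaled values are summed. The sketch below is a software illustration under assumptions: 8-bit unsigned operands split into 4-bit segments, a Python table standing in for block memory, and cross terms (which the abstract does not spell out) also read from the table. The names `PRODUCT_TABLE` and `split_multiply` are hypothetical.

```python
K = 4                    # bits per segment (assumed)
MASK = (1 << K) - 1

# Stand-in for block memory: precomputed products of all pairs of K-bit values.
PRODUCT_TABLE = [[x * y for y in range(1 << K)] for x in range(1 << K)]


def split_multiply(a, b):
    """Multiply two 2K-bit unsigned integers from K-bit partial products."""
    assert 0 <= a < (1 << (2 * K)) and 0 <= b < (1 << (2 * K))
    a_hi, a_lo = a >> K, a & MASK
    b_hi, b_lo = b >> K, b & MASK

    # Generated product (stands in for a hard multiplier), scaled by the
    # positions of the two high segments.
    high = (a_hi * b_hi) << (2 * K)
    # Stored value retrieved from the table; the low segments sit at
    # position 0, so no shift is needed.
    low = PRODUCT_TABLE[a_lo][b_lo]
    # Cross terms, retrieved from the same table in this sketch (assumption).
    cross = (PRODUCT_TABLE[a_hi][b_lo] + PRODUCT_TABLE[a_lo][b_hi]) << K

    return high + cross + low


product = split_multiply(0xAB, 0xCD)
```
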
Parallel samples, parallel coefficients, time division multiplexing correlator architecture

Patent number: 7688919

Abstract: A method for managing a code sequence is disclosed. First intermediate correlation values for a first plurality of sample sequences are determined during a first clock cycle. Second intermediate correlation values for the first plurality of sample sequences are determined during a second clock cycle. Correlation outputs for the first plurality of sample sequences are determined from the first and second intermediate correlation values.

Type: Grant

Filed: June 26, 2001

Date of Patent: March 30, 2010

Assignee: Altera Corporation

Inventor: Asher Hazanchuk
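
The two-cycle structure in this abstract can be sketched as a correlation whose work is split across two passes that reuse the same multipliers. The abstract does not say how the work is divided, so the split below (first half of the code in the first cycle, second half in the second) is an assumption, as are the sliding-window sample sequences and the name `tdm_correlate`.

```python
def tdm_correlate(samples, code):
    """Correlate every sliding window of `samples` against `code` in two passes."""
    half = len(code) // 2
    n_windows = len(samples) - len(code) + 1

    # First clock cycle: intermediate values from the first half of the code.
    first = [
        sum(samples[i + k] * code[k] for k in range(half))
        for i in range(n_windows)
    ]
    # Second clock cycle: intermediate values from the second half of the code.
    second = [
        sum(samples[i + k] * code[k] for k in range(half, len(code)))
        for i in range(n_windows)
    ]
    # Correlation outputs combine the two sets of intermediate values.
    return [f + s for f, s in zip(first, second)]


samples = [1, -1, 1, 1, -1, 1]
code = [1, -1, 1, 1]
outputs = tdm_correlate(samples, code)
```
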
Variable fixed multipliers using memory blocks

Patent number: 7356554

Abstract: A programmable logic device includes at least one RAM block generating a first multi-bit calculation result which may, but does not necessarily, involve a multiplication of two operands. A shift operation driven by a second multi-bit calculation result shifts the second multi-bit calculation result by at least one bit to generate a shifted second multi-bit calculation result. A multi-bit adder coupled to the at least one RAM block adds the shifted second multi-bit calculation result to the first multi-bit calculation result.

Type: Grant

Filed: June 27, 2005

Date of Patent: April 8, 2008

Assignee: Altera Corporation

Inventors: Asher Hazanchuk, Benjamin Esposito
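
A common way to realize the lookup, shift, and add structure in this abstract is a coefficient multiplier built from a small table: the input is split into nibbles, each nibble indexes a table of precomputed multiples, and the high-nibble result is shifted before the two results are added. The sketch below assumes an 8-bit unsigned input and 4-bit table addresses; the names `build_table` and `lut_multiply` are hypothetical, and a Python list stands in for the RAM block.

```python
def build_table(coefficient, bits=4):
    """Precompute coefficient * n for every `bits`-wide address n.

    Reloading this table with a different coefficient changes the multiplier,
    which is how a fixed multiplier can be made variable.
    """
    return [coefficient * n for n in range(1 << bits)]


def lut_multiply(x, table, bits=4):
    """Multiply an unsigned 2*bits-wide input by the table's coefficient."""
    mask = (1 << bits) - 1
    first = table[x & mask]             # first result: low nibble lookup
    second = table[(x >> bits) & mask]  # second result: high nibble lookup
    return first + (second << bits)     # shift the second result, then add


table = build_table(13)
product = lut_multiply(0xB7, table)
```
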
Method and systems to align outputs signals of an analog-to-digital converter

Patent number: 7348914

Abstract: Systems and methods are disclosed herein to provide improved alignment of output signals of an analog-to-digital converter (ADC). For example, in accordance with an embodiment of the present invention, a method of aligning digital signals appearing on signal paths of a parallel data bus includes sampling the digital signals at a plurality of delay times to obtain a plurality of sample sets, wherein each sample set is associated with a corresponding delay time. A second digital signal that is misaligned with respect to a first digital signal is identified from the sample sets. The delay time required to align the second digital signal with the first digital signal is determined. The delay of the second digital signal is adjusted by the determined delay time.

Type: Grant

Filed: June 29, 2006

Date of Patent: March 25, 2008

Assignee: Lattice Semiconductor Corporation

Inventors: Asher Hazanchuk, Ian Ing, Satwant Singh

Apparatus and method for implementing efficient arithmetic circuits in programmable logic devices

Publication number: 20060277240

Abstract: Efficient implementation of arithmetic circuits in programmable logic devices by using Look-Up Tables (LUTs) to store pre-calculated values. A table look-up operation is performed in place of complex arithmetic operations. In this way, at the expense of a few LUTs, many logic elements can be saved. This approach is particularly applicable to circuits for calculating reciprocal values and circuits for performing normalized LMS algorithm.

Type: Application

Filed: August 16, 2006

Publication date: December 7, 2006

Inventors: Chang Choo, Asher Hazanchuk
Apparatus and method for implementing efficient arithmetic circuits in programmable logic devices

Patent number: 7124161

Abstract: Efficient implementation of arithmetic circuits in programmable logic devices by using Look-Up Tables (LUTs) to store pre-calculated values. A table look-up operation is performed in place of complex arithmetic operations. In this way, at the expense of a few LUTs, many logic elements can be saved. This approach is particularly applicable to circuits for calculating reciprocal values and circuits for performing normalized LMS algorithm.

Type: Grant

Filed: October 31, 2005

Date of Patent: October 17, 2006

Assignee: Altera Corporation

Inventors: Chang Choo, Asher Hazanchuk

Apparatus and method for implementing efficient arithmetic circuits in programmable logic devices

Patent number: 7058675

Abstract: Efficient implementation of arithmetic circuits in programmable logic devices by using Look-Up Tables (LUTs) to store pre-calculated values. A table look-up operation is performed in place of complex arithmetic operations. In this way, at the expense of a few LUTs, many logic elements can be saved. This approach is particularly applicable to circuits for calculating reciprocal values and circuits for performing normalized LMS algorithm.

Type: Grant

Filed: May 22, 2001

Date of Patent: June 6, 2006

Assignee: Altera Corporation

Inventors: Chang Choo, Asher Hazanchuk

Apparatus and method for implementing efficient arithmetic circuits in programmable logic devices

Publication number: 20060053192

Abstract: Efficient implementation of arithmetic circuits in programmable logic devices by using Look-Up Tables (LUTs) to store pre-calculated values. A table look-up operation is performed in place of complex arithmetic operations. In this way, at the expense of a few LUTs, many logic elements can be saved. This approach is particularly applicable to circuits for calculating reciprocal values and circuits for performing normalized LMS algorithm.

Type: Application

Filed: October 31, 2005

Publication date: March 9, 2006

Inventors: Chang Choo, Asher Hazanchuk

Variable fixed multipliers using memory blocks

Patent number: 6943579

Abstract: A programmable logic device includes at least one RAM block generating a first multi-bit calculation result which may, but does not necessarily, involve a multiplication of two operands. A shift operation driven by a second multi-bit calculation result shifts the second multi-bit calculation result by at least one bit to generate a shifted second multi-bit calculation result. A multi-bit adder coupled to the at least one RAM block adds the shifted second multi-bit calculation result to the first multi-bit calculation result.

Type: Grant

Filed: September 22, 2003

Date of Patent: September 13, 2005

Assignee: Altera Corporation

Inventors: Asher Hazanchuk, Benjamin Esposito
