Patents by Inventor Ali Shafiee-Ardestani

Ali Shafiee-Ardestani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Neural processor

Patent number: 11954574

Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.

Type: Grant

Filed: June 19, 2019

Date of Patent: April 9, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
EXTREME SPARSE DEEP LEARNING EDGE INFERENCE ACCELERATOR

Publication number: 20240095519

Abstract: A neural network inference accelerator includes first and second neural processing units (NPUs) and a sparsity management unit. The first NPU receives activation and weight tensors based on an activation sparsity density and a weight sparsity density both being greater than a predetermined sparsity density. The second NPU receives activation and weight tensors based on at least one of the activation sparsity density and the weight sparsity density being less than or equal to the predetermined sparsity density. The sparsity management unit controls transfer of the activation tensor and the weight tensor based on the activation sparsity density and the weight sparsity density with respect to the predetermined sparsity density.

Type: Application

Filed: November 17, 2022

Publication date: March 21, 2024

Inventors: Ardavan PEDRAM, Ali SHAFIEE ARDESTANI, Jong Hoon SHIN, Joseph H. HASSOUN
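
The routing rule in the abstract above can be illustrated with a small sketch. It assumes that "sparsity density" means the fraction of zero entries in a flattened tensor; the names `sparsity_density`, `select_npu`, `"first_npu"`, and `"second_npu"` are hypothetical and do not come from the application.

```python
from typing import Sequence


def sparsity_density(values: Sequence[float]) -> float:
    """Fraction of zero entries in a flattened tensor (assumed definition)."""
    if not values:
        return 0.0
    return sum(1 for v in values if v == 0) / len(values)


def select_npu(activations: Sequence[float],
               weights: Sequence[float],
               threshold: float) -> str:
    """Pick a target NPU from the two densities and a predetermined threshold."""
    act_density = sparsity_density(activations)
    weight_density = sparsity_density(weights)
    # First NPU: both densities are strictly greater than the threshold.
    if act_density > threshold and weight_density > threshold:
        return "first_npu"
    # Second NPU: at least one density is less than or equal to the threshold.
    return "second_npu"
```

For example, `select_npu([0, 0, 0, 1], [0, 0, 0, 2], 0.5)` selects the first NPU because both tensors are 75% zeros.
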
Mixed-precision NPU tile with depth-wise convolution

Patent number: 11880760

Abstract: A processor to perform inference on deep learning neural network models. In some embodiments, the processor includes: a first tile, a second tile, a memory, and a bus, the bus being connected to: the memory, the first tile, and the second tile, the first tile including: a first weight register, a second weight register, an activations cache, a shuffler, an activations buffer, a first multiplier, and a second multiplier, the activations buffer being configured to include: a first queue connected to the first multiplier, and a second queue connected to the second multiplier, the activations cache including a plurality of independent lanes, each of the independent lanes being randomly accessible, the first tile being configured: to receive a tensor including a plurality of two-dimensional arrays, each representing one color component of the image; and to perform a convolution of a kernel with one of the two-dimensional arrays.

Type: Grant

Filed: April 3, 2020

Date of Patent: January 23, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Hamzah Ahmed Ali Abdelaziz, Joseph H. Hassoun
Processor for fine-grain sparse integer and floating-point operations

Patent number: 11861328

Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products, and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a second multiplier, and a third multiplier, the first activation value by a first least significant sub-word, a second least significant sub-word, and a most significant sub-word; and adding a first resulting partial product and a second resulting partial product. The forming of the second set of products may include forming a first floating point product, the forming of the first floating point product including multiplying, in the first multiplier, a first sub-word of a mantissa of an activation value by a first sub-word of a mantissa of a weight, to form a third partial product.

Type: Grant

Filed: December 23, 2020

Date of Patent: January 2, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ali Shafiee Ardestani, Joseph H. Hassoun
Processor for fine-grain sparse integer and floating-point operations

Patent number: 11861327

Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a first activation value by a least significant sub-word and a most significant sub-word of a first weight to form a first partial product and a second partial product; and adding the first partial product and the second partial product. The forming of the second set of products may include: multiplying, in the first multiplier, a second activation value by a first sub-word and a second sub-word of a mantissa to form a third partial product and a fourth partial product; and adding the third partial product and the fourth partial product.

Type: Grant

Filed: December 22, 2020

Date of Patent: January 2, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ali Shafiee Ardestani, Joseph Hassoun
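
The integer path described in the abstract above (multiply an activation by the least and most significant sub-words of a weight, then add the partial products) can be sketched as follows. This is a simplified illustration that assumes unsigned operands and 4-bit sub-words; the function name and the default width are hypothetical, and the floating-point path is not covered.

```python
def multiply_by_subwords(activation: int, weight: int, subword_bits: int = 4) -> int:
    """Form activation * weight from two narrower partial products.

    Assumes non-negative integers and a weight that fits in two sub-words.
    """
    mask = (1 << subword_bits) - 1
    low_subword = weight & mask            # least significant sub-word
    high_subword = weight >> subword_bits  # most significant sub-word

    first_partial = activation * low_subword
    second_partial = activation * high_subword

    # The high partial product is weighted by 2**subword_bits before the add.
    return first_partial + (second_partial << subword_bits)
```

The result matches a full-width multiplication, so the same narrow multiplier can be reused for both partial products.
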
EFFICIENCY OF VISION TRANSFORMERS WITH ADAPTIVE TOKEN PRUNING

Publication number: 20230368494

Abstract: A system and a method are disclosed for training a vision transformer. A token distillation loss of an input image based on a teacher network classification token and a token importance score of a student network (the vision transformer during training) are determined at a pruning layer of the vision transformer. When a current epoch number is odd, sparsification of tokens of the input image is skipped and the dense input image is processed by layers that are subsequent to the pruning layer. When the current epoch number is even, tokens of the input image are pruned at the pruning layer and processed by layers that are subsequent to the pruning layer. A label loss and a total loss for the input image are determined by the subsequent layers and the student network is updated.

Type: Application

Filed: November 1, 2022

Publication date: November 16, 2023

Inventors: Ling LI, Ali SHAFIEE ARDESTANI
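
The alternating schedule in the abstract above (dense on odd epochs, pruned on even epochs) can be sketched as below. Keeping the `keep` highest-scoring tokens is an assumption made for illustration; the abstract does not state how tokens are selected, and the function name and parameters are hypothetical. Loss computation and the teacher network are omitted.

```python
from typing import List, Sequence


def tokens_after_pruning_layer(tokens: Sequence[str],
                               importance: Sequence[float],
                               epoch: int,
                               keep: int) -> List[str]:
    """Return the tokens passed on to the layers after the pruning layer."""
    if epoch % 2 == 1:
        # Odd epoch: sparsification is skipped, all tokens continue.
        return list(tokens)
    # Even epoch: keep the `keep` most important tokens (assumed rule),
    # preserving their original order.
    ranked = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)
    kept = sorted(ranked[:keep])
    return [tokens[i] for i in kept]
```
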
NEURAL PROCESSOR

Publication number: 20230351151

Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.

Type: Application

Filed: July 10, 2023

Publication date: November 2, 2023

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
Neural processor

Patent number: 11783161

Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.

Type: Grant

Filed: June 19, 2019

Date of Patent: October 10, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
Neural processor

Patent number: 11783162

Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.

Type: Grant

Filed: August 27, 2019

Date of Patent: October 10, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
Neural processor

Patent number: 11775801

Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.

Type: Grant

Filed: August 27, 2019

Date of Patent: October 3, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
Neural processor

Patent number: 11775802

Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.

Type: Grant

Filed: August 27, 2019

Date of Patent: October 3, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
Signed multiplication using unsigned multiplier with dynamic fine-grained operand isolation

Patent number: 11775256

Abstract: An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If the operands are both less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), the first multiplier is used or the second and third multipliers are used to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second and third multipliers are used to multiply the operands.

Type: Grant

Filed: January 12, 2023

Date of Patent: October 3, 2023

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang
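
The selection of sub-multipliers described in the abstract above can be sketched in software. This is an illustrative model for unsigned operands only, with the small-operand threshold taken as 2^(N/2); the function name and the returned set of labels are hypothetical, and sign handling is left out.

```python
from typing import Set, Tuple


def isolated_multiply(a: int, b: int, n: int = 8) -> Tuple[int, Set[str]]:
    """Multiply two unsigned n-bit operands, reporting which sub-multipliers run."""
    half = n // 2
    limit = 1 << half
    mask = limit - 1

    if a == 0 or b == 0:
        return 0, set()                      # every sub-multiplier stays disabled
    if a < limit and b < limit:
        return a * b, {"second"}             # one N/2 x N/2 multiplier suffices
    if a < limit or b < limit:
        small, large = (a, b) if a < limit else (b, a)
        return small * large, {"first"}      # the N/2 x N multiplier suffices

    # Both operands are large: a = a_hi * 2**half + a_lo, likewise for b.
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    p_first = a_hi * b        # N/2 x N
    p_second = a_lo * b_lo    # N/2 x N/2
    p_third = a_lo * b_hi     # N/2 x N/2
    product = (p_first << half) + (p_third << half) + p_second
    return product, {"first", "second", "third"}
```

In the mixed case the abstract also allows the second and third multipliers to be used instead of the first; the sketch picks the first for brevity.
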
Piecewise quantization for neural networks

Patent number: 11775611

Abstract: In some embodiments, a method of quantizing an artificial neural network includes dividing a quantization range for a tensor of the artificial neural network into a first region and a second region, and quantizing values of the tensor in the first region separately from values of the tensor in the second region. In some embodiments, linear or nonlinear quantization is applied to values of the tensor in the first region and the second region. In some embodiments, the method includes locating a breakpoint between the first region and the second region by substantially minimizing an expected quantization error over at least a portion of the quantization range. In some embodiments, the expected quantization error is minimized by solving analytically and/or searching numerically.

Type: Grant

Filed: March 11, 2020

Date of Patent: October 3, 2023

Inventors: Jun Fang, Joseph H. Hassoun, Ali Shafiee Ardestani, Hamzah Ahmed Ali Abdelaziz, Georgios Georgiadis, Hui Chen, David Philip Lloyd Thorsley
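
A minimal sketch of the idea in the abstract above: split a range at a breakpoint, quantize each region separately, and search numerically for the breakpoint with the lowest error. It assumes non-negative values in [0, max_value], uniform (linear) quantization with the same number of levels in both regions, and mean squared error on sample values as the error measure; all function names are hypothetical.

```python
from typing import Sequence, Tuple


def quantize_uniform(x: float, lo: float, hi: float, levels: int) -> float:
    """Snap x to the nearest of `levels` evenly spaced points in [lo, hi]."""
    if hi <= lo:
        return lo
    step = (hi - lo) / (levels - 1)
    index = min(max(round((x - lo) / step), 0), levels - 1)
    return lo + index * step


def piecewise_quantize(x: float, breakpoint: float, max_value: float, levels: int) -> float:
    """Quantize x in the first region [0, breakpoint] or the second (breakpoint, max_value]."""
    if x <= breakpoint:
        return quantize_uniform(x, 0.0, breakpoint, levels)
    return quantize_uniform(x, breakpoint, max_value, levels)


def mean_squared_error(values: Sequence[float], breakpoint: float,
                       max_value: float, levels: int) -> float:
    return sum((v - piecewise_quantize(v, breakpoint, max_value, levels)) ** 2
               for v in values) / len(values)


def search_breakpoint(values: Sequence[float], max_value: float,
                      levels: int, candidates: int = 64) -> Tuple[float, float]:
    """Grid search over candidate breakpoints; returns (breakpoint, error)."""
    best = None
    for k in range(1, candidates):
        p = max_value * k / candidates
        err = mean_squared_error(values, p, max_value, levels)
        if best is None or err < best[1]:
            best = (p, err)
    return best
```

A grid search is only one way to "search numerically"; an analytic solution would need an assumed distribution for the tensor values.
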
SYSTEM AND METHOD FOR INCREASING UTILIZATION OF DOT-PRODUCT BASED NEURAL NETWORK ACCELERATOR

Publication number: 20230289584

Abstract: A method of flattening channel data of an input feature map in an inference system includes retrieving pixel values of a channel of a plurality of channels of the input feature map from a memory and storing the pixel values in a buffer, extracting first values of a first region having a first size from among the pixel values stored in the buffer, the first region corresponding to an overlap region of a kernel of the inference system with channel data of the input feature map, rearranging second values corresponding to the overlap region of the kernel from among the first values in the first region, and identifying a first group of consecutive values from among the rearranged second values for supplying to a first dot-product circuit of the inference system.

Type: Application

Filed: May 18, 2023

Publication date: September 14, 2023

Inventors: Ali Shafiee Ardestani, Joseph Hassoun
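
The flattening step in the abstract above can be illustrated with a short sketch: take the values of one channel that a kernel overlaps, lay them out as a flat run, and cut that run into groups of consecutive values sized for a dot-product unit. Row-major ordering, zero padding of the last group, and both function names are assumptions made for illustration.

```python
from typing import List, Sequence


def flatten_overlap(channel: Sequence[Sequence[int]], row: int, col: int,
                    kernel_h: int, kernel_w: int) -> List[int]:
    """Values of `channel` under a kernel whose top-left corner is at (row, col)."""
    return [channel[r][c]
            for r in range(row, row + kernel_h)
            for c in range(col, col + kernel_w)]


def group_for_dot_product(flat: Sequence[int], width: int, pad: int = 0) -> List[List[int]]:
    """Split the flattened values into consecutive groups of `width` values."""
    groups = []
    for start in range(0, len(flat), width):
        group = list(flat[start:start + width])
        group += [pad] * (width - len(group))  # pad the final group (assumption)
        groups.append(group)
    return groups
```

Packing a 3×3 overlap region into groups of four fills two dot-product inputs completely and leaves only the third partially used, which is the kind of utilization gain the title refers to.
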
EFFICIENT CIRCUIT FOR NEURAL NETWORK PROCESSING

Publication number: 20230205488

Abstract: A system and method for efficient processing for neural network inference operations. In some embodiments, the system includes: a circuit configured to multiply a first number by a second number, the first number being represented as: a sign bit, five exponent bits, and seven mantissa bits, representing an eight-bit full mantissa.

Type: Application

Filed: January 6, 2022

Publication date: June 29, 2023

Inventors: Ling LI, Ali SHAFIEE ARDESTANI, Hamzah ABDELAZIZ, Joseph H. HASSOUN
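
The number format in the abstract above (one sign bit, five exponent bits, seven stored mantissa bits) can be modeled as a 13-bit word. The sketch below assumes a layout of sign, then exponent, then mantissa from the most significant bit down, an implicit leading one that yields the eight-bit full mantissa, an exponent bias of 15, and no special values; none of these details are stated in the abstract, and the function names are hypothetical.

```python
from typing import Tuple


def unpack(bits: int) -> Tuple[int, int, int]:
    """Split a 13-bit word into (sign, exponent, eight-bit full mantissa)."""
    sign = (bits >> 12) & 0x1
    exponent = (bits >> 7) & 0x1F
    full_mantissa = 0x80 | (bits & 0x7F)  # implicit leading one (assumption)
    return sign, exponent, full_mantissa


def to_float(bits: int, bias: int = 15) -> float:
    """Decode the word to a Python float under the assumed bias."""
    sign, exponent, mantissa = unpack(bits)
    value = (mantissa / 128.0) * 2.0 ** (exponent - bias)
    return -value if sign else value


def multiply_fields(x: int, y: int) -> Tuple[int, int, int]:
    """Multiply field by field: XOR signs, add exponents, multiply mantissas."""
    sx, ex, mx = unpack(x)
    sy, ey, my = unpack(y)
    return sx ^ sy, ex + ey, mx * my  # 8-bit x 8-bit gives a 16-bit product
```

Under these assumptions the product's value is the 16-bit mantissa product divided by 2^14, scaled by 2 raised to the exponent sum minus twice the bias.
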
System and method for increasing utilization of dot-product based neural network accelerator

Patent number: 11687764

Abstract: A method of flattening channel data of an input feature map in an inference system includes retrieving pixel values of a channel of a plurality of channels of the input feature map from a memory and storing the pixel values in a buffer, extracting first values of a first region having a first size from among the pixel values stored in the buffer, the first region corresponding to an overlap region of a kernel of the inference system with channel data of the input feature map, rearranging second values corresponding to the overlap region of the kernel from among the first values in the first region, and identifying a first group of consecutive values from among the rearranged second values for supplying to a first dot-product circuit of the inference system.

Type: Grant

Filed: June 12, 2020

Date of Patent: June 27, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ali Shafiee Ardestani, Joseph Hassoun
System and method for performing computations for deep neural networks

Patent number: 11681907

Abstract: A computation unit for performing a computation of a neural network layer is disclosed. A number of processing element (PE) units are arranged in an array. First input values are provided in parallel in an input dimension of the array during a first processing period, and second input values are provided in parallel in the input dimension during a second processing period. Computations are performed by the PE units based on stored weight values. An adder coupled to the first set of PE units generates a first sum of results of the computations by the first set of PE units during the first processing cycle, and generates a second sum of results of the computations during the second processing cycle. A first accumulator coupled to the first adder stores the first sum, and further shifts the first sum to a second accumulator prior to storing the second sum.

Type: Grant

Filed: October 14, 2022

Date of Patent: June 20, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Hamzah Abdelaziz, Joseph Hassoun, Ali Shafiee Ardestani
Hardware channel-parallel data compression/decompression

Patent number: 11671111

Abstract: A multichannel data packer includes a plurality of two-input multiplexers and a controller. The plurality of two-input multiplexers is arranged in 2^N rows and N columns in which N is an integer greater than 1. Each input of a multiplexer in a first column receives a respective bit stream of 2^N channels of bit streams. Each respective bit stream includes a bit-stream length based on data in the bit stream. The multiplexers in a last column output 2^N channels of packed bit streams each having a same bit-stream length. The controller controls the plurality of multiplexers so that the multiplexers in the last column output the 2^N channels of bit streams that each has the same bit-stream length.

Type: Grant

Filed: April 7, 2020

Date of Patent: June 6, 2023

Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Lei Wang, Joseph H. Hassoun
SIGNED MULTIPLICATION USING UNSIGNED MULTIPLIER WITH DYNAMIC FINE-GRAINED OPERAND ISOLATION

Publication number: 20230153065

Abstract: An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If the operands are both less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), the first multiplier is used or the second and third multipliers are used to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second and third multipliers are used to multiply the operands.

Type: Application

Filed: January 12, 2023

Publication date: May 18, 2023

Inventors: Ilia OVSIANNIKOV, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN, Lei WANG
ACCELERATE NEURAL NETWORKS WITH COMPRESSION AT DIFFERENT LEVELS

Publication number: 20230153586

Abstract: A neural network accelerator includes 2^n multiplier circuits, 2^n shifter circuits and an adder tree circuit. Each respective multiplier circuit multiplies a first value by a second value to output a first product value. Each respective first value is represented by a first predetermined number of bits beginning at a most significant bit of the first value having a value equal to 1. Each respective second value is represented by a second predetermined number of bits, and each respective first product value is represented by a third predetermined number of bits. Each respective shifter circuit receives the first product value of a corresponding multiplier circuit and left shifts the corresponding product value by the first predetermined number of bits to form a respective second product value. The adder circuit adds each respective second product value to form a partial-sum value represented by a fourth predetermined number of bits.

Type: Application

Filed: January 18, 2022

Publication date: May 18, 2023

Inventors: Ling LI, Ali SHAFIEE ARDESTANI
