Patents by Inventor Joseph H. Hassoun

Joseph H. Hassoun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11954574
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: April 9, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
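
The two-state read in patent 11954574 lets a multiplier pull an activation from one register deeper in its queue. A minimal sketch follows, assuming (the abstract does not say this) that the second state is entered to skip a zero activation sitting in the output register; `tile_multiply` and the deque stand-in for the hardware queue are illustrative, not the patented design.

```python
from collections import deque

def tile_multiply(weight, activation_queue):
    """Sketch of the two-state read: in the first state the multiplier
    consumes the activation in the queue's output register; in the second
    state (assumed here: when the output register holds a skippable zero)
    it consumes the activation in the adjacent second register instead."""
    head = activation_queue[0]                  # output register
    if head != 0:                               # first state
        product = weight * head
        activation_queue.popleft()
    else:                                       # second state
        product = weight * activation_queue[1]  # adjacent second register
        activation_queue.popleft()              # discard the skipped zero
        activation_queue.popleft()
    return product

queue = deque([0, 3, 5])                        # activations, head first
print(tile_multiply(2, queue))                  # -> 6 (zero head skipped)
```
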
  • Publication number: 20240095519
    Abstract: A neural network inference accelerator includes first and second neural processing units (NPUs) and a sparsity management unit. The first NPU receives activation and weight tensors based on an activation sparsity density and a weight sparsity density both being greater than a predetermined sparsity density. The second NPU receives activation and weight tensors based on at least one of the activation sparsity density and the weight sparsity density being less than or equal to the predetermined sparsity density. The sparsity management unit controls transfer of the activation tensor and the weight tensor based on the activation sparsity density and the weight sparsity density with respect to the predetermined sparsity density.
    Type: Application
    Filed: November 17, 2022
    Publication date: March 21, 2024
    Inventors: Ardavan PEDRAM, Ali SHAFIEE ARDESTANI, Jong Hoon SHIN, Joseph H. HASSOUN
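
A minimal sketch of the routing decision in this application, assuming density is measured as the fraction of non-zero elements and using an arbitrary 0.5 placeholder for the predetermined sparsity density; the NPU labels and the `route` helper are illustrative.

```python
import numpy as np

def density(t):
    """Fraction of non-zero elements in a tensor."""
    return np.count_nonzero(t) / t.size

def route(activation, weight, threshold=0.5):
    """Send the tensor pair to the dense-path NPU only when BOTH densities
    exceed the predetermined threshold, otherwise to the NPU that exploits
    sparsity, mirroring the sparsity management unit described above."""
    if density(activation) > threshold and density(weight) > threshold:
        return "NPU1"   # dense path
    return "NPU2"       # sparse path

a = np.array([1.0, 0.0, 2.0, 0.0])
w = np.array([0.5, 1.5, 0.0, 2.5])
print(route(a, w))      # -> "NPU2": activation density is only 0.5
```
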
  • Publication number: 20240095518
    Abstract: A memory system and a method are disclosed for training a neural network model. A decompressor unit decompresses an activation tensor to a first predetermined sparsity density based on the activation tensor being compressed, and decompresses a weight tensor to a second predetermined sparsity density based on the weight tensor being compressed. A buffer unit receives the activation tensor at the first predetermined sparsity density and the weight tensor at the second predetermined sparsity density. A neural processing unit receives the activation tensor and the weight tensor from the buffer unit and computes a result for the activation tensor and the weight tensor based on the first predetermined sparsity density of the activation tensor and based on the second predetermined sparsity density of the weight tensor.
    Type: Application
    Filed: November 16, 2022
    Publication date: March 21, 2024
    Inventors: Ardavan PEDRAM, Jong Hoon SHIN, Joseph H. HASSOUN
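
A sketch of the decompressor stage in this application, assuming a (non-zero values, bitmask) compression format, which the abstract does not specify; the resulting sparsity density is fixed by the mask.

```python
import numpy as np

def decompress(values, mask):
    """Expand a (non-zero values, bitmask) pair back into a dense tensor
    whose sparsity density equals the fraction of ones in the mask."""
    dense = np.zeros(mask.shape, dtype=values.dtype)
    dense[mask.astype(bool)] = values
    return dense

vals = np.array([3.0, 1.0])
mask = np.array([0, 1, 0, 1])           # density 2/4 = 0.5
print(decompress(vals, mask))           # -> [0. 3. 0. 1.]
```
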
  • Patent number: 11880760
    Abstract: A processor to perform inference on deep learning neural network models. In some embodiments, the processor includes: a first tile, a second tile, a memory, and a bus, the bus being connected to: the memory, the first tile, and the second tile, the first tile including: a first weight register, a second weight register, an activations cache, a shuffler, an activations buffer, a first multiplier, and a second multiplier, the activations buffer being configured to include: a first queue connected to the first multiplier, and a second queue connected to the second multiplier, the activations cache including a plurality of independent lanes, each of the independent lanes being randomly accessible, the first tile being configured: to receive a tensor including a plurality of two-dimensional arrays, each representing one color component of an image; and to perform a convolution of a kernel with one of the two-dimensional arrays.
    Type: Grant
    Filed: April 3, 2020
    Date of Patent: January 23, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Hamzah Ahmed Ali Abdelaziz, Joseph H. Hassoun
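
The per-plane operation named at the end of the abstract, convolving a kernel with one two-dimensional color component, can be sketched directly; the valid-mode, cross-correlation-style loop below follows the textbook CNN convention and is not the tile's actual datapath.

```python
import numpy as np

def conv2d_plane(plane, kernel):
    """Valid-mode 2-D convolution (cross-correlation, per CNN convention)
    of a kernel with one color component of the input tensor."""
    kh, kw = kernel.shape
    h, w = plane.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(plane[i:i + kh, j:j + kw] * kernel)
    return out

rgb = np.random.rand(3, 8, 8)           # tensor of three color planes
k = np.ones((3, 3)) / 9.0               # 3x3 averaging kernel
print(conv2d_plane(rgb[0], k).shape)    # -> (6, 6)
```
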
  • Patent number: 11861328
    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products, and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a second multiplier, and a third multiplier, a first activation value by a first least significant sub-word, a second least significant sub-word, and a most significant sub-word; and adding a first resulting partial product and a second resulting partial product. The forming of the second set of products may include forming a first floating point product, the forming of the first floating point product including multiplying, in the first multiplier, a first sub-word of a mantissa of an activation value by a first sub-word of a mantissa of a weight, to form a third partial product.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph H. Hassoun
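
The integer half of patent 11861328's scheme amounts to splitting the weight into sub-words, multiplying each by the activation in a small multiplier, and shift-adding the partial products. A sketch, assuming 4-bit sub-words and a 12-bit weight (widths the abstract does not fix):

```python
def subword_multiply(activation, weight, bits=4):
    """Split the weight into three sub-words, multiply the activation by
    each in a small multiplier, then shift and add the partial products."""
    mask = (1 << bits) - 1
    lsw1 = weight & mask                 # first least significant sub-word
    lsw2 = (weight >> bits) & mask       # second least significant sub-word
    msw = weight >> (2 * bits)           # most significant sub-word
    p1 = activation * lsw1
    p2 = activation * lsw2
    p3 = activation * msw
    return p1 + (p2 << bits) + (p3 << (2 * bits))

assert subword_multiply(7, 0xABC) == 7 * 0xABC
print(subword_multiply(7, 0xABC))
```
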
  • Publication number: 20230351151
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Application
    Filed: July 10, 2023
    Publication date: November 2, 2023
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 11783161
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: October 10, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 11783162
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: October 10, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 11775801
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: October 3, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 11775256
    Abstract: An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second, and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If both operands are less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), either the first multiplier is used, or the second and third multipliers are used, to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second, and third multipliers are used to multiply the operands.
    Type: Grant
    Filed: January 12, 2023
    Date of Patent: October 3, 2023
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang
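
The case analysis in patent 11775256's abstract rests on the schoolbook identity x·y = (xh·2^(N/2) + xl)·(yh·2^(N/2) + yl). The sketch below checks the arithmetic; how the partial products map onto the three hardware multipliers (and their selective disabling) is simplified away.

```python
def nxn_multiply(x, y, n=16):
    """Multiply two N-bit operands by splitting each into high and low
    N/2-bit halves and combining N/2-wide partial products with shifts."""
    half = n // 2
    lim = 1 << half                      # 2^(N/2)
    if x == 0 or y == 0:
        return 0                         # all multipliers disabled
    if x < lim and y < lim:
        return x * y                     # one N/2 x N/2 multiplier suffices
    xh, xl = x >> half, x & (lim - 1)
    yh, yl = y >> half, y & (lim - 1)
    return ((xh * yh) << n) + ((xh * yl + xl * yh) << half) + xl * yl

assert nxn_multiply(40000, 123) == 40000 * 123
print(nxn_multiply(40000, 123))
```
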
  • Patent number: 11775611
    Abstract: In some embodiments, a method of quantizing an artificial neural network includes dividing a quantization range for a tensor of the artificial neural network into a first region and a second region, and quantizing values of the tensor in the first region separately from values of the tensor in the second region. In some embodiments, linear or nonlinear quantization is applied to values of the tensor in the first region and the second region. In some embodiments, the method includes locating a breakpoint between the first region and the second region by substantially minimizing an expected quantization error over at least a portion of the quantization range. In some embodiments, the expected quantization error is minimized by solving analytically and/or searching numerically.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: October 3, 2023
    Inventors: Jun Fang, Joseph H. Hassoun, Ali Shafiee Ardestani, Hamzah Ahmed Ali Abdelaziz, Georgios Georgiadis, Hui Chen, David Philip Lloyd Thorsley
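
A sketch of the two-region idea in patent 11775611, assuming uniform grids in each region and a brute-force numeric breakpoint search minimizing mean squared error; the level counts and the candidate grid are illustrative.

```python
import numpy as np

def quantize_region(x, lo, hi, levels):
    """Uniform quantization of x over [lo, hi] with the given level count."""
    step = (hi - lo) / (levels - 1)
    return lo + np.round((np.clip(x, lo, hi) - lo) / step) * step

def two_region_quantize(x, breakpoint_, levels=16):
    """Quantize values below the breakpoint separately from values above
    it, each region getting its own uniform grid."""
    lo, hi = x.min(), x.max()
    out = np.empty_like(x)
    out[x < breakpoint_] = quantize_region(x[x < breakpoint_],
                                           lo, breakpoint_, levels)
    out[x >= breakpoint_] = quantize_region(x[x >= breakpoint_],
                                            breakpoint_, hi, levels)
    return out

def search_breakpoint(x, candidates, levels=16):
    """Numeric search for the breakpoint minimizing the expected (MSE)
    quantization error, one of the strategies the abstract mentions."""
    errs = [np.mean((x - two_region_quantize(x, b, levels)) ** 2)
            for b in candidates]
    return candidates[int(np.argmin(errs))]

x = np.random.laplace(size=10000)        # heavy-tailed, like real tensors
bps = np.linspace(x.min() + 1e-3, x.max() - 1e-3, 50)
print(search_breakpoint(x, bps))
```
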
  • Patent number: 11775802
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: October 3, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Publication number: 20230205488
    Abstract: A system and method for efficient processing for neural network inference operations. In some embodiments, the system includes a circuit configured to multiply a first number by a second number, the first number being represented as a sign bit, five exponent bits, and seven mantissa bits representing an eight-bit full mantissa.
    Type: Application
    Filed: January 6, 2022
    Publication date: June 29, 2023
    Inventors: Ling LI, Ali SHAFIEE ARDESTANI, Hamzah ABDELAZIZ, Joseph H. HASSOUN
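
Decoding the 13-bit format the abstract describes is straightforward once the implicit leading mantissa bit is restored. A sketch, assuming an IEEE-style exponent bias of 15 and a zero encoding, neither of which the abstract specifies:

```python
def decode_fp13(bits):
    """Decode 1 sign bit, 5 exponent bits, and 7 stored mantissa bits;
    an implicit leading 1 makes the eight-bit full mantissa. The bias of
    15 and the all-zero encoding of 0.0 are assumptions."""
    sign = (bits >> 12) & 0x1
    exp = (bits >> 7) & 0x1F
    mant = bits & 0x7F
    if exp == 0 and mant == 0:
        return 0.0                        # assumed encoding of zero
    full_mant = (1 << 7) | mant           # eight-bit full mantissa
    value = full_mant / 128.0 * 2.0 ** (exp - 15)
    return -value if sign else value

# 1.5 x 2^0: sign=0, exp=15 (bias 15), mantissa=0b1000000
print(decode_fp13((15 << 7) | 0b1000000))   # -> 1.5
```
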
  • Patent number: 11671111
    Abstract: A multichannel data packer includes a plurality of two-input multiplexers and a controller. The plurality of two-input multiplexers is arranged in 2^N rows and N columns, in which N is an integer greater than 1. Each input of a multiplexer in a first column receives a respective bit stream of 2^N channels of bit streams. Each respective bit stream has a bit-stream length based on the data in the bit stream. The multiplexers in a last column output 2^N channels of packed bit streams, each having the same bit-stream length. The controller controls the plurality of multiplexers so that the multiplexers in the last column output the 2^N channels of bit streams that each have the same bit-stream length.
    Type: Grant
    Filed: April 7, 2020
    Date of Patent: June 6, 2023
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Lei Wang, Joseph H. Hassoun
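
Functionally, the packer equalizes 2^N variable-length bit streams. The sketch below reproduces only that input/output relationship; the patented design realizes it with a 2^N × N butterfly of two-input multiplexers, and the zero-padding used here when the total bit count is not divisible by the channel count is an assumption.

```python
def pack_channels(streams):
    """Redistribute 2^N variable-length bit streams into 2^N streams of
    equal length (padding with zeros when lengths do not divide evenly)."""
    n_ch = len(streams)
    bits = [b for s in streams for b in s]    # flatten all channels
    per = -(-len(bits) // n_ch)               # ceiling division
    bits += [0] * (per * n_ch - len(bits))    # assumed zero padding
    return [bits[i * per:(i + 1) * per] for i in range(n_ch)]

packed = pack_channels([[1, 0, 1, 1], [0], [1, 1], [0]])
print([len(c) for c in packed])               # -> [2, 2, 2, 2]
```
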
  • Publication number: 20230153065
    Abstract: An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second, and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If both operands are less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), either the first multiplier is used, or the second and third multipliers are used, to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second, and third multipliers are used to multiply the operands.
    Type: Application
    Filed: January 12, 2023
    Publication date: May 18, 2023
    Inventors: Ilia OVSIANNIKOV, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN, Lei WANG
  • Publication number: 20230153569
    Abstract: A system and a method are disclosed for estimating a latency of a layer of a neural network. A host processing device adds an auxiliary layer to a selected layer of the neural network. A neural processing unit executes an inference operation over the selected layer and the auxiliary layer. A total latency is measured for the inference operation over the selected layer and the auxiliary layer, and an overhead latency associated with the auxiliary layer is measured for the inference operation. The overhead latency is subtracted from the total latency to generate an estimate of the latency of the layer. In one embodiment, the overhead latency is modeled by a linear regression on the size of the data input to the selected layer and the size of the data output from the auxiliary layer.
    Type: Application
    Filed: January 14, 2022
    Publication date: May 18, 2023
    Inventors: Jun FANG, Li YANG, David THORSLEY, Joseph H. HASSOUN
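
A sketch of the estimation recipe in this application: fit a linear regression of the auxiliary layer's overhead on input/output data sizes, then subtract the modeled overhead from the measured total. The measurement numbers below are illustrative, not benchmarks.

```python
import numpy as np

def fit_overhead_model(in_sizes, out_sizes, overheads):
    """Linear regression of overhead latency on input and output data
    sizes (plus an intercept), as the abstract describes."""
    X = np.column_stack([in_sizes, out_sizes, np.ones(len(in_sizes))])
    coef, *_ = np.linalg.lstsq(X, overheads, rcond=None)
    return coef

def estimate_layer_latency(total_latency, in_size, out_size, coef):
    """Subtract the modeled overhead of the auxiliary layer from the
    measured total to isolate the selected layer's latency."""
    overhead = coef @ np.array([in_size, out_size, 1.0])
    return total_latency - overhead

coef = fit_overhead_model([1e4, 2e4, 4e4], [1e4, 2e4, 4e4],
                          [0.11, 0.21, 0.41])
print(estimate_layer_latency(1.50, 3e4, 3e4, coef))   # ~1.19
```
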
  • Publication number: 20230107658
    Abstract: A system and method for training a neural network. In some embodiments the method includes training a full-sized network and a plurality of sub-networks, the training including performing a plurality of iterations of supervised co-training, the performing of each iteration including co-training the full-sized network and a subset of the sub-networks.
    Type: Application
    Filed: May 6, 2022
    Publication date: April 6, 2023
    Inventors: Li YANG, Jun FANG, David Philip Lloyd THORSLEY, Joseph H. HASSOUN, Hamzah Ahmed Ali ABDELAZIZ
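
A sketch of one co-training iteration, assuming a random subset of k sub-networks per iteration and modeling the sub-networks as separate modules for brevity (in weight-sharing schemes they would be slices of the full-sized network); `cotrain_step` and all hyperparameters are illustrative.

```python
import torch

def cotrain_step(full_net, sub_nets, optimizer, x, y, k=2):
    """One supervised co-training iteration: the full-sized network and a
    random subset of k sub-networks are trained together on the same
    labeled batch, summing their losses before a single optimizer step."""
    loss_fn = torch.nn.CrossEntropyLoss()
    optimizer.zero_grad()
    loss = loss_fn(full_net(x), y)                 # full-sized network
    for i in torch.randperm(len(sub_nets))[:k]:    # sampled subset
        loss = loss + loss_fn(sub_nets[i](x), y)   # co-trained sub-network
    loss.backward()
    optimizer.step()
    return loss.item()

full = torch.nn.Linear(8, 3)                       # stand-in networks
subs = [torch.nn.Linear(8, 3) for _ in range(4)]
params = list(full.parameters()) + [p for s in subs for p in s.parameters()]
opt = torch.optim.SGD(params, lr=0.1)
x, y = torch.randn(16, 8), torch.randint(0, 3, (16,))
print(cotrain_step(full, subs, opt, x, y))
```
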
  • Publication number: 20230103997
    Abstract: A vision transformer includes L layers, and H attention heads in each layer. An h′ of the attention heads include an attention mask added before a Softmax operation, and an h of the attention heads are unmasked, in which H = h′ + h. Each attention mask multiplies a Query vector and a Key vector to form element-wise products. At least one attention mask is a hard mask that selects the closest neighbors of a patch and ignores patches further away than the closest neighbors of the patch. Alternatively, at least one attention mask is a soft mask that multiplies weights of the closest neighbors of a patch by a magnification factor and passes weights of patches that are further away than the closest neighbors of the patch. A learnable bias may be added to diagonal elements of the at least one attention mask.
    Type: Application
    Filed: January 11, 2022
    Publication date: April 6, 2023
    Inventors: Ling LI, Ali SHAFIEE ARDESTANI, Joseph H. HASSOUN
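
A sketch of a hard-masked head, assuming "closest neighbors" means patches within one grid step of the query patch and using standard dot-product scores (the abstract's element-wise Query/Key products are folded into row-wise dot products here); the soft-mask variant is not shown.

```python
import numpy as np

def hard_masked_attention(q, k, v, positions, radius=1):
    """One hard-masked attention head: scores for patches further than
    `radius` grid steps from the query patch are set to -inf before the
    Softmax, so only the closest neighbors contribute."""
    scores = q @ k.T / np.sqrt(q.shape[-1])             # pre-Softmax scores
    dist = np.abs(positions[:, None, :] - positions[None, :, :]).max(-1)
    scores = np.where(dist <= radius, scores, -np.inf)  # hard mask
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ v

side = 4                                                # 4x4 grid of patches
pos = np.array([(i, j) for i in range(side) for j in range(side)])
q, k, v = (np.random.rand(side * side, 8) for _ in range(3))
print(hard_masked_attention(q, k, v, pos).shape)        # -> (16, 8)
```
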
  • Publication number: 20230047025
    Abstract: A multichannel data packer includes a plurality of two-input multiplexers and a controller. The plurality of two-input multiplexers is arranged in 2^N rows and N columns, in which N is an integer greater than 1. Each input of a multiplexer in a first column receives a respective bit stream of 2^N channels of bit streams. Each respective bit stream has a bit-stream length based on the data in the bit stream. The multiplexers in a last column output 2^N channels of packed bit streams, each having the same bit-stream length. The controller controls the plurality of multiplexers so that the multiplexers in the last column output the 2^N channels of bit streams that each have the same bit-stream length.
    Type: Application
    Filed: October 19, 2022
    Publication date: February 16, 2023
    Inventors: Ilia OVSIANNIKOV, Ali SHAFIEE ARDESTANI, Lei WANG, Joseph H. HASSOUN
  • Publication number: 20230038891
    Abstract: A method is disclosed for reducing computation of a differentiable architecture search. An output node is formed having a channel dimension that is one-fourth of a channel dimension of a normal cell of a neural network architecture by averaging channel outputs of intermediate nodes of the normal cell. The output node is preprocessed using a 1×1 convolution to form channels of input nodes for a next layer of the cells in the neural network architecture. Forming the output node includes forming s groups of channel outputs of the intermediate nodes by dividing the channel outputs of the intermediate nodes by a splitting parameter s. An average channel output for each group of channel outputs is formed, and the output node is formed by concatenating the average channel output for each group of channels with channel outputs of the intermediate nodes of the normal cell.
    Type: Application
    Filed: September 8, 2021
    Publication date: February 9, 2023
    Inventors: Jun FANG, David Philip Lloyd THORSLEY, Chengyao SHEN, Joseph H. HASSOUN
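
A sketch of the group-averaging step that produces the reduced output node, assuming s = 4 for the abstract's one-fourth channel reduction; the concatenation with the intermediate nodes' channel outputs and the 1×1 convolution preprocessing are omitted for brevity.

```python
import numpy as np

def reduced_output_node(intermediate, s=4):
    """Split an intermediate node's channels into s groups, average within
    each group, and stack the averages, shrinking the channel dimension by
    a factor of s. Shapes are (channels, height, width)."""
    groups = np.split(intermediate, s, axis=0)        # s groups of channels
    return np.stack([g.mean(axis=0) for g in groups]) # one channel per group

node = np.random.rand(16, 8, 8)          # 16-channel intermediate node
out = reduced_output_node(node, s=4)
print(out.shape)                          # -> (4, 8, 8): 16/4 channels
```
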