Patents by Inventor Ali Shafiee

Ali Shafiee has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12136031
    Abstract: A method of flattening channel data of an input feature map in an inference system includes retrieving pixel values of a channel of a plurality of channels of the input feature map from a memory and storing the pixel values in a buffer, extracting first values of a first region having a first size from among the pixel values stored in the buffer, the first region corresponding to an overlap region of a kernel of the inference system with channel data of the input feature map, rearranging second values corresponding to the overlap region of the kernel from among the first values in the first region, and identifying a first group of consecutive values from among the rearranged second values for supplying to a first dot-product circuit of the inference system.
    Type: Grant
    Filed: May 18, 2023
    Date of Patent: November 5, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph Hassoun
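
The flattening described in the abstract above is essentially an im2col-style patch extraction: take the kernel-overlap region of one channel, rearrange it, and feed consecutive groups of values to fixed-width dot-product circuits. The Python sketch below illustrates that idea only, not the patented circuit; the names `flatten_overlap` and `DOT_WIDTH`, and the dot-product width itself, are assumptions for illustration.

```python
# Hypothetical sketch: extract the kernel-overlap region of one channel, flatten it,
# and split it into groups of consecutive values sized for a dot-product circuit.
import numpy as np

DOT_WIDTH = 4  # assumed width of one dot-product circuit (illustrative)

def flatten_overlap(channel: np.ndarray, kernel_hw: tuple,
                    out_row: int, out_col: int) -> list:
    """Return consecutive groups of the flattened kernel-overlap region."""
    kh, kw = kernel_hw
    region = channel[out_row:out_row + kh, out_col:out_col + kw]  # overlap region
    flat = region.reshape(-1)                                     # rearranged values
    # split into groups of DOT_WIDTH consecutive values (last group may be shorter)
    return [flat[i:i + DOT_WIDTH] for i in range(0, flat.size, DOT_WIDTH)]

if __name__ == "__main__":
    chan = np.arange(25).reshape(5, 5)           # one 5x5 channel of the feature map
    groups = flatten_overlap(chan, (3, 3), 1, 2)
    print(groups[0])                             # first group fed to dot-product 0
```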
  • Patent number: 12112141
    Abstract: A method for performing a convolution operation includes storing a convolution kernel in a first storage device, the convolution kernel having dimensions x by y; storing, in a second storage device, a first subset of element values of an input feature map having dimensions n by m; performing a first simultaneous multiplication of each value of the first subset of element values of the input feature map with a first element value from among the x*y elements of the convolution kernel; for each remaining value of the x*y elements of the convolution kernel, performing a simultaneous multiplication of the remaining value with a corresponding subset of element values of the input feature map; for each simultaneous multiplication, storing a result of the simultaneous multiplication in an accumulator; and outputting the values of the accumulator as a first row of an output feature map.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: October 8, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph Hassoun
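
The scheme in the abstract above amounts to computing one output row by broadcasting a single kernel element at a time across a slice of the input feature map and accumulating the products. A rough NumPy illustration follows; `conv_row` and the valid-padding choice are assumptions, not details from the patent.

```python
import numpy as np

def conv_row(ifm: np.ndarray, kernel: np.ndarray, row: int) -> np.ndarray:
    """Compute one output row by accumulating one kernel element at a time."""
    kx, ky = kernel.shape
    n, m = ifm.shape
    out_w = m - ky + 1                          # valid (no-padding) output width
    acc = np.zeros(out_w)                       # the accumulator
    for i in range(kx):
        for j in range(ky):
            # "simultaneous multiplication": one kernel element times a whole IFM slice
            acc += kernel[i, j] * ifm[row + i, j:j + out_w]
    return acc                                  # one row of the output feature map

if __name__ == "__main__":
    ifm = np.arange(36, dtype=float).reshape(6, 6)
    k = np.ones((3, 3))
    print(conv_row(ifm, k, 0))
```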
  • Patent number: 12099912
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: September 24, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
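
One plausible reading of the two states in the abstract above is a look-ahead into the activation queue, for example to skip a zero held in the output register. The toy model below reflects that reading only; `tile_multiply` and the zero-skip policy in the usage example are illustrative assumptions, not the patented control logic.

```python
from collections import deque

def tile_multiply(weight: float, queue: deque, state: int) -> float:
    """Model of one multiplier in the tile: state 1 uses the queue's output
    register (front), state 2 uses the register adjacent to it (second entry)."""
    if state == 1:
        activation = queue[0]       # output register of the first queue
    else:
        activation = queue[1]       # second register, adjacent to the output register
    return weight * activation

if __name__ == "__main__":
    q = deque([0.0, 3.5, 1.2])      # activations buffer queue, front first
    w = 0.25
    # e.g. a controller might choose state 2 when the output register holds zero
    state = 2 if q[0] == 0.0 else 1
    print(tile_multiply(w, q, state))
```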
  • Patent number: 12086700
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: September 10, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 12073302
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: July 10, 2023
    Date of Patent: August 27, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Publication number: 20240256828
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Application
    Filed: March 11, 2024
    Publication date: August 1, 2024
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 12015429
    Abstract: A multichannel data packer includes a plurality of two-input multiplexers and a controller. The plurality of two-input multiplexers is arranged in 2N rows and N columns in which N is an integer greater than 1. Each input of a multiplexer in a first column receives a respective bit stream of 2N channels of bit streams. Each respective bit stream includes a bit-stream length based on data in the bit stream. The multiplexers in a last column output 2N channels of packed bit streams each having a same bit-stream length. The controller controls the plurality of multiplexers so that the multiplexers in the last column output the 2N channels of bit streams that each has the same bit-stream length.
    Type: Grant
    Filed: October 19, 2022
    Date of Patent: June 18, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Lei Wang, Joseph H. Hassoun
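
The sketch below models only the packer's end-to-end behavior described above, equalizing the bit-stream lengths across channels; the multiplexer network and its controller are not modeled, and `pack_channels` and the zero-padding rule are assumptions for illustration.

```python
def pack_channels(streams: list) -> list:
    """Behavioral model only: redistribute the bits of several variable-length
    channel bit streams so every output channel carries the same number of bits
    (padding the total with zeros if it is not evenly divisible)."""
    n_ch = len(streams)
    all_bits = "".join(streams)
    target = -(-len(all_bits) // n_ch)          # ceiling division
    all_bits = all_bits.ljust(n_ch * target, "0")
    return [all_bits[i * target:(i + 1) * target] for i in range(n_ch)]

if __name__ == "__main__":
    chans = ["101", "1", "001101", "11"]        # 4 channels, unequal lengths
    print(pack_channels(chans))                 # 4 channels, equal length
```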
  • Publication number: 20240192922
    Abstract: Systems and methods for handling processing with sparse weights and outliers. In some embodiments, the method includes reading a first activation from a first row of an array of activations; multiplying a first weight by the first activation to form a first product; directing, by a first demultiplexer, the first product to a first adder tree, of a plurality of adder trees; reading a second activation from a second row of the array of activations; and multiplying a second weight by the second activation.
    Type: Application
    Filed: February 17, 2023
    Publication date: June 13, 2024
    Inventors: Ali Shafiee Ardestani, Hamzah Ahmed Ali Abdelaziz, Ardavan Pedram, Joseph H. Hassoun
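
A toy model of the routing idea in the abstract above, demultiplexing each weight-activation product into one of several adder trees, is sketched below; `route_products`, the select list, and the single-column activation array are illustrative assumptions.

```python
import numpy as np

def route_products(activations: np.ndarray, weights: list,
                   selects: list, n_trees: int) -> np.ndarray:
    """Toy model: each weight multiplies one activation (taken from successive rows
    of the activation array) and a demultiplexer select routes the product to one
    of n_trees adder trees, modeled here as running sums."""
    trees = np.zeros(n_trees)
    for row, (w, sel) in enumerate(zip(weights, selects)):
        act = activations[row, 0]       # read an activation from this row
        trees[sel] += w * act           # demux: send the product to adder tree `sel`
    return trees

if __name__ == "__main__":
    acts = np.array([[2.0], [4.0], [1.0]])
    print(route_products(acts, weights=[0.5, -1.0, 3.0], selects=[0, 1, 0], n_trees=2))
```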
  • Patent number: 12001929
    Abstract: According to one general aspect, an apparatus may include a machine learning system. The machine learning system may include a precision determination circuit configured to: determine a precision level of data, and divide the data into a data subdivision. The machine learning system may exploit sparsity during the computation of each subdivision. The machine learning system may include a load balancing circuit configured to select a load balancing technique, wherein the load balancing technique includes alternately loading the computation circuit with at least a first data/weight subdivision combination and a second data/weight subdivision combination. The load balancing circuit may be configured to load a computation circuit with a selected data subdivision and a selected weight subdivision based, at least in part, upon the load balancing technique.
    Type: Grant
    Filed: June 10, 2020
    Date of Patent: June 4, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hamzah Abdelaziz, Joseph Hassoun, Ali Shafiee Ardestani
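
One way to picture the subdivision-and-balancing idea above is nibble-wise multiplication that skips all-zero subdivisions. The sketch below is that illustration only; the 4-bit subdivision width and the function names are assumptions, not the patented load-balancing technique.

```python
def nibbles(x: int) -> tuple:
    """Split an 8-bit value into (low, high) 4-bit subdivisions."""
    return x & 0xF, (x >> 4) & 0xF

def balanced_mac(data: int, weight: int) -> int:
    """Illustrative only: multiply 8-bit operands as four nibble-by-nibble partial
    products, stepping through the data/weight subdivision combinations and
    skipping any combination with a zero subdivision (sparsity)."""
    d_lo, d_hi = nibbles(data)
    w_lo, w_hi = nibbles(weight)
    combos = [(d_lo, w_lo, 0), (d_hi, w_lo, 4), (d_lo, w_hi, 4), (d_hi, w_hi, 8)]
    acc = 0
    for d, w, shift in combos:          # alternately load each combination
        if d == 0 or w == 0:            # exploit sparsity: skip zero subdivisions
            continue
        acc += (d * w) << shift
    return acc

if __name__ == "__main__":
    assert balanced_mac(0xA3, 0x0F) == 0xA3 * 0x0F
    print(balanced_mac(0xA3, 0x0F))
```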
  • Patent number: 11954574
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: April 9, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Publication number: 20240095519
    Abstract: A neural network inference accelerator includes first and second neural processing units (NPUs) and a sparsity management unit. The first NPU receives activation and weight tensors based on an activation sparsity density and a weight sparsity density both being greater than a predetermined sparsity density. The second NPU receives activation and weight tensors based on at least one of the activation sparsity density and the weight sparsity density being less than or equal to the predetermined sparsity density. The sparsity management unit controls transfer of the activation tensor and the weight tensor based on the activation sparsity density and the weight sparsity density with respect to the predetermined sparsity density.
    Type: Application
    Filed: November 17, 2022
    Publication date: March 21, 2024
    Inventors: Ardavan Pedram, Ali Shafiee Ardestani, Jong Hoon Shin, Joseph H. Hassoun
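
A behavioral sketch of the dispatch decision described above follows; the 0.5 threshold, the nonzero-fraction definition of density, and the function names are assumptions made for illustration.

```python
import numpy as np

THRESHOLD = 0.5   # assumed predetermined sparsity density

def density(t: np.ndarray) -> float:
    """Fraction of nonzero elements (one possible reading of 'sparsity density')."""
    return float(np.count_nonzero(t)) / t.size

def dispatch(activation: np.ndarray, weight: np.ndarray) -> str:
    """Sketch of the sparsity management unit: route to the first NPU only when
    both tensors are denser than the threshold, otherwise to the second NPU."""
    if density(activation) > THRESHOLD and density(weight) > THRESHOLD:
        return "NPU1 (dense path)"
    return "NPU2 (sparse path)"

if __name__ == "__main__":
    act = np.array([[1.0, 0.0], [2.0, 3.0]])    # 75% nonzero
    wgt = np.array([[0.0, 0.0], [0.5, 0.0]])    # 25% nonzero
    print(dispatch(act, wgt))                    # -> sparse path
```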
  • Patent number: 11880760
    Abstract: A processor to perform inference on deep learning neural network models. In some embodiments, the processor includes: a first tile, a second tile, a memory, and a bus, the bus being connected to: the memory, the first tile, and the second tile, the first tile including: a first weight register, a second weight register, an activations cache, a shuffler, an activations buffer, a first multiplier, and a second multiplier, the activations buffer being configured to include: a first queue connected to the first multiplier, and a second queue connected to the second multiplier, the activations cache including a plurality of independent lanes, each of the independent lanes being randomly accessible, the first tile being configured: to receive a tensor including a plurality of two-dimensional arrays, each representing one color component of an image; and to perform a convolution of a kernel with one of the two-dimensional arrays.
    Type: Grant
    Filed: April 3, 2020
    Date of Patent: January 23, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Hamzah Ahmed Ali Abdelaziz, Joseph H. Hassoun
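
The per-color-plane convolution mentioned at the end of the abstract above can be pictured as below; the activations cache, shuffler, and independent lanes are not modeled, and `conv2d_plane` is an illustrative name.

```python
import numpy as np

def conv2d_plane(plane: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid (no-padding) 2-D convolution of a kernel with one color plane."""
    kh, kw = kernel.shape
    oh, ow = plane.shape[0] - kh + 1, plane.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            out[r, c] = np.sum(plane[r:r + kh, c:c + kw] * kernel)
    return out

if __name__ == "__main__":
    # a toy RGB image as a tensor of three 2-D arrays, one per color component
    image = np.random.rand(3, 8, 8)
    kernel = np.ones((3, 3)) / 9.0              # simple box filter
    red_out = conv2d_plane(image[0], kernel)    # convolve the kernel with one plane
    print(red_out.shape)                        # (6, 6)
```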
  • Patent number: 11861328
    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products, and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a second multiplier, and a third multiplier, the first activation value by a first least significant sub-word, a second least significant sub-word, and a most significant sub-word; and adding a first resulting partial product and a second resulting partial product. The forming of the second set of products may include forming a first floating point product, the forming of the first floating point product including multiplying, in the first multiplier, a first sub-word of a mantissa of an activation value by a first sub-word of a mantissa of a weight, to form a third partial product.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph H. Hassoun
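
The integer path described above can be pictured as sub-word partial products that are shifted and summed. The sketch below assumes a 12-bit weight split into three 4-bit sub-words purely for illustration; the floating-point mantissa path is not shown.

```python
def subword_multiply(activation: int, weight: int) -> int:
    """Illustrative integer path: split a 12-bit weight into three 4-bit sub-words
    (two least-significant, one most-significant), multiply each by the activation,
    then add the shifted partial products. Widths are assumptions for the sketch."""
    w0 = weight & 0xF            # first least-significant sub-word
    w1 = (weight >> 4) & 0xF     # second least-significant sub-word
    w2 = (weight >> 8) & 0xF     # most-significant sub-word
    p0 = activation * w0
    p1 = activation * w1
    p2 = activation * w2
    return p0 + (p1 << 4) + (p2 << 8)

if __name__ == "__main__":
    a, w = 57, 0xABC
    assert subword_multiply(a, w) == a * w
    print(subword_multiply(a, w))
```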
  • Patent number: 11861327
    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a first activation value by a least significant sub-word and a most significant sub-word of a first weight to form a first partial product and a second partial product; and adding the first partial product and the second partial product. The forming of the second set of products may include: multiplying, in the first multiplier, a second activation value by a first sub-word and a second sub-word of a mantissa to form a third partial product and a fourth partial product; and adding the third partial product and the fourth partial product.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph Hassoun
  • Publication number: 20230368494
    Abstract: A system and a method are disclosed for training a vision transformer. A token distillation loss of an input image based on a teacher network classification token and a token importance score of a student network (the vision transformer during training) are determined at a pruning layer of the vision transformer. When a current epoch number is odd, sparsification of tokens of the input image is skipped and the dense input image is processed by layers that are subsequent to the pruning layer. When the current epoch number is even, tokens of the input image are pruned at the pruning layer and processed by layers that are subsequent to the pruning layer. A label loss and a total loss for the input image are determined by the subsequent layers and the student network is updated.
    Type: Application
    Filed: November 1, 2022
    Publication date: November 16, 2023
    Inventors: Ling Li, Ali Shafiee Ardestani
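
The alternating dense/pruned schedule described above can be sketched as follows; `select_tokens`, `keep_ratio`, and the top-k rule are assumptions standing in for the patent's pruning criterion, and the distillation and label losses are not modeled.

```python
import numpy as np

def select_tokens(tokens: np.ndarray, importance: np.ndarray,
                  epoch: int, keep_ratio: float = 0.5) -> np.ndarray:
    """Sketch of the alternating schedule: on odd epochs the dense token set is
    passed to the layers after the pruning layer; on even epochs only the
    highest-importance tokens are kept. keep_ratio is an assumed hyperparameter."""
    if epoch % 2 == 1:                          # odd epoch: skip sparsification
        return tokens
    k = max(1, int(round(keep_ratio * tokens.shape[0])))
    keep = np.argsort(importance)[-k:]          # top-k token importance scores
    return tokens[np.sort(keep)]

if __name__ == "__main__":
    toks = np.random.rand(8, 16)                # 8 tokens, 16-dim embeddings
    scores = np.random.rand(8)                  # token importance from the student
    print(select_tokens(toks, scores, epoch=1).shape)   # (8, 16) dense
    print(select_tokens(toks, scores, epoch=2).shape)   # (4, 16) pruned
```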
  • Publication number: 20230351151
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Application
    Filed: July 10, 2023
    Publication date: November 2, 2023
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 11783161
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: June 19, 2019
    Date of Patent: October 10, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 11783162
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: October 10, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 11775802
    Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: October 3, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph H. Hassoun, Lei Wang, Sehwan Lee, JoonHo Song, Jun-Woo Jang, Yibing Michelle Wang, Yuecheng Li
  • Patent number: 11775611
    Abstract: In some embodiments, a method of quantizing an artificial neural network includes dividing a quantization range for a tensor of the artificial neural network into a first region and a second region, and quantizing values of the tensor in the first region separately from values of the tensor in the second region. In some embodiments, linear or nonlinear quantization are applied to values of the tensor in the first region and the second region. In some embodiments, the method includes locating a breakpoint between the first region and the second region by substantially minimizing an expected quantization error over at least a portion of the quantization range. In some embodiments, the expected quantization error is minimized by solving analytically and/or searching numerically.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: October 3, 2023
    Inventors: Jun Fang, Joseph H. Hassoun, Ali Shafiee Ardestani, Hamzah Ahmed Ali Abdelaziz, Georgios Georgiadis, Hui Chen, David Philip Lloyd Thorsley
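
The breakpoint search described above can be pictured as a small numerical sweep that quantizes the two regions separately and keeps the split with the lowest mean squared error; the bit width, the 33-point sweep, and the function names below are assumptions for illustration.

```python
import numpy as np

def quantize_region(x: np.ndarray, lo: float, hi: float, bits: int) -> np.ndarray:
    """Uniform quantization of x over [lo, hi] with 2**bits levels."""
    levels = 2 ** bits - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((np.clip(x, lo, hi) - lo) / step)
    return q * step + lo

def two_region_quantize(x: np.ndarray, bits: int = 4):
    """Sketch: numerically search a breakpoint b, quantize [min, b) and [b, max]
    separately, and keep the breakpoint that minimizes mean squared error."""
    lo, hi = float(x.min()), float(x.max())
    best_b, best_err, best_q = hi, np.inf, None
    for b in np.linspace(lo, hi, 33)[1:-1]:     # candidate breakpoints
        low_mask = x < b
        q = np.empty_like(x)
        q[low_mask] = quantize_region(x[low_mask], lo, b, bits)
        q[~low_mask] = quantize_region(x[~low_mask], b, hi, bits)
        err = float(np.mean((x - q) ** 2))
        if err < best_err:
            best_b, best_err, best_q = b, err, q
    return best_b, best_q

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.laplace(scale=0.3, size=10_000)     # heavy-tailed weights
    bp, wq = two_region_quantize(w)
    print(f"breakpoint ~ {bp:.3f}, MSE = {np.mean((w - wq) ** 2):.6f}")
```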