Patents by Inventor Amir GHOLAMINEJAD

Amir GHOLAMINEJAD has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220374766
    Abstract: An architecture and method are disclosed to reduce computation in a self-attention model. The self-attention model is trained using multiple sub-models. Each sub-model receives an input sequence of tokens, scores each token in that sequence, and has its own predetermined threshold score. Each sub-model prunes from its input sequence the tokens whose scores fall below its threshold, and the pruned sequence serves as the input sequence for the next sub-model. The predetermined threshold scores differ from sub-model to sub-model.
    Type: Application
    Filed: January 18, 2022
    Publication date: November 24, 2022
    Inventors: David Philip Lloyd THORSLEY, Sheng SHEN, Se Hoon KIM, Amir GHOLAMINEJAD, Woosuk KWON, Joseph HASSOUN, Kurt KEUTZER
  • Publication number: 20210232366
    Abstract: A method, computer readable medium, and system are disclosed for rounding floating point values. Dynamic directional rounding is a rounding technique for floating point operations. A floating point operation (addition, subtraction, multiplication, etc.) is performed on an operand to compute a floating point result. A sign (positive or negative) of the operand is identified. In one embodiment, the sign determines a direction in which the floating point result is rounded (towards negative or positive infinity). When used for updating parameters of a neural network during backpropagation, dynamic directional rounding ensures that rounding is performed in the direction of the gradient.
    Type: Application
    Filed: February 1, 2021
    Publication date: July 29, 2021
    Inventors: Alex Fit-Florea, Boris Ginsburg, Pooya Davoodi, Amir Gholaminejad
  • Patent number: 10908878
    Abstract: A method, computer readable medium, and system are disclosed for rounding floating point values. Dynamic directional rounding is a rounding technique for floating point operations. A floating point operation (addition, subtraction, multiplication, etc.) is performed on an operand to compute a floating point result. A sign (positive or negative) of the operand is identified. In one embodiment, the sign determines a direction in which the floating point result is rounded (towards negative or positive infinity). When used for updating parameters of a neural network during backpropagation, dynamic directional rounding ensures that rounding is performed in the direction of the gradient.
    Type: Grant
    Filed: November 26, 2018
    Date of Patent: February 2, 2021
    Assignee: NVIDIA Corporation
    Inventors: Alex Fit-Florea, Boris Ginsburg, Pooya Davoodi, Amir Gholaminejad
  • Publication number: 20200167125
    Abstract: A method, computer readable medium, and system are disclosed for rounding floating point values. Dynamic directional rounding is a rounding technique for floating point operations. A floating point operation (addition, subtraction, multiplication, etc.) is performed on an operand to compute a floating point result. A sign (positive or negative) of the operand is identified. In one embodiment, the sign determines a direction in which the floating point result is rounded (towards negative or positive infinity). When used for updating parameters of a neural network during backpropagation, dynamic directional rounding ensures that rounding is performed in the direction of the gradient.
    Type: Application
    Filed: November 26, 2018
    Publication date: May 28, 2020
    Inventors: Alex Fit-Florea, Boris Ginsburg, Pooya Davoodi, Amir Gholaminejad
  • Patent number: 10067911
    Abstract: Systems, apparatuses, and methods for performing in-place matrix transpose operations are disclosed. Operations for transposing tiles of a matrix are scheduled in an order determined by moving diagonally through tiles of the matrix. When a diagonal line hits a boundary, then a tile on a new diagonal line of the matrix is selected and operations are scheduled for transposing this tile. Only tiles within a triangular region of the matrix are scheduled for being transposed. This allows memory access operations to be performed in parallel, expediting the matrix transpose operation compared to linear tile indexing.
    Type: Grant
    Filed: July 26, 2016
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amir Gholaminejad, Bragadeesh Natarajan
  • Publication number: 20180032477
    Abstract: Systems, apparatuses, and methods for performing in-place matrix transpose operations are disclosed. Operations for transposing tiles of a matrix are scheduled in an order determined by moving diagonally through tiles of the matrix. When a diagonal line hits a boundary, then a tile on a new diagonal line of the matrix is selected and operations are scheduled for transposing this tile. Only tiles within a triangular region of the matrix are scheduled for being transposed. This allows memory access operations to be performed in parallel, expediting the matrix transpose operation compared to linear tile indexing.
    Type: Application
    Filed: July 26, 2016
    Publication date: February 1, 2018
    Inventors: Amir Gholaminejad, Bragadeesh Natarajan
  • Publication number: 20170372202
    Abstract: Aspects of the present invention are directed to computer-implemented techniques for improving the training of artificial neural networks using a reduced precision (e.g., float16) data format. Embodiments of the present invention rescale tensor values prior to performing matrix operations (such as matrix multiplication or matrix addition) to prevent overflow and underflow. To preserve accuracy throughout the performance of the matrix operations, the scale factors are defined using a novel data format to represent tensors, wherein a matrix is represented by the tuple X, where X=(a, v[.]), wherein a is a float scale factor and v[.] are scaled values stored in the float16 format. The value of any element X[i] according to this data format would be equal to a*v[i].
    Type: Application
    Filed: June 15, 2017
    Publication date: December 28, 2017
    Inventors: Boris GINSBURG, Sergei NIKOLAEV, Ahmad KISWANI, Hao WU, Amir GHOLAMINEJAD, Slawomir KIERAT, Michael HOUSTON, Alex FIT-FLOREA
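
The cascading token-pruning scheme described in publication 20220374766 can be sketched as follows. This is a minimal illustration only: the function name, the per-stage scoring functions, and the threshold values are all hypothetical stand-ins, not the patented implementation.

```python
def cascade_token_prune(tokens, stage_score_fns, thresholds):
    """Sketch of cascading token pruning: each stage (sub-model)
    scores its input tokens and drops those scoring below that
    stage's threshold; the survivors become the next stage's
    input sequence. Scoring functions and thresholds are
    illustrative assumptions."""
    seq = list(tokens)
    for score_fn, threshold in zip(stage_score_fns, thresholds):
        # Keep only tokens whose score meets this stage's threshold.
        seq = [tok for tok in seq if score_fn(tok) >= threshold]
    return seq


# Toy usage: a fixed importance table stands in for learned scores,
# with thresholds that tighten from stage to stage.
importance = {"the": 0.1, "cat": 0.9, "sat": 0.5, "on": 0.2, "mat": 0.8}
score = lambda tok: importance[tok]
pruned = cascade_token_prune(["the", "cat", "sat", "on", "mat"],
                             [score, score], [0.15, 0.6])
print(pruned)  # stage 1 drops "the"; stage 2 drops "sat" and "on"
```

Because each stage operates only on the previous stage's survivors, later (typically more expensive) sub-models attend over progressively shorter sequences, which is where the computation savings come from.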
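
The dynamic directional rounding described in patent 10908878 (and its related publications 20210232366 and 20200167125) can be sketched as follows. The quantization step standing in for reduced-precision ulp spacing and the function signature are assumptions for illustration:

```python
import math

def dynamic_directional_round(operand, result, step=0.25):
    """Sketch of dynamic directional rounding: the rounding
    direction for `result` is chosen from the sign of `operand`,
    rounding toward positive infinity when the operand is
    non-negative and toward negative infinity otherwise. `step`
    is an assumed quantization grid standing in for the spacing
    of representable reduced-precision values."""
    if operand >= 0:
        return math.ceil(result / step) * step   # round toward +inf
    return math.floor(result / step) * step      # round toward -inf


# With a positive operand the result rounds up; with a negative
# operand the same result rounds down.
print(dynamic_directional_round(1.0, 0.30))   # rounds 0.30 up to 0.5
print(dynamic_directional_round(-1.0, 0.30))  # rounds 0.30 down to 0.25
```

Applied to a parameter update during backpropagation, choosing the direction from the sign of the gradient operand means rounding error always pushes the parameter in the direction the gradient is already moving it, rather than randomly with or against it.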
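
The diagonal tile-scheduling idea in patent 10067911 (and publication 20180032477) can be sketched as follows. This toy version assumes a square matrix whose side is divisible by the tile size, and it is a sequential illustration of the traversal order only; the patent's point is that scheduling tiles along diagonals lets the tile-level memory accesses proceed in parallel.

```python
import numpy as np

def diagonal_tile_transpose(mat, tile=2):
    """Sketch of in-place matrix transpose with diagonal tile
    scheduling: only tiles in the upper-triangular region are
    scheduled, enumerated by walking down diagonals; each visit
    also handles the mirror tile below the diagonal."""
    n = mat.shape[0] // tile  # tiles per side (assumes divisibility)
    # Enumerate upper-triangular tiles diagonally: start at row 0,
    # step to (i+1, j+1) until a boundary is hit, then begin the
    # next diagonal.
    order = []
    for start_col in range(n):
        i, j = 0, start_col
        while i < n and j < n:
            order.append((i, j))
            i, j = i + 1, j + 1
    for i, j in order:
        a = mat[i*tile:(i+1)*tile, j*tile:(j+1)*tile].copy()
        if i == j:
            # Diagonal tile: transpose within the tile.
            mat[i*tile:(i+1)*tile, j*tile:(j+1)*tile] = a.T
        else:
            # Off-diagonal tile: swap with its mirror tile,
            # transposing each.
            b = mat[j*tile:(j+1)*tile, i*tile:(i+1)*tile].copy()
            mat[i*tile:(i+1)*tile, j*tile:(j+1)*tile] = b.T
            mat[j*tile:(j+1)*tile, i*tile:(i+1)*tile] = a.T
    return mat


m = np.arange(16, dtype=float).reshape(4, 4)
print(diagonal_tile_transpose(m.copy(), tile=2))  # equals m.T
```

Restricting the schedule to the triangular region means each tile pair is touched exactly once, and tiles on the same diagonal have no overlapping memory accesses, which is what makes parallel execution safe.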
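
The scaled-tensor data format in publication 20170372202 represents a tensor X as a tuple (a, v[.]), with a a float scale factor and v[.] values stored in float16, so that X[i] = a*v[i]. A minimal sketch, with the rescaling policy (dividing by the maximum magnitude) and function names chosen purely for illustration:

```python
import numpy as np

def to_scaled_fp16(x):
    """Sketch of the (a, v) tensor format: `a` is a float32 scale
    factor and `v` holds the rescaled values in float16. Dividing
    by the maximum magnitude (an assumed policy) keeps every
    stored value inside the float16 range, preventing overflow."""
    a = np.float32(np.max(np.abs(x)))
    if a == 0.0:
        a = np.float32(1.0)
    v = (x / a).astype(np.float16)  # scaled values lie in [-1, 1]
    return a, v

def scaled_matmul(xa, xv, ya, yv):
    """Matrix multiply in the scaled domain, then combine the two
    scale factors; accumulation is done in float32 to preserve
    accuracy, then the result is re-packed into the same format."""
    out = xv.astype(np.float32) @ yv.astype(np.float32)
    return to_scaled_fp16((xa * ya) * out)


# These magnitudes overflow float16 (max ~65504) if stored directly,
# but survive round-tripping through the scaled format.
x = np.array([[100000.0, -50000.0], [25000.0, 75000.0]], dtype=np.float32)
a, v = to_scaled_fp16(x)
print(a * v.astype(np.float32))  # recovers x
```

The key property is that the scale factor absorbs the dynamic range while float16 only has to represent the relative magnitudes, which is what lets reduced-precision training avoid overflow and underflow in matrix operations.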