Abstract: The disclosure relates to a method and an apparatus for lightweighting artificial intelligence models. The method includes identifying an outlier in an input vector of a layer, identifying at least one column of a weight matrix that corresponds to the outlier, and quantizing the weight values of the columns that do not correspond to the outlier.
Type: Grant
Filed: November 25, 2024
Date of Patent: May 12, 2026
Assignee: SqueezeBits Inc.
Inventors: Eunhyeok Park, Taesu Kim, Changhun Lee, Hyungjun Kim, Jungyu Jin
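The three steps named in the abstract above (find an outlier in the input vector, find the matching weight columns, quantize only the remaining columns) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the abstract does not say how an outlier is detected or which quantization scheme is used, so the magnitude threshold, the symmetric 4-bit round-to-nearest scheme, and every function name below are hypothetical. The weight matrix is assumed to have one column per input element, so an outlier at input index j corresponds to column j.

```python
def find_outlier_indices(x, threshold):
    """Indices of input values whose magnitude exceeds the (assumed) threshold."""
    return [j for j, v in enumerate(x) if abs(v) > threshold]


def quantize_column(col, bits=4):
    """Assumed scheme: symmetric round-to-nearest, returned in dequantized form."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(w) for w in col)
    if peak == 0.0:
        return list(col)  # an all-zero column needs no scaling
    scale = peak / qmax
    return [round(w / scale) * scale for w in col]


def quantize_non_outlier_columns(W, x, threshold, bits=4):
    """Quantize every column of W except those matching an outlier in x."""
    outliers = set(find_outlier_indices(x, threshold))
    rows, cols = len(W), len(W[0])
    out = [row[:] for row in W]  # outlier columns stay at full precision
    for j in range(cols):
        if j in outliers:
            continue
        col = quantize_column([W[i][j] for i in range(rows)], bits)
        for i in range(rows):
            out[i][j] = col[i]
    return out, sorted(outliers)


# Example: the second input value is an outlier, so column 1 is left untouched.
W = [[0.10, -0.52, 0.33],
     [0.70, 0.25, -0.41]]
x = [0.2, 9.5, -0.1]
W_q, outlier_cols = quantize_non_outlier_columns(W, x, threshold=6.0)
print(outlier_cols)
print(W_q)
```

Keeping the outlier columns at full precision is the design point the abstract describes: those columns are multiplied by the largest input values, so any rounding error in them would be amplified the most.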
Abstract: The disclosure relates to a method and an apparatus for lightweighting artificial intelligence models. A method of performing matrix multiplication of weight values and input values of an artificial intelligence model includes copying quantized weight values stored in a global memory to a register, dequantizing the quantized weight values, copying an input value matrix to the register, and performing matrix multiplication between the dequantized weight value matrix and the input value matrix.
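The order of operations in this abstract (copy quantized weights, dequantize, copy inputs, multiply) can be traced with a small Python sketch. It models only the data flow: Python has no notion of GPU global memory or registers, so local copies stand in for register contents here. The single shared scale factor, the integer weight codes, and all function names are assumptions made for illustration, since the abstract does not specify the quantization format.

```python
def dequantize(q_tile, scale):
    """Map integer weight codes back to real values (assumed: one shared scale)."""
    return [[q * scale for q in row] for row in q_tile]


def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    inner, n = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(n)]
            for row in A]


def quantized_matmul(q_weights_global, scale, inputs_global):
    # Step 1: copy the quantized weights out of "global memory".
    # The local copy is a stand-in for register storage.
    q_reg = [row[:] for row in q_weights_global]
    # Step 2: dequantize the copied weights.
    w_reg = dequantize(q_reg, scale)
    # Step 3: copy the input value matrix.
    x_reg = [row[:] for row in inputs_global]
    # Step 4: multiply the dequantized weights by the inputs.
    return matmul(w_reg, x_reg)


q_weights = [[1, -2],
             [3, 4]]
inputs = [[2.0, 0.0],
          [1.0, 4.0]]
print(quantized_matmul(q_weights, 0.5, inputs))  # [[0.0, -4.0], [5.0, 8.0]]
```

The weights stay in their compact integer form until after the copy, so in this sketch only the small quantized matrix crosses the simulated memory boundary and the full-precision values exist only locally.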