Abstract: Provided is a method and system for performing multi-device-based inference for a large language model. A multi-device-based inference system may include a plurality of devices, each mapped to one of the partitions into which a large language model (LLM) is divided according to an intra-layer parallelism method. Here, each of the plurality of devices may synchronize data by sharing its sub-result of a matrix multiplication on the data with the other devices while the matrix multiplication is being performed.
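The following is a minimal sketch of that idea, simulated on a single host with NumPy: the weight matrix is split row-wise across devices, each device computes its sub-result of the matrix multiplication, and the sub-results are reduced into the final product. The device count, the row-wise split, and the sequential loop standing in for the overlapped exchange are illustrative assumptions, not the patented design.

```python
# Intra-layer (tensor) parallelism with partial-result sharing, simulated
# on one host. NUM_DEVICES and the row-wise split are assumptions.
import numpy as np

NUM_DEVICES = 4
D_IN, D_OUT = 1024, 1024

rng = np.random.default_rng(0)
x = rng.standard_normal(D_IN)
W = rng.standard_normal((D_IN, D_OUT))

# Row-wise split: each "device" holds a slice of W and sees a slice of x.
x_shards = np.split(x, NUM_DEVICES)
W_shards = np.split(W, NUM_DEVICES, axis=0)

# Each device produces a sub-result of the matrix multiplication. In real
# hardware the sub-results are exchanged while later tiles are still being
# multiplied; this loop stands in for that overlapped exchange.
acc = np.zeros(D_OUT)
for dev in range(NUM_DEVICES):
    sub_result = x_shards[dev] @ W_shards[dev]  # local partial product
    acc += sub_result                           # share/reduce with peers

assert np.allclose(acc, x @ W)  # reduced sub-results equal the full product
```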
Abstract: Provided is a mixed-precision multiply-and-accumulate (MAC) tree structure to maximize memory bandwidth usage for computational acceleration of a generative large language model. A MAC tree-based operator may include a plurality of floating-point (FP) multipliers connected in parallel and configured to process a multiplication operation on data delivered from an external memory; a plurality of first converters configured to convert the output of each of the plurality of FP multipliers from floating point to fixed point; a fixed-point (FXP) adder tree connected to the plurality of first converters and configured to process summation of the multiplication results of the plurality of FP multipliers; an FXP accumulator configured to accumulate the output of the FXP adder tree; and a second converter configured to convert the output of the FXP accumulator from fixed point back to floating point.
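A behavioral sketch of that datapath in Python, assuming 16 parallel lanes and a Q16.16 fixed-point format (both numbers are illustrative, not taken from the abstract):

```python
# Mixed-precision MAC tree: FP multiplies, per-lane FP->FXP conversion,
# an integer adder tree, integer accumulation, and a final FXP->FP step.
import numpy as np

LANES = 16       # number of parallel FP multipliers (assumed)
FRAC_BITS = 16   # fixed-point fractional bits (assumed Q16.16 format)

def fp_to_fxp(x):
    """First converters: floating point -> fixed point (scaled integer)."""
    return np.round(x * (1 << FRAC_BITS)).astype(np.int64)

def fxp_to_fp(x):
    """Second converter: fixed point -> floating point."""
    return np.float64(x) / (1 << FRAC_BITS)

def mac_tree(weights, activations):
    """One pass of the MAC tree over streamed operand vectors."""
    acc = np.int64(0)                 # FXP accumulator
    for i in range(0, len(weights), LANES):
        w = weights[i:i+LANES]
        a = activations[i:i+LANES]
        products = w * a              # parallel FP multipliers
        fxp = fp_to_fxp(products)     # FP -> FXP, one converter per lane
        acc += fxp.sum()              # FXP adder tree feeding the accumulator
    return fxp_to_fp(acc)

rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float32)
a = rng.standard_normal(256).astype(np.float32)
print(mac_tree(w, a), float(np.dot(w, a)))  # close, up to quantization error
```

One general motivation for such an arrangement is that fixed-point addition is exact and order-independent, so the adder tree and accumulator introduce no rounding beyond the per-lane conversion.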
Abstract: Provided is a method and system for efficient hardware mapping of a generative giant artificial intelligence model. A hardware mapping method may include receiving, by at least one processor, model software, and sequentially performing, by the at least one processor, source-code-level simulation, instruction-level simulation, and register-transfer-level (RTL) simulation for the model software.
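A hedged sketch of that three-stage flow, with each stage gating the next. The stage functions, their pass/fail contract, and the file name are assumptions made for illustration; they are not the patented toolchain.

```python
# The model software passes through three simulation levels in order;
# a failure at any level stops the flow. Stage internals are assumed.
from typing import Callable, List

def source_level_sim(model_sw: str) -> bool:
    # e.g. run the reference model in plain software and check its outputs
    print(f"[1] source-code-level simulation of {model_sw}")
    return True

def instruction_level_sim(model_sw: str) -> bool:
    # e.g. compile to the accelerator ISA and simulate the instruction trace
    print(f"[2] instruction-level simulation of {model_sw}")
    return True

def rtl_sim(model_sw: str) -> bool:
    # e.g. replay the same trace against the register-transfer-level design
    print(f"[3] register-transfer-level simulation of {model_sw}")
    return True

def hardware_mapping_flow(model_sw: str) -> bool:
    stages: List[Callable[[str], bool]] = [
        source_level_sim,
        instruction_level_sim,
        rtl_sim,
    ]
    # all() evaluates lazily, so the sequence short-circuits on failure.
    return all(stage(model_sw) for stage in stages)

hardware_mapping_flow("model.sw")
```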
Abstract: Provided is a method and system for weight memory mapping for a streaming operation of giant generative artificial intelligence hardware. A weight memory mapping system may include a weight memory configured to store a weight matrix for a pretrained artificial intelligence model; an input register configured to store a plurality of input data; a first hardware operator configured to process a matrix multiplication operation between the plurality of input data and the weight matrix and to compute a lane-level final sum while the matrix multiplication operation is in progress by reusing a partial sum of the operation; and a second hardware operator configured to preprocess the next matrix multiplication operation, using the final sum, while the current matrix multiplication operation is in progress.
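A minimal sketch of the overlap described above, assuming a tile size of 64 and taking a memory-layout pass as a stand-in for the preprocessing step (both are assumptions): the first operator keeps the running partial sum in place so the lane-level final sum is ready as the last tile lands, and the second operator stages the next matrix in the meantime.

```python
# Streaming matmul with partial-sum reuse plus overlapped staging of the
# next operation. TILE and preprocess_next() are illustrative assumptions.
import numpy as np

TILE = 64

def stream_matmul(x, W):
    """First operator: tile-wise matmul that reuses the running partial sum."""
    partial = np.zeros(W.shape[1])
    for i in range(0, len(x), TILE):
        # The partial sum stays resident instead of being written back and
        # re-read, so the final sum is available as the last tile arrives.
        partial += x[i:i+TILE] @ W[i:i+TILE, :]
    return partial

def preprocess_next(W_next):
    """Second operator: stage the next weight matrix while the matmul runs."""
    return np.ascontiguousarray(W_next)  # stand-in for layout/mapping work

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
W_cur = rng.standard_normal((256, 128))
W_next = rng.standard_normal((128, 64))

y = stream_matmul(x, W_cur)          # final sum computed during streaming
W_staged = preprocess_next(W_next)   # overlapped in hardware, sequential here
z = stream_matmul(y, W_staged)       # next operation consumes the final sum
assert np.allclose(z, (x @ W_cur) @ W_next)
```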
Abstract: Provided is a method and system for verifying an operation and data precision of generative giant artificial intelligence hardware. A verification method may include receiving, by at least one processor, target device information, a model instruction, and a model parameter related to an artificial intelligence (AI) model; constructing a simulator corresponding to real hardware based on the target device information; processing an operation between the model instruction and the model parameter through the constructed simulator; and storing a processing result of the operation in a memory module included in the simulator. Here, the at least one processor may include a CPU and a GPU, and the constructing of the simulator may include constructing a first simulator that uses the CPU in response to a high-precision mode being selected and constructing a second simulator that uses both the CPU and the GPU in response to a low-latency mode being selected.
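A hedged sketch of the mode-dependent construction step. The class names, the Mode enum, and the string stand-ins for real computation are illustrative assumptions, not the patented interfaces.

```python
# Simulator selection: CPU-only for the high-precision mode, CPU+GPU for
# the low-latency mode. Names and internals are assumed for illustration.
from enum import Enum, auto

class Mode(Enum):
    HIGH_PRECISION = auto()
    LOW_LATENCY = auto()

class CpuSimulator:
    """First simulator: high-precision reference path on the CPU."""
    def run(self, instruction, parameter):
        result = f"cpu({instruction}, {parameter})"
        self.memory = result   # store the result in the simulator's memory module
        return result

class CpuGpuSimulator(CpuSimulator):
    """Second simulator: offloads the heavy operations to the GPU."""
    def run(self, instruction, parameter):
        result = f"gpu({instruction}, {parameter})"
        self.memory = result
        return result

def build_simulator(target_device_info: dict, mode: Mode):
    # target_device_info would size memories, lanes, etc.; unused in this sketch
    if mode is Mode.HIGH_PRECISION:
        return CpuSimulator()
    return CpuGpuSimulator()

sim = build_simulator({"device": "lpu"}, Mode.LOW_LATENCY)
print(sim.run("matmul", "weights.bin"))
```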
Abstract: Provided is a latency processing unit. The latency processing unit may include a plurality of multiplier-accumulator (MAC) trees configured to perform a matrix product operation for at least one of a plurality of partitions that implement an artificial intelligence (AI) model; a streamlined memory access unit configured to connect each of the plurality of MAC trees, through a plurality of channels, to the high-bandwidth memory in which the at least one partition is stored; a vector execution engine configured to perform an additional operation on the operation results of the plurality of MAC trees; a local memory unit configured to store the operation results of the vector execution engine and an activation value; and an instruction scheduling unit configured to schedule the operations of the plurality of MAC trees and the vector execution engine.
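A structural sketch of how those blocks could fit together, with NumPy standing in for the hardware datapath. All class and member names are illustrative assumptions; the abstract fixes only the block names and their roles.

```python
# Latency processing unit blocks: MAC trees over weight partitions, a vector
# execution engine for post-processing, a local memory for activations, and
# a step() method standing in for the instruction scheduling unit.
import numpy as np

class MacTree:
    """Matrix-product operator for one weight partition, fed (in hardware)
    from its own high-bandwidth-memory channel via streamlined access."""
    def __init__(self, weight_partition):
        self.w = weight_partition
    def matmul(self, x):
        return x @ self.w

class VectorExecutionEngine:
    """Additional operation on MAC-tree results (ReLU as a stand-in)."""
    def execute(self, y):
        return np.maximum(y, 0.0)

class Lpu:
    def __init__(self, partitions):
        self.mac_trees = [MacTree(p) for p in partitions]  # one per partition
        self.vee = VectorExecutionEngine()
        self.local_memory = {}
    def step(self, x):
        # Instruction scheduling unit: sequence the MAC trees, then the
        # vector engine, then the write to the local memory unit.
        y = np.concatenate([t.matmul(x) for t in self.mac_trees])
        act = self.vee.execute(y)
        self.local_memory["activation"] = act  # store the activation value
        return act

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))
lpu = Lpu(np.split(W, 4, axis=1))   # column partitions of the model weights
x = rng.standard_normal(256)
print(lpu.step(x).shape)            # (128,)
```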