Patents by Inventor Byung-Gon Chun

Byung-Gon Chun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12197278
    Abstract: A system with fault recovery includes: a plurality of worker nodes configured to perform distributed training; and a master node configured to control the plurality of worker nodes, wherein the master node is configured to: detect a fault of the plurality of worker nodes based on a predetermined period; adjust a collective communication participant list in response to the detecting of the fault; and transmit the adjusted participant list to one or more worker nodes in the adjusted participant list.
    Type: Grant
    Filed: September 28, 2022
    Date of Patent: January 14, 2025
    Assignees: Samsung Electronics Co., Ltd., Seoul National University R&DB Foundation
    Inventors: Yongdeok Kim, Kyung Geun Lee, Jeong Yoon Eo, Byung Gon Chun, Ahn Jae Shin
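
A minimal, dependency-free Python sketch of the fault-recovery loop this abstract describes. The Worker and Master classes, the heartbeat check, and the broadcast step are hypothetical stand-ins for the patented system's RPC and collective-communication machinery.

```python
class Worker:
    """Hypothetical worker that reports liveness to the master."""
    def __init__(self, rank):
        self.rank = rank
        self.alive = True

    def heartbeat(self):
        return self.alive


class Master:
    """Periodically polls workers; on a fault, rebuilds the collective-
    communication participant list and pushes it to the survivors."""
    def __init__(self, workers):
        self.workers = workers
        self.participants = [w.rank for w in workers]

    def detect_and_recover(self):
        healthy = [w for w in self.workers if w.heartbeat()]
        if len(healthy) != len(self.participants):
            # Fault detected: adjust the participant list ...
            self.participants = [w.rank for w in healthy]
            # ... and transmit it to the workers that remain in it.
            for w in healthy:
                print(f"rank {w.rank} <- participants {self.participants}")


workers = [Worker(r) for r in range(4)]
master = Master(workers)
workers[2].alive = False     # simulate a worker fault
master.detect_and_recover()  # participant list shrinks to [0, 1, 3]
```
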
  • Publication number: 20240403722
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: January 29, 2024
    Publication date: December 5, 2024
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
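
A minimal sketch of the selective-batching idea in this abstract, using only NumPy: token-wise operations run once over a flattened batch of variable-length requests, while the shape-sensitive attention operation runs per request. The single-head attention, dimensions, and weights here are illustrative assumptions, not the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                    # toy hidden dimension
W = rng.standard_normal((d, d))          # weight of a "batchable" linear op

# Three requests with different token counts -- no padding anywhere.
requests = [rng.standard_normal((n, d)) for n in (3, 5, 2)]

# Batchable operations (linear layers, activations, ...) run once over
# the flattened token dimension, regardless of per-request lengths.
flat = np.concatenate(requests, axis=0)          # shape (3+5+2, d)
flat = flat @ W                                  # one large matmul

# The attention operation depends on each request's own length, so it
# is processed individually per request.
outputs, offset = [], 0
for x in requests:
    h = flat[offset:offset + len(x)]
    scores = h @ h.T / np.sqrt(d)                # (n, n) differs per request
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)   # row-wise softmax
    outputs.append(probs @ h)
    offset += len(x)
```

The workaround the abstract argues against would pad every request to one common length so attention could also be batched, at the cost of wasted computation on the padding.
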
  • Publication number: 20240354224
    Abstract: According to example embodiments, a method and an apparatus are provided for finding consensus bugs using multi-transaction differential fuzzing.
    Type: Application
    Filed: July 12, 2022
    Publication date: October 24, 2024
    Inventors: Youngseok Yang, Byung-Gon Chun, Taesoo Kim
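
The abstract is a single sentence, so the sketch below leans on the title: differential fuzzing runs the same randomly generated multi-transaction sequence through two implementations of the same state-transition rules and flags any divergence as a candidate consensus bug. The toy transfer ledger and the planted bug are hypothetical, not the patented method.

```python
import random

def client_a(state, tx):
    """Reference implementation of a toy transfer transaction."""
    sender, receiver, amount = tx
    if state.get(sender, 0) >= amount:           # balance check
        state[sender] = state.get(sender, 0) - amount
        state[receiver] = state.get(receiver, 0) + amount
    return state

def client_b(state, tx):
    """Second implementation with a planted consensus bug: it omits
    the sender balance check."""
    sender, receiver, amount = tx
    state[sender] = state.get(sender, 0) - amount
    state[receiver] = state.get(receiver, 0) + amount
    return state

random.seed(7)
for trial in range(100):
    # Multi-transaction fuzzing: each test case is a whole random
    # sequence, since some divergences only surface across several
    # dependent transactions.
    txs = [(random.choice("abc"), random.choice("abc"),
            random.randint(1, 50)) for _ in range(5)]
    sa, sb = {"a": 100}, {"a": 100}
    for tx in txs:
        sa, sb = client_a(sa, tx), client_b(sb, tx)
    if sa != sb:                                  # differential oracle
        print(f"divergence on trial {trial}: {txs}")
        break
```
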
  • Patent number: 12124882
    Abstract: Disclosed are a method and an electronic apparatus including an accelerator for lightweight and parallel accelerator task scheduling. The method includes pre-running a deep learning model with sample input data having a preset data form and generating a scheduling result through the pre-running.
    Type: Grant
    Filed: November 12, 2021
    Date of Patent: October 22, 2024
    Assignee: Seoul National University R&DB Foundation
    Inventors: Byung-Gon Chun, Gyeongin Yu, Woosuk Kwon
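
A rough sketch of the pre-running idea: execute the model once on sample input of the preset form, record the resulting task launch order as the scheduling result, and replay that cached schedule on real inputs so per-step scheduling work disappears from the hot path. The task graph and names are hypothetical.

```python
# Hypothetical model: accelerator tasks with their dependencies.
TASKS = {"conv": [], "bn": ["conv"], "relu": ["bn"], "pool": ["relu"]}

def pre_run(tasks, sample_input):
    """Pre-run with a sample input of the preset data form and record a
    topological launch order (the cached scheduling result). The sample
    input is unused in this toy; it stands in for the preset data form."""
    order, done = [], set()
    while len(order) < len(tasks):
        for name, deps in tasks.items():
            if name not in done and all(d in done for d in deps):
                order.append(name)
                done.add(name)
    return order

schedule = pre_run(TASKS, sample_input=[0.0] * 16)
for step in schedule:        # replayed unchanged on every real input
    print("launch", step)
```
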
  • Publication number: 20240232634
    Abstract: Solutions for adapting machine learning (ML) models to neural networks (NNs) include receiving an ML pipeline comprising a plurality of operators; determining operator dependencies within the ML pipeline; determining recognized operators; for each of at least two recognized operators, selecting a corresponding NN module from a translation dictionary; and wiring the selected NN modules in accordance with the operator dependencies to generate a translated NN. Some examples determine a starting operator for translation, which is the earliest recognized operator having parameters. Some examples connect inputs of the translated NN to upstream operators of the ML pipeline that had not been translated. Some examples further tune the translated NN using backpropagation. Some examples determine whether an operator is trainable or non-trainable and flag related parameters accordingly for later training.
    Type: Application
    Filed: January 25, 2024
    Publication date: July 11, 2024
    Inventors: Matteo INTERLANDI, Byung-Gon CHUN, Markus WEIMER, Gyeongin YU, Saeed AMIZADEH
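
The translation flow can be sketched in a few lines of Python: a translation dictionary maps each recognized operator to a stand-in NN module, and the selected modules are wired according to the pipeline's operator dependencies (a simple linear chain in this toy). The scaler/linreg operators and their parameters are illustrative assumptions.

```python
# Hypothetical translation dictionary: recognized ML operators map to
# callables standing in for equivalent neural-network modules.
TRANSLATION = {
    "scaler": lambda p: (lambda xs: [(x - p["mean"]) / p["std"] for x in xs]),
    "linreg": lambda p: (lambda xs: sum(w * x for w, x in zip(p["w"], xs)) + p["b"]),
}

# An ML pipeline as (operator, params) stages; dependencies here are
# simply the order of the list.
pipeline = [
    ("scaler", {"mean": 2.0, "std": 0.5}),
    ("linreg", {"w": [1.0, -1.0, 0.5], "b": 0.1}),
]

def translate(pipeline):
    # Select a module for each recognized operator ...
    modules = [TRANSLATION[op](params) for op, params in pipeline
               if op in TRANSLATION]
    # ... and wire them per the operator dependencies.
    def translated_nn(xs):
        for module in modules:
            xs = module(xs)
        return xs
    return translated_nn

nn = translate(pipeline)
print(nn([1.0, 2.0, 3.0]))   # -0.9
```
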
  • Publication number: 20240231902
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: August 21, 2023
    Publication date: July 11, 2024
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Patent number: 11934930
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: October 19, 2022
    Date of Patent: March 19, 2024
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Patent number: 11922282
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: September 19, 2022
    Date of Patent: March 5, 2024
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Patent number: 11922315
    Abstract: Solutions for adapting machine learning (ML) models to neural networks (NNs) include receiving an ML pipeline comprising a plurality of operators; determining operator dependencies within the ML pipeline; determining recognized operators; for each of at least two recognized operators, selecting a corresponding NN module from a translation dictionary; and wiring the selected NN modules in accordance with the operator dependencies to generate a translated NN. Some examples determine a starting operator for translation, which is the earliest recognized operator having parameters. Some examples connect inputs of the translated NN to upstream operators of the ML pipeline that had not been translated. Some examples further tune the translated NN using backpropagation. Some examples determine whether an operator is trainable or non-trainable and flag related parameters accordingly for later training.
    Type: Grant
    Filed: August 26, 2019
    Date of Patent: March 5, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Matteo Interlandi, Byung-Gon Chun, Markus Weimer, Gyeongin Yu, Saeed Amizadeh
  • Publication number: 20230401091
    Abstract: A single terminal schedules the processing of requests from a plurality of applications using heterogeneous processors. The single terminal may include an analysis unit partitioning a request from an application into units and generating at least one subgraph, a profiling unit predicting an operation execution time for at least one frequency of at least one processor capable of processing the subgraph, and a scheduler performing scheduling based on a request from the application and the operation execution time.
    Type: Application
    Filed: February 9, 2023
    Publication date: December 14, 2023
    Applicant: Seoul National University R&DB Foundation
    Inventors: Youngki Lee, Changjin Jeong, Jingyu Lee, Changmin Jeon, Joo Seong Jeong, Byung-Gon Chun, Donghyun Kim
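
A toy version of the profiling-plus-scheduling split: given predicted execution times for each subgraph on each (processor, frequency) pair, as the profiling unit would produce, a greedy scheduler maps each subgraph to its fastest option. The processor names, frequencies, and timings are hypothetical.

```python
# Hypothetical profile: predicted execution time (ms) per subgraph on
# each (processor, frequency-in-Hz) pair.
profile = {
    "subgraph0": {("cpu", 1.8e9): 12.0, ("gpu", 0.6e9): 4.0, ("npu", 0.9e9): 3.5},
    "subgraph1": {("cpu", 1.8e9):  2.0, ("gpu", 0.6e9): 5.0, ("npu", 0.9e9): 6.0},
}

def schedule(subgraphs, profile):
    """Greedy scheduler: pick the processor/frequency pair with the
    smallest predicted execution time for each subgraph."""
    return {sg: min(profile[sg], key=profile[sg].get) for sg in subgraphs}

plan = schedule(["subgraph0", "subgraph1"], profile)
print(plan)   # subgraph0 -> npu, subgraph1 -> cpu
```
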
  • Patent number: 11836520
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: August 4, 2022
    Date of Patent: December 5, 2023
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20230177399
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: September 19, 2022
    Publication date: June 8, 2023
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20230176903
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: August 4, 2022
    Publication date: June 8, 2023
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20230177401
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: October 19, 2022
    Publication date: June 8, 2023
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20230139091
    Abstract: A system with fault recovery includes: a plurality of worker nodes configured to perform distributed training; and a master node configured to control the plurality of worker nodes, wherein the master node is configured to: detect a fault of the plurality of worker nodes based on a predetermined period; adjust a collective communication participant list in response to the detecting of the fault; and transmit the adjusted participant list to one or more worker nodes in the adjusted participant list.
    Type: Application
    Filed: September 28, 2022
    Publication date: May 4, 2023
    Applicants: Samsung Electronics Co., Ltd., Seoul National University R&DB Foundation
    Inventors: Yongdeok Kim, Kyung Geun Lee, Jeong Yoon Eo, Byung Gon Chun, Ahn Jae Shin
  • Publication number: 20230118829
    Abstract: Disclosed is a method of executing deep learning programs. The method includes generating a symbolic graph corresponding to an imperative deep learning program, dividing the imperative deep learning program into a first portion related to a deep learning computation and a second portion not related to the deep learning computation, and performing a computation on the first portion using a graph runner and simultaneously performing a computation on the second portion using a language runner.
    Type: Application
    Filed: September 28, 2022
    Publication date: April 20, 2023
    Applicant: Seoul National University R&DB Foundation
    Inventors: Tae Bum Kim, Byung Gon Chun, Geon Woo Kim, Yun Mo Koo, Gyeong In Yu, Eun Ji Jeong, Se Hoon Kim
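
One way to picture the graph-runner/language-runner split is with two concurrent workers: the deep learning portion of the program is handed to an asynchronous graph executor while ordinary imperative code keeps running. This thread-based sketch is an illustrative stand-in, not the disclosed runtime.

```python
import threading
import queue
import time

def graph_runner(requests, results):
    """Stands in for the symbolic-graph executor: consumes deep learning
    computations and runs them asynchronously."""
    while True:
        name = requests.get()
        if name is None:            # shutdown sentinel
            break
        time.sleep(0.01)            # pretend to execute the graph
        results[name] = f"tensor({name})"

requests, results = queue.Queue(), {}
runner = threading.Thread(target=graph_runner, args=(requests, results))
runner.start()

# Language runner: hand the DL portion to the graph runner, then keep
# executing the non-DL portion of the program concurrently.
requests.put("forward")
log = [f"imperative step {i}" for i in range(3)]

requests.put(None)
runner.join()
print(log, results)
```
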
  • Patent number: 11514370
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: December 3, 2021
    Date of Patent: November 29, 2022
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Patent number: 11442775
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: December 3, 2021
    Date of Patent: September 13, 2022
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20220237040
    Abstract: An accelerator resource management method and apparatus are disclosed. The accelerator resource management method includes receiving a task request for a neural network-related task and a resource scheduling policy for the neural network-related task, obtaining information on a current resource utilization status of an accelerator cluster comprising a plurality of accelerators, in response to the task request, and allocating an accelerator resource for performing the task based on a utility of a resource allocation that is based on the resource scheduling policy and the information.
    Type: Application
    Filed: July 12, 2021
    Publication date: July 28, 2022
    Applicants: Samsung Electronics Co., Ltd., SNU R&DB Foundation
    Inventors: Sanggyu Shin, Soojeong Kim, Byung-Gon Chun, Kyunggeun Lee
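
A compact sketch of utility-based allocation: enumerate candidate allocations over the cluster's free accelerators and pick the one that maximizes a policy-dependent utility. The cluster state, candidate generation, and utility function are toy assumptions standing in for the disclosed method.

```python
# Hypothetical cluster state: free accelerators per node.
cluster = {"node0": 4, "node1": 2}

def utility(alloc, request, policy):
    """Toy utility: reward satisfying the request; under a locality-
    preferring policy, penalize spanning extra nodes."""
    satisfied = min(sum(alloc.values()), request)
    penalty = (len(alloc) - 1) if policy == "prefer-locality" else 0
    return satisfied - penalty

def candidates(request, cluster):
    # Single-node candidates where the request fits whole ...
    out = [{node: request} for node, free in cluster.items() if free >= request]
    # ... plus one greedy multi-node spread.
    alloc, need = {}, request
    for node, free in cluster.items():
        take = min(free, need)
        if take:
            alloc[node], need = take, need - take
    out.append(alloc)
    return out

def allocate(request, cluster, policy):
    return max(candidates(request, cluster),
               key=lambda a: utility(a, request, policy))

print(allocate(2, cluster, "prefer-locality"))   # {'node0': 2}
print(allocate(5, cluster, "prefer-locality"))   # {'node0': 4, 'node1': 1}
```
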
  • Publication number: 20220147398
    Abstract: Disclosed are a method and an electronic apparatus including an accelerator for lightweight and parallel accelerator task scheduling. The method includes pre-running a deep learning model with sample input data having a preset data form and generating a scheduling result through the pre-running.
    Type: Application
    Filed: November 12, 2021
    Publication date: May 12, 2022
    Inventors: Byung-Gon Chun, Gyeongin Yu, Woosuk Kwon