Patents by Inventor Byung-Gon Chun
Byung-Gon Chun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12197278
Abstract: A system with fault recovery includes: a plurality of worker nodes configured to perform distributed training; and a master node configured to control the plurality of worker nodes, wherein the master node is configured to: detect a fault of the plurality of worker nodes based on a predetermined period; adjust a collective communication participant list in response to the detecting of the fault; and transmit the adjusted participant list to one or more worker nodes in the adjusted participant list.
Type: Grant
Filed: September 28, 2022
Date of Patent: January 14, 2025
Assignees: Samsung Electronics Co., Ltd., Seoul National University R&DB Foundation
Inventors: Yongdeok Kim, Kyung Geun Lee, Jeong Yoon Eo, Byung Gon Chun, Ahn Jae Shin
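The fault-recovery flow in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Master`/`Worker` classes, the liveness flag, and the periodic-round method are all hypothetical stand-ins for real heartbeat-based fault detection.

```python
class Worker:
    def __init__(self, rank):
        self.rank = rank
        self.alive = True
        self.participants = []

    def receive_participants(self, participants):
        # The worker adopts the adjusted participant list for the next
        # collective-communication round (e.g. all-reduce).
        self.participants = participants


class Master:
    def __init__(self, workers, period_s=5.0):
        self.workers = workers
        self.period_s = period_s  # predetermined fault-detection period

    def detect_faults(self):
        # In practice this would use heartbeats or timeouts; here we
        # simply read a liveness flag.
        return [w for w in self.workers if not w.alive]

    def run_detection_round(self):
        faulty = self.detect_faults()
        if faulty:
            # Adjust the participant list to exclude faulty workers, then
            # transmit it only to the workers remaining in the list.
            participants = [w.rank for w in self.workers if w.alive]
            for w in self.workers:
                if w.alive:
                    w.receive_participants(participants)
        return faulty
```

With four workers and worker 2 marked dead, one detection round leaves the surviving workers holding the list `[0, 1, 3]` while the failed worker receives nothing.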
-
Publication number: 20240403722
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Application
Filed: January 29, 2024
Publication date: December 5, 2024
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
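The selective-batching idea recurs in several of the entries below, so a toy sketch may help: per-token operations are run once over all requests' tokens flattened into a single batch, while the attention-like operation runs per request on its own tokens, so variable-length inputs never need padding to a common length. The two "ops" here are trivial stand-ins for real transformer kernels, purely for illustration.

```python
def batched_elementwise(flat_tokens):
    # A per-token op (e.g. a linear layer) can ignore request
    # boundaries and process the flattened token stream in one go.
    return [t * 2 for t in flat_tokens]

def per_request_attention(tokens):
    # Stand-in for attention: each output depends on the request's own
    # preceding tokens (here, a running sum over the request).
    out, acc = [], 0
    for t in tokens:
        acc += t
        out.append(acc)
    return out

def selective_batch_step(requests):
    # 1) Flatten variable-length requests and run the batchable op once.
    lengths = [len(r) for r in requests]
    flat = batched_elementwise([t for r in requests for t in r])
    # 2) Split back per request and run the non-batchable op individually.
    out, i = [], 0
    for n in lengths:
        out.append(per_request_attention(flat[i:i + n]))
        i += n
    return out
```

Note that the batched step sees a single flat tensor of total length 4 for requests of lengths 3 and 1, with no padding anywhere.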
-
Publication number: 20240354224
Abstract: According to example embodiments, provided are a method and an apparatus for finding consensus bugs using multi-transaction differential fuzzing.
Type: Application
Filed: July 12, 2022
Publication date: October 24, 2024
Inventors: Youngseok Yang, Byung-Gon CHUN, Taesoo KIM
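The abstract gives little detail, but differential fuzzing in general can be sketched: the same randomly generated transaction sequence is executed on two independent implementations, and any divergence in the resulting state flags a candidate consensus bug. The two toy "clients" below are hypothetical and much simpler than real blockchain clients; the buggy one omits a balance check.

```python
import random

class ReferenceClient:
    def __init__(self):
        self.balances = {}

    def apply(self, tx):
        sender, receiver, amount = tx
        # Correct behavior: reject transfers exceeding the balance.
        if self.balances.get(sender, 0) >= amount:
            self.balances[sender] = self.balances.get(sender, 0) - amount
            self.balances[receiver] = self.balances.get(receiver, 0) + amount

class BuggyClient(ReferenceClient):
    def apply(self, tx):
        sender, receiver, amount = tx
        # Bug: missing balance check, so state can diverge.
        self.balances[sender] = self.balances.get(sender, 0) - amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount

def differential_fuzz(client_a, client_b, rounds=100, seed=0):
    rng = random.Random(seed)
    accounts = ["a", "b", "c"]
    for _ in range(rounds):
        tx = (rng.choice(accounts), rng.choice(accounts), rng.randint(1, 10))
        client_a.apply(tx)
        client_b.apply(tx)
        if client_a.balances != client_b.balances:
            return tx  # divergence found: candidate consensus bug
    return None
```

Fuzzing a correct client against itself finds nothing; fuzzing it against the buggy client surfaces a diverging transaction quickly.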
-
Patent number: 12124882
Abstract: Disclosed are a method and an electronic apparatus including an accelerator for lightweight and parallel accelerator task scheduling. The method includes pre-running a deep learning model with sample input data having a preset data form and generating a scheduling result through the pre-running.
Type: Grant
Filed: November 12, 2021
Date of Patent: October 22, 2024
Assignee: Seoul National University R&DB Foundation
Inventors: Byung-Gon Chun, Gyeongin Yu, Woosuk Kwon
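The pre-running idea can be illustrated with a minimal sketch: execute the model once on sample input of the preset form, record the order of accelerator tasks as the scheduling result, then replay that schedule for real inputs without re-deriving it. The class and its `pre_run`/`run` methods are illustrative names, not the patent's API.

```python
class PreRunScheduler:
    def __init__(self, model_ops):
        self.model_ops = model_ops  # list of (name, fn) pairs
        self.schedule = None

    def pre_run(self, sample_input):
        # Run once with sample data of the preset form, recording the
        # task order as a reusable scheduling result.
        schedule, x = [], sample_input
        for name, fn in self.model_ops:
            schedule.append(name)
            x = fn(x)
        self.schedule = schedule
        return schedule

    def run(self, real_input):
        # Replay the recorded schedule on real input, skipping any
        # scheduling work at execution time.
        ops = dict(self.model_ops)
        x = real_input
        for name in self.schedule:
            x = ops[name](x)
        return x
```

The point of the pre-run is that the (potentially expensive) schedule derivation happens once, off the critical path of real requests.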
-
Publication number: 20240232634
Abstract: Solutions for adapting machine learning (ML) models to neural networks (NNs) include receiving an ML pipeline comprising a plurality of operators; determining operator dependencies within the ML pipeline; determining recognized operators; for each of at least two recognized operators, selecting a corresponding NN module from a translation dictionary; and wiring the selected NN modules in accordance with the operator dependencies to generate a translated NN. Some examples determine a starting operator for translation, which is the earliest recognized operator having parameters. Some examples connect inputs of the translated NN to upstream operators of the ML pipeline that had not been translated. Some examples further tune the translated NN using backpropagation. Some examples determine whether an operator is trainable or non-trainable and flag related parameters accordingly for later training.
Type: Application
Filed: January 25, 2024
Publication date: July 11, 2024
Inventors: Matteo INTERLANDI, Byung-Gon CHUN, Markus WEIMER, Gyeongin YU, Saeed AMIZADEH
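The translation flow described here can be sketched at a toy scale: walk the pipeline's operators in dependency order, look each recognized one up in a translation dictionary that maps it to an NN-module factory, and wire the resulting modules together; unrecognized operators stay upstream of the translated NN. The dictionary contents and the use of plain callables as "modules" are illustrative assumptions, not the patent's representation.

```python
# Hypothetical translation dictionary: ML operator name -> NN-module factory.
TRANSLATION_DICT = {
    "scaler": lambda: (lambda x: [v / 10 for v in x]),
    "linear": lambda: (lambda x: [2 * v + 1 for v in x]),
}

def translate_pipeline(operators):
    """operators: operator names in dependency (topological) order."""
    modules, untranslated = [], []
    for op in operators:
        if op in TRANSLATION_DICT:
            # Recognized operator: select the corresponding NN module.
            modules.append(TRANSLATION_DICT[op]())
        else:
            # Not translated: remains an upstream stage feeding the NN.
            untranslated.append(op)

    def translated_nn(x):
        # Wire the selected modules in dependency order.
        for m in modules:
            x = m(x)
        return x

    return translated_nn, untranslated
```

A pipeline `["loader", "scaler", "linear"]` thus yields a two-module NN fed by the untranslated `loader` stage.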
-
Publication number: 20240231902
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Application
Filed: August 21, 2023
Publication date: July 11, 2024
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Patent number: 11934930
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Grant
Filed: October 19, 2022
Date of Patent: March 19, 2024
Assignee: FRIENDLIAI INC.
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Patent number: 11922282
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Grant
Filed: September 19, 2022
Date of Patent: March 5, 2024
Assignee: FRIENDLIAI INC.
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Patent number: 11922315
Abstract: Solutions for adapting machine learning (ML) models to neural networks (NNs) include receiving an ML pipeline comprising a plurality of operators; determining operator dependencies within the ML pipeline; determining recognized operators; for each of at least two recognized operators, selecting a corresponding NN module from a translation dictionary; and wiring the selected NN modules in accordance with the operator dependencies to generate a translated NN. Some examples determine a starting operator for translation, which is the earliest recognized operator having parameters. Some examples connect inputs of the translated NN to upstream operators of the ML pipeline that had not been translated. Some examples further tune the translated NN using backpropagation. Some examples determine whether an operator is trainable or non-trainable and flag related parameters accordingly for later training.
Type: Grant
Filed: August 26, 2019
Date of Patent: March 5, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Matteo Interlandi, Byung-Gon Chun, Markus Weimer, Gyeongin Yu, Saeed Amizadeh
-
Publication number: 20230401091
Abstract: A single terminal performs scheduling of processing a request from a plurality of applications by using heterogeneous processors. The single terminal may include an analysis unit partitioning a request from an application into units and generating at least one subgraph, a profiling unit predicting an operation execution time for at least one frequency of at least one processor capable of processing the subgraph, and a scheduler performing scheduling based on a request from the application and the operation execution time.
Type: Application
Filed: February 9, 2023
Publication date: December 14, 2023
Applicant: SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION
Inventors: Youngki LEE, Changjin JEONG, Jingyu LEE, Changmin JEON, Joo Seong JEONG, Byung-Gon CHUN, Donghyun KIM
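The analysis/profiling/scheduling split described here can be sketched minimally: after a request is partitioned into subgraphs, a profile predicts an execution time for each (processor, frequency) option, and the scheduler picks the best option per subgraph. The profile table, the subgraph names, and the "pick the fastest" objective are all made-up illustrations; a real scheduler would also weigh energy, contention, and deadlines.

```python
# Hypothetical profile: (subgraph, processor, frequency) -> predicted ms.
PROFILE = {
    ("conv", "cpu", "high"): 12.0,
    ("conv", "gpu", "high"): 4.0,
    ("conv", "dsp", "low"): 9.0,
    ("fc", "cpu", "high"): 3.0,
    ("fc", "gpu", "high"): 5.0,
}

def schedule(subgraphs):
    """Map each subgraph to its fastest (processor, frequency) option."""
    plan = {}
    for sg in subgraphs:
        options = {(proc, freq): t
                   for (g, proc, freq), t in PROFILE.items() if g == sg}
        # Scheduler decision: smallest predicted operation execution time.
        plan[sg] = min(options, key=options.get)
    return plan
```

With the toy profile above, the convolution subgraph lands on the GPU while the small fully connected subgraph stays on the CPU.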
-
Patent number: 11836520
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Grant
Filed: August 4, 2022
Date of Patent: December 5, 2023
Assignee: FRIENDLIAI INC.
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Publication number: 20230177399
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Application
Filed: September 19, 2022
Publication date: June 8, 2023
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Publication number: 20230176903
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Application
Filed: August 4, 2022
Publication date: June 8, 2023
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Publication number: 20230177401
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Application
Filed: October 19, 2022
Publication date: June 8, 2023
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Publication number: 20230139091
Abstract: A system with fault recovery includes: a plurality of worker nodes configured to perform distributed training; and a master node configured to control the plurality of worker nodes, wherein the master node is configured to: detect a fault of the plurality of worker nodes based on a predetermined period; adjust a collective communication participant list in response to the detecting of the fault; and transmit the adjusted participant list to one or more worker nodes in the adjusted participant list.
Type: Application
Filed: September 28, 2022
Publication date: May 4, 2023
Applicants: Samsung Electronics Co., Ltd., Seoul National University R&DB Foundation
Inventors: Yongdeok KIM, Kyung Geun LEE, Jeong Yoon EO, Byung Gon CHUN, Ahn Jae SHIN
-
Publication number: 20230118829
Abstract: Disclosed is a method of executing deep learning programs. The method includes generating a symbolic graph corresponding to an imperative deep learning program, dividing the imperative deep learning program into a first portion related to a deep learning computation and a second portion not related to the deep learning computation, and performing a computation on the first portion using a graph runner and simultaneously performing a computation on the second portion using a language runner.
Type: Application
Filed: September 28, 2022
Publication date: April 20, 2023
Applicant: Seoul National University R&DB Foundation
Inventors: Tae Bum KIM, Byung Gon CHUN, Geon Woo KIM, Yun Mo KOO, Gyeong In YU, Eun Ji JEONG, Se Hoon KIM
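The two-runner execution described here can be sketched with threads standing in for the real graph and language runners: statements are split into a deep-learning portion and a non-DL portion, and the two portions execute concurrently. The `("dl" | "py", fn)` tagging of statements is purely an illustrative device, not how the actual system partitions a program.

```python
import threading

def execute(program):
    """program: list of ("dl" | "py", fn) statements."""
    dl_part = [fn for kind, fn in program if kind == "dl"]
    py_part = [fn for kind, fn in program if kind == "py"]
    results = {"dl": [], "py": []}

    def graph_runner():
        # Deep-learning portion: would execute the symbolic graph.
        for fn in dl_part:
            results["dl"].append(fn())

    def language_runner():
        # Non-DL portion: ordinary host-language statements.
        for fn in py_part:
            results["py"].append(fn())

    # Run both portions simultaneously, as the abstract describes.
    t1 = threading.Thread(target=graph_runner)
    t2 = threading.Thread(target=language_runner)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results
```

Within each runner the statement order is preserved, while the two runners overlap in time, which is the source of the speedup such hybrid execution aims for.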
-
Patent number: 11514370
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Grant
Filed: December 3, 2021
Date of Patent: November 29, 2022
Assignee: FriendliAI Inc.
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Patent number: 11442775
Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
Type: Grant
Filed: December 3, 2021
Date of Patent: September 13, 2022
Assignee: FriendliAI Inc.
Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
-
Publication number: 20220237040
Abstract: An accelerator resource management method and apparatus are disclosed. The accelerator resource management method includes receiving a task request for a neural network-related task and a resource scheduling policy for the neural network-related task, obtaining information on a current resource utilization status of an accelerator cluster comprising a plurality of accelerators, in response to the task request, and allocating an accelerator resource for performing the task based on a utility of a resource allocation that is based on the resource scheduling policy and the information.
Type: Application
Filed: July 12, 2021
Publication date: July 28, 2022
Applicants: Samsung Electronics Co., Ltd., SNU R&DB FOUNDATION
Inventors: Sanggyu SHIN, Soojeong KIM, Byung-Gon CHUN, Kyunggeun LEE
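The utility-based allocation described in this abstract can be sketched minimally: given the cluster's current free capacity and the task's scheduling policy, each feasible node is scored and the best one chosen. The two policies shown ("pack" and "spread") and their utility definitions are illustrative assumptions, not policies named in the publication.

```python
def allocate(free_per_node, demand, policy):
    """free_per_node: {node: free accelerators}; demand: count needed."""
    # Feasible allocations given current resource utilization status.
    candidates = [n for n, free in free_per_node.items() if free >= demand]
    if not candidates:
        return None
    if policy == "pack":
        # Utility favors the node with the least spare capacity
        # (bin-packing: keep large nodes free for large tasks).
        return min(candidates, key=lambda n: free_per_node[n])
    if policy == "spread":
        # Utility favors the node with the most spare capacity
        # (load balancing across the cluster).
        return max(candidates, key=lambda n: free_per_node[n])
    raise ValueError("unknown policy")
```

The same request can thus land on different nodes depending on the policy supplied with it, which is the point of making the policy part of the task request.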
-
Publication number: 20220147398
Abstract: Disclosed are a method and an electronic apparatus including an accelerator for lightweight and parallel accelerator task scheduling. The method includes pre-running a deep learning model with sample input data having a preset data form and generating a scheduling result through the pre-running.
Type: Application
Filed: November 12, 2021
Publication date: May 12, 2022
Inventors: Byung-Gon Chun, Gyeongin Yu, Woosuk Kwon