Patents by Inventor Byung-Gon Chun

Byung-Gon Chun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12197278
    Abstract: A system with fault recovery includes: a plurality of worker nodes configured to perform distributed training; and a master node configured to control the plurality of worker nodes, wherein the master node is configured to: detect a fault of the plurality of worker nodes based on a predetermined period; adjust a collective communication participant list in response to the detecting of the fault; and transmit the adjusted participant list to one or more worker nodes in the adjusted participant list.
    Type: Grant
    Filed: September 28, 2022
    Date of Patent: January 14, 2025
    Assignees: Samsung Electronics Co., Ltd., Seoul National University R&DB Foundation
    Inventors: Yongdeok Kim, Kyung Geun Lee, Jeong Yoon Eo, Byung Gon Chun, Ahn Jae Shin
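
A minimal, dependency-free Python sketch of the fault-recovery loop this abstract describes. The Worker and Master classes, the heartbeat check, and the broadcast step are hypothetical stand-ins for the patented system's RPC and collective-communication machinery.

```python
class Worker:
    """Hypothetical worker that reports liveness to the master."""
    def __init__(self, rank):
        self.rank = rank
        self.alive = True

    def heartbeat(self):
        return self.alive


class Master:
    """Periodically polls workers; on a fault, rebuilds the collective-
    communication participant list and pushes it to the survivors."""
    def __init__(self, workers):
        self.workers = workers
        self.participants = [w.rank for w in workers]

    def detect_and_recover(self):
        healthy = [w for w in self.workers if w.heartbeat()]
        if len(healthy) != len(self.participants):
            # Fault detected: adjust the participant list ...
            self.participants = [w.rank for w in healthy]
            # ... and transmit it to the workers that remain in it.
            for w in healthy:
                print(f"rank {w.rank} <- participants {self.participants}")


workers = [Worker(r) for r in range(4)]
master = Master(workers)
workers[2].alive = False     # simulate a worker fault
master.detect_and_recover()  # participant list shrinks to [0, 1, 3]
```
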
  • Publication number: 20240403722
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: January 29, 2024
    Publication date: December 5, 2024
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
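
A minimal sketch of the selective-batching idea in this abstract, using only NumPy: token-wise operations run once over a flattened batch of variable-length requests, while the shape-sensitive attention operation runs per request. The single-head attention, dimensions, and weights here are illustrative assumptions, not the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                    # toy hidden dimension
W = rng.standard_normal((d, d))          # weight of a "batchable" linear op

# Three requests with different token counts -- no padding anywhere.
requests = [rng.standard_normal((n, d)) for n in (3, 5, 2)]

# Batchable operations (linear layers, activations, ...) run once over
# the flattened token dimension, regardless of per-request lengths.
flat = np.concatenate(requests, axis=0)          # shape (3+5+2, d)
flat = flat @ W                                  # one large matmul

# The attention operation depends on each request's own length, so it
# is processed individually per request.
outputs, offset = [], 0
for x in requests:
    h = flat[offset:offset + len(x)]
    scores = h @ h.T / np.sqrt(d)                # (n, n) differs per request
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)   # row-wise softmax
    outputs.append(probs @ h)
    offset += len(x)
```

The workaround the abstract argues against would pad every request to one common length so attention could also be batched, at the cost of wasted computation on the padding.
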
  • Publication number: 20240354224
    Abstract: According to example embodiments, a method and an apparatus are provided for finding consensus bugs using multi-transaction differential fuzzing.
    Type: Application
    Filed: July 12, 2022
    Publication date: October 24, 2024
    Inventors: Youngseok Yang, Byung-Gon Chun, Taesoo Kim
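
The abstract is a single sentence, so the sketch below leans on the title: differential fuzzing runs the same randomly generated multi-transaction sequence through two implementations of the same state-transition rules and flags any divergence as a candidate consensus bug. The toy transfer ledger and the planted bug are hypothetical, not the patented method.

```python
import random

def client_a(state, tx):
    """Reference implementation of a toy transfer transaction."""
    sender, receiver, amount = tx
    if state.get(sender, 0) >= amount:           # balance check
        state[sender] = state.get(sender, 0) - amount
        state[receiver] = state.get(receiver, 0) + amount
    return state

def client_b(state, tx):
    """Second implementation with a planted consensus bug: it omits
    the sender balance check."""
    sender, receiver, amount = tx
    state[sender] = state.get(sender, 0) - amount
    state[receiver] = state.get(receiver, 0) + amount
    return state

random.seed(7)
for trial in range(100):
    # Multi-transaction fuzzing: each test case is a whole random
    # sequence, since some divergences only surface across several
    # dependent transactions.
    txs = [(random.choice("abc"), random.choice("abc"),
            random.randint(1, 50)) for _ in range(5)]
    sa, sb = {"a": 100}, {"a": 100}
    for tx in txs:
        sa, sb = client_a(sa, tx), client_b(sb, tx)
    if sa != sb:                                  # differential oracle
        print(f"divergence on trial {trial}: {txs}")
        break
```
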
  • Patent number: 12124882
    Abstract: Disclosed are a method and an electronic apparatus including an accelerator for lightweight and parallel accelerator task scheduling. The method includes pre-running a deep learning model with sample input data having a preset data form and generating a scheduling result through the pre-running.
    Type: Grant
    Filed: November 12, 2021
    Date of Patent: October 22, 2024
    Assignee: Seoul National University R&DB Foundation
    Inventors: Byung-Gon Chun, Gyeongin Yu, Woosuk Kwon
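
A rough sketch of the pre-running idea: execute the model once on sample input of the preset form, record the resulting task launch order as the scheduling result, and replay that cached schedule on real inputs so per-step scheduling work disappears from the hot path. The task graph and names are hypothetical.

```python
# Hypothetical model: accelerator tasks with their dependencies.
TASKS = {"conv": [], "bn": ["conv"], "relu": ["bn"], "pool": ["relu"]}

def pre_run(tasks, sample_input):
    """Pre-run with a sample input of the preset data form and record a
    topological launch order (the cached scheduling result). The sample
    input is unused in this toy; it stands in for the preset data form."""
    order, done = [], set()
    while len(order) < len(tasks):
        for name, deps in tasks.items():
            if name not in done and all(d in done for d in deps):
                order.append(name)
                done.add(name)
    return order

schedule = pre_run(TASKS, sample_input=[0.0] * 16)
for step in schedule:        # replayed unchanged on every real input
    print("launch", step)
```
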
  • Publication number: 20240232634
    Abstract: Solutions for adapting machine learning (ML) models to neural networks (NNs) include receiving an ML pipeline comprising a plurality of operators; determining operator dependencies within the ML pipeline; determining recognized operators; for each of at least two recognized operators, selecting a corresponding NN module from a translation dictionary; and wiring the selected NN modules in accordance with the operator dependencies to generate a translated NN. Some examples determine a starting operator for translation, which is the earliest recognized operator having parameters. Some examples connect inputs of the translated NN to upstream operators of the ML pipeline that had not been translated. Some examples further tune the translated NN using backpropagation. Some examples determine whether an operator is trainable or non-trainable and flag related parameters accordingly for later training.
    Type: Application
    Filed: January 25, 2024
    Publication date: July 11, 2024
    Inventors: Matteo INTERLANDI, Byung-Gon CHUN, Markus WEIMER, Gyeongin YU, Saeed AMIZADEH
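
The translation flow can be sketched in a few lines of Python: a translation dictionary maps each recognized operator to a stand-in NN module, and the selected modules are wired according to the pipeline's operator dependencies (a simple linear chain in this toy). The scaler/linreg operators and their parameters are illustrative assumptions.

```python
# Hypothetical translation dictionary: recognized ML operators map to
# callables standing in for equivalent neural-network modules.
TRANSLATION = {
    "scaler": lambda p: (lambda xs: [(x - p["mean"]) / p["std"] for x in xs]),
    "linreg": lambda p: (lambda xs: sum(w * x for w, x in zip(p["w"], xs)) + p["b"]),
}

# An ML pipeline as (operator, params) stages; dependencies here are
# simply the order of the list.
pipeline = [
    ("scaler", {"mean": 2.0, "std": 0.5}),
    ("linreg", {"w": [1.0, -1.0, 0.5], "b": 0.1}),
]

def translate(pipeline):
    # Select a module for each recognized operator ...
    modules = [TRANSLATION[op](params) for op, params in pipeline
               if op in TRANSLATION]
    # ... and wire them per the operator dependencies.
    def translated_nn(xs):
        for module in modules:
            xs = module(xs)
        return xs
    return translated_nn

nn = translate(pipeline)
print(nn([1.0, 2.0, 3.0]))   # -0.9
```
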
  • Publication number: 20240231902
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: August 21, 2023
    Publication date: July 11, 2024
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Patent number: 11934930
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: October 19, 2022
    Date of Patent: March 19, 2024
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Patent number: 11922282
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: September 19, 2022
    Date of Patent: March 5, 2024
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Patent number: 11922315
    Abstract: Solutions for adapting machine learning (ML) models to neural networks (NNs) include receiving an ML pipeline comprising a plurality of operators; determining operator dependencies within the ML pipeline; determining recognized operators; for each of at least two recognized operators, selecting a corresponding NN module from a translation dictionary; and wiring the selected NN modules in accordance with the operator dependencies to generate a translated NN. Some examples determine a starting operator for translation, which is the earliest recognized operator having parameters. Some examples connect inputs of the translated NN to upstream operators of the ML pipeline that had not been translated. Some examples further tune the translated NN using backpropagation. Some examples determine whether an operator is trainable or non-trainable and flag related parameters accordingly for later training.
    Type: Grant
    Filed: August 26, 2019
    Date of Patent: March 5, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Matteo Interlandi, Byung-Gon Chun, Markus Weimer, Gyeongin Yu, Saeed Amizadeh
  • Publication number: 20230401091
    Abstract: A single terminal schedules the processing of requests from a plurality of applications using heterogeneous processors. The single terminal may include an analysis unit partitioning a request from an application into units and generating at least one subgraph, a profiling unit predicting an operation execution time for at least one frequency of at least one processor capable of processing the subgraph, and a scheduler performing scheduling based on a request from the application and the operation execution time.
    Type: Application
    Filed: February 9, 2023
    Publication date: December 14, 2023
    Applicant: Seoul National University R&DB Foundation
    Inventors: Youngki Lee, Changjin Jeong, Jingyu Lee, Changmin Jeon, Joo Seong Jeong, Byung-Gon Chun, Donghyun Kim
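
A toy version of the profiling-plus-scheduling split: given predicted execution times for each subgraph on each (processor, frequency) pair, as the profiling unit would produce, a greedy scheduler maps each subgraph to its fastest option. The processor names, frequencies, and timings are hypothetical.

```python
# Hypothetical profile: predicted execution time (ms) per subgraph on
# each (processor, frequency-in-Hz) pair.
profile = {
    "subgraph0": {("cpu", 1.8e9): 12.0, ("gpu", 0.6e9): 4.0, ("npu", 0.9e9): 3.5},
    "subgraph1": {("cpu", 1.8e9):  2.0, ("gpu", 0.6e9): 5.0, ("npu", 0.9e9): 6.0},
}

def schedule(subgraphs, profile):
    """Greedy scheduler: pick the processor/frequency pair with the
    smallest predicted execution time for each subgraph."""
    return {sg: min(profile[sg], key=profile[sg].get) for sg in subgraphs}

plan = schedule(["subgraph0", "subgraph1"], profile)
print(plan)   # subgraph0 -> npu, subgraph1 -> cpu
```
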
  • Patent number: 11836520
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: August 4, 2022
    Date of Patent: December 5, 2023
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20230177399
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: September 19, 2022
    Publication date: June 8, 2023
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20230176903
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: August 4, 2022
    Publication date: June 8, 2023
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20230177401
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Application
    Filed: October 19, 2022
    Publication date: June 8, 2023
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20230139091
    Abstract: A system with fault recovery includes: a plurality of worker nodes configured to perform distributed training; and a master node configured to control the plurality of worker nodes, wherein the master node is configured to: detect a fault of the plurality of worker nodes based on a predetermined period; adjust a collective communication participant list in response to the detecting of the fault; and transmit the adjusted participant list to one or more worker nodes in the adjusted participant list.
    Type: Application
    Filed: September 28, 2022
    Publication date: May 4, 2023
    Applicants: Samsung Electronics Co., Ltd., Seoul National University R&DB Foundation
    Inventors: Yongdeok Kim, Kyung Geun Lee, Jeong Yoon Eo, Byung Gon Chun, Ahn Jae Shin
  • Publication number: 20230118829
    Abstract: Disclosed is a method of executing deep learning programs. The method includes generating a symbolic graph corresponding to an imperative deep learning program, dividing the imperative deep learning program into a first portion related to a deep learning computation and a second portion not related to the deep learning computation, and performing a computation on the first portion using a graph runner and simultaneously performing a computation on the second portion using a language runner.
    Type: Application
    Filed: September 28, 2022
    Publication date: April 20, 2023
    Applicant: Seoul National University R&DB Foundation
    Inventors: Tae Bum Kim, Byung Gon Chun, Geon Woo Kim, Yun Mo Koo, Gyeong In Yu, Eun Ji Jeong, Se Hoon Kim
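
One way to picture the graph-runner/language-runner split is with two concurrent workers: the deep learning portion of the program is handed to an asynchronous graph executor while ordinary imperative code keeps running. This thread-based sketch is an illustrative stand-in, not the disclosed runtime.

```python
import threading
import queue
import time

def graph_runner(requests, results):
    """Stands in for the symbolic-graph executor: consumes deep learning
    computations and runs them asynchronously."""
    while True:
        name = requests.get()
        if name is None:            # shutdown sentinel
            break
        time.sleep(0.01)            # pretend to execute the graph
        results[name] = f"tensor({name})"

requests, results = queue.Queue(), {}
runner = threading.Thread(target=graph_runner, args=(requests, results))
runner.start()

# Language runner: hand the DL portion to the graph runner, then keep
# executing the non-DL portion of the program concurrently.
requests.put("forward")
log = [f"imperative step {i}" for i in range(3)]

requests.put(None)
runner.join()
print(log, results)
```
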
  • Patent number: 11514370
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: December 3, 2021
    Date of Patent: November 29, 2022
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Patent number: 11442775
    Abstract: An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length.
    Type: Grant
    Filed: December 3, 2021
    Date of Patent: September 13, 2022
    Assignee: FriendliAI Inc.
    Inventors: Gyeongin Yu, Geon-Woo Kim, Joo Seong Jeong, Soojeong Kim, Byung-Gon Chun
  • Publication number: 20220237040
    Abstract: An accelerator resource management method and apparatus are disclosed. The accelerator resource management method includes receiving a task request for a neural network-related task and a resource scheduling policy for the neural network-related task, obtaining information on a current resource utilization status of an accelerator cluster comprising a plurality of accelerators, in response to the task request, and allocating an accelerator resource for performing the task based on a utility of a resource allocation that is based on the resource scheduling policy and the information.
    Type: Application
    Filed: July 12, 2021
    Publication date: July 28, 2022
    Applicants: Samsung Electronics Co., Ltd., SNU R&DB Foundation
    Inventors: Sanggyu Shin, Soojeong Kim, Byung-Gon Chun, Kyunggeun Lee
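
A compact sketch of utility-based allocation: enumerate candidate allocations over the cluster's free accelerators and pick the one that maximizes a policy-dependent utility. The cluster state, candidate generation, and utility function are toy assumptions standing in for the disclosed method.

```python
# Hypothetical cluster state: free accelerators per node.
cluster = {"node0": 4, "node1": 2}

def utility(alloc, request, policy):
    """Toy utility: reward satisfying the request; under a locality-
    preferring policy, penalize spanning extra nodes."""
    satisfied = min(sum(alloc.values()), request)
    penalty = (len(alloc) - 1) if policy == "prefer-locality" else 0
    return satisfied - penalty

def candidates(request, cluster):
    # Single-node candidates where the request fits whole ...
    out = [{node: request} for node, free in cluster.items() if free >= request]
    # ... plus one greedy multi-node spread.
    alloc, need = {}, request
    for node, free in cluster.items():
        take = min(free, need)
        if take:
            alloc[node], need = take, need - take
    out.append(alloc)
    return out

def allocate(request, cluster, policy):
    return max(candidates(request, cluster),
               key=lambda a: utility(a, request, policy))

print(allocate(2, cluster, "prefer-locality"))   # {'node0': 2}
print(allocate(5, cluster, "prefer-locality"))   # {'node0': 4, 'node1': 1}
```
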
  • Publication number: 20220147398
    Abstract: Disclosed are a method and an electronic apparatus including an accelerator for lightweight and parallel accelerator task scheduling. The method includes pre-running a deep learning model with sample input data having a preset data form and generating a scheduling result through the pre-running.
    Type: Application
    Filed: November 12, 2021
    Publication date: May 12, 2022
    Inventors: Byung-Gon Chun, Gyeongin Yu, Woosuk Kwon