Patents by Inventor Hanseok Ko

Hanseok Ko has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO). Illustrative code sketches for several of the listed inventions follow the listing.

  • Publication number: 20240338878
    Abstract: Embodiments described herein provide systems and methods for 3D-aware image generation. A system receives, via a data interface, a plurality of control parameters and a view direction. The system generates a plurality of predicted densities based on a plurality of positions and the plurality of control parameters. The densities may be predicted by applying a series of modulation blocks, wherein each block modulates a vector representation based on control parameters that are used to generate frequency values and phase shift values for the modulation. The system generates an image based on the plurality of predicted densities and the view direction. (See the illustrative sketch after this listing.)
    Type: Application
    Filed: April 3, 2024
    Publication date: October 10, 2024
    Inventors: Jeong-gi Kwak, Hanseok Ko
  • Publication number: 20240338560
    Abstract: Embodiments described herein provide systems and methods for gesture generation from multimodal input. A method includes receiving a multimodal input. The method may further include masking a subset of the multimodal input; generating, via an embedder, a multimodal embedding based on the masked multimodal input; generating, via an encoder, multimodal features based on the multimodal embedding, wherein the encoder includes one or more attention layers connecting different modalities; generating, via a generator, multimodal output based on the multimodal features; and computing a loss based on the multimodal input and the multimodal output. The method may further include updating parameters of the encoder based on the loss. (See the illustrative sketch after this listing.)
    Type: Application
    Filed: April 4, 2024
    Publication date: October 10, 2024
    Inventors: Gwantae Kim, Hanseok Ko
  • Publication number: 20240338974
    Abstract: Embodiments described herein provide systems and methods for facial expression recognition (FER). Embodiments herein combine features of different semantic levels and classify both sentiment and specific emotion categories with emotion grouping. Embodiments herein include a model with a bottom-up branch that learns facial expression representations at different semantic levels and outputs pseudo labels of facial expressions for each frame using a 2D FER model, and a top-down branch that learns discriminative representations by combining feature vectors of each semantic level to recognize facial expressions in the corresponding emotion group.
    Type: Application
    Filed: March 13, 2024
    Publication date: October 10, 2024
    Inventors: Bokyeung Lee, Bonhwa Ku, Hanseok Ko
  • Publication number: 20240338874
    Abstract: Embodiments described herein provide systems and methods for gesture generation from text. A method for gesture generation includes receiving an input text. The method may further include generating, via an encoder, an action representation in an action representation space based on the input text. The method may further include generating, via a first motion decoder, a first body configuration based on the action representation. The method may further include generating, via a second motion decoder, a second body configuration based on the first body configuration. The method may further include generating, via a token decoder, a first stop token based on the first body configuration. (See the illustrative sketch after this listing.)
    Type: Application
    Filed: April 3, 2024
    Publication date: October 10, 2024
    Inventors: Gwantae Kim, Hanseok Ko
  • Publication number: 20240338917
    Abstract: Embodiments described herein provide systems and methods for image to 3D generation. A system receives an input image, for example a portrait. The system generates, via an encoder, a first latent representation based on the input image. The system generates, based on the first latent representation, a plurality of latent representations associated with a plurality of view angles. The system generates, via a decoder, a plurality of images in the plurality of view angles based on the plurality of latent representations. Finally, the system generates a UV map based on the plurality of images.
    Type: Application
    Filed: April 3, 2024
    Publication date: October 10, 2024
    Inventors: Yuanming Li, Bonhwa Ku, Hanseok Ko
  • Publication number: 20240338802
    Abstract: Embodiments described herein provide systems and methods for image inpainting. A system receives a masked input image and a mask. The system generates, via a pretrained model, a first pass inpainted image based on the masked input image. The system generates a plurality of variants of the first pass inpainted image. The system generates, via a first encoder, a vector representation of the masked input image. The system generates, via a first decoder, a plurality of output images based on the vector representation of the masked input image and conditioned by the plurality of variants of the first pass inpainted image.
    Type: Application
    Filed: April 2, 2024
    Publication date: October 10, 2024
    Inventors: Dongsik Yoon, Jeong-gi Kwak, Hanseok Ko
  • Publication number: 20240339103
    Abstract: Embodiments described herein provide systems and methods for text to speech synthesis. A system receives, via a data interface, an input text, a first reference spectrogram, and a second reference spectrogram. The system generates, via encoders, vector representations of each of the inputs. The system generates a combined representation based on the vector representation of the first reference spectrogram and the vector representation of the second reference spectrogram. The system performs cross attention between the combined representation and the vector representation of the input text to generate a style vector. The system may generate, via a decoder, an audio waveform based on the modified vector representation and conditioned by the style vector, where the style vector conditions the speech generation via conditional layer normalization. The generated audio waveform may be played via a speaker. The generated audio may be used in communication by a digital avatar interface. (See the illustrative sketch after this listing.)
    Type: Application
    Filed: March 13, 2024
    Publication date: October 10, 2024
    Inventors: Jeongki Min, Bonhwa Ku, Hanseok Ko
  • Publication number: 20240339104
    Abstract: Embodiments described herein provide systems and methods for text to speech synthesis. A system receives, via a data interface, an input text, a reference spectrogram, and at least one of an emotion ID or speaker ID. The system generates, via a first encoder, a vector representation of the input text. The system generates, via a second encoder, a vector representation of the reference spectrogram. The system generates, via a variance adaptor, a modified vector representation based on a combined representation including a combination of the vector representation of the input text, the vector representation of the reference spectrogram, and at least one of an embedding of the emotion ID or an embedding of the speaker ID. The system generates, via a decoder, an audio waveform based on the modified vector representation. The generated audio waveform may be played via a speaker.
    Type: Application
    Filed: March 7, 2024
    Publication date: October 10, 2024
    Inventors: Jeongki Min, Bonhwa Ku, Hanseok Ko
  • Publication number: 20240339122
    Abstract: Embodiments described herein provide systems and methods for any to any voice conversion. A system receives, via a data interface, a source utterance of a first style and a target utterance of a second style. The system generates, via a first encoder, a vector representation of the target utterance. The system generates, via a second encoder, a vector representation of the source utterance. The system generates, via a filter generator, a generated filter based on the vector representation of the target utterance. The system generates, via a decoder, a generated utterance based on the vector representation of the source utterance and the generated filter. (See the illustrative sketch after this listing.)
    Type: Application
    Filed: March 18, 2024
    Publication date: October 10, 2024
    Inventors: Donghyeon Kim, Bonhwa Ku, Hanseok Ko
  • Patent number: 11947061
    Abstract: An earthquake event classification method using an attention-based neural network includes: preprocessing input earthquake data by centering; extracting a feature map by nonlinearly converting the preprocessed earthquake data through a plurality of convolution layers having three or more layers; measuring the importance of learned features of the nonlinearly converted earthquake data with an attention technique in which the interdependence of the feature map's channels is modeled; correcting the learned feature map through element-wise multiplication with the measured importance values; performing down-sampling through max-pooling based on the corrected feature values; and classifying an earthquake event by regularizing the down-sampled feature values. Accordingly, the core features inherent in large, complex data are extracted through attention-based deep learning, overcoming the limitations of existing micro-earthquake detection technology and enabling earthquake detection even in low-SNR environments. (See the illustrative sketch after this listing.)
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: April 2, 2024
    Assignee: Korea University Research and Business Foundation
    Inventors: Hanseok Ko, Bon Hwa Ku
  • Patent number: 11830272
    Abstract: Disclosed are a method and an apparatus for identifying animal species by using audiovisual information. A method for identifying animal species, according to one embodiment of the present invention, may include: receiving an input signal for an object to be identified; processing image information and acoustic information based on the input signal, wherein the processing result of each modality is represented by class-specific scores; determining whether an image processing result and an acoustic processing result exist for the input signal; and deriving a final result by fusing the available processing results according to that determination and classifying the object as a particular animal species using the fused result. (See the illustrative sketch after this listing.)
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: November 28, 2023
    Assignee: Korea University Research and Business Foundation
    Inventors: Hanseok Ko, Sangwook Park, Kyung-Deuk Ko, Donghyeon Kim
  • Publication number: 20220036053
    Abstract: Disclosed are a method and an apparatus for identifying animal species by using audiovisual information. A method for identifying animal species, according to one embodiment of the present invention, may include: receiving an input signal for an object to be identified; processing image information and acoustic information based on the input signal, wherein the processing result of each modality is represented by class-specific scores; determining whether an image processing result and an acoustic processing result exist for the input signal; and deriving a final result by fusing the available processing results according to that determination and classifying the object as a particular animal species using the fused result.
    Type: Application
    Filed: April 18, 2019
    Publication date: February 3, 2022
    Applicant: Korea University Research and Business Foundation
    Inventors: Hanseok Ko, Sangwook Park, Kyung-Deuk Ko, Donghyeon Kim
  • Publication number: 20210117737
    Abstract: An earthquake event classification method using an attention-based neural network includes: preprocessing input earthquake data by centering; extracting a feature map by nonlinearly converting the preprocessed earthquake data through a plurality of convolution layers having three or more layers; measuring the importance of learned features of the nonlinearly converted earthquake data with an attention technique in which the interdependence of the feature map's channels is modeled; correcting the learned feature map through element-wise multiplication with the measured importance values; performing down-sampling through max-pooling based on the corrected feature values; and classifying an earthquake event by regularizing the down-sampled feature values. Accordingly, the core features inherent in large, complex data are extracted through attention-based deep learning, overcoming the limitations of existing micro-earthquake detection technology and enabling earthquake detection even in low-SNR environments.
    Type: Application
    Filed: August 13, 2020
    Publication date: April 22, 2021
    Applicant: Korea University Research and Business Foundation
    Inventors: Hanseok Ko, Bon Hwa Ku
  • Patent number: 10923126
    Abstract: Provided is a method of detecting a voice section, including detecting from at least one image an area where lips exist, obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, and detecting the voice section from the at least one image based on the feature value. (See the illustrative sketch after this listing.)
    Type: Grant
    Filed: March 19, 2015
    Date of Patent: February 16, 2021
    Assignees: Samsung Electronics Co., Ltd., Korea University Research and Business Foundation
    Inventors: Hanseok Ko, Sung-soo Kim, Taeyup Song, Kyungsun Lee, Jae-won Lee
  • Patent number: 10772910
    Abstract: The present disclosure relates to a pharmaceutical composition for preventing or treating neurodegenerative diseases, the pharmaceutical composition including a graphene nanostructure as an active ingredient.
    Type: Grant
    Filed: June 11, 2018
    Date of Patent: September 15, 2020
    Assignees: Seoul National University R&DB Foundation, The Johns Hopkins University
    Inventors: Byung Hee Hong, Je Min Yoo, Hanseok Ko, Donghoon Kim
  • Patent number: 10410638
    Abstract: A method of converting a feature vector includes extracting a feature sequence from an audio signal including an utterance of a user; extracting a feature vector from the feature sequence; acquiring a conversion matrix for reducing a dimension of the feature vector, based on a probability value acquired from different covariance values; and converting the feature vector by using the conversion matrix. (See the illustrative sketch after this listing.)
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: September 10, 2019
    Assignees: Samsung Electronics Co., Ltd., Korea University Research and Business Foundation
    Inventors: Hanseok Ko, Sung-soo Kim, Jinsang Rho, Suwon Shon, Jae-won Lee
  • Publication number: 20180289646
    Abstract: The present disclosure relates to a pharmaceutical composition for preventing or treating neurodegenerative diseases, the pharmaceutical composition including a graphene nanostructure as an active ingredient.
    Type: Application
    Filed: June 11, 2018
    Publication date: October 11, 2018
    Inventors: Byung Hee Hong, Je Min Yoo, Hanseok Ko, Donghoon Kim
  • Publication number: 20180247651
    Abstract: Provided is a method of detecting a voice section, including detecting from at least one image an area where lips exist, obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, and detecting the voice section from the at least one image based on the feature value.
    Type: Application
    Filed: March 19, 2015
    Publication date: August 30, 2018
    Inventors: Hanseok Ko, Sung-soo Kim, Taeyup Song, Kyungsun Lee, Jae-won Lee
  • Publication number: 20180033439
    Abstract: A method of converting a feature vector includes extracting a feature sequence from an audio signal including an utterance of a user; extracting a feature vector from the feature sequence; acquiring a conversion matrix for reducing a dimension of the feature vector, based on a probability value acquired from different covariance values; and converting the feature vector by using the conversion matrix.
    Type: Application
    Filed: February 27, 2015
    Publication date: February 1, 2018
    Applicants: Samsung Electronics Co., Ltd., Korea University Research and Business Foundation
    Inventors: Hanseok Ko, Sung-soo Kim, Jinsang Rho, Suwon Shon, Jae-won Lee
  • Patent number: 9842382
    Abstract: The present invention provides a method for removing haze in a single image. In the present invention, a transmission is estimated by using a dark channel prior obtained from a hazy input image. The estimated transmission, however, contains block artifacts. In an exemplary embodiment of the present invention, in order to preserve edges and remove the block artifacts, a refined transmission value is obtained by performing WLS filtering using the estimated transmission value and a morphologically processed input image; the image is restored based on the refined transmission value, and multi-scale tone manipulation image processing is then performed. (See the illustrative sketch after this listing.)
    Type: Grant
    Filed: April 11, 2014
    Date of Patent: December 12, 2017
    Assignees: Hanwha Techwin Co., Ltd., Korea University Research and Business Foundation
    Inventors: Dubok Park, Hanseok Ko
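
Illustrative code sketches

The sketches below are editorial illustrations, not the claimed implementations; all class names, dimensions, and hyperparameters are invented for the purpose of illustration.

For publication 20240338878, the density prediction can be pictured as a coordinate MLP whose layers are sine-modulated by frequencies and phase shifts generated from the control parameters, in the spirit of FiLM/SIREN-style conditioning. A minimal PyTorch sketch under that assumption (ModulationBlock and DensityField are hypothetical names):

```python
import torch
import torch.nn as nn

class ModulationBlock(nn.Module):
    """Hypothetical block: control parameters produce per-feature
    frequency and phase-shift values that modulate a sine layer."""
    def __init__(self, dim, ctrl_dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.to_freq = nn.Linear(ctrl_dim, dim)   # frequency values
        self.to_phase = nn.Linear(ctrl_dim, dim)  # phase-shift values

    def forward(self, h, ctrl):
        freq, phase = self.to_freq(ctrl), self.to_phase(ctrl)
        return torch.sin(freq * self.linear(h) + phase)

class DensityField(nn.Module):
    """Stack of modulation blocks mapping 3D positions (plus controls) to densities."""
    def __init__(self, dim=64, ctrl_dim=16, n_blocks=4):
        super().__init__()
        self.embed = nn.Linear(3, dim)
        self.blocks = nn.ModuleList(ModulationBlock(dim, ctrl_dim) for _ in range(n_blocks))
        self.to_density = nn.Linear(dim, 1)

    def forward(self, xyz, ctrl):
        h = self.embed(xyz)
        for block in self.blocks:
            h = block(h, ctrl)
        return self.to_density(h)  # one predicted density per position

positions = torch.rand(1024, 3)    # sampled 3D positions
controls = torch.randn(1024, 16)   # control parameters, broadcast per point
print(DensityField()(positions, controls).shape)  # torch.Size([1024, 1])
```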
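
For publication 20240338560, one plausible reading is a masked-reconstruction training step: mask part of the multimodal input, embed it, encode it with attention layers that span both modalities, regenerate the input, and update on the reconstruction loss. A sketch assuming two toy modalities (audio frames and motion frames) and invented sizes:

```python
import torch
import torch.nn as nn

class MaskedMultimodalModel(nn.Module):
    """Hypothetical embedder -> cross-modal encoder -> generator pipeline."""
    def __init__(self, audio_dim=40, motion_dim=30, dim=64):
        super().__init__()
        self.embed_audio = nn.Linear(audio_dim, dim)    # modality embedders
        self.embed_motion = nn.Linear(motion_dim, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.gen_audio = nn.Linear(dim, audio_dim)      # generators back to each modality
        self.gen_motion = nn.Linear(dim, motion_dim)

    def forward(self, audio, motion):
        # Concatenating along time lets self-attention connect the modalities.
        tokens = torch.cat([self.embed_audio(audio), self.embed_motion(motion)], dim=1)
        feats = self.encoder(tokens)
        t = audio.shape[1]
        return self.gen_audio(feats[:, :t]), self.gen_motion(feats[:, t:])

model = MaskedMultimodalModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

audio = torch.randn(2, 20, 40)    # (batch, frames, features) -- invented sizes
motion = torch.randn(2, 20, 30)

masked_audio = audio.clone()
masked_audio[:, torch.rand(20) < 0.3] = 0.0   # mask a random ~30% subset of audio frames

out_audio, out_motion = model(masked_audio, motion)
loss = nn.functional.mse_loss(out_audio, audio) + nn.functional.mse_loss(out_motion, motion)
opt.zero_grad()
loss.backward()
opt.step()    # update the encoder (and the rest) based on the loss
print(float(loss))
```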
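
For publication 20240338874, the two motion decoders and the token decoder suggest an autoregressive loop: decode a first body configuration from the action representation, refine it with a second decoder, and stop when the token decoder fires. A sketch in which an EmbeddingBag stands in for the real text encoder and the refinement is a simple residual:

```python
import torch
import torch.nn as nn

class GestureGenerator(nn.Module):
    """Hypothetical text -> action representation -> body configurations with stop tokens."""
    def __init__(self, vocab=1000, dim=64, pose_dim=45):
        super().__init__()
        self.pose_dim = pose_dim
        self.text_encoder = nn.EmbeddingBag(vocab, dim)   # toy stand-in for the encoder
        self.decoder1 = nn.GRUCell(dim, pose_dim)         # first motion decoder
        self.decoder2 = nn.Linear(pose_dim, pose_dim)     # second motion decoder (refinement)
        self.token_decoder = nn.Linear(pose_dim, 1)       # stop-token head

    def forward(self, token_ids, max_steps=50):
        action = self.text_encoder(token_ids.unsqueeze(0))  # action representation
        pose = torch.zeros(1, self.pose_dim)
        poses = []
        for _ in range(max_steps):
            pose = self.decoder1(action, pose)            # first body configuration
            poses.append(pose + self.decoder2(pose))      # second body configuration
            stop = torch.sigmoid(self.token_decoder(pose))  # stop token from first configuration
            if stop.item() > 0.5:
                break
        return torch.stack(poses, dim=1)                  # (1, steps, pose_dim)

gestures = GestureGenerator()(torch.tensor([3, 17, 42]))
print(gestures.shape)
```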
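
For publication 20240339103, the distinctive piece is conditional layer normalization: a style vector, obtained by cross attention between the combined reference representation and the text representation, predicts the scale and shift applied after normalization. A sketch that assumes the query/key roles of the cross attention (the abstract leaves them open):

```python
import torch
import torch.nn as nn

class ConditionalLayerNorm(nn.Module):
    """Hypothetical conditional layer norm: the style vector predicts the
    scale and shift applied after normalization, so each normalized layer
    is conditioned on the reference style."""
    def __init__(self, dim, style_dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.to_scale = nn.Linear(style_dim, dim)
        self.to_shift = nn.Linear(style_dim, dim)

    def forward(self, x, style):
        # x: (batch, time, dim); style: (batch, style_dim)
        gamma = self.to_scale(style).unsqueeze(1)
        beta = self.to_shift(style).unsqueeze(1)
        return gamma * self.norm(x) + beta

# Style vector via cross attention: combined reference representation as query,
# encoded text as keys/values (an assumption), then pooling over time.
attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
text_feats = torch.randn(2, 30, 64)      # vector representation of the input text
combined_ref = torch.randn(2, 10, 64)    # combined two-reference representation
attended, _ = attn(combined_ref, text_feats, text_feats)
style_vec = attended.mean(dim=1)         # pooled style vector

cln = ConditionalLayerNorm(64, 64)
print(cln(torch.randn(2, 30, 64), style_vec).shape)  # torch.Size([2, 30, 64])
```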
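
For publication 20240339122, the filter generator reads like a small hyper-network: the target-utterance representation is mapped to the weights of a convolution that the decoder applies to the source representation. A sketch with invented channel counts:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterGenerator(nn.Module):
    """Hypothetical hyper-network: the target-utterance vector is mapped to
    the weights of a 1-D convolution that restyles the source utterance."""
    def __init__(self, style_dim=64, channels=80, kernel=5):
        super().__init__()
        self.channels, self.kernel = channels, kernel
        self.to_weights = nn.Linear(style_dim, channels * channels * kernel)

    def forward(self, target_repr):
        w = self.to_weights(target_repr)  # one generated filter bank per utterance
        return w.view(self.channels, self.channels, self.kernel)

source_repr = torch.randn(1, 80, 120)  # (batch, channels, frames) from the source encoder
target_repr = torch.randn(64)          # pooled vector from the target encoder

weights = FilterGenerator()(target_repr)
converted = F.conv1d(source_repr, weights, padding=2)  # decoder step: apply the generated filter
print(converted.shape)  # torch.Size([1, 80, 120])
```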
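
For patent 11947061, the channel-attention step matches the familiar squeeze-and-excitation pattern: pool each channel over time, predict per-channel importance, and rescale the feature map element-wise before max-pooling and classification. A sketch assuming three-component waveform input, with dropout standing in for the regularization step the abstract mentions:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style attention: models the interdependence of feature-map channels
    and rescales each channel by its measured importance."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                    # x: (batch, channels, time)
        importance = self.fc(x.mean(dim=2))  # squeeze over time, then excite
        return x * importance.unsqueeze(2)   # element-wise correction of the feature map

class QuakeClassifier(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(       # three or more nonlinear conv layers
            nn.Conv1d(3, 16, 7, padding=3), nn.ReLU(),
            nn.Conv1d(16, 32, 7, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, 7, padding=3), nn.ReLU())
        self.attend = ChannelAttention(64)
        self.pool = nn.MaxPool1d(4)          # down-sampling through max-pooling
        self.head = nn.Sequential(nn.Flatten(),
                                  nn.Dropout(0.5),  # assumed regularization step
                                  nn.LazyLinear(n_classes))

    def forward(self, wave):
        wave = wave - wave.mean(dim=2, keepdim=True)  # preprocessing by centering
        return self.head(self.pool(self.attend(self.features(wave))))

logits = QuakeClassifier()(torch.randn(2, 3, 400))  # 3-component seismograms
print(logits.shape)  # torch.Size([2, 3])
```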
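
For patent 11830272, the decision logic is score-level fusion with a fallback: if only one modality produced a result, use it; if both exist, fuse the class-specific scores before classifying. A small NumPy sketch with an invented fusion weight:

```python
import numpy as np

def identify_species(image_scores=None, audio_scores=None, w=0.5):
    """Hypothetical late fusion: each modality yields class-specific scores;
    fuse whichever results exist, then pick the best class."""
    if image_scores is None and audio_scores is None:
        raise ValueError("no modality available for the input signal")
    if image_scores is None:          # fall back to the modality that exists
        fused = np.asarray(audio_scores)
    elif audio_scores is None:
        fused = np.asarray(image_scores)
    else:                             # both exist: weighted score-level fusion
        fused = w * np.asarray(image_scores) + (1 - w) * np.asarray(audio_scores)
    return int(np.argmax(fused)), fused

species = ["owl", "frog", "cricket"]
idx, fused = identify_species(image_scores=[0.2, 0.5, 0.3],
                              audio_scores=[0.1, 0.7, 0.2])
print(species[idx], fused)  # frog [0.15 0.6 0.25]
```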
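
For patent 10923126, the feature is pixel differences inside the detected lip area, and a voice section is wherever that motion feature is high. A NumPy sketch assuming a fixed lip bounding box and a simple threshold:

```python
import numpy as np

def lip_motion_feature(frames, lip_box):
    """Hypothetical per-frame feature: mean absolute pixel difference inside
    the detected lip area between consecutive frames."""
    x0, y0, x1, y1 = lip_box
    lips = frames[:, y0:y1, x0:x1].astype(np.float32)
    return np.abs(np.diff(lips, axis=0)).mean(axis=(1, 2))  # one value per transition

def detect_voice_sections(features, threshold):
    """Frames whose lip-motion feature exceeds the threshold count as speech."""
    active = features > threshold
    sections, start = [], None
    for i, a in enumerate(active):     # collapse runs of active frames into (start, end)
        if a and start is None:
            start = i
        elif not a and start is not None:
            sections.append((start, i))
            start = None
    if start is not None:
        sections.append((start, len(active)))
    return sections

frames = np.random.randint(0, 256, size=(100, 120, 160), dtype=np.uint8)  # toy video
feat = lip_motion_feature(frames, lip_box=(60, 70, 100, 90))
print(detect_voice_sections(feat, threshold=feat.mean()))
```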
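
For patent 10410638, the conversion matrix acquired from "different covariance values" is reminiscent of LDA-style dimension reduction, where within-class and between-class covariances determine the projection. A NumPy sketch of that stand-in (the patented derivation via a probability value may differ):

```python
import numpy as np

def conversion_matrix(features, labels, out_dim):
    """Hypothetical LDA-style recipe: build two different covariance values
    (within-class and between-class scatter) and keep the most discriminative
    directions as a dimension-reducing conversion matrix."""
    dim = features.shape[1]
    mean = features.mean(axis=0)
    Sw = np.zeros((dim, dim))
    Sb = np.zeros((dim, dim))
    for c in np.unique(labels):
        cls = features[labels == c]
        Sw += np.cov(cls, rowvar=False) * (len(cls) - 1)   # within-class scatter
        d = (cls.mean(axis=0) - mean)[:, None]
        Sb += len(cls) * (d @ d.T)                         # between-class scatter
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(vals.real)[::-1]
    return vecs[:, order[:out_dim]].real                   # (dim, out_dim)

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 20))        # e.g. utterance-level feature vectors
labels = rng.integers(0, 5, size=200)     # toy speaker labels
W = conversion_matrix(feats, labels, out_dim=4)
reduced = feats @ W                       # converted, lower-dimensional vectors
print(reduced.shape)  # (200, 4)
```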
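
For patent 9842382, the first stage is classic dark-channel-prior transmission estimation; the coarse estimate exhibits the block artifacts that the patented WLS refinement then removes while preserving edges. A NumPy/SciPy sketch of the coarse stage only:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Minimum over color channels, then a local minimum filter (dark channel prior)."""
    return minimum_filter(img.min(axis=2), size=patch)

def estimate_transmission(img, omega=0.95, patch=15):
    dark = dark_channel(img, patch)
    # Atmospheric light A: mean color of the brightest 0.1% of dark-channel pixels
    # (a common simplification, assumed here rather than taken from the patent).
    idx = np.argsort(dark.ravel())[-max(1, dark.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    # Coarse transmission t = 1 - omega * dark(I / A); this is the block-artifact-prone
    # estimate that the patent refines with WLS filtering of a morphologically
    # processed input image before restoring and tone-manipulating the result.
    t = 1.0 - omega * dark_channel(img / A, patch)
    return np.clip(t, 0.1, 1.0), A

hazy = np.random.rand(120, 160, 3)            # toy hazy image in [0, 1]
t, A = estimate_transmission(hazy)
restored = (hazy - A) / t[..., None] + A      # invert the haze model J = (I - A)/t + A
print(t.shape, restored.shape)
```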