Patents by Inventor Hanseok Ko
Hanseok Ko has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240338974
Abstract: Embodiments described herein provide systems and methods for facial expression recognition (FER). Embodiments herein combine features of different semantic levels and classify both sentiment and specific emotion categories with emotion grouping. Embodiments herein include a model with a bottom-up branch that learns facial expression representations at different semantic levels and outputs pseudo labels of facial expressions for each frame using a 2D FER model, and a top-down branch that learns discriminative representations by combining feature vectors of each semantic level for recognizing facial expressions at the corresponding emotion group.
Type: Application
Filed: March 13, 2024
Publication date: October 10, 2024
Inventors: Bokyeung Lee, Bonhwa Ku, Hanseok Ko
-
Publication number: 20240339104
Abstract: Embodiments described herein provide systems and methods for text to speech synthesis. A system receives, via a data interface, an input text, a reference spectrogram, and at least one of an emotion ID or speaker ID. The system generates, via a first encoder, a vector representation of the input text. The system generates, via a second encoder, a vector representation of the reference spectrogram. The system generates, via a variance adaptor, a modified vector representation based on a combined representation including a combination of the vector representation of the input text, the vector representation of the reference spectrogram, and at least one of an embedding of the emotion ID or an embedding of the speaker ID. The system generates, via a decoder, an audio waveform based on the modified vector representation. The generated audio waveform may be played via a speaker.
Type: Application
Filed: March 7, 2024
Publication date: October 10, 2024
Inventors: Jeongki Min, Bonhwa Ku, Hanseok Ko
-
Publication number: 20240338802
Abstract: Embodiments described herein provide systems and methods for image inpainting. A system receives a masked input image and a mask. The system generates, via a pretrained model, a first pass inpainted image based on the masked input image. The system generates a plurality of variants of the first pass inpainted image. The system generates, via a first encoder, a vector representation of the masked input image. The system generates, via a first decoder, a plurality of output images based on the vector representation of the masked input image and conditioned by the plurality of variants of the first pass inpainted image.
Type: Application
Filed: April 2, 2024
Publication date: October 10, 2024
Inventors: Dongsik Yoon, Jeong-gi Kwak, Hanseok Ko
-
Publication number: 20240338560
Abstract: Embodiments described herein provide systems and methods for gesture generation from multimodal input. A method includes receiving a multimodal input. The method may further include masking a subset of the multimodal input; generating, via an embedder, a multimodal embedding based on the masked multimodal input; generating, via an encoder, multimodal features based on the multimodal embedding, wherein the encoder includes one or more attention layers connecting different modalities; generating, via a generator, multimodal output based on the multimodal features; computing a loss based on the multimodal input and the multimodal output. The method may further include updating parameters of the encoder based on the loss.
Type: Application
Filed: April 4, 2024
Publication date: October 10, 2024
Inventors: Gwantae Kim, Hanseok Ko
-
Publication number: 20240339122
Abstract: Embodiments described herein provide systems and methods for any to any voice conversion. A system receives, via a data interface, a source utterance of a first style and a target utterance of a second style. The system generates, via a first encoder, a vector representation of the target utterance. The system generates, via a second encoder, a vector representation of the source utterance. The system generates, via a filter generator, a generated filter based on the vector representation of the target utterance. The system generates, via a decoder, a generated utterance based on the vector representation of the source utterance and the generated filter.
Type: Application
Filed: March 18, 2024
Publication date: October 10, 2024
Inventors: Donghyeon Kim, Bonhwa Ku, Hanseok Ko
-
Publication number: 20240338878
Abstract: Embodiments described herein provide systems and methods for 3D-aware image generation. A system receives, via a data interface, a plurality of control parameters and a view direction. The system generates a plurality of predicted densities based on a plurality of positions and the plurality of control parameters. The densities may be predicted by applying a series of modulation blocks, wherein each block modulates a vector representation based on control parameters that are used to generate frequency values and phase shift values for the modulation. The system generates an image based on the plurality of predicted densities and the view direction.
Type: Application
Filed: April 3, 2024
Publication date: October 10, 2024
Inventors: Jeong-gi Kwak, Hanseok Ko
-
Publication number: 20240338917
Abstract: Embodiments described herein provide systems and methods for image to 3D generation. A system receives an input image, for example a portrait. The system generates, via an encoder, a first latent representation based on the input image. The system generates, based on the first latent representation, a plurality of latent representations associated with a plurality of view angles. The system generates, via a decoder, a plurality of images in the plurality of view angles based on the plurality of latent representations. Finally, the system generates a final UV map based on the plurality of images.
Type: Application
Filed: April 3, 2024
Publication date: October 10, 2024
Inventors: Yuanming Li, Bonhwa Ku, Hanseok Ko
-
Publication number: 20240339103
Abstract: Embodiments described herein provide systems and methods for text to speech synthesis. A system receives, via a data interface, an input text, a first reference spectrogram, and a second reference spectrogram. The system generates, via encoders, vector representations of each of the inputs. The system generates a combined representation based on the vector representation of the first reference spectrogram and the vector representation of the second reference spectrogram. The system performs cross attention between the combined representation and the vector representation of the input text to generate a style vector. The system may generate, via a decoder, an audio waveform based on the modified vector representation and conditioned by the style vector where the style vector conditions the speech generation via conditional layer normalization. The generated audio waveform may be played via a speaker. The generated audio may be used in communication by a digital avatar interface.
Type: Application
Filed: March 13, 2024
Publication date: October 10, 2024
Inventors: Jeongki Min, Bonhwa Ku, Hanseok Ko
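Illustrative note: the conditional layer normalization named in the abstract above can be sketched as follows. This is a minimal illustration and not the applicants' implementation. The function name and the projection matrices w_gamma and w_beta are hypothetical; the sketch assumes the scale and shift are plain linear projections of the style vector.

```python
import math

def conditional_layer_norm(x, style, w_gamma, w_beta, eps=1e-5):
    """Normalize x, then scale and shift it with values derived from style.

    x       : list of floats (one hidden vector)
    style   : list of floats (style vector; assumed input)
    w_gamma : hypothetical len(x) x len(style) matrix producing the scale
    w_beta  : hypothetical len(x) x len(style) matrix producing the shift
    """
    # Standard layer normalization over the hidden dimension.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    normed = [(v - mean) / math.sqrt(var + eps) for v in x]
    # Scale and shift are predicted from the style vector instead of
    # being fixed learned parameters.
    gamma = [sum(w * s for w, s in zip(row, style)) for row in w_gamma]
    beta = [sum(w * s for w, s in zip(row, style)) for row in w_beta]
    return [g * n + b for g, n, b in zip(gamma, normed, beta)]
```

With a unit scale and zero shift the function reduces to ordinary layer normalization; a different style vector changes only the scale and shift applied to the normalized activations.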
-
Publication number: 20240338874
Abstract: Embodiments described herein provide systems and methods for gesture generation from text. A method for gesture generation includes receiving an input text. The method may further include generating, via an encoder, an action representation in an action representation space based on the input text. The method may further include generating, via a first motion decoder, a first body configuration based on the action representation. The method may further include generating, via a second motion decoder, a second body configuration based on the first body configuration. The method may further include generating, via a token decoder, a first stop token based on the first body configuration.
Type: Application
Filed: April 3, 2024
Publication date: October 10, 2024
Inventors: Gwantae Kim, Hanseok Ko
-
Patent number: 11947061
Abstract: An earthquake event classification method using an attention-based neural network includes: preprocessing input earthquake data by centering; extracting a feature map by nonlinearly converting the preprocessed earthquake data through a plurality of convolution layers having three or more layers; measuring importance of a learned feature of the nonlinear-converted earthquake data based on an attention technique in which interdependence of channels of the feature map is modeled; correcting a feature value of the measured importance value through element-wise multiply with the learned feature map; performing down-sampling through max-pooling based on the feature value; and classifying an earthquake event by regularizing the down-sampled feature value. Accordingly, main core features inherent in many/complex data are extracted through attention-based deep learning to overcome the limitations of the existing micro earthquake detection technology, thereby enabling earthquake detection even in low SNR environments.
Type: Grant
Filed: August 13, 2020
Date of Patent: April 2, 2024
Assignee: Korea University Research and Business Foundation
Inventors: Hanseok Ko, Bon Hwa Ku
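Illustrative note: the channel-attention, element-wise multiply, and max-pooling steps listed in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the patented method. It assumes a squeeze-and-excitation style attention in which each channel is summarized by its mean; the function names and the mixing matrix weights are hypothetical.

```python
import math

def channel_attention(feature_map, weights):
    """Re-weight channels of a 1D feature map by a learned importance score.

    feature_map : list of channels, each a list of floats
    weights     : hypothetical C x C matrix that mixes per-channel statistics
    """
    # Summarize each channel with one statistic (its mean).
    means = [sum(ch) / len(ch) for ch in feature_map]
    # Model channel interdependence: mix the statistics, squash to (0, 1).
    scores = []
    for row in weights:
        z = sum(w * m for w, m in zip(row, means))
        scores.append(1.0 / (1.0 + math.exp(-z)))
    # Correct the features: multiply every element of a channel by its score.
    scaled = [[s * x for x in ch] for s, ch in zip(scores, feature_map)]
    return scores, scaled

def max_pool(channel, size=2):
    """Down-sample one channel by taking the maximum of each window."""
    return [max(channel[i:i + size]) for i in range(0, len(channel), size)]
```

In this sketch a channel with a larger mean activation receives a score closer to 1 and therefore survives the subsequent max-pooling with more of its amplitude intact.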
-
Patent number: 11830272
Abstract: Disclosed are a method and an apparatus for identifying animal species by using audiovisual information. A method for identifying animal species, according to one embodiment of the present invention, may include: a step of receiving an input signal for an object to be identified; a step of processing image information and acoustic information based on the input signal, wherein a processing result of the image information and a processing result of the acoustic information are represented by class-specific scores; a step of determining whether the image information processing result and the acoustic information processing result corresponding to the input signal exist; and a final result derivation step of fusing the image information processing result and the acoustic information processing result according to the determination result and classifying the object to be identified as a certain animal species by using the fused processing result.
Type: Grant
Filed: April 18, 2019
Date of Patent: November 28, 2023
Assignee: Korea University Research and Business Foundation
Inventors: Hanseok Ko, Sangwook Park, Kyung-Deuk Ko, Donghyeon Kim
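Illustrative note: the existence check and score fusion described in the abstract above can be sketched as follows. This is a minimal illustration and not the patented method. The abstract does not state the fusion rule, so a weighted average is assumed here; the function name, the image_weight parameter, and the class labels in the test values are hypothetical.

```python
def fuse_scores(image_scores, audio_scores, image_weight=0.5):
    """Fuse class-specific scores from two modalities and pick a species.

    image_scores, audio_scores : dict mapping class name to score, or None
                                 when that modality produced no result
    image_weight               : assumed weight of the image modality
    """
    # Determine which processing results exist.
    if image_scores is None and audio_scores is None:
        return None
    if image_scores is None:
        fused = dict(audio_scores)
    elif audio_scores is None:
        fused = dict(image_scores)
    else:
        # Both exist: combine them (weighted average is an assumption).
        classes = set(image_scores) | set(audio_scores)
        fused = {
            c: image_weight * image_scores.get(c, 0.0)
            + (1.0 - image_weight) * audio_scores.get(c, 0.0)
            for c in classes
        }
    # Classify as the class with the highest fused score.
    return max(fused, key=fused.get)
```

When only one modality yields a result, the sketch falls back to that modality alone, which mirrors the abstract's step of fusing "according to the determination result".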
-
Publication number: 20220036053
Abstract: Disclosed are a method and an apparatus for identifying animal species by using audiovisual information. A method for identifying animal species, according to one embodiment of the present invention, may include: a step of receiving an input signal for an object to be identified; a step of processing image information and acoustic information based on the input signal, wherein a processing result of the image information and a processing result of the acoustic information are represented by class-specific scores; a step of determining whether the image information processing result and the acoustic information processing result corresponding to the input signal exist; and a final result derivation step of fusing the image information processing result and the acoustic information processing result according to the determination result and classifying the object to be identified as a certain animal species by using the fused processing result.
Type: Application
Filed: April 18, 2019
Publication date: February 3, 2022
Applicant: Korea University Research and Business Foundation
Inventors: Hanseok KO, Sangwook PARK, Kyung-Deuk KO, Donghyeon KIM
-
Publication number: 20210117737
Abstract: An earthquake event classification method using an attention-based neural network includes: preprocessing input earthquake data by centering; extracting a feature map by nonlinearly converting the preprocessed earthquake data through a plurality of convolution layers having three or more layers; measuring importance of a learned feature of the nonlinear-converted earthquake data based on an attention technique in which interdependence of channels of the feature map is modeled; correcting a feature value of the measured importance value through element-wise multiply with the learned feature map; performing down-sampling through max-pooling based on the feature value; and classifying an earthquake event by regularizing the down-sampled feature value. Accordingly, main core features inherent in many/complex data are extracted through attention-based deep learning to overcome the limitations of the existing micro earthquake detection technology, thereby enabling earthquake detection even in low SNR environments.
Type: Application
Filed: August 13, 2020
Publication date: April 22, 2021
Applicant: Korea University Research and Business Foundation
Inventors: Hanseok KO, Bon Hwa KU
-
Patent number: 10923126
Abstract: Provided is a method of detecting a voice section, including detecting from at least one image an area where lips exist, obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, and detecting the voice section from the at least one image based on the feature value.
Type: Grant
Filed: March 19, 2015
Date of Patent: February 16, 2021
Assignees: Samsung Electronics Co., Ltd., Korea University Research And Business Foundation
Inventors: Hanseok Ko, Sung-soo Kim, Taeyup Song, Kyungsun Lee, Jae-won Lee
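Illustrative note: a lip-movement feature built from pixel-value differences, as outlined in the abstract above, can be sketched as follows. This is a minimal illustration and not the patented method. The abstract does not say which pixels are compared, so the sketch assumes a mean absolute difference between the same lip region in consecutive frames; the function names and the fixed threshold are hypothetical, and lip detection itself is taken as already done.

```python
def lip_motion_feature(prev_region, curr_region):
    """Mean absolute pixel difference between two equally sized lip regions.

    Each region is a 2D list of grayscale values cropped around the lips.
    """
    total = 0.0
    count = 0
    for prev_row, curr_row in zip(prev_region, curr_region):
        for p, c in zip(prev_row, curr_row):
            total += abs(c - p)
            count += 1
    return total / count

def detect_voice_frames(regions, threshold):
    """Flag each frame transition whose lip motion exceeds the threshold."""
    return [
        lip_motion_feature(a, b) > threshold
        for a, b in zip(regions, regions[1:])
    ]
```

A run of consecutive flagged transitions would then be read as a candidate voice section; smoothing and minimum-duration rules are left out of this sketch.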
-
Patent number: 10772910
Abstract: The present disclosure relates to a pharmaceutical composition for preventing or treating neurodegenerative diseases, the pharmaceutical composition including a graphene nanostructure as an active ingredient.
Type: Grant
Filed: June 11, 2018
Date of Patent: September 15, 2020
Assignees: Seoul National University R&DB Foundation, The Johns Hopkins University
Inventors: Byung Hee Hong, Je Min Yoo, Hanseok Ko, Donghoon Kim
-
Patent number: 10410638
Abstract: A method of converting a feature vector includes extracting a feature sequence from an audio signal including utterance of a user; extracting a feature vector from the feature sequence; acquiring a conversion matrix for reducing a dimension of the feature vector, based on a probability value acquired based on different covariance values; and converting the feature vector by using the conversion matrix.
Type: Grant
Filed: February 27, 2015
Date of Patent: September 10, 2019
Assignees: SAMSUNG ELECTRONICS CO., LTD., Korea University Research and Business Foundation
Inventors: Hanseok Ko, Sung-soo Kim, Jinsang Rho, Suwon Shon, Jae-won Lee
-
Publication number: 20180289646
Abstract: The present disclosure relates to a pharmaceutical composition for preventing or treating neurodegenerative diseases, the pharmaceutical composition including a graphene nanostructure as an active ingredient.
Type: Application
Filed: June 11, 2018
Publication date: October 11, 2018
Inventors: Byung Hee Hong, Je Min Yoo, Hanseok Ko, Donghoon Kim
-
Publication number: 20180247651
Abstract: Provided is a method of detecting a voice section, including detecting from at least one image an area where lips exist, obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, and detecting the voice section from the at least one image based on the feature value.
Type: Application
Filed: March 19, 2015
Publication date: August 30, 2018
Inventors: Hanseok KO, Sung-soo KIM, Taeyup SONG, Kyungsun LEE, Jae-won LEE
-
Publication number: 20180033439
Abstract: A method of converting a feature vector includes extracting a feature sequence from an audio signal including utterance of a user; extracting a feature vector from the feature sequence; acquiring a conversion matrix for reducing a dimension of the feature vector, based on a probability value acquired based on different covariance values; and converting the feature vector by using the conversion matrix.
Type: Application
Filed: February 27, 2015
Publication date: February 1, 2018
Applicants: SAMSUNG ELECTRONICS CO., LTD., Korea University Research and Business Foundation
Inventors: Hanseok KO, Sung-soo KIM, Jinsang RHO, Suwon SHON, Jae-won LEE
-
Patent number: 9842382
Abstract: The present invention provides a method for removing a haze in a single image. In the present invention, a transmission is estimated by using a dark channel prior obtained from a hazy input image. The estimated transmission includes a block artifact. In an exemplary embodiment of the present invention, in order to preserve an edge and remove the block artifact, a refined transmission value is obtained by performing WLS filtering by using an estimated transmission value and a morphologically-processed input image, the image is restored based on the refined transmission value, and then multi-scale tone manipulation image processing is performed.
Type: Grant
Filed: April 11, 2014
Date of Patent: December 12, 2017
Assignees: Hanwha Techwin Co., Ltd., KOREA UNIVERSITY RESEARCH AND BUSINESS FOUNDATION
Inventors: Dubok Park, Hanseok Ko
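Illustrative note: the first step in the abstract above, estimating transmission from a dark channel prior, can be sketched as follows. This is a minimal illustration and not the patented method. It uses the commonly cited formulation t = 1 - omega * dark_channel(I / A); the function names, the omega value, the patch size, and the assumption that the atmospheric light A is already known are all choices made for this sketch. The WLS refinement, morphological processing, and tone manipulation steps are not shown.

```python
def dark_channel(image, patch=3):
    """Per-pixel minimum over color channels, then minimum over a local patch.

    image : H x W list of (r, g, b) tuples with values in [0, 1]
    """
    h, w = len(image), len(image[0])
    r = patch // 2
    per_pixel_min = [[min(px) for px in row] for row in image]
    dark = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                per_pixel_min[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            row.append(min(window))
        dark.append(row)
    return dark

def estimate_transmission(image, airlight, omega=0.95, patch=3):
    """Coarse transmission map; airlight is an assumed known (r, g, b) tuple."""
    normalized = [
        [tuple(c / a for c, a in zip(px, airlight)) for px in row]
        for row in image
    ]
    return [[1.0 - omega * d for d in row] for row in dark_channel(normalized, patch)]
```

Because the minimum is taken over square patches, the resulting map is piecewise constant around dark pixels, which is the block artifact the abstract says the later filtering step is meant to remove.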