Patents by Inventor Gyeongsu CHAE
Gyeongsu CHAE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12205212
Abstract: A device which generates a speech moving image includes a first encoder, a second encoder, a combination unit, and an image reconstruction unit. The first encoder receives a person background image, that is, the video part of the speech moving image of the person, in which the portion related to the person's speech is covered with a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector. The second encoder receives a speech audio signal that is the audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector. The combination unit generates a combination vector of the compressed image feature vector and the compressed voice feature vector. The image reconstruction unit reconstructs the speech moving image of the person with the combination vector as an input.
Type: Grant
Filed: December 8, 2020
Date of Patent: January 21, 2025
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang
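The abstract leaves the network architecture unspecified beyond the four components it names. The sketch below is only a minimal PyTorch-style illustration of that encoder/combination/reconstruction pipeline; every module name, layer choice, and tensor size is an assumption made for illustration, not taken from the patent.

```python
# Minimal sketch of the encoder/combination/reconstruction pipeline described
# in the abstract. All layer sizes and module names are illustrative assumptions;
# the patent does not disclose a concrete network configuration.
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """Extracts and compresses an image feature vector from a masked person background image."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.compress = nn.Linear(64, feat_dim)   # compression step

    def forward(self, masked_image):               # (B, 3, H, W)
        h = self.conv(masked_image).flatten(1)
        return self.compress(h)                    # (B, feat_dim)

class AudioEncoder(nn.Module):
    """Extracts and compresses a voice feature vector from a speech audio signal."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, 9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(32, 64, 9, stride=4, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.compress = nn.Linear(64, feat_dim)

    def forward(self, audio):                      # (B, 1, T)
        h = self.conv(audio).flatten(1)
        return self.compress(h)

class SpeechImageGenerator(nn.Module):
    """Combines the two compressed feature vectors and reconstructs a speech image frame."""
    def __init__(self, feat_dim=256, out_hw=64):
        super().__init__()
        self.image_encoder = ImageEncoder(feat_dim)
        self.audio_encoder = AudioEncoder(feat_dim)
        self.reconstruct = nn.Sequential(nn.Linear(2 * feat_dim, 3 * out_hw * out_hw), nn.Sigmoid())
        self.out_hw = out_hw

    def forward(self, masked_image, audio):
        combined = torch.cat(
            [self.image_encoder(masked_image), self.audio_encoder(audio)], dim=1
        )                                          # combination vector
        out = self.reconstruct(combined)
        return out.view(-1, 3, self.out_hw, self.out_hw)

# Example: one masked background frame plus 0.2 s of 16 kHz audio.
model = SpeechImageGenerator()
frame = model(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 3200))
print(frame.shape)  # torch.Size([1, 3, 64, 64])
```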
-
Patent number: 12205342
Abstract: A speech video generation device according to an embodiment includes a first encoder that receives an input of a first person background image of a predetermined person partially hidden by a first mask, and extracts a first image feature vector from the first person background image; a second encoder, which receives an input of a second person background image of the person partially hidden by a second mask, and extracts a second image feature vector from the second person background image; a third encoder, which receives an input of a speech audio signal of the person, and extracts a voice feature vector from the speech audio signal; a combining unit, which generates a combined vector of the first image feature vector, the second image feature vector, and the voice feature vector; and a decoder, which reconstructs a speech video of the person using the combined vector as an input.
Type: Grant
Filed: December 15, 2020
Date of Patent: January 21, 2025
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang
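As a rough illustration of the three-encoder variant described above (two differently masked background images plus audio), the following minimal sketch concatenates the three feature vectors and decodes a frame. All dimensions, layers, and names are assumed for illustration only.

```python
# Illustrative sketch only: three encoders (two differently masked background
# images and one audio signal) whose features are concatenated and decoded.
# All dimensions and layer choices are assumptions, not from the patent.
import torch
import torch.nn as nn

def encoder(in_dim, out_dim):
    return nn.Sequential(nn.Flatten(), nn.Linear(in_dim, out_dim), nn.ReLU())

class TwoMaskSpeechVideoModel(nn.Module):
    def __init__(self, img_dim=3 * 64 * 64, audio_dim=3200, feat=128):
        super().__init__()
        self.enc_img1 = encoder(img_dim, feat)     # first person background image (first mask)
        self.enc_img2 = encoder(img_dim, feat)     # second person background image (second mask)
        self.enc_audio = encoder(audio_dim, feat)  # speech audio signal
        self.decoder = nn.Linear(3 * feat, img_dim)  # reconstructs a speech video frame

    def forward(self, img1, img2, audio):
        combined = torch.cat(
            [self.enc_img1(img1), self.enc_img2(img2), self.enc_audio(audio)], dim=1
        )
        return self.decoder(combined).view(-1, 3, 64, 64)

model = TwoMaskSpeechVideoModel()
frame = model(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64), torch.randn(1, 3200))
print(frame.shape)  # torch.Size([1, 3, 64, 64])
```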
-
Publication number: 20240428615
Abstract: A neural network-based key point training apparatus according to an embodiment includes a key point model trained to extract key points from an input image, and an image reconstruction model trained to reconstruct the input image with the key points output by the key point model as the input. The optimized parameters of the key point model and the image reconstruction model can be calculated.
Type: Application
Filed: September 3, 2024
Publication date: December 26, 2024
Inventors: GYEONGSU CHAE, GUEMBUEL HWANG
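The abstract describes joint training of the key point model and the image reconstruction model from a reconstruction signal. A minimal sketch of such a setup is shown below; the layer choices, the MSE objective, and the single optimizer over both models are assumptions, not details from the publication.

```python
# Rough sketch of the described training setup: a key point model extracts key
# points, an image reconstruction model rebuilds the input image from them, and
# the parameters of both are optimized together from the reconstruction error.
# Shapes, losses, and layer choices are assumptions for illustration only.
import torch
import torch.nn as nn

class KeyPointModel(nn.Module):
    def __init__(self, num_points=10):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 2 * num_points))

    def forward(self, image):                   # (B, 3, 64, 64)
        return self.net(image)                  # (B, 2 * num_points): (x, y) key points

class ReconstructionModel(nn.Module):
    def __init__(self, num_points=10):
        super().__init__()
        self.net = nn.Linear(2 * num_points, 3 * 64 * 64)

    def forward(self, keypoints):
        return self.net(keypoints).view(-1, 3, 64, 64)

keypoint_model, recon_model = KeyPointModel(), ReconstructionModel()
optimizer = torch.optim.Adam(
    list(keypoint_model.parameters()) + list(recon_model.parameters()), lr=1e-4
)
image = torch.randn(8, 3, 64, 64)               # a batch of input images
recon = recon_model(keypoint_model(image))      # reconstruct from the predicted key points
loss = nn.functional.mse_loss(recon, image)     # assumed reconstruction objective
loss.backward()
optimizer.step()                                # updates the parameters of both models
```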
-
Publication number: 20240386878
Abstract: An apparatus for synthesizing speech according to an embodiment is a computing apparatus that includes one or more processors and a memory storing one or more programs executed by the one or more processors. The apparatus for synthesizing speech includes a pre-processing module that marks a preset classification symbol on each of the input unit texts; and a speech synthesis module that receives each unit text marked with the classification symbol and synthesizes speech uttering the unit text based on the input unit text.
Type: Application
Filed: July 30, 2024
Publication date: November 21, 2024
Inventors: GYEONGSU CHAE, DALHYUN KIM
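As a hypothetical illustration of the described pre-processing step, the snippet below marks each unit text with a preset classification symbol before handing it to a synthesis routine. The symbol set, the unit-text types, and the placeholder synthesize() function are all assumptions, not details from the publication.

```python
# Hypothetical illustration of the pre-processing module: each unit text is
# marked with a preset classification symbol and then passed to a speech
# synthesis module. Symbols, unit types, and synthesize() are assumptions.
from typing import List, Tuple

CLASSIFICATION_SYMBOLS = {"sentence": "<s>", "phrase": "<p>"}  # preset symbols (assumed)

def preprocess(unit_texts: List[Tuple[str, str]]) -> List[str]:
    """Mark each input unit text with the classification symbol for its unit type."""
    return [f"{CLASSIFICATION_SYMBOLS[kind]} {text}" for kind, text in unit_texts]

def synthesize(marked_text: str) -> bytes:
    """Placeholder for the speech synthesis module (a real TTS model would go here)."""
    return marked_text.encode("utf-8")

units = [("sentence", "Hello, welcome."), ("phrase", "to the demo")]
for marked in preprocess(units):
    audio = synthesize(marked)   # speech uttering the unit text
    print(marked, len(audio))
```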
-
Patent number: 12148431
Abstract: A device according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors. The device includes a first encoder configured to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder configured to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner configured to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder configured to reconstruct the speech video of the person using the combined vector as an input.
Type: Grant
Filed: June 19, 2020
Date of Patent: November 19, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang, Sungwoo Park, Seyoung Jang
-
Patent number: 12131441
Abstract: A learning device for generating an image according to an embodiment disclosed herein is a computing device including one or more processors and a memory storing one or more programs executed by the one or more processors. The learning device includes a first machine learning model that generates a mask for masking a portion related to speech in a person basic image with the person basic image as an input, and generates a person background image by synthesizing the person basic image and the mask.
Type: Grant
Filed: December 1, 2020
Date of Patent: October 29, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang
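A minimal sketch of the described masking step follows: a model predicts a mask over the speech-related region of the person basic image, and the person background image is obtained by applying that mask. The layer choices and the compositing rule (zeroing out the masked pixels) are assumptions made for illustration.

```python
# Sketch of the masking step: a model predicts a per-pixel mask over the
# speech-related region of the person basic image, and the person background
# image is synthesized from the basic image and the mask. Layers and the
# compositing rule are illustrative assumptions.
import torch
import torch.nn as nn

class MaskGenerator(nn.Module):
    """First machine learning model: predicts a mask from the person basic image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),  # mask values in [0, 1]
        )

    def forward(self, basic_image):             # (B, 3, H, W)
        return self.net(basic_image)            # (B, 1, H, W)

def make_person_background(basic_image, mask):
    """Synthesize the person background image by covering the masked (speech) region."""
    return basic_image * (1.0 - mask)           # assumed compositing: zero out masked pixels

gen = MaskGenerator()
basic = torch.randn(1, 3, 64, 64)
background = make_person_background(basic, gen(basic))
print(background.shape)  # torch.Size([1, 3, 64, 64])
```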
-
Patent number: 12112571
Abstract: A neural network-based key point training apparatus according to an embodiment disclosed herein includes a key point model trained to extract key points from an input image and an image reconstruction model trained to reconstruct the input image with the key points output by the key point model as the input.
Type: Grant
Filed: December 1, 2020
Date of Patent: October 8, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang
-
Patent number: 12080270
Abstract: An apparatus for synthesizing speech according to an embodiment is a computing apparatus that includes one or more processors and a memory storing one or more programs executed by the one or more processors. The apparatus for synthesizing speech includes a pre-processing module that marks a preset classification symbol on each of the input unit texts; and a speech synthesis module that receives each unit text marked with the classification symbol and synthesizes speech uttering the unit text based on the input unit text.
Type: Grant
Filed: December 22, 2020
Date of Patent: September 3, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Dalhyun Kim
-
Patent number: 11972516
Abstract: A device for generating a speech video according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors, and the device includes a video part generator configured to receive a person background image of a person and generate a video part of a speech video of the person; and an audio part generator configured to receive text, generate an audio part of the speech video of the person, and provide speech-related information occurring during the generation of the audio part to the video part generator.
Type: Grant
Filed: June 19, 2020
Date of Patent: April 30, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang, Sungwoo Park, Seyoung Jang
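The sketch below illustrates one possible reading of this arrangement: an audio part generator produces speech from text and also hands per-unit speech-related features to a video part generator that animates the person background image. The toy embeddings, dimensions, and the exact form of the shared information are assumptions, not details from the patent.

```python
# Illustrative only: an audio part generator produces speech from text and also
# passes speech-related information (here, an assumed per-character feature
# sequence) to a video part generator that animates the person background image.
import torch
import torch.nn as nn

class AudioPartGenerator(nn.Module):
    def __init__(self, feat=64):
        super().__init__()
        self.embed = nn.Embedding(256, feat)        # toy character embedding (assumption)
        self.to_audio = nn.Linear(feat, 160)        # toy waveform chunk per character

    def forward(self, text_ids):                    # (B, L) character ids
        speech_info = self.embed(text_ids)          # speech-related information, (B, L, feat)
        audio = self.to_audio(speech_info)          # (B, L, 160) audio part
        return audio, speech_info

class VideoPartGenerator(nn.Module):
    def __init__(self, feat=64):
        super().__init__()
        self.net = nn.Linear(3 * 64 * 64 + feat, 3 * 64 * 64)

    def forward(self, background, speech_info):     # one frame per text unit
        b, l, _ = speech_info.shape
        flat_bg = background.flatten(1).unsqueeze(1).expand(b, l, -1)
        frames = self.net(torch.cat([flat_bg, speech_info], dim=-1))
        return frames.view(b, l, 3, 64, 64)         # video part of the speech video

audio_gen, video_gen = AudioPartGenerator(), VideoPartGenerator()
text_ids = torch.randint(0, 256, (1, 12))
audio, info = audio_gen(text_ids)                   # info is shared with the video generator
video = video_gen(torch.randn(1, 3, 64, 64), info)
print(audio.shape, video.shape)
```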
-
Publication number: 20220398793
Abstract: A device for generating a speech moving image according to an embodiment includes a first encoder that receives a person background image, that is, the video part of the speech moving image of the person, in which the portion related to the person's speech is covered with a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector; a second encoder that receives a speech audio signal that is the audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector; a combination unit that generates a combination vector of the compressed image feature vector and the compressed voice feature vector; and an image reconstruction unit that reconstructs the speech moving image of the person with the combination vector as an input.
Type: Application
Filed: December 8, 2020
Publication date: December 15, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
-
Publication number: 20220399025
Abstract: A device according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors. The device includes a first encoder configured to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder configured to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner configured to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder configured to reconstruct the speech video of the person using the combined vector as an input.
Type: Application
Filed: June 19, 2020
Publication date: December 15, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG, Sungwoo PARK, Seyoung JANG
-
Publication number: 20220375190
Abstract: A speech video generation device according to an embodiment includes a first encoder that receives an input of a first person background image of a predetermined person partially hidden by a first mask, and extracts a first image feature vector from the first person background image; a second encoder, which receives an input of a second person background image of the person partially hidden by a second mask, and extracts a second image feature vector from the second person background image; a third encoder, which receives an input of a speech audio signal of the person, and extracts a voice feature vector from the speech audio signal; a combining unit, which generates a combined vector of the first image feature vector, the second image feature vector, and the voice feature vector; and a decoder, which reconstructs a speech video of the person using the combined vector as an input.
Type: Application
Filed: December 15, 2020
Publication date: November 24, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
-
Publication number: 20220375224
Abstract: A speech video generation device according to an embodiment includes a first encoder, which receives an input of a person background image that is a video part in a speech video of a predetermined person, and extracts an image feature vector from the person background image; a second encoder, which receives an input of a speech audio signal that is an audio part in the speech video, and extracts a voice feature vector from the speech audio signal; a combining unit, which generates a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder; a first decoder, which reconstructs the speech video of the person using the combined vector as an input; and a second decoder, which predicts a landmark of the speech video using the combined vector as an input.
Type: Application
Filed: December 15, 2020
Publication date: November 24, 2022
Inventor: Gyeongsu CHAE
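As an illustration of the dual-decoder structure, the sketch below feeds one combined image/audio feature vector to both a frame decoder and a landmark decoder. The dimensions and the number of landmarks (68) are assumptions made only for illustration.

```python
# Sketch of the dual-decoder idea: one combined image/audio feature vector
# feeds both a frame decoder and a landmark decoder. Dimensions and the
# number of landmarks are illustrative assumptions.
import torch
import torch.nn as nn

class DualDecoderModel(nn.Module):
    def __init__(self, feat=128, num_landmarks=68):
        super().__init__()
        self.num_landmarks = num_landmarks
        self.enc_img = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, feat))   # first encoder
        self.enc_audio = nn.Sequential(nn.Flatten(), nn.Linear(3200, feat))        # second encoder
        self.frame_decoder = nn.Linear(2 * feat, 3 * 64 * 64)                      # first decoder: video frame
        self.landmark_decoder = nn.Linear(2 * feat, 2 * num_landmarks)             # second decoder: landmarks

    def forward(self, background, audio):
        combined = torch.cat([self.enc_img(background), self.enc_audio(audio)], dim=1)  # combined vector
        frame = self.frame_decoder(combined).view(-1, 3, 64, 64)
        landmarks = self.landmark_decoder(combined).view(-1, self.num_landmarks, 2)
        return frame, landmarks

model = DualDecoderModel()
frame, landmarks = model(torch.randn(1, 3, 64, 64), torch.randn(1, 3200))
print(frame.shape, landmarks.shape)  # torch.Size([1, 3, 64, 64]) torch.Size([1, 68, 2])
```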
-
Publication number: 20220366890
Abstract: An apparatus for synthesizing speech according to an embodiment is a computing apparatus that includes one or more processors and a memory storing one or more programs executed by the one or more processors. The apparatus for synthesizing speech includes a pre-processing module that marks a preset classification symbol on each of the input unit texts; and a speech synthesis module that receives each unit text marked with the classification symbol and synthesizes speech uttering the unit text based on the input unit text.
Type: Application
Filed: December 22, 2020
Publication date: November 17, 2022
Inventors: Gyeongsu CHAE, Dalhyun KIM
-
Publication number: 20220358703
Abstract: A device for generating a speech video may include a first encoder to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder to reconstruct the speech video of the person using the combined vector as an input. The person background image input to the first encoder includes a face and an upper body of the person, with a portion related to speech of the person covered with a mask.
Type: Application
Filed: June 19, 2020
Publication date: November 10, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG, Sungwoo PARK, Seyoung JANG
-
Publication number: 20220351439
Abstract: A device for generating a speech video according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors, and the device includes a video part generator configured to receive a person background image of a person and generate a video part of a speech video of the person; and an audio part generator configured to receive text, generate an audio part of the speech video of the person, and provide speech-related information occurring during the generation of the audio part to the video part generator.
Type: Application
Filed: June 19, 2020
Publication date: November 3, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG, Sungwoo PARK, Seyoung JANG
-
Publication number: 20220351348
Abstract: A learning device for generating an image according to an embodiment disclosed herein is a computing device including one or more processors and a memory storing one or more programs executed by the one or more processors. The learning device includes a first machine learning model that generates a mask for masking a portion related to speech in a person basic image with the person basic image as an input, and generates a person background image by synthesizing the person basic image and the mask.
Type: Application
Filed: December 1, 2020
Publication date: November 3, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
-
Publication number: 20220343679
Abstract: A neural network-based key point training apparatus according to an embodiment disclosed herein includes a key point model trained to extract key points from an input image and an image reconstruction model trained to reconstruct the input image with the key points output by the key point model as the input.
Type: Application
Filed: December 1, 2020
Publication date: October 27, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
-
Publication number: 20220343651
Abstract: A device for generating a speech image according to an embodiment disclosed herein is a speech image generation device including one or more processors and a memory storing one or more programs executed by the one or more processors. The device includes a first machine learning model that extracts an image feature with a speech image of a person as an input to reconstruct the speech image from the extracted image feature and a second machine learning model that predicts the image feature with a speech audio signal of the person as an input.
Type: Application
Filed: December 8, 2020
Publication date: October 27, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
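A minimal sketch of the two-model setup described above follows: an image autoencoder learns an image feature from speech images, and a second model is trained to predict that feature from the speech audio alone. All sizes and both losses are assumptions made only for illustration.

```python
# Minimal sketch: an image autoencoder extracts an image feature and reconstructs
# the speech image from it, while a second model learns to predict that same
# feature from the speech audio signal. Sizes and losses are assumptions.
import torch
import torch.nn as nn

class ImageAutoencoder(nn.Module):
    """First model: extracts an image feature and reconstructs the speech image from it."""
    def __init__(self, feat=128):
        super().__init__()
        self.encode = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, feat))
        self.decode = nn.Linear(feat, 3 * 64 * 64)

    def forward(self, image):
        feature = self.encode(image)
        recon = self.decode(feature).view(-1, 3, 64, 64)
        return feature, recon

class AudioToImageFeature(nn.Module):
    """Second model: predicts the image feature from the speech audio signal."""
    def __init__(self, feat=128):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3200, feat))

    def forward(self, audio):
        return self.net(audio)

autoencoder, audio_model = ImageAutoencoder(), AudioToImageFeature()
image, audio = torch.randn(4, 3, 64, 64), torch.randn(4, 3200)
feature, recon = autoencoder(image)
recon_loss = nn.functional.mse_loss(recon, image)                             # assumed reconstruction objective
feature_loss = nn.functional.mse_loss(audio_model(audio), feature.detach())   # match the image feature from audio
print(recon_loss.item(), feature_loss.item())
```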