Patents by Inventor Gyeongsu CHAE
Gyeongsu CHAE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12205212
Abstract: A device which generates a speech moving image includes a first encoder, a second encoder, a combination unit, and an image reconstruction unit. The first encoder receives a person background image, that is, the video part of the speech moving image of the person, in which the portion related to the person's speech is covered with a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector. The second encoder receives a speech audio signal that is the audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector. The combination unit generates a combination vector of the compressed image feature vector and the compressed voice feature vector. The image reconstruction unit reconstructs the speech moving image of the person with the combination vector as an input.
Type: Grant
Filed: December 8, 2020
Date of Patent: January 21, 2025
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang
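The abstract leaves the network architecture unspecified beyond the four components it names. The sketch below is only a minimal PyTorch-style illustration of that encoder/combination/reconstruction pipeline; every module name, layer choice, and tensor size is an assumption made for illustration, not taken from the patent.

```python
# Minimal sketch of the encoder/combination/reconstruction pipeline described
# in the abstract. All layer sizes and module names are illustrative assumptions;
# the patent does not disclose a concrete network configuration.
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """Extracts and compresses an image feature vector from a masked person background image."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.compress = nn.Linear(64, feat_dim)   # compression step

    def forward(self, masked_image):               # (B, 3, H, W)
        h = self.conv(masked_image).flatten(1)
        return self.compress(h)                    # (B, feat_dim)

class AudioEncoder(nn.Module):
    """Extracts and compresses a voice feature vector from a speech audio signal."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, 9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(32, 64, 9, stride=4, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.compress = nn.Linear(64, feat_dim)

    def forward(self, audio):                      # (B, 1, T)
        h = self.conv(audio).flatten(1)
        return self.compress(h)

class SpeechImageGenerator(nn.Module):
    """Combines the two compressed feature vectors and reconstructs a speech image frame."""
    def __init__(self, feat_dim=256, out_hw=64):
        super().__init__()
        self.image_encoder = ImageEncoder(feat_dim)
        self.audio_encoder = AudioEncoder(feat_dim)
        self.reconstruct = nn.Sequential(nn.Linear(2 * feat_dim, 3 * out_hw * out_hw), nn.Sigmoid())
        self.out_hw = out_hw

    def forward(self, masked_image, audio):
        combined = torch.cat(
            [self.image_encoder(masked_image), self.audio_encoder(audio)], dim=1
        )                                          # combination vector
        out = self.reconstruct(combined)
        return out.view(-1, 3, self.out_hw, self.out_hw)

# Example: one masked background frame plus 0.2 s of 16 kHz audio.
model = SpeechImageGenerator()
frame = model(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 3200))
print(frame.shape)  # torch.Size([1, 3, 64, 64])
```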
-
Patent number: 12205342
Abstract: A speech video generation device according to an embodiment includes a first encoder that receives an input of a first person background image of a predetermined person partially hidden by a first mask, and extracts a first image feature vector from the first person background image; a second encoder, which receives an input of a second person background image of the person partially hidden by a second mask, and extracts a second image feature vector from the second person background image; a third encoder, which receives an input of a speech audio signal of the person, and extracts a voice feature vector from the speech audio signal; a combining unit, which generates a combined vector of the first image feature vector, the second image feature vector, and the voice feature vector; and a decoder, which reconstructs a speech video of the person using the combined vector as an input.
Type: Grant
Filed: December 15, 2020
Date of Patent: January 21, 2025
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang
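As a rough illustration of the three-encoder variant described above (two differently masked background images plus audio), the following minimal sketch concatenates the three feature vectors and decodes a frame. All dimensions, layers, and names are assumed for illustration only.

```python
# Illustrative sketch only: three encoders (two differently masked background
# images and one audio signal) whose features are concatenated and decoded.
# All dimensions and layer choices are assumptions, not from the patent.
import torch
import torch.nn as nn

def encoder(in_dim, out_dim):
    return nn.Sequential(nn.Flatten(), nn.Linear(in_dim, out_dim), nn.ReLU())

class TwoMaskSpeechVideoModel(nn.Module):
    def __init__(self, img_dim=3 * 64 * 64, audio_dim=3200, feat=128):
        super().__init__()
        self.enc_img1 = encoder(img_dim, feat)     # first person background image (first mask)
        self.enc_img2 = encoder(img_dim, feat)     # second person background image (second mask)
        self.enc_audio = encoder(audio_dim, feat)  # speech audio signal
        self.decoder = nn.Linear(3 * feat, img_dim)  # reconstructs a speech video frame

    def forward(self, img1, img2, audio):
        combined = torch.cat(
            [self.enc_img1(img1), self.enc_img2(img2), self.enc_audio(audio)], dim=1
        )
        return self.decoder(combined).view(-1, 3, 64, 64)

model = TwoMaskSpeechVideoModel()
frame = model(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64), torch.randn(1, 3200))
print(frame.shape)  # torch.Size([1, 3, 64, 64])
```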
-
Publication number: 20240428615
Abstract: A neural network-based key point training apparatus according to an embodiment includes a key point model trained to extract key points from an input image, and an image reconstruction model trained to reconstruct the input image with the key points output by the key point model as the input. The optimized parameters of the key point model and the image reconstruction model can be calculated.
Type: Application
Filed: September 3, 2024
Publication date: December 26, 2024
Inventors: GYEONGSU CHAE, GUEMBUEL HWANG
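The abstract describes joint training of the key point model and the image reconstruction model from a reconstruction signal. A minimal sketch of such a setup is shown below; the layer choices, the MSE objective, and the single optimizer over both models are assumptions, not details from the publication.

```python
# Rough sketch of the described training setup: a key point model extracts key
# points, an image reconstruction model rebuilds the input image from them, and
# the parameters of both are optimized together from the reconstruction error.
# Shapes, losses, and layer choices are assumptions for illustration only.
import torch
import torch.nn as nn

class KeyPointModel(nn.Module):
    def __init__(self, num_points=10):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 2 * num_points))

    def forward(self, image):                   # (B, 3, 64, 64)
        return self.net(image)                  # (B, 2 * num_points): (x, y) key points

class ReconstructionModel(nn.Module):
    def __init__(self, num_points=10):
        super().__init__()
        self.net = nn.Linear(2 * num_points, 3 * 64 * 64)

    def forward(self, keypoints):
        return self.net(keypoints).view(-1, 3, 64, 64)

keypoint_model, recon_model = KeyPointModel(), ReconstructionModel()
optimizer = torch.optim.Adam(
    list(keypoint_model.parameters()) + list(recon_model.parameters()), lr=1e-4
)
image = torch.randn(8, 3, 64, 64)               # a batch of input images
recon = recon_model(keypoint_model(image))      # reconstruct from the predicted key points
loss = nn.functional.mse_loss(recon, image)     # assumed reconstruction objective
loss.backward()
optimizer.step()                                # updates the parameters of both models
```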
-
Publication number: 20240386878
Abstract: An apparatus for synthesizing speech according to an embodiment is a computing apparatus that includes one or more processors and a memory storing one or more programs executed by the one or more processors. The apparatus for synthesizing speech includes a pre-processing module that marks a preset classification symbol on each of the input unit texts; and a speech synthesis module that receives each unit text marked with the classification symbol and synthesizes speech uttering the unit text based on the input unit text.
Type: Application
Filed: July 30, 2024
Publication date: November 21, 2024
Inventors: GYEONGSU CHAE, DALHYUN KIM
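As a hypothetical illustration of the described pre-processing step, the snippet below marks each unit text with a preset classification symbol before handing it to a synthesis routine. The symbol set, the unit-text types, and the placeholder synthesize() function are all assumptions, not details from the publication.

```python
# Hypothetical illustration of the pre-processing module: each unit text is
# marked with a preset classification symbol and then passed to a speech
# synthesis module. Symbols, unit types, and synthesize() are assumptions.
from typing import List, Tuple

CLASSIFICATION_SYMBOLS = {"sentence": "<s>", "phrase": "<p>"}  # preset symbols (assumed)

def preprocess(unit_texts: List[Tuple[str, str]]) -> List[str]:
    """Mark each input unit text with the classification symbol for its unit type."""
    return [f"{CLASSIFICATION_SYMBOLS[kind]} {text}" for kind, text in unit_texts]

def synthesize(marked_text: str) -> bytes:
    """Placeholder for the speech synthesis module (a real TTS model would go here)."""
    return marked_text.encode("utf-8")

units = [("sentence", "Hello, welcome."), ("phrase", "to the demo")]
for marked in preprocess(units):
    audio = synthesize(marked)   # speech uttering the unit text
    print(marked, len(audio))
```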
-
Patent number: 12148431
Abstract: A device according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors. The device includes a first encoder configured to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder configured to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner configured to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder configured to reconstruct the speech video of the person using the combined vector as an input.
Type: Grant
Filed: June 19, 2020
Date of Patent: November 19, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang, Sungwoo Park, Seyoung Jang
-
Patent number: 12131441
Abstract: A learning device for generating an image according to an embodiment disclosed herein is a computing device including one or more processors and a memory storing one or more programs executed by the one or more processors. The learning device includes a first machine learning model that generates a mask for masking a portion related to speech in a person basic image with the person basic image as an input, and generates a person background image by synthesizing the person basic image and the mask.
Type: Grant
Filed: December 1, 2020
Date of Patent: October 29, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang
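A minimal sketch of the described masking step follows: a model predicts a mask over the speech-related region of the person basic image, and the person background image is obtained by applying that mask. The layer choices and the compositing rule (zeroing out the masked pixels) are assumptions made for illustration.

```python
# Sketch of the masking step: a model predicts a per-pixel mask over the
# speech-related region of the person basic image, and the person background
# image is synthesized from the basic image and the mask. Layers and the
# compositing rule are illustrative assumptions.
import torch
import torch.nn as nn

class MaskGenerator(nn.Module):
    """First machine learning model: predicts a mask from the person basic image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),  # mask values in [0, 1]
        )

    def forward(self, basic_image):             # (B, 3, H, W)
        return self.net(basic_image)            # (B, 1, H, W)

def make_person_background(basic_image, mask):
    """Synthesize the person background image by covering the masked (speech) region."""
    return basic_image * (1.0 - mask)           # assumed compositing: zero out masked pixels

gen = MaskGenerator()
basic = torch.randn(1, 3, 64, 64)
background = make_person_background(basic, gen(basic))
print(background.shape)  # torch.Size([1, 3, 64, 64])
```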
-
Patent number: 12112571
Abstract: A neural network-based key point training apparatus according to an embodiment disclosed herein includes a key point model trained to extract key points from an input image and an image reconstruction model trained to reconstruct the input image with the key points output by the key point model as the input.
Type: Grant
Filed: December 1, 2020
Date of Patent: October 8, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang
-
Patent number: 12080270
Abstract: An apparatus for synthesizing speech according to an embodiment is a computing apparatus that includes one or more processors and a memory storing one or more programs executed by the one or more processors. The apparatus for synthesizing speech includes a pre-processing module that marks a preset classification symbol on each of the input unit texts; and a speech synthesis module that receives each unit text marked with the classification symbol and synthesizes speech uttering the unit text based on the input unit text.
Type: Grant
Filed: December 22, 2020
Date of Patent: September 3, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Dalhyun Kim
-
Patent number: 11972516
Abstract: A device for generating a speech video according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors, and the device includes a video part generator configured to receive a person background image of a person and generate a video part of a speech video of the person; and an audio part generator configured to receive text, generate an audio part of the speech video of the person, and provide speech-related information occurring during the generation of the audio part to the video part generator.
Type: Grant
Filed: June 19, 2020
Date of Patent: April 30, 2024
Assignee: DEEPBRAIN AI INC.
Inventors: Gyeongsu Chae, Guembuel Hwang, Sungwoo Park, Seyoung Jang
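The sketch below illustrates one possible reading of this arrangement: an audio part generator produces speech from text and also hands per-unit speech-related features to a video part generator that animates the person background image. The toy embeddings, dimensions, and the exact form of the shared information are assumptions, not details from the patent.

```python
# Illustrative only: an audio part generator produces speech from text and also
# passes speech-related information (here, an assumed per-character feature
# sequence) to a video part generator that animates the person background image.
import torch
import torch.nn as nn

class AudioPartGenerator(nn.Module):
    def __init__(self, feat=64):
        super().__init__()
        self.embed = nn.Embedding(256, feat)        # toy character embedding (assumption)
        self.to_audio = nn.Linear(feat, 160)        # toy waveform chunk per character

    def forward(self, text_ids):                    # (B, L) character ids
        speech_info = self.embed(text_ids)          # speech-related information, (B, L, feat)
        audio = self.to_audio(speech_info)          # (B, L, 160) audio part
        return audio, speech_info

class VideoPartGenerator(nn.Module):
    def __init__(self, feat=64):
        super().__init__()
        self.net = nn.Linear(3 * 64 * 64 + feat, 3 * 64 * 64)

    def forward(self, background, speech_info):     # one frame per text unit
        b, l, _ = speech_info.shape
        flat_bg = background.flatten(1).unsqueeze(1).expand(b, l, -1)
        frames = self.net(torch.cat([flat_bg, speech_info], dim=-1))
        return frames.view(b, l, 3, 64, 64)         # video part of the speech video

audio_gen, video_gen = AudioPartGenerator(), VideoPartGenerator()
text_ids = torch.randint(0, 256, (1, 12))
audio, info = audio_gen(text_ids)                   # info is shared with the video generator
video = video_gen(torch.randn(1, 3, 64, 64), info)
print(audio.shape, video.shape)
```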
-
Publication number: 20220398793
Abstract: A device for generating a speech moving image according to an embodiment includes a first encoder that receives a person background image, that is, the video part of the speech moving image of the person, in which the portion related to the person's speech is covered with a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector; a second encoder that receives a speech audio signal that is the audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector; a combination unit that generates a combination vector of the compressed image feature vector and the compressed voice feature vector; and an image reconstruction unit that reconstructs the speech moving image of the person with the combination vector as an input.
Type: Application
Filed: December 8, 2020
Publication date: December 15, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
-
Publication number: 20220399025
Abstract: A device according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors. The device includes a first encoder configured to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder configured to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner configured to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder configured to reconstruct the speech video of the person using the combined vector as an input.
Type: Application
Filed: June 19, 2020
Publication date: December 15, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG, Sungwoo PARK, Seyoung JANG
-
Publication number: 20220375190
Abstract: A speech video generation device according to an embodiment includes a first encoder that receives an input of a first person background image of a predetermined person partially hidden by a first mask, and extracts a first image feature vector from the first person background image; a second encoder, which receives an input of a second person background image of the person partially hidden by a second mask, and extracts a second image feature vector from the second person background image; a third encoder, which receives an input of a speech audio signal of the person, and extracts a voice feature vector from the speech audio signal; a combining unit, which generates a combined vector of the first image feature vector, the second image feature vector, and the voice feature vector; and a decoder, which reconstructs a speech video of the person using the combined vector as an input.
Type: Application
Filed: December 15, 2020
Publication date: November 24, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
-
Publication number: 20220375224
Abstract: A speech video generation device according to an embodiment includes a first encoder, which receives an input of a person background image that is a video part in a speech video of a predetermined person, and extracts an image feature vector from the person background image; a second encoder, which receives an input of a speech audio signal that is an audio part in the speech video, and extracts a voice feature vector from the speech audio signal; a combining unit, which generates a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder; a first decoder, which reconstructs the speech video of the person using the combined vector as an input; and a second decoder, which predicts a landmark of the speech video using the combined vector as an input.
Type: Application
Filed: December 15, 2020
Publication date: November 24, 2022
Inventor: Gyeongsu CHAE
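As an illustration of the dual-decoder structure, the sketch below feeds one combined image/audio feature vector to both a frame decoder and a landmark decoder. The dimensions and the number of landmarks (68) are assumptions made only for illustration.

```python
# Sketch of the dual-decoder idea: one combined image/audio feature vector
# feeds both a frame decoder and a landmark decoder. Dimensions and the
# number of landmarks are illustrative assumptions.
import torch
import torch.nn as nn

class DualDecoderModel(nn.Module):
    def __init__(self, feat=128, num_landmarks=68):
        super().__init__()
        self.num_landmarks = num_landmarks
        self.enc_img = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, feat))   # first encoder
        self.enc_audio = nn.Sequential(nn.Flatten(), nn.Linear(3200, feat))        # second encoder
        self.frame_decoder = nn.Linear(2 * feat, 3 * 64 * 64)                      # first decoder: video frame
        self.landmark_decoder = nn.Linear(2 * feat, 2 * num_landmarks)             # second decoder: landmarks

    def forward(self, background, audio):
        combined = torch.cat([self.enc_img(background), self.enc_audio(audio)], dim=1)  # combined vector
        frame = self.frame_decoder(combined).view(-1, 3, 64, 64)
        landmarks = self.landmark_decoder(combined).view(-1, self.num_landmarks, 2)
        return frame, landmarks

model = DualDecoderModel()
frame, landmarks = model(torch.randn(1, 3, 64, 64), torch.randn(1, 3200))
print(frame.shape, landmarks.shape)  # torch.Size([1, 3, 64, 64]) torch.Size([1, 68, 2])
```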
-
Publication number: 20220366890
Abstract: An apparatus for synthesizing speech according to an embodiment is a computing apparatus that includes one or more processors and a memory storing one or more programs executed by the one or more processors. The apparatus for synthesizing speech includes a pre-processing module that marks a preset classification symbol on each of the input unit texts; and a speech synthesis module that receives each unit text marked with the classification symbol and synthesizes speech uttering the unit text based on the input unit text.
Type: Application
Filed: December 22, 2020
Publication date: November 17, 2022
Inventors: Gyeongsu CHAE, Dalhyun KIM
-
Publication number: 20220358703
Abstract: A device for generating a speech video may include a first encoder to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder to reconstruct the speech video of the person using the combined vector as an input. The person background image input to the first encoder includes a face and an upper body of the person, with a portion related to speech of the person covered with a mask.
Type: Application
Filed: June 19, 2020
Publication date: November 10, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG, Sungwoo PARK, Seyoung JANG
-
Publication number: 20220351439
Abstract: A device for generating a speech video according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors, and the device includes a video part generator configured to receive a person background image of a person and generate a video part of a speech video of the person; and an audio part generator configured to receive text, generate an audio part of the speech video of the person, and provide speech-related information occurring during the generation of the audio part to the video part generator.
Type: Application
Filed: June 19, 2020
Publication date: November 3, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG, Sungwoo PARK, Seyoung JANG
-
Publication number: 20220351348
Abstract: A learning device for generating an image according to an embodiment disclosed herein is a computing device including one or more processors and a memory storing one or more programs executed by the one or more processors. The learning device includes a first machine learning model that generates a mask for masking a portion related to speech in a person basic image with the person basic image as an input, and generates a person background image by synthesizing the person basic image and the mask.
Type: Application
Filed: December 1, 2020
Publication date: November 3, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
-
Publication number: 20220343679
Abstract: A neural network-based key point training apparatus according to an embodiment disclosed herein includes a key point model trained to extract key points from an input image and an image reconstruction model trained to reconstruct the input image with the key points output by the key point model as the input.
Type: Application
Filed: December 1, 2020
Publication date: October 27, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
-
Publication number: 20220343651
Abstract: A device for generating a speech image according to an embodiment disclosed herein is a speech image generation device including one or more processors and a memory storing one or more programs executed by the one or more processors. The device includes a first machine learning model that extracts an image feature with a speech image of a person as an input to reconstruct the speech image from the extracted image feature and a second machine learning model that predicts the image feature with a speech audio signal of the person as an input.
Type: Application
Filed: December 8, 2020
Publication date: October 27, 2022
Inventors: Gyeongsu CHAE, Guembuel HWANG
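A minimal sketch of the two-model setup described above follows: an image autoencoder learns an image feature from speech images, and a second model is trained to predict that feature from the speech audio alone. All sizes and both losses are assumptions made only for illustration.

```python
# Minimal sketch: an image autoencoder extracts an image feature and reconstructs
# the speech image from it, while a second model learns to predict that same
# feature from the speech audio signal. Sizes and losses are assumptions.
import torch
import torch.nn as nn

class ImageAutoencoder(nn.Module):
    """First model: extracts an image feature and reconstructs the speech image from it."""
    def __init__(self, feat=128):
        super().__init__()
        self.encode = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, feat))
        self.decode = nn.Linear(feat, 3 * 64 * 64)

    def forward(self, image):
        feature = self.encode(image)
        recon = self.decode(feature).view(-1, 3, 64, 64)
        return feature, recon

class AudioToImageFeature(nn.Module):
    """Second model: predicts the image feature from the speech audio signal."""
    def __init__(self, feat=128):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3200, feat))

    def forward(self, audio):
        return self.net(audio)

autoencoder, audio_model = ImageAutoencoder(), AudioToImageFeature()
image, audio = torch.randn(4, 3, 64, 64), torch.randn(4, 3200)
feature, recon = autoencoder(image)
recon_loss = nn.functional.mse_loss(recon, image)                             # assumed reconstruction objective
feature_loss = nn.functional.mse_loss(audio_model(audio), feature.detach())   # match the image feature from audio
print(recon_loss.item(), feature_loss.item())
```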