Patents Assigned to ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD.
-
Publication number: 20250078806
Abstract: The present disclosure discloses a method for synthesizing a speech. The method includes generating the speech based on a text with a speech synthesis model, wherein the speech synthesis model includes an embedding layer, a speech synthesis layer, and a position layer; and training the speech synthesis model when an evaluation index meets a preset condition, wherein the evaluation index includes one or more quality indexes determined based on at least a part of the text and at least a part of the speech.
Type: Application
Filed: November 18, 2024
Publication date: March 6, 2025
Applicant: ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Peng ZHANG, Xinhui HU, Xinkang XU, Jian LU
-
Publication number: 20250037720
Abstract: The present disclosure may provide a voice audio data processing system. The voice audio data processing system may obtain voice audio data, which includes one or more voices, each being respectively associated with one of one or more subjects. For one of the one or more voices and the subject associated with the voice, the voice audio processing system may generate a text based on the voice audio data. The text may have one or more sizes, each size corresponding to one of one or more volumes of the voice. The text may have one or more colors, each color corresponding to one of one or more emotion types of the voice.
Type: Application
Filed: October 14, 2024
Publication date: January 30, 2025
Applicant: ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Yichen YU, Yunsan GUO
-
Patent number: 12148415
Abstract: The present disclosure discloses a method for synthesizing a speech. The method includes generating the speech based on a text with a speech synthesis model, wherein the speech synthesis model includes an embedding layer, a speech synthesis layer, and a position layer; and training the speech synthesis model when an evaluation index meets a preset condition, wherein the evaluation index includes one or more quality indexes determined based on at least a part of the text and at least a part of the speech.
Type: Grant
Filed: September 11, 2023
Date of Patent: November 19, 2024
Assignee: ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Peng Zhang, Xinhui Hu, Xinkang Xu, Jian Lu
-
Patent number: 12119004
Abstract: The present disclosure may provide a voice audio data processing system. The voice audio data processing system may obtain voice audio data, which includes one or more voices, each being respectively associated with one of one or more subjects. For one of the one or more voices and the subject associated with the voice, the voice audio processing system may generate a text based on the voice audio data. The text may have one or more sizes, each size corresponding to one of one or more volumes of the voice. The text may have one or more colors, each color corresponding to one of one or more emotion types of the voice.
Type: Grant
Filed: September 8, 2021
Date of Patent: October 15, 2024
Assignee: ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Yichen Yu, Yunsan Guo
-
Publication number: 20230419948
Abstract: The present disclosure discloses a method for synthesizing a speech. The method includes generating the speech based on a text with a speech synthesis model, wherein the speech synthesis model includes an embedding layer, a speech synthesis layer, and a position layer; and training the speech synthesis model when an evaluation index meets a preset condition, wherein the evaluation index includes one or more quality indexes determined based on at least a part of the text and at least a part of the speech.
Type: Application
Filed: September 11, 2023
Publication date: December 28, 2023
Applicant: ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Peng ZHANG, Xinhui HU, Xinkang XU, Jian LU
-
Publication number: 20220084525
Abstract: The present disclosure may provide a voice audio data processing system. The voice audio data processing system may obtain voice audio data, which includes one or more voices, each being respectively associated with one of one or more subjects. For one of the one or more voices and the subject associated with the voice, the voice audio processing system may generate a text based on the voice audio data. The text may have one or more sizes, each size corresponding to one of one or more volumes of the voice. The text may have one or more colors, each color corresponding to one of one or more emotion types of the voice.
Type: Application
Filed: September 8, 2021
Publication date: March 17, 2022
Applicant: ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Yichen YU, Yunsan GUO
-
Publication number: 20220059072
Abstract: The present disclosure discloses a method for synthesizing a speech. The method includes generating the speech based on a text with a speech synthesis model, wherein the speech synthesis model includes an embedding layer, a speech synthesis layer, and a position layer; and training the speech synthesis model when an evaluation index meets a preset condition, wherein the evaluation index includes one or more quality indexes determined based on at least a part of the text and at least a part of the speech.
Type: Application
Filed: August 18, 2021
Publication date: February 24, 2022
Applicant: ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD.
Inventors: Peng ZHANG, Xinhui HU, Xinkang XU, Jian LU