Abstract: The method for generating captions, subtitles and dubbing for audiovisual media uses a machine learning-based approach for automatically generating captions from the audio portion of audiovisual media, and further translates the captions to produce both subtitles and dubbing. A speech component of an audio portion of audiovisual media is converted into at least one text string which includes at least one word. Temporal start and end points for the at least one word are determined, and the at least one word is visually inserted into the video portion of the audiovisual media. The temporal start and end points for the at least one word are synchronized with corresponding temporal start and end points of the speech component of the audio portion of the audiovisual media. A latency period may be selectively inserted into broadcast of the audiovisual media such that the synchronization may be selectively adjusted during the latency period.
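The abstract above describes word-level timestamping, grouping words into on-screen captions, and shifting timestamps by a broadcast latency period so captions stay synchronized with the delayed audio. The following is a minimal illustrative sketch of those two operations, not the patented implementation; the `Word` type, `shift_for_latency`, `words_to_cues`, and the `max_gap` grouping heuristic are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Word:
    # One transcribed word with its temporal start and end points (seconds).
    text: str
    start: float
    end: float

def shift_for_latency(words: List[Word], latency: float) -> List[Word]:
    # Delay every word's timestamps by the inserted broadcast latency so
    # the visual text stays aligned with the delayed audio track.
    return [Word(w.text, w.start + latency, w.end + latency) for w in words]

def words_to_cues(words: List[Word], max_gap: float = 0.5) -> List[dict]:
    # Group consecutive words into caption cues, starting a new cue
    # whenever the silence between words exceeds max_gap seconds.
    cues: List[dict] = []
    for w in words:
        if cues and w.start - cues[-1]["end"] <= max_gap:
            cues[-1]["text"] += " " + w.text
            cues[-1]["end"] = w.end
        else:
            cues.append({"text": w.text, "start": w.start, "end": w.end})
    return cues
```

For example, three words at 0.0-0.4 s, 0.5-0.9 s, and 2.0-2.5 s, shifted by a 1.0 s latency, yield two cues: "hello world" starting at 1.0 s and "goodbye" starting at 3.0 s, since the 1.1 s gap exceeds `max_gap`.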
Type:
Application
Filed:
January 4, 2024
Publication date:
May 9, 2024
Applicant:
SYNCWORDS
Inventors:
ASHISH SHAH, SOTIRIS CARTSOS, ALEKSANDR DUBINSKY
Abstract: A system and method for automatically dubbing audiovisual content into multiple languages at the same time, using speech-to-text transcription, language translation, and artificial intelligence engines to perform the actual dubbing in the voice likeness of the original speaker.
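The dubbing abstract describes a three-stage pipeline: speech-to-text transcription, language translation, and synthesis in the original speaker's voice likeness, run for several target languages at once. Below is a hedged structural sketch of that flow only; `transcribe`, `translate`, `synthesize_in_voice`, and `auto_dub` are hypothetical placeholder names standing in for real engines, not APIs from the patent.

```python
from typing import Dict, List

def transcribe(audio: bytes) -> str:
    # Placeholder speech-to-text stage: a real engine would return
    # the text actually spoken in the audio.
    return "hello world"

def translate(text: str, target_lang: str) -> str:
    # Placeholder machine-translation stage with canned translations.
    canned = {"es": "hola mundo", "fr": "bonjour le monde"}
    return canned.get(target_lang, text)

def synthesize_in_voice(text: str, speaker_profile: str, lang: str) -> dict:
    # Placeholder voice-cloning TTS stage: returns a description of the
    # audio a real engine would render in the speaker's voice likeness.
    return {"lang": lang, "text": text, "voice": speaker_profile}

def auto_dub(audio: bytes, speaker_profile: str,
             target_langs: List[str]) -> Dict[str, dict]:
    # Transcribe once, then translate and synthesize per target language,
    # producing all dubbed tracks from the same source utterance.
    source_text = transcribe(audio)
    return {
        lang: synthesize_in_voice(translate(source_text, lang),
                                  speaker_profile, lang)
        for lang in target_langs
    }
```

Transcribing once and fanning out per language is what allows many dubs to be produced "at the same time" from a single pass over the source audio.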