Patents by Inventor Mikayel MIRZOYAN

Mikayel MIRZOYAN has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ASSIGNING SSML TAGS TO AN AUDIO CORPUS

Publication number: 20230317057

Abstract: This disclosure describes a system that converts an audio object (e.g., an audio book, a podcast, a videoconference meeting) to text with SSML tags so that any future text-to-speech conversion enables speech synthesis to sound more human-like. The system analyzes the audio object to identify speech output characteristics for different tokens. Variations in speech output characteristics can distinguish between an utterance spoken by one character and an utterance spoken by another character. The system assigns the tokens to the characters and compares a speech output characteristic for a token to a baseline speech output characteristic associated with an identified character. Next, the system determines an amount of deviation between the speech output characteristic for the token and the baseline speech output characteristic. The system uses this deviation to determine a relative speech output characteristic value, which is to be included in an SSML tag for a token.
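
Illustrative sketch (not taken from the patent application): the abstract describes comparing a token's speech output characteristic to a per-character baseline and writing the relative deviation into an SSML tag. The Python below shows one way that step could look, assuming pitch in Hz as the characteristic, a simple per-speaker mean as the baseline, and the SSML `<prosody pitch="...">` attribute with a percentage change as the output. The function names, the sample data, and the choice of mean and percentage are all hypothetical.

```python
from statistics import mean
from xml.sax.saxutils import escape


def baseline_pitch(tokens):
    """Return a hypothetical baseline: the mean pitch (Hz) per speaker.

    `tokens` is an iterable of (speaker, text, pitch_hz) tuples.
    """
    by_speaker = {}
    for speaker, _text, pitch_hz in tokens:
        by_speaker.setdefault(speaker, []).append(pitch_hz)
    return {speaker: mean(values) for speaker, values in by_speaker.items()}


def prosody_tag(text, pitch_hz, baseline_hz):
    """Wrap a token in <prosody>, with pitch as a signed percentage
    deviation from the speaker's baseline (an assumed encoding)."""
    deviation = (pitch_hz - baseline_hz) / baseline_hz * 100
    return f'<prosody pitch="{deviation:+.0f}%">{escape(text)}</prosody>'


# Made-up measurements: (speaker, token text, measured pitch in Hz).
tokens = [
    ("narrator", "Hello", 100.0),
    ("narrator", "there", 120.0),
    ("alice", "Hi", 220.0),
]

baselines = baseline_pitch(tokens)
tagged = [prosody_tag(text, hz, baselines[spk]) for spk, text, hz in tokens]

for line in tagged:
    print(line)
```

A relative value keeps the tag meaningful when a different synthetic voice is substituted later, because the deviation is expressed against whatever that voice's default pitch is rather than as an absolute frequency.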

Type: Application

Filed: March 31, 2022

Publication date: October 5, 2023

Inventors: Mikayel MIRZOYAN, Vidush VISHWANATH

USING TOKEN LEVEL CONTEXT TO GENERATE SSML TAGS

Publication number: 20230215417

Abstract: This disclosure describes a system that analyzes a corpus of text (e.g., a financial article, an audio book, etc.) so that the context surrounding the text is fully understood. For instance, the context may be an environment described by the text, or an environment in which the text occurs. Based on the analysis, the system can determine sentiment, part of speech, entities, and/or human characters at the token level of the text, and automatically generate Speech Synthesis Markup Language (SSML) tags based on this information. The SSML tags can be used by applications, services, and/or features that implement text-to-speech (TTS) conversion to improve the audio experience for end-users. Consequently, via the techniques described herein, more realistic and human-like speech synthesis can be efficiently implemented at larger scale (e.g., for audio books, for all the articles published to a news site, etc.).
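
Illustrative sketch (not taken from the patent application): the abstract describes deriving token-level context such as sentiment and part of speech, then generating SSML tags from it. The Python below assumes the analysis has already been run and its results are attached to each token. The `Token` fields, the 0.5 threshold, and the rule that maps strongly polar adjectives and adverbs to `<emphasis level="strong">` are hypothetical choices made only to show the shape of such a mapping.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class Token:
    text: str
    sentiment: float       # hypothetical score in [-1.0, 1.0]
    part_of_speech: str    # hypothetical coarse tag, e.g. "NOUN", "ADJ"


def tag_token(token, threshold=0.5):
    """Emphasize adjectives/adverbs whose sentiment magnitude meets the
    threshold; return all other tokens as escaped plain text."""
    text = escape(token.text)
    is_modifier = token.part_of_speech in {"ADJ", "ADV"}
    if is_modifier and abs(token.sentiment) >= threshold:
        return f'<emphasis level="strong">{text}</emphasis>'
    return text


def to_ssml(tokens):
    """Join the tagged tokens into a single <speak> document."""
    return "<speak>" + " ".join(tag_token(t) for t in tokens) + "</speak>"


# Made-up annotations for a short financial sentence.
sentence = [
    Token("Profits", 0.1, "NOUN"),
    Token("fell", -0.4, "VERB"),
    Token("sharply", -0.8, "ADV"),
]

print(to_ssml(sentence))
```

Keeping the rules in a small per-token function means further context signals named in the abstract, such as entities or human characters, could be added as extra fields and extra branches without changing how the document is assembled.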

Type: Application

Filed: December 30, 2021

Publication date: July 6, 2023

Inventors: Mikayel MIRZOYAN, André AING, Aysar KHALID, Chad Joseph LYNCH, Graham Michael REEVE, Sadek BAROUDI, Vidush VISHWANATH
