Publication number: 20250139379
Abstract: Implementations relate to generating multi-modal response(s) through utilization of large language model(s) (LLM(s)) and other generative model(s). Processor(s) of a system can: receive natural language (NL) based input, generate a multi-modal response that is responsive to the NL based input, and cause the multi-modal response to be rendered. In some implementations, and in generating the multi-modal response, the processor(s) can process, using an LLM, LLM input to generate LLM output, and determine, based on the LLM output, textual content and generative multimedia content for inclusion in the multi-modal response. In some implementations, the generative multimedia content can be generated by another generative model (e.g., an image generator, a video generator, an audio generator, etc.) based on generative multimedia content prompt(s) that are included in the LLM output and that are indicative of the generative multimedia content.
Type: Application
Filed: October 30, 2023
Publication date: May 1, 2025
Inventors: Sanil Jain, Wei Yu, Alessandro Agostini, Agoston Weisz, Michael Andrew Goodman, Attila Dankovics, Elle Chae, Evgeny Sluzhaev, Amin Ghafouri, Golnaz Ghiasi, Igor Petrovski, Konstantin Shagin, Marcelo Menegali, Oscar Akerlund, Rakesh Shivanna, Thang Luong, Tiffany Chen, Vikas Peswani, Yifeng Lu
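
Illustrative sketch (not part of the publication): the flow described in the abstract, in which LLM output carries both textual content and prompts for a separate generative model, might look roughly like the Python below. The abstract does not say how prompts are embedded in the LLM output, so the `[IMAGE: ...]` marker syntax, the `build_response` function, the `Text` and `Media` types, and the `fake_generator` stand-in are all hypothetical names chosen for illustration only.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Union

# ASSUMPTION: prompts appear inline as [IMAGE: ...], [VIDEO: ...] or [AUDIO: ...].
# The abstract does not specify any marker format; this one is invented here.
PROMPT_PATTERN = re.compile(r"\[(IMAGE|VIDEO|AUDIO):\s*(.+?)\]")


@dataclass
class Text:
    content: str


@dataclass
class Media:
    modality: str  # "image", "video" or "audio"
    prompt: str    # the generative multimedia content prompt from the LLM output
    payload: str   # placeholder for whatever the generator returns


def build_response(
    llm_output: str,
    generate: Callable[[str, str], str],
) -> List[Union[Text, Media]]:
    """Split LLM output into text segments and generated media, in order."""
    parts: List[Union[Text, Media]] = []
    cursor = 0
    for match in PROMPT_PATTERN.finditer(llm_output):
        # Text between the previous marker and this one becomes textual content.
        text = llm_output[cursor:match.start()].strip()
        if text:
            parts.append(Text(text))
        modality = match.group(1).lower()
        prompt = match.group(2).strip()
        # Hand the prompt to the other generative model (a stub in this sketch).
        parts.append(Media(modality, prompt, generate(modality, prompt)))
        cursor = match.end()
    tail = llm_output[cursor:].strip()
    if tail:
        parts.append(Text(tail))
    return parts


def fake_generator(modality: str, prompt: str) -> str:
    """Hypothetical stand-in for an image, video, or audio generator."""
    return f"<{modality} for '{prompt}'>"


if __name__ == "__main__":
    output = "Here is a cat. [IMAGE: a tabby cat on a sofa] Enjoy!"
    for part in build_response(output, fake_generator):
        print(part)
```

Keeping the parts in an ordered list preserves the interleaving of text and media, so a renderer can present them in the order they appear in the LLM output. A real system would replace `fake_generator` with calls to actual generative models.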