Publication number: 20260044559
Abstract: Provided is a real-time multi-modal artificial intelligence agent. In some implementations, the multi-modal agent can be implemented as a “situated agent”. The term “situated agent” refers to a setting in which the agent shares one or more perceptual inputs with a human user. For example, the situated agent can receive and process various data inputs, including video, audio, and/or textual data, which are also observable by the human user. The agent can process these inputs to generate responses that are contextually relevant to the user's physical or digital environment, for example by generating dialogue or other outputs that assist the user in understanding and/or navigating the environment.
Type: Application
Filed: October 22, 2025
Publication date: February 12, 2026
Inventors: Fengning Ding, Dong Yin, Alistair Michael Muldal, Neil Charles Rabinowitz, Chen Yan, Ankesh Anand, Hsiao-Yu Tung, Keren Gu-Lemberg, Timothy Chieu Nguyen, Charles John Deck, Arslan Chaudhry, Nathaniel John McAleese-Park, Pavel Dubov, Mikhail Dashevskiy, Robert Douglas Fritz, III, Donald Russell Reed Roberts, Lili Janzer, Juliette Love, Michael Benjamin Chang, Nikolai Grigorev, Jiaming Li, Cédric Hauteville, Gregory Wayne, Keith Anderson, Nevena Lazic, Arun Ravi Ahuja, Mehdi Abbana Bennani, Dilan Gorur, Duncan David Ross Williams, Richard James Green, Toshiyuki Fukuzawa, Sridhar Thiagarajan, Federico Javier Carnevale, Praveen Deepak Srinivasan, Tobias Markus Pohlen, Sina Samangooei, Mehdi Mirza Mohammadi, Jonas Degrave