Abstract: Systems and methods for managing video calls are provided. When a user switches between cameras during a video call, a computing device dynamically determines which of multiple available video streams is to be presented full-screen or otherwise most prominently. The available video streams can include a video stream from a user-facing camera of the device, a video stream from a rear-facing camera of the device, or a video stream from another device. Video from the call can also be annotated to identify individual items shown in the video, characteristics of those items, and the like. The annotations can be used to generate a summary report of the items in the video.
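
As an illustration only, the two ideas in the abstract (choosing the most prominent stream after a camera switch, and turning annotations into a summary report) might be sketched as follows. The abstract does not state the selection rule or the report format, so everything here is an assumption: the names `Source`, `choose_prominent`, `Annotation`, and `summarize` are hypothetical, and the rule "show the rear camera when the user switches to it, otherwise show the other device's stream" is one plausible policy, not the claimed method.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """The three kinds of stream named in the abstract."""
    FRONT_CAMERA = "front_camera"  # user-facing camera of this device
    REAR_CAMERA = "rear_camera"    # rear-facing camera of this device
    REMOTE = "remote"              # video stream from another device


def choose_prominent(active_camera: Source, available: set[Source]) -> Source:
    """Pick the stream to present most prominently after a camera switch.

    Assumed policy (not taken from the source): a user who switches to the
    rear camera wants to see what it captures; otherwise the other
    participant's stream is shown, falling back to the local camera.
    """
    if active_camera is Source.REAR_CAMERA and Source.REAR_CAMERA in available:
        return Source.REAR_CAMERA
    if Source.REMOTE in available:
        return Source.REMOTE
    return active_camera


@dataclass(frozen=True)
class Annotation:
    """A hypothetical annotation: one item and one of its characteristics."""
    item: str
    characteristic: str


def summarize(annotations: list[Annotation]) -> dict[str, list[str]]:
    """Group characteristics by item, dropping repeats, in first-seen order."""
    report: dict[str, list[str]] = defaultdict(list)
    for annotation in annotations:
        if annotation.characteristic not in report[annotation.item]:
            report[annotation.item].append(annotation.characteristic)
    return dict(report)
```

Under this assumed policy, calling `choose_prominent(Source.REAR_CAMERA, ...)` with all three streams available selects the rear camera, and `summarize` collapses repeated annotations of the same item into a single report entry.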