Abstract: Techniques for generating 3D gaze predictions based on a deep learning system are described. In an example, the deep learning system includes a neural network. The neural network is trained with training images that are generated by cameras and show the eyes of users gazing at stimulus points. Some of the stimulus points are in the planes of the cameras. The remaining stimulus points are not in the planes of the cameras. The training includes inputting a first training image associated with a stimulus point in a camera plane and inputting a second training image associated with a stimulus point outside the camera plane. The training minimizes a loss function of the neural network based on a distance between at least one of the stimulus points and a gaze line.
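
The distance term that the loss function is based on can be illustrated with a minimal sketch. The abstract does not state how the gaze line is parameterized or how distances are aggregated, so the following assumes a gaze line given by a 3D origin and a direction vector, and a loss equal to the mean point-to-line distance over a batch. The names `point_to_line_distance` and `mean_gaze_loss` are hypothetical and are not taken from the source.

```python
import math


def point_to_line_distance(origin, direction, point):
    """Perpendicular distance from a 3D point to an infinite line.

    Assumption: the gaze line is parameterized as origin + t * direction.
    """
    # Vector from the line origin to the stimulus point.
    v = [p - o for p, o in zip(point, origin)]
    # Cross product v x direction; its norm is |v| * |direction| * sin(angle).
    cx = v[1] * direction[2] - v[2] * direction[1]
    cy = v[2] * direction[0] - v[0] * direction[2]
    cz = v[0] * direction[1] - v[1] * direction[0]
    norm_d = math.sqrt(sum(d * d for d in direction))
    if norm_d == 0.0:
        raise ValueError("direction must be a non-zero vector")
    # Dividing by |direction| leaves |v| * sin(angle), the perpendicular distance.
    return math.sqrt(cx * cx + cy * cy + cz * cz) / norm_d


def mean_gaze_loss(gaze_lines, stimulus_points):
    """Assumed aggregation: mean distance over (origin, direction) / point pairs."""
    distances = [
        point_to_line_distance(origin, direction, point)
        for (origin, direction), point in zip(gaze_lines, stimulus_points)
    ]
    return sum(distances) / len(distances)
```

Because the distance is measured in 3D, the same term applies whether the stimulus point lies in a camera plane or outside it. In an actual training setup this computation would be written in a framework that supports automatic differentiation so that the loss can be minimized by gradient descent.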