Abstract: A system and method are disclosed for forming stereoscopic images. A viewpoint position detection sensor emits visible or infrared light toward an observer and detects the light reflected from the observer. A controller performs follow-up control on a portion of the apparatus using the signal obtained by the viewpoint position detection sensor. The signal, along with other desired signals, is properly set so that the viewpoint position of the observer is detected accurately with a simplified arrangement, allowing the stereoscopic view region to follow the observer at high speed.