Abstract: An image storage section 48 stores shot image data at a plurality of resolutions transmitted from an imaging device. Depth images 152 at a plurality of resolutions are generated using stereo images at a plurality of resolution levels from the shot image data (S10). Next, template matching is performed using a reference template image 154 that represents a desired shape and size, thereby extracting, for each distance range associated with one of the resolutions, a candidate area for a target picture having that shape and size (S12). The extracted candidate areas are then analyzed in more detail using the shot images stored in the image storage section 48 (S14). In some cases, further image analysis is performed, based on the result of that analysis, using a shot image at a higher resolution level (S16a and S16b).
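
Steps S10 and S12 can be illustrated by a minimal Python sketch. It rests on several assumptions and is not the claimed implementation. The lower-resolution depth image is obtained here by decimating a single depth map, whereas the abstract generates each depth image from stereo images at the matching resolution level. The reference template is assumed to be a binary silhouette, the agreement score and its threshold are illustrative choices, and the pairing of each resolution with a distance range uses made-up values. All function names (`build_depth_pyramid`, `match_level`, `extract_candidates`) are hypothetical.

```python
import numpy as np


def build_depth_pyramid(depth, levels):
    """Return [full-res, half-res, ...]; each level keeps every 2nd pixel.

    Assumption: decimation stands in for stereo matching at each resolution.
    """
    pyramid = [depth]
    for _ in range(levels - 1):
        pyramid.append(pyramid[-1][::2, ::2])
    return pyramid


def match_level(depth, template, z_range, threshold):
    """Slide a binary template over the pixels whose depth lies in z_range."""
    z_near, z_far = z_range
    # Keep only pixels inside the distance range tied to this resolution.
    mask = ((depth >= z_near) & (depth < z_far)).astype(np.uint8)
    th, tw = template.shape
    hits = []
    for y in range(mask.shape[0] - th + 1):
        for x in range(mask.shape[1] - tw + 1):
            window = mask[y:y + th, x:x + tw]
            # Score = fraction of pixels where window and template agree.
            if np.mean(window == template) >= threshold:
                hits.append((y, x))
    return hits


def extract_candidates(pyramid, template, z_ranges, threshold=0.95):
    """Map each resolution level to the top-left corners of its candidates."""
    return {
        level: match_level(depth, template, z_range, threshold)
        for level, (depth, z_range) in enumerate(zip(pyramid, z_ranges))
    }


# Synthetic scene: background at depth 10, a far 4x4 object and a near 8x8 one.
scene = np.full((32, 32), 10.0)
scene[4:8, 6:10] = 1.0      # far object, appears small
scene[16:24, 12:20] = 0.4   # near object, appears large

# One fixed-size template: a 4x4 filled square with a 1-pixel empty border.
template = np.zeros((6, 6), dtype=np.uint8)
template[1:5, 1:5] = 1

pyramid = build_depth_pyramid(scene, levels=2)
# Illustrative ranges: full resolution <-> farther, half resolution <-> nearer.
z_ranges = [(0.5, 2.0), (0.25, 0.5)]
candidates = extract_candidates(pyramid, template, z_ranges)
print(candidates)  # {0: [(3, 5)], 1: [(7, 5)]}
```

In this sketch the same fixed-size template finds the far object in the full-resolution depth image and the near object in the half-resolution one, which is why each resolution is searched only within its own distance range. The returned corner positions would then be handed to the more detailed analysis of S14.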