Abstract: When a microphone array apparatus picks up voice that is output from a sound source at or above a prescribed sound volume and sends the corresponding voice data to a voice processing apparatus, a sound source direction detection unit causes sound source marks, each of which indicates a directivity direction, to be displayed on a display, and prompts a user to select one of the sound source marks and to input camera information. The voice processing apparatus transmits the input camera information and the selected directivity direction to the microphone array apparatus. The microphone array apparatus stores the camera information and the directivity direction in a storage unit as a preset information table. Accordingly, even where the positional relationship between the camera and the microphone array is unclear, directivity is formed toward a predetermined image capture position, and voice from that predetermined image capture position is output clearly.
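The preset flow described above can be illustrated with a minimal sketch. It is not the implementation from the disclosure: the names `PresetTable`, `register`, `lookup`, and `select_loud_sources`, the string form of the camera information, the (horizontal, vertical) angle pair used for a directivity direction, and the decibel threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumed representation: (horizontal angle, vertical angle) in degrees.
Direction = Tuple[float, float]


@dataclass
class PresetTable:
    """Hypothetical stand-in for the preset information table in the storage unit."""
    entries: Dict[str, Direction] = field(default_factory=dict)

    def register(self, camera_info: str, direction: Direction) -> None:
        # Associate the user-entered camera information with the selected direction.
        self.entries[camera_info] = direction

    def lookup(self, camera_info: str) -> Optional[Direction]:
        # Return the stored directivity direction, or None if no preset exists.
        return self.entries.get(camera_info)


def select_loud_sources(
    detections: List[Tuple[Direction, float]], threshold_db: float
) -> List[Direction]:
    """Keep only directions whose measured level is at or above the threshold."""
    return [direction for direction, level in detections if level >= threshold_db]


# Candidate directions with assumed measured levels (dB); only the loud one
# would be shown to the user as a sound source mark.
marks = select_loud_sources([((30.0, 10.0), 62.0), ((120.0, 5.0), 40.0)], 55.0)

# The user picks a mark and enters camera information (hypothetical format).
table = PresetTable()
table.register("camera1/preset1", marks[0])
```

When the camera later moves to a stored image capture position, a lookup on the same camera information yields the direction in which to form directivity, without any knowledge of the geometric relationship between the camera and the microphone array.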