Abstract: A voice control system including a voice receiving unit, an image capturing unit, a storage unit and a control unit is disclosed. The voice receiving unit receives a voice. The image capturing unit captures a video image stream including several human face images. The storage unit stores the voice and the video image stream. The control unit is electrically connected to the voice receiving unit, the image capturing unit and the storage unit. The control unit detects a feature of a human face from the human face images, defines a mouth motion detection region from the feature of the human face, and generates a control signal according to a variation of the mouth motion detection region and a variation of the voice over time. A voice control method, a computer program product and a computer readable medium are also disclosed.
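As an illustration only, the final step described above (generating a control signal from the variation of the mouth motion detection region and the variation of the voice over time) might be sketched as follows. The abstract does not say how the two variations are measured or combined, so everything here is an assumption: the function names `mean_abs_diff` and `control_signals`, the use of mean absolute pixel difference as the mouth variation, the use of a per-frame loudness value for the voice, the two thresholds, and the rule that both variations must exceed their thresholds in the same frame. Face feature detection and the definition of the mouth region are assumed to have been done upstream, so the input is already a sequence of cropped mouth regions.

```python
from typing import List, Sequence

# Hypothetical representation: a grayscale crop of the mouth motion
# detection region, given as rows of pixel intensities (0-255).
Frame = Sequence[Sequence[int]]


def mean_abs_diff(prev: Frame, curr: Frame) -> float:
    """Average absolute pixel difference between two same-sized crops."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def control_signals(
    mouth_crops: Sequence[Frame],
    voice_levels: Sequence[float],
    mouth_threshold: float = 10.0,  # assumed value, not from the source
    voice_threshold: float = 0.2,   # assumed value, not from the source
) -> List[bool]:
    """Return one flag per frame transition.

    A flag is True when both the mouth region and the voice level changed
    by more than their thresholds between consecutive frames. This
    "both must change" rule is an assumption made for this sketch.
    """
    if len(mouth_crops) != len(voice_levels):
        raise ValueError("mouth_crops and voice_levels must be the same length")

    signals: List[bool] = []
    for i in range(1, len(mouth_crops)):
        mouth_variation = mean_abs_diff(mouth_crops[i - 1], mouth_crops[i])
        voice_variation = abs(voice_levels[i] - voice_levels[i - 1])
        signals.append(
            mouth_variation > mouth_threshold and voice_variation > voice_threshold
        )
    return signals


# Usage: the mouth opens and the voice level rises at the same transition.
closed_mouth = [[10, 10], [10, 10]]
open_mouth = [[10, 10], [90, 90]]
crops = [closed_mouth, closed_mouth, open_mouth, open_mouth]
levels = [0.0, 0.0, 0.8, 0.8]
print(control_signals(crops, levels))
```

Under these assumptions, requiring both variations to coincide means that a sound arriving while the mouth region is static does not produce a signal in this sketch.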