Abstract: A multi-participant conference system and method are described. The system includes a PSTN client, at least one remote client, and a first participant client. The PSTN client communicates audio data, and each remote client communicates audio-video data. The first participant client includes a voice over IP (VoIP) encoder, a VoIP decoder, a first audio mixer, and a second audio mixer. The VoIP encoder compresses audio data transported to the PSTN client. The VoIP decoder decodes audio data received from the PSTN client. The first audio mixer mixes the decoded audio data from the PSTN client with the audio-video data from the first participant into a first mixed audio-video data stream that is transmitted to each remote client. The second audio mixer mixes the audio-video data stream from the first participant with the audio-video data stream from each remote client into a second mixed audio data stream that is transmitted to the PSTN client.
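
The routing performed by the two audio mixers can be illustrated with a minimal sketch. It covers only the audio path and makes several assumptions that the abstract does not state: audio frames are equal-length lists of 16-bit PCM samples, mixing is a sample-wise sum with clipping, and the VoIP encoder, VoIP decoder, and video handling are omitted. The names `mix`, `first_mixer`, and `second_mixer` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix(*frames: Sequence[int]) -> List[int]:
    """Sum equal-length 16-bit PCM frames sample by sample, clipping to int16.

    Assumption: summation with clipping stands in for whatever mixing
    method the described system actually uses.
    """
    if not frames:
        return []
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all frames must have the same length")
    return [max(INT16_MIN, min(INT16_MAX, sum(samples))) for samples in zip(*frames)]


def first_mixer(pstn_decoded: Sequence[int], local_audio: Sequence[int]) -> List[int]:
    """Audio sent to the remote clients: decoded PSTN audio plus the first participant."""
    return mix(pstn_decoded, local_audio)


def second_mixer(local_audio: Sequence[int],
                 remote_frames: Sequence[Sequence[int]]) -> List[int]:
    """Audio sent to the PSTN client: the first participant plus every remote client."""
    return mix(local_audio, *remote_frames)


# Hypothetical one-frame example with three samples per frame.
local = [100, -200, 30000]
pstn = [50, 50, 10000]
remote_a = [1, 2, 3]
remote_b = [-1, -2, -32768]

to_remotes = first_mixer(pstn, local)
to_pstn = second_mixer(local, [remote_a, remote_b])
```

In this sketch each destination receives a mix that leaves out its own audio: the remote clients hear the PSTN caller and the first participant, while the PSTN caller hears the first participant and all remote clients. In the described system the second mix would then pass through the VoIP encoder before being sent to the PSTN client.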