🎤 Speaker Separation with Voice Activity Detection

Separate mixed audio into individual speakers

Choose between standard separation or separation with Voice Activity Detection (VAD).


📁 Example Audio

Select an example audio file below, or upload your own!

🎵 Input

Select Example Audio
🤖 Model Selection

VAD models provide voice activity detection for each speaker

🎧 Separated Audio Outputs

📊 Audio Spectrograms


📋 Instructions:

  1. Upload an audio file or record directly using the microphone
  2. Select your preferred model (with or without VAD)
  3. If using VAD, adjust the threshold as needed
  4. Click "Separate Speakers" to process
  5. Download the separated audio files and view the spectrograms
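The threshold in step 3 can be pictured with a small sketch. It assumes the VAD model emits one speech probability per frame for each speaker, and that a frame counts as active when its probability is at or above the threshold. The function name `active_segments`, the default threshold of 0.5, and the 10 ms frame hop are all hypothetical and chosen only for illustration; the app's actual behavior may differ.

```python
def active_segments(probs, threshold=0.5, frame_sec=0.01):
    """Turn per-frame speech probabilities into (start, end) times in seconds.

    Assumptions (not taken from the app): `probs` holds one value in [0, 1]
    per frame, and `frame_sec` is the hop between consecutive frames.
    """
    segments = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i  # speech begins
        elif not active and start is not None:
            segments.append((start * frame_sec, i * frame_sec))  # speech ends
            start = None
    if start is not None:
        # The speaker is still active when the signal ends.
        segments.append((start * frame_sec, len(probs) * frame_sec))
    return segments
```

Under these assumptions, raising the threshold keeps only frames the model is more certain about, so the detected segments become shorter or disappear, while lowering it marks more frames as active.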

🔧 Technical Notes:

  • Audio is automatically resampled to 16 kHz
  • For multi-channel audio, only the first channel is used
  • Spectrograms: Show frequency content over time with VAD activity highlighted
  • VAD Overlay: A white line at the top indicates when the speaker is active

📖 Reference

Opochinsky, R., Moradi, M., & Gannot, S. (2025).
Single-microphone speaker separation and voice activity detection in noisy and reverberant environments.
EURASIP Journal on Audio, Speech, and Music Processing, 2025(1), 18. Springer.
📋 BibTeX Citation
@article{opochinsky2025single,
title={Single-microphone speaker separation and voice activity detection in noisy and reverberant environments},
author={Opochinsky, Renana and Moradi, Mordehay and Gannot, Sharon},
journal={EURASIP Journal on Audio, Speech, and Music Processing},
volume={2025},
number={1},
pages={18},
year={2025},
publisher={Springer}
}