CantusSVS
About CantusSVS
CantusSVS is a singing voice synthesis tool that automatically generates audio playback for the Latin chants in Cantus. You can use CantusSVS directly in the browser at https://cantussvs.streamlit.app. For training and inference, we use DiffSinger, a diffusion-based singing voice synthesis model described in the paper below:
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Liu, Jinglin, Chengxi Li, Yi Ren, Feiyang Chen, and Zhou Zhao. 2022. "DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism." In Proceedings of the AAAI Conference on Artificial Intelligence 36 (10): 11020–11028. https://arxiv.org/abs/2105.02446.
Training was done using Cedar, a cluster provided by the Digital Research Alliance of Canada. To set up training locally, follow this tutorial by tigermeat.
For general help with training and creating a dataset, this tutorial by PixPrucer is an excellent guide. For further support, join the DiffSinger Discord server.
The dataset used for this project was built using Adventus: Dominica prima adventus Domini, the first track from Psallentes' album Salzinnes Saints. Psallentes is a Belgian women's chorus that specializes in Late Medieval and Renaissance music. Salzinnes Saints is an album of music from the Salzinnes Antiphonal, a mid-sixteenth-century choirbook containing the music and text for the Liturgy of the Hours.
Quick Start
Clone the repository:
git clone https://github.com/yourusername/CantusSVS.git
cd CantusSVS
Set up the environment:
make setup
Run the web app locally:
make run
Open your browser at:
http://localhost:8501
Or just use the hosted app here: https://cantussvs.streamlit.app
Preparing Your Input
- Input format must be .mei (Music Encoding Initiative XML).
- Most commercial music composition software can export .mei files. MuseScore 4 is free to use.
- Only monophonic scores are supported (one staff, one voice).
- Lyrics must be embedded in the MEI file and aligned with notes.
Validation tool:
python scripts/validate_mei.py your_song.mei
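To make the input requirements above concrete, here is a minimal sketch of how they could be checked with Python's standard library. It is not the repository's `scripts/validate_mei.py`, whose internals are not shown here. The function name `check_mei` and the sample score are hypothetical, and the sketch assumes lyrics are encoded as `<syl>` elements nested inside `<note>` elements, which is one common MEI encoding.

```python
import xml.etree.ElementTree as ET

MEI = "{http://www.music-encoding.org/ns/mei}"  # MEI XML namespace

# Tiny hypothetical score: one staff, one layer, one note with a syllable.
SAMPLE = (
    '<mei xmlns="http://www.music-encoding.org/ns/mei"><music><body><mdiv><score>'
    '<scoreDef><staffGrp><staffDef n="1" lines="5"/></staffGrp></scoreDef>'
    '<section><measure n="1"><staff n="1"><layer n="1">'
    '<note pname="c" oct="4" dur="4"><verse n="1"><syl>Ad</syl></verse></note>'
    '<note pname="d" oct="4" dur="4"/>'
    '</layer></staff></measure></section>'
    '</score></mdiv></body></music></mei>'
)


def check_mei(root):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []

    # One staff: every <staff> element should share the same @n value.
    staff_numbers = {staff.get("n") for staff in root.iter(MEI + "staff")}
    if len(staff_numbers) != 1:
        problems.append(f"expected exactly 1 staff, found {len(staff_numbers)}")

    # One voice: each <staff> should hold exactly one <layer>, and no <chord>.
    for staff in root.iter(MEI + "staff"):
        if len(staff.findall(MEI + "layer")) != 1:
            problems.append("a staff does not contain exactly one layer (voice)")
            break
    if root.find(f".//{MEI}chord") is not None:
        problems.append("chords found; the score is not monophonic")

    # Lyrics: at least one note should carry a <syl> element
    # (notes inside a melisma legitimately have none).
    notes = list(root.iter(MEI + "note"))
    if not any(note.find(f".//{MEI}syl") is not None for note in notes):
        problems.append("no note carries a lyric syllable (<syl>)")

    return problems


print(check_mei(ET.fromstring(SAMPLE)))  # prints []
```

For a file on disk, the same function can be called as `check_mei(ET.parse("your_song.mei").getroot())`.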
Running Locally
Drop your .mei file into the upload area of the web app.
Choose settings:
- Tempo (BPM)
- Output file name (optional)
Hit "Synthesize" and download the resulting .wav file.
Generated files:
- .wav: final audio output
- .mel.npy: intermediate mel-spectrogram
- .info.json: metadata (phoneme sequence, note mapping)
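The generated files can be inspected with a few lines of Python. The sketch below rests on two assumptions: that the three files share a base name (for example `chant.wav`, `chant.mel.npy`, `chant.info.json`), and that the .wav file is PCM-encoded, which is what the standard `wave` module reads. The function names are hypothetical. No key names inside the .info.json file are assumed, and the .mel.npy file is only located, not loaded (loading it would need NumPy's `numpy.load`).

```python
import json
import wave
from pathlib import Path


def sibling_outputs(wav_path):
    """Guess the companion file paths from the .wav path (assumed naming)."""
    wav_path = Path(wav_path)
    base = str(wav_path.with_suffix(""))  # "out/chant.wav" -> "out/chant"
    return {
        "audio": wav_path,
        "mel": Path(base + ".mel.npy"),
        "info": Path(base + ".info.json"),
    }


def describe_wav(wav_path):
    """Return channel count, sample rate, and duration of a PCM WAV file."""
    with wave.open(str(wav_path), "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": wav.getnframes() / rate,
        }


def load_info(info_path):
    """Parse the metadata file without assuming any particular keys."""
    with open(info_path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `describe_wav("chant.wav")` reports how long the synthesized audio is, and `load_info(sibling_outputs("chant.wav")["info"])` returns the parsed metadata for further inspection.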
FAQ
Q: Can I synthesize polyphonic (multi-voice) chants?
A: No, only monophonic scores are supported currently. However, in the future, polyphonic chants could be synthesized by layering multiple monophonic voices.
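As a rough illustration of the layering idea, the sketch below mixes several separately synthesized monophonic voices into one file using only the standard library. This is not a CantusSVS feature. The function names are hypothetical, and the sketch assumes every input is a mono, 16-bit PCM .wav file at the same sample rate.

```python
import struct
import wave


def read_mono16(path):
    """Read a mono, 16-bit PCM WAV file; return (sample_rate, samples)."""
    with wave.open(str(path), "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError(f"{path}: expected mono 16-bit PCM audio")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # WAV samples are little-endian signed 16-bit integers.
    return rate, list(struct.unpack(f"<{len(raw) // 2}h", raw))


def mix_voices(voices):
    """Average several sample lists; shorter voices are padded with silence."""
    length = max(len(voice) for voice in voices)
    mixed = []
    for i in range(length):
        total = sum(voice[i] for voice in voices if i < len(voice))
        # Dividing by the voice count keeps the result in 16-bit range.
        mixed.append(total // len(voices))
    return mixed


def write_mono16(path, rate, samples):
    """Write a list of 16-bit samples as a mono PCM WAV file."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

With two hypothetical outputs `upper.wav` and `lower.wav`, the voices could be combined by reading each with `read_mono16`, passing both sample lists to `mix_voices`, and saving the result with `write_mono16`. Averaging makes each voice quieter than in its solo file; that is the trade-off for avoiding clipping in this simple approach.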
Q: Can I change the voice timbre?
A: In the web app, only the provided pre-trained model is available. However, DiffSinger learns the timbre of its training dataset, so if you train your own model, you can control the timbre that way.