
CantusSVS

Table of Contents

  • About CantusSVS
  • Quick Start
  • Preparing Your Input
  • Running Locally
  • FAQ

About CantusSVS

CantusSVS is a singing voice synthesis tool that automatically generates audio playback for the Latin chants in Cantus. You can use CantusSVS directly in the browser at https://cantussvs.streamlit.app. For training and inference, we use DiffSinger, a diffusion-based singing voice synthesis model described in the paper below:

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism

Liu, Jinglin, Chengxi Li, Yi Ren, Feiyang Chen, and Zhou Zhao. 2022. "DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism." In Proceedings of the AAAI Conference on Artificial Intelligence 36 (10): 11020–11028. https://arxiv.org/abs/2105.02446.

Training was done on Cedar, a compute cluster provided by the Digital Research Alliance of Canada. To set up training locally, follow this tutorial by tigermeat.

For general help with training and creating a dataset, this tutorial by PixPrucer is an excellent guide. For further questions, join the DiffSinger Discord server.

The dataset used for this project was built from Adventus: Dominica prima adventus Domini, the first track on Psallentes' album Salzinnes Saints. Psallentes is a Belgian women's chorus that specializes in Late Medieval and Renaissance music. Salzinnes Saints is an album of music from the Salzinnes Antiphonal, a mid-sixteenth-century choirbook containing the music and text for the Liturgy of the Hours.


Quick Start

  1. Clone the repository:

    git clone https://github.com/yourusername/CantusSVS.git
    cd CantusSVS
    
  2. Set up the environment:

    make setup
    
  3. Run the web app locally:

    make run
    
  4. Open your browser at:

    http://localhost:8501
    

Or just use the hosted app here: https://cantussvs.streamlit.app


Preparing Your Input

  • Input format must be .mei (Music Encoding Initiative XML).
  • Most music notation software can export .mei files; MuseScore 4 is free to use.
  • Only monophonic scores are supported (one staff, one voice).
  • Lyrics must be embedded in the MEI file and aligned with notes.

Validation tool:

    python scripts/validate_mei.py your_song.mei
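
The bundled script is the authoritative check. For a sense of what "one staff, one voice, lyrics attached to notes" looks like structurally, here is a minimal standalone sketch (not the repository's validate_mei.py) that walks an MEI file with Python's standard library. The namespace URI is the standard MEI one and the pname/oct attribute names follow the MEI schema; everything else is illustrative.

    # Illustrative only; not the repository's validate_mei.py.
    import sys
    import xml.etree.ElementTree as ET

    MEI_NS = "{http://www.music-encoding.org/ns/mei}"  # standard MEI namespace

    def check(path):
        root = ET.parse(path).getroot()

        # Monophonic input: expect exactly one staff definition.
        staves = root.findall(f".//{MEI_NS}staffDef")
        print(f"staff definitions: {len(staves)} (expected 1)")

        # Print note / syllable pairs to eyeball lyric alignment.
        for note in root.iter(f"{MEI_NS}note"):
            pitch = note.get("pname", "?") + note.get("oct", "?")
            syls = [s.text or "" for s in note.iter(f"{MEI_NS}syl")]
            print(pitch, "-".join(syls) if syls else "(no lyric)")

    if __name__ == "__main__":
        check(sys.argv[1])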

Running Locally

  1. Drop your .mei file into the upload area of the web app.

  2. Choose settings:

    • Tempo (BPM)
    • Output file name (optional)
  3. Hit "Synthesize" and download the resulting .wav file.

Generated files:

  • .wav: final audio output
  • .mel.npy: intermediate mel-spectrogram
  • .info.json: metadata (phoneme sequence, note mapping)
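
If you want to inspect the intermediate files, the sketch below loads them with numpy and the standard json module. The array shape and the JSON keys depend on the model configuration, so the sketch only reports what it finds; the file names are placeholders for whatever output name you chose.

    # Inspect the intermediate outputs of a synthesis run (file names are examples).
    import json
    import numpy as np

    mel = np.load("your_song.mel.npy")          # mel-spectrogram, typically (frames, mel_bins)
    print("mel shape:", mel.shape, "dtype:", mel.dtype)

    with open("your_song.info.json", encoding="utf-8") as f:
        info = json.load(f)
    print("metadata keys:", list(info.keys()))  # e.g. phoneme sequence, note mapping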

FAQ

Q: Can I synthesize polyphonic (multi-voice) chants?
A: No, only monophonic scores are supported currently. However, in the future, polyphonic chants could be synthesized by layering multiple monophonic voices.
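
Building on that answer, here is a rough sketch of the layering idea: render each voice separately as a monophonic part, then mix the resulting .wav files. It uses numpy plus the third-party soundfile library (an arbitrary choice; any WAV I/O library works), assumes all parts are mono and share one sample rate, and uses placeholder file names.

    # Mix several synthesized monophonic parts into one polyphonic track.
    import numpy as np
    import soundfile as sf  # third-party: pip install soundfile

    parts = ["cantus.wav", "altus.wav", "tenor.wav"]   # placeholder file names
    signals, rates = zip(*(sf.read(p) for p in parts))
    assert len(set(rates)) == 1, "all parts must share the same sample rate"

    # Pad every part to the length of the longest one, then sum.
    length = max(len(s) for s in signals)
    mix = np.zeros(length)
    for s in signals:
        mix[: len(s)] += s

    # Normalize so the mix stays within [-1, 1].
    mix /= max(1.0, float(np.abs(mix).max()))
    sf.write("polyphonic_mix.wav", mix, rates[0])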

Q: Can I change the voice timbre?
A: In the web app, only the provided pre-trained model is available. However, DiffSinger learns the timbre of the dataset it is trained on, so if you train your own model, you can control the timbre that way.