Model Card for okuparinen/SKN_300m_detailed

This model is fine-tuned for automatic dialectal transcription of Finnish dialect recordings. It is based on GetmanY1/wav2vec2-large-fi-lp-cont-pt, a model trained on colloquial Finnish, and has been fine-tuned on old Finnish dialect recordings and their corresponding transcriptions in the Uralic Phonetic Alphabet. The model outputs a detailed transcription. The audio recordings are sampled at 16 kHz.

Model Sources [optional]

  • Paper [optional]: TBA

Uses

You can use this model for automatic transcription of Finnish dialect speech. Note that the output is a dialectal transcription, not standard Finnish text.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
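
In the absence of author-provided code, the following is a minimal sketch of how inference might look. It assumes that the checkpoint loads as a standard wav2vec2 CTC model through the `transformers` classes `Wav2Vec2Processor` and `Wav2Vec2ForCTC`, and that the repository ships the matching processor files; neither is confirmed by this card. The helper name `transcribe` and the use of `librosa` for loading audio are illustrative choices. Audio is resampled to 16 kHz to match the sampling rate stated above.

```python
MODEL_ID = "okuparinen/SKN_300m_detailed"
TARGET_SAMPLE_RATE = 16_000  # the card states that the recordings are sampled at 16 kHz


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return a greedy CTC transcription of one audio file (hypothetical helper)."""
    # Heavy dependencies are imported lazily so that importing this module stays cheap.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Assumption: the repository contains a processor (feature extractor + tokenizer)
    # and the weights are compatible with the CTC head of Wav2Vec2ForCTC.
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # librosa converts the file to mono and resamples it to the requested rate.
    waveform, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE, mono=True)

    inputs = processor(
        waveform, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: take the most likely token per frame, then let the
    # tokenizer collapse repeated tokens and remove CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("recording.wav")`, where `recording.wav` is a placeholder file name, should return a single string. The returned text is a dialectal transcription, not standard Finnish.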

Training Details

Training Data

The training data is an utterance-level version of the Samples of Spoken Finnish corpus, available at okuparinen/skn.
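
As a rough sketch, the utterance-level data could be inspected with the `datasets` library. This assumes that okuparinen/skn is a Hugging Face dataset repository readable by `load_dataset`, that it has a split named `train`, and that access is not restricted; the card confirms none of these. The helper name `peek_skn` is hypothetical, and the example prints the column names instead of assuming them.

```python
DATASET_ID = "okuparinen/skn"


def peek_skn(split: str = "train"):
    """Return the first example of the dataset (hypothetical helper)."""
    # Imported lazily so that the module can be imported without `datasets` installed.
    from datasets import load_dataset

    # Assumption: a split with this name exists. Streaming avoids downloading
    # the whole corpus just to look at one utterance.
    dataset = load_dataset(DATASET_ID, split=split, streaming=True)
    first_example = next(iter(dataset))

    # The column names are not documented in this card, so list them
    # rather than guessing.
    print(sorted(first_example.keys()))
    return first_example
```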

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Model size: 315M params (Safetensors, tensor type F32)

Model ID: okuparinen/SKN_300m_detailed