HuBERT Lao ASR

Fine-tuned HuBERT-Large model for Lao automatic speech recognition.

Model Performance

  • Test CER: 25.37% (see the evaluation sketch below this list)
  • Model Size: 315M parameters (F32)
  • Training Time: 2.1 hours
  • Dialects: Central, Northern, Southern Lao
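
Character error rate (CER) is the character-level edit distance between predicted and reference transcripts, divided by the reference length. A minimal sketch of how such a figure can be reproduced, assuming the third-party jiwer library (pip install jiwer) rather than the thesis's actual evaluation tooling, and with placeholder transcripts:

import jiwer

# Hypothetical test pairs; a real evaluation would use the Lao test set
references = ["reference transcript one", "reference transcript two"]
hypotheses = ["predicted transcript one", "predicted transcript two"]

# jiwer.cer accepts single strings or lists of strings
cer = jiwer.cer(references, hypotheses)
print(f"CER: {cer:.2%}")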

Usage

from transformers import HubertForCTC, Wav2Vec2Processor
import torch
import librosa

# Load the fine-tuned model and its processor
model = HubertForCTC.from_pretrained("h3llohihi/hubert-lao-asr")
processor = Wav2Vec2Processor.from_pretrained("h3llohihi/hubert-lao-asr")

# Load audio, resampled to the 16 kHz mono input the model expects
audio, sr = librosa.load("audio.wav", sr=16000)

# Convert the raw waveform into model inputs
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

# Run inference and greedily decode the CTC output
with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]

print(transcription)
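
For quick experiments, the checkpoint should also load through the transformers automatic-speech-recognition pipeline, which handles file loading, resampling, and decoding internally (it relies on ffmpeg to read audio files). A sketch, assuming the repository ships the full processor configuration:

from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="h3llohihi/hubert-lao-asr")

# The pipeline returns a dict containing the decoded text
result = asr("audio.wav")
print(result["text"])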

Citation

@thesis{naovalath2025lao,
  title={Lao Automatic Speech Recognition using Transfer Learning},
  author={Souphaxay Naovalath and Sounmy Chanthavong},
  advisor={Dr. Somsack Inthasone},
  school={National University of Laos, Faculty of Natural Sciences, Computer Science Department},
  year={2025}
}