English-Tigrinya Tokenizer
This tokenizer was trained for English-to-Tigrinya machine translation, using the NLLB dataset for training and OPUS parallel data for testing.
Model Details
- Languages: English, Tigrinya
- Model type: Tokenizer using SentencePiece
- License: MIT
- Training dataset: NLLB
- Testing dataset: OPUS parallel data
- Evaluation metric: BLEU score
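The card reports BLEU as the evaluation metric. As a rough illustration of how BLEU scores a candidate translation against a reference, here is a simplified sentence-level BLEU in plain Python. This is a sketch only: real evaluations typically use a library such as sacrebleu, and the whitespace tokenization and smoothing below are simplifications.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU with uniform weights and a brevity penalty.

    Tokenizes on whitespace and smooths zero n-gram counts; for real
    evaluation use a standard implementation such as sacrebleu.
    """
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped (modified) n-gram precision: each hypothesis n-gram is
        # credited at most as many times as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smooth zeros
    # Brevity penalty discourages overly short hypotheses.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 1.0, and shorter or divergent hypotheses score lower.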
Machine Translation Model: English → Tigrinya
This model is a fine-tuned machine translation model that translates from English to Tigrinya. It was trained on a parallel corpus of English and Tigrinya sentences.
Model Overview
- Model Type: MarianMT (Transformer-based machine translation model)
- Languages: English → Tigrinya
- Model Architecture: MarianMT, fine-tuned for English → Tigrinya translation
- Training Framework: Hugging Face Transformers, PyTorch
Training Details
Training Dataset: NLLB Parallel Corpus (English → Tigrinya)
Training Epochs: 3
Batch Size: 8
Max Length: 128 tokens
Learning Rate: starts at 1.44e-07 and decays during training
Training Loss:
- Final training loss: 0.4756
- Per-epoch loss progress:
- Epoch 1: 0.443
- Epoch 2: 0.4077
- Epoch 3: 0.4379
Gradient Norms:
- Epoch 1: 1.14
- Epoch 2: 1.11
- Epoch 3: 1.06
Training Time: 43376.7 seconds (~12 hours)
Training Speed:
- Training samples per second: 96.7
- Training steps per second: 12.08
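The card does not state which learning-rate schedule was used; the Hugging Face Trainer defaults to a linear decay to zero. Assuming that default, and taking the 1.44e-07 reported above as the starting value, the schedule can be sketched as:

```python
def linear_decay_lr(step: int, total_steps: int, initial_lr: float = 1.44e-07) -> float:
    """Linearly decay the learning rate from initial_lr to zero.

    Sketch of the default (warmup-free) linear schedule in Hugging Face
    Transformers; the initial value is the one reported in this card,
    and total_steps is the total number of optimizer steps.
    """
    return initial_lr * max(0.0, 1.0 - step / total_steps)

# Example: learning rate at the start, middle, and end of training.
schedule = [linear_decay_lr(s, 1000) for s in (0, 500, 1000)]
```

In practice this behavior comes from `transformers.get_linear_schedule_with_warmup` (with zero warmup steps) rather than a hand-written function.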
Model Usage
This model can be used to translate English sentences into Tigrinya and vice versa.
Example Usage (Python)
from transformers import MarianMTModel, MarianTokenizer
# Load the model and tokenizer
model_name = "Hailay/MachineT_TigEng"
model = MarianMTModel.from_pretrained(model_name)
tokenizer = MarianTokenizer.from_pretrained(model_name)
# Translate an English sentence to Tigrinya
english_text = "We must obey the Lord and leave them alone"
encoded_input = tokenizer(english_text, return_tensors="pt", padding=True, truncation=True)
translated = model.generate(**encoded_input)
translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
print(f"Translated text: {translated_text}")