Liyama-3B
Liyama-3B is a fine-tuned version of Meta's LLaMA 3.2 3B model, built to understand and respond fluently in Tagalog. It was trained on the AnoNa dataset for 3 epochs, with the goal of natural, context-aware instruction following in Filipino.
Origin of the Name
The name Liyama is a Tagalified version of llama, reflecting both its LLaMA base and its Tagalog-focused language capabilities. It mirrors how Filipino often adapts foreign terms into familiar, phonetic forms, such as camera → kamera, lion → leon, and now llama → liyama.
Training Data: The AnoNa Dataset
Liyama-3B was trained solely on response completions from the AnoNa dataset, a self-instruct corpus generated using Gemini 1.5 and 2.0.
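The card does not say how "trained solely on response completions" was implemented. One common reading is response-only loss masking: prompt tokens stay in the input, but their labels are set to an ignore value so only response tokens contribute to the loss. The sketch below illustrates that idea under this assumption; `build_labels` and the token ids are hypothetical, and `-100` is used because it is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
# Assumption: response-only training is done by masking prompt labels.
# This is an illustration, not the actual Liyama-3B training code.

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(prompt_ids, response_ids):
    """Return (input_ids, labels) with every prompt position masked out."""
    input_ids = list(prompt_ids) + list(response_ids)
    # Prompt positions get IGNORE_INDEX; response positions keep their ids.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


# Hypothetical token ids standing in for a tokenized prompt and response.
prompt_ids = [101, 7, 8, 9]
response_ids = [42, 43, 2]

input_ids, labels = build_labels(prompt_ids, response_ids)
print(input_ids)
print(labels)
```

With labels built this way, the model still conditions on the full prompt, but gradient updates come only from the response tokens.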
Inspired by SimpleQnA, the dataset contains short, helpful instruction-response pairs. But AnoNa introduces several improvements:
- Less English, more Tagalog prompts
- Less IFEval-style formatting
- No overuse of modifiers in instructions
- Balanced task types to avoid dominant categories
- Complex tasks favored (65% complex / 35% simple)
- Reduced sycophancy and generic praise
- Improved follow-up handling
- AI self-intro appears only when relevant
- Implicit chain-of-thought reasoning, not labeled
- Extra task types added to increase variety
This focus creates a model that's practical, straightforward, and tuned for realistic conversational use in Filipino, without excessive formatting or irrelevant disclaimers.
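As a rough illustration of the 65% complex / 35% simple split listed above, the following sketch draws a fixed-size mix from two pools of examples. How AnoNa was actually assembled is not described in this card, so the function `mix_by_ratio`, the pool contents, and the seed are all hypothetical.

```python
import random


def mix_by_ratio(complex_items, simple_items, total, complex_share=0.65, seed=0):
    """Sample `total` items so that about `complex_share` of them are complex."""
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    n_complex = round(total * complex_share)
    n_simple = total - n_complex
    if n_complex > len(complex_items) or n_simple > len(simple_items):
        raise ValueError("not enough items in one of the pools")
    picked = rng.sample(complex_items, n_complex) + rng.sample(simple_items, n_simple)
    rng.shuffle(picked)  # avoid all complex items appearing first
    return picked


# Hypothetical pools; real entries would be instruction-response pairs.
complex_pool = [("complex", i) for i in range(50)]
simple_pool = [("simple", i) for i in range(50)]

mixed = mix_by_ratio(complex_pool, simple_pool, total=20)
n_complex = sum(1 for kind, _ in mixed if kind == "complex")
print(n_complex, len(mixed) - n_complex)
```

Sampling each pool separately fixes the ratio exactly for a given total, rather than leaving it to chance as a single shuffled draw would.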
Use Case
Liyama-3B is ideal for:
- Answering questions in Tagalog
- Writing essays, reflections, and letters in Filipino
- Following natural instructions, even when mixed with English
- Chat-based tasks where fluency and tone matter
- Educational or community apps centered around local language use
Model Details
Feature | Value |
---|---|
Base Model | LLaMA 3.2 3B |
Fine-tuned Dataset | AnoNa |
Epochs | 3 |
Language Focus | Tagalog (with some English) |
Prompt Format | Responses only |
Liyama-3B is part of a broader effort to create open, practical Filipino-language models for real use, not just benchmarks. Expect follow-ups tuned for multi-turn chat, reasoning, and creative tasks.