---
library_name: transformers
language: fa
tags:
- persian
- text-generation
- qlora
- 4-bit-quantization
license: apache-2.0
datasets:
- mshojaei77/Persian_sft
metrics:
- bleu
base_model:
- google/gemma-3-4b-it
---

# Gemma 3-4B Persian (v0)

`mshojaei77/gemma-3-4b-persian-v0` is a Persian-specialized model built on the Gemma 3 architecture. It was fine-tuned with QLoRA, which trains low-rank adapters on top of a 4-bit quantized base model, to reduce memory and compute requirements while specializing the model for generating and understanding Persian text. In addition to text generation, the model retains the image input capabilities of its base model.

## Usage

This model is compatible with both the Hugging Face Transformers library and Ollama.

### Running with Ollama

```bash
ollama run hf.co/mshojaei77/gemma-3-4b-persian-v0:Q8_0
```
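
Once the model has been pulled with the command above, it can also be queried programmatically. The following is a minimal sketch, assuming a local Ollama server on its default address (`http://localhost:11434`) and its `/api/chat` endpoint; the helper names `build_chat_request` and `chat` are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL_NAME = "hf.co/mshojaei77/gemma-3-4b-persian-v0:Q8_0"


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming chat request containing a single user message."""
    payload = {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the prompt to the Ollama server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["message"]["content"]
```

With the server running, `print(chat("توماس جفرسون کیست؟"))` should print the model's answer to "Who is Thomas Jefferson?".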

### Running with Hugging Face Transformers

1. **Install Dependencies:**

```bash
pip install git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3 accelerate
```

2. **Load Model and Tokenizer:**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "mshojaei77/gemma-3-4b-persian-v0"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # Use "cuda" for GPU usage if available
    torch_dtype=torch.bfloat16,  # Alternatively, use torch.float16
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": "توماس جفرسون کیست؟"  # "Who is Thomas Jefferson?"
    }
]
# return_dict=True returns a dict (input_ids, attention_mask) that can be
# unpacked into generate(); without it, a bare tensor is returned.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Data and Fine-Tuning

### Training Dataset

This model was fine-tuned using the [mshojaei77/Persian_sft](https://huggingface.co/datasets/mshojaei77/Persian_sft) dataset, which contains approximately 681,000 rows of Persian text focused on instruction-following and conversational interactions.

### Fine-Tuning

- **Method:** Supervised fine-tuning (SFT) with QLoRA (LoRA adapters trained on a 4-bit quantized base model)
- **Hardware:** One NVIDIA T4 GPU
- **Software:** Hugging Face Transformers, with `peft` for QLoRA and `bitsandbytes` for quantization
- **Trade-offs:** Reduced memory footprint at the cost of some precision compared to full-precision models
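
The memory trade-off listed above can be illustrated with back-of-the-envelope arithmetic. The sketch below assumes roughly 4 billion parameters (inferred from the "4B" in the model name, not an exact count) and counts weight storage only; quantization constants, LoRA adapters, activations, and optimizer state are ignored, so real usage will be higher. The function name `weight_memory_gb` is hypothetical.

```python
# Rough weight-memory estimate; all figures are illustrative assumptions.
ASSUMED_PARAMS = 4e9  # "4B" taken from the model name, not an exact count

BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16/float16": 2.0,
    "4-bit": 0.5,
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to store the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    print(f"{name:>17}: {weight_memory_gb(ASSUMED_PARAMS, size):5.1f} GB")
```

Under these assumptions, the weights take about 2 GB in 4-bit form versus about 8 GB in 16-bit precision, which helps explain why training fits on a single T4 with 16 GB of memory.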

## Evaluation

Evaluation results will be added soon.

## Usage Considerations and Limitations

### Intended Use Cases

- **Question Answering:** Responding accurately to Persian-language queries
- **Instruction Following:** Interpreting and executing text-based instructions in Persian
- **Text Generation:** Producing fluent, context-aware Persian content
- **Conversational AI:** Integrating into chatbots and virtual assistants
- **Image Understanding:** Accepting image inputs, a capability retained from the base model

### Limitations

- **Quantization Impact:** 4-bit quantization may reduce output precision and occasionally produce incoherent responses.
- **Evaluation Scope:** This variant has not yet been comprehensively evaluated.
- **Bias:** The model may mirror biases present in both the original Gemma 3 training data and the Persian_sft dataset.
- **Hallucination:** As with all LLMs, there is a risk of generating plausible-sounding but inaccurate information.
- **Safety:** The model has not undergone safety tuning, so extra caution is advised when deploying it in sensitive contexts.

## Maintenance and Future Work

This model is under active maintenance. Future updates may include:

- Additional evaluation metrics and benchmarks
- Enhanced safety tuning and bias mitigation strategies
- Expanded documentation and usage examples
- Incorporation of community feedback for iterative improvements

For any queries, contributions, or issues, please contact me.