Phi-4 Mini Instruct - Fine-tuned on Dolly 15K (MLX)
This model is a fine-tuned version of microsoft/Phi-4-mini-instruct using Apple's MLX framework, trained on the Databricks Dolly 15K instruction dataset.
Model Description
This model was fine-tuned for educational purposes only. It attempts to enhance Phi-4 Mini's instruction-following capabilities by training on the 15,000 human-generated instruction-following examples of the Dolly dataset (12,000 used for training, 3,000 held out for validation). Fine-tuning was performed with LoRA (Low-Rank Adaptation) on Apple Silicon hardware, and the resulting MLX weights are intended for deployment on Mac devices.
Key Features
- Base Model: Phi-4-mini-instruct (3.8B parameters)
- Fine-tuning Method: LoRA with MLX
- Training Dataset: Dolly 15K (12,000 training, 3,000 validation examples)
- Optimized for: Apple Silicon (M1, M2, M3, M4)
- License: MIT
Intended Uses & Limitations
Intended Uses
- General instruction following and question answering
- Educational applications
- Content generation
- Code assistance
- Creative writing tasks
Limitations
- The model inherits limitations from the base Phi-4 model
- Performance is tuned for Apple Silicon; speed and memory behavior may differ on other hardware
- Should not be used for critical decision-making without human oversight
- May exhibit biases present in the training data
Training Details
Training Configuration
- LoRA Rank: 64
- LoRA Alpha: 16
- LoRA Dropout: 0.1
- Target Layers: 16
- Learning Rate: 1e-4
- Batch Size: 2
- Training Iterations: 1,000
- Max Sequence Length: 2,048
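As a rough sanity check on the configuration above, the following sketch works out how much of the training split 1,000 iterations at batch size 2 would cover. It assumes one batch per iteration with no gradient accumulation, which this card does not state. The token figure is only an upper bound, since most examples are shorter than the maximum sequence length.

```python
# Values copied from the Key Features and Training Configuration lists.
BATCH_SIZE = 2
ITERATIONS = 1_000
MAX_SEQ_LEN = 2_048
TRAIN_EXAMPLES = 12_000

# Assumption: each iteration consumes exactly one batch (no gradient accumulation).
examples_seen = BATCH_SIZE * ITERATIONS
epochs = examples_seen / TRAIN_EXAMPLES

# Upper bound only: real sequences are usually shorter than the maximum length.
max_tokens_seen = examples_seen * MAX_SEQ_LEN

print(f"examples seen: {examples_seen}")
print(f"fraction of one epoch: {epochs:.3f}")
print(f"token upper bound: {max_tokens_seen:,}")
```

Under that assumption the run sees about 2,000 examples, roughly one sixth of a pass over the training split.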
Training Hardware
- Trained on Apple Silicon (M-series chip)
- Peak memory usage: ~8GB
- Training time: 25-60 minutes, depending on the specific chip
Training Data
The model was fine-tuned on the Databricks Dolly 15K dataset, which consists of:
- Total examples: 15,000 instruction-following records
- Training split: 12,000 examples (80%)
- Validation split: 3,000 examples (20%)
- Categories: Brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization
Performance
Inference Speed (on Apple Silicon)
- Prompt processing: ~240-315 tokens/second
- Generation: ~66-73 tokens/second
- Peak memory usage: 7.7-8.0 GB
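To turn these throughput figures into a rough response time, a minimal sketch follows. It assumes latency is simply prompt processing time plus generation time, and the request size (500 prompt tokens, 300 generated tokens) is a hypothetical example rather than a measured workload.

```python
def estimate_latency(prompt_tokens, new_tokens, prompt_tps, gen_tps):
    """Return estimated seconds: prompt processing time plus generation time."""
    return prompt_tokens / prompt_tps + new_tokens / gen_tps

# Hypothetical request: a 500-token prompt and 300 generated tokens.
# The two calls use the low and high ends of the ranges listed above.
slow = estimate_latency(500, 300, prompt_tps=240, gen_tps=66)
fast = estimate_latency(500, 300, prompt_tps=315, gen_tps=73)
print(f"estimated latency: {fast:.1f}s to {slow:.1f}s")
```

Model loading time and sampling overhead are ignored, so real latency will likely be somewhat higher.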
Example Outputs
Machine Learning Explanation:
"Machine learning is when our computers learn from experience by looking at examples of things that have already been done, and then apply that knowledge to new situations. For example, if we want to teach a computer to recognize cats, we can show it many pictures of cats and tell it 'this is a cat', and it will eventually be able to recognize when it sees a cat in a picture."
Renewable Energy Benefits:
"Renewable energy is sustainable and minimizes carbon emissions. It is readily available in most locations and can be exploited in the smallest of communities. It can be used for various purposes, such as generating heat and electricity and powering electric vehicles. Renewable energy is a clean fuel source. It helps to reduce air pollution and the risk of climate change."
How to Use
With MLX
from mlx_lm import load, generate
# Load the model
model, tokenizer = load("your-username/phi-4-mini-instruct-dolly-15k-mlx")
# Generate text
prompt = "<|user|>\nExplain quantum computing in simple terms<|end|>\n<|assistant|>\n"
response = generate(model, tokenizer, prompt=prompt, max_tokens=300)
print(response)
Chat Format
The model uses Phi-4's chat template with special tokens:
- <|user|>: User message start
- <|assistant|>: Assistant message start
- <|end|>: Message end
Example:
<|user|>
What is machine learning?
<|end|>
<|assistant|>
Machine learning is...
<|end|>
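A small helper can assemble prompts in this layout from a list of messages. The sketch below is illustrative: `build_prompt` is a hypothetical name, and the exact newline placement follows the prompt string in the "With MLX" example rather than any official template. The tokenizer's own chat template remains the authoritative source.

```python
def build_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts using the tokens described above."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role}")
        parts.append(f"<|{role}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = build_prompt([{"role": "user", "content": "What is machine learning?"}])
print(prompt)
```

The resulting string can be passed as the `prompt` argument in the MLX usage example.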
Training Procedure
Data Preprocessing
- Dolly 15K dataset was downloaded and formatted for Phi-4's chat template
- Instructions and responses were wrapped with appropriate special tokens
- Data was split 80/20 for training/validation
- Saved as JSONL files for MLX compatibility
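The preprocessing steps above might look roughly like the sketch below. It is not the script used for this model. The field names `instruction`, `context`, and `response` follow the published Dolly schema, but appending the context to the instruction, the shuffle seed, the `{"text": ...}` record shape, and the file names `train.jsonl` and `valid.jsonl` are all assumptions.

```python
import json
import os
import random
import tempfile

def format_record(record):
    """Wrap one Dolly record in the chat tokens described in this card."""
    user_turn = record["instruction"]
    # Assumption: any context passage is appended to the instruction.
    if record.get("context"):
        user_turn += "\n\n" + record["context"]
    text = f"<|user|>\n{user_turn}<|end|>\n<|assistant|>\n{record['response']}<|end|>"
    return {"text": text}

def split_and_save(records, train_path, valid_path, train_fraction=0.8, seed=0):
    """Shuffle, split 80/20 by default, and write one JSON object per line."""
    rows = [format_record(r) for r in records]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    for path, subset in ((train_path, rows[:cut]), (valid_path, rows[cut:])):
        with open(path, "w", encoding="utf-8") as handle:
            for row in subset:
                handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return cut, len(rows) - cut

# Toy records standing in for the real dataset.
toy = [
    {"instruction": f"Question {i}", "context": "", "response": f"Answer {i}"}
    for i in range(10)
]
out_dir = tempfile.mkdtemp()
train_file = os.path.join(out_dir, "train.jsonl")
valid_file = os.path.join(out_dir, "valid.jsonl")
n_train, n_valid = split_and_save(toy, train_file, valid_file)
print(n_train, n_valid)
```

With the full 15,000 records, the same 0.8 fraction gives the 12,000 / 3,000 split reported in this card.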
LoRA Fine-tuning
- Applied LoRA adapters to 16 transformer layers
- Trained for 1,000 iterations with validation every 200 steps
- Saved checkpoints every 500 iterations
- Final adapter weights were fused with the base model
Post-processing
The LoRA adapters were merged with the base model weights to create this standalone model, eliminating the need for adapter loading during inference.
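To illustrate why merging removes the need for adapters at inference time, here is a toy sketch in plain Python. It assumes the common LoRA formulation W' = W + scale * B A with scale = alpha / rank; the MLX implementation may define its scale differently. The matrix sizes and the toy rank are arbitrary small values chosen for readability, not the real model dimensions.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_scaled(a, b, scale):
    """Return a + scale * b element-wise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rand_matrix(rows, cols, rng):
    """Build a rows x cols matrix of uniform random values."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_out, d_in, toy_rank = 3, 4, 2   # toy sizes, not the real model
scale = 16 / 64                   # assumption: scale = alpha / rank, using this card's values

W = rand_matrix(d_out, d_in, rng)      # frozen base weight
B = rand_matrix(d_out, toy_rank, rng)  # LoRA up-projection
A = rand_matrix(toy_rank, d_in, rng)   # LoRA down-projection
x = rand_matrix(d_in, 1, rng)          # input as a column vector

# Adapter path: base output plus the scaled low-rank correction.
adapter_out = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scale)

# Fused path: fold the correction into the weight once, then do a single matmul.
W_fused = add_scaled(W, matmul(B, A), scale)
fused_out = matmul(W_fused, x)

max_diff = max(abs(p[0] - q[0]) for p, q in zip(adapter_out, fused_out))
print(max_diff < 1e-9)
```

Both paths give the same output up to floating-point rounding, so the fused weights can be loaded like any ordinary checkpoint.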
Evaluation
Validation loss decreased from ~3.0 to ~1.5-2.0 during training, which suggests the model was fitting the instruction-following patterns in the Dolly dataset.
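For readers who prefer perplexity, the reported losses can be converted as sketched below. This assumes the loss is a mean per-token cross-entropy measured in nats, which this card does not state explicitly.

```python
import math

def perplexity(loss):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)

# Approximate start and end-of-training values reported in this card.
for loss in (3.0, 2.0, 1.5):
    print(f"loss {loss:.1f} -> perplexity {perplexity(loss):.1f}")
```

Under that assumption, the drop corresponds to perplexity falling from about 20 to somewhere between roughly 4.5 and 7.4.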
Environmental Impact
This model was trained on Apple Silicon hardware, which generally draws less power than a discrete-GPU training setup. The estimated carbon footprint is minimal due to:
- Short training time (< 1 hour)
- Efficient LoRA method (only 0.082% of parameters trained)
- Apple Silicon's power efficiency
Citation
If you use this model, please cite:
@misc{phi4-mini-dolly-mlx,
author = {Your Name},
title = {Phi-4 Mini Instruct Fine-tuned on Dolly 15K for MLX},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/your-username/phi-4-mini-instruct-dolly-15k-mlx}
}
Acknowledgments
- Microsoft for the Phi-4 base model
- Databricks for the Dolly 15K dataset
- Apple MLX team for the framework
- The open-source community
Model Card Contact
For questions or issues with this model, please open an issue on the GitHub repository or contact the author via Hugging Face.