Phi-4 Mini Instruct - Fine-tuned on Dolly 15K (MLX)

This model is a fine-tuned version of microsoft/Phi-4-mini-instruct using Apple's MLX framework, trained on the Databricks Dolly 15K instruction dataset.

Model Description

This model was fine-tuned for educational purposes only and aims to enhance Phi-4 Mini's instruction-following capabilities by training it on 15,000 high-quality, human-generated instruction-following examples from the Dolly dataset. Fine-tuning was performed with LoRA (Low-Rank Adaptation) on Apple Silicon hardware, making the model well suited to deployment on Mac devices.

Key Features

  • Base Model: Phi-4-mini-instruct (3.8B parameters)
  • Fine-tuning Method: LoRA with MLX
  • Training Dataset: Dolly 15K (12,000 training, 3,000 validation examples)
  • Optimized for: Apple Silicon (M1, M2, M3, M4)
  • License: MIT

Intended Uses & Limitations

Intended Uses

  • General instruction following and question answering
  • Educational applications
  • Content generation
  • Code assistance
  • Creative writing tasks

Limitations

  • The model inherits limitations from the base Phi-4 model
  • Performance is optimized for Apple Silicon; behavior may differ on other hardware
  • Should not be used for critical decision-making without human oversight
  • May exhibit biases present in the training data

Training Details

Training Configuration

  • LoRA Rank: 64
  • LoRA Alpha: 16
  • LoRA Dropout: 0.1
  • Target Layers: 16
  • Learning Rate: 1e-4
  • Batch Size: 2
  • Training Iterations: 1,000
  • Max Sequence Length: 2,048
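
For reference, the same hyperparameters expressed as a plain Python mapping. In practice they are passed to mlx_lm's LoRA tooling (as CLI flags or a YAML config), and the exact option names vary between mlx-lm versions, so the keys below are illustrative assumptions rather than a definitive config:

# Illustrative only: key names approximate mlx_lm's LoRA options and may differ by version.
training_config = {
    "num_layers": 16,          # transformer layers that receive LoRA adapters
    "batch_size": 2,
    "iters": 1000,
    "learning_rate": 1e-4,
    "max_seq_length": 2048,
    "lora_parameters": {
        "rank": 64,
        "alpha": 16,
        "dropout": 0.1,
    },
}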

Training Hardware

  • Trained on Apple Silicon (M-series chip)
  • Peak memory usage: ~8GB
  • Training time: 25-60 minutes, depending on the specific chip

Training Data

The model was fine-tuned on the Databricks Dolly 15K dataset, which consists of:

  • Total examples: 15,000 instruction-following records
  • Training split: 12,000 examples (80%)
  • Validation split: 3,000 examples (20%)
  • Categories: Brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization

Performance

Inference Speed (on Apple Silicon)

  • Prompt processing: ~240-315 tokens/second
  • Generation: ~66-73 tokens/second
  • Peak memory usage: 7.7-8.0 GB
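
These figures can be reproduced locally with mlx_lm's verbose generation output, which in recent mlx-lm versions reports prompt/generation throughput and peak memory (the exact report format depends on the version). A minimal sketch:

from mlx_lm import load, generate

# Load the fused model from the Hub
model, tokenizer = load("ianphil/phi4-mini-dolly-15k-mlx")

prompt = "<|user|>\nList three benefits of renewable energy.<|end|>\n<|assistant|>\n"

# verbose=True prints the response along with tokens-per-second and peak memory stats
generate(model, tokenizer, prompt=prompt, max_tokens=200, verbose=True)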

Example Outputs

Machine Learning Explanation:

"Machine learning is when our computers learn from experience by looking at examples of things that have already been done, and then apply that knowledge to new situations. For example, if we want to teach a computer to recognize cats, we can show it many pictures of cats and tell it 'this is a cat', and it will eventually be able to recognize when it sees a cat in a picture."

Renewable Energy Benefits:

"Renewable energy is sustainable and minimizes carbon emissions. It is readily available in most locations and can be exploited in the smallest of communities. It can be used for various purposes, such as generating heat and electricity and powering electric vehicles. Renewable energy is a clean fuel source. It helps to reduce air pollution and the risk of climate change."

How to Use

With MLX

from mlx_lm import load, generate

# Load the model
model, tokenizer = load("ianphil/phi4-mini-dolly-15k-mlx")

# Generate text
prompt = "<|user|>\nExplain quantum computing in simple terms<|end|>\n<|assistant|>\n"
response = generate(model, tokenizer, prompt=prompt, max_tokens=300)
print(response)

Chat Format

The model uses Phi-4's chat template with special tokens:

  • <|user|> - User message start
  • <|assistant|> - Assistant message start
  • <|end|> - Message end

Example:

<|user|>
What is machine learning?
<|end|>
<|assistant|>
Machine learning is...
<|end|>
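
Rather than assembling these tokens by hand, the prompt can also be built from the tokenizer's chat template. A minimal sketch, assuming the bundled tokenizer ships Phi-4's template and that mlx_lm's tokenizer wrapper exposes the standard apply_chat_template method:

from mlx_lm import load, generate

model, tokenizer = load("ianphil/phi4-mini-dolly-15k-mlx")

# Build the prompt from the chat template instead of hand-writing special tokens
messages = [{"role": "user", "content": "What is machine learning?"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

response = generate(model, tokenizer, prompt=prompt, max_tokens=300)
print(response)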

Training Procedure

Data Preprocessing

  1. The Dolly 15K dataset was downloaded and formatted for Phi-4's chat template
  2. Instructions and responses were wrapped with the appropriate special tokens
  3. The data was split 80/20 into training and validation sets
  4. The formatted examples were saved as JSONL files for MLX compatibility (see the sketch below)
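
A minimal sketch of this preprocessing, assuming the Hugging Face datasets library; the output file names and the {"text": ...} record format are assumptions about what the MLX training tooling expects:

import json
import random

from datasets import load_dataset

# Dolly 15K fields: instruction, context, response, category
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_chat(example):
    # Wrap the instruction (and optional context) in Phi-4's special tokens
    user = example["instruction"]
    if example["context"]:
        user += "\n\n" + example["context"]
    return {"text": f"<|user|>\n{user}<|end|>\n<|assistant|>\n{example['response']}<|end|>"}

records = [to_chat(ex) for ex in dolly]
random.seed(0)
random.shuffle(records)

# 80/20 train/validation split, saved as JSONL for MLX
split = int(0.8 * len(records))
for name, rows in [("train.jsonl", records[:split]), ("valid.jsonl", records[split:])]:
    with open(name, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")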

LoRA Fine-tuning

  1. Applied LoRA adapters to 16 transformer layers
  2. Trained for 1,000 iterations with validation every 200 steps
  3. Saved checkpoints every 500 iterations
  4. Final adapter weights were fused with the base model

Post-processing

The LoRA adapters were merged with the base model weights to create this standalone model, eliminating the need for adapter loading during inference.
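
In practice this means the model loads like any other MLX model, with no adapter files required at inference time. For contrast, a pre-fuse checkpoint would have to load the base weights plus adapters (the adapter path below is hypothetical, assuming mlx_lm's adapter_path argument):

from mlx_lm import load

# Fused model: the LoRA weights are baked in, so a plain load is enough
model, tokenizer = load("ianphil/phi4-mini-dolly-15k-mlx")

# Before fusing, the same run would need the base model plus the adapter directory, e.g.:
# model, tokenizer = load("microsoft/Phi-4-mini-instruct", adapter_path="adapters/")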

Evaluation

Validation loss decreased from ~3.0 to ~1.5-2.0 during training, indicating successful learning of the instruction-following patterns in the Dolly dataset.

Environmental Impact

This model was trained on energy-efficient Apple Silicon hardware, resulting in lower power consumption compared to traditional GPU training. Estimated carbon footprint is minimal due to:

  • Short training time (< 1 hour)
  • Efficient LoRA method (only 0.082% of parameters trained)
  • Apple Silicon's power efficiency

Citation

If you use this model, please cite:

@misc{phi4-mini-dolly-mlx,
  author = {Your Name},
  title = {Phi-4 Mini Instruct Fine-tuned on Dolly 15K for MLX},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ianphil/phi4-mini-dolly-15k-mlx}
}

Acknowledgments

  • Microsoft for the Phi-4 base model
  • Databricks for the Dolly 15K dataset
  • Apple MLX team for the framework
  • The open-source community

Model Card Contact

For questions or issues with this model, please open an issue on the GitHub repository or contact via Hugging Face.
