# Free H200 Training: Nano-Coder on Hugging Face

This guide shows you how to train a nano-coder model using **Hugging Face's free H200 GPU access** (4 minutes daily).
## What You Get

- **Free H200 GPU**: 4 minutes per day
- **No Credit Card Required**: Completely free
- **Easy Setup**: Just a few clicks
- **Model Sharing**: Automatic upload to HF Hub
## Quick Start

### Option 1: Hugging Face Space (Recommended)

1. **Create HF Space:**

   ```bash
   huggingface-cli repo create nano-coder-free --type space
   ```

2. **Upload Files:**
   - Upload all the Python files to your space
   - Make sure `app.py` is in the root directory

3. **Configure Space:**
   - Set **Hardware**: H200 (free tier)
   - Set **Python Version**: 3.9+
   - Set **Requirements**: `requirements.txt`

4. **Launch Training:**
   - Go to your space URL
   - Click "Start Free H200 Training"
   - Wait for training to complete (3.5 minutes)
### Option 2: Local Setup with HF Free Tier

1. **Install Dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

2. **Set HF Token:**

   ```bash
   export HF_TOKEN="your_token_here"
   ```

3. **Run Free Training:**

   ```bash
   python hf_free_training.py
   ```
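Before launching, it helps to fail fast when the token is missing rather than partway through the run. A minimal sketch of that check (the `require_hf_token` helper is illustrative, not part of the scripts above):

```python
import os

def require_hf_token() -> str:
    """Return the Hugging Face token from the environment, or raise early."""
    token = os.environ.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it before training.")
    return token
```
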
## Model Configuration (Free Tier)

| Parameter | Free Tier | Full Model |
|-----------|-----------|------------|
| **Layers** | 6 | 12 |
| **Heads** | 6 | 12 |
| **Embedding** | 384 | 768 |
| **Context** | 512 | 1024 |
| **Parameters** | ~15M | ~124M |
| **Training Time** | 3.5 min | 2-4 hours |
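The parameter counts above can be sanity-checked with the standard back-of-the-envelope estimate of roughly 12 · n_layer · n_embd² weights in the attention and MLP blocks (token embeddings excluded, which add most of the remaining parameters). This is a rough check, not the exact count computed by `model.py`:

```python
def approx_block_params(n_layer: int, n_embd: int) -> int:
    """Rough transformer-block parameter count:
    ~4*d^2 attention weights plus ~8*d^2 MLP weights per layer."""
    return 12 * n_layer * n_embd ** 2

# Free tier: 6 layers, 384-dim embeddings -> ~10.6M in blocks,
# plus token/position embeddings, landing near the ~15M in the table.
free = approx_block_params(6, 384)

# Full model: 12 layers, 768-dim -> ~85M in blocks,
# plus embeddings, landing near the ~124M in the table.
full = approx_block_params(12, 768)
```
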
## Time Management

- **Daily Limit**: 4 minutes of H200 time
- **Training Time**: 3.5 minutes (safe buffer)
- **Automatic Stop**: Script stops before the time limit
- **Daily Reset**: A fresh 4 minutes every day at midnight UTC
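The automatic stop amounts to a wall-clock guard around the training loop. A sketch of the idea (the names here are illustrative, not the exact code in `hf_free_training.py`):

```python
import time

MAX_TRAINING_TIME = 3.5 * 60  # seconds; stays under the 4-minute daily limit

def run_with_budget(step_fn, max_seconds=MAX_TRAINING_TIME):
    """Call step_fn() repeatedly until the time budget is spent.
    Returns the number of completed steps."""
    start = time.monotonic()
    steps = 0
    while time.monotonic() - start < max_seconds:
        step_fn()
        steps += 1
    return steps
```

Using `time.monotonic()` rather than `time.time()` keeps the budget immune to system clock adjustments mid-run.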
## Features

### Training Features

- ✅ **Automatic Time Tracking**: Stops before the limit
- ✅ **Frequent Checkpoints**: Every 200 iterations
- ✅ **HF Hub Upload**: Models saved automatically
- ✅ **Wandb Logging**: Real-time metrics
- ✅ **Progress Monitoring**: Time-remaining display
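The checkpoint cadence is just a modulo check inside the training loop. A minimal sketch (the `save_checkpoint` callback stands in for the actual `torch.save` call):

```python
def train(num_iters, step_fn, save_checkpoint, ckpt_interval=200):
    """Run num_iters training steps, checkpointing every ckpt_interval."""
    for it in range(1, num_iters + 1):
        step_fn(it)
        if it % ckpt_interval == 0:
            save_checkpoint(it)
```

With the free tier's ~500-1000 iterations, a 200-iteration interval yields a handful of checkpoints per session without spending much of the budget on I/O.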
### Generation Features

- ✅ **Interactive UI**: Gradio interface
- ✅ **Custom Prompts**: Start from any Python code
- ✅ **Adjustable Parameters**: Temperature, max tokens
- ✅ **Real-time Generation**: Instant results
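Temperature rescales the model's logits before the softmax: values below 1 sharpen the distribution toward the top token, values above 1 flatten it. A self-contained illustration in pure Python (independent of the actual sampling code in `sample_nano_coder.py`):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Lower temperature concentrates probability mass on the top logit;
# higher temperature spreads it out, producing more varied samples.
probs_sharp = softmax_with_temperature([2.0, 1.0, 0.1], temperature=0.5)
probs_flat = softmax_with_temperature([2.0, 1.0, 0.1], temperature=2.0)
```
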
## File Structure

```
nano-coder-free/
├── app.py                    # HF Space app
├── hf_free_training.py       # Free H200 training script
├── prepare_code_dataset.py   # Dataset preparation
├── sample_nano_coder.py      # Code generation
├── requirements.txt          # Dependencies
├── model.py                  # nanoGPT model
├── configurator.py           # Configuration
└── README_free_H200.md       # This file
```
## Customization

### Adjust Training Parameters

Edit `hf_free_training.py`:

```python
# Model size (smaller = faster training)
n_layer = 4   # even smaller
n_head = 4    # even smaller
n_embd = 256  # even smaller

# Training time (be conservative)
MAX_TRAINING_TIME = 3.0 * 60  # 3 minutes

# Batch size (larger = faster, if it fits in memory)
batch_size = 128
```
### Change Dataset

```python
# In prepare_code_dataset.py
from datasets import load_dataset

dataset = load_dataset("your-dataset")  # replace with your own dataset ID
```
## Expected Results

After 3.5 minutes of training on an H200:

- **Training Loss**: ~2.5-3.0
- **Validation Loss**: ~2.8-3.3
- **Model Size**: ~15MB
- **Code Quality**: Basic Python functions
- **Iterations**: ~500-1000
## Use Cases

### Perfect For:

- ✅ **Learning**: Understand nanoGPT training
- ✅ **Prototyping**: Test ideas quickly
- ✅ **Experiments**: Try different configurations
- ✅ **Small Models**: Code-generation demos

### Not Suitable For:

- ❌ **Production**: Too small for real use
- ❌ **Large Models**: Limited by the time and parameter budget
- ❌ **Long Training**: 4-minute daily limit
## Daily Workflow

1. **Morning**: Check whether you can train today
2. **Prepare**: Have your dataset ready
3. **Train**: Run a 3.5-minute training session
4. **Test**: Generate some code samples
5. **Share**: Upload to the HF Hub if the results look good
6. **Wait**: Come back tomorrow for more training
## Troubleshooting

### Common Issues

1. **"Daily limit reached"**
   - Wait until tomorrow
   - Check your timezone
2. **"No GPU available"**
   - The H200 queue might be busy
   - Try again in a few minutes
3. **"Training too slow"**
   - Reduce the model size
   - Increase the batch size
   - Use a smaller context
4. **"Out of memory"**
   - Reduce `batch_size`
   - Reduce `block_size`
   - Reduce the model size
### Performance Tips

- **Batch Size**: Use the largest that fits in memory
- **Context Length**: 512 works well for the free tier
- **Model Size**: 6 layers is a good balance for the time budget
- **Learning Rate**: 1e-3 for fast convergence in a short run
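Applied together, the tips above translate into a configuration along these lines (the values are suggested starting points for the free tier, not measured optima):

```python
# Free-tier training configuration (suggested starting point)
config = dict(
    n_layer=6,
    n_head=6,
    n_embd=384,
    block_size=512,      # context length
    batch_size=64,       # raise until you hit out-of-memory, then back off
    learning_rate=1e-3,  # aggressive, for fast convergence in a short run
)
```
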
## Monitoring

### Wandb Dashboard

- Real-time loss curves
- Training metrics
- Model performance

### HF Hub

- Model checkpoints
- Training logs
- Generated samples

### Local Files

- `out-nano-coder-free/ckpt.pt` - latest model checkpoint
- `daily_limit_YYYY-MM-DD.txt` - usage tracking
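The usage tracking can be as simple as one marker file per date, matching the `daily_limit_YYYY-MM-DD.txt` naming above. A sketch of how such tracking might work (the helper names are illustrative):

```python
from datetime import date
from pathlib import Path

def limit_file(day: date, base: Path = Path(".")) -> Path:
    """Path of the usage-tracking file for a given date."""
    return base / f"daily_limit_{day.isoformat()}.txt"

def already_trained_today(base: Path = Path(".")) -> bool:
    """True if today's limit file exists, i.e. the budget is spent."""
    return limit_file(date.today(), base).exists()

def mark_trained_today(seconds_used: float, base: Path = Path(".")) -> None:
    """Record today's usage so later runs skip training."""
    limit_file(date.today(), base).write_text(f"{seconds_used:.1f}\n")
```
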
## Success Stories

Users have achieved:

- ✅ Basic Python function generation
- ✅ Simple class definitions
- ✅ List comprehensions
- ✅ Error-handling patterns
- ✅ Docstring generation
## Resources

- [Hugging Face Spaces](https://huggingface.co/spaces)
- [Free GPU Access](https://huggingface.co/docs/hub/spaces-sdks-docker-gpu)
- [NanoGPT Original](https://github.com/karpathy/nanoGPT)
- [Python Code Dataset](https://huggingface.co/datasets/flytech/python-codes-25k)
## Contributing

Want to improve the free H200 setup?

1. **Optimize the Model**: Make it train faster
2. **Better UI**: Improve the Gradio interface
3. **More Datasets**: Support other code datasets
4. **Documentation**: Help others get started
## License

This project follows the same license as the original nanoGPT repository.

---

**Happy Free H200 Training!**

Remember: 4 minutes a day keeps the AI doctor away!