Mistral-Small-24B-Reasoning
Mistral-Small-24B-Reasoning is a fine-tuned version of mistralai/Mistral-Small-24B-Instruct-2501 that has been enhanced for advanced reasoning and thinking tasks. This model was trained on the high-quality OpenThoughts-114k dataset, which contains 114,000 synthetic reasoning examples covering mathematics, science, coding, and complex puzzles.
Model Overview
Mistral-Small-24B-Reasoning excels at:
- Step-by-step reasoning across multiple domains
- Mathematical problem solving with detailed explanations
- Scientific analysis and conceptual understanding
- Code generation and debugging with logical thinking
- Complex puzzle solving requiring multi-step reasoning
The model has been fine-tuned to generate explicit thinking processes, making its reasoning transparent and interpretable.
Model Details
- Base Model: mistralai/Mistral-Small-24B-Instruct-2501
- Parameters: 24 billion
- Architecture: MistralForCausalLM
- Context Length: 32,768 tokens
- Precision: bfloat16
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: OpenThoughts-114k (114,000 high-quality reasoning examples)
Training Configuration
- LoRA Rank: 8
- LoRA Alpha: 16
- Learning Rate: 5e-5
- Batch Size: 2 per device
- Gradient Accumulation: 8 steps
- Training Epochs: 5
- Optimizer: AdamW
- Scheduler: Cosine
- Max Samples: 100,000
- Thinking Mode: Enabled
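Fine-tuning was performed with LLaMA-Factory (see Acknowledgments). Purely as an illustration, the hyperparameters above map roughly onto the following PEFT/transformers settings. This is a sketch, not the actual training script; in particular, the LoRA target modules and output path are assumptions not stated in this card.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Rough equivalent of the hyperparameters listed above (illustrative only).
# target_modules is an assumption -- the card does not say which projections were adapted.
lora_config = LoraConfig(
    r=8,                                                      # LoRA rank
    lora_alpha=16,                                            # LoRA alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="mistral-small-24b-reasoning",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    bf16=True,
)
```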
Training Loss
Training showed stable convergence, with consistent loss reduction across epochs:
(Figure: training loss curve from fine-tuning on the OpenThoughts-114k dataset.)
Usage
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "RekklesAI/Mistral-Small-24B-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example usage (move inputs onto the same device as the model)
prompt = "Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 1?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
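For interactive use, generation can also be streamed token by token. A minimal sketch reusing the `model`, `tokenizer`, and `inputs` objects from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**inputs, max_new_tokens=512, streamer=streamer)
```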
Chat Template
```python
messages = [
    {"role": "user", "content": "Explain how to solve a quadratic equation using the quadratic formula."}
]

# Apply chat template
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens (skip the prompt portion)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
```
Use Cases
Mathematical Reasoning
- Solving complex equations step-by-step
- Proof verification and generation
- Statistical analysis and probability
- Calculus and advanced mathematics
Scientific Analysis
- Physics problem solving
- Chemistry reaction mechanisms
- Biology concept explanations
- Data interpretation
Code Development
- Algorithm design and optimization
- Debugging complex code issues
- Code review and improvement suggestions
- Technical architecture decisions
Problem Solving
- Logic puzzles and brain teasers
- Strategic planning scenarios
- Decision analysis frameworks
- Creative problem-solving approaches
Performance
Compared to the base model, the fine-tuned model shows qualitative improvements on reasoning tasks:
- Enhanced step-by-step problem decomposition
- More accurate mathematical computations
- Better code generation with explanations
- Improved logical consistency across responses
Limitations
- The model may occasionally generate verbose explanations
- Performance on extremely specialized domains may vary
- Responses should be verified for critical applications
- May require significant computational resources for inference (see the quantization sketch below)
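For memory-constrained hardware, one option is to load the model with 4-bit quantization via bitsandbytes. This is an illustrative sketch rather than an official recommendation; it assumes bitsandbytes is installed, and quantization may slightly affect output quality.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization to reduce the memory footprint (requires bitsandbytes)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "RekklesAI/Mistral-Small-24B-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```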
Training Data
The model was trained on the OpenThoughts-114k dataset, which includes:
- Mathematics: Algebra, calculus, geometry, statistics
- Science: Physics, chemistry, biology concepts
- Programming: Algorithms, data structures, debugging
- Logic: Puzzles, reasoning challenges, problem-solving
The dataset contains high-quality synthetic examples with detailed reasoning traces, enabling the model to learn explicit thinking patterns.
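The dataset can be inspected directly from the Hugging Face Hub. A minimal sketch, assuming the datasets library is installed and the open-thoughts/OpenThoughts-114k dataset ID (adjust if the ID differs):

```python
from datasets import load_dataset

# Dataset ID assumed from the OpenThoughts project
dataset = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(dataset)      # number of rows and column names
print(dataset[0])   # one example with its reasoning trace
```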
Model Architecture
Key configuration parameters of the MistralForCausalLM architecture:
- Hidden Size: 5,120
- Intermediate Size: 32,768
- Number of Layers: 40
- Attention Heads: 32
- Key-Value Heads: 8
- Vocabulary Size: 131,072
- Max Position Embeddings: 32,768
- RoPE Theta: 100,000,000
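These values can be cross-checked against the published configuration; a small sketch using transformers' AutoConfig (the expected values in the comments are the ones listed above):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("RekklesAI/Mistral-Small-24B-Reasoning")
print(config.hidden_size)               # 5120
print(config.num_hidden_layers)         # 40
print(config.num_attention_heads)       # 32
print(config.num_key_value_heads)       # 8
print(config.vocab_size)                # 131072
print(config.max_position_embeddings)   # 32768
print(config.rope_theta)                # 100000000
```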
Citation
```bibtex
@misc{mistralsmall24breasoning,
  title  = {Mistral-Small-24B-Reasoning: A Reasoning-Enhanced Large Language Model},
  author = {[Your Name]},
  year   = {2025},
  note   = {Fine-tuned from Mistral-Small-24B-Instruct-2501 using the OpenThoughts-114k dataset}
}
```
License
This model is released under the Apache 2.0 License, following the base model's licensing terms.
Acknowledgments
- Mistral AI for the exceptional base model
- OpenThoughts team for the high-quality reasoning dataset
- LLaMA-Factory for the excellent fine-tuning framework
Built with ❤️ using LLaMA-Factory