Mistral-Small-24B-Reasoning
Mistral-Small-24B-Reasoning is a fine-tuned version of mistralai/Mistral-Small-24B-Instruct-2501 that has been enhanced for advanced reasoning and thinking tasks. This model was trained on the high-quality OpenThoughts-114k dataset, which contains 114,000 synthetic reasoning examples covering mathematics, science, coding, and complex puzzles.
Model Overview
Mistral-Small-24B-Reasoning excels at:
- Step-by-step reasoning across multiple domains
- Mathematical problem solving with detailed explanations
- Scientific analysis and conceptual understanding
- Code generation and debugging with logical thinking
- Complex puzzle solving requiring multi-step reasoning
The model has been fine-tuned to generate explicit thinking processes, making its reasoning transparent and interpretable.
Model Details
- Base Model: mistralai/Mistral-Small-24B-Instruct-2501
- Parameters: 24 billion
- Architecture: MistralForCausalLM
- Context Length: 32,768 tokens
- Precision: bfloat16
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: OpenThoughts-114k (114,000 high-quality reasoning examples)
Training Configuration
- LoRA Rank: 8
- LoRA Alpha: 16
- Learning Rate: 5e-5
- Batch Size: 2 per device
- Gradient Accumulation: 8 steps
- Training Epochs: 5
- Optimizer: AdamW
- Scheduler: Cosine
- Max Samples: 100,000
- Thinking Mode: Enabled
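Fine-tuning was performed with LLaMA-Factory (see Acknowledgments). Purely as an illustration, the hyperparameters above map roughly onto the following PEFT/transformers settings. This is a sketch, not the actual training script; in particular, the LoRA target modules and output path are assumptions not stated in this card.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Rough equivalent of the hyperparameters listed above (illustrative only).
# target_modules is an assumption -- the card does not say which projections were adapted.
lora_config = LoraConfig(
    r=8,                                                      # LoRA rank
    lora_alpha=16,                                            # LoRA alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="mistral-small-24b-reasoning",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    bf16=True,
)
```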
Training Loss
Training showed stable convergence, with consistent loss reduction across epochs:
(Figure: training loss curve from fine-tuning on the OpenThoughts-114k dataset.)
Usage
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "RekklesAI/Mistral-Small-24B-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example usage (move inputs onto the same device as the model)
prompt = "Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 1?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
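For interactive use, generation can also be streamed token by token. A minimal sketch reusing the `model`, `tokenizer`, and `inputs` objects from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**inputs, max_new_tokens=512, streamer=streamer)
```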
Chat Template
```python
messages = [
    {"role": "user", "content": "Explain how to solve a quadratic equation using the quadratic formula."}
]

# Apply chat template
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens (skip the prompt portion)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
```
Use Cases
Mathematical Reasoning
- Solving complex equations step-by-step
- Proof verification and generation
- Statistical analysis and probability
- Calculus and advanced mathematics
Scientific Analysis
- Physics problem solving
- Chemistry reaction mechanisms
- Biology concept explanations
- Data interpretation
Code Development
- Algorithm design and optimization
- Debugging complex code issues
- Code review and improvement suggestions
- Technical architecture decisions
Problem Solving
- Logic puzzles and brain teasers
- Strategic planning scenarios
- Decision analysis frameworks
- Creative problem-solving approaches
Performance
Compared to the base model, the fine-tuned model shows qualitative improvements on reasoning tasks:
- Enhanced step-by-step problem decomposition
- More accurate mathematical computations
- Better code generation with explanations
- Improved logical consistency across responses
Limitations
- The model may occasionally generate verbose explanations
- Performance on extremely specialized domains may vary
- Responses should be verified for critical applications
- May require significant computational resources for inference (see the quantization sketch below)
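For memory-constrained hardware, one option is to load the model with 4-bit quantization via bitsandbytes. This is an illustrative sketch rather than an official recommendation; it assumes bitsandbytes is installed, and quantization may slightly affect output quality.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization to reduce the memory footprint (requires bitsandbytes)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "RekklesAI/Mistral-Small-24B-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```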
Training Data
The model was trained on the OpenThoughts-114k dataset, which includes:
- Mathematics: Algebra, calculus, geometry, statistics
- Science: Physics, chemistry, biology concepts
- Programming: Algorithms, data structures, debugging
- Logic: Puzzles, reasoning challenges, problem-solving
The dataset contains high-quality synthetic examples with detailed reasoning traces, enabling the model to learn explicit thinking patterns.
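The dataset can be inspected directly from the Hugging Face Hub. A minimal sketch, assuming the datasets library is installed and the open-thoughts/OpenThoughts-114k dataset ID (adjust if the ID differs):

```python
from datasets import load_dataset

# Dataset ID assumed from the OpenThoughts project
dataset = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(dataset)      # number of rows and column names
print(dataset[0])   # one example with its reasoning trace
```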
Model Architecture
Key configuration parameters of the MistralForCausalLM architecture:
- Hidden Size: 5,120
- Intermediate Size: 32,768
- Number of Layers: 40
- Attention Heads: 32
- Key-Value Heads: 8
- Vocabulary Size: 131,072
- Max Position Embeddings: 32,768
- RoPE Theta: 100,000,000
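These values can be cross-checked against the published configuration; a small sketch using transformers' AutoConfig (the expected values in the comments are the ones listed above):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("RekklesAI/Mistral-Small-24B-Reasoning")
print(config.hidden_size)               # 5120
print(config.num_hidden_layers)         # 40
print(config.num_attention_heads)       # 32
print(config.num_key_value_heads)       # 8
print(config.vocab_size)                # 131072
print(config.max_position_embeddings)   # 32768
print(config.rope_theta)                # 100000000
```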
Citation
```bibtex
@misc{mistralsmall24breasoning,
  title  = {Mistral-Small-24B-Reasoning: A Reasoning-Enhanced Large Language Model},
  author = {[Your Name]},
  year   = {2025},
  note   = {Fine-tuned from Mistral-Small-24B-Instruct-2501 using the OpenThoughts-114k dataset}
}
```
License
This model is released under the Apache 2.0 License, following the base model's licensing terms.
Acknowledgments
- Mistral AI for the exceptional base model
- OpenThoughts team for the high-quality reasoning dataset
- LLaMA-Factory for the excellent fine-tuning framework
Built with ❤️ using LLaMA-Factory