
Mistral-Small-24B-Reasoning

Mistral-Small-24B-Reasoning is a fine-tuned version of mistralai/Mistral-Small-24B-Instruct-2501 that has been enhanced for advanced reasoning and thinking tasks. This model was trained on the high-quality OpenThoughts-114k dataset, which contains 114,000 synthetic reasoning examples covering mathematics, science, coding, and complex puzzles.

🚀 Model Overview

Mistral-Small-24B-Reasoning excels at:

  • Step-by-step reasoning across multiple domains
  • Mathematical problem solving with detailed explanations
  • Scientific analysis and conceptual understanding
  • Code generation and debugging with logical thinking
  • Complex puzzle solving requiring multi-step reasoning

The model has been fine-tuned to generate explicit thinking processes, making its reasoning transparent and interpretable.

📊 Model Details

  • Base Model: mistralai/Mistral-Small-24B-Instruct-2501
  • Parameters: 24 billion
  • Architecture: MistralForCausalLM
  • Context Length: 32,768 tokens
  • Precision: bfloat16
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Dataset: OpenThoughts-114k (114,000 high-quality reasoning examples)

🔧 Training Configuration

  • LoRA Rank: 8
  • LoRA Alpha: 16
  • Learning Rate: 5e-5
  • Batch Size: 2 per device
  • Gradient Accumulation: 8 steps
  • Training Epochs: 5
  • Optimizer: AdamW
  • Scheduler: Cosine
  • Max Samples: 100,000
  • Thinking Mode: Enabled
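
For reference, the adapter hyperparameters above map onto a PEFT LoraConfig roughly as follows. This is a minimal sketch: the target modules and dropout value are assumptions for illustration, not settings taken from the actual training run.

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Rank and alpha mirror the training configuration above; target_modules
# and dropout are illustrative assumptions, not confirmed training settings.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Small-24B-Instruct-2501",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # shows how few parameters LoRA trains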

📊 Training Loss

The training loss decreased consistently across epochs, indicating stable convergence:

Training loss curve showing stable convergence during fine-tuning on the OpenThoughts-114k dataset.

💻 Usage

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "RekklesAI/Mistral-Small-24B-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example usage
prompt = "Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 1?"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
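
For this prompt, a correct response works through the power rule term by term and arrives at 3x^2 + 4x - 5, typically preceded by the model's explicit thinking trace.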

Chat Template

messages = [
    {"role": "user", "content": "Explain how to solve a quadratic equation using the quadratic formula."}
]

# Apply chat template
formatted_prompt = tokenizer.apply_chat_template(
    messages, 
    tokenize=False, 
    add_generation_prompt=True
)

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
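
Reasoning traces can be long, so streaming the output is often convenient. Below is a minimal sketch using the transformers TextStreamer, reusing the model and tokenizer loaded above:

from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    do_sample=True,
    streamer=streamer,  # tokens are printed to stdout as they are generated
)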

🎯 Use Cases

Mathematical Reasoning

  • Solving complex equations step-by-step
  • Proof verification and generation
  • Statistical analysis and probability
  • Calculus and advanced mathematics

Scientific Analysis

  • Physics problem solving
  • Chemistry reaction mechanisms
  • Biology concept explanations
  • Data interpretation

Code Development

  • Algorithm design and optimization
  • Debugging complex code issues
  • Code review and improvement suggestions
  • Technical architecture decisions

Problem Solving

  • Logic puzzles and brain teasers
  • Strategic planning scenarios
  • Decision analysis frameworks
  • Creative problem-solving approaches

📈 Performance

Compared to the base model, fine-tuning yields qualitative improvements in reasoning tasks:

  • Enhanced step-by-step problem decomposition
  • More accurate mathematical computations
  • Better code generation with explanations
  • Improved logical consistency across responses

⚠️ Limitations

  • The model may occasionally generate verbose explanations
  • Performance on extremely specialized domains may vary
  • Responses should be verified for critical applications
  • May require significant computational resources for inference
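
If a full bfloat16 load is too memory-hungry, 4-bit quantization via bitsandbytes is one way to reduce the footprint. The sketch below is illustrative; its effect on reasoning quality has not been measured for this checkpoint.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "RekklesAI/Mistral-Small-24B-Reasoning"

# Illustrative NF4 quantization settings; not validated for this model.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)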

πŸ” Training Data

The model was trained on the OpenThoughts-114k dataset, which includes:

  • Mathematics: Algebra, calculus, geometry, statistics
  • Science: Physics, chemistry, biology concepts
  • Programming: Algorithms, data structures, debugging
  • Logic: Puzzles, reasoning challenges, problem-solving

The dataset contains high-quality synthetic examples with detailed reasoning traces, enabling the model to learn explicit thinking patterns.
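
To inspect the training data yourself, it can be loaded with the datasets library. The repository id below assumes the public Hugging Face release of OpenThoughts-114k:

from datasets import load_dataset

# Dataset id is assumed to be the public release; adjust if it differs.
ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(ds)     # features and number of examples
print(ds[0])  # a single example with its reasoning trace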

πŸ—οΈ Model Architecture

MistralForCausalLM(
  - Hidden Size: 5,120
  - Intermediate Size: 32,768
  - Number of Layers: 40
  - Attention Heads: 32
  - Key-Value Heads: 8
  - Vocabulary Size: 131,072
  - Max Position Embeddings: 32,768
  - RoPE Theta: 100,000,000
)
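
These values can be cross-checked against the published configuration without downloading the weights, for example:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("RekklesAI/Mistral-Small-24B-Reasoning")
print(config.hidden_size)               # 5120
print(config.num_hidden_layers)         # 40
print(config.num_attention_heads)       # 32
print(config.num_key_value_heads)       # 8
print(config.vocab_size)                # 131072
print(config.max_position_embeddings)   # 32768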

πŸ“ Citation

@misc{mistralsmall24breasoning,
  title={Mistral-Small-24B-Reasoning: A Reasoning-Enhanced Large Language Model},
  author={[Your Name]},
  year={2025},
  note={Fine-tuned from Mistral-Small-24B-Instruct-2501 using OpenThoughts-114k dataset}
}

📄 License

This model is released under the Apache 2.0 License, following the base model's licensing terms.

πŸ™ Acknowledgments

  • Mistral AI for the exceptional base model
  • OpenThoughts team for the high-quality reasoning dataset
  • LLaMA-Factory for the excellent fine-tuning framework

Built with ❤️ using LLaMA-Factory
