|
--- |
|
license: apache-2.0 |
|
base_model: mistralai/Mistral-Small-24B-Instruct-2501 |
|
tags: |
|
- mistral |
|
- reasoning |
|
- fine-tuned |
|
- synthetic-thinking |
|
- math |
|
- science |
|
- code |
|
- puzzles |
|
- lora |
|
library_name: transformers |
|
pipeline_tag: text-generation |
|
datasets: |
|
- open-thoughts/OpenThoughts-114k |
|
language: |
|
- en |
|
--- |
|
|
|
 |
|
|
|
# LogicFlow-Mistral-Small-24B-Reasoning |
|
|
|
**LogicFlow-Mistral-Small-24B-Reasoning** is a fine-tuned version of [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501) that has been enhanced for advanced reasoning and thinking tasks. This model was trained on the high-quality [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset, which contains 114,000 synthetic reasoning examples covering mathematics, science, coding, and complex puzzles. |
|
|
|
## Model Overview
|
|
|
LogicFlow-Mistral-Small-24B-Reasoning excels at: |
|
- **Step-by-step reasoning** across multiple domains |
|
- **Mathematical problem solving** with detailed explanations |
|
- **Scientific analysis** and conceptual understanding |
|
- **Code generation and debugging** with logical thinking |
|
- **Complex puzzle solving** requiring multi-step reasoning |
|
|
|
The model has been fine-tuned to generate explicit thinking processes, making its reasoning transparent and interpretable. |
|
|
|
## Model Details
|
|
|
- **Base Model**: mistralai/Mistral-Small-24B-Instruct-2501 |
|
- **Parameters**: 24 billion |
|
- **Architecture**: MistralForCausalLM |
|
- **Context Length**: 32,768 tokens |
|
- **Precision**: bfloat16 |
|
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) |
|
- **Dataset**: OpenThoughts-114k (114,000 high-quality reasoning examples) |
|
|
|
## Training Configuration
|
|
|
- **LoRA Rank**: 8 |
|
- **LoRA Alpha**: 16 |
|
- **Learning Rate**: 5e-5 |
|
- **Batch Size**: 2 per device |
|
- **Gradient Accumulation**: 8 steps |
|
- **Training Epochs**: 5 |
|
- **Optimizer**: AdamW |
|
- **Scheduler**: Cosine |
|
- **Max Samples**: 100,000 |
|
- **Thinking Mode**: Enabled |
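
For reference, the adapter hyperparameters above correspond roughly to the following PEFT setup. This is a minimal sketch, not the actual training script: the target modules and dropout value are assumed defaults, since only the rank, alpha, and base model are stated here.

```python
# Minimal LoRA setup sketch based on the hyperparameters listed above.
# Assumptions: target_modules and lora_dropout are illustrative defaults,
# not values taken from the actual training run.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Small-24B-Instruct-2501",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,                    # LoRA rank (from the list above)
    lora_alpha=16,          # LoRA alpha (from the list above)
    lora_dropout=0.05,      # assumed, not documented above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```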
|
|
|
## Training Loss
|
|
|
The training process shows excellent convergence with consistent loss reduction across epochs: |
|
|
|
 |
|
|
|
*Training loss curve showing stable convergence during the fine-tuning process with OpenThoughts-114k dataset.* |
|
|
|
## Usage
|
|
|
### Quick Start |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "RekklesAI/LogicFlow-Mistral-Small-24B-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example prompt asking for step-by-step reasoning
prompt = "Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 1?"

# Move the inputs to the model's device before generating
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
|
|
|
### Chat Template |
|
|
|
```python
messages = [
    {"role": "user", "content": "Explain how to solve a quadratic equation using the quadratic formula."}
]

# Apply the chat template expected by the instruct model
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
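
### Quantized Loading (Optional)

If a full bfloat16 load is too large for your GPU, the checkpoint can usually be loaded in 4-bit with bitsandbytes. This is a hedged sketch, not an officially validated configuration for this model; it assumes the `bitsandbytes` package is installed.

```python
# Optional 4-bit quantized loading for memory-constrained GPUs.
# The quantization settings below are common defaults, not values
# validated specifically for this checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "RekklesAI/LogicFlow-Mistral-Small-24B-Reasoning"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Generation then works exactly as in the Quick Start example above.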
|
|
|
## Use Cases
|
|
|
### Mathematical Reasoning |
|
- Solving complex equations step-by-step |
|
- Proof verification and generation |
|
- Statistical analysis and probability |
|
- Calculus and advanced mathematics |
|
|
|
### Scientific Analysis |
|
- Physics problem solving |
|
- Chemistry reaction mechanisms |
|
- Biology concept explanations |
|
- Data interpretation |
|
|
|
### Code Development |
|
- Algorithm design and optimization |
|
- Debugging complex code issues |
|
- Code review and improvement suggestions |
|
- Technical architecture decisions |
|
|
|
### Problem Solving |
|
- Logic puzzles and brain teasers |
|
- Strategic planning scenarios |
|
- Decision analysis frameworks |
|
- Creative problem-solving approaches |
|
|
|
## Performance
|
|
|
The model demonstrates significant improvements in reasoning tasks compared to the base model: |
|
- Enhanced step-by-step problem decomposition |
|
- More accurate mathematical computations |
|
- Better code generation with explanations |
|
- Improved logical consistency across responses |
|
|
|
## Limitations
|
|
|
- The model may occasionally generate verbose explanations |
|
- Performance on extremely specialized domains may vary |
|
- Responses should be verified for critical applications |
|
- May require significant computational resources for inference |
|
|
|
## Training Data
|
|
|
The model was trained on the [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset, which includes: |
|
- **Mathematics**: Algebra, calculus, geometry, statistics |
|
- **Science**: Physics, chemistry, biology concepts |
|
- **Programming**: Algorithms, data structures, debugging |
|
- **Logic**: Puzzles, reasoning challenges, problem-solving |
|
|
|
The dataset contains high-quality synthetic examples with detailed reasoning traces, enabling the model to learn explicit thinking patterns. |
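
To inspect the training data yourself, the dataset can be loaded with the `datasets` library. The snippet below only prints the split summary and the fields of one record, so it does not assume a particular schema.

```python
# Peek at the OpenThoughts-114k dataset without assuming its column layout.
from datasets import load_dataset

ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(ds)            # row count and column names
print(ds[0].keys())  # fields available in a single example
```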
|
|
|
## Model Architecture
|
|
|
```
MistralForCausalLM(
  - Hidden Size: 5,120
  - Intermediate Size: 32,768
  - Number of Layers: 40
  - Attention Heads: 32
  - Key-Value Heads: 8
  - Vocabulary Size: 131,072
  - Max Position Embeddings: 32,768
  - RoPE Theta: 100,000,000
)
```
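
Assuming the repository ships a full merged checkpoint (as the Quick Start example implies), these values can be double-checked directly from the published configuration:

```python
# Read the architecture details from the model configuration.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("RekklesAI/LogicFlow-Mistral-Small-24B-Reasoning")
print(config.hidden_size)              # 5120
print(config.num_hidden_layers)        # 40
print(config.num_attention_heads)      # 32
print(config.num_key_value_heads)      # 8
print(config.max_position_embeddings)  # 32768
```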
|
|
|
## Citation
|
|
|
```bibtex
@misc{logicflowmistralsmall24breasoning,
  title={LogicFlow-Mistral-Small-24B-Reasoning: A Reasoning-Enhanced Large Language Model},
  author={[Your Name]},
  year={2025},
  note={Fine-tuned from Mistral-Small-24B-Instruct-2501 using OpenThoughts-114k dataset}
}
```
|
|
|
## License
|
|
|
This model is released under the Apache 2.0 License, following the base model's licensing terms. |
|
|
|
## Acknowledgments
|
|
|
- **Mistral AI** for the exceptional base model |
|
- **OpenThoughts team** for the high-quality reasoning dataset |
|
- **LLaMA-Factory** for the excellent fine-tuning framework |
|
|
|
--- |
|
|
|
*Built with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)*