---
license: apache-2.0
base_model: mistralai/Mistral-Small-24B-Instruct-2501
tags:
- mistral
- reasoning
- fine-tuned
- synthetic-thinking
- math
- science
- code
- puzzles
- lora
library_name: transformers
pipeline_tag: text-generation
datasets:
- open-thoughts/OpenThoughts-114k
language:
- en
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/664589a52d210101d1eac6ad/GeOMgW7RLvZ5PpMY1klCU.png)
# LogicFlow-Mistral-Small-24B-Reasoning
**LogicFlow-Mistral-Small-24B-Reasoning** is a fine-tuned version of [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501) that has been enhanced for advanced reasoning and thinking tasks. This model was trained on the high-quality [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset, which contains 114,000 synthetic reasoning examples covering mathematics, science, coding, and complex puzzles.
## πŸš€ Model Overview
LogicFlow-Mistral-Small-24B-Reasoning excels at:
- **Step-by-step reasoning** across multiple domains
- **Mathematical problem solving** with detailed explanations
- **Scientific analysis** and conceptual understanding
- **Code generation and debugging** with logical thinking
- **Complex puzzle solving** requiring multi-step reasoning
The model has been fine-tuned to generate explicit thinking processes, making its reasoning transparent and interpretable.
## πŸ“Š Model Details
- **Base Model**: mistralai/Mistral-Small-24B-Instruct-2501
- **Parameters**: 24 billion
- **Architecture**: MistralForCausalLM
- **Context Length**: 32,768 tokens
- **Precision**: bfloat16
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: OpenThoughts-114k (114,000 high-quality reasoning examples)
## πŸ”§ Training Configuration
- **LoRA Rank**: 8
- **LoRA Alpha**: 16
- **Learning Rate**: 5e-5
- **Batch Size**: 2 per device
- **Gradient Accumulation**: 8 steps
- **Training Epochs**: 5
- **Optimizer**: AdamW
- **Scheduler**: Cosine
- **Max Samples**: 100,000
- **Thinking Mode**: Enabled
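For reference, the hyperparameters above map roughly onto the following PEFT + Transformers setup. This is a minimal, illustrative sketch rather than the exact LLaMA-Factory configuration used for training; in particular, the `target_modules` list and output directory are assumptions.

```python
# Illustrative sketch of the LoRA fine-tuning setup using PEFT + Transformers.
# The actual run used LLaMA-Factory; target_modules and data handling are assumptions.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Small-24B-Instruct-2501",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,             # LoRA rank
    lora_alpha=16,   # LoRA alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

training_args = TrainingArguments(
    output_dir="logicflow-lora",          # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    bf16=True,
)
```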
## πŸ“Š Training Loss
Training converged steadily, with the loss decreasing consistently across epochs:
![Training Loss](training_loss.png)
*Training loss curve showing stable convergence during the fine-tuning process with OpenThoughts-114k dataset.*
## πŸ’» Usage
### Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "RekklesAI/LogicFlow-Mistral-Small-24B-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example usage
prompt = "Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 1?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### Chat Template
```python
messages = [
    {"role": "user", "content": "Explain how to solve a quadratic equation using the quadratic formula."}
]

# Apply the model's chat template
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
## 🎯 Use Cases
### Mathematical Reasoning
- Solving complex equations step-by-step
- Proof verification and generation
- Statistical analysis and probability
- Calculus and advanced mathematics
### Scientific Analysis
- Physics problem solving
- Chemistry reaction mechanisms
- Biology concept explanations
- Data interpretation
### Code Development
- Algorithm design and optimization
- Debugging complex code issues
- Code review and improvement suggestions
- Technical architecture decisions
### Problem Solving
- Logic puzzles and brain teasers
- Strategic planning scenarios
- Decision analysis frameworks
- Creative problem-solving approaches
## πŸ“ˆ Performance
Compared to the base model, the fine-tuned model shows qualitative improvements in reasoning tasks:
- Enhanced step-by-step problem decomposition
- More accurate mathematical computations
- Better code generation with explanations
- Improved logical consistency across responses
## ⚠️ Limitations
- The model may occasionally generate verbose explanations
- Performance on extremely specialized domains may vary
- Responses should be verified for critical applications
- May require significant computational resources for inference
## πŸ” Training Data
The model was trained on the [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset, which includes:
- **Mathematics**: Algebra, calculus, geometry, statistics
- **Science**: Physics, chemistry, biology concepts
- **Programming**: Algorithms, data structures, debugging
- **Logic**: Puzzles, reasoning challenges, problem-solving
The dataset contains high-quality synthetic examples with detailed reasoning traces, enabling the model to learn explicit thinking patterns.
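To explore the dataset locally, it can be loaded with the 🤗 `datasets` library. This is a small exploratory sketch; check the exact column names against the dataset card before building a preprocessing pipeline.

```python
from datasets import load_dataset

# Load the training split of OpenThoughts-114k from the Hugging Face Hub
ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")

print(ds)     # number of rows and column names
print(ds[0])  # one reasoning example with its thinking trace
```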
## πŸ—οΈ Model Architecture
```
MistralForCausalLM(
- Hidden Size: 5,120
- Intermediate Size: 32,768
- Number of Layers: 40
- Attention Heads: 32
- Key-Value Heads: 8
- Vocabulary Size: 131,072
- Max Position Embeddings: 32,768
- RoPE Theta: 100,000,000
)
```
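These values can be checked directly from the published configuration without downloading the weights, assuming the standard Mistral config attribute names:

```python
from transformers import AutoConfig

# Fetch only the model configuration (no weights) and print the key dimensions
config = AutoConfig.from_pretrained("RekklesAI/LogicFlow-Mistral-Small-24B-Reasoning")
print(config.hidden_size, config.intermediate_size, config.num_hidden_layers)
print(config.num_attention_heads, config.num_key_value_heads, config.vocab_size)
print(config.max_position_embeddings, config.rope_theta)
```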
## πŸ“ Citation
```bibtex
@misc{logicflowmistralsmall24breasoning,
title={LogicFlow-Mistral-Small-24B-Reasoning: A Reasoning-Enhanced Large Language Model},
author={RekklesAI},
year={2025},
note={Fine-tuned from Mistral-Small-24B-Instruct-2501 using OpenThoughts-114k dataset}
}
```
## πŸ“„ License
This model is released under the Apache 2.0 License, following the base model's licensing terms.
## πŸ™ Acknowledgments
- **Mistral AI** for the exceptional base model
- **OpenThoughts team** for the high-quality reasoning dataset
- **LLaMA-Factory** for the excellent fine-tuning framework
---
*Built with ❀️ using [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)*