|
--- |
|
license: apache-2.0 |
|
base_model: mistralai/Mistral-Small-24B-Instruct-2501 |
|
tags: |
|
- mistral |
|
- reasoning |
|
- fine-tuned |
|
- synthetic-thinking |
|
- math |
|
- science |
|
- code |
|
- puzzles |
|
- lora |
|
library_name: transformers |
|
pipeline_tag: text-generation |
|
datasets: |
|
- open-thoughts/OpenThoughts-114k |
|
language: |
|
- en |
|
--- |
|
|
|
 |
|
|
|
# LogicFlow-Mistral-Small-24B-Reasoning |
|
|
|
**LogicFlow-Mistral-Small-24B-Reasoning** is a fine-tuned version of [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501) that has been enhanced for advanced reasoning and thinking tasks. This model was trained on the high-quality [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset, which contains 114,000 synthetic reasoning examples covering mathematics, science, coding, and complex puzzles. |
|
|
|
## Model Overview
|
|
|
LogicFlow-Mistral-Small-24B-Reasoning excels at: |
|
- **Step-by-step reasoning** across multiple domains |
|
- **Mathematical problem solving** with detailed explanations |
|
- **Scientific analysis** and conceptual understanding |
|
- **Code generation and debugging** with logical thinking |
|
- **Complex puzzle solving** requiring multi-step reasoning |
|
|
|
The model has been fine-tuned to generate explicit thinking processes, making its reasoning transparent and interpretable. |
|
|
|
## Model Details
|
|
|
- **Base Model**: mistralai/Mistral-Small-24B-Instruct-2501 |
|
- **Parameters**: 24 billion |
|
- **Architecture**: MistralForCausalLM |
|
- **Context Length**: 32,768 tokens |
|
- **Precision**: bfloat16 |
|
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) |
|
- **Dataset**: OpenThoughts-114k (114,000 high-quality reasoning examples) |
|
|
|
## Training Configuration
|
|
|
- **LoRA Rank**: 8 |
|
- **LoRA Alpha**: 16 |
|
- **Learning Rate**: 5e-5 |
|
- **Batch Size**: 2 per device |
|
- **Gradient Accumulation**: 8 steps |
|
- **Training Epochs**: 5 |
|
- **Optimizer**: AdamW |
|
- **Scheduler**: Cosine |
|
- **Max Samples**: 100,000 |
|
- **Thinking Mode**: Enabled |
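
For reference, the adapter hyperparameters above correspond roughly to the following PEFT setup. This is a minimal sketch, not the actual training script: the target modules and dropout value are assumed defaults, since only the rank, alpha, and base model are stated here.

```python
# Minimal LoRA setup sketch based on the hyperparameters listed above.
# Assumptions: target_modules and lora_dropout are illustrative defaults,
# not values taken from the actual training run.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Small-24B-Instruct-2501",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,                    # LoRA rank (from the list above)
    lora_alpha=16,          # LoRA alpha (from the list above)
    lora_dropout=0.05,      # assumed, not documented above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```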
|
|
|
## Training Loss
|
|
|
The training process shows excellent convergence with consistent loss reduction across epochs: |
|
|
|
 |
|
|
|
*Training loss curve showing stable convergence during the fine-tuning process with OpenThoughts-114k dataset.* |
|
|
|
## Usage
|
|
|
### Quick Start |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "RekklesAI/LogicFlow-Mistral-Small-24B-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example prompt asking for step-by-step reasoning
prompt = "Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 1?"

# Move the inputs to the model's device before generating
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
|
|
|
### Chat Template |
|
|
|
```python
messages = [
    {"role": "user", "content": "Explain how to solve a quadratic equation using the quadratic formula."}
]

# Apply the chat template expected by the instruct model
formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
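
### Quantized Loading (Optional)

If a full bfloat16 load is too large for your GPU, the checkpoint can usually be loaded in 4-bit with bitsandbytes. This is a hedged sketch, not an officially validated configuration for this model; it assumes the `bitsandbytes` package is installed.

```python
# Optional 4-bit quantized loading for memory-constrained GPUs.
# The quantization settings below are common defaults, not values
# validated specifically for this checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "RekklesAI/LogicFlow-Mistral-Small-24B-Reasoning"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Generation then works exactly as in the Quick Start example above.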
|
|
|
## Use Cases
|
|
|
### Mathematical Reasoning |
|
- Solving complex equations step-by-step |
|
- Proof verification and generation |
|
- Statistical analysis and probability |
|
- Calculus and advanced mathematics |
|
|
|
### Scientific Analysis |
|
- Physics problem solving |
|
- Chemistry reaction mechanisms |
|
- Biology concept explanations |
|
- Data interpretation |
|
|
|
### Code Development |
|
- Algorithm design and optimization |
|
- Debugging complex code issues |
|
- Code review and improvement suggestions |
|
- Technical architecture decisions |
|
|
|
### Problem Solving |
|
- Logic puzzles and brain teasers |
|
- Strategic planning scenarios |
|
- Decision analysis frameworks |
|
- Creative problem-solving approaches |
|
|
|
## Performance
|
|
|
The model demonstrates significant improvements in reasoning tasks compared to the base model: |
|
- Enhanced step-by-step problem decomposition |
|
- More accurate mathematical computations |
|
- Better code generation with explanations |
|
- Improved logical consistency across responses |
|
|
|
## Limitations
|
|
|
- The model may occasionally generate verbose explanations |
|
- Performance on extremely specialized domains may vary |
|
- Responses should be verified for critical applications |
|
- May require significant computational resources for inference |
|
|
|
## Training Data
|
|
|
The model was trained on the [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset, which includes: |
|
- **Mathematics**: Algebra, calculus, geometry, statistics |
|
- **Science**: Physics, chemistry, biology concepts |
|
- **Programming**: Algorithms, data structures, debugging |
|
- **Logic**: Puzzles, reasoning challenges, problem-solving |
|
|
|
The dataset contains high-quality synthetic examples with detailed reasoning traces, enabling the model to learn explicit thinking patterns. |
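
To inspect the training data yourself, the dataset can be loaded with the `datasets` library. The snippet below only prints the split summary and the fields of one record, so it does not assume a particular schema.

```python
# Peek at the OpenThoughts-114k dataset without assuming its column layout.
from datasets import load_dataset

ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(ds)            # row count and column names
print(ds[0].keys())  # fields available in a single example
```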
|
|
|
## Model Architecture
|
|
|
```
MistralForCausalLM(
  - Hidden Size: 5,120
  - Intermediate Size: 32,768
  - Number of Layers: 40
  - Attention Heads: 32
  - Key-Value Heads: 8
  - Vocabulary Size: 131,072
  - Max Position Embeddings: 32,768
  - RoPE Theta: 100,000,000
)
```
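
Assuming the repository ships a full merged checkpoint (as the Quick Start example implies), these values can be double-checked directly from the published configuration:

```python
# Read the architecture details from the model configuration.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("RekklesAI/LogicFlow-Mistral-Small-24B-Reasoning")
print(config.hidden_size)              # 5120
print(config.num_hidden_layers)        # 40
print(config.num_attention_heads)      # 32
print(config.num_key_value_heads)      # 8
print(config.max_position_embeddings)  # 32768
```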
|
|
|
## Citation
|
|
|
```bibtex
@misc{logicflowmistralsmall24breasoning,
  title={LogicFlow-Mistral-Small-24B-Reasoning: A Reasoning-Enhanced Large Language Model},
  author={[Your Name]},
  year={2025},
  note={Fine-tuned from Mistral-Small-24B-Instruct-2501 using OpenThoughts-114k dataset}
}
```
|
|
|
## License
|
|
|
This model is released under the Apache 2.0 License, following the base model's licensing terms. |
|
|
|
## Acknowledgments
|
|
|
- **Mistral AI** for the exceptional base model |
|
- **OpenThoughts team** for the high-quality reasoning dataset |
|
- **LLaMA-Factory** for the excellent fine-tuning framework |
|
|
|
--- |
|
|
|
*Built with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)*