---
base_model: unsloth/Qwen3-0.6B
library_name: peft
license: mit
datasets:
- unsloth/OpenMathReasoning-mini
- mlabonne/FineTome-100k
language:
- en
pipeline_tag: question-answering
tags:
- Math
---

# Model Card for Qwen3-0.6B-OpenMathReason

## Model Description

This model is a fine-tuned version of Qwen/Qwen3-0.6B, trained with the Unsloth library and LoRA for parameter-efficient fine-tuning (an illustrative setup sketch follows the dataset list below).
It was trained on two datasets:
- unsloth/OpenMathReasoning-mini — to enhance mathematical reasoning skills.
- mlabonne/FineTome-100k — to improve general conversational abilities.
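
The exact LoRA configuration is not documented in this card, so the snippet below is only an illustrative sketch of how a comparable adapter could be attached with Unsloth; the rank, alpha, dropout, and target modules are assumptions, not the values used for this model.

```python
from unsloth import FastLanguageModel

# Illustrative only: the hyperparameters below are assumptions, not the actual
# configuration used to train Qwen3-0.6B-OpenMathReason.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-0.6B",
    max_seq_length=2048,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                      # assumed LoRA rank
    lora_alpha=16,             # assumed scaling factor
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```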

### Model Details

- **Developed by:** Rustam Shiriyev
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** unsloth/Qwen3-0.6B 


## Uses

### Direct Use

This model can be used as a lightweight assistant for solving basic to intermediate math problems, in the style of the OpenMathReasoning-mini tasks it was trained on.

### Downstream Use

- Can be integrated into educational chatbots for STEM learning.

### Out-of-Scope Use

- Not suitable for high-stakes decision-making.

## Bias, Risks, and Limitations

- Mathematical reasoning is limited to the scope of the OpenMathReasoning-mini dataset.
- Conversational quality may degrade with complex or multi-turn inputs.


## How to Get Started with the Model

```python 
from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
from peft import PeftModel

# Authenticate with the Hugging Face Hub (insert your access token)
login(token="")

# Load the tokenizer and the base model
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-0.6B")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3-0.6B",
    device_map={"": 0},
)

# Attach the fine-tuned LoRA adapter
model = PeftModel.from_pretrained(base_model, "Rustamshry/Qwen3-0.6B-OpenMathReason")

question = "Solve (x + 2)^2 = 0"

messages = [
    {"role": "user", "content": question},
]

# Build the chat prompt; enable_thinking turns on Qwen3's reasoning mode
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

# Generate and stream the answer
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to(model.device),
    max_new_tokens=2048,
    temperature=0.6, top_p=0.95, top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
```
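
To serve the model without loading the adapter separately, the LoRA weights can optionally be merged into the base model. The sketch below uses PEFT's `merge_and_unload`; the output directory name is only illustrative.

```python
# Optional: merge the adapter into the base weights for standalone deployment.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("Qwen3-0.6B-OpenMathReason-merged")  # illustrative path
tokenizer.save_pretrained("Qwen3-0.6B-OpenMathReason-merged")
```
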
## Training Details

### Training Data

- unsloth/OpenMathReasoning-mini: 10k+ instruction-following examples focused on math.
- mlabonne/FineTome-100k: 100k examples of diverse, high-quality chat data.
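
As a rough, illustrative sketch, the two datasets can be loaded from the Hub as shown below; the split names and column handling are assumptions, and the actual preprocessing used for this model is not documented here.

```python
from datasets import load_dataset

# Split names are assumptions; check each dataset card for the exact splits.
math_ds = load_dataset("unsloth/OpenMathReasoning-mini", split="cot")
chat_ds = load_dataset("mlabonne/FineTome-100k", split="train")

# The two datasets use different schemas, so each must be mapped to a common chat
# format (e.g. a list of {"role", "content"} messages) before being combined and
# passed to the trainer. The column names below are assumptions.
def math_to_messages(example):
    return {
        "messages": [
            {"role": "user", "content": example["problem"]},
            {"role": "assistant", "content": example["generated_solution"]},
        ]
    }

math_chat = math_ds.map(math_to_messages)
```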

### Training Procedure

The model was fine-tuned with the following hyperparameters (a configuration sketch follows this list):

- batch size: 8
- gradient accumulation steps: 2
- optimizer: adamw_torch
- learning rate: 2e-5
- warmup steps: 100
- fp16: True
- dataloader_num_workers: 16
- num_train_epochs: 1
- weight_decay: 0.01
- lr_scheduler_type: linear
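
These settings map roughly onto a TRL `SFTConfig` as sketched below; the trainer class, output directory, and any options not listed above are assumptions rather than the documented training setup.

```python
from trl import SFTConfig

# Sketch only: the values come from the hyperparameter list above; everything else
# (output_dir, logging_steps, the choice of SFTConfig itself) is an assumption.
training_args = SFTConfig(
    output_dir="outputs",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    optim="adamw_torch",
    learning_rate=2e-5,
    warmup_steps=100,
    fp16=True,
    dataloader_num_workers=16,
    num_train_epochs=1,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    logging_steps=10,
)
```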



### Results

- Final training loss: ≈ 0.56

### Framework versions

- PEFT 0.14.0