Introduction
This model is a fine-tuned version of unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit, trained with TRL.
It is fine-tuned to detect vulnerabilities in code using Chain-of-Thought reasoning.
Dataset Used: Mackerel2/cybernative_code_vulnerability_cot
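The dataset can be pulled with the datasets library for a quick look. This is a minimal sketch; the split name and column layout are assumptions, so check the dataset card for the exact schema.
```python
from datasets import load_dataset

# Assumed: the dataset exposes a "train" split; column names may differ from what is printed here.
ds = load_dataset("Mackerel2/cybernative_code_vulnerability_cot", split="train")
print(ds)     # number of rows and column names
print(ds[0])  # one raw example
```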
Use Cases
- Use for code vulnerability analysis
- Use for general code-related question answering (use the model without the chat template; a short sketch follows the main example below)
Use the model with the chat template below to get a Chain-of-Thought response:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import PeftModel
# Define model IDs
base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
finetuned_model_id = "navodPeiris/Vulnerability-Analyst-Qwen2.5-1.5B-Instruct"
# Load tokenizer (trust remote code for Qwen models)
tokenizer = AutoTokenizer.from_pretrained(finetuned_model_id, trust_remote_code=True)
# Load base model
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)
# Apply LoRA weights
model = PeftModel.from_pretrained(model, finetuned_model_id)
# (Optional) Merge LoRA adapters for faster inference
model = model.merge_and_unload()
# Prompt construction
system_prompt = (
    "You are an expert coder with a strong code vulnerability detection and reasoning ability. "
    "You first think through the reasoning process step-by-step in your mind and then provide the user with the answer."
)
user_prompt = (
    "Below is a question that describes a coding related problem. Write a response that appropriately answers the question. "
    "Show your reasoning in <think> </think> tags. And return the final response in <answer> </answer> tags.\n"
    "###Question###:\n{question}\n"
    "###Response###:\n<think>"
)
# Example question
question = """Find vulnerabilities in the following PHP code:
```php
<?php
$db = new PDO('mysql:host=localhost;dbname=test', $user, $pass);
$username = $_GET['username'];
$password = $_GET['password'];
$sql = "SELECT * FROM users WHERE username = '$username' AND password = '$password'";
foreach ($db->query($sql) as $row) {
    print_r($row);
}
?>
```"""
# Apply tokenizer's chat template
prompt = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt.format(question=question)},
    ],
    tokenize=False,
    add_generation_prompt=True,
)
# Run inference using transformers pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map="auto")
output = pipe(prompt, max_new_tokens=1024, return_full_text=False)[0]["generated_text"]
print("<think>\n" + output)
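For general code-related question answering, the model can also be prompted directly, without applying the Chain-of-Thought chat template. The snippet below is a minimal sketch that reuses the `pipe` object from the example above; the question is only illustrative.
```python
# Plain prompt without the chat template: the model answers directly,
# without the <think>/<answer> structure used for vulnerability analysis.
plain_prompt = "Explain what a prepared statement is in PHP PDO and why it helps prevent SQL injection."
plain_output = pipe(plain_prompt, max_new_tokens=256, return_full_text=False)[0]["generated_text"]
print(plain_output)
```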
Training procedure
This model was trained with the SFTTrainer from the trl library. I leveraged Unsloth's FastLanguageModel with 4-bit quantization and Unsloth gradient checkpointing to fit training on consumer GPUs. Prompts are designed so that reasoning is enclosed in <think>...</think> and the final answer in <answer>...</answer>, which guides the model to reason step-by-step before answering. Training used SFTTrainer from Hugging Face TRL with PEFT/LoRA, an 8-bit AdamW optimizer, and cosine learning-rate scheduling; evaluation runs every 50 steps. The hyperparameters are listed below, followed by an illustrative sketch of the corresponding trainer setup.
| Parameter | Value |
|---|---|
| per_device_train_batch_size | 8 |
| gradient_accumulation_steps | 4 |
| per_device_eval_batch_size | 16 |
| logging_steps | 50 |
| eval_steps | 50 |
| num_train_epochs | 2 |
| warmup_ratio | 0.03 |
| learning_rate | 3e-5 |
| fp16 | True |
| optim | adamw_8bit |
| weight_decay | 0.1 |
| lr_scheduler_type | cosine |
| dataset_text_field | prompt |
| max_seq_length | 1024 |
| lora_rank (r) | 16 |
| target_modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| lora_alpha | 32 |
| use_gradient_checkpointing | unsloth |
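The hyperparameters above map roughly onto the following Unsloth + TRL setup. This is an illustrative sketch rather than the exact training script: the dataset split handling, `eval_strategy`, and `output_dir` are assumptions, the step that builds the formatted `prompt` column is omitted, and `max_seq_length` may be named `max_length` in newer TRL releases.
```python
from datasets import load_dataset
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Base model in 4-bit via Unsloth (values taken from the table above).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit",
    max_seq_length=1024,
    load_in_4bit=True,
)

# Attach LoRA adapters with Unsloth's gradient checkpointing.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)

# Assumed: the data already contains a "prompt" column holding the full
# <think>/<answer> formatted text; the train/eval split here is illustrative.
dataset = load_dataset("Mackerel2/cybernative_code_vulnerability_cot", split="train")
dataset = dataset.train_test_split(test_size=0.05, seed=42)

config = SFTConfig(
    output_dir="outputs",            # assumption
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    per_device_eval_batch_size=16,
    logging_steps=50,
    eval_strategy="steps",           # assumption: enables evaluation every eval_steps
    eval_steps=50,
    num_train_epochs=2,
    warmup_ratio=0.03,
    learning_rate=3e-5,
    fp16=True,
    optim="adamw_8bit",
    weight_decay=0.1,
    lr_scheduler_type="cosine",
    dataset_text_field="prompt",
    max_seq_length=1024,             # renamed to max_length in newer TRL versions
)

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    args=config,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```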
Framework versions
- TRL: 0.18.1
- Transformers: 4.52.4
- PyTorch: 2.6.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citations
Cite TRL as:
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}