---
base_model: unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit
library_name: transformers
model_name: Vulnerability-Analyst-Qwen2.5-1.5B-Instruct
tags:
- question-answering
- chat
- text-generation
- unsloth
- trl
- sft
license: mit
datasets:
- Mackerel2/cybernative_code_vulnerability_cot
---

# Introduction

This model is a fine-tuned version of [unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit](https://huggingface.co/unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit), trained with [TRL](https://github.com/huggingface/trl). It is fine-tuned to detect vulnerabilities in code using the Chain-of-Thought method.

Dataset used: [Mackerel2/cybernative_code_vulnerability_cot](https://huggingface.co/datasets/Mackerel2/cybernative_code_vulnerability_cot)

## Use Cases

- Code vulnerability analysis
- General code-related question answering (use without the chat template below)

## Use the model with the chat template for a Chain-of-Thought response

````python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import PeftModel

# Define model IDs
base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
finetuned_model_id = "navodPeiris/Vulnerability-Analyst-Qwen2.5-1.5B-Instruct"

# Load tokenizer (trust remote code for Qwen models)
tokenizer = AutoTokenizer.from_pretrained(finetuned_model_id, trust_remote_code=True)

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)

# Apply LoRA weights
model = PeftModel.from_pretrained(model, finetuned_model_id)

# (Optional) Merge LoRA adapters into the base weights for faster inference
model = model.merge_and_unload()

# Prompt construction
system_prompt = (
    "You are an expert coder with a strong code vulnerability detection and reasoning ability. "
    "You first think through the reasoning process step-by-step in your mind and then provide the user with the answer."
)

user_prompt = (
    "Below is a question that describes a coding related problem. Write a response that appropriately answers the question. "
    "Show your reasoning in tags. And return the final response in tags.\n"
    "###Question###:\n{question}\n"
    "###Response###:\n"
)

# Example question (illustrative PHP snippet; replace it with the code you want analyzed)
question = """Find vulnerabilities in the following PHP code:
```php
<?php
$id = $_GET['id'];
$sql = "SELECT * FROM users WHERE id = " . $id;
foreach ($db->query($sql) as $row) {
    print_r($row);
}
?>
```"""

# Apply the tokenizer's chat template
prompt = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt.format(question=question)},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

# Run inference using the transformers pipeline (the model is already placed on devices)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
output = pipe(prompt, max_new_tokens=1024, return_full_text=False)[0]["generated_text"]

print("\n" + output)
````

## Training procedure

This model was trained with the `SFTTrainer` from the TRL library, using PEFT/LoRA for efficient fine-tuning. I leveraged Unsloth's `FastLanguageModel` with 4-bit quantization and Unsloth gradient checkpointing so that training fits on consumer GPUs. I designed prompts in which the reasoning is enclosed in one pair of tags and the final answer in another, which guides the model to reason step-by-step before answering. Training used a LoRA adapter, an 8-bit AdamW optimizer, and cosine learning-rate scheduling, with evaluation every 50 steps.
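Below is a minimal sketch of that setup (Unsloth's `FastLanguageModel` plus TRL's `SFTTrainer`), assembled from the description and the hyperparameter table on this card. It is not the exact training script: the train/eval split, seed, output directory, and prompt-formatting step are assumptions.

```python
# Minimal training sketch (assumptions: dataset split handling, seed, output_dir)
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the 4-bit Unsloth base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit",
    max_seq_length=1024,
    load_in_4bit=True,
)

# Attach LoRA adapters with the settings from the hyperparameter table
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)

# The exact eval split is not documented on this card; a 90/10 split is assumed here.
# Formatting each example into the "prompt" column (with the reasoning/answer tags
# described above) is omitted from this sketch.
dataset = load_dataset("Mackerel2/cybernative_code_vulnerability_cot", split="train")
dataset = dataset.train_test_split(test_size=0.1, seed=42)

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    args=SFTConfig(
        output_dir="outputs",
        dataset_text_field="prompt",
        max_seq_length=1024,
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,
        per_device_eval_batch_size=16,
        num_train_epochs=2,
        learning_rate=3e-5,
        warmup_ratio=0.03,
        weight_decay=0.1,
        lr_scheduler_type="cosine",
        optim="adamw_8bit",
        fp16=True,
        logging_steps=50,
        eval_strategy="steps",
        eval_steps=50,
    ),
)
trainer.train()
```

The full set of hyperparameters is listed below.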
| Parameter | Value |
|----------------------------|-------------------------------------|
| `per_device_train_batch_size` | 8 |
| `gradient_accumulation_steps` | 4 |
| `per_device_eval_batch_size` | 16 |
| `logging_steps` | 50 |
| `eval_steps` | 50 |
| `num_train_epochs` | 2 |
| `warmup_ratio` | 0.03 |
| `learning_rate` | 3e-5 |
| `fp16` | True |
| `optim` | adamw_8bit |
| `weight_decay` | 0.1 |
| `lr_scheduler_type` | cosine |
| `dataset_text_field` | prompt |
| `max_seq_length` | 1024 |
| `lora_rank (r)` | 16 |
| `target_modules` | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| `lora_alpha` | 32 |
| `use_gradient_checkpointing` | unsloth |

### Framework versions

- TRL: 0.18.1
- Transformers: 4.52.4
- Pytorch: 2.6.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1

## Citations

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```