---
inference: false
datasets:
- bigcode/commitpackft
model-index:
- name: patched-coder-34b
  results:
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: 53.567
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Python
    metrics:
    - name: pass@1
      type: pass@1
      value: 41.341
      verified: false
  - task:
      type: text-generation
    dataset:
      type: patched-codes/static-analysis-eval
      name: Static Analysis Eval
    metrics:
    - name: pass@1
      type: pass@1
      value: 51.316
      verified: false
---

# Model Card for patched-coder-34b

This is an instruction fine-tuned model focused on the task of patching code. Patching may include fixing bugs, remediating security vulnerabilities, doing API migrations, and other kinds of code maintenance.

## Model Details

### Model Description

- **Developed by:** [codelion](https://huggingface.co/codelion)
- **Model type:** Code Llama
- **Finetuned from model:** [CodeLlama-34b-Python](https://huggingface.co/codellama/CodeLlama-34b-Python-hf)

## How to Get Started with the Model

Make sure to install Transformers from the main git branch:

```bash
pip install git+https://github.com/huggingface/transformers.git
```

## How to Prompt the Model

This model accepts the Alpaca instruction format. For example:

```
### Instruction:
{instruction}

### Input:
{input}

### Response:
...
```

A minimal end-to-end inference sketch is included under Example Usage at the end of this card.

## Bias, Risks, and Limitations

This model has undergone very limited testing. Additional safety testing should be performed before any real-world deployments.

## Training Details

- **GPU:** A100 80 GB
- **Time:** ~8 hrs

### Training Data

The model was fine-tuned on [commitpackft](https://huggingface.co/datasets/bigcode/commitpackft), an open dataset consisting of commits. We started with the commits for the `python` language from the dataset and then filtered for the commits that were related to fixing bugs.

### Training Procedure

Instruction fine-tuning to follow instructions in natural language related to code. We load the quantized base model in 4 bits and then use QLoRA for Parameter-Efficient Fine-Tuning (PEFT) with Flash Attention. The model was trained for 2 epochs.

#### Training Hyperparameters

**Training regime:** The following `bitsandbytes` quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16

A sketch of how these settings map to a `BitsAndBytesConfig` in `transformers` is also included at the end of this card.

## Evaluation

We evaluate the model on the `HumanEval` and `HumanEvalPack` benchmarks using the [Code Generation LM Evaluation Harness](https://github.com/bigcode-project/bigcode-evaluation-harness). We also evaluate the model for vulnerability remediation using the `Static Analysis Eval` benchmark available [here](https://huggingface.co/datasets/patched-codes/static-analysis-eval).

### Results
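
## Example Usage

A minimal inference sketch using `transformers` and the Alpaca prompt format described above. The repo id, instruction, input, and generation settings here are illustrative assumptions, not taken from the original card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for illustration -- replace with the actual model path.
model_id = "patched-codes/patched-coder-34b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption, chosen to match the bf16 compute dtype used in training
    device_map="auto",
)

# Build a prompt in the Alpaca format described above.
instruction = "Fix the bug in the following function."  # hypothetical instruction
code_input = "def add(a, b):\n    return a - b"         # hypothetical input
prompt = f"### Instruction:\n{instruction}\n\n### Input:\n{code_input}\n\n### Response:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Print only the newly generated tokens (the patched code).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```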
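
For reference, the `bitsandbytes` settings listed under Training Hyperparameters map to the following `transformers` quantization config. This is a sketch of that mapping only, not an excerpt from the actual training scripts:

```python
import torch
from transformers import BitsAndBytesConfig

# Reproduces the quantization settings listed in this card; the llm_int8_* values
# shown above are the library defaults.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
# Pass as `quantization_config=bnb_config` to AutoModelForCausalLM.from_pretrained
# when loading the base model for QLoRA fine-tuning.
```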