OPTML-Group/SimNPO-WMDP-zephyr-7b-beta

Tags: Text Generation, machine-unlearning, large-language-models, trustworthy-machine-learning, text-generation-inference

a-F1 committed on Oct 28, 2024

Commit 457765e · verified · 1 Parent(s): fefdba1

Update README.md

Files changed (1)

README.md +44 -3

@@ -1,3 +1,44 @@
----
-license: mit
----

+---
+license: mit
+---
+# Zephyr-7B-beta unlearned using SimNPO on WMDP
+## Model Details
+- **Base Model**: Zephyr-7B-beta
+- **Unlearning**: SimNPO on WMDP-Bio and WMDP-Cyber
+## Unlearning Algorithm
+This model uses the `SimNPO` unlearning algorithm with the following parameters:
+- Learning Rate: `4e-6`
+- beta: `5.5`
+- lambda: `5.0`
+- gamma: `0.0`
+## Loading the Model
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained("OPTML-Group/SimNPO-WMDP-zephyr-7b-beta", attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16, trust_remote_code=True)
+```
+## Citation
+If you use this model in your research, please cite:
+```
+@misc{fan2024simplicityprevailsrethinkingnegative,
+      title={Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning},
+      author={Chongyu Fan and Jiancheng Liu and Licong Lin and Jinghan Jia and Ruiqi Zhang and Song Mei and Sijia Liu},
+      year={2024},
+      eprint={2410.07163},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2410.07163},
+}
+```
+## Contact
+For questions or issues regarding this model, please contact chongyu.fan93@gmail.com.
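
Reviewer note on the hyperparameters listed under "Unlearning Algorithm": the sketch below shows how `beta`, `gamma`, and `lambda` would enter the objective, assuming the forget loss takes the form given in the cited paper, `-(2/beta) * log(sigmoid(-(beta/|y|) * log pi(y|x) - gamma))`. Treating `lambda` as the weight on a retain loss is an assumption, and the function names are hypothetical; this is not the authors' training code.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def simnpo_forget_loss(token_logprobs, beta=5.5, gamma=0.0):
    """Per-sequence forget loss (hypothetical helper).

    token_logprobs: log-probabilities the model assigns to each token of a
    forget-set response. Averaging them gives the length-normalized
    (1/|y|) * log pi(y|x) term.
    """
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    return -(2.0 / beta) * log_sigmoid(-beta * avg_logprob - gamma)


def total_loss(forget_loss, retain_loss, lam=5.0):
    """Assumed combination: forget loss plus lambda times a retain loss."""
    return forget_loss + lam * retain_loss


# A response the model still finds likely is penalized more than an unlikely one.
likely = simnpo_forget_loss([-0.1, -0.1])
unlikely = simnpo_forget_loss([-1.0, -1.0])
print(likely > unlikely)
```

With `gamma = 0.0`, as in this model card, the margin term drops out and only the length-normalized log-probability, scaled by `beta`, drives the loss.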