Preference Leakage: A Contamination Problem in LLM-as-a-judge Paper • 2502.01534 • Published Feb 3 • 41
Lamarck-14B Qwen 2.5 and relatives Collection Lamarck's public releases, plus significant related merges and finetunes • 6 items • Updated Feb 18 • 1
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Dec 11, 2024 • 43