ai4privacy/llama-ai4privacy-english-anonymiser-openpii

1 day ago

Hi, nice work.
I am wondering which base model was used for the fine-tuning, e.g. Llama-7B? By the way, are you planning to open-source the fine-tuning code?

MikeDoes

Ai4Privacy org 1 day ago

Thanks for pointing it out, @IICurious. I've added the base model: ModernBERT (which really rocks for English). The reason "llama" is in the name is that the model was fine-tuned on synthetic data generated by Llama:
https://huggingface.co/datasets/ai4privacy/open-pii-masking-500k-ai4privacy

Llama's policy states that models fine-tuned this way need to have "llama" in the name (disclaimer: this is not a legal or financial statement).

Hope this clarifies. Sure, I can share the fine-tuning repo with you: check it out on our GitHub account, and feel free to leave a star if it's helpful:
https://github.com/AI4Privacy (in the notebooks repo)

Also, to make the environment more sustainable, please feel free to join our Discord: https://discord.gg/FmzWshaaQT
or, if you have enterprise use cases that we can assist with:
https://forms.gle/oDDYqQkyoTB93otHA / partnerships@ai4privacy.com

MikeDoes changed discussion status to closed 1 day ago
