Guide on logit distillation?
#9, opened by WyattTheSkid
Hey, I know it's a long shot, but I am incredibly fascinated by the concept of logit distillation from larger models into smaller ones. If one of you would be willing to either contact me privately or write up a detailed guide on the process, that would be greatly appreciated. I know some inference providers on OpenRouter provide access to logprobs and top_logprobs, but I am unsure of the difference between the two, which one I need, and all of the finer details. Thank you for your time, this is a very good model!!! (especially for its size)