Any plans for 32B/70B distilled models?

#83
by NanaBanana22 - opened

Hey. Any plans to distill Qwen3 32B / Llama 70B?

We want this too!

Qwen3 30B A3B

Please, no more distills. They just lag so far behind because they use entirely different architectures (in this case, Qwen3).

I'd rather have a DeepSeek R1 Lite. The same model, with the same training data, just scaled down so it can run on consumer hardware.
