You should try training a model with 2B parameters and context length 32000.
#3
by
win10
- opened
You should try training a model with 2B parameters and context length 32000.
Wishing you a happy new year and a successful new year.