There are some important advantages
to taking the more difficult road in the beginning:
- at a later stage, when comparing the original model to the Hugging Face implementation, you can automatically verify, component by component, that each part of the 🤗 Transformers implementation matches its counterpart in the original model, instead of relying on visual comparison via print statements
- it lets you decompose the big problem of porting a model into smaller problems of porting individual components, and thus structure your work better
- separating the model into logical, meaningful components helps you get a better overview of the model's design, and thus understand the model better
- at a later stage, those component-by-component tests help you ensure that no regression occurs as you continue changing your code
Lysandre's integration checks for ELECTRA
give a nice example of how this can be done.
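
To make the idea of component-by-component verification concrete, here is a minimal, framework-free sketch. It is not the ELECTRA checks mentioned above: the component names (`scale`, `shift`), the helper `compare_components`, and the tolerance of `1e-3` are all hypothetical and chosen only for illustration. In a real port the components would be model submodules and the comparison would more likely use a tensor routine such as `torch.allclose`.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equally long vectors."""
    if len(a) != len(b):
        raise ValueError("outputs have different lengths")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Hypothetical stand-ins for components of the original model.
def original_scale(x):
    return [2.0 * v for v in x]


def original_shift(x):
    return [v + 1.0 for v in x]


# Hypothetical stand-ins for the matching components of the port.
def ported_scale(x):
    return [v + v for v in x]


def ported_shift(x):
    return [1.0 + v for v in x]


ORIGINAL = {"scale": original_scale, "shift": original_shift}
PORTED = {"scale": ported_scale, "shift": ported_shift}


def compare_components(original, ported, sample, atol=1e-3):
    """Feed the same input to each pair of components.

    Returns a report {name: max abs diff} and the list of component
    names whose difference exceeds the (assumed) tolerance `atol`.
    """
    report = {}
    for name, original_fn in original.items():
        report[name] = max_abs_diff(original_fn(sample), ported[name](sample))
    mismatched = [name for name, diff in report.items() if diff > atol]
    return report, mismatched


report, mismatched = compare_components(ORIGINAL, PORTED, [0.5, -1.25, 3.0])
```

Because each component is checked in isolation, a mismatch points directly at the component that diverged, and the same check can be rerun after every change to catch regressions.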