There are some important advantages to taking the more difficult road in the beginning:
- at a later stage, when comparing the original model to the Hugging Face implementation, you can verify automatically
  for each component individually that the corresponding component of the 🤗 Transformers implementation matches, instead
  of relying on visual comparison via print statements
- it lets you decompose the big problem of porting a model into smaller problems of porting individual components,
  and thus structure your work better
- separating the model into logically meaningful components helps you get a better overview of the model's design
  and thus understand the model better
- at a later stage, those component-by-component tests help you ensure that no regression occurs as you continue
  changing your code
Lysandre's integration checks for ELECTRA
give a nice example of how this can be done.
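
As a rough illustration of the component-by-component checks described above, the following minimal sketch compares
named components of two implementations on the same input. It is not taken from the ELECTRA checks. The helper names
(`outputs_match`, `compare_components`), the toy components, and the `1e-3` tolerance are all assumptions made for
this example. It uses only the Python standard library, so plain lists stand in for tensors.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]
Component = Callable[[Vector], Vector]


def outputs_match(expected: Vector, actual: Vector, atol: float = 1e-3) -> bool:
    """Return True if both outputs have the same length and agree within `atol`."""
    if len(expected) != len(actual):
        return False
    return all(
        math.isclose(e, a, rel_tol=0.0, abs_tol=atol)
        for e, a in zip(expected, actual)
    )


def compare_components(
    original: Dict[str, Component],
    ported: Dict[str, Component],
    sample_input: Vector,
    atol: float = 1e-3,
) -> Dict[str, bool]:
    """Run each named component of both implementations on the same input
    and record whether the outputs agree."""
    report = {}
    for name, original_fn in original.items():
        ported_fn = ported[name]
        report[name] = outputs_match(
            original_fn(sample_input), ported_fn(sample_input), atol
        )
    return report


# Hypothetical stand-ins for real model components (assumption for this sketch).
original_model = {
    "embeddings": lambda x: [2.0 * v for v in x],
    "layer_norm": lambda x: [v - sum(x) / len(x) for v in x],
}
ported_model = {
    # Equivalent reimplementation: should match.
    "embeddings": lambda x: [v + v for v in x],
    # Deliberately buggy port: adds a spurious offset, should be flagged.
    "layer_norm": lambda x: [v - sum(x) / len(x) + 0.5 for v in x],
}

report = compare_components(original_model, ported_model, [1.0, 2.0, 3.0])
for name, ok in report.items():
    print(f"{name}: {'match' if ok else 'MISMATCH'}")
```

With real models, the same idea would typically be applied to tensors, for example with `torch.allclose`, feeding
each ported component the intermediate output recorded from the original model. Keeping such checks as automated
tests is what lets them catch regressions later on.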