Please consider implementing https://arxiv.org/pdf/1909.11556.pdf with a README.md in the examples to reproduce the results. @huihuifan
We are actively working on this and will release code and commands for reproduction very soon, then open source models soon afterwards. Thanks for your interest!
According to this, I ran --model-overrides "{'decoder_layers_to_keep':'0,2,4,6'}" for a transformer-base model while evaluating, but ran into the error below.
RuntimeError: Error(s) in loading state_dict for TransformerModel:
Missing key(s) in state_dict:
Also tried resuming training from the layer-pruned model; the same error still occurs @huihuifan .
Thanks.
hi @gvskalyan, does the error message say which keys in the state dict are missing?
Sorry, my mistake: I had trained transformer-base but was overriding only the decoder layers, not the encoder layers. Also, it works only when the same layers are kept on both sides. Thanks.
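For anyone hitting the same "Missing key(s)" error: pruning with layers_to_keep has to drop the unwanted layer weights and renumber the surviving ones densely so the smaller model's state dict lines up. The sketch below is a simplified, hypothetical stand-in for that key filtering (prune_layers is not fairseq's actual function), just to illustrate why pruning only one side leaves the other side's keys mismatched:

```python
import re

def prune_layers(state_dict, layers_to_keep, prefix="decoder"):
    """Keep only the listed layers under `prefix`, renumbering them densely.

    Hypothetical sketch of LayerDrop-style pruning, not fairseq's real code.
    `layers_to_keep` is a comma-separated string like "0,2,4,6".
    """
    keep = sorted(int(i) for i in layers_to_keep.split(","))
    remap = {old: new for new, old in enumerate(keep)}  # e.g. 0->0, 2->1, 4->2
    pattern = re.compile(rf"^{prefix}\.layers\.(\d+)\.")
    pruned = {}
    for key, value in state_dict.items():
        m = pattern.match(key)
        if m is None:
            pruned[key] = value  # not a layer weight on this side; untouched
            continue
        layer = int(m.group(1))
        if layer in remap:  # keep and renumber; dropped layers are skipped
            new_key = pattern.sub(f"{prefix}.layers.{remap[layer]}.", key)
            pruned[new_key] = value
    return pruned

# Toy 6-layer decoder state dict (ints stand in for weight tensors).
sd = {f"decoder.layers.{i}.weight": i for i in range(6)}
sd["encoder.layers.0.weight"] = "enc"

pruned = prune_layers(sd, "0,2,4")
# Decoder layers 0, 2, 4 survive as layers 0, 1, 2; encoder keys are untouched,
# which is why the encoder side must be pruned (or left full-depth) consistently.
```

Note that the encoder keys pass through unchanged here, so if only the decoder override is given, the encoder side of the checkpoint must already match the model being built, or loading fails with missing/unexpected keys.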
Thanks for raising this error; I will look into it on the current master branch and get back to you.