Hi @aconneau we haven't done experiments on Spanish CoNLL-02, but would be happy to post numbers if someone in the community does the experiments!
@aconneau I'll add the results to my experiments repo in the next few days.
I already ran some experiments with the recently added NER fine-tuning code in 🤗 Transformers; see them here for comparison :)
@aconneau Results without hyper-parameter tuning, using the default parameters (the same ones as in the English CoNLL training example):
| Embeddings | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg. |
| -- | -- | -- | -- | -- | -- | -- |
| Flair Embeddings (Test) | 85.1 | 86.21 | 86.44 | 86.41 | 85.92 | 86.016 |
| Word Embeddings + Flair Embeddings (Test) | 86.8 | 87.52 | 87.63 | 87.29 | 87.7 | 87.388 |
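The "Avg." column is the plain mean over the five runs; a quick sanity check (numbers taken from the table above):

```python
# Verify the "Avg." column: each value is the unweighted mean of the five runs.
flair_runs = [85.1, 86.21, 86.44, 86.41, 85.92]
combined_runs = [86.8, 87.52, 87.63, 87.29, 87.7]

flair_avg = sum(flair_runs) / len(flair_runs)
combined_avg = sum(combined_runs) / len(combined_runs)

print(round(flair_avg, 3))     # 86.016
print(round(combined_avg, 3))  # 87.388
```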
Btw: I just found your "Unsupervised Cross-lingual Representation Learning at Scale" paper. Really great work, and I'm excited to integrate XLM-R into Flair!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.