Allennlp: ELMo Models in Different Languages

Created on 5 Sep 2018 · 15 Comments · Source: allenai/allennlp

Is your feature request related to a problem? Please describe.
I would like to use an ELMo language model in languages other than English, specifically German.

Describe the solution you'd like
Additional pre-trained ELMo models in other languages, e.g. German, so that I don't need to train them myself.

Describe alternatives you've considered
Use the training code to train them myself. Unfortunately this is very expensive.

Also thanks for the great product :) it's a joy using AllenNLP!

All 15 comments

There is a repo for this, https://github.com/HIT-SCIR/ELMoForManyLangs, but some of the embeddings are not accessible.

Thanks for the answer, @KeremZaman! Unfortunately, they don't provide the embeddings in a format that is compatible with AllenNLP. If you manage to get them incorporated into AllenNLP, I would be happy to hear about it!

@fsonntag we're presently discussing whether to invest heavily in extending ELMo to multiple languages next year. Stay tuned!

Thanks a lot for the answer @schmmd

We're pretty settled that we're not going to be investing in doing this ourselves, at this point. As we've said in other issues, if people want to contribute back pre-trained models in other languages, we are happy to host them and say very nice things about the people who contribute them.

I am also interested in training ELMo for other languages, such as Persian, for which I have datasets.

What are the steps? Can you provide a reference or a guide for doing so?
I'm asking because we probably want all the pre-trained models to be in a common format, so that switching between languages would not require much coding work.

@adelra we should have a training module for ELMo in AllenNLP soon, which should make training ELMo for other languages easier. For now, however, you would need to follow the instructions in https://github.com/allenai/bilm-tf.

Are ELMo models for non-English languages currently available?

We have a contributed model for Portuguese. See https://allennlp.org/elmo.
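For anyone wondering what switching languages looks like in code, here is a minimal sketch, assuming you have downloaded an options file and a weight file for the model you want. It uses `Elmo` and `batch_to_ids` from `allennlp.modules.elmo`; the function name `embed_sentences` and the file paths in the usage note are hypothetical placeholders, not real download locations.

```python
from typing import List


def embed_sentences(sentences: List[List[str]], options_file: str, weight_file: str):
    """Return ELMo vectors for a batch of pre-tokenized sentences.

    `options_file` and `weight_file` are whatever pair you downloaded for
    your language; nothing else in this function is language-specific.
    """
    # Imported here so the sketch can be read and imported without AllenNLP.
    from allennlp.modules.elmo import Elmo, batch_to_ids

    # One output representation (a single learned mix of the biLM layers),
    # with dropout disabled because we only want to extract features.
    elmo = Elmo(options_file, weight_file, num_output_representations=1, dropout=0.0)

    # ELMo works on characters, so tokens are converted to character ids.
    character_ids = batch_to_ids(sentences)

    output = elmo(character_ids)
    # Tensor of shape (batch_size, max_sentence_length, embedding_dim).
    return output["elmo_representations"][0]
```

Usage would be something like `embed_sentences([["Olá", "mundo"]], "path/to/options.json", "path/to/weights.hdf5")`, with the two paths pointing at the files for the language you need. Only those two arguments change between languages.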

I need to apply it to Arabic text.

Is there any way to build it myself?

You can train the original LSTM architecture on your own corpus using https://github.com/allenai/bilm-tf

You can train a transformer version using allennlp, see https://github.com/allenai/allennlp/blob/master/tutorials/how_to/training_transformer_elmo.md
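For the bilm-tf route, one preparation step is building a vocabulary file from your tokenized corpus. As I understand the bilm-tf README, the file has one token per line, starts with the special tokens `<S>`, `</S>` and `<UNK>`, and then lists the remaining tokens in descending order of frequency; please check the current README before relying on this. The sketch below is one way to produce such a file. The names `build_vocab`, `write_vocab` and the `min_count` parameter are my own, not part of bilm-tf.

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, List

# Special tokens that come first in the vocabulary file (case sensitive).
SPECIAL_TOKENS = ["<S>", "</S>", "<UNK>"]


def build_vocab(lines: Iterable[str], min_count: int = 1) -> List[str]:
    """Build a vocabulary from whitespace-tokenized sentences, one per line.

    Returns the special tokens followed by corpus tokens sorted by
    descending count (ties broken alphabetically for reproducibility).
    """
    counts = Counter()
    for line in lines:
        counts.update(line.split())

    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    tokens = [
        token
        for token, count in ranked
        if count >= min_count and token not in SPECIAL_TOKENS
    ]
    return SPECIAL_TOKENS + tokens


def write_vocab(vocab: List[str], path: str) -> None:
    """Write the vocabulary to `path`, one token per line, as UTF-8."""
    Path(path).write_text("\n".join(vocab) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Tiny German example; a real corpus would be read from your data files.
    corpus = ["das ist ein Satz .", "das ist noch ein Satz ."]
    for token in build_vocab(corpus):
        print(token)
```

The training text itself should already be tokenized with one sentence per line, so the same tokenization is used for both the vocabulary and the training data.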

People looking for more languages can find some here:
https://github.com/TalSchuster/CrossLingualELMo

Thank you so much, Tal. Your Cross Lingual ELMo paper (and code) is awesome. Congrats. Love it 💯

Thank you, Matthew! Your code is very well written, so it was convenient to extend.
