Transformers: bert-base-multilingual-cased - Text bigger than 512

Created on 5 Dec 2018 · 2 comments · Source: huggingface/transformers

Hello,

I am trying to extract features from German text using bert-base-multilingual-cased. However, my text is longer than 512 words.
Is there any way to use the pretrained BERT for text longer than 512 words?

All 2 comments

Hello,

I do not think that it is possible out of the box. The paper states the following:

We use learned positional embeddings with supported sequence lengths up to 512 tokens.

The positional embeddings are therefore limited to 512 tokens. You may be able to add positional embeddings for positions beyond 512 and learn them on your specific dataset, but I don't know how effective that would be.
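
As a quick illustration, the limit is visible directly in the checkpoint itself. This is a minimal sketch assuming the current transformers API (the original thread predates it and used pytorch-pretrained-bert, but the underlying weights are the same):

```python
# Inspect the 512-token limit of bert-base-multilingual-cased.
from transformers import BertConfig, BertModel

config = BertConfig.from_pretrained("bert-base-multilingual-cased")
print(config.max_position_embeddings)  # 512: size of the learned position embedding table

model = BertModel.from_pretrained("bert-base-multilingual-cased")
# The position embedding matrix has exactly 512 rows, so tokens past
# position 511 have no learned position vector to look up.
print(model.embeddings.position_embeddings.weight.shape)  # torch.Size([512, 768])
```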

Hi @agemagician, you cannot really use a pretrained BERT for text longer than 512 tokens per se, but you can use the sliding window approach.

Check this issue on the original BERT repo for more details: https://github.com/google-research/bert/issues/66
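
A rough sketch of that sliding-window idea is below, assuming the current transformers API. The window size and stride are illustrative choices, not values prescribed by the linked issue; 510 leaves room for the [CLS] and [SEP] tokens added to each window.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def extract_features(text, window=510, stride=255):
    # Tokenize the whole document once, without special tokens;
    # [CLS] and [SEP] are added per window, hence window <= 510.
    tokens = tokenizer.tokenize(text)
    features = []
    for start in range(0, len(tokens), stride):
        chunk = tokens[start:start + window]
        ids = tokenizer.convert_tokens_to_ids(["[CLS]"] + chunk + ["[SEP]"])
        input_ids = torch.tensor([ids])
        with torch.no_grad():
            out = model(input_ids)
        # Keep the hidden states of the chunk tokens, dropping [CLS]/[SEP].
        features.append(out.last_hidden_state[0, 1:-1])
        if start + window >= len(tokens):
            break
    # Windows overlap, so downstream code must decide how to merge the
    # duplicated positions (e.g. average them or keep the first pass).
    return features
```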
