I find that if my context is too long, I get an error. Does BERT have a maximum context length?
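I believe you're looking for
https://github.com/google-research/bert/blob/60454702590a6c69bd45c5d4258c7e17b8a3e1da/modeling.py#L42
and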
max_seq_length: The released models were trained with sequence lengths up to 512, but you can fine-tune with a shorter max sequence length to save substantial memory. This is controlled by the max_seq_length flag in our example code.
I.e., 512 is max.
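In practice that means inputs have to be cut down before they reach the model. Here is a minimal sketch of that truncation, assuming a single-sentence input that is already tokenized. `truncate_tokens` and `MAX_POSITION_EMBEDDINGS` are names made up for this example, not part of the BERT repo.

```python
# Hypothetical constant: the position limit of the released models.
MAX_POSITION_EMBEDDINGS = 512


def truncate_tokens(tokens, max_seq_length=128):
    """Cut a token list so that it fits in max_seq_length with [CLS] and [SEP]."""
    if max_seq_length > MAX_POSITION_EMBEDDINGS:
        raise ValueError(
            "max_seq_length %d exceeds the %d positions the model was trained with"
            % (max_seq_length, MAX_POSITION_EMBEDDINGS)
        )
    # Reserve two positions for the [CLS] and [SEP] markers.
    body = tokens[: max_seq_length - 2]
    return ["[CLS]"] + body + ["[SEP]"]


tokens = ["tok"] * 1000  # stand-in for a long tokenized document
print(len(truncate_tokens(tokens, max_seq_length=512)))  # 512
```

Anything past the cut is simply dropped, so this only helps when the start of the text carries what you need.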
I have also run into an error when the input is longer than 512 tokens.
INFO:tensorflow:Error recorded from training_loop: Paddings must be non-negative for 'gradients/bert/embeddings/Slice_grad/Pad' (op: 'Pad') with input shapes: [1024,768], [2,2] and with computed input tensors: input[1] = <[0 -512][0 0]>.
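The negative padding in that log is consistent with slicing 1024 positions out of a position table that only has 512 rows: the gradient of a slice is padded back to the shape of the sliced input, and the trailing pad works out as input size minus begin minus slice size. A rough sketch of that arithmetic in plain Python follows. `slice_grad_paddings` is a made-up helper, not a TensorFlow function, and the sequence length of 1024 is read off the shapes in the log.

```python
def slice_grad_paddings(input_shape, begin, size):
    """Paddings that restore a slice's gradient to the shape of the sliced input."""
    paddings = []
    for dim, start, length in zip(input_shape, begin, size):
        if length == -1:  # -1 means "everything from start to the end"
            length = dim - start
        paddings.append([start, dim - start - length])
    return paddings


# Position table of the released base model: 512 positions x 768 hidden units.
print(slice_grad_paddings([512, 768], [0, 0], [512, -1]))   # [[0, 0], [0, 0]]
print(slice_grad_paddings([512, 768], [0, 0], [1024, -1]))  # [[0, -512], [0, 0]]
```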
Since this value is loaded from bert_config.json, does it mean that I have to retrain the BERT model in order to support a larger max_position_embeddings? Does having a larger max_position_embeddings lower the performance of BERT?
does it mean that I have to retrain the BERT model in order to support a larger max_position_embeddings?
Yes, you have to train the model so that it learns the new position embeddings.
Does having a larger max_position_embeddings lower the performance of BERT?
No, but it will take longer to train and may be harder. You can also try adding new position embeddings while reusing the existing weights, but it won't be easy.
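To make the "reuse the existing weights" idea concrete, here is a minimal NumPy sketch, not code from the BERT repo. It assumes you have already pulled the pretrained position table out of the checkpoint as a `[512, hidden]` array; `extend_position_embeddings` is a hypothetical helper, and the new rows use a plain normal init with stddev 0.02 rather than BERT's truncated normal.

```python
import numpy as np


def extend_position_embeddings(old_table, new_max_positions, stddev=0.02, seed=0):
    """Return a larger position table whose first rows are the pretrained ones."""
    old_max, hidden = old_table.shape
    if new_max_positions < old_max:
        raise ValueError("new_max_positions must be at least %d" % old_max)
    rng = np.random.default_rng(seed)
    # Assumption: rows beyond the pretrained range start from small random values.
    new_table = rng.normal(0.0, stddev, size=(new_max_positions, hidden))
    new_table = new_table.astype(old_table.dtype)
    # Keep the pretrained weights for the positions the model already knows.
    new_table[:old_max] = old_table
    return new_table


old = np.ones((512, 768), dtype=np.float32)  # stand-in for the pretrained table
new = extend_position_embeddings(old, 1024)
print(new.shape)  # (1024, 768)
```

The rows past 512 are untrained, so the model still needs further training on long sequences before they mean anything, and max_position_embeddings in bert_config.json would have to be raised to match the new table.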
Is there any way to increase max_position_embeddings and still use the pretrained BERT model?