I am trying to make this switch from classification to regression as discussed here: https://github.com/google-research/bert/issues/74
with BERT, but I basically get the same output no matter what (e.g. I'm trying to predict scores on a range from 1-10, and everything is given 5.5). Does anybody know why this may be happening?
Hello, have you solved it? I have the same problem.
Try my PR.
Bro, can you tell me what causes this kind of result? Please help.
I have the same issue. Does anyone have a solution?
Have you solved it?
Bro, how did you end up solving it?
Hi, I had the same problem, but I finally fixed it by freezing all layers except the head layers.
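For anyone wondering what that freezing could look like, here is a minimal sketch. It assumes PyTorch rather than the TensorFlow code in this repo, and `TinyRegressor` is a hypothetical stand-in for BERT plus a regression head, not the real model. The idea is just to turn off gradients for the encoder and give the optimizer only the head's parameters.

```python
import torch
from torch import nn


class TinyRegressor(nn.Module):
    """Hypothetical stand-in for a pretrained encoder plus a regression head."""

    def __init__(self, vocab_size=100, hidden=16):
        super().__init__()
        # Stand-in for the pretrained encoder (embeddings + one layer).
        self.encoder = nn.Sequential(
            nn.Embedding(vocab_size, hidden),
            nn.Linear(hidden, hidden),
            nn.Tanh(),
        )
        # Regression head: one scalar score per sequence.
        self.head = nn.Linear(hidden, 1)

    def forward(self, token_ids):
        hidden_states = self.encoder(token_ids)  # (batch, seq, hidden)
        pooled = hidden_states.mean(dim=1)       # mean-pool over tokens
        return self.head(pooled).squeeze(-1)     # (batch,)


torch.manual_seed(0)
model = TinyRegressor()

# Freeze everything except the head.
for param in model.encoder.parameters():
    param.requires_grad = False

# Hand only the still-trainable parameters to the optimizer.
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-2
)

# Toy batch: random token ids and random scores in [1, 10].
token_ids = torch.randint(0, 100, (8, 5))
targets = torch.rand(8) * 9 + 1

# Snapshot the weights so we can check what actually moved.
encoder_before = [p.clone() for p in model.encoder.parameters()]
head_before = model.head.weight.clone()

for _ in range(5):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(token_ids), targets)
    loss.backward()
    optimizer.step()

encoder_unchanged = all(
    torch.equal(before, after)
    for before, after in zip(encoder_before, model.encoder.parameters())
)
head_changed = not torch.equal(head_before, model.head.weight)
```

With a real BERT implementation the same loop over the encoder's parameters should apply, but the attribute names (`encoder`, `head`) are assumptions and will differ by library.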
I ran into the same problem. It's a subtle issue and worth investigating.
In my case, if I fine-tune all layers, the predictions fall in a very narrow range, ~1% of the target distribution's span. Freezing the BERT embeddings helped a bit, but the distribution of predictions is still not as wide as that of the targets. Finally, I just extracted embeddings and ran Ridge regression on top (I find this pipeline easier to extend with handcrafted features), and it worked perfectly, though I understand that it is equivalent to freezing the BERT layers and fine-tuning only the regression head.
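A rough sketch of that "embeddings plus Ridge" pipeline, under loud assumptions: the embeddings and handcrafted features below are random placeholders (in practice they would come from a frozen BERT and your own feature code), and the closed-form NumPy solver is a stand-in for whichever Ridge implementation you prefer.

```python
import numpy as np


def fit_ridge(features, targets, alpha=1.0):
    """Closed-form ridge regression; the intercept is handled by centering."""
    x_mean = features.mean(axis=0)
    y_mean = targets.mean()
    xc = features - x_mean
    yc = targets - y_mean
    n_features = features.shape[1]
    # Solve (X^T X + alpha * I) w = X^T y on the centered data.
    weights = np.linalg.solve(xc.T @ xc + alpha * np.eye(n_features), xc.T @ yc)
    intercept = y_mean - x_mean @ weights
    return weights, intercept


def predict(features, weights, intercept):
    return features @ weights + intercept


rng = np.random.default_rng(0)

# Placeholder data: pretend these are fixed sentence embeddings from BERT.
embeddings = rng.normal(size=(200, 8))
# Placeholder handcrafted features, concatenated next to the embeddings.
handcrafted = rng.normal(size=(200, 2))
features = np.hstack([embeddings, handcrafted])

# Synthetic targets so the example runs end to end.
true_w = rng.normal(size=10)
targets = features @ true_w + 5.5 + 0.1 * rng.normal(size=200)

weights, intercept = fit_ridge(features, targets, alpha=1.0)
preds = predict(features, weights, intercept)

# How much of the target range do the predictions cover?
span_ratio = (preds.max() - preds.min()) / (targets.max() - targets.min())
```

Adding handcrafted features is then just another `np.hstack` column block, which is the part I find convenient.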
I'm trying a token-level regression task, and I also found the model quickly collapsing onto a very small range of predictions (0.1% of the target span). Again, as @Yorko mentions, freezing all of BERT helps expand the range a little (to 10% of the target span). I'll try freezing all layers except the last. I'd love to hear others' relevant experiences :)
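In case it helps others compare numbers, this is roughly how a "percent of target span" figure can be computed. It is a plain-Python sketch with made-up values; `span_ratio` is a hypothetical helper name, and for a token-level task the per-token predictions would need to be flattened into one list first.

```python
def span_ratio(predictions, targets):
    """Range of the predictions divided by the range of the targets."""
    target_span = max(targets) - min(targets)
    if target_span == 0:
        raise ValueError("targets are constant; span ratio is undefined")
    return (max(predictions) - min(predictions)) / target_span


# Made-up scores on a 1-10 scale.
targets = [1.0, 3.0, 5.5, 8.0, 10.0]

# A collapsed model: everything sits near the mean of the targets.
collapsed = [5.49, 5.5, 5.5, 5.5, 5.51]
# A healthier model: predictions cover most of the target range.
healthy = [1.5, 3.2, 5.0, 7.6, 9.5]

collapsed_ratio = span_ratio(collapsed, targets)
healthy_ratio = span_ratio(healthy, targets)
```

A ratio far below 1 on held-out data is the collapse described in this thread; a ratio near 1 does not by itself mean the predictions are accurate, only that they are spread out.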