Fairseq: BART: Is there a tutorial for pre-training BART on your own dataset?

Created on 18 Jul 2020 · 7 Comments · Source: pytorch/fairseq

Thanks!

documentation

All 7 comments

@myleott ? thanks

Same question here. Appreciate if there is a guide.

@shamanez @jasonwu0731 I can't confirm that everything I am trying is 100% correct, but I think I've pieced together a procedure for (possibly) retraining BART on a new dataset. I'm happy to be proved wrong or to improve this. I'd also appreciate FAIR-endorsed guidance. The data processing mostly comes from here. I made Gists for preprocessing and training. Let me know if this is helpful or if you have improvements!

Same problem here. It would be great if there is one.

@tomsherborne Amazing! I have also taken some initial steps. But I am still not sure in what proportions we need to apply the different denoising pretext tasks.
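
To make the question about proportions concrete, here is a toy sketch of two BART-style noising steps (text infilling and sentence permutation) with the proportions exposed as parameters. This is not fairseq's implementation: `text_infill`, `permute_sentences`, `sample_poisson`, and the `<mask>` string are hypothetical names for illustration, and zero-length spans are clamped to length 1 for simplicity. For reference, the BART paper reports masking 30% of tokens with span lengths drawn from a Poisson distribution (λ = 3) and permuting all sentences in its large-scale pre-training run; whether those values suit a new dataset is an open question.

```python
import math
import random

MASK = "<mask>"  # hypothetical mask symbol; assumed absent from the input tokens


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's method (stdlib only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def text_infill(tokens, mask_ratio, poisson_lambda, rng):
    """Replace spans of tokens with a single MASK until about
    mask_ratio of the original tokens have been removed."""
    out = list(tokens)
    budget = min(len(tokens), round(len(tokens) * mask_ratio))
    masked = 0
    while masked < budget:
        # Simplification: clamp the span to [1, remaining budget].
        span = max(1, min(sample_poisson(poisson_lambda, rng), budget - masked))
        starts = [i for i in range(len(out) - span + 1)
                  if MASK not in out[i:i + span]]
        if not starts:
            # No clean window of that length is left; fall back to one token.
            span = 1
            starts = [i for i, tok in enumerate(out) if tok != MASK]
        start = rng.choice(starts)
        out[start:start + span] = [MASK]
        masked += span
    return out


def permute_sentences(sentences, ratio, rng):
    """Shuffle a `ratio` fraction of sentence positions among themselves."""
    n = len(sentences)
    positions = rng.sample(range(n), round(n * ratio))
    targets = positions[:]
    rng.shuffle(targets)
    out = list(sentences)
    for src, dst in zip(positions, targets):
        out[dst] = sentences[src]
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    doc = ["the cat sat on the mat .", "it was warm .", "then it slept ."]
    shuffled = permute_sentences(doc, ratio=1.0, rng=rng)
    tokens = " ".join(shuffled).split()
    print(text_infill(tokens, mask_ratio=0.3, poisson_lambda=3.0, rng=rng))
```

Changing `mask_ratio`, `poisson_lambda`, and `ratio` is then the knob-turning the question above is really about; the sketch only shows where those knobs sit, not which values are right.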

Let's do this!

@ngoyal2707 wondering if you can provide us with details on training BART, or comment on whether @tomsherborne's gists are good. It would be good to have a README in the examples folder describing the process.

Yeah, it would be very useful.
