Is your feature request related to a problem? Please describe.
When the training dataset reader uses on-the-fly augmentations and the validation dataset reader does not, Predictor.from_path offers no way to select the right DatasetReader from the archive.
Describe the solution you'd like
Add a string parameter to the from_path and from_archive functions (defaulting to the training dataset reader) that names the dataset reader to use. Look it up in the archive and instantiate the selected reader, or raise an error if it is not found.
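To make the proposal concrete, here is a rough sketch of the lookup step only, written against a plain dict that stands in for the archived config. The function name `select_reader_config` and the `reader_key` parameter are hypothetical and not part of AllenNLP; the only thing taken from the library is the pair of config keys `dataset_reader` and `validation_dataset_reader`.

```python
from typing import Any, Dict


def select_reader_config(
    config: Dict[str, Any], reader_key: str = "dataset_reader"
) -> Dict[str, Any]:
    """Return the reader sub-config stored under `reader_key`.

    Hypothetical helper: `config` stands in for the config stored in a
    model archive. Raises KeyError if the requested reader is missing.
    """
    if reader_key not in config:
        # List the reader-like keys that do exist to make the error actionable.
        available = sorted(k for k in config if k.endswith("dataset_reader"))
        raise KeyError(
            f"No '{reader_key}' in archive config; available readers: {available}"
        )
    return config[reader_key]


# Illustrative config: the training reader augments, the validation one does not.
example_config = {
    "dataset_reader": {"type": "my_reader", "augment": True},
    "validation_dataset_reader": {"type": "my_reader", "augment": False},
}

training_reader = select_reader_config(example_config)
validation_reader = select_reader_config(example_config, "validation_dataset_reader")
```

With the default, the training reader is chosen, which matches the default suggested above; passing `"validation_dataset_reader"` picks the reader without augmentations.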
Describe alternatives you've considered
I've written my own wrapper that instantiates the DatasetReader and the Predictor.
Hmm, I actually think the right thing to do might be to always load the validation dataset reader if it's there, and fall back to the training dataset reader if not. You _probably_ want whatever you did for validation when you're loading a predictor. I've had to modify model archives plenty of times when putting things in the demo because of this.
We could also add a flag that toggles this behavior if you really did want the training dataset reader. If you want something totally separate, you can just override predictor._dataset_reader after it's loaded.
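For illustration, the fallback plus flag could look roughly like this. It is only a sketch over a plain dict standing in for the archived config; `choose_reader_config` and `prefer_validation` are made-up names, not existing AllenNLP API.

```python
from typing import Any, Dict


def choose_reader_config(
    config: Dict[str, Any], prefer_validation: bool = True
) -> Dict[str, Any]:
    """Pick the validation reader config if present, else the training one.

    Hypothetical helper: setting `prefer_validation=False` always returns
    the training reader config.
    """
    if prefer_validation and "validation_dataset_reader" in config:
        return config["validation_dataset_reader"]
    # Fall back to the training reader, which is assumed to always be present.
    return config["dataset_reader"]


both = {
    "dataset_reader": {"type": "my_reader", "augment": True},
    "validation_dataset_reader": {"type": "my_reader", "augment": False},
}
only_training = {"dataset_reader": {"type": "my_reader", "augment": True}}

picked_default = choose_reader_config(both)
picked_training = choose_reader_config(both, prefer_validation=False)
picked_fallback = choose_reader_config(only_training)
```

The flag only covers the two readers stored in the archive; anything else would still go through overriding predictor._dataset_reader after loading.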
This seems like a reasonable thing to me; I'd say a PR is welcome here. If anyone else has differing opinions, say so.
> Hmm, I actually think the right thing to do might be to always load the validation dataset reader if it's there, and fall back to the training dataset reader if not.
Agreed, I actually thought this was already the default.
This is the default for the evaluate command, which is probably what you were thinking of. It probably should also be the default for Predictors in general.
This was addressed in https://github.com/allenai/allennlp/pull/3033
Thanks @danieldeutsch !