Sometimes, the hash of the downloaded files doesn't match the expected hash, see e.g.
Other builds in the same group passed fine, so I don't think that this is an issue with the data. Rather, it seems pooch doesn't know to retry a failed download (or even that the download has failed). @leouieda do you think this is something that pooch should handle, or should we somehow wrap it to include retries?
Incidentally maybe this means the earlier errors we were seeing had nothing to do with github vs gitlab bandwidth. 😬 Or, they did but we are hitting gitlab bandwidth limits also. I don't know, this is hard. 😭
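For the "wrap it" option, a retry wrapper might look roughly like the sketch below. This is only an illustration, not pooch's API: fetch_with_retries, its parameters, and the download callable are hypothetical names, and it assumes the download step returns the file contents as bytes.

```python
import hashlib
import time


def fetch_with_retries(download, expected_sha256, retries=3, wait=1.0):
    """Call download() until the SHA256 of its bytes matches, or give up.

    `download` is a hypothetical zero-argument callable returning bytes.
    """
    for attempt in range(1, retries + 1):
        data = download()
        # A truncated download produces a different digest, so it is retried.
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
        if attempt < retries:
            time.sleep(wait)  # brief pause before trying again
    raise ValueError(f"hash mismatch after {retries} attempts")
```

The wrapper treats a hash mismatch the same way as any other failed attempt, which is the behaviour being asked about here.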
@sciunto sorry, I missed this entirely. The logs for that build are no longer available 😢
I would guess that it might have been something with the bandwidth that caused the download to be incomplete (in which case the hash wouldn't match).
Right now, pooch doesn't retry failed downloads and we don't have an option for that. We could add a configuration parameter for the Pooch class like retry_if_failed that can be 0 by default to retain the current behaviour. This seems reasonable, particularly for CIs where most of the failures end up happening.
I created https://github.com/fatiando/pooch/issues/205, feel free to chime in there with any requirements you'd have for this feature.
Thanks thanks!
@jni Pooch 1.3.0 is out now on PyPI and conda-forge including the retry_if_failed option. Give it a shot and let me know if anything doesn't work as expected 🙂 https://www.fatiando.org/pooch/latest/advanced.html#retry-failed-downloads
Thanks @leouieda! We finally added this in #5105. 🤞 that it resolves our CI issues!