Fmriprep: Exit early if dependency is missing.

Created on 6 Apr 2018 · 4 Comments · Source: nipreps/fmriprep

Disclaimer: I know that the recommended way of using fmriprep is via docker or singularity where this is a non-issue.

I sometimes wait days for fmriprep to finish when handling large datasets. It's a bit frustrating to see that a node crashed because of a missing dependency. Example: if I pass the --use-aroma flag and ICA-AROMA is not installed, fmriprep shouldn't even attempt to run in the first place.

A good place to insert such checks would be in the cli run function.

I could open a pull request that checks for AROMA, but it'd be nice to have something more general that checks for all possible dependencies according to the arguments passed to fmriprep.
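
A minimal sketch of what such an argument-driven check might look like, run before any workflow is built. The `REQUIRED_BY_OPTION` table, the helper names, and the executable name `ICA_AROMA.py` are illustrative assumptions, not part of fmriprep; only `argparse` and `shutil.which` are standard-library calls.

```python
import argparse
import shutil

# Hypothetical mapping from parsed option names to the executables they need.
# The entries are illustrative assumptions, not fmriprep's actual requirements.
REQUIRED_BY_OPTION = {
    'use_aroma': ['ICA_AROMA.py'],
}


def missing_dependencies(opts, which=shutil.which):
    """Return the executables required by the enabled options that are not on PATH."""
    required = []
    for option, commands in REQUIRED_BY_OPTION.items():
        # Only options that are switched on contribute requirements.
        if getattr(opts, option, False):
            required.extend(commands)
    return [cmd for cmd in required if which(cmd) is None]


def check_dependencies(opts):
    """Exit before any processing starts if a required executable is missing."""
    missing = missing_dependencies(opts)
    if missing:
        raise SystemExit('Missing dependencies: ' + ', '.join(missing))


# Stand-in for the parsed command line; prints whatever is missing on this machine.
opts = argparse.Namespace(use_aroma=True)
print(missing_dependencies(opts))
```

Calling something like `check_dependencies(opts)` right after argument parsing would fail within seconds rather than after hours of processing. The `which` parameter exists only so the lookup can be replaced in tests.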

Labels: good first issue, help wanted

All 4 comments

One approach to achieve this would be to add a nipype function workflow.check_commands. This could walk the workflow graph, build a list of commands and verify whether they can be found in the path.

There would be a cost to this, which will scale with graph size, but it may be worth paying for cases like yours.
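
The graph walk described above can be sketched without committing to nipype internals. The version below assumes each node exposes an `interface` object and that command-line interfaces carry a `cmd` string; both are assumptions of this sketch, and `check_commands` is the hypothetical name proposed above, not an existing nipype function.

```python
import shutil
from types import SimpleNamespace


def check_commands(nodes, which=shutil.which):
    """Return the sorted list of commands used by `nodes` that are not on PATH.

    `nodes` is any iterable of objects with an `interface` attribute; nodes whose
    interface has no `cmd` attribute (e.g. pure-Python steps) are skipped.
    """
    commands = set()
    for node in nodes:
        cmd = getattr(getattr(node, 'interface', None), 'cmd', None)
        if cmd:
            # Keep only the executable name, dropping any arguments.
            commands.add(cmd.split()[0])
    return sorted(cmd for cmd in commands if which(cmd) is None)


# Stand-in nodes for illustration; a real workflow would supply its own nodes.
nodes = [
    SimpleNamespace(interface=SimpleNamespace(cmd='antsRegistration')),
    SimpleNamespace(interface=SimpleNamespace(cmd='3dvolreg')),
    SimpleNamespace(interface=SimpleNamespace()),  # no external command
]
print(check_commands(nodes))
```

Collecting the commands into a set means each distinct executable is looked up only once, so the PATH lookups scale with the number of distinct tools while the traversal itself still scales with graph size.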

And just in case you weren't aware, I will also note that if you keep your working directory around, you can install the missing dependency, and re-run. Successfully run portions of the workflow will be reused.

I suppose the cost would be minimal wrt the time it takes to preprocess the data.

The thing is that I submit jobs to multiple machines, so that each processes a different subject. Because of the size of the intermediate datasets (~100GB per subject), and the number of subjects/sessions I have, I tell each machine to use its own /tmp directory as a working directory. I have no guarantee that resubmitted jobs will land on the same machine, so I cannot count on reusing portions of the workflow, which is unfortunate.

You're right about the relative cost, and I didn't mean to suggest that a minute or two on a multi-hour process is excessive. I appreciate the constraints you're working under, and I agree that it does make sense to do some kind of pre-run dependency check.

Such a check will take some time to implement. If I can suggest two interim solutions:

1) If your cluster supports Singularity (or your administrators are amenable to adding support), Singularity images are the surest way to get dependencies bundled with versions that we've tested to ensure compatibility.

2) Failing that, you can create a stripped-down subject that has just enough data to exercise all paths, which will ensure that all necessary dependencies are installed. For instance, if you have multiple T1w images, reduce to two images; for BOLD series, limit to a single task of a single run. If you have field maps, one BOLD series for each type of field map would be required to fully exercise functionality. BOLD series can be truncated to 50 volumes without running a risk of altering code paths. Finally, run with --fs-no-reconall (other uses of FreeSurfer tools will make sure that FreeSurfer is installed).
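
As a rough illustration of the reduction described in option 2, the helper below picks a minimal set of files from a list of BIDS-style filenames: at most two T1w images and a single BOLD run. The function name and the example filenames are hypothetical, field maps are not handled, and truncating the BOLD series to 50 volumes would need an imaging library, which is left out here.

```python
def select_minimal_files(filenames, max_t1w=2):
    """Pick at most `max_t1w` T1w images and one BOLD series by filename suffix."""
    t1w = sorted(f for f in filenames if f.endswith('_T1w.nii.gz'))
    bold = sorted(f for f in filenames if f.endswith('_bold.nii.gz'))
    # Keep the first few anatomical images and a single functional run.
    return t1w[:max_t1w] + bold[:1]


# Hypothetical file listing for one subject.
files = [
    'sub-01_run-1_T1w.nii.gz',
    'sub-01_run-2_T1w.nii.gz',
    'sub-01_run-3_T1w.nii.gz',
    'sub-01_task-rest_run-1_bold.nii.gz',
    'sub-01_task-rest_run-2_bold.nii.gz',
    'sub-01_task-nback_run-1_bold.nii.gz',
]
print(select_minimal_files(files))
```

The selected files would then be copied into a separate test dataset, leaving the original data untouched.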

I hope that this is helpful.

Thanks for the quick and thorough feedback. I've sorted out all of the dependencies at my end, so fmriprep is running OK. I'll talk to our admins nonetheless to check on Singularity support.

While a robust dependency check isn't in place, would it perhaps be useful to have something along the lines of:

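import shutil

# Executables to look for on PATH (names as suggested in this comment)
tools = ['fsl', 'afni', 'c3d', 'ANTS']
for tool in tools:
    # shutil.which returns None when the executable is not found on PATH
    if shutil.which(tool) is None:
        raise OSError(f'Dependency {tool} missing')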