Fmriprep: Run time estimation for singularity container fmriprep

Created on 26 Feb 2019 · 10 Comments · Source: nipreps/fmriprep

Hello,

I am just hoping to estimate how long an fmriprep run will take and ask a few questions about speeding it up.

I have 156 subjects, each with a T1w and a 6-minute resting-state run. The whole dataset is 15 GB in size (~90 MB per subject). They have all been previously processed with FreeSurfer (run in 2011, stable5_0_0, recon-all v 1.313.2.36).

I'm running a Singularity container on my Linux machine (32 processors: Intel Xeon E5-2630 v3 @ 2.4GHz) with 62 GB of RAM.

I would assume that --nthreads 32 is appropriate?

I also have access to a PBS cluster. Is there a way to parallelize even further using that?

Also, if you abort while fmriprep is running, will it pick up right where it left off previously?

Thanks!

  • Harris
question

All 10 comments

I have 156 subjects, each with a T1w and a 6-minute resting-state run. The whole dataset is 15 GB in size (~90 MB per subject). They have all been previously processed with FreeSurfer (run in 2011, stable5_0_0, recon-all v 1.313.2.36).
I'm running a Singularity container on my Linux machine (32 processors: Intel Xeon E5-2630 v3 @ 2.4GHz) with 62 GB of RAM.

I would run each of the 156 subjects separately. Please make sure you reuse your pre-calculated FreeSurfer results.

62 GB of RAM seems enough. I would probably lean towards using Docker if it is your PC.

I would assume that --nthreads 32 is appropriate?

With 32 threads you may hit memory problems, and most of the time fMRIPrep won't use that many anyway. I would recommend running with --omp-nthreads 8 --nthreads 10 (or 12).

I also have access to a PBS cluster. Is there a way to parallelize even further using that?

I would parallelize subjects in that scenario. Make sure to point the work directory to some kind of local scratch.
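
One way to parallelize over subjects on a PBS cluster is to submit one job per subject. The sketch below only prints the qsub commands rather than submitting them. The job script name (fmriprep_job.sh), the SUBJECT variable, and the resource request are assumptions for illustration, and the -l resource syntax differs between sites.

```bash
# Print the participant labels (without the "sub-" prefix) found in a BIDS directory.
list_subjects() {
    bids_dir=$1
    for d in "$bids_dir"/sub-*/; do
        [ -d "$d" ] || continue      # skip the literal pattern when nothing matches
        d=${d%/}                     # drop the trailing slash
        echo "${d##*/sub-}"          # keep only the label after "sub-"
    done
}

# Print one qsub command per subject (dry run).
# fmriprep_job.sh and SUBJECT are hypothetical names.
print_submissions() {
    list_subjects "$1" | while read -r label; do
        echo "qsub -N fmriprep_${label} -l nodes=1:ppn=8,vmem=24gb -v SUBJECT=${label} fmriprep_job.sh"
    done
}
```

Inside the hypothetical fmriprep_job.sh, the subject would be selected with --participant-label "$SUBJECT", and the work directory (-w) would point at node-local scratch. Piping the output of print_submissions to sh would submit the jobs.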

Also, if you abort while fmriprep is running, will it pick up right where it left off previously?

If you keep your work directory somewhere with enough space and reuse it, then it will pick up cached results. If you follow my advice on using some volatile storage that gets deleted when the job ends, then that would not be possible. Since you are reusing FreeSurfer, I would not try to reuse the work directory.
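
The trade-off described above can be sketched as a small helper that picks the work directory. The function name, the WORK_ROOT variable, and the directory layout are hypothetical; only the idea (persistent storage allows resuming, scratch does not) comes from the comment.

```bash
# Choose a per-subject work directory.
#   persistent: survives the job, so an interrupted run can reuse cached results
#   scratch:    volatile local storage, lost when the job ends
# WORK_ROOT and the fmriprep_work layout are assumptions for illustration.
choose_work_dir() {
    label=$1
    mode=$2
    case "$mode" in
        persistent) echo "${WORK_ROOT:-$HOME/fmriprep_work}/sub-${label}" ;;
        scratch)    echo "${TMPDIR:-/tmp}/fmriprep_work/sub-${label}" ;;
        *)          echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

The chosen path would be passed to fMRIPrep with -w and also bind-mounted into the Singularity container with -B so the container can write to it.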

Sorry, meant to respond to this earlier, but we should put in a big warning: If you want to use FreeSurfer 5 results, you cannot use the Docker containers we provide, because those contain FreeSurfer 6. recon-all in FreeSurfer 5 and 6 have different outputs, so the check for whether recon-all has already been run is version-dependent. You will need to build an alternative container that uses the version of FreeSurfer that you ran previously.

Yes, I also missed the main question.

I am just hoping to estimate how long an fmriprep run will take

Given your dataset, I'd say it will take around 2 h per subject, plus 6 to 12 h per subject since you'll need to run FreeSurfer 6. Otherwise, you have @effigies' option or a bare-metal installation.
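
For a rough sense of scale, the per-subject figures above (2 h, plus 6 to 12 h for FreeSurfer, so 8 to 14 h) can be multiplied out for 156 subjects. This is a back-of-the-envelope sketch: the helper name is made up, the 20 concurrent jobs are a hypothetical cluster allocation, and it assumes jobs run in equal-length waves.

```bash
# Wall-clock hours = number of waves * hours per subject,
# where waves = ceil(subjects / concurrent jobs).
estimate_hours() {
    subjects=$1
    hours_per_subject=$2
    concurrent=$3
    waves=$(( (subjects + concurrent - 1) / concurrent ))
    echo $(( waves * hours_per_subject ))
}

estimate_hours 156 8 1     # one subject at a time, lower bound: 1248
estimate_hours 156 14 1    # one subject at a time, upper bound: 2184
estimate_hours 156 8 20    # 20 concurrent jobs, lower bound:    64
estimate_hours 156 14 20   # 20 concurrent jobs, upper bound:    112
```

Run serially, that is roughly 52 to 91 days, which is why splitting subjects across cluster jobs matters here.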

Thanks for the input!
@oesteban our Linux system doesn't support Docker, so I am running through a Singularity container. I'll split the run into one job per subject on the cluster to parallelize.

While I was waiting I started running the container.
singularity run -B /autofs/space/rainier_002/users/harris/CTS/data \
    /autofs/space/rainier_002/users/harris/CTS/data/singularity/fmriprep-1.1.8.simg \
    --fs-license-file=/autofs/space/rainier_002/users/harris/CTS/data/license.txt \
    /autofs/space/rainier_002/users/harris/CTS/data/BIDS \
    /autofs/space/rainier_002/users/harris/CTS/data/BIDS_output \
    participant

@effigies
I noticed that aseg.presurf.mgz did not exist, and according to the FreeSurfer documentation it is created by
cp aseg.auto.mgz aseg.presurf.mgz
so I just ran that command.

The container appears to be running fine, but it is definitely going through several FreeSurfer commands. Is it just running additional commands to update from version 5 to version 6? Or are the versions different enough that the results will be really messed up and I should just re-run with version 6?

If I were to stop the job and start running individual jobs for each subject, would it be able to pick back up (assuming I don't need to just completely restart with Freesurfer v6)? The single_subject_*wf folders are in /autofs/space/rainier_002/users/harris/CTS/data/BIDS_output/freesurfer/work/fmriprep_wf

I don't think that continuing a FreeSurfer 5 reconstruction by running FreeSurfer 6 is likely to be a great idea. I suspect it'll mix-and-match things, and FreeSurfer 6 tools might assume that their inputs have been processed in a FreeSurfer 6 way. I think you're best off patching in an old FreeSurfer or re-running FreeSurfer from scratch using fMRIPrep.

Got it. Thanks! Given that I have access to a cluster, I will try to just run a new FreeSurfer reconstruction for each subject, if I can successfully do it in parallel.

I have to request a specific amount of memory and nodes per job. The standard job submission is 1node:1ppn and vmem=7gb, but it looks like it almost immediately goes over 7gb. How much memory should be needed per subject?

Per subject we generally recommend 8 cores and 16GB. 24GB is safer if your core/memory ratios allow it. And if you're requesting less than the full system amount, you should use the --n-cpus and --mem-gb options in fMRIPrep. By default, fMRIPrep will assume it can safely access all of the cores and memory on the system, as it has no access to the limits set by the batch scheduler.

Please reopen if you think we need to address further questions. It would also be nice to have @surchs' opinion on this one.

Thanks for the help! Currently running in parallel on the cluster. Just FYI I had to use --mem-mb and --nthreads as the flags.
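
A minimal sketch of keeping fMRIPrep's limits in step with the scheduler request, using the flag names reported to work in this thread (--nthreads and --mem-mb). The helper name and the choice of 1000 MB per requested GB are assumptions; flag names can differ between fMRIPrep versions, so check fmriprep --help.

```bash
# Turn a per-job request (cores, memory in GB) into fMRIPrep resource flags.
# Using 1000 MB per GB keeps the value a little below the scheduler's limit.
resource_flags() {
    cores=$1
    mem_gb=$2
    echo "--nthreads ${cores} --mem-mb $(( mem_gb * 1000 ))"
}

resource_flags 8 16   # prints: --nthreads 8 --mem-mb 16000
```

The printed flags would be appended to the singularity run command in each per-subject job, alongside a matching PBS request for 8 cores and at least 16 GB.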
