While processing a large dataset, I encountered an unusual misalignment of data in the CSV file output of sct_process_segmentation. For example:

This is rare: out of over 9000 entries, only about 60 have a misalignment issue.
SCT version: commit--> https://github.com/neuropoly/spinalcordtoolbox/commit/a52f725fc440edae1d1cbd687ca8c0443a25a6bc
dataset: https://github.com/spine-generic/data-multi-subject
with version https://github.com/sct-pipeline/csa-atrophy/commit/db2095a6abe997fe5972d04b4b997a8cca9d72ee of csa-atrophy
run command: sct_run_batch -config config_sct_run_batch.yml
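One way to count the affected rows is to compare each row's field count with the header's. This is a minimal sketch, not part of SCT: `find_misaligned_rows` is a hypothetical helper, and the three-column sample (column names and values) is made up for illustration, since the real output has more columns.

```python
import csv
import io


def find_misaligned_rows(csv_text):
    """Return (row_number, n_fields) for each data row whose field count differs from the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    bad = []
    for row_number, row in enumerate(reader, start=2):  # the header is row 1
        if len(row) != len(header):
            bad.append((row_number, len(row)))
    return bad


# Hypothetical excerpt: the last row carries one field too many.
sample = (
    "Filename,Slice (I->S),MEAN(area)\n"
    "seg.nii.gz,100:106,70.1\n"
    "seg.nii.gz,107,109:110,69.8\n"
)
misaligned = find_misaligned_rows(sample)
print(misaligned)
```

On a real file, the same function could be fed the text read from the CSV on disk.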
@PaulBautin it's "heavy" for the SCT core team to reproduce these results by running sct_run_batch with a dependency on your other repos; same comment about pointing to a 10GB dataset.
Good practice is to point to a single data file (seg) and a single command (here: sct_process_segmentation) to reproduce the bug.
@jcohenadad I agree. I am not able to reproduce the error myself for the moment. Still investigating.
@jcohenadad, I have re-run sct_process_segmentation on the same subject (with the same transformation and the same rescaling) and found no misalignment in the CSV files.
My suspicion is that slice 108 was missing during the aggregation, so the output string to be added to the CSV file was 107,109:110, which was then split into 107 | 109:110 by the CSV formatting.
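To illustrate the suspected mechanism (this is a sketch under assumptions, not SCT's actual aggregation code): `slices_to_string` below is a hypothetical helper that collapses slice indices into ranges, and the file name and CSA value are placeholders. A gap in the slices puts a comma inside the field, and an unquoted comma adds a column.

```python
import csv
import io


def slices_to_string(slices):
    """Collapse sorted slice indices into 'start:end' runs, comma-separated across gaps."""
    runs = []
    start = prev = slices[0]
    for s in slices[1:]:
        if s != prev + 1:  # gap found: close the current run
            runs.append((start, prev))
            start = s
        prev = s
    runs.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}:{b}" for a, b in runs)


field = slices_to_string([107, 109, 110])  # slice 108 is missing

# Joining fields by hand leaves the embedded comma unquoted: one extra column.
naive_line = ",".join(["seg.nii.gz", field, "70.1"])

# csv.writer quotes any field containing the delimiter, so the row keeps 3 columns.
buffer = io.StringIO()
csv.writer(buffer).writerow(["seg.nii.gz", field, "70.1"])
quoted_line = buffer.getvalue().strip()

print(field)
print(naive_line.split(","))
print(next(csv.reader([quoted_line])))
```

If the real writer already quotes fields, the misalignment would have to come from a later step, for example when the CSV files are concatenated or re-read.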
@PaulBautin maybe you can verify this suspicion by removing a slice before running process_segmentation
I tried removing 3 different slices:
paul@montreal:~/Github/csa-atrophy$ sct_process_segmentation -i csa_atrophy_results/data_processed/sub-amu01/anat_r1/sub-amu01_T1w_RPI_r_crop_r1_t1_seg_copy.nii.gz -vert 2:5 -perlevel 1 -vertfile csa_atrophy_results/data_processed/sub-amu01/anat_r1/sub-amu01_T1w_RPI_r_crop_r1_seg_labeled_t1.nii.gz -o /home/paul/Github/csa-atrophy/csa_atrophy_results/results/csa_perlevel_sub-amu01_t1_1.csv
--
Spinal Cord Toolbox (git-master-a52f725fc440edae1d1cbd687ca8c0443a25a6bc)
Compute shape analysis: 26%|###8 | 45/175 [00:00<00:00, 449.40iter/s]
No properties for slice: 108
Compute shape analysis: 57%|########4 | 99/175 [00:00<00:00, 471.73iter/s]
No properties for slice: 133
Compute shape analysis: 86%|############ | 151/175 [00:00<00:00, 483.96iter/s]
No properties for slice: 175
Compute shape analysis: 100%|##############| 175/175 [00:00<00:00, 502.86iter/s]
Done! To view results, type:
xdg-open /home/paul/Github/csa-atrophy/csa_atrophy_results/results/csa_perlevel_sub-amu01_t1_1.csv
This does not seem to affect the output csv file:
csa_perlevel_sub-amu01_t1_1.zip
@PaulBautin can you please upload sub-amu01_T1w_RPI_r_crop_r1_t1_seg_copy.nii.gz so we can reproduce?
Also, try removing slices here: csa_atrophy_results/data_processed/sub-amu01/anat_r1/sub-amu01_T1w_RPI_r_crop_r1_seg_labeled_t1.nii.gz
Here is the segmentation with the removed slices:
sub-amu01_T1w_RPI_r_crop_r1_t1_seg_copy.zip
and its labelling:
sub-amu01_T1w_RPI_r_crop_r1_seg_labeled_t1.zip
Ah, to reproduce the error, remove slices here: csa_atrophy_results/data_processed/sub-amu01/anat_r1/sub-amu01_T1w_RPI_r_crop_r1_seg_labeled_t1.nii.gz
@jcohenadad, the missing slices in the labelling seem to be random (I am not able to reproduce the error when running segmentation and labelling on the same subjects). Should we try to manually correct these?
No. We should understand what causes these missing slices.
I count 65 corrupted entries (all T1w images) with only 6 subjects impacted (sub-brnoUhb05, sub-brnoUhb02, sub-tokyo750w03, sub-tokyo750w04, sub-tokyo750w06, sub-perform02). @jcohenadad Could you send the processing log?
Corrupted entries:
corrupted_csv_file_entries.zip
here you go: log_20200828-test.zip
But my suggestion is to actually look at how your transformations and/or rescaling could create those missing slices. That's how I would attack the problem. Take a random subject, and look at the rescaled/transformed labeled segmentation, and try to understand what could cause this issue. I'm happy to help after you've investigated it a bit.
Up to now I was not able to reproduce the error because the script (on my dataset) used a manual segmentation by default.
It seems that subjects with corrupted entries also have missing slices in their segmentation. From what I understand, missing slices in *seg.nii.gz files are corrected (interpolated) during sct_process_segmentation, but missing slices in *seg_labelled files are not.
@jcohenadad, why did the script not pick up the manual segmentation on your side? Could we fill the missing slices in the segmentation before labelling?
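A quick way to check a segmentation or labeled file for such gaps is to look for empty slices between the first and last non-empty one. This is a minimal sketch on a synthetic array, not real data: `find_missing_slices` is a hypothetical helper, and it assumes the slice axis is the third one (as for an RPI-oriented image).

```python
import numpy as np


def find_missing_slices(mask, axis=2):
    """Return indices of empty slices lying between the first and last non-empty slice along `axis`."""
    other_axes = tuple(i for i in range(mask.ndim) if i != axis)
    non_empty = np.any(mask != 0, axis=other_axes)
    filled = np.flatnonzero(non_empty)
    if filled.size == 0:
        return []
    inside = np.arange(filled[0], filled[-1] + 1)
    return [int(z) for z in inside if not non_empty[z]]


# Synthetic stand-in for a labeled segmentation.
labeled = np.zeros((4, 4, 12), dtype=np.uint8)
labeled[1:3, 1:3, 2:10] = 3
labeled[:, :, 5] = 0  # simulate a dropped slice
missing = find_missing_slices(labeled)
print(missing)
```

For a real file, the array could be obtained with nibabel, e.g. `nib.load(path).get_fdata()`, assuming nibabel is installed.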
@jcohenadad, why did the script not pick up the manual segmentation on your side?
I'm not sure. What does the log file say? I.e., can you point to the line that says "looking for manual segmentation, file XXX"? Then I can check whether file XXX exists in my version of the repos on Compute Canada.
On my side this subject has a manual segmentation:
but this log does not indicate it:
process_data_sub-brnoUhb05.zip
FILESEG=sub-brnoUhb05_T1w_RPI_r_seg
+ FILESEGMANUAL=/scratch/jcohen/data-multi-subject-rsync/derivatives/labels/sub-brnoUhb05/anat/sub-brnoUhb05_T1w_RPI_r_seg-manual.nii.gz
+ '[' -e /scratch/jcohen/data-multi-subject-rsync/derivatives/labels/sub-brnoUhb05/anat/sub-brnoUhb05_T1w_RPI_r_seg-manual.nii.gz ']'
+ sct_deepseg_sc -i sub-brnoUhb05_T1w_RPI_r.nii.gz -c t1 -qc /scratch/jcohen/results_csa_t1_20200828/qc -qc-subject sub-brnoUhb05
I think I understand the problem now. I launched the CSA processing and then realized that I had forgotten to rsync the derivatives/ folder.
Let me re-run the process right now!
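A pre-flight check before launching the batch could catch this kind of omission. The sketch below is only an illustration: `missing_manual_segs` is a hypothetical helper, the directory layout mirrors the path in the log above, and the demo runs on a throwaway directory rather than the real dataset.

```python
from pathlib import Path
import tempfile


def missing_manual_segs(labels_dir, subjects, suffix="_T1w_RPI_r_seg-manual.nii.gz"):
    """Return subjects whose manual segmentation is absent under labels_dir/<subject>/anat/."""
    labels_dir = Path(labels_dir)
    return [
        sub for sub in subjects
        if not (labels_dir / sub / "anat" / f"{sub}{suffix}").exists()
    ]


# Demo on a temporary tree: only sub-brnoUhb05 gets a manual segmentation file.
with tempfile.TemporaryDirectory() as tmp:
    anat = Path(tmp) / "sub-brnoUhb05" / "anat"
    anat.mkdir(parents=True)
    (anat / "sub-brnoUhb05_T1w_RPI_r_seg-manual.nii.gz").touch()
    absent = missing_manual_segs(tmp, ["sub-brnoUhb05", "sub-perform02"])

print(absent)
```

Whether a subject is expected to have a manual segmentation at all depends on the dataset, so an entry in the returned list is a prompt to check, not necessarily an error.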
@jcohenadad, we should have merged https://github.com/sct-pipeline/csa-atrophy/pull/60 before re-running the process. Sorry.
problem fixed (was caused by corrupted data)