Nextflow: Allow the `storeDir` directive to handle file names with "non-standard" characters

Created on 21 Apr 2019 · 6 Comments · Source: nextflow-io/nextflow

New feature

The storeDir directive should be able to handle file names with "non-standard" characters (e.g. :, >)

Usage scenario

Consider the following workflow.nf script:

keysChannel = Channel.from(
    "chr1:111969210_C>T",
    "chr1:115527461_C>T",
    "chr1:11771998_G>A",
    "chr1:150811945_G>A",
    "chr1:158324250_C>T")

process createVcf {
    storeDir "results"
    stageOutMode 'move'

    input:
    val key from keysChannel

    output:
    file "${key}.vcf"

    shell:
    '''
    touch "!{key}.vcf"
    '''
}

Given a list of feature identifiers containing "non-standard" characters (i.e. :, >), the workflow creates one .vcf file per identifier, using the identifier as the file name prefix. Nextflow (v19.01.0) throws the following error when this workflow is run:

mv: cannot stat ‘chr1’: No such file or directory
mv: cannot stat ‘11771998_G>A.vcf’: No such file or directory

I suspect what is happening is that the storeDir directive uses the mv command, but does not double quote the filenames. I recognize that these filenames are not "ideal", but in some scenarios having the option to have non-standard characters in filenames is useful.
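To separate the quoting question from Nextflow itself, here is a minimal shell sketch, assuming a POSIX shell and a file system that permits : and > in names (as typical Linux file systems do). The temporary directory and variable names are illustrative only; this is not the command Nextflow generates.

```shell
#!/bin/sh
# Illustration only: a quoted mv moves a file whose name contains ':' and '>'.
workdir="$(mktemp -d)"              # scratch directory (hypothetical stand-in for a task work dir)
mkdir "$workdir/results"

name='chr1:11771998_G>A.vcf'        # one of the file names from the workflow above
touch "$workdir/$name"              # quoted, so '>' is not treated as a redirection

# '--' ends option parsing; the quotes keep the whole name as a single argument.
mv -- "$workdir/$name" "$workdir/results/"

ls "$workdir/results"
```

If the quoted form succeeds on your system, the name itself is acceptable to mv, which suggests the failure comes from how the name is processed before mv is called.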

All 6 comments

Actually, the problem here is that the : is interpreted as a path separator. As a workaround, specify file '*.vcf' as the output.
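The effect of treating : as a separator can be sketched in plain shell. The helper name split_on_colon is hypothetical and this is not Nextflow's internal code; it only shows that splitting the output name on : yields exactly the two pieces reported in the mv error above.

```shell
#!/bin/sh
# Illustration only: split a file name on ':' the way a path-list separator would.
split_on_colon() {
    # Print each ':'-separated piece of the first argument on its own line.
    printf '%s\n' "$1" | tr ':' '\n'
}

split_on_colon 'chr1:11771998_G>A.vcf'
# prints:
#   chr1
#   11771998_G>A.vcf
```

Note that the > survives intact in the second piece, which is consistent with the names being quoted and the split happening on the colon.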

Hi Paolo, thanks for the suggestion. Sadly, that workaround won't work when resuming runs. With storeDir, any file that matches the output file pattern indicates that the task has already run, so if even one VCF exists in the output directory, every task will pick up that file and assume it is its own output:

$ touch results/false.vcf
$ ./workflow.nf
N E X T F L O W  ~  version 0.30.2
Launching `./workflow.nf` [elegant_lavoisier] - revision: c98b27dbd2
[warm up] executor > local
[skipping] Stored process > createVcf (3)
[skipping] Stored process > createVcf (5)
[skipping] Stored process > createVcf (1)
[skipping] Stored process > createVcf (2)
[skipping] Stored process > createVcf (4)

For now we will have to avoid use of : in the file names. However, it would be nice if Nextflow could support any valid *nix filename, or if the documentation were updated to warn about the use of colons in filenames.
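The difference between a glob pattern and an exact output name can be illustrated with a small shell sketch. The matches_any helper and the colon-free keys are hypothetical stand-ins, and the check is only an approximation of "a stored output already exists", not Nextflow's implementation.

```shell
#!/bin/sh
# Illustration only: why a glob output pattern makes every task look finished.
matches_any() {
    # $1 = directory, $2 = file name or glob pattern (left unquoted so it can expand)
    for candidate in "$1"/$2; do
        [ -e "$candidate" ] && return 0
    done
    return 1
}

results="$(mktemp -d)"              # stand-in for the storeDir directory
touch "$results/false.vcf"          # an unrelated file, as in the run above

for key in chr1_111969210_C_T chr1_115527461_C_T; do
    if matches_any "$results" '*.vcf'; then
        echo "glob '*.vcf': $key looks already stored"
    fi
    if ! matches_any "$results" "$key.vcf"; then
        echo "exact name: $key still needs to run"
    fi
done
```

With an exact name, only a file belonging to that key counts as stored; with the glob, the single false.vcf satisfies the check for every key.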

I see. I agree this should be fixed or at least better documented. I'll leave this issue open as a reminder.

: in filenames should be avoided, the moment you send the file to a collaborator on Windows, or try to copy it to an NFS drive, it will break because : is the drive separator character there, e.g. C:\. Outside the scope of Nextflow perhaps, but given how close the context is, it may be something to keep in mind.

: in filenames should be avoided, the moment you send the file to a collaborator on Windows, or try to copy it to an NFS drive, it will break

Or maybe collaborators on Windows should be avoided ;-)

It's not a big issue and I certainly wouldn't advocate putting in lots of work to fix it, but for users of exclusively unix-like machines like myself, this is an unexpected limitation, so a note in the documentation would be appropriate and sufficient.

Confirming that it works when using path instead of file 🎉
