Nextflow: Allow multiple `publishDir` directives for a single process

Created on 1 Dec 2016 · 12 Comments · Source: nextflow-io/nextflow

It can sometimes be useful to publish output files into multiple directories.

One use case: I have a process that produces two files, and I would like to publish them to two different folders based on their names. E.g.:

process test {
    publishDir 'foos', pattern: 'foo.*'
    publishDir 'bars', pattern: 'bar.*'

    output:
    file('*.txt')

    '''
    echo -e "foo\nbar" | awk '{f=$1".txt"; print ("hello", $1) > f}'
    ''' 
}

kinfeature

emi80

👍6 🎉1

Most helpful comment

Since v0.29.0 you can specify as many publishDir as needed.

pditommaso on 25 Apr 2018

👍5
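As a rough illustration of the comment above, here is a minimal sketch that publishes the same output file to two directories. It assumes Nextflow 0.29.0 or later and the DSL1 syntax used elsewhere in this thread; the process name, channel name and directory names are hypothetical and only serve the example.

```groovy
// Sketch only: DSL1 syntax as used in this thread, assuming Nextflow >= 0.29.0.
// The process, channel and directory names below are hypothetical.
process make_report {
    // Each publishDir directive is applied on its own to the declared outputs,
    // so a file matched by both directives should be published to both targets.
    publishDir 'by_sample/sampleA', mode: 'copy'
    publishDir 'by_process/make_report', mode: 'copy'

    output:
    file('report.txt') into reports_ch

    '''
    echo "hello" > report.txt
    '''
}
```

Adding a `pattern:` option to each directive, as in the original request, should restrict each target directory to the files whose names match that glob.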

All 12 comments

I'd love to see that too ;-)

MaxUlysse on 1 Feb 2017

fmorency on 8 Mar 2017

It does work, in fact, if you do something like this:

 publishDir '.', saveAs: { it.startsWith('foo.') ? "foos/$it" : "bars/$it" }

MaxUlysse on 8 Mar 2017

@MaxUlysse Thanks but my use-case involves outputting the same file in multiple folders

fmorency on 8 Mar 2017

Bioninbo on 8 Feb 2018

My workaround for now, to publish to two different directories, is to run the script twice with two different options. This way:
publishDir path: out_option == 'id' ? "path1" : "path2"
Or this way:
publishDir path: "${outdir}/${out_option}/end_of_path"

Bioninbo on 8 Feb 2018

I think you could use the copyTo method to copy your file into a second directory without running your script twice.
https://www.nextflow.io/docs/latest/script.html#copy-files

MaxUlysse on 8 Feb 2018
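The `copyTo` suggestion above could look roughly like the sketch below. It is not taken from the thread: it assumes DSL1 syntax, and the process, channel and directory names are hypothetical. It relies on `copyTo` placing a file inside the target when the target is an existing directory, so the directory is created first.

```groovy
// Sketch only: DSL1 syntax, hypothetical process, channel and directory names.
process make_file {
    // First copy is handled by the usual publishDir directive.
    publishDir 'first_dir', mode: 'copy'

    output:
    file('result.txt') into results_ch

    '''
    echo "hello" > result.txt
    '''
}

// Second copy: handle each emitted file in plain Groovy, outside the process.
results_ch.subscribe { f ->
    def secondDir = file('second_dir')
    secondDir.mkdirs()      // make sure the target directory exists
    f.copyTo(secondDir)     // copies result.txt into second_dir/
}
```

This keeps the process itself unchanged and avoids running the script twice, at the cost of an extra copy step outside the `publishDir` mechanism.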

Thx Max. I could not make it work with the copyTo method, but it might do the trick indeed.

Bioninbo on 8 Feb 2018

👍1

I think it's time to give this a try.

pditommaso on 12 Feb 2018

Available in version 0.29.0-RC1.

pditommaso on 8 Apr 2018

🎉2

I have a similar issue but a different use case: I want to publish the same files into two different `publishDir`s. Is this supported? For example, something like this:

process tsv_2_sqlite {
    // convert TSV files into SQLite databases
    tag "${caller}-${sampleID}"
    publishDir "output_per_sample/${sampleID}/${caller}", mode: 'copy', overwrite: true
    publishDir "output/tsv_2_sqlite"

    input:
    set val(caller), val(sampleID), file(sample_tsv) from samples_updated_tsvs

    output:
    set val(caller), val(sampleID), file("${sampleID}.sqlite") into samples_sqlite

    script:
    """
    table2sqlite.py -i "${sample_tsv}" -o "${sampleID}.sqlite" -t variants -d '\t'
    """
}

Essentially, I want to have the same set of files but published under two different directory structures. One for a 'per-sample' output directory structure, and another for a 'per-process' output directory structure.

I'm not sure how to find out what was actually implemented in version 0.29.0-RC1.