It would be great if users could set new variables from inside the script. For example:
process example {
    input:
    val species_name
    output:
    set taxid, accession, file('genome.fasta') into labeled_files
    """
    taxid=`get_taxid $species_name`
    accession=`get_accession $species_name`
    download_genome \$accession > genome.fasta
    """
}
At the moment, we can extract at most one variable by using stdout:
process example {
    input:
    val species_name
    output:
    set stdout, file('genome.fasta') into labeled_files
    """
    taxid=`get_taxid $species_name`
    accession=`get_accession $species_name`
    download_genome \$accession > genome.fasta
    printf "\$taxid"
    """
}
It's possible that this is sufficient for most people, but the request is here just in case others would find the bash→groovy variable extraction helpful.
I agree this would be great!
An option could be to add the support for env outputs in order to capture the values of environment variables in the context of the BASH script. For example:
process example {
    input:
    val species_name
    output:
    set env(taxid), file('genome.fasta') into labeled_files
    """
    taxid=`get_taxid $species_name`
    accession=`get_accession $species_name`
    download_genome \$accession > genome.fasta
    """
}
Indeed, it sounds like an elegant solution to me.
env(taxid) is a great idea
(Sent by email in reply to Matthieu Foll's comment of Tue, 1 Sep 2015 04:55: https://github.com/nextflow-io/nextflow/issues/69#issuecomment-136639772)
I'm bringing up this issue again because I have a similar need. Was this enhancement implemented in a recent release?
Unfortunately not yet.
An approach I've used in the past is to print JSON or YAML from the process and capture that string in stdout. Then use JsonSlurper (or yaml analog) in the resulting output channel to parse that string into a groovy object. This object can be used directly or otherwise manipulated or decomposed into channel values with the usual suspect operators.
I'm not sure I understand. Could you provide an example?
Here's a basic example:
import groovy.json.JsonSlurper
Channel.from(3,4,5).into{ data }
process emitJson {
    input:
    val x from data
    output:
    stdout into out
    """
    #!/usr/bin/env python
    import json
    result = []
    for i in range(1,$x):
        result.append({'a': i*$x, 'b': i*i*$x})
    # Print a list of objects as a JSON string
    print(json.dumps(result))
    """
}
slurp = new JsonSlurper()
out
    .flatMap{ x -> slurp.parseText(x) }
    .view()
Now try replacing the flatMap at the end with just a view to see the difference:
out.view()
This is smart, but ideally it should be possible to capture one or more variables without any external parsing.
One way could be to dump the process environment into a file, then have NF parse that file to fetch the variable(s) specified in the process output.
What I don't like about this is that it requires yet another file to be created for each process. Moreover, it would work only for BASH tasks. So I'm not yet convinced it should be implemented.
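The file-based idea can be sketched in plain shell, outside Nextflow. This is only an illustration under assumptions: the function names `dump_env` and `read_var` are hypothetical, and nothing here is taken from Nextflow's code. The script dumps its environment with `env`, and a reader then picks out the variable it was asked for.

```shell
# Hypothetical helpers illustrating the "dump the environment to a file" idea.

# Dump the current process environment (exported variables only) to a file.
dump_env() {
    env > "$1"
}

# Print the value of one variable from a dump file.
# $1 = dump file, $2 = variable name. Returns 1 if the name is not found.
read_var() {
    while IFS= read -r _line; do
        case $_line in
            # Match lines of the form NAME=value and strip the "NAME=" prefix.
            "$2="*) printf '%s\n' "${_line#*=}"; return 0 ;;
        esac
    done < "$1"
    return 1
}
```

A task script would end with something like `dump_env task.env` (the file name is made up), and the reader would call `read_var task.env taxid`. Note that `env` only lists exported variables, so a plain `taxid=...` assignment would have to be exported first, and values containing newlines are not handled by this sketch.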
Maybe a better way to handle this problem would be to have NF open a TCP socket that can be used to fetch the script's ENV variables without having to pass through a file, e.g.
env > /dev/tcp/<nextflow host address>/<port>
But it would still only work for BASH scripts.
It would be useful yes.
Here I'm trying to extract the ID of a sample from its BAM file, all the while indexing it.
process IndexBAM {
    input:
    set val(status), file(bam) from ch_unindexed_bam
    output:
    set val(status), stdout, file(bam), file("*.bai") into ch_indexed_bam
    script:
    """
    # Extract ID from BAM header
    idSample=`samtools view -H ${bam} | grep "^@RG" | grep -oP 'ID:\\K(\\S+)'`
    # Index BAM
    filename=${bam}
    samtools index ${bam} \${filename%.bam}.bai
    # Send idSample to stdout to be captured by nextflow
    printf "\$idSample"
    """
}
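For readers unfamiliar with the header extraction step, here is a small standalone sketch of what it does, run on a made-up header instead of a real BAM file. It swaps in `awk` for the `grep -oP` call used above, because `-P` is only available in GNU grep; the function name `rg_id` and the sample header values are hypothetical. In the process above the backslashes are doubled (`\\K`, `\\S`) because the script is a Groovy string, so the shell actually sees `ID:\K(\S+)`.

```shell
# Print the ID field of every @RG line in a SAM/BAM header read from stdin.
# Header fields are tab-separated TAG:value pairs, so split on tabs and
# look for the field that starts with "ID:".
rg_id() {
    awk -F '\t' '/^@RG/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^ID:/) print substr($i, 4)
    }'
}

# Usage with a made-up two-line header (normally: samtools view -H file.bam | rg_id)
printf '@HD\tVN:1.6\tSO:coordinate\n@RG\tID:sampleA\tSM:patient1\n' | rg_id
```

If a file has several read groups, this prints one ID per line, so the value captured through stdout would contain several lines.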
Here I used the trick suggested by the OP (thanks @robsyme). But there could be other features I'd like to extract!
@msmootsgi Nice idea too!
After spending a week at Pawsey in Perth, and in honour of Robert, who has likely been the first Australian Nextflow user, I've finally decided to implement this feature. The relaxed atmosphere at this latitude has contributed. Copying @SvenDowideit, who also asked for this feature.
Ha. Thanks Paolo :)
An option could be to add the support for env outputs in order to capture the values of environment variables in the context of the BASH script. For example:
process example {
    input:
    val species_name
    output:
    set env(taxid), file('genome.fasta') into labeled_files
    """
    taxid=`get_taxid $species_name`
    accession=`get_accession $species_name`
    download_genome \$accession > genome.fasta
    """
}
Hi,
I have tried this solution, but it did not work. I do not know why, but it says:
invalid set output parameter declaration -- item nextflow.script.tokenEnvCall(nextflow.script.TokenVar(e))
I have included the variable in the shell script
"""
export e="sugar"
"""
Any ideas, please?
Thanks
Hamza
This has not been released yet; it only works if you build from source.
Oh, I see. Many thanks.
Hamza
This feature will be really helpful. Is it coming in v20.01.0?
Yes