Hi, I want to create a fromFilePairs channel out of files that are the result of a previous process (it gets the files from an object store). Is there any way to do this without running two pipelines?
You don't need two pipelines. You can do this by declaring the output of the process that pulls the data with something like:
output:
set sampleId, file('*_1.fastq'), file('*_2.fastq') into a_new_channel
A possible problem is that you may not have the sampleId and need to extract it from the file name or other metadata. Let me know.
sampleID is going to be the input here I think; so that seems workable.
set 1235, file('_1.fastq'), file('_2.fastq') into a_new_channel
ERROR ~ expecting '}', found ',' @ line 14, column 18.
set val 12345, file('_1.fastq'), file('_2.fastq') into read_files_fastqc
^
Try:
set val(1235), file('_1.fastq'), file('_2.fastq') into a_new_channel
Also, you should use *_1.fastq and *_2.fastq; otherwise no files will match.
Yay. This tuple has fewer [] than the one from fromFilePairs; is this an issue? I am unsure of how fastqc expects its input.
12456[/home/fastq/12456_1.fastq.gz, /home/fastq/12456_2.fastq.gz]
vs
input.1 12345_1.fastq 12345_2.fastq
when
echo $reads >> /home/ubuntu/tup
where $reads is the input from the channel.
I should add a code snippet to help.
Channel
    .fromFilePairs( params.reads, size: 2 )
    .ifEmpty { exit 1, "Cannot find any reads matching: ${params.reads}" }
    .into { read_files_fastqc; read_files_trimming }

process fastqc {
    container = 'ubuntu'
    input:
    file(reads) from read_files_fastqc
    script:
    """
    cat $reads >> /home/tup
    """
}
vs
params.list = '/home/list'
myFile = file(params.list)
allLines = myFile.readLines()

process get {
    container = 'hc7docker/test'
    input:
    val a from allLines
    output:
    set val(a), file('*_1.fastq'), file('*_2.fastq') into read_files_fastqc
    script:
    """
    /docker-entrypoint.sh
    iget "${a}_1.fastq"
    iget "${a}_2.fastq"
    """
}

process fastqc {
    container = 'ubuntu'
    input:
    file(reads) from read_files_fastqc
    script:
    """
    echo $reads >> /home/ubuntu/tup
    """
}
In both cases there is one pair of text files with the right names available; these pipelines run and give the outputs above. The files are not actual data files, so I can't process them as data.
Yes, fromFilePairs produces a tuple in which the second element is a pair (a list) containing both files, unless you specify flat: true. In that case each file is emitted as a separate element of the tuple. See the options.
If you want the process output to behave like the (default) fromFilePairs method, you can declare the output as:
output:
set val(a), file('*_{1,2}.fastq') into read_files_fastqc
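To make the difference in tuple shape concrete, here is a minimal sketch that prints what each form of fromFilePairs emits. The glob in params.reads is only an assumption for illustration, based on the paths shown earlier in this thread; the shapes in the comments are what should be printed, not captured output.

```groovy
// Hypothetical glob; adjust it to wherever the read pairs actually live.
params.reads = '/home/fastq/*_{1,2}.fastq.gz'

// Default: both files are grouped in a nested list, e.g.
// [12456, [/home/fastq/12456_1.fastq.gz, /home/fastq/12456_2.fastq.gz]]
Channel
    .fromFilePairs( params.reads, size: 2 )
    .view()

// With flat: true each file is its own tuple element, e.g.
// [12456, /home/fastq/12456_1.fastq.gz, /home/fastq/12456_2.fastq.gz]
Channel
    .fromFilePairs( params.reads, size: 2, flat: true )
    .view()
```

A process consuming the nested form would then usually declare its input as set val(sampleId), file(reads) from read_files_fastqc rather than a bare file(reads), so that the ID and the file pair are bound separately.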
Thanks. That produces much more similar output.
I think this is solved. I'm closing it. Feel free to comment/re-open if needed.
@pditommaso I have a related question. In my case, process A outputs a set of files, filesA, and I want process B to process the pairs in filesA matching a pattern, e.g., *.{1,2}.out.txt. Is there a way to achieve this without running two pipelines?