Galaxy: When processing collections in batch mode, $input.element_identifier seems to revert to $input.name instead

Created on 21 May 2018 · 19 Comments · Source: galaxyproject/galaxy

Example is RSeQC geneBody_coverage.xml. When running in "merge" mode, $input.element_identifier is set to the element id in the collection. However, when run in "batch" mode, $input.element_identifier gets set to the filename (which is not desired).

area/dataset-collections area/tool-framework kind/bug

All 19 comments

I don't think this is true in general, because I believe we have tests for it (https://github.com/galaxyproject/galaxy/blob/dev/test/api/test_tools.py#L912). However, we don't have tests for mapping over a parameter inside a conditional, so the conditional might be what is breaking this. We also have tests for element identifiers inside conditionals, but they cover multiple=true parameters, not single parameters like the one in this tool.

Feel free to edit the issue to make the description more precise. I only know this one case atm.

I have a test case that puts a single data identifier inside a repeat and a conditional and the identifier is still used for me - at least in 18.05. What version of Galaxy did you observe this in?

I saw it on 17.09, I believe. I'm in the process of updating to 18.01 (or possibly straight to 18.05), but didn't want to push it out before the long weekend, so I'll do it next week and let you know if things look resolved or not. Thanks for looking into it, I appreciate it.

@jmchilton I'm running 18.05 and this is still happening.

Any updates on this? It makes the RNASeQC GeneBodyCoverage output in MultiQC much less useful. Has anyone come up with a workaround? I'd be happy to patch the tools to fix this.

I think the tool here might be doing something unexpected. Can you try adding the conditional prefix to input? https://github.com/galaxyproject/tools-iuc/blob/2bd06c2b43c295fb4cf172c4f156fed5475855a4/tools/rseqc/geneBody_coverage.xml#L32

(so that you have $batch_mode.input)
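For readers unfamiliar with the suggestion, here is a minimal sketch of what prefixing looks like in a Galaxy tool. It is not the actual RSeQC wrapper: the names `batch_mode` and `input` come from this thread, while the selector name, the option values, and the `some_tool` command line are hypothetical placeholders.

```xml
<!-- Hypothetical fragment, not the real geneBody_coverage.xml -->
<command><![CDATA[
    ## Unprefixed form reported in this issue to lose the collection
    ## element identifier when the tool is mapped over a collection:
    ##   '$input.element_identifier'
    ## Prefixed form: address the parameter through its enclosing conditional.
    some_tool
        --input '$batch_mode.input'
        --label '$batch_mode.input.element_identifier'
]]></command>
<inputs>
    <conditional name="batch_mode">
        <!-- selector name and option values are assumptions -->
        <param name="batch_mode_selector" type="select" label="Run mode">
            <option value="batch" selected="true">One sample per job</option>
            <option value="merge">Combine samples</option>
        </param>
        <when value="batch">
            <param name="input" type="data" format="bam" label="Input BAM file"/>
        </when>
        <when value="merge">
            <param name="inputs" type="data" format="bam" multiple="true" label="Input BAM files"/>
        </when>
    </conditional>
</inputs>
```

The only change that matters is in the command template: `$batch_mode.input` instead of `$input`.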

Thanks @mvdbeek. I made that change and the tests still pass, but I'm having trouble writing a test that actually consumes a collection. I tried setting the input param to a collection element in the test, but I get the error: AssertionError: batch_mode|input is not optional. You must provide a valid filename.. Is there a good example anywhere?

You can't do that in tool tests (you could do this in workflow tests). Mapping inputs over collections is a framework function that should just work™. In a sense you'd be testing Galaxy instead of your tool.

So, can I write a test that would fail for the currently "broken" tool, but pass when I fix it? Just trying to improve the tests and also check that this fixes the issue (without actually modifying things on my current install of Galaxy).

This example seems to show a test specifying a collection as input: https://tldrify.com/ru9

I can't access this from work (I looked it up on my phone; here the input param is a collection, so that makes sense but doesn't apply here).

But you can only specify collections for collection inputs, which isn't what you want to do here.
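To make the distinction concrete, here is a rough sketch of a tool test that does take a collection, which works only because the parameter itself is declared as a collection input. All names and file names are hypothetical placeholders, and the output `out_file` is assumed to be declared elsewhere in the tool.

```xml
<!-- Hypothetical fragment: the tool consumes a whole list, so the test can supply one -->
<inputs>
    <param name="input_list" type="data_collection" collection_type="list" format="txt" label="Input list"/>
</inputs>
<tests>
    <test>
        <param name="input_list">
            <collection type="list">
                <!-- each element name is the element identifier -->
                <element name="sample1" value="sample1.txt"/>
                <element name="sample2" value="sample2.txt"/>
            </collection>
        </param>
        <!-- assumes an output named out_file is defined in <outputs> -->
        <output name="out_file" file="expected.txt"/>
    </test>
</tests>
```

A plain `type="data"` parameter such as `batch_mode|input` accepts only a single dataset in a tool test, which is why the mapped-over case discussed here cannot be expressed this way.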

Just one of the examples here: https://planemo.readthedocs.io/en/latest/writing_advanced.html#improved-input-handling-via-test-driven-development

But I think I get your point, and I guess I can't write a test for it then (which was the case when I wrote this too). I'll see what I can do to test this on some other galaxy outside of planemo. Thx!

No problem. It might not be the issue, but it's always better to use the prefixed input where possible. If this is the problem, though, we can fix and test this in Galaxy to prevent regressions.

What do you know, that fixed it! Thanks @mvdbeek. Would be good to create an issue to write a test for galaxy. Submitting a PR to fix the tool now.

Awesome, thanks for confirming.

We fixed this in https://github.com/galaxyproject/galaxy/pull/6798. It'll be in the upcoming 18.09 release.
