Dear all,
I would find it useful to have an option to rerun not only the selected step in my history, but also the subsequent steps. There is probably already a similar feature for when a step crashes...
I can think of two additional settings, each with options to choose from:
1) what to do with the current results?
A) overwrite
B) make completely separate branch
2) up to which step should it be rerun?
select the step...
(It might not be necessary to rerun all steps. The mistake in the step being rerun might also affect the list of later steps, so one might want to select the last step that still makes sense at that moment...)
Thanks for your consideration!
David
@weiclav thanks for writing this issue!
This idea came up during our last workshop in Prague. We have a similar feature already, but only for failed jobs. These jobs are restartable, and all subsequent jobs will be triggered again. If we could offer this for 'green' jobs as well, I think it would cover this use case, and the user could partially rerun workflows.
We can also extract workflows from a history, I think that an implementation should be based on this, but making it easier to select the starting point.
This also elegantly solves the issue of which steps to include in the re-run.
We will definitely not allow overwriting datasets; that is not, and should not be, possible in Galaxy.
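To make the partial re-run idea concrete, here is a minimal sketch of how the steps to include could be selected from a starting point and an optional last step. It is not Galaxy code and uses no Galaxy API: the function name `steps_to_rerun`, the dict-based history layout, and the step names are all assumptions made up for illustration.

```python
from collections import deque


def steps_to_rerun(downstream, start, stop=None):
    """Return the steps affected by re-running `start`.

    downstream: hypothetical dict mapping a step to the steps that
                consume its output.
    stop:       optional last step; if given, only steps that lie on a
                path from `start` to `stop` are kept.
    The result is in breadth-first discovery order, not a guaranteed
    execution order.
    """
    # Collect every step reachable from the starting step.
    seen = [start]
    queue = deque([start])
    while queue:
        step = queue.popleft()
        for nxt in downstream.get(step, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    if stop is None:
        return seen

    def reaches_stop(step):
        # Depth-first search: can `stop` be reached from `step`?
        stack, visited = [step], set()
        while stack:
            cur = stack.pop()
            if cur == stop:
                return True
            if cur in visited:
                continue
            visited.add(cur)
            stack.extend(downstream.get(cur, []))
        return False

    # Keep only the steps that actually feed into the chosen last step.
    return [s for s in seen if reaches_stop(s)]


# Made-up history: each step lists the steps that use its output.
history = {
    "upload": ["trim"],
    "trim": ["map", "qc"],
    "map": ["call"],
    "qc": [],
    "call": [],
}

print(steps_to_rerun(history, "trim"))              # ['trim', 'map', 'qc', 'call']
print(steps_to_rerun(history, "trim", stop="map"))  # ['trim', 'map']
```

In this sketch the re-run would produce new datasets for the selected steps rather than replacing the existing ones, in line with the no-overwrite rule above.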
We can also extract workflows from a history, I think that an implementation should be based on this, but making it easier to select the starting point.
do also consider having a different interface to the extract workflow from history - the current interface is a bit abstract, showing which datasets were created with which tool, but not the relationship between them, making it harder to figure out which tool to include in the workflow imo.
in addition, this would also make this error less annoying: "This dataset is currently being used as input or output. You cannot change metadata until the jobs have completed or you have canceled them." -> it shows up when a dataset has a metadata error but is already scheduled as input for subsequent jobs.
not sure I understood the part about extracting a workflow from the history correctly - you meant to use the view where you select what to extract, right? sorry for the dumb question, I am relatively new to galaxy so I may be missing something...
anyway, I like the idea. I was thinking of doing it in the "history box", sort of, possibly using the tool's check marks ("Operation on multiple datasets"), but this would probably be too confusing
not sure how often this re-run scenario happens to others. I use the workflow system in a way where I play around a lot, so it is sort of normal for me, but I may be biased... would it be possible to extract some kind of tool usage statistics on this? not possible/practical I suppose, right?