If Nextflow encounters a process that requests resources that are not available on the current system, it fails. For example, if 10 processors are requested and the system only has 8, it exits with an error. This is A Good Thing 👍
However, when writing config files for a pipeline that aim for generic, works-for-most-situations requests, it can be nice to be able to throttle these, for example for small genomes, test datasets or small machines.
In most of our pipelines, we now have this little helper function in nextflow.config:
// Function to ensure that resource requirements don't go beyond
// a maximum limit
def check_max(obj, type) {
    if(type == 'memory'){
        try {
            if(obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
                return params.max_memory as nextflow.util.MemoryUnit
            else
                return obj
        } catch (all) {
            println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj"
            return obj
        }
    } else if(type == 'time'){
        try {
            if(obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
                return params.max_time as nextflow.util.Duration
            else
                return obj
        } catch (all) {
            println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj"
            return obj
        }
    } else if(type == 'cpus'){
        try {
            return Math.min( obj, params.max_cpus as int )
        } catch (all) {
            println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
            return obj
        }
    }
}
Then, in each config, we wrap the request in the above function, like so:
cpus = { check_max( 1 * task.attempt, 'cpus') }
memory = { check_max( 8.GB * task.attempt, 'memory') }
time = { check_max( 2.h * task.attempt, 'time') }
This way, the defaults work most of the time. But if you pass --max_memory "8.GB" then all processes will use a maximum of 8.GB and the pipeline will run on your tiny machine. This is really helpful for test datasets and allows us to have a default "base" config file that is throttled by other system-specific configs.
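For the command-line override to have something to override, the max_* parameters need default values in the config. A minimal sketch, assuming the parameter names used in the function above (max_memory, max_cpus, max_time); the values are arbitrary placeholders, not recommendations:

```groovy
// nextflow.config (sketch): upper bounds consulted by check_max().
// The numbers below are illustrative assumptions; choose values that
// match the largest machine the pipeline is expected to run on.
params {
    max_memory = 128.GB
    max_cpus   = 16
    max_time   = 240.h
}
```

Passing --max_memory "8.GB" on the command line then replaces the default with a string, which is why check_max coerces the value before comparing.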
Is this something that could be built into Nextflow as a new feature? Keep the current default behaviour, but add new command-line arguments and config scopes to throttle the maximum resources without failing.
Phil
As well as / instead of these manually specified limits, there could perhaps be something similar to errorStrategy:
process {
    notEnoughResources: 'terminate' // default behaviour
    notEnoughResources: 'limit' // throttle to available resources
    notEnoughResources: { // limit to specified values
        cpus: 4
        memory: 8.GB
        time: 16.h
    }
}
So, as a first step, I would suggest using this simplified version of the check_max function, which infers the argument type and takes the max value as its second argument.
def check_max(obj, max) {
    if( obj instanceof nextflow.util.MemoryUnit ) {
        try {
            def other = max as nextflow.util.MemoryUnit
            return obj.compareTo(other) == 1 ? other : obj
        }
        catch( all ) {
            println " ### ERROR ### Max memory '${max}' is not valid! Using default value: $obj"
            return obj
        }
    }
    if( obj instanceof nextflow.util.Duration ) {
        try {
            def other = max as nextflow.util.Duration
            return obj.compareTo(other) == 1 ? other : obj
        }
        catch( all ) {
            println " ### ERROR ### Max time '${max}' is not valid! Using default value: $obj"
            return obj
        }
    }
    if( obj instanceof Integer ) {
        try {
            return Math.min( obj, max as int )
        }
        catch( all ) {
            println " ### ERROR ### Max cpus '${max}' is not valid! Using default value: $obj"
            return obj
        }
    }
}
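With this two-argument version the call sites change slightly, because the limit is passed in rather than looked up by name. A sketch of how the earlier config lines might look, assuming the same params.max_* parameters as before:

```groovy
// Config sketch: the second argument is now the limit itself.
// params.max_cpus, params.max_memory and params.max_time are assumed
// to be defined elsewhere in the config or given on the command line.
cpus   = { check_max( 1 * task.attempt, params.max_cpus ) }
memory = { check_max( 8.GB * task.attempt, params.max_memory ) }
time   = { check_max( 2.h * task.attempt, params.max_time ) }
```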
Still not sure whether the best thing is to add a built-in function like the above or to implement a more advanced feature, as you are suggesting in the second comment.
Ace! Thanks for the improved function, I will use that in the short term at least 👍
Still think that the core feature would be better though 😉
Also, the try/catch should be unnecessary, so it could be simplified to:
def check_max(obj, max) {
    if( obj instanceof nextflow.util.MemoryUnit ) {
        def other = max as nextflow.util.MemoryUnit
        return obj.compareTo(other) == 1 ? other : obj
    }
    if( obj instanceof nextflow.util.Duration ) {
        def other = max as nextflow.util.Duration
        return obj.compareTo(other) == 1 ? other : obj
    }
    if( obj instanceof Integer ) {
        return Math.min( obj, max as int )
    }
    println " ### ERROR ### invalid check_max value=$obj"
    return obj
}
Still think that the core feature would be better though
I agree !
Aha, because we're now testing the type. Nice! But if supplied on the command line, will these not always be strings? I think that's why I was trying to coerce the type (and catching the exception if it failed).
I was a bit too optimistic. Yes, the try/catch is needed to handle an invalid max value (I was thinking of obj instead).
Could probably just have one try/catch statement wrapping the whole function though, at least...
Yep, but then we won't be able to have specific error messages anymore.
You could check the type of obj to customise it.. ;) But may end up not saving many lines of code..
lol
I can't resist a challenge 😆 How about this?
def check_max(obj, max) {
    def obj_types = [
        'memory': nextflow.util.MemoryUnit,
        'time': nextflow.util.Duration,
        'cpus': Integer
    ]
    // A for loop (rather than .each) so that return exits check_max itself
    for( entry in obj_types ) {
        def key = entry.key
        def obj_type = entry.value
        if( !obj_type.isInstance(obj) )
            continue
        try {
            // asType() is used because obj_type is a variable, not a type name
            def limit = max.asType(obj_type)
            return obj.compareTo(limit) == 1 ? limit : obj
        }
        catch( all ) {
            println " ### ERROR ### Max $key '${max}' is not valid! Using default value: $obj"
            return obj
        }
    }
    println " ### ERROR ### Object type not recognised! Using default value: $obj"
    return obj
}
if it works, it's good :)
Haven't tested it yet.. ;)
Copying my main request back down here as we got a bit carried away by function refactoring 😁
process {
    notEnoughResources: 'terminate' // default behaviour
    notEnoughResources: 'limit' // throttle to available resources
    notEnoughResources: { // limit to specified values
        cpus: 4
        memory: 8.GB
        time: 16.h
    }
}
I was thinking about this, but I don't understand how the resources would be modified. Would you still need to provide a dynamic (i.e. closure) rule to increase the request when the task is re-submitted, or were you thinking of something different?
Not sure I totally understand the question. But I was thinking that the syntax could remain the same as it is currently - static values would be limited (modified) at runtime by nextflow if needed to make them smaller, dynamic rules could be used with task.attempt etc and then limited if needed.
I mean, basically your proposal is to provide a mechanism that allows setting a maximum limit for some resources.
Whenever a resource request exceeds this limit (whether statically or dynamically defined), it falls back to this limit.
Is that right?
Yup!
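The rule agreed on above can be sketched in plain Groovy, using integers in place of Nextflow's memory and duration types so that it runs on its own. applyLimit is a hypothetical helper name for illustration, not an existing Nextflow function:

```groovy
// Plain-Groovy sketch of the "fall back to the limit" rule.
// applyLimit is a hypothetical name, not part of Nextflow.
def applyLimit(Comparable requested, Comparable limit) {
    // No limit configured: keep the request unchanged
    if( limit == null ) return requested
    // Otherwise the smaller of the two values wins
    return requested.compareTo(limit) > 0 ? limit : requested
}

// A static request above the limit is capped
assert applyLimit(16, 8) == 8
// A dynamic request (e.g. 2 * task.attempt) grows per attempt until it hits the limit
assert (1..5).collect { attempt -> applyLimit(2 * attempt, 8) } == [2, 4, 6, 8, 8]
```

Static and dynamic requests go through the same check, so existing closures using task.attempt would not need to change.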
Proposed syntax after discussion, with a default value and also a range (minimum and maximum values):
process {
    withName:fastqc {
        memory: { 1.GB * input.size(), range: 8.GB..24.GB },
        time: { 2.4.h * input.size(), range: 16.h..48.h },
        cpus: { 2 * input.size(), range: 4..16 }
    }
}
The input.size() dynamic computation here is just an example of when we could have a variable default. Here the maximum of the range would prevent the requested resources from going above a sensible default.
Another scenario is when people want to limit the requested resources:
process {
    memory: 4.GB
    withName:fastqc {
        memory: 16.GB, range: 8.GB..24.GB
    }
    withName:star {
        memory: 4.GB, range: 2.GB..6.GB
    }
}
Then have nextflow run --max_memory 8.GB which would work - using 8.GB memory for fastqc and 4.GB memory for star.
However, doing nextflow run --max_memory 2.GB would exit with an error because this is outside the range of fastqc.
Phil
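One possible reading of the range semantics described above, sketched in plain Groovy with whole gigabytes as integers so that it runs without Nextflow. resolveRequest and its parameters are hypothetical names, and the proposal itself does not specify this exact behaviour:

```groovy
// Sketch only: resolveRequest is a hypothetical helper, not a Nextflow API.
// Values are whole gigabytes; globalMax stands in for --max_memory.
def resolveRequest(String name, int requested, IntRange range, Integer globalMax) {
    // A global maximum below the range minimum cannot be satisfied
    if( globalMax != null && globalMax < range.from ) {
        throw new IllegalArgumentException(
            "${name}: max ${globalMax} GB is below the range minimum of ${range.from} GB")
    }
    // Clamp the request into the range, then apply the global maximum
    int value = Math.max(range.from, Math.min(requested, range.to))
    return globalMax != null ? Math.min(value, globalMax) : value
}

assert resolveRequest('fastqc', 16, 8..24, 8) == 8   // capped by a max of 8 GB
assert resolveRequest('star', 4, 2..6, 8) == 4       // already within limits
```

With a maximum of 2 GB, the fastqc call would throw, matching the error case described above, while star would still resolve to 2 GB.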
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This is still probably my number-1 feature request issue 😄 The @nf-core check_max function is a constant thorn in the side, but it's too useful to remove. Would be great if we could move this functionality into Nextflow 👍
I agree we definitely need something better here, and I need to tackle this problem soon. Pinning this issue.
Was it intentional to let this issue be auto-closed @pditommaso?
I agree we definitely need something better here, and I need to tackle this problem soon. Pinning this issue.
It seems the bot ignores the pinned label :/