I am trying to understand whether it's possible to limit the number of children a periodic job can have.
E.g. I have a periodic job running every 15 minutes which normally exits in 5-30 minutes. It is fine to have
Is there any way to limit the maximum number of children allowed?
Currently you cannot specify a run limit for periodic batch jobs. You may want to track #1782 as timeouts might be useful for your use case when implemented.
That being said, one workaround would be to use constraints, probably node_class, to limit the number of nodes the periodic batch jobs can run on. For example, if you have 5 servers with node_class=periodic that each have 30 GB of memory, and each invocation of the periodic batch job requires 3 GB: (5 servers * 30 GB) / 3 GB = 50, so at most 50 instances can run at once. Further invocations will be queued until resources are freed.
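A rough sketch of what that constraint could look like in a job file. The job, group, and task names, the datacenter, and the Docker image are hypothetical placeholders; only the `node_class` value and the 3 GB memory figure come from the example above, and the cron expression assumes the 15-minute schedule from the original question.

```hcl
# Sketch only: names, datacenter, and image are placeholders.
job "periodic-report" {
  datacenters = ["dc1"]
  type        = "batch"

  periodic {
    # Launch a new child every 15 minutes.
    cron             = "*/15 * * * *"
    prohibit_overlap = false
  }

  # Only place children on clients whose node_class is "periodic".
  constraint {
    attribute = "${node.class}"
    value     = "periodic"
  }

  group "work" {
    task "run" {
      driver = "docker"

      config {
        image = "example/report:latest"
      }

      resources {
        # 3 GB per invocation, expressed in MB. With 5 nodes of 30 GB each:
        # (5 * 30 GB) / 3 GB = 50 concurrent children at most.
        memory = 3072
      }
    }
  }
}
```

The real ceiling will likely be a little lower than 50, since some memory on each client is normally reserved or used by other workloads, and CPU can become the limiting resource before memory does.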
I'm going to close this ticket, but please feel free to open a feature request with your ideal behavior if this doesn't meet your needs.
@schmichael thank you for the quick reply and explanation, that was very helpful.
For now we will probably just disable prohibit_overlap for all running batches, as it seems to be a very dangerous feature for our environment. However, it would be great to see some limit on the maximum number of task children in the future. Thanks for the hint about node_class; however, it is not the best fit for us, as we would need to provision special "batch worker" nodes for it, and that's not something we would like to do.
Our enterprise product does have Namespaces and Quotas which would allow you to use my node_class approach without provisioning new nodes. You would launch these batch jobs in their own namespace with a resource constrained quota.
(I promise we are not intentionally leaving out the batch job limit to sell more licenses! I just wanted to offer another workaround.)
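As a hedged illustration of the quota approach, a quota specification along these lines could cap the memory available to a dedicated namespace. The quota name, description, and region are assumptions, and the memory limit simply reuses the 5 x 30 GB figure from the earlier example.

```hcl
# Sketch only: quota name and region are placeholders.
name        = "periodic-batch"
description = "Cap resources for periodic batch children"

limit {
  region = "global"

  region_limit {
    # 150 GB in MB (5 * 30 GB). At 3072 MB per child this
    # allows roughly 153600 / 3072 = 50 concurrent children.
    memory = 153600
  }
}
```

The quota would then be attached to a namespace, and the periodic job would set `namespace = "<that namespace>"` in its job file so its children are counted against the limit.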