Nomad: System jobs are not rescheduled when resources become available

Created on 29 Mar 2018 · 10 Comments · Source: hashicorp/nomad

Nomad version

0.7.1

Operating system and Environment details

Linux

Issue

When a system job is evaluated and a node has no free resources, the task for that node is queued. Later, when resources on the node become available, the system task should be allocated, but it stays queued.

Reproduction steps

  1. Start a Nomad cluster
  2. Run jobs to consume all of the CPU resources
  3. Run a system job
  4. Notice that the system jobs are Queued
  5. Stop the job started at step 2
  6. Notice that the system jobs remain Queued

Expected behaviour:
System jobs are removed from the Queued state and go into the Running state

Observed behaviour:
System jobs remain in the Queued state.

theme/system-scheduler type/enhancement

All 10 comments

By porting the blocked evaluation logic from generic_scheduler.go into system_scheduler.go I'm able to get things to work as expected.

https://github.com/maihde/nomad/tree/issue-4072
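
For readers unfamiliar with the idea, here is a rough, self-contained sketch of what "blocked evaluation" handling means in general: a placement that fails for lack of capacity is remembered and retried once capacity is freed. This is a toy model only. The `Node`, `Job`, and `Scheduler` types and all of their methods are hypothetical names invented for illustration; they are not Nomad's scheduler API and not the code in the linked branch.

```go
package main

import "fmt"

// Node is a hypothetical, simplified stand-in for a client node.
type Node struct {
	Name    string
	FreeCPU int
}

// Job is a hypothetical job with a single CPU requirement.
type Job struct {
	Name string
	CPU  int
}

// Scheduler is a toy model, not Nomad's scheduler.
type Scheduler struct {
	nodes   map[string]*Node
	running map[string][]Job // node name -> placed jobs
	blocked map[string][]Job // node name -> jobs waiting for capacity
}

func NewScheduler(nodes ...*Node) *Scheduler {
	s := &Scheduler{
		nodes:   map[string]*Node{},
		running: map[string][]Job{},
		blocked: map[string][]Job{},
	}
	for _, n := range nodes {
		s.nodes[n.Name] = n
	}
	return s
}

// Place tries to run job on the named node. If the node lacks CPU,
// the job is recorded as blocked instead of being dropped.
func (s *Scheduler) Place(nodeName string, job Job) bool {
	node := s.nodes[nodeName]
	if node.FreeCPU < job.CPU {
		s.blocked[nodeName] = append(s.blocked[nodeName], job)
		return false
	}
	node.FreeCPU -= job.CPU
	s.running[nodeName] = append(s.running[nodeName], job)
	return true
}

// Stop frees the job's CPU and then retries every blocked job on that
// node. The retry is the step the issue reports as missing.
func (s *Scheduler) Stop(nodeName, jobName string) {
	node := s.nodes[nodeName]
	kept := s.running[nodeName][:0]
	for _, j := range s.running[nodeName] {
		if j.Name == jobName {
			node.FreeCPU += j.CPU
			continue
		}
		kept = append(kept, j)
	}
	s.running[nodeName] = kept

	pending := s.blocked[nodeName]
	s.blocked[nodeName] = nil
	for _, j := range pending {
		s.Place(nodeName, j) // blocks again if it still does not fit
	}
}

// Status reports "running", "queued", or "unknown" for a job on a node.
func (s *Scheduler) Status(nodeName, jobName string) string {
	for _, j := range s.running[nodeName] {
		if j.Name == jobName {
			return "running"
		}
	}
	for _, j := range s.blocked[nodeName] {
		if j.Name == jobName {
			return "queued"
		}
	}
	return "unknown"
}

func main() {
	s := NewScheduler(&Node{Name: "n1", FreeCPU: 1000})
	s.Place("n1", Job{Name: "cpu-hog", CPU: 1000}) // step 2: consume all CPU
	s.Place("n1", Job{Name: "sys", CPU: 100})      // step 3: system job
	fmt.Println(s.Status("n1", "sys"))             // prints "queued"
	s.Stop("n1", "cpu-hog")                        // step 5: free the CPU
	fmt.Println(s.Status("n1", "sys"))             // prints "running"
}
```

The `main` function mirrors the reproduction steps above: without the retry loop in `Stop`, the second status would stay "queued", which matches the observed behaviour.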

@maihde I think you identified the problem correctly. We do want to bring a lot of the improvements from the generic scheduler to the system scheduler. Hopefully we can use some of your work!

@dadgar thanks for the feedback. If you would like a pull-request for my patch, let me know.

@maihde would be nice to get that PR up

@jippi I just opened it. Thanks.

Is there a timeline for the system scheduler rework that will include the fix for this?

@jippi @mwalters-workmarket my original pull request was closed because there was a major refactor planned to the schedulers and it was deemed easier to start afresh. If this feature is still on your roadmap I'd be happy to implement the feature against the current master and provide a new pull request.

would be cool to revive this

Following up here late, sorry. Is this still a relevant issue? I believe Nomad 0.9.4 fixed this issue with https://github.com/hashicorp/nomad/pull/5900 . Can someone confirm?

Closing this ticket, as it seems fixed and I'm unable to reproduce it now. Please re-open or open a new one if you believe this to be an error. Thanks!
