This is a follow-up feature request / bug report to this issue: https://github.com/hashicorp/nomad/issues/3593
Even after that issue was addressed, allocations still end up with duplicate indices from time to time.
For example, in the following job:
https://github.com/hashicorp/nomad/files/1574186/allocations_0.7.1.txt
Allocations 36, 19, and 25 were each scheduled twice.
If the uniqueness of the alloc index can't be guaranteed, why offer this variable for interpolation on https://www.nomadproject.io/docs/runtime/interpolation.html? It's very misleading and useless.
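For context, here is a minimal Go sketch of how we rely on that variable (the shard count of 100 and the program itself are illustrative, not part of the job above): each allocation picks its shard from NOMAD_ALLOC_INDEX, so two allocations sharing an index means one shard is processed twice and another is never processed at all.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strconv"
)

// Illustrative task: shard work by NOMAD_ALLOC_INDEX. If two allocations
// ever run with the same index, one shard is processed twice and another
// is never processed.
func main() {
	idx, err := strconv.Atoi(os.Getenv("NOMAD_ALLOC_INDEX"))
	if err != nil {
		log.Fatalf("NOMAD_ALLOC_INDEX is missing or not a number: %v", err)
	}
	totalShards := 100 // assumed to match the job's desired count
	fmt.Printf("processing shard %d of %d\n", idx, totalShards)
}
```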

@dukeland9 We are back after the break and looking at this again. I'd like to confirm one thing: in your job specification, did you ask for 100 allocs? It would be helpful if you posted the job specification here.
@dukeland9 After some more investigation, we found that there are two different things happening here:
Reusing the same alloc index for a lost allocation that was replaced - that is expected behavior. Nomad uses indexes from 0 to desired_count - 1. When one of those allocations needs to be replaced, like 19, 25 and 36 in your example (they lost their connection to the node they were on), the scheduler reuses that alloc index to create the replacement.
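To illustrate the reuse behavior (a sketch in Go, not Nomad's actual scheduler code), index selection can be thought of as taking the lowest index in [0, desired_count) that no live allocation currently holds, so a replacement naturally gets the lost allocation's index back:

```go
package main

import "fmt"

// nextAllocIndex is an illustration only, not Nomad's scheduler code:
// it returns the lowest alloc index in [0, desiredCount) that no live
// allocation currently holds.
func nextAllocIndex(desiredCount int, liveIndexes map[int]bool) int {
	for i := 0; i < desiredCount; i++ {
		if !liveIndexes[i] {
			return i
		}
	}
	return -1 // every index is in use; nothing to place
}

func main() {
	live := map[int]bool{}
	for i := 0; i < 100; i++ {
		live[i] = true
	}
	delete(live, 19) // allocation 19 was lost
	fmt.Println(nextAllocIndex(100, live)) // prints 19: the replacement reuses the index
}
```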
Creating the right number of allocations - We did find a bug in how we count whether a batch job has been fully allocated. It resulted in the scheduler not creating allocations with indexes 97, 98 and 99 to reach the desired total count of 100, because it incorrectly counted the replaced allocations (19, 25, 36) against the total number of desired running allocations (100). We have a fix for this and will comment shortly with a binary to test.
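To make the counting bug concrete, here is a rough sketch with illustrative numbers; this is not the actual scheduler code or the exact change in the fix:

```go
package main

import "fmt"

func main() {
	// Illustrative numbers only; not the actual scheduler code.
	desired := 100    // the job asks for 100 allocations
	placed := 97      // allocations created so far (indexes 0-96)
	replacedLost := 3 // 19, 25 and 36 were lost and replaced

	// Buggy accounting: the replacements are counted against the desired
	// total even though they already appear in placed, so the scheduler
	// believes nothing is left to place and never creates 97, 98 and 99.
	buggyRemaining := desired - placed - replacedLost

	// Corrected accounting: three more placements are still needed.
	fixedRemaining := desired - placed

	fmt.Println(buggyRemaining, fixedRemaining) // prints: 0 3
}
```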
Also noting that this bug is a rare edge case: it only happens when a large enough batch is requested in a CPU-contended environment and allocations are lost before the initial set of placements has completed.
Thanks once again for stress testing this in your environment.
nomad.zip
Hey, here is a Linux AMD64 binary that includes the changes from https://github.com/hashicorp/nomad/pull/3717. If you want to give it a test, that would be great!
@preetapan Thank you for investigating this issue!
@dadgar I tried to run your binary, but it failed with "error while loading shared libraries: liblxc.so.1: cannot open shared object file: No such file or directory". I want to confirm that the binary was compiled correctly; I don't think we had an lxc dependency before.
@dukeland9 Can you try this binary? I built one for you on my Linux box and verified that it does not depend on liblxc.so (the ldd output below shows this).
preetha@preetha-work ~/nomad/bin (master) $ ldd nomad
linux-vdso.so.1 => (0x00007fff60bfd000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe9a7b1c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe9a7752000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe9a7d39000)
@preetapan I can't access your file on Amazon S3 because it is blocked in China. Would you please provide one on GitHub like dadgar did? Thanks a lot.
BTW, I only have to update the binaries on the servers, right?
@preetapan Never mind, I managed to build one myself. I replaced the binary on the servers and tested running two jobs in a node-draining situation; the system appeared to work correctly.
I'll watch it run for a few more days and then give more solid feedback.
The system has been running as expected for several days. Thanks to @preetapan and @dadgar for fixing this!