Nomad v0.5.0-rc2 ('019e2442cc3ddccfa9cd485a41f2f6564f122c71+CHANGES')
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
I am unable to set the default network speed on the client. No matter what network_speed is set to, the client reports only 16 MBits available:
root@nomad-dev-1:~# curl -s http://localhost:4646/v1/node/9a0f8adb-6974-174b-c050-be0a0be77b0e | jq .Resources
{
"Networks": [
{
"DynamicPorts": null,
"ReservedPorts": null,
"MBits": 16,
"IP": "172.31.65.166",
"CIDR": "172.31.65.166/32",
"Device": "eth0"
}
],
"IOPS": 0,
"DiskMB": 42319,
"MemoryMB": 3952,
"CPU": 4800
}
Any job that requests more than 16 mbits of network resources fails placement with Dimension "network: bandwidth exceeded" exhausted on <X> nodes.
Also, setting "fingerprint.blacklist" = "network" in client options has no effect.
network_speed = 1000 in nomad client config.
2016/11/12 19:47:20.850134 [DEBUG] fingerprint.network: Detected interface eth0 with IP 172.31.65.166 during fingerprinting
2016/11/12 19:47:20.850164 [DEBUG] fingerprint.network: Unable to read link speed from /sys/class/net/eth0/speed
2016/11/12 19:47:20.850167 [DEBUG] fingerprint.network: Unable to read link speed; setting to default 1000
Using the same config, OS, etc. in another environment running nomad 0.4.1 shows the node configuration below. Not sure why eth0 is shown twice, however it allows me to place the jobs I need to ¯\_(ツ)_/¯...
root@nomad-test-1:~# curl -s http://localhost:4646/v1/node/239a4bd7-2772-5650-2d93-f524fb6e8359 | jq .Resources
{
"Networks": [
{
"DynamicPorts": null,
"ReservedPorts": null,
"MBits": 16,
"IP": "172.31.65.105",
"CIDR": "172.31.65.105/32",
"Device": "eth0"
},
{
"DynamicPorts": null,
"ReservedPorts": null,
"MBits": 1000,
"IP": "172.31.65.105",
"CIDR": "172.31.65.105/32",
"Device": "eth0"
}
],
"IOPS": 0,
"DiskMB": 42447,
"MemoryMB": 3952,
"CPU": 5000
}
2016/11/12 20:30:04.763547 [DEBUG] fingerprint.network: Detected interface eth0 with IP 172.31.65.105 during fingerprinting
2016/11/12 20:30:04.763588 [DEBUG] fingerprint.network: Unable to read link speed from /sys/class/net/eth0/speed
2016/11/12 20:30:04.763592 [DEBUG] fingerprint.network: Unable to read link speed; setting to default 1000
Are you on AWS? If yes, I think it might be env_aws that sets the default speed, doesn't allow overriding it and doesn't say anything in the logs, not even in DEBUG (which sounds to me like a bug).
Since you're running 0.5.0-rc2 you can try blacklisting env_aws by adding this to your client configuration:
options = {
"fingerprint.blacklist" = "env_aws"
}
Then network_speed will work (at least it did for me).
Thanks for the tip @omame! Indeed I am on AWS, and blacklisting env_aws does allow me to hack my way around the problem for now.
From what I can tell, it looks like the recent 87201fa510dac0fc586caffe1b5baa244a00b225 commit has not only changed the order so environmental fingerprinters like env_aws run _after_ host-level ones, but it also has env_aws _setting_ node.Resources.Networks instead of appending to it, therefore causing host-level network settings (and defaults) to be overwritten.
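To make the set-versus-append difference concrete, here is a minimal Go sketch. It is a toy model of the behaviour described in this thread, not code from the Nomad source: the `Node` and `NetworkResource` types, every function name, and the hard-coded 16 MBits value are hypothetical stand-ins. It also assumes that in 0.4.1 the environment fingerprinter ran first and both fingerprinters appended, which is only inferred from the two-entry output above.

```go
package main

import "fmt"

// NetworkResource is a simplified, hypothetical stand-in for one entry
// under Resources.Networks in the node JSON; it is not Nomad's own type.
type NetworkResource struct {
	Device string
	MBits  int
}

// Node is an equally simplified stand-in for the client's node record.
type Node struct {
	Networks []NetworkResource
}

// awsMBits is the value the AWS fingerprinter reports in this thread.
const awsMBits = 16

// hostAppend models the host-level network fingerprinter adding an entry
// that carries the configured (or default) speed.
func hostAppend(n *Node, configured int) {
	n.Networks = append(n.Networks, NetworkResource{Device: "eth0", MBits: configured})
}

// envAppend models an environment fingerprinter that appends its entry.
func envAppend(n *Node) {
	n.Networks = append(n.Networks, NetworkResource{Device: "eth0", MBits: awsMBits})
}

// envSet models an environment fingerprinter that replaces the whole slice,
// discarding whatever an earlier fingerprinter recorded.
func envSet(n *Node) {
	n.Networks = []NetworkResource{{Device: "eth0", MBits: awsMBits}}
}

// mbits extracts the MBits values in order, for easy comparison.
func mbits(n *Node) []int {
	out := make([]int, 0, len(n.Networks))
	for _, nw := range n.Networks {
		out = append(out, nw.MBits)
	}
	return out
}

// oldOrder: environment fingerprinter first, both append (assumed 0.4.1 behaviour).
func oldOrder(configured int) []int {
	n := &Node{}
	envAppend(n)
	hostAppend(n, configured)
	return mbits(n)
}

// newOrder: host fingerprinter first, then the environment one overwrites.
func newOrder(configured int) []int {
	n := &Node{}
	hostAppend(n, configured)
	envSet(n)
	return mbits(n)
}

func main() {
	fmt.Println("0.4.1-style:", oldOrder(1000))     // prints: 0.4.1-style: [16 1000]
	fmt.Println("0.5.0-rc2-style:", newOrder(1000)) // prints: 0.5.0-rc2-style: [16]
}
```

Under these assumptions the first ordering reproduces the duplicated eth0 entry seen on the 0.4.1 node, and the second leaves only the 16 MBits entry, which matches the 0.5.0-rc2 output.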
Hey all
Just wanted to update this issue. The way I see it is there are two problems:
1) AWS fingerprinter gets the wrong interface speed
2) network_speed should be an override and not a default
Both of these will be addressed in 0.5.
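For point 2, the distinction between a default and an override can be sketched in a few lines of Go. This only illustrates the two semantics and is not how Nomad implements them: both function names are hypothetical, and treating 0 as "not detected" or "not configured" is an assumption made for the example.

```go
package main

import "fmt"

// speedAsDefault uses the configured value only when detection found nothing.
// A detected value of 0 is assumed to mean "no speed detected".
func speedAsDefault(detected, configured int) int {
	if detected > 0 {
		return detected
	}
	return configured
}

// speedAsOverride lets the configured value win whenever it is set.
// A configured value of 0 is assumed to mean "not configured".
func speedAsOverride(detected, configured int) int {
	if configured > 0 {
		return configured
	}
	return detected
}

func main() {
	// Values from this thread: AWS fingerprinting yields 16, the operator wants 1000.
	fmt.Println(speedAsDefault(16, 1000))  // prints: 16
	fmt.Println(speedAsOverride(16, 1000)) // prints: 1000
}
```

With default semantics the operator's 1000 is ignored as soon as any fingerprinter reports a speed, which is the symptom reported here. With override semantics the configured value always applies.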