Nomad servers and clients are both running the following build:
Nomad v0.11.1 (b43457070037800fcc8442c8ff095ff4005dab33)
Operating system: Amazon Linux 2, kernel 4.14.173-137.229.amzn2.x86_64
While running the EBS CSI plugin, I have noticed that Nomad expects plugin tasks that have completed to still report as healthy:
$ nomad plugin status aws-ebs4
ID                   = aws-ebs4
Provider             = ebs.csi.aws.com
Version              = v0.6.0-dirty
Controllers Healthy  = 1
Controllers Expected = 2
Nodes Healthy        = 3
Nodes Expected       = 4

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
738adb4b  46e6db9e  controller  4        run      running   33m59s ago  31m16s ago
a999a840  4470dc51  controller  3        stop     complete  32m4s ago   31m25s ago
9290e85e  46e6db9e  nodes       0        run      running   42m23s ago  42m16s ago
eed3459a  ec4c06b3  nodes       0        stop     complete  42m23s ago  35m8s ago
d9ecfc6b  4470dc51  nodes       0        run      running   42m23s ago  42m8s ago
ad2698aa  eaac2f32  nodes       0        run      running   37m49s ago  37m31s ago
This seems unusual: once a CSI plugin task has completed, it should no longer be counted as expected to be running and healthy. When this mismatch between healthy and expected plugin task counts occurs, any task that needs to attach a CSI volume using the plugin in question is unable to do so. Instead of successfully mounting the volume, the following error occurs:
failed to setup alloc: pre-run hook "csi_hook" failed: rpc error: code = InvalidArgument desc = Device path not provided
The plugin returns this error when it is missing information in the PublishContext passed to a NodePublishVolume/NodeStageVolume RPC, as seen here.
The PublishContext is returned by a ControllerPublishVolume RPC; however, after checking the logs of my controller plugin, it turns out ControllerPublishVolume is never called.
Again, this only occurs when there is a mismatch between the healthy and expected counts. Otherwise ControllerPublishVolume is called when a task requesting a CSI volume is scheduled, and the volume is attached successfully.
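For example, grepping the controller plugin's logs for the RPC name (the allocation ID and task name below are taken from the plugin status output above) turns up nothing while the mismatch is present:
# controller alloc ID and task name from the status output above
$ nomad alloc logs -stderr 738adb4b plugin | grep ControllerPublishVolume
# (no output while the healthy/expected mismatch is present)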
The easiest way to create a healthy/expected value mismatch is to increase the number of controller tasks to 2 and then decrease it back to 1. To reproduce:
1. Run the CSI controller plugin job:
job "plugin-aws-ebs-controller" {
datacenters = ["dc1"]
group "controller" {
task "plugin" {
driver = "docker"
config {
image = "amazon/aws-ebs-csi-driver:latest"
args = [
"controller",
"--endpoint=unix://csi/csi.sock",
"--logtostderr",
"--v=5",
]
}
csi_plugin {
id = "aws-ebs0"
type = "controller"
mount_dir = "/csi"
}
resources {
cpu = 500
memory = 256
}
# ensuring the plugin has time to shut down gracefully
kill_timeout = "2m"
}
}
}
2. Run the CSI node plugin job (commands for running both plugin jobs are sketched after the spec):
job "plugin-aws-ebs-nodes" {
datacenters = ["dc1"]
# you can run node plugins as service jobs as well, but this ensures
# that all nodes in the DC have a copy.
type = "system"
group "nodes" {
task "plugin" {
driver = "docker"
config {
image = "amazon/aws-ebs-csi-driver:latest"
args = [
"node",
"--endpoint=unix://csi/csi.sock",
"--logtostderr",
"--v=5",
]
# node plugins must run as privileged jobs because they
# mount disks to the host
privileged = true
}
csi_plugin {
id = "aws-ebs0"
type = "node"
mount_dir = "/csi"
}
resources {
cpu = 500
memory = 256
}
# ensuring the plugin has time to shut down gracefully
kill_timeout = "2m"
}
}
}
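Assuming the two specs above are saved as plugin-aws-ebs-controller.nomad and plugin-aws-ebs-nodes.nomad (both filenames are just placeholders), they can be submitted and the plugin checked with:
$ nomad job run plugin-aws-ebs-controller.nomad
$ nomad job run plugin-aws-ebs-nodes.nomad
# wait until Controllers Healthy = 1 and Nodes Healthy matches your client count
$ nomad plugin status aws-ebs0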
3. Create and register an EBS volume with Nomad, e.g. following https://learn.hashicorp.com/nomad/stateful-workloads/csi-volumes (a minimal registration spec is sketched below).
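A minimal registration sketch, assuming the aws-ebs0 plugin ID from the specs above and a placeholder EBS volume ID; the id and name must match the source = "mysql" used in the MySQL job below. Register it with nomad volume register volume.hcl:
# volume.hcl (filename is a placeholder)
type            = "csi"
id              = "mysql"
name            = "mysql"
external_id     = "vol-0123456789abcdef0"  # placeholder; use your EBS volume ID
plugin_id       = "aws-ebs0"
access_mode     = "single-node-writer"
attachment_mode = "file-system"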
4. Optionally run the example MySQL job to verify that volumes can be attached successfully. Be sure to use constraints to run the task using the volume in the same availability zone as your EBS volume (a constraint sketch follows the job spec below).
job "mysql-server" {
datacenters = ["dc1"]
type = "service"
group "mysql-server" {
count = 1
volume "mysql" {
type = "csi"
read_only = false
source = "mysql"
}
restart {
attempts = 10
interval = "5m"
delay = "25s"
mode = "delay"
}
task "mysql-server" {
driver = "docker"
volume_mount {
volume = "mysql"
destination = "/srv"
read_only = false
}
env = {
"MYSQL_ROOT_PASSWORD" = "password"
}
config {
image = "hashicorp/mysql-portworx-demo:latest"
args = ["--datadir", "/srv/mysql"]
port_map {
db = 3306
}
}
resources {
cpu = 500
memory = 1024
network {
port "db" {
static = 3306
}
}
}
service {
name = "mysql-server"
port = "db"
check {
type = "tcp"
interval = "10s"
timeout = "2s"
}
}
}
}
}
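The job above omits the availability-zone constraint mentioned in step 4. A minimal sketch, assuming Nomad's AWS fingerprint attribute and us-east-1a as a placeholder zone, added at the job or group level:
constraint {
  attribute = "${attr.platform.aws.placement.availability-zone}"
  value     = "us-east-1a"  # placeholder; use the AZ your EBS volume lives in
}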
5. Increase the controller plugin task group's count to 2 and wait for the new task to become healthy, then scale back down to 1 task and wait for the extra task to complete.
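For example, assuming the controller spec is saved as plugin-aws-ebs-controller.nomad (the group count defaults to 1 when unset):
# add `count = 2` to the "controller" group, then re-run the job
$ nomad job run plugin-aws-ebs-controller.nomad
# once both controller allocations are healthy, set count back to 1 and re-run
$ nomad job run plugin-aws-ebs-controller.nomad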
6. Run nomad plugin status. You should see mismatched healthy/expected values for the controller plugins, e.g.:
Container Storage Interface
ID        Provider         Controllers Healthy/Expected  Nodes Healthy/Expected
aws-ebs4  ebs.csi.aws.com  1/2                           3/4
I am also seeing issues where plugins with no running jobs are not being garbage collected as described here:
https://github.com/hashicorp/nomad/issues/7743
$ nomad plugin status
Container Storage Interface
ID        Provider         Controllers Healthy/Expected  Nodes Healthy/Expected
aws-ebs0  ebs.csi.aws.com  0/3                           0/29
aws-ebs2  ebs.csi.aws.com  0/2                           0/25
aws-ebs3  ebs.csi.aws.com  0/2                           0/3
aws-ebs4  ebs.csi.aws.com  1/2                           3/4
I'm not sure if this is related, but I figured it was worth mentioning.
Hi @tydomitrovich! Thanks for the thorough reproduction!
This seems unusual since if a CSI plugin has completed it should no longer be expected to be running and healthy. When this mismatch between healthy and expected plugin task counts occurs, all tasks that need to attach a CSI volume using the plugin in question are unable to do so.
Yeah, agreed that this is totally a bug. That'll impact updates to plugins too, I think. I don't have a good workaround for you at the moment but I'll dig in and see if I can come up with a fix shortly.
Hello @tgross, thanks for taking a look! I'll be monitoring this, as I'm really excited about using the new CSI features.
I'm working up a PR, https://github.com/hashicorp/nomad/pull/7844, which should clear this up. I need to check a few more things, but I'm making good progress on it.