Getting a "volume in use" error when trying to deregister, long after the volume has stopped being in use. Possibly a recurrence or variation of https://github.com/hashicorp/nomad/issues/7625, but this time my whole cluster is running Nomad 0.11.2, whereas that issue was on a 0.11.0 RC.
Nomad version: v0.11.2
Operating system: Ubuntu 16
I have jobs that have been stopped for over a day, but they're still somehow preventing me from deregistering the CSI volume that was associated with them.
[hashidev] hwilkins ~ $ nomad volume deregister test_default_1a
Error deregistering volume: Unexpected response code: 500 (volume in use: test_default_1a)
[hashidev] hwilkins ~ $ nomad volume status test_default_1a
ID = test_default_1a
Name = test_default_1a
External ID = vol-<SCRUB>
Plugin ID = aws-ebs0
Provider = ebs.csi.aws.com
Version = v0.6.0-dirty
Schedulable = false
Controllers Healthy = 0
Controllers Expected = 0
Nodes Healthy = 0
Nodes Expected = 0
Access Mode = single-node-writer
Attachment Mode = file-system
Mount Options = <none>
Namespace = default
Allocations
ID Node ID Task Group Version Desired Status Created Modified
090982bb fed5596e csi-ebs-hello-world 0 run complete 1d3h ago 1d3h ago
[hashidev] hwilkins ~ $ nomad alloc status 090982bb
No allocation(s) with prefix or id "090982bb" found
[hashidev] hwilkins ~ $ nomad status csi-ebs-hello-world
Multiple matches found for id "csi-ebs-hello-world"
Volumes:
csi-ebs-hello-world-ebs_1a, csi-ebs-hello-world_1a
[hashidev] hwilkins ~ $ nomad status csi-ebs-hello-world-ebs_1a
ID = csi-ebs-hello-world-ebs_1a
Name = csi-ebs-hello-world-ebs_1a
External ID = vol-<SCRUB>
Plugin ID = aws-ebs0
Provider = ebs.csi.aws.com
Version = v0.6.0-dirty
Schedulable = false
Controllers Healthy = 0
Controllers Expected = 0
Nodes Healthy = 0
Nodes Expected = 0
Access Mode = single-node-writer
Attachment Mode = file-system
Mount Options = <none>
Namespace = default
Allocations
ID Node ID Task Group Version Desired Status Created Modified
7eba8625 fed5596e csi-ebs-hello-world 0 run complete 1d1h ago 1d1h ago
[hashidev] hwilkins ~ $ nomad status csi-ebs-hello-world_1a
ID = csi-ebs-hello-world_1a
Name = csi-ebs-hello-world_1a
External ID = vol-<SCRUB>
Plugin ID = aws-ebs0
Provider = ebs.csi.aws.com
Version = v0.6.0-dirty
Schedulable = false
Controllers Healthy = 0
Controllers Expected = 0
Nodes Healthy = 0
Nodes Expected = 0
Access Mode = single-node-writer
Attachment Mode = file-system
Mount Options = <none>
Namespace = default
Allocations
ID Node ID Task Group Version Desired Status Created Modified
af81bebc fed5596e csi-ebs-hello-world 0 run complete 1d2h ago 1d2h ago
I've also verified via the AWS console that the volume itself is not attached to an instance, but in the available state.
May 27 04:58:50 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T04:58:50.174Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume in use: test_default_1a"
May 27 04:58:50 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T04:58:50.174Z [ERROR] http: request failed: method=DELETE path=/v1/volume/csi/test_default_1a error="volume in use: test_default_1a" code=500
May 27 04:58:50 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T04:58:50.174Z [DEBUG] http: request complete: method=DELETE path=/v1/volume/csi/test_default_1a duration=1.07008ms
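One way to see exactly which claims the server is still holding is to read the volume back through the same /v1/volume/csi path shown in the logs above and inspect its allocation maps. A minimal sketch, assuming the ReadAllocs/WriteAllocs field names used by this Nomad series, jq installed, and a token only needed if ACLs are enforced:
$ curl -s -H "X-Nomad-Token: ${NOMAD_TOKEN}" \
    "${NOMAD_ADDR:-http://localhost:4646}/v1/volume/csi/test_default_1a" \
  | jq '{ReadAllocs, WriteAllocs}'
# Any allocation IDs listed in these maps are the claims that block deregistration.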
I also just saw the -purge flag and tried that:
[hashidev] hwilkins ~ $ nomad status csi-ebs-hello-world-ebs_1a
ID = csi-ebs-hello-world-ebs_1a
Name = csi-ebs-hello-world-ebs_1a
External ID = vol-<SCRUB>
Plugin ID = aws-ebs0
Provider = ebs.csi.aws.com
Version = v0.6.0-dirty
Schedulable = false
Controllers Healthy = 0
Controllers Expected = 0
Nodes Healthy = 0
Nodes Expected = 0
Access Mode = single-node-writer
Attachment Mode = file-system
Mount Options = <none>
Namespace = default
Allocations
ID Node ID Task Group Version Desired Status Created Modified
7eba8625 fed5596e csi-ebs-hello-world 0 run complete 1d1h ago 1d1h ago
[hashidev] hwilkins ~ $ nomad stop -purge csi-ebs-hello-world-ebs_1a
No job(s) with prefix or id "csi-ebs-hello-world-ebs_1a" found
It's also probably worth noting that the node in question, where these old allocs were held, had been deployed with 0.11.1, but those nodes have since been cycled out and replaced by nodes running 0.11.2.
[hashidev] hwilkins ~ $ nomad node status fed5596e
error fetching node stats: Unexpected response code: 404 (No path to node)
ID = fed5596e-968a-a260-e819-0b7abf8f64bf
Name = hashidev-worker-<SCRUB>
Class = <none>
DC = us-east-1
Drain = false
Eligibility = ineligible
Status = down
CSI Controllers = aws-ebs0
CSI Drivers = aws-ebs0
Host Volumes = <none>
CSI Volumes = <none>
Driver Status = docker
Node Events
Time Subsystem Message
2020-05-26T05:12:58Z Cluster Node heartbeat missed
2020-05-26T05:12:23Z CSI failed fingerprinting with error: rpc error: code = Canceled desc = context canceled
2020-05-26T05:12:22Z CSI failed fingerprinting with error: rpc error: code = Canceled desc = context canceled
2020-05-26T05:12:22Z Drain Node drain complete
2020-05-26T05:12:21Z Drain Node drain strategy set
2020-05-26T04:00:11Z Storage Unmount volume
2020-05-26T04:00:04Z Storage Mount volume
2020-05-26T03:00:07Z Storage Unmount volume
2020-05-26T03:00:04Z Storage Mount volume
2020-05-26T02:40:09Z Storage Unmount volume
Allocated Resources
CPU Memory Disk
0/4250 MHz 0 B/6.4 GiB 0 B/65 GiB
Allocation Resource Utilization
CPU Memory
0/4250 MHz 0 B/6.4 GiB
error fetching node stats: actual resource usage not present
Allocations
No allocations placed
So first, I'm looking for a way to forcibly get rid of the reference to this volume, and second, to lodge this as a bug :).
I also forced a nomad system gc many hours ago, but that doesn't seem to have cleared it up either.
I'm a little confused by the CLI outputs you've provided here. It looks like you're showing me three volumes:
test_default_1a, claimed by alloc 090982bb
csi-ebs-hello-world-ebs_1a, claimed by alloc 7eba8625
csi-ebs-hello-world_1a, claimed by alloc af81bebc
But then you're trying to run a job stop on the volume, which isn't an operation supported on volumes:
$ nomad stop -purge csi-ebs-hello-world-ebs_1a
No job(s) with prefix or id "csi-ebs-hello-world-ebs_1a" found
By any chance do the jobs have the same name as the volume? I'm wondering if using nomad status :id without the more specific subcommands nomad volume status :id or nomad job status :id is confusing matters here.
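For example (using the IDs from this thread), the generic command fuzzy-matches across object types, while the subcommands are unambiguous:
$ nomad status csi-ebs-hello-world                 # may match jobs, volumes, etc.
$ nomad volume status csi-ebs-hello-world-ebs_1a   # volumes only
$ nomad job status csi-ebs-hello-world             # jobs only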
Moving on to some observations:
I've also verified via the AWS console that the volume itself is not attached to an instance, but in the available state.
It looks like all 3 allocs that claim these 3 volumes are complete, so having the volume unattached is a good sign. Do you have any logs from when the volumes were detached? That might give us some insight.
This is a little weird:
$ nomad alloc status 090982bb
No allocation(s) with prefix or id "090982bb" found
I'm not sure how we could get into a state where the job would be GC'd out from under us but the allocation could still be queried by the volume API to give us information like complete status. What I'd want to look at is the jobs that are claiming the volumes, to see what they show for those allocations.
2020-05-26T05:12:23Z CSI failed fingerprinting with error: rpc error: code = Canceled desc = context canceled
Is the aws-ebs0 plugin still running and healthy on all the nodes? Can you provide a nomad plugin status aws-ebs0?
Hey @tgross , thanks for the response.
You're right that there are 3 volumes that all claim the same EBS volume ID. I was experimenting while trying to develop my automation, and ended up creating 3 different volumes for the same EBS volume.
The job was previously called simply csi-ebs-hello-world; I don't think I ever created a job with the other suffixes, so those are probably only in the volume names.
[hashidev] hwilkins ~ $ nomad plugin status aws-ebs0
ID = aws-ebs0
Provider = ebs.csi.aws.com
Version = v0.6.0-dirty
Controllers Healthy = 2
Controllers Expected = 2
Nodes Healthy = 4
Nodes Expected = 4
Allocations
ID Node ID Task Group Version Desired Status Created Modified
8b9cd937 ff38c034 nodes 0 run running 13h29m ago 13h29m ago
e2430e2c f1b03f8f nodes 0 run running 13h29m ago 13h29m ago
94c3e6b7 00f7f888 nodes 0 run running 13h29m ago 13h29m ago
81d32416 d733c1a3 nodes 0 run running 13h29m ago 13h29m ago
e1c0f03b f1b03f8f controller 11 run running 13h29m ago 13h29m ago
62504f57 00f7f888 controller 11 run running 13h29m ago 13h29m ago
However, note that this plugin was broken for some time as I transitioned it to use Nomad ACLs (which I may be doing incorrectly, since how to do that isn't documented; the only doc I could find assumed you didn't have Nomad ACLs). I fixed it around 12h ago, and I still can't deregister the volume(s). I'm now wondering if this has something to do with ACLs. I found some more logs on the Nomad leader when I attempt to deregister the volume:
May 27 21:04:35 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:35.726Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume in use: csi-ebs-hello-world_1a"
May 27 21:04:35 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:35.726Z [ERROR] http: request failed: method=DELETE path=/v1/volume/csi/csi-ebs-hello-world_1a error="volume in use: csi-ebs-hello-world_1a" code=500
May 27 21:04:35 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:35.726Z [DEBUG] http: request complete: method=DELETE path=/v1/volume/csi/csi-ebs-hello-world_1a duration=1.269438ms
May 27 21:04:36 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:36.764Z [DEBUG] http: request complete: method=GET path=/v1/status/leader duration=131.481µs
May 27 21:04:38 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:38.777Z [DEBUG] http: request complete: method=GET path=/v1/status/leader duration=123.815µs
May 27 21:04:40 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:40.549Z [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=server duration=148.313µs
May 27 21:04:41 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:41.768Z [DEBUG] http: request complete: method=GET path=/v1/status/leader duration=129.428µs
May 27 21:04:43 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:43.778Z [DEBUG] http: request complete: method=GET path=/v1/status/leader duration=121.493µs
May 27 21:04:44 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:44.547Z [DEBUG] nomad: memberlist: Stream connection from=127.0.0.1:33592
May 27 21:04:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:46.770Z [DEBUG] http: request complete: method=GET path=/v1/status/leader duration=131.599µs
May 27 21:04:48 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:48.781Z [DEBUG] http: request complete: method=GET path=/v1/status/leader duration=117.261µs
May 27 21:04:50 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:50.550Z [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=server duration=135.271µs
May 27 21:04:51 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:51.217Z [DEBUG] worker: dequeued evaluation: eval_id=8d257055-07ce-ced6-a1e1-ae7f47ebcaf6
May 27 21:04:51 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:51.217Z [DEBUG] core.sched: CSI plugin GC scanning before cutoff index: index=1733620 csi_plugin_gc_threshold=1h0m0s
May 27 21:04:51 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:51.217Z [ERROR] core.sched: failed to GC plugin: plugin_id=aws-ebs0 error="Permission denied"
May 27 21:04:51 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:51.217Z [ERROR] worker: error invoking scheduler: error="failed to process evaluation: Permission denied"
May 27 21:04:51 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:51.217Z [DEBUG] worker: nack evaluation: eval_id=8d257055-07ce-ced6-a1e1-ae7f47ebcaf6
May 27 21:04:51 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:04:51.217Z [WARN] nomad: eval reached delivery limit, marking as failed: eval="<Eval "8d257055-07ce-ced6-a1e1-ae7f47ebcaf6" JobID: "csi-plugin-gc" Namespace: "-">"
I also see some of this:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.412Z [DEBUG] worker: dequeued evaluation: eval_id=9e17f245-8036-101c-a6f7-1ec0a12b61e2
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.412Z [DEBUG] core.sched: node GC scanning before cutoff index: index=1721801 node_gc_threshold=24h0m0s
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.412Z [DEBUG] worker: ack evaluation: eval_id=9e17f245-8036-101c-a6f7-1ec0a12b61e2
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.414Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=csi-ebs-hello-world-ebs_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.414Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=csi-ebs-hello-world_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.414Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=test_default_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.416Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=test_default_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.416Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=csi-ebs-hello-world-ebs_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.416Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=csi-ebs-hello-world_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.417Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=test_default_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.417Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=csi-ebs-hello-world-ebs_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: 2020-05-27T21:00:46.418Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=default volume_id=csi-ebs-hello-world_1a error="1 error occurred:
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: * could not detach from node: node lookup failed: index error: UUID must be 36 characters
May 27 21:00:46 hashidev-nomad-<SCRUB> nomad[8186]: "
Hey @tgross , I've sent my plugin job definitions to nomad-oss-debug. Here's my Nomad anonymous ACL policy
namespace "default" {
policy = "read"
capabilities = [
# I want to get rid of this line
"csi-mount-volume",
"submit-job",
]
}
# View status of Nomad servers
agent {
policy = "read"
}
# View status of Nomad clients
node {
policy = "read"
}
# manage Raft cluster status
operator {
policy = "read"
}
# manage namespace resource governing
quota {
policy = "read"
}
# read plugins, like the CSI plugin
plugin {
policy = "read"
}
Hey @tgross, any thoughts on how to work around this bug? I can't seem to get Nomad to "let go" of these volume definitions because it's still tracking those allocations. And of course none of the allocations exist when I run nomad alloc status on them.
Not sure if it's helpful or not, but I also tried deploying a job to use this volume, since it's theoretically defined, and here's the relevant placement failure:
Constraint "CSI volume csi-ebs-hello-world-ebs_1a has exhausted its available writer claims": 1 nodes excluded by filter
Note that I've also recycled all nodes in the cluster at this point. The only thing I haven't done is totally destroy & recreate the Nomad raft DB from my deployment scripts. I'm fairly confident that would fix it, but I'd really like to figure out a proper fix in case we hit this in production at some point. However, if I can't get this resolved in the next day, a clean wipe of the DB is what I'm going to have to do so I can keep going on my automation here.
I got this to happen again. It seems like the problem is related to draining a node. As in: my job using the EBS volume was running on node x. I issued a drain on node x, and the job tried to move to node y. However, the node plugin job (which is a system job, as prescribed) didn't unmount the volume from the node. Then, when I terminated the drained node, Nomad still thought that the volume's writer claims were exhausted, even though it wasn't claimed by any node (since the node originally claiming it had been destroyed).
Constraint "CSI volume csi-ebs-staticvolume-alt-service-test has exhausted its available writer claims": 1 nodes excluded by filter
[hashidev] hwilkins ~ $ aws ec2 describe-volumes --volume-ids vol-00b...
{
"Volumes": [
{
"Attachments": [],
"AvailabilityZone": "us-east-1c",
"CreateTime": "2020-05-18T02:39:41.645000+00:00",
"Encrypted": true,
"KmsKeyId": "arn:aws:kms:us-east-1:<NUM>:key/<KEY>",
"Size": 1,
"SnapshotId": "",
"State": "available",
"VolumeId": "vol-00b...",
"Iops": 100,
"Tags": [
{
"Key": "Hashistack",
"Value": "hashidev..."
},
{
"Key": "JobName",
"Value": "csi-ebs-test-default"
}
],
"VolumeType": "gp2",
"MultiAttachEnabled": false
}
]
}
[hashidev] hwilkins ~ $ nomad volume status csi-ebs-staticvolume-hello-world-vol1
ID = csi-ebs-staticvolume-hello-world-vol1
Name = csi-ebs-staticvolume-hello-world-vol1
External ID = vol-00b...
Plugin ID = aws-ebs0
Provider = ebs.csi.aws.com
Version = v0.6.0-dirty
Schedulable = false
Controllers Healthy = 1
Controllers Expected = 1
Nodes Healthy = 0
Nodes Expected = 0
Access Mode = single-node-writer
Attachment Mode = file-system
Mount Options = <none>
Namespace = default
Allocations
ID Node ID Task Group Version Desired Status Created Modified
80da1a5b 7c981673 staticvolume-hello-world 7 stop complete 12h46m ago 4h7m ago
17596645 0e1c4bcf staticvolume-hello-world 7 stop complete 18h9m ago 12h46m ago
[hashidev] hwilkins ~ $ nomad volume deregister csi-ebs-staticvolume-hello-world-vol1
Error deregistering volume: Unexpected response code: 500 (rpc error: volume in use: csi-ebs-staticvolume-hello-world-vol1)
I faced a similar issue when trying to deregister an absent volume.
Here's how to reproduce: stop the job that was using the volume with the -purge flag. Nomad then still thinks the volume is there in the UI/CLI, and I'm not able to deregister it:
$ nomad system gc # has no effect
$ nomad volume status
Container Storage Interface
ID Name Plugin ID Schedulable Access Mode
mysql0 mysql0 csi.hetzner.cloud false single-node-writer
$ nomad volume deregister mysql0
Error deregistering volume: Unexpected response code: 500 (volume in use: mysql0)
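For context, the volume here would have been registered with a spec along the lines of the sketch below (the external ID is a placeholder; the field names are the ones nomad volume register accepts in the 0.11 series):
# mysql0.hcl (sketch), registered with: nomad volume register mysql0.hcl
id              = "mysql0"
name            = "mysql0"
type            = "csi"
external_id     = "<hetzner-volume-id>"   # placeholder
plugin_id       = "csi.hetzner.cloud"
access_mode     = "single-node-writer"
attachment_mode = "file-system"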
Obviously my bad for doing things in the wrong order, but I'd still be glad to have something like a nomad volume deregister -force flag for cases like this.
@holtwilkins I'm looking into this, but in the meantime Nomad 0.11.3 shipped with some improvements to the volume claim reaping (ref https://github.com/hashicorp/nomad/blob/master/CHANGELOG.md#0113-june-5-2020), so it might be worth your while to check that out.
Thanks @tgross , I've reproduced this same issue on Nomad 0.11.3 shortly after it was released, unfortunately.
It seems like the problem is related to draining a node. As in, my job using the EBS volume was running on node x, say. I issue a drain on node x, and the job tries to move to node y. However, the node plugin job (which is a system job as prescribed) didn't unmount the volume from the node.
Ok, so this is probably related to https://github.com/hashicorp/nomad/issues/8232 and https://github.com/hashicorp/nomad/issues/8080
The advice I gave to folks in those issues applies here as well, but that only helps prevent the issue in the future by reducing the risk of triggering the race condition between plugin teardown and volume claim reaping:
Use the -ignore-system flag while node draining to ensure that the node plugin stays on the host even after it's done draining, so as to avoid the race with the volumewatcher (example command below).
I've also opened https://github.com/hashicorp/nomad/issues/8251 for the nomad volume deregister -force flag suggestion, which is something I've been meaning to do for a while.
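Concretely, such a drain might look like this (the node ID is a placeholder):
$ nomad node drain -enable -ignore-system -yes <node-id>
# System jobs, including the CSI node plugin, keep running during the drain,
# so the plugin is still available to unmount and unpublish the volume.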
Hi,
I've also encountered the same problem. We have Nomad client nodes in an ASG, and when they got rotated/replaced we were stuck with multiple clients in the "ineligible" state even though they were long terminated. Removing the job using the volume, the volume itself, and the plugin did not help.
It's probably worth mentioning that, as part of autoscaling, clients are automatically drained by a Lambda before termination proceeds, and that worked as expected.
Every time GC runs or one of the servers is restarted, there are multiple messages like this:
Jun 23 15:44:26 nomad[8303]: 2020-06-23T15:44:26.510Z [ERROR] nomad.fsm: DeleteNode failed: error="csi plugin delete failed: csi_plugins missing plugin aws-ebs0"
I've even changed the defaults on the servers to:
node_gc_threshold = "30s"
csi_plugin_gc_threshold = "30s"
but that just made the messages appear more often, not only when forcing GC via the command line.
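For reference, these thresholds live in the server block of the agent configuration; a minimal sketch:
# server agent configuration (sketch)
server {
  enabled = true

  node_gc_threshold       = "30s"
  csi_plugin_gc_threshold = "30s"
}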
Cheers
Is there a workaround for "unsticking" volumes stuck in limbo like this? I have one that's currently attached to a five-days-dead job that I don't know how to reclaim.
Is there a workaround for "unsticking" volumes stuck in limbo like this? I have one that's currently attached to a five-days-dead job that I don't know how to reclaim.
Update: my five-days-dead job is now over two weeks dead, and the EBS volume is still attached.
To be blunt, I don't know that CSI support can even be called "beta" at this point. The current bugs are severe enough to make CSI effectively unusable, and it doesn't appear from the release notes that any progress was made on it in 0.12.
Hello, I'm facing a similar issue. Is there a release with the force option for Nomad 0.11?
It seems that http://localhost:4646/v1/volume/csi/id_of_the_volume does not update the write allocs for non-existent allocations.
This prevents any job from using the volume.
The only solution I had was to register the same volume with another ID.
I have the same message and can't detach or attach my volume, which has write allocations linked in the volume API, but these allocations do not exist in the allocation API:
could not detach from node: node lookup failed: index error: UUID must be 36 characters
The error seems to come from https://github.com/hashicorp/nomad/blob/master/nomad/state/state_store.go#L723
thrown by https://github.com/hashicorp/nomad/blob/master/nomad/volumewatcher/volume_watcher.go#L312
Could it be that an empty NodeID is preventing the volume from being detached and the lost allocations from being removed?
Could an empty NodeID be caused by a lost node?
EDIT: my volume isn't attached, it is just not claimable.
Is there a workaround for "unsticking" volumes stuck in limbo like this?
The nomad volume deregister -force option was merged for 0.12.0-rc1, but unfortunately it looks like there's a bug in it whose fix isn't going to make it into the GA (see https://github.com/hashicorp/nomad/pull/8380)
To be blunt, I don't know that CSI support can even be called "beta" at this point. The current bugs are severe enough to make CSI effectively unusable, and it doesn't appear from the release notes that any progress was made on it in 0.12.
I really can't disagree with you there, @jfcantu. The CSI project had been my primary focus, but we were pretty short-handed for it during development. As you might note from some of the other open issues with the storage tag, there's a constant struggle with the spec itself and with plugins that may not be bug-free themselves. I'll admit I'm not particularly proud of the shape it's in. Unfortunately I had to focus on some other projects for the 0.12.0 release cycle, but post-0.12.0 GA I've got some time carved out to spend on this to get the rest of these bugs worked out.
is there a release for the force option on nomad 0.11 ?
There is not and we won't be backporting that.
I really can't disagree with you there, @jfcantu. The CSI project had been my primary focus, but we were pretty short-handed for it during development. As you might note from some of the other open issues with the storage tag, there's a constant struggle with the spec itself and with plugins that may not be bug-free themselves. I'll admit I'm not particularly proud of the shape it's in. Unfortunately I had to focus on some other projects for the 0.12.0 release cycle, but post-0.12.0 GA I've got some time carved out to spend on this to get the rest of these bugs worked out.
That's great info. Thanks for the update! I'm eagerly anticipating the next steps.
Hello,
I have just updated to 0.12.0, and it seems your fix is in.
Most of my ghost volumes have been removed :tada:
However, I am still unable to remove this ghost allocation, which keeps the volume in a locked state:
# nomad volume status csi-disk-0
ID = csi-disk-0
Name = csi-disk-0
External ID = projects/myproject/zones/myzone/disks/data-csi-disk-0
Plugin ID = gce-pd
Provider = pd.csi.storage.gke.io
Version = v0.7.0-gke.0
Schedulable = true
Controllers Healthy = 2
Controllers Expected = 2
Nodes Healthy = 2
Nodes Expected = 2
Access Mode = single-node-writer
Attachment Mode = file-system
Mount Options = <none>
Namespace = default
Allocations
ID Node ID Task Group Version Desired Status Created Modified
88cfffe3 b1042488 csi-disk 47 run running 3d4h ago 2d2h ago
bed001f0 511b5f41 csi-disk 47 stop lost 3d14h ago 3d4h ago
# nomad alloc-status 88cfffe3
No allocation(s) with prefix or id "88cfffe3" found
# nomad node status b1042488
No node(s) with prefix "b1042488" found
# nomad volume deregister csi-disk-0
Error deregistering volume: Unexpected response code: 500 (volume in use: csi-disk-0)
# nomad volume deregister -force csi-disk-0
Are you sure you want to force deregister volume "csi-disk-0"? [y/N] y
Error deregistering volume: Unexpected response code: 500 (volume in use: csi-disk-0)
Corresponding log
2020-07-10T07:09:57.713Z [ERROR] nomad.fsm: CSIVolumeDeregister failed: error="volume in use: csi-disk-0"
2020-07-10T07:09:57.714Z [ERROR] http: request failed: method=DELETE path=/v1/volume/csi/csi-disk-0?force=true error="volume in use: csi-disk-0" code=500
Ok, that's an interesting state: the allocation doesn't exist _at all_ but is showing _running_ status for the volume. That gives me a new clue on where to look.
Wanted to give a quick status update. I've landed a handful of PRs that will be released as part of the upcoming 0.12.2 release:
I believe these fixes combined should get us into pretty good shape, and #8584 will give you an escape hatch to manually detach the volume via nomad volume detach once that's merged.
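Once that lands, manually releasing a stuck claim should look something like the sketch below, using the volume and node IDs from the output above (exact behavior per the new command's docs):
$ nomad volume detach csi-disk-0 b1042488
# Asks the servers to release the claim this volume holds on the given
# (possibly already-terminated) node.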
For sake of our planning, I'm going to close this issue. We'll continue to track progress of this set of problems in https://github.com/hashicorp/nomad/issues/8100.
I'm still seeing this issue on 0.12.4. Even when all allocations have failed or been stopped (or I garbage-collect them so they don't exist anymore), the volume status still shows these ghost allocations.
I can deregister with -force, but it doesn't seem to garbage-collect these ghost allocations correctly.