I'm trying to set up the CSI plugin for the MooseFS file system, but it is not clear from the documentation how I can provide the mount path (I've exported the /nomad directory in MFS). I'm getting the following error:
failed to setup alloc: pre-run hook "csi_hook" failed: rpc error: code = InvalidArgument desc = NodeStageVolume Endpoint must be provided
Thanks for any help in this matter.
$ cat mfs-cgi-plugin.nomad
job "mfs-cgi-plugin" {
datacenters = ["home"]
type = "system"
group "nodes" {
task "plugin" {
driver = "docker"
config {
image = "quay.io/tuxera/moosefs-csi-plugin:0.0.4"
args = ["-endpoint=unix:///csi/csi.sock", "-mfs-endpoint=10.0.0.21", "-topology=master:EP,chunk:EP"]
privileged = true
}
csi_plugin {
id = "mfs-csi"
type = "node"
mount_dir = "/csi"
}
resources {
cpu = 500
memory = 256
}
}
}
}
$ cat volume.hcl
type = "csi"
plugin_id = "mfs-csi"
id = "mfs-vol1"
name = "mfs-vol1"
access_mode = "single-node-writer"
attachment_mode = "file-system"
mount_options {
fs_type = "moosefs"
}
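The volume definition is registered with Nomad before the consuming job is run; a minimal sketch, assuming the definition above is saved as volume.hcl:
$ nomad volume register volume.hcl   # register the CSI volume with the mfs-csi plugin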
Sample job using the volume:
$ cat nginx.nomad
job "nginx" {
datacenters = ["home"]
type = "service"
group "nginx" {
count = 1
volume "vol1" {
type = "csi"
source = "mfs-vol1"
read_only = false
}
task "nginx" {
driver = "docker"
config {
image = "nginx"
port_map {
http = 80
}
}
volume_mount {
volume = "vol1"
destination = "/test"
}
resources {
network {
port "http" {}
}
}
}
}
}
Hi @toomyem!
Can you provide the alloc logs from the plugin allocation? (nomad alloc logs :alloc_id, possibly with the -stderr flag.) The output should contain some clues as to what's happening here.
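Something along these lines should work (a sketch; the task name "plugin" comes from your jobspec above, and the allocation ID is a placeholder):
$ nomad job status mfs-cgi-plugin              # find the allocation ID for the plugin task
$ nomad alloc logs -stderr <alloc_id> plugin   # dump that allocation's stderr logs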
But it looks like we might be hitting this line: https://github.com/moosefs/moosefs-csi/blob/77152be87c277fd3e50fcfe492ae8264f862a797/driver/node.go#L44-L46... and I don't see volume_context or VolumeContext anywhere in the Nomad code base. It may be an optional field that we missed. I'll dig into that and report back here.
One other thing I'll note is that the beta of CSI that we shipped with 0.11.0 doesn't include support for topologies. See https://github.com/hashicorp/nomad/issues/7669 for tracking that work, which we're planning before we call CSI "GA". It looks like it might be possible to use the MooseFS CSI plugin without it, according to https://github.com/moosefs/moosefs-csi#storage-deployment-topology-optional ?
It may be related to the issue mentioned by @angrycub in https://github.com/hashicorp/nomad/issues/7764#issuecomment-617451731. I'll try to check if it also fixes my problem.
EDIT: no, it is probably not related.
Here is the log from plugin allocation:
time="2020-04-21T20:06:49Z" level=info msg="removing socket" node_id="unix:///csi/csi.sock" socket=/csi/csi.sock
time="2020-04-21T20:06:49Z" level=info msg="server started" addr=/csi/csi.sock node_id="unix:///csi/csi.sock"
time="2020-04-21T20:06:50Z" level=info msg="probe called" method=prove node_id="unix:///csi/csi.sock"
time="2020-04-21T20:06:50Z" level=info msg="get plugin info called" method=get_plugin_info node_id="unix:///csi/csi.sock" response="name:\"com.tuxera.csi.moosefs\" vendor_version:\"dev\" "
time="2020-04-21T20:06:50Z" level=info msg="get plugin capabitilies called" method=get_plugin_capabilities node_id="unix:///csi/csi.sock" response="capabilities:<service:<type:CONTROLLER_SERVICE > > capabilities:<service:<type:VOLUME_ACCESSIBILITY_CONSTRAINTS > > "
time="2020-04-21T20:06:50Z" level=info msg="node get info called" method=node_get_info node_id="unix:///csi/csi.sock"
time="2020-04-21T20:06:50Z" level=info msg="probe called" method=prove node_id="unix:///csi/csi.sock"
time="2020-04-21T20:06:50Z" level=info msg="node get capabilities called" method=node_get_capabilities node_capabilities="rpc:<type:STAGE_UNSTAGE_VOLUME > " node_id="unix:///csi/csi.sock"
time="2020-04-21T20:06:50Z" level=info msg="probe called" method=prove node_id="unix:///csi/csi.sock"
(...)
time="2020-04-22T07:45:46Z" level=error msg="method failed" error="rpc error: code = InvalidArgument desc = NodeStageVolume Endpoint must be provided" method=/csi.v1.Node/NodeStageVolume node_id="unix:///csi/csi.sock"
time="2020-04-22T07:45:47Z" level=info msg="node unpublish volume called" method=node_unpublish_volume node_id="unix:///csi/csi.sock" target_path=/csi/per-alloc/b9f938a7-efb7-6d4d-3bb4-0b6d5abb7852/mfs-vol1/rw-file-system-single-node-writer volume_id=mfs-vol1
time="2020-04-22T07:45:47Z" level=info msg="target path is already unmounted" method=node_unpublish_volume node_id="unix:///csi/csi.sock" target_path=/csi/per-alloc/b9f938a7-efb7-6d4d-3bb4-0b6d5abb7852/mfs-vol1/rw-file-system-single-node-writer volume_id=mfs-vol1
time="2020-04-22T07:45:47Z" level=info msg="unmounting volume is finished" method=node_unpublish_volume node_id="unix:///csi/csi.sock" target_path=/csi/per-alloc/b9f938a7-efb7-6d4d-3bb4-0b6d5abb7852/mfs-vol1/rw-file-system-single-node-writer volume_id=mfs-vol1
time="2020-04-22T07:45:47Z" level=info msg="node unstage volume called" method=node_unstage_volume node_id="unix:///csi/csi.sock" staging_target_path=/csi/staging/mfs-vol1/rw-file-system-single-node-writer volume_id=mfs-vol1
time="2020-04-22T07:45:47Z" level=info msg="staging target path is already unmounted" method=node_unstage_volume node_id="unix:///csi/csi.sock" staging_target_path=/csi/staging/mfs-vol1/rw-file-system-single-node-writer volume_id=mfs-vol1
time="2020-04-22T07:45:47Z" level=info msg="unmounting stage volume is finished" method=node_unstage_volume node_id="unix:///csi/csi.sock" staging_target_path=/csi/staging/mfs-vol1/rw-file-system-single-node-writer volume_id=mfs-vol1
time="2020-04-22T07:45:47Z" level=info msg="probe called" method=prove node_id="unix:///csi/csi.sock"
(...)
Ok, thanks @toomyem. I'm going to change the title of this issue to reflect the underlying problem and make sure it gets on the team's schedule for wrapping up the remaining CSI features.
Thank you. I'll be monitoring this issue, as I'm interested in making this work.
Closed by https://github.com/hashicorp/nomad/pull/7957. This just missed the deadline for 0.11.2, so we'll ship it in 0.11.3.
Has anyone successfully run Nomad with the CSI plugin for MFS (MooseFS)? Any info about such a setup (plugin and volume configuration) would be _very_ much appreciated. I cannot get past the following error:
failed to setup alloc: pre-run hook "csi_hook" failed: rpc error: code = InvalidArgument desc = NodeStageVolume Endpoint must be provided.
My volume definition file looks like this:
type = "csi"
plugin_id = "mfs-csi"
id = "mfs-vol1"
name = "mfs-vol1"
access_mode = "single-node-writer"
attachment_mode = "file-system"
mount_options {
fs_type = "moosefs"
}
context {
endpoint = "http://10.0.0.21:9425"
}
But the plugin still cannot see the _endpoint_ parameter.
@toomyem you're using the just-released 0.11.3?
Yes, of course. That's why I'm getting back to this issue.
Hi @toomyem, sorry about the delay. I dug into this a bit more and realized that we added the field to the controller RPCs but missed it on the node RPCs. I'm fixing that in https://github.com/hashicorp/nomad/pull/8239, which I'm targeting for the 0.12.0 release. I didn't quite get that done in time for the 0.12.0-beta1 that came out this morning, but assuming that PR gets merged it'll go out in the following beta or the GA.
@toomyem that fix will ship in 0.12.0-beta2, which should be shipping later this week if you don't want to try it out on a build from master.
Hi.
I would just like to confirm that it finally worked ;) Below is the configuration I used, in case someone is looking for a reference.
Thank you for your support and help.
Sample volume definition:
type = "csi"
plugin_id = "mfs-csi"
id = "mfs-vol"
name = "mfs-vol"
access_mode = "single-node-writer"
attachment_mode = "file-system"
context {
endpoint = "mfsmaster:/shared/test"
}
Sample job that uses this volume:
job "nginx" {
datacenters = ["home"]
type = "service"
group "nginx" {
count = 1
volume "mfs-vol" {
type = "csi"
source = "mfs-vol"
}
task "nginx" {
driver = "docker"
config {
image = "nginx"
port_map {
http = 80
}
}
volume_mount {
volume = "mfs-vol"
destination = "/test"
}
resources {
network {
port "http" {}
}
}
}
}
}
Sample job for the MFS CSI plugin:
job "mfs-csi-plugin" {
datacenters = ["home"]
type = "system"
group "mfs-csi-plugin" {
task "mfs-csi-plugin" {
driver = "docker"
config {
image = "quay.io/tuxera/moosefs-csi-plugin:0.0.4"
args = ["-endpoint=unix:///csi/csi.sock", "-mfs-endpoint=mfsmaster", "-topology=master:EP,chunk:EP"]
privileged = true
}
csi_plugin {
id = "mfs-csi"
type = "node"
mount_dir = "/csi"
}
resources {
cpu = 500
memory = 256
}
}
}
}
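A quick way to sanity-check the setup once the plugin job is running (a sketch; the file name volume.hcl is an assumption for the volume definition above):
$ nomad plugin status mfs-csi        # shows node/controller health for the plugin
$ nomad volume register volume.hcl   # register the volume definition
$ nomad volume status mfs-vol        # confirm the volume is registered and see which allocs claim it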
Glad to hear it and thanks again for reporting the issue and helping me figure out what needs to be done!