Terraform Version
v0.11.4
Affected Resource(s)
Expected Behavior
Kubernetes version 1.8 introduced Extended Resources, which are required, for example, to request allocation of NVIDIA GPUs to containers. This option is currently missing from the Terraform Kubernetes provider.
resource "kubernetes_pod" "test" {
  ...
  spec {
    container {
      ...
      resources {
        requests {
          "nvidia.com/gpu" = "1"
        }
        limits {
          "nvidia.com/gpu" = "1"
        }
      }
    }
  }
}
References
Just checking whether there is a reason this is not yet supported (other than perhaps not being a high priority)? Is anyone aware of a work-around to request GPUs for a container?
The only thing I'm aware of is using the third-party provider that has support for native YAML-based Kubernetes configurations (https://github.com/ericchiang/terraform-provider-k8s), though I much prefer the official provider.
I added a quick-and-dirty way to do this in pull request #591. It seems to work for me, so feel free to use it and merge or decline as desired.
Any updates on adding this feature? Kubernetes has had GPU scheduling for a while now, but the Terraform provider does not support it.
https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
Does anyone have a good workaround for this? I have to manually remove the flag to be able to update my deployment. Otherwise I get:
Error: Invalid address to set: []string{"spec", "0", "template", "0", "spec", "0", "container", "0", "resources", "0", "limits", "0", "nvidia.com/gpu"}
Is there a way to ignore that flag, or something similar?
Please merge this.
At least add a fix which will allow the provider to run when the extra resource request is set on a resource managed by terraform instead of failing with the error described by https://github.com/terraform-providers/terraform-provider-kubernetes/issues/149#issuecomment-596788250.
I can't believe that an issue with an open PR sits idle like this; it seems like nobody uses GPUs in a Terraform-managed workload. Please at least fix the error in https://github.com/terraform-providers/terraform-provider-kubernetes/issues/149#issuecomment-596788250 so that regular operations don't fail if the nvidia.com/gpu resource request is added outside of Terraform.
We need this as well; it is causing us a lot of pain. Thanks for looking into it. We hope it can be merged soon.
We would also need support for other resource requests, such as local ephemeral storage (ephemeral-storage).
Error: Invalid address to set: []string{"spec", "0", "template", "0", "spec", "0", "container", "0", "resources", "0", "limits", "0", "ephemeral-storage"}
Even if you don't add support, please at least remove these errors so that resource requests added outside of Terraform keep working with Terraform; otherwise this becomes a real pain. @alexsomesan could you please take a look at this?
Sorry for the wait on this one folks, this just landed at the top of our backlog. You can read about our new process here if you are interested in how we triage these things.
I had a look at the code for this. It seems like the schema for Container was implemented without realizing that the limits and requests fields of the Kubernetes API are actually just maps that accept key value pairs that are either:
foo.bar/baz
So this attribute should actually be a Map with a custom validation function that verifies the above constraints. However, this is going to be a breaking change to the schema for Container, and making this change will create diffs for everyone using any resource with a container in it, so we have to hold off on making this change until a major version release of this provider. We are currently collecting breaking changes for v2 of the provider, so this should happen reasonably soon, although I can't specify a date.
~In the mean time - given that the reported use cases for this feature are few enough - I am open to hard-coding these additional attributes into the container schema. I would prefer to do this rather than add an additional attribute that we will deprecate in the next major version. The Kubernetes API does this all inside one field, so the provider should have parity with that.~
~This will mean adding the following attributes to the schema for requests & limits in the resources attribute:~
~nvidia.com/gpu~
~amd.com/gpu~
~ephemeral-storage~
edit: I just realized that we won't even be able to hard-code these resources as arguments to the schema because they contain punctuation, so we will have to add an additional field that is of type Map to support the extended resources, and then flatten it into a single Map in v2.
@jrhouston any updates on this? :)
Closing as this was added in v2.0.0
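For anyone landing here later, the following is a rough sketch of how the original request might look on provider v2.0.0 or newer. It assumes that limits and requests are now plain map attributes (as the schema discussion above anticipated), so extended resource names can be used directly as quoted keys. The resource label, pod name, container name, and image are hypothetical placeholders, not values taken from this thread.

```hcl
# Sketch only: assumes provider v2.0.0+, where limits and requests are maps.
# All names and the image below are hypothetical placeholders.
resource "kubernetes_pod" "gpu_test" {
  metadata {
    name = "gpu-test"
  }

  spec {
    container {
      name  = "gpu-workload"
      image = "example.com/gpu-app:latest"

      resources {
        # Extended resource names contain punctuation, so they must be quoted.
        requests = {
          "nvidia.com/gpu" = "1"
        }
        limits = {
          "nvidia.com/gpu"    = "1"
          "ephemeral-storage" = "1Gi"
        }
      }
    }
  }
}
```

Note that Kubernetes expects the request and limit for an extended resource such as nvidia.com/gpu to be equal when both are set, which is why the two values match here.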