There are numerous potential ways to support cluster autoscaling in OpenStack. Services like Heat and Senlin are not guaranteed to be supported in an OpenStack deployment, so base support using only core OpenStack services should be targeted. The eventual path could be to optionally support other services, but that is outside the scope of this issue, other than driving the implementation to abstract away its ties to the underlying services.
I found #230, but it has been closed as stale.
@dklyle
We are also struggling with the cluster autoscaler. Our requirement is to provision/deprovision nodes (VMs) automatically based on the resource usage on each node.
We are planning to have a "VM Farm" of n VMs that can be used by any k8s cluster (in our on-prem environment) looking for a node. This VM farm would be maintained/managed by OpenStack.
Currently we cannot find any implementation/solution where the K8s CA is able to communicate properly with OpenStack and provision/deprovision nodes automatically.
Could you please advise when you expect to release a solution for this?
Thanks,
Varun T
@varuntalus This is very much a work in progress. I'm working on the code in spurts and will update here when I have made enough progress to be useful. I also wanted to solicit input on use cases and any discussion of desired criteria to help shape the implementation.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
We are very interested in getting this to work. @dklyle would you be able to provide an update for this? Anything we can do to help you move forward with this?
Could you implement this in kops as well? :)
@zetaab It appears I have to fix gophercloud as well (some fixes are already merged, but more are needed). I will also implement this in kops once we have full tests here.
@timstoop Since this has been pending for a long time, I will take over if that's fine with everyone. Any help or comments are welcome.
The current patch is still missing tests and some of the implementation. I need to fix some issues in gophercloud (the OpenStack Golang SDK) before I can push the full implementation and tests.
I'm no Go programmer, but let me know if and how I can help!
After our discussion at the OpenStack PTG [1], it's time to start on the next steps. I'm now working on item 2 (build a common lib), which I believe we will keep adding to as we implement the three versions. After that, I expect to help with item 4 (add Heat support). If you find anything you would like to help with, feel free to leave a message. Ideas, suggestions, and reviews are more than welcome!
[1] https://etherpad.openstack.org/p/sig-k8s-2018-denver-ptg
[2] https://github.com/gophercloud/gophercloud/issues/1157
[3] https://etherpad.openstack.org/p/autoscaling-integration-and-feedback
@ricolin I am also interested in implementing this (as discussed on Slack). Let me know how I can help.
Might be a little revolutionary here, but why not use the logic from cluster-api-provider-openstack (CAP-OS) instead of reimplementing it? Anything we want to support in autoscaler we also want to support in CAP-OS, and we can prevent duplication of the logic. Also, the long-term intent is that cluster-api would be directly responsible for the creation and destruction of machines, and the autoscaler would be cloud-agnostic.
@ricolin Why support the OpenStack projects Nova, Heat, and Senlin in Cluster Autoscaler (CA)? As I understand it, CA adjusts the k8s nodes by using the OpenStack cloud, and k8s is the initiator: if k8s needs more nodes, it should notify OpenStack to create them, and the new nodes then join k8s. The same applies to removing nodes.
CA adds VMs on the cloud provider side (for example, it resizes a GCP MIG or an AWS ASG). Basically, it needs to be able to ask the cloud provider to either add a new node or remove an existing one.
@adsl123gg I guess the reason to add OpenStack is the same as for adding AWS, GCP, etc., which @MaciekPytel just mentioned.
@ricolin The Cluster Autoscaler is only responsible for managing compute (i.e. Nova) resources. All other resources are managed by the Cloud Provider OpenStack, where contributions would be greatly appreciated.
@chaosaffe Currently we're building a common way (which I will push to GitHub soon) to treat all of them the same, as libraries that we can move to cloud-provider-openstack later.
Just to clarify:
Cluster Autoscaler manages a cluster of nodes, like an ASG in AWS, which is equivalent to an ASG in OpenStack Heat or a Cluster in OpenStack Senlin. In that sense, I don't think it's fair to say that Nova is the only thing Cluster Autoscaler is responsible for. We will still leave room to implement Nova as one of the backends anyway. IMO all the libraries (OpenStack resources) should move to the OpenStack provider, which is what I intend (and I guess that's what you are asking for too).
I like the idea of adding them to cluster API, but some more things will still need to be added even after this work in Autoscaler is done. I will try to help with that too.
@chaosaffe cloud-provider-openstack works from the k8s side: if a k8s service/pod needs a LoadBalancer/PersistentVolume, k8s asks the cloud provider to create the corresponding resources, so k8s is the initiator. Cluster Autoscaler (CA), I think, seems more like a component in OpenStack: the initiator is CA, and CA scales the k8s nodes according to k8s resource utilization. Do I understand correctly, @ricolin? So for autoscaling a k8s cluster there are two viewpoints to implement from: the k8s side or the cloud provider side. @ricolin, which do you think is better, and what are their advantages and disadvantages?
@ricolin I know CA should support OpenStack, but I want to know how. From your comments I can't tell how the k8s cluster would be autoscaled; you only mention Nova, Heat, or other components, which confuses me given the FAQ in Cluster Autoscaler. Could you explain your design for autoscaling a k8s cluster with OpenStack?
According to the Cluster Autoscaler READMEs for Azure/AWS/AliCloud, they scale worker nodes, and from the architecture diagram, CA runs in the k8s cluster, so I think this design is different from the other projects in cluster-autoscaler.
@adsl123gg Fear not, my friend: the design is actually very simple and has the same structure as the other providers (Azure, AWS, etc.). The difference is that, instead of supporting only a single method of talking to the provider, we intend to allow multiple ways in the future. As for the OpenStack components supported here, the only difference is the implementation of some fundamental functions such as FetchASGTargetSize or DeleteInstances.
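To illustrate the structure described above (one provider, several interchangeable ways of talking to OpenStack), here is a minimal Go sketch. It is not the actual patch or the real cluster-autoscaler API: the `ScalingBackend` interface, its method names, and the in-memory `fakeBackend` are all hypothetical, and a real backend would call Nova, Heat, or Senlin instead of updating maps.

```go
package main

import (
	"errors"
	"fmt"
)

// ScalingBackend is a HYPOTHETICAL abstraction (not the real
// cluster-autoscaler API). Each OpenStack service (Nova, Heat, Senlin)
// would provide its own implementation of these operations.
type ScalingBackend interface {
	TargetSize(group string) (int, error)
	SetTargetSize(group string, size int) error
	DeleteInstances(group string, ids []string) error
}

var errUnknownGroup = errors.New("unknown node group")

// fakeBackend keeps everything in memory, purely for illustration.
type fakeBackend struct {
	target    map[string]int
	instances map[string]map[string]bool
}

func newFakeBackend() *fakeBackend {
	return &fakeBackend{
		target:    map[string]int{},
		instances: map[string]map[string]bool{},
	}
}

// addGroup registers a group whose target size equals its member count.
func (f *fakeBackend) addGroup(group string, ids ...string) {
	members := map[string]bool{}
	for _, id := range ids {
		members[id] = true
	}
	f.instances[group] = members
	f.target[group] = len(members)
}

func (f *fakeBackend) TargetSize(group string) (int, error) {
	size, ok := f.target[group]
	if !ok {
		return 0, errUnknownGroup
	}
	return size, nil
}

func (f *fakeBackend) SetTargetSize(group string, size int) error {
	if _, ok := f.target[group]; !ok {
		return errUnknownGroup
	}
	if size < 0 {
		return errors.New("size must be non-negative")
	}
	f.target[group] = size
	return nil
}

// DeleteInstances removes specific members and lowers the target size by
// the number actually removed. It fails without changes if any id is unknown.
func (f *fakeBackend) DeleteInstances(group string, ids []string) error {
	members, ok := f.instances[group]
	if !ok {
		return errUnknownGroup
	}
	for _, id := range ids {
		if !members[id] {
			return fmt.Errorf("instance %q is not in group %q", id, group)
		}
	}
	removed := 0
	for _, id := range ids {
		if members[id] {
			delete(members, id)
			removed++
		}
	}
	if f.target[group] -= removed; f.target[group] < 0 {
		f.target[group] = 0
	}
	return nil
}

// scaleUp depends only on the interface, so it works with any backend.
func scaleUp(b ScalingBackend, group string, delta int) error {
	current, err := b.TargetSize(group)
	if err != nil {
		return err
	}
	return b.SetTargetSize(group, current+delta)
}

func main() {
	b := newFakeBackend()
	b.addGroup("workers", "vm-1", "vm-2", "vm-3")
	if err := scaleUp(b, "workers", 2); err != nil {
		fmt.Println("scale up failed:", err)
		return
	}
	size, _ := b.TargetSize("workers")
	fmt.Println("target size after scale up:", size)
}
```

Because the scaling logic depends only on the interface, a Nova-only backend could serve as the base case and Heat or Senlin backends could be added later without touching the callers.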
@ricolin, Thanks for working on this.
Wouldn't Nova be pretty straightforward to implement? I'd imagine some value would need to be passed to enable joining the cluster, perhaps via kubeadm...via cloud-init and new node metadata?
Can you please provide more details of the implementation?
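As a rough illustration of the idea floated in this question (passing join parameters to a new VM through cloud-init), the Go sketch below renders a cloud-config document that runs `kubeadm join` on first boot. This is only a guess at one possible approach, not the design of the work in progress: the function name `renderJoinUserData` and all values are hypothetical, and how the user data is handed to Nova when the server is created is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// renderJoinUserData is a HYPOTHETICAL helper. It builds cloud-init user
// data whose runcmd section runs `kubeadm join` with the given API server
// endpoint, bootstrap token, and CA certificate hash.
func renderJoinUserData(endpoint, token, caCertHash string) (string, error) {
	if endpoint == "" || token == "" || caCertHash == "" {
		return "", errors.New("endpoint, token and CA cert hash are all required")
	}
	var b strings.Builder
	b.WriteString("#cloud-config\n")
	b.WriteString("runcmd:\n")
	fmt.Fprintf(&b,
		"  - kubeadm join %s --token %s --discovery-token-ca-cert-hash %s\n",
		endpoint, token, caCertHash)
	return b.String(), nil
}

func main() {
	// All values below are placeholders for illustration only.
	userData, err := renderJoinUserData(
		"10.0.0.10:6443",
		"abcdef.0123456789abcdef",
		"sha256:1234",
	)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(userData)
}
```

Bootstrap tokens expire, so an approach like this would presumably need to generate or refresh the token shortly before each scale-up.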
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.