This is not an easy one, but I regularly encounter scenarios where production is not a different namespace but a different cluster (or something even more complicated). This is the reason why I'm often unable to use Helm as a release manager in earnest. As Jenkins X depends on Helm, supporting multiple clusters is extra hard; I'm not even sure it's possible. However, if I look at this from a purely delivery-pipeline perspective, it doesn't matter at all whether all environments live on a single cluster. Jenkins X could simply hold credentials for each cluster, and the definition of an environment would change from =namespace to =cluster/namespace.
This issue is meant to start a discussion, not to demand that you implement anything. Jenkins X is such an awesome project that I would like to be able to use it in scenarios where I currently can't.
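To make the idea concrete, here is a hypothetical sketch of what an Environment definition with a cluster reference could look like. Note this is purely illustrative: the cluster field is an assumed extension, not an existing Jenkins X API field, and the repo URL and context name are made up.

```yaml
# Hypothetical sketch only - the 'cluster' field is an assumed extension,
# not part of the current Jenkins X Environment CRD
apiVersion: jenkins.io/v1
kind: Environment
metadata:
  name: production
spec:
  label: Production
  namespace: jx-production
  promotionStrategy: Manual
  cluster: gke_myproject_us-central1-a_prod   # assumed: a kubeconfig context / cluster reference
  source:
    kind: git
    url: https://github.com/myorg/environment-jx-production.git
```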
@adam-sandor thanks for your issue and I totally agree!!
I think most folks really should have a separate cluster for production and do development/testing in a different cluster, which can really help reduce risk (e.g. a bad system test using up all your resources in production ;). There's also the security angle: building docker images has security issues, e.g. giving access to the docker socket - though hopefully things like skaffold will add rootless + daemonless docker builds soon!
We defaulted to using GitOps for environment promotion specifically to help address this need of the development cluster + pipelines being in a different cluster. If we ignore the installation side of things for a moment, it should be possible to set up a multi-cluster implementation manually today. I'd love someone to try it so we can find any issues ;)
Today when promotion happens (either after a push to master on an app, or a manual promotion), Jenkins X basically submits a Pull Request, then waits for the PR to merge and for the release job on the environment git repo to complete. So it's completely decoupled from the kubernetes cluster + Jenkins/Prow where that happens!
So after Jenkins X has been installed, you could install Jenkins X on the production cluster too, using the same git repos for the Staging/Production environments. Then on the Production cluster, delete the Staging job in Jenkins; on the Dev cluster, delete the Production jobs.
Then when a promotion to Production happens in the dev cluster, a PR is opened on the Production git repo - that should trigger the PR + Release pipelines on the Jenkins in the Production cluster, which promote into the production cluster + namespace!
There will be some UX issues related to multi-cluster; commands like jx get apps for viewing versions/pod counts/URLs of apps in environments don't yet check whether the environments are in different clusters - though hopefully that's not too hard to fix.
So I think the main work to get this going is to improve the install wizard so it's easier to have Environments on other clusters + to set up the necessary Jenkins + jobs there. We may want to create a custom Environment Tools chart which includes, say, a locked-down Jenkins - there's really only going to be 1 job, the CI/CD pipeline of the Environment git repo - so there's no need for other tools like Nexus / Chartmuseum / a Docker registry.
TL;DR I don't think we're too far away from being able to at least demonstrate that multi-cluster is possible + hopefully we can polish the installer to make it easy to do. Volunteers most welcome! :)
At one of our customers we have 3 k8s clusters:
Each running in different AWS accounts with private topology.
And we have one build job per app and one release job per environment.
We use ONE git repo per environment to hold the state and currently we have this workflow (helm, landscaper, chartmuseum):
@jstrachan I don't understand why we should install another Jenkins for the prod cluster?
@rasheedamir that use of git repositories, environments and automatic promotion to staging and manual to production sounds like the exact OOTB experience of Jenkins X :) Great minds think alike and all that :)
There doesn't have to be another Jenkins really - it's just a deployment option. If you have just 1 Jenkins for all clusters, that would mean you'd need a Secret in the dev cluster giving you remote write access to the production cluster. From a security perspective it would be safer to have a Jenkins (or Prow or GitKube or something that triggers a kubernetes Job on PRs + merges to master) running inside the Production cluster with write access (via a Service Account), without requiring remote write access to the cluster. It's totally fine to just use 1 Jenkins though!
makes absolute sense @jstrachan
to avoid having to manage another tool in the prod cluster we went with just one Jenkins; our prod cluster API is firewalled and traffic is only allowed from whitelisted IPs, which include the IPs of the NAT gateways of the tools cluster
yeah, I think a single Jenkins may well be easier; we should certainly support both options
I agree - if Jenkins X is to be successful, this kind of flexibility would be a great thing. I think it's also easier for people to wrap their heads around a single Jenkins, which also gives a global view of jobs that ran in the past. Hopefully I can experiment with all of this next week.
This would be a wonderful feature for us. I'd be happy to start contributing on this. Where could I start?
@icereed we did a demo at the last Office Hours meetup on supporting multi-cluster: https://www.youtube.com/watch?v=Io3p7NurYqY&t
to simplify the install we're looking at using Terraform to set up the cloud infrastructure (e.g. N clusters) + install Jenkins X. We're working on a Terraform provider to install Jenkins X and set up the environments: https://github.com/jenkins-x/terraform-provider-jx. Then we're looking at using Prow inside the Staging + Production clusters to implement the promotion pipelines, so that we don't have to expose secrets for connecting to Staging/Production to the Jenkins in the dev cluster.
Maybe a way to start is to try out the terraform side of things to see if it works for your cloud provider / kubernetes provider - we've gone with GCP to start with
I totally agree with the need for multiple clusters. In addition to the reasons already mentioned, we also have the need for "infrastructural staging", which for us is solved by multiple clusters (2, to be exact).
What we have is an infrastructural staging cluster and an infrastructural production cluster. Infrastructural changes and upgrades are done to staging first (e.g. upgrading kubernetes).
We use our infrastructural staging cluster for the application development environment. The infrastructural production cluster we use for both the application staging environment and the application production environment.
We reckon that both application staging and application production need to run on an infrastructural production-grade cluster, as superb uptime is needed in both.
Unfortunately we can't even use the manual solution with multiple Jenkins instances proposed by @jstrachan, because we have a strict policy about any access from production.
As I understand it, an environment is now represented by git repo -> jenkins pipeline -> k8s namespace.
I wonder where the environment jenkins pipeline is created, and whether it can be updated to do something other than helm install?
I was looking into the code but couldn't find that - is it managed by some Operator maybe?
We're actually in the process of changing from using Jenkins to do the promotion via the environment repo webhook to using Prow, so we can use a lightweight webhook handler in each environment or cluster which uses secrets to validate events before applying the release. This means we can avoid deploying multiple Jenkins instances and support multi-cluster. It should be coming out in the next few weeks - would that help here?
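As a rough illustration of the kind of secret-based validation such a webhook handler performs (this is not the actual Prow hook code; the secret and payload values below are made up), a GitHub-style HMAC signature check can be sketched in shell:

```shell
#!/bin/sh
# Sketch of webhook payload validation with a shared secret, the kind of check
# a Prow-style hook does before acting on an event. SECRET and PAYLOAD are
# made-up values for illustration only.
SECRET='my-webhook-secret'
PAYLOAD='{"action":"closed","pull_request":{"merged":true}}'

# Compute the expected X-Hub-Signature header value for this payload
EXPECTED="sha1=$(printf '%s' "$PAYLOAD" | openssl dgst -sha1 -hmac "$SECRET" | awk '{print $NF}')"

# A received signature (simulated here by recomputing it) must match
# before the handler applies the release
RECEIVED="$EXPECTED"
if [ "$RECEIVED" = "$EXPECTED" ]; then
  echo "signature ok - safe to apply the promotion"
else
  echo "signature mismatch - reject event" >&2
  exit 1
fi
```

The point is that each cluster only needs its own webhook secret, not credentials for any other cluster.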
ok, webhook sounds great, one can do a lot with it...
Just one clarification: are you saying that there will be no more gitops for environments and state will only be in k8s CRDs, or is the new flow going to look like:
commit in environment git repo -> webhook to prow -> whatever webhook is configured in the environment (including payload data)?
Is there an open issue about this change you could point me to?
Prow sounds like a cool solution - curious, couldn't we just run an agent in each cluster for the actual deploy step?
One of the reasons we're thinking of not doing that is that folks tend to like having access to multiple clusters restricted; if we run an agent in each cluster, the master (which receives the git environment promotion webhook event) would need access to all clusters.
Also, I could imagine a remote master / agents running into issues with network connectivity etc. As Prow's hook component is extremely lightweight, we can run one in each environment or cluster, which gives nice isolation. RBAC is then controlled at the environment's git repo, so once a promotion PR is merged by a team with the right permissions, we get the new deployment.
Cool, thanks for the explanation. I had used different agents in different docker swarm environments in the past and it worked well enough. I thought the connection was made and authenticated from each agent to exposed interfaces on the master instance. Is it two-way?
And just for some more context, I'm thinking about a particular project that I'd like to move to jx. As well as multiple clusters, my current client is using VPNs around each kube cluster for added security and isolation. I'm still trying to wrap my head around how needing to be connected to a VPN to access a particular cluster might affect things.
Does your vision for Prow take potential VPNs into consideration too, or do you think more will be required to support that constraint as well?
We are currently solving the multi-cluster with access control problem by leveraging gitops. If the master Jenkins only creates PRs against the git repo corresponding to an environment, then it doesn't need credentials for the clusters. Of course, in this case you need another component to execute the deployment process - like a deployer Jenkins on each cluster, or Weave Flux - so the deployment process essentially becomes a thing separate from the CD pipeline.
We also needed to solve the problem of letting the master Jenkins know when a deployment is done, as this Jenkins has no access to the environment. We do this by making a callback from the deployer Jenkins to the master Jenkins - calling back is not a security problem. The main pipeline waits for this callback using an input step.
With this approach we are essentially outside the realm of Jenkins X, even though it's very close to (and heavily inspired by) what JX does - except that it's implemented with multiple Jenkinses instead of one having all the pipelines.
Just wondering if there's any news on a Jenkins-X native way to use separate clusters?
I've seen some things about Prow and 'serverless' jx, but haven't found one place to understand what has changed or is planned to support this use-case.
Cheers
I'm trying to set things up so the production environment is on a different cluster.
My plan was to simply neuter the Jenkinsfile in environment-jx-production and then (at least for now) manually deploy the environment using helm to a different cluster.
I've figured out how to manually install production via helm, but when I neuter the Jenkinsfile to do nothing my jx promote doesn't complete:
jx promote foo --version 0.0.38 --env production -u alias:releases
....
Could not find the service URL in namespace jx-production for names foo, jx-production-foo, jx-production-foo
Is there any way to provide the service URL to jx promote so it'll do whatever it does that makes jx get apps work?
Hi, we are excited about using Jenkins-X but have security reservations about using a single cluster. We are in a heavily regulated environment that favours "physical" separation. We favour Environments segregated by Cluster as opposed to Namespace.
Does anyone have any experience or advice regarding locking down Environments defined as Namespaces? We've seen https://www.projectcalico.org/ but adding more moving parts is something we're keen to avoid.
@jstrachan or @rawlingsj do you have any more information regarding multi-cluster installations? Any advice would be greatly received.
We are also going to use jx in a multi-cluster environment, with a master cluster containing Jenkins and all jx dependencies, a dev cluster with n dev environments and then several clusters for various QA, pre-prod and production environments (1 env per cluster there). It will take us some time to set this in motion, but we will definitely share our experience here.
@edblackburn we're hoping to write a blog post in the next couple of weeks on multi-cluster recommendations and how to set it up; we're pretty close to being able to automate it all now
I'll share my experiences if it's useful to anyone; also happy if anyone wants to let me know about problems I might run into doing it this way.
I was able to get this running on Amazon EKS. Here's my setup:
EKS running on private subnets in AWS
o To make the private subnets work, you install the Nginx service and then swap the NLB annotation for an internal-load-balancer annotation while the install is running. Then you delete the NLB from AWS and update Route53 with the correct internal load balancer.
To make the Jenkins webhooks public for integration with BitBucket Cloud, I use Nginx reverse proxies running on ECS.
I use additional reverse proxies to allow the "build" cluster to share its Chart Museum server with the "prod" cluster. These are restricted to the VPCs involved, so they are not public. JX stores the containers in ECR already, so that's not an issue.
After they're both running, I set up the environments in the build cluster, with CI on the environments that are in the build cluster and manual promotion for the environments in the production cluster. I create all 4 of them on the build cluster, but then I delete the Jenkins jobs for the ones I don't want to deploy to in that environment. I don't use Prow or anything; I just use the default options.
o For some reason, with private subnets I have to manually copy the exposecontroller and ingress-config configmaps from the "jx" namespace to the new namespaces I create for my environments. If I don't do this, expose fails for that environment on every build.
I create the two environments that I want to use on my production cluster and I specify their git URLs explicitly.
I import my projects and adjust their build scripts to use a git hash at the end of the version number instead of auto-increment. I'd recommend this be an option during import, because auto-increment can be messy. Basically I have a VERSION_NUMBER file that contains something like 1.0.0, and then I concatenate that with git rev-parse --short HEAD.
When I promote them, I need to specify the Chart Museum URL explicitly. Just setting the CHART_REPOSITORY variable on the production environment isn't enough, because requirements.yaml in the environment repos will contain a local reference instead of a domain-name reference. I think this should also be an option - if I have told you what my domain is, I'd like the option to make it explicit in the environments.
Regarding the .jx folder - I use symlinks to swap between them, but usually I don't need to leave the "build" cluster.
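The versioning scheme described above (VERSION_NUMBER file + short git hash instead of auto-increment) can be sketched as follows; the throwaway repo and the 1.0.0 value are for demonstration only:

```shell
#!/bin/sh
# Sketch: build a version string from a VERSION_NUMBER file plus the short
# git hash, as described above. Creates a throwaway repo for illustration.
set -e
REPO=$(mktemp -d)
cd "$REPO"
git init -q

echo "1.0.0" > VERSION_NUMBER
git add VERSION_NUMBER
git -c user.email=demo@example.com -c user.name=demo commit -qm "add version file"

# e.g. 1.0.0-3f2a1b7 - unique per commit, no counter to keep in sync
VERSION="$(cat VERSION_NUMBER)-$(git rev-parse --short HEAD)"
echo "$VERSION"
```

Because the hash identifies the commit, the same commit always produces the same version, which avoids the drift that auto-increment counters can suffer from.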
Not sure how much of this is already designed, but I hope that we are being general when discussing "multiple clusters" here. I am hoping to deploy to separate GCP projects all managed by a master account, so it's not just N clusters on a single GCP project with a single set of credentials. Is this in scope for what is being brainstormed here?
@jstrachan wrote:
we're hoping to write a blog post in the next couple of weeks on multi-cluster recommendations and how to set it up; we're pretty close to being able to automate it all now
How is that coming along? Could you share a draft? Is there help needed?
I'd like to bump this to see if there's any additional info at this time.
We are planning to run a multi-cluster environment with one ops cluster catering to deploys on n customer clusters. We want to install Jenkins X on only the ops cluster, managed by a single team; this is the pattern we are currently interested in implementing with Jenkins X. It is similar to a normal Jenkins that builds on the build server and deploys to a runtime environment of choice. A solution for this issue would go a long way towards implementing our approach.
Any updates?
@jstrachan Just echoing the above - any updates/help needed?
Any news on multi-cluster plans/workarounds?
We are probably going to end up creating a second Jenkins instance in our production cluster.
I didn't really follow everything @gregastasothebys said. However, I have a similar setup: I'm using EKS as well, with an ELB, Route53 and all the other stuff.
My needs are the same though.
Any help on these requirements is appreciated.
see https://jenkins-x.io/getting-started/multi-cluster/
the current pending issues are that you have to add a Chart Museum environment variable to your jenkins-x.yml file, and you need to disable the release pipeline in the Dev cluster
I did a demo of this at last week's Office Hours
here's the demo btw: https://jenkins-x.io/community/april-18/
I'm gonna mark this one as closed now, as we now disable the release pipelines of remote clusters via the --remote-environments flag when installing Jenkins X: https://jenkins-x.io/getting-started/multi-cluster/ and we automate configuring the external Chart Museum URL in the staging/production git repository, so that promotions work fine in the remote clusters without any custom steps.
we also have a simple way to enable ingress on remote clusters if required: https://jenkins-x.io/getting-started/multi-cluster/#installing-ingress-controller
Just watched the demo @jstrachan , the solution looks great!
@jstrachan Do I need to install exposecontroller manually? It seems that exposecontroller is not installed when the environment controller is installed. And does this feature work for a static instance?
Using a BitBucket Server, and it seems that envctl is not working...
using require GitHub headers: true
using environment source directory https://git.****.com/jx/environment-jx-prod and external webhook URL: http://10.10.10.11/
verifying that the webhook is registered for the git repository https://git.****.com/jx/environment-jx-prod
error: failed to create git provider for git URL https://git.****.com/jx/environment-jx-prod kind bitbucketserver: Running in batch mode and no default API token found
Looks like the environment controller takes an arg to ignore GitHub headers, --require-headers=false - unsure how you pass that flag though; perhaps edit the deployment?
Looks like the environment controller takes an arg to ignore GitHub headers, --require-headers=false - unsure how you pass that flag though; perhaps edit the deployment?
I am using bitbucket server not Github, and did not modify any deployment yaml file, I installed envctl like this:
jx create addon envctl -s https://git..com/scm/jx/environment-jx-prod.git --git-kind=bitbucketserver --docker-registry ** --cluster-rbac true --user * --token *****
@sanigo I did some tests with BitBucket Cloud and couldn't make it work. #3983 was raised as a result. BitBucket Server is probably not supported either.
Yes, I believe that it is not working currently.