Cloud-on-k8s: Improvement of CI setup

Created on 8 Aug 2019 · 9 Comments · Source: elastic/cloud-on-k8s

We are starting to have issues with the current state of our CI setup. It looks like it's time to discuss how we can improve it.

:ci discuss

Most helpful comment

To follow up on this issue: I'll assume that we care about 2. and 3., and that while 1. is technically challenging, it would be nice to have.

Given that, I'd propose the following.

Flow:

  1. We keep the same structure we have right now: job yaml -> Jenkinsfile -> ci make (in build/ci/) -> "dev" make (in operators/) -> e2e/deployer tool.
  2. ci make: we give it exactly two purposes, getting the Vault token and running the "dev" make in a properly built container.
  3. "dev" make: we put all the logic here. At least for e2e and cluster provisioning, it's nicely contained in separate tools already.

Config:

To keep config in the fewest possible places, I'd propose creating an env var file in the Jenkinsfile with all the settings defined by the given job, and including it in the makefile, like:

include environment

This is similar to what @artem proposed in #1496, but it doesn't include secrets that are not used in a particular job and doesn't require any changes to the intermediate makefile: best of both worlds, I believe. In addition, this avoids handling the "not set" and "set to empty string" cases separately, which we started doing lately (#1535, #1547).
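As a rough illustration of the idea, the Jenkinsfile step could boil down to a shell snippet like the one below. This is only a sketch: the function name `write_environment_file` and every variable name and value in the file are made up for the example and are not taken from the actual jobs. Plain `NAME=value` lines are used because GNU Make accepts them as assignments when the file is pulled in with `include environment`.

```shell
#!/bin/sh
# Hypothetical sketch: write the per-job settings to an "environment" file
# that a makefile can pull in with `include environment`.
# All variable names and values below are illustrative assumptions.

write_environment_file() {
    # $1: path of the file to create
    cat > "$1" <<EOF
E2E_PROVIDER=gke
CLUSTER_NAME=ci-e2e-example
SKIP_DOCKER_COMMAND=false
EOF
}

# Write the file into a scratch directory and show what make would include.
workdir=$(mktemp -d)
write_environment_file "$workdir/environment"
cat "$workdir/environment"
```

Because the job writes only the settings it actually uses, looking at this one step would show the entire config for that job.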

Reproducibility:

As was noted, it is very difficult to reproduce a Jenkins run, but I think being able to run even a single part of a Jenkinsfile has a lot of value, and it would have saved us time in a few instances already (#1524, #1533, #1548, #1559, #1565). The ci make would detect whether it's running in Jenkins. If yes, it'd grab a Vault token using the approle auth method; if not, it would use the github auth method, which works for all of us. The only manual part would be to recreate the file for the deployer and the environment file, but as these are fully populated in a single Jenkinsfile, this should be straightforward. We would just run make -C build/ci target like the Jenkinsfile does, and most of the environment would be 1 to 1 with real CI. I think that would improve dev velocity greatly compared to the 'change, push, pr, pr-ci, approve, run' cycle that we have now.
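The Jenkins detection could look roughly like the following. This is a sketch under assumptions, not the actual build/ci logic: it relies on Jenkins exporting `JENKINS_URL` to its builds, and the function names and the `VAULT_ROLE_ID`, `VAULT_SECRET_ID` and `GITHUB_TOKEN` variables are hypothetical. It only prints the Vault command it would run, so it can be tried without a Vault server.

```shell
#!/bin/sh
# Hypothetical sketch: pick a Vault auth method depending on whether we run
# in Jenkins or on a dev box. Names of functions and credentials variables
# are assumptions made for this example.

vault_auth_method() {
    # Jenkins exports JENKINS_URL to builds; treat its presence as "in CI".
    if [ -n "${JENKINS_URL:-}" ]; then
        echo approle
    else
        echo github
    fi
}

vault_login_command() {
    # Dry run: print the command instead of executing it.
    case "$(vault_auth_method)" in
        approle)
            echo 'vault write -field=token auth/approle/login role_id="$VAULT_ROLE_ID" secret_id="$VAULT_SECRET_ID"'
            ;;
        github)
            echo 'vault login -method=github token="$GITHUB_TOKEN"'
            ;;
    esac
}

echo "auth method: $(vault_auth_method)"
echo "would run:   $(vault_login_command)"
```

Keeping the choice in one small function means the rest of the target stays identical between Jenkins and a local run.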

Work needed

If we were to accept this, the implementation would require:

  1. Create the environment file in the Jenkinsfiles.
  2. Load the environment file in the dev make.
  3. Move fetching secrets from Vault to the dev make or a tool.

These actually don't amount to a lot of work.

After this is done, we will be able to look at the Jenkinsfile of any job and know its entire config. If needed, we could easily repro a run locally by creating the file(s) and running make. While we would still keep some logic in scripts, at least it would be in a single place. To extend CI functionality, we would need to work only with the Jenkinsfile and the dev make (or better yet, only the e2e tool/deployer).

An additional property of the above is that community contributors could easily repro our CI runs just as we do (given they have the required infra).

Thoughts, feedback and critique welcome:)

All 9 comments

Thanks for creating this issue.

I think there is some room for improvement, and we might start by defining what we want to end up with before looking into the technical aspects. I'd like to be able to:

  1. rerun CI jobs from my dev box (to the degree possible) with no/minimal setup - I think that's important because it helps with developer velocity, debuggability and gives higher confidence with regards to credentials security.
  2. easily inspect configuration - go to (ideally) single place to see the entire config for a given job.
  3. easily inspect steps - go to (ideally) single place to see what a given job is doing.

While these are very high level, I think they are a starting point. If we can agree on these (and others), then we can move on to talking about how to implement them.

One additional thing that I think we should decide on: What do we support for developers wishing to contribute? Can they run e2e on GKE (or other providers) within their account? Can they do CI in a similar fashion? I don't think these are necessarily important today, but I'd like us to make an explicit decision (what should and what doesn't need to work out of the box).

With our CI setup

rerun CI jobs from my dev box (to the degree possible) with no/minimal setup

this is impossible, unfortunately. To properly run a CI job locally you need to follow https://github.com/elastic/infra/blob/master/docs/jenkins/adding-or-updating-jobs.md#testing-changes-locally

Alternatively, you can simply run the official Jenkins Docker image locally, but something working in that image is no guarantee that it will work on our CI. It might still help with debugging issues in pipelines, or anything else that is common to Jenkins and not specific to our setup.

Regarding rerunning CI jobs: I went through those instructions, but 1) they are fairly complex, and 2) they did not work for me:). What I meant by 'to the degree possible' is that I can accept reconstructing some state provided by the Jenkinsfile (env vars, a simple file), but I'd like to run targets from build/ci/Makefile on my box. I'm pretty positive that we could achieve that technically (without too much effort), but before going into the details I wanted to probe whether others consider it a value add.

If you want to truly reproduce a CI job, then you need to have exactly the same Jenkins state as on CI. To do this you need to follow https://github.com/elastic/infra/blob/master/docs/jenkins/adding-or-updating-jobs.md#testing-changes-locally
Other ways will give you only an approximation.

What do we support for developers wishing to contribute? Can they run e2e on GKE (or other providers) within their account? Can they do CI in similar fashion? I don't think these are necessarily important today, but I'd like us to make an explicit decision (what should and what doesn't need to work out of the box).

This seems like a reasonable goal IMO. Since we need to be able to bootstrap our own clusters for tests it seems reasonable that people should be able to run tests in their own environment. It's not something we need to spend a lot of time optimizing for (right now at least), but seems useful and like something that would naturally follow from enabling us to run e2e tests locally (assuming we agree on that as a goal).


@artemnikitin my understanding is that the goal at least early on was to have minimal logic in jenkins and have most of the logic in make or in scripts called by make. In that case an approximation might be Good Enough.


One thing that I would like to add is a make target that runs most of the quick checks that CI runs -- maybe lint, unit tests, etc that I can run before I commit/push. Then perhaps another overarching one that runs most/all of the tests that CI will run that I can kick off before opening a PR.

One thing that I would like to add is a make target that runs most of the quick checks that CI runs

You can run make ci already.

Are we constrained to use the Jenkins infrastructure for CI? Is there room for exploring other options such as Cloud Build, Tekton, Argo etc.?

Closing this with #1610 merged. Feel free to reopen if needed.
