Test-infra: bootstrapping service-catalog use of prow for CI/CD

Created on 7 Jun 2018 · 34 comments · Source: kubernetes/test-infra

Goals

  • Be consistent with overall kubernetes convention with respect to project behavior.
  • lots of reviewers reviewing and giving feedback without having full
    access to the github repo
  • no manual merges anymore

Current State

  • PRs are reviewed by people with github membership and OWNERS file entries, so they can use the prow plugin comments to apply the normal kubernetes PR process labels. ~so they can apply a LGTM1, LGTM2 label.~
  • manual merges of PRs
  • make + docker based build
  • travis runs our build and produces all output (see the sketch after this list)

    • service-catalog binary as output for multiple archs
    • create and push images to quay.io
    • svcat binary as output for win/lin/mac
    • unit-test & integration-test
    • generate doc site
  • ~jenkins for e2e~ no e2e running currently

    • spins up gce cluster using a google account associated with @kibbles-n-bytes.
    • e2e tests are a go binary that uses kubeconfig to talk to the kubernetes instance

      • also runs the unit and integration-tests again

  • ~charts manually pushed by https://charts.ci.vicnastea.io/job/sync-repo-service-catalog/~
  • charts automatically pushed by a travis job
  • releases done by manually git tagging and updating the charts
    before pushing a button on the chart server
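
For orientation, here is a rough `.travis.yml` sketch of the flow described above; the make targets and registry path are assumptions, not the repo's actual names:

```yaml
# Hypothetical sketch of the current travis flow; target names are assumptions.
language: go
services:
  - docker                  # builds and release images run through docker
script:
  - make build              # service-catalog binaries for multiple archs
  - make svcat              # svcat binary for win/lin/mac (target name assumed)
  - make test-unit test-integration
  - make docs               # generate the doc site (target name assumed)
after_success:
  # on master, create and push images to quay.io (registry path assumed)
  - if [ "$TRAVIS_BRANCH" = "master" ]; then make push REGISTRY=quay.io/kubernetes-service-catalog; fi
```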

Wants

  • [x] Use the OWNERS file with APPROVERS and REVIEWERS
  • [x] have prow/tide whoever run ALL tests and manage the merge queue

Ambiguous

  • not tied to quay.io but we have control of it currently
  • don't have to dump travis
  • probably better to dump the jenkins
  • depending on how approvers/reviewers works, may want some
    additional features to prow reviewing plugin

First do's

  • [x] #8278
  • [x] #8330
  • [x] #8429
  • [x] #8431
  • [x] #8566
  • [x] #8763

/cc @BenTheElder
This is probably missing key info you would like to see, so let me know what else is helpful.

Second Round of Questions from July 2018 - W1

  • how much load are we allocated?
  • what are the pod restrictions?

    • mem, cpu, etc

  • what size are the nodes?

  • do we need /ok-to-test? Auto /ok-to-test for people belonging to the repo/OWNERS file?

  • is it better to do a bunch of steps in

    • one job?
    • multiple jobs?
    • parallel jobs?
  • can we kick off a chain of jobs? Should we?

    1. first create build image
    2. keep using build image throughout pipeline?
  • test-apiserver.sh questions (see the sketch after this list)

    • runs etcd in container
    • runs apiserver in container
    • attaches a kubectl container to shared network, and does kubectl CRUD ops
    • download etcd & kubectl binaries?
    • run as separate pods?
    • how to get a fresh svc-cat pod?

      • chain after images are built?

  • e2e tests

    • how do I get a kubernetes cluster to talk to?
    • Only need one node, but must have api-aggregation enabled in apiserver
    • going to install our aggregated api server and controller into the cluster and then do some tests against it to make sure they work and reconcile appropriately.
  • final image generation

    • what does kube release do?
    • we push to quay, all successful master builds
  • cross buildability testing

    • all golang, so maybe stick with travis (~1600s CPU time) to ensure everything builds on every platform.
  • needs rebase

    • it's an external plugin? how do I get it?
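
One hedged answer to the test-apiserver.sh questions above is to keep the script as-is and run it inside a single docker-in-docker-capable job pod, rather than splitting etcd/apiserver/kubectl into separate pods. A sketch, assuming the dind-capable bootstrap image and a hypothetical script path:

```yaml
presubmits:
  kubernetes-incubator/service-catalog:
  - name: pull-service-catalog-apiserver  # hypothetical job name
    always_run: true
    decorate: true
    spec:
      containers:
      - image: gcr.io/k8s-testimages/bootstrap:latest  # dind-capable image from images/
        command:
        - ./test/test-apiserver.sh                     # hypothetical path to the script
        securityContext:
          privileged: true  # docker-in-docker needs a privileged container
```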

Parallel Tracks of work:

  • [ ] configure prow to merge once "things are done"

    • [x] tide is the name of the 'merge bot'
    • [ ] determine what things and what "done" looks like
    • [x] gate on completion of testing (partially #8710)
    • [ ] gate on successful release
    • [x] PR contains no 'hold' #8710
    • [x] PR has LGTM #8710
    • [x] can we configure the merge bot so it does a merge on the current LGTM1 + LGTM2 + /approve AND NOT /hold? Then adapt it after the fact? #8710 (see the tide sketch after this list)
    • [x] merge condition of tests passing: what indicates tests passing to the merge bot? Does it read github checks?

  • [ ] prow run all of the CI

    • [x] /ok-to-test is enabled
    • [ ] verify
    • [x] unit tests
    • [x] integration tests - uses multiple coordinated docker containers
    • [ ] e2e tests - needs a kubernetes deployed #8724, but I don't think this is even close to correct
    • [x] xbuild check using dind #8734
    • [ ] how combined or decomposed should each of these steps be?
    • [x] report all of this to each PR, enabled reports for each job as github checks
    • [x] ensure mergeability: run tests on the merged PR, rerun dependent PRs after merge, or batch them.
    • [ ] skip CI if DOCS ONLY

  • [ ] prow do all of the release delivery

    • [ ] figure out secrets so we can push
    • [ ] push images
    • [ ] push charts
    • [ ] push svcat binaries

  • [ ] test-infra dashboards/metrics

    • [x] we would like the velocity dashboard

  • [ ] testgrid configuration
  • [ ] look into the bootstrap.py config among other things to get the link from the prow CRD to the uploaded gs bucket. currently the test results expire, yet the gs bucket still has content. see this slack thread
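
For the tide track above, here is a minimal sketch of a query implementing "LGTM + approve AND NOT hold"; the repo path and merge method are assumptions, and tide takes "tests passing" from the required github status contexts on the branch:

```yaml
tide:
  merge_method:
    kubernetes-incubator/service-catalog: squash  # merge method is a choice, not a requirement
  queries:
  - repos:
    - kubernetes-incubator/service-catalog
    labels:            # a PR must carry all of these labels to merge
    - lgtm
    - approved
    missingLabels:     # and none of these
    - do-not-merge/hold
    - needs-rebase
```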

My attempt at an overall document

area/prow

Most helpful comment

we're going smoothly. prow/tide/whoever is running nicely. :fire:

All 34 comments

This all sounds reasonable to me, thanks for all the info!

I think the next step is probably configuring those things and then looking at what sorts of tests (unit, build, e2e?) you would want to migrate to Prow, if any.

/area prow
/assign
/cc @cjwagner

also, for a complete catalog of plugins and their usage see:

you may want to enable more of these in general for various labeling / automation tasks

@BenTheElder does it make sense to turn tide on without configuring it? I don't want it to accidentally start merging things that aren't ready by our selection.

right now to do a build we rely on docker existing.
Then it's a make build or some such.
Does this infra support a clone+make type operation? Or do we have to package the repo up into a ready-to-run image before it's possible to run the build & tests?

@MHBauer:

  • turning on tide is fine, but "turning it on" really just means telling it to merge things by some config, so if you don't want to merge things... probably don't 🙃
  • sure, you can execute make build; probably the best forward-thinking way is with a job using the pod utilities
  • yep, we do cloning, again probably best to see the pod utilities
  • docker can exist, I'd like to hear how you use it. we run docker-in-docker on k8s for prow, most things like just docker build are fine, we have images that support this

I will look at pod utilities. Thank you. Probably not the time to enable tide yet.
We use docker build in two ways: a 'buildimage' for development and compiling the output binaries, and also for release images.

stream of consciousness while reading...

It looks like I want to write an entry for presubmits in config.yaml https://github.com/kubernetes/test-infra/blob/master/prow/config.yaml#L408 .

I see https://github.com/kubernetes/test-infra/tree/master/prow#how-to-add-new-jobs afterwards.

I think I need a new prowjob. Reading more, I'm not sure if I want a presubmit, postsubmit, periodic, or batch. What is the context of "submit" in presubmit/postsubmit? What's being submitted where?

So as not to waste resources, maybe we can start with a clone job that does not do anything after the clone, or prints out some env-vars? This seems like it would be a good entry point for people starting in the future.

@MHBauer submit might not be the best term; in this context, submit == PR merge. presubmit -> testing PRs before they merge, postsubmit -> testing triggered by merges, periodic -> just runs on a schedule, used e.g. for monitoring that the master branch is healthy with tests that are a bit prohibitive to run as presubmits.
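
To make the presubmit case concrete, a minimal sketch of a config.yaml entry follows; the job name, image, and make target are assumptions, not values from the service-catalog repo:

```yaml
presubmits:
  kubernetes-incubator/service-catalog:
  - name: pull-service-catalog-unit  # hypothetical job name
    always_run: true                 # run on every PR
    decorate: true                   # opt in to the pod utilities (clone, log/artifact upload)
    spec:
      containers:
      - image: golang:1.10           # any image carrying the build dependencies works
        command:
        - make
        args:
        - test-unit                  # hypothetical make target
```

A `run_if_changed` file-path regex can scope a job to relevant changes, which is one route to the "skip CI if DOCS ONLY" item in the checklist above.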

> So as not to waste resources, maybe we can start with a clone job that does not do anything after the clone, or prints out some env-vars? This seems like it would be a good entry point for people starting in the future.

That sounds doubly excellent. @cjwagner we should put this in the docs.
I plan to flesh out the docs around jobs some more once @krzyzacy finishes sorting out support for many job files (so you can just configure your jobs in your file instead of one 16k config...)

Cloning is provided by the pod utilities so your job container doesn't need to do any cloning. You could have a container that just echoes some env, but I don't think that would really be all that helpful for future use.

@BenTheElder What exactly should we add to the docs? Are you suggesting adding an echo example? I have some simple examples already here: https://github.com/kubernetes/test-infra/blob/master/prow/pod-utilities.md#how-to-configure
I suppose an echo example could be even clearer.

Yeah, an echo env (maybe also ls?) would be a pretty harmless example. We should also eventually add more directly useful ones like "use the go image and go test".


@cjwagner https://github.com/kubernetes/test-infra/issues/8291#issuecomment-398457409
What is my job container?

Looking at that referenced config, I see image: gcr.io/test-images/bug-finder

Where does that come from? The top of the doc references the utilities as init-containers or sidecars, but I don't see that spec configuration in an example. Are they invisibly added?

@BenTheElder https://github.com/kubernetes/test-infra/issues/8291#issuecomment-398465145
As an example those would be nice. At the current moment, I do not have any idea what I get for free vs what I need to do.

Thanks for the definitions, we should start a test/dev glossary.

Not in any rush yet to start creating a job, so we can wait if necessary for the split support. I do not know what I would put into a job definition right now.

Prow runs jobs as kubernetes pods. The specification for a job includes a pod spec with a single container that runs the test. This is the job container that I am referring to, but test container is probably a more appropriate name.
The bug-finder image doesn't actually exist. That is just an example test container image name.

You don't see specs for the pod utility containers because they are transparently added when plank creates the pod. Configuration for the pod-utility behavior is exposed through other fields in the job config. The pod utilities are explained here: https://github.com/kubernetes/test-infra/blob/master/prow/pod-utilities.md#pod-utilities This section details the available config fields that you may need or want: https://github.com/kubernetes/test-infra/blob/master/prow/pod-utilities.md#how-to-configure

> At the current moment, I do not have any idea what I get for free vs what I need to do.

This section describes exactly that: https://github.com/kubernetes/test-infra/blob/master/prow/pod-utilities.md#what-the-test-container-can-expect
If something is missing please let me know and I'll add it.

What is the test container? I am providing an image to run? I do not have one.

Yes, the test container is the main thing that you'll need to provide. It is what actually runs your tests. You may be able to directly use the DinD image that @BenTheElder mentioned or you may need to use that image as a base for your own custom image if you have additional dependencies that are not vendored in the repo.

Okay, all of my reading of this documentation has not made it very clear what I provide and how I provide it.
Given that it's got a container/pod-spec, I can make assumptions, but it is much better to be very explicit.

Some statements I now think are true facts:

  • I give prow an image, prow runs my image in a pod
  • All the standard pod rules apply, but prow mounts some extra stuff, invisibly to me.
  • My image is run as test-container
  • the working directory of test-container is the git clone directory by convention (due to some of those invisible things)
  • I give my image to prow by making a prowjob
  • prowjobs are defined in config.yaml (for now, to be split up in the future)

What is the name of the docker-in-docker image? Are we talking about the standard dind image? Is there a common repo of all test images?

We have some common images in the images/ directory; the bootstrap image supports docker-in-docker amongst other things. We haven't migrated most jobs to the podutils just yet, so it includes some other stuff in the entrypoint. We'll need to create new images for use with the podutils specifically soon.


> I give prow an image, prow runs my image in a pod
> I give my image to prow by making a prowjob
> My image is run as test-container

You give Prow an entire Kubernetes PodSpec as part of the job specification. Kubernetes Pods aren't Prow specific so we don't go into detail about how to write them. The only Prow specific rule is that you can only specify one container in the pod (this does need to be better documented).

These docs may also be helpful @MHBauer: https://github.com/kubernetes/test-infra/tree/master/prow#how-to-add-new-jobs
When pod-utilities.md says In addition to normal ProwJob configuration... this is the normal configuration that it is referring to. That should be linked from pod-utilities.md and could be rephrased now that we intend for this to be the normal way to configure jobs.

> All the standard pod rules apply, but prow mounts some extra stuff, invisibly to me.
> the working directory of test-container is the git clone directory by convention (due to some of those invisible things)
> prowjobs are defined in config.yaml (for now, to be split up in the future)

These points are already covered in these two documents. If there is something unclear about how they are written I can try and improve the wording. The organization of the main prow README and linking between documents is something I've been meaning to improve.

Actually, in addition, the idea is that you create a PodSpec with one container, which must specify the command / entrypoint, and that command should exit 0 on success and anything else on failure.

The podutils stuff adds the sidecar/init containers to the podspec at
runtime (prior to scheduling the pod) to handle git checkout, log upload,
artifact upload, etc.

I think we could make it clearer that the spec field is a podspec, and the requirements on number of containers etc... The pod utilities are still new. Most old jobs are still using the horribly hacky jenkins/bootstrap.py script for logging etc. as we are transitioning.


There seems to be a lot of confusion here. @cjwagner @BenTheElder do we feel like we have a high-level document that describes how Prow runs k8s-native tests using plank? I wasn't expecting there to be uncertainty in that area but perhaps the different available execution backends (Jenkins, k8s simple, k8s decorated) are not laid out anywhere?

Yes, we definitely need more docs regarding that.


I think part of this is "How do I know I want a prowjob at all? Why is it called a job? What even is a job?"

> The organization of the main prow README and linking between documents is something I've been meaning to improve.

I think maybe I'm confused by the organization. I seem to end up jumping around a lot. A short tutorial with:

  • my repo is at github.com:x/y
  • is a basic go program, or some other sort of one-liner
  • adds a presubmit job
  • has an example of the output as provided by whatever

I'm pseudo-writing this up as a big checklist myself, but still learning how things are expected to be. I'll read through those two docs again.

> you create a PodSpec with one container, which must specify the command / entrypoint, and that command should exit 0 on success and anything else on failure

^ this is a very clear and direct statement.

> do we feel like we have a high-level document that describes how Prow runs k8s-native tests using plank?

I think the closest thing that we have is the "Life of a ProwJob" document: https://github.com/kubernetes/test-infra/blob/master/prow/architecture.md
It isn't a high-level overview of the things that are important for users to know, though; it's more of a technical overview of how ProwJobs are triggered. It also isn't discoverable.

so the first prowjob is set up, but we need to enable the trigger plugin to allow triggering presubmit jobs from pull requests. making a PR to do this now...

Added some more questions and discussion topics for the future in the OP. No need to answer, just want to have my thoughts written down.

/cc @jeremyrickard @nikhita
as other interested parties

Today's recommendation was that configuring tide is our immediate next step.

the result is thus #8710

we're going smoothly. prow/tide/whoever is running nicely. :fire:

Whoo! 🎉

@MHBauer if everything is still running smoothly, can we close this out?

/close
I'm going to close this out, please /reopen if there's anything missing

@spiffxp: Closing this issue.

In response to this:

/close
I'm going to close this out, please /reopen if there's anything missing

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
