Cluster-api: Flakes in the E2E Conformance tests labeled as Serial

Created on 14 Jan 2021 · 6 Comments · Source: kubernetes-sigs/cluster-api

What steps did you take and what happened:
Recently we have been observing an increasing number of failures in E2E conformance tests labeled as [Serial], e.g.:

  • Kubernetes e2e suite.[sig-apps] Daemon set [Serial] should run and stop complex daemon [Conformance]
  • Kubernetes e2e suite.[sig-scheduling] SchedulerPreemption [Serial] validates basic preemption works [Conformance]
  • Kubernetes e2e suite.[sig-scheduling] SchedulerPreemption [Serial] validates lower priority pod preemption by critical pod [Conformance]

This is because our E2E test framework runs tests in parallel across 5 ginkgo nodes, while tests labeled [Serial] are not meant to run concurrently with other tests.

So now we have two options:

  • Remove parallelism; this will increase the overall test duration to ~1 for the conformance test only.
  • Skip tests labeled as [Serial].

I would like to hear other opinions as well before proceeding with the change.

Personally, I'm +1 to choose option 2, given that:

  • CAPD on prow (docker in docker in docker) is fast, but also easily subject to noisy neighbors, so I prefer stability over depth of our conformance signal.
  • We will get a better signal about conformance from providers.
  • This is the same approach followed by kubeadm, and the CI signal there is almost stable.

@vincepri @CecileRobertMichon @randomvariable @detiber opinions?

Environment:

  • Cluster-api version: main

/kind bug
/area testing

area/testing · good first issue · help wanted · kind/bug

All 6 comments

/milestone v0.4.0

+1 to skip the serial tests, although I would like to hear from others who have more experience in this area.

We are skipping these in kubeadm tests.
Serial tests are meant to run in a separate, less frequent job (and it's going to be slow).

/help
/good-first-issue

To fix this add ginkgo.skip: \[Serial\] to https://github.com/kubernetes-sigs/cluster-api/blob/master/test/e2e/data/kubetest/conformance.yaml, send a PR, and then check the output of conformance tests after triggering /test pull-cluster-api-e2e-full-main; no tests labeled [Serial] should be there for the latest run...
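
As a rough illustration of what the suggested skip expression does, the sketch below applies the regular expression \[Serial\] to a few test names using only Go's standard regexp package. It does not reproduce how ginkgo or kubetest actually filter tests; the shouldSkip helper is hypothetical, and the last test name is included only as an example of a test without the [Serial] label.

```go
package main

import (
	"fmt"
	"regexp"
)

// skipSerial mirrors the expression suggested above. The brackets are
// escaped so they match literally; an unescaped [Serial] would be a
// character class matching any single one of the letters S, e, r, i, a, l.
var skipSerial = regexp.MustCompile(`\[Serial\]`)

// shouldSkip is a hypothetical helper for illustration only; it is not
// part of ginkgo, kubetest, or the cluster-api test framework.
func shouldSkip(testName string) bool {
	return skipSerial.MatchString(testName)
}

func main() {
	names := []string{
		"Kubernetes e2e suite.[sig-apps] Daemon set [Serial] should run and stop complex daemon [Conformance]",
		"Kubernetes e2e suite.[sig-scheduling] SchedulerPreemption [Serial] validates basic preemption works [Conformance]",
		// Illustrative name without the [Serial] label.
		"Kubernetes e2e suite.[sig-apps] Deployment should run the lifecycle of a Deployment [Conformance]",
	}
	for _, name := range names {
		fmt.Printf("skip=%t\t%s\n", shouldSkip(name), name)
	}
}
```

With an expression like this, the [Serial] tests listed in the issue description would be filtered out, while tests without that label would still run.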

@fabriziopandini:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.

@fabriziopandini I can test it locally and send a patch with suggested changes.

/assign
