Delete any job that:
The following 35 jobs meet these criteria:
ci-kubernetes-e2e-gce-gci-qa-serial-m55
ci-kubernetes-e2e-gci-gce-federation
ci-kubernetes-e2e-gce-federation-release-1.5
ci-kubernetes-e2e-gci-gke-alpha-features-release-1.4
ci-kubernetes-e2e-gke-container_vm-1.4-container_vm-latest-upgrade-master
ci-kubernetes-e2e-gce-gci-latest-1.4-debian-latest-upgrade-master
ci-kubernetes-e2e-gke-container_vm-1.4-gci-latest-upgrade-master
ci-kubernetes-e2e-gke-gci-1.4-container_vm-latest-upgrade-cluster
ci-kubernetes-e2e-gce-debian-latest-1.4-gci-latest-upgrade-cluster
ci-kubernetes-e2e-gce-gci-latest-1.4-debian-latest-upgrade-cluster
ci-kubernetes-e2e-gke-container_vm-1.4-container_vm-latest-upgrade-cluster
ci-kubernetes-e2e-gke-container_vm-1.4-gci-1.5-upgrade-master
ci-kubernetes-e2e-gke-container_vm-1.4-gci-latest-upgrade-cluster
ci-kubernetes-e2e-gke-gci-1.4-container_vm-latest-upgrade-master
ci-kubernetes-e2e-gke-gci-1.4-gci-latest-upgrade-master
ci-kubernetes-e2e-gce-1.4-1.5-upgrade-cluster
ci-kubernetes-e2e-gce-1.4-1.5-upgrade-master
ci-kubernetes-e2e-gce-debian-latest-1.4-gci-latest-upgrade-master
ci-kubernetes-e2e-gce-gci-latest-1.4-gci-latest-upgrade-cluster
ci-kubernetes-e2e-gce-gci-latest-1.4-gci-latest-upgrade-master
ci-kubernetes-e2e-gke-container_vm-1.4-container_vm-1.5-upgrade-cluster
ci-kubernetes-e2e-gke-container_vm-1.4-gci-1.5-upgrade-cluster
ci-kubernetes-e2e-gke-gci-1.4-container_vm-1.5-upgrade-cluster
ci-kubernetes-e2e-gke-gci-1.4-container_vm-1.5-upgrade-master
ci-kubernetes-e2e-gke-gci-1.4-gci-1.5-upgrade-master
ci-kubernetes-e2e-gke-gci-1.4-gci-latest-upgrade-cluster
ci-kubernetes-e2e-gke-container_vm-1.4-container_vm-1.5-upgrade-master
ci-kubernetes-e2e-gke-gci-1.4-gci-1.5-upgrade-cluster
ci-kubernetes-e2e-gce-1.4-1.5-upgrade-cluster-new
ci-kubernetes-e2e-gce-gci-latest-1.4-debian-latest-upgrade-cluster-new
ci-kubernetes-e2e-gce-debian-latest-1.4-gci-latest-upgrade-cluster-new
ci-kubernetes-e2e-gce-gci-latest-1.4-gci-latest-upgrade-cluster-new
Saving the following jobs that meet these criteria:
ci-kubernetes-e2e-gce-federation-serial - https://github.com/kubernetes/kubernetes/issues/44354 (sig-federation)
Would you like to save any of these jobs? If so, then please:
Note that these 9 jobs have been running since February and have never passed (will keep these for now):
ci-kubernetes-e2e-gce-latest-upgrade-cluster
ci-kubernetes-e2e-non-cri-gce-federation
ci-kubernetes-e2e-non-cri-gce-examples
ci-kubernetes-e2e-non-cri-gke-flaky
ci-kubernetes-soak-gce-non-cri-test
ci-kubernetes-e2e-non-cri-gce-flaky
ci-kubernetes-e2e-non-cri-gce-serial
ci-kubernetes-charts-gce
/assign
BigQuery details
select
jobs.job,
latest_pass,
weekly_builds,
first_run,
latest_run
from (
select
job,
count(1) weekly_builds
from `k8s-gubernator.build.all`
where
started > timestamp_sub(current_timestamp(), interval 7 day)
group by job
order by job
) jobs
left join (
select
job,
max(started) latest_pass
from `k8s-gubernator.build.all`
where
result = 'SUCCESS'
group by job
) passes
on jobs.job = passes.job
left join (
select
job,
date(min(started)) first_run,
date(max(started)) latest_run
from `k8s-gubernator.build.all`
group by job
) runs
on jobs.job = runs.job
where
latest_pass is null
and date_diff(current_date, first_run, month) > 1
and date_diff(current_date, latest_run, day) < 7
order by latest_pass, first_run, weekly_builds desc, jobs.job
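The query above joins three per-job aggregates: how often the job ran in the last week, when it last passed, and its first and latest runs; a job qualifies when it ran recently, is more than a month old, and has no SUCCESS on record. A minimal sketch of the same filter in plain Python, using hypothetical build records and approximating `date_diff(..., month)` with a 31-day cutoff:

```python
from datetime import datetime, timedelta

def never_passing_jobs(builds, today):
    """Approximate the BigQuery filter above in plain Python.

    builds: iterable of dicts with 'job', 'started' (datetime), 'result'.
    Returns job names that ran within the last 7 days, first ran more
    than roughly a month ago, and have never produced a SUCCESS result.
    """
    jobs = {}
    for b in builds:
        j = jobs.setdefault(b['job'],
                            {'first': b['started'], 'latest': b['started'],
                             'passed': False})
        j['first'] = min(j['first'], b['started'])
        j['latest'] = max(j['latest'], b['started'])
        if b['result'] == 'SUCCESS':
            j['passed'] = True

    eligible = []
    for name, j in sorted(jobs.items()):
        # rough stand-in for date_diff(current_date, first_run, month) > 1
        old_enough = (today - j['first']) > timedelta(days=31)
        # date_diff(current_date, latest_run, day) < 7
        ran_recently = (today - j['latest']) < timedelta(days=7)
        if not j['passed'] and old_enough and ran_recently:
            eligible.append(name)
    return eligible
```

The record shape and the 31-day approximation are assumptions for illustration; the authoritative logic is the SQL above.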
ci-kubernetes-soak-gce-federation-test
Shouldn't be deleted. We are having various problems with it and @shashidharatd is tirelessly working on it.
ci-kubernetes-e2e-gce-federation-serial
Shouldn't be deleted either. However, we need a plan to fix this which we don't have yet.
Also, we just deleted ci-kubernetes-e2e-gci-gce-federation and ci-kubernetes-e2e-gce-federation-release-1.5 today - PR #2435.
+1 to remove & close the open issue(s) related to that
@madhusudancs @shashidharatd we can definitely keep those two jobs (and thanks for working to fix them!) Will you please create issues for ci-kubernetes-soak-gce-federation-test and ci-kubernetes-e2e-gce-federation-serial as described above?
Filed:
ci-kubernetes-soak-gce-federation-test
ci-kubernetes-e2e-gce-federation-serial
> ci-kubernetes-e2e-gce-latest-upgrade-cluster
> ci-kubernetes-e2e-non-cri-gce-federation
> ci-kubernetes-e2e-non-cri-gce-examples
> ci-kubernetes-e2e-non-cri-gke-flaky
> ci-kubernetes-soak-gce-non-cri-test
> ci-kubernetes-e2e-non-cri-gce-flaky
> ci-kubernetes-e2e-non-cri-gce-serial
> ci-kubernetes-charts-gce
The "non-cri" jobs are just there to make sure they are no flakier than their "cri" counterparts. We plan to remove all of them once we remove the non-CRI implementation, hopefully in a couple of weeks.
On a side note, many of the jobs without the "non-cri" tag also fit the same criteria but are not on the list:
https://k8s-testgrid.appspot.com/google-gce#gce-flaky
https://k8s-testgrid.appspot.com/google-gce#gci-gce-flaky
https://k8s-testgrid.appspot.com/google-gce#gci-gce-examples
https://k8s-testgrid.appspot.com/google-gce#gci-gce-serial
gce-flaky has passed at least once during its existence. The ci-kubernetes-e2e-non-cri-gce-flaky has passed exactly 0 times. Ever.
> gce-flaky has passed at least once during its existence. The ci-kubernetes-e2e-non-cri-gce-flaky has passed exactly 0 times. Ever.
Ah...I missed the part "since February" in your original post.
gce-flaky has been around much longer than non-cri-gce-flaky, so I don't doubt that it has passed before. They have been equally flaky in the past two weeks though, so I don't expect the non-cri jobs to magically heal themselves unless their counterparts are fixed.
What is the plan to fix the flaky suite?
This week I will only delete jobs that have not passed in 2017, which excludes the non-cri jobs since they passed in February. After we delete these then I'll find the next oldest set of jobs that have not passed, which will include the non-cri jobs assuming they still exist.
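The "has not passed in 2017" cutoff described here can be expressed as a one-line filter. A sketch, assuming each row carries the `latest_pass` date from the BigQuery results above (with `None` for never-passed jobs):

```python
from datetime import date

def eligible_for_deletion(jobs, cutoff=date(2017, 1, 1)):
    """Jobs whose most recent pass is missing or predates the cutoff.

    jobs: iterable of dicts with 'job' and 'latest_pass' (date or None).
    The record shape is hypothetical; only the cutoff rule comes from
    the thread.
    """
    return [j['job'] for j in jobs
            if j['latest_pass'] is None or j['latest_pass'] < cutoff]
```

Under this rule a job that last passed in February 2017 (like the non-cri jobs) survives the current round but would be caught by a later, stricter cutoff.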
> What is the plan to fix the flaky suite?
I think that goes to the individual test owners of the flaky suites.
> This week I will only delete jobs that have not passed in 2017, which excludes the non-cri jobs since they passed in February. After we delete these then I'll find the next oldest set of jobs that have not passed, which will include the non-cri jobs assuming they still exist.
Ack.
Sorry I'm a bit late to the party here.
I disagree with just deleting tests because they haven't passed in a while. Poking around the testgrid, some of these suites are pretty green, with just a few individual consistently-failing tests. It likely means the people responsible for the tests don't have the cycles to fix them - or that the tests don't really have clear owners. We shouldn't try to fix a lack of time or ownership by ignoring the items (failing suites) on our collective todo list.
Some of these suites look like renamed versions of older suites, which may have passed at some point.
I'd rather have a fixit or something for the tests than just delete failing ones.
Renaming is not the issue here. All of these suites started running in 2016 and have not passed in 2017.
Note that there are around 100 suites in total that have never passed; it might be better to prioritize getting these to work, as I'll target them in the next round of deletion.
However I can hold off on deleting this set of jobs until May 1 if you think you'll be able to fix them.
So I'm flummoxed by some of these suites; unless there is an owner, they will bit-rot.
Update: After talking to Erick about this, we're going ahead with deletion but will be tracking the PRs that delete tests in an aggregation issue so that they can be reviewed in more detail by domain-experts.
Migrated the "invest time into repairing these suites" work into another issue.
In the near future I will create a follow-up issue to delete jobs that have been failing for at least the last 90 days.
might want to automate this process
Interesting idea. Which part?
Like: every month, check for jobs that have never passed in the past month, send a warning email and a deletion PR, and merge the deletion if no one objects.
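That monthly loop could be sketched as a small policy function; the function name, the record fields, and the 31-day "month" are all hypothetical, not an existing tool:

```python
from datetime import date, timedelta

def next_action(job, last_pass, warned_on, today):
    """Decide what a monthly cleanup bot should do for one job.

    job: job name (unused here, kept for logging by a real bot);
    last_pass: date of the most recent SUCCESS, or None if never passed;
    warned_on: date a warning email/deletion PR went out, or None.
    Policy sketch: warn after a month with no pass, merge the deletion
    PR if the warning is a month old and nobody has fixed the job.
    """
    month = timedelta(days=31)  # rough stand-in for "a month"
    if last_pass is not None and today - last_pass < month:
        return 'ok'
    if warned_on is None:
        return 'send-warning-and-deletion-pr'
    if today - warned_on >= month:
        return 'merge-deletion'
    return 'wait'
```

A real bot would drive this off the BigQuery results above and open PRs against the job configs, but the state machine itself is this small.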