Bazel 0.10.0 is now in use in test-infra and kubernetes (release-1.10/master). This release contains some nice improvements to the HTTP remote caching system, and we should leverage it instead of our existing approach of using persistent storage for the local cache.
Why?
Why not?
Action Items
nursery to greenhouse to be less confusing :-) (https://github.com/kubernetes/test-infra/pull/6879)
/area bazel
/area jobs
/assign
Some more notes:
pull-kubernetes-bazel-test takes 25-30m currently; once the cache is hot, pull-kubernetes-bazel-test-canary takes ~5min typically. There is a lot of variation for both, though, mostly due to the load on the node the job runs on.
pull-test-infra-bazel takes 8-10m currently; pull-test-infra-bazel-canary takes ~3 min (about two minutes of which is spent installing python deps and running pylint...).
pull-kubernetes-bazel-build-canary is not caching well currently; we probably need to mark things like hyperkube and the tarballs as no-cache.
FYI @perotinus we can probably look at using this with the cluster-registry soon, test-infra is using it now.
Tested eviction a bit more with: https://github.com/BenTheElder/test-infra/blob/20d7d58ac34d59e241eddfb107e3b735398cd8d7/experiment/fill_cache.sh
Will PR some logging changes but so far WAI
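For readers who want to try something similar, here is a rough sketch of the idea behind an eviction test: push random blobs at the cache until older entries should be evicted, then check whether an early blob is still served. It is not a port of the linked fill_cache.sh. It assumes the cache follows Bazel's HTTP caching convention of storing blobs under /cas/<sha256>, and the function names and the base URL passed in are hypothetical.

```python
import hashlib
import os
import urllib.error
import urllib.request


def random_blob(size_bytes):
    """Return (sha256 hex digest, data) for a blob of random bytes."""
    data = os.urandom(size_bytes)
    return hashlib.sha256(data).hexdigest(), data


def cas_url(cache_base, digest):
    # Assumption: blobs live under /cas/<sha256>, as in Bazel's HTTP cache protocol.
    return cache_base.rstrip("/") + "/cas/" + digest


def fill_cache(cache_base, blob_count, blob_size):
    """PUT blob_count random blobs into the cache and return their digests in upload order."""
    digests = []
    for _ in range(blob_count):
        digest, data = random_blob(blob_size)
        request = urllib.request.Request(cas_url(cache_base, digest), data=data, method="PUT")
        with urllib.request.urlopen(request) as response:
            response.read()
        digests.append(digest)
    return digests


def still_cached(cache_base, digest):
    """Return True if the cache still serves the blob, False if it answers with an HTTP error."""
    try:
        with urllib.request.urlopen(cas_url(cache_base, digest)) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False
```

Uploading more data than the cache's storage limit and then calling still_cached on the first returned digest gives a quick signal of whether eviction kicked in.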
Edit: see also, results of turning this on for test-infra:

Metrics dashboard is now up here: http://velodrome.k8s.io/dashboard/db/cache-monitoring?orgId=1
Edit: moved to here http://velodrome.k8s.io/dashboard/db/bazel-cache?orgId=1
Testing a new test-infra PR appears to have 3468 action cache hits, 7 action cache misses, and 1920 CAS hits (!)
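To put those numbers in perspective, the action cache hit rate can be worked out directly from the counts above. This is just a small illustrative calculation; the helper name hit_rate is made up for the example.

```python
def hit_rate(hits, misses):
    """Fraction of lookups that were served from the cache."""
    total = hits + misses
    if total == 0:
        return 0.0
    return hits / total


# Counts quoted above for the test-infra PR.
action_cache_rate = hit_rate(3468, 7)
print(f"action cache hit rate: {action_cache_rate:.1%}")  # prints "action cache hit rate: 99.8%"
```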
I've now turned this on for the kubernetes CI bazel-build and bazel-test jobs with great results*


* Note: the build job only runs in post-submit, and once every 6 hours, currently. Once this job is properly continuous the results for it will be more obvious.
We also have a monitoring dashboard now:

https://k8s-testgrid.appspot.com/presubmits-kubernetes-blocking#pull-kubernetes-bazel-test&graph-metrics=test-duration-minutes
As expected instead of ~25+ minutes we're seeing ~5-6 minutes for pull-kubernetes-bazel-test after switching this on today.
Absolutely amazing work @BenTheElder. Congratulations!
Thanks Jakob :-)
kops is now using the remote cache, 8+ minutes to test -> ~2 minutes
I think once https://github.com/kubernetes/test-infra/pull/7205 is in we can close this. I've significantly upped the cache storage and we're flipping it on for pretty much all other presubmits that should leverage caching.
/close
We've rolled this out to many more jobs; pull-kubernetes-bazel-build in particular is now trending towards 5-8 minutes instead of 13+.

woohooo!!!
cc @ulfjack