Lighthouse: Optimum k8s node-worker resource limits for Lighthouse running w/ headless chrome in a container?

Created on 19 May 2018 · 6 Comments · Source: GoogleChrome/lighthouse

Hi there! 👋

We are seeing large gaps in our Lighthouse performance scores when running Lighthouse v2 (via CLI) in a container using headless Chrome vs. running Lighthouse locally in our browser. Against the same URL, the containerized setup with headless Chrome gives significantly lower performance scores.

Our deployed image has headless Chrome installed, and our testing application uses the chrome-launcher and lighthouse NPM packages to run Lighthouse tests against all our URLs.

We use Kubernetes and deploy our testing application as a cronjob. In k8s we have a dedicated node-pool to which we deploy the Lighthouse testing application. Each worker in the pool uses the n1-standard-2 (2 vCPUs, 7.5 GB memory) machine type. When running our tests we try to run 8 jobs in parallel (basically 8 node-workers, each running Lighthouse tests against different URLs).

We deploy our application as a cronjob in k8s to the dedicated node-pool using the following resource limit settings:

  resources:
    requests:
      cpu: 1000m
      memory: 4Gi

These resource settings give us _reasonable_ performance score results, but still significantly lower scores than running Lighthouse locally in our browser against the same URL.

Would it be possible to get some recommended settings that would help us achieve parity between our containerized Lighthouse performance scores and the scores we get running Lighthouse locally? Given that Chrome is heavily multi-threaded, would it make sense to try bumping the CPU request to 2000m?

Such a change would also force us to upgrade our node-pool workers to the next step up machine type - n1-standard-4 (4 vCPUs, 15 GB memory) I think, but it's something we're willing to invest in if it helps us achieve the parity we're looking for!

Any input and/or advice from the Lighthouse/Chrome team would be greatly appreciated!
Please also let us know if you have any follow-up questions.


skalfyfan

Most helpful comment

A little confused that you would suggest we reduce CPU and disable throttling. Is the goal not to emulate a Nexus 5 on 3G? Maybe there should be a warning when LH cannot properly emulate these conditions.

That is definitely the goal, but in cases where the CPU you're running on is closer to a Nexus 5X than to a desktop, the throttling we apply is overkill. What I'm essentially suggesting is that you manage the CPU throttling yourself in k8s by limiting the available CPU resources, and tell Lighthouse to disable its own throttling because you're already throttling outside of Lighthouse. Does that make sense?

patrickhulce on 22 May 2018

👍2

All 6 comments

We haven't done extensive parity testing of cloud machines vs. local setups, but one thing to try, if you haven't already, is limiting the available CPU even further and then disabling Lighthouse's CPU throttling when it runs as part of the job. We normally apply 4x CPU throttling, so right now you might be experiencing some extra, unnecessary slowdown. (Aside: we've found that in some CI environments, such as AppVeyor, the CPU is sufficiently slower than local machines that entirely disabling CPU throttling yields results more similar to local ones.)
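One possible reading of this suggestion, as a rough sketch rather than a tested recommendation: cap the container's CPU with a Kubernetes `limits` entry (the spec in the question sets only `requests`, which reserve capacity for scheduling but do not cap usage) and pass the Lighthouse v2 CLI flag `--disable-cpu-throttling`. The container name, image, URL, and the 500m figure below are placeholders, not values from this thread; a suitable limit would have to be found by experiment.

```yaml
# Fragment of a pod template (for example, under a CronJob's jobTemplate).
# Name, image, URL, and CPU figure are hypothetical placeholders.
containers:
  - name: lighthouse-runner            # hypothetical name
    image: example/lighthouse-runner   # hypothetical image with Chrome + Lighthouse v2
    command: ["lighthouse"]
    args:
      - "https://example.com/"         # placeholder URL under test
      - "--chrome-flags=--headless"
      - "--disable-cpu-throttling"     # Lighthouse v2: skip the built-in CPU slowdown
    resources:
      requests:
        cpu: 500m
        memory: 4Gi
      limits:
        cpu: 500m                      # hard cap on CPU time; tune by experiment
        memory: 4Gi
```

Setting `requests` equal to `limits` here is a design choice to keep each job's CPU share predictable across runs; it is an assumption of this sketch, not something the thread prescribes.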

v3 also offers significantly more tunable throttling settings, so if you want to give v3 a try, you can experiment with adjusting those multipliers to achieve more consistent numbers between the two environments.
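As an illustration of what adjusting those multipliers could look like, here is a hedged sketch of container `args` using the Lighthouse v3 CLI flags `--throttling-method` and `--throttling.cpuSlowdownMultiplier`. The URL is a placeholder and the multiplier of 2 is an arbitrary starting point (half of the 4x mentioned above), not a value recommended in this thread.

```yaml
# Fragment of a container spec; only the Lighthouse v3 arguments are shown.
# URL and multiplier are hypothetical placeholders.
command: ["lighthouse"]
args:
  - "https://example.com/"                  # placeholder URL under test
  - "--chrome-flags=--headless"
  - "--throttling-method=devtools"          # apply throttling through Chrome during the run
  - "--throttling.cpuSlowdownMultiplier=2"  # arbitrary starting point; compare against local scores
```

If the CPU is already capped outside Lighthouse, `--throttling-method=provided` is the v3 option that tells Lighthouse not to apply throttling of its own.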

Either way, we'd love to hear about your findings on the differences, so we can better document them for folks :)

patrickhulce on 21 May 2018

Hi @patrickhulce

Thanks for the quick response.

A little confused that you would suggest we reduce CPU and disable throttling. Is the goal not to emulate a Nexus 5 on 3G? Maybe there should be a warning when LH cannot properly emulate these conditions.

My concern is that by removing these controls, we will be increasing the variability of our test conditions and results, no?

githubjosh on 22 May 2018

p.s. we love the work you're doing with LH :) and appreciate that testing performance in a meaningful and reliable way is pretty darn challenging.

githubjosh on 22 May 2018

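A little confused that you would suggest we reduce CPU and disable throttling. Is the goal not to emulate a Nexus 5 on 3G? Maybe there should be a warning when LH cannot properly emulate these conditions.

That is definitely the goal, but in cases where the CPU you're running on is closer to a Nexus 5X than to a desktop, the throttling we apply is overkill. What I'm essentially suggesting is that you manage the CPU throttling yourself in k8s by limiting the available CPU resources, and tell Lighthouse to disable its own throttling because you're already throttling outside of Lighthouse. Does that make sense?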

patrickhulce on 22 May 2018

👍2

got it. seems reasonable. many thanks @patrickhulce. will give it a go and report back. @skalfyfan

githubjosh on 22 May 2018

👍1

closing since there hasn't been any movement since May, but happy to keep discussing here or elsewhere