Lighthouse: How to properly kill a lighthouse process prematurely?

Created on 16 May 2019 · 13 Comments · Source: GoogleChrome/lighthouse

Let's say for some reason I want to kill the lighthouse process and all its child processes before the run has completed. How could I achieve this? An asynchronous way would be best, so I can pause program execution until I know all the chrome processes have been killed properly.

Is it as simple as chrome.kill(), assuming chrome is an instance of chrome-launcher, and lighthouse is running on this via port?

The reason I am doing this is because I want to run lots and lots of lighthouse instances sequentially in a single node.js process. It seems to work fairly smoothly and reliably if the lighthouse runs finish properly, but not so well if my custom timeout function fires, triggering chrome.kill and then going to the next url to audit.

I'm purposefully leaving out any error messages I have been getting so that the bot doesn't pick them up and sideline me! 😉

needs-priority question

All 13 comments

There's not currently an easy way to cancel in the middle of a run. All of our environments that desire cancellation treat LH as a separate process that can be killed at any time.

lots and lots of lighthouse instances sequentially in a single node.js process

Some of our assumptions aren't really optimizing for this case so you might have some issues, but if it's working for you, great!

The blessed way of running lots of LH runs would definitely be separate child processes for each. Is there a particular reason you want them all in the same process?

There's not currently an easy way to cancel in the middle of a run.
All of our environments that desire cancellation treat LH as a separate process that can be killed at any time.

Sorry, these two statements didn't make sense to me, they seem to conflict, is there a typo?

It seems to be working OK for now. It's definitely not without its issues, but we have worked hard to mitigate what we can.

The blessed way of running lots of LH runs would definitely be separate child processes for each. Is there a particular reason you want them all in the same process?

When you say separate _'child processes for each'_, does that include running it in the chrome-launcher process, since that runs as a child process of node? Or am I missing something?

There is no particular desire to do this, it's just what our architecture dictates. URLs to audit sit on an Amazon SQS queue, and a node.js application consumes the queue. On receiving a message: boot chrome-launcher, run LH, process results etc, kill chrome-launcher, get next message. So the node process stays alive, and chrome gets booted and killed for each LH run. It's effectively lighthouse-as-a-service, but private to our web-app.

Note: this is NOT running Lighthouse concurrently, but running lots of runs sequentially on the same node process

Sorry, these two statements didn't make sense to me, they seem to conflict, is there a typo?

No typo. The important distinction is that they are running LH in a separate process so that they can kill them because there is no way to easily cancel a run.

My suggestion to you is to do what other services have done, including our own implementations, which is to have the long-running node process just run the LH CLI as a separate child process (e.g. with require('child_process').exec or execFile). It will launch chrome for you, do all the cleanup on exit, etc., and it's easily killable at any point in time. It doesn't sound like you have any particular reason that you need to run LH within the same node process, so this might work for you. Does that make sense?

tl;dr - LH has no API for cancelling within node but you can obviously kill a process tree anytime you like, so make sure you run LH in a process that you don't mind killing i.e. a child one that's not running your timeout code too :)
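As a rough sketch of that suggestion, the long-running process can start each run as a child process and kill it when a timer fires. The helper name `runWithTimeout` and its return shape are made up for illustration; only `child_process.spawn` and `child.kill()` are real Node.js APIs. Whether the default SIGTERM gives the Lighthouse CLI a chance to clean up Chrome is an assumption worth verifying in your environment.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: run a command in a child process and kill it if it
// has not exited within timeoutMs. Resolves with how the child ended.
function runWithTimeout(command, args, timeoutMs) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: 'ignore' });
    let timedOut = false;

    const timer = setTimeout(() => {
      timedOut = true;
      // Sends SIGTERM by default. This signals the direct child only,
      // not any processes that child has started itself.
      child.kill();
    }, timeoutMs);

    // 'error' fires if the command could not be started at all.
    child.once('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });

    // 'exit' reports an exit code, or the signal that ended the process.
    child.once('exit', (code, signal) => {
      clearTimeout(timer);
      resolve({ code, signal, timedOut });
    });
  });
}

// Usage, assuming the lighthouse CLI is installed and on PATH:
// const outcome = await runWithTimeout('lighthouse', ['https://example.com', '--quiet'], 60000);
// if (outcome.timedOut) { /* move on to the next URL */ }
```

Because the timeout code lives in the parent, a hung run cannot block it, which is the point of the tl;dr above.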

As an aside, you may be interested in using PSI's API, which offers LH as a service maintained by us.

Hey Patrick, as usual thanks for the excellent replies

No typo. The important distinction is that they are running LH in a separate process so that they can kill them because there is no way to easily cancel a run.

Sorry, I understand perfectly now, I was reading that late last night when tired!!

OK it appears as if running in separate processes per message/LH run is the only safe way forward. The problem with your LH CLI approach is that in our application we are doing something like (highly simplified):

results = await lighthouse(url, opts, config);
await sendResultsSomewhere(results);

Basically, as far as I can tell, CLI outputs results as JSON/HTML to file, whereas we require the LHR javascript object.

Opening each LH run in a separate process would work absolutely fine for us. But is there a solution for doing this using the node module? using fork() from child_process or something?

If you just need the LHR object in memory (and not the artifacts) you can send the JSON output to stdout and parse it in your node script to avoid hitting the filesystem.
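A minimal sketch of the stdout approach, assuming the CLI is invoked with `--output=json --output-path=stdout`. The helper name `runJsonCommand` is hypothetical; it wraps `child_process.execFile` and parses whatever the command prints as JSON.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: run a command and parse its stdout as JSON.
function runJsonCommand(command, args, options = {}) {
  return new Promise((resolve, reject) => {
    execFile(
      command,
      args,
      // A Lighthouse result can be several MB, so raise maxBuffer above the
      // default. Callers may also pass execFile's own `timeout` option.
      { maxBuffer: 64 * 1024 * 1024, ...options },
      (err, stdout) => {
        if (err) return reject(err);
        try {
          resolve(JSON.parse(stdout));
        } catch (parseErr) {
          reject(parseErr);
        }
      }
    );
  });
}

// Usage, assuming the lighthouse CLI is installed and on PATH:
// const lhr = await runJsonCommand('lighthouse', [
//   'https://example.com',
//   '--output=json',
//   '--output-path=stdout',
//   '--quiet',
// ]);
// console.log(lhr.categories.performance.score);
```

Nothing is written to disk here; the parsed object is the LHR, while the artifacts stay inside the child process.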

One of our implementations does this for example and only pulls in the artifacts from disk for debug runs (see link below)

https://github.com/patrickhulce/dzl-lighthouse/blob/c77e0b37ca05777d24d3c232878c6987b2e855ab/cli/lib/collectors/local.js#L83-L101

And don't forget you're always free to write your own javascript file that invokes LH, puts the result wherever you need them to go and run that script in a separate process instead of LH directly too :)

Patrick, thanks, that's awesome.

We have custom audits, that make use of the artifacts, so we need those in memory but only during the execution time of LH, right? So using JSON.parse(results.stdout) should work fine for us, I _think_

The problem is, the async exec-promise type libraries (such as execa that you've used) don't return the pid of the spawned child process until they have resolved/rejected, which is useless if you want to kill (or tree-kill) the process before it has finished. I need something like bluebird's cancel feature.

And don't forget you're always free to write your own javascript file that invokes LH, puts the result wherever you need them to go and run that script in a separate process instead of LH directly too :)

I had a go at this, it's... weird. The main problem is that wrapping lighthouse in a script and running it with child_process.fork() or whatever is easy enough when the url is hardcoded in the wrapper script. What you'd want, though, is to pass a url argument to the forked process so the wrapper script can receive it, which is actually not so simple?

In an ideal world:

const results = await fork(runLH(url))

Where runLH is the lightweight launch-chrome-then-lighthouse and return LHR wrapper.
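One way to get close to that, sketched under assumptions: `child_process.fork()` accepts an argument array, so the URL can be passed on the command line and the result sent back over the built-in IPC channel. The helper `runInFork` and the worker script it expects are hypothetical; the worker is assumed to read `process.argv[2]`, run the audit, and call `process.send()` with the result.

```javascript
const { fork } = require('child_process');

// worker.js (hypothetical), executed in the forked process:
//   const url = process.argv[2];
//   ...launch Chrome and run lighthouse(url, opts, config)...
//   process.send(results.lhr, () => process.exit(0));

// Hypothetical parent-side helper: fork the worker with the URL as an
// argument and resolve with the first message it sends back.
function runInFork(workerPath, url, timeoutMs) {
  return new Promise((resolve, reject) => {
    const child = fork(workerPath, [url]);
    let settled = false;

    // Settle the promise exactly once, whichever event arrives first.
    const finish = (settle, value) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      settle(value);
    };

    const timer = setTimeout(() => {
      child.kill(); // SIGTERM to the worker only
      finish(reject, new Error(`Timed out after ${timeoutMs} ms: ${url}`));
    }, timeoutMs);

    child.once('message', (result) => finish(resolve, result));
    child.once('error', (err) => finish(reject, err));
    child.once('exit', (code, signal) => {
      finish(reject, new Error(`Worker exited early (code ${code}, signal ${signal})`));
    });
  });
}

// Usage:
// const lhr = await runInFork('./worker.js', 'https://example.com', 60000);
// await sendResultsSomewhere(lhr);
```

Messages sent over the fork channel are serialized, so this suits a plain result object such as the LHR rather than live handles like the Chrome instance.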

For anyone reading this, I've tried just about everything, including execa(), bluebird, and loads of promise/process libraries, with varying amounts of success (biased to our use-case), and found that the simplest way was to use chrome-launcher and just call chrome.kill() on the instance when I need to abort. I have had to accept that some buffer time is necessary to ensure chrome dies and cleans up properly before trying the next run.

seems like this is as solved as it's going to get. I'm going to close, but can always reopen if you'd like to see more done on the LH side.

Hmm, we are still experiencing issues with this, but thanks for the update

@upugo-dev regarding your last update, the best bet is to just use child_process libraries directly. That way you can control when a promise would resolve and could still return the pid immediately.

@patrickhulce thanks for the info. I seem to recall, though, that killing a child_process that has spawned its own child process(es), i.e. lighthouse, is kind of unreliable. The kill command sometimes doesn't kill the whole tree properly, leaving you with orphaned processes. I'll need to do some experiments again with this though.
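For the orphaned-process concern, one common POSIX technique (not something the thread confirms for Lighthouse) is to start the child as the leader of its own process group and signal the whole group. With plain `child_process.spawn`, `child.pid` is available as soon as the call returns. The helper names below are hypothetical, and this does not work on Windows.

```javascript
const { spawn } = require('child_process');

// Start a command as the leader of a new process group (POSIX only).
// child.pid is available immediately after this call returns.
function spawnInGroup(command, args) {
  return spawn(command, args, { detached: true, stdio: 'ignore' });
}

// Signal every process in the child's group. On POSIX, passing a negative
// pid to process.kill() targets the process group with that id.
// Returns false if the group no longer exists.
function killGroup(child, signal = 'SIGKILL') {
  try {
    process.kill(-child.pid, signal);
    return true;
  } catch (err) {
    if (err.code === 'ESRCH') return false; // already gone
    throw err;
  }
}

// Usage, assuming the lighthouse CLI is installed and on PATH:
// const child = spawnInGroup('lighthouse', ['https://example.com', '--quiet']);
// setTimeout(() => killGroup(child), 60000);
```

A descendant that was itself started in a new group falls outside this one and will not be reached, so it is worth checking whether that applies to Chrome as launched in your setup before relying on this.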
