I am using lighthouse from javascript to check a few pages of a website. I would like to be able to tell lighthouse to use a specific tab that is already open in my chrome to do that. Maybe by giving it a link to a ws endpoint. That's because I create the browser using puppeteer and I want to do some operations before running lighthouse (like setting the user agent or some request interception configuration) and once the lighthouse check is done (like getting the html of the page or interacting with the page).
Is it possible to tell lighthouse to use an already existing tab?
Thanks for filing @Khady!
In short, it's not currently possible. Lighthouse always controls the navigation to the page. There are some settings (user agent/setting a cookie/logging in) you'd be able to set using puppeteer before providing the same port to LH, but you'll need to make sure LH isn't overriding those by disabling mobile emulation/storage reset where applicable.
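As a rough sketch of "providing the same port to LH" while switching off the overrides mentioned above: the helper name `buildLighthouseFlags` is hypothetical, and the flag names `disableDeviceEmulation` and `disableStorageReset` are assumptions based on the Lighthouse CLI options of the versions discussed around this comment. Later releases changed the emulation settings, so check the flags your version accepts.

```javascript
// Hypothetical helper: builds the flags object passed as the second
// argument to lighthouse(url, flags). The two disable* flag names are
// assumptions; verify them against the Lighthouse version you run.
function buildLighthouseFlags(debuggingPort) {
  return {
    port: debuggingPort,          // the Chrome instance Puppeteer is attached to
    disableDeviceEmulation: true, // keep the user agent / viewport set via Puppeteer
    disableStorageReset: true,    // keep cookies and storage set before the run
  };
}

// Example: Chrome was started with --remote-debugging-port=9222.
// With Puppeteer, the port can be read from new URL(browser.wsEndpoint()).port.
const flags = buildLighthouseFlags(9222);
console.log(flags);
```

Lighthouse will still open its own tab and navigate to the URL, so this only helps for state that lives at the browser level (cookies, for example).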
We've got it on our roadmap to enable auditing without a navigation though in which case this sort of thing becomes possible :)
related #1769, #3833
We're not ready to open up the full multiclient story where Lighthouse and Puppeteer/CRI talk to the same page. There are some dragons within here that we're not ready to fight yet.
There's another approach we're discussing and we currently favor:
You can do all this today without any code changes (although #3864 should help quite a bit..) Your custom gatherer won't actually return a useful artifact, but that's OK. We're just abusing its lifecycle hooks.
@patrickhulce does this match what you were thinking?
Yeah this seems like the quickest way to achieve as much of the goals as possible today. Long-term vision, reusing existing tab and making LH more flexible in analyzing existing pages is definitely the way we should be moving to play better with DevTools and puppeteer
how do we feel about creating an example:
https://github.com/GoogleChrome/lighthouse/tree/master/docs/recipes
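I'm working on a new version of my tool following your advice.
I'm using a custom gatherer to set up what I need. But this is not convenient at all. A lot of information is missing in the gatherer, and I have to transfer it there to do the whole setup properly. There are two options to do that if I understand correctly the code I read:
1. using a global state on my side, which is not very clean and not very convenient. I have no unique identifier available in my code and in the gatherer to which I can attach the information. The best I found is wsEndpoint × url, but it is not unique if the same url is opened multiple times in the same browser. It would be nice to have wsEndpoint × pageId, but this information is not publicly available in lighthouse and puppeteer (options.driver._connection._pageId in lh, page._client._targetId in pptr). And anyway I can't know before launching lighthouse what the id of the page will be. ¯\_(ツ)_/¯
2. storing the values I need in the flags object, which is transferred all the way to the gatherer. I am a bit scared to use this solution because there is no guarantee that the flags will always be given to the gatherer, and it smells like a bit of a hack. Is there an object in which I can put some data to use in my gatherer and be sure it won't disappear in the future? With better semantics than the flags object :)?
Also if there is an error during the setup of the page (it shouldn't happen, but we are dealing with computers => it will happen), I didn't find a way from the gatherer to interrupt the whole lighthouse operation. I can see in the artifacts that my gatherer has returned an empty object and invalidate the results that way. But it is cumbersome. Plus it still costs me the duration of the lighthouse run, which can be pretty long, and it can create an unnecessary crawl of the page I try to evaluate.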
aslushnikov says in https://github.com/GoogleChrome/puppeteer/pull/1398#issuecomment-346193179 that pptr could move from ws connection to pipe connection. If the pageId/targetId system is not portable over the pipe connection then I guess there is not much choice but to keep the connection system as it is currently. No point adding #3857 if it is going to be deprecated soon.
ps: I think I found a possible bug in cri.js while reading the code, but I don't have the time to properly investigate it (and it's probably not faced very often). When a page is closed here, it can be the last page of the browser, because of the condition at this point. It is possible to reuse an existing tab, and if this tab is the last tab and is closed by lighthouse at the end of the run, the browser will be closed too.
Great feedback @Khady you're somewhat of a pioneer in this area, so it's great to be aware of the pain points :) A few responses to your comments below
using a global state on my side, which is not very clean and not very convenient. I have no unique identifier available in my code and in the gatherer to which I can attach the information. The best I found is wsEndpoint × url, but it is not unique if the same url is opened multiple times in the same browser. It would be nice to have wsEndpoint × pageId, but this information is not publicly available in lighthouse and puppeteer
Ah, you're having trouble finding the target to use in puppeteer once the page has been loaded correct? Yeah, we should expand https://github.com/GoogleChrome/lighthouse/pull/3864 to communicate the target/page ID as well.
storing the values I need in the flags object, which is transferred all the way to the gatherer. I am a bit scared to use this solution because there is no guarantee that the flags will always be given to the gatherer, and it smells like a bit of a hack. Is there an object in which I can put some data to use in my gatherer and be sure it won't disappear in the future? With better semantics than the flags object :)?
Yes, we had a plan for this and haven't gotten around since there wasn't an immediate need, but we want to implement audit and gatherer options to pass in dynamic runtime information that can control audit/gatherer behavior separately from the gatherer/audit code itself.
Also if there is an error during the setup of the page (it shouldn't happen, but we are dealing with computers => it will happen), I didn't find a way from the gatherer to interrupt the whole lighthouse operation. I can see in the artifacts that my gatherer has returned an empty object and invalidate the results that way. But it is cumbersome. Plus it still costs me the duration of the lighthouse run, which can be pretty long, and it can create an unnecessary crawl of the page I try to evaluate.
You should be able to mark an error with a .fatal property to have LH exit immediately rather than just fail the gatherer.
pass(/** stuff */) {
  const error = new Error("Uh-oh something went wrong!");
  error.fatal = true;
  throw error;
}
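To apply the same idea to setup code that can fail for many reasons, one option is a small wrapper that turns any failure into a fatal error. This is only a sketch: `toFatalError` and the `setUpPage` callback are hypothetical names, and the `beforePass` shown here takes an extra argument that a real gatherer hook would not receive.

```javascript
// Hypothetical helper: wraps a setup failure in an Error flagged as fatal,
// which (per the comment above) should make Lighthouse exit instead of
// only failing the gatherer.
function toFatalError(cause) {
  const message = cause instanceof Error ? cause.message : String(cause);
  const error = new Error(`Page setup failed: ${message}`);
  error.fatal = true;
  return error;
}

// Sketch of a gatherer lifecycle hook. `setUpPage` stands in for your own
// setup code (user agent, interception, login, ...).
async function beforePass(passContext, setUpPage) {
  try {
    await setUpPage(passContext);
  } catch (err) {
    throw toFatalError(err);
  }
}
```

Failing early this way avoids paying for a full run whose results would be thrown away anyway.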
It is possible to reuse an existing tab, and if this tab is the last tab and is closed by lighthouse at the end of the run, the browser will be closed too
Ah, good find! You're right we've never really run into this, especially since we discourage using headless for its lack of throttling, but we should update that to throw loudly at this point if we can't create a tab :)
Thank you for your help!
Ah, you're having trouble finding the target to use in puppeteer once the page has been loaded correct? Yeah, we should expand #3864 to communicate the target/page ID as well.
Correct.
Yes, we had a plan for this and haven't gotten around since there wasn't an immediate need, but we want to implement audit and gatherer options to pass in dynamic runtime information that can control audit/gatherer behavior separately from the gatherer/audit code itself.
Good news. I can manage to do what I want in the current situation. But it's great to have visibility on the future plans.
You should be able to mark an error with a .fatal property to have LH exit immediately rather than just fail the gatherer.
Awesome. I should have read the whole code related to the gatherers and not only some parts.
Nothing is blocking me for now, thanks to your advice. I just exploit a few undocumented details (driver._connection._pageId, the flags object, ...). I understand puppeteer is pretty young and it's not common (yet?) to connect it with lighthouse. My hope is that this feedback can help to understand which bits are necessary for possible improvements.
We will sort this out in the next 2 quarters. Thanks!
I'm also interested in the request interception side of things.
We're running Lighthouse as part of our CI/CD pipeline. However, our API endpoints have really erratic behavior, and requests can take anything from 500ms to 2s. That's forcing us to make our TTI checks much laxer than what we'd want.
If we could intercept requests to those endpoints and immediately respond with a fixture, we'd have much more deterministic numbers, and we could tighten our TTI checks.
See #5472 for another use case
I'm running Chrome with chrome-launcher, then connecting to it with puppeteer. The only thing I'm setting up is this:
// add HTTP BasicAuth credentials on new tab creation:
browser.on('targetcreated', async (target) => {
  const page = await target.page()
  if (page) await page.authenticate(basicAuth)
})
It works, but once Puppeteer is connected, Lighthouse (and Chrome's devtools, for that matter) stops gathering the size of requests.
Anybody know why, or how to mitigate this (size 0 everywhere)?

Hm that's really strange @niieani. I can't think of anything off the top of my head. It might be a Chrome-side issue. Have you asked in the puppeteer repo yet for any quirks with page.authenticate we might be missing first?
@patrickhulce I think I understand what's going on.
Puppeteer's authenticate method starts network interception. I think it's likely this is what's causing devtools to not display the Sizes.
Additionally, it seems to have introduced a slowdown in the runs (it was almost 2x slower with the authentication enabled).
I solved the problem by passing in the basic auth login/password directly in the URL and getting rid of puppeteer from my setup.
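For reference, putting the credentials in the URL can be done with the standard WHATWG `URL` class rather than string concatenation. A minimal sketch; `withBasicAuth` is a hypothetical helper name and the URL and credentials are placeholders.

```javascript
// Embeds HTTP Basic Auth credentials in a URL using the WHATWG URL API.
function withBasicAuth(pageUrl, username, password) {
  const url = new URL(pageUrl);
  url.username = username; // the setters percent-encode reserved characters
  url.password = password;
  return url.href;
}

// Placeholder values; the resulting string is what gets passed to lighthouse().
const auditUrl = withBasicAuth('https://example.com/dashboard', 'user', 'secret');
console.log(auditUrl); // prints "https://user:secret@example.com/dashboard"
```

Keep in mind that the credentials then appear in the final URL recorded in the report, so this suits test environments better than shared reports.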
Is there any way to run lighthouse with puppeteer inside of Google Cloud Functions? I have puppeteer working fine, but the lighthouse part times out with the code from: https://github.com/GoogleChrome/lighthouse/blob/master/docs/puppeteer.md
Yeah this seems like the quickest way to achieve as much of the goals as possible today. Long-term vision, reusing existing tab and making LH more flexible in analyzing existing pages is definitely the way we should be moving to play better with DevTools and puppeteer
@patrickhulce: Is there any update on "reusing existing tab" and "audit without refresh"? It would be really helpful if you could provide an example.
Is there any update on "reusing existing tab" and "audit without refresh"? It would be really helpful if you could provide an example.
No, there has not been any progress here. The work was deprioritized and is unlikely to be addressed in the next several months.
Hi @Khady, I have found a workaround to make use of an already existing chrome tab to launch lighthouse.
I made changes in this file: node_modules > lighthouse > lighthouse-core > gather > connections > cri.js
I have modified the connect() method by commenting out the section where it tries to open a new tab.
connect() {
  // return this._runJsonCommand('new')
  //   .then(response => this._connectToSocket(/** @type {LH.DevToolsJsonTarget} */(response)))
  //   .catch(_ => {
  // Compat: headless didn't support `/json/new` before m59. (#970, crbug.com/699392)
  // If no support, we fallback and reuse an existing open tab
  log.warn('CriConnection', 'Cannot create new tab; reusing open tab.');
  return this._runJsonCommand('list').then(tabs => {
    if (!Array.isArray(tabs) || tabs.length === 0) {
      return Promise.reject(new Error('Cannot create new tab, and no tabs already open.'));
    }
    const firstTab = tabs[0];
    // first, we activate it to a foreground tab, then we connect
    return this._runJsonCommand(`activate/${firstTab.id}`)
      .then(() => this._connectToSocket(firstTab));
  });
  // });
}
It is not the best solution (or even a recommended one), but it met my need.
@patrickhulce: can we handle this with a flag when launching lighthouse programmatically? Just like we send port/hostname information, there should be a flag to tell cri.js not to launch a new tab. I made the changes in the lighthouse npm module, but every time I update npm packages, they will be overwritten.
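One caveat with the patched connect() is that it always takes `tabs[0]`, which is not necessarily the tab that was prepared. As a sketch of a more selective choice, the hypothetical helper below picks a target from the array returned by Chrome's `/json/list` endpoint by URL. It is not part of Lighthouse; the `type` and `url` fields are the ones that endpoint reports for each target.

```javascript
// Hypothetical helper: chooses which open target to reuse from the array
// returned by Chrome's /json/list endpoint, preferring a page whose URL
// matches the one that was prepared.
function pickTab(tabs, wantedUrl) {
  if (!Array.isArray(tabs) || tabs.length === 0) {
    throw new Error('No tabs already open.');
  }
  const pages = tabs.filter(tab => tab.type === 'page');
  const match = pages.find(tab => tab.url === wantedUrl);
  // Fall back to the first page (or first target), like the patch above does.
  return match || pages[0] || tabs[0];
}

// Example with made-up targets:
const sampleTabs = [
  {id: 'A1', type: 'background_page', url: 'chrome-extension://abc/bg.html'},
  {id: 'B2', type: 'page', url: 'about:blank'},
  {id: 'C3', type: 'page', url: 'https://example.com/secure'},
];
console.log(pickTab(sampleTabs, 'https://example.com/secure').id); // prints "C3"
```

Matching on URL still has the ambiguity raised earlier in this thread when the same URL is open in several tabs.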
Note: I am using remote debugging port 9222 to launch chrome with protractor
It's highly specific to a particular use case, so it's harder to make the argument for a new flag without the surrounding non-reload support around it. Which particular problem did re-using the tab but still reloading it solve for you?
I'm a bit concerned that in general it will lead to confusion that LH will actually assess the loaded page instead of starting a new navigation/series of page loads. But if we can make it very clear that's not the case in our documentation, then I'm not personally opposed to it.
Hi @patrickhulce , in my case the problem I have when running lighthouse (and puppeteer as automation tool), is that the page I am testing requires username/password authentication. So, after doing the username/pwd authentication, the main tab continues with my puppeteer automated test case steps, but the second tab that lighthouse opens is always redirected to the initial "Login Page". So, I am always getting the metrics of the "Login Page" instead of the user authenticated area pages.
It would be very useful for these scenarios to have an option for not opening new tabs again and again to retrieve the lighthouse metrics.
How are you storing authentication state such that a new tab is not logged in? sessionStorage? If it's anything else, there should be a workaround for you that doesn't require tab reuse :)
Related discussion #1418
Hi Patrick, thanks in advance!
Yes, I am using sessionStorage; after reviewing all the comments, the only option I think I have is to create a custom config and custom gatherer for lighthouse. Is there any other workaround?
Ah, not really, sorry. Custom config with gatherer or tab reuse is the primary path if you're using sessionStorage today.
You could give a puppeteer-side listener a try instead to install the sessionStorage but that might be a bit flaky.
It'd look something like the approach in https://github.com/GoogleChrome/lighthouse/issues/4376#issuecomment-361486901 but in the browser.on('targetchanged', ...) handler you'd want to set your session storage.
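A minimal sketch of such a listener, assuming `browser` is a Puppeteer Browser connected to the same Chrome that Lighthouse uses. `installSessionStorage` and the `entries` object are hypothetical names, and as noted above the timing may be flaky because the page can navigate while the values are being written.

```javascript
// Sketch: write sessionStorage entries whenever a target changes, for
// example when the tab opened by Lighthouse navigates.
// `entries` is a plain object of string keys and values (hypothetical).
function installSessionStorage(browser, entries) {
  browser.on('targetchanged', async (target) => {
    const page = await target.page();
    if (!page) return; // not a page target (for example a service worker)
    try {
      await page.evaluate((items) => {
        for (const [key, value] of Object.entries(items)) {
          sessionStorage.setItem(key, value);
        }
      }, entries);
    } catch (err) {
      // The page may be mid-navigation; this is the flaky part.
      console.warn('Could not set sessionStorage:', err.message);
    }
  });
}
```

It would be called once after `puppeteer.connect(...)` and before starting Lighthouse, for example `installSessionStorage(browser, {token: 'abc'})` with your own key and value.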
Hi @abhinaba1080
I tried your workaround modifying cri.js, but I get the errors:
Runner:warn MetaElements gatherer, required by audit viewport, encountered an error: NO_FCP +0ms
viewport:warn Caught exception: Required MetaElements gatherer encountered an error: NO_FCP +0ms
status Auditing: Contains some content when JavaScript is not available +1ms
status Auditing: First Contentful Paint +4ms
first-contentful-paint:warn Caught exception: NO_FCP +12ms
status Auditing: First Meaningful Paint +0ms
first-meaningful-paint:warn Caught exception: NO_FCP +0ms
status Auditing: Page load is fast enough on mobile networks +1ms
load-fast-enough-for-pwa:warn Caught exception: NO_FCP +0ms
status Auditing: Speed Index +1ms
....
for this fragment of code:
const chromeLauncher = require('chrome-launcher');
const puppeteer = require('puppeteer');
const lighthouse = require('lighthouse');
const config = require('lighthouse/lighthouse-core/config/lr-desktop-config.js');
const request = require('request');
const util = require('util');

(async () => {
  const opts = {
    logLevel: 'info'
  };
  // Launch chrome using chrome-launcher
  const chrome = await chromeLauncher.launch(opts);
  opts.port = chrome.port;
  // Connect to it using puppeteer.connect().
  const resp = await util.promisify(request)(`http://localhost:${opts.port}/json/version`);
  const {webSocketDebuggerUrl} = JSON.parse(resp.body);
  const browser = await puppeteer.connect({browserWSEndpoint: webSocketDebuggerUrl});
  // Visit pages, take metrics
  const page = (await browser.pages())[0];
  await page.setViewport({width: 1200, height: 900});
  await page.goto('http://google.com', {waitUntil: 'networkidle2'});
  await runLighthouseForURL(page.url(), opts, "Google Page");
  await browser.disconnect();
  await chrome.kill();
})().catch(e => {
  console.error(e);
  process.exit(1);
});

async function runLighthouseForURL(pageURL, opts, reportName) {
  // reportName is not used yet; the Lighthouse results are returned to the caller.
  const results = await lighthouse(pageURL, opts, config);
  return results;
}
using lighthouse 5.1.0 and puppeteer 1.17.0.
Could you please share a fragment of your code to get rid of that error? Forcing lighthouse to use the same tab is probably the fastest way to get a workaround for this.
Thanks in advance!
We will sort this out in the next 2 quarters. Thanks!
@paulirish I'm checking in about your comment above. Have you ironed out Puppeteer & Lighthouse integration so that Lighthouse audits can be gathered from the already-launched Puppeteer Chrome tab?
@emiliavanderwerf hi, I sincerely request that you take care of the selenium case as well. I use selenium with chrome's remote debugging port to connect lighthouse.
@abhinaba1080 That is a good idea. Do you have a code example you would be willing to share?
@patrickhulce Do you have any feedback or an update on the status of this integration ticket?
Thank you both!
@emiliavanderwerf
Thanks for asking. Here is my wish list for Christmas.
Note: I am using protractor here, but the wishes apply to any tool.
Wish 1:
My configuration file has the remote debugging port configured as:
browserName: 'chrome',
chromeOptions: {
  useAutomationExtension: false,
  args: [
    // "--headless",
    '--remote-debugging-port=9222',
    '--disable-gpu',
    "--disable-extensions"
  ]
}
Now, somewhere in my code I want to run the below:
let pageCurrentURL = "https://the-internet.herokuapp.com/secure";
let lhr = await lighthouse(pageCurrentURL, {
  port: 9222,
  output: "json",
  logLevel: "info"
});
It should run the lighthouse audits in the same tab with the current page url. Refresh or no-refresh should be a part of the lighthouse configuration.
Wish 2:
I want to run lighthouse as a watcher for certain step performance. Suppose I want to measure the performance of the login activity, then I wish to do:
lighthouse.startWatch();
element(by.css("#username")).sendKeys("google");
element(by.css("#password")).sendKeys("awsomeMinds");
element(by.css("#loginBtn")).click();
lighthouse.stopWatch();
[just a thought: this should not be triggered as a separate entity; protractor or any other tool can kick-start it]
In this way we would also be able to measure the page load. Like:
lighthouse.startWatch();
browser.get("https://www.lighthouse.com");
lighthouse.stopWatch();
This will be helpful for applications which have only a single url throughout the app.
These are just some wishes. I feel lighthouse is a great tool, and I have seen teams giving up on the idea of using lighthouse because it is not easy to integrate with protractor. The second wish would really solve my problem though.
Keep going, team. You are really doing something good to ensure software quality.
Going to add my perspective because I did not see mention of this after reading through the thread:
My use-case: I'm looking to run multiple concurrent Lighthouse audits using a single instance of Chrome using a new Incognito Browser Context for each audit so that no data storage/state is shared between concurrent audits.
Ideally each audit could be preceded by a series of actions (e.g. a log in), and state would be maintained per incognito context (tab).
However, following the Puppeteer recipes in this repo, it seems like Lighthouse always opens the URL in the default (shared) browser context.
Have you seen this? https://github.com/GoogleChrome/lighthouse/blob/master/docs/recipes/auth/example-lh-auth.js
Puppeteer, by default, uses a fresh Chrome profile, so if you launch it like the above script does you shouldn't see any state persist.
multiple concurrent Lighthouse audits
FYI we recommend against this. If you rely on the performance category, the results will be skewed. Even if you don't, you risk protocol timeouts by asking Chrome to do too much at once.
Thanks @connorjclark.
Yep, I've seen that recipe. My specific problem with the fresh Chrome profile on launch approach is that I'm exposing Puppeteer as a micro-service (so it only launches when the service re/starts). Multiple clients can hit this service, but their requests are sandboxed from each other via Incognito contexts; I was hoping to borrow the same sandboxing approach for Lighthouse performance audits as well.
FYI we recommend against this. If you rely on the performance category, the results will be skewed.
This is good to know (I'm looking at just the performance category for right now). Is this something that can be mitigated by throwing additional resources at Chrome (e.g. CPU cores / Memory)? Any documentation you have on this would be very much appreciated.
I'd also be curious to hear how Google approaches scaling the PageSpeed Insights API, given the recommendation against concurrent audits in a single Chrome instance.
so it only launches when the service re/starts
I'd suggest this is a micro-optimization. Also, LH directs the browser to clear the cache on each run (by default), so you're also at risk of runs stomping on the cache of other runs.
I'd also be curious to hear how Google approaches scaling the PageSpeed Insights API, given the recommendation against concurrent audits in a single Chrome instance.
We have many machines, a load balancer, and queue things up in the worst case.
You could probably get away with a few concurrent runs, but I'd measure to be safe. 3 is probably fine on any non-network constrained machine. In any case, you certainly should queue up LH runs if you get more than 3 req/minute.
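One way to "queue up LH runs" is a small in-process queue that caps how many runs are active at once. The sketch below is an illustration only: `createRunQueue` is a hypothetical name, the limit of 3 simply mirrors the number mentioned above, and the tasks are expected to be functions that start a Lighthouse run and return a promise.

```javascript
// Minimal promise queue: at most `limit` tasks run at once, the rest wait.
function createRunQueue(limit) {
  let active = 0;
  const waiting = [];

  function next() {
    if (active >= limit || waiting.length === 0) return;
    active += 1;
    const {task, resolve, reject} = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        next(); // start the next waiting task, if any
      });
  }

  return {
    // `task` is a function returning a promise, e.g. () => lighthouse(url, flags)
    enqueue(task) {
      return new Promise((resolve, reject) => {
        waiting.push({task, resolve, reject});
        next();
      });
    },
    get running() { return active; },
    get pending() { return waiting.length; },
  };
}

// Example: five placeholder tasks, at most three active at a time.
const queue = createRunQueue(3);
for (let i = 0; i < 5; i++) {
  queue.enqueue(() => Promise.resolve(`run ${i} done`));
}
console.log(queue.running, queue.pending); // prints "3 2"
```

The queue only controls when runs start. It does not isolate them, so each task could still launch its own Chrome or child process.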
In addition to connor's advice, if you're going to run LH concurrently (again we recommend you don't or your performance variability will be quite high), run each Lighthouse in its own child process and dedicate at least 2 cores to its execution.
Scaling horizontally has been shown to yield more consistent results than scaling vertically, i.e. using 8 smaller 2-core machines as opposed to running 8 runs on a 16-core machine. Just avoid any burstable instance types.
@iamEAP In our case, we run Lighthouse in a serverless compute service (e.g. AWS Lambda). We do this to run 60 tests simultaneously and then extract median performance data to see whether a given code change causes a performance regression (or is an improvement). This makes it easy to run LH concurrently (and scalably) and you get meaningful results as soon as the longest run completes.
You also get a fresh run of Chrome with every hit of the API, so you won't hit any of the issues you mentioned.
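Extracting the median from a batch of runs is a small computation. A sketch, with made-up TTI values in milliseconds; where the values come from in the report (for instance something like `lhr.audits.interactive.numericValue`) is an assumption that depends on the Lighthouse version.

```javascript
// Median of a list of numbers (for example one metric across many runs).
function median(values) {
  if (values.length === 0) throw new Error('median of an empty list');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Made-up TTI values (ms) from five runs of the same page.
console.log(median([3100, 2900, 3500, 3000, 4200])); // prints 3100
```

Comparing medians rather than single runs reduces the effect of one unusually slow or fast run on the regression check.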
Sorry if this was already obvious, but as far as I understand, most use cases above could be solved by adding a way to connect to a chrome instance by providing a browser websocket url, right? Just like puppeteer.connect accepts a browserWSEndpoint.
Yes, that is in fact the plan @Siilwyn, but not the hardest part :) The full story is in #11313 and the associated links therein if you're interested in following along.