We were getting OOMs in LR and I managed to confidently bisect down to the commit where #11188 was merged. (It's core(image-elements): collect CSS sizing, ShadowRoot, & position.)
node lighthouse-cli http://cosmetiqon.gr/ --only-audits=unsized-images -G
Here's one URL where this can sometimes OOM, though I definitely can't get an OOM locally. I'm not entirely sure which context is getting the OOM: the page or Lighthouse.
I do know that if I comment out these lines...
...the imageGatherer takes 2s instead of 28s.
I attempted to do some memory debugging but didn't get too far. Still requires a bit of investigation.
Similar efforts: #7274 #9818
getMatchedStylesForNode is hella slow on that page for some reason
Promise.all-ing that set might help the runtime, but that's probably just going to make the OOM situation even worse 😅. I like the idea of only finding this data for the largest X displayed dimension images since those will have largest CLS impact, but it's still worth figuring out why it's so slow on this page compared to, say, photo gallery pages, CNN, or The Verge, which are all pretty heavy image-wise. (A sketch of that top-X idea follows the diff below.)
Diff for Parallelization
--- a/lighthouse-core/gather/gatherers/image-elements.js
+++ b/lighthouse-core/gather/gatherers/image-elements.js
@@ -325,11 +325,9 @@ class ImageElements extends Gatherer {
return collectImageElementInfo();
})()`;
/** @type {Array<LH.Artifacts.ImageElement>} */
const elements = await driver.evaluateAsync(expression);
-
- /** @type {Array<LH.Artifacts.ImageElement>} */
- const imageUsage = [];
const top50Images = Object.values(indexedNetworkRecords)
.sort((a, b) => b.resourceSize - a.resourceSize)
.slice(0, 50);
@@ -339,7 +337,8 @@ class ImageElements extends Gatherer {
driver.sendCommand('DOM.getDocument', {depth: -1, pierce: true}),
]);
- for (let element of elements) {
+ /** @type {Array<LH.Artifacts.ImageElement>} */
+ const imageUsage = await Promise.all(elements.map(async element => {
// Pull some of our information directly off the network record.
const networkRecord = indexedNetworkRecords[element.src] || {};
element.mimeType = networkRecord.mimeType;
@@ -366,8 +365,8 @@ class ImageElements extends Gatherer {
element = await this.fetchElementWithSizeInformation(driver, element);
}
- imageUsage.push(element);
- }
+ return element;
+ }));
await Promise.all([
driver.sendCommand('DOM.disable'),
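On the "largest X displayed dimension images" idea above, a minimal sketch, assuming the elements array the gatherer already collects and its displayedWidth/displayedHeight fields (TOP_X and the loop body are illustrative, not a worked-out implementation):

// Hypothetical: only pay the matched-styles cost for the X largest images as
// displayed, since those carry the biggest potential CLS impact.
const TOP_X = 20;
const prioritized = [...elements]
  .sort((a, b) =>
    b.displayedWidth * b.displayedHeight - a.displayedWidth * a.displayedHeight)
  .slice(0, TOP_X);
for (const element of prioritized) {
  // ...existing fetchSourceRules / getMatchedStylesForNode work here...
}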
I did some exploring on this issue, but I couldn't get (& don't think I have) access to LR, so this is coming from observing what happens on my local machine:
I don't know if the slowdown from getMatchedStylesForNode has to do with the OOM issue; my intuition is that they might be two separate things to consider, especially after reading about the performance issues previously encountered when using getMatchedStylesForNode.
In the font-size audit, as far as I could tell, we optimize how many times we actually call getMatchedStylesForNode, which is not something I did when I wrote unsized-images, because I didn't realize how slow getMatchedStylesForNode can be. To improve the runtime of unsized-images by reducing calls to getMatchedStylesForNode, one optimization I think we should include is to change
if (!element.isInShadowDOM) {
to
if (!element.isInShadowDOM && !element.isCss) {
since we don't currently check CSS background-images in unsized-images anyway, and cssWidth/cssHeight aren't as relevant to background-images because of background-repeat & parent element sizing.
Additionally, I agree with @patrickhulce about
finding this data for the largest X displayed dimension images since those will have largest CLS impact
or other workarounds that can reduce the total calls to getMatchedStylesForNode.
I noticed that a large number of the ImageElements on http://cosmetiqon.gr/ had the same src because they were the default GIF for the site's lazy-loaded images. There might be potential here to reduce the calls to getMatchedStylesForNode, e.g. caching the CSS.GetMatchedStylesForNodeResponse for sibling nodes that have the same CSS class (might make OOM worse), or not calling getMatchedStylesForNode on lazy-loaded images outside the viewport (not sure if this is what we'd want to encourage).
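To make the caching idea concrete, a rough sketch (the helper and cache key are made up for illustration; as noted, holding onto the responses could make the memory situation worse):

// Memoize protocol responses so e.g. ~100 lazy-load placeholders sharing one
// src (or one class list) cost a single CSS.getMatchedStylesForNode round trip.
const matchedStylesCache = new Map();

async function getMatchedStylesCached(driver, nodeId, cacheKey) {
  if (matchedStylesCache.has(cacheKey)) {
    return matchedStylesCache.get(cacheKey);
  }
  const response = await driver.sendCommand('CSS.getMatchedStylesForNode', {nodeId});
  matchedStylesCache.set(cacheKey, response);
  return response;
}

cacheKey could be element.src for the placeholder case, or the element's class attribute for the sibling-node case; either way we'd be trading retained memory for protocol time.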
As for the OOM issue, here are some leads I could think of:
Running unsized-images on http://cosmetiqon.gr/, there were a handful of errors like

method <= browser ERR:error DOM.pushNodeByPathToFrontend +16s

that disappear after adding && !element.isCss. Is there a possibility of a memory leak caused by too many errors?
Another lead: changing

const matchedRules = await driver.sendCommand('CSS.getMatchedStylesForNode', {
  nodeId: nodeId,
})

to

const matchedRules = await driver.sendCommand('CSS.getMatchedStylesForNode', {
  nodeId,
})
In the worst case this somehow causes a circular reference since nodeId was declared earlier; in the best case this is just a nit.
In async afterPass(passContext, loadData) within image-elements.js, I JSON.stringified instances of matchedRules and found sizes ranging from ~30,000 chars/bytes to ~200,000 chars/bytes, with the median falling between ~100,000-150,000 chars/bytes for a page such as http://cosmetiqon.gr/.
Based on a 150,000-byte ballpark for each instance of matchedRules on http://cosmetiqon.gr/, and the fact that it has ~100 ImageElements, we get 150,000 × 100 ≈ 15 MB.
I don't know whether that is a reasonable use of memory/cache when running Lighthouse or LR, or whether we actually retain all matchedRules; I'll check ndb later to see if this happens locally.
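For reference, the kind of instrumentation that produces those numbers (a sketch with made-up helper names, not the actual gatherer code):

// Stringify each CSS.getMatchedStylesForNode response and record its length;
// string length approximates bytes here since the payload is mostly ASCII.
const ruleSizes = [];

async function measureMatchedRules(driver, nodeId) {
  const matchedRules = await driver.sendCommand('CSS.getMatchedStylesForNode', {nodeId});
  ruleSizes.push(JSON.stringify(matchedRules).length);
  return matchedRules;
}

function summarizeRuleSizes() {
  const sorted = [...ruleSizes].sort((a, b) => a - b);
  return {
    min: sorted[0],
    median: sorted[Math.floor(sorted.length / 2)],
    max: sorted[sorted.length - 1],
    approxTotalMB: sorted.reduce((sum, n) => sum + n, 0) / 1e6,
  };
}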
not OOM stuff, but some hard data on the perf slowdown. here's the runtime of the image-elements gatherer.

this is before and after the #11188 commit specifically. (i was doing a bisect)
After discussing with @lemcardenas, we're going to revert #11217 and #11188 before we ship 6.3.0.
Some changes we might make to the gatherer:
1) don't collect source rules if element is CSS (as @lemcardenas said)
2) it seems we might as well do getComputedStyle in the page, instead of over the protocol. we aren't keeping the origin of the width / height rules, only the resolved value, so there's no need to do it over the protocol. Might be faster?
Let's collect a few slow URLs and assess these changes using the compare-runs script.
Disappointing, but SG.
Do you have a URL list for these extra long URLs @paulirish? It's shocking to me that just the 95th percentile is hovering around 10s in LR when my typical worst offender testing sites (cnn, theverge, hulce.photography) are all ~500-700ms locally. Would be super useful to expand my repertoire :)
it seems we might as well do getComputedStyle in the page. we aren't keeping the origin of the width / height rules, only the resolved value, so there's no need to do it over the protocol. Might be faster?
gCS will give you the computed width/height though and not whether that computed style was determined because of a CSS rule or the intrinsic size of the image, which is the whole point of the audit, right?
gCS will give you the computed width/height though and not whether that computed style was determined because of a CSS rule or the intrinsic size of the image, which is the whole point of the audit, right?
Ah, that's right, oop! That really should be explicitly stated in the gatherer.
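To spell out why, a standalone browser-console sketch (not gatherer code):

// getComputedStyle only returns resolved values, so an image sized by a CSS
// rule and one falling back to its intrinsic dimensions look identical here.
const img = document.querySelector('img');
const {width, height} = getComputedStyle(img);
console.log(width, height); // e.g. '640px' '480px' either way

Only the matched-rules data tells us whether a rule, an attribute, or the intrinsic size produced those values, which is exactly what unsized-images needs to distinguish.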
Once we have some URLs we can start digging in more...
I'd start by adding timing marks around these three areas:
1) DOM.pushNodeByPathToFrontend
2) CSS.getMatchedStylesForNode
3) getEffectiveSizingRule
(a rough sketch of these timing marks follows the two options below)

It's unfortunate we are working from a devtools node path here. Perhaps there's a way to grab the DOM snapshot (must verify that the width/height properties from the snapshot don't include intrinsic image sizes) and then connect that data to the image element we scraped.
Either:
1) via constructing the devtools node path from the snapshot or
2) for each of the unsized images in the snapshot (this limits the amount of work), get the remote object id and determine its node path in the page. Tweak the artifact to have a boolean isExplicitlySized (no need to get the actual size, the audit doesn't care): ImageElements would have isExplicitlySized set to true iff the snapshot for that element has a value for width and height.
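As mentioned above, a rough sketch of those timing marks, using Node's perf_hooks directly (the call shapes and variable names assume the gatherer code around that time, so treat this as illustrative):

// Wrap each suspect call so the log shows where the ~28s actually goes.
const {performance} = require('perf_hooks');

async function timed(label, fn) {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    console.log(`${label}: ${(performance.now() - start).toFixed(1)}ms`);
  }
}

// inside the source-rules fetch, roughly:
const {nodeId} = await timed('DOM.pushNodeByPathToFrontend', () =>
  driver.sendCommand('DOM.pushNodeByPathToFrontend', {path: devtoolsNodePath}));
const matchedRules = await timed('CSS.getMatchedStylesForNode', () =>
  driver.sendCommand('CSS.getMatchedStylesForNode', {nodeId}));
// getEffectiveSizingRule is synchronous rule-walking, so a plain
// performance.now() pair around it covers the third area.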
Do you have a URL list for these extra long URLs
ya, I mentioned http://cosmetiqon.gr/ in the top comment. here's a list:
http://dev.digitalcodes.co/in/
http://dehof.nl
http://beta.digitalcodes.co/in/
http://bulknutrients.com.au/
http://jolly-playstore.myshopify.com
http://www.digitalcodes.co/in/
http://aashmani.in/
http://babi-shop.ci
http://omnicord.myshopify.com
http://cosmetiqon.gr/
http://betpara46.com
http://modesens.herokuapp.com/designers/
http://jezaa.in/
http://actifproducts.com/
http://shopat24.com
http://ithaldus.ee/
http://nodepositdaddy.com.labs.mk/
http://camilledefago.tuwebpress.com/
I'd start by adding timing marks around these three areas:
yup, that's also where i'd go next. 👍
i agree with everything in yr comment.
(re: prioritizing, since the revert addresses the most immediate concerns)
thanks @paulirish that set is something else
http://www.digitalcodes.co/in/ has 306 offscreen images and takes two MINUTES 😱

the snapshot idea i had in https://github.com/GoogleChrome/lighthouse/issues/11289#issuecomment-680279085 doesn't work for us because the width or height values in the snapshot are from computed style.
isCss is a significant speed-up, enough to reland this change. https://github.com/GoogleChrome/lighthouse/issues/11289
A quick followup we will do is to sort the elements by image size, then apply a sensible time budget to fetching source rules. https://github.com/GoogleChrome/lighthouse/pull/11340#issuecomment-682259063
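A sketch of what that follow-up could look like (the budget value and sort key are illustrative; see the linked PR for the real change):

// Spend the expensive matched-styles work on the biggest images first, and
// stop once a wall-clock budget is exhausted.
const SOURCE_RULES_BUDGET_MS = 5000;

async function fetchSourceRulesWithBudget(driver, elements) {
  const bySize = [...elements].sort((a, b) => (b.resourceSize || 0) - (a.resourceSize || 0));
  const deadline = Date.now() + SOURCE_RULES_BUDGET_MS;
  for (const element of bySize) {
    if (Date.now() > deadline) break; // budget spent; skip source rules for the rest
    // ...existing per-element source-rules fetch here...
  }
}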