Profiling shows that the majority of the time spent in WR (on real pages) is in producing text runs (dealing with the glyph cache primarily).
It's also often the case that when WR receives a new DL, the majority of the text runs are actually the same as in the previous DL.
If we could identify text runs that are the same across DLs (e.g. by hashing the string, font options, etc.) and cache them in FBOs, I think we could drastically reduce the CPU time both when scrolling and when receiving new display lists.
If the calling code were able to provide a unique identifier for a text run (perhaps it already has such information or can easily generate it?) that was consistent across display lists, this could be even faster.
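To make the hashing idea concrete, here is a rough sketch of a cache keyed on a hash of the font options and glyph list. All of the types here (`FontOptions`, `GlyphInstance`, `CachedRun`, `TextRunCache`) are simplified stand-ins invented for illustration, not WR's real definitions, and the "rasterization" is just a counter where the FBO work would go.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for a glyph instance. Positions are stored as
// integers (fixed-point units) so the struct can derive Hash and Eq.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct GlyphInstance {
    pub index: u32,
    pub x: i32,
    pub y: i32,
}

// Hypothetical subset of the font state that affects rasterization.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FontOptions {
    pub font_id: u32,
    pub size_au: i32,
    pub subpixel_aa: bool,
}

/// Hash everything that influences how the run would be drawn.
pub fn text_run_hash(font: &FontOptions, glyphs: &[GlyphInstance]) -> u64 {
    let mut hasher = DefaultHasher::new();
    font.hash(&mut hasher);
    glyphs.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical handle to a cached rasterization (e.g. a region of an FBO).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CachedRun(pub u32);

#[derive(Default)]
pub struct TextRunCache {
    // Keyed on the 64-bit hash only; this sketch ignores hash collisions.
    entries: HashMap<u64, CachedRun>,
    next_id: u32,
    pub hits: usize,
    pub misses: usize,
}

impl TextRunCache {
    /// Return the cached run if an identical one was seen in an earlier
    /// display list; otherwise allocate a new entry (where real code would
    /// rasterize into an FBO).
    pub fn get_or_rasterize(
        &mut self,
        font: &FontOptions,
        glyphs: &[GlyphInstance],
    ) -> CachedRun {
        let key = text_run_hash(font, glyphs);
        if let Some(&run) = self.entries.get(&key) {
            self.hits += 1;
            return run;
        }
        self.misses += 1;
        let run = CachedRun(self.next_id);
        self.next_id += 1;
        self.entries.insert(key, run);
        run
    }
}
```

With something like this, a new DL that repeats a run from the previous DL would only pay for the hash and a map lookup. Eviction and collision handling are left out of the sketch.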
cc @jrmuizel @kvark @lsalzman
Is it possible to fix this in a generic way through incremental display list updates?
"If the calling code were able to provide a unique identifier for a text run (perhaps it already has such information or can easily generate it?) that was consistent across display lists, this could be even faster."
That's a good question for the Gecko folks (@jrmuizel, @nical?). Do you know whether a text run is the same as the one you submitted in the previous frame? If so, we could use this information effectively by making the text runs go through generate_key and such, similar to how we handle images and fonts.
I did a rough check on Servo, and they appear to call push_text at a higher frequency (in general) than they rebuild their display lists, so the optimization would make sense. There is, however, a TODO item to use WR's display lists directly, in which case this issue will probably be solved.
Yeah, it's possible for us to know if the text run is the same.
Great, it sounds like this is worth further investigation!
This is probably not needed now that the hashing and general glyph cache handling are much faster. We can re-open this in the future if we want to consider it again.
After some discussions and thinking about it again this week, I think this may be worth pursuing. It has the potential to drastically reduce the size of display lists, and also the CPU time in WR itself.
If it would work in Gecko, the ideal interface for this would be similar to how images work. So, instead of passing a list of glyph instances to add_text when building the display list, we'd have something like:
fn generate_text_run_key() -> TextRunKey;
fn add_text_run(key: TextRunKey, font: FontInstance, glyphs: &[GlyphInstance], transient: bool);
fn delete_text_run(key: TextRunKey);
The transient field above would specify that WR should automatically delete this text run when this scene gets deleted. It may or may not be useful.
And then, when building display lists:
fn push_text(info: &LayoutPrimitiveInfo, key: TextRunKey);
How difficult would it be to make something like this work in Gecko @lsalzman @jrmuizel ?
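For illustration, here is a minimal in-memory sketch of how the proposed calls might fit together. Only the four function names and their parameters come from the proposal above; `ResourceStore`, `DisplayListBuilder`, `on_scene_deleted`, `glyph_count`, and the field layouts of `FontInstance`, `GlyphInstance`, and `LayoutPrimitiveInfo` are assumptions made up for this example, not WR's actual API.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct TextRunKey(u64);

// Simplified, hypothetical versions of the types named in the proposal.
#[derive(Clone, Debug, PartialEq)]
pub struct FontInstance {
    pub font_id: u32,
    pub size: f32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct GlyphInstance {
    pub index: u32,
    pub x: f32,
    pub y: f32,
}

pub struct LayoutPrimitiveInfo {
    pub rect: [f32; 4],
}

struct TextRun {
    font: FontInstance,
    glyphs: Vec<GlyphInstance>,
    transient: bool,
}

/// Hypothetical owner of text runs, analogous to how image keys are managed.
#[derive(Default)]
pub struct ResourceStore {
    next_key: u64,
    runs: HashMap<TextRunKey, TextRun>,
}

impl ResourceStore {
    pub fn generate_text_run_key(&mut self) -> TextRunKey {
        let key = TextRunKey(self.next_key);
        self.next_key += 1;
        key
    }

    pub fn add_text_run(
        &mut self,
        key: TextRunKey,
        font: FontInstance,
        glyphs: &[GlyphInstance],
        transient: bool,
    ) {
        let run = TextRun { font, glyphs: glyphs.to_vec(), transient };
        self.runs.insert(key, run);
    }

    pub fn delete_text_run(&mut self, key: TextRunKey) {
        self.runs.remove(&key);
    }

    /// Assumed hook: drop transient runs when their scene goes away.
    pub fn on_scene_deleted(&mut self) {
        self.runs.retain(|_, run| !run.transient);
    }

    /// Assumed helper for inspecting a stored run.
    pub fn glyph_count(&self, key: TextRunKey) -> Option<usize> {
        self.runs.get(&key).map(|run| run.glyphs.len())
    }

    /// Assumed helper returning the font a run was registered with.
    pub fn font(&self, key: TextRunKey) -> Option<&FontInstance> {
        self.runs.get(&key).map(|run| &run.font)
    }
}

/// The display list only records the key, not the glyphs themselves.
#[derive(Default)]
pub struct DisplayListBuilder {
    pub items: Vec<(TextRunKey, [f32; 4])>,
}

impl DisplayListBuilder {
    pub fn push_text(&mut self, info: &LayoutPrimitiveInfo, key: TextRunKey) {
        self.items.push((key, info.rect));
    }
}
```

The point of the sketch is that each display list item shrinks to a key plus layout info, while the glyph data is uploaded once and reused until delete_text_run is called or, for transient runs, until the scene is dropped.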
Thinking about this more, it's not as easy as I thought to know which text run we're in. That is, a single Gecko object corresponds to multiple text runs depending on the contents of the text run. We'd need to think about how to make these ids stable.
I think we can detect this internally at a minimal cost, so I wouldn't worry too much about it on the Gecko side, for now.