Sdk: Reduce AOT snapshot size by compressing stackmaps

Created on 27 Nov 2018  ·  13 Comments  ·  Source: dart-lang/sdk

Stack maps account for roughly 8% of the snapshot size of Flutter Gallery.

These can be compressed in many ways, e.g. prefix encoding, delta encoding or global canonicalization.
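
As a rough illustration of one of the options named above, here is a minimal sketch of delta encoding applied to a sorted list of PC offsets. The function names and the choice to encode offsets rather than bitmaps are assumptions for illustration only, not the VM's actual scheme.

```python
from typing import List


def delta_encode(pc_offsets: List[int]) -> List[int]:
    """Store each sorted PC offset as the difference from the previous one."""
    deltas = []
    previous = 0
    for pc in pc_offsets:
        deltas.append(pc - previous)
        previous = pc
    return deltas


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original PC offsets by summing the deltas."""
    offsets = []
    total = 0
    for delta in deltas:
        total += delta
        offsets.append(total)
    return offsets


# Hypothetical safepoint offsets within one function.
offsets = [0x10, 0x2C, 0x30, 0x58]
encoded = delta_encode(offsets)
```

The deltas are smaller numbers than the absolute offsets, so a variable-length integer encoding would typically need fewer bytes for them.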

@Hixie

area-vm customer-flutter type-performance vm-aot-code-size

All 13 comments

FWIW, StackMaps are already canonicalized.

One idea I had but haven't looked into is whether we can attribute most of the StackMaps to uninitialized spill slots. Unoptimized code initializes all its stack slots during frame building and generates no StackMaps. The runtime interprets the lack of a StackMap for a PC to indicate every stack slot contains a valid object reference. Optimized code does not initialize stack slots up front, so it generates StackMaps for every safepoint so the runtime can tell which slots are valid references and which are either uninitialized or contain unboxed values. My guess is that uninitialized slots are much more common than unboxed slots. If so, optimized code could initialize them up front and at most safepoints every slot would be an object reference and the StackMap could be omitted.
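
The idea in the comment above can be sketched roughly as follows. This is a toy model, not VM code: the `Slot` states, the function names, and the list-of-booleans bitmap are all hypothetical. It shows the "no StackMap means every slot is a reference" rule and which safepoints would no longer need a map if uninitialized slots were initialized up front.

```python
from enum import Enum
from typing import Dict, List


class Slot(Enum):
    REF = "ref"          # holds a valid object reference
    UNINIT = "uninit"    # not yet written by optimized code
    UNBOXED = "unboxed"  # holds a raw double or integer


def slots_to_visit(maps: Dict[int, List[bool]], pc: int, num_slots: int) -> List[int]:
    """Slots the GC should treat as references at the given PC offset."""
    bitmap = maps.get(pc)
    if bitmap is None:
        # No map for this PC: every slot is assumed to be a valid reference.
        return list(range(num_slots))
    return [i for i, is_ref in enumerate(bitmap) if is_ref]


def omittable_after_init(safepoints: Dict[int, List[Slot]]) -> List[int]:
    """PC offsets whose map could be dropped if UNINIT slots were pre-initialized.

    Once uninitialized slots hold a valid reference, only unboxed values
    force a map to be emitted.
    """
    return sorted(
        pc for pc, slots in safepoints.items()
        if all(slot is not Slot.UNBOXED for slot in slots)
    )


# Hypothetical frame with three slots and three safepoints.
safepoints = {
    0x10: [Slot.REF, Slot.UNINIT, Slot.UNINIT],
    0x24: [Slot.REF, Slot.REF, Slot.UNBOXED],
    0x38: [Slot.REF, Slot.REF, Slot.UNINIT],
}
```

In this made-up example, only the safepoint at `0x24` would still need a map, because it is the only one with an unboxed value on the stack.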

@sstrickl is going to look at this. There is some low-hanging fruit (e.g. packing stack maps together to avoid header overhead, special encodings for all-0s and all-1s stack maps, truncating the suffix of 1s from a stack map), but there might be some more radical changes possible too.
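
Two of the low-hanging ideas listed above, special-casing all-0s and all-1s maps and truncating a trailing run of 1s, can be sketched like this. The tag names, the list-of-bits representation, and the assumption that the decoder knows the frame's slot count are illustrative choices, not the encoding the VM uses.

```python
from typing import List, Tuple

Encoded = Tuple[str, List[int]]


def encode(bits: List[int]) -> Encoded:
    """Encode a stack map bitmap, given as a list of 0/1 values."""
    if all(b == 0 for b in bits):
        return ("all0", [])  # no payload needed
    if all(b == 1 for b in bits):
        return ("all1", [])  # no payload needed
    # Drop the trailing run of 1s; the decoder pads it back.
    end = len(bits)
    while end > 0 and bits[end - 1] == 1:
        end -= 1
    return ("bits", bits[:end])


def decode(encoded: Encoded, num_slots: int) -> List[int]:
    """Rebuild the full bitmap, assuming the slot count is known separately."""
    kind, payload = encoded
    if kind == "all0":
        return [0] * num_slots
    if kind == "all1":
        return [1] * num_slots
    return payload + [1] * (num_slots - len(payload))


example = [0, 1, 0, 1, 1, 1]
```

Here `encode(example)` keeps only the first three bits, and `decode` restores the trailing 1s from the slot count.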

@rmacnak-google @mraleph Could you tell me the specific meaning represented by StackMaps? Have you made any progress on compressing stack maps?

Could you tell me the specific meaning represented by StackMaps?

Stack maps map PC offsets (relative to the start of the function) to what is effectively a bitmap. The bitmap describes which words in the stack frame hold heap pointers. We have one such RawStackMap object per PC. Those are deduplicated, but there is redundancy across multiple PCs as well as overhead from object headers.
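
To make the description above concrete, here is a small model of the mapping. The dictionary layout, the bit order (bit i stands for slot i), and the helper name are assumptions for illustration and do not reflect how RawStackMap is laid out.

```python
from typing import Dict, List

# Hypothetical model: PC offset from the start of the function -> bitmap.
# Bit i set means stack slot i holds a heap pointer at that PC.
StackMaps = Dict[int, int]


def pointer_slots(bitmap: int, num_slots: int) -> List[int]:
    """Return the indices of the stack slots the GC must visit."""
    return [i for i in range(num_slots) if (bitmap >> i) & 1]


stack_maps: StackMaps = {
    0x10: 0b0101,  # slots 0 and 2 hold heap pointers
    0x2C: 0b0111,  # slots 0, 1 and 2 hold heap pointers
}
```

With this model, a GC pausing at offset `0x10` would call `pointer_slots(stack_maps[0x10], 4)` and visit only slots 0 and 2.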

We will start looking at this in the near future.

@wangying3426 "it seems to work fine" does not mean it actually works fine. If you hit a GC at a point where there is a double or an integer on the stack, your application will crash without stack maps. It also changes GC behavior: some objects might end up living longer than they would with StackMaps.

By the way, our primary goal is to reduce the release size of builds now.

Yes, we understand this desire. That is why we are planning to look at this at some point in the future. Right now you have to keep them in the app.

Naive question: should stackmaps be stripped/partially stripped in Release builds? (no is a fine answer).

@eseidelGoogle they can't be stripped - GC can't function without them.

Read the description of a2bb730 for more details, but I've reduced the share of the heap snapshot used by StackMaps when compiling the Flutter Gallery from ~8% to ~2.6%, and the total size of the heap snapshot by 0.3 MB. This was done by lifting the PC offset out of the StackMap object itself, which allowed many more StackMap objects to be canonicalized into one object (originally we generated 49,381 StackMaps, now we only generate 16,139).
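
Why lifting the PC offset out helps canonicalization can be seen in a toy model. The entries, function names, and side-table layout below are made up for illustration and are not the layout used in a2bb730.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, int]  # (pc_offset, bitmap)


def unique_with_pc(entries: List[Entry]) -> int:
    """Count distinct objects when the PC offset is part of each object."""
    return len(set(entries))


def split_out_pc(entries: List[Entry]) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Keep PC offsets in a side table that indexes into a pool of shared bitmaps."""
    pool: Dict[int, int] = {}  # bitmap -> index in the pool
    table = []
    for pc, bitmap in entries:
        index = pool.setdefault(bitmap, len(pool))
        table.append((pc, index))
    # Dicts preserve insertion order, so this lists bitmaps by pool index.
    return table, list(pool)


# Hypothetical safepoints: three of the four share the same bitmap.
entries = [(0x10, 0b101), (0x20, 0b101), (0x30, 0b011), (0x44, 0b101)]
table, bitmaps = split_out_pc(entries)
```

With the PC offset embedded, all four entries are distinct objects. With it lifted out, only two bitmaps remain, and the side table records which one each PC uses.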

I've also tried a few other things that were fairly localized changes: shrinking fields, special encoding for empty StackMaps, and compressing StackMaps where the register bits or the spill slot bits are all set or unset. However, the gains there were minimal, especially after the increased canonicalization. Ideas like Ryan's are likely the way to go for further gains at this point.

However, the gains there were minimal

Thanks for doing the experiments! Did you retain the numbers for different things that you have tried? It would be good to post them here for the future reference.

Yep, I’ve been keeping a doc with various numbers, like those seen in the commit message, for each of the changes (sometimes together with other related changes, sometimes separate). I’ll take some time to add that info here soon!

New change brings more improvements. Here are the pertinent numbers from the commit message:

The impact on AOT snapshot size when compiling the Flutter Gallery
in release mode:

   armv7: Total size -2.58% (Isolate RO: +14.46%, Isolate snapshot: -22.93%)
   armv8: Total size -1.85% (Isolate RO: +15.69%, Isolate snapshot: -22.97%)

The impact on in-memory, not on-disk, size for the Flutter Gallery as seen
in the Observatory while running a profile (not release) build:

   armv7: Drops from 7.1 MB to 6.2 MB (-0.9 MB)
   armv8: Drops from 13.5 MB to 11.7 MB (-1.8 MB)

I have one more CL I'm working on at the moment for this particular issue, and then I'm going to switch my attention to #35851.

Great, we will try it soon!

Status update: most of the low-hanging fruit has been picked. No active development on this is expected anymore; however, @sstrickl has some ideas for how to reduce the size a bit more. One CL is pending. Moving this issue to the icebox for now.
