Zfs: ARC not accounted as MemAvailable in /proc/meminfo

Created on 25 Apr 2020 · 3 Comments · Source: openzfs/zfs

Describe the problem you're observing

As described e.g. here, ARC is not accounted for in the same way as the Linux buffer cache (shown in yellow in htop), and ZoL does not use the buffer cache. Instead, ARC shows up as green in htop.

More importantly, ARC does not count as MemAvailable in /proc/meminfo.

This interacts badly with various memory-related tools on Linux.

Most prominently, it affects software like earlyoom (and possibly Facebook's similar oomd), which is used to avoid hangs caused by inefficiencies in Linux's OOM killer that result in ~20 minute system stalls (recent news coverage e.g. here). I observe this myself daily: programs get killed by earlyoom even though GBs could (and would) be made available from ARC. Others have noticed this as well.

This puts ZoL and its users at a disadvantage and rules out its use in some cases.

Describe how to reproduce the problem

  • Install earlyoom on a desktop system with ZoL.
  • Do some heavy IO, then some heavy browsing with many tabs, and observe some of the tabs getting killed.
  • Observe that this stops when you do echo 3 | sudo tee /proc/sys/vm/drop_caches.

Suggested solution

It seems the best solution would be to include reclaimable ARC memory in the kernel's MemAvailable statistic. This would be better than having to adjust every single application to read e.g. /proc/spl/kstat/zfs/arcstats and do the computation itself.

Memory Management

All 3 comments

I've encountered the same problem with qemu/kvm. Despite ARC being cache-like, it was not freed when qemu/kvm requested RAM.

Another relevant question:

Even if you did want to update every single application instead, by what value would you have to adjust MemAvailable?

I am trying MemAvailable + size - c_min - arc_meta_min, but there is still a 300-500 MB difference between that and what e.g. htop shows that is unaccounted for, and which does not seem to exist on a similar ext4 system.
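
For reference, a minimal Python sketch of that calculation. It is only an approximation under stated assumptions, not anything ZFS or the kernel provides: the function names are made up for illustration, arcstats rows are assumed to have the usual "name type data" layout with values in bytes, /proc/meminfo values marked kB are taken as multiples of 1024 bytes, and any arcstats field that is missing is treated as 0. It does not explain the remaining 300-500 MB gap.

```python
MEMINFO_PATH = "/proc/meminfo"
ARCSTATS_PATH = "/proc/spl/kstat/zfs/arcstats"


def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: bytes}; 'kB' values are scaled by 1024."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


def parse_arcstats(text):
    """Parse arcstats text into {name: value}, skipping the two header lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "<name> <type> <data>" with numeric type and data.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def adjusted_mem_available(meminfo, arcstats):
    """MemAvailable + size - c_min - arc_meta_min, never subtracting from MemAvailable."""
    reclaimable = (
        arcstats.get("size", 0)
        - arcstats.get("c_min", 0)
        - arcstats.get("arc_meta_min", 0)  # assumption: absent field counts as 0
    )
    return meminfo["MemAvailable"] + max(reclaimable, 0)


def main():
    try:
        with open(MEMINFO_PATH) as f:
            meminfo = parse_meminfo(f.read())
        with open(ARCSTATS_PATH) as f:
            arcstats = parse_arcstats(f.read())
        adjusted = adjusted_mem_available(meminfo, arcstats)
    except (OSError, KeyError) as exc:
        print(f"could not compute adjusted MemAvailable: {exc!r}")
        return
    print(f"MemAvailable (kernel):   {meminfo['MemAvailable'] // 2**20} MiB")
    print(f"MemAvailable (with ARC): {adjusted // 2**20} MiB")


if __name__ == "__main__":
    main()
```

Running it before and after echo 3 | sudo tee /proc/sys/vm/drop_caches should show how much the two numbers diverge on a given system.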

This is something we can take another look at. When this code was originally integrated with the kernel there were a variety of technical issues which prevented us from reporting the ARC space as page cache pages. Since then the Linux kernel has changed considerably, as has ZFS, so this may now be more feasible and is definitely worth re-investigating.
