As described in e.g. here, ARC does not count the same way as the Linux buffer cache (e.g. yellow in htop), and ZoL does not use the buffer cache. ARC shows up as green in htop.
More importantly, ARC does not count as MemAvailable in /proc/meminfo.
This interacts badly with various memory-related tools on Linux.
Most prominently, this affects software like earlyoom (and possibly Facebook's similar oomd), which can be used to avoid hangs due to inefficiencies in Linux's OOM killer that result in ~20 minute system stalls (recent news coverage e.g. here). I observe this myself daily: programs get killed by earlyoom even though GBs could (and would) be available from ARC. Others have noticed this as well.
This puts ZoL and its users at a disadvantage and forbids its usage in some cases.
earlyoom on desktop system with ZoL.
echo 3 | sudo tee /proc/sys/vm/drop_caches
It seems the best solution would be to count reclaimable ARC memory into the kernel's MemAvailable statistic. This would be better than having to adjust every single application to read e.g. /proc/spl/kstat/zfs/arcstats and do computations itself.
I've encountered the same problem with qemu/kvm. Despite ARC being cache-like, it would not free memory when qemu/kvm requested RAM.
Another relevant question:
Even if you did want to update every single application instead, by what value would you have to adjust MemAvailable?
I am trying + size - c_min - arc_meta_min, but there is still a 300-500 MB difference between that and what e.g. htop shows that's unaccounted for, which does not seem to exist on a similar ext4 system.
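For anyone who wants to try that adjustment in a script, here is a rough sketch of it. It assumes that arcstats reports size, c_min and arc_meta_min in bytes, that arc_meta_min may be missing on some ZFS versions (treated as 0 here), and that /proc/meminfo values are in kB. The function names are made up for illustration, and the result is only the estimate described above, so it will still show the unexplained 300-500 MB gap.

```python
from pathlib import Path

ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")
MEMINFO = Path("/proc/meminfo")


def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: bytes}."""
    out = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        value = int(parts[0])
        # Most fields carry a "kB" unit; plain counters (HugePages_*) do not.
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        out[name.strip()] = value
    return out


def parse_arcstats(text):
    """Parse arcstats text into {name: value}, skipping the two header lines."""
    out = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "<name> <type> <value>" with numeric type and value.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            out[parts[0]] = int(parts[2])
    return out


def adjusted_available(meminfo, arcstats):
    """MemAvailable + size - c_min - arc_meta_min, in bytes (an estimate only)."""
    reclaimable = (
        arcstats["size"]
        - arcstats["c_min"]
        - arcstats.get("arc_meta_min", 0)  # assumption: absent means 0
    )
    # Never subtract from MemAvailable if the ARC is already at its minimum.
    return meminfo["MemAvailable"] + max(reclaimable, 0)


if __name__ == "__main__":
    if ARCSTATS.exists() and MEMINFO.exists():
        mem = parse_meminfo(MEMINFO.read_text())
        arc = parse_arcstats(ARCSTATS.read_text())
        print(f"MemAvailable:          {mem['MemAvailable'] / 2**20:.0f} MiB")
        print(f"Adjusted for ARC:      {adjusted_available(mem, arc) / 2**20:.0f} MiB")
    else:
        print("arcstats or meminfo not found; is the ZFS module loaded?")
```

Every tool would need something like this, which is exactly why accounting for it in MemAvailable itself seems preferable.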
This is something we can take another look at. When this code was originally integrated with the kernel there were a variety of technical issues which prevented us from reporting the ARC space as page cache pages. Since then the Linux kernel has changed considerably, as has ZFS, so this may now be more feasible and is definitely worth re-investigating.