Syncthing: Compression option on folders

Created on 9 Aug 2017 · 4 Comments · Source: syncthing/syncthing

Whether to compress data should be decided per folder, not per device.

For instance, I have a device that syncs one folder with database log segments (highly compressible binary data) and another folder with "tar.gz" backups. I've turned on compression for the entire device because the log segments are 16 MB each and usually compress (with gzip) to something like 50~100 KB, but compression is wasted on the backup files.

enhancement

Most helpful comment

IIRC the original idea behind making it depend on the device was to make it easier for devices low on processing power (or normal devices with higher bandwidth) like the first generation Raspberry Pi by allowing technical users to disable compression for those devices (see https://github.com/syncthing/syncthing/issues/446).

Later in the discussion we agreed on "metadata only" as the default (https://github.com/syncthing/syncthing/issues/1374), because we didn't expect to save a lot of bandwidth except from log files and our own metadata, as most other big files are media files which are already compressed.

IMHO this request makes sense, as the assumptions have changed from when this setting was created. We don't need to turn it off any more for less powerful devices (or those with a high bandwidth connection), because compressing everything isn't the default anymore. Now the setting only has to allow technical users to enable it when needed. But how do we decide which content is worth compressing? By folder, because that's what decides the actual content to be transmitted.

So for me there are only two sensible ways forward:

  • We either actually test the original hypothesis ("it makes sense to lz4 everything") on PCs, laptops, SBCs and mobile devices and, if the results are obviously in favor of compression, re-enable it as the default and keep the compression option device-dependent.
  • Or we decide the current default (metadata only) is the best in today's ecosystem and make it easier to enable compression for specific folders with highly compressible data, as requested.
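To make the second option concrete, here is a minimal sketch of how a per-folder setting could override the per-device one. The device-level values "metadata", "always" and "never" are the existing ones; the folder-level `compression` field and the `should_compress_data` helper are hypothetical and only illustrate the request, they are not Syncthing's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Device:
    device_id: str
    # Existing per-device setting: "metadata" (the default), "always" or "never".
    compression: str = "metadata"


@dataclass
class Folder:
    folder_id: str
    # Hypothetical per-folder override; None means "inherit from the device".
    compression: Optional[str] = None


def should_compress_data(device: Device, folder: Folder) -> bool:
    """Decide whether file data (not metadata) is compressed for this pair."""
    mode = folder.compression if folder.compression is not None else device.compression
    return mode == "always"


server = Device("server")                      # keeps the "metadata" default
logs = Folder("db-logs", compression="always")  # highly compressible data
backups = Folder("backups")                     # tar.gz files, inherit default
```

With this shape the log folder gets compressed and the backup folder does not, without touching the device-wide default.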

All 4 comments

Honestly, I suspect the lz4 compression we use is cheap enough to always be less expensive than the crypto, and that it pretty quickly realizes when something is incompressible and gives up. But that's an argument against being able to configure it at all, not against making it more granular when it is configurable.
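The "give up on incompressible data" idea can be sketched roughly as follows. This is an illustration only: it uses zlib from the Python standard library as a stand-in for lz4, and the function names and the (flag, payload) framing are assumptions, not Syncthing's wire format.

```python
import os
import zlib

BLOCK_SIZE = 128 * 1024  # block size mentioned in this thread


def encode_block(block: bytes) -> tuple[bool, bytes]:
    """Return (is_compressed, payload), sending raw bytes when compression does not help."""
    candidate = zlib.compress(block, 1)  # level 1 = fastest; stand-in for lz4
    if len(candidate) < len(block):
        return True, candidate
    return False, block


def decode_block(is_compressed: bool, payload: bytes) -> bytes:
    """Reverse encode_block."""
    return zlib.decompress(payload) if is_compressed else payload


# A repetitive, log-like block versus a block of random bytes
# (random bytes behave much like already-compressed tar.gz data).
line = b"2017-08-09 12:00:00 INFO segment flushed\n"
log_block = (line * (BLOCK_SIZE // len(line) + 1))[:BLOCK_SIZE]
random_block = os.urandom(BLOCK_SIZE)

for name, block in (("log", log_block), ("random", random_block)):
    is_compressed, payload = encode_block(block)
    print(name, is_compressed, len(payload))
```

The log-like block shrinks to a small fraction of its size, while the random block is sent as-is, so the only cost for incompressible data is the wasted compression attempt.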

Also, compression is applied at the block level, and a block is 128 KiB, so your 16 MB to 100 KB assumption is wrong right off the bat. The suggestion is still perfectly valid, though.

Not necessarily. We can request 128 KiB of the log data and receive 800 bytes of compressed data, achieving roughly the mentioned compression ratio. But the overhead is significant at that level. LZ4 is slightly less efficient than gzip but that's not the point anyhow.
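The arithmetic behind that, as a quick sketch (the 800 bytes per block is the illustrative figure from above, not a measurement, and the function name is made up for this example):

```python
KIB = 1024
MIB = 1024 * KIB


def per_block_total(file_size: int, block_size: int, compressed_block_size: int) -> tuple[int, int]:
    """Return (number of blocks, total compressed bytes) for block-wise compression."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks, blocks * compressed_block_size


blocks, total = per_block_total(16 * MIB, 128 * KIB, 800)
print(blocks)         # 128
print(total)          # 102400
print(total / KIB)    # 100.0
```

So a 16 MiB segment split into 128 blocks of 128 KiB, each compressing to about 800 bytes, still ends up at about 100 KiB on the wire.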
