Irssi: very fragmented DCC on NTFS

Created on 15 Jan 2020  ·  5 Comments  ·  Source: irssi/irssi

I have a problem with very, very fragmented files when downloading via DCC. I just don't know how much of it (if any) Irssi is responsible for on my setup. Unfortunately, I use Windows with NTFS and launch irssi via Cygwin.

Please, don't judge.

To illustrate how fragmented the files (yellow color) are:
https://i.imgur.com/PccE8sw.png

After zooming in on a cluster near the middle:
https://i.imgur.com/TQ3uNBV.png

All the yellow parts are one file, approx. 1.3 GB. Getting it via DCC is very fast, that I can tell; reading from it is a nightmare. When DCC'ing multiple files, they interleave freely, because why shouldn't they.

I looked through the /set options for DCC, but nothing there was even remotely helpful. It looks like irssi writes the data in as many small chunks as possible, and the filesystem can't help but scatter them as widely as it can.

enhancement question

All 5 comments

We could preallocate!

Apparently there's fallocate and posix_fallocate: the former is Linux-specific and the latter is standard, but if the latter is used on a filesystem that doesn't support it, glibc falls back to a really shitty/slow emulation. So usually fallocate is preferred, because it fails outright if not supported. And a typical filesystem that doesn't support this is ntfs-3g.

Which sounds like there's no way to fix this, except this isn't Linux!

While searching for this I found this rsync patch, which became this commit with no proper attribution. Apparently Cygwin specifically is the only platform where posix_fallocate is preferred, and it handles NTFS just fine.

The man page also documents this nicely:

--preallocate
       This tells the receiver to allocate each destination file to its
       eventual size before writing data to the file.  Rsync will only use
       the real filesystem-level preallocation support provided by Linux’s
       fallocate(2) system call or Cygwin’s posix_fallocate(3), not the slow
       glibc implementation that writes a null byte into each block.

       Without this option, larger files may not be entirely contiguous on
       the filesystem, but with this option rsync will probably copy more
       slowly.  If the destination is not an extent-supporting filesystem
       (such as ext4, xfs, NTFS, etc.), this option may have no positive
       effect at all.

This seems to be a default-off option for rsync, and since irssi's primary use case isn't file transfer, we could have a simpler implementation, IMO: if a /set is enabled, call posix_fallocate() on the file before writing to it. If it's slow, then the user can just turn it off.
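
To make the idea concrete, here is a rough sketch of what such a preallocation step might look like. This is not irssi code: the helper name `dcc_preallocate` and the `enabled` flag (standing in for the proposed /set) are made up for illustration, and it assumes the expected file size is already known from the DCC offer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of irssi: reserve `size` bytes for an
 * incoming DCC file that is already open as `fd`.
 *
 * `enabled` stands in for the proposed /set toggle. When it is off, or
 * when the size is unknown (<= 0), nothing is done.
 *
 * Returns 0 on success or when skipped, otherwise an error number.
 * Note that posix_fallocate() returns the error number directly and
 * does not set errno.
 */
static int dcc_preallocate(int fd, off_t size, int enabled)
{
    int err;

    if (!enabled || size <= 0)
        return 0;

    /* Retry if the call was interrupted by a signal. */
    do {
        err = posix_fallocate(fd, 0, size);
    } while (err == EINTR);

    return err;
}
```

A caller would probably treat a non-zero return as non-fatal and simply carry on with the transfer without preallocation. On glibc with a filesystem that lacks native support, this call may fall back to the slow emulation mentioned above, which is the case the toggle is meant to cover.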

Preallocating space surely does sound like a perfect solution for my peculiar case. Where do I sign?

I assume you call/start Cygwin with --preallocate.

As a workaround, you can always copy the file to a new copy after the transfer is finished.

@vague666 no such flag in Cygwin, it's for rsync only.
@ailin-nemui that's what I've been doing, along with downloading stuff somewhere other than Windows and NTFS.

I've made a patch for an IRC library that allocates space for incoming DCC files, to check whether the idea holds:
https://i.imgur.com/z28hQhx.png
and it's perfect. For the time being I won't be using irssi but this library.
