Go: archive/tar: should treat empty tar files as invalid

Created on 23 Mar 2019 · 3 Comments · Source: golang/go

What version of Go are you using (go version)?

https://play.golang.org/

What did you do?

https://play.golang.org/p/VeyYROaJQZu

What did you expect to see?

An error to be printed, similar to how GNU tar produces an error when reading an empty file:

$ touch empty
$ tar --version
tar (GNU tar) 1.29
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by John Gilmore and Jay Fenlason.
$ tar tf empty
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors

What did you see instead?

No error printed.

FrozenDueToAge NeedsDecision

All 3 comments

tar archives lack a unified formal specification of what the format actually is. Though the USTAR and PAX formats specify that an archive should end with 2 blocks of zeros, the reality is that this specification came after the world had already been doing all sorts of weird things. Most parsers treat the last 2 blocks of zeros as optional for maximal compatibility. This even goes for GNU tar, which also treats those blocks as optional. It just so happens that GNU tar has a special case where it fails on an empty file. On the other hand, BSD tar seems perfectly happy parsing an empty file.

Personally, I think we should keep the current behavior, since an empty file should be valid if the trailing 2 blocks of zeros are considered optional. Also, turning an empty file into an error now is likely to break people who have come to depend on this behavior.
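
To illustrate the "trailing blocks are optional" point with archive/tar itself, here is a rough sketch (the function name `entriesWithoutTrailer` is hypothetical). It writes one entry, calls `Flush` instead of `Close` so the two trailing zero blocks are never written, and then reads the result back:

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// entriesWithoutTrailer builds a one-entry archive that lacks the two
// trailing zero blocks and reports how many entries a tar.Reader finds.
func entriesWithoutTrailer() (int, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)

	body := []byte("hi")
	hdr := &tar.Header{Name: "a.txt", Mode: 0600, Size: int64(len(body))}
	if err := tw.WriteHeader(hdr); err != nil {
		return 0, err
	}
	if _, err := tw.Write(body); err != nil {
		return 0, err
	}
	// Flush pads the entry to a block boundary. Close is deliberately not
	// called, so the end-of-archive zero blocks are never written.
	if err := tw.Flush(); err != nil {
		return 0, err
	}

	tr := tar.NewReader(&buf)
	n := 0
	for {
		_, err := tr.Next()
		if err == io.EOF {
			return n, nil // clean end of input, trailer or not
		}
		if err != nil {
			return n, err
		}
		n++
	}
}

func main() {
	n, err := entriesWithoutTrailer()
	fmt.Println(n, err)
}
```

If the trailer is optional after one entry, the empty file is simply the zero-entry case of the same rule.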

For reference, Perl also treats an empty file as valid:

perl -MArchive::Tar -e 'my $tar = Archive::Tar->new; $tar->read("./empty")'

Python raises an error:

python -c 'import tarfile; tarfile.TarFile("./empty")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/tarfile.py", line 1587, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python2.7/tarfile.py", line 2371, in next
    raise ReadError("empty file")
tarfile.ReadError: empty file

I'm going to close this as working as intended:

  1. Had we been implementing archive/tar from scratch, perhaps we could have treated an empty file as invalid, but that has not been the case since Go 1.0. Changing it may break people relying on this property.
  2. Many tar implementations also treat an empty file as valid, so we are hardly the exception.