Go-ipfs: fix storage limits

Created on 9 Aug 2016 · 12 comments · Source: ipfs/go-ipfs

We currently have code that is supposed to limit the size of an fsrepo, but I'm fairly certain it doesn't actually work. We need to go through, test, and fix this.

This is also (I believe) related to the auto-gc code, which I'm not sure actually works either.

Code links:

Notes:

  • need to keep approximate repo size so we don't have to constantly recalculate (see the sketch below)
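
A minimal sketch of that note, purely illustrative and not the go-ipfs implementation: keep a running byte count that is adjusted on every write and delete, so the size never has to be recomputed from disk. The `sizeTracker` name and its methods are hypothetical.

```go
// Package repo: illustrative sketch only, not the go-ipfs implementation.
package repo

import "sync/atomic"

// sizeTracker keeps an approximate running total of the repo size in bytes.
type sizeTracker struct {
	approxSize int64
}

// recordPut is called after a block of n bytes is written to the datastore.
func (t *sizeTracker) recordPut(n int64) { atomic.AddInt64(&t.approxSize, n) }

// recordDelete is called after a block of n bytes is removed (e.g. by GC).
func (t *sizeTracker) recordDelete(n int64) { atomic.AddInt64(&t.approxSize, -n) }

// size returns the current estimate without touching the disk.
func (t *sizeTracker) size() int64 { return atomic.LoadInt64(&t.approxSize) }
```

The estimate can drift if the process crashes between a write and the counter update, so a real implementation would presumably rescan the repo (or persist the counter) on startup.
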
Labels: dif/hard · help wanted · status/deferred · topic/repo

All 12 comments

@whyrusleeping if this is still open, I would love to take a crack at it.

@lanzafame I don't believe this has been resolved. At the very least, we need to test it. If the test proves that it's broken, then we need to fix it. Go ahead and give it a shot :)

@whyrusleeping So I re-enabled the repo-gc-auto tests, and they still fail.
output:

*** test/sharness/t0082-repo-gc-auto.sh ***
/Users/lanzafame/dev/src/github.com/ipfs/go-ipfs/test/sharness
ok 1 - ipfs init succeeds
ok 2 - prepare config -- mounting
ok 3 - generate 2 600 kB files and 2 MB file using go-random
ok 4 - set ipfs gc watermark, storage max, and gc timeout
ok 5 - 'ipfs daemon' succeeds
ok 6 - api file shows up
ok 7 - set up address variables
ok 8 - set swarm address vars
ok 9 - 'ipfs daemon' is ready
ok 10 - adding data below watermark doesn't trigger auto gc
not ok 11 - adding data beyond watermark triggers auto gc
#
#       HASH=`ipfs add -q 600k2` &&
#       ipfs pin rm -r $HASH &&
#       go-sleep 40ms &&
#       DU=$(disk_usage "$IPFS_PATH/blocks") &&
#       if test $(uname -s) = "Darwin"; then
#         test "$DU" -lt 1400  # 60% of 2MB
#       else
#         test "$DU" -lt 1000000
#       fi
#
not ok 12 - adding data beyond storageMax fails
#
#     test_must_fail ipfs add 2M 2>add_fail_out
#
not ok 13 - ipfs add not enough space message looks good
#
#     echo "Error: file size exceeds slack space allowed by storageMax. Maybe unpin some files?" >add_fail_exp &&
#     test_cmp add_fail_exp add_fail_out
#
ok 54 - periodic auto gc stress test
ok 55 - 'ipfs daemon' is still running
ok 56 - 'ipfs daemon' can be killed
# failed 32 among 56 test(s)

I have been able to solve test 12, but am still digging into test 11.

@whyrusleeping I have been digging around all GC-related areas of the code base and am struggling to figure out where the daemon should be triggering auto GC when data is added beyond the watermark. Any ideas would be greatly appreciated 👍

@lanzafame hrm... I think @kevina or @Kubuxu might have an idea

@whyrusleeping I am not that familiar with the code, should the auto-gc be triggered at all? I thought we removed that code.

I'm pretty sure it's still there; I believe @lgierth has it running on the gateways...

  • StorageMax
    An upper limit on the total size of the ipfs repository's datastore. Writes to
    the datastore will begin to fail once this limit is reached.

This part in docs/config.md is currently inaccurate - the datastore will generally accept writes until the disk is full. The StorageMax value is used for:

  1. printing it in ipfs repo stat
  2. checking whether GC watermark is hit and GC should run

So you need to run with --enable-gc for StorageMax to do anything at all. When GC is active, it's gonna check the repo size every GCPeriod, and trigger once it exceeds StorageMax * GCWatermark / 100 (i.e. the watermark percentage of StorageMax).
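
A rough sketch of that loop, with hypothetical names (`repoSize`, `runGC`) standing in for the real plumbing:

```go
package repo

import "time"

// periodicGC is a sketch of the check described above, not the actual
// go-ipfs code. gcWatermark is a percentage: 90 means "collect once the
// repo reaches 90% of storageMax".
func periodicGC(gcPeriod time.Duration, storageMax, gcWatermark int64,
	repoSize func() int64, runGC func()) {
	for range time.Tick(gcPeriod) {
		if repoSize() > storageMax*gcWatermark/100 {
			runGC()
		}
	}
}
```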

So eventually we should make StorageMax a hard limit on the disk space used, as it was meant to be in the beginning, but I think we should defer that to next year, since proper accounting of the disk space used (without constantly scanning the repo) will actually be a bit tricky, and that's just one ball too many in the air right now :)
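
If StorageMax does become a hard limit, the enforcement could look roughly like this. This is again hypothetical, building on the `sizeTracker` sketch from the issue notes above: refuse a write up front when it would push the tracked size past the limit.

```go
package repo

import "fmt"

// checkedPut refuses a write that would push the tracked repo size past
// storageMax; otherwise it performs the write and updates the estimate.
func (t *sizeTracker) checkedPut(blockLen, storageMax int64, put func() error) error {
	if t.size()+blockLen > storageMax {
		return fmt.Errorf("adding %d bytes would exceed StorageMax (%d bytes)", blockLen, storageMax)
	}
	if err := put(); err != nil {
		return err
	}
	t.recordPut(blockLen)
	return nil
}
```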

@lanzafame do you think you could come up with better wording for the StorageMax docs though?

@lgierth Just checking that I understand correctly: with the --enable-gc flag set, the StorageMax value is used together with the GCWatermark value to determine, every GCPeriod, whether a garbage collection should run?

@lanzafame yep correct!
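
For anyone following along, all three knobs discussed here live in the Datastore section of the config, where the watermark key is spelled StorageGCWatermark; if I remember right, the stock defaults look like this:

```json
"Datastore": {
  "StorageMax": "10GB",
  "StorageGCWatermark": 90,
  "GCPeriod": "1h"
}
```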
