Hi,
We have about 10 MB of actual data in Dgraph, but the wal directory with its vlog files takes up about 56 GB. I understand that some kind of GC should run every 10 minutes or after each GB, but I don't see that the space was ever reclaimed.
Can you explain how this works and what I'm missing?
Here is information from the badger info --dir /data/dgraph/w command:
[ 2019-06-02T13:37:33Z] MANIFEST 16 B MA
[ 5 days earlier] 000000.vlog 1.1 GB VL
[ 5 days earlier] 000001.vlog 1.1 GB VL
[ 5 days earlier] 000002.vlog 1.1 GB VL
[ 5 days earlier] 000003.vlog 1.1 GB VL
[ 5 days earlier] 000004.vlog 1.1 GB VL
[ 5 days earlier] 000005.vlog 1.1 GB VL
[ 5 days earlier] 000006.vlog 1.1 GB VL
[ 5 days earlier] 000007.vlog 1.1 GB VL
[ 5 days earlier] 000008.vlog 1.1 GB VL
[ 5 days earlier] 000009.vlog 1.1 GB VL
[ 5 days earlier] 000010.vlog 1.1 GB VL
[ 4 days earlier] 000011.vlog 1.1 GB VL
[ 4 days earlier] 000012.vlog 1.1 GB VL
[ 4 days earlier] 000013.vlog 1.1 GB VL
[ 4 days earlier] 000014.vlog 1.1 GB VL
[ 4 days earlier] 000015.vlog 1.1 GB VL
[ 4 days earlier] 000016.vlog 1.1 GB VL
[ 4 days earlier] 000017.vlog 1.1 GB VL
[ 4 days earlier] 000018.vlog 1.1 GB VL
[ 4 days earlier] 000019.vlog 1.1 GB VL
[ 4 days earlier] 000020.vlog 1.1 GB VL
[ 4 days earlier] 000021.vlog 1.1 GB VL
[ 4 days earlier] 000022.vlog 1.1 GB VL
[ 4 days earlier] 000023.vlog 1.1 GB VL
[ 4 days earlier] 000024.vlog 1.1 GB VL
[ 4 days earlier] 000025.vlog 1.1 GB VL
[ 4 days earlier] 000026.vlog 1.1 GB VL
[ 4 days earlier] 000027.vlog 1.1 GB VL
[ 4 days earlier] 000028.vlog 1.1 GB VL
[ 4 days earlier] 000029.vlog 1.1 GB VL
[ 4 days earlier] 000030.vlog 1.1 GB VL
[ 4 days earlier] 000031.vlog 1.1 GB VL
[ 4 days earlier] 000032.vlog 1.1 GB VL
[ 4 days earlier] 000033.vlog 1.1 GB VL
[ 4 days earlier] 000034.vlog 1.1 GB VL
[ 4 days earlier] 000035.vlog 1.1 GB VL
[ 4 days earlier] 000036.vlog 1.1 GB VL
[ 4 days earlier] 000037.vlog 1.1 GB VL
[ 4 days earlier] 000038.vlog 1.1 GB VL
[ 4 days earlier] 000039.vlog 1.1 GB VL
[ 3 days earlier] 000040.vlog 1.1 GB VL
[ 3 days earlier] 000041.vlog 1.1 GB VL
[ 3 days earlier] 000042.vlog 1.1 GB VL
[ 3 days earlier] 000043.vlog 1.1 GB VL
[ 3 days earlier] 000044.vlog 1.1 GB VL
[ 3 days earlier] 000045.vlog 1.1 GB VL
[ 2 days earlier] 000046.vlog 1.1 GB VL
[ 25 minutes earlier] 000047.vlog 1.1 GB VL
[ 9 minutes earlier] 000048.vlog 1.1 GB VL
[ 21 hours later] 000049.vlog 1.1 GB VL
[ 1 day later] 000050.vlog 1.1 GB VL
[ 2 days later] 000051.vlog 19 MB VL
[EXTRA]
[2019-06-02T13:37:33Z] LOCK 5 B
[Summary]
Total index size: 0 B
Value log size: 56 GB
Abnormalities:
1 extra file.
0 missing files.
0 empty files.
0 truncated manifests.
Not enough keys in your system. Instead of DefaultOption, you can use LSMOnlyOption to build this DB, so it would try its best to colocate keys and values, which would work better in your case.
Hi @manishrjain
I'm using dgraph 1.0.15 and I don't see a badger.options argument there, so how do I set it to lsmonly?
Ahh.. yeah. We don't expose that option. Maybe we can add a flag to expose it.
FWIW, I don't see even a single SST file in your badger info output. So, looks like you're just writing data to the same keys repeatedly. This would be eventually GCed as SSTs do compactions, not before.
Yes, we write data to the same keys (mostly). In that case, will we never have an SST file? Does this mean our disk will fill up?
How do you suggest we tackle such a case?
@manishrjain by the way, the above command was invoked for the wal dir.
Here is the same command for the postings dir:
badger info --dir /data/dgraph/p
[ 2019-06-04T00:00:27Z] MANIFEST 1.5 kB MA
[ 9 hours earlier] 000079.sst 37 MB L0
[ now] 000080.sst 42 MB L0
[ 1 day later] 000002.vlog 69 MB VL
[EXTRA]
[2019-06-02T13:44:16Z] LOCK 5 B
[Summary]
Level 0 size: 79 MB
Level 1 size: 0 B
Total index size: 79 MB
Value log size: 69 MB
Abnormalities:
1 extra file.
0 missing files.
0 empty files.
0 truncated manifests.
So we have SST files there.
The question is what to do with the wal dir: how do I clean it?
I have the same problem. How did you solve it?
@jarifibrahim @MichelDiz Have you seen this issue before? Trying to understand if this still needs a fix.
Hey @anurags92, @shamil and @Grool, the value file size issue occurs because GC is unable to reclaim space. Let me explain how badger vlog GC works.
Value log GC is not working for @shamil because the data is actively being updated. If existing data is only being updated (no new data is added), then the in-memory data in badger will never be flushed to disk. This is because the flush happens only once there is enough data to be written to disk.
Badger has a 64 MB in-memory skiplist. Unless this skiplist is filled up, we cannot flush it and convert it into SSTs, and unless there are on-disk SSTs, we cannot GC the value log.
@anurags92 I see two things that we can do here.
1. [...] in-memory data structure will fill up quickly and it will be flushed sooner. So, the GC will have some SSTs to work with.
2. [...] badger.KeepL0InMemory option. This option is enabled by default and it means badger stores level 0 in memory. This means there are two structures in memory (the skiplist and the level 0 tables), and GC will work only after both structures are flushed to disk.

Ideally, we should have some kind of small-data mode which sets both these options internally. A user might not know the implications of setting these options, and dgraph should take care of it.
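
To make the mechanism described above concrete, here is a rough toy model in Python. It is only a sketch of the rule as explained in this comment (every write is appended to the value log, only new keys grow the in-memory table, and value log GC needs at least one flushed SST); it is not Badger's actual implementation. The names ToyStore, simulate, memtable_limit and can_gc_vlog are made up for illustration, and the workload of 10 keys with 1 MB values is an assumption chosen to resemble the roughly 10 MB of live data mentioned at the top of the thread.

```python
from dataclasses import dataclass, field

MB = 1024 * 1024


@dataclass
class ToyStore:
    """Toy model of the flush/GC rule from the thread; not Badger's real code."""
    memtable_limit: int = 64 * MB                 # flush threshold (64 MB in the thread)
    memtable: dict = field(default_factory=dict)  # key -> size of its latest value
    vlog_bytes: int = 0                           # append-only value log size
    sst_count: int = 0                            # on-disk tables created by flushes

    def memtable_bytes(self) -> int:
        return sum(self.memtable.values())

    def put(self, key: str, value_size: int) -> None:
        # Every write is appended to the value log, including overwrites.
        self.vlog_bytes += value_size
        # Assumption of this model: an overwrite replaces the existing entry,
        # so only new keys make the in-memory table grow.
        self.memtable[key] = value_size
        if self.memtable_bytes() >= self.memtable_limit:
            # The in-memory table is full: flush it into one on-disk SST.
            self.sst_count += 1
            self.memtable.clear()

    def can_gc_vlog(self) -> bool:
        # Per the explanation in the thread: no on-disk SSTs, no value log GC.
        return self.sst_count > 0


def simulate(memtable_limit: int, keys: int = 10,
             value_size: int = MB, rounds: int = 100) -> ToyStore:
    """Rewrite the same small set of keys over and over."""
    store = ToyStore(memtable_limit=memtable_limit)
    for _ in range(rounds):
        for k in range(keys):
            store.put(f"key-{k}", value_size)
    return store


if __name__ == "__main__":
    for limit in (64 * MB, 8 * MB):
        s = simulate(limit)
        print(f"limit={limit // MB} MB  vlog={s.vlog_bytes // MB} MB  "
              f"ssts={s.sst_count}  gc_possible={s.can_gc_vlog()}")
```

With the 64 MB limit, the model never flushes because the live data stays around 10 MB, while the value log keeps growing with every rewrite. With a smaller limit, flushes happen and GC becomes possible in the model, which is the effect the first point above describes.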
Github issues have been deprecated.
This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.
