Dgraph: vlog files use lots of disk space: Add option to set LSMOnly option when opening p dir

Created on 4 Jun 2019 · 9 Comments · Source: dgraph-io/dgraph

Hi,

We have about 10 MB of actual data in Dgraph, but the wal directory, with its vlog files, takes up about 56 GB. I understand that some kind of GC should run every 10 minutes or per GB, but I don't see that the space was ever reclaimed.

Can you explain how this works, and what I'm missing...

Here is information from the badger info --dir /data/dgraph/w command:

[     2019-06-02T13:37:33Z] MANIFEST       16 B MA
[           5 days earlier] 000000.vlog  1.1 GB VL
[           5 days earlier] 000001.vlog  1.1 GB VL
[           5 days earlier] 000002.vlog  1.1 GB VL
[           5 days earlier] 000003.vlog  1.1 GB VL
[           5 days earlier] 000004.vlog  1.1 GB VL
[           5 days earlier] 000005.vlog  1.1 GB VL
[           5 days earlier] 000006.vlog  1.1 GB VL
[           5 days earlier] 000007.vlog  1.1 GB VL
[           5 days earlier] 000008.vlog  1.1 GB VL
[           5 days earlier] 000009.vlog  1.1 GB VL
[           5 days earlier] 000010.vlog  1.1 GB VL
[           4 days earlier] 000011.vlog  1.1 GB VL
[           4 days earlier] 000012.vlog  1.1 GB VL
[           4 days earlier] 000013.vlog  1.1 GB VL
[           4 days earlier] 000014.vlog  1.1 GB VL
[           4 days earlier] 000015.vlog  1.1 GB VL
[           4 days earlier] 000016.vlog  1.1 GB VL
[           4 days earlier] 000017.vlog  1.1 GB VL
[           4 days earlier] 000018.vlog  1.1 GB VL
[           4 days earlier] 000019.vlog  1.1 GB VL
[           4 days earlier] 000020.vlog  1.1 GB VL
[           4 days earlier] 000021.vlog  1.1 GB VL
[           4 days earlier] 000022.vlog  1.1 GB VL
[           4 days earlier] 000023.vlog  1.1 GB VL
[           4 days earlier] 000024.vlog  1.1 GB VL
[           4 days earlier] 000025.vlog  1.1 GB VL
[           4 days earlier] 000026.vlog  1.1 GB VL
[           4 days earlier] 000027.vlog  1.1 GB VL
[           4 days earlier] 000028.vlog  1.1 GB VL
[           4 days earlier] 000029.vlog  1.1 GB VL
[           4 days earlier] 000030.vlog  1.1 GB VL
[           4 days earlier] 000031.vlog  1.1 GB VL
[           4 days earlier] 000032.vlog  1.1 GB VL
[           4 days earlier] 000033.vlog  1.1 GB VL
[           4 days earlier] 000034.vlog  1.1 GB VL
[           4 days earlier] 000035.vlog  1.1 GB VL
[           4 days earlier] 000036.vlog  1.1 GB VL
[           4 days earlier] 000037.vlog  1.1 GB VL
[           4 days earlier] 000038.vlog  1.1 GB VL
[           4 days earlier] 000039.vlog  1.1 GB VL
[           3 days earlier] 000040.vlog  1.1 GB VL
[           3 days earlier] 000041.vlog  1.1 GB VL
[           3 days earlier] 000042.vlog  1.1 GB VL
[           3 days earlier] 000043.vlog  1.1 GB VL
[           3 days earlier] 000044.vlog  1.1 GB VL
[           3 days earlier] 000045.vlog  1.1 GB VL
[           2 days earlier] 000046.vlog  1.1 GB VL
[       25 minutes earlier] 000047.vlog  1.1 GB VL
[        9 minutes earlier] 000048.vlog  1.1 GB VL
[           21 hours later] 000049.vlog  1.1 GB VL
[              1 day later] 000050.vlog  1.1 GB VL
[             2 days later] 000051.vlog   19 MB VL

[EXTRA]
[2019-06-02T13:37:33Z] LOCK            5 B

[Summary]
Total index size:      0 B
Value log size:      56 GB

Abnormalities:
1 extra file.
0 missing files.
0 empty files.
0 truncated manifests.
area/performance kind/enhancement priority/P2 status/accepted

All 9 comments

There are not enough keys in your system. Instead of DefaultOptions, you can use LSMOnlyOptions to build this DB, so that it tries its best to colocate keys and values, which would work better in your case.
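To make the colocation point concrete, here is a rough back-of-the-envelope sketch in Go. It does not call Badger; the pointer size and both threshold values are assumptions chosen for illustration, and `inMemoryBytes` and `writesToFill` are hypothetical helpers. It estimates how many writes of distinct keys it takes to fill a 64 MB in-memory table when large values live in the value log versus when they are stored next to the keys.

```go
package main

import "fmt"

// Illustrative numbers only; none of these are read from Badger itself.
const (
	memtableBytes    = 64 << 20 // 64 MB in-memory table, as quoted in this thread
	pointerBytes     = 12       // hypothetical size of a value pointer
	defaultThreshold = 32       // assumed small value threshold (default-style options)
	lsmOnlyThreshold = 65500    // assumed large value threshold (LSM-only-style options)
)

// inMemoryBytes returns how many bytes one write adds to the in-memory table:
// the key plus either the value itself (colocated) or a pointer into the value log.
func inMemoryBytes(keyLen, valueLen, threshold int) int {
	if valueLen < threshold {
		return keyLen + valueLen
	}
	return keyLen + pointerBytes
}

// writesToFill returns how many distinct-key writes are needed to fill the table.
func writesToFill(keyLen, valueLen, threshold int) int {
	entry := inMemoryBytes(keyLen, valueLen, threshold)
	return (memtableBytes + entry - 1) / entry // ceiling division
}

func main() {
	fmt.Println("default-style: ", writesToFill(16, 1024, defaultThreshold))
	fmt.Println("LSM-only-style:", writesToFill(16, 1024, lsmOnlyThreshold))
}
```

Under these assumptions, colocated 1 KB values fill the table after far fewer writes, which is why a flush (and therefore an SST) would show up sooner.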

Hi @manishrjain

I'm using Dgraph 1.0.15 and I don't see a badger.options argument there, so how do I set it to lsmonly?

Ahh.. yeah. We don't expose that option. Maybe we can add a flag to expose it.

FWIW, I don't see even a single SST file in your badger info output. So, looks like you're just writing data to the same keys repeatedly. This would be eventually GCed as SSTs do compactions, not before.

Yes, we write data to the same keys (mostly). In that case, will we never have an SST file? Does this mean our disk will fill up?

How do you suggest tackling such a case?

@manishrjain by the way, the above command was run against the wal dir.
Here is the same command for the postings dir:

badger info --dir /data/dgraph/p


[     2019-06-04T00:00:27Z] MANIFEST     1.5 kB MA
[          9 hours earlier] 000079.sst    37 MB L0 
[                      now] 000080.sst    42 MB L0 
[              1 day later] 000002.vlog   69 MB VL

[EXTRA]
[2019-06-02T13:44:16Z] LOCK            5 B

[Summary]
Level 0 size:        79 MB
Level 1 size:          0 B
Total index size:    79 MB
Value log size:      69 MB

Abnormalities:
1 extra file.
0 missing files.
0 empty files.
0 truncated manifests.

So we have SST files there.
The question is: what do I do with the wal dir, and how do I clean it?

I have the same problem. How did you solve it?

@jarifibrahim @MichelDiz Have you seen this issue before? Trying to understand if this still needs a fix.

Hey @anurags92, @shamil and @Grool, the value log size issue occurs because GC is unable to reclaim space. Let me explain how badger vlog GC works.

  1. The data being written to badger is first stored in memory.
  2. The in-memory data is flushed to disk once there is enough of it. The flushed data is written to disk in the form of SSTs.
  3. Badger GC will not delete any data from a value log (vlog) file unless the data stored in that file has also been flushed to disk in the form of SSTs. (In other words, badger will not delete data from a value log file unless it is sure the data can be discarded.)
  4. The GC also performs some sampling to ensure that there is enough discardable data for the vlog file to be GCed.
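Steps 3 and 4 can be sketched as a small decision function. This is a simplified model under stated assumptions, not Badger's actual GC code: `vlogEntry`, `shouldRewrite`, and the `flushed` map (standing in for what the on-disk SSTs know) are all hypothetical names. An entry counts as discardable only if the flushed data is known to hold a newer version of the same key, and the file is rewritten only if the discardable share of the sample reaches a chosen ratio.

```go
package main

import "fmt"

// vlogEntry is a hypothetical record sampled from one value log file.
type vlogEntry struct {
	key     string
	version int
	size    int // bytes occupied in the value log
}

// shouldRewrite reports whether a sampled vlog file is worth garbage collecting.
// flushed maps each key to the newest version present in on-disk SSTs; keys that
// were never flushed are absent, so their vlog entries cannot be proven stale.
func shouldRewrite(sample []vlogEntry, flushed map[string]int, discardRatio float64) bool {
	total, stale := 0, 0
	for _, e := range sample {
		total += e.size
		if latest, ok := flushed[e.key]; ok && latest > e.version {
			stale += e.size
		}
	}
	if total == 0 {
		return false
	}
	return float64(stale)/float64(total) >= discardRatio
}

func main() {
	sample := []vlogEntry{{"a", 1, 100}, {"a", 2, 100}, {"a", 3, 100}, {"b", 1, 100}}
	// Nothing flushed yet: no entry can be proven discardable.
	fmt.Println(shouldRewrite(sample, map[string]int{}, 0.5))
	// Version 4 of "a" is on disk: three of the four sampled entries are stale.
	fmt.Println(shouldRewrite(sample, map[string]int{"a": 4}, 0.5))
}
```

In this model, a directory with no SSTs at all (like the wal output above) never satisfies the condition, no matter how many stale versions the vlog files contain.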

The reason value log GC is not working for @shamil is that the data is actively being updated. If existing data is only being updated (no new data is added), then the in-memory data in badger will never be flushed to disk, because a flush happens only once there is enough data to write to disk.

Badger has a 64 MB in-memory skiplist. Unless this skiplist fills up, we cannot flush it and convert it into SSTs, and unless there are on-disk SSTs, we cannot GC the value log.
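A toy simulation of that behaviour, written in Go, may help. It follows the description above rather than Badger's real internals: the assumption is that overwriting a key replaces its in-memory entry, that the in-memory table holds only a key plus a small pointer, and that every write is appended to the value log. `simulate`, `pointerSize`, and the key format are hypothetical.

```go
package main

import "fmt"

const pointerSize = 12 // hypothetical size of a value pointer kept in memory

type result struct {
	flushes   int // how many times the in-memory table filled up and was flushed
	vlogBytes int // total bytes appended to the value log
}

// simulate performs `writes` writes, cycling over `distinctKeys` keys, against
// an in-memory table that is flushed once it holds `capacity` bytes.
func simulate(writes, distinctKeys, valueSize, capacity int) result {
	mem := make(map[string]int) // key -> bytes held in memory for that key
	memSize := 0
	var r result
	for i := 0; i < writes; i++ {
		key := fmt.Sprintf("key-%08d", i%distinctKeys)
		r.vlogBytes += len(key) + valueSize // every write lands in the value log
		entry := len(key) + pointerSize
		memSize += entry - mem[key] // an overwrite replaces the old entry
		mem[key] = entry
		if memSize >= capacity {
			r.flushes++ // table is full: it would become an SST here
			mem = make(map[string]int)
			memSize = 0
		}
	}
	return r
}

func main() {
	const capacity = 64 << 20 // 64 MB, as quoted above
	hot := simulate(200000, 1000, 1024, capacity)
	fmt.Printf("hot keys: flushes=%d vlog=%d MB\n", hot.flushes, hot.vlogBytes>>20)
}
```

With 1,000 hot keys the in-memory table stays at a few kilobytes, so the model never flushes, while the value log keeps growing with every write. That matches the shape of the wal directory in this issue: many vlog files and no SSTs.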

@anurags92 I see two things that we can do here.

  1. Expose an option so that users can run alpha in LSM-only mode. In LSM-only mode, keys and values are stored together, which means the in-memory data structure fills up quickly and is flushed sooner. So, the GC will have some SSTs to work with.
  2. Expose an option to enable/disable the badger.KeepL0InMemory option. This option is enabled by default, and it means badger keeps level 0 in memory. So there are two structures in memory (the skiplist and the level 0 tables), and GC will work only after both structures are flushed to disk.

Ideally, we should have some kind of small-data mode which sets both these options internally. A user might not know the implications of setting these options and dgraph should take care of it.

GitHub issues have been deprecated.
This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.
