Meilisearch: Empty data.ms folder takes 200 GB disk space

Created on 13 Feb 2020  ·  4 Comments  ·  Source: meilisearch/MeiliSearch

Hello,

When I run cargo run --release on a freshly cloned repo, it creates a data.ms folder that is 200 GB in size. Why is this folder so big before I have indexed any documents? After I indexed the movies.json file, which is around 8 MB, the data.ms folder size stayed the same. Is this some kind of pre-allocation of disk space? Can I reduce it by setting an environment variable, or could you point me to the code where this is hardcoded?

I'm running MeiliSearch on Windows.

Thanks

question

All 4 comments

I found the file where the sizes are hardcoded. It is in meilisearch-core/src/database.rs in the Database's open_or_create function, but I'm still wondering about the required size.

Hey @imor,
This is the first time I have seen the disk space used equal the LMDB::max_db_size parameter. This setting is normally an upper bound on the size of the database (data + updates, 100 GB + 100 GB). It seems that on Windows this setting is not handled the same way as on Unix systems: Windows appears to reserve the disk space beforehand.
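
To illustrate why an upper bound does not normally cost disk space on Unix, here is a minimal sketch. It is not LMDB's or MeiliSearch's code, and the function name reserve_without_writing is made up for this example. It extends a file to a given length without writing data, then compares the logical size with the bytes actually allocated. It assumes a Unix filesystem that supports sparse files.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Hypothetical helper: creates a file of `logical_size` bytes without
/// writing any data, then returns (logical size, bytes allocated on disk).
#[cfg(unix)]
fn reserve_without_writing(path: &Path, logical_size: u64) -> io::Result<(u64, u64)> {
    use std::os::unix::fs::MetadataExt;

    let file = File::create(path)?;
    // Extend the file to the requested length. On filesystems with sparse
    // file support this usually allocates few or no data blocks.
    file.set_len(logical_size)?;

    let meta = fs::metadata(path)?;
    // `blocks()` reports allocated space in 512-byte units.
    Ok((meta.len(), meta.blocks() * 512))
}
```

On a typical Linux filesystem the second value stays far below the first until data is written. The report above suggests that on Windows the full size is reserved up front instead.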

This setting should have been configurable, but we haven't made it so yet. It seems you will need to change the hardcoded value by hand for now, sorry about that.
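
As a rough idea of what a configurable limit could look like, here is a minimal sketch. The environment variable EXAMPLE_MAX_DB_SIZE and both function names are hypothetical and are not existing MeiliSearch options. The default assumes that the 100 GB mentioned above means 100 * 1024^3 bytes.

```rust
use std::env;

/// Assumed default upper bound per database, in bytes
/// (100 * 1024^3, following the 100 GB figure in this thread).
const DEFAULT_MAX_DB_SIZE: u64 = 100 * 1024 * 1024 * 1024;

/// Parses an optional override given in bytes. Falls back to the default
/// when the value is missing, not a valid number, or zero.
fn max_db_size_from(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_MAX_DB_SIZE)
}

/// Reads the hypothetical `EXAMPLE_MAX_DB_SIZE` environment variable.
fn max_db_size_from_env() -> u64 {
    max_db_size_from(env::var("EXAMPLE_MAX_DB_SIZE").ok().as_deref())
}
```

The returned value would then be passed wherever the hardcoded size is used today, for example in the open_or_create function mentioned earlier in this thread.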

The required size for 8MB of data is probably something between 2GB and 5GB. Internal indexes are big!

Thank you for this report :)

This is related to https://github.com/mozilla/lmdb-rs/issues/40

Unfortunately, it is nuanced: there is a fix of sorts upstream, but it has never been enabled in a stable release for performance reasons.

Closed by #646
