Mailcow-dockerized: Use sdbox/mdbox

Created on 7 Feb 2018 · 30 Comments · Source: mailcow/mailcow-dockerized

I was wondering if it would be possible to use sdbox or mdbox instead of maildir? I've already fiddled around with the config but it didn't seem to work.

The idea is that I'd like to use mail_attachment_dir to put larger attachments onto an external storage, but that only works for dbox. Also there's a nice performance boost when using dbox (on larger installations).

Most helpful comment

OK, I've been doing a lot of research and testing regarding DBOX and have completely changed my opinion of it.

The biggest benefit is with the following: MDBOX + SIS + LZ4

DBOX removes the need to scan an entire directory when loading an inbox. It also vastly reduces I/O, because the dbox index files contain the message flags: if a user marks 50 emails as viewed, only one file is updated (with maildir, each message file would be renamed).
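To make the rename-versus-index point concrete, here is a minimal sketch. It assumes the standard maildir naming convention, where flags follow `:2,` in the file name and `S` means seen. The "index" side is a toy text file, not Dovecot's real binary index format, and the file names are made up for illustration.

```python
import os
import tempfile


def mark_seen_maildir(cur_dir):
    """Maildir keeps flags in the file name, so each flag change is a rename."""
    renames = 0
    for name in sorted(os.listdir(cur_dir)):
        base, _, flags = name.partition(":2,")
        if "S" not in flags:
            new_flags = "".join(sorted(set(flags + "S")))
            os.rename(os.path.join(cur_dir, name),
                      os.path.join(cur_dir, f"{base}:2,{new_flags}"))
            renames += 1
    return renames


def mark_seen_indexed(index_path, uids):
    """Toy stand-in for an index-based format: one file holds all the flags."""
    with open(index_path, "a", encoding="utf-8") as index:
        for uid in uids:
            index.write(f"{uid} +Seen\n")
    return 1  # number of files touched


with tempfile.TemporaryDirectory() as root:
    cur = os.path.join(root, "cur")
    os.mkdir(cur)
    # 50 unread messages with hypothetical names and no flags set.
    for uid in range(50):
        open(os.path.join(cur, f"{uid}.host:2,"), "w").close()
    maildir_ops = mark_seen_maildir(cur)
    index_ops = mark_seen_indexed(os.path.join(root, "toy.index"), range(50))
    print(maildir_ops, index_ops)
```

The count of touched files (50 renames against a single index write) is the only thing this sketch tries to show; real I/O cost also depends on the file system and on Dovecot's caching.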

SIS (Single Instance Attachment Storage) allows full attachment de-duplication; one of the domains I tested had a 10:1 space saving, i.e. 10 GB of messages became 1 GB. Using SHA256 as the hash avoids the collision issues observed with SHA1. SHA384/SHA512 can be used if one is still worried about hash collisions.
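The idea behind hash-based de-duplication can be sketched as a small content-addressed store. This is only an illustration of the principle: the class name, the in-memory dictionaries and the 30-recipient scenario are assumptions made for the example, not Dovecot's on-disk layout.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed attachment store (not Dovecot's real layout)."""

    def __init__(self):
        self.blobs = {}  # hex digest -> attachment bytes, stored once
        self.refs = {}   # hex digest -> number of messages pointing at it

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data  # first copy is the only copy kept
        self.refs[digest] = self.refs.get(digest, 0) + 1
        return digest  # each message keeps this pointer instead of the bytes

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


attachment = b"x" * 10_000  # stand-in for a large attachment
store = SingleInstanceStore()
pointers = [store.put(attachment) for _ in range(30)]  # 30 recipients

naive_bytes = 30 * len(attachment)
print(naive_bytes, store.stored_bytes(), len(set(pointers)))
```

Swapping `hashlib.sha256` for `hashlib.sha512` changes only the length of the pointer, which mirrors the choice of hash mentioned above.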

LZ4 gives a 2:1 text compression on the message bodies.

DBOX also enables alternative/slow storage for older emails (ALT=; make sure this directory is owned by root:root with mode 0755, to prevent killing the index files if the alternative storage is not mounted): https://wiki.dovecot.org/MailLocation/dbox

"doveadm altmove" will move emails older than X to the alternative storage, so a simple weekly/monthly cron job is enough to enable slow storage.
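As a rough sketch of what such a cron job could run, the helper below only builds the command line. The 30-day cutoff, the helper name and the choice of `-A` (all users) are assumptions for illustration; check the `doveadm altmove` manual of your Dovecot version before relying on the exact search query.

```python
import shlex
from typing import List, Optional


def altmove_command(age: str = "30d", user: Optional[str] = None) -> List[str]:
    """Build a doveadm altmove call for mail saved more than `age` ago."""
    target = ["-u", user] if user else ["-A"]  # -A applies to all users
    return ["doveadm", "altmove", *target, "savedbefore", age]


cmd = altmove_command("30d")
print(shlex.join(cmd))
# A weekly cron job could execute this with subprocess.run(cmd, check=True).
```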

Notes: SIS requires LMTP via the postfix virtual transport.

With regards to SDBOX or MDBOX:
MDBOX stores many messages in a single file, which allows faster backups and reduced I/O; the map/index file points to the location of each email. If a file is corrupted, many emails will be lost. There are more issues and complications when running over NFS. Also, since MDBOX files change continuously, they hammer backup systems: the entire file has to be copied on every backup, not just the changes.

With SDBOX, each message is stored as a separate file, with the attachments stripped and de-duplicated when running with SIS. The index contains the headers, flags, etc. of each email. If there is corruption, only the single corrupt email is lost.


It's possible to use both mboxes and maildirs for the same user by configuring multiple namespaces. https://wiki.dovecot.org/Namespaces#Mixed_mbox_and_Maildir


As per Zarafa regarding SIS
https://doc.zarafa.com/trunk/Administrator_Manual/en-US/html/_single_instance_attachment_storage.html
6.5. Single Instance Attachment Storage Since ZCP 6.30 the Zarafa Server provides Single Instance Attachment Storage to avoid redundant storage of attachments. This feature, as its name implies, only keeps one copy of each attachment when a message is sent to multiple recipients within the same server. This mechanism, thus, minimizes the disk space requirements and remarkably enhances delivery efficiency when messages with attachments sent to large distribution lists. Let's assume the following situation: user A belongs to a Zarafa server; he sends a message with 10 MB of attachments to 30 users that reside on the same server. In a normal situation 30 copies of the files would be saved on the database, leading to an inefficient usage of the storage space (310 MB of data). With single instance attachment store, only one copy of each attachment is saved on the database (only 10 MB of data in this example) and all the 30 users can access the attachment through a reference pointer.

All 30 comments

removed

removed

I second this - I have used mdbox on large scale systems, maildir would have killed it with IO. Also, there is no coding required for moving mail to another storage layer, dovecot has a built in "alternative storage" with dbox/mdbox: https://wiki2.dovecot.org/MailboxFormat/dbox.

All my deploys of mailcow I have updated maildir to mdbox. For compression I use ZFS :)

@Lennix - you'll have to update the config in

mailcow-dockerized/data/Dockerfiles/dovecot/docker-entrypoint.sh:

SELECT CONCAT('mdbox:/var/vmail/',maildir) ....

(change the first maildir to mdbox)

and add

mail_location = mdbox:~/

to dovecot.conf

and rebuild your image (https://mailcow.github.io/mailcow-dockerized-docs/u_e-docker-cust_dockerfiles/)

Another way is to copy your docker-entrypoint.sh to the conf dir and update it, then link it into the image via docker-compose.yml:

      volumes:
        - ./data/conf/dovecot/docker-entrypoint.sh:/docker-entrypoint.sh:ro

I know it's hacky but works for me.

@tgmedia-nz Out of curiosity, what sort of size were you at to notice the difference in IO?

There isn't a "one size fits all" solution. I am totally aware there are mdbox, sdbox etc. Depending on your setup, mdbox might be the better solution with even more benefits.
There are other reasons we use maildir right now. Everyone knows it, it is almost unbreakable and it is well supported. mailcow can easily be modified to use whatever mailbox format you want; @tgmedia-nz summed it up above, thanks! 👍 We can create a variable to change that easily.

I agree about compression though, we should enable lz4.

> We can create a variable to change that easily.

Having an option prompt during the initial setup for the storage method, defaulting to maildir, would be ideal, i.e.:
maildir
sdbox
mdbox
sdbox + sis
mdbox + sis

I can do some testing and push a patch if you are too busy?

@andryyy "We can create a variable to change that easily." That would be great.
I will kill for anything that lowers the pressure on the disks.
Some of us are using NFS and GlusterFS network distributed storage.

Besides, I can't imagine going away from dovecot, at least for mailcow project

One can convert from dbox format back to maildir using dsync, if we ever needed to move away from dovecot.

Can anybody test how Dovecot reacts to maildir with "mail_attachment_fs = sis posix"? I will add this option later today.

@extremeshok SDBOX vs MDBOX: which had lower disk usage? I assume neither goes through the files and both just use the index. So MDBOX would only add a locking problem (SOGo + ActiveSync + IMAP accessing it from, in my case, different servers).
I have not bothered much with SIS and LZ4, which I probably should.
It would help if you could share your test environment configs to try.

I also think it disregards mail_location = mdbox:~/ in dovecot.conf, as the data is in SQL. To put the index in a different location (local SSD) I had to change the SQL query for domain creation where it has mail_location.

@andryyy > Can anybody test how Dovecot reacts to maildir with "mail_attachment_fs = sis posix"? I will add this option later today.
dovecot with Maildir ignores the following when they are present in the config

BTW, below is the correct way to configure SIS; sha512 makes hash collisions extremely unlikely.
````
# Support for mail attachment de-duplication (aka SIS aka Single Instance Storage)
mail_attachment_dir = /var/vmail/attachments
mail_attachment_hash = %{sha512}
mail_attachment_min_size = 64k
mail_attachment_fs = sis posix
````

Conversion from maildir with lz4 to mdbox + sis

````
# MDBOX
mbox_dirty_syncs = yes
mbox_dotlock_change_timeout = 2 mins
mbox_lazy_writes = yes
mbox_lock_timeout = 5 mins
mbox_md5 = apop3d
mbox_min_index_size = 0
mbox_read_locks = fcntl
mbox_very_dirty_syncs = no
mbox_write_locks = dotlock fcntl
mdbox_preallocate_space = no
mdbox_purge_preserve_alt = no
mdbox_rotate_interval = 1d
mdbox_rotate_size = 16M

# SIS
# Support for mail attachment de-duplication (aka SIS aka Single Instance Storage)
mail_attachment_dir = /var/vmail/attachments
mail_attachment_hash = %{sha512}
mail_attachment_min_size = 64k
mail_attachment_fs = sis posix

# LZ4
# Enable zlib compression, 2:1 on text files
plugin {
  zlib_save_level = 9
  zlib_save = lz4
}
````

@lavdnone
We have massive lock issues with maildir.. (2x dedicated standalone sogo servers, 150+ connections a second, webmail server, multiple imap load balancers, etc)

Server is enterprise SSD raid 1 (mirror) ZFS, 128GB DDR4 ecc,
maildir had a minimum I/O Delay of 80%
sdbox had a minimum I/O delay of 35%
mdbox seems to hover around 0-5%

mdbox with the default 2 MB rotate size keeps the storage files at about 2 MB each. This is way quicker than thousands of files smaller than 64K.

Remember maildir requires the files to be renamed and linked/copied

@extremeshok That is similar to the mail storage environment I was using a few years ago. I'd recommend pushing the rotate file size to 8/16 MB for even better I/O results.
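A back-of-the-envelope sketch of why the rotate size matters for file counts. The mailbox of 10,000 messages averaging 50 KiB is an invented example, and the calculation ignores per-file overhead and time-based rotation, so treat the numbers as rough estimates only.

```python
import math

KIB = 1024
MIB = 1024 * KIB


def approx_mdbox_files(message_count: int, avg_message_bytes: int,
                       rotate_bytes: int) -> int:
    """Rough count of storage files if a file rotates at rotate_bytes."""
    total = message_count * avg_message_bytes
    return max(1, math.ceil(total / rotate_bytes))


messages = 10_000      # hypothetical mailbox
avg_size = 50 * KIB    # hypothetical average message size

print("maildir files:", messages)  # one file per message
for rotate_mib in (2, 8, 16):
    files = approx_mdbox_files(messages, avg_size, rotate_mib * MIB)
    print(f"mdbox at {rotate_mib} MiB rotate: about {files} files")
```

Under these assumptions the same mail lands in a few hundred files at 2 MiB and a few dozen at 16 MiB, instead of 10,000 separate files with maildir.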

Results of maildir + lz4 to mdbox+sis+lz4
| Mailbox | Address | Before (mailbox lz4) | After (mdbox sis) | Attachments Total (sis) |
|---|---|---|---|---|
| community | community@ | 2.8G | 471M | 1.5G |
| abca | abc@ | 137M | 1000K | 1.6G |
| auxiliary | auxiliary@ | 1.9G | 244M | 2.3G |
| residents | residents@ | 303M | 29M | 2.3G |
| accounts | accounts@ | 1.9G | 244M | 3.3G |
| admin | admin@ | 4.5G | 407M | 4.9G |
| rooms | rooms@ | 2.6G | 247M | 5.9G |
| ceo | ceo@ | 5.9G | 732M | 8.4G |
| pro | pro@ | 2.0G | 188M | 8.9G |
| housing | housing@ | 5.2G | 1.2G | 9.9G |
| village | village@ | 474M | 71M | 9.9G |
| louise | louise@ | 258M | 32M | 11G |
| cherie | cherie@ | 706M | 60M | 11G |
| lanie | lanie@ | 131M | 4.3M | 11G |
| roekeya | roekeya@ | 14G | 1.3G | 17G |
| denzel | denzel@ | 21M | 15M | 17G |
| john | john@ | 607M | 29M | 18G |
| operations2 | operations2@ | 1.7G | 94M | 19G |
| kobie | kobie@ | 2.2G | 140M | 20G |
| operations | operations@ | 1.7G | 184M | 21G |
| serieta | serieta@ | 7.1G | 821M | 24G |
| deon | deon@ | 508M | 43M | 24G |
| tamlyn | tamlyn@ | 6.4G | 671M | 27G |
| noleen | noleen@ | 663M | 56M | 27G |
| joanne | joanne@ | 8.4G | 882M | 32G |
| laura | laura@ | 3.0G | 82M | 33G |
| gerrie | gerrie@ | 2.7G | 297M | 35G |
| peter | peter@ | 1.3G | 94M | 35G |
| jason | jason@ | 55M | 472K | 35G |
| rebecca | rebecca@ | 506M | 109M | 35G |
| accounts | accounts@ | 2.4G | 191M | 37G |
| vanessa | vanessa@ | 1.3G | 140M | 38G |
| timothy | timothy@ | 6.2G | 787M | 40G |
| jaymie | jaymie@ | 1.2M | 128K | 40G |
| sean | sean@ | 852M | 141M | 41G |
| michelle | michelle@ | 100K | 68K | 41G |
| accounts | accounts@ | 100K | 68K | 41G |
| elaine | [email protected] | 1.4G | 296M | 41G |
| root | [email protected] | 2.1G | 806M | 42G |
| sales | [email protected] | 4.3G | 975M | 44G |
| bounce | [email protected] | 104K | 72K | 44G |
| admin | [email protected] | 239M | 36M | 44G |
| spam | [email protected] | 4.0K | 8.0K | 44G |
| carly | [email protected] | 2.8G | 118M | 45G |
| rachel | [email protected] | 285M | 24M | 45G |
| help | [email protected] | 100K | 72K | 45G |
| orders1 | orders1@ | 46M | 5.4M | 45G |
| orders2 | orders2@ | 2.6M | 580K | 45G |
| accounts | accounts@ | 1.6G | 119M | 46G |
| factory | factory@ | 108K | 76K | 46G |
| ray | ray@ | 68M | 11M | 46G |
| sales1 | sales1@ | 4.0K | 8.0K | 46G |
| sales2 | sales2@ | 100K | 68K | 46G |
| lab | lab@ | 7.8M | 412K | 46G |
| userone | userone@ | 15M | 9.3M | 46G |
| usertwo | usertwo@ | 1.7M | 264K | 46G |
| alison | alison@ | 2.9G | 255M | 47G |
| choppie | choppie@ | 321M | 11M | 47G |
| samantha | samantha@ | 294M | 42M | 47G |
| chiro | chiro@ | 820M | 98M | 48G |
| dawn | dawn@ | 300M | 68M | 48G |
| arthur | arthur@ | 37M | 2.1M | 48G |
| lloyd | lloyd@ | 52M | 33M | 48G |
| appledawn | appledawn@ | 1.1M | 852K | 48G |
| insure | insure@ | 337M | 27M | 48G |
| pauline | pauline@ | 2.4G | 119M | 49G |
| scanner | scanner@ | 84K | 60K | 49G |
| admin | admin@ | 3.6G | 90M | 52G |
| harryb | harryb@ | 1.1G | 53M | 53G |
| scanner | scanner@ | 84K | 60K | 53G |
| kevin | kevin@ | 21M | 1.9M | 53G |
| warren | warren@ | 9.5M | 464K | 53G |
| accounts | accounts@ | 12M | 1.7M | 53G |
| vincent | vincent@ | 837M | 70M | 53G |
| felicity | felicity@ | 3.4G | 208M | 55G |
| sales | sales@ | 2.1G | 259M | 56G |
| philip | philip@ | 635M | 81M | 57G |
| pe | pe@ | 3.0G | 428M | 58G |
| accounts | accounts@ | 275M | 26M | 58G |
| admin | admin@ | 852M | 84M | 58G |
| fax | fax@ | 31M | 16M | 58G |
| sales2 | sales2@ | 140M | 13M | 59G |
| service | service@ | 855M | 169M | 59G |
| daycare | daycare@ | 124K | 80K | 59G |
| admin | admin@ | 3.0G | 219M | 60G |
| pledge | pledge@ | 124K | 80K | 60G |
| tracey | tracey@ | 276K | 172K | 60G |
| kim | kim@ | 56M | 35M | 60G |
| ulrught | ulrught@ | 495M | 53M | 61G |
| francine | francine@ | 4.0K | 8.0K | 61G |
| leigh | leigh@ | 4.0K | 8.0K | 61G |
| info | info@ | 4.0K | 8.0K | 61G |
| finance | finance@ | 4.0K | 8.0K | 61G |
| anri | anri@ | 4.0K | 8.0K | 61G |
| sales | sales@ | 1.7G | 160M | 62G |
| carmen | carmen@ | 1.3G | 131M | 62G |
| keith | keith@ | 631M | 37M | 62G |
| gary | gary@ | 11G | 840M | 68G |
| deon | deon@ | 1.9G | 189M | 68G |
| elke | elke@ | 6.4G | 809M | 71G |
| astin | astin@ | 1.4G | 24M | 72G |
| cellcec | cellcec@ | 254M | 13M | 72G |
| jfrimpong | jfrimpong@ | 574M | 57M | 72G |
| sales | sales@ | 12G | 1.5G | 78G |
| accounts | accounts@ | 2.6G | 184M | 80G |
| admin | admin@ | 1.3G | 280M | 80G |
| jrockson | jrockson@ | 1.9G | 164M | 80G |
| accounts2 | accounts2@ | 931M | 103M | 81G |
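For anyone who wants to crunch results like these, here is a small sketch that parses entries in the original log layout and computes how much each mailbox directory shrank. The two sample entries are copied from the results above; treating K/M/G as powers of 1024 is an assumption about how the sizes were measured. Note that the "After" figure excludes attachments, which now live in the shared SIS directory, so the ratio overstates the total saving.

```python
import re

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}  # assumed du -h style units

ENTRY = re.compile(
    r"Converting:\s*(?P<name>\S+)\s*\(.*?\)\s*"
    r"Before \(mailbox lz4\):\s*(?P<before>[\d.]+[KMG])\s*"
    r"After \(mdbox sis\):\s*(?P<after>[\d.]+[KMG])"
)


def to_bytes(size: str) -> float:
    """Convert a size such as '2.8G' to bytes."""
    return float(size[:-1]) * UNITS[size[-1]]


def reductions(log: str):
    """Yield (mailbox name, before/after ratio) for each entry in the log."""
    for match in ENTRY.finditer(log):
        before = to_bytes(match["before"])
        after = to_bytes(match["after"])
        yield match["name"], before / after


sample = (
    "--- Converting: community ( community@ ) Before (mailbox lz4): 2.8G "
    "After (mdbox sis): 471M Attachments Total (sis): 1.5G "
    "--- Converting: denzel ( denzel@ ) Before (mailbox lz4): 21M "
    "After (mdbox sis): 15M Attachments Total (sis): 17G"
)

for name, ratio in reductions(sample):
    print(f"{name}: {ratio:.1f}x smaller mailbox directory")
```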

@extremeshok thanks for the data.
Looks like people are sending attachments to each other 90% of the time.

In my case, with mdbox the index files need to be moved back to the shared GlusterFS storage from local SSD.
The maildir index didn't matter much, as it was listing the mail files in folders anyway. With mdbox the index will be part of the emails. I will have to test performance and locks.

Since this got some traction I was wondering if someone could comment how difficult it would be to integrate an option for the config-scripts to enable mdbox + addons?

I just noticed that compression was enabled in February (Enable maildir compression, #1090).

In spite of this, my newly installed mailcow server with dovecot IMAP mailbox storage uses twice as much space as anticipated. I transferred a large Gmail mailbox with 9 GB of mail (90,000 messages) to mailcow using imapsync. I was surprised to find out that dovecot uses 21 GB of storage for the same data.

@extremeshok shows a large decrease in mailbox size from 'maildir + lz4' to 'mdbox+sis+lz4'.

Would the dbox format be able to decrease the amount of space used by the mailbox? Could this be added as an option in generate_config.sh?

@chriswayg: This sounds to me as if during your imapsync mails got duplicated. Generally if you transfer the "All Mail" directory, it will contain duplicates of every mail you have in a different folder. I recommend using Thunderbird and the "Remove Duplicate Messages (Alternative)" Addon to get rid of the duplicates. If you cut your used space in half you're at roughly the GMail size.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

I would like to work on this, in order to submit a PR to have it added as a user-selectable option for new installations.

Default will remain maildir without compression

Then adding the user selectable options
option: enable compression for mail storage
option: enable mdbox+sis instead of maildir

From what I saw, compression is already the default in the current version.

I wonder if there is a way to properly sync indexes between cluster nodes: "main reasons for dbox's high performance is that it uses Dovecot's index files as the only storage for message flags and keywords"
Right now I have the index on local SSD for maildir: each node indexes on its own.
Those dovecot.index* and map.index* files will probably get over-locked or messed up if they lie on a GlusterFS volume read by multiple nodes.

@lavdnone the Ceph plugin is required for your use case; Gluster has terrible lag once the inbox grows beyond the million-mail mark.

@chriswayg Obviously you have limited experience dealing with corporate email. Here is an example: a company with 100 users. They email the same attachment to multiple people, or multiple people receive the same attachment. Now imagine a single email has a 50 MB DWG attachment. You can fill your email server within one week.

As for being at Gmail size: one organization contains 22 TB of de-duplicated emails. One of my personal email accounts has more than 7 million emails.

This will be an optional setting, like full text search

@extremeshok Thank you for the advice. I have Ceph in the works as a block device for this purpose; it never dawned on me that there is a Dovecot RADOS objects plugin.
As a quick fix for Gluster I had to make a FUSE file system to bypass Gluster on local reads:
https://github.com/lavdnone/unionfs-fuse
2ook inboxes load great. Funny but works.

It is _not_ enough to stupidly enable mdbox and deduplication. You need to take care of cross-format ACLs, of encrypted deduplication, shared folders, compatibility of existing shares (!) etc.

I am not willing to accept a PR for this, because I guarantee any PR would just add/change 3 or 4 parameters, no tests will be made and that's it. "Worked for me".

Even worse would be making it selectable and introducing total madness.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Hi,
unfortunately this doesn't work for me with the current version... I had to change the attributes entry in the database:
update mailbox set attributes='{"force_pw_update":"0","tls_enforce_in":"0","tls_enforce_out":"0","sogo_access":"1","mailbox_format":"mdbox:","quarantine_notification":"never"}' where username='[email protected]';
I want to switch to mdbox due to backup issues with maildir: a full backup of millions of small files takes hours (days), which is not an issue with mdbox.
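Rather than retyping the whole attributes string by hand, the JSON can be edited programmatically so that the other keys stay untouched. This is a sketch only: it assumes the attributes column holds a JSON object shaped like the one in the UPDATE statement above, and the starting value "maildir:" is a guess at the previous setting. It does not talk to the database and does not convert existing mail.

```python
import json


def with_mailbox_format(attributes_json: str, mailbox_format: str) -> str:
    """Return the attributes JSON with only mailbox_format changed."""
    attrs = json.loads(attributes_json)
    attrs["mailbox_format"] = mailbox_format
    # Compact separators keep the same style as the stored value.
    return json.dumps(attrs, separators=(",", ":"))


# Assumed current value; "maildir:" as the old format is a guess.
current = (
    '{"force_pw_update":"0","tls_enforce_in":"0","tls_enforce_out":"0",'
    '"sogo_access":"1","mailbox_format":"maildir:",'
    '"quarantine_notification":"never"}'
)

updated = with_mailbox_format(current, "mdbox:")
print(updated)
```

The resulting string is what would go into the `attributes` column for the mailbox in question.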

I had to do it in the DB too and change the mailbox creation script.
Is there a reason this setting is in the active DB at all? It is not as if we want to change the mailbox format on the fly from the admin panel or anything; it is effectively static.
It would be easier to have it back in dovecot.conf.

I understand that mdbox and automatic deduplication can be troublesome, but why isn't mdbox at least configurable?
