I have encountered the following error when trying to access the admin UI whenever some containers in the stack are recreated or restarted in a docker-compose deployment.
Steps to reproduce:
docker-compose logs -f front
front_1 | 2020/02/05 16:36:42 [info] 11#11: *13 client 192.168.203.11:38536 connected to 0.0.0.0:10143
front_1 | 2020/02/05 16:36:42 [error] 11#11: *15 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /auth/email HTTP/1.0", upstream: "http://192.168.203.2:80/internal/auth/email", host: "127.0.0.1"
front_1 | 127.0.0.1 - - [05/Feb/2020:16:36:42 +0000] "GET /auth/email HTTP/1.0" 502 150 "-" "-"
front_1 | 2020/02/05 16:36:42 [error] 11#11: *13 auth http server 127.0.0.1:8000 did not send server or port while in http auth state, client: 192.168.203.11, server: 0.0.0.0:10143, login: "user@mail"
docker network inspect mailu_default
"e0d90201de96842f31391fc71a13324f353ead73fe085bc2fa3fc8b11d6ddf8c": {
"Name": "mailu_antivirus_1",
"EndpointID": "a0b581d759b534bbe3b93c3259845d2fa4422ab5a852cc27b8a2ce1d5a28b1df",
"MacAddress": "02:42:c0:a8:cb:02",
"IPv4Address": "192.168.203.2/24",
"IPv6Address": ""
},
"e8ebf91bcda51637c6b92eeb299434a301d6bc00335b390b67ed70ef9af6f299": {
"Name": "mailu_admin_1",
"EndpointID": "e470f9f4220a9330223950db2acba4a9320493b2845c3d3f067e71e0064f35f0",
"MacAddress": "02:42:c0:a8:cb:06",
"IPv4Address": "192.168.203.6/24",
"IPv6Address": ""
},
----Snip---
It seems that containers were assigned new IP addresses when restarted but the front container is still using the IP it had saved when it first started.
In my example front is trying to connect to admin using an IP address that now belongs to the antivirus container.
The documentation for the master branch indicates that the startup scripts will resolve HOST_* to their IP addresses and store the result in *_ADDRESS for further use.
To resolve this "drift" I have to stop and start all containers so the IP addresses are refreshed. Subsequent restarts of individual containers result in the same issue.
This behaviour should at least be documented for users that are upgrading from 1.7 and definitely before the *_ADDRESS updates are backported to 1.7.
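To make the start-up resolution described in the documentation concrete, here is a minimal sketch of what "resolve HOST_* and store the result in *_ADDRESS" could look like. It is not Mailu's actual start-up code: the function name `freeze_host_addresses` and the injectable `resolve` parameter are assumptions made for illustration, and leaving an already-set *_ADDRESS alone is also an assumption of this sketch.

```python
import socket
from typing import Callable, Dict


def freeze_host_addresses(
    env: Dict[str, str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, str]:
    """Return a copy of env where every HOST_<NAME> has been resolved once
    and the resulting IP stored as <NAME>_ADDRESS.

    Hypothetical helper for illustration only.
    """
    frozen = dict(env)
    for key, host in env.items():
        if not key.startswith("HOST_"):
            continue
        address_key = key[len("HOST_"):] + "_ADDRESS"
        if address_key in env:
            # Assumption: an explicitly configured *_ADDRESS is left untouched.
            continue
        # The lookup happens exactly once, at start-up; only the IP is kept.
        frozen[address_key] = resolve(host)
    return frozen


if __name__ == "__main__":
    # A dict stands in for Docker's embedded DNS so the sketch runs anywhere.
    fake_dns = {"admin": "192.168.203.9"}
    env = freeze_host_addresses({"HOST_ADMIN": "admin"}, fake_dns.__getitem__)
    print(env["ADMIN_ADDRESS"])
```

Because only the IP survives, nothing in the resulting environment remembers that the value came from the name `admin`, which would explain why a later address change goes unnoticed.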
@muhlemmer
For completeness,
This issue will occur when a user updates an environment variable after Mailu is started then issues
docker-compose up -d
See below;
docker-compose up -d
docker network inspect mailu_default | grep admin -A 3
"Name": "mailu_admin_1",
"EndpointID": "ab6adb15acd0b27bef9fa89f8e289864971de7fd0ceabf4a6199271357e8c6ce",
"MacAddress": "02:42:c0:a8:cb:09",
"IPv4Address": "192.168.203.9/24"
docker-compose exec front cat /etc/nginx/nginx.conf | grep "\$admin "
set $admin 192.168.203.9;
set $admin 192.168.203.9;
This is correct.
docker-compose up -d
docker network inspect mailu_default | grep admin -A 3
"Name": "mailu_admin_1",
"EndpointID": "1e6faf7dec9132487458e140777ff3fca5e8f5654e45957e1b21ad19907329b5",
"MacAddress": "02:42:c0:a8:cb:03",
"IPv4Address": "192.168.203.3/24",
docker-compose exec front cat /etc/nginx/nginx.conf | grep "\$admin "
set $admin 192.168.203.9;
set $admin 192.168.203.9;
Front now has a stale reference to admin
Subsequent updates will now result in
admin
"IPv4Address": "192.168.203.4/24",
front
set $admin 192.168.203.3;
admin
"IPv4Address": "192.168.203.9/24",
front
set $admin 192.168.203.4;
This is all cleared by issuing
docker-compose restart
or declaring
*_ADDRESS for all the services
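The manual comparison above (the `set $admin` lines in nginx.conf against the address reported by `docker network inspect`) can be automated. The following is a rough sketch that works on the text output of those two commands; the function names are hypothetical, and the sample data is trimmed to the fields the check needs.

```python
import json
import re
from typing import Optional, Set


def configured_ips(nginx_conf: str, variable: str) -> Set[str]:
    """Collect every IP assigned to $<variable> via 'set' in the config text."""
    pattern = re.compile(r"set\s+\$" + re.escape(variable) + r"\s+([0-9.]+);")
    return set(pattern.findall(nginx_conf))


def current_ip(network_inspect_json: str, container: str) -> Optional[str]:
    """Find the container's IPv4 address in 'docker network inspect' output."""
    for network in json.loads(network_inspect_json):
        for endpoint in network.get("Containers", {}).values():
            if endpoint.get("Name") == container:
                # "192.168.203.3/24" -> "192.168.203.3"
                return endpoint["IPv4Address"].split("/")[0]
    return None


def is_stale(nginx_conf: str, network_inspect_json: str,
             variable: str = "admin", container: str = "mailu_admin_1") -> bool:
    """True when the config does not point exclusively at the current IP."""
    ip = current_ip(network_inspect_json, container)
    return ip is None or configured_ips(nginx_conf, variable) != {ip}


if __name__ == "__main__":
    conf = "set $admin 192.168.203.9;\nset $admin 192.168.203.9;\n"
    inspect = json.dumps([{
        "Name": "mailu_default",
        "Containers": {"1e6f": {"Name": "mailu_admin_1",
                                "IPv4Address": "192.168.203.3/24"}},
    }])
    print(is_stale(conf, inspect))
```

In practice the two inputs would come from `docker-compose exec front cat /etc/nginx/nginx.conf` and `docker network inspect mailu_default`, as shown in the steps above.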
The behaviour should have been similar prior to that backport, except that HOST_* and *_ADDRESS were used equally in some services. The change is that if HOST_* is set, it is resolved to an address on startup (this was implemented to keep the old behaviour). You can instead set *_ADDRESS, which can also be a hostname; this causes run-time resolution instead of start-time resolution.
I implemented this precisely because the old behaviour (start-time resolution only) causes the issue you describe. On the Kubernetes Helm chart I use only *_ADDRESS, which works as expected.
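The difference between the two modes can be shown with a small simulation. This is only a sketch: the `Upstream` class and `make_resolver` helper are invented for illustration, and a plain dict stands in for Docker's DNS so that the address change from the report above can be replayed.

```python
from typing import Callable, Dict


class Upstream:
    """A service endpoint whose stored address is resolved on every use."""

    def __init__(self, address: str, resolve: Callable[[str], str]) -> None:
        self.address = address
        self.resolve = resolve

    def target(self) -> str:
        # An IP literal resolves to itself; a hostname is looked up now.
        return self.resolve(self.address)


def make_resolver(dns: Dict[str, str]) -> Callable[[str], str]:
    """Build a resolver over a mutable name table (stand-in for real DNS)."""
    return lambda name: dns.get(name, name)


if __name__ == "__main__":
    dns = {"admin": "192.168.203.9"}
    resolve = make_resolver(dns)

    start_time = Upstream(resolve("admin"), resolve)  # HOST_* style: IP stored
    run_time = Upstream("admin", resolve)             # *_ADDRESS style: name stored

    dns["admin"] = "192.168.203.3"  # admin is recreated with a new address

    print(start_time.target())  # still the old address
    print(run_time.target())    # follows the change
```

The trade-off is visible in the same sketch: the start-time variant never touches the resolver again, so it survives a DNS outage, while the run-time variant depends on the resolver for every connection.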
@muhlemmer Are there any objections against using *_ADDRESS on docker too?
(edit: formatting)
Should the docs encourage explicitly setting *_ADDRESS for docker deploys to protect against this behaviour?
As it is now, when containers are recreated if *_ADDRESS is not defined, a user is going to experience strange failures.
I believe this is because startup order by depends_on is not guaranteed on recreate: front resolves the admin IP when it is recreated, and if the admin container is recreated afterwards, the stored IP will be incorrect.
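If the ordering explanation is right, the effect can be reproduced on paper. The sketch below is a deliberately simplified model, not a description of how Docker allocates addresses: it assumes every recreated container gets the next unused host number, and that front snapshots the admin address at the moment front itself is recreated. The function name `recreate` is hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def recreate(order: List[str], ips: Dict[str, str],
             subnet: str = "192.168.203.") -> Tuple[str, Optional[str]]:
    """Recreate containers in the given order.

    Returns (admin's final IP, the admin IP that front stored at start-up).
    """
    ips = dict(ips)
    # Simplification: hand out addresses sequentially above the highest in use.
    next_octet = max(int(ip.rsplit(".", 1)[1]) for ip in ips.values()) + 1
    stored_by_front = None
    for name in order:
        ips[name] = f"{subnet}{next_octet}"
        next_octet += 1
        if name == "front":
            # front resolves admin once, right now, and keeps the result.
            stored_by_front = ips["admin"]
    return ips["admin"], stored_by_front


if __name__ == "__main__":
    before = {"admin": "192.168.203.6", "front": "192.168.203.11"}
    print(recreate(["admin", "front"], before))  # front sees the new admin IP
    print(recreate(["front", "admin"], before))  # front keeps the old admin IP
```

Only the second ordering leaves front with a stale address, which matches the intermittent nature of the failures described in this thread.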
I think so, yes. But the current behaviour was introduced for a reason (DNS lookup failures, I think), and that was before my time. So I'm waiting for feedback from the others.
Alright, I understand the enhancements it offers for Kubernetes and external service discovery.
Could you test out the scenarios above and see if you can reproduce the behaviour?
@jawabuu I don't think that we have the resources to test it ourselves.
The limitations of HOST_* and the advantage of *_ADDRESS are well documented.
@muhlemmer @kaiyou Do you think we should switch the docker-compose setup from HOST_* to *_ADDRESS?
Hey @micw The test to reproduce is as highlighted above
https://github.com/Mailu/Mailu/issues/1341#issuecomment-582823314
You can do all these locally.
@jawabuu Sorry, got the "test out the scenarios above" wrong.
What you describe is indeed expected behaviour. By default, Mailu resolves all HOST_* values to static addresses. This gives strong resistance against DNS failures while it is running, but it breaks if an IP address changes.
For kubernetes deployment, we added the *_ADDRESS feature (along with some other changes, e.g. that admin resolves addresses for nginx) which allows DNS based dynamic service discovery.
So in the end, it's a trade-off between both features. I personally prefer the second because I run in Kubernetes, actually _do_ want service discovery, and think that DNS issues should be fixed at the root cause.
For docker-compose I don't know what the best option is, so I'd wait for an answer from muhlemmer or kaiyou. You can also join the chat at matrix #mailu for an open discussion.
@micw Noted. I believe the resolution for this could be just an emphasis on the documentation that the behaviour for mailu with HOST_* differs from how many users expect docker-compose DNS resolution to work.
@muhlemmer @kaiyou Do you think we should switch the docker-compose setup from HOST_* to *_ADDRESS?
Yeah, we could start testing. It's just a matter of env setting, correct? (been out of the game a bit long) I can reconfigure my test server and load with emails.
Yes, it's env only. I have no suitable docker-compose test env. When it works smoothly, I'd switch to *_ADDRESS. If not I'd dig into the issue.
Is docker-compose still going to be a supported configuration in the future? And if this is the case, shouldn't this issue be part of the next milestone, since it breaks the docker compose setup?
Also, what would the workaround look like? I have the following in my env file, but recreating the Mailu containers is still unreliable.
FRONT_ADDRESS=mailu_front_1
REDIS_ADDRESS=mailu_redis_1
ADMIN_ADDRESS=mailu_admin_1
WEBMAIL_ADDRESS=mailu_webmail_1
ANTIVIRUS_ADDRESS=mailu_antivirus_1
ANTISPAM_WEBUI_ADDRESS=mailu_antispam_1
ANTISPAM_MILTER_ADDRESS=mailu_antispam_1
WEBDAV_ADDRESS=mailu_mailu_webdav_1
LMTP_ADDRESS=mailu_imap_1
IMAP_ADDRESS=mailu_imap_1
POP3_ADDRESS=mailu_imap_1
SMTP_ADDRESS=mailu_smtp_1
HOSTIMAP_ADDRESS=mailu_imap_1
AUTHSMTP_ADDRESS=mailu_smtp_1
Hi There,
The Mailu-Project is currently in a bit of a bind! We are short on man-power, and we need to judge if it is possible for us to put in some work on this issue.
To help with that, we are currently trying to find out which issues are actively keeping users from using Mailu, which issues have someone who wants to work on them, and which issues may be less important. These less important ones could be discarded for the time being, until the project is in a more stable and regular state once again.
In order for us to better assess this, it would be helpful if you could put a reaction on this post (use the :smiley: icon to the top-right).
We want to keep this voting open for 2 weeks from now, so please help out!
I think this is pretty necessary for a smooth docker-compose setup, isn't it? Because without it, I can not use scripts to back up automatically, and expect everything to come up again properly!
I suppose it can be ignored if docker-compose won't be supported in the future (I haven't tried the Kubernetes setup yet myself)
I am able to work on this myself if someone provides me with a few pointers, since I am not familiar with the internals of the Mailu Dockerfiles. I am a software engineer working mostly in C and ARM assembly, but I'm familiar with writing Dockerfiles etc. and have been exposed to Python before, so I should be able to figure things out.
I would, however, need someone to point me to previous discussions about the rationale for having the two different resolution mechanisms.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Thank you for your offer, @tvelocity ! We are currently quite busy with sorting and sifting issues to see where we need to help more. This is something we should have a closer look at in the near future. Thanks!
Hello @Nebukadneza, thanks for your response. I understand Mailu is very limited in man-power at the moment, but I would really appreciate it if someone could comment on whether my understanding of the issue is correct so far :)
So I tried to dig into the older Mailu issues to find out why name resolution at container start-up was implemented in the first place. What I found was merge request #886 for IPv6 support, which also mentions issue #827.
Both refer to the Docker DNS resolver being unreliable when used with IPv6, which of course leads to unexpected problems with containers not being able to reach each other by name. Thus, it was proposed and accepted in #886 to resolve the addresses of Mailu containers during startup. It is worth stressing that, according to the findings back then, this was (and still is) extra effort and complexity in Mailu to work around a problem that wasn't caused by Mailu in the first place, but by a serious issue in Docker.
Unfortunately this causes its own set of problems, as the addresses of remote containers are not guaranteed to stay the same; in fact, recreating a container will often result in a new IP address being allocated to it. Therefore, by solving one problem on IPv6 deployments, we now have another problem on all setups, IPv4 and IPv6 alike, whenever a container is recreated with another address. The problem was apparent enough on certain setups (notably including Kubernetes setups) that another workaround was introduced on top of the current workaround, and we end up with the two address resolution mechanisms. And you obviously can't rely on both of them; they are mutually exclusive.
This sucks for everyone, because people deploying with IPv6 turned on still get a broken setup, albeit slightly less broken than without any workaround. And everyone else gets to deal with it too, since they have to learn about the two address resolution mechanisms and the difference between the HOST_* and *_ADDRESS sets of env variables, or at least they will be forced to as soon as their containers don't come back up as expected after an upgrade, reconfiguration or even a nightly backup.
There might still be important aspects I am missing here, but if my understanding is correct so far, I would seriously propose to just scrap it all and revert to one set of env variables and NO address resolution workarounds. The added complexity and issues caused by it are really not worth the effort, IMHO. The env variables by default should just refer to remote containers by name.
People who wish to use IPv6 may still do so by assigning static addresses to their containers, set in their Compose files, manifests, or whatever other tool they use to orchestrate their deployment (and updating the env variables accordingly). It is a problem with the Docker DNS resolver after all, so let's keep Mailu out of it and point users in the right direction to deal with this Docker issue directly at the source.
The current approach adds complexity and a maintenance burden, and keeping it means more workarounds and more regression tests. If Mailu can't afford to keep it under control, then it really should just be declared out of scope and removed, in my humble opinion :)
My conclusion however is assuming I have not missed some other major aspects, which is quite possible since I've only briefly looked into the Mailu code, and older issues and merge requests.
@tvelocity Thanks for the write up.
I agree that it probably makes sense to remove the resolution, as it causes issues and IMO really complicates the whole env var handling, but I don't know the opinion of the other project members.
I'm currently testing the *_ADDRESS 'overwrite' on my own setup.
This would also fix #1430 btw