Mailu: Health checks do not work on swarm mode

Created on 23 Dec 2019 · 24 Comments · Source: Mailu/Mailu

The health check fails when trying to set up Mailu on swarm:

[root@warm5 mailu]# docker stack ps mailu
ID                  NAME                IMAGE                   NODE                DESIRED STATE       CURRENT STATE             ERROR                              PORTS
mlkmn1v8g3ud        mailu_front.1       mailu/nginx:master      warm5               Running             Starting 7 seconds ago                                       
e00lb5wikqa1        mailu_webmail.1     mailu/rainloop:master   warm5               Running             Starting 17 seconds ago                                      
frbv8rpgsslh        mailu_admin.1       mailu/admin:master      warm5               Running             Starting 34 seconds ago                                      
y8ij7f2i9gf7        mailu_front.1       mailu/nginx:master      warm5               Shutdown            Failed 12 seconds ago     "task: non-zero exit (137): do…"   
9xv0no6dvk43        mailu_webmail.1     mailu/rainloop:master   warm5               Shutdown            Failed 22 seconds ago     "task: non-zero exit (137): do…"   
1rpso5s2bvbx        mailu_admin.1       mailu/admin:master      warm5               Shutdown            Failed 39 seconds ago     "task: non-zero exit (137): do…"   
nb8r3wznfie4        mailu_antispam.1    mailu/rspamd:master     warm5               Running             Starting 2 minutes ago                                       
r0c0fizm95nt        mailu_imap.1        mailu/dovecot:master    warm5               Running             Starting 3 minutes ago                                       
j4k2m696v1c0        mailu_smtp.1        mailu/postfix:master    warm5               Running             Starting 3 minutes ago                                       
pre531u41gzg        mailu_front.1       mailu/nginx:master      warm5               Shutdown            Failed 2 minutes ago      "task: non-zero exit (137): do…"   
j9iqizvhyj1t        mailu_webmail.1     mailu/rainloop:master   warm5               Shutdown            Failed 2 minutes ago      "task: non-zero exit (137): do…"   
olseb7g75fd5        mailu_admin.1       mailu/admin:master      warm5               Shutdown            Failed 2 minutes ago      "task: non-zero exit (137): do…"   
ypk23ozqhjem        mailu_front.1       mailu/nginx:master      warm5               Shutdown            Failed 3 minutes ago      "task: non-zero exit (137): do…"   
rke8o7a2fu9d        mailu_webmail.1     mailu/rainloop:master   warm5               Shutdown            Failed 4 minutes ago      "task: non-zero exit (137): do…"   
tkx7gmkc1ac6        mailu_admin.1       mailu/admin:master      warm5               Shutdown            Failed 4 minutes ago      "task: non-zero exit (137): do…"   
hrcq1ugftwlp        mailu_front.1       mailu/nginx:master      warm5               Shutdown            Failed 5 minutes ago      "task: non-zero exit (137): do…"   
8s5k2dqceqbe        mailu_webmail.1     mailu/rainloop:master   warm5               Shutdown            Failed 5 minutes ago      "task: non-zero exit (137): do…"   
o4a1pqx1wvq5        mailu_admin.1       mailu/admin:master      warm5               Shutdown            Failed 6 minutes ago      "task: non-zero exit (137): do…"   
slbuwnr3902z        mailu_antispam.1    mailu/rspamd:master     warm5               Shutdown            Failed 2 minutes ago      "task: non-zero exit (1)"          
qpyl7krapryb        mailu_imap.1        mailu/dovecot:master    warm5               Shutdown            Failed 3 minutes ago      "task: non-zero exit (1)"          
b26kp50av12a        mailu_smtp.1        mailu/postfix:master    warm5               Shutdown            Failed 3 minutes ago      "task: non-zero exit (1)"          
qarhazkss1gh        mailu_redis.1       redis:alpine            warm5               Running             Running 15 minutes ago                                       
jbp5pwikp9y9        mailu_antispam.1    mailu/rspamd:master     warm5               Shutdown            Failed 8 minutes ago      "task: non-zero exit (1)"          
49x9vj7jiidh        mailu_smtp.1        mailu/postfix:master    warm5               Shutdown            Failed 9 minutes ago      "task: non-zero exit (1)"          
pag9y099zmzh        mailu_imap.1        mailu/dovecot:master    warm5               Shutdown            Failed 9 minutes ago      "task: non-zero exit (1)"

I tried running the health check by hand while the admin container was running, and it really cannot connect to localhost:80:

[root@warm5 mailu]# docker exec -it a2cb462a440e ash
/app # curl -f -L http://localhost/ui/login?next=ui.index
curl: (7) Failed to connect to localhost port 80: Connection refused
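For reference, the manual check above corresponds to a healthcheck of roughly the following shape when written in compose syntax. This is only a sketch: it assumes the check built into the admin image is the same curl command, and the interval, timeout, and retries values are Docker's documented defaults spelled out for illustration, not values taken from Mailu.

```yaml
# Sketch only: the curl command is the one run by hand above; the timing
# values are Docker's defaults, written out explicitly for illustration.
services:
  admin:
    healthcheck:
      test: ["CMD-SHELL", "curl -f -L 'http://localhost/ui/login?next=ui.index' || exit 1"]
      interval: 30s   # time between checks
      timeout: 30s    # a check that runs longer than this counts as failed
      retries: 3      # consecutive failures before the container is marked unhealthy
```

Once a task is marked unhealthy, swarm stops it and schedules a replacement. Exit code 137 is 128 + 9, meaning the process was killed with SIGKILL.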

All 24 comments

It looks like containers in the starting state are not registered in DNS. That is, a starting container cannot be resolved by other starting containers, so none of them can become healthy. The exception is the redis container: it has no healthcheck, so it resolves right after start.

Disabling healthchecks in docker-compose.yml seems to solve the problem, but... now I have no healthcheck :-).

How did you disable them?

Disabling healthchecks in docker-compose.yml: I commented out the healthchecks in docker-compose.yml on 26 Dec 2019 and have not checked since then. I did not generate a new docker-compose.yml for Mailu, so a newly generated compose file will probably differ from mine.

Okay, no, as far as I can see there are none in it... but 6 out of 11 containers don't start. I don't know why yet...

Are you running Mailu in swarm?

yes

I run it with Swarmpit and a Traefik reverse proxy. I just figured out that certdumper is not working, but I haven't figured out why...

  certdumper:
    image: mailu/traefik-certdumper:1.7
    environment:
      DOMAIN: mailu.test.yolo
      TRAEFIK_VERSION: v2
    volumes:
     - traefik_reverseproxy_letsencrypt:/traefik
     - mailu_certs:/output
    networks:
     - net
    logging:
      driver: json-file

Also, the front container never starts, so I can't get any logs...

Try docker stack services <stack name>, docker service logs <service name>, and so on. Also make sure that all images are downloaded on the nodes: docker images | grep mailu. If a container cannot start for some reason (a missing image, for example), it cannot produce any logs. Also try docker ps -a and study the output. You have probably already done the steps above.

OK, I'll try this :-) I always get a docker exit code of 137, so something is going wrong here. certdumper always reports that it didn't find the data, even though I mounted acme.json directly. I'll investigate further. Thanks for the help :-)

Okay, I tried to fetch logs but nothing shows up. All images are downloaded. Are there any other reasons that come to mind? I also tried it on another machine, but no luck. These services are starting:
antivirus
certdumper (but can't find the certs)
fetchmail
redis
webdav

All others fail. Some give an unhealthy container as the reason:

admin: task: non-zero exit (137): dockerexec: unhealthy container
antispam: task: non-zero exit (1)
front: task: non-zero exit (137): dockerexec: unhealthy container
imap: task: non-zero exit (1)
smtp: task: non-zero exit (1)
webmail: task: non-zero exit (137): dockerexec: unhealthy container

I have the feeling that I'm doing something horribly wrong XD. But no matter what I do, the containers don't want to start...

Logs of certdumper:

gxm5nwr Fri Mar 20 15:11:18 UTC 2020 Dumping certificates
gxm5nwr Fri Mar 20 15:11:18 UTC 2020 Certificate or key differ, updating
gxm5nwr mv: can't rename '/tmp/work/mailu.something.yolo/*.pem': No such file or directory

I have no ideas... try using the notls or cert TLS flavor.

I use flavor mail...

OK, I found out that certdumper only works with the master tag.
Do you have a working docker-swarm example with MySQL? My containers still don't start...

No, I tried to use Postgres as the DB backend but ran into some strange issues and switched to SQLite, the legacy DB. After that everything worked nicely. I did not try MySQL. Also please note: Postgres and MySQL are relatively new backends in Mailu, so both are probably not tested enough. Currently I do not use Mailu at all, because I could not completely fix domainless users.

Okay, I see... I also couldn't get the stack up for now...

I am also trying to run mailu with stack deploy and am getting the same error. As warmanton noted, containers in a starting state are not registered in Docker's DNS so various start scripts cannot complete.

@stamps9k can you please provide your current docker-compose.yml that is failing?

@warmanton I copied the docker-compose from the swarm docs page.

The only changes that I have made are setting the replica counts to 1 for all services and removing a bad (empty) volumes setting for the fetchmail service.

The full docker-compose file is:

version: '3.2'

services:

  front:
    image: mailu/nginx:$VERSION
    env_file: .env
    hostname: front
    ports:
      - target: 80
        published: 80
      - target: 443
        published: 443
      - target: 110
        published: 110
      - target: 143
        published: 143
      - target: 993
        published: 993
      - target: 995
        published: 995
      - target: 25
        published: 25
      - target: 465
        published: 465
      - target: 587
        published: 587
    volumes:
      - "$ROOT/certs:/certs"
    deploy:
      replicas: 1

  redis:
    image: redis:alpine
    volumes:
      - "$ROOT/redis:/data"
    deploy:
      replicas: 1

  imap:
    image: mailu/dovecot:$VERSION
    env_file: .env
    volumes:
      - "$ROOT/mail:/mail"
      - "$ROOT/overrides:/overrides"
    depends_on:
      - front
    deploy:
      replicas: 1

  smtp:
    image: mailu/postfix:$VERSION
    env_file: .env
    environment:
      - POD_ADDRESS_RANGE=10.0.1.0/24
    volumes:
      - "$ROOT/overrides:/overrides"
    depends_on:
      - front
    deploy:
      replicas: 1

  antispam:
    image: mailu/rspamd:$VERSION
    env_file: .env
    environment:
      - POD_ADDRESS_RANGE=10.0.1.0/24
    volumes:
      - "$ROOT/filter:/var/lib/rspamd"
      - "$ROOT/dkim:/dkim"
      - "$ROOT/overrides/rspamd:/etc/rspamd/override.d"
    depends_on:
      - front
    deploy:
      replicas: 1

  antivirus:
    image: mailu/none:$VERSION
    env_file: .env
    volumes:
      - "$ROOT/filter:/data"
    deploy:
      replicas: 1

  webdav:
    image: mailu/none:$VERSION
    env_file: .env
    volumes:
      - "$ROOT/dav:/data"
    deploy:
      replicas: 1

  admin:
    image: mailu/admin:$VERSION
    env_file: .env
    environment:
      - POD_ADDRESS_RANGE=10.0.1.0/24
    volumes:
      - "$ROOT/data:/data"
      - "$ROOT/dkim:/dkim"
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - redis
    deploy:
      replicas: 1

  webmail:
    image: mailu/roundcube:$VERSION
    env_file: .env
    volumes:
      - "$ROOT/webmail:/data"
    depends_on:
      - imap
    deploy:
      replicas: 1

  fetchmail:
    image: mailu/fetchmail:$VERSION
    env_file: .env
    deploy:
      replicas: 1

networks:
  default:
    external:
      name: mailu_default

This is exactly the same error as mine: https://github.com/Mailu/Mailu/issues/1287#issuecomment-603768094
Maybe resolving this issue would also make it possible to integrate unbound into swarm...

Yes, there is a chicken-and-egg issue between healthcheck and DNS resolution, and this prevents mailu-swarm from starting properly.
One possible workaround is to disable the healthcheck by modifying your docker-compose.yml and adding:

    healthcheck:
      disable: true

for the front, admin, imap, smtp, and webmail services.
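To make the placement concrete, here is a minimal sketch of where the setting goes. It is a fragment to merge by hand into the existing service definitions, not a complete compose file: all other keys (image, env_file, volumes, deploy, and so on) stay as they are. The anchor name no_healthcheck is made up for this example; it only saves repeating the same two lines.

```yaml
# Fragment only: add the healthcheck key to each existing service definition.
# "no_healthcheck" is an illustrative YAML anchor name, not a Mailu or Docker keyword.
services:
  front:
    healthcheck: &no_healthcheck
      disable: true            # turns off the healthcheck defined in the image
  admin:
    healthcheck: *no_healthcheck
  imap:
    healthcheck: *no_healthcheck
  smtp:
    healthcheck: *no_healthcheck
  webmail:
    healthcheck: *no_healthcheck
```

Keep in mind the trade-off already mentioned in this thread: with the checks disabled, swarm no longer detects a service that is running but not responding.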
