Postgres: Shutting down postgres gracefully with docker-compose down to avoid recovery process

Created on 18 Jan 2019 · 8 Comments · Source: docker-library/postgres

Sometimes when I shut down postgres (v9.4.20) with docker-compose down (v1.23.2), the next time I run docker-compose up, postgres reports "database system was not properly shut down; automatic recovery in progress" and takes forever to start back up, even though the postgres logs at shutdown say "received smart shutdown request". Even docker-compose down -v does not seem to resolve the issue. Here's my docker-compose.yml file.

version: "3.5"
networks:
  network1:
    driver: bridge
    name: sharedservices

services:

  postgres:
    image: postgres:9.4
    volumes:
      - "${PG_CONF_FILE}:/conf/postgresql.conf"
      - "${PG_DATA_DIR}:/var/lib/postgresql/data"
      - "${PG_HBA_FILE}:/conf/pg_hba.conf"
      - "${PG_LOG_DIR}:/logs"
    ports:
      - "${PG_PORT}:5432"
    expose:
      - "5432"
    networks:
      - network1
    environment:
      TZ: America/Chicago
    command: postgres -c log_directory=/logs -c config_file=/conf/postgresql.conf -c hba_file=/conf/pg_hba.conf

Any ideas on how to avoid an "automatic recovery"?

All 8 comments

Relevant thread https://github.com/docker-library/postgres/issues/184

Noting https://github.com/docker-library/postgres/issues/184#issuecomment-394822161

Smart Shutdown: ... lets existing sessions end their work normally. It shuts down only after all of the sessions terminate.

So if you want to ensure a graceful shutdown of Postgres, run pg_ctl stop.

You could also play with the stop signal and timeout, but I think TERM is what
you want (my guess is that Docker's default of 10 seconds just isn't enough,
hence playing with the timeout).
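
For reference, here is a minimal sketch of how the PostgreSQL shutdown modes map to signals, and how one of them could be sent to a Compose service by hand. The function names and the COMPOSE variable are hypothetical, introduced only for illustration; the service name postgres comes from the compose file above.

```shell
# Map a PostgreSQL shutdown mode to the signal the server expects.
pg_signal_for_mode() {
  case "$1" in
    smart)     echo SIGTERM ;;  # wait for sessions to end on their own
    fast)      echo SIGINT ;;   # disconnect sessions, roll back, shut down cleanly
    immediate) echo SIGQUIT ;;  # abort; recovery runs on the next start
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: send the signal for a mode to a Compose service.
# COMPOSE is an assumed override point for the compose command.
pg_signal_service() {
  mode=$1
  service=${2:-postgres}
  sig=$(pg_signal_for_mode "$mode") || return 1
  ${COMPOSE:-docker-compose} kill -s "$sig" "$service"
}
```

For example, pg_signal_service fast would run docker-compose kill -s SIGINT postgres, which asks the server for a fast shutdown instead of waiting on idle sessions.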

My feeling is that docker-compose sends SIGTERM but then kills the container once the timeout expires, which is what you are describing, without making sure that postgres has actually shut down.

Right, so if you increase --stop-timeout (on docker run) / stop_grace_period: (on docker-compose.yml; https://docs.docker.com/compose/compose-file/compose-file-v2/#stop_grace_period), it will give PostgreSQL more time before it gets killed and thus increase your chances of PostgreSQL actually finishing up properly and shutting down cleanly.

For my own database instances, I typically use a pretty generous --stop-timeout value of 120 since if I'm using docker stop / SIGTERM, I want to give the database as much time as it needs (if I want a faster shutdown, I use docker rm -f / docker kill directly).
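
The same idea can also be applied per invocation from the command line rather than in the compose file. This is a rough sketch only: the function names and the COMPOSE and DOCKER variables are hypothetical, and the default of 120 seconds is simply the value mentioned above.

```shell
# Hypothetical wrapper: tear down the stack, allowing each container up to
# $1 seconds (default 120) after the stop signal before it is killed.
graceful_down() {
  timeout=${1:-120}
  ${COMPOSE:-docker-compose} down --timeout "$timeout"
}

# Hypothetical wrapper: the same for a single container with plain docker.
graceful_stop() {
  container=$1
  timeout=${2:-120}
  ${DOCKER:-docker} stop --time "$timeout" "$container"
}
```

A longer timeout only gives PostgreSQL more room to finish; it does not close client sessions that are still open.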

Oh nice, it seems like stop_grace_period looks promising. I'll give that a shot and report my findings here.

So it seems like stop_grace_period does indeed do what it says. However, postgres still fails to close some connections during the grace period, so the container is eventually killed anyway, leading to another "automatic recovery". In my case, I've narrowed it down to the connection between postgres and pgAdmin not terminating, so I'm thinking of writing some sort of wrapper script to send a SIGINT after some time, e.g.:

docker-compose exec postgres /bin/bash -c "su - postgres -c '/usr/lib/postgresql/9.4/bin/pg_ctl stop -m fast -D /var/lib/postgresql/data'"
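
A wrapper along those lines might look like the sketch below. It is an untested outline under stated assumptions: the function name and the COMPOSE, PG_CTL and PGDATA_DIR variables are hypothetical, and the paths are the 9.4 ones from the command above.

```shell
# Paths taken from the command above; adjust for other PostgreSQL versions.
PG_CTL=${PG_CTL:-/usr/lib/postgresql/9.4/bin/pg_ctl}
PGDATA_DIR=${PGDATA_DIR:-/var/lib/postgresql/data}

# Hypothetical wrapper: request a fast shutdown first, then tear down.
fast_stop_then_down() {
  service=${1:-postgres}
  # -T: no pseudo-TTY, so this also works from cron or CI.
  # -u postgres: pg_ctl refuses to run as root.
  # The exec may exit non-zero because the container stops together with
  # the server, so a failure here does not abort the teardown.
  ${COMPOSE:-docker-compose} exec -T -u postgres "$service" \
    "$PG_CTL" stop -m fast -D "$PGDATA_DIR" || true
  ${COMPOSE:-docker-compose} down
}
```

Calling fast_stop_then_down in place of docker-compose down disconnects lingering sessions such as pgAdmin before the container is removed.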

Yeah, that seems reasonable, but at this point I think we've determined this isn't really an issue with the image (certainly not something we can fix), so I'm going to close. :+1:

I'm not satisfied with this resolution and I'm not sure how you could be either. As a docker admin, I would not be comfortable running a db container in production knowing that a docker stop could corrupt the database. I encountered this issue in a test environment with nothing running but the postgres:11.3-alpine and pgadmin4:4.8 containers. If that minimal level of connectivity can trigger the issue, I can only imagine the carnage in full-scale production.

It seems evident to me that postgresql should be holding on longer to a shutdown sequence if there are more actions needed for a clean shutdown. Certainly an MS SQL container doesn't behave this way.