I use eosio/eos:v1.0.7 as the Docker image.
After I run docker stop eos-1.0.7 and then start the container again,
I always get the following error:
2635907ms thread-0 main.cpp:123 main ] database dirty flag set (likely due to unclean shutdown): replay required
I have to run nodeos with --replay-blockchain, as in issue https://github.com/EOSIO/eos/issues/4002, and wait for more than 1 hour.
Having the same issue, also when not using a tag.
Same issue
Same here. The node wasn't forcefully killed. It keeps crashing during the --replay-blockchain. The only thing I can do is start --replay-blockchain again, but it never finishes.
Guys, you have to start nodeos with the flag --hard-replay-blockchain and wait until the replay has finished. This is one way to fix this problem.
That's not a fix, that is a workaround.
Normally, docker stop eosio should stop it gracefully.
Did you see
744083ms thread-0 net_plugin.cpp:2962 plugin_shutdown ] close acceptor
744083ms thread-0 net_plugin.cpp:2965 plugin_shutdown ] close 0 connections
744083ms thread-0 net_plugin.cpp:2973 plugin_shutdown ] exit shutdown
744085ms thread-0 fork_database.cpp:93 close ] num_blocks_in_fork_db: 2
744088ms thread-0 controller.cpp:261 ~controller_impl ] db.revision(): 73 head->block_num: 73 blog.read_head()->block_num(): 72
in the last logs of your container when you stopped your container?
No, those logs do not appear. In fact, when doing docker stop, no additional logs appear besides the logs that are always there when just running the node.
Can you share "the logs that are always there when just running the node" after you stop your docker container? Normally the above lines I mentioned before should be appended in the end of that log
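A quick way to check for those lines is to grep the tail of the container log. This is a minimal sketch: the function name saw_clean_shutdown is made up for illustration, the container name eos-1.0.7 is taken from the first post, and it assumes a graceful stop always logs the "plugin_shutdown ] exit shutdown" line shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads log text on stdin and succeeds only if it
# contains the net_plugin shutdown marker quoted earlier in this thread.
saw_clean_shutdown() {
    # -F treats the pattern as a fixed string, so "]" needs no escaping.
    grep -qF 'plugin_shutdown ] exit shutdown'
}

# Usage (container name is an assumption):
#   docker logs --tail 50 eos-1.0.7 2>&1 | saw_clean_shutdown \
#       && echo "clean shutdown" || echo "no shutdown lines found"
```

If the check fails right after docker stop, the process was most likely killed before it finished shutting down.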
Last 5 lines of logs:
2407903ms thread-0 net_plugin.cpp:2190 operator() ] Peer p.jeda.one:3322 closed connection
2407908ms thread-0 net_plugin.cpp:2190 operator() ] Peer peer.eosio.sg:9876 closed connection
2408003ms thread-0 net_plugin.cpp:2190 operator() ] Peer mainnet.eospay.host:19876 closed connection
2408004ms thread-0 net_plugin.cpp:2190 operator() ] Peer mainnet.eoseco.com:10010 closed connection
2408095ms thread-0 net_plugin.cpp:2190 operator() ] Peer 807534da.eosnodeone.io:19872 closed connection
@andriantolie I updated eos to 1.0.8 but the issue is the same.
Which log should I collect before docker stop? The docker logs, like below?
ls -lh $(find /var/lib/docker/containers/ -name '*-json.log')
@lcgogo docker has a logs command which is very handy. docker logs <name of container>
According to https://github.com/EOSIO/eos/issues/4301#issuecomment-399681155, it seems that you can use docker exec to send a SIGTERM to nodeos, making it shut down gracefully.
It can also be that your nodeos takes longer than 10 seconds to stop.
Try a longer grace period for your docker stop, say docker stop -t 300 eos-1.0.7. See https://docs.docker.com/engine/reference/commandline/stop/
For docker-compose down or docker-compose stop, things are the same:
https://docs.docker.com/compose/reference/down/
https://docs.docker.com/compose/reference/stop/
-t, --timeout TIMEOUT    Specify a shutdown timeout in seconds. (default: 10)
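To see whether the grace period was long enough, the container's exit code can be inspected after the stop. A rough sketch, assuming the container name and the 300 second default are yours to pick; stop_and_check is a made-up helper name. Exit code 137 (128 + 9) means the process ended with SIGKILL, which is what docker stop sends once the timeout expires.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stop a container with a long grace period and report
# whether it exited on its own or was killed when the timeout ran out.
stop_and_check() {
    local container="$1"
    local grace="${2:-300}"   # seconds; 300 is an arbitrary example value
    docker stop -t "$grace" "$container" >/dev/null || return 1
    local code
    code=$(docker inspect --format '{{.State.ExitCode}}' "$container") || return 1
    if [ "$code" -eq 137 ]; then
        echo "killed after ${grace}s grace period (exit 137); expect a replay" >&2
        return 1
    fi
    echo "stopped with exit code $code"
}

# Usage: stop_and_check eos-1.0.7 600
```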
You need to make sure nodeos shuts down properly. When processing a large number of blocks (or reprocessing), nodeos might not catch the SIGTERM signal right away, for various reasons. Make sure you wait long enough after sending the signal, and don't send a SIGKILL; it will close at some point. If your system has a timeout, make it 10 minutes to be sure (Kubernetes has a grace period, for example).
You can send a SIGTERM to the container with docker kill --signal TERM container_id. Wait until it shuts down on its own; don't kill it with docker kill again (which defaults to SIGKILL, bad bad :). A SIGINT (Ctrl+C) will also shut it down properly.
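The "send SIGTERM, then just wait" approach can be sketched with docker kill plus docker wait, which blocks until the container stops and prints its exit code. The function name term_and_wait is hypothetical; there is deliberately no timeout and no fallback to SIGKILL.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send SIGTERM to the container's main process, then
# block until the container has stopped. Never escalates to SIGKILL.
term_and_wait() {
    local container="$1"
    docker kill --signal TERM "$container" >/dev/null || return 1
    # docker wait blocks until the container stops, then prints its exit code.
    docker wait "$container"
}

# Usage: term_and_wait eos-1.0.7
```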
This did solve it for me. I needed a longer timeout, and also, when using an entrypoint, make sure that nodeos actually is PID 1 inside the container, otherwise it will not receive the signal sent by docker stop. (See https://hynek.me/articles/docker-signals/)
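To check which process is PID 1, something like the following may help. It is only a sketch: the helper names are made up, and it assumes ps is available inside the image and that the binary's process name is exactly nodeos. If an entrypoint shell script shows up as PID 1 instead, ending that script with exec nodeos "$@" makes nodeos replace the shell and take over PID 1.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: print the command name of PID 1 inside a container,
# and test whether that command is nodeos.
pid1_command() {
    docker exec "$1" ps -o comm= -p 1
}

nodeos_is_pid1() {
    [ "$(pid1_command "$1")" = "nodeos" ]
}

# Usage:
#   nodeos_is_pid1 eos-1.0.7 || echo "PID 1 is $(pid1_command eos-1.0.7), not nodeos"
```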
I opened a PR #4632 for this issue.
I wrote this script to stop the node:
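#!/usr/bin/env bash
# Send SIGINT to nodeos and wait until the process has exited.
# -x matches the process name exactly, so only nodeos itself is found.
nodeosd_pid=$(pgrep -x nodeos)
if [ -z "${nodeosd_pid}" ]; then
    echo "nodeos is not running"
    exit 0
fi
echo "Found nodeos pid: [${nodeosd_pid}]"
echo "Send SIGINT"
kill -SIGINT "${nodeosd_pid}"
# Poll once per second until the pid is gone.
while [ -n "$(ps -p "${nodeosd_pid}" -o pid=)" ]
do
    sleep 1
done
echo "Process nodeosd has finished"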
I named it stop.sh, moved it to /bin in the container, and stop the node with: docker exec -it <name of container> stop.sh
"Needed a longer timeout, and also when using an entrypoint make sure that nodeos actually is PID 1 inside the container, otherwise it will not receive the docker stop command"
If you are using supervisor and nodeos is a child process, supervisor will send the stop signal to child processes. In this case supervisor will be PID 1.