ClickHouse: CentOS 8: SysV vs systemd mess. Service status is 'stopped' although it is running

Created on 31 Aug 2020  ·  17 Comments  ·  Source: ClickHouse/ClickHouse

Description
Whenever I run sudo service clickhouse-server status I get clickhouse-server service is stopped, although the server is actually running.

I can verify that it is running with the following:

$ ps -ef | grep clickhouse
clickho+    2646       1  0 20:00 ?        00:00:01 clickhouse-server --daemon --pid-file=/var/run/clickhouse-server/clickhouse-server.pid --config-file=/etc/clickhouse-server/config.xml

or

$ sudo lsof -i:9000
COMMAND    PID       USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
clickhous 2646 clickhouse   54u  IPv6  60885      0t0  TCP *:cslistener (LISTEN)

or simply, I can open clickhouse-client with no problems.

It is not just about the displayed status: I cannot stop the server using sudo service clickhouse-server stop, so I must kill the process manually. I also cannot restart the server using sudo service clickhouse-server restart, because the process that is still running holds the resources the new one needs.

Environment
I am using a fresh image of CentOS 8.2 with ClickHouse 20.7.2.30 installed.

Here is the configuration that I am using:

<yandex>
    <path>/mnt/clickhouse/</path>
    <tmp_path>/mnt/clickhouse/tmp/</tmp_path>
    <user_files_path>/mnt/clickhouse/user_files/</user_files_path>
    <access_control_path>/mnt/clickhouse/access/</access_control_path>
    <format_schema_path>/mnt/clickhouse/format_schemas/</format_schema_path>

    <listen_host>::</listen_host>

    <compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </compression>

    <remote_servers>
        <clickhouse_cluster>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>clickhouse-01</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>clickhouse-02</host>
                    <port>9000</port>
                </replica>
            </shard>
        </clickhouse_cluster>
    </remote_servers>

    <zookeeper>
        <node index="1">
            <host>zookeeper-01</host>
            <port>2181</port>
        </node>
    </zookeeper>
</yandex>

bug comp-packaging v20.7-affected

All 17 comments

Please show the output of

systemctl status clickhouse-server

cat /etc/cron.d/clickhouse-server

Did you upgrade ClickHouse, or is this a fresh install?

I installed the latest version of the official pre-compiled rpm packages for CentOS.

$ systemctl status clickhouse-server
● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
   Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Mon 2020-08-31 20:29:50 UTC; 27s ago
  Process: 3390 ExecStart=/usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid (code=exited, status=70)
 Main PID: 3390 (code=exited, status=70)

Aug 31 20:29:50 clickhouse-01 systemd[1]: clickhouse-server.service: Main process exited, code=exited, status=70/SOFTWARE
Aug 31 20:29:50 clickhouse-01 systemd[1]: clickhouse-server.service: Failed with result 'exit-code'.
$ cat /etc/cron.d/clickhouse-server
#*/10 * * * * root (which service > /dev/null 2>&1 && (service clickhouse-server condstart ||:)) || /etc/init.d/clickhouse-server condstart > /dev/null 2>&1

Try /etc/init.d/clickhouse-server status and /etc/init.d/clickhouse-server stop.

Still giving me stopped status.

$ /etc/init.d/clickhouse-server status
clickhouse-server service is stopped

Please

1) check whether the PID of the running clickhouse-server process equals the PID reported by service status
2) show the content of /run/clickhouse-server/clickhouse-server.pid
3) show the content of /mnt/clickhouse/status

Also, it looks like ClickHouse was started by init.d (the SysV script in /etc/init.d/clickhouse-server) while you are checking the status of the systemd service.
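
One way to tell which side started a given process is to look at its cgroup: a process launched by the systemd unit sits in a cgroup named after that unit. The following is only a sketch. The helper name pid_in_unit and its optional proc_root argument are made up for illustration (proc_root exists only so the check can be tried against sample files).

```shell
#!/bin/sh
# Hypothetical helper, not part of ClickHouse or systemd.
# Succeeds (exit 0) if the cgroup file of <pid> mentions <unit>.
# Usage: pid_in_unit <pid> <unit> [proc_root]
pid_in_unit() {
    pid=$1
    unit=$2
    proc_root=${3:-/proc}   # overridable so the logic can be tested on fake data
    # A process started by the unit has a line ending in ".../<unit>".
    grep -q -F "/${unit}" "${proc_root}/${pid}/cgroup" 2>/dev/null
}
```

For example, pid_in_unit 2646 clickhouse-server.service would be expected to fail if that server was started from /etc/init.d in a login session, although that is an assumption about this machine rather than something shown above.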

Maybe the update did not go cleanly: did you see this message while updating the package?

https://github.com/ClickHouse/ClickHouse/blob/9791be4effc3cb74d0e7460a13505b187f1e3f90/debian/clickhouse-server.postinst#L26-L29

I didn't update ClickHouse, I installed the latest version 20.7.2.30 directly.

  1. The PID is 2646:

    $ sudo lsof -i:9000
    COMMAND    PID       USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    clickhous 2646 clickhouse   54u  IPv6  60885      0t0  TCP *:cslistener (LISTEN)

  2. The PID file is not found:

    $ sudo cat /run/clickhouse-server/clickhouse-server.pid
    cat: /run/clickhouse-server/clickhouse-server.pid: No such file or directory

  3. The status file content:

    $ sudo cat /mnt/clickhouse/status
    PID: 2646
    Started at: 2020-08-31 20:00:28
    Revision: 54437
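
The three checks above can be scripted. This is a rough sketch, not anything shipped with ClickHouse: the function names are invented here, and the only thing assumed about the status file is the "PID: N" line shown in the output above.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the PID file with the status file.

# Print the PID recorded on the "PID: N" line of a status file.
status_file_pid() {
    sed -n 's/^PID: *\([0-9][0-9]*\).*$/\1/p' "$1" 2>/dev/null | head -n 1
}

# Print the PID stored in a PID file, or nothing if the file is unreadable.
pid_file_pid() {
    if [ -r "$1" ]; then
        tr -d '[:space:]' < "$1"
    fi
}

# Usage: compare_pids <status_file> <pid_file>
compare_pids() {
    status_pid=$(status_file_pid "$1")
    file_pid=$(pid_file_pid "$2")
    if [ -z "$file_pid" ]; then
        echo "pid file missing; status file says PID ${status_pid:-unknown}"
    elif [ "$status_pid" = "$file_pid" ]; then
        echo "pid file and status file agree on PID $file_pid"
    else
        echo "mismatch: pid file=$file_pid, status file=$status_pid"
    fi
}
```

With the paths from this thread it would be called as compare_pids /mnt/clickhouse/status /run/clickhouse-server/clickhouse-server.pid (run with sudo if the files are not readable otherwise).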

By the way, this issue also happens with the default configuration when I just try to restart the server.

I tried this on a VM with CentOS 8 and could not reproduce it.

It can happen if you start ClickHouse not through the service / systemctl commands but directly via the SysV script, i.e.

sudo /etc/init.d/clickhouse-server start

But in that case a simple restart should solve the issue, and sudo /etc/init.d/clickhouse-server status should show the proper status.

Please run the following:

sudo /etc/init.d/clickhouse-server status
sudo service clickhouse-server status
sudo systemctl status clickhouse-server

sudo chkconfig --list
sudo systemctl list-unit-files

sudo chkconfig clickhouse-server
sudo systemctl is-enabled clickhouse-server

sudo /etc/init.d/clickhouse-server stop
sudo kill $(pidof clickhouse-server)
sudo kill -9 $(pidof clickhouse-server)

sudo systemctl start clickhouse-server
sudo systemctl status clickhouse-server

Please share the output & try to restart the server afterward.

I am able to reproduce the bug using the following steps:

  1. Create a VM with CentOS v8.2.2004
  2. Install the latest ClickHouse using
    sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
    sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
    sudo yum install -y clickhouse-server clickhouse-client
  3. Restart the VM, without starting ClickHouse server or doing any further steps.
    Just after restarting the VM I get the following:

    $ sudo /etc/init.d/clickhouse-server status
    clickhouse-server service is running

    $ sudo service clickhouse-server status
    clickhouse-server service is running

    $ sudo systemctl status clickhouse-server
    ● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
       Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: disabled)
       Active: active (running) since Tue 2020-09-01 14:17:58 UTC; 20s ago
     Main PID: 956 (clickhouse-serv)
        Tasks: 46 (limit: 50048)
       Memory: 617.5M
       CGroup: /system.slice/clickhouse-server.service
               └─956 /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid

    Sep 01 14:18:03 clickhouse-02 clickhouse-server[956]: Include not found: clickhouse_compression
    Sep 01 14:18:03 clickhouse-02 clickhouse-server[956]: Logging trace to /var/log/clickhouse-server/clickhouse-server.log
    Sep 01 14:18:03 clickhouse-02 clickhouse-server[956]: Logging errors to /var/log/clickhouse-server/clickhouse-server.err.log
    Sep 01 14:18:12 clickhouse-02 clickhouse-server[956]: Processing configuration file '/etc/clickhouse-server/users.xml'.
    Sep 01 14:18:12 clickhouse-02 clickhouse-server[956]: Include not found: networks
    Sep 01 14:18:12 clickhouse-02 clickhouse-server[956]: Saved preprocessed configuration to '/var/lib/clickhouse//preprocessed_configs/users.xml'.
    Sep 01 14:18:15 clickhouse-02 clickhouse-server[956]: Processing configuration file '/etc/clickhouse-server/config.xml'.
    Sep 01 14:18:15 clickhouse-02 clickhouse-server[956]: Include not found: clickhouse_remote_servers
    Sep 01 14:18:15 clickhouse-02 clickhouse-server[956]: Include not found: clickhouse_compression
    Sep 01 14:18:15 clickhouse-02 clickhouse-server[956]: Saved preprocessed configuration to '/var/lib/clickhouse//preprocessed_configs/config.xml'.

  4. Restart ClickHouse (once or twice) to get this:

    $ sudo service clickhouse-server restart
    Stop clickhouse-server service: DONE
    Start clickhouse-server service: Path to data directory in /etc/clickhouse-server/config.xml: /var/lib/clickhouse/
    DONE

    $ sudo service clickhouse-server restart
    Start clickhouse-server service: Path to data directory in /etc/clickhouse-server/config.xml: /var/lib/clickhouse/
    UNKNOWN

    Afterwards, the status of the server is reported as stopped while it is actually running:

    $ sudo /etc/init.d/clickhouse-server status
    clickhouse-server service is stopped

    $ sudo service clickhouse-server status
    clickhouse-server service is stopped

    $ sudo systemctl status clickhouse-server
    ● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
       Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: disabled)
       Active: activating (auto-restart) (Result: exit-code) since Tue 2020-09-01 14:32:53 UTC; 9s ago
      Process: 1769 ExecStart=/usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid (code=exited, status=70)
     Main PID: 1769 (code=exited, status=70)

    Sep 01 14:32:53 clickhouse-02 systemd[1]: clickhouse-server.service: Main process exited, code=exited, status=70/SOFTWARE
    Sep 01 14:32:53 clickhouse-02 systemd[1]: clickhouse-server.service: Failed with result 'exit-code'.

Here is the output that you asked for:

$ sudo /etc/init.d/clickhouse-server status
clickhouse-server service is stopped

$ sudo service clickhouse-server status
clickhouse-server service is stopped

$ sudo systemctl status clickhouse-server
● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
   Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Tue 2020-09-01 14:36:25 UTC; 2s ago
  Process: 1866 ExecStart=/usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid (code=exited, status=70)
 Main PID: 1866 (code=exited, status=70)

Sep 01 14:36:25 clickhouse-02 systemd[1]: clickhouse-server.service: Main process exited, code=exited, status=70/SOFTWARE
Sep 01 14:36:25 clickhouse-02 systemd[1]: clickhouse-server.service: Failed with result 'exit-code'.

$ sudo chkconfig --list

Note: This output shows SysV services only and does not include native
      systemd services. SysV configuration data might be overridden by native
      systemd configuration.

      If you want to list systemd services use 'systemctl list-unit-files'.
      To see services enabled on particular target use
      'systemctl list-dependencies [target]'.

$ sudo systemctl list-unit-files | grep clickhouse
clickhouse-server.service                  enabled

$ sudo chkconfig clickhouse-server
Note: Forwarding request to 'systemctl is-enabled clickhouse-server.service'.
enabled

$ sudo systemctl is-enabled clickhouse-server
enabled

$ sudo /etc/init.d/clickhouse-server stop
$ sudo kill $(pidof clickhouse-server)
$ sudo kill -9 $(pidof clickhouse-server)
kill: not enough arguments

$ sudo systemctl start clickhouse-server

$ sudo service clickhouse-server status
clickhouse-server service is running

$ sudo /etc/init.d/clickhouse-server status
clickhouse-server service is running

$ sudo systemctl status clickhouse-server
● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
   Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-09-01 14:37:56 UTC; 2s ago
 Main PID: 1938 (clickhouse-serv)
    Tasks: 62 (limit: 50048)
   Memory: 104.3M
   CGroup: /system.slice/clickhouse-server.service
           └─1938 /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid

Sep 01 14:37:56 clickhouse-02 clickhouse-server[1938]: Include not found: clickhouse_compression
Sep 01 14:37:56 clickhouse-02 clickhouse-server[1938]: Logging trace to /var/log/clickhouse-server/clickhouse-server.log
Sep 01 14:37:56 clickhouse-02 clickhouse-server[1938]: Logging errors to /var/log/clickhouse-server/clickhouse-server.err.log
Sep 01 14:37:56 clickhouse-02 clickhouse-server[1938]: Processing configuration file '/etc/clickhouse-server/users.xml'.
Sep 01 14:37:56 clickhouse-02 clickhouse-server[1938]: Include not found: networks
Sep 01 14:37:56 clickhouse-02 clickhouse-server[1938]: Saved preprocessed configuration to '/var/lib/clickhouse//preprocessed_configs/users.xml'.
Sep 01 14:37:58 clickhouse-02 clickhouse-server[1938]: Processing configuration file '/etc/clickhouse-server/config.xml'.
Sep 01 14:37:58 clickhouse-02 clickhouse-server[1938]: Include not found: clickhouse_remote_servers
Sep 01 14:37:58 clickhouse-02 clickhouse-server[1938]: Include not found: clickhouse_compression
Sep 01 14:37:58 clickhouse-02 clickhouse-server[1938]: Saved preprocessed configuration to '/var/lib/clickhouse//preprocessed_configs/config.xml'.

But restarting the server one or more times using sudo service clickhouse-server restart produces the same issue again.

It seems that sudo service clickhouse-server restart causes the problem while sudo systemctl restart clickhouse-server does not.

systemd vs SysV.

CentOS 8 is systemd-based, but it also has a compatibility layer for SysV.

So does ClickHouse: it provides both a systemd service file and a SysV init script.

Because of that, when you use systemctl it uses the systemd service, and when you use service it uses the init.d script.

And it seems that things get messed up when you use one after the other, with systemctl falling back to init.d and vice versa.

For now, just always use systemctl and it should work (by the way, that is the only correct way to control services on systemd-based distros).

(We need to figure out how to package the SysV init script to avoid that mess.)
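
If the same scripts have to run on both systemd and non-systemd hosts, one option is to pick the command once and never mix the two. A small sketch under that assumption: ch_ctl and is_systemd_host are names invented here, and the check relies on systemd creating /run/systemd/system when it is the running init system.

```shell
#!/bin/sh
# Hypothetical helpers; they only print the command, they do not run it.

# Succeeds if systemd is the running init system.
# The directory is a parameter only so the check can be tested.
is_systemd_host() {
    [ -d "${1:-/run/systemd/system}" ]
}

# Usage: ch_ctl <start|stop|restart|status> [systemd_dir]
# Prints the one command that should be used on this host.
ch_ctl() {
    action=$1
    systemd_dir=${2:-/run/systemd/system}
    if is_systemd_host "$systemd_dir"; then
        echo "systemctl $action clickhouse-server"
    else
        echo "service clickhouse-server $action"
    fi
}
```

On CentOS 8, sudo $(ch_ctl restart) should therefore always expand to sudo systemctl restart clickhouse-server.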

I think /etc/cron.d/clickhouse-server also needs to be updated to use systemctl. I started the server with systemctl while the cron.d job was still enabled, and then got a PID conflict because the PIDs differed.
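
For illustration, a systemctl-based equivalent of the init script's condstart could look roughly like this. It is a sketch, not the packaged cron job: the function and its second argument (which lets a stub stand in for systemctl) are hypothetical, while is-active --quiet and start are ordinary systemctl subcommands.

```shell
#!/bin/sh
# Hypothetical condstart: start the unit only if systemd does not
# already report it as active.
# Usage: condstart <unit> [systemctl_command]
condstart() {
    unit=$1
    ctl=${2:-systemctl}   # overridable so the logic can be tested with a stub
    if "$ctl" is-active --quiet "$unit"; then
        echo "$unit already active"
    else
        "$ctl" start "$unit"
    fi
}
```

A cron entry would then call condstart clickhouse-server as root. Note that the unit shown earlier in this thread already reports auto-restart, so systemd itself may make such a cron job unnecessary.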
