Hi,
I have been experimenting with LXD, with the goal of moving away from "plain" LXC.
(using LXD 0.18-0ubuntu2~ubuntu14. from lxd-stable ppa)
lxc profile create my-default
lxc profile device add my-default "mnt_shared" disk "source=/lxd-mounts/guests.shared" "path=/lxd-mnt/_shared" readonly=false
lxc launch :ubuntu1404 :d-glu -p my-default -p default
lxc stop :d-glu
lxc config show :d-glu
lxc config device add :d-glu "mnt_data" disk source=/lxd-mounts/guests.data/d-glu path=/lxd-mnt/data readonly=false
lxc config device add :d-glu "mnt_bak" disk source=/lxd-mounts/guests.bak/d-glu path=/lxd-mnt/bak readonly=false
lxc config device add :d-glu "mnt_glu1" disk source=/lxd-mounts/d-glu.glu1 path=/lxd-mnt/glu1 readonly=false
lxc config show :d-glu
lxc config set :d-glu boot.autostart true
lxc config set :d-glu boot.autostart.delay 10
lxc config set :d-glu boot.autostart.priority 10
lxc config set :d-glu environment.ENV_IS_HERE yes-it-is
lxc config set :d-glu limits.cpus 2
lxc config set :d-glu limits.memory 1024
When I run the commands above, "lxc stop" hangs (and the container keeps running).
If I kill "lxc stop" and then try it again, it exits with error: exit status 254, even though the container ends up being stopped.
Ideally I would like to do this without having to start/stop the container, i.e., by cloning it and leaving it stopped.
So, alternative ways of creating the container with this strategy would also be appreciated.
Thx
You could use "lxc init" instead of "lxc launch".
Also, it looks like GitHub ate chunks of your commands with its markdown processor; you may have to escape some characters to make the report readable :)
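A minimal sketch of the "lxc init" approach, reusing the profile and device names from the original report. The function name provision_stopped and its arguments are hypothetical, and the exact flags may differ between LXD versions; the point is only that the container is created in a stopped state, so devices and config keys can be set before the first boot.

```shell
# Hypothetical helper: create a container in the stopped state, then configure it.
# Usage: provision_stopped IMAGE NAME
provision_stopped() {
    image="$1"
    name="$2"

    # "lxc init" creates the container but does not start it.
    lxc init "$image" "$name" -p my-default -p default || return 1

    # Per-container disk device; the path layout is taken from the report above.
    lxc config device add "$name" mnt_data disk \
        "source=/lxd-mounts/guests.data/$name" path=/lxd-mnt/data || return 1

    # Other keys can be set the same way while the container is still stopped.
    lxc config set "$name" boot.autostart true
}
```

Called as, for example, provision_stopped some-image d-glu, this leaves d-glu stopped and configured, with no start/stop cycle involved.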
Hi. I meant actions "above" :) Thx for the "lxc init" hint.
In the meantime I went back to the trusted lxc/lxc 1.1.
But, regarding the hang-up of "lxc stop", I only get the exit status above. Any ideas?
Thx
Not sure, lxc info containername --show-log may help there
I am facing a similar issue
Environment:
System Info
~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty
~$ uname -a
Linux openring 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
~$ sudo lxc version
0.19
~$ sudo lxc list
+-------+---------+-----------+------+-----------+-----------+
| NAME  | STATE   | IPV4      | IPV6 | EPHEMERAL | SNAPSHOTS |
+-------+---------+-----------+------+-----------+-----------+
| node1 | RUNNING | 10.0.3.26 |      | YES       | 0         |
+-------+---------+-----------+------+-----------+-----------+
# This hangs
~$ sudo lxc stop node1
Workaround:
~$ ps aux | grep "containers node1"
root 6619 0.0 0.2 75032 4280 ? Ss 19:27 0:00 [lxc monitor] /var/lib/lxd/containers node1
pratz 7745 0.0 0.0 15916 2028 pts/1 S+ 19:35 0:00 grep --color=auto containers node1
~$ sudo kill -9 6619
~$ sudo lxc list
+-------+---------+------+------+-----------+-----------+
| NAME  | STATE   | IPV4 | IPV6 | EPHEMERAL | SNAPSHOTS |
+-------+---------+------+------+-----------+-----------+
| node1 | STOPPED |      |      | YES       | 0         |
+-------+---------+------+------+-----------+-----------+
What's running in the container?
lxc stop does a clean shutdown: it sends SIGPWR to the container's init process.
If init in the container doesn't react to the signal or fails to shut down the container, you get a hang (unless you specify a timeout, in which case you'd get an error).
If your container can't be shut down properly, you can pass --force to lxc stop, which will instead kill the init process, instantly killing the whole container.
Container info
~$ sudo lxc image info 50045c285f19
Fingerprint: 50045c285f19fafb411410c28094779c5aa7ec69a6096bfad5c38674fb059f89
Size: 57MB
Architecture: x86_64
Public: no
Timestamps:
Created: 2015/10/06 03:22 UTC
Uploaded: 2015/10/06 13:30 UTC
Expires: never
Properties:
description: Centos 7 (amd64)
Aliases:
The --force option is good, but I do not want to lose the assigned IP address.
If I kill the container, I have to launch it again and a new IP is assigned.
Also, I am not sure if there is some issue with the container image, as I am using the image from
http://images.linuxcontainers.org/images/centos/7/amd64/
Yeah, it wouldn't surprise me if CentOS 7's init system doesn't know about SIGPWR.
As for the IP, that seems odd to me. The MAC of the container is static and that's handled at startup time, so even an unclean shutdown should keep the IP address.
stgraber@castiana:~$ lxc launch images:centos/7/amd64 centos
Creating centos done.
Starting centos done.
stgraber@castiana:~$ lxc list centos
+--------+---------+------------+------+-----------+-----------+
| NAME   | STATE   | IPV4       | IPV6 | EPHEMERAL | SNAPSHOTS |
+--------+---------+------------+------+-----------+-----------+
| centos | RUNNING | 10.0.3.225 |      | NO        | 0         |
+--------+---------+------------+------+-----------+-----------+
stgraber@castiana:~$ lxc stop centos --force
stgraber@castiana:~$ lxc start centos
stgraber@castiana:~$ lxc list centos
+--------+---------+------------+------+-----------+-----------+
| NAME   | STATE   | IPV4       | IPV6 | EPHEMERAL | SNAPSHOTS |
+--------+---------+------------+------+-----------+-----------+
| centos | RUNNING | 10.0.3.225 |      | NO        | 0         |
+--------+---------+------------+------+-----------+-----------+
stgraber@castiana:~$
Sorry, not working for me :(
~$ sudo lxc list
+-------+---------+------------+------+-----------+-----------+
| NAME  | STATE   | IPV4       | IPV6 | EPHEMERAL | SNAPSHOTS |
+-------+---------+------------+------+-----------+-----------+
| node1 | RUNNING | 10.0.3.243 |      | YES       | 0         |
+-------+---------+------------+------+-----------+-----------+
~$ sudo lxc stop node1 --force
~$ sudo lxc start node1
error: not found
Your container is ephemeral so that's expected :)
An ephemeral container is deleted once stopped, that's the very definition of ephemeral.
Oh man!!! my bad, works well
~$ sudo lxc list
+-------+---------+------------+------+-----------+-----------+
| NAME  | STATE   | IPV4       | IPV6 | EPHEMERAL | SNAPSHOTS |
+-------+---------+------------+------+-----------+-----------+
| node1 | RUNNING | 10.0.3.244 |      | NO        | 0         |
+-------+---------+------------+------+-----------+-----------+
~$ sudo lxc stop node1 --force
~$ sudo lxc start node1
~$ sudo lxc list
+-------+---------+------------+------+-----------+-----------+
| NAME  | STATE   | IPV4       | IPV6 | EPHEMERAL | SNAPSHOTS |
+-------+---------+------------+------+-----------+-----------+
| node1 | RUNNING | 10.0.3.244 |      | NO        | 0         |
+-------+---------+------------+------+-----------+-----------+
If SIGPWR does not work on some systems, could we fall back to SIGKILL?
I do not have much OS knowledge; this is just a suggestion in case it helps.
We can't, because LXD itself can't know whether it worked or not. The signal has always been supported by Linux, so we'll never get an error sending it, and LXD doesn't have any knowledge of what's running in the container, so it can't know whether the container is ignoring the signal or just taking very long to shut down.
That's why we have the --timeout and --force options: you can try a clean shutdown, time out after 30s, and then do a forced shutdown.
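The clean-then-forced sequence described here could be wrapped roughly as follows. This is a sketch, not something shipped with LXD: the function name stop_or_force is hypothetical, and it relies only on the --timeout and --force options mentioned above (the exact flag syntax may vary between LXD versions).

```shell
# Hypothetical helper: try a clean shutdown first, force-stop if it times out.
# Usage: stop_or_force NAME [TIMEOUT_SECONDS]
stop_or_force() {
    name="$1"
    timeout="${2:-30}"

    # Clean shutdown: lxc stop returns an error if the container is
    # still running once the timeout expires.
    if lxc stop "$name" --timeout "$timeout"; then
        echo "stopped cleanly"
        return 0
    fi

    # Fall back to killing the container's init process.
    lxc stop "$name" --force && echo "stopped with --force"
}
```

For example, stop_or_force node1 30 waits up to 30 seconds for init to react before resorting to --force.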
Ok sounds good (y).
Thank you for contributing to LXD.
"lxc stop" always hangs for me
fresh install of Ubuntu 15.10 && Updates / Upgrades
apt-get install bridge-utils (change eth0 to br0 in /etc/network/interfaces)
add-apt-repository ppa:ubuntu-lxc/lxd-stable
apt-get update
apt-get dist-upgrade
apt-get install lxd
lxc profile edit default (change bridge to br0)
lxc remote add images images.linuxcontainers.org
lxc launch images:centos/7/amd64 centos
lxc stop centos
hang
lxc launch images:debian/jessie/amd64 debian
lxc stop debian
hang
Easy to reproduce. Just follow the steps above.
I tested lxc stop with a wily guest on a wily host. It works.
CentOS and Debian guests don't. See above.
This might be the same?
https://github.com/lxc/lxc/issues/736
I don't know. I use ext4.
The problem here is that init in your container doesn't understand SIGPWR; you can work around it by just always using --force.
Just ran into this myself. For reference, good old sysvinit does work, so using it instead of systemd will make lxc stop work:
lxc exec jessie-amd64-sysvinit -- /bin/bash
# apt-get update && apt-get install sysvinit-core
# exit
lxc stop jessie-amd64-sysvinit --force
lxc start jessie-amd64-sysvinit
lxc exec jessie-amd64-sysvinit -- /bin/bash
# apt-get remove --purge --auto-remove systemd
# exit
lxc stop jessie-amd64-sysvinit
I hit the same issue. I found a few suggestions to get the systemd containers like CentOS v7 to respect SIGPWR. The below seems to work well.
1) Log into the container (lxc exec /bin/bash)
2) Create a new sigpwr.target like so:
ln -s /usr/lib/systemd/system/halt.target /etc/systemd/system/sigpwr.target
3) Force the container to restart so the change takes effect
lxc stop --force
lxc start
Result:
Now you can stop the container with a simple "lxc stop"
I found this in a comment in the below thread:
http://lxc-users.linuxcontainers.narkive.com/ekRrTST6/lxc-stop-doesn-t-stop-centos-waits-for-the-timeout
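The three steps above can be run from the host in one go. The sketch below is an assumption-laden convenience wrapper: the function name link_sigpwr_target and the container-name argument are hypothetical (the container name was lost from the commands as posted), and the restart uses --force on the assumption that a clean stop would still hang before the change takes effect.

```shell
# Hypothetical wrapper around the three steps above.
# Usage: link_sigpwr_target NAME
link_sigpwr_target() {
    name="$1"

    # Step 2, run inside the container: point sigpwr.target at halt.target.
    lxc exec "$name" -- ln -s /usr/lib/systemd/system/halt.target \
        /etc/systemd/system/sigpwr.target || return 1

    # Step 3: restart so systemd picks up the new target. A forced stop is
    # assumed here, since a clean stop is exactly what hangs beforehand.
    lxc stop "$name" --force || return 1
    lxc start "$name"
}
```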
@dmacbride I checked a vanilla Debian Jessie image and sigpwr is already pointing to halt.target, but it still hangs.
root@jessietest:~# ls -l /etc/systemd/system
total 28
lrwxrwxrwx 1 root root 37 Jan 10 22:49 default.target -> /lib/systemd/system/multi-user.target
-rw-r--r-- 1 root root 306 Jan 10 22:49 getty-static.service
drwxr-xr-x 2 root root 4096 Jan 10 22:49 getty.target.wants
-rw-r--r-- 1 root root 1538 Jan 10 22:49 [email protected]
drwxr-xr-x 2 root root 4096 Jan 10 22:48 halt.target.wants
drwxr-xr-x 2 root root 4096 Jan 10 22:49 multi-user.target.wants
drwxr-xr-x 2 root root 4096 Jan 10 22:48 poweroff.target.wants
drwxr-xr-x 2 root root 4096 Jan 10 22:48 reboot.target.wants
lrwxrwxrwx 1 root root 31 Jan 10 22:49 sigpwr.target -> /lib/systemd/system/halt.target
lrwxrwxrwx 1 root root 9 Jan 10 22:49 systemd-udevd.service -> /dev/null
lrwxrwxrwx 1 root root 9 Jan 10 22:49 udev.service -> /dev/null
So far, switching to sysvinit has been the only way I found to get Debian Jessie containers to stop gracefully.
Hi, I'm also facing this issue.
The host is Ubuntu 16 with ZFS and the container is CentOS 7 (I have also updated systemctl / systemd to the latest, 221), but no luck.
I found a trick here: https://bbs.archlinux.org/viewtopic.php?id=181032
(from the container)
ln -s /usr/lib/systemd/system/poweroff.target /etc/systemd/system/sigpwr.target
and finally lxc stop works as expected.
Similar problem with debian/jessie/amd64
lxc info
driver: lxc
driverversion: 2.0.1
kernel: Linux
kernelarchitecture: x86_64
kernelversion: 4.4.0-28-generic
server: lxd
serverpid: 2674
serverversion: 2.0.2
storage: zfs
storageversion: "5"
config:
core.https_address: 0.0.0.0:8443
core.https_allowed_headers: Content-Type
core.https_allowed_methods: GET, POST, PUT, DELETE, OPTIONS
core.https_allowed_origin: '*'
core.trust_password: true
storage.zfs_pool_name: zfspool
public: false
lxc init images:debian/jessie/amd64 debian2
lxc list
+---------+---------+------+------+------------+-----------+
| NAME    | STATE   | IPV4 | IPV6 | TYPE       | SNAPSHOTS |
+---------+---------+------+------+------------+-----------+
| debian2 | STOPPED |      |      | PERSISTENT | 0         |
+---------+---------+------+------+------------+-----------+
lxc start debian2
+---------+---------+------+------+------------+-----------+
| NAME    | STATE   | IPV4 | IPV6 | TYPE       | SNAPSHOTS |
+---------+---------+------+------+------------+-----------+
| debian2 | RUNNING |      |      | PERSISTENT | 0         |
+---------+---------+------+------+------------+-----------+
lxc stop --force=true debian2
---------------------------------------------------------
lxc monitor
metadata:
context:
ip: '@'
method: GET
url: /1.0/containers/debian2
level: info
message: handling
timestamp: 2016-07-10T14:42:54.802717899+01:00
type: logging
metadata:
context:
ip: '@'
method: PUT
url: /1.0/containers/debian2/state
level: info
message: handling
timestamp: 2016-07-10T14:42:54.816220805+01:00
type: logging
metadata:
class: task
created_at: 2016-07-10T14:42:54.818145207+01:00
err: ""
id: 45389589-f4b3-4903-8258-f622984972a9
may_cancel: false
metadata: null
resources:
containers:
- /1.0/containers/debian2
status: Running
status_code: 103
updated_at: 2016-07-10T14:42:54.818145207+01:00
timestamp: 2016-07-10T14:42:54.818572589+01:00
type: operation
metadata:
context: {}
level: dbug
message: 'New task operation: 45389589-f4b3-4903-8258-f622984972a9'
timestamp: 2016-07-10T14:42:54.818187735+01:00
type: logging
metadata:
class: task
created_at: 2016-07-10T14:42:54.818145207+01:00
err: ""
id: 45389589-f4b3-4903-8258-f622984972a9
may_cancel: false
metadata: null
resources:
containers:
- /1.0/containers/debian2
status: Pending
status_code: 105
updated_at: 2016-07-10T14:42:54.818145207+01:00
timestamp: 2016-07-10T14:42:54.818467611+01:00
type: operation
metadata:
context: {}
level: dbug
message: 'Started task operation: 45389589-f4b3-4903-8258-f622984972a9'
timestamp: 2016-07-10T14:42:54.818538649+01:00
type: logging
metadata:
context:
ip: '@'
method: GET
url: /1.0/operations/45389589-f4b3-4903-8258-f622984972a9/wait
level: info
message: handling
timestamp: 2016-07-10T14:42:54.826518814+01:00
type: logging
------------------------------------------------------------------
lxc info --show-log debian2
lxc 20160710145748.548 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.237 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.548 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.627 INFO lxc_confile - confile.c:config_idmap:1500 - read uid map: type u nsid 0 hostid 165536 range 65536
lxc 20160710145749.627 INFO lxc_confile - confile.c:config_idmap:1500 - read uid map: type g nsid 0 hostid 165536 range 65536
lxc 20160710145749.628 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.639 INFO lxc_confile - confile.c:config_idmap:1500 - read uid map: type u nsid 0 hostid 165536 range 65536
lxc 20160710145749.639 INFO lxc_confile - confile.c:config_idmap:1500 - read uid map: type g nsid 0 hostid 165536 range 65536
lxc 20160710145749.640 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.640 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.641 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.647 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.648 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.648 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.648 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.648 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.648 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.649 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.649 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.649 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.649 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.649 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.659 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.659 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.659 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.659 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.659 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.659 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.660 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.660 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.660 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.660 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
lxc 20160710145749.664 DEBUG lxc_commands - commands.c:lxc_cmd_handler:893 - peer has disconnected
Something interesting, after I wait for a while ....
lxc list
+---------+----------+------+------+------------+-----------+
| NAME    | STATE    | IPV4 | IPV6 | TYPE       | SNAPSHOTS |
+---------+----------+------+------+------------+-----------+
| debian2 | FREEZING |      |      | PERSISTENT | 0         |
+---------+----------+------+------+------------+-----------+
Hi, so we've recently merged https://github.com/lxc/lxc/pull/1086 in LXC, which aims to detect whether SIGRTMIN+3 is in the blocked signal set of the container's init process. If so, it sends SIGRTMIN+3 as the shutdown signal instead of SIGPWR. This should take care of sending the correct shutdown signal to systemd-based containers, as systemd is the only init system (to our knowledge) which uses SIGRTMIN+3. So the ln -s hack will not be needed anymore.
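To illustrate the idea (this is not the code from the pull request), the blocked signal set of a process is exposed as the hex SigBlk field in /proc/<pid>/status, where bit N-1 stands for signal N. The sketch below assumes glibc's numbering on Linux, where SIGRTMIN is 34 and SIGRTMIN+3 is therefore 37, and a shell with 64-bit arithmetic; both function names are hypothetical.

```shell
# Succeeds if signal number SIGNUM is set in a hex signal mask such as the
# SigBlk field of /proc/<pid>/status. Bit (SIGNUM - 1) represents SIGNUM.
sig_in_mask() {
    mask="$1"
    signum="$2"
    [ $(( (0x$mask >> (signum - 1)) & 1 )) -eq 1 ]
}

# Hypothetical helper: print the shutdown signal to try for a given init PID.
# Assumes SIGRTMIN is 34 (glibc on Linux), so SIGRTMIN+3 is signal 37.
pick_shutdown_signal() {
    pid="$1"
    blocked=$(awk '/^SigBlk:/ { print $2 }' "/proc/$pid/status") || return 1
    [ -n "$blocked" ] || return 1
    if sig_in_mask "$blocked" 37; then
        echo "SIGRTMIN+3"
    else
        echo "SIGPWR"
    fi
}
```

Run against the host-side PID of a container's init, pick_shutdown_signal prints which of the two signals the heuristic would choose.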
@saghul thanks for the solution. In my case, neither Debian Jessie amd64 nor CentOS 7 amd64 works with the ln -s .../poweroff.target .../sigpwr.target hack.
@brauner that's a great achievement.
Hi, I'm having a similar problem. I found a stupid solution to stop the container.
For example,
$ lxc exec centos -- poweroff
$ lxc list
+--------+---------+------+------+------------+-----------+
| NAME   | STATE   | IPV4 | IPV6 | TYPE       | SNAPSHOTS |
+--------+---------+------+------+------------+-----------+
| centos | STOPPED |      |      | PERSISTENT | 0         |
+--------+---------+------+------+------------+-----------+
It works but I hope the "lxc stop centos" hang issue could be resolved soon.
Cheers
I'm having the same issue with my Void Linux container running runit as init process.
I can power off manually using lxc exec <container> bash and then running shutdown -P now or even lxc exec test -- poweroff.