Openjdk-infrastructure: Test Linux machines are down

Created on 13 Nov 2018  ·  10 Comments  ·  Source: AdoptOpenJDK/openjdk-infrastructure

Out of the 4 Linux test machines at adopt (https://ci.adoptopenjdk.net/label/ci.role.test&&hw.arch.x86&&sw.os.linux&&sw.tool.docker/), the following two are down:

1) test-softlayer-ubuntu1604-x64-1 is down with message:

Disk space is too low. Only 0.631GB left on /tmp.

2) test-scaleway-ubuntu1604-x64-1 is down with error message:

Disconnected by smlambert : disk out of space errors
bug

All 10 comments

@sxa555... fyi

The scaleway machine appears to have been manually disconnected by @smlambert for a perceived lack of disk space but it looks ok to me:

root@test-scaleway-ubuntu1604-x64-1:~# df -k
Filesystem     1K-blocks    Used Available Use% Mounted on
none             2018844       0   2018844   0% /dev
tmpfs             404612   54848    349764  14% /run
/dev/vda        47929956 8409504  37062664  19% /
tmpfs            2023044       0   2023044   0% /dev/shm
tmpfs               5120       0      5120   0% /run/lock
tmpfs            2023044       0   2023044   0% /sys/fs/cgroup
/dev/vdb1       47928916 5553420  39917760  13% /home
tmpfs             404612       0    404612   0% /run/user/1002
tmpfs             404612       0    404612   0% /run/user/1000
tmpfs             404612       0    404612   0% /run/user/0
root@test-scaleway-ubuntu1604-x64-1:~# 
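
A check like the one above can be scripted rather than read by eye. The following is a minimal sketch, not anything taken from the Adopt infrastructure: `avail_kb` is a hypothetical helper name, and it assumes POSIX-style `df -Pk` output (one line per filesystem, mount point in the sixth column) on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read `df -Pk` style output on stdin and print the
# "Available" column (in 1K blocks) for the mount point given as $1.
avail_kb() {
    # NR > 1 skips the header line; $6 is the "Mounted on" column,
    # $4 is the "Available" column.
    awk -v mp="$1" 'NR > 1 && $6 == mp { print $4 }'
}
```

For example, `df -Pk | avail_kb /home` prints the free space on /home in 1K blocks. It prints nothing when the argument is not a mount point of its own, which may be the case for /tmp on some machines.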

The softlayer machine has no running containers, but docker ps -a shows quite a few old, exited ones left behind:

root@test-softlayer-ubuntu1604-x64-1:/var/lib# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
root@test-softlayer-ubuntu1604-x64-1:/var/lib# docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                      PORTS               NAMES
3904d21f7511        1abd63135345        "/bin/sh -c 'git clo…"   3 months ago        Exited (0) 3 months ago                         gifted_mestorf
c29a2b6aad5b        3cf495263030        "/bin/sh -c 'ls -la …"   4 months ago        Exited (2) 4 months ago                         wonderful_goldstine
30a065dbdea5        1bd8fd98a06b        "/bin/sh -c 'apt-get…"   4 months ago        Exited (100) 4 months ago                       priceless_nobel
6c1f03f6a09d        05701a729259        "/bin/sh -c 'wget --…"   4 months ago        Exited (1) 4 months ago                         hardcore_nobel
ccdbe87351e0        b2ddf15f3a03        "/bin/sh -c 'wget --…"   4 months ago        Exited (1) 4 months ago                         wizardly_mahavira
717995a0004a        f27b50c10ba9        "/bin/sh -c 'wget --…"   4 months ago        Exited (1) 4 months ago                         epic_agnesi
a94695085df8        bdfd5762f885        "/bin/sh -c 'wget --…"   4 months ago        Exited (3) 4 months ago                         flamboyant_bardeen
ff89c8b4b347        bdfd5762f885        "/bin/sh -c 'wget --…"   4 months ago        Exited (3) 4 months ago                         sad_benz
5276b98f49f4        2d3b55d70fba        "/bin/sh -c 'adduser…"   4 months ago        Exited (1) 4 months ago                         gifted_pare
root@test-softlayer-ubuntu1604-x64-1:/var/lib# 

These will likely need to be removed to free up space.

Yes, my mistake on the scaleway machine: I disabled it and did not re-enable it after clean-up.

We don't seem to have a cleanup for the docker runs. I had to do a massive sweep a few months ago, something that needs looking at...

Agreed. We do a 'docker rm containerName' step at the end of each test run, but we should have a more general clean-up call in the cleanWS step ('docker system prune' or some such thing).
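
As a rough illustration of what such a general clean-up step might look like, here is a minimal sketch. The function names `run` and `docker_cleanup` and the `DRY_RUN` switch are hypothetical and not part of the existing test scripts. Only `docker system prune` with its `--all` and `--force` options is a real Docker CLI call.

```shell
#!/bin/sh
# Hypothetical wrapper: when DRY_RUN=1, print the command instead of
# running it, so the step can be checked on a machine without Docker.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical clean-up step for the end of a job.
docker_cleanup() {
    # Removes stopped containers, unused networks, images with no
    # container associated, and build cache. --force skips the prompt.
    run docker system prune --all --force
}
```

Note that `--all` also removes images that no container is using, so the next test run may have to pull or rebuild its images. Dropping `--all` keeps tagged images and removes only dangling ones.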

I have run docker system prune --all on test-softlayer-ubuntu1604-x64-1, consistent with https://github.com/AdoptOpenJDK/openjdk-tests/issues/686, and brought the machine back online. 15 GB is now free. Closing and re-enabling.

I ran docker system prune --all on test-scaleway-ubuntu1604-x64-1 as well, and it seems to be clean:

jenkins@test-scaleway-ubuntu1604-x64-1:~$ docker system prune --all
WARNING! This will remove:
        - all stopped containers
        - all networks not used by at least one container
        - all images without at least one container associated to them
        - all build cache
Are you sure you want to continue? [y/N] y
Total reclaimed space: 0B

@smlambert - could we have test-scaleway-ubuntu1604-x64-1 back online too?

@Mesbah-Alam @smlambert I've re-enabled test-scaleway-ubuntu1604-x64-1

thanks @sxa555 !
