Molecule: Molecule leaves zombie agetty with 100% CPU load

Created on 25 Jan 2018 · 12 Comments · Source: ansible-community/molecule

Issue Type

  • Bug report

Molecule and Ansible details

ansible 2.4.2.0
molecule, version 2.7.0
  • Molecule installation method: pip
  • Ansible installation method: pip

Desired Behaviour

All processes started during testing should be killed afterwards.

Actual Behaviour (Bug report only)

An agetty process remains on the host system, consuming 100% CPU. The only fix I have found is to manually change every Dockerfile.j2 as proposed in https://github.com/moby/moby/issues/4040#issuecomment-339022455

All 12 comments

The agetty process is running on your host OS?

Yes. I'm starting a Vagrant box like this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

VAGRANTFILE_API_VERSION = "2"

$molecule_prep_script = <<SCRIPT
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
curl -fsSL https://download.docker.com/linux/$(. /etc/os-release; echo "$ID")/gpg | sudo apt-key add -
echo "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | sudo tee -a /etc/apt/sources.list.d/docker.list
sudo apt-get update
sudo apt-get install -y python-pip docker-ce
sudo pip install molecule docker-py
SCRIPT

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.ssh.insert_key = false

  # VirtualBox.
  config.vm.provider :virtualbox do |v|
    v.memory = 1024
    v.cpus = 3
    v.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
    v.customize ["modifyvm", :id, "--ioapic", "on"]
  end

  # Debian Stretch
  config.vm.define "stretch" do |stretch|
    stretch.vm.hostname = "stretch"
    stretch.vm.box = "debian/stretch64"
    stretch.vm.network "private_network", ip: "192.168.33.25"
  end

  # prepare for molecule
  config.vm.provision "shell", inline: $molecule_prep_script
end

and inside it I run molecule test on https://github.com/systemli/ansible-sshd.
As soon as the Docker container runs, top shows something like:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
 13553 root      20   0   14536   1728   1596 R  99.7  0.2   1:04.05 agetty 

The process also stays there after Molecule has finished.
Compare with https://github.com/moby/moby/issues/4040
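To check for a leftover process without watching top, something like the following might help. It is only a sketch: the function name find_busy_agetty and the default threshold of 90 are my own choices, and it assumes a procps-style ps. Note that ps reports %CPU averaged over the lifetime of the process rather than the instantaneous value top shows, so the numbers can differ slightly.

```shell
# Print "PID %CPU COMMAND" for agetty processes at or above a CPU threshold.
# Reads `ps -eo pid,pcpu,comm` style lines (header included) from stdin.
# The function name and the default threshold of 90 are illustrative only.
find_busy_agetty() {
  threshold="${1:-90}"
  awk -v t="$threshold" \
    'NR > 1 && $3 == "agetty" && $2 + 0 >= t { print $1, $2, $3 }'
}

# Usage on a live system (run after molecule has finished):
#   ps -eo pid,pcpu,comm | find_busy_agetty 90
```

Any line it prints after the test run points to an agetty that is still spinning on the host.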

Since Molecule ships a Dockerfile that the user can control for their purposes, I suggest modifying the Dockerfile and adding the workaround from the issue you referenced.

The workaround can be implemented in the Dockerfile template provided by Molecule init, for affected users.

FWIW, I resolved this by applying the suggestion from this comment. This problem affects @geerlingguy's docker-...-ansible images and I resolved it like this:

# molecule/default/Dockerfile.j2

FROM {{ item.image }}

RUN rm -f /lib/systemd/system/systemd*udev* \
  && rm -f /lib/systemd/system/getty.target

Then just set pre_build_image: false for any images that you'd like to treat:

# molecule/default/molecule.yml
...
platforms:
  - name: debian9
    image: "geerlingguy/docker-debian9-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: debian8
    image: "geerlingguy/docker-debian8-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: ubuntu1804
    image: "geerlingguy/docker-ubuntu1804-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: ubuntu1604
    image: "geerlingguy/docker-ubuntu1604-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: centos7
    image: "geerlingguy/docker-centos7-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: centos6
    image: "geerlingguy/docker-centos6-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
...

@percygrunwald Thanks for investigating this further. Have you considered opening a pull request in the @geerlingguy repos? (e.g. https://github.com/geerlingguy/docker-debian9-ansible)

(Just an aside, I haven't had this issue in any of my role or playbook testing using molecule...?)

I still do and switched to testing with vagrant locally.

@t2d, regarding the PR in Jeff's repos, I have created an issue there to discuss it before making a PR. I'm not sure Jeff would consider a PR to fix this issue if he can't replicate it himself. The issue is here: https://github.com/geerlingguy/docker-ubuntu1804-ansible/issues/9. I'm able to consistently replicate/resolve the issue with the steps I've outlined there. @t2d, maybe you can check those steps yourself to see if you're able to replicate it in the same way.

@geerlingguy, not sure if I'm being overly presumptuous, but it seems that in your public role repos you're only testing against one platform at a time. I only had this issue when launching more than 2 platforms, including Debian-based instances, at the same time, so if your role development workflow never launches 3 or more instances with Molecule, the issue may never have presented itself on your system. I'm curious to see if you can replicate the issue with the steps I gave in https://github.com/geerlingguy/docker-ubuntu1804-ansible/issues/9 (I believe you're on Mac OS X), but I can totally understand that this might not be a good use of your time.

I'm happy to maintain some Docker images based on Jeff's with the getty services removed and see how things go.

@percygrunwald - I control which OSes I run tests with using an environment variable, and test almost all my roles on at least Ubuntu 18 and 16, Debian 9, and CentOS 7, but test some on Debian 8, CentOS 6, and Fedora 29 as well (see the .travis.yml files in those repos).

When testing locally I just run one test like MOLECULE_DISTRO=debian9 molecule test — it’s just a style thing.

Yeah, I took an extensive look through your repos and assumed that was your workflow. It makes total sense that if you're testing one platform per run that you have never encountered this issue.

I was trying to create a workflow where I can develop roles against all 6 OSs at the same time, since with Docker there's not really that much overhead to run it. I'm looking at ways that you can combine the "6 at once" style for local development and then the "one platform per Travis runner" model with the same Molecule config. Initially using --base-config seemed perfect, but it actually doesn't work with platforms (see https://github.com/ansible/molecule/issues/1423#issuecomment-460915577).

I'm able to achieve the desired outcome using Molecule scenarios. I created a second scenario called travis that just references everything in the default scenario, but only runs against a single platform:

# molecule/travis/molecule.yml
---
dependency:
  name: galaxy
driver:
  name: docker
lint:
  name: yamllint
  options:
    config-file: molecule/default/yaml-lint.yml
platforms:
  - name: instance
    image: "geerlingguy/docker-${PLATFORM_DISTRO:-centos7}-ansible:latest"
    command: ${PLATFORM_COMMAND:-""}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: true
provisioner:
  name: ansible
  lint:
    name: ansible-lint
  playbooks:
    converge: ../default/playbook.yml
scenario:
  name: travis
verifier:
  name: testinfra
  directory: ../default/tests/
  lint:
    name: flake8

Then for local development against all 6 OSs I can run molecule test and for Travis, or when I want to isolate a single platform, I can run PLATFORM_DISTRO=ubuntu1804 molecule test -s travis.
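If you want to cover every distro locally but one instance at a time, a small wrapper around the travis scenario could look like this. It is a rough sketch, not something from my repos: the run_per_distro function and the DRY_RUN switch are made up for illustration, and the distro list is simply taken from the image names in the molecule.yml above.

```shell
# Distro names taken from the image names used earlier in this thread.
DISTROS="debian9 debian8 ubuntu1804 ubuntu1604 centos7 centos6"

# Run the travis scenario once per distro, stopping at the first failure.
# Set DRY_RUN=1 (a hypothetical switch) to only print the commands.
run_per_distro() {
  for distro in $DISTROS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "PLATFORM_DISTRO=$distro molecule test -s travis"
    else
      PLATFORM_DISTRO="$distro" molecule test -s travis || return 1
    fi
  done
}

# Usage:
#   run_per_distro              # runs the six tests sequentially
#   DRY_RUN=1 run_per_distro    # prints the commands without running them
```

Because each run only ever starts a single container, this should stay clear of the case where 3 or more instances are launched at once.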
