Awx: SSH Machine Credential Signed Certificate not working

Created on 21 Jun 2019 · 10 Comments · Source: ansible/awx

ISSUE TYPE

Feature bug

SUMMARY

After #3098 I tested login with a signed certificate, but it didn't work. The signed certificate is not available when "ssh-add" is called by "ansible-runner".

ENVIRONMENT
  • AWX version: 5.0.0
  • AWX install method: docker on linux
STEPS TO REPRODUCE
  • Create a Machine Credential with private key and a signed certificate.
  • Try to run a job using this credential
EXPECTED RESULTS

The SSH connection to the job must succeed

ACTUAL RESULTS

The connection fails with "connection denied"

ADDITIONAL INFORMATION

After #3098 I tested login with a signed certificate, but it didn't work. I checked and found that "ssh-add" is invoked by "ansible-runner" with the "artifacts_dir/ssh_key_data" file, but the signed certificate is stored in the "private_data_dir/credential_%d-cert.pub" file.
Since "ssh-add" looks for the certificate in a file named after the private key plus "-cert.pub", the certificate file must be placed in the same directory as the private key and share its name as a prefix.
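
To make the naming rule above concrete, here is a minimal Python sketch. It is not AWX or ansible-runner code: the helper names `expected_cert_path` and `write_key_and_cert` are hypothetical and only illustrate writing the certificate next to the private key under the name "ssh-add" looks for.

```python
import os


def expected_cert_path(key_path):
    """Return the file name ssh-add checks for a certificate: <key>-cert.pub."""
    return key_path + "-cert.pub"


def write_key_and_cert(directory, key_name, key_data, cert_data):
    """Write a private key and its signed certificate side by side.

    Hypothetical helper for illustration; both files are created with
    owner-only permissions because ssh tools reject world-readable keys.
    """
    os.makedirs(directory, mode=0o700, exist_ok=True)
    key_path = os.path.join(directory, key_name)
    cert_path = expected_cert_path(key_path)
    for path, data in ((key_path, key_data), (cert_path, cert_data)):
        # Create (or truncate) the file with mode 0600 before writing to it.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
    return key_path, cert_path
```

With the key written as "ssh_key_data", the certificate would have to be named "ssh_key_data-cert.pub" in the same directory, which is not where AWX was placing it.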

api high bug

All 10 comments

This is a blocker issue since there is no workaround for signed certificates.
@jakemcdermott, @ryanpetrello

The following seemed to work: https://github.com/ansible/awx/pull/4320. It pretty much just writes the public certificate to the place where ansible-runner expects it to be.

Worst case we currently have the following hack running before starting the Tasks:

# Patch for issue https://github.com/ansible/awx/issues/4139
original="path = os.path.join(private_data_dir, name)"
patched="artifact_dir = os.path.join(private_data_dir, 'artifacts', str(self.instance.id))\n                if not os.path.exists(artifact_dir):\n                    os.makedirs(artifact_dir, mode=0o700)\n                path = os.path.join(artifact_dir, 'ssh_key_data-cert.pub')"
sed -z -i "s/${original}/${patched}/g" /var/lib/awx/venv/awx/lib64/python3.6/site-packages/awx/main/tasks.py

/usr/bin/launch_awx_task.sh
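
For readability, the Python that the sed replacement above injects into tasks.py can be written out as a standalone function. This is only a sketch of the same path logic: the function name `signed_cert_path` and its parameters are hypothetical, and in the real patch the code runs inline with `self.instance.id` as the job ID.

```python
import os


def signed_cert_path(private_data_dir, instance_id):
    """Return where the hack writes the signed certificate.

    Mirrors the patched lines: the certificate goes into the job's
    artifacts directory, next to ssh_key_data, so ssh-add can find it.
    """
    artifact_dir = os.path.join(private_data_dir, "artifacts", str(instance_id))
    if not os.path.exists(artifact_dir):
        # Owner-only directory, matching the mode used in the patch.
        os.makedirs(artifact_dir, mode=0o700)
    return os.path.join(artifact_dir, "ssh_key_data-cert.pub")
```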

@bkmeneguello Thanks for digging into this. I've done the following steps to generate a CA and sign a key:

~ ssh-keygen -C CA -f ca

Next, I configured /etc/ssh/sshd_config on an external managed host in my inventory and restarted sshd:

~ scp ca.pub ec2-user@<my-managed-host>:
~ ssh ec2-user@<my-managed-host>
~ [ec2-user@ip-xyz ~] $ sudo mv ca.pub /etc/ssh/ca.pub
~ [ec2-user@ip-xyz ~] $ sudo bash -c 'echo "TrustedUserCAKeys /etc/ssh/ca.pub" >> /etc/ssh/sshd_config'
~ [ec2-user@ip-xyz ~] $ sudo systemctl restart sshd

Next, I generated a new private key and signed it with the CA:

~ ssh-keygen -f ./test
~ ssh-keygen -s ca -I some-identifier -n ec2-user -V +1w -z 1 ./test.pub
Signed user key ./test-cert.pub: id "some-identifier" serial 1 for ec2-user valid from 2019-07-12T12:39:00 to 2019-07-19T12:40:48
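
Because the certificate above is only valid for one week (-V +1w), an expired certificate would also cause the login to fail and could be mistaken for this bug. The following Python sketch parses the validity window out of the ssh-keygen output line shown above. The function names are hypothetical, and it assumes the timestamps keep the format seen in that line.

```python
import re
from datetime import datetime

# Matches the "valid from <start> to <end>" part of the ssh-keygen output.
VALIDITY_RE = re.compile(r"valid from (\S+) to (\S+)")


def parse_validity(line):
    """Return (start, end) as naive datetimes parsed from the signing output."""
    match = VALIDITY_RE.search(line)
    if match is None:
        raise ValueError("no validity window found in line")
    start, end = (
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%S") for value in match.groups()
    )
    return start, end


def is_valid_at(line, moment):
    """Return True if moment falls inside the certificate's validity window."""
    start, end = parse_validity(line)
    return start <= moment <= end
```

For example, `is_valid_at(line, datetime.now())` could be checked before pasting the certificate into the credential.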

From here, I have a test private half, and a test-cert.pub half signed by my CA. I paste both of these into a new Tower Machine credential:

image

And I'm able to reproduce what you've described - the signed public key half doesn't seem to be provided to ssh-agent.

I've applied your diff at https://github.com/ansible/awx/pull/4320/, and it does fix it for me, but I want to brainstorm a bit about whether or not that's the optimal solution here. This _looks_ to me like a regression when we recently moved AWX task execution to use ansible runner.

@bkmeneguello,

After thinking on this, I think just fixing this in runner is a better potential solution.

In my opinion it would be best if the ansible runner code could be changed from runner_config.py#L169 to:
self.command = self.wrap_args_with_ssh_agent(self.command, self.ssh_key_data)

Then you only have to align the name of the signed ssh certificate in the awx code.

The problem is that the signed certificate is read by ssh-add, which searches for it by name, so there is no option other than copying it to the same directory as the key. I don't know if there is another way to add certificates to ssh-agent without ssh-add.

@bkmeneguello Ah, I guess it should use the path of the ssh key then (env/ssh_key), assuming the awx tasks.py places the public cert there (env/ssh_key-cert.pub)?

@jdekoning Actually, there is no way to fix this other than respecting the ssh-add rules. Its code (https://github.com/openssh/openssh-portable/blob/master/ssh-add.c#L336) doesn't allow certificates to be passed on the command line; it only looks them up by file name.

Thanks for the legwork on this one - it's just something we quietly broke without noticing. The PR you put together isn't perfect, but it's probably the shortest path to just fixing this without having to actually develop and ship a new documented feature in runner.

Confirmed that this is now fixed using the verification steps listed in a prior comment. (https://github.com/ansible/awx/issues/4139#issuecomment-510955154)
