Flux: Flux fails to clone a repo in one cluster but not in another.

Created on 22 Feb 2019 · 12 Comments · Source: fluxcd/flux

Flux fails to read a repository. This is strange, as Flux was previously configured on a different cluster to use a different folder of the same repository. All repos are hosted on GitHub.

User reports that trying a different repo also failed to work.

Original issue

https://weavesupport.zendesk.com/agent/tickets/317

Log extract

ts=2019-02-20T15:55:12.061579015Z caller=loop.go:87 component=sync-loop err="git repo not ready: git clone --mirror: running git command: git [clone --mirror git@github.com:dwightbiddle-ef/ontour-k8s-prod /tmp/flux-gitclone132785868]: context deadline exceeded"
ts=2019-02-20T15:55:14.644257634Z caller=images.go:17 component=sync-loop msg="polling images"

All 12 comments

We've adjusted the --git-timeout container argument for the weave-flux-agent deployment to 1 minute, and deleted the pod to force a redeployment.

spec:
  containers:
  - args:
    - --token=$(WEAVE_CLOUD_TOKEN)
    - --connect=wss://cloud.weave.works./api/flux
    - --memcached-hostname=weave-flux-memcached.weave.svc.cluster.local
    - --git-ci-skip=true
    - --git-url=git@github.com:redacted/redacted
    - --git-path=prod
    - --git-branch=master
    - --git-label=redacted
    - --git-timeout=1m
    - --ssh-keygen-dir=/var/fluxd/ssh

However, we're still seeing the timeout in the logs, manifesting in a loop of the following:

Log extract

ts=2019-02-22T16:06:22.409689833Z caller=loop.go:87 component=sync-loop err="git repo not ready: git clone --mirror: running git command: git [clone --mirror git@github.com:redacted/redacted /tmp/flux-gitclone209160108]: context deadline exceeded"
ts=2019-02-22T16:06:24.758439178Z caller=images.go:17 component=sync-loop msg="polling images"
ts=2019-02-22T16:06:24.758500893Z caller=images.go:27 component=sync-loop msg="no automated services"

I'm guessing this is a cluster-wide connectivity problem. Can you clone the repo if you exec into the Flux pod? Can you reach github.com from a pod running inside the weave namespace?
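Because Flux clones over SSH, general reachability of github.com (for example over HTTPS) does not prove that outbound port 22 is open. The following is a minimal sketch of a per-port probe. The function names `check_port` and `report` are hypothetical, and the sketch assumes `bash` (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available, which may not be true inside the Flux container.

```shell
#!/usr/bin/env bash
# Sketch only: probe outbound TCP reachability for a given host and port.
# Assumes bash (/dev/tcp is a bash feature) and coreutils `timeout`.

# check_port HOST PORT [SECONDS]
# Returns 0 if a TCP connection can be opened within the time limit.
check_port() {
  local host=$1 port=$2 secs=${3:-5}
  timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# report HOST PORT
# Prints a one-line verdict for the target.
report() {
  if check_port "$1" "$2"; then
    echo "$1:$2 reachable"
  else
    echo "$1:$2 blocked or timed out"
  fi
}
```

Running `report github.com 22` and `report ssh.github.com 443` from a pod in the affected namespace would show whether only the SSH port is filtered.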

@stefanprodan I can in fact connect to the weave-flux-agent with kubectl exec, and git clone one of my own public repositories onto the container, and can hit github.com no problem.

> git clone one of my own public repositories onto the container

@danbudris Can you clone the git repo you're using for flux, in the container? (by doing it interactively, you will also get a sense of how long it takes and whether that might be an issue)

Hi guys - got around to doing this today, and when I turn on super-verbose SSH debugging it seems it's just hanging while connecting to github.com. Perhaps this is an SSH issue? I deleted the namespace and the deploy key and started over with a new cluster, and we are still having the same problem. For the record, the same setup was just done in our non-production cluster and worked fine (exact same repository with a new deploy key), so perhaps there is a networking setting in our EKS cluster that needs to be updated? Any suggestions would be helpful there.

/home/flux # GIT_SSH_COMMAND="ssh -vvv" git clone git@github.com:eftours/kubernetes.git
Cloning into 'kubernetes'...
OpenSSH_7.5p1-hpn14v4, LibreSSL 2.5.5
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 1: Applying options for *
debug2: resolving "github.com" port 22
debug2: ssh_connect_direct: needpriv 0
debug1: Connecting to github.com [192.30.253.113] port 22.
debug1: connect to address 192.30.253.113 port 22: Operation timed out
debug1: Connecting to github.com [192.30.253.112] port 22.
debug1: connect to address 192.30.253.112 port 22: Operation timed out
ssh: connect to host github.com port 22: Operation timed out
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

Are there permissions you can grant besides “allow write access”? And what folder are you referring to, the one in the container or the one in the git repo?

In reply to Hidde Beydals (Tuesday, March 5, 2019):

> Looking at the needpriv 0 it seems to be unable to read your private key. What are the permissions of the key (and folder)?

Ignore my previous comment. Can you try connecting over port 443 to see if it is your firewall?

$ ssh -T -p 443 git@ssh.github.com
Hi hiddeco! You've successfully authenticated, but GitHub does not provide shell access.

/home/flux # ssh -T -p 443 git@ssh.github.com
No RSA host key is known for [ssh.github.com]:443 and you have requested strict checking.
Host key verification failed.

Err, yeah that is kind of expected as it is a different domain and needs to be added to the known_hosts file first.

$ ssh-keyscan -p 443 ssh.github.com >> /etc/ssh/ssh_known_hosts
$ ssh -T -p 443 git@ssh.github.com

Yep, that works! I actually just copied the existing host entry for github.com and added "ssh." at the beginning in a new entry with the same ssh-rsa token. What does that mean? We need to open port 22?

Something blocks outgoing connections to port 22. The easiest solution would be to simply allow it. If this is not an option, you could try to modify the SSH configuration to connect on port 443 instead, as described in GitHub's documentation on using SSH over the HTTPS port.

Note that the latter would also require a persistent change to your known hosts file.
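As a rough illustration of the port 443 alternative, the sketch below appends a `Host` block that routes `github.com` through `ssh.github.com` on port 443, which is the form GitHub documents for SSH over the HTTPS port. The function name `append_github_443` is hypothetical, and the right config path for the Flux container is an assumption you would need to confirm. The change also has to survive pod restarts (for example via a mounted ConfigMap), which this sketch does not handle.

```shell
#!/usr/bin/env bash
# Sketch only: add an SSH client config entry so that connections to
# github.com go to ssh.github.com on port 443 instead of port 22.

# append_github_443 FILE
# Appends the Host block to FILE, creating parent directories if needed.
append_github_443() {
  local cfg=$1
  mkdir -p "$(dirname "$cfg")"
  cat >> "$cfg" <<'EOF'
Host github.com
    Hostname ssh.github.com
    Port 443
    User git
EOF
  # SSH client config should not be writable by other users.
  chmod 600 "$cfg"
}
```

For example, `append_github_443 "$HOME/.ssh/config"` would apply it for the current user; the matching host key for ssh.github.com still has to be present in the known hosts file.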

Hey all, this is resolved. There was a security group blocking 22 outbound from the NAT. Thank you for your assistance, I really appreciate it.
