Reported by user Andrew Lewis on Slack:
fluxctl gives me
Error: Could not create a dialer: Could not get pod name: Listing pods in kubernetes: Get https://api-my-company.domain.tld/api/v1/namespaces/default/pods?labelSelector=name+in+%28flux%2Cfluxd%2Cweave-flux-agent%29 x509: certificate signed by unknown authority
it actually looks like flux fails to validate a ca cert provided in ~/.kube/config if it doesn't have a trailing newline, kubectl works fine
I cannot reproduce this problem.
Using the following kubeconfig file both fluxctl and kubectl work just fine, with or without a trailing new line
apiVersion: v1
contexts:
- context:
cluster: docker-desktop
user: docker-desktop
name: docker-desktop
current-context: docker-desktop
kind: Config
preferences: {}
users:
- name: docker-desktop
user:
client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBMDJEZGp6OUg3U0VBaHlXMHBtRXNTWm5HYkRiOUpodUo5YUlYMkU3RHhvWUdBSXNhCjVmL0xtV0lOcDZreFdjdW9wcGRMWHZHUmovMi9MZThZZnFwR2tpY3Z2MFU1L21QbFkrekx2cEFGYVVyU2drRUMKWXM1RU5OaS95ZmNhbU52YkNmenZqTHkwdlFhWDNrWW5EVU5Hd1dFYzkzWHozcXFrdUdCQWFXSys3cXkrV0xHSgpqUy9BeUh1cXNoUjAvelZRTm5JWXpQMHlRUjdBSUNPMjlwRVI5OG5LVzdEZVdIOE9BVVJNT3krYjg0SHUzVGpHClZMbUx0QTlUODBBK1NEcVJhK1pmR3dpOHhUeUZibG5PTHhKcy9OZkZlRW9zTkVwNmdzWkNyNmZvdy9oSFJrOHUKSFpjN09zR2NhcTVYa0dBeHREMm4rZSsrU3YwSHNCMFFRVElkTndJREFRQUJBb0lCQVFEREwrNUhpWks4YUQ0WgpqZnJickYxOTlVNmh2TUZVUUJmTHg0c25SY3hJZEpmMFJuOW5KU1ozazNDeWdteTgwMi8vclROL0Q2aHZnMHZYCmViSnZ2SUtBRVVsbzRON2xobzliRDEwd3NjTWRjR2NrdWkwZTJnTm5uVUswbDhacW9wSUlEaFdTS3VybXBjRDEKRW1pYk9hV0xpa1o2UitRU21CYVUzdVI3ckNCTk9XNlVKREVIZ0tLOUtldDVPR1ZVd1BsWE5LTnkrVmdOMzNSTApReGY5YUsxekh3ZzBaUUt2c3hjVCtWSXZtdGlPa2lZbSs0ZVk2amNXZVJUTEhyNWRQYUF0TVFia1NFMEJHWk5iCkg2eURlOGU5Z2RKbytYNG9tMW5OQ1hacTFCbjdXRU0vZEs5Rnk2Q1BuOWpzSHQ3SDUyS1NUWi9pNkQ2bDFwRFAKTFRxTEJDaEpBb0dCQU9GdFRqTjVwdUdjN1RkMmpCSXFhcVN4b2ZMWFUrNE1tTk5BOHdmMVVwZlFrM2duK2htcwpVdkpJVkZPMFpLYUhqUEhEendITHlBcGc2TEtkaGk0VU1TS2VBcFJQdHVPSUxrUWZBbjVPcU4xL0ZSY0JZVyt3CmdsL1k4ZjlXSGlid1pYZFN4WFc4TUFSMTNjMXZpY2kzbWwrekdtRHlTMkpsRGNUZVRTTEdUSTNMQW9HQkFQQUwKelFreW1UcWVxNmcyRUVySUpGemxMb1dKN0RXZ2ZqZVZPQTVKMGEwMVlJeksvV01icWpIdlhKWE5VOG5meE5RdQpDV3dOeVNDMEtXZHp0RDZraExJQm1WKzBlY29BK0xJdGR5SWI5czUyQm5lQ3QxQnZ3Z1QxODJaWVdPUDJZUENWCm16YWN3ZjBYNEdSMXlDbDZveFJzcE1qa0dXVGV3RFNnMTlPZnV3REZBb0dBVzZoTTdxTWRGb2p1N3VrMXBNRTQKWVVTVWU1L1AwVyt2eDkyVnMzdXRIR3RET3N6T1pSZnJGZllReTRRQ2xLOXl3RzZFUWMrd3czK0p2ZTVNdTNtYwpUTjRBWUh0VStvakpmd3M0d1ZDTVdwc3NZUkUvbytFWjhZek5RS3VzVU5yWDlyOWg2REFmT2dFT3NWUVRxdGYwClhjcitBOE1nb1o1RERmUHhXeDFUelNjQ2dZQlpJU2lhZDYwcGRPenI0bGNlVFQ0ak4vMlVHK2dXNldhMElMWFYKcUZjd1p4ZFliNjZ4OC8yMzJOYVowTW1CdFpLUjdoNFZmdkRsTWNBRjU1SlBpQ3ExSlo3YlNGbklSYUFTR0l1WgpvS1I5ZUpsaUdxa1NOc3pscHFVZnBVSXNNcmsvMjJ6c1ZEdzdTM2hJRk91amF5UE9XNkM3N3VYMjdEYVYwL3NQCldzbkd1UUtCZ1FESmt4c3FtcllqKzJ4S21XZmNHd1BWdS82ajhiY05PNng1NWdvWnV3eWViU25namRKSndLT1EKRVlYeGFZc1Vmbm9XS2Z3UTZraDU3S0lIMkV5eGlYcE9COXU2d2l0c09WRVRlZG1FeTlwOVRSMXhtZmtaeFNBbQp3YWhLT1k3bFByTWJybENiWURoRlR3c0Vnb043bHowcWRGV2IyWk1pUC9YOW5xVml1V0RSZGc9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM5RENDQWR5Z0F3SUJBZ0lJZHRIUlBNbFRSVzh3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB4T1RBeE1UWXhORFU0TVRCYUZ3MHlNREF5TVRneE5qVTVOREJhTURZeApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sc3dHUVlEVlFRREV4SmtiMk5yWlhJdFptOXlMV1JsCmMydDBiM0F3Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRRFRZTjJQUDBmdElRQ0gKSmJTbVlTeEptY1pzTnYwbUc0bjFvaGZZVHNQR2hnWUFpeHJsLzh1WllnMm5xVEZaeTZpbWwwdGU4WkdQL2I4dAo3eGgrcWthU0p5Ky9SVG4rWStWajdNdStrQVZwU3RLQ1FRSml6a1EwMkwvSjl4cVkyOXNKL08rTXZMUzlCcGZlClJpY05RMGJCWVJ6M2RmUGVxcVM0WUVCcFlyN3VyTDVZc1ltTkw4REllNnF5RkhUL05WQTJjaGpNL1RKQkhzQWcKSTdiMmtSSDN5Y3Bic041WWZ3NEJSRXc3TDV2emdlN2RPTVpVdVl1MEQxUHpRRDVJT3BGcjVsOGJDTHpGUElWdQpXYzR2RW16ODE4VjRTaXcwU25xQ3hrS3ZwK2pEK0VkR1R5NGRsenM2d1p4cXJsZVFZREcwUGFmNTc3NUsvUWV3CkhSQkJNaDAzQWdNQkFBR2pKekFsTUE0R0ExVWREd0VCL3dRRUF3SUZvREFUQmdOVkhTVUVEREFLQmdnckJnRUYKQlFjREFqQU5CZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFBV1V4aXlLa2IvSlF1azIwc3VkYkw2bkwzY3U1RW9iRQp3Z3pxRFVmajBETzJNM3VVQUp0ZTBwa2pvREx0NDBOU0pYbTB2Z0xubkJxY3RUUVRXVVZjeDdmUFFuVi9uelYwCjk5Q1RBU1gzUVNHYnRGeWZBTG5Oa2ZqVGRDMEwvVk1kT0NxTkZCQUo0bWxyd3U5aVhXVnZSSHQ4K3J1R2NNNlgKSzhTYml6aFVQaU03MjZoZk9kNlN1WmFpa0VOK25CcFJ0QlcveWM5QUhxdVdSNnQ0NWZxaEhWNC80cTROdzVYRQpZejhsdkRycHQ2MksyUXlpemtwa3JDZE1BRnNQTzQxK280ZGJuM3VNYXNiencvWk9QTFBWbGZ4N1BrNTcrc2grCkg5aDJGTWQwOTJ5OGNnRjdhL3NQNUFMNGNFazJ1ZkZSa215WjRocG1TY1A0VjJKVVE5S0M1Zz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
clusters:
- name: docker-desktop
cluster:
server: https://kubernetes:6443
# insecure-skip-tls-verify: true
# certificate-authority: ca.crt
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRFNU1ERXhOakUwTlRneE1Gb1hEVEk1TURFeE16RTBOVGd4TUZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS214CjNBSnlTcGJ1UHB4WDRRb05lTHFJenJMdGNJL1FKNmdSOWQ0NXE5SGU1VGhBY09IWDVrVXA4TkpkSkRPZEh0RDAKWFlLTHRFbHdlQkEzUEJPck5VVGFzUDZuRWpHR3BtL3RZSjVFYlVVSm01ODgxems3MUlPdk1zYkNNTzdUUmF6Rwo1UjR5MEkzbWYxRFlIbXBUS2FOcEhoYVQ4WmJoVGFUMjRRZkp1YXh0RzI0VHFHWHRGc3d5eUtDL3NpK1RZVUlzCjdiOFVjd3BaSnA0aXhuWWJ3VHd3Q1BMK3Z1b2EzSXVqODhzOThwOCtFb1NGTEswWFR2dFVPME94TjR0MURkTXcKV2JRRG1wc2Zaek9rYUZYOWJhcDhLK0VRbkZoRCtGaFRXVGFjT1U0ampYSjZqay9WdHVkUDJOVlZERGdGcjJZUwpCYlFxV1BVRFlPTjFsZHllK1ZVQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFERXFMRkVVZUJiZ0w1ek5XaTN2dEc5QjBiaGgKMEV5dVMzSEVUTWY4d3VJRlVmNENTa0MvSmZEODBRMnRmTkxBdU1pa0t4dlRydVhJNDJyc0VYV1VFMC9XRWI3MwozYkVybXFXcjlJeEFwUHQ1RFZ0OTgwNmc2ODRpdWtoeHRLQ005UEduL0NzektueHdWM2QzVFp2M1dkeXhkRVYyClhwVWpNQmVvSzUrbHhxam4wc29SbFBXZ3dBRkNZcFNoN0dYbXBrejZLLyswQlNGTjZ0QkpYajhwNGtzQzFHNksKenc4c056ZEo5amNVREMzRWJrWjRMc2duQmxyS05FQkY1VmZnOW9UdWc2K3pmTy9JTkpWbDNQNndxY042cDI1SApQRnFvTndReHBhN3VwZmY2eFYwMGg2SmxmYmh4T2pSbkNaRjlDUmJOOEJJQWxra2NPYzNSNEwvT1NzYz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
I have also tried putting the certificate in a separate file (certificate-authority: ca.crt) and removing the trailing EOL there, but it also works without problems:
-----BEGIN CERTIFICATE-----
MIICyDCCAbCgAwIBAgIBADANBgkqhkiG9w0BAQsFADAVMRMwEQYDVQQDEwprdWJl
cm5ldGVzMB4XDTE5MDExNjE0NTgxMFoXDTI5MDExMzE0NTgxMFowFTETMBEGA1UE
AxMKa3ViZXJuZXRlczCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKmx
3AJySpbuPpxX4QoNeLqIzrLtcI/QJ6gR9d45q9He5ThAcOHX5kUp8NJdJDOdHtD0
XYKLtElweBA3PBOrNUTasP6nEjGGpm/tYJ5EbUUJm5881zk71IOvMsbCMO7TRazG
5R4y0I3mf1DYHmpTKaNpHhaT8ZbhTaT24QfJuaxtG24TqGXtFswyyKC/si+TYUIs
7b8UcwpZJp4ixnYbwTwwCPL+vuoa3Iuj88s98p8+EoSFLK0XTvtUO0OxN4t1DdMw
WbQDmpsfZzOkaFX9bap8K+EQnFhD+FhTWTacOU4jjXJ6jk/VtudP2NVVDDgFr2YS
BbQqWPUDYON1ldye+VUCAwEAAaMjMCEwDgYDVR0PAQH/BAQDAgKkMA8GA1UdEwEB
/wQFMAMBAf8wDQYJKoZIhvcNAQELBQADggEBADEqLFEUeBbgL5zNWi3vtG9B0bhh
0EyuS3HETMf8wuIFUf4CSkC/JfD80Q2tfNLAuMikKxvTruXI42rsEXWUE0/WEb73
3bErmqWr9IxApPt5DVt9806g684iukhxtKCM9PGn/CszKnxwV3d3TZv3WdyxdEV2
XpUjMBeoK5+lxqjn0soRlPWgwAFCYpSh7GXmpkz6K/+0BSFN6tBJXj8p4ksC1G6K
zw8sNzdJ9jcUDC3EbkZ4LsgnBlrKNEBF5Vfg9oTug6+zfO/INJVl3P6wqcN6p25H
PFqoNwQxpa7upff6xV00h6JlfbhxOjRnCZF9CRbN8BIAlkkcOc3R4L/OSsc=
-----END CERTIFICATE-----
apfelmus-2:~ fons$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.0", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"clean", BuildDate:"2018-12-03T21:04:45Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.0", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"clean", BuildDate:"2018-12-03T20:56:12Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
apfelmus-2:~ fons$ kubectl get pods
NAME READY STATUS RESTARTS AGE
flux-654975db4d-jnmnz 1/1 Running 0 6m48s
memcached-57f9c8d4c-9qfsm 1/1 Running 0 6m48s
apfelmus-2:~ fons$ fluxctl version
1.10.1
apfelmus-2:~ fons$ fluxctl identity
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCoFbCBrNU1wPZ5xL1Z1/gtFbTo4L1kZzLcO5eI9J4l02+Jtr+/4ZWtXj6qVSeHmb0nBjGoCG+vJc3gqusI9QihNp3ZSCg33BsarGA664ybTZShjpeVq2n7PUx9MAdK7NsQMKuDpV1OO7CvVv7i5HZsnT1z0DoyYLWaN6jevgguWRlFxd7ywVw7AyBrgW2YjNC7F2OHpid5vcMIKe41whfn2GuO1BHTgIhrsfuhivTEfFSKDY9nrVltrgUcE8oYCGuUYZ+xom309Wo8oRB2vid9UYhyUJOFln0DGAcXNemZZjY2Y7ungC/EFlVAFsyBBh0vja6zCQpQx3QkQx/ONGEJ
It turns out that the problem happens when using certificate-authority-data with a base64-encoded certificate missing an EOL after -----END CERTIFICATE-----.
This works on kubectl 1.13.0 but fails on fluxctl 1.10.1
My guess is that this is coming from a bug in a Go library whose dependency we haven't bumped yet.
This is still a mystery. It works with kubectl, back to version 1.5.0
I think I found the culprit at https://github.com/kubernetes/client-go/blob/v8.0.0/tools/clientcmd/loader.go#L227-L229
config := clientcmdapi.NewConfig()
mergo.Merge(config, mapConfig)
mergo.Merge(config, nonMapConfig)
The code above results in the certificate being duplicated if it doesn't have an EOL at the end :S
Actually, it always concats the certificate with itself but, if there is an EOL, then it complies with the pem format.
TLDR: this is caused by a vendoring mess, and will be fixed after bumping the kubernetes go client to version 10.0.0
Moral of the story:
The problem is that we are vendoring a version of https://github.com/imdario/mergo with different operation semantics than the one expected by the version of https://github.com/kubernetes/client-go we vendor.
Specifically, the semantics of the Merge operation in mergo changed sometime between version 0.1.4 (expected by the kubernetes go-client we vendor) and 0.3.2 which is the one we actually vendor.
This explains why this problem didn't surface with kubectl.
We vendored version 0.3.2 of mergo at https://github.com/weaveworks/flux/commit/66354840fbe4d69cf9c865c417226d9da0eb41db#diff-bd247e83efc3c45ae9e8c47233249f18R176 but I still can't understand why that version was chosen instead of respecting the requirement from https://github.com/kubernetes/client-go.
Good news is that https://github.com/kubernetes/client-go already upgraded to mergo 0.3.5 (downstreamed from https://github.com/kubernetes/kubernetes/pull/65779 and included in the 10.0.0 release), adapting the code to the _new_ Mergo semantics.
This means we can fix the problem by bumping our vendored https://github.com/kubernetes/client-go to version 10.0.0
As a final pun to myself ... this is the second time I run into this problem, but I didn't remember https://github.com/imdario/mergo/issues/33 ... sigh
Marking as blocked until we try to bump the kubernetes client library. I may give it a go soon.
Most helpful comment
TLDR: this is caused by a vendoring mess, and will be fixed after bumping the kubernetes go client to version 10.0.0
Moral of the story:
The problem is that we are vendoring a version of https://github.com/imdario/mergo with different operation semantics than the one expected by the version of https://github.com/kubernetes/client-go we vendor.
Specifically, the semantics of the
Mergeoperation in mergo changed sometime between version 0.1.4 (expected by the kubernetes go-client we vendor) and 0.3.2 which is the one we actually vendor.This explains why this problem didn't surface with
kubectl.We vendored version 0.3.2 of mergo at https://github.com/weaveworks/flux/commit/66354840fbe4d69cf9c865c417226d9da0eb41db#diff-bd247e83efc3c45ae9e8c47233249f18R176 but I still can't understand why that version was chosen instead of respecting the requirement from https://github.com/kubernetes/client-go.
Good news is that https://github.com/kubernetes/client-go already upgraded to mergo 0.3.5 (downstreamed from https://github.com/kubernetes/kubernetes/pull/65779 and included in the 10.0.0 release), adapting the code to the _new_ Mergo semantics.
This means we can fix the problem by bumping our vendored https://github.com/kubernetes/client-go to version 10.0.0
As a final pun to myself ... this is the second time I run into this problem, but I didn't remember https://github.com/imdario/mergo/issues/33 ... sigh