I am still seeing this issue #1705; my description is as per #1705 and I think this should be closed and that issue reopened, but for context:
Steps to reproduce the behavior:
For example, on a default install (no imported project) of JX into EKS, after two days I have just deleted over 400 of these
jenkins-x-gcactivities-1541615400-8s7sd 0/1 Error 0 7m
jenkins-x-gcactivities-1541615400-fj2tb 0/1 Error 0 3m
jenkins-x-gcactivities-1541615400-lpkxt 0/1 Error 0 5m
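For anyone cleaning these up by hand, here is a minimal sketch of how the failed pods could be picked out of the pod listing. It assumes the listing has the same columns as above (NAME, READY, STATUS, RESTARTS, AGE); the helper name `failed_gc_pods` is made up for illustration and is not part of jx or kubectl.

```shell
#!/bin/sh
# Print the names of gcactivities pods whose STATUS column is "Error".
# Reads `kubectl get pods --no-headers` style output on stdin:
#   NAME READY STATUS RESTARTS AGE
# failed_gc_pods is a hypothetical helper name, not a jx or kubectl command.
failed_gc_pods() {
  awk '$1 ~ /^jenkins-x-gcactivities-/ && $3 == "Error" { print $1 }'
}

# Usage against a live cluster (not run here):
#   kubectl get pods -n jx --no-headers | failed_gc_pods | xargs kubectl delete pod -n jx
```

This only removes the leftovers; it does not stop new failed pods from appearing on the next scheduled run.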
My versions:
Running in namespace: jx
Jenkins X Version:
Using helmBinary helm with feature flag: none
NAME VERSION
jx 1.3.535
jenkins x platform 0.0.2859
Kubernetes cluster v1.10.3-eks
kubectl v1.12.2
helm client v2.11.0+g2e55dbe
helm server v2.11.0+g2e55dbe
git git version 2.17.2 (Apple Git-113)
but I assumed I would have the fix/workaround from #1705.
Can confirm it's happening to me too.
NAME VERSION
jx 1.3.532
jenkins x platform 0.0.2859
Kubernetes cluster v1.10.3-eks
kubectl v1.12.2
helm client v2.11.0+g2e55dbe
helm server v2.11.0+g2e55dbe
git git version 2.17.2 (Apple Git-113)
Can you paste the logs from one of the failed pods please?
Will do next time it happens. I reinstalled Jenkins X right after posting.
Here's one of the pod's logs:
Error loading team settings. environments.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list environments.jenkins.io in the namespace "jx"
Error loading team settings. environments.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list environments.jenkins.io in the namespace "jx"
plugins.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list plugins.jenkins.io in the namespace "jx"
To be able to connect to the Jenkins server we need a username and API Token
error: EOF
? Jenkins user name: (admin) 7
kc describe pod jenkins-x-gcactivities-1544562000-29d6v:
Name: jenkins-x-gcactivities-1544562000-29d6v
Namespace: jx
Node: ip-10-1-9-10.ec2.internal/10.1.9.10
Start Time: Tue, 11 Dec 2018 15:02:45 -0600
Labels: app=gcactivities
controller-uid=badd0f1d-fd87-11e8-b690-0a0697b453e4
job-name=jenkins-x-gcactivities-1544562000
release=jenkins-x
Annotations: <none>
Status: Failed
IP: 10.1.8.255
Controlled By: Job/jenkins-x-gcactivities-1544562000
Containers:
gcactivities:
Container ID: docker://3802db8b4c0428316e1462b5a4b769725b71a8ba9a91a00e764fb0157bbf2874
Image: jenkinsxio/jx:1.3.639
Image ID: docker-pullable://jenkinsxio/jx@sha256:b6ad86b6b3b54c45f358417cfcc8895e1710d3759105c0b92228db362e07510e
Port: <none>
Host Port: <none>
Command:
jx
Args:
gc
activities
State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 11 Dec 2018 15:02:45 -0600
Finished: Tue, 11 Dec 2018 15:02:45 -0600
Ready: False
Restart Count: 0
Environment:
cheese: wine
foo: bar
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from jenkins-x-gcactivities-token-xlqhs (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
jenkins-x-gcactivities-token-xlqhs:
Type: Secret (a volume populated by a Secret)
SecretName: jenkins-x-gcactivities-token-xlqhs
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned jenkins-x-gcactivities-1544562000-29d6v to ip-10-1-9-10.ec2.internal
Normal SuccessfulMountVolume 10m kubelet, ip-10-1-9-10.ec2.internal MountVolume.SetUp succeeded for volume "jenkins-x-gcactivities-token-xlqhs"
Normal Pulled 10m kubelet, ip-10-1-9-10.ec2.internal Container image "jenkinsxio/jx:1.3.639" already present on machine
Normal Created 10m kubelet, ip-10-1-9-10.ec2.internal Created container
Normal Started 10m kubelet, ip-10-1-9-10.ec2.internal Started container
NAME VERSION
jx 1.3.651
jenkins x platform 0.0.3036
Kubernetes cluster v1.10.11-eks
kubectl v1.13.0
helm client v2.12.0+gd325d2a
helm server v2.12.0+gd325d2a
git git version 2.17.2 (Apple Git-113)
Operating System Mac OS X 10.14.1 build 18B75
We are also experiencing this issue. Pods build up after some time. The cluster is built on EKS and has been running for one week. I followed the getting started guide to deploy Jenkins X, but the cluster itself was deployed using Terraform. We are using a static master setup.
Our configuration is quite basic at the moment: I have created two environments to test with, and some Spring Boot applications. The pipeline and other jx pods appear to run OK, except for these pods, which I have killed but which keep coming back after a few hours.
Please let me know if I can provide any further detail.
NAME VERSION
jx 1.3.647
jenkins x platform 0.0.3036
Kubernetes cluster v1.10.11-eks
kubectl v1.12.2
helm client v2.11.0+g2e55dbe
helm server v2.11.0+g2e55dbe
git git version 2.19.1
Operating System Unkown Linux distribution Linux version 4.19.1-arch1-1-ARCH (builduser@heftig-16768) (gcc version 8.2.1 20180831 (GCC)) #1 SMP PREEMPT Sun Nov 4 16:49:26 UTC 2018
Name: jenkins-x-gcactivities-1544617800-xsrgd
Namespace: jx
Node: ip-10-8-0-113.eu-west-1.compute.internal/10.8.0.113
Start Time: Wed, 12 Dec 2018 12:30:03 +0000
Labels: app=gcactivities
controller-uid=a696b9a8-fe09-11e8-b319-06522f68053e
job-name=jenkins-x-gcactivities-1544617800
release=jenkins-x
Annotations: <none>
Status: Failed
IP: 10.8.0.84
Controlled By: Job/jenkins-x-gcactivities-1544617800
Containers:
gcactivities:
Container ID: docker://cc8309c71368de5c4312bd4cbe249f7aee2ceb6e0d0e142943f470f571787d7b
Image: jenkinsxio/jx:1.3.639
Image ID: docker-pullable://jenkinsxio/jx@sha256:b6ad86b6b3b54c45f358417cfcc8895e1710d3759105c0b92228db362e07510e
Port: <none>
Host Port: <none>
Command:
jx
Args:
gc
activities
State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 12 Dec 2018 12:30:04 +0000
Finished: Wed, 12 Dec 2018 12:30:04 +0000
Ready: False
Restart Count: 0
Environment:
cheese: wine
foo: bar
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from jenkins-x-gcactivities-token-62227 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
jenkins-x-gcactivities-token-62227:
Type: Secret (a volume populated by a Secret)
SecretName: jenkins-x-gcactivities-token-62227
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
Error loading team settings. environments.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list environments.jenkins.io in the namespace "jx"
Error loading team settings. environments.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list environments.jenkins.io in the namespace "jx"
plugins.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list plugins.jenkins.io in the namespace "jx"
To be able to connect to the Jenkins server we need a username and API Token
error: EOF
? Jenkins user name: (admin)
$ kubectl describe clusterrolebinding gcactivities-jx
Name: gcactivities-jx
Labels: <none>
Annotations: <none>
Role:
Kind: ClusterRole
Name: gcactivities-jx
Subjects:
Kind Name Namespace
---- ---- ---------
ServiceAccount jenkins-x-gcactivities jx
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: 2018-12-10T16:11:36Z
name: gcactivities-jx
resourceVersion: "3111406"
selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/gcactivities-jx
uid: 451ec341-fc96-11e8-b319-06522f68053e
rules:
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- get
- create
- update
- apiGroups:
- ""
resources:
- namespaces
verbs:
- get
- delete
- list
- apiGroups:
- apps
resources:
- deployments
verbs:
- get
$ jx status
Jenkins X checks passed for Cluster(arn:aws:eks:eu-west-1:878062504042:cluster/dev-cluster): 3 nodes, memory 4% of 24199212Ki, cpu 23% of 6. Jenkins is running at http://jenkins.jx.eks.actual-experience.com
Can confirm this is also happening on our GKE cluster. Symptoms identical to @rust84 and @ranginuitrot.
$ jx version
NAME VERSION
jx 1.3.640
jenkins x platform 0.0.3036
Kubernetes cluster v1.10.9-gke.5
kubectl v1.11.3
helm client v2.11.0+g2e55dbe
helm server v2.11.0+g2e55dbe
git git version 2.18.0
Operating System Mac OS X 10.13.6 build 17G4015
Should jenkins-x-gcactivities be asking for those resources at all, or is something wrong in our setup? Looking at values.yaml, it seems that gcactivities is not meant to have these permissions: https://github.com/jenkins-x/jenkins-x-platform/blob/master/values.yaml#L63
The controllerbuild on the other hand has them: https://github.com/jenkins-x/jenkins-x-platform/blob/master/values.yaml#L313
Are the permissions added at another point in time? Perhaps when adding an environment (we installed Jenkins X with the --no-default-environments option).
The workaround of adding cluster-admin to the service account would work but I would much prefer knowing what the problem is.
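One way to check what the service account is actually allowed to do, without granting cluster-admin, is `kubectl auth can-i` with impersonation. The sketch below is only an illustration: the helper names `sa_user` and `can_gc_list` are invented here, the namespace and service account name are taken from the log lines above, and the caller needs permission to impersonate service accounts for `--as` to work.

```shell
#!/bin/sh
# Build the RBAC user name of a service account: $1 = namespace, $2 = name.
# (sa_user and can_gc_list are hypothetical helper names.)
sa_user() {
  printf 'system:serviceaccount:%s:%s' "$1" "$2"
}

# Ask the API server whether the gcactivities service account may list
# the given resource in the jx namespace. kubectl prints "yes" or "no".
can_gc_list() {
  kubectl auth can-i list "$1" --namespace jx \
    --as "$(sa_user jx jenkins-x-gcactivities)"
}

# Usage against a live cluster (not run here):
#   can_gc_list environments.jenkins.io
#   can_gc_list plugins.jenkins.io
```

A "no" for both resources would match the "forbidden" lines in the pod logs, although that alone does not show whether those messages are what makes the pod exit with an error.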
Can confirm this is also happening on our EKS cluster.
NAME VERSION
jx 1.3.661
jenkins x platform 0.0.3036
Kubernetes cluster v1.10.11-eks
kubectl v1.13.1
helm client v2.12.0+gd325d2a
helm server v2.12.0+gd325d2a
git git version 2.20.1
Operating System Mac OS X 10.14.2 build 18C54
Adding @pmuir to this as he mentioned in the jenkins-x-user Slack channel that
The permissions errors in that log statement are misleading, they don't cause the code to fail
I'll get them removed
Any idea why this may be happening @pmuir?
One idea I have is that, through switching jx contexts (I had a minikube installation and a GKE installation running), the values in the yaml files created during installation on one cluster were somehow used in, or replaced, the values of the other installation.
$ ls -1 ~/.jx/*.yaml
/Users/sboardwell/.jx/adminSecrets.yaml
/Users/sboardwell/.jx/chartmuseumAuth.yaml
/Users/sboardwell/.jx/gitAuth.yaml
/Users/sboardwell/.jx/jenkinsAuth.yaml
@ccojocar is this because of removing old secrets?
Actually no, this looks like an issue in the gc /cc @jstrachan
Hi @jstrachan, I used one of the failing pods to recreate a test pod and then used kubectl exec -it .... bash. Running jx gc activities indeed asked for credentials (and I can confirm, that the permissions log output has no meaning - @pmuir).
I'm not sure what is expected here, but here is what I found:
/root/.jx/jenkinsAuth.yaml present.
jenkinsAuth.yaml first). I could have also just edited the file I guess.
So:
/root/.jx/jenkinsAuth.yaml supposed to be there when the jenkins-x-gcactivities pod is created?
jenkins-admin-user: "true" and jenkins-admin-password but no jenkins-admin-username
Let me know if I can supply any more information, and thanks for all the cool work.
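For others who want to reproduce the failure on demand rather than wait for the schedule, a different route from the hand-made test pod described above is to start a one-off Job from the CronJob template. This is a rough sketch under assumptions: the CronJob name `jenkins-x-gcactivities` is inferred from the job names in this thread, and `run_gc_once` and `gc-debug` are made-up names.

```shell
#!/bin/sh
# Start a one-off Job from a CronJob's template so its pod can be inspected.
# $1 = namespace, $2 = CronJob name, $3 = name for the new Job.
# run_gc_once is a hypothetical helper name.
run_gc_once() {
  kubectl create job "$3" --from="cronjob/$2" --namespace "$1"
}

# Usage against a live cluster (not run here); the CronJob name is an assumption:
#   run_gc_once jx jenkins-x-gcactivities gc-debug
#   kubectl logs --namespace jx job/gc-debug
```

Because the container exits within a second, this gives you the logs but not an interactive shell; for that, the command would still have to be overridden in a copy of the pod spec.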
@jstrachan - I don't think this issue has been fixed by just adding the missing roles. The missing roles just removed
Error loading team settings. environments.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list environments.jenkins.io in the namespace "jx"
Error loading team settings. environments.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list environments.jenkins.io in the namespace "jx"
plugins.jenkins.io is forbidden: User "system:serviceaccount:jx:jenkins-x-gcactivities" cannot list plugins.jenkins.io in the namespace "jx"
from the logs. The pods are still failing because there is no authentication method.
OK I think those RBAC issues are fixed - am just waiting to see if I get any failures.
Though I'm still not sure why you are getting errors about the Jenkins API Token not being present. You are using Static Jenkins masters right - you're not using Prow / Serverless jenkins?
btw version 0.0.2859 is pretty old - we just released the RBAC fix in 0.0.3086
Yes, I have ported your RBAC fixes and can confirm the "forbidden" log messages are gone. However, the pods are still failing because the Jenkins API token is missing. Where should it be, and how should it get there? I can't see anything in the k8s cron job or job template.
Btw, I'm seeing this on jenkins-x-platform-0.0.3078
You are using Static Jenkins masters
yes, sorry, I'm using a static master
the API token should be created for you when you first import a project into Jenkins X - have you done that yet - do you have any projects? e.g. does jx get pipeline return anything?
usually when using the static Jenkins master the API token has to be set up in order to create the Staging and Production environments
No, no projects yet. I created a test pipeline job by hand. Let me check to see if it works.
I wonder if you are hitting an issue where an older cluster that has been upgraded does not have the necessary Jenkins secret? E.g. if you rm ~/.jx/jenkinsAuth.yaml and then create/import a project, it should create the necessary Jenkins secret.
usually the initial install creates the Jenkins API token secret as it sets up CI/CD for the Environments
Yes the jenkins secret is there, complete with api token, etc.
$ k get secrets jenkins -o yaml | ksd
apiVersion: v1
data:
jenkins-admin-api-token: {REDACTED}
jenkins-admin-password: {REDACTED}
jenkins-admin-user: admin
jenkins-bearer-token: ""
kind: Secret
metadata:
creationTimestamp: "2018-12-16T19:59:23Z"
labels:
app: jenkins
chart: jenkins-0.10.31
heritage: Tiller
release: jenkins-x
name: jenkins
namespace: jx
resourceVersion: "14531"
selfLink: /api/v1/namespaces/jx/secrets/jenkins
uid: 15a140f3-016d-11e9-8c5f-1e557db74be1
type: Opaque
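A quick way to confirm that a key in this secret is populated, without printing the value, might look like the sketch below. The secret name, namespace, and key names come from the output above; the helper name `jenkins_secret_has` is made up for illustration.

```shell
#!/bin/sh
# Succeed when the given key of the "jenkins" Secret in namespace jx holds
# a non-empty value. The value itself is never printed.
# jenkins_secret_has is a hypothetical helper name.
jenkins_secret_has() {
  value=$(kubectl get secret jenkins --namespace jx -o "jsonpath={.data.$1}")
  [ -n "$value" ]
}

# Usage against a live cluster (not run here):
#   jenkins_secret_has jenkins-admin-api-token && echo "API token present"
#   jenkins_secret_has jenkins-bearer-token || echo "bearer token empty"
```

This only shows that the secret exists in the cluster; it says nothing about whether the gcactivities pod is able to read it.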
Hi @rawlingsj, could you keep this open please? I believe there is still another problem that @jstrachan wants to address.
oops - bad copy paste of issues sorry
Just confirming this has fixed my problem. Great work. Thanks a lot!
@sboardwell, how did you solve your problem? I have a similar issue. My pod status looks like this:
jenkins-x-gcactivities-1574078400-227nt 0/1 Error 0 112m
jenkins-x-gcactivities-1574078400-24bs6 0/1 Error 0 116m
jenkins-x-gcactivities-1574078400-24kf4 0/1 Error 0 89m
and the pod logs show the following:
kubectl logs jenkins-x-gcactivities-1574080200-zzld9
error: Could not find any user auths for jenkins server http://jenkins.xxx.xx.xx.xx.xx.nip.io.io has server URLs
Thanks
@anilkumarpasupuleti - I can't remember exactly. I ended up starting up a jenkins-x-gcactivities pod of my own to debug. I think it had something to do with the initial installation values having changed or something.