One case reports issue that
We need to make sure the component (pipeline step) is running as the intended KSA (e.g. pipeline-runner).
If the client doesn't use the workload-identity-bound GSA, that's a bug in the GCP Python client, and we should report it there.
Based on the email, it seems the reporter didn't express the problem clearly. I'm waiting for replies, but will keep this open for a while to see whether others hit the same issue.
I'm running into an issue with a demo pipeline that seems like it could be related to this. When running the XGBoost demo pipeline as instructed in the KF Pipelines Quickstart guide, the dataproc-create-cluster block fails with
  File "kfp_component/google/dataproc/_create_cluster.py", line 74, in create_cluster
    wait_interval)
  File "kfp_component/google/dataproc/_create_cluster.py", line 89, in _create_cluster_internal
    return _dump_cluster(operation.get('response'))
  File "kfp_component/google/dataproc/_create_cluster.py", line 109, in _dump_cluster
    cluster.get('clusterName'))
AttributeError: 'NoneType' object has no attribute 'get'
In the GCP console, my Kubernetes node pools are listed as having the {KF_PROJECT}-vm service account, rather than the default GCE service account. In the IAM section, it also appears that this VM account doesn't have sufficient permissions to modify resources in the cluster (only has Logs Writer, Monitoring Metric Writer, Monitoring Viewer, Storage Object Viewer).
Should I post a new issue, or does this seem related to this issue?
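The traceback above shows `operation.get('response')` returning `None`, which then fails inside `_dump_cluster`. A minimal sketch of a guard that would surface the underlying failure instead, assuming the operation is a dict shaped like a Google long-running operation with either an `error` or a `response` entry. The helper name `cluster_from_operation` is hypothetical and is not part of `kfp_component`:

```python
def cluster_from_operation(operation):
    """Return the cluster dict from a finished operation, or raise a clear error.

    Assumption: `operation` is a dict that carries an 'error' entry on
    failure and a 'response' entry on success.
    """
    error = operation.get('error')
    if error is not None:
        # Surface the real failure (for example a permission problem)
        # instead of failing later with an AttributeError on None.
        raise RuntimeError('Cluster creation failed: {}'.format(error))

    cluster = operation.get('response')
    if cluster is None:
        raise RuntimeError(
            'Operation finished without a response: {}'.format(operation))
    return cluster
```

With a guard like this, a permissions failure would likely appear in the step log as an explicit error message rather than as `'NoneType' object has no attribute 'get'`.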
@mmwebster Can you share your Kubeflow version?
After 1.0.2, pipelines should already get their permissions through workload identity rather than the default service account.
If not, I suggest binding workload identity for the pipeline-runner service account.
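For reference, a minimal sketch of what that binding involves: one IAM policy binding on the GSA and one annotation on the KSA. The helper `workload_identity_commands` is hypothetical, and the project ID, namespace, and GSA email in the usage lines are placeholders, so substitute the values from your own deployment:

```python
import shlex


def workload_identity_commands(project_id, namespace, ksa, gsa_email):
    """Build the gcloud and kubectl commands that bind a KSA to a GSA."""
    # Workload identity member format: PROJECT_ID.svc.id.goog[NAMESPACE/KSA]
    member = 'serviceAccount:{}.svc.id.goog[{}/{}]'.format(
        project_id, namespace, ksa)
    # Allow the KSA to impersonate the GSA.
    bind = ['gcloud', 'iam', 'service-accounts', 'add-iam-policy-binding',
            gsa_email,
            '--role', 'roles/iam.workloadIdentityUser',
            '--member', member]
    # Tell GKE which GSA the KSA maps to.
    annotate = ['kubectl', 'annotate', 'serviceaccount', ksa,
                '--namespace', namespace,
                'iam.gke.io/gcp-service-account={}'.format(gsa_email)]
    return bind, annotate


# Placeholder values, for illustration only.
for command in workload_identity_commands(
        'my-project', 'kubeflow', 'pipeline-runner',
        'my-gsa@my-project.iam.gserviceaccount.com'):
    print(' '.join(shlex.quote(part) for part in command))
```

The sketch only prints the commands so that they can be reviewed before anyone runs them.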
/close
because the original report has been resolved; it was caused by a misconfigured workload identity binding.
@Bobgy: Closing this issue.
I'm wondering whether we should publish the WI self-test, i.e. the tool that checks the WI binding, on GitHub or on the website as an FAQ entry. This seems to be a frequently asked question. I will add an item to the post-1.0 backlog.
That's a good point. We should mention it in the troubleshooting section of the workload identity doc.
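As a rough illustration of what such a check could look like (this is not the actual self-test tool), the sketch below asks the GCE metadata server, from inside a pod, which service account the workload is running as and compares it with the expected GSA. The function names and the example email are hypothetical:

```python
import urllib.request

# Metadata server endpoint that reports the active service account email.
METADATA_URL = ('http://metadata.google.internal/computeMetadata/v1/'
                'instance/service-accounts/default/email')


def fetch_active_account(url=METADATA_URL, timeout=5):
    """Query the metadata server; only works from inside GCE or GKE."""
    request = urllib.request.Request(
        url, headers={'Metadata-Flavor': 'Google'})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode('utf-8').strip()


def check_binding(expected_gsa, fetch=fetch_active_account):
    """Return (ok, actual), where ok is True if the pod runs as expected_gsa.

    `fetch` is injectable so the comparison can be exercised off-cluster.
    """
    actual = fetch()
    return actual == expected_gsa, actual
```

Run inside a pipeline step's pod, `check_binding('my-gsa@my-project.iam.gserviceaccount.com')` should return `True` as its first element when the binding is in place. Any other reported account suggests that the KSA annotation or the IAM binding is missing.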