In a multi-user enabled env, I created a notebook server on user's namespace, launch a notebook and try to call Python SDK from there. When I execute the code below:
pipeline = kfp.Client().create_run_from_pipeline_func(mnist_pipeline, arguments={}, namespace='mynamespace')
The API call was rejected with the following errors:
~/.local/lib/python3.6/site-packages/kfp_server_api/rest.py in request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
236
237 if not 200 <= r.status <= 299:
--> 238 raise ApiException(http_resp=r)
239
240 return r
ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'content-length': '19', 'content-type': 'text/plain', 'date': 'Tue, 01 Sep 2020 00:58:39 GMT', 'server': 'envoy', 'x-envoy-upstream-service-time': '8'})
HTTP response body: RBAC: access denied
A pipeline run should be created and executed
How did you deploy Kubeflow Pipelines (KFP)?
I installed the KFP on IKS with multi-user support
KFP version:
v1.1.0
KFP SDK version:
v1.0.0
[Miscellaneous information that will assist in solving the issue.]
/kind bug
Since the notebook server uses serviceaccount: default-editor in my user's namespace, I can fixed the RBAC issue by adding a servicerolebinding to allow the serviceaccount to access ml-pipeline-service. However, the request is still rejected by the ml-pipelie-api-server:
~/.local/lib/python3.6/site-packages/kfp_server_api/rest.py in request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
236
237 if not 200 <= r.status <= 299:
--> 238 raise ApiException(http_resp=r)
239
240 return r
ApiException: (409)
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Tue, 01 Sep 2020 01:07:38 GMT', 'x-envoy-upstream-service-time': '2', 'server': 'envoy', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header.","message":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header.","code":10,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Request header error: there is no user identity header.","error_details":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header."}]}
This is by design. In current phase, KFP api server needs a trusted source for user identity.
and letting user specify userid header is not trustable.
We introduced the design with @yanniszark in https://kccnceu20.sched.com/event/Zeok/enabling-multi-user-machine-learning-workflows-for-kubeflow-pipelines-yannis-zarkadas-arrikto-yuan-gong-google
You may check the slides there and our public design doc. There is an idea to use service account token to authenticate. @yanniszark is commited to implement that.
current suggested workaround is to always authenticate from the public endpoint using user credentials
@Bobgy What would be the steps needed to make programmatic call to client methods?
I think in the past kfp.Client() worked in in-cluster notebook scenarios...
@Bobgy thanks for the link. It's really helpful. In the slides, the idea is to use RBAC to store the access policies. I guess it should be k8s RBAC (because of SubjectAccessReview). How about using istio RBAC instead? Because the goal is to protect the pipeline server API endpoints and actually istio can do that by setting up proper istio RBAC for those endpoints. KFAM just needs to maintain correct istio RBAC objects. It ties to istio though.
@yhwang to be specific, the goal is to protect pipeline resources based on user identity.
It's also possible to use istio RBAC to define rules that uses http request path and parameters.
I think problems are
I don't have full context whether the option was initially considered or not.
@IronPan @gaoning777 @yanniszark do you have context on this?
I just want to explore possible approaches. k8s RBAC and istio RBAC are designed to handle different level authorization. One is for k8s resource, another is for application level endpoints. I just tried to figure out which one is better for our purpose.
Another point is that if KFAM is doing the validation, ml-pipeline-api-server has to call its API to validate the coming requests. Then all requests go to ml-pipeline-api-server will be examined. The validation can't be done before ml-pipeline-api-server, otherwise some paths go to ml-pipeline-api-server may not be inspected.
About the troubleshoot of the RBAC, envoy sidecar allows you to turn on the debug level of RBAC and you can see detailed information about how it performs the RBAC validation as well as requests' metadata, i.e. request.auth.principal, source.principal and etc.
Agree, to clarify, I think istio is relatively harder to use because the abstraction of istio RBAC needs so much knowledge from users to use properly. While k8s rbac is very clear: just resource, namespace, verb, user, there isn't any redundant concepts here.
current suggested workaround is to always authenticate from the public endpoint using user credentials
How does this work for non-GCP clusters? I saw this issue where it was stated that auth is only possible using GCP IAP, and that someone using AWS should use the kfp client from within the cluster.
@meowcakes
that someone using AWS should use the kfp client from within the cluster.
It doesn't work now at least in my case (multi-user enabled env, kfp v1.1.0). The reason is there are two user identity checkings in ml-pipeline:
kubeflow-userid header validation in ml-pipelineFirst one could be solved by adding some istio RBAC configs. However, the second one is related to design and under implementation based on @Bobgy comment above
@meowcakes the issue you are Looking at is now outdated.
/assign @PatrickXYS
for How to connect on AWS
@yhwang For KFP 1.1, in-cluster communitation from notebook to Kubeflow Pipeline is not supported in this phase.
In order to use kfp as before, you need to pass a cookie to KFP for communication as a workaround.
Ref: https://www.kubeflow.org/docs/aws/pipeline/#authenticate-kubeflow-pipeline-using-sdk-inside-cluster
current suggested workaround is to always authenticate from the public endpoint using user credentials
@Bobgy what does this mean exactly? How would I authenticate to the public endpoint?
@jonasdebeukelaer Here's documentation for GCP
https://www.kubeflow.org/docs/gke/pipelines/authentication-sdk/#connecting-to-kubeflow-pipelines-in-a-full-kubeflow-deployment
@Bobgy
I thought about this issue again and I think this is where istio envoy filter could be used without any application change. I added an envoy filter to add kubeflow-userid header for those HTTP traffics going to ml-pipeline.kubeflow.svc.cluster.local:8888. Then it works. So you actually don't need to do the authentication trick for in-cluster use case. It's weird to me to perform authentication for in-cluster scenario. The kubeflow-userid I injected in the http header is the namespace owner's userid. I think it totally make sense.
In conclusion, I added two config objects to make it work:
kubeflow-userid header for ml-pipeline-api-server to validate the incoming request.If these two configs could be created with the notebook server, then it will be perfect!
@yhwang interesting, can you share an example of these configs?
@Bobgy
sure!
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
name: bind-ml-pipeline-nb-mynamespace
namespace: kubeflow
spec:
roleRef:
kind: ServiceRole
name: ml-pipeline-services
subjects:
- properties:
source.principal: cluster.local/ns/mynamespace/sa/default-editor
kubeflow-userid header from notebook to ml-pipeline service. In the example below, the notebook server's name is mynotebook and userid for namespace: mynamespace is [email protected]apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-header
namespace: mynamespace
spec:
workloadSelector:
labels:
notebook-name: mynotebook
configPatches:
- applyTo: HTTP_FILTER
match:
context: SIDECAR_OUTBOUND
listener:
portNumber: 8888
filterChain:
filter:
name: "envoy.http_connection_manager"
subFilter:
name: "envoy.router"
patch:
operation: INSERT_BEFORE
value: # lua filter specification
name: envoy.lua
config:
inlineCode: |
function envoy_on_request(request_handle)
request_handle:headers():add("kubeflow-userid", "[email protected]")
end
The envoy filter above only inject the kubeflow-userid HTTP header for those traffic going to ml-pipelie service
@Bobgy
I studied the envoy filter more and here is a better version:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-header
namespace: mynamespace
spec:
configPatches:
- applyTo: VIRTUAL_HOST
match:
context: SIDECAR_OUTBOUND
routeConfiguration:
vhost:
name: ml-pipeline.kubeflow.svc.cluster.local:8888
route:
name: default
patch:
operation: MERGE
value:
request_headers_to_add:
- append: true
header:
key: kubeflow-userid
value: [email protected]
workloadSelector:
labels:
notebook-name: mynotebook
It directly uses the custom request header feature that http_connection_manager provides. Because the header name/value are fixed, no need to use lua filter.
@yhwang Thanks for the suggestion. I tried adding ServiceRoleBinding, after which as you mentioned i get the below error. After the error I tried adding envoy filter, however the error remains same. Providing the envoy filter yaml file for reference. Namespace: brainyapps header Value: [email protected] and notebook name: mynotebook
Does the below envoy-filter to be added in our namespace (brainyapps) or in the namespace (kubeflow). For your information: default and system namespaces are in restricted pod security policy
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Tue, 08 Sep 2020 12:33:51 GMT', 'x-envoy-upstream-service-time': '2', 'server': 'envoy', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header.","message":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header.","code":10,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Request header error: there is no user identity header.","error_details":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header."}]}
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-header
namespace: brainyapps
spec:
configPatches:
@arshashi
The envoyfilter should be added to user's namespace. You may check your notebook server pod and see if its label does have the notebook-name: mynotebook. That will make sure the envoyfilter would apply to the notebook server in user's namespace. Also check the user's namespace, i.e.
kubectl get ns brainyapps -o yaml
and make sure the owner is [email protected]. for example:
apiVersion: v1
kind: Namespace
metadata:
annotations:
owner: [email protected]
......
.....
In your case, there is no kubeflow-userid is injected. I guess the notebook-name: mynotebook is wrong. that's my guess.
_edit_
oh another possibility is that in your kubeflow, the identity header name is not kubeflow-userid. You may double check your kubeflow config.
@yhwang My notebook server name was incorrect and now it works for me with the above two changes. Thanks alot for your time and suggestion, I was stuck with this issue for long time.
Thanks @yhwang! The solution sounds secure and reasonable!
I guess the only concern I have, is that other users sharing the namespace can act as namespace owner's permissions.
So if we can make up a service account that represents this namespace and only grant permission to access the same namespace, that will be ideal, but I believe that is totally possible, at least for GCP.
/cc @yanniszark @IronPan
what do you think about this approach to grant cluster workload access to KFP api?
@Bobgy
So if we can make up a service account that represents this namespace and only grant permission to access the same namespace
In the RBAC config I posted above, the notebook would be cluster.local/ns/mynamespace/sa/default-editor. So ml-pipeline-apiserver could use that and only allow that serviceaccount to access specific resources, for example: pipelines created by mynamespace's owner
Exactly, that was one of the auth options in the design too.
PRs welcomed on reading istio request header approach for authentication
@Bobgy can you elaborate more about this:
PRs welcomed on reading istio request header approach for authentication
For istio request header, it could be handled by istio RBAC. I guess you mean adding corresponding istio RBAC in this case. Or you was talking about other implementation?
@Bobgy I have a change here to create the istio configs with the notebook server. It also removes istio configs when deleting notebook server. Do you think it align with your plan? and should I have a PR for that?
Hi all! I will try to answer some of the questions that came up in this issue:
I guess it should be k8s RBAC (because of SubjectAccessReview). How about using istio RBAC instead? Because the goal is to protect the pipeline server API endpoints and actually istio can do that by setting up proper istio RBAC for those endpoints. KFAM just needs to maintain correct istio RBAC objects. It ties to istio though.
@IronPan @gaoning777 @yanniszark do you have context on this?
Istio RBAC is deprecated so I'm going to talk about Istio AuthorizationPolicy.
Istio AuthorizationPolicy is a useful tool, but besides the obvious disadvantages (tied to Istio, harder to use, etc.), it doesn't have the flexibility that the Pipelines model requires right now. Consider that in the current code:
On the contrary, we use Kubernetes RBAC as an authorization database and perform whatever complex logic we want in the API-Server. And as an authorization database, Kubernetes RBAC makes much more sense. Does this answer your question @yhwang? Please tell me if something is not clear! cc @Bobgy
Please also take a look at the Kubeflow-wide guideline for authorization, which prescribes RBAC and SubjectAccessReview:
https://github.com/kubeflow/community/blob/master/guidelines/auth.md
I added an envoy filter to add kubeflow-userid header for those HTTP traffics going to ml-pipeline.kubeflow.svc.cluster.local:8888.
Envoy filter to inject the kubeflow-userid header from notebook to ml-pipeline service. In the example below, the notebook server's name is mynotebook and userid for namespace: mynamespace is [email protected]
Thanks @yhwang! The solution sounds secure and reasonable!
I guess the only concern I have, is that other users sharing the namespace can act as namespace owner's permissions.
I want to make it clear that the config I see outlined here is NOT secure. The sidecar can impersonate ANY identity.
The correct way to enable programmatic access is to:
So if we can make up a service account that represents this namespace and only grant permission to access the same namespace, that will be ideal, but I believe that is totally possible, at least for GCP.
@Bobgy but we don't need to have a notion of a ServiceAccount that "represents" a namespace. All ServiceAccount identities will be able to prove themselves to the Pipelines API Server with the design outlined above.
@yanniszark these are great points, I'll add some of my latest ideas in a few days
@yanniszark thanks for those information. Here are my thoughts
Personally, I don't think using istio config is bad, especially, you already use mTLS from istio. The key point of using istio is to control/define application level access/permission. For me, using k8s RBAC to achieve application level security is kind of abuse of it. It's main purpose is to control the k8s resource level permission. For example, a istio RBAC/authorization config is needed to allow a notebook server to access ml-pipeline-service. This is application level. Creating a notebook server by a user contains 2 level of permissions:
The envoyfilter I provided above is just to incorporate with current ml-pipeline limitation. Once ml-pipeline improves, this approach should also be modified. The audience-bound solution could also be done by istio JWTRule. In this case, we don't need to reinvent the wheel.
These are my two cents.
Is there a solution for making kfp.Client() work out of the box in an in-cluster pod in a KF cluster with no auth? e.g. a cluster installed with https://raw.githubusercontent.com/kubeflow/manifests/v1.1-branch/kfdef/kfctl_k8s_istio.v1.1.0.yaml
It currently fails with:
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/rest.py", line 238, in request
raise ApiException(http_resp=r)
kfp_server_api.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'content-length': '19', 'content-type': 'text/plain', 'date': 'Wed, 23 Sep 2020 14:51:35 GMT', 'server': 'istio-envoy', 'connection': 'close', 'x-envoy-decorator-operation': 'ml-pipeline.kubeflow.svc.cluster.local:8888/*'})
HTTP response body: RBAC: access denied
This is with KFP SDK 1.0.1.
@lukemarsden
You can check my comments above for a workaround:
ServiceRoleBinding
EnvoyFilter
And @yanniszark mentioned this:
I want to make it clear that the config I see outlined here is NOT secure. The sidecar can impersonate ANY identity.
It uses a fixed userid to access ml-pipeline-service and I suggest the value should be the userid who creates the notebook server. If you are comfortable with this, then you can try the workaround.
@yhwang thanks for sharing your thoughts! I have some remarks on some of them:
The key point of using istio is to control/define application level access/permission. For me, using k8s RBAC to achieve application level security is kind of abuse of it. It's main purpose is to control the k8s resource level permission.
As mentioned above, Istio AuthorizationPolicy just doesn't have the flexibility to deal with the complexity required for authorization. So for Pipelines, we needed an authorization database. We could make our own from scratch or try to put something together from existing solutions. Then we would need to provide tooling (clients, CLI, UI) and docs around it. Instead, we opt to use Kubernetes RBAC as the authz database and we get:
The audience-bound solution could also be done by istio JWTRule.
Can you elaborate? Currently, ServiceAccountTokens can only be validated with theTokenReviewAPI call and AFAIK Istio JWTRule doesn't support that.
@lukemarsden if you are ok with working without auth features, you can also disable Istio RBAC completely (ClusterRBACConfig object). I wouldn't recommend it for any multitenant environment though.
Thanks @yanniszark and @yhwang for the guidance.
Applying this config:
apiVersion: rbac.istio.io/v1alpha1
kind: ClusterRbacConfig
metadata:
name: default
spec:
mode: "OFF"
changes the error to:
kfp_server_api.exceptions.ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Fri, 25 Sep 2020 12:41:59 GMT', 'x-envoy-upstream-service-time': '1', 'server': 'istio-envoy', 'x-envoy-decorator-operation': 'ml-pipeline.kubeflow.svc.cluster.local:8888/*', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Validate experiment request failed.: Invalid input error: Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []","message":"Validate experiment request failed.: Invalid input error: Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []","code":3,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []","error_details":"Validate experiment request failed.: Invalid input error: Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []"}]}
So I suspect disabling Istio RBAC alone is not sufficient. I will try experimenting with the ServiceRoleBinding and EnvoyFilter @yhwang suggested, perhaps only the EnvoyFilter is required to pin the namespace owner to anonymous in the case where Istio RBAC is disabled.
Update: I realized this error probably relates to the client.create_run_from_pipeline_func call needing a namespace argument, not the HTTP request missing a header. Testing that now.
Update 2: Adding the namespace arg changed the error to:
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Fri, 25 Sep 2020 13:54:25 GMT', 'x-envoy-upstream-service-time': '1', 'server': 'istio-envoy', 'x-envoy-decorator-operation': 'ml-pipeline.kubeflow.svc.cluster.local:8888/*', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header.","message":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header.","code":10,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Request header error: there is no user identity header.","error_details":"Failed to authorize the request.: Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header."}]}
So I guess I do need that user identity header, too! I'm trying this:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-header
namespace: istio-system
spec:
configPatches:
- applyTo: VIRTUAL_HOST
match:
context: SIDECAR_OUTBOUND
routeConfiguration:
vhost:
name: ml-pipeline.kubeflow.svc.cluster.local:8888
route:
name: default
patch:
operation: MERGE
value:
request_headers_to_add:
- append: true
header:
key: kubeflow-userid
value: [email protected]
But it's not working yet.
@lukemarsden
I guess your envoyfilter should be added to anonymous namespace where your notebook is. and don't forget to specify your notebook server by using the workloadSelector, for example:
workloadSelector:
labels:
notebook-name: mynotebook
@lukemarsden if you are ok with working without auth features, you can also disable Istio RBAC completely (ClusterRBACConfig object). I wouldn't recommend it for any multitenant environment though.
@yanniszark note, KFP api server does not expose configuration today to disable its authz, so it's not enough just disabling istio RBAC completely.
The following config is (finally) working for me. Note that my use case isn't for notebooks, but rather an in-cluster kfp client, so switching the listenerType from GATEWAY to SIDECAR_INBOUND was necessary so that you got the header added on in-cluster traffic as well.
# Create anonymous namespace (profile) in Kubeflow without having to click
# a button on a web page
curl -XPOST http://localhost:31380/api/workgroup/create
# disable Istio RBAC to workaround
# https://github.com/kubeflow/pipelines/issues/4440#issuecomment-697920377
kubectl apply -f - <<EOF
apiVersion: rbac.istio.io/v1alpha1
kind: ClusterRbacConfig
metadata:
name: default
spec:
mode: "OFF"
EOF
# tell kfp that [email protected] even for in-cluster clients
# like pachyderm (listenerType=SIDECAR_INBOUND, not GATEWAY)
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-user-everywhere
namespace: istio-system
spec:
filters:
- filterConfig:
inlineCode: |
function envoy_on_request(request_handle)
request_handle:headers():replace("kubeflow-userid","[email protected]")
end
filterName: envoy.lua
filterType: HTTP
insertPosition:
index: FIRST
listenerMatch:
listenerType: SIDECAR_INBOUND
EOF
# Stop header being added multiple times
kubectl delete envoyfilter -n istio-system add-user-filter
@lukemarsden Great thanks to your comments! This works for me.
In our case, we don't want any auth or RBAC with kfctl_k8s_istio.v1.1.0.yaml deployment. All team members can freely run kfp.Client() to share DGX resource.
Coming a little late, but let me explain my current thoughts.
First, we don't want to ask users to configure istio for RBAC access, istio configs are really brittle and requires a lot of knowledge to use and debug.
Therefore, my ideal setup is like:
X-Forwarded-Client-Cert injected by istio sidecar: https://stackoverflow.com/a/58099997X-Forwarded-Client-Cert should contain auth information like spiffe://cluster.local/ns/<namespace>/sa/<service account>.3.1 If a request comes from istio gateway (probably we can configure this), then KFP api server interprets it as coming from users reading the special header like kubeflow-userid
3.2 If a request comes from other istio mTLS enabled sources, we know which service account initiated the request.
3.3 If a request doesn't have X-Forwarded-Client-Cert, it's not authed by istio, we may develop other ways for auth like providing service account token with pipeline audience
SubjectAccessReview to test if the corresponding user/service account can access a certain resource representing KFP using Kubernetes RBAC.With this setup, access to all KFP resources are backed by Kubernetes RBAC. If users have notebook servers in-cluster with istio sidecar (mTLS enabled), they only need to grant K8s RBAC permissions to those servers' service accounts.
And we should provide an option to disable all of the authz checks, so if it's not useful for an org, they can just disable it.
What are your thoughts?
@yanniszark @yhwang
@Bobgy yes, what you describe is pretty much how I planned to use Istio mTLS for authentication (via the XFCC header).
As for SubjectAccessReview, we plan on delivering it after Kubecon is over.
We have also refactored the authentication code a bit in order to support multiple auth methods (we call them authenticators). We'll be pushing that upstream as well, after SubjectAccessReview.
Does this sound good?
@Bobgy sounds good to me. Let me add some cases here:
@Bobgy and @yanniszark Please let me know anything I can help. I'd love to help on filling the gaps.
@yanniszark I see. I didn't see any explanation how you planned to actually implement this. I am glad it's the same.
@Bobgy sounds good to me. Let me add some cases here:
- user A creates notebook server which is in namespace A. user A adds user B as collaborator into his namespace A. When user B runs code on the notebook server which resides in namespace A, the XFCC header should be the notebook service account in namespace A (i.e. cluster.local/ns/A/sa/default-editor). In this case, KFP api server can only verify the SubjectAccessView against service account of notebook server of namespace A but not user B. is this correct?
Yes, that's right, because user A can also use this notebook server. It should have its own identity.
- Is the RBAC/authorization of istio still enabled in general? if yes, we still need to add corresponding istio RBAC/authorization in order to allow the user to call KFP SDK from notebook server in addition to K8s RBAC. Or istio RBAC/authorization will be turned off?
Istio authz is still enabled, but the KFP manifest will include an istio authz rule to allow any traffic to KFP API server.
@Bobgy and @yanniszark Please let me know anything I can help. I'd love to help on filling the gaps.
Thank you for offering help! @yanniszark are you talking about contributing this implementation after the kubecon in November?
I believe @yanniszark has implemented some of this in minikf fork of KFP. You can ask him if you can help in any way. When he creates the final PRs, welcome some help on review and fixes (if needed).
@Bobgy Thanks!
Yes, that's right, because user A can also use this notebook server. It should have its own identity.
I just wonder if KFP API server needs to know the request is from User A or User B?
For example, User B can access run/experiment in namespace A and B. But User A only can access namespace A.
By using cluster.local/ns/A/sa/default-editor from XFCC header, KFP API server won't be able to know which user sending the request but only notebook server. Then we need to provide a mechanism to allow users to legally specify kubeflow-userid header in KFP SDK. Or this is out of scope?
I just wonder if KFP API server needs to know the request is from User A or User B?
With above design, KFP API Server doesn't need to know. It uses the service account as requester identity. So we won't need SDK method to add kubeflow userid.
With above design, KFP API Server doesn't need to know. It uses the service account as requester identity. So we won't need SDK method to add kubeflow userid.
So the following scenario would fail and we should document it as limitation
User settings:
Scenario:
When User B runs the code kfp.Client.create_run_from_pipeline_func() and specifies namespace=B on the notebook server that User A creates in namespace A, I guess User B would expects he can access the resource in his own namespace. But based on the design, he can't becaue the XFCC is cluster.local/ns/A/sa/default-editor and KFP API Server only allow this identity to access resource in namespace A.
That's right, and I'd rather consider that as expected behavior.
The scenario is based on the assumption, user A/B didn't authenticate as themselves when using the notebook server, therefore, they should only be able to access what the notebook server's service account can access.
User B will still have the choice to use KFP SDK to connect to cluster public endpoint and use his user credentials for authentication. In that case, User B will have access to namespace B, but the notebook server are shared between user A and B, so user B needs to be aware that his credentials may be used by anyone else having access to namespace A (which should be avoided).
With above design, KFP API Server doesn't need to know. It uses the service account as requester identity. So we won't need SDK method to add kubeflow userid.
So the following scenario would fail and we should document it as limitation
User settings:
- User A and he owns namespace A
- User B and he owns namespace B
- User A invites User B as collaborator, so User B can also access namespace A
- User B can access run/experiment in namespace A and B. But User A only can access namespace A.
Scenario:
When User B runs the codekfp.Client.create_run_from_pipeline_func()and specifiesnamespace=Bon the notebook server that User A creates in namespace A, I guess User B would expects he can access the resource in his own namespace. But based on the design, he can't becaue the XFCC iscluster.local/ns/A/sa/default-editorand KFP API Server only allow this identity to access resource in namespace A.
Another scenario (I guess this is more of what you want to support) is that, we have the same users A and B and namespaces A and B and namespace A is shared to User B.
User B uses a notebook server in namespace B and tries to kfp.Client.create_run_from_pipeline_func() for namespace=A.
The notebook server in namespace B (despite User B being the owner and namespace A is shared to User B) won't have access to pipelines in namespace A.
To fix the permission issue, user A should also invite namespace B's default-editor service account as collaborator to namespace A.
So conceptually, in this security model, a service account in the cluster has its own identity different from the user, and access are managed by the identity sending the request.
Arguably, Kubernetes has the same auth model, if you run a notebook in cluster, you cannot use your own account's permissions, the notebook has its service account's permissions.
hey, having an issue applying @yhwang's workaround. I'm still getting the error as if the envoy filter is not being applied:
409 ... HTTP response body: {"error":"Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header." ...
however as far as I can tell I applied the envoy filter correctly ([email protected] is owner of default-profile namespace):
> kubectl -n default-profile get envoyfilters.networking.istio.io add-header -o yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"name":"add-header","namespace":"default-profile"},"spec":{"configPatches":[{"applyTo":"VIRTUAL_HOST","match":{"context":"SIDECAR_OUTBOUND","routeConfiguration":{"vhost":{"name":"ml-pipeline.kubeflow.svc.cluster.local:8888","route":{"name":"default"}}}},"patch":{"operation":"MERGE","value":{"request_headers_to_add":[{"append":true,"header":{"key":"kubeflow-userid","value":"[email protected]"}}]}}}],"workloadSelector":{"labels":{"notebook-name":"test-notebook"}}}}
creationTimestamp: "2020-10-21T15:30:18Z"
generation: 1
name: add-header
namespace: default-profile
resourceVersion: "33518018"
selfLink: /apis/networking.istio.io/v1alpha3/namespaces/default-profile/envoyfilters/add-header
uid: 72d8ce19-9246-431b-b61c-74b3da7b4732
spec:
configPatches:
- applyTo: VIRTUAL_HOST
match:
context: SIDECAR_OUTBOUND
routeConfiguration:
vhost:
name: ml-pipeline.kubeflow.svc.cluster.local:8888
route:
name: default
patch:
operation: MERGE
value:
request_headers_to_add:
- append: true
header:
key: kubeflow-userid
value: [email protected]
workloadSelector:
labels:
notebook-name: test-notebook
What and where should be picking this up/using this? How should I debug?
@yhwang you mentioned something about my kubeflow config:
oh another possibility is that in your kubeflow, the identity header name is not kubeflow-userid. You may double check your kubeflow config.
where do I check this?
Thanks!
@jonasdebeukelaer are you on GCP?
The header should be 'x-goog-authenticated-user-email'
thanks @Bobgy! Got it working now.
Also note the header value has to be formatted like accounts.google.com:<email>
so what is the correct format, i try to follow by https://www.kubeflow.org/docs/pipelines/multi-user/#in-cluster-api-request-authentication, where said: If you need to access the API endpoint from in-cluster workload like Jupyter notebooks or cron tasks, current suggested workaround is to connect through public endpoint
the code i written like this:
pipeline = kfp.Client(host='http://istio-ingressgateway.newbase.com/_/pipeline/?ns=chejinguo').create_run_from_pipeline_func(mnist_pipeline, arguments={})
the host address is accesible by web browser
but reports error in jupyter notebook
HTTP response body: {"error":"Validate experiment request failed.: Invalid input error: Invalid resource references for experiment. Expect one namespace type with owner relationship. Got: []"
@majorinche which Kubeflow deployment are you using? There should be a page on www.kubeflow.org introducing how to authenticate to KFP endpoint specific to your deployment.
@Bobgy
sure!
- The RBAC to allow the notebook server in user's namespace: "mynamespace" to access ml-pipeline service
apiVersion: rbac.istio.io/v1alpha1 kind: ServiceRoleBinding metadata: name: bind-ml-pipeline-nb-mynamespace namespace: kubeflow spec: roleRef: kind: ServiceRole name: ml-pipeline-services subjects: - properties: source.principal: cluster.local/ns/mynamespace/sa/default-editor
- Envoy filter to inject the
kubeflow-useridheader from notebook to ml-pipeline service. In the example below, the notebook server's name ismynotebookand userid for namespace: mynamespace is[email protected]apiVersion: networking.istio.io/v1alpha3 kind: EnvoyFilter metadata: name: add-header namespace: mynamespace spec: workloadSelector: labels: notebook-name: mynotebook configPatches: - applyTo: HTTP_FILTER match: context: SIDECAR_OUTBOUND listener: portNumber: 8888 filterChain: filter: name: "envoy.http_connection_manager" subFilter: name: "envoy.router" patch: operation: INSERT_BEFORE value: # lua filter specification name: envoy.lua config: inlineCode: | function envoy_on_request(request_handle) request_handle:headers():add("kubeflow-userid", "[email protected]") endThe envoy filter above only inject the
kubeflow-useridHTTP header for those traffic going to ml-pipelie service
I tried to apply the same fix but it doesn't work for me somehow:
$ cat servicerolebinding.yaml envoyfilter.yaml
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
name: bind-ml-pipeline-nb-anonymous
namespace: kubeflow
spec:
roleRef:
kind: ServiceRole
name: ml-pipeline-services
subjects:
- properties:
source.principal: cluster.local/ns/anonymous/default-editor
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-header
namespace: anonymous
spec:
configPatches:
- applyTo: VIRTUAL_HOST
match:
context: SIDECAR_OUTBOUND
routeConfiguration:
vhost:
name: ml-pipeline.kubeflow.svc.cluster.local:8888
route:
name: default
patch:
operation: MERGE
value:
request_headers_to_add:
- append: true
header:
key: kubeflow-userid
value: [email protected]
workloadSelector:
labels:
notebook-name: kale
$ kubectl get ns anonymous -oyaml
apiVersion: v1
kind: Namespace
metadata:
annotations:
owner: [email protected]
creationTimestamp: "2020-11-10T11:53:55Z"
...
$ kubectl get ns anonymous --show-labels
NAME STATUS AGE LABELS
anonymous Active 4h44m istio-injection=enabled,katib-metricscollector-injection=enabled,serving.kubeflow.org/inferenceservice=enabled
$ kubectl -n anonymous get po -l notebook-name=kale
NAME READY STATUS RESTARTS AGE
kale-0 2/2 Running 0 3h38m
Output from JupyterLab
Message: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'content-length': '19', 'content-type': 'text/plain', 'date': 'Tue, 10 Nov 2020 16:42:07 GMT', 'server': 'envoy', 'x-envoy-upstream-service-time': '0'})
HTTP response body: RBAC: access denied
I used deployment method for installation on Bare Metal: https://www.kubeflow.org/docs/started/k8s/kfctl-k8s-istio/
Any ideas why it doesn't work for [email protected] ? I already stuck completely with it :(
@swiftdiaries Did I recall correctly you maintain this manifest, can you answer this question?
I would ping the appropriate WG that owns the config. I currently don't have the bandwidth to work on this
@mr-yaky in your ServiceRoleBinding you should change source.principal: cluster.local/ns/anonymous/default-editor to source.principal: cluster.local/ns/anonymous/sa/default-editor
Can you try it?
@yhwang thank you. I have changed how you recommended but now I get the new error:
Message: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Fri, 13 Nov 2020 08:51:29 GMT', 'x-envoy-upstream-service-time': '15', 'server': 'envoy', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Invalid input error: Invalid resource references for experiment. Namespace is empty.","message":"Invalid input error: Invalid resource references for experiment. Namespace is empty.","code":3,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid resource references for experiment. Namespace is empty.","error_details":"Invalid input error: Invalid resource references for experiment. Namespace is empty."}]}
Well, I think now it's working correctly:
jovyan@kale-0:~$ kfp pipeline list
+--------------------------------------+-------------------------------------------------+---------------------------+
| Pipeline ID | Name | Uploaded at |
+======================================+=================================================+===========================+
| 271f4189-1bd3-425a-8b59-213f4a6502b2 | [Tutorial] DSL - Control structures | 2020-10-26T11:58:27+00:00 |
+--------------------------------------+-------------------------------------------------+---------------------------+
| e8989196-9105-41b2-b302-fe7b2a1f92cc | [Tutorial] Data passing in python components | 2020-10-26T11:58:26+00:00 |
+--------------------------------------+-------------------------------------------------+---------------------------+
| 3fb1b41c-dcca-4c12-88cf-cdb602c5c665 | [Demo] TFX - Iris classification pipeline | 2020-10-26T11:58:25+00:00 |
+--------------------------------------+-------------------------------------------------+---------------------------+
| ff55dd05-deb3-40c9-87c3-a8a06871b801 | [Demo] TFX - Taxi tip prediction model trainer | 2020-10-26T11:58:24+00:00 |
+--------------------------------------+-------------------------------------------------+---------------------------+
| 99666886-380b-488b-bd84-d3be3d12b2d8 | [Demo] XGBoost - Training with confusion matrix | 2020-10-26T11:58:23+00:00 |
+--------------------------------------+-------------------------------------------------+---------------------------+
@yhwang thank you. I think the error above is related to Kale already. So, I'll try to fix it on Kale side.
For me trying these workarounds (on KF 1.2) results in Error from server: error when creating ".\\envoy_filter.yaml": admission webhook "pilot.validation.istio.io" denied the request: configuration is invalid: envoy filter: missing filters
@karlschriek Could you post the envoy filter you applied, along with the relevant information about the cluster such as the namespace and notebook name?
@yanniszark Is there any more information regarding the timeline of the upstream push of mTLS and SubjectAccessReview?
This is what I used:
EDIT:
Fixed after @DavidSpek's comment below
export NAMESPACE=mynamespace
export NOTEBOOK=mynotebook
export [email protected]
cat > ./envoy_filter.yaml << EOM
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
name: bind-ml-pipeline-nb-${NAMESPACE}
namespace: kubeflow
spec:
roleRef:
kind: ServiceRole
name: ml-pipeline-services
subjects:
- properties:
source.principal: cluster.local/ns/${NAMESPACE}/sa/default-editor
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-header
namespace: ${NAMESPACE}
spec:
configPatches:
- applyTo: VIRTUAL_HOST
match:
context: SIDECAR_OUTBOUND
routeConfiguration:
vhost:
name: ml-pipeline.kubeflow.svc.cluster.local:8888
route:
name: default
patch:
operation: MERGE
value:
request_headers_to_add:
- append: true
header:
key: kubeflow-userid
value: ${USER}
workloadSelector:
labels:
notebook-name: ${NOTEBOOK}
EOM
@karlschriek You have the kubeflow-userid set to your namespace rather than your userid (such as [email protected]).
Sorry, my bad. I no longer had the script I used this morning so I quickly put something together to answer you, tried it, saw it gave the same result and posted it. Have now fixed it and can confirm that it still gives the same "missing filters" error
I got the same problem that @karlschriek. In a further investigation, I discovered that KF v1.1 and above is using a very outdated istio version (1.1.6) so the EnvoyFilter @yhwang provided is not compatible with this version.
I tried to port the filter to be compatible with version 1.1.6 but it still doesn't work.
```apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-header
namespace: __namepace__
spec:
filters:
Error:
```ApiException: (409)
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Wed, 02 Dec 2020 00:26:05 GMT', 'x-envoy-upstream-service-time': '2', 'server': 'envoy', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header.","message":"Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header.","code":10,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Request header error: there is no user identity header.","error_details":"Failed to authorize with API resource references: Bad request.: BadRequestError: Request header error: there is no user identity header.: Request header error: there is no user identity header."}]}
EDIT:
Very important to mention I'm on AWS.
actually, the envoyfilter I posted here: https://github.com/kubeflow/pipelines/issues/4440#issuecomment-687703390
is based on istio 1.3.1. I upgraded my env to kfp v1.2 today and the envoyfilter still works properly for me.
I've been using the envoyfilter with Kubeflow 1.1 and 1.2 with istio 1.3.1 and have also not had any issues.
@yhwang @DavidSpek thank you for your considerations.
It turns out that the kfctl configuration file I used to install kubeflow doesn't contain the istio-stack-1-3-1 and cluster-local-gateway-1-3-1, therefore kubeflow was installed based on istio 1.1.6.
The configuration file I used was the recommended one to provide authentication via OIDC https://raw.githubusercontent.com/kubeflow/manifests/v1.1-branch/kfdef/kfctl_aws_cognito.v1.1.0.yaml (https://www.kubeflow.org/docs/aws/deploy/install-kubeflow/)
Do you know if the authentication via OIDC/Cognito requires istio 1.1.6? Would update istio mess up with existing kubeflow installation?
@yhwang Is there a plan to add this as a PR to include the servicerolebinding and Envoyfilter to be created every time a new notebook server is created in a user namespace? If there isn't, how do you propose I can solve this? Thanks
@HassanOuda As discussed above the ServiceRoleBinding and EnvoyFilter are workarounds and should not be seen as a secure solution.
https://github.com/kubeflow/pipelines/issues/4440#issuecomment-697317390
The proper implementation will hopefully be pushed upstream by @yanniszark soon.
@pedrocwb I have a similar question about 1.1.6 vs 1.3.1. For me the more relevant case is being able to authenticate from outside the cluster. Currently this requires passing the Cognito cookies. I have managed to get this to work with 1.1.6, but it actually looks like this currently doesn't work with 1.3.1.
Even though I pass the correct cookies, I still get the Request header error: there is no user identity header error. I am going to spend some more time on this today and will give you feedback if I know a bit more.
For our clients the two most important KF components are KFP and KFServing. At the moment we can use KFP with 1.1.6, but not 1.3.1. And only a very old version of KFServing seems to be compatible with 1.1.6.
For reference, SubjectAccessReview has been merged in https://github.com/kubeflow/pipelines/pull/4723. This is available in https://github.com/kubeflow/pipelines/releases/tag/1.2.0 (first available in https://github.com/kubeflow/pipelines/releases/tag/1.1.1-beta.1). However, from looking at https://github.com/kubeflow/pipelines/issues/3513 regarding SubjectAccessReview, it is not clear to me if Istio mTLS support has been added for in-cluster authentication.
Useful documentation: https://docs.google.com/document/d/1R9bj1uI0As6umCTZ2mv_6_tjgFshIKxkSt00QLYjNV4/edit?ts=5e4d8fbb#heading=h.b3vxor3gcdvs
it is not clear to me if Istio mTLS support has been added for in-cluster authentication.
No, it's not added
thanks @yhwang , the suggestion works
@yhwang thank you. I have changed how you recommended but now I get the new error:
Message: (400) Reason: Bad Request HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Fri, 13 Nov 2020 08:51:29 GMT', 'x-envoy-upstream-service-time': '15', 'server': 'envoy', 'transfer-encoding': 'chunked'}) HTTP response body: {"error":"Invalid input error: Invalid resource references for experiment. Namespace is empty.","message":"Invalid input error: Invalid resource references for experiment. Namespace is empty.","code":3,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid resource references for experiment. Namespace is empty.","error_details":"Invalid input error: Invalid resource references for experiment. Namespace is empty."}]}
Hi @mr-yaky,
Would you please share some more details about how to solve the below error?
Message: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Fri, 13 Nov 2020 08:51:29 GMT', 'x-envoy-upstream-service-time': '15', 'server': 'envoy', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Invalid input error: Invalid resource references for experiment. Namespace is empty.","message":"Invalid input error: Invalid resource references for experiment. Namespace is empty.","code":3,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid resource references for experiment. Namespace is empty.","error_details":"Invalid input error: Invalid resource references for experiment. Namespace is empty."}]}
Solved based on
https://github.com/kubeflow-kale/kale/issues/210#issuecomment-727018461
@kosehy my guess is you need to specify the namespace since it complains about empty namespace
@kosehy my guess is you need to specify the namespace since it complains about empty namespace
@yhwang You are right.
I fixed above error based on https://github.com/kubeflow-kale/kale/issues/210#issuecomment-727018461 this comment.
Thank you for your reply!
@yhwang, would appreciate some help from you. I also cannot access pipelines from the notebook
when I run kfp -n kubeflow pipeline list in the terminal I get the following error
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Length': '19', 'Content-Type': 'text/plain', 'Date': 'Thu, 04 Mar 2021 00:50:15 GMT', 'Server': 'istio-envoy', 'X-Envoy-Decorator-Operation': 'ml-pipeline.kubeflow.svc.cluster.local:8888/*'})
HTTP response body: RBAC: access denied
I have tried to add the istio injection for the namespace kubeflow, but it does not work. The update yaml file I used is the following:
```apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: add-header
namespace: kubeflow
spec:
configPatches:
How can I make the namespace able to access the pipeline information? Is it necessary for me to start a notebook server and do the above steps? Can I use the command line to do the same access?
Thanks!
@perseusyang1997 Based on your description, I think you were trying to use kfp CLI to access the kubeflow pipeline via kube-api server. Since the value of kubeflow-userid header you are using is [email protected] (you have an extra s in your envoyfilter yaml), I just wonder are you using single user set up or multi-user? The purpose of the envoyfilter you posted is to add kubeflow-userid header for the out going traffics from the notebook server in a user's namespace to kfp api service. Therefore, it should be applied to the user's namespace where the notebook server is but not kubeflow namespace. And it doesn't help the kfp CLI use case. Short answers for your question are:
Is it necessary for me to start a notebook server and do the above steps?
Yes. And please add the envoyfilter to the same namespace as the notebook server.
Can I use the command line to do the same access?
Yes and No. Using kfp CLI to access kubeflow pipeline is different. If you don't specify the --endpoint argument pointing to your kfp api url, it goes through the kube-api server and use it as a proxy to access the kfp api server. In this case you, you will hit the RBAC access error. If you do specify the kfp api url and your kubeflow is deployed on GCP, you should be able to use the kfp CLI to access kubeflow pipeline. You can check the document here: https://www.kubeflow.org/docs/gke/pipelines/authentication-sdk/
@perseusyang1997 Please be aware that this workaround is not actually secure.
@yanniszark will this be fixed in the 1.3 release?
hello guys, I'm working on AWS and the RBAC and Envoy filter fixed my problems when I was using Kubeflow without Cognito, but now I'm changing the deployment in order to use Cognito as auth and as mentioned above by @pedrocwb and @karlschriek when trying to apply Envoy filter I have the following return:
Error from server: error when creating "envoy.yaml": admission webhook "pilot.validation.istio.io" denied the request: configuration is invalid: envoy filter: missing filters
@MatheusPush Not sure if it is related, but the maximum Istio version for Cognito is 1.1 I believe.
@Bobgy
sure!
- The RBAC to allow the notebook server in user's namespace: "mynamespace" to access ml-pipeline service
apiVersion: rbac.istio.io/v1alpha1 kind: ServiceRoleBinding metadata: name: bind-ml-pipeline-nb-mynamespace namespace: kubeflow spec: roleRef: kind: ServiceRole name: ml-pipeline-services subjects: - properties: source.principal: cluster.local/ns/mynamespace/sa/default-editor
- Envoy filter to inject the
kubeflow-useridheader from notebook to ml-pipeline service. In the example below, the notebook server's name ismynotebookand userid for namespace: mynamespace is[email protected]apiVersion: networking.istio.io/v1alpha3 kind: EnvoyFilter metadata: name: add-header namespace: mynamespace spec: workloadSelector: labels: notebook-name: mynotebook configPatches: - applyTo: HTTP_FILTER match: context: SIDECAR_OUTBOUND listener: portNumber: 8888 filterChain: filter: name: "envoy.http_connection_manager" subFilter: name: "envoy.router" patch: operation: INSERT_BEFORE value: # lua filter specification name: envoy.lua config: inlineCode: | function envoy_on_request(request_handle) request_handle:headers():add("kubeflow-userid", "[email protected]") endThe envoy filter above only inject the
kubeflow-useridHTTP header for those traffic going to ml-pipelie service
@yhwang Hi Yh, I have a single user kubeflow setup, in this case, what should be the value for kubeflow-userid
@omlomloml
I have a single user kubeflow setup, in this case, what should be the value for kubeflow-userid
It should be [email protected]
@omlomloml
I have a single user kubeflow setup, in this case, what should be the value for kubeflow-userid
It should be
[email protected]
@yhwang
Thank you so much!
@lukemarsden also provided slightly different work around at https://github.com/kubeflow/pipelines/issues/4440#issuecomment-700759162
can you guys explain what is the difference between these to work arounds
Thanks
I am a newbee here
@omlomloml Please read this comment as it also explains why these workarounds are not secure. The proper solution should be included in release 1.3 which is soon. https://github.com/kubeflow/pipelines/issues/4440#issuecomment-697317390
@yanniszark Is there anything more than needs to happen to solve this issue for 1.3? Or was everything regarding the SubjectAccessReview already merged?
@yhwang @DavidSpek
Hi guy,
after I applied the binding and the filter I still can't get it work, did I do anything wrong, I am not using the notebook here, so I am adding the filter to all the workload
here is the bingding:
root@metis1-1:~# kubectl -n kubeflow get ServiceRoleBinding bind-ml-pipeline-metis -o yaml
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"rbac.istio.io/v1alpha1","kind":"ServiceRoleBinding","metadata":{"annotations":{},"name":"bind-ml-pipeline-metis","namespace":"kubeflow"},"spec":{"roleRef":{"kind":"ServiceRole","name":"ml-pipeline-services"},"subjects":[{"properties":{"source.principal":"cluster.local/ns/metis/sa/default"}}]}}
creationTimestamp: "2021-03-10T15:01:52Z"
generation: 1
managedFields:
and here is the filter:
root@metis1-1:~# kubectl -n metis get envoyfilters.networking.istio.io add-header -o yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"name":"add-header","namespace":"metis"},"spec":{"configPatches":[{"applyTo":"VIRTUAL_HOST","match":{"context":"SIDECAR_OUTBOUND","routeConfiguration":{"vhost":{"name":"ml-pipeline.kubeflow.svc.cluster.local:8888","route":{"name":"default"}}}},"patch":{"operation":"MERGE","value":{"request_headers_to_add":[{"append":true,"header":{"key":"kubeflow-userid","value":"[email protected]"}}]}}}]}}
creationTimestamp: "2021-03-10T15:23:13Z"
generation: 1
managedFields:
but I can't see anything is added
root@metis-backend-857fc5b98d-c4qc7:/metis/metis/aix# curl -I http://ml-pipeline.kubeflow.svc.cluster.local:8888
HTTP/1.1 403 Forbidden
content-length: 19
content-type: text/plain
date: Wed, 10 Mar 2021 15:33:45 GMT
server: istio-envoy
x-envoy-decorator-operation: ml-pipeline.kubeflow.svc.cluster.local:8888/*
Thanks!
For people tracking this issue, the correct solution will come from issue: https://github.com/kubeflow/pipelines/issues/5138
@yanniszark in 1.3 release?
Most helpful comment
@Bobgy
sure!
kubeflow-useridheader from notebook to ml-pipeline service. In the example below, the notebook server's name ismynotebookand userid for namespace: mynamespace is[email protected]The envoy filter above only inject the
kubeflow-useridHTTP header for those traffic going to ml-pipelie service