Pipelines: Creating runs via kfp does not respect pipeline versions

Created on 26 Oct 2020  路  9Comments  路  Source: kubeflow/pipelines

What steps did you take:

  1. Create a pipeline
  2. Upload multiple versions to the same pipeline with different manifests (let's call the versions A, B, C, and their respective manifests A', B', C'. First created version is A, official default version is C)
  3. Create run for pipeline via python SDK (tried without and without explicitly specifying the version to run)
  4. View run on UI, and introspect run object via python SDK

What happened:

Created run for pipeline, explicitly specifying version C. Run object showed pipeline version C, but manifest attached to that run object was A'. Confirmed via UI that pipeline ran incorrect version.

As a sanity check, I kicked off version C entirely via UI (not python SDK), and that ran the correct version.

What did you expect to happen:

Expected version manifest to match the pipeline version.

Environment:

Python 3.7, see KFP versions below.

How did you deploy Kubeflow Pipelines (KFP)?
AI Pipelines - managed.

KFP version: 743746b
KFP SDK version: 1.0.1

Anything else you would like to add:

/kind bug
/area backend
/area sdk

arebackend aredocs aresdk good first issue kinbug prioritp1

All 9 comments

Hi @quanjielin ! Thank you for the report!
Can you share the python script you used to create the run?
I suspect there's something wrong there, you can open browser console and see the request UI sends to backend as a reference.

Unfortunately I cannot share the full code, but we specifically kick-off a new run with the following code:

import kfp
cl = kfp.Client(host=cfg.KUBEFLOW_HOST)

# Get arg pipeline_id, and a params dict of the arguments the pipeline expects as input

pipeline = cl.get_pipeline(pipeline_id)
run = cl.run_pipeline(
        experiment_id, run_name, pipeline_id=pipeline_id, params=params, version_id=pipeline.default_version.id
    )

The UI does to do the correct thing with versions, but this snippet using the SDK does not.

If I then look at the run object returned, I see that the version ID on the run object is pipeline.default_version.id, but the manifest attached to the run ID doesn't match the manifest of that version.

That looks like a confusing behavior on that method.

You should not pass pipeline id when version id is already specified, when you pass both, pipeline id is taking effect.
https://github.com/kubeflow/pipelines/blob/e3992faf8305570ca56c5932d701eb8765305568/sdk/python/kfp/_client.py#L585

The behavior is kind of confusing, we should raise an error if both ids are specified.

Hey, is anyone working on this? I can make a PR.

@vipul-sharma20 That would be great! Thank you for offering help!

Hmm, in fact, I think backend behavior for taking pipeline ID with priority is the more problematic behavior than SDK allowing two values specified at the same time

https://github.com/kubeflow/pipelines/blob/5f992f5d06a33e85bc53b8531cfc2cbf1f1faa87/backend/src/apiserver/resource/resource_manager_util.go#L244

I believe backend already prefers version id just reading its code. So I think the easiest short term fix is let SDK not specify pipeline id when version id is present.

A better long term fix is to figure out why backend still prefers pipeline id (maybe in some cases)

I am not sure, but https://github.com/kubeflow/pipelines/blob/5f992f5d06a33e85bc53b8531cfc2cbf1f1faa87/backend/src/apiserver/resource/resource_manager.go#L327 looks a little suspicious. When we cannot fetch the version manifest, we'll use pipeline manifest while attributing it to the version. This should be rare, but it could be the problem for the original issue

I tried to reproduce the bug with master branch whose head is dc5386e801cafe0f4bb50d24b6f2979c46269b82
But I can't reproduce the bug and I checked the code https://github.com/kubeflow/pipelines/blob/7a66414cf72ba5c746cac7fc6c8ecad67fc5e885/backend/src/apiserver/resource/resource_manager.go#L317
The logic follows the comments of kfp sdk:
If both pipeline_id and version_id are specified, version_id will take precendence. If only pipeline_id is specified, the default version of this pipeline is used to create the run.
And I verified the behavior is correct.

I think the bug existed before but it already got fixed.
Here is the previous logic https://github.com/kubeflow/pipelines/blob/743746b96e6efc502c33e1e529f2fe89ce09481c/backend/src/apiserver/resource/resource_manager.go#L248
For the previous logic, it would try to get workflow manifest from pipeline spec first, if it can't get get workflow manifest from pipeline spec, it would try to get workflow manifest from PipelineVersion.

I think we can close this issue since this bug already got fixed in master branch

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Bobgy picture Bobgy  路  4Comments

talhairfanbentley picture talhairfanbentley  路  5Comments

Bobgy picture Bobgy  路  3Comments

discordianfish picture discordianfish  路  4Comments

rcleere picture rcleere  路  3Comments