Pipeline: Design PipelineResource extensibility

Created on 10 Nov 2018 · 23 Comments · Source: tektoncd/pipeline

Expected Behavior

We should have clear extension points in the Pipeline CRD system. These points should be well documented, clear, and tested.

Ideally it will be possible to extend the system without having to modify the existing controller binaries or CRD definitions.

Actual Behavior

At the moment, a user who wants to extend the Pipeline has these options:

They want to create their own kind of Resource (e.g. they use something like Mercurial, but we currently only support Git) = They have to modify and redeploy the TaskRun controller @_@ (and maybe the PipelineRun controller too)
They want to create their own kind of Task = #215 + ducktyping should allow for alternative Task implementation
They want to run xyz specific logic = They can put any logic they want inside a task

In the worst case, the last option _should_ allow a user to do pretty much anything they want, outside of the functionality provided by this system.

Additional Info

n/a

design

Source

bobcatfish

Most helpful comment

Here is one proposal, which I am unreasonably attached to:

Similar to how Concourse uses Resources as an extension point, we could also use Resources as an extension point.

We add a Resource controller, which only pays attention to Resource types it knows about and ignores the rest
When the Resource controller sees a Resource it knows about, it will make that Resource available via a PVC (e.g. if it's a Git resource check out the repo into a volume mount) or via some string data in the Status of the resource (e.g. for an image we just need the url + digest)
If a user wants to add their own kind of Resource, they can add a controller that knows how to handle that Resource type
When the Resource is used as an output, maybe this is updated via some fields in the spec, which the controller can react to?

Cons:

This interferes with the most appealing option for #200, which is to stop using CRDs for Resources
This requires the system we deploy to to support PVCs. Not sure if this is a problem or not; @aaron-prindle and I have been trying to avoid solutions to #224 that require PVCs
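
To make the dispatch idea in this proposal concrete, here is a minimal sketch in plain Python rather than the project's Go. Every name in it (`Resource`, `ImageResourceController`, `MercurialResourceController`, `reconcile_all`, the `volumeClaim` status key) is hypothetical and only models the behavior described above: each controller reacts to the types it knows and ignores the rest, so a new type can be supported by adding a controller instead of modifying an existing one.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for a Resource object (not the real CRD)."""
    name: str
    type: str
    params: dict = field(default_factory=dict)
    status: dict = field(default_factory=dict)


class ImageResourceController:
    """Handles only 'image' resources; everything else is left untouched."""
    handled_type = "image"

    def reconcile(self, resource: Resource) -> bool:
        if resource.type != self.handled_type:
            return False  # not ours: some other controller may claim it
        # For an image, string data in the status is enough (url + digest).
        resource.status = {
            "url": resource.params["url"],
            "digest": resource.params.get("digest", ""),
        }
        return True


class MercurialResourceController:
    """A user-supplied controller for a type the core system does not know."""
    handled_type = "mercurial"

    def reconcile(self, resource: Resource) -> bool:
        if resource.type != self.handled_type:
            return False
        # Assumption: the checkout is exposed through a volume claim named here.
        resource.status = {"volumeClaim": f"{resource.name}-pvc"}
        return True


def reconcile_all(controllers, resource):
    """Offer the resource to every controller; return the names of those that handled it."""
    return [type(c).__name__ for c in controllers if c.reconcile(resource)]
```

With only `ImageResourceController` installed, a `mercurial` resource is ignored and its status stays empty; once the user adds `MercurialResourceController` to the list, the same resource gets handled without any change to the first controller.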

bobcatfish on 10 Nov 2018

👍3

All 23 comments

Seems reasonable at first glance - I'm putting together some thoughts on extensibility based on some design work here (i.e., we're establishing what we want to be able to accomplish to do something like automatic quality gates on Tasks), and will try to get that written up to figure out how that would fit in with this approach.

abayer on 12 Nov 2018

We will probably need to have a PVC anyways to share outputs from one taskRun (in a pod) with another taskRun in the same pipeline (in a different pod)

nader-ziada on 13 Nov 2018

> This requires the system we deploy to support PVCs

last time i was playing with azure's kubernetes service attaching volume to a node was taking ~2min whereas on gke it was only 10s. i think if persistent volumes are used then we should figure out how to deal with long attachment times (or how to avoid them entirely if that's possible).

cppforlife on 14 Nov 2018

@pivotal-nader-ziada would downloading the output from a result store be a viable option?

tejal29 on 14 Nov 2018

@tejal29 you mean something like gcs or s3 for sharing artifacts between tasks? Yes, this would be an option, but it's one extra thing users have to set up and have ready. Based on @cppforlife's comment and my own investigations, I want to avoid using PVCs, but I'm still investigating the best options to discuss with everyone

nader-ziada on 14 Nov 2018

Something worth discussing - what exactly do we want the extensions to be able to do? Some things definitely map directly to Resources - different SCM systems, different artifact storage, etc. But is that actually the only thing we want to be able to extend? For example, there's notifications (i.e., https://github.com/knative/build-pipeline/issues/49) - obviously, we can't hardcode in every notification system someone might want to use, so we need there to be a way to have extensions provide additional notification systems. But are those Resources?

abayer on 14 Nov 2018

I think of resources as the input and outputs of the task, so notifications can be expressed as output of the task. If we design the extension point with enough flexibility, the resource should be responsible for taking the result of the task and doing what it wants with it including sending it to slack for example.

nader-ziada on 14 Nov 2018

Ok, so in cases like notifications, we wouldn't be using PVC, right?

abayer on 15 Nov 2018

Yes, we would not be using PVC for the notifications, we are trying to minimize any use of PVC in general for performance reasons

nader-ziada on 15 Nov 2018

👍1

It would be helpful to lay out some more examples of what we want to be extensible in the first place - notification mechanisms and SCM checkout processes are pretty clear to me, but I'm not sure what else is being considered here?

abayer on 15 Nov 2018

> last time i was playing with azure's kubernetes service attaching volume to a node was taking ~2min whereas on gke it was only 10s. i think if persistent volumes are used then we should figure out how to deal with long attachment times (or how to avoid them entirely if that's possible).

Great point @cppforlife , thanks for sharing those stats!

> Something worth discussing - what exactly do we want the extensions to be able to do?

@abayer that's a great point - we should try to be clear about what we want users to be able to extend. It's hard to know in advance though :S

Definitely we want them to be able to provide their own:

Places to get data from (e.g. images, version control systems)
Places to put data (e.g. registries, repositories)

(Maybe what I'm doing here is defining what a Resource is ... lol as it slowly becomes a Concourse resource minus polling...)

I created this doc for brainstorming - knative-dev@ should have edit access.

> For example, there's notifications (i.e., #49) - obviously, we can't hardcode in every notification system someone might want to use, so we need there to be a way to have extensions provide additional notification systems. But are those Resources?

Depends on how we design notifications! They could be viewed as a type of resource (something you put data to) like @pivotal-nader-ziada described or we could make them a top level thing in the system, e.g. we have Resources, Tasks, Pipelines + Notifications. I can see either working well.

bobcatfish on 16 Nov 2018

Another note re. using PVCs, I realized the design I proposed in https://github.com/knative/build-pipeline/issues/238#issuecomment-437549025 has a big flaw: if multiple Pipelines, or even Tasks within a Pipeline, try to use the same Resources as inputs, they'd be mounting in the same PVC, which seems like it either wouldn't work or would be asking for some terrible side effects.

Here's an updated proposal (specifically for extending Resources) which I think is better:

We add a Resource controller, which only pays attention to Resource types it knows about and ignores the rest
When the Resource controller sees a Resource it knows about, ~it will make that Resource available via a PVC (e.g. if it's a Git resource check out the repo into a volume mount)~ it will write, in the Resource's status section, the spec for a corev1.Container (i.e. the same way that build specifies steps) that knows how to make that Resource available in a location expected by the Task (maybe the workspace path?)
If a user wants to add their own kind of Resource, they can add a controller that knows how to handle that Resource type
When the Resource is used as an output, maybe this is updated via some fields in the spec, which the controller can react to?
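
A rough sketch of the updated proposal, again in plain Python with dicts standing in for Kubernetes objects. The function names, the `container` status key, the image name, the argument list, and the default revision are all assumptions made for illustration; only the overall shape (the controller writes a container spec into the status, and that container is run ahead of the Task's own steps) comes from the proposal.

```python
def git_fetch_container(resource_name, url, revision, workspace="/workspace"):
    """Build a dict shaped like a corev1.Container that clones a repo into the workspace.

    The image name and the argument list are placeholders, not a real image or CLI.
    """
    return {
        "name": f"fetch-{resource_name}",
        "image": "example.com/hypothetical/git-fetcher",
        "args": ["--url", url, "--revision", revision,
                 "--dest", f"{workspace}/{resource_name}"],
    }


def reconcile_git_resource(resource):
    """Write the container spec into the resource's status instead of filling a PVC."""
    if resource.get("type") != "git":
        return resource  # unknown type: leave it for another controller
    params = resource["params"]
    container = git_fetch_container(
        resource["name"], params["url"], params.get("revision", "master")
    )
    return {**resource, "status": {"container": container}}


def steps_for_task(input_resources, task_steps):
    """Prepend each input's fetch container to the Task's own steps."""
    fetch = [r["status"]["container"]
             for r in input_resources
             if "container" in r.get("status", {})]
    return fetch + list(task_steps)
```

Because every run gets its own copy of the fetch step, two Tasks that use the same Resource as an input no longer share a mounted volume, which is the flaw the PVC-based version had.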

bobcatfish on 16 Nov 2018

Relevant comment from @mattmoor in slack:

> I am wondering if PipelineResource would benefit from being a duck type instead of a concrete type...
> e.g. right now you bake in type:, which isn't particularly extensible: https://github.com/knative/build-pipeline/blob/0d6a3a27a05ba1a765b6f28a434a2c76baf883ca/examples/pipelines/kritis-resources.yaml#L7

bobcatfish on 20 Nov 2018

Okay so I'm feeling pretty good about this plan, what do folks think:

For both PipelineResources and Tasks we allow extensibility by using duck typing, i.e. defining an interface which they must each comply with.

For PipelineResources, we'll define controllers that know what types of Resources they can handle.
For Tasks, we already provide a generic TaskRun controller; the key will be to update PipelineRuns so that they know how to create other Run types for other Task types (e.g. was talking with @balopat about having a SkaffoldTask)
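
One way to picture the duck-typing plan for Tasks, sketched in plain Python. The `TaskLike` interface, the `create_run` method, and the `SkaffoldTaskRun` kind are invented for this illustration and are not part of any existing API; the point is only that the PipelineRun logic depends on the interface and never on a concrete Task type.

```python
from typing import Iterable, List, Protocol


class TaskLike(Protocol):
    """Hypothetical interface that every Task-like type must comply with."""

    def create_run(self, pipeline_run: str) -> dict: ...


class Task:
    """The built-in Task type, which yields a TaskRun."""

    def __init__(self, name: str):
        self.name = name

    def create_run(self, pipeline_run: str) -> dict:
        return {"kind": "TaskRun", "name": f"{pipeline_run}-{self.name}"}


class SkaffoldTask:
    """An alternative Task type; it shares no base class with Task."""

    def __init__(self, name: str):
        self.name = name

    def create_run(self, pipeline_run: str) -> dict:
        # The Run kind is an assumption; its own controller would pick it up.
        return {"kind": "SkaffoldTaskRun", "name": f"{pipeline_run}-{self.name}"}


def start_pipeline_run(pipeline_run: str, tasks: Iterable[TaskLike]) -> List[dict]:
    """Create one Run per Task without inspecting the Task's concrete type."""
    return [task.create_run(pipeline_run) for task in tasks]
```

Calling `start_pipeline_run("release", [Task("build"), SkaffoldTask("deploy")])` yields one `TaskRun` and one `SkaffoldTaskRun` description, and a third Task type could be added without touching `start_pipeline_run`.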

bobcatfish on 29 Nov 2018

👍1

Note to self: in #414 we are adding rules around which types of resources produce outputs - something to take into account in a design where resources are extensible :D

bobcatfish on 22 Jan 2019

Hi @bobcatfish :wave:

We would like to know how this issue is coming along. Have we started working on the controller implementation for PipelineResource, or are we expecting this design to evolve more?

> When the Resource is used as an output, maybe this updated via some fields in the spec, which the controller can react to?

Also, would you mind elaborating on this point? :thinking:

hrishin on 29 Apr 2019

> We would like to know, how this issue is coming along? have we started working on controller implementation for PipelineResource or we are expecting this design to evolve more?

No implementation yet! I'm going to try to flesh out the design a bit further asap :D

bobcatfish on 1 May 2019

Design doc focusing on PipelineResource based extensibility available at: https://docs.google.com/document/d/12SPwIHZpERbFGroQulu3A16Sp-sPODAGP8b78-3WQDs

(I realized I actually don't have a clear idea of how to design custom Tasks, so if anyone wants to take on that design work, that's still available!)

bobcatfish on 7 May 2019

I'm gonna downscope this issue to be just about PipelineResource extensibility, and we can use #215 for Task extensibility!

bobcatfish on 8 May 2019

Relevant discussion here: https://tektoncd.slack.com/archives/CJ62C1555/p1560974129126900

It would be nice if there was a way to specify a file/directory or even just a simple text value to be passed between tasks without needing to create a volume across tasks. Maybe even a non-typed PipelineResource?

AdityaGupta1 on 19 Jun 2019

@vdemeester @pmorie what are the next steps for the design here? I think folks are pretty happy with the design as proposed and we could close this issue, and move into implementing it, what do you think?

bobcatfish on 14 Sep 2019

I'm gonna close this! @vdemeester @pmorie can we make some follow up issues to actually implement this? Happy to help if needed.

bobcatfish on 19 Sep 2019
