I followed the guide to run pipeline examples.
https://www.kubeflow.org/docs/guides/pipelines/pipelines-quickstart/

Because I want to run this example in my local environment, I replaced the GCS (gs://) paths with my local directories like this (I downloaded these files in advance):


Then I ran this code:

So I think the files are in the right place, and I need to do something else to run this example.
My tentative guess is that I should replace the "GcsUri", but I don't know what to replace it with.

So I hope someone can help me. I would really appreciate it!
(1) Can the ML pipeline examples run in a local Kubeflow environment (without GCP)?
(2) If the answer is yes, how do I modify the code to use local files?
ML pipeline is a cross-platform product and can certainly be deployed and run in local environments. However, the file is not detected because each component is a containerized operator. In other words, the code can only access files inside the container.
There are two options:
1) Copy your files to some place (your own cloud, a file system) that the Kubernetes container has access to.
or
2) Build a new image containing your local files.
It's a known issue that most of the current ML components assume the input paths are GCS paths. We are working on a solution to pass artifacts to containers in a cloud-provider-agnostic way.
Though it's not tested yet, the container code should be able to work with a local path if the file is accessible in the container. Besides Ning's suggestion, if you just want to make it work locally, you might try using a hostPath volume to make your local files visible to the container. You can use add_volume and add_volume_mount to add a hostPath volume to the ContainerOp, and then change the file path to the mounted path.
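To make the hostPath suggestion concrete, here is a minimal sketch of the volume and mount entries that would end up in a step's pod spec, written as plain Python dicts. The volume name `local-data`, the host directory `/home/user/data`, and the mount path `/mnt/data` are hypothetical and do not come from this thread. In the pipeline DSL, the equivalent objects would be the ones passed to add_volume and add_volume_mount. Keep in mind that a hostPath volume only exposes files on the node where the pod is scheduled.

```python
import json

# Hypothetical names and paths, chosen only for illustration.
VOLUME_NAME = "local-data"
HOST_DIR = "/home/user/data"   # directory on the Kubernetes node
MOUNT_PATH = "/mnt/data"       # where the container sees that directory


def host_path_volume(name, host_dir):
    """Pod-level volume entry backed by a directory on the node."""
    return {"name": name, "hostPath": {"path": host_dir}}


def volume_mount(name, mount_path):
    """Container-level mount; `name` must match the volume's name."""
    return {"name": name, "mountPath": mount_path}


def container_path(host_file, host_dir, mount_path):
    """Translate a file path on the host into the path seen in the container."""
    base = host_dir.rstrip("/")
    if not host_file.startswith(base + "/"):
        raise ValueError(f"{host_file} is not under {host_dir}")
    relative = host_file[len(base) + 1:]
    return mount_path.rstrip("/") + "/" + relative


volume = host_path_volume(VOLUME_NAME, HOST_DIR)
mount = volume_mount(VOLUME_NAME, MOUNT_PATH)

# The pipeline argument should point at the mounted path, not the host path.
train_path = container_path(HOST_DIR + "/train.csv", HOST_DIR, MOUNT_PATH)

print(json.dumps({"volumes": [volume], "volumeMounts": [mount]}, indent=2))
print(train_path)
```

The volume and the mount are linked only by their shared name, so a mismatch between the two is a likely reason for a mount that silently fails to appear.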
@gaoning777 I followed your advice and built a new image. It works, but it doesn't solve the problem completely, because the second container can't find the first container's output.
So I want to mount a volume in the ContainerOp.
https://github.com/kubeflow/pipelines/issues/477
I found that you gave an example of how to mount a volume with both add_volume and add_volume_mount, but the page no longer exists. Could you post the example again if you can find it?
@hongye-sun
Hi, I really appreciate your advice. I followed it and used add_volume and add_volume_mount, but it didn't work.
So I ran an example that mounts a volume in a ContainerOp to test how to use add_volume and add_volume_mount. I hope you can give me some more advice if you know what is wrong.
First, I created a PersistentVolume using:

and it succeeded.

Then I ran an example. The code is here:

The result is here:
image

It seems that I didn't mount the volume in the ContainerOp successfully, and the status of the PV (tfx-pv) is always Available.

(PS: I created '/nfs-data/tfx-pv/train.csv' in advance.)
I have no idea how to solve this now. I hope you can give me some advice. Thanks!
I did that successfully before. I created the PV/PVC first, and then edited the sample taxi-cab-classification-pipeline.py to attach the PVC. I copied the related data, such as train.csv, to the local storage before running the TFX sample.
@jinchihe Thanks for your comment.
I followed your steps, but I didn't succeed. By the way, I only created the PV. Should I create both a PV and a PVC?
I'm new to k8s, and I would really appreciate it if you could describe in detail how to create the PVC and attach it. Thanks!
@zoux86 Yes, I think you should create both the PV and the PVC manually. There is no special configuration in the PVC definition file.
# kubectl describe pvc pipeline-pvc -n kubeflow
Name: pipeline-pvc
Namespace: kubeflow
StorageClass:
Status: Bound
Volume: pipeline-pv
Labels: <none>
Annotations: pv.kubernetes.io/bind-completed=yes
pv.kubernetes.io/bound-by-controller=yes
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 10Gi
Access Modes: RWX
Events: <none>
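A PV/PVC pair that would bind like the one described above can be sketched as follows. The names pipeline-pv and pipeline-pvc, the kubeflow namespace, the 10Gi capacity, and the RWX (ReadWriteMany) access mode are taken from the output above. The NFS server address and export path are placeholders, and the empty storageClassName is an assumption read off the blank StorageClass field. A PV stays in the Available state until a claim binds to it, which is why creating only the PV is not enough.

```python
import json

# Placeholders, not from this thread: replace with your own NFS server and export.
NFS_SERVER = "nfs.example.internal"
NFS_PATH = "/exports/pipeline"


def persistent_volume(name, size, server, path):
    """Cluster-wide PersistentVolume backed by an NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",  # assumption: no storage class, static binding
            "nfs": {"server": server, "path": path},
        },
    }


def persistent_volume_claim(name, namespace, size, volume_name):
    """Namespaced claim pinned to a specific PV through volumeName."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",
            "volumeName": volume_name,
            "resources": {"requests": {"storage": size}},
        },
    }


pv = persistent_volume("pipeline-pv", "10Gi", NFS_SERVER, NFS_PATH)
pvc = persistent_volume_claim("pipeline-pvc", "kubeflow", "10Gi", pv["metadata"]["name"])

# kubectl accepts JSON as well as YAML, so both objects go into one List document.
manifest = {"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and passing it to kubectl apply -f should create both objects. After that, kubectl describe pvc pipeline-pvc -n kubeflow is expected to report Status: Bound.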
@jinchihe Thank you very much. I finally ran the example successfully.
I created the PVC, and I found that in my environment I should use an NFS volume rather than hostPath.
I will close the question.
@jinchihe Thanks for explaining the volume mount steps. We will add instructions on how to mount volumes and share them among components.
@gaoning777 I would like to discuss this in #721, thanks.