Skaffold: Skaffold sync failure

Created on 12 Apr 2019 · 28 comments · Source: GoogleContainerTools/skaffold

I am running Skaffold v0.24.0. I am using skaffold sync (with the same sync configuration as in v0.26.0), and it is failing with exit status 2.

Expected behavior

I should be able to sync the modified file directly to the running pod.

Actual behavior

Returning exit code 2.

level=warning msg="Skipping deploy due to sync error: copying files: exit status 2"

Information

  • Skaffold version: v0.24.0
  • Operating system: Linux
  • Contents of skaffold.yaml:
apiVersion: skaffold/v1beta6
kind: Config
build:
  artifacts:
  - image: someurl.com/kube/pythonserver
    context: .
    sync:
      '*.py': .
    kaniko:
      buildContext:
        localDir: {}
      namespace: prashant
      image: someurl.com/gkp/kaniko-executor:latest
      dockerConfig:
        path: C:\Users\prashant\.docker\config.json
deploy:
  kubectl:
    manifests:
    - deploy.yml

Steps to reproduce the behavior

  1. Make a sample hello-world Python application.
  2. Create a Dockerfile for it, then build and deploy the application using Kaniko and Skaffold in dev mode.
  3. Change "Hello world" to "Hello New world" in the Python application.
  4. Skaffold catches the change and tries to copy the file to the running pod.
  5. Skaffold fails to copy the changed file.

Here is the Dockerfile for my application

FROM someurl.net/lrh7:latest
RUN yum install tar -y
COPY app.py .
EXPOSE 8080
CMD [ "python", "app.py" ]

https://groups.google.com/d/msg/skaffold-users/nGPcn6uypq8/oBciDgE5CQAJ

area/sync kind/documentation

Most helpful comment

Okay - after rereading the posts - my first thought would be to make the files/dirs writable by user 199 for development, rather than iterating as root in dev. If you iterate as the root user, you might get access to files that are not realistic and get surprised later... Explicitly chmod-ing the files/dirs for sync seems like a smaller-scale divergence between dev and prod.

All 28 comments

@prary please try a later version of Skaffold: I think you're hitting #1721 which was fixed with #1722 for v0.25.0.

On the mailing list, @prary showed:

time="2019-04-11T10:26:34Z" level=debug msg="Running command: [kubectl exec simplehttp-6bffcd784c-dt8xq --namespace prashant -c simplehttp -i -- tar xmf - --no-same-owner]"
time="2019-04-11T10:26:36Z" level=warning msg="Skipping deploy due to sync error: copying files: exit status 2"

Note that the tar invocation is missing -C /.

Hi @briandealwis,
I have applied that missing change as well but am still facing the issue; I will share the logs for it shortly.

Hi @briandealwis
here are the running logs attached.

time="2019-04-12T06:11:27Z" level=debug msg="Running command: [kubectl exec simplehttp-68cd855688-hfzz6 --namespace prashant -c simplehttp -i -- tar xmf - -C / --no-same-owner]"

time="2019-04-12T06:11:28Z" level=warning msg="Skipping deploy due to sync error: copying files: exit status 2"

@prary I'm guessing from your FROM line that your base image is RHEL7, so I tried reproducing using centos:7 and a [simple Python HTTP server example](https://www.acmesystems.it/python_http), but using a Docker build from Skaffold v0.26.0. And it worked fine:

INFO[0051] files modified: [app.py]                     
DEBU[0052] Checking base image centos:7 for ONBUILD triggers. 
DEBU[0052] Found dependencies for dockerfile: [app.py]  
Syncing 1 files for someurl.com/kube/pythonserver:2a2c25944f5c95f4855bcda825287179864d467afc598b65ee02a6336b21871d
INFO[0052] Copying files: map[app.py:/app.py] to someurl.com/kube/pythonserver:2a2c25944f5c95f4855bcda825287179864d467afc598b65ee02a6336b21871d 
DEBU[0052] Running command: [kubectl exec web-5498d4b54b-v7mdq --namespace default -c web -i -- tar xmf - -C / --no-same-owner] 
Watching for changes every 1s...

My suspicion is that your built container image isn't quite right. What happens if you try running the image locally with something like:

$ docker run --rm --entrypoint /bin/sh someurl.com/kube/pythonserver \
   -c "tar xmf - -C / --no-same-owner" < /dev/null

What happens if you use a local docker builder instead of Kaniko?

Hi @briandealwis ,

This is the error I get when running the above command:

[a_ansible@master1 out]$ sudo docker run --rm --entrypoint /bin/sh 273de95a8e5f -c "tar xmf - -C / --no-same-owner" < /dev/null
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors

Does this work too?

docker run --rm 273de95a8e5f tar xmf - -C / --no-same-owner < /dev/null

And what happens if you try running the same kubectl exec as Skaffold is trying to do?

(Oh, can you show us your deploy.yml file too?)

And just to confirm, is this image 273de95a8e5f the same Kaniko-built image from Skaffold, or is it something you built locally with docker build .?

My deploy.yaml file
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: simplehttp
spec:
  selector:
    matchLabels:
      app: simplehttp
  replicas: 1
  template:
    metadata:
      labels:
        app: simplehttp
    spec:
      containers:
      - name: simplehttp
        image: someurl.net/docker-sandbox/kube/pythonserver
        ports:
        - containerPort: 8080
        securityContext:
          runAsUser: 199
      imagePullSecrets:
      - name: regcred
---
apiVersion: v1
kind: Service
metadata:
  name: simplehttp
  labels:
    app: simplehttp
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: simplehttp
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  creationTimestamp: 2018-11-05T10:41:19Z
  generation: 1
  labels:
    app: simplehttp
  name: skaffold-example-ingress
  # namespace: prashant
  resourceVersion: "14651061"
  selfLink: /apis/extensions/v1beta1/namespaces/**********
  uid: 54698fa9-e0e7-11e8-ad52-6a1aa97f2964
spec:
  rules:
  - host: someurl.net
    http:
      paths:
      - backend:
          serviceName: simplehttp
          servicePort: 8080
        path: /
  tls:
  - hosts:
    - someurl.net
status:
  loadBalancer:
    ingress:
    - {}

My sample python application
#!/usr/bin/python
import sys

from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

PORT_NUMBER = 8080

# This class handles any incoming request from
# the browser
class myHandler(BaseHTTPRequestHandler):

    # Handler for the GET requests
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        # Send the html message
        self.wfile.write("Hello World Prashant ")
        return

try:
    # Create a web server and define the handler to manage the
    # incoming request
    sys.stdout.flush()
    server = HTTPServer(('', PORT_NUMBER), myHandler)
    print 'Started httpserver on port ', PORT_NUMBER
    sys.stdout.flush()
    # Wait forever for incoming HTTP requests
    server.serve_forever()

except KeyboardInterrupt:
    print '^C received, shutting down the web server'
    server.socket.close()
The image is something I built locally; it's not the same one Kaniko is using.

When running the command manually, nothing happens.

[a_ansible@master1 out]$ kubectl exec simplehttp-7957fd946f-5g6zh --namespace prashant -c simplehttp -i -- tar xmf - -C / --no-same-owner

Oh, we're not logging the output of the kubectl exec commands:
skaffold/pkg/skaffold/sync/kubectl/kubectl.go

Lines 73 to 78 in 6e9c61d

func copyFileFn(ctx context.Context, pod v1.Pod, container v1.Container, files map[string]string) []*exec.Cmd {
	// Use "m" flag to touch the files as they are copied.
	reader, writer := io.Pipe()
	copy := exec.CommandContext(ctx, "kubectl", "exec", pod.Name, "--namespace", pod.Namespace, "-c", container.Name, "-i",
		"--", "tar", "xmf", "-", "-C", "/", "--no-same-owner")
	copy.Stdin = reader

I attached copy.Stdout = os.Stderr but could not fetch the logs.

Thanks for uploading the deploy.yml. I wanted to check that you weren't overriding the container's entrypoint.

Image is something I build locally, its not the same kaniko is using.

Is it possible to try using the kaniko image? If you're building against a remote cluster, you should be able to use the Skaffold-reported image reference (in your email you showed someurl.com/docker-sandbox/kube/pythonserver:8ea12bf@sha256:9444b85). If you're using Minikube, you can configure your local docker to use the Minikube daemon with eval $(minikube docker-env); docker run ...

When running command manually nothing happens.

That suggests that it's been able to launch tar and tar is waiting for input, which is a good thing. But it doesn't seem to be happening when Skaffold is trying to run tar.
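As an aside, the copy mechanism under discussion can be sketched outside of a cluster: build a tar archive of the changed files in memory and pipe it into the same `tar xmf - -C <dir> --no-same-owner` command, only locally instead of via kubectl exec. This is an illustration, not Skaffold's code; `sync_files` is a made-up helper, and it requires a `tar` binary on the PATH:

```python
import io
import os
import subprocess
import tarfile
import tempfile

def sync_files(files, dest):
    """Mimic Skaffold's copy step: stream a tar of `files` into `tar x`.

    `files` maps local paths to in-archive names, similar to the
    "Copying files: map[app.py:/app.py]" log line above.
    Returns tar's exit status (what Skaffold reports as "exit status N").
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for local_path, archive_name in files.items():
            tar.add(local_path, arcname=archive_name)
    # Same flags Skaffold passes: -m touches files, -C sets the target dir,
    # --no-same-owner avoids chown-ing to the archive's owner.
    proc = subprocess.run(
        ["tar", "xmf", "-", "-C", dest, "--no-same-owner"],
        input=buf.getvalue(),
    )
    return proc.returncode

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dest:
        app = os.path.join(src, "app.py")
        with open(app, "w") as f:
            f.write("print('hello')\n")
        rc = sync_files({app: "app.py"}, dest)
        print("tar exit status:", rc)
```

If the in-container `tar` cannot replace the destination file (for example because it is owned by another user), this is where the non-zero exit status comes from.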

If I make the following change:

--- a/pkg/skaffold/sync/kubectl/kubectl.go
+++ b/pkg/skaffold/sync/kubectl/kubectl.go
@@ -19,6 +19,7 @@ package kubectl
 import (
    "context"
    "io"
+   "os"
    "os/exec"

    "github.com/GoogleContainerTools/skaffold/pkg/skaffold/sync"
@@ -76,6 +77,9 @@ func copyFileFn(ctx context.Context, pod v1.Pod, container v1.Container, files m
    copy := exec.CommandContext(ctx, "kubectl", "exec", pod.Name, "--namespace", pod.Namespace, "-c", container.Name, "-i",
        "--", "tar", "xmf", "-", "-C", "/", "--no-same-owner")
    copy.Stdin = reader
+   copy.Stdout = os.Stdout
+   copy.Stderr = os.Stderr
+
    go func() {
        defer writer.Close()

and then deliberately introduce an error into the kubectl exec command, I see an error message on the output:

DEBU[0013] Running command: [kubectl exec web-f56b44648-6vl86 --namespace default -c web -i -- error tar xmf - -C / --no-same-owner] 
OCI runtime exec failed: exec failed: container_linux.go:344: starting container process caused "exec: \"error\": executable file not found in $PATH": unknown
command terminated with exit code 126
WARN[0013] Skipping deploy due to sync error: copying files: exit status 126 

Hi @briandealwis ,

I re-ran with stdout and stderr attached as you suggested, and got this error:

time="2019-04-13T13:49:16Z" level=info msg="Copying files: map[app.py:/app.py] to someurl.net/docker-sandbox/kube/pythonserver:v0.26.0-104-g642f7f9@sha256:e6e4f99649f3e6971630f14ad95f690839fdcea35697b087cfad25fb87785960"
time="2019-04-13T13:49:16Z" level=debug msg="Running command: [kubectl exec simplehttp-55679456f-d8pb5 --namespace 103000-prash-dev -c simplehttp -i -- tar xmf - -C / --no-same-owner]"
error: unable to upgrade connection: container not found ("simplehttp")
time="2019-04-13T13:49:21Z" level=warning msg="Skipping deploy due to sync error: copying files: exit status 1"

Should I raise a request for enabling stdout and stderr in the meantime?

hi @briandealwis ,

time="2019-04-13T17:07:52Z" level=debug msg="Running command: [kubectl exec simplehttp-7b995fbff5-l4qvl --namespace prashant -c simplehttp -i -- tar xmf - -C / --no-same-owner]"
tar: Removing leading `/' from member names
tar: app.py: Cannot open: File exists
tar: Exiting with failure status due to previous errors
command terminated with exit code 2
time="2019-04-13T17:07:53Z" level=warning msg="Skipping deploy due to sync error: copying files: exit status 2"

Different logs

Hi @briandealwis ,

I get the same error if I explicitly copy to the root location of the running pod.

[a_ansible@master1 out]$ kubectl cp app.py prashant/simplehttp-5585bb97d4-b6vgv:/
tar: app.py: Cannot open: File exists
tar: Exiting with failure status due to previous errors
command terminated with exit code 2

but this runs fine when I copy to the /tmp location.

[a_ansible@master1 out]$ kubectl cp app.py prashant/simplehttp-5585bb97d4-b6vgv:/ -v=1
tar: app.py: Cannot open: File exists
tar: Exiting with failure status due to previous errors
command terminated with exit code 2
[a_ansible@master1 out]$ kubectl cp app.py prashant/simplehttp-5585bb97d4-b6vgv:/tmp/ -v=1
[a_ansible@master1 out]$

any clue?

OK, I found the problem: my pod is running as a non-root user (say uid 200) and Skaffold is copying to the root location (which I should configure to point at the user's directory).

Hi @briandealwis ,

kubectl cp fails when app.py is in use. If we copy to some other location like /tmp, the cp succeeds; otherwise it throws the "File exists" error.

tar: app.py: Cannot open: File exists
tar: Exiting with failure status due to previous errors
command terminated with exit code 2

@prary Are you, by any chance, using windows containers?

@corneliusweig
No, not using Windows containers.

That makes total sense @prary. I totally missed the runAsUser:

securityContext:
        runAsUser: 199

I think you could use Skaffold's kustomize support to remove the securityContext at deploy time, perhaps as part of a Skaffold profile?
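For reference, such a deploy-time override could look roughly like this (a sketch, not from this thread; file names are illustrative, and it assumes Skaffold's kustomize deployer pointing at this directory):

```yaml
# kustomization.yaml (sketch)
resources:
- deploy.yml

patchesJson6902:
- target:
    group: apps
    version: v1beta1
    kind: Deployment
    name: simplehttp
  path: drop-run-as-user.yaml

# drop-run-as-user.yaml (sketch)
# - op: remove
#   path: /spec/template/spec/containers/0/securityContext
```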

The other solution is to build your container as user 199, so app.py is owned by the running user. I think the kustomize solution is better though.
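That second option could look roughly like this, adapting the Dockerfile from this issue (a sketch; `--chown` on COPY needs a reasonably recent Docker, and the uid must match the runAsUser in the deployment):

```dockerfile
FROM someurl.net/lrh7:latest
RUN yum install tar -y
# Copy the app owned by the same non-root uid the pod runs as (199 here)
COPY --chown=199:199 app.py .
USER 199
EXPOSE 8080
CMD [ "python", "app.py" ]
```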

I wonder if Skaffold could detect or warn about this situation. The builders could communicate which user(s) were used in the build, and that could be compared to the UID in the pod's securityContext. But it seems fragile.

kubectl cp fails when app.py is in use. If we copy to some other location like /tmp, the cp succeeds; otherwise it throws the "File exists" error.

Hi @briandealwis ,

What's your opinion on this: if app.py is in use, the pod refuses to replace that file, but if you copy app.py to /tmp or some other location, the copy succeeds, i.e. no root-user error. Is this due to the non-root user, or due to the pod not replacing a file that is in use (which would not be the usual case)? Just wanted to confirm.

Thanks

@prary do you mean you adjust your Dockerfile to COPY app.py /tmp or to set the WORKDIR /tmp? I'd be surprised if that works since the app.py's owner would still be root and the directory sticky bit on /tmp should prevent UID 199 from overwriting /tmp/app.py.

@briandealwis, you are right, it didn't work when I tried earlier today, but I find it hard to draw a root/non-root conclusion. It seems to be that, but I'm not 100% sure about it.

Thanks for debugging with us @prary, keep us posted!
In the meantime I filed #1981 to have better error messages.

Okay - after rereading the posts - my first thought would be to make the files/dirs writable by user 199 for development, rather than iterating as root in dev. If you iterate as the root user, you might get access to files that are not realistic and get surprised later... Explicitly chmod-ing the files/dirs for sync seems like a smaller-scale divergence between dev and prod.
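Adapting the Dockerfile from this issue, that chmod approach might look like this (a sketch; depending on how tar replaces files, the containing directory may also need to be writable by the runtime user):

```dockerfile
FROM someurl.net/lrh7:latest
RUN yum install tar -y
COPY app.py .
# Dev-only: let the non-root runtime user (uid 199) overwrite the synced file
RUN chmod 0666 app.py
EXPOSE 8080
CMD [ "python", "app.py" ]
```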

@balopat, so the issue is because of the non-root user; when I ran this as the root user, sync worked fine.

I marked this as a documentation issue.
We should document under Sync that permission issues like this should be resolved via separate dev/prod profiles.

