Origin: Container creation fails because of "Failed create pod sandbox"

Created on 26 Oct 2017 · 35 comments · Source: openshift/origin

Pods are not getting created anymore

Version

oc v3.6.173.0.7
kubernetes v1.6.1+5115d708d7
features: Basic-Auth

Server https://api.starter-ca-central-1.openshift.com:443
openshift v3.7.0-0.143.7
kubernetes v1.7.0+80709908fd

Steps To Reproduce
  1. Create an application (e.g. Redis (persistent) from the catalog)
  2. Check pod/container creation
  3. Wait for the timeouts
Current Result

Warning messages on the pod:

| Time | Type | Reason | Message |
| -- | -- | -- | -- |
| 1:33:46 PM | Normal | Sandbox changed | Pod sandbox changed, it will be killed and re-created. 2 times in the last 5 minutes |
| 1:33:42 PM | Warning | Failed create pod sand box | Failed create pod sandbox. 2 times in the last 5 minutes |
--> pod is not created
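
The same events can also be pulled programmatically instead of from the web console. A minimal client-go sketch, assuming a recent client-go release (older ones omit the context argument) and a hypothetical kubeconfig path; the namespace and pod name are taken from the error quoted below:

```go
// Minimal sketch: list the events recorded against a failing pod.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// The kubeconfig path is an assumption; adjust for your environment.
	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	// Restrict the listing to events whose involved object is our pod.
	events, err := clientset.CoreV1().Events("instantsoundbot").List(
		context.TODO(),
		metav1.ListOptions{FieldSelector: "involvedObject.name=redis-1-deploy"},
	)
	if err != nil {
		panic(err)
	}
	for _, e := range events.Items {
		fmt.Printf("%s %s %s: %s (x%d)\n",
			e.LastTimestamp.Format("3:04:05 PM"), e.Type, e.Reason, e.Message, e.Count)
	}
}
```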

The only real error I could grab was:
Failed kill pod | error killing pod: failed to "KillPodSandbox" for "c4c2ec61-ba29-11e7-8b2c-02d8407159d1" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod \"redis-1-deploy_instantsoundbot\" network: CNI request failed with status 400: 'Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?\n)\n'"
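
The hint in that message is real: iptables-restore accepts a -w/--wait flag (iptables >= 1.6.2) that blocks on the xtables lock instead of aborting with exit status 4. A minimal Go sketch of a caller using it; this is illustrative only, not the actual fix that was rolled out:

```go
// Minimal sketch: apply an iptables rule dump while waiting for the
// kernel's xtables lock instead of failing with exit status 4.
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// restoreRules pipes a rule dump into iptables-restore, asking it to wait
// up to five seconds for the xtables lock (-w, iptables >= 1.6.2) rather
// than aborting when another process holds it.
func restoreRules(rules []byte) error {
	cmd := exec.Command("iptables-restore", "-w", "5", "--noflush")
	cmd.Stdin = bytes.NewReader(rules)
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("iptables-restore failed: %v: %s", err, out)
	}
	return nil
}

func main() {
	// A trivial rule dump, for illustration only.
	rules := []byte("*filter\n-A INPUT -i lo -j ACCEPT\nCOMMIT\n")
	if err := restoreRules(rules); err != nil {
		fmt.Println(err)
	}
}
```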

Expected Result

Pods should start up as they used to.

Additional Information

Couldn't get oc adm diagnostics working at the moment.
I guess it could be related to the introduction of https://github.com/openshift/origin/pull/15880

component/kubernetes kind/bug lifecycle/rotten priority/P1

Most helpful comment

Same problem on starter-ca-central-1.openshift.com

error streaming logs from build pod: sii-test/app-5-build container: , container "sti-build" in pod "app-5-build" is not available

All 35 comments

I'm facing the same in the starter-us-east-1.openshift.com environment. It's been unstable for a couple of days already...

/cc @jupierce

This is a known issue that has a fix, and it is being rolled out to the starter clusters presently.

I think I'm facing the same issue at starter-ca-central-1.openshift.com:

| Time | Type | Reason | Message |
| -- | -- | -- | -- |
| 10:04:10 AM | Normal | Deadline exceeded | Pod was active on the node longer than the specified deadline |
| 10:00:03 AM | Normal | Sandbox changed | Pod sandbox changed, it will be killed and re-created. 14 times in the last 58 minutes |
| 9:59:40 AM | Warning | Failed create pod sand box | Failed create pod sandbox. 14 times in the last 58 minutes |

When will the fix finish rolling out?

Facing the same issue; it's not possible to roll out anything on starter-ca-central-1.openshift.com. Hope it will be fixed soon.

I got the same issue. I tried to create an application using Tomcat 8 and to build the source code at this path:
https://github.com/osamahassan245/samplepp

I got a build error, and when I tried to check the log, I got this:

container "sti-build" in pod is not available

Same problem on starter-ca-central-1.openshift.com

error streaming logs from build pod: sii-test/app-5-build container: , container "sti-build" in pod "app-5-build" is not available

Issue solved: I tried to use "Red Hat JBoss Web Server 3.1 Tomcat 8 1.0", and it's working fine now.

The issue is still present on starter-us-east-1.openshift.com.

Still a problem on ca-central.

Glad I'm not the only one seeing this issue. It's been occurring for me on console.starter-us-west-1.openshift.com since last weekend (11/4).

still seeing this on starter-ca-central-1.openshift.com

I have the same issue: error streaming logs from build pod: mavajsunco-website/mavajsunco-msc-6-build container: , container "sti-build" in pod "mavajsunco-msc-6-build" is not available

Same issue deploying rhscl/mysql-57-rhel7 on starter-us-east-1.

@dcbw this is the all too familiar iptables-restore issue. You are closer to this than I am and hopefully can provide better feedback about the progress.

👍

Still having this problem on starter-us-west-2.

I've got 7 failed deployments in a row with this error message.

^same

@dcbw @sjenning any input as to where the issue might be?

Seeing this on pro-us-east-1

Seeing this the last couple of days on pro-us-east-1 as well

Same here!!! Observing on pro-us-east-1.

Hey folks! Any update on this one? Is there a fix already in the openshift or openshift-ansible repos that I can pick up, or a temporary workaround for this issue? We are facing the same issue with our OpenShift cluster on AWS.

Version
OpenShift Master:
v3.7.0+7ed6862
Kubernetes Master:
v1.7.6+a08f5eeb62

@pweil-, @jupierce, are you still looking into this issue? Is there any progress or workaround available?

@dcbw @knobunc ping

I am facing a similar issue using OCP v3.9.30 with CDK. In my case, I have Che deployed on OpenShift, and when I start a new workspace, its pod crashes after the sandbox changes:

11:52:32 AM     Normal  Killing     Killing container with id docker://container:Need to kill Pod
11:52:30 AM     Normal  Sandbox Changed     Pod sandbox changed, it will be killed and re-created.
11:52:28 AM     Normal  Started     Started container
11:52:28 AM     Normal  Created     Created container

Is there any update on this issue @dcbw?

I used OpenShift for more than 5 years. I spent a lot of time getting my app running on v2 again. In the end, traffic was just not routed anymore. I moved to Heroku; it took me 2 hours to migrate all my data (DB) and make the necessary source code changes. Since then, no more problems. Sorry, OpenShift.

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

Seeing this (or something similar) currently on OpenShift Online starter-us-west-1. Unable to build or deploy because of it. No logs from pods that have this issue. Status page says all green.

We still see this issue on OKD 3.7.1.

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

I am seeing this issue, or something similar, when deploying the 3scale API Management Platform on OpenShift, in particular with system-sidekiq.

Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "2cc1e1d064082f2a2b8cd7a10efb7d135a8a150e7d95fb7b939d6368e1717309" network for pod "system-sidekiq-6-deploy-debug": NetworkPlugin cni failed to set up pod "system-sidekiq-6-deploy-debug_mmcneilly-3scale-onprem" network: CNI request failed with status 400: 'pods "system-sidekiq-6-deploy-debug" not found '
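
That 400 looks like the pod was deleted while the SDN was still setting up its sandbox. A hypothetical way to confirm the race while debugging, again with client-go (the kubeconfig path is an assumption; pod and namespace come from the message above):

```go
// Minimal sketch: check whether the pod named in the CNI error still exists.
package main

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// The kubeconfig path is an assumption; adjust for your environment.
	config, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	_, err = clientset.CoreV1().Pods("mmcneilly-3scale-onprem").Get(
		context.TODO(), "system-sidekiq-6-deploy-debug", metav1.GetOptions{})
	switch {
	case apierrors.IsNotFound(err):
		// The pod is gone: CNI setup lost a race with pod deletion.
		fmt.Println("pod not found: sandbox setup raced with pod deletion")
	case err != nil:
		panic(err)
	default:
		fmt.Println("pod still exists: the 400 is not a simple deletion race")
	}
}
```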

Can this issue be reopened?
/reopen

@matthewmcneilly: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
