Test-infra: k8s-ci-robot says that a test has failed, but the test actually succeeded

Created on 4 Dec 2018 · 11 Comments · Source: kubernetes/test-infra

What happened:

The k8s robot has added the following comment to my PR: https://github.com/kubeflow/pipelines/pull/461#issuecomment-444027099

The problem is that the "failed" build-image test it links to actually succeeded: https://gubernator.k8s.io/build/kubernetes-jenkins/pr-logs/pull/kubeflow_pipelines/461/build-image/792

Test name | Commit | Details | Rerun command
-- | -- | -- | --
presubmit-e2e-test | e1af66b | https://gubernator.k8s.io/build/kubernetes-jenkins/pr-logs/pull/kubeflow_pipelines/461/presubmit-e2e-test/648| /test presubmit-e2e-test
build-image | 9b185ae | https://gubernator.k8s.io/build/kubernetes-jenkins/pr-logs/pull/kubeflow_pipelines/461/build-image/792| /test build-image

https://gubernator.k8s.io/build/kubernetes-jenkins/pr-logs/pull/kubeflow_pipelines/461/build-image/792

PR | Ark-kun: [WIP]Tests - reintegrate build-image stage back into test suites
-- | --
Result | SUCCESS

What you expected to happen:
I expect the robot to tell the truth.

Please provide links to example occurrences, if any:
https://github.com/kubeflow/pipelines/pull/461#issuecomment-444027099
https://gubernator.k8s.io/build/kubernetes-jenkins/pr-logs/pull/kubeflow_pipelines/461/build-image/792


/kind bug

area/prow kind/bug

All 11 comments

/area prow
/cc @cjwagner @BenTheElder

/cc @krzyzacy

The prowjob is in a failure state. Can a GCS upload failure cause us to mark the prowjob as failed?

Perhaps. Can you check the pod? In this case the GCS upload would have had to fail for some artifact but not for finished.json.

yeah...

"failed to upload to GCS: failed to upload to GCS: encountered errors during upload: [[Post https://www.googleapis.com/upload/storage/v1/b/kubernetes-jenkins/o?alt=json&projection=full&uploadType=multipart: oauth2: cannot fetch token: Post https://oauth2.googleapis.com/token: net/http: TLS handshake timeout] [Post https://www.googleapis.com/upload/storage/v1/b/kubernetes-jenkins/o?alt=json&projection=full&uploadType=multipart: oauth2: cannot fetch token: Post https://oauth2.googleapis.com/token: net/http: TLS handshake timeout]]"   

Probably we shouldn't fail the job when the job itself passed and only the GCS upload failed?

yeah, we could make sidecar never fail
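To make the suggestion concrete, here is a minimal sketch of the decision logic, written in Python for brevity. It is not Prow's actual sidecar code (which is Go); `finalize`, `JobOutcome`, and `flaky_upload` are hypothetical names used only for illustration. The idea is that the reported state comes from the test's exit code alone, while an upload error is recorded but never turns a passing job into a failure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JobOutcome:
    state: str                   # "success" or "failure", derived from the test only
    upload_error: Optional[str]  # set when the artifact upload failed; informational


def finalize(test_exit_code: int, upload: Callable[[], None]) -> JobOutcome:
    """Decide the reported job state from the test exit code alone."""
    state = "success" if test_exit_code == 0 else "failure"
    try:
        upload()
    except Exception as err:
        # Upload problems are recorded, never promoted to a test failure.
        return JobOutcome(state=state, upload_error=str(err))
    return JobOutcome(state=state, upload_error=None)


def flaky_upload() -> None:
    # Hypothetical stand-in for the GCS upload step timing out.
    raise TimeoutError("TLS handshake timeout")


outcome = finalize(0, flaky_upload)
print(outcome.state, "| upload error:", outcome.upload_error)
```

The trade-off is that a job can then report success with missing artifacts, so the upload error should still be surfaced somewhere visible, such as the pod logs.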

I've encountered this before but forgot about it. Is anyone prepping a patch?

Thanks for fixing this!

I wonder whether the problem is somehow related to the fact that that test suite exits immediately after start (exit 0)

I think it's just a gcs flake. (also it's easier to drop a /retest)

This is most likely fixed.
