i'm seeing something very odd in the kind deployer for kubetest.
this PR:
https://github.com/kubernetes/test-infra/pull/11379/files
is causing:
2019/02/20 16:00:32 kind.go:336: kind.go:IsUp()
2019/02/20 16:00:32 kind.go:457: kind.go:KubectlCommand()
2019/02/20 16:00:32 process.go:153: Running: kubectl get nodes --no-headers
The connection to the server localhost:8080 was refused - did you specify the right host or port?
2019/02/20 16:00:32 process.go:155: Step 'kubectl get nodes --no-headers' finished in 77.854404ms
2019/02/22 05:07:22 main.go:307: Something went wrong: encountered 2 errors: [error starting kubectl --match-server-version=false version: exec: already started error during kubectl --match-server-version=false get nodes -oyaml: exit status 1]
https://gubernator.k8s.io/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kind-master/311
can be reproduced locally using:
./kubetest --provider=skeleton --deployment=kind --kind-binary-version=build --kind-config-path=./../../k8s.io/kubeadm/tests/e2e/kind/single-cp/single-cp.yaml --build=bazel --up --check-version-skew=false --ginkgo-parallel --timeout=30m
investigating..
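side note on the second error above: "exec: already started" is what Go's os/exec returns when the same *exec.Cmd value is started twice, so it smells like a command object is being reused somewhere. a minimal standalone repro of just that error (not kubetest code; echo is only a placeholder for the kubectl invocation from the log):

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// placeholder command; in kubetest the reused command would be the
	// kubectl call shown in the log above
	cmd := exec.Command("echo", "hello")
	_ = cmd.Run()      // first run starts and waits for the process
	err := cmd.Start() // reusing the same Cmd value is rejected
	fmt.Println(err)   // prints: exec: already started
}
```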
/area kubetest
/kind bug
/assign
cc @BenTheElder @krzyzacy
was the kubeconfig path set? this control flow is so convoluted; in kubetest2 i think we will just let the deployer decide if it wants to clean up previous deployments in Up() itself.
> was the kubeconfig path set?
yes.
i've added this call to double check:
log.Printf("kind.go:getKubeConfigPath(): %s\n", string(o))
but i'm seeing this, which looks like a mix of kind and kubetest output.
2019/02/22 05:50:59 kind.go:172: kind.go:getKubeConfigPath(): time="05:50:58" level=debug msg="Running: /usr/bin/docker [docker ps -q -a --no-trunc --filter label=io.k8s.sigs.kind.cluster --format {{.Names}}\t{{.Label \"io.k8s.sigs.kind.cluster\"}}]"
/home/lubo-it/.kube/kind-config-kind-kubetest
we call clusterExists in both Down() and TestSetup(); this should be the call from Down()?
I wonder if we should have the KubectlCommand() method ensure the kubectl command has the kubeconfig set. (either by flag or by env on the command)
> I wonder if we should have the KubectlCommand() method ensure the kubectl command has the kubeconfig set. (either by flag or by env on the command)
this seems to work fine.
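a minimal sketch of that idea, with a hypothetical kubectlCommand helper standing in for the real KubectlCommand():

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// kubectlCommand builds a kubectl command with the kubeconfig set on the
// command itself via the environment; passing a --kubeconfig=<path> flag in
// args would work just as well. (hypothetical helper, not kubetest's code.)
func kubectlCommand(kubeConfigPath string, args ...string) *exec.Cmd {
	cmd := exec.Command("kubectl", args...)
	cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeConfigPath)
	return cmd
}

func main() {
	cmd := kubectlCommand("/path/to/kubeconfig", "get", "nodes", "--no-headers")
	fmt.Println(cmd.Args, cmd.Env[len(cmd.Env)-1])
}
```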
i have a suspicion that we are somehow mixing up the d.control.Output calls with the logs from kind when running with --v=6 and --loglevel=debug. o_O
setting --loglevel=info fixes the problem.
i remembered this now:
https://github.com/kubernetes-sigs/kind/issues/55#issuecomment-461583253
> setting --loglevel=info fixes the problem.
whoah, uhhh. what?
how is kind#55 related?
> whoah, uhhh. what?
this is the weird part. the problem was introduced with PR https://github.com/kubernetes/test-infra/pull/11379/files
i can reproduce this locally.
> how is kind#55 related?
the user there mentioned that toggling the log level fixes a cluster creation bug.
sounds like a race / memory corruption / something else.
i will now build a custom kind binary and try to feed it to a custom kubetest to check something.
oh, I'm pretty certain 55 had a race just in whether the api-server survives startup or not; disk eviction on a noisy host / workstation caused all kinds of fun.
the fact that toggling kind's debug level modifies the app behavior, i think, means a race in kind?
it may have just been a coincidence, there shouldn't be any races in kind currently, the bringup is all serial.
diff --git a/pkg/exec/local.go b/pkg/exec/local.go
index 088438e..a8c2f4f 100644
--- a/pkg/exec/local.go
+++ b/pkg/exec/local.go
@@ -19,8 +19,7 @@ package exec
 import (
 	"io"
 	osexec "os/exec"
-
-	log "github.com/sirupsen/logrus"
+	// log "github.com/sirupsen/logrus"
 )
 
 // LocalCmd wraps os/exec.Cmd, implementing the kind/pkg/exec.Cmd interface
@@ -64,6 +63,6 @@ func (cmd *LocalCmd) SetStderr(w io.Writer) {
 
 // Run runs
 func (cmd *LocalCmd) Run() error {
-	log.Debugf("Running: %v %v", cmd.Path, cmd.Args)
+	// log.Debugf("Running: %v %v", cmd.Path, cmd.Args)
 	return cmd.Cmd.Run()
 }
this patch fixes the bug that i'm seeing with --loglevel=debug.
the issue is in the kubeconfig path output after all:
~/.kubetest/kind/kind get kubeconfig-path --loglevel=debug --name=kind-kubetest
DEBU[07:23:35] Running: /usr/bin/docker [docker ps -q -a --no-trunc --filter label=io.k8s.sigs.kind.cluster --format {{.Names}}\t{{.Label "io.k8s.sigs.kind.cluster"}}]
/home/lubo-it/.kube/kind-config-kind-kubetest
the debug log level adds extra output and pollutes the captured kubeconfig path.
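one defensive option on the capturing side (a sketch only, not necessarily what the fix PR does) is to treat just the last non-empty line of the command output as the path:

```go
package main

import (
	"fmt"
	"strings"
)

// parseKubeConfigPath takes the raw output of `kind get kubeconfig-path`
// and returns only the last non-empty line, so stray log lines that land in
// the same stream don't end up inside the path. (illustrative helper only.)
func parseKubeConfigPath(out []byte) string {
	lines := strings.Split(strings.TrimSpace(string(out)), "\n")
	return strings.TrimSpace(lines[len(lines)-1])
}

func main() {
	out := []byte("level=debug msg=\"Running: /usr/bin/docker ...\"\n/home/lubo-it/.kube/kind-config-kind-kubetest\n")
	fmt.Println(parseKubeConfigPath(out)) // /home/lubo-it/.kube/kind-config-kind-kubetest
}
```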
oh LOL that makes sense. imo that's a bug in the kubetest/kind package, but we should really think more about logging in kind.
https://github.com/kubernetes/test-infra/pull/11444
^ fix is here.
while digging around i found that tryUntil calls might need a small amount of sleep in the for loop:
https://github.com/kubernetes-sigs/kind/search?q=tryUntil&unscoped_q=tryUntil
especially the one that runs kubectl on the control-plane.
will send PR soon for that too.
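roughly the kind of change i mean, as an illustrative retry loop rather than kind's actual tryUntil:

```go
package main

import (
	"fmt"
	"time"
)

// tryUntil retries fn until it succeeds or the deadline passes, sleeping a
// little between attempts so a failing command (e.g. kubectl against the
// control plane) doesn't spin in a hot loop. (illustrative, not kind's code.)
func tryUntil(deadline time.Time, fn func() bool) bool {
	for time.Now().Before(deadline) {
		if fn() {
			return true
		}
		time.Sleep(500 * time.Millisecond) // small back-off between attempts
	}
	return false
}

func main() {
	attempts := 0
	ok := tryUntil(time.Now().Add(3*time.Second), func() bool {
		attempts++
		return attempts >= 3 // pretend the third attempt succeeds
	})
	fmt.Println(ok, attempts)
}
```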
yeah, tryUntil also needs us to fix commands possibly hanging indefinitely (another thing I want to change at the exec package level). thanks for hunting this down and patching it 🙏