I'm seeing a problem in the 1.12 kubeadm/kind job:
https://testgrid.k8s.io/sig-cluster-lifecycle-all#kubeadm-kind-1.12
https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kind-1-12/13/build-log.txt
2019/02/25 20:59:56 process.go:153: Running: ./hack/ginkgo-e2e.sh --ginkgo.focus=\[Conformance\] --ginkgo.skip=\[Serial\]|Alpha|Kubectl|\[(Disruptive|Feature:[^\]]+|Flaky)\] --num-nodes=3 --report-dir=/logs/artifacts --disable-log-dump=true
Conformance test: not doing test setup.
Found no test suites
For usage instructions:
ginkgo help
!!! Error in ./hack/ginkgo-e2e.sh:143
Error in ./hack/ginkgo-e2e.sh:143. '"${ginkgo}" "${ginkgo_args[@]:+${ginkgo_args[@]}}" "${e2e_test}" -- "${auth_config[@]:+${auth_config[@]}}" --ginkgo.flakeAttempts="${FLAKE_ATTEMPTS}" --host="${KUBE_MASTER_URL}" --provider="${KUBERNETES_PROVIDER}" --gce-project="${PROJECT:-}" --gce-zone="${ZONE:-}" --gce-region="${REGION:-}" --gce-multizone="${MULTIZONE:-false}" --gke-cluster="${CLUSTER_NAME:-}" --kube-master="${KUBE_MASTER:-}" --cluster-tag="${CLUSTER_ID:-}" --cloud-config-file="${CLOUD_CONFIG:-}" --repo-root="${KUBE_ROOT}" --node-instance-group="${NODE_INSTANCE_GROUP:-}" --prefix="${KUBE_GCE_INSTANCE_PREFIX:-e2e}" --network="${KUBE_GCE_NETWORK:-${KUBE_GKE_NETWORK:-e2e}}" --node-tag="${NODE_TAG:-}" --master-tag="${MASTER_TAG:-}" --cluster-monitoring-mode="${KUBE_ENABLE_CLUSTER_MONITORING:-standalone}" --prometheus-monitoring="${KUBE_ENABLE_PROMETHEUS_MONITORING:-false}" ${KUBE_CONTAINER_RUNTIME:+"--container-runtime=${KUBE_CONTAINER_RUNTIME}"} ${MASTER_OS_DISTRIBUTION:+"--master-os-distro=${MASTER_OS_DISTRIBUTION}"} ${NODE_OS_DISTRIBUTION:+"--node-os-distro=${NODE_OS_DISTRIBUTION}"} ${NUM_NODES:+"--num-nodes=${NUM_NODES}"} ${E2E_REPORT_DIR:+"--report-dir=${E2E_REPORT_DIR}"} ${E2E_REPORT_PREFIX:+"--report-prefix=${E2E_REPORT_PREFIX}"} "${@:-}"' exited with status 1
Call stack:
1: ./hack/ginkgo-e2e.sh:143 main(...)
I need to dig into why this is happening, but I'm posting in advance in case "Found no test suites" is a known problem.
cc @krzyzacy @BenTheElder
/kind bug
/area kubetest
Looks like I can repro this locally:
./kubetest --deployment=kind --test --test_args="--ginkgo.focus=\[Conformance\] --num-nodes=3 --report-dir=/logs/artifacts --disable-log-dump=true" --kind-binary-version=build
Found no test suites
I will now try to figure out what is different between this and the existing 1.12 test jobs that work. O_o
Did something change in ginkgo-e2e.sh?
Really hoping we can stop using this in the kubetest2 tester(s)...
hack/ginkgo-e2e.sh diff between 1.12 and master:
diff --git a/hack/ginkgo-e2e.sh b/hack/ginkgo-e2e.sh
old mode 100755
new mode 100644
index 0cac8afc6b..c4fc31186d
--- a/hack/ginkgo-e2e.sh
+++ b/hack/ginkgo-e2e.sh
@@ -87,7 +87,7 @@ if [[ "${KUBERNETES_PROVIDER}" == "gce" ]]; then
set_num_migs
NODE_INSTANCE_GROUP=""
for ((i=1; i<=${NUM_MIGS}; i++)); do
- if [[ $i == ${NUM_MIGS} ]]; then
+ if [[ ${i} == ${NUM_MIGS} ]]; then
# We are assigning the same mig names as create-nodes function from cluster/gce/util.sh.
NODE_INSTANCE_GROUP="${NODE_INSTANCE_GROUP}${NODE_INSTANCE_PREFIX}-group"
else
@@ -161,6 +161,8 @@ export PATH=$(dirname "${e2e_test}"):"${PATH}"
--master-tag="${MASTER_TAG:-}" \
--cluster-monitoring-mode="${KUBE_ENABLE_CLUSTER_MONITORING:-standalone}" \
--prometheus-monitoring="${KUBE_ENABLE_PROMETHEUS_MONITORING:-false}" \
+ --dns-domain="${KUBE_DNS_DOMAIN:-cluster.local}" \
+ --ginkgo.slowSpecThreshold="${GINKGO_SLOW_SPEC_THRESHOLD:-300}" \
${KUBE_CONTAINER_RUNTIME:+"--container-runtime=${KUBE_CONTAINER_RUNTIME}"} \
${MASTER_OS_DISTRIBUTION:+"--master-os-distro=${MASTER_OS_DISTRIBUTION}"} \
${NODE_OS_DISTRIBUTION:+"--node-os-distro=${NODE_OS_DISTRIBUTION}"} \
test/e2e/e2e.go diff:
--- e2e.go 2019-02-26 05:27:18.751107012 +0200
+++ e2e.go_master 2019-02-26 05:27:11.823108597 +0200
@@ -24,18 +24,16 @@
"testing"
"time"
- "github.com/golang/glog"
"github.com/onsi/ginkgo"
"github.com/onsi/ginkgo/config"
"github.com/onsi/ginkgo/reporters"
"github.com/onsi/gomega"
+ "k8s.io/klog"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
runtimeutils "k8s.io/apimachinery/pkg/util/runtime"
- "k8s.io/apiserver/pkg/util/logs"
clientset "k8s.io/client-go/kubernetes"
- "k8s.io/kubernetes/pkg/cloudprovider/providers/azure"
- gcecloud "k8s.io/kubernetes/pkg/cloudprovider/providers/gce"
+ "k8s.io/component-base/logs"
"k8s.io/kubernetes/pkg/version"
commontest "k8s.io/kubernetes/test/e2e/common"
"k8s.io/kubernetes/test/e2e/framework"
@@ -46,86 +44,20 @@
// ensure auth plugins are loaded
_ "k8s.io/client-go/plugin/pkg/client/auth"
+
+ // ensure that cloud providers are loaded
+ _ "k8s.io/kubernetes/test/e2e/framework/providers/aws"
+ _ "k8s.io/kubernetes/test/e2e/framework/providers/azure"
+ _ "k8s.io/kubernetes/test/e2e/framework/providers/gce"
+ _ "k8s.io/kubernetes/test/e2e/framework/providers/kubemark"
+ _ "k8s.io/kubernetes/test/e2e/framework/providers/openstack"
)
var (
- cloudConfig = &framework.TestContext.CloudConfig
+ cloudConfig = &framework.TestContext.CloudConfig
+ nodeKillerStopCh = make(chan struct{})
)
-// setupProviderConfig validates and sets up cloudConfig based on framework.TestContext.Provider.
-func setupProviderConfig() error {
- switch framework.TestContext.Provider {
- case "":
- glog.Info("The --provider flag is not set. Treating as a conformance test. Some tests may not be run.")
-
- case "gce", "gke":
- framework.Logf("Fetching cloud provider for %q\r", framework.TestContext.Provider)
- zone := framework.TestContext.CloudConfig.Zone
- region := framework.TestContext.CloudConfig.Region
-
- var err error
- if region == "" {
- region, err = gcecloud.GetGCERegion(zone)
- if err != nil {
- return fmt.Errorf("error parsing GCE/GKE region from zone %q: %v", zone, err)
- }
- }
- managedZones := []string{} // Manage all zones in the region
- if !framework.TestContext.CloudConfig.MultiZone {
- managedZones = []string{zone}
- }
-
- gceCloud, err := gcecloud.CreateGCECloud(&gcecloud.CloudConfig{
- ApiEndpoint: framework.TestContext.CloudConfig.ApiEndpoint,
- ProjectID: framework.TestContext.CloudConfig.ProjectID,
- Region: region,
- Zone: zone,
- ManagedZones: managedZones,
- NetworkName: "", // TODO: Change this to use framework.TestContext.CloudConfig.Network?
- SubnetworkName: "",
- NodeTags: nil,
- NodeInstancePrefix: "",
- TokenSource: nil,
- UseMetadataServer: false,
- AlphaFeatureGate: gcecloud.NewAlphaFeatureGate([]string{}),
- })
-
- if err != nil {
- return fmt.Errorf("Error building GCE/GKE provider: %v", err)
- }
-
- cloudConfig.Provider = gceCloud
-
- // Arbitrarily pick one of the zones we have nodes in
- if cloudConfig.Zone == "" && framework.TestContext.CloudConfig.MultiZone {
- zones, err := gceCloud.GetAllZonesFromCloudProvider()
- if err != nil {
- return err
- }
-
- cloudConfig.Zone, _ = zones.PopAny()
- }
-
- case "aws":
- if cloudConfig.Zone == "" {
- return fmt.Errorf("gce-zone must be specified for AWS")
- }
- case "azure":
- if cloudConfig.ConfigFile == "" {
- return fmt.Errorf("config-file must be specified for Azure")
- }
- config, err := os.Open(cloudConfig.ConfigFile)
- if err != nil {
- framework.Logf("Couldn't open cloud provider configuration %s: %#v",
- cloudConfig.ConfigFile, err)
- }
- defer config.Close()
- cloudConfig.Provider, err = azure.NewCloud(config)
- }
-
- return nil
-}
-
// There are certain operations we only want to run once per overall test invocation
// (such as deleting old namespaces, or verifying that all system pods are running.
// Because of the way Ginkgo runs tests in parallel, we must use SynchronizedBeforeSuite
@@ -137,10 +69,6 @@
var _ = ginkgo.SynchronizedBeforeSuite(func() []byte {
// Run only on Ginkgo node 1
- if err := setupProviderConfig(); err != nil {
- framework.Failf("Failed to setup provider config: %v", err)
- }
-
switch framework.TestContext.Provider {
case "gce", "gke":
framework.LogClusterImageSources()
@@ -148,7 +76,7 @@
c, err := framework.LoadClientset()
if err != nil {
- glog.Fatal("Error loading client: ", err)
+ klog.Fatal("Error loading client: ", err)
}
// Delete any namespaces except those created by the system. This ensures no
@@ -163,7 +91,7 @@
if err != nil {
framework.Failf("Error deleting orphaned namespaces: %v", err)
}
- glog.Infof("Waiting for deletion of the following namespaces: %v", deleted)
+ klog.Infof("Waiting for deletion of the following namespaces: %v", deleted)
if err := framework.WaitForNamespacesDeleted(c, deleted, framework.NamespaceCleanupTimeout); err != nil {
framework.Failf("Failed to delete orphaned namespaces %v: %v", deleted, err)
}
@@ -210,24 +138,23 @@
// Reference common test to make the import valid.
commontest.CurrentSuite = commontest.E2E
+ if framework.TestContext.NodeKiller.Enabled {
+ nodeKiller := framework.NewNodeKiller(framework.TestContext.NodeKiller, c, framework.TestContext.Provider)
+ nodeKillerStopCh = make(chan struct{})
+ go nodeKiller.Run(nodeKillerStopCh)
+ }
return nil
}, func(data []byte) {
// Run on all Ginkgo nodes
-
- if cloudConfig.Provider == nil {
- if err := setupProviderConfig(); err != nil {
- framework.Failf("Failed to setup provider config: %v", err)
- }
- }
})
-// Similar to SynchornizedBeforeSuite, we want to run some operations only once (such as collecting cluster logs).
+// Similar to SynchronizedBeforeSuite, we want to run some operations only once (such as collecting cluster logs).
// Here, the order of functions is reversed; first, the function which runs everywhere,
// and then the function that only runs on the first Ginkgo node.
var _ = ginkgo.SynchronizedAfterSuite(func() {
// Run on all Ginkgo nodes
- framework.Logf("Running AfterSuite actions on all node")
+ framework.Logf("Running AfterSuite actions on all nodes")
framework.RunCleanupActions()
}, func() {
// Run only Ginkgo on node 1
@@ -240,6 +167,9 @@
framework.Logf("Error gathering metrics: %v", err)
}
}
+ if framework.TestContext.NodeKiller.Enabled {
+ close(nodeKillerStopCh)
+ }
})
func gatherTestSuiteMetrics() error {
@@ -296,12 +226,12 @@
// TODO: we should probably only be trying to create this directory once
// rather than once-per-Ginkgo-node.
if err := os.MkdirAll(framework.TestContext.ReportDir, 0755); err != nil {
- glog.Errorf("Failed creating report directory: %v", err)
+ klog.Errorf("Failed creating report directory: %v", err)
} else {
r = append(r, reporters.NewJUnitReporter(path.Join(framework.TestContext.ReportDir, fmt.Sprintf("junit_%v%02d.xml", framework.TestContext.ReportPrefix, config.GinkgoConfig.ParallelNode))))
}
}
- glog.Infof("Starting e2e run %q on Ginkgo node %d", framework.RunId, config.GinkgoConfig.ParallelNode)
+ klog.Infof("Starting e2e run %q on Ginkgo node %d", framework.RunId, config.GinkgoConfig.ParallelNode)
ginkgo.RunSpecsWithDefaultAndCustomReporters(t, "Kubernetes e2e suite", r)
}
I never understood why ginkgo is even used, but someone mentioned at some point that adopting it was a mistake.
I found the problem...
https://github.com/kubernetes/kubernetes/commit/275212bbc964c453fbde596812eea1f992468ee2
:fire: bash :fire:
On the release-1.12 branch of k/k, the kind deployer from kubetest builds the e2e.test binary, but this line in ginkgo-e2e.sh then returns an empty string for some reason:
e2e_test=$(kube::util::find-binary "e2e.test")
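A rough way to confirm this locally (a sketch, assuming a k/k checkout under ${GOPATH}/src/k8s.io/kubernetes and that e2e.test was already built by the kind deployer; hack/lib/init.sh is what the e2e scripts source to get the helper):
cd "${GOPATH}/src/k8s.io/kubernetes"   # adjust to wherever your checkout lives
git checkout release-1.12
# run in a subshell so init.sh's strict-mode settings don't leak into the interactive shell
bash -c 'KUBE_ROOT=$PWD; source hack/lib/init.sh; kube::util::find-binary "e2e.test"'   # prints nothing here
git checkout master
bash -c 'KUBE_ROOT=$PWD; source hack/lib/init.sh; kube::util::find-binary "e2e.test"'   # prints the path of the built binary
If the first command prints an empty line and the second prints a real path, that matches what ginkgo-e2e.sh ends up seeing.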
We need to backport this fix to 1.12.
I came here to comment exactly this; I remember this change. There was a fix to make it locate the bazel-built binary.
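For context on why an empty string comes back at all: the find-binary helper in hack/lib/util.sh essentially probes a fixed list of output locations and echoes the newest match, or nothing if no candidate exists. A simplified sketch of that pattern (paths illustrative, not the exact upstream list; the bazel entry stands in for what the linked commit teaches it to find, per the comment above):
# simplified sketch of the search pattern behind kube::util::find-binary-for-platform
# assumes KUBE_ROOT points at the repo root
find_test_binary() {
  local lookfor="$1" platform="$2"
  local locations=(
    "${KUBE_ROOT}/_output/bin/${lookfor}"
    "${KUBE_ROOT}/_output/dockerized/bin/${platform}/${lookfor}"
    "${KUBE_ROOT}/_output/local/bin/${platform}/${lookfor}"
    "${KUBE_ROOT}/bazel-bin/test/e2e/${lookfor}"   # hypothetical bazel output path; the 1.12 script has no equivalent entry
  )
  # newest existing candidate wins; if none of the locations exist, this echoes an empty string
  (ls -t "${locations[@]}" 2>/dev/null || true) | head -n 1
}
So a 1.12 checkout whose e2e.test lives only in a location the script does not probe yields an empty e2e_test, and downstream that shows up as ginkgo's "Found no test suites".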