(roachtest).acceptance/many-splits failed on master@2b2eac926b5a51f78d64f6d7d3e86c80c13931a4:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200410-1866229/acceptance/many-splits/run_1
cluster.go:2378,many_splits.go:43,acceptance.go:91,test_runner.go:753: error with attached stack trace:
main.(*monitor).WaitE
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2366
main.(*monitor).Wait
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2374
main.runManySplits
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/many_splits.go:43
main.registerAcceptance.func2
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/acceptance.go:91
main.(*testRunner).runTest.func2
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:753
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
- monitor failure:
- unexpected node event: 4: dead
cluster.go:1420,context.go:135,cluster.go:1409,test_runner.go:825: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-1866229-1586504087-13-n4cpu4 --oneshot --ignore-empty-nodes: exit status 1 3: 3704
4: dead
2: 3838
1: 3836
Error: UNCLASSIFIED_PROBLEM:
- 4: dead
main.glob..func13
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1129
main.wrap.func1
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:272
github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).execute
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:766
github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).ExecuteC
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:852
github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).Execute
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:800
main.main
/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1793
runtime.main
/usr/local/go/src/runtime/proc.go:203
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
Artifacts: /acceptance/many-splits See this test on roachdashMore
powered by pkg/cmd/internal/issues
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200421 17:57:04.139605 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200421 20:13:20.422364 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200421 21:09:09.323789 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200421 22:13:04.430347 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200421 23:03:40.602174 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200421 23:41:20.789852 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200422 00:17:54.512406 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200422 00:57:36.095418 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@998abbe628d7133932c1beb9240a18c229bab735:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:1941,many_splits.go:23,acceptance.go:91,test_runner.go:753: /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned: exit status 1
(1) /go/src/github.com/cockroachdb/cockroach/bin/roachprod start --env=COCKROACH_SCAN_MAX_IDLE_TIME=5ms local returned
| stderr:
| ckroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| 3: invalid version string '998abbe'
| github.com/cockroachdb/cockroach/pkg/util/version.Parse
| /go/src/github.com/cockroachdb/cockroach/pkg/util/version/version.go:90
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.getCockroachVersion
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:96
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.Cockroach.Start.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cockroach.go:168
| github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install.(*SyncedCluster).Parallel.func1.1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/install/cluster_synced.go:1660
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357:
| I200422 01:31:52.607795 1 cluster_synced.go:1742 command failed
|
| stdout:
| local: starting
Wraps: (2) exit status 1
Error types: (1) *main.withCommandDetails (2) *exec.ExitError
(roachtest).acceptance/many-splits failed on master@056e32e84831f13b286fceb7681dd0cd2b00b4b4:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
test_runner.go:800: test timed out (10m0s)
cluster.go:2444,many_splits.go:43,acceptance.go:91,test_runner.go:753: monitor failure: monitor task failed: context canceled
(1) attached stack trace
| main.(*monitor).WaitE
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2432
| main.(*monitor).Wait
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2440
| main.runManySplits
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/many_splits.go:43
| main.registerAcceptance.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/acceptance.go:91
| [...repeated from below...]
Wraps: (2) monitor failure
Wraps: (3) attached stack trace
| main.(*monitor).wait.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2488
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357
Wraps: (4) monitor task failed
Wraps: (5) context canceled
Error types: (1) *withstack.withStack (2) *errutil.withMessage (3) *withstack.withStack (4) *errutil.withMessage (5) *errors.errorString
cluster.go:1481,context.go:135,cluster.go:1470,test_runner.go:825: dead node detection: /go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor local --oneshot --ignore-empty-nodes: exit status 1 1: dead
4: 25976
2: 25774
3: 25876
Error: UNCLASSIFIED_PROBLEM: 1: dead
(1) UNCLASSIFIED_PROBLEM
Wraps: (2) 1: dead
| main.glob..func13
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1129
| main.wrap.func1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:272
| github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).execute
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:766
| github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).ExecuteC
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:852
| github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).Execute
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:800
| main.main
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1793
| runtime.main
| /usr/local/go/src/runtime/proc.go:203
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357
Error types: (1) errors.Unclassified (2) *errors.fundamental
(roachtest).acceptance/many-splits failed on master@5750a9abee7e1e39923f77ade5c01448117ff842:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
test_runner.go:800: test timed out (10m0s)
cluster.go:1481,context.go:135,cluster.go:1470,test_runner.go:825: dead node detection: /go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor local --oneshot --ignore-empty-nodes: exit status 1 4: dead
2: 26268
1: 26165
3: 26370
Error: UNCLASSIFIED_PROBLEM: 4: dead
(1) UNCLASSIFIED_PROBLEM
Wraps: (2) 4: dead
| main.glob..func13
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1129
| main.wrap.func1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:272
| github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).execute
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:766
| github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).ExecuteC
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:852
| github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).Execute
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:800
| main.main
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1793
| runtime.main
| /usr/local/go/src/runtime/proc.go:203
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357
Error types: (1) errors.Unclassified (2) *errors.fundamental
I took a look at the repro I got (without asking for it) in an unrelated PR. Thanks to the roachtest stack trace, I was able to see that roachtest was simply blocked on the single statement this test runs.
From the node logs, I saw split messages corresponding to the roughly 2000 splits the test does, the highest one being:
I200429 15:12:47.497883 1154118 kv/kvserver/replica_command.go:397 [n2,s2,r2051/1:/{Table/52/1/1…-Max}] initiating a split of this range at key /Table/52/1/2000 [r2052] (manual)
...
W200429 15:12:48.621465 255 kv/kvserver/store_raft.go:508 [n2,s2,r2052/1:/{Table/52/1/2…-Max}] handle raft ready: 0.7s [applied=0, batches=0, state_assertions=0]
The test hit the timeout at 15:12:25, so it was already game over at that point. So it seems that for some reason, it sometimes takes ten minutes to run 2000 splits. I'm seeing some badness in the logs - slow raft readies at the end, and this one here:
W200429 15:12:13.328702 256 kv/kvserver/split_trigger_helper.go:146 [n3,s3,r1984/2:{-}] would have dropped incoming MsgApp to wait for split trigger, but allowing due to 101 (>100) ticks
(wouldn't expect to see that in regular circumstances).
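To make the timing argument above easy to re-check, here is a minimal sketch that parses the timestamp prefix of the quoted log lines and compares it to the 15:12:25 timeout. The prefix layout (severity letter, yymmdd date, time with microseconds) is inferred from the lines above, the helper names are made up for illustration, and the sample lines are abbreviated copies of the ones quoted in this comment.

```python
import re
from datetime import datetime

# Assumed prefix layout, inferred from the quoted lines:
# severity letter, yymmdd date, wall-clock time with microseconds.
LOG_PREFIX = re.compile(r"^([IWEF])(\d{6}) (\d{2}:\d{2}:\d{2}\.\d{6}) ")


def log_time(line):
    """Return the timestamp of a log line, or None if the prefix does not match."""
    m = LOG_PREFIX.match(line)
    if m is None:
        return None
    return datetime.strptime(m.group(2) + " " + m.group(3), "%y%m%d %H:%M:%S.%f")


def seconds_after(line, deadline):
    """Seconds by which a log line follows the deadline (negative if it precedes it)."""
    return (log_time(line) - deadline).total_seconds()


# Timeout mentioned in the comment; the date is taken from the log lines.
timeout = datetime(2020, 4, 29, 15, 12, 25)

# Abbreviated copies of two of the lines quoted above.
last_split = "I200429 15:12:47.497883 1154118 kv/kvserver/replica_command.go:397 initiating a split"
msgapp_warning = "W200429 15:12:13.328702 256 kv/kvserver/split_trigger_helper.go:146 would have dropped incoming MsgApp"

print(seconds_after(last_split, timeout))      # positive: logged after the timeout
print(seconds_after(msgapp_warning, timeout))  # negative: logged before the timeout
```

The last split was initiated roughly 22s after the timeout, while the MsgApp warning was logged about 12s before it, so that warning falls inside the window in which the test was still running.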
Locally and in-memory, the following takes around 15s via cockroach demo:
CREATE TABLE t(x, PRIMARY KEY(x)) AS TABLE generate_series(1,2000);
ALTER TABLE t SPLIT AT TABLE generate_series(1,2000);
So 10 minutes should certainly be enough time here.
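As a rough back-of-the-envelope check of that claim, the sketch below converts both numbers into a per-split budget. It only uses the figures from this comment (2000 splits, the 10m0s test timeout, about 15s locally); the function name is made up, and an in-memory demo is of course not directly comparable to the cluster the test runs against.

```python
SPLITS = 2000  # number of splits the test performs


def per_split_ms(total_seconds, splits=SPLITS):
    """Average time available (or spent) per split, in milliseconds."""
    return total_seconds * 1000.0 / splits


timeout_budget_ms = per_split_ms(10 * 60)  # the 10m0s test timeout
local_demo_ms = per_split_ms(15)           # the ~15s observed via cockroach demo
headroom = timeout_budget_ms / local_demo_ms

print(timeout_budget_ms)  # 300.0
print(local_demo_ms)      # 7.5
print(headroom)           # 40.0
```

In other words, the timeout allows about 300ms per split on average, around 40x what the local in-memory run needed.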
The test history indicates that something got worse here, too:

Most of those failures are on 19.1 and 2.1. On master, it looks better:

Hmm, the issue at the top has a stack trace from n4. The others don't, even though stack traces were supposed to be created for all of them:
I wonder if this kill invocation doesn't reliably work.
(roachtest).acceptance/many-splits failed on master@733b32eb316ff85b62499c4ea6dd96059ae17568:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:2467,many_splits.go:43,acceptance.go:95,test_runner.go:757: monitor failure: unexpected node event: 4: dead
(1) attached stack trace
| main.(*monitor).WaitE
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2455
| main.(*monitor).Wait
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2463
| main.runManySplits
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/many_splits.go:43
| main.registerAcceptance.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/acceptance.go:95
| main.(*testRunner).runTest.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:757
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1373
Wraps: (2) monitor failure
Wraps: (3) unexpected node event: 4: dead
Error types: (1) *withstack.withStack (2) *errutil.withMessage (3) *errors.errorString
cluster.go:1512,context.go:135,cluster.go:1501,test_runner.go:829: dead node detection: /go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor local --oneshot --ignore-empty-nodes: exit status 1 4: dead
2: 16441
1: 16390
3: 16492
Error: UNCLASSIFIED_PROBLEM: 4: dead
(1) UNCLASSIFIED_PROBLEM
Wraps: (2) attached stack trace
| main.glob..func13
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1115
| main.wrap.func1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:266
| github.com/spf13/cobra.(*Command).execute
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
| github.com/spf13/cobra.(*Command).ExecuteC
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
| github.com/spf13/cobra.(*Command).Execute
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
| main.main
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1789
| runtime.main
| /usr/local/go/src/runtime/proc.go:203
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1373
Wraps: (3) 3 safe details enclosed
Wraps: (4) 4: dead
Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *safedetails.withSafeDetails (4) *errors.errorString
(roachtest).acceptance/many-splits failed on master@e6119d7bfbc887ad0667b34a0586364eb24974b6:
The test failed on branch=master, cloud=gce:
test artifacts and logs in: artifacts/acceptance/many-splits/run_1
cluster.go:2467,many_splits.go:43,acceptance.go:95,test_runner.go:757: monitor failure: unexpected node event: 3: dead
(1) attached stack trace
| main.(*monitor).WaitE
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2455
| main.(*monitor).Wait
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2463
| main.runManySplits
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/many_splits.go:43
| main.registerAcceptance.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/acceptance.go:95
| main.(*testRunner).runTest.func2
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:757
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1373
Wraps: (2) monitor failure
Wraps: (3) unexpected node event: 3: dead
Error types: (1) *withstack.withStack (2) *errutil.withMessage (3) *errors.errorString
cluster.go:1512,context.go:135,cluster.go:1501,test_runner.go:829: dead node detection: /go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor local --oneshot --ignore-empty-nodes: exit status 1 3: dead
4: 16995
1: 16842
2: 16893
Error: UNCLASSIFIED_PROBLEM: 3: dead
(1) UNCLASSIFIED_PROBLEM
Wraps: (2) attached stack trace
| main.glob..func13
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1115
| main.wrap.func1
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:266
| github.com/spf13/cobra.(*Command).execute
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
| github.com/spf13/cobra.(*Command).ExecuteC
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
| github.com/spf13/cobra.(*Command).Execute
| /go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
| main.main
| /go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1789
| runtime.main
| /usr/local/go/src/runtime/proc.go:203
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1373
Wraps: (3) 3 safe details enclosed
Wraps: (4) 3: dead
Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *safedetails.withSafeDetails (4) *errors.errorString
Going to skip this test as it has been affecting master.
@nvanbenschoten this is a really simple test and it's concerning that it's failing. Could you (as L2 secondary) make sure something happens here?
BTW I looked at the artifacts for the node deaths and found ... nothing. No stack traces, zilch. I wonder if something got broken. Maybe it really is that kill -11 doing something bad? Hope we can repro this easily.
Possibly not related, but kv/splits/nodes=3/quiesce=false also failed sort of opaquely today, with a dead node but nothing interesting in logs:
https://teamcity.cockroachdb.com//viewLog.html?buildId=2067818&buildTypeId=Cockroach_MergeToMaster
Also, didn't we use to pull dmesg and such? I don't see that. Bet that broke somehow...
Another theory is that $something broke in https://github.com/cockroachdb/cockroach/pull/50703 (but that seems unlikely: we're definitely calling t.Fatal and things seem to all work, except that dmesg.txt is MIA).
BTW I verified with a one-off run, which I intentionally failed, that dmesg etc. do end up in artifacts:
$ less artifacts/acceptance/many-splits/run_1/
1.dmesg.txt 2.dmesg.txt 3.dmesg.txt 4.dmesg.txt MONITOR.log test.log
1.journalctl.txt 2.journalctl.txt 3.journalctl.txt 4.journalctl.txt roachprod_state/
Beats me why they're not collected on TC.
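For comparing a local run against what TC published, a small check like the following could list which per-node files are absent from a run directory. This is only a sketch: the file naming (`N.dmesg.txt`, `N.journalctl.txt`) and the four nodes are taken from the listing above, and the function name is hypothetical.

```python
import os


def missing_node_artifacts(run_dir, nodes=4, kinds=("dmesg", "journalctl")):
    """Return the expected per-node artifact files that are absent from run_dir.

    The naming scheme is assumed from the directory listing above.
    """
    expected = [f"{n}.{kind}.txt" for n in range(1, nodes + 1) for kind in kinds]
    present = set(os.listdir(run_dir))
    return [name for name in expected if name not in present]


# Only runs when the directory from the listing above exists locally.
run_dir = "artifacts/acceptance/many-splits/run_1"
if os.path.isdir(run_dir):
    print(missing_node_artifacts(run_dir))
```

An empty list means all eight files are there; anything else names the files that were not collected.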
This is the snippet from the build log. Usually when TC fails to collect some file (b/c permissions or whatnot) there's a warning printed here, but this all looks fine:
[05:50:10] : [Step 4/4] Publishing artifacts (1s)
[05:50:10] : [Publishing artifacts] Collecting files to publish: [artifacts/acceptance/many-splits/** => acceptance/many-splits]
[05:50:12]E: [Step 4/4] Failed tests detected
[05:50:12] : [Publishing artifacts] Publishing 48 files using [WebPublisher]: artifacts/acceptance/many-splits/** => acceptance/many-splits
[05:50:12] : [Publishing artifacts] Publishing 48 files using [ArtifactsCachePublisher]: artifacts/acceptance/many-splits/** => acceptance/many-splits
[05:50:10] : [Run local roachtests] 05:50:10 test_runner.go:685: [w0]
[05:50:10]i: [Run local roachtests] ##teamcity[publishArtifacts 'artifacts/acceptance/many-splits/** => acceptance/many-splits']
[05:50:10] : [Step 4/4] Publishing artifacts (2s)
[05:50:10] : [Publishing artifacts] Collecting files to publish: [artifacts/acceptance/many-splits/** => acceptance/many-splits]
[05:50:12] : [Publishing artifacts] Publishing 48 files using [WebPublisher]: artifacts/acceptance/many-splits/** => acceptance/many-splits
[05:50:12] : [Publishing artifacts] Publishing 48 files using [ArtifactsCachePublisher]: artifacts/acceptance/many-splits/** => acceptance/many-splits
[05:50:10]i: [Run local roachtests] ##teamcity[publishArtifacts 'artifacts/acceptance/many-splits/** => acceptance/many-splits']
I'll take a look at this next week.
cc @aayushshah15 the failure mode here definitely pre-dates the Go 1.15 bump: 998abbe628d7133932c1beb9240a18c229bab735 failed multiple times in a day (sometime in early April), and the bump happened only on Jun 29.
I'll attempt a repro at 998abbe628d7133932c1beb9240a18c229bab735.
Will run a few more, but at around master (10f0c57d02d5cf786be9fc907f73e7c411d4569d) with an unskip this passes in ~90s on my gceworker.
I wasted some time initially trying to run this on an n1-standard-8 machine (the CI agent machine) but... it's hard to get it to run. roachtest expects to be in the cockroach main repo, so you have to copy that, etc... quite tedious.
Yeah I don't think this will fail easily on my gceworker
artifacts/acceptance/many-splits/run_4/test.log
28:13:00:21 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (152.09s)
artifacts/acceptance/many-splits/run_8/test.log
28:13:10:47 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (134.11s)
artifacts/acceptance/many-splits/run_2/test.log
28:12:55:38 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (91.02s)
artifacts/acceptance/many-splits/run_5/test.log
28:13:03:12 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (169.17s)
artifacts/acceptance/many-splits/run_6/test.log
28:13:05:55 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (161.44s)
artifacts/acceptance/many-splits/run_9/test.log
28:13:13:30 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (161.39s)
artifacts/acceptance/many-splits/run_7/test.log
28:13:08:32 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (155.94s)
artifacts/acceptance/many-splits/run_10/test.log
28:13:16:39 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (187.74s)
artifacts/acceptance/many-splits/run_1/test.log
28:12:54:05 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (104.09s)
artifacts/acceptance/many-splits/run_3/test.log
28:12:57:48 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (129.45s)
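To put numbers on the spread rather than eyeballing it, the PASS lines can be parsed and summarized. A minimal sketch: the sample lines are copied from the logs above, and the regex assumes the `--- PASS: <test> (<seconds>s)` shape they show. The helper names are made up for this example.

```python
import re
import statistics

# Matches e.g. "--- PASS: acceptance/many-splits (152.09s)".
PASS_RE = re.compile(r"--- PASS: (\S+) \((\d+(?:\.\d+)?)s\)")

SAMPLE = """\
28:13:00:21 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (152.09s)
28:13:10:47 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (134.11s)
28:12:55:38 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (91.02s)
28:13:03:12 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (169.17s)
28:13:05:55 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (161.44s)
28:13:13:30 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (161.39s)
28:13:08:32 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (155.94s)
28:13:16:39 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (187.74s)
28:12:54:05 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (104.09s)
28:12:57:48 test_runner.go:670: [w0] --- PASS: acceptance/many-splits (129.45s)
"""


def pass_durations(log_text):
    """Extract the duration in seconds of every PASS line in log_text."""
    return [float(m.group(2)) for m in PASS_RE.finditer(log_text)]


def summarize(durations):
    """Return run count, min, median and max for a non-empty list of durations."""
    return {
        "runs": len(durations),
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    print(summarize(pass_durations(SAMPLE)))
```

For the ten runs above this gives a minimum of 91.02s, a maximum of 187.74s, and a median of about 154s, so the ~90s run was the fast outlier rather than the typical case.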
But I am curious if I can see a difference in how long this test takes as I go back to a commit that precedes these failures.
Testing April 10 next (2b2eac926b5a51f78d64f6d7d3e86c80c13931a4).
It's still running but I can already say that it's not significantly faster. Looks about the same, in the 150 range most of the time.
On my macbook.
# a565ade7001ff5844581fec834b542 (v20.1). Passes, takes 2m. Built using go1.14.
rm -r artifacts; make bin/roachprod; make bin/roachtest; make buildshort; roachprod wipe local; roachprod destroy local; roachtest run --local --debug acceptance/many-splits --cockroach (which cockroach20.1)
# 5eb39081057234aa58e8dbe15b7576 (master). Times out. Built using go 1.14.
rm -r artifacts; make bin/roachprod; make bin/roachtest; make buildshort; roachprod wipe local; roachprod destroy local; roachtest run --local --debug acceptance/many-splits --cockroach (which cockroach)
Just in case someone else missed it (I did), this test started failing recently, 15 days ago. So did #50865. I think the original failures that created this issue were something else.
I wasn't very successful in getting this test to fail on gceworkers, but I was on my (weak) laptop. I tried bisecting, except I always built with go1.14. I landed on https://github.com/cockroachdb/cockroach/commit/4cb8298b738f0bbaa30a7958d89f7ea778a62d1a. I wonder if the pebble switch + go1.14 ends up somehow causing more utilization that makes this test a lot slower? Or more resource intensive?
I was using the following to run things locally:
make buildshort; roachprod wipe local; roachprod destroy local; roachtest run --local --debug acceptance/many-splits --cockroach ./cockroachshort
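For anyone repeating the bisect, the search itself is just a binary search for the first bad commit. The sketch below is not what `git bisect` runs internally, only an illustration of the idea. It assumes a linear list of commits ordered oldest to newest, a known-good first entry, a known-bad last entry, and a hypothetical `is_bad` callback (for example: build the commit, run the command above, and report whether the split phase timed out).

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is good, commits[-1] is bad, and that once a commit
    is bad every later one is bad too. is_bad is a hypothetical callback
    supplied by the caller; it is never called on the two endpoints.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    # Toy history: commits "a".."h", with the regression introduced at "e".
    history = list("abcdefgh")
    print(first_bad(history, lambda c: c >= "e"))
```

Because this test's timing is noisy on a weak machine, a real `is_bad` would probably want to run the test more than once per commit before deciding.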
On https://github.com/cockroachdb/cockroach/commit/9a06d3bd38ed4d85a56a8355e1a15e37a4a638f8, right before the pebble bump, we're pretty happy (creating 2000 ranges takes about 2m):
01:35:23 many_splits.go:34: creating 2000 ranges...
3: 70936
4: 70946
2: 70923
1: 70913
01:37:03 cluster.go:352: > /Users/irfansharif/Software/src/github.com/cockroachdb/cockroach/bin/roachprod pgurl --external local:1
On https://github.com/cockroachdb/cockroach/commit/4cb8298b738f0bbaa30a7958d89f7ea778a62d1a, now using pebble, not so much (creating 2000 ranges times out after 10m):
01:42:42 many_splits.go:34: creating 2000 ranges...
1: 74395
2: 74410
3: 74424
4: 74436
01:52:06 test_runner.go:773: [w0] dumped stacks to __stacks
So far I've been lazily avoiding actually looking at the logs, but now I see a ton of learner snapshot/replica movement activity, which makes sense - that's what this test is supposed to do. Does that put undue pressure on pebble somehow? I (or someone else) should try this last run again with go1.13. Aayush saw added CPU utilization over in https://github.com/cockroachdb/cockroach/issues/50865, and it seems to be due to increased GC activity. We could also profile allocations during this period of heavy splits and see what pops up.
I (or someone else) should try this last run again with go1.13.
$ rm -r artifacts; env GO=(which go1.13.9) make build; bin/roachprod wipe local; bin/roachprod destroy local; bin/roachtest run --local --debug acceptance/many-splits --cockroach ./cockroach
[...]
02:54:33 many_splits.go:34: creating 2000 ranges...
4: 22122
1: 22083
2: 22103
3: 22112
02:54:58 test_runner.go:773: [w0] dumped stacks to __stacks
So looks like it isn't because of go1.14? I'm a bit confused. I hope it's not some silly darwin build perf regression that I'm chasing down..
Wanted to check if we're straight up stuck, or just slow. So tried lowering the range count to 200. With pebble (1m 15s):
03:15:49 many_splits.go:34: creating 200 ranges...
2: 33653
3: 33669
1: 33643
4: 33679
03:17:03 cluster.go:352: > /Users/irfansharif/Software/src/github.com/cockroachdb/cockroach/bin/roachprod pgurl --external local:1
With rocksdb (2s!):
03:17:58 many_splits.go:34: creating 200 ranges...
2: 34667
3: 34676
1: 34658
4: 34690
03:18:01 cluster.go:352: > /Users/irfansharif/Software/src/github.com/cockroachdb/cockroach/bin/roachprod pgurl --external local:1
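The durations quoted for these runs come from the gap between the "creating N ranges..." line and the next timestamped line. A rough sketch of that arithmetic, assuming the `HH:MM:SS file.go:line: message` layout seen in the logs above and that both timestamps fall on the same day (no midnight wraparound handling). The function name is made up for this example.

```python
import re
from datetime import datetime

# "03:15:49 many_splits.go:34: creating 200 ranges..."
TS_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) (\S+:\d+): (.*)$")


def split_phase_seconds(log_text):
    """Seconds between the 'creating N ranges' line and the next timestamped line.

    Returns None if either line is missing. Lines without a timestamp
    (such as the "2: 33653" node/PID lines) are skipped.
    """
    start = None
    for line in log_text.splitlines():
        m = TS_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%H:%M:%S")
        if start is None:
            if re.search(r"creating \d+ ranges", m.group(3)):
                start = ts
        else:
            return (ts - start).total_seconds()
    return None


PEBBLE = """\
03:15:49 many_splits.go:34: creating 200 ranges...
2: 33653
3: 33669
1: 33643
4: 33679
03:17:03 cluster.go:352: > /Users/irfansharif/Software/src/github.com/cockroachdb/cockroach/bin/roachprod pgurl --external local:1
"""

ROCKSDB = """\
03:17:58 many_splits.go:34: creating 200 ranges...
2: 34667
3: 34676
1: 34658
4: 34690
03:18:01 cluster.go:352: > /Users/irfansharif/Software/src/github.com/cockroachdb/cockroach/bin/roachprod pgurl --external local:1
"""

if __name__ == "__main__":
    print("pebble:", split_phase_seconds(PEBBLE))
    print("rocksdb:", split_phase_seconds(ROCKSDB))
```

On the two logs above this yields 74 seconds for pebble and 3 seconds for rocksdb, in line with the rounded figures quoted.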
I hope it's not some silly darwin build perf regression that I'm chasing down..
Thankfully, it's not.
# RocksDB, on gceworker, 200 ranges.
--- PASS: acceptance/many-splits (13.46s)
# Pebble, on gceworker, 200 ranges.
--- PASS: acceptance/many-splits (47.03s)
I believe Nathan mentioned that we are now using the fsync call that actually does the real thing on OSX, and that this is why it is so different on MacBooks now. However, the original failure is on Linux, and I'm not sure the same argument applies there. @nvanbenschoten can you remind me?
On the whole though, I was seeing 2 minutes per run on gceworker, which... is ridiculous? 2k splits should take a few seconds, I thought. (All with go1.14.)
On Thu, Jul 16, 2020, 05:21 irfan sharif notifications@github.com wrote:
Wanted to check if we're straight up stuck, or just slow. So tried lowering the range count to 200. With pebble (1m 15s):
03:15:49 many_splits.go:34: creating 200 ranges...
2: 33653
3: 33669
1: 33643
4: 33679
03:17:03 cluster.go:352: > /Users/irfansharif/Software/src/github.com/cockroachdb/cockroach/bin/roachprod pgurl --external local:1
With rocksdb (2s!):
03:17:58 many_splits.go:34: creating 200 ranges...
2: 34667
3: 34676
1: 34658
4: 34690
03:18:01 cluster.go:352: > /Users/irfansharif/Software/src/github.com/cockroachdb/cockroach/bin/roachprod pgurl --external local:1
I'm seeing a perf regression with pebble on my gceworker as well though (see my last message - edited it in so might've missed it if reading through mail), so I'm no longer suspecting any darwin funkiness.
I looked into the perf regression between pebble and rocksdb, and couldn't really reproduce anything significant. When running with the default aws instance size for that roachtest, I get comparable performance (around 1m30s-2m for the phase of many-splits before the consistency checks). The Pebble runs can sometimes be slower, but there's enough variance with either engine that it's fair to say both are comparable. The performance difference is nonexistent and sometimes reversed when I run on my gceworker. This was all with 2000 ranges; there's a more noticeable difference in runtime at 200 ranges between the two engines (10-15s with rocksdb, 15-25s with Pebble), but that may not be too indicative of actual slowness.
Note that all of the above was with latest cockroach master, and Pebble has seen many ingestion and L0 compaction related improvements since April. This workload basically stresses a lot of small ingestions at the storage level, and if they aren't compacted away soon enough, ingestions and by extension snapshot application can slow down. This is also why I don't want to dive too much into performance over a 10-second workload, as we would with a 200-range test; at that point we're just measuring other factors and not things that would dominate in a longer, more realistic workload.
This issue hasn't seen new roachtest failures since then, and Bilal is confident that Pebble has made progress. Any objection to closing it now?
Reopening this since it hasn't been resolved yet.
I was not able to reproduce this issue. I'm going to unskip the roachtest for now and see if CI digs it up again. I'll keep this issue alive for the next little while so it stays on my radar.
Closing this since we haven't seen new nightly failures after unskipping.