Cockroach: sentry: conn_executor.go:643: panic while executing 1 statements: SELECT DISTINCT _.* FROM _ AS _ JOIN _ AS _ ON ((_._ = _._) AND (_._ = _._)) AND (_._ IS _) WHERE _._ = $1: caused by <redacted>

Created on 24 Dec 2018 · 13 Comments · Source: cockroachdb/cockroach

This issue was autofiled by Sentry. It represents a crash or reported error on a live cluster with telemetry enabled.

Sentry link: https://sentry.io/cockroach-labs/cockroachdb/issues/820548164/?referrer=webhooks_plugin

Panic message:

conn_executor.go:643: panic while executing 1 statements: SELECT DISTINCT _.* FROM _ AS _ JOIN _ AS _ ON ((_._ = _._) AND (_._ = _._)) AND (_._ IS _) WHERE _._ = $1: caused by


Stacktrace (expand for inline code snippets):

https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/conn_executor.go#L386-L388 in pkg/sql.(*Server).ServeConn.func1
/usr/local/go/src/runtime/asm_amd64.s#L572-L574 in runtime.call32
/usr/local/go/src/runtime/panic.go#L501-L503 in runtime.gopanic
/usr/local/go/src/runtime/panic.go#L27-L29 in runtime.panicindex
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/distsql_physical_planner.go#L934-L936 in pkg/sql.(*DistSQLPlanner).convertOrdering
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/distsql_physical_planner.go#L1917-L1919 in pkg/sql.(*DistSQLPlanner).createPlanForLookupJoin
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/distsql_physical_planner.go#L2272-L2274 in pkg/sql.(*DistSQLPlanner).createPlanForNode
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/distsql_physical_planner.go#L2278-L2280 in pkg/sql.(*DistSQLPlanner).createPlanForNode
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/distsql_physical_planner.go#L2657-L2659 in pkg/sql.(*DistSQLPlanner).createPlanForDistinct
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/distsql_physical_planner.go#L2328-L2330 in pkg/sql.(*DistSQLPlanner).createPlanForNode
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/distsql_physical_planner.go#L2278-L2280 in pkg/sql.(*DistSQLPlanner).createPlanForNode
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/distsql_running.go#L751-L753 in pkg/sql.(*DistSQLPlanner).PlanAndRun
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/conn_executor_exec.go#L981-L983 in pkg/sql.(*connExecutor).execWithDistSQLEngine
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/conn_executor_exec.go#L823-L825 in pkg/sql.(*connExecutor).dispatchToExecutionEngine
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/conn_executor_exec.go#L401-L403 in pkg/sql.(*connExecutor).execStmtInOpenState
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/conn_executor_exec.go#L95-L97 in pkg/sql.(*connExecutor).execStmt
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/conn_executor.go#L1172-L1174 in pkg/sql.(*connExecutor).run
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/conn_executor.go#L388-L390 in pkg/sql.(*Server).ServeConn
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/pkg/sql/pgwire/conn.go#L312-L314 in pkg/sql/pgwire.(*conn).serveImpl.func4

pkg/sql/conn_executor.go in pkg/sql.(*Server).ServeConn.func1 at line 387
/usr/local/go/src/runtime/asm_amd64.s in runtime.call32 at line 573
/usr/local/go/src/runtime/panic.go in runtime.gopanic at line 502
/usr/local/go/src/runtime/panic.go in runtime.panicindex at line 28
pkg/sql/distsql_physical_planner.go in pkg/sql.(*DistSQLPlanner).convertOrdering at line 935
pkg/sql/distsql_physical_planner.go in pkg/sql.(*DistSQLPlanner).createPlanForLookupJoin at line 1918
pkg/sql/distsql_physical_planner.go in pkg/sql.(*DistSQLPlanner).createPlanForNode at line 2273
pkg/sql/distsql_physical_planner.go in pkg/sql.(*DistSQLPlanner).createPlanForNode at line 2279
pkg/sql/distsql_physical_planner.go in pkg/sql.(*DistSQLPlanner).createPlanForDistinct at line 2658
pkg/sql/distsql_physical_planner.go in pkg/sql.(*DistSQLPlanner).createPlanForNode at line 2329
pkg/sql/distsql_physical_planner.go in pkg/sql.(*DistSQLPlanner).createPlanForNode at line 2279
pkg/sql/distsql_running.go in pkg/sql.(*DistSQLPlanner).PlanAndRun at line 752
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execWithDistSQLEngine at line 982
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).dispatchToExecutionEngine at line 824
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execStmtInOpenState at line 402
pkg/sql/conn_executor_exec.go in pkg/sql.(*connExecutor).execStmt at line 96
pkg/sql/conn_executor.go in pkg/sql.(*connExecutor).run at line 1173
pkg/sql/conn_executor.go in pkg/sql.(*Server).ServeConn at line 389
pkg/sql/pgwire/conn.go in pkg/sql/pgwire.(*conn).serveImpl.func4 at line 313

| Tag | Value |
|---|---|
| Cockroach Release | v2.1.3 |
| Cockroach SHA | 0c87b11cb99ba5c677c95ded55dcba385928474e |
| Platform | linux amd64 |
| Distribution | CCL |
| Environment | v2.1.3 |
| Command | server |
| Go Version | go1.10.3 |
| # of CPUs | 2 |
| # of Goroutines | 145 |

A-sql-optimizer C-bug O-sentry

All 13 comments

Dupe of #33343 but it's much simpler so going to leave this open and close that one.

SELECT
    DISTINCT b.*
FROM
    a AS b
    JOIN c AS d ON (b.z = d.y AND b.x = d.w) AND b.v IS true
WHERE
    b.f = $1;

I have been trying to reproduce this but I don't have a lot to go on. I was able to get a plan where the DISTINCT relies on the lookup join ordering but it seems to work fine (on v2.1.3).
https://cockroachdb.github.io/distsqlplan/decode.html#eJy8lUFr2zAYhu_7Fea7VtB8lp0mgoEPY5Ax2lF6Gz649kfRlkpGkmGj5L8Px93shETySOyj7bx-n_h5QW-gdEX3xStZEN8BgUEMDDgwSIBBCjmD2uiSrNWm_UkX2FS_QCwYSFU3rr2dMyi1IRBv4KTbEgh4Kp639EhFReZ2AQwqcoXc7mtqI18L8zsrgMFnuXVkRJTx6GOEQojN_RPkOwa6cf3LrSteCATu2HiAL1qq9_70dP8zMPiq9c-mjn5oqSKtRJQhy9pP8NC49wuW8bNA8f8AfZLWSVW6Wzz6Hv86TUWGqr8Q50r52dK-S3evOu65YVl8A_nOS5deRJcc0OH4keAkIwkADEaynGckAaBeA15xJPF4DfEkGgIAAw1382gIAPUa4itq4OM18Ek0BAAGGlbzaAgA9Rr4FTUk4zUkk2gIAAw0rOfREADqNSQTnVwnSh_J1lpZOmg89-ZFe6RR9ULdMWh1Y0r6ZnS5r-kuH_a5_Y2KrOueYnexUd2jFnAYRm849odjb5gfhPE4zP3YS3914k2n_nDqDQeal5f86TtveOVvXnnDa394fQk2BjYWGpl_ZRiYGV60MwwMLQmU-5eGgamhf2vH7Pnuw58AAAD__1I-FJ4

Yeah I haven't been able to figure it out either. There are some more reproduction examples in another issue. I'll paste them, but perhaps @nexdrew can also provide the relevant schema which would be really helpful:

SELECT mpr.comp_plan_id, count(mpr.rep_id) AS count FROM map_comp_plan_rep AS mpr INNER JOIN comp_plan AS p ON p.id = mpr.comp_plan_id INNER JOIN rep AS r ON r.id = mpr.rep_id WHERE ((p.org_id = $1) AND (p.deleted = $2)) AND (r.deleted = $3) GROUP BY mpr.comp_plan_id
SELECT cr.comp_plan_id, count(cr.id) AS count FROM comp_rule AS cr INNER JOIN comp_plan AS p ON p.id = cr.comp_plan_id WHERE ((p.org_id = $1) AND (p.deleted = $2)) AND (cr.deleted = $3) GROUP BY cr.comp_plan_id
SELECT q.comp_plan_id, count(q.id) AS count FROM quota AS q INNER JOIN comp_plan AS p ON p.id = q.comp_plan_id WHERE ((p.org_id = $1) AND (p.deleted = $2)) AND (q.deleted = $3) GROUP BY q.comp_plan_id

I created a tiny repo that you can use to test: https://github.com/nexdrew/cockroach-33342-repro

It reliably produces this error for me:

E190104 18:15:22.268081 1288 sql/conn_executor.go:636  [n1,client=127.0.0.1:41052,user=root] a SQL panic has occurred while executing "SELECT q.comp_plan_id, count(q.id) AS count FROM test.public.quota AS q INNER JOIN test.public.comp_plan AS p ON p.id = q.comp_plan_id WHERE ((p.org_id = 'cjo63dwkz0000p8j7hytgl4jx') AND (p.deleted = false)) AND (q.deleted = false) GROUP BY q.comp_plan_id": runtime error: index out of range
E190104 18:15:22.268598 1288 util/log/crash_reporting.go:203  [n1,client=127.0.0.1:41052,user=root] a panic has occurred!
E190104 18:15:22.662802 1288 util/log/crash_reporting.go:477  [n1,client=127.0.0.1:41052,user=root] Reported as error 89796b1532cc48e8a8e97da6bf7c22c5
panic: runtime error: index out of range [recovered]
    panic: panic while executing 1 statements: SELECT _._, count(_._) AS _ FROM _._._ AS _ INNER JOIN _._._ AS _ ON _._ = _._ WHERE ((_._ = _) AND (_._ = _)) AND (_._ = _) GROUP BY _._; caused by runtime error: index out of range

goroutine 1288 [running]:
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).closeWrapper(0xc42c9cf500, 0x3032640, 0xc42d35f400, 0x2770200, 0x4231d70)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:650 +0x36f
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1(0xc42c9cf500, 0x3032640, 0xc42d35f400)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:387 +0x61
panic(0x2770200, 0x4231d70)
    /usr/local/go/src/runtime/panic.go:502 +0x229
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).convertOrdering(0xc420903600, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_physical_planner.go:935 +0x27a
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).createPlanForLookupJoin(0xc420903600, 0xc42d64a660, 0xc42d6484e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_physical_planner.go:1918 +0x964
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).createPlanForNode(0xc420903600, 0xc42d64a660, 0x3025380, 0xc42d6484e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_physical_planner.go:2273 +0x1284
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).createPlanForLookupJoin(0xc420903600, 0xc42d64a660, 0xc42d648680, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_physical_planner.go:1844 +0xba
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).createPlanForNode(0xc420903600, 0xc42d64a660, 0x3025380, 0xc42d648680, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_physical_planner.go:2273 +0x1284
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).createPlanForNode(0xc420903600, 0xc42d64a660, 0x3025700, 0xc42d623600, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_physical_planner.go:2279 +0xccd
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).createPlanForNode(0xc420903600, 0xc42d64a660, 0x3025180, 0xc42d655200, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_physical_planner.go:2289 +0x863
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).PlanAndRun(0xc420903600, 0x3032700, 0xc42d572d50, 0xc42c9cf9b0, 0xc42d64a660, 0xc42cb17440, 0x3025180, 0xc42d655200, 0xc42d65ca00)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:752 +0x126
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execWithDistSQLEngine(0xc42c9cf500, 0x3032700, 0xc42d572d50, 0xc42c9cf918, 0x3, 0x7f52136cb950, 0xc42cb174d0, 0x1, 0x0, 0x0)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:982 +0x2d8
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine(0xc42c9cf500, 0x3032700, 0xc42d572d50, 0x3035d80, 0xc42cbc3c00, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:824 +0xa8a
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState(0xc42c9cf500, 0x3032700, 0xc42d572d50, 0x3035d80, 0xc42cbc3c00, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:402 +0xb3b
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt(0xc42c9cf500, 0x3032700, 0xc42d572d50, 0x3035d80, 0xc42cbc3c00, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:96 +0x341
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc42c9cf500, 0x3032640, 0xc42d35f400, 0xc4209af098, 0x5400, 0x15000, 0xc4209af130, 0xc4297cd490, 0x0, 0x0)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1116 +0x21aa
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn(0xc4209ba960, 0x3032640, 0xc42d35f400, 0xc42c9cf500, 0x5400, 0x15000, 0xc4209af130, 0xc4297cd490, 0x0, 0x0)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:389 +0xce
github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).serveImpl.func4(0xc4209ba960, 0x3032640, 0xc42d35f400, 0xc42c9cf500, 0x5400, 0x15000, 0xc4209af130, 0xc4297cd490, 0xc4297cd4a0, 0xc42cea7e60)
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:313 +0x81
created by github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*conn).serveImpl
    /go/src/github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:312 +0x107c

Hopefully this helps!

@nexdrew, you are the man. Thank you so much!

Interestingly, this is not reproducible on master, suggesting that the fix exists and needs backporting.

I believe this is the problem:

--- a/pkg/sql/distsql_physical_planner.go
+++ b/pkg/sql/distsql_physical_planner.go
@@ -1915,7 +1915,7 @@ func (dsp *DistSQLPlanner) createPlanForLookupJoin(
        distsqlrun.ProcessorCoreUnion{JoinReader: &joinReaderSpec},
        post,
        types,
-       dsp.convertOrdering(planPhysicalProps(n), plan.PlanToStreamColMap),
+       dsp.convertOrdering(planPhysicalProps(n), planToStreamColMap),
    )

This code is the same on master. But on master we determine orderings differently, and it might just happen to be harder to reproduce. I'll look into it some more.

I think this might not be reproducible on master because the new orderings code will always choose a column that is in the input (and the input map plan.PlanToStreamColMap is a "prefix" of the output map planToStreamColMap).
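To make the failure mode concrete, here is a minimal, self-contained Go sketch. It is not the CockroachDB code: `orderingCol`, the toy `convertOrdering`, and the two example maps are hypothetical stand-ins, and the sketch returns an error where the real code path panics with "index out of range". It only illustrates the reasoning above: an ordering on a column that exists solely in the join output cannot be translated through the shorter input map, while an ordering on an input column (or an empty ordering) works with either map.

```go
package main

import "fmt"

// orderingCol is a hypothetical stand-in for one column of a planner
// ordering: it names a column by index and carries a direction.
type orderingCol struct {
	planCol int
	desc    bool
}

// convertOrdering translates plan column indices into stream column indices
// through planToStreamColMap. This toy version reports an out-of-range index
// as an error; indexing the slice directly would panic instead.
func convertOrdering(ordering []orderingCol, planToStreamColMap []int) ([]orderingCol, error) {
	out := make([]orderingCol, 0, len(ordering))
	for _, c := range ordering {
		if c.planCol < 0 || c.planCol >= len(planToStreamColMap) {
			return nil, fmt.Errorf("ordering column %d out of range for map of length %d",
				c.planCol, len(planToStreamColMap))
		}
		out = append(out, orderingCol{planCol: planToStreamColMap[c.planCol], desc: c.desc})
	}
	return out, nil
}

func main() {
	// Assumed shapes: the input map covers only the input columns, and the
	// output map is the input map followed by the lookup columns.
	inputMap := []int{0, 1}
	outputMap := []int{0, 1, 2, 3}

	onInput := []orderingCol{{planCol: 1}}  // column present in the input
	onLookup := []orderingCol{{planCol: 2}} // column only in the join output

	cases := []struct {
		name     string
		ordering []orderingCol
		colMap   []int
	}{
		{"input column, input map", onInput, inputMap},
		{"lookup column, input map", onLookup, inputMap},
		{"lookup column, output map", onLookup, outputMap},
		{"empty ordering, input map", nil, inputMap},
	}
	for _, tc := range cases {
		res, err := convertOrdering(tc.ordering, tc.colMap)
		fmt.Printf("%s: result=%v err=%v\n", tc.name, res, err)
	}
}
```

Only the "lookup column, input map" case fails, which matches the one-line change in the diff: pass the output map rather than the input map.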

Near as I can tell, the crash was introduced by 16d98c6b1ad8465931be54cefa4bd8c5e7bf38a4 - I haven't tracked down why exactly.

Yeah, that commit fixes an egregious bug where we don't actually use the ordering in the lookupJoinNode at all. Without that commit the ordering here is always empty.

When applying the v2.1.4 Docker image to nexdrew/cockroach-33342-repro, the panic and subsequent crash due to an "index out of range" error goes away. So I think this is fixed with the 2.1.4 release. 😃

Nice job, @RaduBerinde and @jordanlewis!

Thanks!
