Channels with more than 1,000 messages of history mostly fail to load their messages and get stuck on "Loading...".
VM: GCP f1-micro (1 vCPU, 0.6 GB RAM)
3 users, ~74k messages in archive
Version: 5.19.0
Build Number: 5.19.1
Build Date: Tue Jan 21 23:30:31 UTC 2020
Build Hash: 57f3ca975f565b0235d72ba6eb098e73a25513cd
Build Enterprise Ready: true
DB Version: 5.19.0
No errors in the web console; the same problem occurs on mobile and for other users.
Meanwhile, mattermost.log shows:
{"level":"error","ts":1581243656.008936,"caller":"http/h2_bundle.go:4211","msg":"http2: panic serving xx.xx.xx.xx:22323: runtime error: invalid memory address or nil pointer dereference\ngoroutine 434476 [running]:\nnet/http.(*http2serverConn).runHandler.func1(0xc002d52760, 0xc0047ebf67, 0xc004e2f380)\n\tnet/http/h2_bundle.go:5706 +0x16b\npanic(0x16bde60, 0x2861d00)\n\truntime/panic.go:679 +0x1b2\ngithub.com/mattermost/mattermost-server/v5/app.(*App).GetNextPostIdFromPostList(0xc004807340, 0x0, 0x1a, 0x0)\n\tgithub.com/mattermost/mattermost-server/v5@/app/post.go:704 +0x37\ngithub.com/mattermost/mattermost-server/v5/api4.getPostsForChannelAroundLastUnread(0xc00267f530, 0x1bc7240, 0xc0021eb3b0, 0xc004c9a100)\n\tgithub.com/mattermost/mattermost-server/v5@/api4/post.go:229 +0x242\ngithub.com/mattermost/mattermost-server/v5/web.Handler.ServeHTTP(0xc0014ebe60, 0x19bf450, 0x2514b58, 0x22, 0x10001, 0x0, 0x0, 0x1bc7240, 0xc0021eb3b0, 0xc004c9a100)\n\tgithub.com/mattermost/mattermost-server/v5@/web/handlers.go:163 +0x1b9e\ngithub.com/NYTimes/gziphandler.GzipHandlerWithOpts.func1.1(0x1bc6bc0, 0xc002d52760, 0xc004c9a100)\n\tgithub.com/NYTimes/[email protected]/gzip.go:336 +0x23f\nnet/http.HandlerFunc.ServeHTTP(0xc0020038f0, 0x1bc6bc0, 0xc002d52760, 0xc004c9a100)\n\tnet/http/server.go:2007 +0x44\ngithub.com/gorilla/mux.(*Router).ServeHTTP(0xc002c16180, 0x1bc6bc0, 0xc002d52760, 0xc004805f00)\n\tgithub.com/gorilla/[email protected]/mux.go:212 +0xe2\nnet/http.serverHandler.ServeHTTP(0xc0002348c0, 0x1bc6bc0, 0xc002d52760, 0xc004805f00)\n\tnet/http/server.go:2802 +0xa4\nnet/http.initNPNRequest.ServeHTTP(0x1bcc7c0, 0xc004d4f7d0, 0xc006215180, 0xc0002348c0, 0x1bc6bc0, 0xc002d52760, 0xc004805f00)\n\tnet/http/server.go:3365 +0x8d\nnet/http.(*http2serverConn).runHandler(0xc004e2f380, 0xc002d52760, 0xc004805f00, 0xc0027dce80)\n\tnet/http/h2_bundle.go:5713 +0x9f\ncreated by net/http.(*http2serverConn).processHeaders\n\tnet/http/h2_bundle.go:5447 +0x4eb","source":"httpserver"}
{"level":"error","ts":1581244019.2803466,"caller":"http/h2_bundle.go:4211","msg":"http2: panic serving xx.xx.xx.xx:50012: runtime error: invalid memory address or nil pointer dereference\ngoroutine 438942 [running]:\nnet/http.(*http2serverConn).runHandler.func1(0xc002d52fb0, 0xc00155bf67, 0xc0061b0000)\n\tnet/http/h2_bundle.go:5706 +0x16b\npanic(0x16bde60, 0x2861d00)\n\truntime/panic.go:679 +0x1b2\ngithub.com/mattermost/mattermost-server/v5/app.(*App).GetNextPostIdFromPostList(0xc001749500, 0x0, 0x1a, 0x0)\n\tgithub.com/mattermost/mattermost-server/v5@/app/post.go:704 +0x37\ngithub.com/mattermost/mattermost-server/v5/api4.getPostsForChannelAroundLastUnread(0xc001ac23c0, 0x1bc7240, 0xc002352c40, 0xc001834f00)\n\tgithub.com/mattermost/mattermost-server/v5@/api4/post.go:229 +0x242\ngithub.com/mattermost/mattermost-server/v5/web.Handler.ServeHTTP(0xc0014ebe60, 0x19bf450, 0x2514b58, 0x22, 0x10001, 0x0, 0x0, 0x1bc7240, 0xc002352c40, 0xc001834f00)\n\tgithub.com/mattermost/mattermost-server/v5@/web/handlers.go:163 +0x1b9e\ngithub.com/NYTimes/gziphandler.GzipHandlerWithOpts.func1.1(0x1bc6bc0, 0xc002d52fb0, 0xc001834f00)\n\tgithub.com/NYTimes/[email protected]/gzip.go:336 +0x23f\nnet/http.HandlerFunc.ServeHTTP(0xc0020038f0, 0x1bc6bc0, 0xc002d52fb0, 0xc001834f00)\n\tnet/http/server.go:2007 +0x44\ngithub.com/gorilla/mux.(*Router).ServeHTTP(0xc002c16180, 0x1bc6bc0, 0xc002d52fb0, 0xc001834700)\n\tgithub.com/gorilla/[email protected]/mux.go:212 +0xe2\nnet/http.serverHandler.ServeHTTP(0xc0002348c0, 0x1bc6bc0, 0xc002d52fb0, 0xc001834700)\n\tnet/http/server.go:2802 +0xa4\nnet/http.initNPNRequest.ServeHTTP(0x1bcc7c0, 0xc0050e6120, 0xc000055500, 0xc0002348c0, 0x1bc6bc0, 0xc002d52fb0, 0xc001834700)\n\tnet/http/server.go:3365 +0x8d\nnet/http.(*http2serverConn).runHandler(0xc0061b0000, 0xc002d52fb0, 0xc001834700, 0xc0025fccc0)\n\tnet/http/h2_bundle.go:5713 +0x9f\ncreated by net/http.(*http2serverConn).processHeaders\n\tnet/http/h2_bundle.go:5447 +0x4eb","source":"httpserver"}
@Willyfrog This seems to be broken by https://github.com/mattermost/mattermost-server/commit/53e541977b2e5d9421bf92a5444a2971452560e4#diff-2f7d4446bf6cc464490bf4dd04b6d97bL235-L238 because the error check is missing. Can you confirm this?
cc @lieut-data as you submitted https://github.com/mattermost/mattermost-server/pull/12387
The error check seems to be missing on 5.18/5.19 but not on master.
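For illustration, a hedged sketch of what restoring that check looks like; the function names and signatures here are stand-ins, not the actual 5.18/5.19 or master code:

```go
// Hedged sketch of the fix shape (illustrative names, not the exact diff):
// check the store error before handing the list to the next helper, so a
// failed or timed-out query surfaces as an API error instead of a panic.
package main

import (
	"errors"
	"fmt"
)

type PostList struct {
	Order []string
}

// fetchPostsAroundLastUnread stands in for the store call; on a slow host it
// can fail with a timeout and return (nil, err).
func fetchPostsAroundLastUnread(channelID string) (*PostList, error) {
	return nil, errors.New("store: query timed out")
}

func getNextPostID(postList *PostList) string {
	if len(postList.Order) == 0 {
		return ""
	}
	return postList.Order[0]
}

func handler(channelID string) (string, error) {
	postList, err := fetchPostsAroundLastUnread(channelID)
	if err != nil {
		return "", err // the check that is missing on 5.18/5.19
	}
	return getNextPostID(postList), nil
}

func main() {
	if _, err := handler("channel-id"); err != nil {
		fmt.Println("surfaced as an error instead of a panic:", err)
	}
}
```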
The missing error check in question was originally introduced with https://github.com/mattermost/mattermost-server/commit/b832985f1dbef3bb88195afe234e8c3fcfd460c9 and would have been present from v5.14 onwards.
It was briefly fixed by the threads performance change, but I was involved in reverting that feature due to unrelated regressions, hence my commit above. As @Willyfrog notes, this is fixed in the upcoming v5.20 (by the threads performance change). I suspect the underlying problem is a database timeout that fails to get handled correctly. @taihen, heads up that while v5.20 will fix the panic, it /might/ just expose the underlying issue. Or, thanks to the threads performance changes, the timeout might also just disappear.
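To make "a database timeout that fails to get handled correctly" concrete, here is a hedged illustration using plain database/sql with a context deadline (not the Mattermost store layer): on a host as small as an f1-micro, a heavy query against ~74k posts can hit the deadline, and the important part is that the resulting error is returned instead of leaving the caller with a nil list.

```go
// Hedged illustration (not the Mattermost store code) of a query that can
// time out on an underpowered host: the error must be propagated so callers
// never see a nil result they would later dereference.
package store

import (
	"context"
	"database/sql"
	"fmt"
	"time"
)

func postIDsAroundLastUnread(db *sql.DB, channelID string, lastViewedAt int64) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	rows, err := db.QueryContext(ctx,
		"SELECT Id FROM Posts WHERE ChannelId = ? AND CreateAt >= ? ORDER BY CreateAt LIMIT 60",
		channelID, lastViewedAt)
	if err != nil {
		// On a slow instance this is often context.DeadlineExceeded; returning it
		// turns a would-be panic into a normal API error the client can retry.
		return nil, fmt.Errorf("posts query failed: %w", err)
	}
	defer rows.Close()

	var ids []string
	for rows.Next() {
		var id string
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, rows.Err()
}
```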
@taihen Have you had a chance to test this on v5.20.0?
Yes, to some degree. It exposed timeouts on the DB connection, which I think I was able to fix, getting rid of the delays.
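For anyone landing here with the same symptoms, the Go-side knobs usually involved are the standard database/sql pool and lifetime settings; a hedged sketch with placeholder values follows (on an actual Mattermost install these are driven by SqlSettings in config.json, e.g. MaxOpenConns, MaxIdleConns and ConnMaxLifetimeMilliseconds, rather than direct calls):

```go
// Hedged sketch of database/sql pool tuning (placeholder values, illustrative only).
package dbsetup

import (
	"database/sql"
	"time"

	_ "github.com/go-sql-driver/mysql" // driver choice is an assumption
)

func openDB(dsn string) (*sql.DB, error) {
	db, err := sql.Open("mysql", dsn)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(10) // keep a tiny instance from thrashing on concurrent heavy queries
	db.SetMaxIdleConns(5)
	db.SetConnMaxLifetime(time.Hour)
	return db, db.Ping()
}
```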