Describe the bug
vmstorage stopped with the following log:
unexpected fault address 0x7ef816d0ad4f
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7ef816d0ad4f pc=0x46a0da]
goroutine 73314 [running]:
runtime.throw(0xa4b142, 0x5)
runtime/panic.go:1116 +0x72 fp=0xce641053e0 sp=0xce641053b0 pc=0x436112
runtime.sigpanic()
runtime/signal_unix.go:692 +0x443 fp=0xce64105410 sp=0xce641053e0 pc=0x44ccb3
runtime.memmove(0xc1a66f2000, 0x7ef816d0ad4f, 0x5fc)
runtime/memmove_amd64.s:354 +0x40a fp=0xce64105418 sp=0xce64105410 pc=0x46a0da
github.com/VictoriaMetrics/VictoriaMetrics/lib/fs.(*ReaderAt).MustReadAt(0xca84ede550, 0xc1a66f2000, 0x5fc, 0xa000, 0x362ed4f)
github.com/VictoriaMetrics/VictoriaMetrics/lib/fs/reader_at.go:79 +0x15d fp=0xce64105560 sp=0xce64105418 pc=0x6fb4ad
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).readInmemoryBlock(0xc9df7601e0, 0xc266dc3ac0, 0x0, 0x362ed4f, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:336 +0x12d fp=0xce64105638 sp=0xce64105560 pc=0x73ab4d
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).getInmemoryBlock(0xc9df7601e0, 0xc266dc3ac0, 0xbc, 0x100, 0xbd, 0xbc)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:324 +0xae fp=0xce64105698 sp=0xce64105638 pc=0x73a85e
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).nextBlock(0xc9df7601e0, 0xce64105750, 0xbd)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:258 +0x99 fp=0xce641056e0 sp=0xce64105698 pc=0x739ef9
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).Seek(0xc9df7601e0, 0xc82eba9980, 0x87, 0x1980)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:148 +0x45f fp=0xce641057c8 sp=0xce641056e0 pc=0x7395df
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*TableSearch).Seek(0xcba68c6148, 0xc82eba9980, 0x87, 0x1980)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/table_search.go:96 +0x168 fp=0xce64105890 sp=0xce641057c8 pc=0x745e18
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).getTSIDByMetricName(0xcba68c6140, 0xc74752be68, 0xc40638186f, 0x85, 0x76791, 0xc00024e210, 0xc74752be68)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:1385 +0x1c5 fp=0xce64105970 sp=0xce64105890 pc=0x76cf15
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).GetOrCreateTSIDByName(0xcba68c6140, 0xc74752be68, 0xc40638186f, 0x85, 0x76791, 0x0, 0x20)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:554 +0x1da fp=0xce641059f8 sp=0xce64105970 pc=0x764caa
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).add(0xc000346000, 0xc74750e000, 0x0, 0x1000, 0xc4b905e000, 0x8ad, 0x1c00, 0x40, 0x120, 0x10000, ...)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:1205 +0xc3b fp=0xce64105bd0 sp=0xce641059f8 pc=0x7aa9bb
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).AddRows(0xc000346000, 0xc4b905e000, 0x8ad, 0x1c00, 0xcde375a940, 0x0, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:1088 +0x11c fp=0xce64105d20 sp=0xce64105bd0 pc=0x7a9a1c
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).processVMInsertConn(0xc019cfe580, 0xc378446480, 0x4, 0xa5b8d9)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:356 +0x45e fp=0xce64105eb0 sp=0xce64105d20 pc=0x7c1bfe
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).RunVMInsert.func1(0xc019cfe580, 0xb0e0e0, 0xc3389cabc0)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:161 +0x364 fp=0xce64105fc8 sp=0xce64105eb0 pc=0x7c7f44
runtime.goexit()
runtime/asm_amd64.s:1373 +0x1 fp=0xce64105fd0 sp=0xce64105fc8 pc=0x468b21
created by github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).RunVMInsert
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:133 +0x21e
Detailed log:
panic.log
Version
The line returned when passing --version command line flag to binary. For example:
$ ./vmstorage-prod -version
vmstorage-20200622-062829-heads-cluster-0-gf227799c
$ ./vminsert-prod -version
vminsert-20200610-061147-heads-cluster-0-gd71b6e65
Used command-line flags
./vmstorage-prod -storageDataPath /data1/vmdata -retentionPeriod 6 -search.maxUniqueTimeseries 5000000
Additional context
After the vmstorage instance panicked, it no longer accepted samples even after I restarted it (rate(vm_vminsert_metrics_read_total) == 0).
The vminsert instance also emitted many log lines like this:
2020-06-22T21:40:30.343Z warn VictoriaMetrics/app/vminsert/main.go:180 error in "/insert/0/influx/write": cannot handle more than 32 concurrent inserts during 10s; possible solutions: increase `-insert.maxQueueDuration`, increase `-maxConcurrentInserts`, increase server capacity
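The warning above describes a cap on concurrent inserts combined with a maximum queueing time. A minimal Go sketch of that idea follows, assuming a channel-based semaphore. The names limiter, newLimiter and do are hypothetical and are not taken from the vminsert source; the values 32 and 10s come from the log line.

```go
package main

import (
	"fmt"
	"time"
)

// limiter is a hypothetical concurrency limiter: at most cap(slots) calls
// run at once, and a caller waits at most maxWait for a free slot.
type limiter struct {
	slots   chan struct{}
	maxWait time.Duration
}

// newLimiter builds a limiter; maxWaitMs is the queueing limit in milliseconds.
func newLimiter(maxConcurrent, maxWaitMs int) *limiter {
	return &limiter{
		slots:   make(chan struct{}, maxConcurrent),
		maxWait: time.Duration(maxWaitMs) * time.Millisecond,
	}
}

// do runs f once a slot is free, or returns an error if no slot
// becomes free within maxWait.
func (l *limiter) do(f func() error) error {
	select {
	case l.slots <- struct{}{}:
		// Fast path: a slot was free immediately.
	default:
		// Slow path: queue until a slot frees up or the wait limit expires.
		t := time.NewTimer(l.maxWait)
		defer t.Stop()
		select {
		case l.slots <- struct{}{}:
		case <-t.C:
			return fmt.Errorf("cannot handle more than %d concurrent inserts during %s",
				cap(l.slots), l.maxWait)
		}
	}
	defer func() { <-l.slots }() // release the slot when f returns
	return f()
}

func main() {
	l := newLimiter(32, 10000) // 32 concurrent inserts, 10s queueing limit
	err := l.do(func() error { return nil })
	fmt.Println("insert result:", err)
}
```

With such a limiter, a backend that stops draining requests makes every slot stay busy, so new callers time out in the queue. That would be consistent with the warnings appearing while vmstorage was not accepting samples.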
Update:
I updated vminsert to f227799c.
The panic still occurs.
@n4mine , could you answer the following questions?
Which GOARCH do you use? amd64 or something else, like arm64?
Does the SIGBUS panic appear reliably after every restart of vmstorage? Or does it appear infrequently at unpredictable times?
Temporary workaround for the bug is to pass the -fs.disableMmap command-line flag to vmstorage.
@valyala
Which GOARCH do you use? amd64 or something other like arm64?
amd64
I built vmstorage using make vminsert-prod vmselect-prod vmstorage-prod without any additional flags.
Does the SIGBUS panic appear reliably after every restart of vmstorage? Or does it appear infrequently at unpredictable times?
It appears infrequently, at unpredictable times.
@n4mine , thanks for this information!
Could you build vmstorage from the commit 521c657f8d02aa4762ee55f1992bb2fbc3ae4c14 and check whether the SIGBUS panic appears in it?
@valyala
I have deployed 521c657 and will update the status in the next few days.
@valyala
It panicked just now:
panic2.log
unexpected fault address 0x7fe9216af000
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7fe9216af000 pc=0x46a0f2]
goroutine 245 [running]:
runtime.throw(0xa4b142, 0x5)
runtime/panic.go:1116 +0x72 fp=0xc87833ad50 sp=0xc87833ad20 pc=0x436112
runtime.sigpanic()
runtime/signal_unix.go:692 +0x443 fp=0xc87833ad80 sp=0xc87833ad50 pc=0x44ccb3
runtime.memmove(0xc9f9e06000, 0x7fe9216abdfc, 0x5b5b)
runtime/memmove_amd64.s:363 +0x422 fp=0xc87833ad88 sp=0xc87833ad80 pc=0x46a0f2
github.com/VictoriaMetrics/VictoriaMetrics/lib/fs.(*ReaderAt).MustReadAt(0xc0330220f0, 0xc9f9e06000, 0x5b5b, 0x8000, 0x2ece9dfc)
github.com/VictoriaMetrics/VictoriaMetrics/lib/fs/reader_at.go:79 +0x15d fp=0xc87833aed0 sp=0xc87833ad88 pc=0x6fb50d
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).readInmemoryBlock(0xc22b3e8a00, 0xc90681d180, 0x0, 0x2ece9dfc, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:336 +0x12d fp=0xc87833afa8 sp=0xc87833aed0 pc=0x73abfd
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).getInmemoryBlock(0xc22b3e8a00, 0xc90681d180, 0x39, 0xc308980e01, 0x3a, 0x39)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:324 +0xae fp=0xc87833b008 sp=0xc87833afa8 pc=0x73a90e
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).nextBlock(0xc22b3e8a00, 0xc87833b0c0, 0x39)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:258 +0x99 fp=0xc87833b050 sp=0xc87833b008 pc=0x739fa9
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).Seek(0xc22b3e8a00, 0xc7a2255b80, 0x3f, 0x280)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:148 +0x45f fp=0xc87833b138 sp=0xc87833b050 pc=0x73968f
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*TableSearch).Seek(0xc851a503c8, 0xc7a2255b80, 0x3f, 0x280)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/table_search.go:96 +0x168 fp=0xc87833b200 sp=0xc87833b138 pc=0x745ec8
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).updateMetricIDsForOrSuffixNoFilter(0xc851a503c0, 0xc7a2255b80, 0x3f, 0x280, 0xee6b2b2, 0xc8c217fc20, 0x58, 0xc8c217fc20)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:2176 +0xd4 fp=0xc87833b2d8 sp=0xc87833b200 pc=0x774b34
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).updateMetricIDsForOrSuffixesNoFilter(0xc851a503c0, 0xc87a99d5f0, 0xee6b2b2, 0xc8c217fc20, 0x0, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:2145 +0x27c fp=0xc87833b3a0 sp=0xc87833b2d8 pc=0x77425c
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).getMetricIDsForTagFilter(0xc851a503c0, 0xc87a99d5f0, 0xee6b2b2, 0x2a01, 0xc7a2255b80, 0x8)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:2021 +0x96 fp=0xc87833b450 sp=0xc87833b3a0 pc=0x772e96
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).getMetricIDsForDateTagFilter(0xc851a503c0, 0xc9a3fd2990, 0x4803, 0xc862749560, 0x9, 0x10, 0x0, 0xee6b2b2, 0x0, 0x0, ...)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:2705 +0x2e4 fp=0xc87833b560 sp=0xc87833b450 pc=0x779894
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).getMetricIDsForDateAndFilters(0xc851a503c0, 0x4803, 0xc4bf408840, 0x4c4b41, 0xc851a50418, 0x3, 0x1)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:2505 +0x896 fp=0xc87833b6d8 sp=0xc87833b560 pc=0x776776
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).tryUpdatingMetricIDsForDateRange(0xc851a503c0, 0xc8c217fb60, 0xc4bf408840, 0x172dbaabfa0, 0x172dbe66910, 0x4c4b41, 0xc8c217fb60, 0x60)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:2362 +0x258 fp=0xc87833b768 sp=0xc87833b6d8 pc=0x775df8
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).updateMetricIDsForTagFilters(0xc851a503c0, 0xc8c217fb60, 0xc4bf408840, 0x172dbaabfa0, 0x172dbe66910, 0x4c4b41, 0x415bda, 0xc926c1be00)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:1947 +0x81 fp=0xc87833b868 sp=0xc87833b768 pc=0x7727b1
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).searchMetricIDs(0xc851a503c0, 0xc02ea860f8, 0x1, 0x1, 0x172dbaabfa0, 0x172dbe66910, 0x4c4b40, 0xc03226d148, 0xc26d48cce0, 0xc926c1be00, ...)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:1917 +0x101 fp=0xc87833b920 sp=0xc87833b868 pc=0x772271
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).searchTSIDs(0xc851a503c0, 0xc02ea860f8, 0x1, 0x1, 0x172dbaabfa0, 0x172dbe66910, 0x4c4b40, 0x172dbe66900, 0xc9a3fd2a01, 0xc2566e9500, ...)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:1499 +0x111 fp=0xc87833ba40 sp=0xc87833b920 pc=0x76e6e1
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexDB).searchTSIDs(0xc020490800, 0xc02ea860f8, 0x1, 0x1, 0x172dbaabfa0, 0x172dbe66910, 0x4c4b40, 0x0, 0x0, 0x0, ...)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:1333 +0x222 fp=0xc87833bba0 sp=0xc87833ba40 pc=0x76c902
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).searchTSIDs(0xc0001c4000, 0xc02ea860f8, 0x1, 0x1, 0x172dbaabfa0, 0x172dbe66910, 0x4c4b40, 0xc02ea860f8, 0xc4bf408840, 0xc008500700, ...)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:858 +0xcd fp=0xc87833bc50 sp=0xc87833bba0 pc=0x7a7b2d
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Search).Init(0xc052485040, 0xc0001c4000, 0xc02ea860f8, 0x1, 0x1, 0x172dbaabfa0, 0x172dbe66910, 0x4c4b40)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/search.go:157 +0xbb fp=0xc87833bce8 sp=0xc87833bc50 pc=0x79cf1b
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).processVMSelectSearchQuery(0xc03691e200, 0xc052484fc0, 0x0, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:827 +0x338 fp=0xc87833bdb8 sp=0xc87833bce8 pc=0x7c6d28
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).processVMSelectRequest(0xc03691e200, 0xc052484fc0, 0x0, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:510 +0x38c fp=0xc87833be38 sp=0xc87833bdb8 pc=0x7c3cdc
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).processVMSelectConn(0xc03691e200, 0xc0315e51d0, 0x4, 0xa5b913)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:373 +0xbf fp=0xc87833beb0 sp=0xc87833be38 pc=0x7c25af
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).RunVMSelect.func1(0xc03691e200, 0xb0e1a0, 0xc052489380)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:235 +0x36f fp=0xc87833bfc8 sp=0xc87833beb0 pc=0x7c871f
runtime.goexit()
runtime/asm_amd64.s:1373 +0x1 fp=0xc87833bfd0 sp=0xc87833bfc8 pc=0x468b21
created by github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).RunVMSelect
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:200 +0x21e
goroutine 1 [chan receive, 56 minutes]:
github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil.WaitForSigterm(0xc000000008, 0xa76d80)
github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil/signal.go:21 +0xdc
main.main()
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:81 +0x5cb
version
$ ./vmstorage-prod -version
vmstorage-20200623-105303-tags-v1.32.2-cluster-597-g521c657f
Thanks for the update, @n4mine !
Does this SIGBUS panic appear on all the physical computers where vmstorage runs, or only on a certain group of them? Perhaps this issue is related to hardware problems on a certain group of physical computers?
Perhaps this issue is related to hardware problems on a certain group of physical computers?
Hmm, I'm not sure.
I have 11 vmstorage nodes, and the panic happened on 6 of them. All nodes have the same hardware.
Try downgrading vmstorage to v1.36.3-cluster and check whether the panic appears there. The next release after v1.36.3-cluster, v1.37.0-cluster, contains an optimization that reads small data chunks with the standard copy() function from Go. The SIGBUS panic now occurs inside copy(). Previously a cgo-based implementation was used for copying chunks. See the commit d12019767690ab9833984a29aede974a4fcac52a , which adds the optimization, for details.
@n4mine , the commit 08edb90814b7bb850617b42f43718cdf334aa6b1 contains yet another attempt to fix the SIGBUS panic. It falls back to reading data from the last 4KB of an mmap'ed file with the cgo-based copy that was used before v1.37.0-cluster. If vmstorage built from v1.36.3-cluster doesn't generate SIGBUS panics, then this commit should help. Could you try building vmstorage from it and verify whether it eliminates the SIGBUS panic?
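The fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit. The names mmapReader, openMmapReader, readAt and writeTemp are hypothetical. The real fallback is described as a cgo-based copy, while this sketch swaps in a pread-style os.File.ReadAt for the last 4KB so that it builds without cgo. It assumes a Unix-like system where syscall.Mmap is available.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// tailSize is the trailing region that bypasses copy(); 4KB as stated above.
const tailSize = 4096

// mmapReader is a hypothetical reader over a read-only mmap'ed file.
type mmapReader struct {
	f    *os.File
	data []byte // mapping of the whole file
}

// openMmapReader maps the whole file read-only. Mapping an empty file
// fails, so callers are assumed to pass non-empty files.
func openMmapReader(path string) (*mmapReader, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	fi, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, err
	}
	data, err := syscall.Mmap(int(f.Fd()), 0, int(fi.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		f.Close()
		return nil, err
	}
	return &mmapReader{f: f, data: data}, nil
}

// close unmaps the file and closes the descriptor.
func (r *mmapReader) close() error {
	if err := syscall.Munmap(r.data); err != nil {
		r.f.Close()
		return err
	}
	return r.f.Close()
}

// readAt fills dst with the bytes starting at off. Ranges that end before
// the last tailSize bytes use copy() on the mapping; ranges touching the
// tail use ReadAt (a stand-in for the cgo-based copy in the real fix).
func (r *mmapReader) readAt(dst []byte, off int64) error {
	end := off + int64(len(dst))
	if off < 0 || end > int64(len(r.data)) {
		return fmt.Errorf("range [%d, %d) is outside the file of size %d",
			off, end, len(r.data))
	}
	if end <= int64(len(r.data))-tailSize {
		copy(dst, r.data[off:end])
		return nil
	}
	_, err := r.f.ReadAt(dst, off)
	return err
}

// writeTemp stores data in a temporary file and returns its path.
func writeTemp(data []byte) (string, error) {
	f, err := os.CreateTemp("", "mmap-tail-*")
	if err != nil {
		return "", err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		os.Remove(f.Name())
		return "", err
	}
	return f.Name(), f.Close()
}

func main() {
	path, err := writeTemp(make([]byte, 3*tailSize))
	if err != nil {
		panic(err)
	}
	defer os.Remove(path)
	r, err := openMmapReader(path)
	if err != nil {
		panic(err)
	}
	defer r.close()
	buf := make([]byte, 16)
	fmt.Println("head read error:", r.readAt(buf, 0))
	fmt.Println("tail read error:", r.readAt(buf, 3*tailSize-16))
}
```

Such a fallback only helps if the fault is specific to how copy() touches the end of the mapping. If the underlying pages cannot be read at all, the head path can still fault.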
@valyala
After the upgrade, it panicked again:
$ ./vmstorage-prod -version
vmstorage-20200623-210915-heads-cluster-0-g46c5c077
goroutine 1570 [running]:
runtime.throw(0xa4b142, 0x5)
runtime/panic.go:1116 +0x72 fp=0xc9ad7d2f40 sp=0xc9ad7d2f10 pc=0x436112
runtime.sigpanic()
runtime/signal_unix.go:692 +0x443 fp=0xc9ad7d2f70 sp=0xc9ad7d2f40 pc=0x44ccb3
runtime.memmove(0xc4872f0000, 0x7f24e1bf902c, 0x1cdf)
runtime/memmove_amd64.s:354 +0x40a fp=0xc9ad7d2f78 sp=0xc9ad7d2f70 pc=0x46a0da
github.com/VictoriaMetrics/VictoriaMetrics/lib/fs.(*ReaderAt).MustReadAt(0xc00987c230, 0xc4872f0000, 0x1cdf, 0x8000, 0x368ec02c)
github.com/VictoriaMetrics/VictoriaMetrics/lib/fs/reader_at.go:79 +0x15d fp=0xc9ad7d30c0 sp=0xc9ad7d2f78 pc=0x6fb50d
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).readInmemoryBlock(0xc8dee01500, 0xc11e8be9f0, 0x0, 0x368ec02c, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:336 +0x12d fp=0xc9ad7d3198 sp=0xc9ad7d30c0 pc=0x73ac1d
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).getInmemoryBlock(0xc8dee01500, 0xc11e8be9f0, 0x54, 0x401, 0x55, 0x54)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:324 +0xae fp=0xc9ad7d31f8 sp=0xc9ad7d3198 pc=0x73a92e
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).nextBlock(0xc8dee01500, 0xc9ad7d32b0, 0x54)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:258 +0x99 fp=0xc9ad7d3240 sp=0xc9ad7d31f8 pc=0x739fc9
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*partSearch).Seek(0xc8dee01500, 0xc913dbf800, 0x11, 0x1800)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/part_search.go:148 +0x45f fp=0xc9ad7d3328 sp=0xc9ad7d3240 pc=0x7396af
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*TableSearch).Seek(0xc065920008, 0xc913dbf800, 0x11, 0x1800)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/table_search.go:96 +0x168 fp=0xc9ad7d33f0 sp=0xc9ad7d3328 pc=0x745ee8
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset.(*TableSearch).FirstItemWithPrefix(0xc065920008, 0xc913dbf800, 0x11, 0x1800, 0xc9ad7d3498, 0x8)
github.com/VictoriaMetrics/VictoriaMetrics/lib/mergeset/table_search.go:123 +0x4d fp=0xc9ad7d3420 sp=0xc9ad7d33f0 pc=0x74636d
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).searchMetricName(0xc065920000, 0xc7644b0b40, 0x0, 0x240, 0x16151f43f253ea5c, 0x0, 0x9, 0x747201, 0x4, 0x9, ...)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:1426 +0x19f fp=0xc9ad7d3508 sp=0xc9ad7d3420 pc=0x76d71f
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexSearch).storeDateMetricID(0xc065920000, 0x4805, 0x16151f43f253ea5c, 0x0, 0x0, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:2604 +0x38e fp=0xc9ad7d37c0 sp=0xc9ad7d3508 pc=0x7775ae
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).updatePerDateData(0xc0001c0000, 0xc6d5740000, 0x2710, 0x4000, 0x0, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:1436 +0xd7a fp=0xc9ad7d39f8 sp=0xc9ad7d37c0 pc=0x7ac55a
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).add(0xc0001c0000, 0xc6d5740000, 0x2710, 0x4000, 0xc4e5bf8000, 0x2710, 0x2c00, 0x40, 0x120, 0xc002380700, ...)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:1229 +0x10bf fp=0xc9ad7d3bd0 sp=0xc9ad7d39f8 pc=0x7aaf0f
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).AddRows(0xc0001c0000, 0xc4e5bf8000, 0x2710, 0x2c00, 0xc74f38f040, 0x0, 0x0)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:1088 +0x11c fp=0xc9ad7d3d20 sp=0xc9ad7d3bd0 pc=0x7a9aec
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).processVMInsertConn(0xc00193c200, 0xc0009ee480, 0x4, 0xa5b8f3)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:349 +0x191 fp=0xc9ad7d3eb0 sp=0xc9ad7d3d20 pc=0x7c1a01
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).RunVMInsert.func1(0xc00193c200, 0xb0e1c0, 0xc009a58780)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:161 +0x364 fp=0xc9ad7d3fc8 sp=0xc9ad7d3eb0 pc=0x7c8014
runtime.goexit()
runtime/asm_amd64.s:1373 +0x1 fp=0xc9ad7d3fd0 sp=0xc9ad7d3fc8 pc=0x468b21
created by github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport.(*Server).RunVMInsert
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/transport/server.go:133 +0x21e
@n4mine , thanks for the update!
Could you downgrade vmstorage binary to v1.36.3-cluster (commit 37aa4fe2823a79eabdacd758a77cb96e7d80e008 ) and check whether it has the panic with SIGBUS? Other cluster component versions could be left untouched while downgrading vmstorage.
@valyala
downgrade vmstorage binary to v1.36.3-cluster
I downgraded vmstorage yesterday. It has worked fine the whole time (about 7 hours), with no panic.
Today I replaced the old vmstorage instances with new machines (using commit 46c5c0772c3ed5c653be21d40829a41d3ba4a903). It has already been 9 hours and everything is fine.
So it is possible that the issue was caused by faulty machines. Probably, the SIGBUS panic is thrown when the operating system cannot read the contents of an mmap'ed file for some reason. For instance, if network-attached storage (NFS, EFS, EBS, GCP persistent disk, etc.) becomes temporarily unavailable, the OS may raise SIGBUS when the mmap'ed file is read - see https://www.sublimetext.com/blog/articles/use-mmap-with-care for details.
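To illustrate how a read from an mmap'ed file can fault when its backing data goes away, here is a small Go sketch. It is an assumption-laden demo, not vmstorage code. The names copyFromMapping and faultDemo are hypothetical, and truncating the file stands in for storage becoming unavailable. It uses runtime/debug.SetPanicOnFault so that the fault becomes a recoverable panic, and it assumes a Unix-like system such as Linux.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"syscall"
)

// copyFromMapping copies len(dst) bytes from src, which is assumed to be
// an mmap'ed region, and converts a fault raised during the copy into an
// ordinary error instead of a process crash.
func copyFromMapping(dst, src []byte) (err error) {
	// SetPanicOnFault applies to the current goroutine only.
	old := debug.SetPanicOnFault(true)
	defer debug.SetPanicOnFault(old)
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("fault while reading mmap'ed data: %v", r)
		}
	}()
	copy(dst, src)
	return nil
}

// faultDemo maps a temporary two-page file and reads it through the
// mapping. When truncate is true, the file is shrunk to zero bytes after
// mapping, so the mapped pages lose their backing data.
func faultDemo(truncate bool) error {
	f, err := os.CreateTemp("", "mmap-fault-*")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	size := 2 * os.Getpagesize()
	if err := f.Truncate(int64(size)); err != nil {
		return err
	}
	data, err := syscall.Mmap(int(f.Fd()), 0, size,
		syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		return err
	}
	defer syscall.Munmap(data)

	if truncate {
		// Simulate the backing storage disappearing under the mapping.
		if err := f.Truncate(0); err != nil {
			return err
		}
	}
	buf := make([]byte, size)
	return copyFromMapping(buf, data)
}

func main() {
	fmt.Println("intact file:   ", faultDemo(false))
	fmt.Println("truncated file:", faultDemo(true))
}
```

Without SetPanicOnFault, the Go runtime treats such a fault as fatal and prints "unexpected fault address" followed by "fatal error: fault", which matches the traces in this issue. Reading with pread-style calls instead of mmap would surface the same storage problem as an error return value.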
FYI, all the commits mentioned above have been included in v1.37.3.
@valyala
Today I replaced the old vmstorage instances with new machines (using commit 46c5c07). It has already been 9 hours and everything is fine.
After 4.23 days, all the new vmstorage instances work fine,
while the old instances still panic sometimes.
So I think it is a hardware or kernel problem.
My hardware and kernel version are listed here:
CPU: Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
MEM: 384G
DISK: raid0(INTEL SSDPE2KX020T8 * 2)
$ uname -r
3.10.0-514.16.1.el7.x86_64
I am closing this issue and will update it if I have more info.