If you suspect this could be a bug, follow the template.
What version of Dgraph are you using?
release/v1.0.10
Have you tried reproducing the issue with latest release?
Yes.
What is the hardware spec (RAM, OS)?
CPU: 32-core
MEM: 64GB
STORAGE: SSD
OS: CentOS 7.4
Steps to reproduce the issue (command/config used to run Dgraph).
I haven't been able to reproduce this so far.
I exported the database from 1.0.9 and imported it into 1.0.10.
The panic occurred unexpectedly during snapshotting after a few days.
We've stopped running this node for now.
Nov 9 15:51:43 dgraph101 dgraph[45581]: panic: runtime error: invalid memory address or nil pointer dereference
Nov 9 15:51:43 dgraph101 dgraph[45581]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0xbd8aee]
Nov 9 15:51:43 dgraph101 dgraph[45581]: goroutine 8325297 [running]:
Nov 9 15:51:43 dgraph101 dgraph[45581]: github.com/dgraph-io/dgraph/vendor/github.com/dgraph-io/badger.(*valueLog).write(0xc0000992c8, 0xc1ab5a2b40, 0x1, 0xa, 0x0, 0x0)
Nov 9 15:51:43 dgraph101 dgraph[45581]: /ext-go/1/src/github.com/dgraph-io/dgraph/vendor/github.com/dgraph-io/badger/value.go:929 +0x1ee
Nov 9 15:51:43 dgraph101 dgraph[45581]: github.com/dgraph-io/dgraph/vendor/github.com/dgraph-io/badger.(*DB).writeRequests(0xc000099180, 0xc1ab5a2b40, 0x1, 0xa, 0xc14c079aa0, 0x0)
Nov 9 15:51:43 dgraph101 dgraph[45581]: /ext-go/1/src/github.com/dgraph-io/dgraph/vendor/github.com/dgraph-io/badger/db.go:593 +0x106
Nov 9 15:51:43 dgraph101 dgraph[45581]: github.com/dgraph-io/dgraph/vendor/github.com/dgraph-io/badger.(*DB).doWrites.func1(0xc1ab5a2b40, 0x1, 0xa)
Nov 9 15:51:43 dgraph101 dgraph[45581]: /ext-go/1/src/github.com/dgraph-io/dgraph/vendor/github.com/dgraph-io/badger/db.go:662 +0x55
Nov 9 15:51:43 dgraph101 dgraph[45581]: created by github.com/dgraph-io/dgraph/vendor/github.com/dgraph-io/badger.(*DB).doWrites
Nov 9 15:51:43 dgraph101 dgraph[45581]: /ext-go/1/src/github.com/dgraph-io/dgraph/vendor/github.com/dgraph-io/badger/db.go:711 +0x30e
Nov 9 15:51:43 dgraph101 dgraph[45581]: W1109 15:51:43.024667 45581 draft.go:327] Error while calling CreateSnapshot: requested index is older than the existing snapshot. Retrying...
The panic occurred here in badger.
// write is thread-unsafe by design and should not be called concurrently.
func (vlog *valueLog) write(reqs []*request) error {
	vlog.filesLock.RLock()
	maxFid := atomic.LoadUint32(&vlog.maxFid)
	curlf := vlog.filesMap[maxFid]
	vlog.filesLock.RUnlock()
	...
	for i := range reqs {
		b := reqs[i]
		b.Ptrs = b.Ptrs[:0]
		for j := range b.Entries {
			e := b.Entries[j]
			var p valuePointer
			p.Fid = curlf.fid // <-- PANIC
Badger appears to assume that vlog.filesMap[maxFid] always holds a non-nil value. I think it would be safer to check for nil here, because a map lookup returns nil whenever the key is missing.
func TestNil(t *testing.T) {
	m := map[uint32]*logFile{}
	var p valuePointer
	curlf := m[0]
	p.Fid = curlf.fid // <-- PANIC
}
Thank you!
That fid should not be nil. If it is, we have a bug.
I reproduced this, and there were "too many open files" errors before the panic occurred.
Nov 14 23:19:19 dgraph101 dgraph[141898]: E1114 23:19:19.722793 141898 lists.go:97] Can't read the proc file. Err: open /proc/self/stat: too many open files
Dgraph looks fine for now after increasing LimitNOFILE.
Sorry for taking up your time on this issue :-(
Thank you!
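LimitNOFILE is the systemd unit setting that caps the number of open file descriptors for a service. As a rough way to see which limit a process actually runs with, the sketch below reads RLIMIT_NOFILE through Go's syscall package on a Unix-like system. The `openFileLimit` helper and the 65536 threshold are assumptions made for this example, not values recommended by Dgraph.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit returns the soft and hard limits on open file descriptors
// for the current process.
func openFileLimit() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("could not read RLIMIT_NOFILE:", err)
		return
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)

	// wantMin is an arbitrary threshold chosen for illustration only.
	const wantMin = 65536
	if soft < wantMin {
		fmt.Printf("soft limit is below %d; consider raising LimitNOFILE\n", wantMin)
	}
}
```

Running this under the same service manager and user as the database shows the limit that applies to the service, which can differ from the limit in an interactive shell.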