Dgraph: bulk loader: program exceeds 10000-thread limit

Created on 16 May 2018 · 5 Comments · Source: dgraph-io/dgraph

If you suspect this could be a bug, follow the template.

What version of Dgraph are you using?

1.0.5

Have you tried reproducing the issue with latest release?

yes

What is the hardware spec (RAM, OS)?

rdf data set: 400G
RAM: 128G
OS: centos

Steps to reproduce the issue (command/config used to run Dgraph).

When using the dgraph bulk loader (map/reduce) for batch data loading, a "program exceeds 10000-thread limit" error is raised at the beginning of the reduce phase.

full log:

MAP 05h41m10s rdf_count:4.165G rdf_speed:203.4k/sec edge_count:13.98G edge_speed:682.8k/sec
REDUCE 05h41m11s [0.00%] edge_count:0.000 edge_speed:0.000/sec plist_count:0.000 plist_speed:0.000/sec
runtime: program exceeds 10000-thread limit
fatal error: thread exhaustion

runtime stack:
runtime.throw(0x1368a7c, 0x11)
        /home/travis/.gimme/versions/go1.9.4.linux.amd64/src/runtime/panic.go:605 +0x95
runtime.checkmcount()
        /home/travis/.gimme/versions/go1.9.4.linux.amd64/src/runtime/proc.go:525 +0xa4
runtime.mcommoninit(0xc5f0d73400)
        /home/travis/.gimme/versions/go1.9.4.linux.amd64/src/runtime/proc.go:545 +0x9f
runtime.allocm(0xc42046ec00, 0x0, 0xc700000000)
        /home/travis/.gimme/versions/go1.9.4.linux.amd64/src/runtime/proc.go:1344 +0x99
runtime.newm(0x0, 0xc42046ec00)
        /home/travis/.gimme/versions/go1.9.4.linux.amd64/src/runtime/proc.go:1637 +0x39
runtime.startm(0xc42046ec00, 0x1afb500)
        /home/travis/.gimme/versions/go1.9.4.linux.amd64/src/runtime/proc.go:1728 +0x13f
runtime.handoffp(0xc42046ec00)

Expected behaviour and actual result.

kind/bug

Source

QianchaoLiu

All 5 comments

What's the command that you're using? How many files?

manishrjain on 17 May 2018

@manishrjain

dgraph bulk -r file.rdf -s file.schema --map_shards=1 --reduce_shards=1 --http localhost:8000 --zero=localhost:5080

There is only one file. On a previous try, file.rdf (350G) loaded fine.
After adding some facets to the relationships, the data set grew to 400G, and the error occurred on this try.

QianchaoLiu on 17 May 2018

Are you on a slow disk? I think this might be because disk reads are so slow that Go keeps on creating threads, and ends up hitting the limit.

Can you run your program on an SSD?

manishrjain on 17 May 2018

@manishrjain

Yes, I was running it on an HDD.

After switching to an SSD, the problem no longer occurs.

However, I think that automatically creating threads when the disk is too slow to write may be a bug, and it should be fixed for people who do not have an SSD environment.

QianchaoLiu on 20 May 2018

That's an artifact of Go. If a goroutine blocks in a system call for a while, the runtime sets that thread aside and spawns a new one to keep running other goroutines. In this case, your reads were taking so long that Go created too many system threads. There's not much that we can do from within Badger to tackle this.

manishrjain on 26 May 2018
