Go-ipfs: huge write IO without reason

Created on 10 Jun 2020 · 12 comments · Source: ipfs/go-ipfs

Version information:

go-ipfs version: 0.5.1-8431e2e
Repo version: 9
System version: amd64/linux
Golang version: go1.13.10
runs as a docker container, taken from ipfs/go-ipfs:v0.5.1

Description:

Repo of 3TB that has many different pins. No download in progress, but the repo still grows at a rate of 20-30 KBytes/s (BTRFS, noatime,compress=zlib:3,space_cache). At the same time, average IO is 20-30 MegaBytes (!) per second.
The IO is fairly random, which slows down magnetic media tremendously, with on average 2-15 ipfs processes constantly sitting in an I/O-blocked state:

# vmstat -n 60
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
...
 0  2 422624 194356  21900 19943532    0    1   215 13104 7642 17343  3  2 52 43  0
 0 15 422880 203296  21896 19935368    1    1   217 13420 7754 18294  1  2 51 45  0
 0 12 423136 211772  21768 19929644    1    5   271 13350 7774 18120  3  2 47 48  0
kind/bug need/triage

All 12 comments

Hey @skliarie,

has the situation changed in the last 3 days?

We need some additional information about your configuration:

  • do you use flatfs or badger as the datastore?
  • do you take part in a cluster, like a collab cluster?
  • which experimental flags do you use (config file)?
  • which flags do you use on the daemon?

Can you provide debug information from the daemon while it is running the unexpected IO workload?
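For example, something along these lines should capture useful debug output (a sketch; it assumes the API listens on the default 127.0.0.1:5001 and is reachable from where you run curl):

# Dump goroutine stacks, a heap profile and a 30s CPU profile from the
# daemon's pprof endpoints while the IO spike is happening.
curl -o goroutines.txt 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2'
curl -o heap.pprof 'http://127.0.0.1:5001/debug/pprof/heap'
curl -o cpu.pprof 'http://127.0.0.1:5001/debug/pprof/profile?seconds=30'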

I use flatfs.
No cluster, standalone machine.
ipfs is started with "ipfs daemon --migrate=true"; see my attached config file.

cat /tmp/ipfs_repo_stat

NumObjects: 27482408
RepoSize: 2712552599390
StorageMax: 4000000000000
RepoPath: /data/ipfs
Version: fs-repo@9

Thanks!

Do you run any operations on your node via API or access any data via the http gateway?

Do you have any service running, like a system monitoring agent, which supports IPFS and accesses the API to collect any metrics?

Does this issue persist when you restart your daemon? If so, does it start right away or after some delay?

I have no HTTP gateway, only the API.
No system monitoring like Prometheus (if that's what you mean), just basic checks.
Yes, the issue starts right away after a daemon restart.

The repo will grow as the DHT stores data (and as you send out providers for data), but 20-30KB/s is way too fast. How are you calculating disk usage? Are you seeing the disk usage reported by ipfs repo stat go up by 20KB/s?
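As a rough sketch (assuming the repo lives at /data/ipfs, as your repo stat output shows, and that you have GNU df), you could log both numbers side by side once a minute and compare the growth rates:

# Log the repo size reported by ipfs and the bytes used on the filesystem.
while true; do
    date
    ipfs repo stat | grep RepoSize
    df -B1 --output=used /data/ipfs | tail -1
    sleep 60
done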

In terms of the disk reads, this is probably bitswap. You're connected to a lot of peers that are probably looking for data. You may also want to set Datastore.BloomFilterSize (in the config) to 67108864 (64MiB). This will construct a bloom filter of 64MiB on start and every time you GC (expensive) so you don't have to check your disk to see if you have something every time a peer asks.
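A minimal sketch of setting that (the value only takes effect after a daemon restart, since the filter is built on start):

# Configure a 64 MiB bloom filter for the blockstore, then restart the daemon.
ipfs config --json Datastore.BloomFilterSize 67108864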

"ipfs repo stat" takes too long time to execute, I do df for that.

I have BloomFilterSize set to 0, I guess that means unlimited, no? If yes, this would explain the huge size of the IPFS process:

# ps awux | head
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user    3223 37.0 13.6 17767376 3360848 pts/0 Sl+ Jun15 1964:04 ipfs daemon --migrate=true

And I do see (using iotop) that it is the ipfs process that causes the writes, not BTRFS:

Total DISK READ :     218.76 K/s | Total DISK WRITE :      40.88 M/s
Actual DISK READ:     204.18 K/s | Actual DISK WRITE:      35.32 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                          
32037 be/4 user     29.17 K/s   40.88 M/s  0.00 % 99.99 % ipfs daemon --migrate=true

The writes don't have a constant speed; it fluctuates between 0 and 40MB/s. I tried to find a correlation between the writes (vmstat -n 1) and the bitswap operation counters (watch ipfs stats bitswap), to no avail.
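For reference, the kind of side-by-side sampling I mean (just a sketch; file names and intervals are arbitrary):

# Timestamp vmstat lines and bitswap counters so the two logs can be lined up.
vmstat -n 1 | while read line; do echo "$(date +%T) $line"; done > vmstat.log &
while true; do
    echo "=== $(date +%T)"
    ipfs stats bitswap
    sleep 1
done > bitswap.log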

One thing I noticed is that the writes come in bursts. See the attached output of vmstat -n 1 over 10 minutes.

It looks like there are some buffers that are flushed once in a while. Keep in mind that the node has a huge number of peers (8755 at the moment) and 3TB worth of various popular pinsets.

vmstat-n1.txt.gz

Attached is the imon.sh script I wrote to monitor the ipfs node, along with several hours of the output it produces.

imon.sh.gz
ipfs_repo_stats.txt.gz

I have BloomFilterSize set to 0, I guess that means unlimited, no? If yes, this would explain the huge size of the IPFS process:

0 means 0 (disabled).

IPFS memory usage is usually correlated with the number of peers you have.


Looking at the profile, I see what's going on. The problem is the DHT. Your node is backed up trying to write incoming provider records to disk: https://github.com/libp2p/go-libp2p-kad-dht/issues/675. It could just be that your disk is too slow, but I'm wondering if your node just _happens_ to be close to some really popular content.

For now, I'd try running your daemon with "dhtclient".
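Something like this (a sketch; the Routing.Type config entry is the persistent equivalent of the command-line flag):

# Run the daemon as a DHT client only, so it stops accepting provider records
# from other peers:
ipfs daemon --routing=dhtclient
# Or persist the same setting in the config instead of passing the flag:
ipfs config Routing.Type dhtclient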

cc @aschmahmann

ipfs 0.6.0, switched to dhtclient; the args are now "--routing=dhtclient --migrate=true --enable-pubsub-experiment", and BloomFilterSize is 536870912.
Write speed dropped to an average of 5MB per minute, fluctuating between 1 and 15MB/s in a much smoother way than before.
The number of peers dropped from 8k to about 700, no idea why.
The node feels faster though.

It dropped because peers are no longer trying to connect to you to advertise/find content in the DHT. Now, all of your peers are:

  1. Peers you've connected to (e.g., via the DHT).
  2. Peers trying to download content from you.