Zfs: Very slow scrub in Ubuntu 16.04

Created on 11 May 2016 · 7 Comments · Source: openzfs/zfs

I have two drives in a mirror configuration and the scrub is running at ~2M/s. The system is otherwise not busy and very responsive. 15 days to finish a scrub is a bit too long, so I am going to have to stop it. Is there any configuration I could be missing that would improve the speed?

zpool status -v
  pool: storage
 state: ONLINE
  scan: scrub in progress since Tue May 10 21:13:54 2016
    787M scanned out of 2.61T at 2.21M/s, 344h20m to go
    0 repaired, 0.03% done
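
For reference, the "to go" estimate can be reproduced from the other numbers in that output. This is only a sketch of the arithmetic: eta_hours is a hypothetical helper, and it assumes zpool's T and M suffixes are binary units.

```shell
# Values copied from the zpool status output above.
total_tib=2.61      # data to scan, in TiB
scanned_mib=787     # already scanned, in MiB
rate_mib_s=2.21     # reported scan rate, in MiB/s

# Hypothetical helper: remaining hours = (total - scanned) / rate.
# Assumes binary units, i.e. 1 TiB = 1048576 MiB.
eta_hours() {
    awk -v t="$1" -v s="$2" -v r="$3" \
        'BEGIN { printf "%.0f\n", (t * 1048576 - s) / r / 3600 }'
}

echo "about $(eta_hours "$total_tib" "$scanned_mib" "$rate_mib_s") hours remaining"
```

That works out to roughly 344 hours, a little over 14 days, which is in line with the 344h20m reported. The estimate only extrapolates the current rate, so it says nothing about whether the rate will change later.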

All 7 comments

AFAIK, there were some options to speed up zfs resilvering.

echo 2000 > /sys/module/zfs/parameters/zfs_resilver_min_time_ms
echo 0 > /sys/module/zfs/parameters/zfs_resilver_delay

But before trying this you might check that your system didn't run out of memory, and have your disks checked for bad / pending sectors with smartctl.
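
Before overwriting the two tunables above, it may be worth printing their current values so they can be put back afterwards. A minimal sketch, assuming the same /sys/module/zfs/parameters path as in the commands above; show_params is a hypothetical helper, not something ZFS ships.

```shell
# Directory where the zfs kernel module exposes its parameters
# (same path as in the echo commands above).
PARAM_DIR="${PARAM_DIR:-/sys/module/zfs/parameters}"

# Hypothetical helper: print "name=value" for each named parameter,
# or a placeholder if the module does not expose it.
show_params() {
    for name in "$@"; do
        if [ -r "$PARAM_DIR/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$PARAM_DIR/$name")"
        else
            printf '%s=<not available>\n' "$name"
        fi
    done
}

show_params zfs_resilver_min_time_ms zfs_resilver_delay
```

Values written with echo are not persistent, so a reboot also brings back the defaults.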

@deajan, I built this mirror by starting with a single drive; once I had tested that everything worked, I added the second drive, and resilvering took only 7 hours just a week or two ago. The amount of data has not changed since then. There is plenty of memory available and the SMART data is clean. "zfs_resilver_min_time_ms" is already set to 3000, so I only tried setting "zfs_resilver_delay" to 0, but it made no difference: the scrub is still running at ~2M/s. Shutting down all the apps/containers I had running made no difference either.

              total        used        free      shared  buff/cache   available
Mem:       16314836     8909188     5094460      370892     2311188     5980288
Swap:      16662524       42444    16620080

zpool status -v
  pool: storage
 state: ONLINE
  scan: scrub in progress since Wed May 11 06:24:24 2016
    990M scanned out of 2.60T at 2.22M/s, 342h10m to go
    0 repaired, 0.04% done

Ok, I tried again yesterday and that time let it run for a few minutes just to see if it stays slow all the time or only at the beginning. It turns out that the speed picks up later on: the scrub ended up running at 2 x ~120M/s and finished in under 8 hours as expected.

zpool status -v
  pool: storage
 state: ONLINE
  scan: scrub repaired 0 in 7h41m with 0 errors on Thu May 12 01:28:46 2016
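
To see whether the rate picks up without rereading the whole status output, the rate can be pulled out of the progress line and logged once a minute. This is a rough sketch: scrub_rate and watch_scrub are hypothetical helpers, and the pattern assumes the progress line format shown in this thread ("... scanned out of ... at 2.22M/s, ... to go"), which may differ in other ZFS versions.

```shell
# Hypothetical helper: read zpool status text on stdin and print the
# scan rate, e.g. "2.22M/s". Assumes the line format seen in this thread.
scrub_rate() {
    sed -n 's|.* scanned out of .* at \([^,]*/s\),.*|\1|p'
}

# Hypothetical helper: log a timestamp and the current rate every 60 seconds.
# Stop with Ctrl-C.
watch_scrub() {
    while :; do
        printf '%s %s\n' "$(date +%T)" "$(zpool status "$1" | scrub_rate)"
        sleep 60
    done
}

# "storage" is the pool name used in this thread.
# watch_scrub storage
```

Letting this run for several minutes should show whether the scrub stays at ~2M/s or speeds up once the initial phase is over.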

Did you make any changes before trying again?
If so, can you undo them and try again to isolate what caused the speed increase?

No, no changes. I even rebooted for good measure. The scrub starts slow and the drives are very loud for the first several minutes. Then the disk IO goes to >125M/s from each disk and stays there for the rest of the scrub.
The first few times I tried, I did not wait longer than half a minute before stopping it. The last time I decided to give it a chance.

If the issue is resolved, can you close it?

The scrub starts slow and the drives are very loud for the first several minutes.
Then the disk IO goes to >125M/s from each disk and stays there for the rest of the scrub.

AFAIK, this is because scrubbing the metadata is nearly pure random I/O, so reading tiny blocks scattered over the disks gives low bandwidth. Once that is done, the data itself is scrubbed, and that goes much faster.
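
A back-of-the-envelope check of that explanation: dividing the observed rate by a small block size gives the number of reads per second the slow phase would imply. The 16 KiB block size is an assumption chosen for illustration, not a figure from this thread, and implied_reads_per_s is a hypothetical helper.

```shell
rate_mib_s=2.22   # scan rate reported by zpool status earlier in this thread
block_kib=16      # ASSUMED size of each small metadata read, in KiB

# Hypothetical helper: reads per second = rate (KiB/s) / block size (KiB).
implied_reads_per_s() {
    awk -v r="$1" -v b="$2" 'BEGIN { printf "%.0f\n", r * 1024 / b }'
}

echo "about $(implied_reads_per_s "$rate_mib_s" "$block_kib") reads per second"
```

Under that assumption the slow phase corresponds to roughly 142 small reads per second, so the bottleneck would be the number of seeks rather than the transfer rate. That would also fit the drives being loud during the first minutes.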
