I found out that it is not possible to change the refreservation property to a value that is bigger than the volsize property of a zvol. I think this is strange behavior, because when I need to reserve space for a particular zvol I have to set refreservation to a value greater than volsize, since zfs might use additional space for its metadata. Also, when I create a zvol that isn't sparse, zfs will automatically set refreservation to a value that is bigger than volsize. For example, when I created a 10 GiB zvol I got the following output from zfs get:
zfs get volsize,volblocksize,refreservation Pool-0/zvol1 -p
NAME PROPERTY VALUE SOURCE
Pool-0/zvol1 volsize 10737418240 local
Pool-0/zvol1 volblocksize 65536 -
Pool-0/zvol1 refreservation 10779951104 local
It can clearly be seen that refreservation has a greater value than volsize.
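For reference, the difference between those two values is the metadata headroom that zfs added automatically -- about 40.6 MiB, or roughly 0.4% of the 10 GiB volume. A quick check of the arithmetic:
# refreservation minus volsize from the output above
echo $(( 10779951104 - 10737418240 ))
42532864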
For example, when I try to change refreservation to none and afterwards change it back to the value that was set by zfs when the zvol was created, I see an error.
zfs set refreservation=none Pool-0/zvol1
zfs get volsize,volblocksize,refreservation Pool-0/zvol1 -p
NAME PROPERTY VALUE SOURCE
Pool-0/zvol1 volsize 10737418240 local
Pool-0/zvol1 volblocksize 65536 -
Pool-0/zvol1 refreservation 0 local
zfs set refreservation=10779951104 Pool-0/zvol1
cannot set property for 'Pool-0/zvol1': 'refreservation' is greater than current volume size
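Note that the check rejects anything above volsize, not just the metadata-padded value; even one byte over should fail the same way (assuming the same zvol as above):
# volsize is 10737418240, so even volsize + 1 byte is rejected
zfs set refreservation=10737418241 Pool-0/zvol1
cannot set property for 'Pool-0/zvol1': 'refreservation' is greater than current volume size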
The same thing happens when I want to increase volsize and then increase refreservation to reflect the new volsize.
What is also important: I can bypass this obstacle by setting volsize to a value bigger than the needed refreservation, then setting refreservation to the value I need, and finally changing volsize back to the value I wanted to set in the first place. For example, if I want to enlarge my 10 GiB zvol to 20 GiB and I don't want to make it sparse, I can do the following:
# set volsize to 21 GiB
zfs set volsize=22548578304 Pool-0/zvol1
# set refreservation to correct value
zfs set refreservation=21559640064 Pool-0/zvol1
# set volsize to the 20 GiB I need - a value smaller than refreservation
zfs set volsize=21474836480 Pool-0/zvol1
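After these steps, a quick check (expected output for this example) confirms both properties ended up at the intended values:
zfs get volsize,refreservation Pool-0/zvol1 -p
NAME PROPERTY VALUE SOURCE
Pool-0/zvol1 volsize 21474836480 local
Pool-0/zvol1 refreservation 21559640064 local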
Based on the examples above, I think the error shown when refreservation is set to a value bigger than volsize is a bug in the validation of user input. It would be much better if it were possible to set refreservation to the requested value, provided of course that the size of the zpool allows it.
If I am wrong, then please explain why setting refreservation works this way.
@kolczykm Thanks for the detailed bug report and description of the problem.
I agree, this feels like an unnecessary constraint. This check is enforced here, https://github.com/zfsonlinux/zfs/blob/master/lib/libzfs/libzfs_dataset.c#L1233, in the code but there's no additional documentation _why_ it's required. Offhand, I can't think of any legitimate reason.
Perhaps @ahrens or @eschrock recall why this is here...
I was under the impression that metadata was already accounted for in a refreservation for a zvol, hence maxing out the reservation is sufficient.
@behlendorf I agree that this check is probably unnecessary.
@DeHackEd The automatically-created refreservation is larger than the volsize to accommodate metadata. Users should be able to do the same manually.
I was hoping to bump a 100G zvol with a refreservation of 100G to a refreservation of 120G to allow for some snapshots (the pool doesn't have 100G additional free). Is that a legitimate use case for this enhancement?
No, refreservation by definition does not include snapshots. Just use reservation, or both, to ensure you don't run out of space while writing to the zvol.
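For the 100G example above, a sketch of that approach (hypothetical dataset name): reservation, unlike refreservation, counts snapshot space, so it can guarantee room for the volume plus its snapshots:
# guarantee 120G for the zvol including its snapshots
zfs set reservation=120G pool/vol100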
Also this is a question better suited to the zfs-discuss mailing list than the bug tracker.
I'm writing here because this very same bug hit me today.
Type | Linux
--- | ---
Distribution Name | CentOS Linux
Distribution Version | 7.3
Linux Kernel | 3.10.0-514.26.2.el7.x86_64
Architecture | x86_64
ZFS Version | 0.7.1-1
SPL Version | 0.7.1-1
After setting refreservation=none on a zvol, it is not possible to restore its original value
Setting refreservation=none and then setting refreservation back to its original value gives an unexpected error saying cannot set property for 'tank/vol1': 'refreservation' is greater than current volume size. Example:
[root@slave7 ~]# zfs list
NAME USED AVAIL REFER MOUNTPOINT
tank 816K 879M 24K /tank
[root@slave7 ~]# zfs create -V 200M tank/vol1
[root@slave7 ~]# zfs get all tank/vol1 | grep refreservation
tank/vol1 refreservation 208M local
tank/vol1 usedbyrefreservation 208M -
[root@slave7 ~]# zfs set refreservation=none tank/vol1
[root@slave7 ~]# zfs get all tank/vol1 | grep refreservation
tank/vol1 refreservation none local
tank/vol1 usedbyrefreservation 0B -
[root@slave7 ~]# zfs set refreservation=208M tank/vol1
cannot set property for 'tank/vol1': 'refreservation' is greater than current volume size
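In the meantime, the volsize workaround from the original report should work here too (a sketch reusing the sizes above: temporarily enlarge the volume past the desired refreservation, set it, then shrink back):
[root@slave7 ~]# zfs set volsize=250M tank/vol1
[root@slave7 ~]# zfs set refreservation=208M tank/vol1
[root@slave7 ~]# zfs set volsize=200M tank/vol1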
Any news on the bug?
Thanks.
This check should be removed.
In any case, volsize is insufficient as an upper bound for refreservation because it does not account for the metadata overhead _estimate_. And since any metadata overhead estimate is, by Murphy's Law, not correct -- especially with large physical block sizes -- there is no benefit to a check against the total estimated refreservation either.
Sorry, let me understand: is refreservation only an estimate? If so, how can I get a "true" (not estimated) value? Thanks.
When you create a volume and set its size, the default refreservation value is set to the size + an estimate of the metadata overhead. This is required because accounting is done on the size of the volume including both data and metadata. Therefore, it is incorrect to cap the refreservation at volsize.
The solution is to remove the refreservation comparison to volsize. The code in question is very old, so this problem has existed for a long time. There are other ways to solve this, so the fix here is applicable to the zfs command and perhaps some other consumers of libzfs.
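For comparison, a sparse volume (created with -s) gets no refreservation at all, which makes it easy to see that the extra amount in the default case is purely the metadata estimate (a quick illustration, assuming the same test pool tank as above):
zfs create -s -V 200M tank/vol2
zfs get refreservation tank/vol2
NAME PROPERTY VALUE SOURCE
tank/vol2 refreservation none default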
When you create a volume and set its size, the default refreservation value is set to the size + an estimate of the metadata overhead
This seems to leave the possibility of running out of space on a zvol even when it is not created as a sparse volume. In other words, if refreservation is nothing more than volsize + an estimate, the estimate can be wrong.
So even a non-sparse volume can return ENOSPC to the upper-layer application? This seems to contradict what the man page says (it mentions ENOSPC only for sparse volumes).
Don't panic.
The estimate has considerable margin, so it is rarely a problem. Architecturally, it is difficult to improve the accuracy of the estimate because it can be affected by the configuration of the pool, and the pool configuration can change over time. For most users who enable compression and use the default copies=1, there is plenty of margin. For users who don't compress and who use ashift > 9, raidz, and copies=3, the margin is likely too small.
We've looked at this in detail from many angles and the answer is to have enough margin by default without getting overly conservative.
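To see where a given pool sits relative to those factors, the relevant properties can be checked directly (assuming a pool named tank, as in the earlier example):
# compression and copies affect how much margin the estimate leaves
zfs get compression,copies tank/vol1
# ashift is exposed as a pool property on ZFS on Linux
zpool get ashift tank
# zpool status shows whether the pool uses raidz vdevs
zpool status tank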
I concur with @richardelling - removing the check would be fine. And the metadata space estimate is sufficient in practice. And it even takes into account the copies property, and assumes no compression.