Let's have this scenario
root@linux:~# zfs list -r -t all test
NAME USED AVAIL REFER MOUNTPOINT
test 126K 55.9M 23K /test
test/fs1 46K 55.9M 23K /test/fs1
test/fs1/fs2 23K 55.9M 23K /test/fs1/fs2
root@linux:~# zfs snap -r test@snap1
root@linux:~# zfs list -r -t all test
NAME USED AVAIL REFER MOUNTPOINT
test 126K 55.9M 23K /test
test@snap1 0 - 23K -
test/fs1 46K 55.9M 23K /test/fs1
test/fs1@snap1 0 - 23K -
test/fs1/fs2 23K 55.9M 23K /test/fs1/fs2
test/fs1/fs2@snap1 0 - 23K -
root@linux:~# touch /test/fs1/file1
I would like to have a new option, let's call it -z for this example, that when used would produce the new situation:
root@linux:~# zfs snap -r -z test@snap2
root@linux:~# zfs list -r -t all test
NAME USED AVAIL REFER MOUNTPOINT
test 290K 55.7M 23K /test
test@snap1 0 - 23K -
test/fs1 59K 55.7M 23K /test/fs1
test/fs1@snap1 13K - 23K -
test/fs1@snap2 0 - 23K -
test/fs1/fs2 23K 55.7M 23K /test/fs1/fs2
test/fs1/fs2@snap1 0 - 23K -
Basically what I'd like to be able to ask 'zfs snapshot' when -z parameter is present is:
"_before you create a snapshot, regardless if is requested recursively or not, check if any previous snapshot exists and if it does then check if the newly to-be-created snapshot would reference any data that the previous snapshot doesn't ._
_if it doesn't reference new data since previous snapshot, silently skip creation of the requested snapshot and report a successful operation_"
because all existing snapshotting tools fail to be able to skip creation of zero sized snapshots and none are planning to do so :(
While I see the reasoning, I'm going to need an override. If you're trying to use zfs send -R to backup your pool (or subset thereof) and first run zfs snapshot -r to make the snapshots you may find yourself in a bad situation if snapshots were skipped as a result of this feature. Or even if you make a snapshot for immediate sending only to find it not created.
Please don't silently not do what the user expects to happen.
I fail to see how this optional feature would affect existing workflows.
Like you said, if you run zfs snapshot -r then nothing is changed if this feature is available, as you would get a recursive snapshot for all datasets if the newly proposed option is omitted (which should be default)
Please don't silently not do what the user expects to happen
but the users expects that to happen! when the user explicitly asks for -z parameter, the user expects that zero sized snapshots are not created.
when the user does not ask for -z parameter, the snapshot is created and that is what the user is expecting.
Ugh, I missed the part where it's a -z parameter. I thought it was presented as being the new default behaviour. Sorry, it's been one of those mornings.
While the created snapshots can be culled afterwards in case they're 'empty' (compared to the last one) this takes way longer (and also bloats zpool history) compared to be able to not take them at all. And even when the functionality can be scripted (as mentioned) I suppose a -z option would have a way better runtime.
So for me this would be a very good improvement as it would massively cut down the amount of redundant snapshots, at least on most of the systems I maintain, so I could retire the script (and the related overhead) that currently has to remove them.
This can now be achieved using ZFS Channel Programs (zfs program) which was recently merged: should we close this?
i tried reading about zcp but i find it very hard to understand what it really does and how can it help for this issue.
could you post an ELI5 example how can this be achieved using zcp?
thanks!
The following ZCP script was briefly tested on a throwaway VM, feel free to adapt it to your environment:
arg = ...
argv = arg["argv"]
root = argv[1]
name = argv[2]
function snapshot_r(root, name)
for child in zfs.list.children(root) do
snapshot_r(child, name)
end
if (zfs.get_prop(root , "written") > 0) then
zfs.sync.snapshot(root .. "@" .. name)
end
end
snapshot_r(root, name)
Test output (snapshot is taken only the first time because written > 0):
root@linux:~# truncate -s 512m /var/tmp/file1
root@linux:~# zpool create $POOLNAME -O mountpoint=/$POOLNAME -f /var/tmp/file1
root@linux:~# for i in {1..2}; do
> zfs create $POOLNAME/$i
> for j in {1..2}; do
> zfs create $POOLNAME/$i/$j
> done
> done
root@linux:~# zfs list -r $POOLNAME -t all
NAME USED AVAIL REFER MOUNTPOINT
testpool 256K 368M 24K /testpool
testpool/1 72K 368M 24K /testpool/1
testpool/1/1 24K 368M 24K /testpool/1/1
testpool/1/2 24K 368M 24K /testpool/1/2
testpool/2 72K 368M 24K /testpool/2
testpool/2/1 24K 368M 24K /testpool/2/1
testpool/2/2 24K 368M 24K /testpool/2/2
root@linux:~# zfs get written -r $POOLNAME
NAME PROPERTY VALUE SOURCE
testpool written 24K -
testpool/1 written 24K -
testpool/1/1 written 24K -
testpool/1/2 written 24K -
testpool/2 written 24K -
testpool/2/1 written 24K -
testpool/2/2 written 24K -
root@linux:~# cat > zfs_snapshot-r_written_only.zcp <<'EOF'
> arg = ...
> argv = arg["argv"]
> root = argv[1]
> name = argv[2]
> function snapshot_r(root, name)
> for child in zfs.list.children(root) do
> snapshot_r(child, name)
> end
> if (zfs.get_prop(root , "written") > 0) then
> zfs.sync.snapshot(root .. "@" .. name)
> end
> end
> snapshot_r(root, name)
> EOF
root@linux:~# zfs program $POOLNAME zfs_snapshot-r_written_only.zcp -- $POOLNAME snap-$(date +%s)
Channel program fully executed with no return value.
root@linux:~# zfs program $POOLNAME zfs_snapshot-r_written_only.zcp -- $POOLNAME snap-$(date +%s)
Channel program fully executed with no return value.
root@linux:~# zfs program $POOLNAME zfs_snapshot-r_written_only.zcp -- $POOLNAME snap-$(date +%s)
Channel program fully executed with no return value.
root@linux:~# zfs program $POOLNAME zfs_snapshot-r_written_only.zcp -- $POOLNAME snap-$(date +%s)
Channel program fully executed with no return value.
root@linux:~# zfs list -r $POOLNAME -t all
NAME USED AVAIL REFER MOUNTPOINT
testpool 264K 368M 24K /testpool
testpool@snap-1518347971 0B - 24K -
testpool/1 72K 368M 24K /testpool/1
testpool/1@snap-1518347971 0B - 24K -
testpool/1/1 24K 368M 24K /testpool/1/1
testpool/1/1@snap-1518347971 0B - 24K -
testpool/1/2 24K 368M 24K /testpool/1/2
testpool/1/2@snap-1518347971 0B - 24K -
testpool/2 72K 368M 24K /testpool/2
testpool/2@snap-1518347971 0B - 24K -
testpool/2/1 24K 368M 24K /testpool/2/1
testpool/2/1@snap-1518347971 0B - 24K -
testpool/2/2 24K 368M 24K /testpool/2/2
testpool/2/2@snap-1518347971 0B - 24K -
root@linux:~# zfs get written -r $POOLNAME
NAME PROPERTY VALUE SOURCE
testpool written 0 -
testpool@snap-1518347971 written 24K -
testpool/1 written 0 -
testpool/1@snap-1518347971 written 24K -
testpool/1/1 written 0 -
testpool/1/1@snap-1518347971 written 24K -
testpool/1/2 written 0 -
testpool/1/2@snap-1518347971 written 24K -
testpool/2 written 0 -
testpool/2@snap-1518347971 written 24K -
testpool/2/1 written 0 -
testpool/2/1@snap-1518347971 written 24K -
testpool/2/2 written 0 -
testpool/2/2@snap-1518347971 written 24K -
root@linux:~#
What is the exact meaning of
Channel programs may only be run with root privileges.
from https://illumos.org/man/1M/zfs-program ?
In case I understand it correnctly: Having -z as an option to the normal zfs snapshot command would allow users to execute it (if zfs allow'ed) - while the channel program workaround is limited to root... ?
@GregorKopka exactly.
if (zfs.get_prop(root , "written") > 0) then
zfs.sync.snapshot(root .. "@" .. name)
hm, so what is the effort difference between above and the following code in a modified zfs-auto-snapshot script? the end user still has to verify if the written property is non-zero. so what's the benefit of zfs program vs the already working modified zfs-auto-snapshot?
````
zfs get -Hp -o value written $iimy issue was request about simplifying the user effort with a zfs command with a new parameter... that would be very simple for the end user... the recommened approach is no less difficult than the code path in zfs-auto-snapshot...
@loli10K then the channel programs approach dosn't solve it as it dosn't support delegation to normal users (which the original request would). While technically impressive it dosn't help a normal user.
The workaround with pruning a zfs list to create a zfs snapshot command line in userland (plenty of parsing in both directions) seems to be way more costly, plus I suspect there are limits on the length of the command line of a zfs snapshot invocation (while no real limit on the amount of datasets exist, apart from space in the pool), so my guess is that there's a limit on the amount of snapshots that can be created within one atomic zfs snapshot operation (unless using -r, which defeats the point).
Thus I support it as a flag, or through root being able to prepare channel programs that can then be called by normal users (with at least checking their allow'ed abilities vs. the sub-operations taken by the channel progam). A flag would be more practical, imho.
so what is the effort difference between above and the following code in a modified zfs-auto-snapshot script? the end user still has to verify if the written property is non-zero. so what's the benefit of zfs program vs the already working modified zfs-auto-snapshot?
From zfs-program(8):
The ZFS channel program interface allows ZFS administrative operations to be run programmatically as a Lua script. The entire script is executed atomically, with no other administrative operations taking effect concurrently. A library of ZFS calls is made available to channel program scripts. Channel programs may only be run with root privileges.
The bold part is the important one.
From my perspective zfs snapshot -z is more difficult to implement (requires both cli args parsing and libzfs modifications) and less powerful than ZCP (would be limited to zero-sized snapshots): given we now have zfs program available i find it difficult to justify the need to work on adding more, less powerful, options to zfs snapshot, but since it doesn't seem to address your specific use case feel free to open a pull request.
As a side now, keep in mind this option would probably prevent sending snapshots with the -R option since https://github.com/zfsonlinux/zfs/commit/6635624020506a76cb79c65c7ff5a580c4685a0b (#5196).
@kpande Could you please elaborate why this was closed?
@kpande Unless I missed the ZoL release that enabled running ZCP without root privileges: please reopen.
Might be a separate issue, but without being able to delegate to non-root users this isn't solved.
Could you please elaborate why this was closed?
it seems that (s)he closed a bunch of other reports I filed over time, perhaps (s)he has something personal against.

just opening many feature requests where you are expecting others to do the work for you, for free.
that's exactly how open source works. if you are the solely code writer, you ofcourse have the power close the feature request. however leaving it open does not hurt you, and leaves the door open to others who may be able to write the code.
Most helpful comment
While the created snapshots can be culled afterwards in case they're 'empty' (compared to the last one) this takes way longer (and also bloats zpool history) compared to be able to not take them at all. And even when the functionality can be scripted (as mentioned) I suppose a -z option would have a way better runtime.
So for me this would be a very good improvement as it would massively cut down the amount of redundant snapshots, at least on most of the systems I maintain, so I could retire the script (and the related overhead) that currently has to remove them.