setkeyv() can be accelerated significantly in cases where the key already exists.
This can be useful e.g. if you create a function that takes a data.table as an argument and you need to set a key but don't know if the user has already set the key on the input.
With the new implementation, you can just use setkey without worrying about speed penalties.
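To make the use case concrete, here is a minimal sketch of such a function. The helper name `sum_by_id` and the columns `id` and `value` are made up for illustration; only `setkeyv()` and `key()` are data.table functions.

```r
library(data.table)

# Hypothetical helper: it wants `dt` keyed on "id" but cannot know
# whether the caller has already set that key.
sum_by_id <- function(dt) {
  # Sets the key by reference. If the key is already "id", this call
  # is the one that would become nearly free with the proposed change.
  setkeyv(dt, "id")
  dt[, .(total = sum(value)), by = id]
}

dt <- data.table(id = c(2L, 1L, 2L, 1L), value = c(10, 20, 30, 40))
res <- sum_by_id(dt)   # first call sorts and sets the key
res <- sum_by_id(dt)   # second call finds the key already in place
```

Because `setkeyv()` works by reference, the caller's `dt` is keyed on `id` after the first call, so every later call hits the "key already exists" path.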
setkeyv() does two things:
1. forderv
2. Creorder

Currently, if the key already exists, the call to forderv is still executed and only step 2 is skipped.
The only reason for this is a sanity check that the data.table is really sorted by the key.
I believe it is not necessary to perform this sanity check each time, especially since the check has been in place for quite a while, so any bugs it guards against should have surfaced by now.
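The idea can be sketched at user level as follows. This is an illustration only, not data.table's internal code: `setkey_if_needed` is a hypothetical wrapper, and it assumes the existing key attribute can be trusted, which is exactly the sanity check proposed for removal.

```r
library(data.table)

# Hypothetical wrapper illustrating the proposed short-circuit.
# Assumption: if key(x) already equals `cols`, x is really sorted by them.
setkey_if_needed <- function(x, cols) {
  if (identical(key(x), cols)) {
    return(invisible(x))   # key already set: skip the ordering work entirely
  }
  setkeyv(x, cols)         # otherwise compute the order and reorder as usual
}

dt <- data.table(a = c(3L, 1L, 2L), b = c("x", "y", "z"))
setkey_if_needed(dt, "a")  # sorts and sets the key
setkey_if_needed(dt, "a")  # returns immediately
```

The comparison uses `identical()` so that a key on different columns, or on the same columns in a different order, still triggers a full `setkeyv()`.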
Great! Your PR is merged. See the comment there: https://github.com/Rdatatable/data.table/pull/2332#issuecomment-327575497
@MarkusBonsch thanks for doing this. I had hacky work-arounds that did this myself, and I didn't get to actually submit a PR. And of course thanks to the data.table team for all they do.