I frequently do the following:
DT[CJ(colA = colA, colB = colB, unique=TRUE), on=c("colA","colB")]
# to complete missing levels
# or
DT[, CJ(colA = colA, colB = colB, unique=TRUE)][!DT, on=c("colA","colB")]
# to identify missing levels
# http://stackoverflow.com/a/36065607/1191259
It would be nice if I could get away with writing colA and colB fewer times. This feature request (FR) is for
CJ(colA, colB, unique=TRUE, names=TRUE)
to infer the names colA and colB, perhaps using whatever method is used by data.frame() and data.table() (make.names?).
(The name repetition could be reduced further if on=.Icols were a thing, I suppose, but I'll leave that for a separate FR.)
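To make the request concrete, here is a rough sketch of the behaviour I have in mind, written as a wrapper around the existing CJ(). The helper name CJ_named and the sample data are made up for illustration and are not part of data.table; it only takes names from bare symbols such as colA, which is an assumption about what "infer the names" should mean.

```r
library(data.table)

# Hypothetical helper (NOT part of data.table): calls CJ() after filling in
# missing argument names from the unevaluated argument expressions.
CJ_named <- function(..., sorted = TRUE, unique = FALSE) {
  args  <- list(...)                             # evaluated arguments
  exprs <- as.list(substitute(list(...)))[-1L]   # the expressions as written
  nm <- names(args)
  if (is.null(nm)) nm <- rep("", length(args))
  # Assumption: only bare symbols (e.g. colA) are turned into names;
  # anything else keeps whatever default naming CJ() applies.
  is_sym <- vapply(exprs, is.symbol, logical(1L))
  fill <- nm == "" & is_sym
  nm[fill] <- vapply(exprs[fill], as.character, character(1L))
  names(args) <- nm
  do.call(CJ, c(args, list(sorted = sorted, unique = unique)))
}

# Made-up example data: the combination colA = "b", colB = 2 is absent.
DT <- data.table(colA = c("a", "a", "b"), colB = c(1L, 2L, 1L))

# All combinations of the observed levels, without retyping the names.
grid <- DT[, CJ_named(colA, colB, unique = TRUE)]

# Complete the missing levels, then identify them.
completed <- DT[grid, on = c("colA", "colB")]
missing   <- grid[!DT, on = c("colA", "colB")]
```

With this, each column name is typed once inside CJ_named() and once in on=, instead of three times.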
SO posts to update...
CJ takes ... as its first argument, and the function is going to become a generic, so AFAIK we will need to change its signature to CJ(x, ...). Those changes can be made together with #1090.
+1 and I don't see the need for the names argument - this should be the only behavior. With the join syntax change to using "on" instead of setkey this has become a big sticking point for me.
I'd also like to see unique = TRUE be the default - I can't think of _ever_ not needing to unique the arguments to CJ.
@jangorecki I didn't touch the #1090 / #814 stuff yet. I think this is better kept self-contained, unless I'm missing something.