Using a dynamic column name with [[x]] notation causes the following error when the column name is stored in the argument of a function:
Error in .subset2(x, i, exact = exact) : recursive indexing failed at level 2
This works as expected:
dt <- data.table(x = c(1:3), y = LETTERS[1:3])
dt2 <- copy(dt) #for later
x <- "x"
dt[, x.copy := dt[[x]]]
print(dt)
x y x.copy
1: 1 A 1
2: 2 B 2
3: 3 C 3
But it does not work if x is an argument of a function:
f <- function(dt = "", x = "") {
dt[, x.copy := dt[[x]]]
print(dt)
}
f(dt2, x = "x")
Error in .subset2(x, i, exact = exact) :
recursive indexing failed at level 2
Traceback:
8. (function(x, i, exact) if (is.matrix(i)) as.matrix(x)[[i]] else .subset2(x,
i, exact = exact))(x, ..., exact = exact)
7. `[[.data.frame`(dt, x)
6. dt[[x]]
5. eval(jsub, SDenv, parent.frame())
4. eval(jsub, SDenv, parent.frame())
3. `[.data.table`(dt, , `:=`(x.copy, dt[[x]]))
2. dt[, `:=`(x.copy, dt[[x]])]
1. f(dt = dt2, x = "x")
sessionInfo():
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.11.8
loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1 yaml_2.2.0
With both 1.11.8 and latest dev (7a66224), the first example does not work.
library(data.table)
dt <- data.table(x = c(1:3), y = LETTERS[1:3])
dt2 <- copy(dt) #for later
x <- "x"
dt[, x.copy := dt[[x]]]
#> Error in .subset2(x, i, exact = exact): recursive indexing failed at level 2
This is because the x in dt[, x.copy := dt[[x]]] is resolved as the column x of dt rather than the x in the calling environment (globalenv in this case): j is evaluated with the columns of dt in scope first, so dt[[x]] effectively becomes dt[[1:3]].
The same thing happens with a data frame too:
df <- data.frame(x = 1:3)
x <- "x"
with(df, df[[x]])
#> Error in .subset2(x, i, exact = exact): recursive indexing failed at level 2
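To see where the message comes from, it may help to check what x refers to inside j. This is a minimal sketch using the same dt as above; the names inside and res are only illustrative.

```r
library(data.table)

dt <- data.table(x = 1:3, y = LETTERS[1:3])
x <- "x"

# Inside j, the symbol x is the column of dt, not the global string
inside <- dt[, class(x)]
print(inside)    # class of the column
print(class(x))  # class of the global variable

# So dt[[x]] inside j amounts to dt[[1:3]], i.e. recursive indexing:
# dt[[1]] is an atomic vector, which cannot be indexed into any further
res <- tryCatch(dt[[1:3]], error = function(e) conditionMessage(e))
print(res)
```

The error is therefore raised by `[[` itself once it receives the column values instead of the column name.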
If you rename x in dt to another symbol, everything works as expected.
library(data.table)
dt <- data.table(x0 = c(1:3), y = LETTERS[1:3])
dt2 <- copy(dt) #for later
x <- "x0"
dt[, x0.copy := dt[[x]]]
print(dt)
#> x0 y x0.copy
#> 1: 1 A 1
#> 2: 2 B 2
#> 3: 3 C 3
f <- function(dt = "", x = "") {
dt[, x0.copy := dt[[x]]]
print(dt)
}
f(dt2, x = "x0")
#> x0 y x0.copy
#> 1: 1 A 1
#> 2: 2 B 2
#> 3: 3 C 3
Yes, it appears to be a syntax/scoping issue, not a bug.
See this helpful Q&A, it should clarify some things (including why the error you got is what it is):
I appreciate the quick and thorough response. I'm certain I got the first example to work, but now... not. Anyway, I'm sure you're right. It's a problem, though, if the notation works for other variables but suddenly fails inexplicably for x. Average R users can't be expected to know the internals of data.table or R and know which variable names are okay to use inside the brackets! It's frustrating when things fail for seemingly no reason. I'd call that a design flaw, but I guess it's in the R core.
Maybe I misunderstand, but I think ..x exists to handle this sort of scoping:
dt <- data.table(x = c(1:3), y = LETTERS[1:3])
dt2 <- copy(dt) #for later
f <- function(DT, x) {
DT[, x.copy := DT[[..x]]][]
}
f(dt, "x") # works
Fyi, trailing [] also prints.
@franknarf1 Fabulous! That sorts it out.
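Another option that sidesteps the scoping of j altogether might be set(), where the value is an ordinary function argument. This is only a sketch: add_copy and its argument names are hypothetical and not part of data.table.

```r
library(data.table)

dt <- data.table(x = 1:3, y = LETTERS[1:3])

# add_copy is a hypothetical helper, not a data.table function
add_copy <- function(DT, col) {
  # DT[[col]] is evaluated as a regular argument, so col is looked up
  # in the function's environment and never among the columns of DT
  set(DT, j = paste0(col, ".copy"), value = DT[[col]])
  invisible(DT)
}

add_copy(dt, "x")  # adds x.copy to dt by reference
print(dt)
```

Because nothing is evaluated inside j, it does not matter whether the argument name clashes with a column name.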
I think the ..x notation is great, but its behavior needs some clarification, if that does not already exist:
Certain functions are handled specially, so that for some f, dt[, f(..x)] means selecting columns, while for others it means evaluating f(..x) and returning the result.
library(data.table)
dt <- data.table(x = c(1:3), y = LETTERS[1:3])
dt[, xy := paste0(x, y)]
x <- "x"
dt[, ..x]
#> x
#> 1: 1
#> 2: 2
#> 3: 3
dt[, c(..x, "y")]
#> x y
#> 1: 1 A
#> 2: 2 B
#> 3: 3 C
dt[, paste0(..x, "y")]
#> xy
#> 1: 1A
#> 2: 2B
#> 3: 3C
dt[, c(..x, paste0(..x, "y"))]
#> x xy
#> 1: 1 1A
#> 2: 2 2B
#> 3: 3 3C
dt[, (paste0(..x, "y"))]
#> [1] "xy"
pf <- paste0
dt[, pf(..x, "y")]
#> [1] "xy"
c <- paste0
dt[, c(..x, "y")]
#> xy
#> 1: 1A
#> 2: 2B
#> 3: 3C
paste <- function(a, b, ...) paste0(a, b)
dt[, paste(..x, "y", sep = "", collapse = "")]
#> xy
#> 1: 1A
#> 2: 2B
#> 3: 3C
paste <- function(a, b, ...) paste0(a, "x")
dt[, paste(..x, "y", sep = "", collapse = "")]
#> Error in `[.data.table`(dt, , paste(..x, "y", sep = "", collapse = "")): column(s) not found: xx
paste1 <- function(a, b, ...) paste0(a, "x")
dt[, paste1(..x, "y", sep = "", collapse = "")]
#> [1] "xx"