Data.table: Bug with using the notation dt[, new.col := dt[[x]]] when x is the argument of a function

Created on 8 Jan 2019 · 6 Comments · Source: Rdatatable/data.table

Using a dynamic column name with [[x]] notation causes the following error when the column name is stored in the argument of a function:
Error in .subset2(x, i, exact = exact) : recursive indexing failed at level 2

This works as expected:

dt <- data.table(x = c(1:3), y = LETTERS[1:3])
dt2 <- copy(dt) #for later

x <- "x"
dt[, x.copy := dt[[x]]]
print(dt)
   x y x.copy
1: 1 A      1
2: 2 B      2
3: 3 C      3

But it does not work if x is an argument of a function:

f  <- function(dt = "", x = "")  {
    dt[, x.copy := dt[[x]]]
    print(dt)
}

f(dt2, x = "x")
Error in .subset2(x, i, exact = exact) : 
  recursive indexing failed at level 2

Traceback:

8. (function(x, i, exact) if (is.matrix(i)) as.matrix(x)[[i]] else .subset2(x, 
        i, exact = exact))(x, ..., exact = exact) 
7. `[[.data.frame`(dt, x) 
6. dt[[x]] 
5. eval(jsub, SDenv, parent.frame()) 
4. eval(jsub, SDenv, parent.frame()) 
3. `[.data.table`(dt, , `:=`(x.copy, dt[[x]])) 
2. dt[, `:=`(x.copy, dt[[x]])] 
1. f(dt = dt2, x = "x") 

sessionInfo():

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252    LC_MONETARY=English_Canada.1252 LC_NUMERIC=C                   
[5] LC_TIME=English_Canada.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.11.8

loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1    yaml_2.2.0   

All 6 comments

With both 1.11.8 and latest dev (7a66224), the first example does not work.

library(data.table)
dt <- data.table(x = c(1:3), y = LETTERS[1:3])
dt2 <- copy(dt) #for later

x <- "x"
dt[, x.copy := dt[[x]]]
#> Error in .subset2(x, i, exact = exact): recursive indexing failed at level 2

This is because x in dt[, x.copy := dt[[x]]] resolves to the column x in dt rather than the variable x in the calling environment (globalenv in this case): the expression is evaluated with the columns of dt in scope first, so dt[[x]] becomes dt[[1:3]].

The same thing happens with data frame too:

df <- data.frame(x = 1:3)
x <- "x"
with(df, df[[x]])
#> Error in .subset2(x, i, exact = exact): recursive indexing failed at level 2
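To make the shadowing visible, here is a minimal base R sketch (no data.table needed). The names shadowed, failed and resolved are only illustrative and not part of the original report.

```r
df <- data.frame(x = 1:3)
x <- "x"

# Inside with(), names are looked up in df first, so `x` is the column 1:3,
# not the character variable defined above.
shadowed <- with(df, x)

# Hence df[[x]] inside with() is really df[[1:3]], which raises an error.
failed <- inherits(try(with(df, df[[x]]), silent = TRUE), "try-error")

# Evaluated outside with(), `x` is the character variable and the lookup works.
resolved <- df[[x]]
```

The same lookup order applies to j in dt[, j], which is why the column x wins over the function argument x in the original report.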

If you rename the column x in dt to some other symbol, everything works as expected.

library(data.table)
dt <- data.table(x0 = c(1:3), y = LETTERS[1:3])
dt2 <- copy(dt) #for later

x <- "x0"
dt[, x0.copy := dt[[x]]]
print(dt)
#>    x0 y x0.copy
#> 1:  1 A       1
#> 2:  2 B       2
#> 3:  3 C       3

f  <- function(dt = "", x = "")  {
  dt[, x0.copy := dt[[x]]]
  print(dt)
}

f(dt2, x = "x0")
#>    x0 y x0.copy
#> 1:  1 A       1
#> 2:  2 B       2
#> 3:  3 C       3

Yes, it appears to be a syntax/scoping issue, not a bug.

See this helpful Q&A, it should clarify some things (including why the error you got is what it is):

https://stackoverflow.com/questions/1169456/the-difference-between-bracket-and-double-bracket-for-accessing-the-elem

I appreciate the quick and thorough response. I'm certain I got the first example to work, but now I can't. Anyway, I'm sure you're right. It's a problem, though, if the notation works for other variables but suddenly fails inexplicably for x. Average R users can't be expected to know the internals of data.table or R, or which variable names are safe to use inside the brackets! It's frustrating when things fail for seemingly no reason. I'd call that a design flaw, but I guess it's in R core.

Maybe I misunderstand, but I think ..x exists to handle this sort of scoping:

dt <- data.table(x = c(1:3), y = LETTERS[1:3])
dt2 <- copy(dt) #for later

f  <- function(DT, x)  {
    DT[, x.copy := DT[[..x]]][]
}

f(dt, "x") # works

FYI, a trailing [] also prints the result.
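Another way to sidestep the name clash inside a function, sketched here as an alternative to ..x, is to do the lookup outside of `[.data.table` and assign with set(). The helper copy_col and its arguments are hypothetical names chosen for illustration.

```r
library(data.table)

# Hypothetical helper: copy the column named by `col` into `new_col`.
copy_col <- function(DT, col, new_col = paste0(col, ".copy")) {
  # DT[[col]] is evaluated here, in the function's own scope, not inside
  # `[.data.table`, so a column cannot shadow the argument `col`.
  set(DT, j = new_col, value = DT[[col]])
  DT[]
}

dt <- data.table(x = 1:3, y = LETTERS[1:3])
res <- copy_col(dt, "x")
```

Note that set() modifies the table by reference, so dt itself gains the new column as well.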

@franknarf1 Fabulous! That sorts it out.

I think the ..x notation is great, but its behavior needs some clarification, if that doesn't already exist:

Certain functions are handled specially, so that dt[, f(..x)] can mean selecting columns for some f, while for others it means evaluating f(..x) and returning the result.

library(data.table)
dt <- data.table(x = c(1:3), y = LETTERS[1:3])
dt[, xy := paste0(x, y)]

x <- "x"
dt[, ..x]
#>    x
#> 1: 1
#> 2: 2
#> 3: 3
dt[, c(..x, "y")]
#>    x y
#> 1: 1 A
#> 2: 2 B
#> 3: 3 C
dt[, paste0(..x, "y")]
#>    xy
#> 1: 1A
#> 2: 2B
#> 3: 3C
dt[, c(..x, paste0(..x, "y"))]
#>    x xy
#> 1: 1 1A
#> 2: 2 2B
#> 3: 3 3C
dt[, (paste0(..x, "y"))]
#> [1] "xy"

pf <- paste0
dt[, pf(..x, "y")]
#> [1] "xy"

c <- paste0
dt[, c(..x, "y")]
#>    xy
#> 1: 1A
#> 2: 2B
#> 3: 3C

paste <- function(a, b, ...) paste0(a, b)
dt[, paste(..x, "y", sep = "", collapse = "")]
#>    xy
#> 1: 1A
#> 2: 2B
#> 3: 3C

paste <- function(a, b, ...) paste0(a, "x")
dt[, paste(..x, "y", sep = "", collapse = "")]
#> Error in `[.data.table`(dt, , paste(..x, "y", sep = "", collapse = "")): column(s) not found: xx

paste1 <- function(a, b, ...) paste0(a, "x")
dt[, paste1(..x, "y", sep = "", collapse = "")]
#> [1] "xx"
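If the goal is always to select columns by a computed name, one way to avoid depending on how ..x is treated inside a call is to build the name outside of j and select with with = FALSE or .SDcols. This is only a sketch; the variable names col, target, by_with and by_sdcol are illustrative.

```r
library(data.table)

dt <- data.table(x = 1:3, y = LETTERS[1:3])
dt[, xy := paste0(x, y)]

col <- "x"
# Build the target column name outside of j, so nothing here is
# evaluated against the table's columns.
target <- paste0(col, "y")

by_with  <- dt[, target, with = FALSE]   # select by character vector
by_sdcol <- dt[, .SD, .SDcols = target]  # same selection via .SDcols
```

Both forms return a one-column data.table containing xy, regardless of which function was used to build the name.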