Data.table: .SD is locked for DT[, DT2[.SD]] joins

Created on 23 Nov 2016  ·  11 Comments  ·  Source: Rdatatable/data.table

Based on an SO question, I was trying something like...

    library(data.table)
    DT = data.table(id = 1:2, v = 3:4)
    DT2 = data.table(id = 1, x = 5)
    DT[id == 1, DT2[.SD, on="id"]]

    Error in set(i, j = lc, value = newval) :
    .SD is locked. Updating .SD by reference using := or set are reserved for future use. Use := in j directly. Or use copy(.SD) as a (slow) last resort, until shallow() is exported.

I wasn't expecting to see this error since I'm not using set.

Other SO posts to update when fixed:

All 11 comments

In this case, does .SD refer to the outer data.table or to the inner one?

If you try DT[id == 1, DT2[dput(.SD), on="id"]]

You end up with

structure(list(id = 1L, v = 3L), .Names = c("id", "v"), class = c("data.table", 
"data.frame"), row.names = c(NA, -1L), .data.table.locked = TRUE)

This clearly comes from DT.

And what if I want to refer to DT2 instead?

@skanskan You cannot normally refer to .SD in i, so that would be a new FR.

I usually just do DT2[DT[id==1], on = "id"] in this case. But I think it is unexpected that .SD errors here.
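For reference, a minimal sketch of that workaround on the toy tables from the first post. It subsets DT in i before the join, so .SD never enters the picture. The name `res` is only illustrative, and the column order checked at the end is what I would expect from an `x[i, on=]` join (x's columns first, then i's remaining columns).

```r
library(data.table)

DT  = data.table(id = 1:2, v = 3:4)   # id is integer here
DT2 = data.table(id = 1, x = 5)       # id is double, as in the original report

# Subset DT first, then pass the result as i to the join on DT2.
# No .SD is involved, so the "locked" error cannot be triggered.
res = DT2[DT[id == 1], on = "id"]

print(res)
```

This returns one row per row of the subset of DT, with DT2's columns filled in where the ids match.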

Weird new example:

library(data.table)
df1 = setDT(structure(list(Measurement = structure(c(3L, 1L, 2L, 4L), .Label = c("Breadth", 
"Height", "Length", "Width"), class = "factor"), When = structure(c(1491592742, 
1486735990, 1484325914, 1479090924), class = c("POSIXct", "POSIXt"
), tzone = "")), .Names = c("Measurement", "When"), class = "data.frame", row.names = c(NA, 
-4L)))
df2 = setDT(structure(list(Measurement = structure(c(3L, 3L, 3L, 3L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L), .Label = c("Breadth", 
"Height", "Length", "Width"), class = "factor"), Datetime = structure(c(1491679142, 
1491765542, 1491769142, 1491851942, 1486822390, 1486908790, 1486995190, 
1487081590, 1487167990, 1484844314, 1484930714, 1485017114, 1485189914, 
1485535514, 1484325914, 1479004524, 1479177324, 1479436524, 1479609324, 
1479616524), class = c("POSIXct", "POSIXt"), tzone = ""), PassFail = structure(c(1L, 
1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 
2L, 1L, 2L), .Label = c("Fail", "Pass"), class = "factor")), .Names = c("Measurement", 
"Datetime", "PassFail"), row.names = c(NA, -20L), class = "data.frame"))

# this treats .SD as NULL? ... returns null data.table
df1[ 
  df2[.SD, on=.(Measurement, Datetime > When), 
    all(head(x.PassFail, 2) == "Fail")
  , by=.EACHI]$V1 
]

# errors with complaint about missing on=
df1[ 
  df2[.SD, on=.(Measurement, Datetime > When)]
]

# errors regarding .SD being locked as in my first post
df1[, df2[.SD, on=.(Measurement, Datetime > When), "bah", by=.EACHI]]

Data was borrowed from SO: http://stackoverflow.com/q/43373889/

Another example to update: http://stackoverflow.com/questions/43660562/find-matches-to-several-tables-conditional-full-join-using-data-table

Another example: https://stackoverflow.com/a/47818115/ (it should be possible to write [.SD where it now has [mydf)

One more example: https://stackoverflow.com/questions/48995398/sum-of-data-frames-rows-in-range-defined-by-columns#comment84996740_48995696

And https://stackoverflow.com/a/55029120/

One more: https://stackoverflow.com/questions/57167921/conditionally-merging-data-from-two-data-frames/57168406?noredirect=1#comment100887545_57168406

Was just about to post this on SO as a new Q:

    library(Lahman)
    library(data.table)

    Teams = as.data.table(Teams)
    Pitching = as.data.table(Pitching)

    Pitching[G > 5, rank_in_team := frank(ERA), by = .(teamID, yearID)]
    Pitching[rank_in_team == 1, team_performance := 
               Teams[.SD, Rank, on = .(teamID, yearID)]]

A fix that works is:

    Pitching[rank_in_team == 1, team_performance := 
               Teams[copy(.SD), Rank, on = .(teamID, yearID)]]

^ the above example is in the new .SD vignette. Should fix after this issue.
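The same copy(.SD) workaround can be sketched without the Lahman data, reusing the toy tables from the first post. This is only an illustration: `res` is a made-up name, and adding a column `x` to DT by reference is my own choice, made to mirror the `team_performance :=` pattern above.

```r
library(data.table)

DT  = data.table(id = 1:2, v = 3:4)   # integer id
DT2 = data.table(id = 1, x = 5)       # double id, so the join has to coerce

# copy(.SD) hands the inner join an unlocked copy of the subset,
# at the cost of an extra copy of those rows.
res = DT[id == 1, DT2[copy(.SD), on = "id"]]

# Same idea as the Pitching example: look up x in DT2 for the
# selected rows and assign it by reference in the outer table.
DT[id == 1, x := DT2[copy(.SD), x, on = "id"]]

print(res)
print(DT)
```

Rows of DT that are not selected by i (here id == 2) are left as NA in the new column.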

Hey @franknarf1 could you file the other seemingly broken examples as a separate issue? I have a fix for the .data.table.locked thing going but it doesn't address those other two cases

Thanks @MichaelChirico! -- I never noticed coercion was the common element, d'oh.

If you want to move ahead with what you have, maybe you could close this issue and I will look at the other examples soon? (It might be a week or so. Pretty busy atm, and I have not installed except from CRAN or the master binaries for Windows in a long while.)

@franknarf1 just to let you know, data.table::update.dev.pkg() will install from Windows binaries if you are on R 3.6 (or R-devel). No extra arguments are needed, just data.table::update.dev.pkg().
