Data.table: R-devel change; R-API inside parallel regions

Created on 30 Nov 2018 · 4 comments · Source: Rdatatable/data.table

Luke Tierney contacted me about a memory issue in R-devel. Tests are passing on CRAN and on Travis/AppVeyor, but memory usage is higher in some cases. As before, the garbage collector ends up switched off.
Not sure exactly when the R-devel change was made, but it was some time in the last month or so. It might be that the R-devel change causes the new memory issue, or that it reveals a problem that already exists (possibly even with data.table-release on R-release). In any case, it needs to be fixed.

Luke wrote :

I distilled the issue in 'constellation' down to the attached file
from the examples. If I run this the memory usage is much bigger than
previously and the gc() output is garbled. It's a
multi-threading issue again; everything looks fine with
OMP_NUM_THREADS=1. With multiple threads you are
calling DATAPTR from threads other than the main one and that creates
a race on setting the R_GCEnabled flag, so eventually it is getting
stuck on off. I instrumented the places where the GC is disabled and
tracked this as the first one from a thread other than the main one:

#4  0x00007ffff78ba9d5 in DATAPTR (x=x@entry=0x1167458)
     at ../../../R/src/include/Rinlinedfuns.h:106
#5  0x00007fffea02c4c8 in subsetVectorRaw (target=0x4529590, source=0x1167458,
     idx=0x4175210, any0orNA=FALSE) at subset.c:44
#6  0x00007fffea02c7a6 in subsetDT (x=<optimized out>, rows=<optimized out>,
     cols=<optimized out>) at subset.c:272
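To make the race Luke describes concrete, here is a minimal sketch that replays one unlucky interleaving of two threads on a shared on/off flag. It assumes a save, disable, restore pattern around the flag; the variable `gc_enabled` and the function `replay_bad_interleaving` are hypothetical stand-ins for illustration, not R's actual internals. The interleaving is written out sequentially so the outcome is deterministic.

```c
/* Hypothetical stand-in for a global "GC enabled" flag; not R's variable. */
static int gc_enabled = 1;

/* Replays one interleaving of two threads, A and B, that each do
   save -> disable -> (work) -> restore on the shared flag.
   Returns the final value of the flag. */
int replay_bad_interleaving(void) {
    gc_enabled = 1;            /* GC starts enabled                     */
    int saved_a = gc_enabled;  /* thread A saves 1                      */
    gc_enabled = 0;            /* thread A disables                     */
    int saved_b = gc_enabled;  /* thread B saves 0 (sees A's disable)   */
    gc_enabled = 0;            /* thread B disables                     */
    gc_enabled = saved_a;      /* thread A restores 1                   */
    gc_enabled = saved_b;      /* thread B restores 0: flag stays off   */
    return gc_enabled;
}
```

Each thread's own save and restore looks correct in isolation; the flag only ends up off because B saved the value while A had it disabled. That is consistent with everything looking fine under OMP_NUM_THREADS=1.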

Given Luke's detail, it's easy to see the problem in the code without needing to reproduce it. All R API usage needs to be moved outside all parallel regions, as Luke requested before. I delayed doing that last time due to time pressure, and now it needs tackling. Last time I only did the first step, which was to ensure that DATAPTR inside parallel regions did not receive an ALTREP vector.
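The general shape of the fix can be sketched in plain C as follows. This is not data.table's actual code: `vec_t`, `vec_data` and `subset_double` are hypothetical names, with `vec_data` standing in for an R API accessor that is assumed to be safe only on the main thread. The point is that every accessor call happens before the parallel region, and the loop body touches only raw C pointers.

```c
#include <stddef.h>

/* Hypothetical vector type standing in for an R vector; not R's API. */
typedef struct {
    double *data;
    size_t  n;
} vec_t;

/* Hypothetical accessor standing in for an R API call. Assume it may touch
   global interpreter state, so only the main thread may call it. */
static double *vec_data(vec_t *v) { return v->data; }

/* target[i] = source[idx[i] - 1] for 1-based idx, in the style of a subset.
   Assumes idx values are valid (no 0 or NA), as checked by the caller. */
void subset_double(vec_t *target, vec_t *source, const int *idx, size_t n) {
    double *tp = vec_data(target);        /* fetched once, main thread only */
    const double *sp = vec_data(source);  /* fetched once, main thread only */

    /* Without OpenMP enabled the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        tp[i] = sp[idx[i] - 1];           /* raw pointers only inside the region */
    }
}
```

Hoisting the pointer fetches also removes repeated accessor calls from the inner loop, so the change should not cost anything in the single-threaded case.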

This warrants accelerating release of 1.12.0, not least because it impacts Luke.

$ grep "omp.*parallel" *.c

  • [x] subset.c
  • [x] reorder.c
  • [x] fwrite.c
  • [x] fread.c
  • [x] fsort.c
  • [x] forder.c
  • [x] between.c

$ grep ALTREP *.c

  • [x] between.c
  • [x] fsort.c
  • [x] reorder.c
  • [x] wrappers.c
R-devel bug

All 4 comments

awesome of Luke to do all the hard work of identifying the cause! 💯

I'm in the process of confirming with Luke that it's fixed before closing ...

Confirmed fixed. Thanks to @ltierney for his help.


For completeness in case we need to refer back...
I could not reproduce the problem with R-devel compiled as follows. (I needed to confirm I could reproduce the problem with 1.11.8 before I could confirm 1.11.9 fixes it.)

./configure --without-recommended-packages --disable-byte-compiled-packages --enable-strict-barrier CC="gcc -fsanitize=address -fno-sanitize=float-divide-by-zero -fno-omit-frame-pointer" CFLAGS="-O0 -g -Wall -pedantic"

even with the script that Luke provided :

library(data.table)
library(constellation)
temp <- as.data.table(vitals[VARIABLE == "TEMPERATURE"])
pulse <- as.data.table(vitals[VARIABLE == "PULSE"])
resp <- as.data.table(vitals[VARIABLE == "RESPIRATORY_RATE"])
temp[, RECORDED_TIME := as.POSIXct(RECORDED_TIME,
  format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")]
pulse[, RECORDED_TIME := as.POSIXct(RECORDED_TIME,
  format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")]
resp[, RECORDED_TIME := as.POSIXct(RECORDED_TIME,
  format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")]
gc()
for (i in 1:10) {
  cat("i=",i,"\n")
  b1 <- bundle(temp, pulse, resp,
    bundle_names = c("PLATELETS", "INR"), window_hours_pre = 24,
    window_hours_post = c(6, 6), join_key = "PAT_ID",
    time_var = "RECORDED_TIME", event_name = "CREATININE", mult = "all")
}
gc()

As a long shot, I recompiled R-devel more simply as follows, still with --enable-strict-barrier which is the important thing :
./configure --without-recommended-packages --enable-strict-barrier
reinstalled constellation and all its dependencies and data.table 1.11.8, and this time Luke's script correctly fails :

Fatal error: Wrong thread calling 'RunFinalizers'

Installing data.table-dev (1.11.9) into this R-devel and rerunning works fine. Reinstalling data.table 1.11.8 fails again. So it is indeed fixed by 1.11.9. In my original R-devel compile, perhaps something was making 1.11.8 work (disabling the byte compiler, -O0, ASAN, etc.). Anyway, a simpler R-devel compile was needed and was worth trying.

@jangorecki If time allows, it would be great to add --enable-strict-barrier to the R-devel used in CI pipelines. I don't know if you compile R-devel or not for CI pipelines.

@mattdowle Yes, I compile R-devel daily and will add this flag there. Noted in #3147 so it isn't forgotten.
