I have some more super-wide data and wanted to melt it:
melt(DT, 1:2)
# Error in rbindlist(l, use.names, fill, idcol) :
# Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'character'
If I had instead run melt(DT, names(DT)[1:2]), I'd see the same message. Interestingly, when I repeat the command, I get a different error:
# Error in rbindlist(l, use.names, fill, idcol) :
# Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'integer'
After that, I never see the 'character' error again. I ran this a few times on my single-row example and managed to crash R. This error isn't a big problem, though; it just means I need to approach the problem differently.
EDIT: problem ended up being due to mistakenly assigning the same name to multiple columns.
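A minimal sketch of a guard that fails early when column names repeat, instead of handing them to melt(). The helper name check_unique_names and the toy data frame are hypothetical and not part of data.table:

```r
# Hypothetical helper: stop with a readable message if any column name repeats
check_unique_names <- function(x) {
  dup <- unique(names(x)[duplicated(names(x))])
  if (length(dup) > 0L) {
    stop("duplicated column names: ", paste(dup, collapse = ", "))
  }
  invisible(x)
}

# Toy input with the name "v" used twice (check.names = FALSE keeps it as-is)
df <- data.frame(id = 1:2, v = 3:4, v = 5:6, check.names = FALSE)

# Capture the error message rather than aborting the script
res <- tryCatch(check_unique_names(df), error = function(e) conditionMessage(e))
res
```

Calling the check right before melt() turns the memory error into an ordinary R error that names the offending columns.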
Tacking on another example of this:
library(dplyr);library(data.table)
synth = fread("http://stanford.edu/~wpmarble/MLAB_data.txt") %>% t %>% as.data.table
colnames(synth) = c("state", "income", "retailprice", "percent_15_19",
"beercons", "smoking88", "smoking80", "smoking75",
paste0("smoking", 70:99), "smoking00")
synth.long = melt(synth,
id.vars = c("state", "income", "retailprice",
"percent_15_19", "beercons"),
measure = patterns("^smoking"))
synth.long
# Error in rbindlist(l, use.names, fill, idcol) :
# Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'integer'
Different runs will produce a slightly different error; sometimes instead of 'integer' it says 'character' or 'raw'. A few times the last line (printing synth.long) has crashed RStudio.
> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.1 (unknown)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.9.6 dplyr_0.4.3
loaded via a namespace (and not attached):
[1] R6_2.1.2 assertthat_0.1 magrittr_1.5 parallel_3.3.0 tools_3.3.0
[6] DBI_0.4-1 Rcpp_0.12.5 chron_2.3-47
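The name vector passed to colnames() in this example can be checked for repeats without downloading the data. This sketch uses only base R; the variable names nms and dups are illustrative:

```r
# Same name vector as in the colnames(synth) assignment
nms <- c("state", "income", "retailprice", "percent_15_19",
         "beercons", "smoking88", "smoking80", "smoking75",
         paste0("smoking", 70:99), "smoking00")

# duplicated() flags the second and later occurrences of each name
dups <- nms[duplicated(nms)]
dups
```

Here paste0("smoking", 70:99) regenerates smoking75, smoking80 and smoking88, which were already listed explicitly, so the table ends up with three pairs of identically named columns.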
It is not just that columns have the same name, because that can also work just fine.
I had a bit of trouble making a nice, simple reproducible example.
This works fine:
DT <- setDT(data.frame("Time.point" = seq(0, 6), "Time.(h)" = c(0.0, 0.5, 1.0, 3.0, 5.0, 7.0, 24.0),
"NEW.ME" = runif(7), "NEW.ME" = runif(7), check.names = FALSE))
DT <- data.table::melt(data = DT, c("Time.point", "Time.(h)"), na.rm = TRUE)
DT
This one will either error, or crash R:
DT <- setDT(data.frame("Time.point" = seq(0, 6), "Time.(h)" = c(0.0, 0.5, 1.0, 3.0, 5.0, 7.0, 24.0),
"NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.ME" = runif(7),
"NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.ME" = runif(7), "NEW.MER" = runif(7), "F050" = runif(7),
"NEW.MER" = runif(7), "F16-42-123p123C" = runif(7), "F16-42-123p123C" = runif(7), "NEW.MER" = runif(7),
"F16-42-123p123C" = runif(7), check.names = FALSE))
DT <- data.table::melt(data = DT, c("Time.point", "Time.(h)"), na.rm = TRUE)
DT
SessionInfo:
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Dutch_Netherlands.1252 LC_CTYPE=Dutch_Netherlands.1252 LC_MONETARY=Dutch_Netherlands.1252
[4] LC_NUMERIC=C LC_TIME=Dutch_Netherlands.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.10.0
loaded via a namespace (and not attached):
[1] tools_3.3.2
@wpmarble your example actually segfaults on current dev:
synth = fread("http://stanford.edu/~wpmarble/MLAB_data.txt") %>% t %>% as.data.table
# trying URL 'http://stanford.edu/~wpmarble/MLAB_data.txt'
# Content type 'text/plain' length 15833 bytes (15 KB)
# ==================================================
# downloaded 15 KB
colnames(synth) = c("state", "income", "retailprice", "percent_15_19",
"beercons", "smoking88", "smoking80", "smoking75",
paste0("smoking", 70:99), "smoking00")
synth.long = melt(synth,
id.vars = c("state", "income", "retailprice",
"percent_15_19", "beercons"),
measure = patterns("^smoking"))
synth.long
# *** caught segfault ***
# address 0xe000008e, cause 'memory not mapped'
I am having the same issue: a segfault after calling melt on a data.table with two identical column names. Here is an MRE:
library(data.table)
devtools::session_info()
buggy.dt <- fread("month,Record high,Average high,Daily mean,Average low,Record low,Average precipitation,Average rainfall,Average snowfall,Average precipitation,Average rainy,Average snowy,Mean monthly sunshine hours
Jan,12.8,-5.4,-8.9,-12.4,-33.5,73.6,28.4,45.9,15.8,4.3,13.6,99.2
Feb,15,-3.7,-7.2,-10.6,-33.3,70.9,22.7,46.6,12.8,4,11.1,119.5
Mar,25.9,2.4,-1.2,-4.8,-28.9,80.2,42.2,36.8,13.6,7.4,8.3,158.8
Apr,30.1,11,7,2.9,-17.8,76.9,65.2,11.8,12.5,10.9,3,181.7
May,34.2,19,14.5,10,-5,86.5,86.5,0.4,12.9,12.8,0.14,229.8
Jun,34.5,23.7,19.3,14.9,1.1,87.5,87.5,0,13.8,13.8,0,250.1
Jul,36.1,26.6,22.3,17.9,7.8,106.2,106.2,0,12.3,12.3,0,271.6
Aug,35.6,24.8,20.8,16.7,6.1,100.6,100.6,0,13.4,13.4,0,230.7
Sep,33.5,19.4,15.7,11.9,0,100.8,100.8,0,12.7,12.7,0,174.1")
tall.dt <- melt(buggy.dt, id.vars="month", verbose=TRUE)
print(tall.dt)
The output I get on my system is:
tdhock@recycled:~/projects/temperature-sensor(master)$ R --vanilla < buggy-simple.R
R version 3.4.2 (2017-09-28) -- "Short Summer"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: i686-pc-linux-gnu (32-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(data.table)
> devtools::session_info()
Session info ------------------------------------------------------------------
setting value
version R version 3.4.2 (2017-09-28)
system i686, linux-gnu
ui X11
language en_US
collate en_US.UTF-8
tz Canada/Eastern
date 2017-11-25
Packages ----------------------------------------------------------------------
package * version date source
base * 3.4.2 2017-11-18 local
compiler 3.4.2 2017-11-18 local
data.table * 1.10.5 2017-08-28 Github (Rdatatable/data.table@a869907)
datasets * 3.4.2 2017-11-18 local
devtools 1.13.2 2017-06-02 cran (@1.13.2)
digest 0.6.12 2017-01-27 cran (@0.6.12)
graphics * 3.4.2 2017-11-18 local
grDevices * 3.4.2 2017-11-18 local
memoise 1.1.0 2017-04-21 cran (@1.1.0)
methods * 3.4.2 2017-11-18 local
stats * 3.4.2 2017-11-18 local
utils * 3.4.2 2017-11-18 local
withr 2.0.0 2017-07-28 cran (@2.0.0)
> buggy.dt <- fread("month,Record high,Average high,Daily mean,Average low,Record low,Average precipitation,Average rainfall,Average snowfall,Average precipitation,Average rainy,Average snowy,Mean monthly sunshine hours
+ Jan,12.8,-5.4,-8.9,-12.4,-33.5,73.6,28.4,45.9,15.8,4.3,13.6,99.2
+ Feb,15,-3.7,-7.2,-10.6,-33.3,70.9,22.7,46.6,12.8,4,11.1,119.5
+ Mar,25.9,2.4,-1.2,-4.8,-28.9,80.2,42.2,36.8,13.6,7.4,8.3,158.8
+ Apr,30.1,11,7,2.9,-17.8,76.9,65.2,11.8,12.5,10.9,3,181.7
+ May,34.2,19,14.5,10,-5,86.5,86.5,0.4,12.9,12.8,0.14,229.8
+ Jun,34.5,23.7,19.3,14.9,1.1,87.5,87.5,0,13.8,13.8,0,250.1
+ Jul,36.1,26.6,22.3,17.9,7.8,106.2,106.2,0,12.3,12.3,0,271.6
+ Aug,35.6,24.8,20.8,16.7,6.1,100.6,100.6,0,13.4,13.4,0,230.7
+ Sep,33.5,19.4,15.7,11.9,0,100.8,100.8,0,12.7,12.7,0,174.1")
> tall.dt <- melt(buggy.dt, id.vars="month", verbose=TRUE)
'measure.vars' is missing. Assigning all columns other than 'id.vars' columns as 'measure.vars'.
Assigned 'measure.vars' are [Record high, Average high, Daily mean, Average low, ...].
> print(tall.dt)
*** caught segfault ***
address (nil), cause 'memory not mapped'
Traceback:
1: rbindlist(l, use.names, fill, idcol)
2: data.table::.rbind.data.table(...)
3: rbind(deparse.level, ...)
4: rbind(head(x, topn), tail(x, topn))
5: print.data.table(tall.dt)
6: print(tall.dt)
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault
tdhock@recycled:~/projects/temperature-sensor(master)$
I also tried with the current data.table github master (3db6e9832e1491cb0e10f8bdcd4348f5c2b1ac8a) and I observed the same segfault.
Hello, are there any updates on melting a DT with >= 2 identically named columns?
@sung For now, just rename the duplicated columns. make.names(names(DT), unique = TRUE) can help.
Thanks, @MichaelChirico. This seems to work for now too:
as.data.table(melt(as.data.frame(DT[, ..columns.with.identical.names])))
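The renaming workaround suggested above can be sketched as follows. The toy table is hypothetical, and make.unique() is used here as an assumption in place of make.names(), since it changes only the repeated names and leaves the others untouched:

```r
library(data.table)

# Hypothetical toy table with the name "x" used twice
DT <- setDT(data.frame(id = 1:3, x = c(1.5, 2.5, 3.5), x = c(4, 5, 6),
                       check.names = FALSE))

# Make the names unique in place; the second "x" becomes "x.1"
setnames(DT, make.unique(names(DT)))
names(DT)

# With unique names, melt() has one distinct measure column per name
long <- melt(DT, id.vars = "id")
long
```

If the original names need to be restored in the long table, the suffix can be stripped from the variable column afterwards.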