I'm sure a smaller reprex is possible, but this is a problem reduced from tidyverts/tsibble#194.
The code below will crash R:
gr <- structure(list(1:14566, 14567:30980, 30981:48498, 48499:66037),
                ptype = integer(0),
                class = c("vctrs_list_of", "vctrs_vctr", "list"))
nr <- 0L
.Call(dplyr:::dplyr_group_indices, gr, nr)
Can you create a reprex with the public interface please? We don't support calling the unexported interface with malformed arguments.
@lionel- Sure. Is the following better? The call to group_indices sometimes has to be repeated once or twice to get the crash to happen.
res <-
structure(list(
Sensor = character(0), Date_Time = structure(numeric(0), tzone = "Australia/Melbourne", class = c(
"POSIXct",
"POSIXt"
)), Date = structure(numeric(0), class = "Date"), Time = integer(0),
Count = integer(0)
), row.names = integer(0), class = c(
"grouped_ts",
"grouped_df", "tbl_ts", "tbl_df", "tbl", "data.frame"
), key = structure(list(
Sensor = c(
"Birrarung Marr", "Bourke Street Mall (North)",
"QV Market-Elizabeth St (West)", "Southern Cross Station"
), .rows = structure(list(
1:14566, 14567:30980, 30981:48498,
48499:66037
), ptype = integer(0), class = c(
"vctrs_list_of",
"vctrs_vctr", "list"
))
), row.names = c(NA, 4L), class = c(
"tbl_df",
"tbl", "data.frame"
), .drop = TRUE), index = structure("Date_Time", ordered = TRUE), index2 = "Date_Time", interval = structure(list(
year = 0, quarter = 0, month = 0, week = 0, day = 0, hour = 1,
minute = 0, second = 0, millisecond = 0, microsecond = 0,
nanosecond = 0, unit = 0
), .regular = TRUE, class = c(
"interval",
"vctrs_rcrd", "vctrs_vctr"
)), groups = structure(list(Sensor = c(
"Birrarung Marr",
"Bourke Street Mall (North)", "QV Market-Elizabeth St (West)",
"Southern Cross Station"
), .rows = structure(list(
1:14566, 14567:30980,
30981:48498, 48499:66037
), ptype = integer(0), class = c(
"vctrs_list_of",
"vctrs_vctr", "list"
))), row.names = c(NA, 4L), class = c(
"tbl_df",
"tbl", "data.frame"
), .drop = TRUE))
dplyr::group_indices(res)
dplyr::group_indices(res)
Thank you. You are still creating data structures without using the public interface. Can you use constructors please?
After trying to simplify it, I indeed found what I believe to be the root cause. tsibble:::as_tibble.grouped_ts was not using the grouped_df constructor to create a grouped_df.
This is a contrived (non-)example, but it shows essentially what caused the error, namely using constructors inappropriately:
df <-
  tibble::new_tibble(
    data.frame(x = integer(), y = integer()),
    groups = data.frame(x = 0, .rows = vctrs::list_of(1:1000)),
    nrow = 0,
    class = "grouped_df"
  )
dplyr::group_indices(df)
Thanks a lot for debugging this @TylerGrantSmith!
@earowang Could you use the exported constructor to build a grouped-df please? Only the public constructor can guarantee consistent data. Please let me know if you run into other issues.
I think the issue is that new_grouped_df(), the public constructor, doesn't check that the data and the groups are consistent with each other. (edit: this inconsistency is perhaps needed for filter(.preserve = TRUE))
You need to run the following snippet a couple of times to trigger the crash.
df <- dplyr::new_grouped_df(
  data.frame(x = integer(), y = integer()),
  groups = data.frame(x = 0, .rows = vctrs::list_of(1:1000))
)
dplyr::group_indices(df)
#> integer(0)
Created on 2020-10-01 by the reprex package (v0.3.0)
This is what's happening:
SEXP indices = PROTECT(Rf_allocVector(INTSXP, nr));
int* p_indices = INTEGER(indices);
This segfaults when nr is 0.
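To illustrate the failure mode, here is a standalone C sketch. It is not dplyr's code and does not use the R API. It assumes that, after the allocation, the group ids are written at the positions listed in the group rows, so a group that lists rows 1:1000 against a zero-length buffer would write out of bounds. The function name fill_group_indices and its signature are hypothetical; the point is only the bounds check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not dplyr's actual implementation.
 * rows[g] holds the 1-based row numbers of group g; n_rows[g] is its length.
 * Writes the 1-based group id of each row into out[0..nr-1].
 * Returns 0 on success, -1 if any row number falls outside 1..nr. */
int fill_group_indices(int *out, size_t nr,
                       const int *const *rows, const size_t *n_rows,
                       size_t n_groups) {
    /* A NULL buffer is only acceptable when there are no rows at all. */
    assert(out != NULL || nr == 0);

    for (size_t g = 0; g < n_groups; g++) {
        for (size_t j = 0; j < n_rows[g]; j++) {
            int r = rows[g][j];
            /* Guard: when nr == 0 every row number is out of bounds,
             * so we stop before touching the buffer. */
            if (r < 1 || (size_t)r > nr) {
                return -1;
            }
            out[r - 1] = (int)(g + 1);
        }
    }
    return 0;
}
```

With consistent inputs the function fills the buffer as usual. With nr equal to 0 and non-empty groups it reports an error instead of writing past the end, which is the situation the malformed grouped_df above creates.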