This is a meta issue for all bind/grouped-mutate/join/filter issues related to custom S3 and S4 classes.
See also hadley/vctrs#27
Should be implemented in vctrs.
We're trying to build a package for "tidy" functional data analysis which defines new data types for function-valued data, and we are not sure how to proceed since dplyr's behavior for non-base columns is ... mysterious, especially with grouping.
Can anyone offer some advice on how to design classes for new data types so that they work with current (and future) versions of dplyr?
Looks like we still need to wrap our heads around how this will be supported consistently in future dplyr versions. The vctrs repo contains some pointers, but nothing production-ready yet, and no clean write-up as far as I remember.
@krlmlr
Thanks for the quick reply, but to be honest, any pointers I might be able to extract from the vctrs code won't really help to make our stuff work in the near term, since development there seems to have stalled for a long time.
I know it is a lot to ask, but would you be willing to answer some specific questions about how dplyr's grouped_df-table verbs evaluate expressions if I write up some code snippets where I don't understand what's happening? These summarize and mutate calls are really hard to debug/inspect since they call C++ and I suspect my idea to define new mean, sum, sd, etc... methods for the functional datatype S3 classes we define might clash with dplyr's hybrid evaluation scheme...?
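To make the question concrete, here is a minimal base-R sketch of the kind of setup I mean. The class name fvec and the constructor new_fvec are hypothetical stand-ins for our real classes, and the grouping is done by hand with split/lapply rather than with dplyr, so this only shows what I would expect a grouped summarise to return if it dispatched to the S3 method. It says nothing about what dplyr's hybrid evaluation actually does.

```r
# Hypothetical S3 class standing in for a functional data type.
new_fvec <- function(x = double()) structure(x, class = "fvec")

# Subsetting has to keep the class, otherwise method dispatch is lost per group.
`[.fvec` <- function(x, i, ...) new_fvec(unclass(x)[i])

# Custom mean method: announces itself so dispatch is visible, returns an fvec.
mean.fvec <- function(x, ...) {
  message("mean.fvec dispatched")
  new_fvec(mean(unclass(x), ...))
}

x <- new_fvec(c(1, 2, 3, 10))
g <- c("a", "a", "b", "b")

# Manual "grouped summarise": split the row indices by group, then call the
# generic on each subset so that mean.fvec is used.
by_group <- lapply(split(seq_along(x), g), function(i) mean(x[i]))
```

If a grouped dplyr call on the same data returned plain doubles, or never printed the message, that would suggest the S3 method is being bypassed.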
@fabian-s: Sure, if the interface or the behavior is too complicated and not obvious from the docs, this qualifies as a bug ;-)
I am currently experiencing a problem with lubridate intervals when using dplyr joins. I calculate the intervals first in one table, where each entity appears once, and then use an inner_join to a larger table where each entity appears in multiple rows. Now in the new table, the date interval shifts around apparently randomly from row to row for the same entity (although the length of the interval seems to be consistent and correct).
I can't tell from the series of issues I've clicked through if this precise problem has been raised before - but it's a current problem, and I would definitely call this a bug.
@tomwwagstaff: Can you please double-check with the development version of dplyr? I assume you'll be seeing an error, because we don't support the Interval class from lubridate. The problem you describe doesn't seem to be related to this issue, so please open a new issue (with a reprex).
Now that vctrs is on CRAN, will dplyr incorporate vctrs going forward? vctrs does look promising, but I'm hesitant to refactor existing code only to then play catch-up if specs change.
We're planning to integrate in 0.9.0. We do not have a specific timeline.
Pretty sure this is now taken care of by vctrs
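For anyone landing here later, a minimal sketch of what that looks like, assuming the vctrs package is installed. The class name fvec and the constructor new_fvec are made up for illustration; new_vctr, vec_slice and vec_c are vctrs functions. It only checks that slicing and combining keep the class, which is the building block that bind, join and filter operations rely on, and it is not a statement about any particular dplyr release.

```r
library(vctrs)

# Hypothetical custom vector type built on top of a double vector.
new_fvec <- function(x = double()) new_vctr(x, class = "fvec")

x <- new_fvec(c(1, 2, 3, 10))

# Slicing (what filter-like operations need) keeps the class.
sliced <- vec_slice(x, c(1, 4))

# Combining two vectors of the same class (what bind-like operations need)
# also keeps the class.
combined <- vec_c(x, new_fvec(5))
```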
Yeah, we can close this one now.