I've spent a decent amount of time reading (and rereading and rereading) various explanations of NSE and dplyr programming, but have not been able to solve what seems like it should be a simple problem. I'd like to define summarization formulas in advance, and then write a function that feeds those formulas into summarise() for evaluation.
The motivation for this is to build a package specific to a set of data that sets up complicated aggregations of that data in a standard way that doesn't need to be remembered by the users, but allows them to quickly and easily pull the data and aggregate it properly, by only remembering the name of the aggregation.
Here's an example of what I mean:
price_per_carat <- ~ sum(price) / sum(carat)
price_depth_ratio <- ~ sum(price) / sum(depth)
summarise_as <- function(.data, ...) {
# some kind of black magic
}
diamonds %>%
group_by(cut, clarity)
summarise_as(price_per_carat, price_depth_ratio)
# A tibble: 40 x 4
# Groups: cut [?]
cut clarity price_per_carat price_depth_ratio
<ord> <ord> <dbl> <dbl>
1 Fair I1 2721.185 56.37119
2 Fair SI2 4297.840 80.31302
3 Fair SI1 4362.573 65.84666
4 Fair VS2 4715.875 65.60159
5 Fair VS1 4734.064 66.17081
6 Fair VVS2 4843.546 53.29967
7 Fair VVS1 5824.159 64.07653
8 Fair IF 4030.679 31.83685
9 Good I1 2989.670 57.94697
10 Good SI2 4424.404 73.58696
# ... with 30 more rows
In case that wasn't perfectly clear, I'm basically trying to find a way to get the summarise_as(price_per_carat, price_depth_ratio)
to behave as I had typed summarise(price_per_carat = sum(price) / sum(carat), price_depth_ratio = sum(price) / sum(depth))
.
I've tried all sorts of forms of quo(), enquo(), as.formula, deparse(substitute()), etc. to try to get it to work, and all have failed. What's the proper way to do this?
If you want to use formulas, here is one way:
library(tidyverse)
library(rlang)
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %@%, %||%, as_function, flatten, flatten_chr, flatten_dbl,
#> flatten_int, flatten_lgl, invoke, list_along, modify, prepend,
#> rep_along, splice
summarise_as <- function(.data, ...) {
dots <- quos(...)
names <- map_chr(dots, ~as.character(quo_get_expr(.x)))
names(dots) <- names
get_expr <- function(q){
formula <- eval_tidy(quo_get_expr(q), quo_get_env(q))
expr(!!formula[[2]])
}
exprs <- map( dots, get_expr)
summarise(.data, !!!exprs)
}
price_per_carat <- ~ sum(price) / sum(carat)
price_depth_ratio <- ~ sum(price) / sum(depth)
diamonds %>%
group_by(cut, clarity) %>%
summarise_as(price_per_carat, price_depth_ratio)
#> # A tibble: 40 x 4
#> # Groups: cut [?]
#> cut clarity price_per_carat price_depth_ratio
#> <ord> <ord> <dbl> <dbl>
#> 1 Fair I1 2721. 56.4
#> 2 Fair SI2 4298. 80.3
#> 3 Fair SI1 4363. 65.8
#> 4 Fair VS2 4716. 65.6
#> 5 Fair VS1 4734. 66.2
#> 6 Fair VVS2 4844. 53.3
#> 7 Fair VVS1 5824. 64.1
#> 8 Fair IF 4031. 31.8
#> 9 Good I1 2990. 57.9
#> 10 Good SI2 4424. 73.6
#> # ... with 30 more rows
maybe @lionel- has a better approach
Maybe that's easier:
library(tidyverse)
library(rlang)
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %@%, %||%, as_function, flatten, flatten_chr, flatten_dbl,
#> flatten_int, flatten_lgl, invoke, list_along, modify, prepend,
#> rep_along, splice
exprs <- list(
price_per_carat = expr(sum(price) / sum(carat)),
price_depth_ratio = expr(sum(price) / sum(depth))
)
diamonds %>%
group_by(cut, clarity) %>%
summarise(!!!exprs)
#> # A tibble: 40 x 4
#> # Groups: cut [?]
#> cut clarity price_per_carat price_depth_ratio
#> <ord> <ord> <dbl> <dbl>
#> 1 Fair I1 2721. 56.4
#> 2 Fair SI2 4298. 80.3
#> 3 Fair SI1 4363. 65.8
#> 4 Fair VS2 4716. 65.6
#> 5 Fair VS1 4734. 66.2
#> 6 Fair VVS2 4844. 53.3
#> 7 Fair VVS1 5824. 64.1
#> 8 Fair IF 4031. 31.8
#> 9 Good I1 2990. 57.9
#> 10 Good SI2 4424. 73.6
#> # ... with 30 more rows
Created on 2018-04-25 by the reprex package (v0.2.0).
I'll close this now. Perhaps consider https://community.rstudio.com for things that are questions more than issues.
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/
Most helpful comment
Maybe that's easier:
Created on 2018-04-25 by the reprex package (v0.2.0).