These are two separate issues, but I think they may have the same underlying code affecting them both. First, with select_, I cannot use spaces in the name:
> mtcars_tbl <- tbl_df(mtcars)
> mtcars_tbl <- rename(mtcars_tbl, `miles per gallon` = mpg)
> select_(mtcars_tbl, "miles per gallon")
Error in parse(text = x) : <text>:1:7: unexpected symbol
1: miles per
Same thing if I actually use the variable, as "intended".
> tmp <- "miles per gallon"
> select_(mtcars_tbl, tmp)
Error in parse(text = x) : <text>:1:7: unexpected symbol
1: miles per
So, whether I pass an actual variable with a string in it, as I presume was the original intent, or if I just want a cleaner way to deal with spaces, select_ still fails. filter_ (and presumably others) also fails:
> filter_(mtcars_tbl, "miles per gallon")
Error in parse(text = x) : <text>:1:7: unexpected symbol
1: miles per
OK, now onto problem 2: filter_ does not seem to work at all, even when this whitespace is not an issue. For example:
> filter_(mtcars_tbl, "cyl" > 4) %>% arrange(cyl) %>% head(2)
miles per gallon cyl disp hp drat wt qsec vs am gear carb
1 22.8 4 108.0 93 3.85 2.32 18.61 1 1 4 1
2 24.4 4 146.7 62 3.69 3.19 20.00 1 0 4 2
versus
> filter(mtcars_tbl, cyl > 4) %>% arrange(cyl) %>% head(2)
miles per gallon cyl disp hp drat wt qsec vs am gear carb
1 21 6 160 110 3.9 2.620 16.46 0 1 4 4
2 21 6 160 110 3.9 2.875 17.02 0 1 4 4
I played around with it some more, looking at the dim(), etc. It's pretty clear that nothing is happening with filter_. :frowning:
About problem 2, this is user error. You can use either "cyl > 4" or ~cyl > 4, but what happens here is that "cyl" > 4 gets evaluated by R before filter_ ever sees it:
> "cyl" > 4
[1] TRUE
so you get all the data back.
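To make that concrete, here is a minimal base R sketch (no dplyr involved, so it only imitates what filter_ ends up receiving). The names cond, kept_all, f, expr, keep and kept_some are just illustrative. It assumes the built-in mtcars data and that the string comparison comes out TRUE, as in the output above.

```r
# The comparison runs before any function receives it: 4 is coerced to "4"
# and the two strings are compared, giving a single logical value.
cond <- "cyl" > 4

# A lone TRUE used as a row index is recycled, so every row is kept.
kept_all <- mtcars[cond, ]
nrow(kept_all) == nrow(mtcars)

# A one-sided formula keeps the expression unevaluated until it is
# deliberately run against the data.
f <- ~ cyl > 4
expr <- f[[2]]                # the unevaluated call `cyl > 4`
keep <- eval(expr, mtcars)    # here `cyl` refers to the column
kept_some <- mtcars[keep, ]
nrow(kept_some) < nrow(mtcars)
```

The quoted string "cyl > 4" works for the same reason the formula does: the expression reaches the function unevaluated.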
For problem one, you want:
tmp <- "miles per gallon"
select_(mtcars_tbl, as.name(tmp))
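A rough base R illustration of why this helps, going by the parse(text = x) in the error message: a plain string is parsed as code, while as.name() builds a symbol directly. The toy data frame df and the other variable names are made up for the example, and this is only a sketch of the parsing step, not of dplyr's internals.

```r
col <- "miles per gallon"

# Parsing the raw string as code fails: it reads as three symbols in a row.
parse_error <- tryCatch(
  {
    parse(text = col)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
parse_error

# as.name() builds a single symbol with no parsing step.
sym <- as.name(col)
is.name(sym)

# Backticks quote a non-syntactic name, so this parses to the same symbol.
identical(parse(text = "`miles per gallon`")[[1]], sym)

# Hypothetical toy data frame with a column name containing spaces.
df <- data.frame(`miles per gallon` = c(21, 22.8), check.names = FALSE)
eval(sym, df)   # looks the symbol up as a column
```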
Romain,
Thank you for catching my error! I guess when I saw that one thing didn't work, I started suspecting the package more than my code.
Hadley,
Thank you for your solution. This will work in the short term, but is it a long term solution? What is the harm of wrapping everything with "as.name" automatically in the "*_" flavors of the package? Does that make something else crash somewhere else?
As I'd mentioned, there are 2 reasons to use the "*_" flavors:
It's a fundamental choice: should "my weird variable" work, or should "starts_with('abc')" work? I decided on the latter, and it's too late to change now. as.name() says that what you have is the name of a variable, which seems pretty reasonable to me.
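The trade-off can be seen in base R alone. The sketch below only shows the two possible readings of a string. It does not call dplyr, and as_code and as_var are illustrative names.

```r
# Read as code, the string becomes a function call...
as_code <- parse(text = "starts_with('abc')")[[1]]
is.call(as_code)

# ...read as a name, the same string becomes one oddly named variable.
as_var <- as.name("starts_with('abc')")
is.name(as_var)

# A name with spaces is only valid under the second reading.
is.name(as.name("my weird variable"))
```

A string cannot be given both readings at once, so the caller has to say which one is meant, and as.name() is how to say "this is a variable name".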
Hi Hadley,
OK, understood! So there are unfortunately inevitable conflicts in how names are interpreted. I understand that, as it has been the bane of R for many years.
I guess I was just hoping that with all the marvelous things that dplyr, tidyr, ggplot2, etc. have been able to manage, despite relying upon a language that has made some poor choices under the hood, that maybe you found some cool magical way to resolve this issue, too.
Thanks for your response, and all your amazing packages!