What's the best way of doing it?
The following
iris %>% select(which(sapply(., is.numeric)))
works with the latest version of magrittr. However,
iris %>% mutate_each(funs(log), which(sapply(., is.numeric)))
throws Error in lapply(X = X, FUN = FUN, ...) : object '.' not found, even with the updated chain operator. Is this the expected behaviour?
In general, what is the best way to filter columns by some boolean condition in dplyr? Would it be worth having another selection function (in addition to starts_with, ends_with, etc.) acting directly on columns instead of column names? E.g. something along the lines of
iris %>% select(satisfies(is.numeric))
Right, just noticed that
iris %>% mutate_each_q(funs(log), which(sapply(., is.numeric)))
does work, which makes sense.
There's no best way currently. I think it's going to be fairly hard to implement across backend types, but maybe I could add some generic methods for determining column types.
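For plain in-memory data frames, the same column filtering can be sketched in base R without any special selection helper. This is only an illustration and not something proposed in the thread: the names is_num, numeric_only and iris_logged are made up for the example, and the approach does not address other backends.

```r
# Logical vector with one entry per column: TRUE where the column is numeric
is_num <- vapply(iris, is.numeric, logical(1))

# Keep only the numeric columns
numeric_only <- iris[is_num]

# Apply log() to the numeric columns and leave the others untouched
iris_logged <- iris
iris_logged[is_num] <- lapply(iris_logged[is_num], log)

head(iris_logged, 2)
```

vapply() is used instead of sapply() so that the result is guaranteed to be a logical vector, even for a data frame with zero columns.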
The purrr package now offers nice solutions to @steromano's original question, at least for data frames.
To keep all numeric columns:
iris %>%
  purrr::keep(is.numeric) %>%
  head(2)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1          5.1         3.5          1.4         0.2
#> 2          4.9         3.0          1.4         0.2
To mutate all numeric columns with a single function:
iris %>%
  purrr::map_if(is.numeric, log) %>%
  head(2)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1     1.629241    1.252763    0.3364722   -1.609438  setosa
#> 2     1.589235    1.098612    0.3364722   -1.609438  setosa
Nice! select_if() worked perfectly. Thank you.
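For anyone landing here later, a minimal sketch of the select_if() approach, assuming a dplyr version that ships the *_if verbs. mutate_if() is not mentioned above; it is included here as the matching verb for the mutate case, and the variable names are only illustrative.

```r
library(dplyr)

# Keep only the columns for which the predicate returns TRUE
numeric_only <- iris %>% select_if(is.numeric)

# Apply log() to every numeric column; Species is left unchanged
iris_logged <- iris %>% mutate_if(is.numeric, log)

head(iris_logged, 2)
```

In dplyr 1.0.0 and later the same idea is usually written as select(where(is.numeric)) and mutate(across(where(is.numeric), log)), which is close to the satisfies() helper suggested in the original question.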