dplyr: Selecting all numeric columns

Created on 14 Jul 2014 · 4 Comments · Source: tidyverse/dplyr

What's the best way to select all numeric columns?
The following

iris %>% select(which(sapply(., is.numeric)))

works with the latest version of magrittr. However,

iris %>% mutate_each(funs(log), which(sapply(., is.numeric)))

throws Error in lapply(X = X, FUN = FUN, ...) : object '.' not found, even with the updated chain operator. Is this the expected behaviour?
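
One way to sidestep the `.` lookup entirely is to work out the numeric columns before calling any verb. The following is a minimal base R sketch, not something proposed in the thread; the names `is_num`, `numeric_only` and `logged` are made up for illustration.

```r
# Base R sketch: find the numeric columns up front, so nothing
# depends on `.` being visible inside a dplyr verb.
is_num <- vapply(iris, is.numeric, logical(1))

# Keep only the numeric columns.
numeric_only <- iris[is_num]

# Copy the data, then log-transform the numeric columns in place.
logged <- iris
logged[is_num] <- lapply(logged[is_num], log)

head(logged, 2)
```

Because `is_num` is an ordinary logical vector, it can be reused for both selecting and transforming.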

In general, what is the best way to filter columns by some boolean condition in dplyr? Would it be worth having another selection function (in addition to starts_with, ends_with, etc.) acting directly on columns instead of column names? E.g. something along the lines of

iris %>% select(satisfies(is.numeric))
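
For readers arriving later: `satisfies()` was only a proposed name and never existed, but later dplyr releases ship a predicate-based selection helper, `where()`, that fills the same role. The sketch below assumes dplyr 1.0.0 or newer; `numeric_only` and `logged` are illustrative names.

```r
library(dplyr)  # assumes dplyr >= 1.0.0, which provides across() and where()

# Select columns by a predicate applied to the column itself,
# rather than to the column name.
numeric_only <- iris %>% select(where(is.numeric))

# Apply a function to every column that satisfies the predicate,
# leaving the other columns (here, Species) untouched.
logged <- iris %>% mutate(across(where(is.numeric), log))

head(logged, 2)
```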

All 4 comments

Right, just noticed that

iris %>% mutate_each_q(funs(log), which(sapply(., is.numeric)))

does work, which makes sense.

There's no best way currently. I think it's going to be fairly hard to implement across backend types, but maybe I could add some generic methods for determining column types.

The purrr package now offers nice solutions to @steromano's original question, at least for data frames.

To keep all numeric columns:

iris %>% 
  purrr::keep(is.numeric) %>% 
  head(2)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1          5.1         3.5          1.4         0.2
#> 2          4.9         3.0          1.4         0.2

To mutate all numeric columns with a single function:

iris %>%
  purrr::map_if(is.numeric, log) %>%
  head(2)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1     1.629241    1.252763    0.3364722   -1.609438  setosa
#> 2     1.589235    1.098612    0.3364722   -1.609438  setosa
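
The data frame output shown for `map_if()` reflects purrr as it was when this comment was written. As far as I can tell, later purrr releases make the `map` family always return a list, so the same call would no longer print a data frame. A hedged sketch of an alternative is `modify_if()`, which returns the same type as its input (assuming a purrr version that provides it):

```r
library(purrr)  # assumes a purrr release that includes modify_if()

# modify_if() applies log() to the columns where is.numeric() is TRUE
# and returns an object of the same type as its input.
logged <- modify_if(iris, is.numeric, log)

is.data.frame(logged)
head(logged, 2)
```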

Nice! select_if() worked perfectly. Thank you.
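
The thread never shows the `select_if()` call itself, so here is a minimal sketch of what it presumably looked like, together with its `mutate_if()` counterpart. It assumes a dplyr version that includes these scoped verbs (newer releases mark them as superseded but still ship them); the variable names are illustrative.

```r
library(dplyr)  # assumes a dplyr version that still ships the scoped *_if() verbs

# Keep every column for which the predicate returns TRUE.
numeric_only <- iris %>% select_if(is.numeric)

# Transform only the columns for which the predicate returns TRUE.
logged <- iris %>% mutate_if(is.numeric, log)

head(numeric_only, 2)
head(logged, 2)
```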
