Here is the behaviour:
> d <- tbl_df(data.frame(xxx = 1:2, yyy = 1:2, bxx = 1:2, bbb = 1:2))
> d %>% select(starts_with('nonsense'))
Source: local data frame [2 x 4]
  xxx yyy bxx bbb
1   1   1   1   1
2   2   2   2   2
> d %>% select(ends_with('nonsense'))
Source: local data frame [2 x 4]
  xxx yyy bxx bbb
1   1   1   1   1
2   2   2   2   2
> d %>% select(matches('nonsense'))
Source: local data frame [2 x 4]
  xxx yyy bxx bbb
1   1   1   1   1
2   2   2   2   2
> d %>% select(contains('nonsense'))
Source: local data frame [2 x 4]
  xxx yyy bxx bbb
1   1   1   1   1
2   2   2   2   2
Clearly the select function should not return all columns in the dataframe. It should either throw an error with a helpful message or return an empty dataframe. I am not sure which would be preferable.
From what I can see, the problem is in the list of functions called select_funs in the select_vars_q function. One would have to catch the error in there and decide what to return accordingly. I would be happy to submit a pull request, but I don't really want to do the work without hearing what you think the most appropriate return value would be :)
I think it should throw an error, something like "Failed to select any columns". It also needs to handle cases like select(mtcars, -(mpg:carb)).
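To make the proposal concrete, here is a minimal base R sketch of the kind of guard being discussed. It is not dplyr's implementation: check_selected is a hypothetical helper, and the matching is done with base R's startsWith and setdiff rather than dplyr's select helpers. It only illustrates that "nothing matched" and "everything was dropped" can be handled by the same check.

```r
# Hypothetical guard, not part of dplyr: error out when a selection is empty.
check_selected <- function(vars) {
  if (length(vars) == 0) {
    stop("Failed to select any columns", call. = FALSE)
  }
  vars
}

d <- data.frame(xxx = 1:2, yyy = 1:2, bxx = 1:2, bbb = 1:2)

# A prefix that matches something passes through unchanged.
matched <- check_selected(names(d)[startsWith(names(d), "b")])  # "bxx" "bbb"

# A prefix that matches nothing raises the error; capture its message.
no_match <- tryCatch(
  check_selected(names(d)[startsWith(names(d), "nonsense")]),
  error = function(e) conditionMessage(e)
)

# Dropping every column, as in select(mtcars, -(mpg:carb)), is also empty.
all_dropped <- tryCatch(
  check_selected(setdiff(names(mtcars), names(mtcars))),
  error = function(e) conditionMessage(e)
)
```

Both no_match and all_dropped end up holding the message "Failed to select any columns", so a single check after the column names are resolved would cover both cases.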
Cool. I'll send a pull request in a week or so. On holiday atm.
I'm not sure if this is a related issue, but the following seems inconsistent to me:
> data_frame(a=1, ba=1) %>% select(starts_with("a"), ends_with("b")) %>% names
character(0)
>
> data_frame(a=1, ab=1) %>% select(starts_with("a"), ends_with("b")) %>% names
[1] "a" "ab"
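For comparison, here is a small base R sketch of what one might expect if several helpers in one select() call were combined as a union of their matches. That union semantics is my assumption about the intended behaviour, not something confirmed in this thread, and union_select and the two *_names functions are hypothetical stand-ins rather than dplyr functions.

```r
# Hypothetical base R stand-ins for the helpers; each returns matching names.
starts_with_names <- function(vars, prefix) vars[startsWith(vars, prefix)]
ends_with_names <- function(vars, suffix) vars[endsWith(vars, suffix)]

# Assumed semantics: keep every column matched by at least one helper.
union_select <- function(vars, prefix, suffix) {
  union(starts_with_names(vars, prefix), ends_with_names(vars, suffix))
}

first <- union_select(c("a", "ba"), "a", "b")   # "a"
second <- union_select(c("a", "ab"), "a", "b")  # "a" "ab"
```

Under that assumption the first call would keep "a" instead of returning character(0), which is why the two results above look inconsistent.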