Dplyr: Duplicated columns in join get suffixes

Created on 5 Dec 2013 · 9 Comments · Source: tidyverse/dplyr

e <- data.frame(x = c(1, 1, 2, 3), z = 1:4)
f <- data.frame(x = c(1, 2, 2, 4), z = 1:4)

j <- inner_join(tbl_cpp(e), tbl_cpp(f), "x")

If the same name exists in both the x and y sources, the variable names in the output get .x and .y appended.

bug

All 9 comments

BTW this is the last error I get from replacing tbl_df with tbl_cpp - I'll merge in that big change once this one is fixed.

Alright. I'll start on that right now then.

I find the default suffixes .x .y rarely useful. Is it sensible to provide an argument "suffixes" for all joins (defaulting to c(".x", ".y")) a la merge()? Or is there an easy way to do that with available tools?
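
For reference, this is roughly the merge() behaviour being asked for. It is a minimal sketch in base R only, reusing the e and f data frames from the issue; the "_e" and "_f" suffixes are arbitrary choices for illustration.

```r
# Data frames from the original report.
e <- data.frame(x = c(1, 1, 2, 3), z = 1:4)
f <- data.frame(x = c(1, 2, 2, 4), z = 1:4)

# merge() performs an inner join by default. `suffixes` controls what is
# appended to non-key columns that share a name (here: z).
m <- merge(e, f, by = "x", suffixes = c("_e", "_f"))

names(m)
```

The clashing column z comes out as z_e and z_f instead of z.x and z.y. dplyr's join functions have since gained a comparable `suffix` argument, so depending on the installed version the same effect may be available directly on inner_join().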

+1 to rmatev's comment.

The joins offer no way to change the suffix, which sometimes makes me reluctantly fall back to merge().

A fix would be great. Thanks for all your work.

I agree. Would be nice to pick the suffix.

One more who ended up here hoping for a suffix option. Or a function like rename_each to pipe in before the join.
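
A rename-before-join workaround could look something like the sketch below. The helper suffix_except() is hypothetical (it is not part of dplyr or base R), and merge() stands in for the join so the example runs without extra packages; the renamed data frames could be passed to inner_join() in the same way.

```r
# Hypothetical helper: append `suffix` to every column name except the join keys.
suffix_except <- function(df, suffix, keys) {
  to_rename <- !(names(df) %in% keys)
  names(df)[to_rename] <- paste0(names(df)[to_rename], suffix)
  df
}

e <- data.frame(x = c(1, 1, 2, 3), z = 1:4)
f <- data.frame(x = c(1, 2, 2, 4), z = 1:4)

# Rename first, then join: no column names clash, so no suffixes are added.
j <- merge(
  suffix_except(e, "_left", keys = "x"),
  suffix_except(f, "_right", keys = "x"),
  by = "x"
)

names(j)
```

Because the columns are already distinct when the join runs, the result carries z_left and z_right rather than the default .x and .y names.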

+1 to rmatev's comment: Having customizable suffixes like for base::merge() would help a lot.

+1

Upvote for the suffix option.
