Data.table: A named vector version of setnames [request]

Created on 18 Jun 2015 · 11 Comments · Source: Rdatatable/data.table

Hi!

I'm really missing a named vector version of setnames, e.g.:

setnames(DT, "b", "B") # current
setnames(DT, c("b" = "B")) # same as above

E.g., in https://github.com/hadley/plyr this is done by two different functions, revalue and mapvalues, where revalue just calls mapvalues:

revalue <- function (x, replace = NULL, warn_missing = TRUE) 
{
    if (!is.null(x) && !is.factor(x) && !is.character(x)) {
        stop("x is not a factor or a character vector.")
    }
    mapvalues(x, from = names(replace), to = replace, warn_missing = warn_missing)
}

What do you think? Thanks!

Bela

enhancement

All 11 comments

In what way do you miss it :-)? Are you finding it harder to type, remember, program with, understand, read, etc.? Because the way I see it, I don't really miss it :-).

FWIW, you could do this

vec <- c(b="B")
setnames(DT, names(vec), vec)

I would find it easier to read when renaming several (like more than 10) names, which is a common task I have to do, and it would be easier for other people to read. I can always define my own function, but I thought it would be a nice addition for everybody. And I guess I'm not alone, since a convenience function of this sort exists in plyr.

Yeah, I like to make named lists in my code for cleaning data. I don't rename variables very often, but if I'm importing from somewhere with annoying naming conventions and a lot of variables, like the Census, it's especially useful. I mean like...

my_renames <- c(
  badname  = "goodname",
  badname2 = "goodname2"
)

The workaround GSee mentions is fine, but I would use the "named vector" syntax if it existed.
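
For illustration, here is roughly how that workaround looks with a rename map like the one above. This is only a sketch: the table `DT` and its column names are made up for the example, and it relies on the existing `setnames(x, old, new)` form rather than any new feature.

```r
library(data.table)

# Hypothetical table with awkward column names (made up for this example)
DT <- data.table(badname = 1:3, badname2 = letters[1:3], keep = 4:6)

# Rename map: names are the old column names, values are the new ones
my_renames <- c(
  badname  = "goodname",
  badname2 = "goodname2"
)

# Existing three-argument form: old names first, then new names
setnames(DT, names(my_renames), my_renames)

names(DT)
```

Columns that are not mentioned in the map (here `keep`) are left as they are.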

I'm still not getting the huge advantage part. But marked as FR as it should be straightforward to implement AFAICT and wouldn't break anything.

Thank you! :D

The only thing that might break is if someone is currently passing in a named vector expecting the names not to be used -- the names could be in a different order, or they could not correspond to any of the current names of the data.table.

The risk of that seems relatively low, but personally I don't always bother using unname when passing a named vector into functions that accept vectors as input but don't use the names.

@gsee That wouldn't be a problem if the new functionality were put in a wrapper with a different name, like setnamesv, right?

@franknarf1 this would make setnames and setnamesv inconsistent with setkeyv, set2keyv or setorderv in terms of _symbols_ or _character_ input.
If the setnames function is changed, then maybe add an _option_ to keep the 1.9.4 behaviour.

@jangorecki Yeah, I'm not saying that should be the name, but creating a separate function would be the most straightforward. I think the current behavior should remain the default for setnames.
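
A separate helper along those lines could be sketched as below. The name `setnames_map` is hypothetical and not part of data.table; it just forwards to the existing `setnames(x, old, new)`, so the default behaviour of `setnames` stays untouched.

```r
library(data.table)

# Hypothetical helper, NOT part of data.table: renames columns using a
# named character vector whose names are the old column names and whose
# values are the new column names.
setnames_map <- function(x, map) {
  # Require a fully named character vector
  stopifnot(is.character(map), !is.null(names(map)), all(nzchar(names(map))))
  # unname() so that only the values are passed on as the new names
  setnames(x, old = names(map), new = unname(map))
}

d <- data.table(a = 1, b = 2)
setnames_map(d, c(b = "y", a = "x"))
names(d)
```

Because the mapping is by name, the order of the entries in the vector does not matter here: `a` becomes `x` and `b` becomes `y`.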

It would be a breaking change if someone is already passing a named character vector to setnames:

d=data.table(a=1,b=2)
setnames(d, c("b"="y","a"="x"))
d
#   y x
#1: 1 2

I think usage of setnames(d, names(nm), nm) is short enough to consider this feature request as redundant.
