Dataframes.jl: Allow `rename!(df, pair::Pair{String, String})` as a signature

Created on 16 Nov 2019  ·  5Comments  ·  Source: JuliaData/DataFrames.jl

rename!(gmsc, (Symbol("NumberOfTime60-89DaysPastDueNotWorse") => Symbol("NumberOfTime60_89DaysPastDueNotWorse"), Symbol("NumberOfTime30-59DaysPastDueNotWorse") => Symbol("NumberOfTime30_59DaysPastDueNotWorse")))

is my rename code. I think it's better if we allow String in the rename signature

rename!(gmsc, ("NumberOfTime60-89DaysPastDueNotWorse" => "NumberOfTime60_89DaysPastDueNotWorse", "NumberOfTime30-59DaysPastDueNotWorse" => "NumberOfTime30_59DaysPastDueNotWorse"))

because it's more amendable to creating column names using algorithms for manipulating strings.

Most helpful comment

Then it is OK to make a PR adding this.

The "function" form has to accept Symbol but we could allow it to return anything (a string in particular) and cast it to Symbol, so it should be just one string call extra, which is not a huge deal.

All 5 comments

This is a discussion we had some time ago with @nalimilan.
The conclusion was to allow Symbol only (at least for now).
@nalimilan - do you remember the reason for this?

@xiaodaigh you can use broadcasting to save having to write one Symbol.

I think one of the reasons to use symbols everywhere is that they are more efficient for column lookup. But even that may change due to https://github.com/JuliaLang/julia/pull/32437, so we should benchmark it again once that PR is merged.

Regarding rename in particular, I would say it's a case where accepting strings would make sense since 1) performance isn't critical, 2) string processing functions do not accept nor return symbols, and 3) there's little risk of confusion with other functions like getindex. But unfortunately, even with that change, when one passes a function to rename it will have to take a symbol, nor a string, unless we radically change the API.

Then it is OK to make a PR adding this.

The "function" form has to accept Symbol but we could allow it to return anything (a string in particular) and cast it to Symbol, so it should be just one string call extra, which is not a huge deal.

This probably hurts consistency of API, as now people will expect Strings to work in other contexts. I think it's best to let users call Symbol themselves.

I'm not too worried about that since rename is quite particular.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

bbrunaud picture bbrunaud  ·  3Comments

davidanthoff picture davidanthoff  ·  4Comments

abieler picture abieler  ·  7Comments

pdeffebach picture pdeffebach  ·  8Comments

cormullion picture cormullion  ·  6Comments