DataFrameRow is basically a mutable NamedTuple; it should support:
- foo(x; row...)
- (; row..., a=1, b=2)
- NamedTuple
- merge(row, nt), merge(row, row2), merge(nt, row)
- pairs(row)

cc @willtebbutt
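For reference, here is what the requested operations look like on a plain NamedTuple, using only Base. This is a minimal sketch of the target behavior, not of DataFrameRow itself; the function `foo` and the field names `w`, `h`, `a`, `b`, `d` are hypothetical and chosen only for illustration.

```julia
# Hypothetical function and field names, chosen only for illustration.
foo(x; w, h) = x * w * h

nt = (w = 2, h = 3)

# Splat the fields as keyword arguments.
volume = foo(10; nt...)

# Splat into a new NamedTuple, adding extra fields.
extended = (; nt..., a = 1, b = 2)

# Merge with another NamedTuple; later values win on duplicate keys.
merged = merge(nt, (h = 5, d = 4))

# Iterate over name => value pairs.
name_value_pairs = collect(pairs(nt))
```

The request in this issue amounts to having the same four operations work when `nt` is replaced by a DataFrameRow.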
I'm curious what you are doing that requires a DataFrameRow to be used like that? I always imagined a DataFrameRow should be basically a vector that can be indexed with a symbol.
I'm trying to achieve the following:
- map a function over each row of a DataFrame, which spits out a couple of new fields.
- get a DataFrame containing the existing data, with the new fields concatenated on the end.

One way to do this is to have map(eachrow(df)) do row ... end return a Vector of NamedTuples (i.e. a table), where the NamedTuples contain the existing and new fields; this can then be used to construct a new DataFrame. To do this, you need to convert a DataFrameRow to a NamedTuple, which can currently be achieved as follows:
new_names = vcat(names(row), vector_of_new_names)
new_values = vcat(convert(Vector, row), vector_of_corresponding_values)
(; Pair.(new_names, new_values)...)
Currently the way to convert DataFrameRow to NamedTuple is to call copy on it.
Also pairs(::DataFrameRow) is defined.
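The copy-based conversion can be combined with merge on plain NamedTuples to get the workflow described above. The following is a rough sketch, assuming a DataFrames.jl version in which copy on a DataFrameRow returns a NamedTuple (as stated in this thread); the column names `a`, `b`, `total` and `doubled` are made up for the example.

```julia
using DataFrames

# Hypothetical input data.
df = DataFrame(a = [1, 2, 3], b = [10.0, 20.0, 30.0])

# For each row: materialize the row view as a NamedTuple with `copy`,
# then append the newly computed fields with `merge`.
rows = map(eachrow(df)) do row
    nt = copy(row)
    merge(nt, (total = row.a + row.b, doubled = 2 * row.a))
end

# A Vector of NamedTuples is a table, so it can be passed to the constructor.
new_df = DataFrame(rows)
```

Here `new_df` holds the original columns followed by the two new ones.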
So I have the following questions (here I am rather OK to add it):
- NamedTuple(::DataFrameRow) to be defined?
- convert(::Type{NamedTuple}, ::DataFrameRow) to be defined?
- convert(::Type{Tuple}, ::DataFrameRow) to be defined? (Tuple(::DataFrameRow) exists)

And do you want to have these (here I am more reluctant, as you can always call copy on DataFrameRow to get what you want, so this seems not to be strictly needed; the only drawback is that merge is needed for splatting):
- merge allowing arbitrary mixing of NamedTuple and DataFrameRow and producing a NamedTuple (I am reluctant because it is tricky to get it 100% right and not stress the compiler at the same time)

Wait, copy is how to convert a DataFrameRow into a NamedTuple?
What is the logic behind that?
DataFrameRow is a view, and views in base are materialized by copy (e.g. standard view, in linear algebra wrappers around arrays etc.). We materialize to a NamedTuple, what other data type would you find suitable for this?
I guess that makes sense.
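The Base behavior referred to here can be checked with a small example. This is only an illustration of copy on a standard view and on a LinearAlgebra wrapper; the variable names are arbitrary.

```julia
using LinearAlgebra

a = [1, 2, 3, 4]
v = view(a, 2:3)   # SubArray that shares memory with `a`
c = copy(v)        # independent Vector{Int}
c[1] = 100         # modifies only the copy, `a` is unchanged

A = [1 2; 3 4]
t = transpose(A)   # lazy Transpose wrapper around `A`
m = copy(t)        # materialized Matrix{Int}
```

In both cases copy turns the lazy wrapper into an ordinary, independent container, which is the analogy used for DataFrameRow and NamedTuple.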
In summary, I will add only convert and NamedTuple methods and omit merge. OK?
CC @nalimilan
What about making (; row..., a=1) work?
That's what I want most
Why not if that's not too hard, but first converting to NamedTuple will also work and it's not too hard to do.
(FWIW, the kind of operation you describe really sounds like a job for the select we've been talking about for some time.)
Yeah, @willtebbutt was saying that when we were talking about this in person
So what is the conclusion: do we want this splatting (which implies implementing merge) or not?
As I have noted the needed definition will be a bit tricky (it is doable). The first natural implementation:
merge(b::Union{NamedTuple, DataFrameRow}...) = merge(NamedTuple.(b)...)
is not good due to method ambiguities, so I would have to define special cases for 1, 2 and more than 2 positional arguments.
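To make the "special cases" idea concrete, here is a rough sketch using a hypothetical stand-in type `MockRow` (a thin wrapper around a NamedTuple) instead of the real DataFrameRow. The names `MockRow`, `RowLike` and `to_nt` are assumptions made for this example, and this is not the implementation used in DataFrames.jl. Explicit two-argument methods avoid clashing with Base's generic merge(::NamedTuple, itr) fallback; for three or more arguments the sketch only covers calls whose first argument is a row.

```julia
# Hypothetical stand-in for a row type; NOT the real DataFrameRow.
struct MockRow{NT<:NamedTuple}
    data::NT
end

const RowLike = Union{NamedTuple, MockRow}

# Hypothetical helper: normalize either kind of argument to a NamedTuple.
to_nt(r::MockRow) = r.data
to_nt(nt::NamedTuple) = nt

# One argument.
Base.merge(a::MockRow) = to_nt(a)

# Two arguments: every combination that involves at least one row.
Base.merge(a::MockRow, b::MockRow) = merge(to_nt(a), to_nt(b))
Base.merge(a::MockRow, b::NamedTuple) = merge(to_nt(a), b)
Base.merge(a::NamedTuple, b::MockRow) = merge(a, to_nt(b))

# Three or more arguments, row first: convert everything, then defer to
# Base's merge for NamedTuples.
Base.merge(a::MockRow, b::RowLike, c::RowLike, ds::RowLike...) =
    merge(to_nt(a), to_nt(b), to_nt(c), map(to_nt, ds)...)

r = MockRow((a = 1, b = 2))
row_first = merge(r, (c = 3,))
nt_first = merge((c = 3,), r)
three_way = merge(MockRow((a = 1,)), (b = 2,), MockRow((a = 5, c = 3)))
```

As with NamedTuples, later arguments win on duplicate names, so `three_way` takes `a` from the last row.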
If it's not harder than that, then yes, sounds worth it.
OK - I will also add the merge method (there will probably be several of them, but this is all that is needed)
See #2060