DataFrames.jl: How do I get values from a data frame with Nullable?

Created on 8 Nov 2016 · 6 Comments · Source: JuliaData/DataFrames.jl

Before Nullable it was possible to get values from a data frame with v = df[:a] but since NullableArrays that doesn't work anymore. Is v = [get(x) for x in df[:a]] the most convenient way to access the values now?

All 6 comments

> but since NullableArrays that doesn't work anymore

I'm not sure what you mean. df[:a] still works, it just produces a NullableArray. If what you're trying to do is get the non-null values, you can use dropnull:

julia> df[:a]
3-element NullableArrays.NullableArray{Int64,1}:
 1
 2
 3

julia> dropnull(df[:a])
3-element Array{Int64,1}:
 1
 2
 3

You can also convert directly to an Array using Array(df[:a]), but that will fail if there are null values.

Ok, my bad. I failed to check the latest documentation, as the "stable" documentation still uses DataArrays. When I drop the nulls, I'm often curious to know which rows I drop, so that I can do things like @where(df, !isna(:a)). The new equivalent would be @where(df, !isnull.(:a)), right? But that seems to be a bit broken. I would expect isnull.(x) to return Array{Bool,1} but I get a NullableArray{Int64,1} which doesn't work to select rows. Am I missing something?

By the way, I can't get anything to work with DataFrames with Nullable. DataFramesMeta doesn't work at all: @where(df, :a .> 1) fails even without any nulls. And it's not an issue with DataFramesMeta, since not even a basic operation like df[:a] + 1 works anymore.

Some work may have started on making DataFramesMeta work around this problem with Nullable. But I think this should be solved in DataFrames itself rather than requiring all users of DataFrames to implement their own workarounds.

> I would expect isnull.(x) to return Array{Bool,1} but I get a NullableArray{Int64,1} which doesn't work to select rows. Am I missing something?

That's just a bug, similar to https://github.com/JuliaStats/NullableArrays.jl/issues/142. We should definitely try to fix this.

DataFramesMeta hasn't been updated to work with NullableArrays yet. df[:a] + 1 not working is https://github.com/JuliaStats/NullableArrays.jl/issues/143, but it's a more general issue with nullable lifting semantics (https://github.com/JuliaLang/julia/pull/19034).

Anyway, these are not issues in DataFrames per se, so closing. We already have https://github.com/JuliaStats/DataFrames.jl/issues/1092 for the roadmap.

Thanks for the references!
