This is a scary error I get from the following operation:
## Average population of republican states
pop_party = let d = votes_bills_legislators_states
    d2 = d[end-99:end, :]
    gd = groupby(d2, "state")
    # make sure all states represented
    @assert length(gd) == 50 && all(sdf -> nrow(sdf) == 2, gd)
    d3 = @pipe d2 |>
        groupby(_, "party") |>
        combine(_, "population" => mean) # errors on this last command
end
I don't have an MWE available. If I try to isolate this by saving d2 to a CSV and reading it back in, it works fine.
Unreachable reached at 0x7fd0ec4c000d
signal (4): Illegal instruction
in expression starting at REPL[10]:1
groupreduce at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:1023
unknown function (ip: 0x7fd0ec4c003a)
Reduce at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:1032
unknown function (ip: 0x7fd0ec4bfef6)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2158 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2322
_combine at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:1150
#combine_helper#391 at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:589
combine_helper##kw at /home/peterwd/.julia/packages/DataFrames/htZzm/src/groupeddataframe/splitapplycombine.jl:585
pkg> st
Status `~/Documents/Projects/Senate Voting/Project.toml`
[336ed68f] CSV v0.7.3
[a93c6f00] DataFrames v0.21.4
[da1fdf0e] FreqTables v0.4.0
[d96e819e] Parameters v0.12.1
[08abe8d2] PrettyTables v0.9.1
[b8865327] UnicodePlots v1.2.0
Tagging @quinnj. I think this is an issue with SentinelArrays:
julia> d2.population
100-element SentinelArrays.SentinelArray{Float64,1,Float64,Missing,Array{Float64,1}}:
Hmmm, I'm not very familiar with the combine/reduce code; is there a more minimal repro on just the SentinelArray vector?
Got it! It seems to happen after a call to stack:
using DataFrames, SentinelArrays, Statistics
julia> df = DataFrame(g = rand(1:100, 100), v1 = SentinelArray(Union{Float64, Missing}[rand() for i in 1:100]), v2 = rand(100));
julia> long_df = stack(df, Not(1), variable_name = "state", value_name = "population");
julia> combine(groupby(long_df, "state"), "population" => mean) # errors here
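One way to check whether the column's storage type is involved is to copy the stacked column into a plain Array and rerun the same combine. This is only a diagnostic sketch: it assumes, without confirmation, that the SentinelArray storage is what triggers the crash, and it is not the fix that eventually landed.

```julia
using DataFrames, SentinelArrays, Statistics

# Same construction as the reproduction above.
df = DataFrame(g = rand(1:100, 100),
               v1 = SentinelArray(Union{Float64, Missing}[rand() for i in 1:100]),
               v2 = rand(100))
long_df = stack(df, Not(1), variable_name = "state", value_name = "population")

# Assumption: materializing the column as an ordinary Array takes the
# SentinelArray type out of the picture before grouping.
long_df.population = collect(long_df.population)
println(typeof(long_df.population))

result = combine(groupby(long_df, "state"), "population" => mean)
```

If the crash goes away with the copied column, that points at the column type rather than at the grouping itself.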
I cannot reproduce it on 1.5:
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0-rc1.0 (2020-06-26)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia> using DataFrames, CSV
julia> using Statistics
julia> df = DataFrame(g = rand(1:100, 100), v1 = CSV.SentinelArray(Union{Float64, Missing}[rand() for i in 1:100]), v2 = rand(100));
julia> long_df = stack(df, Not(1), variable_name = "state", value_name = "population");
julia> combine(groupby(long_df, "state"), "population" => mean)
2×2 DataFrame
│ Row │ state │ population_mean │
│     │ Cat…  │ Float64         │
├─────┼───────┼─────────────────┤
│ 1   │ v1    │ 0.514076        │
│ 2   │ v2    │ 0.546164        │
I can reproduce on OSX v"1.5.0-rc1.0".
julia> versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
JULIA_PKG_DEVDIR = /home/peterwd/Documents/Development
JULIA_EDITOR = subl
I reproduced this with an rr trace and sent it to Keno for investigation.
Worked around in #2335