It looks like in some cases, if I change a global variable and then change it back, the value captured on remote workers is only updated after the first change. Here is an MWE (Julia 1.1.0):
julia> using Distributed
julia> addprocs(1)
1-element Array{Int64,1}:
2
julia> x = :a
:a
julia> @fetch x
:a
julia> x = :b
:b
julia> @fetch x
:b
julia> x = :a
:a
julia> @fetch x
:b # <--- should be :a here
This doesn't always happen, though. E.g. if x is instead an Int, it works as I would have expected from the docs:
julia> using Distributed
julia> addprocs(1)
1-element Array{Int64,1}:
2
julia> x = 1
1
julia> @fetch x
1
julia> x = 2
2
julia> @fetch x
2
julia> x = 1
1
julia> @fetch x
1
Not sure what triggers the bug, or whether it even has anything to do with the type of x, although it seems reproducible every time. I've found that e.g. x::Int and x::Vector work fine, but x::Symbol and x::String do not.
The logic here is wrong (you can redefine the function with @show s.glbs_sent to see what is going on):
In the steps you showed above (for a non-bits type):
x = :a  # objectid(x) = hash(:a)
s.glbs_sent = Dict(hash(:a) => hash(x, hash(:a)))  # new oid, send x
x = :b  # objectid(x) = hash(:b)
s.glbs_sent = Dict(hash(:a) => hash(x, hash(:a)), hash(:b) => hash(x, hash(:b)))  # new oid, send x
x = :a
s.glbs_sent = Dict(hash(:a) => hash(x, hash(:a)), hash(:b) => hash(x, hash(:b)))  # key and value match from the first run, don't send x
This is a bug and will happen for many types (as long as you go back to a previous value for the same variable), and I am not sure this PR helps either :(
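To make the trace above concrete, here is a minimal sketch of the caching behaviour it describes. This is a simplified model, not the actual Distributed source: the function name `should_send_buggy!` and the standalone `sent` dictionary are hypothetical stand-ins for the check against `s.glbs_sent`.

```julia
# Simplified model of a "have we already sent this object?" check keyed by objectid.
# `should_send_buggy!` is a hypothetical name, not a function from Distributed.
function should_send_buggy!(sent::Dict{UInt64,UInt64}, sym::Symbol, v)
    isbits(v) && return true            # bits values are always resent in this model
    oid = objectid(v)
    h = hash(sym, hash(v))
    if haskey(sent, oid) && sent[oid] == h
        return false                    # same object and same hash as some earlier send
    end
    sent[oid] = h                       # remember this object; older entries are never dropped
    return true
end

sent = Dict{UInt64,UInt64}()
first_send  = should_send_buggy!(sent, :x, :a)   # x = :a, sent
second_send = should_send_buggy!(sent, :x, :b)   # x = :b, sent
third_send  = should_send_buggy!(sent, :x, :a)   # x = :a again, skipped
println((first_send, second_send, third_send))
```

Because the entry for :a is still in the dictionary after x becomes :b, switching back to :a looks like "already sent", even though the worker currently holds :b.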
The issue is here -
I think glbs_sent::Dict{UInt64, UInt64} should be glbs_sent::Dict{Symbol, Tuple{UInt64, UInt64}}, i.e. (key, value) -> (global_symbol, (objectid, hash_value)).
I'll submit a PR in a couple of days.
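A rough sketch of what the proposed keying might look like, assuming the same simplified model as a standalone function. The name `should_send_fixed!` is hypothetical, and this is not the code from the eventual PR; it only illustrates why keying by the global's symbol lets the latest value overwrite the previous one.

```julia
# Simplified model of the proposed layout: global symbol => (objectid, hash_value).
# `should_send_fixed!` is a hypothetical name, not a function from Distributed.
function should_send_fixed!(sent::Dict{Symbol,Tuple{UInt64,UInt64}}, sym::Symbol, v)
    isbits(v) && return true                      # bits values are always resent in this model
    entry = (objectid(v), hash(v))
    get(sent, sym, nothing) == entry && return false  # unchanged since the last send
    sent[sym] = entry                             # overwrite: only the last sent value is remembered
    return true
end

sent = Dict{Symbol,Tuple{UInt64,UInt64}}()
results = [should_send_fixed!(sent, :x, v) for v in (:a, :b, :a, :a)]
println(results)
```

With one entry per global, going :a, :b, :a triggers a send each time, and only a repeated send of the same unchanged value is skipped.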
Yes, I figured that glbs_sent should be symbol => (oid, value), so the changing value can be overwritten. I was too scared to submit any PR since this is quite low level and I don't know what else relies on the ClusterSerializer struct.
Thanks for taking care of this!
I was too scared to submit any PR since this is quite low level and I don't know what else relies on the ClusterSerializer struct.
Please don't be! The more contributors the better. Most of us have been new to the Julia codebase at one time.
The linked PR that was merged claims to have fixed this.