Hi there,
I'm seeing an issue where if multiple "odd" objects (e.g. a DateTime) are being deserialized from different threads simultaneously. I can reproduce this on the latest version of OJ.
To create the failing file:
require "date"
require "oj"
f = File.open("odd_file.jsonl", "w")
100_000.times do
obj = { created_at: DateTime.new(2001,2,3,4,5,6) }
Oj.to_stream(f, obj)
f.puts
f.flush
end
f.close
To prompt a failure:
def run_test_thread
threads = Array.new(3) do
Thread.new do
counter = 0
File.open("odd_file.jsonl", "r") { |f| Oj.enum_for(:load, f).lazy.each { counter += 1 } }
puts counter
end
end
threads.each(&:join)
end
100.times do |i|
puts i
run_test_thread
end
When running this without multiple threads, it deserializes fine:
def run_test_thread
counter = 0
File.open("odd_file.jsonl", "r") { |f| Oj.enum_for(:load, f).lazy.each { counter += 1 } }
puts counter
end
100.times do |i|
puts i
run_test_thread
end
I've skimmed the C code for deserializing the odd objects, but I don't have enough familiarity with the logic to have a hunch about what could be causing this. Are we doing something unexpected by using this from multiple threads or could this be a bug in the deserialization logic?
Thanks!
--Mike
I'll take a look. I don't think there is a single threaded dependency but it sounds like there is an issue so I'll do some experimenting.
I'm not familiar with using the #enum_for method along with #lazy so it will take some experimenting.
What behaviour are you seeing? Is the counter different on each?
Ah sorry, totally forgot to post the exception. The counter is the same on them all when it succeeds, when it fails there鈥檚 an exception. I can post the actual output tomorrow.
The enum for and lazy combination are so that we can get a lazy enumerator that steps through the JSONL line by line. If you have a recommendation on another way to accomplish this, I can also experiment with that.
Thank you!
Figured it out but the fix will take a bit more time. For the odd routines the VALUE arguments are collected and the creation function called later. Those arguments are being collected by a GC. The fix is to make sure they are protected. There are a approaches that work. I just have to figure out which is best.
Okay, not that long. :-)
The fix will be in the odd branch. Not yet though.
Okay to try now.
Awesome! I can confirm that on the new branch, I'm able to get through all 100 iterations of my test code. On the old branch within 1-2 iterations I reliably get:
(irb):6:in `new': undefined method `div' for "created_at":String (NoMethodError)
100000
Traceback (most recent call last):
7: from (irb):6:in `block (2 levels) in run_test_thread'
6: from (irb):6:in `open'
5: from (irb):6:in `block (3 levels) in run_test_thread'
4: from (irb):6:in `each'
3: from (irb):6:in `each'
2: from (irb):6:in `load'
1: from (irb):6:in `new'
NoMethodError (undefined method `div' for (0/1):Rational)
Thanks!
Release 3.7.8 has the fix.
Amazing! Thanks so much for handling this so quickly -- I'm excited to have just put up the patch to pull the new version into our repos.
excellent
Most helpful comment
Amazing! Thanks so much for handling this so quickly -- I'm excited to have just put up the patch to pull the new version into our repos.