Oj: Non-deterministic issue when reading odd objects from threads

Created on 18 Jan 2019  路  9Comments  路  Source: ohler55/oj

Hi there,

I'm seeing an issue where if multiple "odd" objects (e.g. a DateTime) are being deserialized from different threads simultaneously. I can reproduce this on the latest version of OJ.

To create the failing file:

require "date"
require "oj"
f = File.open("odd_file.jsonl", "w")
100_000.times do
 obj = { created_at: DateTime.new(2001,2,3,4,5,6) }
 Oj.to_stream(f, obj)
 f.puts
 f.flush
end
f.close

To prompt a failure:

def run_test_thread
    threads = Array.new(3) do
        Thread.new do 
            counter = 0
            File.open("odd_file.jsonl", "r") { |f| Oj.enum_for(:load, f).lazy.each { counter += 1 } }
            puts counter
        end
    end

    threads.each(&:join)
end

100.times do |i|
  puts i
  run_test_thread
end

When running this without multiple threads, it deserializes fine:

def run_test_thread

            counter = 0
            File.open("odd_file.jsonl", "r") { |f| Oj.enum_for(:load, f).lazy.each { counter += 1 } }
            puts counter


end

100.times do |i|
  puts i
  run_test_thread
end

I've skimmed the C code for deserializing the odd objects, but I don't have enough familiarity with the logic to have a hunch about what could be causing this. Are we doing something unexpected by using this from multiple threads or could this be a bug in the deserialization logic?

Thanks!
--Mike

Most helpful comment

Amazing! Thanks so much for handling this so quickly -- I'm excited to have just put up the patch to pull the new version into our repos.

All 9 comments

I'll take a look. I don't think there is a single threaded dependency but it sounds like there is an issue so I'll do some experimenting.

I'm not familiar with using the #enum_for method along with #lazy so it will take some experimenting.

What behaviour are you seeing? Is the counter different on each?

Ah sorry, totally forgot to post the exception. The counter is the same on them all when it succeeds, when it fails there鈥檚 an exception. I can post the actual output tomorrow.

The enum for and lazy combination are so that we can get a lazy enumerator that steps through the JSONL line by line. If you have a recommendation on another way to accomplish this, I can also experiment with that.

Thank you!

Figured it out but the fix will take a bit more time. For the odd routines the VALUE arguments are collected and the creation function called later. Those arguments are being collected by a GC. The fix is to make sure they are protected. There are a approaches that work. I just have to figure out which is best.

Okay, not that long. :-)

The fix will be in the odd branch. Not yet though.

Okay to try now.

Awesome! I can confirm that on the new branch, I'm able to get through all 100 iterations of my test code. On the old branch within 1-2 iterations I reliably get:

(irb):6:in `new': undefined method `div' for "created_at":String (NoMethodError)
100000
Traceback (most recent call last):
        7: from (irb):6:in `block (2 levels) in run_test_thread'
        6: from (irb):6:in `open'
        5: from (irb):6:in `block (3 levels) in run_test_thread'
        4: from (irb):6:in `each'
        3: from (irb):6:in `each'
        2: from (irb):6:in `load'
        1: from (irb):6:in `new'
NoMethodError (undefined method `div' for (0/1):Rational)

Thanks!

Release 3.7.8 has the fix.

Amazing! Thanks so much for handling this so quickly -- I'm excited to have just put up the patch to pull the new version into our repos.

excellent

Was this page helpful?
0 / 5 - 0 ratings