Oj: Partial character in string 1

Created on 21 Sep 2017  ·  36 Comments  ·  Source: ohler55/oj

We keep running into this error on a regular basis:
https://github.com/ohler55/oj/blob/master/ext/oj/dump.c#L820

Yesterday I narrowed it down to a single empty string when Oj.mimic_JSON is used.

The problem is that it doesn't happen with just any empty string. As far as Ruby is concerned, the offending string is identical to an empty string in every way we tried: we checked equality with an empty string, output the bytes (which results in an empty array), and used various combinations of unpack. Everything showed that it was an empty string, yet it caused an error in Oj.dump(), whereas passing a newly created empty string to Oj did not.

If we clone the variable it also somehow fixes the issue. Do you have any idea how we can get the true contents of this seemingly empty string or why one empty string could result in this error when another would not?
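For anyone trying to reproduce this, here is a minimal sketch of the Ruby-level checks described above, gathered into one helper. `describe_string` is a hypothetical name, not part of Oj, and it only reports what Ruby itself exposes (bytes, hex, encoding, validity), so it cannot see any state that lives purely on the C side of the extension.

```ruby
# Hypothetical helper (not part of Oj): collect everything plain Ruby
# can tell us about a string that looks empty.
def describe_string(str)
  {
    equals_empty: str == '',              # the equality check from the report
    bytes: str.bytes,                     # [] for a truly empty string
    bytesize: str.bytesize,
    hex: str.unpack1('H*'),               # hex view of the raw bytes
    encoding: str.encoding.name,
    valid_encoding: str.valid_encoding?,
    frozen: str.frozen?
  }
end

fresh = ''
copy = fresh.clone

# Compare a fresh empty string with a clone of it; in the failing case the
# suspect string would be passed in place of `fresh`.
p describe_string(fresh)
p describe_string(copy)
```

If the suspect string and a fresh `''` produce identical summaries, the difference is not visible from Ruby and has to be traced inside the extension.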

We found that changing from the Oj.mimic_JSON default escape_mode of unicode_xss to json made this issue go away, but I'm pretty sure that's a bad idea.

Any ideas? I'm kinda stumped.

All 36 comments

Just guessing but it could be an invalid Unicode such as a single byte with the high bit set. Can you convert to bytes and print it? If you have an example that fails I can fix it.
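As an illustration of the kind of input guessed at here, the sketch below builds a few byte sequences by hand and asks Ruby whether they are valid UTF-8. The sample bytes are assumptions chosen for the example (they are not taken from the failing application): a lone byte with the high bit set, the first two bytes of a three-byte character, and the complete character.

```ruby
# A single byte with the high bit set: never valid on its own in UTF-8.
high_bit = "\x80".dup.force_encoding(Encoding::UTF_8)

# The first two bytes of a three-byte character (a "partial character").
truncated = "\xE3\x81".dup.force_encoding(Encoding::UTF_8)

# The same character with all three bytes present.
complete = "\xE3\x81\x82".dup.force_encoding(Encoding::UTF_8)

[high_bit, truncated, complete].each do |s|
  puts "bytes=#{s.bytes.inspect} valid=#{s.valid_encoding?}"
end
```

Note that every one of these strings has a non-empty `bytes` array, which is why the empty-string case reported in this thread does not fit the guess.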

Yeah unfortunately I already checked that and bad_empty_string.bytes outputs []. Which is the same as if I had run ''.bytes. It's VERY odd.

bad_empty_string == '' also returns true

Is it maybe possible to output the bytes in the string as part of this error message:
https://github.com/ohler55/oj/blob/master/ext/oj/dump.c#L820

Very strange indeed. Can you provide a string or file that fails?

Unfortunately not, it seems to be intermittent, so it's probably something that happens while it's loading. I try loading the file directly and then passing it to Oj and do not get this error. It would be great to see what ends up on Oj's end.

Also, if we switch back to the regular json parser we do not get errors.

I posted in the wrong issue. There is an odd-chars branch with some debug info in it for you to try.

Thanks ohler55. We will take a look and see if that helps.

Can this be closed?

@michaeldawson @jfhinchcliffe what was the issue you guys ran into when trying to run the debug build?

As someone not super across gems with native extensions, the debugging info wasn't being output with the error message after shifting over to the branch with the debug info. I also tried removing the gem and re-bundling (pointing at the new branch), but still, when this issue arose, it didn't include the new debug information :(

I realized I deleted the branch a day or two ago while cleaning up. I'll re-make it but it will take a day or two. I'll put in some heavy tracing.

Thanks Peter :)

Try odd-chars branch. Pushed moments ago.

Thank you, I'm running it now, but so far I haven't hit the issue.

I got the same issue, but I only just found out about the branch.
I will run it from now on and see if I can encounter the error again.

Got another one.
The previous one only includes the string dump from when the error occurs.
This one includes everything (from the start of request processing).
error_log.txt

Partial success with those, but since the output is async it's hard to tell which one caused the error. I've updated the branch with another version that only outputs on the raise, with the string as hex and the line number. That should help narrow down what is going on. From the looks of it, though, the strings do not appear empty. Please rerun since you are having success recreating it.
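The same "only output on the raise, as hex" idea can be approximated from the Ruby side while waiting on a debug build. This is a rough sketch under stated assumptions: `string_hex_dumps` and `dump_with_trace` are hypothetical helpers, not part of Oj, and the block passed in stands for whatever dump call is failing in the application.

```ruby
# Hypothetical helper: walk a nested structure and hex-dump every string,
# labelling each one with a simple path so it can be located afterwards.
def string_hex_dumps(obj, path = '$', out = [])
  case obj
  when String
    out << "#{path}: bytesize=#{obj.bytesize} hex=#{obj.unpack1('H*')}"
  when Array
    obj.each_with_index { |v, i| string_hex_dumps(v, "#{path}[#{i}]", out) }
  when Hash
    obj.each { |k, v| string_hex_dumps(v, "#{path}.#{k}", out) }
  end
  out
end

# Hypothetical wrapper: run the given dump block and log the hex dumps
# only when it raises, then re-raise the original error.
def dump_with_trace(obj, log: $stderr)
  yield obj
rescue StandardError => e
  log.puts("#{e.class}: #{e.message}")
  string_hex_dumps(obj).each { |line| log.puts(line) }
  raise
end
```

Usage would look something like `dump_with_trace(payload) { |o| Oj.dump(o) }`, where `payload` is the object being serialized. Because the logging happens synchronously in the rescue, the hex lines sit right next to the error that triggered them.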

Got JSON::GeneratorError - Partial character in string. @ 838 this time
no dumping string found

(Can't upload any file to github now, strange)

That actually tells me a lot. I have some idea. Stay tuned.

I pushed a branch with some additional checks. Please see if you can get it to fail for an empty string.

I got the exact same issue on the previous version of the branch:

"JSON::GeneratorError: Partial character in string. @ 838"

I'll update so that I have your latest commits.

Any change with using the latest commits from a few days ago?

I haven't run into it yet actually. It seems like it happens less often when using these debug builds. I'm not sure why.

The last update has a fix in it so not showing up is good.

I haven't encountered the error since using the branch.

Likewise haven't encountered the issue with the branch - thanks! Do you think this code is mergeable? We aren't running this in production yet.

I never got this error in prod.
But I am running this version of the gem on prod too (and there are no errors like before).

There is some debug code in the branch I will remove before merging.

Let me know if you want more testing to be done :D

I think it is fine. Just got back from Japan so will merge and release tomorrow.

Fix is in 2.8.1.

Thanks for all your work on this @ohler55

@ohler55 I'm assuming you meant 3.3.9?

Oops, yes, I released a new Ox as well. Got my gems mixed up.
