Environment
Provide version numbers for the following components (information can be retrieved by running tns info in your project folder or by inspecting the package.json of the project):
Describe the bug
I just upgraded an app from NS 2.x to NS 5.4 (finally convinced my client!!!) and I was seeing some really weird results. I did some digging into it and finally tracked it down to the Worker's data it is sending it back to the primary thread is partially corrupted.
Basically the process is I send a massive text file from the primary thread to the background worker for it to parse and it create a nice memory JSON database. It saves this DB so that on app restarts it can use the already parsed data, and not have to re-download it and then can return it to the primary thread. I already pulled both the original downloaded data and the parsed data and everything looks perfect on the worker thread.
In my NS 2 design I would just do this from the worker:
global.postMessage({cmd: WORKERMESSAGE.ONDATA, results: db._getAllData()});
The db._getAllData() call would actually return a pure object result set (i.e. it is not JSON stringified.) So it would be like '{x: 1, y:2, table1: [ {entry1}, {entry2} ... ]....}`. So I am passing pure data objects over the wire. This worked in NS 2 perfectly.
In NS5 (and as a workaround) I have now had to do this:
global.postMessage({cmd: WORKERMESSAGE.ONDATA, results: JSON.stringify(db._getAllData())}); and then JSON.parse it on the primary thread... As you can imagine 5 megs of parsing is a small hit; vs how fast it used to be to push plain old data over the message... The data size currently returned is: 5,345,087 of JSON.stringified data.
Now that you see what works; and fails this is the actual issue on how it fails. It always appears to fail on the same records (i.e. this gives the semi-random effect as the majority of them are good in the lower ranges.
So outputting the records:
0, 1, 4, 7-35, are correct,
2, 3, 5, 6, 36, 37, 43, 52, 65-76 are broken.
So their is like only 20 broken records in the 1000's and 2000's but only one broken record in the 3000's and then looks like the majority of the 5000's and 6000's are broken...
As you can see it doesn't follow any rime or reason which are broken -- and it appears that the same records get corrupted each time, as I've noticed record 2 & 3 are always broken (which is what initially clued me into their being an issue)
So what I did was find one of the records that was showing anomalies and output that record on the Worker side and got:
==== object dump start ====
JS: setDetailId: "69233060"
JS: .... LOTS of CORRECT Data Removed ....
JS: fkConceptIdList: []
JS: fkTagIdList: [6]
JS: iud: ""
JS: cMin: [0, 0]
JS: cHalfLayer: [0, 0]
JS: cLayer: [1, 18]
JS: cCarrier: [2, 18]
JS: changed: "0"
However on the Primary side; when I used the same function to output the exact same record immediately after I received in; this is what I had:
==== object dump start ====
JS: setDetailId: "69233060"
... LOTS of CORRECT Data Removed ...
JS: fkConceptIdList: "[Circular]"
JS: fkTagIdList: "[Circular]"
JS: iud: ""
JS: cMin: "[Circular]"
JS: cHalfLayer: "[Circular]"
JS: cLayer: "[Circular]"
JS: cCarrier: "[Circular]"
JS: changed: "0"
As you can see EVERY single ARRAY in the record was converted to a text string called "[Circular]" even empty arrays.
It isn't all arrays going over the bridge; but it is all arrays in the record is showing corruption... So if the record triggers a "[Circular]" then ALL arrays in that specific record will be that. In addition the data shown is exactly how the records look. Zero, One or Two element arrays in 99.9% of all cases...
To Reproduce
Not sure how to easily reproduce outside of my app. Might be able to create a worker that reads in my raw data and passes it over to the primary thread. I do have copies of it extracted from the app as part of my diagnostics. However, this is clients proprietary data; so I'd rather not share if I don't have to. However, you might look to see what triggers it to print "[Circular]" and see the logic issue that may have been put into place since NS2.
Expected behavior
It to work like NS 2 did and not corrupt the data in memory...
Thanks for the detailed bug description.
The android runtime is not designed to share V8 isolate or any objects between multiple threads. So the only way to do communication between the workers is to serialize the data by making a copy of it. Here's a very good explanation of why this is necessary and even if this article is about Node.js, the exact same limitations apply to the {N} android runtime: https://nodeaddons.com/how-not-to-access-node-js-from-c-worker-threads/
So even if in NS 2 you were passing plain javascript objects, behind the scenes we have always been JSON stringifying them (and you were already paying the price).
The difference I can see between NS 2 and 5 is that we have introduced a so called _smart_ JSON stringify function for serializing the data between the workers instead of relying on the built-in V8 JSON.stringify: https://github.com/NativeScript/android-runtime/commit/593c75f5b0724250616df8a3632336b739066517
When this function detects circular object references it will serialize them as [Circular] in the resulting string. This is why you are observing those litterals with the current version of the runtime.
I am not quite sure how the V8's built-in JSON.stringify function was handling those circular object graphs but I am pretty sure you might also be getting wrong data with NS 2. They might not have broken the app because there were no [Circular] magic strings but I can bet 2 cents that your data was wrong back in NS 2 days (probably missing values or something).
With this in mind, I would recommend redesigning the current object graph that you wish to transfer between the worker and the main thread so that it doesn't have any circular object references in it.
If the data is backed by some native Java objects you might also checkout the following approach to share it between the workers and void paying the serialization price that comes with it.
@darind - Thanks for the response...
I guess I didn't explicitly state -- the data doesn't actually have any Circular references in it. ;-) It really is a POD. In fact 99% of the time it is data loaded from a file on disk and json parsed into the memory structure... (i.e. let data = JSON.parse(readtext(fromFileName));) Only the initial download does all the heavy work; and it basically takes the custom data protocol format and converts it into a json structure. The object graph is fairly simple, it just has a lot of data (5 megs worth):
{table1: [ {rowx}, {rowy} ...], table2: [{row1}, {row2}]....}
App has been running fine on NS 2 for like 3 years; no data corruption. ;-)
I knew that the data could not be shared between threads. Done a lot of threaded apps over the years in a wide variety of languages... I assumed that when I sent in data; it actually made a deep copy of the object and the other thread got the deep copy (which is typically faster than Serialize/deserializing). I didn't realize it also used a JSON serializer internally. LOL. But now, if I serialize it; you are still serializing it a second time, so this is still slower than me sending it directly in...
In summary:
JSON.serialize to send data between threadsfkConceptIdList field)@NathanaelA, the next thing that comes to mind is that the problem might be caused by the new JSON stringify routine. Since the data is confidential and you can't send it over, I have prepared a custom version of the android runtime that uses the old serialization function: https://we.tl/t-U81LAnmcs7
If the issue still persists, the next step would be to try and pinpoint which part of your data is causing it. You can remove any extra fields and and also anonymise the values so that you can send it to us for more thorough troubleshooting.
@darind, I figured out how to duplicate it, with a tiny sample. It isn't "Circular" data; however it is repeated data, which apparently makes the new serializer think it is "Circular". The weird thing is that apparently the serializer is also very confused as to what is repeated... Because in all reality what should had the "[Circular]" on it was the full record (like this example does) not just the arrays in the duplicated record; is their some sort of memory limit to what is tracked, that caused the main record to be purged, but the sub-arrays to be seen as repeated?
Ok put this in a worker.js file;
// Give NS a couple seconds to get everything up on main thread...
setTimeout( () => { startWorker(); }, 3000);
function startWorker() {
let r1 = {blah: "a"}; // This is our row data...
let d = {Hi: "Hi", table1: [r1, r1], table2: [r1]}; // This is our final data...
global.postMessage({id: 1, data: d, data2: JSON.stringify(d)}); // Send it on thru
}
My main-page.js file; had this:
let worker;
if (global.TNS_WEBPACK) {
const testWorker = require("nativescript-worker-loader!./worker.js");
worker = new testWorker();
} else {
worker = new Worker("./worker.js");
}
worker.onmessage = (msg) => {
console.dir(msg.data);
};
worker.onerror = function(err) {
console.log("Worker Error: ", err, err.stack);
};
This will give you:
==== object dump start ====
JS: id: "1"
JS: data: {
JS: "Hi": "Hi",
JS: "table1": [
JS: {
JS: "blah": "a"
JS: },
JS "[Circular]"
JS: ],
JS: "table2": [
JS: "[Circular]"
JS: ]
JS: }
JS: data2: "{"Hi":"Hi","table1":[{"blah":"a"},{"blah":"a"}],"table2":[{"blah":"a"}]}"
JS: ==== object dump end ====
The actual table full row entry r1 is repeated; but it isn't circular. Apparently the new NS Serializer thinks it is circular because it saw it before, but since we are on a totally different object path, it isn't actually circular, just repeated. This is why the v8 serializer works, it is much more battle tested and knows what a circular reference is vs just a second reference to the same data (as you can see with the data2 object, v8 serialized the data correctly)
The minute the serializer has exited serializing the first r1, all data it tagged/tracked should be freed as anything referencing the r1 later in the data again is not circular reference.
Would it be better/quicker to just dump the new NS serializer and actually simply call the v8 serializer from NS instead, since it is obviously a much more battle tested and probably handles any and all the weird cases???
@NathanaelA, thanks, your sample data is spot on!
In 6.0.0 we will revert back to the built-in JSON.stringify method for cross workers communication: https://github.com/NativeScript/android-runtime/pull/1411