https://ci.nodejs.org/job/node-test-commit-aix/27470/nodes=aix61-ppc64/console
test-osuosl-aix61-ppc64_be-2
00:10:52 not ok 2527 addons/buffer-free-callback/test
00:10:52 ---
00:10:52 duration_ms: 0.169
00:10:52 severity: crashed
00:10:52 exitcode: -5
00:10:52 stack: |-
00:10:52
00:10:52
00:10:52 #
00:10:52 # Fatal error in , line 0
00:10:52 # Check failed: result.second.
00:10:52 #
00:10:52 #
00:10:52 #
00:10:52 #FailureMessage Object: fffffffffffd0a0
00:10:52 ...
This was on a node-daily-master run, not a CI for a PR or anything like that.
@nodejs/platform-aix
00:10:52 # Fatal error in , line 0
this is weird, even the C++ debug variables (filename and line number) seem to have been corrupted!
#find . | xargs grep "CHECK" 2> /dev/null | grep "result" | grep "second"
#
#find . | xargs grep "CHECK" 2> /dev/null | grep "second"
./node_binding.cc: CHECK_GE(it->second.refcount, 1);
./env-inl.h: CHECK_EQ(insertion_info.second, true);
./api/environment.cc: CHECK_EQ(it->second, size);
./node_platform.cc: CHECK(insertion.second);
./node_platform.cc: CHECK(insertion.second);
./node_platform.cc: CHECK(it->second.second);
./node_platform.cc: CHECK(data.second);
./node_options-inl.h: CHECK_EQ(it->second.type, kBoolean);
./node_options-inl.h: CHECK_EQ(it->second.type, kBoolean);
#
Is the result
a variable name, or a typename?
#find . | xargs grep "CHECK" 2> /dev/null | grep "result" | grep "second" #
#find . | xargs grep "CHECK" 2> /dev/null | grep "second" ./node_binding.cc: CHECK_GE(it->second.refcount, 1); ./env-inl.h: CHECK_EQ(insertion_info.second, true); ./api/environment.cc: CHECK_EQ(it->second, size); ./node_platform.cc: CHECK(insertion.second); ./node_platform.cc: CHECK(insertion.second); ./node_platform.cc: CHECK(it->second.second); ./node_platform.cc: CHECK(data.second); ./node_options-inl.h: CHECK_EQ(it->second.type, kBoolean); ./node_options-inl.h: CHECK_EQ(it->second.type, kBoolean); #
Is the
result
a variable name, or a typename?
@gireeshpunathil Strange the find/grep didn’t find https://github.com/nodejs/node/blob/5707ed21a253b2ed1e2f5944f622c20092b866d6/deps/v8/src/objects/backing-store.cc#L581 (not sure if others — I’m on a tablet and found that one through the GitHub web UI).
Ah looking again you ran find
in src
.
Also seen on a test build for the new AIX machines in the CI (https://github.com/nodejs/build/issues/2110#issuecomment-567731188): https://ci.nodejs.org/job/node-test-commit-aix/27426/nodes=aix71-ppc64/testReport/junit/(root)/test/addons_buffer_free_callback_test/
test-ibm-aix71-ppc64_be-1
thanks @richardlau !
so this would mean that we try to insert duplicate entries (of backing_stores) into the map. Don't know what that means.
There is a trace point in the neighborhood that might give more clues and that gets enabled with --trace_backing_store
(FLAG_trace_backing_store
) but the CLI parser does not accept such an option!
cc @nodejs/v8
Cc @addaleax and @thangktran as they’ve been doing some work recently to adapt Node.js to the changes V8 have made to backing store management.
@richardlau i submitted https://github.com/nodejs/node/pull/31053 recently that might fix this problem.
@gireeshpunathil it's due to a behaviour change in V8 ArrayBuffer
. Previously, when a static buffer (or one that outlives the ArrayBuffer) is used, the BackingStore
will not be registered. Now that it is registered, problem might occur when new ArrayBuffer
is allocated at the same place of previous ArrayBuffer
that is still being tracked in BackingStore
table.
Stress test on master
: https://ci.nodejs.org/view/Stress/job/node-stress-single-test/29/nodes=aix61-ppc64/
80 failures in 1000 runs.
Stress run with #31053: https://ci.nodejs.org/view/Stress/job/node-stress-single-test/nodes=aix61-ppc64/31/
Stress run with #31053: https://ci.nodejs.org/view/Stress/job/node-stress-single-test/nodes=aix61-ppc64/31/
85 failures in 1000 runs so it doesn’t look like https://github.com/nodejs/node/pull/31053 fixes this 😞.
cc @thangktran
Attempting to bisect.
1000 runs against 2dff8ddafb3cd8ca6555d9e3731a1541c6a1bb2f (commit before 4f523c2c1a1c4cef33a1f925cb9102d5b8a51dab) had no failures: https://ci.nodejs.org/job/node-stress-single-test/32/nodes=aix61-ppc64/
Started 1000 runs against 4f523c2c1a1c4cef33a1f925cb9102d5b8a51dab: https://ci.nodejs.org/job/node-stress-single-test/33/nodes=aix61-ppc64/
Attempting to bisect.
1000 runs against 2dff8dd (commit before 4f523c2) had no failures: https://ci.nodejs.org/job/node-stress-single-test/32/nodes=aix61-ppc64/
Started 1000 runs against 4f523c2: https://ci.nodejs.org/job/node-stress-single-test/33/nodes=aix61-ppc64/
Still running but 15 failures out of 100 runs so it looks like 4f523c2c1a1c4cef33a1f925cb9102d5b8a51dab is the cause and is not fixed by any of the follow up fixes (https://github.com/nodejs/node/pull/30946, https://github.com/nodejs/node/pull/31053).
There is a trace point in the neighborhood that might give more clues and that gets enabled with
--trace_backing_store
(FLAG_trace_backing_store
) but the CLI parser does not accept such an option!
I think the flag is only available in debug builds: https://github.com/nodejs/node/blob/2bdeb88c27b4d8de3a8f6b7a438cf0bcb88fa927/deps/v8/src/flags/flag-definitions.h#L893
This can easily be made reproducible by removing the global.gc()
calls in the test.
The reason that this fails is that the test allocates multiple Buffer
s with the same base address, which, as @thangktran explained, is no longer allowed.
For fixing this, it should be enough to either move away from a static buffer for the underlying memory, or to modify the offset
argument in the JS test file so that the Buffers never have the same base address.
@thangktran Are you interested in opening a PR for this?
This can easily be made reproducible by removing the
global.gc()
calls in the test.The reason that this fails is that the test allocates multiple
Buffer
s with the same base address, which, as @thangktran explained, is no longer allowed.For fixing this, it should be enough to either move away from a static buffer for the underlying memory, or to modify the
offset
argument in the JS test file so that the Buffers never have the same base address.@thangktran Are you interested in opening a PR for this?
@addaleax sure i'd love to. I'm on it.
fwiw, consistent reproduce - on any platform:
$ cat test/addons/buffer-free-callback/foo.js
const common = require('../../common')
const binding = require(`./build/${common.buildType}/binding`)
let buf1 = binding.alloc(64, 1, 0)
let buf2 = binding.alloc(64, 1, 0)
#tools/test.py addons/buffer-free-callback/foo
=== release foo ===
Path: addons/buffer-free-callback/foo
#
# Fatal error in , line 0
# Check failed: result.second.
#
#
#
#FailureMessage Object: 0x7ffeefbfe5c0
the only missing puzzle now is, why the C++ assertion check is failing to get the file name and line number. @addaleax - do you know?
the only missing puzzle now is, why the C++ assertion check is failing to get the file name and line number. @addaleax - do you know?
V8 is intentionally omitting these from non-debug builds, see
That being said, here’s a PR that would enable stack trace printing for V8 check failures, which should already help a ton: https://github.com/nodejs/node/pull/31079
@gireeshpunathil it's due to a behaviour change in V8
ArrayBuffer
. Previously, when a static buffer (or one that outlives the ArrayBuffer) is used, theBackingStore
will not be registered. Now that it is registered, problem might occur when newArrayBuffer
is allocated at the same place of previousArrayBuffer
that is still being tracked inBackingStore
table.
Are we calling this out anywhere as a notable change? If the behaviour change is breaking for one of our addon tests it’s potentially breaking for addons out in the wild.
@richardlau It’s definitely breaking addons in the wild, not just potentially. :confused:
I can try to make sure we include it in the 14.0.0 changelog.
Most helpful comment
@richardlau It’s definitely breaking addons in the wild, not just potentially. :confused:
I can try to make sure we include it in the 14.0.0 changelog.