This is a list of the crashing test cases. All of the crashes can be reproduced by following the steps in #1736.
It is still unclear whether these are problems in onnxifi or in glow.
Crashed onnx test cases:
1 test_averagepool_2d_same_lower
2 test_reshape_reordered_dims
3 test_maxpool_with_argmax_2d_precomputed_strides
4 test_softmax_axis_0
5 test_maxpool_2d_precomputed_same_upper
6 test_maxpool_2d_same_lower
7 test_reshape_one_dim
8 test_batchnorm_epsilon
9 test_averagepool_2d_same_upper
10 test_batchnorm_example
11 test_reshape_extended_dims
12 test_maxpool_2d_same_upper
13 test_reshape_negative_dim
14 test_maxpool_with_argmax_2d_precomputed_pads
15 test_reshape_reduced_dims
16 test_averagepool_2d_precomputed_same_upper
17 test_sum_example
18 test_flatten_axis0
19 test_transpose_default
20 test_sum_one_input
That's a great data point! Thanks, @zrphercule for discovering those!
I'll grab a block of these if nobody else has taken any. I'll claim the first 3:
1 test_averagepool_2d_same_lower
2 test_reshape_reordered_dims
3 test_maxpool_with_argmax_2d_precomputed_strides
I've been trying to discern a pattern here so we can systematically harden the loader, but it's tricky. In some cases we explicitly don't support something, and currently signal that with assert() or llvm_unreachable(). We can convert those to return error codes (as I did in #1769). In the other two cases, though, the model uses something unsupported, but we only notice when some other invariant fails deeper in the loader. Maybe we should avoid assert() entirely within the loader.
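As a rough sketch of the assert()-to-error-code conversion described above (not the actual change in #1769): the `LoadStatus` enum, the `PoolNode` struct, and `checkPoolNode` are all hypothetical names, and the assumption that only `auto_pad = NOTSET` is handled is made purely for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical status codes for illustration; not Glow's real error type.
enum class LoadStatus { Ok, UnsupportedAttribute, MalformedNode };

// Hypothetical, simplified view of a pooling node as the loader sees it.
struct PoolNode {
  std::string autoPad;      // e.g. "NOTSET", "SAME_UPPER", "SAME_LOWER"
  std::vector<int> kernel;  // expected to hold exactly two entries
};

// Validate up front and report the problem to the caller, instead of
// calling assert() and aborting the whole process.
LoadStatus checkPoolNode(const PoolNode &node) {
  if (node.kernel.size() != 2) {
    return LoadStatus::MalformedNode;
  }
  // Assumption for this sketch: only NOTSET padding is handled.
  if (node.autoPad != "NOTSET") {
    return LoadStatus::UnsupportedAttribute;
  }
  return LoadStatus::Ok;
}
```

With a check like this at the top of the node loader, an unsupported `SAME_LOWER` model would come back as `LoadStatus::UnsupportedAttribute` rather than tripping an invariant somewhere deeper in the loader.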
Luckily, at least, I don't think I've yet seen a case where the failure emanates from somewhere deeper in the compiler. That would be quite tricky to catch indeed.
@bertmaher I wonder how much work it would take to stop using assert() in onnxifi-glow entirely? I think it is quite possible that the rest of these 20 cases are all caused by these assert() calls.
A better solution is to return an error onnxStatus and let the test driver handle it.
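A minimal sketch of that idea, assuming the driver treats any non-success status as "skip". The status type and constants below are stand-ins, not the real definitions from the ONNXIFI header, and `initGraph`, `runNodeTest`, and the list of supported ops are hypothetical.

```cpp
#include <cstdint>
#include <string>

// Stand-in for the ONNXIFI status type; names and values are hypothetical.
using SketchStatus = std::int32_t;
constexpr SketchStatus kSuccess = 0;
constexpr SketchStatus kUnsupportedOperator = 1;

// Hypothetical backend entry point: reports an unsupported operator
// through its return value instead of asserting.
SketchStatus initGraph(const std::string &opType) {
  if (opType != "Relu" && opType != "Sum") {
    return kUnsupportedOperator;
  }
  return kSuccess;
}

enum class TestOutcome { Ran, Skipped };

// Hypothetical test-driver hook: a non-success status marks the test as
// skipped, so one unsupported node cannot crash the whole test run.
TestOutcome runNodeTest(const std::string &opType) {
  if (initGraph(opType) != kSuccess) {
    return TestOutcome::Skipped;
  }
  return TestOutcome::Ran;
}
```

The point of the design is that the decision about what to do with an unsupported model moves out of the backend and into the driver, which can log it, skip it, or count it as a failure.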
@jackm321 that's another option; feel free to ping me offline for details.
After #2143, the latest list of ONNXIFI node tests that are either failing or crashing can be found below. An important next step here is to dig into what is causing these tests to crash/fail and determine what can be done to prevent this in the loader or fix any underlying bugs in glow.
failing:
crashing:
@jackm321 is this still an issue?