Deeplearning4j: Multiple projects: CI test failures

Created on 18 Jun 2019 · 8 Comments · Source: eclipse/deeplearning4j

https://jenkins.ci.skymind.io/blue/organizations/jenkins/eclipse%2Fdeeplearning4j/detail/bugfix%2Ftests/1/pipeline/37

Test results:

linux-ppc64le-cpu
libnd4j build issue: https://gist.github.com/AlexDBlack/124bd6350e096ef02b35c594b83cdac1

linux-x86_64-cpu
libnd4j: [ FAILED ] DeclarableOpsTests14.test_empty_reduce_mean_1

nd4j-tests: JVM crash on TFGraphTestZooModels - https://gist.github.com/AlexDBlack/0481d40324aea921b7d27ea079ef012b
(also on linux-x86_64-cpu-avx2)
also avx512 - maybe memory issue? https://gist.github.com/AlexDBlack/eb05e684395bd0bb08094a122ef4c13f

datavec-api:
TestNDArrayWritableTransforms.testNDArrayColumnsMathOpTransform:83 » Runtime O...

deeplearning4j-core:

  GradientCheckTests.testGradientMLP2LayerIrisLayerNorm:777 testGradMLP2LayerIrisSimple() - activationFn=SIGMOID, lossFn=MCXENT, outputActivation=SOFTMAX, doLearningFirst=true, layerNorm=true
  RnnGradientChecks.testBidirectionalWrapper:128
  RnnGradientChecks.testLastTimeStepLayer:281 testLastTimeStepLayer() - mb=3, tsLength = 4, maskType=none, hasLayerNorm=true, rnnType=SimpleRnn
  RnnGradientChecks.testSimpleRnn:207

TestSameDiffOutput.testMSEOutputLayer:165 1 expected:...

-> RNN gradient check tests: same failure as nd4j layer norm test failure below

deeplearning4j-nlp:
TestBertIterator.testMinibatchPadding:288 [simple dtype issue?]

  FastTextTest.testPredict:114 arrays first differed at element [0]; expected:<-0.006423053797334433> but was:<-3.347830206621438E-4>
  BertWordPieceTokenizerTests.testBertWordPieceTokenizer9:182 Expected exception: ?I saw a girl with a telescope.
Tests in error: 
  FastTextTest.testPredictProbability:126 » Runtime Model needs to be supervised...
  FastTextTest.testWordsNativeStatistics:227 » ND4JIllegalState Cannot execute m...
  FastTextTest.testWordsStatistics:207 » ND4JIllegalState Cannot execute matrix ...

Unable to reproduce BertWordPieceTokenizerTests locally.

deeplearning4j-nlp-uima:

  WordVectorSerializerTest.testIndexPersistence:262
  WordVectorSerializerTest.testOutputStream:468
  WordVectorSerializerTest.testParaVecSerialization1:529
  WordVectorSerializerTest.testStaticLoaderArchive:657 expected:
  WordVectorSerializerTest.testStaticLoaderBinary:602
  WordVectorSerializerTest.testStaticLoaderFromStream:617
  WordVectorSerializerTest.testStaticLoaderText:636
Tests in error: 
  Word2VecTests.testLoadingWordVectors:357 » ND4JIllegalState Cannot execute mat...
  Word2VecTests.testWordsNearest:106 » ND4JIllegalState Cannot execute matrix mu...

linux-x86_64-cuda-9.2

nd4j-tests
TestOpMapping - JVM crash: https://gist.github.com/AlexDBlack/39bc6cd1d04c992f13f97eb4f53abdcf

Deadlock/infinite test: org.nd4j.parameterserver.distributed.v2.DelayedModelParameterServerTest
https://gist.github.com/AlexDBlack/2970c368cee8dfcb8ddabe3913c25e60

macosx-x86_64-cpu (and avx2)
Can't build libnd4j: https://gist.github.com/AlexDBlack/cd6dc471e21300f104d9ce0398491c73

windows-x86_64-cpu (and avx2)

libnd4j:
[ FAILED ] DeclarableOpsTests14.test_empty_reduce_mean_1

nd4j-tests (note: this excludes a few failures that have since been fixed: rnn, strided slice, testIndexingThorough)

Failed tests:   testLayerNormMixedOrders[0: backend(org.nd4j.linalg.cpu.nativecpu.CpuBackend)={1}](org.nd4j.autodiff.opvalidation.LayerOpValidation):

Tests in error: 
  testNormalizeMomentsOp[0: backend(org.nd4j.linalg.cpu.nativecpu.CpuBackend)={1}](org.nd4j.autodiff.opvalidation.ReductionOpValidation): Error during op execution
Labels: Bug, DL4J, LIBND4J, ND4J

All 8 comments

Can't confirm macos issue locally. Stuff builds just fine. cc @sshepel

test_empty_reduce_mean_1 was fixed yesterday

I have updated gcc to v8 on macos and am waiting for a few more fixes before triggering a new test run...

Updating GCC won't fix this. We need to make sure not to put explicit template instantiations like template struct SomeTemplate<int, float> in header files. It's OK on Linux and Windows, but Mac doesn't like it.
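For readers unfamiliar with the construct: the `template struct SomeTemplate<int, float>` form mentioned above is an explicit instantiation definition. A common way to keep it out of headers is to put only an `extern template` declaration in the header and the definition in exactly one source file. The sketch below shows that layout in a single listing. It is an illustration under assumptions, not the change made in libnd4j: the member `convert` and the helper `halveAsFloat` are hypothetical names, and the file names in the comments are placeholders.

```cpp
// ---- would live in SomeTemplate.h (hypothetical file name) ----
template <typename X, typename Y>
struct SomeTemplate {
    // Hypothetical member, used only to have something to instantiate.
    static Y convert(X value) { return static_cast<Y>(value) / 2; }
};

// Explicit instantiation *declaration*: safe in a header. It tells every
// includer that the instantiation is provided by some other translation unit.
extern template struct SomeTemplate<int, float>;

// ---- would live in SomeTemplate.cpp (exactly one translation unit) ----
// Explicit instantiation *definition*: the line that should stay out of headers.
template struct SomeTemplate<int, float>;

// ---- caller code, in any file that includes the header ----
// Hypothetical helper that uses the pre-instantiated specialization.
float halveAsFloat(int value) {
    return SomeTemplate<int, float>::convert(value);
}
```

With this split, each file that includes the header sees only the declaration, and the instantiation is emitted once, in the source file.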

@raver119, @saudet I have updated gcc to version 8 on one of the build agents, and the compilation issue is gone.

What should I do: update all build agents to gcc version 8, or will someone test the solution Samuel proposed in the previous message?

Really? Well, some Mac users are having issues with 1.0.0-beta4, so maybe GCC 7 doesn't work so well for us anymore anyway...

All Mac build agents have been updated to the new gcc version.

These problems are fixed by now:

  1. deeplearning4j-nlp:
    FastTextTest.testWordsNativeStatistics:227 » ND4JIllegalState Cannot execute m...
    FastTextTest.testWordsStatistics:207 » ND4JIllegalState Cannot execute matrix ...
  2. deeplearning4j-nlp-uima:
    Word2VecTests.testLoadingWordVectors:357 » ND4JIllegalState Cannot execute mat...
    Word2VecTests.testWordsNearest:106 » ND4JIllegalState Cannot execute matrix mu...