Couchdb: CompactTest fails while executing CouchDB elixir test suite.

Created on 7 Aug 2019 · 8 Comments · Source: apache/couchdb

- CouchDB v2.3 (master branch)
- Erlang OTP v21
- Elixir v1.8.2

I am seeing the issue below while running the CouchDB test suite using make check or make elixir.

I have set the timeouts at the following two places to 999_999. (Lower timeouts of 10000, raised from the existing 5000, still caused some tests to fail; I will try to find the optimum value after some more trials.)

https://github.com/apache/couchdb/blob/master/test/elixir/lib/couch.ex#L182
https://github.com/apache/couchdb/blob/master/test/elixir/lib/couch/db_test.ex#L293

Right now the Elixir test suite is in progress and all the tests so far seem to pass, but it has been stuck at the following test for quite some time:

CompactTest
  * test compaction reduces size of deleted docs

After the long timeout, the test finally fails as shown below:

CompactTest
  * test compaction reduces size of deleted docs (1017513.1ms)

  1) test compaction reduces size of deleted docs (CompactTest)
     test/elixir/test/compact_test.exs:17
     ** (RuntimeError) timed out after 1000053 ms
     code: retry_until(fn ->
     stacktrace:
       (couchdbtest) test/elixir/lib/couch/db_test.ex:301: Couch.DBTest.retry_until/4
       test/elixir/test/compact_test.exs:38: (test)

This is the only test failure that I am seeing right now, and all the other tests pass successfully.
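
For context, the "timed out after ... ms" message above is what a polling helper typically raises when its condition never becomes true. The following is only a rough Python sketch of that pattern, not the actual Elixir code in db_test.ex; the function signature, the 50 ms sleep interval, and the injectable clock are assumptions made for illustration. Only the 5000 ms default timeout and the error wording come from this report.

```python
import time


def retry_until(condition, sleep_ms=50, timeout_ms=5000,
                clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns a truthy value or timeout_ms elapses.

    Hypothetical stand-in for a retry helper; the names and defaults are
    assumptions, not the CouchDB test suite's API.
    """
    start = clock()
    while True:
        result = condition()
        if result:
            return result
        elapsed_ms = (clock() - start) * 1000
        if elapsed_ms > timeout_ms:
            # Mirrors the shape of the error seen in the failing test.
            raise RuntimeError(f"timed out after {int(elapsed_ms)} ms")
        sleep(sleep_ms / 1000)
```

With a helper of this shape, raising the timeout to 999_999 does not make a condition that can never hold succeed. It only delays the failure, which matches the roughly 1000 second run time reported above.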

bug needs-triage

All 8 comments

@wohali @kocolosk any inputs on this one?

Hard to say without any additional details like database log files. I don't think I've ever seen that one fail like that on our CI system yet so I don't have any other context to go on.

Hi @kocolosk ,

I debugged the failing CompactTest test case and have managed to find the root cause of the failure.

After understanding the logic of the test case and tracing the code flow, I realised that the failure was caused by an incorrect assert check at the following location:
https://github.com/apache/couchdb/blob/master/test/elixir/test/compact_test.exs#L46

The final data size, measured after deletion and subsequent compaction, is larger than the deleted data size, measured after deletion alone and before compaction.
The test was checking for the opposite, which is why it failed consistently.

These are the values of the variables in question that I managed to trace:

    CompactTest
      * test compaction reduces size of deleted docs

    Value of orig_data_size = 4436.

    Value of orig_disk_size = 103907.

    Value of deleted_data_size = 7455.

    Value of final_data_size = 11924.

    Value of final_disk_size = 218681.

      * test compaction reduces size of deleted docs (18819.2ms)
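
To make the comparison concrete, here is a small Python sketch that plugs the traced values into a "compaction shrank the data" check. It only illustrates the arithmetic; the exact assertion in compact_test.exs is not reproduced here, and the helper name compaction_shrank is hypothetical.

```python
# Sizes copied from the trace above (the trace gives no unit).
sizes = {
    "orig_data_size": 4436,
    "orig_disk_size": 103907,
    "deleted_data_size": 7455,
    "final_data_size": 11924,
    "final_disk_size": 218681,
}


def compaction_shrank(before: int, after: int) -> bool:
    """Return True if the size after compaction is smaller than before.

    Hypothetical helper, named here only for illustration.
    """
    return after < before


shrank = compaction_shrank(sizes["deleted_data_size"], sizes["final_data_size"])
growth = sizes["final_data_size"] - sizes["deleted_data_size"]

print(f"final_data_size < deleted_data_size: {shrank}")
print(f"final_data_size - deleted_data_size: {growth}")
```

On these numbers the "smaller after compaction" check evaluates to false, so any assertion of that form would fail on this platform regardless of how long the test waits.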

I have made the necessary changes and submitted a PR for the same.
https://github.com/apache/couchdb/pull/2127

The entire test suite for CouchDB v2.3 (current master) now passes with Erlang v21 and Elixir v1.8 on PowerPC64LE. Closing this issue. Thanks for all your help and support in getting this through.

Great stuff @sarveshtamba. We are looking to add PPC64LE to the CI matrix soon, so that should help us keep this one green.

@kocolosk any timelines to add PPC64LE to the CI matrix?

Hi @kocolosk ,

The build script to build CouchDB along with all of its dependencies on PowerPC64LE is available at the location below:
https://github.com/ppc64le/build-scripts/blob/master/couchdb/couchdb_ubuntu_16.04.sh

Note that this is tested for building CouchDB v2.3 (current master) with Erlang v21 and Elixir v1.8 only.

Also, as per https://cwiki.apache.org/confluence/display/INFRA/Jenkins+node+labels , these are the ppc64le nodes available for ASF projects:

|-------+---------------------------+-|
|ppc64le|                           |2|
|       |hadoop-ppc64le-1,          | |
|       |ubuntu-ppc64-le            | |
|-------+---------------------------+-|

There are currently 2 ppc64le nodes present at https://builds.apache.org/computer/ . They are
1) https://builds.apache.org/computer/hadoop-ppc64le-1/ and
2) https://builds.apache.org/computer/ubuntu-ppc64le/

@sarveshtamba Please don't hijack the issue; this isn't the right issue for this topic.

We're aware of those nodes and can't use the hadoop one. Because of our need for redundant builders, we've requested 2 nodes from OSU and received them yesterday. We're also moving to a new Jenkins/CloudBees Core install soon and will set those nodes up then.
