Installation details
Scylla version (or git commit hash): 4.3.rc3-0.20201223.5bd52e4db with build-id 7855b9d820223e7134e84437248b1b474a492f64
Cluster size: 4
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-01ef5abee0187f85a
It happened after we forced SSL auto-reloading with a nemesis called ServerSslHotReloadingNemesis:
2021-01-10 12:42:46.742: (DisruptionEvent Severity.NORMAL): type=start name=ServerSslHotReloadingNemesis node=Node longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 [34.244.35.16 | 10.0.1.49] (seed: False) duration=None
2021-01-10 12:42:58.666: (DisruptionEvent Severity.NORMAL): type=end name=ServerSslHotReloadingNemesis node=Node longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 [34.244.35.16 | 10.0.1.49] (seed: False) duration=11
2021-01-10 12:42:57.000: (DatabaseLogEvent Severity.ERROR): type=DATABASE_ERROR regex=Exception line_number=355659 node=Node longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 [34.244.35.16 | 10.0.1.49] (seed: False)
2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !WARNING | scylla: [shard 8] messaging_service - Exception loading {/etc/scylla/ssl_conf/db.crt}: std::system_error (error GnuTLS:-34, Base64 decoding error.)
2021-01-10 12:42:54.000: (DatabaseLogEvent Severity.ERROR): type=DATABASE_ERROR regex=Exception line_number=744252 node=Node longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 [54.229.121.119 | 10.0.2.116] (seed: False)
2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !WARNING | scylla: [shard 5] messaging_service - Exception loading {/etc/scylla/ssl_conf/db.crt}: std::system_error (error GnuTLS:-34, Base64 decoding error.)
It seems like it continued to retry after it failed:
< t:2021-01-10 12:43:03,886 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !NOTICE | sudo: scyllaadm : TTY=unknown ; PWD=/home/scyllaadm ; USER=root ; COMMAND=/bin/cp -f /tmp/db.crt /etc/scylla/ssl_conf/db.crt
< t:2021-01-10 12:43:03,887 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 5] compaction - [Compact keyspace_deflate.standard1 ee405290-533f-11eb-87a4-000000000001] Compacted 2 sstables to [/var/lib/scylla/data/keyspace_deflate/standard1-2ca835f0513511eb807c000000000002/md-16273-big-Data.db:level=0, ]. 760MB to 600MB (~78% of original) in 613586ms = 977kB/s. ~3472000 total partitions merged to 2735790.
< t:2021-01-10 12:43:03,887 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !WARNING | scylla: [shard 8] messaging_service - Exception loading {/etc/scylla/ssl_conf/db.crt}: std::system_error (error GnuTLS:-34, Base64 decoding error.)
< t:2021-01-10 12:43:03,887 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 10] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,888 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 9] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,888 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 11] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,888 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 8] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,888 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 0] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,888 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 1] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,888 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 13] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,889 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 6] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,889 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 5] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,889 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 2] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,889 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 12] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,889 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 4] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,889 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 7] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,889 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 3] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:03,890 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:57+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-5 !INFO | scylla: [shard 8] messaging_service - Reloaded {}
and then again, on the other node:
< t:2021-01-10 12:43:05,975 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !NOTICE | sudo: scyllaadm : TTY=unknown ; PWD=/home/scyllaadm ; USER=root ; COMMAND=/bin/cp -f /tmp/db.crt /etc/scylla/ssl_conf/db.crt
< t:2021-01-10 12:43:05,977 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !WARNING | scylla: [shard 5] messaging_service - Exception loading {/etc/scylla/ssl_conf/db.crt}: std::system_error (error GnuTLS:-34, Base64 decoding error.)
< t:2021-01-10 12:43:05,979 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 12] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,980 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 4] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,981 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 10] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,982 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 9] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,983 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 7] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,984 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 1] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,985 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 3] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,985 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 2] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,986 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 8] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,987 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 6] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,988 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 11] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,989 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 0] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,989 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 13] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,990 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:54+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 5] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}
< t:2021-01-10 12:43:05,993 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:55+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 1] messaging_service - Reloaded {}
< t:2021-01-10 12:43:05,994 f:cluster.py l:1356 c:sdcm.cluster p:DEBUG > 2021-01-10T12:42:55+00:00 longevity-tls-1tb-7d-prepare--db-node-6ea0e285-4 !INFO | scylla: [shard 10] messaging_service - Reloaded {}
Assuming it eventually _did_ load the cert, this is not an error, but more or less expected. Since FS notification does not make a distinction between writing and _finishing_ writing, we can try to load incomplete files. This is ok, we just retry a bit later (on next change, finish).
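To illustrate the behaviour described above, here is a minimal Python sketch of "try to load, keep the old certificate on failure, retry on the next notification". It is not Scylla's or Seastar's code: `try_decode_pem` and `CertHolder` are hypothetical names, and a plain Base64 check stands in for the GnuTLS parsing that produced the error in the logs.

```python
import base64
import binascii

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def try_decode_pem(text):
    """Return the decoded bytes of a complete PEM block, or None if it is incomplete."""
    stripped = text.strip()
    if not (stripped.startswith(PEM_HEADER) and stripped.endswith(PEM_FOOTER)):
        return None  # header or footer missing, e.g. the file is still being written
    body = "".join(stripped[len(PEM_HEADER):-len(PEM_FOOTER)].split())
    try:
        return base64.b64decode(body, validate=True)
    except binascii.Error:
        return None  # stand-in for "Base64 decoding error"


class CertHolder:
    """Hypothetical holder: keeps the last good certificate across failed reloads."""

    def __init__(self):
        self.current = None

    def on_change(self, text):
        decoded = try_decode_pem(text)
        if decoded is None:
            return False  # keep the old certificate, wait for the next notification
        self.current = decoded
        return True


# Simulate two notifications for one copy: first a partial file, then the full one.
payload = b"x" * 100
full_pem = PEM_HEADER + "\n" + base64.encodebytes(payload).decode() + PEM_FOOTER + "\n"
partial_pem = full_pem[:40]

holder = CertHolder()
first_attempt = holder.on_change(partial_pem)
second_attempt = holder.on_change(full_pem)
```

The first attempt fails without disturbing the holder, and the second succeeds. That is the same warning-then-"Reloaded" sequence seen in the logs.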
node logs can be found here
I think @elcallio is right, at least in this specific case, in the test by @ShlomiBalalis ... In the other test I'm not sure about the outcome; if it is the same, we will just need to ignore these messages in the SCT filter.
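One possible shape for such a filter, sketched in Python: treat a load failure as benign only if a later `Reloaded` message for the same path appears in the same node's log. This is an assumption about how the check could be done, not SCT's actual filter mechanism; the function name and regular expressions are hypothetical and are written against the log lines quoted in this issue.

```python
import re

# Patterns written against the messaging_service log lines quoted in this issue.
FAILED = re.compile(
    r"messaging_service - Exception loading \{(?P<path>[^}]+)\}: .*Base64 decoding error"
)
RELOADED = re.compile(r"messaging_service - Reloaded \{(?P<path>[^}]*)\}")


def unresolved_load_failures(lines):
    """Return paths whose load failure was never followed by a successful reload."""
    pending = set()
    for line in lines:
        failed = FAILED.search(line)
        if failed:
            pending.add(failed.group("path"))
            continue
        reloaded = RELOADED.search(line)
        if reloaded:
            pending.discard(reloaded.group("path"))
    return pending


sample = [
    "scylla: [shard 8] messaging_service - Exception loading "
    "{/etc/scylla/ssl_conf/db.crt}: std::system_error "
    "(error GnuTLS:-34, Base64 decoding error.)",
    "scylla: [shard 10] messaging_service - Reloaded {/etc/scylla/ssl_conf/db.crt}",
]

still_failing = unresolved_load_failures(sample)
```

With this approach a failure that is never followed by a reload would still be reported, so a truly broken certificate would not be silently ignored.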
[scyllaadm@longevity-tls-1tb-7d-prepare--db-node-6ea0e285-6 ssl_conf]$ ls -lh
total 16K
-rw-rw-r--. 1 scyllaadm scyllaadm 2.0K Dec 28 04:13 cadb.pem
-rw-r--r--. 1 scyllaadm scyllaadm 41 Jan 10 12:42 cadb.srl
drwxrwxr-x. 2 scyllaadm scyllaadm 203 Dec 28 04:13 client
-rw-r--r--. 1 scyllaadm scyllaadm 2.0K Jan 10 12:42 db.crt
-rw-rw-r--. 1 scyllaadm scyllaadm 3.2K Dec 28 04:13 db.key
drwxrwxr-x. 2 scyllaadm scyllaadm 110 Dec 28 04:13 example
[scyllaadm@longevity-tls-1tb-7d-prepare--db-node-6ea0e285-6 ssl_conf]$ openssl x509 -enddate -noout -in ./db.crt
notAfter=Jan 10 12:42:48 2022 GMT
The certificate was generated at runtime, with a 365-day expiration period.
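As a quick sanity check on those numbers, the following Python sketch subtracts 365 days from the `notAfter` value printed by `openssl`. It assumes the validity period is exactly 365 days and starts at generation time; the variable names are illustrative only.

```python
import ssl
from datetime import datetime, timedelta, timezone

# notAfter value as printed by `openssl x509 -enddate` above.
not_after = datetime.fromtimestamp(
    ssl.cert_time_to_seconds("Jan 10 12:42:48 2022 GMT"), tz=timezone.utc
)

# Assumption: validity is exactly 365 days, counted from generation time.
issued_estimate = not_after - timedelta(days=365)

# Nemesis window taken from the DisruptionEvent lines at the top of this issue.
nemesis_start = datetime(2021, 1, 10, 12, 42, 46, tzinfo=timezone.utc)
nemesis_end = datetime(2021, 1, 10, 12, 42, 59, tzinfo=timezone.utc)
generated_during_nemesis = nemesis_start <= issued_estimate <= nemesis_end
```

Under that assumption the certificate was generated at 2021-01-10 12:42:48 UTC, which falls inside the ServerSslHotReloadingNemesis window (12:42:46 to 12:42:58).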
> Assuming it eventually _did_ load the cert, this is not an error, but more or less expected. Since FS notification does not make a distinction between writing and _finishing_ writing, we can try to load incomplete files. This is ok, we just retry a bit later (on next change, finish).
Hmm, that's sad. Let's introduce a short delay between the notification and the reload.
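The idea of a short delay can be sketched as a debounce: every notification pushes the reload deadline forward, so the reload only runs once the file has been quiet for the whole delay. This Python sketch is illustrative only and is not the Seastar change; `Debouncer`, `delay_ms` and the explicit `now_ms` clock argument are hypothetical names chosen to keep the example deterministic.

```python
class Debouncer:
    """Fire once, delay_ms after the most recent notification."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self.deadline = None  # no reload pending

    def notify(self, now_ms):
        # Each file-change notification restarts the waiting period.
        self.deadline = now_ms + self.delay_ms

    def poll(self, now_ms):
        # True exactly once, when the quiet period has elapsed.
        if self.deadline is not None and now_ms >= self.deadline:
            self.deadline = None
            return True
        return False


debouncer = Debouncer(delay_ms=500)
debouncer.notify(0)    # first write to db.crt
debouncer.notify(300)  # copy still in progress, deadline moves to 800

fired_at_600 = debouncer.poll(600)
fired_at_800 = debouncer.poll(800)
fired_at_1000 = debouncer.poll(1000)
```

A copy that finishes within the delay then triggers a single reload of the complete file. A slower copy can still hit the retry path, so the retry behaviour remains useful as a fallback.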
The fix for this was merged into Seastar, and Scylla was updated to include it. We should be able to close this as fixed.
Seastar fix: https://github.com/scylladb/seastar/commit/e22a58dcc48b450a97c50bdbd2e6ac02407cfdea
Part of 4.5: 55609f2033457d1e75d8220e3086e7d708daab92