Scylla version (or git commit hash): 4.3rc1
Cluster size: 1
OS (RHEL/CentOS/Ubuntu/AWS AMI): Docker
docker run --name some-scylla-43 -d scylladb/scylla:4.3.rc1 --smp 1 --memory 1G --experimental 1
cqlsh> CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
cqlsh> CREATE TABLE ks.t (pk int, ck int, v int, PRIMARY KEY (pk, ck, v)) WITH cdc = {'enabled':true};
cqlsh> INSERT INTO ks.t (pk,ck,v) VALUES (1,2,3);
ServerError: cdc::metadata::get_stream: could not find any CDC stream (current time: 2020/11/18 15:01:46). Are we in the middle of a cluster upgrade?
The problem goes away after a few seconds or minutes.
It looks like CQL becomes available before the CDC service is ready.
https://docs.scylladb.com/using-scylla/cdc/cdc-stream-generations/#the-first-generation-s-timestamp
Not a bug. You just can't jump on a fresh cluster like that @tzach ;). Give it some time to settle :)
Perhaps today we could get rid of this delay, since we are now able to determine that we're the first node in the cluster (due to @asias removing the seed stuff). In that case we could make the first generation's timestamp equal to the boot time, so the generation would be available immediately.
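A minimal sketch of that idea, not Scylla's actual code: the names `first_generation_timestamp` and `stream_available` and the `propagation_delay` parameter are hypothetical, chosen only to illustrate the proposal. It assumes the delay exists to give other nodes time to learn about the generation, which a lone first node does not need.

```cpp
#include <cassert>
#include <chrono>

using time_point = std::chrono::system_clock::time_point;

// Hypothetical helper: choose the timestamp at which the first CDC
// generation starts operating.
//  - First node in the cluster: nobody else has to learn about the
//    generation, so it can start at boot time.
//  - Any other node: keep a delay so the generation can propagate.
time_point first_generation_timestamp(time_point boot_time,
                                      bool is_first_node,
                                      std::chrono::milliseconds propagation_delay) {
    assert(propagation_delay.count() >= 0);
    if (is_first_node) {
        return boot_time;
    }
    return boot_time + propagation_delay;
}

// Hypothetical check: a write at time `now` can find a CDC stream only
// if the generation's timestamp is not in the future.
bool stream_available(time_point generation_timestamp, time_point now) {
    return generation_timestamp <= now;
}
```

With this rule, a write issued immediately after boot on a single-node cluster would already see a generation, while the delayed case would still be rejected until the delay elapses.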
If it's possible then it's probably worth doing @kbr-
This is bad UX
We cannot expect users to read all the docs before they start using or evaluating Scylla.
We could write a FAQ page
The user encounters the message -> opens the FAQ -> Ctrl+F -> gets an instant explanation.
The message could also be reworded a bit: it mentions only one possible cause of the error (an upgrade) and not the other (a fresh cluster startup).
Better to remove the delay.
We do not want users to have to visit a FAQ as their first impression of the feature.
The best option is to remove this limitation.
If that is impossible, it is better to give a more descriptive message. Instead of "Are we in the middle of a cluster upgrade?"
it should be "CDC initiation is in progress; please try again in a minute".
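Until the delay is removed, a client can treat this error as transient and retry. The following is only a sketch under assumptions: `is_cdc_not_ready` and `retry_while_cdc_not_ready` are hypothetical names, the operation is modelled as a plain callable that throws `std::runtime_error` carrying the server's message, and no real driver API is used.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical predicate: does the error text look like the transient
// "no CDC stream yet" error quoted in this issue?
bool is_cdc_not_ready(const std::string& message) {
    return message.find("could not find any CDC stream") != std::string::npos;
}

// Hypothetical helper: run `op`, retrying while it fails with the
// transient CDC error. Any other error, or exhausting `max_attempts`,
// rethrows the last exception. Returns the number of attempts used.
int retry_while_cdc_not_ready(const std::function<void()>& op,
                              int max_attempts,
                              std::chrono::milliseconds pause) {
    assert(max_attempts >= 1);
    for (int attempt = 1;; ++attempt) {
        try {
            op();
            return attempt;
        } catch (const std::runtime_error& e) {
            if (!is_cdc_not_ready(e.what()) || attempt >= max_attempts) {
                throw;
            }
        }
        // Wait before the next attempt; the generation may need a while.
        std::this_thread::sleep_for(pause);
    }
}
```

Matching on the message text is fragile and is used here only because the issue shows no dedicated error code; it would break if the message were reworded as proposed above.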
I completely agree that we should remove the limitation, especially since this is a damn one-node cluster.
@kbr-
Here is the relevant code. Let's use it to remove the limitation. Tzach will buy you a beer.
bool storage_service::is_first_node() {
    if (db().local().is_replacing()) {
        return false;
    }
    auto seeds = _gossiper.get_seeds();
    if (seeds.empty()) {
        return false;
    }
    // Node with the smallest IP address is chosen as the very first node
    // in the cluster. The first node is the only node that does not
    // bootstrap in the cluster. All other nodes will bootstrap.
    std::vector<gms::inet_address> sorted_seeds(seeds.begin(), seeds.end());
    std::sort(sorted_seeds.begin(), sorted_seeds.end());
    if (sorted_seeds.front() == get_broadcast_address()) {
        slogger.info("I am the first node in the cluster. Skip bootstrap. Node={}", get_broadcast_address());
        return true;
    }
    return false;
}

bool storage_service::should_bootstrap() {
    return !db::system_keyspace::bootstrap_complete() && !is_first_node();
}
Thanks @asias. I will cook up a patch today.