Previous Version: 20.10.1
New Version: 20.11.14468076
After trying to upgrade to the latest nightly, I got some errors during execution of install.sh.
Currently compose is able to start, but Sentry shows "There was an error loading data."
Postgres and worker logs show that at least the sentry_release.status column is missing.
I don't know whether Kafka is the problem that kept the migrations from running, or whether the migrations themselves had errors.
Install.sh log:
...
Docker images built.
Creating network "sentry_onpremise_default" with the default driver
Bootstrapping and migrating Snuba...
Creating sentry_onpremise_clickhouse_1 ...
Creating sentry_onpremise_redis_1 ...
Creating sentry_onpremise_zookeeper_1 ...
Creating sentry_onpremise_zookeeper_1 ... done
Creating sentry_onpremise_kafka_1 ...
Creating sentry_onpremise_redis_1 ... done
Creating sentry_onpremise_clickhouse_1 ... done
Creating sentry_onpremise_kafka_1 ... done
+ '[' b = - ']'
+ snuba bootstrap --help
+ set -- snuba bootstrap --force
+ set gosu snuba snuba bootstrap --force
+ exec gosu snuba snuba bootstrap --force
%3|1605706319.866|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 2ms in state CONNECT)
%3|1605706320.864|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
2020-11-18 13:32:00,864 Connection to Kafka failed (attempt 0)
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 55, in bootstrap
client.list_topics(timeout=1)
cimpl.KafkaException: KafkaError{code=_TRANSPORT,val=-195,str="Failed to get metadata: Local: Broker transport failure"}
%3|1605706321.868|FAIL|rdkafka#producer-2| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 0ms in state CONNECT)
%3|1605706322.868|FAIL|rdkafka#producer-2| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
2020-11-18 13:32:02,869 Connection to Kafka failed (attempt 1)
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 55, in bootstrap
client.list_topics(timeout=1)
cimpl.KafkaException: KafkaError{code=_TRANSPORT,val=-195,str="Failed to get metadata: Local: Broker transport failure"}
%3|1605706323.872|FAIL|rdkafka#producer-3| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 1ms in state CONNECT)
%3|1605706324.872|FAIL|rdkafka#producer-3| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
2020-11-18 13:32:04,873 Connection to Kafka failed (attempt 2)
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 55, in bootstrap
client.list_topics(timeout=1)
cimpl.KafkaException: KafkaError{code=_TRANSPORT,val=-195,str="Failed to get metadata: Local: Broker transport failure"}
%3|1605706325.880|FAIL|rdkafka#producer-4| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 0ms in state CONNECT)
%3|1605706326.875|FAIL|rdkafka#producer-4| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
2020-11-18 13:32:06,882 Connection to Kafka failed (attempt 3)
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 55, in bootstrap
client.list_topics(timeout=1)
cimpl.KafkaException: KafkaError{code=_TRANSPORT,val=-195,str="Failed to get metadata: Local: Broker transport failure"}
%3|1605706327.885|FAIL|rdkafka#producer-5| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Connect to ipv4#172.29.0.5:9092 failed: Connection refused (after 1ms in state CONNECT)
2020-11-18 13:32:08,885 Connection to Kafka failed (attempt 4)
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 55, in bootstrap
client.list_topics(timeout=1)
cimpl.KafkaException: KafkaError{code=_TRANSPORT,val=-195,str="Failed to get metadata: Local: Broker transport failure"}
2020-11-18 13:32:10,443 Failed to create topic outcomes
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 91, in bootstrap
future.result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
cimpl.KafkaException: KafkaError{code=TOPIC_ALREADY_EXISTS,val=36,str="Topic 'outcomes' already exists."}
2020-11-18 13:32:10,445 Failed to create topic events
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 91, in bootstrap
future.result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
cimpl.KafkaException: KafkaError{code=TOPIC_ALREADY_EXISTS,val=36,str="Topic 'events' already exists."}
2020-11-18 13:32:10,445 Failed to create topic errors-replacements
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 91, in bootstrap
future.result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
cimpl.KafkaException: KafkaError{code=TOPIC_ALREADY_EXISTS,val=36,str="Topic 'errors-replacements' already exists."}
2020-11-18 13:32:10,446 Failed to create topic event-replacements
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 91, in bootstrap
future.result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
cimpl.KafkaException: KafkaError{code=TOPIC_ALREADY_EXISTS,val=36,str="Topic 'event-replacements' already exists."}
2020-11-18 13:32:10,446 Failed to create topic snuba-commit-log
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 91, in bootstrap
future.result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
cimpl.KafkaException: KafkaError{code=TOPIC_ALREADY_EXISTS,val=36,str="Topic 'snuba-commit-log' already exists."}
2020-11-18 13:32:10,447 Failed to create topic cdc
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 91, in bootstrap
future.result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
cimpl.KafkaException: KafkaError{code=TOPIC_ALREADY_EXISTS,val=36,str="Topic 'cdc' already exists."}
2020-11-18 13:32:10,447 Failed to create topic ingest-sessions
Traceback (most recent call last):
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 91, in bootstrap
future.result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
cimpl.KafkaException: KafkaError{code=TOPIC_ALREADY_EXISTS,val=36,str="Topic 'ingest-sessions' already exists."}
Traceback (most recent call last):
File "/usr/local/bin/snuba", line 33, in <module>
sys.exit(load_entry_point('snuba', 'console_scripts', 'snuba')())
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/usr/src/snuba/snuba/cli/bootstrap.py", line 98, in bootstrap
Runner().run_all(force=True)
File "/usr/src/snuba/snuba/migrations/runner.py", line 132, in run_all
pending_migrations = self._get_pending_migrations()
File "/usr/src/snuba/snuba/migrations/runner.py", line 257, in _get_pending_migrations
raise MigrationInProgress(migration_key)
snuba.migrations.errors.MigrationInProgress: transactions: 0008_transactions_add_timestamp_index
Cleaning up...
docker-compose logs Traceback:
worker_1 | Traceback (most recent call last):
worker_1 | File "/usr/local/lib/python2.7/site-packages/celery/app/trace.py", line 412, in trace_task
worker_1 | R = retval = fun(*args, **kwargs)
worker_1 | File "/usr/local/lib/python2.7/site-packages/celery/app/trace.py", line 704, in __protected_call__
worker_1 | return self.run(*args, **kwargs)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry_sdk/integrations/celery.py", line 186, in _inner
worker_1 | reraise(*exc_info)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry_sdk/integrations/celery.py", line 181, in _inner
worker_1 | return f(*args, **kwargs)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/tasks/base.py", line 48, in _wrapped
worker_1 | result = func(*args, **kwargs)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/tasks/store.py", line 870, in save_event
worker_1 | _do_save_event(cache_key, data, start_time, event_id, project_id, **kwargs)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/tasks/store.py", line 784, in _do_save_event
worker_1 | project_id, assume_normalized=True, start_time=start_time, cache_key=cache_key
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/utils/metrics.py", line 193, in inner
worker_1 | return f(*args, **kwargs)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/event_manager.py", line 317, in save
worker_1 | _get_or_create_release_many(jobs, projects)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/utils/metrics.py", line 193, in inner
worker_1 | return f(*args, **kwargs)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/event_manager.py", line 579, in _get_or_create_release_many
worker_1 | date_added=release_date_added[(project_id, version)],
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/models/release.py", line 204, in get_or_create
worker_1 | return cls._get_or_create_impl(project, version, date_added, metric_tags)
worker_1 | File "/usr/local/lib/python2.7/site-packages/sentry/models/release.py", line 225, in _get_or_create_impl
worker_1 | projects=project,
worker_1 | File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 250, in __iter__
worker_1 | self._fetch_all()
worker_1 | File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 1121, in _fetch_all
worker_1 | self._result_cache = list(self._iterable_class(self))
worker_1 | File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 53, in __iter__
worker_1 | results = compiler.execute_sql(chunked_fetch=self.chunked_fetch)
worker_1 | File "/usr/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 899, in execute_sql
worker_1 | raise original_exception
worker_1 | ProgrammingError: UndefinedColumn('column sentry_release.status does not exist\nLINE 1: ...elease"."id", "sentry_release"."organization_id", "sentry_re...\n ^\n',)
It seems your upgrade stalled, so the Sentry migrations were skipped, which explains the issues. I'll ping @lynnagara to look into the Snuba migration issue.
Hi @LuckyType, can you confirm that the version of ClickHouse you are using is 20.3.9.70, as defined here? https://github.com/getsentry/onpremise/blob/a717c11a2554474c7ba8637ebba89750061c2a2f/docker-compose.yml#L106
The migration that stalled uses a feature that was not turned on by default in some prior versions of ClickHouse.
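A minimal sketch for checking an image tag against the 20.3.9.70 version mentioned above. The helper names `version_at_least` and `image_tag` are made up for illustration, and the comparison assumes a `sort` that supports `-V` (GNU coreutils does); it only compares version strings and does not talk to Docker or ClickHouse.

```shell
#!/bin/sh
# Version referenced in this thread via the linked docker-compose.yml.
REQUIRED_CLICKHOUSE_VERSION="20.3.9.70"

# version_at_least ACTUAL MINIMUM
# Succeeds when ACTUAL >= MINIMUM under version ordering.
# Assumption: `sort -V` is available (GNU coreutils).
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# image_tag IMAGE
# Prints everything after the last ':' of an image reference,
# e.g. yandex/clickhouse-server:19.17 gives 19.17.
# An untagged reference is printed unchanged.
image_tag() {
    printf '%s\n' "${1##*:}"
}

# Example with the image reported later in this thread.
tag=$(image_tag "yandex/clickhouse-server:19.17")
if version_at_least "$tag" "$REQUIRED_CLICKHOUSE_VERSION"; then
    echo "ClickHouse $tag meets $REQUIRED_CLICKHOUSE_VERSION"
else
    echo "ClickHouse $tag is older than $REQUIRED_CLICKHOUSE_VERSION"
fi
```

The tag in `docker-compose.yml` is only what was requested; the version of a running server may differ if the container was not recreated after the file changed.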
I've got the same issue here; how do we recover from this?
Our clickhouse image:
image: 'yandex/clickhouse-server:19.17'
how do we recover from this?
Use this repo which already has the correct Clickhouse version set:
@BYK, thanks for the quick response. A couple of questions which are really unclear to me:
@mcdurdin since you are not using a setup that we know, it is very hard to assist you with those questions.
If you were using this repo, I'd say just back up your data and config volumes and try stuff.
@BYK, we're using the standard onpremise setup, following your instructions on this repository, from this version of https://github.com/getsentry/onpremise/blob/fb125a1e4c40701b32f974f6eb2c46a05ca2cd78/docker-compose.yml with the following changes:
--- a/BASE/docker-compose.yml
+++ b/HEAD/docker-compose.yml
@@ -170,18 +170,18 @@ services:
<< : *sentry_defaults
# Increase `--commit-batch-size 1` below to deal with high-load environments.
command: run post-process-forwarder --commit-batch-size 1
- sentry-cleanup:
- << : *sentry_defaults
- image: sentry-cleanup-onpremise-local
- build:
- context: ./cron
- args:
- BASE_IMAGE: 'sentry-onpremise-local'
- command: '"0 0 * * * gosu sentry sentry cleanup --days $SENTRY_EVENT_RETENTION_DAYS"'
+# sentry-cleanup:
+# << : *sentry_defaults
+# image: sentry-cleanup-onpremise-local
+# build:
+# context: ./cron
+# args:
+# BASE_IMAGE: 'sentry-onpremise-local'
+# command: '"0 0 * * * gosu sentry sentry cleanup --days $SENTRY_EVENT_RETENTION_DAYS"'
nginx:
<< : *restart_policy
ports:
- - '9000:80/tcp'
+ - '80:80/tcp'
image: 'nginx:1.16'
volumes:
- type: bind
Does that provide the information you need?
@mcdurdin then just upgrade to the latest one and run ./install.sh?
You should probably keep the cleanup job, btw, and you can now control the bound port via the SENTRY_BIND env variable, so your changes should no longer be needed.
You need to manually reset that migration by running this on the snuba container first:
snuba migrations reverse --group transactions --migration-id 0008_transactions_add_timestamp_index --force
Then upgrade ClickHouse and retry.
@lynnagara, that resolved the issue for me, thank you. For other readers, I ran the following command:
docker-compose run --rm snuba-api migrations reverse --group transactions --migration-id 0008_transactions_add_timestamp_index --force
Thanks for the quick response.
@lynnagara's suggestion worked flawlessly for me, thanks to @mcdurdin for the run command.
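For readers hitting a different stalled migration, here is a rough sketch of how the group and migration ID could be read from the `MigrationInProgress` line in the install log and turned into the command that worked in this thread. The function names are hypothetical, the log format is assumed to match the line quoted above, and the script only prints the steps; it does not run anything against your containers.

```shell
#!/bin/sh
# parse_stalled_migration LINE
# Prints "<group> <migration-id>" from a MigrationInProgress log line,
# or nothing if the line does not match the assumed format.
parse_stalled_migration() {
    printf '%s\n' "$1" |
        sed -n 's/^.*MigrationInProgress: \([^:]*\): \(.*\)$/\1 \2/p'
}

# reverse_command GROUP MIGRATION_ID
# Prints the docker-compose invocation reported to work in this thread.
reverse_command() {
    printf 'docker-compose run --rm snuba-api migrations reverse --group %s --migration-id %s --force\n' "$1" "$2"
}

# print_recovery_steps LINE
# Prints the three steps described in this thread, in order.
print_recovery_steps() {
    parsed=$(parse_stalled_migration "$1")
    if [ -z "$parsed" ]; then
        echo "no MigrationInProgress entry found" >&2
        return 1
    fi
    group=${parsed%% *}
    migration_id=${parsed#* }
    echo "1. $(reverse_command "$group" "$migration_id")"
    echo "2. Upgrade ClickHouse to the version set in the repository's docker-compose.yml"
    echo "3. ./install.sh"
}

# Example with the line from the install log in this issue.
print_recovery_steps 'snuba.migrations.errors.MigrationInProgress: transactions: 0008_transactions_add_timestamp_index'
```

As suggested earlier in the thread, back up your data and config volumes before running the printed commands.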