Bulk actions such as merge or delete on 652 issues are not working

On a delete action I can see in the logs:
172.18.0.1 - - [03/May/2019:17:51:31 +0000] "GET /api/0/projects/geokrety/geokrety-legacy/issues/?sort=date&shortIdLookup=1&environment=kumy&limit=25&statsPeriod=24h&query=is%3Aunresolved&cursor=1556897291000:0:1 HTTP/1.1" 200 803 "https://sentry.kumy.org/geokrety/geokrety-legacy/?environment=kumy" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0"
172.18.0.1 - - [03/May/2019:17:51:37 +0000] "GET /api/0/projects/geokrety/geokrety-legacy/issues/?sort=date&shortIdLookup=1&environment=kumy&limit=25&statsPeriod=24h&query=is%3Aunresolved&cursor=1556897291000:0:1 HTTP/1.1" 200 803 "https://sentry.kumy.org/geokrety/geokrety-legacy/?environment=kumy" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0"
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/sentry/api/base.py", line 90, in handle_exception
    response = super(Endpoint, self).handle_exception(exc)
  File "/usr/local/lib/python2.7/site-packages/sentry/api/base.py", line 190, in dispatch
    response = handler(request, *args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/sentry/api/endpoints/organization_group_index.py", line 292, in delete
    search_fn,
  File "/usr/local/lib/python2.7/site-packages/sentry/api/helpers/group_index.py", line 414, in delete_groups
    'paginator_options': {'max_limit': 1000},
  File "/usr/local/lib/python2.7/site-packages/sentry/api/endpoints/organization_group_index.py", line 40, in _search
    result = search.query(**query_kwargs)
  File "/usr/local/lib/python2.7/site-packages/sentry/search/django/backend.py", line 404, in query
    paginator_options, search_filters, **parameters)
  File "/usr/local/lib/python2.7/site-packages/sentry/search/snuba/backend.py", line 302, in _query
    search_filters=search_filters,
  File "/usr/local/lib/python2.7/site-packages/sentry/search/snuba/backend.py", line 466, in snuba_search
    sample=1, # Don't use clickhouse sampling, even when in turbo mode.
  File "/usr/local/lib/python2.7/site-packages/sentry/utils/snuba.py", line 477, in raw_query
    raise SnubaError(err)
SnubaError: HTTPConnectionPool(host='localhost', port=1218): Max retries exceeded with url: /query (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f516eadbed0>: Failed to establish a new connection: [Errno 111] Connection refused',))
172.18.0.1 - - [03/May/2019:17:51:39 +0000] "DELETE /api/0/organizations/geokrety/issues/?query=is%3Aunresolved&environment=kumy&project=5 HTTP/1.1" 500 362 "https://sentry.kumy.org/geokrety/geokrety-legacy/?environment=kumy" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0"
Bulk merge also gives a similar error:
SnubaError: HTTPConnectionPool(host='localhost', port=1218): Max retries exceeded with url: /query (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f516eadbe10>: Failed to establish a new connection: [Errno 111] Connection refused',))
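The "Connection refused" (Errno 111) in both errors means nothing was listening on localhost:1218 from inside the container. As a rough way to confirm that independently of Sentry, here is a minimal sketch of a TCP probe. It assumes Python 3 and uses only the standard library; `probe_tcp` is a hypothetical helper name, not part of Sentry or Snuba, and the host and port are simply copied from the error message.

```python
import errno
import socket


def probe_tcp(host, port, timeout=2.0):
    """Return (reachable, detail) for a TCP connection attempt to host:port.

    Hypothetical helper for illustration; not part of Sentry or Snuba.
    """
    try:
        # create_connection resolves the host and attempts a full TCP handshake.
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return False, "timed out"
    except OSError as exc:
        # Map the numeric errno (e.g. 111 on Linux) to its symbolic name.
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, "{} ({})".format(exc.strerror or exc, name)
    sock.close()
    return True, "connected"


if __name__ == "__main__":
    # Host and port taken from the SnubaError message above.
    print(probe_tcp("localhost", 1218))
```

Run from inside the web container, a `False` result with a connection-refused detail would match the traceback: the request fails before reaching any service, which fits a setup where no Snuba service is deployed.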
Can you please provide more details on your setup?
I'm using a swarm deployment with this docker-compose.yml. Is there anything specific you wish to know?
version: '3.7'
x-defaults: &defaults
  # build: .
  image: sentry-local:9.1.1
  environment:
    SENTRY_SECRET_KEY: xxx
    SENTRY_SECRET_KEY_: xxx
    SENTRY_MEMCACHED_HOST: memcached
    SENTRY_REDIS_HOST: redis
    SENTRY_POSTGRES_HOST: postgres
    SENTRY_EMAIL_HOST: smtp
    SENTRY_DB_PASSWORD: xxx
    SENTRY_EMAIL_HOST: smtp.xyz.net
    SENTRY_EMAIL_PASSWORD: 'xxx'
    SENTRY_EMAIL_USER: 'xxx'
    SENTRY_EMAIL_PORT: 587
    SENTRY_EMAIL_USE_TLS: 'True'
    SENTRY_SERVER_EMAIL: xxx
    SENTRY_USE_SSL: 1
    # OPTIONAL: If you want GitHub integration
    GITHUB_CLIENT_ID: xxx
    GITHUB_CLIENT_SECRET: xxx
    GITHUB_EXTENDED_PERMISSIONS: "repo"
  volumes:
    - /srv/SENTRY/data:/var/lib/sentry/files
  deploy:
    labels:
      traefik.enable: "false"
    restart_policy:
      condition: any
  depends_on:
    - redis
    - postgres
    - memcached
    - smtp
  networks:
    default:
x-defaults-other: &defaults-other
  deploy:
    labels:
      traefik.enable: "false"
    restart_policy:
      condition: any
  networks:
    default:
services:
  memcached:
    image: memcached:1.5-alpine
    <<: *defaults-other
  redis:
    image: redis:3.2-alpine
    <<: *defaults-other
  postgres:
    image: postgres:9.5
    environment:
      POSTGRES_PASSWORD: xxx
    volumes:
      - /srv/SENTRY/postgres:/var/lib/postgresql/data
    <<: *defaults-other
  web:
    <<: *defaults
    # ### To upgrade: run as sleep, then connect in container and `$ upgrade`
    #command: sleep 10000
    deploy:
      labels:
        traefik.enable: "true"
        traefik.docker.network: "traefik_default"
        traefik.frontend.rule: "Host:xxx"
        traefik.frontend.passHostHeader: "true"
        traefik.protocol: "http"
        traefik.port: 9000
      restart_policy:
        condition: any
    networks:
      default:
      traefik_default:
  cron:
    <<: *defaults
    command: run cron
  worker:
    <<: *defaults
    command: run worker
networks:
  default:
  traefik_default:
    external: true
Can you also share your config? The error logs point to Snuba, but it should not be enabled by default, and you don't seem to be running that service.
I haven't modified any config file; everything is configured through environment variables. sentry.conf.py and config.yml are the ones from commit 0b1843047ae28425d428cf4f264b0fa07f59a76c
Alright, thanks for the info. Investigating...
@kumy - initial investigation shows that this feature should not be enabled in 9.x releases, as it requires a new service to be running. That's why you are seeing these errors.
I'll dig deeper and see why this is exposed without the feature flag being enabled. Can you share which page you are seeing this on? It looks like the issues page, but I'm just trying to confirm. Also, although this may seem irrelevant, can you share how many projects you have on your Sentry instance? "Just one" or "more than one" is enough; I don't need a precise number.
Finally, this feature likely won't be available in the 9.x releases so the "fix" will simply remove the path to get there unless you have the feature flag enabled to indicate that you have the new service running. Apologies for the inconvenience.
@BYK I have 2 projects. Here are the steps I used:



However, I've just upgraded to 9f8c89a5f7d7718c6f3c1e48cee0f32d77340810, and on launch, I saw many lines in logs like:
21:32:13 [ERROR] sentry.errors.events: preprocess.failed.empty (cache_key=u'e:445977fb8d5546199d45986ac4d7b648:5')
21:32:13 [ERROR] sentry.errors.events: preprocess.failed.empty (cache_key=u'e:d9712136606c40759a91d381b8cc828c:5')
21:32:13 [ERROR] sentry.errors.events: preprocess.failed.empty (cache_key=u'e:64b941a358484a0781223823e95c6f34:5')
I tried the action again and it seems to work now!!! But celery is eating one core :/
Are you interested in a test back on commit 0b18430?
@kumy - thanks a lot for the follow-up and all the useful information! I'll check this again myself. However, I don't think testing this with older versions is useful, as we expect people to upgrade to the latest version if an issue arises (any fix I'd do would mean a newer version here anyway :D).
Looking at the diff between the commit you shared and master, I mostly see some config fixes: https://github.com/getsentry/onpremise/compare/0b1843047ae28425d428cf4f264b0fa07f59a76c..master and none of them seem to be related.
Maybe your celery worker was stuck and upgrading forced it to restart, which fixed the issue?
Killing celery didn't fix the problem. It started again and took one complete CPU core again. I'll try starting again from scratch without any data. Maybe something broke in the db while upgrading.
@kumy - I meant that killing celery may have triggered the fix you observed :) I think it is normal for celery to use the CPU fully while it is processing all of these. Not sure if you can add a new worker to speed things up tho. Or did you mean celery was stuck and just consuming CPU?
I started the stack again tonight; celery has been running smoothly since startup (~5 minutes so far).
Well then, I'll close this one out but will still investigate the Snuba dependency thing. Thanks a lot for coming back and following up with everything!