Sentry: flooded server logs with unique violation

Created on 26 Sep 2017 · 10 comments · Source: getsentry/sentry

Server log flooded with sentry_organizationonboar_organization_id_47e98e05cae29cf3_uniq and sentry_eventuser_project_id_1a96e3b719e55f9a_uniq unique violation:

ERROR: duplicate key value violates unique constraint "sentry_eventuser_project_id_1a96e3b719e55f9a_uniq"
DETAIL: Key (project_id, hash)=(41, c62986fcfbcd1a26f3962f4cf8bf06bd) already exists.
STATEMENT: INSERT INTO "sentry_eventuser" ("project_id", "hash", "ident", "email", "username", "ip_address", "date_added") VALUES (41, E'c62986fcfbcd1a26f3962f4cf8bf06bd', NULL, NULL, NULL, E'1.1.1.1', E'2017-09-26 12:23:36.183755+00:00') RETURNING "sentry_eventuser"."id"

ERROR: duplicate key value violates unique constraint "sentry_organizationonboar_organization_id_47e98e05cae29cf3_uniq"
DETAIL: Key (organization_id, task)=(1, 5) already exists.
STATEMENT: INSERT INTO "sentry_organizationonboardingtask" ("organization_id", "user_id", "task", "status", "date_completed", "project_id", "data") VALUES (1, NULL, 5, 1, E'2017-09-26 12:23:36.195133+00:00', 41, E'{}') RETURNING "sentry_organizationonboardingtask"."id"

some stats of unique violation per hour:

  count | constraint name
----------------------------------------------------------------------------
  73226 | "sentry_organizationonboar_organization_id_47e98e05cae29cf3_uniq"
  71236 | "sentry_eventuser_project_id_1a96e3b719e55f9a_uniq"
     46 | "sentry_eventmapping_project_id_eb6c54bf8930ba6_uniq"
      1 | "sentry_groupsubscription_group_id_7e18bedd5058ccc3_uniq"
      1 | "sentry_grouprulestatus_rule_id_329bb0edaad3880f_uniq"
      1 | "sentry_grouprelease_group_id_46ba6e430d088d04_uniq"
      1 | "sentry_environmentrelease_project_id_1762e5bd8b7a0675_uniq"

Sentry is not the only user of this server, and I am not happy about filtering these 140,000 messages out of the logs every hour :-)

All 10 comments

Unfortunately, Postgres logs this as an ERROR even though we gracefully handle it in the application and it is normal behavior. I wish these were logged at a lower level, or that there were a way to tell Postgres that this is fine.

Basically, there are lots of places where we just blindly INSERT and catch the duplicate error, instead of doing a SELECT first to check if the row already exists and then INSERTing. It's faster and less error-prone to just INSERT and handle the IntegrityError. So unfortunately, there's nothing we can really do here to stop the logging from Postgres.
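A minimal sketch of the insert-first pattern described above, using sqlite3 purely for illustration (Sentry itself goes through the Django ORM; the table and function names here are hypothetical):

```python
import sqlite3

# Toy stand-in for sentry_eventuser with its (project_id, hash) unique constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eventuser (project_id INTEGER, hash TEXT, "
    "UNIQUE (project_id, hash))"
)

def get_or_create_eventuser(project_id, event_hash):
    """Try the INSERT blindly; a duplicate just means the row already exists."""
    try:
        conn.execute(
            "INSERT INTO eventuser (project_id, hash) VALUES (?, ?)",
            (project_id, event_hash),
        )
        return True   # row was created
    except sqlite3.IntegrityError:
        return False  # row already existed; Postgres would log an ERROR here

get_or_create_eventuser(41, "c62986fcfbcd1a26f3962f4cf8bf06bd")  # created
get_or_create_eventuser(41, "c62986fcfbcd1a26f3962f4cf8bf06bd")  # duplicate, handled
```

The application sees a clean "get or create" either way; the ERROR line in the server log is a side effect of the database raising the constraint violation before the application swallows it.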

Instead of doing a SELECT first to check if it already exists, then INSERT. It's faster and less error-prone to just INSERT and handle the IntegrityError. So unfortunately, there's nothing we can really do here to stop the logging from Postgres.

You are wrong, I am sure :-)

Anyway, the suggestion from the Sentry developers to a DBA is "hide all errors from Sentry and ignore them". Am I right?

If you have a better suggestion, we're all ears.

And yeah, we don't look at these errors ourselves since they're meaningless, and if they bubbled back up to the application, we'd see them there. We do track almost every other Postgres error ourselves, though.

Prime example: https://github.com/getsentry/sentry/blob/master/src/sentry/models/organizationonboardingtask.py#L51-L65

We do this, but cache it so it should only happen once per hour. Maybe you don't have a cache configured to prevent it?

I don't think we cache around the EventUser creation, though, from what I can tell.

We do this, but cache it so it should only happen once per hour. Maybe you don't have a cache configured to prevent it?

Thanks, I think you are right! I asked our admin, and it looks like we set up only SENTRY_CACHE, but not CACHES.

Maybe you can cache EventUser too, somehow?

Maybe you can cache EventUser too, somehow?

I can look into it. I'm not sure off the top of my head. This doesn't cause any problems for us in production, so it's definitely not high priority.

And no matter what, you're still going to get these errors, even with caching. Just less of them.

And no matter what, you're still going to get these errors, even with caching. Just less of them.

Yeah, but it's not such a big problem if it's, for example, 100 times per hour.

BTW, EventUser is already cached in https://github.com/getsentry/sentry/blob/6e821514e962d93968b8849d42d906d6c53ba1a9/src/sentry/event_manager.py#L788
and it uses the Django cache too, if I'm not mistaken:
https://github.com/getsentry/sentry/blob/6e821514e962d93968b8849d42d906d6c53ba1a9/src/sentry/utils/cache.py#L12-L14

I don't know what event_manager.py is, but maybe a correct CACHES setting will fix all my problems.

Ah, then yes. The problem with EventUser is that it will still spam quite a bit, since the number of permutations is large. Even with an hour-long cache, you might not get a great hit rate on them. It'll be significantly more than the onboarding tasks, but probably less than 70k. :)

After enabling cache I have now:

  12745 "sentry_eventuser_project_id_1a96e3b719e55f9a_uniq"
     15 "sentry_release_project_project_id_35add08b8e678ec7_uniq"
      9 "sentry_filtervalue_project_id_201b156195347397_uniq"
      8 "sentry_organizationonboar_organization_id_47e98e05cae29cf3_uniq"
      8 "sentry_grouprelease_group_id_46ba6e430d088d04_uniq"
      3 "sentry_environmentrelease_project_id_1762e5bd8b7a0675_uniq"

much better, but still too many...

Thank you for help with cache, Matt!

The number of errors has increased again (~34,000 per hour), and I use this trigger to hide them:

-- Before each INSERT into sentry_eventuser, check whether the row already
-- exists; if it does, raise log_min_messages for the current transaction so
-- the upcoming unique-violation ERROR is not written to the server log
-- (with log_min_messages = log, messages of severity ERROR are suppressed).
create or replace function x_hide_eventuser_uniq_violation_trg()
 returns trigger
 language plpgsql
 security definer  -- must run as a superuser (the owner) to SET log_min_messages
as $function$
begin
  perform 1 from public.sentry_eventuser t
   where t.project_id = new.project_id and t.hash = new.hash;
  if found then
    set local log_min_messages to log;
  end if;
  return new;
end
$function$;
alter function x_hide_eventuser_uniq_violation_trg() owner to postgres;

create trigger hide_uniq_violation
  before insert on sentry_eventuser
  for each row execute procedure x_hide_eventuser_uniq_violation_trg();