Onpremise: How do I backup Sentry 10+?

Created on 27 Jan 2020 · 22 Comments · Source: getsentry/onpremise

I've just finished setting up Sentry 10.1.0.dev0 (1713221b5d6f182853c0d71f51100464ceada7de) today, along with SAML authentication with my Keycloak server. This is all very nice and awesome, so I'd like to keep it around, even in the case of server failure.

Which leads me to the question in the title: with the new architecture (Kafka, Snuba, etc.), what do I need to include in my backups? A Postgres dump is already included; is there anything more?

Docs In Progress

All 22 comments

If you back up all the named volumes defined in the install script here:

https://github.com/getsentry/onpremise/blob/bc6d3b47e257057587e29153947c1ba223160416/install.sh#L72-L79

you should be good. The critical ones there are sentry-postgres and sentry-clickhouse. That said, Redis holds the stats and some in-flight data for task queues, and the same goes for Kafka and Zookeeper. sentry-data holds everything you have uploaded to the Sentry backend, such as avatars, source maps, or symbol files. Finally, sentry-symbolicator holds the cache for Symbolicator, which is not critical but is good for performance.

cc @mattrobenolt in case I'm missing anything.

Seems reasonable.

@nogweii
Can you please share the steps you took to set up SAML with Keycloak?

I tried everything for a week, but it still fails with this error:

Authentication error: SAML SSO failed, https://sentry.<myhost>/saml/metadata/sentry/ is not a valid audience for this Response

Thanks in advance for your help.

@mingfang this issue doesn't seem like the right place for that question. I strongly recommend using the forum for this.

Here are the steps for anyone trying to integrate Sentry SAML with Keycloak.

Keycloak

1-create client, Clients -> Create
Client ID = https://<sentry url>/saml/metadata/sentry/
Client Protocol = saml
Client SAML Endpoint = https://<sentry url>/saml/acs/sentry/
*must include trailing slash

2-edit the client created in #1 and set
IDP Initiated SSO URL Name = sentry

3-Remove Client Scopes
Assigned Default Client Scopes
select role_list
Remove selected

4-add username Mapper
Name = username
Mapper Type = User Property
Property = Username
SAML Attribute Name = username

5-add email Mapper
Name = email
Mapper Type = User Property
Property = Email
SAML Attribute Name = email

Sentry

1-Register Identity Provider -> IdP Data

2- Entity ID
Keycloak -> Realm Settings -> General -> Endpoints -> SAML 2.0 Identity Provider Metadata
entityID=https://<keycloak url>/auth/realms/<realm>

3- Single Sign On URL = https://<keycloak url>/auth/realms/<realm>/protocol/saml/clients/sentry

4- x509 public certificate
Keycloak -> Realm Settings -> Keys -> Certificate -> copy and paste long cert string

5- Attribute Mappings
IdP User ID = username
User Email = email

The list provided by @BYK is helpful:

https://github.com/getsentry/onpremise/blob/bc6d3b47e257057587e29153947c1ba223160416/install.sh#L72-L79

But as far as I understand docker, this is not the full story. Backing up docker volumes usually seems to happen by mounting that volume in a container that writes the contents of some mounted folder to the host filesystem.

In order to properly back up the data inside those volumes, one has to look through docker-compose.yml to see which parts of the filesystem require a backup. From what I can see, this boils down to:

sentry-data: /data
sentry-postgres: /var/lib/postgresql/data
sentry-redis: /data
sentry-zookeeper: /var/lib/zookeeper/data
sentry-kafka: /var/lib/kafka/data
sentry-clickhouse: /var/lib/clickhouse
sentry-symbolicator: /data

@MarcusRiemer

Backing up docker volumes usually seems to happen by mounting that volume in a container that writes the contents of some mounted folder to the host filesystem.

Backing up to the host file system should only be temporary; the files should then be moved to network or cloud storage. I've created a helper to push the files directly from the container to S3. But backing up files isn't the right approach for databases. It may work for Redis with AOF, but I'm sure it breaks for Postgres unless the database is offline. From my point of view, using specialized tools is always better.

So, backing up Sentry _online_ is a very complex task. I'm not even sure it's possible to create a _consistent_ backup in this case.

Oh yes, that is of course correct. I implicitly assumed that the backup would be done when Sentry is down.

Otherwise one has to somehow coordinate the state of the different data stores and should probably use dedicated tools like pg_dump.

Docker folks recommend the "extract from container" method here: https://docs.docker.com/storage/volumes/#backup-restore-or-migrate-data-volumes
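The approach from the Docker documentation linked above can be sketched roughly as follows. This is a minimal sketch, not an official Sentry procedure: the `alpine` helper image, the function names `backup_volume` and `restore_volume`, and the `/volume` and `/backup` mount points are assumptions chosen for illustration, and it presumes the Sentry containers are stopped while it runs.

```shell
#!/bin/sh
# Sketch only: archive a named Docker volume into a tarball on the host.
# Assumptions (not from the thread): the `alpine` helper image, the function
# names, and the /volume and /backup mount points.

# backup_volume VOLUME DEST_DIR
# Writes DEST_DIR/VOLUME.tar.gz from the volume's contents.
backup_volume() {
    volume="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    # Docker needs an absolute host path for a bind mount.
    dest_abs="$(cd "$dest_dir" && pwd)" || return 1
    docker run --rm \
        -v "$volume":/volume:ro \
        -v "$dest_abs":/backup \
        alpine tar czf "/backup/$volume.tar.gz" -C /volume .
}

# restore_volume VOLUME SRC_DIR
# Unpacks SRC_DIR/VOLUME.tar.gz into the (ideally empty) volume.
restore_volume() {
    volume="$1"
    src_abs="$(cd "$2" && pwd)" || return 1
    docker run --rm \
        -v "$volume":/volume \
        -v "$src_abs":/backup:ro \
        alpine tar xzf "/backup/$volume.tar.gz" -C /volume
}
```

For example, `backup_volume sentry-postgres ./backups` would write `./backups/sentry-postgres.tar.gz`. Because the named volume is mounted into a helper container, the mount point inside that container is arbitrary and does not have to match the path the Sentry service itself uses.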

We can obviously improve this but don't have the resources to invest into it currently. If anyone is willing to give a helping hand, we'd definitely review and guide the patch.

Are these assumptions true?

  • Backups of sentry-postgres and sentry-data will contain all non-event sentry config, e.g. users, avatars, etc. Restoring only these two would basically restore sentry config completely, but without any actual event data.
  • A backup of sentry-clickhouse contains all event data, other than events currently in the incoming queue.
  • A backup of sentry-redis holds stats; these aren't essential for a restore, but they can be preserved by restoring the redis volume.
  • sentry-zookeeper and sentry-kafka only hold data about incoming events; if it's acceptable to lose any in-flight events at the time of backup, these don't need to be backed up.
  • The sentry-symbolicator cache will be rebuilt if not restored.

Trying to figure out what makes sense for a periodic backup with some tolerance for in-flight event loss. Assuming the above are true, it seems like sentry-postgres, sentry-data, sentry-clickhouse, and sentry-redis would be sufficient to preserve most data, and only sentry-postgres and sentry-data are needed if restoring to the same config with a clean event slate is desired. True?

Only the following two are not correct/accurate:

A backup of sentry-redis holds stats; these aren't essential for a restore, but they can be preserved by restoring the redis volume.

Now the stats are also held in Clickhouse. Redis holds in-flight or pending job data and some other stuff like sessions etc. You may still lose things if you don't restore this but it is likely that they won't be disastrous.

sentry-zookeeper and sentry-kafka only hold data about incoming events; if it's acceptable to lose any in-flight events at the time of backup, these don't need to be backed up.

Kafka is now the main communication pipeline between services so it holds any in-flight data between these. These can be events to be processed, events to be post-processed, event outcomes, and soon session information for release health. None of this data should be terrible to lose but again, you may lose some real data if these are purged.
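Putting the corrected assumptions together, the choice of volumes for a periodic backup could be expressed as below. This is only a sketch: the tier names `config`, `data`, and `full` and the function `volumes_for_tier` are hypothetical labels invented for this illustration, while the volume names are the ones discussed in this thread.

```shell
#!/bin/sh
# Sketch only: map a backup "tier" to the named volumes discussed in this thread.
# The tier names and the function name are hypothetical, not part of Sentry.
volumes_for_tier() {
    case "$1" in
        config)
            # Users, settings, and uploaded files; no event data.
            echo "sentry-postgres sentry-data" ;;
        data)
            # Adds event data and stats (ClickHouse).
            echo "sentry-postgres sentry-data sentry-clickhouse" ;;
        full)
            # Adds in-flight/pending data (Redis, Kafka, Zookeeper) and the
            # Symbolicator cache, which is rebuilt if missing.
            echo "sentry-postgres sentry-data sentry-clickhouse sentry-redis sentry-kafka sentry-zookeeper sentry-symbolicator" ;;
        *)
            echo "unknown tier: $1" >&2
            return 1 ;;
    esac
}
```

A caller could then loop over the result, for instance `for v in $(volumes_for_tier data); do ...; done`, and skip the in-flight stores when losing queued events is acceptable.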

Perfect, thanks for the quick feedback!

Hey, has anyone set up a "persistent PostgreSQL backup by calling pg_dump or such before backing up" (and possibly the same for Redis with a BGSAVE)? If so, could you share the commands to run when Sentry is started via Docker?

I mounted volumes in directory for easing backups. But it looks like using volumes for sentry-zookeeper and sentry-kafka caused issues. So I got back to default configuration for these containers.

The issue was that the post processors showed errors when following these steps: https://github.com/getsentry/onpremise/issues/478#issuecomment-666254392

I mounted volumes in directory for easing backups. But it looks like using volumes for sentry-zookeeper and sentry-kafka caused issues. So I got back to default configuration for these containers.

This is most probably due to some permission or user id conflicts. We'll be having a back-up and restore guide in the coming months.

https://github.com/getsentry/onpremise/issues/364#issuecomment-578887863 Do the containers need to be stopped before creating the backups of the volumes? Or do you think it's fine to keep them running? I'm just asking because I'm a little bit afraid of data corruption if there are new events during the backup process

To be safe, yeah, they should be stopped.
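A stop, archive, start sequence along those lines might look like the sketch below. It assumes the script is run from the onpremise checkout (where docker-compose.yml lives); the function name `offline_backup`, the `alpine` helper image, and the archive layout are illustrative assumptions rather than anything the maintainers prescribe.

```shell
#!/bin/sh
# Sketch only: stop the stack, archive the given volumes, start it again.
# Assumptions: run from the directory containing docker-compose.yml; the
# function name, helper image, and archive layout are illustrative.

# offline_backup DEST_DIR VOLUME...
offline_backup() {
    dest_dir="$1"; shift
    mkdir -p "$dest_dir" || return 1
    dest_abs="$(cd "$dest_dir" && pwd)" || return 1
    # Stop all services so no store is written to during the copy.
    docker-compose stop || return 1
    status=0
    for volume in "$@"; do
        docker run --rm -v "$volume":/volume:ro -v "$dest_abs":/backup \
            alpine tar czf "/backup/$volume.tar.gz" -C /volume . || status=1
    done
    # Bring the services back even if an archive step failed.
    docker-compose start || status=1
    return "$status"
}
```

Usage could be `offline_backup /srv/backups sentry-postgres sentry-clickhouse sentry-data`. The trade-off is downtime: events sent while the services are stopped are not accepted.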

Is there any good solution to back up ClickHouse while the containers are running? Postgres is no problem.

Can anyone tell me the command to do a Postgres backup to another directory for Sentry?

We use docker exec -t $POSTGRES_CONTAINER_NAME pg_dump -c -U postgres postgres | gzip > $BACKUP_PATH
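Wrapped as functions, with a matching restore, that command might be sketched like this. The function names `pg_backup` and `pg_restore_dump` are made up for illustration, and the database and user are both assumed to be `postgres`, as in the command above. The sketch drops `-t`, since allocating a pseudo-TTY can alter the byte stream that ends up in the pipe.

```shell
#!/bin/sh
# Sketch only: dump and restore the Sentry Postgres database via docker exec.
# Assumptions: database and user are both `postgres`, as in the command above;
# the function names are illustrative.

# pg_backup CONTAINER OUTPUT_FILE
pg_backup() {
    # No -t here: a pseudo-TTY can alter the byte stream written to the pipe.
    docker exec "$1" pg_dump -c -U postgres postgres | gzip > "$2"
}

# pg_restore_dump CONTAINER INPUT_FILE
pg_restore_dump() {
    # -i keeps stdin open so psql can read the SQL from the pipe.
    gunzip -c "$2" | docker exec -i "$1" psql -U postgres postgres
}
```

Note that in plain sh the exit status of the dump pipeline is that of gzip, so a failed pg_dump can go unnoticed; checking that the resulting file is not trivially small is a prudent extra step.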

It is not clear how to back up all configuration and login information without the actual data.
