Dgraph: Q: What are my obligations for generating a consistent backup?

Created on 28 Feb 2018 · 6 Comments · Source: dgraph-io/dgraph

I would like to be able to back up a dgraph cluster hosted on AWS by instructing the cluster to go into a read-only state and flush any write buffers to disk, and then generate EBS snapshots of each disk rather than going through the documented export process. Compared with the documented export option, the copy-on-write semantics of EBS snapshots dramatically shorten the maintenance window that I'd need to take to perform a cluster backup.

Now, I realize that dgraph is not an AWS-only product, so I'm not asking anyone for support for my own particular use case.

However, I would like to understand my obligations for generating a consistent backup using the documented export option given this worrying comment.

1) Is there a way to instruct the cluster to reject writes and go into a read-only mode?
2) If not, do I need to gate writes to the cluster through a custom external service?
3) If not, can you clarify how the documented export will generate a consistent backup while concurrently applying mutations?

Thank you; I'm really interested in dgraph -- it's a good fit for my use case -- but I need clarity on backup handling and disaster recovery; otherwise, I'm going to end up using AWS Neptune (and flirt with writing a GraphQL to SPARQL transpiler, which I do not want to do).

All 6 comments

@politician Dgraph by design maintains versions of data to support transactions. Dgraph's export reads a snapshot of the database and it's read-only, so you don't need to worry about the ongoing mutations.

@janardhan1993 Thanks; that makes sense. Does the export filename contain the version number? Generally, given a cluster of 5 nodes, I understand that I have to collect the exported files off of each node and move them to, say, S3 for archival. Later, if I need to restore them, can I restore 1 of 5, or does the tool enforce that all 5 files are present?

  • The export file name doesn't contain the version number, though we could make it part of the file name if it's useful.

  • There would be an export file on every node which is the leader of a group, so if you had replicas, only one of them would write the exported file.

  • You can restore only one if you want to but that would mean you just load partial data for a group.
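Since restoring a subset of the files loads only partial data, it can help to check that every group is covered before starting a restore. The following is a minimal sketch of such a check, not part of Dgraph's tooling. The function name `missing_groups`, the group IDs, and the file paths are all hypothetical, and it assumes you record which group each collected file came from.

```python
def missing_groups(expected_groups, collected_files):
    """Return the sorted group IDs that have no collected export file.

    expected_groups: iterable of group IDs in the cluster (assumed known).
    collected_files: dict mapping group ID -> path of the file collected
                     from that group's leader (hypothetical bookkeeping).
    """
    return sorted(set(expected_groups) - set(collected_files))


# Hypothetical example: a cluster with three groups, one file not collected.
expected = [1, 2, 3]
collected = {
    1: "exports/group-1.rdf.gz",  # illustrative path, not a Dgraph naming rule
    3: "exports/group-3.rdf.gz",
}

missing = missing_groups(expected, collected)
if missing:
    print(f"Refusing to restore: no export file for groups {missing}")
```

An empty result means every group has a file, so a restore would not silently load partial data.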

Dgraph supports MVCC, so a given read timestamp would generate a consistent snapshot of the DB. We're soon going to launch backup support to S3 as an enterprise feature, which you can use for this purpose.
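To illustrate why a read timestamp yields a consistent snapshot even while mutations continue, here is a toy MVCC store. It is a sketch of the general idea only, not Dgraph's implementation: the class `VersionedStore` and its methods are invented for this example, and it uses a simple in-memory counter as the timestamp source.

```python
from collections import defaultdict


class VersionedStore:
    """Toy MVCC store: each write is kept as a (commit_ts, value) version."""

    def __init__(self):
        # key -> list of (commit_ts, value) in ascending commit order
        self._versions = defaultdict(list)
        self._clock = 0

    def write(self, key, value):
        """Append a new version and return its commit timestamp."""
        self._clock += 1
        self._versions[key].append((self._clock, value))
        return self._clock

    def read(self, key, read_ts):
        """Return the newest value committed at or before read_ts, else None."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= read_ts:
                return value
        return None

    def export(self, read_ts):
        """Return every key visible at read_ts; later writes are ignored."""
        snapshot = {}
        for key in self._versions:
            value = self.read(key, read_ts)
            if value is not None:
                snapshot[key] = value
        return snapshot


store = VersionedStore()
store.write("alice", "v1")
snapshot_ts = store.write("bob", "v1")  # the export will read at this timestamp
store.write("alice", "v2")              # a mutation that lands after snapshot_ts

exported = store.export(snapshot_ts)
print(exported)
```

The write to `alice` after `snapshot_ts` creates a new version rather than overwriting the old one, so the export at `snapshot_ts` still sees the earlier value.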

Can we do backups without enterprise features?

Hi @geoyws

Backups are only enabled when a valid license file is supplied to a Zero server OR within the thirty (30) day trial period, no exceptions.
