While debugging a problem with the Badger storage that an internal team at Red Hat was having, I changed our code to use Badger v2, and I have it working on a local branch.
One open question is data migration, since a v1 database isn't compatible with Badger v2. Should we attempt to provide an automatic migration tool? The migration can take hours and might require an admin to add resources for this one-off process. For instance, my laptop (16 GiB RAM) was completely tied up by it for about 40 minutes, and the migration failed several times. It would also require backing up the data before starting, which probably means provisioning a separate volume in container environments. In the end, users might decide that it's easier to start fresh with a new database.
During this migration, Jaeger has to be offline, no matter what we decide.
This is basically what needs to be done: https://github.com/dgraph-io/badger/blob/master/README.md#i-see-manifest-has-unsupported-version-x-we-support-y-error
cc @aditya-konarde, @burmanm
Note: we are not yet sure whether Badger v2 is actually better. I'm currently investigating a problem with a nil pointer and a possible memory leak.
Hey @jpkrohling ! I'm the maintainer of Badger DB and I'd love to help you guys migrate to Badger v2. Please feel free to ask for help.
As for the migration, the on-disk format is different between Badger v1.x and v2.x. The only way to migrate data is via backup and restore. You could trigger a backup using the db.Backup(...) API, which would create a new backup.bak file. Jaeger doesn't need to be offline for this to work. You might have to do some kind of housekeeping to move the data that was added after the backup was started.
Actually, you could do something better: Badger has a stream framework, so you can stream data from the Badger v1 DB to the Badger v2 DB. Here's how you would do it:
1. Open Badger v1 in read-only mode (read-write would also work, but then new data won't be visible to streams after they start).
2. Open Badger v2 in read-write mode but *do not write any data*.
3. Stream data from Badger v1 and use the stream writer to insert data into Badger v2.
We have to be careful here with the versions. If Badger is running with default options, everything should work; if not, we can write some tests to verify this.
Backup and restore use the stream framework and the stream writer, but here we can skip the steps of writing the backup to disk and reading it back from disk.
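The shape of the three steps can be sketched without touching Badger at all. The following is a minimal model using only the Go standard library; `KV`, `Source`, `Sink`, `Migrate`, `memSource`, and `memSink` are hypothetical names made up for illustration and are not part of the Badger API. It only shows the flow: a read-only source emits batches, an empty destination receives them, and a final flush completes the copy.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// KV is a hypothetical key/value pair; it is not a Badger type.
type KV struct {
	Key, Value string
}

// Source models the old database opened read-only: it can only emit batches.
type Source interface {
	Stream(batchSize int, send func([]KV) error) error
}

// Sink models the new, still empty database that receives the batches.
type Sink interface {
	Write(batch []KV) error
	Flush() error
}

// Migrate streams every batch from src into dst and flushes once at the end.
func Migrate(src Source, dst Sink, batchSize int) error {
	if batchSize <= 0 {
		return errors.New("batchSize must be positive")
	}
	if err := src.Stream(batchSize, dst.Write); err != nil {
		return fmt.Errorf("streaming failed: %w", err)
	}
	return dst.Flush()
}

// memSource is an in-memory stand-in for the old database.
type memSource map[string]string

func (m memSource) Stream(batchSize int, send func([]KV) error) error {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // emit keys in sorted order
	for start := 0; start < len(keys); start += batchSize {
		end := start + batchSize
		if end > len(keys) {
			end = len(keys)
		}
		batch := make([]KV, 0, end-start)
		for _, k := range keys[start:end] {
			batch = append(batch, KV{Key: k, Value: m[k]})
		}
		if err := send(batch); err != nil {
			return err
		}
	}
	return nil
}

// memSink is an in-memory stand-in for the new database.
type memSink struct {
	data    map[string]string
	flushed bool
}

func (s *memSink) Write(batch []KV) error {
	if s.flushed {
		return errors.New("write after flush")
	}
	for _, kv := range batch {
		s.data[kv.Key] = kv.Value
	}
	return nil
}

func (s *memSink) Flush() error {
	s.flushed = true
	return nil
}

func main() {
	src := memSource{"a": "1", "b": "2", "c": "3"}
	dst := &memSink{data: map[string]string{}}
	if err := Migrate(src, dst, 2); err != nil {
		panic(err)
	}
	fmt.Println(len(dst.data), dst.flushed)
}
```

In a real migration the source and sink would be the v1 stream and the v2 stream writer; how their key-value types map onto each other across major versions is exactly the part that would need the tests mentioned above.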
Thanks for your offer, @jarifibrahim! In general, Jaeger's data isn't critical and might have short TTLs (a couple of weeks at most). So, migration isn't a huge concern, at least not right now.
I'm working on the migration to v2 mostly because we had some stability issues with v1 (cc @aditya-konarde) and thought that v2 might solve them. I'm just back from a couple of months of leave and will ping you again in a few days. I'm sure I can use your help by then :)
Adding as possible for Jaeger 2.0, as this would be a breaking change if data migration was not possible.
There was also mention of a version 3 by the end of the year. We should investigate the timeline and whether it would also be incompatible with the previous version's storage format.
I was going to propose the same thing, maybe we should skip version 2 and just jump straight to 3?
@jarifibrahim Do you have a release date for badger v3?
@objectiser Badger v3 is planned to be released in late November.
@jarifibrahim Thanks. Is the storage format backward compatible with v2?
Unfortunately, no. All Badger major versions are incompatible with each other, because each one changes the storage format.
@jarifibrahim Hey! I see one problem that appeared in v2 and would cause panics in the current Jaeger integration.
The custom validity check on the iterator inside View uses it.Item() != nil. It is done here: https://github.com/jaegertracing/jaeger/blob/master/plugin/storage/badger/spanstore/reader.go#L623. But due to changes in v2, a nil internal item causes a panic: https://github.com/dgraph-io/badger/blob/master/iterator.go#L469. Can we have a function to check the item for nil, or a Valid function that checks whether the key is less than a given prefix?
@Vemmy124 You can use iter.Valid() or iter.ValidForPrefix() https://github.com/dgraph-io/badger/blob/d24777861adf1ec7185eeb034781590c03f3ef6a/iterator.go#L474-L488
@jarifibrahim Do you suggest using iter.ValidForPrefix("") to check whether the underlying item is nil (an empty prefix as a hack to make HasPrefix always return true, which makes it possible to tell whether the item is nil)? Isn't that overkill? Otherwise I cannot tell whether a false result from Valid() comes from a nil item or from a non-matching prefix.
@Vemmy124 iter.ValidForPrefix("") won't be overkill if item == nil. The ValidForPrefix function checks whether the item is nil and then checks the prefix: https://github.com/dgraph-io/badger/blob/d24777861adf1ec7185eeb034781590c03f3ef6a/iterator.go#L486-L488
If iter.Valid() returns false, iter.ValidForPrefix will also return false.
If you want to check whether the item is valid here https://github.com/jaegertracing/jaeger/blob/bd59f13478cdd35b41e85f67a6f04b6cb4d92df8/plugin/storage/badger/spanstore/reader.go#L623, use iter.Valid().
Please correct me if I'm wrong. I don't think I understand the issue correctly.
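To make the relationship between the two checks concrete, here is a minimal standard-library model of the behaviour described above. The `iterator` type and its fields are hypothetical stand-ins, not Badger's actual implementation, and the sketch assumes that Valid also applies the prefix configured in the iterator options. With no configured prefix, Valid and ValidForPrefix with an empty prefix both reduce to "is there a current item?".

```go
package main

import (
	"bytes"
	"fmt"
)

// iterator is a hypothetical stand-in for a Badger v2 iterator, reduced to
// the two fields that the validity checks look at.
type iterator struct {
	key    []byte // nil models "no current item" (iteration is exhausted)
	prefix []byte // prefix configured in the iterator options (assumed)
}

// Valid reports false when there is no current item; otherwise it reports
// whether the current key matches the configured prefix.
func (it *iterator) Valid() bool {
	if it.key == nil {
		return false
	}
	return bytes.HasPrefix(it.key, it.prefix)
}

// ValidForPrefix first applies Valid (including the nil check) and only
// then compares the key against the given prefix, so it never touches a
// missing item.
func (it *iterator) ValidForPrefix(prefix []byte) bool {
	return it.Valid() && bytes.HasPrefix(it.key, prefix)
}

func main() {
	exhausted := &iterator{}
	onSpan := &iterator{key: []byte("span/42")}

	fmt.Println(exhausted.Valid(), exhausted.ValidForPrefix(nil))
	fmt.Println(onSpan.Valid(), onSpan.ValidForPrefix(nil))
	fmt.Println(onSpan.ValidForPrefix([]byte("tag/")))
}
```

Under these assumptions, a loop condition that previously compared the item against nil can call Valid() instead, and only a caller that needs to distinguish "exhausted" from "wrong prefix" has to make both calls.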