Hive: Protobuf

Created on 15 Nov 2019  ·  7 Comments  ·  Source: hivedb/hive

It looks like you have binary conversion for objects that writes directly to the device (RandomAccessFile?), and the HiveType annotation concept looks very protobuf-like with its enumerated fields. Have you given any thought to using protobuf as your binary intermediary? That might make efforts such as syncing with existing cloud persistence services a bit easier if those services use gRPC (such as Firestore). It would also offer a nice way to persist classes across platforms and to use data in custom platform channels, with simple class generation for iOS and Android built in thanks to the protoc plugin. This is certainly not a trivial request, but I was curious about your thoughts. Thanks.

question

All 7 comments

The prototype of Hive was using protobuf and Hive still relies on the same concepts.
The problem with protobuf was that it was not customizable enough.

Speed is one of the most important values for Hive and the custom implementation significantly improved performance and also reduced the required space.

One example is the ASCII keys: in Dart it is much faster to convert an ASCII string to and from its binary representation. Protobuf has no ASCII string type. This makes a huge difference for bigger boxes.
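To illustrate why ASCII-only keys can be cheap to convert, here is a minimal sketch. It is not Hive's actual implementation; `encodeAsciiKey` and `decodeAsciiKey` are hypothetical helpers written for this example. The point is only that every ASCII character occupies exactly one byte, so the conversion is a plain copy with no multi-byte handling.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical helper (not part of Hive): copies each code unit of an
/// ASCII-only string straight into one byte.
Uint8List encodeAsciiKey(String key) {
  final bytes = Uint8List(key.length);
  for (var i = 0; i < key.length; i++) {
    final unit = key.codeUnitAt(i);
    if (unit > 0x7F) {
      throw ArgumentError.value(key, 'key', 'contains non-ASCII characters');
    }
    bytes[i] = unit;
  }
  return bytes;
}

/// Hypothetical helper: each byte maps back to exactly one character.
String decodeAsciiKey(Uint8List bytes) => String.fromCharCodes(bytes);

void main() {
  final raw = encodeAsciiKey('user_42');
  print(raw.length); // 7: one byte per character
  print(decodeAsciiKey(raw)); // user_42

  // A general-purpose string type has to handle multi-byte characters:
  // 'ü' takes two bytes in UTF-8, so length in bytes != length in characters.
  print(utf8.encode('üser').length); // 5
}
```

Whether this explains the whole speed difference would need to be measured; the sketch only shows the structural simplification that an ASCII-only type allows.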

I understand that using protobuf would have certain advantages, but I also think that it is not a good idea to synchronize the binary format directly.

Actually, it is possible to use protobuf with Hive by providing a type adapter:

/// Assume [Foo] is a proto class.
class FooAdapter extends TypeAdapter<Foo> {
  @override
  int get typeId => 0;

  @override
  Foo read(BinaryReader reader) => Foo.fromBuffer(reader.readByteList());

  @override
  void write(BinaryWriter writer, Foo obj) =>
      writer.writeByteList(obj.writeToBuffer());
}

I have no idea what the performance impact would be, though.

@X-Wei There will be a performance impact but it is not too bad. Unfortunately, I don't see where this would be useful.

Protobufs can still be useful, especially when sharing data across different languages, e.g. if I want to scrape and massage data in Python and use it in Flutter.

Then, you should probably define your classes in the protobuf language instead of Dart and then use something like protobuf to use them in Dart.

Sure, I mean that after defining the protobuf messages and generating the Dart classes with protoc, I still want to use Hive for storing and accessing the proto messages in the Flutter app. This way we get the benefits of both the Hive and proto worlds.

In one of my projects, I was experimenting with CRDTs and built a synchronization layer in Dart. I used the message pack package (and its repo) to serialize all my messages.

When I was looking at your binary_writer, it did look similar. So maybe the message_pack spec could be useful?

This is what I tried out, and it worked quite nicely:

import 'dart:convert'; // json
import 'package:flutter/foundation.dart'; // listEquals

class Person {
  String name;
  int age;
  List<Person> friends;
  Person({
    this.name,
    this.age,
    this.friends,
  });

  Person copyWith({
    String name,
    int age,
    List<Person> friends,
  }) {
    return Person(
      name: name ?? this.name,
      age: age ?? this.age,
      friends: friends ?? this.friends,
    );
  }

  Map<String, dynamic> toMap() {
    return {
      'name': name,
      'age': age,
      'friends': friends?.map((x) => x?.toMap())?.toList(),
    };
  }

  static Person fromMap(Map<String, dynamic> map) {
    if (map == null) return null;

    return Person(
      name: map['name'],
      age: map['age'],
      // Guard against a missing list: List.from(null) would throw.
      friends: map['friends'] == null
          ? null
          : List<Person>.from(map['friends'].map((x) => Person.fromMap(x))),
    );
  }

  String toJson() => json.encode(toMap());

  static Person fromJson(String source) => fromMap(json.decode(source));

  @override
  String toString() => 'Person(name: $name, age: $age, friends: $friends)';

  @override
  bool operator ==(Object o) {
    if (identical(this, o)) return true;

    return o is Person &&
        o.name == name &&
        o.age == age &&
        listEquals(o.friends, friends);
  }

  @override
  int get hashCode => name.hashCode ^ age.hashCode ^ friends.hashCode;
}

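/// Stores a [Person] as a byte list produced by a MessagePack encoder.
/// `serialize` and `deserialize` are assumed to come from the MessagePack package.
class PersonAdapter extends TypeAdapter<Person> {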
  @override
  int get typeId => 0;

  @override
  Person read(BinaryReader reader) =>
      Person.fromMap(deserialize(reader.readByteList()));

  @override
  void write(BinaryWriter writer, Person obj) =>
      writer.writeByteList(serialize(obj.toMap()));
}

Obviously, the class attributes need to be serialized somehow. You could either use some form of fromMap/toMap (via the VS Code extension Dart Data Classes or any other code generator) or put the attributes in a list and let MessagePack serialize those.

I am not sure about the performance and the size in bytes, though; that still has to be evaluated. But MessagePack is quite established and has libraries in a lot of languages.
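As a rough way to start that evaluation, the sketch below compares the map form and the list form of the same record by encoded size. It deliberately uses JSON from `dart:convert` as a stand-in encoder, because it ships with the Dart SDK; the `encode` and `decode` helpers are hypothetical, and actual MessagePack sizes will differ. The structural point should carry over: the map form repeats the field names in every record, while the list form relies on position.

```dart
import 'dart:convert';

// Stand-in encoder (an assumption for this sketch): JSON as UTF-8 bytes.
// A MessagePack encoder would take its place in a real comparison.
List<int> encode(Object value) => utf8.encode(json.encode(value));

dynamic decode(List<int> bytes) => json.decode(utf8.decode(bytes));

void main() {
  // Map form: self-describing, but every record carries its field names.
  final asMap = {'name': 'Ada', 'age': 36};
  // List form: positional, so reader and writer must agree on field order.
  final asList = ['Ada', 36];

  final mapBytes = encode(asMap);
  final listBytes = encode(asList);
  print('map: ${mapBytes.length} bytes, list: ${listBytes.length} bytes');

  // Reading the list form back means indexing by position.
  final restored = decode(listBytes) as List;
  print('name=${restored[0]}, age=${restored[1]}');
}
```

The list form is smaller but harder to evolve, since adding or reordering fields changes the meaning of each position.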
