Serde: Conditionally deserialize sub-struct based on visited values

Created on 30 Jan 2019 · 6 Comments · Source: serde-rs/serde

I need to deserialize the following list:

[{
  "t": "item.created",
  "v": 1,
  "data": {
    "id": 100,
    "name": "some"
  }
}, {
  "t": "item.deleted",
  "v": 1,
  "data": {
    "at": "1971-02-02T00:00:00Z"
  }
}]

into Vec<ItemEvent> where ItemEvent is defined as the following:

trait Event {
    fn event_type(&self) -> &'static str;
    fn event_version(&self) -> u8;
}

#[derive(Deserialize)]
struct Created {
    id: u32,
    name: String,
}
impl Event for Created {
    fn event_type(&self) -> &'static str {"item.created"}
    fn event_version(&self) -> u8 {1}
}

#[derive(Deserialize)]
struct Deleted {
    at: DateTime<Utc>,
}
impl Event for Deleted {
    fn event_type(&self) -> &'static str {"item.deleted"}
    fn event_version(&self) -> u8 {1}
}

enum ItemEvent {
    Created(Created),
    Deleted(Deleted),
}
impl Event for ItemEvent {
    fn event_type(&self) -> &'static str {
        match self {
            ItemEvent::Created(ev) => ev.event_type(),
            ItemEvent::Deleted(ev) => ev.event_type(),
        }
    }
    fn event_version(&self) -> u8 {
        match self {
            ItemEvent::Created(ev) => ev.event_version(),
            ItemEvent::Deleted(ev) => ev.event_version(),
        }
    }
}

I've checked or tried the following, with no success:

  1. Adjacent tagging does not fit: I have two tag fields (t and v), and the tag values come from the trait implementation, not from the enum.
  2. Since I need to inspect the t/v fields during deserialization, a custom Visitor seems to be required.
  3. All the Visitor implementation examples I've found:

    • either know the exact target type up front, so visit_map() just ends with Ok(MyType{});

    • or forward deserialization of _the whole_ MapAccess received in visit_map().

I've ended up with something like the following:

impl<'de> Deserialize<'de> for ItemEvent {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        #[derive(Deserialize)]
        #[serde(field_identifier)]
        enum Field {
            #[serde(rename = "t")]
            Type,
            #[serde(rename = "v")]
            Version,
            #[serde(rename = "data")]
            Data,
        };

        const FIELDS: &'static [&'static str] = &["t", "v", "data"];

        struct ItemEventVisitor;

        impl<'de> Visitor<'de> for ItemEventVisitor {
            type Value = ItemEvent;

            fn expecting(&self, f: &mut fmt::Formatter) -> fmt::Result {
                f.write_str("ItemEvent")
            }

            fn visit_map<M>(self, map: M) -> Result<ItemEvent, M::Error>
            where
                M: de::MapAccess<'de>,
            {
                let (mut t, mut v) = (None, None);
                let mut data = None;
                while let Some(key) = map.next_key()? {
                    match key {
                        Field::Type => {
                            if t.is_some() {
                                return Err(de::Error::duplicate_field("t"));
                            }
                            t = Some(map.next_value()?);
                        }
                        Field::Version => {
                            if v.is_some() {
                                return Err(de::Error::duplicate_field("v"));
                            }
                            v = Some(map.next_value()?);
                        }
                        Field::Data => {
                            if data.is_some() {
                                return Err(de::Error::duplicate_field("data"));
                            }
                            data = Some(map.next_value()?);
                        }
                    }
                }
                let t = t.ok_or_else(|| de::Error::missing_field("t"))?;
                let v = v.ok_or_else(|| de::Error::missing_field("v"))?;
                let data =
                    data.ok_or_else(|| de::Error::missing_field("data"))?;
                Ok(match (t, v) {
                    ("item.created", 1) => {
                        ItemEvent::Created(Created::deserialize(
                            de::value::MapAccessDeserializer::new(data),
                        )?)
                    },
                    ("item.deleted", 1) => {
                        ItemEvent::Deleted(Deleted::deserialize(
                            de::value::MapAccessDeserializer::new(data),
                        )?)
                    },
                    _ => return Err(de::Error::custom("🔥"));
                })
            }
        }

        deserializer.deserialize_struct("ItemEvent", FIELDS, ItemEventVisitor)
    }
}

which does not actually compile.

The question is: how can one forward deserialization inside visit_map() for just one field, rather than for the whole received MapAccess?


All 6 comments

I would write this as:

impl<'de> Deserialize<'de> for ItemEvent {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        #[derive(Deserialize, Debug)]
        enum EventType {
            #[serde(rename = "item.created")]
            Created,
            #[serde(rename = "item.deleted")]
            Deleted,
        }

        #[derive(Deserialize)]
        struct EventHelper {
            t: EventType,
            v: u32,
            data: serde_json::Value,
        }

        let helper = EventHelper::deserialize(deserializer)?;
        match (helper.t, helper.v) {
            (EventType::Created, 1) => Created::deserialize(helper.data)
                .map(ItemEvent::Created)
                .map_err(de::Error::custom),
            (EventType::Deleted, 1) => Deleted::deserialize(helper.data)
                .map(ItemEvent::Deleted)
                .map_err(de::Error::custom),
            (t, v) => Err(de::Error::custom(format!(
                "unrecognized version v={} for event {:?}",
                v, t
            ))),
        }
    }
}

@dtolnay thanks, but wouldn't that work only for JSON, since serde_json::Value is used? Is there a way to make it work for an arbitrary format?

You could use serde_value::Value in place of serde_json::Value.

@dtolnay I have a somewhat similar use case, but for quite a big struct with a lot of aliases.

So I was wondering if there is a way to do something like this, but without having to copy over the whole struct as a helper struct (as it's really pretty big)?

Using the same struct inside the deserialize method would create an infinite loop, right? So could this be done with a type alias or a newtype (I'm not familiar with these, so I have no clue whether that could work)?

Thanks!!

It may help to elaborate a little on how my use case is similar. I have a struct with a field holding an enum of enums, and which inner enum should be used depends on another field.

So my idea was (following the above approach) to deserialize the whole struct, but use a temp Value for the enum field. Once I have access to the other field, I can then deserialize the remaining field with the correct enum.

So something like this:

#[derive(Serialize, Deserialize, Clone, Debug)]
#[serde(untagged)]
pub enum Model {
    Model1(Serie1),
    Model2(Serie2),
}

#[derive(Serialize_repr, Deserialize_repr, Clone, Debug)]
#[allow(non_camel_case_types)]
#[repr(i8)]
pub enum Serie1 {
    #[serde(rename(deserialize = "SubModel 1"))]
    SubModel1 = 0,
    #[serde(rename(deserialize = "SubModel 2"))]
    SubModel2 = 1,
}

#[derive(Serialize_repr, Deserialize_repr, Clone, Debug)]
#[repr(i8)]
pub enum Serie2 {
    #[serde(rename(deserialize = "Model 123"))]
    Model1 = 0,
    #[serde(rename(deserialize = "Model 234"))]
    Model2 = 1,
    #[serde(rename(deserialize = "Model 345"))]
    Model3 = 2,
}

struct SomeDevice {
    driver: String,
    model: Model,
}

Of course this is very simplified, but I think the idea is clear. The issue is that several of the sub-enums have a variant that represents 0, so with #[serde(untagged)] the first matching variant is always used instead of the correct one (the one that belongs to the specific driver).

I hope this solution can also be used for this use case, but of course any other pointers are more than welcome as well! Thanks!

@dtolnay do you prefer that I open a new support issue for this instead?
