I have a question about getting a vector of bytes out of the passed data during deserialization. At the moment Serde returns strings for the &[u8] and Vec<u8> types. I understand that some things are also not available for vectors because Rust doesn't have specialization right now.
If I want a BERT deserializer to support Vec<u8>, does that mean I need to define my own visitor implementing the de::SeqVisitor trait (as described there)? The BERT deserializer has a parse_binary method which copies the binary data into a buffer and returns it to the caller:
use std::io::{self, Read};

use byteorder::{BigEndian, ReadBytesExt};
use serde::de::{self, EnumVisitor, Visitor, Deserialize};

use super::errors::{Error, Result};

pub struct Deserializer<R: Read> {
    reader: R,
    header: Option<u8>,
}

impl<R: Read> Read for Deserializer<R> {
    #[inline]
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.reader.read(buf)
    }
}

impl<R: Read> de::Deserializer for Deserializer<R> {
    type Error = Error;

    forward_to_deserialize! {
        bool usize u8 u16 u32 u64 isize i8 i16 i32 i64 f32 f64 char
        str string unit seq seq_fixed_size bytes option map unit_struct
        tuple_struct struct struct_field tuple ignored_any newtype_struct
    }

    #[inline]
    fn deserialize<V: Visitor>(&mut self, visitor: V) -> Result<V::Value> {
        // Read the tag byte lazily, then dispatch on it.
        if self.header.is_none() {
            self.header = Some(try!(self.read_u8()));
        }
        let result = self.parse_value(visitor);
        self.header = None;
        result
    }

    #[inline]
    fn deserialize_enum<V: EnumVisitor>(
        &mut self, _enum: &'static str, _variants: &'static [&'static str],
        _visitor: V
    ) -> Result<V::Value> {
        Err(Error::UnsupportedType)
    }
}

impl<R: Read> Deserializer<R> {
    /// Creates the BERT parser from an `std::io::Read`.
    #[inline]
    pub fn new(reader: R) -> Deserializer<R> {
        Deserializer {
            reader: reader,
            header: None,
        }
    }

    /// The `Deserializer::end` method should be called after a value has
    /// been fully deserialized. This allows the `Deserializer` to validate
    /// that the input stream is at the end.
    #[inline]
    pub fn end(&mut self) -> Result<()> {
        if try!(self.read(&mut [0; 1])) == 0 {
            Ok(())
        } else {
            Err(Error::TrailingBytes)
        }
    }

    #[inline]
    fn parse_value<V: Visitor>(&mut self, visitor: V) -> Result<V::Value> {
        let header = self.header.unwrap();
        self.header = None;
        match header {
            109 => self.parse_binary(header, visitor),
            _ => Err(Error::InvalidTag)
        }
    }

    // Example data: [0, 0, 0, 5, 118, 97, 108, 117, 101]
    // The first 4 bytes are the length; the rest is the data
    // (not necessarily a string).
    #[inline]
    fn parse_binary<V: Visitor>(&mut self, _header: u8, mut visitor: V) -> Result<V::Value> {
        let length = try!(self.read_i32::<BigEndian>());
        let mut buffer = vec![0; length as usize];
        try!(self.reader.read_exact(&mut buffer));
        // No trailing semicolon: the visitor's result is the return value.
        visitor.visit_byte_buf(buffer)
    }
}
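The wire layout that parse_binary handles (a tag byte of 109, a 4-byte big-endian length, then the raw payload) can be sketched without serde or byteorder. This is only an illustration using the standard library: read_binary_term and BINARY_TAG are hypothetical names, and the length is read as a u32 here rather than the i32 used in the code above.

```rust
use std::io::{self, Read};

/// Tag byte that marks a binary term (109 in the code above).
const BINARY_TAG: u8 = 109;

/// Hypothetical helper: reads one binary term laid out as
/// [tag, 4-byte big-endian length, payload bytes].
fn read_binary_term<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    // Check the tag first.
    let mut tag = [0u8; 1];
    reader.read_exact(&mut tag)?;
    if tag[0] != BINARY_TAG {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "not a binary term"));
    }

    // Assumption: the length is treated as unsigned in this sketch.
    let mut len_bytes = [0u8; 4];
    reader.read_exact(&mut len_bytes)?;
    let length = u32::from_be_bytes(len_bytes) as usize;

    // The payload is copied as-is; it is not necessarily a string.
    let mut payload = vec![0u8; length];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Feeding it the bytes 109, 0, 0, 0, 5, 118, 97, 108, 117, 101 through a std::io::Cursor should give back the five payload bytes of "value".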
At the moment Serde returns strings for &[u8] and Vec<u8> types
Serde only does that when you deserialize to a String.
If I'm going to make a support for a BERT deserializer for the Vec<u8>, does it mean that necessary to define my own Visitor for a de::SeqVisitor trait (like it specified there)? The BERT deserializer have a parse_binary method which copies a binary data into buffer and returns it to a caller:
Yup, that's the way to go. You also need to implement deserialize_seq and call visitor.visit_seq(your_seq_visitor).
The code compiles successfully, but my test for a binary fails with the following error:
---- test_deserializers::test_deserialize_binary stdout ----
thread 'test_deserializers::test_deserialize_binary' panicked at 'called `Result::unwrap()` on an `Err` value: Custom("Invalid type. Expected `Seq`")', ../src/libcore/result.rs:788
At the moment, BinarySeqVisitor and the code that invokes it from Deserializer<R: Read> look like this:
impl<R: Read> Deserializer<R> {
    // ... some other implementation of Deserializer<R: Read>

    #[inline]
    fn parse_binary<V: Visitor>(
        &mut self, _header: u8, mut visitor: V
    ) -> Result<V::Value> {
        let length = self.read_i32::<BigEndian>().unwrap() as usize;
        visitor.visit_seq(BinarySeqVisitor::new(self, Some(length)))
    }
}

struct BinarySeqVisitor<'a, R: 'a + Read> {
    de: &'a mut Deserializer<R>,
    length: Option<usize>
}

impl<'a, R: 'a + Read> BinarySeqVisitor<'a, R> {
    #[inline]
    fn new(de: &'a mut Deserializer<R>, length: Option<usize>) -> Self {
        BinarySeqVisitor { de: de, length: length }
    }
}

impl<'a, R: Read> de::SeqVisitor for BinarySeqVisitor<'a, R> {
    type Error = Error;

    fn visit<T: Deserialize>(&mut self) -> Result<Option<T>> {
        match self.length {
            Some(0) => return Ok(None),
            Some(ref mut len) => *len -= 1,
            _ => {}
        };
        match Deserialize::deserialize(self.de) {
            Ok(value) => Ok(Some(value)),
            Err(e) => Err(e)
        }
    }

    fn end(&mut self) -> Result<()> {
        if let Some(0) = self.length {
            Ok(())
        } else {
            Err(Error::TrailingBytes)
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        match self.length {
            Some(len) => (len, self.length),
            None => (0, Some(0))
        }
    }
}

impl<'a, R: Read> de::Visitor for BinarySeqVisitor<'a, R> {
    type Value = Vec<u8>;

    #[inline]
    fn visit_unit<E>(&mut self) -> std::result::Result<Vec<u8>, E>
        where E: de::Error,
    {
        Ok(Vec::new())
    }

    #[inline]
    fn visit_seq<V>(&mut self, mut visitor: V) -> std::result::Result<Vec<u8>, V::Error>
        where V: de::SeqVisitor,
    {
        let mut values = Vec::with_capacity(visitor.size_hint().0);
        while let Some(value) = try!(visitor.visit()) {
            values.push(value);
        }
        try!(visitor.end());
        Ok(values)
    }
}
Any ideas how to fix it?
That means that the visitor you are passing to parse_binary<V: Visitor> isn't one that expects to receive a sequence.
Which signature does that mean I should change? The code that invokes parse_binary and passes the visitor to it? Or perhaps parse_value? I really can't work out how to change the code so that it works as I expect.
@dtolnay could you share a bit more information about how to write my own vector/sequence visitor given the current state of my codebase?
@Relrin I'm not entirely sure, as I see no calls to parse_binary. Do you have the code uploaded somewhere?
Note that with serde it's often useful to use a debugger to step through the code (or even just spray logging statements everywhere).
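One cheap way to "spray logging statements" in a reader-based deserializer is to wrap the underlying reader so that every read is reported. The sketch below uses only the standard library; LoggingReader is a hypothetical name and is not part of serde or of the code in this thread.

```rust
use std::io::{self, Read};

/// Hypothetical wrapper that logs every chunk of bytes pulled from `inner`
/// together with the offset at which the chunk started.
struct LoggingReader<R> {
    inner: R,
    consumed: usize,
}

impl<R: Read> LoggingReader<R> {
    fn new(inner: R) -> Self {
        LoggingReader { inner, consumed: 0 }
    }
}

impl<R: Read> Read for LoggingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        // Log to stderr so the trace does not mix with normal output.
        eprintln!("offset {}: read {:?}", self.consumed, &buf[..n]);
        self.consumed += n;
        Ok(n)
    }
}
```

Passing LoggingReader::new(reader) wherever a plain reader was passed shows which bytes each parsing step consumed, which makes a stray tag or length read easy to spot.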
@oli-obk The current code is part of the following pull request, which I linked in this issue.
@dtolnay Do you have any ideas how to fix this code?
I've tried changing the logic of the parse_binary method to:
#[inline]
fn parse_binary<V: Visitor>(
    &mut self, _header: u8, mut visitor: V
) -> Result<V::Value> {
    let length = self.read_i32::<BigEndian>().unwrap() as usize;
    let seq_visitor = BinarySeqVisitor::new(self, Some(length));
    seq_visitor.visit_seq(self)
}
But the compiler generates the following errors:
error[E0277]: the trait bound `deserializers::Deserializer<R>: serde::de::SeqVisitor` is not satisfied
--> /Users/savicvalera/code/bert-rs/bert/src/deserializers.rs:132:21
|
132 | seq_visitor.visit_seq(self)
| ^^^^^^^^^
|
= note: required because of the requirements on the impl of `serde::de::SeqVisitor` for `&mut deserializers::Deserializer<R>`
error[E0308]: mismatched types
--> /Users/savicvalera/code/bert-rs/bert/src/deserializers.rs:132:9
|
132 | seq_visitor.visit_seq(self)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected associated type, found struct `std::vec::Vec`
|
= note: expected type `std::result::Result<<V as serde::de::Visitor>::Value, errors::Error>`
= note: found type `std::result::Result<std::vec::Vec<u8>, _>`
If you expect to be able to deserialize a String from a bert value of type Binary, then your deserializer needs to call one of visit_str, visit_string, visit_bytes or visit_byte_buf because those are the ones that StringVisitor supports. For example something like this:
#[inline]
fn parse_binary<V: Visitor>(
    &mut self, _header: u8, mut visitor: V
) -> Result<V::Value> {
    let length = self.read_i32::<BigEndian>().unwrap() as usize;
    let mut buf = vec![0; length];
    // read_exact fills the whole buffer; a plain read may stop early.
    try!(self.read_exact(&mut buf));
    visitor.visit_byte_buf(buf)
}
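One detail worth checking whenever a length-prefixed payload is read into a buffer: Read::read is allowed to return after filling only part of the buffer, whereas Read::read_exact keeps reading until the buffer is full or returns an error. The sketch below shows the difference with a deliberately awkward reader; OneByteAtATime, fill_with_read and fill_with_read_exact are hypothetical names used only for this illustration.

```rust
use std::io::{self, Read};

/// Hypothetical reader that hands out at most one byte per `read` call,
/// imitating a slow socket or pipe.
struct OneByteAtATime<'a> {
    data: &'a [u8],
}

impl<'a> Read for OneByteAtATime<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.data.is_empty() || buf.is_empty() {
            return Ok(0);
        }
        buf[0] = self.data[0];
        self.data = &self.data[1..];
        Ok(1)
    }
}

/// Fills a buffer with a single `read` call and reports how many bytes
/// were actually written into it.
fn fill_with_read<R: Read>(reader: &mut R, len: usize) -> io::Result<(Vec<u8>, usize)> {
    let mut buf = vec![0u8; len];
    let n = reader.read(&mut buf)?;
    Ok((buf, n))
}

/// Fills a buffer with `read_exact`, which loops until `len` bytes arrive
/// or fails if the input ends early.
fn fill_with_read_exact<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

With this reader, a single read leaves four of the five bytes of "value" as zeros, while read_exact returns the full payload and reports an error when the input is shorter than the declared length.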
@dtolnay I already had functionality that returns a String to the user after parsing a binary. Now I'm trying to extract the bytes from the binary and return them as is (in our case as a Vec<u8>), but I can't figure out how to do it correctly. The comments above describe the current state of the parse_binary method, which tries to use my sequence deserializer but fails.
That won't work because String can only be deserialized through visit_str / visit_string / visit_bytes / visit_byte_buf and Vec<u8> can only be deserialized through visit_seq, so parse_binary needs to decide whether to support String or Vec<u8>.
Once we implement specialization for Vec<u8>
I filed #555 to implement visit_seq for deserializing String, and that way it will work for both String and Vec<u8>.
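The dispatch problem discussed here can be modelled with a toy visitor trait. This is not serde's API: the Visitor trait, StringVisitor, ByteVecVisitor, offer_as_bytes and offer_as_seq below are simplified, hypothetical stand-ins that only show why a format has to choose which callback to invoke for a binary payload.

```rust
/// Simplified stand-in for a visitor: every callback rejects by default,
/// and each target type overrides only the shapes it can accept.
trait Visitor: Sized {
    type Value;

    fn visit_byte_buf(self, _bytes: Vec<u8>) -> Result<Self::Value, String> {
        Err("visitor does not accept a byte buffer".to_string())
    }

    fn visit_seq<I: Iterator<Item = u8>>(self, _items: I) -> Result<Self::Value, String> {
        Err("visitor does not accept a sequence".to_string())
    }
}

/// Accepts a byte buffer only, like the String case in this thread.
struct StringVisitor;

impl Visitor for StringVisitor {
    type Value = String;

    fn visit_byte_buf(self, bytes: Vec<u8>) -> Result<String, String> {
        String::from_utf8(bytes).map_err(|e| e.to_string())
    }
}

/// Accepts a sequence only, like the Vec<u8> case in this thread.
struct ByteVecVisitor;

impl Visitor for ByteVecVisitor {
    type Value = Vec<u8>;

    fn visit_seq<I: Iterator<Item = u8>>(self, items: I) -> Result<Vec<u8>, String> {
        Ok(items.collect())
    }
}

/// The format offers the payload as one owned byte buffer.
fn offer_as_bytes<V: Visitor>(payload: &[u8], visitor: V) -> Result<V::Value, String> {
    visitor.visit_byte_buf(payload.to_vec())
}

/// The format offers the payload as a sequence of u8 elements.
fn offer_as_seq<V: Visitor>(payload: &[u8], visitor: V) -> Result<V::Value, String> {
    visitor.visit_seq(payload.iter().copied())
}
```

In this model offer_as_bytes succeeds for StringVisitor and fails for ByteVecVisitor, and offer_as_seq does the opposite. Whichever callback the format picks, one of the two target types is rejected unless it also learns to accept the other shape.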
Your implementation in the PR (including parse_binary) is correct except for this part:
match Deserialize::deserialize(self.de) {
    Ok(value) => Ok(Some(value)),
    Err(e) => Err(e)
}
According to your test, you expect the binary data to look like this:
109, // binary
0, 0, 0, 5, // length
118, // "v"
97, // "a"
108, // "l"
117, // "u"
101 // "e"
But your implementation expects the following:
109, // binary
0, 0, 0, 5, // length
97, // unsigned integer
118, // "v"
97, // unsigned integer
97, // "a"
97, // unsigned integer
108, // "l"
97, // unsigned integer
117, // "u"
97, // unsigned integer
101 // "e"
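The mismatch between the two layouts above can be reproduced with a small standard-library sketch. read_tagged_elements and read_raw_elements are hypothetical helpers: the first expects every element to be a full term with tag 97 (labelled "unsigned integer" above), the second reads bare bytes.

```rust
use std::io::{self, Read};

/// Tag labelled "unsigned integer" in the layout above.
const SMALL_INT_TAG: u8 = 97;

/// Reads a single byte from the reader.
fn read_u8<R: Read>(reader: &mut R) -> io::Result<u8> {
    let mut byte = [0u8; 1];
    reader.read_exact(&mut byte)?;
    Ok(byte[0])
}

/// Reads `len` elements, each expected as a tagged term: [97, value].
fn read_tagged_elements<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    let mut out = Vec::with_capacity(len);
    for _ in 0..len {
        let tag = read_u8(reader)?;
        if tag != SMALL_INT_TAG {
            let msg = format!("invalid tag {}", tag);
            return Err(io::Error::new(io::ErrorKind::InvalidData, msg));
        }
        out.push(read_u8(reader)?);
    }
    Ok(out)
}

/// Reads `len` elements as bare bytes with no per-element tag.
fn read_raw_elements<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    let mut out = Vec::with_capacity(len);
    for _ in 0..len {
        out.push(read_u8(reader)?);
    }
    Ok(out)
}
```

On the payload 118, 97, 108, 117, 101 the tagged reader fails on the very first byte, because 118 is not the tag 97, while the raw reader returns the five bytes of "value".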
The fix is to deserialize from something that knows to read a u8 without needing to see the "unsigned integer" tag every time:
impl<'a, R: Read> de::SeqVisitor for BinarySeqVisitor<'a, R> {
    fn visit<T: Deserialize>(&mut self) -> Result<Option<T>> {
        match self.length {
            Some(0) => return Ok(None),
            Some(ref mut len) => *len -= 1,
            None => {}
        };
        Deserialize::deserialize(self).map(Some)
    }

    // ...
}

impl<'a, R: Read> de::Deserializer for BinarySeqVisitor<'a, R> {
    type Error = Error;

    fn deserialize<V>(&mut self, mut visitor: V) -> Result<V::Value>
        where V: Visitor
    {
        visitor.visit_u8(try!(self.de.read_u8()))
    }

    forward_to_deserialize! {
        bool usize u8 u16 u32 u64 isize i8 i16 i32 i64 f32 f64 char str string
        unit option seq seq_fixed_size bytes map unit_struct newtype_struct
        tuple_struct struct struct_field tuple enum ignored_any
    }
}
Then in the unit test:
let binary: Vec<u8> = binary_to_term(&data).unwrap();
assert_eq!(b"value", binary.as_slice());
Side note: you can get rid of the impl de::Visitor for BinarySeqVisitor<'a, R> because nothing uses it.
Thanks @oli-obk and @dtolnay for helping me out! It works now 🙂