For types that contain a BTreeMap, serde_codegen::expand() generates deserialization code that is both broken and baffling. It appears to be the result of an index/offset/pointer error in the codegen itself.
This is a weird one, so let me know if I can provide any additional info.
This struct:
#[derive(Debug, Deserialize)]
pub struct Service {
    pub documentation: Option<String>,
    pub examples: Option<BTreeMap<String, String>>,
    pub metadata: Metadata,
    pub operations: BTreeMap<String, Operation>,
    #[serde(deserialize_with="ShapesMap::deserialize_shapes_map")]
    pub shapes: BTreeMap<String, Shape>,
    pub version: String,
}
when fed through serde_codegen::expand() produces (among otherwise functional and correct code) this visit_str function:
fn visit_str<__E>(&mut self, value: &str)
 -> ::std::result::Result<__Field, __E> where
 __E: _serde::de::Error {
    match value {
        "documentation" => {
            Ok(__Field::__field0)
        }
        "examples" => { Ok(__Field::__field1) }
        "metadata" => { Ok(__Field::__field2) }
        "AWS Marketplace Commerce Analytics" => {
            Ok(__Field::__field3)
        }
        "shapes" => { Ok(__Field::__field4) }
        "version" => { Ok(__Field::__field5) }
        _ => Ok(__Field::__ignore),
    }
}
Note how the match arm that should match the operations field instead matches the string "AWS Marketplace Commerce Analytics".
The _only_ place the string "AWS Marketplace Commerce Analytics" appears in our entire codebase is in a different function in the same source file, the relevant portion of which is:
impl Service {
    pub fn service_type_name(&self) -> &str {
        match &self.metadata.service_full_name[..] {
            [...snip...]
            "AWS Marketplace Commerce Analytics" => "MarketplaceCommerceAnalytics",
            [...snip...]
Interestingly, if I insert a comment in the Service struct we're feeding to the codegen, a completely different match arm in the same method is replaced with the wrong string:
fn visit_str<__E>(&mut self, value: &str)
 -> ::std::result::Result<__Field, __E> where
 __E: _serde::de::Error {
    match value {
        "documentation" => {
            Ok(__Field::__field0)
        }
        "Iot" => { Ok(__Field::__field1) }
        "metadata" => { Ok(__Field::__field2) }
        "operations" => { Ok(__Field::__field3) }
        "shapes" => { Ok(__Field::__field4) }
        "version" => { Ok(__Field::__field5) }
        _ => Ok(__Field::__ignore),
    }
}
In this case, "examples" is replaced by "Iot" (which also appears as a string in the match expression of the service_type_name function above).
The source file that we're feeding to serde_codegen is https://github.com/rusoto/rusoto/blob/master/codegen/src/botocore.in.rs
The issue can be reliably reproduced by creating a new Cargo project with an external dependency on any Rusoto feature (except S3), like:
[dependencies]
rusoto = {version = "0.18.0", features = ["sqs"]}
Like I said, this is a weird one. I'll be happy to help debug.
Thanks for the detailed report! This looks like the same root cause as #574, which we closed because we didn't have a way of reproducing the issue. Your issue looks much more promising in terms of us being able to reproduce and fix it. I agree, this is a really weird one. It is unclear whether this is a bug in Serde's codegen, in one of the libraries we depend on, or in the Rust compiler.
Can you also share the following in case I have trouble reproducing on my computer and it turns out to be relevant?
rustc --version --verbose
Never mind on the extra details, looks like all of that is included in the rusoto ticket you linked. I will take a look after work. This is my highest priority this week.
This case is derailing the pretty-printer. It is not hit anywhere else. Now to figure out (1) why it is there and (2) why it is wrong.
Fixed in #591. We were parsing the input to the build script and the output of the #[derive(Deserialize)] expansion using two different ParseSess sessions. The position of "operations" in the expanded code collided with the position of "AWS Marketplace Commerce Analytics" in the original code (both at position 1553), so the pretty printer decided to print one instead of the other. The fix uses the same ParseSess for both so that the pretty printer doesn't mix up values.
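For anyone trying to picture the collision, here is a toy model of it. This is only a sketch: `ToySession`, `add_literal`, and `literal_at` are hypothetical names invented for illustration and are not the compiler's ParseSess API, and the real pretty printer's lookup is certainly more involved. The only assumption carried over from the explanation above is that each session numbers positions independently, so the same position can mean different text in different sessions.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parse session: it hands out positions
/// and remembers which string literal starts at each one.
struct ToySession {
    next_pos: usize,
    literals: HashMap<usize, String>,
}

impl ToySession {
    /// Start numbering at `base`. Using 1553 below is a simplification so
    /// that the toy positions match the number quoted in the thread.
    fn starting_at(base: usize) -> Self {
        ToySession { next_pos: base, literals: HashMap::new() }
    }

    /// Record a literal at the next free position and return that position.
    fn add_literal(&mut self, text: &str) -> usize {
        let pos = self.next_pos;
        self.literals.insert(pos, text.to_string());
        self.next_pos += text.len();
        pos
    }

    /// Look up the literal that starts at `pos`, if any.
    fn literal_at(&self, pos: usize) -> Option<&str> {
        self.literals.get(&pos).map(|s| s.as_str())
    }
}

/// Two sessions: a position from the expanded code is resolved against
/// the session that parsed the original input, so the wrong text comes back.
fn print_with_separate_sessions() -> String {
    let mut input = ToySession::starting_at(1553);
    input.add_literal("AWS Marketplace Commerce Analytics");

    let mut expanded = ToySession::starting_at(1553);
    let pos = expanded.add_literal("operations");

    input.literal_at(pos).unwrap().to_string()
}

/// One shared session: positions are unique, so the lookup is unambiguous.
fn print_with_shared_session() -> String {
    let mut shared = ToySession::starting_at(1553);
    shared.add_literal("AWS Marketplace Commerce Analytics");
    let pos = shared.add_literal("operations");

    shared.literal_at(pos).unwrap().to_string()
}

fn main() {
    println!("separate sessions: {:?}", print_with_separate_sessions());
    println!("shared session:    {:?}", print_with_shared_session());
}
```

In this model the separate-session lookup returns the Marketplace string where "operations" was expected, and the shared-session lookup returns "operations". It would also be consistent with a small edit such as an added comment shifting positions and changing which literal collides, as seen with "Iot" above.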
I released serde_codegen 0.8.14 with the fix.