Custom INI parser does not support flattened structs

TL;DR: https://github.com/serde-rs/serde/issues/1183

Problem

We currently use flattened structs for the v2 implementation of the BUILDINFO format. Since it reuses most of the existing v1 fields, I thought it would be a good idea to use serde's struct flattening (i.e. the flatten attribute) as follows:

use serde::Deserialize;

#[derive(Deserialize)]
struct BuildInfoV1 {
    format: Schema,
    pkgname: Name,
    installed: Vec<InstalledPackage>,
    // stripped rest for brevity
}

#[derive(Deserialize)]
struct BuildInfoV2 {
    #[serde(flatten)] // carries over all fields from BuildInfoV1
    v1: BuildInfoV1,
}

This works in theory and our current test suite passes:

format = 2
pkgname = foo
installed = foo-1.2.3-1-any
installed = bar-1.2.3-1-any

However, I hit a scenario in which this does not work because the type cannot be inferred (note the single installed entry):

format = 2
pkgname = foo
installed = bar-1.2.3-1-any

Error: DeserializeError(Custom("invalid type: string "bar-1.2.3-1-any", expected a sequence")

Apparently, the flatten feature works by first capturing all the data generically using deserialize_any. Our deserializer gets no type hint at that point, so it hands over a single installed value as a String. The field itself expects the actual type (i.e. a sequence), so an error is thrown when a String shows up instead.

This also happens with any other type that cannot be inferred from the raw value alone, e.g. if a field is a u64:

Error: invalid type: string, expected u64
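
The failure mode described above can be modelled without serde. The following is only a minimal sketch, assuming a simplified `key = value` format. `Buffered`, `buffer_untyped` and `expect_seq` are hypothetical names made up for illustration; they are not part of serde or of our parser.

```rust
use std::collections::BTreeMap;

/// Untyped value as a buffering pass would hold it (simplified model,
/// not serde's real internal representation).
#[derive(Debug, Clone, PartialEq)]
enum Buffered {
    Str(String),
    Seq(Vec<String>),
}

/// Read `key = value` lines without any type hints. A key seen once is
/// guessed to be a string, a key seen several times is guessed to be a
/// sequence. This guess is the step that can go wrong.
fn buffer_untyped(input: &str) -> BTreeMap<String, Buffered> {
    let mut out: BTreeMap<String, Buffered> = BTreeMap::new();
    for line in input.lines() {
        // Lines without `=` are ignored in this sketch.
        let Some((key, value)) = line.split_once('=') else { continue };
        let (key, value) = (key.trim().to_string(), value.trim().to_string());
        let next = match out.remove(&key) {
            None => Buffered::Str(value),
            Some(Buffered::Str(first)) => Buffered::Seq(vec![first, value]),
            Some(Buffered::Seq(mut items)) => {
                items.push(value);
                Buffered::Seq(items)
            }
        };
        out.insert(key, next);
    }
    out
}

/// Typed step: the struct field wants a sequence and rejects anything else.
fn expect_seq(value: &Buffered) -> Result<Vec<String>, String> {
    match value {
        Buffered::Seq(items) => Ok(items.clone()),
        Buffered::Str(s) => Err(format!("invalid type: string {s:?}, expected a sequence")),
    }
}
```

With two installed lines, `expect_seq` succeeds on the buffered value. With a single installed line, the buffered value is a `Str` and `expect_seq` returns an error, which mirrors the behaviour reported in this issue.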

External Links

  1. An issue which describes the same problem with a hacky workaround: https://github.com/serde-rs/serde/issues/1881

  2. A similar case in rust-csv with a better explanation: https://github.com/BurntSushi/rust-csv/issues/344

#[serde(flatten)] is only supported for self-describing formats because it needs to parse the tokens ahead of time using deserialize_any(), since it doesn't know what types the nested fields contain yet. Since the deserialization then simply "guesses" the type, it might guess wrong and you get issues like this.

  3. Bug report in serde: https://github.com/serde-rs/serde/issues/1183

Solution

There doesn't seem to be an apparent solution to this yet (see the 3rd link above). We can either drop flatten and accept the code duplication, or come up with a custom deserializer for BUILDINFO v2 that does the flattening manually.

I'm looking into these solutions and also extending our test suite to catch these edge cases.
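
As a rough illustration of the "manual flattening" idea, here is a minimal sketch that does not use serde at all. It assumes simplified stand-in types (`u64`, `String`, `Vec<String>` instead of `Schema`, `Name`, `InstalledPackage`), and the helpers `collect`, `single` and `parse_v2` are hypothetical. The point is only that the target field type, not the raw value, decides how a key is interpreted.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-ins for the real BUILDINFO types.
#[derive(Debug, PartialEq)]
struct BuildInfoV1 {
    format: u64,
    pkgname: String,
    installed: Vec<String>,
}

#[derive(Debug, PartialEq)]
struct BuildInfoV2 {
    v1: BuildInfoV1,
}

/// Collect every value of every key. Nothing is guessed at this stage.
fn collect(input: &str) -> BTreeMap<String, Vec<String>> {
    let mut map: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for line in input.lines() {
        if let Some((key, value)) = line.split_once('=') {
            map.entry(key.trim().to_string())
                .or_default()
                .push(value.trim().to_string());
        }
    }
    map
}

/// Take exactly one value for a scalar field.
fn single(map: &BTreeMap<String, Vec<String>>, key: &str) -> Result<String, String> {
    match map.get(key).map(Vec::as_slice) {
        Some([value]) => Ok(value.clone()),
        Some(_) => Err(format!("expected exactly one value for `{key}`")),
        None => Err(format!("missing key `{key}`")),
    }
}

/// Build v2 by filling the shared v1 fields by hand.
fn parse_v2(input: &str) -> Result<BuildInfoV2, String> {
    let map = collect(input);
    let v1 = BuildInfoV1 {
        format: single(&map, "format")?
            .parse::<u64>()
            .map_err(|e| e.to_string())?,
        pkgname: single(&map, "pkgname")?,
        // The field type decides: one entry or many, this is always a sequence.
        installed: map.get("installed").cloned().unwrap_or_default(),
    };
    Ok(BuildInfoV2 { v1 })
}
```

With this approach the input containing a single installed line parses into a one-element `installed` vector, at the cost of writing the field mapping by hand.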

Edited by Orhun Parmaksız