Servo: Replace preferences backend with static data structures instead of hashmap

Created on 16 Nov 2015  路  39Comments  路  Source: servo/servo

Instead of having a dedicated prefs.json file with this kind of logic, it'd be great if we didn't have to duplicate key names all over the place. What if we made the json into some Rust data structure to hold the default values with key names that can be referenced by other files? If not, is it possible to check the key names at compile time?

C-assigned C-has open PR E-less easy I-refactor

All 39 comments

We could deserialize the JSON into the struct, although I'm not certain what that code does with fields present in the JSON that aren't in the Rust structure.

There are a few parts to this:

  • create a Rust structure that holds all of the known preferences using native types
  • create a new API to obtain this structure and replace each use of PREFS.get with it
  • create an API that can get and set preferences via a string name and map those to the native values (this will be required by some code in components/style and components/webdriver_server)
  • create a mechanism to deserialize prefs.json and apply the values in it to the native values

@jdm This seems interesting. Can I work on this?

@prampey Go ahead!

Sorry for the delay guys. Had my tests.
@jdm I did some digging and found the serde_json crate. Based on the requirements posed, I think it would be effective to use a HashMap<String, Value> data structure. A HashMap because well, it is dynamic in nature (grows to our requirements) and it has proper error checking, so with any false-key extraction, the compiler automatically blurts out the key is not available. Also, unlike in struct, you can actually list and iterate through all the key names in a HashMap.
The Value data type is an enum to represent JSON data, therefore, all types of JSON data can be represented with it. So we won't have any problems with extracting any type of JSON data.
Native types can easily be extracted from Value type with conversion functions like as_i32().
The other parts can be implemented if _this_ data structure is accepted.

@prampey We already store preferences inside of a HashMap; the goal of this work is to make it a compile-time failure to use a preference name that does not exist.

@jdm Okay, so we can either convert it into a struct which performs its own compile-time checking or we could use a _compiler plugin_ which traces through the data structure for our keyname and then reports a result. Right?

That sounds correct. A structure sounds less complicated than a compiler plugin to me.

@jdm This seems much more complicated than I thought it to be.
In order to compile-check the validity of a key, we first need access to the JSON string during compile time. This is only possible if we have the JSON hardcoded.
Secondly, struct poses some limitations. To start with, we cannot _set_ preferences to it due to a fixed number of fields. Also, we need to dynamically generate the fields of the struct from the JSON string during compile time. Since macros do not have type information, I'd have to check the type of every field in the JSON literal. Luckily, the serde library provides a json! macro that checks type of every field and converts it all into a Valuein the end. So I can use this code and modify it. However, I'd still have to collect all the types and key names in a vector and then generate a struct (all at once) in the end. (which I'm not sure is possible with a macro :P) This seems like a lot of effort just to leverage the compile-time checking of a struct.
Thirdly, my previous comment on the compiler plugin is probably wrong. You can't trace through a data structure during compile time right? We'd probably have to parse the AST for our desired key name and then report accordingly. (assuming our JSON was hardcoded)

@jdm Anything we can do about this?

In order to compile-check the validity of a key, we first need access to the JSON string during compile time.

This should be possible via a cargo build script.

Any updates? Are you still working on this?

Consider the following:

  • we define structures like this:
#[derive(Serialize, Deserialize)
struct Preferences {
    dom: Dom,
    layout: Layout,
}

#[derive(Serialize, Deserialize)
struct Status {
    enabled: bool,
}

#[derive(Serialize, Deserialize)
struct Bluetooth {
    testing: Status,
    enabled: bool,
}

#[derive(Serialize, Deserialize)
struct Dom {
    testbinding: Status,
}
  • we deserialize prefs.json into the overall Preferences structure at startup to set the initial values
  • Rust code that wants to check a preference gets full compiler support with PREFS.dom.testbinding.enabled == true (even the generated bindings code can do this by storing a function pointer that takes the preferences as an argument)
  • to support the webdriver operation that accepts a string argument, we could serialize the current preferences into a JSON object, then iterate over the components of the requested preference until we reach the appropriate leaf value and set it to the serialized value that webdriver has provided. We can then deserialize the entire preference object back into the native type.

The majority of this redesign would happen in components/config/prefs.rs, making use of the serde_json crate. The webdriver piece would happen in components/webdriver_server. The codegen bit would be in components/script/dom/bindings/codegen/CodegenRust.py and components/script/dom/bindings/guard.rs.

If we choose to deserialize prefs.json directly into the native struct type, we would need to restructure prefs.json to look more like this:

{
    "dom": {
         "testbinding": {
             "enabled": false
        },
        "bluetooth": {
            "enabled": false,
            "testing": false
        }
    }
}

On the other hand, we could do the same thing we plan to do for webdriver - keep prefs.json unmodified, parse it as json, then iterate over the keys in the resulting object so we get values like "dom.bluetooth.enabled". We then serialize the native preference dato to JSON, and for each key in the parsed prefs.json, we could traverse the preferences structure and set the value appropriately. Finally when we're done iterating over the prefs.json keys, we would deserialize the preferences data into the native data types once more. This would be my preferred option, because it keeps prefs.json compact and easy to read.

@kwonoj Would you be interested in working on this?

@jdm, thanks for the suggestion - yes, I think I can try tackle this once I have shaped PR around #20428. let me tentatively assign myself for this, and will ask some questions If I don't get correct ideas. 馃檹

@highfive assign me

I was out for some personal thing and recently started thinking around this and realized time has quite passed around. If anyone have idea / working feel freely take it over, otherwise I'll try to simmer idea around to make this happen. This seems bit more tricker then I originally thought.

It seems a good way to mitigate the problem with the difference in format for the json created by serde_json and the current prefs.json would be using serde's flatten attribute:

#[derive(Serialize, Deserialize)
struct Preferences {
   #[serde(flatten)]
    dom: Dom,
}

#[derive(Serialize, Deserialize)]
struct BluetoothTesting {
    #[serde(rename = "dom.bluetooth.testing.enabled")]     
    enabled: bool,
}

#[derive(Serialize, Deserialize)
struct Bluetooth {
    #[serde(flatten)]
    testing: BluetoothTesting,
    #[serde(rename = "dom.bluetooth.enabled")]
    enabled: bool,
}

#[derive(Serialize, Deserialize)
struct Dom {
    #[serde(flatten)]
    bluetooth: Bluetooth,
}

Hey, I'd like to start working on this issue if no one is currently tackling it.

Please do, @gilbertw!

Great, I'll jump on it!

Quick question:

Reviewing the thread, it appears that we have two competing approaches for loading the prefs.json file into a struct:

  1. Use serde to deserialize the prefs.json directly into the target data structure. Using @russelltg's suggestion we can maintain the current structure of the prefs.json file.

  2. Use serde to deserialize the prefs.json file into an intermediate data structure that is then used to populate the target data structure.

While option 1 would be the simplest approach, option 2 would provide more flexibility by not requiring one to one mappings between preferences in the file and the struct, and allow us to do more complex validation if we needed it.

The prefs.json file is currently pretty simple, so my hunch is that option 1 would be sufficient, but I was wondering if we expected the preferences to become more complex in the future, where a separation between it and the data structure exposed internally would be desirable?

Another thing that might affect this decision is if we wanted to report errors when additional/invalid keys exist in the preferences file, as we could determine this with an easily inspectable data structure.

I would not expect prefs.json to become more complex in the future. I think option 1 is the better choice.

We can probably get rid of PrefValue too. Each getter can just return the actual type in an Arc.

Any reason not to just translate preference names into getter methods, like prefs.dom_bluetooth_testing_enabled()? Is there any value in creating type-level structure to match the implied hierarchy?

My vague feeling is that it's easier to read and clearer for documentation purposes if there are separate structures of related preferences.

And yes, I would be delighted if there were no need to use PrefValue.

@jdm Do you envision keeping the same json format, or change the format to more match the new structure?

My preference would be to keep the same json format because I think it's easier to read, and we'll need to support a very similar kind of conversion when processing webdriver preference commands. If it's significantly more complicated than a different structure, I wouldn't insist on it.

This is a first stab at what the structure could look like: https://github.com/servo/servo/compare/master...peterjoel:issue_8539_prefs_refactor?expand=1

I made a macro that creates structs for all of the intermediate objects, which (se/de)serialize to the same json format. I pretty much mirrored the property names from prefs.json but actually I noticed a few inconsistencies, and these could be reorganised in a later PR.

Before I do the _big_ job of replacing the usages of the old hash map, it would be good to get some feedback to make sure I'm on the right track.

We may have to keep PrefValue, since there is functionality (like reset()) which would be hard to implement on a per-pref basis otherwise. I'll investigate to see if that is used much.

If we keep it, we can improve it still by making it parametrised over the type of the value, and making it implement Deref so you can use it more conveniently.

Clever! This seems like a useful path forwards to me.

@jdm Generating the alternative get(&str) interface might be tricky with a declarative macro though. It _might_ be possible - I'm still trying some things out, but I wonder if I will need to do with with a proc macro. Is there an existing component crate for that or should it go somewhere else?

components/script_plugins contains a bunch of proc macros for components/script. I think that might be the only one right now.

@jdm Can you assign this to me?

done

I am quite close to generating a structure that works from a proc-macro, but I have some questions on usage that will impact the exact design.

This is what I have been working towards:

  • The leaves of the static structure are wrappers for Arc<RwLock<T>>, with a get_ref method to get a impl Deref<Target = T>. e.g. let enabled = *PREFS.foo.bar.enabled
  • The only restriction on the type of a preference value is that it is Clone, so most numeric types will work, as well as &str and possibly others. (In the existing implementation only f64, i64, bool and String are supported via the variants of PrefValue).
  • The static structure can be updated from another one in its entirety, or just over one branch.
  • Each variable is in its own Arc<RwLock>, so two prefs can be updated at the same time.

The HashMap interface:

  • A wrapper around a HashMap<String, Fn(&Prefs) -> PrefValue>, together with a reference to the full structure.
  • PrefMap::get(String) invokes the closure, passing the full structure and converting the result to a PrefValue
  • PrefValue is the same as it currently is. All numeric values are converted to the appropriate PrefValue variant.

In this setup, the HashMap interface is read-only. In order to make it writeable, I would almost certainly have to either:

  • restrict the types of the settings to the exact types of the enum: f64, i64, bool, String.
  • OR include unsound coercions from the types in the enum to the actual pref type (e.g. coerce PrefValue::Int(i64) to i32.

How is updating the preferences expected to be done? Is it likely that it will need to be done via the String keys? If that is the case, I will need to rethink a little.

Yes, the only preference update after initialization comes from this string-based code.

@jdm Ok, so it's ok for the static structure to be read-only?

It would seem like it's best to limit the types to the i64, f64, bool, String in that case too then.

If I understand correctly, I believe it's ok for the static structure to be read-only.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

shinglyu picture shinglyu  路  4Comments

Grishy picture Grishy  路  3Comments

ferjm picture ferjm  路  3Comments

pshaughn picture pshaughn  路  3Comments

CYBAI picture CYBAI  路  3Comments