Terminal: Switch to a JSON serializer that supports pretty printing

Created on 7 May 2019  Âˇ  27Comments  Âˇ  Source: microsoft/terminal

The contents of profile.json are all in the same line, marks its readability very low. Maybe it's better to change the default format of it?
profile_screenshot

Area-Settings Issue-Task Needs-Tag-Fix Product-Terminal

Most helpful comment

Oh, I'm not saying the file is temporary, I'm saying the bug is temporary.

And @SwimmingTiger hit the nail on the head - we're planning on having both an actual GUI for modifying the settings and a plaintext file to edit (for those who like living dangerously)

All 27 comments

This is a bug (though it doesn't have an issue so it is good you made one). Copying @zadjii-msft's comment from another bug:

The profiles.json thing is a temporary thing. Unfortunately, Windows.Data.Json always writes out json as a single line, without the ability to pretty-print it, and we re-write the profiles.json to make sure that it stays up to date with the schema of the app. So currently, it'll keep re-writing it. But that's a huge pain point that we want to fix ASAP.

If profiles.json is temporary, what will eventually remplace it? Will it still be a json file?

Consider the case of VSCode. You can edit the JSON file directly, but they also now provide a GUI to modify configurations.

Oh, I'm not saying the file is temporary, I'm saying the bug is temporary.

And @SwimmingTiger hit the nail on the head - we're planning on having both an actual GUI for modifying the settings and a plaintext file to edit (for those who like living dangerously)

Yeah, I've been editing with VS Code and pressing Shift + Alt + F to reformat after every save.
Very confusing.

@zadjii-msft, suggest to replace Windows.Data.Json with nlohmann/json, reasons:

  1. Capability of pretty printing.
  2. It has a single include header which is easy to intergrate into this project.
  3. STL like syntax.
  4. MIT Licence, same with this project itselft, so no license issue.
  5. Acceptable speed and memory consumption considering the simple scenario here, we just use it for config, even though there are faster libraries.

That's probably the one we'd go with TBH.

I think there's a small internal ingestion process that we'd need to go through just to make sure everything's above board, but when we get signoff, then I'd be happy to update the code :)

@DHowett-MSFT let's get on it (after Build is done of course)

How about raising an issue with System.Data.Json to enable better formatting options on save? Seems like that'd be beneficial to a lot of people.

@onovotny It'd be excellent, and we _absolutely_ will push that dependency.
Until we can switch to the "20H1" SDK, though, we'll need to take a two pronged approach. :smile:

@lennylxx That looks like a nice library... but this worries me:

This library does not support comments. It does so for three reasons:

I know that JSON doesn't _have_ comments. Perhaps we should open a discussion about what our settings serialization format _should_ be?
I was partial to TOML. That's a bridge too far, I reckon, so how about JSONC? JSON with Comments is always, always going to be more human-friendly than JSON without comments.

fwiw, vscode and many others have adopted "bastardized json" as their default for settings; it is basically "the object construction format accepted by ECMA6" where things like // commented line in the middle of an object, /* comment here */, and { nameWithoutQuotes: "some value" } are accepted.

EDIT: nvm, I guess this is what @DHowett-MSFT was referring to.

Oh yea, JSON _with comments_ would be important. Kinda frustrating that nlohmann/json doesn't at least have an option for it, but I get it. Stripping the comments before parsing shouldn't be that hard, right?

I guess the problem would be to write the json file with the comments it had when it was first parsed, which could be more difficult.

e.g. the terminal loads my json configurations, I make modifications through the terminal's settings, it would be great to keep the comments I put in the json files when the terminal makes modifications to it.

Hmm. This is a good point.

If VsCode supports comments in their settings json, then how can they keep the comments? IIRC they have a settings UI as well.

summoning @Tyriar who might be able to link someone who actually knows

We use a parser that our team built https://github.com/microsoft/node-jsonc-parser, the settings UI is basically a UI on top of the json file and is built into VS Code itself (you wouldn't be able to leverage this).

Also FYI comments are apparently part of "json5" https://json5.org/

@zadjii-msft VS Code is build on top of electron which allows packages like jsonc-parser to be used within the application. In fact, this module is built by the guys behind VS Code and is being used widely across the editor.

Edit: whoops, didn't see @Tyriar's comment, sorry for that! 😅

Welp. So much for that. I can't see us having the resources needed to port that to c++ any time soon :/

Since porting jsonc-parser to C++ is quite time consuming, maybe we should seek other serialization formats. We need:

  1. hierarchical configuration
  2. human friendly
  3. pretty printing
  4. comments
  5. popular enough that most people know how to modify

So how about YAML?
There are some libraries yaml-cpp and rapidyaml, both released under MIT license.

YAML might feel out of place. Since VS Code uses JSON, people might expect the terminal to use json as wrll (unless we are talking about YAML 1.2 which has become a superset of JSON)

@utybo VS Code uses JSON doesn't mean JSON is a good format for configuration. Otherwises, VS Code would use JSON directly, instead of a reimplementation with comment supporting.
Don't get me wrong, actually I like json, I use it a lot in my RESTful services, but we need to choose the right techniques based on scenarios and resources we have here, not personal preferences.
Here is an ariticle taking about why json is not perfect for configuration.
https://www.lucidchart.com/techblog/2018/07/16/why-json-isnt-a-good-configuration-language/

rapidjson has relaxed JSON syntax (comment, trailing comma, NaN/Infinity) options.

Personally, I wanted us to use TOML. It's robust enough, but it's far less ubiquitous than JSON. (fwiw: I also like YAML, but I won't force my ideology on my team :smile:)

I had a look at _rapidjson_ and _jsoncpp_. rapidjson has the benefit of being a header-only library, admittedly, but it looks like jsoncpp has a bit of an edge here specifically regarding how it handles comments and pretty printing. Comments are accessible through the Value interface, and they can be both read _and written_. It seems like it lets a writer control flow/pretty printing per element as well.

That'll let us do some nice things like:

  • preserve user comments, but reformat file
  • print out optional settings commented out (/* "requestedTheme": ["dark", "light", ... ] */)

I may have missed something about rapidjson, of course.

Maybe we should just wait for updates of the Windows.Data.Json APIs for stringify JSON objects with formatting.

And, since the existing function provided by Windows.Data.Json APIs can correctly parse strings with formats into JSON objects, as a workaround, instead of introducing additional libraries, a function that manually converts a JSON object into a multiple-line string with human-readable indents can be implemented before the corresponding APIs getting updated.

Woo thanks @zadjii-msft! 😃✨

Would it be possible to re-open and re-title this (currently called “Switch to a JSON serializer that supports pretty printing”)? The scope of the discussion expanded to choice of a different format entirely. I’d still vote for yaml compatibility, but saving as json by default (readable by yaml parsers).

I don’t know much about comment preservation in yaml, but a quick Google search shows it’s been done. Not sure how good any of the implementations are.

Maybe this isn’t reasonable, considering how full-steam-ahead the project is on json now.

First off: I agree. But honestly, we’re probably not moving away from JSONC unless there’s a very good reason. It’s nearly ubiquitous. There’s good enough tooling for it. Even though it’s pretty screwy as a format, it does what it intends to.

I was a proponent of TOML for this project, but we’re a pretty small team and making strategic bets that get us moving faster is (almost) always going to win out.

FWIW, I wrote a Makefile to automatically upgrade Jira to new releases. The http XML configuration file changes every release, and is included in the release. You need to manually reconfigure the file every release, editing in changes like what port to listen on and what protocol to use, the accepted hostnames, etc. Anyway, it seems to be edited by hand on their end and each release ships with different (mangled and unformatted) xml configuration files, with different comments in different places, etc.

To work around this, I had the script take the original file from the previous release and run it through a formatter (that forces all whitespace, indentation, etc (ie doesn’t have an “allow” policy) — importantly, definitively sorting the elements and attributes to a strict order such that there can only be one valid, correct sort — and created a standardized view of the unmodified file from the previous distribution. Then I do the same with the user-modified version containing the changes we want to preserve, and finally standardize the file that shipped with the new release; giving us three different standardized files.

At this point, generate two diffs: one between the the standardized view of the previous version of the stock file and the standardized view of the user modified file, and the other between the cleaned-up previous stock file and the cleaned-up new stock file, then do a three-way merge with maximum fuzz to get a configuration with the latest features but all user changes from the previous release.

——

If you consider the old version without comments, the old version with the comments, and the new version without comments to be the three snapshots, you can generate a version with the comments with a library that doesn’t support that. It sounds scary but with the toolset already in place to format, diff, and patch it’s actually just a few lines.

Or you could just write a patch to save any comments encountered after parsing one node/property/value and before you get to the other, then spit it back out on write, lol.

Was this page helpful?
0 / 5 - 0 ratings