Runtime: Support of comments in jsondocument

Created on 19 Jul 2019  路  18Comments  路  Source: dotnet/runtime

We hope to have an easy(er) way to perverse comments while editing json file. Maybe in jsondocument. We strongly believe config file should have comment support. However, since the file will also be edit by machine, we need a way to preserve the comments. Today, only the lowest level API supports reading comments, but that would be too complicated for our scenario. The code is here.

Detail:

The write comment API is available today. But it hard to use jsontoken level API to read comments and find a way to remember the location (in order to write it back). We believe a DOM like API that we can read/write and preserve comments as with System.Xml is a good example.

Tool manifest is a json file that is intended to be edit by both the user and CRUD SDK command line, for example of update tool operation.

before

{
  "version": 1,
  "isRoot": true,
  "tools": {
    "dotnetsay": {
      "version": "1.1.4", // used only in tests
      "commands": [
        "dotnetsay"
      ]
    }
  }
}

After update, today

{
  "version": 1,
  "isRoot": true,
  "tools": {
    "dotnetsay": {
      "version": "2.1.4",
      "commands": [
        "dotnetsay"
      ]
    }
  }
}

Desire:

{
  "version": 1,
  "isRoot": true,
  "tools": {
    "dotnetsay": {
      "version": "2.1.4", // used only in tests
      "commands": [
        "dotnetsay"
      ]
    }
  }
}
area-System.Text.Json enhancement

Most helpful comment

(This isn't going to be productive, but I can't let it go unsaid. To me, this means JSON is a poor choice for the tools manifest. dotnet add package preserves comments in csproj just fine. It is totally reasonable to expect parity for the command to add a package to a project as the command to add a local tool to a repo. A document that can be edited by a mix of human and machine and have comments and whitespace preserved is a very valid use case for a list of dependencies. We have replaced XML with JSON for the sake of it in scenarios where XML works much better.)

All 18 comments

@ahsonkhan in the end we find it is still too hard to convert to use the lower level API. We hope to add that feature in higher level API

cc @bartonjs, @joperezr, @kasiabulat - something to consider, particularly for the modifiable dom (enable support for JsonCommentHandling.Allow within JsonDocument).

This is a known limitation of the JsonSerializer/JsonDocument. A modifiable DOM does not have the same constraints and can probably allow it without hurting the perf of mainline scenarios.

in the end we find it is still too hard to convert to use the lower level API.

@wli3, can you elaborate on what was the difficult part, particularly for your scenario, just so I have a better sense of where others might struggle with as well?

I haven't figured out yet if this is easy with NewtonSoft.Json either.

I tried this, but it didn't work:

``` c#
var json =
@"{
// This is a comment
}";
var settings = new JsonLoadSettings
{
CommentHandling = CommentHandling.Load,
};

    var doc = JObject.Parse(json, settings);
    doc.Root["apples"] = "oranges";

    Console.WriteLine(doc);
It prints like this with the comment gone:

{
"apples": "oranges"
}
```

Is there an easy way that I've missed? I'm not very familiar with either API.

can you elaborate on what was the difficult part, particularly for your scenario, just so I have a better sense of where others might struggle with as well?

The code to deserialize is 100 lines already. And we are using jsonDocument, a higher level API already. For our use case, if by the time of the change, deserializer is available, it could be done by several lines. To do the same thing using jsontoken level would be even more work than "necessary". We need to start to consider ordering and nesting. Plus, we have to find a way to keep track where the comments are located(so we can write it back). It would be much easier if the API is like "add a new node like this in the tools node", and just keep everything else the same.

For our use case, if by the time of the change, deserializer is available, it could be done by several lines.

I don't understand what you mean by this. We have a deserializer.

JsonSerializer.Deserialize(...)
https://github.com/dotnet/corefx/blob/7431e4938c461f6e24cf3122c0b9ca4eb27e4dde/src/System.Text.Json/ref/System.Text.Json.cs#L184-L202

It prints like this with the comment gone:

I am not too sure. Looking at the docs for JsonLoadSettings/CommentHandling.Load, I would have thought that would retain the comment. @JamesNK would know more about Newtonsoft.Json.

```C#
var json = @"{""foo"": 1 /* comment */}";
var settings = new JsonLoadSettings
{
CommentHandling = CommentHandling.Load,
};

var doc = JObject.Parse(json, settings);
Assert.Equal(json, doc.ToString()); // Fails: {\r\n "foo": 1\r\n}
```

JsonElement/JsonDocument would have an exceedingly difficult time representing comments. While an array iterator could easily indicate that there was a comment value (and either decide that the indexer skips them or doesn't), the object iterator returns JsonProperty values, and a comment cannot be sensibly modelled under that type.

But perhaps they could be modelled under the loose node thought experiment for a mutable DOM.

mutable DOM

That would also work for us

We have a deserializer

The conversion work is done several month ago. Although my point is, we should not use even lower level API for our rather simple scenario (until comments).

Mutable DOM is what I have in mind.

Newtonsoft.Json doesn't perfectly preserve and round-trip a JSON document. IMO System.Text.Json should not either.

There is all kinds of stuff that gets lost:

  • Comments
  • The exact encoding of text
  • Whitespace

(This isn't going to be productive, but I can't let it go unsaid. To me, this means JSON is a poor choice for the tools manifest. dotnet add package preserves comments in csproj just fine. It is totally reasonable to expect parity for the command to add a package to a project as the command to add a local tool to a repo. A document that can be edited by a mix of human and machine and have comments and whitespace preserved is a very valid use case for a list of dependencies. We have replaced XML with JSON for the sake of it in scenarios where XML works much better.)

There is no reason you can't parse JSON and preserve all information about whitespace and comments, then modify the document. Visual Studio does it, I assume with its own JSON library.

The needs of runtime JSON parsing and designtime JSON parsing are different.

I agree there's no reason in theory, but in practice we have a readily available stack for XML in .NET that does this, but no such parity for JSON. We can't really go shopping for yet another JSON library for this or roll our own. Hence this request.

What I think you're going to have to do to preserve comments (speaking only for myself and not the JSON team 馃槃):

  1. Continue to use Utf8JsonReader. Accept that you will lose information about whitespace
  2. Read into your own JSON DOM. Comments don't fit into the name/value world of JSON objects, you need a DOM that is more like a tree (hello XML). You can then modify the DOM to insert/remove JSON as needed.
  3. Write the DOM back out (including the comment nodes) using Utf8JsonWriter

To me, this means JSON is a poor choice for the tools manifest. ...

Amusingly, we had the same conclusion while doodling on a whiteboard why comments are hard to preserve in the model. (Mainly because XML says comments exist, and how to deal with them, and JSON says comments don't exist, so there's no spec to look to for what a right answer is).

Was this page helpful?
0 / 5 - 0 ratings