Efcore.pg: JSON.NET plugin

Created on 10 Mar 2018  路  22Comments  路  Source: npgsql/efcore.pg

With the new JSON.NET plugin coming to Npgsql 3.3, we need to create the corresponding plugin for the EF Core provider (depends on #300). It would provide a db context options extension, UseJsonNet(), which would do the following:

  • Add the Npgsql JsonNet plugin to the global type mapper
  • Add the appropriate RelationalTypeMapping to the NpgsqlTypeMappingSource
  • Add the appropriate translators for JSON operations.
enhancement

Most helpful comment

FYI I am currently working on this - a preliminary implementation is working. Will share details soon.

All 22 comments

After some research into this, there's one very annoying mismatch between Newtonsoft.JSON and PostgreSQL's JSON support that makes this difficult. Mapping JObject and JArray is easy enough, but operation translation is problematic.

In Newtonsoft., the way to access a string value in a JSON is:

c# string itemTitle = (string)rss["channel"]["item"][0]["title"];
It should be easy to map the indexers into PostgreSQL's -> operator (see the docs). However, the cast of the last value into string is problematic... In Newtonsoft, this cast yields the simple string value, but PostgreSQL treats it as a cast of a JSON document, so returns the string value surrounded by double-quotes. In other words, if the string value being access is foo, Newtonsoft will return foo while PostgreSQL will return "foo".

To work around this, we need to customize the casting behavior when it is applied to a json/jsonb. The Newtonsoft casting behavior can be implemented by rendering #>>'{}', which "looks up" the root element of the give JSON document and returns it as a string. However, it's difficult to see how we can override this behavior in NpgsqlQuerySqlGenerator from a plugin.

In theory we can replace the NpgsqlQuerySqlGenerator service from the plugin, although replacing a service with another service wrapping the first is really non-trivial with the dependency-injection system. Npgsql could register a factory lambda for IQuerySqlGenerator, which the JSON extension would replace with another ServiceDescriptor the does the wrapping.

However, if the replacement service extends NpgsqlQuerySqlGenerator, overriding only the relevant cast behavior, it's impossible to have multiple plugins changing bits of behavior in the SQL generator (each one would clobber the current generator with its own).

Because of the above complications I'm going to drop this for now and revisit later. In any case PG 11 is supposed to add SQL/JSON support, maybe it's better to wait for that and see what things look like.

Probably @JamesNK has some ideas on this topic?

If you want double quotes then use https://www.newtonsoft.com/json/help/html/M_Newtonsoft_Json_Linq_JToken_WriteTo.htm with a JsonTextWriter over the top of a StringWriter. Or manually check for a string and update the value with a call to JsonConvert.ToString(s) (I think).

Thanks @seriouz and @JamesNK.

The problem isn't really with Newtonsoft.JSON or how to do stuff with it... The goal of this plugin would be to translate idiomatic, simple JSON.NET code to PostgreSQL JSON operators in SQL. The problem is that there's an annoying mismatch between the regular way to do JSON.NET and the PostgreSQL operators where it comes to string casts. In other words, I'm sure I can write JSON.NET code that produces the double quotes, as you suggest above @JamesNK, but the point is to allow users to write normal, regular JSON.NET and for that to get evaluated seamlessly and out of the box on PostgreSQL.

Concretely, I want to translate the following JSON.NET sample:

c# string cpu = (string)o["CPU"];
The expression o["CPU"] in this sample is a JSON object (JToken in JSON.NET). But when we cast to string in JSON.NET, we simply get the string (no quotes), whereas in PostgreSQL we get quotes, since PostgreSQL considers that a text cast of a JSON object needs to be a valid JSON string representation.

So the problem isn't how to get JSON.NET to produce the quotes, but rather how to properly translate JSON.NET code to PostgreSQL in a way that everything works.

Hi, roji,
AFAICS, it should be possible to extract the string as string using json_extract_path_text() instead of using the cast, see https://www.postgresql.org/docs/10/static/functions-json.html.

@markusschaber but when would you call that function? The point is that the user wrote the expression (string)o["CPU"], and that's what you have to translate to SQL... You're right that we could translate the C# cast operation to json_extract_path_text() (or to the #>>'{}' operator as I suggested above, although you're proposal's better), but that would involve changing the translation for casts, which is problematic from a plugin, as I wrote above.

What's the current status on this? Can we contribute somehow?

@greenboxal this is definitely still planned for 2.2.0, but the issues above still need to be addressed. In addition, it seems wise to hold off on this as PG 11 will probably be adding SQL/JSON standard support, which might change the way we want to do this.

Finally, this may be a bit complicated for an outside contributor to tackle as a first foray into an EF Core provider, since it potentially touches on several internal details. However, if you want to help by researching the state of SQL/JSON in PostgreSQL 11, whether it's a good fit (better than the pre-11 JSON support) and whether it helps in any way with the issue described above - that would be really great.

@roji - I think I have an alternative solution. Instead of supporting casting, use Value<> extension:

string cpu = o["CPU"].Value<string>();

With this, you can create a method call translator, that would convert this to string.

I gave it a try and created small proof-of-concept, to see how this would work. You can review it here: https://github.com/lmarszal/NpgsqlJObjectPoC
It's far from being ready but can give us some starting point for the discussion.

Any thoughts?

@lmarszal Nice idea!

@lmarszal, the point is that we don't want to force a strange constraint on the way users write their LINQ code over JSON.NET... The idea is to have users write normal, regular LINQ just as they would outside of a database context, and have that work as-is on PostgreSQL. I really don't want to see a plugin which forces an unnatural extension method which means the exact same thing as a C# cast, because of some internal issue like this...

I do intend to look at this again at some point, and I'm also hoping that by the time I do PostgreSQL will introduce SQL/JSON support which may make the issue go away (I admit that isn't grounded in anything though).

@roji, I got this. For me more natural than casting is using .Value<> - but this is only my opinion. And casting should be evaluated exactly the same as .Value, as their underlying logic implemented in JObject is almost identical. And last but not least both options should be supported (not to, as you said, force users to do something unnatural for them).

My thinking was, that since (1) at the end of the day .Value has to be supported too, and (2) cast is a problem now: we might introduce something that works now and just add support for cast later on.
Existing code wouldn't change, just new functionality will be added.

The additional benefit would be, that when SQL/JSON appear (and if this helps), we would just need to change the underlying code, as the LINQ expressions users would have written won't have to change.

For me more natural than casting is using .Value<> - but this is only my opinion.

Apart from subjective feelings, the point is that casting is the pattern shown in the basic JSON.NET documentation. I think that a plugin that fails on the most basic sample is a bit of a non-starter.

I'd rather we at least tried to solve the problem for casting - and as you say, implement translation for Value<T>() as well.

I did a next quick test and handling casting seems doable with IExpressionFragmentTranslator - sth like this:

internal class JTokenCastExpressionFragmentTranslator : IExpressionFragmentTranslator
{
    public Expression Translate(Expression expression)
    {
        if (expression is UnaryExpression unaryExpression &&
            unaryExpression.NodeType == ExpressionType.Convert &&
            unaryExpression.Type == typeof(string) &&
            unaryExpression.Operand.Type == typeof(JToken))
        {
            return new CustomUnaryExpression(unaryExpression.Operand, "->>0", typeof(string), postfix: true);
        }

        return null;
    }
}

This requires small changes to plugin model though:

  • Extend NpgsqlCompositeExpressionFragmentTranslator to iterate through plugins, like NpgsqlCompositeMethodCallTranslator does
  • Add virtual method to NpgsqlEntityFrameworkPlugin (sth like AddExpressionFragmentTranslators)

With this I've managed successfully run a call like this:

string foo = "text";
_dbContext.SampleItems.Where(x => (string)x.Body["Inner"]["Foo"] == foo).ToList();

Great, that's definitely a step in the right direction! In my original attempt I think I got a bit stuck working on NpgsqlQuerySqlGenerator, where NpgsqlCompositeExpressionFragmentTranslator may be a better extension point for this. I'd still want to evaluate the two approaches to see which is best.

Regardless, note that I'm currently in the process of converting the two plugins to use the new plugin model in EF Core 2.2 (which is based on my previous work). This is issue #658 and I'll probably submit a PR later today or tomorrow (very close to finished). This means that you shouldn't spend any time working on the code as it is, but rather look at how things are implemented in the current EF Core release/2.2 branch - the plugin model is slightly different.

As a stop gap, we can also integrate the special casting logic into the Npgsql provider (not via a plugin), using string comparison to identify the operation on JToken. This isn't ideal, but is a relatively harmless workaround until something clean can be done in the EF Core plugin model.

Good - so I'm waiting for your PR and then then will try to figure out how to make this work with new the plugin model. Please keep me posted if, how and when I can contribute to this. I'd really love to see this feature fully implemented, so I can spend some time working on this :)

Here we go: #672

FYI, in case this has flown under the radar .NET Core team is planning / implementing their version (not Newtonsoft's Json.NET) of JSON support:
https://github.com/dotnet/announcements/issues/90 -> discussion at:
https://github.com/dotnet/corefx/issues/33115

Yep, we're waiting to see what comes out of that to see what to support in Npgsql.

Info on the new PostgreSQL 12 jsonpath: https://www.postgresql.org/docs/devel/datatype-json.html#DATATYPE-JSONPATH, https://paquier.xyz/postgresql-2/postgres-12-jsonpath. We should definitely take another look at this sometime.

FYI I am currently working on this - a preliminary implementation is working. Will share details soon.

Superceded by #981

Was this page helpful?
0 / 5 - 0 ratings