Efcore.pg: Support for user-defined ranges

Created on 8 Mar 2018  路  29Comments  路  Source: npgsql/efcore.pg

63 added support for the built-in range types (int4range, int8range, numrange, tsrange, tstzrange, daterange).

However, it's possible for users to define their own range types. Adding support for these would mean either:

  1. Load range definitions from the database. This means that NpgsqlRelationalTypeMappingSource can no longer be a simple singleton and has to be aware of types in specific databases
  2. Provide an API for users to let the TypeMappingSource know about additional range types.

As user-defined ranges are probably quite rare, this is a low-priority feature. If this feature is important to you, please upvote it.

enhancement

Most helpful comment

OK, let me see if I manage to do this in the weekend - I'm interested in playing around with this a little bit. If I can't get around to it I'll let you know.

All 29 comments

In the context of time and specifically of NodaTime it appears that there's support for Instant and LocalDate but not LocalTime. Equivalently in Postgres there's support for date ranges, datetime (timestamp with or without timezone) ranges but not time ranges.

This seems asymmetric on the Postgres side. Is there any particular reason for this? If not, could we allow NpgsqlRange<LocalTime> as another "built in" range?

@Jefffrey It does look a little odd that there is no timerange. But the docs actually show how to define a timerange type here: https://www.postgresql.org/docs/10/static/rangetypes.html#RANGETYPES-DEFINING

Once we've added support for user-defined ranges, you would be able to define timerange and map it to NpgsqlRange<LocalTime>.

I suspect there's no built-in time range simply because date ranges are just much more useful than time ranges.

Please note that currently this feature is definitely not on the top of our priority list, and is a lot of work for the relatively little value it provides... Anyone else who's interested in this should definitely upvote and describe their scenarios so we know what people want.

Just for reference, my specific use case is to store weekly recurring time slots (for example 12:00 to 13:00 on Monday). The timezone they refer to (being it a local time) is determined somewhere else (just like a local date would).

@Jefffrey a time range wouldn't really help for that - PostgreSQL time represents time, but not on any given day of the week. This is something that would require several columns rather than just a single type.

Yes. I also store the corresponding DayOfWeek. The time range part just represents the extent of the time slot.

I do the same thing as Jefffrey using time range

OK, keep upvoting this issue then :)

To be clear, the reason we're not implementing this more readily is because it would impact the design of the Npgsql EF Core provider in a significant way... All type mapping support is currently database-independent: the provider doesn't actually look at the database it's connected to in order to know which types are there or not (this is true even for the enum support). User-defined ranges would require changing that.

I admire your attempt to keep the whole thing database-independent, but the reason people use Npgsql is because they have already chose Postgres as a database provider so they don't expect Npgsql to be compatible with other database engines (at least I don't).

The support for Postgres range types, arrays and NodaTime is to be honest quite astounding. I'm honestly impressed by the work other people and you specifically put into .NET Core and EF Core for it to take advantage of the beauty that Postgres is.

But falling into the same old "ORMs are supposed to be platform independent" clich茅 is something that will never work properly because it means settling for the lowest common denominator of all (or most) database engines which will never be optimal.

Anyway, I've personally worked around the issue by storing the LocalTime and have the duration being implicit, but I'd suggest you to consider if pursuing a platform independent world is really worth it.

I admire your attempt to keep the whole thing database-independent, but the reason people use Npgsql is because they have already chose Postgres as a database provider so they don't expect Npgsql to be compatible with other database engines (at least I don't).

I think you misunderstood, the point wasn't about being portable across database types. It was that the type mapping architecture currently doesn't access the the database in any way. User-defined range support would require for the type mappings to be set up based on what ranges happen to be defined in your specific database.

In other words, the component responsible for type mapping (NpgsqlTypeMappingSource) is currently a singleton, shared by all contexts and therefore does not have access to any database connection on which it would load user-defined types. We would need to make the NpgsqlTypeMappingSource a scoped service, which would require rearchitecturing the whole thing. Care must be taken not to slow things down (if we set up all mappings for each context instead of just once, that would probably be unacceptable).

So this has nothing to do with cross-database support, only with the internal architecture of the Npgsql provider. It may be possible to work around this problem in the same way we implemented enums, i.e. require the user to register, up-front, which user-defined ranges exist in the database, rather than loading them. This is a bit suboptimal but at least sidesteps the singleton/scoped issue. Frankly there just hasn't been enough demand for this feature to tackle it.

The support for Postgres range types, arrays and NodaTime is to be honest quite astounding. I'm honestly impressed by the work other people and you specifically put into .NET Core and EF Core for it to take advantage of the beauty that Postgres is.

Thanks for these kind words :) It's nice to hear the work is appreciated out there!

Oh, I've misread, sorry. I had to admit it sounded weird that Npgsql cared about being database engine independent.

I can see how ranges for user defined types might be a lot of work for little value, but I was advocating more for a one time case for NpgsqlRange<LocalTime> and consider it like a "bult-in" range type, given that NpgsqlRange<LocalDate> exists and it's considered "built-in".

I was advocating more for a one time case for NpgsqlRange<LocalTime> and consider it like a "bult-in" range type

I like the idea...but what would it be mapped to? Since there is no timerange, would you expect the provider to create the type? It seems common for type creation on the public schema to be restricted as it could lead to schema-qualified types being shadowed.

Just to brainstorm:

We could make it a user requirement that such a type (timerange) already be defined, as we do with the NTS plugin for GIS types. Maybe this would be a config option (disabled by default) upon registering the plugin?

@Jeffrey the point is that there's no built-in range of time... When you install PostgreSQL that type just doesn't exist (unlike say daterange). I definitely don't feel we should make some sort of exception where we'd magically assume the user created it with a specific hard coded name, or anything like that.

@austindrenski the difference with the NTS plugin is that it supports an extension that adds well-known types with well-known names, whereas the user is free to create whatever range type with whatever name they wish. It's not the same problem...

Also, creating the type in the database is only half of the problem, the other half is making the TypeMappingSource aware of it - and again, remember that it's currently a singleton, so shared between all context instances and types which use Npgsql in a given program.

Finally, I don't think this is a good fit for a plugin since ranges are a built-in capability of PostgreSQL, like enums.

There are solutions to this, but I'd prefer not to hack something up..

Finally, I don't think this is a good fit for a plugin since ranges are a built-in capability of PostgreSQL, like enums.

I was thinking about this being in some way configurable from the NodaTime plugin. But I also haven't looked into what would be feasible (just hypotheticals).

Finally, I don't think this is a good fit for a plugin since ranges are a built-in capability of PostgreSQL, like enums.

I was thinking about this being in some way configurable from the NodaTime plugin. But I also haven't looked into what would be feasible (just hypotheticals).

I understand... At the end of the day, since PostgreSQL simply doesn't come with this type, we'd have to make some sort of hardcoding assumption (e.g. "if you happen to have a range of time called timerange, we'll silently pick it up"), which isn't a good way forward. As I wrote above, there are two ways to solve this:

  1. Provide some new, global API that registers the user-defined range with the singleton NpgsqlTypeMappingSource. This isn't ideal because we're forcing the user to set something up which we should ideally be reading from the database.
  2. Bite the bullet and make NpgsqlTypeMappingSource a scoped service, or more precisely, split it into a scoped service and a singleton service. The singleton would retain all universal mappings found everywhere, whereas the scoped component would be quite light, loading extra mappings from the specific database it's connected to and caching on the connection string. This is the "perfect" solution in that it doesn't require users to do anything at all, and would also take care of some other user-defined types (like domains, but not enums and composites which require user mapping to some .NET type in any case - can't load that from any database). The main concern here is to do it right and make sure there's no perf degradation.
  1. Bite the bullet and make NpgsqlTypeMappingSource a scoped service, or more precisely, split it into a scoped service and a singleton service.

TBH I think we're at the point of need to move forward on this.

I'm not sure exactly how much work we're talking about here, but I can take a first pass at it this weekend if you're crunched for time.

OK, let me see if I manage to do this in the weekend - I'm interested in playing around with this a little bit. If I can't get around to it I'll let you know.

Forgot about this issue which was very related: #138

Note old suggestion by @ajcvickers to make type mapper scoped instead of singleton: https://github.com/aspnet/EntityFrameworkCore/issues/7275#issuecomment-268304781.

I give this a quick look, and it seems like making the TypeMappingSource scoped would have pretty significant repercussions. TypeMappingSource is dependency of many other singleton services (such as MigrationsSqlGenerator), and I can easily see us converting a sizable chunk of the EF Core service graph from singletons to scoped if we go down this path. This means many needless service allocations on each context instantiation, and I don't think we should go down this path just to support user-defined ranges. @austindrenski (or others), you can see https://github.com/roji/Npgsql.EntityFrameworkCore.PostgreSQL/commit/d2160b96c427e5d3e685d454f866f3bcf1ebace0 for a start of this approach (replacing singleton by scoped), which I didn't carry forward very far.

Other options include a new API point on ITypeMappingSource which would accept the context as a parameter. This would allow ITypeMappingSource to remain a singleton, but still to look for context-specific mappings as well. This would probably involve caching additional database-specific mappings, based on the connection string. This would involve adding a new API in EF Core, but it would be backwards compatible (the default implementation would simply call the overload which doesn't accept the context).

Finally, there's option 1 above, which is to provide an Npgsql-specific API for the user for globally declaring that a user-defined range exists. This would register a mapping in the singleton NpgsqlTypeMappingSource, a bit like how we already handle enums (but not exactly).

@austindrenski if you'd like to explore this further please feel free to, let me know your thoughts.

@roji I haven't read the whole thread here (I will if you think it would be useful) but I want to say that making the type mapper scoped was a bad idea. I should never have suggested it. :-) I'm pretty sure that after the work we did on MySQL 2.1 support that they are no longer doing this.

@ajcvickers yeah, I totally agree. This thread is only interesting insofar as it discusses supporting a certain type of UDTs (PostgreSQL ranges). Unlike PostgreSQL enums and composites, which in any case require a manual user mapping action (since a .NET type must be specified), ranges in theory could be automatically mapped by the EF Core provider (PostgreSQL ranges are always mapped by Npgsql to NpgsqlRange<T> at the ADO.NET level). However that would mean making the type mapper scoped.

I think it's pretty clear we should just introduce a special user-facing API as suggested above, but I'll hold off on that until @austindrenski gives his point of view.

@roji Thanks for taking a look at this.

TypeMappingSource is dependency of many other singleton services (such as MigrationsSqlGenerator), and I can easily see us converting a sizable chunk of the EF Core service graph from singletons to scoped if we go down this path.

Yikes...

This means many needless service allocations on each context instantiation, and I don't think we should go down this path just to support user-defined ranges.

Agreed.

I think it's pretty clear we should just introduce a special user-facing API as suggested above

Agreed.

But do you mean modifying ITypeMappingSource or something like GlobalTypeMapper.MapEnum<T>(...)?

Right now my big pet peeve is needing to register enums with GlobalTypeMapper.MapEnum<T>(...) in the static constructor of my DbContext. I would prefer to see this (enums, ranges, etc) handled with something like builder.ForNpgsqlHasEnum(...) within the context itself.

As a general question though, how common do we expect it to be that one application is hosting multiple contexts that point to different databases where user-defined types might conflict?

I'm thinking here that while I may have multiple contexts per program (which may even point to different physical servers), it would be an error on my part to expect that a type from one database is present in the other.

In other words, what does global registration of user-defined types really break? It might let EF Core translate a parameter that causes PostgreSQL to complain... but that doesn't seem like such a big deal.

So the first thing to understand, is that there's a difference between the following two things:

  1. The model, which is built via ModelBuilder in the context's OnModelConfiguring(). You can have multiple models in the same application.
  2. The type mapper, or TypeMappingSource, which is a singleton for a given provider (multiple providers are supported, but each has a singleton type mapper). This defines which mappings are supported for a given provider, and by extension which property types may exist on your model's entities and what database types they will be mapped to.

In order to support user-defined ranges (or any other type), we must first make the type mapper aware of them. The type mapper can't be made aware of these types by loading them from a database, because that would imply it has access to a specific database, making it scoped.

Right now my big pet peeve is needing to register enums with GlobalTypeMapper.MapEnum(...) in the static constructor of my DbContext. I would prefer to see this (enums, ranges, etc) handled with something like builder.ForNpgsqlHasEnum(...) within the context itself.

The above should explain why it doesn't make sense to register enums, ranges or any type mapping via the model builder, which, well, builds a specific model... builder.ForNpgsqlHasEnum() declares the existence of an enum type on the model, but only for the purpose of creating that enum when applying migrations to the database. The type mapping for that enum must be registered separately with the TypeMappingSource.

But do you mean modifying ITypeMappingSource or something like GlobalTypeMapper.MapEnum(...)?

I'm not sure exactly what the API would look like, but it could be some sort of direct static call on NpgsqlTypeMappingSource, which updates a static list of registered user-defined ranges that are later consulted when trying to get a mapping.

This will definitely not be something on NpgsqlConnection.GlobalTypeMapper, which are ADO.NET-level things - enums and user-defined ranges differ on this. In order to use an enum at the ADO.NET level, it's the user's responsibility to define a compatible CLR enum type and inform Npgsql about a mapping between this type and a PostgreSQL enum. Ranges, however, don't involve a user-defined CLR type - all range types are simply NpgsqlRange<T>. This is why at the ADO.NET level a mapping action is required for enums and composites (except for unmapped mode), but not for ranges.

At the EF Core level, since the user has to perform an ADO.NET mapping action for enums, the type mapper can simply communicate with the ADO.NET layer and load that; this is why nothing is needed for enum support in EF Core beyond the ADO.NET NpgsqlConnection.GlobalTypeMapper.MapEnum() call. However, since there's no need for an ADO.NET mapping for user-defined ranges, we need to introduce a new one at the EF Core level to make TypeMappingSource aware of the range.

As a general question though, how common do we expect it to be that one application is hosting multiple contexts that point to different databases where user-defined types might conflict?
I'm thinking here that while I may have multiple contexts per program (which may even point to different physical servers), it would be an error on my part to expect that a type from one database is present in the other.
In other words, what does global registration of user-defined types really break? It might let EF Core translate a parameter that causes PostgreSQL to complain... but that doesn't seem like such a big deal.

I completely agree that global registration is sufficient, and that we can't support different databases with different conflicting mappings. For one thing, the only way to support this would be a scoped type mapper, which we already know we're not doing. But we still have to add a new EF Core API for adding that global mapping for the case of user-defined ranges.

Hope the above clarifies everything. Since the option of a scoped type mapper has been clearly ruled out, I can work on the new API for global registration of user-defined ranges in the near future.

Thinking about this some more, rather than a static call to NpgsqlTypeMappingSource we could introduce a nicer syntax similar to what our plugins use, via a context option. Unless I'm mistaken, this could also enable different contexts to have different user-defined range mappings, giving us the best of all worlds... Will investigate.

Hope the above clarifies everything. Since the option of a scoped type mapper has been clearly ruled out, I can work on the new API for global registration of user-defined ranges in the near future.

I appreciate the background, this indeed helps to clear up how I think about the type mapping system.

The above should explain why it doesn't make sense to register enums, ranges or any type mapping via the model builder, which, well, builds a specific model... builder.ForNpgsqlHasEnum() declares the existence of an enum type on the model, but only for the purpose of creating that enum when applying migrations to the database. The type mapping for that enum must be registered separately with the TypeMappingSource.

Maybe I'm thinking about this the wrong way, but is there anything blocking us from expanding the role of ForNpgsqlHasEnum()?

Could me make it generic (using those new enum type constraints) and have it do an "add if not present" operation on the type mappers?

I understand the responsibility separation the current design follows, but it still feels fractious in some ways.

Ranges, however, don't involve a user-defined CLR type - all range types are simply NpgsqlRange<T>. This is why at the ADO.NET level a mapping action is required for enums and composites (except for unmapped mode), but not for ranges.

A bit off topic but from the sounds of this, mapping composites to ValueTuple<...> would follow the logic of ranges and avoid the ADO.NET registration, right?

Maybe I'm thinking about this the wrong way, but is there anything blocking us from expanding the role of ForNpgsqlHasEnum()?

There are reasons for avoiding this. For one thing, INpgsqlOption (on which the new range mappings would live) implements ISingletonOptions, which signals EF Core that the option is a singleton. If I understand everything correctly, this allows two DbContext using Npgsql to have different sets of options.

Another less important reason, is that ForNpgsqlHasEnum() is an extension method on ModelBuilder, which is definitely not meant to interact with internal services in the way this would require. Not saying it's impossible, but it's not the way things are meant to work.

I've done some quick prototyping and it looks really good. The user would only needs to do the following:

c# protected override void OnConfiguring(DbContextOptionsBuilder builder) => builder.UseNpgsql(ConnectionString, o => o.MapRange(typeof(float), "floatrange"));

A bit off topic but from the sounds of this, mapping composites to ValueTuple<...> would follow the logic of ranges and avoid the ADO.NET registration, right?

I'm not sure yet. The idea is first to tackle the question of composite tuple mapping at the ADO.NET level, since it's a useful feature to have there as well. How exactly this will work at the ADO.NET level remains to be seen - will this require an explicit mapping call? Or will this work even work without it? The latter seems like it could be possible, and would be ideal? This would determine what the EF Core implementation would look like.

Note that even if we implement composite tuple mapping at the ADO level, we shouldn't dump the current mapping capabilities to a user-provided type (class or struct), since users may prefer using that. I see the new tuple mapping feature as something that would replace the current unmapped composite support, which uses dynamic (not sure why I went there). So we'd have two, equally recommended methods - either you have your own type and set up an explicit mapping to it, or we just use a value tuple. We would probably want to support both at the EF Core level as well.

I'll probably be submitting a PR later today implementing the user-defined range support.

One more note on why the model definition (ForNpgsqlHasRange()) needs to be separate from the mapping definition (UseNpgsql("", o => o.MapRange()))...

The concern of ForNpgsqlHasRange() is mainly to get the range type created in the DB. This means it has to handle the various range type parameters (canonical function, collation...) necessary to generate the CREATE TYPE ... AS RANGE statement. Not all applications do this - some EF Core application want to use an already-existing database where everything is in place. So mapping and creation need to be separate...

Was this page helpful?
0 / 5 - 0 ratings