Ecto: Improvements to paramterized types

Created on 12 Aug 2020  Â·  19Comments  Â·  Source: elixir-ecto/ecto

  • [x] Add docs to all Ecto.ParameterizedType callbacks

  • [ ] Discuss if and how to move the functionality in Ecto.Changeset to Ecto.ParameterizedType (cast, change, apply_changes)

  • [ ] Discuss if and how to move the functionality in Ecto.Repo.Schema to Ecto.ParameterizedType (Embedded.prepare, surface_changes)

All 19 comments

@josevalim I just want to confirm … If we agree on 2nd and 3rd point we would be able to write such code:

defmodule MyApp.Comment do
  use Ecto.Schema
  use MyLib

  schema "comments" do
    belongs_to :author, MyApp.User
    exactly_one_of :source, :belongs_to do
      polym :post, MyApp.Post
      # …
    end
    # helpful macro which would generate something like:
    # belongs_to :post, MyApp.Post
    # …
    # field :source, MyLib.ExactlyOneOf, polym_fields: [:post, …]
  end
end

which would allow to work on :source field just like on :post association (in case %MyApp.Post{…} would be used or in this example post_id would be the only non-null column in database) just by Ecto.ParameterizedType callbacks without any workarounds and/or rewriting any part of ecto/ecto_sql code, right?

I believe that true polymorphic associations would be a good example for a complex real world use case of all callbacks. Here ideally if in callbacks we would receive full schema instead of only a field value. Maybe it would be good to consider if Ecto.ParameterizedType would receive full schema when virtual option is set to true? Look that for example callback load (from database) of virtual (field) does not makes sense unless it have access to full schema or at least a specified list of fields and looks like the same applies to other (all?) callbacks.

WDYT?

Associations have other complexities that are not taken into account here, such as tracking transactions. Plus there are many other association specific callbacks. So what you suggest above is a non-goal, at the moment.

I mean, we haven't even solved embeds yet, talking about associations is definitely too early.

@josevalim Associations have other complexities that are not taken into account here, such as tracking transactions. Plus there are many other association specific callbacks. So what you suggest above is a non-goal, at the moment.

Wait, do we need "association specific callbacks" and other things you said to support that? For sure: I asked about doing things like calling cast_assoc/3 instead cast callback (1st point) in parameterized type (instead of changeset/2 function) + give full schema instead of only field value. I asked in terms of custom field which would only help handling multiple associations and not about custom associations.

Would it still be a problem?

talking about associations is definitely too early.

I completely understand that. I just thought that it would be already possible after implementing 1st point (+ maybe 2nd) + as said providing full schema instead of only value in callbacks.

@josevalim I take the first task.

Has reuse of Ecto.Enum been discussed? We are big users of embedded_schema coupled with https://github.com/commanded/commanded. Since there are many events/commands that may share data types, its common for use to declare Enums to be shared across many schemas. For example, using our internal version of Enum

defmodule App.Sport do
  # Generates an Ecto.Type
  use App.Ecto.Enum, values: [:baseball, :football]
end

defmodule App.Event do
  use Ecto.Schema

  embedded_schema do
    field :uuid, Ecto.UUID
    field :sport, App.Sport
  end
end

defmodule App.Event do
  use Ecto.Schema

  embedded_schema do
    field :uuid, Ecto.UUID
    field :sport, App.Sport
  end
end

I really do think that first class support for this should be considered

@davydog187 yup, the plan is for Ecto to ship with an Ecto.Enum type, it's just a matter of laying the proper foundation first.

Great. Sharing our implementation in case its useful https://gist.github.com/davydog187/0bec728c8f61218e9390aa43266b2acf

@davydog187 Once ParameterizedType would be finished and released I would update my work I have already started locally. It would be an advanced enum library with a fully reversible migrations and many other features.

Regarding your use case you can do 2 things now:

  1. Write a shared (between multiple schema modules) macros with predefined values
  2. Write your own ParameterizedType based on existing implementation, but changing just those runtime values to compile time attributes or just keep your current implementation. I don't think that ecto would give 2 different enum modules (one with runtime values and one with compile time values).

If you need share enums with the current in master, I would do:

field :sport, App.Sport, values: App.Enum.sports()

where

def sports, do: [:baseball, :football]

If the goal is to define multiple enum modules, than the custom type API that is available today is enough. The goal with adding paramterized modules was to not force uses to define a bunch of small modules.

@josevalim #3381 submitted to address the first point.

@TheFirstAvenger awesome, thank you!

Btw, for the next tasks, we don't need to tackle all of them at once. For example, we can start only with a PR on cast, then one for change, etc.

@josevalim Today I took a shot at implementing embeds in Changeset.cast (#3384). Let me know if it is on the right track or not.

Hi folks, so today I spent most of the morning trying to remove embeds from Ecto.Repo.Schema and, unfortunately, those changes are not really straight-forward. We can't reliably do it without:

  1. Slowing down inserts/updates by using transactions more frequently
  2. Hardcoding the logic to nested changesets in changeset.changes

The approach in 2 is quite doable. I have even pushed a commit that uses this approach to removes embed specific logic from Ecto.Changeset.apply_changes here: 6f5b35fe0b57bb93c72cec634171fcb298ee3448 - However, it feels like additional behaviour and not really part of parameterized types.

Here is the overall summary if we can move things to parameterized type:

  1. Ecto.Changeset.cast -> probably not doable
  2. Ecto.Changeset.change -> doable
  3. Ecto.Changeset.apply_changes -> doable if we track the nested changesets in changeset.change
  4. Ecto.Repo.Schema.surface_changes -> doable
  5. Ecto.Repo.Schema.prepare -> doable if we track the nested changesets in changeset.change

Maybe we can 2 and 4 when someone has a use case of parameterized types that needs those callbacks - but right now only embeds/assocs need them and we can't fully convert them, so I would rather stop here.

I already think we made great progress by removing :embed | :assoc from Ecto.Type itself, so at this point I am more inclined to take our victories and call this a day. :)

@josevalim right now only embeds/assocs need them and we can't fully convert them

Right now only those are hardcoded in ecto. :slightly_smiling_face:

But seriously … I believe that there are dozens if not hundreds of documents with nested fields just like json. 2nd most popular after json is obviously xml. We have no idea what kind of document optimizations (like jsonb) PostgreSQL would deliver, so parameterized types would have lots to say here and in many other places.

I worked a bit on Ecto.Changeset last time so I have not much idea how in detail it looks in Ecto.Schema, but for Ecto.ParameterizedType as an API it's always better to add some callback even if embeds would call other modules like now they are doing in for example change callback (call to Ecto.Changeset.Relation module).

More important here is that we would no longer need to write cast_embed or even worse our own cast_* functions (if needed), because something is so much hard coded and we need to copy-paste ecto code. The cleanup may be done in it's own time after making parameterized types API stable.

If you want I can create an example repository with comments what can be improved in really specific real-world use case.

xml is not a good example because it is orthogonal to embeds/associations, i.e. I should be able to persist embeds to an XML column without reimplementing embeds.

And even if you wanted to reimplement embeds, the parameterized API won't solve the duplication because:

  • I don't think we can parameterize cast_* in the first place without making the paremeterized type API extremely more complex for everyone. I think we should recognize that not everything will fit cast and we may need specific cast_* functions for richer types

  • even if we somehow solved cast for parameterized types, the parameterized API would only give a unified API, it won't make it any easier to reimplement embeds because the implementation of the embeds themselves are still private

In a nutshell, I am skeptical about extending the behaviour with the new callbacks without the confidence it will ever fit their purpose. Even worse, releasing the wrong API now means we will block ourselves in the future.

If you think said generalized API is possible, I'd definitely read such a proposal, but after working on it with you and @TheFirstAvenger, I personally don't see it happening right now, so I'd rather focus my efforts elsewhere.

In any case, I want to reemmphasize there was already huge progress in here and the current parameterized types are a major win. :heart: And while I don't think we should merge the remaining PRs, they are the type of work that we only know it isn't a good fit once we actually try to do it, so I really appreciate all of the effort. :)

Closing this per the above.

@josevalim I have check what I would need in parameterized types. In my case I have virtual field which simplifies working on multiple other fields. Here are things currently missing to make my code fully integrated with ecto:

  1. cast/2 should be: one of cast/2 or cast/3 (or maybe: on_cast/3)
    a) For cast/2 we have 1:1 as before
    b) For cast/3 we expect let's say %Ecto.Changeset{} to be passed and same to be returned in :ok tuple

  2. load/4 optional callback (or maybe: on_load/4)
    When source is Comment and query return has key parent I would like to be able to have a full schema added to callback like current load/3.

  3. For my specific use-case I would need join_assoc/2 for Ecto.Query. It's for simplify API i.e. not introduce new confusing macro.

My example:

defmodule EctoPolym do
  defmacro __using__(_pts \\ []) do
    quote do
      Module.register_attribute(__MODULE__, :polym_fields, accumulate: true)
      import EctoPolym, only: [exactly_one_of: 3, polym: 2, polym: 3]
    end
  end

  defmacro exactly_one_of(polym_name, polym_type, do: block) do
    quote do
      EctoPolym.ensure_not_nested(@polym_fields)
      @polym_name unquote(polym_name)
      @polym_type unquote(polym_type)
      _ = unquote(block)
      polym_fields = Module.delete_attribute(__MODULE__, :polym_fields)
      field @polym_name, EctoPolym.ExactlyOneOf, polym_fields: polym_fields, polym_type: @polym_type, virtual: true
    end
  end

  defmacro polym(name, assoc_module, opts \\ []) do
    quote bind_quoted: [assoc_module: assoc_module, name: name, opts: opts] do
      assoc_name = :"#{@polym_name}_#{name}"
      @polym_fields assoc_name

      case @polym_type do
        :belongs_to -> belongs_to name, assoc_module, opts
        :has_many -> has_many name, assoc_module, opts
        :many_to_many -> many_to_many name, assoc_module, opts
        type -> raise "expected belongs_to, has_many or many_to_many got: #{type}"
      end
    end
  end

  @doc false
  def ensure_not_nested([]), do: :ok

  def ensure_not_nested(_non_empty_list) do
    raise ArgumentError, "nested polymorphic associations are not supported"
  end
end

defmodule EctoPolym.ExactlyOneOf do
  use Ecto.ParameterizedType

  def cast(%{__struct__: _related} = data, changeset, %{schema: _schema}) do
    assocs = schema.__schema__(:associations)
    assoc = Enum.find(assocs, & &1.related == related)
    {:ok, Ecto.Changeset.cast_assoc(changeset, assoc.field)}
  end
  # …
end

defmodule EctoPolym.Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :author, :string
    field :body, :string
    field :title, :string
    has_many :comments, EctoPolym.Blog.Comment
  end

  def changeset(post, params) do
    Ecto.Changeset.cast(post, params, [:author, :body, :title])
  end
end

defmodule EctoPolym.Blog.Comment do
  use Ecto.Schema
  use EctoPolym

  schema "comments" do
    field :body, :string

    exactly_one_of :parent, :belongs_to do
      polym :comment, __MODULE__
      polym :post, EctoPolym.Blog.Post
    end

    has_many :comments, __MODULE__
  end

  def changeset(comment, params) do
    Ecto.Changeset.cast(comment, params, [:body, :parent])
  end
end

and with such code:

import Ecto.Query
alias EctoPolym.Blog.{Comment, Post}

Comment
|> from(as: :comment)
# here in assoc/2 we check if parent field in Comment schema implements optional callback
# if so it would work as same as:
# parent in subquery(query_returned_from_callback)
|> join(:inner, [comment: comment], parent in assoc(comment, :parent), as: :parent)
# I would like to be able to load parent to a proper relation in load/4 callback
|> select_merge([parent: parent], %{parent: parent})
|> EctoPolym.Repo.all()
# [%Comment{parent: %Post{…}, parent_comment: nil, parent_post: %Post{…}, …}, …]

Thoughts?

As I mentioned, associations touch a whole different part of Ecto and they should be considered separately from ParameterizedType. My recommendation is to play with the private Ecto.Aosociation API and see where it leads you. There have been efforts in making them public in the past but it was not concluded.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

alaadahmed picture alaadahmed  Â·  4Comments

brandonparsons picture brandonparsons  Â·  3Comments

jbence picture jbence  Â·  3Comments

yordis picture yordis  Â·  4Comments

wojtekmach picture wojtekmach  Â·  3Comments