Elixir version (elixir -v):
Erlang/OTP 20 [erts-9.3] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false]
Elixir 1.6.3 (compiled with OTP 19)
Database and version (PostgreSQL 9.4, MongoDB 3.2, etc.): PostgreSQL 9.6.1
I've created a failing test case here with a description:
https://github.com/blatyo/upsert_issue/blob/master/test/upsert_issue_test.exs#L7-L25
@id "e87296f6-860e-45d1-9515-fc3639fbac40"
test "upsert update case omits values" do
%Post{}
|> Post.changeset(%{id: @id, deleted_at: DateTime.utc_now()})
|> Repo.insert!(conflict_target: :id, on_conflict: :replace_all)
# Value in database has deleted_at set to a datetime, but new struct here has
# deleted_at set to nil. So when the update portion of the upsert is called,
# Ecto has omitted deleted_at's value. That means in the struct returned
# deleted_at is nil, but Ecto never set the value of deleted_at to nil in the
# database.
updated_post =
%Post{}
|> Post.changeset(%{id: @id, deleted_at: nil})
|> Repo.insert!(conflict_target: :id, on_conflict: :replace_all)
db_post = Repo.get!(Post, @id)
assert db_post.deleted_at == updated_post.deleted_at
end
The update portion of upsert should pass all values given as params to the changeset in the case of :replace_all.
I think you're right. The problem seems to be that we only consider the changes field of the changeset in here: https://github.com/elixir-ecto/ecto/blob/c7bf196d9f4833761ae01c05d7b87bbcc2a29cc5/lib/ecto/repo/schema.ex#L202-L204
I'll start to work in it this week.
@michalmuskala the problem here is "cast" it won't return nil changes. We could add new option to force nil changes.
I believe it's more subtle than that. If the value passed in params matches the default for the schema, then cast won't return those changes. That's fine for the insert part of the clause, but not the update.
Yeah, we need to lift the values from the struct.
Yes, we shouldn't change how cast works - that would be a huge change. We should use in upsert similar logic that we use in insert to lift all the values off the changeset.
I might be stating the obvious, but it's probably important that correct overrides also get set when a Ecto.Schema-type struct is passed in to insert() (as opposed to a Changeset), as with #2017 and the project I run into this issue in.
Insert with schemas and changesets should be correct - we pass all the values that are different than nil - no matter when they are placed.
We're skipping nil values because otherwise database-level defaults wouldn't work.
nil values are very relevant in upserts though. This is the current behaviour (slightly pseudo-code-ish but should be clear enough):
defmodule User do
@primary_key {:id, :binary_id, autogenerate: false}
schema "users" do
field(:username, :string)
field(:full_name, :string)
end
end
user_id = UUID.uuid4()
Repo.insert(%User{id: user_id, username: "john", full_name: "John Smith"})
Repo.insert(%User{id: user_id, username: "steve", full_name: nil}, on_conflict: :replace_all)
Repo.get(User, user_id)
# => %User{id: user_id, username: "steve", full_name: "John Smith"}
I think these semantics are counter-intuitive and unhelpful for replace_all. The only workaround I have right now is constructing a Changeset that pretends that all nullable values were originally set to some unlikely values (uuids?), so that the upsert picks up the nil as a relevant change.
I can write a full repro if that helps.
Yes, the behaviour for upserts is incorrect. But you mentioned it being incorrect for insert as well, and I don't think that's true.
Why don't you think that's true?
Oh, sorry, I wasn't specific enough, I meant upserts exclusively
IMHO lifting all values from the struct might not be very ideal in case we have default fields in the schema, like:
defmodule Post do
use Ecto.Schema
field(:title)
field(:draft, :boolean, default: true)
end
%Post{}
|> Ecto.Changeset.cast(%{title: "blah", draft: false})
|> Repo.insert!(conflict_target: :id, on_conflict: :replace_all)
%Post{}
|> Ecto.Changeset.cast(%{title: "blah"})
|> Repo.insert!(conflict_target: :id, on_conflict: :replace_all)
Would it be very implicit and causing surprises if we change draft to be true in the second insert?
Would it make sense to update the documentation of :replace_all to be something like _update existing entry with changes in the given Ecto.Changeset_? Also users can always use Ecto.Changeset.force_change/3 to make sure desired changes are being set?
@qcam I will look at your comments and give you an answer tomorrow (somewhere in the next 12-24h). If you would rather look at another issue meanwhile, maybe #2615 is a good one? Or pick anything else you may be interested on. :)
@qcam I believe it should send all values provided to cast and timestamps as part of the update (excludes defaults) and sends everything for the insert part.
On replace_all, all of the values sent to the database on insert should be
sent also on the “upsert” part. So we don’t need to lift from the struct
per se, we can use the values we have already computed.
Actually, please ignore my last comment. What I described is how it behaves today and what @qcam mentioned. I need to think about this a bit more.
I agree with @blatyo here. On replace_all, we should keep all of the values in the database. I also think the solution is relatively straight-forward, no struct lifting is necessary:
diff --git a/lib/ecto/repo/schema.ex b/lib/ecto/repo/schema.ex
index 58b5202e..db9d63e8 100644
--- a/lib/ecto/repo/schema.ex
+++ b/lib/ecto/repo/schema.ex
@@ -527,7 +527,7 @@ defmodule Ecto.Repo.Schema do
end
end
- defp on_conflict(on_conflict, conflict_target, schema_meta, header, counter_fun, adapter) do
+ defp on_conflict(on_conflict, conflict_target, schema_meta, counter_fun, adapter) do
%{source: source, schema: schema, prefix: prefix} = schema_meta
case on_conflict do
@@ -538,15 +538,13 @@ defmodule Ecto.Repo.Schema do
:nothing ->
{:nothing, [], conflict_target}
:replace_all ->
- {header, [], conflict_target}
+ {replace_all_fields!(schema), [], conflict_target}
{:replace, keys} when is_list(keys) and conflict_target == [] ->
raise ArgumentError, ":conflict_target option is required when :on_conflict is replace"
{:replace, keys} when is_list(keys) ->
{keys, [], conflict_target}
- :replace_all_except_primary_key when is_nil(schema) ->
- raise ArgumentError, "cannot use :replace_all_except_primary_key on operations without a schema"
:replace_all_except_primary_key ->
- {header -- schema.__schema__(:primary_key), [], conflict_target}
+ {replace_all_fields!(schema) -- schema.__schema__(:primary_key), [], conflict_target}
[_ | _] = on_conflict ->
from = if schema, do: {source, schema}, else: source
query = Ecto.Query.from from, update: ^on_conflict
@@ -558,6 +556,14 @@ defmodule Ecto.Repo.Schema do
end
end
+ defp replace_all_fields!(kind, nil) do
+ raise ArgumentError, "cannot use #{inspect(kind)} on operations without a schema"
+ end
+
+ defp replace_all_fields!(_kind, schema) do
+ schema.__schema__(:fields) ++ schema.__schema__(:embeds)
+ end
+
defp on_conflict_query(query, from, prefix, counter_fun, adapter, conflict_target) do
counter = counter_fun.()
Thoughts? Am I missing something?
I am going ahead with this one then!
Fixed: 9c120798a923f4560c427af79ebdea96b7b3997e
If anybody needs a fix and can't move to Ecto 3 quite yet, here is a workaround (credit to @alexgaribay).
defp replace_query(%Changeset{data: %mod{}, changes: changes}) do
# Set non-applied values to default values
fields_to_drop = mod.__schema__(:associations) ++ mod.__schema__(:embeds) ++ [:__meta__]
all_changes =
mod
|> Map.from_struct()
|> Map.merge(changes)
|> Map.drop(fields_to_drop)
Enum.reduce(all_changes, from(x in mod), fn {key, new_value}, query ->
update(query, [x],
set: [
{^key, ^new_value}
]
)
end)
end
replace_query = replace_query(changeset)
opts = [
on_conflict: replace_query,
conflict_target: :id,
returning: true
]
Repo.insert(changeset, opts)
Most helpful comment
I'll start to work in it this week.