App.Model
|> App.Repo.insert_all([
%{name: "name1", data: "data1"},
%{name: "name2", data: "data2"},
], [
on_conflict: :replace_all,
conflict_target: [:name],
returning: [:id]
])
SELECT id, name FROM models;
id | name
----+----------------
1 | name1
2 | name2
App.Model
|> App.Repo.insert_all([
%{name: "name1", data: "data1"},
%{name: "name2", data: "data2"},
], [
on_conflict: :replace_all,
conflict_target: [:name],
returning: [:id]
])
SELECT id, name FROM models;
id | name
----+----------------
3 | name1
4 | name2
Upsert will not change ID primary key of already existing rows, just update other fields.
Can you please check the query we are sending to the database? And can you can then try to run this query directly against the database and see how it behaves? I believe there is nothing we can do here, it is all the DB behaviour.
Real DB request is something like this ( one row in request for example )
INSERT INTO "models"
("name","id","data")
VALUES
("name1",DEFAULT,"data1")
ON CONFLICT ("name") DO UPDATE
SET
"name" = EXCLUDED."name",
"id" = EXCLUDED."id",
"data" = EXCLUDED."data"
RETURNING
"id";
It seems "id" = EXCLUDED."id" is the cause of this behaviour described above
Yes, I think this is a bug. We should replace all except the primary key. Thank you.
Thx for fast feedback!
I am wondering why it is setting "id" to DEFAULT. Do you also have some entries that include the ID in the list given to insert_all?
In this example ID is not present in data to upsert, just uniq_key name
It seems behaviour is fixed when "id" = EXCLUDED."id" is removed from request
@timCF Thanks for checking.
I am worried about changing :replace_all because we can only remove primary keys if a schema is available - it may also be a breaking change if someone is relying on this behaviour. So I will push a new option, :replace_all_except_primary_key that should fit your use case. We may want to make it more generic in the future, {:replace_all, except: [...]}, but I think this case is common enough to warrant its own option, especially given that it should likely be preferred over :replace_all.
Thx! {:replace_all, except: [...]} sounds very good, and replace_all_except_primary_key is also ok for my current task :)
I am curious as to the use case where the PK is expected to change by default in the context of an upsert operation.
is the {:replace_all, except: [...]} being tracked anywhere? I'd like to subscribe to it so I know when it gets merged. My use case is doing bulk upserts based on a unique external_id. I need to preserve the PK (for associations) and match the new data based on the external_id. Something like this:
App.Model
|> App.Repo.insert_all([
%{external_id: "id1", data: "data1"},
%{external_id: "id2", data: "data2"},
], [
on_conflict: {:replace_all, except: [:id, :external_id]},
conflict_target: :external_id,
returning: true
])
@josevalim Is the fix only gonna be included in v3.0?
@kcurtin there are no plans for it right now so it isn't tracked anywhere. You can do App.Model.__schema__(:fields) or similar to get all fields and remove the ones you want.
@safwank yes because it augments the adapter APIs and for this reason it can't be included in a point release.
@josevalim Are you suggesting a temporary workaround would be to do something along these lines?
App.Model.__schema__(:fields)
|> List.delete(:id)
|> App.Repo.insert_all(...)
@safwank not exactly like that but something like that, yes, correct.
Hah, just realized that code wouldn't actually work, but point taken 馃憤
I think this bug might still be present in Ecto 3.0.6.
My schema looks like this:
defmodule MyApp.Source do
use Ecto.Schema
@primary_key {:id, :binary_id, autogenerate: true}
And my upsert function looks like this:
def upsert_source(attrs \\ %{}) do
%Source{}
|> Source.changeset(attrs)
|> Repo.insert(
conflict_target: [:some_unique_attribute],
on_conflict: :replace_all_except_primary_key
)
end
Example IEx session of the problem in action:
# initial insert...
iex(1)> MyApp.Context.upsert_source(%{"feed_url" => "example.com", "title" => "yo"})
[debug] QUERY OK db=3.3ms decode=1.6ms queue=1.4ms
INSERT INTO "sources" ("feed_url","title","inserted_at","updated_at","id") VALUES ($1,$2,$3,$4,$5) ON CONFLICT ("feed_url") DO UPDATE SET "feed_url" = EXCLUDED."feed_url","source_url" = EXCLUDED."source_url","title" = EXCLUDED."title","subtitle" = EXCLUDED."subtitle","image" = EXCLUDED."image","link" = EXCLUDED."link","language" = EXCLUDED."language","summary" = EXCLUDED."summary","author" = EXCLUDED."author","inserted_at" = EXCLUDED."inserted_at","updated_at" = EXCLUDED."updated_at" ["example.com", "yo", ~N[2019-01-15 22:05:59], ~N[2019-01-15 22:05:59], <<159, 165, 81, 21, 219, 48, 64, 193, 133, 214, 173, 254, 31, 34, 74, 95>>]
{:ok,
%MyApp.Source{
__meta__: #Ecto.Schema.Metadata<:loaded, "sources">,
author: nil,
entries: #Ecto.Association.NotLoaded<association :entries is not loaded>,
feed_url: "example.com",
id: "9fa55115-db30-40c1-85d6-adfe1f224a5f",
image: nil,
inserted_at: ~N[2019-01-15 22:05:59],
language: nil,
link: nil,
source_url: nil,
subtitle: nil,
summary: nil,
title: "yo",
updated_at: ~N[2019-01-15 22:05:59]
}}
# Should update just the 'title' attribute
iex(2)> MyApp.Context.upsert_source(%{"feed_url" => "example.com", "title" => "your"})
[debug] QUERY OK db=4.9ms queue=3.0ms
INSERT INTO "sources" ("feed_url","title","inserted_at","updated_at","id") VALUES ($1,$2,$3,$4,$5) ON CONFLICT ("feed_url") DO UPDATE SET "feed_url" = EXCLUDED."feed_url","source_url" = EXCLUDED."source_url","title" = EXCLUDED."title","subtitle" = EXCLUDED."subtitle","image" = EXCLUDED."image","link" = EXCLUDED."link","language" = EXCLUDED."language","summary" = EXCLUDED."summary","author" = EXCLUDED."author","inserted_at" = EXCLUDED."inserted_at","updated_at" = EXCLUDED."updated_at" ["example.com", "your", ~N[2019-01-15 22:06:04], ~N[2019-01-15 22:06:04], <<27, 42, 29, 10, 111, 178, 64, 86, 163, 50, 40, 193, 72, 200, 140, 250>>]
{:ok,
%DailyBugle.Subscriptions.Source{
__meta__: #Ecto.Schema.Metadata<:loaded, "sources">,
author: nil,
entries: #Ecto.Association.NotLoaded<association :entries is not loaded>,
feed_url: "example.com",
id: "1b2a1d0a-6fb2-4056-a332-28c148c88cfa",
image: nil,
inserted_at: ~N[2019-01-15 22:06:04],
language: nil,
link: nil,
source_url: nil,
subtitle: nil,
summary: nil,
title: "your",
updated_at: ~N[2019-01-15 22:06:04]
}}
Note the new ID on the second insert.
If I use raw SQL to perform the same INSERTs the problem does not happen.
Example psql session:
mydb=# select * from sources;
id | title | feed_url | source_url | subtitle | author | image | link | language | summary | inserted_at | updated_at
----+-------+----------+------------+----------+--------+-------+------+----------+---------+-------------+------------
(0 rows)
mydb=# INSERT INTO "sources" ("feed_url","title","inserted_at","updated_at","id") VALUES ('example.com/brss','yo yo yo',now(),now(),gen_random_uuid()) ON CONFLICT ("feed_url") DO UPDATE SET "feed_url" = EXCLUDED."feed_url","source_url" = EXCLUDED."source_url","title" = EXCLUDED."title","subtitle" = EXCLUDED."subtitle","image" = EXCLUDED."image","link" = EXCLUDED."link","language" = EXCLUDED."language","summary" = EXCLUDED."summary","author" = EXCLUDED."author","inserted_at" = EXCLUDED."inserted_at","updated_at" = EXCLUDED."updated_at";
INSERT 0 1
mydb=# select * from sources;
id | title | feed_url | source_url | subtitle | author | image | link | language | summary | inserted_at | updated_at
--------------------------------------+----------+------------------+------------+----------+--------+-------+------+----------+---------+----------------------------+----------------------------
d01982ab-7038-44fc-9d46-cc94362ab13d | yo yo yo | example.com/brss | | | | | | | | 2019-01-15 17:19:29.493673 | 2019-01-15 17:19:29.493673
(1 row)
mydb=# INSERT INTO "sources" ("feed_url","title","inserted_at","updated_at","id") VALUES ('example.com/brss','hey hey hey',now(),now(),gen_random_uuid()) ON CONFLICT ("feed_url") DO UPDATE SET "feed_url" = EXCLUDED."feed_url","source_url" = EXCLUDED."source_url","title" = EXCLUDED."title","subtitle" = EXCLUDED."subtitle","image" = EXCLUDED."image","link" = EXCLUDED."link","language" = EXCLUDED."language","summary" = EXCLUDED."summary","author" = EXCLUDED."author","inserted_at" = EXCLUDED."inserted_at","updated_at" = EXCLUDED."updated_at";
INSERT 0 1
daily_bugle_dev=# select * from sources;
id | title | feed_url | source_url | subtitle | author | image | link | language | summary | inserted_at | updated_at
--------------------------------------+-------------+------------------+------------+----------+--------+-------+------+----------+---------+----------------------------+----------------------------
d01982ab-7038-44fc-9d46-cc94362ab13d | hey hey hey | example.com/brss | | | | | | | | 2019-01-15 17:19:50.487509 | 2019-01-15 17:19:50.487509
(1 row)
I'm using Postgres 10.
Most helpful comment
Yes, I think this is a bug. We should replace all except the primary key. Thank you.