Ecto: Insert_all/3 fails to set timestamps

Created on 2 Feb 2017  路  20Comments  路  Source: elixir-ecto/ecto

Environment

Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Elixir 1.3.3
ecto 2.1.3
Postgres

Current behaviour

I got the following schema

  schema "notifications" do
    field :subject, :string
    field :subject_id, :integer
    field :satisfied_at, Ecto.DateTime
    field :values, {:array, :string}
    belongs_to :user, RecordApp.User

    timestamps()
  end

Which i created with the following migration

  use Ecto.Migration

  def change do
    create table(:notifications) do
      add :subject, :string
      add :subject_id, :integer
      add :satisfied_at, :utc_datetime
      add :values, {:array, :string}
      add :user_id, references(:users, on_delete: :nothing)

      timestamps()
    end
    create index(:notifications, [:user_id])
  end

I'm trying to insert a bulk of records, using the following code.

    entries = Enum.map(users, fn (user) ->
      %{user_id: user.id, subject: subject, subject_id: file.id, values: values}
    end)
    Repo.insert_all(Notification, entries)

Which gives me a

** (exit) an exception was raised:
    ** (Postgrex.Error) ERROR 23502 (not_null_violation): null value in column "inserted_at" violates not-null constraint

    table: notifications
    column: inserted_at

Doesn't seem to break, when i just insert a single notification, am i doing anything wrong? :)

Most helpful comment

Just had to deal with this, for the next person who does...

# for a schema 'Table' with timestamps()

data =
  data
    |> Enum.map(fn(row) ->
        row
          |> Map.put(:inserted_at, Timex.now)
          |> Map.put(:updated_at, Timex.now)
      end)

Repo.insert_all(Table, data)

All 20 comments

insert_all/3 is meant as a low level operation translating directly into a single database INSERT statement with very straightforward semantics. It's not accidental that it accepts plain maps and not schema structs.

From the docs:

When a schema is given, the values given will be properly dumped before being sent to the database. If the schema contains an autogenerated ID field, it will be handled either at the adapter or the storage layer. However any other autogenerated value, like timestamps, won鈥檛 be autogenerated when using insert_all/3. This is by design as this function aims to be a more direct way to insert data into the database without the conveniences of insert/2. This is also consistent with update_all/3 that does not handle timestamps as well.

@michalmuskala

This is sad actually, the Book from plataformatec has this example:

MyApp.Repo.insert_all(Post, [[title: "hello", body: "world"],
                             [title: "another", body: "post"]])

And the Post Schema example has the timestamps included, so that would produce an error in real life.

I think, this issue can be solved, and there are several options.

  1. In do_insert_all check the schema for inserted_at field, and check if that field is passed with the values, if not generate.

  2. adding an option param to the insert_all & update_all functions, like timestamps: true

  3. inside the migration at the database level, for example by setting default: fragment("now()")

Either its opt-in or opt-out, the lack of proper autogen timestamps feature in insert_all & update_all is an issue IMHO, and will come up more frequently, when people are starting to use that new api.

I think the 1. option would be the cleanest one (maybe with 2. in combo), it would require that the user is passing a Schema to the function, and would't work with "table_names". This would be a valid limitation, maybe adding an option to the timestamps macro, and make it an opt-in feature, to not break running code.

@josevalim & @michalmuskala If some of the options is making sense, I would love to take a look on this

update_all worked like this since it was introduced, and there weren't many complaints. For insert_all it comes up more often, though. I'm torn on this.

and there weren't many complaints

That might be because it is not noticed that the updated field is not updating, few people would check that, but it would be very bad for auditing purposes...

I'm torn on this.

Ok, but the 2. option insert_all/update_all("posts", [...], timestamps: true) seams to be very safe IMO.

The problem is we don't know what time timestamp fields are in that case, so it's not possible for fill them. The only possibility could be when Post schema is passed. This means the option would work only some of the time and that's why I'm not convinced.

  • If the Schema is passed and timestamps fields are not set -> generate (no options)
    Would limit that feature, but it's reasonable.

The main idea is that: The timestamps feature in Schemas and migrations is expected to work by default.

Just had to deal with this, for the next person who does...

# for a schema 'Table' with timestamps()

data =
  data
    |> Enum.map(fn(row) ->
        row
          |> Map.put(:inserted_at, Timex.now)
          |> Map.put(:updated_at, Timex.now)
      end)

Repo.insert_all(Table, data)

@madasebrof I'd assign now = Timex.now ahead of time and pass that in so your inserted_at and updated_at match (right now they would differ slightly)

Ecto.DateTime.utc can also be used as an alternative to Timex.now

The Ecto.DateTime module is deprecated as of Ecto 2.2 (and probably will be removed in 3.0) in favour of the built-in Elixir calendar types, where NaiveDateTime.utc_now or DateTime.utc_now allow retrieving current time.

Thanks @michalmuskala, missed that one :)

Also just dealt with this, but have created a insert_batch in my Repo module which allows me to pass in a list of structs:

  def insert_batch(structs) do
    model = List.first(structs).__struct__

    rows = structs
           |> Enum.map(&model.convert_to_map/1)
           |> Enum.map(&add_timestamps/1)

    Repo.insert_all(model, rows)
  end

  def convert_to_map(struct) do
    struct
    |> Map.delete(:__struct__)
    |> Map.delete(:__meta__)
    |> Map.delete(:id)
  end

  def add_timestamps(row) do
    %{row |
      inserted_at: DateTime.utc_now,
      updated_at: DateTime.utc_now
    }
  end

Might be a bit of ceremony, but at least I can now do Repo.insert_batch([%Model{}, %Model{}]). Note: this won't play nice with associations or virtual attributes on the struct. For that I have a convert_to_map in the model which deletes those virtual attributes/associations. I'm sure it could be amended to handle upsert, too, by removing id from new records.

If this is by design, what is the recommended higher-level way of doing bulk insert? If there is none, should it not be provided?

insert_all is primarily about maximum efficiency for bulk transfer of data from application to the database. This means insert_all is inherently incompatible with changesets. A single changeset operation can result in multiple database queries (nested changesets, prepare_changes, etc), while insert_all is all about guaranteeing a single database query for the entire operation. If you don't care about having a single query Enum.map(changesets, &Repo.insert/1) or a slightly more complex multi version are the way to go.

Enum.reduce(changesets, {0, Multi.new()}, fn changeset, {n, multi} ->
  {n + 1, Multi.insert(multi, n, changeset)}
end)
|> elem(1)
|> Repo.transaction

I found that insert_all works with inserted_at timestamps if this field is defined at migration level but not in the Ecto schema.

For example:

  def change do
    alter table(:events) do
      add :inserted_at, :utc_datetime_usec, default: fragment("now() at time zone 'utc'")
    end
  end

@josemonteiro It probably worked for you not because it's not in the schema, but because it has a default.

@futpib I believe it's due to both things.
If I add it to the Ecto schema it stops working, because Ecto will override the database schema default with the Ecto schema default. So, "hiding" it from Ecto schema seems to be a requirement.

@josemonteiro btw, its possible to just add other field options like default option to the timestamps function. See https://github.com/elixir-ecto/ecto_sql/blob/v3.1.1/lib/ecto/migration.ex#L852

timestamps [
  default: fragment("now() at time zone 'utc'")
]

@michalmuskala Would it be safe, to mention it in the docs?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

kelostrada picture kelostrada  路  3Comments

a12e picture a12e  路  4Comments

atsheehan picture atsheehan  路  4Comments

AndresOsinski picture AndresOsinski  路  5Comments

fuelen picture fuelen  路  3Comments