insert_all will allow multiple values to be inserted at once:
@spec insert_all(source :: Ecto.Schema.t | binary, values :: [map | Keyword.t], opts :: Keyword.t) :: {count, list | nil}
If a binary is given, values will be inserted as is in the table. If a schema was given, values will be properly dumped before insertion. In both cases, missing values in the table/schema won't be filled automatically.
For example, imagine the following schema:
schema "posts" do
field :a
field :b
field :c
end
The following insert:
Repo.insert_all Post, [%{a: 1}, %{b: 2}]
will generate the following SQL
INSERT INTO "posts" (a, b) VALUES (1, DEFAULT) (DEFAULT, 2)
notice how the field c was ignored in both cases.
This feature is required to implement many to many support (#1169).
insert_all should also allow the prefix syntax: {prefix, source}
The function will return a tuple the number of entries added, similar to update_all and delete_all. The second element of the tuple is nil unless the :returning option is given, as in #460.
+1
I have added basic support for insert_all, passing only sources, no schemas. If we want to support schemas, we would go into deeper discussions like: should we dump fields? What about autogenerate? And we can side-step this discussion for now until we have more use-cases.
Is there a way to do all those things from the caller?
Given a list of changesets, how do I use insert_all? I would think that something along the lines of:
|> autogenerate_fields
|> to_source
Would be extremely useful. I think sidestepping default behavior is good, but providing some "blessed paths" to explicitly including that behavior would be very useful.
This might already exist and I just haven't found it.
Enum.map(changesets, &Repo.insert/1)
If you have changesets, we have constraints, associations, embeds, prepare_changes hooks and a whole bunch of things that may run. It does not make sense to support them in insert_all exactly because insert may perform multiple operations behind the scenes.
Yes, but that comes with significant performance tradeoffs. It's also an all or nothing proposition right now.
I understand that insert_all is not meant to work within the ecosystem of schemas, that's fine. But there's no clear path to getting my data out of the schema ecosystem and into a form that is compatible with insert_all. Right now I've got:
{:ok, current_time} = Ecto.DateTime.cast(:erlang.timestamp |> :calendar.now_to_datetime)
models = models
|> Enum.map(&Map.from_struct/1)
|> Enum.map(&Map.drop(&1, [:__meta__, :id]))
|> Enum.map(&Map.merge(&1, %{inserted_at: current_time, updated_at: current_time}))
Which does the job but is a very specific solution that requires me to know all sorts of implementation details like :__meta__ being a field in the struct (and to remove :id because my table is configured to autogenerate on insert).
What I'm looking for might even be a simple Ecto.Schema.to_source/1 or something like that, but there should be some guidance on "How do I convert an Ecto.Schema struct to a source for insert_all?"
I cannot think of anything shorter than this:
fields = Thing.__schema__(:fields) -- [:id]
extra = %{inserted_at: Ecto.DateTime.utc, updated_at: Ecto.DateTime.utc}
structs = Enum.map structs, & &1 |> Map.take(fields) |> Map.merge(extra)
Repo.insert_all Thing, structs
In one way or the other you will have to ignore some fields and add your own and I don't think we could have a single function that would cleanly automate that.
Yeah. You're probably right. A silver bullet function isn't really doable.
If I wasn't clear, this isn't so much about feasibility, but about avoiding confusion. The first thing I did when trying insert_all was to try inserting a bunch of changesets. When that didn't work, I went to the documentation and saw they weren't supported but there wasn't a clear direction to go. That's the bug here.
Perhaps documenting something like what you have above as an example would solve that. Maybe just a simple source = Map.take(model, Model.__schema__(:fields)) and a note about how depending on the schema you may need to handle the auto generated ID and the timestamps (perhaps an example of adding timestamps? Somehow I missed utc/0.)
It's entirely possible a different direction is going to be better. I just wanted to bring up some friction I encountered.
I will keep an eye open then. The reason I don't have to have an "official" from changeset to data for insert_all is because it will lead people to use insert_all only for performance without fully understand the drawbacks it brings compared to insert. To me, the case for insert_all is exactly when we are not working with structured data at all.
I'm not sure I understand the drawbacks. Or at least, the ones I see are
purely due to Ecto's architecture. Ignoring associations (that is a can of
worms for sure), wouldn't a theoretical API like:
|> validate_all
|> prepare_all
|> autogenerate_all
|> insert_all
Wouldn't that provide all the functionality of Enum.map(models,
&Repo.insert/1)? The only downside I see is the fact that everything
succeeds and fails together (one bad piece of data breaks everything).
validate_all and prepare_all are actually pretty easy to build.
autogenerate_all might be adapter dependent which makes it trickier.
Again, ignoring associations, is there anything I'm missing?
On Sun, Feb 21, 2016, 1:22 PM José Valim [email protected] wrote:
Closed #1167 https://github.com/elixir-lang/ecto/issues/1167.
—
Reply to this email directly or view it on GitHub
https://github.com/elixir-lang/ecto/issues/1167#event-558664360.
If you take the hardest part (associations and embeds), sure, the rest is easier. :) prepare_all also means you need to wrap your code inside transactions. At least as it is in changesets today.
Ah, prepare is meant to be part of the transaction? OK, that's hard to get
around. I concede. :)
I'm working on some bulk insert performance tuning, so I'll probably have
more feedback.
On Sun, Feb 21, 2016, 1:40 PM José Valim [email protected] wrote:
If you take the hardest part (associations and embeds), sure, the rest is
easier. :) prepare_all also means you need to wrap your code inside
transactions. At least as it is in changesets today.—
Reply to this email directly or view it on GitHub
https://github.com/elixir-lang/ecto/issues/1167#issuecomment-186923600.
Most helpful comment
Yes, but that comes with significant performance tradeoffs. It's also an all or nothing proposition right now.
I understand that insert_all is not meant to work within the ecosystem of schemas, that's fine. But there's no clear path to getting my data out of the schema ecosystem and into a form that is compatible with insert_all. Right now I've got:
Which does the job but is a very specific solution that requires me to know all sorts of implementation details like
:__meta__being a field in the struct (and to remove :id because my table is configured to autogenerate on insert).What I'm looking for might even be a simple
Ecto.Schema.to_source/1or something like that, but there should be some guidance on "How do I convert an Ecto.Schema struct to a source for insert_all?"