Currently there is no way (other than using fragment) to run query like count(*). This can be problematic as count(<field>) on <field> that isn't indexed can be terribly slow (found by @mkaszubowski). The reason why this is happening is that by SQL standard COUNT(<field>) should return count of non-null values in DB so that mean it need to check wether any value in given field is empty or not, while COUNT(*) do not have this limitation and always return count of all matched rows.
Currently there is available workaround in form of count(true) in Ecto, but this isn't obvious solution and for sure isn't idiomatic.
There should be count/0 function that would expand to counting all entries matching the query which I believe that most DBs have optimised enough.
Sounds good to me! PR please?
Just to clarify: the problem exists even if the fields used in count is indexed properly. The difference between two approaches is that when using count(<field>), PostgreSQL uses an index scan and still has to fetch the rows. With count(*) it can use index only scan without fetching the rows, which is way faster.
I have a closed pr about count function #2575 if you want i can modify for that situation.
@soteras a PR is definitely welcome. Although note that we are not adding new callbacks to the repo module. The new count function will belong to the Ecto.Query.API instead. Please let me know if you have any questions!
@josevalim any sources where I could learn how are aggregate functions in Ecto.Query.API handled? Because it seems a little convoluted (as most macro systems).
There are two steps:
Ecto.Query.Builder where expressions are expanded/compiled to query structs. This step verifies all expressions are valid and supported<adapter>.Connection.There is a step in the middle, which is the Ecto.Query.Planner. My suggestion is to search the codebase for a less commonly used aggregator, such as avg, and see where in the codebase we touch it.
@hauleth are you doing that or wants to?
I want to, but if you are willing then you can try as I do not have time right now to dig into it properly.
Wiadomość napisana przez Carlos Henrique notifications@github.com w dniu 28.06.2018, o godz. 23:52:
@hauleth https://github.com/hauleth are you doing that or wants to?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/elixir-ecto/ecto/issues/2595#issuecomment-401184727, or mute the thread https://github.com/notifications/unsubscribe-auth/AARzNyw55jCtOSgk1HJRE3qYGkZ1fhKwks5uBVA4gaJpZM4UrhdQ.
It seem it was merged. @soteras thanks for your work.
Most helpful comment
Just to clarify: the problem exists even if the fields used in
countis indexed properly. The difference between two approaches is that when usingcount(<field>), PostgreSQL uses anindex scanand still has to fetch the rows. Withcount(*)it can useindex only scanwithout fetching the rows, which is way faster.