Ecto: `count(<field>)` can sometimes be much slower than `count(*)`

Created on 18 Jun 2018 · 10Comments · Source: elixir-ecto/ecto

Environment

Database and version (PostgreSQL 9.4, MongoDB 3.2, etc.): PostgreSQL (possibly other SQL DBs, unsure about NoSQLs)
Ecto version (mix deps): 2.0+

Current behavior

Currently there is no way (other than using fragment) to run query like count(*). This can be problematic as count(<field>) on <field> that isn't indexed can be terribly slow (found by @mkaszubowski). The reason why this is happening is that by SQL standard COUNT(<field>) should return count of non-null values in DB so that mean it need to check wether any value in given field is empty or not, while COUNT(*) do not have this limitation and always return count of all matched rows.

Currently there is available workaround in form of count(true) in Ecto, but this isn't obvious solution and for sure isn't idiomatic.

Expected behavior

There should be count/0 function that would expand to counting all entries matching the query which I believe that most DBs have optimised enough.

Enhancement Intermediate

Source

hauleth

👍3

Most helpful comment

Just to clarify: the problem exists even if the fields used in count is indexed properly. The difference between two approaches is that when using count(<field>), PostgreSQL uses an index scan and still has to fetch the rows. With count(*) it can use index only scan without fetching the rows, which is way faster.

mkaszubowski on 18 Jun 2018

👍3

All 10 comments

Sounds good to me! PR please?

josevalim on 18 Jun 2018

mkaszubowski on 18 Jun 2018

👍3

I have a closed pr about count function #2575 if you want i can modify for that situation.

soteras on 19 Jun 2018

@soteras a PR is definitely welcome. Although note that we are not adding new callbacks to the repo module. The new count function will belong to the Ecto.Query.API instead. Please let me know if you have any questions!

josevalim on 19 Jun 2018

@josevalim any sources where I could learn how are aggregate functions in Ecto.Query.API handled? Because it seems a little convoluted (as most macro systems).

hauleth on 19 Jun 2018

There are two steps:

First is in Ecto.Query.Builder where expressions are expanded/compiled to query structs. This step verifies all expressions are valid and supported
Second is conversion to SQL and that happens in each adapter separately. For SQL adapters, the module that does it is usually named <adapter>.Connection.

michalmuskala on 19 Jun 2018

There is a step in the middle, which is the Ecto.Query.Planner. My suggestion is to search the codebase for a less commonly used aggregator, such as avg, and see where in the codebase we touch it.

josevalim on 19 Jun 2018

@hauleth are you doing that or wants to?

soteras on 28 Jun 2018

I want to, but if you are willing then you can try as I do not have time right now to dig into it properly.

Wiadomość napisana przez Carlos Henrique notifications@github.com w dniu 28.06.2018, o godz. 23:52:

@hauleth https://github.com/hauleth are you doing that or wants to?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/elixir-ecto/ecto/issues/2595#issuecomment-401184727, or mute the thread https://github.com/notifications/unsubscribe-auth/AARzNyw55jCtOSgk1HJRE3qYGkZ1fhKwks5uBVA4gaJpZM4UrhdQ.