Phoenix: Tests for the Rumbl application end with error

Created on 23 Nov 2019  ·  15Comments  ·  Source: phoenixframework/phoenix

Environment

  • Elixir version (elixir -v):
Erlang/OTP 22 [erts-10.4.4] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe] [dtrace]

Elixir 1.9.0 (compiled with Erlang/OTP 22)
  • Phoenix version (mix deps): phoenix 1.4.11 (Hex package) (mix)
  • NodeJS version (node -v): 0.16.2
  • NPM version (npm -v): 6.9.0
  • Operating system: macOS Mojave

Actual behavior

Running mix test on the Rumbl application from Programming Phoenix 1.4 ends up with an error:

15:18:54.721 [error] Postgrex.Protocol (#PID<0.291.0>) disconnected: ** (DBConnection.ConnectionError) owner #PID<0.578.0> exited

Client #PID<0.581.0> is still using a connection from owner at location:

    :prim_inet.recv0/3
    (postgrex) lib/postgrex/protocol.ex:2837: Postgrex.Protocol.msg_recv/4
    (postgrex) lib/postgrex/protocol.ex:2553: Postgrex.Protocol.recv_transaction/4
    (postgrex) lib/postgrex/protocol.ex:1858: Postgrex.Protocol.rebind_execute/4
    (ecto_sql) lib/ecto/adapters/sql/sandbox.ex:370: Ecto.Adapters.SQL.Sandbox.Connection.proxy/3
    (db_connection) lib/db_connection/holder.ex:293: DBConnection.Holder.holder_apply/4
    (db_connection) lib/db_connection.ex:1255: DBConnection.run_execute/5
    (db_connection) lib/db_connection.ex:1342: DBConnection.run/6
    (db_connection) lib/db_connection.ex:596: DBConnection.execute/4
    (ecto_sql) lib/ecto/adapters/postgres/connection.ex:80: Ecto.Adapters.Postgres.Connection.execute/4
    (ecto_sql) lib/ecto/adapters/sql.ex:580: Ecto.Adapters.SQL.execute!/4
    (ecto_sql) lib/ecto/adapters/sql.ex:562: Ecto.Adapters.SQL.execute/5
    (ecto) lib/ecto/repo/queryable.ex:177: Ecto.Repo.Queryable.execute/4
    (ecto) lib/ecto/repo/queryable.ex:17: Ecto.Repo.Queryable.all/3
    (rumbl_web) lib/rumbl_web/channels/presence.ex:78: RumblWeb.Presence.fetch/2
    (phoenix) lib/phoenix/presence.ex:318: anonymous fn/5 in Phoenix.Presence.handle_diff/5
    (stdlib) maps.erl:232: :maps.fold_1/3
    (phoenix) lib/phoenix/presence.ex:316: anonymous fn/4 in Phoenix.Presence.handle_diff/5
    (elixir) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

The connection itself was checked out by #PID<0.581.0> at location:

    (ecto_sql) lib/ecto/adapters/postgres/connection.ex:80: Ecto.Adapters.Postgres.Connection.execute/4
    (ecto_sql) lib/ecto/adapters/sql.ex:580: Ecto.Adapters.SQL.execute!/4
    (ecto_sql) lib/ecto/adapters/sql.ex:562: Ecto.Adapters.SQL.execute/5
    (ecto) lib/ecto/repo/queryable.ex:177: Ecto.Repo.Queryable.execute/4
    (ecto) lib/ecto/repo/queryable.ex:17: Ecto.Repo.Queryable.all/3
    (rumbl_web) lib/rumbl_web/channels/presence.ex:78: RumblWeb.Presence.fetch/2
    (phoenix) lib/phoenix/presence.ex:318: anonymous fn/5 in Phoenix.Presence.handle_diff/5
    (stdlib) maps.erl:232: :maps.fold_1/3
    (phoenix) lib/phoenix/presence.ex:316: anonymous fn/4 in Phoenix.Presence.handle_diff/5
    (elixir) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

Not sure if this was reported, but I still hit this bug with Phoenix 1.4.11. In my setup it happens every time. So I'm opening this as noted by @josevalim:

Good catch! Basically the Presence wants to query the DB and send updates but the test has shut down and there are no database connections. To fix this consistently, we would need to query the presence supervisor and ask all of its children to shutdown at the end of each test. But to do so, we will need to add a new API to Phoenix. If you can open up an issue in Phoenix issues tracker, it would be extra helpful. Thank you!

Reproduction

Download and extract https://pragprog.com/titles/phoenix14/source_code. The relevant app is in code/testing_otp/rumbl_umbrella. Strangely enough I can't seem to even run the tests (the error report above is from my follow-along version):

** (Mix) Could not start application rumbl_web: RumblWeb.Application.start(:normal, []) returned an error: shutdown: failed to start child: RumblWeb.Presence
    ** (EXIT) shutdown: failed to start child: Phoenix.Tracker
        ** (EXIT) shutdown: failed to start child: RumblWeb.Presence_shard0
            ** (EXIT) an exception was raised:
                ** (ArgumentError) argument error
                    (stdlib) :ets.lookup(RumblWeb.PubSub, :node_name)
                    (phoenix_pubsub) lib/phoenix/pubsub.ex:288: Phoenix.PubSub.call/3
                    (rumbl_web) lib/rumbl_web/channels/presence.ex:10: RumblWeb.Presence.init/1
                    (phoenix_pubsub) lib/phoenix/tracker/shard.ex:120: Phoenix.Tracker.Shard.init/1
                    (stdlib) gen_server.erl:374: :gen_server.init_it/2
                    (stdlib) gen_server.erl:342: :gen_server.init_it/6
                    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

Expected behavior

The tests should pass without error.

Most helpful comment

We have introduced a fix for this problem on latest Ecto by adding start_owner!. You can see how it is meant to be used in Phoenix apps here.

The start_owner uses a separate process to own the connection and this process will terminate after the presence processes, as long as you also add this code to your setup block:

  on_exit(fn ->
    for pid <- Task.Supervisor.children(RumblWeb.Presence.TaskSupervisor) do
      ref = Process.monitor(pid)
      assert_receive {:DOWN, ^ref, _, _, _}, 1000
    end
  end)

thanks everyone for the patience!

All 15 comments

This is an error on the book side, not the framework side. Please check out the pragprog errata page for the book. Thanks!

Thanks for having a look!

How did you fix it? I have the same problem, but can't find solution at the pragprog errata page of the book.

@jejuro me neither and I just gave up.

Hello,
I have the exact same error..
And it's also intermittent.

And unfortunately, as @chrismccord suggested above, there is no solution given in the errata page.

But José Valim suggested for a patch to be done in Phoenix itself:

José Valim says: Good catch! Basically the Presence wants to query the DB and send updates but the test has shut down and there are no database connections. To fix this consistently, we would need to query the presence supervisor and ask all of its children to shutdown at the end of each test. But to do so, we will need to add a new API to Phoenix. If you can open up an issue in Phoenix issues tracker, it would be extra helpful. Thank you!

I'm still learning but in the meantime I tried to tinker with terminating the tasks as suggested by José..
And it worked.. Somehow...
I mean not really..
It happens way less often but still happens.
And the error is now in one line like so:

11:07:52.039 [error] Postgrex.Protocol (#PID<0.362.0>) disconnected: ** (DBConnection.ConnectionError) owner #PID<0.445.0> exited

So I tried with the following (in the setup block of the video_channel_test.exs file):

setup do
  user = insert_user(name: "Gary")
  video = insert_video(user, title: "Testing")
  token = Phoenix.Token.sign(@endpoint, "user socket", user.id)
  {:ok, socket} = connect(RumblWeb.UserSocket, %{"token" => token})

  on_exit(fn ->
    Task.Supervisor.children(RumblWeb.Presence.TaskSupervisor)
    |> Enum.each(fn pid ->
         Task.Supervisor.terminate_child(RumblWeb.Presence.TaskSupervisor, pid)
       end)
  end)

  {:ok, socket: socket, user: user, video: video}
end

I played with debug outputs (IO.inspecting and IO.putsing many pid, process infos etc.)
And it still happens sometimes.
I guess that when it occurs it's still happening just before the on_exit block here is executing.

If I had more knowledge of how tests spin up the whole stack I could have tried to terminate those Presence.TaskSupervisor tasks before the tests shut down.

Another solution, might have be to simply put a :capture_log tag before each tests or at the module level:

@moduletag :capture_log
#or
@tag :capture_log
test "join replies with video annotations" do
  ...

But I don't know why the errors are still showing despite the tests passing successfully.

If anyone have any details about how to solve this I'll happy to try..

You are correct, on_exit happens too late.

The issue is that once the test terminates, the channel process will terminate, the presence process will notice the channel termination, and then invoke the callbacks without the database.

All of this happens async, so it is hard to make it sync. I am not sure at the moment how to fix those.

Any more ideas on this one? I can't get my channel tests to run without this error as I'm using fetch in my presence module

Managed to fix this issue with following changes :

# config.exs
config :rumbl_web, RumblWeb.Endpoint,
   ...
  pubsub_server: RumblWeb.PubSub,
  ...

# rumbl_web/application.ex
    children = [
      RumblWeb.Telemetry,
      {Phoenix.PubSub, name: RumblWeb.PubSub, adapter: Phoenix.PubSub.PG2},
      RumblWeb.Presence,
      RumblWeb.Endpoint
    ]



md5-35d751806e665a0b94c6a9033b886c73



# channels/presence.ex

use Phoenix.Presence, otp_app: :rumbl_web,
                        pubsub_server: RumblWeb.PubSub

This rely on Phoenix 1.5.1 and PubSub 2 - the app is the one from the book -

We have introduced a fix for this problem on latest Ecto by adding start_owner!. You can see how it is meant to be used in Phoenix apps here.

The start_owner uses a separate process to own the connection and this process will terminate after the presence processes, as long as you also add this code to your setup block:

  on_exit(fn ->
    for pid <- Task.Supervisor.children(RumblWeb.Presence.TaskSupervisor) do
      ref = Process.monitor(pid)
      assert_receive {:DOWN, ^ref, _, _, _}, 1000
    end
  end)

thanks everyone for the patience!

Hi! I was still having the same issue after implementing Jose setup and updating to the latest version of Ecto. At the end I ended up patching it with a small timer like this at the end of Jose's on_exit function

:timer.sleep(1)

Same as @nicozar95, thanks for suggesting the sleep in the on_exit function :+1:
Does that mean that there are some remaining processes to wait for as well?

Maybe it would be useful to have a documented helper function for this use case, since it is probably a legit one? Also, monitoring internal processes from the Presence feels a bit like exposing implementation details. WDYT?

@sabiwara those are very good points. I have added a fetchers_pids() function and some docs.

I get errors like this when I run the tests (even though they still succeed):

==> rumbl_web
.................18:27:24.971 [error] Task #PID<0.988.0> started from RumblWeb.Presence_shard0 terminating
** (stop) exited in: DBConnection.Holder.checkout(#PID<0.982.0>, [log: #Function<9.29328868/1 in Ecto.Adapters.SQL.with_log/3>, source: "users", timeout: 15000, pool_size: 10, pool: DBConnection.Ownership])
    ** (EXIT) shutdown: "owner #PID<0.981.0> exited"
    (db_connection 2.2.2) lib/db_connection/holder.ex:86: DBConnection.Holder.checkout/2
    (db_connection 2.2.2) lib/db_connection/holder.ex:67: DBConnection.Holder.checkout/2
    (db_connection 2.2.2) lib/db_connection.ex:1029: DBConnection.checkout/3
    (db_connection 2.2.2) lib/db_connection.ex:1340: DBConnection.run/6
    (db_connection 2.2.2) lib/db_connection.ex:595: DBConnection.execute/4
    (ecto_sql 3.4.5) lib/ecto/adapters/postgres/connection.ex:80: Ecto.Adapters.Postgres.Connection.execute/4
    (ecto_sql 3.4.5) lib/ecto/adapters/sql.ex:544: Ecto.Adapters.SQL.execute!/4
    (ecto_sql 3.4.5) lib/ecto/adapters/sql.ex:526: Ecto.Adapters.SQL.execute/5
    (ecto 3.4.6) lib/ecto/repo/queryable.ex:192: Ecto.Repo.Queryable.execute/4
    (ecto 3.4.6) lib/ecto/repo/queryable.ex:17: Ecto.Repo.Queryable.all/3
    (rumbl_web 0.1.0) lib/rumbl_web/channels/presence.ex:19: RumblWeb.Presence.fetch/2
    (phoenix 1.5.4) lib/phoenix/presence.ex:347: anonymous fn/4 in Phoenix.Presence.Tracker.handle_diff/2
    (stdlib 3.13) maps.erl:233: :maps.fold_1/3
    (phoenix 1.5.4) lib/phoenix/presence.ex:345: anonymous fn/3 in Phoenix.Presence.Tracker.handle_diff/2
    (elixir 1.10.4) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2
    (stdlib 3.13) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
Function: #Function<0.87329413/0 in Phoenix.Presence.Tracker.handle_diff/2>
    Args: []

Unfortunately I am still a beginner with Elixir and Phoenix so it's hard for me to track the cause or even tell if this is related to the issue. I tried to add the above snippet to my setup block, but it did not have any effect.

Also, if I try to run only the video channel tests (mix apps/rumbl_web/test/rumbl_web/channels/video_channel_test.exs) they fail: it looks like for some reason the test implementation of the HTTPClient is not found. The same does not happen when I run all tests from the top rumbl_umbrella directory.

❯ mix test apps/rumbl_web/test/rumbl_web/channels/video_channel_test.exs
==> info_sys
Compiling 6 files (.ex)
warning: InfoSys.Test.HTTPClient.request/1 is undefined (module InfoSys.Test.HTTPClient is not available or is yet to be defined)

...

Finished in 0.3 seconds
2 tests, 1 failure

Could anybody point me in the right direction?

EDIT: increasing the sleep to 200 fixed the errors for me. I still do not understand why I cannot run only the video channel tests, though.

@simonedavico Hi! It might make sense to wait before fetching the tasks, not after:

on_exit(fn ->
    :timer.sleep(10)
    for pid <- RumblWeb.Presence.fetchers_pids()  do
      ref = Process.monitor(pid)
      assert_receive {:DOWN, ^ref, _, _, _}, 1000
    end
  end)

This gives presence process a bit more time to start the tasks. Otherwise we run a risk of fetching the pids before the tasks are started, and failing to wait for them.

@tsrandrei Yes, this works for me.

Was this page helpful?
0 / 5 - 0 ratings