A channel should be upgradable like any other process in the OTP application.
Channels are not automatically picked up for upgrades by the standard OTP mechanism. This is because the channel processes are not started by supervisors, but by the transport processes. As a result, the discovery mechanism has no way of reaching the channel processes to know that it should update them.
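To make the discovery problem concrete, here is a minimal sketch. It assumes a simplified model in which processes are found only by walking supervisor child lists; it is not the actual :release_handler code, and `TreeWalk` and the agents are hypothetical names used only for illustration.

```elixir
# Simplified model of supervision-tree discovery. Hypothetical names throughout;
# this is not how :release_handler is implemented internally.
defmodule TreeWalk do
  # Collect every pid reachable from a supervisor through its child lists.
  def reachable(sup) do
    sup
    |> Supervisor.which_children()
    |> Enum.flat_map(fn
      {_id, pid, :supervisor, _mods} when is_pid(pid) -> [pid | reachable(pid)]
      {_id, pid, :worker, _mods} when is_pid(pid) -> [pid]
      _restarting_or_undefined -> []
    end)
  end
end

# A worker started under a supervisor, as the upgrade mechanism expects.
{:ok, sup} =
  Supervisor.start_link(
    [%{id: :supervised, start: {Agent, :start_link, [fn -> :supervised end]}}],
    strategy: :one_for_one
  )

# A process started directly by another process (standing in for a channel
# started by a transport). It is linked to its parent but is in no child list.
{:ok, orphan} = Agent.start_link(fn -> :started_by_transport end)

pids = TreeWalk.reachable(sup)
IO.inspect(length(pids), label: "reachable processes")
IO.inspect(orphan in pids, label: "orphan reachable?")
```

The second agent is alive and linked, yet a walk over the tree never sees it, which is the situation described above for channels started by transports.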
I'm not sure if it's possible for the transport process to start behaving like a supervisor to make it possible or if it would be required to introduce real supervisors. If it's deemed the fix would be too complex, it should be at least documented that channels don't work with standard upgrade processes.
I need to experiment here and see what needs to be done. In Paul's ElixirConf talk a couple years ago, he showed off exrm at the time upgrading an app with Phoenix Channels where he changed the channel code messaging from something like "ping" to "pong" and then performed a release. Everything worked as expected, so what in particular are we missing here that limits upgradability? Going with a Supervisor start_child approach is probably the best bet, but I need to see what it would take.
Hey @chrismccord, thanks for looking into this.
TL;DR: yes, on a code upgrade the new functionality properly replaces the old one, but the problem I've hit (described here in a bit more detail) is that code_change/3 is not executed.
Initially I thought it was an invalid release built with distillery, but in the conversation @michalmuskala mentioned that this might instead be due to a flaw in the current design of channels.
Fixed in my channels refactor.
Just ran into this while running a hot code upgrade with distillery 2.0.12 and phoenix 1.4.2 on gigalixir. I was making a video about how to do hot code upgrades when my channel state didn't change 😢
Can you please provide more information? Channels are now part of your app supervision tree, which means they can be found by releases and hot code upgrades. There has to be some debugging in why the processes can't be
José Valim
http://www.plataformatec.com.br/
Founder and Director of R&D
Sure, what would you like to know? Do you want a gist of my code?
defmodule GigalixirPhoenixWeb.RoomChannel do
  use GigalixirPhoenixWeb, :channel

  @greeting "Hey!"

  def join(_, _params, socket) do
    send(self(), :counter)

    socket =
      socket
      |> assign(:greeting, @greeting)
      |> assign(:timer, 0)

    {:ok, socket}
  end

  def handle_info(:counter, %{assigns: %{timer: timer, greeting: greeting}} = socket) do
    timer = timer + 1
    Process.send_after(self(), :counter, 1000)
    push(socket, "count", %{timer: timer, greeting: greeting})
    {:noreply, assign(socket, :timer, timer)}
  end

  def code_change(_old_vsn, socket, _extra) do
    {:ok, assign(socket, :greeting, @greeting)}
  end
end
The greeting was never updated on my socket, which quite surprised me. I created that file by hand. I just used the generator, and I didn't see it ask me to add anything to my app supervision tree. I'm more than willing to share the project if you'd like, or I can debug from here if you can guide me.
What are the generated appup instructions? Does it tell the supervision
Since this was deployed on gigalixir, I've contacted @jesseshieh for some assistance. Will follow up if there's a bigger issue.
I think this actually isn't a problem with phoenix, but instead something wrong with either gigalixir or distillery. I'm the founder of gigalixir. I'm not sure yet. It looks like although the appup files are created when the upgrade release is generated, the distillery tarball does not include them. gigalixir currently only copies the tarball over to the running app container, so the appup files appear to be lost. I'm investigating to see what the proper way of including the appup files is.
I investigated this further and I think this may be a phoenix issue after all. I think @bitwalker describes it best in his comment here, where he says:
the Channel module is not actually run as a process - instead it is a callback module which is delegated to from Phoenix.Channel.Server and is stored in the state of the socket. What this means in practice is that your code_change handler will only get called if there is a change to the Phoenix.Channel.Server module in Phoenix which triggers an upgrade
I think what he's referring to is this line, where the channel code_change is called as a callback.
https://github.com/phoenixframework/phoenix/blob/v1.4/lib/phoenix/channel/server.ex#L322
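The delegation described in that quote can be modelled with a small, self-contained sketch. It assumes a generic GenServer that keeps a callback module in its state and forwards code_change to it; `DemoServer` and `DemoChannel` are hypothetical stand-ins, not the real Phoenix.Channel.Server implementation.

```elixir
# Hypothetical stand-in for a channel module: a plain callback module, not a process.
defmodule DemoChannel do
  def code_change(_old_vsn, state, _extra) do
    {:ok, Map.put(state, :greeting, "Hello!")}
  end
end

# Hypothetical stand-in for the server process that owns the state.
defmodule DemoServer do
  use GenServer

  def start_link(callback_mod, state) do
    GenServer.start_link(__MODULE__, {callback_mod, state})
  end

  @impl true
  def init({callback_mod, state}), do: {:ok, {callback_mod, state}}

  @impl true
  def handle_call(:state, _from, {_mod, state} = full), do: {:reply, state, full}

  # OTP invokes code_change on the process's own callback module (DemoServer).
  # The channel-like module only sees the upgrade because it is forwarded here.
  @impl true
  def code_change(old_vsn, {callback_mod, state}, extra) do
    {:ok, new_state} = callback_mod.code_change(old_vsn, state, extra)
    {:ok, {callback_mod, new_state}}
  end
end

{:ok, pid} = DemoServer.start_link(DemoChannel, %{greeting: "Hey!"})

# Roughly what the suspend / code_change / resume relup instructions do to one process.
:ok = :sys.suspend(pid)
:ok = :sys.change_code(pid, DemoServer, :undefined, [])
:ok = :sys.resume(pid)

IO.inspect(GenServer.call(pid, :state), label: "state after code change")
```

In this model the callback module's code_change only runs when an upgrade instruction targets the server module, which matches the behaviour reported in this thread.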
In my tests, if you just change the channel module, the distillery generated relup ends up looking like this, with no code_change directive in it:
{"0.1.1",
 [{"0.1.0",[],
   [{load_object_code,
     {gigalixir_phoenix,"0.1.1",
      ['Elixir.GigalixirPhoenixWeb.RoomChannel']}},
    point_of_no_return,
    {load,
     {'Elixir.GigalixirPhoenixWeb.RoomChannel',brutal_purge,
      brutal_purge}}]}],
...
I got things working by using a custom appup file that forces Phoenix.Channel.Server to also be updated. For example:
$ cat rel/appups/gigalixir_phoenix/0.1.1.appup
{"0.1.1",
 [{"0.1.0",
   [{update,'Elixir.GigalixirPhoenixWeb.RoomChannel',{advanced,[]},[]}
   ,{update,'Elixir.Phoenix.Channel.Server',{advanced,[]},[]}
   ]}],
...
This results in a relup that calls code_change on Phoenix.Channel.Server:
$ cat ./_build/prod/rel/gigalixir_phoenix/releases/0.1.1/relup
{"0.1.1",
 [{"0.1.0",[],
   [{load_object_code,
     {gigalixir_phoenix,"0.1.1",
      ['Elixir.GigalixirPhoenixWeb.RoomChannel']}},
    {load_object_code,{phoenix,"1.4.2",['Elixir.Phoenix.Channel.Server']}},
    point_of_no_return,
    {suspend,['Elixir.GigalixirPhoenixWeb.RoomChannel']},
    {load,
     {'Elixir.GigalixirPhoenixWeb.RoomChannel',brutal_purge,brutal_purge}},
    {code_change,up,[{'Elixir.GigalixirPhoenixWeb.RoomChannel',[]}]},
    {resume,['Elixir.GigalixirPhoenixWeb.RoomChannel']},
    {suspend,['Elixir.Phoenix.Channel.Server']},
    {load,{'Elixir.Phoenix.Channel.Server',brutal_purge,brutal_purge}},
    {code_change,up,[{'Elixir.Phoenix.Channel.Server',[]}]},
    {resume,['Elixir.Phoenix.Channel.Server']}]}],
...
Using this relup file, the upgrade works and the greeting is updated properly on existing, already connected sockets. Future sockets that join are of course updated as well. For reference, the v0.1.0 code is here and the v0.1.1 diff is here.
Anyway, as @michalmuskala says above, I'm not sure if this needs to be fixed or not, but if not, "it should be at least documented that channels don't work with standard upgrade processes."
Does it make sense to re-open this issue?
Ok, I see. Awesome work @jesseshieh!
So this will be fixed once we migrate to DynamicSupervisor. In Phoenix v1.4 we already call start_link in the channel process but because we use simple_one_for_one, it always points to the Phoenix.Channel.Server. But once we migrate to the DynamicSupervisor, which is planned, it will be immediately solved (probably in v1.5 or v1.6 or something).
Until then, the .appup needs to explicitly list the channels, as you mentioned. :+1:
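As a rough illustration of why per-child specs help, here is a sketch assuming a plain DynamicSupervisor where each child is started with its own child spec. `MyChannel` is a hypothetical module, and this is not how Phoenix itself is or will be implemented; it only shows that the `modules` entry can differ per child, unlike a single shared simple_one_for_one spec.

```elixir
# Hypothetical module standing in for a user channel process.
defmodule MyChannel do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

# Each child gets its own spec, so `modules` can name the concrete module
# instead of one module shared by every child of the supervisor.
spec = %{
  id: MyChannel,
  start: {MyChannel, :start_link, [%{}]},
  modules: [MyChannel]
}

{:ok, _pid} = DynamicSupervisor.start_child(sup, spec)

# Each entry has the shape {:undefined, pid, :worker, modules}.
IO.inspect(DynamicSupervisor.which_children(sup))
```

The modules list reported for each child is the information the upgrade tooling consults when deciding which processes are affected by a changed module.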