Rails: Live streaming threads not dying

Created on 18 Jun 2013 · 51 comments · Source: rails/rails

Hello,

I'm running into a weird issue with live streams in Rails 4. I noticed that when I refresh a page that is streaming data, the thread that was created for it sometimes doesn't die. Here's an example controller I'm using:

class GetJobController < ApplicationController
  include ActionController::Live

  def get_job
    response.headers["Content-Type"] = "text/event-stream"
    redis = Redis.new
    redis.subscribe('job') do |on|
      on.message do |event, data|
        response.stream.write("event: #{event}\n")
        response.stream.write("data: #{data}\n\n")
      end
    end
  rescue IOError
    puts "Stream closed"
  ensure
    redis.quit
    response.stream.close
  end
end

Here is the CoffeeScript I am using:

source = new EventSource("/job")
source.addEventListener 'job', (e) ->
  message = e.data
  $("#job.id").text message

When I start up Puma, the number of threads starts at two. When I refresh the page, the number of threads is three. If I refresh the page too much, the thread limit for Puma gets reached and the server just freezes. I narrowed it down to the controller causing the problem. If I remove the redis.subscribe portion, I obviously don't get any data back and no threads are created. Is there something I am doing wrong? My first thought was something with thread safety, but I am not sure.

Thanks

Labels: actionpack, pinned

All 51 comments

/cc @tenderlove

I am also seeing the same problem here, I'm expecting the threads to die as I move off this controller, but they seem to hang around and I eventually run out.

When running puma config.ru instead of rails s with Puma, I've also found this message when trying to force-shutdown the server, which may explain the issue:

Puma caught this error: Attempt to unlock a mutex which is locked by another thread
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:22:in `unlock'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:22:in `ensure in call'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:23:in `call'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/actionpack-4.0.0/lib/action_dispatch/middleware/static.rb:64:in `call'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/engine.rb:511:in `call'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/application.rb:97:in `call'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/railtie/configurable.rb:30:in `method_missing'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/commonlogger.rb:33:in `call'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.1.0/lib/puma/configuration.rb:66:in `call'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.1.0/lib/puma/server.rb:369:in `handle_request'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.1.0/lib/puma/server.rb:246:in `process_client'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.1.0/lib/puma/server.rb:145:in `block in run'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.1.0/lib/puma/thread_pool.rb:92:in `call'
/Users/joe/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.1.0/lib/puma/thread_pool.rb:92:in `block in spawn_thread'

Having the same issue..

Rack app error: #<ThreadError: Attempt to unlock a mutex which is locked by another thread>
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:22:in `unlock'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:22:in `ensure in call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:23:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/actionpack-4.0.0/lib/action_dispatch/middleware/static.rb:64:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/engine.rb:511:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/application.rb:97:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/content_length.rb:14:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:371:in `handle_request'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:248:in `process_client'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:147:in `block in run'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/thread_pool.rb:92:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/thread_pool.rb:92:in `block in spawn_thread'
2013-07-16 03:17:24 -0700: Rack app error: #<ThreadError: Attempt to unlock a mutex which is not locked>
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:22:in `unlock'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:22:in `ensure in call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:23:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/actionpack-4.0.0/lib/action_dispatch/middleware/static.rb:64:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/engine.rb:511:in `call'
Completed 401 Unauthorized in 11710ms
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/application.rb:97:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/content_length.rb:14:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:371:in `handle_request'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:248:in `process_client'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:147:in `block in run'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/thread_pool.rb:92:in `call'
/Users/seb/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/thread_pool.rb:92:in `block in spawn_thread'

I believe this is a problem with your controller code as opposed to a problem with the library. If you watch your network requests and look at the W3C spec for EventSource, you'll notice that an EventSource's behavior is to create a new connection upon receiving data. What happens is redis.subscribe blocks until it picks up a message, then it keeps on blocking. In the meantime, your EventSource in the browser has closed its connection (and the rails server has no way of knowing unless it attempts to write to the closed stream), and created a new request to the server. So now you have two threads, both blocking on redis.subscribe. Keep repeating that and you will eventually run out of threads.

If you call redis.unsubscribe('job') at the end of your message block you will not have this problem.

All that said, I do wish I knew of a clean way to create a global Redis listener that just stayed constantly subscribed and broadcast to my IO streams upon message receipt. This subscribing and unsubscribing once per data write seems ugly.

Sorry, I went and re-read the W3C spec. The information above regarding the behavior of EventSource is wrong. EventSource will NOT automatically close the connection and then re-open it, so you should NOT call redis.unsubscribe as I mentioned above. I was confused because of the network request behavior I was seeing in my browser. I suspect you are seeing the same behavior, because the end result is maxing out a bunch of blocked threads.

My behavior was caused by an incorrectly configured nginx reverse proxy sitting in front of my rails. If you're using nginx too, the following configuration solved the problem for me:

  proxy_set_header X-Accel-Buffering no; 
  proxy_set_header Connection ''; 
  proxy_http_version 1.1;
  chunked_transfer_encoding off;

I'm using puma as my server.

Same problem here, and when I Ctrl+C my Puma server I get:

Started GET "/orders/51e88b433e16c03bbc000012" for 127.0.0.1 at 2013-07-20 22:52:43 -0400
Processing by StreamerController#orders as text/event-stream
  Parameters: {"store_id"=>"51e88b433e16c03bbc000012"}
^CExiting
Completed 401 Unauthorized in 242743ms
2013-07-20 22:56:46 -0400: Rack app error: #<ThreadError: Attempt to unlock a mutex which is locked by another thread>
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:22:in `unlock'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:22:in `ensure in call'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/lock.rb:23:in `call'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/actionpack-4.0.0/lib/action_dispatch/middleware/static.rb:64:in `call'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/engine.rb:511:in `call'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/railties-4.0.0/lib/rails/application.rb:97:in `call'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/content_length.rb:14:in `call'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:371:in `handle_request'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:248:in `process_client'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/server.rb:147:in `block in run'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/thread_pool.rb:92:in `call'
/Users/jtomasrl/.rbenv/versions/2.0.0-p0/lib/ruby/gems/2.0.0/gems/puma-2.3.2/lib/puma/thread_pool.rb:92:in `block in spawn_thread'

Also wanted to add that I tried switching to Unicorn and have been running into the same issues (although no locked-mutex error).

As far as I know, there is no good way for the rails server to know that your client has closed a connection short of attempting to write to response.stream. I periodically send a heartbeat to redis to trigger the IOError and clean up old threads. Otherwise you will end up with a bunch of blocked threads. Try refreshing a page until the server (I use puma) hangs, then publish to your redis channel. If you use some logging and watch your logs while you do this, you should be able to understand where and why your threads are hanging, and the purpose of the "flushing heartbeat".

If there is a better approach than the heartbeat one, i.e. a way for the server to better know when a client has disconnected, I would love to hear about it.
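The "flushing heartbeat" mechanism can be illustrated without Redis or Rails at all. Below is a hypothetical pure-Ruby analogue (not from this thread): a worker thread blocks on a queue the way a live-stream thread blocks inside redis.subscribe, and only discovers its client is gone when the heartbeat forces a write to the already-closed stream.

```ruby
# Pure-Ruby sketch of the "flushing heartbeat" idea (no Redis, no Rails).
# The worker blocks waiting for messages, like a thread stuck in
# redis.subscribe. It cannot see that the client disconnected until it
# tries to write; the heartbeat forces that write, raising IOError so the
# thread cleans up and exits instead of hanging forever.
messages = Queue.new
_reader, writer = IO.pipe   # `writer` stands in for response.stream

worker = Thread.new do
  begin
    loop do
      event = messages.pop                 # blocks until something arrives
      writer.write("data: #{event}\n\n")   # raises IOError on a closed stream
    end
  rescue IOError
    :stream_closed                         # becomes the thread's value
  end
end

writer.close             # the client went away; the stream is closed
messages << 'heartbeat'  # the flushing heartbeat wakes the worker
worker.join
```

Without the heartbeat push, the worker would sit in `messages.pop` forever, which is exactly the stuck-thread behavior described in this issue.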

+1

+1

+1

I worked around what may be the same issue in this gist: https://gist.github.com/njakobsen/6257887. However, the solution is not exactly the Mona Lisa, so take it with a grain of salt. See the sleep hack comment.

+1

+1

Used something like this in the meantime: http://stackoverflow.com/questions/14268690/actioncontrollerlive-is-it-possible-to-check-if-connection-is-still-alive Not the most beautiful solution, but it worked for killing the connection off.

I've posted a question on StackOverflow to try and find other solutions in regards to this problem: http://stackoverflow.com/q/18970458/877472. The interim solution I've come up with (which you can see in the question) is an implementation of the "flushing heartbeat" idea suggested by @huetsch. It works, but I find it difficult to believe there isn't a better way. I'll soon refactor it so that it behaves more like a single long-running connection thread garbage collector (as opposed to creating a new flusher thread every connection), but other than that I cannot presently come up with a better solution either. Any suggestions are welcome.

Thanks for summarizing all of this and putting it on StackOverflow @teeg82. My final code is a lot like yours, except I use a cron script to do my "flushing heartbeat" and, as you mentioned over there, I used psubscribe.

I'm still hoping for a better solution. SSE ultimately runs on TCP, so it seems like there should be some FIN packet sent from the client when the browser is done with its connection, but perhaps browser implementations just trash the connections without notifying the remote server that they're closing the socket. That would leave the connection half-open and mean we would need a solution like our current one. Maybe browsers don't actually do this and something isn't being caught in Rails that should. Or maybe browsers have a very good reason to not send a FIN packet to the server. I don't have enough knowledge of the situation to hypothesize further.

What I do know is that without this kind of fix, Live Streaming in Rails feels pretty crippled. If it really is a client-side issue, I'd hope that we could get a solution like above (or better) bundled into Rails. Making everyone roll their own seems... redundant. If it's just a Rails issue where something isn't being caught, that would probably be easier to track down and solve. @tenderlove wrote quite a bit about this feature on his blog (maybe he even authored most of the streaming code?) - perhaps he has something to say about the matter.

Using Wireshark, I can indeed see FIN packets being sent. Admittedly, I am not very knowledgeable or experienced with protocol-level stuff, but from what I can tell, I definitely detect a FIN packet sent from the browser when I establish the connection using the EventSource library, and NO packet if I suppress that line of code (i.e., disallow the connection). Though I'm not terribly up on my TCP knowledge, this seems to indicate that the connection is at least being properly terminated by the client. So, like you said, @huetsch, it might indeed indicate a bug in Puma or Rails (at least to me). I'm going to try another server (Unicorn) and see if that makes any difference, though I'm not hopeful, since joeellis earlier indicated it made no difference.

I believe I recall reading that Goliath had a solution to this problem - which would seem to confirm the theory that this is a problem in the Rails stack. I glanced at Goliath because of that, but I have a large codebase and porting everything to Goliath for a potential solution to this specific problem was definitely out of the picture. Still, it's a direction you may want to consider.

+1

+1

Same problem here. Have ruled out networking problems, as I am proxying everything through Apache into Puma, and I've killed Apache and the threads in Puma don't die. I've also published messages to Redis that would normally have been consumed by the clients, but the ActionController::Live methods aren't detecting the client isn't there.

+1

Is this problem specific to that blocking redis.subscribe call? If you look at the Redis documentation page for SUBSCRIBE, you'll see the top comment is someone asking for a timeout on subscribe so it doesn't block forever: http://redis.io/commands/subscribe#comment-720420850

@boutcheratwest Yeah that post is very much related to the issue. As that person described, there's no way for your Rails code to execute the (p)unsubscribe method. The end result is a perpetually running thread. Ultimately, that's why the connection poker concept works, because we're deliberately causing an error to be thrown which allows the connection and the thread to shut down properly. Which is goofy and, as that person implies, inherently broken. A timeout I think would work too because your client code can simply re-establish the connection when it dies. I'm not sure which method is better. Ideally, for mine and huetsch's implementation of the solution, I'd like to find a way to determine whether the connection is closed on the other end without writing to the stream. As it is right now, I have to implement client-side code to handle my "poking" message separately, which is obtrusive and goofy as heck.
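The "poking" described above comes down to a TCP fact: a server generally cannot tell that its peer disconnected until a write fails. A small self-contained sketch (plain Ruby sockets, nothing from Rails or Redis; the sleeps and messages are illustrative) shows a closed client staying invisible until the server writes:

```ruby
require 'socket'

# Sketch: a server cannot detect a disconnected client until it writes.
# This is why the "poke"/heartbeat workaround exists at all.
server = TCPServer.new('127.0.0.1', 0)        # OS-assigned free port
client = TCPSocket.new('127.0.0.1', server.addr[1])
conn   = server.accept

client.close       # the "browser" drops the connection (sends FIN)
sleep 0.1          # let the FIN arrive

result = begin
  conn.write("data: poke\n\n")   # may still succeed: only a FIN so far
  sleep 0.1                      # peer answers the stray data with RST
  conn.write("data: poke\n\n")   # this write hits the dead connection
  :looks_alive
rescue Errno::EPIPE, Errno::ECONNRESET
  :client_gone                   # the failed write is the only signal
end

conn.close
server.close
```

Note that the first write after the disconnect can succeed, which is why a single poke is not always enough and the heartbeat has to be periodic.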

Taking a lot from @huetsch and @teeg82, ended up with a solution that I tossed on SO: http://stackoverflow.com/a/19485363/928963 .. One single thread publishing a heartbeat to a dedicated Redis channel, then the controller signs for the heartbeat channel as well as the one it was interested in, and ensures any heartbeat requests cause a heartbeat event sent down the pipe to the client.. Upon disconnection, 30-60 seconds later the thread will be destroyed. (Thanks guys for the help)

I still believe this is a poor solution and that the answer lies further down the Rails stack.

It looks like it's definitely possible for the server to pick up client disconnects automatically. Check out Goliath:

https://github.com/postrank-labs/goliath/wiki/Streaming

They're breaking the incoming connection into headers, body, and close and giving you a chance to react with every event. "on_close" looks like it would give us exactly the hook we need for an elegant solution.

I wonder how hard it would be to re-architect Rails to do something similar. Maybe Rails was engineered in such a way that this isn't a quick fix. It would be great if someone more knowledgeable could chime in.

A heartbeat solution is not ideal because 1) you are pushing out more data than you need to to every client, and 2) worst case your client disconnects immediately after a heartbeat and your worker is sitting idle for the heartbeat duration, not doing anything useful.

With that said I may have to go with a heartbeat based approach until someone has a better solution :)

Yup, @ledbettj and @huetsch are absolutely right, it is not ideal for exactly the reasons ledbettj pointed out. For this solution, you basically have a trade off between the two points he mentioned: Either you decrease the time between heartbeats so that you minimize the amount of time threads / connections are left open, resulting in MORE data being pushed through the open stream, or you increase that time and minimize the data being sent, leaving threads / connections potentially open longer.

Sadly, it is the only one we have at the moment, other than perhaps periodically closing the connection (which would require the client code to periodically re-establish the connection). I suppose you could also implement an "activity" timer that shuts down the connection if there has been no activity for a period of time. Obvious downsides would be that the client would still need to re-establish the connection, and depending on how long you consider "inactivity" to be, you'd still have dangling connections / threads before its considered "inactive".
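The "activity timer" idea just described can be sketched in a few lines: wrap the blocking wait in a timeout so an idle connection shuts itself down, and rely on the browser's EventSource reconnecting on its own. This is a hypothetical minimal version with a plain queue standing in for the Redis subscription; the interval is illustrative only.

```ruby
require 'timeout'

# Sketch of the inactivity-timer workaround: instead of blocking forever
# (like redis.subscribe), give up after a quiet period and close the
# stream; the browser's EventSource reconnects automatically.
IDLE_LIMIT = 0.2  # seconds; real apps would use something much larger

messages = Queue.new  # stands in for the Redis subscription

status = begin
  Timeout.timeout(IDLE_LIMIT) { messages.pop }
  :got_message
rescue Timeout::Error
  :idle_shutdown  # nothing arrived in time: close and let the client retry
end
```

As noted above, this still leaves a dangling thread for up to IDLE_LIMIT after a disconnect; it just bounds the damage.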

Interesting find, huetsch. If Goliath can do it, Rails should be able to do it too (I hope). I'll try and look in to the Goliath stack to see how they implemented their solution, and see what we can find (time permitting...).

+1

+1

related for puma users: https://github.com/puma/puma/issues/389

Quoting Evan:

I'm going to add a timed terminate and some code to request exit and then actually wait for it to exit to 2.7.0. That should fix it.

+1

+1

This issue can be reproduced without Redis, right? Any controller stream that doesn't write will be kept open indefinitely. I've seen some other strange behaviour, like adding a sleep(0) giving different behaviour.

+1. The same here. @tenderlove, any idea about the status of this?

Can someone give me a sample app to work with?

@tenderlove here you have one that I have just created: https://github.com/jfcalvo/rails-live-streaming-closing to help you replicate the problem.

It is really simple, you can send some SSE messages to the app with redis-cli like:

publish events-channel "hello"

I created it with Rails 4.1.0. I think it also has some problems reloading with Puma threads and Spring (it's my first time using Spring, so I'm not sure if that's the problem).

If I wanted to submit a patch to Rails to fix this issue, does anyone have suggestions as to where I would begin? Does ActionDispatch seem like a reasonable place?

+1

This issue has been automatically marked as stale because it has not been commented on for at least
three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

We really need some progress on this; it basically makes SSE impossible on Rails. Where would I start tracking this down?

@bbhoss, I was actually able to get my application to work using the popular hack (i.e., sending a heartbeat to the EventSource every 30 seconds or so). It is hacky, though, so if you find anything or get any suggestions on where to start on a real, Rails-source solution, please let me know and I'll help out.

Here's my solution which doesn't use a heartbeat: http://stackoverflow.com/a/22971751/106704

It works for me, but it uses Sinatra, which you can mount as a Rack app in Rails.

Thanks for the workarounds, I'll give them a shot. IMO, SSE should be a feature of Rails though, and I would like to attempt to fix it. Just kind of daunting to get started.

+1

+1

Is this still open?

I still get hanging threads in 2019 with basically the exact same code shown at the top of this issue... I've seen some workarounds but no solutions. Would like to see this fixed.

+1

And the cause, as some people have already pointed out, seems to be Rails not releasing the ActiveRecord connection.

Neither the SSE code nor Redis uses a loop to access an ActiveRecord connection/model, but after all threads are exhausted, this exception is thrown:

ActiveRecord::ConnectionTimeoutError (could not obtain a connection from the pool within 5.000 seconds (waited 5.003 seconds); all pooled connections were in use):