Logstash problem: message=>"Failed to flush outgoing items", <Errno::EBADF: Bad file descriptor - Bad file descriptor>

Created on 15 Dec 2014 · 26 comments · Source: elastic/logstash

For the past couple of months I have been experiencing a problem with Logstash; the logs contain several errors like the following:

{:timestamp=>"2014-12-15T10:30:25.482000-0700", :message=>"Failed to flush outgoing items", :outgoing_count=>252, :exception=>#<Errno::EBADF: Bad file descriptor - Bad file descriptor>, :backtrace=>["org/jruby/RubyIO.java:2097:in `close'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:173:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:168:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:156:in `connect'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:139:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:406:in `connect'", "org/jruby/RubyProc.java:271:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/pool.rb:48:in `fetch'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:403:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:319:in `execute'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217:in `post!'", "/opt/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:106:in `bulk_ftw'", "/opt/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:80:in `bulk'", "/opt/logstash/lib/logstash/outputs/elasticsearch.rb:315:in `flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1339:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:112:in `buffer_initialize'", "org/jruby/RubyKernel.java:1521:in `loop'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:110:in `buffer_initialize'"], :level=>:warn}

I have Elasticsearch 1.1.1 and Logstash 1.4.2, and my Logstash configuration looks like this:

input {
    udp {
        type => "udp"
        port => 30000
    }
}

filter {
   if [type] == "udp" {
      mutate {
        rename => ["@host", "host"]
      }
      dns {
         reverse => ["host"]
         action => "replace"
      }
      grok {
          patterns_dir => "/etc/logstash/patterns"
          match => ["message", "%{OLD_MESSAGE}",
                    "message", "%{NEW_MESSAGE}"
                   ]
      }
   }
}

output {
    elasticsearch {
        protocol => "http"
        cluster => "logstash"
        host => "127.0.0.1"
        index => "logstash-syslog-%{+YYYY.MM.dd}"
    }

}

What is wrong? Is there any bug in Logstash? Should I upgrade it?

Thanks in advance!

logging improvements

Most helpful comment

So this isn't a bug _per se_, but perhaps we could catch this exception and return a more descriptive error message in the log. This would be especially informative in cases where Logstash is at the mercy of the remote service.

Added the "logging improvements" label.

All 26 comments

Hi,

Interesting issue.
Can you confirm your ES instance is not having any issues?

I checked the ES instance every time I had this error: there were no errors in the ES log files, but the ES process had died. I find it surprising that the problem ends up on ES rather than on Logstash.
Any idea?

Can you check your system logs?
Perhaps ES got OOM-killed (out of memory), or there is something else there that might give an indication.
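
If it was the OOM killer, the kernel log will usually say so; something along these lines should show it (the exact log file varies by distribution):

    # look for OOM-killer activity in the kernel ring buffer
    dmesg | grep -iE 'out of memory|killed process'
    # or in the persistent system logs, depending on the distro
    grep -i 'oom' /var/log/syslog /var/log/messages 2>/dev/null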

@nadalie can you check your file descriptor limit? What OS are you running?

Sorry for not answering earlier; I was away on holiday.
The maximum number of file descriptors for ES is already set to the maximum (64k):

  "process" : {
    "refresh_interval" : 1000,
    "id" : 862,
    "max_file_descriptors" : 65535,
    "mlockall" : false
  }

And for the Logstash process the limit is set to a maximum of 10,000 open files in /etc/security/limits.conf.
Even so, I'm not logging that much data, so this limit should not be reached: the maximum is 15,000 messages per day, so even if each connection needs some time to close, it shouldn't be a problem.
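
For what it's worth, a quick way to double-check the limit that actually applies to the running process (assuming a single Logstash java process) is something like:

    # the open-files limit the running Logstash process really has
    cat /proc/$(pgrep -f logstash | head -n1)/limits | grep 'Max open files'
    # how many descriptors it currently holds
    ls /proc/$(pgrep -f logstash | head -n1)/fd | wc -l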

I'm running on Linux, and I'm wondering whether this is happening because the machine runs out of RAM. My server has only 3 GB of RAM, with only 256 MB assigned to ES; my ES configuration uses 5 shards, creates one index per day, and keeps just the latest 30 indices.
Could it be this?

I had an issue with the very same symptoms. I'm running the same versions of LS and ES as described, in containers with plenty of resources.

My issue was that output -> elasticsearch -> host was set to just the hostname "es1", and the search domains in my resolv.conf were in the wrong order, so the name resolved to a public IP on which access to Elasticsearch port 9200 is forbidden. Once I provided the FQDN (or just the IP), everything started to work.

@nadalie this may not be your case, but I suggest double-checking that your ES is accessible from where Logstash is running and that nothing is blocking the port. Something like telnet 127.0.0.1 9200 is a quick first check.
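
A couple of other quick checks, assuming ES is listening on the default HTTP port 9200:

    # both calls should return a JSON response if ES is reachable
    curl -s http://127.0.0.1:9200/
    curl -s 'http://127.0.0.1:9200/_cluster/health?pretty'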

@vasekch I'm running both on the same machine, so I'm able to connect to the ES port; it is accessible from Logstash. Thanks!

I just came across this issue and, similar to what @vasekch mentioned, it turned out to be a networking-stack problem on the path to ES. I am also running a non-data ES node on the same Logstash box; it turns out an iptables change I made was missing an exception for localhost traffic. I was able to curl the localhost ES server before I put in the iptables fix, which initially threw me off.
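
For anyone hitting the same thing: the fix was a loopback exception, something along these lines (the exact position depends on how your ruleset is ordered):

    # accept all traffic on the loopback interface before any REJECT/DROP rules
    iptables -I INPUT 1 -i lo -j ACCEPT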

Why do you suggest it's a networking stack problem? I didn't change anything in iptables (in fact, I allow connections to port 9200: "-A INPUT -m state --state NEW -m tcp -p tcp --dport 9200 -j ACCEPT"). I don't understand why, without changing anything in the configuration, I am suddenly experiencing problems.

@nadalie: in my case I got the error message every time; if I'm reading you correctly, you only see this message in the logs from time to time.
If you have a firewall on and cannot turn it off to test, I would suggest reviewing your DoS-prevention rules (requests/s etc.); maybe they come into play when you don't expect it.
And yes, it's most probably a network stack problem AFAIK.

I can confirm that the Errno::EBADF: Bad file descriptor message _can_ occur when a remote port is inaccessible or disconnects in-flight.

Same exception as in issue #2896, but in the output to InfluxDB.

+1

I still have the problem, but I don't think it's a matter of the remote port being unavailable. It just happens after a couple of months of running without problems, and there is no network disconnection. I don't have a firewall.

I have the same error. I have two instances of Logstash: one is used as an indexer and reads from a Redis server, and the other is a shipper and forwards to Redis.
The Logstash indexer that reads from Redis always gets this error in the logs after some time.

I had the same message in the logs. It was due to too many CLOSE_WAIT connections on the Logstash port.
Once these connections were cleaned up, the error was gone.

Hope this helps someone.
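
A quick way to see whether you are in the same situation is to count sockets stuck in CLOSE_WAIT, something like:

    # count sockets stuck in CLOSE_WAIT (filter by port if you only care about ES/Logstash traffic)
    netstat -ant | grep CLOSE_WAIT | wc -l
    # or with ss on newer systems
    ss -tan state close-wait | wc -l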

My issue was resolved by adding a cron job that restarts the Logstash instance reading from redis-server every 6 hours; it now works perfectly.

We have this problem whenever the upstream Elasticsearch or the load balancer for the Logstash output is temporarily unavailable; from that point on, Logstash fails to process any logs and they go into the abyss until the process is restarted.

{:timestamp=>"2016-01-07T07:41:54.590000+1100", :message=>"Got error to send bulk of actions: connect timed out", :level=>:error}
{:timestamp=>"2016-01-07T07:41:54.590000+1100", :message=>"Failed to flush outgoing items", :outgoing_count=>1, :exception=>"Manticore::ConnectTimeout", :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.4.4-java/lib/manticore/response.rb:35:in `initialize'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.4.4-java/lib/manticore/response.rb:70:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.4.4-java/lib/manticore/response.rb:245:in `call_once'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.4.4-java/lib/manticore/response.rb:148:in `code'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/http/manticore.rb:71:in `perform_request'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/base.rb:201:in `perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/http/manticore.rb:54:in `perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/client.rb:125:in `perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-api-1.0.15/lib/elasticsearch/api/actions/bulk.rb:87:in `bulk'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-1.1.0-java/lib/logstash/outputs/elasticsearch/protocol.rb:105:in `bulk'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-1.1.0-java/lib/logstash/outputs/elasticsearch.rb:548:in `submit'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-1.1.0-java/lib/logstash/outputs/elasticsearch.rb:547:in `submit'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-1.1.0-java/lib/logstash/outputs/elasticsearch.rb:572:in `flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-1.1.0-java/lib/logstash/outputs/elasticsearch.rb:571:in `flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.21/lib/stud/buffer.rb:219:in `buffer_flush'", "org/jruby/RubyHash.java:1342:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.21/lib/stud/buffer.rb:216:in `buffer_flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.21/lib/stud/buffer.rb:193:in `buffer_flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.21/lib/stud/buffer.rb:159:in `buffer_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-1.1.0-java/lib/logstash/outputs/elasticsearch.rb:537:in `receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.6-java/lib/logstash/outputs/base.rb:88:in `handle'", "(eval):296:in `output_func'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.6-java/lib/logstash/pipeline.rb:244:in `outputworker'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.6-java/lib/logstash/pipeline.rb:166:in `start_outputs'"], :level=>:warn}

What version of logstash and what version of the logstash-output-elasticsearch plugin?

@sdembra How are you managing the cleanup of connections in CLOSE_WAIT? Please shed some light. I am hitting the same issue.

I just restart the Logstash process when there are too many CLOSE_WAIT connections.

One more reason for getting too many CLOSE_WAIT connections can be Elasticsearch running out of memory.

In that case, you will have to increase memory for ES.

Another thing you can do, if you are using a multi-node ES cluster, is to set appropriate parameters in the elasticsearch.yml file:

  discovery.zen.minimum_master_nodes: (number of master-eligible nodes / 2) + 1
  discovery.zen.ping.timeout: 30s (the default ping timeout is 3s)
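
For example, on a hypothetical cluster with 3 master-eligible nodes, that formula gives (3 / 2) + 1 = 2:

    # elasticsearch.yml (hypothetical 3-node cluster, all nodes master-eligible)
    discovery.zen.minimum_master_nodes: 2
    discovery.zen.ping.timeout: 30s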

Hope this helps.


@sdembra Thank you for the info. We made some changes to the Logstash output configuration for Elasticsearch and it works well now.
We added a flush_size of 500 and an idle_flush_time of 5 seconds. Connections were going to CLOSE_WAIT without these settings; now it's fine.
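
For reference, the relevant part of our output section looked roughly like this (option names as they existed in the elasticsearch output plugin at the time; the rest of the block is unchanged):

    output {
      elasticsearch {
        # flush a batch once it reaches 500 events...
        flush_size => 500
        # ...or force a flush if 5 seconds have passed since the last one
        idle_flush_time => 5
      }
    }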

So this isn't a bug _per se_, but perhaps we could catch this exception and return a more descriptive error message in the log. This would be especially informative in cases where Logstash is at the mercy of the remote service.

Added the "logging improvements" label.

I found that this issue was the same as https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/127, which is resolved by the proper output configuration described in https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/127#issuecomment-261374678.

I asked our admin to restart the server and upgrade the RAM to 8 GB, and the issue was solved.
ES was started and Logstash was started; while sending a bulk request, ES was killed for some reason, maybe OOM (out of memory).

Since this is primarily an issue in the Elasticsearch output plugin, this is being migrated to https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/542
