Nomad: Tens of thousands of open file descriptors to a single nomad alloc logs directory

Created on 23 Nov 2016 · 56 comments · Source: hashicorp/nomad

Nomad version

Nomad v0.4.1

Operating system and Environment details

Linux ip-10-201-5-129 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Issue

Nomad has tens of thousands of open file descriptors to an alloc log directory.

nomad 2143 root *530r DIR 202,80 4096 8454154 /var/lib/ssi/nomad/alloc/14e62a40-8598-2fed-405e-ca237bc940c6/alloc/logs

Something similar to that repeated ~60000 times.
lsof -p 2143 | wc -l returns ~60000
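For reference, a generic way to see which paths account for the open descriptors (an illustrative lsof pipeline, not taken from the report above):

# group open descriptors by target path and show the biggest offenders
lsof -p 2143 | awk '{print $NF}' | sort | uniq -c | sort -rn | head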

I stopped the alloc but the descriptors are still there.

In addition, the nomad process is approaching 55 GB of memory used.

Labels: stage/needs-investigation, theme/api, type/bug

All 56 comments

Hey, could you post the open file descriptors in a gist?

A couple of questions:
1) Are you running raw_exec jobs?
2) Can you grab the memory usage of all nomad processes (a tree view would be great)?
3) Do you have enable_debug set? If so, can you grab a memory profile of the agent by hitting:

go tool pprof -output=mem.profile http://127.0.0.1:4646/debug/pprof/heap
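An alternative, if running go tool pprof directly against the agent is not convenient, is to save the raw profile and inspect it later (a sketch assuming enable_debug is on; not a command from this thread):

# fetch the raw heap profile from the agent's debug endpoint
curl -s -o heap.pprof http://127.0.0.1:4646/debug/pprof/heap
# inspect it offline, pointing pprof at the nomad binary for symbols
go tool pprof /usr/local/bin/nomad heap.pprof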

1) Not running raw_exec jobs.
2/3) Will do next time we run into the issue.


Here's a glimpse of htop. The process just ran out of memory and rebooted.
The logs are filled with "dropping update to alloc" messages.

Unfortunately, this agent did not have enable_debug set.


Here's a graph of memory usage from our alerting system.

@sheldonkwok can you attach the logs and the open file descriptors?

Does this happen regularly? The memory profile will help a lot!

Using the docker driver. Debug is on now so I'll have to catch it when it's in the bad state before it reboots.

Bad news: we get "net/http: timeout awaiting response headers" when trying to run go tool pprof. Notably, we were running the logging command before this, and the ~60,000 open file descriptors look like:

nomad   2113 root *012r      DIR  202,80     4096 8456000 /var/lib/ssi/nomad/alloc/ddf27f57-9179-2fbf-9c44-69f54a28f17b/alloc/logs
nomad   2113 root *013r      DIR  202,80     4096 8456000 /var/lib/ssi/nomad/alloc/ddf27f57-9179-2fbf-9c44-69f54a28f17b/alloc/logs
nomad   2113 root *014r      DIR  202,80     4096 8456000 /var/lib/ssi/nomad/alloc/ddf27f57-9179-2fbf-9c44-69f54a28f17b/alloc/logs

repeated many, many times.

Got anything in the client or server logs relevant to the issue, e.g. repeated actions?

A lot of
client: alloc "00e0bb67-624c-5221-087b-ba31f5740b43" in terminal status, waiting for destroy
in the client logs, but that seems more a product of the client running out of memory.

We encountered this issue again after hitting the logging api for one of the allocs. It's crashing multiple production nomad clients.

@sheldonkwok Can you show nomad fs of that folder? What does the task's logging look like? What is the log config?

Mode        Size    Modified Time          Name
-rw-r--r--  31 KiB  12/01/16 21:25:59 UTC  service.stderr.0
-rw-r--r--  0 B     12/01/16 06:31:26 UTC  service.stdout.0
      LogConfig: {
        MaxFileSizeMB: 10,
        MaxFiles: 10,
      },

Nothing too crazy it seems.
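For reference, that LogConfig corresponds to roughly this logs block in HCL job-spec syntax (a sketch, not the actual job file):

logs {
  max_files     = 10
  max_file_size = 10   # MB per log file
}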

Does this reproduce every time you log or just occasionally? What is the log command you are running? Are the file descriptors open on the nomad client or the executor?

/v1/client/fs/logs/<Allocation-ID>
follow=true,origin=start,offset=0
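For context, a request of that shape would look roughly like the following; the task name "service" is taken from the file names shown above and is illustrative:

# stream the task's stderr from the beginning of the log
curl "http://127.0.0.1:4646/v1/client/fs/logs/<Allocation-ID>?task=service&type=stderr&follow=true&origin=start&offset=0"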

We're not sure if it's every time or just occasionally. The descriptors are open on the client.

@sheldonkwok I cannot reproduce this on 0.4.1, 0.5.0, or master. It would be great if you could get a Vagrant box or some reproduction steps so I can debug this. As of now it is a bit hard, since you can't even attach a profiler.

We're still working on some reproduction steps, because nothing we do immediately crashes the system and we're only alerted of the issues later.

Managed to get the client into a state where the memory usage is low but the number of open file descriptors is still high. Here's the mem profile.

http://pastebin.com/XbPH9qhw

Could you get the profile too and send both the heap and profile output?

Thanks,
Alex


We upgraded to Nomad 0.5 and we're still experiencing this issue when we make multiple requests to an allocation's log route simultaneously. Our use case is that many of our devs want to see logs for their service at the same time.

@sheldonkwok Can you get me this: curl http://127.0.0.1:4646/debug/pprof/goroutine?debug=2

How are you retrieving the logs? Programmatically?

Will do next time it happens.
We are using the route /v1/client/fs/logs/<Allocation-ID> when one of the devs asks for logs.

My current thought is that the connection may never be closed, which is causing this. The goroutine dump should help correlate the number of descriptors to the number of times the endpoint has been called (goroutines handling the request).
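One rough way to check that correlation from the goroutine dump (the grep pattern is a guess; the exact frame names depend on the Nomad version):

# count goroutines that mention the log-streaming code path
curl -s "http://127.0.0.1:4646/debug/pprof/goroutine?debug=2" | grep -ci "logs"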

I was considering that but the number of file descriptors jumped from dozens to the tens of thousands pretty quickly when we only made a handful of simultaneous requests.

I reproduced this behavior with the nomad logs command. It happens when a job contains more than one task and you try to see logs from them, e.g.:
nomad logs -tail -f -stderr -job job_name

We unfortunately had to stop using the nomad logs command and the logs route because of this issue. We are instead ingesting the logs through Filebeat/Logstash/Elasticsearch. It doesn't handle streaming as well, though. We will continue to try to reproduce.

@sheldonkwok Sorry to hear that! Hope to be able to resolve this soon and get you using it again. Still haven't been able to reproduce it, unfortunately!

@leonoff When I do that, I get a message telling me to pick which task. Is that not the behavior you see?

@dadgar I don't see any message telling me to pick a task.
my job: https://gist.github.com/leonoff/1169e6337ba0ccc56c044a6a857207f1
step-by-step actions:

  1. nomad run -check-index 0 job.hcl
  2. wait until running
  3. nomad logs -tail -f -stderr -job hello
  4. see no logs (that's expected)
  5. exit, ^C
  6. CPU usage on nomad spikes to 100% and the count of open file descriptors grows to 2-3k

I'm running Nomad v0.5.0
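A rough way to script those steps and watch the client's descriptor count at the same time (a sketch; the sleep and the pgrep pattern are assumptions):

nomad run -check-index 0 job.hcl
# ... wait until the allocations are running ...
nomad logs -tail -f -stderr -job hello &
LOGS_PID=$!
sleep 10
kill "$LOGS_PID"   # stand-in for the ^C in step 5
# watch the open-descriptor count on the client agent
watch -n 1 'lsof -p "$(pgrep -f "nomad agent")" | wc -l'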

@leonoff @sheldonkwok Got it to reproduce! Thanks for the instructions @leonoff. For me, though, the file descriptors close after a while, and it is a few hundred, not thousands. Still bad behavior.

Will begin investigating!

@dadgar I suppose the number of open file descriptors at a time is limited by the CPU resource :). And yes, in most cases the file descriptors close after a while, but sometimes they don't, and the only way to stop it is to stop the job. Also, I noticed a strange "side effect": after an "incident", the average CPU usage of nomad is increased. E.g., while the job (specified above) is running, nomad uses 0.3-0.7% of CPU; after an incident (once the file descriptors have closed), it uses 20-30%.

Yeah I noticed the jump in CPU too.

I just found a bug in nomad-ui that would cause it to never stop tailing files.

If you use nomad-ui, it could be part of the cause, maybe.

@sheldonkwok @leonoff Can you please test the fix? We released new binaries https://releases.hashicorp.com/nomad/0.5.1-rc2/

Will do. Thanks for the quick turnaround!


@diptanu @dadgar Tested the new binaries; the issue is fixed for me.

@leonoff Awesome! Thank you!

@dadgar @diptanu Please see https://github.com/hashicorp/nomad/issues/2089.

Had no issues today with logging, but did have many clients panic.

The file descriptor issue just crept back onto our system today running Nomad 0.5.2... Will try to reproduce. Just letting everyone know.

This has still been happening a lot for us lately. Today there was an actual panic - I opened a separate ticket here, but there was heavy utilization of logs during the time of the crash. I still have a feeling there's an edge case in here somewhere...

@justinwalz Do you have a list of the file descriptors that were open?

I got to the machine after the nomad process had already died, but I will get a dump next time it occurs live.

Hi, it occurred again.

Nomad is using up a ton of RAM and has a high count of open file descriptors.

ps aux | grep nomad

root       2449  649 39.0 59051896 49124000 ?   Ssl  19:17 868:09 /usr/local/bin/nomad agent -config /etc/nomad

Truncated output from lsof -p 2449:

nomad   2449 root *810r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *811r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *812r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *813r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *814r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *815r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *816r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *817r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *818r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *819r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *820r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *821r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *822r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *823r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *824r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *825r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *826r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *827r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *828r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *829r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *830r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *831r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *832r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *833r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *834r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *835r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *836r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *837r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *838r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *839r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *840r      DIR             202,81     4096 2621596 /var/lib/ssi/nomad/alloc/32b3a6a3-7fde-418b-2006-7c65b8ec1660/alloc/logs
nomad   2449 root *841r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *842r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad   2449 root *843r      DIR             202,81     4096 2621588 /var/lib/ssi/nomad/alloc/bd1fa749-7fa1-da62-faf1-8a3187ecc508/alloc/logs
nomad -v
Nomad v0.5.2

To add more context, I've attached the client logs. They include some errors about the logs API and "too many open files".

For what it's worth, nomad healed itself: memory usage went back under control and the file descriptors returned to normal without needing a restart.

client-logs.tar.gz

@justinwalz Okay re-opening this. Do you have any reproduction steps?

We're still running into this issue every week. It makes the server unresponsive and forces reallocations for all of the jobs on it. This issue happens across all of our jobs intermittently when we attempt to fetch logs. It's difficult to reproduce but seems guaranteed to pop up with enough logging requests over time on a job. The shared logging config is

      LogConfig: {
        MaxFileSizeMB: 10, 
        MaxFiles: 10, 
      },  

@sheldonkwok I will make time to play with this soon. I was never able to reproduce it, however. Are you running both the server and client on one machine?

Thanks Alex. We are running 5 servers on their own machines and the clients are run separately as well.

Hey just tried the following and couldn't reproduce:

Ran this job:

job "example" {
  datacenters = ["us-west-2"]
  type = "service"

  group "cache" {
    count = 20

    restart {
      attempts = 60
      interval = "1m"
      delay = "1s"
      mode = "delay"
    }

    ephemeral_disk {
      size = 300
    }

    task "redis" {
      driver = "docker"
      config {
        image = "redis:3.2"
        command = "/bin/bash"
        args = ["-c", "base64 /dev/urandom | head -c 10000000"]
        port_map {
          db = 6379
        }
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 256 # 256MB
        network {
          mbits = 10
          port "db" {}
        }
      }
    }
  }
}

The job produces 10 MB of logs every few seconds and I streamed the logs of all 20 allocations to 3 different machines. I did this for 30+ minutes and transferred many GBs of logs from the client.
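For anyone else trying to reproduce, a loop of roughly this shape can tail every allocation of the job at once (the status parsing is approximate; these are not the exact commands used for the test above):

# tail stderr of every running allocation of the example job in the background
for alloc in $(nomad status -verbose example | awk '/running/ {print $1}'); do
  nomad logs -f -stderr "$alloc" redis &
done
wait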

@sheldonkwok I did look at your logs, and it looks like the streaming connection is held for 50+ minutes but is still streaming stdout.0/stderr.0? Are the apps just not logging much?

I am going to try that as well and will report back.

Just tailed logs for over an hour and saw nothing as well.

@sheldonkwok any more detail on the logging pattern would be great.

Hey @dadgar, I really appreciate the efforts you've spent trying to reproduce it! A normal logging pattern that we see is multiple people tailing the logs of multiple services at the same time for many hours. Some of these services may be the same service as well.

@sheldonkwok Hm unfortunately I did try that.

So I had 3 nodes each running ~7 allocations. I then was following the logs of all 20 allocations on 3 different machines for over an hour and a half.

I repeated this for two different workloads:

1) Small trickle of logs outputted by each allocation
2) 10 MBs of logs every few seconds.

Neither reproduced the issue. Further, when I stopped the logging, not a single extra FD was left in use (as in, there was no leaking that would eventually cause a problem).

Unfortunately, we cannot reproduce it effectively either. It generally happens once a week. We have thousands of allocations being created and running daily over hundreds of machines. In addition, users can be tailing thousands of them at once. Thanks again @dadgar

@sheldonkwok Do you retain Nomad log files? Would you be willing to run a debug build that adds some logging that may help get to the bottom of this?

Sure, what do we set and how much extra logging does it generate?

@sheldonkwok You will set log_level = "DEBUG". I don't know how much extra logging it generates yet, as I need to spend some time building the debug build. I may get to it this week. Will let you know.
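For reference, the relevant agent settings would look roughly like this in the client's config (a sketch; the file path is an assumption):

# e.g. /etc/nomad/client.hcl
log_level    = "DEBUG"
enable_debug = true   # exposes the /debug/pprof endpoints used earlier in this thread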

@sheldonkwok I'm doing some cleanup of old issues and came across this one. The Nomad logging infrastructure has undergone many changes since this was originally opened. If you're still experiencing any problems would you please open a new issue?

Thanks!
