Octoprint: Memory leak in maintenance branch

Created on 16 Apr 2018  ยท  14Comments  ยท  Source: OctoPrint/OctoPrint

What were you doing?

I've been tracking the _maintenance_ branch for a while now and since about a week or so I'm noticing that octoprint will take up more and more RAM and get killed by the OOM killer.

This happens on a beagleboneblack as well as a raspberrypi2. On the BBB it will last about a day before bad things happen. 'htop' show the octoprint process taking all the memory.

I've started noticing it after the tornado update.

What did you expect to happen?

I expected the memory not to leak.

What happened instead?

The memory leaked.

Did the same happen when running OctoPrint in safe mode?

Haven't tried safe mode yet, will do once the print queue is done (printing from SD for the time being).

Version of OctoPrint

RPI2: maintenance 0da52cb366
BBB: maintenance 0da52cb366

Operating System running OctoPrint

Both devices are running Angstrom linux v2017.12

Printer model & used firmware incl. version

Prusa i3 MK3 and an Fabrikator Mini v1

Browser and version of browser, operating system running browser

Safari 11.1 on OSX, Chrome 65 on OSX and OctoClient on iOS.

Link to octoprint.log

Most recent octoprint.log: https://gist.github.com/koenkooi/1c47415ad60659a56e02394ff8ff531e

This was after an OOM and 'systemctl restart octoprint nginx mjpg-streamer', but networking is a bit funny.

Screenshot(s)/video(s) showing the problem:

LibreNMS log of the BBB:
schermafbeelding 2018-04-16 om 10 59 49

I have read the FAQ.

bug triage

All 14 comments

Hm... Thanks for reporting this. This Tornado update is causing me way way more grey hairs than I expected.

Could you take a look at what tornado version you have currently installed? The change log of 5.0.2 suggests that a memory leak in IOLoop creation/deletion was fixed, and I'm wondering if that might already be it.

pip freeze | grep -i tornado should give the answer.

I have 5.0.1 installed:

root@blackbox:~# opkg list_installed | grep tornado
python-sockjs-tornado - 1.0.3+git0+85869af90b-r0.0
python-tornado50 - 5.0.1-r0.1
root@blackbox:~# pip freeze | grep -i tornado
You are using pip version 9.0.1, however version 10.0.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
sockjs-tornado==1.0.3
tornado==5.0.1

I'll give 5.0.2 (https://github.com/tornadoweb/tornado/tree/branch5.0 bbdd75a13c2540235ebd8b3cd42d52ac2b6c59fd) a try

I checked which gitrev was used for 5.0.1 and that turns out to be this one: https://github.com/tornadoweb/tornado/commit/54ad63e907f4da96d6bddb84b860672b2dc9845b , which already includes the leak fix. I'll upgrade both machines to 5.0.2 and give safe mode a try as well.

RPI2 with tornadoweb/tornado@54ad63e , memleak is still there:

schermafbeelding 2018-04-16 om 17 16 59

I've noticed something interesting: most memory gets freed after a print is done and ramps up much more quickly when printing. Here's a plot of the memory usage right before a print from SD ends until right after the next print from SD ends:

schermafbeelding 2018-04-17 om 08 38 57

Systemd reports that octoprint hasn't restarted in that interval:

root@raspberrypi2:~# systemctl status octoprint
โ— octoprint.service - The responsive web interface for your 3D printer
   Loaded: loaded (/lib/systemd/system/octoprint.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2018-04-16 13:43:25 CEST; 18h ago

Here's a plot of the BBB printing an calibration cube from SD and then sitting idle the rest of the night:
schermafbeelding 2018-04-17 om 08 43 05

The dip at the end is octoprint restarting into safe mode.

It looks like there are 2 memleaks:

  1. A slow leak when octoprint is idle
  2. A faster leak during printing, which gets largely or completely freed after the print is done

The issue is present in safe mode as well:

schermafbeelding 2018-04-17 om 11 31 31

I did a brief test against both tornado 5 and 4.5 this morning and can confirm that something's fishy there - saw a steady increase in 5, didn't see that in 4.5. Which means I'll probably have to downgrade tornado to 4.5 for now. Which kinda sucks but can't be helped.

I've just downgraded tornado to 4.5.x again on the maintenance branch. Would be great if you could give this a shot on your end.

I'm unsure whether the issue is tornado itself, or rather an interaction with sockjs-tornado. Sadly that dependency appears to be unmaintained now, which makes things a bit tricky. I bundled it and patched things so that it would work with tornado 5 at all, but it working doesn't mean it works 100% correctly.

Hi @foosel, I also noticed the leak a couple of days ago, my octoprint instance ended eating >60% of the ram. I first blamed some custom plugins I've, but downgrading tornado seems to make things normal again (~12h uptime now, couple of test prints).

Thanks, salud.

I've updated both the BBB and RPi2 to latest maintenance and change tornado to 4.5.x. Will report back later this weekend.

It looks like the leak is gone, I'm going to leave the instances running for a few extra days before saying more.

A neglible increase in 48 hours. I'd say it's fixed.

Considering this closed then. A bit meh (will need to deep dive into this down the road) but for now a downgrade to 4.5.x is an acceptable solution.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

dkingsjr picture dkingsjr  ยท  5Comments

FormerLurker picture FormerLurker  ยท  5Comments

JohnOCFII picture JohnOCFII  ยท  5Comments

foosel picture foosel  ยท  5Comments

foosel picture foosel  ยท  4Comments