OctoPrint loses serial connection.

Created on 22 May 2018  ·  30Comments  ·  Source: OctoPrint/OctoPrint

What were you doing?

  1. Printing a thing, namely one of the bases for the Petsfang -- not particularly big or complex, was taking roughly 2:30.

What did you expect to happen?

Print finished

What happened instead?

Octoprint lost the serial connection to the printer.

Did the same happen when running OctoPrint in safe mode?

Yes, both in safe mode and in normal mode with _all_ third party plugins disabled, and even some of the built-in ones.

Version of OctoPrint

1.3.8

Operating System running OctoPrint

OctoPi 0.15.0

Printer model & used firmware incl. version

Creality CR-10 running TH3D Unified Firmware, latest version -- U1.R1.8c.

Link to octoprint.log

https://gist.github.com/cassianoleal/971a6996a20852fc5bf513cb94e1843e

I've also included /var/log/syslog in the gist.

Link to contents of terminal tab or serial.log

I'm very sorry but I lost this. There was not much to see there though, unless you wanted to analyse the actual Gcode being sent to the printer or the fact that eventually it stopped responding.

Other information

This is a followup from my comment on #2647 . As it was correctly pointed out by @foosel that bug was specific to the server dying in response to a failure.

I have now been running Octoprint 1.3.7 on the same setup with no issues so far.

I'm running on a Raspberry Pi 3 B+. The Pi itself is powered by the official PSU, which solved problems I had in the past with power fluctuations -- I'm not seeing those anymore. I have the printer and a Logitech C270 webcam both connected to a powered Anker USB hub with new high quality cables. Nothing else is connected to the computer.

Everything is up-to-date. After the issue I ran apt update and rpi-update and neither had updates to be done.

bug triage unreproduced

All 30 comments

Issue also raised in OctoPi issues log: https://github.com/guysoft/OctoPi/issues/532

I am experiencing the same behavior with an identical pi setup (different printer)

Hi @cassianoleal,

It looks like there is some information missing from your bug report that will be needed in order to solve the problem. Read the Contribution Guidelines which will provide you with a template to fill out here so that your bug report is ready to be investigated (I promise I'll go away then too!).

If you did not intend to report a bug but wanted to request a feature or brain storm about some kind of development, please take special note of the title format to use as described in the Contribution Guidelines.

Please do not abuse the bug tracker as a support forum - that can be found at discourse.octoprint.org. Go there for any kind of issues with network connectivity, webcam functionality, printer detection or any other kind of such support requests or general questions.

Also make sure you are at the right place - this is the bug tracker of the official version of OctoPrint, not the Raspberry Pi image OctoPi nor any unbundled third party OctoPrint plugins or unofficial versions. Make sure too that you have read through the Frequently Asked Questions and searched the existing tickets for your problem - try multiple search terms please.

I'm marking this one now as needing some more information. Please understand that if you do not provide that information within the next two weeks (until 2018-06-05 09:50 UTC) I'll close this ticket so it doesn't clutter the bug tracker. This is nothing personal, so please just be considerate and help the maintainers solve this problem quickly by following the guidelines linked above. Remember, the less time the devs have to spend running after information on tickets, the more time they have to actually solve problems and add awesome new features. Thank you!

Best regards,
~ Your friendly GitIssueBot

PS: I'm just an automated script, not a human being, so don't expect any replies from me :) Your ticket is read by humans too, I'm just not one of them.

unless you wanted to analyse the actual Gcode being sent to the printer or the fact that eventually it stopped responding.

That is in fact the whole point why the serial.log is requested in the first place ;) For any kind of communication issues it is absolutely crucial to know what went over the line before the issue occurred, the more the better. So if you and @JabawokJayUK could both provide logs (or at the very least the terminal contents) of a reproduction, that might help and would certainly be appreciated.

As pointed out in the other issue, also double check your cabling.

I have now been running Octoprint 1.3.7 on the same setup with no issues so far.

This must be coincidence, you made a typo there or something is really fishy here. The only difference between 1.3.7 and 1.3.8 is a third party dependency being pinned to a lower version that is not even remotely involved in printing. That difference should only ever come to fruition during initial setup or an upgrade, not run time. So - are you 100% certain you mean 1.3.7 and not 1.3.6? If yes, then I have no idea why you are seeing a difference here. Between 1.3.6 and 1.3.7/8 I could at least look if the server now puts more stress on things, but that's about it.

In any case, chances are slim I'll be able to fix this. Based on the dmesg outputs provided by you two in the other ticket, this is something that goes wrong far far below OctoPrint (your kernel driver is erroring out for some reason). There's not much if any that a user space program a couple of levels up like OctoPrint can do to fix that. All I can tell you right now is that in the past such connection issues boiled down to power supply problems or other things (e.g. dwc_otg) outlined in the FAQ entries I linked in the other ticket. I've never seen this (either with 1.3.8, 1.3.7, 1.3.6 or any versions prior) unless I a) pulled out the plug or b) had a power failure.

@foosel I fully appreciate that this issue is a kernel space issue and not a userland one but dispute the issue being related to the physical environment. Speaking from my own config / setup, I followed the FAQ completely to resolve early communication issues, so have official pi power supply on a pi3b+, ferrite magnets on a high quality usb cable etc.

I have had consistent, reliable usage with Version 1.3.7 and only experience this issue after an automatic upgrade to 1.3.8. The variable that changed is/was the version of OctoPrint, not the physical environment or the OS in my configuration which is echoed by the experience of @cassianoleal.

Are we really really talking about 1.3.7 -> 1.3.8, not 1.3.6 -> 1.3.8? 1.3.8 was a hotfix release for the aforementioned third party dependency issues pushed out mere days after 1.3.7 (1.3.7 was released on April 9th, 1.3.8 on April 13th), 1.3.6 was released in December 2017.

If you both say 1.3.7 is definitely the version that is fine and 1.3.8 isn't, fine by me, but due to the nature of the short one-after-another-releases in this specific case I want to make absolutely sure we are not declaring the wrong version as "still worked fine here".

@foosel for clarity I have only tested this on OctoPrint 1.3.8 running on OctoPi 0.14.0.

My last 100% working config was OctoPrint 1.3.7 running on OctoPi 0.14.0

Then I'm drawing an absolute blank here what it could be between the two updates. The only thing that changed from 1.3.7 to 1.3.8 is literally this. It's one line in the installer (two if you include the comments), nothing else. Everything else that actually touches code (and not just the README or other metadata) is completely identical between 1.3.7 and 1.3.8, especially anything that touches the serial connection. See here.

@foosel I can confirm that I'm currently using 1.3.7, _not_ 1.3.6. It's been printing the whole day now without issues. Everything is the same. What I did was simply clone Octoprint's git repo, checkout the 1.3.7 tag, create a virtualenv, install the pips and run it from there. All on the same hardware and OS as 1.3.8, which is still there untouched.

I understand where you're coming from, but perhaps there's some bug in psutil 5.4.4 that's causing some resource leak or somehow flooding the serial port, perhaps some other dependency has introduced a bug.

I have also noticed that in my 1.3.7 I'm now using psutil 5.4.5, not 5.4.4. That might also hint to the problem.

I guess it's time to check dependencies here. Could both of you please provide the output of this (assuming OctoPi here):

~/oprint/bin/pip freeze

1.3.7

pi@octopi:~/OctoPrint $ . venv/bin/activate
(venv) pi@octopi:~/OctoPrint $ pip freeze
argh==0.26.2
awesome-slugify==1.6.5
Babel==2.5.3
backports.ssl-match-hostname==3.5.0.1
blinker==1.4
certifi==2018.4.16
chainmap==1.0.2
chardet==3.0.4
click==6.2
emoji==0.4.5
feedparser==5.2.1
Flask==0.10.1
Flask-Assets==0.10
Flask-Babel==0.9
Flask-Login==0.2.11
Flask-Principal==0.3.5
frozendict==1.2
future==0.15.2
futures==3.1.1
idna==2.6
itsdangerous==0.24
Jinja2==2.8.1
Markdown==2.6.11
MarkupSafe==1.0
monotonic==1.3
netaddr==0.7.19
netifaces==0.10.7
OctoPrint==1.3.7
OctoPrint-FanSpeedSlider==0.1.8
pathtools==0.1.2
pkg-resources==0.0.0
pkginfo==1.2.1
psutil==5.4.5
pyasn1==0.4.2
pylru==1.0.9
pyserial==3.4
python-dateutil==2.6.1
pytz==2018.4
PyYAML==3.10
regex==2018.2.21
requests==2.18.4
rsa==3.2.3
sarge==0.1.4
scandir==1.3
semantic-version==2.4.2
six==1.11.0
sockjs-tornado==1.0.3
speaklater==1.3
tornado==4.0.2
Unidecode==0.4.21
urllib3==1.22
watchdog==0.8.3
webassets==0.12.1
websocket-client==0.40.0
Werkzeug==0.8.3
wrapt==1.10.11

1.3.8

pi@octopi:~/oprint $ . bin/activate
(oprint) pi@octopi:~/oprint $ pip freeze
Action-Commands==0.1
argh==0.26.2
awesome-slugify==1.6.5
Babel==2.5.3
backports.ssl-match-hostname==3.5.0.1
Bed-Visualizer==0.1.0
blinker==1.4
certifi==2018.4.16
chainmap==1.0.2
chardet==3.0.4
click==6.2
DisplayLayerProgress==1.6.2
emoji==0.4.5
feedparser==5.2.1
Flask==0.10.1
Flask-Assets==0.10
Flask-Babel==0.9
Flask-Login==0.2.11
Flask-Principal==0.3.5
frozendict==1.2
future==0.15.2
futures==3.1.1
idna==2.6
itsdangerous==0.24
Jinja2==2.8.1
Markdown==2.6.11
MarkupSafe==1.0
monotonic==1.3
netaddr==0.7.19
netifaces==0.10.6
Octolapse==0.3.1
OctoPrint==1.3.8
OctoPrint-Autoscroll==0.0.2
OctoPrint-DetailedProgress==0.1.4
OctoPrint-FanSpeedSlider==0.1.8
OctoPrint-FirmwareUpdater==1.0.0
OctoPrint-FloatingNavbar==0.3.0
OctoPrint-Fullscreen==0.0.4
OctoPrint-NavbarTemp==0.9
OctoPrint-Tempsgraph==0.3.3
OctoPrint-Touchtest==0.1.1
pathtools==0.1.2
Pillow==5.1.0
pkg-resources==0.0.0
pkginfo==1.2.1
psutil==5.4.3
pyasn1==0.4.2
pybonjour==1.1.1
pylru==1.0.9
pyserial==3.4
python-dateutil==2.6.1
pytz==2018.4
PyYAML==3.10
regex==2018.2.21
requests==2.18.4
RPi.GPIO==0.6.3
rsa==3.2.3
sarge==0.1.4
scandir==1.3
semantic-version==2.4.2
six==1.11.0
sockjs-tornado==1.0.3
speaklater==1.3
Themeify==1.2.0
tornado==4.0.2
TouchUI==0.3.11
Unidecode==0.4.21
urllib3==1.22
watchdog==0.8.3
webassets==0.12.1
websocket-client==0.40.0
Werkzeug==0.8.3
wrapt==1.10.11

Interestingly, on 1.3.8 I see a lot of pips for plugins I uninstalled (and they don't show) on the GUI.

0.14.0 with 1.3.8:

pi@octopi:~ $ ~/oprint/bin/pip freeze
argh==0.26.2
awesome-slugify==1.6.5
Babel==2.4.0
backports.csv==1.0.5
backports.ssl-match-hostname==3.5.0.1
blinker==1.4
certifi==2017.4.17
chainmap==1.0.2
chardet==3.0.4
click==6.2
emoji==0.4.5
feedparser==5.2.1
Flask==0.10.1
Flask-Assets==0.10
Flask-Babel==0.9
Flask-Login==0.2.11
Flask-Principal==0.3.5
frozendict==1.2
future==0.15.2
futures==3.1.1
idna==2.6
ipaddress==1.0.19
itsdangerous==0.24
Jinja2==2.8.1
Markdown==2.6.8
MarkupSafe==1.0
monotonic==1.3
netaddr==0.7.19
netifaces==0.10.6
OctoPrint==1.3.8
OctoPrint-BLTouch==0.2.0
OctoPrint-CostEstimation==2.1.0
OctoPrint-FilamentManager==0.5.3
pathtools==0.1.2
pkginfo==1.2.1
psutil==5.4.3
pyasn1==0.2.3
pybonjour==1.1.1
pylru==1.0.9
pyserial==2.7
python-dateutil==2.6.0
pytz==2017.2
PyYAML==3.10
regex==2017.6.23
requests==2.18.4
rsa==3.2.3
sarge==0.1.4
scandir==1.3
semantic-version==2.4.2
six==1.10.0
sockjs-tornado==1.0.3
speaklater==1.3
SQLAlchemy==1.1.18
tornado==4.0.2
Unidecode==0.4.20
uritools==2.1.0
urllib3==1.22
watchdog==0.8.3
webassets==0.12.1
websocket-client==0.40.0
Werkzeug==0.8.3
wrapt==1.10.11
You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

@foosel I've been using 1.3.7 for a few days now without issues. I've upgraded to the current devel version, 1.4.0.dev1254+ga92372ec and had the serial disconnection about 1h into a print.

I am running it on a tmux session, and there is 0 logging during the event (in fact, there only seems to be the webserver's logs there), same as in octoprint.log:

2018-05-26 22:31:40,070 - tornado.access - WARNING - 404 DELETE /plugin/logging/logs/serial.log.2018-05-26_22-31-35 (::ffff:192.168.192.24) 30.10ms
2018-05-26 22:31:40,887 - tornado.access - WARNING - 404 DELETE /plugin/logging/logs/serial.log.2018-05-26_22-31-34 (::ffff:192.168.192.24) 29.64ms
2018-05-26 22:31:41,614 - tornado.access - WARNING - 404 DELETE /plugin/logging/logs/serial.log.2018-05-26_22-31-33 (::ffff:192.168.192.24) 35.91ms

Those are tentatives I made to delete logs that are showing on the web UI but are not present on disk, and were done before I started the print.

The relevant logs are in this gist.

The serial log is also pretty empty, even though I had it enabled.

I am glad that this bug is already reported because I would not know how to extract the needed logs.
I just wanted to say that I have exactly the same issue, the printer just stops in mid of print and the last message in the terminal was a message to the printer, without "OK" response.
I just downgraded to 1.3.7 and now it works perfectly again (which was working bevor since the release of 1.3.7.
If I somehow can help, please tell me how.

My HW: RPI 3 + Anet A8 on SkyNetV2.4.5

@19michi98 (and anyone else chiming in with "I'm seeing this too" here!) Please also provide

@cassianoleal @JabawokJayUK I'll up the psutil dependency on the maintenance branch to 5.4.5 (which apparently fixes the issue introduced in 5.4.4 for which 1.3.8's version pin was done to begin with). It would be awesome if one or both of you could give this a test and report back. I spent most of the weekend printing with 1.3.8 and still haven't seen a reproduction of this myself, so I can't test. Considering that the only change between 1.3.7 and 1.3.8 was in fact the psutil version pin, that's about the only possible culprit here though.

maintenance branch is now set to psutil >= 5.4.5

I have been printing on maintenance branch for about 1h30min now without issues. Will update when I do larger prints as well.

Another bit of encouraging news, I've done a 5h print now on the maintenance branch. If you don't mind leaving this open for a couple more days, I'd like to try a longer print on it as well before declaring it fixed?

I usually leave tickets open until the fix is released anyhow, so that's totally fine by me.

I'm still completely and utterly stunned though that this library appears to be the cause here. It's not even used anywhere in the comm layer, and the only thing it is currently used for is some environment reporting on initial start and later only for disk usage reports. Really really weird.

Yeah I've been trying to come up with ways to measure/test that so we could have an idea of what might be causing it but not sure how to approach it. I'll see if I can get prometheus or something to spew out resource metrics out of the Pi and then try to force the error and see if at least there's something obvious to look at. Can't promise I'll find the time though.

I've just finished another print on this branch, 07:43:39. With 1.3.8 I had never gone over 5h or so, most of the time failing after about 2h.

After a few days of flawless printing with the maintenance branch, today it failed me about 2h into a print. :(

Same symptoms, same everything really.

/var/syslog basically only shows:

Jun  3 11:36:00 octopi kernel: [ 6946.283830] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
Jun  3 11:37:20 octopi kernel: [ 7026.999949] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

OctoPrint shows:

2018-06-03 11:37:26,635 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-06-03 11:37:29,641 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-06-03 11:37:32,649 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-06-03 11:37:35,657 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-06-03 11:37:38,671 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-06-03 11:37:41,685 - octoprint.util.comm - INFO - No response from printer after 6 consecutive communication timeouts, considering it dead.
2018-06-03 11:37:41,710 - octoprint.util.comm - INFO - Changing monitoring state from "Printing" to "Offline (Error: Too many consecutive timeouts, printer still connected and alive?)"

And the Terminal logs in OP are:

Send: N123924 G1 F2400 X145.975 Y127.307 E4978.4448*56
Recv: ok
Send: N123925 G0 F6000 X145.296 Y127.307*123
Recv: ok
Send: N123926 G1 F2400 X149.410 Y131.421 E4978.6772*52
Recv: ok
Send: N123927 G0 F6000 X148.731 Y131.421*120
Communication timeout while printing, trying to trigger response from printer. Configure long running commands or increase communication timeout if that happens regularly on specific commands or long moves.
Send: N123928 M105*20
Communication timeout while printing, trying to trigger response from printer. Configure long running commands or increase communication timeout if that happens regularly on specific commands or long moves.
Send: N123929 M105*21
Communication timeout while printing, trying to trigger response from printer. Configure long running commands or increase communication timeout if that happens regularly on specific commands or long moves.
Send: N123930 M105*29
Communication timeout while printing, trying to trigger response from printer. Configure long running commands or increase communication timeout if that happens regularly on specific commands or long moves.
Send: N123931 M105*28
Communication timeout while printing, trying to trigger response from printer. Configure long running commands or increase communication timeout if that happens regularly on specific commands or long moves.
Send: N123932 M105*31
No response from printer after 6 consecutive communication timeouts, considering it dead. Configure long running commands or increase communication timeout if that happens regularly on specific commands or long moves.
Changing monitoring state from "Printing" to "Offline (Error: Too many consecutive timeouts, printer still connected and alive?)"
Connection closed, closing down monitor

I'm at a loss. I'll probably repurpose the Pi 3 B+ and run OP in a different computer for a while and see where that gets me.

Personally this makes me both sad and relieved. The latter because as I sad I really couldn't make sense of why 1.3.7 would be stable when 1.3.8 wouldn't be. In any case, if you have another Pi lying around, that would also be an interesting thing to test.

Alternatively you could try installing 1.3.8 but with this patch applied (edit forgot to paste the link 🤦‍♀️), to rule out any other changes in maintenance causing the same or similar behaviour on your setup. But other than that I'm also at a loss here. I've done several 4+ hour prints the past two weeks or so against 1.3.8 (on a Pi3 and OctoPi 0.14 though, not a Pi3+ with 0.15) and never ran into any kind of problem. Not saying there isn't an issue here, just saying I have not been able to trigger it here.

I'm totally following you. I'm not ruling out some mix of hardware/firmware issue either. The 3 B+ was riddled with firmware and driver issues when it first came out, and faulty units were not unheard of.

I'm pulling the plug on a 3B to rule out that part of the equation. It seems like most or all the other people with similar issues are also on the B+.

Same here at 1.3.7, on a raspberry pi zero w with an anycubic kossel, first there were problems with temperature drops, wifi disconnecting, serial connection drops (which might be because of the serial drops I don't know), tried to update to 1.3.8, and that basically just botched, told me to manually restart, but after that it's dead, and it won't let me ssh in anymore

@The-Clockmaker to be honest, temperature AND wifi AND serial drops AND then also a broken update doesn't even sound remotely like this issue here but more like a broken or unstable system in general. Check your power supply and wiring and if that doesn't help get in touch on the support forum. And keep in mind that the Zero W isn't recommended for running OctoPrint for a reason (slow AF and easily bogged down by simply WiFi traffic).

@foosel hm okay I did not know that, though I do have to say that for the past year or so it has run perfectly smooth on the pi zero w, never had any problems as far as I know, but it started straight after the update, could it not be that the temp drops while printing could be a serial issue? Tbh the wifi drops only happened twice, and I have pulled the plug a few times now and managed to get back in with ssh, and rebooted, so I don't know if with 1.3.8 those problems are gone now, but I guess I'll see soon enough

I have pulled the plug a few times now

If with that you mean just yanking out the power while it was still up and running, that's also a really really bad idea and bound to give you file system corruption sooner or later that will then cause a whole zoo of issues, from system instability to not being able to boot any longer.

Temperature drops cannot be explained by serial communication issues. OctoPrint displays what your printer reports. If you printer reports dropping temperatures, OctoPrint will report that. If you printer reports nothing, OctoPrint won't update the temperatures anymore. If you see a drop in temps, its your printer's firmware seeing a drop in temps.

@foosel Yes, but I had no choice, that was straight after updating to 1.3.8, and I was unable to reboot it via the webpage, which was telling me to reboot it manually, and it was no longer responding to ssh..
It was the printer though that was reporting dropping temperatures, straight after the print began, at which point the octopi web interface stopped responding, after that I decided to update it, which seemed to run into some issues, for some reason it does seem to run fine now at 1.3.8, but if it runs into problems again I'll just do a fresh install

Edit: @foosel It seems now that after having update to 1.3.8 all the problems are fixed, I did not touch any of the cables or wiring since then so it doesn't seem like that was the issue, I don't know what it was then, but it's working again normally now

@foosel Just a quick update. I'm running now on a Pi 3 B (not +), stable 1.3.7.

So far it's smooth, but it's only been roughly an hour.

Today during the day, still running on the previous problematic system, I was regularly ssh'ing to the box and just manually polling for things like uptime, free memory, CPU, load averages, etc.

It was perfectly healthy, very low resource usage. One thing I did notice was that most of the time I had about 100Mb in buffers/caches (from free) leaving me with about 800Mb of actually free men (almost 900Mb if we add the cache), whereas now, on the new Pi 3B, just 1h into a print it's using >500Mb of cache.

pi@octopi:/var/log $ free -m
              total        used        free      shared  buff/cache   available
Mem:            875          88         267          11         519         722
Swap:            99           0          99

pi@octopi:/var/log $ cat /proc/meminfo
...
Buffers:          271692 kB
Cached:           241836 kB
...

I'm wondering if all that cache might have things like instructions to the printer and the lack of them might be causing a buffer underrun or something similar, causing the connection drops.

I don't have enough data to back any of this but I'll try to run the B+ with 1.3.7 again and see if I get the same behaviour, but before I want to try a really long print on the new 3B.

The connection loss happens now and then to me too, octoPrint 1.3.8 on OctoPi too.
Hard to reproduce, I have been frustrated and had little time, and rebooted OctoPi & printer with marlin 1.1.8 not taking time to trying to figure out which one I could reboot to prevent next print from being destroyed.

Was this page helpful?
0 / 5 - 0 ratings