Marlin: Instant crash when trying to print from SD card with RC4

Created on 24 Mar 2016 · 42 Comments · Source: MarlinFirmware/Marlin

I've upgraded to RC4 and I'm getting instant crash:

  • Browsing SD card works fine
  • As soon as a file is selected, immediate crash, LCD goes blank and after a short while the startup message is shown
  • Reflashing an earlier version makes it work again, so it's not the hardware or the SD card
  • SD-Card is 1GB with FAT32, crash happens with just the test file below as filename t.gco

I tried to track this down a bit but with the huge merge tree, it's not easy and bisect is tricky when having to apply the configuration.

The latest commit I found that still works is b7928a000a586fdbd39f12b909e2d0da6dde49cf
Crash with 06332f20be01ea65f330c968d997ba98f8c40acf

Any ideas that make finding the change causing that issue easier?

This is on a Prusa i3 with Mini-Rambo + Reprap Discount Smart Controller

testing.gcode.zip
marlin-config.diff.zip

Test files bundle 2: f0.gco with comments causes crash, f1.gco no crash but all unknown commands
test-files-2.zip

Potential ? More Data

All 42 comments

@t-paul Could you please send the test g-code ? I have just tried SD printing and it works fine on my end.

Also... A few more details about the SD-Card. What file system is on it? Do you have long file names that you are selecting? (I believe there were some changes made to support long file names.)

File and SD-Card info added to the issue description on top.

When just running the start code (almost default code from Cura), it prompts for "Wait for user..." for some reason. Removing the M117 Printing... did not change anything.

I've reverted to 7d25c107a82cde2da0ebe0e7c04f96e79cb02273 and this is printing fine right now.

How do you put the file on the card? Marlin upload - or direct copy with a card-reader?

Copy on laptop using built-in card-reader.

If I remove all the comments in the gcode file, it does not crash anymore. It's still not printing though, just going back to the main info-screen.

@t-paul did you connect through serial while trying to SD print, to see if Marlin gives any hint about what may be happening?

Things like this are usually buffer overflow issues, easily missed, and depending on many factors, so not everyone sees them. I hope we'll track it down soon.

And RC4 had a lot of rework on the buffers. Still, I find it a bit strange that comments seem to have an influence; if they do, then the problem must be in the parser and not in the planner.
I haven't yet had the opportunity to test the provided g-code to see whether this is reproducible on my setup.

I did just now.

With the test file that crashes it just says:

echo:Now fresh file: f0.gco
File opened: f0.gco Size: 2772000
File selected

The one with the comments removed now shows the "Wait for user..." message instead of just going back. After clicking the LCD button, it produces a stream of "Unknown command" errors.

File opened: f1.gco Size: 2755365
File selected
echo:Unknown command: "0 S65.000000
M10"
echo:Unknown command: "S220.000000
G21
G"
echo:Unknown command: "0
M82
M107
G28 X0 Y0
G28 Z0
G1 Z15.0 F3600
G"
echo:Unknown command: "2 E0
G1 F200 E15
G"
echo:Unknown command: "2 E0
G1 F3600
M205 X10
M117 Printing...
M107
G0 F3600 X"
echo:Unknown command: "0.823 Y30.475 Z0.300
G1 F1200 X"
echo:Unknown command: "1.330 Y2"
echo:Unknown command: ".270 E0.028"
echo:Unknown command: "0
G1 X"
echo:Unknown command: "1.703 Y28.543 E0.046"
echo:Unknown command: "6
G1 X"
echo:Unknown command: "2.385 Y27.427 E0.07587
G1 X"
echo:Unknown command: "2.861 Y26.765 E0.0"
echo:Unknown command: "38"
echo:Unknown command: "
G1 X"
echo:Unknown command: "3.701 Y25.764 E0.12278
G1 X"
echo:Unknown command: "4.270 Y25.180 E0.14080
G1 X"
echo:Unknown command: "5.253 Y24.311 E0.16"
echo:Unknown command: "81
G1 X"
echo:Unknown command: "5."
echo:Unknown command: "04 Y23.818 E0.18786
G1 X"
echo:Unknown command: "7.001 Y23.110 E0.21672
G1 X"
echo:Unknown command: "7.66"
echo:Unknown command: "Y22.746 E0.23353

"Wait for user" means it probably truncated some command like "M104" down to "M1". Sure looks like buffer corruption issues to me. What version of Arduino are you using to do the build and install?
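
The truncation theory can be pictured with a small sketch. This is not Marlin's parser: `commandCode` and `truncated` are hypothetical helpers, and the two-byte cut is only an assumed example of how "M104" could end up being read as "M1".

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not Marlin code): returns the number that follows
// a leading 'M', or -1 if the text does not look like an M-code.
int commandCode(const std::string& cmd) {
    if (cmd.size() < 2 || cmd[0] != 'M') return -1;
    const char* digits = cmd.c_str() + 1;
    char* end = nullptr;
    const long code = std::strtol(digits, &end, 10);
    return end == digits ? -1 : static_cast<int>(code);
}

// Hypothetical helper: keeps only the first `n` bytes of a command,
// standing in for whatever cut the line short.
std::string truncated(const std::string& cmd, std::size_t n) {
    assert(n <= cmd.size());
    return cmd.substr(0, n);
}
```

With these helpers, `commandCode("M104 S200")` yields 104, while `commandCode(truncated("M104 S200", 2))` yields 1, which would match the unexpected "Wait for user..." prompt.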

I'm using the Linux version 1.0.6 straight from the official download extracted to some folder.

Just for fun, try a newer version. The latest is 1.6.8. https://www.arduino.cc/en/Main/Software Maybe it will only make Marlin crash faster.

Please attach the GCode file you are printing in a new comment so we can test with it. I want to see what kind of line-endings, spacing, and other things it is using as I run it through the command parser code.

Looks like it's working fine (no crash, bed starts heating) when compiled with 1.6.8 (yay! a bit scary though). I've added the 2 test files anyway to the description on top in case this should still be tracked down...

I'll try to actually print with that firmware later today.

Looks like it's working fine (no crash, bed starts heating) when compiled with 1.6.8 (yay! a bit scary though). I've added the 2 test files anyway to the description on top in case this should still be tracked down...

That adds some complexity to the situation. It could be that version 1.6.8 fixes a compiler bug that was causing the problem. More likely, though, it places the generated code (and buffers) at different addresses, so whatever is overflowing no longer hits this failure mechanism.

Yeah, true, if it hits later while printing the result could be worse. I can still try to help with testing things.

In fact, the buffer does not seem to be overwritten with garbage or with data unrelated to the G-code; instead, it seems that the pointer is skewed. You can see that for the first line the pointer is skewed three bytes forward: instead of processing "M190 ..." it is processing "0 ...".
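
A minimal sketch of that observation, assuming the first line of the file is "M190 S65.000000" (reconstructed from the log above). `readCommand` is a hypothetical helper, not the firmware's reader; it only shows that an intact buffer read from a start position that is three bytes too far gives exactly the kind of fragment seen in the log.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not Marlin code): returns the text from `start`
// up to, but not including, the next '\n' in `stream`.
std::string readCommand(const std::string& stream, std::size_t start) {
    assert(start <= stream.size());
    const std::size_t end = stream.find('\n', start);
    if (end == std::string::npos) return stream.substr(start);
    return stream.substr(start, end - start);
}
```

Reading from offset 0 returns "M190 S65.000000"; reading from offset 3 returns "0 S65.000000". The data itself is untouched in both cases, only the starting position differs.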

Marlin_main.cpp, line 990 and following.
Is there any way to get out of sd_comment_mode?
Once again there is that ugly

    static bool stop_buffering = false,
                sd_comment_mode = false;

which should not be a one-time initialization only.

Sorry, you can get out of it.
I failed to read the outer "()" here: ((sd_char == '#' || sd_char == ':') && !sd_comment_mode)

    // '#' stops reading from SD to the buffer prematurely, so procedural macro calls are possible
    // if it occurs, stop_buffering is triggered and the buffer is run dry.
    // this character _can_ occur in serial com, due to checksums. however, no checksums are used in SD printing
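
The interplay of the comment flag and the '#' / ':' check can be sketched as a small state machine. This is a simplified, hypothetical reader and not the code in Marlin_main.cpp: `readCommands` and `commentMode` are made-up names, and it works on a whole string rather than on single characters from the card.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified, hypothetical line reader (not the actual Marlin code).
// ';' starts a comment that lasts until the end of the line.
// '#' or ':' outside a comment ends the current command and stops reading.
std::vector<std::string> readCommands(const std::string& file) {
    std::vector<std::string> commands;
    std::string current;
    bool commentMode = false;  // cleared again at every end of line
    for (char c : file) {
        const bool stop = (c == '#' || c == ':') && !commentMode;
        if (c == '\n' || c == '\r' || stop) {
            if (!current.empty()) commands.push_back(current);
            current.clear();
            commentMode = false;
            if (stop) break;   // stop buffering, as the quoted comment describes
        } else if (c == ';') {
            commentMode = true;
        } else if (!commentMode) {
            current += c;
        }
    }
    if (!current.empty()) commands.push_back(current);
    return commands;
}
```

In this sketch a '#' inside a comment is ignored, and the comment flag can only be left at the end of a line, which is the behaviour the condition quoted above implies.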

What does it mean with "procedural macro calls are possible" ?

We may have another symptom: #3245

Looks like it's working fine (no crash, bed starts heating) when compiled with 1.6.8

It makes weird sense, somehow. I vaguely recall that Arduino 1.0.6 was an _enfant terrible_ – one of the more bug-ridden versions of Arduino. But I have a bunch of versions here, so I will try a few.

@jbrazio #3245 I can account for – a single line of bad logic skipping leading digits on the "string" argument to M23 and M117 – You can see that M117 123 Skidoo will not print the 123 part. And that is now fixed in #3246.
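
One way such a bug could look, as a hedged illustration only: the functions below are hypothetical and are not the line that was fixed in #3246. The buggy variant skips digits and spaces in one loop, so digits that start the string argument are swallowed too.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical illustration, not the Marlin source.
// Buggy: after the command letter, skips every digit and space,
// which also eats digits that begin the string argument.
std::string argBuggy(const std::string& cmd) {
    assert(!cmd.empty());
    std::size_t i = 1;  // skip the command letter
    while (i < cmd.size() &&
           (std::isdigit(static_cast<unsigned char>(cmd[i])) || cmd[i] == ' ')) ++i;
    return cmd.substr(i);
}

// Fixed: skip the command number first, then only the separating spaces.
std::string argFixed(const std::string& cmd) {
    assert(!cmd.empty());
    std::size_t i = 1;  // skip the command letter
    while (i < cmd.size() && std::isdigit(static_cast<unsigned char>(cmd[i]))) ++i;
    while (i < cmd.size() && cmd[i] == ' ') ++i;
    return cmd.substr(i);
}
```

For "M117 123 Skidoo", `argBuggy` returns "Skidoo" while `argFixed` returns "123 Skidoo".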

What does it mean with "procedural macro calls are possible"

Take a look at M32, which allows you to chain SD files. It used to be broken, but I did some work to fix it up for 1.0.2, so I think it now actually functions.

The strangest thing about the log output above is that commands seem to have CR / LF in the middle, and that is getting through to the interpreter. To me that screams of either (a) buffer corruption or (b) a bad pointer. I don't see "pointer corruption" because that would give total garbage.

echo:Unknown command: "2 E0
G1 F3600
M205 X10
M117 Printing...
M107
G0 F3600 X"

But an overflow, if it came from a source other than the parser, would mean we should see strange, unrelated data in the buffer.

Did you have any luck with other versions of Arduino?

I saw the very same thing with Arduino 1.0.6 and _no other versions_. Check out how many issues we had involving 1.0.6 in one way or another choking on our code, which has become more "meta-coding" over time:

https://github.com/MarlinFirmware/Marlin/search?q=1.0.6&type=Issues&utf8=✓

I'm thinking of doing this for RC5… (which will be released soon).
Will this work when compiling with Makefile?

#if ARDUINO < 10500
  #error Versions of Arduino IDE prior to 1.5 are no longer supported. Please update your toolkit. Comment this line to proceed, but beware of compiler bugs.
#endif
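
Whether the guard fires in a Makefile build depends on whether that build defines `ARDUINO` at all. In C and C++ an identifier that is not defined evaluates to 0 inside `#if`, so a build that never sets `ARDUINO` would hit the `#error` as well. A minimal sketch, assuming the Makefile passes the version as `-DARDUINO=<number>` (an assumption, not confirmed in this thread); `toolchainSupported` is a hypothetical helper that only mirrors the threshold.

```cpp
// Assumption: the build system passes the IDE version on the command line,
// e.g. -DARDUINO=10608. The fallback below exists only so this sketch
// compiles on its own; it is not part of the proposal above.
#ifndef ARDUINO
  #define ARDUINO 10608
#endif

#if ARDUINO < 10500
  #error "Versions of Arduino IDE prior to 1.5 are no longer supported."
#endif

// Hypothetical helper mirroring the same threshold as the guard above.
constexpr bool toolchainSupported(long version) { return version >= 10500; }

static_assert(toolchainSupported(ARDUINO), "unsupported Arduino IDE version");
```

Compiling the same file with `-DARDUINO=106` would stop at the `#error` line instead of producing a binary.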

No issues with printing so far (couple of mid-sized ~1.5h prints) with the 1.6.8 compiled version.

Aaaarrggghhhhh!!!! This is horrible if a compiler version was causing all these problems.

I don't see "pointer corruption" because that would give total garbage.

I'm not sure this is true. For example, if there was a race condition with the pointer being used by the interrupt routine and elsewhere, it would be very possible for it to be in a bad state and just hammer one byte. But I think @AnHardt got those issues cleaned up.
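
The kind of race described here can be simulated on any host. The sketch below is hypothetical and not taken from Marlin: `tornRead` models an 8-bit CPU that reads a 16-bit index one byte at a time while an interrupt changes the value between the two reads.

```cpp
#include <cassert>
#include <cstdint>

// Simulation only (hypothetical): the low byte is read before the
// interrupt, the high byte after it.
std::uint16_t tornRead(std::uint16_t before, std::uint16_t after) {
    const std::uint8_t low = static_cast<std::uint8_t>(before & 0xFF);  // old value
    // ... the interrupt fires here and stores `after` ...
    const std::uint8_t high = static_cast<std::uint8_t>(after >> 8);    // new value
    return static_cast<std::uint16_t>((high << 8) | low);
}
```

If the index moves from 0x00FF to 0x0100 during the read, the result is 0x01FF, a value that is neither the old nor the new one but still looks plausible. On AVR, one common guard is to disable interrupts around such an access, for example with avr-libc's `ATOMIC_BLOCK`.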

@thinkyhead I was about to suggest the same.

Luckily it does not happen too often, but it's possible. We chased something similar for months (openscad/openscad#514).

I think there is no reason to support an older IDE since you can download it for free...

As mentioned, we have a lot of "meta-programming" in this code, and in some cases I can imagine the compiler might get too wound-up by it. I hope we don't have to get down to comparing the assembler code side-by-side. I remember we had a bug a while ago that was fixed simply by changing the order of some lines, no functional change or bugs, just the compiler didn't like it. "Never touch that code" was not an option, unfortunately.

Same here. Decided yesterday to upgrade to RC5, compiled with IDE 1.0.5.

  • blank screen when starting print from SD
  • Pronterface wouldn't start printing once temp was reached

Recompiling with IDE 1.6.8 did the trick. No problem at all.

Thank you for the report, this is very helpful information to us.

RC5 now warns users with IDE versions lower than 1.0.5 that they should upgrade, but maybe we were too "soft". @thinkyhead, do we have to be more aggressive and stop supporting 1.0.5 as well? What's your opinion?

Aaaarrggghhhhh!!!!

@jbrazio The scientist in me would love to know exactly what's going on. But my practical side is saying, yeah, let's just require Arduino 1.5 and higher at least.

They have upgraded avr-libc and gcc in the meantime, so the time and effort would not be worth it.

@t-paul
So please close this ticket for housekeeping. Thanks.

#3462 "solves" your issue, in a way, and informs other users.
