When homing (G28) if X or Y fails to trip stallguard, the printer will hard-reset.
By far the easiest way to make this happen is with 0.9 degree motors and belts that are too tight or too loose. You need the M350 mod to enable setting X/Y microsteps.
Generally, the printer registers the first impact correctly, but the second one does not register, and the axis keeps running into the end stop for a few seconds before hard-resetting (e.g., normally it goes "tap-tap", but in this scenario it goes "tap-brrrrrrrrrrrt-[reset]").
I'll try to force it to happen and make a video tonight when I get home.
There is some anecdotal discussion of this in the 0.9 degree stepper thread: https://forum.prusaprinters.org/forum/original-prusa-i3-mk3s-mk3-user-mods-octoprint-enclosures-nozzles-.../stepper-motor-upgrades-to-eliminate-vfa-s-vertical-fine-artifacts/
Have you tried setting the microstepping manually in the configuration and using the printer without the M350 mod? I want to make sure it's not me causing the issue :)
It is probably an overflow in the planner calculation of the steps required for homing. Movement becomes erratic at high microstepping values in the firmware anyway, so this situation can and should be avoided.
Looking at the thread, they seem to have problems with stallGuard and the Moons' motors. I also have some problems with my modded TMC2130 miniRambo 1.3a board, but I wasn't ever able to make it reset.
Yes, I'd get this before the M350 feature existed as well.
My setup has X/Y microstepping at 8 and the steps/mm still at 100. I don't think that would overflow anything, since the net step calculations are the same as stock. I do know that the Einsy definitely can't handle 16 usteps and 200 steps/mm at fast travel.
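To spell out the "same net step calculations" point, here is a rough sketch of the arithmetic. It assumes 32 mm of belt travel per motor revolution (a 16-tooth pulley on 2 mm pitch belt), which is not stated in this thread, and `steps_per_mm` is a made-up helper name, not a firmware function.

```cpp
#include <cstdint>

// Hypothetical helper: microsteps the planner must issue per millimetre of travel.
// full_steps_per_rev: 200 for a 1.8 degree motor, 400 for a 0.9 degree motor.
// mm_per_rev: belt travel per motor revolution (ASSUMED 32 mm: 16 teeth x 2 mm pitch).
constexpr uint32_t steps_per_mm(uint32_t full_steps_per_rev,
                                uint32_t microsteps,
                                uint32_t mm_per_rev)
{
    return full_steps_per_rev * microsteps / mm_per_rev;
}

// 1.8 degree motor at 16 microsteps.
static_assert(steps_per_mm(200, 16, 32) == 100, "1.8 degree at 16 usteps");
// 0.9 degree motor at 8 microsteps: same step rate for the same travel speed.
static_assert(steps_per_mm(400, 8, 32) == 100, "0.9 degree at 8 usteps");
// 0.9 degree motor at 16 microsteps: double the step rate.
static_assert(steps_per_mm(400, 16, 32) == 200, "0.9 degree at 16 usteps");
```

Under that assumption, halving the microstepping on a 0.9 degree motor leaves the step rate for a given travel speed unchanged, while 16 usteps doubles it.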
There's a wealth of info on this in the linked thread in regards to stallguard tuning, and some here as well:
https://forum.prusaprinters.org/forum/original-prusa-i3-mk3s-mk3-user-mods-octoprint-enclosures-nozzles-.../tuning-homing-for-0-9-degree-motors/
Just saw your update. I have OMC/Stepperonline motors. They work quite nicely with stock firmware once you have the ability to set usteps and figure out that belt tensions need adjusting if homing fails.
I suppose I could have filed a bug earlier when I first encountered this, but it never occurred to me since I was already off the beaten path and I try to keep my support expectations for unofficial alterations minimal :)
Edit: The particularly interesting bit is in the second linked thread regarding unstable "home" step positions.
Also, I'm more than happy to help by testing or tinkering with parts of the firmware if necessary.
Edit 2: Possibly relevant: I am running vesconite bushings instead of the stock bearings on X/Y for noise reasons and this results in even smoother/lower friction motion on these axes. This could explain why stock stallguard values seem to work most of the time for me even when others need significant tuning/tinkering.
For reference, here's a normal home serial output:
SENDING:G28 Y
tmc2130_home_enter(axes_mask=0x02)
echo:busy: processing
echo:busy: processing
0 step=51 mscnt= 826
tmc2130_goto_step 1 35 2 1000
tmc2130_home_exit tmc2130_sg_homing_axes_mask=0x02
and here's a failed home where the printer resets:
G28 X
SENDING:G28 X
tmc2130_home_enter(axes_mask=0x01)
0 step=12 mscnt= 193
tmc2130_goto_step 0 43 2 1000
start
echo: 3.8.1-RC1-2851
echo: Last Updated: Oct 5 2019 12:35:45 | Author: (none, default config)
Compiled: Oct 5 2019
echo: Free Memory: 2112 PlannerBufferBytes: 1392
echo:Stored settings retrieved
adc_init
CrashDetect DISABLED
FSensor ENABLED
Sending 0xFF
echo:SD card ok
MMU => 'start'
MMU <= 'S1'
MMU => '106ok'
MMU <= 'S2'
MMU => '372ok'
MMU version valid
MMU - ENABLED
Video of the failure:
Peeking at the code, it looks like it's dying in the region below (around line 2220 of marlin_main.cpp).
Both tmc2130_goto_step and tmc2130_home_exit print to serial; the latter never prints during the failure.
uint8_t back = tmc2130_home_bsteps[axis];
if (tmc2130_home_enabled && (orig <= 63))
{
    tmc2130_goto_step(axis, orig, 2, 1000, tmc2130_get_res(axis));
    if (back > 0)
        tmc2130_do_steps(axis, back, -axis_home_dir, 1000);
}
else
    tmc2130_do_steps(axis, 8, -axis_home_dir, 1000);
tmc2130_home_exit();
If your guess about an overflow or underflow is correct, then the suspect is probably this while loop in the goto_step routine:
while ((cnt--) && ((mscnt >> shift) != step))
{
    tmc2130_do_step(axis);
    delayMicroseconds(delay_us);
    mscnt = tmc2130_rd_MSCNT(axis);
}
... We have a winner.
I made this change to the above while:
while ((cnt--) && ((mscnt >> shift) != step))
{
    printf_P(PSTR("tmc2130_goto_step_while %d %d %d %d \n"), cnt, mscnt, shift, step);
    tmc2130_do_step(axis);
    delayMicroseconds(delay_us);
    mscnt = tmc2130_rd_MSCNT(axis);
}
Failure:
Normal home:
tmc2130_home_enter(axes_mask=0x02)
0 step=36 mscnt= 576
tmc2130_goto_step 1 35 2 1000
tmc2130_goto_step_while 14 592 5 35
tmc2130_goto_step_while 13 587 5 35
tmc2130_goto_step_while 12 552 5 35
tmc2130_goto_step_while 11 520 5 35
tmc2130_goto_step_while 10 488 5 35
tmc2130_goto_step_while 9 456 5 35
tmc2130_goto_step_while 8 424 5 35
tmc2130_goto_step_while 7 392 5 35
tmc2130_goto_step_while 6 360 5 35
tmc2130_goto_step_while 5 328 5 35
tmc2130_goto_step_while 4 296 5 35
tmc2130_goto_step_while 3 264 5 35
tmc2130_goto_step_while 2 232 5 35
tmc2130_goto_step_while 1 200 5 35
tmc2130_goto_step_while 0 167 5 35
tmc2130_home_exit tmc2130_sg_homing_axes_mask=0x02
ok
Oops, updated capture correctly printing the values as unsigned ints:
I spent some more time noodling over this. Unfortunately, I'm not familiar enough with the expected behaviour or with TMC programming to know what is needed to fix it. My gut feeling is that the while loop expects cnt to always start above 0, and we've found a corner case where it is exactly 0. I'm just not certain whether changing the while to check for <=0 would only put a bandage on the symptom, with the real issue being that this value starts at 0 in the first place. If you have insights to point me in the right direction, I'd be happy to take this on with some guidance.
I will take a look into the issue this weekend.
@leptun if you find a solution, we can still merge it into 3.9
I've pushed up a candidate fix in #2342.
This particular corner case seems to be most aggravated when the axis doesn't get a good solid hit, e.g. there is a zip tie or something else in the way.
In my case, I ended up entering the function with an mscnt of 176, a cnt of 32, and a steps value of 38, giving a final "steps" value of -6 going into the while. C's implicit conversion of the signed steps int to an unsigned int gave the original symptom of cnt==0.
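For anyone following along, here is a minimal sketch of that arithmetic using the numbers above and the target step 43 from the failed-home log. It is a reconstruction that reproduces the reported values, not a copy of the firmware: `shortest_distance` and `as_loop_counter` are made-up names, and the 16-bit unsigned counter type is an assumption.

```cpp
#include <cstdint>

// Hypothetical reconstruction of the step-count arithmetic described in this thread.
// positions: microstep positions per electrical cycle (the "cnt 32" above).
// target:    requested step index (43 in the failed-home log).
// current:   mscnt >> shift (176 >> 5 == 5 in the failing case).
constexpr int shortest_distance(int positions, int target, int current)
{
    int steps = target - current;
    if (steps < 0)
        steps = -steps;               // work with the absolute distance
    if (steps > positions / 2)
        steps = positions - steps;    // go the other way round; negative if steps > positions
    return steps;
}

// ASSUMPTION: the loop counter is a 16-bit unsigned value.
// Conversion of a negative int to uint16_t is performed modulo 65536.
constexpr uint16_t as_loop_counter(int steps)
{
    return static_cast<uint16_t>(steps);
}

// Failing case from this thread: |43 - 5| = 38, then 32 - 38 = -6.
static_assert(shortest_distance(32, 43, 176 >> 5) == -6, "failing home");
static_assert(as_loop_counter(-6) == 65530, "wraps instead of staying small");
// Normal home from the log: |35 - 18| = 17, then 32 - 17 = 15 loop iterations.
static_assert(shortest_distance(32, 35, 576 >> 5) == 15, "normal home");
```

The normal case agrees with the capture above, where the while prints cnt from 14 down to 0, i.e. 15 iterations.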
I suspect this fix may have a minor impact on homing repeatability, but given that the primary cause appears to be something squishy in the axis path already, I wouldn't be too worried about it in the first place... :-/
Alternately, we might be able to use it to issue a warning to the user ("Homing failed, ensure axis is ....") but that would require additional research to understand and I don't think my sample size of one printer is sufficient.
The code in question appears to be designed to find the nearest "reachable" step value (based on ustep resolution/M350) to the actual desired target step; if this is correct then a better fix would potentially be to move over a "full" motor step and then move the remaining microsteps.
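One way to read "reachable" here, as a sketch only: MSCNT on the TMC2130 is a 10-bit counter (0..1023), so after the right shift used in the while loop only a limited range of step indices can ever match. The helper names below are made up for illustration, and the mapping of shift 5 to 8 microsteps is taken from the logs in this thread.

```cpp
#include <cstdint>

constexpr uint16_t kMscntRange = 1024;  // MSCNT is a 10-bit counter: 0..1023

// Hypothetical helper: number of distinct values (mscnt >> shift) can take.
constexpr uint16_t index_count(uint8_t shift)
{
    return kMscntRange >> shift;
}

// Hypothetical helper: can (mscnt >> shift) ever equal the requested step?
constexpr bool is_reachable(uint16_t step, uint8_t shift)
{
    return step < index_count(shift);
}

// shift 5 (8 microsteps in the logs above): indices 0..31 only.
static_assert(index_count(5) == 32, "32 positions at shift 5");
static_assert(!is_reachable(35, 5), "target from the normal-home log");
static_assert(!is_reachable(43, 5), "target from the failed-home log");
// shift 4 (16 microsteps): indices 0..63, which lines up with the orig <= 63 check.
static_assert(index_count(4) == 64, "64 positions at shift 4");
static_assert(is_reachable(43, 4), "same target is reachable at shift 4");
```

If this reading is right, neither logged target (35 or 43) can be matched at shift 5, which fits the normal-home capture: the last printed mscnt is 167, 167 >> 5 is 5 rather than 35, and the loop ends only because cnt runs out.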