When StealthChop is enabled for Z, babystepping doesn't work. Initially the motor moves just a little, then it stops moving completely, and even after the build finishes it won't move at all until the board is reset.
This is Marlin build 2.0.x from 18.01.2020.
Expected behavior:
Z axis moves according to the knob.
Actual behavior:
Z axis stops moving after a few turns and is completely disabled afterwards.
Hardware is TMC2208 on an SKR 1.4.
Please test the bugfix-2.0.x branch to see where it stands. If the problem has been resolved then we can close this issue. If the issue isn't resolved yet, then we should investigate further.
This was tested on a firmware from the 18th.
I've updated the bug. When I said 18, I meant the 18th of this month, as in Marlin 2.0.x.
This issue is confirmed. For now the solution is to lower your babystep multiplier setting and re-flash. Try dividing it in half as a starting-point.
I have been unable to reproduce this. I tried with my original SKR 1.3 configuration, then with several modifications to get closer to the posted configs. I even swapped out my 2208 for the specific module I was using to test shutdowns with linear advance.
During my tests I had visible Z activity due to bed leveling, and also tested in vase mode to allow testing during continuous Z movement.
I'm out of ideas to get it to happen on my machine. My goal was to reproduce the issue, then do a mixed-signal capture of the motor outputs and the step pulses at the same time. My hypothesis is that we will see two step pulses unreasonably close together just before the motors turn off.
Can somebody experiencing this issue post an STL and sliced gcode that you can reproduce it with? Maybe there is capability used in the file (Z-hop maybe?) that behaves poorly along with babystepping.
perhaps it's an issue with 1.4? i don't think it's related to a specific g-code i have several of them
Mine hasn't made it here from China yet unfortunately....
I also have a 1.4 on the way, I doubt that is the issue. The 1.3 and 1.4 just aren't that different. It has also been seen on the TH3D EZBoard.
@emaayan, can you confirm that your 1.4 is the non-turbo version? I know that is what your config says, but perhaps things could misbehave if your board has an LPC1769 (turbo) and you build for LPC1768 (non-turbo).
i don't think it's related to a specific g-code i have several of them
Your slicer is undoubtedly configured differently than mine, so having a file known to cause issues on your printer might help reproduce it on mine.
I have pure G-code that also causes this to happen. Also, I have the SKR 1.4 non-turbo.
A lot of updates and fixes have happened in the last week. Is the problem still there?
i have pure g-code that also causes this happen…
@emaayan — If the issue is still occurring, that G-code would be good to check out. Did you ever try reducing the babystep multiplier on your machine, and what value made it work? And, what is your Z steps-per-mm value?
Dropping steps/mm from 800 to 400 (by halving the driver microstepping from 1/16 to 1/8) makes the issue go away. That is what we did for 800 steps/mm machines with the TMC drivers, and we have not had an issue since. So something is going on (maybe driver timings being too quick) when steps/mm is higher than 400.
You can still run interpolated mode on the TMC driver, so nothing of value is lost by running at 1/8 instead of 1/16 until the cause of the drivers shutting down with Z above 400 steps/mm is found.
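To make the 800 vs. 400 figures concrete, here is a minimal sketch of how steps/mm follows from the microstepping setting. The 200-step motor and the 4 mm screw lead are assumptions, picked only because they reproduce the numbers in this comment; `steps_per_mm` is a hypothetical helper, not a Marlin function.

```cpp
#include <cassert>

// Steps per millimetre for a lead-screw axis.
//   full_steps_per_rev: motor full steps per revolution (200 for a 1.8-degree motor)
//   microsteps:         driver microstepping setting (e.g. 8 or 16)
//   lead_mm:            screw travel per revolution, in millimetres
double steps_per_mm(int full_steps_per_rev, int microsteps, double lead_mm) {
  assert(lead_mm > 0.0);
  return (full_steps_per_rev * microsteps) / lead_mm;
}
```

With the assumed hardware, `steps_per_mm(200, 16, 4.0)` gives 800 and `steps_per_mm(200, 8, 4.0)` gives 400, so halving the microstepping halves the number of step pulses needed for the same Z travel.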
Maybe a similar experience today. In my start script I print a prime-nozzle line, and I normally babystep while this line is printing:
G1 X5 Y15
G1 Z0.2
G1 X250 E20 F800
G1 Z10
That has always been successful.
Today I was too late. After the last line of the start script (G1 Z10), Cura does a diagonal move (XYZ) to the start point of the skirt, and I used babystepping while this move was in progress. That crashed the Z axis completely.
I aborted the print and ran G28: X home OK, Y home OK, Z home travelled to the Z-safe-homing point, then no Z move.
Killed message, had to reset the printer.
Yes, the issue is still happening. I just tried with Marlin 2.0.3. I tried it with my eternal ball and chain (a.k.a. the bed leveling model, which is what I've been printing for the last couple of months: a flat rectangle).
I should also add that now, even in spreadCycle mode, babystepping just hangs after a while: the LCD and the Z motor stop responding to the knob. After I stop the print, the Z axis does respond again.
my settings:
*/
#define BABYSTEPPING
#if ENABLED(BABYSTEPPING)
  #define BABYSTEP_WITHOUT_HOMING
  //#define BABYSTEP_XY                     // Also enable X/Y Babystepping. Not supported on DELTA!
  #define BABYSTEP_INVERT_Z false           // Change if Z babysteps should go the other way
  #define BABYSTEP_MULTIPLICATOR_Z 10       // Babysteps are very small. Increase for faster motion.
  #define BABYSTEP_MULTIPLICATOR_XY 1
  #define DOUBLECLICK_FOR_Z_BABYSTEPPING    // Double-click on the Status Screen for Z Babystepping.
  #if ENABLED(DOUBLECLICK_FOR_Z_BABYSTEPPING)
    #define DOUBLECLICK_MAX_INTERVAL 1250   // Maximum interval between clicks, in milliseconds.
                                            // Note: Extra time may be added to mitigate controller latency.
    #define BABYSTEP_ALWAYS_AVAILABLE       // Allow babystepping at all times (not just during movement).
    //#define MOVE_Z_WHEN_IDLE              // Jump to the move Z menu on doubleclick when printer is idle.
    #if ENABLED(MOVE_Z_WHEN_IDLE)
      #define MOVE_Z_IDLE_MULTIPLICATOR 1   // Multiply 1mm by this factor for the move step size.
    #endif
  #endif
I have a similar problem. Babystepping does not work after G28: it just won't move down. Moving up is OK. The number on the LCD changes, but the Z-axis motors just twitch.
I just enabled #define BABYSTEP_WITHOUT_HOMING and used babystepping before G28, immediately after switching on the printer, and babystepping is completely OK.
So I think the problem is related to G28 (homing).
I have an SKR 1.3 and TMC2209.
I just tried:
Changing the value of BABYSTEP_MULTIPLICATOR from 1 to 20
TMC2209 microstep modes 4, 8, 16 and steps per mm 100-400
... still the same problem
Today it happened again on a different printer. I did babystepping while a Z move was in progress.
The Z axis no longer responded, and I had to reset the printer.
SKR 1.3 with TMC2208/2209, bugfix on both printers, updated once a week (not on the same day).
The original issue reminds me that stealthChop has to be on for this failure to occur.
If you find that BABYSTEPPING works with stealthChop turned off, then perhaps it will make sense to turn off stealthChop as a general strategy to ensure good baby-stepping. I'm speculating that maybe the small pulses injected at unusual intervals are not something that stealthChop quite knows how to manage, so the driver itself is going to sleep.
The thing is, Marlin doesn't do anything differently when stealthChop is on or off, so perhaps these SKR boards have some kind of flaw or addition that can't handle out-of-time pulses when stealthChop is enabled.
So, given that pulse timing is the most likely culprit, see if it helps to add this to your config:
#define MINIMUM_STEPPER_PULSE 1
And if that doesn't work, try increasing it to 2 to see if that helps.
Another piece of data that would be useful is to see the output of M122 both before and after the Z stepper driver fails. To get that command in Marlin enable the MONITOR_DRIVER_STATUS option.
Agreed, this sounds very much exactly like the linear advance pulse timing issues that cause them to go into shutdown.
If my memory is correct, an increase to 2 helped a little and square wave stepping helps a little. Neither resolved it completely. The large number of steps will definitely make it more likely. It probably only happens on fast MCUs because of how fast they can set the pulses. When a babystep pulse lands right next to a leveling adjustment pulse, bad things can happen.
The only way to know for sure is with a logic analyzer hooked up, but that's what it smells like.
Every time I have captured TMC2208 drivers shutting down in Linear Advance, it is immediately preceded by two or more very rapid pulses in the middle of a slow pulse train. As an example, imagine two pulses 2us apart in the middle of a 200us pulse train.
I assume something like this is happening with babystepping, but I'm not exactly sure how it is happening, and I haven't yet captured the failure on an analyzer.
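For anyone who exports step-edge timestamps from a logic analyzer, the pattern described above (a very short gap inside an otherwise slow pulse train) can be searched for mechanically. This is only a sketch: `first_short_gap` is a hypothetical helper, and the input is assumed to be rising-edge times in microseconds, sorted in ascending order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first step pulse that follows its predecessor by
// less than min_gap_us, or -1 if every gap is at least min_gap_us.
// timestamps_us must be sorted in ascending order.
long first_short_gap(const std::vector<uint32_t>& timestamps_us, uint32_t min_gap_us) {
  for (std::size_t i = 1; i < timestamps_us.size(); ++i) {
    assert(timestamps_us[i] >= timestamps_us[i - 1]);
    if (timestamps_us[i] - timestamps_us[i - 1] < min_gap_us)
      return static_cast<long>(i);
  }
  return -1;
}
```

For a 200us train with one pulse only 2us after its predecessor, e.g. `{0, 200, 400, 402, 600}` with a 10us threshold, the function reports index 3.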
Some possible ideas from the following code:
#define BABYSTEP_AXIS(AXIS, INVERT, DIR) { \
const uint8_t old_dir = _READ_DIR(AXIS); \
_ENABLE(AXIS); \
DELAY_NS(MINIMUM_STEPPER_PRE_DIR_DELAY); \
_APPLY_DIR(AXIS, _INVERT_DIR(AXIS)^DIR^INVERT); \
DELAY_NS(MINIMUM_STEPPER_POST_DIR_DELAY); \
_SAVE_START; \
_APPLY_STEP(AXIS)(!_INVERT_STEP_PIN(AXIS), true); \
_PULSE_WAIT; \
_APPLY_STEP(AXIS)(_INVERT_STEP_PIN(AXIS), true); \
_APPLY_DIR(AXIS, old_dir); \
}
It would be great if someone who can reproduce the problem could try replacing that macro (in stepper.cpp) with this version, and re-test.
This isn't perfect. It is going to add some delays that aren't always necessary, and _PULSE_WAIT probably isn't exactly the correct delay for MAXIMUM_STEPPER_RATE, but it should provide some insight.
#define BABYSTEP_AXIS(AXIS, INVERT, DIR) { \
const uint8_t old_dir = _READ_DIR(AXIS); \
_ENABLE(AXIS); \
DELAY_NS(MINIMUM_STEPPER_PRE_DIR_DELAY); \
_APPLY_DIR(AXIS, _INVERT_DIR(AXIS)^DIR^INVERT); \
DELAY_NS(MINIMUM_STEPPER_POST_DIR_DELAY); \
_SAVE_START; \
_APPLY_STEP(AXIS)(!_INVERT_STEP_PIN(AXIS), true); \
_PULSE_WAIT; \
_APPLY_STEP(AXIS)(_INVERT_STEP_PIN(AXIS), true); \
DELAY_NS(MINIMUM_STEPPER_PRE_DIR_DELAY); \
_APPLY_DIR(AXIS, old_dir); \
DELAY_NS(MINIMUM_STEPPER_POST_DIR_DELAY); \
_PULSE_WAIT; \
}
Maybe
#define BABYSTEP_AXIS(AXIS, INVERT, DIR) { \
const uint8_t old_dir = _READ_DIR(AXIS); \
_ENABLE(AXIS); \
if(_INVERT_DIR(AXIS)^DIR^INVERT != old_dir) { \
DELAY_NS(MINIMUM_STEPPER_PRE_DIR_DELAY); \
_APPLY_DIR(AXIS, _INVERT_DIR(AXIS)^DIR^INVERT); \
DELAY_NS(MINIMUM_STEPPER_POST_DIR_DELAY); \
} \
_SAVE_START; \
_APPLY_STEP(AXIS)(!_INVERT_STEP_PIN(AXIS), true); \
_PULSE_WAIT; \
_APPLY_STEP(AXIS)(_INVERT_STEP_PIN(AXIS), true); \
if(_INVERT_DIR(AXIS)^DIR^INVERT != old_dir) { \
DELAY_NS(MINIMUM_STEPPER_PRE_DIR_DELAY); \
_APPLY_DIR(AXIS, old_dir); \
DELAY_NS(MINIMUM_STEPPER_POST_DIR_DELAY); \
} \
_PULSE_WAIT; \
}
And to minimize impact, maybe bring these down to 200ns instead of the default 650 when used here.
I recommend just leaving at their defaults. The point is actually to maximize impact. If the behavior improves, then more time can be spent to optimize the solution.
Right, but for a TMC the spec is 100 ns, so 200 ns is still double that. If too many moves get queued up I'm worried about an overflow, as I haven't followed back how they're queued, if at all. But no reason not to try both!
before crash
M122 Log Output
Send: M122
Recv: X Y Z Z2 E
Recv: Address 0 0 0
Recv: Enabled true true true true false
Recv: Set current 700 700 800 800 850
Recv: RMS current 673 673 795 795 826
Recv: MAX current 949 949 1121 1121 1165
Recv: Run current 21/31 21/31 25/31 25/31 26/31
Recv: Hold current 14/31 14/31 17/31 17/31 18/31
Recv: CS actual 14/31 14/31 17/31 17/31 18/31
Recv: PWM scale 15 15 18 19 21
Recv: vsense 1=.18 1=.18 1=.18 1=.18 1=.18
Recv: stealthChop true true true true false
Recv: msteps 16 16 16 16 16
Recv: tstep max max max max max
Recv: pwm
Recv: threshold
Recv: [mm/s]
Recv: OT prewarn false false false false false
Recv: OT prewarn has
Recv: been triggered false false false false false
Recv: off time 4 4 4 4 4
Recv: blank time 24 24 24 24 24
Recv: hysteresis
Recv: -end 2 2 2 2 2
Recv: -start 1 1 1 1 1
Recv: Stallguard thrs 80 40 0
Recv: DRVSTATUS X Y Z Z2 E
Recv: stst * * * * *
Recv: olb
Recv: ola
Recv: s2gb
Recv: s2ga
Recv: otpw
Recv: ot
Recv: 157C
Recv: 150C
Recv: 143C
Recv: 120C
Recv: s2vsa
Recv: s2vsb
Recv: Driver registers:
Recv: X 0xC0:0E:00:00
Recv: Y 0xC0:0E:00:00
Recv: Z 0xC0:11:00:00
Recv: Z2 0xC0:11:00:00
Recv: E 0x80:12:00:00
Recv:
Recv:
Recv: Testing X connection... OK
Recv: Testing Y connection... OK
Recv: Testing Z connection... OK
Recv: Testing Z2 connection... OK
Recv: Testing E connection... OK
Recv: ok
after Z fail
M122 Log Output
Send: M122
Recv: X Y Z Z2 E
Recv: Address 0 0 0
Recv: Enabled true true true true false
Recv: Set current 700 700 800 800 850
Recv: RMS current 673 673 795 795 826
Recv: MAX current 949 949 1121 1121 1165
Recv: Run current 21/31 21/31 25/31 25/31 26/31
Recv: Hold current 14/31 14/31 17/31 17/31 18/31
Recv: CS actual 14/31 14/31 25/31 25/31 18/31
Recv: PWM scale 14 14 31 30 21
Recv: vsense 1=.18 1=.18 1=.18 1=.18 1=.18
Recv: stealthChop true true true true false
Recv: msteps 16 16 16 16 16
Recv: tstep max max 561 561 max
Recv: pwm
Recv: threshold
Recv: [mm/s]
Recv: OT prewarn false false false false false
Recv: OT prewarn has
Recv: been triggered false false false false false
Recv: off time 4 4 4 4 4
Recv: blank time 24 24 24 24 24
Recv: hysteresis
Recv: -end 2 2 2 2 2
Recv: -start 1 1 1 1 1
Recv: Stallguard thrs 80 40 0
Recv: DRVSTATUS X Y Z Z2 E
Recv: stst * * *
Recv: olb
Recv: ola
Recv: s2gb
Recv: s2ga
Recv: otpw
Recv: ot
Recv: 157C
Recv: 150C
Recv: 143C
Recv: 120C
Recv: s2vsa
Recv: s2vsb
Recv: Driver registers:
Recv: X 0xC0:0E:00:00
Recv: Y 0xC0:0E:00:00
Recv: Z 0x40:19:00:20
Recv: Z2 0x40:19:00:10
Recv: E 0x80:12:00:00
Recv:
Recv:
Recv: Testing X connection... OK
Recv: Testing Y connection... OK
Recv: Testing Z connection... OK
Recv: Testing Z2 connection... OK
Recv: Testing E connection... OK
Recv: ok
failure is easy to reproduce
G28
doubleclick on controller for babystep menu
G1 Z100 meanwhile rotate encoder for babysteps back and forth
Z axis no more action
another G28 will home x and y, then M112 is called
before crash, babysteps do hard knocking moves
testing in spreadcycle - works
testing Z in spreadcycle and Z2 in stealthchop - only Z2 crashes
before crash, babysteps do hard knocking moves
changing BABYSTEP_MULTIPLICATOR_Z to 5 and 2 - crashes Z immediately if turning encoder
BABYSTEP_MULTIPLICATOR to 8 and 12 - requires more babystep moves before crash
always knocking while babystepping
I can also add that using HOMING_FEEDRATE_Z (960), as well as values above that (like 1060), with StealthChop will also make the BLTouch fail (meaning it would home the Z, but the probe retracts while the Z is still on the bed).
It would be great if someone who can reproduce the problem could try replacing that macro (in stepper.cpp) with this version, and re-test…
does not change anything
Thanks for trying it out! That at least rules out the direction change delay.
Maybe
#define BABYSTEP_AXIS(AXIS, INVERT, DIR) { \
const uint8_t old_dir = _READ_DIR(AXIS); \
_ENABLE(AXIS); \
if(_INVERT_DIR(AXIS)^DIR^INVERT != old_dir) { \
DELAY_NS(MINIMUM_STEPPER_PRE_DIR_DELAY); \
_APPLY_DIR(AXIS, _INVERT_DIR(AXIS)^DIR^INVERT); \
DELAY_NS(MINIMUM_STEPPER_POST_DIR_DELAY); \
} \
_SAVE_START; \
_APPLY_STEP(AXIS)(!_INVERT_STEP_PIN(AXIS), true); \
_PULSE_WAIT; \
_APPLY_STEP(AXIS)(_INVERT_STEP_PIN(AXIS), true); \
if(_INVERT_DIR(AXIS)^DIR^INVERT != old_dir) { \
DELAY_NS(MINIMUM_STEPPER_PRE_DIR_DELAY); \
_APPLY_DIR(AXIS, old_dir); \
DELAY_NS(MINIMUM_STEPPER_POST_DIR_DELAY); \
} \
_PULSE_WAIT; \
}
And to minimize impact, maybe bring these down to 200ns instead of the default 650 when used here.
tested with these settings
does not change anything
crashes Z immediately if turning encoder
@rado79, I want to make sure I understand your last comment. The Z motor stops/crashes with all the settings you tried?
Yes, it always crashes. The only difference is whether it crashes immediately when turning the encoder or requires more babystep moves before the crash.
Here are the highlighted differences in the stepper driver statuses…
Before the failure:
Recv: PWM scale 15 15 18 19 21
Recv: tstep max max max max max
Recv: stst * * * * *
After the failure:
Recv: PWM scale 14 14 31 30 21
Recv: tstep max max 561 561 max
Recv: stst * * *
Check and see if it helps to follow the StealthChop tuning procedure outlined in section 6.2 of the datasheet. We might just need to provide better tuning defaults for some of these axes.
And also for fun, enable BABYSTEP_XY and see if XY babystepping also leads to the same motor shutdown as Z babystepping.
@sjasonsmith, I foresee a rewrite of Babystepping (which is not very complicated) so that it uses a new methodology. The stepper ISR should just include the extra steps (or leave them out) as part of the stepping phase and we can get rid of this injected and likely out-of-phase step and DIR pulse in the stepper.babystep method.
@rado79 — I would bet that if you have bed leveling turned OFF and start some moves in XY and play around with Z babystepping that the Z stepper will shut down more quickly if you move the knob back and forth to change the DIR pin rapidly. Or, if you are doing a move that includes Z, if you do babystepping in the direction away from Z movement, it probably shuts down faster. I just can't imagine that toggling the DIR pin rapidly makes the driver very happy.
@thinkyhead I agree with you. I’m trying to find a way to reproduce it very reliably so that I can confirm my suspected root cause with an analyzer, and can have confidence that any fix has solved the problem.
I’m not sure why it is so much harder to reproduce on my printer than others. Perhaps I need to adjust my current up and down and see if that makes it more susceptible.
I’m probably going to wear out my encoder with this one. I went through two batteries on my drill using it to turn the encoder knob last night! I saw about five failures, but nothing consistent enough for me to consider reproducible yet.
@thinkyhead I tried BABYSTEP_XY to get the X and Y drivers to shut down. No chance, it works like a charm. I tried the tests above (with G1 Z100 and babysteps), but now only babystepping in the Z+ direction with no direction change: same issue as above. If I activate UBL, start a print and use babystepping while the UBL Z-correction does its job, I cannot reproduce the shutdown, and there is no knocking noise.
@sjasonsmith I only need 1 to 10 clicks on the encoder until the shutdown happens.
@thinkyhead What I also did: disable UBL, start a print and do a lot of rapid back-and-forth moves with the encoder. That does not show the issue, and there is no knocking sound while babystepping. So for me there is no issue with direction changes. But this gave me another idea: I started a print with Z-hop enabled and tried to get some babysteps in during the Z-hop, and it did not take long to reproduce the shutdown.
Test with Z feedrate from 1mm/s to 25mm/s does show the issue.
Should I do more tests or is it more useful to wait for a rewrite?
@rado79, probably no more testing is needed on your end for now, unless someone comes up with other experiments to try.
I would like to follow your exact instructions to reproduce the issue. Could you post your updated configs (if different from 3 days ago), as well as the gcode file you are using to reproduce this? My attempts at babystepping during a variety of normal Z moves typically results in noisy steppers but not disabled steppers.
My hope is to come up with some steps that allow me to reproduce it very consistently on my own printer, so that I can capture the signals at the point of failure and see what is going on. Right now we think we know what is happening, but I like to be able to see/measure it to be sure.
Check and see if it helps to follow the StealthChop tuning procedure outlined in section 6.2 of the datasheet. We might just need to provide better tuning defaults for some of these axes.
@thinkyhead sorry i am not able to do that, normally i fix cars.
@sjasonsmith you can use the config from above; after every test I reverted the changes to my defaults. I do not test with G-code. I test with a single G1 Z100 move, or with my start script that ends with G1 Z10, because I do not have problems during a print (except the Z-hop test from above, gcode-Quick_Retraction_Torture_Test.zip).
i do with a single G1 Z100
This move is just using the default speed based on the configuration? (Homing speed, probably)
To clarify, instructions would match this:
Yes, thats it. I tried with M203 Z25, Z15, Z10, Z5, Z1 but Z25 is my default.
Great, I'll try it again later tonight.
You have to G28 otherwise babystep is not enabled in my config.
Another thing to try for fun: M290 Zn.nn and see if different length moves lead to the driver shutdown.
@sjasonsmith — One poison ingredient seems to be many steps on an axis, thus requiring a high babysteps multiplier and leading to a larger number of pulses being sent over a longer period of time.
Babysteps are currently initiated by the temperature ISR at a fixed frequency of ~1kHz, so when you double an axis's steps-per-mm ratio, the axis will babystep at half the speed. In the old Stepper ISR when it needed to do a large number of pulses it would loop up to 4 times. Perhaps Stepper::babystep should check whether the axis has a large steps-per-mm value and if so then loop a few times. (It should of course un-block interrupts in-between babysteps.)
While I think it's good to keep collecting data on this phenomenon and figure out which boards are most susceptible, ultimately the best solution will be to move the actual babystepping into the Stepper ISR and process it there. I will explore that asap.
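As a rough illustration of the rate argument in this comment, the sketch below works out how fast an axis can babystep when one pulse is issued per tick of a fixed-rate ISR. The helper names are hypothetical, the 1 kHz tick is the approximate figure quoted above, and the idea that one encoder click queues `BABYSTEP_MULTIPLICATOR_Z` steps is an assumption made for the example.

```cpp
#include <cassert>

// Fastest axis speed (mm/s) reachable when one babystep pulse is issued per tick.
double max_babystep_speed_mm_s(double tick_hz, double steps_per_mm) {
  assert(steps_per_mm > 0.0);
  return tick_hz / steps_per_mm;
}

// Time (ms) needed to emit all the steps queued by one encoder click,
// assuming a click queues `multiplicator` steps (an assumption, see above).
double click_duration_ms(int multiplicator, double tick_hz) {
  assert(tick_hz > 0.0);
  return 1000.0 * multiplicator / tick_hz;
}
```

With a 1 kHz tick, 400 steps/mm allows at most 2.5 mm/s and 800 steps/mm allows 1.25 mm/s, which is the halving described above. A multiplier of 10 spreads each click over about 10 ms of pulses, so a larger multiplier keeps babystep pulses on the wire for longer.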
@thinkyhead Based on my observations working with linear advance, I think doing multiple steps at a time might make matters worse. I always see the drivers fail right after two rapid pulses occur on the step line. Hopefully I will be able to reproduce and capture the babystepping failure tonight, to verify whether it looks the same.
One option might be to perform it in the step ISR, but only if there isn't already a step to perform. That would not provide perfectly smooth acceleration, but would ensure the pulses were not immediately adjacent to each other, unless multi-stepping were engaged.
After hours of trying to reliably reproduce this, I finally found something that made a difference.
I switched to a larger Z motor. Now I can reproduce the stall within a few seconds turning the encoder back and forth.
It is late, so I will have to try to capture the behavior tomorrow.
this is the motor i'm using
https://www.amazon.com/STEPPERONLINE-Stepper-63-74oz-Connector-Extruder/dp/B07LF898KN?ref_=ast_sto_dp&th=1&psc=1
Here is a trace of a motor shutting down (image: https://user-images.githubusercontent.com/20053467/74091861-86b52600-4a71-11ea-9af0-4e76d7de97b8.png). As I expected, it occurs shortly after two pulses occur very close together. In this case the two circled pulses occurred 15us from each other, where previous pulses were several hundred microseconds apart.
The babysteps occur at a steady rate, but the "normal" steps are accelerating and decelerating. With more steps/mm, the chance increases that these two steps from different sources will align on a millisecond boundary.
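A toy model can show why a denser step train makes near-collisions more likely. It is deliberately simplified and not taken from Marlin: move steps are assumed to occur at a constant interval starting at t = 0 (no acceleration), babysteps at a fixed interval with some offset, and `count_near_collisions` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Counts babysteps that land closer than window_us to a move step.
// Move steps are assumed at t = 0, move_interval_us, 2 * move_interval_us, ...
// Babysteps are assumed at baby_offset_us + k * baby_interval_us, for t < duration_us.
int count_near_collisions(uint32_t move_interval_us, uint32_t baby_interval_us,
                          uint32_t baby_offset_us, uint32_t duration_us,
                          uint32_t window_us) {
  assert(move_interval_us > 0 && baby_interval_us > 0);
  int count = 0;
  for (uint32_t t = baby_offset_us; t < duration_us; t += baby_interval_us) {
    const uint32_t since_prev = t % move_interval_us;           // time since the previous move step
    const uint32_t until_next = move_interval_us - since_prev;  // time until the next move step
    const uint32_t gap = since_prev < until_next ? since_prev : until_next;
    if (gap < window_us) ++count;
  }
  return count;
}
```

With made-up numbers (babysteps every 1000us offset by 500us, a 3000us window, a 60us threshold), a 620us move interval gives one near-collision and a 310us interval gives two. A real move accelerates and decelerates, so the actual alignment pattern will differ.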
Isn't this like a race condition in software?
@sjasonsmith — The way I imagine a (slightly better) Stepper ISR-based solution is to use the babysteps counter to simply change the number of steps going into the stepping phase by some increment until the babysteps counter reaches zero, but making sure not to affect the stepper counters with these babystep moves (i.e., use for logic, but then adjust). And if you do "babystepping" when there are no moves in the planner buffer, we could substitute a regular planner block, but set a new flag that tells the stepper ISR to keep its step counters unchanged during the move.
Isn't this like a race condition in software?
No, more like…
These two steps from different sources will align on a millisecond boundary.
The babystep method disables interrupts and applies a delay at the front and one at the back before re-enabling interrupts. This should prevent overlapping steps, but apparently one or both of the front/back delays within the interrupt-free block are too small.
It almost looks like interrupts are not being blocked during the babystep. See if it helps to make this change in Stepper::babystep …
void Stepper::babystep(const AxisEnum axis, const bool direction) {
- cli();
+ DISABLE_ISRS();
. . .
- sei();
+ ENABLE_ISRS();
}
If that change alone is not enough, please test #16813 to see if it eliminates the problem.
I've been away all day, but am about to start some more testing with this.
- cli(); + DISABLE_ISRS();
This doesn't change anything. The steps aren't really overlapping, they are just close together.
I found I can make the problem go away by adding very long delays before and after each babystep, to ensure they are not occurring adjacent to normal move steps. 100 microseconds seems to work. That is clearly not ideal, but it provides a data point...
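The experiment above used fixed busy-wait delays. A different way to express the same spacing requirement, shown here only as a sketch, is to check whether a babystep would keep a guard interval clear of the neighbouring move steps and to defer it otherwise. `babystep_has_clearance` and its parameters are hypothetical; Marlin does not expose the previous and next step times in this form.

```cpp
#include <cassert>
#include <cstdint>

// True if a babystep issued at now_us stays at least guard_us away from both
// the previous move step (last_step_us) and the next one (next_step_us).
// Assumes last_step_us <= now_us <= next_step_us on the same free-running
// microsecond counter; unsigned subtraction then tolerates counter wrap.
bool babystep_has_clearance(uint32_t now_us, uint32_t last_step_us,
                            uint32_t next_step_us, uint32_t guard_us) {
  const uint32_t since_last = now_us - last_step_us;
  const uint32_t until_next = next_step_us - now_us;
  return since_last >= guard_us && until_next >= guard_us;
}
```

With the 100us figure from the experiment as `guard_us`, a babystep 200us after the last move step and 300us before the next one passes, while one that is only 50us from either neighbour is rejected and would have to wait.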
I will now take a look at the PR you posted.
change the number of steps going into the stepping phase by some increment until the babysteps counter reaches zero, but making sure not to affect the stepper counters with these babystep moves
@thinkyhead, with what you are imagining, would babysteps be applied in a location where they would be subject to the same acceleration limits as normal moves?
I'm not really familiar with the planner, so I don't know when (or how) it makes decisions about the pulse rate for each stepper. Is that something that is pre-calculated to an extent that babysteps would not feel responsive to a user, or that would have to wait a long time during long straight moves?
Ultimately linear advance needs the same type of treatment, since it causes issues for the same reason.
@thinkyhead, with what you are imagining, would babysteps be applied in a location where they would be subject to the same acceleration limits as normal moves?
I'm reaching for something which could be as non-disruptive as possible and will not require doing reverse steps when a move is underway. What you ultimately want is for the stepper ISR to intentionally skip steps when moving in one direction, and to intentionally introduce extra steps when moving in the other direction. However, the stepper ISR runs at different rates and is not a very good place to accomplish this if you want regular and predictable babystepping.
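The skip-or-add idea in this paragraph can be sketched as a small decision function, combined with the earlier suggestion to emit an extra step only when no step is already due. Everything here is hypothetical (`TickResult`, `babystep_tick`, the sign convention); it is not how Marlin's stepper ISR is structured, and it ignores the step-counter compensation mentioned elsewhere in the thread.

```cpp
#include <cassert>

struct TickResult {
  bool emit_pulse;  // whether a step pulse goes out on this tick
  int  pending;     // babysteps still owed afterwards (signed)
};

// move_dir:     +1 or -1, direction of the move in progress on this axis
// step_planned: true if the move would normally step on this tick
// pending:      signed babysteps still to apply (+ means the +1 direction)
TickResult babystep_tick(int move_dir, bool step_planned, int pending) {
  assert(move_dir == 1 || move_dir == -1);
  if (pending == 0) return {step_planned, 0};
  const int baby_dir = pending > 0 ? 1 : -1;
  if (baby_dir != move_dir && step_planned)
    return {false, pending - baby_dir};  // drop one move step: one babystep against the move
  if (baby_dir == move_dir && !step_planned)
    return {true, pending - baby_dir};   // use an idle tick for one extra step with the move
  return {step_planned, pending};        // nothing can be absorbed on this tick
}
```

Because pulses are only ever dropped or placed on ticks that were otherwise idle, the DIR pin never toggles mid-move and two pulses never share a tick. The cost is the one raised in the paragraph above: how quickly babysteps are absorbed depends on the rate of the move in progress.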
So, we will just have to put a scope on the proposed patch and try adding delays around the babystep pulses to see whether or not it has any effect.
Babysteps was originally developed before bed leveling was common, so there was never any opposing Z motion during Z babystepping. Now we have a conflict, and I'm not sure that injecting sudden changes in direction is a good idea on any stepper driver. However the less sophisticated ones have been tolerant.
Unless we can get rid of these opposing signals we might have to suspend all Z motion in the stepper ISR during babystepping and apply a substitute.
Can you double-check that the issue still appears (or does not appear) when doing Z babystepping with bed leveling turned off?
I am not testing with a normal print, I have a contrived gcode file that generates Z movement while I babystep.
Hopefully one of the many people experiencing the issue can run a test with leveling disabled. I suspect the problem will go away, although it could still happen with Z hop or actual layer changes.
Actually, I think @rado79 already did this experiment and verified there was no issue with leveling and Z hop disabled.
I am not testing with a normal print, I have a contrived gcode file that generates Z movement while I babystep.
Babystepping with any old XY motion, leveling off, will be fine to test the condition.
Actually, I think @rado79 already did this experiment and verified there was no issue with leveling and Z hop disabled.
Two datapoints are better than one, so if you can re-confirm, that's a real help.
Meanwhile, I'm obsessing over how to interject or drop pulses in some orderly way, and if anything comes to me while I sleep I'll let you know….
Back from the outback. I cannot reproduce this issue while auto bed leveling is doing its job alone. I definitely need a Z travel move or Z-hop to shut down the driver; whether ABL is on or off does not matter. ABL was disabled on both machines when I first noticed this problem.
Maybe I have too little Z correction with ABL to reproduce the shutdown (flat bed).
I hate you, I'm over battling different sides of bed levels...
definitely a Z travel move or Z-hop to shut down the driver
Ok, so I suspect that the TMC220x is not going to like babystepping no matter what we do in the software, because the Standstill Detection is extremely sensitive to pulses against the current direction of motion, or even in line with it if they come at intervals that imply sudden acceleration.
So, I'm scanning the datasheet to see if there's a simple way to disable Standstill detection or make it less fussy. If we can stop the driver from shutting down, it might let our badly-timed steps through unmolested.
Ohhh Kayyy… The "only" practical solution that seems possible according to the datasheet is to tell the stepper driver that is being baby-stepped to go out of stealthChop mode when baby-stepping starts. Then after a second or two (with no baby-stepping) the stepper will re-enable stealthChop mode. This should prevent the stepper from shutting down, though it will require some testing to see whether babysteps moves the desired distance or if it loses some steps.
So, I will try to code up that solution now….
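A rough sketch of that timing logic, under stated assumptions: `FakeDriver`, `set_stealthchop` and `StealthChopGuard` are hypothetical stand-ins, not the real driver API, and the 2000 ms quiet period is only a placeholder for "a second or two". The sketch also assumes stealthChop was enabled to begin with.

```cpp
#include <cstdint>

// Hypothetical stand-in for the driver; a real build would talk to the
// TMC driver over UART here instead of flipping a flag.
struct FakeDriver {
  bool stealthchop = true;
  void set_stealthchop(bool on) { stealthchop = on; }
};

// Leaves stealthChop on the first babystep and restores it once no
// babystep has been seen for quiet_ms milliseconds.
class StealthChopGuard {
public:
  explicit StealthChopGuard(FakeDriver &d, uint32_t quiet_ms = 2000)
    : drv(d), quiet_period_ms(quiet_ms) {}

  // Call just before issuing a babystep.
  void on_babystep(uint32_t now_ms) {
    if (!suspended) { drv.set_stealthchop(false); suspended = true; }
    last_babystep_ms = now_ms;               // restart the quiet period
  }

  // Call periodically, e.g. from the idle loop.
  void poll(uint32_t now_ms) {
    // Unsigned subtraction stays correct across millisecond counter wrap-around.
    if (suspended && uint32_t(now_ms - last_babystep_ms) >= quiet_period_ms) {
      drv.set_stealthchop(true);
      suspended = false;
    }
  }

private:
  FakeDriver &drv;
  uint32_t quiet_period_ms;
  uint32_t last_babystep_ms = 0;
  bool suspended = false;
};
```

Each new babystep restarts the quiet period, so a user who keeps turning the knob stays out of stealthChop until they stop. Whether the mode switch itself costs steps is exactly what would still need testing.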
That won't help the _many_ people that use 2208s in standalone mode, though. Obscene delays surrounding babystepping might be the only option there. I guess standalone users are probably mostly on 8-bit boards, which might not exhibit the same issue (I'm unsure on this).
extremely sensitive to pulses against the current direction of motion, or even in line with it if they come at intervals that imply sudden acceleration
From what I've observed in my traces, they don't seem to have issues when the pulses are spaced out a bit (hundreds of microseconds), even when the direction is alternating with every pulse. It's really only when there are two sudden closely-spaced pulses, whether there is a direction change or not.
Obscene delays surrounding babystepping might be the only option there.
It has not yet been demonstrated that this would work.
From what I've observed in my traces, they don't seem to have issues when the pulses are spaced out a bit
Easier said than done, unfortunately. The obvious solution is to add a global array that stores the processor tick of the last pulse done on each axis before exiting the ISR, and then add a needed delay in front of any babystep that would be too close to that pulse. We don't know when the next stepper ISR pulse might be, so after every babystep there would need to be a delay corresponding to the worst case.
Because the "obvious" solution adds overhead to every stepper pulse and imposes extra delay on the steppers that are not involved in the babystepping, it is considered a bit of a last resort.
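For illustration only, the "obvious" approach might look roughly like the sketch below. All names are hypothetical. The 100-tick gap is a placeholder that matches the 100 microseconds reported earlier in the thread only if one tick equals one microsecond.

```cpp
#include <cstdint>

constexpr int kNumAxes = 3;              // X, Y, Z (assumption for this sketch)
constexpr uint32_t kMinGapTicks = 100;   // placeholder minimum pulse spacing, in timer ticks

// Tick of the most recent ISR pulse on each axis.
uint32_t last_pulse_tick[kNumAxes] = {0, 0, 0};

// The stepper ISR would call this after pulsing an axis. This is the
// per-pulse overhead mentioned above.
inline void record_pulse(int axis, uint32_t now) { last_pulse_tick[axis] = now; }

// Ticks to wait before a babystep on `axis` so that it lands at least
// kMinGapTicks after the last ISR pulse on that axis.
inline uint32_t babystep_pre_delay(int axis, uint32_t now) {
  const uint32_t elapsed = now - last_pulse_tick[axis];  // wrap-safe unsigned math
  return elapsed >= kMinGapTicks ? 0 : kMinGapTicks - elapsed;
}

// The next ISR pulse time is unknown, so the delay after a babystep is
// always the worst case.
inline uint32_t babystep_post_delay() { return kMinGapTicks; }
```

The post-delay is where the cost shows up: it is paid after every babystep and stalls every axis, not just the one being babystepped.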
Just a question: I was wondering what caused it in the first place, because it seems like something a lot of people would have detected early on.
@emaayan — 32-bit boards. And many more TMC220x drivers.
I repeated the tests described above after https://github.com/MarlinFirmware/Marlin/pull/16857.
//#define INTEGRATED_BABYSTEPPING - babystepping does not do anything - no reaction at all.
With a G1 Z100 I hear the typical stepper motor sound, which depends on the speed of the move.
Turning the encoder above/below +000.00 changes the sound to another stable frequency;
turning back to Z +000.00 changes the sound back to the previous frequency.
edit: no driver shutdown
@thinkyhead in relation to https://github.com/MarlinFirmware/Marlin/issues/16941#issuecomment-590419176
Downloaded today's bugfix 9040394e8e1eb67e995bee59b340309448329c2e and ran my tests again.
//#define INTEGRATED_BABYSTEPPING shutdown persists.
same good news with:
//#define BABYSTEPPING_EXTRA_DIR_WAIT / #define INTEGRATED_BABYSTEPPING - no shutdown reproducible.
I will need to test again on my setup. I've been very busy with work and was out of town last week, so couldn't continue my earlier testing.
I'm glad to hear it works with just BABYSTEPPING_EXTRA_DIR_WAIT. The integrated babystepping is still experimental, so I encourage trying some changes to the code wrapped in the option and see if anything changes. I will do more testing in this realm later this week.
There may be a misunderstanding due to my non-professional English.
It works for me with or without BABYSTEPPING_EXTRA_DIR_WAIT enabled.
Just INTEGRATED_BABYSTEPPING has to be enabled. Otherwise it comes to shutdown.
Just INTEGRATED_BABYSTEPPING has to be enabled….
Oh, okay! I wasn't aware that INTEGRATED_BABYSTEPPING was working any better than the old technique. I couldn't see any difference in my testing, but I haven't put the scope on it yet.
It will be good to keep testing with bed leveling both on and off, because the extra Z motion can exacerbate the shutdown problem.
@thinkyhead do you have a proposal for a scope for beginners?
I would be interested to dive into this topic - to gain more experience and maybe help debugging someday.
do you have a proposal for a scope for beginners?
Play around and have fun. The squiggly lines start to make sense after a while.
No good news guys, Z driver shutdown persists while using UBL.
Testing babysteps during a single G1 Z100 move has caused shutdowns in the past, now it works (with INTEGRATED_BABYSTEPPING).
I found this thread while trying to find a solution for a frustrating ABL problem.
BTT E3 DIP 1.0 with 4x TMC2208 in Stealthchop / BLTouch 3.1 / Prusaslicer 2.1.0
G29 is in the start script and Z-Offset is saved to the EEPROM during first layer.
No following print makes good use of the saved Z-Offset. Consecutive prints run the nozzle into the bed as if ~0.2mm were just subtracted from the Z-offset. Z-Offset adjusted again and saved. Next print, same problem. Working the numbers between -1.7 to -1.6 to -1.5
Upon restarting the machine, first layer is air-printing....
Is it possible that the baby steps needed for proper Z-offset are "killed"?
The print job itself does execute without problem once the first hurdles are taken.
Is it possible that trying to fix this bug also caused this bug in version 2.0.4.4 and beyond?
This only happens when StealthChop XY is engaged:
https://github.com/MarlinFirmware/Marlin/issues/17206
@emaayan try what's suggested here: https://github.com/MarlinFirmware/Marlin/issues/17323
It helped a similar issue I was having
Actually, I believe it's the same as here:
https://github.com/MarlinFirmware/Configurations/issues/62 (I haven't tried it yet)
@emaayan
Please test the bugfix-2.0.x branch to see where it stands.
This issue is stale because it has been open 30 days with no activity. Remove stale label / comment or this will be closed in 5 days.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.