Hi there
With the current bugfix version I experience lost steps on all axes with 8825 Stepper drivers
I tried the versions from June 16 and June 14.
The bugfix version from June 3 works fine.
I use a self-made CoreXY printer with 8825 drivers in 1/32 mode on X/Y and 1/16 mode on the Z axis. After the update, the X/Y motors made a strange noise, and when I tried to move in X it also moved in Y.
I played around with parameters and tried MINIMUM_STEPPER_PULSE 4 which helped, but when printing there are layer shifts on every layer.
I tried different MINIMUM_STEPPER_PULSE values in combination with MAXIMUM_STEPPER_RATE down to 65000 with no luck.
Disabling Linear-Advance helps also, but there are still layer shifts.
Disabling Adaptive Step Smoothing does not change anything
I swapped the 8825 drivers for spare A4988s in 1/16 mode and it prints fine again, but now I can see that the Z axis with 8825 drivers also loses steps.
I will keep investigating, but I need help.
Thanks in advance
There's a possibility that this is related to the layer shift problem we've been having for a long time. I've changed the title so that people from that thread will look at this one.
It's possible that it is related, but I didn't have this problem before yesterday.
The firmware from June 3 is working fine.
Two testprints with the current firmware
the same gcode as above with Bugfix 2.0 from june 3. printed after the parts above, so I know the drivers are still working
If you are sure that it isn't related to the mechanics or drivers, try looking at the feedrates, max accelerations and jerk configs; maybe there is something different there.
I checked feedrate, acceleration and jerk
I can print without problems using the same settings on the unchanged hardware if I upload the firmware from june 3.
The layer shifts also occur when printing very slowly
Do you have the hardware to check how long the step pulse is? I'll try to get time to look into this. Lots has changed in the planner/step generation recently, but I'm not sure what, if anything, has changed since the 3rd.
I'm sorry, I don't have the hardware for this.
And I don't know much about coding, but there are several commits that changed stepper timing since the 3rd.
I have time, just tell me what I could test (without special equipment)
Could it be that the current of the drivers is just too low or too near the limit? When playing with new versions of Bugfix and 8825s, I also once experienced "skipped" steps (but not on all axes) and I had to increase it a bit. To test that, I issued G1 on X, Y or Z to move the head along an axis and put a "light" hand in the way of the head to increase the resistance a bit. If it skipped steps, I increased the current of that driver a bit.
Perhaps nothing to do with your issue but who knows...
I checked the drivers Vref and it is 0.5V, exactly what it should be. This gives me 1A max current for 1,5A motors (67%)
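The Vref arithmetic quoted above can be checked with a small sketch. It assumes the usual DRV8825 relation (current limit = Vref / (5 × R_sense)) and a 0.1 ohm sense resistor, which is common on carrier boards but not guaranteed on clones. The function names are made up for this example.

```cpp
// Sketch: current limit of a DRV8825 carrier board from its Vref setting.
// Assumption: I_limit = Vref / (5 * R_sense) with R_sense = 0.1 ohm.
// Clone boards may use a different sense resistor, so check your board.
constexpr double drv8825_current_limit_a(double vref_v, double r_sense_ohm = 0.1) {
  return vref_v / (5.0 * r_sense_ohm);
}

// Fraction of the motor's rated current that the driver will allow.
constexpr double fraction_of_rated(double limit_a, double rated_a) {
  return limit_a / rated_a;
}
```

With Vref = 0.5 V this gives about 1 A, and 1 A on a 1.5 A motor is roughly 67% of the rated current, which matches the numbers reported here.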
The A4988 drivers I tested with, were also adjusted to 1A and the motors were running fine.
It was the case for me too. Nevertheless, I had to increase a bit my X and Y axes after the little test explained above.
Okay, I tried to stop the carriage with my finger with no effect. The motors have a lot of torque.
So the current is fine, but thank you Bergerac
I still suspect the stepper timing is off just a little bit.
The z axis is driven by 3 motors each is connected to its own 8825 driver on an extra board.
All 3 drivers get their Step, Dir and Enable Signals from shared Pins
After some up and down of the z axis, the bed is tilted a little bit, this never happened before.
So the 8825s are individually susceptible to the (supposed) signal error.
It might be that the actual step pulse duration is, let's say, 1.8 us, which is okay for driver 1, but driver 2 maybe catches only 99% of the pulses.
But this is just a guess.
I tried to understand the code in stepper.cpp but I had to give up
@kAdonis : Try to increase pulse width a bit.
How long are the cables between the main board and the stepper extra board ?... You are probably running into signal integrity issues.... Or ground issues...
I tried to increase pulse width up to 5us with no effect.
The cables to the extra board are 10cm long, but the extra board is only for the z-axis. The X/Y drivers are on the RAMPS as usual.
I will try to connect the z motors to only 1 driver directly on the RAMPS and test again.
Okay, 3 motors are probably too much for one 8825 driver and 12V, so I tested with dual z drivers directly on RAMPS
The issue is still there with pulse width 2us and max stepper rate 250000
I will test with higher pulse width
Humm...
How many pulses per mm are you using for each of your Z axis ?
I had problems when using 12v and fast step rates. 12v are just insufficient for fast moves...
ejtagle, noob question, If i use 24v will i be allowed to go faster before the motor stop? I'm trying to increase my 400x400 printer speed, the first step was the volcano hotend, but now i'm looking into faster motion.
24v allows more precise control of motor coil currents. But, you must be very careful, as most RAMPs do not tolerate such increase of voltage, and if they do, then the problem could be the arduino itself, as it also powers its 5v regulator from that voltage.
The drivers themselves have no problem going up to 30...40volts
In case of RAMPS/REARM, i dont know, but if the board tolerates being powered from 24v, then i would certainly recommend it.
On my z axis I have 1600 steps/mm but only 4mm/s max speed, 6400 steps/s, much lower than x and y.
But I have layer shifts even when printing with 30mm/s
With older Marlin versions I could print with 70mm/s, so 12V is not ideal but it works.
I had reliability problems when moving my Z axis at more than 1mm/s. I have 4000 pulses per mm.
And it is not software. It is hw related.
One of the problems we have now is that the old firmware placed a hidden limitation on the maximum pulse speed... We removed that limitation (because Arduino is able to keep up without problems at higher speeds). But that seems to be triggering hw problems...
I'll reduce the steps/mm on z axis and do more test tomorrow
but X and Y axes are more important
I have learned that deducing the maximum feedrate for an axis is very complex... It is not enough to test a plain move.
The first step is always to raise the driver current, but then several printing tests must be done to be sure.
If, as @kAdonis says, it is possible to go back to the June 3rd build and have no problems with the same configuration, then we must have changed something that is either reducing the maximum rate he can drive the axis, creating a pulse train his drivers do not like, or running the axis harder at the same specified rate. When it's reproducible like that, I can't see it just being the hardware, even if lowering speeds fixes it.
I think there was a change... After June 3rd stepper pulse period was fixed. Previously, the pulse period was 8x larger than the desired configured one. That IS a change - The bug only triggered when doing double/quad stepping...
For what it is worth, I have a similar arrangement with high current motors and 8825s. Some time ago I stumbled upon some flyback diode boards on AliExpress, They were cheap so I decided to try them.
I absolutely could not believe the change. At that point I realized much of my artifacts on the prints were a result of missing steps. No more missing steps and noise decrease significantly.
Everybody should use them.
@ejtagle Were you able to scope the pulse width when making the changes? Because if you didn't I can take a quick look.
update:
The printer prints fine with pulse width set to 5us and a max stepper rate of 140000 Hz (I don't know if this rate makes sense).
I tried 4us pulse width and had again layer shifts.
8825 Drivers on X/Y and z axis
z axis still powered by dual z drivers on RAMPS
I think I could identify the "worst" (of 5) drivers and dont use it at the moment.
Initially (see first post) 5us didnt work, but this test included the "worst" driver.
I didn't want to give up my extra board easily, because (I think) it's cleaner to use one driver per motor.
So I tried again with three A4988 drivers for the Z axis installed on the extra board, to be safe from step losses.
Still using 8825 drivers in 1/32 mode for X/Y.
I again used a pulse width setting of 5us and a max stepper rate of 140000 Hz.
It prints fine!
So there are no signal integrity or ground issues, at least not obvious.
I'm really curious how long the pulse actually is?
Update : There are still layer shifts
@teemuatlut : I didn't scope the pulse width yet. I have the hardware required to do it, but as printing was and is working pretty well (using 2us pulses on DRV8845), I didn't do it.
kAdonis: Try the following: Use a HEAVY copper wire (2.5mm^2 or even more) to join the ground of your external board with the ground of RAMPS/ReARM.
Also, place 2 capacitors in parallel between GND and +V of your external board, AT the external board connector. One of the capacitors should be 1000uF/25v (or more), the other one should be 100nF ceramic disc
@ejtagle Thank you for the advice with the capacitors
But I need 5us pulse width even with the external board disconnected
Do you mean 8845 drivers or 8825?
Are yours from Pololu? My 8825 are cheap chinese ones
8825 drivers, Chinese clones.
those Smoother Kit Addon Module for 3D Printer Motor Drivers are about 3 for the cost of 1 driver.
I can't recommend them enough. Of course it will take 2 weeks to get them..........
I also have the cheap Chinese 8825 drivers. Though, since they are all made in China, the expense is everyone in between.
I've confirmed the step pulse duration to be reasonable under test conditions (3 axis move),
LPC1768 MINIMUM_STEPPER_PULSE setting
| set duration (us) | X (ns)| Y (ns)| Z (ns)|
| ------------- | ------------- | --- | --- |
| 0 | 689 | 563 | 502 |
| 1 | 1064 | 937 | 812 |
| 2 | 1938 | 1811 | 1690 |
| 4 | 3940 | 3811 | 3674 |
| 6 | 5940 | 5687 | 5562 |
They're a bit low; we should try to overshoot if anything. But it shouldn't be an issue: the DRV8825 needs 2us, so 5us is definitely enough.
Yes, rounding is using a truncating operator. But it is pretty close to the calculations
By reasonable I mean it is getting longer for higher values so not the cause of this issue (more than likely) but it is still an issue, if a user sets the appropriate MINIMUM_STEPPER_PULSE for their drivers it will not be sufficient for all of them, E steppers will be getting even shorter pulses than Z more than likely.
Let me explain; The timing code uses part of the execution time as delay. The difference you see is caused by that execution time delay. The difference in timing between axis is constant, does not increase if increasing pulse width
yes but the Z pulse is 1690ns when MINIMUM_STEPPER_PULSE = 2, this will not drive a stepper with a minimum pulse of 2us reliably
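To make the concern concrete, here is a small sketch that compares the measured widths from the table with a driver minimum. The 2000 ns threshold is the "DRV8825 needs 2us" figure used in this thread. The array and function names are hypothetical.

```cpp
#include <cstddef>

// Measured pulse widths in ns for X, Y, Z, copied from the table above.
constexpr int kMeasuredAt2us[] = {1938, 1811, 1690};  // MINIMUM_STEPPER_PULSE = 2
constexpr int kMeasuredAt4us[] = {3940, 3811, 3674};  // MINIMUM_STEPPER_PULSE = 4

// Counts how many axes receive a pulse shorter than the driver's minimum.
constexpr std::size_t count_too_short(const int* widths_ns, std::size_t n, int min_ns) {
  return n == 0 ? 0
                : (widths_ns[0] < min_ns ? 1 : 0) +
                      count_too_short(widths_ns + 1, n - 1, min_ns);
}
```

Against a 2000 ns minimum, all three axes fall short at the 2us setting, while none do at the 4us setting.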
The alternative should be to round up. To be honest, the delay is placed at the proper place. I suspect the compiler could be moving code around the waiting point... (But at least on Due i did not see that)
have you measured the pulse durations on Due? is this a LPC176x specific problem?
No i didn´t measure it yet. But the ARM Cortex M3 core used is the same, so I expect minor differences
The idea originally was (stepper::stepper_pulse_phase_isr) that the START_PULSE macro takes more or less the same amount of time to execute as the STOP_PULSE macro. If that was the case, then the delays should compensate and the proper pulse width would be output.
By your own measurements, seems the STOP_PULSE macro is faster than the START_PULSE, so they do not compensate, and the Z, E steppers are getting a slightly less than expected pulse width.
The C compiler tends to cache values in registers because on ARM reading from memory is slower than fetching the address and then reading the value from SRAM (static variables are MUCH slower on ARM; accessing them through pointers is much faster. On AVR it is exactly the opposite scenario)...
That could explain the timing asymmetry
The fix could be to add that extra offset to the calculations, or enforce timing in 4 points instead of 2, to make sure low and high times are always enforced ...
To be honest, i was quite disappointed when i disassembled the ARM generated code for the stepper isr. 64 cycles for each stepper was too much. But that is exactly what it takes...
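Two of the ideas floated above (round up instead of truncating, and add a fixed offset to the calculation) could look roughly like this. This is only a sketch: the function names, the timer frequency parameter and the `extra_ticks` offset are assumptions for illustration and are not taken from Marlin.

```cpp
#include <cstdint>

// Truncating conversion: the resulting delay can be shorter than requested.
constexpr std::uint64_t ns_to_ticks_floor(std::uint64_t ns, std::uint64_t timer_hz) {
  return ns * timer_hz / 1000000000ULL;
}

// Rounds up to the next whole tick and adds an optional compensation offset,
// so the resulting delay is never shorter than the requested duration.
constexpr std::uint64_t ns_to_ticks_ceil(std::uint64_t ns, std::uint64_t timer_hz,
                                         std::uint64_t extra_ticks = 0) {
  return (ns * timer_hz + 999999999ULL) / 1000000000ULL + extra_ticks;
}
```

For example, with a hypothetical 1 MHz timer a 1900 ns request truncates to 1 tick (1000 ns) but rounds up to 2 ticks (2000 ns). Rounding alone would not remove the execution-time asymmetry described above; that is what the offset is meant for.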
Indeed you can see the asymmetry, and longer delay before Z, bearing in mind it's 62.5ns per division so skewed a bit
Well, as I have the probes set up, I decided to investigate the issue where spamming a button in OctoPrint causes skipping. This doesn't look right at all: why is the planner putting a cruise block at max feed rate directly after a finished deceleration?
The difference you see there is caused by the execution speed of the macros START_PULSE and STOP_PULSE.
Basically, they can be written as
delta_error += advance_dividend;
if (delta_error >= 0) {
  set step pin;
  count_position += count_direction;
}
And then, on stop:
if (delta_error >= 0) {
  delta_error -= advance_divisor;
  clear step pin;
}
When the ifs do not execute, then the timing of the other pulses change.
The only way to "fix" the timing would be to add an "else" clause and compensate when the condition is not true, by adding a delay...
Regarding the full accelerated block after an slowdown, i absolutely agree that it should not happen.
But only Octoprint is able to do that: Sending Gcodes directly does not produce the problem
I do suspect that a precise timing is required to cause this problem... I have dumped the blocks being queued and never saw it.... But it obviously exists...
Some possibilities i think of:
The first block becomes busy when the planner tries to join it to the next one. So the planner can´t update the executing block speed profile, but it does update the following ones
That last possibility could be the reason. Imagine the planner doing the whole plan, trying to merge a 1st block with a 2nd block. Once it is planned, it tries to update the first block, but it is unable to do it because the block now is executing... There is something to prevent that: That block is marked as RECALCULATE, and the stepper should not take it...
Maybe there is a race condition and the bit is being cleared before the block is used.. or the order of calculations... let me see...
@kAdonis can you try commits from before and after the 10th, just in general try to narrow this down a bit.
@p3p okay, I'll try
I was able to finish a 2 hour print with pulse width set to 6us
There are at least two small gaps where the ISR could jump in:
@@ -842,14 +842,13 @@ void Planner::reverse_pass_kernel(block_t* const current, const block_t * const
const float new_entry_speed_sqr = TEST(current->flag, BLOCK_BIT_NOMINAL_LENGTH)
? max_entry_speed_sqr
: MIN(max_entry_speed_sqr, max_allowable_speed_sqr(-current->acceleration, next ? next->entry_speed_sqr : sq(MINIMUM_PLANNER_SPEED), current->millimeters));
if (current->entry_speed_sqr != new_entry_speed_sqr) {
- current->entry_speed_sqr = new_entry_speed_sqr;
-
// Need to recalculate the block speed
SBI(current->flag, BLOCK_BIT_RECALCULATE);
+ current->entry_speed_sqr = new_entry_speed_sqr;
}
}
}
}
@@ -905,18 +904,18 @@ void Planner::forward_pass_kernel(const block_t* const previous, block_t* const
const float new_entry_speed_sqr = max_allowable_speed_sqr(-previous->acceleration, previous->entry_speed_sqr, previous->millimeters);
// If true, current block is full-acceleration and we can move the planned pointer forward.
if (new_entry_speed_sqr < current->entry_speed_sqr) {
+ // We need to recompute the trapezoidal shape
+ SBI(current->flag, BLOCK_BIT_RECALCULATE);
+
// Always <= max_entry_speed_sqr. Backward pass sets this.
current->entry_speed_sqr = new_entry_speed_sqr; // Always <= max_entry_speed_sqr. Backward pass sets this.
// Set optimal plan pointer.
block_buffer_planned = block_index;
-
- // And mark we need to recompute the trapezoidal shape
- SBI(current->flag, BLOCK_BIT_RECALCULATE);
}
}
// Any block set at its maximum entry speed also creates an optimal plan up to this
// point in the buffer. When the plan is bracketed by either the beginning of the
@p3p After a lot of testing I found out, the Issue was introduced in commit 6f14bca
Testing was more difficult than I thought
I had to apply the workaround for the PlatformIO linker issue #11008
and there were compiler errors with TMC2130 library 2.4 so I needed to install version 2.3
@kAdonis Well that certainly narrowed it down
@ejtagle Indeed, before that commit the pulse durations were always more than the specified minimum: at the 2us pulse setting, X 2600ns, Y 2400ns, Z 2200ns. After the commit they are much lower: X 1600ns, Y 1400ns, Z 1200ns. They must have been improved in another commit after that, but we still need an update so they are always above the minimum on all axes.
@kAdonis I'm still unsure why you would need to go all the way up to 6us pulse duration though, even 3us should have fixed it if it is just this problem.
Yes, its a mystery
maybe my drivers are from a bad charge?
> maybe my drivers are from a bad charge?
Not sure what you mean, I can't see it being a hardware fault if 2us pulse duration worked for all your steppers before that commit, the pulses generated were less than 3us.
Ah yes, you're right
Have you measured the duration between pulses? Maybe the stepper rate is too high?
I'll test with 3us pulse duration and different rates
> Maybe the stepper rate is too high?
The LPC176x stepper ISR runs at ~ 8us (depends what its doing) so limiting the frequency to about 60kHz (50% + wiggle room) before double stepping would be needed, you are running at 32kHz (200mm/s x 160p/mm) so should be fine.
edit: ofc this is with 2us stepper pulses, any additional delay will be added
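The estimate above can be reproduced with a short sketch. It uses the figures from this comment (an ISR time of about 8us, a 50% budget, 200mm/s at 160 steps/mm). The function names are hypothetical.

```cpp
// Step frequency demanded by a move: feedrate (mm/s) times steps per mm.
constexpr double step_rate_hz(double feedrate_mm_s, double steps_per_mm) {
  return feedrate_mm_s * steps_per_mm;
}

// Highest single-stepping rate if the stepper ISR may use only `utilization`
// (0..1) of the CPU time and one ISR run takes `isr_time_us` microseconds.
constexpr double max_single_step_rate_hz(double isr_time_us, double utilization) {
  return utilization * 1e6 / isr_time_us;
}
```

With these numbers, 200mm/s at 160 steps/mm needs 32000 steps/s, while an 8us ISR at 50% utilization allows 62500 steps/s, so there is headroom as long as the pulse delay does not grow much. The Z axis figures given earlier (1600 steps/mm at 4mm/s) work out to 6400 steps/s.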
@AnHardt : Your proposed changes are more than reasonable... 👍 ... Let´s push them!
@p3p : I do agree that the minimum pulse width should be at least the specified one. I will try to do the required modifications to ensure that.
@kAdonis : If you need larger pulses than expected, this could be caused by several issues, including signal integrity problems and noise, or even the drive strength (the amount of current available from the ARM chip to toggle the line).
I tested the latest commit (with the fix above) and I could reduce pulse width to 5us, going lower results again in layer shifts.
I tested also with a different 12V power source with no change.
I consider to give up my DRV8825 drivers :sob: and get TMC2130 but they might not provide enough current for my heavy carriage
and just to clarify one more time, before the commit above it was 100% reliable at 2us? wonder what else has changed
@kAdonis : Try a 24v power supply, if your setup allows it.
And i suspect a signal integrity issue here. What i´d do is to add termination resistors to your wiring...
@ejtagle although there are always ways to improve the hardware, I'm worried when something worked, then didn't after a specific commit, then it's automatically our fault either way ^^
Not necessarily.. Signal ringing can be attenuated or increased by just changing the pulse width. As @kAdonis is using an external board, with "10cm" long cables, that could be the cause... And if that is the cause, i would not be surprised if for some specific pulse widths it seems to work, and for some others seems not to work...
@p3p yes, it was reliable before. And if I upload on old firmware it still is.
@ejtagle I had this issue also with the external board not connected, as I mentioned before.
I wonder if it is important how the prints look, they are all more or less skewed to the left and back(see pictures above)
as it is a CoreX/Y printer its not easy to see which motor is losing steps, but I think the motor connected to the x slot is losing more steps. I swapped drivers but the prints look similar. I have quite a big pile of ruined test prints.
I couldn't find any irregularity within a layer, so it might be that the layer shifts occur only, while z axis is moved
@kAdonis : The main problem with old firmware is that, depending on the exact date, the stepper pulses were either much longer than configured, or a little bit shorter.
I actually don't see a problem: if your setup works with 5us, just use that. It would limit the step rate to 200kHz, and I am pretty sure you are not using such high step rates.
That skewing is strange, i must admit.
You don´t need to do such prints to test if the drivers are losing or not steps. Just add several fast moves per layer. For example, printing 4 small objects in separate points of the bed.
At some point, i had a "similar" issue, and it was the Z stepper that was losing steps. Reducing the Z maximum feedrate cured the problem
> The main problem with old firmware is that, depending on the exact date, the stepper pulses were either much longer than configured, or a little bit shorter.

Indeed, but I checked the pulse duration just before the commit that breaks it, and it was between 2 and 3 us, as would be expected. I don't think using a 5us pulse is so bad (well.. it is a waste of cycles); I just don't like not understanding why something changes, especially when it has no reason to.
@p3p : The only way would be to bring a logic Analyzer to @kAdonis place... I know: Part of the challenge is to figure out the causes without having a reliable way to measure things.
At some point, i proved that the Bresenham algorithm does not lose pulses at all. @thinkyhead also proved it. I even wrote a test program for a PC with that algorithm to be absolutely sure no pulses were missing.
So, that leaves only timing issues and hardware issues, either electrical or mechanical ones...
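A PC-side check of the kind mentioned above might look like the following sketch. It is not the test program referred to in this thread. It reuses the variable names from the START_PULSE / STOP_PULSE pseudocode, and the initial error value of `-total_events` is an assumption of this example.

```cpp
#include <cstdint>

// Counts the steps one axis emits over a block of `total_events` ISR ticks,
// where the axis has to move `axis_steps` steps (0 <= axis_steps <= total_events).
constexpr std::int32_t bresenham_steps(std::int32_t total_events, std::int32_t axis_steps) {
  const std::int32_t advance_dividend = 2 * axis_steps;
  const std::int32_t advance_divisor = 2 * total_events;
  std::int32_t delta_error = -total_events;  // assumed starting value
  std::int32_t emitted = 0;
  for (std::int32_t tick = 0; tick < total_events; ++tick) {
    delta_error += advance_dividend;  // "start pulse" phase
    if (delta_error >= 0) {
      ++emitted;                      // the step pin would be set here
      delta_error -= advance_divisor; // "stop pulse" phase
    }
  }
  return emitted;
}
```

If the accumulator logic is sound, the returned count equals `axis_steps` for every valid input, which supports the argument that missing steps must come from timing or hardware and not from the arithmetic.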
(And yes, I also hate not knowing the exact reason. If you have followed my latest PRs and conversations, you'd notice that I am completely against "magic" solutions that do not tackle the roots of the problems that the associated PR is trying to fix... Marlin has very subtle dependencies between modules; it is not easy at all to figure out whether touching something will break another thing or not ... :S)
I was already thinking about buying a logic analyzer, but I have no experience and no idea whether the affordable devices will be of any use.
I am open for recommendations
I think any LA that is able to sample at 10MHz (maybe 20MHz) should be more than enough to measure signals here...
Indeed mine cost about £13 does everything I need and comes with more signal decoders than most (included in the pulseview software).
I ordered a cheap LA similar to the one @p3p owns and took a quick look at the sigrok homepage
I also hate not knowing the cause, even if I can use the printer with 5us pulse width.
Hello ,
I have a lot of problems with the lasts versions with a MKS SBASE
I spent two days trying to find the right settings
And just see this post .
I have a lot of losing steps on x and y like the picture at the top of this post
I did not notice any issues before the version that added this code:
`+/**
is there a solution or patch to come?
Do I have to change the configuration to a pulse width of 5us and a max stepper rate of 140000 Hz?
Best regards , Stephane
Hello Stephane
You could try a pulse width of 5 or 6us, maybe more, and 140000 Hz;
if your printer stops losing steps with this settings, then you have probably the same problem
There is no other solution at the moment, because the cause is unknown
please report your findings here
thank you very much, I'll try and I'll tell you if there are any improvements
thank you !
2 prints of 2-3 hours and the result is relatively correct. Much better than anything I've tried before
with pulse 5us and max stepper rate at 140000 kHz
@ejtagle @p3p I found something!
I used to connect my Re-Arm to a RasPi3 running Octoprint over a serial connection, just three cables RX, TX and GND. This connection was rock stable, no resend errors or lost/garbled characters. And the USB port was free, so I could upload new firmware easily
I was thinking about what makes my printer so unique, and tried again to print with Octoprint connected over USB.
It prints now fine! with 2us pulse width.
I checked it twice. This makes no sense to me, but you are the experts
@Stephane-80
Which host are you using? are you using USB or serial?
@kAdonis that is exactly the kind of information that makes the problem obvious (hopefully, assuming I'm correct) looks like the author of LPC176x HardwareSerial support did not set the interrupt priorities.
HAL/HAL_LPC1768/HardwareSerial.cpp
UART_IntConfig(UARTx, UART_INTCFG_RLS, ENABLE);
- if (UARTx == LPC_UART0) NVIC_EnableIRQ(UART0_IRQn);
- else if ((LPC_UART1_TypeDef *) UARTx == LPC_UART1) NVIC_EnableIRQ(UART1_IRQn);
- else if (UARTx == LPC_UART2) NVIC_EnableIRQ(UART2_IRQn);
- else if (UARTx == LPC_UART3) NVIC_EnableIRQ(UART3_IRQn);
+ if (UARTx == LPC_UART0) {
+ NVIC_SetPriority(UART0_IRQn, NVIC_EncodePriority(0, 3, 0));
+ NVIC_EnableIRQ(UART0_IRQn);
+ } else if ((LPC_UART1_TypeDef *) UARTx == LPC_UART1) {
+ NVIC_SetPriority(UART1_IRQn, NVIC_EncodePriority(0, 3, 0));
+ NVIC_EnableIRQ(UART1_IRQn);
+ } else if (UARTx == LPC_UART2) {
+ NVIC_SetPriority(UART2_IRQn, NVIC_EncodePriority(0, 3, 0));
+ NVIC_EnableIRQ(UART2_IRQn);
+ } else if (UARTx == LPC_UART3) {
+ NVIC_SetPriority(UART3_IRQn, NVIC_EncodePriority(0, 3, 0));
+ NVIC_EnableIRQ(UART3_IRQn);
+ }
RxQueueWritePos = RxQueueReadPos = 0;
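For readers unfamiliar with the priority call in the patch above: on Cortex-M a lower numeric priority means a more urgent interrupt, and interrupts that are never assigned a priority stay at the reset value of 0, the most urgent level. The sketch below restates how a CMSIS-style `NVIC_EncodePriority` packs its arguments. It is an illustrative re-implementation under a made-up name, and it assumes 5 implemented priority bits for the LPC1768.

```cpp
#include <cstdint>

// Illustrative re-statement of a CMSIS-style priority encoding.
// `prio_bits` stands in for __NVIC_PRIO_BITS (assumed to be 5 here).
constexpr std::uint32_t encode_priority(std::uint32_t group, std::uint32_t preempt,
                                        std::uint32_t sub, std::uint32_t prio_bits = 5) {
  const std::uint32_t g = group & 0x07u;
  // Number of bits available for the preempt priority and the sub-priority.
  const std::uint32_t preempt_bits = (7u - g > prio_bits) ? prio_bits : 7u - g;
  const std::uint32_t sub_bits = (g + prio_bits < 7u) ? 0u : g + prio_bits - 7u;
  return ((preempt & ((1u << preempt_bits) - 1u)) << sub_bits) |
         (sub & ((1u << sub_bits) - 1u));
}
```

Under these assumptions, the arguments `(0, 3, 0)` used in the patch encode to priority 3, which moves the UART interrupts off the default level 0.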
@p3p Thats it! your code is working.
I can print now as before without errors over serial connection
Thank you everyone, for this great support :thumbsup:
and especially @p3p and @ejtagle for their patience
For me, the problems occur over USB (Repetier-Server on a Raspberry Pi).
I have not tested printing from SD or anything else.
Does it work for you with a 2us pulse width over USB?
At 140000 Hz or 250000 Hz?
I have not tested with 2us and 140000 Hz.
My problems occur at 2 to 4 or 5 us and 25000 Hz.
What are the best settings?
Does the patch resolve only the serial problem, or USB too?
A pulse width of 2us and 250000 khz is working for me now
this setting is the fastest a DRV8825 is able to work with
as I understand the max pulse rate is not so important for 8825 drivers, but 25000 is really low
it looks like your Issue is caused by something different
@Stephane-80 Afraid not, the fix only applies to HardwareSerial, the USB CDC driver has the correct priorities, even lower than I set the Hardware Serial in the patch above.
Leave the frequency at default; you should never need to change it, as it's just the upper limit the driver can accept, and you will be getting nowhere near those frequencies anyway. If you only use USB serial, you do not have the same issue as kAdonis, so it may be best if you open an issue, supplying your configs and hardware details.
Ahhh, I was cheering too soon.
after a higher print it is visible that the part is slightly skewed to the right and front now
Well.. at least we found our second bug trying to solve your issue :wink:, did you let the usb print complete far enough to know if it is reliable?
Thank you , for your help I am not very comfortable with pulse and max stepper rate parameters
and I'm not sure what I'm saying about it
that's why it's difficult for me to open a bug report
All I can say without being afraid to say nonsense is that it's a MKS Sbase with embedded drivers on a CR-10S and repetier server over usb on raspberry.
I think the onboard drivers are equivalent to the DRV8825, but I'm not sure...
Before my marlin update everything was working properly ( the previous version was from the beginning of the month )
The parameters are those found in the sample configuration file of MKS Sbase
2us and 250000 khz
With this I have an important offset of printing as stated here.
At worst I will keep the parameters pulse 5us and max stepper rate at 140000 kHz which are not too bad.
This time I want to be sure before I report back
I checked printing over USB again with a longer print and there are still very subtle layer shifts leading to a slight distortion
It prints fine (really), with 3us pulse width
so it seems there is a third cause for my issue
Its strange...
Was this test run with any advanced motion control features enabled, or just standard Jerk settings?
All my tests were run with
Jerk X5 Y5
S curve acceleration enabled
Linear Advance enabled (K 0.08 set in gcode)
Adaptive step smoothing sometimes enabled (did not see a difference)
These settings worked before the Issue
Adaptive step smoothing only tested once, because it came with the last "good" commit
Hello
I have an AZSMZ_MINI and I faced the same shift problem with Pololu DRV8825 drivers.
With an older version of Marlin there were no problems; with a new version the shifts appeared.
I tried the suggestion of changing to a 5us pulse width and a max stepper rate of 150000, and it solved the issue.
I have not understood whether the problem is now solved in a newer version or whether I have to maintain the new settings.
Anyway thanks a lot for suggestions that solved the problem
There have been some pretty thorough patches made to the planner/stepper code within the last few days. Please test with the most recent bugfix branches to see if the situation has improved. This issue is intimately related to #10446, so feel free to post your results on both threads.
I already tested commit 42f9921 from June 28 (after #11098 has been merged). Unfortunately there is no change. I can print fine with 3µs pulse width but not with the default 2µs.
I also tested with `#define BLOCK_BUFFER_SIZE 32`, as @ejtagle suggested in #10446, but still no change.
I got my Logic Analyzer and measured the step pulse duration on my RAMPS/Re-Arm
tested on commit 42f9921
how is this possible?
this is the shortest pulse I found, but every pulse is below 2µs.
The updated pulse-width code is pretty new and might still need some adjustments. If 3µs works for you then stick with that. If you can do some additional tests with a variety of pulse widths and report the duration in each case, that could provide some useful information.
@thinkyhead : The strange thing is that the pulse width was measured by @p3p and its length was exactly the configured amount. With the same hw. I am asking myself if there are more than one Re-Arm board, perhaps with different clock speeds.
That could easily explain this problem...
I can check again but indeed all measured pulse durations are within spec on the MKS SBase, it runs the same code as the Re-ARM so should be the same, I will try to measure the signals from the Re-ARM in my printer later.
@ejtagle We set the frequencies appropriately in the system initialisation, (and the usually static macros check the current frequency) there shouldn't be any difference.
this is my Re-Arm:
nothing unusual to see, no "turbo" marking :wink:
this looks like the crystal for the main PLL, probably 12MHz?
I can confirm that on my Re-ARM the pulse durations are as expected, @kAdonis can I have your configs, I know you supplied them earlier just want to make sure we are on the same page.
sure, here they are:
Configs.zip
I bought my Re-Arm from ebay in germany, it was offered as new, but could it be that the Smoothieware bootloader has been altered and overclocking the mcu?
just an idea
If they were overclocking the CPU, it must be just the CPU and not the CPU peripherals, as otherwise, USB and Serial communications would also be out of spec.
The other possibility is that the compiler is optimizing in a different way ... perhaps moving the setting and clearing of STEPs signals...
All my tests were done with the latest gcc arm compiler...
The other "interesting" possibility is that NXP (=Freescale) has produced a new hardware revision of the silicon, and they have changed (again! ... Yes, this DID happen to me with the Kinetis family the last year) the PLL registers a tiny bit and that is forcing the overclocking... Of course, all of this is pure speculation...
thank you for the explanations, I hope my ideas are not too stupid.
I'm using platformIO for VSCode, everything standard, it updates automatically so I guess its using also the latest gcc arm compiler
I tested platformIO for Atom again last week, but the issue stayed the same
As I understand the serial number on the LPC1768 it is from week39 2016, so it's not really new
I bought my Re-Arm from ebay in germany, it was offered as new, but could it be that the Smoothieware bootloader has been altered and overclocking the mcu?
No, Marlin initialises everything. USB would not function if there was any deviation from the expected peripheral frequencies.
Just tested with the exact files you supplied, using the current bugfix-2.0.x: I see exactly 2us pulses on Y and Z, and ~2.3us on X.
@p3p, @kAdonis: Is there a way kAdonis could send the already compiled firmware so p3p can test it? That way, we can be sure the problem is not the environment or the compiler... ;)
Sorry, I have to retract my last statement. I was in a hurry when I did that test and the Re-ARM hadn't flashed. I do see the short pulses; I'll investigate.
@p3p: That is "good" news... 👍 Perhaps we can disassemble and compare the working and the non-working firmware to see if there is any difference...
It's directly correlated to #define COREXY. I will have to look into what that changes in the stepper ISR. Also, we are still cutting it fine on the 'good' pulses at 1960ns (40ns resolution on the scope). As this is the minimum duration from the datasheets, we should probably aim to overshoot by 200ns or so.
Sounds like a change in optimization ... strange, pretty strange...
I feel like we should just:
#if MINIMUM_STEPPER_PULSE
// Just wait for the requested pulse duration
- while (HAL_timer_get_count(PULSE_TIMER_NUM) < pulse_end) { /* nada */ }
+ DELAY_US(MINIMUM_STEPPER_PULSE);
#endif
We overshoot by about 400ns, which is probably good as the values are minimums not targets and it removes so much complication
The problem with that modification, as you could expect, is AVR. Adding an extra delay on 32-bit boards is not a problem. On AVR it is a huge amount of extra time, because setting and clearing the pulse already takes about 2.3 us (if my memory serves me well).
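To make the trade-off concrete, here is a minimal sketch of the arithmetic behind the two approaches. It is not Marlin code: the function names are hypothetical, and it assumes the timer-based wait measures its deadline from the moment the STEP pin was set. The overhead figures are the ones quoted in this thread (about 400ns overshoot on the 32-bit board, about 2.3 us for set and clear on AVR).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Waiting for a deadline measured from the moment the pin was set:
// the pin-toggling overhead is absorbed into the wait (hypothetical model).
constexpr std::uint32_t pulse_with_deadline_ns(std::uint32_t overhead_ns,
                                               std::uint32_t target_ns) {
  return std::max(overhead_ns, target_ns);
}

// Inserting a fixed delay after the pin was set: overhead and delay add up.
constexpr std::uint32_t pulse_with_fixed_delay_ns(std::uint32_t overhead_ns,
                                                  std::uint32_t delay_ns) {
  return overhead_ns + delay_ns;
}

// 32-bit board, roughly 400 ns of overhead: a 2 us delay gives about 2.4 us.
static_assert(pulse_with_fixed_delay_ns(400, 2000) == 2400, "32-bit case");
// AVR, roughly 2.3 us of overhead: the same delay gives about 4.3 us.
static_assert(pulse_with_fixed_delay_ns(2300, 2000) == 4300, "AVR case");
```

Under these assumptions the fixed delay costs little on a 32-bit board but nearly doubles the pulse on AVR, which is the objection raised above.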
I will try to enable COREXY and disassemble the ISR, to check if there is any change in the optimizations being done...
Well, I compiled 2 versions, one with COREXY enabled and the other without it. The stepper ISR loop that toggles the STEP pins is exactly the same, byte by byte and instruction by instruction...
Right now, the best speculation i have so far is that the delay changes are caused by the Flash accelerator the LPC17xx has. This is not to blame LPC: Most ARM processors have a Flash accelerator to improve performance, as reading from the FLASH directly is far slower than the speed at which instructions are consumed.
The LPC has a 128-bit wide FLASH, which allows it to read up to 8 16-bit THUMB instructions at the same time. The FLASH speed is 20 MHz, so it can read at nearly the same speed as instructions are consumed. But as the FLASH prefetcher is not speculative, a jump whose target row is not already in the flash cache usually requires a full fetch of a 128-bit row, adding 5 latency cycles. The flash cache is also shared between the code and data buses (ARM uses a Harvard architecture), so it can act as a bottleneck when separate data and code rows must be fetched to execute the current instruction (because it is a load-from-flash instruction) and to fetch the next one. I have already measured this: execution from FLASH usually does not reach the maximum possible execution speed due to this sharing.
The SAM3X also has a Flash accelerator, and it is also 128-bit capable. The FLASH speed is about 20 MHz. So, basically, it is the same.
I assume the different timings are just caused by different code alignments. Probably when COREXY is defined, the start of the loop ends up aligned to a cache line, and when COREXY is not defined, it is not aligned. That adds 5 latency cycles = 50 ns... Well... it does not make such a huge difference after all...
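The numbers in this explanation can be checked with a short calculation. This is only a sketch: the helper names are hypothetical, and the 100 MHz core clock is an assumption inferred from the "5 latency cycles = 50 ns" figure rather than something measured here.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 100 MHz core clock, which is what "5 cycles = 50 ns" implies.
constexpr double kAssumedCoreHz = 100e6;

// Convert a number of CPU cycles to nanoseconds at the given core clock.
constexpr double cycles_to_ns(std::uint32_t cycles, double core_hz) {
  return cycles * 1e9 / core_hz;
}

// Number of 16-bit Thumb instructions that fit in one flash row.
constexpr std::uint32_t thumb_instructions_per_row(std::uint32_t row_bits) {
  return row_bits / 16;
}

// Share of a pulse (in percent) taken up by a given extra latency.
constexpr double percent_of_pulse(double extra_ns, double pulse_ns) {
  return 100.0 * extra_ns / pulse_ns;
}
```

With these assumptions, one missed flash row costs 50 ns, or about 2.5% of a 2000 ns pulse, which fits the remark that alignment alone does not make a huge difference.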
As discussed in #11742 this happens to us with a mega and a due alike. And very consistent.
@maukcc — You're seeing this with 2.0.x only, correct?
@ejtagle — I don't think there should be any significant difference between 1.1.x and 2.0.x, but apparently something isn't matching up. Before we merge a big change like #11578 I'll have to confirm that 1.1.x and 2.0.x have better parity with one another.
@thinkyhead: yes
On a Mega, the minimum pulse width is about 3uS ... So, unless the compiler optimizer is moving things, minimum pulse time can not be the cause...
@thinkyhead correcting my previous statement:
We are seeing this in all versions older than approximately February 2018
As LichtiMC also stated in #11742
After reading #12403, I do not think it is the same issue, as that is a real layer shift that happens.
This is more a consistent losing of steps on directional changes.
I had those issues once or twice also. I added flyback diodes to my XYZ steppers and have not seen those problems since. The diodes are supposed to eliminate lost and multiple steps issues like that and make the stepping MUCH quieter. I must tell you they do that very well, and for the price on AliExpress it is TOTALLY worth it. YMMV
Driving inductive loads will always produce flyback voltages. These can be disruptive enough to fry components or produce or eliminate pulses (steps). Some drivers with special s/w claim to reduce or eliminate the issue at some expense. Diodes are cheap. They work, so try them. Then let everyone know how well they work so the developers don't have to chase a problem that is easily resolved in hardware.
In fact, it may be totally impossible to eliminate in s/w and keep it that way while making other changes.
But being an engineer myself, I know how challenges attract me.
Sorry, my previous post should be:
We are seeing this in all versions NEWER than approximately February 2018
@kAdonis is this still a problem?
Afaik, this is still relevant
I cannot test myself, because I switched to different drivers (TMC5160), but by what I read (and write) in other bug reports, there is still a problem
Summary: 32-bit hardware + CoreXY, and especially DRV8825 --> stepper pulses too short
:-(
The most common issue is the WRONG SPEC of stepper being used with DRV8825 drivers. You can't use steppers with low resistance & low inductance phases, nor can the drive voltage be greatly higher than the nominal stepper operating voltage.
Example: a 3 V nominal stepper at 12 V or higher drive, will have problems micro-stepping at 1/16 and 1/32.
The missed-steps issue affects all drivers with a fixed-frequency chopper, but is really noticeable on the DRV8825 because it has a 3.75 us blanking period after the H-bridge switches. If the phase current rises very fast, i.e. 10% of max or more at 1/32 step in 3.75 us, because the phase reactance is too low or the drive voltage is too high, there will be missed steps around the zero-crossing.
At low stepping rates, the phase inductance doesn't retard current-change as effectively, and a large phase resistance helps to compensate.
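The blanking-time argument above can be illustrated with a first-order estimate, dI = V * t / L, which ignores winding resistance and back-EMF. The sketch below is not taken from any datasheet or firmware: the helper names and the two example motors (inductance, supply voltage, peak current) are hypothetical values chosen only to show the effect. The 3.75 us blanking time and the 10% threshold are the figures quoted above.

```cpp
#include <cassert>

// DRV8825 blanking time as quoted in this thread.
constexpr double kBlankingSeconds = 3.75e-6;

// First-order current rise during the blanking time: dI = V * t / L.
// Winding resistance and back-EMF are deliberately ignored.
constexpr double current_rise_amps(double supply_volts,
                                   double inductance_henry,
                                   double blanking_seconds) {
  return supply_volts * blanking_seconds / inductance_henry;
}

// True when the rise reaches the given fraction of the peak phase current.
constexpr bool rises_too_fast(double rise_amps, double peak_amps,
                              double fraction) {
  return rise_amps >= fraction * peak_amps;
}

// Hypothetical motor A: 12 V supply, 3 mH phase, 1.5 A peak -> 15 mA rise.
constexpr double kRiseA = current_rise_amps(12.0, 3e-3, kBlankingSeconds);
// Hypothetical motor B: 24 V supply, 0.5 mH phase, 1.5 A peak -> 180 mA rise.
constexpr double kRiseB = current_rise_amps(24.0, 0.5e-3, kBlankingSeconds);
```

In this simplified model motor A stays around 1% of peak current during blanking, while motor B exceeds the 10% mark, which is the low-inductance, high-voltage combination described above.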
The diode idea with low-reactance steppers originated with this blog. At 24 V drive, you may need to double the number of diodes (double the voltage drop). The term 'TL Smoother' has become generic because of so many clones, though it refers to a specific product.
Lastly, all DRV8825 driver boards based upon the Pololu design (pretty much all) have current-sense resistors that violate the recommended minimum VREF (1 V) in the datasheet: "Operational at VREF between 0 to 1 V, but accuracy is degraded." The smaller the micro-steps, the more the accuracy matters.
i have just started testing with re-arm and DRV8825, so far no issues, but i will do some more serious testing this weekend
but my 0.02$ bet is that this is not an issue
Hi there. We also have a really strange Y layer shift on our U20 and U30 running Marlin 2.0.x pulled on 14/04/2019. Apparently, increasing Vref on our A4988 greatly limits the issue, but it still happens.
U30 with lighter beds are less prone to the issue than U20 with heavier beds. As of today, I am not sure if the issue in this thread is the same as the one we face, but looks a bit similar. Too short pulses?
@hobiseven , there is a discussion about Layer shift here #12403
Sorry to revive this old thread, but I'm totally lost now. I designed a printer using 2 steppers for the Y axis (using Y and E1) with two endstops, 1 stepper for X and one for Z. I'm losing steps randomly (sometimes it prints perfectly, sometimes I get lost steps). I already tried changing drivers, increasing current, and using 30 V on the steppers (which I'm currently using). My mechanics slide very well, with a printing speed of 80 mm/s and an acceleration of 1000. I really can't find a reason for the lost steps (I'm using A4988).
Does anyone have any suggestions? (The losses happen on the X axis.)
Do your steppers match 8825s at given voltage? Did you try to increase MINIMUM_STEPPER_PULSE? Did you try slow/fast/mixed decay?
Thanks for the quick reply!, i gave up on 8825 and put 4988, but i can revert it. i'll try your stepper pulse suggestion right now.
I was pointed to this issue because I have a similar (same?) problem.
Some of my prints were showing repeatable (and even deterministic) layer shifting. I'm using DRV8825 drivers with a RADDS 1.5 board @ 24V. I have lowered speed, acceleration and jerk settings, but that did not solve it.
I tried Repetier-Firmware 1, and the problem is solved (even at higher speed, acceleration and jerk settings). Printing the exact same GCode yields a better result. Left is Marlin 2.0 (few days ago), right is Repetier-Firmware. The left print is shifted.
I will see if changing MINIMUM_STEPPER_PULSE in Marlin 2.0 yields a better result.
@basilfx did you have luck changing MINIMUM_STEPPER_PULSE ?
@kAdonis do you still get layer shifts?
As I wrote in february, I cannot test myself, because I switched to different drivers (TMC5160), and I am using now SQUARE_WAVE_STEPPING
Sorry Boelle, I didn't follow Marlin development since then, so I dont know if this thread is still relevant.
I think, a list with known Bugs would be helpful, but maintaining this list wouldn't be fun.
My suggestion for labels: K: Core / H-Bot and Bug confirmed - P3P wrote Jul. 3. 2018
... I do see the short pulses, I'l investigate (...) It's directly correlated to #define COREXY ,...
I have stated many times here that I have had similar problems with my CoreXY system - however, I don't believe it is related to CoreXY.
What I did was to add flyback diodes to the driver outputs (available on Aliexpress very cheap) and have not seen those problems since. I have not seen any feedback that other people have tried this, only that they seem to prefer more time consuming and expensive alternatives that do not seem to produce any results.
I am still waiting to hear from someone who has the same issue and tries the flyback diodes.
They are cheap and plug&play. Mounting them permanently might take some ingenuity.
I only used 3 (available as one item). None on the extruder. Come to think of it, I am having some issue with blobs that may be related. I think I will make another order and try that as a solution.
Blobs? I started to have this a week ago (or thereabouts). All I did was update the firmware without even looking at the config. I have narrowed it down: at each layer change, in the last 1-2 mm of movement, the extruder speeds up and hence creates a blob. It's easy to notice because you can hear the extruder move faster, and I eyeballed it closely to make sure it happened at the layer change.
I never figured out why it started to happen, but there are no layer shifts, and funnily enough I use TMC2100s everywhere EXCEPT on the extruder, where I use a DRV8825.
No wires were moved, no hardware was touched, just a firmware update.
i have these 2 features enabled
lin adv is enabled but set to 0 in firmware so i can set it in slicer
Now the printer is taken apart, as I want to replace the wires with flat ribbon cables (easier to replace if a wire breaks inside). The bed gets a 50-wire ribbon, overkill for my purpose, but better too thick than too thin. The wires to the extruder and X motor get a 25-wire ribbon.
So it will be some time before I can experiment further to see if it's the combination of features that suddenly made the blobs appear.
thanks. it is about time for me to do another update. I had blobs before, but it was a function of accel and jerk settings and they were pretty consistent as far as location. The blobs I am having now are random and not at discontinuities but rather in mid long runs.
Is there any more detailed descriptions of all the new setting than just in the code? I am finding it hard to keep up with all the new stuff.
Me too... Even though the printer is down, I updated today and they have changed things around probing, but I got help on Discord on how to override it with G-code so I can still make it work with the MK42 bed.
As for the drivers, I forgot to mention that mine are all actively cooled by an 80 mm fan. They also were before, when the blobs were not there.
But one thing I can test when I get the printer running (in 1-2 weeks) is simply to swap the drivers around so that, say, X is driven by the DRV8825 and the extruder by the TMC2100, just to see if that changes anything. If not, I will change it back and have a look at the features.
I was asked on Discord about almost everything, and I can say that my connectors are solid. I invested in the correct crimp tool and gold-plated Dupont connectors. My time is too short to have problems just to save pennies on tools and hardware; better to pay a little more now and have a relatively trouble-free life :-D
Yeah, I just ordered another diode array. I didn't see the three up that I ordered b4, but at $2.78 it is cheap enough.
Mine are sort of active cooled. I initially fried a few drivers, then I put a 40mm fan just blowing over the top of them and had no issue since. I rearranged things and now the fan in my PSU exhausts over the top of the drivers and seems to do the job OK. It doesn't take very much, apparently.
when i get to it i will get these: https://shop.watterott.com/SilentStepStick-Protector-for-Stepper-Motor-Drivers
they call them protectors, but its just diodes
Yes, I just saw those on Aliexpress. $0.79 US each
Same function, cleaner installation, free shipping and assembled. They all come from China. May as well buy direct and save.
takes longer to get
and they dont give a rats arse about customer service and buyer protection
and some wise old man once told me you get what you pay for
Those I linked are assembled and shipped from Germany, and I live in Denmark, so it's a no-brainer since I live just above them :-D
Well, I have to solder on the headers, but I'd rather do that since the risk of bent pins in shipping is high.
Personal experience: I have purchased hundreds of items from Aliexpress and their customer service is as good as or better than Amazon's. If you need the parts next week, buy local. If you plan ahead and can wait 3-4 weeks you will save a lot. 6.00 € + shipping vs 0.6775 € and free shipping is a somewhat significant premium to pay. Most of my printer parts come direct. However, not everything is necessarily cheaper. For what these cost you should try it out. Ain't much to lose if it goes wrong. If you have a problem just open a dispute and Ali resolves it within 3 days. They really are OK. Just treat them with respect and they will return it.
Another note on update - I downloaded the latest commit, but I don't think the BS has been resolved - at least not the way I did it. I have not loaded it yet so I can't verify. I am hesitant to load as I am seeing a lot of BUGs listed.
hi,
I have the same problem on my Mini Kossel with an MKS SBASE (DRV8825). I posted in issue #12403. No problem on Smoothieware with the same G-code file.
picture in the post:
https://github.com/MarlinFirmware/Marlin/issues/12403#issuecomment-536219590
I just received an extension board and TMC2130s. I will try changing the drivers and see if it works.
Hi,
I mounted my new TMC2130s with the external board.
Now the walls are straight, no more layer shifts. So my issue is Marlin's DRV8825 management.
The same problem with SBASE
Default settings: 250000, 2us
+1
I'm curious, has anyone with this problem tried installing the flyback diodes???
@ruggb i did, it did not help. both with 4 and 8 diodes versions.
I'm curious, has anyone with this problem tried installing the flyback diodes???
HI,
I tried flyback diodes but got the same result.
I tweaked my MKS board to enable fast decay on the DRV8825, but no change.
With Smoothieware there is no problem, with or without diodes and fast decay.
interesting since it worked on my ramps board with 8825 drivers. What could the difference be? I do have some powerful steppers, about 70 (whatever that is, I only remember the number at this point).
@kAdonis many updates have passed, is this still an issue on latest 2.0.x?
Yes, it is still an issue, it would seem the only way to eliminate it is to use the settings above from kadonis....
"The printer prints fine with pulse width set to 5us and a max stepper rate of 140000 kHz( dont know if this rate makes sense)"
... I implemented those settings in the advanced config and the issue instantly disappeared. I am using MKS SBASE 1.2.
Thanks!
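The workaround quoted above corresponds to setting MINIMUM_STEPPER_PULSE to 5 and MAXIMUM_STEPPER_RATE to 140000 in the advanced configuration. As a rough sanity check on how a pulse width relates to a step rate, here is a minimal sketch. It assumes that the STEP line must stay low for at least as long as it is held high, so one step takes two pulse widths. The function name is hypothetical, and this is not how Marlin itself derives its limits.

```cpp
#include <cassert>
#include <cstdint>

// Highest step rate (steps per second) at which both the high and the low
// phase of the STEP signal can each last at least pulse_us microseconds.
// Returns 0 for a zero pulse width, where the limit is undefined.
constexpr std::uint32_t max_step_rate_hz(std::uint32_t pulse_us) {
  return pulse_us == 0 ? 0u : 1000000u / (2u * pulse_us);
}

// 2 us high + 2 us low matches the "250000, 2us" defaults mentioned earlier.
static_assert(max_step_rate_hz(2) == 250000, "2 us pulse");
// 5 us high + 5 us low allows at most 100000 steps per second.
static_assert(max_step_rate_hz(5) == 100000, "5 us pulse");
```

Under that assumption a 5 us pulse already caps the rate below 140000 steps per second, which may be why the original poster was unsure whether that rate makes sense.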
"The printer prints fine with pulse width set to 5us and a max stepper rate of 140000 kHz( dont know if this rate makes sense)"
I implemented those settings in the advanced config and the issue instantly disappeared
the above seems to fix the issue, hence its more a config issue and not firmware issue
hence its more a config issue and not firmware issue
Well .. it is a firmware issue in that the default config value doesn't work, but 32bit timing issues in the stepper ISR are an ongoing investigation so setting higher pulse duration is currently the best solution even though 2us(+abit) should be the correct setting for DRV8825.
The timing discrepancy should only be in the ns range though so why people seem to need to go as high as 5us I don't know, unless the drivers don't actually work close to the minimum pulse duration the datasheet says.
I think this should be reopened
The bug is still there, there's only a workaround found
we need to figure out if its a bug as such or if default config is wrong
and there is also the problem if the drivers work with the pulse the datasheet says, how can we make a default if datasheets are not true?
The problem here is that, under some circumstances, the actual step pulse duration is shorter than the configured step pulse time. p3p and I measured step pulse times as low as 1690ns with a configured step pulse time of 2000ns; this is too low for the DRV8825 drivers to work reliably.
So you have to manually override the (theoretically correct) default setting, with longer Step pulse settings to get correct actual step pulse times
The datasheets are fine but marlin code is not
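A small sketch of the comparison being made here, using the pulse widths reported in this thread (1960ns for a "good" pulse, 1690ns for a short one, 2000ns configured). The 1900ns minimum STEP high time is the value as I read it in the DRV8825 datasheet; treat it as an assumption and check your own copy. The names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: minimum STEP high time from the DRV8825 datasheet (1.9 us).
constexpr std::uint32_t kDrv8825MinHighNs = 1900;

// True when a measured pulse is at least as long as the required minimum.
constexpr bool pulse_meets_spec(std::uint32_t measured_ns,
                                std::uint32_t min_ns) {
  return measured_ns >= min_ns;
}

// How far a measured pulse falls short of the configured one, in percent.
constexpr double shortfall_percent(std::uint32_t measured_ns,
                                   std::uint32_t configured_ns) {
  return measured_ns >= configured_ns
             ? 0.0
             : 100.0 * (configured_ns - measured_ns) / configured_ns;
}
```

With these numbers the 1960ns pulse passes with only 60ns of margin, while the 1690ns pulse is 15.5% shorter than configured and below the minimum.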
I agree, if we can't trust the default config to actually output 2000ns, then there is a problem
@shitcreek or @thinkyhead should look at this
Having the feeling that issue #11205 is related, I retested with a step pulse of 2us to 5us
When S_CURVE_ACCELERATION, JUNCTION_DEVIATION (or the new setup for this option), ADAPTIVE_STEP_SMOOTHING and LIN_ADVANCE are activated (or at least more than 2 of those functions), the results are wrong with a stepper pulse of 2us and correct with a stepper pulse of 3us or above.
In my case, the correct value is between 2us and 3us. As we cannot input floating point values for stepper pulse, 3us is the adequate value on my MKS SBASE. No need to go to 4us or above. This goes in the direction of what @p3p said above ("so setting higher pulse duration is currently the best solution even though 2us(+abit)") and the fact that the issue is the way marlin is handling this.
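Since the setting only takes whole microseconds, the smallest usable value is the required duration rounded up. A minimal sketch follows; the function name and the 2300ns example are hypothetical, chosen only because the needed value is said to lie between 2us and 3us.

```cpp
#include <cassert>
#include <cstdint>

// Smallest whole-microsecond setting that covers a required pulse duration
// given in nanoseconds (integer ceiling division).
constexpr std::uint32_t smallest_pulse_setting_us(std::uint32_t required_ns) {
  return (required_ns + 999u) / 1000u;
}

// Anything strictly between 2 us and 3 us needs a setting of 3.
static_assert(smallest_pulse_setting_us(2300) == 3, "hypothetical 2.3 us need");
static_assert(smallest_pulse_setting_us(2000) == 2, "exactly 2 us");
```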
@kAdonis, I just posted a link on issue #11205 to a branch with an experimental change to resolve the short pulses. It isn't a fully complete solution. More work will be needed to make it acceptable across all devices, but it could be worth trying. If it does resolve the issue that could inform possible fixes for this.
@p3p and I were discussing this issue earlier. The pulse lengths are actually much worse than have already been mentioned in this issue. I saw pulses as low as 856ns when a 2us pulse was intended.
Hello. I apologize for my English. I seem to have encountered the same bug on an MKS sBase v1.3. I print through OctoPrint, connected directly to the AUX1 connector. The OctoPrint print server is installed on an OrangePi PC+. On the advice of sjasonsmith (https://github.com/MarlinFirmware/Marlin/issues/16218#issuecomment-565736271) I configured MINIMUM_STEPPER_PULSE 4. The print has more or less normalized. Does this mean that I have hit the same bug in the firmware?
If this is indeed the case, do you need any help to fix this bug? Since I program only at the initial level, I can only help in testing fixes on hardware. I just need instructions on what to do.
I have a BigTreeTech SKR v1.3 Board with drv8825 stepper motor drivers. There is also a working MKS TFT and MKS sBase v1.3 installed on the printer. There's also a Logic Analyzer that works with 3.3 volt levels. I do not have a debugger for LPC1768, there is only STlink v2.
My configuration for Marlin 2.0.x
MarlinConfig.zip
@dark184, at this point I don't need any help with debugging, but as I make edits to PR #16128 help verifying the changes would be welcome. I expect to spend more time improving that PR in the next few days.
PR #16128 has been merged into bugfix-2.0.x. This definitely fixed timing issues impacting 8825 drivers on 32-bit controllers, but probably fixes sporadic issues elsewhere as well.
It would be great for people to test the latest bugfix-2.0.x to see if issues remain. I recommend you remove any custom MINIMUM_STEPPER_PULSE or MAXIMUM_STEPPER_RATE to be sure you are testing the fix, and not working around issues in other ways.
@sjasonsmith Well done, Jason!
Based on the latest bugfix (e3b02757d3f7c854ccf2f91bbc0f9de585cf8478) and removing all workarounds, I made 2 test prints with all the options that created issues in #11205. No problem. Test modules are just as expected.
Thanks a lot
I dont have the hardware to test myself, but the issue seems solved
Thanks to all contributors, I will close this now
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.