Marlin: [bugfix-2.0.x] LIN_ADVANCE 1.5 no extruder movement on LPC1768 build

Created on 27 Feb 2018 · 25Comments · Source: MarlinFirmware/Marlin

Fresh clone of the repo just now (last commit 35ec678) to setup a clean test and confirm what I ran into this weekend. LIN_ADVANCE works as I'd expect on my Atmega based boards, but not the LPC

Board: Azteeg X5 GT (so it's a 1769 processor), simple non-spi drivers
Configs attached, all I changed was the board name, some thermistor settings, steps/mm, and enabled LIN_ADVANCE for this test.

All other motors move as expected during a print, or simple axis movement commands, except the extruder (extruder was up to temp, no cold extrusion warnings).

The extruder driver enables/energizes the extruder motor as soon as a movement is issued to the extruder, but no steps are being sent.

As soon as I comment out LIN_ADVANCE the extruder works exactly as expected again.

I'm happy to gather more data, just let me know what you'd like to see.
Marlin_bugfix-2.0.x_2-26-18_LIN_ADVANCE issue on LPC.zip

Confirmed ! 32-Bit & HAL

Source

newt0ns

All 25 comments

Thanks very much, @newt80ns — This is the kind of stuff we definitely need to know about. Since the common factor is LIN_ADVANCE, and since the 2.0.x version actually works on AVR but not LPC that gives us a good starting-point for our search, even though it has some confusing elements.

The first thing I will do is to look over the LIN_ADVANCE code in 2.0.x with an eye towards what could possibly differ between the AVR and LPC that would affect it. After staring at the code for a while I will try to come back with some things you can try, or with any luck a patch that fixes it.

Talk to you soon…

thinkyhead on 27 Feb 2018

I'll test this over the weekend on the Re-Arm to see if I can replicate.

psavva on 27 Feb 2018

Does this also happen with K=0 (advance disabled, but dedicated extruder ISR used) or only if K has a value?

Sebastianv650 on 27 Feb 2018

It did during my testing over the weekend (K=0 still resulted in no extruder movement), I'll confirm on this clean build tonight though.

newt0ns on 27 Feb 2018

With K=0 also affected we can rule out all the planner.cpp code, the problem has to be inside the ISR itself. All the code is doing there is to say "execute the extruder ISR just after the main stepper ISR if we need an e step". Quite strange this isn't working.

Sebastianv650 on 27 Feb 2018

@Sebastianv650 — Could the bug have something to do with advance_isr_scheduler calling HAL_timer_set_compare twice for all CPU_32_BIT?

// Is the next advance ISR scheduled before the next main ISR?
if (nextAdvanceISR <= nextMainISR) {
  HAL_timer_set_compare(STEP_TIMER_NUM, nextAdvanceISR);
  . . .
}
else {
  HAL_timer_set_compare(STEP_TIMER_NUM, nextMainISR);
  . . .
}

#ifdef CPU_32_BIT
  // Make sure stepper interrupt does not monopolise CPU by adjusting count to give about 8 us room
  uint32_t stepper_timer_count = HAL_timer_get_compare(STEP_TIMER_NUM),
           stepper_timer_current_count = HAL_timer_get_count(STEP_TIMER_NUM) + 8 * HAL_TICKS_PER_US;
  HAL_timer_set_compare(STEP_TIMER_NUM, max(stepper_timer_count, stepper_timer_current_count));
#else
  NOLESS(OCR1A, TCNT1 + 16);
#endif

thinkyhead on 1 Mar 2018

If it does the same thing as setting OCR1A on the ATMega, it should be just fine. For testing, it should be OK to just remove the "check for monopolization" lines.
@newt0ns can you test with this code commented out in stepper.cpp?

#ifdef CPU_32_BIT
  // Make sure stepper interrupt does not monopolise CPU by adjusting count to give about 8 us room
  uint32_t stepper_timer_count = HAL_timer_get_compare(STEP_TIMER_NUM),
           stepper_timer_current_count = HAL_timer_get_count(STEP_TIMER_NUM) + 8 * HAL_TICKS_PER_US;
  HAL_timer_set_compare(STEP_TIMER_NUM, max(stepper_timer_count, stepper_timer_current_count));
#else
  NOLESS(OCR1A, TCNT1 + 16);
#endif

Sebastianv650 on 1 Mar 2018

I was going to test something similar last night but ran out of time, with this:

// Is the next advance ISR scheduled before the next main ISR?
    if (nextAdvanceISR <= nextMainISR) {
      // Set up the next interrupt
      // Don't run the ISR faster than possible
      #ifdef CPU_32_BIT
        // Make sure stepper interrupt does not monopolise CPU by adjusting count to give about 8 us room
        uint32_t stepper_timer_count = HAL_timer_get_compare(STEP_TIMER_NUM),
                 stepper_timer_current_count = HAL_timer_get_count(STEP_TIMER_NUM) + 8 * HAL_TICKS_PER_US;
        HAL_timer_set_compare(STEP_TIMER_NUM, max(stepper_timer_count, nextAdvanceISR));
      #else
        HAL_timer_set_compare(STEP_TIMER_NUM, nextAdvanceISR);
        NOLESS(OCR1A, TCNT1 + 16);
      #endif
      // New interval for the next main ISR
      if (nextMainISR) nextMainISR -= nextAdvanceISR;
      // Will call Stepper::advance_isr on the next interrupt
      nextAdvanceISR = 0;
    }
    else {
      // The next main ISR comes first  
      // Don't run the ISR faster than possible
      #ifdef CPU_32_BIT
        // Make sure stepper interrupt does not monopolise CPU by adjusting count to give about 8 us room
        uint32_t stepper_timer_count = HAL_timer_get_compare(STEP_TIMER_NUM),
               stepper_timer_current_count = HAL_timer_get_count(STEP_TIMER_NUM) + 8 * HAL_TICKS_PER_US;
        HAL_timer_set_compare(STEP_TIMER_NUM, max(stepper_timer_count, nextMainISR));
      #else
        HAL_timer_set_compare(STEP_TIMER_NUM, nextMainISR);
        NOLESS(OCR1A, TCNT1 + 16);
      #endif
      // New interval for the next advance ISR, if any
      if (nextAdvanceISR && nextAdvanceISR != ADV_NEVER)
        nextAdvanceISR -= nextMainISR;
      // Will call Stepper::isr on the next interrupt
      nextMainISR = 0;
    }

    // Restore original ISR settings
    HAL_ENABLE_ISRs();

I'll give both tests a shot tonight. I'm not a fan of my implementation (duplicate code), but it's less clock cycles in the end, so there's that...

newt0ns on 1 Mar 2018

Attempted both workarounds last night, unfortunately I didn't have time to debug much more with it. My attempt resulted in an extremely slow step generation, which I later found was likely due to me grabbing stepper_timer_count(not needed in my example, bad copy/paste) and not stepper_timer_current_count. So it was adding delays on delays. Will try tonight with a more proper test, and more time hopefully.

Commenting out the rate limiting block did not have any affect on the extruder step generation.

newt0ns on 2 Mar 2018

Ok, found the issue. While adding a bunch of serial dump debug data trying to trace from planner through to the stepper I found that a serial_echo call in the START_E_PULSE function presto, extruder movement. The added delay due to the slow serial call got the driver going, aha!

So, it would seem the speed of the LP1769 was likely _just_ working without any issues in the normal ISR() as there are several more pins it's switching during each loop, but the advanced isr is only dealing with the single extruder and was simply flipping the step pin too quickly for the driver.

As soon as I added #define MINIMUM_STEPPER_PULSE 1 to the configure_adv I now have a moving extruder in addition to my other axes with LIN_ADVANCE enabled. No other changes were needed.

newt0ns on 3 Mar 2018

Nice! We should realy check if the 32bit boards are fast enough to generate a 50% duty cycle by using an ISR to enable and another routine to disable the signal again.
One day I will (far away) have one and then I will have a look into that if it's not done until then..

Sebastianv650 on 3 Mar 2018

We should really check if the 32bit boards are fast enough to generate a 50% duty cycle…

…at what frequency?

thinkyhead on 4 Mar 2018

@newt0ns — If this is solved, you can close the issue anytime.

thinkyhead on 4 Mar 2018

…at what frequency?

I know this only works when double/quad stepping is no longer needed. I would enable all signals within the ISR, then set the timer to "duration between two loops / 2" (_NEXT_ISR(ocr_val / 2);)and disable all signals then. This could be a very lightweight ISR, only calling the needed PULSE_STOP(); code, or we could separate some of the timer calculations between the Start_pulse ISR and the stop pulse ISR to create a more equal load balancing.

Sebastianv650 on 4 Mar 2018

Ah, I see what you mean. Yes, I've considered the idea before, to run two concurrent ISRs at the same rate, but one would be offset by one pulse-duration later than the first. Its job would be to set all STEP pins LOW, regardless of whether they were set HIGH. (Since LOW->LOW does nothing.)

The same may be accomplished by firing a one-off interrupt from the main ISR, just at the point when STEP is being set HIGH. Would the overhead of the ISR hurt much? Hard to say.

thinkyhead on 4 Mar 2018

Another thing to note is that with the faster MCUs we may not need double or quad stepping. The ISR can just run more often. In spite of ISR overhead, it may still be more performant, as it could give up more time to the main thread.

thinkyhead on 4 Mar 2018

👍1

I guess the overhead of any ISR would be lower than the loss due to sit and wait a minmum time just because the stepper drivers need a minimum signal length :)

Sebastianv650 on 4 Mar 2018

Yea, other work can certainly be done in that time, particularly when you start adding an additional wait loop for just the extruder(s).

newt0ns on 4 Mar 2018

[OT❓] Hmm, @Sebastianv650 what do you think?
In the lin-advance e_steps loop, after STOP_PULSE…

This?

// For minimum pulse time wait before looping
#if EXTRA_CYCLES_E > 20
  while (EXTRA_CYCLES_E > (uint32_t)(TCNT0 - pulse_start) * (INT0_PRESCALER)) { /* nada */ }
#elif EXTRA_CYCLES_E > 0
  DELAY_NOPS(EXTRA_CYCLES_E);
#endif

or this?

// For minimum pulse time wait before looping
if (e_steps)
  #if EXTRA_CYCLES_E > 20
    while (EXTRA_CYCLES_E > (uint32_t)(TCNT0 - pulse_start) * (INT0_PRESCALER)) { /* nada */ }
  #elif EXTRA_CYCLES_E > 0
    DELAY_NOPS(EXTRA_CYCLES_E);
  #endif
}

thinkyhead on 4 Mar 2018

All discussed here will give a 50% duty cycle only to the 'leading' axis.
The way to go could be:
Double all steps to get even numbers.
Don't set or reset step-pins, but toggle them.
Needs to be called twice as often than normal.

AnHardt on 4 Mar 2018

@AnHardt good note, that's true. But I think it's still a solution which should work on all stepper drivers and comes with a minimum of overhead.

@thinkyhead nice idea, it took me a minute to get the meaning of that extra if :blush: Yes, we don't need to wait if no more e_steps are available, the next ISR will be far enough away. The same should be done after stoping the pulses in the main ISR loop also. The if statement in this case could check for i.

Sebastianv650 on 4 Mar 2018

Needs to be called twice as often than normal.

It's an interesting idea, to de-couple the ISR routine that gets blocks and prepares the step loop from the actual step loop. It would eliminate the need to have the step loop inside of the stepper ISR (for platforms that can manage the steps rate). But at a certain point the overhead of the extra interrupts must surely trump the reduction in logic.

We might accomplish a similar effect by placing the steps loop at the top of Stepper::isr instead of farther down.

General question: Is there a simple way to change the address that the interrupt calls, rather than (as we do now) having ISR(TIMER1_COMPA_vect) call our isr handler methods based on conditions? That would reduce a tiny amount of the overhead (as would placing the stepping loop within the body of the ISR itself).

thinkyhead on 5 Mar 2018

Needs to be called twice as often than normal.

The interrupt overhead would be the same as if calling a disable interrupt. Timer calculation could be omitted every odd step, even if than not that smooth distributed, but exactly 50% duty cycle for the leading axis. (always assuming having no double or quad stepping, what is the worst evil for pulse distribution)

AnHardt on 5 Mar 2018

is the worst evil for pulse distribution

Of course, most of the time we're not stepping all steppers during those loops. The most expensive move is probably when moving X and Y 45° across the bed, especially with Extrusion plus bed leveling. But even then, the distribution of the steps is staggered a bit, so on each iteration of the steps loop only 1 or 2 pulses is being sent on average.

It might make sense in some cases to stagger the steppers intentionally — offsetting the Bresenham counters a little bit, just so that each individual loop does less stepping, at least in cases where we can predict that two or more Bresenham counters are going to frequently coincide.

It occurs to me also that the code currently checks whether a STEP pin was pulsed HIGH to decide whether to set it LOW (or when inverted, HIGH). Would it be less overhead, I wonder, to skip the check — and just always write all STEP pins LOW?

So, instead of this:

// Advance the Bresenham counter; start a pulse if the axis needs a step
#define PULSE_START(AXIS) \
  _COUNTER(AXIS) += current_block->steps[_AXIS(AXIS)]; \
  if (_COUNTER(AXIS) > 0) { _APPLY_STEP(AXIS)(!_INVERT_STEP_PIN(AXIS), 0); }

// Stop an active pulse, reset the Bresenham counter, update the position
#define PULSE_STOP(AXIS) \
  if (_COUNTER(AXIS) > 0) { \
    _COUNTER(AXIS) -= current_block->step_event_count; \
    count_position[_AXIS(AXIS)] += count_direction[_AXIS(AXIS)]; \
    _APPLY_STEP(AXIS)(_INVERT_STEP_PIN(AXIS), 0); \
  }

…this:

// Advance the Bresenham counter; start a pulse if the axis needs a step
#define PULSE_START(AXIS) \
  _COUNTER(AXIS) += current_block->steps[_AXIS(AXIS)]; \
  if (_COUNTER(AXIS) > 0) { \
    _APPLY_STEP(AXIS)(!_INVERT_STEP_PIN(AXIS), 0); \
    _COUNTER(AXIS) -= current_block->step_event_count; \
    count_position[_AXIS(AXIS)] += count_direction[_AXIS(AXIS)]; }

// Stop pulse, if any
#define PULSE_STOP(AXIS) _APPLY_STEP(AXIS)(_INVERT_STEP_PIN(AXIS), 0)

thinkyhead on 5 Mar 2018

Right. Currently, in between of the stepper interrupts, the step pins are all low.
So it comes down to the question if comparing the long _COUNTER(AXIS) is faster than setting the pin to LOW. For digitalWrite() on 16MHz Arduinos, i'm quite sure the compare is faster. For fastIO WRITE() it could be a close race. Theoretically this has to be tested on all platforms. But at the ARMs a long compare is incredible fast - only one instruction - hard to beat.

AnHardt on 5 Mar 2018

👍1

Was this page helpful?

0 / 5 - 0 ratings