Marlin 🚀 - DO NOT EVER EVER USE DYNAMIC MEMORY ON AVR

The use of malloc() and free() are reasonable. Especially because the arrays are going to get large as the n x n grid of points increases. It is useful to do n=6 when doing a Bed Level Topology report in G29.

With that said, it appears to have a mismatch between these malloc()'s and free()'s. A free(x) never happens. I also think it would be reasonable to check that all the malloc()'s succeeded prior to continuing.

  x = ( double * ) malloc ( n * sizeof ( double ) );
  jpvt = ( int * ) malloc ( n * sizeof ( int ) );
  qraux = ( double * ) malloc ( n * sizeof ( double ) );
  r = ( double * ) malloc ( m * sizeof ( double ) );
  itask = 1;
  ind = dqrls ( a_qr, lda, m, n, tol, &kr, b, x, r, jpvt, qraux, itask );
  free ( a_qr );
  free ( jpvt );
  free ( qraux ); 
  free ( r );

Who made this change? It looks like somebody is making changes without a full understanding of what they are doing.

Roxy-3D on 3 Jul 2015

👍2

"Who made this change? It looks like somebody is making changes ..."

These are "ancient history" (December 2013). I believe that predates the current efforts to revive Marlin.

Wackerbarth on 3 Jul 2015

Years ago I read that malloc() and free() , in AVR family, cause memory
fragmented.
You confirmed that this is always right?

Drk

2015-07-03 14:24 GMT+01:00 Richard Wackerbarth [email protected]:

"Who made this change? It looks like somebody is making changes ..."

These are "ancient history" (December 2013). I believe that predates the
current efforts to revive Marlin.

—
Reply to this email directly or view it on GitHub
https://github.com/MarlinFirmware/Marlin/issues/2367#issuecomment-118350248
.

Drakelive on 3 Jul 2015

Years ago I read that malloc() and free() , in AVR family, cause memory
fragmented. You confirmed that this is always right?

My experience is malloc() & free() are just fine in the Arduino environment. I'm sure there are conditions like "You always have to give back everything you allocated." and other things like that. The reason I'm pretty confident they are OK is we have been using them for a long time in the (at the time) Enhanced G29 here: https://github.com/beckdac/Marlin We have not seen any problems with the memory getting fragmented and not able to support the malloc() requests.

@Wackerbarth My guess is this thread should be tagged as a 'Confirmed Bug'. The missing free(x) is troubling. And probably checks for successful malloc()'s need to be done to insure correct operation of the firmware.

Roxy-3D on 3 Jul 2015

👍1

You can use the "History" button on the file to locate the commit, and search for the SHA of the commit in the issues tab to find the pull request that merged it. In this case:

@fsantini included it in pull request #673 and
@alexborro merged commit 89a304f (and 4 others) into MarlinFirmware:Marlin_v1 on Dec 23, 2013

Agreed it's a bug to malloc without free. Haven't heard that all uses of malloc in AVR are stupid and retarded, but perhaps a nicer way to put this would be "static memory allocation is safer, can someone please convert this code?"

gregrebholz on 3 Jul 2015

Agreed it's a bug to malloc without free. Haven't heard that all uses of malloc in AVR are stupid and retarded, but perhaps a nicer way to put this would be "static memory allocation is safer, can someone please convert this code?"

Static memory allocation is safer. But the memory is always allocated and not available for anything else. The G29 is the routine that uses qr_solve() and creates those large arrays. I think it can be argued that only keeping the memory tied up for as long as you need it allows other code to use that memory if it needs it too.

Roxy-3D on 3 Jul 2015

I'm sorry for the bug. That portion of code was taken from an external library (as indicated in the commit description) and that lacking free() slipped through my revision.

fsantini on 3 Jul 2015

malloc() on avr is a poor idea, it really is. this is where you count clock cycles and bytes... but different wording could have came out better ;-)

nothinman on 3 Jul 2015

Well, the qr_solve() routine is very expensive computationally and it only gets called once per G29. An extra millisecond to setup the memory for it is in the round off. If we were guaranteed to have a K-byte of stack space we could just allocate the memory on the stack inside the G29 function. It wouldn't be standard C, but I think I saw some code for the Arduino environment where a standard declaration for an array such as :
double x[s];

works just fine with s being a variable. If that is the case, perhaps we should consider putting those arrays on the stack instead of getting the memory from the heap???

Roxy-3D on 3 Jul 2015

There are three kinds of RAM storage -- Static, heap (malloc) and the stack.
Static store is a permanent allocation. However, the other two are dynamic.
There are three issues with the use of malloc.

The first is the possibility of fragmentation. This is caused by allocating and deallocating entries whose lives overlap.

Another issue relates to the use of pointers to allocated arrays where the allocated space outlives the scope of the routine that created it. This is the most common cause of memory leaks.

From the few calls to malloc, I don't think we have that issue. There is one array (the bed slope coefficients) that survives (it is the "result" returned by) the call to qr_solve.
These coefficients are properly freed in the calling G29 routine. However, it would certainly be cleaner to allocate that result array at the G29 level and just pass the pointer to it to the solver.

The third issue has to do with the implementation of the handling of freed space. Depending on the malloc/free implementation, malloc may retain a permanent heap that grows to some high-water mark, even when all of the blocks are freed. If that is the case, calling G29, even once, has the same effect as making a static allocation of the arrays. (Although the total size would vary with the number of points involved).

I think that it would be much cleaner to use stack based allocation. That way, we KNOW that, upon exit from G29, all of the array storage has been released for other use.

However, this issue is not a "show stopper". It does work, as implemented. We should make the rewrite a part of some future improvement release.

Wackerbarth on 3 Jul 2015

There are three kinds of RAM storage -- Static, heap (malloc) and the stack. Static store is a permanent allocation. However, the other two are dynamic. There are three issues with the use of malloc.

These are very large temporary arrays. For sure we do not want them in static memory.

The first is the possibility of fragmentation. This is caused by allocating and deallocating entries whose lives overlap.

The use of this memory within G29 does not have 'lives that overlap'. At the end of the G29, all of the memory malloc()'ed can be free()'ed. We only need the Bed Level Correction matrix from the G29 and that is allocated in global (or static) memory space.

Another issue relates to the use of pointers to allocated arrays where the allocated space outlives the scope of the routine that created it. This is the most common cause of memory leaks.

This is not in play with the G29. It can free() all of its memory at the end.

From the few calls to malloc, I don't think we have that issue.

Agreed.

There is one array (the bed slope coefficients) that survives (it is the "result" returned by) the call to qr_solve. These coefficients are properly freed in the calling G29 routine.

...

The third issue has to do with the implementation of the handling of freed space. Depending on the malloc/free implementation, malloc may retain a permanent heap that grows to some high-water mark, even when all of the blocks are freed. If that is the case, calling G29, even once, has the same effect as making a static allocation of the arrays. (Although the total size would vary with the number of points involved).

Well... I haven't looked at the actual Arduino code for malloc() and free() but I would be surprised if it had this bad behaviour. Memory is too precious in a micro-controller to do things like this. And my experience with the G29 in the BeckDac fork using many points again and again with no reset works just fine. My guess is the free() routine is good about merging free blocks of memory back together again.

I think that it would be much cleaner to use stack based allocation. That way, we KNOW that, upon exit from G29, all of the array storage has been released for other use.

Agreed. But we need to be sure an out of memory condition is detected and reported. And in fact, that is true of the current malloc() / free() approach.

However, this issue is not a "show stopper". It does work, as implemented. We should make the rewrite a part of some future improvement release.

Agreed. However, I think that missing free(x) is a bug and should be tracked (and fixed) as such.

Roxy-3D on 3 Jul 2015

Here are two types of kids, since the tone is sliding down hill here to explaining things like we're all 3 and we need explaining about not eating the sand in the sand box...

There are messy kids whom don't have a spot for all their toys and leave the lying around. Then there are good kids whom have a spot for each toy and put them back in their spots when they aren't playing with them.

Depending on the board there are less than 1024 bytes of RAM left in a Marlin build. When juggling option settings you need to know if a set of options fits on your board.

How ever its allocated for: if it's gone, it's gone. And finding out at runtime 99% of the way through a 30 hour build isn't the time to find this out.

This Qr code needs its dynamic toys converted to a static struct and thus have them tally'd up so we can know when we've turned on too many options.

This is the second time Daid has exhibited what appear to be anger management issues on this project.

IT Employers monitor github: don't be like Daid.

On Jul 3, 2015, at 8:27 AM, Roxy-3DPrintBoard [email protected] wrote:

There are three kinds of RAM storage -- Static, heap (malloc) and the stack. Static store is a permanent allocation. However, the other two are dynamic. There are three issues with the use of malloc.

These are very large temporary arrays. For sure we do not want them in static memory.

The first is the possibility of fragmentation. This is caused by allocating and deallocating entries whose lives overlap.

The use of this memory within G29 does not have 'lives that overlap'. At the end of the G29, all of the memory malloc()'ed can be free()'ed. We only need the Bed Level Correction matrix from the G29 and that is allocated in global (or static) memory space.

Another issue relates to the use of pointers to allocated arrays where the allocated space outlives the scope of the routine that created it. This is the most common cause of memory leaks.

This is not in play with the G29. It can free() all of its memory at the end.

From the few calls to malloc, I don't think we have that issue.

Agreed.

There is one array (the bed slope coefficients) that survives (it is the "result" returned by) the call to qr_solve. These coefficients are properly freed in the calling G29 routine.

...

The third issue has to do with the implementation of the handling of freed space. Depending on the malloc/free implementation, malloc may retain a permanent heap that grows to some high-water mark, even when all of the blocks are freed. If that is the case, calling G29, even once, has the same effect as making a static allocation of the arrays. (Although the total size would vary with the number of points involved).

Well... I haven't looked at the actual Arduino code for malloc() and free() but I would be surprised if it had this bad behaviour. Memory is too precious in a micro-controller to do things like this. And my experience with the G29 in the BeckDac fork using many points again and again with no reset works just fine. My guess is the free() routine is good about merging free blocks of memory back together again.

I think that it would be much cleaner to use stack based allocation. That way, we KNOW that, upon exit from G29, all of the array storage has been released for other use.

Agreed. But we need to be sure an out of memory condition is detected and reported. And in fact, that is true of the current malloc() / free() approach.

However, this issue is not a "show stopper". It does work, as implemented. We should make the rewrite a part of some future improvement release.

Agreed. However, I think that missing free(x) is a bug and should be tracked (and fixed) as such.

—
Reply to this email directly or view it on GitHub.

scotty1024 on 3 Jul 2015

There is mo single reason to use malloc here. None. C++ objects on the
stack will serve this goal just as well without the chance of leaking. It
sets horrible precidense for other code as well. As well as someone will
fuck up for sure.

Why do i go mad? Because its shit like this that has made marlin a whole
lot less stable in the last few years. And the host suppliers and slicers
usually get the blame.
On 3 Jul 2015 18:29, "scotty1024" [email protected] wrote:

Here are two types of kids, since the tone is sliding down hill here to
explaining things like we're all 3 and we need explaining about not eating
the sand in the sand box...

There are messy kids whom don't have a spot for all their toys and leave
the lying around. Then there are good kids whom have a spot for each toy
and put them back in their spots when they aren't playing with them.

Depending on the board there are less than 1024 bytes of RAM left in a
Marlin build. When juggling option settings you need to know if a set of
options fits on your board.

How ever its allocated for: if it's gone, it's gone. And finding out at
runtime 99% of the way through a 30 hour build isn't the time to find this
out.

This Qr code needs its dynamic toys converted to a static struct and thus
have them tally'd up so we can know when we've turned on too many options.

This is the second time Daid has exhibited what appear to be anger
management issues on this project.

IT Employers monitor github: don't be like Daid.

On Jul 3, 2015, at 8:27 AM, Roxy-3DPrintBoard [email protected]
wrote:

There are three kinds of RAM storage -- Static, heap (malloc) and the
stack. Static store is a permanent allocation. However, the other two are
dynamic. There are three issues with the use of malloc.

These are very large temporary arrays. For sure we do not want them in
static memory.

The first is the possibility of fragmentation. This is caused by
allocating and deallocating entries whose lives overlap.

The use of this memory within G29 does not have 'lives that overlap'. At
the end of the G29, all of the memory malloc()'ed can be free()'ed. We only
need the Bed Level Correction matrix from the G29 and that is allocated in
global (or static) memory space.

Another issue relates to the use of pointers to allocated arrays where
the allocated space outlives the scope of the routine that created it. This
is the most common cause of memory leaks.

This is not in play with the G29. It can free() all of its memory at the
end.

From the few calls to malloc, I don't think we have that issue.

Agreed.

There is one array (the bed slope coefficients) that survives (it is the
"result" returned by) the call to qr_solve. These coefficients are properly
freed in the calling G29 routine.

...

The third issue has to do with the implementation of the handling of
freed space. Depending on the malloc/free implementation, malloc may retain
a permanent heap that grows to some high-water mark, even when all of the
blocks are freed. If that is the case, calling G29, even once, has the same
effect as making a static allocation of the arrays. (Although the total
size would vary with the number of points involved).

Well... I haven't looked at the actual Arduino code for malloc() and
free() but I would be surprised if it had this bad behaviour. Memory is too
precious in a micro-controller to do things like this. And my experience
with the G29 in the BeckDac fork using many points again and again with no
reset works just fine. My guess is the free() routine is good about merging
free blocks of memory back together again.

I think that it would be much cleaner to use stack based allocation.
That way, we KNOW that, upon exit from G29, all of the array storage has
been released for other use.

Agreed. But we need to be sure an out of memory condition is detected
and reported. And in fact, that is true of the current malloc() / free()
approach.

However, this issue is not a "show stopper". It does work, as
implemented. We should make the rewrite a part of some future improvement
release.

Agreed. However, I think that missing free(x) is a bug and should be
tracked (and fixed) as such.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub
https://github.com/MarlinFirmware/Marlin/issues/2367#issuecomment-118387328
.

daid on 3 Jul 2015

(I also might know a thing or two about how to make rock solid always
working code. But its up to you guys if you want to listen to me or remove
me from the collab list)
On 3 Jul 2015 18:49, "Daid" [email protected] wrote:

There is mo single reason to use malloc here. None. C++ objects on the
stack will serve this goal just as well without the chance of leaking. It
sets horrible precidense for other code as well. As well as someone will
fuck up for sure.

Why do i go mad? Because its shit like this that has made marlin a whole
lot less stable in the last few years. And the host suppliers and slicers
usually get the blame.
On 3 Jul 2015 18:29, "scotty1024" [email protected] wrote:

Here are two types of kids, since the tone is sliding down hill here to
explaining things like we're all 3 and we need explaining about not eating
the sand in the sand box...

There are messy kids whom don't have a spot for all their toys and leave
the lying around. Then there are good kids whom have a spot for each toy
and put them back in their spots when they aren't playing with them.

Depending on the board there are less than 1024 bytes of RAM left in a
Marlin build. When juggling option settings you need to know if a set of
options fits on your board.

How ever its allocated for: if it's gone, it's gone. And finding out at
runtime 99% of the way through a 30 hour build isn't the time to find this
out.

This Qr code needs its dynamic toys converted to a static struct and thus
have them tally'd up so we can know when we've turned on too many options.

This is the second time Daid has exhibited what appear to be anger
management issues on this project.

IT Employers monitor github: don't be like Daid.

On Jul 3, 2015, at 8:27 AM, Roxy-3DPrintBoard [email protected]
wrote:

There are three kinds of RAM storage -- Static, heap (malloc) and the
stack. Static store is a permanent allocation. However, the other two are
dynamic. There are three issues with the use of malloc.

These are very large temporary arrays. For sure we do not want them in
static memory.

The first is the possibility of fragmentation. This is caused by
allocating and deallocating entries whose lives overlap.

The use of this memory within G29 does not have 'lives that overlap'.
At the end of the G29, all of the memory malloc()'ed can be free()'ed. We
only need the Bed Level Correction matrix from the G29 and that is
allocated in global (or static) memory space.

Another issue relates to the use of pointers to allocated arrays where
the allocated space outlives the scope of the routine that created it. This
is the most common cause of memory leaks.

This is not in play with the G29. It can free() all of its memory at
the end.

From the few calls to malloc, I don't think we have that issue.

Agreed.

There is one array (the bed slope coefficients) that survives (it is
the "result" returned by) the call to qr_solve. These coefficients are
properly freed in the calling G29 routine.

...

The third issue has to do with the implementation of the handling of
freed space. Depending on the malloc/free implementation, malloc may retain
a permanent heap that grows to some high-water mark, even when all of the
blocks are freed. If that is the case, calling G29, even once, has the same
effect as making a static allocation of the arrays. (Although the total
size would vary with the number of points involved).

Well... I haven't looked at the actual Arduino code for malloc() and
free() but I would be surprised if it had this bad behaviour. Memory is too
precious in a micro-controller to do things like this. And my experience
with the G29 in the BeckDac fork using many points again and again with no
reset works just fine. My guess is the free() routine is good about merging
free blocks of memory back together again.

I think that it would be much cleaner to use stack based allocation.
That way, we KNOW that, upon exit from G29, all of the array storage has
been released for other use.

Agreed. But we need to be sure an out of memory condition is detected
and reported. And in fact, that is true of the current malloc() / free()
approach.

However, this issue is not a "show stopper". It does work, as
implemented. We should make the rewrite a part of some future improvement
release.

Agreed. However, I think that missing free(x) is a bug and should be
tracked (and fixed) as such.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub
https://github.com/MarlinFirmware/Marlin/issues/2367#issuecomment-118387328
.

daid on 3 Jul 2015

How ever its allocated for: if it's gone, it's gone. And finding out at runtime 99% of the way through a 30 hour build isn't the time to find this out.

I prefer to do my G29's at the start of a print. If G29 gives back all of its memory that it used (whether because it is allocated on the stack or heap) this is a non-issue. G29 isn't causing problems for some 30 hour print.

This Qr code needs its dynamic toys converted to a static struct and thus have them tally'd up so we can know when we've turned on too many options.

I disagree. I want the flexibility of specifying the appropriate number of points (at the time) to probe and have it not permanently consume the memory that only gets used once at the start of a print. Or... In fact, a G29 with a large number of points might not even be used for a print. It might be a diagnostic tool to help understand the flatness of a new piece of glass. In that case it would be very wasteful to hard allocate all that memory.

Here are two types of kids, since the tone is sliding down hill here to explaining things like we're all 3 and we need explaining about not eating the sand in the sand box...

There are messy kids whom don't have a spot for all their toys and leave the lying around. Then there are good kids whom have a spot for each toy and put them back in their spots when they aren't playing with them.

This is the second time Daid has exhibited what appear to be anger management issues on this project. IT Employers monitor github: don't be like Daid.

Hmmm... I bet psychologists read here too! And you sound a little judgmental about all those messy kids! My work area for my printer is a disaster. So clearly I'm not a 'good' kid! :( But I know where everything was last abandoned. (My code is clean however!!!)

PS. I know you didn't mean it that way. I'm just messing with you.

Roxy-3D on 3 Jul 2015

👍1

@daid personal defamatory attacks using hateful words such as "retarded" are unacceptable. GitHub is about open collaboration amongst developers of all skill levels and disabilities. Open collaboration is difficult with people tossing around such ugly words and all caps shouting.

On Jul 3, 2015, at 5:54 AM, daid [email protected] wrote:

https://github.com/MarlinFirmware/Marlin/blob/Release/Marlin/qr_solve.cpp
@fsantini
@MarlinFirmware/collaborators

I have no clue who merged this, but really, malloc in avr code in limited memory space is stupid, retarded and should never have merged. It leaves open a window for memory leaks, and wastes your most precious resource (memory) for memory management info.

—
Reply to this email directly or view it on GitHub.

scotty1024 on 3 Jul 2015

👍1

@Roxy-3DPrintBoard - "that missing free(x) is a bug".

As far as I can tell, there is no missing free() call. As I explained above, the coefficients are malloc()ed in the qr routine and a pointer to them is returned as the result. The G29 code properly free()s them when it gets through. Therefore, no memory leak. You can run G29 many times and it will keep reusing (allocating and deallocating) the same memory.

The potential issue is whether, or not, malloc(), itself, keeps that memory to the exclusion of other functions such as the stack. Some versions of malloc never reduce the size of their heap, even after everything has been free()d.

@Roxy-3DPrintBoard - " (whether because it is allocated on the stack or heap)"
Freeing space on the stack assures that it is available for other stack-based uses. Similarly, freeing space in the heap allows it to be reused for other malloc()ed uses. Most implementations successfully coalesce smaller blocks into larger ones when all are free()d. However, in many cases, the total heap may never "give back" any of the memory that was acquired to expand the heap.

Wackerbarth on 3 Jul 2015

👍1

It wouldn't be standard C, but I think I saw some code for the Arduino environment where a standard declaration for an array such as :

double x[s];

works just fine with s being a variable. If that is the case, perhaps we should consider putting those arrays on the stack instead of getting the memory from the heap???

If this is true, this would appear to address everybody's issue. This would be a cleaner and safer way to move forward.

Roxy-3D on 3 Jul 2015

Do you reboot your printer in between prints? There is no such thing as a zero foot print dynamic heap library. With each print Qr is stirring the pot via G29. You were advising Bo to keep one finger on the reset switch in case of a mysterious head crash bug. How do you know if the cause isn't the malloc heap having collided and over written something: you don't.

At compile time how do you know the image is stable and has sufficient RAM to operate: you don't.

Dynamic memory is wonderfully appropriate and useful when the system you are working with has _way_ more than 1024 unallocated bytes of RAM. When doing embedded development on such constrained systems we're clipping coupons and bending over to pick up pennies.

I recall seeing some one mentioning they were planning on stripping down this Qr code, they said it was mostly unused? RAM isn't the only thing Marlin is short on. The production Printrbot Marlin fork is at 99.1% of FLASH usage. As I recall it presently has more RAM left than FLASH.

On Jul 3, 2015, at 9:53 AM, Roxy-3DPrintBoard [email protected] wrote:

How ever its allocated for: if it's gone, it's gone. And finding out at runtime 99% of the way through a 30 hour build isn't the time to find this out.

I prefer to do my G29's at the start of a print. If G29 gives back all of its memory that used (whether because it is allocated on the stack or heap) this is a non-issue. G29 isn't causing problems for some 30 hour print.

This Qr code needs its dynamic toys converted to a static struct and thus have them tally'd up so we can know when we've turned on too many options.

I disagree. I want the flexibility of specifying the appropriate number of points (at the time) to probe and have it not permanently consume the memory that only gets used once at the start of a print. Or... In fact, a G29 with a large number of points might not even be used for a print. It might be a diagnostic tool to help understand the flatness of a new piece of glass. In that case it would be very wasteful to hard allocate all that memory.

Here are two types of kids, since the tone is sliding down hill here to explaining things like we're all 3 and we need explaining about not eating the sand in the sand box...

There are messy kids whom don't have a spot for all their toys and leave the lying around. Then there are good kids whom have a spot for each toy and put them back in their spots when they aren't playing with them.

This is the second time Daid has exhibited what appear to be anger management issues on this project. IT Employers monitor github: don't be like Daid.

Hmmm... I bet psychologists read here too! And you sound a little judgmental about all those messy kids! My work area for my printer is a disaster. So clearly I'm not a 'good' kid! :( But I know where everything was last abandoned. (My code is clean however!!!)

PS. I know you didn't mean it that way. I'm just messing with you.

—
Reply to this email directly or view it on GitHub.

scotty1024 on 3 Jul 2015

So long as someone can answer the question of: how much head room is left on the stack? For all we know the stack and heap are already colliding. If so, switching this from heap to stack still causes a collision.

On Jul 3, 2015, at 10:04 AM, Roxy-3DPrintBoard [email protected] wrote:

It wouldn't be standard C, but I think I saw some code for the Arduino environment where a standard declaration for an array such as :

double x[s];

works just fine with s being a variable. If that is the case, perhaps we should consider putting those arrays on the stack instead of getting the memory from the heap???

If this is true, this would appear to address everybody's issue. This would be a cleaner and safer way to move forward.

—
Reply to this email directly or view it on GitHub.

scotty1024 on 3 Jul 2015

I don't know how much room there is, but when you have a stack running into
the heap it's quite unpredictable, while a stack running into static ram is
more predictable.

Most of the stack will most likely be used during an stepper interrupt
while the planner is doing it's job. Looking at the data then it's most
likely up a few 100 of bytes that are used.

(And, I will stand by this point, if you think dynamic memory allocation in
a highly memory constrained environment which needs to run stable for long
amounts of time is a good idea, you are a retarded person)

On Fri, Jul 3, 2015 at 7:19 PM, scotty1024 [email protected] wrote:

So long as someone can answer the question of: how much head room is left
on the stack? For all we know the stack and heap are already colliding. If
so, switching this from heap to stack still causes a collision.

On Jul 3, 2015, at 10:04 AM, Roxy-3DPrintBoard [email protected]
wrote:

It wouldn't be standard C, but I think I saw some code for the Arduino
environment where a standard declaration for an array such as :

double x[s];

works just fine with s being a variable. If that is the case, perhaps we
should consider putting those arrays on the stack instead of getting the
memory from the heap???

If this is true, this would appear to address everybody's issue. This
would be a cleaner and safer way to move forward.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub
https://github.com/MarlinFirmware/Marlin/issues/2367#issuecomment-118394398
.

daid on 3 Jul 2015

Do you reboot your printer in between prints? There is no such thing as a zero foot print dynamic heap library. With each print Qr is stirring the pot via G29.

No, but if there is enough memory, it doesn't matter if there are a few pointers that manage the free space. That certainly is lighter weight than hard allocating enough memory to have 36 points consisting of 4 doubles.

You were advising Bo to keep one finger on the reset switch in case of a mysterious head crash bug. How do you know if the cause isn't the malloc heap having collided and over written something: you don't.

Yes, there is some mysterious behaviour. But it is very difficult to prove a negative. I don't know where it is coming from. It almost feels like it is a race condition with too many Z_RAISE_xxx_xxx happening. With that said, if it was a malloc() problem, it wouldn't be likely to show up unless a large number of probe points was being specified.

At compile time how do you know the image is stable and has sufficient RAM to operate: you don't.

At run time, you have no way of knowing (right now) that you have enough stack space. It might be worth while to put some stack fences in place.

Dynamic memory is wonderfully appropriate and useful when the system you are working with has _way_ more than 1024 unallocated bytes of RAM. When doing embedded development on such constrained systems we're clipping coupons and bending over to pick up pennies.

Which is exactly why I don't want to see enough memory hard allocated to handle a large number of probe points that pretty much only happens at the very very start of a print.

I recall seeing some one mentioning they were planning on stripping down this Qr code, they said it was mostly unused? RAM isn't the only thing Marlin is short on. The production Printrbot Marlin fork is at 99.1% of FLASH usage. As I recall it presently has more RAM left than FLASH.

The iterative version to replace qr_solve() is done. It does not lower the amount of RAM needed. But it does save 10KB of flash space. When the G29 gets broken out into its various flavors, that would be a good time to provide the option of using the original qr_solve() or the iterative version.

So long as someone can answer the question of: how much head room is left on the stack? For all we know the stack and heap are already colliding. If so, switching this from heap to stack still causes a collision.

Agreed. Probably we should do a sanity check on this. G29 is one of the biggest stack pigs. One approach would be to fill the stack memory with a known value at initialization time. And then (just for the purpose of this inquiry) we could add a command to find how much space the stack has used since the initialization was done. And of course, how much buffer space is left in the stack.

(And, I will stand by this point, if you think dynamic memory allocation in
a highly memory constrained environment which needs to run stable for long
amounts of time is a good idea, you are a retarded person)

If done right, dynamic memory allocation is just fine. Perhaps you have had other experiences. Perhaps it was too difficult for you to do right? I don't know the answer. But just because I'm more neutral on how it is accomplished does not make me retarded.

Roxy-3D on 3 Jul 2015

👍1

No it's not right. It's a heap of trouble. You're not running this project solo, you're collaborating with a lot of people with various skill levels. You're opening the paths for shitstorms of trouble. It has nothing to do with my or your skill level, it has to do with everyone's skill level.

Feel free to question my skill level. But you can look in my github account to see I have a track record in everything from AVR C to C++ to python. Also got 5 years of trafficlight safety systems under my belt.

It has to do with stability. Stability is KEY. If the firmware crashes into a 25 hour print because of you wanted to use malloc (without it being needed at all, let me remind you that), then my host software will get blamed. Or Octoprint will get blamed, or any other host software.
Simple rules like this are not because "it cannot be done", it's because it's hard to do it right. It's a risk. It's a risk in software which is not only running for hours and hours at a time, but also managing a heater with could burn your house down. (Yes, this actually happened)

That on top of wasting precious bytes of memory. Every byte counts. (Last time I cleaned up, which was a few year ago, these "does not mind to use some memory here" usages added up to about 700 bytes which I cleaned)
Have you looked on how the memory allocator of avrlibc works? If not, then you don't actually know how much memory is wasted. How fragmentation is handled. Have you checked different versions of this library? As not everyone is using the same compiler suit (and the one in the Arduino environment is quite dated)

"It works fine for me" is a horrible president for a project running on 10 to 100 thousands of printers.

daid on 3 Jul 2015

You are a 'Collaborator' ? Up above I even suggested that putting the G29 memory on the stack was a way to move forward without malloc()'s and would address everybody's concerns. I even said it would be 'safer and cleaner'. But now, somehow I'm the bad person because I'm not highly emotional about how G29 gets its memory?

I think Scotty1024 was correct when he said these things:

This is the second time Daid has exhibited what appear to be anger management issues on this project.

IT Employers monitor github: don't be like Daid.

@daid personal defamatory attacks using hateful words such as "retarded" are unacceptable. GitHub is about open collaboration amongst developers of all skill levels and disabilities. Open collaboration is difficult with people tossing around such ugly words and all caps shouting.

I really don't want to have any more interaction with you Daid. You are not a nice person.

Roxy-3D on 3 Jul 2015

👍1

All ranting aside, how easy is to change that in the code?
Any idea how it can be done?

Nandox7 on 3 Jul 2015

I have examined the code for the avr library from gcc. There is a little overhead in having malloc loaded.
It seems to be about 4 pointers and a couple size_t entries. But I suspect that you get this loaded anyway unless you create a custom stdlib which expressly removes it.
The code allocates its heap at the top of the data segment and moves the __brkval up and down to change the size of the heap as the last element is freed. Thus, as long as our code properly avoids memory leaks, the memory usage is only a little more to allocate a few blocks on the heap.(One pointer per block).

The issue of stack overflow is always a possibility. But avoiding the heap doesn't really change that issue at all. (WRT variable sized arrays that are on either the stack or a heap). If we need to address that issue, then we need to have appropriate tests checking for stack overflow and gracefully abort.

The space that G29 uses during its execution may be an issue. However, once it completes, it is not an issue during the rest of a print run.

Wackerbarth on 3 Jul 2015

All ranting aside, how easy is to change that in the code?
Any idea how it can be done?

It's not that hard. The bigger issue is we are still in a bug fix phase and shouldn't be making changes not related to bugs right now. But when the G29 code gets separated out into its various flavors, that might be a good time to make the change.

Roxy-3D on 3 Jul 2015

The code allocates its heap at the top of the data segment and moves the __brkval up and down to change the size of the heap as the last element is freed. Thus, as long as our code properly avoids memory leaks, the memory usage is only a little more to allocate a few blocks on the heap.(One pointer per block).

I checked this also and can confirm your findings.

The issue of stack overflow is always a possibility. But avoiding the heap doesn't really change that issue at all. (WRT variable sized arrays that are on either the stack or a heap).

I believe everybody can agree with what you just said. The issue is there either way.

If we need to address that issue, then we need to have appropriate tests checking for stack overflow and gracefully abort. The space that G29 uses during its execution may be an issue. However, once it completes, it is not an issue during the rest of a print run.

Agreed in the general case. But if G29 did cause the heap and stack to collide, it is possible to have data damaged that is needed later. It maybe worth while to figure out how much space we have left on the stack.

Right now __brkval floats up and down. That complicates things. But we could pre-fill the stack with a known value and then see how big that memory block is after key operations. That pre-filled memory block would be getting squeezed from both sides. What we care about is knowing how big the unused part of it is.

Roxy-3D on 3 Jul 2015

"That pre-filled memory block would be getting squeezed from both sides." -- Actually, that would find the value for the "worst case" where there is enough room for both the largest heap, and, at the same time, the largest stack. Since we know that the maximum heap usage is in the solver, it is very likely the case that the "worst case" is within the deepest subroutine nesting therein.

However, if there is some other G_xx stack that is even deeper, then that would be the failure point. (Actually within the Interrupt service routine, if it happens during that nesting)

An easier, and more accurate way to view things is to simply place some strategic calls to SdFatUtil::FreeRam and record the minimum value in a static variable.

Wackerbarth on 3 Jul 2015

This discussion went of the track rather fast.
There have been a lot of good points and useful suggestions. But that is not what this is about.

By the way the real problem of the code is the missing error handling. If malloc() fails to get the code then what?

I agree with @daid there should be no malloc() on AVR at all ! Everybody who ever worked with safety critical software knows that dynamic memory management is evil.
Marlin is not run on a PC and it can not crash all the time. The environment Marlin runs in is defined on compile time. So no need to be very dynamic as the environment will not change. This is an open source project so people with different skill levels join. We need to make it easy to write correct code and hard to write bad code. malloc() makes it easy to write faulty code.

As debugging is harder to do then writing code. If you put all you smarts into coming up with the best code, you will not be able to debug it!

So please make it part of the coding convention to not use malloc() and put the "dynamic memory" on the stack! That has been suggested earlier. nobody disagreed so it should be acceptable for all.

@Roxy-3DPrintBoard please don't take it personally if someone says that doing something/ or that who does something is stupid/retarded/..
You and daid probably never meet in person, so he can only judge you based on what you do/say here. So this can only be a disagreement based on facts and not personally.
Then look at some of the emails of Linus, he can be very "not nice" but he created Linux and many people work with him on it. Being polite in technical discussion can be very inefficient. For example this discussion thread is probably much longer than the Pull request to resolve this, if we even get one.

So could we all cool down a bit a be friends again?

JustAnother1 on 3 Jul 2015

An easier, and more accurate way to view things is to simply place some strategic calls to SdFatUtil::FreeRam and record the minimum value in a static variable.

I'm already well along the way to setting it up to look at (and size) the squeezed block of memory approach. If you could be talked into doing the SdFatUtil::FreeRam approach we could double check the results from two different perspectives.

Roxy-3D on 3 Jul 2015

JustAnother1 said: please don't take it personally if someone says that doing something/ or that who does something is stupid/retarded/..

Ah... He most certainly did not say that method was stupid or retarded. He said:

(And, I will stand by this point, if you think dynamic memory allocation in
a highly memory constrained environment which needs to run stable for long
amounts of time is a good idea, you are a retarded person)

And this comment also applies to you. You said:

So please make it part of the coding convention to not use malloc() and put the "dynamic memory" on the stack! That has been suggested earlier. nobody disagreed so it should be acceptable for all.

Apparently you are retarded by his standard because you want the dynamic memory to be placed on the stack. :) I'm not very emotional either way. I don't care as long as the malloc()'s and free()'s are done correctly.

Any way... How is it a 'Collaborator' is tearing the group of people donating their time apart. Maybe there should be a status title of 'Anti-Collaborator' because it doesn't seem like there is much 'Collaboration' coming from him. Mostly just opinions and pontifications. I'm actually writing code as we speak to get the answers to how much head room we have. But then again, I'm retarded, so I guess the 'Collaborators' won't care what number my work comes up with. However, I'll be comfortable that I know the answer and the number is correct.

Incidentally, malloc() is used 8 times in three different files. It is used in new.cpp, qr_solve.cpp and WString.h. If somebody feels strongly on this topic (which I do not), it should not be very difficult to eliminate the usage of malloc() and free().

Roxy-3D on 4 Jul 2015

👍1

malloc / free

I've done lots of testing of malloc/free to see how they behave, and they definitely work in a predictable way. Seeing how predictable they are, I have no personal problem with malloc/free, but they should obviously only be used in those cases where the buffer has a short lifespan. If only one piece of code uses malloc/free, and for a limited lifetime, that is guaranteed not to cause fragmentation. That's because the implementation of malloc/free is super-simple, especially on AVR.

I haven't read through this whole thread, because I only care about the facts, maam… so if I've stated something already discussed, my apologies.

thinkyhead on 4 Jul 2015

👍1

I will be forwarding this email thread to the legal department at Daid's employer.

It isn't just Daid standing there engaging in hate speech. If they condone his actions they are standing there beside him.

On Jul 3, 2015, at 10:51 AM, daid [email protected] wrote:

I don't know how much room there is, but when you have a stack running into
the heap it's quite unpredictable, while a stack running into static ram is
more predictable.

Most of the stack will most likely be used during an stepper interrupt
while the planner is doing it's job. Looking at the data then it's most
likely up a few 100 of bytes that are used.

(And, I will stand by this point, if you think dynamic memory allocation in
a highly memory constrained environment which needs to run stable for long
amounts of time is a good idea, you are a retarded person)

On Fri, Jul 3, 2015 at 7:19 PM, scotty1024 [email protected] wrote:

So long as someone can answer the question of: how much head room is left
on the stack? For all we know the stack and heap are already colliding. If
so, switching this from heap to stack still causes a collision.

On Jul 3, 2015, at 10:04 AM, Roxy-3DPrintBoard [email protected]
wrote:

It wouldn't be standard C, but I think I saw some code for the Arduino
environment where a standard declaration for an array such as :

double x[s];

works just fine with s being a variable. If that is the case, perhaps we
should consider putting those arrays on the stack instead of getting the
memory from the heap???

If this is true, this would appear to address everybody's issue. This
would be a cleaner and safer way to move forward.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub
https://github.com/MarlinFirmware/Marlin/issues/2367#issuecomment-118394398
.

—
Reply to this email directly or view it on GitHub.

scotty1024 on 4 Jul 2015

@Roxy-3DPrintBoard - "I'm already well along the way to setting it up to look at ..."

IMHO, that is a reflection of one of the problems plaguing this project. You're jumping in and writing code without first proposing a design (For example, when/where/how do you propose to initialize this memory? How will the result be made available?, etc. What compile-time/run-time options will control it? -- Since ROM is also at a premium, we will want to be able to have it included only optionally.) If you want me to write an alternate implementation, then there are some obvious parts that we should be sharing.

So, please, start by discussing the entire DESIGN rather than just jumping in with an implementation.

Wackerbarth on 4 Jul 2015

So, please, start by discussing the entire DESIGN rather than just jumping in with an implementation.

No need. This isn't code that will actually be part of the code base. This is code to get us an answer so we know how much head room we have. And the code is very very simple. Rather than have a big discussion, I'll just publish my code and you can look at it and nod your head up and down that it works and does what it is supposed to do.

Roxy-3D on 4 Jul 2015

👍1

@Roxy-3DPrintBoard please read more closely:

(And, I will stand by this point, if you think dynamic memory allocation in
a highly memory constrained environment which needs to run stable for long
amounts of time is a good idea, (_I would probably added a "then" here_) you are a retarded person)

So you, and me are not retarded, as long as we don't think that a stupid thing to do would be a good idea. And stating that someone is retarded if he believes something (that others believe to be stupid) is not the same as stating a person is retarded.

@scotty1024 could you point me to the "hate speech" part? As pointed out above there are no "personal" attacks here. Just evaluations of technical aspects. And don't be too disappointed when you realize the difference between US Law and International Law,...

JustAnother1 on 4 Jul 2015

👎1

JustAnother1, I do concede your point. But at a higher level, it really does not make sense to have a 'Collaborator' acting this way.

Roxy-3D on 4 Jul 2015

👍1

@Roxy-3DPrintBoard -- This isn't code that will actually be part of the code base."

I disagree. Unless there is "a huge surplus", we will need to retest this from time to time. As such, it would be a part of the code base, as an option that is almost always excluded from the user's firmware.
That way, we have it "on file" for when we need it again later.

Wackerbarth on 4 Jul 2015

OK... I will publish a 'Strawman' that we can discuss that actually gets valid numbers. Does that work for you? (Seriously... I'm far enough along, I'm almost ready to get numbers. I want to keep going.)

Yikes!!!! With more debug code... I think we have some memory corruption happening. I think those numbers I previously posted here are correct. I need to double and triple check everything.

But the problem is, when I added debug code to make sure the pattern was correct and unaltered in the free memory area, I found a bunch of it that is getting hammered. At least on my system, where I'm writing code, I'm seeing data corruption! If this data corruption is really happening, it isn't a big surprise we are seeing some flakey behavior.

Roxy-3D on 4 Jul 2015

👍1

I'm more interested in having you define the general way in which you think it will work. By the time you have written your "Strawman", because, at least in your opinion, it already "works", you aren't going to want to change anything. Therefore, any suggestions which might improve the design are effectively squashed.

Wackerbarth on 4 Jul 2015

I'm more interested in having you define the general way in which you think it will work. By the time you have written your "Strawman", because, at least in your opinion, it already "works", you aren't going to want to change anything. Therefore, any suggestions which might improve the design are effectively squashed.

No... I enjoy making code better. And I value other people's thoughts. And with just a little bit of checking you will find I've written code I have no intention of using myself. I wrote the code just because it was intellectually interesting. For example the iterative Least Squares Fit algorithm that saves 10KB of code space to replace qr_solve(). Or the code to save the Bed Level Correction Matrix to EEPROM so G29's do not have to be run for every print (published at 3DPrintBoard). If only somebody would have some good suggestions on how to improve M48 (also originally published at 3DPrintBoard). I would love to hear how to stress the mechanics of Delta printers harder and you would see new options show up for that. I don't even have a Delta printer but I would love to add an option explicitly for them.

I am not opposed at all to listening to suggestions on how to make things (my code) better and more useful.

Roxy-3D on 4 Jul 2015

👍1

I found a bug in qr_solve.cpp!

In the call to dqrsl below m is passed where n should have been passed.

if ( kr != 0 ) {
  job = 110;
  info = dqrsl ( a, lda, m, kr, qraux, b, rsd, rsd, x, rsd, rsd, job );
}

Inside dqrsl m is n and kr is k. In Marlin n is always 3 (X, y, z columns). This min() thus uses 2 (n - 1) which is important because kr can be either 0, 1, 2, 3. Thus the min() would normally cap ju at 2. But the bug passes in m which in Marlin is always AUTO_BED_LEVELING_GRID_POINTS*AUTO_BED_LEVELING_GRID_POINTS e.g., 4 or 9 (being typical values).

ju = i4_min ( k, n-1 );
/*                                                                              
Special action when N = 1.
*/

scotty1024 on 4 Jul 2015

@scotty1024 This has turned into a rather disorganized thread. This topic should probably have the focus it deserves without any of the distractions that are going to continue in this thread. Would you mind starting a dedicated thread for this just so we can keep things very focused when we discuss it?

Roxy-3D on 4 Jul 2015

this … has made marlin a whole lot less stable in the last few years

Well, let's not forget about the dozens of other bugs —some of them buffer overruns— fixed in the last 7 months or so. That old crufty qr_solve code has been the only thing using malloc/free, and only on machines with G29 enabled. The main thing making Marlin unstable is that we're forced to use C++ which is notoriously brittle when it comes to C-style arrays, uninitialized variables, etc., and (nods to @Wackerbarth) the lack of a robust quality control system.

Happily, we have a shiny new qr_solve ~~#2382~~ #2549 that allocates the worst-case up front and/or uses the stack instead of using malloc. FTW!

thinkyhead on 7 Jul 2015

Resolved by #2549

AnHardt on 31 Jul 2015

Marlin: DO NOT EVER EVER USE DYNAMIC MEMORY ON AVR

Most helpful comment

All 46 comments

Related issues