Hi,
I'm working with the current snapshot build of DL4J/ND4J. Calling INDArray.getRow() creates LongBuffer, AtomicBoolean and ArrayList instances that the garbage collector cannot reclaim. This eventually leads to an OutOfMemoryError in the heap space. I encounter the problem when running on the CpuBackend.
This error does not occur in Beta3. I have not tested whether it occurs on CUDA.
You can find minimal example code for reproduction here: https://gist.github.com/SchmaR/5291bf21d812ac020307251eb4cdcf97
While investigating the problem, I noticed that it might be linked to a reference in DirectShapeInfoProvider.longCache, which seems to hold the last remaining reference to the LongBuffer objects. Attached are screenshots showing the reference chain as well as the stack trace of the initialization of the LongBuffer objects.
16:29:34.633 [main] INFO org.nd4j.linalg.factory.Nd4jBackend - Loaded [CpuBackend] backend
16:29:35.346 [main] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for NativeOps: 4
16:29:35.608 [main] INFO org.nd4j.nativeblas.Nd4jBlas - Number of threads used for BLAS: 4
16:29:35.616 [main] INFO o.n.l.a.o.e.DefaultOpExecutioner - Backend used: [CPU]; OS: [Linux]
16:29:35.617 [main] INFO o.n.l.a.o.e.DefaultOpExecutioner - Cores: [8]; Memory: [1.8GB];
16:29:35.617 [main] INFO o.n.l.a.o.e.DefaultOpExecutioner - Blas vendor: [MKL]

If I can help to figure out what's going wrong let me know.
Aha! Link: https://skymindai.aha.io/features/ND4J-57
Not a bug.
Shape buffers are cached and reused, and never released. Since they are very small, you would have to do something VERY unusual for that to become a problem.
So please tell us what you're doing there so that we can help you.
Or do you mean that the cache is somehow broken?
No, I've checked with the debugger: after 100k iterations there are only 4 shapes in the cache, and they are reused perfectly.
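
To illustrate the caching behaviour described above, here is a minimal sketch of a shape cache that hands out one shared buffer per distinct shape. The class `ShapeCacheSketch` and its methods are hypothetical and are not ND4J's DirectShapeInfoProvider; plain `long[]` arrays stand in for shape buffers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public class ShapeCacheSketch {
    // Hypothetical cache: one shared long[] per distinct shape, never evicted.
    private final Map<List<Long>, long[]> cache = new ConcurrentHashMap<>();

    /** Returns the cached buffer for this shape, creating it on first use. */
    long[] get(long... shape) {
        List<Long> key = Arrays.stream(shape).boxed().collect(Collectors.toList());
        return cache.computeIfAbsent(key, k -> shape.clone());
    }

    /** Number of distinct shapes currently held. */
    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        ShapeCacheSketch shapes = new ShapeCacheSketch();
        for (int i = 0; i < 100_000; i++) {
            shapes.get(1, 3 + (i % 4)); // only four distinct shapes are ever requested
        }
        System.out.println("shapes in cache: " + shapes.size());
    }
}
```

With a cache like this, memory use depends on the number of distinct shapes rather than on the number of calls, which is why a growing count of buffers would point to allocations that bypass the cache.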

Hmm, if I run the code in the gist on my machine, I end up with a large number of LongBuffers that consume all of my heap memory. I see this behavior in the current snapshot but not on beta3.
Is there any information I could provide which could help to figure out what's going wrong?
Where exactly do you see those LongBuffers?
I see them both in VisualVM and in the memory profiling view in IntelliJ. According to VisualVM, the buffers consume most of my memory. If I trigger a garbage collection in VisualVM on the snapshot build, none of the LongBuffers are cleaned up. If I do the same on beta3, the LongBuffers are discarded and I don't run out of heap memory while iterating over indArray.getRow(i % 3).
Obviously, I'm using snapshots too, and I can't reproduce the behavior you're describing. Maybe you're running some different code?
This is what I'm seeing (Snapshots, master, windows 10, native):
https://gist.github.com/AlexDBlack/83041fa0511954f97cd85e9a6544f6cb

Here's the same thing slightly modified to add GC. The memory snapshot was collected after "DONE GC" was printed in the code.
https://gist.github.com/AlexDBlack/db9b37bd041628f6c32b4acb64c7a1b2

Hmmm.... wrappedBuffer abused somewhere?
Hi,
I've further investigated this issue.
Every call of Shape.shapeOf() creates a new LongBuffer (NDArrayIndex.java:329, NDArrayIndex.java:300, BaseNDArray.java:3070). Let's call it the "Shape LongBuffer". The underlying DataBuffer of the INDArray that makes the call holds a reference to this new "Shape LongBuffer". The "underlying DataBuffer" is the result of INDArray.shapeInfoDataBuffer() on the original INDArray on which we call getRow() (ShapeOffsetResolution.java:365). The "Shape LongBuffer" in turn references this "underlying DataBuffer" as its wrappedBuffer (BaseDataBuffer.java:176). I assume that, because of these cyclic references, the garbage collector can't remove the local instances that are no longer needed. This holds even if INDArray.close() is called and null is assigned to the INDArray variable. The reference cycle never seems to be cleaned up.
Before the "Shape LongBuffer" is created, there is no check for whether a suitable buffer already exists. Many equal objects (see the "DataBuffer references seem to be all equal" screenshot) with different System.identityHashCode values (see the "DataBuffer references System.identityHashCode" screenshot) build up.
Here is a visualization of the cyclic references: dataBufferReferences.png
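
A note on the cyclic-reference assumption: JVM collectors trace reachability from GC roots, so a cycle by itself does not keep objects alive. A cycle is retained only while some root, such as a static cache, still reaches it, which would fit the DirectShapeInfoProvider.longCache observation in the first post. The following is a minimal sketch of that distinction. `Node` and `PINNED` are hypothetical stand-ins, not ND4J classes, and System.gc() is only a request, so the printed result can depend on the JVM and its flags.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

public class CycleReachabilitySketch {
    /** Hypothetical stand-in for a buffer that points at another buffer. */
    static final class Node {
        Node other;
    }

    /** Stand-in for a long-lived static cache, which acts as a GC root. */
    static final List<Node> PINNED = new ArrayList<>();

    /** Builds a two-node cycle and optionally pins it from the static list. */
    static WeakReference<Node> makeCycle(boolean pin) {
        Node a = new Node();
        Node b = new Node();
        a.other = b;
        b.other = a; // a <-> b now form a reference cycle
        if (pin) {
            PINNED.add(a);
        }
        return new WeakReference<>(a);
    }

    /** Requests GC a few times and reports whether the referent was collected. */
    static boolean collected(WeakReference<Node> ref) {
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("unpinned cycle collected: " + collected(makeCycle(false)));
        System.out.println("pinned cycle collected:   " + collected(makeCycle(true)));
    }
}
```

If the same pattern applies here, the place to look is whatever long-lived object still references the buffers, rather than the cycle between them.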
I've tested this using the following code, which is based on @AlexDBlack's code (Gist: MemoryLeakTest.java).
I've further investigated the matter and isolated a call of Shape.shapeOf() (Gist: MemoryLeakShapeTest.java).
After garbage collection I end up with the following memory consumption:
DONE GC
Max memory:7090
total memory:1365
free memory:684
used memory:680
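
Figures like these can be obtained from the standard java.lang.Runtime API. The following is a minimal sketch, not the code from the gist. It assumes the values are reported in megabytes (the thread does not state the unit), and the class name `MemoryReportSketch` is hypothetical.

```java
public class MemoryReportSketch {
    private static final long MB = 1024L * 1024L;

    /** Returns {max, total, free, used} heap sizes in MB (unit is an assumption). */
    static long[] snapshotMb() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory();
        long free = rt.freeMemory();
        // "used" is derived from the byte values, then converted to MB
        return new long[] { rt.maxMemory() / MB, total / MB, free / MB, (total - free) / MB };
    }

    public static void main(String[] args) {
        System.gc(); // only a request; the JVM may ignore it
        System.out.println("DONE GC");
        long[] m = snapshotMb();
        System.out.println("Max memory:" + m[0]);
        System.out.println("total memory:" + m[1]);
        System.out.println("free memory:" + m[2]);
        System.out.println("used memory:" + m[3]);
    }
}
```

Comparing the used value before and after the getRow() loop, on both beta3 and the snapshot, gives a quick check that does not require a profiler.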
Do you see a workaround or a way to fix this issue properly?
Stack Traces for MemoryLeakTest.java
A call of INDArray.getRow results in the following stack traces, which can be categorized as Shape Information Views and Data Views. I track every call to BaseDataBuffer(DataBuffer underlyingBuffer, long length, long offset) because it's the only point I'm aware of that adds new elements to either INDArray.data.references or INDArray.shapeInformation.references.
Shape Information Views
First, these stacks are generated:
<init>:162, BaseDataBuffer (org.nd4j.linalg.api.buffer)
<init>:105, LongBuffer (org.nd4j.linalg.api.buffer)
create:72, DefaultDataBufferFactory (org.nd4j.linalg.api.buffer.factory)
createBuffer:1122, Nd4j (org.nd4j.linalg.factory)
shapeOf:2834, Shape (org.nd4j.linalg.api.shape)
resolve:329, NDArrayIndex (org.nd4j.linalg.indexing)
get:4993, BaseNDArray (org.nd4j.linalg.api.ndarray)
getRow:5076, BaseNDArray (org.nd4j.linalg.api.ndarray)
main:19, MemoryLeakTest (org.nd4j)
<init>:162, BaseDataBuffer (org.nd4j.linalg.api.buffer)
<init>:105, LongBuffer (org.nd4j.linalg.api.buffer)
create:72, DefaultDataBufferFactory (org.nd4j.linalg.api.buffer.factory)
createBuffer:1122, Nd4j (org.nd4j.linalg.factory)
shapeOf:2834, Shape (org.nd4j.linalg.api.shape)
isRowVectorShape:1594, Shape (org.nd4j.linalg.api.shape)
resolve:336, NDArrayIndex (org.nd4j.linalg.indexing)
get:4993, BaseNDArray (org.nd4j.linalg.api.ndarray)
getRow:5076, BaseNDArray (org.nd4j.linalg.api.ndarray)
main:19, MemoryLeakTest (org.nd4j)
Secondly this one:
<init>:162, BaseDataBuffer (org.nd4j.linalg.api.buffer)
<init>:105, LongBuffer (org.nd4j.linalg.api.buffer)
create:72, DefaultDataBufferFactory (org.nd4j.linalg.api.buffer.factory)
createBuffer:1122, Nd4j (org.nd4j.linalg.factory)
shapeOf:2834, Shape (org.nd4j.linalg.api.shape)
resolve:329, NDArrayIndex (org.nd4j.linalg.indexing)
exec:365, ShapeOffsetResolution (org.nd4j.linalg.indexing)
get:4995, BaseNDArray (org.nd4j.linalg.api.ndarray)
getRow:5076, BaseNDArray (org.nd4j.linalg.api.ndarray)
main:19, MemoryLeakTest (org.nd4j)
Third:
<init>:162, BaseDataBuffer (org.nd4j.linalg.api.buffer)
<init>:105, LongBuffer (org.nd4j.linalg.api.buffer)
create:72, DefaultDataBufferFactory (org.nd4j.linalg.api.buffer.factory)
createBuffer:1122, Nd4j (org.nd4j.linalg.factory)
shapeOf:2834, Shape (org.nd4j.linalg.api.shape)
isRowVectorShape:1594, Shape (org.nd4j.linalg.api.shape)
resolve:336, NDArrayIndex (org.nd4j.linalg.indexing)
exec:365, ShapeOffsetResolution (org.nd4j.linalg.indexing)
get:4995, BaseNDArray (org.nd4j.linalg.api.ndarray)
getRow:5076, BaseNDArray (org.nd4j.linalg.api.ndarray)
main:19, MemoryLeakTest (org.nd4j)
Fourth:
<init>:162, BaseDataBuffer (org.nd4j.linalg.api.buffer)
<init>:105, LongBuffer (org.nd4j.linalg.api.buffer)
create:72, DefaultDataBufferFactory (org.nd4j.linalg.api.buffer.factory)
createBuffer:1122, Nd4j (org.nd4j.linalg.factory)
shapeOf:2834, Shape (org.nd4j.linalg.api.shape)
shapeOf:3070, BaseNDArray (org.nd4j.linalg.api.ndarray)
subArray:2579, BaseNDArray (org.nd4j.linalg.api.ndarray)
get:5035, BaseNDArray (org.nd4j.linalg.api.ndarray)
getRow:5076, BaseNDArray (org.nd4j.linalg.api.ndarray)
main:19, MemoryLeakTest (org.nd4j)
Data Views
<init>:162, BaseDataBuffer (org.nd4j.linalg.api.buffer)
<init>:79, FloatBuffer (org.nd4j.linalg.api.buffer)
create:67, DefaultDataBufferFactory (org.nd4j.linalg.api.buffer.factory)
createBuffer:1122, Nd4j (org.nd4j.linalg.factory)
<init>:192, BaseNDArray (org.nd4j.linalg.api.ndarray)
<init>:80, NDArray (org.nd4j.linalg.cpu.nativecpu)
create:390, CpuNDArrayFactory (org.nd4j.linalg.cpu.nativecpu)
create:4375, Nd4j (org.nd4j.linalg.factory)
create:2172, BaseNDArray (org.nd4j.linalg.api.ndarray)
subArray:2589, BaseNDArray (org.nd4j.linalg.api.ndarray)
get:5035, BaseNDArray (org.nd4j.linalg.api.ndarray)
getRow:5076, BaseNDArray (org.nd4j.linalg.api.ndarray)
main:19, MemoryLeakTest (org.nd4j)