Incubator-mxnet: NDArray failed to allocate CPU memory

Created on 12 Apr 2018 · 11 comments · Source: apache/incubator-mxnet

Traceback (most recent call last):
File "SOM.py", line 109, in
test_som_with_color_data()
File "SOM.py", line 97, in test_som_with_color_data
img2 = nd.reshape(weights.data(),shape=(som_dim,som_dim,-1)).asnumpy()
File "C:\Users\Vitaly\Anaconda2\lib\site-packages\mxnet\ndarray\ndarray.py", line 1868, in asnumpy
ctypes.c_size_t(data.size)))
File "C:\Users\Vitaly\Anaconda2\lib\site-packages\mxnet\base.py", line 149, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:44:16] d:\program files (x86)\jenkins\workspace\mxnet\mxnet\src\storage./cpu_device_storage.h:70: Failed to allocate CPU Memory

code:

    # Assumes: import time; import mxnet as mx; from mxnet import nd;
    # import matplotlib.pyplot as plt; SOMNetwork and ctx are defined earlier in SOM.py.
    som_dim = 100
    net = SOMNetwork(input_dim=3, dim=som_dim, sigma=3)
    net.collect_params().initialize(ctx=ctx)
    net.hybridize()
    test_data = nd.random.uniform(0, 1, (5000000, 3))
    weights = net.params.get('weight')
    img1 = nd.reshape(weights.data(), shape=(som_dim, som_dim, -1)).asnumpy()
    plt.figure(1)
    plt.subplot(121)
    plt.imshow(img1)
    start = time.time()
    for i, data in enumerate(test_data):
        net.n = i
        data = data.as_in_context(ctx)
        output = net(data)
        weights.set_data(weights.data() + output)
    end = time.time()
    print(end - start)
    img2 = nd.reshape(weights.data(), shape=(som_dim, som_dim, -1)).asnumpy()
    plt.subplot(122)
    plt.imshow(img2)
    plt.show()

As I understand it, it is the asnumpy() call that fails. But for both img1 and img2 the shape of the NDArray is only 100x100x3.
The system has 16 GB of RAM.
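
A likely reason the tiny 100x100x3 copy is where the failure shows up: MXNet queues NDArray operations asynchronously, and asnumpy() is a blocking synchronization point, so an allocation failure caused by the millions of updates queued in the loop can surface there. A minimal sketch of this behaviour (toy arrays, not the SOMNetwork from the report):

    import mxnet as mx
    from mxnet import nd

    a = nd.ones((100, 100, 3))
    b = a * 2 + 1          # queued on the engine, the call returns immediately
    c = b.asnumpy()        # blocks until b is ready; errors from queued work surface here
    print(c.shape)         # (100, 100, 3)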

Labels: NDArray, Python


All 11 comments

@nswamy Could you please add labels: Python, NDArray.

As I understand it now, the memory is consumed here: weights.set_data(weights.data() + output), where weights is a Parameter with grad_req='null'. I chose this way of updating the parameter because (a short sketch of both update styles follows this list):

  1. I already have the delta needed for gradient descent from output = net(data).
  2. (Most importantly) Trainer.step takes approximately 1.5x more time than modifying the parameter directly.
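
A minimal sketch contrasting the two update styles, using a toy gluon Dense block (the layer, learning rate, and delta here are illustrative assumptions, not the SOMNetwork from the report):

    import mxnet as mx
    from mxnet import nd, gluon, autograd

    ctx = mx.cpu()
    net = gluon.nn.Dense(3)
    net.initialize(ctx=ctx)
    x = nd.random.uniform(shape=(4, 3), ctx=ctx)

    # (a) Usual route: record a graph, backprop, let Trainer.step apply the update.
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
    with autograd.record():
        loss = net(x).sum()
    loss.backward()
    trainer.step(batch_size=4)

    # (b) Direct route described above: the delta is already known, so the
    #     parameter is overwritten in place with set_data (works with grad_req='null').
    weight = net.weight                        # gluon Parameter of the Dense block
    delta = nd.ones_like(weight.data()) * 0.01
    weight.set_data(weight.data() + delta)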

So here are three questions:

  1. How can I modify a parameter directly without causing a memory leak?
  2. How can I optimize the performance of saving the data? I measured the time: the data calculations take about 190 seconds, while saving the data directly into the parameter takes about 500 s and Trainer.step takes about 740 s.
  3. Is it normal that gradient descent (saving the data) takes so much time compared to the calculation time?

OK, for the first question and the question in the title I have the answer:
I need to call wait_to_read() on the weights' data after weights.set_data(weights.data() + output).

But the performance issue is still valid.

Addendum: I measured the time again with .wait_to_read(), and the calculation time was 330 seconds, but the copying time became 2400 seconds!

So I concluded that I do not have to call wait_to_read() on every weights.set_data(weights.data() + output) update, but can do it, for example, with a 1:10 or 1:20 frequency. That should speed up the process drastically. As I see in the debugger, the actual data is already in 'weights' right after calling set_data, even if I do not call wait_to_read() (which explains such a big growth in memory usage). So the next iteration of the calculation starts working with up-to-date data. Am I right about that?
It would also be nice if you could explain, or link to, information about the memory usage and data-saving mechanism, for a better understanding of the subject.
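
A minimal sketch of the periodic-synchronization idea described above (toy shapes and a made-up SYNC_EVERY value; the real loop uses the SOMNetwork output as the update):

    import mxnet as mx
    from mxnet import nd

    ctx = mx.cpu()
    weights_data = nd.random.uniform(shape=(100 * 100, 3), ctx=ctx)
    SYNC_EVERY = 20                                  # assumed frequency, tune to taste

    for i in range(10000):
        update = nd.random.uniform(shape=weights_data.shape, ctx=ctx)  # stand-in for net(data)
        weights_data = weights_data + update         # only queued on the engine
        if (i + 1) % SYNC_EVERY == 0:
            weights_data.wait_to_read()              # drain the queue so memory stays bounded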

So I think I was wrong, and 'weights' does not hold the actual data if wait_to_read() is not called: after testing my code with wait_to_read() called on every 20th iteration, the error was raised at about 20% of the run, but the copying time stayed the same. Any suggestions?

Also, the problem from the original post still persists:
File "SOM.py", line 429, in
File "SOM.py", line 328, in evaluate_accuracyMLP
acc.update([label], [output])
File "C:\Users\C0d3r\Anaconda2\lib\site-packages\mxnet\metric.py", line 418, in update
pred_label = pred_label.asnumpy().astype('int32')
File "C:\Users\C0d3r\Anaconda2\lib\site-packages\mxnet\ndarray\ndarray.py", line 1826, in asnumpy
ctypes.c_size_t(data.size)))
File "C:\Users\C0d3r\Anaconda2\lib\site-packages\mxnet\base.py", line 149, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [12:41:00] d:\program files (x86)\jenkins\workspace\mxnet\mxnet\src\storage./cpu_device_storage.h:70: Failed to allocate CPU Memory

So, when I replaced the accuracy-evaluation algorithm with my own, the memory consumption became stable.

WAS:
def evaluate_accuracyMLP(dataIt, outputIt, net, netSom, inputsCount, activeNeuronsCount):
    # acc = mx.metric.Accuracy()
    # i = 0
    # for data, label in itertools.izip(dataIt, outputIt):
    #     data = data.as_in_context(ctx)
    #     label = label.as_in_context(ctx)
    #     outputT, win_index, delta, mask = netSom(data)
    #     data = data.reshape((-1,inputsCount))
    #     args = (data, mask)
    #     with autograd.record():
    #         output = net(*args)
    #     acc.update([label], [output])
    # return acc.get()[1]

IS:
def evaluate_accuracyMLP(dataIt, outputIt, net, netSom, inputsCount, activeNeuronsCount):
    res = 0
    loss = gluon.loss.L2Loss()

    for data, label in itertools.izip(dataIt, outputIt):
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        outputT, win_index, delta, mask = netSom(data)
        data = data.reshape((-1,inputsCount))
        args = (data, mask)
        output = net(*args)
        l2loss = loss(output, label)
        res += l2loss

    return (res/dataIt.shape[0]).asscalar()

The same memory-consumption behaviour persists when I add up one-element NDArrays (using +=) in a loop and only call .asscalar() after the loop.
It is fixed by calling .asscalar() inside the for loop on every iteration.
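
A minimal sketch of the two accumulation patterns described above (the loss and batch shapes are made up for illustration):

    import mxnet as mx
    from mxnet import nd, gluon

    loss_fn = gluon.loss.L2Loss()
    pred = nd.random.uniform(shape=(32, 3))
    label = nd.random.uniform(shape=(32, 3))

    # Deferred conversion: every per-batch loss stays an NDArray and the additions
    # are only queued, so temporaries pile up until the final asscalar().
    res = nd.zeros((1,))
    for _ in range(1000):
        res = res + loss_fn(pred, label).mean()
    total_deferred = res.asscalar()

    # Per-iteration conversion: asscalar() blocks each time, which keeps memory
    # flat but synchronizes the engine on every iteration (hence the slowdown).
    total = 0.0
    for _ in range(1000):
        total += loss_fn(pred, label).mean().asscalar()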

I removed the .asscalar() call from the for loop because it takes too much time to execute (again, the performance question!), and again I have the out-of-memory problem, I assume because of the loss-function calculations. Is this behaviour (consuming so much memory) normal, and do I simply need more RAM? I refuse to believe that, just as I refuse to believe the amount of time spent in asscalar(), memory copying, and so on.
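
One possible compromise (an assumption on my part, not something confirmed in this thread): keep the loss on the NDArray side but force a full synchronization every N batches with mx.nd.waitall(), so the engine's queue stays bounded without paying the asscalar() cost on every iteration:

    import mxnet as mx
    from mxnet import nd

    res = nd.zeros((1,))
    for i in range(1000):
        batch_loss = nd.random.uniform(shape=(1,))   # stand-in for loss(output, label).mean()
        res = res + batch_loss                       # only queued on the engine
        if (i + 1) % 50 == 0:                        # assumed interval
            mx.nd.waitall()                          # block until all queued work is done
    print(res.asscalar())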

Anybody here?
