Incubator-mxnet: NDArray failed to allocate CPU memory

Created on 12 Apr 2018 · 11 comments · Source: apache/incubator-mxnet

Traceback (most recent call last):
File "SOM.py", line 109, in
test_som_with_color_data()
File "SOM.py", line 97, in test_som_with_color_data
img2 = nd.reshape(weights.data(),shape=(som_dim,som_dim,-1)).asnumpy()
File "C:\Users\Vitaly\Anaconda2\lib\site-packages\mxnet\ndarray\ndarray.py", line 1868, in asnumpy
ctypes.c_size_t(data.size)))
File "C:\Users\Vitaly\Anaconda2\lib\site-packages\mxnet\base.py", line 149, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:44:16] d:\program files (x86)\jenkins\workspace\mxnet\mxnet\src\storage./cpu_device_storage.h:70: Failed to allocate CPU Memory

code:

    # Assumes: import time; import mxnet as mx; from mxnet import nd;
    # import matplotlib.pyplot as plt; SOMNetwork and ctx are defined earlier in SOM.py.
    som_dim = 100
    net = SOMNetwork(input_dim=3, dim=som_dim, sigma=3)
    net.collect_params().initialize(ctx=ctx)
    net.hybridize()
    test_data = nd.random.uniform(0, 1, (5000000, 3))
    weights = net.params.get('weight')
    img1 = nd.reshape(weights.data(), shape=(som_dim, som_dim, -1)).asnumpy()
    plt.figure(1)
    plt.subplot(121)
    plt.imshow(img1)
    start = time.time()
    for i, data in enumerate(test_data):
        net.n = i
        data = data.as_in_context(ctx)
        output = net(data)
        weights.set_data(weights.data() + output)
    end = time.time()
    print(end - start)
    img2 = nd.reshape(weights.data(), shape=(som_dim, som_dim, -1)).asnumpy()
    plt.subplot(122)
    plt.imshow(img2)
    plt.show()

As I understand it, it is the asnumpy() call that fails. But for both img1 and img2 the shape of the NDArray is only 100x100x3.
The system has 16 GB of RAM.
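
A likely reason the tiny 100x100x3 copy is where the failure shows up: MXNet queues NDArray operations asynchronously, and asnumpy() is a blocking synchronization point, so an allocation failure caused by the millions of updates queued in the loop can surface there. A minimal sketch of this behaviour (toy arrays, not the SOMNetwork from the report):

    import mxnet as mx
    from mxnet import nd

    a = nd.ones((100, 100, 3))
    b = a * 2 + 1          # queued on the engine, the call returns immediately
    c = b.asnumpy()        # blocks until b is ready; errors from queued work surface here
    print(c.shape)         # (100, 100, 3)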

Labels: NDArray, Python


All 11 comments

@nswamy Could you please add labels: Python, NDArray.

As I understand it now, the memory is consumed here: weights.set_data(weights.data() + output), where weights is a Parameter with grad_req='null'. I chose this way of updating the parameter because (a short sketch of both update styles follows this list):

  1. I already have the delta needed for gradient descent from output = net(data).
  2. (Most importantly) Trainer.step takes approximately 1.5x more time than modifying the parameter directly.
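
A minimal sketch contrasting the two update styles, using a toy gluon Dense block (the layer, learning rate, and delta here are illustrative assumptions, not the SOMNetwork from the report):

    import mxnet as mx
    from mxnet import nd, gluon, autograd

    ctx = mx.cpu()
    net = gluon.nn.Dense(3)
    net.initialize(ctx=ctx)
    x = nd.random.uniform(shape=(4, 3), ctx=ctx)

    # (a) Usual route: record a graph, backprop, let Trainer.step apply the update.
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
    with autograd.record():
        loss = net(x).sum()
    loss.backward()
    trainer.step(batch_size=4)

    # (b) Direct route described above: the delta is already known, so the
    #     parameter is overwritten in place with set_data (works with grad_req='null').
    weight = net.weight                        # gluon Parameter of the Dense block
    delta = nd.ones_like(weight.data()) * 0.01
    weight.set_data(weight.data() + delta)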

So here are three questions:

  1. How can I modify a parameter directly without causing a memory leak?
  2. How can I optimize the performance of saving the data? I measured the time: the data calculations take about 190 seconds, while saving the data directly into the parameter takes about 500 s and Trainer.step takes about 740 s.
  3. Is it normal that gradient descent (saving the data) takes so much time compared to the calculation time?

OK, for the first question and the question in the title I have the answer:
I need to call wait_to_read() on the weights' data after weights.set_data(weights.data() + output).

But the performance issue is still valid.

Addendum: I measured the time again with .wait_to_read(), and the calculation time was 330 seconds, but the copying time became 2400 seconds!

So I concluded that I do not have to call wait_to_read() on every weights.set_data(weights.data() + output) update, but can do it, for example, with a 1:10 or 1:20 frequency. That should speed up the process drastically. As I see in the debugger, the actual data is already in 'weights' right after calling set_data, even if I do not call wait_to_read() (which explains such a big growth in memory usage). So the next iteration of the calculation starts working with up-to-date data. Am I right about that?
It would also be nice if you could explain, or link to, information about the memory usage and data-saving mechanism, for a better understanding of the subject.
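
A minimal sketch of the periodic-synchronization idea described above (toy shapes and a made-up SYNC_EVERY value; the real loop uses the SOMNetwork output as the update):

    import mxnet as mx
    from mxnet import nd

    ctx = mx.cpu()
    weights_data = nd.random.uniform(shape=(100 * 100, 3), ctx=ctx)
    SYNC_EVERY = 20                                  # assumed frequency, tune to taste

    for i in range(10000):
        update = nd.random.uniform(shape=weights_data.shape, ctx=ctx)  # stand-in for net(data)
        weights_data = weights_data + update         # only queued on the engine
        if (i + 1) % SYNC_EVERY == 0:
            weights_data.wait_to_read()              # drain the queue so memory stays bounded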

So I think I was wrong, and 'weights' does not hold the actual data if wait_to_read() is not called: after testing my code with wait_to_read() called on every 20th iteration, the error was raised at about 20% of the run, but the copying time stayed the same. Any suggestions?

Also, the problem from the original post still persists:
File "SOM.py", line 429, in
File "SOM.py", line 328, in evaluate_accuracyMLP
acc.update([label], [output])
File "C:\Users\C0d3r\Anaconda2\lib\site-packages\mxnet\metric.py", line 418, in update
pred_label = pred_label.asnumpy().astype('int32')
File "C:\Users\C0d3r\Anaconda2\lib\site-packages\mxnet\ndarray\ndarray.py", line 1826, in asnumpy
ctypes.c_size_t(data.size)))
File "C:\Users\C0d3r\Anaconda2\lib\site-packages\mxnet\base.py", line 149, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [12:41:00] d:\program files (x86)\jenkins\workspace\mxnet\mxnet\src\storage./cpu_device_storage.h:70: Failed to allocate CPU Memory

So, when I replaced the accuracy-evaluation algorithm with my own, the memory consumption became stable.

WAS:
def evaluate_accuracyMLP(dataIt, outputIt, net, netSom, inputsCount, activeNeuronsCount):
    # acc = mx.metric.Accuracy()
    # i = 0
    # for data, label in itertools.izip(dataIt, outputIt):
    #     data = data.as_in_context(ctx)
    #     label = label.as_in_context(ctx)
    #     outputT, win_index, delta, mask = netSom(data)
    #     data = data.reshape((-1,inputsCount))
    #     args = (data, mask)
    #     with autograd.record():
    #         output = net(*args)
    #     acc.update([label], [output])
    # return acc.get()[1]

IS:
def evaluate_accuracyMLP(dataIt, outputIt, net, netSom, inputsCount, activeNeuronsCount):
    res = 0
    loss = gluon.loss.L2Loss()

    for data, label in itertools.izip(dataIt, outputIt):
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        outputT, win_index, delta, mask = netSom(data)
        data = data.reshape((-1,inputsCount))
        args = (data, mask)
        output = net(*args)
        l2loss = loss(output, label)
        res += l2loss

    return (res/dataIt.shape[0]).asscalar()

The same memory-consumption behaviour persists when I add up one-element NDArrays (using +=) in a loop and only call .asscalar() after the loop.
It is fixed by calling .asscalar() inside the for loop on every iteration.
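
A minimal sketch of the two accumulation patterns described above (the loss and batch shapes are made up for illustration):

    import mxnet as mx
    from mxnet import nd, gluon

    loss_fn = gluon.loss.L2Loss()
    pred = nd.random.uniform(shape=(32, 3))
    label = nd.random.uniform(shape=(32, 3))

    # Deferred conversion: every per-batch loss stays an NDArray and the additions
    # are only queued, so temporaries pile up until the final asscalar().
    res = nd.zeros((1,))
    for _ in range(1000):
        res = res + loss_fn(pred, label).mean()
    total_deferred = res.asscalar()

    # Per-iteration conversion: asscalar() blocks each time, which keeps memory
    # flat but synchronizes the engine on every iteration (hence the slowdown).
    total = 0.0
    for _ in range(1000):
        total += loss_fn(pred, label).mean().asscalar()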

I removed the .asscalar() call from the for loop because it takes too much time to execute (again, the performance question!), and again I have the out-of-memory problem, I assume because of the loss-function calculations. Is this behaviour (consuming so much memory) normal, and do I simply need more RAM? I refuse to believe that, just as I refuse to believe the amount of time spent in asscalar(), memory copying, and so on.
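
One possible compromise (an assumption on my part, not something confirmed in this thread): keep the loss on the NDArray side but force a full synchronization every N batches with mx.nd.waitall(), so the engine's queue stays bounded without paying the asscalar() cost on every iteration:

    import mxnet as mx
    from mxnet import nd

    res = nd.zeros((1,))
    for i in range(1000):
        batch_loss = nd.random.uniform(shape=(1,))   # stand-in for loss(output, label).mean()
        res = res + batch_loss                       # only queued on the engine
        if (i + 1) % 50 == 0:                        # assumed interval
            mx.nd.waitall()                          # block until all queued work is done
    print(res.asscalar())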

Anybody here?
