Hello,
This might be very trivial. I am trying to build a regression nnet that predicts 2 values per observation, but learning systematically fails. I thought LinearRegressionOutput supported multiple outputs, but I might be wrong?
Here is an example with the iris dataset:
```
data(iris)

data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, num_hidden = 16)
fc2 <- mx.symbol.FullyConnected(fc1, num_hidden = 2)
out <- mx.symbol.LinearRegressionOutput(data = fc2)

model <- mx.model.FeedForward.create(X = t(data.matrix(iris[, -c(1, 2)])),
                                     y = t(data.matrix(iris[, c(1, 2)])),
                                     symbol = out,
                                     num.round = 10,
                                     learning.rate = 0.001,
                                     momentum = 0.9,
                                     eval.metric = mx.metric.rmse,
                                     array.batch.size = 10,
                                     array.layout = "colmajor")
```

Output:

```
Start training with 1 devices
[1] Train-rmse=NaN
[2] Train-rmse=NaN
[3] Train-rmse=NaN
[4] Train-rmse=NaN
[5] Train-rmse=NaN
[6] Train-rmse=NaN
[7] Train-rmse=NaN
[8] Train-rmse=NaN
[9] Train-rmse=NaN
[10] Train-rmse=NaN
```
And if I try this nnet with my dataset, R crashes after a while without displaying any error.
Thank you for your help,
Michel
Same error here. I would like to know too. Trying to use (multivariate) LogisticRegression with multiple outputs...
Try reducing your learning rate. Does that help?
no... even at learning.rate=1e-20 or 1e-99
Is my syntax OK? I've seen people do complicated stuff elsewhere with Group, bind and executors???
Try setting up a monitor on the weights and gradients so that you can see the values in the logging.
yikes... I only started with mxnet a week ago... any pointers on how to set up a monitor (in R)? I wonder if the label was ever meant to be a matrix rather than a single vector of labels. I peeked into the source R-package/R/model.R, where the label is frequently referred to via 'length(y)' (e.g. in mx.model.init.iter), which suggests it was never expected to be a (multivariate) matrix output.
weird... say 1 out of 10 times, especially after switching GPU device and setting the learning rate very small, like 1e-6, I get no NaNs. But then the NaNs start coming back. Even using the CPU. Weird.
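I don't think the R package exposes the Python-style weight/gradient Monitor, but you can at least log the training metric per batch and per epoch with the callback helpers from the R docs, which helps pin down when things blow up. A rough, untested sketch (plug in your own symbol and data; `out`, `train.x` and `train.y` below are placeholders):

```
# Rough sketch: record the per-epoch training metric in a logger object and
# print progress every few batches. This does not show raw weights/gradients,
# but it shows when the metric first becomes NaN.
library(mxnet)
logger <- mx.metric.logger$new()

model <- mx.model.FeedForward.create(
  symbol = out,                      # placeholder: your network symbol
  X = train.x, y = train.y,          # placeholder: your data
  num.round = 10,
  learning.rate = 0.001,
  eval.metric = mx.metric.rmse,
  array.batch.size = 10,
  batch.end.callback = mx.callback.log.train.metric(5),
  epoch.end.callback = mx.callback.log.train.metric(5, logger)
)

print(logger$train)                  # per-epoch training metric values
```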
Getting same error here. Multiple regression output would be fantastic.
Has there been any movement on this? I'd be interested to see the correct way to handle this!
I'm not an R person, but here is a Python example showing a multiple regression setup with a 30-valued output on some artificial data. This model converges:
```
import mxnet as mx
from sklearn.cross_validation import train_test_split
import logging
import pdb
import numpy as np

def get_mlp():
    """
    Multi-layer perceptron with a 30-unit LinearRegressionOutput.
    The label Variable is named 'softmax_label' to match NDArrayIter's
    default label name.
    """
    outLabl = mx.sym.Variable('softmax_label')
    data = mx.symbol.Variable('data')
    flat = mx.symbol.Flatten(data=data)
    fc1 = mx.symbol.FullyConnected(data=flat, name='fc1', num_hidden=100)
    act1 = mx.symbol.Activation(data=fc1, name='relu1', act_type="relu")
    fc2 = mx.symbol.FullyConnected(data=act1, name='fc2', num_hidden=30)
    net = mx.sym.LinearRegressionOutput(data=fc2, label=outLabl, name='linreg1')
    return net

#
# Create artificial data: 2140 samples of 1x96x96 inputs with 30 targets each
#
X = np.ones((2140, 9216)).reshape((2140, 1, 96, 96))
y = 0.6 * np.ones((2140, 30))

#
# Setup iterators
#
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
trainIter = mx.io.NDArrayIter(data=X_train, label=y_train, batch_size=64)
valIter = mx.io.NDArrayIter(data=X_test, label=y_test, batch_size=64)

#
# Multidevice kvstore setup and logging
#
kv = mx.kvstore.create('local')
head = '%(asctime)-15s Node[' + str(kv.rank) + '] %(message)s'
logging.basicConfig(level=logging.DEBUG, format=head)

#
# Get model and train
#
net = get_mlp()
model = mx.model.FeedForward(
    ctx=mx.gpu(),
    symbol=net,
    num_epoch=15,
    learning_rate=0.001,
    momentum=0.9,
    wd=0.00001,
    initializer=mx.init.Xavier(factor_type="in", magnitude=2.34),
)
model.fit(X=trainIter, eval_data=valIter,
          batch_end_callback=mx.callback.Speedometer(1, 50),
          epoch_end_callback=None, eval_metric='rmse')

#
# Prediction
#
valIter.reset()
for prediction in model.predict(valIter):
    print prediction
pdb.set_trace()
```
Had exactly the same problem (in R); however, switching from Windows 10 to Ubuntu 16 and building the latest versions of mxnet and the R package solved the problem for me.
Hi, I needed a multiple-output regression in mxnet using R, so I have translated the example Python code provided by @Piyush3dB into R (although I use a different example). Code and results below:
Results:

```
[1] "Train MSE:"
        y1         y2         y3 
0.05941047 0.07813825 0.01670451 
[1] "Test MSE:"
        y1         y2         y3 
0.05959333 0.08441392 0.01816598 
```
Plots: actual output vs. model response for each of the three outputs, on the training and test sets (images not reproduced here).
Code:
```
## Translated to R from the Python example at:
## https://github.com/dmlc/mxnet/issues/2138#issuecomment-222812951
library(mxnet)

# MXNET settings:
nRounds <- 300
nHidden <- 30
optimizer <- "rmsprop"
array.layout <- "rowmajor"
ctx <- mx.cpu()
initializer <- mx.init.Xavier()

# Data settings:
nObservations <- 2000
noiseLvl <- 0.5
nOutput <- 3
set.seed(42)
mx.set.seed(42)

get_mlp <- function() {
  # multi-layer perceptron with an nOutput-unit linear regression output
  label <- mx.symbol.Variable('label')
  data <- mx.symbol.Variable('data')
  flat <- mx.symbol.Flatten(data = data)
  fc1 <- mx.symbol.FullyConnected(data = flat, name = 'fc1', num_hidden = nHidden)
  act1 <- mx.symbol.Activation(data = fc1, name = 'tanh1', act_type = "tanh")
  fc2 <- mx.symbol.FullyConnected(data = act1, name = 'fc2', num_hidden = nOutput)
  net <- mx.symbol.LinearRegressionOutput(data = fc2, label = label, name = 'lro')
  return(net)
}

# Generate some random data
df <- data.frame(x1 = rnorm(nObservations),
                 x2 = rnorm(nObservations),
                 x3 = rnorm(nObservations),
                 x4 = rnorm(nObservations))
expts <- list()
for (outIdx in 1:nOutput) {
  expts[[outIdx]] <- sample(0:3, 4, replace = TRUE)
  df[[paste0("y", outIdx)]] <- df$x1^expts[[outIdx]][1] +
    df$x2^expts[[outIdx]][2] + df$x3^expts[[outIdx]][3] +
    df$x4^expts[[outIdx]][4] + noiseLvl * rnorm(nObservations)
}
respCols <- paste0("y", 1:nOutput)

# Scale data to zero-mean unit-variance
df <- data.frame(scale(df))

# Split into training and test sets
test.ind <- seq(1, nObservations, 10)  # 1 in 10 samples for testing
train.x <- data.matrix(df[-test.ind, -which(names(df) %in% respCols)])
train.y <- data.matrix(df[-test.ind, respCols])
test.x <- data.matrix(df[test.ind, -which(names(df) %in% respCols)])
test.y <- data.matrix(df[test.ind, respCols])

# Setup iterators (data and label are transposed because the iterator
# expects column-major arrays)
trainIter <- mx.io.arrayiter(data = t(train.x), label = t(train.y))
valIter <- mx.io.arrayiter(data = t(test.x), label = t(test.y))

# Get model and train
net <- get_mlp()
model <- mx.model.FeedForward.create(X = trainIter,
                                     eval.data = valIter,
                                     ctx = ctx,
                                     symbol = net,
                                     num.round = nRounds,
                                     initializer = initializer,
                                     optimizer = optimizer,
                                     array.layout = array.layout)

# Prediction
train.Response <- t(predict(model, train.x, array.layout = array.layout))
test.Response <- t(predict(model, test.x, array.layout = array.layout))

# Results (per-output mean squared error)
print("Train MSE:")
print(colMeans((train.Response - train.y)^2))
print("Test MSE:")
print(colMeans((test.Response - test.y)^2))

# Plots: actual vs. predicted for each output, train and test
par(mfrow = c(nOutput, 2))
for (outIdx in 1:nOutput) {
  plot(train.y[, outIdx], train.Response[, outIdx],
       xlab = "Actual output", ylab = "Model Response",
       main = paste0("train perf. output ", outIdx))
  abline(0, 1)
  plot(test.y[, outIdx], test.Response[, outIdx],
       xlab = "Actual output", ylab = "Model Response",
       main = paste0("test perf. output ", outIdx))
  abline(0, 1)
}
```
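For reference, the same iterator-based pattern applied back to the iris example from the top of the thread would look roughly like this (untested sketch; the key points are giving LinearRegressionOutput an explicit label Variable and feeding the 2-column target matrix through mx.io.arrayiter rather than the y= argument):

```
# Untested sketch: the original iris example rewritten so the 2-column target
# matrix goes through an array iterator instead of the length(y)-based code
# path in mx.model.FeedForward.create.
library(mxnet)
data(iris)

x <- data.matrix(iris[, -c(1, 2)])   # Petal.Length, Petal.Width, Species (as numeric code)
y <- data.matrix(iris[, c(1, 2)])    # Sepal.Length, Sepal.Width as the 2 targets
train.iter <- mx.io.arrayiter(data = t(x), label = t(y), batch.size = 10)

label <- mx.symbol.Variable("label")
data  <- mx.symbol.Variable("data")
fc1   <- mx.symbol.FullyConnected(data, num_hidden = 16)
fc2   <- mx.symbol.FullyConnected(fc1, num_hidden = 2)
out   <- mx.symbol.LinearRegressionOutput(data = fc2, label = label, name = "lro")

model <- mx.model.FeedForward.create(symbol = out,
                                     X = train.iter,
                                     ctx = mx.cpu(),
                                     num.round = 10,
                                     learning.rate = 0.001,
                                     momentum = 0.9,
                                     initializer = mx.init.Xavier())
```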