Incubator-mxnet: Use R, how to manually confficients predict y^, make the result equal to the function `predict`?

Created on 5 Jul 2017 · 3Comments · Source: apache/incubator-mxnet

The question

This is not just a question of R, but I use R.
I hope to know how predict.MXFeedForwardModel work. I set three layers linear FeedForward neural networks, it like this :

data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, num_hidden = num_hidden_first, name = "fc1")
act1 <- mx.symbol.Activation(fc1, act_type = "relu", name = "relu1")
fc2 <- mx.symbol.FullyConnected(act1, num_hidden = 28, name = "fc2")
act2 <- mx.symbol.Activation(fc2, act_type = "relu", name = "relu2")
fc3 <- mx.symbol.FullyConnected(act2, num_hidden = 1, name = "fc3")
mlp <- mx.symbol.LinearRegressionOutput(fc3, name = "mlp")

It outputs three sets variable coefficients. Each set contains weight and bias.

dim(model$arg.params$fc1_weight)       `39*64`
length(model$arg.params$fc1_bias)       `64`
dim(model$arg.params$fc2_weight)       `39*28`
length(model$arg.params$fc2_bias)       `28`
dim(model$arg.params$fc3_weight)       `39*1`
dim(model$arg.params$fc3_bias)       `1`

I try to manually substitution confficients to predict the y^, make the result equal to the function predict(model, data). Try the way like this:

mx.predict <- function(model, data) {
  W1 <- as.array(model$arg.params$fc1_weight)
  b1 <- as.array(model$arg.params$fc1_bias)
  W2 <- as.array(model$arg.params$fc2_weight)
  b2 <- as.array(model$arg.params$fc2_bias)
  W3 <- as.array(model$arg.params$fc3_weight)
  b3 <- as.array(model$arg.params$fc3_bias)

  data <- as.matrix(data)
  pred <- data %*% W1
  for (i in length(b1)) {
    pred[, i] <- pred[, i] + b1[i]
  }
  pred <- pred %*% W2
  for (i in length(b2)) {
    pred[, i] <- pred[, i] + b2[i]
  }
  pred <- pred %*% W3
  for (i in length(b3)) {
    pred[, i] <- pred[, i] + b3[i]
  }
  return(pred)
}

But the result is far greater than the predict(model, data). I know that wrong. But please tell me which way to use the coefficient, to get the correct results ?
I try to view the predict.MXFeedForwardModel code, but found that its core part seems to involve c ++, I am not familiar with c ++.
Who can help me understand how predict.MXFeedForwardModel works, and help me get the right result with the R function . Thanks!

Source

GuilongZh

Most helpful comment

I think you forgot the application of the activation function. Since you use "relu", you should take something like pmax(0, pred) for act1 and act2.

jeremiedb on 6 Jul 2017

👍2

All 3 comments

I think you forgot the application of the activation function. Since you use "relu", you should take something like pmax(0, pred) for act1 and act2.

jeremiedb on 6 Jul 2017

👍2

Yes, I forgot that. I think that's a reason, and will try to add activation function. Thank you for your help!

GuilongZh on 6 Jul 2017

@GuilongZh Please try the code below. I think 1.639859e-05 is acceptable.

library(mxnet)

data(BostonHousing, package="mlbench")

train.ind <- seq(1, 506, 3)
train.x <- data.matrix(BostonHousing[train.ind, -14])
train.y <- BostonHousing[train.ind, 14]
test.x <- data.matrix(BostonHousing[-train.ind, -14])
test.y <- BostonHousing[-train.ind, 14]

data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, num_hidden = 20, name = "fc1")
act1 <- mx.symbol.Activation(fc1, act_type = "relu", name = "relu1")
fc2 <- mx.symbol.FullyConnected(act1, num_hidden = 1, name = "fc2")
mlp <- mx.symbol.LinearRegressionOutput(fc2, name = "mlp")

mx.set.seed(0)
model <- mx.model.FeedForward.create(mlp, X=train.x, y=train.y,
                                     ctx=mx.cpu(), num.round=50, array.batch.size=20,
                                     learning.rate=2e-6, momentum=0.9, eval.metric=mx.metric.rmse)

preds <- predict(model, train.x)

dim(model$arg.params$fc1_weight)
dim(model$arg.params$fc1_bias)
dim(model$arg.params$fc2_weight)
dim(model$arg.params$fc2_bias)

W1 <- as.array(model$arg.params$fc1_weight)
b1 <- as.array(model$arg.params$fc1_bias)
W2 <- as.array(model$arg.params$fc2_weight)
b2 <- as.array(model$arg.params$fc2_bias)

data <- as.matrix(train.x)
pred <- data %*% W1
for (i in length(b1)) {
  pred[, i] <- pred[, i] + b1[i]
}

pred = matrix(pmax(0, pred), nrow = nrow(pred))

pred <- pred %*% W2
for (i in length(b2)) {
  pred[, i] <- pred[, i] + b2[i]
}

sum((pred - t(preds))^2)
# [1] 1.639859e-05