I trained CIFAR-10 in R, following the Python implementation at:
https://github.com/dmlc/mxnet/tree/master/example/image-classification
My code is at:
https://github.com/ziyeqinghan/mxnet/tree/master/R-package/demo/image-classification
However, every time I run the command Rscript train_cifar10.R --gpus 0, the results are very different. Below are excerpts from several runs:
Batch [50] Train-accuracy=0.09546875
...
Batch [350] Train-accuracy=0.0966294642857143
[10] Train-accuracy=0.0969269501278772
[10] Validation-accuracy=0.0999599358974359
[5] Train-accuracy=0.820612212276215
[5] Validation-accuracy=0.478866185897436
...
[6] Train-accuracy=0.839002403846154
[6] Validation-accuracy=0.328826121794872
Batch [50] Train-accuracy=0.92484375
...
Batch [350] Train-accuracy=0.928973214285714
[18] Train-accuracy=0.929008152173913
[18] Validation-accuracy=0.837740384615385
I also saved a model trained in Python, then loaded it and continued training it in R, and I saw a similar problem.
Since the Python code works fine, I suspect there may be a bug in the R implementation. I would appreciate any help or suggestions.
Thanks so much!
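For anyone who wants to compare runs like the ones above side by side, here is a minimal Python sketch that pulls the epoch-level accuracy lines out of a saved console log. It assumes the log lines look exactly like those pasted in this thread; the function name parse_epoch_accuracy and the sample text are only illustrative and are not part of mxnet.

```python
import re

# Matches epoch-level lines such as "[5] Validation-accuracy=0.478866185897436".
# Batch-level lines ("Batch [50] Train-accuracy=...") are deliberately skipped.
EPOCH_LINE = re.compile(r"^\[(\d+)\] (Train|Validation)-accuracy=([0-9.]+)$")


def parse_epoch_accuracy(log_text):
    """Return {(epoch, split): accuracy} for the epoch-level lines in log_text.

    Hypothetical helper: the line format is assumed from the logs in this thread.
    """
    results = {}
    for line in log_text.splitlines():
        match = EPOCH_LINE.match(line.strip())
        if match:
            epoch, split, value = match.groups()
            results[(int(epoch), split)] = float(value)
    return results


# Sample text copied from the first run quoted in this thread.
sample_log = """Batch [350] Train-accuracy=0.0966294642857143
[10] Train-accuracy=0.0969269501278772
[10] Validation-accuracy=0.0999599358974359"""

parsed = parse_epoch_accuracy(sample_log)
for (epoch, split), accuracy in sorted(parsed.items()):
    print(epoch, split, accuracy)
```

Running this over the log of each run gives one dictionary per run, which makes it easier to see at which epoch the runs start to diverge.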
That is why I am using the Python interface for my project now, sigh!
I am working on a paper for the WABI conference.
I will come back to this after paper submission. You have my word.
Yeah, it is the same with the C/C++ interfaces. I tried them and have just started with the Python interface. Even the popular RMSProp optimizer is not implemented, so I had to reverse-engineer the Python interface to get exactly the same functionality, and dmlc/mxnetcpp is practically unusable :( This lack of support will make mxnet users, and even contributors, start looking for alternatives.
I have pushed the CIFAR-10 example. The results using lenet are as follows:
Rscript train_cifar10.R --cpu=TRUE --network=lenet
Loading required package: mxnet
Loading required package: methods
Loading required package: argparse
Loading required package: proto
[1] "Network used: lenet"
[01:06:27] src/io/iter_image_recordio.cc:209: ImageRecordIOParser: data/cifar10/train.rec, use 1 threads for decoding..
[01:06:27] src/io/./iter_normalize.h:103: Load mean image from data/cifar10/mean.bin
[01:06:27] src/io/iter_image_recordio.cc:68: Loaded ImageList from data/cifar10/test.lst 10000 Image records
[01:06:27] src/io/iter_image_recordio.cc:209: ImageRecordIOParser: data/cifar10/test.rec, use 1 threads for decoding..
[01:06:27] src/io/./iter_normalize.h:103: Load mean image from data/cifar10/mean.bin
[1] "Computing with CPU"
Start training with 1 devices
Batch [50] Train-accuracy=0.25234375
Batch [100] Train-accuracy=0.284609375
Batch [150] Train-accuracy=0.302239583333333
Batch [200] Train-accuracy=0.3121875
Batch [250] Train-accuracy=0.32134375
Batch [300] Train-accuracy=0.328046875
Batch [350] Train-accuracy=0.332924107142857
[1] Train-accuracy=0.336237980769231
[1] Validation-accuracy=0.358880537974684