We can already construct a network with the Glow C++ API, as in examples/mnist.cpp.
So why can't we simply import a Caffe2 model, differentiate F, and retrain it?
What is the main difference between a Function built from an imported model and one built with the API (e.g. tensor initialization, differentiation)?
Thanks a lot!
There's no reason this couldn't be done in theory, importing the network structure from a proto to differentiate and train. The C++ API for building the graph is used for mnist, unittests/MLTest.cpp, the proto importers, etc. But currently our proto importers are built expecting pre-trained models, e.g. variables that are created during loading have the trainable property set to false. So you would need to modify the importer at least allowing for that, and I'm sure you would encounter other small obstacles. I don't expect it would be incredibly difficult overall, though.
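To make the importer change described above concrete, here is a minimal, self-contained sketch. It does not use Glow's real classes: `ToyVariable`, `loadWeight`, `countTrainable`, and the `forTraining` option are all hypothetical names invented for illustration, and the actual importer code may be organized quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph variable; NOT Glow's real Variable class.
struct ToyVariable {
  std::string name;
  bool trainable; // an importer built for pre-trained models leaves this false
};

// Hypothetical importer hook: when `forTraining` is set, the loaded weight is
// marked trainable so that a later differentiation pass would update it.
ToyVariable loadWeight(const std::string &name, bool forTraining) {
  assert(!name.empty() && "a loaded weight needs a name");
  return ToyVariable{name, forTraining};
}

// Counts the variables that a training pass would actually update.
std::size_t countTrainable(const std::vector<ToyVariable> &vars) {
  std::size_t count = 0;
  for (const auto &v : vars) {
    if (v.trainable) {
      ++count;
    }
  }
  return count;
}
```

With `forTraining` set to false, `countTrainable` returns zero for every imported weight, which is the situation described above: differentiating the graph would leave nothing to update until the importer marks the weights as trainable.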
@jfix71 Thanks for your reply.
P.S.
1. Glow (NHWC) adds a Transpose node for Caffe2 input data (NCHW) and transposes back afterwards. Something goes wrong when I differentiate the loaded graph directly, perhaps with some operator during lowering and optimization.
2. The imported model's tensors are currently initialized with GivenTensorFill, ConstantFill, and UniformFill, but Xavier initialization is used by createConv and createFullyConnected in the C++ API. That seems to be the main difference.
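The initialization difference can be illustrated with a small standalone sketch in plain C++ (no Glow dependency). The function names `constantFill` and `xavierFill` are hypothetical, and the scale `sqrt(3 / fanIn)` is an assumption: it is one common Xavier-style uniform variant, and Glow's exact formula may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Fills every element with the same value, similar in spirit to a
// ConstantFill operator in an imported model.
std::vector<float> constantFill(std::size_t count, float value) {
  return std::vector<float>(count, value);
}

// Xavier-style uniform fill: samples from U(-s, s) with s = sqrt(3 / fanIn),
// which gives each weight a variance of 1 / fanIn.
// ASSUMPTION: this scale is illustrative, not taken from Glow's source.
std::vector<float> xavierFill(std::size_t count, std::size_t fanIn,
                              unsigned seed) {
  assert(fanIn > 0 && "fanIn must be positive");
  const float scale = std::sqrt(3.0f / static_cast<float>(fanIn));
  std::mt19937 rng(seed);
  std::uniform_real_distribution<float> dist(-scale, scale);
  std::vector<float> weights(count);
  for (auto &w : weights) {
    w = dist(rng);
  }
  return weights;
}
```

A constant fill gives every weight in a layer the same starting value, while the Xavier-style fill spreads them over a range that shrinks as the fan-in grows. For a graph that is going to be retrained from scratch, this is presumably why the choice of initializer matters, whereas for a pre-trained model the fill operators simply carry the stored values.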
@wayneshawn @jfix71 We plan to rewrite the Graph builder methods and make them more general (by separating the context). I hope that this rewrite will help to fix the generality problem that you've described (where the wrong initialization kind is used), but I suspect that we'll need to iterate over it a few times before we get it to work. Could you please open an issue with an accurate description of the problem so that we could track this and fix it?