Hello,
I would like to make a picture of the activations of a certain layer during training, after every batch.
Let's say I would add a batch_end_callback, do something there, and then dump the activations as an image to disk.
What would be the easiest way to go about this?
Is this even possible? If not, why not :)
See example/python-howto/monitor_weight
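For reference, that example is built around mx.mon.Monitor. A rough sketch of the idea (the pattern and stat function below are made-up placeholders, not taken from the example itself):

import mxnet as mx

# Stat function applied to every matched array; here we just take the mean
# absolute value so the printed numbers stay small.
def abs_mean(x):
    return mx.nd.mean(mx.nd.abs(x))

# Report stats for all arrays whose name ends in "_output", every batch.
mon = mx.mon.Monitor(interval=1, stat_func=abs_mean, pattern='.*_output')

# With the Module API you attach the monitor after binding:
#   mod.bind(...); mod.init_params(); mod.install_monitor(mon)
# and call mon.tic() before each forward/backward and mon.toc_print() after.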
@juliandewit I usually measure the states of the mid-level inputs by inserting a CustomOp that intercepts the gradient and activation, like the following:
import ast
import logging

import mxnet as mx
import numpy as np


def safe_eval(expr):
    # Custom-op kwargs arrive as strings, so parse them back into Python values.
    if type(expr) is str:
        return ast.literal_eval(expr)
    else:
        return expr


class IdentityOp(mx.operator.CustomOp):
    def __init__(self, logging_prefix="identity", input_debug=False, grad_debug=False):
        super(IdentityOp, self).__init__()
        self.logging_prefix = logging_prefix
        self.input_debug = input_debug
        self.grad_debug = grad_debug

    def forward(self, is_train, req, in_data, out_data, aux):
        # Log the norm of the incoming activation, then pass it through unchanged.
        if self.input_debug:
            logging.debug("%s: in_norm=%f, in_shape=%s"
                          % (self.logging_prefix,
                             np.linalg.norm(in_data[0].asnumpy()),
                             str(in_data[0].shape)))
        self.assign(out_data[0], req[0], in_data[0])

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        # Log the norm of the incoming gradient, then pass it through unchanged.
        if self.grad_debug:
            logging.debug("%s: grad_norm=%f, grad_shape=%s"
                          % (self.logging_prefix,
                             np.linalg.norm(out_grad[0].asnumpy()),
                             str(out_grad[0].shape)))
        self.assign(in_grad[0], req[0], out_grad[0])


@mx.operator.register("identity")
class IdentityOpProp(mx.operator.CustomOpProp):
    def __init__(self, logging_prefix="identity", input_debug=False, grad_debug=False):
        super(IdentityOpProp, self).__init__(need_top_grad=True)
        self.input_debug = safe_eval(input_debug)
        self.grad_debug = safe_eval(grad_debug)
        self.logging_prefix = str(logging_prefix)

    def list_arguments(self):
        return ['data']

    def list_outputs(self):
        return ['output']

    def infer_shape(self, in_shape):
        # The output shape is identical to the input shape.
        data_shape = in_shape[0]
        output_shape = in_shape[0]
        return [data_shape], [output_shape], []

    def create_operator(self, ctx, shapes, dtypes):
        return IdentityOp(input_debug=self.input_debug,
                          grad_debug=self.grad_debug,
                          logging_prefix=self.logging_prefix)


def identity(data, name="identity", logging_prefix=None,
             input_debug=False, grad_debug=False):
    # Fall back to the symbol name if no explicit logging prefix is given.
    return mx.symbol.Custom(data=data,
                            name=name,
                            logging_prefix=logging_prefix if logging_prefix is not None else name,
                            input_debug=input_debug,
                            grad_debug=grad_debug,
                            op_type="identity")
We can insert such an "identity" operator into the network like this:
a = ...
a = identity(a)
b = mx.sym....(a)
You can control the inner behavior of the identity operator, e.g., print the input norm and gradient norm, or save the activations to disk.
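For instance, here is a rough sketch (layer names and sizes are arbitrary, not from your network) of splicing the identity op into a small net and dumping the intercepted activation to disk rather than just logging its norm:

# Sketch only: a tiny net with the debug identity op inserted after the first
# fully-connected layer.
data = mx.sym.Variable('data')
fc1 = mx.sym.FullyConnected(data=data, num_hidden=128, name='fc1')
act1 = mx.sym.Activation(data=fc1, act_type='relu', name='relu1')
act1 = identity(act1, name='relu1_debug', input_debug=True, grad_debug=True)
fc2 = mx.sym.FullyConnected(data=act1, num_hidden=10, name='fc2')
net = mx.sym.SoftmaxOutput(data=fc2, name='softmax')

# To write the activation to disk instead of (or in addition to) logging it,
# forward() in IdentityOp could do something like:
#     np.save("activation_%s.npy" % self.logging_prefix, in_data[0].asnumpy())
# and the saved array can then be rendered as an image, e.g. with matplotlib.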
Another way is to group the activation layer and the final loss function together, like the following:
mid_level_layer = ...
final_loss = ...mid_level_layer...
out = mx.sym.Group([final_loss, mx.sym.BlockGrad(mid_level_layer)])
We can then use a callback to save the second output.
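With the grouped symbol, the blocked-gradient output simply shows up as an extra entry in the module's outputs. A minimal sketch of reading it out each batch, assuming the Module API (train_iter, num_epochs and the label name are placeholders, not from your code):

# Bind the grouped symbol and read the second output (the BlockGrad'ed
# mid-level layer) after every forward pass.
mod = mx.mod.Module(symbol=out, data_names=['data'], label_names=['softmax_label'])
mod.bind(data_shapes=train_iter.provide_data,
         label_shapes=train_iter.provide_label)
mod.init_params()
mod.init_optimizer(optimizer='sgd')

for epoch in range(num_epochs):
    train_iter.reset()
    for nbatch, batch in enumerate(train_iter):
        mod.forward(batch, is_train=True)
        outputs = mod.get_outputs()   # [final_loss_output, mid_level_activation]
        mid_activation = outputs[1].asnumpy()
        np.save("mid_activation_e%d_b%d.npy" % (epoch, nbatch), mid_activation)
        mod.backward()
        mod.update()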
Thank you VERY much for your answers.
As I feared, it is a bit harder than expected, but your solution seems workable.
I was aware of weight monitoring, but the second solution seems more appropriate for my specific problem.
Hello @sxjscience, can you please explain in detail how to get the second output in the callback? I am new to MXNet and getting confused by the different approaches available. I have used mx.sym.Group([output_layer, fc_layer]), and I would like to get the output of fc_layer, which is an intermediate layer in my network.
Please help.