We want to support custom backward functions for methods on new tensor types such as DifferentialPrivacyTensor. This requires being able to run the backward pass in Python, because PyTorch hands backward off to C++ immediately, which bypasses our custom logic.
Note: while DP will be an early user of this project, all work should be decoupled from DP, as we'll be using it for other areas as well (like encrypted training).
I've been exploring this. I think the best option here is to use the grad_fn objects that torch operations attach to their outputs. For example:
import torch
x = torch.tensor([1., 2., 3.], requires_grad=True)
y = x**2
z = y.mean()
grad_z_y = z.grad_fn(torch.ones_like(z)) # gradient of z w.r.t. y
# grad_z_y == tensor([0.3333, 0.3333, 0.3333])
grad_y_x = y.grad_fn(grad_z_y) # gradient of z w.r.t. x (chain rule through y)
# grad_y_x == tensor([0.6667, 1.3333, 2.0000])
# accumulate into x.grad instead of overwriting it
x.grad = grad_y_x.detach() if x.grad is None else x.grad + grad_y_x.detach()
# now we should be able to run the optimizer step
If I can figure out how to go backwards through the graph, I should be able to use .grad_fn to calculate all the necessary gradients.
@iamtrask Would having the outputs of .grad_fn be good enough for DP?
Progress! You can get the graph using .grad_fn.next_functions. Let's see if I can hack something out today
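To make the idea concrete, here is a rough sketch of walking the graph in Python through .grad_fn.next_functions and calling each node by hand. The name manual_backward is hypothetical, not something that exists in PyTorch or PySyft. It assumes every op in the graph produces a single output, and it follows each path separately, so shared nodes are called more than once (correct because backward is linear, but not efficient).

```python
import torch

def manual_backward(output, grad_output=None):
    """Hypothetical helper: run the backward pass in Python, node by node."""
    if grad_output is None:
        grad_output = torch.ones_like(output)
    # Each stack entry is (graph node, gradient flowing into that node).
    stack = [(output.grad_fn, grad_output)]
    with torch.no_grad():  # do not build a graph for the gradients themselves
        while stack:
            node, grad = stack.pop()
            if node is None:
                continue
            if hasattr(node, "variable"):
                # AccumulateGrad node: it wraps a leaf tensor, so accumulate there.
                leaf = node.variable
                leaf.grad = grad.clone() if leaf.grad is None else leaf.grad + grad
                continue
            # Custom per-op logic could be hooked in here before/after the call.
            grads = node(grad)
            if not isinstance(grads, tuple):
                grads = (grads,)
            # next_functions holds one (node, input_index) pair per op input.
            for (next_node, _), g in zip(node.next_functions, grads):
                if next_node is not None and g is not None:
                    stack.append((next_node, g))

x = torch.tensor([1., 2., 3.], requires_grad=True)
z = (x ** 2).mean()
manual_backward(z)
print(x.grad)  # same values as grad_y_x in the example above
```

Since every node call happens in Python, this is the place where a custom tensor type could substitute its own backward logic. Ops with multiple outputs would need the input index from next_functions, which this sketch ignores.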
Pull request here: https://github.com/OpenMined/PySyft/pull/1942