Is there a way to convert an adjacency tensor produced by an MLP into a Data object while still allowing backprop in a generative adversarial network? The GAN has an MLP generator and a pytorch_geometric-based GNN as the discriminator. I have not been able to find an answer to this question yet. Here is a simplified example of the problem.
Say I have this MLP generator:
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(3, 6)

    def forward(self, z):
        return torch.tanh(self.fc1(z))

gen = Generator()
output = gen(torch.randn(3))
# output = tensor([ 0.2085, -0.0576, 0.4957, -0.6059, 0.2571, -0.2866], grad_fn=<TanhBackward>)
So, this generator returns a vector representing a graph with two nodes, which we can reshape to form an adjacency matrix and a node feature vector.
adj = output[:4].view(2, 2)
# adj = tensor([[-0.5811, 0.0070],
#               [ 0.3754, -0.2587]], grad_fn=<ViewBackward>)
node_features = output[4:].view(2, 1)
# node_features = tensor([[0.1591],
#                         [0.0821]], grad_fn=<ViewBackward>)
Now, to convert this to a pytorch_geometric Data object, we must construct the edge list in COO format (the x parameter of the Data object is simply node_features). However, if we loop through the adj matrix and append each connection to a COO list with the code below, backpropagation does not work from the pytorch_geometric GNN to the PyTorch MLP.
coo = [[], []]
for i in range(len(adj)):
    for j in range(len(adj[i])):
        # for our purposes, say there is an edge if the value is > 0
        if adj[i][j] > 0:
            coo[0].append(i)
            coo[1].append(j)
We can now construct the Data object like so:
from torch_geometric.data import Data
d = Data(x=node_features, edge_index=torch.LongTensor(coo))
However, when training the GAN by converting the generator output to a Data object for the GNN discriminator, backpropagation and optimization do not work (I assume because the grad_fn and grad properties are lost). Does anyone know how to convert a tensor to a pytorch_geometric Data object while still allowing backprop, given an MLP generator that outputs an adjacency matrix/tensor and node features, and a pytorch_geometric-based GNN discriminator that takes a Data object as input?
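To make the suspected cause concrete, here is a minimal check using only plain PyTorch. It is a sketch: the random `adj` is just a stand-in for the reshaped generator output, not the real model. It shows that the thresholded mask and the integer index tensor built from it are not attached to the autograd graph, even though `adj` itself is.

```python
import torch

torch.manual_seed(0)

# Stand-in for the generator output (assumption: any tensor with a grad_fn behaves the same).
adj = torch.tanh(torch.randn(2, 2, requires_grad=True))

mask = adj > 0                    # boolean comparison, not differentiable
edge_index = mask.nonzero().t()   # integer COO indices, shape [2, num_edges]

print(adj.requires_grad)          # True
print(mask.requires_grad)         # False
print(edge_index.requires_grad)   # False
```

Since only integer indices reach the discriminator, there is nothing for the gradient to flow back through on the adjacency side.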
It is correct that you lose gradients that way. In order to backpropagate through sparse matrices, you need to compute both edge_index and edge_weight (the first one holding the COO index and the second one holding the value for each edge). This way, gradients flow from edge_weight to your dense adjacency matrix.
In code, this would look as follows:
edge_index = (adj > 0).nonzero().t()
row, col = edge_index
edge_weight = adj[row, col]
self.conv(x, edge_index, edge_weight)
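For a quick end-to-end check of the snippet above, here is a minimal sketch that runs with plain PyTorch only. The function `weighted_sum_conv` is a hypothetical stand-in for `self.conv`, not a pytorch_geometric layer; it just computes a weighted sum of neighbor features so that the gradient path through `edge_weight` can be inspected.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(3, 6)

    def forward(self, z):
        return torch.tanh(self.fc1(z))

def weighted_sum_conv(x, edge_index, edge_weight):
    # Hypothetical stand-in for a GNN layer: out[i] = sum_j w_ij * x[j].
    row, col = edge_index
    messages = edge_weight.unsqueeze(-1) * x[col]
    return torch.zeros_like(x).index_add(0, row, messages)

gen = Generator()
output = gen(torch.randn(3))
adj = output[:4].view(2, 2)
x = output[4:].view(2, 1)

# Indices carry no gradient; the gathered values do.
edge_index = (adj > 0).nonzero().t()
row, col = edge_index
edge_weight = adj[row, col]

out = weighted_sum_conv(x, edge_index, edge_weight)
loss = out.sum()
loss.backward()

print(edge_weight.requires_grad)          # True
print(gen.fc1.weight.grad is not None)    # True
```

With pytorch_geometric, the same `edge_weight` could be stored on the Data object (for instance as `edge_attr`) and passed to a layer that accepts edge weights, such as `GCNConv`. Note that entries of `adj` that fall below the threshold are dropped entirely, so they receive no gradient through `edge_weight`.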