Pytorch_geometric: When num_hyperedges is greater than num_nodes, HypergraphConv cannot work

Created on 6 Nov 2020 · 4Comments · Source: rusty1s/pytorch_geometric

🐛 Bug

When num_hyperedgesM is greater than num_nodesN, HypergraphConv cannot work.

To Reproduce

Steps to reproduce the behavior:
When I add hyperedge 4 in https://github.com/rusty1s/pytorch_geometric/blob/master/test/nn/conv/test_hypergraph_conv.py , the program crashes.


import torch
from torch_geometric.nn import HypergraphConv

def test_hypergraph_conv():
    in_channels, out_channels = (16, 32)
    hyperedge_index = torch.tensor([[0, 0, 1, 1, 2, 3, 3, 3, 2, 1 ,2], [0, 1, 2, 1, 2, 1, 0, 3, 3, 4, 4]])
    num_nodes = hyperedge_index[0].max().item() + 1
    x = torch.randn((num_nodes, in_channels))

    conv = HypergraphConv(in_channels, out_channels)
    assert conv.__repr__() == 'HypergraphConv(16, 32)'
    out = conv(x, hyperedge_index)
    assert out.size() == (num_nodes, out_channels)


    conv = HypergraphConv(in_channels, out_channels, use_attention=True,
                          heads=2)
    out = conv(x, hyperedge_index)
    assert out.size() == (num_nodes, 2 * out_channels)

    print("finish")

if __name__ == "__main__":
    test_hypergraph_conv()

log:

Traceback (most recent call last):
  File "test.py", line 24, in <module>
    test_hypergraph_conv()
  File "test.py", line 12, in test_hypergraph_conv
    out = conv(x, hyperedge_index)
  File "/home/user/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/user/anaconda3/lib/python3.8/site-packages/torch_geometric/nn/conv/hypergraph_conv.py", line 126, in forward
    out = self.propagate(hyperedge_index, x=x, norm=B, alpha=alpha)
  File "/home/user/anaconda3/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 252, in propagate
    out = self.aggregate(out, **aggr_kwargs)
  File "/home/user/anaconda3/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 286, in aggregate
    return scatter(inputs, index, dim=self.node_dim, dim_size=dim_size,
  File "/home/user/anaconda3/lib/python3.8/site-packages/torch_scatter/scatter.py", line 152, in scatter
    return scatter_sum(src, index, dim, out, dim_size)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/home/user/anaconda3/lib/python3.8/site-packages/torch_scatter/scatter.py", line 22, in scatter_sum
            size[dim] = int(index.max()) + 1
        out = torch.zeros(size, dtype=src.dtype, device=src.device)
        return out.scatter_add_(dim, index, src)
               ~~~~~~~~~~~~~~~~ <--- HERE
    else:
        return out.scatter_add_(dim, index, src)
RuntimeError: index 4 is out of bounds for dimension 0 with size 4

When removing hyperedge 4, the program works perfectly.

Expected behavior

HypergraphConv can handle the incidence matrix \mathbf{H} \in {\{ 0, 1 \}}^{N \times M} whose M is greater than N.

Environment

OS: Linux
Python version: 3.8
PyTorch version: 1.7.0
CUDA/cuDNN version: 11.0
GCC version: 5.5

Additional context

bug

Source

liulixinkerry

Most helpful comment

Thanks for reporting. The HyperGraphConv is indeed a little bugged at the moment. I will try to fix it.

rusty1s on 6 Nov 2020

👍2

All 4 comments

Thanks for reporting. The HyperGraphConv is indeed a little bugged at the moment. I will try to fix it.

rusty1s on 6 Nov 2020

👍2

@rusty1s Could you tell, please, if there are any coming fixes? I have also bumped into this problem.

pilot7747 on 20 Nov 2020

It's still on my TODO (see https://github.com/rusty1s/pytorch_geometric/issues/1465), but feel free to propose a fix if you have time.

rusty1s on 22 Nov 2020

Hey Matthias,
Happy new year to you!

Correct me if I misunderstood something:
The hypergaph convolution works by introducing a vertex for each hyperedge, then propagating the signal from each vertex to those and then propagating and distributing that signal from the help vertices among all nodes that are connected by each hyperedge. This is actually a quite neat solution.

However, you never introduce fresh nodes for each hyperedge, but rather reuse the existing ones, which should be mathematically correct, but leads to the issue described above. Thus, I think a simple zero padding for the node_features \in \mathbb{R}^{num_nodes \times num_features} to \mathbb{R}^{max{num_nodes, num_hyperedges} \times num_features} in line 126 of HypergraphConv, i.e. from

self.flow = 'source_to_target'
out = self.propagate(hyperedge_index, x=x, norm=B, alpha=alpha)
self.flow = 'target_to_source'
out = self.propagate(hyperedge_index, x=out, norm=D, alpha=alpha)

x_help = torch.zeros((max([x.size(0), num_edges]), x.size(1))) # create help matrix of shape defined above
x_help[:x.size(0), :x.size(1)] = x  #write node_features to that help matrix

self.flow = 'source_to_target'
out = self.propagate(hyperedge_index, x=x_help, norm=B, alpha=alpha)
self.flow = 'target_to_source'
out = self.propagate(hyperedge_index, x=out, norm=D, alpha=alpha)

(which that rather hacky implementation for simple padding)

should seal the deal. Or introduce num_hyperedges new vertices every time and alter edge_index by edge_index[1,:] += num_nodes.

If I misunderstood anything, please feel free to correct me! :)