Pytorch_geometric: Is it possible that data.y is also edges, like edge_index

Created on 1 Sep 2019 · 4 Comments · Source: rusty1s/pytorch_geometric

❓ Questions & Help

Hi. I want to predict the connections between nodes, so the ground-truth labels are edges, shaped just like edge_index: [2, num_edges]. The problem is that different graphs have different numbers of ground-truth edges, so I cannot train on mini-batches. I know I could manually concatenate the ground-truth labels based on the batch attribute of the DataLoader, but I wonder whether there is a cleaner way to do it, e.g. creating an edge_index2 (same format as edge_index) to replace data.y.

Anyhow, this is a really great package that saves me a lot of time. Thank you so much! Really great work!

zhangfuyang

All 4 comments

When you name your ground-truth edges data.y_index with shape [2, num_edges], batching should work flawlessly for your idea. In fact, you can replace data.y with anything you like.

rusty1s on 1 Sep 2019

👍2
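To make the suggestion above concrete, here is a minimal plain-Python sketch of what batching is expected to do with an attribute such as y_index: the per-graph [2, num_edges] lists are joined along the edge dimension, and node ids are shifted by the number of nodes in the preceding graphs. The helper name `batch_index_attr` is hypothetical and only mimics the idea; it is not the library's implementation.

```python
def batch_index_attr(index_list, num_nodes_list):
    """Join [2, num_edges_i] index lists, shifting node ids graph by graph.

    Hypothetical helper for illustration only; not part of any library.
    """
    sources, targets = [], []
    offset = 0
    for (src, dst), num_nodes in zip(index_list, num_nodes_list):
        sources.extend(s + offset for s in src)
        targets.extend(d + offset for d in dst)
        offset += num_nodes  # the next graph's nodes start after this one's
    return [sources, targets]


# Two graphs with different numbers of ground-truth edges.
y_index_a = [[0, 1], [1, 2]]        # 2 edges, graph has 3 nodes
y_index_b = [[0, 0, 1], [1, 2, 3]]  # 3 edges, graph has 4 nodes

batched = batch_index_attr([y_index_a, y_index_b], [3, 4])
print(batched)
```

Because the join runs along the edge dimension, the differing edge counts (2 and 3 here) do not get in the way.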

Thanks! y_index works.
When you said data.y can be anything, you mean it can be anything as long as it has the same dimensions for every graph in the dataset, right? Otherwise, there will be a concatenation error when the ground truth is fed directly into data.y.

zhangfuyang on 1 Sep 2019
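The concatenation error mentioned here comes down to a shape rule: when tensors are concatenated along one dimension, all other dimensions must match. The sketch below checks that rule on bare shape tuples. It assumes that a plain data.y is joined along the first dimension while an index-like attribute is joined along the last one; `concat_shape` is a hypothetical helper, not a library function.

```python
def concat_shape(shapes, dim):
    """Return the shape after concatenating along `dim`, or raise ValueError.

    Hypothetical helper: it only reasons about shapes, no tensors involved.
    """
    result = list(shapes[0])
    dim = dim % len(result)  # allow negative dims such as -1
    total = 0
    for shape in shapes:
        shape = list(shape)
        # Every dimension except `dim` has to agree with the first shape.
        if shape[:dim] + shape[dim + 1:] != result[:dim] + result[dim + 1:]:
            raise ValueError(f"cannot concatenate {shapes} along dim {dim}")
        total += shape[dim]
    result[dim] = total
    return tuple(result)


shapes = [(2, 5), (2, 8)]  # two graphs with 5 and 8 ground-truth edges

last_dim = concat_shape(shapes, dim=-1)  # joined along the edge dimension

try:
    concat_shape(shapes, dim=0)
    first_dim_ok = True
except ValueError:
    first_dim_ok = False  # 5 != 8, so joining along dim 0 is rejected
```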

Hello @zhangfuyang,

Answer

In fact, Data is just a container; you can store anything you like in it.

See the following screenshots.

`data.y = ['cxk'] * 2333`

What is in the graph.

As the two screenshots show, it really can contain EVERYTHING YOU LIKE.

P.S. Source code of `torch_geometric.data.Data`

    def __init__(self, x=None, edge_index=None, edge_attr=None, y=None,
                 pos=None, norm=None, face=None, **kwargs):
        self.x = x
        self.edge_index = edge_index
        self.edge_attr = edge_attr
        self.y = y
        self.pos = pos
        self.norm = norm
        self.face = face
        for key, item in kwargs.items():
            if key == 'num_nodes':
                self.__num_nodes__ = item
            else:
                self[key] = item

        if torch_geometric.is_debug_enabled():
            self.debug()

I only show the __init__ method here. Note that it performs no dimension check. **kwargs accepts any extra attributes, for example train_idx, which marks the nodes used for training (the Cora dataset carries a similar attribute, train_mask).

`data.cxk = 666`
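The following is a minimal sketch of the container behaviour described above, written as a hypothetical stand-in class (`MiniData`) rather than the real `torch_geometric.data.Data`. It assumes, for illustration only, that the node count falls back to the length of x when num_nodes is not passed explicitly.

```python
class MiniData:
    """Hypothetical stand-in for a graph container; not the library class."""

    def __init__(self, x=None, edge_index=None, y=None, **kwargs):
        self.x = x
        self.edge_index = edge_index
        self.y = y
        self._num_nodes = None
        for key, item in kwargs.items():
            if key == 'num_nodes':
                # num_nodes is kept apart from the ordinary attributes
                self._num_nodes = item
            else:
                self[key] = item  # any other keyword becomes an attribute

    def __setitem__(self, key, value):
        setattr(self, key, value)

    @property
    def num_nodes(self):
        # Assumption of this sketch: fall back to len(x) if not given.
        if self._num_nodes is not None:
            return self._num_nodes
        return len(self.x) if self.x is not None else None


# No dimension check happens: y can hold anything, and extras are accepted.
data = MiniData(x=[[0.0], [1.0], [2.0]], y=['cxk'] * 2333, train_idx=[0, 2])
data.cxk = 666

explicit = MiniData(num_nodes=10)
```

In this sketch num_nodes is a read-only property, which is one way to keep a derived value from being overwritten by accident.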

DO NOT TRY TO OVERWRITE `data.num_nodes`!

Sorry, but data.num_nodes is an IMPORTANT ATTRIBUTE, as the source code of torch_geometric.data.Data shows.

An example of overwriting it:

Conclusion

If you just want to try it FOR FUN, the code above shows some of the possible ways to use it. However, if you want to do real work with it (especially in debug mode), CHECK IT CAREFULLY!

yours sincerely
@wmf1997

WMF1997 on 2 Sep 2019

👍1

Wow! Thank you so much WMF1997.

zhangfuyang on 2 Sep 2019
