Hi, I have some quick questions about Data() and graphs, as follows.
Node indices should always range from 0 to num_nodes - 1, and an edge (i, j) should align with the node indices in x: That is, x[i] should give you the features of the source node, while x[j] should give you the features of the destination node.
Edit: For getting the node id, you can also pass in an additional attribute to data, e.g.:
data.n_id = torch.arange(num_nodes)
Hope this clarifies your issues :)
Thanks for the quick reply!
So the num_nodes of Data is calculated from the unique indices within edge_index?
If not explicitly set, num_nodes will be calculated via x.size(0). In general, you cannot rely on num_nodes == edge_index.max() + 1 because of isolated nodes.
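To make the isolated-node caveat concrete, here is a minimal sketch with plain torch tensors. The tiny 4-node graph and the feature width of 2 are made up for illustration; the same x and edge_index could be passed to Data(x=x, edge_index=edge_index).

```python
import torch

# Hypothetical toy graph: edges 0->1 and 1->2; node 3 exists but has no edges.
edge_index = torch.tensor([[0, 1],
                           [1, 2]], dtype=torch.long)

# One feature row per node, including the isolated node 3.
x = torch.zeros(4, 2)

num_nodes_from_x = x.size(0)                       # counts every node
num_nodes_from_edges = int(edge_index.max()) + 1   # misses isolated node 3

print(num_nodes_from_x, num_nodes_from_edges)  # 4 3
```

The two counts disagree as soon as the highest-numbered node has no edges, which is why x.size(0) (or an explicitly set num_nodes) is the safer source.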
https://github.com/rusty1s/pytorch_geometric/issues/1391#issue-649149066
So if I want to build a Data() from the above example edge_index, I should reorder the edge_index?
What do you mean by re-ordering? As far as I can see, there's no reason to modify it in the first place.
https://github.com/rusty1s/pytorch_geometric/issues/1580#issuecomment-682341772
Oh! I misunderstood the meaning here. I thought it meant that I need to sort the edge_index.
For example, if I have a graph with [[0, 1], [1, 4], [1, 5], [1, 6], [0, 2], [2, 7], [0, 3], [3, 8], [3, 9], [3, 10], [3, 11]], then I need to reorder this to [[0, 1], [0, 2], [0, 3], [1, 4], [1, 5], [1, 6], [2, 7], [3, 8], [3, 9], [3, 10], [3, 11]].
For the above example (original order), I have 12 nodes, so I have to give x a length of 12?
If my x is torch.arange(12), does this mean that node 0 will get x=0, node 1 will get x=1, node 4 will get x=2, and so on?
(1) edge_index does not need to be sorted.
(2) Yes, exactly :)
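As a rough check that the order of edges does not matter, the sketch below runs a simple sum over incoming neighbors on the edge list as posted and on a sorted copy. The sum_incoming helper is a hypothetical stand-in for message passing, not PyG's actual implementation.

```python
import torch

# Edge list exactly as posted (unsorted), plus a sorted copy of the same edges.
edges = [[0, 1], [1, 4], [1, 5], [1, 6], [0, 2], [2, 7],
         [0, 3], [3, 8], [3, 9], [3, 10], [3, 11]]
edge_index = torch.tensor(edges, dtype=torch.long).t().contiguous()
edge_index_sorted = torch.tensor(sorted(edges), dtype=torch.long).t().contiguous()

x = torch.arange(12, dtype=torch.float)  # one scalar feature per node


def sum_incoming(x, edge_index):
    """Sum the source-node features arriving at each destination node."""
    src, dst = edge_index
    out = torch.zeros_like(x)
    out.index_add_(0, dst, x[src])  # features are looked up by node index
    return out


agg_original = sum_incoming(x, edge_index)
agg_sorted = sum_incoming(x, edge_index_sorted)
print(torch.equal(agg_original, agg_sorted))  # True
```

Both orderings describe the same set of edges, so any aggregation that looks features up by node index gives the same result.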
Got it!
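import numpy as np
import torch
from torch_geometric.loader import NeighborSampler  # older releases: torch_geometric.data

# Edge list as [source, target] pairs, transposed to shape [2, num_edges].
l = np.array([[0, 1], [1, 4], [1, 5], [1, 6], [0, 2], [2, 7], [0, 3], [3, 8], [3, 9], [3, 10], [3, 11]]).transpose()
# Swap the two rows so that every edge is reversed.
r = np.array([l[1, :], l[0, :]])
edge_index = torch.from_numpy(r)
train_loader = NeighborSampler(edge_index,
                               sizes=[2, 2], batch_size=1)
for batch_size, n_id, adjs in train_loader:
    print("###")
    for edge_ind, e_id, size in adjs:
        print(e_id)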
For the above code, is there a more detailed description of the batch_size and n_id returned by the NeighborSampler?
And of the edge_ind, e_id, and size returned in adjs?
Sorry to bother you; I have found the documentation about it.
An item returned by NeighborSampler holds the current batch_size, the IDs n_id of all nodes involved in the computation, and a list of bipartite graph objects via the tuple (edge_index, e_id, size), where edge_index represents the bipartite edges between source and target nodes, e_id denotes the IDs of original edges in the full graph, and size holds the shape of the bipartite graph.
If I have isolated nodes in the graph, how should I assign their features in x?
They won't appear in the edge_index.
Or do I only need to assign features to the nodes that aren't isolated and set num_nodes manually?
Isolated nodes are still nodes, so they should have the same semantic node features as all the remaining nodes. If they aren't important for your computation graph, feel free to set them to zero.
That said, suppose I have a graph with the example mentioned above: [[0, 1], [1, 4], [1, 5], [1, 6], [0, 2], [2, 7], [0, 3], [3, 8], [3, 9], [3, 10], [3, 11]].
If I have an isolated node called 12, how should I assign a feature to it in x?
And what if the isolated node is not the starting index (0) or the ending index (max_node)?
You can still add node features for isolated nodes, i.e. to add node 12 as an isolated node, you simply use a node feature matrix of shape [13, num_features], where x[12] corresponds to the features of node 12.
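A minimal sketch of that setup, assuming a hypothetical num_features of 4 and placeholder feature values. It also shows one way to find isolated nodes (those that never appear in edge_index) and zero their rows, as suggested earlier.

```python
import torch

edges = [[0, 1], [1, 4], [1, 5], [1, 6], [0, 2], [2, 7],
         [0, 3], [3, 8], [3, 9], [3, 10], [3, 11]]
edge_index = torch.tensor(edges, dtype=torch.long).t().contiguous()

num_nodes = 13      # nodes 0..11 from the edges, plus isolated node 12
num_features = 4    # hypothetical feature width
x = torch.ones(num_nodes, num_features)  # placeholder features, row i = node i

# Mark every node that appears as a source or destination of some edge.
has_edge = torch.zeros(num_nodes, dtype=torch.bool)
has_edge[edge_index.reshape(-1)] = True
isolated = (~has_edge).nonzero(as_tuple=True)[0]
print(isolated.tolist())  # [12]

# Optional: zero the features of isolated nodes if they carry no information.
x[isolated] = 0.0
```

The mask works for an isolated node at any index, not only at the end of the range, because it only checks whether an index ever occurs in edge_index.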
Suppose 4 is the isolated node, with [[0, 1], [1, 5], [1, 6], [0, 2], [2, 7], [0, 3], [3, 8], [3, 9], [3, 10], [3, 11]],
and I have x = torch.arange(12). The previous discussion suggests that I will have x=0 for node 0, x=1 for node 1, x=2 for node 5, x=3 for node 6, x=4 for node 2, and so on.
In this case, x[4] won't be the feature of the isolated node 4?
x[4] denotes the feature of the isolated node 4. That is true for all nodes, i.e.: node i holds its features in x[i].
So when I have x = torch.arange(12) as features, it means that node i in edge_index has feature x[i]?
Edge [1, 5] would get x[1] and x[5] as the features for node 1 and node 5.
Yes :)
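For reference, the lookup can be checked with plain torch indexing. This is only a sketch of the indexing convention, using the edge list with node 4 isolated from the question above.

```python
import torch

# Edge list from the question above; node 4 has no edges.
edges = [[0, 1], [1, 5], [1, 6], [0, 2], [2, 7],
         [0, 3], [3, 8], [3, 9], [3, 10], [3, 11]]
edge_index = torch.tensor(edges, dtype=torch.long).t().contiguous()
x = torch.arange(12)  # node i has feature x[i]

src, dst = edge_index  # row 0: source nodes, row 1: destination nodes

# The second edge is [1, 5]; its endpoints read their features by node index,
# not by the position at which the node first shows up in edge_index.
src_feature = x[src[1]].item()
dst_feature = x[dst[1]].item()
print(src_feature, dst_feature)  # 1 5

# Node 4 never appears in edge_index, yet x[4] is still its feature.
node4_has_edges = bool((edge_index == 4).any())
print(node4_has_edges, x[4].item())  # False 4
```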