Neo: When one side connect actively, the other side will always keep trying a new connection

Created on 27 Sep 2019  路  14Comments  路  Source: neo-project/neo

Describe the bug

If neo node A and B both listen port 40333, and A connected B firstly. At this time, B will always try to create a new connection from B to A with timer.
But when the new connection created, and B sent the VersionPlayload to A, A will close the connection at once, as A have connected B.

https://github.com/neo-project/neo/blob/c4023a48567eea8a82967ad91e436c42b5552511/neo/Network/P2P/RemoteNode.cs#L211-L216

Assumption: The number of connected nodes is less than MinDesiredConnections.

To Reproduce
Steps to reproduce the behavior:

Evn.: 4 four nodes in different hosts.

  1. add log in RemoteNode
 public RemoteNode(NeoSystem system, object connection, IPEndPoint remote, IPEndPoint local)
            : base(connection, remote, local)
        {
            System.Console.WriteLine("add RemoteNode " + remote.ToString() + "  remotes: " + LocalNode.Singleton.RemoteNodes.Count);  // <---- here

            this.system = system;
            this.protocol = Context.ActorOf(ProtocolHandler.Props(system));
            LocalNode.Singleton.RemoteNodes.TryAdd(Self, this);
         ...............................

add log in RemoteNode.PostStop

        protected override void PostStop()
        {
            LocalNode.Singleton.RemoteNodes.TryRemove(Self, out _);
            base.PostStop();

            System.Console.WriteLine("remove RemoteNode " + Remote.ToString() + "  remotes: " + LocalNode.Singleton.RemoteNodes.Count);  // <------- 
        }

add log in Peer.OnTimer

 private void OnTimer()
        {
            if (ConnectedPeerActors.Count >= MinDesiredConnections) return;
            if (UnconnectedPeers.Count == 0)
                NeedMorePeers(MinDesiredConnections - ConnectedPeerActors.Count);
            foreach(IPEndPoint unconnected in UnconnectedPeers)
            {
                Console.WriteLine("--- unconnected " + unconnected.ToString());  // <--- here
            }
            foreach (IPEndPoint connected in ConnectedPeers.Values)
            {
                Console.WriteLine("--- connected " + connected.ToString());   // <--- here
            }

            IPEndPoint[] endpoints = UnconnectedPeers.Take(MinDesiredConnections - ConnectedPeerActors.Count).ToArray();
            ImmutableInterlocked.Update(ref UnconnectedPeers, p => p.Except(endpoints));
            foreach (IPEndPoint endpoint in endpoints.Except(ConnectedPeers))
            {
                Console.WriteLine("start to connect " + endpoint.ToString() + " ..."); // <--- here
                ConnectToPeer(endpoint);
            }
        }
  1. setup a privatenet and watch the logs

Expected behavior

When A connected B, B dose not need to create a new connection between B and A unless the original connection crashed.

Screenshots

image

B's ip is 192.168.1.147, and A's ip is 192.168.1.149. A connected B firstly. As you can see, B will try to create a connection with timer.

Platform:

  • OS: Centos 7.0
  • Version neo-cli 3.0.0 preview1

How to solve?

The reason is that A dosen't use the listener port 40333 but other port to connect B, but B will treat the {A's ip}:40333 as a new node, and try to connect it.

When A connect B, B must also add the {A's ip}:{A's listener port} in its connected peers.

Most helpful comment

Another way is to add a ConnectFailedPeers list?

All 14 comments

It is great to see you are investigating this. I will take a carefull look.

fixed by #1124

I think this is as expected. Why do we want to allow duplicate connections?

I mean, there is not need for B to connect A again, when A has connected B.
Currently, the problem is that B is always trying to connect A.

If neo node A and B both listen port 40333, and A connected B firstly. At this time, B will always try to create a new connection from B to A with timer.
But when the new connection created, and B sent the VersionPlayload to A, A will close the connection at once, as A have connected B.

I think that we should use Reject message is certain cases, and this is one of them.

Why A don't send a Reject message to say B, stop do this?
https://github.com/neo-project/neo/blob/c4023a48567eea8a82967ad91e436c42b5552511/neo/Network/P2P/MessageCommand.cs#L26

Reject is for transactions or blocks.

I think there is no need to send a Reject message, as B has already knew A's listener port, which is contained in A's VersionPayload.

Why we don't use unique Id Guid instead of Ports or nonce?

I am configuring the scenario and the text is wrong @Tommo-L

foreach (IPEndPoint connected in ConnectedPeers)
            {
                Console.WriteLine("--- connected " + connected.ToString());   // <--- here
            }

should be

foreach (IPEndPoint connected in ConnectedPeers.Values)
            {
                Console.WriteLine("--- connected " + connected.ToString());   // <--- here
            }

This is my log

image

Two nodes aren't enough, you need 3 nodes at least. In my environment, I have 4 nodes in different hosts.

Why two nodes aren't enough, the work flow as following:
Step1: A ---- connected ----> B
Step2: B timeout#1, B will try to connect A, but A will close it at once. And B will also remove A from UnconnectedPeers.
https://github.com/neo-project/neo/blob/c4023a48567eea8a82967ad91e436c42b5552511/neo/Network/P2P/Peer.cs#L229-L230
Step3: B timeout#2, but this time, UnconnectedPeers is empty, then B will broadcast GetAddr message to get new unconnected peers.

https://github.com/neo-project/neo/blob/c4023a48567eea8a82967ad91e436c42b5552511/neo/Network/P2P/Peer.cs#L224-L228

https://github.com/neo-project/neo/blob/c4023a48567eea8a82967ad91e436c42b5552511/neo/Network/P2P/LocalNode.cs#L124-L130

Step4: However, when A received GetAddr message, A will send the addresses of connected nodes, which is the address of B.

https://github.com/neo-project/neo/blob/c4023a48567eea8a82967ad91e436c42b5552511/neo/Network/P2P/ProtocolHandler.cs#L145-L155

Step5: B received the address list from A, it's only the address of B. And it will be filter by B.

https://github.com/neo-project/neo/blob/c4023a48567eea8a82967ad91e436c42b5552511/neo/Network/P2P/ProtocolHandler.cs#L119-L125

https://github.com/neo-project/neo/blob/c4023a48567eea8a82967ad91e436c42b5552511/neo/Network/P2P/Peer.cs#L65-L72

Finally, the UnconnectedPeers of B is still empty. This is why two nodes works fine.

Another way is to add a ConnectFailedPeers list?

I think that is better with ConnectFailedPeers every 10 minutes we can purge this cache.

fixed in #1154, plz review it

Was this page helpful?
0 / 5 - 0 ratings

Related issues

vncoelho picture vncoelho  路  4Comments

doubiliu picture doubiliu  路  3Comments

realloc picture realloc  路  4Comments

igormcoelho picture igormcoelho  路  4Comments

csmuller picture csmuller  路  3Comments