Even in local private networks, seed nodes are not able to follow the blocks relayed by CNs.
This is a critical issue that needs to be tracked.
An effort should be made to improve the TaskManager and ProtocolHandler classes.
I am still planning to work on this during the next 3 weeks.
This issue is probably a duplicate of #542.
Let's try to unify and list all related open issues and ideas; there are also #522 and #366.
We should identify a line of future directions for improving the P2P layer.
Also, I have one simple change that improves this, which I may put out as a PR first. If you are seeing seed nodes not following in a private network while running fast blocks, you will need to change the code that relays only the last 2 blocks when receiving multiple blocks so that it relays something like the last 10 instead.
Just change this line to:
if (blocksPersisted++ < blocksToPersistList.Count - 10) continue;
This doesn't really matter when running slower blocks or in a bigger network. In a really small private network that runs blocks at high speed, you do need it.
Let's create a formula based on seconds in order to make it more generic.
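To make the idea concrete, here is a rough sketch of what such a formula could look like. It is only an assumption for discussion, not what #621 or the neo codebase actually does: `RelayWindow`, `BlocksToRelay`, the 30-second window and the upper cap are all hypothetical names and values. The idea is to relay enough trailing blocks to cover a fixed span of time instead of a fixed count.

```csharp
using System;

// Hypothetical helper for discussion; not part of the neo codebase.
public static class RelayWindow
{
    // Lower bound: the current behavior relays the last 2 blocks.
    public const int MinBlocks = 2;

    // Upper bound (assumed value) so very fast blocks do not cause a relay storm.
    public const int MaxBlocks = 500;

    // Number of most recently persisted blocks to relay so that roughly
    // `windowSeconds` of chain time is covered at the given block time.
    public static int BlocksToRelay(double secondsPerBlock, double windowSeconds)
    {
        if (secondsPerBlock <= 0)
            throw new ArgumentOutOfRangeException(nameof(secondsPerBlock));
        if (windowSeconds < 0)
            throw new ArgumentOutOfRangeException(nameof(windowSeconds));

        double raw = Math.Ceiling(windowSeconds / secondsPerBlock);
        // Clamp before casting so the conversion to int is always in range.
        return (int)Math.Min(Math.Max(raw, MinBlocks), MaxBlocks);
    }
}
```

With a 30-second window this gives 2 blocks at a 15-second block time and 10 blocks at a 3-second block time. The constant in the line above would then become something like `blocksToPersistList.Count - RelayWindow.BlocksToRelay(secondsPerBlock, 30)`, where `secondsPerBlock` stands for whatever the configured block time is.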
Another thing is StartHeight, as we were just discussing in that other thread.
@vncoelho Sounds good. Maybe you could create the PR that uses a formula for this based on block time and I will review it.
I'm pretty sure that will fix your seed node lagging issue, even without needing #522. I already tested it locally when I was running fast blocks and it fixed the issue for me.
A draft was created here: https://github.com/neo-project/neo/pull/621
Take a look and feel free to adjust.
Jeff, even with PR 621 we will not really solve the whole issue, because once a node is lagging we still have problems getting the blocks that it previously missed.
The PR will help, but it will not really fix the block request communication procedure.
For TaskManager, I will propose a simpler modification first, to ensure that a node is able to request missed blocks from a node it has already been connected to for some time. We can adjust it so that a node knows the current height of all its connected peers instead of just their starting height. This wouldn't be necessary if it were guaranteed that a peer would never skip sending inv messages, but since the code doesn't currently have that guarantee, it will help in that rare scenario. One way to do it would be to have the response to inv messages contain the responder's node height.
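As a rough illustration of the bookkeeping this implies, here is a minimal sketch. Everything in it is an assumption for discussion: `PeerHeightTracker` and its methods are hypothetical and do not exist in neo, and peers are identified by a plain string. It records the start height at connection time, moves it forward whenever a peer reports a newer height, and answers which peers should be able to serve a given block.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the actual TaskManager implementation.
public sealed class PeerHeightTracker
{
    private readonly Dictionary<string, uint> lastKnownHeight = new Dictionary<string, uint>();

    // Called once per connection with the start height from the handshake.
    public void OnConnected(string peerId, uint startHeight)
    {
        lastKnownHeight[peerId] = startHeight;
    }

    // Called whenever a connected peer reports its height again
    // (for example in a response to an inv message). Heights only move forward,
    // and reports from unknown peers are ignored.
    public void OnHeightReported(string peerId, uint height)
    {
        if (lastKnownHeight.TryGetValue(peerId, out uint known) && height > known)
            lastKnownHeight[peerId] = height;
    }

    public void OnDisconnected(string peerId)
    {
        lastKnownHeight.Remove(peerId);
    }

    // Peers whose last known height is at least the index of the wanted block.
    public IReadOnlyList<string> PeersAtOrAbove(uint blockIndex)
    {
        return lastKnownHeight
            .Where(p => p.Value >= blockIndex)
            .Select(p => p.Key)
            .ToList();
    }
}
```

With only the start height, a peer that connected at height 100 would never be asked for block 150, even after it has synced far past it. Tracking the latest reported height removes that blind spot.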
@jsolman did you have time to progress on this?
I made a PR for a small mechanism for updating heights: #673. What is equally useful is being able to request data a second time (in case it was corrupt, got lost, or whatever) without having to reconnect; that seems like a must to me. I think there should be a minimum interval on requesting the same data, not a hard deny. This timeout can be as simple as a periodic task that cleans up the list of historically requested hashes. It also prevents the historical hash list from growing potentially huge (if a client syncs from 0 and never disconnects).
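A minimal sketch of that idea, to show how little is needed. This is an assumption for discussion and not code from #673 or from neo: `RequestedHashes`, `TryRequest` and `Cleanup` are hypothetical names, hashes are plain strings, and the caller passes in the current time so the behavior is easy to test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not taken from the neo codebase.
public sealed class RequestedHashes
{
    private readonly Dictionary<string, DateTime> lastRequested = new Dictionary<string, DateTime>();
    private readonly TimeSpan minInterval;

    public RequestedHashes(TimeSpan minInterval)
    {
        this.minInterval = minInterval;
    }

    public int Count => lastRequested.Count;

    // Returns true and records the request if the hash was never requested,
    // or if the minimum interval since the last request has elapsed.
    public bool TryRequest(string hash, DateTime now)
    {
        if (lastRequested.TryGetValue(hash, out DateTime last) && now - last < minInterval)
            return false;
        lastRequested[hash] = now;
        return true;
    }

    // Intended to be called from a periodic task: drops every entry that is
    // old enough to be requested again and returns how many were removed.
    public int Cleanup(DateTime now)
    {
        var expired = lastRequested
            .Where(p => now - p.Value >= minInterval)
            .Select(p => p.Key)
            .ToList();
        foreach (var hash in expired)
            lastRequested.Remove(hash);
        return expired.Count;
    }
}
```

Because an entry becomes eligible for removal at the same moment it becomes eligible for a new request, the periodic cleanup never changes what `TryRequest` would answer; it only bounds the memory used.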
I have not started work for #542 yet.
@vncoelho Have you still observed nodes getting behind in any of your private nets for reasons other than the default max connections of 3?
Is this currently happening?
Yes, we still need to optimize connection/disconnection and blocks requests.
Is this already solved?
@shargon, perhaps it has been solved.
Over the last month we have monitored our nodes, and the behavior was resolved by the improvements to the P2P layer.
It still happens, but now the node catches up and syncs again.
We expect even better performance after #1397.