Go-ipfs: Add functionality for advanced control over peer set

Created on 18 Mar 2019 · 35 comments · Source: ipfs/go-ipfs

I'd really like to have the ability to specify peer IDs in my config file that my node will try to always remain connected to. Additionally, it would be nice to specify strategies for these peers, like "always", which attempts to reconnect on any disconnect (respecting backoff rules); "preferred", which never closes a connection to that peer but doesn't necessarily try to hold one open; and maybe one more that just increases the likelihood the conn manager will hold open the connection, but still allows it to be closed as needed.

This will enable me to keep connections open to friends' machines, making transfers between us much more reliable.

kind/enhancement topic/connmgr


All 35 comments

Additionally, the infra team wants this in order to hold open connections between the gateways and our pinning servers.

This will percolate to the connection manager (current, interim and future).

+1

This feature would mean that I can set up a gateway node that could 'permanently' connect to a pin storage node and speed up propagation when the machine holding the content is known.

We would also highly benefit from this type of feature. We're currently running an automated "swarm connect {gatewayAddr}" with our nodes every 5 minutes or so to keep these connections open.

Having an official "supported" way of keeping nodes connected would be amazing.

@Stebalien @raulk A pretty quick and non-invasive way of doing this would be to add a list of peers to the connection manager that lets it handle the ‘preferred’ case (not closing the connection). Then, in go-ipfs we can have a little process that is fed a list of ‘always’ peers that it dials, and listens for disconnects from.

Geth has the "static nodes" feature which may be useful as a development/design pattern for go-ipfs.

https://github.com/ethereum/go-ethereum/wiki/Connecting-to-the-network#static-nodes

@whyrusleeping let's do that quickly. Connection manager v2 proposal is in the works, but there's no reason we cannot implement a _protected set_. I'll work on a patch.

@whyrusleeping

like "always", which attempts to reconnect on any disconnect (respecting backoff rules),

Do you think libp2p should take care of reestablishing the connection? This would require changes in the host and/or the swarm, e.g. when you host.Connect() you could specify a _supervision policy_ for that connection.

"preferred", which never closes a connection to that peer but doesn't necessarily try to hold one open,

See https://github.com/libp2p/go-libp2p-interface-connmgr/pull/14 and https://github.com/libp2p/go-libp2p-connmgr/pull/36.

and maybe one more that just increases the likelihood the conn manager will hold open the connection, but still allows it to be closed as needed.

In the current connection manager, this affinity can be achieved by setting a higher score on that connection via a separate tag, e.g. "peer_affinity".
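To illustrate the tagging approach, here is a toy model (not the actual go-libp2p-connmgr code) of how summed tag weights could decide which peers a trim closes first; the `peer_affinity` tag name follows the comment above, and the rest of the names are made up:

```go
package main

import (
	"fmt"
	"sort"
)

// tagManager is a toy model of tag-based scoring: each peer's score is
// the sum of its tag weights, and trimming closes the lowest-scored
// peers first. Higher affinity makes survival likelier, not guaranteed.
type tagManager struct {
	tags map[string]map[string]int // peer -> tag -> weight
}

func newTagManager() *tagManager {
	return &tagManager{tags: map[string]map[string]int{}}
}

func (m *tagManager) TagPeer(peer, tag string, weight int) {
	if m.tags[peer] == nil {
		m.tags[peer] = map[string]int{}
	}
	m.tags[peer][tag] = weight
}

func (m *tagManager) score(peer string) int {
	total := 0
	for _, w := range m.tags[peer] {
		total += w
	}
	return total
}

// Trim returns the peers to disconnect so that at most keep remain,
// dropping the lowest scores first.
func (m *tagManager) Trim(peers []string, keep int) []string {
	sorted := append([]string(nil), peers...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return m.score(sorted[i]) < m.score(sorted[j])
	})
	if len(sorted) <= keep {
		return nil
	}
	return sorted[:len(sorted)-keep]
}

func main() {
	m := newTagManager()
	m.TagPeer("gateway", "peer_affinity", 100)
	m.TagPeer("random", "seen", 1)
	m.TagPeer("friend", "peer_affinity", 50)
	// Trimming to two connections drops only the lowest-scored peer.
	fmt.Println(m.Trim([]string{"gateway", "random", "friend"}, 2))
}
```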

@Stebalien The issue occurring in https://github.com/ipfs/go-ipfs/issues/6145 may or may not be relevant to this ticket

It would be awesome if this could be exposed via an API as well as configuration as this would allow Cluster to dynamically protect connections between IPFS nodes as they join a cluster.

I've added the Protect()/Unprotect() API to the connection manager, available in gomod version v0.0.3.

Please take it out for a spin and report back.

You should be unblocked now to make progress with this; do shout out if you think otherwise.
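A toy model of the Protect/Unprotect semantics may help: a peer stays protected while at least one tag still protects it, so independent subsystems (say, cluster and a user command) can protect the same peer without stepping on each other. This is illustrative only; the real API lives in go-libp2p-connmgr and takes a `peer.ID`:

```go
package main

import "fmt"

// protector models the protected set: peer -> set of protecting tags.
type protector struct {
	protected map[string]map[string]struct{}
}

func newProtector() *protector {
	return &protector{protected: map[string]map[string]struct{}{}}
}

// Protect marks the peer as untrimmable under the given tag.
func (p *protector) Protect(peer, tag string) {
	if p.protected[peer] == nil {
		p.protected[peer] = map[string]struct{}{}
	}
	p.protected[peer][tag] = struct{}{}
}

// Unprotect removes one tag and reports whether the peer is still
// protected by any other tag.
func (p *protector) Unprotect(peer, tag string) bool {
	delete(p.protected[peer], tag)
	return len(p.protected[peer]) > 0
}

// IsProtected is what a trim pass would consult before closing a
// connection to this peer.
func (p *protector) IsProtected(peer string) bool {
	return len(p.protected[peer]) > 0
}

func main() {
	p := newProtector()
	p.Protect("QmCluster", "cluster")
	p.Protect("QmCluster", "user")
	fmt.Println(p.Unprotect("QmCluster", "cluster")) // still protected via "user"
	fmt.Println(p.Unprotect("QmCluster", "user"))    // now trimmable again
}
```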

@raulk @whyrusleeping How does this work with regard to spam protection for high-profile nodes? For example, almost everybody would probably love to stay connected to the official ipfs.io gateway nodes if given the chance. However, it's obviously infeasible for the official ipfs.io nodes to remain connected to that many nodes all the time, which could result in an overwhelming amount of disconnects and attempts to reconnect.

Would the backoff rules cover this edge case? I just wanted to double-check that this doesn't accidentally bring your infrastructure to a standstill.

@obo20 dialer backoff rules wouldn't cover that case, as presumably the dials would succeed.

While I think it's legitimate for everybody to want to stay connected to ipfs.io, that shouldn't be the case, and it's not the desired architecture. In other words, IPFS is neither a hub-and-spoke nor a federated model.

Gateways are able to discover content across a decentralised network; if that proves dysfunctional, we should dig into it.

@obo20 from the viewpoint of a libp2p node, it's legitimate to strive to keep a connection alive with peer A if you consider it high-value. Peer A also has resource management in place, and will eventually prune connections it considers low-value. If many peers deem peer A as high-value, they will eventually compete for its resources. If the protocol manages reputation/scoring well (e.g. bitswap), peer A will retain the highest performing peers.

@raulk I may have miscommunicated my concern.

My worry is: if, say, the ipfs.io gateway nodes have a high water mark of 2000 (I'm making this number up) and then 3000 other nodes on the network want to have a "protected" swarm connection to those nodes, how would this be handled?

@obo20 from the viewpoint of a libp2p node, it's legitimate to strive to keep a connection alive with peer A if you consider it high-value. Peer A also has resource management in place, and will eventually prune connections it considers low-value. If many peers deem peer A as high-value, they will eventually compete for its resources. If the protocol manages reputation/scoring well (e.g. bitswap), peer A will retain the highest performing peers.

So from my interpretation of this comment, the high profile nodes would just prune excess nodes even if every single one of those nodes had set it as "protected" ? This sounds good from the perspective of the high profile node.

How do the nodes who have deemed this connection "protected" act when they've been pruned? Do they attempt to frequently reconnect or do they just accept that they've been pruned and move on? (This may be more of an ipfs implementation question instead of a libp2p question.)

How do the nodes who have deemed this connection "protected" act when they've been pruned?

They just see the connection die. The application (e.g. IPFS) can then attempt to reconnect, and the other party will accept the connection and keep it alive until the connection manager prunes it again.

Note that Bitswap is not proactively managing reputation/scoring AFAIK. I'm sure a PR there would probably be well-received.

I noticed that this is partially in the changelog, but are these important connections remembered anywhere yet or is it still upcoming like in the original issue?

My use case is three nodes, one of which is online almost 24/7. To avoid killing routers they have small connection limits, and while they have each other as bootstrap nodes, after running for a moment they forget all about each other.

When I pin something on one, I likely also want to pin it on the others, and that is slow unless I run ipfs swarm connect myself (I guess the changelog means I will have to run that less often). As they aren't all online 24/7, I think the suggested "preferred" flag would fit my use case, since the nodes connect to each other mainly through the Yggdrasil network and have static addresses within it.

@raulk what @obo20 is pointing out is that if everyone decides to add the ipfs gateways to their protected connection set, the gateways will get DoSed with connections. What we need to prevent this is the 'disconnect' protocol so the gateways can politely ask peers to disconnect, and have those peers not immediately try to reconnect.

Sure, malicious peers can always ignore that, but we want normal well behaved peers to not accidentally DoS things.

if everyone decides to add the ipfs gateways to their protected connection set

Would there be any point in this or is this just fear of users not getting it or am I not getting it? I am not using IPFS.io gateway (but ipns.co), but if users request my content from IPFS.io a lot, won't it be fast due to cache anyway regardless of whether my node is currently connected to the gateways or not?

is this just fear of users not getting it

Basically this, yeah. Network protections in systems like this shouldn't have to rely on clients behaving properly. Adding a disconnect protocol still relies on clients behaving properly, but it's an additional step (circumventing the disconnect protocol would be deliberate; force-connecting to the gateway nodes is more of a configuration mistake).

@Mikaela This feature isn't complete yet; currently it's just the ephemeral important connections. Persistence should be coming soon (at this point I think all it takes is adding it to the config file and wiring that through).

proposing new commands

Single Peer -- Keep connections to specific peers

Use a list of peers to stay connected to all the time.

command name options (options to seed ideas -- I don't love any of these names :D)

# choose one
ipfs swarm bind    [list | add | rm]
ipfs swarm peer    [list | add | rm]
ipfs swarm link    [list | add | rm]
ipfs swarm bond    [list | add | rm]
ipfs swarm friend  [list | add | rm]
ipfs swarm tie     [list | add | rm]
ipfs swarm relate  [list | add | rm]
ipfs swarm couple  [list | add | rm]

subcommands

<cmd> list
<cmd> add [--policy=([always]|protect|...)] [ <peer-id> | <multiaddr> ]
<cmd> rm [ <peer-id> | <multiaddr> ]

examples

# just w/ p2p. (use a libp2p peer-routing lookup to find addresses)
ipfs swarm bind add /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

# connect specifically to this address
ipfs swarm bind add /ip4/127.0.0.1/udp/4001/quic/p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

# can combine both, to try the address but also look up addresses in case this one doesn't work.
ipfs swarm bind add /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr
ipfs swarm bind add /ip4/127.0.0.1/udp/4001/quic/p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

# always keep a connection open (periodically check; dial/re-dial if disconnected)
ipfs swarm bind add /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr
ipfs swarm bind add --policy=always /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

# once opened, keep a connection open (try to keep it open, but don't re-dial)
ipfs swarm bind add --policy=protect /p2p/Qmbwqf292G3GbrNm1ydtKeqhqqgyqXDtDvsuBYuvXsPHHr

Peer Group -- Keep connections to a (changing) group of peers

  • Use a group key to find each other and stay connected. Connect to every peer in the group. Keep a list of groups.
  • Maybe use a pre-shared key (PSK) to join the group and find out about each other (that way we can have private groups)

command name options

# choose one
ipfs swarm group   [list | add | rm]
ipfs swarm party   [list | add | rm]
ipfs swarm clique  [list | add | rm]
ipfs swarm flock   [list | add | rm]

subcommands

<cmd> list
<cmd> add [--mode=(all|any|number|...)] [ <group-key> ]
<cmd> rm [ <group-key> ]

examples

ipfs swarm group add --mode=all <secret-key-for-ipfs-gateways>
ipfs swarm group add --mode=all <secret-key-for-pinbot-cluster>
ipfs swarm group add --mode=any <secret-key-for-dtube-gateways>
ipfs swarm group add --mode=any <secret-key-for-pinata-peers>
ipfs swarm group add --mode=all <secret-key-for-textile-peers>

I'm definitely a fan of the functionality that @jbenet is suggesting. While ipfs swarm connect currently gets the job done, it would be nice to have more fine-tuned control over how to manage connections that we want to keep alive.

The swarm groups are an interesting concept. Instead of a secret key to manage access, I'd love it if there was a concept of "roles" to determine access to the groups. Essentially, the first node to set a group up is an admin role, and then can add either admins or members from there.

The benefit here is that the owner(s) of a group can add / revoke nodes if needed without needing to completely reform the entire group since there's no secret that acts as a master password.

Another thing that would be incredibly helpful here is if this type of stuff could be added to a permanent config, instead of being only temporary until the node restarts. Currently we (and I believe the IPFS infrastructure team, according to @mburns) use the default ipfs swarm connect functionality to keep our nodes connected, and we have to continually reconnect our nodes via a recurring cron task so that we can recover if a node restarts. Having something like this persist between reboots would be incredibly valuable.

As mentioned at the top of the issue, we'd like to be able to ensure that gateway nodes stay connected to cluster nodes. It'd be rad to also ensure that all the gateway nodes are peers, so content that has been cached in one region is only a bitswap away from any other gateway node... but right now the gateways are constantly maxed out with new peer connections, with no particular affinity for any of them.

The libp2p BasicConnMgr got a simple Protect api in https://github.com/libp2p/go-libp2p-connmgr/pull/36 that means the peer you pass to it won't be trimmed from the connections list when you hit the high water mark.

As a first pass, could we surface it in the 0.5 ipfs release by adding a flag to swarm connect?

$ ipfs swarm connect --protect /your/proto/here

In its current form, that info won't survive restarts, and it won't automatically try to reconnect if the connection is terminated, so perhaps it doesn't give us much; but right now the gateways seem to be trimming connections to the cluster nodes almost immediately after the grace period is up.

More considered fixes like the one proposed in https://github.com/ipfs/go-ipfs/issues/6097#issuecomment-506964185 are welcome, but I just want to draw more attention to this issue and get some energy behind it. Good PRs are needed here.

A workaround is to add gateways to the bootstrap list. Bootstrap nodes are re-connected to frequently (I'm not sure if they are also "protected" or tagged with higher priority).

A workaround is to add gateways to the bootstrap list. Bootstrap nodes are re-connected to frequently (I'm not sure if they are also "protected" or tagged with higher priority).

Only if the number of open connections drops below 4. They _also_ aren't tagged with any high priority (as a matter of fact, we've considered tagging them with a negative priority).

Can we make ipfs swarm connect call Protect on the connection by default? If I explicitly ask my node to connect to another, I don't want the connection to be in the list of trimmable connections; I want it to stay connected. I can't guarantee the other side won't drop it, but I definitely don't want my side to drop it. I can ipfs swarm disconnect to signal that I'm done with it.

Adding a mechanism to "reconnect on close" requires us to solve the "don't DDoS popular nodes" problem, but exposing the existing libp2p logic to let users identify connections that they don't want their node to trim seems much less risky, and would allow users that control groups of nodes to maintain connections between them all by connecting from both sides. It doesn't solve the auto-reconnect problem, but that can be scripted for now.

@olizilla Would there be a way to swarm connect without protecting the connection?

@olizilla we currently add a weight of 100 (didn't have connection protection at the time). But yeah, we should probably protect those connections and add a --protect=false flag.

Is this functionality being considered at all for the 0.5 release?

No. However, there are a few improvements already in master that _may_ help:

  • The connection manager will no longer count connections in the grace period towards the connection limit.

    • Pro: Useful connections won't be trimmed in favor of new connections.

    • Con: You may end up with more connections and may need to reduce your limit.

  • Bitswap keeps track of historically useful peers and tells the connection manager to avoid disconnecting from these peers.

@Stebalien Does this bitswap history persist through resets or does it live in memory?

If not, would it be difficult to have a separate bootstrap list (or just something in the config that we can set) consisting of peers whose connections we never want pruned? On startup, the node would add all peers in that list to the "historically useful peers" list you mentioned.

For context, my main goal here is to avoid having to periodically run outside scripts to manage my node connections as this has been somewhat unreliable.

No, it lives in memory. The end goal is to _also_ have something like this issue implemented, just not right now.
