We store all pins in a single massive object so adding and removing pins is really slow when we have many pins.
This affects:
- ipfs dag add --pin=true
- ipfs add
- ipfs pin add
- ipfs pin rm
Listing pins _also_ appears to be slow but for a different reason:
- ipfs pin ls buffers pins in memory before sending them back to the user (see #6304).
- ipfs pin ls lists all pinned blocks, directly or indirectly, by default. Calling ipfs pin ls --type=recursive is _much_ faster.
Proposed solutions:
I'm filing this issue so we can have a single issue that succinctly describes the entire issue and all variants.
Also sounds like another use case for an embedded graph database
Not really. We have MFS, we can just use that. The current blockers are:
Unfortunately, this'll only get worse as we hack in new pin types for cluster. We need some way to specify (in unixfs) how a file/directory should be pinned (where pin policies higher up the directory tree take precedence).
The current thought here is to introduce an intermediate fix that stores pins in go-ipld-hamt. Blockers:
Maybe we just need to use a read-only or archive flag for pinned blocks in the underlying file system?
Unfortunately, it's not quite that simple. Pinning happens at a higher layer and not all of our datastores store one file per block.
This is also causing an issue with monitoring over at https://github.com/firehol/netdata/issues/3156 as it makes a lot of 'ipfs pin ls' calls.
Is the pinset object stored and read/written on disk when operations are performed? If so wouldn't it be possible to load the object into memory and read/write to there to get high performance IO with memory access? You could copy the object to disk as a backup, but you wouldn't incur expensive read operations as you are reading from the in-memory object. This would serve as a reasonable intermediate fix until the pinning system at large is reworked.
Reading is fast, we store the pinset in memory. The slow part is flushing to disk.
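To make the flush cost concrete, here is a toy model (not go-ipfs's actual on-disk format) that counts bytes written when every change rewrites one object holding all pins, versus rewriting only one small bucket chosen by hash. The class names, the JSON encoding, and the bucket count are all assumptions for illustration; a real HAMT is a tree rather than a flat bucket array.

```python
import hashlib
import json


class MonolithicPinset:
    """Toy model: every change re-serializes the whole pinset."""

    def __init__(self):
        self.pins = set()
        self.bytes_written = 0

    def add(self, cid):
        self.pins.add(cid)
        # Flushing rewrites one object that contains every pin.
        self.bytes_written += len(json.dumps(sorted(self.pins)))


class ShardedPinset:
    """Toy model: only the bucket that holds the changed pin is rewritten."""

    def __init__(self, buckets=256):
        self.buckets = [set() for _ in range(buckets)]
        self.bytes_written = 0

    def add(self, cid):
        # Pick a bucket from the first byte of the hash (illustrative only).
        index = hashlib.sha256(cid.encode()).digest()[0] % len(self.buckets)
        bucket = self.buckets[index]
        bucket.add(cid)
        self.bytes_written += len(json.dumps(sorted(bucket)))


def compare(n):
    """Add n made-up pins to both models and return the bytes each wrote."""
    mono, sharded = MonolithicPinset(), ShardedPinset()
    for i in range(n):
        cid = f"cid-{i}"
        mono.add(cid)
        sharded.add(cid)
    return mono.bytes_written, sharded.bytes_written
```

In this model the monolithic write cost grows with the total number of pins on every add, while the sharded cost grows only with the size of one bucket.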
Why would an 'ipfs pin ls' flush to disk? I think there must be something else going on, if the netdata guys are seeing an inordinate load due to an 'ipfs pin ls' being sent once every 5s or so.
pin ls by default lists all indirectly pinned objects (children of recursive pins). I don't know why it does this by default but it does...
You can list pins you added by running ipfs pin ls --type=recursive; ipfs pin ls --type=direct.
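For a monitoring script, a minimal sketch of listing only the explicitly added pins could look like the following. It assumes the ipfs binary is on the PATH and that each output line has the form "<cid> <type>"; the helper names are made up for this example.

```python
import subprocess


def parse_pin_ls(output):
    """Parse '<cid> <type>' lines into a dict of cid -> pin type."""
    pins = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            pins[parts[0]] = parts[1]
    return pins


def list_root_pins(ipfs_binary="ipfs"):
    """List recursive and direct pins, skipping indirectly pinned blocks."""
    pins = {}
    for pin_type in ("recursive", "direct"):
        result = subprocess.run(
            [ipfs_binary, "pin", "ls", f"--type={pin_type}"],
            capture_output=True,
            text=True,
            check=True,
        )
        pins.update(parse_pin_ls(result.stdout))
    return pins
```

This avoids walking every child block of every recursive pin, which is what the default listing does.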
That still doesn't answer why the guys over on https://github.com/firehol/netdata/issues/3156 are seeing massive IPFS resource usage when 1) they have a large repo (several thousand objects) and 2) they turn on monitoring (which does an 'ipfs pin ls' every few seconds). Is there instrumentation they could turn on?
It's listing every single object (block) that has been pinned. It's consuming a ton of RAM because we, unfortunately, create a list of pins in memory before returning them to the client. We should fix this (the second part), but doing so will be a breaking API change, so we'll have to be careful.
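The difference between buffering and streaming can be sketched as follows. This is an illustration only: the DAG is a plain dict of made-up block names, not how go-ipfs represents blocks. Note that the streaming version still keeps a set of visited blocks for de-duplication; what it avoids is holding the full response before sending anything.

```python
def walk(dag, root, seen):
    """Yield root and every block reachable from it, once each."""
    stack = [root]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        yield cid
        stack.extend(dag.get(cid, ()))


def list_pins_buffered(dag, roots):
    """Collect every pinned block before returning anything."""
    seen = set()
    result = []
    for root in roots:
        result.extend(walk(dag, root, seen))
    return result


def list_pins_streaming(dag, roots):
    """Yield pinned blocks as they are discovered."""
    seen = set()
    for root in roots:
        yield from walk(dag, root, seen)
```

With the generator, a client can start consuming results immediately, which is also why switching to it changes the shape of the API response.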
It's also probably garbage collecting a bunch (we're working on some fixes to CIDs that'll make them allocate less but that's still in progress).
@pjz So on my own nodes, to avoid having to constantly poll IPFS and incur slow performance from examining the pinset, I maintain a database which contains an exact copy of the pins my IPFS nodes currently have. Any updates that would affect the pinset must also update the database.
By doing this, I avoid having to contact my IPFS node and perform performance impacting operations like ipfs pin ls.
Yes, while this isn't desirable, it has been working very well. It does have a couple of considerations, namely that all operations which affect the pinset must also update the DB. Don't forget, IPFS is still very new, so sometimes you have to make small accommodations until such issues are resolved.
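A minimal sketch of that pattern, assuming SQLite as the database and two hypothetical callables (pin_add, pin_rm) that perform the real operation against the node:

```python
import sqlite3


class PinMirror:
    """Keeps a local table in step with pin changes made through this wrapper."""

    def __init__(self, pin_add, pin_rm, db_path=":memory:"):
        # pin_add / pin_rm are hypothetical callables that talk to the node.
        self._pin_add = pin_add
        self._pin_rm = pin_rm
        self._db = sqlite3.connect(db_path)
        self._db.execute("CREATE TABLE IF NOT EXISTS pins (cid TEXT PRIMARY KEY)")

    def add(self, cid):
        self._pin_add(cid)  # change the node first; record only if it succeeds
        with self._db:
            self._db.execute("INSERT OR IGNORE INTO pins (cid) VALUES (?)", (cid,))

    def remove(self, cid):
        self._pin_rm(cid)
        with self._db:
            self._db.execute("DELETE FROM pins WHERE cid = ?", (cid,))

    def list_pins(self):
        """Answer from the local table without contacting the node."""
        rows = self._db.execute("SELECT cid FROM pins ORDER BY cid")
        return [row[0] for row in rows]
```

The mirror is only accurate if every pin change goes through the wrapper; anything that pins or unpins directly on the node will make the two drift apart.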
I maintain a database which contains an exact copy of the pins my IPFS nodes
I should mention that in working with pins, I've also come to the pattern of maintaining my own cache, to avoid delay on large nodes, even when only listing recursive pins.
In my specific case, I'm interested both in the listing being more performant and in having some means of notification from the node, like an event that I can subscribe to, which signals when the pinset has changed.
This would allow me to maintain my own state, poll once and then just poll once more (or do some means of a delta with info from the event) on state change, instead of polling based around time or some other arbitrary metric.
For context, I'm dealing with ipfs mount at the moment, and refreshing the listing for /ipfs entails getting the node's pinset.
I'm also interested in other events from the node, such as knowing when keys have changed, mfs has changed, etc.
I'm willing to bet monitoring tools would be interested in this as well.
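As a rough sketch of the subscribe-then-refresh idea: the event bus, the "pins-changed" topic, and the fetch_pins callable below are all hypothetical, since the node does not expose such an API today.

```python
from collections import defaultdict


class NodeEvents:
    """Hypothetical in-process event bus; not an existing IPFS API."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload=None):
        for callback in list(self._subscribers[topic]):
            callback(payload)


class PinsetView:
    """Client-side copy of the pinset that refreshes only after a change event."""

    def __init__(self, events, fetch_pins):
        self._fetch_pins = fetch_pins  # hypothetical callable that lists pins
        self._pins = fetch_pins()      # poll once at start-up
        events.subscribe("pins-changed", self._on_change)

    def _on_change(self, _payload):
        # Poll once more, only because the node reported a change.
        self._pins = self._fetch_pins()

    @property
    def pins(self):
        return self._pins
```

The same bus could carry other topics (key changes, MFS writes), which is what would make it worth building.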
Yes, absolutely, it's made my node perform significantly better. I've currently begun moving to a model where the only time I need to talk to my IPFS node to list anything is for crucial operations. Otherwise, everything else that isn't a write operation should be reading from my cache/database.
While those are great workarounds, they're not really feasible for a general monitoring solution. I guess they'll just have to wait until the IPFS server gets it together. I think it's clear that whatever datastructure it's using needs to be re-evaluated or supplemented to make this kind of monitoring/usage not cause it to eat itself.
So, adding pins should be faster. But listing every single object that has been pinned (directly or indirectly by some recursive pin) in your datastore will always be somewhat slower.
If everyone's solution is to maintain a parallel cache of what pins exist... why not have IPFS do that internally instead? Keep a cache that's invalidated on add/remove of pins, but otherwise is untouched. Then repeated calls to 'ipfs pin ls' would be trivial. Maybe make 'ipfs pin verify' also serve as a way to manually invalidate the cache/force a rebuild of it.
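A minimal sketch of such an internal cache, where compute_listing is a hypothetical stand-in for the slow walk over every pinned block and verify() plays the role suggested for 'ipfs pin verify':

```python
class CachedPinLister:
    """Caches the expensive full listing until the pin roots change."""

    def __init__(self, compute_listing):
        # compute_listing(roots) is a hypothetical stand-in for the slow walk.
        self._compute_listing = compute_listing
        self._roots = set()
        self._cache = None

    def add(self, cid):
        self._roots.add(cid)
        self._cache = None  # invalidate on add

    def remove(self, cid):
        self._roots.discard(cid)
        self._cache = None  # invalidate on remove

    def verify(self):
        """Drop the cache and rebuild it immediately."""
        self._cache = None
        return self.ls()

    def ls(self):
        if self._cache is None:
            self._cache = self._compute_listing(frozenset(self._roots))
        return self._cache
```

Repeated ls() calls between pin changes then cost nothing beyond returning the cached result; the first call after a change still pays for the full walk.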
@pjz
I think that could help in improving the performance.
I know that awareness of the node's state is a separate issue, but if we have to come up with a messaging system for invalidating some node-wide pinset cache, we may as well have a system to broadcast those same events, for those who still want to be made aware of when the state has changed.
The practical reason for this would still just be to avoid unnecessary polling calls in long-lived processes.
Even if pin ls is fast, it'd be nice to update your copy of the pinset, only when it's been changed.
However, this only seems useful to implement if there's more than 1 event (more than just "pins have changed").
I mentioned some others before, like writes to MFS, IPNS key has been created/deleted/updated, etc.
This would allow people to maintain their own cache of various states, if they like, but still have a generic implementation underneath for fast operation in the general case.
Any opinions on this?
What you describe sounds somewhat like a way to tap into the logging system.
@Stebalien Has there been much progress / prioritization on this front? As we continue to scale, this becomes increasingly relevant.
No progress.
We're also facing this issue with around 2 million hashes and around 400-500k pins. Can we support you in any way? We're currently working around this with multiple ipfs instances.
@dirkmc you were looking into this for js-ipfs. Are you still planning on applying that same optimization to go-ipfs?
@Stebalien I'm currently doing some research to understand where the performance bottlenecks are with adding large numbers of files to go-ipfs, which will likely include performance analysis for pinning.
Before making any pinning optimizations, we'll likely want to decide if it makes sense for pins to be stored in the blockstore, which is a bigger conversation.