Go-ipfs: IPNS republisher should not rely on DHT to store (possibly expired) records

Created on 1 Mar 2018  ·  18 Comments  ·  Source: ipfs/go-ipfs

The republisher relies on the DHT to store records, but those records may expire before a republish occurs (e.g., if the republishing node goes offline). We should store them somewhere else, in the datastore.
See https://github.com/ipfs/go-ipfs/pull/4742#issuecomment-369604767 for more details.

All 18 comments

@whyrusleeping do you have a suggestion for how we should store these records?

Proposal: make the republisher wrap the namesys service in a republishing namesys service. When calling Publish on the RepublishingNamesys, it would store the record in a datastore and then continue to republish that record until replaced or deleted.

Thoughts?
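A minimal sketch of what such a wrapper could look like. Everything here is an assumption for illustration: `Publisher` is a simplified stand-in for the namesys publishing interface (no context or options), `RepublishingPublisher` and `RepublishAll` are hypothetical names, and an in-memory map stands in for the datastore.

```go
package main

import (
	"fmt"
	"sync"
)

// Publisher is a hypothetical, simplified stand-in for the namesys
// publishing interface (no context or options).
type Publisher interface {
	Publish(name, value string) error
}

// RepublishingPublisher wraps a Publisher and remembers the last value
// published under each name so it can be published again later.
type RepublishingPublisher struct {
	inner Publisher

	mu    sync.Mutex
	store map[string]string // stand-in for a persistent datastore
}

func NewRepublishingPublisher(inner Publisher) *RepublishingPublisher {
	return &RepublishingPublisher{inner: inner, store: make(map[string]string)}
}

// Publish stores the value locally, then forwards it to the wrapped publisher.
func (r *RepublishingPublisher) Publish(name, value string) error {
	r.mu.Lock()
	r.store[name] = value
	r.mu.Unlock()
	return r.inner.Publish(name, value)
}

// Delete stops republishing the given name.
func (r *RepublishingPublisher) Delete(name string) {
	r.mu.Lock()
	delete(r.store, name)
	r.mu.Unlock()
}

// RepublishAll publishes every stored record again. A real republisher
// would call this periodically, for example from a time.Ticker loop.
func (r *RepublishingPublisher) RepublishAll() error {
	// Copy the records so the lock is not held during network calls.
	r.mu.Lock()
	snapshot := make(map[string]string, len(r.store))
	for name, value := range r.store {
		snapshot[name] = value
	}
	r.mu.Unlock()

	for name, value := range snapshot {
		if err := r.inner.Publish(name, value); err != nil {
			return fmt.Errorf("republish %q: %w", name, err)
		}
	}
	return nil
}

// countingPublisher is a test double that counts publishes per name.
type countingPublisher struct{ calls map[string]int }

func (c *countingPublisher) Publish(name, _ string) error {
	c.calls[name]++
	return nil
}

func main() {
	inner := &countingPublisher{calls: make(map[string]int)}
	r := NewRepublishingPublisher(inner)

	if err := r.Publish("key-a", "/ipfs/example"); err != nil {
		fmt.Println("publish failed:", err)
		return
	}
	if err := r.RepublishAll(); err != nil {
		fmt.Println("republish failed:", err)
		return
	}
	fmt.Println("publishes for key-a:", inner.calls["key-a"])
}
```

The sketch stores the record before forwarding it, so a record whose first publish fails is still retried on the next republish. Whether that is the desired behavior is a design choice the proposal leaves open.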

That sounds like a reasonable proposal to me.

Should we merge in the PR to use variadic options for the namesys interface first?
https://github.com/ipfs/go-ipfs/pull/4733

So, as I noted on that PR, the underlying namesys relies on this record itself to keep track of the sequence number.

I can think of three solutions:

  1. Have the underlying namesys record the sequence number separately along with the last published value. IMO, we should do that regardless. Additionally, have the republishing nameservice record which names it should republish (and their last publish times?).
  2. Have the namesys expose known, published records. That would allow us to keep the republisher simple. However, that also complicates the namesys Publisher interface (which, arguably, shouldn't expose this information).
  3. Just combine the two.

Quick note: you should be able to update to latest-gx-tagged go-libp2p-kad-dht, go-ipfs-routing, go-libp2p-record for this. I should have left everything working on that front.

I noticed that the Pubsub publisher puts an IPNS record directly into the datastore, so that the republisher can later retrieve the record's sequence number. But it looks like namesys will always publish to both the DHT and Pubsub anyway, so is it necessary for the Pubsub publisher to do this?

And I guess a related question: Should the republisher care about records published directly to Pubsub (but not to the DHT)?

That's so that both itself and the DHT publisher can find it; the latter is important if you run your daemon having disabled pubsub.

And yes, I think the republisher should care about pubsub published records, as these will have a different seqno.
Also, it's not inconceivable that we do just pubsub publishing in the future.

It looks like the pubsub record sequence number may be different from the DHT record sequence number for the same value in the following scenario:

  • A client calls namesys Publisher to update an IPNS record
  • namesys Publisher calls the DHT publisher, which increments the sequence number
  • namesys Publisher calls the pubsub publisher, which also increments the sequence number

I'm not sure if that actually causes any problems, but it's probably worth fixing.

Yes, that's an artifact of the implementation.
Ideally they would both use the same seqno, but that seemed too complicated at the time I was writing that code, so I opted for different seqnos.
This kept the implementation straightforward, as the two publishers could be kept completely separate and without changes to the extant publisher.

But since we are hacking the publishers, we might as well go the extra mile and have them use the same seqno for the record.
It shouldn't be causing any problems though.

Yes, I agree. @Stebalien suggested simply comparing the values to make sure the sequence number is not incremented when the same value is published.
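A rough sketch of that value comparison, under stated assumptions: `SeqTracker` and `Next` are hypothetical names, an in-memory map stands in for the datastore, and the starting sequence number is arbitrary. The tracker is not safe for concurrent use.

```go
package main

import "fmt"

// entry is the last published value for a name and its sequence number.
type entry struct {
	Value string
	Seq   uint64
}

func (e entry) String() string {
	return fmt.Sprintf("seq=%d value=%s", e.Seq, e.Value)
}

// SeqTracker is a hypothetical shared tracker that both the DHT and the
// pubsub publisher could consult before building a record.
type SeqTracker struct {
	entries map[string]entry
}

func NewSeqTracker() *SeqTracker {
	return &SeqTracker{entries: make(map[string]entry)}
}

// Next returns the sequence number to use for publishing value under name.
// It increments only when the value differs from the last one recorded.
func (t *SeqTracker) Next(name, value string) uint64 {
	last, ok := t.entries[name]
	if ok && last.Value == value {
		return last.Seq // same value: reuse the sequence number
	}
	next := entry{Value: value, Seq: last.Seq + 1}
	t.entries[name] = next
	return next.Seq
}

func main() {
	tracker := NewSeqTracker()

	// Both publishers ask for a sequence number for the same update.
	dhtSeq := tracker.Next("key-a", "/ipfs/v1")
	pubsubSeq := tracker.Next("key-a", "/ipfs/v1")
	fmt.Println("dht:", dhtSeq, "pubsub:", pubsubSeq)

	// A new value moves the sequence number forward.
	fmt.Println("after update:", tracker.Next("key-a", "/ipfs/v2"))
	fmt.Println("stored:", tracker.entries["key-a"])
}
```

With this shape, the order in which the two publishers run does not matter: whichever asks second sees the same value and gets the same sequence number.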

Is there a case in which we would want to publish only to pubsub (but not to the DHT)? I'm wondering if the pubsub publisher should call out to the DHT publisher itself, which would take care of all the sequence number handling that the pubsub publisher has to do at the moment.

Currently no, but we might want this in the future.

I think if we don't always publish to the DHT it will cause an issue with publishing to a key shared between peers:

  • An IPNS record for key A is published on peer A to pubsub but not to the DHT
  • namesys Publish is called on peer B for key A
  • Peer B goes out to the DHT to get the latest sequence number for key A
  • The IPNS record published to pubsub (but not the DHT) by peer A will not be found, so the publisher on peer B will get the wrong sequence number
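The scenario above can be reproduced with a toy model. This is only an illustration: `network`, `publish`, and the two maps are hypothetical, and the only behavior modeled is that a publisher derives the next sequence number from whatever the DHT currently holds.

```go
package main

import "fmt"

// record is a simplified IPNS record: a value and a sequence number.
type record struct {
	Value string
	Seq   uint64
}

func (r record) String() string {
	return fmt.Sprintf("seq=%d value=%s", r.Seq, r.Value)
}

// network is a toy model of the two transports, keyed by IPNS key.
type network struct {
	dht    map[string]record
	pubsub map[string]record
}

func newNetwork() *network {
	return &network{
		dht:    make(map[string]record),
		pubsub: make(map[string]record),
	}
}

// publish derives the next sequence number from the DHT only, then sends
// the record to pubsub and, if toDHT is set, to the DHT as well.
func (n *network) publish(key, value string, toDHT bool) record {
	rec := record{Value: value, Seq: n.dht[key].Seq + 1}
	n.pubsub[key] = rec
	if toDHT {
		n.dht[key] = rec
	}
	return rec
}

func main() {
	n := newNetwork()

	// Peer A publishes to pubsub only.
	fromA := n.publish("key-a", "/ipfs/from-a", false)
	// Peer B cannot see A's record in the DHT, so it picks the same seqno.
	fromB := n.publish("key-a", "/ipfs/from-b", true)

	fmt.Println("peer A:", fromA)
	fmt.Println("peer B:", fromB)
	fmt.Println("seqno collision:", fromA.Seq == fromB.Seq)
}
```

In this model the two peers end up with different values under the same sequence number, which is the conflict described above. If peer A also publishes to the DHT, peer B's lookup sees A's record and picks the next number.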