When a site with origin e.g. foo.com uses JS-IPFS, a new node is bootstrapped for every browsing context (tab, iframe). This implies:
There is an opportunity to improve this by offloading most of the JS-IPFS work into a SharedWorker in browsers where that API is available, falling back to a dedicated Worker elsewhere (basically Safari).
We can explore ServiceWorker as a better fallback mechanism in the future. Unlike dedicated workers they can be shared, but they also come with enough challenges to justify keeping them out of scope initially.
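A sketch of that fallback logic; pickWorkerType and ipfs-worker.js are illustrative names, not existing APIs:

```typescript
// Feature-detect SharedWorker support; Safari only ships dedicated Workers,
// so there we fall back to one node per tab behind the same message-based API.
const pickWorkerType = (globals: { SharedWorker?: unknown }): 'shared' | 'dedicated' =>
  typeof globals.SharedWorker === 'function' ? 'shared' : 'dedicated'

// In the page it would be used roughly like this (browser-only, hence commented):
// const endpoint = pickWorkerType(self) === 'shared'
//   ? new SharedWorker('ipfs-worker.js').port   // shared across same-origin tabs
//   : new Worker('ipfs-worker.js')              // dedicated, one per tab
```

Either way the page ends up talking to a message-based endpoint, so the client code above it stays the same.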
However, use of a SharedWorker implies some trade-offs:
The worker needs to be instantiated with a separate JS script, which leaves us with the following options:
This is a major problem, as WebRTC is largely replacing the WebSocket transport, and it requires more research. A pragmatic approach could be to tunnel WebRTC traffic into the worker; however, that would still imply re-establishing existing connections, which would be a major drawback.
Another thing I'm recognizing now is that two different browsing contexts (tabs) could start the IPFS node with different configurations, which needs to be handled in some way.
I have started hacking on the ipfs-message-port-client, which in spirit is very similar to ipfs-http-client except it wraps around MessagePort.
However, I'm realizing that the API surface is pretty large, which reminded me of js-ipfs-lite, which is a lot slimmer but also well typed (TS ❤️). This, combined with the configuration point above, got me wondering if this may be a good opportunity to consider stripping out some of the convenience stuff, which could be pulled in separately (maybe even an opportunity to converge code bases?)
I had a discussion with @hugomrdias last night about aegir not yet supporting multiple webpack configs, where he pointed out the shiny package.json exports, which could be a better solution going forward.
After spending all day yesterday running into walls, I was desperate for some progress, so I have come up with the following plan and would really love your feedback @achingbrain
- `ipfs-message-port-client` package (with its own `.aegir.js`) that is in the spirit of ipfs-http-client except it wraps around MessagePort. The idea is that it will be loaded in the main thread and will RPC with the IPFS node through the MessagePort.
- `ipfs-message-port-server` package (with its own `.aegir.js`) that wraps around MessagePort, starts the IPFS node and executes incoming commands.
- Streaming APIs like ipfs.add are somewhat tricky due to the highly polymorphic API and streaming. normalise-input takes almost anything and turns it into `AsyncIterable<{ path, content: AsyncIterable<Buffer> }>`, which adds a ton of complexity, especially when things need to be moved across the wire. It also makes it easy to pass sub-optimal input. I think it would be nice if the bare minimum API was optimized for better performance, that is to say it used ArrayBuffers, as they can be transferred without encoding / decoding or copying across threads. Support for more exotic inputs could then be added via helper libs that users could choose to use.
- We could wrap AsyncIterators into MessagePorts and transfer those, or alternatively do multiplexing over the same MessagePort, but if it's mostly for convenience I think it would be more justified to do that as another helper lib. E.g. from what I can tell ipfs.files.write could effectively be used for writing out-of-stream input, which again could reduce the complexity of the RPC protocol and could be implemented only on the client end.

I like the idea of ipfs-message-port-* packages.

- This would allow refactoring packages/ipfs and using parts of it to assemble a runtime-optimized distribution in packages/ipfs-web etc.
- It could back the window.ipfs interface in IPFS Companion (I had some prior discussions about something like that with @alanshaw, and we always wanted similar sandboxing and limited API surface for window.ipfs anyway).

Agree the API surface is too big, and IMO for MVP: ipfs.add, ipfs.get, ipfs.cat, ipfs.dag.* (unsure about unixfs-specific ipfs.object.*)

TS JSDoc is something @hugomrdias experimented with before, not sure what the status is tho. Perhaps drop it for now if it gets too distracting.
Note that ipfs files write produces different DAGs than ipfs add (https://github.com/ipfs-shipyard/ipfs-webui/issues/676)
Punting things like Peer Identity, MFS/key "namespacing" etc. for a later phase.
I think those should not matter for same-origin node sharing unless I'm overlooking something.
For extremely lean MVP supported APIs would be: ipfs.add, ipfs.get, ipfs.cat
Unfortunately those are the most complicated and problematic ones; getting ipfs.files.* working would be a lot easier. I wrote down notes explaining why (https://gozala.io/pl-notes/JSCoffee-2020-05-13) and hope to discuss in more detail during the call.
ipfs.dag.* (unsure about unixfs-specific ipfs.object.*)
Complexity is mostly in having jQuery-style APIs where functions take pretty much anything. ipfs.dag.* and ipfs.object.* seem to have a pretty straightforward API, so it should not be a big deal.
This would also enable us to test those browser optimizations for reading big files during add
Could you please explain what you mean here? I'm not sure I follow.
TS JSDoc is something @hugomrdias experimented with before, not sure what is the status tho. Perhaps drop it for now if it gets too distracting.
He actually filled me in on this today, very exciting stuff! We also took the same approach so things should merge nicely. I don't think it's a distraction; it makes navigating a large codebase like js-ipfs a lot more manageable.
Note that ipfs files write produces different DAGs than ipfs add (ipfs-shipyard/ipfs-webui#676)
I don't think that matters, because the client mostly forwards ArrayBuffers / Blobs to the worker thread, which then runs the corresponding API calls.
This is exciting! I'm looking forward to contributing here more as time permits. In the meantime, the minimal API alluded to above is very close/similar to ipfs-lite. There is a (currently on the back burner) TS implementation here: https://github.com/textileio/js-ipfs-lite. Just for reference!
I have outlined some high level questions I would love to get answered by surveying our community / users. I think it would help to ensure that effort is focused on user needs (in browser context) as opposed to API compatibility with IPFS in runtimes which have different constraints.
What are the essential configuration options for JS-IPFS in browsers ?
_It is important to distinguish essential configuration from incidental configuration, where it is a means to an end. In other words, it is worth pondering whether existing use cases could be addressed through different means._
What subset of JS-IPFS API is used in browsers ?
_It is worth considering this, so we can better prioritize efforts and optimize experience of what is important as opposed to aiming for API compatibility._
Why are users choosing to use JS-IPFS in the main thread over a worker thread?
_Given that this effort effectively moves JS-IPFS to a worker thread, we need to understand why users chose to use JS-IPFS in the main thread in the first place (validate the working assumption that doing it on the main thread was just easier). It is also worth considering how important a high level JS-IPFS API in the main thread is (it may be that users would be perfectly happy with using JS-IPFS in a worker thread)._
How do user priorities fall with regard to a convenient API vs a bare-bones API?
_JS-IPFS prioritizes convenience. At the same time, >60% of the code written for this effort deals with accepting variadic inputs, effectively encoding / decoding streams etc… It is worth checking our assumptions about user needs / wants to avoid bloat that may be irrelevant in practice._
What are other important things (not on the list) that JS-IPFS does or you wish it did for your browser use case ?
_E.g. Textile chose to invest in js-ipfs-lite, which was probably motivated by a mismatch between what JS-IPFS provides and what the team needed. By asking we hope to identify other high priority needs / wants for JS-IPFS in the browser that we may have otherwise overlooked._
If you spot something overlooked or missed please let me know.
@carsonfarmer thanks for joining the thread and the prompt to look closer at the js-ipfs-lite API. I have a few questions / remarks based on what I see there, which I would like to understand:
The library uses a modular architecture, which is a great way to manage complexity. However, I would like to understand how important that is for Textile. The reason is that pluggable components are incompatible with the idea of sharing a node across multiple browsing contexts (tabs, applications), because they can end up with different setups, which would prevent sharing.
An argument could be made towards, say, sharing the networking stack and allowing pluggable stores. However, that also has complications. For one, tabs (on different origins) will not be able to access data that is on disk without tunneling it over the network. From a UX perspective it also means there is no canonical place for seeing all the data.
That being said, I do recognize value in e.g. an in-memory store which is not persisted, which could be achieved differently, e.g. a DAG staging area (something along the lines of https://github.com/ipfs/js-ipfs/issues/3033#issue-617835092) which could address the use case without complicating node sharing.
I see that node Buffer is used as the canonical representation for data. Is that motivated by convenience or some other reasons?
The current approach attempts to avoid bringing node baggage into the browser; that is, ArrayBuffer|ArrayBufferView captures all binary types (including node Buffer, it being a subclass of Uint8Array), and given that all native browser APIs return those, it appears there is no good reason to demand Buffer in any of the user-facing APIs (removing it from the internals is more complicated).
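A sketch of what that normalization could look like; toBytes is an illustrative name, not an existing API:

```typescript
// Normalize any binary input (ArrayBuffer, Uint8Array, node Buffer, DataView)
// into a plain Uint8Array view without copying the underlying memory.
const toBytes = (data: ArrayBuffer | ArrayBufferView): Uint8Array => {
  if (data instanceof ArrayBuffer) {
    return new Uint8Array(data)
  }
  // Covers Uint8Array, node Buffer (a Uint8Array subclass) and other views
  return new Uint8Array(data.buffer, data.byteOffset, data.byteLength)
}
```

Because node Buffer is itself a Uint8Array, it flows through the second branch unchanged, so no browser code ever needs to know Buffer exists.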
I would say that, in the interest of sharing the node across multiple browsing contexts, a less modular design might be ideal. Additionally, while there are other considerations for the modular design (it allows much smaller packages and dynamic imports if done properly), the bundle size savings are so large that the benefits from reuse would certainly be larger than the benefits of smaller one-off bundles. So to answer the question: not that important for us (though I would still advocate for a modular design in terms of its development).
As far as Buffer goes, that is purely convenience. All the existing tooling expects/supports it. But we (Textile) internally have been trying to move most of our code and APIs to Uint8Array. So happy to support that push here as well.
Thanks for the discussion @Gozala. Here are details as we discussed :)
The only configuration option we rely on is passing a rendezvous server as a swarm address. Previously this was the only way to connect to a rendezvous server, but another way to do this has since been added: https://github.com/ipfs/js-ipfs/issues/2508#issuecomment-582501319
Other than that we disable preload and bootstrap, in order to create the instance faster. This should however be less of a problem with the shared instance.
We use the following:
- ipfs.add
- ipfs.cat
- ipfs.dag
- ipfs.pubsub
- ipfs.libp2p

I believe that's all, but I might be missing something. The main reason we're using ipfs.libp2p is because of pubsub room. Important to note here is that this is only supported in js-ipfs and not in ipfs-http-client. This should probably be added to the readme of pubsub room as well in order to not confuse people. cc @achingbrain
Another thing to note is that we don't use garbage collection right now. This was only recently introduced into js-ipfs. Instead we assume that all data that we add to ipfs will just remain in storage, and sync it from the network if it's not there.
We are currently in a situation where our software package (3box) is run on multiple tabs across multiple domains. There are two main things that could be solved using worker threads:
Don't have strong opinions here. On the security of things like the pinning api across multiple origins, I tend to lean towards the side of usability for now rather than origin-specific pins. This gives us the chance to experiment more.
The question of codecs is important. Initially a requirement from our side would be that we can provide a custom codec to the worker. We are working on codecs for signed and encrypted data, which we will start using as soon as it's ready.
Thanks @oed for taking the time to discuss this with me & for publishing more details here. I would like to follow up on a few points:
I believe that's all, but I might be missing something. The main reason we're using ipfs.libp2p is because of pubsub room. Important to note here is that this is only supported in js-ipfs and not in ipfs-http-client. This should probably be added to the readme of pubsub room as well in order to not confuse people. cc @achingbrain
This appears to be a known issue (https://github.com/ipfs-shipyard/ipfs-pubsub-room/issues/28), so maybe we should revive the conversation in that thread.
ipfs.libp2p itself has a pretty large API surface and I think it would really help if we could identify whether ipfs.pubsub lacks some of the functionality from ipfs-pubsub-room, and if so maybe it would be best to assess how this functionality can be provided instead of trying to cover the whole ipfs.libp2p API surface. From what I recall during our discussion, the missing functionality was the ability to send messages to peers directly. Is there anything else?

> We want to be able to share data across origins. Currently we rely on the data being available in the network, but this is less than ideal. Instead it would be nice if nodes on multiple origins could share block storage. We are currently working on a solution for this by putting the blockstore api into an iframe.
I put some more thought into this after our conversation, and came to the conclusion that in the first iteration it would be best to do the following:
- `ipfs-message-port-client` library that can be connected to the `ipfs-message-port-server` over a MessagePort instance.
- `ipfs-message-port-server` library that can take an IPFS node instance and provide it remotely to the `ipfs-message-port-client`.

This should keep pretty much the same flexibility as IPFS has today (other than WebRTC) and overcome the problems discussed (e.g. with message-port-server + js-ipfs running inside a SharedWorker).

@oed You also mentioned some inconvenience with using certain API(s); as I recall it was the fact of having to iterate over things. I'm afraid I lost those details, could you please share the specifics? It would be a real help in informing our decisions.
From what I recall during our discussion, missing functionality was to ability to send messages to peers directly. Is there anything else ?
I believe that's it but I haven't looked too deeply at the pubsub-room implementation.
Don't have a strong opinion on the message-port stuff.
could you please share specifics.
Previously we did:
const cid = (await ipfs.add(buffer)).hash
Now we have to do:
const cid = (await (await ipfs.add(buffer)).next()).value.cid
Perhaps there is a way to do this using the DAG api? I've been unable to add the same buffer and get the same CID using dag.put and various configs, however.
I have encountered obstacles when running the ipfs.dag API tests on my message-port based implementation. That is because ipfs.dag.put can take either a plain data structure, e.g. {foo: "bar"}, or something like a DAGNode from ipld-dag-pb.
And I'm guessing raw buffers as well.
By looking at what ipfs-http-client does, it appears to me that there is no generic API to encode / serialize a node:
This is problematic, as the main thread needs to pass the dagNode somehow to the worker thread (ideally without having to copy things). Therefore some generic way to serialize nodes seems necessary.
It is also worth taking this moment to consider the following: an IPLD node could have something like a toBlob():Blob method which would serialize the node into a Blob (with the codec carried in its type field), so that it could be transferred cheaply or referenced via a blob: url. I believe the IPLD team is working on something along these lines: https://github.com/multiformats/js-multiformats
@haadcode I would love to get some input from you as well. I am also happy to jump on a call to walk through it more closely.
Had a discussion with @achingbrain @lidel @mikeal @autonome today where I proposed to loosen up the backwards compatibility constraint for this work in favor of:
The hope is that this would provide the following benefits:
We discussed the idea of using js-ipfs-lite as a baseline, but collectively we decided that I would instead propose a specific API (inspired by ipfs-lite) which we can discuss further to reach consensus (feedback from the community is welcome).
import Block from '@ipld/block'
import CID from 'cids'
// This in a nutshell is `Promise<A>`, but with an `X` type parameter added to
// communicate errors that may occur. Feel free to ignore the interface definition
// itself.
interface Future<X, A> {
// Maps the resolved value
then<B>(onfulfilled: (value: A) => B | Future<X, B>, onrejected: (error:X) => B | Future<X, B>):Future<X, B>
// Maps error value
then<Y>(onfulfilled: (value: A) => Future<Y, A>, onrejected: (error:X) => Future<Y, A>):Future<Y, A>
// Maps both error & value
then<Y, B>(onfulfilled: (value: A) => B | Future<Y, B>, onrejected: (error:X) => B | Future<Y, B>):Future<Y, B>
catch<B>(onrejected?: (error:X) => B | Future<never, B>): Future<never, B>
}
interface IPFSService extends CoreService {
dag:DAGService
blob:BlobService
}
interface DAGService {
// Note: path isn't optional; higher level APIs can wrap this to deal with
// param skipping etc. Get returns a `Block`; if the block can't be (fully)
// resolved the result is a `ResolveFailure` which will contain `remainderPath`
// and the `Block` that was resolved during this get.
get(cid:CID, path:string, options?:GetOptions):Future<GetError, Block>
// Note: Block knows about format / hashAlg so those options are gone.
put(block: Block, options?: PutOptions):Future<PutError, CID>
// Note: path isn't optional higher level API can deal with that.
// Instead of async iterator you can request ranges so that:
// 1. Amount of coordination between client & server can be adjusted.
// 2. Avoids dangling references across client / server boundaries.
tree(cid: CID, path:string, options?:EnumerateOptions): Future<TreeError, string[]>
}
interface BlobService {
// Note: This is instead of `ipfs.add`; whether it should be called `add`
// instead is up for debate. However it is deliberately proposed with a
// different name because:
// 1. It just produces a CID, and unwrapping an async iterator to get it is
// an inconvenience that has been pointed out.
// 2. It is a personal preference to have a function that always works, as
// opposed to one that fails due to unsupported inputs most of the
// time.
// 3. In the browser this is almost always what you want to do, as opposed
// to streaming inputs.
put(blob:Blob, options?:PutBlobOptions):Future<PutBlobError, CID>
// Note: This is in place of `ipfs.cat`; whether it should be called `cat`
// instead is up for debate. It is deliberately proposed with a different
// name because it returns a `Blob` instead of `AsyncIterable<Buffer>`. That
// is because blobs are well supported in browsers, can be read in different
// ways, and avoid server/client coordination. Furthermore a higher level
// API providing API compat could be added on top, which would create
// `AsyncIterable`s all on the client, performing range reads with this
// API under the hood.
get(ipfsPath:string, options?:GetBlobOptions):Future<GetBlobError, Blob>
}
interface AbortOptions {
timeout?: number
signal?: AbortSignal
}
interface GetOptions extends AbortOptions {
localResolve?: boolean
}
interface PutOptions extends AbortOptions {
pin?: boolean
}
interface EnumerateOptions extends AbortOptions {
recursive?: boolean
// will skip `offset` entries if provided
offset?: number
// Will include at most `limit` number of entries
limit?: number
}
interface PutBlobOptions extends AbortOptions {
chunker?:string,
cidVersion?: 0|1,
enableShardingExperiment?: boolean,
hashAlg?: string,
onlyHash?: boolean
pin?: boolean
rawLeaves?: boolean
shardSplitThreshold?: number,
trickle?: boolean
wrapWithDirectory?: boolean
}
interface GetBlobOptions extends AbortOptions {
offset?: number
length?: number
}
type PutBlobError =
| WriteError
type GetBlobError =
| NotFound
| ReadError
type GetError =
| ResolveFailure
| NotFound
| DecodeError
| ReadError
| AbortError
| TimeoutError
type PutError =
| EncodeError
| WriteError
| AbortError
| TimeoutError
type EnumerationError =
| DecodeError
| ResolveFailure
| NotFound
| ReadError
| AbortError
| TimeoutError
interface TreeError extends Error {
error: EnumerationError
// Entries that were successfully enumerated before encountering the error
value: string[]
}
interface ResolveFailure extends Error {
remainderPath: string
value: Block
}
interface NotFound extends Error {}
interface NoSuchPath extends Error {}
interface EncodeError extends Error {}
interface DecodeError extends Error {}
interface WriteError extends Error {}
interface ReadError extends Error {}
interface PartialFound extends Error {}
interface AbortError extends Error {}
interface TimeoutError extends Error {}
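To illustrate how the offset / limit options in EnumerateOptions let the client drive paging instead of holding an AsyncIterable open across the message channel, here is a hypothetical sketch; the tree function is mocked and listAll is an illustrative name, not part of the proposal:

```typescript
// Shape of the proposed tree call, reduced to what the sketch needs.
// (Hypothetical: the real service would take a CID plus richer options.)
type TreeFn = (path: string, options: { offset: number, limit: number }) => Promise<string[]>

// Client-driven paging: request ranges until a short page signals the end.
// Each round trip is a plain request/response, no dangling references.
const listAll = async (tree: TreeFn, path: string, pageSize = 64): Promise<string[]> => {
  const entries: string[] = []
  for (let offset = 0; ; offset += pageSize) {
    const page = await tree(path, { offset, limit: pageSize })
    entries.push(...page)
    if (page.length < pageSize) break // short page => enumeration complete
  }
  return entries
}
```

An AsyncIterable-compatible wrapper could be layered on the client side of this without the server ever having to keep iterator state alive.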
A few extra notes I'd like to make:

- dag.get / dag.tree could just take an ipldPath:string argument instead of CID + path, since CIDs get serialized over the wire anyway. That could simplify the dag interface, especially since it could reduce the amount of chatter between server and client across the message channel.
- I previously sketched an ipfs.add (See: https://github.com/ipfs/js-ipfs/issues/3033) which attempts to do everything that ipfs.add does with the added message channel constraints. However, it turns out both 3Box and Textile just add a file to IPFS and want a CID back (in browsers), which is why the blob API is proposed as a very effective and simple API to do just that.
- I have not included ArrayBuffers for DAG blocks, mostly because I suspect it might be best to make that part of the Block API.

I want to elaborate a bit more on my "transfer vs copy" point. Consider the following example that assumes the current API:
const dagNode = {
version: 1,
message: new TextEncoder().encode("Hello World")
}
const cid = await ipfs.dag.put(dagNode)
As things stand today, the following assertion will hold:
new TextDecoder().decode(dagNode.message) === "Hello World"
However, if things are moved across a message channel to the workers, that is not going to be the case if the underlying ArrayBuffers are transferred (which is the most efficient way) instead of being copied. Instead the following will be true:
dagNode.message.byteLength === 0
This is obviously a simplified example. If you imagine that dagNode is assembled in one part of the program, with bytes received from another part of the program that holds multiple references to them, the transfer (which effectively empties them) may lead to bugs in user space.
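The detaching behavior can be demonstrated with a bare MessageChannel; this uses Node's worker_threads channel for illustration, browser MessagePorts behave the same way:

```typescript
import { MessageChannel } from 'worker_threads'

const bytes = new TextEncoder().encode('Hello World')
const { port1, port2 } = new MessageChannel()

port2.once('message', (message: Uint8Array) => {
  // The receiver gets the bytes intact and can decode 'Hello World'
  console.log(new TextDecoder().decode(message))
  port2.close()
})

// Transferring moves the underlying ArrayBuffer instead of copying it...
port1.postMessage(bytes, [bytes.buffer])

// ...so the sender's view is detached immediately after the call
console.log(bytes.byteLength) // 0
```

Omitting the transfer list (`port1.postMessage(bytes)`) would structured-clone the bytes instead, leaving the sender's view untouched at the cost of a copy.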
That is why I think it is important to have an API that:
@carsonfarmer @oed it would be good to get your eyes on https://github.com/ipfs/js-ipfs/issues/3022#issuecomment-641542193 as well.
To me https://github.com/ipfs/js-ipfs/issues/3022#issuecomment-641542193 makes sense as an interface to ipfs data @Gozala
A few questions that arise:

1. Is the idea to have another abstraction above the DAGService api that allows you to use the codec that you need? (since you are putting raw blocks)
2. Does this only cover a shared storage layer? (i.e. no shared networking)
3. If there is shared networking, how can I interact with the libp2p node? (connect to nodes etc)
4. What about the pin api? Looks like you can only pin with dag.put right now?
5. Can the blob api be implemented on top of the dag api?
Hey @oed thanks for taking the time. I should have added more context; there is some in https://github.com/ipfs/js-ipfs/issues/3022#issuecomment-641541430. I will try to address all of the questions below.
- Is the idea to have another abstraction above the DAGService api that allows you to use the codec that you need? (since you are putting raw blocks)
Not exactly. The new IPLD stack changes things a bit: the imported Block refers to Block coming from https://github.com/ipld/js-block (not to be mistaken for https://github.com/ipld/js-ipld-block), which provides an interface to encode / decode blocks. DAGService here will just provide a way to store / retrieve blocks to / from the IPFS node.
I do however refer to another abstraction layer in the comments. That is mostly to say that we can provide API compatibility with the current / future JS-IPFS API by wrapping this DAGService API, which is intended to be as lean a layer as possible on top of the raw MessagePort.
- Does this only cover a shared storage layer? (i.e. no shared networking)
There are more details on the current plan in https://github.com/ipfs/js-ipfs/issues/3022#issuecomment-635648846 to address specifically this, though the idea is that there will be:

1. An IPFSClient class that will provide an API to the remote IPFS node you can attach to:

```ts
class IPFSClient implements IPFSService {
  static attach(port:MessagePort):IPFSClient {
    // ...
  }
  // ...
}
```
   Calling attach returns an IPFSClient instance.

2. An IPFSServer class that can provide an IPFS node (which could be either js-ipfs-lite or a full fledged js-ipfs) over the message port:

```ts
const main = async () => {
  const ipfs = await IPFS.create({ /* ... */ })
  const server = new IPFSServer(ipfs)
  // self is the SharedWorker global scope. For every tab connecting to it
  // we wire it up with the server
  self.onconnect = ({ ports }) => server.connect(ports[0])
}
```
- If there is shared networking, how can I interact with the libp2p node? (connect to nodes etc)
In phase 1 you're operating an IPFS node, so you're free to do anything. I am also happy to assist in trying to work out the details of how to provide some convenience in terms of wiring things up across the worker / main thread.
- What about the pin api? Looks like you can only pin with dag.put right now?
There is also ipfs.blob.put(blob, { pin: true }), which is essentially a simplified version of ipfs.add that does only what your team and Textile are doing with ipfs.add; if that is not the case, this would be the time to point it out.
If you are referring to the ipfs.pin.* APIs, they did not make the list because they are not something your team or the Textile team has mentioned as being used. If it is something that 3Box depends on, I'll make sure to add it.
5. Can the blob api be implemented on top of the dag api? (asked this before)
I am not sure I understand what you are referring to as the blob API, could you elaborate please?
Thanks for clarifying @Gozala, this being a new ipfs interface makes sense.
Not exactly. The new IPLD stack changes things a bit and that imported Block refers to Block coming from https://github.com/ipld/js-block
This is clear, so developers using this new interface are expected to import @ipld/block in order to be able to put object into the dag?
If you are referring to ipfs.pin.* APIs that did not made a list because it is not something you team or Textile team has mentioned as being used.
Sorry about that, slipped my mind. Pinning is definitely something we need. At various points it might be useful to pin objects that you already have locally.
I am not sure understand what are you referring to as blob API, could you elaborate please ?
I'm basically wondering why you can't use ipfs.dag.put instead of ipfs.blob.put? Does the blob api do something that the dag api is not capable of?
This is clear, so developers using this new interface are expected to import @ipld/block in order to be able to put object into the dag?
js-ipfs should probably expose the Block interface it is using internally for others to use. If it doesn’t then people will end up importing their own and then they are going to end up re-loading a bunch of codecs and other multiformats plugins that IPFS had already loaded to use.
This is clear, so developers using this new interface are expected to import @ipld/block in order to be able to put object into the dag?
I was imagining that Block would be exposed by IPFSService, maybe as IPFSService.Block or IPFSService.dag.Block.
Sorry about that, slipped my mind. Pinning is definitely something we need. At various points it might be useful to pin objects that you already have locally.
I'll make sure to incorporate it!
I'm basically wondering why you can't use ipfs.dag.put instead of ipfs.blob.put? Does the blob api do something that the dag api is not capable of?
ipfs.blob.put(blob, {...}) does essentially what ipfs.add(blob, {...}) does, except it takes just one kind of input, a Blob (well, a File as well, which is a subclass), and returns a CID. This is based on the feedback I got from talking to you and @carsonfarmer, where both teams seem to use a very tiny subset of .add functionality. It also addresses the inconvenience you've mentioned in https://github.com/ipfs/js-ipfs/issues/3022#issuecomment-635984544
You could technically do it with ipfs.dag API, but you'd need to chunk, create unixfs blocks etc...
js-ipfs should probably expose the Block interface it is using internally for others to use
It would need to be added to the exports for core and the http client.
@Gozala it would be preferable if ipfs-message-port-client used the same mechanism.
Nb. exposing dependencies like this ties our hands around versioning as the module's exposed API essentially becomes our API so when they break, we break, even if the only place they are exposed is via these exports. It also complicates the TS effort as these modules start needing type definitions where they don't already have them.
If pressed I'd rather not export anything like this for that reason though I can see the utility.
Pinning is definitely something we need. At various points it might be useful to pin objects that you already have locally.
@oed I'm curious as to why you are pinning, you say further up the thread that you aren't using GC, pinning is really only there to stop blocks from being garbage collected.
Now we have to do:
const cid = (await (await ipfs.add(buffer)).next()).value.cid
Maybe do this instead:
const { cid } = await last(ipfs.add(buffer))
You can use it-last or something from streaming-iterables to make working with async iterators less verbose.
https://github.com/ipfs/js-ipfs/issues/3022#issuecomment-641542193
My feeling on this API is that we've always tried to ensure that the API between ipfs-core and the ipfs-http-api remains the same (within reason). That lets the user switch between ipfs implementations largely by changing their setup only, they should not need to make structural changes to their application code.
It also means that we can service both implementations with the same test suite, the same documentation, the same tutorials, and that having a canonical way of performing certain actions is possible which helps immeasurably when responding to user questions here, on discuss, IRC & stack overflow.
If we do things like introduce ipfs.blob.* when there's no analog in core/http, change which positionals are optional[^1] or start adopting new modules that are incompatible with the rest of the IPFS ecosystem all that starts to go out of the window.
I think for an initial release it would be acceptable to say that not all methods or argument types are accepted, but the API that is supported should have the same methods and arguments types/names/positions.
So for example keep the ipfs.add method, but maybe in the first iteration it only accepts a String, Uint8Array or a Blob and the async iterator it returns is just a hack to maintain compatibility with accepting streams of these things.
We ship that, perhaps add some notes to the docs about what is supported where and people can start to experiment with it so we can get some feedback on our ideas when they use actual working code.
Maybe the performance is bad in the first release, I think that's ok, but I don't think it's ok to have an incompatible API for the message port client because it increases the cognitive load on the user and the maintenance/support/documentation load on the maintainers - we want this to be a first class implementation and not an experimental client API.
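The "async iterator as a compatibility hack" could look roughly like this; putBlob and the result shape are illustrative stand-ins for the real message-port round trip, not existing APIs:

```typescript
interface AddResult { path: string, cid: string, size: number }

// Hypothetical single-shot RPC: one message to the worker, one result back.
// (Stand-in for the real round trip; the cid here is a fake placeholder.)
const putBlob = async (data: Uint8Array): Promise<AddResult> => {
  return { path: '', cid: 'bafy-fake-cid', size: data.byteLength }
}

// API-compatible ipfs.add: still returns an async iterator so that existing
// `for await` consumer code keeps working, but it only ever yields once.
async function * add (data: Uint8Array): AsyncIterable<AddResult> {
  yield await putBlob(data)
}
```

Existing consumer code like `for await (const { cid } of ipfs.add(data))` would then keep working unchanged, even though only a single result ever crosses the message channel.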
I'm curious as to why you are pinning, you say further up the thread that you aren't using GC, pinning is really only there to stop blocks from being garbage collected.
When GC was introduced in js-ipfs it was disabled by default. If this setting has been changed it has not been properly communicated. Please let me know what the default is, could definitely be a problem if this has changed!
Maybe do this instead
Makes sense 👍
The only way GC runs is if it is manually triggered with await ipfs.repo.gc().
There are no plans to run this on a timer or similar right now, but if that changes you can bet we'll shout about it from the rooftops first!
Thanks for clarifying @achingbrain 🙏
As to your question: we started pinning some things in preparation for GC, because back then we didn't know it was disabled by default. It seems like good practice to pin the things you want to keep around anyhow.
> My feeling on this API is that we've always tried to ensure that the API between ipfs-core and the ipfs-http-api remains the same (within reason). That lets the user switch between ipfs implementations largely by changing their setup only, they should not need to make structural changes to their application code.
> It also means that we can service both implementations with the same test suite, the same documentation, the same tutorials, and that having a canonical way of performing certain actions is possible which helps immeasurably when responding to user questions here, on discuss, IRC & stack overflow.
There are undoubtedly a lot of benefits to doing it that way, and I would wholeheartedly agree in a broader context. Context here is, however, crucial: going with the same interface comes with tradeoffs, like never really being able to change / improve things without disrupting the whole community.
In this context we are working in tandem with the group so we can validate all our assumptions and also use this as an opportunity to:
> If we do things like introduce ipfs.blob.* when there's no analog in core/http, change which positionals are optional[^1] or start adopting new modules that are incompatible with the rest of the IPFS ecosystem all that starts to go out of the window.
As I tried to capture extensively in the interface comments, having a simple bare-bones API does not necessarily mean incompatibility with the IPFS ecosystem. We could, and probably should, add a layer on top that provides API interop. What it merely provides is a choice: the community can ditch all that complexity, or stick with the convenience of the existing API.
We also have a 👍 from the most prominent would-be users of this work, who are interested in the former choice. With this context in mind, I would argue there is a benefit in trying things out with willing members of the community and using that as an opportunity to inform the work on js-ipfs proper.
> I think for an initial release it would be acceptable to say that not all methods or argument types are accepted, but the API that is supported should have the same methods and arguments types/names/positions.
I think the proposed API does that to some degree, which is also why I called out that the blob.* methods might end up being top-level add / cat. However, I also think that is a bad idea because:
The API is going to be incompatible, and masking it under a familiar API is just deceiving: it creates the false assumption that things are compatible, in practice leading to errors. Worse yet, some of those errors could hide in untested code paths, surfacing only in production. Sure, we can document the differences and assume that users notice - or we could just give things different names and be sure that users will notice.
I would like to offer one extra argument in favor of simplification here: I think this can also help the "IPFS in browsers natively" endeavor. I do not believe there is any chance the API as it stands could make it into browsers. This is also an opportunity to consider what such an API could be, and even to exercise it in close collaboration - in other words, to inform ourselves along that axis as well.
As a likely consumer of the proposed work/APIs here, I'd just like to voice an opinion in _favor_ of simplification over compatibility. The IPFS API as it is now is vast and includes APIs that are needed when building and testing IPFS, but are less likely to be used by _apps_ that are simply using IPFS as a library. (Certainly I'm biased here... _we_ only use a small subset of the API, and I don't have much experience with other projects beyond the few mentioned previously in this thread... so take that comment with a grain of salt.) An extensible compatibility layer sounds appealing to me, especially insofar as it helps keep the surface area of the core APIs small and robust.
I think what I've said is a simplification of the existing API. That is, don't implement it all, just implement the top-level files API (e.g. ipfs.add, ipfs.cat, ipfs.ls, ipfs.get etc) and the DAG API, and limit the inputs these take in the first release, if it helps you get something out of the door quicker.
Introducing a new ipfs.blob API does not make it simpler because now you have ipfs.add and ipfs.blob which sort of do the same job and sort of take the same arguments but sort of don't and they're not supported everywhere and don't really behave the same.
The API can be improved, redundancy can be removed and it can be made leaner, more intuitive and more straightforward and we should talk about this but I feel it's outside the scope of this piece of work.
I would like to share some updates on this work:
ipfs-message-port-client / server has been submitted for code review: https://github.com/ipfs/js-ipfs/pull/3081

- The client implements the js-ipfs API as per documentation. Please note that js-ipfs is in practice less strict than the docs are, implying that uses that work with js-ipfs but are invalid according to the docs will error.
- ipfs.add is compatible with js-ipfs, but it's best to really just pass Blobs.
- Where js-ipfs returns Buffer, you should expect Uint8Array instead. This is not exactly true everywhere right now, but that is certainly the direction.
- Supported subset: ipfs.dag, ipfs.block (was added for tests), ipfs.add, ipfs.cat, ipfs.files.stat (was added for tests).
- ipfs-message-port-server takes a running IPFS node and exposes it over a message channel.

On the API discussion:
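The client/server split described above can be sketched as a tiny RPC layer: the "server" wraps a node and exposes whitelisted methods over a port, and the "client" mirrors them as async calls. Real MessagePorts deliver messages asynchronously and require structured-clone-safe payloads; this toy port pair delivers synchronously to keep the sketch self-contained. `makePortPair`, `serve` and `connect` are invented names, not the actual ipfs-message-port-* API.

```javascript
// Toy pair of linked ports: posting on one side fires onmessage on the
// other. Real MessagePorts behave like this but asynchronously.
function makePortPair () {
  const a = { onmessage: null }
  const b = { onmessage: null }
  a.postMessage = (data) => b.onmessage && b.onmessage({ data })
  b.postMessage = (data) => a.onmessage && a.onmessage({ data })
  return [a, b]
}

// "Server" side: dispatch incoming { id, method, args } to the wrapped
// node and post the result (or error) back with the same id.
function serve (node, port) {
  port.onmessage = ({ data: { id, method, args } }) => {
    Promise.resolve(node[method](...args)).then(
      result => port.postMessage({ id, result }),
      error => port.postMessage({ id, error: String(error) })
    )
  }
}

// "Client" side: each listed method becomes a promise-returning stub.
function connect (port, methods) {
  let nextId = 0
  const pending = new Map()
  port.onmessage = ({ data: { id, result, error } }) => {
    const { resolve, reject } = pending.get(id)
    pending.delete(id)
    error ? reject(new Error(error)) : resolve(result)
  }
  const client = {}
  for (const method of methods) {
    client[method] = (...args) => new Promise((resolve, reject) => {
      const id = nextId++
      pending.set(id, { resolve, reject })
      port.postMessage({ id, method, args })
    })
  }
  return client
}
```

The same request/response-with-correlation-id pattern works unchanged over a real `MessageChannel` between a page and a SharedWorker.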
- Our use of the ipfs.dag API was pretty much the ipfs.block API.
- We could leave the ipfs.dag API behind and embrace the ipfs.block API as is.

Using just the ipfs.block API makes sense to me. That + the ipfs.pin API seems to be the only things needed for us on the ipfs side.
On the libp2p side our long-term needs are still a bit unclear; it likely depends on how performant the libp2p.pubsub API is with regard to peer discovery.