The problem
Files at URLs are mutable - they can be changed or deleted at any time, by anyone with access - whether that access is legitimate or not. If the host gets hacked, bad code could be injected for anyone who fetches it.
The solution
Imports from IPFS. IPFS is a content-addressed globally-distributed filesystem. Files are identified by a hash of their contents, so they can never be modified without changing the file identifier. A given hash will always point to the exact same file, forever.
Additionally, as it's globally distributed, the chance of files disappearing forever when they're depended on is nearly zero. No more of this. And none of this.
Implementation options
Right now, the IPFS community seems to be standardizing on ipfs:// as a URI scheme for files on IPFS. We can use that to identify when an import is from IPFS, as opposed to an HTTP(S) import, and act accordingly.
Now that we have the file identifier, there are a few options for fetching it.
Use the local IPFS node. Chances are, if a user wants to import from IPFS they're running a node that we can talk to to resolve files. IPFS nodes have an HTTP API that typically runs (for localhost) on 127.0.0.1:5001, and allows you to get files that way. An import from ipfs://{hash} can be fetched from http://127.0.0.1:5001/ipfs/{hash} - we can also optionally pin files to the user's node so they're reprovided to others on the IPFS network. Full local HTTP API docs are here.
Use a public IPFS gateway. There are a fair number of them, and known gateways and their status are tracked here. File resolution is always at {gateway}/ipfs/{hash}.
Run a local IPFS node. This would be the most difficult to handle, as it's not just a string manipulation to translate an ipfs:// import into an http(s):// import.
Personally? I recommend using the installed local node and falling back to a select list of public gateways.
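As a rough illustration of that recommendation, here is a minimal sketch of the "string manipulation" plus fallback. The local address is the one mentioned above; the gateway hosts and the function names (`ipfsToHttpCandidates`, `fetchIpfsSource`) are assumptions made up for this example, not anything Deno or IPFS provides.

```typescript
// Local address as described above; the gateway list is an illustrative assumption.
const LOCAL_NODE = "http://127.0.0.1:5001";
const FALLBACK_GATEWAYS = ["https://ipfs.io", "https://dweb.link"];

const IPFS_SCHEME = "ipfs://";

/** Returns the HTTP URLs to try, in order, for an ipfs:// specifier. */
function ipfsToHttpCandidates(
  specifier: string,
  bases: string[] = [LOCAL_NODE, ...FALLBACK_GATEWAYS],
): string[] {
  if (!specifier.startsWith(IPFS_SCHEME)) {
    throw new Error(`not an ipfs:// specifier: ${specifier}`);
  }
  // Everything after the scheme: "{hash}" or "{hash}/some_package/mod.ts".
  const rest = specifier.slice(IPFS_SCHEME.length);
  if (rest.length === 0 || rest.startsWith("/")) {
    throw new Error(`missing hash in specifier: ${specifier}`);
  }
  // Strip trailing slashes from each base so we never emit "//ipfs/".
  return bases.map((base) => `${base.replace(/\/+$/, "")}/ipfs/${rest}`);
}

/** Tries the local node first, then each gateway, returning the first success. */
async function fetchIpfsSource(specifier: string): Promise<string> {
  const errors: string[] = [];
  for (const url of ipfsToHttpCandidates(specifier)) {
    try {
      const res = await fetch(url);
      if (res.ok) return await res.text();
      errors.push(`${url}: HTTP ${res.status}`);
    } catch (err) {
      // Connection refused (no local node running), DNS failure, etc.
      errors.push(`${url}: ${String(err)}`);
    }
  }
  throw new Error(`all IPFS sources failed:\n${errors.join("\n")}`);
}
```

Because the hash identifies the content, it should not matter which of these sources answers, as long as the fetched bytes are checked against the hash before use. That check is left out of this sketch.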
How would you be able to identify which package you are importing if it's just a hash in the import statement?
Doc comments on your deps.ts file? Ideally, the imported module would also have a doc comment at the top identifying itself.
The other option is sticking to a convention? You can reference linked data within an IPFS IPLD DAG structure (the stuff behind the hash) by name. This requires 'wrapping' the IPFS content in an external folder, but this isn't particularly burdensome.
import { Something } from "http://127.0.0.1:5001/ipfs/{hash}/some_package/mod.ts"
That also works, and preserves the core immutability features we're looking for. Although, I think that import string would be better as
import { Something } from "ipfs://{hash}/some_package/mod.ts"
to preserve the fallback options I proposed above.
Was going to create the exact same issue :sweat_smile:
Another protocol that might be suitable as well is p2p://. I think this feature would be used a lot in tandem with import maps. Later, the community could create CLI tools to manage the JSON file more easily, using a registry that maps package names to hashes, so you could run module-management-tool add library and it would automatically add library with the right hash.
{
  "imports": {
    "library/": "p2p://<hash>/"
  }
}
import foo from 'library/foo.js'
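To make the mapping concrete, here is a simplified sketch of how a specifier like 'library/foo.js' could be rewritten using such a map. It only handles exact keys and trailing-slash prefix keys; real import map resolution also involves URL parsing and scopes, and `resolveWithImportMap` is a hypothetical helper, not a Deno API.

```typescript
type ImportMap = { imports: Record<string, string> };

/** Rewrites a specifier using the map, or returns null if no key applies. */
function resolveWithImportMap(specifier: string, map: ImportMap): string | null {
  // An exact key wins outright.
  if (Object.prototype.hasOwnProperty.call(map.imports, specifier)) {
    return map.imports[specifier];
  }
  // Otherwise pick the longest key that ends in "/" and prefixes the specifier.
  let best: string | null = null;
  for (const key of Object.keys(map.imports)) {
    const isPrefixKey = key.endsWith("/") && specifier.startsWith(key);
    if (isPrefixKey && (best === null || key.length > best.length)) {
      best = key;
    }
  }
  if (best === null) return null;
  // Append the remainder of the specifier to the mapped address.
  return map.imports[best] + specifier.slice(best.length);
}
```

With a map of "library/" to "p2p://<hash>/", the specifier 'library/foo.js' becomes "p2p://<hash>/foo.js". Note that a key ending in "/" needs an address that also ends in "/".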
And to go the extra mile, that registry could just be another JSON file on IPFS, managed by a decentralized, blockchain-based organization that is economically incentivized to keep a high-quality registry. It could have different channels, like testing and stable, where packages are reviewed carefully so that no funny scripts make their way into the stable registry.
IPFS is built on libp2p - it has both Node.js and in-browser support.
So Deno just needs to support that lib, and then it can do a bunch of p2p interactions.
From a security perspective, IPFS is not completely necessary, nor does it actually solve the issue of content-defined library imports - since it provides no guarantee that the requested file would internally use a secure hash scheme to import its own dependencies, and so would their own dependencies' dependencies, and so on.
I think it would also be productive to consider a URI scheme that is independent of any specific protocol (e.g. https/ipfs/p2p/magnet) and instead:
protocol://path/some_package@version/module.{hash}.ts
strict secure mode in which every imported file must include a hash (or, at the very least, files that are fetched from external sources).
For compatibility and future-proofing, it seems reasonable to use IPFS's multiformats scheme, which supports various kinds of hash algorithms.
Edit:
The filename doesn't necessarily need to embed the hash. This scheme may be just as secure (not completely sure, I need to consider it further):
protocol://path/some_package@version/module.ts#cid={hash}
(Using a key=value scheme the # part can be made extensible to allow for other metadata such as digital signatures etc.)
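A small sketch of how such a fragment could be split off and read locally. The "&" separator between pairs and the `sig` key in the usage note are assumptions for illustration; the proposal above only says the # part uses key=value and could be extended.

```typescript
interface ParsedImport {
  /** The part that would actually be requested from the server. */
  url: string;
  /** Local-only metadata taken from the # part, e.g. "cid". */
  meta: Map<string, string>;
}

function parseImportMetadata(specifier: string): ParsedImport {
  const meta = new Map<string, string>();
  const hashIndex = specifier.indexOf("#");
  if (hashIndex === -1) return { url: specifier, meta };

  const url = specifier.slice(0, hashIndex);
  const fragment = specifier.slice(hashIndex + 1);
  // Assumed convention: pairs separated by "&", key and value separated by the first "=".
  for (const pair of fragment.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    meta.set(decodeURIComponent(key), decodeURIComponent(value));
  }
  return { url, meta };
}
```

For example, parsing "https://example.com/mod.ts#cid=Qm123&sig=abc" would give the URL without the fragment, plus a map where `cid` is "Qm123" and `sig` is "abc".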
After thinking about this further I've found a simple alternative solution that guarantees 100% content addressed safety and reproducibility for the entire dependency tree:
It would look like this:
protocol://domain/path/name@version/module.ts#lock_cid=QmcRD4wkPPi6dig81r5sLj9Zm1gDCL4zgpEj9CfuRrGbzF
Where lock_cid is a content identifier (basically a hash in a flexible format) for a lock file that would include the hash of the target file (here module.ts) as well as the hashes of all imported files in the entire dependency tree (possibly including imports from the standard library):
The file might look something like this (or it might use JSON etc.; this format is only shown for simplicity):
https://my.website/path/name@version/module.ts Qmf8obm7bxrQS1JnjUniJdibcN2kUJy9zz732sr7o3dxtn
https://my.website/path/name@version/utils.ts Qmeg1Hqu2Dxf35TxDg18b7StQTMwjCqhWigm8ANgm8wA3p
https://my.website/path/name@version/methods.ts QmZfSNpHVzTNi9gezLcgq64Wbj1xhwi9wk4AxYyxMZgtCG
https://someother.website/path/name@version/othermodule.ts QmbKxNNCxBox7Cmv3jiUZbiG3zpzmtnYzVUuKHxfAjvpyH
https://someother.website/path/name@version/othermoduleutils.ts QmPwwoytFU3gZYk5tSppumxaGbHymMUgHsSvrBdQH69XRx
https://deno.land/std@version/async/delay.ts QmaLRet8qeYqNaq8xJeiqwjNnukSo3uEA8oWsDLoxxBv4Q
https://deno.land/std@version/async/deferred.ts QmWZtn3ahqqpGBBRZqPdthcWz2n1rxc1UuiDoWXrgrHKzZ
...
Since the lock file is content addressed, it can be fetched from anywhere, either from IPFS, or from the web server itself (say in https://my.website/path/name@version/module.ts.lock).
The lock file can also be effectively used as an IPFS-like merkledag (though unlike in IPFS, it doesn't represent a directory structure, but a collection of references to various sources), but since all the references use cids, they can all be potentially fetched from IPFS (and in parallel, which may also improve performance).
Technically, if some of the imports in the dependency tree already refer to their own lock file, then it may not be strictly necessary to include them in the lock file. However, since the storage requirements of a structure like this are relatively minimal by modern standards, it may be better not to rely on multiple sources and to include everything in one file (even if technically redundant).
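For what it's worth, the simple "URL, whitespace, hash" format shown above is easy to read back. The following is only a sketch under that assumption (one entry per line, exactly two fields); `parseLockFile` and `matchesLock` are hypothetical names, and computing the actual content identifier of a fetched file is left to the caller.

```typescript
/** Maps each module URL to the content identifier recorded for it. */
type LockEntries = Map<string, string>;

function parseLockFile(text: string): LockEntries {
  const entries: LockEntries = new Map();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // tolerate blank lines
    const parts = line.split(/\s+/);
    if (parts.length !== 2) {
      throw new Error(`malformed lock line: ${line}`);
    }
    const [url, cid] = parts;
    // The same URL listed twice with different hashes is a contradiction.
    if (entries.has(url) && entries.get(url) !== cid) {
      throw new Error(`conflicting entries for ${url}`);
    }
    entries.set(url, cid);
  }
  return entries;
}

/** True only if the URL is listed and the computed identifier matches. */
function matchesLock(url: string, actualCid: string, lock: LockEntries): boolean {
  return lock.get(url) === actualCid;
}
```

A file that is not listed at all fails the check, which matches the idea that the lock file covers the entire dependency tree.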
ha! I had similar thoughts here: https://github.com/denoland/deno_website2/issues/406#issuecomment-633298685
@srdjan
Interesting you were going the same way.
I've modified this approach through several iterations and eventually came to a solution that doesn't actually require the library authors to know anything about the existence of a lock file or even the hashing scheme. I'll try to clarify some aspects of the design:
The hash part (#) is never actually sent to the server, it is only for local use:
https://domain/path/name@version/module.ts#lock_cid=QmcRD4wkPPi6dig81r5sLj9Zm1gDCL4zgpEj9CfuRrGbzF
The lock file is individual to the dependency tree of a particular ts (or js) file (there is no consideration of a directory structure here), but by convention, for the above URI, one of its search locations might be:
https://domain/path/name@version/module.ts.lock
However, it doesn't have to be. It can be stored locally or fetched from a p2p network like IPFS.
As an importer of a third-party ts or js file, you would be able to produce, by yourself, a lock file for that particular import and put it practically wherever you want.
Alright, after considering it even more, I've realized that the lock file isn't even strictly necessary to be stored anywhere. Instead, it can be regenerated purely based on the content and structure of the dependency tree for verification purposes.
It's actually pretty simple:
Say I want to import this URI, but I also want a strong proof that ensures that what I get is always the same:
import * as mod from "https://example.com/path/name@version/module.ts"
I use a tool that walks the dependency tree of module.ts in some deterministic order and records the URIs and hashes of all the files it finds and puts the result in some file called module.ts.lock. I annotate the hash of that file into the link like this:
import * as mod from "https://example.com/path/name@version/module.ts#lock_cid=QmaLRet8qeYqNaq8xJeiqwjNnukSo3uEA8oWsDLoxxBv4Q"
Now, whenever someone encounters this annotated link, they have two options:
That's pretty much it.
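Here is a rough sketch of what such a tool could do, and of how the lock could be regenerated for verification. Everything in it is an assumption for illustration: the `Loader` callback stands in for fetching a file and extracting its imports, sorting by URL is one possible "deterministic order", and `toyDigest` is a non-cryptographic stand-in that a real tool would replace with a proper CID or multihash.

```typescript
interface ModuleInfo {
  source: string;
  imports: string[]; // absolute URLs of direct imports
}
type Loader = (url: string) => ModuleInfo;
type Digest = (content: string) => string;

/** Walks the whole dependency tree and renders "URL hash" lines, sorted by URL. */
function buildLockText(entry: string, load: Loader, digest: Digest): string {
  const seen = new Map<string, string>(); // url -> digest of its source
  const stack = [entry];
  while (stack.length > 0) {
    const url = stack.pop() as string;
    if (seen.has(url)) continue; // shared or cyclic imports are visited once
    const mod = load(url);
    seen.set(url, digest(mod.source));
    stack.push(...mod.imports);
  }
  // Sorting makes the output independent of traversal order.
  const lines = Array.from(seen.keys())
    .sort()
    .map((url) => `${url} ${seen.get(url)}`);
  return lines.join("\n") + "\n";
}

/** Regenerates the lock from the current tree and compares its digest. */
function verifyLock(
  entry: string,
  expectedLockDigest: string,
  load: Loader,
  digest: Digest,
): boolean {
  return digest(buildLockText(entry, load, digest)) === expectedLockDigest;
}

// Placeholder only: an FNV-1a-style 32-bit hash, NOT suitable for security.
function toyDigest(content: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    h ^= content.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}
```

The importer would run `buildLockText` once, hash the result, and put that value in `lock_cid`. Anyone can later call `verifyLock` with the same entry URL, and a change to any file in the tree should change the regenerated lock and therefore fail the comparison.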
As an update, this happened. Chrome extended support for custom protocol handlers to include, among others, support for ipfs:// and ipns://.