Some people (myself included) are running websites through IPFS, and relying on gateways (like Cloudflare's) to serve them to users via HTTP(S).
The current IPFS protocol has some limitations when doing this. The biggest one is the inability to set custom headers that an HTTP web server might need, starting with Content-Type.
I propose we create a manifest file that can be stored inside each folder added to IPFS. The manifest would be a YAML (or JSON) document, called for example `.ipfs-gateway.yaml`, and could contain additional metadata that is relevant to IPFS gateways only. For example:
````yaml
version: 1
files:
options:
  # Redirects HTTP -> HTTPS traffic
  alwaysUseHTTPS: true
````
When the IPFS gateway serves a folder, it needs to check if there's a manifest file, and apply the rules configured in it.
The manifest allows adding certain HTTP headers to files served by the gateway. We should explicitly whitelist the allowed headers, since on a shared gateway one app's headers could affect every other app (e.g. imagine someone deploys an app that enables HSTS: that would impact the entire gateway).
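To make the whitelisting idea concrete, here is a minimal sketch of how a gateway might filter manifest-supplied headers. The `ALLOWED_HEADERS` set and the `filter_headers` helper are hypothetical names chosen for illustration; the proposal does not yet say which headers would be permitted.

```python
# Hypothetical allowlist: the proposal does not fix which headers are permitted.
ALLOWED_HEADERS = {"content-type", "cache-control", "content-language"}


def filter_headers(requested: dict) -> dict:
    """Keep only headers whose name (compared case-insensitively) is allowlisted."""
    return {
        name: value
        for name, value in requested.items()
        if name.lower() in ALLOWED_HEADERS
    }


# Headers as they might appear in a manifest (illustrative values).
manifest_headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Strict-Transport-Security": "max-age=31536000",  # dropped: not allowlisted
}
safe_headers = filter_headers(manifest_headers)
```

With this approach an unknown or risky header is silently ignored rather than rejected; a gateway could equally choose to fail loudly, which is a design decision left open here.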
The manifest file should be placed in the root of the folder added to IPFS. Since it's just another document published through the IPFS network, a change in the manifest file would result in the entire folder having a completely different hash, and this is by design.
Many users have asked for custom metadata/headers for files inside IPFS, including in https://github.com/ipfs/faq/issues/224. I believe that, while the ask was for the ability to add metadata to files published on IPFS, in reality what users want/need could be better satisfied with a proposal like this.
Compared to adding support for metadata in IPFS, this proposal has many pros:
… `ipfs add -r folder/` anymore, and the CLI would become complex fast.
The cons:
I like this approach but we have to be careful:
We've been waiting for any sort of progress on UnixFS for a very long time. It's starting to get to the point where I'd rather have something that works now than something that is always at some point in the future. Specifically: mimetype support.
Concretely, there haven't been any changes to UnixFSv2 for over a year!
https://github.com/ipfs/unixfs-v2
I am not very familiar with UnixFSv2, but echoing @dokterbob I am wondering too why this would have a dependency on that?
We've done some soul searching recently on mime-types and realized that, really, they probably don't belong in the filesystem itself anyways. Even if we _did_ store them in the filesystem, you're right about tooling.
Given that, I like this proposal and would be happy to accept a patch (although, to be realistic, I may take a while to review it).
Changes from the original proposal:
alwaysUseHTTPS.

Thank you @ItalyPaleAle for taking the time to think this through.
Very related:
I am in favor of something like this specification for gateways. One of our issues in adopting IPFS for our frontend is that we serve HSTS preload and CSP headers on our domains, making it necessary for these headers to be present on any gateway solution. While this can be solved by making our own gateway, that defeats the purpose of the decentralized nature of IPFS.
I do wonder about integrating HTTPS and custom certificates into gateways as well. Does anyone have a solution to that? Perhaps a DNS-level solution for the gateway to provide a valid certificate?
EDIT:
alwaysUseHTTPS toggle as having valid HTTP -> HTTPS upgrades on the connection is required for HSTS preload websites

@rhyeal I think that @Stebalien's point, which I do subscribe to, is that all decisions related to the transport layer, such as enabling or enforcing TLS, or adding HSTS, should be made outside of the gateway. Indeed, you likely don't want the ipfs-gateway directly exposed on the Internet; you should proxy it with Nginx or something similar. You can then enable HTTPS and HSTS on the gateway.
Note that gateways are also used for serving sites on custom domains. For these domains the user may want to enforce things such as HSTS. I think that there should be a suggested set of allowed headers for shared-domain gateways, while custom-domain gateways can use whatever headers they would like.
I would also highly recommend JSON for the manifest as it is ubiquitous. YAML parsers have notable variation in what they accept and could cause compatibility problems. We can consider automatically compiling YAML to JSON on upload; however, the actual protocol should be JSON.
@kevincox I disagree on HSTS. It should be something set at the infrastructure level and not at the app level. Let's also say you deploy an app, serve it on that custom domain, and enable HSTS. The next version of the app does not contain that key in the manifest: now you can't roll back HSTS easily.
As for format... My preference would be to support both YAML and JSON if possible. We could use the same file and parse it with a YAML parser... since YAML is a superset of JSON, there shouldn't be issues.
I guess I can be convinced about HSTS specifically; however, it would be nice if there was a portable way to specify this in case I ever need to move between gateways. I can see the argument that it can be infrastructure specific, though.
I think the advantages of YAML are small compared to allowing the user to write YAML and convert it to JSON before storing.
Note that YAML also has security concerns including insecure features (often disabled by default but this is a footgun that will keep happening over time) as well as Denial of Service options (a small YAML file can expand to consume arbitrary amounts of RAM). In the end you need to restrict to a subset of YAML which will cause confusion, interoperability concerns and security vulnerabilities (when people fail to make this restriction).
I strongly recommend that we stick to something simple (like JSON) for the protocol end of things.
I agree that YAML can be simpler to write, however we can benefit from that while still keeping the protocol efficient, simple and safe.
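As a rough illustration of the "simple JSON on the protocol side" option, a gateway-side loader could look like the sketch below. It uses only Python's standard `json` module. The size cap, the `load_manifest` name, and the version check are assumptions made for illustration; only the `version: 1` field comes from the example manifest earlier in the thread.

```python
import json

MAX_MANIFEST_BYTES = 64 * 1024  # hypothetical cap, not specified in the thread
SUPPORTED_VERSIONS = (1,)       # "version: 1" is taken from the example manifest


def load_manifest(raw: bytes) -> dict:
    """Parse a JSON manifest, rejecting oversized or unexpected input."""
    if len(raw) > MAX_MANIFEST_BYTES:
        raise ValueError("manifest too large")
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("manifest must be a JSON object")
    if data.get("version") not in SUPPORTED_VERSIONS:
        raise ValueError("unsupported manifest version")
    return data


manifest = load_manifest(b'{"version": 1, "files": []}')
```

Authors could still write YAML locally and convert it before publishing; the gateway would only ever see the JSON form.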
This is looking really interesting.
Regarding redirects: web apps can perform redirects client-side, but for SEO purposes a server-side redirect would be preferred.
So it would be nice to be able to set a status code in the config:
````yaml
...
# Add rules to specific files/patterns
files:
  - name: 'redirect.html'
    redirect: 'other-page.html' # Set the Location header (requires a 3xx status code)
    statusCode: 302 # default could be 301
...
````
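For what it's worth, here is a small sketch of how a gateway might evaluate such redirect rules. The rule keys (`name`, `redirect`, `statusCode`) and the 301 default come from the YAML example above. The glob-style matching, the second rule, and the `resolve_redirect` helper are assumptions for illustration only.

```python
from fnmatch import fnmatchcase

# First rule mirrors the YAML example; the second is a hypothetical pattern rule.
RULES = [
    {"name": "redirect.html", "redirect": "other-page.html", "statusCode": 302},
    {"name": "old/*", "redirect": "index.html"},  # no statusCode: defaults to 301
]


def resolve_redirect(path, rules):
    """Return (status_code, location) for the first matching rule, else None."""
    for rule in rules:
        if "redirect" in rule and fnmatchcase(path, rule["name"]):
            status = rule.get("statusCode", 301)
            if not 300 <= status <= 399:
                raise ValueError("redirect requires a 3xx status code")
            return status, rule["redirect"]
    return None


result = resolve_redirect("redirect.html", RULES)
```

The gateway would then respond with the returned status code and put the second value in the `Location` header; paths that match no rule are served normally.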