Seeding/Caching: We already do a form of "seeding" whenever we visit websites. We just don't allow others to leech from us, due to protocol issues and probably security; the internet was not designed for it.
The concept: "Seeding" can happen while visiting websites, in the form of caching. When I visit website x.com, I download the images and markup files the website hosts.
Suggestion: Mirror the internet into a decentralized IPFS system by building or improving on an IPFS browser extension. People with the extension installed would serve as lightweight nodes, seeding their cached data so that other users of the extension can leech and re-seed it. For security, maybe at first we try it only on image formats and prompt the user to agree for each site s/he visits (annoying at first, but better safe than sorry: otherwise a user could leak/seed info s/he does not wish others to see).
Issues: Security, for one. Also versioning / data integrity (is the image up to date? Are we going to compare and rehash the data on each visit, or only on demand?).
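A minimal sketch of the opt-in rule from the suggestion above. `SeedPolicy`, its methods, and the MIME list are all hypothetical names for illustration, not part of any existing IPFS extension, and Python is used only for brevity (a real extension would be written in JavaScript). Cached data is re-seeded only if it is an image and the user has approved that site:

```python
from dataclasses import dataclass, field

# Assumption: which MIME types count as "image formats" is a policy choice.
IMAGE_TYPES = frozenset({"image/png", "image/jpeg", "image/gif", "image/webp"})


@dataclass
class SeedPolicy:
    """Hypothetical per-site consent store for a lite-node extension."""

    approved_sites: set = field(default_factory=set)

    def approve(self, host: str) -> None:
        # Called after the user answers "yes" to the per-site prompt.
        self.approved_sites.add(host.lower())

    def may_seed(self, host: str, content_type: str) -> bool:
        # Drop parameters such as "; charset=utf-8" before comparing.
        mime = content_type.split(";", 1)[0].strip().lower()
        return host.lower() in self.approved_sites and mime in IMAGE_TYPES


policy = SeedPolicy()
policy.approve("x.com")
print(policy.may_seed("x.com", "image/png"))        # approved site, image
print(policy.may_seed("x.com", "text/html"))        # approved site, not an image
print(policy.may_seed("example.org", "image/png"))  # site never approved
```

Anything the user has not explicitly approved stays private by default, which is the "better safe than sorry" behaviour described above.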
Note: We actually have an extension, but it won't do what you're looking for. Take a look at IPFS Companion.
So, I think the biggest issue here will actually be agreeing on which content is correct. That is, given a URL (Uniform Resource Locator), what is the correct IPFS path or URN (Uniform Resource Name)?
However, this could be really useful in conjunction with a semi-centralized (trusted) service like the Internet Archive to provide the mapping of URL to IPFS hash.
Alternatively, if we could get website operators on-board, we could use dnslink. However, that requires human consensus (which can take a while...).
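For reference, a DNSLink record is a DNS TXT record, conventionally published at `_dnslink.<domain>`, whose value looks like `dnslink=/ipfs/<cid>`. A rough sketch of how a client might pick the IPFS path out of TXT values it has already fetched (the function name is made up, the CID below is a placeholder, and the DNS lookup itself is left out):

```python
from typing import Iterable, Optional


def parse_dnslink(txt_values: Iterable[str]) -> Optional[str]:
    """Return the first /ipfs/ or /ipns/ path found in DNSLink-style TXT values."""
    prefix = "dnslink="
    for value in txt_values:
        # Resolvers often hand TXT data back wrapped in double quotes.
        value = value.strip().strip('"')
        if value.startswith(prefix):
            path = value[len(prefix):]
            if path.startswith(("/ipfs/", "/ipns/")):
                return path
    return None


# Placeholder records; "bafy-placeholder-cid" is not a real CID.
records = ['"v=spf1 -all"', '"dnslink=/ipfs/bafy-placeholder-cid"']
print(parse_dnslink(records))
```

The point is that the site operator, not a third party, attests to the mapping, which is why it needs their cooperation.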
We cannot have a system that is based on human trust. The human element will always fail.
(Proof: We have been there. It's the current form of the internet, with all of its problems.)
There is NO such thing as "too big to fail", even if you find a company that will provide such an archive. There are points of failure in such systems when they don't use decentralized, trustless designs.
What I am suggesting is to brainstorm and see if maybe the developers can come up with a working proposal for the idea mentioned in the OP.
dnslink seems interesting, but it requires more work, and it would need to become part of the core after extensive testing and QA.
> We can not have a system that is based on human trust, The human element will always fail.
> (Proof : We have been there, its the current form of the internet with all of its problems)
It's a little more complicated than that but I agree that the internet could do with significantly less trust.
However, someone or some system needs to attest to the URL -> IPFS mapping. One way to do this is to delegate to website operators (the dnslink class of solutions). Without that, some entity, or a group/network of entities, would need to attest to this mapping. Unfortunately, getting a group to agree on a mapping is a bit tricky, as websites aren't immutable and often change (slightly) every time they're loaded. That's why I suggested a single entity (although you'd definitely want it to be auditable).
The idea of "opening browser's cache" or "re-sharing visited sites" is indeed tempting, but once you start thinking about mapping between mutable HTTP URL and immutable IPFS NURI, it gets unreliable/subjective/centralized quite fast.
Check previous notes/discussions at:
AFAIK there is only one type of resource we might "safely" map in a decentralized and automated way:
JS, CSS, etc. marked with an SRI (Subresource Integrity) hash (mapping SRI hash → CID)
But these are already immutable, signed and hosted via CDN, so I am not sure what the added value of mirroring them would be.
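To make the SRI hash → CID idea concrete, here is a sketch under some loud assumptions. An SRI value such as `sha256-<base64 digest>` carries the same SHA-256 digest that a CIDv1 with the `raw` codec wraps, so the conversion is just re-encoding. The result only equals the CID IPFS would produce if the file is stored as a single raw block; larger files are chunked into a DAG and get a different root CID. The function names are made up for this example:

```python
import base64
import hashlib


def sri_for(data: bytes) -> str:
    """Build a sha256 SRI string for some bytes (helper for the demo only)."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def sri_to_cid(sri: str) -> str:
    """Re-encode a sha256 SRI value as a CIDv1 (raw codec, base32)."""
    algo, _, b64 = sri.partition("-")
    if algo != "sha256":
        raise ValueError("only sha256 SRI values are handled in this sketch")
    digest = base64.b64decode(b64)
    if len(digest) != 32:
        raise ValueError("sha256 digest must be 32 bytes")
    # 0x01 = CID version 1, 0x55 = raw codec,
    # 0x12 = sha2-256 multihash code, 0x20 = digest length (32).
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    b32 = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + b32  # "b" is the multibase prefix for lowercase base32


cid = sri_to_cid(sri_for(b"hello world"))
print(cid)
```

No network or consensus is involved: any node can derive the same identifier from the `integrity` attribute alone, which is why this case is the "safe" one.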
That being said, I would love to hear some fresh ideas :)
You already know the answer, as I am sure some will point me to Swarm or Filecoin. (Yes, I know it's the same team, but it's built on top of IPFS; I do believe the issue MUST be resolved at the core.)
You need a cache system with a versioning method, probably backed by high-resolution timestamps (although timestamps can be tampered with). The cache system will translate domain names into the correct IPFS hash. It needs to be global, decentralized, immune to spoofing, open source (probably) and trustless (and it will probably cost cryptocurrency and need some sort of proof-of-X to make sure it's working correctly, with data integrity).
And we all know what that is. There is no cache system that doesn't require a database.
You need blockchain as a complementary element...
It means that uploading/updating files/data streams will cost cryptocurrency and will take time.
And you also need to incentivize people to keep their nodes running, because we all know from torrents that once you finish downloading whatever it is that you want, you close the client. Even if you enforce a requirement to seed/upload cached data back via the node/browser extension, people will find a way to block it, since it consumes network bandwidth and slows down the local network.
We want to be our own service providers? Then we are going to have to pay people for their services.
I really like the idea, although I am aware of the problems this approach brings (as mentioned above). Thank you @lidel for pointing out the previous discussions.
Having a consensus in regards to the URI -> IPFS mappings is indeed a problem, although it is a problem which might be solvable in a decentralized way, at least for some types of content.
I think that a lot of content on the web can still be considered "static" by _some_ standards, which could be utilized by a browser extension (or an app, if you want to go that route).
Consider articles, for example. A lot of news articles and other written pieces are accessible by permalinks and don't change after publication, in terms of article content. (Granted, you might have corrections or updates here and there, but let's leave that aside for now.)
If we just focus on the essence of these links, i.e. the article content, every node could calculate the IPFS hash of the article and as long as the article _content_ did not change, the hash would be the same. Granted, the nodes would have to agree on what is considered essential, but I think this could be solved by using something like readability - which basically strips away superficial information.
Sticking with the example above, this would result in a decentralized mirror of articles, where every node could verify the URI -> IPFS mapping itself, by recomputing the hash (as long as the original article is still available via http, and given that the "stripping away" is deterministic). As mentioned by @Illasera above, one could even use a blockchain as a complementary element, if you want to move to a more sophisticated, trustless system to keep track of these mappings.
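A toy illustration of the "strip, then hash" idea above. Everything here is an assumption for the sake of the example: the extraction rule (keep only text inside `<article>` and collapse whitespace) is a crude stand-in for something like readability, and a plain SHA-256 hex digest stands in for a real IPFS hash. The important property is only that the stripping is deterministic, so two nodes that see different page chrome still agree:

```python
import hashlib
from html.parser import HTMLParser


class ArticleText(HTMLParser):
    """Collect text that appears inside <article> elements only."""

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0   # how many <article> tags we are currently inside
        self.parts = []  # text fragments found inside <article>

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def content_digest(html: str) -> str:
    """Hash only the whitespace-normalized article text of a page."""
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.parts).split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


page_seen_by_node_a = "<nav>Home</nav><article><h1>Title</h1><p>Body text.</p></article><footer>Ad 1</footer>"
page_seen_by_node_b = "<nav>Home | Login</nav><article>\n<h1>Title</h1>\n <p>Body   text.</p></article><footer>Ad 2</footer>"
print(content_digest(page_seen_by_node_a) == content_digest(page_seen_by_node_b))
```

The hard part is not shown here: all nodes would have to run exactly the same extraction rule, and any change to that rule changes every hash.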
If this could be integrated into an extension/app which already provides an additional benefit (again, sticking with the example above, Pocket and Instapaper come to mind), building up a network of lite nodes might not be impossible.
Granted, there might be some legal and copyright issues one would have to deal with, and users of this system would probably need to be educated enough so they know what's going on and they can opt in or out.
There might be a ton of things I'm overlooking, so I would love to hear your thoughts on this.
@dergigi While doable, I am highly sceptical when it comes to "stripping away superficial information". Consider the questions: Who decides what is not important? Who writes and controls these algorithms?
It is a can of worms.
Just save things as-is with a timestamp, as a historical record.
That way we are left with the more familiar problem of "agreeing on what is the latest snapshot".
If we take a step back from the technical details, the general idea sounds like the thing we would like to see is not just a glorified CDN. It is a distributed version of the Internet Archive :)
Right now, Internet Archive provides us with a neat browser extension that:
Detects dead pages, 404s, DNS failures & a range of other web breakdowns, offering to show archived versions via the Internet Archive's Wayback Machine.
If you add IPFS to the mix and solve URI->IPFS mapping, it will supercharge it by:
URI → IPFS is replaced with URI → list of IPFS snapshots

The URI → chain of snapshots → latest IPFS snapshot mapping remains an open problem. However, I agree with @Illasera that the append-only nature of blockchain-based consensus may be a natural way to solve it (or at least provide a proper incentive to opt in, as with Filecoin).
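A small sketch of what "URI → list of IPFS snapshots" could look like as a data structure. This is a local, in-memory toy: `Snapshot` and `SnapshotIndex` are invented names, the CIDs are placeholders, and no consensus is involved, which is exactly the open problem. It only shows the append-only shape, where old snapshots are never rewritten and "latest" is simply the last entry:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    cid: str          # placeholder for an IPFS content identifier
    observed_at: int  # Unix seconds, as claimed by the recording node


@dataclass
class SnapshotIndex:
    """Append-only map from a URI to the snapshots recorded for it."""

    log: Dict[str, List[Snapshot]] = field(default_factory=dict)

    def append(self, uri: str, snapshot: Snapshot) -> None:
        # Existing entries are never modified or removed.
        self.log.setdefault(uri, []).append(snapshot)

    def history(self, uri: str) -> Tuple[Snapshot, ...]:
        return tuple(self.log.get(uri, ()))

    def latest(self, uri: str) -> Optional[Snapshot]:
        entries = self.log.get(uri)
        return entries[-1] if entries else None


index = SnapshotIndex()
index.append("https://example.com/post", Snapshot("cid-placeholder-1", 1_500_000_000))
index.append("https://example.com/post", Snapshot("cid-placeholder-2", 1_500_086_400))
print(index.latest("https://example.com/post"))
print(len(index.history("https://example.com/post")))
```

Nobody has to agree on a single "correct" hash for a URL any more; the disagreement moves to which log to trust and who may append to it.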
ps. FYI there is a highly relevant discussion and preliminary "recording backend" experimentation work done in:
And some other discussions:
Time to get this back on track: I have noticed that we are heading down the route of "mirroring" the internet. That could be your goal, but it's not mine. (I did mention it in my OP, but as the discussion progressed I had a change of heart.) I wish to focus on seeding cached copies, with a browser extension used as a node, and on translating readable addresses into their correct IPFS hash addresses. As far as I'm concerned, the old centralized internet can stay the way it is; we have oracles that allow us to query info from it into the blockchain. Instead of trying to convert old/existing webpages, let's start anew by updating IPFS itself to work with new files, and with time we can try to give answers for legacy files that already exist.
Note on why Filecoin and friends aren't the answer (Swarm actually got it right):
1.) Filecoin's PR was a disaster when they declared they wanted to bring in accredited investors first.
2.) Filecoin was not the first to do it, nor the best. They are starting to rediscover things that other blockchains have already pulled off better.
3.) Filecoin's _core idea is WRONG: the problem needs to be resolved at the IPFS core_. It CANNOT be a layer built ON TOP of IPFS; it must be IPFS or nothing (Swarm actually got this part right).
My solution: don't reinvent things; use existing solutions.
You don't need new blockchains for this. There are perfectly good, working blockchains as it is that are more than adequate for the job. Actually, there are multiple working solutions.
More details: What we need is a bridge protocol between the blockchains, in the form of a smart contract with an escrow kicker on top. You can store the mapping system on one blockchain or even on MULTIPLE blockchains and accept different currencies as payment methods. You do this by converting the lite-node extension to act as an oracle for trading different cryptos between escrow contracts, and as a payment method.
You can use QTUM/EOS/ETH/NEO and so on to store your mapping on different blockchains. Afterward, you use the lite-node extension to resolve (via a contract call) which entry is the most up to date, by comparing timestamps across all the different blockchains.
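A rough sketch of that last resolution step, with heavy caveats. The contract calls are not shown; `MappingEntry`, `resolve_latest`, the ledger labels, and the CIDs are all hypothetical. It assumes the extension has already read one entry per ledger and just needs to pick the newest. As noted earlier in the thread, timestamps can be tampered with, so this is only as trustworthy as the ledgers that report them:

```python
from typing import Iterable, NamedTuple, Optional


class MappingEntry(NamedTuple):
    ledger: str     # label of the chain the entry was read from
    name: str       # human-readable address being resolved
    cid: str        # placeholder for the IPFS hash it maps to
    timestamp: int  # Unix seconds reported by that ledger


def resolve_latest(entries: Iterable[MappingEntry], name: str) -> Optional[MappingEntry]:
    """Pick the newest entry for a name across all ledgers."""
    candidates = [e for e in entries if e.name == name]
    if not candidates:
        return None
    # Ties on timestamp are broken by ledger label so every node picks the same entry.
    return max(candidates, key=lambda e: (e.timestamp, e.ledger))


entries = [
    MappingEntry("ledger-a", "example.site", "cid-placeholder-old", 1_600_000_000),
    MappingEntry("ledger-b", "example.site", "cid-placeholder-new", 1_600_000_500),
    MappingEntry("ledger-a", "other.site", "cid-placeholder-x", 1_700_000_000),
]
print(resolve_latest(entries, "example.site"))
```

The deterministic tie-break matters: without it, two lite nodes reading the same ledgers could resolve the same name to different hashes.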
This is getting waaay off topic.
Either way, it is a valid discussion that can benefit IPFS as a whole, even on a conceptual level. Let it be, please and thank you.