Git LFS is an extension to git which allows storing large files outside of the main git repository on HTTP hosts. Currently, there are centralized hosting solutions for those files (GitHub, GitLab, etc.). But I realized that it would be really cool if the files could simply be stored on IPFS. That way, people who are not so used to IPFS could just use git and store files with the git LFS extension, while internally things would go to IPFS. This could make it much easier for people to switch to IPFS, and at the same time it would help with the question of hosting git LFS files.
This should be possible although it's unclear how flexible the "oid"s are. We may be able to just use CIDs as oids but I'm not sure.
We played around with this idea a while back. We had a pretty rough working prototype written in Python at one point. Long story short, @Stebalien is right: the OIDs aren't all that flexible, so it is difficult to use CIDs directly. You can create mappings from OIDs to CIDs, but this isn't ideal and leads to many passes over the data. Happy to provide additional feedback here if others want to engage at some point!
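To make the mapping idea concrete, here is a minimal sketch in Python. It is not the prototype mentioned above. It assumes the usual Git LFS convention that an OID is the hex SHA-256 of the raw file bytes, while the CID comes from whatever `ipfs add` reports for the same file. `lfs_oid`, `OidCidMap`, and the CID string are hypothetical names used only for illustration.

```python
import hashlib
import io


def lfs_oid(stream, chunk_size=1 << 20):
    """Return the hex SHA-256 of a binary stream (assumed Git LFS OID format)."""
    digest = hashlib.sha256()
    # Read in fixed-size chunks so large files never sit fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


class OidCidMap:
    """In-memory OID -> CID lookup table (hypothetical helper)."""

    def __init__(self):
        self._by_oid = {}

    def record(self, oid, cid):
        self._by_oid[oid] = cid

    def resolve(self, oid):
        # Returns None when no CID has been recorded for this OID.
        return self._by_oid.get(oid)


# First pass over the data: compute the OID.
oid = lfs_oid(io.BytesIO(b"hello"))

# A second pass (not shown) would add the file to IPFS and yield a CID.
# "QmExampleCid" is a placeholder, not a real CID.
mapping = OidCidMap()
mapping.record(oid, "QmExampleCid")
```

The two separate passes (one to hash for the OID, one to add to IPFS) are where the extra work mentioned above comes from.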
I've been working on implementing an extension in Rust: https://github.com/sameer/git-lfs-ipfs
Based upon the LFS basic transfer API docs, it was originally set up so you would run a web server locally. A dag-pb object would be created mapping object OIDs to their contents (e.g. QmWnE5vczyRHW7CtiRwsXPaQ5BRSbdZh8pAtr3bWGD6SUD):
ipfs add object --> QmObjectHash
ipfs name resolve /ipns/QmPeerId --> QmCurrentHash
ipfs object patch add-link QmCurrentHash <object id (sha256sum)> QmObjectHash --> QmNewHash
ipfs name publish QmNewHash --key=QmPeerId
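The four steps above could be scripted roughly as follows. This is a sketch under assumptions, not the code from the linked repository: it shells out to the `ipfs` CLI, uses `ipfs add --quieter` to get only the final hash, and uses `ipfs object patch add-link` as the link-patching subcommand. `run_ipfs` and `publish_object` are hypothetical helper names, and the flags should be checked against the installed IPFS version.

```python
import subprocess


def run_ipfs(args):
    """Run `ipfs <args>` and return its trimmed stdout (assumes ipfs is on PATH)."""
    result = subprocess.run(
        ["ipfs", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def publish_object(path, oid, key, run=run_ipfs):
    """Add a file, link it under its OID in the current root, and republish."""
    # 1. Add the file; --quieter prints only the final hash.
    object_hash = run(["add", "--quieter", path])

    # 2. Resolve the currently published root; output looks like /ipfs/<hash>.
    current = run(["name", "resolve", f"/ipns/{key}"])
    current_hash = current.rsplit("/", 1)[-1]

    # 3. Create a new root with a link named after the OID.
    new_hash = run(
        ["object", "patch", "add-link", current_hash, oid, object_hash]
    )

    # 4. Publish the new root under the same key.
    run(["name", "publish", f"--key={key}", f"/ipfs/{new_hash}"])
    return new_hash
```

Calling `publish_object("big.bin", oid, "QmPeerId")` would run the real CLI. The `run` parameter exists so the sequence can be exercised with a stub instead of a live IPFS node.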
After digging some more, I found the approach @carsonfarmer mentioned: a custom transfer plus smudge & clean extensions. With smudge & clean, you can map from an object's OID to the hash of the dag-pb block for the object.
The clean step is the only inefficient part because it receives the file over stdin and can't make any assumptions about it. It gets called on the first git lfs status after a file has been modified and on every git lfs add.
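As an illustration of why clean has to touch every byte, here is a rough Python sketch of a clean-style step. It is not the Rust implementation linked above. It reads an arbitrary binary stream once, hashing it while forwarding each chunk to a `sink`, which stands in for whatever actually adds the data to IPFS. `clean_stream` and `sink` are hypothetical names.

```python
import hashlib
import io


def clean_stream(source, sink, chunk_size=64 * 1024):
    """Read `source` to EOF once, forwarding chunks to `sink` while hashing.

    Returns (sha256_hex, size_in_bytes). Nothing is known about the input
    up front, so the whole stream has to be consumed.
    """
    digest = hashlib.sha256()
    size = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)  # contributes to the OID
        sink.write(chunk)     # stand-in for handing the data to IPFS
        size += len(chunk)
    return digest.hexdigest(), size


# Example with in-memory streams; a real filter would read sys.stdin.buffer.
data = b"large file contents" * 1000
sink = io.BytesIO()
oid, size = clean_stream(io.BytesIO(data), sink)
```

Hashing and forwarding in the same loop keeps this to a single pass, but that pass is still proportional to the file size each time clean is invoked.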