Go-ipfs: Update S3 code to use s3gof3r

Created on 30 Nov 2015 · 6 Comments · Source: ipfs/go-ipfs

The original S3 code is no longer being maintained and has error reporting issues. The current workaround (using the datastore with an S3 FUSE mount such as goofys) is pretty slow.

Update the S3 code to use s3gof3r (https://github.com/rlmcpherson/s3gof3r), which looks like the best option for performance; see https://github.com/ipfs/infrastructure/issues/89#issuecomment-160481264

In the future some caching may be needed, but this ticket is complete once a basic integration is in. Optimization will depend on benchmarking.

help wanted, topic/repo

All 6 comments

I am currently looking into backing IPFS with FLOSS S3-compatible object stores. Finding https://github.com/ipfs/go-ipfs/blob/master/thirdparty/s3-datastore/datastore.go already made me suspect a strong binding to Amazon's infrastructure. As https://github.com/ipfs/notes/issues/9 also revolves around plain Amazon-branded S3, I am interested in your outlook:

Could IPFS also be backed by FLOSS distributed storage infrastructure?

https://github.com/jbenet/go-datastore/ leads me to believe Redis would already be possible, but the documentation scattered between https://github.com/jbenet/random-ideas/issues/33 and https://github.com/ipfs/faq/issues/86#issuecomment-167941955 does not explicitly mention how to use different block storage providers. Unfortunately, the suggested Go package also doesn't mention third-party S3-compatible API providers.

https://github.com/ipfs/specs/tree/master/repo suggests creating https://github.com/ipfs/specs/blob/master/repo/s3-repo instead of (4) above.

According to https://groups.google.com/d/msg/ipfs-users/6POFh0EdnVI/rPSKGqoREgAJ there has already been some work at https://github.com/ipfs/go-ipfs/tree/s3, https://github.com/ipfs/go-ipfs/pull/1261 and https://github.com/ipfs/go-ipfs/pull/1488, all of which need to be revamped either way.

Would there be any interest in supporting Ceph, Swift, GlusterFS, Eucalyptus and others as distributed storage clusters for large-scale IPFS nodes? Have a look at how ownCloud _Enterprise_ tricks their S3 client to use Ceph instead.

I am trying to bring S3 in as a datastore for IPFS. As a POC, I changed the openDefaultDatastore function in defaultds.go to initialize the repo with an S3-backed datastore for blocks and leveldb for everything else. The S3 datastore uses the s3gof3r library to access the S3 bucket. I've implemented the Put, Get, Has and Batch functions of the S3 datastore. I didn't implement the Query function, as the s3gof3r library doesn't support listing keys. IPFS runs fine with these changes. Whenever I add content to IPFS, I can confirm that the blocks get stored in the S3 bucket. I am also able to retrieve content from IPFS when the blocks are actually stored in S3.

Now I want to clean up the changes and integrate them with IPFS. But I am not sure how to proceed, as this involves changing the config file format to specify S3 parameters. I've checked https://github.com/ipfs/notes/issues/9 but I am not sure which proposal has been finalized.
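As an illustration of what "S3 parameters in the config" could look like, the sketch below parses a small JSON fragment into a struct. Everything here is an assumption made for illustration: the `S3Config` type, its field names and the `parseS3Config` helper are hypothetical and do not reflect any format go-ipfs has adopted. The optional `endpoint` field is included because an overridable endpoint is what would let the same code target an S3-compatible store instead of Amazon.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// S3Config is a hypothetical config section for an S3-backed block store.
// Credentials are deliberately left out; they would more likely come from
// the environment than from the repo config file.
type S3Config struct {
	Bucket   string `json:"bucket"`
	Region   string `json:"region"`
	Endpoint string `json:"endpoint"`
}

// parseS3Config decodes and validates the hypothetical config fragment.
func parseS3Config(raw []byte) (S3Config, error) {
	var cfg S3Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return S3Config{}, fmt.Errorf("parse s3 config: %w", err)
	}
	if cfg.Bucket == "" {
		return S3Config{}, errors.New("s3 config: bucket is required")
	}
	if cfg.Endpoint == "" {
		// Fall back to Amazon's endpoint when none is given.
		cfg.Endpoint = "s3.amazonaws.com"
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{"bucket": "ipfs-blocks", "region": "us-east-1"}`)
	cfg, err := parseS3Config(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```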

@whyrusleeping @jbenet Can you guide me on how I should take this forward?

@sivachandran where is your s3 datastore implementation at?

Making the local datastore nicer to configure is on the todo list; I'll ping back here when that's complete.

@whyrusleeping https://github.com/RealImage/go-ipfs/pull/1/ is the S3 datastore implementation. It is designed and implemented mainly to satisfy my requirements. It might require some changes to merge into master.

S3 datastore was migrated here: https://github.com/ipfs/go-ds-s3/
