go-ipfs version: 0.4.10-00573cb2c
Repo version: 5
System version: amd64/windows
Golang version: go1.8.3
Bug
Low-Medium
On my machine if I try to use --nocopy with the trickledag format it will still copy file blocks, as well as output a hash for the standard dag format, but even this behaviour is inconsistent. These are the cases I've tried and their results with a log below, the sample data is just 25M from udevrandom.
Nothing has been done yet, file is added with --nocopy
Result: As expected, file data is not copied to the datastore, standard hash returned
File has been added via --nocopy already, it is added again with --nocopy and -t
Result: File data is not copied, but standard dag hash returned, not trickle
Repo is cleaned, file is added with --nocopy and -t
Result: File data is copied, and standard dag hash returned
Repo is cleaned, file added with -t
Result: As expected, file data is copied to the datastore, trickle hash returned
In addition to this please look at the log below to see the blocks directory changing size, even after everything is removed it seems to have grown is size a few kb. I'm not sure if this is expected or not.
I haven't tested this elsewhere yet. I originally caught this after adding a lot of files like this add --nocopy -r -w -Q --chunker=rabin --pin=false --cid-version=1 --hash=Blake2b-256, -t was only provided conditionally (if more than half the files provided were multimedia files) so the arguments would turn into add --nocopy -r -w -Q --chunker=rabin --pin=false --cid-version=1 --hash=Blake2b-256 -t, I don't think any of the other arguments there affect --nocopy like this.
>cd %HOMEPATH%\.ipfs
>ipfs init
...
>ipfs config --json Experimental.FilestoreEnabled true
>cksum random
2009307347 26214400 random
>du -sh blocks
68K blocks
>ipfs add --pin=false --nocopy random
added QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr random
>du -sh blocks
77K blocks
>ipfs get --output=validate QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr
Saving file(s) to validate
>cksum validate
2009307347 26214400 validate
>ipfs add --pin=false --nocopy -t random
added QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr random
>du -sh blocks
77K blocks
>ipfs get --output=validate QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr
Saving file(s) to validate
>cksum validate
2009307347 26214400 validate
>ipfs repo gc
removed QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr
...
removed zb2rhoTbL7rcs2J8enZ5R54gijwPdLGGHS3JtpFw74vDw8MyZ
>du -sh blocks
55K blocks
>ipfs add --pin=false --nocopy -t random
added QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr random
>du -sh blocks
26M blocks
>ipfs get --output=validate QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr
Saving file(s) to validate
>cksum validate
2009307347 26214400 validate
>ipfs repo gc
removed zb2rhanQbgX9Q4uzrtDC5MtsukDRLBPLPwR4XvYNaJFHpAvSt
...
removed zb2rhdQWSLuJLhMQ7kwEpRT3pGHeHcMbHjFBJo7DmfNGiWfSF
>du -sh blocks
67K blocks
>ipfs add --pin=false -t random
added QmYDL6Rnkhhog6NFwykAtAD2uT5R62PVBB4F9t85vUX6NM random
>ipfs get --output=validate QmYDL6Rnkhhog6NFwykAtAD2uT5R62PVBB4F9t85vUX6NM
Saving file(s) to validate
>cksum validate
2009307347 26214400 validate
>ipfs repo gc
removed QmWLPNh6mVuqsxpWTiW7YNaGWbMjmxfUUweJ18xaiBfn1s
...
removed QmdtwfJv8g6FQxAnHFVPpNxiasitZGvGzoVtSuScswQZfG
>du -sh blocks
79K blocks
cc @kevina
@whyrusleeping My code never supported the trickle adder, so it is unlikely there is any support for it in IPFS.
@whyrusleeping Is this issue still relevant to work on it?
@schomatis Yeah, I don't think this has been fixed yet. Go right ahead :)
@schomatis that sounds about right to me. It should be fairly easy to do.
@kevina If I'm following correctly the original PR I should replicate the SetPosInfo pattern in the trickle importer for Layout and fillTrickleRec, and also for FillNodeLayer (which I'm not seeing a counterpart in the balanced importer), am I missing something else? I'll prepare a patch and let you know.
@schomatis go ahead and create a p.r. but label it a work-in-progress (by prepending the title with "WIP:" or similar) and I will have a look.
@kevina Regarding the trickle DAG Append family of functions, what should be the value of the offset for SetPosInfo?
I don't see analogous functions for the balanced DAG and I don't know if the offset should be zero or should continue from the offset value of the base node (basen) because I'm not sure if a new file is being appended or if is a continuation of the same (previously added) file (or is that something that should be checked through the fullPath variable?)
@schomatis the offset is the offset from the start of the backing file, so it should likely be the offset of the base node. It might be helpful to look at the filestore code to see how PosInfo is used, in particular readDataObj (https://github.com/ipfs/go-ipfs/blob/master/filestore/fsrefstore.go#L152).
@kevina Thanks for the pointer, it helps me understand Filestore internals.
What I'm missing now is a use case for the Append function. From what I'm seeing in the code this function is related to the Files API / MFS and the unixfs components, but both those specs are under development (at this time empty), and although there are many discussions open about them in several issues it's hard for me to get a firm grasp on how they work (besides the intuitive concept that they allow to have a UNIX file system mounted on top of IPFS).
@whyrusleeping Maybe you could point me in the right direction to get more information about those subjects. My main concern now is to have a list of commands that would trigger an Append call to learn more about how it works. For now I'm experimenting with a basic ipfs files example I've found (I didn't find any in the official IPFS Examples, maybe after this issue is closed I could add one).
After reading the example cited above I run the following commands to trigger the Append call:
echo "Plain text file." > file
ipfs add --nocopy file
# added zb2rhdJ19xRwTk3QMZ1GG8vPEh7gZVXMpi6FP7AxanrfrocPR file
ipfs files cp /ipfs/zb2rhdJ19xRwTk3QMZ1GG8vPEh7gZVXMpi6FP7AxanrfrocPR /file
echo "Added content." | ipfs files write -o 100 /file
# 14:56:22.085 ERROR cmds/files: seekfail: expected protobuf dag node files.go:772
# Error: expected protobuf dag node
The error is due to a check in Append for a ProtoNode (where in this case the node is a RawNode), the check is not commented.
Thinking about this it makes sense this append operation fails because Filestore won't have a concrete file to back the appended data (except the stdin data received by the ipfs files write is backed up in a separate file, although it wouldn't appear so).
My question then is if Filestore should support the append operation or if the SetPosInfo should be added only to the builder function Layout, resting on the fact that Append will fail for --nocopy nodes (and the posInfo structure won't be needed). If this is the case there should be a refactoring of fillTrickleRec and FillNodeLayer that are used by both operations (build and append) for one to support Filestore and not the other.
Researching some more, the problem is not related to the way the input is passed to the write command, this also fails:
ipfs files write /file file
# Error: expected protobuf dag node
The problem is that the node passed to Append is the node already stored with Filestore as RawNode (and the check mentioned above will always fail).
@schomatis I don't think we need to make the trickle append code support filestore. The Add code only uses Layout. I believe you are correct in saying that only ipfs files write uses Append. Looking at your other comments, it looks like it needs to learn how to append to a non-protobuf node. That might get sticky, but I think its probably okay to error out in that case for now.
Closing in favor of https://github.com/ipfs/go-unixfs/issues/51.
Most helpful comment
@schomatis I don't think we need to make the trickle append code support filestore. The
Addcode only usesLayout. I believe you are correct in saying that onlyipfs files writeusesAppend. Looking at your other comments, it looks like it needs to learn how to append to a non-protobuf node. That might get sticky, but I think its probably okay to error out in that case for now.