go-ipfs version: 0.4.13-
Repo version: 6
System version: amd64/linux
Golang version: go1.9.4
Feature Request
ipfs add supports -w for filename preservation, e.g. ipfs add -w foo.zip will let you access the result as /ipfs/<dir hash>/foo.zip instead of just /ipfs/<file hash>. However, it doesn't work very well when reading data from stdin:
$ cat test.txt | ipfs add -w
added Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE
Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE
added Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY
And now test.txt can only be accessed as /ipfs/Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY/Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE
This can be fixed post hoc by creating a new parent object with the right link name (using ipfs object or, more conveniently, ipfs files), but this is awkward, and (if you need the resulting object pinned) slow.
It would be much more convenient if you could specify a name for the wrapped object on the command line, and have ipfs add create the link with that name.
@Stebalien @magik6k if someone use command "cat",the request cannot get the file name.so can i add a command such as "-w test.txt",the “test.txt” is the file name that user can specify,so we can get the file name instead of a hash string?
Currently, that command would add the file test.txt from the current directory. We'd probably need to add a new flag. However, I'm not sure the best way to do this.
@Stebalien i understand you.And i want solve this problem and i am coding about this feature.In my opinion,i design a new command "-n",it means the filename.we can use this command such as "cat text.txt | ipfs add -w -n "text.txt" ",so i can get the filename from the command.it is ok?
@Stebalien the reason i design about this is that if i use "cat" to get the file,we cannot get the filename in cmd.request.
Hey @kjzz, sorry for the delay, are you still interested in fixing this? It would seem to me that the real fix should happen in go-ipfs-cmdkit (not sure though), but we could definitely add a flag in the adder to explicit a file name that would overwrite the original file name or provide one in case it wasn't specified (as is the case if reading from stdin).
This is the function in the adder that checks file names,
and probably where the --name flag should be applied (the value passed with the flag should be stored in the Adder).
hello @schomatis ,sorry for delay,i am still interested in fix this issue,and i am working for this.i plan to add --name flag to get file name.do you agree with me?
Yes, go ahead! Ping me if you have any questions.
hello,@schomatis i have a question that if we want add two or more files to ipfs,such as ipfs add -w test.txt test1.txt.And how to use --name to explicit the files name.For example,if we have only one file , we can use ipfs add -w test.txt --name myfile.txt.However,if we have more than one file,can we use ipfs add -w test1.txt test2.txt --name myfile1,myfile2?(I plan use a comma or oher flag to split files name)maybe it is not a good solution,if you have some advice,you can ping me.
@kjzz I would start with the simplest solution (at least for this first iteration) of only one name, especially to avoid the scenario of having to check in ipfs add <dir> --name <many-names> that we're passing as many names as files in the directory hierarchy. With only one name we should support:
Passing content through stdin (where FileName() is "").
(Optional) Passing content through a single file and replace its name with --name.
We should ask @keks what's the best way to check that req.Files is actually a single file (for case 2, I think checking for FileName() == "" is enough for case 1).
@schomatis hello, i have finish the simplest solution of one name .You can review in #5399
Hey, sorry, I should have been more clear (I forgot how this issue got started), there are two different problems:
I pass data from stdin so that is added as a file without a name (actually, with a name equal to its hash).
I add data with a directory wrapper, which is anonymous, I would like to name that directory.
Those two problems (although somewhat similar) are not related:
echo "IPFS" | ipfs add
# added QmdFyxZXsFiP4csgfM5uPu99AvFiKH62CSPDw5TP92nr7w QmdFyxZXsFiP4csgfM5uPu99AvFiKH62CSPDw5TP92nr7w
# The content added through `stdin` has the hash (QmdFyxZ) as its name.
# Now wrap `stdin` inside a directory (`-w`).
echo "IPFS" | ipfs add -w
added QmdFyxZXsFiP4csgfM5uPu99AvFiKH62CSPDw5TP92nr7w QmdFyxZXsFiP4csgfM5uPu99AvFiKH62CSPDw5TP92nr7w
added QmTv9PEwAsEJefjVQwo1gCtfemcTTaDFfg9Wwq7eVs7Yxj
# Two unnamed files:
# 1. The "IPFS" string from `stdin` as before (same hash).
# 2. The directory that wraps it (QmTv9PE).
I should have made that distinction from the beginning, I was talking about the 1st problem, which of those do you want to fix? The PR seems more oriented to the 2nd.
Also, the second problem is a bit more difficult, as it involves the MFS layer in another repo, so I would suggest you start with the first one.
@schomatis
what's the best way to check that req.Files is actually a single file
Phew, good question. The problem is that usually the request will come through HTTP which uses multipart encoding, and looking through the specs there doesn't seem to be a way to check how many files are sent without reading them all. Looks like there aren't even Content-Length headers for the individual files, so we can't see whether it only is one file by checking whether the whole size matches the first file's size plus a bunch of headers and boundaries.
What we could do is check that if we set the name there only is one file (and maybe even require that it is stdin) in PreRun. Even for HTTP calls these are executed on the client side before making the request and here we can check that if we set a name (a) f.IsDirectory() returns false and if not (b) we type assert the cmdkit.Fileto a *cmdkit.SliceFile and use sf.Length() to get the number of files (though subdirectories that contain multiple files will only count as one, of course).
Does that help?
@keks
The problem is that usually the request will come through HTTP which uses multipart encoding,
Thanks, I was only thinking about the CLI case and not HTTP, so maybe we should concentrate on problem 1 for now (data from stdin),
What we could do is check that if we set the name there only is one file (and maybe even require that it is stdin) in PreRun.
How could we achieve just the stdin check in PreRun?
Hello @schomatis ,i want to fix the 2nd problem that you said.
# Now wrap `stdin` inside a directory (`-w`).
echo "IPFS" | ipfs add -w
added QmdFyxZXsFiP4csgfM5uPu99AvFiKH62CSPDw5TP92nr7w QmdFyxZXsFiP4csgfM5uPu99AvFiKH62CSPDw5TP92nr7w
added QmTv9PEwAsEJefjVQwo1gCtfemcTTaDFfg9Wwq7eVs7Yxj
Now if we use the command -w ,we can not define the path name.
$ cat test.txt | ipfs add -w
added Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE
Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE
added Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY
In my PR #5399 ,we can use command like this:
cat test1.txt | ipfs add -w --name myfile.txt
added Qmf3xdREx34pewgYZjyvHQsrZxuk1TLUFdzoF95dsXfv3k myfile.txt
added QmSLquBbWRXs3VjTCJBXYv4BHn7NFNZxh5iU3zGNp5aMcf
@schomatis
How could we achieve just the stdin check in PreRun?
I just looked into this and it looks like I was wrong and there is no straightforward way to do that, sorry. We could add a helper the "cmdkit/files" package - I'm just a bit weary that at some point we might have a lot of helpers in there.
Hey @kjzz, I think we're still mixing two different issues. The second problem is about naming the anonymous directory included by the flag -w, but you seem to be interested in naming content from stdin, that's the first problem, and it's independent from the first one, it will happen whether or not you include the -w flag.
In the 1st problem that you're trying to fix, we need to detect that the file came from stdin, but as @keks was explaining that may not be so easy. @keks Do you see a simple (even if not complete) solution here that would take into account the HTTP as well as the CLI case?
Checking that the filename is an empty string "" would work only if we're sending a single file (otherwise, if there are multiple files without names, we can't rename them all to the same name). A really dirty approach would be to just rename the first unnamed file (and leave the rest unnamed), that would work for stdin in CLI, and in HTTP, in the worst case where we're receiving many file, we just rename the first one and leave the rest as is, which doesn't sound so bad, @keks WDYT?
hello @schomatis , thanks for explain.My pr #5399 aimed at solving the second problem.And i think it is the simplest way to solve problem.if we add many files with --name,it will return error.And i think --name ,should only support one file add.WDYT?
i think --name ,should only support one file
Agreed.
My pr #5399 aimed at solving the second problem.
The second problem is about the directory name, but you're PR is adding a name to the file, e.g.,
echo "FILE" | ipfs add -w
added QmeMMd4ou96A66KZPswpdsjjk4BtNhKk9Ap9kWCVFh3tZh QmeMMd4ou96A66KZPswpdsjjk4BtNhKk9Ap9kWCVFh3tZh
# ^^^ "FILE" contents without name: 1st problem.
added QmQ7iFAvmU57VTnjptj9h2nEmXQnQnd96F6ygFb3zdLEKP
# ^^^ Directory (included because of the `-w` argument) *also* without name: 2nd problem.
So, forgetting about the 1st/2nd terminology that may be confusing, there are two things without a name:
stdin.-w.Which of those would you like to name? Your PR right now seems to be trying to name the file but is adding a check for the -w flag (which is unrelated to the fact that a stdin file has no name), that is a bit confusing to me. Once we understand what we're trying to accomplish we can focus more on the PR/code.
hello,@schomatis,let us review the issue.
As we all known,if we use the command cat
$ cat test.txt | ipfs add -w
added Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE
Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE
added Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY
And now test.txt can only be accessed as
/ipfs/Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY/Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE
but if we did not use command cat,such as
$ ipfs add test.txt -w
added Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE test.txt
added Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY
and use this command,we can access text.txt by
/ipfs/Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY/test.txt
so the difference between two command is the test.txt accessed path.
1./Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY/Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE(hash + hash)
2./ipfs/Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY/test.txt (hash + filename)
In my pr #5399 ,
we can use new command --name and cat
cat test.txt | ipfs add -w --name test.txt
added Qme9Z5gJLNsQzVELtHh2oQZfjDugHGvLGuNKVtttGAJYAE test.txt
added Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY
the accessed path is the same as ipfs add test.txt -w
ipfs/Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY/test.txt
of course,if we use command such as cat test.txt | ipfs add -w --name myfile.txt.the accessed path will change:
ipfs/Qmaq9QpAYfjF5sPAeEytKHk9rC3Vcgt6J8PRbCsNk7ypqY/myfile.txt
so my pr is aimed at to solve problem that when we use -w command,we can define is file name by ourselves.Do i explain clearly?
Yes, I think I get it now, thanks for taking the time to re-explain it to me.
My confusion was always related to the use of the -w flag which is not related to the unnamed file issue, but I know that from my understanding of the underlying MFS/UnixFS layers, but from the perspective of the user it makes perfect sense to associate the two. We (mostly I) need to work more on closing this knowledge gap. I'll take another look at the PR.
To give you some brief (and incomplete) background: all of the IPFS files you manipulate (e.g., with ipfs add) belong to a hierarchical file system named MFS. When you create a file it get's added to a root directory which the -w flag exposes. That is, this implicit root directory (Qme9Z5 in your example) always existed, even if you don't use the -w flag. The difference, for you as a user (I now realize), is that if you have access to that root directory then you can reference the added file with the typical Unix path /ipfs/<hash-dir>/<file>, and that's where your --name flag comes in, to help you reference the file with a human-readable name instead of a hash. But, if you don't use the -w flag, the name of the file (created with --name) still would exist in this root directory, just that you wouldn't be able to access the file by name because you don't have the hash of this root directory. (You don't need to understand all this, but I felt I owed you some explanation regarding the context of my confusion.)
Sorry for the back and forth in this issue, but this has been a great feedback for me on what are some of the obscure points of how do files in IPFS work. One issue that I'm identifying right now is this contradiction between the content-addressable philosophy of IPFS and the location-addressed system of any file system such as Unix. In IPFS everything is first accessed by a hash and then you can resolve links with names in a path, but that path is meaningless unless you have a hash as an anchor to that path (the hash defines in which filesystem you're in, as there is no unique root path in IPFS).
@schomatis thanks very much for giving some ipfs background,I also very interested MFS.And i have mofidy code by your advice. And i have understand why should drop wrap check.Thank you again.Please help me review my pr #5399 . And I also want to contribute more for ipfs,can you give some task to do?My email is [email protected],if you can give some some advice or task,you can emal to me or tell me here(in github).Thanks a lot!
That's great @kjzz! Why don't you take a look at https://github.com/ipfs/go-ipfs/issues/5388? It's kind of the entry point (WIP) for community contributions to how are files handled in IPFS. That may give you a general view of what can you do in that area, and we can take it from there.
@schomatis thx.And you mean we can write some file in this pr #5052 ?
And you mean we can write some file in this pr #5052 ?
Sure! I mean, I don't know if I can give you permissions to push to that PR (if there is please let me know), but feel free to create your own PR based on that one (or just create a new PR from scratch), #5052 is rather stale at the moment.
hey @schomatis ,so Should i write some doc about file api following your template in #5052 ?
Hey @kjzz, #5052 is not much of a useful template at the moment, I would start by opening an issue in the Files API milestone and start documenting there information you find interesting (from a new user perspective) and think that other users would benefit to know (or what part of the code should be better documented), so we can later discuss how to incorporate that in a doc.
Most helpful comment
@schomatis thanks very much for giving some ipfs background,I also very interested MFS.And i have mofidy code by your advice. And i have understand why should drop wrap check.Thank you again.Please help me review my pr #5399 . And I also want to contribute more for ipfs,can you give some task to do?My email is [email protected],if you can give some some advice or task,you can emal to me or tell me here(in github).Thanks a lot!