Running Vault in dev mode:
$ vault -version
Vault v0.5.2
Then, write a binary file into Vault:
$ vault write secret/burp/cert certificate=@/Users/jaxley/Downloads/burp-certificate.crt
And immediately read it back out:
$ vault read -field=certificate secret/burp/cert > ~/Downloads/burp-from-vault.crt
However, the file is munged and does not equal the original:
$ shasum ~/Downloads/burp-certificate.crt
78c77667f95f216b1543f78fa159159c264de96f /Users/jaxley/Downloads/burp-certificate.crt
$ shasum ~/Downloads/burp-from-vault.crt
7d2b76083030af92c0d061c91a64873791b7fb23 /Users/jaxley/Downloads/burp-from-vault.crt
Not sure if the issue is how the file is represented on the way in or out, but here is a sample of what the JSON looks like when reading via the API:
GET /v1/secret/burp/cert
...
"data": {
"certificate": "0?\u0002?0?\u0002-?\u0003\u0002\u0001\u0002\u0002\u0004U???0\r\u0006\t*?H??\r\u0001\u0001\u0005\u0005...
And a diff of the bytes:
$ hexdump ~/Downloads/burp-certificate.crt
0000000 30 82 02 c4 30 82 02 2d a0 03 02 01 02 02 04 55
0000010 f0 9a 88 30 0d 06 09 2a 86 48 86 f7 0d 01 01 05
0000020 05 00 30 81 8a 31 14 30 12 06 03 55 04 06 13 0b
0000030 50 6f 72 74 53 77 69 67 67 65 72 31 14 30 12 06
0000040 03 55 04 08 13 0b 50 6f 72 74 53 77 69 67 67 65
0000050 72 31 14 30 12 06 03 55 04 07 13 0b 50 6f 72 74
0000060 53 77 69 67 67 65 72 31 14 30 12 06 03 55 04 0a
....
$ hexdump ~/Downloads/burp-from-vault.crt
0000000 30 ef bf bd 02 ef bf bd 30 ef bf bd 02 2d ef bf
0000010 bd 03 02 01 02 02 04 55 ef bf bd ef bf bd ef bf
0000020 bd 30 0d 06 09 2a ef bf bd 48 ef bf bd ef bf bd
0000030 0d 01 01 05 05 00 30 ef bf bd ef bf bd 31 14 30
0000040 12 06 03 55 04 06 13 0b 50 6f 72 74 53 77 69 67
0000050 67 65 72 31 14 30 12 06 03 55 04 08 13 0b 50 6f
0000060 72 74 53 77 69 67 67 65 72 31 14 30 12 06 03 55
....
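For anyone reading the two dumps: the repeated `ef bf bd` is the UTF-8 encoding of U+FFFD, the Unicode replacement character, which suggests that bytes which are not valid UTF-8 were substituted somewhere on the way through JSON. A rough pre-flight check, assuming `iconv` is installed, is to test whether a file is valid UTF-8 before storing it unencoded. The `needs_encoding` helper name is made up for this sketch, and passing the check does not by itself prove a round trip is lossless.

```shell
# needs_encoding FILE: succeeds (exit 0) when FILE is NOT valid UTF-8,
# i.e. when it should be encoded (for example with base64) before being stored.
# Hypothetical helper, not part of Vault.
needs_encoding() {
  ! iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

tmp=$(mktemp -d)
printf '\060\202\002\304' > "$tmp/der.bin"   # first four bytes of the original hexdump
printf 'plain text\n' > "$tmp/text.txt"

for f in "$tmp/der.bin" "$tmp/text.txt"; do
  if needs_encoding "$f"; then
    echo "$(basename "$f"): not valid UTF-8, encode first"
  else
    echo "$(basename "$f"): valid UTF-8"
  fi
done
```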
I saw some older issues about binary data that seemed to indicate this was fixed, such as:
But the CLI doesn't seem to handle this properly. Does the API? I presume the CLI is supposed to encode the binary data inside the JSON request, in a way that callers using the API directly would have to mimic.
@jaxley It's up to the caller to encode binary data as base64 before sending it in via the CLI, just as it's up to API users to base64-encode the data in their POST calls.
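A minimal sketch of that caller-side encoding, assuming a `base64` utility is on the PATH. The path `secret/example/blob`, the field name `value`, and the `USE_VAULT` switch are made up for illustration; with `USE_VAULT` unset, a plain file copy stands in for Vault so the encode/decode steps can be checked on their own.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)

# Sample binary: the first eight bytes of the certificate in the hexdump above.
printf '\060\202\002\304\060\202\002\055' > "$tmp/orig.bin"

# Encode to plain ASCII before handing the value to Vault.
base64 < "$tmp/orig.bin" > "$tmp/orig.b64"

if [ "${USE_VAULT:-0}" = 1 ]; then
  # Hypothetical path and field name; requires a reachable, unsealed Vault.
  vault write secret/example/blob value=@"$tmp/orig.b64"
  vault read -field=value secret/example/blob > "$tmp/read.b64"
else
  # Without Vault, stand in for the store with a plain copy.
  cp "$tmp/orig.b64" "$tmp/read.b64"
fi

# Decode after reading; some older macOS releases spell this flag -D.
base64 -d < "$tmp/read.b64" > "$tmp/copy.bin"

cmp "$tmp/orig.bin" "$tmp/copy.bin" && echo "round trip OK"
```

Any other client reading the same path gets the base64 text and has to decode it too, so the encoding becomes part of the contract for that secret.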
That's not at all clear from the tool or the contract or documentation. Since you can specify a file for storage in the CLI tool, you would expect as a user that the client using the API would properly encode and decode the data so it does not corrupt it. I can see that using the API directly this would be the case, but for someone using the CLI, I would think that responsibility belongs in the CLI tool using the API?
In fact, the documentation for the CLI tool (vault read -h) seems incorrect in saying:
-field=field If included, the raw value of the specified field
will be output raw to stdout.
This wording implies to me that this is a way to read unadulterated ("raw") data out of Vault.
That's not at all clear from the tool or the contract or documentation. Since you can specify a file for storage in the CLI tool, you would expect as a user that the client using the API would properly encode and decode the data so it does not corrupt it.
Any given path in Vault may be accessed by any client. If the CLI client transparently encoded/decoded, it would mean different behavior between the API and the CLI. The CLI is purely an API client so we don't do that.
-field=field If included, the raw value of the specified field
will be output raw to stdout.
This wording implies to me that this is a way to read unadulterated ("raw") data out of Vault.
It means that the data will not be formatted into a table, which is the normal output method. Regardless, it is reading the raw data of the field, but it's still the caller's responsibility to ensure that it's in the right format before passing it in to the API.
I can see the concern about the vault CLI introducing some "special sauce" that direct API clients and SDKs would not expect nor know how to handle.
However, if the CLI is storing bits in Vault that it _cannot properly extract with fidelity_, then I see that as a bug in the CLI that it is not properly handling data or encoding so that the input will match the output.
I'm not suggesting that the CLI should do anything different from any other client of the REST API: it should properly encode the data so it can be faithfully reproduced when retrieved! Most likely that would just be Base64 encoding, which an API caller would expect and know how to decode. Or perhaps you could extend the API to include optional metadata that communicates the encoding to the caller, so they don't have to guess or hard-code that knowledge.
BTW, the wording in the documentation could be improved along the lines of your follow-up comment. "Raw" does not imply what you said; "data will not be formatted into a table" is much clearer.
I have a question about HashiCorp Vault.
When I write a file to Vault, read it back, and compare the bytes, the file has changed. Do you know why?
$ vault write secret/catalina/files/key-dat2 value=@key.dat
Success! Data written to: secret/catalina/files/key-dat2
$ mv key.dat key.dat2
$ vault read -format=raw -field=value secret/catalina/files/key-dat2 > key.dat
$ cmp -c key.dat key.dat2
key.dat key.dat2 differ: byte 1, line 1 is 357 � 211 M-^I
Is there a workaround?
@gramer https://github.com/hashicorp/vault/issues/1423#issuecomment-219525845
@jefferai
Thanks, I solved my issue :)