I'm trying out replacing our current fluentd log-shipping system with Vector, and I noticed that your aws_s3 sink is not setting any Content-Type on uploaded log files. AWS then fills in a default application/octet-stream which, while sufficient for downloading S3 files via a CLI/API, isn't so nice an experience in the browser (e.g. with the AWS S3 Console).
Can I suggest hard-coding text/plain; charset=utf-8 as the best option, even for encoding.codec = "ndjson", as suggested in https://stackoverflow.com/questions/51690624/json-lines-mime-type
I'm guessing the code to change is in https://github.com/timberio/vector/blob/ee953b7f674f9cd4d844928fc06d2360361fe7a2/src/sinks/aws_s3.rs#L291-L308 but don't know how you'd feel about a hard-coded value...
Thanks for reporting @tyrken. We'll prioritize and get this fixed.
@tyrken thank you for the issue and pull request. I experimented a little with content-type / content-encoding too:
Commands:
aws s3api put-object --bucket s3-issue-2769 --body data.log --key data1.log --content-type text/x-log > /dev/null
aws s3api put-object --bucket s3-issue-2769 --body data.log.gz --key data2.log.gz --content-type application/x-gzip > /dev/null
aws s3api put-object --bucket s3-issue-2769 --body data.log --key data3.log --content-type binary/octet-stream > /dev/null
aws s3api put-object --bucket s3-issue-2769 --body data.log.gz --key data4.log.gz --content-type binary/octet-stream > /dev/null
aws s3api put-object --bucket s3-issue-2769 --body data.log.gz --key data5.log.gz --content-type binary/octet-stream --content-encoding gzip > /dev/null
aws s3api put-object --bucket s3-issue-2769 --body data.log --key data6.log --content-type text/plain > /dev/null
aws s3api put-object --bucket s3-issue-2769 --body data.log.gz --key data7.log.gz --content-type text/plain > /dev/null
aws s3api put-object --bucket s3-issue-2769 --body data.log.gz --key data8.log.gz --content-type text/plain --content-encoding gzip > /dev/null
I tried all files in Firefox (77.0.1) / Chrome (84.0.4147.45) on Fedora 32.
Uncompressed files opened in Firefox when the content-type was text/plain (in Chrome, also with content-type text/x-log).
With gzip compression, only content-type text/plain + content-encoding gzip opened in both browsers.
I think we should leave the current content-encoding option as it is and add content-type text/plain. What do you think?
Uncompressed: I don't like text/plain, as when downloading it renames the file from a ".log" extension to ".txt" with Chrome/Firefox on both Linux & Windows. OTOH text/x-log doesn't rename but still opens nicely, so I'd +1 to that.
Compressed: I can see why you suggest text/plain and content-encoding: gzip as both Open & Download appear to work - as the browser/AWS decompresses the file as it downloads. Hence from a starting data.log.gz Download gives you a data.txt that works as a text file (albeit actually being ndjson, not plain text). However I regard that as bad for the Download case as I expect a file called data.log.gz.
text/x-log and content-encoding: gzip together are worse, as while Open still works there is no renaming of the file extensions so Download gives you a data.log.gz which is plain text, not actually a gzip. When you click to open it in a file explorer that launches some compressed-file-viewer which says it's corrupt.
Personally I'd prefer "Download" to mean "Download" (no renaming or magic decompression) & value that over the convenience of "Open", given most professionals will have an OS set up to open compressed files easily. So I'd vote for NOT putting the content-encoding metadata on for compression.
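To make the "data.log.gz which is plain text" problem concrete, here is a minimal Python sketch of what a browser does when it sees Content-Encoding: gzip. It is only a model: browser_saved_bytes is a hypothetical helper invented for illustration, and the sample log lines are made up, not taken from the test bucket above.

```python
import gzip
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def browser_saved_bytes(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Hypothetical model of a browser download: Content-Encoding is undone before saving."""
    if content_encoding == "gzip":
        return gzip.decompress(body)
    return body


# Made-up ndjson sample standing in for data.log.
log_text = b'{"message":"hello"}\n{"message":"world"}\n'
# What would be stored in S3 under the key data.log.gz.
stored_object = gzip.compress(log_text)

# Same stored bytes, served with and without the Content-Encoding header.
with_encoding = browser_saved_bytes(stored_object, "gzip")
without_encoding = browser_saved_bytes(stored_object, None)

print("with Content-Encoding, saved file is gzip:", with_encoding.startswith(GZIP_MAGIC))
print("without Content-Encoding, saved file is gzip:", without_encoding.startswith(GZIP_MAGIC))
```

With the header set, the bytes written to disk are the plain log text even though the key still ends in .gz, which is why an archive viewer would call the file corrupt. Without the header, the saved file is still a real gzip.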
@tyrken I made these values configurable, so it's possible to set any value.
Thanks - that's the best we can do for now.
I'll update here/open another issue if AWS Support come back with any better suggestions. So far they've mentioned using Content-Disposition but it's not helped yet, awaiting 2nd line...
FYI AWS support have replied saying the only thing they do is override the Content-Disposition response header to be attachment for the "Download" button and inline for the "Open" button. All other behaviour, especially the decompression-during-transfer, is the response of the web browser (Chrome/Firefox) to the standard Content-Encoding: gzip header.
They have registered as a Feature Request the ability to override the Download button to set Content-Disposition: attachment; filename="dummy.log", but I wouldn't hold your breath...
So thanks for your efforts & looking forward to the next 0.10 beta release to try the config items out...
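For anyone unfamiliar with the Content-Disposition values mentioned above, here is a small sketch using only the Python standard library to show what each value carries. parse_content_disposition is a hypothetical helper for illustration; it says nothing about how the S3 Console or the browsers implement this internally.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_disposition(value: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (disposition type, filename parameter) for a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_content_disposition(), msg.get_filename()


# The value AWS Support logged as a feature request for the "Download" button.
download = parse_content_disposition('attachment; filename="dummy.log"')
# The value the Console reportedly uses for the "Open" button.
open_inline = parse_content_disposition("inline")

print(download)
print(open_inline)
```

The attachment form can carry an explicit filename, which is what would let a download keep its .log or .log.gz name; the inline form carries none, so the name is left to the browser.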