When dealing with large buckets with many files in them, it would be good to have a "summary-only" option, so that the individual files are not listed. Instead, the CLI would simply print the summary information, which includes the number of objects and the total size of those objects.
I could use a shell script to filter the output, but it still takes a while to run, and I imagine that a significant part of that time could be reduced by disabling the full output.
Example command to filter the output:
aws s3 ls "mybucket" --summarize --recursive | grep -P "(Total Objects|Total Size)"
My CLI version: aws-cli/1.10.1
Marking as a feature request. The only problem I see with this is that to do it you still have to list every object in the bucket, so it would be quite slow for any prefix with a large number of files.
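To illustrate why the listing cost would remain, here is a minimal JavaScript sketch of how such a summary gets accumulated. It assumes each page has the shape of an S3 ListObjectsV2 response (a `Contents` array whose entries carry a `Size` in bytes). `summarizePages` and the sample data are hypothetical names for illustration, not part of the AWS CLI.

```javascript
// Hypothetical helper: fold a sequence of listing pages into a summary.
// Each page is assumed to look like an S3 ListObjectsV2 response:
// { Contents: [{ Key: "a.txt", Size: 10 }, ...] }
function summarizePages(pages) {
  let totalObjects = 0;
  let totalSize = 0;
  for (const page of pages) {
    // An empty page may omit Contents entirely.
    for (const object of page.Contents || []) {
      totalObjects += 1;
      totalSize += object.Size;
    }
  }
  return { totalObjects, totalSize };
}

// Made-up sample data standing in for two listing pages.
const samplePages = [
  { Contents: [{ Key: "a.txt", Size: 10 }, { Key: "b.txt", Size: 20 }] },
  { Contents: [{ Key: "c.txt", Size: 5 }] },
];
console.log(summarizePages(samplePages));
```

Every object still has to appear in some page before it can be counted, so a summary-only mode would save the printing but not the listing requests.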
"summary-only" would be a wonderful feature!
It would still be faster than piping into grep when I don't want to clutter my terminal with all the bucket contents. With just 7000 objects in my bucket right now, sometimes mikebolt's grep suggestion times out on me.
In case you have many more than 7k objects (like a backup bucket we have with 155k!) and you still want to know the details, you can send the output to a file and then run tail on it (yes it still takes a while to run).
# aws s3 ls "mybucket" --summarize --recursive > /tmp/mybucket-summary.txt
# tail -n 1 /tmp/mybucket-summary.txt
Total Size: <size in bytes>
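If both numbers are wanted from the saved file, the last lines can also be parsed in a script. The sketch below is a rough illustration in JavaScript, assuming the two summary lines have the form `Total Objects: N` and `Total Size: N` as in the grep example above. `parseSummary` and the sample text are hypothetical.

```javascript
// Hypothetical helper: pull both summary values out of the last lines
// of saved `aws s3 ls --summarize` output.
function parseSummary(text) {
  const objects = text.match(/Total Objects:\s*(\d+)/);
  const size = text.match(/Total Size:\s*(\d+)/);
  if (!objects || !size) {
    return null; // summary lines not present in this text
  }
  return { totalObjects: Number(objects[1]), totalSize: Number(size[1]) };
}

// Made-up sample resembling the end of the CLI output.
const sampleTail = [
  "2016-02-01 10:00:00         10 a.txt",
  "",
  "Total Objects: 3",
  "   Total Size: 35",
].join("\n");
console.log(parseSummary(sampleTail));
```

Returning `null` when the lines are missing makes a truncated or failed listing easy to tell apart from an empty bucket.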
Good Morning!
We're closing this issue here on GitHub, as part of our migration to UserVoice for feature requests involving the AWS CLI.
This will let us get the most important features to you, by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports.
As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions.
We've imported existing feature requests from GitHub - Search for this issue there!
And don't worry, this issue will still exist on GitHub for posterity's sake. As it's a text-only import of the original post into UserVoice, we'll still be keeping in mind the comments and discussion that already exist here on the GitHub issue.
GitHub will remain the channel for reporting bugs.
Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface
-The AWS SDKs & Tools Team
This entry can specifically be found on UserVoice at: https://aws.uservoice.com/forums/598381-aws-command-line-interface/suggestions/33168340--aws-s3-ls-should-have-a-summary-only-option
Based on community feedback, we have decided to return feature requests to GitHub issues.
I solved this in NodeJS by wrapping the command in a promise and running it with spawn, which returns a stream. Streaming the result is important: we don't buffer up incoming chunks (so RAM usage doesn't grow), and the GC can clear each chunk after its data event has been handled. For a NodeJS environment this solution is pretty hygienic, and I hope others find it helpful.
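const { spawn } = require('child_process')
function exec() {
  return new Promise((resolve, reject) => {
    // spawn expects the arguments as an array, not as a single string
    const cmd = spawn("aws", ["s3", "ls", "path/to/folder", "--summarize", "--recursive"]);
    let tail = "";
    // handle error, if any
    cmd.on("error", err => reject(err));
    // read stream, keeping only the last few hundred characters so the
    // summary is still found when it is split across two chunks
    cmd.stdout.on("data", chunk => {
      tail = (tail + chunk.toString()).slice(-512);
    });
    // the summary lines are printed last, so look for them once the command exits
    cmd.on("close", code => {
      const objects = tail.match(/Total Objects: \d+/);
      const size = tail.match(/Total Size: \d+/);
      if (code !== 0 || !objects || !size) {
        return reject(new Error(`aws s3 ls exited with code ${code} and no summary was found`));
      }
      // resolve promise
      resolve(objects[0] + "\n" + size[0]);
    });
  });
}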
(async() => {
let result = await exec();
console.log(result); // will output something like:
// Total Objects: 123
// Total Size: 7654321
})();