This is a feature request. It would be great if the s3 cp command accepted multiple sources, just like the shell cp command does.
For example:
$ aws s3 cp a b s3://BUCKET/
upload: ./a to s3://BUCKET/a
upload: ./b to s3://BUCKET/b
$ aws s3 cp a* s3://BUCKET/
upload: ./a1 to s3://BUCKET/a1
upload: ./a2 to s3://BUCKET/a2
We can take this into consideration. In the meantime, the workaround is to use a shell script to achieve a similar effect.
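For instance, a minimal shell sketch of that workaround, reusing the files a and b from the example above:
for f in a b; do aws s3 cp "$f" s3://BUCKET/; done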
This would be cool, especially if the CLI could handle those cp operations in parallel. For example, at the moment I want to copy 13K files from different S3 locations. They are all in the same bucket, but they are not in the same folder, so I have to write one 'aws s3 cp' command for each file, and it takes a lot of time to run.
The commands that I'm running are something like this:
aws s3 cp s3://example-bucket/0-200M/A.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/1000M-1500M/B.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/another-dir/C.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/0-200M/D.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/1000M-1500M/E.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/another-dir/F.json.gz s3://example-bucket/output-dir/
aws s3 cp s3://example-bucket/another-dir/H.json.gz s3://example-bucket/output-dir/
... 13K more lines of the same command, just changing the input S3 file...
This approach takes a lot of time. Is there any workaround for this kind of issue? If not, I think the tool should support a batch-cp where you can specify a list (or maybe a file) with all the files that you want to copy.
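One possible workaround with the current CLI, assuming the source paths are collected one per line in a file (sources.txt here is a hypothetical name) and xargs is available, is to drive the copies in parallel:
# Run up to 8 'aws s3 cp' processes at once, one source path per invocation
xargs -P8 -I{} aws s3 cp {} s3://example-bucket/output-dir/ < sources.txt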
I agree, I am doing something identical to @ejoncas, and while this isn't bad, the timeout between each cp task makes this a several-hour process.
Any updates on this?
Besides scripting a loop for aws s3 cp, I've used aws s3 sync to accomplish this:
aws s3 sync --exclude='*' --include='a*' s3://bucket/
You can provide multiple --exclude and --include options. Above I'm excluding _everything_, then including what I want.
I'd say that without support for copying multiple files, the CLI is seriously crippled. There are literally no justifiable reasons for not supporting this, merely the laziness of AWS engineers and bad project management of the AWS CLI. No excuses! Shame on you, AWS CLI folks.
To fix @yyolk's command issue (aws: error: too few arguments): suppose you need to sync the current folder to the S3 bucket, add . as the source.
aws s3 sync --exclude='*' --include='a*' . s3://bucket/
Good Morning!
We're closing this issue here on GitHub, as part of our migration to UserVoice for feature requests involving the AWS CLI.
This will let us get the most important features to you, by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports.
As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions.
We've imported existing feature requests from GitHub - Search for this issue there!
And don't worry, this issue will still exist on GitHub for posterity's sake. As it's a text-only import of the original post into UserVoice, we'll still be keeping in mind the comments and discussion that already exist here on the GitHub issue.
GitHub will remain the channel for reporting bugs.
Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface
-The AWS SDKs & Tools Team
This entry can specifically be found on UserVoice at: https://aws.uservoice.com/forums/598381-aws-command-line-interface/suggestions/33168382-why-aws-s3-cp-does-not-accept-multiple-sources
Based on community feedback, we have decided to return feature requests to GitHub issues.
Any update on this? Would still be great.
I think sync and cp are like a sword and a needle: they have different use cases.
@ejoncas Your case was similar to mine. It's a use case for copy, not sync. Store the paths of all the individual files, one per line, in a separate file called file_with_all_paths.txt
Something like this:
s3://example-bucket/0-200M/A.json.gz
s3://example-bucket/1000M-1500M/B.json.gz
s3://example-bucket/another-dir/C.json.gz
....
...
..
A bash loop can read that file line by line and run the cp command:
while read -r f; do echo "Now moving file $f"; aws s3 cp "$f" s3://example-bucket/output-dir/; done < ~/path_to_the_file/file_with_all_paths.txt
Although I am also a beginner, I wrote a blog post on how I accomplished it. Check it out here:
http://www.onceaday.today/subjects/15/posts/152. If it helps someone, great!
--include and --exclude are cute and useful, but only when there's a discernible pattern to the file names. If it's just a random-looking list of names, they're useless.
What would help tremendously would be the ability to read a list of source files from a file. Or just accept multiple source files as arguments - but reading that whole list from a file would be much more powerful.
aws s3 cp --source-files long_list.txt s3://bucket_name/
This needs to work with source files that are either local or in a bucket.
The CLI would then absolutely need to do batch copies, if the API allows it.
My suggestion --
aws s3 cp --source-files long_list.txt s3://bucket_name/
aws s3 cp "file1.xls,file2.jpg,file3.txt,file4.html" s3://bucket_name/
Has there been an update enabling this yet?
Thanks a lot @ejoncas for your answer, which helped me solve my problem! <3