For s3 ls
The terminal display is fine, but once the output is redirected or piped, every Chinese character becomes a "?" (question mark).
To reproduce:
$ aws s3 ls s3://bucketname
PRE 健身/
PRE 手工制作水晶石头等/
PRE 菜谱/
$ aws s3 ls s3://bucketname > x.txt
$ cat x.txt
PRE ??/
PRE ?????????/
PRE ??/
FAILED: aws-cli/1.16.96 Python/2.7.15rc1 Linux/4.15.0-1031-aws botocore/1.12.86
SUCCESS: aws-cli/1.14.44 Python/3.6.7 Linux/4.15.0-1031-aws botocore/1.8.48 (ubuntu 18.04 LTS apt install)
SUCCESS: aws-cli/1.15.71 Python/3.5.2 Linux/4.15.0-1031-aws botocore/1.10.70 (ubuntu 18.04 LTS snap install)
Files generated with versions 1.14.44 and 1.15.71 are correct.
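The redirected output above has exactly one "?" per Chinese character, which is what an ASCII encode with a "replace" error handler would produce. The following is a minimal sketch of that effect, not the CLI's actual code path; the helper name `ascii_with_replacement` is hypothetical, and the assumption that the failing version falls back to ASCII when stdout is not a terminal is not confirmed in this thread.

```python
# Illustrative only: ASSUMES the failing output path encodes text as ASCII
# with errors="replace" when stdout is redirected or piped.
PREFIXES = ["健身/", "手工制作水晶石头等/", "菜谱/"]


def ascii_with_replacement(text):
    """Encode text as ASCII, substituting '?' for each unencodable character."""
    return text.encode("ascii", errors="replace").decode("ascii")


for prefix in PREFIXES:
    # Each non-ASCII character collapses to a single '?'; the '/' survives.
    print("PRE " + ascii_with_replacement(prefix))
```

The three printed lines have the same shape as the contents of x.txt in the report.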
For s3 cp
One version works fine and another fails.
FAILED: version: aws-cli/1.15.71 Python/3.5.2 Linux/4.15.0-1031-aws botocore/1.10.70 (ubuntu 18.04 LTS snap install)
$ aws s3 cp s3://buck1/成都/006.JPG s3://buck2/成都/006.JPG
fatal error: 'utf-8' codec can't encode character '\udce6' in position 11: surrogates not allowed
SUCCESS: version aws-cli/1.14.44 Python/3.6.7 Linux/4.15.0-1031-aws botocore/1.8.48 (ubuntu 18.04 LTS apt install)
$ aws s3 cp s3://buck1/成都/006.JPG s3://buck2/成都/006.JPG
copy: s3://buck1/成都/006.JPG to s3://buck2/成都/006.JPG
$ echo $?
0
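One possible explanation for the error text, offered as an assumption rather than a confirmed diagnosis: Python 3 decodes command-line arguments with the "surrogateescape" error handler, so under an ASCII (C/POSIX) locale each UTF-8 byte of 成都 becomes a lone surrogate, and a later encode to UTF-8 rejects it. The first UTF-8 byte of 成 is 0xE6, and "s3://buck1/" is 11 characters long, which would match '\udce6' at position 11. The sketch below reproduces that mechanism in isolation; the helper names are hypothetical.

```python
# Illustrative only: simulates how an argument might be decoded under an
# ASCII locale. This is an ASSUMED cause, not confirmed by the thread.
ORIGINAL = "s3://buck1/成都/006.JPG"


def decode_like_ascii_locale(raw):
    """Decode bytes as ASCII, smuggling non-ASCII bytes through as surrogates."""
    return raw.decode("ascii", errors="surrogateescape")


def recover(arg):
    """Turn the smuggled surrogates back into bytes, then decode as UTF-8."""
    return arg.encode("ascii", errors="surrogateescape").decode("utf-8")


arg = decode_like_ascii_locale(ORIGINAL.encode("utf-8"))

try:
    arg.encode("utf-8")  # lone surrogates cannot be encoded strictly
except UnicodeEncodeError as exc:
    print(exc.reason, "at position", exc.start, "char", ascii(arg[exc.start]))

# Round-tripping through surrogateescape restores the original key.
print(recover(arg) == ORIGINAL)
```

If this is the cause, running the command under a UTF-8 locale would avoid the surrogates in the first place.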
@dawnfantasy - Thank you for reaching out and reporting this behavior. I am investigating this problem. I was not able to reproduce the same results you are seeing when I tested:
$ aws s3 ls s3://bucketname > x.txt
$ cat x.txt
I am digging a little deeper into this issue and will update you with any new findings. Any chance you can provide the debug logs using the --debug option? It would help us understand what is happening.
@dawnfantasy - Thank you for your feedback and the debug logs. I will investigate this issue and reply back as soon as I have more information.
@dawnfantasy - Are you using EC2 Linux instances? If so, did you use custom AMIs or Amazon AMIs? Do you have the Amazon AMI IDs?
It is an EC2 instance with an Amazon AMI.
AMI ID: ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20180912 (ami-0ac019f4fcb7cb7e6)
@justnance Just in case you missed my last reply.
Related PR https://github.com/aws/aws-cli/pull/1844.
I was able to get UTF-8 characters instead of ????? by executing:
export PYTHONIOENCODING=utf-8
before running aws s3 ls s3://bucketname | othertool.
Looks like this should be fixed via PR #1844.
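To see why the environment variable helps, here is a small sketch that runs a child Python interpreter with its stdout attached to a pipe and PYTHONIOENCODING set. It does not call the AWS CLI; the helper `run_child` is hypothetical, and the "ascii:replace" setting is only a stand-in that imitates the broken output for comparison.

```python
import os
import subprocess
import sys

# The child prints "健身/" (written with escapes so argv stays pure ASCII).
CHILD_CODE = r"print('\u5065\u8eab/')"


def run_child(io_encoding):
    """Run a child interpreter with piped stdout and return its raw output."""
    env = dict(os.environ)
    # PYTHONIOENCODING overrides the stdout encoding, optionally with an
    # error handler after a colon (encodingname:errorhandler).
    env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.strip()


# With the workaround, the pipe carries real UTF-8 bytes.
print(run_child("utf-8") == "健身/".encode("utf-8"))
# Forcing ASCII with replacement imitates the "??/" seen in x.txt.
print(run_child("ascii:replace"))
```

Setting the variable in the shell with export, as in the comment above, applies the same override to any Python-based tool started from that shell.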