Aws-cli: Example for downloading full RDS log doesn't actually work

Created on 30 Oct 2016 · 24 Comments · Source: aws/aws-cli

The instructions from #1617 do not download the entire log file as documented.

$ aws --version
aws-cli/1.11.10 Python/2.7.11 Darwin/15.5.0 botocore/1.4.67
$ aws --output text rds describe-db-log-files --db-instance-identifier mydatabase
[...]
DESCRIBEDBLOGFILES  1477763976000   error/postgresql.log.2016-10-29-17  1908814
[...]
$ aws rds download-db-log-file-portion --db-instance-identifier mydatabase --log-file-name error/postgresql.log.2016-10-29-17 --starting-token 0 --output text > full.txt
$ ls -l full.txt
-rw-r--r--  1 philfrost  staff  1212017 Oct 30 07:59 full.txt

Note that the log file is reported as 1908814 bytes, but the downloaded file is only 1212017 bytes.
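
For anyone who wants to check a download against the reported size, here is a minimal sketch. The function name missing_bytes is made up for illustration; it only does the subtraction and does not talk to AWS.

```shell
#!/bin/bash
# Hypothetical helper (not part of aws-cli): compare a downloaded file with the
# size reported by describe-db-log-files and print how many bytes are missing.
# $1 = reported size in bytes, $2 = path to the downloaded file
missing_bytes() {
    local reported=$1
    local actual
    # tr strips the leading padding that BSD/macOS wc adds to its output
    actual=$(wc -c < "$2" | tr -d ' ')
    echo $(( reported - actual ))
}
```

With the numbers above, `missing_bytes 1908814 full.txt` would print 696797 (1908814 - 1212017).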

It's unclear if it's even possible to download a log file in a simple shell script as the pagination tokens do not seem to be available with --output text. I'm guessing one would need to parse JSON or XML to get them.

closing-soon guidance rds

All 24 comments

Could you post the --debug log? I'd like to see what we're sending the service to see if the issue is with us or if there's a bug service-side.

@anbotero reported similarly here. Could one of you please paste in the portion of the --debug logs that shows what we're sending to the service? You'll be looking for an entry that contains Making request for OperationModel.

@JordonPhillips hey there.

I get several of those Making requests, so this here is the first one, three in between, and the last one:

2016-11-08 13:39:54,887 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': u'0', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
2016-11-08 13:40:02,560 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': '19:1048697', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
2016-11-08 13:40:05,260 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': '19:2097350', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
2016-11-08 13:40:08,087 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': '19:3145944', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
2016-11-08 13:40:55,224 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=DownloadDBLogFilePortion) (verify_ssl=True) with params: {'body': {'Action': u'DownloadDBLogFilePortion', u'Marker': '19:16779173', 'Version': u'2014-10-31', u'LogFileName': u'slowquery/engine.very-slow-queries.log.0', u'DBInstanceIdentifier': u'mydb'}, 'url': u'https://rds.amazonaws.com/', 'headers': {'User-Agent': 'aws-cli/1.11.13 Python/2.7.10 Darwin/15.6.0 botocore/1.4.70'}, 'context': {'client_region': 'us-east-1', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x10cd793d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}

Before each one of those but the first I get this:

2016-11-08 13:40:05,250 - MainThread - botocore.hooks - DEBUG - Event needs-retry.rds.DownloadDBLogFilePortion: calling handler <botocore.retryhandler.RetryHandler object at 0x10cb20f10>
2016-11-08 13:40:05,250 - MainThread - botocore.retryhandler - DEBUG - No retry needed.

With that I'm getting, out of three tries, a 17MB file, when the file on the Web Console says it's 2.3GB.

Let me know if you need anything else.

@anbotero It looks like what we're sending to the service is correct, so whatever the service is returning must be strange. Your debug log should also contain sections that have Response headers: and Response body:. Could you post those as well? I'm willing to bet the service is indicating that there is nothing left.

@JordonPhillips indeed, it indicates something like that. This is the last of those responses:

2016-11-08 13:40:55,665 - MainThread - botocore.vendored.requests.packages.urllib3.connectionpool - DEBUG - "POST / HTTP/1.1" 200 1051028
2016-11-08 13:40:58,273 - MainThread - botocore.parsers - DEBUG - Response headers: {'x-amzn-requestid': 'a2ebcd66-b1d4-22a7-89fd-a253ac5e0b14', 'vary': 'Accept-Encoding', 'content-length': '1051028', 'content-type': 'text/xml', 'date': 'Tue, 08 Nov 2016 18:40:55 GMT'}
2016-11-08 13:40:58,274 - MainThread - botocore.parsers - DEBUG - Response body:
<DownloadDBLogFilePortionResponse xmlns="http://rds.amazonaws.com/doc/2014-10-31/">
  <DownloadDBLogFilePortionResult>
    <AdditionalDataPending>false</AdditionalDataPending>

Previous iterations of <AdditionalDataPending></AdditionalDataPending> have a true value.
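
To pull these pagination fields out of a saved --debug log without reading it by hand, something like the following sketch may help. The function names are invented, and it assumes each element sits on a single line as `<Name>value</Name>`, which matches the response shown above but is not guaranteed.

```shell
#!/bin/bash
# Hypothetical helpers (not part of aws-cli): read --debug output on stdin and
# extract pagination fields. Assumes one XML element per line.

# Print the value of the last <Marker> element seen.
last_marker() {
    sed -n 's/.*<Marker>\(.*\)<\/Marker>.*/\1/p' | tail -n 1
}

# Print the value of the last <AdditionalDataPending> element seen.
data_pending() {
    sed -n 's/.*<AdditionalDataPending>\(.*\)<\/AdditionalDataPending>.*/\1/p' | tail -n 1
}
```

Usage would be along the lines of `last_marker < stderr.log` and `data_pending < stderr.log`, where stderr.log is wherever the --debug output was redirected.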

@anbotero With that in mind, and since it seems to be happening on the console, I would recommend raising this issue on the service forums. I'll let them know as well.

Hi,
I can confirm that this bug is happening. I tested with different versions of awscli and Python, and on different servers, but the log file always comes back truncated, no matter its size.
I tested with:
aws-cli/1.10.48 Python/2.7.12 from amazon EC2 instance and
aws-cli/1.11.13 Python/3.5.2 from outside amazon.
But no matter what I do, aws rds download-db-log-file-portion is not working.

Having issues with this also. I'm unable to download more than 1.3-1.5 GB of a log, and then I get one of the following errors:

A client error (InvalidParameterValue) occurred when calling the DownloadDBLogFilePortion operation: This file contains binary data and should be downloaded instead of viewed.

or

A client error (Throttling) occurred when calling the DownloadDBLogFilePortion operation: Rate exceeded

Using the following version on an EC2 instance

aws-cli/1.10.1 Python/3.5.2 Linux/4.4.0-43-generic botocore/1.3.23

and on my laptop

aws-cli/1.10.56 Python/2.7.11 Darwin/16.1.0 botocore/1.4.46

I am also seeing this error. See the output below. Additionally, I think it's because the messages are being truncated. There is a truncation for each log file portion except for the last one.

$ aws --output text rds describe-db-log-files --db-instance-identifier $DBINSTANCE | grep 2016-11-11-18
DESCRIBEDBLOGFILES  1478890800000   error/postgresql.log.2016-11-11-18  206701928
$ aws rds download-db-log-file-portion --db-instance-identifier $DBINSTANCE --log-file-name error/postgresql.log.2016-11-11-18 --starting-token 0 --max-items 99999999999 --output=text --debug 1>stdout1 2>stderr1
$ stat -f '%z' stdout1
206241418
$ grep -c "Your log message was truncated" stdout1
196
$ echo '206701928 / ( 1024 * 1024 )' | bc
197
$ aws --version
aws-cli/1.10.56 Python/2.7.10 Darwin/14.5.0 botocore/1.4.46
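
The same comparison can be wrapped in a small function. This is only a sketch: truncation_report is a made-up name, and the 1 MiB chunk size is taken from the bc calculation above rather than from anything the service guarantees.

```shell
#!/bin/bash
# Hypothetical helper (not part of aws-cli): compare the number of truncation
# notices in a download with the number of whole 1 MiB chunks implied by the
# reported file size.
# $1 = reported size in bytes, $2 = path to the downloaded file
truncation_report() {
    local chunks=$(( $1 / (1024 * 1024) ))
    local notices
    # grep -c exits non-zero when the count is 0, so ignore its exit status
    notices=$(grep -c "Your log message was truncated" "$2" || true)
    echo "chunks=${chunks} notices=${notices}"
}
```

For the file above, `truncation_report 206701928 stdout1` would print `chunks=197 notices=196`, which fits the idea of one truncation per portion except the last.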

This is a horrible bit of code (I did not have much time to spend on it), but it does let me get the entire log file (I hope) ...

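#!/bin/bash
# Usage: ./example-get.sh <log file name without the error/ prefix>
COUNTER=1
LASTFOUNDTOKEN=0
PREVIOUSTOKEN=0

FILE=$1

rm -f "${FILE}"

# Hard cap of 99 portions so a stuck marker cannot loop forever
while [ "${COUNTER}" -lt 100 ]; do
    echo "Lets try and get ${FILE}.${COUNTER}"
    echo "The starting-token will be set to ${LASTFOUNDTOKEN}"
    PREVIOUSTOKEN=${LASTFOUNDTOKEN}

    # Fetch one portion; --debug writes the raw XML response (with <Marker>) to stderr
    aws rds download-db-log-file-portion --db-instance-identifier mtsos-prd-db-pg01 --log-file-name "error/${FILE}" --starting-token "${LASTFOUNDTOKEN}" --debug --output text 2>>"${FILE}.${COUNTER}.debug" >> "${FILE}.${COUNTER}"
    # Use the last <Marker> value in the debug output as the next starting token
    # (sed strips the tags themselves; tr -d "<Marker>" would delete those characters anywhere)
    LASTFOUNDTOKEN=$(grep "<Marker>" "${FILE}.${COUNTER}.debug" | tail -1 | sed -e 's/.*<Marker>//' -e 's/<\/Marker>.*//' | tr -d " ")

    echo "LASTFOUNDTOKEN is ${LASTFOUNDTOKEN}"
    echo "PREVIOUSTOKEN is ${PREVIOUSTOKEN}"

    # Quoted so that an empty marker does not break the test
    if [ "${PREVIOUSTOKEN}" = "${LASTFOUNDTOKEN}" ]; then
        echo "No more new markers, exiting"
        rm -f "${FILE}.${COUNTER}.debug"
        rm -f "${FILE}.${COUNTER}"
        exit
    else
        echo "Marker is ${LASTFOUNDTOKEN} more to come ... "
        echo " "
        rm -f "${FILE}.${COUNTER}.debug"
        PREVIOUSTOKEN=${LASTFOUNDTOKEN}
    fi

    # Append this portion to the combined log and clean up
    cat "${FILE}.${COUNTER}" >> "${FILE}"
    rm -f "${FILE}.${COUNTER}"

    COUNTER=$((COUNTER + 1))
done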

so I pass in the logfile name ...
./example-get.sh postgresql.log.2017-02-09-15

it loops through until it gets no more new markers

.
.
.
The starting-token will be set to 1:1652232848
LASTFOUNDTOKEN is 1:1693873977
PREVIOUSTOKEN is 1:1652232848
Marker is 1:1693873977 more to come ... 
Lets try and get postgresql.log.2017-02-09-15.27
The starting-token will be set to 1:1693873977
LASTFOUNDTOKEN is 1:1693873977
PREVIOUSTOKEN is 1:1693873977
No more new markers, exiting

and I end up with a file called postgresql.log.2017-02-09-15 that has the entire log (I hope)

As I mentioned this is very quick so feel free to improve....
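
One possible improvement is to separate the stop condition from the download so it can be tried without touching AWS. The sketch below does only that. Both follow_markers and fetch_portion are hypothetical names; fetch_portion stands in for whatever downloads one portion and prints the next marker.

```shell
#!/bin/bash
# Sketch of the stop condition only. fetch_portion is a hypothetical stand-in
# that the caller must define: it receives a starting token and prints the
# next marker. follow_markers prints every token it requests, one per line.
# $1 = first token, $2 = maximum number of requests (default 100)
follow_markers() {
    local token=$1 max_iterations=${2:-100} next i=0
    while [ "$i" -lt "$max_iterations" ]; do
        echo "$token"
        next=$(fetch_portion "$token")
        # An empty or unchanged marker means there is nothing more to fetch
        if [ -z "$next" ] || [ "$next" = "$token" ]; then
            return 0
        fi
        token=$next
        i=$(( i + 1 ))
    done
    return 1   # gave up: request cap reached before the marker settled
}
```

Returning non-zero when the cap is hit makes it possible to tell a complete download from one that was cut off by the loop limit, which the original script does not distinguish.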

Same issue here. Using the last Marker as the --starting-token value makes it possible to grab the rest of the log.
Thank you for the code @fmmatthewzeemann !

@stealthycoin any reason why this was closed? Was this fixed?

@jlintz asking myself the same question. seeing the same/similar problem with aws-cli/1.11.95
/cc @JordonPhillips

aws rds download-db-log-file-portion \
    --db-instance-identifier XXXXXXXXXX \
    --region XXXXXXXX \
    --log-file-name error/XXXXXXXXXXX \
    --starting-token=0 \
    --profile XXXXXXXX \
    --output text >> test.txt

only gives me ~300mb of a ~950mb logfile.

The same here. Cannot download even 100mb log.

aws --version
aws-cli/1.11.165 Python/2.7.6 Linux/3.13.0-100-generic botocore/1.7.23

I don't know why this issue has been closed. I said that the proposed shell script was a workaround, not that the code is fixed!

Has anyone found a fix yet or tried to contact AWS?
I see the same problem with the web console. It fails all the time.

@chefone I use script from @fmmatthewzeemann as a workaround

Yeah but it should be fixed at the API level ...

I solved this by using the Go SDK instead of the AWS CLI.
Other SDKs would probably also do the trick.

Workaround didn't work for me as I guess the output is different in my version of aws-cli but I'm just chiming in to say that I am seeing this issue with 1.11.183.

$ aws --version
aws-cli/1.11.183 Python/2.7.6 Linux/3.13.0-128-generic botocore/1.7.41
$ aws rds download-db-log-file-portion --region us-west-2  --db-instance-identifier $DB --output text --log-file-name error/postgresql.log.2017-11-10-21  --starting-token 0 > 21
$ grep truncated 21
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]
 [Your log message was truncated]

I checked with AWS support about this and they said that the implementation doesn't work in the CLI or Boto3. They gave me some code that works on REST and I tweaked it into a module. I apologize if this is off-topic for this forum, but I thought it would help a lot of people on this thread.

Attached is the starter code the tech gave me. It seems to work so far.

sample1.txt

Another alternative to try, which worked for me: the deprecated rds-download-db-logfile command, which RDS references in their REST (!) documentation.

$ cd ~/Downloads/RDSCli-1.19.004

$ AWS_RDS_HOME=$(pwd) ./bin/rds-download-db-logfile YOUR-INSTANCE --I YOUR_ACCESS_KEY --S 'YOUR_SECRET_KEY' --region YOUR-REGION --log-file-name error/postgresql.log.2018-03-16-20 > postgresql.log.2018-03-16-20

@marksher thank you very much for that code. I cleaned it up just a little bit and turned it into something I can run in a slightly more automated fashion:

https://gist.github.com/joer14/4e5fc38a832b9d96ea5c3d5cb8cf1fe9

Whatever option is used, the doc says: "Downloads all or a portion of the specified log file, up to 1 MB in size."
