Fec-cms: Investigate slow queries

Created on 30 Nov 2017  路  5Comments  路  Source: fecgov/fec-cms

@AmyKort commented on Tue Nov 28 2017

We experienced API outages on 11/15/17, and undertook a series of mitigation steps. https://github.com/18F/openFEC/issues/2756

One outstanding issue is to investigate slow queries, which we'll be undertaking now.


@AmyKort commented on Tue Nov 28 2017

@patphongs Could you tack on the post-mortem doc? Thanks!


@patphongs commented on Tue Nov 28 2017

Post mortem document started by @LindsayYoung is here :lock:

Most helpful comment

All 5 comments

Added logging of queries that take longer than 1s to execute on all the RDS databases using the AWS console. Also created a PR https://github.com/18F/fec-infrastructure/pull/40 to make this logging permanent.

Used aws cli to download the logs from AWS using commands of the form:

aws rds download-db-log-file-portion --db-instance-identifier fec-govcloud-prod-replica-2 --log-file-name "error/postgresql.log.2017-12-04-01" --output text > postgresql.log.2017-12-04-01.log

Analyzed the downloaded logs for a 10 hour period on 2017-12-04 using pgBadger . pgBadger can be installed on OS X using Homebrew.
The command to analyze the logs (as RDS appears to use a custom log) is:

pgbadger --prefix '%t:%r:%u@%d:[%p]:' *.log

This yielded a nice HTML report that is easy to analyze. Attaching a sample: out.html.gz

The next step is to analyze the slow queries.

Tagging @ccostino, @LindsayYoung and @patphongs to see attached pgBadger report ^^^.

Thank you so much, @vrajmohan! Will take a look very soon!

Was this page helpful?
0 / 5 - 0 ratings