We have some automation around DMS pipelines for a nightly dump of our databases. A small percentage of DMS tasks fail to start because the initial endpoint connection test fails. I'm attempting to handle this by re-testing the connection before starting the task.
DatabaseMigrationService.Waiter.TestConnectionSucceeds is supposed to wait for an in-progress connection test to succeed and then return. Instead, it errors out immediately with:
Waiter TestConnectionSucceeds failed: Connection is already being tested: WaiterError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 64, in lambda_handler
waiter.wait(ReplicationInstanceArn=replication_instance_arn, EndpointArn=source_endpoint_arn)
File "/var/runtime/botocore/waiter.py", line 53, in wait
Waiter.wait(self, **kwargs)
File "/var/runtime/botocore/waiter.py", line 313, in wait
last_response=response
botocore.exceptions.WaiterError: Waiter TestConnectionSucceeds failed: Connection is already being tested
I'm not too familiar with how boto waiters work, but it seems like this waiter might just be calling the DMS TestConnection API, which returns an error when called a second time (while the connection is still being tested). I've reproduced a similar error by using the CLI directly:
$ aws dms test-connection --replication-instance-arn arn:aws:dms:us-east-1:*****:rep:65LSNAJCV7QHFPNWAZUHZ5DNHQ --endpoint-arn arn:aws:dms:us-east-1:*****:endpoint:TNS6FYCD4JYFMNUYLI2OCQJMPI
{
"Connection": {
"ReplicationInstanceArn": "arn:aws:dms:us-east-1:*****:rep:65LSNAJCV7QHFPNWAZUHZ5DNHQ",
"EndpointArn": "arn:aws:dms:us-east-1:*****:endpoint:TNS6FYCD4JYFMNUYLI2OCQJMPI",
"Status": "testing",
"EndpointIdentifier": "datatruck-scylla-nextaccounting-shards-read-replica-02-0116",
"ReplicationInstanceIdentifier": "datatruck-scylla-next-accounting-shard-0116"
}
}
$ aws dms test-connection --replication-instance-arn arn:aws:dms:us-east-1:*****:rep:65LSNAJCV7QHFPNWAZUHZ5DNHQ --endpoint-arn arn:aws:dms:us-east-1:*****:endpoint:TNS6FYCD4JYFMNUYLI2OCQJMPI
An error occurred (InvalidResourceStateFault) when calling the TestConnection operation: Connection is already being tested
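Based on that output, it looks like the current status can be read back without re-triggering the test by using DescribeConnections. A minimal sketch (get_connection_status is a hypothetical helper of my own, not part of the SDK):

import boto3

client = boto3.client('dms')

# Minimal sketch: read the current connection status via DescribeConnections
# instead of calling TestConnection again while a test is already in flight.
# get_connection_status is a hypothetical helper, not part of the SDK.
def get_connection_status(client, replication_instance_arn, endpoint_arn):
    response = client.describe_connections(
        Filters=[
            {'Name': 'endpoint-arn', 'Values': [endpoint_arn]},
            {'Name': 'replication-instance-arn', 'Values': [replication_instance_arn]},
        ]
    )
    connections = response.get('Connections', [])
    # Status is one of: 'testing', 'successful', 'failed', 'deleting'
    return connections[0]['Status'] if connections else None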
Here's the relevant part of my python code:
# response comes from a prior describe_replication_tasks call
replication_task = response['ReplicationTasks'][0]
replication_task_arn = replication_task['ReplicationTaskArn']
source_endpoint_arn = replication_task['SourceEndpointArn']
target_endpoint_arn = replication_task['TargetEndpointArn']
replication_instance_arn = replication_task['ReplicationInstanceArn']

logger.info(f"Testing connection between replication instance {replication_instance_arn} and endpoint {source_endpoint_arn}")
logger.info("Waiting for successful connection...")
waiter = client.get_waiter('test_connection_succeeds')
waiter.wait(ReplicationInstanceArn=replication_instance_arn, EndpointArn=source_endpoint_arn)
logger.info(f"Starting replication task '{replication_task_arn}' from source {source_endpoint_arn} to target {target_endpoint_arn} on {replication_instance_arn}")
The CLI appears to be broken in the same way:
$ aws dms wait test-connection-succeeds --replication-instance-arn arn:aws:dms:us-east-1:***:rep:65LSNAJCV7QHFPNWAZUHZ5DNHQ --endpoint-arn arn:aws:dms:us-east-1:***:endpoint:TNS6FYCD4JYFMNUYLI2OCQJMPI
Waiter TestConnectionSucceeds failed: Connection is already being tested
@mwarkentin Thanks for the report. Definitions of waiters are shared between the Python SDK and the AWS CLI (and all of our SDKs, for that matter). I can confirm that this waiter is broken for the reason you describe. We're working on getting this fixed. Labeling as a bug for now.
A fix for this was pushed in yesterday's (11/7/2018) release.
As of botocore v1.12.40, boto3 v1.9.40, and aws-cli v1.16.40 this waiter should function correctly.
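For reference, on a fixed version the waiter can be invoked as before; if the defaults don't fit, the poll interval and timeout can be tuned with WaiterConfig (values below are just an example):

waiter = client.get_waiter('test_connection_succeeds')
waiter.wait(
    ReplicationInstanceArn=replication_instance_arn,
    EndpointArn=source_endpoint_arn,
    WaiterConfig={'Delay': 5, 'MaxAttempts': 60},  # example values
)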
Thanks, I'll test it out soon!
I'm running into a similar issue with the DMS ReplicationTaskStopped waiter for both the AWS CLI and boto3.
I am running the following versions:
boto3 (1.9.127)
botocore (1.12.127)
aws-cli/1.16.135
I always get the error "Waiter ReplicationTaskStopped failed: Waiter encountered a terminal failure state" unless the task is already in the "stopped" state. It returns this error even if the task is in the "starting" or "running" state, which doesn't seem correct.
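In the meantime I'm considering polling DescribeReplicationTasks directly instead of using the waiter. A rough sketch (wait_for_task_stopped and its default timings are my own, not from the SDK):

import time
import boto3

client = boto3.client('dms')

# Rough workaround sketch: poll DescribeReplicationTasks until the task
# reaches 'stopped', instead of using the ReplicationTaskStopped waiter.
def wait_for_task_stopped(client, task_arn, delay=15, max_attempts=60):
    for _ in range(max_attempts):
        response = client.describe_replication_tasks(
            Filters=[{'Name': 'replication-task-arn', 'Values': [task_arn]}]
        )
        status = response['ReplicationTasks'][0]['Status']
        if status == 'stopped':
            return
        time.sleep(delay)
    raise TimeoutError(f'Task {task_arn} did not stop within the timeout')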