Loki: Random "RequestError: send request failed caused by: Post" against DynamoDB

Created on 14 Jul 2020 · 5 Comments · Source: grafana/loki

Describe the bug
I am running Loki on an AWS ECS cluster along with DynamoDB for indexes and S3 for storage.

Loki's config:

schema_config:
  configs:
  - from: 2020-01-01
    store: aws
    object_store: s3
    schema: v11
    index:
      prefix: ${DYNAMODB_TABLE}
      period: 168h

storage_config:
  aws:
    s3: s3://${S3_REGION}/${S3_BUCKET_NAME}
    dynamodb:
      dynamodb_url: dynamodb://${DYNAMODB_REGION}

To Reproduce
Steps to reproduce the behavior:

1. Started Loki 1.5.0.
2. Injected many logs.
3. Ran the following query, which failed with the error shown after it:

curl 'https://loki.staging.domain.com/loki/api/v1/query' --data-urlencode 'query={level=~"(debug|info|warning|error)"}'

QueryPage error: table=loki_2636, err=RequestError: send request failed
caused by: Post https://dynamodb.eu-west-1.amazonaws.com/: net/http: HTTP/1.x transport connection broken: http: ContentLength=165 with Body length 0

However, if I run the same query again, it succeeds:

curl 'https://loki.staging.domain.com/loki/api/v1/query' --data-urlencode 'query={level=~"(debug|info|warning|error)"}'
{"status":"success","data":{"resultType":"streams","result":[],"stats":{"summary":{"bytesProcessedPerSecond":0,"linesProcessedPerSecond":0,"totalBytesProcessed":0,"totalLinesProcessed":0,"execTime":0.032951913},"store":{"totalChunksRef":0,"totalChunksDownloaded":0,"chunksDownloadTime":0,"headChunkBytes":0,"headChunkLines":0,"decompressedBytes":0,"decompressedLines":0,"compressedBytes":0,"totalDuplicates":0},"ingester":{"totalReached":1,"totalChunksMatched":0,"totalBatches":0,"totalLinesSent":0,"headChunkBytes":0,"headChunkLines":0,"decompressedBytes":0,"decompressedLines":0,"compressedBytes":0,"totalDuplicates":0}}}}

The behavior is intermittent: the same query sometimes succeeds and sometimes fails.

Expected behavior
Queries should return results consistently, without intermittent request errors.

I believe it may be a similar problem to this issue: https://stackoverflow.com/questions/31337891/net-http-http-contentlength-222-with-body-length-0/31338443#31338443

Environment:

Infrastructure: Amazon Elastic Container Service, Amazon S3, Amazon DynamoDB

Source

Menda

All 5 comments

Can you check if it's a rate limit problem? If so, this is controllable by the provisioning config in the table manager.

owen-d on 14 Jul 2020

Hi @owen-d, thanks for your quick response.

This is the current configuration:

table_manager:
  index_tables_provisioning:
    provisioned_write_throughput: 5
    provisioned_read_throughput: 5
  retention_deletes_enabled: true
  retention_period: 336h

Maybe 5 is a really low value for provisioned_write_throughput and provisioned_read_throughput, considering the defaults are 3000 and 300? Also, what do those values actually represent?

Menda on 14 Jul 2020

Please see https://github.com/grafana/loki/blob/master/docs/operations/storage/table-manager.md#dynamodb-provisioning.
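
As a rough illustration of what that section covers: on DynamoDB, these two settings are the provisioned write and read capacity units for the index tables, meaning the per-second throughput DynamoDB accepts before it starts throttling requests. The fragment below is only a sketch that reuses the table_manager block from the comment above with the default values quoted there (3000 and 300). Appropriate numbers depend on the workload and are an assumption here, not a recommendation.

```yaml
table_manager:
  index_tables_provisioning:
    # Provisioned write capacity for the index tables
    # (default quoted in this thread: 3000; the failing setup used 5).
    provisioned_write_throughput: 3000
    # Provisioned read capacity for the index tables
    # (default quoted in this thread: 300; the failing setup used 5).
    provisioned_read_throughput: 300
  retention_deletes_enabled: true
  retention_period: 336h
```

Higher provisioned capacity also raises the DynamoDB cost, so these values are usually sized to the observed ingest and query load rather than set as high as possible.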

owen-d on 14 Jul 2020

All tests are working fine so far. I will come back if something does not work as expected; otherwise, I will close this issue.

Menda on 15 Jul 2020

All good. Thanks for your support!

Menda on 17 Jul 2020
