Keda: Azure Event Hubs scaler does not work for Java consumer app

Created on 20 Apr 2020 · 8 Comments · Source: kedacore/keda

I have an Azure Event Hubs consumer application based on the Azure Event Hubs Java SDK. To be specific, I am using azure-messaging-eventhubs v5.0.3 and azure-messaging-eventhubs-checkpointstore-blob v1.0.3 for the blob store based checkpointing.

KEDA is unable to scale this consumer application (details below).

Expected Behavior

KEDA should be able to scale the Event Hubs consumer application

Actual Behavior

Scaling fails with the following error:

E0418 07:45:46.862595       1 provider.go:88] keda_metrics_adapter/provider "msg"="error getting metric for scaler" "error"="unable to get checkpoint from storage: unable to download file from blob storage: -\u003e github.com/Azure/azure-storage-blob-go/azblob.newStorageError, /go/pkg/mod/github.com/!azure/[email protected]/azblob/zc_storage_error.go:42\n===== RESPONSE ERROR (ServiceCode=BlobNotFound) =====\nDescription=The specified blob does not exist.\nRequestId:ea0bb7e9-101e-0035-2b55-154ca2000000\nTime:2020-04-18T07:45:46.7756201Z, Details: \n   Code: BlobNotFound\n   GET https://THE_STORAGE_ACCOUNT_NAME.blob.core.windows.net/CONTAINER_NAME/EVENTHUBS_CONSUMER_GROUP_NAME/0?timeout=61\n   Authorization: REDACTED\n   User-Agent: [Azure-Storage/0.7 (go1.13.3; linux)]\n   X-Ms-Client-Request-Id: [2213b5a8-f514-4d2f-6386-ae3953697075]\n   X-Ms-Date: [Sat, 18 Apr 2020 07:45:46 GMT]\n   X-Ms-Version: [2018-11-09]\n   --------------------------------------------------------------------------------\n   RESPONSE Status: 404 The specified blob does not exist.\n   Content-Length: [215]\n   Content-Type: [application/xml]\n   Date: [Sat, 18 Apr 2020 07:45:46 GMT]\n   Server: [Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0]\n   X-Ms-Error-Code: [BlobNotFound]\n   X-Ms-Request-Id: [ea0bb7e9-101e-0035-2b55-154ca2000000]\n   X-Ms-Version: [2018-11-09]\n\n\n"  "ScaledObject.Name"="azure-eventhub-scaledobject" "ScaledObject.Namespace"="default" "Scaler"={}

Key part of the log: Code: BlobNotFound\n GET https://THE_STORAGE_ACCOUNT_NAME.blob.core.windows.net/CONTAINER_NAME/EVENTHUBS_CONSUMER_GROUP_NAME/0?timeout=61\n (where 0 is the partition ID of the event hub)

Steps to Reproduce the Problem

Test app and k8s YAMLs are available in this repo (if needed) - https://github.com/abhirockzz/eventhubs-keda-java

  1. deploy a Java SDK based azure event hubs consumer application
  2. deploy scaled object for event hubs
  3. check KEDA operator logs

Specifications

  • KEDA Version: 1.3.0
  • Kubernetes Version: v1.15.7
  • Scaler(s): Azure Event Hubs
Labels: azure, bug, scaler-azure-event-hubs

All 8 comments

The problem is:

  1. checkpointing depends on Azure Blob Storage
  2. different language SDKs implement this in different ways, both in the checkpoint (JSON) payload and in the path within the Storage container

Today this is handled using an if-else block that depends on the blobContainer name (from the metadata) and builds a blob store URL, expecting that the SDK will adhere to it (the same goes for the checkpoint payload). These (and any other) concerns need to be abstracted out for the scaler to be flexible and usable.

Here is the full path of the checkpoint document in the blob store container I created: https://[STORAGE_ACC_NAME].blob.core.windows.net/[CONTAINER_NAME]/[EVENTHUBS_NAMESPACE]/[EVENTHUB_NAME]/[CONSUMER_GROUP]/checkpoint/[PARTITION_ID]. Clearly, this is not the same as the one shown in the logs (hence the problem). I notice that #376 and #516 were closed via #517, but I suspect the problem in this case is that I am using the new Java SDK for Event Hubs, whereas the fix (#517) was for legacy Event Processor Host based client apps. The new SDK has a different storage URL format.
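
To make the mismatch concrete, here is a minimal Go sketch of how the two blob layouts seen in this thread could sit behind one interface instead of an if-else on the container name. Everything in it is an assumption for illustration: the type names, the `checkpointURL` helper, and the "legacy" / "blobSDK" keys are hypothetical and are not KEDA's actual code or metadata fields. Only the two path shapes come from the log and the URL above.

```go
package main

import (
	"fmt"
	"strings"
)

// checkpointInfo holds the values needed to locate a checkpoint blob.
// Hypothetical type for this sketch; not part of KEDA.
type checkpointInfo struct {
	StorageAccount string
	Container      string
	Namespace      string // Event Hubs namespace
	EventHub       string
	ConsumerGroup  string
	PartitionID    string
}

// checkpointLayout abstracts where a given SDK stores its checkpoint blob.
type checkpointLayout interface {
	BlobPath(info checkpointInfo) string
}

// legacyLayout mirrors the path requested in the error log:
// CONTAINER/CONSUMER_GROUP/PARTITION_ID
type legacyLayout struct{}

func (legacyLayout) BlobPath(i checkpointInfo) string {
	return strings.Join([]string{i.Container, i.ConsumerGroup, i.PartitionID}, "/")
}

// blobSDKLayout mirrors the path observed with the new Java SDK:
// CONTAINER/NAMESPACE/EVENTHUB/CONSUMER_GROUP/checkpoint/PARTITION_ID
type blobSDKLayout struct{}

func (blobSDKLayout) BlobPath(i checkpointInfo) string {
	return strings.Join([]string{
		i.Container, i.Namespace, i.EventHub, i.ConsumerGroup, "checkpoint", i.PartitionID,
	}, "/")
}

// layouts maps an explicit (hypothetical) setting to a layout, so the
// choice no longer depends on the container name.
var layouts = map[string]checkpointLayout{
	"legacy":  legacyLayout{},
	"blobSDK": blobSDKLayout{},
}

// checkpointURL builds the full blob URL for the selected layout.
func checkpointURL(kind string, i checkpointInfo) (string, error) {
	l, ok := layouts[kind]
	if !ok {
		return "", fmt.Errorf("unknown checkpoint layout %q", kind)
	}
	return fmt.Sprintf("https://%s.blob.core.windows.net/%s", i.StorageAccount, l.BlobPath(i)), nil
}

func main() {
	info := checkpointInfo{
		StorageAccount: "STORAGE_ACC_NAME",
		Container:      "CONTAINER_NAME",
		Namespace:      "EVENTHUBS_NAMESPACE",
		EventHub:       "EVENTHUB_NAME",
		ConsumerGroup:  "CONSUMER_GROUP",
		PartitionID:    "0",
	}
	for _, kind := range []string{"legacy", "blobSDK"} {
		u, err := checkpointURL(kind, info)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(kind, u)
	}
}
```

With a shape like this, supporting another SDK would mean adding one more layout rather than extending the branch. The checkpoint payload differences mentioned above would need a similar abstraction, which this sketch does not cover.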

This feels related to https://github.com/kedacore/keda/issues/762 and might have same root cause

Indeed! And here is one for Python #741. This behavior needs to be (somehow) abstracted out

@abhirockzz great investigation. It would be really great if you can propose a fix for this :)

I think we should fix it from a #769 perspective and cover all languages

Sure @zroubalik I'll work on a proposal to help rectify this

Looking into it.
