
Looking at various statistic method that AWS supports on their dashboards made me start thinking it might be useful to add a config option into aws module for different statistic methods when querying cloudwatch metrics. Currently in aws module, for each metricset, there is a default/hardcoded statistic method. If we make this configurable, users would have more flexibility to query metrics and the aws module itself will behave more similar to Cloudwatch itself.
With statistic method option, the config for aws module will look like:
- module: aws
period: 300s
metricsets:
- ec2
access_key_id: '${AWS_ACCESS_KEY_ID}'
secret_access_key: '${AWS_SECRET_ACCESS_KEY}'
session_token: '${AWS_SESSION_TOKEN}'
default_region: '${AWS_REGION:us-west-1}'
regions:
- us-west-1
statistic: Average
Two problems here:
- module: aws
period: 300s
metricsets:
- ec2
statistic:
- Average
- Sum
If we do allow multiple methods, then same metrics will be reported multiple times with different statistic methods applied.
Also we need to consider if the code should treat the config example above same as the config below (how to make Cloudwatch query more efficient that way).
- module: aws
period: 300s
metricsets:
- ec2
statistic:
- Average
- module: aws
period: 300s
metricsets:
- ec2
statistic:
- Sum
For NumberOfMessagesSent, it's using Sum by default:

For ApproximateNumberOfMessagesVisible, it's using Average by default:

This problem is the main thing that's stopping me from implementing: the ability to specify different statistic methods for different metrics in the same metricset.
After discussing with @exekias , we think it will be good to support both statistic methods and a list of metric names that user wants to collect.
- module: aws
period: 300s
metricsets:
- cloudwatch
access_key_id: '${AWS_ACCESS_KEY_ID}'
secret_access_key: '${AWS_SECRET_ACCESS_KEY}'
session_token: '${AWS_SESSION_TOKEN}'
default_region: '${AWS_REGION:us-west-1}'
regions:
- us-west-1
statistic_methods: ["Average", "Sum"]
cloudwatch_metrics:
- namespace: AWS/EC2
metric_names: ["CPUUtilization", "DiskReadOps", "DiskWriteOps"]
Please correct me if I'm wrong 馃槵 I just realized I didn't update this right after our meeting. So my memory is a little fuzzy now 馃槄
that sounds good to me!
I'm wondering now, if you can provide several Statistic methods, how will the metric names end up looking? Does it make sense to require several satistics for the same metric?
@exekias Good point, I didn't think about changing names. I was thinking just add a field name in the event called aws.statistic_method or something. So if you require two statistics for the same metric, you will get two events hmmm
To the question of if it's useful, I think if we support metric_names, then the answer I think is no. Users will have the flexibility to specify different statistic methods on different metrics and that's what it matters. Then the config will look like:
- module: aws
period: 300s
metricsets:
- cloudwatch
access_key_id: '${AWS_ACCESS_KEY_ID}'
secret_access_key: '${AWS_SECRET_ACCESS_KEY}'
session_token: '${AWS_SESSION_TOKEN}'
default_region: '${AWS_REGION:us-west-1}'
regions:
- us-west-1
cloudwatch_metrics:
- namespace: AWS/EC2
metric_names: ["CPUUtilization", "DiskReadOps", "DiskWriteOps"]
statistic_method: Average
- namespace: AWS/EC2
metric_names: ["NetworkIn", "NetworkOut"]
statistic_method: Sum
Should we only make this available on cloudwatch metricset?
that sounds good to me
Yay, I will move this issue to ready then 馃憤 Thank you for all the great ideas/discussions!
just checking:
mymetric.avg instead of adding a label for them?will it be possible to supply a list of statistic methods?
cloudwatch_metrics:
- namespace: AWS/EC2
metric_names: ["CPUUtilization", "DiskReadOps", "DiskWriteOps"]
statistic_method: ["Average", "Minimum"]
without it, you'd need a config block for each statistics function, right?
cloudwatch_metrics:
- namespace: AWS/EC2
metric_names: ["CPUUtilization", "DiskReadOps", "DiskWriteOps"]
statistic_method: "Average"
- namespace: AWS/EC2
metric_names: ["CPUUtilization", "DiskReadOps", "DiskWriteOps"]
statistic_method: "Minimum"
- namespace: AWS/EC2
metric_names: ["CPUUtilization", "DiskReadOps", "DiskWriteOps"]
statistic_method: "Maximum"
- namespace: AWS/EC2
metric_names: ["CPUUtilization", "DiskReadOps", "DiskWriteOps"]
statistic_method: "Count"
@roncohen Right now we are only hardcoding the statistic method for each metricset. I just find the default one statistic method from AWS Cloudwatch for each metricset because I was worries about how many metrics we will be reporting if default is for all statistic method. But I do agree, if default to be all statistic method, then it will give users the best out of the box experience.
For supporting a list of statistic_methods with mymetric.avg naming convention, I think if we use that, we can support a list of statistic_methods. @exekias Do you know if we have any ECS convention on statistic method naming? I thought I saw some before but I can't find any now.
I guess retrieving all methods by default is ok for cloudwatch metricset users, but really hope we can get opinionated on our modules using it, for 2 main reasons:
If we go down this path, I'm also +1 to allowing defining a list of aggregations for statistic_method.
For supporting a list of statistic_methods with
mymetric.avgnaming convention, I think if we use that, we can support a list of statistic_methods. @exekias Do you know if we have any ECS convention on statistic method naming? I thought I saw some before but I can't find any now.
I don't think this is fully described anywhere, maybe you saw https://www.elastic.co/guide/en/beats/devguide/current/event-conventions.html (let's open a separate PR to add this maybe?).
In any case, I think that's the notation we normally use, so appending .avg, .max, .sum etc sounds good to me.
As a note, this will be a breaking change, but should be ok as cloudwatch metricset is in beta.
Oh no... breaking change...! But since cloudwatch is in beta, can we still make the change and merge it into the 7.x release right? Or we have to wait for 8.0?
We try to keep them to a minimum, but do them if needed. It is ok to to this kind of changes to a Beta module. I would say it's a requirement before moving this metricset to GA
Sorry for coming back to this one, but this is also relevant in the context of ongoing Azure monitor design. I really hope we can agree on a common settings layout for both AWS cloudwatch and Azure monitor metricsets. Each one has different naming (ie AWS uses the term statistic, while Azure uses aggregation), I think we should probably respect that but keep the same layout for settings. So far we have these concepts (I'm keeping AWS terminology for this issue):
Namespace - the root for a group of metrics
Metric name - name of the metric within the namespace
Statistic methods - statistic methods to use when retrieving a metric
Our last proposed layout has the drawback of repeating the namespace when we want a different set of statistics depending on the metric, we could do a different grouping to solve that:
- module: aws
...
metricsets:
- cloudwatch
cloudwatch_metrics:
- namespace: AWS/EC2
metrics:
- name: ["DiskReadOps", "DiskWriteOps"] # this could accept also a single value
statistic: ["Average", "Maximum"]
- name: CPUUtilization
statistic: Average
We could also decide that namespace is tied to the metricset like this:
- module: aws
...
metricsets:
- cloudwatch
cloudwatch:
namespace: AWS/EC2
metrics:
- name: ["DiskReadOps", "DiskWriteOps"] # this could accept also a single value
statistic: ["Average", "Maximum"]
- name: CPUUtilization
statistic: Average
I'm leaning towards this last example, as we plan to use this metricset as an input for lightweight modules, where normally a single namespace is required. If needed, you can always create several instances of the module for different namespaces.
Any opinions? Do you expect several namespaces being needed when monitoring a single cloud service (AWS ELB, Azure API Management, etc)?
In all cases this would return in documents looking like this:
{
"aws": {
"cloudwatch": {
"namespace": "AWS/EC2",
"metrics": {
"DiskReadOps.avg": 30.2,
"DiskReadOps.max": 34,
"CPUUtilization.avg": 1.2,
}
}
}
}
cc @narph, would love your input here too
@exekias The only thing I can think of regarding to "Do you expect several namespaces being needed when monitoring a single cloud service" is: when users monitoring their EC2 instances, they might be interested in both AWS/EC2 namespace as well as the AWS/RDS or AWS/EBS at the same time.
But as you said, since users can always create more instances of the cloudwatch metricset for different namespaces, I think the second option will work.
@exekias , there is a notable difference between the azure monitor and aws cloudwatch configuration and that is the resources that have to be identified before listing the namespaces.
This is why I went for:
- module: azure
metricsets:
- monitor
enabled: true
period: 60s
client_id: ''
client_secret: ''
tenant_id: ''
subscription_id: ''
refresh_list_interval: 120s
resources:
- resource_id: "/subscriptions/xxxxx/resourceGroups/obs-infrastructure/providers/Microsoft.ApiManagement/service/obsapimanagement"
namespaces:
- name: "Microsoft.ApiManagement/service"
metrics:
- name: "Requests"
aggregations: ["Maximum"]
dimensions:
- name: "Hostname"
value: "obsapimanagement.azure-api.net"
- resource_group: "obs-infrastructure"
resource_type: "Microsoft.Compute/virtualMachines"
namespaces:
- name: "Microsoft.Compute/virtualMachines"
metrics:
- name: "Percentage CPU"
aggregations: ["Average", "Minimum", "Maximum"]
- name: "Microsoft.Compute/virtualMachines"
metrics:
- name: ["Disk Read Bytes", "Data Disk Write Operations/Sec", "Network Out"]
aggregations: ["Average", "Maximum"]
- resource_query: ""
....
AWS does not have that.
The reason I am not leaning at all towards "monitor_metrics" and then repeat the "metrics" naming inside is because I am not following up with the metrics or namespaces but with the actual resource id or option of identifying these resources so I went with "resources", alternatives that I can think about are "monitor_resources".
Resources is a very common notion in azure and I think whatever we decide to name it, it should represent the fact that we are expecting users to configure ways to select resources and under each option the desired metrics.
Are we final on the document format returned?
The example above is different to what we were thinking about in the proposal and https://www.elastic.co/guide/en/beats/devguide/current/event-conventions.html
Ex:
"azure" : {
"monitor" : {
"resource" : {
"name" : "obstestmemleak",
"location" : "West Europe",
"type" : "Microsoft.Compute/virtualMachines"
},
"namespace" : "Microsoft.Compute/virtualMachines",
"subscriptionId" : "unique identifier",
"metrics" : {
"percentage_cpu.max" : 1.7,
"percentage_cpu.avg" : 1.34,
"percentage_cpu.min" : 1.04
}
}
},
The metric name is "Percentage CPU" and the aggregations selected are max, avg, min.
Several changes have been applied to the metric name , replaced " " with "_", lowercase the name. In some cases I am checking for "../sec" and replacing it with "_per_sec".
Should we just keep the original metric naming then and list the aggregations inside?
Thank you both for the answers
@exekias , there is a notable difference between the azure monitor and aws cloudwatch configuration and that is the resources that have to be identified before listing the namespaces.
This is why I went for:- module: azure metricsets: - monitor enabled: true period: 60s client_id: '' client_secret: '' tenant_id: '' subscription_id: '' refresh_list_interval: 120s resources: - resource_id: "/subscriptions/xxxxx/resourceGroups/obs-infrastructure/providers/Microsoft.ApiManagement/service/obsapimanagement" namespaces: - name: "Microsoft.ApiManagement/service" metrics: - name: "Requests" aggregations: ["Maximum"] dimensions: - name: "Hostname" value: "obsapimanagement.azure-api.net" - resource_group: "obs-infrastructure" resource_type: "Microsoft.Compute/virtualMachines" namespaces: - name: "Microsoft.Compute/virtualMachines" metrics: - name: "Percentage CPU" aggregations: ["Average", "Minimum", "Maximum"] - name: "Microsoft.Compute/virtualMachines" metrics: - name: ["Disk Read Bytes", "Data Disk Write Operations/Sec", "Network Out"] aggregations: ["Average", "Maximum"] - resource_query: "" ....AWS does not have that.
From your examples it would seem that for each resource filter you always use the same namespace, which probably makes sense. Perhaps resource filter and namespace should have a 1:1 mapping? That would mean the namespace is defined at the same level where you define the resource filter.
The reason I am not leaning at all towards "monitor_metrics" and then repeat the "metrics" naming inside is because I am not following up with the metrics or namespaces but with the actual resource id or option of identifying these resources so I went with "resources", alternatives that I can think about are "monitor_resources".
Resources is a very common notion in azure and I think whatever we decide to name it, it should represent the fact that we are expecting users to configure ways to select resources and under each option the desired metrics.
+1
Are we final on the document format returned?
The example above is different to what we were thinking about in the proposal and https://www.elastic.co/guide/en/beats/devguide/current/event-conventions.html
Ex:"azure" : { "monitor" : { "resource" : { "name" : "obstestmemleak", "location" : "West Europe", "type" : "Microsoft.Compute/virtualMachines" }, "namespace" : "Microsoft.Compute/virtualMachines", "subscriptionId" : "unique identifier", "metrics" : { "percentage_cpu.max" : 1.7, "percentage_cpu.avg" : 1.34, "percentage_cpu.min" : 1.04 } } },The metric name is "Percentage CPU" and the aggregations selected are max, avg, min.
Several changes have been applied to the metric name , replaced " " with "_", lowercase the name. In some cases I am checking for "../sec" and replacing it with "_per_sec".
Should we just keep the original metric naming then and list the aggregations inside?
Having the aggregation as a suffix or subdocument makes no difference for Elasticsearch, so your document looks good to me! I've updated my example to avoid confusion, as implementation wise, this is easier.
@exekias How about we remove the cloudwatch: in config and directly specify the namespace and/or metrics? If there is no metrics specified, then by default it will collect all available metrics from that namespace and all statistic methods.
- module: aws
period: 300s
metricsets:
- cloudwatch
namespace: AWS/EC2
metrics:
- name: ["CPUUtilization", "DiskReadOps", "DiskWriteOps"]
statistic: ["Average", "Maximum"]
dimensions:
- name: InstanceId
value: i-0686946e22cf9494a
- module: aws
period: 300s
metricsets:
- cloudwatch
namespace: AWS/RDS
metrics:
- statistic: ["Average", "Maximum"]
- module: aws
period: 300s
metricsets:
- cloudwatch
namespace: AWS/ELB
We have done this kind of namespacing (cloudwatch.*) for metricset setings in the past (check system module for instance, with process settings). That said, there is no standard way defined for this, and the proposed examples look good to me :+1:
It seems that in Azure, in some cases, multiple namespaces may be needed. For instance to retrieve custom metrics along with VM metrics in Azure: https://github.com/MicrosoftDocs/azure-docs/issues/28457.
So I would say the safe play is doing this for AWS:
- module: aws
period: 300s
metricsets:
- cloudwatch
metrics:
- name: ["CPUUtilization", "DiskWriteOps"]
namespace: AWS/EC2
statistic: ["Average", "Maximum"]
- name: ["DiskReadOps"]
namespace: AWS/EC2
statistic: ["Sum"]
And figure out Azure afterwards, as it may be different (to be seen)
It seems that in Azure, in some cases, multiple namespaces may be needed. For instance to retrieve custom metrics along with VM metrics in Azure: MicrosoftDocs/azure-docs#28457.
Yes, so it seems that users can create and send the custom metrics themselves using the following methods to Azure Monitor.
I chose the second option in the list:
and I could already see the added metric namespace to my resource:
{
"value":[
{
"id":"/subscriptions/xxxxxx/resourceGroups/obs-infrastructure/providers/Microsoft.Compute/virtualMachines/obstestmemleak/providers/microsoft.insights/metricNamespaces/Azure.VM.Windows.GuestMetrics",
"name":"Azure.VM.Windows.GuestMetrics",
"type":"Microsoft.Insights/metricNamespaces",
"classification":"Custom",
"properties":{
"metricNamespaceName":"Azure.VM.Windows.GuestMetrics"
}
},
{
"id":"/subscriptions/xxxxxxxxx/resourceGroups/obs-infrastructure/providers/Microsoft.Compute/virtualMachines/obstestmemleak/providers/microsoft.insights/metricNamespaces/Microsoft.Compute-virtualMachines",
"name":"Microsoft.Compute-virtualMachines",
"type":"Microsoft.Insights/metricNamespaces",
"classification":"Platform",
"properties":{
"metricNamespaceName":"Microsoft.Compute/virtualMachines"
}
}
]
}
I was then able to retrieve the metrics specified:
- resource_group: "obs-infrastructure"
resource_type: "Microsoft.Compute/virtualMachines"
namespaces:
- name: "Microsoft.Compute/virtualMachines"
metrics:
- name: "Percentage CPU"
aggregations: ["Average", "Minimum", "Maximum"]
- name: "Microsoft.Compute/virtualMachines"
metrics:
- name: ["Disk Read Bytes", "Data Disk Write Operations/Sec", "Network Out"]
aggregations: ["Average", "Maximum"]
- name: "Azure.VM.Windows.GuestMetrics"
metrics:
- name: ["Memory\\Committed Bytes", "System\\Processes"]
aggregations: ["Average", "Maximum"]
So a resource can allow multiple namespaces.
Thank you for sharing @narph!
So it seems clear that there are some differences between Azure and AWS, we should not try to force the same config. That said, I see an opportunity for a more similar layout, from your example:
- resource_group: "obs-infrastructure"
resource_type: "Microsoft.Compute/virtualMachines"
metrics:
- name: "Percentage CPU"
namespace: "Microsoft.Compute/virtualMachines"
aggregations: ["Average", "Minimum", "Maximum"]
- name: ["Disk Read Bytes", "Data Disk Write Operations/Sec", "Network Out"]
namespace: "Microsoft.Compute/virtualMachines"
aggregations: ["Average", "Maximum"]
- name: ["Memory\\Committed Bytes", "System\\Processes"]
namespace: "Azure.VM.Windows.GuestMetrics"
aggregations: ["Average", "Maximum"]
WDYT?
sounds good, looks more compact already
Awesome! I will start the implementation on the aws side. @exekias @narph Thanks for all the discussion!