For every workflow, many users want to know how much of each resource (CPU, memory) was used. This information is very useful for computing costs. We could use the pod resource requests or limits to show each pod's resource usage.
Maybe `argo get <workflow>` could show info like this. Can someone accept this feature? I can raise a PR for it.
Example:
```
➜ argo git:(release-2.2) argo get hello-world-gdpwz
Name:            hello-world-gdpwz
Namespace:       default
ServiceAccount:  default
Status:          Succeeded
Created:         Wed Oct 31 15:01:27 +0800 (4 minutes ago)
Started:         Wed Oct 31 15:01:27 +0800 (4 minutes ago)
Finished:        Wed Oct 31 15:01:50 +0800 (3 minutes ago)
Duration:        23 seconds
Total CPU:       0.00767 (core*hour)
Total Memory:    0.00639 (GB*hour)

STEP                  PODNAME            DURATION  MESSAGE  CPU(core*hour)  MEMORY(GB*hour)
 ✔ hello-world-gdpwz  hello-world-gdpwz  23s                0.00767         0.00639
```
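The totals above could plausibly be computed as resource request × wall-clock duration. A minimal sketch of that arithmetic (the 1.2-core and 1.0 GB requests are back-solved assumptions that happen to reproduce the numbers in the example; they are not taken from Argo's implementation):

```python
from datetime import timedelta

def resource_hours(request: float, duration: timedelta) -> float:
    """Estimate usage as resource request * wall-clock hours."""
    return request * duration.total_seconds() / 3600.0

# Hypothetical pod: 1.2 CPU cores and 1.0 GB requested, running for 23 s.
run = timedelta(seconds=23)
print(round(resource_hours(1.2, run), 5))  # 0.00767 (core*hour)
print(round(resource_hours(1.0, run), 5))  # 0.00639 (GB*hour)
```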
For cost purposes, it would be great if this reflected the container start time (not the stage/pod start time, which includes time spent in the Pending pod status). GPU resources should also be covered.
Some thoughts from my experimentation...
We can only calculate cost estimates based on the data we have:
Anything else?
We can mix in weightings, e.g. CPU might be more costly per unit.
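Mixing in weightings could be as simple as a per-resource multiplier; a sketch under the assumption that usage is already expressed in unit-hours (the weight values are made-up placeholders, not real prices):

```python
# Hypothetical per-unit weights (e.g. $ per unit-hour); real values would
# come from a cloud provider's pricing and vary by instance type.
WEIGHTS = {"cpu": 0.04, "memory": 0.005, "gpu": 0.90}

def weighted_cost(usage: dict) -> float:
    """Combine per-resource usage (unit*hour) into one weighted number."""
    return sum(WEIGHTS[kind] * hours for kind, hours in usage.items())

print(weighted_cost({"cpu": 0.00767, "memory": 0.00639}))
```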
Yeah, it is extremely difficult to get accurate costs; entire projects like Kubecost have formed to attempt to track cost accurately (integration with cloud providers' pricing APIs, node instance types, integration with kube-state-metrics, ...). There is cost in spinning machines up/down for pods, in container creation (downloading large Docker images), and in situations where one resource type is not the bottleneck on the machine, so it is essentially free.
For our team, we just need a rough estimate of cost, and the 3 you list satisfy that need. Specifically, we need to compare workflows against each other and be able to improve the efficiency of our workflows, ideally via SQL queries against the JSON column in the workflow archive in Postgres.
Having the estimated (CPU*hour), (RAM*hour), and (GPU*hour) for the workflow, as @xianlubird suggests, persisted to the workflow archive would be sufficient for our needs. Being able to configure something like $/unit or weights, as you suggest, may work as well. That could make it a bit more difficult for us to customize cost metrics for workflows that run on different instance types, unless weights were configurable per-workflow or per-step. Regardless, this feature will be hugely beneficial, thanks!
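Per-workflow weights with a fallback to defaults could be a simple lookup; a hypothetical sketch (none of these names or values exist in Argo, and the instance-type pricing is invented):

```python
# Default per-unit weights, with hypothetical overrides for workflows
# pinned to pricier instance types.
DEFAULT_WEIGHTS = {"cpu": 0.04, "memory": 0.005}
WORKFLOW_WEIGHTS = {
    "gpu-training": {"cpu": 0.08, "memory": 0.01},
}

def weights_for(workflow_name: str) -> dict:
    """Return the workflow's weight overrides merged over the defaults."""
    return {**DEFAULT_WEIGHTS, **WORKFLOW_WEIGHTS.get(workflow_name, {})}

# Workflows without an override fall back to the defaults.
assert weights_for("hello-world-gdpwz") == DEFAULT_WEIGHTS
```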
Wonder if it would make most sense for resource utilization to simply be a step-level metric (https://github.com/argoproj/argo/issues/1539)
@xianlubird @ddseapy I have a whitespace day today I plan to work on this. At 5:30pm PST (8 hours away) I will down tools and what is done is done.
As I have limited time - I don't expect this can include $ cost.
@xianlubird @ddseapy I've created a POC PR that also captures pod metrics, conceptually allowing us to see how much resource you requested vs how much you actually used.
This is fixed in v2.7