Cosmos-sdk: Prometheus Metrics Cleanup and Expansion

Created on 28 Aug 2018 · 32 comments · Source: cosmos/cosmos-sdk

Currently (excluding the go_*, promhttp_*, and process_* families) we export the following Prometheus metrics:

| Metric | type |
| -------| -----|
|consensus_block_interval_seconds | histogram |
|consensus_block_size_bytes | gauge |
|consensus_byzantine_validators | gauge |
|consensus_byzantine_validators_power | gauge |
|consensus_height | gauge |
|consensus_missing_validators | gauge |
|consensus_missing_validators_power | gauge |
|consensus_num_txs | gauge |
|consensus_rounds | gauge |
|consensus_total_txs | gauge |
|consensus_validators | gauge |
|consensus_validators_power | gauge |
|mempool_size | gauge |
|p2p_peers | gauge |

Several additional metrics have been requested by the validator community (https://github.com/tendermint/tendermint/issues/2272):

| Metric | type | tags | notes |
|-------|------|-----|-----|
| consensus_latest_block_height | gauge | none | latest_block_height from the /status sync_info response |
| consensus_catching_up | gauge | none | either 0 (syncing) or 1 (synced) |
| p2p_peer_receive_bytes_total | counter | peer_id | for debugging p2p issues |
| p2p_peer_send_bytes_total | counter | peer_id | for debugging p2p issues |
| p2p_peer_pending_send_bytes | gauge | peer_id | for debugging p2p issues |

_NOTE_: We may be able to cover the p2p metrics by exposing SendMonitor and RecvMonitor readings as histograms.
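
For illustration, a minimal sketch (not the actual Tendermint implementation) of how the requested peer-labelled p2p metrics could be declared with go-kit, which Tendermint already uses for its Prometheus metrics. The package name and the call site in the trailing comment are assumptions.

```go
// Sketch only: declaring the requested peer-labelled p2p metrics with go-kit.
// The package name and the call site shown in the comment below are
// assumptions, not existing Tendermint code.
package p2pmetrics

import (
	"github.com/go-kit/kit/metrics"
	prometheus "github.com/go-kit/kit/metrics/prometheus"
	stdprometheus "github.com/prometheus/client_golang/prometheus"
)

// PeerMetrics holds the per-peer counters/gauges proposed in the table above.
type PeerMetrics struct {
	ReceiveBytesTotal metrics.Counter // p2p_peer_receive_bytes_total
	SendBytesTotal    metrics.Counter // p2p_peer_send_bytes_total
	PendingSendBytes  metrics.Gauge   // p2p_peer_pending_send_bytes
}

// PrometheusPeerMetrics builds the metrics; go-kit's *From constructors
// register them with the default Prometheus registry as a side effect.
func PrometheusPeerMetrics() *PeerMetrics {
	return &PeerMetrics{
		ReceiveBytesTotal: prometheus.NewCounterFrom(stdprometheus.CounterOpts{
			Subsystem: "p2p",
			Name:      "peer_receive_bytes_total",
			Help:      "Number of bytes received from a given peer.",
		}, []string{"peer_id"}),
		SendBytesTotal: prometheus.NewCounterFrom(stdprometheus.CounterOpts{
			Subsystem: "p2p",
			Name:      "peer_send_bytes_total",
			Help:      "Number of bytes sent to a given peer.",
		}, []string{"peer_id"}),
		PendingSendBytes: prometheus.NewGaugeFrom(stdprometheus.GaugeOpts{
			Subsystem: "p2p",
			Name:      "peer_pending_send_bytes",
			Help:      "Number of bytes pending to be sent to a given peer.",
		}, []string{"peer_id"}),
	}
}

// Hypothetical call site, e.g. after the connection reads n bytes from a peer:
//   m.ReceiveBytesTotal.With("peer_id", string(peerID)).Add(float64(n))
```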

Other metrics that would be amazing to add:

| Metric | type | tags | notes |
|-------|------|-----|-----|
| mempool_num_txs | gauge | none | the number of transactions in the mempool at time of poll |
| mempool_tx_size_bytes | histogram | none | a histogram of transaction size |
| mempool_failed_txs | gauge | none | number of failed transactions |
| mempool_recheck_times | histogram | none | number of transactions rechecked in the mempool |
| p2p_num_txs | gauge | peer_id | number of transactions submitted by each peer_id |
| p2p_block_parts | gauge | peer_id | number of blockparts transmitted by peer |
| p2p_pending_send_bytes | gauge | peer_id | amount of data pending to be sent to peer |
| consensus_block_gas | gauge | none | amount of gas in the previous block (needs work) |
| consensus_block_processing_time | histogram | none | time between BeginBlock and EndBlock (see the sketch after this table) |
| sdk_num_transactions | counter | type | number of transactions by type |
| sdk_gov_proposals | counter | type | number of proposals |
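
As a rough illustration of the consensus_block_processing_time row, here is a minimal sketch of a timer that observes the BeginBlock-to-EndBlock duration into a histogram. The OnBeginBlock/OnEndBlock hooks are hypothetical, not existing ABCI or SDK callbacks.

```go
// Sketch only: timing BeginBlock -> EndBlock into a histogram for the proposed
// consensus_block_processing_time metric. The OnBeginBlock/OnEndBlock hooks
// are hypothetical, not existing ABCI or SDK callbacks.
package blocktimer

import (
	"time"

	"github.com/go-kit/kit/metrics"
	prometheus "github.com/go-kit/kit/metrics/prometheus"
	stdprometheus "github.com/prometheus/client_golang/prometheus"
)

type BlockTimer struct {
	processingTime metrics.Histogram
	begin          time.Time
}

func NewBlockTimer() *BlockTimer {
	return &BlockTimer{
		processingTime: prometheus.NewHistogramFrom(stdprometheus.HistogramOpts{
			Subsystem: "consensus",
			Name:      "block_processing_time",
			Help:      "Seconds elapsed between BeginBlock and EndBlock.",
			Buckets:   stdprometheus.DefBuckets,
		}, []string{}),
	}
}

// OnBeginBlock records when processing of the current block started.
func (t *BlockTimer) OnBeginBlock() { t.begin = time.Now() }

// OnEndBlock observes the elapsed processing time in seconds.
func (t *BlockTimer) OnEndBlock() {
	t.processingTime.Observe(time.Since(t.begin).Seconds())
}
```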

Other work that needs to be done on the prometheus implementation:

  • [ ] Add a configuration option that allows a metric prefix for all metrics coming out of the instance (e.g. tm_* or gaiad_*); see the sketch below.
  • [ ] Add the ability to deactivate metric families from the config file (consensus, p2p, mempool, etc...)
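
A minimal sketch of what such a config block might look like and how the prefix would flow into a metric name. The field names here are assumptions, not an existing Tendermint API; the prefix rides on the Namespace field that the Prometheus client already supports.

```go
// Sketch only: a hypothetical instrumentation config and how a configured
// prefix could be applied via the Namespace field of the Prometheus client
// options. Field names here are assumptions, not an existing Tendermint API.
package instrumentation

import (
	prometheus "github.com/go-kit/kit/metrics/prometheus"
	stdprometheus "github.com/prometheus/client_golang/prometheus"
)

// Config could live in config.toml under an [instrumentation] section.
type Config struct {
	Prometheus       bool     // serve the /metrics endpoint
	Namespace        string   // e.g. "tm" or "gaiad"; prefixes every metric name
	DisabledFamilies []string // e.g. []string{"p2p", "mempool"}
}

// NewHeightGauge shows how the configured namespace flows into a metric: with
// Namespace set to "gaiad", the metric is exported as gaiad_consensus_height.
func NewHeightGauge(cfg Config) *prometheus.Gauge {
	return prometheus.NewGaugeFrom(stdprometheus.GaugeOpts{
		Namespace: cfg.Namespace,
		Subsystem: "consensus",
		Name:      "height",
		Help:      "Height of the chain.",
	}, []string{})
}
```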

related: https://github.com/tendermint/tendermint/issues/2272

All 32 comments

This is awesome aggregation work @jackzampolin! Wouldn't this need to be handled on the TM side though?

I think a lot of it needs to be handled there, but there is also some stuff in the SDK that needs to be handled. I think it's going to be critical to come up with a framework where we can tie metrics to modules and enable monitoring in some parts of the system, but not others. I thought we could use this issue as a place to define what that would look like and which metrics we want, then open separate issues on tendermint for implementation.

I see, I see. @xla pointed out some good insights in #1409. Coming up with an interface for modules to export some sort of metrics should be fairly straightforward, but we have to be very particular about what we want to expose and track.

I am highly interested in the p2p_peer data, because it will give us strong tools to analyze malicious attacks, including layer 7 attacks. I would like to know "p2p_peer_pending_receive(send)_cnt(bytes)", which indicates how many packets (bytes) were received from (sent to) each peer.
I also want to know the "REAL" public IP of each peer (not the IP from external_IP in config.toml), if gaiad has it from the raw TCP data, for analysis and defence purposes.

@jackzampolin agreed, all of the above would be very interesting!

Related to https://github.com/cosmos/cosmos-sdk/issues/2131, it would make good sense to include some IAVL stats as well as database stats (insert/update/delete counts plus whatever else can be gathered) to further diagnose that issue.

It would also be valuable for monitoring moving forward.

@jackzampolin I'd like to add gaia_proposals_total (or similar wording) to track the number of proposals.

This should make it trivial to set up monitoring that alerts the operator when a new proposal is added.
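
A rough sketch of how such a counter might be declared and used. Where it gets incremented (somewhere in the gov module's proposal-submission path) is an assumption, as is the alert expression in the trailing comment.

```go
// Sketch only: a hypothetical gaia_proposals_total counter. Where exactly it
// is incremented (the gov module's proposal-submission path) is an assumption.
package govmetrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var proposalsTotal = promauto.NewCounter(prometheus.CounterOpts{
	Name: "gaia_proposals_total",
	Help: "Total number of governance proposals submitted.",
})

// Hypothetical call site inside the proposal submission handler:
//   proposalsTotal.Inc()
//
// An operator could then alert on new proposals with a rule such as:
//   increase(gaia_proposals_total[5m]) > 0
```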

@mdyring can you mention the names of the IAVL stats we should have?

mempool_num_txs

that's the same as existing mempool_size?

mempool_recheck_times | histogram | none | number of transactions rechecked in the mempool

the name seems inconsistent with the description to me. Do you want a histogram of how long it took to recheck txs (e.g. 1s - 18 txs, 2s - 10 txs, ...) or the number of transactions rechecked (a gauge)?

p2p_num_txs

peers are not aware of the term "transactions". they operate on channels with raw bytes.

number of transactions submitted by each peer_id

Moreover, peers do not submit transactions (clients do); peers exchange transactions with each other via the mempool.

p2p_block_parts | gauge | peer_id | number of blockparts transmitted by peer

Same here. It should be either consensus_proposal_block_parts or blockstore_block_parts.

consensus_block_gas | gauge | none | amount of gas in the previous block (needs work)

as far as I know, it's not currently implemented.

@jackzampolin well, basically anything that could be used to pinpoint performance issues.

I am not that familiar with the IAVL codebase, but I think the metrics below might be useful (a rough sketch of how a few of them might be declared follows the table).

| Metric | type | tags | notes |
|-------|------|-----|-----|
| tm_avl_immutable_gets_total | gauge | none | Incremented for each immutable_tree get() |
| tm_avl_mutable_gets_total | gauge | none | Incremented for each mutable_tree get() |
| tm_avl_mutable_sets_total | gauge | none | Incremented for each mutable_tree set() |
| tm_avl_mutable_removes_total | gauge | none | Incremented for each mutable_tree remove() |
| tm_avl_mutable_rollbacks_total | gauge | none | Incremented for each mutable_tree rollback() |
| tm_avl_mutable_load_versions_total | gauge | none | Incremented for each mutable_tree load_version() |
| tm_avl_mutable_save_versions_total | gauge | none | Incremented for each mutable_tree save_version() |
| tm_avl_mutable_delete_versions_total | gauge | none | Incremented for each mutable_tree delete_version() |
| tm_avl_nodedb_saves_total | gauge | none | Incremented for each nodedb save() |
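
For illustration, a minimal sketch of how a few of these could be declared with the plain Prometheus client. They are written as counters here, which fits the incremented "_total" semantics better than the gauges listed in the table; wiring them into the iavl tree and nodedb methods is left as a hypothetical.

```go
// Sketch only: declaring a few of the proposed IAVL metrics with the plain
// Prometheus client. They are counters here (matching the incremented
// "_total" semantics); wiring them into the iavl tree/nodedb methods is
// hypothetical and not shown.
package iavlmetrics

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	mutableSets = promauto.NewCounter(prometheus.CounterOpts{
		Name: "tm_avl_mutable_sets_total",
		Help: "Incremented for each mutable_tree set().",
	})
	mutableSaveVersions = promauto.NewCounter(prometheus.CounterOpts{
		Name: "tm_avl_mutable_save_versions_total",
		Help: "Incremented for each mutable_tree save_version().",
	})
	nodedbSaves = promauto.NewCounter(prometheus.CounterOpts{
		Name: "tm_avl_nodedb_saves_total",
		Help: "Incremented for each nodedb save().",
	})
)

// Hypothetical usage from inside the corresponding tree/nodedb methods:
//   mutableSets.Inc()
```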

@jackzampolin Great work, thanks for putting this together. I'd propose to break the issue apart into three distinct pieces:

  • Cosmos metrics - anything which can be tracked from the SDK codebase
  • Tendermint metrics - extension and rework of the prom metrics
  • Low-level libs - likely raw performance and resource utilisation

It will make the discussion easier, and in this context we should also think about how we can expose the monitoring facilities of tm in a way that is meaningful to the SDK.

As @melekes pointed out, most of the metrics concerning blocks and transactions belong in the realm of consensus.

mempool_num_txs

that's the same as existing mempool_size?

It seems both are inaccurate; we should encode the nature of the transactions and probably have metrics for the stages a transaction can go through, so the operator has insight into how many are pending and how many have been processed overall.

@mslipper and team will be handling this work. I think the next steps here are to open issues in tm, sdk, and iavl that detail the metrics for each, and to begin working on the implementation per our call today.

Proposal for how we can handle this:

Currently, we have a MetricsProvider function that returns Metrics structs for the cs, p2p, and mempl packages. Since we're aiming to provide additional metrics for more packages than just these, I'd like to propose making the following changes to Tendermint's Metrics subsystem:

The Metrics structs as they are defined right now return the raw go-kit Counter, Gauge, and Histogram interfaces. Since Go doesn't support union types yet, I suggest we wrap the supported metrics for a given package in an interface with the following signature:

```go
type Metrics interface {
    Counters() []Counter
    Gauges() []Gauge
    Histograms() []Histogram
    Family() string
}
```

Family() is used to identify the class of metrics exposed by a given Metrics implementation, so that metric families can be programmatically enabled or disabled.

We'll need to refactor MetricsProvider to return a slice of Metrics interfaces rather than the three package-specific structs it does now.

To support command-line flags that enable or disable particular metrics, we'll allow an additional string slice argument to NewNode that defines the list of metric categories to enable. By default, all metrics will be enabled. On startup, the node will iterate through the slice of Metrics objects provided by MetricsProvider and disable any that are not in the list of enabled metric families.
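
A condensed sketch of what this could look like. Only the Family() method matters for the filtering step, so the proposed interface is narrowed to that here, and the package and helper names are illustrative.

```go
// Sketch only: the filtering step described above. Only Family() matters for
// filtering, so the proposed interface is narrowed to that here; the package
// and helper names are illustrative.
package node

// Metrics is the per-package metrics interface proposed above (narrowed).
type Metrics interface {
	Family() string
}

// MetricsProvider returns one Metrics value per instrumented package
// (consensus, p2p, mempool, ...) instead of three concrete structs.
type MetricsProvider func() []Metrics

// filterMetrics keeps only the families the operator enabled. An empty list
// means "enable everything", matching the proposed default.
func filterMetrics(all []Metrics, enabled []string) []Metrics {
	if len(enabled) == 0 {
		return all
	}
	allow := make(map[string]bool, len(enabled))
	for _, fam := range enabled {
		allow[fam] = true
	}
	var kept []Metrics
	for _, m := range all {
		if allow[m.Family()] {
			kept = append(kept, m)
		}
	}
	return kept
}
```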

This looks great @mslipper, nice and extensible as well as being a small change from the current implementation. 👍 from me

Since we're aiming to provide additional metrics for more packages than just these, I'd like to propose making the following changes to Tendermint's Metrics subsystem

Most probably, but here I don't see any metrics outside of the existing consensus, mempool, and p2p packages.

What problem exactly are you trying to solve?

@melekes What about the IAVL and SDK module-specific metrics (gov was mentioned) listed above? This earlier comment also speaks to this:

I think its going to be critical to come up with a framework where we can tie metrics to modules and enable monitoring in some parts of the system, but not others. I thought we could use this issue as a place to define what that would look like, and which metrics we want, then open separate issues on tendermint for implementation.

Add ability to deactivate metric families from the config file

Prometheus metrics are cheap (they are just counters in RAM, which are periodically read by an outside process). I am pretty sure you can configure the Prometheus server to not scrape some metrics. I just want to make sure you understand that it might be easier to expose everything and read only what the user needs, versus a framework where we enable monitoring in some parts of the system but not others.

@melekes If you don't think that should be a requirement, how would it change the implementation? I think having each module export metrics makes metrics a first-class citizen in our ecosystem and gives devs an easy way to add them when building. Do you have any issues with this approach?

Let's go ahead and move forward with this proposal!

I think having each module export metrics makes metrics a first-class citizen in our ecosystem and gives devs an easy way to add them when building.

This sounds amazing! I am not opposed to adding more metrics.

The thing I am struggling to understand is why each module needs to export:

```go
type Metrics interface {
    Counters() []Counter
    Gauges() []Gauge
    Histograms() []Histogram
    Family() string
}
```

if you don't think that should be a requirement, how would it change the implementation?

this requires no changes in Tendermint.

@melekes That way each module can export metrics of any type. Some of those functions will likely return empty slices in practice. The family name is for specifying the metric namespace.
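
For example, a module that only exposes a single gauge might implement the interface like the sketch below. The govMetrics type and its gauge are hypothetical, and it assumes the Counter/Gauge/Histogram types in the proposed interface are the go-kit metrics interfaces that Tendermint's Metrics structs already use; the unused kinds simply return nil, which behaves like an empty slice.

```go
// Sketch only: a hypothetical module exporting a single gauge through the
// proposed interface. Assumes the interface's Counter/Gauge/Histogram types
// are the go-kit metrics interfaces. The unused kinds return nil, which
// behaves like an empty slice.
package govmetrics

import "github.com/go-kit/kit/metrics"

type govMetrics struct {
	activeProposals metrics.Gauge // e.g. an sdk_gov_active_proposals gauge
}

func (m govMetrics) Counters() []metrics.Counter     { return nil }
func (m govMetrics) Gauges() []metrics.Gauge         { return []metrics.Gauge{m.activeProposals} }
func (m govMetrics) Histograms() []metrics.Histogram { return nil }
func (m govMetrics) Family() string                  { return "gov" }
```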

OK, maybe I don't understand something. Please go ahead with the proposal.

That way each module can export metrics of any type

you don't need to export metrics (as an interface or something else) in order to export them to Prometheus!!!

I share @melekes' concerns; the proposed solution seems not to address the original problem, which should be challenged. Can we expand on why we need such granularity in metric configuration? This would only be reasonable if certain metrics impose performance hits. As long as this is not the case, we should only support toggling of metrics and the metrics provider (Prometheus for now).

@melekes You do need to register the metrics with prometheus however. If the application has multiple modules and each module has some metrics it wants to expose to operators/clients then having an interface to export those metrics seems sensible.

@xla can you expand on proposed solution seems to not address the original problem?

And that is correct: as long as there is no performance hit, we would only need to support configuration for toggling metrics and the provider.

You do need to register the metrics with prometheus however.

True, but you don't need to export any functions. All you need to do is call the registration (you can do it in the constructor within the package). The fact that Tendermint exposes a Metrics struct and PrometheusMetrics() / NopMetrics() functions makes sense for Tendermint because some of our users use Tendermint as a library and they specifically requested the ability to change the default MetricsProvider.

  1. https://github.com/tendermint/tendermint/blob/af1a405768daf8c1e8a99fca3f331b93baa20851/mempool/metrics.go#L21
  2. https://github.com/go-kit/kit/blob/master/metrics/prometheus/prometheus.go#L58
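
For reference, the shape of that pattern looks roughly like the following sketch. The package and gauge are illustrative; registration with the default Prometheus registry happens as a side effect of go-kit's *From constructors (see the second link), so nothing beyond the constructor needs to be exported.

```go
// Sketch only: the mempool/metrics.go pattern linked above, with an
// illustrative package and gauge. go-kit's *From constructors register the
// collector with the default Prometheus registry as a side effect, so nothing
// beyond the constructor needs to be exported.
package mymodule

import (
	"github.com/go-kit/kit/metrics"
	prometheus "github.com/go-kit/kit/metrics/prometheus"
	stdprometheus "github.com/prometheus/client_golang/prometheus"
)

type Metrics struct {
	Size metrics.Gauge
}

func PrometheusMetrics() *Metrics {
	return &Metrics{
		Size: prometheus.NewGaugeFrom(stdprometheus.GaugeOpts{
			Subsystem: "mymodule",
			Name:      "size",
			Help:      "Number of items currently tracked by mymodule.",
		}, []string{}),
	}
}
```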

Taking discussion to Slack.

@xla can you expand on proposed solution seems to not address the original problem?

The original problem is that we want to expose a variety of new metrics, which boils down to setting a gauge, adding to a counter, or observing a histogram in the right place in the code. Fine-grained control down to the metric level introduces complexity and overhead we should shy away from, as it opens up a quite complex configuration matrix. In the most naive approach we can always collect those metrics and have the configuration only influence whether the Prometheus endpoint is enabled or not.

Addition of metrics should be confined to package boundaries and should not necessitate any changes to how we set up metrics. The one change we should drive is support for a common prefix (a namespace, in Prometheus terms), as mentioned in tendermint/tendermint#2272. Speaking from the Tendermint perspective only here.

@melekes So, as far as I understand, there is a global DefaultRegisterer. To add your Cosmos-based app metrics, all you need to do is register them using the default Prometheus functions. Is that correct?
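
For reference, a minimal sketch of that approach (the metric is illustrative): promauto registers with prometheus.DefaultRegisterer, so the counter is picked up by any handler serving the default gatherer.

```go
// Sketch only: registering an application metric with the global default
// registerer via the standard Prometheus client. The metric is illustrative.
package app

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// promauto.NewCounter registers the counter with prometheus.DefaultRegisterer,
// so it is picked up by any HTTP handler that serves the default gatherer.
var txCounter = promauto.NewCounter(prometheus.CounterOpts{
	Namespace: "myapp",
	Name:      "transactions_total",
	Help:      "Total number of transactions processed by the app.",
})

func recordTx() { txCounter.Inc() }
```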

Going to go ahead and close this. @hleb-albau please open another issue for documenting the process of adding more metrics.

This issue conflates Tendermint and SDK metrics. I'm going to close it in anticipation of @tnachen opening another issue on Prometheus metrics in SDK-based apps.
