Logstash: POJO-ification of Metrics

Created on 24 Jul 2017 · 5 Comments · Source: elastic/logstash

(Jake)
Hijacking the original comment with a breakdown of progress:

Out of scope for this issue:

  • Move the HTTP server from Ruby to Java
  • Move OS/Java stats data retrieval to Java (discussion: https://github.com/elastic/logstash/issues/7978 and https://github.com/elastic/logstash/issues/8048)

(Andrew)
This issue sets out to clearly define our goals for the refactor of Logstash metrics. It takes over from #7274.

Our current metrics implementation provides more than we need and, to boot, has some performance problems like #7772. Additionally, the design decision to use Map objects to store the hierarchy of metrics has caused performance issues and prevented us from leaning on the JVM type checker to verify that metric accesses and stores are correct.

After numerous discussions with @jakelandis, @ph, and @jsvd, we've decided to move toward a simpler POJO-based architecture, where we will use the new Gauge and Counter types directly.

So, instead of Ruby code like:

```ruby
class Foo
  def initialize(external_metric_thing)
    @metric = external_metric_thing.namespace(:foo)
  end

  def receivedEvent
    @metric.increment(:events_received, 1)
  end
end
```
We will write something more like:

```ruby
class Foo
  def initialize
    @metrics = FooMetrics.new
  end

  def receivedEvent
    @metrics.receivedEvent
  end
end
```

backed by a Java metrics class:

```java
class FooMetrics {
  private final Counter receivedEvents;

  public FooMetrics() {
    this.receivedEvents = new Counter();
  }

  public void receivedEvent() {
    this.receivedEvents.increment();
  }
}
```

These Java metrics classes will become more complicated in the future: they might increment multiple metrics based on a single lifecycle event, and even increment metrics in linked objects.
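As a sketch of where this could go, a single lifecycle event can update several metrics at once. The `Counter` below (backed by `LongAdder`) and the second `bytesReceived` metric are illustrative assumptions, not the final Logstash API:

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch only: a minimal thread-safe Counter and a POJO metrics
// class whose lifecycle method bumps more than one metric.
class Counter {
    private final LongAdder value = new LongAdder();
    void increment() { value.add(1); }
    void increment(long n) { value.add(n); }
    long get() { return value.sum(); }
}

class FooMetrics {
    final Counter eventsReceived = new Counter();
    final Counter bytesReceived = new Counter(); // hypothetical second metric

    // One lifecycle event, multiple metrics updated.
    void receivedEvent(long bytes) {
        eventsReceived.increment();
        bytesReceived.increment(bytes);
    }
}
```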

design

Most helpful comment

I tried out a couple of designs and landed on implementing a fluent API to represent the metrics in Java. This is similar in spirit to what @andrewvc proposes above, but with more detail about the specific implementation and the migration strategy below.

Fluent APIs favor readability and allow developers to express domain knowledge using a meaningful vocabulary. To avoid re-architecting the entire metrics subsystem, the first iteration of this new fluent API will closely mirror the existing metrics namespace design.

The existing namespace design is an array of Ruby symbols that lead up to a key/value pair (the metric itself):

[:symbolA]->[:symbolB]->[:symbolC]->[:symbolD]->[k,v]

for example:

metric.namespace([:stats, :pipelines, pipeline_id.to_s.to_sym, :plugins, plugin_id.to_sym]).increment(:in, 2)

The fluent API will mirror the namespaces, but uses a strict, type-safe model. The fluent API is composed of objects called Witnesses, since they are watching/observing/witnessing what is happening. There are multiple types of witnesses: for example, a PipelineWitness, a PluginWitness, a ReloadWitness, etc. Each type of Witness composes other Witnesses to form a chain of witnesses. Each witness is responsible for storing its associated state and serializing itself. Each chain of witnesses will have a common root witness (e.g. ObjectA) implemented as a singleton. For example, there will be a StatsWitness, a JvmWitness, and possibly more in the future.

The chain of Witnesses will look like:

(ObjectA) hasA (ObjectB) hasA (ObjectC) hasA (ObjectD) hasA [k,v]

for example:

witness.pipeline(pipeline.id).plugin(plugin.id).in(2)

The fluent API is stricter in its data model than the namespace design it replaces. Since the fluent API is based on typed Java objects, each object MUST compose other Witness types, and each Witness MUST expose the appropriate key/value pair to be set. The Ruby namespace API allows ANY symbol with ANY key/value pairing. While that is highly flexible, it can make consuming the output rather challenging, since only some of the schema is known. Further, lacking type safety, it is quite easy to mistype a long, complex namespace.

For example, in Ruby the following is valid (assume I wanted A->B->C->D):

[:symbolA]->[:symbolB]->[:symbolD]->[:symbolC]->[k,v]

Due to the type safety of the Java fluent API, this mistake is NOT possible, since ObjectB does not compose ObjectD. Further, thanks to the hasA relationships, IDE auto-completion works beautifully, allowing developers to simply auto-complete the Witness chain.
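A minimal sketch of how that composition enforces ordering (class and method names here are assumptions, not the actual Logstash code): the root witness is a singleton, and each witness only exposes the witnesses it composes, so a mis-ordered chain simply does not compile:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative chain: RootWitness -> PipelineWitness -> PluginWitness -> [k,v].
// A call like root.plugin(...) or pipeline.in(...) is a compile error, because
// those methods do not exist at those points in the chain.
class PluginWitness {
    private long in;
    void in(long count) { in += count; } // the [k,v] at the end of the chain
    long in() { return in; }
}

class PipelineWitness {
    private final Map<String, PluginWitness> plugins = new HashMap<>();
    PluginWitness plugin(String id) {
        return plugins.computeIfAbsent(id, k -> new PluginWitness());
    }
}

class RootWitness {
    private static final RootWitness INSTANCE = new RootWitness(); // singleton root
    private final Map<String, PipelineWitness> pipelines = new HashMap<>();
    private RootWitness() {}
    static RootWitness instance() { return INSTANCE; }
    PipelineWitness pipeline(String id) {
        return pipelines.computeIfAbsent(id, k -> new PipelineWitness());
    }
}
```

With these shapes, `RootWitness.instance().pipeline("main").plugin("test_filter").in(2)` mirrors the Ruby namespace call shown earlier.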

The Ruby implementation uses a NamespacedMetric to assist with passing around appropriately scoped objects. For example:

```ruby
pipeline_scoped_metric = metric.namespace([:stats, :pipelines, pipeline_id.to_s.to_sym, :plugins])
foo(pipeline_scoped_metric)
```

The Java fluent implementation can do the same (but without the Ruby wrapper):

```
pipeline_scoped_metric = witness.pipeline(pipeline.id).plugins
foo(pipeline_scoped_metric)
```
This can be useful when creating an API for plugin developers, such that plugins will only get the witness associated with their plugin.

The plugin API for the first pass will remain unchanged. I will need to visit ALL the plugins and possibly create new namespace -> witness mappings, or create custom witnesses for the plugins.

In addition to the well-defined metrics that will be composed of Witness chains, we should support custom witnesses for the advanced plugin developer. For example, if an http output plugin wants to count the number of http 500 errors, there should be a hook point to support that. I have an early POC for how to achieve this, but am still working through the details. This will likely require considerable documentation that should be part of a larger plugin development guide.

Since the initial pass at the fluent API closely mirrors the namespace API, we can fairly easily translate one to the other. A namespace --> witness adapter will be provided to allow minimal changes to the existing code, and confidence can be gained by minimizing the changes to the code base and passing the existing rspec tests.
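One way such an adapter could look (a hedged sketch: the witness stubs, class names, and single mapped path below are assumptions for illustration only):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal witness stubs for this sketch.
class PluginWitness {
    private long in;
    void in(long n) { in += n; }
    long in() { return in; }
}

class PipelineWitness {
    private final Map<String, PluginWitness> plugins = new HashMap<>();
    PluginWitness plugin(String id) {
        return plugins.computeIfAbsent(id, k -> new PluginWitness());
    }
}

// Translates old-style namespace paths into typed witness calls, so existing
// call sites (and rspec tests) need minimal changes.
class NamespaceAdapter {
    private final Map<String, PipelineWitness> pipelines = new HashMap<>();

    // Mirrors metric.namespace([...]).increment(:in, n) for the plugin path.
    void increment(List<String> namespace, String key, long amount) {
        if (namespace.size() == 5
                && "stats".equals(namespace.get(0))
                && "pipelines".equals(namespace.get(1))
                && "plugins".equals(namespace.get(3))
                && "in".equals(key)) {
            pipelines.computeIfAbsent(namespace.get(2), k -> new PipelineWitness())
                     .plugin(namespace.get(4)).in(amount);
        } else {
            throw new IllegalArgumentException("no witness mapping for " + namespace + " / " + key);
        }
    }

    PluginWitness plugin(String pipelineId, String pluginId) {
        return pipelines.get(pipelineId).plugin(pluginId);
    }
}
```

Dispatching on strings for every metric write is also a hint at why an adapter layer like this can be costly at runtime.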

Depending on how much we want to continue to invest in the core Ruby code, we can take a second pass and completely remove the namespace API in favor of the witness API, and enhance the witness API to speak in richer domain terms.

Sinatra will continue to serve as the HTTP engine, and the Witness objects know how to serialize themselves, so the changes there should be a drop-in replacement. I will add a /v2 path while building this out, so we can test against /v2, and remove it before release (the API itself is passive, so there is no need to version it).
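For instance, a witness serializing itself might look like this (hand-rolled JSON purely for illustration; a real implementation would presumably use a JSON library, and the field names are assumptions):

```java
// Illustrative sketch: a ReloadWitness that stores its own state and knows how
// to serialize itself, so the HTTP layer can simply ask each witness for JSON.
class ReloadWitness {
    private long successes;
    private long failures;

    void success() { successes++; }
    void failure() { failures++; }

    String asJson() {
        return "{\"reloads\":{\"successes\":" + successes
                + ",\"failures\":" + failures + "}}";
    }
}
```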

_As a side note, I have an implementation of Jetty/Jersey in my working branch as a replacement for Sinatra... but that is a different conversation and will come later._

Preview of changes:

I am only slightly past a POC, so any early feedback is appreciated. I will start to clean up the changes and break them out into review-sized pieces.

All 5 comments


I love this proposal @jakelandis !

I would propose, however, that we use a different pattern for plugin stats than for core, even for advanced plugin developers. The main reason is that for core stats, we know they're there; we don't need them to be programmatically traversable. We can just hard-code invocations of the witness objects into the API as we move forward.

For plugins I propose that we simply provide a single Map<String, AbstractMetric> and let plugin developers use it as they like. This allows consumers of the metrics API (like our own xpack extension) to programmatically create charts for them. Additionally, having no notion of hierarchy also simplifies UX concerns in any charting solution we develop. I don't see plugins hitting a level of complexity where the absence of witnesses will be a problem, and I see a constrained solution here as enabling more ease of use for consumers of that API.

Some plugin developers may feel that they are missing the ability to create hierarchies, which we do provide today via namespaces, but I think a good tradeoff is just adopting the convention of using dots, e.g. 'my.metric.name'.
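A sketch of that constrained plugin-facing shape (class and method names are illustrative assumptions, including the minimal `AbstractMetric`), with dots in the key standing in for hierarchy:

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: plugins get a flat Map<String, AbstractMetric> keyed by
// dotted names, so consumers can traverse all plugin metrics programmatically.
abstract class AbstractMetric {
    abstract Object getValue();
}

class CounterMetric extends AbstractMetric {
    private long count;
    void increment() { count++; }
    @Override
    Object getValue() { return count; }
}

class PluginMetrics {
    private final Map<String, AbstractMetric> metrics = new ConcurrentHashMap<>();

    // e.g. counter("http.responses.500") — dots by convention, no real hierarchy.
    CounterMetric counter(String name) {
        return (CounterMetric) metrics.computeIfAbsent(name, k -> new CounterMetric());
    }

    Map<String, AbstractMetric> all() { return Collections.unmodifiableMap(metrics); }
}
```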

I found an edge case not currently covered by the new design, and wanted to get opinions on the correct behavior.

Found via the LogStash::Agent spec in metrics_spec.rb: "when we try to start one pipeline and it succeeds / when we successfully reload a pipeline / it clears previous metrics".

Basically, this test starts with this configuration:

"input { generator {} } filter { mutate { id => 'test_filter' add_tag => 'hello world' }} output { null {} }"

Then reloads with this config:

"input { generator { id => 'new' } } output { null {} }"

Note the filter was removed.

The prior behavior would remove all of the filter's metrics on pipeline restart, as though the filter never existed. Attempts to set or read this value would be met with an error, since it is now gone.

The new implementation will not error when reading (by design), but will continue to return the filter metric as part of the output until a restart. Nothing should be writing to the filter metric since it is now gone, so thinking of the filter metric in terms of a graph, the line would flat-line instead of disappearing.

EDIT: updated to not ask a leading question.

The question here should be: what is the desired behavior?

EDIT2:

This same issue exists for entire pipelines too, e.g. if I remove a pipeline or a plugin, should the associated metrics disappear or flat-line?

Spoke with @andrewvc on slack, and confirmed to leave the behavior as-is. _(It shouldn't be too hard in the new model)_.

The guiding principle here is that metrics should be a view into the current state; if that current state changes, like the removal of a plugin or pipeline, the metrics should reflect that.

Update: The namespace adapter ⚠️

Earlier I referenced using an adapter pattern to keep the two implementations (namespace and witness) in sync with each other. I was able to get the rspec unit tests to pass with very little change to the code (just a place to hook in the adapter). The functionality worked fine... but when benchmarking, the adapter itself proved to be quite slow, regressing the performance improvements made recently.

So we decided to kill the adapter and simply replace the namespace API with the Witness API. See https://github.com/elastic/logstash/issues/7971 for additional information.

