Beats: [Metricbeat] ReporterV2 migration meta-issue

Created on 15 Feb 2019  ยท  10Comments  ยท  Source: elastic/beats

Using grep I managed to find this modules/metricset using Fetch() (common.Mapstr, error), Fetch () ([]common.Mapstr, error), Run(reporter mb.PushReporter) or Run(reporter mb.PushReporterV2)

Metricbeat Integrations meta technical debt

Most helpful comment

Guide to migration

  • [ ] Check state of current Metricset: This is to avoid potential environment errors. Let's say that we are testing ETCD so let's run its tests first:
(python-env) metricbeat โžœ make python-env
(python-env) metricbeat โžœ source build/python-env/bin/activate
(python-env) metricbeat โžœ MODULE=etcd make test-module
# Navigate to the Metricset being migrated and
(python-env) metricbeat โžœ cd module/etcd/store
(python-env) metricbeat โžœ go test ./...
(python-env) metricbeat โžœ go test -tags=integration -data ./...

If everything is green, continue to the next step:

  • [ ] Add new logger variable for logging. In most modules we use an old logp implementation for logging and must be updated. For example for leader Metricset in ETCD module just place this at the top of the leader.go file:

var logger = logp.NewLogger("etcd.leader")

And then use logger.Debug or logger.Error instead of logp.
After merging https://github.com/elastic/beats/pull/11106 now we have base.Logger() to use in the same way than just logger global variable.

  • [ ] Replace func (m *MetricSet) Fetch() (common.MapStr, error) { with func (m *MetricSet) Fetch(reporter mb.ReporterV2) { UPDATE: func (m *MetricSet) Fetch(reporter mb.ReporterV2) error {. Most of the times that function is in a file with the same name than the Metricset (file leader.go in ETCD leader Metricset, for example). At the same time, some Fetch methods didn't have comments so the default one is used (pasting here for quick reference):
    For example, a Fetch method like this:
func (m *MetricSet) Fetch() (common.MapStr, error) {
    content, err := m.http.FetchContent()
    if err != nil {
        return nil, err
    }
    return eventMapping(content), nil
}

Will end up looking like this:

// Fetch methods implements the data gathering and data conversion to the right
// format. It publishes the event which is then forwarded to the output. In case
// of an error set the Error field of mb.Event or simply call report.Error().
func (m *MetricSet) Fetch(reporter mb.ReporterV2) error {
    content, err := m.http.FetchContent()
    if err != nil {
        return err
    }

    reporter.Event(mb.Event{
        MetricSetFields: eventMapping(content),
    })
}

We can see that 3-liner of _logger-report-return_ that we are trying to avoid but if all modules have the same 3-liner, maybe we can make something to change them all with a script and start using https://github.com/elastic/beats/pull/10727

  • [ ] Replace TestFetch code in integration tests: Often we will find this code (taken from PostgreSQL bgwriter) that declares an event variable to assert against it:
func TestFetch(t *testing.T) {
    compose.EnsureUp(t, "postgresql")

    f := mbtest.NewEventFetcher(t, getConfig())
    event, err := f.Fetch()
    if !assert.NoError(t, err) {
        t.FailNow()
    }

    t.Logf("%s/%s event: %+v", f.Module().Name(), f.Name(), event)

    assert.Contains(t, event, "checkpoints")
    assert.Contains(t, event, "buffers")
    assert.Contains(t, event, "stats_reset")

        //continue...

So look for the line f := mbtest.NewEventFetcher(t, getConfig()) and from there, replace with:

    f := mbtest.NewReportingMetricSetV2Error(t, getConfig())
    events, errs := mbtest.WriteEventsReporterV2Error(f)
    if len(errs) > 0 {
        t.Fatalf("Expected 0 error, had %d. %v\n", len(errs), errs)
    }
    assert.NotEmpty(t, events)
    event := events[0].MetricSetFields

    // test continue untouched...

getConfig() might be also called getConfig(string) or getConfigWithData() or any other variant of it. Just check on the same file because you'll find that function somewhere there.

  • [ ] Replace TestData function: TestData can be almost fully replaced with a copy-paste. ~Just change the name of the service that compose is checking in compose.EnsureUp~ and check getConfig() call like in the previous point: (UPDATE: It's better to not use Compose here because Fetch function is the only one that should start a container in CI, this function is only to run in local)
func TestData(t *testing.T) {
    f := mbtest.NewReportingMetricSetV2Error(t, getConfig())

    if err := mbtest.WriteEventsReporterV2Error(f, t, ""); err != nil {
        t.Fatal("write", err)
    }
}
  • [ ] Check for uses in Unit Tests and replace them too: I have also found that some unit tests files use (or files without the _integration name on it) might have some tests using the reporter V1 interface. For example, this was on the store_test.go of the store Metricset of ETCD module:
func TestFetchEventContent(t *testing.T) {
    absPath, err := filepath.Abs("../_meta/test/")

    response, err := ioutil.ReadFile(absPath + "/storestats.json")
    server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(200)
        w.Header().Set("Content-Type", "application/json;")
        w.Write([]byte(response))
    }))
    defer server.Close()

    config := map[string]interface{}{
        "module":     "etcd",
        "metricsets": []string{"store"},
        "hosts":      []string{server.URL},
    }
    f := mbtest.NewEventFetcher(t, config)
    event, err := f.Fetch()
    if err != nil {
        t.Fatal(err)
    }

    t.Logf("%s/%s event: %+v", f.Module().Name(), f.Name(), event)
}

So the approach is the same than TestFetch function, replace everything from f := mbtest.NewEventFetcher(t, config) with this:

    f := mbtest.NewReportingMetricSetV2Error(t, config)
    events, errs := mbtest.ReportingFetchV2Error(f)
    if len(errs) > 0 {
        t.Fatalf("Expected 0 error, had %d. %v\n", len(errs), errs)
    }
    assert.NotEmpty(t, events)

    t.Logf("%s/%s event: %+v", f.Module().Name(), f.Name(), events[0])
}

Here you don't need to extract MetricsetFields from the event[0]

  • [ ] BONUS POINTS! Run tests and regenerate data.json file: Quick ref:
# Replace etcd with the module you are testing
(python-env) metricbeat โžœ MODULE=etcd make test-module
# Navigate to the Metricset being migrated and
(python-env) metricbeat โžœ cd module/etcd/store
(python-env) metricbeat โžœ go test ./...
(python-env) metricbeat โžœ go test -tags=integration -data ./...

Everything is OK if

  • No errors happened during the test
  • data.json file isn't changed or its changes only reflects changes in the Root fields. For example beat field should change to agent and full objects like event or service should appear https://github.com/elastic/beats/pull/10817/files#diff-5a9cef679c0694a2ace62a1ffc7159ad.
{
    "@timestamp": "2017-10-12T08:05:34.853Z",
    "agent": {
        "hostname": "host.example.com",
        "name": "host.example.com"
    },
    "etcd": {
        "leader": {
            "followers": {},
            "leader": "8e9e05c52164694d"
        }
    },
    "event": {
        "dataset": "etcd.leader",
        "duration": 115000,
        "module": "etcd"
    },
    "metricset": {
        "name": "leader"
    },
    "service": {
        "address": "127.0.0.1:2379",
        "type": "etcd"
    }
}

So, at the end, just 2 to 4 `files are uploaded in the PR, like here https://github.com/elastic/beats/pull/10817

All 10 comments

Do we need to regenerate the data.json file just to be sure that the migration didn't break anything?

I'm regenerating data.json files but they aren't changing at all, which is a good sign, BTW :slightly_smiling_face:

Correction, some of them are changing https://github.com/elastic/beats/pull/10817/files#diff-5a9cef679c0694a2ace62a1ffc7159ad

Guide to migration

  • [ ] Check state of current Metricset: This is to avoid potential environment errors. Let's say that we are testing ETCD so let's run its tests first:
(python-env) metricbeat โžœ make python-env
(python-env) metricbeat โžœ source build/python-env/bin/activate
(python-env) metricbeat โžœ MODULE=etcd make test-module
# Navigate to the Metricset being migrated and
(python-env) metricbeat โžœ cd module/etcd/store
(python-env) metricbeat โžœ go test ./...
(python-env) metricbeat โžœ go test -tags=integration -data ./...

If everything is green, continue to the next step:

  • [ ] Add new logger variable for logging. In most modules we use an old logp implementation for logging and must be updated. For example for leader Metricset in ETCD module just place this at the top of the leader.go file:

var logger = logp.NewLogger("etcd.leader")

And then use logger.Debug or logger.Error instead of logp.
After merging https://github.com/elastic/beats/pull/11106 now we have base.Logger() to use in the same way than just logger global variable.

  • [ ] Replace func (m *MetricSet) Fetch() (common.MapStr, error) { with func (m *MetricSet) Fetch(reporter mb.ReporterV2) { UPDATE: func (m *MetricSet) Fetch(reporter mb.ReporterV2) error {. Most of the times that function is in a file with the same name than the Metricset (file leader.go in ETCD leader Metricset, for example). At the same time, some Fetch methods didn't have comments so the default one is used (pasting here for quick reference):
    For example, a Fetch method like this:
func (m *MetricSet) Fetch() (common.MapStr, error) {
    content, err := m.http.FetchContent()
    if err != nil {
        return nil, err
    }
    return eventMapping(content), nil
}

Will end up looking like this:

// Fetch methods implements the data gathering and data conversion to the right
// format. It publishes the event which is then forwarded to the output. In case
// of an error set the Error field of mb.Event or simply call report.Error().
func (m *MetricSet) Fetch(reporter mb.ReporterV2) error {
    content, err := m.http.FetchContent()
    if err != nil {
        return err
    }

    reporter.Event(mb.Event{
        MetricSetFields: eventMapping(content),
    })
}

We can see that 3-liner of _logger-report-return_ that we are trying to avoid but if all modules have the same 3-liner, maybe we can make something to change them all with a script and start using https://github.com/elastic/beats/pull/10727

  • [ ] Replace TestFetch code in integration tests: Often we will find this code (taken from PostgreSQL bgwriter) that declares an event variable to assert against it:
func TestFetch(t *testing.T) {
    compose.EnsureUp(t, "postgresql")

    f := mbtest.NewEventFetcher(t, getConfig())
    event, err := f.Fetch()
    if !assert.NoError(t, err) {
        t.FailNow()
    }

    t.Logf("%s/%s event: %+v", f.Module().Name(), f.Name(), event)

    assert.Contains(t, event, "checkpoints")
    assert.Contains(t, event, "buffers")
    assert.Contains(t, event, "stats_reset")

        //continue...

So look for the line f := mbtest.NewEventFetcher(t, getConfig()) and from there, replace with:

    f := mbtest.NewReportingMetricSetV2Error(t, getConfig())
    events, errs := mbtest.WriteEventsReporterV2Error(f)
    if len(errs) > 0 {
        t.Fatalf("Expected 0 error, had %d. %v\n", len(errs), errs)
    }
    assert.NotEmpty(t, events)
    event := events[0].MetricSetFields

    // test continue untouched...

getConfig() might be also called getConfig(string) or getConfigWithData() or any other variant of it. Just check on the same file because you'll find that function somewhere there.

  • [ ] Replace TestData function: TestData can be almost fully replaced with a copy-paste. ~Just change the name of the service that compose is checking in compose.EnsureUp~ and check getConfig() call like in the previous point: (UPDATE: It's better to not use Compose here because Fetch function is the only one that should start a container in CI, this function is only to run in local)
func TestData(t *testing.T) {
    f := mbtest.NewReportingMetricSetV2Error(t, getConfig())

    if err := mbtest.WriteEventsReporterV2Error(f, t, ""); err != nil {
        t.Fatal("write", err)
    }
}
  • [ ] Check for uses in Unit Tests and replace them too: I have also found that some unit tests files use (or files without the _integration name on it) might have some tests using the reporter V1 interface. For example, this was on the store_test.go of the store Metricset of ETCD module:
func TestFetchEventContent(t *testing.T) {
    absPath, err := filepath.Abs("../_meta/test/")

    response, err := ioutil.ReadFile(absPath + "/storestats.json")
    server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(200)
        w.Header().Set("Content-Type", "application/json;")
        w.Write([]byte(response))
    }))
    defer server.Close()

    config := map[string]interface{}{
        "module":     "etcd",
        "metricsets": []string{"store"},
        "hosts":      []string{server.URL},
    }
    f := mbtest.NewEventFetcher(t, config)
    event, err := f.Fetch()
    if err != nil {
        t.Fatal(err)
    }

    t.Logf("%s/%s event: %+v", f.Module().Name(), f.Name(), event)
}

So the approach is the same than TestFetch function, replace everything from f := mbtest.NewEventFetcher(t, config) with this:

    f := mbtest.NewReportingMetricSetV2Error(t, config)
    events, errs := mbtest.ReportingFetchV2Error(f)
    if len(errs) > 0 {
        t.Fatalf("Expected 0 error, had %d. %v\n", len(errs), errs)
    }
    assert.NotEmpty(t, events)

    t.Logf("%s/%s event: %+v", f.Module().Name(), f.Name(), events[0])
}

Here you don't need to extract MetricsetFields from the event[0]

  • [ ] BONUS POINTS! Run tests and regenerate data.json file: Quick ref:
# Replace etcd with the module you are testing
(python-env) metricbeat โžœ MODULE=etcd make test-module
# Navigate to the Metricset being migrated and
(python-env) metricbeat โžœ cd module/etcd/store
(python-env) metricbeat โžœ go test ./...
(python-env) metricbeat โžœ go test -tags=integration -data ./...

Everything is OK if

  • No errors happened during the test
  • data.json file isn't changed or its changes only reflects changes in the Root fields. For example beat field should change to agent and full objects like event or service should appear https://github.com/elastic/beats/pull/10817/files#diff-5a9cef679c0694a2ace62a1ffc7159ad.
{
    "@timestamp": "2017-10-12T08:05:34.853Z",
    "agent": {
        "hostname": "host.example.com",
        "name": "host.example.com"
    },
    "etcd": {
        "leader": {
            "followers": {},
            "leader": "8e9e05c52164694d"
        }
    },
    "event": {
        "dataset": "etcd.leader",
        "duration": 115000,
        "module": "etcd"
    },
    "metricset": {
        "name": "leader"
    },
    "service": {
        "address": "127.0.0.1:2379",
        "type": "etcd"
    }
}

So, at the end, just 2 to 4 `files are uploaded in the PR, like here https://github.com/elastic/beats/pull/10817

I'm regenerating data.json files but they aren't changing at all, which is a good sign, BTW ๐Ÿ™‚

Correction, some of them are changing https://github.com/elastic/beats/pull/10817/files#diff-5a9cef679c0694a2ace62a1ffc7159ad

Yes! Some of the changes are expected. Like adding the event part and etc.

Sorry, clicked the wrong button...

I'm wondering if the migration must be "backported" to any branch like 7.0 or 7.x apart from being merged into master. @andrewkroh maybe you can help us in this, please?

Examples and tutorials have been updated after merging https://github.com/elastic/beats/pull/11106

TestData function have been updated in the tutorial too

Thanks to everybody for helping with this migration. Good work!

Was this page helpful?
0 / 5 - 0 ratings