Graylog2-server: Disk journal and other node metrics do not load (3.2.2) with nginx as a reverse proxy

Created on 21 Feb 2020 · 13 Comments · Source: Graylog2/graylog2-server

Expected Behavior


There should not be an error when viewing the page for a node.

Current Behavior

On two systems using Nginx as a reverse proxy (one using a subpath and the other using a subdomain), the node page shows a red box with an error 400: "Getting journal information on node ** failed."

Additionally there is no sign of activity in the Buffers area.

Messages are still being processed.

The behaviour is the same whether using a subpath or not. I do not know if this is a problem when not using a reverse proxy.

Steps to Reproduce (for bugs)


  1. Install Graylog behind a reverse proxy.
  2. Try to go to a node's page.

Your Environment

  • Graylog Version: Graylog 3.2.2 (Docker tag graylog:3.2 on one system, Debian package 3.2.2-1 on another)
  • Elasticsearch Version: docker tag elasticsearch-oss:6.8.2
  • MongoDB Version: 3.6.17 (docker tag mongo:3)
  • Operating System: Debian unstable
  • Browser version: Firefox 73 & Chrome 80.0.3987.116

All 13 comments

Maybe this could be helpful: #7265 (comment)

Thanks, I'll try. If it works I'll update the ticket while feeling like a jackass for wasting others' time.

These configurations I'm using didn't have this problem with the previous version I was running (perhaps it was 3.2.1)

I tested with my graylog installation which is behind nginx with a subpath.

I set http_external_uri=https://hostname.tld/gl/, which is where Graylog is accessible from.
I tried setting http_publish_uri=http://127.0.0.1:9000. This did not work. I then set it to match the value I set http_external_uri to. This also did not work. I saw the same error.

I commented http_external_uri and http_publish_uri and tried going to http://127.0.0.1:9000. The same behaviour persisted.

I'm seeing this as well since upgrading to 3.2.2. We are not using nginx in our case, but we are proxying through an external load balancer service (Citrix NetScaler).

I think the proxy might be a red herring? The net console in browser shows it tried to hit the correct proxied endpoint but the actual error response is:
{"type":"ApiError","message":"Unable to map property native_lib_dir.\nKnown properties include: flush_interval, plugin_dir, segment_age, max_age, bin_dir, data_dir, max_size, directory, flush_age, segment_size"}

There are several changes in this version related to Netty's NativeLibraryUtil, so I'm guessing that the regression might be caused by one of those changes based on the message.

It seems that the native lib directory was changed in 3.2.2 : https://github.com/Graylog2/graylog2-server/pull/7404

Doing a diff on the conf in the new version, it seems a new section was added:

# Set the bin directory here (relative or absolute)
# This directory contains binaries that are used by the Graylog server.
# Default: bin
bin_dir = /usr/share/graylog-server/bin

# Set the data directory here (relative or absolute)
# This directory is used to store Graylog server state.
# Default: data
data_dir = /var/lib/graylog-server

The native lib dir is based on the data_dir conf being set and is failing with an unhelpful message as a result.

Adding this back into the conf and restarting graylog creates the /var/lib/graylog-server/libnative path but the error about key mapping persists. If I add native_lib_dir = /var/lib/graylog-server/libnative to the conf and restart, it still doesn't fix the problem.

There are defaults for bin_dir and data_dir, so setting these properties shouldn't be required. The problem seems to be that the native_lib_dir property derived from data_dir is not registered in the context of the journal API call.
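To illustrate the failure mode described above, here is a minimal Python sketch of a strict property mapper. It is not Graylog's code (the server is written in Java); the function names `map_strict` and `map_lenient` are hypothetical, and the known-property list is copied from the error message quoted earlier in this thread. It only shows why one extra key such as native_lib_dir can make a whole response fail to map.

```python
# Property names copied from the "Known properties include" list in the
# error message quoted in this thread. Everything else here is hypothetical.
KNOWN_PROPERTIES = frozenset({
    "flush_interval", "plugin_dir", "segment_age", "max_age", "bin_dir",
    "data_dir", "max_size", "directory", "flush_age", "segment_size",
})


class UnmappedPropertyError(ValueError):
    """Raised when a payload carries a property the target type lacks."""


def map_strict(payload, known=KNOWN_PROPERTIES):
    """Reject any key that is not registered, like a strict deserializer."""
    for key in payload:
        if key not in known:
            raise UnmappedPropertyError(
                f"Unable to map property {key}.\n"
                f"Known properties include: {', '.join(sorted(known))}"
            )
    return dict(payload)


def map_lenient(payload, known=KNOWN_PROPERTIES):
    """Drop unknown keys instead of failing (one possible alternative)."""
    return {key: value for key, value in payload.items() if key in known}
```

With a payload that contains both data_dir and native_lib_dir, `map_strict` raises while `map_lenient` keeps only data_dir. Whether the eventual fix ignores the property or registers it is not something this sketch assumes.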

Interestingly, if I go to Metrics, all of the journal endpoints appear to work fine.

@s1shed @vwbusguy @mpfz0r I was able to reproduce the problem and am working on a fix.

As noted by @vwbusguy, this issue was introduced in #7404 (and #7359). It only happens on the node details page because that page uses a cluster-wide API resource (/api/cluster/<node-id>/journal), which fails to deserialize the journal response from the Graylog nodes. The nodes overview page uses a different API resource (/api/system/journal) and is not affected.
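For anyone who wants to compare the two resources directly, bypassing the UI and any proxy, here is a rough Python sketch. The two paths are the ones named above; the helper names, the use of HTTP basic auth, and the base URL and credentials you pass in are assumptions for illustration and need adjusting to your setup.

```python
import base64
import urllib.error
import urllib.request


def journal_urls(base_url, node_id):
    """Build the two journal URLs mentioned in this thread."""
    base = base_url.rstrip("/")
    return {
        "cluster": f"{base}/api/cluster/{node_id}/journal",
        "system": f"{base}/api/system/journal",
    }


def build_request(url, username, password):
    """Prepare a GET request with basic auth (assumed auth scheme)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def fetch_status(request):
    """Return the HTTP status code, including error codes such as 400."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `fetch_status(build_request(url, user, password))` for each entry of `journal_urls("http://127.0.0.1:9000", node_id)` should show whether only the cluster resource returns an error on an affected node.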

This doesn't cause any processing issues; it's mainly a display issue in the UI.

We are working on a fix that will be in the next stable release (3.2.3).

Thank you for the report!

@s1shed @vwbusguy This is the fix: https://github.com/Graylog2/graylog2-server/pull/7526

I think the proxy might be a red herring?

Oops. I've run into a few problems that have been reverse proxy related. Sorry (to all) about the invalid assumption.

No worries! :smiley: Thank you for the report!
