Vector: Reapproach communication with `journald`

Created on 2 Jan 2020 · 10 comments · Source: timberio/vector

It turns out that our approach of loading libsystemd.so with dlopen, which avoids a hard dependency on libsystemd.so, does not work with statically linked musl, because dlopen is not implemented in the static version of musl (yet). See the following threads for details: 1, 2, 3.

So in this issue I want to discuss possible approaches to working around this.

I came up with the following approaches:

  1. Use system journalctl executable to tail the data from journald (by running journalctl -f with corresponding flags using subprocess crate).

    | Pros | Cons |
    | --- | --- |
    | Keeps dependency on libc and libsystemd opt-in | journalctl might not be present on a system where libsystemd.so is |
    | Errors inside libsystemd would not crash Vector with a segmentation fault | |

  2. Instead of using journalctl, build our own companion binary, dynamically linked to GNU libc and libsystemd, and call this companion binary using subprocess in place of journalctl from the previous approach.

    | Pros | Cons |
    | --- | --- |
    | Doesn't rely on presence of journalctl executable | Vector's binary cannot simply be taken from the distribution and placed on a remote machine; the entire distribution would have to be copied |
    | Could potentially be more flexible (not sure whether such flexibility is needed) | The companion needs to be dynamically linked to specific versions of the libraries |
    | Keeps dependency on libc and libsystemd opt-in | |

  3. Switch back to *-unknown-linux-gnu as the default target.

    | Pros | Cons |
    | --- | --- |
    | Doesn't require any changes in the journald source code | Vector's binary again gets a hard dependency on the presence of GNU libc on the system |

  4. Find a way to modify musl to support dlopen from static binaries.

    | Pros | Cons |
    | --- | --- |
    | The resulting binary would work on any machine having just libsystemd.so with a single binary and no dependencies on journalctl | The order of magnitude of the required effort is unclear, but it is definitely not trivial. Furthermore, if done, it might require recompiling Rust to use the modified version of musl |
    | Doesn't require any changes in the journald source code | |

Among those, #1 is probably the cleanest/easiest, but I'm not sure that journalctl lets us pull all the data we need.

@bruceg What do you think?

All 10 comments

IIRC the point of using dlopen for journald was to be able to support systems that either do or don't run journald with a single binary. With other features, we either require the library to be present or use a Rust native solution. As such, an option would be to have a journald feature gate, which would allow for statically linking libsystemd into the vector binary on systems that don't have a dynamic linker.

Another option would be to write up a Rust native journald file format library. This would have the added benefit of increasing options for testing captive data. However, the journald file format is complex and is undocumented except for the systemd code, so writing a library duplicating that support could be challenging.

I'm not a big fan of the subprocess solution, but I agree it's probably the best of the first four solutions, and the first is more reasonable than the second (systems that require logging to journald are likely to have journalctl installed). Given what I've seen of the threads regarding supporting dlopen in musl, I think adding support for that is not feasible for us to pursue.

Will there be a negative impact on the precision or reliability of things like checkpointing if we move to a subprocess-based design?

I could be mistaken, but my understanding was that there are advantages to using the journald API directly over something like tailing a file or reading from a pipe.

Thanks for digging into this @a-rodin. I'd say 3 is off the table, the benefits that musl brings outweigh the pros you outlined. And 4 sounds very complicated and brittle. If we aren't completely happy with 1 or 2, I can reach out to people in our network that might have expertise in this area to get another opinion.

@bruceg

> As such, an option would be to have a journald feature gate, which would allow for statically linking libsystemd into the vector binary on systems that don't have a dynamic linker

There are licensing nuances with linking libsystemd statically. Because libsystemd is licensed under LGPL2.1+ and Vector's source code has Apache2 license, the license of the resulting binary would be LGPL3+ (as Apache2 is compatible only with (L)GPL3, but not (L)GPL2.x). So if we decide to proceed this way, we need to ensure that we are fine with these consequences.

> Another option would be to write up a Rust native journald file format library. This would have the added benefit of increasing options for testing captive data. However, the journald file format is complex and is undocumented except for the systemd code, so writing a library duplicating that support could be challenging.

The data format seems to be documented, although the documentation is claimed to be best effort: https://www.freedesktop.org/wiki/Software/systemd/journal-files/. We only need to read it, not write it, which simplifies things somewhat.

Note that the format documentation not being official means that the format might change in the future in an incompatible way. However, if that happens, a statically linked version of libsystemd would not be able to read it either without recompiling Vector.
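
To give a feel for what a read-only Rust implementation would start with, here is a minimal sketch that checks the start of a journal file header. It assumes the layout as I understand it from the format document linked above: an 8-byte `LPKSHHRH` signature, followed by little-endian 32-bit compatible and incompatible flag words and a one-byte state. The names `HeaderPrefix` and `parse_header_prefix` are made up for illustration, and the field layout should be verified against the systemd sources before relying on it.

```rust
/// Signature expected at the start of a journal file (assumed from the
/// journal file format document).
const SIGNATURE: &[u8; 8] = b"LPKSHHRH";

/// The first few header fields; everything after `state` is ignored here.
#[derive(Debug, PartialEq)]
struct HeaderPrefix {
    compatible_flags: u32,
    incompatible_flags: u32,
    state: u8,
}

/// Parses the first 17 bytes of a journal file.
/// Returns `None` if the input is too short or the signature does not match.
fn parse_header_prefix(bytes: &[u8]) -> Option<HeaderPrefix> {
    if bytes.len() < 17 || !bytes.starts_with(SIGNATURE) {
        return None;
    }
    Some(HeaderPrefix {
        // Both flag words are stored as little-endian 32-bit integers.
        compatible_flags: u32::from_le_bytes([bytes[8], bytes[9], bytes[10], bytes[11]]),
        incompatible_flags: u32::from_le_bytes([bytes[12], bytes[13], bytes[14], bytes[15]]),
        state: bytes[16],
    })
}
```

A real reader would presumably refuse any file whose incompatible flags contain bits it does not understand, which is one place where future format changes would surface.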

@lukesteensen

> Will there be a negative impact on the precision or reliability of things like checkpointing if we move to a subprocess-based design?

If some checkpointing or related logic cannot be achieved with approach 1, it could be worked around in approach 2, as all necessary logic related to reliability could be placed in the companion binary and consistency can be ensured by the communication protocol. I'm not sure how much complexity this would add, though, compared to parsing the journal data format.

So, to reiterate, it seems to me like now there are two leading approaches:

  • The one mentioned by @bruceg, which is to implement reading of the journal data format in Rust.

    | Pros | Cons |
    | --- | --- |
    | No reliance on presence of journalctl executable | Might take time to implement |
    | Could be really reliable if implemented properly | Would require us to keep it up to date with the journal file format if it changes over time |
    | Changes would be isolated to lib/journald | |
    | Might give more flexibility in adding new features in future | |

  • Run something like journalctl -f -o json --all or journalctl -f -o export as a subprocess.

    | Pros | Cons |
    | --- | --- |
    | Could be robust to potential data format changes | Requires journalctl executable to be present |
    | Might be easier to implement | Potentially can be less performant because of the need to do serialization and deserialization |
    | | Users running (h)top might be surprised to find a constantly running journalctl process |
    | | Extra care and testing are needed to ensure that no events are lost if, for example, the Vector process gets killed after running out of memory |

The journalctl tool can print out the cursor with each record, which would give us the equivalent of what we get from the library, so I don't think reliability or accuracy of checkpointing is a concern.
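
To make the cursor point concrete, here is a rough sketch (not Vector's actual implementation) of reading `journalctl -o export` output and pulling the `__CURSOR` field out of each entry. It assumes the export format as I understand it: text fields are `NAME=value` lines, binary fields are a `NAME` line followed by a 64-bit little-endian length, the data, and a newline, and each entry ends with a blank line. The names `Entry`, `read_entry`, and `cursor` are made up for illustration.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead};

/// One journal entry: field name -> raw value bytes.
/// If a field name repeats within an entry, only the last value is kept.
type Entry = HashMap<String, Vec<u8>>;

/// Reads the next entry from an export-format stream.
/// Returns `Ok(None)` at end of stream; an entry that is cut off before its
/// terminating blank line is dropped rather than returned half-read.
fn read_entry<R: BufRead>(input: &mut R) -> io::Result<Option<Entry>> {
    let mut entry = Entry::new();
    let mut line = Vec::new();
    loop {
        line.clear();
        if input.read_until(b'\n', &mut line)? == 0 {
            return Ok(None);
        }
        if line.last() == Some(&b'\n') {
            line.pop();
        }
        if line.is_empty() {
            // A blank line terminates the entry; skip stray blank lines.
            if entry.is_empty() {
                continue;
            }
            return Ok(Some(entry));
        }
        match line.iter().position(|&b| b == b'=') {
            // Text field: NAME=value
            Some(eq) => {
                let name = String::from_utf8_lossy(&line[..eq]).into_owned();
                entry.insert(name, line[eq + 1..].to_vec());
            }
            // Binary field: NAME, then 64-bit LE length, data, newline.
            // A production reader should cap the length before allocating.
            None => {
                let name = String::from_utf8_lossy(&line).into_owned();
                let mut len = [0u8; 8];
                input.read_exact(&mut len)?;
                let mut data = vec![0u8; u64::from_le_bytes(len) as usize];
                input.read_exact(&mut data)?;
                let mut newline = [0u8; 1];
                input.read_exact(&mut newline)?;
                entry.insert(name, data);
            }
        }
    }
}

/// Returns the cursor of an entry, if it carries one.
fn cursor(entry: &Entry) -> Option<String> {
    entry
        .get("__CURSOR")
        .map(|value| String::from_utf8_lossy(value).into_owned())
}
```

The last cursor handed downstream could be persisted as the checkpoint and passed back through `journalctl --after-cursor=...` on restart.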

Just to document: a discussion that happened among the team members led to a conclusion that the most pragmatic approach would be to use the journalctl binary as a subprocess.
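
For reference, a minimal sketch of what the subprocess approach could look like using only the standard library (the issue mentions the subprocess crate; `std::process` is swapped in here to keep the example self-contained). The function names are hypothetical, and the exact flag set Vector ends up using may differ; the flags shown (`--follow`, `--all`, `--output=json`, `--after-cursor`) are the ones discussed above plus the cursor option.

```rust
use std::io::{self, BufRead, BufReader, Read};
use std::process::{Child, Command, Stdio};

/// Builds the journalctl arguments for following the journal as JSON,
/// optionally resuming after a previously saved cursor.
fn journalctl_args(after_cursor: Option<&str>) -> Vec<String> {
    let mut args = vec![
        "--follow".to_string(),
        "--all".to_string(),
        "--output=json".to_string(),
    ];
    if let Some(cursor) = after_cursor {
        args.push(format!("--after-cursor={}", cursor));
    }
    args
}

/// Spawns the given journalctl executable with its stdout piped to us.
fn spawn_journalctl(program: &str, after_cursor: Option<&str>) -> io::Result<Child> {
    Command::new(program)
        .args(journalctl_args(after_cursor))
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .spawn()
}

/// Calls `handle` for every line read from `output` (one JSON object per
/// line with `--output=json`). Stops with an error on invalid UTF-8.
fn for_each_line<R: Read>(output: R, mut handle: impl FnMut(&str)) -> io::Result<()> {
    for line in BufReader::new(output).lines() {
        handle(&line?);
    }
    Ok(())
}
```

In use, one would call `spawn_journalctl("journalctl", saved_cursor)`, take the child's `stdout`, and pass it to `for_each_line`. Keeping the line handling generic over `Read` also makes it testable without a running journald.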

I am seeing no way to emulate the local_only option we previously had in our configuration. Should I remove the option, or is this now a blocker to the transition?

I don't fully understand what that option is 😄. Can you elaborate? The option description isn't super helpful either:

> Include only entries from the local system

  1. Is this a problem for typical usage?
  2. Could local filtering be accomplished with service filters, etc?

Sure. The journald system can apparently collect logs from multiple sources. Within each record is an indication of what system generated the record. The systemd library provides an option to limit the records to only those generated on the local system, and we default to turning that option on.

I don't think the loss of this option is a problem for typical usage, no. Most installations of journald will only be handling logs from the local system, so it effectively becomes a no-op either way.

The filtering could definitely be accomplished with a filter. Each record includes _HOSTNAME and _MACHINE_ID fields that could be used to include or exclude specific systems. It's not clear to me what turning on the option in the library actually does. It's possible it too is just selecting on the hostname matching the system hostname, but that seems fragile to name changes.
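
As a sketch of the filter idea, matching on `_MACHINE_ID` rather than `_HOSTNAME` would avoid the fragility around hostname changes. The example below assumes records have already been decoded into string maps and that the local machine ID can be read from `/etc/machine-id`; `Record`, `local_machine_id`, and `is_local` are hypothetical names, and this is not necessarily what the library option does internally.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;

/// A decoded journal record: field name -> value.
type Record = HashMap<String, String>;

/// Reads the local machine ID (assumes the usual /etc/machine-id location).
fn local_machine_id() -> io::Result<String> {
    Ok(fs::read_to_string("/etc/machine-id")?.trim().to_string())
}

/// Returns true if the record was generated on the machine with the given ID.
/// Records without a `_MACHINE_ID` field are treated as non-local.
fn is_local(record: &Record, machine_id: &str) -> bool {
    record
        .get("_MACHINE_ID")
        .map_or(false, |id| id.as_str() == machine_id)
}
```

The machine ID would be read once at startup with `local_machine_id()` and then passed to `is_local` for each record.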

This is resolved by #1526
