nixosTests.ldap fails to run - the login attempts fail to connect to the ldap server, exponentially backoff, and timeout eventually.
After some investigation, it seems openldap.service on the server fails to start (We don't wait for openldap.service it in the test, so it's not clearly visible):
invalid config directory /var/db/slapd, error 2
slaptest: bad configuration directory!
It could have been broken by 382b0aa52dfd843779bdae73a86438807ea1a2dc.
I opened the openldap module and test, and saw a lot of scary stuff w.r.t. how the configuration is handled there. On top of that, the whole preStart is entirely overwritten by nixosTests.ldap - making me ask myself if what's intended to be done there should probably be moved into the module…
I'd also like to do some refactoring there to use RuntimeDirectory= and StateDirectory= to create these directories - but I need to understand the usecases, and maybe see some real-world setups.
Pinging some possible users: @Mic92, @fpletz @dasJ @tfc.
Okay, we don't actually use the openldap module in our infrastructure, we have our own slapd module. But since I know some minor details about slapd, let me explain why I didn't bother using the upstream module.
First off, slapd supports multiple databases. While this seems like a minor detail, it's important for the OLC. slapd identifies the databases by their suffix (like dc=example,dc=com). If the suffix is the hardcoded value of cn=config, this database becomes the OLC (Online configuration). This database allows configuration changes without any downtime and deprecates the configuration file (iirc).
The OLC situation is the main difference, as we have a slapd.service and a slapd-olc.service which uses the standard LDAP tools and writes to the OLC without slapd going down.
The OLC is created with a custom prestart:
set -e
set -u
configDir='/var/lib/${cfg.dataDirName}/{0}config'
if ! [ -d "$configDir/cn=config" ]; then
# We use PrivateTmp anyway
configFile=/tmp/config
touch "$configFile"
chmod 600 "$configFile"
if ! [ -r "${cfg.olcAdminPwFile}" ]; then
echo "${cfg.olcAdminPwFile} is missing"
exit 1
fi
pw="$(cat "${cfg.olcAdminPwFile}" | tr -d '[:space:]')"
(
echo "database config"
echo "rootdn \"${cfg.olcAdminDN}\""
echo "rootpw \"$pw\""
) > "$configFile"
${pkgs.openldap.out}/bin/slaptest -F "$configDir" -f "$configFile"
rm -f "$configFile"
fi
As you can see from this example, we store our databases in /var/lib (hence we are able to use StateDirectory and store it in a structure supporting multiple databases. Also, the preStart contains no slaptest for configuration tests (just for OLC creation) because syntax checks are actually performed by slapd when the OLC data is written using ldapmodify.
If you attempt to write an invalid ACL, for example, slapd fails with an implementation-specific error code.
For systemd option (apart from my regular confinement options), these may be of interest:
{
ProtectSystem = "strict"; # No /var/db
PrivateTmp = true; # For olc creation
StateDirectory = map (db: "${cfg.dataDirName}/${db}") (remove "{-1}frontend" (attrNames (filterAttrs (k: v: v.needsDirectory) allDatabases))); # Ensure databases
RuntimeDirectory = "slapd"; # For the ldapi socket
User = "openldap"; # Do not start as root at all!
Group = "openldap";
CapabilityBoundingSet = "CAP_NET_BIND_SERVICE";
AmbientCapabilities = "CAP_NET_BIND_SERVICE";
}
Hope this helps. While I doubt I will ever have the motivation to upstream my module, I can certainly give you the needed information and code snippets to create an upstream-worthy version of it ;)
I need to pass on this one. I don't maintain an LDAP server by myself, and if both the test is broken and the module itself needs some serious hacks to get it to barely work, we might think about just dropping the module alltogether, and having the very few users using it implement it the way they deem fit for their usecase.
It's better than shipping a broken test and a module no-one knows how to get to work.
PR removing the broken test and adding the deprecation warning: #87004
Also cc @ju1m.
I use the slapd module with just the following options: enable, urlList, defaultSchemas = true, database = "bdb", suffix, rootdn, rootpw, dataDir, a few indices in extraDatabaseConfig and all the contents in declarativeContents.
While this definitely is not an advanced use case of it, it works without a hitch.
So, IMO, removing this module would be a net loss.
Well, #87004 is only about removing the test that already failed for months, and adding a deprecation notice to the module.
Without tests, we're flying blind and don't notice breakages, so I consider it appropriate to warn users about this fact, assuming they use it in production.
I can't judge on how advanced the NixOS tests are, but If you use the module and want it to keep working, having a simple, passing vm test that doesn't reach deep into module internals should be the best way forward here.
Let's move the discussion to #87004.
With nixosTests.ldap removed (by #87004), we can close this.