Right now nixpkgs only has postgresql
, which contains much more than libpq
.
By offering a libpq
package like other distros, the closure size of many packages could be reduced significantly.
Even more so because postgresql
currently depends on systemd
. And that also means that no application or library using libpq
can be built against musl
, as systemd
relies on glibc-only features and thus does not build with musl.
Tasks:
postgresql
-> systemd
dependency configurablelibpq
package containing only libpqlibpq
instead of all of postgresqlCC @matthewbauer @peti for Haskell.
CC @FRidh for Python, would Python benefit from this?
I thought the postgresql.lib output would do this for us?
@matthewbauer yes for runtime closure size, no for systemd
FWIW, regarding closure size, I considered splitting this up before when I did significant Postgres work last year but ultimately went against it because multi-output .lib
is supported for all Postgres expressions -- which does reduce the size a bit and worked well enough for dependent expressions.
That said something containing "only" libpq.so
(for example, via an override to create a new expression/derivation) would be much smaller, certainly, but it's not clear that's the best use of work or time -- right now .lib
for postgresql.lib
(9.6) is barely 6mb and that's mostly due to it containing most of the default server extensions that are shipped with the build system (libpq.so
is barely 200KB). Moving these out would probably reduce this quite a lot -- for example, they could just be part of .bin
instead.
As for dependencies, well, they're pretty slim too:
a@link> nix path-info -rS ./result-lib
/nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2 29565976
/nix/store/5p8qcz6i13ymgf80kimhx2g9pb5r5nia-bash-4.4-p23 29463504
/nix/store/lhngda6cn951z68v5rlh4dni6bb5sjw1-zlib-1.2.11 28279616
/nix/store/s5sgkvhy2wdgn7fmsvq47srxjhfp9ld9-libxml2-2.9.9 29985080
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib 41456208
/nix/store/xqs95fqkjb1kd102yjv5h5q57gcsafb3-glibc-2.27 28147008
/nix/store/yglqxz2as4ygai26d2hbfxqpf0shdh4l-openssl-1.0.2r 31880056
But why do we need them?
a@link> nix path-info -r ./result-lib | grep -v postgresql | while read l; do nix why-depends ./result-lib "$l"; done
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/uuid-ossp.so: …LIBC_2.2.5.GLIBC_2.4./nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2/lib:/nix/sto…
=> /nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/uuid-ossp.so: …LIBC_2.2.5.GLIBC_2.4./nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2/lib:/nix/sto…
=> /nix/store/57i8nqp2vljg5h61gvk4p1ar9vzhxayj-libossp-uuid-1.6.2
╚═══bin/uuid-config: …#!/nix/store/5p8qcz6i13ymgf80kimhx2g9pb5r5nia-bash-4.4-p23/bin/sh.##.## OSSP…
=> /nix/store/5p8qcz6i13ymgf80kimhx2g9pb5r5nia-bash-4.4-p23
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/pgcrypto.so: …LIBC_2.4.GLIBC_2.2.5./nix/store/lhngda6cn951z68v5rlh4dni6bb5sjw1-zlib-1.2.11/lib:/nix/store/yglq…
=> /nix/store/lhngda6cn951z68v5rlh4dni6bb5sjw1-zlib-1.2.11
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/pgxml.so: …2.2.5.LIBXML2_2.4.30./nix/store/s5sgkvhy2wdgn7fmsvq47srxjhfp9ld9-libxml2-2.9.9/lib:/nix/store/xq…
=> /nix/store/s5sgkvhy2wdgn7fmsvq47srxjhfp9ld9-libxml2-2.9.9
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/_int.so: …LIBC_2.4.GLIBC_2.2.5./nix/store/xqs95fqkjb1kd102yjv5h5q57gcsafb3-glibc-2.27/lib.XXXXXXXXXXXXXXXX…
=> /nix/store/xqs95fqkjb1kd102yjv5h5q57gcsafb3-glibc-2.27
/nix/store/w7d4sfgxjpalphg3h9bink47yzhf4f96-postgresql-9.6.12-lib
╚═══lib/libpq.so.5.9: …LIBC_2.3.4.GLIBC_2.3./nix/store/yglqxz2as4ygai26d2hbfxqpf0shdh4l-openssl-1.0.2r/lib:/nix/store/x…
=> /nix/store/yglqxz2as4ygai26d2hbfxqpf0shdh4l-openssl-1.0.2r
bash
, libuuid
, zlib
, and libxml
are needed for auxiliary extensions. So we could actually achieve significant reductions here by just moving the server extensions out of the .lib
output, which would yield nothing except glibc
and openssl
, I believe -- but these dependencies should be considered "practically free" since you always need them for anything at all. As a result you're looking at a realistic closure size of < 1mb with that change.
Therefore, I think a much better use of time would be to just slim down postgresql.lib
and call it a day. And if that doesn't fix the itch, we can always alias libpq = postgresql.lib
somewhere in the top-level and say "we have libpq
"
@nh2 Does overriding systemd
argument to null
resolve the first point?
@nh2 Does overriding
systemd
argument tonull
resolve the first point?
I don't know, but I'd rather make this an explicit flag as done in #61581 than using null
.
@thoughtpolice Thanks for your analysis, that is good to know.
I think a much better use of time would be to just slim down postgresql.lib and call it a day.
That is not enough for musl support though because with the current setup where we build all of postgresql with --with-systemd
, and then throw away everything that's not lib
, it it fails to compile already before we can throw away anything.
So in my proposal, a libpq
package would be built with enableSystemd = false
.
It seems a good idea to me to give that "postgres without systemd build dependency" thing a name, and I found libpq
the easiest solution because it solves both problems (buldability and client apps not depending on any postgresql server bits) in one go.
I don't follow the logic here. You want a systemd-less version of postgresql
so you can use Musl for your stdenv and built packages. libpq
actually does not need systemd, only the .bin
outputs do, but they do not work with Musl. So if this was turned off or libpq was recompiled separately, you could avoid it. Therefore, libpq
would override enableSystemd = false
at all times and everything would link against it, but:
postgresql
for the separate derivation, which is annoying and tiresomelibpq
in order to take any effect. Because everything currently implicitly relies on .lib
to provide libpq
, you must change it -- you can't simply override the postgresql.lib
output to point to a completely different expression -- unless you want to rebuild every version of postgresql twice just so every postgresqlXX.lib
output can be a completely different, recompiled piece of code.libpq
that is consistent with the version of postgresql
they depend on. Something smells very fishy about having something like buildInputs = [ libpq postgresql_10 ]
where libpq = postgresql96.lib
or whatever. I'm sure libpq
has also had several API/feature changes between versions, probably making this mandatory.libpq
derivation get you, exactly? You might as well just override postgresql
to enableSystemd = false
and be done with it in this same case.In other words: why should upstream glibc
users, Hydra, Borg, etc always compile a second postgresql derivation with this feature disabled, and a substantial amount of changes to all postgres
-dependent extensions? When the musl use case requires you recompiling it anyway.
Put another way -- what use cases does this particular scheme make easier/possible that aren't currently possible, and why is this the way it must be done?
Keep in mind that third parties outside use postgresql.lib
as a buildInput
as well, so this effectively creates a schism between upstream and any third party uses (now you have to override them awkwardly if you want upstream/a third party to behave the same, etc). Unless you just completely remove postgresql.lib
and break downstreams, I guess. (It would actually be worse than that, I think -- and would result in confusing build-time errors, because mkDerivation
doesn't consider the lack of a .lib
to be a problem, it just assumes the files must be in .out
instead, AKA it's not multi-output -- but those files won't be there!)
I'm not convinced this is a good plan yet. IMO it would be strictly better to simply disable systemd support unilaterally if stdenv.hostPlatforms.isMusl == true
and simply put that in the default generic
expression, and also clean up .lib
as I explained earlier. Then:
.lib
closure, no matter whatpostgres.lib
dependent expressions depend on a new expression, or involve excessive recompilations. Everyone keeps using postgresql.lib
I may be misunderstanding something, but this seems somewhat fishy as an approach, to me. Also, the whole phrasing of "introduce libpq so I can use musl" seems like a classic X-is-Y problem... The actual problem is not that libpq
must be separate -- it's that postgresql
currently has dependent inputs that are incompatible for you. That's a much narrower problem to solve with a fix that is substantially easier, and much less invasive.
IMO it would be strictly better to simply disable systemd support unilaterally if
stdenv.hostPlatforms.isMusl == true
and simply put that in the defaultgeneric
expression, and also clean up.lib
as I explained earlier.
@thoughtpolice What you propose sounds like a good plan to me.
One question though:
Are there situations where some programs may want to use only the postgresql server _libraries_?
E.g. should there be postgresql.lib
(being essentially libpq
, as you suggest), and e.g. postgresql.server-lib
(for programs that use server libs but don't need the binaries)? Or is that an uncommon use case that can be handled later, if somebody needs it in the future?
I am not sure about that in the case of first-party things shipped with PostgreSQL. I do not think there's any reasonable way for someone to interface with/link against those meaningfully. They can only be used by CREATE EXTENSION
in the server process.
For certain third party extensions, I know they do ship libraries and headers that can be consumed by other extensions. But that case is handled fine: because those expressions will act normally, have .lib
and .dev
outputs, and be part of your buildInputs
if you need them.
I don't think there are any realistic use cases where someone would want all the server libs, but not the binaries in .bin
. And if you did -- you'd probably want the include
files anyway, but there's another catch here which is that they are inside .bin
too, not in a .dev
output. So having a .server-libs
output is a net loss: you still need .bin
anyway for the header files.
So, if we cleaned up things as I mentioned earlier -- by putting server extensions in .bin
-- and someone did want them and did want to use them in some theoretical scenario, we can just say "use the .bin
output in your buildInputs
and it will work". No need for a separate .server-libs
.
But I think that's a particular, rare-or-impossible enough bridge that we can cross that one when (if) we get to it.
I don't think this is a good idea. systemd.lib is only 1.8M and seems like a reasonable thing to have a runtime dependency on:
$ du -sch $(nix-store -qR /nix/store/558zw2lrdiqwgy0i5vq779j700iph882-postgresql-9.6.12-lib)
152K /nix/store/r4bbg2bjmsvyachy2mn97xri6bcib4lj-zlib-1.2.11
160K /nix/store/kq8jb3g4xzfbc3d7m818ri3293j2nsy3-libossp-uuid-1.6.2
1.4M /nix/store/2ch3pwfv40ls6qhdmzwsz9wky9pq6cyl-bash-4.4-p23
1.7M /nix/store/hsh8afbalh1blsj2s4n8g2dvaj2j5aai-libxml2-2.9.9
3.6M /nix/store/2wj5dw9sslnqxkznyzp8jhic8ywybs72-openssl-1.0.2r
6.2M /nix/store/558zw2lrdiqwgy0i5vq779j700iph882-postgresql-9.6.12-lib
30M /nix/store/siks2gcfwx6qwh27m7c5r5lixcr621bd-glibc-2.27
43M total
Splitting packages into multiple parts makes things much more confusing to reason about. Just because other distros do it doesn't mean it's a good idea. Multiple outputs are there to avoid this mess. We should use package splitting as a last result!
On the other hand, disabling systemd support when you are building with Musl seems reasonable. But it would be much better to just fix musl support in systemd.
@matthewbauer FWIW, see #61581 for my proposed changes to @nh2's proposed changes -- which effectively do what you ask (systemd is always available, unless stdenv.hostPlatform.isMusl == true || stdenv.isDarwin
).
I also do not think systemd upstream has any interest in building with musl; some googling leads to various patches which seem to be rather invasive, and I think it would be better to avoid cluttering our own systemd patchset with a ~dozen more. It's possible to get a subset of systemd's library code working with musl (a lot of the most useful stuff like socket activation helper code, sd_log etc are all in just a single header file), but anything that truly needs libudev or libsystemd is going to be tricky.
I can think of another good reason of splitting postgresql
and postgresql_lib
. Currently we have postgres = postgresql96
, and we can't bump it because it will bump server version for those, who use postgresql
as server. Well... we can bump it, but that will be breaking change for many (last time I've checked there were tens of direct pg server usages through postgresql
)...
If we could have postgresql_lib
as separate top-level attribute, we could bump it's major version every year, as libpq
has backwards compat. Is there any need for libpq9.6 when we already have libpq11?
I wrote "splitting" but in fact I just want a separate top-level attribute for libpq. Which should be libpq = postgresql_latest.lib;
. So I'm all for multiple outputs, just libpq
deserves a separate reference.
Thank you for your contributions.
This has been automatically marked as stale because it has had no activity for 180 days.
If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.
Here are suggestions that might help resolve this more quickly:
still important to me
Most helpful comment
still important to me