Currently, when a foreign-library with the name "foo" is specified, it is built and installed as libfoo.so.
It would be nice if we could additionally specify the version of the library.
One possibility is to allow specifying a version field with a value like 1.2, such that the SONAME would become libfoo.so.1.2.
From this version, cabal would then generate the appropriate symlinks (libfoo.so.1.2 and libfoo.so linking to libfoo.so.1, I believe).
Alternatively we could just make the SONAME configurable directly. But then we can't really generate symlinks.
On Windows, one only versions shared objects by filename. So cabal should merely modify that in this case.
CC @edsko @dcoutts @ezyang
Seems reasonable.
@abooij Would you like me to do an architectural overview of how this patch should work, sort of like what I did in #4047?
@ezyang I'd like to see how far I can get with #4049 before I make any promises on this issue.
OK, as requested on IRC.
version, because we've already got a field named that), but assuming you call it, I dunno, lib-version, you'll need to update Cabal/Distribution/Types/ForeignLib.hs with a new field for this, and add parsing support in the legacy and Parsec parsers (if you grep for one of the existing field names, you'll find the code you need). I am not sure if you should parse this version as a Version, but it seems as reasonable a choice as any. (There is some design flex here: should we require users to specify a soname or keep it optional? If it is optional, then this field should have type Maybe Version, the semantics being: skip setting the soname and don't make any extra libraries in that case.)
soname=SONAME. The code responsible for calling the linker to build the foreign library is in gbuild in Distribution/Simple/GHC.hs; search down to GBuildFLib. I think that ghcOptLinkOptions is the field to edit. There may be portability concerns, so make sure that the flag is the correct one on Linux and OS X (and Windows, if there is a way to do it). You can get your foreign lib version from flib. You probably want to make a helper function for computing the soname from the Version.
installFLib in Distribution/Simple/GHC.hs. You'll want a clear specification of what the behavior on all platforms is here (it would be best if we abide by the local conventions of the platform); put this spec in the documentation for foreign libraries (grep Cabal/doc to find it). Also consider whether or not these extra symlinks should get set up even for the inplace build, rather than just at install time.
Don't forget to add a test.
Let's think about cross-platform issues. Suppose we're writing a library foo.
ELF binaries typically have versions x.y.z, sometimes x.y.z.w. Here, x.y defines compatibility. For version 3.2.1, we'd get a SONAME libfoo.so.3 embedded in the file libfoo.so.3.2.1, and two symlinks libfoo.so.3 and libfoo.so.
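To make the ELF convention above concrete, here is a minimal sketch in Python (not Cabal code). The function name `elf_names` and the shape of the result are hypothetical, and pointing both symlinks directly at the real file is an assumption; the comment above only says which symlinks exist, not what they point to.

```python
def elf_names(lib, version):
    """Derive ELF names for library `lib` at a dotted version such as "3.2.1"."""
    parts = version.split(".")
    if not all(p.isdigit() for p in parts):
        raise ValueError(f"not a dotted numeric version: {version!r}")
    base = f"lib{lib}.so"
    soname = f"{base}.{parts[0]}"    # only the first component goes into the SONAME
    real_file = f"{base}.{version}"  # the file that actually holds the code
    # symlink name -> target (targets are an assumption, see above)
    symlinks = {base: real_file}
    if soname != real_file:          # avoid a self-link for one-component versions
        symlinks[soname] = real_file
    return {"soname": soname, "file": real_file, "symlinks": symlinks}
```

For `elf_names("foo", "3.2.1")` this yields the SONAME libfoo.so.3, the file libfoo.so.3.2.1, and the two symlinks libfoo.so.3 and libfoo.so.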
Mach-O binaries have a (usually single-digit) major version in the filename (e.g. 3 in libfoo.3.dylib). This is stored in the install-name field in the binary, which is a path to the library (but this can be RPATH-based, which is what GHC does). However, the binary also has a current_version and a compatibility_version, and they tend to be of the form x.y.z, and are not necessarily related to each other. But usually, the version is specified completely by the major version plus the current version (the compatibility version simply being the current version's .0.0 release). Example: /opt/local/lib/libjpeg.8.dylib (compatibility version 9.0.0, current version 9.2.0). (Typically the major version is less than or equal to the first digit of the compatibility version.)
PE binaries. I haven't quite been able to figure out what the status quo is here. But I haven't been able to find a consistent file naming scheme. You can apparently set a version field in the binary, and it can have basically any value.
Especially given the difference between ELF and Mach-O, I think it'd be wise to play safe and only implement a scheme for ELF here: so perhaps the field should be called elf-version.
However, if we're willing to restrict the user in a number of ways, we _could_ say that we take the version to be of the form x.y.z(.w), and set the Mach-O major version to be x, its compatibility version to be x.0.0(.0), and its current version to be x.y.z(.w). And then we don't do any renaming for PE, but we simply store the input version in the relevant field in the binary.
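The restricted mapping proposed here could look roughly like the following Python sketch. It only illustrates the arithmetic; the function name `macho_versions` and the dictionary keys are hypothetical, and nothing here reflects how Cabal or the linker would actually be invoked.

```python
def macho_versions(lib, version):
    """Map a version x.y.z(.w) to the Mach-O names proposed above."""
    parts = version.split(".")
    if len(parts) not in (3, 4) or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected x.y.z or x.y.z.w, got {version!r}")
    major = parts[0]
    # compatibility version: x followed by zeros, same number of components
    compat = ".".join([major] + ["0"] * (len(parts) - 1))
    return {
        "file": f"lib{lib}.{major}.dylib",   # major version in the file name
        "compatibility_version": compat,
        "current_version": version,
    }
```

With `macho_versions("foo", "3.2.1")` the file is libfoo.3.dylib, the compatibility version is 3.0.0 and the current version is 3.2.1.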
Adding to the previous, on Windows one can use module definition files to set the version of a library. And module definition files are already supported by cabal. So to me that reinforces the idea that library versioning is a per-platform thing.
OK, makes sense to me!
After some more googling, it seems that asking for a libtool-style current[:revision[:age]] style specification is the sanest thing to do. The field could then be called version-info, the name in libtool. Perhaps library-version-info would make it even less ambiguous.
On Linux, current:revision:age gets translated into version number major.age.revision, where major=current-age, reflecting the fact that ABIs can be backwards compatible. And then the SONAME is libfoo.so.major.
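As a worked example of this translation, here is a small Python sketch. The function name `libtool_to_linux` is hypothetical. Defaulting a missing revision or age to 0 and rejecting an age larger than current follow my understanding of libtool's rules, so treat both as assumptions.

```python
def libtool_to_linux(lib, version_info):
    """Translate libtool-style current[:revision[:age]] to Linux naming."""
    fields = version_info.split(":")
    if not 1 <= len(fields) <= 3 or not all(f.isdigit() for f in fields):
        raise ValueError(f"expected current[:revision[:age]], got {version_info!r}")
    # assumption: omitted revision/age default to 0
    padded = fields + ["0"] * (3 - len(fields))
    current, revision, age = (int(f) for f in padded)
    if age > current:
        raise ValueError("age must not exceed current")
    major = current - age
    return {
        "version": f"{major}.{age}.{revision}",  # major.age.revision
        "soname": f"lib{lib}.so.{major}",
    }
```

For instance, a version-info of 5:2:3 gives major = 5 - 3 = 2, so the version number is 2.3.2 and the SONAME is libfoo.so.2.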
On other Unix-style operating systems, the rules differ, but are usually similar to the ones on Linux, and there is some agreed rule on how this should be done (although figuring out that rule might require some research and asking around).
EDIT: This does not take away from the fact that for now I'm only intending to support Linux - but at least this scheme could be applied to some other OSes, notably *BSDs, and possibly OSX.
More opinions on the following question are welcome: how are developers supposed to specify library versions? One option is a libtool-style version-info field (with current[:revision[:age]]). Another is to fill in the fields directly (e.g. a soversion field that gets appended to the libfoo.so filename).
Relevant arguments for libtool-style versioning include:
Arguments against include:
There is also the possibility of offering both a libtool-style version-info field, and a field for forcing the specific fields on various platforms (possibly with different names for each field on each platform), and raising an error if both are specified. For now the support will be Linux-only. (Unless somebody is able to do other platforms.)
CC @christiaanb
I asked Reddit for some comments: https://www.reddit.com/r/haskell/comments/5bkhth/request_for_comment_specifying_versions_soname/
I want to note that I only know enough about dynamic linking to fix bugs... not really an expert on conventions. Regardless, it seems that version specification is way different on OS X compared to Linux, as you can see here (sorta): http://stackoverflow.com/a/32280483
Although you indicate you only want to support Linux for now (which is something I strongly encourage), someday, someone might want to add OS X (or other OS) support. It would then be quite nice if he/she didn't have to bikeshed with the other Cabal developers over a new top-level entry in the .cabal file for the OS version info.
So I propose something like:
version-info:
  linux:
    linux-version-data-1:
    linux-version-data-2:
And then when the OS X implementation person comes along, all he/she has to do is add a new sub-category for the version-info.
CC @mchakravarty given that he does commercial OS X development, while I mostly use my Mac just for web browsing these days.
As @christiaanb suggests, I am strongly in favour of distinguishing platform-specific version information. At least, I don't see a good way to combine the Mach-O/macOS/iOS versioning scheme with the Linux one.
Thanks for the reddit post and your contributions here so far! Your ideas are agreeable to me, but there are a few things that require some arbitrary decision in this setup - so I'd like to leave it up to you to bikeshed about that. For example:
What to do if someone attempts to build on OSX when only Linux version data is provided? Is this a configure error, or do we ignore version data? What if both libtool and linux version data is provided?
If we set a version of 3.2.1 on Linux for a foreign library foo, does that mean we also install the appropriate symlinks libfoo.so and libfoo.so.3 and guess the SONAME libfoo.so.3? Or should these be separate fields?
Just commenting on the OS X bit:
I assume we would still have to provide the current version field, even if we have this new version-info stanza, no? Although I would understand that this might lead to some confusion.
Anyhow, under the assumption we will still have the version field, I would say that:
linux-version-info is provided
@christiaanb: do you mean the version associated to a package? we won't remove that of course. i'd like to emphasize that the goal of this issue is to allow users to specify ABI versions for foreign libraries, and in principle these ABI versions are not related to the _release_ versions of the packages (although typically they are bumped at similar times). this distinction is also made in the C world: the release version of a library is not equal to the ABI version. in specific cases it may be possible to synchronise them, but this is not always necessary nor desirable.
anyway, sure, if OS-specific versions are used, and we're building on an OS that doesn't have the version, we'll just build without version data.
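A rough Python sketch of that selection rule, for illustration only. The field names follow the `<os>-version-info` pattern used informally in this thread, and the function name `pick_version_data` is hypothetical. Falling back to a generic version-info field, and raising an error when both it and an OS-specific field are present, are assumptions taken from one of the options floated earlier, not a settled design.

```python
def pick_version_data(fields, target_os):
    """Return the version data that applies when building for `target_os`.

    `fields` maps field names from the foreign-library stanza to their values.
    Returns None when nothing applies, meaning: build without version data.
    """
    generic = fields.get("version-info")                 # libtool-style field
    specific = fields.get(f"{target_os}-version-info")   # e.g. linux-version-info
    if generic is not None and specific is not None:
        # assumption: specifying both is treated as an error
        raise ValueError(f"both version-info and {target_os}-version-info given")
    return specific if specific is not None else generic
```

So a stanza that only sets linux-version-info yields that value on Linux and `None` on OS X, in which case the library is built unversioned.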
@ezyang, could you give a few hints on parsing @christiaanb's syntax above?
@christiaanb's proposed syntax is not going to be very easy to parse in Cabal's existing framework; I don't think we have anything that looks like it. If this is what we end up doing, we should just have separate fields for each of the subfields. (Extra benefit: you can conditionalize over them.)
@abooij @ezyang I didn't actually literally mean to use the syntax I proposed, more to give a general idea: that a library has a field called e.g. version-info with subfields <OS>, where <OS> has appropriate subfields. I care about the semantics, not the syntax ;-)