Stack is one of my favorite build tools and unfortunately it currently does not follow the XDG Base Directory Specification: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
My $HOME/.stack folder is currently sitting at around 12GiB (most of which is cache), which requires me to manually check which files I have to back up.
From a quick glance the following files are configuration which would belong in $XDG_CONFIG_HOME/stack:
$HOME/.stack/config.yaml
$HOME/.stack/global-project/stack.yaml
The files in $HOME/.stack/templates/ probably belong in $XDG_DATA_HOME/stack/templates/, as the user could have their own templates, which could therefore be considered essential.
The other directories seem to only contain cache data which are non-essential and would belong in $XDG_CACHE_HOME/stack.
If you can add an option to change the directory structure, that would be great.
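The split proposed above can be sketched as a small shell function. This is only an illustration of the suggested layout, not how Stack behaves today: `stack_xdg_target` is a hypothetical helper that takes a path relative to $HOME/.stack and prints where it would live, and the fallbacks ($HOME/.config, $HOME/.local/share, $HOME/.cache) are the defaults the specification gives for unset or empty variables.

```shell
# Hypothetical helper: map a path relative to $HOME/.stack
# to the XDG location proposed in this issue.
stack_xdg_target() {
    case "$1" in
        config.yaml|global-project/stack.yaml)
            # Configuration files
            printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/stack/$1" ;;
        templates|templates/*)
            # User templates, treated as essential data
            printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}/stack/$1" ;;
        *)
            # Everything else is assumed to be re-creatable cache
            printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/stack/$1" ;;
    esac
}

stack_xdg_target config.yaml
stack_xdg_target templates/my-template.hsfiles
stack_xdg_target snapshots
```

Treating everything that is not explicitly listed as cache is an assumption of this sketch; a real change would need to audit each directory under $HOME/.stack.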
Just thinking about how this should behave, maybe if $XDG_CACHE_HOME is set then use that for the cached items unless the cached items are already in $HOME/.stack. This way new installations would do the right thing, but existing installations would still keep using their existing cache. And something similar for $XDG_CONFIG_HOME.
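A minimal sketch of that behaviour for the cache, written as a shell function. `stack_cache_root` is a hypothetical name, and the sketch assumes that the mere existence of $HOME/.stack is enough to count as "already in use"; it also falls back to the specification's default of $HOME/.cache when $XDG_CACHE_HOME is unset or empty.

```shell
# Hypothetical helper: pick the root directory for Stack's cached items.
# Existing installations keep using $HOME/.stack; new ones use the XDG cache directory.
stack_cache_root() {
    legacy="$HOME/.stack"
    if [ -d "$legacy" ]; then
        # Legacy directory exists: keep using it, nothing is copied or moved.
        printf '%s\n' "$legacy"
    else
        # Fresh installation: follow the XDG Base Directory Specification.
        printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/stack"
    fi
}

stack_cache_root
```

The same check could be repeated for $XDG_CONFIG_HOME with the configuration files.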
I would find this helpful — I want to keep down the size of my backups, and so I currently have to manually delete caches.
The global-project example seems fairly complicated — the working directory is the global-project/.stack-work directory. This would have to change, here, too.
@borsboom Yes, that sounds sensible. We wouldn't want to copy the cache automatically when invoking stack, as it may be very large, so it should keep using the available $HOME/.stack directory if it exists.
@dbaynard Correct me if I'm wrong, but the global-project/.stack-work also falls under the cache category, as it can be re-created given a list of packages that are "globally" installed through stack. The logs aren't really important either, as they only become relevant when something goes wrong.
It does fall in that category, and should therefore be in $XDG_CACHE_HOME. However, it is currently maintained separately to the $HOME/.stack caches, and so I believe it uses the logic from per-project .stack-work directories (though I'm not familiar with that part of the code). I meant to say: that part of the code would have to change, possibly significantly so.
I'm trying to get a handle on how much work would be involved in a PR, and which other parts of the code would be impacted. @rszibele would you be able to take a look?
Also, you've listed backups as a reason for this change. Are there others?
I've reproduced the justification from the base directory specification you linked, here.
The XDG Base Directory Specification is based on the following concepts:
There is a single base directory relative to which user-specific data files should be written. This directory is defined by the environment variable $XDG_DATA_HOME.
There is a single base directory relative to which user-specific configuration files should be written. This directory is defined by the environment variable $XDG_CONFIG_HOME.
There is a set of preference ordered base directories relative to which data files should be searched. This set of directories is defined by the environment variable $XDG_DATA_DIRS.
There is a set of preference ordered base directories relative to which configuration files should be searched. This set of directories is defined by the environment variable $XDG_CONFIG_DIRS.
There is a single base directory relative to which user-specific non-essential (cached) data should be written. This directory is defined by the environment variable $XDG_CACHE_HOME.
There is a single base directory relative to which user-specific runtime files and other file objects should be placed. This directory is defined by the environment variable $XDG_RUNTIME_DIR.
@dbaynard I'll have to take an in-depth look at how much has to be changed and how the global project works. From a quick overall glance at the code base we have a few options:
1. The easy way with as few modifications as possible:
[...] XDG_CACHE_HOME as the default (as most is cache data)
[...] XDG_DATA_DIR
2. The clean solution with a _lot more_ modifications:
[...] (XDG_DATA_HOME, XDG_CONFIG_HOME, XDG_CACHE_HOME, XDG_RUNTIME_DIR)
Stack already uses the path-io package, so no new dependencies need to be added, as the following function can be used:
http://hackage.haskell.org/package/path-io-1.4.0/docs/Path-IO.html#v:getXdgDir
I'd prefer the clean solution from a code perspective, but the first one would work equally well from a practical standpoint and it also _should_ make the global project work as-is without any extra modifications.
NB: I haven't yet looked into how the global project works. It may or may not be much more work with the second solution.
I'm currently working on an experimental tool to generate Flatpak manifests from stack projects to allow easy distribution of Haskell binaries on GNU/Linux, so it could take a bit before I can look into this in more depth.
The main reason is backups and the ability to easily delete caches without knowing the internals of a program or manually excluding directories from backup scripts. Thankfully a lot of projects (new and old: KDE, GNOME, Chromium, Blender, GIMP 2.10, and many more) support the XDG Base Directory Specification.
I really hope XDG becomes the gold standard on GNU/Linux instead of the old $HOME/.myprogram, so I am also willing to work towards it whenever I have the capacity.
That looks great @rszibele! I agree on XDG; it's good to have these reasons explicit, here.
Do note that there are some major changes to stack's caching behaviour (e.g. #4254, #3922) in progress. It seems like this change should be orthogonal. It would be very good to make this change at the same time. @snoyberg is driving those changes.
I would also love to see this change, just to throw my 10c (and support) in to the conversation. The "standard" of just throwing everything in to $HOME might be the easy fire and forget choice, but it's terrible if you want the ability to back up files that are important such as configuration easily and periodically wipe caches, and also don't want ls -a ~ to look like a dumpster fire.
Just thinking about how this should behave, maybe if $XDG_CACHE_HOME is set then use that for the cached items unless the cached items are already in $HOME/.stack. This way new installations would do the right thing, but existing installations would still keep using their existing cache. And something similar for $XDG_CONFIG_HOME.
I just want to point out a misunderstanding about the XDG_* environment variables.
As specified in the XDG Base Directory Specification, the XDG_* environment variables don't have to be set. Setting an XDG_* environment variable is just a way to override its default value:
$XDG_DATA_HOME defines the base directory relative to which user specific data files should be stored. If $XDG_DATA_HOME is either not set or empty, a default equal to $HOME/.local/share should be used.
$XDG_CONFIG_HOME defines the base directory relative to which user specific configuration files should be stored. If $XDG_CONFIG_HOME is either not set or empty, a default equal to $HOME/.config should be used.
[...]
$XDG_CACHE_HOME defines the base directory relative to which user specific non-essential data files should be stored. If $XDG_CACHE_HOME is either not set or empty, a default equal to $HOME/.cache should be used.
It's really important to understand that you MUST NOT rely on the XDG_* environment variables being set: when one is unset or empty, the default location from the specification has to be used.
It may be obvious to many of you, but I just wanted to point it out because many programs that claim to support the XDG Base Directory Specification expect those environment variables to be set. The result is a cluttered home directory, because neither Linux distributions nor the shell set those variables (they don't have to), and that makes the spec useless.
To briefly show the expected behavior, look at the shell script example below:
readonly config_dir="${XDG_CONFIG_HOME:-$HOME/.config}/stack"
To fully understand this spec, please read it.
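To make the "not set or empty" wording concrete, here is a small sketch comparing the two shell expansion forms. Both function names are hypothetical. The `:-` form falls back to the default when the variable is unset or empty, which is what the specification asks for; the form without the colon only falls back when the variable is unset.

```shell
# Matches the spec: ':-' uses the default when XDG_CACHE_HOME is unset OR empty.
stack_cache_dir() {
    printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/stack"
}

# Subtly wrong: '-' without the colon only uses the default when the variable
# is unset, so an empty XDG_CACHE_HOME produces the path "/stack".
stack_cache_dir_unset_only() {
    printf '%s\n' "${XDG_CACHE_HOME-$HOME/.cache}/stack"
}

XDG_CACHE_HOME=""
stack_cache_dir             # default under $HOME/.cache is still used
stack_cache_dir_unset_only  # prints /stack
```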
Just thinking about something while writing my previous comment.
Stack has a configuration file for each project and another one for the global project.
_But is there a need for a system-wide configuration file?_ It could set default behavior for all the users on the system (think of a system administrator who wants to configure the default behavior of all the computers in a university lab).
This file could be located at /etc/stack.yaml by default on Unix systems, but its location should be configurable at compile time by the package maintainer.
I'm not saying there is a need for it. But while rewriting the mechanism that loads the configuration files, it may be useful to add this feature.
What do you think about it?
But is there a need for a system-wide configuration file?
Do you mean, a stack.yaml in addition to the global project file? Or an equivalent to the config.yaml file, but in /etc/?
So,
[...] stack.yaml
[...] stack.yaml
[...] stack.yaml / system config.yaml
@dbaynard
Arf. I just saw there is already a global non-project configuration file located at /etc/stack/config.yaml (see the documentation).
Two things:
@rszibele Are you still working on this?
I've just encountered another reason to do this. I’d installed pandoc using stack, then later deleted my ~/.stack directory. As a result pandoc stopped working, giving me an error about data files.
This is because data files are stored in ~/.stack/snapshots/$compiler/$snapshot/$ghc-version/share/$compiler/$package/data/. Pandoc has quite a few data files.
It would be nice if this sort of thing could be separated from build caches — though for data-files I'm sure there's a workaround.
@dbaynard Unfortunately, I'm unable to allocate any time to this issue at the moment. I'm currently caught up in a commercial project that I'm expecting to ship this June/July (if all goes well).
Pandoc has recently done this (see jgm/pandoc#3582). It would be nice to get this for the next major release.
Closing, having added to the wishlist. PRs welcome!