Julia: Reduce dotfile bloat

Created on 2 Feb 2015 · 41 Comments · Source: JuliaLang/julia

Those of us with old home directories have to live with a huge amount of various dotfiles/dotdirs under the home directory root, and every new piece of software adding its own config files or whatever directly under the home directory only makes it worse.

One solution would be to follow the XDG basedir spec (store stuff under ~/.config and ~/.local/XXX), but that was already shot down for Julia, see https://github.com/JuliaLang/julia/issues/4630 .

So, as it seems we're going to have to live with a ~/.julia directory, would it be possible to at least move the other dotfiles under that directory? That is, at least something like

~/.juliarc.jl => ~/.julia/juliarc.jl
~/.julia_history => ~/.julia/julia_history

and maybe some other file I have missed?

build good first issue

All 41 comments

+1

My proposal at #4630 was maybe a bit too bold, but moving all julia files to a single dir will increase clarity for everybody.

Yes, +1 to that.

I started looking at this; I guess the main thing is how to unify a move to ~/.julia/juliarc.jl with base/pkg/dir.jl. I'll continue looking at this in the future if no one else does.

Awesome, thanks for working on this!

I wonder, would it be useful to have some kind of migration facility for people upgrading from older releases? Something like

A) If ~/.julia/juliarc.jl exists, use that.

B) If ~/.julia/juliarc.jl does not exist, but ~/.juliarc.jl exists, mv("~/.juliarc.jl", "~/.julia/juliarc.jl") and print a message to stderr that the file has been moved.

C) For the 0.5 release, remove the code implementing (B).

And equivalently for other moved dotfiles, except that maybe it isn't necessary to print a message that ~/.julia_history has been moved since it's not a file a user is expected to need to care about per se.

A problem with this is if users don't migrate in one step, but rather go back and forth between 0.3 and 0.4 over a period of time before stopping using 0.3. Perhaps copy the files rather than moving, or maybe just print a warning message and let the users sort it out on their own? IIRC when matplotlib switched from ~/.matplotlibrc to ~/.matplotlib/matplotlibrc it resorted to just nagging rather than copying/moving anything. So perhaps an alternative (B) would be

B.1) If ~/.julia/juliarc.jl does not exist, but ~/.juliarc.jl exists, use ~/.juliarc.jl and print a warning message to stderr that this is deprecated?
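
The lookup order in (A) and (B.1) could be sketched roughly as below. This is a Python illustration of the idea, not Julia's actual startup code; the function name `find_startup_file` and its `home` parameter are hypothetical.

```python
import sys
from pathlib import Path


def find_startup_file(home):
    """Return the rc file to load, or None, preferring the new location."""
    new_path = Path(home) / ".julia" / "juliarc.jl"
    old_path = Path(home) / ".juliarc.jl"

    # (A) The new location wins whenever it exists.
    if new_path.is_file():
        return new_path

    # (B.1) Fall back to the legacy file without moving or copying it,
    # so an older release installed in parallel keeps working.
    if old_path.is_file():
        print(f"warning: {old_path} is deprecated; use {new_path}", file=sys.stderr)
        return old_path

    return None
```

Because nothing is moved, going back and forth between 0.3 and 0.4 would keep working; the only cost is the warning.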

I'd rather we not do B. A lot of us are going to have parallel installations of both 0.3 and 0.4 for a while.

There was a similar change in the default location for packages between 0.2 and 0.3. We can do B.1 but I'm not sure the warning is really necessary.

Startup warnings are really annoying! I don't think getting people to follow a new arbitrary file location is a good reason, and we're not in a hurry to convert everyone.

How about just making new versions aware of both conventions, but create files in the new location, on first run on a clean machine?

If we first make a move, we should probably consider optionally loading dot-files from the vX.X folders, so that touch .julia/v0.4/.julia_history would make 0.4.X use a different history file.

Just an FYI, I'm not going to get to this anytime soon. I don't guess it's too tough, but would require more time than I have to spend on julia to boot up and learn the required tools.

Julia could at least follow the approach of Git and use an XDG directory if it exists. Or it'd be nice to be able to customize the location with an environment variable.

Bump.

I should point out that if JULIA_PKGDIR is set then that location is used instead of ~/.julia, but this environment variable does not (and should not) affect the location of the history and rc files. Under this proposed change would history and rc files be found under JULIA_PKGDIR or would they always be ~/.julia/history.jl and ~/.julia/rc.jl? Fundamentally, there are two locations for Julia files: a configurable package path and fixed location for global configuration.

That said we could have ~/.julia/history.jl and ~/.julia/rc.jl as the fixed location even if ~/.julia is not where JULIA_PKGDIR points.
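
The two-location split described above might look like this in outline. It is a Python sketch under the assumption that only the package path is configurable; `julia_locations` is a hypothetical helper, and `history.jl` and `rc.jl` are the names floated in this comment, not settled ones.

```python
from pathlib import Path


def julia_locations(env, home):
    """Return (package_dir, history_file, rc_file) for the given environment."""
    fixed = Path(home) / ".julia"
    # Only the package path honours JULIA_PKGDIR.
    package_dir = Path(env.get("JULIA_PKGDIR") or fixed)
    # History and rc files stay in the fixed location regardless.
    return package_dir, fixed / "history.jl", fixed / "rc.jl"
```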

Since you've very unfortunately chosen to not use XDG, could you please provide a mechanism to move HOME/.julia somewhere else? Typically projects have always used a command-line option for this such as gdb, or environment variables such as GNUPGHOME, CABAL_CONFIG, CHICKEN_REPOSITORY and many more.

If you'd like to review some case studies of tools which have adopted XDG, here is a non-exhaustive (but fairly decent) list: https://wiki.archlinux.org/index.php/XDG_Base_Directory_support (Please try to reconsider, as most of your points against it were inaccurate or semantically debatable.)

Maybe ~/.julia/config.jl instead of rc.jl?

or startup.jl, we describe it that way in the command line flag

I like startup.jl yes.

As I said when someone originally brought up XDG, it would be fine to have XDG as an opt-in thing, but XDG generally assumes Linux only. For Julia, having a consistent, predictable experience across Windows, macOS and Linux is more important than following XDG and having a totally different experience on Linux than everywhere else. If you have any suggestions about how opt-in XDG would work, we can certainly consider supporting it.

startup.jl is a good name. Does anyone have any thoughts on the two locations issue?

Have a JULIA_CONFIG_DIR environment variable?

So if JULIA_CONFIG_DIR is set to ~/.config then the layout is XDG-compatible? The problem is that XDG has a lot of other directories (~/.cache, etc.). Presumably if someone wants to use one of them, then they want to use all of them.

I agree that XDG (XBDS) isn't entirely appropriate for languages and their ecosystems. XBDS itself was a "base" specification written in at least 2004 (version 0.5), meant to be used internally across the XDG (now FDO) projects.

But XBDS itself has turned out to be a nice idea for others as well and has seen fairly wide adoption across many non-desktop projects, most of which have left legacy paths intact so as not to break their users' systems (e.g. git).

To avoid embarking in a voyage on the semantic gulf, let's step back and consider the larger picture.

Projects like systemd provide a tool called systemd-path(1), a small part of their ongoing file-hierarchy(7) effort. Running this command at the moment will print out the following (unfortunately hardcoded at the moment) locations (among many):

[...]
user-binaries: /home/earnest/.local/bin
user-library-private: /home/earnest/.local/lib
user-library-arch: /home/earnest/.local/lib/x86_64-linux-gnu
[...]

Ultimately the purpose, in my opinion, is to provide a controlled way to structure user home directories, much like Windows and MacOS already do. Where the data finally ends up going is largely an exercise in bikeshedding.

Whether you use XDG_CONFIG_HOME for the *rc files, also use XDG_DATA_HOME for REPL histories and packages, or even use wholly custom environments is really beside the point. Just having something to keep these files out of HOME, having a nice way to order system repos and backups, and the ability to depend on the environment as a PREFIX much like in most build systems, is all I really care about.

I can't be of much help implementing this, but I hope you don't view the XBDS as for desktops only, or wholly inappropriate for your needs going forward.

Thanks for reading.

We could follow the breakdown of XDG into config, cache and data directories (I'm not sure we need runtime). These could be controlled by JULIA_{CONFIG,CACHE,DATA}_DIR variables and corresponding options to julia. These can all default to being under ~/.julia. That way if someone wants to opt into XDG all they have to do is set the environment variables appropriately.
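
A rough sketch of how such variables could resolve, written in Python for illustration. The JULIA_{CONFIG,CACHE,DATA}_DIR names come from this comment and are only a proposal; `julia_dirs` is a hypothetical helper, and treating an empty variable as unset is an assumption.

```python
import os
from pathlib import Path


def julia_dirs(env=None, home=None):
    """Resolve config, cache and data directories from the environment."""
    env = os.environ if env is None else env
    base = Path(home) if home is not None else Path.home()
    default = base / ".julia"
    # Each directory falls back to ~/.julia when its variable is unset or empty.
    return {
        kind: Path(env.get(f"JULIA_{kind.upper()}_DIR") or default)
        for kind in ("config", "cache", "data")
    }
```

With no variables set, all three entries point at ~/.julia, so the default behaviour is unchanged; opting into an XDG-style layout would only require exporting the three variables.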

Everyone seems to be talking about Linux, but this is also an issue on Windows. The only thing more annoying than spamming the home directory on Linux is spamming the user-profile directory on Windows. Windows programs should write their files in AppData.

This would be good to have for 1.0.

The situation for Windows is especially bad.

The main thing missing here is a plan. If someone who actually knows about XDG or Windows could provide a straightforward, concrete proposal, that would help. So far, the main actionable points seem to be:

  1. Allow configuration of where Julia considers its "home" to be. Until recently, the name JULIA_HOME was used for something else, but we can reclaim that in 1.0. It can default to ~/.julia for some reasonable definition of ~ for the platform (user home directory on UNIX, wherever we're currently sticking .julia on Windows).

  2. Move ~/.julia_history and ~/.juliarc.jl into JULIA_HOME. Perhaps as ~/.julia/logs/repl_history.jl and ~/.julia/config/startup.jl.

I have no idea how XDG or Windows fit into this. How should this interact with XDG? Is there some environment variable Julia should pay attention to? What is "the situation" on Windows?

See https://github.com/JuliaLang/julia/issues/4630 for a description of how Julia could follow XDG. Maybe Pkg3 changes the situation compared to when I filed the issue (one of my first issues!).

Idea while we're moving dot files around: shard history file by month, i.e. write to

~/.julia/logs/repl_history/YYYY/MM.jl

We can arrange for the history loading mechanism to load files on-demand if you go back in history far enough. This would make it easy to automatically delete or archive history files when they're old enough and it would automatically limit the size of history files to something reasonable.
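
The sharding scheme could be sketched as follows, in Python for illustration. Both function names are hypothetical, and `root` stands for the ~/.julia/logs/repl_history directory mentioned above.

```python
from datetime import date
from pathlib import Path


def history_shard(root, day):
    """Path of the shard that receives entries written on `day`."""
    return Path(root) / f"{day.year:04d}" / f"{day.month:02d}.jl"


def shards_newest_first(root):
    """List existing shards, most recent month first."""
    # Zero-padded YYYY/MM names sort chronologically, so a reverse sort
    # gives the order in which a reader would walk back through history.
    pattern = "[0-9][0-9][0-9][0-9]/[0-9][0-9].jl"
    return sorted(Path(root).glob(pattern), reverse=True)
```

A history loader could read only the first shard at startup and open the next one when the user scrolls past its oldest entry; pruning old history then amounts to deleting files from the end of that list.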

For Julia, having a consistent, predictable experience across Windows, macOS and Linux is more important than following XDG and having a totally different experience on Linux than everywhere else.

I’d argue that this is not useful at all. You then end up with the vast majority of programs following system standards – with Julia being one of the very few ignoring them. A much better approach would be to respect every system’s standard and then build a good abstraction on top of that.

You should also keep in mind that Julia will see increasing adoption, meaning that more and more users use Julia without being (core) developers – and thus don’t need these files and folders directly under ~/. I, for example, never did anything with Python, yet I have quite a few Python packages on my system – and I hate Python for trashing my home, and I don’t care about Python being developer friendly at all.

I’m sure that most users don’t develop packages at all, and instead just use the REPL and execute the binary. So a good package development experience is more of an exceptional requirement. Developers, however, should easily be able to change these folders and files to reside in ~, for a good experience. I personally also did a little package development, but would still like Julia to follow XDG standards. Please focus on “normal” users instead of core developers in the future – core developers are well able to customize their environment to fit their needs. You can add appropriate instructions/recommendations in the manual under “Workflow tips” so that they are obvious to developers. In my opinion, this approach fits both use cases best.

The “everything in ~/.julia” approach is quite an improvement, but it’s still half-baked, to me.

Currently, to describe the location of startup.jl we can just write ~/.julia/config/startup.jl and that applies to all platforms. If we did this, we'd need to write this instead:

  • On Linux it is ~/.config/julia/config/startup.jl
  • On Windows it is %APPDATA%/Julia/config/startup.jl
  • On macOS it is ~/Library/Application Support/Julia/config/startup.jl
  • On FreeBSD it is ~/.julia/config/startup.jl.

Multiply that by the number of different times we need to talk about the locations of files.
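
To make the documentation cost concrete, the four cases in the list above amount to something like the following Python sketch. `startup_path` is a hypothetical helper; the platform strings follow Python's `sys.platform` values, and `appdata` stands in for the value of %APPDATA%.

```python
from pathlib import PurePosixPath, PureWindowsPath


def startup_path(platform, home, appdata=None):
    """Location of startup.jl under the per-platform layout listed above."""
    if platform == "linux":
        base = PurePosixPath(home) / ".config" / "julia"
    elif platform == "win32":
        # `appdata` stands in for the value of %APPDATA%.
        base = PureWindowsPath(appdata) / "Julia"
    elif platform == "darwin":
        base = PurePosixPath(home) / "Library" / "Application Support" / "Julia"
    else:
        # FreeBSD and anything unrecognised keep the single ~/.julia tree.
        base = PurePosixPath(home) / ".julia"
    return base / "config" / "startup.jl"
```

Under the current layout the whole function collapses to one line, which is the argument being made here.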

Once we remove the deprecated JULIA_HOME environment variable in 1.0 the plan (my plan at least) is to use it to allow people to override ~/.julia but we can't do that until 1.0 when we delete the current deprecated meaning. Then people who want XDG compliance can just do export JULIA_HOME=~/.config/julia in their startup scripts.

Currently, to describe the location of startup.jl we can just write ~/.julia/config/startup.jl and that applies to all platforms. If we did this, we'd need to write this instead:

On Linux it is ~/.config/julia/config/startup.jl
On Windows it is %APPDATA%/Julia/config/startup.jl
On macOS it is ~/Library/Application Support/Julia/config/startup.jl
On FreeBSD it is ~/.julia/config/startup.jl.

Multiply that by the number of different times we need to talk about the locations of files.

That’s how it is – and certainly not your fault. But this would be a solid solution for all systems. And XDG should also be valid on FreeBSD, as far as I know. Maybe it would be helpful to provide some sort of chjuliahome() function for switching to that directory, or otherwise a manual config approach for statically placing the development folder in someone’s home. 🙂

Once we remove the deprecated JULIA_HOME environment variable in 1.0 the plan (my plan at least) is to use it to allow people to override ~/.julia but we can't do that until 1.0 when we delete the current deprecated meaning. Then people who want XDG compliance can just do export JULIA_HOME=~/.config/julia in their startup scripts.

Good to hear, thanks! Although I still prefer the default to be the other way round… ;)

Also, XDG seems to dictate that we have at least two separate locations for files: ~/.config and ~/.local/share, so even the JULIA_HOME approach doesn't really do it. Separating all the files that Julia reads and writes into configuration and data doesn't map terribly well onto what we keep there. Is an installed package configuration or user data? Even if JULIA_HOME defaulted to ~/.config/julia on Linux, I feel like we'd soon be getting complaints that we're doing it wrong and that half of the things should go under ~/.local/share instead.

That’s how it is – and certainly not your fault.

I do not have a strong track record of respecting "how it is" when "how it is" is less than ideal 😈

There is also XDG_CACHE_HOME and XDG_RUNTIME_DIR which means we'd have to spread Julia-related files across four different directories to follow XDG properly.

Yeah, I’d also like Windows (and maybe Mac) to disappear. 😜👌

I’d take .juliarc.jl as config (maybe also the history), with all the rest as "other stuff" ⇒ ~/.local/share/julia/.

I don’t think that you have to follow the XDG approach THAT strictly if it’s not that easy for you. I think having all Julia data in at least one or two well-known standard subfolders would be enough to make everybody here happy. 😎

For me, ~/.config is a directory where only configuration and customizations should be stored, i.e. things that may be of interest to the users. Everything else should go into "user data", because those data files themselves are irrelevant to the users in regular cases. (When I restore a backup but want some programs “freshly installed”, I for example just restore the ~/.config/ data for those programs, and not the user data.) For development purposes, it may be more reasonable to have one’s self-developed packages in some sort of ~/Projects/Julia/ folder (which then gets internally bundled with all the other packages – compare how shells treat /bin/, /usr/bin/, /usr/local/bin/, etc). I’m not sure whether self-developed packages currently reside among all the other foreign packages, but that would be strange to me, as I’d like my stuff to be separate from foreign/installed packages (I think I symlinked my files to my dev folder, back when I worked for some time on a package).

Care to take a crack at classifying all of the different things in ~/.julia according to XDG:

  • ~/.julia/clones – where we clone bare Julia repos
  • ~/.julia/compiled – where we put pre-compiled package files (.ji files)
  • ~/.julia/config – configuration files
  • ~/.julia/dev – where we checkout projects for development by default
  • ~/.julia/environments – named user environments
  • ~/.julia/logs – log files, including REPL history and manifest usage logs
  • ~/.julia/packages – installed versions of packages
  • ~/.julia/registries – package registries

~/.config should include relatively lightweight data reflecting settings particular to the user configuration, e.g. juliarc and list of required packages/environments. Packages registries and packages themselves and logs would probably go to ~/.local/share. Precompilation files (which can be recreated automatically from other data) would go to ~/.cache.

Note that one advantage of using standard system directories (XDG, Windows or Mac) is that they offer standard places where system administrators can put system-wide or site-wide default configuration files, package registries, etc.

I assume everything goes into $XDG_DATA_HOME, except that "compiled" is better put into $XDG_CACHE_HOME and the configuration file into $XDG_CONFIG_HOME:

  • ~/.julia/clones ⇒ $XDG_DATA_HOME
  • ~/.julia/compiled ⇒ $XDG_CACHE_HOME
  • ~/.julia/dev ⇒ $XDG_DATA_HOME

    • But I propose $CWD or something like ~/[Projects/]Julia/ (without a dot), hardcoded (maybe even project-specific?) via ENV, if users would actively work there.

  • ~/.julia/environments ⇒ $XDG_DATA_HOME

    • Though I'm not sure how relevant this would be to the users… (see above). Can you get me a link to a description, please?

  • ~/.julia/logs ⇒ $XDG_DATA_HOME
  • ~/.julia/packages ⇒ $XDG_DATA_HOME
  • ~/.julia/registries ⇒ $XDG_DATA_HOME

It’s good practice to separate the user’s working data (which is precious) from foreign data (which can easily be restored in most cases). It’s therefore not a good idea to put the user’s development files into dot directories, as these easily get forgotten when making backups or the like (this has happened to me more than once).

That dynamic runtime folder seems to just hold temporary data while an application runs. I assume that to be practically irrelevant for Julia.

$XDG_RUNTIME_DIR defines the base directory relative to which user-specific non-essential runtime files and other file objects (such as sockets, named pipes, ...) should be stored. The directory MUST be owned by the user, and he MUST be the only one having read and write access to it. Its Unix access mode MUST be 0700.
The lifetime of the directory MUST be bound to the user being logged in. It MUST be created when the user first logs in and if the user fully logs out the directory MUST be removed. If the user logs in more than once he should get pointed to the same directory, and it is mandatory that the directory continues to exist from his first login to his last logout on the system, and not removed in between. Files in the directory MUST not survive reboot or a full logout/login cycle.
The directory MUST be on a local file system and not shared with any other system. The directory MUST by fully-featured by the standards of the operating system. More specifically, on Unix-like operating systems AF_UNIX sockets, symbolic links, hard links, proper permissions, file locking, sparse files, memory mapping, file change notifications, a reliable hard link count must be supported, and no restrictions on the file name character set should be imposed. Files in this directory MAY be subjected to periodic clean-up. To ensure that your files are not removed, they should have their access time timestamp modified at least once every 6 hours of monotonic time or the 'sticky' bit should be set on the file.
If $XDG_RUNTIME_DIR is not set applications should fall back to a replacement directory with similar capabilities and print a warning message. Applications should use this directory for communication and synchronization purposes and should not place larger files in it, since it might reside in runtime memory and cannot necessarily be swapped out to disk.

Maybe @poettering can review this?

Note that one advantage of using standard system directories (XDG, Windows or Mac) is that they offer standard places where system administrators can put system-wide or site-wide default configuration files, package registries, etc.

That argument doesn't seem to apply to user files which is what we're talking about there...

So we're looking at splitting our files up like this:

  • ~/.cache/julia

    • ~/.cache/julia/clones

    • ~/.cache/julia/compiled

  • ~/.local/share/julia

    • ~/.local/share/julia/dev

    • ~/.local/share/julia/environments

    • ~/.local/share/julia/logs

    • ~/.local/share/julia/packages

    • ~/.local/share/julia/registries

  • ~/.config/julia

    • e.g. ~/.config/julia/startup.jl

I have to say that I doubt that the typical Linux user would actually prefer this complex fractured layout to just having everything Julia-related under ~/.julia with the appropriate names. But we can certainly support a JULIA_XDG=true environment variable which activates this layout. Or perhaps we should have a JULIA_SYSTEM_LAYOUT=true variable which opts into OS-specific layouts across systems. We'd have to work out what those are everywhere and I'm not even sure how much demand there is for this anywhere else.
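
An opt-in switch along these lines might be sketched as below, in Python for illustration. JULIA_XDG is only the name suggested in this comment, `resolve` and the two tables are hypothetical, and the mapping follows the split listed above; the XDG_*_HOME defaults are the ones the XDG base directory spec defines.

```python
from pathlib import Path

# Which XDG base directory each ~/.julia subdirectory would move to,
# following the split listed above.
XDG_KIND = {
    "clones": "cache", "compiled": "cache",
    "dev": "data", "environments": "data", "logs": "data",
    "packages": "data", "registries": "data",
    "config": "config",
}

# Environment variable and spec-defined default (relative to $HOME) per kind.
XDG_BASE = {
    "cache": ("XDG_CACHE_HOME", ".cache"),
    "data": ("XDG_DATA_HOME", ".local/share"),
    "config": ("XDG_CONFIG_HOME", ".config"),
}


def resolve(subdir, env, home):
    """Map a ~/.julia subdirectory to its location under either layout."""
    if env.get("JULIA_XDG") != "true":
        return Path(home) / ".julia" / subdir
    var, default = XDG_BASE[XDG_KIND[subdir]]
    # An unset or empty XDG variable falls back to the spec default.
    base = Path(env.get(var) or Path(home) / default)
    root = base / "julia"
    # The config directory itself becomes <config base>/julia, with no extra "config" level.
    return root if subdir == "config" else root / subdir
```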

Once again, I have to express my opinion that this feels a lot like standards for their own sake. I do not see any practical advantage of this complex layout over having everything under ~/.julia.

~/.config/julia/config – this seems silly, so maybe just ~/.config/julia

Check.

It’s just really nice when your home is not spammed with files you don’t need. And except for .juliarc.jl, I never needed to look into other Julia-internal files/folders (except for placing my package symlinks to my dev folder). I’m really happy that more and more programs apply XDG rules! Think about how spammed our homes would be if every program created folders in ~/ – or worse, several files – see my Python example above. As an analogy from private life, you also put your tools away rather than leaving them on your dinner table all the time – that, too, is just for tidiness and overview, with no direct benefit.

And as most users don’t develop packages, I vote for making JULIA_XDG = true the default.

And as most users don’t develop packages, I vote for making JULIA_XDG = true the default.

The notion that only package developers need to be aware of what is in their ~/.julia directory does not really align with what I've seen. It seems better to make it easy to understand and explore than to try to hide things away and hope people don't need to look at them.

The "dotfile bloat" part of this has been addressed, so I'm closing. If someone wants to implement the XDG stuff, they're welcome to make a PR.
