Each instance of the terragrunt module creates its own cache, which bloats disk usage.
$ ls -la
-rw-r--r-- 1 ed kvm 3640 Aug 2 10:18 README.md
-rw-r--r-- 1 ed kvm 1313 Sep 4 16:26 terraform.tfvars
drwx------ 3 ed kvm 4096 Sep 3 10:11 .terragrunt-cache
$ du -sh .terragrunt-cache/
289M .terragrunt-cache/
Is it possible to use a shared cache that re-uses already downloaded modules (and their versions), so I don't have to download all of the dependencies for each module instantiation?
Collectively it is 10GB of cache.
There are a few aspects to this:
.terragrunt-cache folder makes it easier to see and figure out what Terragrunt is doing.
~/.terragrunt-cache) and symlink it to the local .terragrunt-cache? Does this work properly if you're running apply-all and lots of downloads are happening concurrently? Does this work properly if you are using different versions of a repo? How do we version the repo so you can share code from the same versions but don't mix up code from different versions?
.tf files reference. I have no clue if we can do anything to optimize this.
Suggestions on how to improve this are welcome!
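One possible way to approach the concurrency and versioning questions raised above is to key the shared cache by repository and version, and to publish each download with an atomic rename. The following is only a sketch under stated assumptions, not something Terragrunt does: the names cache_key and ensure_cached and the fetch callback are hypothetical, and it assumes a POSIX filesystem where the temporary and final directories live on the same volume.

```python
import hashlib
import os
import shutil
import tempfile


def cache_key(repo_url, ref):
    """Stable directory name for one (repository, version) pair."""
    return hashlib.sha256(f"{repo_url}@{ref}".encode()).hexdigest()[:16]


def ensure_cached(cache_root, repo_url, ref, fetch):
    """Return the shared cache directory for (repo_url, ref).

    fetch(tmp_dir) is a caller-supplied (hypothetical) download function.
    It writes into a private temporary directory, which is then renamed
    into place, so readers never see a half-populated entry.
    """
    final = os.path.join(cache_root, cache_key(repo_url, ref))
    if os.path.isdir(final):
        return final  # already downloaded by an earlier run

    os.makedirs(cache_root, exist_ok=True)
    tmp = tempfile.mkdtemp(prefix=".partial-", dir=cache_root)
    try:
        fetch(tmp)
        try:
            os.rename(tmp, final)
        except OSError:
            # Another process may have published the same entry first.
            # If so, use its copy; otherwise the failure is real.
            if not os.path.isdir(final):
                raise
    finally:
        # No-op after a successful rename; cleans up after a lost race or error.
        shutil.rmtree(tmp, ignore_errors=True)
    return final
```

Different versions of the same repo get different keys, so they never share a directory. Concurrent runs may both download, but only one result is kept.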
It would be nice to have an option to automatically remove the cache after execution, e.g. after the apply command.
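In the absence of such an option, a small script run after apply could do the cleanup. A minimal sketch, assuming the caches use the default directory name .terragrunt-cache; the function name purge_terragrunt_caches is made up for illustration:

```python
import os
import shutil


def purge_terragrunt_caches(root, cache_name=".terragrunt-cache"):
    """Delete every directory named cache_name under root and return the removed paths."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if cache_name in dirnames:
            target = os.path.join(dirpath, cache_name)
            shutil.rmtree(target)
            # Tell os.walk not to descend into the directory we just deleted.
            dirnames.remove(cache_name)
            removed.append(target)
    return removed
```

Calling purge_terragrunt_caches(".") from the top of the infrastructure repo removes every per-module cache below it, at the cost of re-downloading everything on the next run.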
I just deleted 120GB of .terragrunt-cache. I'm working on multiple environments (17 to be exact, 2 mostly destroyed and left on standby, ~750 modules used across them), all aligned on the same module versions.
Keeping .terragrunt-cache per module is the wrong architectural design. terragrunt shouldn't force me to download the same repos over and over again. By default it should have a common directory and use symlinks, as @brikis98 said. Having a flag to create a local .terragrunt-cache could be an option for debugging (I've never debugged it, though).
I can't imagine supporting and working on infrastructure with 100 clients or more. I would have to either delete the local .terragrunt-cache after every apply, forcing me to redownload hundreds of repos, or upgrade my SSD to at least 1TB (on macOS, not that easy and not that cheap).
terragrunt in this form does not scale.
@3h4x Ideas on how to improve this are welcome, but we need something that explicitly explains how it solves the issues in https://github.com/gruntwork-io/terragrunt/issues/561#issuecomment-418692976.
@brikis98 I have a few ideas, but I'm not sure how complicated or feasible they are.
One is the symlinks already mentioned; the second is a proxy that replaces the source with an ad hoc localhost cache repo. Kinda nasty hack.
Unfortunately I don't think I will be able to help sort out this issue.
Proxy sounds a bit too hacky. Symlinks are more promising, but not without a lot of complexities and gotchas. We're certainly open to PRs that can think through those issues, but for now, periodically clearing the cache as documented here is hopefully a good-enough workaround.
I'm also interested in this problem. In my case, we have throttled speed to the git server and a fresh pull every time gets slow really fast.
I may have time to invest in this for an MR if we have a favorable approach.
A PR with a proposal (e.g., just written in a README) that thinks through all the corner cases I mentioned above is welcome!
To help with the problem: if you are using Terraform 0.12, you can add depth=1 as a parameter to your source URL so that Terraform does only a shallow clone of the git repo. Especially when combined with the plugin cache mentioned earlier, this really cut down my disk space usage.
e.g:
terraform {
  source = "git::https://github.com/lgallard/terraform-aws-cognito-user-pool.git//?ref=0.4.0&depth=1"
}
It's notable that the plugin cache uses hard links, at least in some cases, so measuring each directory separately (for example with du) overstates how much space is really used. Notice that the inode numbers at the start of this ls output are identical:
$ ls -i ~/.terraform.d/plugin_cache/linux_amd64/terraform-provider-aws_v2.60.0_x4
55312406 /home/jfharden/.terraform.d/plugin_cache/linux_amd64/terraform-provider-aws_v2.60.0_x4
$ ls -i .terragrunt-cache/TmsRQq5jb8Fikqhb4v0N_CPOD8Y/wvSG5F9NOzb3ZsP4sykMWVf-V1c/.terraform/plugins/linux_amd64/terraform-provider-aws_v2.60.0_x4
55312406 .terragrunt-cache/TmsRQq5jb8Fikqhb4v0N_CPOD8Y/wvSG5F9NOzb3ZsP4sykMWVf-V1c/.terraform/plugins/linux_amd64/terraform-provider-aws_v2.60.0_x4
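To measure how much space is genuinely consumed, each inode should be counted only once. A minimal sketch using only the Python standard library; the function name disk_usage is made up for illustration:

```python
import os
import stat


def disk_usage(root):
    """Return (apparent_bytes, unique_bytes) for regular files under root.

    apparent_bytes counts every path it finds; unique_bytes counts each
    inode once, so hard-linked copies are not double-counted.
    """
    apparent = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and other special files
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)  # identifies the file on disk
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique
```

Running it over a directory that contains both the plugin cache and the .terragrunt-cache folders shows the gap between the two numbers. Note that it sums file sizes rather than allocated blocks, so it is an approximation.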
I'm pretty convinced symlinking is going to cause all kinds of trouble, especially with the generators creating provider files etc. inside the module directory, but here is one possible solution, which does have some caveats:
The repos could be cloned into a cache directory, something like ~/.terragrunt-cache/modules/github.com/owner/repo.git/<gitref>/, and then hardlinked instead of symlinked. Orchestrating this yourself would be painful, but if you rsync the directory you can use the --link-dest option, which deals with all the intricacies. This cuts the amount of disk space used dramatically if the same module has been cloned more than once, or if the same repo contains multiple modules.
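The hard-link idea can be sketched without rsync. This is a standard-library approximation of the approach, not rsync's --link-dest itself, and the function name link_tree is made up for illustration. It assumes the cache and the working directory are on the same filesystem, since hard links cannot cross filesystems.

```python
import os


def link_tree(cache_dir, dest_dir):
    """Recreate cache_dir under dest_dir, hard-linking files instead of copying them."""
    for dirpath, _dirnames, filenames in os.walk(cache_dir):
        rel = os.path.relpath(dirpath, cache_dir)
        target_dir = os.path.normpath(os.path.join(dest_dir, rel))
        os.makedirs(target_dir, exist_ok=True)  # directories are real, not linked
        for name in filenames:
            target = os.path.join(target_dir, name)
            if not os.path.exists(target):
                # Both names now point at the same inode: no extra data on disk.
                os.link(os.path.join(dirpath, name), target)
```

Because the directories themselves are real, files generated later inside the working copy stay local to it. A tool that edits a linked file in place, rather than replacing it, would change the cached copy too.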
The caveats here are:
What we ended up doing internally is to create a really small wrapper around terragrunt.
This tool fetches all sources, clones them into ~/path-to-cache-dir-/source-name/<gitref>, then runs apply through the wrapper, using --terragrunt-source to specify the source.
This is highly tailored to our use case, directory structure, and module sources :(
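The path-mapping part of such a wrapper might look roughly like the following. This is a guess at the shape, not the poster's actual tool: the function names, the "HEAD" fallback when no ref is given, and the cache layout are all assumptions, and the cloning step itself is left out.

```python
import os
from urllib.parse import parse_qs, urlsplit


def cache_dir_for(source, cache_root):
    """Map a git source string to cache_root/<source-name>/<gitref>."""
    url = source[len("git::"):] if source.startswith("git::") else source
    parts = urlsplit(url)
    repo_path = parts.path.split("//", 1)[0]  # drop the //subdirectory part
    name = os.path.basename(repo_path)
    if name.endswith(".git"):
        name = name[: -len(".git")]
    # Assumption: fall back to "HEAD" when the source pins no ref.
    ref = parse_qs(parts.query).get("ref", ["HEAD"])[0]
    return os.path.join(cache_root, name, ref)


def terragrunt_command(action, source, cache_root):
    """Build the argv that points terragrunt at the cached checkout."""
    return ["terragrunt", action, "--terragrunt-source", cache_dir_for(source, cache_root)]
```

For the source shown earlier in this thread, with ref=0.4.0, cache_dir_for places the checkout under terraform-aws-cognito-user-pool/0.4.0 inside the cache root.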
I wonder if leveraging git's reference feature may help us here? Or it's at least worth exploring... Somehow fetch it once and force all the others to be reference clones.
git clone --reference
https://randyfay.com/content/reference-cache-repositories-speed-clones-git-clone-reference
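As a rough illustration of how a tool could drive reference clones, here is a sketch that builds the git command line. The function names and paths are hypothetical. It assumes the ref is a branch or tag name, since git clone --branch does not accept a bare commit hash.

```python
import subprocess


def reference_clone_command(url, ref, reference_repo, dest):
    """Build a git clone argv that borrows objects from a local reference repo."""
    return [
        "git", "clone",
        "--reference", reference_repo,  # reuse objects already fetched once
        "--branch", ref,                # branch or tag to check out
        url, dest,
    ]


def reference_clone(url, ref, reference_repo, dest):
    """Run the clone; raises CalledProcessError if git fails."""
    subprocess.run(reference_clone_command(url, ref, reference_repo, dest), check=True)
```

A reference clone keeps borrowing objects from the reference repository, so deleting that repository breaks the clones that depend on it.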