I've noticed that there's no built-in function to obtain a list of files that match a specific pattern.
The use case that I'd want to solve with such a function is the following:
I'm creating a reusable Terraform module that defines a vault_aws_secret_backend resource and a set of vault_aws_secret_backend_role resources for that backend. It would be really cool if I could simply write the following in my module:
variable "access_key" {
}
variable "secret_key" {
}
variable "role_policy_list" {
type = "list"
}
resource "vault_aws_secret_backend" "main" {
access_key = "${var.access_key}"
secret_key = "${var.secret_key}"
}
resource "vault_aws_secret_backend_role" "role" {
count = "${length(var.role_policy_list)}"
backend = "${vault_aws_secret_backend.main.path}"
name = "${replace(element(var.role_policy_list, count.index),".json", "")}"
policy = "${file(element(var.role_policy_list, count.index))}"
}
And then I could include the module like this:
module "vault_aws_secret" {
source = "..."
access_key = "..."
secret_key = "..."
role_policy_list = "${dirlist("./roles", "*.json")}"
}
I could see this function taking two arguments: a directory to scan and a filename pattern to look for. This seems like it should be fairly straightforward to implement with the filepath.Glob function.
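To sketch what I have in mind (just a rough illustration of the underlying lookup, not the actual interpolation-function plumbing; listFiles is only a placeholder name):

package main

import (
	"fmt"
	"path/filepath"
)

// listFiles returns the files under dir whose names match pattern,
// using filepath.Glob (and therefore filepath.Match glob syntax).
func listFiles(dir, pattern string) ([]string, error) {
	return filepath.Glob(filepath.Join(dir, pattern))
}

func main() {
	matches, err := listFiles("./roles", "*.json")
	if err != nil {
		// filepath.Glob only fails on a malformed pattern.
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(matches) // e.g. [roles/admin.json roles/readonly.json]
}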
I'm sure that there must be other similar use cases that could benefit from this.
I'd be willing to create a PR if there's interest.
Hi @marcboudreau!
This (and functions for path wrangling in general) seems reasonable to me. It reminds me of another discussion we had on another issue (which, unfortunately, I wasn't able to quickly find) about a function that works more like the find command, to find files matching some pattern _recursively_ under a directory.
Given that there's probably a family of configuration functions here, I think it'd be good to think about what a good minimal set of useful functions is and then work through them gradually, just to make sure we end up with a set of functions that complement each other well.
We're currently in the midst of revamping the configuration language parser and expression interpreter, so at this time we're being a bit more cautious about adding new interpolation functions (they'll all require some tweaking to work with the new system), but that aside I think this is a good idea and we could start to think about what a good set of functions might look like. I'm thinking about things like Go's filepath.Join, filepath.Base, etc., so that we can support useful combinations like listdir(pathjoin("./roles", "*.json")), where each function solves a specific, well-defined problem.
I like the idea of having a single input parameter, which is the pattern. It would be more flexible and slightly simpler to implement. In that case, perhaps renaming the function to listfiles(pattern) might make more sense.
Looking at the filepath package, here are some additional potential candidate functions:
- cleanpath(path), calls filepath.Clean to lexically process the provided path and remove unnecessary path elements
- abspath(path), returns an absolute path that is equivalent to path
- joinpath(path1, path2, ...) or, as you called it, pathjoin, joins the provided path elements into a single path
- relpath(basepath, targetpath), returns a path relative to basepath that is equivalent to targetpath when joined to basepath

I also looked at the os package for some inspiration, and this might be useful too:

- tempdir(), returns the system's preferred directory for temporary files

(A rough Go sketch of what these would wrap follows at the end of this comment.)

I'd like to contribute to the project, so I don't mind putting a PR together. Should the merits of each of these candidates be discussed in this issue first, or is it OK to proceed with a PR and we can see what makes the cut there?
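For reference, here's a rough Go sketch of what each candidate would wrap in the standard library (just an illustration of the underlying calls, not Terraform's internal function API; the names are the proposed ones above):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Each proposed function is a thin wrapper over path/filepath (or os).
func cleanpath(p string) string                   { return filepath.Clean(p) }
func abspath(p string) (string, error)            { return filepath.Abs(p) }
func joinpath(parts ...string) string             { return filepath.Join(parts...) }
func relpath(base, target string) (string, error) { return filepath.Rel(base, target) }
func tempdir() string                             { return os.TempDir() }

func main() {
	fmt.Println(cleanpath("./roles/../policies")) // "policies"
	fmt.Println(joinpath("roles", "admin.json"))  // "roles/admin.json" (on Unix)
	rel, _ := relpath("/srv/app", "/srv/app/roles")
	fmt.Println(rel)       // "roles"
	fmt.Println(tempdir()) // e.g. "/tmp"
}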
Hi @marcboudreau,
Thanks for putting that list together. That seems like a good list to start with, though I think I'd hold off on tempdir, since it is likely to encourage placing files outside of the configuration directory, and that tends to be problematic in automation scenarios where the plan and apply may not happen on the same computer.
I wonder if we could skip cleanpath by just making sure cleaning happens as a side-effect of all of the other functions. If someone wants to _just_ clean a path without doing any other operation, that could be done with a single-argument call to joinpath, at the expense of a little more obscurity. I don't expect that _just_ cleaning a path would be a common operation, so this feels okay to me, but I'm curious what you think.
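For what it's worth, Go's filepath.Join is documented as cleaning its result, so the single-argument joinpath form would behave exactly like a clean operation; a quick illustration:

package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	// Join cleans its result, so a single argument comes back lexically simplified.
	fmt.Println(filepath.Join("./roles/../policies/admin.json")) // "policies/admin.json"
}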
We could potentially also drop abspath, since we'd tend to discourage the use of absolute paths anyway: they can cause problems when a state file is moved between machines where the configuration may be at a different location. For edge cases where such things are required, similar functionality (though indeed not quite the same) could be had with pathjoin(path.cwd, some_relative_path).
I also think I prefer the aesthetic of having the names all start with path so that they group together nicely in an alphabetical list, similar to the cidr... family of functions:
- pathjoin(paths...)
- pathrel(base, target)

I like the suggestion of using the noun "files" in the list function rather than "dir". For similar aesthetic reasons, I'd suggest we call it filelist so that it groups nicely with file:

- filelist(pathjoin(path.module, "*.json")) for just the matching entries in a given dir
- filelist(pathjoin(path.module, "*")) for everything in a given dir, without recursion
- filelist(pathjoin(path.module, "**/*")) for everything in a given dir, recursively

With these functions, this would enable patterns like the following:
locals {
  static_src_root = "${pathjoin(path.module, "static")}"
  static_files    = "${filelist(pathjoin(local.static_src_root, "**/*"))}"
}

resource "aws_s3_bucket_object" "example" {
  count  = "${length(local.static_files)}"
  bucket = "${var.s3_bucket_name}"
  key    = "${pathrel(local.static_src_root, local.static_files[count.index])}"
  source = "${local.static_files[count.index]}"

  # Could later add a MIME-type-sniffing function to help populate `content_type` using
  # https://godoc.org/net/http#DetectContentType , but should save that for a separate
  # change.
}
As noted previously, we're trying to minimize changes to the set of interpolation functions right now, since they will all need to be written in a different way for the new configuration language and each new function is something additional to port. With that said, these functions will probably end up being thin wrappers around Go's path/filepath functions, so if you are motivated to work on them now we could make an exception: I agree this is all useful functionality to enable patterns like my example above, and it shouldn't be too challenging to port them to the new function system once we get there.
Thanks for the feedback @apparentlymart. I'm sorry it took so long to get back to you; I was away on vacation. I agree with all the suggestions you made, especially reversing the names so that file and path appear at the beginning of the name.
I'll start working on a PR over the next couple of days.
Thanks again
In case anyone comes across this thread and is desperate for a way to iterate over a directory, I threw together https://github.com/jakexks/terraform-provider-glob
This can be closed by #22523, which will be included in the next Terraform release. Thanks!
To follow up on this feature request: the fileset() function is now available as of Terraform v0.12.8, released yesterday.
Here's a full example.
Given the following file layout:
.
├── main.tf
├── subdirectory1
│   ├── anothersubdirectory1
│   │   └── anothersubfile.txt
│   ├── subfile1.txt
│   └── subfile2.txt
└── subdirectory2
    └── subfile3.txt
And the following Terraform configuration:
terraform {
  required_providers {
    aws = "2.26.0"
  }

  required_version = "0.12.8"
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "test" {
  acl           = "private"
  bucket_prefix = "fileset-testing"
}

resource "aws_s3_bucket_object" "test" {
  for_each = fileset(path.module, "**/*.txt")

  bucket = aws_s3_bucket.test.bucket
  key    = each.value
  source = "${path.module}/${each.value}"
}

output "fileset-results" {
  value = fileset(path.module, "**/*.txt")
}
Terraform successfully maps this file structure into S3:
$ terraform apply
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create
Terraform will perform the following actions:
  # aws_s3_bucket.test will be created
  + resource "aws_s3_bucket" "test" {
      + acceleration_status         = (known after apply)
      + acl                         = "private"
      + arn                         = (known after apply)
      + bucket                      = (known after apply)
      + bucket_domain_name          = (known after apply)
      + bucket_prefix               = "fileset-testing"
      + bucket_regional_domain_name = (known after apply)
      + force_destroy               = false
      + hosted_zone_id              = (known after apply)
      + id                          = (known after apply)
      + region                      = (known after apply)
      + request_payer               = (known after apply)
      + website_domain              = (known after apply)
      + website_endpoint            = (known after apply)

      + versioning {
          + enabled    = (known after apply)
          + mfa_delete = (known after apply)
        }
    }

  # aws_s3_bucket_object.test["subdirectory1/anothersubdirectory1/anothersubfile.txt"] will be created
  + resource "aws_s3_bucket_object" "test" {
      + acl                    = "private"
      + bucket                 = (known after apply)
      + content_type           = (known after apply)
      + etag                   = (known after apply)
      + id                     = (known after apply)
      + key                    = "subdirectory1/anothersubdirectory1/anothersubfile.txt"
      + server_side_encryption = (known after apply)
      + source                 = "./subdirectory1/anothersubdirectory1/anothersubfile.txt"
      + storage_class          = (known after apply)
      + version_id             = (known after apply)
    }

  # aws_s3_bucket_object.test["subdirectory1/subfile1.txt"] will be created
  + resource "aws_s3_bucket_object" "test" {
      + acl                    = "private"
      + bucket                 = (known after apply)
      + content_type           = (known after apply)
      + etag                   = (known after apply)
      + id                     = (known after apply)
      + key                    = "subdirectory1/subfile1.txt"
      + server_side_encryption = (known after apply)
      + source                 = "./subdirectory1/subfile1.txt"
      + storage_class          = (known after apply)
      + version_id             = (known after apply)
    }

  # aws_s3_bucket_object.test["subdirectory1/subfile2.txt"] will be created
  + resource "aws_s3_bucket_object" "test" {
      + acl                    = "private"
      + bucket                 = (known after apply)
      + content_type           = (known after apply)
      + etag                   = (known after apply)
      + id                     = (known after apply)
      + key                    = "subdirectory1/subfile2.txt"
      + server_side_encryption = (known after apply)
      + source                 = "./subdirectory1/subfile2.txt"
      + storage_class          = (known after apply)
      + version_id             = (known after apply)
    }

  # aws_s3_bucket_object.test["subdirectory2/subfile3.txt"] will be created
  + resource "aws_s3_bucket_object" "test" {
      + acl                    = "private"
      + bucket                 = (known after apply)
      + content_type           = (known after apply)
      + etag                   = (known after apply)
      + id                     = (known after apply)
      + key                    = "subdirectory2/subfile3.txt"
      + server_side_encryption = (known after apply)
      + source                 = "./subdirectory2/subfile3.txt"
      + storage_class          = (known after apply)
      + version_id             = (known after apply)
    }
Plan: 5 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes
aws_s3_bucket.test: Creating...
aws_s3_bucket.test: Creation complete after 2s [id=fileset-testing20190905121318114700000001]
aws_s3_bucket_object.test["subdirectory2/subfile3.txt"]: Creating...
aws_s3_bucket_object.test["subdirectory1/subfile1.txt"]: Creating...
aws_s3_bucket_object.test["subdirectory1/anothersubdirectory1/anothersubfile.txt"]: Creating...
aws_s3_bucket_object.test["subdirectory1/subfile2.txt"]: Creating...
aws_s3_bucket_object.test["subdirectory2/subfile3.txt"]: Creation complete after 0s [id=subdirectory2/subfile3.txt]
aws_s3_bucket_object.test["subdirectory1/subfile2.txt"]: Creation complete after 0s [id=subdirectory1/subfile2.txt]
aws_s3_bucket_object.test["subdirectory1/subfile1.txt"]: Creation complete after 0s [id=subdirectory1/subfile1.txt]
aws_s3_bucket_object.test["subdirectory1/anothersubdirectory1/anothersubfile.txt"]: Creation complete after 0s [id=subdirectory1/anothersubdirectory1/anothersubfile.txt]
Apply complete! Resources: 5 added, 0 changed, 0 destroyed.
Outputs:

fileset-results = [
  "subdirectory1/anothersubdirectory1/anothersubfile.txt",
  "subdirectory1/subfile1.txt",
  "subdirectory1/subfile2.txt",
  "subdirectory2/subfile3.txt",
]
For any bug reports or feature requests related to fileset() functionality, please file a new GitHub issue. Otherwise, for general questions about this functionality, please reach out on the community forums. Enjoy!
I'm going to lock this issue because it has been closed for _30 days_. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.