terraform --version
Terraform v0.7.1
terraform get
terraform get should have recursively downloaded all modules in the root directory's tree structure.
Instead, terraform get only looks for and downloads modules referenced in the current directory.
.
├── README.md
├── aws
│   ├── modules
│   │   ├── compute
│   │   │   └── nodes
│   │   └── network
│   │       ├── subnet
│   │       └── vpc
│   │           ├── main.tf
│   │           ├── outputs.tf
│   │           └── variables.tf
│   └── project_name
│       ├── app1
│       │   ├── main.tf
│       │   ├── outputs.tf
│       │   └── variables.tf
│       ├── app2
│       └── app3
├── circle.yml
└── main.tf
cat aws/project_name/app1/main.tf
module "vpc" {
  # module source
  source = "../../modules/network/vpc"
  ---snip---
}
terraform get:
bash-4.3# terraform get
bash-4.3#
It works when run at the specific directory level:
bash-4.3# terraform get aws/project_name/.
Get: file:///terraform/aws/modules/network/vpc
Hi @gganeshan! Sorry for the frustrations here.
It's not clear to me exactly what role each of the directories in your layout is playing, but if you wish to run terraform get, terraform plan, and terraform apply at the root and have them also include what's in the other modules, you would need module blocks in that top-level main.tf that instantiate (directly or indirectly) each of the other modules at least once.
With such module blocks in place, a terraform get in the root should start by "fetching" each of the referenced modules (which, since they are local directories, will consist of just symlinking them into the .terraform/modules directory) and then recursively get the dependencies of all of those referenced modules, until the entire tree has been dealt with.
If the above doesn't address your concern, it would help to have a little more information about how your various modules relate to each other: which ones you directly instantiate with terraform apply, and which ones are referenced only as dependencies of other modules.
thanks a lot @apparentlymart for your swift response. This is really helpful and something that I realized as part of my tests yesterday :smile: . I think this information was not so apparent (at least to me) in the terraform documentation for terraform get.
This does not fit very well with the use case I have :cry: .
Maybe you can suggest a better approach for me.
Here is my use case:
This is the reason I thought of having the module definitions in a separate directory, so that they can be used as functions by all these accounts.
└── aws
    └── modules
        ├── compute
        │   └── nodes
        └── network
            ├── subnet
            └── vpc
                ├── main.tf
                ├── outputs.tf
                └── variables.tf
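For illustration, this "modules as functions" idea might be sketched as follows: the same vpc module is instantiated twice with different arguments. The variable names and CIDR values here are hypothetical, not taken from the actual configuration.

```hcl
# aws/modules/network/vpc/variables.tf (hypothetical inputs)
variable "cidr_block" {}
variable "env_name" {}

# In one account's configuration, "call" the module with one set of arguments:
module "vpc_prod" {
  source     = "../../modules/network/vpc"
  cidr_block = "10.0.0.0/16"
  env_name   = "prod"
}

# ...and call the same module "function" again with different arguments:
module "vpc_staging" {
  source     = "../../modules/network/vpc"
  cidr_block = "10.1.0.0/16"
  env_name   = "staging"
}
```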
For the purposes of readability / maintainability / minimizing risk I was hoping to call the module blocks from inside a project-app-environment tree structure:
└── aws
    └── project_name
        ├── app1
        │   ├── main.tf
        │   ├── outputs.tf
        │   └── variables.tf
        ├── app2
        └── app3
The only possible way I can make this work is if I have the structure shown below, which will become very painful (to read and find information in) once the definitions of all 200 accounts are in place.
.
├── README.md
├── aws
│   └── modules
│       ├── compute
│       │   └── nodes
│       └── network
│           ├── subnet
│           └── vpc
│               ├── main.tf
│               ├── outputs.tf
│               └── variables.tf
├── project_name-app1-environment1-main.tf
├── project_name-app1-environment1-variables.tf
├── project_name-app1-environment1-output.tf
├── project_name-app1-environment2-main.tf
├── project_name-app1-environment2-variables.tf
├── project_name-app1-environment2-output.tf
├── project_name-app2-environment1-main.tf
├── project_name-app2-environment1-variables.tf
├── project_name-app2-environment1-output.tf
└── circle.yml
As you can imagine, I will probably end up with thousands of files in the root directory :cry: .
Kindly let me know what you think would be the best approach to tackle my use case?
There are lots of different ways to structure this, but I'll describe just two main ones here.
First, I'll elaborate on what I was assuming in my original message.
If you want to apply your entire stack with a single Terraform run, then in your main.tf at the top level you would have module declarations like this, to instantiate your projects (assuming the original directory structure you posted):
module "project_name" {
  source = "./aws/project_name"
}
Then within the aws/project_name directory you would have another main.tf, with module blocks for your apps, and an AWS provider configuration for that project's AWS account:
provider "aws" {
  # configuration for accessing this project's AWS account
}

module "app1" {
  source = "./app1"
}

module "app2" {
  source = "./app2"
}

module "app3" {
  source = "./app3"
}
Then within each of the apps you can instantiate the shared modules you need:
module "network" {
  source = "../../modules/network"
}
...and proceed in this fashion, creating explicit relationships between the modules until they are all referenced in a hierarchy within Terraform. (Note that this hierarchy does not need to match the directory structure, but it can if you wish.)
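One way to create those explicit relationships is to pass one module's output into a sibling module. This is only a sketch using the pre-0.12 interpolation syntax; the `vpc_id` output and `nodes` module wiring shown here are hypothetical, not part of the configuration above.

```hcl
# aws/project_name/app1/main.tf
module "network" {
  source = "../../modules/network"
}

# A sibling module consuming the network module's (hypothetical)
# vpc_id output via interpolation:
module "nodes" {
  source = "../../modules/compute/nodes"
  vpc_id = "${module.network.vpc_id}"
}

# aws/modules/network/outputs.tf would need a matching output, e.g.:
# output "vpc_id" {
#   value = "${aws_vpc.main.id}"
# }
```

Wiring modules together through outputs like this is also what lets Terraform order their creation correctly within a single run.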
Now if you run terraform get at the root it should get all of the modules used in the entire tree of referenced modules.
It's a little unwieldy to manage an entire stack with one terraform apply... Terraform will need to spend a lot of time refreshing and diffing everything even if you only want to update one project.
So a variant on the above would be to treat each project as an entirely distinct root configuration. In this model you would remove the top-level main.tf, retaining all of the _other_ module references described in my first suggestion above, and then work with each project directory separately:
cd aws/project_name
terraform get
terraform plan -out=tfplan
terraform apply tfplan
cd ../other_project_name
terraform get
terraform plan -out=tfplan
terraform apply tfplan
In this model it's expected that each project will have its own lifecycle and that you wouldn't necessarily update all of them on every run.
In both cases, some sort of automation (Atlas, Jenkins, etc) can be helpful to ensure that they get run consistently every time.
In general I would advise breaking your configuration into smaller units (the second suggestion, possibly taking it one step further and having a separate config per _application_) because it's easier to work with smaller sets of resources than large ones.
@apparentlymart this is great information. Thanks a lot for explaining it in such elaborate detail. I will definitely go with your second solution and let you know if I need any more assistance.
Closing the issue for the time being (will re-open if any more help is required).
Thank you once again.
FYI - I have already created a CircleCI harness for automation.
@apparentlymart thanks for the explanation, but I've got a similar setup and I'm wondering how you would reference one project's output variable inside another one?
@apparentlymart I have a few questions below around the suggested break-up of different components into isolated "projects", each with its own lifecycle. This seems to be the method most people suggest for the greatest flexibility and the least risk of breaking things.
However take the following example below:
a) You have a subproject/component that sets up a new AWS VPC; let's call it /aws/vpc. It stores its state in an S3 bucket under its own key.
b) You have a subproject/component that sets up all the security groups; let's call it /aws/sg. It stores its state in an S3 bucket under its own key.
c) You have a subproject/component that sets up an RDS database in the VPC created in step a), and also uses the security groups created in step b). It stores its state in an S3 bucket under its own key.
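One common way to express these cross-project references, rather than relying on apply order alone, is the `terraform_remote_state` data source, which reads another project's stored state. This is a sketch using the pre-0.12 syntax; the bucket name, key, and region below are hypothetical.

```hcl
# In aws/mysql/main.tf: read the vpc project's state from S3
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config {
    bucket = "my-terraform-state"         # hypothetical bucket name
    key    = "aws/vpc/terraform.tfstate"  # key the vpc project writes to
    region = "us-east-1"
  }
}

# The vpc project must export the value as an output, e.g.:
#   output "vpc_id" { value = "${aws_vpc.main.id}" }
# It can then be consumed here as:
#   vpc_id = "${data.terraform_remote_state.vpc.vpc_id}"
```

Apply order still matters the first time each project is created, but subsequent runs can read their dependencies from state rather than hard-coding values.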
Now let's assume I want to create/apply all these subprojects using the steps you mentioned:
cd aws/vpc
terraform init
terraform plan -out=tfplan
terraform apply tfplan
cd aws/sg
terraform init
terraform plan -out=tfplan
terraform apply tfplan
cd aws/mysql
terraform init
terraform plan -out=tfplan
terraform apply tfplan
etc...
Now, does this mean I would need to know the order in which to apply and destroy changes? As shown above, I would require the VPC to be created first, then the security groups, before creating MySQL, and in the case of destroy I would have to run it in reverse order. Is that correct? I was under the impression that Terraform would figure out the dependencies and handle them accordingly, but that doesn't seem to be the case, unless I'm missing something here?
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.