Terraform-provider-azurerm: Support for File paths (and ACLs) in ADLS Gen 2 storage accounts

Created on 28 May 2020  ยท  10Comments  ยท  Source: terraform-providers/terraform-provider-azurerm

Community Note

  • Please vote on this issue by adding a ๐Ÿ‘ reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

We currently have the azurerm_storage_data_lake_gen2_filesystem resource for initialising ADLS Gen2 filesystems, but lack the ability to manage paths and ACLs with the provider.

An example of where this would be useful is provisioning ADLS Gen 2 accounts and Azure Databricks via azurerm provider and then configuring Databricks mounts using the databricks provider from databrickslabs. For this scenario we need to be able to create the paths in the ADLS account and set the appropriate ACLs to allow access.

New or Affected Resource(s)

  • azurerm_storage_data_lake_gen2_path
  • azurerm_storage_data_lake_gen2_path_acl

Potential Terraform Configuration

Some initial thoughts on what the new resources might look like are below.

#
# Existing resource types for context
#

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_storage_account" "example" {
  name                     = "examplestorageacc"
  resource_group_name      = azurerm_resource_group.example.name
  location                 = azurerm_resource_group.example.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  account_kind             = "StorageV2"
  is_hns_enabled           = "true"
}

resource "azurerm_storage_data_lake_gen2_filesystem" "example" {
  name               = "example"
  storage_account_id = azurerm_storage_account.example.id
}

#
# New resource types
#

# create a path item (folder)
# future - allow create files by uploading contents?
resource "azurerm_storage_data_lake_gen2_path" "example" {
  storage_account_id = azurerm_storage_account.example.id
  filesystem_name               = azurerm_storage_data_lake_gen2_filesystem.name
  path = "my-path"
}

# set ACLs on a path
resource "azurerm_storage_data_lake_gen2_path_acl" "example" {
  storage_account_id = azurerm_storage_account.example.id
  filesystem_name               = "example"
  path = azurerm_storage_data_lake_gen2_path.example.path
  ace {
    scope = "default"
    type = "user"
    id = "[email protected]"
    permissions  = "rwx"
  }
  ace {
    type = "user"
    id = "stuart@contosocom"
    permissions  = "rwx"
  }
  ace {
    scope = "default"
    type = "group"
    id = "[email protected]"
    permissions  = "r--"
  }
  ace {
    type = "group"
    id = "[email protected]"
    permissions  = "r--"
  }
  ace {
    type = "other"
    permissions  = "--"
  }
  ace {
    type = "mask"
    permissions  = "rwx"
  }
 # resulting ACL string: "default:user:[email protected]:rwx,user:[email protected]:rwx,default:group:[email protected]:r--,group:[email protected]:other::---;mask:rwx"
}

References

new-resource servicstorage

Most helpful comment

Quick note to say that I've started work on this...

All 10 comments

UPDATE: After discussing with @tombuildsstuff the plan is to combine the path and ACL resources into a single resource:

resource "azurerm_storage_data_lake_gen2_path" "example" {
  storage_account_id = azurerm_storage_account.example.id
  filesystem_name               = azurerm_storage_data_lake_gen2_filesystem.name
  path = "my-path"

  ace {
    scope = "default"
    type = "user"
    id = "[email protected]"
    permissions  = "rwx"
  }
  ace {
    type = "user"
    id = "stuart@contosocom"
    permissions  = "rwx"
  }
  ace {
    scope = "default"
    type = "group"
    id = "[email protected]"
    permissions  = "r--"
  }
  ace {
    type = "group"
    id = "[email protected]"
    permissions  = "r--"
  }
  ace {
    type = "other"
    permissions  = "--"
  }
  ace {
    type = "mask"
    permissions  = "rwx"
  }
 # resulting ACL string: "default:user:[email protected]:rwx,user:[email protected]:rwx,default:group:[email protected]:r--,group:[email protected]:other::---;mask:rwx"
}

The plan is to make the ace list optional and computed to allow you to specify just a path that should be created and have it inherit the permissions from the parent folder's permissions.

Quick note to say that I've started work on this...

This is turning out to be a bigger task than I first thought (and I've had a couple of things that have taken some time away from working on it). My plan is to split it up into smaller pieces of work that are potentially mergeable as I'm keen not to end up with a load of WIP that goes stale if I haven't managed to finish before my team's next engagement starts

My plan is:

  1. directory support only (I'm close to having the resource done for this with associated giovanni changes)
  2. add support for ACLs
  3. add support for files (this brings in uploading potentially large files, handling content type properties etc)

Once 1 is done it opens up a scenario where you can create a Data Lake and set up directories with appropriate permissions using RBAC at the file system level for Databricks and other compute to be able to access and feels like a starting scenario to enable. Adding in 2 would allow you to set permissions with ACLs at a more granular level (folders within the file system) and feels like a fairly compelling scenario.

@stuartleeks thanks for this!

Directories on ACL (including default ACL) should cover 99.99% of use-cases that I see in our environments, all of which are the ACL + default ACL on the top-level directory (not /, but one below it).

Thanks @alex-goncharov - good feedback to get!

I've got an initial PR on the storage SDK that is used and corresponding WIP PR on the provider that delivers part 1 (directory creation). If they are both ok and can be merged then I might get time to do part 2 (the ACL support for directories) ๐Ÿคž

I've pushed an update to the PR that adds part 2, i.e. support for ACLs :-)

I see that this PR in go-autorest is merged and looks like it has been released. Does that mean that we can proceed with updating Giovanni to use the latest version of go-autorest and then update the Giovanni version in azurerm? ๐Ÿคž

Fixed via #7521

This has been released in version 2.37.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.37.0"
}
# ... other configuration ...

I'm going to lock this issue because it has been closed for _30 days_ โณ. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error ๐Ÿค– ๐Ÿ™‰ , please reach out to my human friends ๐Ÿ‘‰ [email protected]. Thanks!

Was this page helpful?
0 / 5 - 0 ratings