Packer: Azure storage account type for OS & additional data disks not set correctly when capturing managed images in 1.3.4

Created on 12 Feb 2019 · 40 comments · Source: hashicorp/packer

Seems like 1.3.4 has a regression where the storage account type for both the OS and additional data disks defaults to Standard_LRS even when managed_image_storage_account_type is set to Premium_LRS.

Will work on getting you guys the debug log output and a repro case. I'm a little frustrated to be reporting regressions in both 1.3.3 and 1.3.4. :disappointed:

builder/azure · regression · upstream-bug

All 40 comments

The repro template is essentially the same as the one I reported for #7077: https://gist.github.com/danports/419e4ea77fdb36cc455c0a4358409e1a

The storage account type for both OS and additional data disks is getting set correctly on 1.3.2.
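For anyone who wants to poke at this without the gist, a minimal azure-arm template exercising the setting looks roughly like the sketch below. Every value is a placeholder (not taken from the gist); disk_additional_size is included only so the build captures a data disk in addition to the OS disk.

# minimal repro sketch; all credentials and names below are placeholders
cat > repro.json <<'EOF'
{
  "builders": [{
    "type": "azure-arm",
    "client_id": "<client id>",
    "client_secret": "<client secret>",
    "tenant_id": "<tenant id>",
    "subscription_id": "<subscription id>",
    "managed_image_resource_group_name": "packer-images",
    "managed_image_name": "repro-image",
    "managed_image_storage_account_type": "Premium_LRS",
    "os_type": "Linux",
    "image_publisher": "Canonical",
    "image_offer": "UbuntuServer",
    "image_sku": "18.04-LTS",
    "location": "East US",
    "vm_size": "Standard_DS2_v2",
    "disk_additional_size": [50]
  }]
}
EOF
packer build repro.json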

Sorry @danports. You're right. We are working to address this.

I can confirm the issue in 1.3.4. I re-ran against master and the issue has been fixed.

/cc @paulmey

Thanks @boumenot. Is the fix available in any of the public Docker builds yet?

Unfortunately, I am not knowledgeable about that side of the house. I suspect only releases are made available, but @SwampDragons would know.

That's correct. We only update the public docker builds for true releases. I am happy to generate a binary for you to use in the meantime.

Sure, if you have one, I could give it a shot. I upgraded to 1.3.4 for the #7077 fix only to discover this and downgrade back to 1.3.2 again. :expressionless:

Not sure which architecture you need so...

windows: packer.zip
osx: packer.zip
linux: packer.zip

Yep, running a sample build now, looks like it is working. Thanks! Any idea when 1.3.5 will be out?

Nope, I spoke too soon - this is still broken with the build provided. The temporary disks spun up by Packer are premium disks, but the disks in the managed image Packer captures are standard disks.

I may be misunderstanding you. The type of disk is not intrinsic to the image. Once the image is captured you can deploy it to standard or premium disks.

It looks like packer/builder/azure/arm/Config.toImageParameters() hasn't changed for a while. I wonder if the implementation in Azure compute started using different defaults. Seeing if I can get a repro...

@boumenot Hmm, I will try to deploy a standard HDD image to premium disks and see if that works. What is the purpose of the managed_image_storage_account_type parameter if not to configure the managed image storage account type? Is it only there to configure the storage account type for disks attached to the temporary VM Packer spins up?

@paulmey Was starting to wonder the same thing...

Last time I checked, the Azure API call to capture a managed image didn't have any storage account type parameters - I assumed it used the storage account types of the disks attached to the live VM, but maybe that is not true, or no longer true.

I had to refresh my memory too. I walked through the Portal, and I was prompted for Standard vs. Premium when I tried to deploy the image.

The intent of the setting was to give users the ability to deploy their builds on standard or premium storage. I am aware of some builds that go on for many hours (>8), so any performance boost is worth it. At the same time, some people don't need or want to pay for the small price increase so we expose a knob.

Thanks @boumenot. So to clarify, it seems like there are three concepts at play here:

  1. The storage account type of the disks attached to the temporary VM Packer spins up: This is controlled by managed_image_storage_account_type and works fine in 1.3.4.
  2. The storage account type of the managed image Packer captures. The Managed Disk pricing page is clear that you can choose to store your images on either standard or premium storage.
  3. The storage account type of the disks attached to VMs deployed from the managed image Packer captures: I was unclear on this earlier, but thanks to @boumenot's comment, I confirmed that the Azure API has parameters that allow deployment of a managed image to any storage account type, so we're good here.

The only remaining question then is whether managed_image_storage_account_type is intended to control the storage account type of the managed image. If it is, then there's still a bug here. (If not, then its name and explanation in the documentation are highly confusing!) That said, this isn't a blocker for us now that I've realized we can deploy managed images stored on standard storage to premium storage. I imagine that might lengthen deployment times slightly (vs. deploying from an image stored on SSDs), but probably not significantly.
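For reference, deploying the captured image onto premium storage with the Azure CLI looks roughly like the sketch below (not taken from this thread; the resource group, image, and VM names are placeholders, and --storage-sku selects the storage account type for the deployed OS disk):

# sketch only: all names below are placeholders
IMAGE_ID=$(az image show -g packer-images -n repro-image --query id -o tsv)
az vm create \
  -g app-rg -n app-vm \
  --image "$IMAGE_ID" \
  --size Standard_DS2_v2 \
  --storage-sku Premium_LRS \
  --admin-username azureuser \
  --generate-ssh-keys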

Actually...there is still a problem here with item 3 above: the Azure portal (and, from what I can tell, the API) doesn't allow selecting the storage account type for additional data disks included in the managed image generated by Packer when deploying a new VM. You can select the OS disk type, but the data disk type appears to default to the managed image storage account type, and I don't see a way to change that anywhere:
[screenshot: Azure portal VM creation showing an OS disk type selector but no option to change the data disk type]

So that means the storage account type of the managed image matters, and this is still a blocker for us. In 1.3.2, managed_image_storage_account_type controlled the storage account type of the managed image, but it doesn't appear to do that anymore in 1.3.4 (or something changed on the Azure side), so right now it's not possible for us to deploy a VM from a managed image with premium storage data disks attached. :disappointed:

I rebuilt a template with Packer 1.3.2 just now and the managed image was correctly captured with premium storage, so I think we can probably rule out the possibility of a change on the Azure side causing this bug - something changed in Packer in 1.3.3/1.3.4 that broke this scenario.

Thanks for that info, still looking...

I can confirm that I hit this bug yesterday too with Packer 1.3.4. Downgrading to 1.3.2 fixed it.
I'm confused about this issue being closed, since reading the comments it doesn't seem to be fixed.
Thanks

I also repro'd this. Still looking into what caused this to regress...

I poked around a little bit more and confirmed that the Azure deployment templates Packer is generating on 1.3.2 and on master are identical, as expected. I haven't sniffed the API requests going over the wire, but I don't think that there are any differences in the capture managed image API calls between the two versions either. So at the moment, the most likely explanation IMO is a bug in the Azure API that causes the storage account type for disks in a managed image to be set incorrectly when those disks have their host caching mode set to a non-default value (anything besides ReadWrite).

Have you made any progress, @paulmey?

I tested a bunch of different permutations just now - here's what you wind up with in your managed image for each, assuming you've specified "managed_image_storage_account_type": "Premium_LRS" in your template (by master I mean the build @SwampDragons posted earlier in this issue):

  • 1.3.2 with no disk_caching_type specified (you can't specify one in this version): Premium disk with read/write caching (correct/expected behavior)
  • master with no disk_caching_type specified: Standard disk with read/write caching
  • master with disk_caching_type set to None: Standard disk with no caching
  • master with disk_caching_type set to ReadOnly: Standard disk with read-only caching

I would test 1.3.3, but can't due to #7077, and I didn't bother with 1.3.4, since it's not much different from master (yet).
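For anyone repeating these checks, the storage account types recorded in the captured image can be read back with the Azure CLI; a quick sketch, with placeholder resource group and image names:

# inspect the disk SKUs baked into the managed image
az image show -g packer-images -n repro-image \
  --query "{os: storageProfile.osDisk.storageAccountType, data: storageProfile.dataDisks[].storageAccountType}" \
  -o json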

Here is a temporary workaround. Once Packer saves the image with a Standard HDD storage account type, you can change it to a premium disk using the script here: https://blogs.technet.microsoft.com/keithmayer/2017/08/17/how-to-azure-managed-vm-images-using-premium-data-disks/

I did try this a few months ago and it worked for me.

I've finally figured this out. I think Azure has a bug when you create an image from a deallocated VM. In Azure you can 'stop' a VM, which keeps the resources reserved for it; this is usually faster, but it can cause a race condition when trying to delete the disks immediately afterwards. I switched to deallocating the VM in #7203, which releases all resources in the Azure platform, precisely to prevent those race conditions later in the workflow.
To get both behaviors, we should stop the VM before capturing the image and then deallocate it before starting cleanup. As an immediate mitigation, we can revert PR #7203.
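To make that ordering concrete, here is roughly what the equivalent sequence looks like with the Azure CLI - just a sketch with placeholder names; Packer itself does this through the SDK, not the CLI:

# capture the image while the VM is merely stopped, not deallocated
az vm stop -g packer-build-rg -n pkrvm-example        # stops the OS but keeps the allocation
az vm generalize -g packer-build-rg -n pkrvm-example
az image create -g packer-build-rg -n repro-image --source pkrvm-example
# only then release the compute allocation before cleaning up the temporary resources
az vm deallocate -g packer-build-rg -n pkrvm-example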

@paulmey Do you have a Docker image with the fix or mitigation that I could test?

It worked! :confetti_ball: Thanks for your detective work @paulmey!

Can I assume that release was built from https://github.com/paulmey/packer/tree/revert-7203? I want to build a Docker image with the fix for our CI pipeline.

I'm having trouble building a Docker image from that branch - is there something wrong with the branch, or am I doing it wrong?
https://cloud.docker.com/repository/registry-1.docker.io/danports/packer-7304-fix/builds/a3e87860-79e9-4a8f-8180-38493837ba39

@danports we just released v1.4.0 so you should now be able to use the official packer docker images.

@SwampDragons I don't think we reverted that commit in v1.4.0. It's a bug on the Azure side, but the timeline for a fix is not entirely clear. While they work on fixing their bug, my private build can be a workaround, or @danports' docker image.
@danports Docker Hub doesn't let me look at your build. In general, you'd do

go get -u github.com/hashicorp/packer
cd $GOPATH/src/github.com/hashicorp/packer
git revert e189db97d
go install

That will pull sources & dependencies, revert the offending commit and build packer again. The binary is then placed in the default $GOPATH/bin directory.
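If you want that reverted binary inside a Docker image for a CI pipeline, one way to produce it is in a throwaway Go container - a rough sketch assuming a GOPATH-era toolchain; the golang:1.11 tag and the out/ path are arbitrary choices, not anything official:

# build a packer binary with e189db97d reverted
mkdir -p out
docker run --rm -v "$PWD/out":/out golang:1.11 sh -c '
  go get -d github.com/hashicorp/packer &&
  cd /go/src/github.com/hashicorp/packer &&
  git revert --no-commit e189db97d &&
  go build -o /out/packer .
'
# then COPY out/packer into whatever base image your pipeline already uses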

@paulmey Thanks for the git revert tip - that worked. I was trying to build a Docker image from my own fork of your Packer fork with the revert already baked in and that's where I ran into trouble. :man_shrugging:

My Docker image is here, if anyone else needs it until a permanent fix is made - I've confirmed that images built with this version correctly set the disk storage account type:
https://hub.docker.com/r/danports/packer-7304-fix

Is the plan to wait for a fix on the Azure end or to make a change in Packer to work around the Azure bug?

Ah, I see. Thanks for catching that; I hadn't read closely enough.

1.4.2 with "managed_image_storage_account_type": "Premium_LRS" results in managed disk type "Standard HDD"

1.4.3 built against yesterday's git also incorrectly delivers "Standard HDD"

Yeah, this is an upstream bug so not something we can fix with new Packer builds. My understanding is we need a fix to the API itself.

Hi @SwampDragons, would you please share details or a link to the upstream bug? Sorry, but it was not clear to me after reading this thread multiple times.

@paulmey is at Microsoft and can probably provide more details than I can.

Any idea why packer won't let me create a disk with the SKU StandardSSD_LRS?

Hey @paulmey, this issue has been open for some time now - any updates?

I did just try the setting and can verify that the Storage account type in the Update replication pane still reads Standard HDD. If I manually (i.e. through the UI) add a new region to replicate to, I do have the option to select Premium SSD or Zone-redundant in addition to Standard HDD.

