_This issue was originally opened by @karthik-papajohns as hashicorp/terraform#18073. It was migrated here as a result of the provider split. The original body of the issue is below._
Terraform v0.11.5
...
https://cloud.google.com/dataflow/pipelines/specifying-exec-params
https://www.terraform.io/docs/providers/google/r/dataflow_job.html
I have found a dirty work-around, editing the pipeline's template JSON directly.

At the top, in `"options"`:

```json
"zone" : null,
"workerMachineType" : "n1-standard-1",
"gcpTempLocation" : "gs://dataflow-staging-us-central1-473832897378/temp/",
```
Again, at the bottom of `"sdkPipelineOptions"`:

```json
}, {
  "namespace" : "org.apache.beam.runners.dataflow.options.DataflowPipelineOptions",
  "key" : "templateLocation",
  "type" : "STRING",
  "value" : "gs://dataflow-templates-staging/2018-10-08-00_RC00/PubSub_to_BigQuery"
}, {
  "namespace" : "org.apache.beam.runners.dataflow.options.DataflowPipelineWorkerPoolOptions",
  "key" : "workerMachineType",
  "type" : "STRING",
  "value" : "n1-standard-1"
} ]
},
```
And finally, in `"workerPools"`:

```json
"dataDisks" : [ { } ],
"machineType" : "n1-standard-1",
"numWorkers" : 0,
```
I realise it is a bit hacky, but it works: the pipeline gets successfully deployed on an n1-standard-1 Compute Engine instance instead of the default n1-standard-4.
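For completeness, here is a minimal sketch of how such a patched template copy might then be consumed from Terraform. The bucket paths, job name, and template parameter names below are assumptions for illustration, not values from this issue:

```hcl
# Sketch: launch a Dataflow job from a hand-patched copy of the template.
# Every path and parameter name here is a placeholder/assumption.
resource "google_dataflow_job" "pubsub_to_bq" {
  name = "pubsub-to-bq"

  # Point at the edited copy of the template JSON rather than the stock
  # template, so the workerMachineType baked into it takes effect.
  template_gcs_path = "gs://my-bucket/templates/PubSub_to_BigQuery-patched"
  temp_gcs_location = "gs://my-bucket/temp"

  parameters = {
    inputTopic      = "projects/my-project/topics/my-topic" # assumed parameter name
    outputTableSpec = "my-project:my_dataset.my_table"      # assumed parameter name
  }
}
```

Note that because the machine type is baked into the patched copy, it applies to every job launched from that template, which is part of what makes this approach hacky.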
machine_type is now configurable. The others aren't yet.
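For anyone landing here later, a minimal sketch of the now-supported approach; the job name and bucket path are placeholders:

```hcl
# machine_type is now a first-class argument on google_dataflow_job,
# so the template-editing hack above is no longer needed for it.
resource "google_dataflow_job" "pubsub_to_bq" {
  name              = "pubsub-to-bq"
  template_gcs_path = "gs://dataflow-templates/latest/PubSub_to_BigQuery"
  temp_gcs_location = "gs://my-bucket/temp" # placeholder bucket
  machine_type      = "n1-standard-1"
}
```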
Is there any planned timeline for making diskSizeGb configurable as well? As documented in Google Dataflow's common error guidance, we'd like to be able to manage the workers' disk size when managing Dataflow jobs with Terraform.
Another nice-to-have would be the ability to set the number of workers.
+1, would like to be able to set disk_size_gb and worker_disk_type
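To make the request concrete, here is a sketch of what the desired configuration might look like. `disk_size_gb`, `worker_disk_type`, and `num_workers` are hypothetical argument names drawn from this thread's requests; they do not exist on `google_dataflow_job` at the time of writing:

```hcl
# HYPOTHETICAL sketch of the requested arguments: disk_size_gb,
# worker_disk_type, and num_workers do NOT exist on google_dataflow_job
# at the time of writing. Paths and names are placeholders.
resource "google_dataflow_job" "pubsub_to_bq" {
  name              = "pubsub-to-bq"
  template_gcs_path = "gs://dataflow-templates/latest/PubSub_to_BigQuery"
  temp_gcs_location = "gs://my-bucket/temp"
  machine_type      = "n1-standard-1"

  disk_size_gb     = 50            # hypothetical: worker boot disk size in GB
  worker_disk_type = "pd-standard" # hypothetical: worker disk type
  num_workers      = 2             # hypothetical: initial worker count
}
```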