Instead of having to upload Glue scripts to S3 manually, it would be nice if it could be done with the aws cloudformation package command, the same way it's done for the AWS::CloudFormation::Stack or AWS::Lambda::Function resource types:
Resources:
  MyGlueJob:
    Type: AWS::Glue::Job
    Properties:
      Name: MyGlueJob
      Command:
        ScriptLocation: ./gluescript.scala
        Name: glueetl
      ...
@ilyasotkov Thanks for reaching out. This seems like it would be very helpful. I have changed the label to feature request. @sanathkr what are your thoughts on this idea?
What are the current workarounds?
Kill me, but I can't find how to provide inline code for Glue in a CloudFormation template. I refuse to believe that the only option is to create a bucket manually, upload the script, and then, hoping that I get the path right, paste it into the CloudFormation template.
We should implement this. It should be a few lines of code change and some tests change. Could be a good starter project for someone. PRs welcome!
My mates need this feature and I gave it a five-minute investigation, which I'll share here.
The relevant bit is artifact_exporter.py. I thought about creating a new subclass of Resource, just like ServerlessFunctionResource, but then I realised that the properties of all supported resources are top level, while with a Glue script we need to go one level down: AWS::Glue::Job.JobCommand.ScriptLocation. It might be trivial, but I don't think this is currently supported by the Resource class.
Happy to be proven wrong of course!
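To make the top-level versus nested distinction above concrete, here is a minimal sketch using plain dictionaries. The property dicts and the S3 URLs are made up for illustration, and this is not the actual artifact_exporter.py code; it only shows why a single-key lookup is not enough for Command.ScriptLocation.

```python
# Hypothetical "Properties" dicts, roughly as they might look after the
# template has been parsed. All values are illustrative placeholders.
lambda_props = {"Handler": "index.handler", "Code": "./src"}
glue_props = {
    "Name": "MyGlueJob",
    "Command": {"Name": "glueetl", "ScriptLocation": "./gluescript.scala"},
}

# Top-level property: a single key is enough to read and replace the path.
lambda_props["Code"] = "s3://example-bucket/code-placeholder"

# Nested property: the local path is not reachable with a single key,
# neither by the leaf name nor by a dotted name.
leaf_lookup = glue_props.get("ScriptLocation")
dotted_lookup = glue_props.get("Command.ScriptLocation")

# The path has to be reached by going through "Command" first.
glue_props["Command"]["ScriptLocation"] = "s3://example-bucket/script-placeholder"
```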
@sanathkr and @andreabedini - Thanks for your feedback. I added the 'needs-contributors' label.
This came up with some teams at my company today. I'm looking into implementing this. Here's what I'm proposing as an implementation. Please provide any feedback. I'll open a PR once this is implemented.
As @andreabedini mentioned, the relevant functionality is in artifact_exporter.py. The complexity is introduced by ScriptLocation being nested under the Command property of the AWS::Glue::Job resource.
Add a new GlueCommandJobCommandScriptLocationResource.
class GlueCommandJobCommandScriptLocationResource(Resource):
    RESOURCE_TYPE = "AWS::Glue::Job"
    # Note the PROPERTY_NAME includes a '.' implying it's nested.
    PROPERTY_NAME = "Command.ScriptLocation"
To support the nested property (i.e. Command.ScriptLocation), we will need to replace all of the direct resource_dict accessors with two new methods, get_nested_property_value and set_nested_property_value, that split the PROPERTY_NAME on '.' and traverse the resource_dict to find the appropriate object.
def get_nested_property_value(resource_dict, property_name):
    """
    Searches the resource_dict for nested properties by splitting the property_name
    on '.'

    :param resource_dict: Dictionary containing resource definition
    :param property_name: Property name of CloudFormation resource where this
                          local path is present
    :return: Value of the property
    """
    # Support nested properties by allowing '.' in the PROPERTY_NAME
    if '.' in property_name:
        sub_property_names = property_name.split('.')
        property_value = resource_dict.get(sub_property_names[0], None)
        for sub_property_name in sub_property_names[1:]:
            if property_value:
                property_value = property_value.get(sub_property_name, None)
    else:
        property_value = resource_dict.get(property_name, None)
    return property_value
def set_nested_property_value(resource_dict, property_name, new_value):
    """
    Searches the resource_dict for nested properties by splitting the property_name
    on '.' and sets the nested property to new_value.

    :param resource_dict: Dictionary containing resource definition
    :param property_name: Property name of CloudFormation resource where this
                          local path is present
    :param new_value: The new value for the property
    :raise: KeyError if a property isn't found in the dictionary
    """
    # Support nested properties by allowing '.' in the PROPERTY_NAME
    # Assumes that the property exists in the dictionary
    if '.' in property_name:
        sub_property_names = property_name.split('.')
        property_dict = resource_dict[sub_property_names[0]]
        for sub_property_name in sub_property_names[1:-1]:
            property_dict = property_dict[sub_property_name]
        property_dict[sub_property_names[-1]] = new_value
    else:
        resource_dict[property_name] = new_value
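As a rough usage sketch of the idea, the snippet below reads the local path from a Glue job's properties, "uploads" it, and writes the resulting URL back. The helpers get_nested and set_nested are condensed stand-ins for the two functions proposed above, and fake_upload, the bucket name, and the sample properties are all hypothetical; the real upload logic in artifact_exporter.py is not shown here.

```python
import posixpath


def get_nested(resource_dict, property_name):
    """Condensed stand-in for get_nested_property_value (returns None if absent)."""
    value = resource_dict
    for key in property_name.split("."):
        if not isinstance(value, dict):
            return None
        value = value.get(key)
    return value


def set_nested(resource_dict, property_name, new_value):
    """Condensed stand-in for set_nested_property_value (KeyError if a parent is missing)."""
    *parents, last = property_name.split(".")
    target = resource_dict
    for key in parents:
        target = target[key]
    target[last] = new_value


def fake_upload(local_path):
    """Hypothetical uploader: pretends to upload the file and returns an S3 URL."""
    return "s3://example-bucket/" + posixpath.basename(local_path)


PROPERTY_NAME = "Command.ScriptLocation"

# Illustrative properties of an AWS::Glue::Job resource.
job_properties = {
    "Name": "MyGlueJob",
    "Command": {"Name": "glueetl", "ScriptLocation": "./gluescript.scala"},
}

local_path = get_nested(job_properties, PROPERTY_NAME)
if local_path is not None:
    set_nested(job_properties, PROPERTY_NAME, fake_upload(local_path))

print(job_properties["Command"]["ScriptLocation"])
# prints: s3://example-bucket/gluescript.scala
```

Because the same dotted PROPERTY_NAME drives both the read and the write, a top-level name such as "Code" would go through the same two helpers unchanged.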
This would be such a great feature to have. I am currently struggling to set up CI/CD for Glue, and this would help us do that. Thanks so much.
@justnance @sanathkr is there anything I need to do to get #4019 reviewed and merged?