Aws-cdk: Missing classification-parameter when creating table in Glue

Created on 6 May 2020 · 5 Comments · Source: aws/aws-cdk

Hey. I haven't reported bugs before, so I hope I'm doing things correctly here.

When creating a Glue table using aws_cdk.aws_glue.Table with data_format = _glue.DataFormat.JSON, the classification is set to Unknown, and querying the table fails.

Reproduction Steps

from aws_cdk import aws_glue as _glue

# Inside a Stack; s3_bucket and account_id are defined elsewhere.
glue_table = _glue.Table(
    self, 'GlueTable',
    database=_glue.Database.from_database_arn(
        self, 'GlueDatabase',
        'arn:aws:glue:region:{}:database/abc'.format(account_id),
    ),
    table_name='def_ghi',
    data_format=_glue.DataFormat.JSON,
    bucket=s3_bucket,
    s3_prefix='prefix/',
)

If I manually add "classification" with value "json" in the Table properties, after deploying with CDK, the query works fine.

Error Log

Amazon Invalid operation: Invalid DataCatalog response for external table "abc"."def_ghi": Cannot deserialize table. Missing mandatory field: Parameters in response from external catalog. ;

Environment

  • CLI Version:
  • Framework Version: 1.37.0
  • OS: Windows 10
  • Language: Python

This is :bug: Bug Report

@aws-cdk/aws-glue bug effort/small p1

All 5 comments

After some more fiddling around, I discovered that it probably doesn't have to do with the classification=json parameter. I managed to make it work just by editing and pressing apply. I then looked at the difference and the only thing I could find was this:

SerdeInfo before:

'SerdeInfo': {'SerializationLibrary': 'org.openx.data.jsonserde.JsonSerDe'}

SerdeInfo after:

'SerdeInfo': {'SerializationLibrary': 'org.openx.data.jsonserde.JsonSerDe', 'Parameters': {}}

After some further thought, I see that this also correlates with the error message above.
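The before/after difference in SerdeInfo can be expressed as a small helper. This is only a sketch: ensure_serde_parameters is a hypothetical name, and it works on a plain dict shaped like the table returned by get_table, without calling AWS.

```python
import copy


def ensure_serde_parameters(table: dict) -> dict:
    """Return a copy of a Glue table dict whose SerdeInfo has a 'Parameters' map.

    Hypothetical helper: an existing map is kept, a missing one becomes {}.
    """
    patched = copy.deepcopy(table)  # leave the caller's dict untouched
    storage = patched.setdefault('StorageDescriptor', {})
    serde_info = storage.setdefault('SerdeInfo', {})
    serde_info.setdefault('Parameters', {})
    return patched


# The "before" shape observed on the table deployed by CDK.
before = {
    'StorageDescriptor': {
        'SerdeInfo': {
            'SerializationLibrary': 'org.openx.data.jsonserde.JsonSerDe',
        },
    },
}
after = ensure_serde_parameters(before)
print(after['StorageDescriptor']['SerdeInfo'])
```

Using setdefault rather than plain assignment means a table that already carries SerDe parameters is not overwritten.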

To get around this I have added a post-deploy code snippet using boto3 to update the table, like this:

import boto3

glue_client = boto3.client('glue')

# database_name and table_name identify the table deployed by CDK.
response = glue_client.get_table(
    DatabaseName=database_name,
    Name=table_name,
)
table = response['Table']
# Add the empty Parameters map that the error message reports as missing.
table['StorageDescriptor']['SerdeInfo']['Parameters'] = {}
# Not necessary, but removes the "classification: Unknown".
table['Parameters']['classification'] = 'json'
glue_client.update_table(
    DatabaseName=table['DatabaseName'],
    TableInput={
        'Name': table['Name'],
        'Description': table['Description'],
        'Retention': table['Retention'],
        'StorageDescriptor': table['StorageDescriptor'],
        'TableType': table['TableType'],
        'Parameters': table['Parameters'],
    },
)
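
The snippet above indexes fields such as table['Description'] directly, which may raise a KeyError if the table returned by get_table lacks one of them. A slightly more defensive variant, sketched below, copies only the keys that are present. The names TABLE_INPUT_KEYS and to_table_input are hypothetical, and the key list reflects my understanding of what TableInput accepts; check it against the boto3 documentation before relying on it.

```python
# Keys assumed to be accepted by TableInput (verify against the boto3 docs).
TABLE_INPUT_KEYS = (
    'Name', 'Description', 'Owner', 'LastAccessTime', 'LastAnalyzedTime',
    'Retention', 'StorageDescriptor', 'PartitionKeys', 'ViewOriginalText',
    'ViewExpandedText', 'TableType', 'Parameters', 'TargetTable',
)


def to_table_input(table: dict) -> dict:
    """Build a TableInput dict from a get_table result (hypothetical helper).

    Read-only fields such as DatabaseName or CreateTime are dropped, and
    keys that are absent from the table are skipped instead of raising.
    """
    return {key: table[key] for key in TABLE_INPUT_KEYS if key in table}


# Example with a made-up table dict; no AWS call is made here.
sample_table = {
    'Name': 'def_ghi',
    'DatabaseName': 'abc',
    'CreateTime': '2020-05-06',
    'TableType': 'EXTERNAL_TABLE',
    'Parameters': {'classification': 'json'},
}
table_input = to_table_input(sample_table)
print(sorted(table_input))

# Possible usage with a real client (not executed here):
# glue_client.update_table(DatabaseName=table['DatabaseName'],
#                          TableInput=to_table_input(table))
```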

Hi @jorgenfroland - Thanks for reporting this.

I believe this is rooted in either the Glue API or how CloudFormation invokes it. In any case, passing an empty map should be the same as not passing it at all, and CDK can probably mitigate this quirk.

Filing 👍

Thanks @jorgenfroland :)
Your comment helped me solve the same problem.

Can confirm this is happening in the TypeScript construct as well. Kudos to @jorgenfroland. Currently, the inability to add parameters like classification and the S3 exclude path with the L2 construct is a real problem when using CDK to create Glue resources. Hope it gets stable soon.
