I started rolling out a Glue table with the following CDK construct:
const trailTable = new Table(this, 'TrailTable', {
  bucket: <ref to bucket>,
  database: <ref to db>,
  tableName: 'table',
  columns: [{
    name: 'user',
    type: Schema.STRING,
  }],
  dataFormat: DataFormat.JSON,
});
To my surprise the Glue table pointed towards my bucket using this URL: s3://<bucket>/data.
Looking at the CDK documentation, /data is indeed the default location where the Glue data is discovered (this is the s3Prefix property). But little explanation is given for why this is the default. Does it follow certain guidelines, or is it an arbitrarily chosen path?
I would have expected the default to be empty: no nested folder in the bucket, just the root. Having to define a blank s3Prefix to achieve this behavior seems out of place:
const trailTable = new Table(this, 'TrailTable', {
  ...,
  s3Prefix: '',
});
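To make the difference concrete, here is a minimal sketch of how I understand the location to be derived from the bucket and the prefix. The tableLocation helper is hypothetical and is not the CDK's implementation; the 'data' default is only an assumption based on the URL I observed above.

```typescript
// Hypothetical helper, NOT part of the CDK: it only mirrors the behavior
// described in this issue (bucket name + s3Prefix -> table location).
// The 'data' default is an assumption taken from the observed URL.
function tableLocation(bucketName: string, s3Prefix: string = 'data'): string {
  // Drop leading slashes so the result never contains "s3://bucket//...".
  const prefix = s3Prefix.replace(/^\/+/, '');
  // An empty prefix means the table points at the bucket root.
  return prefix === '' ? `s3://${bucketName}` : `s3://${bucketName}/${prefix}`;
}

const withDefault = tableLocation('my-bucket');     // current behavior: nested folder
const withEmpty = tableLocation('my-bucket', '');   // proposed default: bucket root
```

With these assumptions, withDefault is s3://my-bucket/data and withEmpty is s3://my-bucket, which is the location I expected to get without setting anything.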
Remove the default that points to /data for the s3Prefix and use an empty string instead
OR
Provide documentation that explains why /data is chosen as the default
@sam-goodwin - As the original author of this, mind if I pick your brain?
Marking this as a feature request to use an empty string as the default data location, which seems like a more reasonable default.
There was no intelligent reasoning. I agree it should be changed since it's too opinionated.