Presto: Hive connector cannot create external table if location does not exist

Created on 9 Sep 2017 · 5 comments · Source: prestodb/presto

A Presto query fails when creating an external table whose specified location directory does not exist yet.

Example query:
Create table tbl (a varchar) with (external_location = 's3://mybucket/non_existing_dir');

Exception message:
Query 20170908_215859_00007_uipf8 failed: External location must be a directory

In Hive, when the specified location does not yet exist, the HiveMetastore will create the directory. However, Presto fails in this case. I was wondering if this is the expected behavior?

The exception is thrown when HiveMetadata#createTable constructs the external path and checks that the external location is a directory.

If this is not expected behavior then the fix would simply be changing that condition for throwing "External location must be a directory" to
if (pathExists && !pathIsDirectory)
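As a rough illustration of the proposed condition, here is a minimal sketch. It is not Presto's actual code: it uses `java.nio.file` on a local filesystem as a stand-in for the Hadoop filesystem calls, and the class and method names (`ExternalLocationCheck`, `checkExternalLocation`) are made up for this example.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Presto.
final class ExternalLocationCheck {
    private ExternalLocationCheck() {}

    // Rejects a location only when it exists and is not a directory.
    // A missing location is accepted, mirroring the proposed change.
    static void checkExternalLocation(Path location) {
        boolean pathExists = Files.exists(location);
        boolean pathIsDirectory = Files.isDirectory(location);
        if (pathExists && !pathIsDirectory) {
            throw new IllegalArgumentException("External location must be a directory");
        }
    }
}
```

With this condition, an existing directory and a not-yet-created path both pass, while a path that points at a regular file is still rejected.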

Thanks for taking a look

All 5 comments

I got the error even though the path exists and is a directory, when I specified partitioned_by. It was somehow resolved by adding another empty directory inside.

I believe this bug (for s3 locations specifically) was fixed by this changeset on the other fork, in case anybody else hits it: https://github.com/prestosql/presto/commit/1985dcaa84f48e1994a75a791b49a0524db18f60

The tl;dr is that s3 isn't actually a filesystem; it's a kv store with "prefixes". A prefix can't exist without data under it, so HMS can't create an empty directory, and as you've already noted this check should be skipped for s3. The reason it works to add an empty "directory" is that it creates a 0-byte file someplace under the prefix, which keeps the prefix in existence so that it passes this check.
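To make the "prefix" point concrete, here is a small in-memory sketch of a key-value store in which a directory is nothing more than a shared key prefix. It is a toy model, not the S3 API, and the `PrefixStore` class and its methods are hypothetical names chosen for this example.

```java
import java.util.TreeSet;

// Toy model of a flat key-value store; hypothetical, not the S3 API.
final class PrefixStore {
    private final TreeSet<String> keys = new TreeSet<>();

    void put(String key) {
        keys.add(key);
    }

    void delete(String key) {
        keys.remove(key);
    }

    // A "directory" exists only while at least one key starts with "<prefix>/".
    boolean prefixExists(String prefix) {
        String p = prefix.endsWith("/") ? prefix : prefix + "/";
        // The smallest key >= p starts with p whenever any key does.
        String candidate = keys.ceiling(p);
        return candidate != null && candidate.startsWith(p);
    }
}
```

In this model, putting a single placeholder key such as `non_existing_dir/0` is enough to make the prefix "exist", and deleting the last key under it makes the prefix disappear again.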

In case anyone else comes across this I had a similar but slightly different error

com.facebook.presto.spi.PrestoException: Database 'temp' location is not a directory:

The temp database pointed to an s3 key which actually did have files in it so should have been recognized as a directory.

The issue was that the object's Content-Type metadata was "application/octet-stream".

$> aws s3api head-object --bucket <bucket> --key <key>
{
...
"ContentType": "application/octet-stream",
...
}

The fix for me was to delete all the contents of the directory and then create it again in the AWS console. This adds a file named 0 and sets the correct ContentType (application/x-directory) for the directory.

This line was returning false
https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystem.java#L337

metadata.getContentType() -> "application/octet-stream" which does not match.
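A minimal sketch of why that value fails the check, assuming the comparison is an exact string match against "application/x-directory" (an assumption based on the behavior described here, not a copy of the Presto source). The class and method names are hypothetical.

```java
// Hypothetical illustration; not the code in PrestoS3FileSystem.
final class DirectoryContentType {
    // Content type that marks an S3 object as a directory placeholder.
    static final String DIRECTORY_CONTENT_TYPE = "application/x-directory";

    private DirectoryContentType() {}

    // Assumption: the check is an exact, case-sensitive string comparison.
    static boolean looksLikeDirectory(String contentType) {
        return DIRECTORY_CONTENT_TYPE.equals(contentType);
    }
}
```

Under that assumption, `looksLikeDirectory("application/octet-stream")` is false, so the location is not treated as a directory even though objects exist under it.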

thanks, @benrifkind!!! your comment was really helpful to solve this problem \o/

just updating code references:

the constant S3_DIRECTORY_OBJECT_CONTENT_TYPE is assigned the value application/x-directory [1], and the comparison with the S3 metadata happens a few lines below [2]

references:

[1] https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystem.java#L157

[2] https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystem.java#L359

Hi, as suggested by @benrifkind, I created the S3 folder from the AWS console, and now I see that "metadata.contentType()" returns "application/x-directory; charset=UTF-8". Even this doesn't match the check in PrestoS3FileSystem.java. Has anyone encountered this?
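For illustration, the sketch below shows how a content type carrying a `charset` parameter differs from the bare value under a plain string comparison, and how a comparison that ignores parameters would treat it. This is a hypothetical helper (`LenientContentType`), not Presto's code and not a confirmed fix.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Presto.
final class LenientContentType {
    private LenientContentType() {}

    // Returns the media type without parameters, e.g. "a/b; charset=UTF-8" -> "a/b".
    static String baseType(String contentType) {
        int semicolon = contentType.indexOf(';');
        String base = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    // Compares only the base media type, ignoring parameters such as charset.
    static boolean isDirectoryType(String contentType) {
        return contentType != null && "application/x-directory".equals(baseType(contentType));
    }
}
```

A plain `"application/x-directory".equals("application/x-directory; charset=UTF-8")` is false, whereas `isDirectoryType` accepts both forms because it strips the parameter before comparing.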
