Presto: Azure DataLake Store Connection issue: No FileSystem for scheme: adl

Created on 5 Oct 2018 · 10 Comments · Source: prestodb/presto

I am trying to run Presto on my local machine to query data in Azure Data Lake Store. Even after adding the JAR files for Azure Data Lake Store, I am not able to fetch any data from it.

I get the below error:

Query 20181005_191247_00000_wcgur failed: No FileSystem for scheme: adl

    com.facebook.presto.spi.PrestoException: No FileSystem for scheme: adl
    at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:189)
    at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:47)
    at com.facebook.presto.hive.util.ResumableTasks.access$000(ResumableTasks.java:20)
    at com.facebook.presto.hive.util.ResumableTasks$1.run(ResumableTasks.java:35)
    at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: No FileSystem for scheme: adl
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
    at org.apache.hadoop.fs.PrestoFileSystemCache.createFileSystem(PrestoFileSystemCache.java:114)
    at org.apache.hadoop.fs.PrestoFileSystemCache.getInternal(PrestoFileSystemCache.java:89)
    at org.apache.hadoop.fs.PrestoFileSystemCache.get(PrestoFileSystemCache.java:62)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at com.facebook.presto.hive.HdfsEnvironment.lambda$getFileSystem$0(HdfsEnvironment.java:71)
    at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
    at com.facebook.presto.hive.HdfsEnvironment.getFileSystem(HdfsEnvironment.java:70)
    at com.facebook.presto.hive.HdfsEnvironment.getFileSystem(HdfsEnvironment.java:64)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadPartition(BackgroundHiveSplitLoader.java:282)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:256)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:91)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:185)
    ... 7 more

Reference:
OS: Ubuntu 16.04 LTS
Presto server: presto-server-0.212.tar.gz
Presto CLI: presto-cli-0.212-executable.jar

JARs added to plugin/hive-hadoop2:

hadoop-azure-datalake-3.1.1.jar
azure-data-lake-store-sdk-2.3.2.jar
hadoop-azure-3.1.1.jar

hive.properties

connector.name=hive-hadoop2
hive.metastore.uri=thrift://hive:9083
hive.config.resources=presto/server/etc/catalog/adls-site.xml

adls-site.xml

<configuration>
        <property>
                <name>fs.adl.impl</name>
                <value>org.apache.hadoop.fs.adl.AdlFileSystem</value>
        </property>

        <property>
                <name>fs.AbstractFileSystem.adl.impl</name>
                <value>org.apache.hadoop.fs.adl.Adl</value>
        </property>

        <property>
                <name>fs.adl.oauth2.access.token.provider.type</name>
                <value>ClientCredential</value>
        </property>

        <property>
                <name>fs.adl.oauth2.refresh.url</name>
                <value>my_url</value>
        </property>

        <property>
                <name>fs.adl.oauth2.client.id</name>
                <value>my_id</value>
        </property>

        <property>
                <name>fs.adl.oauth2.credential</name>
                <value>my_cred</value>
        </property>
</configuration>

Any comments on this would be very helpful. Thanks in advance!

All 10 comments

Instead of this approach, I suggest you file an issue asking for the adl filesystem to be included with Presto.

Thanks for the suggestion. Sure, I can do that. But before that, I would like to confirm that I am not missing anything in the configuration. All I did was add the above-mentioned JARs to the plugin/hive-hadoop2 directory and create the XML configuration. Am I missing anything?

Everything looks correct to me. The Presto setup to make Hadoop work correctly is pretty complex, so I would have to use a debugger to figure out what is going on.
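One check that does not need a debugger is confirming that the classes named in adls-site.xml are actually present in the JARs dropped into plugin/hive-hadoop2. The following is only a rough sketch: the `jars_containing` helper and the install path are hypothetical, not part of Presto, and it relies on nothing more than the fact that a JAR is a zip archive.

```python
import zipfile
from pathlib import Path


def jars_containing(plugin_dir, class_name):
    """Return the names of JARs directly under plugin_dir that contain class_name."""
    # A class a.b.C is stored inside a JAR as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(plugin_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip truncated or corrupt downloads instead of aborting the scan.
            continue
    return hits


if __name__ == "__main__":
    # Assumed install location; adjust to the local Presto layout.
    plugin_dir = "presto-server-0.212/plugin/hive-hadoop2"
    for cls in ("org.apache.hadoop.fs.adl.AdlFileSystem",
                "org.apache.hadoop.fs.adl.Adl"):
        print(cls, "->", jars_containing(plugin_dir, cls) or "NOT FOUND")
```

If both classes turn up, a missing class is ruled out, but that alone says nothing about whether Presto reads the file named in hive.config.resources.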

@venkadeshwarank: When we tried setting up some new HDFS config to read encrypted files, using hive.config.resources helped in some instances and not in others. In addition to putting these settings in adls-site.xml, I suggest copying all of them to hdfs-site.xml and explicitly passing the paths of hdfs-site.xml and core-site.xml to the hive.config.resources parameter.
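As a sketch of what that suggestion might look like in hive.properties: the /etc/hadoop/conf paths are placeholders, not taken from this thread. hive.config.resources takes a comma-separated list of files, and absolute paths avoid depending on the directory Presto is started from.

```properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hive:9083
# Assumed locations; point these at wherever the files actually live.
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
```

The fs.adl.* properties from adls-site.xml would then be copied into hdfs-site.xml unchanged.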

Is there a plan for Presto to support the Azure Data Lake Storage authentication method? https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control

Note that the Starburst Presto distribution already supports that: https://docs.starburstdata.com/latest/azure/azurestorage.html

We plan to issue a PR along with Gen2 support. In the meantime you can give it a try.

@kokosing just checking, on Jan 21, you mentioned that you plan to submit a PR to bring Presto support to ADLS Gen2. Has this been done?

@ryancrawcour It has already been done, but it was only introduced in https://github.com/prestosql/presto. ABFS, ADLS and ADLS Gen2 should work there, but you would need to provide a site XML file with the file system configuration, something like:

hive.config.resources=/etc/hadoop/conf/azure-site.xml

@kokosing - By any chance do you have any complete examples of what that configuration (both XML and hive connector) would look like?

@migolfi Since this pertains to Presto SQL, I'd advise asking on the Presto Community Slack (https://prestosql.io/community.html).
